Hass.io fails after restart or reboot

Hi,

I have a NUC-like device (Dell Wyse Z90DE7 thin client). I have been trying to get Hass.io up and running on it, using the NUC image from the main Hass.io page. Every boot takes a very long time: startup sits on “Docker Application Container Engine” for 12 to 15 minutes, even after the first boot.

Immediately after flashing the device, I can boot up Hass normally once and restore from snapshot. I used it like this and had a very long uptime, and everything was snappy and worked properly (ZHA, my in-development Firmata component, other docker containers, etc.). However, once I either reboot the hardware or just Home Assistant, it fails to successfully start Home Assistant.

This exact HA configuration had been running for months perfectly fine on a Debian VM on another machine, so I am doubtful it is an issue with my configuration.

I thought this may be an issue with the small 16GB MLC SSD in the thin client, but I booted into another OS and did some SMART and basic long/large read/write tests and everything came out fine. The flash is also not particularly slow, so I am unsure why it takes so long to boot each time.

The issue appears to be with the Hass.io backend. Troubleshooting with the hassio command was not helpful because it is making HTTP requests to some other service that I could not quickly find a log for. When I run something like hassio ha restart --log-level trace after a failure, I see a 400 Bad Request error with the message as “null.”

I tried not using my snapshot and changing some settings/adding entities and restarted and the issue persisted with Hass.io failing to load as a component and the CLI giving bad requests.

I did see some issues in journalctl related to Docker and of course Home Assistant itself. I have not been able to find the log for the Hass.io backend. I have attached the log and redacted many of the details, such as names of things, MAC addresses, etc. (this also makes it easier to read). The errors start with my Firmata component, so I tried again without it and got basically the same result.

You will see an error related to trying to connect to Hue hubs. There are Hue hubs on my network, but I have set discovery to ignore them, yet it still insists on connecting :frowning: (this is some other unrelated issue with Home Assistant).

At this point I have reflashed Hass.io onto my device 3 or 4 times to try things and debug this. If there are any more steps I can take to debug, please let me know! I really want to get to the bottom of this! I really like Hass.io!

OK, of course after more Googling I finally found a solution to the complete failures. I added hassio: to the top of my config and that seemed to do the trick for the errors. However, Docker still takes 12 minutes at boot.
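
For anyone checking the same thing, here is a minimal sketch of how one might confirm that the hassio: key really sits at the top level of the config rather than being indented or commented out. It assumes the config is plain YAML text and only scans lines instead of parsing YAML properly; the helper name has_top_level_key and the sample text are made up for illustration.

```python
def has_top_level_key(config_text: str, key: str = "hassio") -> bool:
    """Return True if `key:` appears as an unindented, uncommented YAML key."""
    for line in config_text.splitlines():
        # Top-level keys start in column 0; skip indented and commented lines.
        if line.startswith((" ", "\t", "#")):
            continue
        name, sep, _ = line.partition(":")
        if sep and name.strip() == key:
            return True
    return False


# Hypothetical config contents, not taken from the actual setup in this thread.
sample = "homeassistant:\n  name: Home\n\nhassio:\n"
print(has_top_level_key(sample))
```

To check a real file, read it first, for example with pathlib.Path("configuration.yaml").read_text(), and pass the text in. The path is an assumption and depends on where the config lives on your install.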

Hmm, I spoke too soon. Let me try the thread count increase.

The inability to run stop, start, or restart appears to be because another task is in progress. It would be great if the error messages in the supervisor log would carry over to said 400 Bad Request. Also, there does not appear to be a way to kill a task. Hass.io seems to just hang on starting home assistant even though it fails and quits.

Have you tried installing an OS like Ubuntu or Debian and then installing Hass.io using the alternative install script? Might be worth a try.

Using the NUC image on some non NUCs has been known to cause issues.

Currently have it working by rebooting the machine or restarting the docker engine until it works.

Yeah, that will probably work. That’s similar to the venv install that this replaced. I’d rather not do that, since it would add another Docker/abstraction layer to strong but not super-powerful hardware. I’d like to avoid running a full Debian stack on top of everything else and having to maintain and update it.

I want to try to figure out what these issues are. Theoretically and as far as I can tell, there shouldn’t be much of a difference between the NUC image and an image you would put on a standard PC. It appears to be a simple build for an amd64 machine plus some drivers.

I definitely want to figure out why Docker takes so long to start—that’s probably the source of the other issues.
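
One way to narrow this down, assuming systemd-analyze is available on the image (I have not confirmed that it is), would be to capture the output of systemd-analyze blame and rank units by startup time. The sketch below parses text in that format. The function name parse_blame and the sample lines are made up for illustration and are not real output from this machine, and only the time suffixes us, ms, s, min and h are handled.

```python
import re

# Seconds per suffix; an assumed subset of the time units systemd prints.
_UNIT_SECONDS = {"us": 1e-6, "ms": 1e-3, "s": 1.0, "min": 60.0, "h": 3600.0}
_PART = re.compile(r"(\d+(?:\.\d+)?)(us|ms|s|min|h)")


def parse_blame(text):
    """Parse blame-style lines into (seconds, unit) pairs, slowest first."""
    results = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 2:
            continue
        # The last token is the unit name; the rest are duration parts.
        *parts, unit = tokens
        total = 0.0
        for part in parts:
            match = _PART.fullmatch(part)
            if match is None:
                break  # Not a duration token, so skip this line entirely.
            total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        else:
            results.append((total, unit))
    return sorted(results, reverse=True)


# Made-up sample in the blame format, for illustration only.
sample = "12min 30.000s docker.service\n        1.500s example.service\n"
for seconds, unit in parse_blame(sample):
    print(f"{seconds:8.1f}s  {unit}")
```

If docker.service does come out on top, the Docker unit's own journal entries for that boot would be the next place to look.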

I’ve not used the NUC image myself. I just install using Ubuntu, even when I was using a NUC. But I believe the NUC image is a stripped-down Linux version in the same way the HassOS image is for Pis, so it could be a driver compatibility issue with your particular machine?

Having a base OS is not going to throttle your machine to any noticeable level compared with the NUC image, if that’s a concern. I run mine on an 8-9 year old Dell Optiplex, and CPU usage normally sits under 10% even with Shinobi constantly recording cameras to an HDD. I do an OS update once a month, and everything ticks along very nicely.