Woke up to completely broken HA. What could have happened?

Last night, we went to bed with everything working as normal. This morning I woke up and noticed the lights weren’t turning on via motion sensor. To my shock and horror, the app met me with error after error about configuration.yaml not being found, and I couldn’t do anything in the app anymore.

Not too distressed but very annoyed, I went to the PC and tried accessing it directly through the Pi’s IP. Nope. Connection refused. Pinged it, and everything looked normal. Tried to SSH into it; that didn’t work. Tried connecting it directly to my monitor and there was simply no connection.

What the hell happened? Why did it happen? What could I do to prevent it from happening in the future? I do have a backup but I’m not exactly keen to install it again if I don’t know what the issue is.

Presumably a restart hasn’t fixed it?

Is it running on an SD card or an SSD?

SD card. I know you’re not “supposed” to do that, but I barely have anything running on it (5 lights, 2 motion sensors, 2 switches, IR blaster, some sensors from my PC, the TV and 8? automations). Wanted to see how to use home assistant. I don’t currently own a spare NVME or enclosure for one, but I plan to change to better hardware in the future, probably ditching the pi altogether.

Edit: No, a restart didn’t fix. That was the first thing I tried, just forgot to mention.

Can you read the SD card via another machine?

Are you able to access the SD card by inserting it into a PC? Are the files accessible?

Yeah, I had a look, but as I’m unfamiliar with this it doesn’t really tell me anything. Windows recognises it as a boot drive, as it should, and it looks the same as when I wrote the boot image to it.

Anyone got any ideas?

I’m still guessing sdcard

Just because it’s readable does not mean it’s writable, or consistently so.

If you can read the files you should be able to pull out the system journal onto another machine and look at the last logs prior to it dying. It’s basically the same process as detailed out in this guide:

The only difference is you aren’t starting from the ssh addon, you’re starting from the sdcard mounted in another machine. Journal files are all still in the same place though, var/log/journal relative to wherever you mounted it.
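If a Linux machine is available, a minimal sketch of that step could look like this. The mount point /mnt/sdcard and the helper names are assumptions for illustration only; journalctl's --directory option reads journal files from a given path rather than from the machine it is running on.

```shell
#!/bin/sh
# Sketch: read the journal from an SD card mounted on another Linux machine.
# MOUNT_POINT is an assumed example path; use wherever the card is mounted.
MOUNT_POINT="${MOUNT_POINT:-/mnt/sdcard}"

# Build the journal directory path relative to the mount point.
journal_dir() {
    printf '%s/var/log/journal\n' "${1%/}"
}

# Show the last N journal lines (default 200) recorded on the card.
last_journal_lines() {
    dir="$(journal_dir "$1")"
    if [ ! -d "$dir" ]; then
        echo "no journal directory at $dir" >&2
        return 1
    fi
    # --directory points journalctl at the card's journal files
    journalctl --directory "$dir" --no-pager --lines "${2:-200}"
}
```

Calling last_journal_lines "$MOUNT_POINT" 500 would then print the final 500 entries written before the Pi stopped responding.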

I would also guess sdcard given the abruptness and the inability to even start it anymore. There’s not much that can cause a complete failure to even start the machine besides disk corruption. How long were you running it successfully prior to this? Also do you remember if you changed commit_interval or left it at the default?

I had this happen once. A manual restart of the server fixed it. My best guess is there was a reboot and HA could not read configuration.yaml. I’ve never had that error again. Make sure you are using good quality storage as in not an SD card. I wonder if a file was corrupted since the restart did not help you.

Thank you. This instance was running for only a few days, maybe 4 or 5. I’ve run into a problem, however. I’ve gotten into the directories with Linux Reader (not running a Linux system right now), but of course don’t have access to journalctl. Are there any log readers that I could get on Windows?

If not, I’ll come back to this (probably not such a mystery) another day. Thanks again though.

@giqcass I wonder too, I did notice there was a notification stating that 2022.9.2 was available but as far as I’m aware there was no auto update feature enabled. At least, I know I did not enable anything like that as all previous updates I had to explicitly give permission to. Electricity in my building is rock solid, really can’t think of any other way it could have rebooted itself.

None that I’m aware of. I haven’t even managed to find a way to read the journal on alpine linux so I kind of doubt there’s a windows one.

However, one thing you can do that’s relatively simple is just make a Debian container. These are the commands you would use to do this on a Linux or Mac machine. I assume there’s something similar on Windows, although I have no idea since I really never use Windows machines:

id="$(docker run --mount type=bind,src=<mount path>/var/log/journal,dst=/var/log/journal -dit debian)"
docker exec -it "${id}" sh

After that you’ll be inside the Debian container, so you can install the packages you need with apt-get to read from the journal. I guess that is just apt-get install systemd; I’m not sure there’s a way to get journalctl without all of it.
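Inside the container, the install-and-read step might look roughly like this. It is a sketch under two assumptions: that journalctl comes with Debian's systemd package, and that the journal was bind-mounted at /var/log/journal as in the docker command above. The function names are made up for illustration.

```shell
# Run these inside the Debian container.
# Assumption: journalctl ships in Debian's systemd package.
install_journalctl() {
    apt-get update && apt-get install -y systemd
}

# Print only messages of priority "err" or worse from the mounted journal.
# The directory is passed explicitly because the files belong to another machine.
show_errors() {
    journalctl --directory "${1:-/var/log/journal}" --priority err --no-pager
}
```

Running install_journalctl once and then show_errors should narrow the output to the entries most likely to explain the failure.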

If that is proving too complicated on a windows machine then an alternative is just to make a debian VM. Then you’ll have a full debian box to work with and can mount the sdcard in there by following the instructions for whatever VM software you’re using.

Regarding SD cards, I am using a Sandisk Max Endurance SD Card in my RPi for Home Assistant. Hopefully it will last like Sandisk says it will, been only 1 month now. That said, I do have nightly backups to Google Drive running.

This might help to check your SD card for errors: How to Check an SD Card for Errors. But I advise you to not make any changes to the card, just check if the card is dying.

I actually suspect the RPi4 itself was responsible for this.

I tried another SD card and it also wasn’t working, I was getting the ACT light giving a constant quick pulse of light so the pi itself couldn’t read the SD cards.
On top of that, I still couldn’t get any output from the HDMI port and suddenly it wasn’t appearing in my router’s list of connected devices.

Using RPi Imager I wrote PINN to the second SD card, and that has seemingly fixed the problem. I’m still not sure what went wrong to begin with, but from now on I will be running HA from a drive rather than an SD card. That way, if this happens again, I can hopefully just write an SD card to fix the issue without jeopardising my data on the external drive.

Regardless, I’m almost certain it’s the Pi’s fault and not anything else. When I have the time I’ll mount the original SD card and perform an autopsy.

Is your Pi not just under-powered? Try a heavier power supply… ;)

How did you “try” another SD card?

Verify the pi.
Get a new SD card and use the Raspberry Pi Imager to flash the latest Pi OS image.
Plug this SD into your Pi and apply power.
This will install the Pi OS on that SD card.

Why Pi OS and not some other test?
Simple: the Pi OS is pretty I/O intensive, and if there is a problem with the Pi you won’t be able to install it.

If you see the low power warning on the Pi display, then the power supply may be your problem. Low power while writing data to an SD can corrupt the card.
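If the Pi is running headless and you can't see the on-screen warning, Raspberry Pi OS also reports under-voltage through vcgencmd get_throttled, which prints a bitmask such as throttled=0x50005. Below is a small sketch that decodes the power-related bits; the function name decode_throttled is made up for illustration, and only four of the documented bits are checked.

```shell
#!/bin/sh
# Decode the bitmask reported by `vcgencmd get_throttled` on Raspberry Pi OS.
decode_throttled() {
    value=$(( $1 ))   # accepts hex input such as 0x50005
    [ $(( value & 0x1 )) -ne 0 ]     && echo "under-voltage detected now"
    [ $(( value & 0x10000 )) -ne 0 ] && echo "under-voltage has occurred since boot"
    [ $(( value & 0x4 )) -ne 0 ]     && echo "currently throttled"
    [ $(( value & 0x40000 )) -ne 0 ] && echo "throttling has occurred since boot"
    return 0
}

# Only query the firmware when vcgencmd is actually available (i.e. on the Pi).
if command -v vcgencmd >/dev/null 2>&1; then
    decode_throttled "$(vcgencmd get_throttled | cut -d= -f2)"
fi
```

No output at all means none of these flags are set; any under-voltage line points towards the power supply or cable.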

I am also thinking that your SD card has failed.

Since you have a backup, why not just reinstall Home Assistant and restore from the last backup?