Since I moved my HA (2021.10.6) install off the SD card onto an SSD (restoring a full backup from the SD onto a new SSD install), I’ve had some issues.
The worst so far is that, for whatever reason, HA just dies every morning around 5-ish. The RPi4 can still be pinged, but Lovelace, HomeKit control, Zigbee buttons, all of that is dead. I have to power-cycle the device. I have yet to try to SSH into it, as I was keen to get it up and running ASAP. I’ll try that the next time this happens.
The logs restart after I kickstart it back to life (which is weird, isn’t it?), so there’s nothing to go on. This has never happened, not once, with the SD card. It always happens in the wee hours, which would be kinda weird if it was, say, a power issue.
So, questions:
what could this be?
how can I make sure I have logs to go on? I don’t remember having logging problems until this. I don’t have a specific logger: setting in my configuration.yaml, but I don’t use default_config: either, so maybe that’s why? I’ll start by adding logger: default: warning (correctly formatted!).
I’ve been on this install since 2019 without major issues until this. Thanks!
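For reference, a correctly formatted version of that setting would look roughly like this in configuration.yaml. This is only a minimal sketch: the `default: warning` line is what the post describes, while the `logs:` override for `homeassistant.core` is an illustrative assumption and can be dropped.

```yaml
# configuration.yaml
logger:
  # Log warnings and above for everything
  default: warning
  # Optional, illustrative override (an assumption, not from the original post):
  # a bit more detail from the core while chasing the hang
  logs:
    homeassistant.core: info
```

A restart of HA is needed for the change to take effect.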
EDIT: some more info:
using the official RPi4 3A power supply; the only other thing plugged in is a CC2531 Zigbee stick
SSD enclosure has the recommended ASMedia chipset, SSD is Patriot brand
no voltage warnings in logs
no CPU/RAM/temp spikes before it dies
it’s plugged into a UPS, so the power should be fairly smooth
Last logbook entry is 5:07, and then nothing until 7:17 when I manually powercycled the Pi. There are entries from 4 to 5 so that’s probably not it. Thanks!
I’ve rebooted once more since, so that is lost… How can I retain logs for longer? I’m looking at logger documentation but don’t see anything to that effect.
In the logs that I do have available, there is no voltage warning. I use the official RPi4 power supply.
Only the one previous log is saved. If you want more, it’s going to be a mess of file sensors and file notifications to create a copy of your log file. Not worth the effort. Just check the previous log file (home-assistant.log.1) after rebooting.
I am using the monitoring, yes. No spikes in CPU, RAM or temp before the die-off. I’ll add that to the post. Thanks.
The Pi has the obligatory CC2531 connected, nothing else beside the SSD. No voltage warnings.
Reddit tells me the USB 3 controller may overheat and cause a hang. It doesn’t have a sensor. I’ll try to stick a heatsink on it. Jesus, this SSD on RPi stuff is picky.
Are you using one of the known working adapters? Can you disconnect the Zigbee dongle for the night? The original Pi power supply is the minimum recommended supply for a Pi 4 with an SSD.
I also have a similar problem: it hangs every other day at 6:11 am. Yesterday I increased the swap size, and today it didn’t stop at the scheduled (!) time; it’s still running. I’ll observe it for a week and see.
That uses an NVMe drive? IIRC they draw more power under load than a SATA version. If you have a powered USB hub, you could use that to power the Zigbee dongle.
It happened again, at the same time. Nothing of interest in the log, no apparent spike of any kind. I’m going back to the SD card as I can’t deal with this right now (it’s a bitch to diagnose as it only ever happens at 5AM, and there’s a toddler involved who absolutely requires a working Zigbee device around that time). I’ve ordered (another) powered USB hub and some heatsinks and will give it a go later. If I have to choose between an “unreliable” solution that’s been working flawlessly for 2+ years (I have backups) or a “reliable” one that dies for no apparent reason every day, I’ll take the former.
update:
It’s been running without any hangs for 5 days since I increased my swap from 1GB to 3GB. I did see some spikes in swap usage, and its current value is actually higher than the default size. Perhaps you can give increasing the swap size a try.
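To see whether swap is filling up before a hang, one option is to track it with the System Monitor integration, which on HA releases from around that time is configured in YAML. A minimal sketch, assuming no `systemmonitor` platform is set up yet (the choice of resources here is only an illustration):

```yaml
# configuration.yaml
sensor:
  - platform: systemmonitor
    resources:
      # Swap usage as a percentage and as an absolute amount
      - type: swap_use_percent
      - type: swap_use
      # RAM usage, to see whether swap grows as memory fills up
      - type: memory_use_percent
```

The resulting sensors show up in the history graphs, so the values just before the last logbook entry can be checked after a restart.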