Thanks. Nothing in there really jumps out as a possible culprit though - the system had been running on the same hardware, with the same peripherals, and the same configuration (except regular updates to the latest HA version) for months without issues, and it’s running without issues now. Other than a ~2 hr gap in sensor data and a ~24 hr gap in system logs, there is no indication of anything ever being wrong.
I had some issues before with the SSD, so it stands to reason that this could be my problem. It just seems very strange that almost a day of logs disappeared while HA was still acting perfectly normal and dutifully recording sensor data.
You're not missing logfile data. The current logfile is created on reboot. The old logfile is home-assistant.log.1
Pulling the power on a system without a proper shutdown can corrupt the file system - something that may not be immediately noticeable but may cause problems in the future.
Hardware does go bad. Power supplies, SD cards, SSDs etc. It may not be consistent how this presents itself but that does not mean that there is not an issue.
I am talking about the OS level logs pulled via journalctl. There is nothing between Jul 14 21:40:12 and Jul 15 20:48:39. home-assistant.log[.1] has nothing that would shed light on why the system crashed.
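For anyone wanting to pin down exactly where a journal goes quiet, here is a minimal sketch. It assumes the journal was exported one JSON object per line (for example with `journalctl -o json`), where each entry carries `__REALTIME_TIMESTAMP` in microseconds since the epoch. The function name `find_gaps` and the one-hour threshold are my own choices for illustration, not anything provided by HA or systemd.

```python
import json
from datetime import datetime, timezone


def find_gaps(lines, min_gap_seconds=3600):
    """Return (last_seen, next_seen) UTC datetimes for every pair of
    consecutive journal entries more than min_gap_seconds apart."""
    gaps = []
    prev = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        entry = json.loads(line)
        # journald exports the wall-clock timestamp as a string of microseconds
        ts = int(entry["__REALTIME_TIMESTAMP"]) / 1_000_000
        if prev is not None and ts - prev > min_gap_seconds:
            gaps.append((
                datetime.fromtimestamp(prev, tz=timezone.utc),
                datetime.fromtimestamp(ts, tz=timezone.utc),
            ))
        prev = ts
    return gaps
```

Feeding it the lines of an exported file (e.g. `find_gaps(open("journal.json"))`, file name hypothetical) lists each silent stretch. It only shows where the logging stopped and resumed, not why.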
I agree that something is an issue, and it may well be the HW… but without logs it’s hard to figure out what, and all I can do is randomly replace hardware and hope the problem goes away… (though the SSD is probably a good start given past problems, so I may do that).
I’m using a CM4 on a Waveshare PoE UPS Base Board, with an M.2 NVMe connector. As mentioned, the SSD / board combination has not been the most reliable, so it’s definitely a suspect.
Thanks! Didn’t know about that, enabled debug as described.
I am still scratching my head over how a day’s worth of logs could just disappear. It’s apparently not unheard of with systemd. Not great; keeping the logs intact is kinda important…
I agree with that.
My HAOS on an RPi 4 freezes about once every 15 days. Unplug and rep… do the thing, but the logs are gone.
I'm thinking of a script that copies the logs every so often, so I can back up the files and look at them later…
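A rough sketch of what such a script could look like, assuming it runs somewhere that can read the HA log files and write to a backup folder. The function names, the `home-assistant.log*` pattern, and the 15-minute interval are placeholders I picked; the paths you pass in depend entirely on your setup. Note that it only copies files, so it would not capture the OS-level journal, and a copy on the same failing disk may not survive either.

```python
import shutil
import time
from datetime import datetime
from pathlib import Path


def snapshot_logs(source_dir, backup_dir, pattern="home-assistant.log*"):
    """Copy every file matching pattern into a new timestamped folder
    under backup_dir and return that folder."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(backup_dir) / stamp
    target.mkdir(parents=True, exist_ok=True)
    for log_file in sorted(Path(source_dir).glob(pattern)):
        if log_file.is_file():
            # copy2 also preserves modification times, handy when reading later
            shutil.copy2(log_file, target / log_file.name)
    return target


def run_forever(source_dir, backup_dir, interval_seconds=900):
    """Take a snapshot, sleep, repeat (hypothetical default: every 15 minutes)."""
    while True:
        snapshot_logs(source_dir, backup_dir)
        time.sleep(interval_seconds)
```

Each snapshot lands in its own folder, so after a freeze the newest folder holds the last state that was copied before things went wrong. The folders pile up over time, so some cleanup of old snapshots would be needed on top of this.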