Bit of a quandary tonight. My HA install (HAOS on an RPi4, running the latest version) stopped responding on the web UI and mobile UI, yet was still cheerfully running (i.e. I was still getting my normal Pushover alerts for CCTV and other automation events). Obviously, I’d love to find out what went wrong so I can either fix it or report it, but I can’t find any way to access the running instance’s logs without going through the UI. Power cycling the RPi (which I had to do in the end) fixes the issue short-term but gets me no closer to solving the underlying problem because, to my endless frustration, HAOS does not persist any logs between boots, making post-lockup fault finding extremely challenging.
What are other people doing to diagnose this sort of failure, and what could I do myself so I can diagnose failures more effectively in the future?
Thanks Tom. I already had the SSH add-on installed, but it wasn’t set to start on boot, nor did it have any keys configured (doh! and doh!). It’s now configured and working, so we’ll see what happens when (if?) it dies next time.
Still think that some sort of basic log rotation might not be a bad addition to the product…
Thanks again Tom - that’s actually solved my problem. Looks like my SSL cert was being reported as expired, despite having been renewed with the LE add-on last week, and this killed the UI, which wouldn’t even load far enough to show an error.
I suspect I hadn’t done a full restart after running the certificate renewal process, so HA had never loaded the updated files.
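For anyone hitting the same thing, here is a rough Python sketch for checking what certificate a running instance is actually serving, independent of what is sitting on disk. It uses only the standard library. The function names `days_remaining` and `check_cert` are my own, and port 8123 is assumed as the default; adjust for your setup. It is a starting point under those assumptions, not a definitive diagnostic.

```python
import socket
import ssl
import time
from typing import Optional


def days_remaining(not_after: str, now: Optional[float] = None) -> float:
    """Days until a certificate's notAfter timestamp (negative if already past)."""
    # notAfter strings look like "Jan  5 09:34:43 2018 GMT"
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400


def check_cert(host: str, port: int = 8123, timeout: float = 5.0) -> str:
    """Connect over TLS and describe the certificate the server presents.

    host/port are assumptions for illustration; use your own instance's values.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
    except ssl.SSLCertVerificationError as err:
        # An expired cert fails the handshake; the reason is in verify_message.
        return f"certificate rejected: {err.verify_message}"
    except OSError as err:
        return f"connection failed: {err}"
    return f"certificate valid for {days_remaining(cert['notAfter']):.1f} more days"
```

Calling `print(check_cert("your.ha.hostname"))` from another machine right after a renewal should show whether HA has picked up the new files. If it still reports the certificate as rejected or close to expiry, a restart is probably needed.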