How to debug crashes resulting in a completely unresponsive system

Dear Community,

For a while I’ve been having random crashes that leave my entire system unresponsive. The only fix is to power cycle the device, and even then Home Assistant sometimes fails to boot afterwards and returns a 500 server error. When that happens I have to power cycle a second time, after which HA boots up normally.

I can’t even SSH into the machine anymore (through the Advanced SSH & Web Terminal add-on).

I am running a fairly straightforward installation:

  • Home Assistant OS
  • Home Assistant Yellow
  • RPi CM4 4GB
  • running on a 1TB Samsung SSD

I have disabled all my optional add-ons and have been running for a week with just:

  • Mosquitto
  • Z-Wave JS
  • Zigbee2MQTT

And already I have had 2 crashes.
My entire setup used to run rock-solid on an RPi 3; at one point it was online for 2 years without any issue. At the end of last year I upgraded to a Yellow (by making a backup and restoring it during installation), and ever since I have had these random crashes.

How can I find what is wrong? Could it be a hardware issue?
