Sudden Catastrophic Failure

Got up this morning and no home automation was working. When I looked at the HA web interface, all my devices were grayed out. I SSH'd into the HA rPi, tried to restart the core, and got:

Error: Can't create container from homeassistant: 409 Client Error for http+docker://localhost/v1.41/containers/create?name=homeassistant: Conflict ("Conflict. The container name "/homeassistant" is already in use by container "a19d8f32328b9d41a6bc2e54d61d917b0427ae249c52f2f1e0f7a55cc8af453d". You have to remove (or rename) that container to be able to reuse that name.")
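For anyone hitting the same 409 conflict: the message means a container named homeassistant still exists, so a new one cannot be created under that name. Below is a rough sketch of how one might pull the container ID out of the error and list the commands to inspect and remove the stale container. It assumes a shell with access to the docker CLI on the host, which a normal add-on SSH session may not have. The helper name extract_container_id is made up for illustration. The commands are only printed, not executed. Removing a container does not delete its image, and on a Supervisor-managed install the configuration normally lives outside the container, but treat that as an assumption and take a backup first if you can.

```shell
#!/bin/sh
# Hypothetical helper: pull the first 64-character hex container ID
# out of a Docker "409 Conflict" message.
extract_container_id() {
  printf '%s\n' "$1" | grep -oE '[0-9a-f]{64}' | head -n 1
}

# Conflict text copied from the error above.
err='Conflict. The container name "/homeassistant" is already in use by container "a19d8f32328b9d41a6bc2e54d61d917b0427ae249c52f2f1e0f7a55cc8af453d".'
cid=$(extract_container_id "$err")

# Print the commands instead of running them, so nothing is removed by accident.
# Assumption: the docker CLI is reachable from this shell.
echo "docker ps -a --filter id=$cid"   # inspect the leftover container
echo "docker rm -f $cid"               # remove it so the name is free again
echo "ha core restart"                 # then ask the Supervisor to start the core
```

If the container turns out to be running and healthy, removing it is probably not the right move. The conflict may then just be a symptom of the Supervisor losing track of it.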

Running ha host reboot knocked me off SSH immediately but didn’t reboot the system; I had to physically pull the plug on the rPi to get it back up. It is back up now. It started out extremely sluggish but is beginning to settle down. Any idea what could cause this?

It’s not on SD; it’s booting from a brand new SSD, so it’s not an SD failure.

From the logs, it occurred at about 3am, when no automations are running and nothing is triggering anything. The logs look 100% normal; it just crashed.

Same for me when trying to update from 2022.10.5 using “ha core update --version=2022.11.5”.

Nothing new:

@CO_4X4 How did you solve it? Hardware issue? Or also Supervisor going crazy?


Update: Nothing helped.

  • Even ha su repair timed out with Post "http://supervisor/supervisor/repair": context deadline exceeded (Client.Timeout exceeded while awaiting headers).
  • Also ha host reboot gave Post "http://supervisor/host/reboot": context deadline exceeded (Client.Timeout exceeded while awaiting headers), so I really had to shut down the host the hard way (power cable - pull, wait, plug) for the first time in more than 2 years.
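Both timeouts above come from the ha CLI waiting on the Supervisor API. A quick way to check whether the Supervisor answers at all, before trying repair or reboot through it, might look like the sketch below. The probe function is a made-up helper, and the 10 second limit is an arbitrary choice. It assumes a timeout command that takes the duration as its first argument, as in GNU coreutils; older BusyBox builds use a different syntax.

```shell
#!/bin/sh
# Hypothetical helper: probe SECONDS CMD...
# Runs CMD, gives up after SECONDS, and reports the outcome.
probe() {
  secs=$1; shift
  if timeout "$secs" "$@" >/dev/null 2>&1; then
    echo "ok"
  else
    echo "failed or timed out"
  fi
}

# Only query the Supervisor if the ha CLI is actually present.
if command -v ha >/dev/null 2>&1; then
  echo "supervisor: $(probe 10 ha supervisor info)"
fi
```

If the probe reports a failure, commands that go through the Supervisor (ha su repair, ha host reboot) are likely to hang the same way, which matches what happened here.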

Lesson learned: Dear Supervisor, if you’re screwed, everything is screwed.