Periodic I/O issues on HA OS host

I am running HA OS on a Dell OptiPlex 3080 Micro form factor PC, which I migrated to from a Docker install on a different host in June 2024. Since installation, at random intervals (every 1-20 days), Lovelace hangs when I try to reload the UI in the browser. If I happen to have an instance of the UI already loaded in the browser, sensor state history will not load, but sensor values continue to update in real time.

Plugging a monitor and keyboard into the HA OS host PC shows a series of I/O errors on the console (photo attached).

A few other observations that may or may not be relevant:

  • InfluxDB running on a different physical machine continues to update with new sensor data
  • Some automations work; others do not
  • Some devices continue to operate via the UI, e.g. Kasa light switches
  • Some devices do not operate, e.g. Zigbee smart buttons
  • The SSH add-on returns permission denied for new logins
  • SSH sessions that were already running before the issues started return “I/O error” for “ha” or shell commands like “ls”

I never had any of these issues when running HA via Docker for several years prior. Is this a periodic hardware failure? Has anybody had similar issues? It seems suspicious that the issues cropped up when I migrated to HA OS on new hardware, but the fact that they are periodic is puzzling.

Looks like you’re having a hardware failure. Your storage is either failing or about to fail.
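
If you want to double-check before replacing anything, one option is to copy the kernel messages off the console and scan them for block-device errors. Below is a rough sketch in Python, not an official HA tool. The pattern list, the function names, and the sample log lines are all assumptions made up for illustration (they are not taken from the attached photo), so adjust them to whatever your console actually prints.

```python
import re

# Phrases the Linux kernel commonly prints when storage is in trouble.
# Assumption: this short list is illustrative, not exhaustive.
ERROR_PATTERN = re.compile(
    r"I/O error|EXT4-fs error|Remounting filesystem read-only",
    re.IGNORECASE,
)

# Picks up the device name after "dev" or "device", e.g. "dev sda" or "(device sda8)".
DEVICE_PATTERN = re.compile(r"\bdev(?:ice)?\s+([a-z0-9]+)")


def find_storage_errors(log_text):
    """Return the log lines that mention a storage-related error."""
    return [line for line in log_text.splitlines() if ERROR_PATTERN.search(line)]


def count_by_device(error_lines):
    """Count error lines per device name exactly as it appears in the log."""
    counts = {}
    for line in error_lines:
        match = DEVICE_PATTERN.search(line)
        if match:
            name = match.group(1)
            counts[name] = counts.get(name, 0) + 1
    return counts


# Hypothetical sample text, written by hand to exercise the functions.
SAMPLE_LOG = """\
[   10.000000] usb 1-1: new high-speed USB device number 2 using xhci_hcd
[   99.100000] blk_update_request: I/O error, dev sda, sector 123456
[   99.200000] Buffer I/O error on dev sda8, logical block 42, lost async page write
[   99.300000] EXT4-fs error (device sda8): Detected aborted journal
"""

if __name__ == "__main__":
    errors = find_storage_errors(SAMPLE_LOG)
    for line in errors:
        print(line)
    print(count_by_device(errors))
```

If most of the hits point at the same disk or its partitions, that supports the failing-storage theory. It would not rule out a bad cable or controller, though.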

I swapped out the SSD and it’s been stable since early August. Looks like it was indeed failing hardware.

Awesome! Happy you got it working!