Something (HACS integration) keeps on corrupting my HA database

This issue has been driving me crazy for 3 weeks and I just can’t find the root cause.

I have lost my HA SQLite db many times, and even with a new/clean database I get the problem within a week.

Some days I see a core dump but the database is ok. Other days I see a recorder ‘ended unfinished session’ warning in the logs and usually the database will be corrupt after this if I run the action to purge the db.

To troubleshoot, I removed all my HACS integrations one at a time to see if the core dump issue would stop. After many hours I thought I had traced the issue back to an integration (asus router), so I removed it fully from my system.

Everything was fine for 5 days, but then I noticed a core dump file and the ‘ended unfinished session’ warning. So I forced a purge and the database was corrupt again.
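One way to tell whether the database is already damaged before a purge is to run SQLite's built-in `PRAGMA integrity_check` against it. Below is a minimal sketch, assuming the recorder uses the default SQLite backend; the function name `integrity_check` and the example path are illustrative assumptions, not something from this thread. It is safest to run it on a copy of the file, or with HA stopped.

```python
import sqlite3
from pathlib import Path


def integrity_check(db_path: str) -> list[str]:
    """Return the rows reported by PRAGMA integrity_check.

    A single row containing "ok" means SQLite found no problems.
    """
    # Build a file: URI and open it read-only so the check cannot modify the file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return [row[0] for row in conn.execute("PRAGMA integrity_check")]
    finally:
        conn.close()
```

For example, `integrity_check("/config/home-assistant_v2.db")` (the path is an assumption, adjust it to wherever your container mounts the config). Anything other than a single `ok` row would suggest the file was already corrupt before the purge ran.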

I have already lost all of my historic data so I am looking for tips/suggestions from the community for a fresh start or way to find the root cause.

I am thinking that I should start entirely new and recreate everything from scratch.

Any other ideas or suggestions?

Hello jata,

What server hardware and installation type are you running?
For instance, I can think of several things a rasPi could do to cause these issues.

Thanks @Sir_Goodenough

I have HA running in a container, and I have two main systems: an RPI5 and an N100 minipc. Both setups use a good quality SSD as the system disk, formatted ext4, with SMART monitoring enabled.

I use the RPI5 dedicated to HA and related apps, all configured as containers in docker (e.g. influx, grafana, mqtt, nodered). I use an RPI5 power supply and have no additional disks etc.

I tried moving HA to the minipc and I still had the issue, but I might have copied the issue over, as I restored from a backup and then created a new db.

Both of these systems are well looked after by me and everything else is working totally fine.

For a test I guess I could:

  1. move my prod HA container to the minipc
  2. use the RPI as a cloned/parallel test system
  3. monitor both and see if just the RPI or both systems fail

I have now configured my system based on the above scenario.

HA is now migrated to my N100 minipc.

I have a cloned copy of the config and db on my RPI5.

Both are running and working normally so I will now monitor for the next few days to see if I see anything.
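To make the monitoring on both machines a bit less manual, something like the sketch below could be run against each system's log and config directory. It assumes the warning text in the log contains the phrase quoted earlier in this thread, and that core dump files have names starting with `core`; both the helper names and that filename pattern are assumptions and may need adjusting.

```python
from pathlib import Path

# Phrase quoted from the recorder warning; matched case-insensitively.
MARKER = "ended unfinished session"


def find_unfinished_sessions(log_text: str) -> list[str]:
    """Return every log line that mentions the recorder warning."""
    return [line for line in log_text.splitlines() if MARKER in line.lower()]


def find_core_files(directory: str) -> list[str]:
    """Return names of files in `directory` whose name starts with 'core' (assumed pattern)."""
    return sorted(p.name for p in Path(directory).glob("core*") if p.is_file())
```

Running both helpers on the minipc and the RPI5 once a day, and noting which machine reports something first, would show whether only the RPI5 or both systems fail.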

I have a feeling it is something to do with the RPI. I am thinking:

  1. the SSD is dying
  2. the base Pi OS install on the SD card
  3. the swap file that I have configured on the SSD
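For the swap file suspicion, a first step could be confirming which swap areas are actually active and how heavily they are used. The sketch below parses the text of `/proc/swaps` on Linux; the function name `parse_proc_swaps` is made up for illustration. It only reports where swap lives and how much is in use, so it cannot prove or rule out a failing SSD by itself.

```python
def parse_proc_swaps(text: str) -> list[dict]:
    """Parse the contents of /proc/swaps into one dict per active swap area."""
    entries = []
    # The first line is the column header: Filename, Type, Size, Used, Priority.
    for line in text.splitlines()[1:]:
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        entries.append(
            {
                "path": parts[0],
                "type": parts[1],  # "file" or "partition"
                "size_kib": int(parts[2]),
                "used_kib": int(parts[3]),
                "priority": int(parts[4]),
            }
        )
    return entries
```

On the RPI5 this could be called as `parse_proc_swaps(open("/proc/swaps").read())`. Temporarily disabling swap on the cloned test system and seeing whether the core dumps stop would be one way to test that theory in isolation.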