What is the most robust database configuration?

The subject says it all really.

For background, I’m running HAOS in a VM on Linux. I update everything as soon as an update is released. This morning I noticed that all my history was gone - it was as if I had installed Home Assistant at 4 AM today, when it’s actually been running for several months. I did some reading and found that it was probably due to a corrupted database. As suggested, I checked my logs and sure enough found entries pointing to a corrupted database. I tried following some of the recovery tutorials but got errors about missing tables in my “corrupted” copy of the database. I gave up and did a restore of “Settings and History”, and this worked. I now just have a gap of about a day in my history, rather than a totally missing history.

While reading up on how to fix this issue, I saw some people say that it’s all SQLite’s fault and that they migrated to MariaDB. But when I searched for a tutorial on migrating from SQLite to MariaDB, all I found was people having trouble with their MariaDB implementations.

Is there a tutorial somewhere on creating a robust database configuration for Home Assistant? My setup is still new/small and losing my history now is not a big deal, but I can imagine that in the future I might be quite upset to lose years of data.

I don’t know what’s best, but I’ve used MariaDB as an add-on for years without significant issue. I keep about 4 GB / 100 days of history online. I plan on starting to use InfluxDB again for long-term stats. InfluxDB is only a secondary history store, as it samples data.
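A minimal sketch of what that looks like in configuration.yaml, assuming the official MariaDB add-on; the database name, user, password and retention are placeholders you’d adapt to your own setup:

```yaml
# Point the recorder at the MariaDB add-on instead of the default SQLite file.
recorder:
  db_url: mysql://homeassistant:YOUR_PASSWORD@core-mariadb/homeassistant?charset=utf8mb4
  purge_keep_days: 100   # keep roughly 100 days of detailed history online
```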

For safety you want to use data storage that is supported as part of core, so there is not much choice. But running InfluxDB as an independent second DB, not an add-on, gives you better backup options as it grows large.
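A minimal sketch of the influxdb integration pointing at a standalone InfluxDB 1.x server; the host, database name and credentials are placeholders:

```yaml
# Mirror state changes to an external InfluxDB instance (secondary, sampled history).
influxdb:
  host: 192.168.1.50          # placeholder: IP of the standalone InfluxDB server
  port: 8086
  database: home_assistant    # placeholder database name
  username: homeassistant     # placeholder credentials
  password: YOUR_PASSWORD
  max_retries: 3
  default_measurement: state
```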

Also, excluding unneeded high-frequency history (in configuration.yaml) is a good practice, as is disabling frequently updated entities created by integrations that you don’t need. A bunch of sensors tracking voltage, power and energy changes that will never be used is common, and it simply creates a lot of unnecessary traffic and data storage.
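For example, something like this in the recorder block of configuration.yaml; the entity names and globs here are only illustrative, you’d match them to whatever your integrations actually create:

```yaml
recorder:
  exclude:
    domains:
      - update              # e.g. skip update-entity chatter
    entity_globs:
      - sensor.*_voltage    # drop high-frequency electrical sensors you never chart
      - sensor.*_power
    entities:
      - sensor.last_boot    # individual noisy entities you don't need history for
```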

If you plan on keeping more than 4 GB in it, nothing really. Below that, HA is optimized to work best with the default SQLite.
HA is not designed to work efficiently with any huge database. If you want to keep all of your history, push it to something else and keep the moment-to-moment data fast and nimble in the DB that HA runs from.
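One way to keep the live database small while pushing long-term history elsewhere is a short purge window in the recorder (the values here are just an example):

```yaml
recorder:
  purge_keep_days: 10   # keep only recent detail in the live database
  auto_purge: true      # nightly purge; this is the default behaviour
```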


I don’t know what I plan to keep, but I do know my current database is only 76 MB! I guess there’s no way to figure out how or why it got corrupted now that I’ve done the restore.

Chances of a VM messing that up are very low, so I have no idea.

Your hardware is flaky (possibly RAM or PSU).

SQLite is the most tested database by coverage. It’s used in many commercial jets, all phones, and browsers.

When I searched for the problem I was seeing, there were a lot of others in the same situation. Do we all have flaky hardware? Is it not possible that Home Assistant has a bug that is corrupting the database in software?

Lots of DB corruption comes from worn-out SD cards (not your case) or from not carefully reading the release notes (DB schema updates that get interrupted by an HA restart before they finish). It’s also true that usually only the most significant DB schema changes are listed in the release notes; some smaller ones aren’t mentioned, or are buried in dependency bumps. I’ve found it to be good practice to leave HA alone for a while after an upgrade, check the log file for any information about a DB update, and wait for that process to finish before any further restart.
That said, SQLite is now heavily optimized for HA and safe to use. Any engine updates also come bundled with updates of HA itself. By contrast, I recall how many problems people faced when it was announced that MariaDB needed to be upgraded to at least a specific version, otherwise an HA update might cause serious performance issues. And it wasn’t easy to do for every installation, as lots of people used a Synology package that wasn’t available in the required version at the time. Migrating to, say, Docker then required either giving up historical data or doing some trickery to move data between different MariaDB instances.

Yes, flaky hardware. Most cases stem from shitty SD cards (they tend to be shitty overall compared to other storage like SSDs and M.2 drives).