Hi.
I’ve seen people report issues with the standard home-assistant_v2.db growing too large, which causes performance problems and, in some cases, reliability problems.
The only solution I’ve seen so far is to use recorder to limit the number of days of history and/or exclude some sensors from tracking, in order to keep the database size managed, which also keeps backups a reasonable size.
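For anyone who hasn’t touched recorder yet, that workaround looks roughly like this (the day count and entity name are just placeholders for whatever you want to trim):

```yaml
# configuration.yaml — keep only 7 days of history and skip a noisy sensor
recorder:
  purge_keep_days: 7
  exclude:
    entities:
      - sensor.example_noisy_sensor
```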
I was thinking that, rather than removing sensors and limiting history, this could be improved by having anything a day (or two) old and older offloaded in an overnight job to a secondary database for longer-term archiving.
This way we get the best of both worlds.
A lean, performant system as well as archival stats which we could call upon if required.
Thanks all for the replies @Pippyn + @robertklep. This suggestion would ideally be part of core so it works across all releases and allows for long-term data.
I don’t know the % split between people running Core and those on Supervised versions, but the suggestion was that the core release would manage it as standard so it’s available to all.
For example, I didn’t know about some historical issues or the recent improvements (or even what recorder did) until after I’d spotted my backups getting much larger… and then researched the subject.
I only spotted the size increase because I’m constantly tweaking as I slowly expand my setup, and once I did identify the cause, I had to intervene manually to resolve it by cutting down what I record and for how long.
@tom_l Yes, as part of my research I found the info on the big improvements that came in recent releases, which made the DB far more responsive as well as shrinking its overall size. I was thinking more of long-term data storage, where someone may also wish/need to monitor historical data over months/years.
I don’t know if the current setup would stand up to that, but it would make for biiig nightly backups.
The idea being to have the big historical database available to back up separately, so it could be done on a different schedule from the real-time database: weekly, monthly, or manually.
I’m still new to HA, but I’ve seen that there have been a lot of recent improvements to make it more user-friendly and robust, which hopefully gives HA a wider appeal.
This suggestion is made in the same vein.
Using an external database (MariaDB or InfluxDB) isn’t in any way limited to a Supervised/OS install; both of those can be used by any installation method.
You just have to set up your config to use them, just as you would when running a Supervised version of HA.
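As a rough sketch (the host, credentials, and database names below are placeholders, not recommendations), pointing recorder at MariaDB and enabling the InfluxDB integration looks something like:

```yaml
# configuration.yaml — recorder writing to an external MariaDB server
recorder:
  db_url: mysql://hass_user:password@192.168.1.10/homeassistant?charset=utf8mb4

# Optional: also mirror state history to InfluxDB (v1.x) for long-term storage
influxdb:
  host: 192.168.1.10
  port: 8086
  database: home_assistant
```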
I use both of those (though with very limited use of Influx, simply because I haven’t taken the time to learn it) and I run an HA Container install.
I get why the default location for the DB is on the host machine and inside HA itself. The “Target Audience™” has become less tech savvy, and most of those users couldn’t care less about the DB and don’t necessarily have another location available for the data.
But I don’t get why the long-term data can’t be handled the way the short-term recorder DB config can.
At this point (as far as I know) you have zero control over what long-term data is stored, where, or for how long.