I added a FR for the same thing 4 days ago here:
I’ll vote for yours if you vote for mine.
JK, I already voted for yours too.
Absolutely, I already do this. I’ve been using MySQL to create the long-term stats and then store them in a new database. Then I can use them as SQL sensors, or view the data as a whole through Grafana.
Surely it is easy to just copy two tables to another database once an hour? Then you always have something to restore from.
Does a separation into two databases really solve the root of the problems?
Yes, I think so.
My “short-term” DB for an interval of 14 days contains all measurements for a gazillion entities. It is over 5 GB in size. I don’t care to back it up. If it’s gone, it’s gone. A new one will be complete within another 14 days.
My long-terms are really precious. These are only a handful and AFAIK they only create a min/max/avg value triple each hour. So that db should be small and would always be included in my backups.
Yes, that’s on my todo list, and I know that InfluxDB will accomplish the same result. But my proposal is for people who want to use Home Assistant as it is now, with the long-term statistics platform, which isn’t bad. But data integrity is at risk, to say the least. People who use the statistics expect their data to be safe.
Of course I’ll do it as soon as I post this reply
That’s interesting! How do you do that? Triggers that insert data into the other DB? Or is there another way to do it in MySQL?
It’s a core feature. It just works out of the box. I’m able to manage tables, select, insert, and so on. But not everyone is an expert in database management, and their data integrity is a priority too.
No, it doesn’t resolve the root of the problems. But it is a very good architecture. Even in business software, the DB holding data for data mining or business intelligence is normally a standalone database, not some tables inside another DB…
Exactly! You got the point!
I agree that it is a good pattern. It would minimize the risk of a database issue, since only errors in some part of the code (the part that maintains the min/max/avg value triple each hour) would destroy the long-term database. But those parts can also have errors and would then affect the long-term database.
I think the basic assumption here is that most of the errors are part of the management of the short-term database, and therefore the risk would decrease.
I am a bit sceptical about why we have to clear the database so often. The database is normally the most stable part of the whole system, not the other way around.
The problem is that the default DB is SQLite, which uses a normal file on the filesystem. If you don’t change the recorder commit_interval parameter, the DB is updated (= a file write) every second. If you consider that most users (>57%) run Home Assistant on a Raspberry Pi, and most of them store data on an SD card or on an SSD without TRIM activated, you can understand that DB corruption is around the corner.
Furthermore, it happens that some updates or snapshot restores create problems and require a DB wipe.
So database splitting would be a much better strategy for every user. You could then back up the long-term DB but not the short-term one, keeping your backup (aka snapshot) much more compact.
Then, if you want a more stable DB, you have to change the commit_interval to something longer (not too much) and use an external RDBMS like MariaDB or MySQL.
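For reference, a minimal sketch of what that could look like in configuration.yaml, assuming an external MariaDB/MySQL instance (host, user, password and database name are placeholders):

```yaml
# configuration.yaml (sketch; credentials and host are placeholders)
recorder:
  # point the recorder at an external MariaDB/MySQL server instead of the SQLite file
  db_url: mysql://hass:mypassword@192.168.1.10/homeassistant?charset=utf8mb4
  # commit to the database every 30 s instead of the default 1 s,
  # greatly reducing write wear on SD cards and non-TRIMmed SSDs
  commit_interval: 30
```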
Yeah, I’ve created events that run at a particular time, mostly at midnight. They parse out various things like min/max/avg house temperature, mains min/max voltage, and internet speedtest results, calculate the aircon running time for the day, and then insert them into a new database. I just use MySQL Workbench to do it, easier than working from the CLI.
I then pull this back into HA with a SQL sensor, and also view the long-term data through Grafana.
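In case it helps anyone, this is roughly how a SQL sensor pulling one of those pre-aggregated values back into HA could look; the db_url, table and column names here are made up for the example:

```yaml
# configuration.yaml (sketch; the long-term DB, table and column are just examples)
sensor:
  - platform: sql
    db_url: mysql://hass:mypassword@192.168.1.10/longterm_stats
    queries:
      - name: House temperature daily max
        # fetch the most recent daily maximum written by the midnight job
        query: "SELECT max_temp FROM daily_house_temperature ORDER BY day DESC LIMIT 1;"
        column: "max_temp"
        unit_of_measurement: "°C"
```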
TRIM is new to me. It seems not to be activated by default in Home Assistant OS. I found this issue about the topic.
I am wondering how to activate it manually.
Ahah, no, the issue was not written by me, but the last comment (energywave) is mine. You can read my blog post with detailed instructions on how to activate TRIM; it totally works. Just pay attention that not all controllers/SSDs support it. You can use Google Translate to translate it from Italian.
But better not to go further OT with that here.
Voted yes.
And a separate long-term database should support a completely different DSN so it can be run on a remote instance. It should also store state and attribute data with proper DB-native data types instead of everything as varchar holding an enormous string of unparsable, space-wasting JSON. Separate tables for states, entities, attributes, etc., all linked together via linking tables. As in, a proper and efficient relational database that offloads query work to the database engine, not program code.
And hell, maybe the same thing can be applied to the recorder database as well.
I fully agree with fixing the database’s problems, but this could also be done in ways other than duplication; in short, its integrity must be made safe. A serious home automation system cannot afford to lose even one bit, especially from that database.
I really would like a way to separate data into separate databases
There’s lots of long-term data I do want, but also a lot of fluff data that I do not want getting backed up in the daily backups.
If we could have multiple recorders set up, that would be fantastic.
But currently only InfluxDB and one other type of database can be used at the same time, unless you use Node-RED.
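It’s not the multiple-recorder setup asked for above, but for what it’s worth the existing recorder exclude options can at least keep the fluff out of the single database (and therefore out of the backups). A sketch, where the domains and entity globs are only examples of “fluff”:

```yaml
# configuration.yaml (sketch; domains and entity globs are just examples)
recorder:
  exclude:
    domains:
      - automation
      - updater
    entity_globs:
      - sensor.*_signal_strength
      - sensor.*_uptime
```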
InfluxDB requires more resources; I think a Raspberry Pi alone isn’t sufficient to keep the whole system working fine!
My current SQLite DB has problems and I can’t delete it without losing statistics. Migrating the 16 GB to MySQL is a real PITA. I’m completely lost on how to solve this bug.
“problems” is not going to get you help.
The normal solution to a database “problem” is to delete it and let HA create a new one. That also deletes the statistics. That’s why I voted here.
I wasted many days trying to find the cause, and also tried to migrate the 16 GB to MySQL, which is a real challenge (it would take days to import).
A second database for statistics would solve it. Or an option in HA to delete all tables except statistics. If I try that through sqlite3, Home Assistant recreates a completely new database…
Edit: with trial and error on the new recorder purge options I managed to trim the SQLite database down from 16 GB to 1 GB. Hopefully the new purge options keep the database small and tolerable for restores (mine caused havoc once I tried to restore it from a backup file). I really hope Home Assistant makes some changes to how statistics are stored; I would prefer an external MySQL or MariaDB. The current setup makes me very unhappy whenever I need to rely on restores from HA backup files, and my SSD will fail sooner or later.
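The purge options I mean are roughly these; the 7-day value is only an example, not a recommendation:

```yaml
# configuration.yaml (sketch; 7 days is only an example value)
recorder:
  auto_purge: true       # run the nightly purge automatically
  purge_keep_days: 7     # keep only a week of short-term history
```

To actually reclaim the space in the SQLite file after purging, you can also call the recorder.purge service with repack: true from Developer Tools.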
Has anyone tried to use the rrd-tool custom integration?