I occasionally run into an excessive database file size. Combing through the community provided a wealth of insights in scattered places. This is what I learned while taming the recorder.
Problem Description: I operate a Raspberry Pi 3 with Home Assistant OS and the built-in SQLite database, managing 1000+ entities. While performing some tests I ran into excessive use of the recorder, resulting in a >1 GB file size for home-assistant_v2.db. This led to the following problems: HA crashed while compiling ESPHome, and creating a full snapshot was not possible.
To understand the reasons and to find a remedy I made use of the following tools: Template Editor, sensors (filesize, systemmonitor, sql, influxdb), plug-ins (Samba Server, SQLite Web, InfluxDB), Linux tools (top, du), and Win10 tools (sqlite3_analyzer.exe).
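As a side note, the filesize sensor mentioned above is what I use to watch the database size from within HA. A minimal sketch, assuming the default SQLite path on Home Assistant OS (the directory must also be allowed via allowlist_external_dirs):

# configuration.yaml (sketch; assumes the default database location)
homeassistant:
  allowlist_external_dirs:
    - /config
sensor:
  - platform: filesize
    file_paths:
      - /config/home-assistant_v2.db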
I learned to differentiate between three time frames: runtime, short-term recording, and long-term storage.
To understand the runtime load, I counted the entities and sensors with the Template Editor (in my case the result was 1066 and 437):
Current number of entities = {{ states | count }}, and sensors = {{ states.sensor | count }}
Short term recording: My recorder is set to purge every day, thus storing every non-filtered state change and event for at least 24 hours. The SQLite Web plug-in gives answers about the load. The first two queries return the total number of states and events:
select count(*) from states
select count(*) from events
To identify the entity_ids that generate the most load:
SELECT entity_id, COUNT(*) as count FROM states GROUP BY entity_id ORDER BY count DESC LIMIT 10;
A list of all monitored sensors:
select distinct entity_id from states WHERE domain = 'sensor' order by entity_id
Show the latest 10 entries for a specific sensor. Looking at the difference between timestamps, one can determine its updating interval.
select state, created from states WHERE entity_id = 'sensor.ferres_timestamp_delta' ORDER by state_id DESC LIMIT 10;
Long term monitoring: I found InfluxDB very useful. The following experiment compares the internal load (purple) and the database file size (blue) over a period of two days. The peak load was 2M states and 1M events, consuming 1.4 GB of space. Two morning purges (at 4:00) can be seen; however, these do not reduce the size of the database. Only a call to the recorder.purge service with the option repack: true reduced the size to 0.3 MB (at 11:00).
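For reference, this is the service call I mean, e.g. as entered in Developer Tools → Services (a sketch; keep_days is optional and shown only as an illustration):

service: recorder.purge
data:
  repack: true
  keep_days: 1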
My conclusions:
- Separate short-term and long-term requirements. Deploy different database machines.
- Identify the most harmful recorder contributors in both the states and events tables with respect to the number of involved entities, their update frequency, and their keep duration. This led me to the following strategy (see the sketch after this list): a) reduce the recorder candidates with very stringent whitelist filtering, b) reduce the update frequency of sensors at the origin (workaround: automate saving sensor data to an input_number entity at suitable intervals), c) reduce the recorder keep time to a minimum.
- Erase intermediate recordings (e.g. from tests) manually with purge/repack.
- Consider using notify (platform file) as an alternative (see the second sketch below).
- Keep a fresh database for later usage. In an emergency, replace databases with file system commands while HA is frozen by the service call homeassistant.stop.
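To illustrate points a) and c), a minimal recorder configuration sketch; the entity and domain names are hypothetical placeholders, and purge_keep_days: 1 corresponds to the daily purge mentioned above:

# configuration.yaml (sketch; entity names are placeholders)
recorder:
  purge_keep_days: 1
  include:
    domains:
      - input_number
    entities:
      - sensor.example_temperature
      - sensor.example_power

And a sketch of the notify (platform file) alternative, which appends selected values to a plain text file outside the recorder (the name and filename are hypothetical):

notify:
  - platform: file
    name: long_term_log
    filename: /config/long_term_log.txt
    timestamp: true

Such a notifier can then be targeted from an automation via notify.long_term_log to append one line per value of interest.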
[edit] I ended up reducing the initial database size by 99%, without losing any of my desired data.