I was getting a huge, huge sqlite3 recorder database, with only storing a couple of days of data. Wonder why that is?
So I created a some sensors to get some visibility over time:
sensor:
- platform: command_line
command: "stat -f '%z' /Users/louie/homeassistant/config/home-assistant_v2.db"
unit_of_measurement: bytes
name: HASS database size
scan_interval: 120
- platform: command_line
command: /opt/local/bin/sqlite3 -separator '' -noheader -list /Users/louie/homeassistant/config/home-assistant_v2.db 'se\
lect count(*) from states;'
name: HASS database states
scan_interval: 120
- platform: command_line
command: /opt/local/bin/sqlite3 -separator '' -noheader -list /Users/louie/homeassistant/config/home-assistant_v2.db 'se\
lect count(*) from events;'
name: HASS database events
scan_interval: 120
(I’m running Home Assistant on a Mac, so the various paths and stuff are going to be different than you might customarily see on a LInux box…)
At the start, I created only the first sensor to grab the size of the database file. Then I see these sensors, and see that my database gets up like 1.4GB in size. Hmm… Now, I happen to feed all this stuff from HASS into InfluxDB and I have grafana around, so magical plotting action and I see:
Which is a plot of the database size (scale on the left Y axis) and the rate data is being added (on the right Y axis scale.) You can see when the expiration happens in those steps once per day.
So I fiddled with some things to reduce the volume of stuff being logged, and you can see last few days are better. This came after some insight when I added the second two sensors to get a count of the number of rows in the “events” and “states” table.
Now, looking at the last few days of action you can see that I was running with more than 1.5 million rows in the “events” table, adding them at about 6 per second!
.
This prompted me to go prospecting in the database to see what all these events were, and it the result of the mqtt_statestream
component. Each time a message is transmitted to the MQTT broker, you can see a couple of service invocation events logged. I feed some real time power measurements, updated each second into Home Assistant. I already excluded those sensors from the recorder/history component, but the effect was still being felt because of this component.
So I disabled mqtt_statestream
and you can see the rate of new events added to the table (the blue dotted line) fall off immediately. Over the next few days the old event data will get aged out, and my database will return to a reasonable size.
This was an interesting exercise of using Home Assistant to diagnose itself, along with Grafana and InfluxDB. Hope this might prove helpful to others.