How to reduce your database size and extend the life of your SD card

tescophil · June 28, 2020, 7:41pm

Put the DB in memory and stop ALL writes to the SD card…

recorder:
   db_url: 'sqlite:///:memory:'

Simple as that…, I have a 1 day purge setting and cut the recorder down to what I actually need to see, which makes it about 100MB. This also produces huge performance gains vs. any other kind of DB implementation.

PS, Set your log level to critical and further reduce writes

ffeingol · June 28, 2020, 9:02pm

Doesn’t that cause your database to be totally wiped out if you reboot your server or restart Home Assistant? From the doc: “The database ceases to exist as soon as the database connection is closed.”

firstof9 · June 28, 2020, 11:11pm

Yes but he’s also set his purge to 1 day. Likely logging relevant data to influxDB, just a guess here.

tescophil · June 29, 2020, 6:13am

Yes, the database is wiped on restart, but so what ? The DB contains only historical log data and absolutely nothing relevant to the operation of the system (in my case at least). I use the logbook for debug purposes only and the history view is interesting, but not functionally necessary.

bschatzow · June 29, 2020, 12:49pm

I switched to mariadb and can’t find the db file either with Samba or ssh. I would like to find it before I start trying to reduce its size. I used the default settings to install mariadb. Thanks

francisp · June 29, 2020, 1:43pm

It is not easy to find. There is a topic about it :

bschatzow · June 29, 2020, 1:56pm

Found it using add on phpmyadmin. Also if you have the ssh / terminal add on and have protection mode disabled you can find it by the following:

docker ps you need to find the id of the maria container
2.docker exec -it fbd81dccedd4 /bin/bash
cd data/databases
4 . du -h homeassistant - this gives the size of the folder where the database is stored
or cd homeassistant
ls -hal - this gives the individual files in the folder.
Type "exit " to get out of the bash.

bdraco · June 30, 2020, 5:23pm

If you get a chance, please keep track of your database size before updating to 0.112 and then check again 11 days later to compare.

If you have a large database > 16GiB, please be aware that it may take a while for the 0.112 update to complete the database schema changes.

MarkB1 · June 30, 2020, 5:37pm

Any idea when this will be released ?

bdraco · June 30, 2020, 5:48pm

Tomorrow if all goes well.

ffeingol · June 30, 2020, 8:59pm

If you don’t compact your database (whatever is appropriate for the database platform that you are using) would you even notice a change? I’m more familiar with with SQLite and MySQLand they won’t release unused space unless you vacuum (SQLite) or optimize (MySQL).

bdraco · June 30, 2020, 10:01pm

If you use the recorder.purge service with repack the database will get cleaned up with SQLite, PostgreSQL, MySQL and MariaDB. Please note that 0.112 is the first version to support MySQL and MariaDB with repack.

The default auto_purge will not repack

firstof9 · June 30, 2020, 10:14pm

So far with the tweaks from this thread and the 0.112 beta, my database size for my 14 day purge interval is down to 2.8GB from 6.6GB and looks like it’s getting lower.

MarkB1 · July 1, 2020, 8:50am

Is this the correct format for my exclude.yaml?

      - binary_sensor.alarm_siren_son_sv_status      
      - sensor.esp_connected_bssid_alarm_siren_son_sv
      - sensor.esp_connected_ssid_alarm_siren_son_sv

six spaces, that’s what i copied out of my config.yaml

mbuscher · July 1, 2020, 3:38pm

You could use SQL to get the database size. It wouldn’t be exactly the same as disk allocation, but it should be close.

SELECT
    table_name AS `Table Name`,
	table_rows AS `Row Count`,
	ROUND(SUM(data_length)/(1024*1024*1024), 3) AS `Table Size [GB]`,
	ROUND(SUM(index_length)/(1024*1024*1024), 3) AS `Index Size [GB]`,
	ROUND(SUM(data_length+index_length)/(1024*1024*1024), 3) `Total Size [GB]`
FROM information_schema.TABLES
WHERE table_schema = 'homeassistant'
GROUP BY table_name
ORDER BY table_name

Above is for an installation on MariaDB. Adjust the database name (table_schema) as needed, or remove the WHERE clause entirely if all you have on that service is your Home Assistant schema.

You’d get something like the following (here produced using phpMyAdmin):

reclusivemonkey · July 12, 2020, 8:30am

tl;dr Is the likelihood of SD card failures post HA 0.112 lower?

This week my SD card failed. I’d had a notification of a power issue at some point, which I foolishly ignored. Soon after my SD failed and I had to restart. I had some ‘faux’ backups so it wasn’t too bad. First thing I did was to create a real backup using the snapshots, the excellent Hass.io Google Drive Backup, IFTTT and Dropbox.

Next I am looking at logbook, history and recorder. A long time ago I’d thrown date, time and weather into my excludes but now I am wondering what do I really want in there past what’s useful for troubleshooting and whether this is even an issue with the new DB changes. I run HA from a RPi 3 model B, official psu and now SanDisk Extreme 64 GB microSDXC A2 App Performance as the previous card was the 32Gb with A1 and I noted that the A2 is the recommended one.

I know many people recommend upgrading from the Pi, but for me personally the Pi has many advantages;

cheap (much cheaper than a NUC which would be the step up)
lower power requirements (I am guessing noticeably so over a NUC but correct me if I am wrong)
silent (again, a NUC will have fans I am guessing)

Now I have a ‘true’ backup I am thinking that if the SD goes again, I’m just a £15 SD card and a few minutes away from being back up and running, which seems a much cheaper option than everything that would be involved with moving off the Pi. Has anyone switched from the Pi to the NUC and felt that there wasn’t much gained from it? I appreciate that it’s a little early with 0.112 to really judge the DB stuff.

tom_l · July 15, 2020, 5:46pm

With this methodical application of excludes and the improvements in 0.112 my 7 day database has fallen from 1.5GB to 240MB, which is an 84% reduction in size. I still have all the graphs and history I was interested in (pages and pages of graphs).

SpacemanSpiff · September 2, 2020, 8:09am

use JSON_EXTRACT instead of JSON_VALUE if on SQLite

karansoi · September 8, 2020, 9:07pm

I have read this thread from the first post by @tom_l, and this is what I have done so far…

Created a directory www
Created files logbook.yaml, history.yaml and recorder.yaml
Added the following to my configuration.yaml

logbook:  !include logbook.yaml
history:  !include history.yaml
recorder:  !include recorder.yaml

In the scripts .yaml file added the following

print_entities_to_file:
  alias: 'Print Entities To File'
  sequence:
  - service: notify.entity_log
    data_template:
      message: >
        {% for state in states %}
          - {{- state.entity_id -}}
        {% endfor %}

Restarted HASS and from developer tools service, called the service script.turn_on for the entity script.print_entities_to_file. this generated the entity_log.txt file in the www directory. I downloaded this file and opened it in my notepad++.
In a worksheet I have copied these entities and made the table as described and run the sort also.
Now that I have done the exercise of checking exclude and include for the recorder, history and logbook. do I Iist these exclude/include in the respective yaml files ie logbook.yaml, history.yaml and recorder.yaml
I request guidance from @tom_l to proceed with the Mariadb part.

tom_l · September 9, 2020, 3:57am

What MariaDB part?