Large homeassistant database files

And a big thanks from me too to both @MojaveTom and @ehsahog. I moved the HA db on my Pi3 to my Synology NAS and it's just so much quicker now.

Great post! I just moved my DB also and it’s WAY faster and snappier! Thanks!

My db grew to nearly 1.7GB in the few weeks since installing Home Assistant. My graphs started getting incredibly slow, taking more than 60 seconds to load.

I’ve followed these instructions from @MojaveTom more or less. I had to make some small changes to fit my install, and since I was not interested in migrating my old data I could skip that part. Here is how I did it:

# Install the MySQL server, client and development headers
sudo apt-get update && sudo apt-get upgrade
sudo apt-get install mysql-server && sudo apt-get install mysql-client
sudo apt-get install libmysqlclient-dev
sudo apt-get install python-dev python3-dev
# Install the Python MySQL drivers (system-wide)
sudo pip3 install --upgrade mysql-connector
sudo pip3 install mysqlclient

It asked me to create the root password at this point.

mysql -uroot -p

CREATE DATABASE homeassistant;
CREATE USER 'hass'@'localhost' IDENTIFIED BY '********';
GRANT ALL PRIVILEGES ON homeassistant.* TO 'hass'@'localhost';
FLUSH PRIVILEGES;
exit;

Test that the user works:

mysql -uhass homeassistant -p
exit;

Switch to homeassistant user:

sudo su -s /bin/bash homeassistant
source /srv/homeassistant/bin/activate
pip3 install --upgrade mysqlclient
exit
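
A quick sanity check (my own addition, not part of the original instructions): before running the exit above, you can verify from inside the venv that the driver imports cleanly:

python3 -c "import MySQLdb; print('MySQLdb driver OK')"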

Add this to configuration.yaml:

recorder:
  db_url: mysql://hass:********@localhost/homeassistant?charset=utf8

Restart Home Assistant:

sudo systemctl stop home-assistant@homeassistant.service
sudo systemctl start home-assistant@homeassistant.service

Worked without a hitch.

Now to figure out why it got so big in the first place, and how to avoid that.

I have mostly Z-Wave devices (7 off-the-shelf, and 2 built with a Z-Uno), and a Philips Hue bridge (1st gen) running 3 lights. It doesn’t make sense to me that these few devices should be generating so much data.

I used the advice from @bbrendon:

cd /home/homeassistant/.homeassistant
# Count the rows per entity_id in the states table and show the 20 biggest offenders
sqlite3 -line home-assistant_v2.db 'select * from states'|grep entity_id|awk '{print $3}'|sort |uniq -c |sort -n|tail -20
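
(Now that the recorder is pointed at MySQL, a roughly equivalent check can be run against the new database once data accumulates there. This is just my sketch, assuming the standard states table the recorder creates; run it inside mysql -uhass homeassistant -p:)

-- Top 20 entities by number of recorded states
SELECT entity_id, COUNT(*) AS state_count
FROM states
GROUP BY entity_id
ORDER BY state_count DESC
LIMIT 20;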

Turns out my Aeotec MultiSensor has more than 100,000 rows for each of its 8 entities.

 113195 sensor.multisensor_sourcenodeid_4_2
 113223 sensor.multisensor_alarm_level_4_1
 113225 sensor.multisensor_relative_humidity_4_5
 113225 sensor.multisensor_temperature_4_1
 113226 sensor.multisensor_burglar_4_10
 113226 sensor.multisensor_luminance_4_3
 113227 sensor.multisensor_ultraviolet_4_27
 113230 sensor.multisensor_alarm_type_4_0

This sort of makes sense, because I use it as an environmental monitor and have it report data as often as it allows. I’ve just timed how often it updates and it seems to be between 6 and 7 seconds; at 86,400 seconds per day that works out to roughly 13,000 data points per day for each entity.

So I updated my config to exclude the ones I don’t need:

recorder:
  db_url: mysql://hass:********@localhost/homeassistant?charset=utf8
  exclude:
    domains:
      - automation
    entities:
      - sun.sun
      - sensor.multisensor_sourcenodeid_4_2
      - sensor.multisensor_alarm_level_4_1
      - sensor.multisensor_burglar_4_10
      - sensor.multisensor_alarm_type_4_0
      - sensor.airquality_general_24_2
      - sensor.time_utc
      - sensor.time__date
      - sensor.time
      - sensor.date__time
      - sensor.internet_time

The next step is to figure out how to offload data to a cloud DB before I purge it on the Pi. Any advice for doing that? Am I best off having Home Assistant write directly into a DB I have in the cloud? Or would I be better off having a script take old data from the local DB and write it into a cloud DB? Or can I do a backup at the same time as my purge interval and then just splice the backup into an archive DB? Other options?
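
For example, going down the script route, one approach could be to dump the rows that are about to be purged and append them to an archive database on a remote host (a rough sketch only; the archive host, archive database name and 7-day window below are made-up placeholders, and the created columns are assumed from the recorder schema):

# Export states/events older than 7 days from the local MySQL database
mysqldump -uhass -p --no-create-info --insert-ignore \
  --where="created < DATE_SUB(NOW(), INTERVAL 7 DAY)" \
  homeassistant states events > ha_archive_$(date +%F).sql

# Append them to an archive database on the remote/cloud host
mysql -uhass -p -h archive.example.com homeassistant_archive < ha_archive_$(date +%F).sql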

I need the data in a usable format in the cloud because I later want to build an AI integration to detect deeper patterns.


Why would you want to keep the data? I know, I am also a bad example as I never purge my database, but I am starting to ask myself the same question. For historical data I now rely on InfluxDB and Grafana (still a newbie around that topic, and I really like phpMyAdmin, which is not available for Influx). Influx and Grafana have never let me down and are very fast, even while running on the same RPi3 as MySQL/hass…

That is interesting… I’m actually using a Pi2, but soon I’ll get a Pi3 and give it a try. As said before, though, my database is running on a Synology NAS and there is a huge difference when opening the history panel!

I successfully completed this today using @Josh_Levent’s instructions above. I did have to make one small adjustment: I have a pre-Dec-2016 All-in-One installation, so the syntax for switching to the correct user/venv was slightly different. Dropping this here for anyone in that boat.

Where Josh says “Switch to homeassistant user:”

sudo su -s /bin/bash hass
source /srv/hass/hass_venv/bin/activate
pip3 install --upgrade mysqlclient
exit


Thanks for the comment about InfluxDB and Grafana, I’ve managed to get that working on my RPi3 in the meantime as well. Now I can set my purge time really low and keep the history in InfluxDB 🙂

Still want to figure out how to get it from there into long-term cloud storage at some point. There is still some junk data in InfluxDB, but certainly a lot less than the recorder is putting in the main database.
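
For reference, the relevant part of my config now looks roughly like this (a sketch only; the purge value, database name and include filter are examples rather than my exact settings, and the include/exclude options depend on your HA version):

recorder:
  purge_days: 2
  db_url: mysql://hass:********@localhost/homeassistant?charset=utf8

influxdb:
  host: 127.0.0.1
  database: home_assistant
  include:
    domains:
      - sensor
      - binary_sensor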

If anyone is interested in the InfluxDB/Grafana install, I’m happy to write up my walk-through on that here as well.


Hi Josh,
What is it that you mean by identified by ********?
And I would guess the localhost is the IP 127.0.0.1?

I am interested in a walk-through 🙂

This line is where you create the user, which consists of three components: the username (“hass”), the password (“********”) and the hostname (“localhost”).

********: make up your own password and then put it in the config file.

localhost: the default hostname; you can use any hostname or IP address that identifies the machine, e.g. “127.0.0.1”. Just make sure that the hostname specified in this line is the same as the one you put in the config:

  db_url: mysql://hass:********@localhost/homeassistant?charset=utf8

The last item on this config line is the database name, which needs to match as well. In my case: homeassistant
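
Putting it together: if you had created the user with “127.0.0.1” instead of “localhost” (purely as an example), the two places that need to match would be:

CREATE USER 'hass'@'127.0.0.1' IDENTIFIED BY '********';
GRANT ALL PRIVILEGES ON homeassistant.* TO 'hass'@'127.0.0.1';

and in configuration.yaml:

  db_url: mysql://hass:********@127.0.0.1/homeassistant?charset=utf8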

Thanks for these instructions - worked perfectly!

Thanks Josh,
Got everything working great. Where does the database file get saved now?

I would like to install MySQL, but I wanted to install it in a Docker container. I tried to install Docker, but at the end of the installation it failed. Do you have any expertise in that?

It should be under /var/lib/mysql/yourdatabasename
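
To confirm, you can check how much space it takes on disk (this assumes the default MySQL data directory and the database name used earlier in the thread):

sudo du -sh /var/lib/mysql/homeassistant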

I’ve been running the MySQL database on the same RPi3 as HA. It had started to run so slowly that I checked the DB size, and it was 1.5GB when using 30-day retention… So I blitzed it, now use 7 days, and I might even move the DB onto my QNAP NAS.

Just wondering, where is the database file saved now? Also, what is the name of the file? I looked at /var/lib/mysql/homeassistant and there is only one file there, db.opt, which doesn’t look like the right file.

Finally, my home-assistant_v2.db file is still present and growing, even after I deleted it and restarted HA.

Sounds like you didn’t configure the recorder in Home Assistant correctly. When it’s configured correctly, the home-assistant_v2.db will not grow or be recreated.

Here’s my config:

recorder:
  purge_days: 3
  db_url: mysql://hass:PASSWORD@localhost/homeassistant?charset=utf8  

Is there anything that I may be missing?

Check your home-assistant.log file for any possible errors related to the recorder. It could be something as simple as an authentication problem.
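
For example (assuming the default config directory used earlier in this thread):

grep -iE "recorder|mysql" /home/homeassistant/.homeassistant/home-assistant.log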

Unfortunately, nothing in the logs. I can access mysql using the same credentials with mysql -uhass homeassistant -p