Large homeassistant database files

Its definitely faster rebooting and pulling up graphs.

Brilliant!

Thanks mate. Another thing to add to the list :slight_smile:

1 Like

my DB is now 10 days old and 500 kb.
the one before is 1000 kb and took a month.

but i get a little trouble looking at grafics and history if it is over 200.
that i let it get that big is because i dont look at them very much, but still want to collect that data.

through appdeamon i collect my data in another way and there is my log(from the sensors i want to see) only 2 kb in the last month.

i think it would be very wise if there comes a moment where we can decide which data we want to collect and which data not.

1 Like

@adamderuwe can you point to a tutorial or walk through on this? My db is getting huge and is really unwieldy when trying to look at history or bring up details of a device.

2 Likes

Please tell us how, i’m very interested in this!

If you don’t want to bring your historical data over, its very easy:

  • Bring up MySQL somehow (personally, I brought up a MySQL Docker). But you could just as easily do a sudo apt-get install mysql-server

  • Install dependencies: sudo apt-get install libmysqlclient-dev and pip3 install mysqlclient

  • In your configuration.yaml, add a section like is documented here

That’s at a pretty high level, but if you need more specific instructions, ask away.

6 Likes

It creates the table structure itself then? You don’t have to set the db structure up manually or via a script?

Thanks for the instructions, seems really easy!

However, i don’t understand why the home-assistant_v2.db got so big anyway. It was 323MB on my system, but i have set these parameters in my configuration.yaml

recorder:
  purge_days: 8

Is it ignoring this, or do i create that much events in 8 days to fill up 323MB ?

I see my db file to grow as well and I can see that the gui is quit slow when it is trying to read from sqllite.
I was thinking going tor influxdb or mysql. Anyone knows the advantages/disadvantages going that way?
Influxdb or mysql?

Yep, that is correct.

1 Like

I had the same issue. I was looking at the data in the sqlite database to figure out where all the space was going, then decided I didn’t care because I was going to MySQL anyway. Sorry I’m not more helpful.

MySQL vs Influxdb doesn’t really matter at this scale. We’re talking about small amounts of data here. Influxdb would probably be best for this sort of time series data if we were talking massive scale, but for this, I’d probably go with MySQL just because you will find more documentation, should you need it.

1 Like

My database grows over 2Mb / hour, so right now I’m at 27Mb since the database was created about 12 hours ago. The growth has to do with the way HA logs events in the database.

In my case, the reason is my current implementation of my lighting control. Our house has a lighting control panel, where each lighting circuit is controlled by a latching relay. I’m working on a gateway between the lighting control panel and HA. I can control the panel with HA using MQTT, but someone can also turn a light ON or OFF with a hard wired switch. In order for HA to reflect changes made by a hardwired switch, and to do so in a timely manner, I publish a MQTT status message every 10 seconds or so.

This leads to about 120 events being logged / minute, when I scale it for the whole house, that would grow to over 600 events / minute. This brings up the question.

Should HA log duplicate, intermediate events or only event changes? For my lighting system, should HA log every event from 8:30pm to 9:13am, or just the OFF event at 8:30pm and the ON event at 9:13am

8:30:pm   GELV/binaryOutput/308/presentValue      OFF
8:31:pm   GELV/binaryOutput/308/presentValue      OFF
. . .
9:10am   GELV/binaryOutput/308/presentValue       OFF
9:10am   GELV/binaryOutput/308/presentValue       ON

For now I’m going to modify my gateway program so it pushes out a MQTT message only if an change in state is made, but this might be something the developers want to look at changing.

1 Like

Thanks much; I think I’m going to give this a shot today now that it appears the segfault issue has been crushed. :joy:

I have a little experience with MySQL from my Joomla!/Web dev days so I think I’ll go that route.

I see a huge number of similar things with Zwave and template sensors on my setup; duplicate lines in almost sets of 6-8 in a row of the same state messages. And my db is at 640mb despite setting the purge to hoild only 14 days of data.

I just setup MySQL last night and it works perfectly. I changed the MySQL datadir variable so that the databases are stored on an external USB drive, since I am running on Raspberry Pi and don’t want to deal with extra wear and tear on the SD card.

It has been running now for the past 12+ hours without a hitch :smiley:

Remember, even though Home Assistant auto creates the tables, you still have to create the empty database on MySQL before starting Home Assistant.

1 Like

I will also have a look at MySQL. A query to purge data automatically is easily ran every month or so through a cronjob. Or isn’t that needed and does HASS respect the purge_days parameter when using MySQL?

When viewing the contents of my HASS database in my MariaDB instance, it looks like the purge_days value is respected. I have mine set to 14, and the oldest entry in my database is from the 18th of September.

Good to see others were also having the same issues as me!

As I’m running Home Assistant in a Docker container, I spun up another container for MySQL. Incase anyone is interested, here’s the commands I used

docker run --net=host --restart=always --name mysql -d \
  -e 'DB_USER=hass' -e 'DB_PASS=pass' -e 'DB_NAME=homeassistant' \
  sameersbn/mysql:latest

My configuration YAML file

recorder:
  purge_days: 60
  db_url: 'mysql://hass:[email protected]/homeassistant'

I have noticed my home_assistant_v2 file being touched. But at the moment that was a week ago when the MySQL server crashed for some reason.

6 Likes

Is there anyone that can write down a 'how do I use MySQL instead of SQLlite on an RPI AIO installer? I think many would be interested in doing this but don’t understand how/what to do step by step.

I now have a db of 720mb that build in a months so need to start to take some action.

Give me a bit of time and I’ll write something up. I’m about to do this tonight or tomorrow and I’ll put my instructions up here.

2 Likes