Mysql (mariadb) backend is writing 5-20 MB/s and growing 1 GB per day

mgee · April 12, 2023, 8:58pm

Hi, after upgrading to 2023.4.2, I noticed that my database backend, which is running MariaDB on another host, is doing alot of disk writes, about 5 - 20 MB/s (as reported by htop and iotop).
I have stopped other services using the database, and it is HA alone that is the reason for all the writes.
If I stop HA, the writes also stops.

# mariadb --version
mariadb  Ver 15.1 Distrib 10.5.18-MariaDB, for Linux (x86_64) using  EditLine wrapper

# mariadb -u root --password
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 150658
Server version: 10.5.18-MariaDB MariaDB Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> SHOW PROCESSLIST;
+--------+------+------------+---------------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
| Id     | User | Host       | db            | Command | Time | State    | Info                                                                                                 | Progress |
+--------+------+------------+---------------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
|    219 | rosa | rosa:56626 | homeassistant | Sleep   |    3 |          | NULL                                                                                                 |    0.000 |
|    221 | rosa | rosa:50452 | homeassistant | Query   |    0 | Updating | UPDATE events SET context_id=NULL, context_user_id=NULL, context_parent_id=NULL, context_id_bin='؈?  |    0.000 |
|    222 | rosa | rosa:50454 | homeassistant | Sleep   |    4 |          | NULL                                                                                                 |    0.000 |
|    223 | rosa | rosa:50468 | homeassistant | Sleep   |    6 |          | NULL                                                                                                 |    0.000 |
|    336 | rosa | rosa:55848 | homeassistant | Sleep   |    5 |          | NULL                                                                                                 |    0.000 |
| 150658 | root | localhost  | NULL          | Query   |    0 | starting | SHOW PROCESSLIST                                                                                     |    0.000 |
+--------+------+------------+---------------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
6 rows in set (0.000 sec)

What configuration have I done wrong? My HA installation resides on a Raspberry Pi 4 Model B Rev 1.1, and has been happily running for 3+ years now.

I would like to find the reason heavy disk usage and fix it, but am willing to nuke the whole database and start out fresh if that is what is needed…

Any suggestions on how to debug this?

My recorder config is as follows:

recorder:
  db_url: !secret mariadb_connection_url
  auto_purge: false
  purge_keep_days: 9999

The mariadb_connection_url is in the following form:

mariadb_connection_url: mysql://HOST:PW@IPADDRESS/homeassistant?charset=utf8

My backend has plenty of space, but the database is now ca 107 GB (was 106 GB yesterday).
I thought I could just keep all the data there, and not purge, since I do have the space - but why is so much data written to it every single second?

Keywords: database, mysql, mariadb, backend, recorder.

mgee · April 12, 2023, 9:37pm

Ok, so I stopped HA, and from the console of the backend server running the mariadb I did this:

# mariadb -u root --password
Enter password: 

MariaDB [homeassistant]> optimize table events;
Stage: 1 of 1 'altering table'     50% of stage done
Stage: 1 of 1 'altering table'      0% of stage done
Stage: 1 of 1 'altering table'   97.8% of stage done
Stage: 1 of 1 'altering table'   98.9% of stage done
Stage: 1 of 1 'altering table'   99.5% of stage done
Stage: 1 of 1 'altering table'   74.9% of stage done
Stage: 1 of 1 'altering table'   87.4% of stage done

So, it seems to never finish? (I pressed enter a few times to record the state before it reset to a lower value). Is there something wrong with how mariadb handles the homeassistant db? I cancelled it now after half an hour running. Should I wait longer?

mgee · April 14, 2023, 8:47pm

Replying to myself again in case anyone else ends up here:
I manged to complete the “optimize table events” and “optimize table states”, it took several hours, but effectively reduced the size from ca 107 GB to around 70 GB. I also added this to the HA config:

recorder:
  db_url: !secret mariadb_connection_url
  commit_interval: 60
  auto_purge: true
  purge_keep_days: 180

and restarted a few times. But the disk hammering kept going, so I did a mysqldump for backup and then nuked the whole database and started out fresh (drop database homeassistent). The disk IO is now negligible.

Too bad, since now I don’t have way to look back on energy usage month-to-month, year-to-year and so on.

Is there a recommended include/exclude list or filter that I can use to still be able to keep some data for long term reference?

PeteRage · April 14, 2023, 9:32pm

I’m pretty sure there is a big database update as part of the 2022.4 release and the migration of the data into the new tables may take a while for large databases. I recall someone saying 10+ hours. How long depends on the size and disk speed. So I think that is what you were experiencing.

mgee · April 14, 2023, 9:38pm

Hm, ok thanks that might be it. But it would be really nice to have some indication of the status of that migration, in the web UI or the logs. Some kind of progress at least every hour or so. I saw no such indication, and let HA run for at least three days I think.

In any case - I still need to sort out how to keep long term data (energy usage) and drop highfreq data to avoid growing the db beyond 1 TB in a few years.

leonidostrovski · April 26, 2023, 5:32am

At the microlevel:
You have one or more sensors sending updates every 1s-2s as result crashes the DB.
You can use an SQL query to find it.

You can delete all rows you found from Developer Tools>Service>Purge Entity.

At the macrolevel:
DB size control is a big problem!
Some sensors send updates every 1-2 seconds, and some sensors send data every 2-3 hours. For some sensors, you need to keep all data, and for others, not.
For example, for the binary sensor “door close”, you need only data with state changes. For the temperature sensors, you need the last month in the resolution mean for ~10 minutes, not every 10 seconds. So you do not really need 99% of the stored GBs of data…

I think there is no real, easy-to-use solution for it. I spent a lot of time trying to find a solution.