I’ve had HA running for some weeks now and with trial and error & help from this community it has started to work rather well.
After a while I noticed was much increased home-assistant_v2.db (as was mentioned in docs as well) and I also managed to corrupt it few times. So I went with postgresql as recorder backend.
This was rather painless and there was a step by step guide to it in Migrate Home Assistant from SQLite to PostgreSQL | Sigfried Seldeslachts and with few mods it was up and running.
Next I wanted to make sure that I’d have a way to keep database bit smaller and janitor it regularly (myself)
First step would be to remove all entries where state just does not change.
some example data would be
state_id | entity_id | state | attributes | event_id | last_changed | last_changed_ts | last_updated | last_updated_ts | old_state_id | attributes_id | context_id | context_user_id | context_parent_id | origin_idx
----------+---------------------------------+-------+------------+----------+--------------+-------------------+--------------+-------------------+--------------+---------------+----------------------------+-----------------+-------------------+------------
308719 | sensor.ilmasensori1_temperature | 23.6 | | | | | | 1675588279.821811 | 308713 | 293077 | 01GRGDG7GD54CB3933XGWDRYYJ | | | 0
308725 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588279.861426 | 308719 | 293083 | 01GRGDG7HNWN1Q49E3KGENFHRE | | | 0
308731 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588280.722815 | 308725 | 293089 | 01GRGDG8CJ7KVZMRPA79S37S0D | | | 0
308737 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588280.762024 | 308731 | 293095 | 01GRGDG8DTC0D738CFK7S4QCK8 | | | 0
308743 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588280.800759 | 308737 | 293101 | 01GRGDG8F0Y6BAQ5MCPENH9MWX | | | 0
308749 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588280.838888 | 308743 | 293107 | 01GRGDG8G6QK2ARH8H2T2PTM1D | | | 0
308755 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588287.316702 | 308749 | 293113 | 01GRGDGETM8RMD0Y2KF17NQGJ6 | | | 0
308761 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588287.356455 | 308755 | 293119 | 01GRGDGEVW4CEY76N66NSE7PGJ | | | 0
308767 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588287.71043 | 308761 | 293125 | 01GRGDGF6Y56RYG6MCDXQFVKDZ | | | 0
308773 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588287.748196 | 308767 | 293131 | 01GRGDGF8450Y2YVYEREQ285TJ | | | 0
308779 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588287.786542 | 308773 | 293137 | 01GRGDGF9A7W9GTFRECWZV7KYZ | | | 0
308785 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588287.826042 | 308779 | 293143 | 01GRGDGFAJJ9ZEHMBVYP08EBAS | | | 0
308791 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588291.204033 | 308785 | 293149 | 01GRGDGJM45JB25P9QDRSP5VJN | | | 0
308797 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588291.242631 | 308791 | 293155 | 01GRGDGJNAAV4H5TYK2ZD4KYHH | | | 0
308803 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588291.281162 | 308797 | 293161 | 01GRGDGJPHB8XNJ0K0RM1Q4B3Y | | | 0
308809 | sensor.ilmasensori1_temperature | 23.6 | | | | 1675588279.821811 | | 1675588291.31921 | 308803 | 293167 | 01GRGDGJQQ49XBSEQ58PV4716P | | | 0
308815 | sensor.ilmasensori1_temperature | 23.9 | | | | | | 1675588292.313652 | 308809 | 293173 | 01GRGDGKPS5NACP9SYT28RZDX5 | | | 0
In some cases timestamp is updated fairly often (perhaps sensor should be adjusted in zigbee2mqtt to report less often?)
Here I would not have any use for the entries 308725 → 308809 and hence could drop them out.
I did however come across the problem of ‘states_old_state_id_fkey’ CONSTRAINT.
It would not let me remove any entries where there was reference from another entry to here and
it appears that all entries in database are ‘chained’ together through state_ids.
So I dug into homeassistant_core and found from recorder purge.py and db_schema.py and queries.py
I noticed that ‘purge’ sets old_state_id to NULL so it can proceed with purging rows.
I tried that in a way that I set old_state_id to NULL for all entries that had last_changed_ts (WHERE last_changed_ts IS NOT NULL;) since those seem to be rows that I have no use for.
However removal did not work after that since still there was a row that I wanted to keep which was still referring to one of those removable rows.
So I looked more to the code and it seemed that old_state_id reference is not used in anything (important?) I removed the foreign key CONSTRAINT and hence was able to remove whatever rows I wanted.
Things seem to work as before. Nothing ‘seems’ to be broken (yet).
While I understand that this constraint could be important the removal of it and cleaning data does not seem to affect my usecases (at least yet)
So I am asking
- Is the constraint just something that is set up my sqlachemy?
- Is it used by any functionality that I did not notice
- If no current, then what future uses might break because of this?