I’m trying to use long term home assistant history to answer questions like these:
- How long was my heater on in the February 2022?
- What was the average outdoor temperature in February 2022?
- How did the indoor humidity change through the seasons in 2022?
These are just examples. I don’t have long term storage set up yet, so these aren’t going to be answerable for the past, but I want to ask them about data I’m collecting now.
Questions
Influxdb Question(s)
I have influxdb, with an autogen database that stores everything for 31d, like a lot of tutorials recommend.
I added retention policies for yearly and forever. I would like to store hourly data for a year and daily data forever (until the heat death of the universe, or I have to format my HA instance).
- It looks like I can’t do that very automatically. I found out how to add continuous queries for each unit type. I’ve been typing in things like this:
CREATE CONTINUOUS QUERY "cg_24h_hPa" ON "homeassistant" BEGIN SELECT mean(value) as value into "forever"."hPa" from "autogen"."hPa" group by time(24h), entity_id FILL(previous) END
CREATE CONTINUOUS QUERY "cg_1h_hPa" ON "homeassistant" BEGIN SELECT mean(value) as value into "year"."hPa" from "autogen"."hPa" group by time(1h), entity_id FILL(previous) END
It’s painful. Is there a better way to make everything be averaged and recorded like that?
- I got to the
state
database in influxdb. There are 450+ entities. There are boolean things, like heater on/off. There are strings, like weather data. There are integers and chars that don’t have units. Averaging these sometimes doesn’t make any sense. How can I store these things and have usable results long term?
Database Questions
-
Is influxdb the right choice?
-
Should I just give up and have the default retention policy collect everything forever? I don’t want to make it read GBs every time it makes an annual graph, so I think the separate retention policies make sense.
Statistics
In the hourly or daily statistics for things like booleans, it would be more useful to have the % of time it was on, and the duration of it being on, etc. For example, knowing that the heater was on for 20mins from 3-4pm is useful. Knowing it was on for 33% of the time in that hour is useful. Knowing that the heater was on for 18% of the day on Feb 21st and 25% on Feb 25th is very useful.
- How do I get that kind of aggregated statistics into the history/database? Should I compute those things in homeassistant? Is there a way to do that with queries in influxdb? Is this a case for templates?
Old Entities
I’m also seeing a lot of old entities in the db. Stuff that is no longer used in home assistant. Devices that have been renamed, or taken offline, or whatever.
- Do you regularly reformat your HA instance to remove this kind of thing? I have a lot of work into configuring HA, but I would probably not mind moving everything from one instance to another, just to get rid of old stuff I don’t need anymore. Is there a better way?
Please answer anything you can. Sorry for the logjam of questions. I have been working on this, loading up on stuff, and I’m ready to ask for help, so there is a backlog of stuff.