How can I query the HA DB from an integration?

I’d like to set up an automation that adds a row to a spreadsheet once a month, containing some values from a sensor’s long-term statistics.

For the spreadsheet part I can use the Google Sheets integration’s add row service, but AFAICT there is no way to access long-term statistics from automations, templates, etc. So I’m instead trying to write a custom integration that will do this.

Unfortunately there also doesn’t seem to be a long-term statistics API, and the only documentation at all appears to be an (incomplete) description of the schema at Home Assistant Statistics | Home Assistant.

Is there an API that I can use in a custom integration to query the HA DB using the connection that it already has open?

I have never truly understood the long term statistics, as I see no way of actually viewing them anyway, so interested to see others comments on your question.

1 Like

However now this SQL-Sensor is configured in Devices-Services, as an integration(add it the same place)

Maybe in combination with file-sensor

File - Home Assistant (PS: i have no idea how Google Sheets integration works :slight_smile: )

1 Like

There are API, but not sure if they are documented, yet (as they are pretty much WIP).
See Changes to long term statistics APIs | Home Assistant Developer Docs

1 Like

Thank you @boheme61 ! I wasn’t aware of the SQL integration. I used that to create sensors, disabled automatic polling on them, then set up an automation that updates their values and calls the sheets service to add a new row, including the sensor values via a template expression.

Here’s my query, in case anyone else is interested in a long-term statistics example:

SELECT STRFTIME('%Y-%m', hour, 'auto', 'localtime') AS month,
       ROUND(SUM(CASE WHEN peak THEN delta ELSE 0 END), 1) AS peak,
       ROUND(SUM(CASE WHEN peak THEN 0 ELSE delta END), 1) AS offpeak
FROM (
  SELECT start_ts AS hour,
         DATETIME(start_ts, 'auto', 'localtime') >= DATETIME(start_ts, 'auto', 'localtime', 'start of day', '+16 hours')
         AND DATETIME(start_ts, 'auto', 'localtime') < DATETIME(start_ts, 'auto', 'localtime', 'start of day', '+21 hours') AS peak,
         state - LAG(state, 1, 0) OVER(ORDER BY start_ts) AS delta
  FROM statistics
  LEFT JOIN statistics_meta ON statistics.metadata_id = statistics_meta.id
  WHERE statistic_id = 'sensor.powerwall_load_import'
)
WHERE DATETIME(hour, 'auto', 'localtime', 'start of month') = DATETIME('now', 'auto', 'localtime', 'start of month', '-1 months')
GROUP BY month;

This queries the LTS hourly values for my Tesla Powerwall “energy consumed by the house” sensor, dividing them into “peak” (4pm - 9pm) and “off-peak” and then summing those across the month.

The state values in the LTS statistics table are the value of the sensor at the end of the hour, and start_ts is a timestamp representing the beginning of the hour. SQLite seems to require the ‘auto’ modifier to parse these, which was a bit surprising. You usually want to also use ‘localtime’ to interpret the times in your own time zone rather than UTC.

From the statistics table alone you have no idea what these values are; you have to join their metadata_id against the statistics_meta table’s id to select the statistic_name (entity name) that you want to query.

So the inner query uses a window expression to compute the delta between each state and the previous one, and whether it was a peak or off-peak hour. The outer expression then sums up the peak and off-peak values across the entire previous month.

Each SQL sensor can only have one value. For simplicity I just used the same SQL for both, and selected the peak column in one, and offpeak in the other. If I were really concerned about efficiency I could have given each sensor its own query, but these are only getting triggered once a month so it seemed unnecessary.

1 Like

Good to hear you solved it, im still at very simple(1 entry/state) queries :slight_smile:
I still haven’t got a good grip off SQLite(and don’t have time), i used Grafana + influxDB in HA for awhile ( For realtime Graphs ) but dropped it( now there is a few good graph-options within HA), and never got to the point of extracting “specifics” to another sheet/table/graph
More than a decade ago i worked abit with Oracle DB, i totally like the structure , and “overview” there much much more … thou i still haven’t giving MariaDB a try in HA (in that case it would be an external instance, with an external websever, can alway use i.e iframe or other suitable cards if i need the info in HA ( History is seldom info for i.e triggers or automations )