Initially, I didn’t expect this would turn into such a lengthy post. Please don’t be put off by it. It might look complex, but it isn’t. That’s probably my fault for providing too many details, believing they are all needed as background to how it works.
Introduction
This article is the result of a deep dive into how to effectively store and visualize long-term energy data from Home Assistant. In the next sections, I’ll explain why moving your data outside of Home Assistant to TimescaleDB can make a big difference and how you can do it.
For me, it was also a first real hands-on experience with TimescaleDB (even though I had known about it for years).
The article is structured as a step-by-step tutorial, starting with required components configuration, through TimescaleDB basics, and ending with examples of Grafana visualizations. You’ll find ready-to-use examples you can copy and adapt to your own setup, depending on the sensors you have.
What’s covered:
- Why it’s worth adding TimescaleDB to your Home Assistant system
- What you need to set it up and how to configure it
- Basic concepts of TimescaleDB
- How to aggregate data efficiently
- How to use Continuous Aggregates (CAGGs)
- How to set up Real-time Aggregates
- How to build Hierarchical CAGGs
- How to use data compression
- How to manage data retention
- How to create Grafana visualizations based on your data
Why Not Rely Solely on Home Assistant for Long-Term Energy Data Storage?
While Home Assistant (HA) is a powerful automation platform, it’s not ideally suited for long-term storage and analysis of electricity consumption data on its own. Here are a few disadvantages:
- Database limitations: By default, HA uses SQLite, which struggles with large datasets and concurrent access.
- Low reliability: SQLite carries a high risk of data loss due to file corruption, crashes, or storage issues, which works against very long-term data storage. Attempting recovery can be time-consuming and may result in extended Home Assistant outages.
- Performance issues: Complex analytical queries against SQLite are slow to the point of being impractical.
- Limited analytics: HA’s built-in visualization and analytics capabilities are basic compared to specialized tools like Grafana.
- Inflexible data access: Data stored in HA’s default database is difficult to manipulate or import from external sources. In contrast, PostgreSQL/TimescaleDB allows flexible modification and integration with third-party data, such as electricity provider exports, or recovery from backups if HA’s internal storage fails.
- Fragile to entity replacement: In HA, it’s difficult to maintain entity naming when making changes to household systems, such as installing solar panels (FVE). Such changes lead to using different measurement devices, and thus differently named sensors. Tracking energy before and after such events results in fragmented or incomplete views, with no way to visualize them as continuous data.
Benefits of Using TimescaleDB
TimescaleDB, a time-series extension of PostgreSQL, offers a more robust solution for energy consumption data logging and analysis:
- Efficient time-series storage: Built specifically to handle large volumes of time-stamped data.
- Scalability: Handles months or years of high-resolution data with ease.
- Advanced queries: Supports SQL with time-series specific functions.
- Continuous Aggregates: Enables efficient pre-aggregation of data for fast queries.
- Flexible retention and compression: Easily drop or compress old data while preserving aggregate summaries (which can also be compressed).
- Full control over your data: You can import data from other systems, rename entities, and more. For example, you can import consumption data provided by your electricity supplier for periods when the HA system did not measure house consumption.
- On-the-fly data transformation like filtering, renaming, unit conversion, and more.
Bonus: Grafana can bring all this data to life with interactive dashboards right from the start.
All in all, with TimescaleDB it’s possible to achieve energy analytics unavailable in HA, such as the return on investment of an FV installation tracked over the years.
Prerequisites
To get started, ensure the following components are installed and configured:
- Home Assistant sensors. See below for the requirements on HA sensors.
- Long-Term Statistics Store (LTSS) integration.
- TimescaleDB with TimescaleDB Toolkit extension enabled. It might be an add-on running in your HA Supervisor environment or as an external service.
- Grafana. As with TimescaleDB, you can use the HA add-on or an independent installation.
The next paragraphs go through key points of configuration.
Home Assistant Sensors
Before we move on to the database, we need to understand the nature of the data and make sure HA provides sensors we can use later. This article is about collecting energy data, so it focuses on energy sensors only. The examples in the article are built upon the following measurements:
- household consumption
- energy purchased (equal to house consumption prior to the FVE installation)
- energy injected
- FV panels production
- energy charged to batteries
- energy discharged from batteries
To achieve the goal smoothly, it’s best to stick to the following rules:
- provide utility sensors for energy (at least hourly)
- make them consistent unit-wise. My proposal: kWh
Here is the explanation behind these requirements.
In general, energy data tracked by Home Assistant sensors grows indefinitely over time. Occasionally, the value may be reset to zero and start counting again. The reset can be triggered by a measurement device or by HA itself (utility sensors) at any time.
It’s important to understand this concept because it makes common methods like sum or delta not applicable to data evolving this way. Managing it using common SQL might be pretty complex.
TimescaleDB comes with ready-to-use tools helping with that.
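As a preview of one such tool, here is a minimal sketch using the counter_agg aggregate and its delta accessor from the timescaledb_toolkit extension, which detect counter resets and compensate for them automatically. The table and sensor names are illustrative assumptions, not part of any specific setup:

```sql
-- Daily energy deltas from an ever-growing counter.
-- counter_agg treats drops in value as resets and corrects for them,
-- so the sums survive device restarts and utility-meter resets.
SELECT
    time_bucket('1 day', time) AS day,
    delta(counter_agg(time, state::double precision)) AS energy_used
FROM ltss
WHERE entity_id = 'sensor.energy_consumption_hourly'
GROUP BY day
ORDER BY day;
```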
There is one more issue to mention and solve: data changes are reported with finite frequency (granularity). It leads to “ignoring” some energy amount when attempting to calculate this data within intervals. Let’s look at the diagram:
          0     5     12            15    23
time ─|───o─────o─────o─────|───────o─────o─────|────>
      x                  midnight               y
Suppose we want to calculate total energy usage for the day (midnight to midnight). With data distributed as in the picture, it’s impossible to precisely determine:
- the energy used between x and midnight
- the energy used between midnight and y
- the daily energy
Simple grouping by day would result in 12 units during the first day and 8 units on the second day. 3 units are lost. Sometimes your energy readings will be frequent enough that the error is small, but you can’t always count on that.
We could try to interpolate (estimate) these missing parts, but for our case (explained later), interpolation isn’t the way to go.
How to solve this?
We can use Home Assistant’s Utility Meters. Based on my observations:
- Such sensors reset automatically at the start of each period (quarter-hourly, hourly, daily, monthly, etc.)
- The first recorded value for each period is always 0.
- The first value following the reset represents the energy collected since the last known value before the reset.
As a result, the value at y represents the delta (the amount of energy used) between x and y.
This method causes energy consumed just before the end of a period to be accounted for in the next one.
          0     5     12     0      3     7
time ─|───o─────o─────o─────|o──────o─────o─────|────>
      x                  midnight               y
Worth mentioning: utility sensors based on the same source sensor are all updated at the same rate as the source sensor, regardless of their reset interval. This leads to the conclusion that there is no need to publish a daily sensor if an hourly one exists. The shortest-period utility sensor is enough to cover our needs, since we can build daily values out of hourly ones.
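To illustrate, here is a sketch of building daily totals from an hourly utility meter: take the maximum reading within each hour (the final cumulative value before the next reset), then sum those hourly values per day. The table and sensor names are assumptions for the example:

```sql
-- Daily energy from an hourly utility-meter sensor.
-- Inner query: the max value within each hour is the last
-- (cumulative) reading for that hour before the reset.
SELECT
    time_bucket('1 day', hour) AS day,
    sum(hourly_energy) AS daily_energy
FROM (
    SELECT
        time_bucket('1 hour', time) AS hour,
        max(state::numeric) AS hourly_energy
    FROM ltss
    WHERE entity_id = 'sensor.energy_consumption_hourly'
    GROUP BY hour
) AS hourly
GROUP BY day
ORDER BY day;
```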
Note:
In this article, we assume hourly data as the baseline.
This could be considered enough for most use cases, unless you have other needs, such as quarter-hourly data, often used for spot-market energy trading.
Consistent Units for Energy Sensors
Home Assistant can create energy sensors with different units — for example, Wh, kWh, or even MWh.
When you configure your utility meters, make sure all related sensors use the same unit.
If your sensors use mixed units, you have three options (some are messier than others):
- Convert the units when recording data into the database
- Convert the units later during data aggregation
- Unify units during visualization
Later in this article, you’ll see that units are stored together with the data. However, unit conversion is not covered in the example queries — they assume everything is already consistent.
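If you do need the second option (converting during aggregation), a sketch like the following can normalize mixed units on the fly, using the unit stored alongside each row. Column and sensor names are illustrative assumptions:

```sql
-- Normalize state values to kWh based on the recorded unit.
SELECT
    time,
    entity_id,
    CASE attributes->>'unit_of_measurement'
        WHEN 'Wh'  THEN state::numeric / 1000
        WHEN 'MWh' THEN state::numeric * 1000
        ELSE state::numeric  -- assumed already kWh
    END AS state_kwh
FROM ltss
WHERE entity_id LIKE 'sensor.%_hourly';
```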
My recommendation:
Stick to kWh
as your standard unit.
It’s the most common, easy to read, and fits well for hourly or daily energy tracking.
Installation
TimescaleDB
TimescaleDB can be installed as a separate service or via a ready-to-use Home Assistant add-on. The installation process is described on the add-on’s GitHub pages.
To execute the actions described in this part of the article, you will need a DB client. The first, already built-in, option is the psql console. You can access it directly from the HA console:
docker exec -it addon_77b2833f_timescaledb psql -U <username> <databasename>
If you prefer a GUI client, you can use pgAdmin4 (also available as an HA add-on) or use other clients like DBeaver (my favorite). In both cases, enabling the exposition of the PostgreSQL port in TimescaleDB Add-on settings is required.
After TimescaleDB installation, only the postgres login role is available. If you create your own, make it a SUPERUSER - it might be needed. But be careful: that grants superpowers.
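For completeness, creating such a role could look like this (the role name and password are placeholders):

```sql
-- A personal admin role; SUPERUSER bypasses all permission checks,
-- so use it for administration only, never for applications.
CREATE ROLE my_admin LOGIN SUPERUSER PASSWORD 'choose_a_strong_password';
```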
Privileges
Assuming TimescaleDB was installed via the add-on installer and the necessary databases were created, those databases are accessible to any role that can authenticate.
To operate LTSS and Grafana, we need to define dedicated login roles with minimal required privileges. Using the default PostgreSQL role is discouraged due to its broad and unrestricted permissions.
Let’s create roles specifically for LTSS and Grafana:
CREATE ROLE app_ltss LOGIN PASSWORD 'some_password';
CREATE ROLE app_grafana LOGIN PASSWORD 'another_password';
ALTER ROLE app_grafana SET statement_timeout = '5s';
When running the system on low-performance hardware like a Raspberry Pi, it’s easy to overload the device to the point where it may struggle to recover on its own. To mitigate this, you can set a statement timeout for Grafana’s database role, ensuring that no Grafana query runs longer than 5 seconds.
LTSS needs permission to create tables in the public schema:
GRANT CREATE ON SCHEMA public to app_ltss;
Extensions
Next, install the required extensions that will be used later:
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS timescaledb_toolkit;
CREATE EXTENSION IF NOT EXISTS btree_gist;
Why install these manually?
- postgis – Installing it up front prevents the LTSS component from requesting superuser privileges to install it later.
- timescaledb_toolkit – Enables additional time-series features needed for handling continuously increasing values like energy consumption.
- btree_gist – Required later for advanced indexing when generating cost reports.
Timezone does matter!
Because we’re working with data grouped by days, it’s critical to ensure that the database uses the correct timezone. Otherwise, query results can be misaligned; for example, a “midnight” timestamp might fall on the wrong day depending on the timezone.
You can resolve this either by configuring the database appropriately or by explicitly passing the timezone in queries that rely on date grouping.
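For the second approach, TimescaleDB’s time_bucket accepts an explicit timezone argument (available since TimescaleDB 2.8), so day boundaries follow local midnight regardless of the session setting. The table name is illustrative:

```sql
-- Group rows into local days for Europe/Prague,
-- independent of the session or database TimeZone.
SELECT
    time_bucket('1 day', time, 'Europe/Prague') AS day,
    count(*) AS samples
FROM ltss
GROUP BY day
ORDER BY day;
```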
How PostgreSQL Handles Timezones
The active timezone used in a session is determined by several layers of configuration. The settings are applied in the following order:
server defaults -> database -> login role -> client connection (session)
Some tools (e.g., DBeaver) automatically set the timezone at the start of the client connection, usually based on the system or application settings. Others, like psql or Grafana, do not override the session timezone, relying instead on whatever is configured at the database level.
This is important to understand, especially if you observe different query results depending on which tool you’re using.
Check the Active Timezone
To check the current session timezone and its source, use:
SELECT setting, source FROM pg_settings WHERE name = 'TimeZone';
To examine both server and database-level timezone settings, run:
SELECT 'server' AS source, boot_val AS timezone FROM pg_settings WHERE name = 'TimeZone'
UNION
SELECT 'database' AS source, split_part(cfg, '=', 2) AS timezone
FROM pg_db_role_setting s
JOIN pg_database d ON s.setdatabase = d.oid
JOIN unnest(s.setconfig) AS cfg ON TRUE
WHERE datname = CURRENT_CATALOG AND split_part(cfg, '=', 1) = 'TimeZone';
- The server row shows the default timezone at the instance level (usually UTC when using TimescaleDB add-on).
- If a database row appears, its value overrides the server default for that specific database.
Set the Preferred Timezone
If the database timezone does not match your desired setting, it’s best to set it at the database level. This ensures consistency and also affects TimescaleDB background processes (like CAGG refresh policies).
To set the database timezone:
ALTER DATABASE <databasename> SET TimeZone = 'desired timezone';
Avoid abbreviations like CET or PST. Instead, use full names (e.g., Europe/Prague, America/New_York). These handle daylight saving time automatically and are much more reliable year-round.
You can find a list of all supported timezone names with:
SELECT * FROM pg_timezone_names;
Configuring LTSS to Export Energy Sensors
LTSS is a Home Assistant custom component. It can be installed via HACS.
Installation instructions are available on the project’s GitHub page.
LTSS writes all data into a single table: public.ltss (with no option to change this). In this article, the table will be referenced simply as ltss, since the public schema usually doesn’t need to be referenced explicitly.
On startup, LTSS checks for the existence of the ltss table and creates it if it doesn’t exist. It also tries to add the postgis extension to the database; that operation requires superuser privileges, but as you remember, we have already installed the extension.
Once LTSS is installed, configure it to publish all needed sensors to TimescaleDB.
Changes to LTSS configuration require HA restart.
Configuration of LTSS typically involves selecting the appropriate sensors in the integration’s configuration in configuration.yaml.
The example below connects the LTSS component to the ha_ltss database available in TimescaleDB installed as an HA add-on, using the app_ltss user. It will publish the specified Glances sensors (unimportant from the point of view of this article, just an example) and all sensors whose names end with the _hourly suffix.
ltss:
  db_url: postgresql://app_ltss:some_password@77b2833f-timescaledb/ha_ltss
  include:
    entities:
      - sensor.glances_cpu_load
      - sensor.glances_cpu_used
      - sensor.glances_cpu_percent
      - sensor.glances_cpu_thermal_0_temperature
      - sensor.glances_ram_free
      - sensor.glances_ram_used
      - sensor.glances_ram_used_percent
      - sensor.glances_swap_free
      - sensor.glances_swap_used
      - sensor.glances_data_free
      - sensor.glances_data_used_percent
    entity_globs:
      - sensor.*_hourly
The configuration is similar to what the recorder offers, including globs. Unfortunately, it also inherits the limitation that entities included by globs cannot be excluded again. In such a case, there are two options: replace globs with an explicit list of sensors, or let LTSS publish a broader set of data and reject some entities using a BEFORE trigger. While the second option seems tempting, it’s easy to forget about such a trigger later on, and it still costs additional communication and processing. But sometimes knowing this approach can be a life-saver.
Anyway, here is an example of such a trigger:
CREATE OR REPLACE FUNCTION public.ltss_exclude_entities()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $$
BEGIN
    -- Filter out unwanted entities before insertion
    IF NEW.entity_id IN
    (
        'sensor.unwanted_energy_consumption_hourly',
        'sensor.solar_production_hourly'
    )
    THEN
        RETURN NULL; -- Skip insert
    END IF;
    RETURN NEW;
END;
$$;

CREATE TRIGGER ltss_exclude_entities
BEFORE INSERT ON public.ltss
FOR EACH ROW
EXECUTE FUNCTION public.ltss_exclude_entities();
HA sensors often include extensive and unnecessary metadata in their attributes field. These attributes are stored as JSONB in the database, which can significantly increase disk usage. If the additional metadata is not essential for your use case, it’s wise to strip it away to optimize storage. Below is a trigger that retains only the unit_of_measurement attribute, which is typically the most relevant, thereby minimizing storage overhead caused by extraneous data:
CREATE OR REPLACE FUNCTION public.ltss_strip_attributes()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $$
BEGIN
    -- Keep only 'unit_of_measurement' in the attributes JSONB
    IF NEW.attributes ? 'unit_of_measurement'
    THEN
        NEW.attributes := jsonb_build_object(
            'unit_of_measurement',
            NEW.attributes->'unit_of_measurement');
    ELSE
        NEW.attributes := '{}'::jsonb;
    END IF;
    RETURN NEW;
END;
$$;

CREATE TRIGGER ltss_strip_attributes
BEFORE INSERT ON public.ltss
FOR EACH ROW
EXECUTE FUNCTION public.ltss_strip_attributes();
This trigger is especially useful when storing large amounts of sensor data where the attributes field may otherwise bloat the database.
Note, there is another tool to manage a large amount of data: compression (described later).
Before we jump into Grafana, let’s connect to the database again and set the privileges on the newly created table so that any connected user can read it. This isn’t needed when using continuous aggregates, but it might be useful when playing with the original data.
GRANT SELECT ON TABLE ltss TO public;
Grafana
For Grafana, you can choose to use the Home Assistant add-on or external Grafana.
The installation itself requires no special notes.
Grafana natively supports PostgreSQL. The only remaining step is to configure a data source within Grafana. Provide 77b2833f-timescaledb:5432 as the Host URL, then the database name, user name, and password created earlier. Enable the TimescaleDB switch; it enables TimescaleDB-specific features in the SQL builder, although the diagram configurations described here won’t need it.
Note that the data source offers no option to set the TimeZone.
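Since the timezone can’t be set in the data source, one workaround (consistent with the setting-precedence rules discussed earlier) is to pin it at the login-role level:

```sql
-- Every session opened by Grafana's role will use this timezone.
ALTER ROLE app_grafana SET TimeZone = 'Europe/Prague';
```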