Import or insert non-realtime or past historical sensor values into the Home Assistant PostgreSQL database

I am using Home Assistant in Docker with a PostgreSQL database. I chose PostgreSQL for familiarity and also for the option of infinite retention of past data (or at least a lot longer than the default 10 days!).

I would like to be able to specify a “measurement taken at” timestamp when adding a sensor reading to Home Assistant. I’d like to be able to do it automatically and programmatically, but manual would be OK. Ideally I could extend the POST request (and/or the manual input fields) with this field (which could default to ‘now’), but I could maybe live with historical imports only.

I have many use cases for this. I have sensors that are offline most of the time and batch-upload when they get connected. But, to start with a simple example, I would like to add my bodyweight, which I have been tracking for some time. I’d like to input the historical data for this so I can draw graphs etc. If I add an input sensor, I can add today’s weight, but Home Assistant will regard any data input as being recorded as of the current moment, and I often take my weight but only input it later.

I’m not the first to want this and won’t be the last, but I understand that HA will likely not implement it in core, as the whole system is built around realtime data and this is a huge complication.

I have seen a few options for historical data. Which option should I choose? Do any of you have experience and success with this?

Option 1 (doesn’t work): Some recommend using POST requests or File sensors, and indeed I can import data this way, but after a lot of experimentation I can confidently say that HA will block any messing with the sensor reading timestamp: IT’S ‘NOW’ OR NEVER!

Option 2: @ldotlopez has created a historical sensor module: GitHub - ldotlopez/ha-historical-sensor: Historical sensors for Home Assistant. I think this involves writing a custom integration using his code for each value you wish to add, as there are only two integrations using it (both energy). So I think I could maybe build an integration that uses the module? He does note that “☠️ The historical sensor is based on a **highly experimental hack**”, but I do think that this is going to be the case for anything involving non-realtime data!

Option 3: Then there’s @frenck’s Spook integration, which does a lot of things I don’t want/need and am a bit scared (wooooo!) to risk, but which in particular affects the recorder: Recorder - Spook 👻 a scary powerful toolbox for Home Assistant. This adds a way to import data, which would be an acceptable workaround for me but not quite what I want (being able to insert measurements for any sensor at any given time).

Option 4: Lastly, I could write queries against the PostgreSQL database directly. As long as I know what I’m doing, I think this is best, as it gives me the most flexibility and doesn’t touch anything that doesn’t need to be touched.

I think it is just a case of inserting into the states table, but I’m not sure if the statistics table is needed too (with such a large retention period I think it won’t matter for me, at least until HA becomes too slow).
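One thing I can do first is check whether HA is recording long-term statistics for the entity at all. A sketch (assuming the statistic_id matches the entity_id, as I believe it does for recorder-generated statistics):

SELECT count(*)
FROM statistics s
JOIN statistics_meta m ON m.id = s.metadata_id
WHERE m.statistic_id = 'sensor.bodyweight';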

Here is my idea, but I haven’t tried it yet:

WITH metadata AS (
    -- states_meta maps entity_id to the metadata_id that states rows reference
    SELECT metadata_id
    FROM states_meta
    WHERE entity_id = 'sensor.bodyweight'
)
INSERT INTO states (
    state, last_updated, last_updated_ts, metadata_id
)
VALUES (
    '71.5',             -- state is character varying, so quote it
    NULL,               -- legacy column, superseded by last_updated_ts
    1726520990,         -- epoch seconds for "measurement taken at"
    (SELECT metadata_id FROM metadata)
);
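If it turns out that the statistics tables do matter, I imagine a companion insert would look something like this (equally untested; it assumes the sensor already has a row in statistics_meta and that start_ts should be the top of the hour the reading falls in):

WITH meta AS (
    SELECT id
    FROM statistics_meta
    WHERE statistic_id = 'sensor.bodyweight'
)
INSERT INTO statistics (metadata_id, start_ts, created_ts, mean, min, max)
VALUES (
    (SELECT id FROM meta),
    1726520400,                 -- top of the hour containing the reading
    extract(epoch from now()),  -- when this row was written
    71.5, 71.5, 71.5            -- a single reading, so mean = min = max
);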

For reference, I ran a query to get the schema out, and it looks like this:

"table_schema","table_name","column_name","data_type","is_nullable","character_maximum_length"
"public","event_data","hash","bigint","YES",NULL
"public","event_data","shared_data","text","YES",NULL
"public","event_data","data_id","bigint","NO",NULL
"public","event_types","event_type","character varying","YES",64
"public","event_types","event_type_id","bigint","NO",NULL
"public","events","event_type_id","bigint","YES",NULL
"public","events","context_user_id_bin","bytea","YES",NULL
"public","events","event_type","character","YES",1
"public","events","event_id","bigint","NO",NULL
"public","events","context_parent_id","character","YES",1
"public","events","context_id_bin","bytea","YES",NULL
"public","events","origin","character","YES",1
"public","events","event_data","character","YES",1
"public","events","context_parent_id_bin","bytea","YES",NULL
"public","events","data_id","bigint","YES",NULL
"public","events","context_user_id","character","YES",1
"public","events","time_fired_ts","double precision","YES",NULL
"public","events","context_id","character","YES",1
"public","events","time_fired","timestamp with time zone","YES",NULL
"public","events","origin_idx","smallint","YES",NULL
"public","migration_changes","migration_id","character varying","NO",255
"public","migration_changes","version","smallint","NO",NULL
"public","recorder_runs","created","timestamp with time zone","NO",NULL
"public","recorder_runs","start","timestamp with time zone","NO",NULL
"public","recorder_runs","run_id","bigint","NO",NULL
"public","recorder_runs","closed_incorrect","boolean","NO",NULL
"public","recorder_runs","end","timestamp with time zone","YES",NULL
"public","schema_changes","change_id","bigint","NO",NULL
"public","schema_changes","schema_version","integer","YES",NULL
"public","schema_changes","changed","timestamp with time zone","NO",NULL
"public","state_attributes","hash","bigint","YES",NULL
"public","state_attributes","attributes_id","bigint","NO",NULL
"public","state_attributes","shared_attrs","text","YES",NULL
"public","states","state_id","bigint","NO",NULL
"public","states","state","character varying","YES",255
"public","states","context_user_id","character","YES",1
"public","states","old_state_id","bigint","YES",NULL
"public","states","metadata_id","bigint","YES",NULL
"public","states","last_updated","timestamp with time zone","YES",NULL
"public","states","origin_idx","smallint","YES",NULL
"public","states","last_changed_ts","double precision","YES",NULL
"public","states","event_id","smallint","YES",NULL
"public","states","attributes","character","YES",1
"public","states","context_parent_id_bin","bytea","YES",NULL
"public","states","context_id","character","YES",1
"public","states","entity_id","character","YES",1
"public","states","context_parent_id","character","YES",1
"public","states","context_user_id_bin","bytea","YES",NULL
"public","states","attributes_id","bigint","YES",NULL
"public","states","last_updated_ts","double precision","YES",NULL
"public","states","last_changed","timestamp with time zone","YES",NULL
"public","states","context_id_bin","bytea","YES",NULL
"public","states","last_reported_ts","double precision","YES",NULL
"public","states_meta","entity_id","character varying","YES",255
"public","states_meta","metadata_id","bigint","NO",NULL
"public","statistics","metadata_id","bigint","YES",NULL
"public","statistics","start","timestamp with time zone","YES",NULL
"public","statistics","state","double precision","YES",NULL
"public","statistics","id","bigint","NO",NULL
"public","statistics","start_ts","double precision","YES",NULL
"public","statistics","last_reset","timestamp with time zone","YES",NULL
"public","statistics","created","timestamp with time zone","YES",NULL
"public","statistics","last_reset_ts","double precision","YES",NULL
"public","statistics","min","double precision","YES",NULL
"public","statistics","created_ts","double precision","YES",NULL
"public","statistics","mean","double precision","YES",NULL
"public","statistics","sum","double precision","YES",NULL
"public","statistics","max","double precision","YES",NULL
"public","statistics_meta","statistic_id","character varying","YES",255
"public","statistics_meta","source","character varying","YES",32
"public","statistics_meta","has_sum","boolean","YES",NULL
"public","statistics_meta","has_mean","boolean","YES",NULL
"public","statistics_meta","name","character varying","YES",255
"public","statistics_meta","id","bigint","NO",NULL
"public","statistics_meta","unit_of_measurement","character varying","YES",255
"public","statistics_runs","run_id","bigint","NO",NULL
"public","statistics_runs","start","timestamp with time zone","NO",NULL
"public","statistics_short_term","last_reset","timestamp with time zone","YES",NULL
"public","statistics_short_term","max","double precision","YES",NULL
"public","statistics_short_term","created_ts","double precision","YES",NULL
"public","statistics_short_term","metadata_id","bigint","YES",NULL
"public","statistics_short_term","state","double precision","YES",NULL
"public","statistics_short_term","mean","double precision","YES",NULL
"public","statistics_short_term","created","timestamp with time zone","YES",NULL
"public","statistics_short_term","last_reset_ts","double precision","YES",NULL
"public","statistics_short_term","start_ts","double precision","YES",NULL
"public","statistics_short_term","sum","double precision","YES",NULL
"public","statistics_short_term","start","timestamp with time zone","YES",NULL
"public","statistics_short_term","min","double precision","YES",NULL
"public","statistics_short_term","id","bigint","NO",NULL

Either make the insert directly with your own SQL command or look into the Spook integration.

Hi Wally, thanks for your response.

Is the query that I included writing to the correct tables (states and states_meta)? Is it fine to ignore the statistics table?

WITH metadata AS (
    SELECT metadata_id
    FROM states_meta
    WHERE entity_id = 'sensor.bodyweight'
)
INSERT INTO states (
    state, last_updated, last_updated_ts, metadata_id
)
VALUES (
    '71.5',
    NULL,
    1726520990,
    (SELECT metadata_id FROM metadata)
);
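And to verify the insert afterwards, I was planning to read the rows back through the states_meta join, something like:

SELECT s.state, to_timestamp(s.last_updated_ts) AS recorded_at
FROM states s
JOIN states_meta m ON m.metadata_id = s.metadata_id
WHERE m.entity_id = 'sensor.bodyweight'
ORDER BY s.last_updated_ts DESC
LIMIT 10;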

I do not know, because I chose the Spook solution. :slight_smile:

OK, thank you. I will hope that someone else comes along who knows the Home Assistant database, because the Spook integration is only really useful for manual one-off imports; I would ideally like to be able to regularly insert data points as at a given timestamp.

There are service calls in Spook to make it automated.
I use that for inserting electric costs.

But search the forum for threads with the keywords “historical”, “past”, “data”

I did of course start by searching the forum; that’s how I found @frenck’s Spook extension and @ldotlopez’s thread, which I referred to in my OP.

There is also How to add manually values to a sensor (to cover the past quarter with data)? which has a link to the database documentation (which in turn links to Home Assistant Events | Home Assistant Data Science Portal) but this does not begin to explain what I want to do.

I am certain that I’m not the first to try to insert a past value straight into the database, and I sure would love to learn from someone else’s mistakes before I make my own!

I’ll look into Spook’s service calls; if they’re suitable for inserting single values, that would be great.

I have followed the threads about historical data input for some time, and I know there are some threads with posts from people who have done it the SQL way; they posted examples or linked to their GitHub repositories.

Hi,

I’m not actively developing ‘historical sensors’ right now; I’ve hit a dead end.

A few months ago the HA developers added a layer between the database and the states code; it’s not possible to manipulate the metadata or states tables without going through it.

You don’t notice this as long as you only have one historical sensor active, but as soon as there is more than one, exceptions about database integrity start to appear.

Since I haven’t been able to find any documentation on this layer I haven’t been able to move forward.

Right now the simplest and safest option is to use HA statistics, via Spook or via the add_statistics or import_statistics methods.