Is there any way to send backdated sensor data to HA?

I asked a more specific version of this question about MQTT sensors last year, and never got any replies, so I am thinking this is not possible, but wanted to ask a more general version.

I want to send sensor readings to HA with an effective datetime other than “now”. For example, I want to be able to say “this sensor read 1.5 at 12:30pm” when it is actually 2pm.

I suspect this isn’t possible because HA is focused on automations and you wouldn’t really drive an automation on something like this. Still, there are a number of use cases where I get bulk updates of data, and want to store those in HA so that I can look at the trends over time.

Yes, you can do it by sending an event via an automation.

Thanks, I got that to work, left a comment on that thread with the syntax I used.

Any idea if this can be used with MQTT though? That’s my ultimate use case. Right now I’ve only gotten it to work with the developer tools in the HA UI. I’m not sure where else I can manually send the state_changed event.

You can create an automation that fires an event as the action (“manual event” in the UI). And of course you can use an MQTT trigger.

Thanks, that seems promising. However, I can’t seem to get an automation to actually update the state. My YAML in the action of my automation:

event: state_changed
event_data:
  entity_id: input_number.test_number
  new_state:
    entity_id: input_number.test_number
    state: "3.18"
    attributes: {}
    last_changed: "2023-12-19T04:18:00.123400+00:00"
    last_updated: "2023-12-19T04:18:00.123400+00:00"
    context:
      id: "{{ context.id }}"
      parent_id: null
      user_id: null

Originally I had those values coming from the MQTT payload, and I had less stuff (attributes, last_updated, the context), but it wasn’t working and I slowly tried to match the state_changed event emitted by the dev tools as closely as possible.

I really can’t figure out what the difference is now. What I am doing is going into dev tools and listening to the event state_changed then triggering state_changed via the dev tools then via my automation. The one triggered from the dev tools works (inserts a row into the states table in the DB), and the one triggered from the automation does not work. Here are the payloads as captured by listening with the dev tools:

# Triggered from dev tools, worked
event_type: state_changed
data:
  entity_id: input_number.test_number
  new_state:
    entity_id: input_number.test_number
    state: "3.18"
    attributes: {}
    last_changed: "2023-12-19T04:18:00.123400+00:00"
    last_updated: "2023-12-19T04:18:00.123400+00:00"
    context:
      id: 01HJ08EPGF9A2WZ1AAINFW93NK
      parent_id: null
      user_id: null
origin: REMOTE
time_fired: "2023-12-19T05:23:23.279688+00:00"
context:
  id: 01HJ08EPGF9A2WZ1AAINFW93NK
  parent_id: null
  user_id: 90faf9b3cfe54ed7b2a8b6b28bf9ab12

# Triggered from automation, does not work
event_type: state_changed
data:
  entity_id: input_number.test_number
  new_state:
    entity_id: input_number.test_number
    state: "3.18"
    attributes: {}
    last_changed: "2023-12-19T04:18:00.123400+00:00"
    last_updated: "2023-12-19T04:18:00.123400+00:00"
    context:
      id: 01HJ092MYT8RA8EPCDHQ0WZ28E
      parent_id: null
      user_id: null
origin: LOCAL
time_fired: "2023-12-19T05:34:17.250880+00:00"
context:
  id: 01HJ092MYT8RA8EPCDHQ0WZ28E
  parent_id: null
  user_id: null

I ran those through diff, and the only differences I think could matter are the LOCAL vs REMOTE origin and the presence of the user_id. I don’t think there’s anything I can do about those though, and they seem unlikely to be causing the problem.

Any ideas how I can trigger state_changed from an automation?

In the automation, call the action that generates the event; here’s a full example. For your case, put all of that event data in the event_data section.

# Script to turn a switch on
# Template Parameters
# switch
# turn_on
# tv_powerstrip_hub
# 10
rs_tv_powerstrip_hub_switch_turn_on:
  description: "reliable tv_powerstrip_hub switch turn_on"
  mode: queued
  variables:
    start_time: "{{ now().timestamp() }}"
  sequence:
    - service: script.rs_switch_turn_on
      continue_on_error: true
      data_template:
        entity_id: switch.tv_powerstrip_hub
        message_id: input_text.rs_switch_tv_powerstrip_hub_message
        retry_count_id: counter.rs_switch_tv_powerstrip_hub_retry
        error_count_id: counter.rs_switch_tv_powerstrip_hub_err
        call_count_id: counter.rs_switch_tv_powerstrip_hub_calls
        timeout_seconds: 10
    - event: rs_call_complete_switch_tv_powerstrip_hub
      event_data:
        duration: "{{ ((now().timestamp() - start_time)*1000)|round(0) }}"

Be aware that if all of that is in the trigger part of the automation, what you are actually saying is: fire this automation when a state_changed event is received and ALL of the data in the data section matches the data in that event.

I’ve spent quite a while trying to get the state_changed event to record data from an automation and then from a script, but still no luck. Here’s my full script:

alias: Test State Changed Event
sequence:
  - event: state_changed
    event_data:
      entity_id: input_number.test_number
      new_state:
        entity_id: input_number.test_number
        state: "3.22"
        attributes: {}
        last_changed: "2023-12-19T04:22:00.123400+00:00"
        last_updated: "2023-12-19T04:22:00.123400+00:00"
        context:
          id: "{{ context.id }}"
          parent_id: null
          user_id: null
      old_state:
        entity_id: input_number.test_number
        state: "1.1"
        attributes: {}
        last_changed: "2023-12-19T21:29:00.003517+00:00"
        last_updated: "2023-12-19T21:29:00.003517+00:00"
        context:
          id: "{{ context.id }}"
          parent_id: null
          user_id: null
mode: single

Running that script shows this error in the logs:

2023-12-19 16:46:53.170 ERROR (MainThread) [homeassistant.core] Error running job: <Job listen state_changed HassJobType.Callback <function handle_subscribe_entities.<locals>.forward_entity_changes at 0x7ff189cf00e0>>
Traceback (most recent call last):
  File "/home/dale/.pyenv/versions/3.11.2/envs/homeassistant/lib/python3.11/site-packages/homeassistant/core.py", line 1061, in async_fire
    job.target(event)
  File "/home/dale/.pyenv/versions/3.11.2/envs/homeassistant/lib/python3.11/site-packages/homeassistant/components/websocket_api/commands.py", line 310, in forward_entity_changes
    connection.send_message(messages.cached_state_diff_message(msg["id"], event))
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dale/.pyenv/versions/3.11.2/envs/homeassistant/lib/python3.11/site-packages/homeassistant/components/websocket_api/messages.py", line 109, in cached_state_diff_message
    return _cached_state_diff_message(event).replace(IDEN_JSON_TEMPLATE, str(iden), 1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dale/.pyenv/versions/3.11.2/envs/homeassistant/lib/python3.11/site-packages/homeassistant/components/websocket_api/messages.py", line 120, in _cached_state_diff_message
    {"id": IDEN_TEMPLATE, "type": "event", "event": _state_diff_event(event)}
                                                    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dale/.pyenv/versions/3.11.2/envs/homeassistant/lib/python3.11/site-packages/homeassistant/components/websocket_api/messages.py", line 147, in _state_diff_event
    return _state_diff(event_old_state, event_new_state)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dale/.pyenv/versions/3.11.2/envs/homeassistant/lib/python3.11/site-packages/homeassistant/components/websocket_api/messages.py", line 157, in _state_diff
    new_state_context = new_state.context
                        ^^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'context'

Note that I do have a context in my new_state, but the problem is that the code is expecting a State object, and this new_state is just a dict. Also for the record, I tried it without any old_state payload and just got a different, but similar error where the code was failing to parse this dict that it expects to be a State object.

I added a logger to the code to print the objects, and you can see the difference between my objects and the ones HA creates.

{'entity_id': 'input_number.test_number', 'state': '3.22', 'attributes': {}, 'last_changed': '2023-12-19T04:22:00.123400+00:00', 'last_updated': '2023-12-19T04:22:00.123400+00:00', 'context': {'id': '01HJ20QHA5A7WM2J18PC2GEZA3', 'parent_id': None, 'user_id': None}}
<state input_number.test_number=0.5; initial=None, editable=True, min=0.0, max=100.0, step=0.01, mode=slider, unit_of_measurement=ft, friendly_name=Test Number @ 2023-12-19T17:39:29.431559-05:00>

You can see the type of the object is just different, and yet I can’t figure out what I could do to change that, since all I have access to is the YAML I’m putting in the event_data payload of the script above.
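The traceback above comes down to attribute access on a plain dict. Here is a minimal sketch of that failure in plain Python. FakeState is a hypothetical stand-in used only for illustration; it is not Home Assistant's State class.

```python
from dataclasses import dataclass, field


@dataclass
class FakeState:
    """Hypothetical stand-in for an object that exposes a .context attribute."""

    entity_id: str
    state: str
    context: dict = field(default_factory=dict)


def read_context(new_state):
    # Same kind of access as the failing line in the traceback:
    # attribute lookup (new_state.context), not key lookup (new_state["context"]).
    return new_state.context


as_object = FakeState("input_number.test_number", "3.22", {"id": "abc"})
as_dict = {
    "entity_id": "input_number.test_number",
    "state": "3.22",
    "context": {"id": "abc"},
}

# Works: the object has a .context attribute.
object_context = read_context(as_object)

# Fails: the dict has a "context" key, but no .context attribute.
try:
    read_context(as_dict)
    error_text = ""
except AttributeError as err:
    error_text = str(err)

print(object_context)
print(error_text)
```

So even though the YAML payload contains a context key, code that expects an object with attributes cannot read it.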

Ok, well I got it to “work”, but only by inserting some hacky code into the HA codebase to convert the dict states into true HA State objects when processing Events. I won’t bother posting the code here, because I don’t think anyone should “fix” this like this, but for the record, I’m checking the class of the “new_state” key within the if event.event_type == EVENT_STATE_CHANGED branch of the _process_one_event method in the recorder/core.py file.

My main goal in doing that was just confirming this was the problem, which appears to be the case. So to review, the issue is that scripts (and automations) do not send a true State object as part of the state_changed event; instead they just send a dict, which the code cannot handle. On the other hand, the Events page of the dev tools does send real State objects, so it does work from there.

My “fix” is checking these Events when the recorder pulls them off, and converting the state dicts to State objects if needed. I think the more proper fix would be finding where in the code these Events are put on the bus and ensuring they are converted to States there. I may try to create a PR to do that, but I’m not familiar with the HA code base, or with Python, so it’s unlikely I’ll be able to figure that out (I only found the spot in the recorder code because I had stack traces to guide me). If anyone has any ideas and wants to point me in the right direction that would be appreciated.
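To illustrate the general idea of converting a dict into a typed object at one boundary, here is a rough sketch. It is not the patch described above and not Home Assistant code: DemoState and coerce_state are hypothetical names, and the field list is an assumption based on the payloads shown earlier in this thread.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class DemoState:
    """Hypothetical stand-in; NOT homeassistant.core.State."""

    entity_id: str
    state: str
    attributes: dict = field(default_factory=dict)
    last_changed: Optional[datetime] = None
    last_updated: Optional[datetime] = None


def _parse_timestamp(text):
    # ISO 8601 strings such as "2023-12-19T04:22:00.123400+00:00".
    return datetime.fromisoformat(text) if text else None


def coerce_state(value):
    """Return a DemoState, converting from a dict only when needed."""
    if value is None or isinstance(value, DemoState):
        return value
    return DemoState(
        entity_id=value["entity_id"],
        state=str(value["state"]),
        attributes=dict(value.get("attributes") or {}),
        last_changed=_parse_timestamp(value.get("last_changed")),
        last_updated=_parse_timestamp(value.get("last_updated")),
    )


payload = {
    "entity_id": "input_number.test_number",
    "state": "3.22",
    "attributes": {},
    "last_changed": "2023-12-19T04:22:00.123400+00:00",
    "last_updated": "2023-12-19T04:22:00.123400+00:00",
}
converted = coerce_state(payload)
print(converted)
```

Objects that are already the right type pass through unchanged, so a check like this could sit in a single place without affecting normal events.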

More realistically though, I do wonder if there is any way to get this to work as is? Has anyone actually sent state_changed events from an automation or script? Does anyone know of a way to create a true State object with templating?

What do you mean by ‘send a bunch of events’?

I had a look at the devtools section. When you submit an event, it makes a REST call, which requires a bearer token. Maybe you can write a REST automation or shell script to do the same POST.

You just need to create a Home Assistant user account to authenticate with in order to create the bearer token. (Whitelist the IP to your HA IP if possible; I can’t remember.)

Then use curl to authenticate with ha and get a bearer token.

Then use curl to post the state_changed event change to the rest service with bearer token in the header. The same way the UI does.
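The same POST can be sketched in Python instead of curl. This assumes Home Assistant's REST API endpoint /api/events/<event_type> and a long-lived access token created from the user profile; the host name and token below are placeholders, and build_event_request is a hypothetical helper. Whether the recorder accepts a state_changed event sent this way is not confirmed in this thread.

```python
import json
import urllib.request


def build_event_request(base_url, token, event_type, event_data):
    """Build (but do not send) a POST request that fires an event over REST."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/events/{event_type}",
        data=json.dumps(event_data).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token: replace with your own values.
request = build_event_request(
    "http://homeassistant.local:8123",
    "YOUR_LONG_LIVED_ACCESS_TOKEN",
    "state_changed",
    {
        "entity_id": "input_number.test_number",
        "new_state": {
            "entity_id": "input_number.test_number",
            "state": "3.18",
            "attributes": {},
            "last_changed": "2023-12-19T04:18:00.123400+00:00",
            "last_updated": "2023-12-19T04:18:00.123400+00:00",
        },
    },
)

print(request.get_method(), request.full_url)
# To actually send it, uncomment the next line:
# urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and body before pointing it at a live instance.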

Thanks, that’s an interesting idea I hadn’t considered. It’s worth pointing out that all of this can be avoided if you just manually insert the data into the database, which is what I’m currently doing. I was just trying to see if there was a less brittle way to do it that wouldn’t break when there are schema changes to the database.

I’m not sure if these hacky methods are better than just inserting into the database, but I’m still hopeful for a more official way to add backdated sensor data.

Just to highlight why someone might want this. My electricity company provides data with a 24 hour delay in half-hour segments. I want to load this once per day and just batch set the prior values.

It’s strange that you can’t simply provide a timestamp when setting the value. I’ll look into the DB idea; however, HA currently runs in a container separate from the code that would download the data, and I’d prefer to keep them isolated.

I am in a similar situation.
My water company posts delayed hourly usage readings once a day.
I can scrape them from the webpage and want to post them to HA backdated to the time they occurred.

I know I can manually insert them into the states table of the home-assistant_v2.db SQLite database, but that sounds a bit brittle if HA is running.

Also, I don’t know how best to trigger the statistics routines to update the statistics and short_term_statistics table for the given sensor. Or even if it is possible.

+1 here. Energy readings from my provider are not scrapeable, and I want to manually enter them, backdated, only once a month or so.

@puterboy do you have any guide / link on how to manually insert the data in the DB?

See also this conversation: Manual input for electricity, gas and water consumption - #13 by StephenH

I posted some reasonably robust Python code to insert backdated state data into the states table. The source can be either a CSV file or a SQLite db – with 2 columns (one for timestamp in UTC, other for data to be entered)
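For readers who just want the shape of the two-column approach, here is a small sketch. It is not the code from the linked post, and it deliberately writes to a made-up demo_states table in an in-memory database rather than the real Home Assistant schema, which has more columns and changes between releases. The entity ID and sample rows are also invented.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone


def load_readings(csv_text):
    """Parse 'timestamp,value' rows; naive timestamps are treated as UTC."""
    rows = []
    for ts_text, value in csv.reader(io.StringIO(csv_text)):
        ts = datetime.fromisoformat(ts_text)
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)
        rows.append((ts.timestamp(), value))
    return rows


def insert_readings(conn, entity_id, rows):
    """Insert backdated rows into a hypothetical demo_states table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS demo_states ("
        "entity_id TEXT, state TEXT, last_updated_ts REAL)"
    )
    conn.executemany(
        "INSERT INTO demo_states (entity_id, state, last_updated_ts) "
        "VALUES (?, ?, ?)",
        [(entity_id, value, ts) for ts, value in rows],
    )
    conn.commit()


# Invented sample data: two half-hour readings.
sample = "2023-12-19T04:00:00,1.5\n2023-12-19T04:30:00,1.7\n"
rows = load_readings(sample)

conn = sqlite3.connect(":memory:")
insert_readings(conn, "sensor.demo_water", rows)
count = conn.execute("SELECT COUNT(*) FROM demo_states").fetchone()[0]
print(count)
```

Storing timestamps as UTC epoch seconds keeps the rows unambiguous regardless of the time zone of the machine doing the import.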

https://community.home-assistant.io/t/enter-delayed-aka-backdated-mqtt-data-asynchronously-based-on-attached-timestamp/750485/7?u=puterboy


See the following link for an updated version that also adds the corresponding statistics to the statistics and short_term_statistics tables.
