Merge statistics of old entity to new entity

theneweinstein · October 1, 2022, 10:01am

Hi,

It would be nice if we could merge entities in the statistics database. When you rename an entity or replace a sensor, for instance, it would be nice if the data from the old sensor could be merged with the new sensor. For example, my energy dashboard now has two sensors for my solar panel production because I started using a different integration to monitor my panels. It would be great if I could merge them into one.

tom_l · October 4, 2022, 12:07pm

Looks like you get your wish next month:

github.com/home-assistant/core

Fix preserving long term statistics when entity_id is changed

home-assistant:dev ← home-assistant:recorder_fix_startup

opened 07:54PM - 03 Oct 22 UTC

emontnemery

+6 -1

Proposed change: Correct recorder startup to ensure long-term statistics are preserved when entity_id is changed via the UI.

janmolemans · March 4, 2023, 11:36am

I have created a script that merges two existing entities: hass_utils/merge_sensor_statistics.ipynb at 51403be6a1f180c02c6e66e1bdc75e493192006e · janmolemans/hass_utils · GitHub

theneweinstein · March 5, 2023, 4:42pm

Thanks! I needed to make some adjustments for my use case, but now I finally have all my old entities merged into the new ones.

BluetriX · March 25, 2023, 5:27pm

Hello all,

I have the same problem. I have measured the energy with different meters (Shelly Plug S, Shelly 1PM, Shelly Plus 1PM, Solarman Integration).

Now I only use the Shelly Plus 1PM. For the energy history, the statistics of the old sensors are still there, but the sensors are otherwise dead. The Energy Dashboard settings show “Entity not defined” for them.

Is there an option to merge the old values? The script didn’t work for me and broke the DB (the backup helped).

theneweinstein · March 25, 2023, 6:26pm

I had the same issue: the script broke my database. In the end I rewrote part of the script so that it updates the database instead of completely overwriting it (merge_statistics · GitHub).

It typically still gives an error, since the timestamp of the last entry of the old sensor is the same as the timestamp of the first entry of the new sensor, but this doesn’t break the database. The error message tells you which timestamp is duplicated. In the end I had to manually remove one of these duplicates before I could merge (I used the phpMyAdmin add-on for this).
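To see which timestamps would collide before merging, the overlap can be listed first. The following is only a minimal sketch, not part of the linked script. It assumes an SQLite database and the `metadata_id` / `start_ts` column names that appear in the UNIQUE constraint error quoted later in this thread. The function name `find_duplicate_timestamps` and the demo rows are made up for illustration.

```python
import sqlite3

# Table names taken from the loop shown in the traceback in this thread.
ALLOWED_TABLES = ("statistics", "statistics_short_term")


def find_duplicate_timestamps(con, table, source_id, target_id):
    """Return the start_ts values that exist for both metadata ids."""
    # A table name cannot be passed as a bound parameter, so restrict it
    # to the known statistics tables before formatting it into the SQL.
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    rows = con.execute(
        f"""SELECT s.start_ts
            FROM {table} AS s
            JOIN {table} AS t ON t.start_ts = s.start_ts
            WHERE s.metadata_id = ? AND t.metadata_id = ?
            ORDER BY s.start_ts""",
        (source_id, target_id),
    ).fetchall()
    return [row[0] for row in rows]


# Demo on an in-memory database with a simplified, assumed schema.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE statistics ("
    "metadata_id INTEGER, start_ts REAL, sum REAL, "
    "UNIQUE (metadata_id, start_ts))"
)
con.executemany(
    "INSERT INTO statistics VALUES (?, ?, ?)",
    [
        (1, 3600.0, 10.0),   # old sensor
        (1, 7200.0, 12.0),   # old sensor, last entry
        (2, 7200.0, 0.0),    # new sensor, first entry (same hour)
        (2, 10800.0, 1.5),   # new sensor
    ],
)
duplicates = find_duplicate_timestamps(con, "statistics", 1, 2)
print(duplicates)
```

Rows listed here are the ones that would have to be removed or skipped on one side before the old entries can be moved. Work on a copy of the database, or make a backup first.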

You will also have to correct the statistics after merging (Developer tools → Statistics → adjust sum). Without that, I got a large negative value at the transition between the old and the new sensor.
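The negative value is easier to picture with numbers. This is only an illustrative sketch: it assumes that `sum` is a running total and that the dashboard shows the difference between consecutive `sum` values. The helper `offset_new_sums` is hypothetical, and it treats the first entry of the new sensor as the continuation point.

```python
def offset_new_sums(old_sums, new_sums):
    """Shift new_sums so they continue from the last value of old_sums."""
    if not old_sums or not new_sums:
        return list(new_sums)
    # Difference between where the old series ended and the new one starts.
    offset = old_sums[-1] - new_sums[0]
    return [value + offset for value in new_sums]


def steps(series):
    """Differences between consecutive values (what a bar chart would show)."""
    return [b - a for a, b in zip(series, series[1:])]


old_sums = [100.0, 105.0, 110.0]   # made-up running totals of the old sensor
new_sums = [0.0, 2.0, 5.0]         # new sensor restarts its total at zero

naive = old_sums + new_sums
corrected = old_sums + offset_new_sums(old_sums, new_sums)

print(steps(naive))      # contains the large negative jump at the boundary
print(steps(corrected))  # no negative jump after shifting the new totals
```

In this toy data the naive concatenation drops from 110 to 0 at the boundary, which is the kind of jump the adjust sum dialog is meant to correct.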

Bart1992 · April 1, 2023, 7:09pm

Hi @theneweinstein

Thanks for your script.

However, the script crashes in the merge function. I’m getting the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File /usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py:1410, in Connection.execute(self, statement, parameters, execution_options)
   1409 try:
-> 1410     meth = statement._execute_on_connection
   1411 except AttributeError as err:

AttributeError: 'str' object has no attribute '_execute_on_connection'

The above exception was the direct cause of the following exception:

ObjectNotExecutableError                  Traceback (most recent call last)
Cell In[5], line 5
      3 for table in ("statistics", "statistics_short_term"):    
      4     print(table)
----> 5     merge(source_sensor, target_sensor, table)

Cell In[4], line 5, in merge(source_sensor, target_sensor, table)
      2 print(f"source: {source_sensor}, target: {target_sensor}")
      4 # read data from target sensor
----> 5 target_sensor_id=pandas.read_sql_query(f"""select id FROM statistics_meta 
      6                                             where statistic_id like '{target_sensor}';""", con).loc[0,'id']
      7 target_df=pandas.read_sql_query(
      8         f"select * FROM {table} where metadata_id = '{target_sensor_id}';", con
      9     )
     10 print(f"length of existing statistics for target sensor: {len(target_df)}")

File /usr/local/lib/python3.9/dist-packages/pandas/io/sql.py:397, in read_sql_query(sql, con, index_col, coerce_float, params, parse_dates, chunksize, dtype)
    339 """
    340 Read SQL query into a DataFrame.
    341 
   (...)
    394 parameter will be converted to UTC.
    395 """
    396 pandas_sql = pandasSQL_builder(con)
--> 397 return pandas_sql.read_query(
    398     sql,
    399     index_col=index_col,
    400     params=params,
    401     coerce_float=coerce_float,
    402     parse_dates=parse_dates,
    403     chunksize=chunksize,
    404     dtype=dtype,
    405 )

File /usr/local/lib/python3.9/dist-packages/pandas/io/sql.py:1560, in SQLDatabase.read_query(self, sql, index_col, coerce_float, parse_dates, params, chunksize, dtype)
   1512 """
   1513 Read SQL query into a DataFrame.
   1514 
   (...)
   1556 
   1557 """
   1558 args = _convert_params(sql, params)
-> 1560 result = self.execute(*args)
   1561 columns = result.keys()
   1563 if chunksize is not None:

File /usr/local/lib/python3.9/dist-packages/pandas/io/sql.py:1405, in SQLDatabase.execute(self, *args, **kwargs)
   1403 def execute(self, *args, **kwargs):
   1404     """Simple passthrough to SQLAlchemy connectable"""
-> 1405     return self.connectable.execution_options().execute(*args, **kwargs)

File /usr/local/lib/python3.9/dist-packages/sqlalchemy/engine/base.py:1412, in Connection.execute(self, statement, parameters, execution_options)
   1410     meth = statement._execute_on_connection
   1411 except AttributeError as err:
-> 1412     raise exc.ObjectNotExecutableError(statement) from err
   1413 else:
   1414     return meth(
   1415         self,
   1416         distilled_parameters,
   1417         execution_options or NO_OPTIONS,
   1418     )

ObjectNotExecutableError: Not an executable object: "select id FROM statistics_meta \n                                                where statistic_id like 'sensor.xx_52_total_lifetime_energy_output';"

I tried to debug the code, but could not find anything strange… Can you point me in the right direction? (print(con) gives a positive result, so it seems to be connected to the database.)

Thanks in advance,

theneweinstein · April 1, 2023, 8:00pm

It looks like the same problem I recently had. Apparently the pandas package does not work nicely with SQLAlchemy 2. I solved that by forcing an older version of SQLAlchemy to install in the init_commands of the jupyterlab addon (pip install "SQLAlchemy<2").
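For an SQLite database, another way to sidestep the version mismatch for this particular lookup is to skip pandas and SQLAlchemy and query through Python's built-in `sqlite3` module. This is a sketch under assumptions, not a change to the script: `get_metadata_id` is a hypothetical helper, the demo table is a simplified stand-in for `statistics_meta`, and the entity ids are made up. It uses an exact match with a bound parameter instead of the `like` plus f-string from the traceback.

```python
import sqlite3


def get_metadata_id(con, statistic_id):
    """Look up the statistics_meta id for one statistic_id."""
    row = con.execute(
        "SELECT id FROM statistics_meta WHERE statistic_id = ?",
        (statistic_id,),
    ).fetchone()
    if row is None:
        raise LookupError(f"no statistics_meta row for {statistic_id!r}")
    return row[0]


# Demo on an in-memory database; a real setup would connect to a copy
# of the Home Assistant database file instead.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE statistics_meta (id INTEGER PRIMARY KEY, statistic_id TEXT)"
)
con.executemany(
    "INSERT INTO statistics_meta (statistic_id) VALUES (?)",
    [("sensor.old_energy",), ("sensor.new_energy",)],
)

source_sensor_id = get_metadata_id(con, "sensor.old_energy")
target_sensor_id = get_metadata_id(con, "sensor.new_energy")
print(source_sensor_id, target_sensor_id)
```

This does not apply to MariaDB or MySQL setups, where pinning SQLAlchemy as described above remains the route taken in this thread.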

KNXBroker · April 9, 2023, 3:39pm

Since the latest HA release, 2023.4, changing an entity ID keeps its history data. This should help when moving to a new entity! Rename your old entity ID to the new name, remove the old integration, then install the new integration.

theneweinstein · April 10, 2023, 6:43am

Yes, I was very happy to read that. Next time it will be a lot easier to keep my history.

gajotnt · June 13, 2023, 9:07pm

Can you explain a little more please?
I have this sensor (sensor.powcasath_energy_total) that isn’t updated anymore, but I’d like to keep its stats in the energy dashboard. I now use a Shelly (sensor.shellyem_channel_1_energy).

How can I merge them?
If I rename “sensor.powcasath_energy_total” to “sensor.shellyem_channel_1_energy”, will it merge them?

pergola.fabio · June 14, 2023, 5:07am

Interested in this too

KNXBroker · June 14, 2023, 5:46am

I don’t think that merging entities is possible with the approach described above. That approach can be used to continue an old statistic with a new device or name.

pergola.fabio · June 14, 2023, 6:00am

Hmm, so now what? How do we merge entities? Is there still no official way?

theneweinstein · June 14, 2023, 5:52pm

Merge is indeed maybe the wrong word. The script was meant to append the entries of the old entity_id to the new entity_id. However, with newer versions of Home Assistant this is not needed anymore, as Home Assistant now handles renamed entities in the statistics database. I also don’t know if the script still works, since the database format has been updated recently.

pergola.fabio · June 14, 2023, 6:57pm

Renaming is indeed an option, but how do you rename an old entity to the ID of an already existing entity?

KNXBroker · June 14, 2023, 7:34pm

As mentioned, I don’t think it is possible to merge two statistics with the standard Home Assistant tools; you can only continue an existing one. Merging is only possible via the database itself, as mentioned above, but I would not recommend this unless you know what you are doing.

gajotnt · June 14, 2023, 8:31pm

Well yes, not recommended for the faint of heart lol
It just sucks having several unused sensors in the energy dashboard (because we upgrade/change devices), and I don’t want to lose that information. Keeping track of data is one of the reasons we use Home Assistant.

ndrwha331 · August 4, 2023, 12:24am

@theneweinstein, I’m getting this error:

sqlite3.IntegrityError: UNIQUE constraint failed: statistics.metadata_id, statistics.start_ts

It makes sense not to duplicate start_ts, but how do I implement that?

ndrwha331 · August 4, 2023, 1:25am

It’s a bit hairy (my first nested SQL statement) but it works for me.

    stmnt = f"""UPDATE {table}
                 SET metadata_id = {target_sensor_id}
                 WHERE metadata_id = {source_sensor_id} AND start_ts NOT IN
                     (SELECT s.start_ts FROM
                         (SELECT start_ts FROM {table}
                             WHERE metadata_id = {source_sensor_id}) s
                         INNER JOIN
                         (SELECT start_ts FROM {table}
                             WHERE metadata_id = {target_sensor_id}) t ON s.start_ts = t.start_ts);
             """

The idea is to check for duplicated start_ts rows and ignore them in UPDATE.
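The same idea can be tried out on an in-memory SQLite database. The sketch below is not the statement above but a shorter variant. Any source row whose start_ts shows up in the inner join is, by definition, one whose start_ts also exists for the target, so checking against the target's timestamps directly should select the same rows. The function name `move_statistics`, the simplified schema and the demo rows are assumptions for illustration, and start_ts is assumed to be non-NULL.

```python
import sqlite3

ALLOWED_TABLES = ("statistics", "statistics_short_term")


def move_statistics(con, table, source_id, target_id):
    """Re-point source rows to target_id, skipping timestamps that
    already exist for the target. Returns the number of rows moved."""
    # Table names cannot be bound as parameters, so whitelist them.
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    cur = con.execute(
        f"""UPDATE {table}
            SET metadata_id = ?
            WHERE metadata_id = ?
              AND start_ts NOT IN
                  (SELECT start_ts FROM {table} WHERE metadata_id = ?)""",
        (target_id, source_id, target_id),
    )
    con.commit()
    return cur.rowcount


con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE statistics ("
    "metadata_id INTEGER, start_ts REAL, sum REAL, "
    "UNIQUE (metadata_id, start_ts))"
)
con.executemany(
    "INSERT INTO statistics VALUES (?, ?, ?)",
    [
        (1, 3600.0, 10.0),   # source only: will be moved
        (1, 7200.0, 12.0),   # also exists for target: will be skipped
        (2, 7200.0, 0.0),
        (2, 10800.0, 1.5),
    ],
)

moved = move_statistics(con, "statistics", 1, 2)
target_ts = [r[0] for r in con.execute(
    "SELECT start_ts FROM statistics WHERE metadata_id = 2 ORDER BY start_ts")]
leftover_ts = [r[0] for r in con.execute(
    "SELECT start_ts FROM statistics WHERE metadata_id = 1 ORDER BY start_ts")]
print(moved, target_ts, leftover_ts)
```

The skipped rows stay attached to the old metadata id, so they would still need to be deleted or reviewed separately.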