Understanding data generation (Recorder, Logbook, History, Energy)

Good morning everyone,
I was looking into the data being stored in and by Home Assistant and more and more questions are starting to pop up. I cannot find the answers in the docs, so I hope you can help me.

From my understanding there is one central database, homeassistant_database_v2, which is used by all built-in integrations to store data.
What is still unclear to me is how this database is being accessed.

Recorder records all states etc. of all domains, entities etc.
You can adjust this by setting the include and exclude filters.

Logbook logs all info and error messages in Home Assistant from both built-in and external sources. I am not sure how that differs from Recorder, but it is probably the type of information (error messages instead of states etc.).
Again include and exclude allow filtering.
BUT, if I exclude a domain in Recorder, this will not exclude the same domain from Logbook, correct? Because the docs say that Logbook uses Recorder to write to the database.
Same question for History. Does filtering in Recorder impact data from History?

Also, say I wanted to disable the Recorder. Is that even possible? And if it is, would History and Logbook stop working because they use Recorder?

I hope someone can help me with understanding this. Because I would have thought they each write to the database themselves, but docs suggest they all use Recorder for database access.

Also, are there any other built-in default integrations or core components writing to the database?

Thank you for your help :slight_smile:
Alex

P.S.: Is it actually possible to disable a specific default integration? default_config: will load all of the above and I can include/exclude, but the docs do not specify an “unload” command. So is there something like

history:
 - disable

?

All integrations and devices store their states here, not just core integrations.

Correct.

No. The system log is where errors and warnings are written (config/home-assistant.log)
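
For reference, what ends up in that file is controlled by the logger integration, not by Recorder. A minimal sketch, with the levels picked purely for illustration:

```yaml
# configuration.yaml
logger:
  default: warning    # messages at warning level and above are written to home-assistant.log
  logs:
    # Per-component override; the component chosen here is only an example.
    homeassistant.components.recorder: debug
```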

The Logbook integration displays state changes read from the recorder database. So does the History integration, but in a different way (graphs rather than text).

Yes but no.

If you exclude an entity or domain from the recorder, it won't be excluded from the Logbook or History integrations themselves, but once the recorder purge time (purge_keep_days) passes there will be no data for the Logbook or History to read from the recorder database. So essentially yes, you have excluded them from these integrations too.

If you want to use History or Logbook for an item, it must exist in the Recorder database.
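
As a rough sketch of such an exclusion (the domain is only an example and the entity ID is hypothetical), the recorder filter could look like this:

```yaml
# configuration.yaml
recorder:
  exclude:
    domains:
      - automation                    # example domain to stop recording
    entities:
      - sensor.example_noisy_sensor   # hypothetical entity ID
```

New states for these items are then no longer written, so Logbook and History have nothing new to show for them once the older rows have been purged.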

Yes it is possible. Yes it would prevent them displaying anything.

All of them. Custom ones too, via the Recorder integration.

No it is not. It’s a package deal. See:
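
The usual workaround, sketched here under the assumption that you are happy to maintain the list yourself, is to remove default_config: and name the integrations you still want one by one. The list below is an illustrative, incomplete subset; the full set depends on your Home Assistant version:

```yaml
# configuration.yaml
# default_config:   # removed, so nothing is pulled in implicitly

# Illustrative subset of what default_config would otherwise load.
config:
energy:
logbook:
mobile_app:
ssdp:
sun:
zeroconf:
# history:          # simply not listed, so it is not loaded
```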

Thank you @tom_l :slight_smile:

That is very helpful. It does look like it will be far more maintenance than I had hoped given the approach, but it is certainly a good starting point.

One final clarification is needed though:

But does the include/exclude then affect those other integrations’ loggings also?
So can I disable all logging related to entity x from any integration by simply excluding entity x from recorder?

Correct.

Perfect.
I will make the adjustments in Recorder then.

I stumbled upon the purge setting and saw that the default is every 10 days. But that would mean that the database is emptied every 10 days. But obviously the data is not deleted after 10 days otherwise the Energy integration could not display monthly or yearly data.
So is the data moved to a different database table after 10 days or what is the point of this setting? Even if it is moved, the setting would not be a deletion process but only a move, hence not making a difference database-size-wise.

Honestly, the long-term statistics are a bit of a black box at this stage. There is not much info on how they work, but your states are definitely removed after the 10 days; just the statistics for those days remain.
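
For orientation, the two recorder options involved are shown below with what I understand to be their default values; writing them out explicitly changes nothing:

```yaml
# configuration.yaml
recorder:
  auto_purge: true      # automatic purge enabled (default)
  purge_keep_days: 10   # state history older than this many days is deleted (default)
```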

Little bits like this trickle through:

Okay, thank you. Maybe I will disable auto_purge once I have selected which entities to track. Then all data should be stored indefinitely.
If I select to include entities, then in theory defining domains is obsolete, correct?

If e.g. I define one entity to include and don’t include or exclude anything else, then no domain and no nothing will be recorded except that one entity?
Because entity is the lowest point in the hierarchy, so everything above it (like domains) should be automatically excluded.
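
A sketch of what I mean, with a hypothetical entity ID. As far as I read the documented filter rules, when only include is given, anything not listed is left out, so no domain entries should be needed:

```yaml
# configuration.yaml
recorder:
  include:
    entities:
      - sensor.example_energy   # hypothetical entity ID; nothing else is listed
```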

No. Do not do that.

If you want to store long term data use InfluxDB. You can be selective and only use includes. e.g.
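
As a rough illustration (not the poster's original example; host, database name and entity IDs are hypothetical), a selective InfluxDB setup might look like the following. Note that this filter belongs to the influxdb integration and is configured separately from the recorder filter:

```yaml
# configuration.yaml
influxdb:
  host: 192.168.1.10          # hypothetical host
  database: homeassistant     # hypothetical database name
  include:
    domains:
      - climate
    entity_globs:
      - sensor.*_energy
    entities:
      - sensor.example_power  # hypothetical entity ID
```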

Crash course. 100% pass rate. Takes 15 minutes… skipping the bit about being selective.

Okay, so I tried to take your advice and switched to InfluxDB and limited what the Recorder should be recording. However, when I look at the database, there are more entities than I included.

My settings:

logger:
  default: critical
  logs:
    homeassistant.core: fatal
recorder:
  commit_interval: 60
  include:
    entity_globs:
      - climate.*
      - sensor.*_energy
      - sensor.*_power
      - sensor.*_today
      - switch.*
influxdb:
  host: a0d7b954-influxdb
  port: 8086
  database: homeassistant
  username: !secret influxdb_homeassistant_user
  password: !secret influxdb_homeassistant_password
  max_retries: 3
  default_measurement: state

But I also see entities that e.g. match

sensor.*_overheating
sensor.*_overpowering

and the rpi_power_status, sun and updater.

What am I missing?

EDIT:
Also tried

recorder:
  commit_interval: 60
  include:
    domains:
      - climate
      - switch
    entity_globs:
      - sensor.*_energy
      - sensor.*_power
      - sensor.*_today

Anybody?
I tried deleting the database but that does not help. More than 40 entities are still being recorded and most of them should not be.
State is the statistic with the most entities.

Yeah I’m not getting this either, I just added a bunch of entities to recorder.exclude, but I still see history for them.

From my understanding history for those entities should stop being recorded the very second HA is restarted with the new config.

Correct, but the existing history is still present in the database

Bad wording on my part. Even though I do see the existing history, what I meant to say is that history is still actively being logged for these entities, even after restarting HA.