WTH is the data in .storage not in the recorder database?

I’d love to see some of the stuff that’s currently in .storage, like core.entity_registry and core.device_registry, in the database.

The database doesn’t necessarily need to be the source of truth (and I don’t think making it one would fit with how Home Assistant works anyway); this data could just be copied there whenever the files in .storage get updated. This is a pure quality of life improvement for people who write SQL queries or point something like Grafana at their Home Assistant DB.

The database only has state and event history. Config does not go in the db, and everything in .storage is config.

I’m not suggesting that config move to the database, just that some of it get copied there for convenience when writing queries. The configuration in .storage would still be the source of truth.

What exactly do you want copied into the database?

I guess we’ll see if this gets a lot of votes. I would be pretty surprised if a lot of people want this tbh. A more common problem people have is that their db is way too big. I find it hard to imagine that people want a whole ton of json dumped into it, including all their credentials.

For reference, if it doesn’t get added and you do want this, you can DIY it. You can use a command line sensor to pull whatever you want from this folder into the state machine, and then it’ll get recorded in the db.

Also, just an FYI: nothing in this folder has a stable, defined schema. It is only intended to be accessed and modified by HA code. So if HA completely changes the schema, or even switches from JSON to some other data format entirely during an upgrade, that is considered fully backwards compatible as long as all the integrations still work. There is zero requirement to keep the schema and structure the same.

Because of that, the only thing you could possibly get here is the raw data of this folder dumped into the db. There’s essentially zero chance the HA team is going to process the data in these files into rows and tables in a stable schema for easy data analysis, because no stable schema exists and that’s not an intended use case.

For my use-cases, I’d like to see:

From core.device_registry:

  • id
  • area_id
  • manufacturer
  • model
  • name
  • sw_version

From core.entity_registry:

  • device_id
  • area_id
  • entity_category
  • entity_id
  • original_device_class
  • name
  • platform
  • capabilities.state_class
  • unit_of_measurement

Are you familiar with the existing template functions that query various registry files?

For example, Devices

If they lack the functionality you require, then you have better odds posting a Feature Request to enhance them than requesting duplication of data to the database.

Yea you want a schema. That’s almost certainly not going to happen for reasons I said above. But you can definitely diy that using a command line sensor like I suggested above. Just be prepared to tweak it when it breaks without warning on upgrades.

I honestly don’t expect this to get a ton of votes - it’s a “WTH” for me, but I’m aware that this isn’t a normal use-case for the vast majority of Home Assistant users. If nothing happens with this I’ll probably DIY it as you suggest.

This probably wouldn’t be a huge addition to the database, especially if it’s only selected fields. My HA DB (PostgreSQL) is only 3GB (and I’m thinking of increasing my history retention) so my definition of large is obviously skewed compared to those using the default SQLite DB.

Hmm - I think there might be enough here to make it work through Node-RED… I can get all entities in Node-RED easily enough, then I’ve got the device_ids needed to get the device data.

This will probably end up being my solution.

OK, but those template functions don’t require Node-RED, so I assume you’re employing it for some other reason.

Yeah, I’ve already got it running and use it for most of my automations, so this is really a case of using a tool that’s already available.

If I wasn’t already using Node-RED I’d probably write a Python script to parse the JSON or go the command line sensor route.
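
A rough sketch of what such a Python script might look like. It assumes the registry file is JSON with the devices under `data.devices` (an observed layout, not a guaranteed one), and it uses SQLite from the standard library as a stand-in for whatever database is actually in use. The table name `device_registry_copy` and both function names are made up for illustration.

```python
import json
import sqlite3
from pathlib import Path

# Columns to copy; each is looked up by the same name in the device record.
COLUMNS = ("id", "area_id", "manufacturer", "model", "name", "sw_version")


def load_devices(path: str) -> list:
    """Read a registry-style JSON file and return its list of device records."""
    doc = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumed layout: {"data": {"devices": [...]}}; fall back to an empty list.
    return doc.get("data", {}).get("devices", [])


def copy_devices(conn: sqlite3.Connection, devices: list) -> int:
    """Upsert the selected device fields into a side table; return rows written."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS device_registry_copy ("
        "id TEXT PRIMARY KEY, area_id TEXT, manufacturer TEXT, "
        "model TEXT, name TEXT, sw_version TEXT)"
    )
    rows = [tuple(device.get(column) for column in COLUMNS) for device in devices]
    # INSERT OR REPLACE is SQLite syntax; other databases spell upserts differently.
    conn.executemany(
        "INSERT OR REPLACE INTO device_registry_copy VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)
```

Running it on a schedule would look something like `copy_devices(sqlite3.connect("copy.db"), load_devices(".storage/core.device_registry"))`, with both paths adjusted to the actual install. Because the rows are keyed on the device id, re-running it updates existing rows instead of duplicating them.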

Honestly given that Home Assistant has a data science portal that documents how to access data from the recorder for analysis, it does seem weird that the data source used by such tools lacks such useful information as is present in the device and entity registries.

At the very minimum, a table is desperately needed that provides information about the mapping between entity unique_ids (or more accurately the platform, domain, unique_id triple) and entity ids over time. Otherwise, if a user changes the entity id (as is fully supported in the UI), there is literally no way to automatically determine that state rows with the old entity id and ones with the new one are connected (except possibly for the tiny minority of entities that expose their unique_id as an attribute, typically named id).
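
To make the idea concrete, here is a hedged sketch of such a mapping table, again using SQLite from Python's standard library. Nothing like this exists in the recorder today; the table name `entity_id_history`, the function names, and the assumption that each registry entry carries `platform`, `unique_id` and `entity_id` keys (with the domain taken from the part of the entity id before the dot) are all illustrative.

```python
import sqlite3


def init(conn: sqlite3.Connection) -> None:
    """Create the hypothetical history table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entity_id_history ("
        "platform TEXT NOT NULL, domain TEXT NOT NULL, unique_id TEXT NOT NULL, "
        "entity_id TEXT NOT NULL, first_seen TEXT NOT NULL)"
    )


def record_snapshot(conn: sqlite3.Connection, entities: list, seen_at: str) -> int:
    """Add a row for every entity whose current id differs from its last recorded id."""
    added = 0
    for entity in entities:
        domain = entity["entity_id"].split(".", 1)[0]
        key = (entity["platform"], domain, entity["unique_id"])
        latest = conn.execute(
            "SELECT entity_id FROM entity_id_history "
            "WHERE platform = ? AND domain = ? AND unique_id = ? "
            "ORDER BY first_seen DESC, rowid DESC LIMIT 1",
            key,
        ).fetchone()
        if latest is None or latest[0] != entity["entity_id"]:
            conn.execute(
                "INSERT INTO entity_id_history VALUES (?, ?, ?, ?, ?)",
                (*key, entity["entity_id"], seen_at),
            )
            added += 1
    conn.commit()
    return added


def linked_entity_ids(conn: sqlite3.Connection, entity_id: str) -> list:
    """Return every entity id ever recorded for the same platform/domain/unique_id."""
    rows = conn.execute(
        "SELECT DISTINCT h2.entity_id FROM entity_id_history h1 "
        "JOIN entity_id_history h2 ON h1.platform = h2.platform "
        "AND h1.domain = h2.domain AND h1.unique_id = h2.unique_id "
        "WHERE h1.entity_id = ? ORDER BY h2.entity_id",
        (entity_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

With snapshots taken before and after a rename, `linked_entity_ids` gives the set of ids to use in a `WHERE entity_id IN (...)` clause, so old and new state rows can be queried together. It only knows about renames that happened after snapshots started.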

Even if one argues that for entity and device registry information the correct thing to do is to query it via a Home Assistant API (there is a websocket API for device and entity registry data), that won’t help, as Home Assistant simply does not store a history of past entity ids used by an entity in the registry.
