General questions about recorder and states database table in my HA setup (large "count" devices)

Hello all,

Just looking for a little guidance/ best practices thoughts on the below (this is all entities with a count above 1000 (I have had HA running for around 1-2 months, I don’t know if that matters)

The question I have is based on the table at the bottom of this post which was derived by this sql query using the sqlite web add on within home assistant.

How do I interpret this? What does it mean? Are these devices “polling” which is what is causing their counts to increase so greatly? (I don’t think this is the case, I thought this table is essentially number of times an entity has changed states-- or an attribute related to them has changed) Is it in my best interest to add all of these to my recorder “exclude entity” area of my confiuration.yaml recorder key?

What are the implications if I add all of these to the exclude area of the recorder? Will they still be available to view their state if I put them on a lovelace dashboard. (I’m thinking this only affects them showing up within the logbook and history based on what I’ve read)

With “person.clarice” or “device_tracker.clarice”, this is an android phone and I don’t see my iphone (iOS device only had 220 count in same time period) on the list which I felt was strange. Are those devices just that hugely different with how they interact with HA (we have the app installed on both of our phones)

An additional question with person.clarice, is there a way to reduce the number of times this is showing up in the count? This may not relate directly to the recorder and may be more specific to the device setup. I just find if strange that I was at 220 and she was at 2000 (do I need to look into something on this seperately from the recorder?)

And if I put the person entity in the exclude of the recorder, I then won’t see anyhing historywise related to this entity when going to both the logbook and history of home assistant? Correct?-- If so, I would really appreciate some thoughts on how to get that down, but if it is just state change, I don’t even see how that can be fixed. I could understand if it were “pinging” for location that I could reduce the number of pings and it could possibly fix it------> Or is this possibly exactly the problem, its “pinging” too much due to the location settings in the android phone?

Last question would be with “media_player.fpp”, that number is crazy. Is there a way to reduce the amount of “polling” but keep it in the history. That is one I prefer not to lose from the history but that number seems like a nightmare for my SD card in the short 1-2 months I have had home assistant installed. (Any other options for this? And now that I’m thinking about it, I guess the “commit interval” is helping with this a bit)

The github project is here for this device (not sure if this would help someone to help me determine the answer to the above regarding the fpp media player)

Thank you all ahead of time for the conversation on this one, I know it is all over the place, but I just found this interesting and would love here others guidance on each of these situations

SQL Query used

SELECT 
    entity_id,
    COUNT(*) AS cnt
FROM states
GROUP BY
    entity_id
ORDER BY
    COUNT(*) DESC;

Current recorder.yaml (none of the offenders added just yet)

#Essentially Data For History
  commit_interval: 30
  purge_keep_days: 5
  exclude:
    domains:
      #      - automation
      - weblink
      - updater
      - script
    entities:
      - sun.sun # Don't record sun data
      - sensor.last_boot # Comes from 'systemmonitor' sensor platform
      - sensor.date
      - sensor.icloud3_event_log
      - sensor.time
      - sensor.today_is
      - sensor.time_utc
      - sensor.time_date
      - sensor.time_online
      - sensor.moon
      - sensor.month_is
      - sensor.low_battery

States table export >1000

entity_id	cnt
media_player.fpp	46078
sensor.packets_in_eth0	30723
sensor.packets_out_eth0	30723
sensor.network_in_eth0	30589
sensor.memory_free	30530
sensor.memory_use	30530
sensor.processor_use	30117
sensor.load_1m	30093
sensor.network_throughput_in_eth0	29899
sensor.network_throughput_out_eth0	28995
sensor.load_5m	28016
sensor.memory_use_percent	27228
sensor.processor_temperature	26389
sensor.load_15m	22408
sensor.network_out_eth0	18758
sensor.washer_power_consumed_w	12420
sensor.date_time	7707
sensor.washing_machine_electric_consumption_w	7252
sensor.internet_time	5362
sensor.watchman_last_updated	2116
device_tracker.clarice	2034
sensor.watchman_missing_entities	2021
person.clarice	1843
sensor.watchman_missing_services	1755
camera.first_floor_vacuum	1614
camera.second_floor_vacuum	1588
sensor.washing_machine_electric_consumption_a	1568
sensor.washing_machine_electric_consumption_v	1568
sensor.swap_free	1420
sensor.swap_use	1420
sensor.activewindowsensor	1350
sensor.washer_switch_electric_consumed_kwh	1167
media_player.living_room_media_player	1063
sensor.swap_use_percent	1051

No, polling is a mechanism for an integration to keep up to date with an external device/service. But even for integrations that do polling every poll does not make a row in the table. A new row is made whenever the state changes (state includes attributes).

Yes. You can still view their current state, interact with them normally (in the case of the media player), trigger off their state in automations and things, etc. You’ll lose all history though. There will be no graph or log of state changes over time in more info, they won’t appear in history or logbook, etc. Current state only.

This isn’t actually that weird. ios is way more restrictive about how often apps can get an updated location. You can see what I mean in here. Notice the note that even if you try and force location updates faster apple just shuts that down.

Android on the other hand gets very frequent and accurate updates in my experience. This isn’t to say its more draining on your battery its just constantly piggybacking on every other location request made by apps and the system and getting those updates as well.

If you are seeing something behaving wrong though I would recommend asking about it in the ios/android channels of the HA discord. Or if you’re certain its a bug then submit an issue to either the android or ios repo.

Beyond the commit interval without modifying the integration? Not really. Like I said earlier I don’t believe it is actually related to the mechanism of receiving updates (polling) but rather what is actually changing. What does that entity look like? Does it have a lot of attributes? Or worse, a timestamp attribute or some attribute that updates constantly? Those are particularly problematic.

If you can identify the source of all those changes you can submit a PR or issue to that repo to get it cleaned up. That does sound quite high.

Thank you for your template, exactly what I was looking for. Here is an updated SQL query working with newer HA

SELECT 
    states_meta.entity_id, states.metadata_id,
    COUNT(*) AS cnt
FROM states
LEFT JOIN states_meta ON states.metadata_id=states_meta.metadata_id
GROUP BY
    states.metadata_id
ORDER BY
    COUNT(*) DESC;

I’m sorry for such a late reply on this one. Life has been life for the last little bit. Thank you for being so detailed in your response here where I and others can learn from your knowledge.

I follow everything you are saying in your replies. Thank you again so much!!

Thank you for the new/ updated SQL query. Very useful!!