(SOLVED) Performance of history data is extremely slow

I see the same

So do I. History has been borked for me since 0.80 I think…

1 Like

My History hasn’t worked for a while (a few months?) and I’ve been watching the community for possible reasons. I’ve tried excluding some domains and un-needed sensors. I’ve set my database to purge every day, only keeping a couple days worth of data. Nothing seemed to help. Well, this morning I decided to just not bother using the History functionality, so I commented-out my history:, and recorder: entries from the configuration.yaml. To my surprise, after I rebooted, History was still showing up on the sidebar. And, even more surprisingly, it worked! However, the data has appeared to reset.

I’m running HASS.IO on an Intel NUC (obtained when an image of that was still available), on version 0.82.1, and have MariaDB add-on enabled. And I use Chrome browser on my desktop.

Does History load by default for HASS.IO? I don’t understand why it’s there, and working, when I removed it from my configuration. I do still have History_graph enabled, so perhaps that forces history and recorder to load?

Of course, I don’t know if it will keep working, or if it will perhaps have problems again as more data is collected. I guess I’ll just watch it over the next few days.

Perhaps someone can make sense of this scenario. And hopefully it helps someone else who is having trouble with History loading in their browser.

history_graph almost certainly marks history, which itself marks recorder as dependencies which will cause them all to be loaded. And because you’re not configuring recorder yourself, it’ll default to using the SQLite database and not touch MariaDB at all, which is why your data was reset.

I see! Thank you for clarifying. I wasn’t aware that one component could call on another component as a dependent (makes sense, of course).

I’ll let it run as is to see if the problem returns when more data is accumulated.

also here same issues. Running HA with MariaDB (both dockered) on my RPi3B+. Instead of breaking my browser though the history tab breaks my RPi. I need to unplug it because it becomes unresponsive and I cannot even SSH into my device after looking at the history tab. Don’t try to see three days of history!

Actually the whole history tab feels like an unfinished and not well designed product that should be disabled by default. It’s currently a completely unworkable page that tries to put all the states of all your sensors, switches, lights and other entities for one day on one single page in one go? Why?
It’s fine and great that we can see history per entity/device but this… And then still… In Domoticz (competing? platform) I could easily record history for a complete year for devices… Without a seperate (proper) db like mssql/mariadb that would be impossible for HA it seems.

Don’t get me wrong, I recently moved from Domoticz to HA because it’s sooo much better. The community is great and support for devices is amazing. I really love it but regarding keeping history HA could still learn something from Domoticz. The lack of a proper history for at least some of my sensors like energy meters is my greatest concern for HA.

I tried setting up influxdb as well but my average load on my pi was (before I moved to mariadb) between 4 and 5, and now with mariadb it is still passing the 1 at times. So I either need to move to stronger hardware or setup influxdb and mariadb on a separate pi?

I’m late to the party but did you PR these query optimisations to HA on Github?

If you mean the original query performance fixes I made, then yes. It’s linked at the very top of the first post in this thread: https://github.com/home-assistant/home-assistant/pull/8748

I am very impressed, you are a db expert. Thanks for contributing! :slight_smile:

So what if you’re running on the latest version and still have horrible performance?

I’m on a fairly powerful machine, i5 with 8GB RAM and HASS in a docker connected to MySQL on the same machine (not dockered). Everything else on the machine is flying, but history graphs on HASS are terribly slow. Have to wait quite some seconds (like 15-25) for it to load. The above indexes are already applied in the recent versions since it has been merged for a while now right?

1 Like

You’ll have to wait for someone that wants to take on overhauling the history page to dynamically load content as you scroll, or something similar.

I’ve fixed the database engine problems but that was just to fix the query bottleneck. Now the bottleneck is elsewhere and harder to fix unfortunately.

2 Likes

I made the same move and I have the same concern regarding history graphs. I was using Domoticz too and I suspect it to store values with different granularity like a kinda “round robin database”.
Domoticz stores not only isolated values but also calculate each time the average value for each day, which makes it so fast to display graphs on a long period.
I had to deal with Domoticz database to remove invalid values and I have to recalculate daily average values as well in order to fix the graphs.

stll think its pretty big, but it does sit around that value i think. Just annoying with hassio backups, consume a lot (well its still tiny, but he)

Mine was hovering around 4.5 - 5 GB for only 7 days worth of data. I’ve since trimmed out pointless sensors from the recorder and have halved that, likely trim a few more when I get a chance.

This took me from GB to MB in size and speeded up the history.

recorder:
  db_url:   mysql+pymysql://hass:[email protected]/HOMEBASE
  exclude:
    event_types:
      - service_removed
      - service_executed
      - platform_discovered
      - homeassistant_start
      - homeassistant_stop      
      - feedreader
      - service_registered
      - call_service
      - component_loaded
      - logbook_entry
      - system_log_event
      - automation_triggered
      - script_started
      - timer_out_of_sync

2 Likes

Hey guys, I am experiencing a similar problem. The history and logbook tabs take forever to load and show the data. The problem lies within extracting the states, see logs below (20-40s!!)

2020-02-09 20:13:18 DEBUG (SyncWorker_14) [homeassistant.components.recorder.util] converting 1 rows to native objects took 0.078752s
2020-02-09 20:13:18 DEBUG (SyncWorker_14) [homeassistant.components.history] getting 1 first datapoints took 0.174754s
2020-02-09 20:13:19 DEBUG (MainThread) [homeassistant.components.history] Extracted 5756 states in 18.286318s
2020-02-09 20:13:19 DEBUG (SyncWorker_17) [homeassistant.components.recorder.util] converting 5760 rows to native objects took 18.791389s
2020-02-09 20:13:19 DEBUG (SyncWorker_17) [homeassistant.components.history] get_significant_states took 18.923634s
2020-02-09 20:13:19 DEBUG (SyncWorker_17) [homeassistant.components.recorder.util] converting 1 rows to native objects took 0.067947s
2020-02-09 20:13:19 DEBUG (SyncWorker_17) [homeassistant.components.history] getting 1 first datapoints took 0.171320s
2020-02-09 20:13:20 DEBUG (MainThread) [homeassistant.components.history] Extracted 5761 states in 19.448704s
2020-02-09 20:13:49 DEBUG (SyncWorker_12) [homeassistant.components.recorder.util] converting 14473 rows to native objects took 20.091915s
2020-02-09 20:13:50 DEBUG (SyncWorker_12) [homeassistant.components.history] get_significant_states took 20.106214s
2020-02-09 20:13:50 DEBUG (SyncWorker_12) [homeassistant.components.recorder.util] converting 1 rows to native objects took 0.020867s
2020-02-09 20:13:50 DEBUG (SyncWorker_12) [homeassistant.components.history] getting 1 first datapoints took 0.058466s
2020-02-09 20:13:50 DEBUG (MainThread) [homeassistant.components.history] Extracted 14474 states in 20.440697s
2020-02-09 20:14:25 DEBUG (SyncWorker_15) [homeassistant.components.recorder.util] converting 25500 rows to native objects took 35.942505s
2020-02-09 20:14:25 DEBUG (SyncWorker_15) [homeassistant.components.history] get_significant_states took 36.045067s
2020-02-09 20:14:26 DEBUG (SyncWorker_15) [homeassistant.components.recorder.util] converting 7 rows to native objects took 1.125338s
2020-02-09 20:14:26 DEBUG (SyncWorker_15) [homeassistant.components.history] getting 7 first datapoints took 1.176136s
2020-02-09 20:14:27 DEBUG (MainThread) [homeassistant.components.history] Extracted 25507 states in 38.117463s

I have HA running on docker on a RPI 3+. Currently it has only 1 sensor that remotely connects and reports the temp/humidity ever minute. Not a whole lot of data and running a minimal setup that I’d greatly like to expand. However I would like to see the history and adding more sensors would only make it worse I suppose. Any suggestions of what I can improve to reduce extremely long loading times? I’d greatly appreciate it.

I used to have the same slowness on my RPI 3+ running off SSD but then I had a lot more sensors than you. Moving over to a RPI 4 and running off SSD via USB3 has made a big difference.

If you are running off SD card then maybe changing to a faster card would make a difference.

The SD card I’m using is a SanDisk 16gb class 10, with a reading speed of 100MB/s it’s not that slow. Fasted SD cards can currently reach 300 MB/s, extrapolating the loading times would still results in 10+ seconds of waiting for logbook/history to appear.
I’m hoping there’s another solution too strongly minimise the loading time.

This is the key to understanding where the problem is. 36 out of the 38 seconds is taken up in Python code, not database access! This means that the query is fine from a raw query performance perspective and the issue has nothing to do with read speed, SD cards, etc. This is entirely CPU-bound number-crunching work.

There’s definitely a need to rethink how history data is stored and subsequently queried as a huge number of those 25500 rows are not needed to actually render the page. But don’t go worrying about how to make your database faster, as that’s not where the problem lies.

1 Like

@OverloadUT There’s definitely a need to rethink how history data is stored and subsequently queried as a huge number of those 25500 rows are not needed to actually render the page. But don’t go worrying about how to make your database faster, as that’s not where the problem lies.

Thanks for this insight! As this is all quite new to me, could you point me in the right direction of rendering the page faster without loading all rows? And is there a general page that explains more about how the history data is stored and queried?