Recently I’ve observed a massive slowdown of my HA Supervised instance. A little bit of poking around revealed my SQLite database takes up … yikes … 5.5 GB! That’s right.
So naturally I sat down to tweak the recorder settings, started dabbling with migrating some of my states data to Influx (well, I’ve barely started that) and also spent some time analysing the worst DB offenders (it turned out my Zehnder recuperator was flooding my database).
Lesson learned, I will pay more attention to the DB next time.
After purging stuff, I was barely able to trim it down by 1 GB; I’m currently running on a 4.4 GB DB. That still, to my taste, heavily impacts HA system performance.
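For the record, the kind of recorder tweak I’ve been trying looks roughly like this (the glob and entity names below are just placeholders for my noisy sensors, not a recommendation):

```yaml
# configuration.yaml - rough sketch of the recorder filtering I've been experimenting with
recorder:
  purge_keep_days: 14          # keep two weeks of history instead of letting everything pile up
  exclude:
    entity_globs:
      - sensor.zehnder_*       # placeholder glob for the recuperator sensors that flooded the DB
    entities:
      - sensor.date_time       # other individually noisy entities
```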
Now for the question:
Before I nuke my DB or dive deep into low-level SQL hackery and data migration, is there anything else I might do to forcefully trim the DB’s size? Other than stopping HA → and then manually removing records?
Another question - would it be possible to nuke the DB, start anew, and then import a subset of the states/events tables? Or is that going to break things?
5.5 GB… that ain’t ‘enormous’; I just checked mine and it is currently 8.3 GB, and that is only the last 100 days (before it starts purging).
Not that I care; I use a MySQL database hosted elsewhere, so it is of no consequence to HA’s performance in any way.
Actually I run it exclusively on an old HP mini-PC (6 GB RAM, 4 vCPUs @ 1.5 GHz, M.2 SATA drive), which isn’t bad. I figured that since SQLite is officially supported and everybody recommends sticking with it because the HA team does performance optimization magic there, I wouldn’t gain anything from moving. Wondering if MariaDB could help a bit.
If you don’t offload the database engine (regardless of which one) to another machine, you won’t notice any difference at all, as all data will still be processed by the same machine.
That said, your hardware should be fast enough to run it on the host.
I’m also pretty sure it must be something else that affects HA’s performance.
Since HA is most of the time just writing to the database (which doesn’t take a lot of CPU), it shouldn’t slow down anything.
Having a poor CPU/RAM will affect looking up (retrieving) data from the database, and hence slow down performance. So unless you are querying your statistics all the time, you shouldn’t have performance issues.
However, you will notice that making a backup takes longer.
It takes a really long time to retrieve Logbook entries (history), hence why I blamed the DB.
I have more than 160 automations (a single automations.yaml file, more than 9k LOC - I refuse to split it into sub-files). It takes a really, awfully long time to open one for editing in my mobile companion app (the automation GUI opens up after 3-4 minutes).
Listing blueprints (which I have around 20 loaded) also takes like a minute on the web app
Automations trigger pretty late, like the light ones using the Sensor Light blueprint and some Zigbee motion sensors; again, I think the DB could be blamed here.
Developer Tools also became subjectively clunky
The general experience became clunky as well; it’s hard to put it in better words.
So I wonder, if it’s not the DB, is there anything else worth triaging? I’ve been sifting through the files and (apart from a big .storage/trace.saved_traces log) couldn’t find anything out of the ordinary. My Proxmox host doesn’t come with a swap file (btrfs is hard to work with…), but that shouldn’t be it since my RAM consumption is stable at around 40%.
Finally, a few days back - before I updated to 2024.2 - I was pretty happy with the performance & experience.
My db is 1 GB in size and hosted locally in the HA vm which has 8 vCPUs @ 2.9 GHz and 8 GB RAM. So far I’ve never noticed any slowdown whatsoever. CPU usage is mostly below 1% and RAM, 20%. Most likely the vm is overpowered but since the server has plenty of resources, why not.
You don’t say if it’s actually slow, or whether you just assume it’s slow because it’s big. Although SQLite isn’t particularly fast, size is just one factor in database performance. You might want to try compacting the database first - it’s easy and can sometimes make a big difference.
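With HA’s SQLite database, the easiest way to compact it without touching SQL directly is, as far as I know, the recorder’s own purge service with repack enabled - something along these lines (keep_days is just an example value):

```yaml
# Developer Tools -> Services (or an automation action) - hedged sketch
service: recorder.purge
data:
  keep_days: 30       # drop states/events older than this
  repack: true        # rewrite the database file so the freed space is actually returned to disk
  apply_filter: true  # also drop rows for entities you've since excluded from the recorder
```

Repacking is essentially a VACUUM run for you, so expect it to take a while and to need free disk space roughly the size of the DB.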
But, all other things being equal, MariaDB is probably going to be faster - it certainly scales better with size, up to a point (I maintain hundreds of MariaDB databases for work, some over 5 TB) - but all performance is based on a mixture of hardware, load and type of load. Either way, I’d suggest it’s a better choice than SQLite, not least because you can configure MariaDB much more. It also opens up a lot of stuff like Grafana graphs and sending data into the DB from other sources (I know you can use webhooks and other methods, but I have several databases I populate from other things that I wanted to import into HA). And, if you are very concerned about space, you can use a compressed storage engine like MyRocks instead of the default InnoDB, but do some checking first.
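If you do go that way, the switch on the HA side is just pointing the recorder at the new server - roughly like this (host, credentials and database name are obviously placeholders):

```yaml
# configuration.yaml - recorder pointed at an external MariaDB/MySQL server (placeholder credentials)
recorder:
  db_url: mysql://hauser:supersecret@192.168.1.10:3306/homeassistant?charset=utf8mb4
```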
Interestingly, my HA (using MariaDB) is only 2.2 GB, despite being quite a large installation. I think I set retention at 365 days.
I’m running my HA as a VM on an ESXi server with 4 CPU cores and 5 GB of RAM. Peak CPU utilization is ~20%, idling around 2%. RAM usage sits at ~40%. I use MariaDB on a Synology NAS connected via a 1 Gbps link, and the database is ~6 GB after careful tuning to store only data I use in some graphs or for statistics.
When, a few versions ago, HA started to require a newer version of MariaDB than the official Synology package provides, I decided to migrate to a Docker instance, which offers more flexibility with version selection. My big learning from this exercise was the choice of disks the DB data files are installed on. For comparison: displaying 1 month of data in the history dashboard for ~80 entities took ~30 s when the database was on an SSD and around 5 minutes (!) when on regular HDDs. Now, running on an SSD, a normal 1-day view takes ~1 s (regardless of whether it’s a LAN-connected desktop or a WiFi-connected mobile).
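The Docker side of it is nothing fancy; a minimal sketch looks more or less like this (paths and passwords are placeholders, not my exact file - the important bit is that the data volume lives on the SSD):

```yaml
# docker-compose.yml - minimal MariaDB sketch with placeholder credentials
services:
  mariadb:
    image: mariadb:11
    restart: unless-stopped
    environment:
      MYSQL_ROOT_PASSWORD: change_me
      MYSQL_DATABASE: homeassistant
      MYSQL_USER: hauser
      MYSQL_PASSWORD: change_me_too
    volumes:
      - /volume1/docker/mariadb:/var/lib/mysql   # keep this on the SSD volume, not the HDDs
    ports:
      - "3306:3306"
```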
Side note: for all my VMs I’m using an iSCSI-connected datastore from the same NAS and SSD volume. I do not experience any performance issues, even though my instance is pretty large (2600+ entities, ~270 automations). My configuration is actually split into ~40 .yaml files (I do split into package files, mainly by area/functionality, to keep related configuration items together) and it has no visible impact on performance.
Log files are separate from the DB, so this is not an issue.
Well, I use VSCode to edit files directly in the config folder, so I cannot say, but perhaps the fact that you do not split your configuration makes it take so long to read.
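If you ever change your mind about splitting, the include directives make it fairly painless - a minimal sketch, with directory names that are just examples:

```yaml
# configuration.yaml - splitting automations and packages via includes (example paths)
automation: !include_dir_merge_list automations/   # every .yaml file in automations/ is merged into one list
homeassistant:
  packages: !include_dir_named packages/           # each file in packages/ becomes its own package
```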
I do not think so. To my knowledge these are read into RAM when the system starts, and the speed of the database has nothing to do with it, unless you somehow reach heavily into history. Though this should rather be the case for UI elements, like charts.
Here I agree, it is. For my instance, loading the full DevTools → States dashboard takes ~12 seconds until it becomes responsive. I’m not sure where this data is read from, but again it’s either memory or configuration files, not the DB (states and attributes are taken from the state machine rather than directly from the DB).
So frankly speaking, I’d first take a look at your VM configuration: what disks it is running on (this might have the single biggest effect on performance), whether it has enough RAM (seems to be the case), and what the CPU load is (you did not mention this).