I’m noticing increasing sluggishness of my Home Assistant install. Part of that was caused by my Zigbee network. After making significant improvements there, the Zigbee network is now stable and reliable. However, the sluggishness is still there. It’s not always the same. Sometimes I still enjoy instant reactions when I trigger a sensor. But often, there is a noticeable delay.
When searching this forum and the wider internet, I found several articles by people who migrated from the built-in default SQLite database to MariaDB. Since I’m a software developer and familiar with relational databases, I’m considering doing something similar. But first, I’d like to gather some more recent information. A lot of information on this topic is from 2022 or 2023. The official HA documentation states:
Warning
SQLite is the most tested, and newer versions of Home Assistant are highly optimized to perform well when using SQLite.
When choosing another option, you should be comfortable in the role of the database administrator, including making backups of the external database.
So, it could be that the situation has improved in favor of SQLite in 2024/25. Is that the case? Can someone speak to that?
Also, I have more experience with PostgreSQL than with MySQL/MariaDB. As I read the official docs and e.g. this post, it looks like it is viable to use PostgreSQL as well, instead of MariaDB. Does anyone have experience with that? Isn’t PostgreSQL too heavy a service to run alongside HA on the same machine? (I’m running HA on an ODROID-N2 with 4GB of RAM.)
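From what I gather, the switch itself would mostly come down to pointing the recorder’s db_url at the PostgreSQL server, roughly like this (the host, user, password and database name below are just placeholders, not a tested setup):

```yaml
# configuration.yaml — placeholder credentials, not a tested setup
recorder:
  db_url: postgresql://hass_user:hass_password@localhost/homeassistant
```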
Are there any other things I could do to improve performance of HA? Any recommendations for tooling to investigate performance issues? Or generic strategies to improve it?
I realize my question is a bit broad. I’m not looking for definitive answers per se, but I’d like to hear some recent experiences, given that most information I could find is ~2-3 years old and a lot can change in such a time frame.
In the cookbook there are sections about migrating back from MariaDB or PostgreSQL to the built-in SQLite. One of those mentions:
With all the performance improvements to Home Assistant over the last year, the benefits of using MariaDB have somewhat declined.
So that answers my question whether migrating will still yield performance gains in 2025. It seems it won’t, so I’ll stay with SQLite for now.
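For completeness: even while staying on SQLite, the recorder can still be tuned a bit to keep the database small and reduce write load. Something along these lines (the values are just an example, not a recommendation):

```yaml
# configuration.yaml — example values only
recorder:
  purge_keep_days: 7    # default is 10 days of history
  commit_interval: 5    # seconds between commits; higher values batch writes (default is 1)
```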
I enabled debug mode, as described in the Tracking down instability issues… post you linked to. I found one custom integration that was logging a lot of errors. Since I did not use it anymore, the solution was simple: remove the integration. I’m not sure whether that contributed to the sluggishness; I’ll have to see whether the situation improves over the next couple of days. Other than that, there were some errors that don’t seem too serious and that are not logged frequently. I don’t think they will have an impact on performance.
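For reference, turning on debug logging came down to a logger entry roughly like this (the integration name below is only an example, not the one I removed):

```yaml
# configuration.yaml — integration name is a placeholder
logger:
  default: warning
  logs:
    custom_components.example_integration: debug
```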
One of the suggestions in that same post was to install the profiler. I did that, let it run for a minute, downloaded the resulting file and opened it in QCacheGrind. As a software developer, I do have some experience with profiling tools, so an image like this doesn’t intimidate me too much:
(It did make me wonder whether it’s a good idea to recommend this to random users, but that aside.) The result doesn’t look concerning. I’m assuming the 99.66% and 88.70% functions are part of some “main loop”, so it makes sense they’re called so often. But it’s hard to interpret without knowing the code.
So, I’ll monitor the system over the next couple of days. Maybe that custom integration I removed was the culprit. In the meantime, if someone could confirm from the screenshot that this is a normal-looking profile, that would be nice.
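For anyone wanting to try the same thing: after adding the Profiler integration, starting a run is a single service call. This is roughly what I used (60 seconds, matching the minute mentioned above):

```yaml
# Developer Tools → Services
service: profiler.start
data:
  seconds: 60
```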
I can’t speak to the profiler output (hopefully someone else can) but I would also check the Cookbook Zigbee section to ensure you are getting the best performance from your network.
Do you have many add-ons? Depending on your hardware, they can have quite an impact. Studio Code Server in particular seems to use a massive amount of memory, and doesn’t release it even when not in use.
I also recently started systematically removing cloud integrations - that improved performance considerably.
@MaxK I’ve already done a lot to improve my Zigbee network, and that was quite successful. (See e.g. this thread to get an idea.) I skimmed the Zigbee section of the cookbook, and didn’t find anything that I didn’t know already. I think my Zigbee network is in a pretty good state, after:
Switching to Zigbee2MQTT (from ZHA)
Switching to a standalone SLZB-06 adapter (from a Sonoff Dongle connected via USB to my HA machine)
Making sure no router device can be turned off accidentally, since that would cause network instability.
Looking at the CPU/memory stats, the memory usage seems to float around 50% all the time, and the CPU usage is usually way below 10%, with some spikes here and there. That seems healthy to me. I could consider removing Studio Code, although I don’t like the “File editor” as an alternative. Are there any other alternatives that you’re aware of?
No, I’ve been looking. I was advised by @tom_l to turn it off when not in use.
I have a similar number of add-ons. I find memory usage is around 15% without Studio Code Server. With Studio Code Server I’ve seen it creep up to 70%. It will depend on the hardware, of course - my processor is only an i3, with 8GB of RAM.
What’s the hardware you are on? And what exactly are the delays like? And I assume you are talking about completely local entities, so it’s not about network (just to double check, even on good days the cloud things are orders of magnitude slower).
With 10% CPU and half the RAM free I don’t see why your HA server would struggle with something as simple as an entity state change. But people have different expectations, and what one person perceives as instant, another sees as laggy. And of course maybe it’s not 10% at the time when you notice the issues.
I’m running HA on an ODROID N2 with 4GB of RAM. When I bought that hardware, the common wisdom was that this should be more than enough.
As said, I still sometimes get an instant reaction, but often there’s a delay. If there is a delay, it’s usually around 1 second, sometimes a couple of seconds.
Me neither.
Good one. Next time I experience a delay, I’ll see if I can see a spike in the resource usage.
Another potential bottleneck is your WiFi environment. Too many devices/sensors hitting a low-quality ISP-provided router can become a problem. Upgrading to Ubiquiti, Omada, or similar hardware can make a big difference if you’re hitting that issue.
In regards to the database, I would not switch away from SQLite unless you have a need to access the database remotely at run time. It’s almost always more trouble than it’s worth, and users of external database engines generate orders of magnitude more issue reports than users of the built-in one. Optimizations for corner cases using external databases tend to lag behind by about two years, as they generally rely on users providing us test cases (i.e. entire database dumps and query log captures). Since we as developers are almost all using SQLite for all of our production, it can take even longer to find out about some query that doesn’t optimize as well at scale for an external database we don’t use daily.
If you do pick an external database, rest assured that if you keep the purge interval at the defaults you are unlikely to have any scalability issues, and even then it’s usually only the extreme cases where the purge interval has been changed from days to months that tend to have issues.
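And if database size ever does become a concern with the built-in database, a manual purge with a repack reclaims disk space without switching engines; roughly:

```yaml
# Developer Tools → Services — rewrites the database file to reclaim space
service: recorder.purge
data:
  keep_days: 10
  repack: true
```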
I am interested in the original topic. I also occasionally see propagation delays where a switch takes a second or two for the associated device to respond. Sometimes longer, as if the repeated commands are queued and then suddenly execute all at once. But I think all the talk about the database is not germane. My database is remarkably small because I exclude most of my entities. Besides, why does it matter how large the database is if you are only adding one more data point?
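For context, excluding things from the recorder is just configuration; mine looks roughly like this (the domains and entity names below are examples, not my actual list):

```yaml
# configuration.yaml — domains/entities shown are examples only
recorder:
  exclude:
    domains:
      - media_player
    entity_globs:
      - sensor.weather_*
    entities:
      - sun.sun
```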
I have never seen anyone report that they have solved anything by using a different database program. So, what else can cause these delays?
@bdraco That’s a great offer. I can’t upload the file here (only images are allowed, so it seems) but I’ve put it on my webhosting. You can download it here.
Great to get some advice about the database from a developer. I already decided to stick with the built-in SQLite, but your information is helpful in making that decision more firm. Thanks!
And a short update regarding CPU/memory performance. I’ve been monitoring that since yesterday, and so far:
Overall, the CPU usage floats around 5%, with occasional spikes to 20%. So plenty of breathing room there.
Memory usage is around 50% all the time. So, also enough breathing room, I’d say.
Indeed the Studio Code addon is by far the most resource-hungry. That said: memory usage is stable at ~25% all the time. CPU usage is around ~3% with spikes to ~6%. Nothing too concerning, but I could still consider turning it off when I’m not using it. That will at least save some energy.
dsmr_parser is the top hitter in the profile. It looks like it could be optimized quite a bit more. It’s followed by some XML parsing overhead (not immediately clear where it’s coming from), and finally segno.