My HASSIO installation is hanging at over 100% CPU utilisation for the homeassistant container. It seems to be MQTT related, as stopping the broker add-on solves the problem. I’ve tried both add-ons with the same result, but I’m at a loss as this had been running rock-solid on my NUC. Any ideas?
Are you using the built-in MQTT broker or the Mosquitto Add-on?
Sorry, I was a little short on details. I’m using the Mosquitto add-on - I had been using the official one but I switched to the community one today to debug this. Either one gives the same results.
I have no MQTT entry in configuration.yaml, I just entered the details in Configuration - Integrations - MQTT.
One strange thing I see in the MQTT Server & Web client logs is this
1556460794: New client connected from 192.168.16.5 as UGl7^Skjtx?1hR7pD0l
xRRCxk<ZUTIqJgA3EF;xRVqIwBU<IiKwtQ3lQwsHqYp (c1, k60, u'mqtt').
1556461030: Client UGl7^Skjtx?1hR7pD0l
xRRCxk<ZUTIqJgA3EF;xRVqIwBU<IiKwtQ3lQwsHqYp disconnected.
1556461107: New connection from 172.30.32.1 on port 1883.
1556461107: New client connected from 172.30.32.1 as 4P7cmWdoFyjD0xWjcgd1tR (c1, k60, u'mqtt').
1556461107: Client 4P7cmWdoFyjD0xWjcgd1tR disconnected.
1556461183: New connection from 172.30.32.1 on port 1883.
1556461183: New client connected from 172.30.32.1 as _\EAVFWMUuRCM@0dcRW?\0\WXw]nTr=]XvisdqV?tDR\D<BVDnc4L0CZ8DVkDbSZ (c1, k60, u'mqtt').
New client connected from 192.168.16.5 as UGl7^Skjtx?1hR7pD0l
xRRCxk<Z UTIqJgA3EF;xRVqIwBU<IiKwtQ3lQwsHqYp
192.168.16.5 is my NUC but the client name is very strange.
Thanks.
I’ve done a little bit more debugging on this. By shutting down all the devices that were generating MQTT traffic, and then slowly bringing them back one at a time, I tracked it down to an old RPI 2 with an SDR.
I use this to monitor the Oil Tank level and the electricity consumption sensors on 433MHz. What was confusing was that it took about 10 minutes for homeassistant to drop back to 1 to 5% utilisation after disabling it, so I had missed it before.
Now, I just need to figure out why it’s doing this, but at least my HA is back up and responding.
Well the mystery deepens. I have a test homeassistant server, so I pointed the MQTT output of my Oil Tank and energy monitor to it and it works just fine! I brought over the sensor definitions as well, so I can see the sensor data:
Here’s the output from the sensor:
{"time" : "2019-04-28 19:48:31", "model" : "CurrentCost TX", "dev_id" : 77, "power0" : 1235, "power1" : 0, "power2" : 0}
{"time" : "2019-04-28 19:48:36", "model" : "CurrentCost TX", "dev_id" : 77, "power0" : 1273, "power1" : 0, "power2" : 0}
{"time" : "2019-04-28 19:48:42", "model" : "CurrentCost TX", "dev_id" : 77, "power0" : 1278, "power1" : 0, "power2" : 0}
What I’m looking for is power0:
- platform: mqtt
  state_topic: "rtl_433/rpi2_rtl_433/devices/CurrentCost_TX/power0"
  name: "Power Consumption"
  unit_of_measurement: "kW"
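For anyone trying to picture how the JSON output above ends up on the power0 topic, here is a rough Python sketch. It is only an illustration, not rtl_433's actual code: the helper name flatten_event is made up, the prefix is copied from the state_topic above, and the rule that spaces in the model name become underscores is an assumption inferred from that topic.

```python
import json

# Topic prefix copied from the state_topic in the sensor definition above.
PREFIX = "rtl_433/rpi2_rtl_433/devices"


def flatten_event(line, prefix=PREFIX):
    """Map one JSON event to {topic: value}, one topic per field.

    Hypothetical helper for illustration only.
    """
    event = json.loads(line)
    # Assumption: spaces in the model name become underscores in the topic.
    model = event["model"].replace(" ", "_")
    return {
        f"{prefix}/{model}/{key}": value
        for key, value in event.items()
        if key != "model"
    }


sample = (
    '{"time" : "2019-04-28 19:48:31", "model" : "CurrentCost TX", '
    '"dev_id" : 77, "power0" : 1235, "power1" : 0, "power2" : 0}'
)
topics = flatten_event(sample)
print(topics["rtl_433/rpi2_rtl_433/devices/CurrentCost_TX/power0"])  # 1235
```

Each field becomes its own small message, so the sensor only ever sees a bare number on its topic.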
So why would that cause high CPU use on one server and not on another?
I also have a problem with high CPU use from MQTT, but with the Mosquitto add-on. I have not found out why.
@teachingbirds Hi Isabella,
I just figured out why this was happening for me. The MQTT sensor tracked how much electricity my house was using, and it reported every 15 seconds or so. On my Lovelace home page I had a graph tracking the values using the custom mini-graph card.
Last weekend, I changed the graph to a bar-graph and this was what caused my problem. Removing the graph has restored my home assistant to about 10% CPU.
I hope this helps.
Yea, the mini-graph-card pulls the history from the REST API every time the entity is updated, which is not optimal. It should probably implement some caching.
I’ve heard many people saying they have poor history performance. You would think it should be able to handle at least one request every 15 seconds, but apparently not on some systems…
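To make the caching idea concrete, here is a minimal Python sketch of a time-based cache sitting in front of an expensive history lookup. This is not how mini-graph-card is implemented (the card is not written in Python). TTLCache, fake_history_fetch, the 60 second lifetime and the entity id sensor.power_consumption are all assumptions chosen for the example.

```python
import time


class TTLCache:
    """Reuse a fetched value until it is older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expiry time, value)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now < hit[0]:
            return hit[1]  # still fresh, skip the expensive call
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value


calls = []


def fake_history_fetch(entity_id):
    """Stand-in for the expensive history request (hypothetical)."""
    calls.append(entity_id)
    return [1235, 1273, 1278]


# Simulate a sensor update every 15 seconds with a controllable clock.
fake_now = [0.0]
cache = TTLCache(fake_history_fetch, ttl_seconds=60, clock=lambda: fake_now[0])
for t in (0, 15, 30, 45, 60):
    fake_now[0] = float(t)
    cache.get("sensor.power_consumption")

print(len(calls))  # 2 fetches instead of 5
```

The trade-off is that the graph can lag behind by up to the cache lifetime, which is probably acceptable for a history plot.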
Ah, then that makes sense. I’m actually running HA on an Intel NUC but it managed to almost completely lock it up. The weird thing was that sometimes the problem would disappear for no apparent reason, but it would always come back. I had narrowed it down to the MQTT sensor, but it took a while before I remembered that I was using mini-graph-card with it.
Thanks!
It may be worth looking into some sort of rate limiting for MQTT if that is causing issues. It should not really use much CPU but then I’ve never had MQTT issues like this myself.
MQTT was not the problem, it turned out to be the mini-graph plotting the MQTT sensors.
I have the same issue with the MQTT plugin from the official Add-ons.
Restarting the plugin gives a lower CPU load (max 10% in total) for between 4 and 6 hours.
The system is running in a VM on VMware ESXi with a 64 GB disk, 4 GB RAM and 2 CPU cores without limit.
I have all graphs disabled which are sourced from MQTT. That did not change the load at all.
Only restarting the MQTT plugin works for a short time.
I tried it with the ‘Mosquitto MQTT Server bundled with Hivemq’s web client’
Same issue with high CPU load after a while.
The only difference I can see in the logs compared with before is:
1565000787: Warning: Received PUBREL from shellyswitch25-740330 for an unknown packet identifier 61686.
1565000788: Warning: Received PUBREL from shelly1-2C7455 for an unknown packet identifier 38592.
This goes on and on forever…
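If the log is full of these, it can help to see which clients are responsible. Below is a small Python sketch that counts the PUBREL warnings per client. The regular expression is written against the two warning lines quoted above, so treat the pattern as an assumption about the log format, and count_pubrel_warnings is just a name made up for this example.

```python
import re
from collections import Counter

# Pattern derived from the warning lines quoted above (assumed format).
PUBREL_WARNING = re.compile(
    r"^(?P<ts>\d+): Warning: Received PUBREL from (?P<client>\S+) "
    r"for an unknown packet identifier (?P<pid>\d+)\.$"
)


def count_pubrel_warnings(lines):
    """Return a Counter of PUBREL warnings keyed by client id."""
    counts = Counter()
    for line in lines:
        match = PUBREL_WARNING.match(line.strip())
        if match:
            counts[match.group("client")] += 1
    return counts


log = [
    "1565000787: Warning: Received PUBREL from shellyswitch25-740330 "
    "for an unknown packet identifier 61686.",
    "1565000788: Warning: Received PUBREL from shelly1-2C7455 "
    "for an unknown packet identifier 38592.",
    "1556461107: New connection from 172.30.32.1 on port 1883.",
]
counts = count_pubrel_warnings(log)
for client, n in counts.most_common():
    print(client, n)
```

Running this over the full add-on log should show whether the warnings come from a handful of devices or from everything on the network.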
Found the issue: QoS setting 2 on devices with a weak signal causes CPU load on the MQTT service.
Changed to QoS 0 and the issue was gone.
The CPU load is now around 1% which looks much better.
A second plus is that the temperature of the mini PC with an i5 CPU is much lower than before. Lower energy consumption was one of my goals with home automation.
I will monitor it for a while.
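For anyone wondering why the QoS level matters so much, here is a rough Python sketch of the packet exchange per published message at each level, following the MQTT specification. The figure of 1000 messages is just an illustrative number, and the counts assume a perfect link with no retransmissions, so a device with a weak signal will do worse than this at QoS 1 and 2.

```python
# Control packets exchanged for one published message at each QoS level,
# per the MQTT specification (no retransmissions counted).
QOS_PACKETS = {
    0: ["PUBLISH"],
    1: ["PUBLISH", "PUBACK"],
    2: ["PUBLISH", "PUBREC", "PUBREL", "PUBCOMP"],
}


def packets_for(messages, qos):
    """Total packets for a number of messages on a lossless link."""
    return messages * len(QOS_PACKETS[qos])


for qos in (0, 1, 2):
    print(f"QoS {qos}: {packets_for(1000, qos)} packets for 1000 messages")
```

With QoS 2 the broker also has to keep state for every packet identifier until the handshake completes, which would be consistent with the "unknown packet identifier" warnings when a device drops off and retries.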