MQTT/Mosquitto randomly stops working and then stays offline, help!

My whole Home Assistant smart home relies on MQTT with the Mosquitto broker add-on. When it fails, the Frigate camera sensors are unavailable, and so is the whole Zigbee2MQTT network.
At (seemingly) random times MQTT stops working. The binary sensors go unavailable, and no commands can be sent to Zigbee devices anymore because they go through Z2M.
Here are the logs of the Mosquitto broker. I would love some steps to find the cause and remedy it.

2024-02-01 20:28:51: New connection from 172.30.32.2:45424 on port 1883.
2024-02-01 20:28:51: Client <unknown> closed its connection.
2024-02-01 20:30:51: New connection from 172.30.32.2:56994 on port 1883.
2024-02-01 20:30:51: Client <unknown> closed its connection.
2024-02-01 20:32:51: New connection from 172.30.32.2:56334 on port 1883.
2024-02-01 20:32:51: Client <unknown> closed its connection.
2024-02-01 20:34:51: New connection from 172.30.32.2:52762 on port 1883.
2024-02-01 20:34:51: Client <unknown> closed its connection.
2024-02-01 20:35:42: Client frigate disconnected.
2024-02-01 20:35:57: New connection from 172.30.32.1:58160 on port 1883.
2024-02-01 20:35:57: New client connected from 172.30.32.1:58160 as frigate (p2, c1, k60, u'MQTT').
2024-02-01 20:36:22: Client frigate disconnected.
2024-02-01 20:36:35: New connection from 172.30.32.1:56278 on port 1883.
2024-02-01 20:36:35: New client connected from 172.30.32.1:56278 as frigate (p2, c1, k60, u'MQTT').
2024-02-01 20:36:51: New connection from 172.30.32.2:41138 on port 1883.
2024-02-01 20:36:51: Client <unknown> closed its connection.
2024-02-01 20:37:30: Saving in-memory database to /data//mosquitto.db.
2024-02-01 20:38:51: New connection from 172.30.32.2:56324 on port 1883.
2024-02-01 20:38:51: Client <unknown> closed its connection.
2024-02-01 20:40:51: New connection from 172.30.32.2:58886 on port 1883.
2024-02-01 20:40:51: Client <unknown> closed its connection.
2024-02-01 20:41:37: Client frigate has exceeded timeout, disconnecting.
2024-02-01 20:42:51: New connection from 172.30.32.2:47328 on port 1883.
2024-02-01 20:42:51: Client <unknown> closed its connection.
2024-02-01 20:44:51: New connection from 172.30.32.2:44372 on port 1883.
2024-02-01 20:44:51: Client <unknown> closed its connection.
2024-02-01 20:46:51: New connection from 172.30.32.2:49350 on port 1883.
2024-02-01 20:46:51: Client <unknown> closed its connection.
2024-02-01 20:48:51: New connection from 172.30.32.2:53790 on port 1883.
2024-02-01 20:48:52: Client <unknown> closed its connection.
2024-02-01 20:50:52: New connection from 172.30.32.2:50228 on port 1883.
2024-02-01 20:50:52: Client <unknown> closed its connection.
2024-02-01 20:52:52: New connection from 172.30.32.2:37378 on port 1883.
2024-02-01 20:52:52: Client <unknown> closed its connection.
2024-02-01 20:54:52: New connection from 172.30.32.2:34372 on port 1883.
2024-02-01 20:54:52: Client <unknown> closed its connection.
2024-02-01 20:56:52: New connection from 172.30.32.2:60720 on port 1883.
2024-02-01 20:56:52: Client <unknown> closed its connection.
2024-02-01 20:58:52: New connection from 172.30.32.2:39530 on port 1883.
2024-02-01 20:58:52: Client <unknown> closed its connection.
2024-02-01 21:00:52: New connection from 172.30.32.2:38952 on port 1883.
2024-02-01 21:00:52: Client <unknown> closed its connection.

At around 20:41 I got a notification that the Frigate binary sensor went unavailable. (There could be a delay before these binary sensors go unavailable, idk.)
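One way to narrow this down is to pull only the named-client events out of the broker log and ignore the `<unknown>` connections that open and close every two minutes. The following is a minimal sketch, assuming the log lines are worded exactly as in the excerpt above; the function names `client_events` and `clients_left_offline` are made up for illustration, and other Mosquitto versions may phrase these messages differently.

```python
import re
from collections import defaultdict
from datetime import datetime

# Line shapes copied from the log excerpt above (an assumption: other
# Mosquitto versions may word these messages differently).
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): (.*)$")
PATTERNS = (
    ("connected", re.compile(r"^New client connected from \S+ as (\S+) ")),
    ("disconnected", re.compile(r"^Client (\S+) disconnected\.$")),
    ("timeout", re.compile(r"^Client (\S+) has exceeded timeout, disconnecting\.$")),
)


def client_events(log_text):
    """Return {client_id: [(timestamp, kind), ...]} for named clients."""
    events = defaultdict(list)
    for raw in log_text.splitlines():
        line = LINE_RE.match(raw.strip())
        if line is None:
            continue  # not a timestamped log line
        stamp = datetime.strptime(line.group(1), "%Y-%m-%d %H:%M:%S")
        for kind, pattern in PATTERNS:
            hit = pattern.match(line.group(2))
            if hit:
                events[hit.group(1)].append((stamp, kind))
                break
    return dict(events)


def clients_left_offline(events):
    """Clients whose last logged event is not a successful connect."""
    return sorted(c for c, seq in events.items() if seq[-1][1] != "connected")
```

Run against the excerpt above, the last frigate event is the 20:41:37 timeout with no reconnect afterwards, which lines up with the 20:41 notification.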

Restarting Home Assistant works, but this is of course not a great solution if it happens 3-4 times daily. EDIT: by a restart, I mean restarting the whole host PC. Restarting Mosquitto or just the OS does not work.

Would super appreciate assistance, thanks!

EDIT: Here is a graph showing it becoming unavailable 32 minutes ago, and also showing that it has happened several times before. After a restart it is solved. https://i.imgur.com/fyqWbv8.png


Hardware? Is this a Raspberry Pi?

There is no software-based reason for this to occur, so I would suspect something external to HA, like the hardware or the network.

Also, how is your storage? Maybe you are running out of drive space. Again, that is external to the software, but it will definitely cause havoc. A reboot would not fix this, and a restore would clear space until it happens again.
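The storage question can be checked quickly from a shell with Python available. This is only a sketch: the helper names and the 90% threshold are arbitrary assumptions, and the paths to check depend on where the drives are mounted on the system in question.

```python
import shutil


def usage_percent(path):
    """Percentage of the filesystem holding `path` that is in use."""
    total, used, _free = shutil.disk_usage(path)
    return 100.0 * used / total


def nearly_full(paths, threshold=90.0):
    """Return the paths whose filesystem usage is at or above `threshold` percent."""
    return [p for p in paths if usage_percent(p) >= threshold]
```

For example, `nearly_full(["/", "/media/external"])` would list whichever of those mount points is above the threshold; both paths here are placeholders.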

I am using 2% of the internal SSD, and 1 TB out of 6.2 TB of an external drive attached via udev mounting rules.
The hardware is a mini PC with an N100 processor.

I doubt it is the network, but maybe…? I am also running Frigate, which created a ton of binary sensors, all connected to MQTT. Frigate detects objects every second in my house, like the car on my parking spot, but the binary sensor that shows car occupancy shows “unavailable”. In the log above I also see Frigate having problems with the Mosquitto broker. And yeah… my whole Z2M network was unavailable during this time until I restarted.
The Mosquitto broker, MQTT, Frigate and Z2M are all on the same device. The Z2M coordinator is attached to the mini PC by USB.

Is this a Docker install?

No, HAOS is installed directly.

I also use HAOS on a mini PC, and I’m getting the same thing. I’m using the stock MQTT add-on. I’ve been seeing messages

[quote=“borgqueenx, post:1, topic:683118”]
Client frigate disconnected
[/quote] since I installed Frigate/MQTT a few months ago, but my cameras were working and defined events were triggering. Since the last couple of updates (about a week ago) my cameras don’t always record defined events. None of the Frigate sensors show as available anymore… It’s annoying, and I’m not familiar enough with MQTT to even know where to start. I sure hope you get an answer on this, because I’m sure it would help me as well. I’ve seen these error messages mentioned in other posts, but never a solution.

Hey lopan, I have had the issue solved for the last few days by moving to EMQX, an alternative to the Mosquitto MQTT broker. A full restart of the mini PC was needed after setting it up; a Home Assistant restart didn’t do the trick.

I also have this problem with Mosquitto, and mine is not even hosted on HA; it’s on a VPS in another country. Every so often all comms stop and I have to do a
service mosquitto restart
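To timestamp when the comms stop, a small probe could be run on a schedule against the broker. The sketch below only checks whether the listener accepts a TCP connection; the function name is made up, the host is whatever address the broker runs on, and 1883 is the port seen in the log in the first post. Note the limitation: a successful connect does not prove messages are flowing, since the log in the first post shows connections being accepted every two minutes while frigate was already disconnected.

```python
import socket


def broker_accepts_tcp(host, port=1883, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # The connection is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Logging the result with a timestamp each time it runs would at least show whether the listener itself goes away during an outage, or whether it stays up while clients drop.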