MQTT entities greyed out / no connection to broker

Hey everyone, my first post here :wink:

I ran into a problem yesterday - all my Zigbee/MQTT devices became unavailable after the frontend went unresponsive (which has been quite frequent for me, at least once a day for the last few months). Usually rebooting the box fixes it (I have a thin client running Arch Linux with Home Assistant Core installed on it).

But this time it went down for good. What’s worse, I can no longer connect to the broker at “http://192.168.1.69:8081/#/” - the connection is refused. The HA frontend is accessible and integrations other than MQTT are working.

I am no Linux or HA power user, so I definitely need your help to troubleshoot this thing :frowning:

So far, after typing “top” in the host’s console, I noticed something abnormal: a command from user zigbee2+ consumes around 120% of CPU time, as shown in the screenshot attached to this post.

No direct change was made to the system - it had been going unresponsive like that from time to time. I did run “pacman -Syu” on the box, but that was the day before, and the box was rebooted at that time, so I hardly believe the system update was the cause.

[image: screenshot of top output]

Can you guys help me bring back this thing?

That’s confusing because HA OS is the version of Home Assistant that comes with its own operating system.

Do you mean you have Home Assistant Core running on Linux Arch?

Ah, sorry, I confused things indeed - of course I mean the latter, Home Assistant Core running on Arch Linux.

It sounds like whatever MQTT broker you installed (is it Eclipse Mosquitto?) has stopped working.

I am glad you understood the problem statement, mate.

Right, so have you tried restarting the MQTT broker service?

BTW, the Eclipse Foundation maintains a community forum for their Mosquitto broker here:

Hey, thanks for the reply - I tried it now but without much luck. Can you check whether either of the commands below should do the job? Because it’s not working :slight_smile:

[[email protected]][/h/chmurix]$ sudo systemctl mosquitto restart
Unknown command verb ‘mosquitto’.
[[email protected]][/h/chmurix]$ sudo service mosquitto restart
sudo: service: command not found

What instructions did you use when you originally installed the MQTT broker?

Oh, you’re going to love the answer - a buddy of mine did it, and all the specs are gone with him now because he’s an eternal douche :smiley:
I can do all the troubleshooting you want me to, but I need hand-holding here, and that’s the problem :smiley:

Edit: the alternative here is to extract the profile from this instance, and later, IF needed, I can set up HAOS on my own etc. - but the thing is I have tons of config there and would like to salvage it or bring this one back to life.

In your first post you said that Home Assistant is still accessible. In that case, I recommend you make a full backup (Settings → System → Backup). Save the backup file and restore it on a new installation of Home Assistant OS.

HA OS is the preferred installation method because there’s far less maintenance involved and it has simpler upgrades (compared to what you currently have). The MQTT broker is available as an add-on for HA OS.

Okay, cool - will a backup from the Core version be compatible with the new one?
However, are there any steps I can take (including command lines) to maybe reinstall Mosquitto or reconfigure it? Any other ideas besides tearing this whole thing down?

The alternative is to ask your friend to explain what he did when he configured your system.

he’s out of reach

Any other ideas, guys? The issue here clearly looks like the connection got broken for some reason. I have a spare Zigbee dongle just in case, and I have full access to the machine via PuTTY. Any ideas how to troubleshoot this?

Should be

sudo systemctl restart mosquitto

Home Assistant and Zigbee can be installed in maybe a half dozen or more different ways.

Your friend appears to have configured for you what is probably the most complicated and most difficult method to maintain. Usually this method is only used by seasoned Linux (and/or Python) experts.

Your long-term goal should be to scrap this installation and migrate to one that is easier to maintain. The recommendation (above) is to install Home Assistant Operating System (HAOS) which offers lots of tools to help inexperienced users.

If you can back up your existing HA, you might be able to restore the backup onto the new HAOS, which would minimize disruption. There is a backup feature in the HA web interface under Settings → System.

Now, as far as repairing this installation: lots could be wrong here. If a reboot isn’t fixing it, then a simple process restart probably won’t either. First run df -h to get a report of your remaining storage and make sure none of the Linux volumes have filled up.
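
If you’d rather not eyeball the numbers, here is a rough sketch of that check. It assumes a POSIX shell with awk; the function name full_filesystems and the 90% default threshold are my own picks for illustration, nothing official. It reads df-style output and prints any mount point at or above the threshold.

```shell
# Print mount points whose usage is at or above a threshold (default 90%).
# Reads df-style output on stdin (column 5 = Use%, column 6 = mount point),
# so it works on live output as well as on a saved copy.
# Note: mount points containing spaces are not handled by this sketch.
full_filesystems() {
  threshold="${1:-90}"
  awk -v t="$threshold" '
    NR > 1 {
      use = $5
      sub(/%/, "", use)              # strip the percent sign
      if (use + 0 >= t + 0) print $6 " " $5
    }'
}

# On the live system:
#   df -P | full_filesystems 90
```

No output means nothing is at or above the threshold.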

The “broker” you were connecting to on the :8080 URL (oops, see below) is actually the Zigbee2MQTT web frontend, a service which translates between the Zigbee protocol and MQTT messaging over IP (likely the truncated “zigbee2+” in your top screenshot). You are correct in deducing this should not be using 123% CPU. You’ll need to review the logs to find out why the app is malfunctioning. If the :8080 console comes back, you may be able to view the logs there; if not, you’ll need to find the directory where it’s storing log files. I don’t have enough info to help you there. The Z2M project has a lengthy list of startup troubleshooting topics here.
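
If your friend set these up as systemd services (an assumption on my part, I can’t see your box), the journal is the quickest place to look. A rough sketch: the helper name candidate_units is made up, and zigbee2mqtt.service is only a guess at what the unit is called on your system.

```shell
# Pick out service units whose names mention zigbee or mosquitto.
# Reads `systemctl list-unit-files --type=service --no-legend` on stdin.
candidate_units() {
  awk '{ print $1 }' | grep -i -E 'zigbee|mosquitto'
}

# On the live system:
#   systemctl list-unit-files --type=service --no-legend | candidate_units
# Then, for each unit it prints (zigbee2mqtt.service is only a guess):
#   systemctl status zigbee2mqtt.service
#   journalctl -u zigbee2mqtt.service -n 100 --no-pager
```

The last 100 journal lines usually show whether the service is crashing and restarting in a loop, and what error it hits on startup.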

Edit: just saw you’re using a url with :8081 which could be something else? Even Mosquitto’s web console (rarely used) defaults to :8088 so it’s possible your friend changed some defaults.

Your post, peterxian, is spot on in all aspects.
I’ll go over your recommendations, guys, in a few hours when I am home and let you know.

Hey,

I tried “sudo systemctl restart mosquitto”, but this hasn’t changed anything.

As for the rest, Mr. peterxian, this is the df -h output:

Filesystem      Size  Used Avail Use% Mounted on
dev             1.9G     0  1.9G   0% /dev
run             1.9G  195M  1.7G  11% /run
efivarfs        128K  107K   17K  87% /sys/firmware/efi/efivars
/dev/sda1        15G   11G  3.0G  79% /
tmpfs            69M     0   69M   0% /dev/shm
tmpfs           1.0M     0  1.0M   0% /run/credentials/systemd-journald.service
tmpfs           420M     0  420M   0% /tmp
tmpfs            69M     0   69M   0% /var/tmp
tmpfs           1.0M     0  1.0M   0% /run/credentials/[email protected]
tmpfs           383M  8.0K  383M   1% /run/user/1000

Nothing concerning here - although I have dealt with some issues in the past with log messages depleting the space.

As for the ports - yes, I know that to access my HA frontend I used 192.168.1.69:8123, while it was 192.168.1.69:8081 for the MQTT interface.

Maybe that gives you some clue or an idea to follow? In the meantime I’ll review those troubleshooting topics as well.

BTW, the exact moment when this all went bad was the reboot of the box - do you think the MQTT config file could have become corrupted or maybe overwritten, so that there’s no communication there?

Is Mosquitto running? Does mosquitto_pub or mosquitto_sub work?
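
For the second question, a rough sketch of a round-trip test. It assumes the Mosquitto client tools are installed and that the broker listens on localhost:1883 (the standard MQTT port; yours may differ). The function name broker_roundtrip and the diag/roundtrip topic are made up for illustration.

```shell
# Round-trip test against an MQTT broker: subscribe in the background,
# publish one message, and report whether it came back.
# Host, port and topic are placeholders; add -u/-P to both commands
# if the broker requires a login.
broker_roundtrip() {
  host="${1:-localhost}"
  port="${2:-1883}"
  topic="diag/roundtrip"
  tmp=$(mktemp)
  # -C 1 exits after one message, -W 5 gives up after 5 seconds.
  mosquitto_sub -h "$host" -p "$port" -t "$topic" -C 1 -W 5 > "$tmp" &
  sub_pid=$!
  sleep 1   # give the subscriber a moment to connect
  mosquitto_pub -h "$host" -p "$port" -t "$topic" -m "ping"
  wait "$sub_pid" || true   # a failed subscriber is caught by the grep below
  if grep -q "ping" "$tmp"; then
    rm -f "$tmp"
    echo "broker OK"
    return 0
  fi
  rm -f "$tmp"
  echo "broker not responding"
  return 1
}
```

Run it as broker_roundtrip on the box itself, or as broker_roundtrip 192.168.1.69 1883 from another machine. If the clients report a connection error straight away, that would suggest nothing is listening at that address at all.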

MQTT or Zigbee2MQTT? They are different services, neither of which usually uses :8081 for its web page. The Zigbee2MQTT frontend looks like this, with a list of your devices and a map of connections, but defaults to port :8080. The MQTT broker, which could be Mosquitto or something else entirely, rarely has a web frontend (though notably the Bevywise broker has a GUI on :8080 just like Z2M, which could explain why one was moved to :8081). Step one is to figure out which service is actually failing.
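
One way to narrow that down is to see what is actually listening on each port. A rough sketch, assuming the ss tool from iproute2 is installed; the helper name who_listens is made up, and 1883 is the standard MQTT port, which may not match this setup.

```shell
# Show which process, if any, listens on each given TCP port.
# Reads `ss -tlnp` output on stdin; run ss with sudo, otherwise process
# names owned by other users are not shown.
who_listens() {
  awk -v ports="$*" '
    BEGIN { n = split(ports, want, " ") }
    $1 == "LISTEN" {
      m = split($4, a, ":")          # local address is field 4, port is the last part
      port = a[m]
      for (i = 1; i <= n; i++)
        if (port == want[i]) { print port, $NF; seen[port] = 1 }
    }
    END {
      for (i = 1; i <= n; i++)
        if (!(want[i] in seen)) print want[i], "nothing listening"
    }'
}

# On the live system:
#   sudo ss -tlnp | who_listens 1883 8080 8081 8123
```

If 8123 shows the hass process but 8081 shows nothing listening, the web frontend on :8081 never came up, which fits the refused connection.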

Apologies - based on your screenshot we can finally determine that I am indeed running Zigbee2MQTT - however, I have seen mosquitto in top at various times.

As for the reason for the service failing - would it matter that I was experiencing a regular issue (every 24 hrs) where Zigbee devices became unresponsive? It was strange, but I had it for months. I am wondering - maybe there is (and was) some hardware issue going on with the dongle?

Now, I mentioned I bought a spare dongle… the make and model is Sonoff ZBDongle Plus-E.

And this is how this device presents in my “devices”:
[image: screenshot of the device entry]

I am not familiar with this add-in - do you think I can make any use of it here?

So again, just the output from top:

PID    USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
278043 zigbee2+  20   0 1045244  89544  41276 R 102.6   2.3   0:03.09 node
512    hass      20   0 1779696 501180  57708 S   1.7  12.8  40:10.77 hass
425    root      20   0 1854860  46272  22376 S   1.3   1.2   7:38.20 cloudflared
278013 chmurix   20   0   14416   8252   6076 R   1.0   0.2   0:00.09 top
1      root      20   0   21708  13032   9532 S   0.7   0.3  10:14.35 systemd
18     root      -2   0       0      0      0 I   0.3   0.0   1:24.45 rcu_preempt
235    root      20   0   80612  47244  46220 S   0.3   1.2   1:41.43 systemd-journal
213415 mosquit+  20   0   12136   6292   5396 R   0.3   0.2   0:08.80 mosquitto
277216 root      20   0       0      0      0 I   0.3   0.0   0:00.02 kworker/u10:2-events_unbound
277399 root      20   0       0      0      0 I   0.3   0.0   0:00.56 kworker/1:1-events
2      root      20   0       0      0      0 S   0.0   0.0   0:00.11 kthreadd
3      root      20   0       0      0      0 S   0.0   0.0   0:00.00 pool_workqueue_release
4      root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/R-rcu_gp
5      root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/R-sync_wq
6      root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/R-slub_flushwq
7      root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/R-netns
9      root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0H-events_highpri
11     root      20   0       0      0      0 I   0.0   0.0   0:00.00 kworker/u8:0-events_unbound
12     root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/R-mm_percpu_wq
13     root      20   0       0      0      0 I   0.0   0.0   0:00.02 kworker/u8:1-efi_rts_wq
14     root      20   0       0      0      0 I   0.0   0.0   0:00.00 rcu_tasks_kthread
15     root      20   0       0      0      0 I   0.0   0.0   0:00.00 rcu_tasks_rude_kthread
16     root      20   0       0      0      0 I   0.0   0.0   0:00.00 rcu_tasks_trace_kthread
17     root      20   0       0      0      0 S   0.0   0.0   0:06.19 ksoftirqd/0
19     root      -2   0       0      0      0 S   0.0   0.0   0:00.00 rcub/0
20     root      20   0       0      0      0 S   0.0   0.0   0:00.00 rcu_exp_par_gp_kthread_worker/0
21     root      20   0       0      0      0 S   0.0   0.0   0:02.03 rcu_exp_gp_kthread_worker
22     root      rt   0       0      0      0 S   0.0   0.0   0:00.74 migration/0
23     root     -51   0       0      0      0 S   0.0   0.0   0:00.00 idle_inject/0
24     root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/0
25     root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/1
26     root     -51   0       0      0      0 S   0.0   0.0   0:00.00 idle_inject/1
27     root      rt   0       0      0      0 S   0.0   0.0   0:01.02 migration/1
28     root      20   0       0      0      0 S   0.0   0.0   0:07.06 ksoftirqd/1
30     root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/1:0H-events_highpri
33     root      20   0       0      0      0 S   0.0   0.0   0:00.00 kdevtmpfs