MQTT Connect failed rc -2 and rc -4 with Socket error on client

After upgrading to Mosquitto 5.1 I’m getting intermittent socket errors and MQTT Connection failures with reason code 2 and 4.
This is a proxmox instance running only one VM being hassos and recorder redirected to Synology NAS running MariaDB10.

I’ve also recently upgraded firmware on Synology router.

Logs from Mosquitto:

[12:02:09] INFO: Setup mosquitto configuration
[12:02:09] INFO: Found local users inside config
[12:02:09] INFO: Initialize Hass.io Add-on services
[12:02:09] INFO: Initialize Home Assistant discovery
[12:02:09] INFO: Start Mosquitto daemon
1594605729: Loading config file /share/mosquitto/acl.conf
1594605729: mosquitto version 1.6.3 starting
1594605729: Config loaded from /etc/mosquitto.conf.
1594605729: Loading plugin: /usr/share/mosquitto/auth-plug.so
1594605729:  ├── Username/password checking enabled.
1594605729:  ├── TLS-PSK checking enabled.
1594605729:  └── Extended authentication not enabled.
1594605729: |-- *** auth-plug: startup
1594605729: Opening ipv4 listen socket on port 1883.
1594605729: Opening ipv6 listen socket on port 1883.
1594605729: Opening websockets listen socket on port 1884.
1594605729: Opening ipv4 listen socket on port 8883.
1594605729: Opening ipv6 listen socket on port 8883.
1594605729: Opening websockets listen socket on port 8884.
1594605729: Warning: Mosquitto should not be run as root/administrator.
1594605729: New connection from 192.168.0.42 on port 1883.
[INFO] found xxxxxxxx on local database
1594605729: New client connected from 192.168.0.42 as DVES_D3BACD (p2, c1, k30, u'xxxxxxxx').
1594605731: New connection from 192.168.0.17 on port 1883.
1594605731: New client connected from 192.168.0.17 as auto-0EBB3489-6F74-077D-25F1-DFBCC09D4635 (p2, c1, k60, u'xxxxxxxx').
1594605731: New connection from 192.168.0.172 on port 1883.
1594605731: New client connected from 192.168.0.172 as DVES_6878A2 (p2, c1, k30, u'xxxxxxxx').
1594605731: New connection from 192.168.0.90 on port 1883.
1594605731: New client connected from 192.168.0.90 as DVES_A75C42 (p2, c1, k30, u'xxxxxxxx').
1594605731: New connection from 192.168.0.98 on port 1883.
1594605731: New client connected from 192.168.0.98 as DVES_51CD30 (p2, c1, k30, u'xxxxxxxx').
1594605731: New connection from 192.168.0.152 on port 1883.
1594605731: New client connected from 192.168.0.152 as DVES_37A1E3 (p2, c1, k30, u'xxxxxxxx').
1594605739: New connection from 192.168.0.224 on port 1883.
1594605739: New client connected from 192.168.0.224 as weatherstation-50029178e40d (p2, c1, k15, u'xxxxxxxx').
1594605739: New connection from 192.168.0.155 on port 1883.
1594605739: New client connected from 192.168.0.155 as DVES_AA3944 (p2, c1, k30, u'xxxxxxxx').
1594605739: New connection from 192.168.0.139 on port 1883.
1594605739: New client connected from 192.168.0.139 as DVES_3571C1 (p2, c1, k30, u'xxxxxxxx').
1594605739: New connection from 192.168.0.171 on port 1883.
1594605739: New client connected from 192.168.0.171 as DVES_689727 (p2, c1, k30, u'xxxxxxxx').
1594605739: New connection from 192.168.0.221 on port 1883.
1594605739: New client connected from 192.168.0.221 as DVES_330423 (p2, c1, k30, u'xxxxxxxx').
1594605739: New connection from 192.168.0.93 on port 1883.
1594605739: New client connected from 192.168.0.93 as DVES_AA33B8 (p2, c1, k30, u'xxxxxxxx').
1594605739: New connection from 192.168.0.214 on port 1883.
1594605739: New client connected from 192.168.0.214 as DVES_8F1BF8 (p2, c1, k30, u'xxxxxxxx').
1594605739: New connection from 192.168.0.200 on port 1883.
1594605739: New client connected from 192.168.0.200 as DVES_32FFC9 (p2, c1, k30, u'xxxxxxxx').
1594605739: New connection from 192.168.0.61 on port 1883.
1594605739: New client connected from 192.168.0.61 as DVES_32843F (p2, c1, k30, u'xxxxxxxx').
1594605740: New connection from 192.168.0.99 on port 1883.
1594605740: New client connected from 192.168.0.99 as DVES_68BAE2 (p2, c1, k30, u'xxxxxxxx').
1594605740: New connection from 192.168.0.91 on port 1883.
1594605740: New client connected from 192.168.0.91 as DVES_A79ACD (p2, c1, k30, u'xxxxxxxx').
1594605740: New connection from 192.168.0.207 on port 1883.
1594605740: New client connected from 192.168.0.207 as DVES_33073A (p2, c1, k30, u'xxxxxxxx').
1594605743: New connection from 192.168.0.17 on port 1883.
1594605743: New client connected from 192.168.0.17 as mqtt_3f48992b.89e316 (p2, c1, k60, u'xxxxxxxx').
1594605747: Socket error on client DVES_A75C42, disconnecting.
1594605747: Socket error on client DVES_51CD30, disconnecting.
1594605747: Socket error on client DVES_37A1E3, disconnecting.

Logs from one of the Tasmota devices:

12:02:08 MQT: Attempting connection...
12:02:08 MQT: Connect failed to 192.168.0.17:1883, rc -2. Retry in 10 sec
12:02:19 MQT: Attempting connection...
12:02:19 MQT: Connected
12:02:19 MQT: tele/kitchen_light_3/LWT = Online (retained)

Anybody else seeing issues with Mosquitto 5.1 or am I going down the wrong rabbit hole?

Thanks
Rob

I don’t know much about return and/or reason codes, but reason codes were introduced in 5.0. One article says:

The reason code is a single-byte unsigned value used to indicate the result of the operation. The reason code less than 0x80 indicates that the result of the operation is successful. Under normal circumstances, the reason code value returned by the operation is 0. If the reason code returned is greater than or equal to 0x80, the operation has failed.

But is the older codes are being used, it is likely a rejected client ID. Client IDs must be unique.

EDIT article here MQTT 5.0 new features |(2) Reason code and ACK | by EMQ Technologies | Medium

And the fact that, although the add-on says 5.1, the mosquitto version is actually 1.6.3. And I think the add-on actually uses MQTT protocol 3.1

1 Like

So these client names are the default names configured in Tasmota and take the form DVES_ plus the last 3 bites from MAC address so should be unique. Also no client name has changed and all was working fine for many months.

One thing that did change was I added a D1 mini based weather station via ESPHome and in the YAML for this is the following:

mqtt:
  broker: !secret mqtt_broker
  username: !secret mqtt_user
  password: !secret mqtt_pass
  birth_message:
    topic: solarsensor/birthdisable
    payload: disable
  will_message:
    topic: solarsensor/willdisable
    payload: disable

This weather station is from a digiblurDIY project on Youtube (see here) and specifically a hack that he has used as he describes here to disable last will and testament? Wonder if this has something to do with it. Don’t understand last will and testament.
Thanks
Rob

This one seems different, but OTOH isn’t that the sever?

Yes 192.168.0.17 is the Home Assistant and MQTT IP address.

Digiblur suggested it might be Tasmoadmin that’s causing the issue. I’ll shut that down in the container and see what happens. Only start it when you want to use it.

After removing the weather station and shutting down tasmoadmin everything settled down and I haven’t observed any more connection errors until this morning when it all started again?

06:15:05 MQT: Attempting connection...
06:15:20 MQT: Connect failed to 192.168.0.17:1883, rc -4. Retry in 10 sec
06:15:31 MQT: Attempting connection...
06:15:46 MQT: Connect failed to 192.168.0.17:1883, rc -4. Retry in 10 sec
06:15:57 MQT: Attempting connection...
06:15:57 MQT: Connected
06:15:57 MQT: tele/blue_lamp/LWT = Online (retained)
06:15:57 MQT: cmnd/blue_lamp/POWER = 
06:19:34 MQT: tele/blue_lamp/HASS_STATE = {"Version":"8.2.0.6(eb35b73-tasmota)","BuildDateTime":"2020-05-13T13:00:40","Module or Template":"Sonoff Basic","RestartReason":"Software/System restart","Uptime":"0T02:16:08","Hostname":"blue-lamp","IPAddress":"192.168.0.214","RSSI":"82","Signal (dBm)":"-59","WiFi LinkCount":1,"WiFi Downtime":"0T00:00:06","MqttCount":13,"LoadAvg":19}
06:20:28 MQT: tele/blue_lamp/STATE = {"Time":"2020-07-15T06:20:28","Uptime":"0T02:17:02","UptimeSec":8222,"Heap":28,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":13,"POWER":"OFF","Wifi":{"AP":1,"SSId":"Cecil","BSSId":"00:11:32:72:0E:2D","Channel":8,"RSSI":78,"Signal":-61,"LinkCount":1,"Downtime":"0T00:00:06"}}
06:24:34 MQT: tele/blue_lamp/HASS_STATE = {"Version":"8.2.0.6(eb35b73-tasmota)","BuildDateTime":"2020-05-13T13:00:40","Module or Template":"Sonoff Basic","RestartReason":"Software/System restart","Uptime":"0T02:21:08","Hostname":"blue-lamp","IPAddress":"192.168.0.214","RSSI":"82","Signal (dBm)":"-59","WiFi LinkCount":1,"WiFi Downtime":"0T00:00:06","MqttCount":13,"LoadAvg":19}
06:25:28 MQT: tele/blue_lamp/STATE = {"Time":"2020-07-15T06:25:28","Uptime":"0T02:22:02","UptimeSec":8522,"Heap":28,"SleepMode":"Dynamic","Sleep":50,"LoadAvg":19,"MqttCount":13,"POWER":"OFF","Wifi":{"AP":1,"SSId":"Cecil","BSSId":"00:11:32:72:0E:2D","Channel":8,"RSSI":78,"Signal":-61,"LinkCount":1,"Downtime":"0T00:00:06"}}
06:28:58 MQT: Attempting connection...
06:29:14 MQT: Connect failed to 192.168.0.17:1883, rc -4. Retry in 10 sec
06:29:24 MQT: Attempting connection...
06:29:24 MQT: Connected

Do you have a networking problem?

Not that I can identify. Recently upgraded firmware on router and mesh APs recently but no issues with any other devices that I can see.

After some time I believe it’s the Synology router.
Doing a soft reboot did not solve the problem. However after a hard reboot it’s been stable for a day.