Constant MQTT device disconnections (socket error)

Hoping this gets resolved as I have these same issues.

I have the same problem on some of my Sonoff S20s with Tasmota 6.2.1 and core 2_3_0 (and some versions before that). Haven’t found a solution yet.
The Tasmota web server is not available until I reboot my Raspberry Pi or reconnect the switch.

Those are not even related… Tasmota/Sonoff will work even if the Pi is off so this isn’t making sense to me unless you have a WLAN issue with the Pi swamping the network…

I relented yesterday and compiled 6.4.1.12 with Core 2.3.0 and I have not had any device dropout for 24 hours…

Those are not even related…

I’m sure they are. Those constant connection errors seem to cause a malfunction on the Tasmota device.

BTW: I’m using Hassbian with a Mosquitto server. The problem remains the same.

You could turn off the Pi and still be able to go to the web interface for Tasmota. They are not related in any way.

It is possible the Pi is flooding the network though.

Hello everyone.

I’ve found that the problem is due to Wi-Fi instability and packet loss.

I’ve managed to “fix” the problem by changing a couple of parameters in the Tasmota firmware source, namely how long to wait until the next MQTT check and the keep-alive interval. I compiled the firmware and updated OTA, and everything seems a lot more stable, with no more socket errors in the log.

Using firmware 6.3 with core 2.3 and the PubSubClient MQTT library.

I changed these lines in the PubSubClient.h file:

// MQTT_KEEPALIVE : keepAlive interval in Seconds
// Keepalive timeout for default MQTT Broker is 10s
#ifndef MQTT_KEEPALIVE
#define MQTT_KEEPALIVE 45
#endif

// MQTT_SOCKET_TIMEOUT: socket timeout interval in Seconds
#ifndef MQTT_SOCKET_TIMEOUT
#define MQTT_SOCKET_TIMEOUT 60
#endif

Here is a default firmware compiled with the above settings: https://mega.nz/#!jQF3xKra!zEx9YInJoyTaOnihdo04oFCn2hNfdYASK9k39n1u2Yw

Hope it helps.

Thanks.
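
For context on why the keep-alive value matters on a lossy link, here is a minimal sketch in C. It relies on the MQTT 3.1.1 rule that a broker drops a client it has heard nothing from for one and a half times the keep-alive interval. The function name `broker_drop_deadline_s` is made up for illustration and is not part of PubSubClient or Tasmota.

```c
/* Sketch only: how long a broker waits before dropping a silent client.
 * Per MQTT 3.1.1, the server disconnects a client that sends no control
 * packet within 1.5x the keep-alive interval. The function name is
 * hypothetical; it does not exist in PubSubClient or Tasmota. */

/* Returns the broker-side deadline in seconds for a given keep-alive. */
double broker_drop_deadline_s(unsigned keepalive_s)
{
    return 1.5 * (double)keepalive_s;
}
```

With a keep-alive of 15 s the deadline works out to 22.5 s, and with 45 s it is 67.5 s, so a short burst of packet loss is less likely to end the session.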

Updated to the following version with your adjustments to PubSubClient.h:

Program Version 6.4.1(sonoff)
Build Date & Time 2019-01-31T22:15:23
Core/SDK Version 2_4_1/2.2.1(cfd48f3)
Uptime 1T23:31:01

I also switched from fixed MQTT settings to MQTT discovery and changed the IP address. I suspect the latter is the real reason, but there have been no disconnects for nearly 2 days.

I’ve got another problematic device. I’ll check whether it helps there too.

Can you also share your bin file?
Thank you.

I was running 6.4.1 on 8 Sonoffs (S20, Touch, Basic, T1 2 gang) and had a lot of these socket errors (some devices more than others; I couldn’t really find a pattern).
I tried:

  • changing the IP addresses of my Sonoffs
  • changing the username/password of the MQTT host
  • adding a list of local users in MQTT config for the devices with most errors
  • changing the client name (DVES_xxxx )
  • erasing all flash using esptool.py
  • changing sleep settings

All without any result. I uploaded @Schneider’s firmware yesterday, and so far there are no more socket errors. Will check over the next few days. :pray:

@stanvv Please make sure you were using core 2.3 previously, too.

I agree. Core 2.3.0 fixed mine as well.

These socket errors were bothering me as well. I decided that reverting to an older core didn’t seem like a good fix and that using @Schneider’s fix would be the most beneficial.

I ended up modifying PubSubClient.h the same way… BUT…
After requesting a STATUS6 message from the switch, I noticed that it was still using a keep-alive of 15 seconds!

After a grep search for more keep-alive defines, I noticed that sonoff_post.h also had a MQTT_KEEPALIVE 15.

After resetting the device and realizing the value wasn’t stored in the CFG, I rebuilt the firmware, and I’ve had no socket errors since, even with an RSSI of 60.

Just for clarification, I did not modify the code in any other way: no user settings, just the keep-alive definitions. My configurations are pretty generic on my Sonoff Basic V2s.
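
The behaviour described above, where editing PubSubClient.h alone left the keep-alive at 15, is consistent with how `#ifndef` guards work. Here is a minimal sketch in C, assuming (not verified against the Tasmota source) that the definition from sonoff_post.h reaches the compiler before the guarded default in PubSubClient.h. `effective_keepalive` is a hypothetical helper used only for illustration.

```c
/* Sketch only: both "headers" are simulated in one file. */

/* Stand-in for sonoff_post.h (assumed to be processed first). */
#define MQTT_KEEPALIVE 15

/* Stand-in for the edited PubSubClient.h: the guard skips this
 * definition because MQTT_KEEPALIVE is already defined above. */
#ifndef MQTT_KEEPALIVE
#define MQTT_KEEPALIVE 45
#endif

/* Hypothetical helper: returns the value the compiler actually uses. */
int effective_keepalive(void)
{
    return MQTT_KEEPALIVE;
}
```

Under that assumption `effective_keepalive()` yields 15 rather than 45, which would explain why both definitions had to be changed before the rebuild took effect.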

Schneider’s fix IS using the older 2.3.0 core or didn’t you notice that?

Bad communication on my part.
I meant I wanted to make it work with 2.4

An update. After two hours it started to do it again. I will compile the 2.3 core version and test it next.

Well, I concede! As described by @Bieniu, the Tasmota 6.4.1 releases built with Core 2.3 work perfectly.

And again verified by eh50.

So that seems to be it.

2.4.2 is horrible, 2.5.0 is much better but 2.3.0 is the most stable. All have tradeoffs.

Thanks for the fix! I’m running 2.3/6.4.1 from the hackbox.

I’m not really keen on reflashing every device (60+), so I would like to do this with MQTT commands.

The keep-alive would be this, as far as I can tell:
MqttRetry 10..32000 = set MQTT connection retry timer in seconds (default = 10)

But I’m having a hard time finding the socket timeout. Does anyone have an idea?

Hello!

It is currently not possible to do it natively using commands. You will need to rebuild and flash the firmware with the above settings.

I’ve managed to do so without the need for a minimal firmware; the final bin size is small enough to flash directly.

There is a risk involved, but there is no way around it.

You can flash all of them by hosting the firmware online and updating via OTA. TasmoAdmin is your best friend.

Thanks.

@Schneider thanks for the quick reply! I thought so, too bad… The weird thing is, I had these issues with 2.4/6.4.1, then I uploaded 2.3/6.4.1 and it was solved. Now, after updating to hass 87, I’m seeing more and more offline devices.
I just can’t link it to updating hass…