ESP8266 drops WiFi connection

Hi,

I’m using HassIO on an Intel NUC. I have a lot of devices, from ESP8266 (some Tasmotized) to ZigBee and Z-Wave, and most of the time they work flawlessly. But since some upgrades ago (of HassIO and add-ons), some of my ESP8266 devices drop their WiFi connection.

I have several WiFi APs and the problem is not tied to a specific AP, nor can I see that it is tied to a specific firmware on the ESP8266 (some use Tasmota and some use homebrewed firmware), but I can see that the devices drop the connection and then fail to connect again. The hardware also differs - some are Sonoff (e.g. Basic or 4CH), some are Wemos Mini or similar, but they all use ESP8266. Some are self-powered (like Sonoff) and some use a USB charger (like Wemos). And they all connect to HassIO using MQTT.
I also have some Shelly and ESPHome devices (that also use ESP8266) - but I have never seen them fail this way. All in all there are about 20 WiFi devices (and about 30 ZigBee or Z-Wave devices).

Usually I have to restart or power cycle the device; sometimes it recovers by itself (could be hours or even days later). Sometimes it fails after an hour or so, sometimes it works for several days without any problem.

I would blame the firmware or hardware if it weren’t for the fact that several devices (2-4) can drop the connection at the same time (see red line in picture), and they all come back online if I just restart the HassIO server (green line). They are not completely gone; sometimes they manage to send a single update (blue circle). And when they are gone, I can see that they have lost WiFi (not just MQTT) and are constantly trying to reconnect to WiFi but fail (I can’t find them on the network using the statically configured IP address). Some of the devices default back to local AP mode, so I can log in to them and reconfigure e.g. the WiFi settings - but they still fail to connect after a restart. I can’t say I have seen any error messages in the logs that I can relate to this.

I have swapped APs, checked signal strength, updated the ESPs to the latest firmware etc., but my gut feeling is that this started (or got worse) a couple of weeks ago and that it is related to HassIO or the Mosquitto add-on (since it starts to work again when I restart the server).

The ZigBee and Z-Wave devices all use MQTT as well, and they have never failed - it is only some of the ESP8266 devices (~5 out of about 20) that use WiFi to connect to MQTT that fail.

Does anyone have an idea what could cause this? Or where to look?

Best regards,
Thomas

I had the same issues with my ESPs and my Asus routers. It turned out to be an issue with DHCP, so I gave all my devices a static IP.
After that everything has been working like a charm.

Hi Henrik,

Thanks for the info!

As it is now, all ESPs and other “static” devices already have static IPs.
Only mobile devices (phones, laptops etc.) have dynamic IPs.
I guess your setup is similar, or did you give static addresses to all devices?
But I will double check to see if I have missed some ESP or something in the DHCP setup in the router …

Cheers,
Thomas

Only the ESPs have static IPs, and they are in a range that is excluded from the range the DHCP server can give away.

It’s likely things will behave the way you are experiencing if you have colliding IP addresses. The DHCP server gives away an address that is already in use as a static address, and the static device will lose its connectivity.
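
For anyone who wants to check their own setup for this kind of overlap, here is a minimal sketch using Python's standard `ipaddress` module. The addresses and the pool boundaries below are made-up example values, not taken from anyone's network in this thread, and `statics_inside_pool` is just a hypothetical helper name. It only lists which statically configured addresses sit inside the DHCP pool; it does not talk to the router.

```python
import ipaddress

def statics_inside_pool(static_ips, pool_start, pool_end):
    """Return the static addresses that fall inside the DHCP pool (inclusive)."""
    start = ipaddress.ip_address(pool_start)
    end = ipaddress.ip_address(pool_end)
    return [ip for ip in static_ips if start <= ipaddress.ip_address(ip) <= end]

# Hypothetical example values: replace with your own static IPs and DHCP pool.
STATIC_IPS = ["192.168.1.20", "192.168.1.120", "192.168.1.150"]
POOL_START, POOL_END = "192.168.1.100", "192.168.1.199"

conflicts = statics_inside_pool(STATIC_IPS, POOL_START, POOL_END)
print(conflicts)  # ['192.168.1.120', '192.168.1.150']
```

Any address that shows up in the list is one the DHCP server could hand out to another device, so either move that device or narrow the pool.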

The static and dynamic address ranges do overlap, so maybe I have hit a “corner case” with too many static devices occupying the DHCP range - or just unknowingly broken a golden rule …
I will move all static IPs so they don’t overlap with the DHCP range and see if it helps!
Thanks for the info!

Hi, could you please elaborate on the issue you found?
For most of my devices I have the main router assign a fixed IP to the MAC address, but not for all of them, and last week I had issues with my Wemos D1 Mini too, which kept connecting and disconnecting for no apparent reason.

Hi,

I narrowed the address range for DHCP, and moved all sensors and ESP devices with static IP addresses outside of that range.

So far the problem seems to be gone and the system is much more stable, so big thanks for the tip @henols !

Cheers,
Thomas

Apparently this issue has floated to the top again. I have three Tasmotized Martin Jerry switches, in the wall. One at the front of the house, one near the back door, and one at the (detached) garage door. I can understand the one at the garage occasionally losing connection, but not the one at the front door. It’s only 30 feet away from the router. It’s closer to the router than the back door switch and has fewer walls in between. And yet the back door switch remains connected through it all. They all have fixed IPs, and those IPs are not within the limited DHCP range I have reserved. All three are on Tasmota 12.3.something. I think 12.3.1. NOTE: I pitched all three Belkin WEMO switches after having connectivity issues in these same locations - but that was just a straw too heavy for the camel to carry on that adventure.

Also, sorry for bringing up a sort of unrelated issue on the HA community. But I’m looking everywhere I can for a solution.

And, from reading here and in other places, I now understand that ESPHome is a Nabu Casa thing, which makes it more likely to work well with HA. What a long strange trip…