ESPHome Wi-Fi dropout issue

A couple of months ago, all my ESPHome devices (ESP8266- and ESP32-based) started dropping off Wi-Fi every few minutes (sometimes every few seconds). They might even be restarting; I don't know a sure way to tell. The ESPHome log shows the message below:

“WARNING Disconnected from API: Timeout while waiting for message response!”
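One way to tell a restart from a plain connection drop is to expose the device's uptime. A minimal sketch, assuming the standard ESPHome `uptime` sensor platform; the entity name is only a placeholder:

```yaml
# Sketch only: add this to the device's existing ESPHome config.
sensor:
  - platform: uptime
    name: "Device Uptime"   # placeholder name, pick your own
    update_interval: 60s    # how often the value is published
```

If the value falls back to near zero around a dropout, the device rebooted. If it keeps counting, only the connection was lost.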

A few other observations:

  1. The dropouts happen only during a few hours of the day, usually between morning and evening. I have never seen them happen after sundown.
  2. Not all devices are affected every day. Usually only a handful are affected on a given day, and the next day it is a different (random) set. Some days almost all (but not all) of the devices are affected. Some days (once or twice a week) everything is fine.
  3. Only ESPHome devices are affected. Each individual device is fine for a day or two, then starts dropping off, then is fine again for a day or two.

Also, I have a UniFi (UDM Pro) network.

  1. I have other 2.4 GHz devices (Yeelights, vacuums, etc.), and they are fine. They are never affected.
  2. I also have other ESP8266 devices running WLED, and they are fine too. The issue is only with ESPHome devices.

What I tried:

  1. I tried various changes on the UniFi side (DTIM, rate controls, etc.), but to no avail.
  2. I created a new Wi-Fi network and moved the ESPHome devices to it via OTA. No dice there either.
  3. I tried “power_save_mode: none” under the ESPHome wifi section. It didn’t work.
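For reference, the power-save setting from item 3 sits directly under the `wifi:` block. A sketch, with the `!secret` names as placeholders:

```yaml
wifi:
  ssid: !secret wifi_ssid          # placeholder secret names
  password: !secret wifi_password
  # Disable Wi-Fi modem sleep entirely.
  power_save_mode: none
```

As far as I know, `none` is already the default on ESP8266 (ESP32 defaults to `light`), so on ESP8266 devices this line would not be expected to change behaviour.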

What seemed to have worked:

I compiled and downloaded the .bin file and flashed it using esphomeflasher (over a USB cable). This seemed to work (3 days so far). I suspect there is some part of the ESP8266 flash that only esphomeflasher erases and rewrites, and that OTA does not touch. Edit: spoke too soon.

What is stopping me from doing the same for all the other devices:

I have a bunch of Tuya-based wall switches and plugs. I originally flashed the custom firmware onto them over the air using tuya_convert. I am not even sure these devices have accessible physical connectors (without breaking them open).

I would like to know if there is a way to re-flash firmware via OTA that erases everything, like esphomeflasher does.

I would also like to hear from anyone having similar issues.

Do you use static IP addresses for your devices?

I have over 50 devices flashed with ESPHome around my house. All have static IP addresses assigned. None have connection issues. I am using multiple Ubiquiti APs.

Hi Tom, thanks for looking into this. Yes, all my devices have static IPs. I also have 3 APs (on channels 1, 6 and 11 respectively). I have had these devices for at least a couple of years and never had any issues before.

Connect one of the modules that drops out to a terminal application (like Bray's Terminal…) via your USB/UART cable. ESPHome modules output their log on the Rx/Tx pins (GPIO1 and GPIO3), so that way you may learn more about why it drops out. Information from the serial cable is very useful and is often the key to resolving problems.
Second thing: check whether your router limits the number of Wi-Fi clients allowed at the same time. If a limit is set, the router may start disconnecting the oldest connections once it is reached…

I don’t have any client limit set up, so that is an unlikely cause.
I will look into the USB cable logs. Can you tell me a little more about how exactly I have to wire it up? Thanks in advance.

Since you said you flashed a module, I guess you already own a USB-to-UART adapter. All you need to do is connect it the same way you did for flashing: GPIO1 (the module's TX) to RX on the USB-UART adapter, GPIO3 (the module's RX) to TX, and GND to GND. There is basically no need to connect GPIO3, since communication is one-way, from the module to the PC: you can’t send commands from the PC to the ESP8266 module, you can only read its log output (but I always connect all three pins anyway…).
Then download terminal software for PC (or Mac), like Bray's Terminal for PC, and observe.
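The terminal's baud rate has to match what the firmware's logger uses. A sketch of the relevant ESPHome `logger` block, assuming default settings otherwise:

```yaml
logger:
  level: DEBUG        # default log level
  baud_rate: 115200   # default; set the terminal to the same speed
  # Note: many device configs use "baud_rate: 0", which turns off
  # serial logging completely. Nothing appears on GPIO1 in that case.
```

If a device's config sets `baud_rate: 0`, it would need to be re-flashed with a non-zero value before the serial cable shows anything.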

Do you have it in a spotty Wi-Fi area? I had similar issues with my garage door opener, and they were related to my Wi-Fi. I had a minimum RSSI set on my UniFi USG so that anything below -67 dBm got dropped. However, coverage in my garage is spotty at best, so the opener kept getting dropped. Lowering the threshold to -75 dBm fixed my issue.
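To see what signal level a device actually reports, ESPHome has a `wifi_signal` sensor. A minimal sketch; the entity name is a placeholder:

```yaml
# Sketch only: add this to the device's existing ESPHome config.
sensor:
  - platform: wifi_signal
    name: "WiFi Signal"     # placeholder name
    update_interval: 60s    # RSSI is reported once a minute
```

Comparing the reported values with any minimum-RSSI threshold on the AP shows whether a device is sitting near the cutoff.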

  1. I have very good Wi-Fi in every corner of my home.
  2. Minimum RSSI is not enabled.

What access points are you using? UniFi ones on recent firmware are seeing disconnection issues.

I have 3x unifi-ap-ac-pro

They had been working for years without any issues until recently.

Have you done a firmware update recently?

Funny you say this, my UniFi network was acting badly a couple of weeks ago. I logged into the controller and realised there was a firmware update for the APs. After updating them, they have been great.

Yeah, it's been hit and miss. I had to roll back to v4 on some of mine. Have a look at the forums: there are lots of issues listed with disconnects, failing to get DHCP leases, etc. with some of the last few releases.

OK, I can’t speak for others but my issue has been fixed.

Root Cause: Solar Optimisers (Tigo optimisers)

They had been the primary suspects right from the get-go, since the problem started the day they were installed. The trouble is that there was zero evidence online (as far as Google goes) that they could cause any RF interference. Neither ESPHome nor my UniFi setup had anything to do with the issue, yet I spent hundreds of hours tinkering with them. Anyway, I had the optimisers removed from my solar panels and everything is working great.

Bugger. At least you solved the issue.

Tigo technology uses 2.4 GHz for both its PAN rooftop communications and its Wi-Fi networking.

As you have 3 APs it is going to interfere with at least one of the channels, possibly more depending on their channel selection.

I recently ran into this very same problem: ESPHome devices dropping in and out, sometimes within seconds. It seems to affect more devices by the day, and I’m afraid my Shelly relays may break, since they flicker the relay when set to turn on at boot. The devices most affected are the ones closest to an AP. And I am not running any of OP's solar products. I may try to get some logs via cable from one of the devices, but pulling Shellys in and out of the walls is not very convenient, and hooking up UART to something connected to mains even less so. Maybe connecting a new ESP over USB could work. I’ll go through my Wi-Fi settings; I’m also running UniFi access points. I am not enforcing a minimum RSSI, but I cannot find a setting for the number of clients.

See this interesting history on one of the ESP devices (Sonoff Basic). Consistently dropping every ~15m.
image

Another device, a Shelly 2.5:
image

Are your ESP devices ESP8266-based? And do you have the web_server component enabled? I’ve recently been experiencing problems with my devices after a recent update, and I have read that the web_server component is quite large and can cause issues. I’m disabling it on one of my devices to see if it fixes the Wi-Fi stability issue.

Thanks for the advice, I’ll definitely try it! Unfortunately the answer is not a simple yes: the devices are a mix of ESP8266 and ESP32, but they are all running the web server.

Just to rule it out: your screenshots aren’t by accident from a router, showing DHCP leases (with the DHCP lease time set to 15 minutes)?

Otherwise, have you already checked all the possibilities mentioned in the FAQ? :point_down:

And logs would also be a good thing to post here :wink:

Thank you for your input @orange-assistant. Very late reply, but yes, I have tried most of these things except checking the logs (neither over Wi-Fi nor over UART/USB), so that is something to do!
Since I wrote here, I have heard that some people experience problems with UniFi APs when devices switch AP: the switch sometimes seems to disconnect the HA API, and the devices then never come back. I have this issue as well on some devices, and it was solved by switching to MQTT instead of the HA API. I do, however, have tons of ESPHome devices (more than 30, and maybe 70+ Wi-Fi devices in total), so I wonder if there is a limit on the number of connections to the HA API. I agree with you about the extremely precise timestamps in my screenshot, but this is not the case for all devices, nor always for this one. Besides, I have the reboot parameter set manually to something else (and no, it is not a DHCP lease). I also cannot remember whether the device responded to ping while offline.
image

This is my configuration. I’ve just changed from the API to MQTT, so let’s see how it affects this device (the same one as in the screenshot above, IVT Climate).

esp32:
  board: wemos_d1_mini32
  framework:
    type: arduino
mqtt:
  broker: !secret mqtt_broker
  username: !secret mqtt_username
  password: !secret mqtt_pass
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  
  manual_ip:
    static_ip: 192.168.1.159
    gateway: 192.168.1.1
    subnet: 255.255.255.0

  reboot_timeout: 30min
  
  fast_connect: true
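
Since drops at roughly 15-minute intervals came up earlier in the thread: ESPHome's `wifi`, `api` and `mqtt` components each have a `reboot_timeout` option that defaults to 15min and restarts the device when that connection has been missing for that long. A sketch, reusing the secret names from the config above, that switches these reboots off while testing. Whether they play any role in this particular case is an assumption:

```yaml
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  reboot_timeout: 0s   # 0s disables the reboot when Wi-Fi is down

mqtt:
  broker: !secret mqtt_broker
  username: !secret mqtt_username
  password: !secret mqtt_pass
  reboot_timeout: 0s   # 0s disables the reboot when the broker is unreachable

# A config that uses "api:" instead of "mqtt:" has the same
# reboot_timeout option under "api:".
```

With the reboots disabled, any remaining dropouts cannot be the firmware restarting itself after a timeout, which narrows things down.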

EDIT: This is a screenshot of the uptime over 4 days so it does not seem to be rebooting at least