ESP8266 Keeps going unavailable

I have an ESP8266 board (D1 mini clone) which I am using to control a string of 50 WS2811 LEDs. It worked great during testing (powered by bench supply and close to AP) and is working awesome when it is showing up now, but it keeps going unavailable.

I am not sure if it’s a power issue or what. The lights still work and play the effect that I have set up when it goes unavailable, so it’s not losing power entirely. The supply is capable of just over 3A and in testing it never drew more than ~2A, so I figured I had plenty of headroom.
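
For what it's worth, here is a rough back-of-the-envelope check of that headroom. It is only a sketch: the 60 mA per pixel at full white is an assumed rule-of-thumb figure, not something measured on this string, and 3.0 A stands in for "just over 3A".

```python
def worst_case_amps(pixels, ma_per_pixel=60):
    """Worst-case draw in amps, assuming every pixel is at full white.

    ma_per_pixel=60 is an ASSUMED rule-of-thumb value, not a measurement.
    """
    return pixels * ma_per_pixel / 1000


def headroom_amps(supply_amps, load_amps):
    """Spare capacity left on the supply (negative means overloaded)."""
    return supply_amps - load_amps


pixels = 50
supply = 3.0    # "just over 3A" in the post, rounded down here
measured = 2.0  # peak draw seen during testing

print("worst case:", worst_case_amps(pixels), "A")
print("headroom vs measured draw:", headroom_amps(supply, measured), "A")
print("headroom vs worst case:", headroom_amps(supply, worst_case_amps(pixels)), "A")
```

Under those assumptions there is about 1 A of margin against the measured draw, but essentially none if an effect ever drives every pixel to full white.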

I can also ping the IP address of the ESP8266 when it claims to be unavailable and it responds, though it does have a wide variation in response times (like between 5ms and ~250ms) and drops a packet occasionally. So is it just that ESPHome/Home Assistant doesn’t like the slow/occasional dropped packet connection? If so, is there a way I can configure it to essentially be more patient with occasional dropped packets or slow responses? Or is there possibly something else wrong?
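
To put numbers on "slow and occasionally dropped", something like the sketch below could summarize a batch of ping results. The sample values are made up for illustration, `None` stands for a dropped packet, and "jitter" here is simply the mean difference between consecutive replies, which is one of several possible definitions.

```python
from statistics import mean


def summarize_pings(rtts_ms):
    """Summarize ping round-trip times in ms; None marks a dropped packet."""
    replies = [r for r in rtts_ms if r is not None]
    sent = len(rtts_ms)
    lost = sent - len(replies)
    summary = {
        "sent": sent,
        "lost": lost,
        "loss_pct": 100.0 * lost / sent if sent else 0.0,
    }
    if replies:
        summary["min_ms"] = min(replies)
        summary["avg_ms"] = mean(replies)
        summary["max_ms"] = max(replies)
        # Jitter: mean absolute difference between consecutive replies.
        diffs = [abs(b - a) for a, b in zip(replies, replies[1:])]
        summary["jitter_ms"] = mean(diffs) if diffs else 0.0
    return summary


# Made-up samples in the range described above (5 ms to ~250 ms, one drop).
samples = [5.0, 12.0, None, 250.0, 8.0]
print(summarize_pings(samples))
```

Comparing the same summary taken next to the AP and at the outdoor location would show whether the link itself degrades once the board is installed.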

The serial logs will help to figure out what’s happening :bulb:

Try putting some tin foil around it. Insulate it so it doesn’t short anything, see if that helps. Seriously.

Welcome. This sounds exactly like what is apparently a very common phenomenon, one that has spawned at least a half-dozen threads on here (that I know of).
The list of suggestions for fixes is long, yet no one fix seems to do it for everyone. That, or the thread tails off and we never learn if they got resolved.
Search for things like “unavailable” and “disconnected” or “OTA” and you may find most of the old threads. Maybe one holds a helpful suggestion.

In my case, I discovered that the ESP’s WiFi stack is just not stable when linked to an OpenWrt WiFi Access Point (yet other non-ESP systems have no trouble with it), and that by reverting it (a TP-Link Archer A7v5) to latest factory firmware the ESP’s link becomes rock-solid.

I hope you have better luck with it.

Thanks, I did notice other similar complaints, I was worried about power being the issue, but I think it’s unrelated. Think I’ll start looking around at those posts and seeing if any of their solutions work. I am also using an Archer A7, but unfortunately it’s just running the standard firmware.

Without logs, all anyone here can do is poke in the dark or give tin foil tips :see_no_evil:

Lol yeah, sorry, I will try and get some logs from it a bit later. It is outside (it’s Christmas lights it’s controlling), so I’ll have to devise an Arduino with an SD card to capture the logs or something. Unless it will cache the logs and upload them to the MQTT server when it can connect. How do I set up the MQTT server for the logger? I’m not seeing where the docs specify that. Or is it an internal MQTT server on the ESPHome server itself?

The stock firmware on the Archer underwent a very nice upgrade about 9 months ago, so I’d encourage you to check to be sure it’s the latest.

And you can capture logs OTA using the command ‘esphome logs your.yaml’

In this case (API/wifi connection problems) serial logs (not ota logs) are the way to investigate further.

True, but it would also be interesting if it can maintain the log link but not the one to HA.

When I watched mine serially, it was pretty innocuous, saying something like “link with HA lost… Link to HA restored” and not much more detail (but I hadn’t upped the log level beyond the default, either).

If it’s dropping pings from a workstation, it’ll certainly lose OTA log traffic too, unless it’s via TCP…

And you even know why the important details were missing:

Maybe the tip from @RoadkillUK might help you :man_facepalming:

Very eager to see one of these threads reach a conclusive solution someday.
None [seem to] have achieved that so far.
That’s left me with only the gross evidence - which I cannot discount - that OpenWrt on the AP means ESPs will flap, randomly, after about 4 hours, while stock firmware means they don’t. Every few weeks I flash the AP back to OpenWrt, just to see if anything’s improved, i.e. whether the latest ESPHome version has fixed it. But the results are always the same: about 4 hours in with OpenWrt running on the AP, some of the ESPs start to flap.

I have about 2 dozen ESPs running ESPHome, and it’s a dice-roll which ones will flap. And when some do, all I have to do to “spin the wheel” (and send another few nuts) is reboot the AP. Rebooting the ESP doesn’t change a thing. If it was flapping before, it’ll keep doing so, until the AP gets rebooted or the Weather Gods decide that it’s time for that one to become stable and another one or two to start acting up.
That doesn’t rule out HA having a possible role in whatever makes the link so fragile, but it also casts a pretty suspicious light on the WiFi stacks in at least one of: ESP, OpenWrt.

Staying closely tuned … I’ll help however I can - tests, logs, etc.

The tin foil method works for ESP’s that have trouble actually connecting to an AP.

I get disconnects from my Asus router and that’s why I plugged in a cheap Tenda AP with a different SSID for my ESP’s and they seem happy with that.

I still have no idea why they don’t like my Asus.

P.S. some of my ESP’s are quite happy with the Asus however, it’s hit and miss.

Are you trying to run the whole setup via the USB port? Powering the load via the micro-USB port on the Wemos is not good for things that pull high current, e.g. 2A.

I had one behaving exactly the same so I cut the plug off the USB cable and wired it directly to the 5V & GND pins on the Wemos. Haven’t had a single drop-out since. (Mine was only running 30 LED’s but it was enough to cause issues)

:point_down:

Saw that, but asked if he has updated to latest. That coincides with what fixed it for me.

All poking in the dark till the author provides meaningful logs. We don’t even know if WiFi is troublesome or it’s “just” problems with the api connection.
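
One way to start telling those two cases apart from a workstation, sketched here under assumptions: ping only proves ICMP gets through, so this tries an actual TCP connection to the port the ESPHome native API listens on. 6053 is the default port and may have been changed in the YAML; the IP in the usage line is a placeholder.

```python
import socket


def tcp_port_open(host, port=6053, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    port=6053 is the ESPHome native API default; adjust it if the device's
    YAML sets a different one.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Usage (placeholder address, replace with the ESP's real IP):
# print(tcp_port_open("192.168.1.50"))
```

If pings are answered while this keeps returning False, that would point more at the API connection than at WiFi as a whole. It is no substitute for the serial logs, though.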

We need to wait for the logs or give advice on how to use the brick method, which always works reliably :brick: :muscle:

BTW: My setup includes around 60 ESPs and at least 3 OpenWrt APs without any coincidences and spurious correlations :bulb:

Or simply a power issue:

Everyone is jumping on the wifi bandwagon but it could just be a hardware / wiring setup issue.

Not necessarily in the dark. I’ve been trying to contribute some of the observations that have been helpful in my case, in a belief that the more you know, the easier it will be to isolate a cause, or at least a strong correlation.
I’m curious if you have anything relevant of your own to add in that regard, or if you’re only here to criticize the things which I attempt to contribute.