Specific ESP boards remaining offline after power outage, where to begin troubleshooting

I’m relatively new to ESPHome, but have been using devices without issue for a few months. Then, as luck would have it, within 12 hours of leaving home for a few days, power was lost for a short period. To my surprise, specific ESP devices have remained offline. Note that I had a power outage once before and had no issues with devices coming back online (see note at end for more info). The interesting thing is that the issue appears to be isolated to the specific board type being used. I have 5 RPi Pico Ws and 1 D1 Mini ESP8266MOD, all of which recovered fine. However, my ESP32-C2 minis (qty 2), ESP32-C3 super minis (qty 2) and ESP32-CAM all failed to recover and remain offline. Currently the only explanation that comes to mind for why some recovered and others did not is the board itself, as the setup/configuration/locations are basically the same.

I don’t believe the power outage itself caused the problem, because power cycling the units has been my standard method of restarting the devices and I have never had an issue. This leads me to wonder if it was the sequence of them coming back online vs when my network equipment came back online. My assumption is that my network came fully back online after the devices had restarted. That being said, if it was wifi/network related, I’m not sure why they wouldn’t have recovered, as I believe the default timeout/reboot is 15min.
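
For reference, a minimal sketch of where that 15 minute reboot lives in the config. As far as I know both the wifi and api components have their own reboot_timeout that defaults to 15min, so the values below only spell out the defaults; they are not taken from any of my devices.

```yaml
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  # Reboot if no WiFi connection exists for this long (default: 15min).
  # Setting this to 0s disables the reboot.
  reboot_timeout: 15min

api:
  # Reboot if no API client (e.g. Home Assistant) connects for this long
  # (default: 15min). Setting this to 0s disables the reboot.
  reboot_timeout: 15min
```

With these defaults a device that cannot reach wifi or HA should restart roughly every 15 minutes and try again, which is why a device that stays offline for days is surprising.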

I’ve looked around but haven’t found anything that helps me determine the cause. Any ideas on where to start? I would like to fix the issue so it doesn’t happen again in the future.

Also, to note: during the last power outage my network was fully powered down and I didn’t turn it back on until well after power was restored. This time, since I was gone and the power outage was relatively short, everything would have received power at the same time: HA (HAOS on a separate mini PC), the ESP devices, and the network switches and APs. The controller, a Unifi DMPro, never powered down and was kept alive via UPS battery backup.

Do you have the reboot disabled for when there is no connection to HA or wifi?

Can you publish the part of the config related to esp/wifi/ota/api/…? No need for sensors etc.
Have you restarted the “failed” devices? What is their status after a manual reboot?

When you say offline, do you mean they just show offline in the ESPHome dashboard, or do you also mean that these devices & their entities are unavailable in HA?

@ShadowFist They are completely offline (ESPHome and HA) and I have confirmed they have never reconnected to my wifi network.

@Masterzz The following are the initial parts of the config files. Initially I thought it was the inclusion of “captive_portal:”, as none of my Pi Pico Ws included that; however, the ESP8266MOD did. So the only difference I found between the ones that reconnected and those that didn’t is the framework: the ones with type: arduino didn’t reconnect:

  framework:
    type: arduino

Here are the configs for the Pi Pico Ws, which are all the same except for the names (all reconnected):

esphome:
  name: coop-pico-w
  friendly_name: coop-pico-w

rp2040:
  board: rpipicow
  framework:
    # Required until https://github.com/platformio/platform-raspberrypi/pull/36 is merged
    platform_version: https://github.com/maxgerhardt/platform-raspberrypi.git

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "removed"

ota:
  password: "removed"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  # Enable fallback hotspot in case wifi connection fails
  ap:
    ssid: "Coop-Pico-W Fallback Hotspot"
    password: "removed"

Here is the ESP8266MOD which reconnected:

esphome:
  name: d1miniesp8266mod
  friendly_name: D1MiniESP8266Mod

esp8266:
  board: esp01_1m

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "removed"

ota:
  password: "removed"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "D1Miniesp8266Mod"
    password: "removed"

captive_portal:

Here is the config for the ESP32-C3s; only the names are different (they did NOT reconnect):

esphome:
  name: esp32-c3-1
  friendly_name: ESP32-C3_1

esp32:
  board: esp32-c3-devkitm-1
  framework:
    type: arduino

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "removed"

ota:
  password: "removed"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-C3-1 Fallback Hotspot"
    password: "removed"

captive_portal:

And finally the ESP32-C2s, where only the names are different (they did NOT reconnect):

esphome:
  name: wemosd1mini-2
  friendly_name: WemosD1Mini-2

esp32:
  board: esp32-s2-saola-1
  framework:
    type: arduino

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "removed"

ota:
  password: "removed"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Wemosd1Mini-2 Fallback Hotspot"
    password: "removed"

captive_portal:

You haven’t posted a yaml for a single affected device, so this is my best guess.

Try setting static IPs and disabling power save on the device itself, something like this:

wifi:
  ssid: "yourSSID"
  password: "yourPassword"
  manual_ip:
    static_ip: 192.168.1.80    # set this to your device's IP
    gateway: 192.168.1.1      # set this to your gateway's IP
    subnet: 255.255.255.0    # only change this if your IPs start with 10. or 172. or if you're running some non-standard subnet
  fast_connect: true
  output_power: 20.4 dB
  power_save_mode: none

Just to be 100% sure, set the same IP in your router reservation and make sure it’s not part of your DHCP pool.

I have a similar/same issue. Just had a power bump: one ESP8266 device came back, two ESP32 devices remain stubbornly offline. They show as offline in the HA ESPHome tab and their entities are unavailable. The devices themselves are running; status LEDs show they are working.

Haven’t found a way to get them back yet. I’ve restarted HA, restarted my Wi-Fi network and power cycled the devices themselves but still offline.

One device is ‘plumbed in’, so it is really quite difficult to get it physically connected to HA for repair.

Any ideas? Thanks

OK have discovered the fallback AP mode - and connected to it - entered the correct wifi password and pressed save.

Nothing happens and the device continues to broadcast the fallback network.

What is the preferred method so the devices just keep trying to connect to the wifi network - i.e. don’t enable the fallback?
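
A minimal sketch of one way to do this, assuming the fallback hotspot is only started when an ap: block is configured: leave out both ap: and captive_portal:, so the device has nothing to fall back to and relies on retrying plus reboot_timeout. The timeout value is just the documented default, shown for clarity.

```yaml
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  # No "ap:" block here, so no fallback hotspot is configured.
  # The device keeps retrying the network above and reboots
  # after reboot_timeout without a connection (default: 15min).
  reboot_timeout: 15min

# "captive_portal:" is also omitted, since it only serves the fallback hotspot.
```

The trade-off is that a device flashed with wrong credentials can then only be recovered over a cable, so this probably suits devices whose wifi settings are already known to be good.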

Thanks

Been reading more around this topic but not made any breakthroughs.

Have reconfigured both devices that wouldn’t reconnect to use static IPs:

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  manual_ip:
    static_ip: 192.168.1.22    # set this to your device's IP
    gateway: 192.168.1.254     # set this to your gateway's IP
    subnet: 255.255.255.0      # only change this if your IPs start with 10. or 172. or if you're running some non-standard subnet
  output_power: 20.4 dB
  reboot_timeout: 3 minutes

Added the reboot timeout due to this cryptic remark in the wifi component docs:
“…but note that the low level IP stack currently seems to have issues with WiFi where a full reboot is required to get the interface back working.”

The devices will now connect when they are in my ‘workshop’ but not when they are in their respective locations. My wifi network uses BT Whole Home mesh with three discs. Is there any way that the ESP32s are somehow trying to ‘hang on’ to the workshop disc when the far stronger kitchen disc is available?
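
One way to check which disc a device actually joined is to expose the connection details as diagnostic entities. This is only a sketch: the entity names are placeholders, and it can only report while the device is connected, so it helps for comparing the workshop against the final location rather than for a device that never joins at all.

```yaml
sensor:
  # Signal strength (RSSI) of the current connection, in dBm.
  - platform: wifi_signal
    name: "WiFi Signal"
    update_interval: 60s

text_sensor:
  # BSSID is the MAC address of the access point (mesh disc) in use.
  - platform: wifi_info
    bssid:
      name: "Connected BSSID"
    ssid:
      name: "Connected SSID"
    ip_address:
      name: "IP Address"
```

Comparing the reported BSSID against the MAC addresses of the three discs should show whether a device is sticking to the workshop disc. Note also that, per the wifi component docs, fast_connect: true connects to the first network the ESP sees rather than scanning for the strongest one, so it may be worth leaving off on a mesh.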

The ESP 32 devices are these: https://www.amazon.co.uk/dp/B0BYJ8MVYF?psc=1&ref=ppx_yo2ov_dt_b_product_details

Everything was working fine for several weeks - it all went wrong after a brief power outage. Power came back up quickly but wifi takes 2-3 minutes to recover during which time the ESP32 boards would have seen no wifi.

Unfortunately brief power outages are a fact of life in my rural location - really need to find a way to get the ESP32s to recover automatically (as the ESP8266 board I have does - need bluetooth or I’d just get some more 8266 boards)

Not sure what else I can try - any and all suggestions welcome - thanks

IMHO it means something is “wrong” in the WiFi/mesh configuration.

The device configuration looks fine; I have approximately the same on 40+ devices working in a Keenetic mesh.
I recently saw a post where a mesh used the same channel on all access points and the author had issues with devices connecting. Keenetic, for example, uses a different channel on each AP. You can also check that all APs use only a 20 MHz channel width.

Thanks - Had a good poke around in the wifi set up but nothing made any difference

However when I took the car out of the garage all started working again - both devices connected when in their proper locations. I have no idea why (interference/attenuation?) - might be coincidence. Not going to mess with it now. Will see what happens when the car goes back in the garage later today.

best

Everything is currently working well enough for OTA upgrades to function.

Enabled ‘Compatibility’ mode on the BT Whole Home Wifi and getting connected now seems to work fine (as it did before power bump). Also discovered my BT Smart Hub 2 which has the wifi disabled has been broadcasting a public EE Wifi Hotspot - it was on a different channel to the Whole Home Wifi but who knows - it is now properly disabled.

My supposition is that the WholeHome Wifi did not recover properly after the power bump in such a way that subsequent restarts didn’t fix it - maybe I should have physically turned it all off and on again :grinning:

Going to invest in a UPS - let’s see and fingers crossed.

I have also noticed an improvement in wifi connectivity after disabling mesh for the ESP devices in my TP-Link router settings.