ESPHome Nodes Become Unavailable After a While

I have an issue where my ESPHome nodes become unavailable randomly. This has happened for the past few years, across various versions of ESPHome and HASS. My nodes are all ESP8266 WEMOS D1 Mini boards.

My router shows that the nodes are connected to Wi-Fi, but it seems that the ESPHome native API on the nodes becomes unresponsive. When I try to remove and then re-add an unresponsive node in HASS directly via its IP address, I get the error: "Can't connect to ESP. Please make sure your YAML file contains an 'api:' line."

Any suggestions would be appreciated.

Please post your YAML. I also suggest you add the following to track Wi-Fi strength
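The snippet itself is not shown in this post. A minimal sketch of what such a sensor could look like, assuming the `wifi_signal` sensor platform (the same one that appears in the YAML posted later in this thread); the name is a placeholder:

```yaml
sensor:
  # Reports the node's Wi-Fi RSSI (in dBm) to Home Assistant once a minute.
  - platform: wifi_signal
    name: "Example Node WiFi Strength"  # placeholder name
    update_interval: 60s
```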

My weakest nodes regularly lose connection

I have that added on all of my nodes. They typically have signal strengths in the range of -60 to -85 dBm. But even the ones with the strongest signal become inaccessible sometimes.

The nodes in the inaccessible state show up as having good signal strength on my router, so I’m not sure what’s going on.

and if you could post YAML…

and if you look at your history of signal strength, do you see drop-offs?

and when you lose connection … if you run logs from the ESPHome dashboard … can you post the complete logs that are generated …

and of course what you have posted suggests that you do not have API added to your YAML… but then again I am sure you have checked, but we can't see the YAML

and do you have ping MDNS turned on for the add-on…

and if you could post YAML…

Here’s the stripped down YAML of one of my nodes:

esphome:
  name: living-room-enviro
  platform: ESP8266
  board: d1_mini

wifi:
  networks:
    - ssid: "SSID2"
      password: "PASSWORD2"
      priority: 5
    - ssid: "SSID"
      password: "PASSWORD"
      priority: 0

  ap:
    ssid: "Living Room Enviro"
    password: "password"

captive_portal:

# Enable logging
logger:

# Enable Home Assistant API
api:
  password: "password"

ota:
  safe_mode: true
  password: "password"

i2c:
  sda: 4
  scl: 5
  scan: True

sensor:
  - platform: wifi_signal
    name: "Living Room Enviro WiFi Strength"
    update_interval: 60s
  - platform: adc
    pin: A0
    name: "Living Room Ambient Light"
    update_interval: 1s
    filters:
      - multiply: 100
    unit_of_measurement: "%"
  - platform: bme280
    temperature:
      name: "Living Room Temperature"
      oversampling: 16x
    pressure:
      name: "Living Room Pressure"
      oversampling: 16x
    humidity:
      name: "Living Room Humidity"
      oversampling: 16x
    address: 0x76
    update_interval: 20s
  - platform: uptime
    name: "Living Room Enviro Uptime"

switch:
  - platform: restart
    name: "Living Room Enviro Restart"
    

and if you look at your history of signal strength, do you see drop-offs?

My most recent disconnection happened when that node reported a signal strength of -58 dBm, so I assume this wasn't a disconnection caused by Wi-Fi signal loss.

and when you lose connection … if you run logs from the ESPHome dashboard … can you post the complete logs that are generated …

I don’t have a disconnected node at the moment to test the output, but from memory, it says something like “can’t connect to esphome api, retrying in background”.

and do you have ping MDNS turned on for the add-on…

I’m not sure what you mean by this?

I do not think this is your problem but it pings your node…

AND you can use the sensor logging component that initiates when wifi connection is lost
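It is not clear from the post exactly which component is meant. One way to log connection loss, as a rough sketch, is to use the `on_connect` / `on_disconnect` triggers of the `wifi:` component together with the `logger.log` action. This assumes a reasonably recent ESPHome release that supports these triggers, and the messages are placeholders:

```yaml
wifi:
  # ...existing ssid/networks settings stay as they are...
  on_disconnect:
    # Written to the device log when the Wi-Fi link drops.
    - logger.log: "Wi-Fi connection lost"
  on_connect:
    # Written to the device log when the Wi-Fi link is (re)established.
    - logger.log: "Wi-Fi connected"
```

These messages only reach the ESPHome dashboard while the API or serial connection is up, so a serial log may be needed to catch the disconnect itself.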

I run API with no password … if you get desperate you could drop it…

Sometimes power supply is a problem, if you have one make of board try another make …

I think you just keep changing and trying different bits until you find a pattern that you can pin the blame on

I have found that my boards are always responsive (I have four different types) but sometimes some of the sensors become unavailable. I haven’t got to the bottom of it …

I would also strongly advise using a fixed IP on the ESP, as a temporary DHCP issue or glitch might put the ESP offline when it renews its lease. (It also makes the ESP come back online faster after a reboot, crash, or power-on 😉)
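A minimal sketch of a fixed IP in the ESPHome YAML, using the `manual_ip` option of the `wifi:` component. The addresses below are hypothetical and need to match your own network:

```yaml
wifi:
  # ...existing ssid/networks settings stay as they are...
  manual_ip:
    static_ip: 192.168.1.50   # hypothetical address for this node
    gateway: 192.168.1.1      # hypothetical router address
    subnet: 255.255.255.0
```

Pick an address outside the router's DHCP pool, or keep the matching reservation on the router, so nothing else is handed the same IP.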

Yes … i should have said that …

api: password: was deprecated in HA 2023.2.4. The docs suggest using encryption: instead. Have you tried without this setting?

Edit: corrected term - deprecated

deprecated. Depreciation is an accounting term.

And it will still work with password.

I already had manually assigned IPs for the nodes on my router, but I now also added these IPs as static within the YAML files. Also changed the api component from password to encryption key.

I applied these changes to a couple of my nodes. I’ll provide an update later on whether these changes do end up making a difference.
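For reference, a rough sketch of what switching the `api:` component from a password to an encryption key could look like. The key below is a placeholder, not a usable value; a real key is a base64-encoded 32-byte string that you generate yourself:

```yaml
api:
  encryption:
    # Placeholder only: replace with your own base64-encoded 32-byte key.
    key: "REPLACE_WITH_YOUR_GENERATED_KEY"
```

After flashing, Home Assistant should prompt for the same key for that node's integration.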

i’ve been fighting with this issue for months now…here’s what i discovered: the ESP boards in my house have issues connecting to my APs that are on channels 6 and 11, but are more stable on the AP that’s on channel 1. originally, i thought this might be due to interference (there is a neighbor’s AP running on channel 9, overlapping both 6 and 11)…but after yesterday, i’m not sure that’s the issue (more on that later).

the main problem with that approach is that some of the boards are closer to the other two APs, so i can’t just have it connect to channel 1 and have the wifi signal strength be terrible. also, i’d be at the mercy of my neighbors…if someone else adds an AP that runs on channel 3 instead of channel 9, my ESPs locked to channel 1 would start having the same issues.

yesterday, i locked one of my two boards (i removed the rest to simplify testing) to channel 1 as it had been, and one to channel 6. the one on channel 6 started doing its usual thing, becoming unavailable and going offline in the ESPHome UI every few minutes, while the channel 1 board stayed solid. i added a ping sensor in HA for both devices, sending 2 packets every 45 seconds - the channel 6 board hasn't dropped since. online the entire time in the ESPHome UI, and available the entire time in HA. 21 hours since.
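The poster's exact configuration is not shown. A sketch of what such a ping sensor could look like in Home Assistant's `configuration.yaml`, assuming an older Home Assistant release that still accepts YAML configuration for the ping integration (newer releases set it up through the UI instead); the host address and name are hypothetical:

```yaml
binary_sensor:
  - platform: ping
    host: 192.168.88.50        # hypothetical IP of the ESP board
    name: "ESP Channel 6 Board Ping"
    count: 2                   # packets sent per check
    scan_interval: 45          # seconds between checks
```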

i do still see a disconnect in the esp logs every minute or so (.240 is my HA IP):

[17:41:02][D][api:102]: Accepted ::FFFF:192.168.88.240
[17:41:02][W][api.connection:071]: ::FFFF:192.168.88.240: Socket operation failed: CONNECTION_CLOSED errno=128

but the difference is that the board doesn’t appear to be actually dropping out. i don’t know if the ping sensor is somehow keeping the wifi awake, or if there’s a bug in esphome, or what exactly is going on…all i know is that it has been solid for 21 hours since i added that. before adding the ping sensor, i could at times not go 21 seconds without the board going unavailable and seemingly dropping off the network.

Update: setting the IP as static within ESPHome (then re-adding the integration for each node with the static IP) seems to have mostly solved the dropout issues. I assumed that setting the ESPHome Node to have a constant IP from within my router should have done the same thing, but apparently this was not enough.

Thanks everyone for the help.

thank you!
I had the same issue and setting a static ip in the yaml seems to have fixed the drops (the device itself never went offline when checking the uptime of the ESPHome device)
Many thanks for sharing!