ESPHome devices go unavailable every hour since early hours today

Firstly, I’m not sure if this is ESPHome related, but I’m not sure where else to post?

OK this is very strange! I have a simple setup monitoring an unoccupied property we’re trying to sell. The property is not local to me.

Since 02:27 this morning, the (only) 2 ESPHome devices have been going unavailable every hour, at 27 minutes past the hour. I also have zigbee devices installed, which all appear to be fine and are not becoming unavailable.

One of the affected sensors is in the lounge, the other in a bedroom. The lounge sensor becomes unavailable for approx 2 seconds, then returns to its previous state. A second or two later the bedroom device does the same: unavailable for a second or two, then back to its previous state. This pattern has repeated every hour since.

I have remote access via Tailscale and have cleaned the build files and then recompiled the code at approx 55 minutes past the hour, but the devices are still going unavailable at 27 past.

I have rebooted HA at 45 minutes past. There is no error reported in the log at 27 minutes, but where should I look? The uptime on the devices is not reset every hour. There was probably a power cut/dip yesterday as the system uptime says restarted yesterday morning at 08:32.

I have an automation that sends me an SMS and Telegram message when a sensor is triggered and I also put a (templated) message into a file. I have disabled that automation.

This is running on a 2 GB Pi4 and usually sits at about 40-50% memory usage and 1-5% processor usage.

Remotely, what else should I be checking?

EDIT: What I didn’t say above is that this has been running successfully other than some device additions for almost 2 months.

Maybe it is a problem with your wifi router or access points?

@jsuanet I had that thought too. I have Firefox installed on HA, so I can access the router config. The router is a Huawei 4G router and it has a SIM card to give me remote access.

I’ve had a quick look at the settings but nothing looks out of place, at least so far.

I have two Wiz bulbs connected to the router and they appear to be fine, or maybe they’re just not reporting any issues?

Did you restart your ESPHome devices to see if the xx:27 pattern keeps going?

Quick and easy way to see what’s really going on:

  • Fire up ESPHome Addon at :25
  • Monitor the logs in the addon for one device until :28
  • Profit
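If the routine sensor lines make it hard to spot what happens around :27, the device's logger could be adjusted before watching. This is a minimal sketch, assuming log tags like `sensor` and `bh1750.sensor` on the device; the levels chosen are only a suggestion, and VERBOSE produces a lot of output, so it is probably worth reverting afterwards.

```yaml
# Sketch only: goes in the device's ESPHome YAML.
logger:
  level: VERBOSE          # global level; per-tag levels below can only lower it
  logs:
    sensor: INFO          # quiet the routine "Sending state" lines
    bh1750.sensor: INFO   # quiet the illuminance readings
```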

@Karosm Yes, I’ve restarted the devices at 55 minutes past. In fact I’ve restarted several times at different times. So I don’t think it’s actually an ESPHome issue.

@ShadowFist That’s given me some more info

[09:26:56][D][sensor:094]: 'Still Energy (%)': Sending state 2.00000 % with 0 decimals of accuracy
WARNING bedroom-presence @ 192.168.1.200: Connection error occurred: [Errno 104] Connection reset by peer
INFO Processing unexpected disconnect from ESPHome API for bedroom-presence @ 192.168.1.200
WARNING Disconnected from API
INFO Successfully connected to bedroom-presence @ 192.168.1.200 in 0.008s
INFO Successful handshake with bedroom-presence @ 192.168.1.200 in 0.067s
[09:27:21][D][bh1750.sensor:159]: 'Illuminance': Got illuminance=221.3lx

After reading some posts and advice about the above error, I’ll investigate this as a DHCP issue. The DHCP leases are currently set to 1 week, and the lease time remaining for these devices, as reported by the router, is over 5 days. I think I need to restart the router to make any DHCP changes take effect, but I’m reluctant to do that just in case it doesn’t come back up properly, and it’s a 2 hour drive each way to flick a switch!
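One way to check the DHCP theory from the device side, without touching the router, might be to expose a few WiFi diagnostics as entities and watch their history around :27. A sketch, assuming the entries are merged into each device's existing `sensor:` and `text_sensor:` lists; the entity names are placeholders:

```yaml
# Sketch only: diagnostic entities for each device's ESPHome YAML.
# Merge these into existing sensor:/text_sensor: lists rather than
# adding a second top-level key.
sensor:
  - platform: wifi_signal
    name: "WiFi Signal"        # placeholder name
    update_interval: 60s

text_sensor:
  - platform: wifi_info
    ip_address:
      name: "IP Address"       # shows whether the address ever changes
    bssid:
      name: "Connected BSSID"  # shows which access point the device is on
```

If the IP address and BSSID stay the same across a dropout, that would point away from a lease change and towards the connection itself being reset.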

Ok, if you don’t want to take that risk (which I fully understand), then accept the situation for the moment, since you said the devices come back online without problems after the disconnect 😀

Good, now that you’ve narrowed it down to a DHCP issue, why don’t you avoid all the hassle with the router and set the ESPs on a static IP?

Just modify the wifi section as follows:

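wifi:
  ssid: "YourSSID"
  password: "YourPass"
  manual_ip:
    # Example values; pick an address outside the router's DHCP pool
    static_ip: 192.168.1.80
    gateway: 192.168.1.1
    subnet: 255.255.255.0
  fast_connect: true       # connect directly without a full network scan
  power_save_mode: none    # disable WiFi power saving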

Either that, or set up an IP reservation on your router. Or do both.
All you need to ensure is that no other devices are using the IPs you assign, so ideally use addresses outside your DHCP pool.
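One wrinkle when doing this remotely: the first OTA upload still has to reach the device at its current address, not the new static one. A sketch, assuming the device is currently at 192.168.1.200 and is moving to the example address 192.168.1.80; `use_address` is the ESPHome option that overrides where the upload connects, and it would be removed again once the device has come up on its new address:

```yaml
# Sketch only: temporary setting for the first upload after changing the IP.
wifi:
  ssid: "YourSSID"
  password: "YourPass"
  manual_ip:
    static_ip: 192.168.1.80    # new address (example value)
    gateway: 192.168.1.1
    subnet: 255.255.255.0
  use_address: 192.168.1.200   # assumed current address; remove after the first upload
```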

24 hours with no “timed” unavailables! @jsuanet I did remotely restart the router, with everything crossed that was safe to cross! Just a router restart didn’t fix the issue though.

So what have I done? Well a simple router restart did nothing other than give me confidence that I could. @ShadowFist Thinking that it was still router/DHCP related, I made some DHCP assignments in the router, but I needed another router restart before that took effect. Since then no drop offs!

It’s interesting/confusing why this suddenly started in the early hours one day after months of working normally!?

The timed drop offs really confused me and searching for similar issues drew me to DHCP, reinforced by advice in this thread. The fact that DHCP leases were 1 week but this was happening every hour was a red herring. Maybe if I’d changed the DHCP settings and restarted the router it would have fixed things, I don’t know?

As this was only going to be a temporary installation I went quick and dirty with spare gear I had lying around!

What was a frustration is that when trying to configure things I kept losing connection. This is likely due to the patchy 4G signal at the property, plus running Firefox on the 2GB Pi4!

Well it happened again! The day before yesterday one device started going unavailable every hour, then 2 hours later the other one started.

A router restart didn’t fix the issue, so I have changed the device configuration to include the fixed IP address of the device. Another router restart and it’s been working without dropouts for over 24 hours now.

So will this fix it permanently or will it happen again do you think?