How can I make ESPHome more reliable?

I made a simple switch using ESPHome to wirelessly control a relay that turns the power to my electric water heater tank on and off. I need to cut the power during peak periods, as the rate goes up fivefold on some days. So far so good. I have created an automation in HA to turn it off at 6 am and back on at 9 am. The last peak period we had was on Feb 19th.
But somehow, between 8:57 and 9:10, the ESP32 device was unavailable. So at 9 am, HA could not send the turn_on command to the device.
Looking back, it is the only time it became unavailable, but still, I would like to get at least a notification if something fails. Even better, I would like to avoid this “unavailable” problem altogether. And if it does happen, I would like HA to retry the failed command.

I don’t know where to find any relevant logs for this. I’m also fairly new to HA; I’m just moving everything over from Hubitat and am still struggling with all the different options.

So, can anyone give me a clue about what I should do to prevent this from happening?
PS: the Wi-Fi signal in this area is about 4 out of 5, so pretty good.

Assuming the ESP has disconnected from HA but is still working:

  • Move the automation to the ESP
  • Add an RTC module to keep the ESP time updated
  • Add the Status sensor and create an automation so you are notified in case the ESP becomes unavailable

https://esphome.io/components/time/ds1307

If the ESP becomes unavailable frequently, check the power supply and/or replace the ESP board
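
In the device's ESPHome YAML, the RTC and Status sensor suggestions above might look roughly like the fragment below. This is a sketch under assumptions, not a tested config: the I2C pins, the IDs and the sensor name are placeholders, and it follows the pattern described on the linked DS1307 page (read the RTC once at boot, write it back whenever a network time sync succeeds). Merge it into your existing config rather than pasting it as-is.

```yaml
esphome:
  # Merge this on_boot block into your existing "esphome:" section.
  on_boot:
    then:
      # Read the time from the RTC once at boot, before any network sync.
      - ds1307.read_time:

i2c:
  sda: GPIO21   # assumed pins; use whichever pins your DS1307 is wired to
  scl: GPIO22

time:
  - platform: ds1307
    id: rtc_time            # hypothetical ID
    # No periodic re-read; the RTC is only a fallback for the boot-time clock.
    update_interval: never
  - platform: homeassistant
    id: ha_time             # hypothetical ID
    on_time_sync:
      then:
        # Refresh the RTC whenever HA supplies a good time.
        - ds1307.write_time:

binary_sensor:
  - platform: status
    name: "Water Heater Status"   # hypothetical name
```

The Status sensor does not prevent a dropout; it only gives HA an entity that shows when the device's connection to HA is up or down.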


Make the whole automation in ESPHome and use the SNTP platform of the time component, so your ESPHome device works independently of HA.
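
A minimal sketch of what that could look like, assuming the relay is on a GPIO switch and that the 6 am off / 9 am on schedule should run every day. The pin, the IDs, the name and the time zone are placeholders, not values from this thread.

```yaml
switch:
  - platform: gpio
    id: heater_relay                  # hypothetical ID
    name: "Water Heater"              # hypothetical name
    pin: GPIO5                        # assumed pin
    # Assumption: after a reboot, restore the last state and default to ON.
    restore_mode: RESTORE_DEFAULT_ON

time:
  - platform: sntp
    id: sntp_time                     # hypothetical ID
    timezone: America/Toronto         # assumed; set your own time zone
    on_time:
      # Start of the peak period: cut power to the heater.
      - seconds: 0
        minutes: 0
        hours: 6
        then:
          - switch.turn_off: heater_relay
      # End of the peak period: restore power.
      - seconds: 0
        minutes: 0
        hours: 9
        then:
          - switch.turn_on: heater_relay
```

With this in place the relay follows the schedule even if HA is unreachable at 9 am. Note that SNTP needs an NTP server the device can actually reach; on a network without internet access you would have to point the `servers` option at a local one, or use an RTC instead.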

How many devices are connecting by Wi-Fi?

The firmware for Wi-Fi access points (including Wi-Fi routers) includes a table of the active Wi-Fi connections. The number of entries in this table is determined by the manufacturer. Back in the early days it was common for home APs to support only 10 or 15 Wi-Fi devices, and business-grade APs to support 30. Now memory is cheap, so later models increased the size of the tables. Manufacturers are still focussed on high-bandwidth applications like streaming, though, and haven’t really considered that IoT is low bandwidth (infrequent short messages), so a lot more devices could share a 2.4GHz channel.

Anyway, if a device without a current Wi-Fi connection asks the AP to join and the table is full, the AP drops one of the currently connected devices to make space for it. Since the AP cannot know whether a device it hasn’t heard from in a while is still turned on and out there, it drops the device it hasn’t heard from for the longest time.
If HA (or an app on your PC) wants to contact a device which is not currently in the AP’s table, the AP doesn’t know anything about it and so cannot reach the device.

The answer for this is (a) use wired connections wherever reasonable (these are always faster and more secure anyway), and/or (b) add an additional Wi-Fi AP to share the workload.

Adding an AP allows you to reposition the APs closer to the devices they serve, thus improving signal strength. Use the same SSID but different channel numbers and your phone/laptop will automagically connect to the one with the better signal wherever you are.

Can you name any manufacturer/device that would act like this? Or maybe you even have a link to the specification? :page_facing_up:

What table are you talking about? Active DHCP leases maybe? It’s weird because I only know of the opposite happening in such a case: if a router/AP doesn’t have any leases left, it will (temporarily) not be possible to connect (the workaround is usually a static IP) :person_shrugging:

DHCP has to be a separate table because it applies to all connected devices, wired and wireless.
There is additional Wi-Fi-specific information kept for each individual client, such as which band, which SSID (many APs have a guest network), and of course the encryption key for that individual connection.

Unfortunately manufacturers rarely admit to the number of concurrent Wi-Fi sessions, and on only one occasion did I see it in a product specification.

Wouldn’t make much sense IMHO. A very traffic-intensive client streaming 16K (or whatever the coolest thing is today) might use more buffer/memory than 100 ESPHome nodes :person_shrugging:

The Wi-Fi sessions table is like the DHCP table in that it lists those clients that the AP knows about and so can send packets of data to.
That is rather different from what data is currently being sent. Ethernet frames, as you know, have a bounded payload length (typically up to 1500 bytes), so the AP doesn’t need to hold the whole file, just the current packet. But streaming a large file means lots of packets as quickly as it can send them, leaving little “free air” for other devices to get a word in (making the other devices appear slow).

As the 2.4GHz band is being used more for IoT (e.g. ESPHome nodes) than for high-quality streaming, it should be capable of supporting a greater number of concurrent Wi-Fi sessions before the Wi-Fi seems to slow down. But manufacturers would rather just sell us another shiny new AP box with the promise that it will magically fix all our problems :wink:

Wow, thank you for all the good suggestions. I was out of town, seeing the emails coming in, and was looking forward to checking all of that.

@walberjunior I like this idea. Same with you, @Karosm. Thanks. Yes, I’m going to try that. I also added the status sensor, but I’m not sure how this is going to prevent the problem.

@donburch888 wow, very good explanation. I’ve got 17 devices total on that router. Probably 4 of them are wired. This router is dedicated to Home Assistant and only available locally. No internet access.
I understand what you are pointing at, but I don’t think this is too many. This is a router flashed with OpenWrt. But still, I’m going to investigate your idea closely, as it makes sense.
But, looking at the history bar, it seems like it only happened once. Strange that it happened at the time I needed it, but still… taking those precautions should help.

But, as I have been a programmer all my life, I still have a hard time understanding why the automation did not return any error.
Thanks again.

It won’t prevent it, but if the ESP disconnects from HA you can check if it has also lost connection with the router.
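
For the notification part, a Home Assistant automation along these lines could watch the Status sensor. This is only a sketch: the entity ID is a guess at what HA would generate from the sensor name, the 2-minute grace period is arbitrary, and `notify.notify` stands in for whichever notify service you have set up. Newer HA releases also accept the `triggers` / `actions` / `action:` spellings for the same keys.

```yaml
automation:
  - alias: "Notify when the water heater ESP goes offline"
    trigger:
      - platform: state
        entity_id: binary_sensor.water_heater_status   # hypothetical entity ID
        # The Status sensor should report "off" when the device is disconnected;
        # "unavailable" is listed as well in case the entity itself drops out.
        to:
          - "off"
          - "unavailable"
        # Ignore very short blips.
        for: "00:02:00"
    action:
      - service: notify.notify   # assumed; replace with your own notify service
        data:
          title: "Water heater ESP offline"
          message: "The water heater ESP has been disconnected from Home Assistant for 2 minutes."
```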

Agreed - doesn’t sound like this is the issue here because it only happened once - but I mention it for your information (and other readers) as IoT devices dropping off wi-fi seems a fairly common issue.

I also recommend doing the automation on the ESP. In the half year since I got into ESPHome I have been quite overwhelmed at how much it can do, even on tiny devices like the ESP-01 … though the YAML formatting gets to be really tedious and frustrating :frowning:

I also spent 30 years as a programmer, and learnt to look out for the edge conditions … because people usually focus on normal operation and forget to ask “but what if it doesn’t?”
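
If the automation stays in HA, one way to handle the “what if it doesn’t?” case is a retry loop. The sketch below assumes the relay shows up as `switch.water_heater` (a hypothetical entity ID) and simply keeps calling `switch.turn_on` once a minute until the switch reports on, giving up after 30 attempts. The interval and the attempt limit are arbitrary choices.

```yaml
automation:
  - alias: "Turn water heater back on after peak (with retry)"
    trigger:
      - platform: time
        at: "09:00:00"
    action:
      - repeat:
          sequence:
            # Try to switch the relay on, then give the device time to respond.
            - service: switch.turn_on
              target:
                entity_id: switch.water_heater   # hypothetical entity ID
            - delay: "00:01:00"
          until:
            - condition: or
              conditions:
                # Stop as soon as the switch actually reports "on" ...
                - condition: state
                  entity_id: switch.water_heater
                  state: "on"
                # ... or after 30 attempts, so the loop cannot run forever.
                - condition: template
                  value_template: "{{ repeat.index >= 30 }}"
```

While the device is unavailable the state condition stays false, so the loop keeps trying until the ESP comes back or the attempt limit is reached.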