I have a relay connected to a pump in my garage and the wifi signal isn’t great. I would like ESPHome to operate without wifi (as far as possible).
ESPHome seems to restart regularly despite reboot_timeout: 0s being set on both the wifi: and api: components. The restarts seem to coincide with periods of poor wifi coverage. This is a big issue because the pump restarts each time, which will eventually cause premature wear and damage.
What ESP8266/ESP32 device are you using? Does it have a u.fl antenna connector? If so, adding an external antenna may help. If not, consider using a Wemos D1 Mini Pro with external antenna.
I’m sure I can improve the wifi coverage but that’s not really my question.
Is it actually possible for ESPHome to operate with poor wifi without restarting? And if so, what do I need to configure in addition to reboot_timeout: 0s on wifi and api?
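For reference, the setup described in the question can be sketched roughly as below. This is a minimal fragment, not a complete config: the secret names are placeholders, and only the two reboot_timeout lines come from the thread. Per the ESPHome docs, the default for both is 15min and 0s disables the automatic reboot for that component.

```yaml
# Minimal sketch; the secret names are placeholders, not from the thread.
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  # Do not reboot when no wifi connection exists (default: 15min)
  reboot_timeout: 0s

api:
  # Do not reboot when no API client (e.g. Home Assistant) is connected (default: 15min)
  reboot_timeout: 0s
```

If the mqtt: component is also in use, it has its own reboot_timeout, which would need the same treatment.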
I’m doing that a lot and don’t experience the problems you have:
We don’t even know at this point what your “not great” wifi signal is, or whether it is even the cause of your reboots (and not a completely different reason such as a failed component/platform etc.)
Thanks for the help @orange-assistant. I don’t have a wifi signal sensor but will add one.
For now, I can compare the uptime sensor to periods of unavailability, which I believe is due to wifi drop outs. Adding the wifi signal sensor should prove this either way I guess.
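A minimal sketch of the two sensors being discussed, assuming the entity names that show up in the logs later in this thread. The 60s interval is an assumption chosen to match the spacing of those log entries.

```yaml
sensor:
  # Reports the RSSI of the current wifi connection in dBm
  - platform: wifi_signal
    name: "Pool Pump Wifi Signal Sensor"
    update_interval: 60s

  # Seconds since boot; a drop back towards zero indicates a restart
  - platform: uptime
    name: "Pool Pump Uptime Sensor"
    update_interval: 60s
```

Comparing the two in a history graph shows whether a gap was only an API disconnect (uptime keeps climbing) or a real reboot (uptime resets).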
Thanks also for the ‘good question’ link. I have read all of the ESPHome documentation, searched for others with a similar issue and have tried to be clear on my goal. I do appreciate that we are all hobbyists and no one is being paid for their time here.
Given your last post, am I correct in assuming that your ESPHome devices are not restarting during wifi dropouts? From the docs, this does seem to be the intended behaviour when reboot_timeout is 0s.
We really need logs to be sure that’s the case otherwise it’s really only a (wild) guess.
The “unavailable” times actually only tell you that the api wasn’t connected, but nothing about whether the esphome node was connected to wifi or not. If you look at the beginning of your graph, there are big blocks where the esphome node is not connected to ha (via api) but at the same time it didn’t restart (uptime going up).
It will add weight to that theory, but you will not have any “proof” until the logs tell you it’s a planned restart because there was no wifi connection.
I’m not really certain that I actually have a lot of clients dropping the wifi connection (regularly). Certainly I have esphome nodes which are kind of on the “edge” (like -85/-90dBm) in terms of signal strength, but they all perform fine. I actually didn’t change the defaults (15min) for a restart if there is no wifi/api connection.
I have read a lot of the esphome docs but certainly not all - there is just too much of it
Not sure if you came across this entry in the FAQ:
Resurrecting this thread as I never managed to get to the bottom of this last year and ended up closing the pool for winter
Now we are in the middle of the pool season and the issue has started to occur again regularly this week.
I tried to capture logs using the esphome logs abc.yaml > log_file command. But I’m not sure how useful it is as I seem to miss the restart (due to lack of connection?)
[14:32:38][D][sensor:093]: 'Pool Pump Uptime Sensor': Sending state 7902.55518 s with 0 decimals of accuracy
[14:34:46][D][sensor:093]: 'Pool Pump Wifi Signal Sensor': Sending state -65.00000 dBm with 0 decimals of accuracy
[14:34:46][D][sensor:093]: 'Pool Pump Uptime Sensor': Sending state 7962.56104 s with 0 decimals of accuracy
[14:34:46][D][sensor:093]: 'Pool Pump Wifi Signal Sensor': Sending state -65.00000 dBm with 0 decimals of accuracy
[14:34:46][D][sensor:093]: 'Pool Pump Uptime Sensor': Sending state 8022.55713 s with 0 decimals of accuracy
[14:36:46][D][sensor:093]: 'Pool Pump Wifi Signal Sensor': Sending state -65.00000 dBm with 0 decimals of accuracy
[14:46:40][D][sensor:093]: 'Pool Pump Uptime Sensor': Sending state 533.29999 s with 0 decimals of accuracy
[14:47:36][D][sensor:093]: 'Pool Pump Wifi Signal Sensor': Sending state -64.00000 dBm with 0 decimals of accuracy
[14:47:40][D][sensor:093]: 'Pool Pump Uptime Sensor': Sending state 593.29498 s with 0 decimals of accuracy
Is there another method to capture logs so I can try to diagnose these restarts once and for all? ESPHome is running on a Sonoff 4CH device.
Thanks @orange-assistant for the continued support. I did a bit more investigation and added the debug and restart config as suggested. Since this is a Sonoff device, it is not straightforward to connect a serial logger but I will do that as a next step if there is still not sufficient info here to diagnose the cause.
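The exact debug and restart config used is not shown in the thread. A sketch of what such a setup could look like, using the ESPHome debug component and its text sensors, is below. The entity names are hypothetical.

```yaml
# Sketch only; entity names are hypothetical.
logger:
  level: DEBUG

# Collects device information, including the reason for the last reset
debug:
  update_interval: 5s

text_sensor:
  - platform: debug
    device:
      name: "Pool Pump Device Info"
    reset_reason:
      name: "Pool Pump Reset Reason"

switch:
  # Allows a manual restart to be triggered from Home Assistant
  - platform: restart
    name: "Pool Pump Restart"
```

Because the reset reason is published after boot, it survives the reconnect and is available in Home Assistant even when the log stream missed the crash itself.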
The entities show i) device info inc. restart reason ii) uptime and iii) wifi signal strength. The gaps in the graphs correlate to periods where HA is reporting that the device is unavailable. I previously assumed that these dropouts were due to poor wifi signal strength but I can now see that in this specific example the wifi signal strength was actually stronger during the unstable periods than the stable ones. So perhaps the issue is causing the wifi dropouts rather than the other way around.
From the graphs I see a long period of stability followed by a period of instability during which the device rebooted several times. The restart reasons reported from the OTA logs are sometimes exception 28 and sometimes exception 9.
Have you tried setting output_power to a lower value to see if the device runs more stably?
output_power (Optional, string): The amount of TX power for the WiFi interface from 8.5dB to 20.5dB. Default for ESP8266 is 20dB, 20.5dB might cause unexpected restarts.
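As a sketch, lowering the TX power would look something like this. The 17dB value is an arbitrary example within the documented 8.5dB to 20.5dB range, not a recommendation from the thread, and the secret names are placeholders.

```yaml
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  reboot_timeout: 0s
  # Example value only; anything from 8.5dB to 20.5dB is accepted
  output_power: 17dB
```

The idea is that a lower value reduces current peaks during transmission, at the cost of some range.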
Yes I have Google Wifi and it seems that ESPHome had connected to a weaker access point. To remove this as a factor I have now setup a new, single, wifi access point in the garage with a different SSID
Since doing this I have had 5 days of trouble free running but unfortunately today the resets started again. My observations are:
At 12.44 the reported wifi signal strength to the new access point dropped from around -50 to -60.
First outage occurred at 12.51 for ~40 seconds
Device reconnected at 12.52 for around 5 minutes.
Second outage occurred at 12.57 to 13.29 (33 minutes!)
Device crashes with fatal exception 9 two mins later at 13.31
Device is reported “unavailable” until it reconnects at 13.39
Device crashes again at 13.44 with fatal exception 28
The board is this one https://www.aliexpress.com/item/4000026433011.html
I’ve flashed ESPHome on it using their doc. It’s powered by the 24V output from a 5KWh Solar Inverter, so I guess it should be a stable supply. The problem is that I can reproduce this if I reboot the router or a nearby wifi repeater
The ESPHome documentation warns you about the web server but not the captive portal. Leaving out bits of YAML is just part of the process of elimination for these sorts of problems.
I thought so but nope. The other ESP device has now started resetting with the same fatal exception when wifi is poor. Today it (pool-doser) experienced two restarts whilst the original device (pool-pump) experienced none… but more on that below…
They are both Sonoff devices and interestingly both seem to have the same ESP8285 chip. I found these photos online which match my models and revisions:
Anyway, since yesterday, I think I may have made a breakthrough. On both devices I had the fallback AP and captive portal enabled (although I don’t recall ever using them).
I had an idea that the crashes were occurring when the wifi dropouts enabled the ESPHome AP/Captive Portal. Yesterday morning I removed the AP config from one device (pool-pump) and left it on the other (pool-doser). Today I have had two reboots on the doser device and none on the pump… so far
I will need another week of running to test my hypothesis properly. But if I am correct, I wonder if there is a bug/issue with the AP/Captive Portal when running on ESP8285 chips in at least two Sonoff models
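The change being tested on pool-pump can be sketched as follows: the fallback access point and captive portal are simply left out of the config (shown commented here). The AP name and secret names are hypothetical. This illustrates the hypothesis under test, not a confirmed fix.

```yaml
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  reboot_timeout: 0s

  # Fallback hotspot removed for the test; without an ap: block the
  # device never switches to AP mode when the connection is lost.
  # ap:
  #   ssid: "Pool Pump Fallback"
  #   password: !secret fallback_ap_password

# Captive portal removed as well, since it depends on the fallback AP.
# captive_portal:

api:
  reboot_timeout: 0s
```

With the fallback AP gone, a wrong wifi password can no longer be fixed over the air, so recovery would need a serial flash.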