Strange (Recent) Behaviour with Tasmota - Reboots every 3 hours

So, a few weeks ago, a whole bunch of changes happened on my network at once. Unifi updated all my APs and switches, I upgraded Home Assistant - and I switched to the “new” methods of MQTT for my Tasmota devices.

Soon afterwards, all my MQTT devices (including those that do not run Tasmota) started going haywire. They constantly disconnected from the network, and appeared to reboot often. (Unfortunately, my Proxmox system with Syslog and Splunk died around the same time so I lost my logs).

I suspect it was a change in the Wifi from Ubiquiti, because my ESP8266 devices that do not run Tasmota were also going haywire. Plus, the ESP8266 devices in my shop, which is some distance from the house and has only one AP, were more stable and crashed less.

After 3 weeks of troubleshooting, including rebuilding my Wifi network names/passwords, replacing my router (which handles DHCP) and finally upgrading all my Tasmota devices from versions 6, 7, 8, 9 and 10 to 12.1.1, the systems were still pretty much dead.

Finally, I disabled MQTT on Home Assistant, and re-scanned my devices with the TasmoAdmin application (which is awesome!).

(You can see version/uptimes here)

For each device (except one as a smoke test) I upgraded them to 12.1.1.5 of Tasmota (current) and then executed a “Reset 4” to wipe everything from the device but the Wifi network/password and then re-configured them.

Things are now, finally, mostly stable. The MQTT devices “work” in TasmoAdmin now, and don’t constantly drop off the network.

However, none of them stay up for more than 3 hours. Once they hit 3 hours of runtime, they crash and restart. Fortunately, the restart is very quick; you would not notice unless you were looking.

Still, I am really concerned. Why are my Tasmota devices rebooting? I still have one that is running v 10.0.0 because I am testing if it is a version issue - and it seems not to be.

Any ideas? Why do they never stay up for more than 3 hours before reset?

(meanwhile, I will rebuild my Proxmox systems, and get syslog and Splunk back up to try to get logs).

Downgrade your UniFi APs back to firmware v.6.0.21.13673.

All UniFi AP firmware versions released after the above, up to v.6.2.39, are known not to work well with many devices (not only IoT devices). Just check the relevant threads on their forum.

They claim to have fixed this with v.6.2.41 which is not officially released yet (release candidate):
[UAP] Fix band steering causing certain clients to disconnect and reconnect to 2.4GHz in a loop.

I did learn about this the hard way too. Luckily downgrading UniFi firmware is a breeze.


That sounds EXACTLY like what I have happening.

My ESP8266 devices were rock solid before this.

It has had a huge negative impact on the WAF.

Ik, that’s certainly a big danger zone! :crazy_face:

I’m on v6.2.4.41 - I’ve not seen any of these issues with previous versions but then I do not use band steering as all my IOT stuff sits on a 2.4Ghz WIFI network on my access points.

I would suggest (if you haven’t) that having a pure 2.4Ghz WIFI network and turning off band steering would assist. I have a mixture of ESPHome/Tasmota devices and they have been rock solid.

Thx for the info. I have updated 1 UAP nanoHD to v6.2.4.41 earlier today. Will see whether UniFi has really sorted it out. All other 6 APs will be kept at v.6.0.21 until I’m sure that those constant reconnections have been fixed.

On a 2 storey concrete quite spacious house not using band steering is not an option. 5 Ghz is switched off with the VLAN for the IoT devices.

Besides band steering and a VLAN for IoT, no fancy settings here. All is kept pretty much at default. The individual forum threads for the fw versions from > v.6.0.21 up to < v.6.2.41 speak for themselves. Plus I have had my own bad experiences with connection issues on v.6.2.39.

I live in a 2 storey stone cottage built in 1662 where the walls range from 4ft to 8ft thick - so as you can imagine signal was an issue initially. With my AP’s everything is happy - just wondering why the need for band steering as none of my 5Ghz devices ever connect at 2.4Ghz - even the phones which roam from AP to AP.

Idk what UAP models you have installed and what your network looks like. Here we have all Gen1 and Gen2 devices which work rock solid with v.6.0.21. No dis-/reconnects at all with all kinds of devices. Since I am not a must-have-bleeding-edge-fw type, and we all know that Ubiquiti has turned its user base into beta testers with their low quality sw for ~2 years now, I feel absolutely no urge to jump onto the latest. Let’s let others do the testing :wink:

Well, a downgrade in Unifi’s AP versions is absolutely warranted. I already have a 2.4 ghz IoT only network with roaming and all the other anti iot-features disabled.

After midnight, I must have fed the gremlins because:

Looks like they are breaking again.

Every one that shows up red is having issues communicating.

So, I figured I would give the beta path a try before going to a full downgrade, but v6.2.4.41 does not seem to be helping. The devices are also on a dedicated IoT network that is 2.4 GHz only, with no roaming.

Will see if this old version helps. Fingers crossed.

So far it isn’t looking very good.

:frowning:

I have ordered some Z-Wave switches because at this point, the WAF is very, very low. Ironic, given that she did not care much about these at first. After 3 years of flawless automations, now that they are gone, the WAF needs an uptick.

So latest test is to go back to my “static” MQTT configurations, and disable auto-discovery:

mqtt:
  discovery: false

Well, I just rolled back from v.6.2.41 to v.6.0.21 with that one UAP nanoHD I have used for testing the latest fw.

While the constant disconnections have been remedied, at least from what I have noticed during the last 20 hours, roaming between different APs is still crippled. This is true not only for various Android and iOS devices but also for the Roborock vacuum bot, which loses its WiFi connection while roaming between APs when coming back to that v.6.2.41 AP. The handover doesn’t work at all, which results in “Error 8: Roborock is trapped or stuck” even though the device is in open space.

Various user comments about v.6.2.41…

All taken for granted until it stops working. Then all hell breaks loose. Similar here :stuck_out_tongue_winking_eye:
Hope you will get it sorted out soon. Rebooting/restarting all UniFi gear after updating/downgrading devices sometimes helps. Don’t forget the Cloud Key!

As @Tamsy has said - often a restart after an update is recommended. I have seen odd behaviour with a single WLED ESP32 after some f/w updates - and a reboot of the AP has always remedied this (including the venerable 6.0.21).

Everything is ticking along perfectly here - just a few of my Tasmota devices :

I have rebooted all my APs. What is interesting is that no matter what I do, none of my devices stay online for more than about 3 hours. It’s not exact, sometimes they get up to 3:20, but it is rare:

I have a dedicated IOT network in Unifi and I have been adjusting the settings through various options to try to find something that works:

Hi,

Here’s mine which is working without the 3 hour issue. I see you have band steering and UAPSD enabled. Maybe it’s UAPSD?

…and there is this (off on mine)


I only just turned on UAPSD, I have been running through various settings to try to find something that works. I just reset mine to mirror yours - but based on my tests unfortunately I don’t expect to see any changes. :frowning:

I so wish my Splunk/Syslog systems did not blow up around the same time, at least I would have a log history to go through. Now I have to use the weblog feature of Tasmota - but it does not seem to show much. Though I am getting fewer client disconnects so far - but the uptime still is a major issue.
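Until proper logging is back, one stopgap is to sample each device's uptime and note when it goes backwards, since that marks a reboot. The following is only a minimal sketch: it assumes the uptime string has the form `0T03:05:12` (days, then hh:mm:ss) as Tasmota reports it in its status output, and the function names `uptime_to_seconds` and `rebooted_between` are made up for illustration.

```shell
#!/bin/bash

# Convert a Tasmota-style uptime string such as "0T03:05:12"
# (days, "T", then hh:mm:ss) into a number of seconds.
uptime_to_seconds() {
  local up="$1"
  local days="${up%%T*}"   # everything before the first "T"
  local hms="${up#*T}"     # everything after the first "T"
  local h m s
  IFS=: read -r h m s <<< "$hms"
  # 10# forces base 10 so values like "08" are not read as octal
  echo $(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Succeeds (exit 0) if the second sample is lower than the first,
# which means the device restarted between the two samples.
rebooted_between() {
  local prev cur
  prev=$(uptime_to_seconds "$1")
  cur=$(uptime_to_seconds "$2")
  (( cur < prev ))
}
```

The samples themselves could come from the device's HTTP command endpoint, for example `curl -s "http://<device-ip>/cm?cmnd=Status%2011"`, picking out the `Uptime` field. `Status 1` also reports a `RestartReason`, which may say more about why the devices restart than the weblog does.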

So I fat fingered a command, and instead of restart 1 I sent reset 1 to all the devices that were communicating.

OOPS.

I had to reconnect to each tasmota-1233455 network and re-enter the credentials.
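For anyone pushing commands to a whole batch of devices, a small guard can catch this kind of slip before it goes out. This is a hypothetical sketch, not part of any existing tool: `is_destructive` and `confirm_command` are invented names, and it only treats `Reset` as destructive (Tasmota command names are not case sensitive, so the check lowercases first).

```shell
#!/bin/bash

# Succeeds (exit 0) if the command name is "reset", in any letter case.
is_destructive() {
  local name
  # read splits on whitespace, so leading spaces are ignored
  read -r name _ <<< "$1"
  name=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
  [ "$name" = "reset" ]
}

# Prints the command on stdout if it is safe to send.
# For a destructive command, asks for a typed confirmation first
# and returns 1 without printing anything if it is not given.
confirm_command() {
  local cmnd="$1" answer
  if is_destructive "$cmnd"; then
    printf 'About to send "%s", which wipes settings. Type RESET to continue: ' "$cmnd" >&2
    read -r answer
    [ "$answer" = "RESET" ] || return 1
  fi
  printf '%s\n' "$cmnd"
}
```

A bulk sender would then only forward what `confirm_command` prints, so `restart 1` passes straight through while `reset 1` stops and asks.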

In order to speed up the process of configuring, I wrote this script, which could be so much better, but it saved me time.

#!/bin/bash

# Script to auto-populate a Tasmota Backlog command for one device
# Oct 14 2022

# Fill these in before running
mqttpassword=""
mqttuser=""
dns1=""
dns2=""
mqttserver=""
ntpserver1="0.pool.ntp.org"

# Mac OS X needs \033 for colours
# Linux uses \e

echo -e "\033[93mEnter name of device\033[39m"
read -r tasmota_name
echo -e "\033[93mYou entered\033[39m $tasmota_name"
echo
echo -e "\033[93mEnter Device Module type as a number\033[39m"
echo "For a Shelly 1 - enter the number 46"
echo "For a Sonoff S31 - enter the number 41"
echo "For a Generic - enter the number 18"
read -r module_number
echo
echo -e "\033[93mYou entered\033[39m $module_number"

echo
echo
echo -e "\033[93mHere is your backlog command for Tasmota\033[39m"
echo
# Printed as a single line so it can be pasted into the Tasmota console as-is
echo "Backlog MqttHost $mqttserver; MqttUser $mqttuser; MqttPassword $mqttpassword; IPAddress4 $dns1; IPAddress5 $dns2; NtpServer1 $ntpserver1; DeviceName $tasmota_name; FriendlyName $tasmota_name; Module $module_number; Topic $tasmota_name; MqttClient $tasmota_name"

I am adding that NTP server in there - because I actually have my own NTP server, but saw some errors that are related to NTP.

After configuring everything again, I ran a “restart 1” properly this time, and the hosts still seem to be dropping off the network. We will see if they get the reboot issue after being accidentally reset to factory defaults.

Well, I figured there would be something better out there, and there is:

I use Ansible a lot, so this is great!

Unfortunately, most of my Tasmota devices are still dying, and the WAF suggests I am going to have to replace all my Shelly 1’s with Lutron’s and my Sonoff S31’s with Zwaves. :frowning: