I think the next step in this investigation is to remove the integration and just monitor the hub with a ping sensor to see if the problem still occurs. @msp1974 I think you said that you have a second hub? Is this something you are able to test? If not, I can (reluctantly) remove the integration and test this.
Alternatively, rather than removing the integration (I spent a lot of time renaming entities etc.), is it possible to disable polling? I.e. if I select System Options and toggle ‘Enable polling for updates’ to off, will this stop the integration from polling the hub, which would effectively be the same as removing the integration?
I can, and I have set them up next to each other, bound to the same mesh node, with the same monitoring sensors. However, mine hasn’t dropped out for 5 days now, so whether I will see anything remains to be seen.
Anyway, they can stay like that for as long as is necessary.
Just to share my Wi-Fi setup with everyone also:
Router: 2 x Asus RT-AX92U
Mesh - Yes - Asus AiMesh supporting 1 x 5 GHz and 1 x 2.4 GHz
Connected on 2.4 GHz channel 13 (fixed, not auto)
No other Wi-Fi signals on this channel, and 2 Zigbee networks (this + 1), both on channel 11 - no issues
IPs: DHCP assigned static addresses
Hub Connections: Back of house node and bound to that node
Master Unit: Front of house
Node: Back of house connected over 5GHz reserved channel
House construction - mix of brick and partition walls (1920s house with mods)
Total of 29 things connected (25 on Wi-Fi, a mix of 5 GHz and 2.4 GHz).
The signal fluctuates between -70 and -82 dB. HA can only record the signal strength reported by the Hub when it is online, but the breaks in the purple line correspond to outages, which are red in the histogram above. I had two outages this morning - about 40mins at 2am and 40 mins at around 9am. I am on 3.1.7 and power cycled the hub a couple of days ago.
The outages don’t appear to correspond to large drops in the signal strength - it often stays connected at times when the signal is weaker either side of the outage. As mentioned previously, for a variety of reasons the signal strength isn’t ideal, so I don’t know that my case should be seen as representative.
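For anyone who wants to check this against their own exported sensor history rather than by eye, here is a minimal sketch. It assumes the signal readings are a time-ordered list with `None` wherever the hub was offline; the function name and the `window` parameter are made up for illustration and are not part of HA or the integration.

```python
from statistics import median


def rssi_before_outages(readings, window=3):
    """Median of the last `window` signal readings before each outage.

    `readings` is a time-ordered list of signal values, with None for
    samples where the hub was offline (assumed export format).
    """
    results = []
    recent = []  # readings seen since the last outage ended
    for value in readings:
        if value is None:
            # Only the first offline sample of an outage triggers a result;
            # `recent` is empty for the rest of that outage.
            if recent:
                results.append(median(recent[-window:]))
                recent = []
        else:
            recent.append(value)
    return results
```

If the values returned here are close to the median of all readings, that would support the impression that the outages are not preceded by a drop in signal.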
Oh - I’ve also started tracking the AP the hub connects to in case it is jumping around. I just did this, so waiting to see what happens at the next disconnection.
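In case it helps anyone doing the same, a rough sketch of how logged AP data could be turned into a list of AP changes. It assumes you can export time-ordered (timestamp, AP name) samples from your router or controller, with `None` when the hub is not associated with any AP; the function and field names are hypothetical.

```python
def ap_changes(samples):
    """Return (timestamp, previous_ap, new_ap) for every change of AP.

    `samples` is a time-ordered iterable of (timestamp, ap) pairs, where
    `ap` is None while the hub is not seen on any AP (assumed format).
    """
    changes = []
    previous = None
    seen_first = False  # the first sample has nothing to compare against
    for timestamp, ap in samples:
        if seen_first and ap != previous:
            changes.append((timestamp, previous, ap))
        previous = ap
        seen_first = True
    return changes
```

Lining these timestamps up against the outage times should show whether a disconnection coincides with the hub moving between APs.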
So, in between doing actual work, I have been trying to think why my hub may now be more stable and others not. The other thing I did was fix the channel rather than leave it on auto.
Anyway, I have been playing around with this and have twice now been able to make my hub lose its Wi-Fi connection and not reconnect (both times solid red light). The very weird thing is that my other ‘test’ hub, which is now sat on top of it, did drop its connection when the changes were made (as expected), but reconnected straight away.
So, what I did to cause it to happen: I was switching between 20 MHz and 20/40 MHz channel bandwidth and also switching between auto and fixed channels. I haven’t quite nailed exactly which changes made it drop off twice and not recover, but will keep working on that. For some of these changes it did recover quickly, like the test hub. Both times, putting it into and out of setup mode brought it back online (although one time I switched to setup and back within a couple of seconds and it did not recover; doing it again with 10 secs in setup mode before switching back did recover it).
So this leaves me wondering if this is maybe a channel hopping related issue. Many people seem to have a good Wi-Fi setup and these routers maybe do channel hopping in auto rather than just pick one at start up. Or it could be that when doing these changes, Wi-Fi goes off and on again and this kicking the hub off is causing it to not be able to reconnect in some instances.
For those having issues, 2 questions:
Do you have fixed or auto channel selection on 2.4 GHz?
Do you have 20 MHz or 20/40 MHz channel bandwidth set?
I think the most stable option is probably a fixed channel and 20 MHz bandwidth (more than adequate for most 2.4 GHz devices), which is also the most basic config. For the record, over the last 5 days with no drops, my router has been on 20/40 MHz but fixed on channel 13.
@jamiebennett maybe this might help isolate the issue. But why one does it and the other doesn’t, I have no idea. The test one has no Zigbee devices connected and is a 2 channel rather than the 1 channel of my main one; both have the same firmware version. Both are currently linked to my production HA running on 30s updates and are sat next to each other, bound to the same Wi-Fi mesh node. Also to note, when it comes back online via setup mode on/off, the TCP respawn count has increased on the main one. The test one still has 0 respawns.
Thanks for all the investigations. Channel hopping, mesh networks and handoffs, and DHCP lease renewals are a few areas we are exploring, but the comments around multiple connections have led us to believe that this is also an area where we can implement the code slightly differently, which should help. I personally will have a new build from the team shortly implementing a persistent connection approach to see if that helps, and will report back when I know more. The original implementation was done that way to be more efficient, but it doesn’t cater as well for the HA use case of opening connections every 30 seconds.
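To illustrate the difference from the client side only (this says nothing about how the hub firmware does it), here is a minimal sketch of a poller that keeps one HTTP connection open and reuses it, instead of opening a new one on every 30 second poll. It uses Python's standard `http.client`. The class name, the `/` path and the `connection_factory` hook are assumptions for illustration and are not the Wiser API or the integration's code. At one new connection per 30 second poll, a client would otherwise open 86400 / 30 = 2880 connections a day.

```python
import http.client


class PersistentPoller:
    """Polls a host over HTTP, reusing one connection where possible."""

    def __init__(self, host, port=80, timeout=5.0,
                 connection_factory=http.client.HTTPConnection):
        self.host = host
        self.port = port
        self.timeout = timeout
        self._factory = connection_factory  # swappable for testing
        self._conn = None
        # Counts connection objects created by this class, not TCP handshakes.
        self.connections_created = 0

    def _get_connection(self):
        if self._conn is None:
            self._conn = self._factory(self.host, self.port,
                                       timeout=self.timeout)
            self.connections_created += 1
        return self._conn

    def close(self):
        if self._conn is not None:
            self._conn.close()
            self._conn = None

    def poll(self, path="/", headers=None):
        """GET `path`; on a connection error, reconnect and retry once."""
        last_error = None
        for _ in range(2):
            conn = self._get_connection()
            try:
                conn.request("GET", path, headers=headers or {})
                response = conn.getresponse()
                return response.status, response.read()
            except (http.client.HTTPException, OSError) as exc:
                last_error = exc
                self.close()  # drop the broken connection before retrying
        raise last_error
```

Whether the connection really stays open also depends on the server allowing keep-alive, so treat this as a sketch of the idea rather than a drop-in replacement.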
Not sure if it’s any use or not, but I’m happy to set up a ping monitor against my hub to see if I get any dropouts or not. All I have is the hub and room thermostat. I haven’t installed any TRVs yet as I am waiting to fit new valves. I haven’t set up the HA integration yet either, and won’t until I have all my TRVs working with the main Wiser app.
I’ve not seen any sign of red lights on my hub, or any connectivity issues in the Wiser app on my phone, but haven’t been monitoring all day (obviously!).
I just thought it might help to see whether there are any dropouts with only the normal number of connections being opened.
I have auto channel selection turned off on my Access Points, and the channel width is set to 20 MHz.
The only time a channel change might happen in my network is if the hub switches AP for any reason (e.g. an AP goes offline, or there is enough interference in the channel that the hub decides to reconnect to a different AP). I am monitoring for the hub changing AP as of today.
No problem. I’ve set my ping monitor running against my hub and I’ll leave it running for the foreseeable. I’ve created email alerts should the host go offline, and I’ll keep an eye on it throughout the day tomorrow, and report findings as and when. I can leave it running for a few days to try to capture any drop outs. If anything is picked up I will post my relatively simple WiFi setup details and house construction etc as others have done.
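For anyone without a monitoring tool to hand, a minimal sketch of the same idea in Python. It assumes a Linux-style `ping` binary (the `-c` and `-W` flags as in iputils; other platforms differ), and the function names and the `min_failures` threshold are made up for illustration. Email alerting is left out.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=2):
    """True if one ICMP echo is answered (Linux iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def monitor(host, interval_s=30, samples=10):
    """Collect `samples` (timestamp, is_up) results, `interval_s` apart."""
    log = []
    for _ in range(samples):
        log.append((datetime.now(), ping_once(host)))
        time.sleep(interval_s)
    return log


def find_outages(samples, min_failures=2):
    """Turn time-ordered (timestamp, is_up) samples into outage intervals.

    Returns (start, end) pairs, where `end` is the first successful sample
    after the outage, or None if the host is still down at the end.
    Runs shorter than `min_failures` missed pings are ignored as noise.
    """
    outages = []
    start = None
    failures = 0
    for timestamp, is_up in samples:
        if not is_up:
            if start is None:
                start = timestamp
            failures += 1
        else:
            if start is not None and failures >= min_failures:
                outages.append((start, timestamp))
            start = None
            failures = 0
    if start is not None and failures >= min_failures:
        outages.append((start, None))
    return outages
```

Something like `find_outages(monitor("192.168.1.50", samples=2880))` would cover a day at 30 second intervals; the IP address is only a placeholder.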
You kind of need to fix your WiFi channel if you have Zigbee networks and have planned your installation to prevent overlapping and interference with them, otherwise it’s free to wander all over the place and do what it wants.
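As a rough way of checking a channel plan, here is a small sketch that tests whether a 2.4 GHz Wi-Fi channel overlaps a Zigbee channel. It uses the standard centre frequencies (Wi-Fi channels 1 to 13 at 2407 + 5n MHz, Zigbee channels 11 to 26 at 2405 + 5(k - 11) MHz) and assumes a simple model: a Wi-Fi channel is a 20 MHz block centred on the primary channel and a Zigbee channel is about 2 MHz wide. 40 MHz bonding is not centred on the primary channel, so it is not modelled here, and the function names are just for illustration.

```python
def wifi_center_mhz(channel):
    """Centre frequency of a 2.4 GHz Wi-Fi channel (1 to 13)."""
    if not 1 <= channel <= 13:
        raise ValueError("Wi-Fi channel must be between 1 and 13")
    return 2407 + 5 * channel


def zigbee_center_mhz(channel):
    """Centre frequency of a 2.4 GHz Zigbee channel (11 to 26)."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee channel must be between 11 and 26")
    return 2405 + 5 * (channel - 11)


def channels_overlap(wifi_channel, zigbee_channel,
                     wifi_width_mhz=20.0, zigbee_width_mhz=2.0):
    """True if the two channels' frequency ranges overlap (simple model)."""
    separation = abs(wifi_center_mhz(wifi_channel)
                     - zigbee_center_mhz(zigbee_channel))
    # Two ranges overlap when their centres are closer than the sum of
    # their half-widths.
    return separation < (wifi_width_mhz + zigbee_width_mhz) / 2
```

With this model, `channels_overlap(13, 11)` is `False` (Wi-Fi at 2472 MHz, Zigbee at 2405 MHz), whereas `channels_overlap(1, 11)` is `True`.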
There used to be a requirement from Wiser regarding Wi-Fi channels, but it seems to have been removed from the article where it was listed (or the article was replaced in 2021):
@jamiebennett, I tried to post above an article about a vulnerability that the Kasa smart plugs had. One plug was enough to take down all the other plugs on the same network. There was a set of commands to turn the plugs on and off as a group, and these could storm between devices: one plug started and the others joined in, also sending similar commands.
This issue was local network related as well, and was deemed a vulnerability.
Agreed, which is why in this case you need the secret key to be able to establish a connection with the Wiser hub. This key should only be known to the homeowner.
At Schneider Electric we take cyber security very seriously, which is why all our connected products go through a rigorous security evaluation both internally and externally.