Solution to wifi devices becoming unavailable when using Unifi network and access points

Hi all
Sharing a solution to a problem I was facing without realising it.

I began noticing lag, or outright failure, when some of my lights were turned on or off by automations. I have a mix of WiFi and Zigbee devices, but the impacted lights were primarily WiFi. Looking at my logs, I noticed a repeated and widespread pattern of my WiFi devices constantly flicking between available and unavailable. Checking the logs on my Unifi Dream Router revealed that my devices were dropping off the network and reconnecting.
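If you want to quantify the flapping before changing any settings, a small script can help. This is a minimal sketch, not anything official: it assumes you've exported (timestamp, state) samples for one entity (e.g. from Home Assistant's history panel), and the function name and one-hour window are my own choices.

```python
from datetime import datetime, timedelta

def count_unavailable_flaps(samples, window=timedelta(hours=1)):
    """Return the max number of drops to 'unavailable' inside any single window.

    samples: list of (datetime, state) tuples for one entity, oldest first.
    A 'drop' is a transition from any other state into 'unavailable'.
    """
    # Pair each sample with its predecessor and keep transition times.
    drops = [t for (t, state), (_, prev) in zip(samples[1:], samples[:-1])
             if state == "unavailable" and prev != "unavailable"]
    # Slide the window over the drop times and track the worst hour.
    best = 0
    for i, start in enumerate(drops):
        best = max(best, sum(1 for t in drops[i:] if t - start <= window))
    return best
```

A handful of drops per hour across many WiFi entities is a good sign the problem is network-side rather than one faulty device.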

Some quick Copilot queries and Google searches pointed me to the following:

  1. Block Tuya devices from the internet
  2. Disable cloud integration from LocalTuya integration
  3. Disable Fast Roaming on my Unifi router advanced manual wifi settings

I did all three without luck.
The last thing I tried was disabling BSS Transition. Fingers crossed, but I think that has fixed things!

I’ve applied these settings to the 2.4 GHz SSID that handles my IoT devices. You can probably leave Fast Roaming and BSS Transition on for your main 5 GHz network (phones, tablets, PCs) if needed.

I’ll report back if I continue to see issues.


Thanks for posting that. I’ve gone ahead & applied those suggestions to my Unifi IOT VLAN / SSID (except Tuya; no need here).

In addition, I noticed another setting that could be enabled for that network in Unifi: “Enhanced IOT Connectivity”, & I enabled that one also. Description says “Limits the network to 2.4 GHz only with special compatibility settings”.

I’m interested to see if any of that helps.


So, since my original post, I’ve found ongoing issues with what I thought was a solid fix. Using Gemini to analyse my Unifi server.log files, I’ve identified the following:

The root cause for your devices showing short connection times (under 20 minutes) appears to be a combination of Controller Resource Exhaustion (Locking Failures) and State Tracking Loss, rather than the devices actually physically disconnecting from the WiFi.

1. Root Cause of Short Connection Reporting: Controller Locking Failures

The most pervasive issue in your recent logs (December 2025) is the controller’s inability to process device statistics in time.

  • The Error: The logs are flooded with com.ubnt.service.lock.FailedToAcquireLock errors originating from DeviceStatProcessor$InformStat.
  • What this means: Your Access Points are sending status updates (“informs”) to the controller, but the controller is too busy or “locked” to process them before they time out.
  • Why this causes short connections: If the controller cannot process the “I am still here” heartbeat from an AP or client, it assumes the device has gone offline. When a subsequent heartbeat finally gets through, the controller treats it as a new connection.
  • Result: The User Interface reports frequent disconnects and short session times (<20 min), even if the client device never actually dropped its WiFi signal.
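To check whether your own controller shows the same pattern, you can tally those lock errors per hour. This is a minimal sketch that assumes server.log lines begin with a `YYYY-MM-DD HH:MM:SS` timestamp — adjust the regex if your log's layout differs.

```python
import re
from collections import Counter

# Assumed timestamp layout at the start of each server.log line.
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}):")

def lock_errors_per_hour(lines):
    """Count FailedToAcquireLock occurrences, bucketed by hour."""
    hours = Counter()
    for line in lines:
        if "FailedToAcquireLock" in line:
            m = STAMP.match(line)
            if m:
                hours[m.group(1)] += 1  # key like '2025-12-01 10'
    return hours
```

If the counts spike in particular hours (e.g. when backups or DPI aggregation run), that points at controller load rather than the WiFi itself.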

2. Loss of Client Tracking (“AP Null” Errors)

Directly correlated with the locking issue is the controller losing track of where clients are connected.

  • The Log Entry: Repeated hundreds of times: Wireless Client 'xx:xx:xx:xx:xx:xx' disconnected, but no ap 'null' found.
  • Analysis: When a client disconnects, the controller usually logs “Client disconnected from AP [Name]”. In your case, it says ap 'null'. This confirms the controller had already lost the state of that client before the disconnect event occurred. It didn’t know which AP the client was on, further confirming the “Ghost Disconnect” theory caused by the processing lag mentioned in point #1.
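Similarly, you can count the "ghost disconnect" entries per client MAC to see which devices are worst affected. A sketch based on the message text quoted above — the exact wording may vary between controller versions, so treat the regex as an assumption:

```python
import re
from collections import Counter

# Matches lines like:
#   Wireless Client 'aa:bb:cc:dd:ee:ff' disconnected, but no ap 'null' found
GHOST = re.compile(
    r"Wireless Client '([0-9a-f:]{17})' disconnected, but no ap 'null' found"
)

def ghost_disconnects(lines):
    """Tally ghost-disconnect log entries per client MAC address."""
    return Counter(m.group(1) for line in lines if (m := GHOST.search(line)))
```

If the same few MACs dominate the tally, cross-reference them against the chattiest devices in the controller's traffic view.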

3. Historical Root Cause: “Locked AP” Triggering Mass Provisioning

In the older logs (April–August), a specific configuration issue was causing network-wide disruptions.

  • The Event: Device[available] changed state to 'd0:21:f9:e0:36:1c' and there are some clients locked to it, provisioning all Access Points in Site.
  • The Problem: You have specific clients configured to be “Locked” to a specific AP (MAC ending in 36:1c). When that AP fluctuates or changes state, the controller forces a re-provision of all Access Points in the site to update their lock lists.
  • Impact: Re-provisioning an AP drops all its wireless clients. If that one AP is unstable, it takes down the entire network repeatedly.

Some more suggested fixes:

What This Means

The controller is trying to process statistics from this device (CPU usage, traffic stats, DPI data), but the database is “locked.” Because this device is the most chatty, it is the one hitting the wall most often.

  1. It creates a bottleneck: While the controller is stuck trying to process this device’s data, it ignores updates from your Access Points.
  2. It causes the “Disconnects”: When the AP updates are ignored, the controller thinks clients have disconnected.
  3. The device itself is likely fine: The hardware of 70:a7:41:ea:d0:53 is likely functioning correctly (routing traffic), but its relationship with the management software is broken.

Recommended Fix

Since this device is the source of the lockups, you need to reduce the load on the controller immediately:

  1. Restart the Console: You must restart the Cloud Key or UniFi Console to clear the stuck database locks.
  2. Turn Off “Traffic Identification” (DPI): If you are on an older controller or hardware, go to Settings > Security > Traffic Identification and turn it off temporarily. This stops this device from sending deep packet inspection data, which is the most resource-intensive task for the database.
  3. Check Data Retention: Go to Settings > System > Advanced > Data Retention. Lower the retention settings (e.g., set “5 Minutes Granularity” to “1 Hour” or “Keep Data for” to “7 Days”) to reduce the database size.

Another update: it looks like forcing all of my access points onto WiFi channel 11 was causing the connection issues (or at least I think it was). I’ve changed it to auto mode now and the network seems more stable. I have a Zigbee network of over 100 devices on Zigbee channel 15, so I had moved my WiFi to channel 11 to reduce the risk of interference.

Yes, if you’re creating a mesh with several wireless APs, they should be on different channels but share the same SSID and password.

Of course these days finding 3 channels that aren’t already flooded by the neighbours in your apartment building or office is a major challenge :frowning_face:

I’m not sure if this is related, so I don’t know if I should post a new thread and/or a GitHub issue. But I’m finding my HA 2025.12.4 loses contact with the Unifi controller once or twice every 24 hours and has to re-establish the connection. This causes all of my Device Trackers to go Unavailable, which breaks various automations.

I’m not sure how best to debug the issue; the Unifi Controller itself isn’t having any issues as best as I can see (it’s version 10.0.162, hosted in an LXC container on the same Proxmox server as the HA VM, so there’s not even any external networking involved).

Keen for any pointers - it’s been flawless for the last ~14 months, but just in the last 2 months I’ve noticed this problem (i.e. it began before 2025.12).

Setting your WiFi to auto will do the opposite: as soon as your APs decide to use the lower channels, you’ll start getting interference.
At the very least, set your channel width to 20 MHz for the 2.4 GHz SSID, so that WiFi will only affect your Zigbee channel 15 network if the chosen channels are 2–5.
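To make the channel arithmetic concrete, here’s a small sketch using the standard centre-frequency formulas (WiFi channel n → 2407 + 5n MHz for channels 1–13; Zigbee channel n → 2405 + 5·(n − 11) MHz for channels 11–26). With a 20 MHz WiFi width it reports exactly channels 2–5 as overlapping Zigbee channel 15, while channel 11 is clear.

```python
def overlapping_wifi_channels(zigbee_ch, wifi_width_mhz=20):
    """Return 2.4 GHz WiFi channels (1-13) that overlap a Zigbee channel.

    Treats a WiFi channel as +/- width/2 MHz around its centre and a
    Zigbee channel as roughly +/- 1 MHz around its centre.
    """
    zb_center = 2405 + 5 * (zigbee_ch - 11)
    half = wifi_width_mhz / 2 + 1  # WiFi half-width + Zigbee half-width
    return [ch for ch in range(1, 14)
            if abs((2407 + 5 * ch) - zb_center) < half]
```

Note this is a rough adjacency test on nominal channel edges, not an RF guarantee — real-world spectral masks bleed a little wider.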

I’m not sure from your description if this is the same thing, and I’ve refrained from posting in this forum until now because I have no real evidence of when, why, or how this happens.

But…

For the last six months or so I get occasional complete network disconnects of HA with the only solution being to pull the ethernet cable (or restart the port from the controller).

HA is still running perfectly well but is just unreachable on the network.

HA is updated every month so this is not version specific (except possibly that it started suddenly around May/June 2025, having been rock solid for many years)

HA is on a VM in a Proxmox machine.

All network equipment is UniFi.

I tried the obvious like new ethernet cable, different port on the switch, different actual switch…

Searching appeared to point to some kind of network ‘storm’ and further searching pointed towards a ‘known problem’ with some network cards on some Intel NUCs which included mine.

I removed some network heavy integrations (notably Speedtest) and I moved my PiHole VM off Proxmox onto a RPi.

I ended up with just the one HA VM on Proxmox but it still kept happening.

I was due an upgrade(!) so the nuclear option was a new server.

HA now runs (still alone for the time being) in a Proxmox VM on a Dell with a ‘problem free’ NIC.

But I still get (less frequent) network disconnects.

It is more than a little annoying, and I’m just thankful that if I’m away I can at least reset the network port using the UniFi Network app.

As no one else has, to my knowledge, brought this up until now, I find it hard to believe that it is an HA issue. But on the other hand, I seem to have ruled out as much as I can.

Any ideas??


I hesitate to say, as it could be a huge red herring, but recently I think I’ve noticed that disconnects only seem to happen when I’m using VSCode on my PC to edit HA YAML files.
Could it possibly be connected to the HA VSCode extension?

Just following up to say that actually this issue is resolved for me.
I’m fairly sure this was to do with a bug in the Proxmox Kernel I was using:

revert TCP changes in 6.17 that causes connection stalls on some setups

Since moving off this kernel my issue has stopped (I run Unifi in an LXC, so this kernel would have been handling its TCP sessions).

@klogg Your issue sounds like the well known Intel e1000 bug where the card crashes/hangs. Nothing to do with Unifi specifically. The workaround is to disable all offloads for the card - there’s hundreds of pages dedicated to discussing it on various forums.
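For reference, the commonly reported workaround for the e1000e hang is to disable hardware offloads with ethtool and persist that in the Proxmox host’s network config. This is a sketch of that widely discussed fix, not official guidance; the interface name `eno1` is a placeholder — check yours with `ip link`.

```shell
# One-off (resets on reboot) -- replace eno1 with your actual interface:
#   ethtool -K eno1 tso off gso off gro off
#
# Persistent on Proxmox: add a post-up hook in /etc/network/interfaces, e.g.
#
#   iface eno1 inet manual
#       post-up /usr/sbin/ethtool -K eno1 tso off gso off gro off
```

Disabling TSO/GSO/GRO costs a little CPU but sidesteps the NIC firmware hang that takes the whole link down.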
