Zigbee network configuration: HA tools and stability of TRV connections

The Problem

After suffering frequent intermittent errors with TRVs, I tried to study the Zigbee network configuration, but found it hard to understand how the network configuration works and how to access it through HA.

My Zigbee network

I have a ZHA network with

  • ten Aqara E1 TRVs
  • eleven generic Chinese TS0601 TRVs
  • six more battery-powered end devices such as door and window detectors.

The network is in a house with thick walls, so it has an ample number of Zigbee repeaters:

  • six IKEA TRADFRI repeaters
  • an MHCOZY 4-way mains-powered relay switch, used to operate the zone valves
  • ten more Zigbee mains-powered smart plugs and switches of various brands that also act as repeaters.

:hammer_and_wrench:

The HA tools

… are hard to use…

  1. For Zigbee devices, in the section reconfigure → manage network, there is some information, including a list of ‘neighbours’, each with a number that I take to be a measure of signal strength. There is, however, no indication of which neighbours are actually being used to connect to the network. UNANSWERED

  2. There does not seem to be any way of simply asking a Zigbee device what next device (repeater) it is connected to, or which are connected to it (unless that is the definition of a ‘neighbour’?). The device main page always just says the device is connected to the main controller. UNANSWERED

  3. The only way of analysing the topology seems to be with the ‘visualisation’ feature – a great idea but hard to use. My diagram fills more than a page: if I zoom in I lose context; if I zoom out everything gets obscured by labels. There used to be a search box, but it has been withdrawn in the latest version of HA for some reason, so finding a given node is difficult. How do you find a node now? UNANSWERED

  4. The ‘physics’ feature no longer works. I think that used to label the lines with a signal strength and colour them, but now they are all blue with no numbers. ANSWERED: instead of the physics feature, one can hover over a line.

  5. It is not even clear that the visualisation feature is a correct representation of the actual topology. I found a TRV connected for no apparent reason to a repeater two rooms away rather than one in the same room. I tried turning off the repeater that was further away to see what would happen. Several minutes later the TRV was still online and still responding fine, but the visualisation feature (after several refreshes) still claimed that it was connected to the powered-off repeater! UNANSWERED

  6. If the visualisation is to be believed, the network IS reconfiguring itself dynamically. When I recently added a few more repeaters, they were ‘absorbed’ into the network within a couple of hours. That is to say, they had taken a role in the network by connecting to other devices. However, the logic looks bonkers! Interconnections look arbitrary rather than based on each device connecting to the next nearest node. UNANSWERED

  7. Wondering if the Zigbee network was not updating quickly enough I tried changing the network refresh rate temporarily from 7200 seconds to 60 seconds. This had the unexpected effect of taking all the battery-powered devices offline altogether! I changed it back, then they rejoined after a few minutes. What happened here? UNANSWERED

  8. I have looked in vain for ways to force or encourage a device to connect to the nearest repeater. Most sources say remove the device and then add it back in, but it seems (if the visualisation is to be believed) that the re-added device always insists on connecting to the same repeater as before, even when that repeater is turned off! SOLVED

  9. When I tried removing, physically moving, then re-adding a repeater with a new name, it took all the end devices with it. Apparently none of them tried to find a closer repeater; they seem to ‘stick with what they know’ even after a name change! UNANSWERED

  10. When the nearest repeater goes offline, the 4-way zone relay goes offline too, even though there are other repeaters well within range. UNANSWERED but no longer relevant; I swapped it for a WiFi switch.

:woozy_face:

My questions

See the 10 November post below →

Are you using an extension cable for the coordinator (if it is USB)?

Thanks for the idea.

Yes it is a USB SONOFF controller stick, placed about 2m away from the computer (a Dell OptiPlex 7080 Micro), which does not have a WiFi card fitted.
There are no other emitting devices nearby. I am not using Bluetooth.
The house has a TP-Link WiFi mesh whose nodes are nowhere near the computer or Zigbee controller. The computer is connected to the local network by Ethernet cable.

Hello AndySymons,

The next question is: are your Zigbee channels fighting with your WiFi channels?

More things to try in some of the posts here: The Home Assistant Cookbook - Index

Older Aqara devices (using Zigbee 1.2 instead of Zigbee 3.0) were known to be hard headed and would try to maintain a connection to the first device you paired them to. Remove them from the network, find the closest zigbee device for each one, then click “add devices via this device” in the router device page. That should sort out your issue.

Hover over the lines. You will see the signal strength.

For what it’s worth, I’ve got a couple of bulbs which literally blew up and have been disconnected since last winter. ZHA visualisation still shows a connection with an LQI of 45 to the coordinator and 131 between each other after all these months!

They should not be because the Zigbee controller is set to channel 25, which should be above any WiFi channel.

You seem to think it is a connectivity problem. Is that just a hunch or have you seen this sort of problem with TRVs before?

I don’t think it is a connectivity problem because (mostly) the TRVs are online and a manual setting is immediately reflected on the dashboard. My own automation tries five times at two minute intervals if a new setting is not confirmed by the actual reading, so that should overcome intermittent communication problems anyway.
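For anyone wanting to reproduce the retry idea, here is a minimal Python sketch of it. It is not the actual automation (which is not shown in this thread); `send_setpoint`, `read_setpoint` and the `FlakyTrv` test double are hypothetical names used only for illustration, and the five attempts and 120-second interval are simply the figures quoted above.

```python
import time
from typing import Callable


def set_with_retries(
    send_setpoint: Callable[[float], None],
    read_setpoint: Callable[[], float],
    target: float,
    attempts: int = 5,
    interval_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Send `target` until the device reports it back, at most `attempts` times."""
    for _ in range(attempts):
        send_setpoint(target)
        sleep(interval_s)  # give the device time to report its new setpoint
        if abs(read_setpoint() - target) < 0.01:
            return True  # confirmed by the reading
    return False  # never confirmed: worth logging or alerting


class FlakyTrv:
    """Hypothetical stand-in for a TRV that ignores its first `drops` commands."""

    def __init__(self, drops: int) -> None:
        self.drops = drops
        self.setpoint = 18.0
        self.commands_seen = 0

    def send(self, value: float) -> None:
        self.commands_seen += 1
        if self.commands_seen > self.drops:
            self.setpoint = value

    def read(self) -> float:
        return self.setpoint


if __name__ == "__main__":
    trv = FlakyTrv(drops=2)
    # Skip the real two-minute waits for the demonstration.
    print(set_with_retries(trv.send, trv.read, 21.0, sleep=lambda s: None))
```

A loop like this masks short dropouts, but it cannot help if the device stays unreachable for longer than the whole retry window (here about ten minutes).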

That is very helpful. Will try that in situations that look particularly bonkers. After doing that, will it still re-assign the connection if the node goes offline, or a new node is added with stronger signal strength?

If it’s a zigbee 3.0 device, it should. If it’s an older zigbee 1.2 device, it probably won’t

The coordinator is a SONOFF Zigbee 3.0, so I suppose it is 3.0, although the HA device info does not say so:

Nwk: 0x0000
Device Type: Coordinator
LQI: Unknown
RSSI: Unknown
Last seen: 2025-11-06T16:34:12
Power source: Mains

For other devices, how can I tell?
The Zigbee info for an IKEA repeater looks like this, with no mention of the Zigbee level:

Nwk: 0x9e9f
Device Type: Router
LQI: 90
RSSI: Unknown
Last seen: 2025-11-06T18:01:25
Power source: Mains

The Zigbee info for a TRV end device looks like this, also with no mention of the Zigbee level:

Nwk: 0x237c
Device Type: EndDevice
LQI: Unknown
RSSI: Unknown
Last seen: 2025-03-17T02:49:19
Power source: Battery or Unknown
Quirk: zhaquirks.tuya.ts0601_trv.MoesHY368_Type1

Google the model number and find the official product page. If it’s 3.0, they’ll make a huge deal about it.


TS0601 is a generic name that is used by several Chinese products, so there is no ‘official’ page. The advert on AliExpress says it is 3.0.

The Aqara E1 is Zigbee 3.0.

Update 10 November

Following comments on this page and research elsewhere, I gathered some tips that might or might not be correct. Working with the as yet unverified theory that the problem is with the Zigbee network rather than the TRVs (or HA software), I today tried the following:

  1. I moved the Zigbee coordinator to the centre of the house. It is connected by an extension cable to the computer running HA and about 2m from the box.
  2. One source recommended connecting the SONOFF Dongle to USB2.0 instead of 3.0. When I did that the computer crashed and could not be restarted! That is weird and inexplicable, so I put it back to USB3.0 and it works again.
  3. I think that recommendation was based on USB3.0 emitting more interference than USB2.0 – is that true? In any case, the USB ports are next to each other, so changing socket does not move the dongle away from the emissions. I assume that using an extension lead, rather than replugging, is the sensible solution.
  4. One source says that Aqara TRVs ‘prefer’ middle or lower channels to the higher ones … so I tried changing the Zigbee channel from 25 to 15. This channel is recommended by several sources as being between WiFi channels 1 and 6 (and well away from 11)
  5. In Zigbee network settings, I reduced the wait time for battery devices from 7200 to 300 seconds (5 minutes). My theory is that the Visualisation diagram was mostly wrong because it was 2 hours out of date. Still haven’t really understood this. I previously tried an even lower number, but then it cut off all Zigbee devices.
  6. The house already has at least one Zigbee repeater in every room - either an IKEA Tradfri or a smart plug of some description. Can there be too many repeaters in a room??
  7. I rebuilt the Zigbee network by adding repeaters and devices using the “Add devices to this device” feature for a manual configuration. I started at the coordinator and added the nearest repeaters. To those I added the next nearest. Finally, I added the TRVs and other battery Zigbee devices to the repeater in the same room.
  8. It is a 19th-century house with thick walls. By checking the LQI levels I was able to confirm my theory that transmission is better vertically through a floor/ceiling than horizontally through walls. I cannot find any sources on what is an acceptable LQI, or on whether it measures the link to the next device or all the way to the coordinator.
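
The channel reasoning in item 4 (and the earlier claim about channel 25) can be checked with a little arithmetic. The sketch below is only that, a sketch: it uses the standard 2.4 GHz centre frequencies (Zigbee channel n at 2405 + 5·(n − 11) MHz, WiFi channel n at 2407 + 5·n MHz for channels 1 to 13), treats a Zigbee channel as 2 MHz wide, and assumes a 22 MHz WiFi channel width. Real WiFi transmitters spill energy outside that nominal width, and I do not know which channels the TP-Link mesh actually uses, so read the result as a rough guide.

```python
from __future__ import annotations

ZIGBEE_WIDTH_MHZ = 2   # nominal width of one Zigbee channel
WIFI_WIDTH_MHZ = 22    # assumption: classic 22 MHz-wide WiFi channel


def zigbee_centre_mhz(channel: int) -> int:
    """Centre frequency of a 2.4 GHz Zigbee channel (11 to 26)."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels run from 11 to 26")
    return 2405 + 5 * (channel - 11)


def wifi_centre_mhz(channel: int) -> int:
    """Centre frequency of a 2.4 GHz WiFi channel (1 to 13)."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch only covers WiFi channels 1 to 13")
    return 2407 + 5 * channel


def overlaps(zigbee_channel: int, wifi_channel: int) -> bool:
    """True if the two nominal bands share any spectrum."""
    gap = abs(zigbee_centre_mhz(zigbee_channel) - wifi_centre_mhz(wifi_channel))
    return gap < (ZIGBEE_WIDTH_MHZ + WIFI_WIDTH_MHZ) / 2


def overlapping_wifi_channels(zigbee_channel: int) -> list[int]:
    """WiFi channels whose nominal band overlaps the given Zigbee channel."""
    return [w for w in range(1, 14) if overlaps(zigbee_channel, w)]


if __name__ == "__main__":
    for z in (15, 25):
        print(z, zigbee_centre_mhz(z), overlapping_wifi_channels(z))
```

Under these assumptions, Zigbee channel 15 (2425 MHz) clears WiFi channels 1 and 6 but overlaps 2 to 5, and Zigbee channel 25 (2475 MHz) clears WiFi channels 1 to 11 but overlaps 12 and 13, which are permitted in some regions. So it is worth checking which channels the WiFi mesh has actually picked.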

Result?

None of this made any difference at all. :woozy_face:
It all works as it did before, which is to say most of the time but not reliably enough for a satisfactory automatic heating system.

Tearing our hair out now! :nauseated_face: At wits’ end! :crazy_face:

Any ideas? Is it even a Zigbee network problem? TRVs are giving the biggest difficulties – devices such as door closure sensors do not seem to be as unreliable…

:thinking:

Open questions

(Some new, some reformulations…)

  1. Does it make any difference whether a SONOFF Zigbee 3.0 Dongle is plugged into a USB2.0 or 3.0 port?
  2. Why would plugging the SONOFF Zigbee 3.0 Dongle into a USB2.0 port cause a Dell Optiplex computer to crash and not restart?!?
  3. Does rebuilding the network topology manually mean (a) it stays like that forever and a section will fail if a repeater fails; (b) it will reconfigure itself if there is a failure but return to my settings when a failed device returns; (c) it will go ahead and reconfigure itself any time anyway and ultimately ignore my topology.
  4. Can I select a Zigbee device and have HA tell me (a) what device this one is connected to upstream (the device page always indicates the main coordinator), and (b) what devices are connected to it downstream (there is something called ‘neighbours’ but I don’t know if that is the same thing)? Yes, in theory these connections are shown on the Visualisation diagram, but that is so busy I cannot read it easily, and anyway I do not trust it for reasons given in the original post.
  5. How exactly does HA decide whether a battery-operated TRV is online (available) or offline (unavailable)? And if a device is “unavailable” does that affect the way or whether set temperature commands are passed on?
  6. Is it true that Aqara (or any other make of) TRVs ‘prefer’ middle or lower channels to the higher ones? Why?

I get some issues with my temperature sensors also.
I can’t find it now but it was fairly recent, a sensor reported a flatline for a few hours.
And a few days before another temperature sensor did it.
I believe it’s largely due to them being end devices, and possibly because they are end devices at the edges of the home.
It’s not very common that you have a router outside of the window, so that means the TRVs are always at the ends.
Door sensors could be closer to middle of the home but some of them probably are at the outer edges too.
But I believe they have one distinct difference, they have a wakeup action.
There is something physical done that can be used to wake them up.
TRVs and temperature sensors are more snoozing than anything else. Temperature sensors have the advantage of being on a schedule at least.
The TRV is something we expect to respond when it’s asleep, possibly at the edge of the home with less than ideal reception.

Just my theory…

Not sure about the ‘sleep mode’ theory, but it is the case that battery-powered devices (is that what you mean by ‘end-device’?) have a preset time before they are considered unavailable. In your network settings find ‘Consider battery-powered devices unavailable after (seconds)’. This is typically set by default to several hours, though I have now reduced mine to 300 seconds (5 minutes). I do not know if this wait time also applies to the device becoming available again. I do not know if it is the interval between polls from the coordinator, or the interval between spontaneous reports from the device. We need a Zigbee expert!
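
To make the question concrete, here is how I imagine the rule might work, as a minimal Python sketch. It is an assumption, not something checked against the ZHA source: a device counts as available if it was last seen no longer ago than the configured threshold. The function name `is_available` and the 8-minute gap in the example are hypothetical.

```python
from datetime import datetime, timedelta


def is_available(last_seen: datetime, now: datetime, threshold_s: int) -> bool:
    """Assumed rule: available if last seen within `threshold_s` seconds."""
    return (now - last_seen) <= timedelta(seconds=threshold_s)


if __name__ == "__main__":
    last_seen = datetime.fromisoformat("2025-11-06T18:01:25")
    now = last_seen + timedelta(minutes=8)  # hypothetical: 8 minutes of silence
    for threshold_s in (7200, 300, 60):
        print(threshold_s, is_available(last_seen, now, threshold_s))
```

If the rule really is this simple, a sleepy device that only reports every several minutes would show as unavailable with a 60- or 300-second threshold even though it is still on the network. That would be one possible explanation for the battery devices going ‘offline’ when the setting was reduced to 60 seconds.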

The location in the home is less relevant than the proximity to the nearest repeater. You need enough to cover everywhere there are devices.

True.
But you’re not going to get a good mesh at the edge of your home unless you place several routers next to each other.
I don’t know for sure, but my experience is that it’s not just the closeness to the router; sometimes they route to other routers that are further away.
Possibly because the router disconnected or the end device did and accidentally connected to the wrong one.
But I believe a strong mesh around the end device makes it less likely for them to drop off or not respond.

If you can then test moving one TRV to the center of a room with lots of routers and see if it responds better there.
Obviously you would need to replace the TRV with a manual TRV during the test.

I have the same problem and it is one of the things I asked about in my post…

Only advice I can give you is not to do the above beyond short-term testing. You’ll chew through your batteries on those devices.

Before you revert your change, try this: walk up to one of those unavailable TRVs and (short) press the button on it. Is it marked as Available before the 5 minutes are up? If not, then they are indeed dropping off the network (unless the battery hasn’t dropped considerably already).

Much as I love Aqara stuff, I haven’t heard good things about those TRVs. It might be time to shop around for a replacement.