ZHA issues - time to move to zigbee2MQTT?

As my network of Zigbee devices grows, so does the number of dropouts from ZHA… all of them are sudden and the true reasons are unclear to me. Hue bulbs will go haywire, Aqara/Xiaomi buttons stop working and so on.

I run Conbee II on a 2 meter lead from the RPi + 2 tradfri signal repeaters + 1 Zigbee plug.
I also do run Aqara Zigbee through G3 and E1 bridges separately (I would love to get rid of E1 but Aqara H1 switches refuse to cooperate with ZHA - tested, they just keep dropping connection).

Now I wonder what to do:

  • replace the Conbee II with something more powerful. An option, but since I have repeaters I am not sure range is the issue
  • replace ZHA with zigbee2MQTT. It will be a major pain to migrate so many devices…

Any thoughts?

I’m not sure changing from ZHA to Z2M is going to help with your issues, as both are still just zigbee networks, and many of your issues sound like they are connectivity based.

I have 57 Zigbee devices (on ZHA) at present. About half of them are repeaters such as smart plugs and bulbs, whilst the remainder are end devices like battery-powered motion sensors. Most of the end devices are Aqara, and these are known not to re-route through the mesh network if the device they are going through (if not directly to the coordinator) is down.

So I’d suggest that, as your repeater count grows, you re-pair the Aqara devices so they can make use of the new paths through the network.

The Conbee II is, I believe, an older coordinator. For ZHA, a SkyConnect or Sonoff Dongle-E (same chipset) is preferred. I have the Sonoff and it’s been faultless.

Perhaps start looking at something like the ZHA network card:

And see what your LQI values are like. If these are low, that could be part of the cause of your issues. ZHA itself is not the problem.


Thanks for the detailed overview - I will certainly look into the card. Does the Sonoff Dongle-E work natively or will I have to flash it?

The fact that Aqara does not look for new pathways I think is actually a good thing in my situation as I’ve placed repeaters at just the right distance to ensure consistency.

I also checked for LQI, and those entities are disabled on the Aqara sensors - enabling them just shows “Unknown”.

Enabling the LQI value is fine. I do it on all my devices and it will turn up eventually; on battery-powered devices it can take a while. The ZHA network card works whether or not you enable the hidden LQI entity.

The Sonoff works straight out of the box.

Be aware that there are two near-identical-looking versions of the dongle: the P and the E. The P is recommended with Z2M and the E for ZHA due to firmware differences.


Thanks a lot!

I set up an automation overnight that tells me when a Zigbee device goes Unavailable, and there you have it:


message: "Professor Signal Repeater Identify is unavailable" at 00:00
message: "Queen Light is unavailable" at 00:06
message: "Kitchen Button D/W Battery is unavailable" at 00:40

All 3 have come back online since then, but it sort of proves the point.

Be aware that some devices are very ‘sleepy’ and Home Assistant will decide they have gone unavailable.
I have three Ikea remote switches like this, which regularly report unavailable.
However, if I use the switch, it works immediately as expected.

However, I assume the light and the signal repeater are mains powered, so that shouldn’t happen to them.


For what it’s worth, I’m running a Z2M network with a similar experience:

  • sonoff E dongle on a 1m usb cable
  • 26 devices in total, 7 acting as repeaters
  • modest area - 4 room apartment
  • devices disconnect randomly and without any logic (e.g. of 2 light bulbs 20cm away from each other, only one goes offline)

Could you please share the automation you created to track the device status?
Thanks!

This blueprint works well for detecting offline devices in Z2M:

Unless it’s changed recently, Dongle-E support in Z2M was only ‘experimental’, which could be part of your issues.

I also use this script to notify me when a (ZHA) Zigbee device goes offline. It requires the ZHA_Toolkit to be installed:

alias: Check for ZHA offline devices and notify
sequence:
  # Both branches run in parallel so the listener is already waiting
  # when zha_toolkit fires the zha_devices_ready event.
  - parallel:
      # Branch 1: wait for the device list, then check each device.
      - - wait_for_trigger:
            - platform: event
              event_type: zha_devices_ready
        - repeat:
            for_each: "{{ wait.trigger.event.data.devices }}"
            sequence:
              - if:
                  # Offline, and not a sleepy IKEA remote or a _NOT-USED device.
                  - condition: template
                    value_template: >
                      {{ (repeat.item.available | bool() == false)
                      and ('IKEA of Sweden TRADFRI on/off switch' not in repeat.item.name)
                      and ('IKEA of Sweden Remote Control N2' not in repeat.item.name)
                      and ('_NOT-USED' not in repeat.item.name) }}
                then:
                  - service: notify.mobile_app_your_phone_here
                    data:
                      title: Device Offline
                      message: >-
                        {{ "(%s) LQI: %s" % ( repeat.item.user_given_name,
                        repeat.item.lqi ) }}
      # Branch 2: ask ZHA Toolkit for the device list; it fires the event when done.
      - service: zha_toolkit.zha_devices
        data:
          event_done: zha_devices_ready
mode: single

I run this via an automation on an hourly schedule.

You’ll see that I deliberately exclude the sleepy IKEA stuff from notifying me.

@Rofo The repeater should certainly have stayed online; with the bulb there is always a chance someone knocked the switch off.

@ddppddpp For the automation I went a somewhat old-fashioned way: a state trigger with a list of entities, then a notify action with a small template. I am sure there is a simpler way to do it, but that’s the one I know how to do 🙂 Just make sure you select one entity per Zigbee device so you don’t get two notifications for the same device. I use the battery entity for sensors, identify for plugs/repeaters and light for bulbs. I’ve shortened the list below so it doesn’t take too much space here. I will add the Aqara entities later today to complete it.

alias: Home Zigbee Network
description: ""
trigger:
  - platform: state
    entity_id:
      - sensor.king_thsensor_b
      - sensor.queen_thsensor_b
      - sensor.princess_thsensor_b
      - sensor.professor_thsensor_b
      - sensor.attic_thsensor_b
    to: unavailable
  - platform: state
    entity_id:
      - button.zigbee_smart_plug_identifybutton
      - button.lounge_garden_signal_repeater_identify
      - button.mark_signal_repeater_identify
    to: unavailable
  - platform: state
    entity_id:
      - light.hallway_bulb_guest
      - light.hallway_bulb_kids
      - light.hallway_bulb_king
      - light.hallway_bulb_middle
      - light.hallway_bulb_queen
    to: unavailable
condition: []
action:
  - service: notify.haaswb
    data:
      message: >-
        Zigbee Network: {{ trigger.to_state.name }} is {{ trigger.to_state.state
        }}
mode: single

Have you managed to get any of your device LQI scores yet?

For comparison, these are the worst ten on my network:

How does your Zigbee channel compare to your 2.4 GHz Wi-Fi channels? Are they overlapping?

Only 3 devices report LQI so far: 191, 191 and 212. The others show Unknown.

There is a degree of overlap, yes, because I run separate Wi-Fi networks / VLANs and the APs broadcast across different channels (1/6/11), and 11 is the same as the Zigbee one. But the AP that broadcasts on 11 is, funnily enough, in the area where I do not have any dropouts 🙂

Zigbee channel 11 conflicts with WiFi channel 1, not WiFi 11.
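A rough Python sketch of why that is, using the standard 2.4 GHz centre-frequency formulas for both channel plans. The function names are only illustrative, and the 22 MHz Wi-Fi channel width is an assumption (20 MHz operation is slightly narrower), so treat the result as an approximation rather than a measurement of any particular network:

```python
# Sketch: which Zigbee channels sit inside a 2.4 GHz Wi-Fi channel's band.
# The 22 MHz width is an assumed value for one Wi-Fi channel.

WIFI_WIDTH_MHZ = 22  # assumption: occupied width of a single Wi-Fi channel


def zigbee_center_mhz(channel: int) -> int:
    """Centre frequency of 2.4 GHz Zigbee (IEEE 802.15.4) channels 11-26."""
    if not 11 <= channel <= 26:
        raise ValueError("2.4 GHz Zigbee channels are 11-26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel: int) -> int:
    """Centre frequency of 2.4 GHz Wi-Fi channels 1-13."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch covers Wi-Fi channels 1-13")
    return 2407 + 5 * channel


def zigbee_channels_under_wifi(wifi_channel: int) -> list[int]:
    """Zigbee channels whose centre lies within the Wi-Fi channel's band."""
    center = wifi_center_mhz(wifi_channel)
    half_width = WIFI_WIDTH_MHZ / 2
    return [
        z for z in range(11, 27)
        if abs(zigbee_center_mhz(z) - center) < half_width
    ]


for wifi in (1, 6, 11):
    print(f"Wi-Fi {wifi}: overlaps Zigbee {zigbee_channels_under_wifi(wifi)}")
```

Under that width assumption, Wi-Fi 1 covers Zigbee 11-14 and Wi-Fi 11 covers Zigbee 21-24, so with APs on 1/6/11 the Zigbee channels left outside all three bands are 15, 20, 25 and 26.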

Thanks for the correction. Then it may indeed be the Wi-Fi. Will give it a try.

Now I have my LQI values - see below:

183
143
255
255
255
183
255
255
255
255
255
255
255
255
143
143
37
151
159
255
143

The only device in that list that might be problematic is the one with an LQI of 37; everything else looks very healthy. Which device is that? Something far away from a repeater or the coordinator?

Either way, I would only expect that single device to maybe have issues, but you’re reporting more widespread problems.
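A quick way to scan a list like the one posted is sketched below in Python. The values are copied from that post; the cutoff of 100 is purely an assumed rule of thumb for "worth a closer look", not a threshold defined by ZHA or the Zigbee specification, and the helper name is hypothetical:

```python
# LQI readings copied from the list posted above, in the same order.
LQI_VALUES = [183, 143, 255, 255, 255, 183, 255, 255, 255, 255, 255,
              255, 255, 255, 143, 143, 37, 151, 159, 255, 143]

LOW_LQI_THRESHOLD = 100  # assumption: arbitrary cutoff, adjust to taste


def low_lqi(values: list[int], threshold: int = LOW_LQI_THRESHOLD) -> list[tuple[int, int]]:
    """Return (position, lqi) pairs for readings below the threshold.

    Positions are 1-based so they match the order of the posted list.
    """
    return [(pos, lqi) for pos, lqi in enumerate(values, start=1) if lqi < threshold]


print(low_lqi(LQI_VALUES))  # prints [(17, 37)]
```

With this cutoff only the 17th entry (LQI 37) is flagged out of 21 readings.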

If it were me, I’d take a punt and change to the Sonoff Coordinator.

That one is a Philips Hue bulb, in a room where one of the repeaters actually is. The repeater itself is a bit all over the shop with LQI, as you can see below. And in that same room there is another device (a window sensor) which just disconnects all the time.

I wonder if I should try to re-add the repeater with the bulb off, as it may think the bulb is a router and try to connect through it? And then re-add the bulb through the repeater.

[image: repeater LQI]

LQI will bounce around depending on the device’s location and the time of day, as it’s constantly affected by environmental factors. I find some of mine drop around 4pm, but I’m not entirely sure why, or what’s different at 4pm.

Most Zigbee devices should automatically route via the best/most viable path to the coordinator, unless it’s something like an Aqara device that religiously sticks to its original route.

You could try forcing it by adding ‘via’ another device:

Yep, forcing is exactly what I’ve done for some of them, and I’ve heard Aqara is known for not switching routes, so fingers crossed.