Can you help me figure out why my ZHA keeps crashing?

Hello,

I’ve been seeing an issue with my system on and off across multiple releases, and it has become persistent enough that I’m ready to do some troubleshooting to figure out what’s happening, but I’ll need some guidance on next steps.

Observation: I will wake up once every 1-2 weeks and automations won’t be firing. What I’ve learned each time is that it has something to do with the entire ZHA network going down because all ZHA devices will be white in HA. A reboot fixes the problem every single time and I don’t know why. It only appears to happen at night/early morning after I’ve gone to bed.

I have a backup that runs every night at 3:00 AM, and I thought that could be causing the issue. This morning I looked at the history for all the ZHA sensors, and they all went down between 8:53 AM and 9:17 AM, which is a pretty wide range. Is this indicative of my ZHA network slowly going down because of performance? Or does it just appear this way because devices check in at different times while the issue is occurring?
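One way to put numbers on this, as a rough sketch: collect the time each entity went unavailable, then measure how wide the window is and how far it sits from the 3:00 AM backup. The entity names and the date below are made up for illustration; only the 8:53 AM, 9:17 AM, and 3:00 AM times come from the post.

```python
from datetime import datetime, timedelta


def outage_window(dropped_at: dict[str, datetime]) -> tuple[datetime, datetime, timedelta]:
    """Return (first drop, last drop, spread) for a mapping of entity -> drop time."""
    first = min(dropped_at.values())
    last = max(dropped_at.values())
    return first, last, last - first


# Hypothetical entities and date; the clock times are the ones quoted above.
day = datetime(2022, 5, 20)
dropped_at = {
    "sensor.hall_motion": day.replace(hour=8, minute=53),
    "sensor.kitchen_temperature": day.replace(hour=9, minute=5),
    "sensor.garage_door": day.replace(hour=9, minute=17),
}
backup_at = day.replace(hour=3, minute=0)

first, last, spread = outage_window(dropped_at)
gap_after_backup = first - backup_at

print(f"first drop: {first:%H:%M}, last drop: {last:%H:%M}")
print(f"spread: {spread}, time since backup: {gap_after_backup}")
```

With these times the spread is 24 minutes, and the first drop comes 5 hours 53 minutes after the backup started. That gap alone does not prove or rule out the backup as a cause, but it is worth recording for each crash so the pattern can be compared from one incident to the next.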


My stick is listed under ZHA as: EZSP = Silicon Labs EmberZNet protocol: Elelabs, HUSBZB-1, Telegesis
My setup is:

Home Assistant Supervisor

Host Operating System Home Assistant OS 8.1
Update Channel beta
Supervisor Version supervisor-2022.05.3
Agent Version 1.2.1
Docker Version 20.10.14
Disk Total 30.8 GB
Disk Used 12.4 GB
Healthy true
Supported true
Board ova
Supervisor API ok
Version API ok

What would be the next thing I can look at to try to diagnose the issue? Should I keep ZHA logs running until the next time it happens and then cross-reference the time? What am I looking for if the devices are just becoming unavailable? Is that all I’m looking for, or is there something else I should be watching for?
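As a minimal sketch of that cross-referencing step, the script below filters log lines down to Zigbee-related warnings and errors within a window around the time the devices dropped. It assumes the log lines look like `2022-05-20 08:53:02 ERROR (MainThread) [bellows.ezsp] message`, and that the loggers of interest are `homeassistant.components.zha`, `zigpy`, and `bellows` (the latter being the EZSP radio library); the sample lines and the function name `zigbee_lines_near` are invented for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed line shape: "YYYY-MM-DD HH:MM:SS LEVEL (Thread) [logger.name] message".
# Some releases add milliseconds after the seconds; the pattern tolerates both.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?:\.\d+)? "
    r"(?P<level>[A-Z]+) \([^)]*\) \[(?P<logger>[^\]]+)\] (?P<msg>.*)$"
)

# Logger name prefixes assumed to matter for an EZSP stick under ZHA.
ZIGBEE_LOGGERS = ("homeassistant.components.zha", "zigpy", "bellows")


def zigbee_lines_near(lines, around, window=timedelta(minutes=30),
                      levels=("WARNING", "ERROR")):
    """Yield (timestamp, logger, message) for Zigbee-related lines near `around`."""
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # continuation lines such as tracebacks carry no timestamp
        ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S")
        if abs(ts - around) > window:
            continue
        if match["level"] not in levels:
            continue
        if match["logger"].startswith(ZIGBEE_LOGGERS):
            yield ts, match["logger"], match["msg"]


# Hypothetical sample lines standing in for a real log file.
sample = [
    "2022-05-20 03:00:01 INFO (MainThread) [homeassistant.core] hypothetical backup line",
    "2022-05-20 08:52:40 WARNING (MainThread) [bellows.ezsp] hypothetical warning",
    "2022-05-20 08:53:02 ERROR (MainThread) [homeassistant.components.zha] hypothetical error",
    "2022-05-20 08:53:03 ERROR (MainThread) [homeassistant.components.hue] unrelated error",
]

hits = list(zigbee_lines_near(sample, datetime(2022, 5, 20, 8, 53)))
for ts, logger, msg in hits:
    print(ts, logger, msg)
```

To run it against a real log, replace `sample` with an open file handle (the log is commonly `home-assistant.log` in the config directory) and set `around` to the time the first device went unavailable. Debug-level logging for these loggers would need to be enabled beforehand for the finer detail to be present; in that case, widen `levels` to include `"DEBUG"`.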

I do have a couple of mains-powered ZHA devices and repeaters throughout the house. I don’t notice drops or lag during peak working hours.

(I have no idea if this looks normal)

Before troubleshooting deeper by looking at logs, I suggest you first optimize your setup.

  1. Upgrade the Zigbee Coordinator firmware if a newer known good firmware is available.
  2. Use a long USB extension cable to move the Zigbee Coordinator a bit away from sources of interference.
  3. Be sure the Zigbee Coordinator is on a USB 2.0 port and not a USB 3.0 port. Buy and use a powered USB 2.0 hub (which will convert from USB 3.0) if your computer does not have a USB 2.0 port.

And so on; see the tips links below.

Regardless of the actual root cause, always aim to keep the firmware of the Zigbee Coordinator updated, add more products that act as Zigbee Router devices, and implement workarounds for interference; see:

https://github.com/home-assistant/home-assistant.io/pull/18864

and

https://www.home-assistant.io/integrations/zha#best-practices-to-avoid-pairingconnection-difficulties

Understand and remember that Zigbee signals are weak, so they rely on a strong Zigbee network mesh (meaning many Zigbee Router devices), and that they are very sensitive to EMF/EMI/RFI interference. It is much easier to troubleshoot and find the real root cause if you have already optimized your setup and environment to work around that.


These are great tips and I appreciate it.

On the firmware, yeah you’re absolutely right. I have refused to touch this since I installed it because I had achieved stability and the stick is in a partition that I’m not very familiar with accessing. I will take the necessary steps to update this device’s firmware (I’m sure it’s out of date)

The stick is on an extension cable device in the center of the house, about 3 feet from the hub.

I had never thought to check the 2.0 vs 3.0. I know what you’re talking about but it’s not labeled on the device. I will research further to make sure they match. The device I use to extend/power the USB plug is:

I will work on these today and see what I come up with. Thanks for the direction!

Most USB ports are color coded: USB 2 are either black or white (on the inside of the socket), USB 3 are blue.


I realized that after the fact when I looked at my VR 3.0 cables. Thanks!

Again, Zigbee should be considered extremely sensitive to interference, so there is a good chance that just shielding EMF/EMI/RFI sources and/or moving your Zigbee Coordinator further away from any sources, as well as adding a few more known good Zigbee Router devices, will resolve your main issues.

Personally, I can recommend buying a few “IKEA Trådfri Signal Repeater” as dedicated Zigbee routers.

Note that if you are using a dongle/adapter with an old EM35x chip, you might now want to consider buying newer hardware: while they still work fine for now, they are no longer getting new firmware features.

You did not post the exact brand/manufacturer and model of your Zigbee Coordinator or the chip you are using; however, since you mention that your stick uses EZSP, I can at least suggest checking out these links:

https://github.com/grobasoz/zigbee-firmware/

https://github.com/Elelabs/elelabs-zigbee-ezsp-utility

https://github.com/walthowd/husbzb-firmware

Most are, yes, but unfortunately not all are color coded, as not everyone follows the industry standards.

I finally built up the bravery and knowledge to update the firmware on the HUSBZB-1 yesterday because the zigbee crashes are accelerating. I did the firmware update and I got everything running but it crashed again last night at some point.

In all my research on updating the firmware, I read people complaining about stability with more than 30 devices, and I have 65 just on Zigbee. To prepare in case the firmware update failed or didn’t resolve the issue, I followed the recommendation of other users who moved off this stick for more stability, and I ordered a Sonoff.

After the crash last night post firmware update, I feel like I should just bite the bullet and try a different stick.

I paid the slight premium of getting a Conbee II 4 years ago. It’s under the control of ZHA with over 100 devices and 400 entities, I always smile and haven’t looked back.

I’m gonna get this figured out and achieve stability one way or another. I went ahead and ordered a Conbee also. I love the idea of being able to control it under ZHA.

While the HUSBZB-1 is based on an old microcontroller/radio chip, it should be able to handle 100+ Zigbee devices as long as you add more good Zigbee Router devices, as those offload the coordinator; read:

https://github.com/zigpy/zigpy/wiki/Generic-best-practice-tips-on-improving-Zigbee-network-range-and-general-stability

Personally, I would recommend buying three IKEA Trådfri Signal Repeaters, which are known to be great:

Did the ZHA crashing ever get sorted out? I use a Conbee II USB stick and recently moved from deCONZ to ZHA. I just thought getting rid of an extra add-on might help a bit. deCONZ was slow to boot sometimes, but once loaded it was stable and never lost a device.
I moved to ZHA using the exact same Conbee II USB stick, and ZHA is really unstable and crashes a lot. My mains-powered devices (mostly Hue light globes and some Sonoff basic Zigbee devices) regularly go unavailable.
If you did sort it out, can you let me know? The only change I made was going from deCONZ to ZHA.

I moved to the Sonoff and haven’t had any more crashes.

Cool. Maybe it is just the Conbee not being quite fully compatible.

Hi. Has anyone found a fix already? I recently started having the same issue. I use the SkyConnect stick, and every couple of weeks ZHA crashes. The integration page shows a failed to load error. A restart fixes it, but I have to unplug the power supply; the restart command doesn’t work. Thanks.