ZHA crashes regularly on SkyConnect

I've been fighting issues with ZHA for weeks and making little to no progress.

ZHA with SkyConnect (on a 10' extension cable, I think)
HassOS as a VM on ESXi 7
Home Assistant 2023.3.3 (the problems started a while ago, maybe on 2022.12 or 2023.1)
Supervisor 2023.3.1
OS 9.5
Frontend 20230309.0

I was at around 125 devices on my ZHA network with an old Nortek 2-in-1 stick. 35 or so of them were repeaters. I got my SkyConnect in late December (I think?), so I was excited to move to it. Migration was a piece of cake.

A little while earlier a couple of Osram bulbs had started acting up, so I thought the SkyConnect might work better. It didn't make a difference, so I replaced a couple of them with Sengled bulbs. Those also had issues where they wouldn't respond to commands. At the time I was at around 135 devices, still with 35 or so routers.

One day ZHA crashed completely. No device on the network would respond. I reloaded the integration and it failed to load. I got an error message (I can't find the text, it was a while ago) and it took me a while to figure out that I needed to disconnect the SkyConnect, connect it again, then reload the integration. It comes back up right away, but it might take another 20 minutes for all of my devices to come back online.

This continued for a few weeks. Same issues, same fix. Sometimes it would crash 3 times a day, other times it would be fine for a few days, but that was rare.

I decided to set up Z2M and start moving devices. Subjectively, Z2M seemed faster and more stable, while I still had issues with ZHA. Same thing: it would crash and I'd have to reseat the USB dongle and reload ZHA to get it back online.

I'm down to 101 devices on ZHA; the rest have been moved to Z2M. Yesterday I posted in the HA Facebook group and someone mentioned that failing Osram devices were causing their ZHA network to crash. Since I'd already had several Osram failures and symptoms similar to what he described, I unplugged every Osram device on my network. (I didn't consider that I was disconnecting 12 or 13 routers, but I've still got around 25, now with 101 ZHA devices, and I think that'll iron itself out.)

This was last night, around 9pm. I reloaded ZHA after I did that and let it go about its business. By 12:10am it had crashed again. I reseated the dongle and reloaded ZHA before leaving for work, and so far (11:30am) it seems to be okay, as far as I can tell from here.

What can I look at to find the cause of these problems? I'm hoping we are onto something with Osram devices being the cause, but since it crashed 3 hours later, I'm not so sure. I'm wondering if that crash was caused by the mesh changing and rebuilding a lot of routes?

Where can I find helpful logs? If I look at the logs from before ZHA crashes I can't find anything that's helpful, but I don't know what I'm looking for. If it stops responding and I reload it before reseating the USB dongle, I get an error in the logs saying it failed to load, but I don't remember the wording, since I can tell when it's not working and I just fix it to get things back up and running.
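One way to narrow the search is to filter the Home Assistant log down to warnings and errors from the Zigbee stack. The following is a minimal Python sketch, not an official tool. It assumes the log lives at `/config/home-assistant.log`, that lines follow the usual `timestamp LEVEL (thread) [logger] message` layout, and that the loggers worth watching are `homeassistant.components.zha`, `zigpy`, and `bellows` (the library ZHA uses for Silicon Labs radios such as the SkyConnect). The function name `zigbee_problems` is made up for this example.

```python
import re
from pathlib import Path

# Logger-name prefixes assumed to be relevant to ZHA on a Silicon Labs radio.
ZIGBEE_LOGGERS = ("homeassistant.components.zha", "zigpy", "bellows")
LEVELS = ("WARNING", "ERROR", "CRITICAL")

# Assumed line layout: "2023-03-20 00:10:01.123 ERROR (MainThread) [logger.name] message"
LINE_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<level>[A-Z]+) \((?P<thread>[^)]*)\) "
    r"\[(?P<logger>[^\]]+)\] (?P<msg>.*)$"
)


def zigbee_problems(lines):
    """Return (time, level, logger, message) for Zigbee-related warnings and errors."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        # Lines that do not match the layout (e.g. traceback bodies) are skipped.
        if m and m["level"] in LEVELS and m["logger"].startswith(ZIGBEE_LOGGERS):
            hits.append((m["time"], m["level"], m["logger"], m["msg"]))
    return hits


if __name__ == "__main__":
    log_path = Path("/config/home-assistant.log")  # assumed location
    if log_path.exists():
        with log_path.open(encoding="utf-8", errors="replace") as fh:
            for hit in zigbee_problems(fh):
                print(*hit)
```

Running it right after a crash, before reseating the dongle and reloading, should show the last few messages from the radio library, which is usually the part worth quoting in a bug report.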

Short of moving 100 devices to Z2M, what can I do to find the cause of this? I'm fine with EVENTUALLY moving to Z2M, but I don't want to try to move 100 devices at once.

Should I ditch the SkyConnect and migrate back to the Nortek for the time being? Surprisingly, I don't think I had a single ZHA issue when that was my coordinator, but I don't think the device count was ever greater than 110 on it.

Any insight or advice would be much appreciated!! Also, please let me know what other info I might need to provide. This is my first post here, so hopefully I've given enough info.

For what it’s worth, I used Skyconnect for a week on ZHA and moved back to Conbee II/Deconz and eventually the Sonoff Dongle-E on ZHA. Nothing but issues. Skyconnect and ZHA are equally problematic. I plan on moving to Z2M soon. I’d avoid Skyconnect and ZHA if at all possible.

I have around 100 devices on my network as well.

I was so excited to get the SkyConnect going, but it's been troublesome for what seems like the whole time.

Maybe I'll go back to the Nortek and disconnect the SkyConnect until there's something with Thread/Matter that I just can't live without, and then use it for just that radio. I thought I was upgrading from the Nortek, but in my experience the Nortek has been better than the SkyConnect.

I unfortunately have the very same issue: SkyConnect keeps crashing HA (not just ZHA) about 2 to 3 times a day. I updated to the latest firmware for SkyConnect without any change in behavior. I am using SkyConnect with a 3 m extension cable. Has there been any further progress, or are there recommendations on what to try?

Or should I transfer my Zigbee network from SkyConnect to a different Zigbee adapter (Zigbee Home Automation - Home Assistant)? Any recommendations on what the most stable alternative would be? Will the device configurations survive a transfer to a new adapter?

I've got Z2M up and running and now have 68 or 69 ZHA devices and a little over 70 on Z2M. ZHA seems to be MUCH more stable; I've only had one or two issues in the last couple of weeks. I think I was just throwing too much at ZHA with the SkyConnect. I'm slowly moving the rest to Z2M, but since both are pretty stable now, I'm just doing it as I have issues with a device or as I add new devices.

It’s definitely not the number of devices that is your problem. I have a tiny Zigbee network running on Skyconnect, a total of 3 devices (one hue light strip and two IKEA remotes) and it still just crashes regularly, forcing me to unplug the stick and reboot the whole VM host. I have decided to move to a different stick. Does anyone have good recommendations for sticks to use with ZHA in 2023?

I have a similar issue with ZHA crashing sporadically, anywhere from every couple of hours to every couple of days. I have 49 devices and around 20 are router plug points. I'm using SkyConnect on an extension cable that is around 2 m long, on a Proxmox VM which is well overpowered for the application.

It had been running well on HA version 2023.7.x, but then in October I did my quarterly upgrade to 2023.10. That's when the trouble started.

The logs don't seem to give me many hints. At least, I don't understand the platform well enough to interpret them.

I have now automated a reload every morning, but I was still caught out when the power point powering my alarm switched off overnight! That is partly an issue with the power point device itself, which sporadically switches off when it is not connected to a network. Put the two issues together and you have hell!

Does anyone know how to capture the event of zha crashing? I would like to, at least for the moment, have a script that reloads each time it crashes!
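As a starting point, here is a rough Python sketch of a watchdog built on the Home Assistant REST API. It rests on several assumptions: that a crash shows up as a chosen always-powered "canary" entity going `unavailable` (this may not hold for every failure mode), that you have a long-lived access token, and that you know the ZHA config entry ID to pass to the `homeassistant.reload_config_entry` service. The function names, URL, entity ID, and entry ID are placeholders invented for the example. Also note that earlier in this thread a reload alone did not recover the SkyConnect without reseating it, so this may only help in some cases.

```python
import json
import urllib.request


def is_unavailable(state):
    """True if a state object from /api/states/<entity_id> reports 'unavailable'."""
    return state.get("state") == "unavailable"


def build_request(base_url, token, path, payload=None):
    """Build an authenticated request: GET when there is no payload, POST otherwise."""
    data = json.dumps(payload).encode() if payload is not None else None
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=data,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST" if payload is not None else "GET",
    )


def check_and_reload(base_url, token, canary_entity, entry_id,
                     opener=urllib.request.urlopen):
    """Reload the config entry if the canary entity is unavailable.

    Returns True when a reload was requested, False otherwise.
    """
    with opener(build_request(base_url, token, f"/api/states/{canary_entity}")) as resp:
        state = json.load(resp)
    if not is_unavailable(state):
        return False
    reload_req = build_request(
        base_url,
        token,
        "/api/services/homeassistant/reload_config_entry",
        {"entry_id": entry_id},  # placeholder: the ZHA config entry ID
    )
    with opener(reload_req) as resp:
        resp.read()
    return True
```

It could be run every few minutes from cron on another machine, for example `check_and_reload("http://homeassistant.local:8123", TOKEN, "switch.some_always_on_plug", ZHA_ENTRY_ID)`, where every argument is a placeholder for your own setup. The same idea can also be expressed as a native automation that triggers on the canary entity's state.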

I haven't used any other dongle for Zigbee previously, and I would like to know what might be the best option. The power points do have energy readings and may be overloading the network.

Any ideas would be great!

I tried SkyConnect and all the other dongles available on AliExpress.

I finally ended with the following setup: Sonoff Zigbee Dongle P + Firmware Z-Stack_3.x.0_coordinator_20221226.

This is not the latest firmware. If you have issues you can try newer ones. For me it is perfect!

SiliconLabs EFR32MG21 is the cause of most problems. It simply does not go well with some Zigbee devices and they stop responding after a while. I am almost certain SiliconLabs will release a new chip soon that fixes all issues. All my EFR32MG21 based dongles had the same issues.

With ZHA radio migration, switching between EFR and CC devices is easy and does not require any re-pair.

Hi. I have the same problem, but only recently. I had used the SkyConnect since the first release without any issues until a few weeks ago. I see no solution yet in this thread. I hope someone reads this and has a solution.

If it helps anyone:
I have a SkyConnect that was working fine before recent updates; I had 76 devices working perfectly. After the updates it crashed hourly.
I removed 40 Philips Hue lights and moved them back to Hue; after that it crashed daily.
I removed all devices on the 2nd floor so the number of devices was down to 17, and the system was stable.
I added devices back, but it started crashing again.
There are issues logged on GitHub but no assignees to fix them. However, removing multiprotocol support in the Hardware settings and running only Zigbee fixed my issues.