ZHA fails to start and needs to be restored from backup after every reboot

Since migrating to SkyConnect for Zigbee, every time I restart Home Assistant, ZHA fails to initialize with the following error:

Couldn't start EZSP = Silicon Labs EmberZNet protocol: Elelabs, HUSBZB-1, Telegesis coordinator (attempt 1 of 3)
Couldn't start EZSP = Silicon Labs EmberZNet protocol: Elelabs, HUSBZB-1, Telegesis coordinator (attempt 2 of 3)
Couldn't start EZSP = Silicon Labs EmberZNet protocol: Elelabs, HUSBZB-1, Telegesis coordinator (attempt 3 of 3)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 185, in async_initialize
    self.application_controller = await app_controller_cls.new(
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 171, in new
    await app.startup(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 145, in startup
    await self.connect()
  File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 124, in connect
    self._ezsp = await bellows.ezsp.EZSP.initialize(self.config)
  File "/usr/local/lib/python3.10/site-packages/bellows/ezsp/__init__.py", line 105, in initialize
    await ezsp._startup_reset()
  File "/usr/local/lib/python3.10/site-packages/bellows/ezsp/__init__.py", line 96, in _startup_reset
    await self.reset()
  File "/usr/local/lib/python3.10/site-packages/bellows/ezsp/__init__.py", line 125, in reset
    await self._gw.reset()
  File "/usr/local/lib/python3.10/site-packages/bellows/uart.py", line 260, in reset
    return await asyncio.wait_for(self._reset_future, timeout=RESET_TIMEOUT)
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

I am able to initialize ZHA by going to Configure → Migrate Radio → Re-configure the current radio → Restore Automatic Backup. This restores ZHA functionality pretty much instantaneously. However, I would like to avoid having to do this every time I restart Home Assistant. Can anyone help me figure out why this is happening?

It seems like people have run into this issue with SkyConnect when enabling multi-pan, but I have NOT enabled multi-pan / multiprotocol yet and I have flashed the latest firmware to SkyConnect as of 3/20/23.
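For anyone trying to read the traceback: a CancelledError followed by a TimeoutError is what asyncio.wait_for produces when the thing it is waiting on does not finish in time, so here the radio simply never answered the reset within the timeout, three times in a row. Below is a minimal sketch of that retry-with-timeout pattern. It is not the actual bellows or ZHA code; `start_with_retries` and `silent_radio` are made-up names and the timeout value is arbitrary.

```python
import asyncio


async def start_with_retries(connect, attempts=3, timeout=5.0):
    """Call connect() up to `attempts` times, each bounded by `timeout` seconds.

    Returns the 1-based number of the attempt that succeeded, or re-raises
    the last TimeoutError if every attempt timed out.
    """
    last_exc = None
    for attempt in range(1, attempts + 1):
        try:
            # wait_for cancels the pending coroutine when the timeout expires
            # and then raises TimeoutError to the caller.
            await asyncio.wait_for(connect(), timeout=timeout)
            return attempt
        except asyncio.TimeoutError as exc:
            last_exc = exc
            print(f"Couldn't start radio (attempt {attempt} of {attempts})")
    raise last_exc


async def silent_radio():
    # Hypothetical stand-in for a coordinator that never answers the reset.
    await asyncio.Event().wait()


if __name__ == "__main__":
    try:
        asyncio.run(start_with_retries(silent_radio, timeout=0.1))
    except asyncio.TimeoutError:
        print("gave up after all attempts")
```

The point is only that the error says "no reply in time", not why the stick stayed silent.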

Did you get this one solved?

Just ran into the same “Couldn’t start EZSP = Silicon Labs EmberZNet protocol: Elelabs, HUSBZB-1, Telegesis coordinator (attempt 1 of 3)” issue…

I will answer myself in case anyone else gets this issue.

What I did was remove the SkyConnect device from my Qnap and add it again.
Then I reloaded the integration and it is now working…

@thomas_t Did you just remove and reinsert the SkyConnect device then reload the integration?

I’m having a similar issue as @Kipmo, but the logs indicate it may be due to a different root cause. I migrated Zigbee away from a HUSBZB-1 dongle to a Sonoff ZBDongle-P, changed the IEEE address on the Zigbee side of the HUSBZB-1 to avoid conflicts with the migrated network, and kept Z-Wave in service on the HUSBZB-1.

Everything seemed to work fine at first, but as @Kipmo described, ZHA fails to start following any restart of HA Core and I have to follow the same Configure → Migrate Radio → Re-configure the current radio → Keep network settings steps each time to get ZHA to start (and connectivity to devices is again more or less instantly restored).

This is running on an ASUS PN63-S1 mini-PC using HAOS generic-x86-64 v10.1, HA 2023.5.3.

Logger: homeassistant.components.zha.core.gateway
Source: components/zha/core/gateway.py:205
Integration: Zigbee Home Automation (documentation, issues)
First occurred: 2:26:27 PM (3 occurrences)
Last logged: 2:26:53 PM

Couldn't start ZNP = Texas Instruments Z-Stack ZNP protocol: CC253x, CC26x2, CC13x2 coordinator (attempt 1 of 3)
Couldn't start ZNP = Texas Instruments Z-Stack ZNP protocol: CC253x, CC26x2, CC13x2 coordinator (attempt 2 of 3)
Couldn't start ZNP = Texas Instruments Z-Stack ZNP protocol: CC253x, CC26x2, CC13x2 coordinator (attempt 3 of 3)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 673, in _skip_bootloader
    result = await responses.get()
  File "/usr/local/lib/python3.10/asyncio/queues.py", line 159, in get
    await getter
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 205, in async_initialize
    self.application_controller = await app_controller_cls.new(
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 219, in new
    await app.startup(auto_form=auto_form)
  File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 193, in startup
    await self.connect()
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/zigbee/application.py", line 107, in connect
    await znp.connect()
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 715, in connect
    self.capabilities = (await self._skip_bootloader()).Capabilities
  File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 672, in _skip_bootloader
    async with async_timeout.timeout(CONNECT_PROBE_TIMEOUT):
  File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 129, in __aexit__
    self._do_exit(exc_type)
  File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 212, in _do_exit
    raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError

I’m running HA as a VM on my Qnap NAS, so this might work differently on other platforms.

What I did was remove the stick from my Qnap.
In the VM running HA I could see that the connection between my stick and the VM running HA disappeared.
Then I reconnected the stick to Qnap and made a new link between the VM and the stick.

All this is done “outside” of the HA interface.

Then I logged into HA and reloaded the ZHA integration.

I did try to restart the integration several times, but it wasn’t until I removed the stick and installed it again that I got it to work.

@thomas_t Thanks for sharing more details on what worked for you. On HAOS running on bare metal, disconnecting & reconnecting the stick doesn’t seem to resolve the issue I have, only the migrate radio/keep current radio settings process seems to work.

I’ve been thinking about completely dumping the HUSBZB-1 to eliminate any potential conflict (even though the Zigbee-side IEEE address was successfully updated), and also trying Zigbee2MQTT with a fresh setup of my Zigbee network and re-pairing all devices on the ZBDongle-P.

It would be nice to see if that makes a difference for this issue, and it would give me the opportunity to move the Zigbee network to a less congested channel, but it’s hard to find a good time to jump into that amount of work.

Can report that this issue persists for my setup after migrating away from my previous ZBDongle-P coordinator to a networked SMLIGHT SLZB-06. Migrate Radio → Keep Network Settings still works to recover on each HA restart.

I had hoped that moving away from USB for my coordinator might make this better, but it seems like more or less the same behavior and errors, as well as method for recovery.

Now on HAOS 10.5, HA 2023.8.3, using SMLIGHT SLZB-06.
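With a networked coordinator, one thing that can at least be ruled out is the network path itself. Here is a rough sketch that only checks whether the coordinator's TCP port accepts a connection. `tcp_reachable` is a made-up helper, and the host and port shown are placeholders, not real values from my setup. A successful connect says nothing about whether the Zigbee stack behind the port will answer the probe that times out in the logs below.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # Placeholder values: substitute the host and port from your own socket:// path.
    HOST = "192.0.2.10"
    PORT = 1234
    print(tcp_reachable(HOST, PORT))
```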



Logger: homeassistant.config_entries
Source: components/zha/core/gateway.py:205
First occurred: 11:52:36 PM (1 occurrences)
Last logged: 11:52:36 PM
Error setting up entry ZHA - SLZB-06 for zha

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/zigpy_znp/api.py", line 684, in _skip_bootloader
    result = await responses.get()
             ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/queues.py", line 158, in get
    await getter
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/config_entries.py", line 388, in async_setup
    result = await component.async_setup_entry(hass, self)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/zha/__init__.py", line 138, in async_setup_entry
    await zha_gateway.async_initialize()
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 220, in async_initialize
    raise exc
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 205, in async_initialize
    self.application_controller = await app_controller_cls.new(
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 219, in new
    await app.startup(auto_form=auto_form)
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 193, in startup
    await self.connect()
  File "/usr/local/lib/python3.11/site-packages/zigpy_znp/zigbee/application.py", line 115, in connect
    await znp.connect()
  File "/usr/local/lib/python3.11/site-packages/zigpy_znp/api.py", line 726, in connect
    self.capabilities = (await self._skip_bootloader()).Capabilities
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy_znp/api.py", line 683, in _skip_bootloader
    async with async_timeout.timeout(CONNECT_PROBE_TIMEOUT):
  File "/usr/local/lib/python3.11/site-packages/async_timeout/__init__.py", line 129, in __aexit__
    self._do_exit(exc_type)
  File "/usr/local/lib/python3.11/site-packages/async_timeout/__init__.py", line 212, in _do_exit
    raise asyncio.TimeoutError
TimeoutError

Logger: zigpy.application
Source: /usr/local/lib/python3.11/site-packages/zigpy/application.py:196
First occurred: August 23, 2023 at 11:52:22 PM (2 occurrences)
Last logged: August 23, 2023 at 11:52:36 PM

Couldn't start application
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/zigpy_znp/api.py", line 684, in _skip_bootloader
    result = await responses.get()
             ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/queues.py", line 158, in get
    await getter
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 193, in startup
    await self.connect()
  File "/usr/local/lib/python3.11/site-packages/zigpy_znp/zigbee/application.py", line 115, in connect
    await znp.connect()
  File "/usr/local/lib/python3.11/site-packages/zigpy_znp/api.py", line 726, in connect
    self.capabilities = (await self._skip_bootloader()).Capabilities
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy_znp/api.py", line 683, in _skip_bootloader
    async with async_timeout.timeout(CONNECT_PROBE_TIMEOUT):
  File "/usr/local/lib/python3.11/site-packages/async_timeout/__init__.py", line 129, in __aexit__
    self._do_exit(exc_type)
  File "/usr/local/lib/python3.11/site-packages/async_timeout/__init__.py", line 212, in _do_exit
    raise asyncio.TimeoutError
TimeoutError

I am facing the same issue with SkyConnect: after every reboot ZHA fails to start, and I have to manually re-configure the current radio.

I have two USB Zigbee coordinators: the SkyConnect and a Sonoff ZBDongle (probably a P, but not sure). I use the SkyConnect with ZHA and the Sonoff dongle with Z2M.

It seems like the SkyConnect times out 3 times before ZHA gives up:
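With two sticks plugged in, it is worth confirming which serial device each one actually is, so that ZHA and Z2M are not pointed at the same adapter. A small sketch, assuming a Linux host that exposes the usual /dev/serial/by-id symlinks (`list_serial_by_id` is a made-up helper name):

```python
from pathlib import Path


def list_serial_by_id(base="/dev/serial/by-id"):
    """Map each stable by-id name to the device node it currently points at."""
    base_path = Path(base)
    if not base_path.is_dir():
        # No serial adapters attached, or the host does not provide by-id links.
        return {}
    return {
        entry.name: str(entry.resolve())
        for entry in sorted(base_path.iterdir())
    }


if __name__ == "__main__":
    for name, target in list_serial_by_id().items():
        print(f"{name} -> {target}")
```

Each stick should show up under its own name with its own target device; using the by-id path in both integrations avoids the two swapping places after a reboot.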

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 193, in startup
    await self.connect()
  File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 133, in connect
    self._ezsp = await bellows.ezsp.EZSP.initialize(self.config)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 167, in initialize
    await ezsp._startup_reset()
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 158, in _startup_reset
    await self.reset()
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 187, in reset
    await self._gw.reset()
TimeoutError
2023-08-26 08:14:03.712 WARNING (MainThread) [homeassistant.components.zha.core.gateway] Couldn't start EZSP = Silicon Labs EmberZNet protocol: Elelabs, HUSBZB-1, Telegesis coordinator (attempt 1 of 3)
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 205, in async_initialize
    self.application_controller = await app_controller_cls.new(
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 219, in new
    await app.startup(auto_form=auto_form)
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 193, in startup
    await self.connect()
  File "/usr/local/lib/python3.11/site-packages/bellows/zigbee/application.py", line 133, in connect
    self._ezsp = await bellows.ezsp.EZSP.initialize(self.config)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 167, in initialize
    await ezsp._startup_reset()
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 158, in _startup_reset
    await self.reset()
  File "/usr/local/lib/python3.11/site-packages/bellows/ezsp/__init__.py", line 187, in reset
    await self._gw.reset()
TimeoutError

I’ll try later to see if removing the Sonoff dongle helps.
I tried flashing the latest firmware on the SkyConnect, but for some unexplained reason the web flasher tool does not work for me. It connects to the SkyConnect, but cannot poll the current version.
I then tried the Linux command-line flasher and succeeded in flashing the latest version, 7.3.1.0, but the issue persists.

Replying to myself:

My problem seems to be solved.

I was messing around with the coordinators, and it turned out that the problem was the Sonoff Zigbee dongle after all. If I remove the Sonoff Zigbee dongle, then ZHA with the SkyConnect wakes up all right.
I got hold of a newer version of the Sonoff stick, the shorter V2 version, and this one does not seem to collide with the SkyConnect.

Now I have to reconfigure Z2M for the new Sonoff Zigbee Dongle 3.0 Plus and re-pair all of my devices on that network.

Sorry to bump an old thread, but I wanted to report back that, at least for my issue, HA 2024.3.1, which includes GitHub @puddly’s HA Core / ZHA PR #112415, has totally resolved this for me.

I think my issue was the same as described in the PR, with duplicate unique_ids, since I migrated the Zigbee portion of my HUSBZB-1 to other coordinators but needed to retain the stick for the Z-Wave portion, leaving it connected to my host after the migration.

As I outlined, I had to complete the ‘Migrate Radio’ config flow after each and every HA restart and re-enter my correct device path (socket:// after migrating to the SLZB-06), which worked fine but was supremely annoying.

After updating to 2024.3.1 and a few restarts, no more migrating!