HA Yellow/Zigbee Issue

Over the past year, I have consistently had an issue where my Zigbee devices go offline and the only way to get them back is to perform a hardware reboot. Most of the time it happens after updates, but sometimes it is random.
Has anyone else experienced this issue?

Could be any number of things - we need more specifics.

  • What integration are you using?
  • What coordinator are you using?
  • How many devices do you have, and are they routers or sensors?
  • Which ones go offline?
  • Have they ever worked properly?
  • Any logs?

Lots of good advice here:

Thank you for the response!

  • I am using the official Zigbee integration, Zigbee Home Automation (ZHA)
  • I have the Home Assistant Yellow, which comes with the Zigbee hardware built in
  • I have 2 Aqara door sensors, 5 Aqara temp/hum sensors, and 1 SmartThings button. I believe they are all acting as sensors. (I am waiting to solve this issue before expanding to more devices.)
  • This is the crazy part… the entire integration fails. The integrations page shows it as failed, which renders all the devices offline.
  • It worked properly and without issues for the first half of 2023 before the problems started. Even now, I haven’t performed any updates and the integration has been offline for the past couple of days. A full hardware reboot will fix the issue.
  • I just enabled debugging so I should have logs starting today.

Have you no mains-powered devices at all?

That’s correct.

If your coordinator is connecting directly with each of your sensors, then Zigbee is bound to be fragile. You can do it that way, but there is a limit to the number of devices a coordinator can connect with, and all of those connections are going to be very vulnerable to environmental factors.

Zigbee is designed to work in a mesh of mains powered devices; the coordinator will connect directly with very few of them and there will always be alternative routes for messages to take. Battery-powered devices don’t make any contribution to the network at all.
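To make the routing argument concrete, here is a toy sketch in Python. It is only a graph model, not a Zigbee simulation: the node names and links are made up for illustration, and real Zigbee routing involves much more than reachability. It shows that when one link to the coordinator drops, a directly attached sensor is cut off, while a sensor behind meshed routers can still be reached.

```python
from collections import deque


def reachable(links, start):
    """Return the set of nodes reachable from start over undirected links (BFS)."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical star layout: every sensor talks only to the coordinator.
star = {("coordinator", "sensor_a"), ("coordinator", "sensor_b")}

# Hypothetical mesh layout: two routers that see the coordinator and each other.
mesh = {
    ("coordinator", "router_1"),
    ("coordinator", "router_2"),
    ("router_1", "router_2"),
    ("router_1", "sensor_a"),
}

# Drop one link to the coordinator in each layout.
star_broken = star - {("coordinator", "sensor_a")}
mesh_broken = mesh - {("coordinator", "router_1")}

print("sensor_a" in reachable(star_broken, "coordinator"))  # False: no other path
print("sensor_a" in reachable(mesh_broken, "coordinator"))  # True: via router_2
```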

Do read the post I provided a link to.

Thank you for the explanation and the extra push to read the link. The topology information along with understanding the need for mains powered devices seems key.
I’ve picked up a mains-powered device to set up and will report back on the results!

Ok. I am definitely in a better spot with my topology! I added 4 Tradfri bulbs and 2 outlets spread out over the entire home. Thank you again for the assistance with this.
Before: [topology screenshot]

After: [topology screenshot]

I am still experiencing random crashes of the Zigbee integration. It seems like the network could possibly use some time to “settle in,” but it crashes about every 12 hours.

I have an alert monitoring when the devices become unavailable and can pinpoint when the integration goes down to within about 30 minutes. I enabled debugging on the integration but I am not quite sure what I should be looking for.

What are some good next troubleshooting steps? I am thinking of removing all battery-powered devices and starting there.

Ok, so I think I was able to get the information from the logs that would be helpful.

2024-05-01 10:57:58.893 DEBUG (MainThread) [bellows.uart] Connection lost: ConnectionResetError('Remote server closed connection')
2024-05-01 10:57:58.897 ERROR (MainThread) [bellows.uart] Lost serial connection: ConnectionResetError('Remote server closed connection')
2024-05-01 10:57:58.898 DEBUG (MainThread) [bellows.ezsp] socket://core-silabs-multiprotocol:9999 connection lost unexpectedly: Remote server closed connection
2024-05-01 10:57:58.898 ERROR (MainThread) [bellows.ezsp] NCP entered failed state. Requesting APP controller restart
2024-05-01 10:57:58.899 DEBUG (MainThread) [bellows.zigbee.application] Received _reset_controller_application frame with ("Serial connection loss: ConnectionResetError('Remote server closed connection')",)
2024-05-01 10:57:58.899 DEBUG (MainThread) [zigpy.application] Connection to the radio has been lost: "Serial connection loss: ConnectionResetError('Remote server closed connection')"
2024-05-01 10:57:58.899 DEBUG (MainThread) [homeassistant.components.zha.core.gateway] Connection to the radio was lost: "Serial connection loss: ConnectionResetError('Remote server closed connection')"
2024-05-01 10:57:58.900 DEBUG (MainThread) [bellows.uart] Connection lost: None
2024-05-01 10:57:58.900 DEBUG (MainThread) [bellows.uart] Closed serial connection
2024-05-01 10:57:58.902 DEBUG (MainThread) [homeassistant.components.zha.core.gateway] Shutting down ZHA ControllerApplication

I haven’t been able to find the exact issue anywhere else, yet. I’ll keep hunting.
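For anyone digging through a much larger debug log for the same pattern, a minimal Python sketch like the one below can pull out the first ERROR line from the Zigbee-related loggers. It assumes the log lines follow the same layout as the excerpt above (timestamp, level, thread, logger in brackets, message). The list of logger prefixes and the function names are illustrative choices, not part of Home Assistant.

```python
import re
from datetime import datetime

# Matches lines shaped like the excerpt above, e.g.
# "2024-05-01 10:57:58.893 DEBUG (MainThread) [bellows.uart] message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[A-Z]+) \((?P<thread>[^)]*)\) \[(?P<logger>[^\]]+)\] (?P<msg>.*)$"
)

# Assumed set of logger prefixes worth watching; adjust as needed.
ZIGBEE_LOGGERS = ("bellows", "zigpy", "homeassistant.components.zha")


def parse_line(line):
    """Return a dict for one log line, or None if it does not match the layout."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    entry["ts"] = datetime.strptime(entry["ts"], "%Y-%m-%d %H:%M:%S.%f")
    return entry


def first_zigbee_error(lines):
    """Return the first ERROR entry (in file order) from a watched logger, or None."""
    for line in lines:
        entry = parse_line(line)
        if (
            entry is not None
            and entry["level"] == "ERROR"
            and entry["logger"].startswith(ZIGBEE_LOGGERS)
        ):
            return entry
    return None


sample = [
    "2024-05-01 10:57:58.893 DEBUG (MainThread) [bellows.uart] Connection lost: ConnectionResetError('Remote server closed connection')",
    "2024-05-01 10:57:58.897 ERROR (MainThread) [bellows.uart] Lost serial connection: ConnectionResetError('Remote server closed connection')",
    "2024-05-01 10:57:58.898 ERROR (MainThread) [bellows.ezsp] NCP entered failed state. Requesting APP controller restart",
]
hit = first_zigbee_error(sample)
print(hit["ts"], hit["logger"], hit["msg"])
```

To scan a real file, pass an open file object instead of `sample`, since the function only needs an iterable of lines. In the excerpt above, the first error is the lost serial connection, right after the connection to socket://core-silabs-multiprotocol:9999 was closed by the remote side.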

So this isn’t a solution, but it’s a band-aid fix for my issue.
I created an automation where, if more than 3 Zigbee devices are offline, it will restart HA (with a frequency limiter and a downtime duration check). It’s been working great and, from a user experience standpoint, resolves my issue.
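The decision logic behind that kind of watchdog can be sketched in Python as below. This is not the actual automation, which would live in Home Assistant's own automation config. Only the "more than 3 devices" rule comes from the description above; the 5-minute downtime and 1-hour restart limit are made-up placeholder values, and the function and variable names are hypothetical.

```python
from datetime import datetime, timedelta

OFFLINE_THRESHOLD = 3                      # restart when MORE than this many devices are offline
MIN_DOWNTIME = timedelta(minutes=5)        # assumed value: how long a device must stay offline
MIN_RESTART_INTERVAL = timedelta(hours=1)  # assumed value: at most one restart per hour


def should_restart(offline_since, now, last_restart):
    """Decide whether the watchdog should trigger a restart.

    offline_since: dict of device name -> datetime it became unavailable
                   (only devices that are currently offline).
    now:           current time.
    last_restart:  datetime of the previous automatic restart, or None.
    """
    # Downtime duration check: ignore devices that only just dropped off.
    long_offline = [
        name for name, since in offline_since.items()
        if now - since >= MIN_DOWNTIME
    ]
    if len(long_offline) <= OFFLINE_THRESHOLD:
        return False
    # Frequency limiter: avoid restart loops if the restart does not help.
    if last_restart is not None and now - last_restart < MIN_RESTART_INTERVAL:
        return False
    return True


now = datetime(2024, 5, 1, 11, 30)
offline = {f"sensor_{i}": now - timedelta(minutes=10) for i in range(4)}
print(should_restart(offline, now, last_restart=None))                         # True
print(should_restart(offline, now, last_restart=now - timedelta(minutes=20)))  # False
```

Counting only devices that have been offline for a while keeps a single sleepy battery sensor from triggering a restart, and the interval check keeps the system from rebooting repeatedly when the radio does not come back.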

If I have time in the future I will test my old Zigbee dongle to see if the problem is related to the Yellow’s Zigbee chip, but until then this will have to do.