Some lightbulbs drop off (Tradfri)

This is a handy utility as well. It shows the near-realtime state of all of your devices, and you can sort by LQI, RSSI, or Status. But it is ephemeral, which is why I went down the rabbit hole of writing a program to capture this over time:

# configuration.yaml
    # https://github.com/dmulcahey/zha-network-card
    - type: module
      url: /local/custom-lovelace/zha-network-card/zha-network-card.js

# ui-lovelace-some-page.yaml

title: ZHA
# icon: mdi:home-outline

cards:

  # https://github.com/dmulcahey/zha-network-card
  - type: 'custom:zha-network-card'
    clickable: true
    columns:
      - name: Name
        prop: name
      - attr: available
        id: available
        modify: x || "false"
        name: Online
      - attr: manufacturer
        name: Manufacturer
      - attr: manufacturer_code
        name: Manufacturer Code
      - attr: model
        name: Model
      - attr: ieee
        name: IEEE
      # - attr: device_reg_id
      #   name: Device Reg ID
      - attr: device_type
        name: Device Type
      - name: NWK
        prop: nwk
      - attr: rssi
        name: RSSI
      - attr: lqi
        name: LQI
      - attr: last_seen
        name: Last Seen
      - attr: power_source
        name: Power Source
      - attr: quirk_class
        name: Quirk
      - attr: quirk_applied
        name: Quirk Applied
    sort_by: available
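
To give a rough idea of what "capturing this over time" could look like, here is a minimal sketch that appends one timestamped row per device to a CSV file. This is not the program mentioned above. The function name, the field list and the sample devices are all made up for illustration, and how you actually fetch the device list from ZHA is left out on purpose.

```python
import csv
import io
from datetime import datetime, timezone

# Column names are an assumption; they mirror attributes shown in the card above.
FIELDS = ["timestamp", "ieee", "name", "lqi", "rssi", "available"]


def append_snapshot(out, devices, now=None, write_header=False):
    """Append one timestamped CSV row per device to an open text stream."""
    now = now or datetime.now(timezone.utc)
    # extrasaction="ignore" drops any device keys that are not in FIELDS.
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    if write_header:
        writer.writeheader()
    for dev in devices:
        writer.writerow({"timestamp": now.isoformat(), **dev})
    return len(devices)


# Hypothetical sample data; real values would have to come from ZHA.
sample = [
    {"ieee": "00:11:22:33:44:55:66:77", "name": "Rail bulb 1",
     "lqi": 200, "rssi": -60, "available": True},
    {"ieee": "00:11:22:33:44:55:66:78", "name": "Rail bulb 2",
     "lqi": 100, "rssi": -75, "available": False},
]

buf = io.StringIO()  # stand-in for a file opened with open(path, "a", newline="")
rows_written = append_snapshot(buf, sample, write_header=True)
```

Run on a schedule, a file like this lets you see whether a bulb's LQI was already sagging before it dropped off, which the live card cannot show.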



Thanks for the detailed explanation.

I was thinking that link quality might be the problem: my bridge currently sits in a corner of the apartment, though later it will move to a more central location. But in general the distance between the two most distant bulbs is less than 10 m, and the distance between two neighbouring bulbs is 3-4 m at most (with no physical barrier between them).

The bulbs are in metal casings. Interestingly, I have four bulbs on a single rail, and one has an LQI of 200 (which should be near perfect) while another just centimeters away is at about LQI 100. It also happens that three out of four bulbs work correctly and just one decides to misbehave.

I know that there is a limit on the number of devices connected to the bridge, and I was wondering whether I can control the path between my bulb and my bridge (so I can choose which “router” is used as the intermediate connection). It seems that I cannot influence that.

I will catalogue LQI values to see if I can improve connection quality.

But I would like to understand this log error message (it repeats a lot, and my Google search did not produce any results).

If you power cycle the bulbs, do they work fine for a while?

No, they remain uncontrollable (although they paired correctly and are registered in Home Assistant).

If you can get in at the command prompt level, make a copy of your home-assistant.log with these errors in it and look at the detailed entries in a text editor. It is sometimes hard to extract the full detail of log entries from the HA user interface. From what I found on GitHub based on the log entry you showed, these repeating errors seem to relate to the small SQLite database that ZHA uses to keep track of the Zigbee network and devices. Again, it is hard to tell what is going on without more detail from the log entries. Is there a problem with the SQLite database itself, or is something happening so fast that ZHA cannot write to the database? There are a number of possibilities. The zigbee.db is a small database and does not keep data over time.

From the graph, your network does not look ‘unhealthy’, but the devil is in the details.

Remember that LQI is a point-to-point measure, so you will have one or two LQI values for each neighbor that a device connects to. You get an LQI value for the connection TO a neighbor and possibly another LQI value for the connection FROM a neighbor. So for each bulb you will have a ‘cloud’ of LQI values around it :grin:. I have some devices that have a really crappy LQI value to a far-away device, but good LQI with other, closer devices.
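
To make the ‘cloud’ idea a bit more concrete, here is a small sketch that groups point-to-point LQI readings by device and reports the worst and best link for each. The device names and numbers are invented, and the (device, neighbor, lqi) tuple layout is only an assumption about how you might hold such readings, not how ZHA stores them.

```python
from collections import defaultdict


def lqi_cloud(links):
    """Group point-to-point LQI readings by the device that reported them."""
    cloud = defaultdict(dict)
    for device, neighbor, lqi in links:
        cloud[device][neighbor] = lqi
    return dict(cloud)


def summarize(cloud):
    """Return (worst, best) LQI per device across all of its neighbor links."""
    return {dev: (min(n.values()), max(n.values())) for dev, n in cloud.items()}


# Hypothetical readings: one device can look bad toward a far neighbor
# and still have a strong link to a closer one.
links = [
    ("bulb_a", "coordinator", 200),
    ("bulb_a", "bulb_d", 60),
    ("bulb_b", "coordinator", 100),
    ("bulb_b", "bulb_a", 180),
]

cloud = lqi_cloud(links)
summary = summarize(cloud)
```

A single LQI number per bulb hides this spread, so a low value on its own does not say whether the bulb has a better link it could use.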

Based on your description, it does not sound like you have any signal barriers, walls, distance, etc. playing into the problem.

You can add devices ‘VIA’ other devices. I am again a noob here, but this appears to be a way to hint to your network the relative placement of devices. Your coordinator and routers will still change routes on their own, and it does not appear that you can force any particular route to stay in place. But again, I am learning.

To your question about firmware versions of your bulbs: see the picture below. Do your bulbs show a similar firmware version? Some devices do and some don’t on ZHA, in my experience. Are they all the same? I have yet to see a firmware upgrade occur on my ZHA network. You can turn on the debug level for this logger to see OTA detail, so something of interest might show up there if your bulbs are not all at the same firmware level:

logger:
  default: warning
  logs:
    asyncio: info
    homeassistant.core: info
    zigpy.ota: debug

There should be a new firmware for the EFR32MG1P in the Sonoff bridge that fixes some issues. Look for 6.7.8 on the Tasmota website.

Also enable source routing in zigpy.

Add the following to your configuration.yaml:

zha:
  zigpy_config:
    source_routing: true

Curious: if you do this firmware update to the Sonoff Zigbee coordinator, do you lose your network and have to rebuild it? Thx!

No, you keep the network. You can, however, back up everything beforehand:

You can even restore the backup on a different hub, or an EZSP stick (Nortek GoControl HUSBZB-1 / Elelabs ELU013), or even a TI stick, and migrate your network seamlessly.

Good to know, thanks for the info and link!

Now, that is a real option :wink:

--i-understand-i-can-update-eui64-only-once-and-i-still-want-to-do-it

Thanks, I noticed the new firmware but it was a “release candidate”. I see now that the readme suggests 6.7.8 as the preferred firmware. Will update.

What is the difference between current networking and source routing? What will change?

When I download only the Zigbee firmware, Tasmota reports an error (file signature error or something like that). I have done an OTA update, and I can only guess that the Zigbee part was updated along with the Tasmota part.

I have moved to source routing and used your ezsp_config data, and after some resetting and rejoining my bulbs seem to work fully OK now. I will test this a little more in the coming days, but currently I have 100% of the desired functionality, which is great!

Now when I look at my Zigbee network, some bulbs are connected to other bulbs, which in turn are connected to the bridge (which is what I wanted to achieve). This means that either I hit a limit of some kind when I reached 32 Zigbee devices on the Sonoff bridge, or the change to source routing made a difference. (I have read some articles about source routing but frankly could not figure out the difference between default and source routing in a Zigbee network.)

I have not managed to create a card for checking all Zigbee devices in one spot, but I will likely do that in the next few days (or at least try).

I will report back in a day or two to confirm that my setup still works as expected.

Thank you!

Can I kindly ask for someone to explain what ‘source routing’ is? I have already attempted to search for this info but have come up short. thanks in advance.

That is some super news! Nothing better than to do that ‘I’m bad’ dance in front of the significant other, after she was giving you the ‘eye’ as the lights flashed on and off :grinning:

Do you attribute the improvements to the Sonoff Zigbee bridge firmware upgrade or to setting the source routing at the ZHA level, or both or ¯_(ツ)_/¯ ?

Congrats, tech that works, what a concept!

Maybe this write-up might help. I am still learning, but as I understand it, Zigbee networks allow several different routing methods for the network and its devices to decide how to get a packet from the source node to the destination node (or nodes). Some of the methods are old and deprecated; I think there is a tree routing one that is for the history books.

AODV = Ad-hoc On-demand Distance Vector

Digi - Source routing
https://www.digi.com/resources/documentation/Digidocs/90002002/Concepts/c_zb_source_routing.htm?TocPath=Transmission%2C%20addressing%2C%20and%20routing|RF%20packet%20routing|Source%20routing|_____0#:~:text=Zigbee%20source%20routing%20helps%20solve,specify%20routes%20for%20many%20remotes.&text=A%20remote%20device%20sends%20an%20RF%20data%20packet%20to%20the%20data%20collector.
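
As a toy illustration of the difference (a sketch of the concept only, not how any Zigbee stack implements it): with table-based routing such as AODV, every node on the way looks up its own next hop for the destination, while with source routing the sender puts the complete relay list into the packet. The node names and both function names below are made up.

```python
def deliver_with_tables(tables, source, dest, max_hops=10):
    """Hop-by-hop routing: each node looks up its own next hop toward dest."""
    path = [source]
    node = source
    while node != dest:
        if len(path) > max_hops:
            raise RuntimeError("routing loop or path too long")
        node = tables[node][dest]  # every relay needs an entry for dest
        path.append(node)
    return path


def deliver_with_source_route(source, relays, dest):
    """Source routing: the sender carries the full relay list in the packet."""
    return [source, *relays, dest]


# Hypothetical three-hop network: coordinator -> bulb_a -> bulb_b -> bulb_c.
tables = {
    "coordinator": {"bulb_c": "bulb_a"},
    "bulb_a": {"bulb_c": "bulb_b"},
    "bulb_b": {"bulb_c": "bulb_c"},
}

table_path = deliver_with_tables(tables, "coordinator", "bulb_c")
source_path = deliver_with_source_route(
    "coordinator", ["bulb_a", "bulb_b"], "bulb_c"
)
```

Both calls end up with the same path. The difference is where the knowledge lives: in the first case each relay has to hold a table entry for bulb_c, in the second only the sender has to remember the route. As far as I understand the Digi page, that is why source routing helps when one node talks to many remotes and the relays have limited routing table space.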

If I have to guess, I would say that change of routing parameters solved the problem.

I am not entirely sure that I have upgraded ezsp firmware (I do not know how to check that, firmware version is not reported in HA, maybe there is some console command for tasmota to get that info, will check).

p.s. One other conclusion from two days of playing with lights is that they perform much better if lights are grouped on ZHA level vs. home assistant group level.

Thanks for the link. I sorta understand it but not 100%.
What I could not glean from the link is if it’s ‘better’ to have source routing on?
My ZHA network with approx 30 devices is very stable so not sure that I want to tempt fate by turning source routing on unless there is a specific advantage.
thanks

Your conclusion that control at the ZHA level of commanding groups of zigbee devices makes sense. I think this was/is one of the big selling points of zigbee for pro lighting, low latency and devices moving in unison.

Here is something you can try to get the version, but be careful! I am not sure how safe it is to execute zha bellows commands within HA while HA is running its own bellows instance. I have received a Python timeout error a couple of times, but HA and ZHA seem to recover and continue. This shows that I am running 6.7.6.0 firmware in my Tasmota-flashed Sonoff Zigbee hub. I am running HA in a Docker container, so I can basically ssh into it via the docker command line or the Portainer GUI as shown. The version is also dumped in the log if you turn the right debug on, shown below:

# https://github.com/zigpy/bellows


export EZSP_DEVICE=socket://192.168.2.190:8888

bellows info

[60:a4:23:ff:fe:00:00:00]
[0x0000]
[<EmberNetworkStatus.JOINED_NETWORK: 2>]
[<EmberStatus.SUCCESS: 0>, <EmberNodeType.COORDINATOR: 1>, EmberNetworkParameters(extendedPanId=cc:cc:cc:cc:e3:ab:00:00, panId=0x3498, radioTxPower=20, radioChannel=11, joinMethod=<EmberJoinMethod.USE_MAC_ASSOCIATION: 0>, nwkManagerId=0x0000, nwkUpdateId=0, channels=<Channels.ALL_CHANNELS: 134215680>)]
[<EmberStatus.SUCCESS: 0>, EmberCurrentSecurityState(bitmask=<EmberCurrentSecurityBitmask.64|32|HAVE_TRUST_CENTER_LINK_KEY|8|GLOBAL_LINK_KEY: 124>, trustCenterLongAddress=60:a4:23:ff:fe:00:00:00)]
Manufacturer: 
Board name: 
EmberZNet version: 6.7.6.0 build 327

2021-01-25 09:03:59 INFO (MainThread) [bellows.zigbee.application] EZSP Radio manufacturer: 
2021-01-25 09:03:59 INFO (MainThread) [bellows.zigbee.application] EZSP Radio board name: 
2021-01-25 09:03:59 INFO (MainThread) [bellows.zigbee.application] EmberZNet version: 6.7.6.0 build 327
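
If you want to check the version from a script rather than by eye, a regular expression over either output above is probably enough. This is a minimal sketch that assumes the line keeps the "EmberZNet version: a.b.c.d build N" shape shown above. The function name is made up, and the comparison against 6.7.8 is only there because that is the firmware version discussed earlier in the thread.

```python
import re

# Matches e.g. "EmberZNet version: 6.7.6.0 build 327"; the build part is optional.
VERSION_RE = re.compile(
    r"EmberZNet version:\s*(\d+)\.(\d+)\.(\d+)\.(\d+)(?:\s+build\s+(\d+))?"
)


def parse_emberznet_version(text):
    """Return ((major, minor, patch, special), build) or None if not found."""
    match = VERSION_RE.search(text)
    if match is None:
        return None
    version = tuple(int(part) for part in match.groups()[:4])
    build = int(match.group(5)) if match.group(5) else None
    return version, build


log_line = (
    "2021-01-25 09:03:59 INFO (MainThread) [bellows.zigbee.application] "
    "EmberZNet version: 6.7.6.0 build 327"
)

parsed = parse_emberznet_version(log_line)
# Tuples compare element by element, so this checks for 6.7.8 or newer.
is_at_least_678 = parsed is not None and parsed[0][:3] >= (6, 7, 8)
```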

You ask a good question. My first thought is: don’t mess with something that is working! But keep source routing and some of the other possible options in your quiver as your network expands. In @mrakar 's case, for reasons TBD, source routing seems to be a must-have. Is it the devices and their quirks? Is it how the network was ‘built out’? Much to learn! Good hunting!

Yes, thanks. Lots to learn!