The only time I’ve ever had Zigbee2MQTT actually crash was when attempting to load a customization for a specific type of device. You may have just found the culprit.
Can you remove that device? If not, can you revert to a backup?
If you google that message, this thread at the Zigbee2MQTT forum seems useful, as does the link cited in the thread.
Are you factory resetting each device before you add them back? I think you have some bad baggage. Also, I would recommend AGAINST doing any restores from your backups; again, I think you have some baggage that needs to be cleaned out.
Also, I know many are successful running zigbee2mqtt inside their Home Assistant setup; however, I just find running them in separate docker containers or on separate VMs or physical machines a more robust way to go.
I actually joined it to the network, then thought I didn’t really need power monitoring for a shower pump, so I deleted it from the network with join disabled, leaving it in a searching state for when it was next powered on for something else.
I realised later that I had put this in to automate a routine: if the shower is being used and the water tank temp drops below a certain temperature, it turns the boiler on to bring the tank back to temp.
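For anyone following along, that routine boils down to a two-condition check. Here is a minimal sketch in Python, assuming the shower is detected from the pump's power draw. The names (`TankState`, `should_fire_boiler`) and both thresholds are made up for illustration and are not taken from the actual automation.

```python
from dataclasses import dataclass

# Hypothetical thresholds; tune them to the real pump and tank.
PUMP_ON_WATTS = 5.0      # power draw above which the shower pump counts as running
MIN_TANK_TEMP_C = 45.0   # tank temperature below which the boiler should top up


@dataclass
class TankState:
    shower_pump_watts: float  # reading from the power-monitoring plug
    tank_temp_c: float        # reading from the tank temperature sensor


def should_fire_boiler(state: TankState,
                       pump_on_watts: float = PUMP_ON_WATTS,
                       min_tank_temp_c: float = MIN_TANK_TEMP_C) -> bool:
    """Return True only when the shower is in use AND the tank is too cold."""
    shower_in_use = state.shower_pump_watts > pump_on_watts
    tank_too_cold = state.tank_temp_c < min_tank_temp_c
    return shower_in_use and tank_too_cold


if __name__ == "__main__":
    print(should_fire_boiler(TankState(shower_pump_watts=60.0, tank_temp_c=40.0)))
    print(should_fire_boiler(TankState(shower_pump_watts=0.0, tank_temp_c=40.0)))
```

This also shows why the plug's power monitoring matters here: without that reading there is nothing to tell the routine the shower is running.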
I’d added a couple more routers by that point. I plugged it in. Realised something wasn’t working so held the power button to force a join. When finished I realised Z2M had restarted.
I’m not seeing this join message issue now. I was earlier with around 10 devices continually making such join messages. That was before changing channel and network key.
What do you mean by factory resetting though? Since the reset, all devices are gone from the device view, but their names and unique identifier strings are still in the database. So to bring them back I have to hold whatever button for the majority of devices; for a few hue bulbs I use a hue remote to reset them, and one or two things need a multi-on-off-on-off cycle of physical power. So if that is factory resetting, then yes - there is no other way for them to join the network with the new channel and network key.
The join message at the time it restarted was from when I plugged in that device, which had already been removed. After restarting, it joined and completed correctly after a bit of back and forth.
Thanks for the info, hope you are feeling some progress!
The zigbee2mqtt forum thread and linked article that I and the other chap posted show how complex a zigbee network can be. Devices can cache their ‘join request’ and credentials; this is supported to allow end devices to go to sleep, wake up and do things. Routers have some similar interesting properties as well to support ‘sleepy end devices’.
All that said, it sounds like changing the network key (and maybe the channel) kiboshed all those cached creds. Still, figuring out how to do a factory reset of a zigbee device is a good thing to have in your quiver, and I try to do it when a device shows odd behavior or I am moving it to another location.
What is it that you do for a typical reset? Is it like I do, holding that re-pair button and/or force deleting it from the controller, then re-pairing, or is there something else perhaps?
All is stable this morning. I’ll add a few more devices.
Restored from a week old backup and then went through the channel change, network key, and PAN change again.
So far nothing crashed, but I have lost a couple of devices - again light dimmer switches.
I still see “Accepting joining” many times from devices - all of which I removed, added to an echo with zigbee, left for some time, deleted, then joined back to Z2M. It seems to mostly be hue motion sensors.
It is difficult to follow your steps from your writing.
Not sure what you ‘restored from a week old backup’. Zigbee2MQTT config? Home Assistant config? Both?
‘went through the channel change, network key and PAN change again’. Did you change them to new values or to the same values from some prior evolution?
‘added to an echo with zigbee’? WTF. So you have another zigbee network; first time I think I’ve seen you share this info. And what purpose do these steps serve for your setup?
The mystery seems to be why you are trying to paint a Jackson Pollock
You seem to have a fairly complex Zigbee setup. It is hard to do, I’ve been in similar boats before, I think you have to be regimented in your changes, let things settle, and divide into simpler parts. Stop the spaghetti testing. I do not think you have shared the full list of your 40+ devices; however, from what I’ve seen in your posts, from Tuya to Hue Motion Sensors, you seem to have picked some of the more, might I say, ‘difficult/odd/abbynormal’ devices. I would recommend reviewing the Zigbee2MQTT forums for others’ experiences with each of your devices; there is lots of knowledge in these forums from folks’ experiences.
All automations had disappeared in home assistant. After adding a new automation, still nothing would show. Restoring the automations.yaml and configuration.yaml on their own did not work to restore them. As I had first made a change to test moving some devices to ZHA, and I had changed some automations, scripts, templates and dashboards to support the new entity name formats, as well as finding a kind of bug along the way, I decided the best option was to do a full restore back one week.
I changed the channel again to 25, and the PAN and Network keys to something completely new.
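If anyone wants fresh random values rather than inventing them by hand, here is a rough Python sketch. The key names (`advanced`, `channel`, `pan_id`, `network_key`) are how I understand the Zigbee2MQTT configuration.yaml layout, so check the docs for your version before copying anything across. The helper name `new_network_settings` is made up for illustration.

```python
import secrets


def new_network_settings(channel: int = 25) -> dict:
    """Build a dict of fresh Zigbee network settings (a sketch, not an official tool)."""
    # 2.4 GHz Zigbee channels run from 11 to 26.
    if not 11 <= channel <= 26:
        raise ValueError(f"Zigbee channel must be 11..26, got {channel}")

    # A network key is 16 bytes; emitted here as a list of 16 integers (0..255).
    network_key = list(secrets.token_bytes(16))

    # PAN ID is 16 bits; 0x0000 and 0xFFFF (broadcast) are avoided here.
    pan_id = 1 + secrets.randbelow(0xFFFE)  # 0x0001..0xFFFE

    return {
        "advanced": {
            "channel": channel,
            "pan_id": pan_id,
            "network_key": network_key,
        }
    }


if __name__ == "__main__":
    settings = new_network_settings(channel=25)
    print(settings)
```

As discussed above, changing the channel or network key means the existing devices have to be re-paired.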
In post 10 you kindly shared a link to something about this “accepting joining” info line from the debug that was showing. Like : info 2023-05-11 13:25:08: Accepting joining not in blocklist device ‘0xbc33acfffe26e4a7’
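To see which devices produce that line most often, something like the following could tally them. It is a small sketch that only assumes the message format shown in the line above; the function name `count_join_attempts` is made up, and you would feed it lines from wherever your Z2M log ends up.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the message quoted above; accepts straight or curly quotes around the address.
JOIN_RE = re.compile(
    r"Accepting joining not in blocklist device\s+['\"‘’](0x[0-9a-fA-F]{16})['\"‘’]"
)


def count_join_attempts(lines: Iterable[str]) -> Counter:
    """Count 'Accepting joining' log lines per IEEE address."""
    counts: Counter = Counter()
    for line in lines:
        match = JOIN_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "info 2023-05-11 13:25:08: Accepting joining not in blocklist device '0xbc33acfffe26e4a7'",
        "info 2023-05-11 13:25:40: Accepting joining not in blocklist device '0xbc33acfffe26e4a7'",
        "info 2023-05-11 13:26:02: some unrelated log line",
    ]
    for address, count in count_join_attempts(sample).most_common():
        print(address, count)
```

A handful of addresses dominating the tally would point at specific devices struggling to reach a parent, rather than a network-wide problem.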
At least some of the way through the post Koenkk mentions if you join the device to a totally different zigbee network it should clear out the network details. I mentioned I could join this to an echo with zigbee to achieve this, then re-join it to my actual zigbee network (this in theory may make it have the right details and stop this), however it does not. There is nothing normally connected to the echo, so I assume it doesn’t actually run anything actively.
The end of that post actually mentions this behaviour of accepting joining is relatively normal, and it happens when a device can’t reach its parent.
I have around 100 devices. All worked perfectly in harmony until a few weeks ago. Most motion sensors are centred around Hue, which I also use for some plugs without power monitoring - I find those seem to be pretty robust. Then there is Hive for power monitoring sockets (although I have used Aurora and Samsung before), Aqara for buttons, 2 or 3 tuya radar sensors (as they were cheap and I couldn’t be bothered to build more ESP based ones), Aurora for wired sockets and some dimmers, a couple of tuya buttons (from LoraTap, which I’ve found to be perhaps the most reliable button, and can hop instantly to any parent), Xiaomi contact sensors, some sonoff relays, Develco smoke alarms, and Gledopto LED controllers. Which do you consider to be oddball, and/or what are your go-to devices for the categories of motion / microwave / switching / plugs / contact sensors / buttons?
I’m pretty sure this was all down to zigbee interference. I think I mentioned above that when I was on channel 11, simply turning on the solar optimiser controller would reset all the Aurora plugs, some Aurora lights, and a few other brands. I suspect the solar optimiser controller jumped channels and/or upped its transmit power following a software update or similar. I will not be using Tigo again, that is for sure!
So far though, no more crashing. I’m not convinced about the Aurora Aone dimmers at the moment, but will see; if one plays up it will be removed and replaced, but there seem to be limited suitable devices in this space.
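For a sense of how far apart channels 11 and 25 sit, the 2.4 GHz Zigbee channels (11 to 26) are spaced 5 MHz apart, starting at 2405 MHz. Here is a small Python sketch of that arithmetic. The function name is made up, and it says nothing about where the solar controller actually transmits, which I don't know.

```python
def zigbee_center_mhz(channel: int) -> int:
    """Centre frequency in MHz of a 2.4 GHz Zigbee (IEEE 802.15.4) channel."""
    if not 11 <= channel <= 26:
        raise ValueError(f"Zigbee channel must be 11..26, got {channel}")
    # Channel 11 sits at 2405 MHz; each later channel is 5 MHz higher.
    return 2405 + 5 * (channel - 11)


if __name__ == "__main__":
    old, new = 11, 25
    print(old, zigbee_center_mhz(old), "MHz")
    print(new, zigbee_center_mhz(new), "MHz")
    print("separation:", zigbee_center_mhz(new) - zigbee_center_mhz(old), "MHz")
```

So moving from channel 11 to 25 shifts the network by 70 MHz, from the bottom of the band to near the top.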