Matter stopped working overnight?!

I’ve been using the Matter integration with 10 Matter-over-Thread devices since the beginning of the year. It worked without any issues until last week.

As Thread border routers I use one Apple TV and 6 HomePod minis. As OpenThread Border Router (OTBR) I use a ConBee II flashed with the Thread firmware. The Apple network is the preferred one, and the OTBR is part of the preferred network.

Last week more and more devices went offline, but only in Home Assistant. In HomeKit all devices are still working.

Also, it was no longer possible to add a new device via the Matter integration.

Neither as a new device nor by sharing an existing device from the Home app.

Via HomeKit (Home app) it’s still possible to add new devices without any issues.

I decided to delete all devices, the Matter and Thread integrations, and the corresponding add-ons.

Then I set everything up from scratch again. But the issue is still the same: I can’t add new devices anymore.

I tested this with three different Home Assistant installations and with two different ConBee II sticks as OTBR.

  • HA OS 2024.8.1 as VM in Proxmox → Production
  • HA OS 2024.8.1 as VM in Proxmox → Test System
  • HA OS 2024.8.1 on Odroid M1 → Test System

Home Assistant and all Thread border routers are in the same VLAN.

Here is the log from the OpenThread Border Router add-on:

-----------------------------------------------------------
 Add-on: OpenThread Border Router
 OpenThread Border Router add-on
-----------------------------------------------------------
 Add-on version: 2.9.1
 You are running the latest version of this add-on.
 System: Home Assistant OS 12.4  (amd64 / qemux86-64)
 Home Assistant Core: 2024.8.1
 Home Assistant Supervisor: 2024.08.0
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
s6-rc: info: service banner successfully started
s6-rc: info: service universal-silabs-flasher: starting
[23:09:12] INFO: Flashing firmware is disabled
s6-rc: info: service universal-silabs-flasher successfully started
s6-rc: info: service otbr-agent: starting
[23:09:12] INFO: Setup OTBR firewall...
[23:09:12] INFO: Starting otbr-agent...
s6-rc: info: service otbr-agent successfully started
s6-rc: info: service otbr-agent-rest-discovery: starting
s6-rc: info: service otbr-agent-configure: starting
s6-rc: info: service otbr-web: starting
s6-rc: info: service otbr-web successfully started
[23:09:13] INFO: Starting otbr-web...
otbr-web[263]: [INFO]-WEB-----: Running 0.3.0-41474ce-dirty
otbr-web[263]: [INFO]-WEB-----: Border router web started on wpan0
[23:09:13] INFO: Enabling NAT64.
Done
Done
Done
s6-rc: info: service otbr-agent-configure successfully started
[23:09:13] INFO: Successfully sent discovery information to Home Assistant.
s6-rc: info: service otbr-agent-rest-discovery successfully started
s6-rc: info: service legacy-services: starting
s6-rc: info: service legacy-services successfully started
49d.21:38:26.999 [C] P-SpinelDrive-: Software reset co-processor successfully
00:00:00.038 [W] P-Netif-------: Failed to process request#2: No such process
00:00:00.038 [W] P-Netif-------: Failed to process request#6: No such process
00:00:00.329 [W] P-Netif-------: Failed to process request#7: No such process
00:00:00.604 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:00:12.237 [W] P-Netif-------: Successfully added an external route ::/0 in kernel
00:00:12.237 [W] P-Netif-------: Successfully added an external route fd11:93b3:21ea:ffff:0:0::/96 in kernel
00:00:14.706 [W] DuaManager----: Failed to perform next registration: NotFound
00:00:22.020 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:00:22.435 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:00:22.472 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:00:27.719 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:00:30.242 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:00:35.577 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:00:46.980 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:00:47.990 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:00:47.996 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:00:57.245 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:07.833 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:09.514 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:11.561 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:14.724 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:43.509 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:44.340 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:44.741 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:45.178 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:45.579 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:45.987 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:46.384 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:46.808 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:47.471 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:48.010 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:48.040 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:48.420 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:48.455 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:48.853 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:49.434 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:50.901 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:55.281 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:58.283 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:58.555 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:59.412 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:59.911 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:02:00.330 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:02:00.724 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:02:04.011 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:02:05.234 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:02:12.610 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:02:15.049 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
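
To get a quick overview of how often the radio reports ChannelAccessFailure, the OTBR log above can be summarized with a small script. This is only a minimal Python sketch and not part of the add-on: the function name `summarize`, the regular expression, and the shortened `SAMPLE` excerpt are my own assumptions based on the line format shown in the log. As far as I understand it, ChannelAccessFailure means the radio could not transmit because the channel was busy, so a high count per minute could point to interference on the Thread channel, but I have not confirmed that this is the cause here.

```python
import re
from collections import Counter

# Assumed line format, taken from the log above, e.g.:
# 00:00:22.020 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
# An optional "49d." day prefix is tolerated.
LINE_RE = re.compile(
    r"^(?:\d+d\.)?(\d{2}):(\d{2}):(\d{2})\.\d{3} \[(\w)\] ([\w-]+?)-*: (.*)$"
)

# Shortened excerpt of the log above, used only to demonstrate the function.
SAMPLE = """\
49d.21:38:26.999 [C] P-SpinelDrive-: Software reset co-processor successfully
00:00:00.038 [W] P-Netif-------: Failed to process request#2: No such process
00:00:00.604 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:00:22.020 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:01:07.833 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
00:02:15.049 [W] P-RadioSpinel-: Handle transmit done failed: ChannelAccessFailure
"""


def summarize(log_text):
    """Count [W] lines per (module, message) and ChannelAccessFailure per uptime minute."""
    by_message = Counter()
    caf_per_minute = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # banner lines, s6-rc output, "Done", ...
        hours, minutes, _seconds, level, module, message = match.groups()
        if level != "W":
            continue  # only warnings are of interest here
        by_message[(module, message)] += 1
        if message.endswith("ChannelAccessFailure"):
            caf_per_minute[int(hours) * 60 + int(minutes)] += 1
    return by_message, caf_per_minute


if __name__ == "__main__":
    messages, per_minute = summarize(SAMPLE)
    for (module, message), count in messages.most_common():
        print(f"{count:4d}  {module}: {message}")
    for minute in sorted(per_minute):
        print(f"uptime minute {minute}: {per_minute[minute]} ChannelAccessFailure")
```

Pasting the full add-on log into `SAMPLE` (or reading it from a file) gives the count for the whole run instead of the excerpt.
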

Here is the log from the Matter Server add-on:

Add-on: Matter Server
 Matter WebSocket Server for Home Assistant Matter support.
-----------------------------------------------------------
 Add-on version: 6.4.1
 You are running the latest version of this add-on.
 System: Home Assistant OS 12.4  (amd64 / qemux86-64)
 Home Assistant Core: 2024.8.1
 Home Assistant Supervisor: 2024.08.0
-----------------------------------------------------------
 Please, share the above information when looking for help
 or support in, e.g., GitHub, forums or the Discord chat.
-----------------------------------------------------------
s6-rc: info: service banner successfully started
s6-rc: info: service matter-server: starting
[23:09:27] INFO: Starting Matter Server...
s6-rc: info: service matter-server successfully started
s6-rc: info: service legacy-services: starting
s6-rc: info: service legacy-services successfully started
[23:09:28] INFO: Using 'enp0s18' as primary network interface.
[23:09:28] INFO: Successfully send discovery information to Home Assistant.
[1723410569.028440][126:126] CHIP:CTL: Setting attestation nonce to random value
[1723410569.028582][126:126] CHIP:CTL: Setting CSR nonce to random value
[1723410569.029093][126:126] CHIP:DL: ChipLinuxStorage::Init: Using KVS config file: /tmp/chip_kvs
[1723410569.029154][126:126] CHIP:DL: writing settings to file (/tmp/chip_kvs-y3h5RG)
[1723410569.029187][126:126] CHIP:DL: renamed tmp file to file (/tmp/chip_kvs)
[1723410569.029298][126:126] CHIP:DL: ChipLinuxStorage::Init: Using KVS config file: /data/chip_factory.ini
[1723410569.029349][126:126] CHIP:DL: ChipLinuxStorage::Init: Using KVS config file: /data/chip_config.ini
[1723410569.029365][126:126] CHIP:DL: ChipLinuxStorage::Init: Using KVS config file: /data/chip_counters.ini
[1723410569.029423][126:126] CHIP:DL: writing settings to file (/data/chip_counters.ini-Z8N3ZG)
[1723410569.029503][126:126] CHIP:DL: renamed tmp file to file (/data/chip_counters.ini)
[1723410569.029506][126:126] CHIP:DL: NVS set: chip-counters/reboot-count = 12 (0xC)
[1723410569.029666][126:126] CHIP:DL: Got Ethernet interface: enp0s18
[1723410569.029733][126:126] CHIP:DL: Found the primary Ethernet interface:enp0s18
[1723410569.029862][126:126] CHIP:DL: Failed to get WiFi interface
[1723410569.029866][126:126] CHIP:DL: Failed to reset WiFi statistic counts
2024-08-11 23:09:29.030 (MainThread) CHIP_PROGRESS [chip.native.TS] Last Known Good Time: 2023-10-14T01:16:48
2024-08-11 23:09:29.030 (MainThread) CHIP_PROGRESS [chip.native.FP] Fabric index 0x1 was retrieved from storage. Compressed FabricId 0x8FE8BB58D61145F7, FabricId 0x0000000000000002, NodeId 0x000000000001B669, VendorId 0x134B
2024-08-11 23:09:29.031 (MainThread) CHIP_PROGRESS [chip.native.ZCL] Using ZAP configuration...
2024-08-11 23:09:29.032 (MainThread) CHIP_PROGRESS [chip.native.IN] CASE Server enabling CASE session setups
2024-08-11 23:09:29.069 (Dummy-2) CHIP_PROGRESS [chip.native.CTL] Setting attestation nonce to random value
2024-08-11 23:09:29.070 (Dummy-2) CHIP_PROGRESS [chip.native.CTL] Setting CSR nonce to random value
2024-08-11 23:09:29.070 (Dummy-2) CHIP_PROGRESS [chip.native.SPT] Using device attestation PAA trust store path /data/credentials.
2024-08-11 23:09:29.094 (Dummy-2) CHIP_PROGRESS [chip.native.CTL] Generating NOC
2024-08-11 23:09:29.095 (Dummy-2) CHIP_PROGRESS [chip.native.FP] Validating NOC chain
2024-08-11 23:09:29.097 (Dummy-2) CHIP_PROGRESS [chip.native.FP] NOC chain validation successful
2024-08-11 23:09:29.097 (Dummy-2) CHIP_PROGRESS [chip.native.FP] Updated fabric at index: 0x1, Node ID: 0x000000000001B669
2024-08-11 23:09:29.097 (Dummy-2) CHIP_PROGRESS [chip.native.TS] Last Known Good Time: 2023-10-14T01:16:48
2024-08-11 23:09:29.097 (Dummy-2) CHIP_PROGRESS [chip.native.TS] New proposed Last Known Good Time: 2021-01-01T00:00:00
2024-08-11 23:09:29.097 (Dummy-2) CHIP_PROGRESS [chip.native.TS] Retaining current Last Known Good Time
2024-08-11 23:09:29.098 (Dummy-2) CHIP_PROGRESS [chip.native.FP] Metadata for Fabric 0x1 persisted to storage.
2024-08-11 23:09:29.098 (Dummy-2) CHIP_PROGRESS [chip.native.TS] Committing Last Known Good Time to storage: 2023-10-14T01:16:48
2024-08-11 23:09:29.098 (Dummy-2) CHIP_PROGRESS [chip.native.CTL] Joined the fabric at index 1. Fabric ID is 0x0000000000000002 (Compressed Fabric ID: 8FE8BB58D61145F7)
2024-08-11 23:09:29.098 (Dummy-2) CHIP_PROGRESS [chip.native.CTL] *** Missing DeviceAttestationVerifier configuration at DeviceCommissioner init: using global default, consider passing one in CommissionerInitParams.
2024-08-11 23:09:29.098 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Updating services using commissioning mode 0
2024-08-11 23:09:29.099 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] CHIP minimal mDNS started advertising.
2024-08-11 23:09:29.099 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] Advertise operational node 8FE8BB58D61145F7-000000000001B669
2024-08-11 23:09:29.099 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] CHIP minimal mDNS configured as 'Operational device'; instance name: 8FE8BB58D61145F7-000000000001B669.
2024-08-11 23:09:29.101 (Dummy-2) CHIP_PROGRESS [chip.native.DIS] mDNS service published: _matter._tcp
2024-08-11 23:09:29.101 (Dummy-2) CHIP_PROGRESS [chip.native.SPT] Setting up group data for Fabric Index 1 with Compressed Fabric ID:
2024-08-11 23:09:58.601 (Dummy-2) CHIP_PROGRESS [chip.native.CTL] Setting attestation nonce to random value
2024-08-11 23:09:58.602 (Dummy-2) CHIP_PROGRESS [chip.native.CTL] Setting CSR nonce to random value
2024-08-11 23:09:58.603 (Dummy-2) CHIP_PROGRESS [chip.native.CTL] Starting commissioning discovery over DNS-SD
2024-08-11 23:10:28.604 (Dummy-2) CHIP_ERROR [chip.native.CTL] Discovery timed out
2024-08-11 23:10:28.604 (Dummy-2) CHIP_ERROR [chip.native.ZCL] Secure Pairing Failed
2024-08-11 23:10:28.604 (Dummy-2) WARNING [chip.ChipDeviceCtrl] Failed to establish secure session to device: src/controller/python/ChipDeviceController-ScriptDevicePairingDelegate.cpp:89: CHIP Error 0x00000003: Incorrect state
2024-08-11 23:10:28.605 (MainThread) ERROR [matter_server.server.client_handler] [139740526164176] Error while handling: commission_with_code: Commission with code failed for node 16.
2024-08-11 23:13:41.799 (Dummy-2) CHIP_PROGRESS [chip.native.CTL] Setting attestation nonce to random value
2024-08-11 23:13:41.800 (Dummy-2) CHIP_PROGRESS [chip.native.CTL] Setting CSR nonce to random value
2024-08-11 23:13:41.801 (Dummy-2) CHIP_PROGRESS [chip.native.CTL] Starting commissioning discovery over DNS-SD
2024-08-11 23:14:11.806 (Dummy-2) CHIP_ERROR [chip.native.CTL] Discovery timed out
2024-08-11 23:14:11.806 (Dummy-2) CHIP_ERROR [chip.native.ZCL] Secure Pairing Failed
2024-08-11 23:14:11.807 (Dummy-2) WARNING [chip.ChipDeviceCtrl] Failed to establish secure session to device: src/controller/python/ChipDeviceController-ScriptDevicePairingDelegate.cpp:89: CHIP Error 0x00000003: Incorrect state
2024-08-11 23:14:11.808 (MainThread) ERROR [matter_server.server.client_handler] [139740526164176] Error while handling: commission_with_code: Commission with code failed for node 17.
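
Both failed commissioning attempts in the Matter Server log follow the same pattern: "Starting commissioning discovery over DNS-SD" followed by "Discovery timed out". A minimal Python sketch to measure how long each attempt waited is shown here. The function name `discovery_durations` and the `SAMPLE` excerpt are my own assumptions for illustration; only the two log messages and the timestamp format come from the log above.

```python
from datetime import datetime

# The two messages are copied from the Matter Server log above.
START = "Starting commissioning discovery over DNS-SD"
TIMEOUT = "Discovery timed out"
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"  # e.g. 2024-08-11 23:09:58.603

# Shortened excerpt of the log above, used only to demonstrate the function.
SAMPLE = """\
2024-08-11 23:09:58.603 (Dummy-2) CHIP_PROGRESS [chip.native.CTL] Starting commissioning discovery over DNS-SD
2024-08-11 23:10:28.604 (Dummy-2) CHIP_ERROR [chip.native.CTL] Discovery timed out
2024-08-11 23:13:41.801 (Dummy-2) CHIP_PROGRESS [chip.native.CTL] Starting commissioning discovery over DNS-SD
2024-08-11 23:14:11.806 (Dummy-2) CHIP_ERROR [chip.native.CTL] Discovery timed out
"""


def discovery_durations(log_text):
    """Return the seconds between each discovery start and the timeout that follows it."""
    durations = []
    started = None
    for line in log_text.splitlines():
        line = line.strip()
        try:
            # The timestamp occupies the first 23 characters of a line.
            stamp = datetime.strptime(line[:23], TS_FORMAT)
        except ValueError:
            continue  # lines in another format, e.g. "[1723410569.028440]..."
        if START in line:
            started = stamp
        elif TIMEOUT in line and started is not None:
            durations.append((stamp - started).total_seconds())
            started = None
    return durations


if __name__ == "__main__":
    for seconds in discovery_durations(SAMPLE):
        print(f"discovery gave up after {seconds:.3f} s")
```

In the excerpt both attempts give up after about 30 seconds, so the device apparently was never found via DNS-SD during that window, and the "Secure Pairing Failed" error that follows looks like a consequence of the timeout rather than a separate problem.
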

Any advice on what is going wrong here?

Edit: Is nobody else having Matter Server issues?!

Even after rebuilding everything from scratch, devices go offline after a few hours, or after one day at the latest. I always have to restart the Matter Server to bring all devices back online.