HomeKit Controller Decryption failed error

Hi,
I have two Eve Thermo Thread devices connected via the HomeKit Controller integration. The thread router is a HomePod Mini.

The problem is that the connection fails every few minutes (often 3-4 times an hour) and the devices become unavailable (both at the same time). After 5 minutes or so the connection works again on its own.
I tried resetting the devices and re-adding the integration, that did not help.

The logs always show the same 3 errors:
First one:

Source: components/homekit_controller/connection.py:752 
First occurred: 17:03:46 (8 occurrences) 
Last logged: 17:10:49

Decryption failed, desynchronized? Counter=19/22
Failed flailing attempts to resynchronize, self-destructing in 3, 2, 1...
Decryption failed, desynchronized? Counter=18/21
Decryption failed, desynchronized? Counter=14/17

Second one (same time as the first):

Logger: homeassistant.components.homekit_controller.connection
Source: components/homekit_controller/connection.py:752 
Integration: HomeKit Controller (documentation, issues) 
First occurred: 17:03:46 (4 occurrences) 
Last logged: 17:10:49

Unexpected exception from <bound method HKDevice.async_update of <homeassistant.components.homekit_controller.connection.HKDevice object at 0x7fa88597b0>>
Unexpected exception from <bound method HKDevice.async_update of <homeassistant.components.homekit_controller.connection.HKDevice object at 0x7fa8990a00>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 118, in _decrypt_response
    return self.decrypt(response.payload)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 95, in decrypt
    dec_data = self.recv_ctx.decrypt(
  File "/usr/local/lib/python3.10/site-packages/cryptography/hazmat/primitives/ciphers/aead.py", line 62, in decrypt
    return aead._decrypt(backend, self, nonce, data, [associated_data], 16)
  File "/usr/local/lib/python3.10/site-packages/cryptography/hazmat/backends/openssl/aead.py", line 207, in _decrypt
    raise InvalidTag
cryptography.exceptions.InvalidTag

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/debounce.py", line 109, in _handle_timer_finish
    await task
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/connection.py", line 716, in async_update
    new_values_dict = await self.get_characteristics(
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/connection.py", line 752, in get_characteristics
    return await self.pairing.get_characteristics(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/pairing.py", line 197, in get_characteristics
    return await self.connection.read_characteristics(characteristics)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 507, in read_characteristics
    pdu_results = await self.enc_ctx.post_all(OpCode.CHAR_READ, iids, data)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 195, in post_all
    res_pdu = await self.post_bytes(req_pdu)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 181, in post_bytes
    return await self._decrypt_response(response)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 158, in _decrypt_response
    await self.coap_ctx.shutdown()
AttributeError: 'NoneType' object has no attribute 'shutdown'

Third one (one minute later):

Logger: aiohomekit.controller.coap.pdu
Source: components/homekit_controller/connection.py:752 
First occurred: 17:04:49 (4 occurrences) 
Last logged: 17:11:51

Transaction 0 failed with error 6 (Invalid request

I’m using Home Assistant 2023.1.4, Supervisor 2022.12.1, Operating System 9.4, Frontend 20230110.0 on a Raspberry Pi 4.

I hope someone can help me or tell me if I should search for help at the issue tracker.

Thanks,
Paul

I am having the same issue FYI… have no idea what the device is tho.

Home Assistant 2023.1.7
Supervisor 2023.01.1
Operating System 9.5
Frontend 20230110.0 - latest

Dell Optiplex 9020 Micro PC

2023-01-31 14:19:34.440 ERROR (MainThread) [aiohomekit.controller.coap.connection] Decryption failed, desynchronized? Counter=104/107
2023-01-31 14:19:34.442 ERROR (MainThread) [aiohomekit.controller.coap.connection] Failed flailing attempts to resynchronize, self-destructing in 3, 2, 1...
2023-01-31 14:19:34.443 ERROR (MainThread) [homeassistant.components.homekit_controller.connection] Unexpected exception from <bound method HKDevice.async_update of <homeassistant.components.homekit_controller.connection.HKDevice object at 0x7f4bcfc94070>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 118, in _decrypt_response
    return self.decrypt(response.payload)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 95, in decrypt
    dec_data = self.recv_ctx.decrypt(
  File "/usr/local/lib/python3.10/site-packages/cryptography/hazmat/primitives/ciphers/aead.py", line 62, in decrypt
    return aead._decrypt(backend, self, nonce, data, [associated_data], 16)
  File "/usr/local/lib/python3.10/site-packages/cryptography/hazmat/backends/openssl/aead.py", line 207, in _decrypt
    raise InvalidTag
cryptography.exceptions.InvalidTag

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/debounce.py", line 109, in _handle_timer_finish
    await task
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/connection.py", line 716, in async_update
    new_values_dict = await self.get_characteristics(
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/connection.py", line 752, in get_characteristics
    return await self.pairing.get_characteristics(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/pairing.py", line 197, in get_characteristics
    return await self.connection.read_characteristics(characteristics)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 507, in read_characteristics
    pdu_results = await self.enc_ctx.post_all(OpCode.CHAR_READ, iids, data)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 195, in post_all
    res_pdu = await self.post_bytes(req_pdu)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 181, in post_bytes
    return await self._decrypt_response(response)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 158, in _decrypt_response
    await self.coap_ctx.shutdown()
AttributeError: 'NoneType' object has no attribute 'shutdown'
2023-01-31 14:21:32.455 ERROR (MainThread) [aiohomekit.controller.coap.connection] Decryption failed, desynchronized? Counter=0/2

Hello @paull27 and @Greminn, the synchronization errors are caused by dropped network traffic or a broken (lost) session on the accessory side. We try to recover by attempting to decrypt the received data with all nearby counter values, but only sometimes does that work. The InvalidTag exception is the underlying exception that we catch to determine when we need to try to recover the counter.

@Jc2k just fixed the NoneType error: Don't error if self.coap_ctx not available during decrypt · Jc2k/aiohomekit@eeb6cbb · GitHub which should be in HA soon if it isn’t already.

Recent versions of HA & aiohomekit implement a fix to handle when an accessory drops the session & not lead to the desync errors you’ve seen. There are some scenarios that we can prevent and some we can’t.

If your devices are battery powered, I will (eventually) put together some code to adapt the CoAP library to better handle the sleep interval each devices advertises.

If your devices are mains powered, leave a ping running for a few minutes & see if there is packet loss. If there is packet loss, try temporarily moving devices closer to the Border Router to see if that fixes the issue.

Oh, and the “Transaction 0 failed with error 6” is harmless. The code attempts to read a value that can’t be read. We need to block reading of that characteristic…

1 Like

The fix should have gone out in 2023.2.2! Let me know if it still happens.

I was helping someone on Discord with this and the fix didn’t address the root cause, but we ultimately found out that during the “unavailable” window their device isn’t pingable. This means its not HA’s fault, but we are still trying to work out if its his HomePod thats falling off this mesh or his light.

2 Likes

Hi @lambdafunction and @Jc2k

The AttributeError: 'NoneType' object has no attribute 'shutdown' error is gone since 2023.2.2 fixed it, thank you!
Unfortunately the devices (two Eve thermo, battery powered) still become unavailable quite often.
My logs still show the Decryption failed, desynchronized? error and sometimes (less often) I get this error:

Logger: aiohomekit.controller.coap.connection
Source: /usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py:396 
First occurred: February 6, 2023 at 12:01:05 (1 occurrences) 
Last logged: February 6, 2023 at 12:01:05

Pair verify failed
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 391, in connect
    await self.do_pair_verify(pairing_data)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/coap/connection.py", line 355, in do_pair_verify
    request, expected = state_machine.send(payload)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/protocol/__init__.py", line 590, in get_session_keys
    handle_state_step(response_tlv, TLV.M4)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/protocol/__init__.py", line 88, in handle_state_step
    error_handler(tlv_dict[TLV.kTLVType_Error], f"step {expected_state}")
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/protocol/__init__.py", line 62, in error_handler
    raise AuthenticationError(stage)
aiohomekit.exceptions.AuthenticationError: step bytearray(b'\x04')

Thanks for helping
Paul

@Jc2k I can also confirm Controller Decryption error has gone, but this is what I now get with the Eve Energy:

2023-02-09 22:22:11.250 DEBUG (MainThread) [bleak_retry_connector] Eve Energy 96F7 (FA:E8:41:89:5B:CA) - FA:E8:41:89:5B:CA: Connection attempt: 1
2023-02-09 22:22:11.250 DEBUG (MainThread) [homeassistant.components.bluetooth.wrappers] FA:E8:41:89:5B:CA - Eve Energy 96F7 -> /org/bluez/hci0: Connecting (last rssi: -127)
2023-02-09 22:22:11.694 DEBUG (MainThread) [homeassistant.components.bluetooth.wrappers] FA:E8:41:89:5B:CA - Eve Energy 96F7 -> /org/bluez/hci0: Connected (last rssi: -127)
2023-02-09 22:22:11.694 DEBUG (MainThread) [bleak_retry_connector] Bleak error calling <function BleDiscovery.async_start_pairing at 0x7f124d83cf70>, backing off: 0.1, retrying...
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/bleak_retry_connector/__init__.py", line 447, in _async_wrap_bluetooth_connection_error_retry
    return await func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/ble/discovery.py", line 151, in async_start_pairing
    salt, pub_key = await self._async_start_pairing(alias)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/ble/discovery.py", line 123, in _async_start_pairing
    ff_char = await self.client.get_characteristic(
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/ble/bleak.py", line 128, in get_characteristic
    raise BleakServiceMissing(
aiohomekit.controller.ble.bleak.BleakServiceMissing: Eve Energy 96F7: Service 00000055-0000-1000-8000-0026BB765291 not found, available services: []
2023-02-09 22:22:11.795 ERROR (MainThread) [homeassistant.components.homekit_controller.config_flow] Pairing attempt failed with an unhandled exception
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/homekit_controller/config_flow.py", line 491, in async_step_pair
    self.finish_pairing = await discovery.async_start_pairing(self.hkid)
  File "/usr/local/lib/python3.10/site-packages/bleak_retry_connector/__init__.py", line 447, in _async_wrap_bluetooth_connection_error_retry
    return await func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/ble/discovery.py", line 151, in async_start_pairing
    salt, pub_key = await self._async_start_pairing(alias)
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/ble/discovery.py", line 123, in _async_start_pairing
    ff_char = await self.client.get_characteristic(
  File "/usr/local/lib/python3.10/site-packages/aiohomekit/controller/ble/bleak.py", line 128, in get_characteristic
    raise BleakServiceMissing(
aiohomekit.controller.ble.bleak.BleakServiceMissing: Eve Energy 96F7: Service 00000055-0000-1000-8000-0026BB765291 not found, available services: []
2023-02-09 22:22:22.842 DEBUG (MainThread) [homeassistant.components.bluetooth.base_scanner] hci0 (5C:F3:70:A7:39:AB): Scanner watchdog time_since_last_detection: 0.011999925000054645

That’s a Bluetooth error you posted, is that expected?

Eve devices are a bit of a pain with the Linux Bluetooth stack. Like the error says, Linux just can’t see a service the Eve has. If Linux can’t see it, not much we can do.

We think it’s some sort of timeout while Linux is probing the Eve device.

Retrying can help sometimes.

I believe a better Bluetooth chip might help (see the high performance list in the HA docs).

Faster Ha env can also help (I’ve never had a problem with eve devices and I have Intel Xeon’s in my HA box)

i just set up three Wemo thread outlets. I have 5 HomePod minis and 3 apple tvs in my house - so the mesh should be strong. all aoutlets are within about 4 feet of a homepod. The outlets all work, but my logs are filled with:

2023-02-23 20:54:37.023 WARNING (MainThread) [aiohomekit.controller.coap.connection] Pair verify timed out
2023-02-23 20:55:37.023 WARNING (MainThread) [aiohomekit.controller.coap.connection] Pair verify timed out
2023-02-23 20:58:29.068 ERROR (MainThread) [aiohomekit.controller.coap.connection] Decryption failed, desynchronized? Counter=0/3
2023-02-23 20:58:29.070 ERROR (MainThread) [aiohomekit.controller.coap.connection] Failed flailing attempts to resynchronize, self-destructing in 3, 2, 1...
2023-02-23 20:59:37.026 WARNING (MainThread) [aiohomekit.controller.coap.connection] Pair verify timed out
2023-02-23 21:00:29.665 WARNING (MainThread) [aiohomekit.controller.coap.pdu] Transaction 0 failed with error 6 (Invalid request
2023-02-23 21:00:29.752 WARNING (MainThread) [aiohomekit.controller.coap.pdu] Transaction 2 failed with error 6 (Invalid request
2023-02-23 21:05:29.064 ERROR (MainThread) [aiohomekit.controller.coap.connection] Decryption failed, desynchronized? Counter=12/16
2023-02-23 21:05:29.065 ERROR (MainThread) [aiohomekit.controller.coap.connection] Failed flailing attempts to resynchronize, self-destructing in 3, 2, 1...
2023-02-23 21:06:32.506 WARNING (MainThread) [aiohomekit.controller.coap.pdu] Transaction 0 failed with error 6 (Invalid request
2023-02-23 21:06:32.644 WARNING (MainThread) [aiohomekit.controller.coap.pdu] Transaction 2 failed with error 6 (Invalid request

1000’s of lines like that.

Re mesh strength - unfortunately that’s not how this works. Right now adding more BRs is likely to create more carnage. The simplest setups are easiest to get stable.

Right now with multiple border routers (especially if they don’t support TREL) there are a whole host of issues.

For example, we have one guy who has 2 HomePods and when I inspected his network it looked like he had 12. That’s a 2 in 12 chance of a packet getting to the right border router.

Another guy had a ATV in a cupboard surrounded by hdmi and usb and power wires. He would see his devices fall off the network multiple times an hour. Removing his ATV massively improved his network stability. His ATV had formed its own partition and when Ha tried to use it as a router, it didn’t work,

They are just the most common cases, there are other network level things that need investigating.

It looks like the apple ecosystem doesn’t have TREL (it’s not mandatory till thread 1.3.1, and that’s only a plan it’s not locked in). That means they wouldn’t use your network to compensate for mesh partitions. So if I talk to BR A and your bulb is connected to BR B, but BR A and BR B can’t see each other via the mesh, those packets will be lost. With TREL, it would redirect the packets over your WiFi to a known good BR.

With the third party border routers we don’t have visibility of how stable the actual mesh component is.

In this months release there will be a thread diagnostics panel. It will let us verify that all your BRs are operational and that you don’t have the ghost routes problem.

I’m planning some other visibility checks (eg a ping check but via every router, that will let us prove mesh partitions are a factor).

It also reduces the verbosity of some logs where your mesh is a hot mess but it is managing to recover by itself.

It’s also a lot easier to use homekit with Ha Yellow or SkyConnect in that release too. You don’t have to pair with iOS and unpair, you can add it straight to HA.

1 Like

Thanks. I have been running a ping6 all morning to one of the outlets just to see if it’s actually unreachable during the times these show up. but it’s been rock solid even while those errors are showing.

I’ll wait for next week to get the new thread panel. I also have a Sky Connect. I assume once the new thread panel is out I should set up Thread on the SC and then consolidate the networks?

Unfortunately ping6 is not enough, you need to test all the routers separately as we don’t know which route the kernel is selecting for the HA traffic. If you have a mesh split, some homepods will work for some accessories but not all accessories, and linux is just picking routes at random.

I’m certain there are bugs in homekit_controller, but so far I haven’t been able to find anyone to help me test fixes because its usually (when I dig in) ultimately mesh stability or weird ipv6 crap on their home network.

You can’t consolidate networks yet. It is being planned. But anyway, if you consolidate networks, you’ll just reintroduce the problems you want to avoid.

Interestingly enough I did reset one of my Eve Thermo Thread devices because of the connection drops and reconnected it to Home Assistant via HomeKit Controller Bluetooth (thread status = disabled).

Now my other Eve Thermo which is still connected over thread via HomePod mini doesn’t lose the connection to HA anymore. And I don’t get the Decryption failed, desynchronized error anymore.

Experiencing this since migrating my Eve Aqua from BLE to Thread. Both errors are COAP, so seemingly not related to BLE then I’d have thought. Never saw this on BLE devices.

Logger: aiohomekit.controller.coap.connection
Source: components/homekit_controller/connection.py:891
First occurred: 00:16:00 (4 occurrences)
Last logged: 00:38:06

Decryption failed, desynchronized? Counter=37/41
Failed flailing attempts to resynchronize, self-destructing in 3, 2, 1…
Decryption failed, desynchronized? Counter=0/19

Logger: aiohomekit.controller.coap.pdu
Source: components/homekit_controller/connection.py:891
First occurred: 00:41:06 (1 occurrences)
Last logged: 00:41:06

Transaction 0 failed with error 6 (Invalid request

Now happening dozens of times a day. Are we sure that this is not just a straight connectivity issue?

Logger: aiohomekit.controller.coap.pdu
Source: components/homekit_controller/connection.py:891
First occurred: 17:33:56 (11 occurrences)
Last logged: 20:53:54

Transaction 0 failed with error 6 (Invalid request

On the latest beta. Still seeing “Transaction 0 failed with error 6 (Invalid request” but “Decryption failed, desynchronized” is now gone since being fixed in Recover CoAP pairing when remote device stops responding by Jc2k · Pull Request #358 · Jc2k/aiohomekit · GitHub

My Eve Homekit battery powered devices (Eve Thermo, Eve Motion) constantly drop off the Thread network in the past few days, probably related to HA 2024.2. It does not seem as if their sleep interval is handled. @lambdafunction about a year ago you wrote “If your devices are battery powered, I will (eventually) put together some code to adapt the CoAP library to better handle the sleep interval each devices advertises.”, are you still planning to do so?

@Jc2k I’m still seeing the following in the logs:

Logger: aiohomekit.controller.coap.pdu
Source: components/homekit_controller/connection.py:283
First occurred: 17:50:38 (54 occurrences)
Last logged: 18:27:26

Transaction 0 failed with error 6 (Invalid request

and

Logger: aiohomekit.controller.coap.pdu
Source: components/homekit_controller/connection.py:893
First occurred: 17:52:51 (8 occurrences)
Last logged: 19:01:53

Transaction 0 failed with error 6 (Invalid request

In case it’s nothing to worry about I suggest changing the error message to something users can better understand. And please use the closing bracket too, I’m still irritated by “(Invalid request”. :wink: Thanks!

1 Like