Dealing with message loss

MatthiasU · October 31, 2023, 4:53pm

Lots of integrations (KNX, EnOcean, Zigbee, etc.) use unacknowledged messages to control things. So when a message is lost (burst of radio noise, flurry of bus activity, …), the light stays off and the user gets annoyed.

IMHO Home Assistant should supply a generic solution for this problem. E.g. an “if the state doesn’t follow the command, retry X times within Y seconds, and if that didn’t work, send an error notification to topic Z” option?
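
A rough sketch of what such an option could do, in Python with asyncio. Everything here is hypothetical: `command_with_retry`, the `send`, `state_matches` and `notify` callbacks, and the choice to split the Y-second window evenly across attempts are assumptions for illustration, not an existing Home Assistant API.

```python
import asyncio
from typing import Awaitable, Callable


async def command_with_retry(
    send: Callable[[], Awaitable[None]],        # hypothetical: transmits the command
    state_matches: Callable[[], bool],          # hypothetical: True once state follows
    notify: Callable[[str], Awaitable[None]],   # hypothetical: error notification hook
    retries: int = 3,                           # "X" in the post
    window: float = 5.0,                        # "Y" seconds in the post
) -> bool:
    """Send a command, re-sending until the state follows or retries run out."""
    attempts = retries + 1          # the first try plus X retries
    pause = window / attempts       # assumption: spread attempts evenly over Y
    for _ in range(attempts):
        await send()
        await asyncio.sleep(pause)  # give the device time to report its state
        if state_matches():
            return True
    await notify(f"state did not follow command after {retries} retries")
    return False
```

The return value lets the caller decide what else to do on failure; the notification is sent only once, after the last attempt.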

tom_l · October 31, 2023, 5:48pm

There are conditional loop actions you can use to keep trying the service until the required state is obtained. See: https://www.home-assistant.io/docs/scripts/#repeat-until

MatthiasU · October 31, 2023, 6:30pm

True, but setting up an action script for each and every KNX actuator out there (repeat until desired state, but notify-and-stop after N retries) gets tedious really quickly, and it’s not exactly user friendly.

farmio · October 31, 2023, 6:53pm

I would think such issues should be handled by the integration internally, since the parameters for the needed retry delay can be very different (e.g. rate-limited cloud service vs. local push API). An integration could then just raise some kind of error out of async_turn_on() etc. when it hits its specific limits.
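
A minimal sketch of that idea in Python, assuming a transport whose `send_on()` returns `True` once the command is confirmed. `RetryPolicy`, `CommandFailed`, `RetryingLight`, the transport interface and the numeric values are all hypothetical; a real integration would use its framework's own entity base class and error type.

```python
import asyncio
from dataclasses import dataclass


class CommandFailed(Exception):
    """Hypothetical error raised when the retry limit is hit."""


@dataclass(frozen=True)
class RetryPolicy:
    attempts: int   # total tries, including the first one
    delay: float    # seconds to wait between tries


# Hypothetical values, only to show how far apart the policies can be.
LOCAL_PUSH = RetryPolicy(attempts=4, delay=0.2)
RATE_LIMITED_CLOUD = RetryPolicy(attempts=2, delay=30.0)


class RetryingLight:
    def __init__(self, transport, policy: RetryPolicy):
        self._transport = transport   # assumed to offer: async send_on() -> bool
        self._policy = policy

    async def async_turn_on(self) -> None:
        for attempt in range(self._policy.attempts):
            if await self._transport.send_on():
                return                # confirmed, nothing more to do
            if attempt + 1 < self._policy.attempts:
                await asyncio.sleep(self._policy.delay)
        raise CommandFailed(
            f"no confirmation after {self._policy.attempts} attempts"
        )
```

Because the policy lives with the integration, the caller only sees success or one error, whatever the transport's limits are.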

For KNX specifically, I would be interested to know where your message loss comes from. Are you using IP Routing or RF? As you probably know, IP Tunnelling and TP do have their internal retry mechanisms (I don’t know about RF though).

MatthiasU · October 31, 2023, 7:57pm

Sure, I wasn’t advocating for some kind of global standard retry parameter. But per-integration (plus possibly per-device-type) would help a lot.

As for KNX, you can also run it on multicast addresses, which is potentially lossy. Also, many KNX/IP interfaces acknowledge IP-tunneled packets before actually sending them on to the bus (and receiving the Ack message for them). Some even ack the message and then throw it away because their internal buffer isn’t large enough (either it’s full or it gets overwritten by later messages). The knxd gateway has a message delay module to mitigate this, but it’s not perfect.
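
To illustrate the general idea of pacing (this is not knxd's implementation, just a Python sketch under assumptions): a sender that leaves a minimum gap between messages so a small interface buffer is less likely to overflow. `PacedSender`, `submit` and `min_gap` are hypothetical names.

```python
import asyncio
import time


class PacedSender:
    """Forwards messages one at a time with a minimum gap between them."""

    def __init__(self, send, min_gap: float):
        self._send = send            # coroutine function that actually transmits
        self._min_gap = min_gap      # seconds to leave between two messages
        self._lock = asyncio.Lock()  # only one message in flight at a time
        self._last = None            # monotonic time of the previous send

    async def submit(self, message) -> None:
        async with self._lock:
            if self._last is not None:
                wait = self._min_gap - (time.monotonic() - self._last)
                if wait > 0:
                    await asyncio.sleep(wait)
            await self._send(message)
            self._last = time.monotonic()
```

Pacing only lowers the chance of an overflow; it cannot tell whether a message that was acked actually reached the bus, which is why it is a mitigation and not a fix.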

However, I’m not just talking about KNX; there are plenty of other channels that can and do lose messages for a variety of reasons.

parautenbach · October 31, 2023, 8:25pm

Also: