Homekit covers failing, lights work on the same device

I have had an issue for the past few weeks I’ve been trying to track down and figure out how to effectively troubleshoot. For the past few years I’ve run Ha on a multihomed setup with an iot side and a lan side. I have MDNS setup to broadcast outside the iot network but I don’t allow any other traffic that isn’t brokered through the HA server. This has been an effective setup for a few years until about a month ago when all my garage interfaces stopped working. They are all ratgdo devices with two being traditional esphome garage door setups and the third being a swing arm gate using the mqtt binary. All of the devices\ covers show unresponsive inside of the home.app. The lights however be controlled without issue. I figure something changed but tracking it down inside the homekit bridge is proving difficult and websearches haven’t been very fruitful in how to troubleshoot this side of things.

HA is not designed to be multihomed and you can really not control it.
If HA starts up and one of the network connections is a bit slow to be set up and reported ready (either by HA or by the device in the other end), then HA thinks it is unavailable and automatically move the services on the interface to another interface and everything fails then.

HA is not a router. There are a device called a router for your setup wihich is especially designed for creating routing rules for such setups.

If you were confused by me saying “HA bridges the devices”, I’m not using HA as a router. I’m keeping the networks separate with a router using appropriate vlans and mdns forwarding and blocking forwarding of mdns for HA interfaces and only allowing the iot side to talk to HA. I know the easy answer is just to blame the network and I’d be inclined to just go with it spending my time entirely there if it weren’t for the fact it’s been working for closing in on a decade now since it was a separate addon breaking only when there was significant configuration updates which were usually easily resolved and back up and running. Not to mention I can toggle the lights on the same device without issue and see the motion and obstruction sensors all in homekit with two different interfaces mqtt and esphome leading me to believe it is something with the homekit bridging itself and covers specifically. In this case they are on their own bridge as are most device classes in my setup to prevent overloading them, and I have removed and re-added this bridge in homekit. I had looked in the debug logs and I never saw anything about covers aside from some attempted initialization of shelly.covers which I don’t have any that would fit that category.

I was really hoping for some information about how I could effectively debug the homekit communications to see what was going on and why that particular call might be failing.

As luck would have it this morning new update and included in it a bump to aiohomekit, I installed and I flipped over to the home app homing to see the status of the garage doors but instead saw what I’ve always seen “device is unresponsive”, I still dutifully clicked it like I have every update for the past month and it still read unresponsive but I could hear the door opening which was a first this time and the switch didn’t automatically move to closed again. About 20 seconds later the status returned to the other two covers and they all seem to be working as expected. Running the debug log I now see it updating the cover status and I can see the commands to open/close where before I didn’t even see attempts.

I read through the change log there and really don’t see anything that jumps out at me for aiohomekit but I guess I will head over there to see how I might be able to debug homekit issues using their libraries since that at least on the surface seems to be where my issue here was.

I run HA dual homed and it works fine, but I also run a container install because it offers more flexibility with OS config than HAOS does. You didn’t mention which install type you are using? If you are using container, have you enabled a host firewall on your HA server?

I have a HomeKit bridge with over 100 devices for several years now, again without issue — separate bridges are only required for accessory mode to enable certain device-specific features, or if you have over 150 entities, so I’m not sure what you mean by overloading — do you have hundreds of accessories?

Also I use YAML to configure my bridges because the GUI config is too opaque and its options are more limited then YAML, but you didn’t mention which bridge config method you are using?

I’m also struggling to understand your setup — you mention garage controllers of which some are esphome but one is MQTT only? And it sounds like they all present both a light entity and a cover entity to HA? And they are all working fine in HA’s web interface but only the lights are working when bridged via HomeKit protocol to the Apple Home App, is that correct?

Also you mention VLANs and mDNS reflecting (typically not needed with dual home HA, by the way) but I’m not clear how the different devices are allocated — can I assume the garage devices are on one vlan and the Apple devices — hub(s) and phone(s) — are on the other vlan, and that HA (and your router of course) are the only dual home devices on both vlans?

Finally, you posted this to the Matter/Thread category but I am failing to figure out how Matter or Thread comes into play, so again maybe I’m missing something?

My recommendation is to turn off any host firewall while troubleshooting (also maybe mdns reflecting). Delete your existing multiple bridges and confirm everything is removed from Apple Home. Then setup a single bridge in YAML for anything that doesn’t need to be in accessory mode. If you are still seeing problems, post your YAML config here for further clarity.

I should have mentioned it is using an old HAOS esxi vm that I’ve just kept up to date over the years and is now on a prox mox cluster.

For the bridge config I am mostly over in the ui for everything now except cameras only because I needed to be able to set streams for secure video so it shows up on the apple tv, it used to all be in yaml but last year sometime I noticed that there had been a lot of work there so I slowly started moving things over. I did run into the 150 limit as I have over 130 shelly devices for lights. My solution was to just break items off into their own bridges.

The garage controllers/covers are the ratgdo kits from paul, one of them uses his mqtt firmware for a swing arm gate and the other two use his esphome firmware. The lights and sensors could be seen and controlled via homekit but I could not see the status or set the doors to open/close from homkit the HA ui worked fine for everything.

There are two interfaces on the HA server one configured lan the other configured iot the iot side is a tagged vlan and I control pretty tightly what can talk outside of there and on what ports everything for HA lives inside that network with one exception. My apple tvs are all on the lan side, which I knew going into would be a risk but forwarding the mdns and allowing them to talk to the iot side seems to work. I also block MDNS repeating from HA across the two networks to prevent it from seeing it’s own advertised mdns on one interface from the other interface.

I posted to the matter/thread because I didn’t see a homekit specific, and I know the underlying behavior is very similar so I tried it and it gave me an option to tag it with homekit so it seemed appropriate.

I will try what you recommended next time, for now after the 2025.04.04 release it’s back to working like a champ. We’ve left the house several times today and been able to use siri in the vehicle to open/close the gate and garage without issue.

1 Like