HomeKit Accessory Protocol (HAP) over CoAP/UDP (was: Nanoleaf Essentials bulb via Thread/CoAP)

This can happen… When I recently installed 16.3.2 on my ATVs and HomePod Minis, I lost connectivity to all my Thread devices (I only have 3 Wemo outlets at the moment). I read here that if the BR changes IP, HA doesn’t always pick that up (or, like you might suspect, the routing table doesn’t get updated). The solution was to reboot the box (in my case a VM). A simple restart of HA doesn’t work; it has to be the whole machine.

While inconvenient, I can live with the occasional reboot being required in that scenario. Am I correct to assume, however, that in your environment your HA instance is able to discover your Thread network without the creation of manual routes? This is the sticking point that would be significantly more annoying for me: to have to keep monkeying with removing and adding routes after various updates and so forth.

Thanks for all your help so far.

Edit to add: I have not yet tried the suggestions above to add sysctl lines (eg: net.ipv6.conf.br0.accept_ra=2). I realize I will have to modify that to pertain to my docker container. If that solves the RA/discovery issue, then I would certainly consider my specific issues “solved.”

@superpaul oh, excellent, we’re on the right track. Yeah, the routes/prefixes can and do change. Every time my Nanoleaf Elements hex panels reboot (power loss, fw update, etc.) I get a new prefix. This should absolutely be auto-configured by your OS. Other people on this thread have had luck setting that sysctl although one or two had further issues. Please do give it a try.

Works great! With the 1.0.0 version of OTBR and 2023.3.1 pressing the “Provision Preferred Thread Credentials” button just works!! So cool!

I know I should probably branch and do a PR to add these, but if you get a chance here are a few HOMEKIT_ACCESSORY_DISPATCH & CHARACTERISTIC_PLATFORMS I still have to add manually.

# const.py
HOMEKIT_ACCESSORY_DISPATCH = {
    ..................................
    ServicesTypes.AIR_PURIFIER: "fan",
}

CHARACTERISTIC_PLATFORMS = {
    ...................................
    CharacteristicsTypes.AIR_PURIFIER_STATE_TARGET: "switch",
}

# fan.py
ENTITY_TYPES = {
    ServicesTypes.FAN: HomeKitFanV1,
    ServicesTypes.FAN_V2: HomeKitFanV2,
    ServicesTypes.AIR_PURIFIER: HomeKitFanV2,
}

# switch.py
SWITCH_ENTITIES: dict[str, DeclarativeSwitchEntityDescription] = {
    ...................................
    CharacteristicsTypes.AIR_PURIFIER_STATE_TARGET: DeclarativeSwitchEntityDescription(
        key=CharacteristicsTypes.AIR_PURIFIER_STATE_TARGET,
        name="Automatic Fan Speed",
        icon="mdi:fan-auto",
        entity_category=EntityCategory.CONFIG,
    ),
}

If those get added in I’ll be back on stock homekit_controller!!
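
For anyone wondering how those tables fit together, here is a rough, self-contained sketch of the lookup idea: a service type picks a platform, and individual characteristics can pull in extra platforms. The string keys and the `platforms_for` helper are stand-ins made up for illustration; the real integration uses the `ServicesTypes` / `CharacteristicsTypes` constants, and its actual dispatch code may well differ.

```python
# Stand-in tables: plain strings instead of the real ServicesTypes /
# CharacteristicsTypes constants (assumption, for illustration only).
HOMEKIT_ACCESSORY_DISPATCH = {
    "fan": "fan",
    "fanv2": "fan",
    "air-purifier": "fan",
}

CHARACTERISTIC_PLATFORMS = {
    "air-purifier.state.target": "switch",
}


def platforms_for(service_type: str, characteristic_types: list[str]) -> set[str]:
    """Return the platform names that would be set up for one service."""
    platforms: set[str] = set()
    # The service as a whole maps to at most one platform.
    if service_type in HOMEKIT_ACCESSORY_DISPATCH:
        platforms.add(HOMEKIT_ACCESSORY_DISPATCH[service_type])
    # Individual characteristics may add further entities on other platforms.
    for char_type in characteristic_types:
        if char_type in CHARACTERISTIC_PLATFORMS:
            platforms.add(CHARACTERISTIC_PLATFORMS[char_type])
    return platforms


if __name__ == "__main__":
    print(sorted(platforms_for("air-purifier", ["air-purifier.state.target"])))
```

Without the extra entries, an air purifier service would simply fall through both lookups and get no entities, which is why they currently have to be added by hand.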

Well, unfortunately for me, neither “net.ipv6.conf.eth0.accept_ra=2” nor “net.ipv6.conf.all.accept_ra_rt_info_max_plen=64” (separately or combined) allows my HA docker container to organically discover the Thread route via the Apple TV. So far I’m stuck with manually adding the route.

Thanks for everyone’s help so far. I hope there are more ideas on the horizon.


Sorry I didn’t see this this morning. The forums are pretty rubbish at notifying me.

So you said that you manually added the route in the container and it worked (not to the host). So I just wanted to check what your network topology was. Like, are you using host mode networking, macvlan, something else?

Where and how did you try setting net.ipv6.conf.INTERFACENAME.accept_ra=2?

Did you try setting net.ipv6.conf.INTERFACENAME.accept_ra_rt_info_max_plen to 64?

You’ll need to do that on the interface that should be used to talk to the Thread devices. So in the container. Probably not the bridge, but it can be, depending on your setup.


Hi,

I have the exact same problem as @superpaul. I also tried the same stuff and can see that my NAS (Synology) has not picked up the right IPv6 route. I tried to set both of the settings you mentioned, but accept_ra_rt_info_max_plen cannot be set on my NAS; it fails with the following error:

sysctl: cannot stat /proc/sys/net/ipv6/conf/all/accept_ra_rt_info_max_plen: No such file or directory

I also tried to do this on other interfaces, but this gives the exact same error.
Does anyone have an idea what I can do now to get this working on my NAS?

Your NAS has bad kernel build options. You need CONFIG_IPV6_ROUTER_PREF and CONFIG_IPV6_ROUTE_INFO to be able to process the information from your border router, which your NAS doesn’t have (accept_ra_rt_info_max_plen becomes available when the kernel is compiled correctly).
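
If you want to check this on your own box before fiddling with docker-compose, here is a minimal Python sketch that reads the per-interface sysctl straight from /proc. The helper name `read_ipv6_sysctl` and the `proc_root` parameter are my own inventions for illustration; the only assumption about the system is the standard /proc/sys/net/ipv6/conf/INTERFACENAME/ layout that the error message above refers to.

```python
from pathlib import Path
from typing import Optional

DEFAULT_PROC_ROOT = Path("/proc/sys/net/ipv6/conf")


def read_ipv6_sysctl(
    interface: str, name: str, proc_root: Path = DEFAULT_PROC_ROOT
) -> Optional[str]:
    """Return the sysctl value as text, or None if the kernel doesn't expose it."""
    path = proc_root / interface / name
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        # Same situation as "cannot stat ...: No such file or directory".
        return None


if __name__ == "__main__":
    # "eth0" is only an example; use the interface that faces your border routers.
    for key in ("accept_ra", "accept_ra_rt_info_max_plen"):
        value = read_ipv6_sysctl("eth0", key)
        print(f"{key}: {'not available in this kernel' if value is None else value}")
```

If accept_ra_rt_info_max_plen comes back as unavailable, no amount of sysctl configuration will help on that kernel.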

Okay so then we can not use this on a Synology NAS, am I correct? Because I think we can not change the kernel :smiley:

No worries, I certainly appreciate the help.

You are correct in that when I add the route manually, everything works fine. The route is never discovered (at least with the settings I’ve attempted so far) from inside of the container.

My docker container is set up using a macvlan network.

I am setting net.ipv6.conf.eth0.accept_ra=2 and net.ipv6.conf.all.accept_ra_rt_info_max_plen=64 from the docker-compose.yml. From inside the container, I’ve looked at the sysctl output and verified they are reflected successfully there. I did max_plen on “all” versus on the interface itself. I didn’t think to try the interface, actually.

Yeah I don’t think all works for that one.

And it can take 10-20 minutes for it to work. A HomePod should send a router advertisement every 3 minutes but the lifetime is 30 minutes.

So try setting it on the interface. And give it some time to kick in.

If it still doesn’t work, install tcpdump in the container and do “tcpdump -evvv -ni eth0 icmp6” and wait 10 minutes. And post the output.
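
To make sense of what shows up in that capture: the Thread prefix travels inside the RA as a Route Information Option (RFC 4191, option type 24). Below is a small Python sketch that decodes one such option from raw bytes and applies the prefix-length check that, as far as I understand the Linux behaviour, accept_ra_rt_info_max_plen controls. The names `RouteInfo`, `parse_route_info_option` and `kernel_would_accept` are made up for this example, and the check is a simplification rather than the kernel's full logic.

```python
import ipaddress
import struct
from dataclasses import dataclass

ROUTE_INFO_OPTION_TYPE = 24  # Route Information Option, RFC 4191 section 2.3
_PREFERENCES = {0b00: "medium", 0b01: "high", 0b10: "reserved", 0b11: "low"}


@dataclass
class RouteInfo:
    prefix: ipaddress.IPv6Network
    lifetime_s: int
    preference: str


def parse_route_info_option(option: bytes) -> RouteInfo:
    """Decode one Route Information Option (type, length, plen, flags, lifetime, prefix)."""
    if len(option) < 8:
        raise ValueError("option too short")
    opt_type, length_units, prefix_len, flags, lifetime = struct.unpack(
        "!BBBBI", option[:8]
    )
    if opt_type != ROUTE_INFO_OPTION_TYPE:
        raise ValueError(f"not a Route Information Option (type {opt_type})")
    if prefix_len > 128:
        raise ValueError("invalid prefix length")
    # Length is in units of 8 octets; the prefix may be truncated to 0, 8 or 16 bytes.
    prefix_bytes = option[8 : length_units * 8].ljust(16, b"\x00")[:16]
    address = ipaddress.IPv6Address(prefix_bytes)
    network = ipaddress.IPv6Network(f"{address}/{prefix_len}", strict=False)
    return RouteInfo(network, lifetime, _PREFERENCES[(flags >> 3) & 0b11])


def kernel_would_accept(route: RouteInfo, max_plen: int, min_plen: int = 0) -> bool:
    """Simplified check: prefix length must fall within [min_plen, max_plen]."""
    return min_plen <= route.prefix.prefixlen <= max_plen


if __name__ == "__main__":
    # Hand-built example option: a /64 with a 1800 second lifetime (made-up prefix).
    raw = struct.pack("!BBBBI", 24, 2, 64, 0, 1800) + bytes.fromhex("fd00123456789abc")
    info = parse_route_info_option(raw)
    print(info, kernel_would_accept(info, max_plen=64))
```

With max_plen left at 0, a /64 route like the one in the example fails the check, which matches the symptom of RAs arriving but no route appearing.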


Yes and no.

What you are doing now will not work.

If you can run a VM on your NAS, that could work. You would have to get the networking right: it would have to be bridged, or whatever your device calls that.

It might be that there is software that can process the RAs from the border routers. For example, HAOS actually uses NetworkManager instead of the kernel (though that is buggy, so even if Synology could run it you really don’t want to).

Alternatively you could run a SkyConnect connected to the NAS. You aren’t running HAOS so you would have to do quite a bit of manual work, but OTBR running locally doesn’t depend on RAs so it would get past that problem.

You’re correct, again. I changed my docker-compose line to specify the interface, and the RA appears to have worked. The route I was manually creating was created automatically. So just to be clear for anybody else who gets to this point, I am:

  1. Running HA in a docker container
  2. Using a macvlan docker network (in my case the interface is eth0 inside of the container).
  3. Using the following addition to my HA container config in docker-compose.yml:

sysctls:
- net.ipv6.conf.eth0.accept_ra=2
- net.ipv6.conf.eth0.accept_ra_rt_info_max_plen=64

Of course, be mindful of spacing for YAML.

@Jc2k thank you again for your diligence and suggestions. I do eventually plan to set up my SkyConnect once I get it flashed for multiprotocol (there’s another thread going on where it seems like somebody was able to pull it off). However, I’ve also read grumblings about issues with multiple (Apple?) border routers causing problems, so I’m glad I can call this “working” until more information comes out there.


Yes, if you don’t use SkyConnect you will likely face some challenges.

Right now the best-understood issues only hurt HAOS users. The issue is what I call ghost routes. Every time your HomePod gets a new route, HAOS has no ability to expire the old route (because of bugs in HAOS).

HAOS 10 won’t suffer from this bug, but has a new bug where you can only have one route at a time. This means if you have 3 HomePods you’ll see a route table flap once a minute.

If you don’t use HAOS it can still take 30 minutes for Linux to notice a route is no longer valid.
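
If you want to watch those lifetimes tick down, here is a rough Python sketch that picks the RA-learned routes out of `ip -6 route show` style output along with their remaining lifetime. The sample text and its prefix are made up, and `parse_ra_routes` is a hypothetical helper; the sketch assumes the usual iproute2 line format with `proto ra` and `expires NNNsec` tokens, so treat it as a starting point rather than a robust parser.

```python
from typing import NamedTuple, Optional


class RaRoute(NamedTuple):
    prefix: str
    gateway: Optional[str]
    device: Optional[str]
    expires_s: Optional[int]


def _value_after(tokens: list[str], key: str) -> Optional[str]:
    """Return the token following `key`, or None if `key` is absent or last."""
    if key in tokens:
        index = tokens.index(key)
        if index + 1 < len(tokens):
            return tokens[index + 1]
    return None


def parse_ra_routes(text: str) -> list[RaRoute]:
    """Keep only routes whose protocol is 'ra' (learned from router advertisements)."""
    routes = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or _value_after(tokens, "proto") != "ra":
            continue
        expires = _value_after(tokens, "expires")
        expires_s = None
        if expires is not None and expires.endswith("sec"):
            expires_s = int(expires[: -len("sec")])
        routes.append(
            RaRoute(tokens[0], _value_after(tokens, "via"),
                    _value_after(tokens, "dev"), expires_s)
        )
    return routes


# Made-up sample in the shape `ip -6 route show` typically prints.
SAMPLE = """\
fd3a:1b2c:3d4e:1::/64 via fe80::1c2d:3eff:fe4f:5a6b dev eth0 proto ra metric 1024 expires 1712sec pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
"""

if __name__ == "__main__":
    for route in parse_ra_routes(SAMPLE):
        print(route)
```

A route whose border router has gone away stays in that list until its expiry counter runs out, which is where the wait of up to 30 minutes comes from.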

Then even if the router is there, there is no guarantee it works. We are seeing cases where the BR is online but can’t see all the devices in the mesh. Removing the route can help it choose a better BR. This is especially common with Apple TV.

(OTBR can do something called TREL where it uses your WiFi or Ethernet to compensate for blind spots in your Thread network. Apple doesn’t seem to support this yet.)

I assume this is fixable at some point?

@Jc2k Thanks again for all your work here. Have you had anyone come across issues with the Wemo Stage Controller? A few weeks ago, I was able to successfully add and then provision two different Stage Controllers to my Thread network (Home Assistant Yellow’s OTBR), and things were working great. I set up automations that responded to each button type. Then, things stopped working. It’s hard to document, but basically if I did a full reboot of the HA Yellow followed by another restart of HA Core, I could sometimes get the automations to run. Now, it seems like I can’t get them to run at all. Furthermore, when I go into the automations I can no longer select the Button as a Trigger. Clicking Identify does, however, light up the Stage Controller.

Happy to provide more details or logging if you haven’t seen this before.

I have one of those and sometimes it stops working and a restart of HA fixes it. I haven’t had time to investigate why. It controls the garden lights so it’s fairly high on my list should I get some spare time!

Honestly no idea. It’s not a small piece of work, and it involves a third party upstream. The best bet for the near term is to ditch NetworkManager which in itself is a substantial piece of work.

Ok cool, thanks. When I removed the Stage Controller and re-added it, it did once again show the Button options in the Automation. My remaining (unremoved stage controller) still doesn’t.

Doesn’t that mean that Thread on HAOS is basically going to be broken (or at least not work reliably) for almost any real-world scenario (i.e., anyone who wants to use their Apple devices, Google devices, Echoes, etc. as BRs)?