Using the ESP Thread Border Router in HA as the main preferred network

This is not that hard or complicated, but I guess I was in a bit of an uncanny valley: relatively inexperienced, yet working with fairly specific hardware.

I wanted to use the ESP Thread Border Router as the “main” preferred Thread network, connected over Ethernet so that the Thread network survives HA or Wi-Fi failures. I also didn’t want any vendor-specific device like Apple’s or Google’s.

Here is how to do it:

  1. Get the ESP Thread Border Router, optionally with the Ethernet shield (I think it’s worth it).
  2. Go to Releases · espressif/esp-thread-br · GitHub and get the latest release. It’s a little involved, since you have to compile things, but you don’t really need to understand any of it: just follow the documentation linked from the release page. In short, you need to build ESP-IDF and use the Thread border router example from a second repo (it lives under examples, but it’s actually what you want to use). These docs also helped a bit. The USB-C port to use is the one labeled “USB2” (although both seem to work). Cloning and building everything takes quite some time. It is also a bit confusing that you first build the firmware for the ESP32-H2 but do not flash it yourself; it gets flashed later, when you flash the main chip (ESP32-S3). You don’t really need to set up the Thread details in menuconfig; that can be done later (if I understand correctly). The guide shows the last command as idf.py -p flash monitor, which doesn’t work as written: -p needs the serial port after it. On my Mac, the command was idf.py -p /dev/cu.usbmodem2101 flash monitor. You can drop monitor if you just want to flash the device, and attach later with idf.py -p /dev/cu.usbmodem2101 monitor. It prints logs and drops you into a console, which you most likely won’t need for anything. To quit the console press CTRL+] (a bit of a gotcha). Note that to change the Wi-Fi name and credentials you need idf.py -p /dev/cu.usbmodem2101 erase-flash flash monitor.
  3. Once you connect it to the network, you should be able to get it up and running. I simply followed the Ethernet variant of the guide above: I connected it to my router with an Ethernet cable and could then reach it via mDNS at http://esp-ot-br.local/index.html (note the index.html, it’s mandatory) or via its local IP, e.g. 192.168.0.33/index.html.
  4. The web GUI has a “Form” page for creating a new network at http://esp-ot-br.local/index.html#Form. You should change the default settings. I generated a new PAN ID and Extended PAN ID using echo "PANID: 0x$(printf '%04X' $((1 + RANDOM % 65534)))" && echo "Extended PANID: $(od -An -N8 -tx1 /dev/urandom | tr -d ' ' | tr '[:lower:]' '[:upper:]')", changed the channel from the default 15 to 25 so it sits next to my existing Zigbee network on 26 with as little Wi-Fi overlap as possible, used a different Passphrase/Commissioner Credential (just any string), and generated a random 32-character hex string (there are online generators) to use as the Network Key.
  5. That should be enough: once you click Form Network, the network should be created. Note that if you refresh the page, the values will be gone. The web GUI is abysmal; for me it doesn’t work properly and shows an “unknown” name etc. So don’t be surprised that it doesn’t appear to “save”.
  6. Then in HA, enable the Open Thread Border Router integration and enter the address of your newly set-up device in its configuration, e.g. http://192.168.0.33. Then add the Thread integration. Your border router should show up there under “Other Networks”. Click the three dots and choose “Make a preferred network”. That should be it: this is now the main network. AFAIK you don’t need an Android or Apple device or a companion app; this should be enough. Do not try to add the border router to an HA-created Thread network, as that would wipe the settings on the border router.
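
As a sketch of step 4, all the values for the Form page can be drawn from /dev/urandom in one go. This is only an illustration: the helper names (rand_hex, thread_channel_mhz, wifi_channel_mhz) are made up here and are not part of ESP-IDF or the border router firmware. The channel formulas are the standard 2.4 GHz centre-frequency formulas for IEEE 802.15.4 (channels 11 to 26) and Wi-Fi (channels 1 to 13).

```shell
#!/bin/sh
# Hypothetical helpers, not part of ESP-IDF or the border router firmware.

# Print N random bytes from /dev/urandom as 2*N uppercase hex digits.
rand_hex() {
    od -An -N"$1" -tx1 /dev/urandom | tr -d ' \n' | tr '[:lower:]' '[:upper:]'
}

# Centre frequency in MHz of an IEEE 802.15.4 (Thread/Zigbee) channel 11..26.
thread_channel_mhz() {
    echo $((2405 + 5 * ($1 - 11)))
}

# Centre frequency in MHz of a 2.4 GHz Wi-Fi channel 1..13.
wifi_channel_mhz() {
    echo $((2407 + 5 * $1))
}

# PAN ID: 2 random bytes. Like the one-liner in step 4, skip 0x0000 and
# 0xFFFF (0xFFFF is the broadcast PAN ID).
pan_id=FFFF
while [ "$pan_id" = "FFFF" ] || [ "$pan_id" = "0000" ]; do
    pan_id=$(rand_hex 2)
done

echo "PAN ID:          0x$pan_id"
echo "Extended PAN ID: $(rand_hex 8)"
echo "Network Key:     $(rand_hex 16)"
echo "Thread ch 25: $(thread_channel_mhz 25) MHz"
echo "Zigbee ch 26: $(thread_channel_mhz 26) MHz"
echo "Wi-Fi ch 11:  $(wifi_channel_mhz 11) MHz"
```

The last three lines show why channel 25 is a reasonable pick if your Wi-Fi sits on channels 1, 6 or 11: a 20 MHz wide Wi-Fi channel 11 reaches up to about 2472 MHz, while Thread channel 25 is centred just above that. If your Wi-Fi uses channel 12 or 13, the picture is different.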

If you succeeded, you should see something like this:

Hey! How are you running HA?
I am running it in a container (podman compose) and also own an ESP Thread Border Router. My Thread client device (running ESPHome) gets an IPv6 address, which I can ping from my machine, as I can the border router itself, so I presume the firmware side of the configuration is correct. I am also able to add the OTBR and Thread integrations, and I can see my Thread network with all the details, much like in your screenshot. The weird part: HA says I have no border router on the Thread network, even though my BR’s own interface says it is the leader of the Thread network. HA shows a reset button, but all it does is change the Thread network hosted by the border router; HA still shows “no border router”. Do you have any idea what my issue could be?
Also FYI: if you click the blue “status” in the Thread network status section of the page, it fetches the details after a few seconds and you can see the values.

Hmm. Not sure. Have you added Open Thread Border Router integration as well? :thinking: So you don’t see the router in the Thread view at all? Post a screenshot of this screen:

Regarding “Also FYI, clicking on the blue ‘status’ in thread network status section of the page, after a few seconds it fetches the details and you’re able to see the values”:

Yeah, I wrote up some findings here, btw: Failing to connect ESP-H2 via Thread - #4 by kotrfa

Thanks for getting back! Yes, I have added the OpenThread Border Router integration, pointed at the mDNS local domain address of the router. It shows up only this far


I still have the logs running from the border router connected to my PC. thread-rf-bridge is my would-be Thread client device on a separate H2, and its IP address resolves just fine.

I feel like I’m out of my depth here… :sweat_smile:

The HA Thread settings page lists the networks it detects from mDNS broadcasts for the _meshcop._udp.local service. It seems these broadcasts aren’t reaching your machine, either because it’s on a different Wi-Fi subnet/VLAN, or because something on your network or host (e.g. a router or firewall) is blocking the multicast DNS packets. Are you using multiple VLANs? Can you see the meshcop advertisements in the HA zeroconf browser (Settings → System → Network), or from another device running a zeroconf browser app?
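
To check for the advertisements from a terminal rather than an app, something along these lines may help. It assumes that either avahi-browse (Linux, from the Avahi tools) or dns-sd (macOS) is installed; meshcop_browse_cmd is just a made-up helper that maps a tool name to its browse command. The script prints the command instead of running it, because dns-sd keeps running until you press Ctrl+C.

```shell
#!/bin/sh
# Service type that Thread border routers advertise over mDNS.
SERVICE="_meshcop._udp"

# Hypothetical helper: print the browse command for a given tool name.
meshcop_browse_cmd() {
    case "$1" in
        # -r resolves each service, -t exits after dumping what was found
        avahi-browse) echo "avahi-browse -rt $SERVICE" ;;
        # -B browses for instances of a service type in a domain
        dns-sd)       echo "dns-sd -B $SERVICE local." ;;
        *)            return 1 ;;
    esac
}

# Suggest the first tool that exists on this machine.
for tool in avahi-browse dns-sd; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "Run: $(meshcop_browse_cmd "$tool")"
        break
    fi
done
```

If the border router shows up here from one machine but not from the HA host, the advertisements are being sent and the problem is somewhere between the network and HA.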

Can you ping this IP from any host on your network, or only from border routers? If the former, the border routers are doing their job and the only problem seems to be the HA detection part. If you aren’t planning to commission Matter-over-Thread (or HomeKit-over-Thread) devices with HA, then you really don’t need it to see your TBRs, but it does point to a potential problem on your network if it can’t.

This is weird to me. It says there is no border router, but it gives you a button to reset the (nonexistent) border router?? This settings page is a UX disaster. Maybe the OTBR integration (which is separate from the Thread integration) is connected to an OTBR REST API but still no mdns broadcasts are coming through? There should be a better way of presenting this to the user.

If it were mDNS, wouldn’t it help to use the IP address directly in OTBR?

I second that the UI is pretty bad… There are already a lot of suggestions on the HA GitHub for how to improve it, so I think it’s at least being tracked, and I have added a couple of them myself.

mDNS is not being used for name resolution in this case; it’s being used for service discovery. Thread Border Routers “announce” themselves using multicast zeroconf packets, so if you have an app that can view them, such as “Discovery” for macOS, you can see your TBR announcements on the local LAN:

[Screenshot: meshcop-udp]

If they are being received by HA they should show up in its built-in zeroconf browser, which is what makes them appear on the Thread settings page:

This is just service discovery, which isn’t technically needed for the router to do its job (routing). For that, it sends out IPv6 router advertisements (RAs) so that hosts on the LAN know the “nexthop address” to the Thread subnet IP range. It is possible that your RAs are working and you can ping Thread devices even if mDNS service discovery is broken (however you need the mDNS part working if you want to use Matter, as it is a prerequisite).

For example, on a Linux machine I can see all the routes to my Thread subnet, one from each TBR:

peter@sarah:~$ ip -6 route
::1 dev lo proto kernel metric 256 pref medium
fdaf:c551:c359::/64 proto ra metric 1024 expires 1668sec pref medium
	nexthop via fe80::c05:63f8:ea9d:cfff dev enp1s0 weight 1 
	nexthop via fe80::fb:e6f6:1481:6aff dev enp1s0 weight 1 
	nexthop via fe80::cc6:db46:9e3e:74ff dev enp1s0 weight 1 
	nexthop via fe80::1066:ac0e:ce97:fdff dev enp1s0 weight 1 
	nexthop via fe80::18b8:6b02:a291:e0ff dev enp1s0 weight 1 

On my macOS machine I can only see the chosen route:

peter@whistler ~ % netstat -rn -f inet6 | grep fdaf:c551:c359
fdaf:c551:c359::/64                     fe80::18b8:6b02:a291:e0ff%en1           UGc                   en1       

I can confirm using the Discovery app (screenshot above) by expanding my TBR details that this fe80 address belongs to one of my border routers. If I traceroute to a Thread device I should expect it to use this TBR as the next hop:

peter@whistler ~ % traceroute6 5E99DF85BBD8433C.local
traceroute6: Warning: 5e99df85bbd8433c.local has multiple addresses; using fdaf:c551:c359:0:a805:5d3a:784f:2f1
traceroute6 to 5e99df85bbd8433c.local (fdaf:c551:c359:0:a805:5d3a:784f:2f1) from fdc4:4aab:e77f:e44c:1c53:e1b9:c62a:83dd, 64 hops max, 28 byte packets
 1  fdc4:4aab:e77f:e44c:81b:8955:4a71:e3ff  11.543 ms  7.511 ms  5.909 ms
 2  fdaf:c551:c359:0:a805:5d3a:784f:2f1  27.794 ms  34.724 ms  33.154 ms

Here it’s showing me the ULA instead of the LLA but again I can check my Discovery app output to see that it’s just another address on the same TBR.
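
The Linux check can also be scripted. The sketch below just filters ip -6 route output for routes learned from router advertisements (proto ra) and counts the next hops; ra_prefixes and nexthop_count are illustrative names, and the sample text is a trimmed copy of the Linux output shown earlier.

```shell
#!/bin/sh
# Print the destination prefix of every route learned via router advertisement.
ra_prefixes() {
    awk '/ proto ra /{ print $1 }'
}

# Count "nexthop via" lines (one per border router announcing the prefix).
nexthop_count() {
    grep -c 'nexthop via'
}

# On a live Linux host you would run: ip -6 route | ra_prefixes
sample='::1 dev lo proto kernel metric 256 pref medium
fdaf:c551:c359::/64 proto ra metric 1024 expires 1668sec pref medium
	nexthop via fe80::c05:63f8:ea9d:cfff dev enp1s0 weight 1
	nexthop via fe80::fb:e6f6:1481:6aff dev enp1s0 weight 1'

printf '%s\n' "$sample" | ra_prefixes      # prints fdaf:c551:c359::/64
printf '%s\n' "$sample" | nexthop_count    # prints 2
```

Note that with only one border router, ip prints the route on a single line (prefix via address) without separate nexthop lines, so the count would be 0 even though the route exists.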

Hey, thanks for the reply! Yep, resetting the border router I don’t have :smiley: Your theory seems plausible to me.
I have multiple SSIDs and VLANs, running on Unifi, but I have configured the border router to be on the same VLAN.
My HA instance shows no zeroconf discovery at all. I installed BonjourBrowser on my phone, which sees my TrueNAS and my BSB-LAN but not the thread BR. So could already be 2 potential issues.
As for IPs, I was able to ping both the border router and my Thread client device from my HA host OS as well as from my separate PC. I don’t have any Matter or HomeKit devices for now; all I have is a test ESPHome board.

Wow, thanks for the description, this is very helpful even for me. I am sure I will find it handy (and I can already see how this gives much better introspectability than Zigbee)!

I was able to confirm that the BR is indeed sending meshcop UDP advertisements, as I can see them from my machine. I am now pretty sure my issue is with running rootless podman, without capabilities such as CAP_NET_RAW. I’m not sure this is solvable in rootless mode for now, so I will probably have to try running as root.

I was able to figure it out. I switched to rootful Home Assistant, but still nothing came up. It turns out my nftables config was a bit too strict and didn’t allow mDNS UDP traffic. I am now able to see my border router as expected.
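
For anyone hitting the same wall, rules along these lines are roughly what was missing. This is a guess at the shape of the fix, not the actual config from this thread: it assumes an existing table inet filter with a chain named input, which may well be named differently on your system (check with nft list ruleset). The script only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Assumed names; adjust to your own ruleset.
TABLE="inet filter"
CHAIN="input"

# mDNS uses UDP port 5353, sent to 224.0.0.251 (IPv4) and ff02::fb (IPv6).
mdns_rules() {
    echo "nft add rule $TABLE $CHAIN ip daddr 224.0.0.251 udp dport 5353 accept"
    echo "nft add rule $TABLE $CHAIN ip6 daddr ff02::fb udp dport 5353 accept"
}

# Print the commands for review instead of applying them.
mdns_rules
```

Keep in mind that add rule appends to the end of the chain. If the chain ends with an explicit drop or reject rule, the accept rules need to go before it, for example with nft insert rule instead.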