Add a second border router to an existing Thread network

Hi,
I just created my brand new Thread network and commissioned my first Thread Matter device. Now I wonder: how do I add a second border router to my existing network?

Currently the OpenThread Border Router integration only supports a single instance, unfortunately :cry: We are working on supporting multiple instances, hopefully this lands soon :crossed_fingers:

Actually, there are two problems here: the OpenThread Border Router add-on and the OpenThread Border Router integration. The integration part we intend to fix soon.

Currently it is also only possible to install/run a single add-on instance. However, running two instances on the same machine is usually not helpful anyway: you want to spread out the BRs. So ideally you install another OTBR instance on a different machine.

To control such a second instance, as a workaround for the integration only supporting one instance, you should be able to just delete the current one from the integration, add the second one, and configure it from the Thread configuration panel. The OTBR is self-contained, so the first instance should continue to run with its last configuration. You just won’t be able to control it anymore from the Thread configuration panel.

Do you have any updates regarding multiple border routers?

Yes, this is in progress with PR #124289, it will most likely make it into 2024.9 :tada:

I’ve been thinking about this … I already have two HA instances, each running the OTBR add-on with the Thread/OTBR integration. I have both TBRs using the same Thread dataset, so they are both on the same Thread network. Each Thread integration knows its local dataset as well as the remote’s dataset.
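
For anyone wanting to check that two border routers really share a dataset, here is a minimal sketch, assuming you have each dataset as the usual hex TLV string. It walks the type/length/value records and compares the fields that identify the mesh. The helper names (`parse_dataset`, `same_network`, `make_tlv`) and the sample values are made up for illustration, and extended-length TLVs are not handled.

```python
# MeshCoP TLV type numbers that appear in a Thread operational dataset.
TLV_NAMES = {
    0: "channel",
    1: "pan_id",
    2: "ext_pan_id",
    3: "network_name",
    5: "network_key",
    14: "active_timestamp",
}

# Fields that must match for two routers to be on the same mesh.
IDENTITY_FIELDS = ("channel", "pan_id", "ext_pan_id", "network_name", "network_key")


def parse_dataset(tlv_hex: str) -> dict:
    """Split a hex-encoded dataset into {field name: raw value bytes}."""
    data = bytes.fromhex(tlv_hex)
    fields = {}
    pos = 0
    while pos < len(data):
        if pos + 2 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type, length = data[pos], data[pos + 1]
        value = data[pos + 2 : pos + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        fields[TLV_NAMES.get(tlv_type, f"tlv_{tlv_type}")] = value
        pos += 2 + length
    return fields


def same_network(hex_a: str, hex_b: str) -> bool:
    """True when both datasets agree on every identity field."""
    a, b = parse_dataset(hex_a), parse_dataset(hex_b)
    return all(k in a and a.get(k) == b.get(k) for k in IDENTITY_FIELDS)


def make_tlv(tlv_type: int, value: bytes) -> str:
    """Hypothetical helper that builds one TLV record for the sample below."""
    return bytes([tlv_type, len(value)]).hex() + value.hex()


# Made-up sample values, not a real network.
sample = (
    make_tlv(0, bytes([0, 0, 15]))                      # channel page 0, channel 15
    + make_tlv(1, bytes.fromhex("1234"))                # PAN ID
    + make_tlv(2, bytes.fromhex("1111111122222222"))    # extended PAN ID
    + make_tlv(3, b"MyHome")                            # network name
    + make_tlv(5, bytes(16))                            # all-zero placeholder key
)

print(parse_dataset(sample)["network_name"].decode())  # MyHome
print(same_network(sample, sample))                     # True
```

The active timestamp is deliberately left out of the comparison, since it can differ between copies of what is otherwise the same network.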

So in the case where there is only one HA instance running a local OTBR AddOn and there is the desire to have a second OTBR instance on another machine but controlled by the one HA instance, then is it the case that the second OTBR is not implemented as an HA AddOn, but something else (Docker)?
If no, and the second OTBR is an AddOn running on a second instance of HA, then what would be the advantage of having one HA instance control the OTBR of the second instance?
Best Regards

I’m interested in this too. I’ve been looking forward to using Thread like WiFi: deploying PoE TBRs like you would an access point, and deploying additional TBRs for redundancy and to increase coverage. In my deployment, WiFi is my strongest and most resilient signaling protocol because I have 3 APs. Its only drawback is battery powered devices. I’m hoping Thread can solve that.

@agners Thanks for your work on OTBR! Does this PR help improve stability in networks that mix Apple Home and OTBR devices?

I recently added a SkyConnect to my Home Assistant server, joining it to an existing Thread network. Since then, most of my Thread devices frequently go offline. The only Thread device I can monitor is my HomeKit device: it stays online for 1-3 minutes, then goes offline, and reliably reconnects exactly 30 minutes later. I cannot track the Thread status of all my Matter devices, as the Matter integration doesn’t seem to log the status.

About a year ago, I merged my Apple Home and OTBR Thread networks with similarly unstable results. I thought I’d give it another try, but it seems the issues persist. I might end up buying another HomePod Mini or Apple TV instead of using my existing SkyConnect.

Do you know if anyone is actively working on these stability issues? I haven’t found any specific GitHub issues related to this, but navigating the HA project can be overwhelming without knowing exactly where to look. Thanks!

Wohnzimmer Lightstrip Thread Status  became unavailable
08:07:35 - 16 minutes ago

Wohnzimmer Lightstrip Thread Status  changed to Leader
08:04:04 - 19 minutes ago

Wohnzimmer Lightstrip Thread Status  became unavailable
07:35:51 - 1 hour ago

Wohnzimmer Lightstrip Thread Status  changed to Leader
07:34:04 - 1 hour ago

Wohnzimmer Lightstrip Thread Status  became unavailable
07:05:39 - 1 hour ago

Wohnzimmer Lightstrip Thread Status  changed to Leader
07:04:04 - 1 hour ago

Wohnzimmer Lightstrip Thread Status  became unavailable
06:35:26 - 2 hours ago

Wohnzimmer Lightstrip Thread Status  changed to Leader
06:34:04 - 2 hours ago

Wohnzimmer Lightstrip Thread Status  became unavailable
06:06:45 - 2 hours ago

Wohnzimmer Lightstrip Thread Status  changed to Leader
06:04:04 - 2 hours ago
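
To put numbers on the pattern in the logbook excerpt above, here is a small sketch that takes the ten timestamps exactly as listed and works out how long the device stayed up each time and how far apart the reconnects are. Only the variable and function names are made up.

```python
from datetime import datetime

# (time, state) pairs copied from the logbook excerpt, oldest first.
events = [
    ("06:04:04", "Leader"),
    ("06:06:45", "unavailable"),
    ("06:34:04", "Leader"),
    ("06:35:26", "unavailable"),
    ("07:04:04", "Leader"),
    ("07:05:39", "unavailable"),
    ("07:34:04", "Leader"),
    ("07:35:51", "unavailable"),
    ("08:04:04", "Leader"),
    ("08:07:35", "unavailable"),
]


def to_seconds(hms: str) -> int:
    """Convert HH:MM:SS into seconds since midnight."""
    t = datetime.strptime(hms, "%H:%M:%S")
    return t.hour * 3600 + t.minute * 60 + t.second


online = [to_seconds(t) for t, state in events if state == "Leader"]
offline = [to_seconds(t) for t, state in events if state == "unavailable"]

# Seconds between consecutive reconnects.
reconnect_gaps = [b - a for a, b in zip(online, online[1:])]
# Seconds the device stayed reachable after each reconnect.
uptimes = [off - on for on, off in zip(online, offline)]

print(reconnect_gaps)  # [1800, 1800, 1800, 1800]
print(uptimes)         # [161, 82, 95, 107, 211]
```

So the reconnects land on a fixed 30-minute schedule (always at :04:04 and :34:04) regardless of when the device dropped, which might point at a periodic timer somewhere rather than a timeout measured from the disconnect. That last part is only a guess from the timestamps.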

EDIT: There’s nothing in the logs for Home Assistant Core, Open Thread Border Router and Matter Server for this time frame.

EDIT2: Removing the device from HA, resetting it, and re-adding it made the issues with this specific device disappear. I noticed its Thread status changed from Leader to Router. So the problem might occur when a Router device decides to take on additional responsibilities as a Leader.

@agners Can you please help me figure out how to connect an additional Thread Border Router to an existing Thread network? I have a Home Assistant Yellow flashed with Thread only, with the OpenThread add-on and integration configured. Everything works as expected. I bought an SLZB-06M to extend my Thread network and reflashed it with Thread-only firmware. How do I add it to the existing Thread network? Do I need a second instance of Home Assistant to install the add-on? Or is it possible to connect it to the existing one somehow?

Was there an answer to this question? I have a single SLZB-06M working correctly, but its range is insufficient to reach other Matter devices… adding a second, networked radio to the same Thread network would really solve my issue…

(although I guess I could buy a Tado Bridge-X and join it)

Ordinarily, if you just want to extend your network, you wouldn’t add another Border Router: any line-powered Thread device will act as a repeater and increase your mesh’s range. The cheapest way is a $5 ESP32-H2 dev module flashed with firmware as explained in this post, but any line-powered Thread device will work. I personally made a cheap repeater out of Espressif ZeroCode firmware.

I guess the exception is if you want to extend your network beyond the reach of the original mesh and don’t want to (or can’t) add repeaters in between. You can get a standalone TBR like a GL-iNet GL-S200, or you can build one yourself with OTBR software coupled with an adapter. Ordinarily you would never run two OTBR containers on the same HA server (since they would be in the same place), but the SLZB-06M has a cool serial-over-IP feature allowing you to place it further away from the server running OTBR. Lots of niche cases there!

[ PS - I would add that having two isolated (i.e. not in range of each other) TBRs means there is no reason/benefit for them to be the same network (name/credentials dataset), since you would know which mesh you are joining: you just choose the one you want to use during commissioning. BUT the state of Thread UX is so dismal right now that having two mesh datasets is causing all sorts of confusion, and it’s not even clear if you CAN choose a mesh during commissioning. Given that reality, unless you have a separate mobile device to commission on each mesh, you may need to consolidate to a single mesh dataset, which complicates things even MORE because vendors aren’t sharing datasets with each other yet. Currently (to my knowledge) only OTBR (being open) lets you modify its dataset to merge with another network. Hopefully this improves in the coming years. ]

TL;DR: use one TBR if you can, and repeaters to extend your mesh. If you must have two TBRs, make them the same vendor. If they must be different vendors, make sure the second one is OTBR so it can merge with the first to simplify commissioning. There are only niche cases where you might want to run two OTBRs on the same server, e.g. geographically sparse networks using serial-over-IP Thread adapters.

For what it’s worth, I’ll add that if the TBRs support a feature called TREL (and these TBRs can be put on the same Thread network), then in theory at least, if a Thread network gets “partitioned”, meaning it loses some connectivity within its mesh, and a TBR exists on each partition, TREL is supposed to allow the partitions to be reconnected using the LAN connecting the TBRs.
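
As a toy picture of that idea (not how OpenThread actually implements TREL), you can treat the mesh as a graph and count the connected groups with and without a LAN link between the two TBRs. All node names here are made up.

```python
from collections import defaultdict


def count_partitions(nodes, links):
    """Count connected groups of nodes given a list of (a, b) links."""
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    partitions = 0
    for start in nodes:
        if start in seen:
            continue
        partitions += 1
        # Depth-first walk over everything reachable from this node.
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(adjacency[node] - seen)
    return partitions


# Hypothetical layout: two TBRs, each with one device in radio range,
# but the two halves cannot hear each other over 802.15.4.
nodes = ["tbr_a", "bulb", "tbr_b", "sensor"]
radio_links = [("tbr_a", "bulb"), ("tbr_b", "sensor")]
lan_link = [("tbr_a", "tbr_b")]  # the link TREL would provide over the LAN

print(count_partitions(nodes, radio_links))             # 2
print(count_partitions(nodes, radio_links + lan_link))  # 1
```

Without the LAN link the two halves are separate partitions; with it, they form one mesh again.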

In practice, I’m not so sure … I have 2 OTBRs running TREL, but have found cases where one OTBR may lose connection with a device while the other TBR still has connectivity (I haven’t actually dug into it enough to find out why). So I’m not yet sure how well TREL really works.

But as mentioned by Peter, it will certainly help the mesh to have repeaters.

Is this possible yet? Looks like this PR was merged, however I can’t find any information about how to configure it or any supporting documentation mentioning it.

I just glanced at the code [2025.1] on my running system and the changes are there.
It looks like you go to UI -> Integrations -> OTBR -> ADD SERVICE and get a pop-up asking for the URL to reach the OTBR, which I presume is the additional instance.
Whether or not the Thread integration can make use of it (like pushing preferred credentials), I don’t know.
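
Before entering a URL in that pop-up, it may help to check that the second OTBR answers on its REST API. Below is a minimal sketch under a few assumptions: that the OTBR REST API is enabled, that it serves the active dataset at `/node/dataset/active` as a hex string when asked for `text/plain`, and that it listens on port 8081. The host name `otbr-2.local` is a placeholder, and the function names are made up.

```python
import urllib.request


def dataset_url(base_url: str) -> str:
    """Build the active-dataset URL (path is an assumption, see above)."""
    return base_url.rstrip("/") + "/node/dataset/active"


def fetch_active_dataset(base_url: str, opener=urllib.request.urlopen,
                         timeout: float = 5.0) -> str:
    """Return the active dataset as a hex string.

    `opener` is injectable so the function can be exercised without a
    real border router on the network.
    """
    request = urllib.request.Request(
        dataset_url(base_url),
        headers={"Accept": "text/plain"},  # ask for the hex TLV form
    )
    with opener(request, timeout=timeout) as response:
        return response.read().decode().strip()


# Placeholder host; replace with the address of your second OTBR.
print(dataset_url("http://otbr-2.local:8081"))
# http://otbr-2.local:8081/node/dataset/active
```

Calling `fetch_active_dataset("http://otbr-2.local:8081")` against both border routers and comparing the two strings would be one way to see whether they are on the same network.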

Oh this is for the integration, I see.

I’m using the add-on, but can’t find a way to create an additional add-on instance. If that’s the intended behavior, I don’t think this is possible in HA right now.

Does anyone know what the testing environment for this PR looked like?

Hope you don’t mind if I ask: what is your use case for a second OTBR instance on the same machine?

I’m actually using a TCP/IP adapter connected over Ethernet to HA, which is running in a VM on a large virtualization host in the garage. I suppose running OTBR outside of the HA add-on would make this possible?

I’m assuming that running another OTBR instance is what’s necessary to run a second OTBR adapter.

I made another comment earlier in this thread some time back to explain my use case, but the main thing I’m after here is network redundancy. My entire network is fully redundant with the exception of Thread, Zigbee, and Z-Wave.

By this I think you mean OTBR ↔ TCP/IP ↔ Thread radio.
Just an FYI, the HA Matter/Thread developers have tested RCP over TCP/IP and have concluded it introduces too much variable delay and shouldn’t be used. From this page, scroll down and you’ll see a warning about it.

Yeah, I saw that. I would imagine most of the problems arise from WiFi backhaul. Mine has been working great for 6 months via PoE. As things are right now, I’m happy with my current architecture, aside from the lack of redundancy.

Hopefully a final question: is your PoE link running straight out of your HA machine’s Ethernet port to the adapter, or are there multiple L2 switch hops or even router hops along the path? I could see where the former is not a problem, but perhaps the latter (store and forward a packet) is a little more of a problem?