Hi there, I’m thinking about building up a home automation system with HA. But before I go all in, I wanted to make sure it all works the way I need it to.
I have one major problem. I sometimes travel for longer periods of time, while my family stays at home. When I’m not there, I need some sort of fallback system, in case the server/gateway breaks.
I have read a little about HA and high availability. But it seems that there is no blessed way for that (yet).
I was thinking about just creating two identical servers/gateways and use one as an emergency fallback. For example:
Let’s say I buy two identical Raspberry Pis and two identical Zigbee USB gateways. Then I configure HA on one Pi/gateway. When I’m done, I simply run ‘dd’ to clone the configured drive onto the other one. I should now have two identical HA servers.
When one server/gateway breaks, my wife can unplug the power and Ethernet from the broken Pi/gateway and plug in the other one. After that, the system should be back to normal, right?
That way I would only have to ‘dd’ the drive of the main system onto the backup system each time I change something in HA.
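A minimal sketch of that clone step, written in Python instead of ‘dd’ so a verification pass can be added. The function names and paths are made up for illustration, and it only covers copying the drive image; it says nothing about whether the second Zigbee gateway will behave like the first.

```python
import hashlib
import os

CHUNK = 4 * 1024 * 1024  # 4 MiB per read, similar in spirit to dd's bs=4M


def clone_image(src, dst):
    """Copy src to dst chunk by chunk; return (sha256 hex digest, bytes copied)."""
    digest = hashlib.sha256()
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(CHUNK)
            if not chunk:
                break
            fout.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
        fout.flush()
        os.fsync(fout.fileno())  # ask the OS to push the data to the device
    return digest.hexdigest(), copied


def sha256_prefix(path, length):
    """Hash only the first `length` bytes (the target drive may be larger)."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(CHUNK, remaining))
            if not chunk:
                break
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def clone_and_verify(src, dst):
    """Clone, then read the copy back and compare hashes."""
    source_hash, copied = clone_image(src, dst)
    # Note: the read-back may be served from the OS cache, so this catches
    # copy mistakes rather than proving the physical medium is healthy.
    return sha256_prefix(dst, copied) == source_hash
```

Called as something like `clone_and_verify("/dev/sda", "/dev/sdb")` (hypothetical device names, needs root), it returns `True` only if the copy reads back identical to what was written.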
I regularly travel away from home for 12 months at a time.
I’ve had few issues other than some smart plugs failing.
I have alternate access to my local network and a standby pc that I can spin up (it has a HAOS image already installed) in case of catastrophic server hardware failure.
There are no dongles plugged into my HA server that would need moving. I use a Tube’s zigbee dongle with an Ethernet connection.
Indeed. The USB sticks are the problem for most high-availability setups.
OP, just for clarity, this:
Won’t work. Your network is stored on the stick. And no, you can’t clone it and expect it to work that way. So what Tom’s doing is using a network-connected coordinator to eliminate that issue.
I would however offer a different opinion.
Don’t High availability the setup. I love my spouse dearly and if I told her ok Hon if this crashes do this… I get ‘the stare’ you know the one. Where they tell you you’ve done something dumb? There’s no way I’m leaving a restore process for standard users.
If you do, make sure you leave yourself a method for auto recovery or a way for you to login remotely and do it yourself.
Personally I build the setup with failure in mind. Look at everything you put in and ask:
Ok what happens to this if the Internet crashes, if HA crashes? With modern gear you can build setups that work entirely locally.
Pick devices that use industry standard protocols that support local connectivity, like Z-Wave, Zigbee, Matter, etc.
Use hardware buttons that support whatever direct connect tech your network transport supports (binding for Zigbee, for instance).
Avoid software-based triggers for mission critical things (if HA crashes you lose access to the dashboard).
By doing this even if catastrophe strikes your home still operates. As long as I have power I can turn on the lights and my alarm is a standalone system so I can still disarm it.
THEN add high availability on top of this build philosophy… If necessary.
If you’re new, get into this habit before you build and you’ll save yourself a LOT of heartache later.
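One way to make the “what happens if the Internet crashes, if HA crashes?” question concrete is to write the inventory down and check it mechanically. This is only a sketch: the `Device` fields and the example devices are invented for illustration, not taken from anyone’s real setup.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    critical: bool                # would the household be stuck without it?
    works_without_internet: bool  # no cloud dependency
    works_without_ha: bool        # e.g. wired switch, or a button bound directly to a lamp


def audit(devices):
    """Return (name, [missing fallbacks]) for every critical device with a gap."""
    problems = []
    for dev in devices:
        if not dev.critical:
            continue  # nice-to-have devices are allowed to go dumb
        missing = []
        if not dev.works_without_internet:
            missing.append("internet")
        if not dev.works_without_ha:
            missing.append("HA")
        if missing:
            problems.append((dev.name, missing))
    return problems


# Hypothetical inventory, for illustration only.
EXAMPLE_INVENTORY = [
    Device("hallway light (wired switch)", True, True, True),
    Device("cloud-only smart plug", True, False, True),
    Device("mood lamp", False, False, False),
    Device("dashboard-only scene", True, True, False),
]

for name, missing in audit(EXAMPLE_INVENTORY):
    print(f"{name}: stops working without {', '.join(missing)}")
```

Anything the audit flags is a candidate for a hardware button, a direct binding, or a plain wired switch before any high availability is layered on top.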
Edit: and for the record, I run on an RPi 4 8GB with a 500 GB SSD. I back up every night, and have a standby SSD with a reasonably recent HA build loaded and ready to go.
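A nightly backup only helps if it is actually being written. As a rough sketch, a small script could check the age of the newest backup file. It assumes the backups end up as `.tar` files in a single directory; the directory, the pattern, and the 26-hour threshold are all assumptions to adjust.

```python
import time
from pathlib import Path


def newest_backup_age_hours(backup_dir, now=None, pattern="*.tar"):
    """Age in hours of the most recently modified backup, or None if there is none."""
    files = list(Path(backup_dir).glob(pattern))
    if not files:
        return None
    newest_mtime = max(f.stat().st_mtime for f in files)
    current = time.time() if now is None else now
    return (current - newest_mtime) / 3600


def backup_is_fresh(backup_dir, max_age_hours=26, now=None):
    """True if a backup exists and is younger than max_age_hours.

    26 hours leaves some slack around a once-a-day schedule (an assumption).
    """
    age = newest_backup_age_hours(backup_dir, now=now)
    return age is not None and age <= max_age_hours
```

The result could feed a notification, so a silently failing backup job gets noticed before the standby drive is needed.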
I put my HA Yellow box on a Tuya WiFi plug. If my HA stops working I can power cycle my system with the Tuya app on my iPhone. On the rare occasions that HA stopped working, the power cycle did the job.
Ok what happens to this if the Internet crashes, if HA crashes? With modern gear you can build setups that work entirely locally.
Pick devices that use industry standard protocols that support local connectivity, like Z-Wave, Zigbee, Matter, etc.
Yes. I want that too. I want it to run locally using open standards, without external cloud shebang.
They can be replaced, but it’s not an automatic process in the best of cases.
Oh my… I already have plenty of Philips Hue Zigbee lamps and switches. I think the switches won’t work without the Zigbee dongle.
I’m surprised that the Zigbee protocol has no solution for this. This makes it unsuitable for so many mission critical applications.
I have never configured HA/Zigbee. How much trouble is it to replace a failing Zigbee dongle? Would a standard user (e.g. my wife) be able to do that with some guidance in less than two hours? I have about 40 switches with about the same number of lamps.
You misunderstood: it can absolutely be backed up (and is by default on ZHA, the default Zigbee software), but my point is don’t expect push-button auto failover. It won’t happen, so do not expect it. Restore is a manual process. Same for Z-Wave if you go that way. Or Matter.
I’m a 25+ year IT pro. Also I’ve been doing home automation since 2000. IMHO you’re putting way too much weight on a failed stick. It rarely happens. A bad bit of code and a failed install? Sure, happens all the time. The stick itself doesn’t usually just die. And even if it DID: if you’ve designed it using hardware buttons with direct linking, who cares? The lights still work using the switches until you fix it (worst case: order a new stick, wait a day). And for the record, you can absolutely restore a stick (and in my case the entire rig) in under an hour.
And no, YOU DO NOT want a novice user attempting a potentially destructive operation (they can get you to an unrecoverable state).
My point is we try to tech our way out of everything. Some problems are better designed around.
You misunderstood: it can absolutely be backed up (and is by default on ZHA, the default Zigbee software), but my point is don’t expect push-button auto failover.
Ah ok. Would the solution that I have written in my first post work then? Because that would be ideal for me.
IMHO you’re putting way too much weight on a failed stick.
Maybe I do. But the worst case scenario would be that I have to take some days off, fly back home, fix it and fly back. I don’t want that to happen. I can’t leave my family w/o light at home for several months when I’m at the other end of the world.
The stick itself doesn’t usually just die. And even if it DID: if you’ve designed it using hardware buttons with direct linking, who cares?
You can have a warm standby HA install. You cannot have a warm standby stick.
Your solution needs to include either a network-based coordinator, or you restore to the same hardware, or you plan on moving the stick to the restoration hardware after your restore operation.
For Tom - he’s using a network coordinator so he restores and the stick is just there.
For me, I pop in my backup SSD, run restore, and then restart.
If you’ve set them up correctly yes.
My point is, if you built it correctly they won’t be without lights, because you designed it to work completely offline. It may be a little dumber, but it works, which removes any urgency to fix it. Then it can wait two or three days. If you don’t build for an outage you’re dead from day one because…
No matter how you slice it, this is consumer tech, not enterprise tech. It’s not designed with five nines in mind. Plan for the outage to happen. Because it WILL. You won’t be able to restore your way out of everything.
My fallback is just going back to “dumb home”. The light switches will keep working even if the server’s gone, they just won’t be automatic. The thermostats will just have to be set by using the buttons on them, etc.
Always have a physical backup on everything and then it can keep working.
I can keep a pre-configured fallback HA server, but I can’t keep a pre-configured fallback Zigbee Gateway (doesn’t matter if it’s USB or Ethernet). If the Zigbee gateway hardware breaks I should either rely on this:
The stick itself doesn’t usually just die. And even if it DID: if you’ve designed it using hardware buttons with direct linking, who cares?
Or this:
My fallback is just going back to “dumb home”. The light switches will keep working even if the server’s gone, they just won’t be automatic. The thermostats will just have to be set by using the buttons on them, etc.
Or fly back home and install a new Zigbee gateway.
I’ll echo everything said here, and like @NathanCu I build everything with failsafes in mind as we travel quite a bit in an RV during the summer and need my systems to be available for security. I also have a fairly robust RV version of all of this that also helps me keep track of what’s happening at home.
I don’t worry too much about the dongles. Honestly, I’ve had exactly one Z-Wave dongle fail in years, which is one of the many reasons I spread my network across Z-Wave, Zigbee and WiFi. Even if one thing fails I still have probably 2/3 of my house working. That being said, I always have spare dongles in case I am home and there is a failure, especially since my one failure happened during COVID when supply chains were bad.
When I was on my rPi I used a Kauf ESPHome plug for HA with code I wrote that tests the API, so that if it detected that HA was down (no ping) or locked up (no API), which had happened a couple of times while I was away, it would kill the power to the rPi and bring it back up. Now, on a VM, I use a similar method that restarts the VM if needed, and I also have a Kauf plug on my host computer that does the same auto monitoring for that. I used ESPHome for this because it is its own self-contained operating system that can do its own thing, but it also allows me to open a web interface remotely.
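The watchdog logic described above (ping check, API check, power cycle on repeated failure) can be sketched in Python. This is not the ESPHome code from the post; it is a stand-in that would have to run on some other always-on machine. The `power_cycle` callback, the host, the token, and the failure threshold are assumptions; the only HA-specific call is a GET to the REST API root at `/api/` with a long-lived access token.

```python
import http.client
import subprocess
import urllib.request


def host_pings(host):
    """Send one ICMP ping via the system `ping` (Linux-style flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def api_alive(base_url, token):
    """True if the HA REST API root answers with HTTP 200."""
    request = urllib.request.Request(
        f"{base_url}/api/",
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        return False  # refused, timed out, HTTP error, or garbled reply


class Watchdog:
    """Power-cycle only after several consecutive failed checks."""

    def __init__(self, is_healthy, power_cycle, max_failures=3):
        self.is_healthy = is_healthy    # callable returning True/False
        self.power_cycle = power_cycle  # hypothetical callable that toggles the plug
        self.max_failures = max_failures
        self.failures = 0

    def tick(self):
        """Run one check; return True if a power cycle was triggered."""
        if self.is_healthy():
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.power_cycle()
            self.failures = 0  # give the box time to boot before counting again
            return True
        return False
```

A health check could combine both probes, for example `lambda: host_pings(host) and api_alive(url, token)`, with `tick()` called once a minute. Requiring several consecutive failures avoids cutting power during a normal HA restart.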
I have three Internet connections. My main cable modem for super high speed, an inexpensive DSL for being able to work when the cable goes down and a pre-paid hotspot on a laptop when I’m away as my emergency access.
Of course everything is on a UPS and everything does daily or weekly controlled reboots just to keep it fresh. My HA restarts every Saturday morning when I’m home (not when I’m on the road) and my PoE hubs reboot every morning in case a feed went south.
Fortunately I don’t have to worry too much about me being gone and the wife is stuck with the system, but when that happens she knows that she can still use the light switches.
Exactly. After my experiences with Zigbee and a few Matter devices, I ditched this technology completely and switched to ESPHome (WiFi) devices.
From the beginning I made sure that basic functionality is available without HA. So everything is manually switchable if needed/wanted.
Besides, with ESPHome it is even possible to detect that there is no HA connection, and one node can trigger another directly over WiFi, or even without it (ESP-NOW) when WiFi is also down. For now I haven’t bothered setting up these “fallbacks”, because my system has been rock solid since install and the chances are high that if there is no WiFi there is no power either.
Funny, I had the exact opposite experience with Zigbee. For me it’s been pretty much rock-solid, except for one type of device (Aqara door sensor, paired easily but worked like 5 minutes). I think the main thing with Zigbee is that you need a solid mesh of “routers” to start with, and then you can expand. I have all my thermostats wired in and acting as routers, so my base mesh is very strong. All other devices (especially battery-powered ones) can then take advantage of this and have a reliable connection.
I’ve had very good success with ESPHome and Shelly devices, too.
I think the point of both of these quoted parts is: make sure the lights can still switch on if your Zigbee gateway breaks. Either by wired switches that cycle the power to the lights, or by wireless buttons that are connected directly through binding.