Maybe I'll give it a shot. Would be nice if it was possible to share sensors/inputs across devices in the swarm, making it sort of an integrated HA hive-mind.
Sharing physical connections and sensors is impossible.
A Docker swarm is just a bunch of Docker hosts that can pass services around; it isn't x-number of services all running together.
Okay, I want to understand this correctly. Let's say I have 3-4 Pis around the house: one has a Z-Wave dongle, the others have various sensors and so on attached. I then turn all of these Pis into a Docker swarm with HA running global (one instance on every Pi). Do all these sensors, from different Pis, appear to HA as if it's just one big Pi?
No. Not at all.
That is not how it works.
A Docker swarm is a collection of docker hosts. FULL STOP.
A service running on a swarm can migrate from host to host.
You cannot have 3-4 Pis all acting as one big Pi with different configurations and sensors on each one. You can install Home Assistant on each one, configure MQTT Eventstream and use a 'main pi' to get all the data to one Home Assistant instance. This configuration has nothing to do with Docker Swarm.
Okay, thanks for the reply, that was also my understanding sadly.
What I'm getting at here is just whether it would make most sense to have HA within or external to the swarm, and external seems like the way to go here (for me).
You could always run a separate pi with Hass, have it act as a zwave/physical interfacing bridge and connect to it from the swarm-hosted main instance through MQTT.
Is anybody still on this? I'm reviving a reasonably old topic it seems, but I'm also actively working on this. I'm looking for knowledge and experience from others who have deployed in Docker Swarm, and I might have some experience that I can share :).
My main motivation is increasing the number of available resources (because I do believe that it'll be difficult for a single Raspberry Pi to run HA + a lot of add-ons), but I'd like to increase the availability (reliability, uptime) of the system as well. Basically I want to build the system such that the probability of it failing is as low as possible. This is ideally done at both the hardware level (running on a single Raspberry Pi is not a redundant architecture) and the software level (easy config, rollbacks in case of failures, etc.).
So… is this still on top of anybody else's agenda?
I've been running this for the last couple of months. I have a couple of Pis with read-only filesystems and Z-Wave sticks (and other serial adapters) that are exposed on an internal network as TCP servers with ser2net.
The Home Assistant container runs socat to connect to the 'slave' Z-Wave and create a serial device inside the container. The configuration is stored in git, with a caching proxy. When a Home Assistant container starts it does a git pull, runs socat to have the Z-Wave device locally, then runs hass…
This has been running on a 3-node Docker swarm without any issues. For example, if I re-plug the Z-Wave stick then ser2net is restarted, socat is restarted and Home Assistant is restarted, in this order…
For the Z-Wave stick you can use something like an MR2030 (a very small DD-WRT/OpenWrt router) with ser2net on it, or a Pi, or anything else…
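To make that concrete, here is a minimal sketch of the pattern, assuming the stick shows up as /dev/ttyACM0 on the bridge Pi, the bridge is reachable at 192.168.1.10, and port 3333 is free (all of these values are assumptions, not quasar66's actual setup):

```shell
# On the 'bridge' Pi that physically holds the Z-Wave stick:
# expose /dev/ttyACM0 as a raw TCP server on port 3333 via a classic
# /etc/ser2net.conf line:
#   3333:raw:0:/dev/ttyACM0:115200 8DATABITS NONE 1STOPBIT
ser2net -c /etc/ser2net.conf

# Inside the Home Assistant container (e.g. in the entrypoint), before
# starting hass: recreate a local serial device backed by that TCP stream.
socat pty,link=/dev/ttyZWAVE,raw,echo=0 tcp:192.168.1.10:3333 &

# Then point the Z-Wave integration at /dev/ttyZWAVE and start hass.
hass
```

If the TCP connection drops, socat exits, so wrapping it in a restart loop (or letting the container restart) matches the "ser2net restarted, socat restarted, HA restarted" ordering described above.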
I'm running now:
- 3 Z-Wave sticks on read-only Pis
- 1 heating container using Homegear and a MAX! Cube device, writing to MQTT
- 1 MQTT container
- 3 Home Assistant containers that translate the Z-Wave data to MQTT
- 8 Home Assistant containers with various stuff (TV, multimedia, presence detection); basically one purpose for each instance, and they write to MQTT
- 1 Node-RED for core data transformation from MQTT to MQTT
- 1 Node-RED for automations
- 1 'main' Home Assistant that reads and writes to MQTT
Docker container build files with Home Assistant + socat (but no git, that's private)
For Z-Wave devices you can run 2 sticks on the same network; the second stick won't be able to include/exclude and some other stuff, but at least some control/monitoring will work even if the first fails… I haven't done that since it seems a bit overkill… maybe in the future…
That sounds great! Do I understand correctly that you have a single Z-Wave stick on one device, and then use ser2net to 'mount' that serial device on another physical device of your choosing, depending on where the Docker container runs that needs that device?
I don't actually have a Z-Wave network, but I'm very interested in this part: mapping physical devices on one host to any other (Docker) host that you have. Did you also, for example, set up redundant storage to have Docker volumes (where the state of all the services is persisted, if configured correctly) redundant as well? And a redundant/distributed database that stores the Home Assistant data?
I'm starting now with the redundant Docker volumes (trying MinIO first, then switching to a NAS if that's not performant enough).
I use 3 nodes that are 'equal':
- in Docker, all are managers and workers
- storage is mirrored between the 3 of them using GlusterFS
- all devices are either from MQTT or use another type of network detection, so it does not matter where the device is
- MySQL as DB storage was set up as master/master on 2 different nodes, with the 3rd acting as a backup every hour (and uploading an encrypted backup to a remote server)
Any of the 3 nodes can fail, and almost everything can run on one server if needed. I have a big UPS on the 3 nodes (J1900 quad-core nodes with SSDs, about 10 W power usage), and if the UPS goes under 2 hours of runtime one node will shut down to preserve as much time as possible. When runtime goes under 30 minutes all non-security stuff is stopped, and when it goes under 10 minutes all security stuff moves to 2 Raspberry Pis that have a 48-hour battery.
If you want to use Docker and Gluster you need to do this:
- set up Docker and the swarm
- disable Docker auto-start
- stop all Docker nodes
- set up GlusterFS
- create a cronjob that checks if GlusterFS is mounted and running: if yes, start Docker; if not, stop Docker
If you start Docker before Gluster, bad things will generally happen…
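The cronjob from the steps above can be wired up with a single crontab entry; the script path and log path here are assumptions:

```shell
# crontab entry (crontab -e as root): run the Gluster/Docker check
# every minute and keep a log of what it decided.
# m h dom mon dow  command
* * * * * /usr/local/bin/gluster-docker-check.sh >> /var/log/gluster-check.log 2>&1
```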
I installed everything on a basic Ubuntu 18, on Celeron CPUs (as I said above), with SSDs and 16 GB of RAM each and 2 Intel network cards (one for storage/inter-cluster, one for the rest of the network). They use about 15 W in 'idle', 30 W at full load…
I tried Pis, but when you add up the Pi, the power source and everything else needed to make it stable and power on / shut down automatically, it's not much cheaper than an ASRock J1900 board (or similar), just more complicated… Now the Pis are running as various radio serial-to-network adapters and Bluetooth scanners (and some other small, read-only jobs).
Thanks once again, great advice. I'm not familiar with the ASRock boards, but you're right, they're not that much more expensive than the Pis… However, I do already have an RPi cluster with 4 nodes running, so I'll try that first and see how it behaves. Will also check out GlusterFS. I can imagine that Pis don't work well if you want to add a UPS underneath, but I haven't started looking at the power supply yet.
Good to know at least that it works! I'm also interested in any deployment scripting (Docker Compose files, Ansible playbooks, whatever you've been using) that you might be able to share?
Most of my scripts are highly integrated, I'll see what I can share… Most of them are pretty simple bash scripts.
My Docker swarm uses Portainer as the main interface, and I have a personal git repository, so I just schedule pulling from there automatically. That takes care of rollbacks too… I'm in the middle of setting up some automated testing: once a new set of git versions is pushed, I save the whole list of versions, and if no action is taken within a certain amount of time the versions are rolled back to the latest stable version. Having stuff split into the smallest possible chunks, plus a bit of creative coding, ensures that if something breaks it only affects a minimal part of the system…
For example, this is to check if something is mounted and start/stop the Docker service. Instead of start/stop you could consider putting the node in drain mode.
#!/bin/bash
# Start docker only when the glusterfs share is mounted; stop it otherwise.
mount="/mnt/shared"
mounttype="/mnt/shared fuse.glusterfs rw"

if grep -qs "$mount" /proc/mounts; then
    echo "It's mounted."
else
    echo "It's not mounted."
    if mount "$mount"; then
        echo "Mount success!"
    else
        echo "Something went wrong with the mount..."
    fi
fi

# Only trust the mount if it shows up as a read-write glusterfs mount.
mountok=0
if grep -qs "$mounttype" /proc/mounts; then
    mountok=1
fi

if [ "$mountok" -eq 1 ]; then
    echo "Making sure docker is started"
    if pgrep -x "dockerd" > /dev/null; then
        echo ".. running"
    else
        echo ".. starting .."
        systemctl start docker
        echo ".. done"
    fi
else
    echo "Making sure docker is stopped"
    if pgrep -x "dockerd" > /dev/null; then
        echo ".. running .."
        systemctl stop docker
        echo ".. done"
    else
        echo ".. stopped"
    fi
fi
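The drain-mode alternative mentioned above would swap the systemctl start/stop calls for something like this (must run on a manager node; using the local hostname as the swarm node name is an assumption that holds when node names match hostnames):

```shell
# While gluster is unhealthy: keep dockerd running, but tell the swarm
# to reschedule this node's tasks elsewhere.
docker node update --availability drain "$(hostname)"

# When gluster is healthy again: let the node accept tasks once more.
docker node update --availability active "$(hostname)"
```

The advantage over stopping dockerd is that the node stays in the swarm and keeps participating in manager quorum while its workloads move away.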
I will take a look and try to make a github repository with the scripts.
If you start with the Pis and need more power, you can always add x64 nodes and do a gradual roll-out…
As a side note, be careful of split-brain: if a swarm node seems unable to rejoin the swarm, leave it for a couple of hours (4-6) before trying to manually fix it… most of the time it fixes itself, and if you do anything, the whole cluster will be broken…
I'm not completely satisfied with Docker; when the swarm breaks (without touching it at all for weeks at a time) you kinda need to rebuild it. So the best thing you can do is have a script to bootstrap all the stacks/containers/services/etc., otherwise you will end up very angry one late night (because it never breaks when you have time to fix it). Being able to destroy the swarm, create it again, join the other nodes and run the bash/perl/whatever script to set up everything is worth its weight in platinum, even if it's tedious to do at first… by the third time the swarm craps itself you'll thank yourself.
I'm considering trying another small cluster as a proof of concept, but I'll probably use Banana Pis (or other versions) with a SATA controller for better speed and reliability. I would not recommend keeping too big of a database on the SD cards…
For GlusterFS this is a good tutorial; just use a folder instead of a brick device.
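Condensed from what such tutorials typically cover, a 3-way replicated volume backed by plain folders might be created like this (hostnames, volume name and brick paths are assumptions):

```shell
# On each node, create a plain folder to use as the brick:
mkdir -p /data/glusterfs/brick1

# From one node, probe the peers and create a replica-3 volume:
gluster peer probe node2
gluster peer probe node3
gluster volume create swarmvol replica 3 \
    node1:/data/glusterfs/brick1 \
    node2:/data/glusterfs/brick1 \
    node3:/data/glusterfs/brick1 \
    force   # 'force' is required when bricks live on the root partition
gluster volume start swarmvol

# Mount it on every node (add to /etc/fstab to survive reboots):
mount -t glusterfs localhost:/swarmvol /mnt/shared
```

Mounting at /mnt/shared matches the path the start/stop script above checks in /proc/mounts.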
When I had a Pi cluster I used good USB drives for GlusterFS, and, if you can afford it, get a 5th Pi.
If you don't have a UPS or any other kind of battery for running the Pis, the SD cards will become corrupt more easily. Even a good 2 A phone battery that does not use a button to turn on might be enough to save your files a couple of times… I've had good luck with Samsung 10,000 mAh batteries and Anker ones so far…
Hi everyone,
Building on quasar66's description of his setup, I have created a similar solution, which is set up easily using Ansible playbooks. It runs on Docker swarm and creates a stack running Home Assistant, MariaDB in a Galera cluster and a Mosquitto broker.
The project is called HAHA - Highly Available Home Assistant, you can check it out in this thread: HAHA - Highly Available Home Assistant
Thanks to quasar66, I hope the project helps you out!
Thank you for the description of your design @quasar66. Can you please explain how 'all devices are either from MQTT or use another type of network detection, so it does not matter where the device is' works? If a Z-Wave based sensor is connected to node n1 and n1 fails, how would you get events from the sensor?
You wouldn't. What they mean is that the devices in use are controlled over the network, and as such can be accessed from any node, in contrast to a Z-Wave stick or other directly connected device, which can only be attached to a single host.
devices = the computers HA runs on? or the Z-Wave devices?
Hi, I was wondering if any of you guys had this working not in Docker, but running the OS version of Home Assistant. I asked for a feature request:
https://community.home-assistant.io/t/home-assistant-cluster-feature-request/377975/4
as I run HA Supervised OS in a VM under Unraid.
My current solution: I set it up, shut it down, copy it to my 2nd Unraid box, power up the VM and change its name to backupHome, while my main is HomeAssistant,
so I have a failover…
Have you guys been able to tackle it in the OS, so that when you create a dashboard it instantly updates the dashboards on the backup Home Assistants? Does that Docker approach work for the OS version too?
So I could have like 3 identical KVM VMs with the OS version running… as the Docker version of HA in Unraid is not as good as the VM.