I’ve recently moved my Home Assistant from Docker to Kubernetes. I’m now running Home Assistant (and other apps) on a two-node bare-metal k3s High Availability cluster with no major problems. My k3s High Availability setup uses an external MySQL database as its datastore, which runs on a VM. I use HAProxy on a pfSense firewall to load balance between the two k3s nodes.
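For anyone wanting to try the external datastore part: k3s servers are pointed at the database with the `--datastore-endpoint` option. Below is a minimal sketch of how that value is shaped for MySQL. The user, password, host and database names are placeholders I made up, not values from this setup.

```python
def mysql_datastore_endpoint(user: str, password: str, host: str,
                             database: str, port: int = 3306) -> str:
    """Build a k3s --datastore-endpoint value for an external MySQL server."""
    # k3s uses the form mysql://user:password@tcp(host:port)/database
    return f"mysql://{user}:{password}@tcp({host}:{port})/{database}"


def k3s_server_command(endpoint: str) -> list:
    """Return the argument list for starting a k3s server against that datastore."""
    return ["k3s", "server", f"--datastore-endpoint={endpoint}"]


# Hypothetical values for illustration only.
example_endpoint = mysql_datastore_endpoint("k3s", "secret", "192.168.1.20", "k3s")
example_command = k3s_server_command(example_endpoint)
print(" ".join(example_command))
```

Each server node would be started with the same endpoint so that they share cluster state through the database.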
My HA config directory is on an NFS share bind-mount and I have a separate Longhorn persistent volume for the HA recorder/history sqlite database.
The Rancher and Longhorn UIs make deploying and managing application workloads extremely easy. And high availability works great. If one of the nodes is unreachable for 5 min, the applications on the failed node spin up automatically on the surviving node.
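The 5 min delay most likely comes from the Kubernetes default: pods get tolerations for the `node.kubernetes.io/not-ready` and `node.kubernetes.io/unreachable` taints with `tolerationSeconds: 300`. A minimal sketch of building a shorter pair of tolerations for a pod spec follows. That this setup relies on the default is my assumption, and the 60 second value is only an example.

```python
def failover_tolerations(seconds: int = 300) -> list:
    """Return tolerations that control how long a pod stays bound to a failed node."""
    # These are the two taints Kubernetes applies to a node that stops responding.
    taint_keys = (
        "node.kubernetes.io/not-ready",
        "node.kubernetes.io/unreachable",
    )
    return [
        {
            "key": key,
            "operator": "Exists",
            "effect": "NoExecute",
            # After this many seconds the pod is evicted and can be rescheduled.
            "tolerationSeconds": seconds,
        }
        for key in taint_keys
    ]


# Example: evict after one minute instead of the default five.
pod_spec_fragment = {"tolerations": failover_tolerations(60)}
print(pod_spec_fragment)
```

The returned list would go under `spec.template.spec.tolerations` of the workload. Very short values can cause needless rescheduling on brief network blips.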
Here’s a screenshot of the Rancher UI showing the workloads as well as the HA workload configuration:
Hey @taylormia, this is exactly the setup I am currently planning for! Can you share some details on the hardware you are using?
Also, are you using zigbee or zwave? If so, how are you getting the bridge connected to your cluster? I am currently using zigbee2mqtt (no zwave), so I should be fine if I were to run this on a pi instead of running it inside the cluster, but I would prefer to include zigbee2mqtt in the cluster. Maybe this would be an option? https://community.arm.com/developer/research/b/articles/posts/a-smarter-device-manager-for-kubernetes-on-the-edge
@davosian My cluster nodes are eight-year-old Xeon-based servers running Ubuntu 20.04. I have two Z-Wave networks connected to my HA. One with about 60 devices is on a HomeSeer ZNET (which was the home automation solution I used before migrating to HA). I still use the ZNET and use a custom HS3 component to bridge the ZWave events into HA. I also have a 7 device ZWave2MQTT network on an Aeotec Zwave Stick that is connected to a Pi-clone running Docker. I have no plans to use ZWave directly on my k3s cluster. I have heard that you could attach ZWave hardware to a k3s HA workload if you exposed it to the host network - but I have not tried that.
Interesting setup, thanks for sharing @taylormia! Mine will be similar but instead of zwave2mqtt, it will be based on zigbee2mqtt. I will most likely start with a raspi for running zigbee2mqtt, but I am not too crazy about this idea since it will pretty much be the backbone of my setup and at the same time will be my single point of failure. I guess I have to think it through some more…
I’ve also been considering switching to Kubernetes, so this is very encouraging.
Was there some reason you didn’t use the load balancing built into k3s, or have I got that wrong?
Did you consider an external database for HA rather than Longhorn/SQLite? MariaDB seems to be supported, but I’m sure MySQL would work as well. Or was there another reason that Longhorn was more suitable?
The built-in klipper LB exposes the host IP and ports for the pods and services on each host. You still need an external load balancer to balance traffic to the exposed IP on each node. In a cloud environment the provider will provide/charge for a LB. In an on-premises environment an option is an external bare metal LB like an F5 etc. In a home lab, a software LB like HAProxy or Nginx will suffice. Another great option is to deploy MetalLB on the cluster - which exposes an IP address within a configured range. This solution uses gratuitous ARP to advertise the IP address externally, but in a failover scenario it can be much slower than a solution like HAProxy.
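To make the external LB idea concrete, here is a minimal sketch that renders a plain HAProxy backend stanza balancing across the cluster nodes with health checks. On pfSense the HAProxy package is configured through the GUI, so this only shows what the equivalent raw config might look like. The backend name, node names, IP addresses and port are hypothetical.

```python
from typing import Dict


def render_haproxy_backend(name: str, nodes: Dict[str, str], port: int) -> str:
    """Render an HAProxy backend that round-robins TCP traffic across nodes."""
    lines = [
        f"backend {name}",
        "    mode tcp",
        "    balance roundrobin",
    ]
    for node_name, address in nodes.items():
        # "check" enables health checks so a dead node is taken out of rotation.
        lines.append(f"    server {node_name} {address}:{port} check")
    return "\n".join(lines) + "\n"


# Hypothetical node addresses for illustration only.
example_backend = render_haproxy_backend(
    "k3s_nodes",
    {"node1": "192.168.1.11", "node2": "192.168.1.12"},
    443,
)
print(example_backend, end="")
```

With health checks enabled, HAProxy stops sending traffic to a node as soon as its checks fail, which is why failover here tends to be quicker than waiting for an address to move between nodes.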
I prefer using Longhorn because it is easy to set up, is redundant since it replicates the volume and failover works very well. My local storage on each node using Longhorn is a fast 2TB SSD. I don’t need anything more than SQLite as a DB. I’m using a Longhorn persistent volume because the HA recorder/history database can get corrupted if located on the bind-mounted NFS share - where the rest of my config files are located.
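The split between the two kinds of storage can be sketched as the pieces of a workload manifest, built here as Python dicts. This is a minimal sketch under assumptions: it uses a Kubernetes `nfs` volume for the config directory, the default `longhorn` storage class name, and made-up names, sizes, server address and mount paths. None of these are taken from the actual configuration.

```python
def recorder_pvc(name: str = "ha-recorder", size: str = "5Gi") -> dict:
    """PersistentVolumeClaim for the recorder database, backed by Longhorn."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            # "longhorn" is the storage class Longhorn creates by default.
            "storageClassName": "longhorn",
            "resources": {"requests": {"storage": size}},
        },
    }


def home_assistant_storage(nfs_server: str, nfs_path: str, pvc_name: str) -> dict:
    """Volumes and mounts: config files on NFS, recorder database on Longhorn."""
    volumes = [
        {"name": "config", "nfs": {"server": nfs_server, "path": nfs_path}},
        {"name": "recorder", "persistentVolumeClaim": {"claimName": pvc_name}},
    ]
    volume_mounts = [
        {"name": "config", "mountPath": "/config"},
        # Hypothetical mount path, kept outside /config so SQLite never touches NFS.
        {"name": "recorder", "mountPath": "/data"},
    ]
    return {"volumes": volumes, "volumeMounts": volume_mounts}


# Hypothetical NFS server and export path for illustration only.
example_pvc = recorder_pvc()
example_storage = home_assistant_storage("192.168.1.30", "/export/ha-config", "ha-recorder")
print(example_storage)
```

With a layout like this, the recorder would also need its `db_url` pointed at the Longhorn mount, for example `sqlite:////data/home-assistant_v2.db`.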
Kubernetes adoption would be an easy sell if it weren’t such a resource hog. Even the edge solutions (k3s, microk8s) take up half a GB of RAM and 10-20% of CPU before you run a single service. That’s just plain ugly if one is on a 1GB or 2GB SoC (which is a ton of the Home Assistant user base).
I’m inclined to agree; it’s pushing it on a low-end Pi.
That said, perhaps HA has outgrown the smaller Pi being the baseline? The HA Blue has four cores and 4GB of RAM much like a Pi 4. I think (without any numbers to back it up) that a lot of people who get serious with HA end up having to move to a gruntier machine once they start stacking on features. Now that HA is maturing a bit, this migration seems inevitable given all the new juicy addons.
Perhaps the lower end devices should be recommended for a basic introductory setup, but once you commit, you need to move to the Serious Level; more RAM, more cores, enough to run Kubernetes.
@ianjs Well said. I agree 100%. There should be choice of platforms for different types of HA users…from the low to high end. Also, kubernetes is becoming the de-facto container orchestration standard…would be good for HA development to get ahead of the curve. It shouldn’t be a choice of supervisor vs kubernetes…but rather to add kubernetes as a supported platform.
I might have another poke at Rancher. I kept hearing that was the smoothest way to go, but I got stuck somewhere and went back to hacking on Proxmox to get some of my containers running. Your setup certainly looks like where I wanted to be.
I definitely need some “orchestration” now that HA has become an essential service - downtime is not acceptable and impinges on the WAF for future projects
I use microk8s for my cluster with 3 arm64 master nodes (odroid N2) and 2 x86_64 workers (Intel NUC clones). With version 1.19 they added high availability as the default (once 3 nodes are available it activates automatically) so I do not have to do any special setup for it. They support MetalLB as load balancer and Multus to access the host network without conflicting with the host ports.
It is running quite stable for all my needs but it does not support arm 32 bit OSes. My test cluster is on ODROID-HC1 so I am contributing support for armhf to microk8s if someone is interested: https://github.com/ubuntu/microk8s/issues/719 (pull request being worked out with Canonical)
Is anyone willing to share their Kubernetes manifests / configuration? I am mainly interested in the base setup with things like load balancing, ingress, certificate handling.
Btw, if someone wants to experiment with Kubernetes, I can highly recommend Civo with its #kube100 project: it offers k3s for free ($80 monthly credit - enough for a 3 node medium size cluster) while in public beta: civo.com (referral link). It is their vision to offer developer friendly Kubernetes services and ecosystem.
Hi @davosian, I’ve been running home-assistant (along with node-red, mqtt server, zwave2mqtt, etc) in a kubernetes cluster for about 1.5 years with great success. Most of those components are deployed as helm charts from the k8s-at-home charts repo, using the gitops approach via flux2.
If you’re interested, my home-assistant kubernetes configuration is located at https://github.com/billimek/k8s-gitops/tree/master/default/home-assistant.
There’s a fairly active discord community I belong to dealing with all things kubernetes at home, but I’m not sure on the rules of promotion in this forum so I’ll avoid linking it unless that’s ok to do so.
I’d recommend a couple of good youtube videos and the accompanying github documentation on installing k3s High Availability and installing Rancher. It was pretty easy to follow and was easier than I thought it would be.
Damn. There’s an overwhelming number of ways to slice Kubernetes.
Every time I pin one down, someone points to something like Flux and I think “Yeah, that’s awesome… let’s do that”. Thanks for the pointer - I’ll check it out.
@billimek, these are great resources around k8s-at-home. Thanks for sharing! Also the discord community is linked inside your profile, so I was able to join.
I am thinking of setting up a 3 node cluster (Intel NUCs) with k3s running on Proxmox. I am currently waiting for the hardware to arrive (next week) to get started.
In the meantime, I will brush up my Kubernetes know-how with the Youtube course from @taylormia. Thanks for sharing
I think this is exactly the biggest challenge: there is not one best way. There are many different paths one can take with many options to choose from and it is easy to get lost in the jungle.
Just as I was looking at which GitHub model to pick for my cluster and updating my charts, I got your post - very timely!
I especially like the idea of those projects that are a kind of template to deploy your own K8S cluster with HA on top. They really help to flatten the initial learning curve for adopting K8S for HA.
For my first cluster I had to spend weeks writing/tuning Ansible scripts around kubeadm and now with microk8s it took me just one day to automate. This is the way to make this available to more people and also be easier to maintain (so we can create more charts!)
@taylormia I notice that you use Longhorn to share the database, but an NFS bind mount for the config directory.
Could you have used Longhorn for both? Or was that just the way your setup evolved?
@ianjs Yes, you could use Longhorn for both. The reason I don’t is that with a Longhorn volume - I haven’t found a way to get out of band access to the files on that volume. So, for example, if I wanted to add or delete config files - I couldn’t. I don’t need to access the recorder database - so it’s fine located in a Longhorn volume and it also prevents corruption since it’s not on an NFS bind mount.
If you happen to find a way to access files in a Longhorn volume from outside HA - please let me know. I have tried installing an NFS and SSH server as a sidecar app in the same workload as HA - but it doesn’t work consistently.