As promised, here is a guide on how to set up Pacemaker on a couple of Raspberry Pis and run HA in a cluster.
Some prerequisites and assumptions:
English is not my first language.
This is a rough guide based on my notes.
I made many attempts and my progress was based on trial and error so please take this into account.
I will soon do a clean installation from scratch, and that will be a good chance to write a precise guide (even for myself).
At the time of writing, the Python version on the latest Raspberry Pi OS (Buster) needs to be upgraded in order to run HA core.
How to do this is outside the scope of this guide. I compiled Python 3.9.2 from source.
All installations have to be done on both nodes.
You need to define 3 IP addresses: one for the VIP (virtual IP) and one for each node. These IPs need to be static.
Host names must resolve to IP addresses, either through DNS or a similar mechanism.
Since I use OpenWrt on my routers, I defined static host names there.
On my installation I defined the following:
ha-berry.lan 192.168.64.50 (virtual IP)
node-a.lan 192.168.64.51
node-b.lan 192.168.64.52
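If your router cannot hand out static host names, a possible alternative is to put the three names into /etc/hosts on both nodes. This is only a sketch and not what I used myself: the function name hosts_entries and the short aliases (ha-berry, node-a, node-b) are assumptions added for illustration. It only prints the entries so that you can review them first.

```shell
# Print /etc/hosts entries for the virtual IP and the two nodes.
# Addresses and .lan names are the ones used in this guide; the short
# aliases in the third column are an assumption added for convenience.
hosts_entries() {
    # printf reuses the format string for every group of three arguments.
    printf '%s %s %s\n' \
        192.168.64.50 ha-berry.lan ha-berry \
        192.168.64.51 node-a.lan node-a \
        192.168.64.52 node-b.lan node-b
}

# Review the output first, then append it on both nodes, for example:
#   hosts_entries | sudo tee -a /etc/hosts
hosts_entries
```

Whichever method you choose, both nodes must be able to resolve each other's name before you continue.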
Install the Pacemaker stack
sudo apt-get update
sudo apt-get upgrade
sudo reboot
sudo apt-get install pacemaker
(this will also install corosync)
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
cluster-glue corosync docutils-common fence-agents gawk libcfg7 libcib27 libcmap4
libcorosync-common4 libcpg4 libcrmcluster29 libcrmcommon34 libcrmservice28 libcurl3-gnutls
libdbus-glib-1-2 libimagequant0 libjbig0 libknet1 liblcms2-2 liblrm2 liblrmd28 libltdl7
liblzo2-2 libmariadb3 libnet-telnet-perl libnet1 libnspr4 libnss3 libopenhpi3 libopenipmi0
libpaper-utils libpaper1 libpe-rules26 libpe-status28 libpengine27 libpils2 libplumb2
libplumbgpl2 libqb0 libquorum5 libsensors-config libsensors5 libsgutils2-2 libsigsegv2
libsnmp-base libsnmp30 libstatgrab10 libstonith1 libstonithd26 libtiff5 libtimedate-perl
libtransitioner25 libvotequorum8 libwebp6 libwebpdemux2 libwebpmux3 libxml2-utils libxslt1.1
mariadb-common mysql-common openhpid pacemaker-cli-utils pacemaker-common
pacemaker-resource-agents python3-asn1crypto python3-boto3 python3-botocore
python3-cffi-backend python3-cryptography python3-dateutil python3-docutils python3-fasteners
python3-googleapi python3-httplib2 python3-jmespath python3-monotonic python3-oauth2client
python3-olefile python3-openssl python3-pexpect python3-pil python3-ptyprocess python3-pyasn1
python3-pyasn1-modules python3-pycurl python3-pygments python3-roman python3-rsa
python3-s3transfer python3-sqlalchemy python3-sqlalchemy-ext python3-suds python3-uritemplate
resource-agents sg3-utils sgml-base snmp xml-core xsltproc
Suggested packages:
ipmitool python3-adal python3-azure python3-keystoneauth1 python3-keystoneclient
python3-novaclient gawk-doc liblcms2-utils lm-sensors snmp-mibs-downloader crmsh | pcs
python-cryptography-doc python3-cryptography-vectors docutils-doc fonts-linuxlibertine
| ttf-linux-libertine texlive-lang-french texlive-latex-base texlive-latex-recommended
python-openssl-doc python3-openssl-dbg python-pexpect-doc python-pil-doc python3-pil-dbg
libcurl4-gnutls-dev python-pycurl-doc python3-pycurl-dbg python-pygments-doc
ttf-bitstream-vera python-sqlalchemy-doc python3-psycopg2 python3-mysqldb python3-fdb
sgml-base-doc debhelper
The following NEW packages will be installed:
cluster-glue corosync docutils-common fence-agents gawk libcfg7 libcib27 libcmap4
libcorosync-common4 libcpg4 libcrmcluster29 libcrmcommon34 libcrmservice28 libcurl3-gnutls
libdbus-glib-1-2 libimagequant0 libjbig0 libknet1 liblcms2-2 liblrm2 liblrmd28 libltdl7
liblzo2-2 libmariadb3 libnet-telnet-perl libnet1 libnspr4 libnss3 libopenhpi3 libopenipmi0
libpaper-utils libpaper1 libpe-rules26 libpe-status28 libpengine27 libpils2 libplumb2
libplumbgpl2 libqb0 libquorum5 libsensors-config libsensors5 libsgutils2-2 libsigsegv2
libsnmp-base libsnmp30 libstatgrab10 libstonith1 libstonithd26 libtiff5 libtimedate-perl
libtransitioner25 libvotequorum8 libwebp6 libwebpdemux2 libwebpmux3 libxml2-utils libxslt1.1
mariadb-common mysql-common openhpid pacemaker pacemaker-cli-utils pacemaker-common
pacemaker-resource-agents python3-asn1crypto python3-boto3 python3-botocore
python3-cffi-backend python3-cryptography python3-dateutil python3-docutils python3-fasteners
python3-googleapi python3-httplib2 python3-jmespath python3-monotonic python3-oauth2client
python3-olefile python3-openssl python3-pexpect python3-pil python3-ptyprocess python3-pyasn1
python3-pyasn1-modules python3-pycurl python3-pygments python3-roman python3-rsa
python3-s3transfer python3-sqlalchemy python3-sqlalchemy-ext python3-suds python3-uritemplate
resource-agents sg3-utils sgml-base snmp xml-core xsltproc
0 upgraded, 100 newly installed, 0 to remove and 0 not upgraded.
Need to get 21.5 MB of archives.
sudo apt-get install pcs
The user pi needs to be a member of the haclient group on both nodes:
sudo usermod -a -G haclient pi
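A new group membership only applies to sessions started after the change, so it is worth checking that it took effect before you go on. A minimal sketch of such a check; the helper name in_group is made up for this example.

```shell
# Succeed if user $1 is a member of group $2.
# `id -nG USER` prints all group names of that user on one line.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}

# Example: run this on each node after the usermod command.
if in_group pi haclient; then
    echo "pi is in haclient"
else
    echo "pi is NOT in haclient (yet)"
fi
```

If the group does not show up in your current shell, log out and log in again.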
pi@node-a:~ $ pcs client local-auth
Username: hacluster
Password:
localhost: Authorized
pi@node-a:~ $ pcs host auth node-a node-b
Username: hacluster
Password:
node-a: Authorized
node-b: Authorized
pcs host auth node-a addr=192.168.64.51 node-b addr=192.168.64.52
sudo pcs cluster setup haberry node-a addr=192.168.64.51 node-b addr=192.168.64.52 --force
Warning: node-a: Running cluster services: 'corosync', 'pacemaker', the host seems to be in a cluster already
Warning: node-a: Cluster configuration files found, the host seems to be in a cluster already
Warning: node-b: Running cluster services: 'corosync', 'pacemaker', the host seems to be in a cluster already
Warning: node-b: Cluster configuration files found, the host seems to be in a cluster already
Destroying cluster on hosts: 'node-a', 'node-b'...
node-b: Successfully destroyed cluster
node-a: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'node-a', 'node-b'
node-a: successful removal of the file 'pcsd settings'
node-b: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'node-a', 'node-b'
node-a: successful distribution of the file 'corosync authkey'
node-a: successful distribution of the file 'pacemaker authkey'
node-b: successful distribution of the file 'corosync authkey'
node-b: successful distribution of the file 'pacemaker authkey'
Synchronizing pcsd SSL certificates on nodes 'node-a', 'node-b'...
node-a: Success
node-b: Success
Sending 'corosync.conf' to 'node-a', 'node-b'
node-a: successful distribution of the file 'corosync.conf'
node-b: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.
Now try to start the cluster
pi@node-a:~ $ sudo pcs cluster start --all
node-a: Starting Cluster...
node-b: Starting Cluster...
pi@node-a:~ $ sudo pcs status cluster
Cluster Status:
Stack: corosync
Current DC: node-b (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Mon Mar 1 22:25:52 2021
Last change: Mon Mar 1 22:25:43 2021 by hacluster via crmd on node-b
2 nodes configured
0 resources configured
PCSD Status:
node-a: Online
node-b: Online
pi@node-a:~ $ sudo pcs status nodes
Pacemaker Nodes:
Online: node-a node-b
Standby:
Maintenance:
Offline:
Pacemaker Remote Nodes:
Online:
Standby:
Maintenance:
Offline:
pi@node-a:~ $ sudo pcs status corosync
Membership information
Nodeid Votes Name
1 1 node-a (local)
2 1 node-b
pi@node-a:~ $ sudo pcs status
Cluster name: haberry
WARNINGS:
No stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: node-b (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Mon Mar 1 22:28:43 2021
Last change: Mon Mar 1 22:25:43 2021 by hacluster via crmd on node-b
2 nodes configured
0 resources configured
Online: [ node-a node-b ]
No resources
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
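If you want to check from a script that both nodes are online, you can pick the "Online:" line out of the status output. This is a rough sketch that assumes the output format shown above (Pacemaker 2.0.1); newer versions may format the line differently. The function names online_nodes and all_online are my own.

```shell
# Print the node names from the "Online: [ ... ]" line of `pcs status`
# output read on stdin, one name per line.
online_nodes() {
    sed -n 's/^[[:space:]]*Online: \[ \(.*\) \][[:space:]]*$/\1/p' | tr ' ' '\n'
}

# Succeed only if every node given as an argument is listed as online.
# The status text is read from stdin.
all_online() {
    online="$(online_nodes)"
    for node in "$@"; do
        printf '%s\n' "$online" | grep -qxF "$node" || return 1
    done
}

# Intended use on a node (needs the running cluster from above):
#   sudo pcs status | all_online node-a node-b && echo "both nodes online"
```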
Since no STONITH device exists on this cluster (for now), you need to disable STONITH and also tell the cluster to ignore a loss of quorum:
sudo pcs property set stonith-enabled=false
sudo pcs property set no-quorum-policy=ignore
pi@node-a:~ $ sudo pcs property
Cluster Properties:
cluster-infrastructure: corosync
cluster-name: haberry
dc-version: 2.0.1-9e909a5bdd
have-watchdog: false
no-quorum-policy: ignore
stonith-enabled: false
Now you need to create a cluster resource for the virtual IP address:
sudo pcs resource create virtual_ip ocf:heartbeat:IPaddr2 ip=192.168.64.50 cidr_netmask=32 op monitor interval=5s
pi@node-a:~ $ sudo pcs status resources
virtual_ip (ocf::heartbeat:IPaddr2): Started node-a
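Besides asking the cluster, you can also look at the network interfaces of a node to see whether it currently holds the virtual IP. A small sketch, assuming the one-line output format of `ip -o -4 addr show`; the helper name has_ipv4 is made up for this example.

```shell
# Succeed if the IPv4 address $1 appears in `ip -o -4 addr show` output
# read on stdin. In that one-line format the fourth field is ADDRESS/PREFIX.
has_ipv4() {
    awk -v want="$1" '
        { split($4, parts, "/"); if (parts[1] == want) found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Intended use on each node:
#   ip -o -4 addr show | has_ipv4 192.168.64.50 && echo "this node owns the VIP"
```

Only the node where the virtual_ip resource is started should report the address.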
You can use the following commands to manually move the resource from one node to another and test that the VIP ownership changes
sudo pcs node standby node-a
sudo pcs node unstandby node-a
sudo pcs node standby node-b
sudo pcs node unstandby node-b
If you want to manually force a failover, you can use this command to stop a node
sudo pcs cluster stop node-a
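To get a rough idea of how long the virtual IP is unreachable during such a test, you can poll it from a third machine. This is just a sketch: wait_until is a hypothetical helper, it only counts whole seconds of sleeping, and the ping options shown in the comment assume the Linux (iputils) version of ping.

```shell
# Run the given command once per second until it succeeds or $1 seconds
# have passed. On success, print how many seconds it slept in between;
# on timeout, print nothing and return 1.
wait_until() {
    limit="$1"; shift
    waited=0
    until "$@" >/dev/null 2>&1; do
        [ "$waited" -ge "$limit" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    echo "$waited"
}

# Intended use from another machine, right after putting a node in standby:
#   wait_until 30 ping -c 1 -W 1 192.168.64.50
```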
Now we need to create a resource for the HA service so that it becomes managed by the cluster.
The HA service must remain disabled in systemd so that it is the cluster that decides when to start and stop it:
pcs resource create homeassistant systemd:[email protected]
pcs resource create clustersync systemd:clustersync.service --group ha_group
To create a group for your resources, so that they always run together on the same node:
sudo pcs resource group add ha_group virtual_ip homeassistant
This guide is missing an additional resource (service) that I use to synchronize the HA profile across the nodes. I will cover that in another guide.