`monit` for monitoring HA (including restart) and more

This is still a work in progress, but I prefer to continue the discussion here rather than in a git issue.

The monit tool is generally used to monitor services on a machine and restart them when necessary. For instance, I use it to monitor apache2, mysql, proprietary services, applications, etc. on servers and on embedded systems running *nix.

I discovered it can be installed on HA OS/Core using the “Terminal + SSH” add-on. I haven’t configured it yet, but in order to document it for the community, I started writing a script.

To enable monit in “Terminal + SSH”, you have several options.
You can add it to the ‘apks’ list of the “Terminal + SSH” configuration, as shown in this example, which also includes nmap:

authorized_keys:  []
apks:
  - monit
  - nmap
password: ''
server:
  tcp_forwarding: false

You can also install packages from the command line:

# Install monit
apk add monit
# Look for packages
apk search sqlite
# Install a package - sqlite in fact installs sqlite3
#    - only specify the main part of the package name.
#    (sqlite could be helpful to check that the database is still updating).
apk add sqlite

After saving the configuration and restarting the add-on, monit should be available in the terminal.
You can verify that monit is running and accessible from CLI:

monit summary

The script below is a work in progress and has not been tested at all yet.
What it will do once it is finished and tested:

  • Create /etc/monit.d to put the configurations into in separate files.
  • Modify monitrc to include the files in that directory.
  • Add some configurations for monitoring services, including home assistant.
  • Add some monitoring to monitor cpu load (and restart HA).
  • Reload the configuration to include the new configurations.
  • Report a status (which will correspond to “Initializing”) to check that monit is ok.

# Host and port for HA:
#  `curl http://homeassistant:8123` returns the home page
#  `curl http://homeassistant:8123/api/` returns '401: Unauthorized' (not ... Forbidden)
#
# Or set another working host; sometimes the public host works: `curl https://publichaurl`
#  Change PROTOCOL from http to https if needed
#  (Testing https also tests the nginx proxy)

HA_HTTP_HOST=homeassistant
HA_HTTP_PORT=8123
PROTOCOL=http
# To skip the HA check, set HA_HTTP_HOST to an empty value

EMAILLOGIN="YOUREMAILLOGIN"
EMAILPASSWORD="YOUREMAILPASSWORD"
MAILSERVER=YOURMAILSERVER


mkdir -p /etc/monit.d

# Uncomment the line in monitrc that includes /etc/monit.d
sed -i 's@#\(\s*include\s*/etc/monit.d\)@\1@' /etc/monitrc
sed -i 's@set log syslog@set log /var/log/monit.log@' /etc/monitrc

if [ "$HA_HTTP_HOST" != "" ] ; then 
  cat > /etc/monit.d/ha <<EOHA
  check host ha with address $HA_HTTP_HOST
    # with pidfile PATHTOPIDFILE
    if failed
       port $HA_HTTP_PORT protocol $PROTOCOL
       # Expect unauthorized, shows that the server is up
       # Could check status if API key provided.
       #and request /api/ with content = "401: Unauthorized" with timeout 20 seconds
       and request / with timeout 20 seconds
       then
         # Alert, for testing
         alert
         # Change to exec when confident that this works
         # exec "/usr/bin/ha core restart"
    group ha
EOHA
fi


# Alert when the CPU load stays high for 10 cycles, restart HA after 15 cycles
cat > /etc/monit.d/checkload <<EOCPUHIGH
check system homeassistant.local
  if loadavg (1min) per core > 1.5 for 10 cycles then alert
  if loadavg (1min) per core > 1.5 for 15 cycles then exec "/usr/bin/ha core restart"
EOCPUHIGH

# It's possible to set up mail alerts.
# Example for server using TLS on port 587:



cat > /etc/monit.d/mailserver <<EOMS
set mailserver
      $MAILSERVER port 587 username "$EMAILLOGIN" password "$EMAILPASSWORD" using tlsv1
EOMS

monit
monit reload
sleep 1
monit summary
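Because the script generates several files under /etc/monit.d, it may be worth checking the syntax before reloading. The sketch below relies on `monit -t`, which runs a syntax check of the control file. The function name `reload_if_valid` and the `MONIT` override variable are made up for this illustration and are not part of the script above.

```shell
#!/bin/sh
# MONIT is a hypothetical override so the function can be tried with a stub;
# by default it is the monit binary found on PATH.
MONIT="${MONIT:-monit}"

# Reload monit only when its own syntax check (-t) accepts the control file.
reload_if_valid() {
  if "$MONIT" -t; then
    "$MONIT" reload
  else
    echo "monit configuration has errors, not reloading" >&2
    return 1
  fi
}
```

Calling `reload_if_valid` in place of the plain `monit reload` keeps the running configuration untouched when one of the generated files contains a mistake.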

Excellent, thanks.

I use monit on other servers by exposing the builtin httpd daemon and allowing access from my subnet.

set httpd port 2812 
    allow 192.168.X.0/255.255.255.0        # allow hosts on my subnet to connect to the server

and then pulling the xml data

http://192.168.7.XX:2812/_status?format=xml

in via Node-Red to be processed.

I tried to work out how to pull that XML into HA a while back, without success.

I have not used the RESTful interface yet, but this thread may help: it confirms that some kind of conversion from XML takes place and may indicate how it happens.

My idea was not so much to show monit’s status in the UI as to restart ha core when monit can detect an issue that the supervisor can’t or won’t handle because of a design choice.

[Note: my regular computer broke down yesterday so it will take a while before I finish the script]


Ah, OK, that makes sense. There are other routes for getting this sort of data.

From the last issue with DST, it looks like prolonged high CPU use might be another trigger.

Yep, that’s in the list: “Add some monitoring to monitor cpu load (and restart HA)”.


I worked on the setup script, and it now works for me.

Note the following code:

         # Alert, for testing
         alert
         # Change to exec when confident that this works
         # exec "/usr/bin/ha core restart"

I have set the default action to “alert” instead of restarting ha so that you can verify that this works for your setup.
Only switch to the restart action once monit summary shows the ha entry with status OK and type Remote Host.


I am not certain that the CPU load reported inside the docker container of the SSH terminal reflects the real load of the system (I haven’t really checked).
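One way to look into this would be to print the per-core load that the container itself sees and compare it with a figure obtained elsewhere for the host. The sketch below only reads /proc/loadavg and divides the 1-minute value by the core count; the function name `load_per_core` is made up for this illustration, and it assumes `nproc` is available in the container.

```shell
#!/bin/sh
# Print the 1-minute load average divided by the number of cores.
# $1: contents of /proc/loadavg, $2: number of CPU cores
load_per_core() {
  echo "$1" | awk -v cores="$2" '{ printf "%.2f\n", $1 / cores }'
}

# Only query the live system when /proc/loadavg is readable.
if [ -r /proc/loadavg ]; then
  load_per_core "$(cat /proc/loadavg)" "$(nproc)"
fi
```

The printed value is the same kind of number that the `loadavg (1min) per core > 1.5` rule compares against, so it gives a feel for how close a system is to the threshold.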

The monitoring can be extended by using the ha command.

  • ha observer stats could be checked because it gives:

blk_read: 14163968
blk_write: 4096
cpu_percent: 0
memory_limit: 1969250304
memory_percent: 0.2
memory_usage: 3895296
network_rx: 12097771
network_tx: 834870

  • ha host log could help check for errors in the dmesg log. Next time something like the DST error occurs, this log should also be retrieved to check whether it provides hints about the issue.
  • ha host reboot could be used to restart the system if ha core restart did not resolve it.
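As a rough sketch of how the `ha observer stats` output could feed a check, the helpers below pick one field out of the `key: value` lines shown above and fail when it exceeds a limit. The function names and the limit of 80 are made up for this illustration; nothing here is part of the setup script.

```shell
#!/bin/sh
# Extract the value of field $1 from "key: value" lines on stdin.
stat_value() {
  awk -F': *' -v key="$1" '$1 == key { print $2 }'
}

# Succeed when value $1 is at or below limit $2, fail otherwise.
below_limit() {
  awk -v v="$1" -v limit="$2" 'BEGIN { exit !(v + 0 <= limit + 0) }'
}

# Read stats on stdin; fail when field $1 is missing or exceeds limit $2.
check_stat() {
  value=$(stat_value "$1")
  [ -n "$value" ] && below_limit "$value" "$2"
}
```

A wrapper script could then run `ha observer stats | check_stat memory_percent 80` and pass the result on as its exit status. Monit can run such a script through a `check program ... with path ...` entry and react with `if status != 0 then alert`.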

I made the script available as a GitHub gist. Gists can be forked, but I am not aware of a GitHub way to make pull requests against them. I could still merge changes, though, using pure git methods.

I recommend that you put your configuration in setupMonit.conf in the same directory as the script. That way the script and local settings stay separate.
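For illustration, this is roughly how a script can pick up such a file. It is a minimal sketch and not necessarily how the gist does it; the function name `load_conf` is made up here.

```shell
#!/bin/sh
# Defaults, used when setupMonit.conf does not override them.
HA_HTTP_HOST=homeassistant
HA_HTTP_PORT=8123
PROTOCOL=http

# Source setupMonit.conf from directory $1 if the file exists there.
load_conf() {
  conf="$1/setupMonit.conf"
  if [ -f "$conf" ]; then
    . "$conf"
  fi
}

# Look for the local settings next to the script itself.
load_conf "$(dirname -- "$0")"
```

Because the file is sourced after the defaults are set, it only needs to contain the values that differ, and updating the script never touches the local settings.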

Monit does not seem to start on reboot. This still needs to be resolved somehow.

My setupMonit.conf file looks something like this:

HA_HTTP_HOST=homeassistant
HA_HTTP_PORT=8123
PROTOCOL=http
EMAILLOGIN="[email protected]"
EMAILPASSWORD="realpassword"
MAILSERVER=provider.mailserver.net