Keeper: HA and MQTT service manager

Hi,


Just sharing this in case anyone needs a service to keep an eye on MQTT and HA.

Nuno


This is cool. It seems more traditional control-systems methodologies are making their way into HA now.

Good job!

This may seem overzealous, but all my sensors and automations rely on MQTT, so I decided to keep an eye on it :). With any luck, “System restarts” will never happen.

Very nice idea, thanks for sharing!
Would it be too much to ask for a Docker installer?


This is quite good.

  1. How could I incorporate Pushover to send me a message to let me know when Home Assistant or MQTT service has been restarted?
  2. How could I incorporate a counter with date and time within Home Assistant to let me know when Home Assistant or MQTT service has been restarted?

Currently I have no mechanism to send notifications. There are a couple of implementations we could use to achieve this:

  1. Send notifications through Keeper
  2. Send notifications through HA

I’m open to suggestions.
As for metrics, I’ve added 3 new metrics: the last restart times for MQTT, HA and the system.

No, I’ll update this post once I’ve completed the installer.


How do you get these sensors into hass?

They are added automatically. However, discovery must be enabled in HA. For instance, this is my layout (Lovelace):

- type: entities
  title: Keeper
  entities:
    - entity: sensor.kconnectorstatus
      name: Connector
      icon: mdi:access-point
    - entity: sensor.kheartbeaterstatus
      name: Heartbeater
      icon: mdi:heart-pulse
    - entity: sensor.kreporterstatus
      name: Reporter
      icon: mdi:information-outline
    - type: section
    - entity: sensor.kmqttconnectionstatus
      name: MQTT Connection Status
      icon: mdi:network
    - entity: sensor.kmqttfailedconnections
      name: Failed MQTT connections
      icon: mdi:sync-alert
    - entity: sensor.kmissedheartbeat
      name: Missed Heartbeats
      icon: mdi:pipe-leak
    - entity: sensor.kmqttrestarts
      name: MQTT Restarts
      icon: mdi:restart
    - entity: sensor.kharestarts
      name: HA Restarts
      icon: mdi:restart
    - entity: sensor.ksystemrestarts
      name: System Restarts
      icon: mdi:server
    - entity: sensor.klastheartbeat
      name: Last Heartbeat
      icon: mdi:calendar-clock
    - entity: sensor.klastmqttrestart
      name: Last MQTT Restart
      icon: mdi:calendar-clock
    - entity: sensor.klastharestart
      name: Last HA Restart
      icon: mdi:calendar-clock
    - entity: sensor.klastsystemrestart
      name: Last System Restart
      icon: mdi:calendar-clock

I’ve added a simple Dockerfile. It’s not tested yet, as I haven’t had the time :)

Thank you for incorporating those features.

One additional feature you could consider is a time window over which missed messages/heartbeats are counted, so that isolated misses spread over an extended period do not trigger a restart.

Actually, a restart only happens after 3 missed heartbeats. The config file has a heartbeat interval, which should match the interval used in the HA automation, and a delay. A heartbeat is considered missed if the time elapsed since the last heartbeat is greater than interval + delay. You can adjust these values; lower values give tighter monitoring. Is this what you were asking about?
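For anyone who prefers to read this as code, here is a minimal Python sketch of the rule described above. The names (`is_missed`, `should_restart`, `MAX_MISSES`) and the sample values are made up for illustration and are not taken from Keeper's source; only the `interval + delay` comparison and the threshold of 3 come from the description.

```python
from datetime import datetime, timedelta

# Hypothetical names and values, for illustration only.
MAX_MISSES = 3  # a restart happens after 3 missed heartbeats, per the post


def is_missed(last_heartbeat: datetime, now: datetime,
              interval: timedelta, delay: timedelta) -> bool:
    """A heartbeat counts as missed once more than interval + delay has elapsed."""
    return now - last_heartbeat > interval + delay


def should_restart(misses: int) -> bool:
    """A restart is due once the miss counter reaches the threshold."""
    return misses >= MAX_MISSES


interval = timedelta(seconds=30)  # assumed value; should match the HA automation
delay = timedelta(seconds=10)     # assumed value
last = datetime(2019, 1, 1, 12, 0, 0)

print(is_missed(last, last + timedelta(seconds=35), interval, delay))  # False
print(is_missed(last, last + timedelta(seconds=41), interval, delay))  # True
```

With a shorter interval or delay the same gap is flagged sooner, which is the trade-off mentioned above.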

Suppose I miss one heartbeat yesterday, a second today and a third tomorrow: the restart would take place tomorrow, after the third missed heartbeat. That would not be critical and could be classified as intermittent. If, on the other hand, all 3 missed heartbeats occurred within a set 2-minute period, something may be amiss and a restart may be required.

This will not happen because:

  • You receive a heartbeat message at T.
  • At T1 = T + interval + delay no heartbeat has been received, so the number of missed heartbeats is incremented.
  • Since a missed heartbeat was detected at T1, the next check is at T2 = T1 + interval (without the delay).
  • If you receive a heartbeat between T1 and T2, the number of misses is reset.

TL;DR: the miss counter is reset whenever a heartbeat arrives between misses.
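The schedule in the list above can be replayed with a small simulation. This is only a sketch under assumptions: `restart_time` is a hypothetical helper, times are plain seconds with the first heartbeat at T = 0, and scheduling the next check at last heartbeat + interval + delay after a reset is my reading of the steps, not something confirmed from Keeper's code.

```python
def restart_time(heartbeats, interval, delay, max_misses=3, horizon=3600):
    """Return the time (in seconds) at which a restart would fire, or None.

    heartbeats: arrival times in seconds; a heartbeat at T = 0 is assumed.
    """
    misses = 0
    last_check = 0
    next_check = interval + delay  # T1 = T + interval + delay
    while next_check <= horizon:
        # Heartbeats that arrived since the previous check.
        arrived = [t for t in heartbeats if last_check < t <= next_check]
        last_check = next_check
        if arrived:
            misses = 0  # a heartbeat between checks resets the counter
            next_check = max(arrived) + interval + delay
        else:
            misses += 1
            if misses >= max_misses:
                return last_check
            next_check = last_check + interval  # after a miss: no delay
    return None


# Total silence after T = 0: misses at 40, 70 and 100 seconds.
print(restart_time([], 30, 10))  # 100

# Every other heartbeat is lost: the counter never gets past 1.
print(restart_time([60, 120, 180, 240, 300], 30, 10, horizon=300))  # None
```

The second call is the "one miss yesterday, one today, one tomorrow" case in miniature: each miss is followed by a heartbeat, so the count goes back to zero and no restart is triggered.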

Thank you for clarifying.

Very nice! Just what I was looking for. However, I followed the HASSIO instructions on the git page, but the add-on will not load. I used https://github.com/nragon/keeper/tree/master/setup/hassio as the URL for the add-on. Any suggestions or help is appreciated.

I’m afraid the HASSIO add-on has been started but is not yet finished. I’ll let you know when it’s done.

I appreciate it. And thanks for developing it.

Any luck with that Hass.io add-on?
By the way, shouldn’t HA’s config use time_pattern instead of time?

- id: keeperheartbeat
  initial_state: "on"
  trigger:
    platform: time
    seconds: "/<numberofseconds>"