Alert2 - a new alerting component

@pav and @teachingbirds, thanks so much for your feedback.

I just released Alert2 v1.3.2 and the companion Alert UI v1.1.1. Combined change list and comments:

  • Fix UI bugs related to old Alert-1 alerts: the issue of multiple more-info popups, and extra old-style alerts being listed in the overview card. @teachingbirds, let me know if this doesn’t address the bugs you saw.

    Old Alert-1 alerts don’t carry enough information to distinguish between an alert having been active recently and HA having restarted recently. So I made it so that the overview card shows recently active Alert2 alerts, but only shows Alert-1 alerts while they are firing, just to avoid a flood of Alert-1 alerts showing up when HA restarts. I’m open to a better approach to handling this.

  • Add config support for data, title and trigger parameters. They are passed to the notifier.

  • Add support for done_message

  • Add support for friendly_name - this shows up in the overview card.

  • Add support for annotate_messages. This is a boolean, defaulting to true, that controls whether Alert2 adds extra information to notifications when an alert starts or stops firing. It does not affect reminder notifications.

  • Update the README to document the new options. A combined config sketch follows this list.
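
To make these concrete, here’s a minimal sketch of a single alert using the new options. The entity, notifier data and message values are hypothetical, chosen only for illustration - check the README for the exact placement of each option:

    - domain: example                       # hypothetical alert, for illustration only
      name: garage_door_open
      friendly_name: Garage door open       # shows up in the overview card
      condition: "{{ is_state('binary_sensor.garage_door', 'on') }}"
      message: Garage door has been left open
      done_message: Garage door is closed again
      title: Garage alert                   # passed through to the notifier
      data:
        priority: high                      # notifier-specific payload, passed to the notifier as-is
      annotate_messages: false              # skip Alert2's extra start/stop annotations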

To follow up on your comments:

Posting either here or in an issue is fine with me. Here is good for open discussion, but may get confusing if too many threads are going on in parallel.

@teachingbirds, in your suggestion to include notifiers in the more-info dialog, do you mean just showing, for reference purposes, which notifier the alert is configured to use?

I distinguish between notifications of an alert starting or stopping firing, and reminder notifications. I found it handy for reminder notifications to include extra info like “Alert x on for 1hr” rather than just repeating the original notification text from when the alert first fired. You can view the original message text from when the alert first fired in the more-info dialog. I could document more fully what kinds of extra info are added. Any thoughts?

@pav, thanks for your comments on reset_when_fires. I realized that what I really wanted was a way to limit notifications and prevent a flood when an alert oscillates, turning on and off rapidly (before I can snooze it). I’ll look into creating a fall-back parameter limiting the maximum notification rate, or something simpler than the reset_when_fires that I had proposed.

NOTE - the current notification frequency logic does not reset between multiple firings of a single alert. I think this is more complex than it needs to be. I’ll clean it up and simplify it to apply within a single alert firing.

I’m considering changing the name of notification_frequency_mins to reminder_frequency_mins, since it really controls reminders specifically. Any reason not to?

Next I’ll work on making notification_frequency_mins take a list, and probably rethink the frequency logic.

-Josh

Following up on your feature suggestions, I just released Alert2 v1.3.3. Release notes:

  • Rename the parameter notification_frequency_mins to reminder_frequency_mins.

  • reminder_frequency_mins now takes a list of values, similar to the repeat parameter in the old Alert integration. However, reminder_frequency_mins stops at the last entry in the list and does not cycle. It also resets whenever the condition alert turns off, so it no longer controls notifications across multiple firings of an alert. (A combined sketch of the new parameters follows this list.)

  • Add throttle_fires_per_mins - Limits notifications for frequently firing alerts.

  • Add on_for_at_least_secs - Delays a condition alert from turning on until the underlying condition has been true for at least the specified number of seconds. It’s a form of hysteresis, similar to skip_first in the old Alert integration and on_for in other alert systems.

  • Add annotate_message - Controls whether notifications have extra context information added to them. Useful if you need more precise control over the notification text.

  • Expand the number of options for tracked alerts to mirror the relevant ones available to other event alerts.

  • Update docs
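
As a rough sketch of how the new parameters fit together (the sensor, values and throttle format are hypothetical and only for illustration - see the docs for the real syntax):

    - domain: example                          # hypothetical alert, for illustration only
      name: sump_pump_offline
      condition: "{{ is_state('binary_sensor.sump_pump', 'off') }}"
      on_for_at_least_secs: 120                # condition must hold for 2 minutes before the alert turns on
      reminder_frequency_mins: [10, 30, 60]    # remind after 10, then 30, then 60 minutes; stays at the last value and resets when the alert turns off
      throttle_fires_per_mins: [5, 60]         # illustrative format only: limit notifications when the alert fires frequently
      annotate_message: false                  # send just the raw message text, without extra context
      message: Sump pump appears to be offline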

@pav and @teachingbirds, I think I’ve now covered all, or nearly all, of your suggestions. Let me know what you think, and whether there’s anything else that would benefit from a change or addition.

EDIT - I renamed on_for_at_least_secs to delay_on_secs since it seems like a better name. Minor release v1.3.4.

-Josh


Hi @redstone99. It looks like you fulfilled all our wishes and then some. I adapted all my alert instances, and converted the last of the older ones to Alert2, using the new extra parameters.
It is a bit early to vouch for the correct functioning of it all, but knowing you :slight_smile: I’m confident there will be no complaints …
Thanks for a very useful Alert2 integration, and especially for bringing it to near perfection in such a short time. It - and you - deserve to be highlighted in Hass’s next monthly release notes. My vote you have!
Cheers


Hi Josh. One thing I forgot to bring up until now: I noticed that Alert2 creates a bunch of alert2.alert2_* entities. Are these purely for the integration’s internal use, or are they supposed to be useful to us as well? Just asking, because so far I personally fail to see their usefulness to me.
And because I am on a mission to reduce my DB size, and some of these entities are sometimes massive (one of them currently contains some 11 MB), I would like to do away with them / prevent their creation - of course only if you do not need them yourself …
Thoughts?

Hi Paul,
I’m so glad you’re finding Alert2 useful. And I’m all for keeping the DB footprint small. I realize that alerts using delay_on_secs will update the DB whenever the condition becomes true, even if the alert doesn’t actually turn on. I should probably change that so the DB update doesn’t happen until the alert actually turns on. That’ll help somewhat with DB size.

The alert2.alert2_* entities are used to notify you of issues with Alert2 itself, like a YAML config error, an error evaluating a template, a notifier not being available, and a few other error conditions. I wouldn’t expect them to be very active or to take up much DB space. If you want to prevent them from taking up any space you could exclude them from being recorded in the DB by adding the following to your config YAML:

recorder:
  exclude:
    entities:
      - alert2.alert2_undeclared_event
      - alert2.alert2_unhandled_exception
      ... and so on ...

I can make the change to delay_on_secs alerts and see if it helps you. Or, if you’re up for digging a bit deeper…
It’d be helpful to know which alert2 entities are taking up space in your DB. Would you be up for running a SQL query for recent alert2 entities and telling me what you see? In particular, whether most of the activity for you is in the entities of alerts using delay_on_secs or in something else. The following SQL query shows alert2 DB activity from the last 24 hours (entity name and update times):

SELECT states_meta.entity_id,
       DATETIME(states.last_updated_ts, 'unixepoch', 'localtime') AS "states.last_updated_ts",
       states.last_changed_ts,
       states.attributes_id
  FROM states
  LEFT JOIN states_meta ON states.metadata_id = states_meta.metadata_id
 WHERE states_meta.entity_id LIKE "alert2.%"
   AND states.last_updated_ts > strftime('%s', 'now', '-24 hours')
 ORDER BY states.last_updated_ts;

You could paste that into the SQLite Web addon, or on the command-line, it’d be something like the following (you may need to stop HA first):

$ sqlite3 home-assistant_v2.db
sqlite> select states_meta.entity_id, DATETIME(states.last_updated_ts, 'unixepoch', 'localtime') AS "states.last_updated_ts", states.last_changed_ts, states.attributes_id FROM states LEFT JOIN states_meta ON states.metadata_id = states_meta.metadata_id WHERE states_meta.entity_id LIKE "alert2.%" AND states.last_updated_ts > strftime('%s', 'now','-24 hours') ORDER BY states.last_updated_ts;

EDIT - added recorder YAML suggestion above.

Josh

Hi Josh. Thanks for coming back to me on this ‘issue’. I had already adjusted my recorder settings to exclude most of these entities - more in an attempt to keep a tidy ship than because of an urgent need. I also suspect that the space they take up in the DB would not amount to much, because 1) I have not been using Alert2 for more than a couple of weeks now, 2) I have a rather limited number of use cases for it, and 3) my alerts are such that I don’t expect them to fire frequently. So you might say that I am nitpicking a bit :smirk:

The following is definitely in no way meant as criticism, but merely offered as ‘food for thought’:

  • this is the only integration I know of that packages its issues in multiple, real entities, instead of just ‘logging’ them in the standard logs (which has always worked for me)
  • at least to me, their usefulness seems limited to the creation/debugging stage of the alerts, and they are of limited or no interest once these are up & running
  • they seem to be overly long-lived & never reset: I have some of them holding info from 2 weeks ago that is by now totally irrelevant …

Anyway, no big deal and not worth losing any sleep over. Still very happy with a much-improved Alert integration :+1:

Hi Paul,
Thanks for the thoughtful feedback. I can definitely clean up the internal alerts and make them optional - let me know what you think of the suggestion below. And you raise an interesting point about event alerts being long-lived (see below).

My thinking for internal alerts was that, if for some reason Alert2 itself runs into a problem, you may no longer receive alerts for things you actually care about. So in a sense, alert2 failing is as serious as the most serious condition you might alert on. But that’s just an opinion, so how about I collapse the internal entities down to a single one and make it optional with a flag:

  • alert_on_internal_errors - If false, internal errors will just be logged and no internal alert entity will be created. If true, a single internal alert entity, alert2.alert2_internal_error, will be created for tracking and notifying on any internal errors. (See the sketch below.)
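
If that sounds reasonable, the config might end up looking something like this (just a sketch of the proposal; the option name and default could still change):

alert2:
  alert_on_internal_errors: false   # proposed flag: only log internal errors, don't create the internal alert entity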

The internal alerts (e.g., an assert failing) are event-based alerts. They indicate an event at a point in time rather than a condition being true or false, so they don’t have a notion of turning on or off. So there isn’t a sense in which an event alert ever “resets” the way a condition alert turns off. Would it help if I made it clearer in the UI which alerts are event alerts?

Thanks again for taking the time to give feedback! It’s valuable to get fresh, thoughtful eyes on Alert2 to keep improving it.

-Josh

Hi Josh, you keep surprising me, not only by paying such close attention to my ramblings :slight_smile:, but also by proposing well thought-out changes/improvements that surpass my own suggestions.
Just to say that the combination of collapsing the internal entities into a single one, and the addition of the parameter governing its creation, would leave me wishless (at least for now, haha).

Btw, talking about the alerts: I must admit that sometimes they leave me bewildered as to their origin and/or meaning. E.g. I had one the other day: “Err reported: {‘domain’: ‘alert2’, ‘name’: ‘assert’, ‘message’: ‘Printer_Powered, not notifying but remaining time is 0’}”. Cryptic to me, as I have no clue as to the what or why …

Hi Paul, I just released Alert2 v1.3.5. It consolidates all the internal alert entities into a single one, alert2.error, and provides an option to omit even that entity. The weird alert you saw was an unnecessary notification - I fixed that bug. I also improved the message for alert2.error somewhat, to ask that people file an issue or notify the Alert2 maintainer. My goal is that any alert should be clear and informative, so let me know if you see anything confusing going forward.
-Josh


And just released v1.3.6. The main change is adding unit tests (yay!) and also allowing literal entity names in condition and threshold value parameters. So in an alert, instead of writing:

condition: "{{ states('binary_sensor.my_sensor') }}"

you can now alternatively just write:

condition: binary_sensor.my_sensor

and similarly with the value field of a threshold.
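
So a threshold alert can be shortened along these lines (hypothetical sensor and limits):

    - domain: example
      name: temperature_high
      threshold:
        value: sensor.outdoor_temperature     # literal entity name instead of "{{ states('sensor.outdoor_temperature') }}"
        maximum: 35
        hysteresis: 2
      message: Outdoor temperature is high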

EDIT: I updated the latest release to 1.3.7.1 to support the new HA 2024.10.
HA 2024.10 changes how messages to notifiers are interpreted, making it no longer necessary to wrap messages in “{% raw %}”.

-Josh


v1.4 is out with support for template notifiers! You can now dynamically specify who gets notified and how. I updated the docs with details and examples.
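
As a rough sketch of the idea (hypothetical notifier and entity names - the docs have the exact syntax and more examples):

    - domain: example
      name: water_leak
      condition: binary_sensor.water_leak_sensor
      message: Water leak detected
      notifier: >
        {{ 'mobile_app_my_phone' if is_state('person.me', 'home')
           else ['mobile_app_my_phone', 'mobile_app_spouse_phone'] }}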

Also, I opened up the discussions section of the github repo, which might be a good place for discussing specific feature requests. Or here.

Feedback or suggestions welcome!
-Josh


Wow, @redstone99 this is an incredible project. Thank you for your contributions!

Quick question, do I need to restart Home Assistant after any edit to configuration.yaml or can I just reload yaml files?

On a related note, I am finding alerts difficult to maintain through YAML (especially as an HA restart takes a long time in my setup). It would be amazing if there were a UI for creating alerts: maintaining a lot of them is a challenge, and these are the types of things you add/edit/change frequently. A simple YAML error causes my entire HA setup to fail on startup, and twice I’ve had to revert to backups (a separate problem I’m trying to chase down, where HA seems to not load components in the right order because of a config error and resets dashboards).

Also, testing alerts is something I’m struggling with when using templates, as I can’t easily get some triggers/conditions to occur.

It would be incredible to have a UI to manage alerts (and hopefully minimize problems with formatting/errors in alert templates, for example), and I think it would open HA up enormously as an overall product. Something like the scheduler card would really be the best.

I’m not sure how to better test alerts without causing the triggers/conditions they depend on to fire. I want to make sure they are right, especially when a condition is a “must alert” situation - I don’t want an untested error to prevent the alert from firing.


Hi @tman98 , thanks for the feedback! Yes, I believe that modifying configuration.yaml requires an HA restart. I’ve tried to make Alert2 so that a bad Alert2 config will never prevent HA from starting and also so that a bad single alert config will not prevent other alerts from loading successfully. I think the original Alert integration requires the whole config to be correct.

The ability to configure alerts through the UI would be nice. I’d help someone else do the development of this. Doing it well is probably a fair amount of work.

How to test alerts is a great question. Today, I suppose you could copy an alert and replace the sensor reference with true to see what happens. One could imagine an interface that lets you test “what happens if…” without changing your alerts. It would mock the sensors referenced in an alert with values you specify, so you could see what happens.
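
For example, you could clone an alert under a throwaway name and hard-code its condition, so it exercises the notification path without touching the real sensor (a sketch, with hypothetical names):

    - domain: test                       # throwaway copy, used only for testing
      name: my_alert_test
      condition: "{{ true }}"            # hard-coded in place of the real sensor reference
      message: Test firing of my_alert
      notifier: mobile_app_my_phone      # hypothetical notifier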

Two other threads on the alert UI topic below (though neither seems to be active).

Hi - I agree on the UI being some work. I have a few ideas I’ve been tossing around.

In the meantime, I’ve been getting a few errors reported to the alert2_error alert. I routinely get a number of these errors on an HA restart:

exception {'message': 'Task exception was never retrieved', 'exception': ServiceNotFound('service_not_found'), 'future': <Task finished name='Task-527' coro=<ServiceRegistry.async_call() done, defined at /usr/src/homeassistant/homeassistant/core.py:2697> exception=ServiceNotFound('service_not_found')>}

The second is this error, which I periodically get when a threshold alert starts to exceed its bound:

Internal error. Please report to Alert2 maintainers (github.com/redstone99/hass-alert2). Details: system_memory_high turning on but already have delayed wait set

The relevant alert is looking at the system monitor’s memory usage:

    - domain: system
      name: memory_high
      friendly_name: Memory high
      threshold:
        value: sensor.system_monitor_memory_usage
        maximum: 90
        hysteresis: 5
      delay_on_secs: 120
      message: "{{ states('sensor.system_monitor_memory_usage') }}"

Hi, thanks for reporting the issues! I confirmed the bug with thresholds and delay_on. Fix coming soon. The “Task exception”/“service not found” one is a bit trickier, though I have an idea.

Can you take a look at the logs and see if there is a message from Alert2 right before the exception happens? There should be a log line just before it, reporting which notifier service Alert2 was trying to async_call. It might be helpful if you could send some log lines before that as well, just in case.
EDIT: I think I know the issue. My guess is that in the logs you’ll see a message about persistent_notification.

Regarding UI support for Alert2 config, I’d love to hear your thoughts. I’ve started laying some groundwork on the backend as I mull over how it might work and look at how the Scheduler card works. Though let’s get these bugs sorted out first.

-J

Thanks much! On the threshold error: it appears I received that error today and never got the actual alert, so it looks like it’s preventing the alert itself from firing correctly. Looking forward to that fix!

For the “task exception was never retrieved” errors, I found a few logs. Unfortunately, I cannot for the life of me figure out how to copy out of the terminal (there’s a github issue tracking this), so here is a screenshot:

There are then a bunch of logs of this same info being repeated over and over.
Note that the actions “notify.lovelace_popup” (a custom one of my own in pyscript) and “notify.mobile_app_6453” (a standard mobile device with the companion app) are being reported as not found. Both exist and work later on, when alerts are actually fired.

However, when I look later in my log, I see the pyscript that creates lovelace_notify being run, and then I see “Setting up mobile_app” occurring later as well.

Perhaps alert2 is trying to configure/initialize something with notifiers that have not yet been set up in the system?

Hi, I just released v1.4.1 that hopefully will fix the issues you encountered. Can you give it a shot?

It adds a dependency on notify during startup, which should ensure persistent_notification is available. The notify component changed somewhat recently to defer full initialization of all notifiers, which means it can take some time after startup for all notifiers to become available. So Alert2 has a notion of “deferred” notifications during startup. I also now catch the ServiceNotFound exception, which I don’t think should ever occur anymore; but if it does, it will log an error.

Thanks,
Josh


Wow, amazing, that was an incredible turnaround, and with unit tests! So glad you have a solid test base; it makes me more confident in the codebase as well. I’m installing and will report back after a day or two!!

Hi @redstone99, unfortunately the startup issue isn’t resolved yet; here’s the log from alert2’s startup (below). Subsequently in the log, my two notifiers notify.lovelace_notify and notify.mobile_app_6453 are set up. So it does not appear that we’re delaying until after all notifiers are established.

For the mobile one, that is a standard mobile app notification, so it seems like we should definitely be delayed until after it is set up. It looks like alert2 is starting before registered notifiers (but perhaps after the notifier system and persistent_notification themselves are set up).

For my custom one written in pyscript (lovelace_popup), it occurs to me that alert2 wouldn’t actually know to wait, as the system wouldn’t know a pyscript provides a notifier. I’m not sure how I can force myself to be in the startup chain ahead of alert2 without modifying the manifest (which would be overwritten on an upgrade).

Given that notifiers can still be getting set up during HA startup, and that alert2 may not be aware of them until an alert fails, I’m not sure of the utility of trying to send alerts to notifiers that are not yet set up during startup. Even with error checking, those notifications will still get lost and never be delivered to the yet-to-be-set-up notifiers. So perhaps that phase needs a little retooling: some logging of the alert, plus either queuing the alerts to send once HA has started or the notifiers are available, or something else.

I’m also not entirely sure why a throttle-interval alert is the very first alert that appears in the logs, before any other alerts.

2024-11-12 15:16:10.883 INFO (MainThread) [homeassistant.setup] Setting up alert2
2024-11-12 15:16:10.886 INFO (MainThread) [custom_components.alert2] Setting up Alert2
2024-11-12 15:16:10.894 WARNING (MainThread) [custom_components.alert2] _notify msg=[Throttling ending] Alert2 alert2_error: Did not fire during throttled interval
2024-11-12 15:16:10.896 WARNING (MainThread) [custom_components.alert2] Notifying ['all_devices']: [Throttling ending] Alert2 alert2_error: Did not fire during throttled interval
2024-11-12 15:16:11.192 INFO (MainThread) [homeassistant.setup] Setup of domain alert2 took 0.31 seconds
2024-11-12 15:16:11.258 INFO (MainThread) [homeassistant.setup] Setting up shopping_list
2024-11-12 15:16:11.258 INFO (MainThread) [homeassistant.setup] Setup of domain shopping_list took 0.00 seconds
2024-11-12 15:16:11.289 ERROR (MainThread) [custom_components.alert2] Exception {'message': 'Task exception was never retrieved', 'exception': ServiceNotFound('service_not_found'), 'future': <Task finished name='Task-1996' coro=<ServiceRegistry.async_call() done, defined at /usr/src/homeassistant/homeassistant/core.py:2697> exception=ServiceNotFound('service_not_found')>}
2024-11-12 15:16:11.290 ERROR (MainThread) [custom_components.alert2] Err reported: {'domain': 'alert2', 'name': 'error', 'message': "exception {'message': 'Task exception was never retrieved', 'exception': ServiceNotFound('service_not_found'), 'future': <Task finished name='Task-1996' coro=<ServiceRegistry.async_call() done, defined at /usr/src/homeassistant/homeassistant/core.py:2697> exception=ServiceNotFound('service_not_found')>}"}
2024-11-12 15:16:11.291 WARNING (MainThread) [custom_components.alert2] _notify msg=Alert2 alert2_error: exception {'message': 'Task exception was never retrieved', 'exception': ServiceNotFound('service_not_found'), 'future': <Task finished name='Task-1996' coro=<ServiceRegistry.async_call() done, defined at /usr/src/homeassistant/homeassistant/core.py:2697> exception=ServiceNotFound('service_not_found')>}
2024-11-12 15:16:11.292 WARNING (MainThread) [custom_components.alert2] Notifying ['all_devices']: Alert2 alert2_error: exception {'message': 'Task exception was never retrieved', 'exception': ServiceNotFound('service_not_found'), 'future': <Task finished name='Task-1996' coro=<ServiceRegistry.async_call() done, defined at /usr/src/homeassistant/homeassistant/core.py:2697> exception=ServiceNotFound('service_not_found')>}
2024-11-12 15:16:11.295 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved (None)
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/core.py", line 2735, in async_call
    raise ServiceNotFound(domain, service) from None
homeassistant.exceptions.ServiceNotFound: Action notify.lovelace_popup not found
2024-11-12 15:16:11.309 ERROR (MainThread) [custom_components.alert2] Exception {'message': 'Task exception was never retrieved', 'exception': ServiceNotFound('service_not_found'), 'future': <Task finished name='Task-1995' coro=<ServiceRegistry.async_call() done, defined at /usr/src/homeassistant/homeassistant/core.py:2697> exception=ServiceNotFound('service_not_found')>}
2024-11-12 15:16:11.309 ERROR (MainThread) [custom_components.alert2] Err reported: {'domain': 'alert2', 'name': 'error', 'message': "exception {'message': 'Task exception was never retrieved', 'exception': ServiceNotFound('service_not_found'), 'future': <Task finished name='Task-1995' coro=<ServiceRegistry.async_call() done, defined at /usr/src/homeassistant/homeassistant/core.py:2697> exception=ServiceNotFound('service_not_found')>}"}
2024-11-12 15:16:11.310 WARNING (MainThread) [custom_components.alert2] _notify msg=Alert2 alert2_error: exception {'message': 'Task exception was never retrieved', 'exception': ServiceNotFound('service_not_found'), 'future': <Task finished name='Task-1995' coro=<ServiceRegistry.async_call() done, defined at /usr/src/homeassistant/homeassistant/core.py:2697> exception=ServiceNotFound('service_not_found')>}
2024-11-12 15:16:11.311 WARNING (MainThread) [custom_components.alert2] Notifying ['all_devices']: Alert2 alert2_error: exception {'message': 'Task exception was never retrieved', 'exception': ServiceNotFound('service_not_found'), 'future': <Task finished name='Task-1995' coro=<ServiceRegistry.async_call() done, defined at /usr/src/homeassistant/homeassistant/core.py:2697> exception=ServiceNotFound('service_not_found')>}
2024-11-12 15:16:11.312 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved (None)
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/core.py", line 2735, in async_call
    raise ServiceNotFound(domain, service) from None
homeassistant.exceptions.ServiceNotFound: Action notify.mobile_app_6453 not found

Hi @tman98, sorry the fix didn’t work and thanks for sending logs. I created an issue to track this: Notifying "notify groups" during startup cause ServiceNotFound exceptions · Issue #3 · redstone99/hass-alert2 · GitHub.

HA initializes notifiers asynchronously. So I believe that HA can say it has fully started while there are still notifiers that have not finished initializing. For this reason, Alert2 has a mechanism to queue up and defer notifications during startup if the notifiers don’t yet exist.

However, Alert2 isn’t (yet) smart enough to check inside “notify groups”. So I think the following could be happening:

  1. Due to the earlier threshold bug, Alert2 tried to send you lots of notifications and throttling started. That throttling state persists across restarts.
  2. HA restarts. The notify group all_devices initializes. However, the member notifiers like lovelace_notify have not yet initialized.
  3. Alert2 wants to notify you that throttling of alert2.error has ended.
  4. Alert2 goes to notify all_devices, sees that the notifier exists, and invokes services.async_call to notify.
  5. The notify group all_devices then invokes services.async_call to notify each member.
  6. The members don’t exist, so ServiceNotFound is thrown, which is attributed to alert2 since it originally initiated the call.

If I’m right about this issue with “notify groups”, one answer is to make Alert2 detect “notify groups” and expand them itself, so it can check and defer any not-yet-existing notifiers.

As a work-around, Alert2 already has the ability to notify multiple notifiers itself. One could create an entity called sensor.all_devices whose state would be the list of notifiers. Then you could tell Alert2 to notify that entity name.
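
A rough sketch of that work-around, assuming a template sensor whose state holds the list, and assuming Alert2 reads the notifier list from that entity’s state as described above (the notifier names are from your setup; everything else is hypothetical):

# 1) A template sensor whose state is the list of notifiers (goes in configuration.yaml):
template:
  - sensor:
      - name: all_devices
        state: "{{ ['lovelace_popup', 'mobile_app_6453'] }}"

# 2) Then point the alert's notifier at that entity instead of the notify group:
#      notifier: sensor.all_devices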

EDIT: Actually, I realize that unless early_start is specified, generally alerts wait till HA has fully started before triggering. The alert2.error is an exception to this rule. Making it conform would also avoid the immediate issue.

I’ll look into notify groups and see what’s involved in expanding them.
I’ll copy this message into the Github issue. You can reply there or here.
I appreciate your help making Alert2 better! Nothing beats real-world mileage.
Cheers,
Josh