Alert2 - a new alerting component

tman98 · November 12, 2024, 11:11pm

Great! And more than happy to be a real world production user!

tman98 · November 14, 2024, 6:33pm

Hi Josh, I may have hit another bug. I had an alert go off yesterday that I snoozed for a few hours. It is still indicating snoozed today after the time has passed:

And in the details of the alert:

Notifications

Status: snoozed until 2024-11-14T01:39:20.713000+00:00

Which (as long as I’ve gotten my UTC conversions right) is a time yesterday, so it looks like the snooze time passed without the snooze ending. Happy to provide any logs you’d like.

Thanks!

tman98 · November 14, 2024, 10:05pm

I feel like I’m throwing a lot at you :). I had an alert go off earlier today that I snoozed. This one fires if there is motion on my indoor motion sensors. It suddenly went off just now, which was surprising. Reading the message, though, it looks like a delayed notification from a few hours ago, possibly sent after the snooze period ended, but no alert should have fired now, as the occupancy ended over 3 hours ago.

2024-11-14 17:01:03.896 WARNING (MainThread) [custom_components.alert2] _notify msg=House occupancy away fired 4x (most recently 3h ago): turned off 3h ago after being on for 54s

2024-11-14 17:01:03.897 WARNING (MainThread) [custom_components.alert2] Notifying ['all_devices_critical']: House occupancy away fired 4x (most recently 3h ago): turned off 3h ago after being on for 54s

redstone99 · November 14, 2024, 10:33pm

Hi - I’m glad for the feedback, keep it coming! I will be less online till this Sunday, so sorry in advance for delays responding.

Re the notification exceptions: I figured out what’s going on. It’s partly due to a limitation/issue with the group integration. I’m testing a workaround that should avoid the problem. It’s probably ready to release now. Will release and explain on Sunday or sooner.

Re the snooze not turning off when it should. I’ll look into it:

Did HA happen to restart within 15 minutes of when you last snoozed the alert?
If it happens again, can you go to the bottom of the “more info” pop-up for the alert, click on “Attributes” and see if the value for “Notification control” matches the time you expect the snooze to end?
If you set an alert to snooze for a minute, does it unsnooze after a minute for you?

EDIT: I noticed that Alert2UI does not update the more-info notification-control “Status”. So if you updated the snooze time, the “status” field would still show the old time until you closed the popup. Is it possible that’s what you observed? In the upcoming release, I changed it so the “status” field auto-updates.

Re the unexpected notification after snooze ends. The intention was that, if any alerts fired during the snooze period, to send a notification summarizing what happened when the snooze period ends. I can see that being unexpected and maybe undesired, especially at the “critical” notification level. I see two options to improve this:

Add a config flag to suppress any such summary notifications
Add a config flag to specify the notifier to use for such summaries.

Any preference or thoughts?

Cheers,
Josh

tman98 · November 16, 2024, 4:27pm

Thanks - it is possible that HA did restart within 15 minutes of when I snoozed the alert as I was doing some work with HA. I’m not certain but it’s possible. I’ll keep my eye on it to see if it occurs again.

Regarding the notification after the snooze ends, that’s an interesting feature you have thought of (the summary). It was counter to the behavior I expected in this case because this was a critical alert notifier for occupancy sensors that I had snoozed while someone was at the house. Getting an alert at the end of the snooze period that occupancy had gone off was surprising and made it seem like an intruder was in the house.

I actually suggest both of your options. My feeling is the summary is more of an “opt-in” type feature so perhaps only have it sent if you have selected to have it sent (and an optional separate notifier is great - for example a less critical one would be useful). I think the messaging might want to include something about summary in the text because even when reading the message I thought the alert had just fired.

Perhaps something like: "House occupancy away summary at end of snooze period: alert fired 4x during snooze (most recently 3h ago): turned off 3h ago after being on for 54s"

tman98 · November 18, 2024, 10:34pm

Hi Josh - I think there is something going on with snooze. I had two alerts go off today, and I snoozed both. One ended at the end of the snooze time no problem, the other appears to have passed the snooze time. It’s now 5:32 pm and the snooze was supposed to end at 5:03 pm, which the notification control shows as seen below. I’ve refreshed the browser so I think this is a server side error and not a UI caching problem. I also never got the “unexpected” snooze ending notification we were discussing for this alert.

I’ve tried enabling custom_components.alert2 debug logging, not sure if that will turn anything up?

redstone99 · November 18, 2024, 11:29pm

Hi @tman98, I just released Alert2 v1.5 along with a corresponding update to Alert2 UI. This release should help with some of the issues you encountered.

Alert2 biggest changes:

Add config options notifier_startup_grace_secs and defer_startup_notifications to better support YAML notify groups.
Add summary_notifier config option so you can use separate notifiers for summary messages (eg snooze ending)
Add unack service call

Alert2 UI biggest changes:

Make more-info popup dynamically update “Status” field and “attributes”.
Clicking on “Alerts” header of Lovelace card shows Alert2 & Alert2-UI version.

And I updated all docs.
Regarding recent issues you’ve faced:

Notifier exceptions

I believe those notification exceptions you saw were not coming from Alert2 per-se, but from the “notify group” code. YAML notify groups have an issue where they throw an exception if a member notifier does not exist. And Alert2 has no good way to tell during HA startup if a notifier is a notify group with a missing member. So…

I created a config option defer_startup_notifications. You can list notifiers here, eg notify groups, and Alert2 will defer trying to notify them for notifier_startup_grace_secs during HA startup. You can also just set defer_startup_notifications to true and Alert2 will defer all notifications during startup. For non-group notifiers, Alert2 automatically defers notifying during startup, so this probably only makes sense for YAML notify groups.
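As a rough illustration of the two forms described above, a config sketch might look like the following. It assumes both options sit at the top level of the alert2 section and that notifier_startup_grace_secs is a number of seconds. The notifier name all_devices_critical is borrowed from the log lines earlier in this thread, and the value 120 is arbitrary. Check the Alert2 docs for the exact placement before copying this.

```yaml
alert2:
  # Assumption: both options live at the top level of the alert2 section.
  # How long to hold back deferred notifications during HA startup.
  notifier_startup_grace_secs: 120   # arbitrary illustrative value

  # Form 1: defer only the listed notifiers (eg YAML notify groups).
  defer_startup_notifications:
    - all_devices_critical

  # Form 2 (alternative to form 1): defer every notifier during startup.
  # defer_startup_notifications: true
```

Only one of the two forms should be active at a time; the second is left commented out here.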

EDIT: I forgot to mention, you could also just use Alert2 native “groups” aka creating an entity with a list of notifiers and using that. The docs have an example. It’d avoid the issues I mention above.

Surprising summary notifications

Sorry for the alarming and confusing summary notification on your high-priority alert. I created summary_notifier to control where those go. You can specify a notifier here, set it to True to use the regular notifier, or set it to False to not send summaries. So to be clear, normal notifiers are used for an alert firing, a condition alert stopping firing, and reminders that a condition alert is still on. summary_notifier is used for other notifications, which at the moment are those detailing alert activity that happened while an alert was snoozed or throttled.
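A minimal sketch of how this might look on a single alert, assuming summary_notifier can be set per alert alongside notifier. The domain, name and the binary_sensor entity below are hypothetical placeholders, not taken from anyone's real config.

```yaml
- domain: house                      # hypothetical
  name: occupancy_away               # hypothetical
  # Hypothetical entity; substitute your own occupancy or motion sensor.
  condition: "{{ is_state('binary_sensor.indoor_motion', 'on') }}"
  # Regular notifier: alert firing, alert turning off, reminders.
  notifier: all_devices_critical
  # false: send no summary notifications (eg at snooze end) for this alert.
  summary_notifier: false
  # Alternatives, per the description above:
  #   summary_notifier: true          # send summaries via the regular notifier
  #   summary_notifier: all_devices   # send summaries via a less critical notifier
```

With false, the end-of-snooze summary would be suppressed for this alert only, while normal firing notifications still go to the critical notifier.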

Snooze not appearing to unsnooze

The v1.5 release makes the Status and attributes sections of the more-info dialog dynamically update. That should help the more-info popup show current info. I’m not sure why the alert2 lovelace card you screenshotted is showing stale info. Are there any errors in the javascript console log (Shift+Ctrl+I)?

When the snooze period for an alert ends, there is a log (INFO) line written containing the alert name and “snooze has expired. Reenabling notifications”. Can you look in your logs and see if you see those at the times you expect? If you haven’t already, you may need to set the default log level to “info” in your configuration.yaml, as below.

logger:
  default: info

And if you want gobs of debug info, you can set debugging output for the alert2 like this in your configuration.yaml:

logger:
  default: info
  logs:
    custom_components.alert2: debug

Oh and lastly, when you update, you can click on the “Alerts” header of the Lovelace card to verify you’re running the current versions.

Thanks for looking into the snooze issue and trying things out!
-Josh

tman98 · November 19, 2024, 9:38pm

Great, thanks for the update! Thanks for the info on the Alert2 notify “groups” - that might be a good way to also try to avoid the error, and great news about the new options as well. I’ll also keep my eye on the snooze item and I have debug logging on so will report back.

Awesome update on the new summary_notifier option, I’ll definitely set that to false for the particular occupancy notification. As a side note, I think condition-end notifications still going to the main notifier makes sense.

Question: what exactly does ack/unack actually do besides a visual indicator a user has seen the alert? For example, it appears an unack’d alert will still fall off the UI if it has resolved earlier than the UI’s slider time window. I thought perhaps an unack’d alert would never disappear until ack’d which would be really useful to never miss an alert that fired (and ack’d would fall off if the time passed the slider’s time window).

Overall, wanted to say this component is awesome and fulfills a major hole I think HA has in its alerting. FWIW my day job is building large scale multi-site colocated high frequency trading solutions, hundreds of machines spread across numerous international sites. Monitoring and alerting is the single most critical tool in our industry and I have to say, you have built something pretty damn solid that isn’t conceptually different than what we use on big interconnected multi-site systems; notably the ability to view if an alert has been fired in the UI and not just relying on notifications is a really key feature.

tman98 · November 19, 2024, 11:42pm

In order to keep my alerts config manageable and also allow my system to grow without having to do lots of editing (or forgetting to edit) and risking errors in critical alerting config, I have been using a number of alerts that try to match multiple entities at once (and therefore test less frequently). For example, I use remote home assistant and have this alert in the master to alert on connectivity:

    - domain: system
      name: remote_connectivity
      friendly_name: Remote connectivity
      condition: "{{ states.sensor|selectattr('entity_id','match','sensor.remote_connection_to_(.*)')|rejectattr('state','eq','connected')|list|length > 0 }}"
      delay_on_secs: 60
      notifier: all_devices
      message: "Remote connectivity lost to: {{ states.sensor|selectattr('entity_id','match','sensor.remote_connection_to_(.*)')|rejectattr('state','eq','connected')|map(attribute='name')|list }}"

I’ve been doing similar things for other things I want to check in bulk as well - I have a few learnings and possibly some features that might help that I’ll get over as I keep going. But first, I’m running into a possible limitation I wanted to ask about. With the above, once one remote connection dies, the condition alert goes to “on” so I won’t get more notifications of further remote connections dying.

I was thinking of moving to a trigger alert. The best I have so far is to just trigger on the entire domain of states.sensor (using platform:event and a state-changed event) and then run the same condition I have above to check if that state change was one I care about. It means a lot of unneeded triggers, which is probably fine. But just checking whether it’s possible to template the trigger spec (then I could dynamically construct the entity_ids to trigger on). Also, is the “trigger” variable accessible in the condition and message lines to act upon the object that triggered?

EDIT: one further question, will delay_on work correctly with a trigger? I want to alert as above when one of the dynamically selected remote connections goes bad for 60 seconds, but any time that happens after 60 seconds (therefore combining a trigger, condition, delay_on, with dynamic entity selection)

EDIT2: On reflection I think I’m working around a fundamental limitation, so there may be a more core thought. I have alerts I want to create where I do not specifically know all the entities that I might want to alert on at configuration time. For example, above I prefer to not write a new alert for each remote system I connect to, each new temperature sensor I connect up, etc. I may have 50 temperature sensors at one site to detect a low temperature condition, and I do not want to write a new alert for each.

I’ve been creating “aggregator” alerts similar to the one above that look at a set of entities at once and alert if any of them are in an alert state. This has required either a match filter like above or I will combine them together in a threshold or other sensor and then alert on that single sensor. But this has limitations, namely I have to create an aggregation function across all the set of entities (e.g. look at the minimum of all temperature sensors and alert on that number, and also do hysteresis on that aggregation which has some limitations) and a single condition that is alerted on, so once we are in the alert state (e.g. one sensor goes below the low temperature) we don’t discover if new entities in the aggregation become a problem (since we’re already in the state). A trigger alert loses the fact that we’re currently “in” the alert and when we leave it.

But there is no way to dynamically create alerts at present so I have to do the aggregation above. I think some spitballing of how to create dynamic alerts, likely through templates, would be highly valuable.

redstone99 · November 20, 2024, 8:02pm

Hi @tman98,
Re aggregate alerts and matching multiple entities - that’s a neat use case. It’s a little like automation blueprints.

As a starting point, if you hypothetically knew the entities involved at config creation time, I could imagine adding a pattern parameter so your remote connection alert config might look something like:

- domain: system
  pattern: [ remote_a, remote_b, .. ]
  name: patternItem
  condition: "{{ states('sensor.' + patternItem) != 'connected' }}"

You mentioned not wanting to explicitly enumerate the entities. In that case maybe we support wild-cards in this way:

- domain: system
  pattern: "{{ states.sensor|selectattr('entity_id', 'match',
                 'sensor.remote_connection_to_(.*)')|list }}"
  name: "{{ patternItem.name }}"
  condition: "{{ states(patternItem.entity_id) != 'connected' }}"

The way the above might work is that, if the alert config contains the pattern parameter, it is an “alert generator”. pattern is evaluated periodically during HA startup (and maybe after) and produces a list. Each item on the list results in the creation of a new alert. name accepts a template that is evaluated only when the alert is created. name, condition and other template parameters have access to a variable, patternItem, that is the specific list item from pattern that resulted in the alert’s creation. Thoughts?

EDIT: Forgot to mention, I think triggers are not the best path for this use case. They’re for events in a moment in time (like a message arrives) rather than conditions that are true for an interval. For this reason, triggers do not work with delay_on_secs.

To respond to your other comments:

Acking a condition alert prevents reminder notifications while the alert stays on. It also aborts any pending summary notifications that remain to be sent (eg if you ack a snoozed or throttled alert).

Having the Alert2 UI card include unacked alerts outside the displayed time window is an interesting idea. It should probably be an option, so that people who don’t want to ack alerts aren’t forced to do so just to keep the display from filling up.

Thanks for the comment about “alert turned off” notifications using the main notifier (rather than the summary notifier).

I’m glad you take your alerting seriously! Your input has been super helpful.

Cheers,
Josh

tman98 · November 21, 2024, 4:46pm

I think the pattern matching (#2 above) is exactly what I’m looking for: creating a repeated set of alerts and then allowing use of the matched object in that particular instantiation of the alert is perfect (as it lets me get to names, attributes, states etc). I think of it basically as a foreach loop.

I would need the dynamic list, which I think should be any list created by a template although I can see utility in a hardcoded list (#1 above) for some.

Honestly, the exact same construction patterns you have for getting a list of notifiers (e.g. single, YAML list, sensor, template to list, template to sensor) could generally work and provide consistency in your config style. It’s really rich and has all the same requirements I think the “pattern” idea has for getting lists of objects.

I’m not sure if there’s some refactoring required because you might do alert instantiation just at startup (whereas a template could reevaluate which you’d have to listen to and redo relevant alert instantiation on the pattern template change).

Somewhat related I find that a YAML reload doesn’t seem to get alert2 to pick up alert changes in configuration.yaml, only a restart does. I wonder if the above work would also help with that?

Lastly, I’m finding that if I ack or snooze, the more-info popup always updates but the Lovelace card does not, and I sometimes have to reload the browser to pick up the update, so I’m not sure it’s getting all the events correctly for some reason. I’ll continue to see what I can figure out. I’m on the latest server side and UI updates.

redstone99 · November 22, 2024, 12:06am

Supporting YAML reload sounds like a good idea. There are “config unload” routines I never implemented, so I’m not surprised YAML reloading doesn’t affect Alert2 yet.

And thanks for being watchful re UI not updating. I can look into it.

Re patterns, yeah, for the pattern param I was thinking a similar construction to the list of notifiers as you suggest. However, there’s a qualitative difference between using a template for config convenience at HA startup time, and using a template to capture ongoing state changes while HA runs. I have a conceptual worry that making pattern work exactly like notifier might lead to issues for this reason, but I need to think through what those issues might be.

One issue might be efficiency - for example, if you have pattern match on states.sensor, it’d be nice if that only reevaluates the template when entity ids are added to or removed from states.sensor. I’m not sure if that’s the case. Another thing to look into.

-Josh