Alert2 - a new alerting component

redstone99 · December 30, 2024, 10:55pm

Btw, as I’m working on editing alerts in the UI, I just noticed that Alert2 swallows certain type errors in the condition field and converts them to “False”, which isn’t great. E.g., if your condition returns “foo”, that becomes False.

(The mock in my unit test didn’t accurately reflect the real HA internals in this regard. Will fix that, too.) I think I’m going to change it so it is an error if condition renders to something that isn’t a recognized truthy or falsy value (i.e., “true”, “on”, “yes”, positive numbers and the opposites) unless someone objects. I don’t like the idea that you may miss alerts because a condition has some type error in it that makes it become False all the time.

EDIT: I’ll probably put an explainer in the error message saying that condition got more strict so people aren’t wondering why they may be getting new errors.

EDIT2: This change will mean that the empty string, which used to mean “False”, will now cause an error to fire.
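
To make the proposed stricter behavior concrete, here is a minimal Python sketch. It is not Alert2's actual code: the function name `parse_condition` and the word lists are made up for illustration, and treating zero and negative numbers as False is an assumption about what "the opposites" means.

```python
TRUE_WORDS = {"true", "on", "yes"}
FALSE_WORDS = {"false", "off", "no"}


def parse_condition(rendered: str) -> bool:
    """Map a rendered condition to a bool, raising on anything unrecognized."""
    text = rendered.strip().lower()
    if text in TRUE_WORDS:
        return True
    if text in FALSE_WORDS:
        return False
    try:
        number = float(text)
    except ValueError:
        # Covers "foo" and the empty string: report it instead of returning False.
        raise ValueError(
            f"condition rendered to {rendered!r}, which is neither truthy nor falsy"
        ) from None
    # Assumption: zero and negative numbers count as False.
    return number > 0
```

With this, a condition that renders to "foo" or to the empty string surfaces as an error rather than quietly never firing.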

Btw, gotta say, even though I have only a little done in the UI work, it’s neat to be able to play with alert definitions and get instant feedback. Makes it easier to experiment.

J

woodersayer · December 31, 2024, 5:53am

Sweet. Great release! I’m home alone for the next couple of days so I’ll have some extra time to dedicate to alerting.

In regards to the suggestion of using the data field, that’s exactly what I’m doing currently and it absolutely works. The suggestion is specifically to try to be “DRY” when writing alerts. For example, I have a bin in which I store 3D printer filament that is to be humidity controlled. I have an alert that fires after the humidity has stayed above 15% for 10 minutes (delay_on_secs: 600).

    - domain: humidity
      name: ams_humidity
      friendly_name: "AMS Humidity"
      condition: "sensor.filament_humidity"
      title: "AMS Filament Humidity Alert"
      message: "AMS humidity has gone above 15% humidity. It's at {{ states('sensor.filament_humidity')}}%."
      delay_on_secs: 600
      done_message: "AMS humidity has gone below the threshold. At, {{states('sensor.filament_humidity')}}"
      data:
        group: "GroupTag.ams_humdity_group"
        tag: "NotificationTag.ams_humditiy_tag"
        url: "/dashboard-home/management"
        actions:
          - action: "ACK_AMS_HUMIDITY"
            title: "Ack"
          - action: "SNOOZE_AMS_1_HOUR"
            title: "Snooze for 1 Hour"
          - action: "SNOOZE_FOR_1_DAY"
            title: "Snooze for 1 day"
      threshold:
        value: "sensor.filament_humidity"
        maximum: 15
        hysteresis: 5
      notifier: rich

When this alert fires I’m sent a notification that allows me to acknowledge the alert or snooze it for 1 hour or 1 day. Focusing only on an action being clicked: an event of type mobile_app_notification_action is sent into the event stream with the action equal to whatever you set, and that event then has to be read from the event stream and actioned. A quick example of pyscript notification action handlers:

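# pyscript handlers: each one runs when the Companion app reports the matching action.
@event_trigger("mobile_app_notification_action", "action == 'ACK_AMS_HUMIDITY'")
def ack_ams_humidity(**kwargs):
    # Ack the alert entity (alert2.<domain>_<name>).
    alert2.ack(entity_id="alert2.humidity_ams_humidity")

@event_trigger("mobile_app_notification_action", "action == 'SNOOZE_AMS_1_HOUR'")
def snooze_1_hour_ams_humidity(**kwargs):
    ...

# The action string matches the SNOOZE_FOR_1_DAY action in the alert config above.
@event_trigger("mobile_app_notification_action", "action == 'SNOOZE_FOR_1_DAY'")
def snooze_1_day_ams_humidity(**kwargs):
    ...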

Currently, for every alert I create that I want to be able to ack or snooze from a notification, I have to include details in the alert configuration that will largely be the same for all alerts, as well as implement the handling logic for them. I mind implementing the event handlers less, as I’m not sure this is something the integration can do, but having the actions added automatically would be ideal. I figure something like the following:

alert2:
  defaults:
    actions:
      - action: "SNOOZE_{{alert_name}}_1_hour"
        title: Snooze for 1 hour
      ...
  alerts:
    ...
      data:
        group: "GroupTag.ams_humdity_group"
        tag: "NotificationTag.ams_humditiy_tag"
        url: "/dashboard-home/management"

Where the “actions” are not overridden so the default would be added. Ideally, if alert2 could listen for and handle these that’d also be awesome but that may not be feasible as I’m not familiar with that side of Home Assistant.
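
The fallback described above could look roughly like the following Python sketch. Everything in it is hypothetical (Alert2 has no `apply_default_actions`), and the `{{alert_name}}` placeholder is filled in with a plain string replace rather than Home Assistant's template engine, purely to keep the example self-contained.

```python
import copy


def apply_default_actions(alert: dict, default_actions: list[dict]) -> dict:
    """Return a copy of `alert` whose data.actions falls back to the defaults."""
    merged = copy.deepcopy(alert)
    data = merged.setdefault("data", {})
    if "actions" not in data:
        # Not overridden by the alert, so add the defaults with the name filled in.
        data["actions"] = [
            {
                key: str(value).replace("{{alert_name}}", alert["name"])
                for key, value in action.items()
            }
            for action in default_actions
        ]
    return merged
```

An alert that defines its own actions keeps them untouched; only alerts without an actions list pick up the defaults.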

Another “nice to have” feature would be to show alerts where the condition has passed but we are waiting on the length of time to pass before the alert is actually triggered. IE: “Pending alerts”.
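
As a rough illustration of what "pending" means here, the sketch below derives a three-way phase from when the condition became true and the configured delay_on_secs. The function name `alert_phase` and the "off"/"pending"/"on" labels are assumptions for the example, not existing Alert2 states.

```python
from datetime import datetime, timedelta
from typing import Optional


def alert_phase(
    condition_true_since: Optional[datetime], now: datetime, delay_on_secs: int
) -> str:
    """Classify an alert as off, pending (waiting out the delay), or on."""
    if condition_true_since is None:
        return "off"  # condition is currently false
    if now - condition_true_since < timedelta(seconds=delay_on_secs):
        return "pending"  # condition holds, but the delay has not elapsed yet
    return "on"
```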

Some other thoughts on what has been mentioned in this post:

I really am excited for UI. Instant feedback is great and hopefully will open alerting to more people
I like the idea of a staged alerting/escalation policy for alerts to increase severity.
Making the alerts more UI friendly is also pretty exciting!

As always, I appreciate the work you’re doing!

EDIT:
I had a thought about how to handle the snoozing and acking of alerts. I’ll update here with the result of that. Likely to be some time later this week.

tman98 · January 1, 2025, 5:39pm

@woodersayer That’s awesome you have actionable alerts working. I agree I’d love those too for sure! Being able to ack/snooze/disable etc. from the alert itself on the phone is amazing. Seems like alert2 could handle the action returns for sure by listening to the event stream and just having a YAML configuration option for specific actions to take.

Interestingly, this seems to overlap remarkably with the concept of using external triggers and/or service calls to deactivate an alert (see the GitHub issue thread Trigger based alert doesn't fire · Issue #7 · redstone99/hass-alert2 · GitHub, which also raises a possible connection between stages/escalation and manual deactivation (or de-escalation) via a separate trigger and/or service call). Both would fall out of simply expanding the set of actions that can occur on a specific event/service call for an alert.

I think there’s a set of interrelated features here that are pointing to a particular architecture for ways to interact with a fired alert and I think there’s an elegant solution to get all of them through a good design/configuration.

Will revert as I noodle on it

tman98 · January 1, 2025, 5:41pm

Josh - thanks for the explanation on alert handling on restart, that makes sense. It does mean that on a crash there could be alerts that didn’t shut down properly. That may mean a cleanup call is necessary in the future, but in this remote Home Assistant example there’s some entity name prefixing that would have to be applied. I view it as a low-priority item; I just wanted to note it before closing out that topic.

tman98 · January 2, 2025, 7:31pm

I know there are a few interspersed conversations here, but as we’re tossing around product feature suggestions, I have some thoughts now on unavailable states to try to get everything written down.

I think, similar to the “availability” option in YAML templates (Template - Home Assistant), an availability option for Alerts would be valuable.

Nearly every one of my alerts is written like this at present:

    - domain: temperature
      generator_name: very_low_indoor_temperature
      generator: "{{ expand('sensor.very_low_building_temperature_alert_group')|map(attribute='entity_id')|list }}"
      name: "{{ genRaw }}_is_low"
      friendly_name: "{{ state_attr(genRaw,'friendly_name') }} is low temp"
      condition: "{{ states(genRaw) != 'unknown' and states(genRaw) != 'unavailable' and states(genRaw)|float < states('input_number.indoor_building_very_low_temperature_alert_value')|float }}"
      delay_on_secs: 600
      message: "Low temp of {{ states(genRaw)|float }}"
      reminder_frequency_mins: 30

Where I check for availability of the entity in the condition to prevent exceptions/failures of the alert (e.g. the cast to float).

Note that threshold alerts still fail though and throw exceptions as there’s no way to specify availability.

I then have a giant single generator that enumerates over all the groups I care about to check availability and report an error for unavailable entities:

    - domain: critical
      generator_name: entity_unavailable
      generator: "{% from 'entity_name_state.jinja' import expand_with_remote -%}{{ expand_with_remote(
        'sensor.freezer_temperature_alert_group' ~ ',' ~
        'sensor.fridge_temperature_alert_group' ~ ',' ~
        'sensor.indoor_building_temperature_alert_group' ~ ',' ~
        'sensor.very_low_building_temperature_alert_group' ~ ',' ~
        'binary_sensor.leak_detected_alert_group' ~ ',' ~
        'binary_sensor.smoke_detected_alert_group' ~ ',' ~
        'binary_sensor.carbon_monoxide_detected_alert_group' ~ ',' ~
        'binary_sensor.unexpected_occupancy_detected_alert_group' ~ ',' ~
        'switch.switch_availability_group' ~ ',' ~
        'binary_sensor.occupancy_detected_group') }}"
      name: "{{ genRaw }}_entity_unavailable"
      friendly_name: "Entity unavailable: {{ state_attr(genRaw,'friendly_name') }}"
      condition: "{{ states(genRaw) == 'undefined' or states(genRaw) == 'unavailable' or states(genRaw) == 'unknown' }}"
      delay_on_secs: 60
      message: "Entity state: {{ states(genRaw) }}"
      notifier: all_devices
      reminder_frequency_mins: 1440

where expand_with_remote is a template macro that is able to expand multiple groups and groups that come from remote home assistant and make one list.

It would be a lot better to combine these. This would clean up conditions by removing the necessity of checking for entity availability. It would also make threshold alerts protected against exceptions from unavailable entities.

I suggest adding an availability template option to an alert’s YAML at a minimum. If it evaluates to false, the alert will not fire.

I mentally went down the path of thinking maybe there should be something like an “availability_action” that would be “none” or “alert” and allow the alert to fire in some special case if it was unavailable (e.g. availability evaluated to ‘false’). But this leads to a whole series of implementation questions. What is the “state” of an unavailable alert? (Is there now a third state besides on/off? You can’t name it ‘unavailable’, as that’s already used by HA for the alert entity itself being unavailable.) How do you notify? (Do you use the alert’s main notifier, or some new special one, which then creates a whole bunch of new config options we wouldn’t want?) My feeling is that’s a thorny route, so I think if you use the availability template, you still need to create a separate alert to alert on unavailable entities should you desire to have those alerts. But at least the availability template would stop alerts from failing and throwing exceptions.
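
A minimal Python sketch of the gating idea, assuming the availability check runs before the condition is ever rendered. The names `evaluate_alert` and `check_low_temp` are hypothetical and only model the suggestion; they are not part of Alert2.

```python
from typing import Callable, Optional


def evaluate_alert(
    availability: Callable[[], bool], condition: Callable[[], bool]
) -> Optional[bool]:
    """Return None when unavailable; otherwise the condition result."""
    if not availability():
        # The condition is never evaluated, so a bad float cast cannot raise.
        return None
    return condition()


def check_low_temp(state: str, limit: float) -> Optional[bool]:
    """Low-temperature check that tolerates unknown/unavailable sensor states."""
    return evaluate_alert(
        availability=lambda: state not in ("unknown", "unavailable"),
        condition=lambda: float(state) < limit,
    )
```

The condition itself shrinks to the actual comparison, and the same gate would protect threshold alerts.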

Reference: Macro for Expanding Remote Home Assistant Groups Locally in Alert2

To provide my macro I’m using in the YAML above, if you’re using Remote Home Assistant and want to work with groups replicated from remote instances in integrations like Alert2, you’ll notice some unique challenges. Groups from remote instances are reflected locally as single entity sensors, with their member entities moved into the entities attribute. This happens because the entity IDs on the local instance are prefixed (e.g., remote_), rendering the original group invalid locally.

To allow for expansion of these groups into their individual entities (with the correct local prefixes), I created a Jinja2 macro. This macro:

Iterates through the entities in a local group and the entities attribute of each remote group.
Adds the appropriate remote prefix to each remote entity ID.
Expands nested groups (both local and remote) to resolve all endpoint entity IDs.

This means you can input a list of groups (local or remote, including nested ones), and the macro will return a fully expanded list of local entity IDs reflecting the original remote group members.

Macro Usage:

Use the macro as shown in the example below, passing a comma-separated string of group entity IDs to expand.

Input: A comma-delimited list of groups.
Output: A complete list of expanded entity IDs, with correct prefixes applied to reflect the local instance.

Example:

{{ expand_with_remote('group.local_group,group.remote_group') }}

Macro:

{%- macro expand_with_remote(entities) -%}
  {%- set ns = namespace(b1 = false) -%}
  [
  {%- for entity in entities.split(',') -%}
    {%- set entity_list = expand(entity) -%}
    {%- for o in entity_list -%}
      {%- if 'entity_id' in o.attributes %}
        {%- set prefix = (o.entity_id.split('.')[1]).split('_')[0] -%}
        {%- for o2 in o.attributes.entity_id -%}
          {%- if ns.b1 -%}
            ,
          {%- endif -%}
          {%- set ns.b1 = true -%}
          '{{ o2.split('.')[0] }}.{{ prefix }}_{{ o2.split('.')[1] }}'
        {%- endfor -%}
      {%- else -%}
        {%- if ns.b1 -%}
          ,
        {%- endif -%}
        {%- set ns.b1 = true -%}
        '{{- o.entity_id -}}'
      {%- endif -%}
    {%- endfor -%}
  {%- endfor -%}
  ]
{%- endmacro -%}
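
For readers who find the Jinja hard to follow, here is the same logic as a plain Python sketch. It is only a model of the macro: the `expanded` argument stands in for the output of HA's expand() as (entity_id, attributes) pairs, which is an assumption made so the example runs without Home Assistant.

```python
def expand_with_remote(expanded: list[tuple[str, dict]]) -> list[str]:
    """Mirror of the macro: prefix members of remote groups, pass others through."""
    result = []
    for entity_id, attributes in expanded:
        members = attributes.get("entity_id")
        if members is None:
            # Plain local entity: keep its ID unchanged.
            result.append(entity_id)
            continue
        # Remote group mirrored as a single sensor. As in the macro, the prefix
        # is the part of the object ID before the first underscore.
        prefix = entity_id.split(".")[1].split("_")[0]
        for member in members:
            domain, object_id = member.split(".", 1)
            result.append(f"{domain}.{prefix}_{object_id}")
    return result
```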

redstone99 · January 2, 2025, 8:31pm

@woodersayer - good thoughts and suggestions. Adding a default for data makes sense, as does showing pending alerts. For pending alerts, I’d probably add an attribute to the alert indicating the pending state, and then a config switch in the UI to show the pending ones or not.

Making it easier to have alert2 management actions in notifications sounds like a good idea as well. Makes sense to me to have Alert2 watch the event stream for some set of predefined events. Do you have suggestions for what a good set might be? The action specs can’t take a parameter can they? Like instead of hard-coding SNOOZE_FOR_1_DAY, could you have the action be SNOOZE_FOR with a time parameter? I’m not super familiar with actions.

@tman98, I’m glad you’re giving the unknown/unavail issue some thought! I agree that triggering secondary alerts/notifications from within an alert might be rather complex/problematic. Also, what to do about alerts that reference multiple entities, any one of which might be unavailable. Some possibilities:

Add a function/filter to make it easier to detect entity availability issues in templates.
Add a generator function that expands to the list of all entities referenced in alert templates. Might make it easier to set up alerting on unavailable entities.
Expose a variable last_value to threshold value templates to make it easier for them to specify policies to handle unavailable entities (e.g., fallback to the previous value).

I’m coming around more to staged alerts. Here are my recent thoughts on them:

Expressing staging (or “circumstances”) is about notifications and display of the alert, not so much about whether it’s firing or not. So it makes sense to me to be able to alter any notification/display parameters of an alert in each stage.
My inclination would be to declare that stages are mutually exclusive. You can’t be in two stages at the same time.
Stages are ordered. The ordering may be related to display format/priority, but that’s not required.
Stages are entered via some trigger/condition specification. Optionally, when an alert changes stage (or just advances in stage), the reminder/snooze status is reset. It’d be as if the alert just started firing after being off.
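
One way to picture these four properties is the Python sketch below. It is a thought experiment, not a design commitment: the `Stage` and `StagedAlert` classes, the per-stage `notifier` field, and the rule that the highest matching stage wins are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Stage:
    name: str
    entered_when: Callable[[float], bool]  # trigger/condition for this stage
    notifier: str                          # example of a per-stage notification setting


@dataclass
class StagedAlert:
    stages: list[Stage]                    # ordered, lowest stage first
    current: Optional[Stage] = None
    reminders_sent: int = 0

    def update(self, value: float) -> Optional[Stage]:
        # Stages are mutually exclusive: pick the last stage whose condition holds.
        matching = [stage for stage in self.stages if stage.entered_when(value)]
        new_stage = matching[-1] if matching else None
        if new_stage is not self.current:
            self.current = new_stage
            self.reminders_sent = 0        # as if the alert had just started firing
        return self.current
```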

Oh and relatedly, it makes sense to me to add a display_message template parameter that is shown in the UI below the alert name. That can also be changed in stages.

EDIT: @tman98 - did you mention that Alert2 loses the last few minutes of alert history after HA restart? I was just looking at the state restore code as part of the UI work and it looks like it should write out most state when HA shuts down. I can test and see what’s up, but wondered if you remembered what tended to get lost on restart.

-Josh

cerebrate · January 4, 2025, 7:59pm

Hello over here, @redstone99 !

I thought I’d post here on my general list of use cases and desires for future directions, rather than flood a whole bunch of feature request issues into the GitHub, and to let everyone comment.

Right now, I’ve just moved over to Alert2 from HA’s built in Alert, given that the latter has been frozen and may be deprecated (per here). And, let me say up front, I’m already pleased with the extra capabilities I’m getting from it, but it does mean that I’ve got a few feature requests based on making sure I can replicate some of the behaviors of Alert that we’ve got accustomed to here:

Alerts that don’t notify immediately (i.e., like the old skip_first option); this is, for example, like door and window open alerts where I don’t want an announcement every time someone goes in and out, but (in cases where the door is deliberately left open) I would like to be able to ack the alert immediately, meaning it should fire straight away. (This is the one I have a GitHub issue for.)
I’d like to have the option to hear the stop notification for acked condition alerts.

(To turn this into a use case, if I leave the door open deliberately, I ack the alert so I won’t be reminded. But if someone else closes the door, I’d like to know when that happens. Likewise, if I ack an overheating alert, I’d like to know if the situation resolves itself so I can stop trying to resolve it manually. That sort of case.)

This may actually be a request to have alerts unack when they go on->off, rather than off->on, although that would only apply to condition alerts and not event alerts. I recognize that that would be a possibly breaking change, so maybe as an option?

I have a couple of UI requests too. The first one is to make it easier to differentiate active and inactive (condition) alerts. I think one good step for that would be to apply an active color to active alert icons, similar to the state color in entity cards, and/or gray them out. Another one would be to sort “off” condition alerts into a separate section, as is currently done for acked, snoozed, and disabled alerts, and sort them to the very bottom.

I’m not sure how to best handle event alerts in this respect, though? (Unless the alert2.manual_turn_off action suggested here is implemented, of course.)

The other is that alongside the current custom card, which works well I find as a sort of “Master Alarm” panel, it would be good to have a custom alert2-row that works like multiple-entity-row; basically, providing the special functionality of the alert row displayed in the custom card in a package that could be dropped into any entities card, or auto-entities card.

This would make it a lot easier to set up, say, room-specific alert cards that provide all the alert functionality. I’ve got a couple of dozen pages in my Home Assistant UI, so it’d be handy to not need to flip back and forth to manage alerts.

So, y’know. Just a few small things.

Thanks for the great integration!

tman98 · January 4, 2025, 8:41pm

We’re sort of spanning here and two GitHub issues (Trigger based alert doesn't fire · Issue #7 · redstone99/hass-alert2 · GitHub and [feature] Allow alerts to fire but not notify on first firing · Issue #5 · redstone99/hass-alert2 · GitHub). I think this thread is the best place to discuss core architectural topics in order to keep coordinated in one organized location.

I tend to agree with @DonLuigi on Trigger based alert doesn't fire · Issue #7 · redstone99/hass-alert2 · GitHub (restating some of his thoughts here):

I think it’s valuable to step back to determine, What is an alert2 alert in the abstract? As @DonLuigi said, it is really just tracking that “something” has occurred that we care about (alert fired); then that that something may still be occurring (alert active/on); and finally that that something has ended (alert off and completion fired).

That abstraction is generic and usable across any of the ways of activating and deactivating alerts.

From a concept perspective:

An alert could be activated by an event, condition, service call, or something else in the future.
An alert can either stop immediately (single shot fired, which is the subset @DonLuigi described for an event notification alert, which in an even more generic abstraction is just an “immediate deactivation”) or stay active until explicitly deactivated.
Then an alert can be deactivated via any number of mechanisms, e.g. a condition going false, an event, a service call, a cancellation by the user (e.g. button or actionable notification), etc.

But it’s all generically the same concept and I believe should be a single implementation as well. That creates a simpler and smaller code base with fewer lines of code and fewer tests that achieve more coverage.
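
As a sketch of what that single implementation might boil down to, the Python below models one alert class where the activation and deactivation sources are just labels and a one-shot alert is an alert that deactivates immediately. `GenericAlert` and its fields are illustrative names, not Alert2 internals.

```python
from dataclasses import dataclass, field


@dataclass
class GenericAlert:
    name: str
    one_shot: bool = False  # True means "immediate deactivation" after firing
    active: bool = False
    log: list[str] = field(default_factory=list)

    def activate(self, source: str) -> None:
        """Fire the alert; `source` could be an event, condition, service call, etc."""
        if self.active:
            return
        self.active = True
        self.log.append(f"fired via {source}")
        if self.one_shot:
            self.deactivate("immediate deactivation")

    def deactivate(self, source: str) -> None:
        """End the alert; `source` could be a condition, event, user action, etc."""
        if not self.active:
            return
        self.active = False
        self.log.append(f"ended via {source}")
```

Event alerts, condition alerts, and latched alerts then differ only in which sources are wired to activate() and deactivate().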

One small nuance is that today the state of a single event-based one-shot alert (immediate deactivation) is the time it fired, whereas the state of an “on/off” alert is its on/off state. Not sure if that semantic is critical or not, but it could be implemented with a single special case if we believe it is an important differentiation.

I do see strong value from a usability perspective in creating certain alert types/classes for configuration purposes (making it easier to write, read, and debug configuration YAML), for example:

event based alert (takes only an event and/or condition to be true for firing a one-shot alert with immediate deactivation)
condition alert (takes only a condition)
latched alert (event/service call with optional condition on and event/service call with (a separate most likely) condition off)

Other alert types can then be created in the future strictly in the configuration code section of the codebase, setting the appropriate activation/deactivation states of the generic implementation.

This also allows staging to then be built as a second abstraction layer over the alert, and I believe the above will simplify that code as well.

I’m not convinced staging is strictly a notification concept yet. I think there is a potential refinement of the alert state that is useful, but again an abstraction on top of the alert. For example, one set of stages might be to increase the size of the notification group or go from normal notifications to critical, or potentially send SMS messages, if the alert stays active for too long. This would be notification-level staging.

But another might be to increase the notification criticality if the condition gets worse. Let’s say we have a “fridge is getting cold” alert (e.g. under 37 degrees F). If it goes under freezing we want to escalate the notification, but a “substate” of the alert is also now true (fridge is now freezing). I think there’s value in tracking a further state of the alert. This “substate” change could possibly be captured as just a refinement of the message and display_message as you said @redstone99, but I wanted to note that staging may be based on time, state changes, etc. and is worth tracking. State changes should certainly go into history as well.

redstone99 · January 5, 2025, 1:54pm

@tman98, great thoughts. I agree with I think all of it. Yeah, I’m not sure how to best handle event alerts (that immediately deactivate), both architecturally as well as in the UI.

@cerebrate - thanks for the ideas! Getting an “off” notification even after having ack’d the alert is an interesting feature. I wonder if it’s a form of staged alert, but probably simpler just to have a flag to specify that acking applies only to reminders, not the “off” notification.

I’m accumulating all the feature requests in a list. I can put it online if there’s a good place. I should add that there are a lot of them. I expect that the rate and scope of new feature requests will start to level out as the project matures. Coding help welcome!

-Josh

cerebrate · January 5, 2025, 7:07pm

For myself, the current semantics make sense to me insofar as one-shot alerts behave like HA button/input_button objects, also one-shots, whereas condition alerts (and presumably latched event alerts) behave like switch/input_boolean objects, state-wise, which matches up intuitively.

I have also been thinking about staged alerts, and wondering if it might be a good idea to separate the notions behind them into two concepts which seem to me to have different semantics: call them severity and sub-alerts, perhaps.

Severity, as I see it, scales linearly, and every stage is a strict subset of the one before it. It’s possible to either advance to a new stage or revert to one before it (although this latter capability may not be used). Possibilities include:

“disk 80% full”, “disk 90% full”, “disk 95% full”, “out of disk space”
“overheating (5°)”, “overheating (10°)”, “overheating (critical)”
“alert”, “alert for > 15 minutes”, “alert for > 1 hour”

and so forth, where each one implies the previous is, or at least was, true.

Whereas sub-alerts I’m conceptualizing as alerts that are relevant because the primary alert is true, but which aren’t necessarily related to each other, and can change state independently.

So here, to use my standard example, we have:

“door is left open”

“door is left open AND it’s cold outside”
“door is left open AND it’s nighttime”
“door is left open AND activity is detected outside”

where the sub-alerts have different and unrelated conditions, but are all conditioned on the parent alert being “on”.

The former seems to be where staging is converging per here, but the latter seems like valuable functionality for the sort of use we talked about on GitHub here.

And conveniently, this latter is attainable right now by defining a condition alert whose conditions include a reference to the “parent” alert as well as its “actual” condition, so as a feature request it would just imply some syntactic sugar in the YAML and a cleaner display of them in the UI.
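
Expressed as logic rather than YAML, the sub-alert relationship is just an AND with the parent. The Python sketch below uses a hypothetical helper, `sub_alert_states`, to show that each sub-alert changes independently but can only be on while the parent is on.

```python
def sub_alert_states(parent_on: bool, sub_conditions: dict[str, bool]) -> dict[str, bool]:
    """A sub-alert is on only while the parent is on AND its own condition holds."""
    return {name: parent_on and holds for name, holds in sub_conditions.items()}
```

For example, with the door open, "it's cold outside" and "it's nighttime" can be on or off independently; once the door closes, all of them drop to off together.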

(Sorry if I’ve been recapitulating the already known, but I wanted to get this clear in my own head.)

…that was where I was going to leave it, but I stopped to get a coffee and now I have a new feature request or pair of them to add, courtesy of my lovely wife.

They’re both about snoozing, and partly my fault because I’d implemented a “snooze” feature for some of our alerts manually back under Alert.

The first of them is just that the name of the snooze feature is confusing if you haven’t read the Alert2 docs, because the snooze everyone’s familiar with is the alarm-clock kind, and that kind of snooze starts sounding the alarm again when it expires. Whereas the Alert2 snooze doesn’t; the snoozed alert stays acked when the snooze expires, and so the notifications for an already-on alert don’t come back.

So, maybe that could use a rename to make its semantics more obvious to HA-users who aren’t also the HA-implementors?

The other one is for a notification snooze that would have identical semantics to an alarm-snooze, i.e., act like acking the alert and then automatically unacking it again when the snooze period expires?

(This latter is effectively what I set up for our “Clean Laundry Ready” alert using a timer and a couple of automations, so that she could flip the “Busy now; remind me in a couple of hours” switch and defer notifications for that long. This is of course just as easy to implement now with the ack and unack actions, which I’m going to do right now, but it seems like a generalizable enough request that it’s worth mentioning as a possible feature.)
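
The alarm-clock semantics requested here can be sketched in a few lines of Python. The class name `NotificationSnooze` and its methods are hypothetical; the point is only that expiry of the snooze automatically clears the ack-like state, so reminders resume.

```python
from datetime import datetime, timedelta
from typing import Optional


class NotificationSnooze:
    """Alarm-clock style snooze: reminders resume by themselves on expiry."""

    def __init__(self) -> None:
        self.snoozed_until: Optional[datetime] = None

    def snooze(self, now: datetime, duration: timedelta) -> None:
        # Behaves like acking the alert, but with an expiry time attached.
        self.snoozed_until = now + duration

    def reminders_allowed(self, now: datetime) -> bool:
        if self.snoozed_until is not None and now >= self.snoozed_until:
            self.snoozed_until = None  # snooze expired: automatic "unack"
        return self.snoozed_until is None
```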

woodersayer · January 6, 2025, 5:05am

Assuming the path taken is that Alert2 is listening to the event stream AND sending the notifications, the user could simply configure the snooze length when creating the “data” field. A rough idea could be as follows:

alert2:
  defaults:
    ack_on_notification_click: true     # This can be done by setting the HA URL to
    actions:
      - action: snooze
        length: 1h
      - action: snooze
        length: 1w
  alerts:
    - domain: humidity
      name: bin_humidity
      ...
      data:
        group: "GroupTag.ams_humdity_group"
        tag: "NotificationTag.ams_humditiy_tag"
        url: "/dashboard-home/management"

In this above example:

The unique action identifier could be built programmatically for each alert as we know each alert will have a unique domain and name (I believe this is an existing requirement but I may be mistaken).
We would also know the default lengths of time the user wants
This still leaves the user in control of determining what works best for them.
This also makes it more easily extensible in the future as this would abstract away the action generation from the user

So the “actions” that get appended to the notifier service call could be converted into something like this:
humidity.bin_humidity|snooze_1_week
Where the pipe represents some delimiter to make it easier on the integration side to determine which part is the name of the alert and what is the actual action selected by the user.
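
A minimal Python sketch of that round trip, assuming "|" as the delimiter and assuming alert entities are named alert2.<domain>_<name> (as in the alert2.humidity_ams_humidity example earlier in the thread). The helper names are made up for illustration.

```python
DELIMITER = "|"


def build_action_id(domain: str, name: str, action: str) -> str:
    """Build the action string attached to the outgoing notification."""
    return f"{domain}.{name}{DELIMITER}{action}"


def parse_action_id(action_id: str) -> tuple[str, str]:
    """Split an incoming action into (alert entity ID, action name)."""
    alert_part, sep, action = action_id.partition(DELIMITER)
    if not sep or "." not in alert_part or not action:
        # Not one of the generated IDs, e.g. a user's own hand-written action.
        raise ValueError(f"not a generated alert action: {action_id!r}")
    domain, name = alert_part.split(".", 1)
    return f"alert2.{domain}_{name}", action
```

Actions that do not match the pattern raise, so the listener can ignore events that belong to the user's own automations.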

This enters the event stream, the integration would listen to mobile_app_notification_action events for the interaction from the HA Companion app and then handle any which has action == <some action we expect to see from an alert notification>. This would seem to be the easiest to implement. That being said, arbitrary input is supported by the HA Companion apps and, for what it’s worth, I’m coming at this from a user perspective as I’ve not read too far into the development docs of HA. Here is a link to the docs talking about how automations are intended to be used to handle this. To quote these docs, this is what a mobile_app_notification_action event looks like:

{
    "event_type": "mobile_app_notification_action",
    "data": {
        "action": "OPEN_<context_id_here>",
        // will be present:
        // - when `REPLY` is used as the action identifier
        // - when `behavior` is set to `textInput`
        "reply_text": "Reply from user",
        // iOS-only, will be included if sent in the notification
        "action_data": {
          "entity_id": "light.test",
          "my_custom_data": "foo_bar"
        },
        // Android users can also expect to see all data fields sent with the notification in this response such as the "tag"
        "tag": "TEST"
    },
    "origin": "REMOTE",
    "time_fired": "2020-02-02T04:45:05.550251+00:00",
    "context": {
        "id": "abc123",
        "parent_id": null,
        "user_id": "123abc"
    }
}

In the above example, if REPLY is one of the actions and that is what this event is, then reply_text carries whatever the user typed. That is to say, arbitrary input is supported for alerts and the data is pretty easy to get into the alert, but the problem now becomes parsing the input from the user. One could allow pretty non-specific snoozing (an untested regex could look like this: [1-9][0-9]*[smhdwy]); to let the user be more precise, allow them to mix these values (IE: 1w3dh); or require that the user enter a parseable date format that is included in the alert (IE: YYYY-MM-DDTHH:MM).
There will be errors because this is user input, so I see a couple of ways of handling a reply that contains an unparseable value:

Send the user notifications until they get it right (potentially eating into their 500 daily allotment from Nabu Casa)
Send them a singular follow up notification with a URL that the user can specify which takes them to a page where they can handle snooze/disabling/acking via the UI.
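
To give a feel for the parsing side, here is a hedged Python sketch of a snooze-length parser for replies like 1h or 1w3d. It uses [1-9][0-9]* so that single-digit values are accepted, and it assumes m means minutes and y means 365 days; `parse_snooze` is an invented name, not something the Companion app or Alert2 provides.

```python
import re
from datetime import timedelta

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}
TOKEN = re.compile(r"([1-9][0-9]*)([smhdwy])")


def parse_snooze(reply_text: str) -> timedelta:
    """Parse replies such as '1h' or '1w3d'; raise ValueError on anything else."""
    text = reply_text.strip().lower()
    tokens = TOKEN.findall(text)
    # Rebuilding the string from the tokens rejects leftovers such as "1w3dh".
    if not tokens or "".join(number + unit for number, unit in tokens) != text:
        raise ValueError(f"cannot parse snooze length {reply_text!r}")
    return timedelta(
        seconds=sum(int(number) * UNIT_SECONDS[unit] for number, unit in tokens)
    )
```

The ValueError is the hook for either of the two follow-up strategies listed above.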

While I could handle arbitrary input on my own, I’ve made the decision not to, as I find that for all of my alerts so far I’ve not needed anything really different than acking, snoozing for an hour, or snoozing for a day. If I want anything more, I can go to the UI to be more exact, which I can do rather easily by including a URL as part of the notification definition.

tl;dr yes, actions do support arbitrary input and I intentionally don’t use them as I’ve found what appears to be a sweet spot for me (1 hour and 1 day) but I would love to avoid repeated configuration. Let me know if the above was worth your time and/or you have anything else I can help answer

A quick aside, which I learned while answering you (link): apparently you can call services with the URL handler, so the default action when clicking a URL could be acking it. Neat!

tman98 · January 6, 2025, 4:20pm

Wow great thoughts both! I think you’re on to some clever abstractions… going to keep thinking.

Just as a side note, for what it’s worth, I’m thoroughly enjoying this component’s budding community. Some really clever minds are getting involved here (@cerebrate and @woodersayer) on top of an open-minded and really great architect and author in @redstone99. Great ideas are being formed. It’s not always the case in some of the HA communities that discussions are so constructive and focused principally on building great tech (instead of defending turf), so it’s a very enjoyable breath of fresh air. I love opening my browser each day to see if there’s a new message! Thanks all.

tman98 · January 8, 2025, 3:26pm

@cerebrate I think you’ve nailed the two concepts of staging and “sub-alerts”. Both have a lot of value and are well described. Also, your skip_first use case makes sense to me (I’m also trying to think if there’s any broader abstraction there but haven’t found one yet).

Been thinking about alerting more in relation to everything we’re doing and my own use cases.

Providing the ability to call an arbitrary set of services when a condition/stage occurs, and not just a notifier, is valuable. For example, if I get a leak alert, I want to shut off my water valves. If I get a fire alert, I want to shut down boilers. And I actually think calling the services each reminder interval is fine (to make sure the valves are off, for example); once I don’t care about the alert redoing things, I can ack it. I could also see an option to call services only when the alert fires the first time.

Equally, there could be optional services to fire when the alert is over. In a leak condition I might want to keep the valve off until I figure out the problem, but if there was just temporary smoke that cleared I may want to just fire up the boilers automatically because I don’t want the buildings to freeze.

Again, this would be wonderfully simple yet materially powerful to link to alerts because we already have delays to start, hysteresis, etc. And with staging, I could have a first stage notification that a problem exists, then a second stage action to shut down the valves and/or boilers. That is big.

Doing the above with automations alone would actually be pretty hard (notably the staging), and then we’d have action-taking in the system disjoint from alerting to the user, which means you can’t know for certain whether an action has been taken when you see an alert (you have to know the underlying automation code vs. alert code). For my use cases, having them exactly aligned would be best: for example, a staged message can be “Smoke has progressed, shutting down boilers”, and it reduces the emergency-condition code in my system to being in one place: alert checks fire notifications and take actions.

EDIT: Also @redstone99, with logging, could you ensure there is a detailed INFO level log statement for each alert creation and deletion, as well as each activation, deactivation and notification (and unified across generator and not)? I’ve found some inconsistencies with what is at debug vs. info, and the problem with the debug level is that every condition and trigger check gets logged, which spams the logs when I’m not trying to debug at that level. I actually think that’s a fine set of info to log at debug, as when you really need to figure out why an alert is not firing, you want that detail. But I want to be able to confidently go into the log when it’s at a normal info level and check the major events related to alerts, like making sure they were created in the first place, fired, deactivated, etc. Each should have a full set of info to be able to go back and diagnose if something went wrong, given the mission criticalness of the component. Thanks!!

EDIT2: As I’ve expanded my usage of Alert2, I’ve found more and more use cases. I now have it sending alerts for nice-to-know conditions (e.g. I’ve started to warm up a building, tell me when it’s gotten warm enough) vs. emergencies (smoke alarm or leak). A way to designate severity on an alert and in the UI would be great. I know you are doing UI work and we referenced some form of severity in the past, so I’m just re-raising it now as you’re doing the work.

redstone99 · January 8, 2025, 9:44pm

Hi All, I’m sortta on vacation for the next week so may be flakier than usual responding.

I like singling out “severity” as a common use case.

“sub alerts” feels close, but at least in the door-open use case, there is a severity aspect to it. The door-open-and-its-cold seems more urgent than just door-open. But agree that there’s not any apparent severity relationship between door-open-cold and door-open-nighttime. Not sure how to best capture that.

Actually, to that point, does anyone have an example of “sub alerts” besides the “door open” case? Having some other examples might make it easier to extract common patterns/abstractions.

@cerebrate, roger on snooze. Added to list.
@woodersayer, I like the idea of being able to take alert actions from the companion app in a handy way. I think I need to play around with what you’ve described to get a feel for it first.
@tman98, roger on cleaning up the info/debug so there are consistent alert lifecycle messages in logs. Great idea. Added to list.

And having alerting support remediation (e.g., turn off water valve) is interesting. Definitely some thinking to be done about how to best support.

From my standpoint, my plan is roughly:

get the UI alert editing ability to some sort of alpha release
Do a clean-up pass on Alert2 and pick off some of the easy logging and other simple feature requests.
maybe next support latched alerts. I’m currently thinking maybe we introduce trigger_off and condition_off and manual_off that determine when an alert turns off. So if an alert just specifies trigger, it’s a momentary alert. If it specifies only condition, it’s a traditional condition alert. And if it specifies one of the new fields, it’s a latched one. Something like that.
Cheers,
Josh
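
For readers following along, the proposed rule for telling the alert kinds apart can be sketched in a few lines of Python. This is only an illustration of the idea in the post above, not Alert2 code: trigger_off, condition_off and manual_off are the field names floated in the post, while the function alert_kind and the "unspecified" fallback for combinations the post does not cover are assumptions.

```python
# Field names proposed in the post above; they are not an existing Alert2 API.
OFF_FIELDS = ("trigger_off", "condition_off", "manual_off")


def alert_kind(config: dict) -> str:
    """Classify an alert definition by which fields it specifies."""
    if any(name in config for name in OFF_FIELDS):
        return "latched"  # an explicit "off" rule means the alert stays on until it is met
    has_trigger = "trigger" in config
    has_condition = "condition" in config
    if has_trigger and not has_condition:
        return "momentary"  # fires once per trigger, no on/off state
    if has_condition and not has_trigger:
        return "condition"  # traditional alert: on while the condition holds
    return "unspecified"  # combination not covered by the proposal (assumption)
```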

cerebrate · January 8, 2025, 10:52pm

Another variation on the theme that occurred to me is alerts that are relevant either when they run for a long time, or when they’re short but chronic. I use my HA instance for server monitoring, too, so the first examples that spring to mind are memory/CPU usage alerts, where sustained high level use is worth a notification, but short high level bursts aren’t - but it’s still handy to fire the alert, because the alerts/time is statistically useful.

I’ve been thinking about this, and I’m not sure that it’s something that would need implementation in Alert2 itself, except when it comes to making sure that implementations of staging do the needful to support it. (Unless you specifically want to call the service every reminder-time, at least.)

I say this largely because I’ve already, in some cases, got automations set to trigger when an alert goes off->on and on->off (to avoid having to duplicate the conditions of the alert in the automation, and it also includes the delays to start, hysteresis, etc. of the alert), and I’m not seeing much of a win from just moving the link between the alert and the automation from out of the latter and into the former.

(Where staging is concerned, to make this easy, this would mean implementing staged alerts to not be off->on->off, but rather be off->stage1->stage2->stagen->off, so that the stage would be visible in the state, but that doesn’t break backcompat so shouldn’t be an issue, I think?)

Am I missing something here?

That may just be an unfortunate quirk of my calling it severity, because you’re right. Maybe a better way to look at the distinction is between “non-dependent” and “dependent” cases, where one set, the one we’ve been calling severity, is alerts that are added to by the same conditions, and the other is alerts that are added to by an outside factor.

So non-dependents would be the “X open longer”, “X got hotter”, “X space decreased”, etc., type of alert, and dependents are all the ones best phrased as “This… AND ALSO that.”

(There’s a little subjectivity in here. One could argue reasonably that a second smoke detector going off after the first constitutes “fire got worse” or “This is on fire… AND ALSO that’s on fire now”, but I think that sort of decision might be best left up to the person setting up the alerts and how they want to handle that sort of case.)

Thinking about severity:

Being able to specify a severity or priority on alerts (as just a simple integer, I suggest, higher being more) would be good for UI purposes anyway, for example, being able to sort displayed alerts by their severity.

(The fancy version of this would take from the gauge card UI and let us specify alerts to be displayed in amber, red, etc., as the severity got higher.)

And then that could integrate with both severity/non-dependent/staged alerts and sub-alert/dependent alerts. I would propose that for the former, a different severity can be assigned to each stage, so they can bubble up as they move through their stages.

And for the latter, perhaps a sub-alert going off gives its severity to the parent alert, if that alert’s effective severity is lower? So, as an example, if we declare three alerts, door_open (severity 10), door_open_after_nightfall (severity 30), and door_open_while_cold (severity 50), and door_open is already on, when door_open_while_cold goes on, it promotes door_open to severity 50 and it starts showing up at the top of the panel along with the sub-alert beneath it.

So, basically, the effective severity of an alert is the highest of its own and that of any sub-alerts it has which are currently on.
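
A minimal Python sketch of that rule, using the three door alerts from the example. The Alert class and its field names are hypothetical and only model the idea; nothing here reflects how Alert2 stores alerts.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alert:
    # Hypothetical model of an alert for illustration only.
    name: str
    severity: int
    is_on: bool = False
    sub_alerts: List["Alert"] = field(default_factory=list)


def effective_severity(alert: Alert) -> int:
    """Highest of the alert's own severity and that of any sub-alerts currently on."""
    active = [effective_severity(sub) for sub in alert.sub_alerts if sub.is_on]
    return max([alert.severity, *active])


nightfall = Alert("door_open_after_nightfall", severity=30)
cold = Alert("door_open_while_cold", severity=50)
door_open = Alert("door_open", severity=10, is_on=True, sub_alerts=[nightfall, cold])

before = effective_severity(door_open)  # no sub-alert is on yet
cold.is_on = True
after = effective_severity(door_open)   # promoted by door_open_while_cold
```

Sorting the displayed alerts by effective_severity, highest first, would then give the "bubbles to the top of the panel" behaviour described above.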

Well, I just implemented the equivalent of “Pool Pump Power Low” plus “Pool Pump Power Low AND ALSO Outside Temperature Below Freezing”. There will be some more subalerts going onto the former once depth and pressure sensors have been added to the pool automation.

I’m also in the middle of rewriting some purely-advisory logic about inside-outside temperature differences that will look at the state of the house and advise on possible ways to help out the HVAC, which will have “Significantly Cooler Outside” alerts for summer, with sub-alerts like “You Could Open This Window” and “You Could Turn On The Whole House Fan”, and likewise in winter with “Significantly Warmer Outside” and “You Could Close This Window”, etc.

I also have some loose thoughts on implementing some rather complicated sets of them that are basically my first few steps in debugging network issues hereabouts, but those are still a bit too vague to go into details about.

All sounds good to me. Enjoy your vacation!

cerebrate · January 8, 2025, 11:00pm

Added:

While I don’t think it’s all of them, I think "here is a situation (primary alert), and here are potential problems with it (sub-alerts)"¹ and "here is a problem (primary alert), and here are possible mitigations for it (sub-alerts)"² are likely to be two very common sub-alert usage patterns.

¹ The door-open scenario, for example.
² The HVAC scenario, for example.

tman98 · January 9, 2025, 2:55am

Good stuff @cerebrate I’ll think about all that!

Interesting how you are using automations to trigger off of the alert. I hear you in that if the alert had enough to make triggering an automation simple code to write for each desirable case, that might be sufficient. There is a difference between an indirect association between alerts and actions that might occur when they arise (triggered automations) vs. explicit associations in the form of scripts. I think the latter is easier from a usability standpoint, as the logic is all together in the alert YAML and not subject to automation trigger code correctness. In the end I’m not sure if we have to actually force one or the other on users - if you want to use automations do that but adding service calls to Alert2 would be trivial (after all a notification call is just a special cased service call).

And @redstone99 enjoy your vacation - well deserved.

cerebrate · January 9, 2025, 5:31am

Hm, yeah, I can see how it would be desirable from a clarity point of view.

Are you thinking about putting the entire script in the alert configuration (which “the logic is all together in the alert YAML” might imply), or just the action to run the script?

(The latter would seem to me to save on adding redundant complexity to Alert2, at the cost of having the script code elsewhere, but would also let you use a single script to handle multiple alerts by passing parameters to it from the alert. Since I think most people are likely to have multiple similar alerts, this seems like a win to me, especially from the maintenance PoV.)

tman98 · January 11, 2025, 2:47pm

Thanks for asking for the clarification. I definitely was intending the latter: that Alert2 can make generic service calls, of which an (optionally parameterized) script is one possibility. For all the reasons you stated, that is the right abstraction to me. I would not suggest supporting arbitrary YAML action logic in the Alert2 configuration, just n service calls (so it could be a script call or a simple service call to turn a valve off).

EDIT: I have actually started flipping the problem (for now). When I have critical causal relationships that I want to both alert on and act on, I’m creating an automation that both takes action and performs an alert2.report_event call, so at least there aren’t two separate code paths for action and alerting. That works for now, e.g., if a leak is detected I shut off my water valve and fire an alert2.report_event. Given the severity of that condition, I also have a normal Alert2 condition alert on the leak. But if that happens I really don’t mind my phone blowing up.

redstone99 · January 16, 2025, 7:10pm

Hi all - small update on my end. I have the UI code mostly working to create / edit alerts in the UI. Now trying to write some basic testing, which is a bit complex because it involves interactions between the browser and the backend. I was faking out most of HA in my unittests up until now, but I think I need to switch to a better test framework, probably pytest-homeassistant-custom-component. Figuring out how that stuff works.
Cheers,
Josh