Triggering Pushover notifications with voice commands

Hello! I recently configured an ESP32-S3 with ChatGPT (working fine) and installed the Pushover integration, which tested successfully. I learned how to send notifications to myself via automations, but all attempts to ask the voice assistant to send me a notification via Pushover failed. The assistant literally told me it can’t use Pushover for me. Is there any way, including workarounds (with automations?), to achieve this? My real-life usage objective would be something like “please send me a recipe for a chocolate cake as a notification via Pushover”. Thank you.

Use automations and sentences. You can get it to do anything you want.

If you’re talking about Assist, you have to write the automation. It won’t do it simply because the integration exists in your installation.

Basic intents for on and off, control of covers, and many other things are built into HA by default, but you need to write the intent script or automation that recognizes when you’re asking it to do something beyond that and, once it recognizes it, how to execute the ask. (That includes some things you’d expect to be built in, like input_select, input_text and input_number. I’m looking at you…)

So now that you’ve learned how to send notifications, you get to write a script that does so for the LLM.
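
For example (untested sketch), a sentence-triggered automation that sends a fixed Pushover message could look something like this. The notify service name depends on how your Pushover integration is set up, so notify.pushover here is an assumption:

# Untested sketch: a custom sentence trigger that sends a fixed message.
# "notify.pushover" is an assumed service name; use whatever your
# Pushover integration actually created.
alias: Pushover test via voice
triggers:
  - trigger: conversation
    command:
      - "send me a test notification"
actions:
  - action: notify.pushover
    data:
      title: Voice test
      message: This came from a sentence trigger.
  # Optional spoken confirmation back through Assist.
  - set_conversation_response: Notification sent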

@baudneo @NathanCu thank you for the replies! That makes sense, and after trying it out I see how I can send fixed text messages. But I’m struggling to make it work as intended. Can you help me figure out a viable Automation?

I started by creating a trigger “If a sentence is said”. The obvious “Then do” action seemed to be “Send a notification with Pushover”, but in that action I have to input a fixed message, which obviously is not my intention. The message content should be the Assistant’s response.

In other words, what I want to achieve is to “route” the Assistant’s response to a Pushover notification.

Ideally it would be:

I say “Hey Jarvis” to wake the assistant
I say “Send a notification”
The Assistant replies “What’s your query?”
I say “A simple chocolate cake recipe”
The assistant sends the response via Pushover notification
Nice but not essential - the Assistant says “message sent”

Is this even possible? The flow could be a little more rudimentary, as long as the Assistant is still providing the content for the notification.

My ESP32-S3 device has actions like “Finished speaking detection to first option”, which I’m not sure about…

Since you’re asking it for recipes, I’m assuming your assistant is ChatGPT or Google Generative AI? You can set up response variables and then use them in the rest of the automation.

I would set up a sentence to ask ChatGPT or Google AI something, store their answer in a response variable, and then send that answer via notification. I use this to get Gemini to look at my security cameras on motion events and tell me if anything I need to know about is going on.
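
Something along these lines might work (untested sketch: the agent_id, the notify service name, and the shape of the response data are assumptions for your setup):

# Untested sketch: a wildcard sentence trigger forwards the spoken query
# to the LLM conversation agent, then pushes the answer via Pushover.
# conversation.chatgpt and notify.pushover are assumed names.
alias: Ask the LLM and notify via Pushover
triggers:
  - trigger: conversation
    command:
      - "send me {query} as a notification"
actions:
  - action: conversation.process
    data:
      agent_id: conversation.chatgpt
      text: "{{ trigger.slots.query }}"
    response_variable: llm_reply
  - action: notify.pushover
    data:
      title: Assist answer
      # Assumes the reply is exposed at response.speech.plain.speech.
      message: "{{ llm_reply.response.speech.plain.speech }}"
  - set_conversation_response: Message sent

Saying “Hey Jarvis, send me a simple chocolate cake recipe as a notification” should then match the wildcard and push whatever the agent answers.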

@baudneo Hello again! That sounds like exactly what I’m looking for! How do I create a response variable, though? Apologies if it’s really a noob question. My assistant is ChatGPT, and I’m using an ESP32-S3 to talk to it.

I read the docs and it seems response variables aren’t available in conversations, which is unfortunate and something I really hope is being worked on (I may be wrong about this, but from what I could find, I think I’m right).

I am interfacing with Gemini AI using an action call, which allows me to store the Gemini AI response in a response variable and do whatever I want with it.

I don’t know if this can be worked around by using your voice request as a stored variable, forwarding a request to GPT, and then storing its answer in a variable, but I don’t think so. This may be a good feature request, i.e. allow extracting and templating conversations as they happen to set off automations and scripts.

I really appreciate your input; I’m learning my way around the documentation right now. If I understand correctly, the main difference in your case is that you are not using voice commands in the process, but the action call.

The “action call” you mentioned, is that what is called a “Sequence” in Scripts? If your Script isn’t something private, would it be ok to share it here? Or point me to an example somewhere?

It would be really useful to know how to store & use the LLM’s response at least via action calls, especially if response variables aren’t available in conversations.

Up to now, the only benefit I’ve gotten from having ChatGPT in my Assistant is more “human” / less robotic replies.

Action used to be called “service”. You can use an action in a script sequence or in an automation. Let me hop on my desktop so I can copy paste my Gemini AI stuff.

Basically, I have an automation that catches when my cameras detect motion; when it does, it runs a script which holds all the Gemini AI logic. I’ll paste my automation and script here shortly.

automation

alias: ZM Motion Start (MQTT)
description: Uses MQTT to catch 'event start:' for configured monitor id's
triggers:
  - trigger: mqtt
    topic: ZoneMinder/monitor/2
    id: event_start_backyard
    payload: "on"
    value_template: |-
      {% if value is match('^event start:.*$') %}
        on
      {% else %}
        off
      {% endif %}
  - trigger: mqtt
    topic: ZoneMinder/monitor/7
    id: event_start_doorbell
    payload: "on"
    value_template: |-
      {% if value is match('^event start:.*$') %}
        on
      {% else %}
        off
      {% endif %}
  - trigger: mqtt
    topic: ZoneMinder/monitor/5
    id: event_start_alley
    payload: "on"
    value_template: |-
      {% if value is match('^event start:.*$') %}
        on
      {% else %}
        off
      {% endif %}
  - trigger: mqtt
    topic: ZoneMinder/monitor/6
    id: event_start_frontyard
    payload: "on"
    value_template: |-
      {% if value is match('^event start:.*$') %}
        on
      {% else %}
        off
      {% endif %}
conditions: []
actions:
  - parallel:
      - if:
          - condition: trigger
            id:
              - event_start_doorbell
        then:
          - action: notify.gotify_rest
            data:
              title: ZM Doorbell Motion
              message: ZoneMinder MQTT event start message!
            enabled: false
          - action: script.google_ai_front_door_notification
            metadata: {}
            data:
              mqtt_payload: "{{trigger.payload}}"
      - if:
          - condition: trigger
            id:
              - event_start_backyard
        then:
          - action: notify.gotify_rest
            data:
              title: ZM Back Yard Motion
              message: ZoneMinder MQTT event start message!
            enabled: false
          - action: script.camera_back_yard_snapshot_ai_notification
            metadata: {}
            data:
              mqtt_payload: "{{trigger.payload}}"
      - if:
          - condition: trigger
            id:
              - event_start_alley
        then:
          - action: notify.gotify_rest
            data:
              title: ZM Back Alley Motion
              message: ZoneMinder MQTT event start message!
            enabled: false
          - action: script.camera_back_alley_snapshot_ai_notification_duplicate
            metadata: {}
            data:
              mqtt_payload: "{{trigger.payload}}"
      - if:
          - condition: trigger
            id:
              - event_start_frontyard
        then:
          - action: notify.gotify_rest
            data:
              title: ZM Front Yard Motion
              message: ZoneMinder MQTT event start message!
            enabled: false
          - action: script.camera_front_yard_snapshot_ai_notification
            metadata: {}
            data:
              mqtt_payload: "{{trigger.payload}}"
mode: parallel
max: 50

You will notice that when a script is called, it is passed the mqtt_payload parameter, which the script uses in its templates. mqtt_payload is set to trigger.payload, which MQTT triggers expose as the payload sent to the MQTT topic. In this case it is “event start: <5-6 digit integer>”, so it would look something like: event start: 412345. The script then uses mqtt_payload to extract the 5-6 digit integer and stores it in the eid variable.
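
You can sanity-check that extraction in Developer Tools → Template with a sample payload:

{{ "event start: 412345" | regex_findall('event start: (\d+)') | first }}
{# renders "412345", which the script stores as eid #}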

script

This is the front yard script; I have a script for each camera.

alias: Camera - Front Yard - Snapshot, AI & Notification
sequence:
  - variables:
      trig_ts: "{{ now().strftime('%Y%m%d-%H%M%S') }}"
      eid: >-
        {% set event_id = mqtt_payload | regex_findall('event start: (\\d+)') |
        first %}  {{event_id}}
      ext_url: https://ha.example.com
  - action: camera.snapshot
    metadata: {}
    data:
      filename: /config/www/snapshots/front_{{trig_ts}}_snapshot1.jpg
    target:
      entity_id:
        - camera.front
    enabled: true
  - delay:
      hours: 0
      minutes: 0
      seconds: 1
      milliseconds: 0
    enabled: true
  - action: camera.snapshot
    metadata: {}
    data:
      filename: /config/www/snapshots/front_{{trig_ts}}_snapshot2.jpg
    target:
      entity_id:
        - camera.front
    enabled: true
  - delay:
      hours: 0
      minutes: 0
      seconds: 1
      milliseconds: 0
    enabled: true
  - action: camera.snapshot
    metadata: {}
    data:
      filename: /config/www/snapshots/front_{{trig_ts}}_snapshot3.jpg
    target:
      entity_id:
        - camera.front
    enabled: true
  - action: google_generative_ai_conversation.generate_content
    metadata: {}
    data:
      prompt: >-
        Motion has been detected, compare and very briefly describe what you see
        in the following sequence of images from my front yard camera. If people
        or animals are present, describe them in detail. Do not describe
        stationary objects, buildings, holiday decorations, etc. I want you to
        act as a security guard monitoring cameras for people and animals. If
        you see nothing obvious / interesting, only reply with "No Obvious
        Motion Detected." Your message needs to be short enough to fit in a
        phone notification.
      image_filename:
        - ./www/snapshots/front_{{trig_ts}}_snapshot1.jpg
        - ./www/snapshots/front_{{trig_ts}}_snapshot2.jpg
        - ./www/snapshots/front_{{trig_ts}}_snapshot3.jpg
    response_variable: generated_content
  - action: counter.increment
    metadata: {}
    data: {}
    target:
      entity_id:
        - counter.google_ai_requests
        - counter.google_ai_daily
  - if:
      - condition: template
        value_template: "{{ 'No Obvious Motion Detected.' in generated_content.text }}"
    then:
      - sequence:
          - parallel:
              - action: delete.file
                data:
                  file: /config/www/snapshots/front_{{trig_ts}}_snapshot1.jpg
              - action: delete.file
                data:
                  file: /config/www/snapshots/front_{{trig_ts}}_snapshot2.jpg
              - action: delete.file
                data:
                  file: /config/www/snapshots/front_{{trig_ts}}_snapshot3.jpg
          - stop: Gemini AI saw nothing of interest
    else:
      - parallel:
          - action: notify.ntfy_cameras
            metadata: {}
            data:
              title: Front Yard AI Alert!
              message: "{{generated_content['text']}}"
              data:
                tags:
                  - warning
                attach: "{{ext_url}}/local/snapshots/front_{{trig_ts}}_snapshot2.jpg"
                filename: front_yard_ai.jpg
                click: null
                actions:
                  - action: view
                    label: View Event
                    url: >-
                      https://zm.example.com/zm/cgi-bin/nph-zms?mode=jpeg&scale=50&buffer=1000&replay=single&event={{eid}}&user=user&pass=password
                  - action: view
                    label: View Camera
                    url: homeassistant://navigate/dashboard-cameras/front_yard
          - action: notify.mobile_app_tylers_s22_ultra
            metadata: {}
            data:
              title: Front Yard Alert!
              message: "{{generated_content['text'] }}"
              data:
                image: /local/snapshots/front_{{trig_ts}}_snapshot2.jpg
                clickAction: "{{ext_url}}/local/snapshots/front_{{trig_ts}}_snapshot2.jpg"
                group: front_yard_gemini
                sticky: true
                actions:
                  - action: URI
                    title: View Event
                    uri: >-
                      https://zm.example.com/zm/cgi-bin/nph-zms?mode=jpeg&scale=50&buffer=1000&replay=single&event={{eid}}&user=user&pass=password
                  - action: URI
                    title: View Camera
                    uri: /dashboard-cameras/front_yard
          - action: notify.mobile_app_sm_s928w
            metadata: {}
            data:
              title: Front Yard Alert!
              message: "{{generated_content['text'] }}"
              data:
                image: /local/snapshots/front_{{trig_ts}}_snapshot2.jpg
                clickAction: "{{ext_url}}/local/snapshots/front_{{trig_ts}}_snapshot2.jpg"
                group: front_yard_gemini
                sticky: true
                actions:
                  - action: URI
                    title: View Event
                    uri: >-
                      https://zm.example.com/zm/cgi-bin/nph-zms?mode=jpeg&scale=50&buffer=1000&replay=single&event={{eid}}&user=user&pass=password
                  - action: URI
                    title: View Camera
                    uri: /dashboard-cameras/front_yard
            enabled: false
mode: queued
description: ""
icon: mdi:cctv
max: 30
fields:
  mqtt_payload:
    selector:
      text: null
    name: mqtt payload

In the script you can see I store the Gemini AI response in a response variable named generated_content, which is used in the notify action: message: "{{generated_content['text'] }}".

Credit for this gemini script goes to another user on this forum, I just modified it for my use case.

There’s also an integration on HACS called LLM Vision which simplifies the whole setup, from trigger to LLM camera readout to notification, all in a blueprint. Takes a little to set up but well worth the effort.