I have been at this for a week now, and even though I don’t like it, I am literally out of my depth.
I have been working on an automation that will do the following:
- Triggered by the detection of a person on an outside camera (eventually only between certain hours, but all hours for now while testing).
- Take a snapshot and store it with a date and timestamp as well as the name of the camera.
- Query LLM Vision to give me a description of the camera feed.
- Send me a notification of the description with the snapshot attached.
- Send me a TTS voice message of the LLM description.
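The eventual restriction to certain hours could be sketched as a time condition on the automation. This is a minimal sketch only; the hours below are placeholders chosen for illustration, not values from the actual setup.

```yaml
# Sketch only: restrict the automation to a night-time window.
# The two times are hypothetical placeholders.
conditions:
  - condition: time
    after: "22:00:00"   # window opens in the evening
    before: "06:00:00"  # window closes the next morning (spans midnight)
```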
On the surface it seemed to be mostly working, but whenever LLM Vision hit a problem (a timeout or insufficient tokens available), the script got stuck and never finished.
I then decided to use a generic description (Person detected at Camera 6) and then call LLM Vision. If there was no response from LLM Vision within a certain timeframe, the automation would just send the generic message and snapshot.
That also seemed OK at first, but in reality it just got stuck at LLM Vision again (no messages received).
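For the error case specifically, Home Assistant actions accept `continue_on_error: true`, which lets the remaining steps run when a step fails. The following is a minimal sketch under the assumption that LLM Vision actually raises an error on a timeout or token problem; it would not help if the call simply hangs without returning. The variable name `got_response` is hypothetical.

```yaml
# Sketch only: do not abort the sequence when the LLM Vision call errors out.
- action: llmvision.image_analyzer
  continue_on_error: true   # a failure here no longer stops the later steps
  data:
    provider: redacted
    model: gemini-2.0-flash
    message: Give a detailed description of the security camera footage observed.
    image_entity:
      - "{{ camera_entity }}"
    max_tokens: 100
  response_variable: llm_out
- variables:
    # llm_out only exists if the call returned a response
    got_response: "{{ llm_out is defined }}"
```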
I then decided to take a modified approach (now it is getting overcomplicated).
I created the script below:
driveway_llm_analysis:
  mode: parallel
  sequence:
    - service: llmvision.image_analyzer
      data:
        provider: redacted
        model: gemini-2.0-flash
        message: >
          Give a detailed description of the security camera footage observed.
        remember: false
        image_entity:
          - "{{ camera_entity }}"
        include_filename: true
        target_width: 1280
        max_tokens: 100
        generate_title: true
        expose_images: true
      response_variable: llm_out
    - service: input_text.set_value
      data:
        entity_id: input_text.driveway_analysis_title
        value: "{{ llm_out.title if llm_out.title is defined else '' }}"
    - service: input_text.set_value
      data:
        entity_id: input_text.driveway_analysis_text
        value: "{{ llm_out.choices[0].message.content if llm_out.choices is defined else '' }}"
I also created two text helpers called driveway_analysis_text and driveway_analysis_title.
The idea is that these helpers will hold generic text for the notification by default.
The automation then calls LLM Vision to update these two helpers. If LLM Vision works, they will hold good descriptions; if not, they will stay generic.
The automation will have a 10s wait state to give LLM Vision a chance to respond (or not).
After the 10s delay, the automation must send me whatever is in these two helpers. (In my mind that sounds workable enough, although too complicated IMO.)
Yet it is not working, and I have no idea what else to try (like I said, I have been at this for a week).
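The fixed 10s wait described above could also be expressed as a bounded wait. The following is a minimal sketch, not the automation as it currently stands. It assumes the script is started with `script.turn_on`, which starts a script without waiting for it to finish, and that the text helper is cleared first so that a later change can be detected.

```yaml
# Sketch only: start the analysis in the background, then wait at most 10 s.
- action: input_text.set_value
  target:
    entity_id: input_text.driveway_analysis_text
  data:
    value: ""                 # clear stale text from a previous run
- action: script.turn_on      # non-blocking start of the script
  target:
    entity_id: script.driveway_llm_analysis
  data:
    variables:
      camera_entity: "{{ camera_entity }}"
- wait_for_trigger:
    - trigger: state          # fires when the script writes a new description
      entity_id: input_text.driveway_analysis_text
  timeout: "00:00:10"
  continue_on_timeout: true   # carry on with the generic text after 10 s
```

After the wait, `wait.trigger` is `none` if the timeout expired, so it could be used to choose between the helper text and the generic text. Note that an `input_text` state holds at most 255 characters, so a longer description would not fit in the helper.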
Below is the automation code:
alias: Driveway Camera Person Detection Notification with LLM Vision and TTS
description: >
  Guaranteed driveway camera security alert with snapshot and TTS; LLM Vision
  analysis incorporated if available.
triggers:
  - entity_id: binary_sensor.camera_6_driveway_person_occupancy_2
    to: "on"
    trigger: state
conditions: null
actions:
  - variables:
      camera_entity: camera.camera_6_driveway_2
      dashboard_url: https://redacted.com:1234/security-cameras-frigate/frigate
      snapshot_filename: >-
        /config/www/tmp/driveway_person_{{ now().strftime('%Y%m%d_%H%M%S')
        }}.jpg
      snapshot_url: /local/tmp/driveway_person_{{ now().strftime('%Y%m%d_%H%M%S') }}.jpg
      analysis_title: Camera 6 Driveway person detection
      analysis_text: Person detected at Driveway Camera.
  - target:
      entity_id: "{{ camera_entity }}"
    data:
      filename: "{{ snapshot_filename }}"
    action: camera.snapshot
  - delay: "00:00:02"
  - data:
      camera_entity: "{{ camera_entity }}"
    response_variable: driveway_analysis
    action: script.driveway_llm_analysis
  - delay: "00:00:10"
  - variables:
      latest_title: >
        {% set r = states('input_text.driveway_analysis_title') %} {{ r if r
        else analysis_title }}
      latest_text: >
        {% set r = states('input_text.driveway_analysis_text') %} {{ r if r else
        analysis_text }}
  - data:
      title: "{{ latest_title }}"
      message: "{{ latest_text }}"
      data:
        image: "{{ snapshot_url }}"
        url: "{{ dashboard_url }}"
        clickAction: "{{ dashboard_url }}"
        tag: camera_person_detection
    action: notify.mobile_app_sm_s711b
  - data:
      message: "Security: {{ latest_text }}"
      data:
        tts_text: Security alert! {{ latest_text }}
    action: notify.mobile_app_sm_s711b
mode: single
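For comparison, the Android Companion App documentation describes text-to-speech notifications as using the literal message `TTS`, with the spoken text in `tts_text`. A minimal sketch under that assumption, reusing `latest_text` and the notify service from the automation above:

```yaml
# Sketch only: Android Companion App text-to-speech notification.
- action: notify.mobile_app_sm_s711b
  data:
    message: TTS              # literal keyword, assumed from the Companion App docs
    data:
      tts_text: "Security alert! {{ latest_text }}"
```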