Not sure if this is the right place or not, but I'm hoping someone might be able to provide some guidance. I have a script using LLM Vision that appears to run correctly, but I can't get the response to be returned and spoken by my voice satellite. If anyone has any ideas, please let me know.
Here is my script code:
sequence:
  - action: camera.snapshot
    metadata: {}
    data:
      filename: /config/www/tmp/snapshot-backyard.jpg
    target:
      entity_id: camera.reolink_bakyard_camera_fluent
  - delay:
      hours: 0
      minutes: 0
      seconds: 4
      milliseconds: 0
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      remember: false
      include_filename: false
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      generate_title: false
      expose_images: false
      expose_images_persist: false
      provider: 01JMJ53RZV1AA9RZJ194FFN7E9
      model: gpt-4o-mini
      message: >-
        Analyze the backyard camera snapshot. Report anything of interest like
        packages, cars, trucks, people. Also report the weather conditions and
        backyard status like ice, rain, snow etc. Be witty
      image_file: /config/www/tmp/snapshot-backyard.jpg
    response_variable: response
  - variables:
      response:
        query_backyard: "{{ response.response_text }}"
  - stop: ""
    response_variable: response
alias: Backyard Security
description: >-
  Check the backyard camera to determine what is happening outside. Also check
  the weather conditions.
Here are the trace details; it appears to be working, but I just can't get the voice response to work:
Executed: February 21, 2025 at 10:02:43 AM
Result:
  params:
    domain: llmvision
    service: image_analyzer
    service_data:
      remember: false
      include_filename: false
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      generate_title: false
      expose_images: false
      expose_images_persist: false
      provider: 01JMJ53RZV1AA9RZJ194FFN7E9
      model: gpt-4o-mini
      message: >-
        Analyze the backyard camera snapshot. Report anything of interest like
        packages, cars, trucks, people. Also report the weather conditions and
        driveway status like ice, rain, snow etc. Be witty
      image_file: /config/www/tmp/snapshot-backyard.jpg
    target: {}
  running_script: false
  response:
    response_text: >-
      In this snapshot from the backyard camera, it looks like winter has
      decided to throw a little party! The ground is covered in a blanket of
      snow, giving the scene a serene, albeit chilly, vibe.

      As for the backyard, it appears to be quite the winter wonderland—no
      cars, trucks, or surprise packages in sight. Just a couple of empty
      chairs and a table, likely longing for warmer days and a good barbecue.

      The weather is crisp and clear, with bright sunlight illuminating
Hi, I'm requesting support for Tuya Smart video. I have a video intercom connected to Tuya. LLM Vision sees the camera, but when testing with AI, the AI can't see through the camera's video feed and gives this error:
The llmvision.stream_analyzer action could not be performed. No image input provided
I think some permission might be needed to be able to access the video from the video intercom.
I can see the video intercom from Home Assistant without any problems.
Two things: 1. You don't have to create a snapshot yourself; image_analyzer does that automatically if you pass the camera entity as image_entity. 2. I can't see where you call a tts action. Your response variable is correct, but you need to pass it to a tts entity. For example:
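Something like this (a sketch; the tts and media_player entity IDs below are placeholders for your own voice satellite setup):

- action: llmvision.image_analyzer
  data:
    provider: 01JMJ53RZV1AA9RZJ194FFN7E9
    model: gpt-4o-mini
    image_entity:
      - camera.reolink_bakyard_camera_fluent
    message: Analyze the backyard camera snapshot. Be witty.
    max_tokens: 100
  response_variable: response
# Pass the response text to a tts entity, targeting your satellite's media_player
- action: tts.speak
  target:
    entity_id: tts.home_assistant_cloud  # placeholder: your tts entity
  data:
    media_player_entity_id: media_player.voice_satellite  # placeholder: your satellite
    message: "{{ response.response_text }}"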
Thanks for responding. I seem to have it working using an automation, but I could never get it working via a script. It makes total sense to have the tts.speak action. I was trying to follow an example someone else had provided but could never get the response. I'll try this out as well.
Just wondering, though (maybe it doesn't really matter): is it better or more efficient to do this through a script or an automation? It seems like it might not matter.
The difference between scripts and automations is that automations can have one (or multiple) triggers, while scripts have to be called by the user (or by another automation).
In your case an automation would make more sense, as you would probably want to announce automatically when a person is detected.
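For example, a sketch of such an automation (the person-detection sensor and the script entity ID are placeholders; it calls your Backyard Security script and speaks the response it returns via its stop step):

alias: Announce backyard activity
triggers:
  - trigger: state
    entity_id: binary_sensor.backyard_person_detected  # placeholder: your person sensor
    to: "on"
actions:
  # Run the script above and capture the response it returns
  - action: script.backyard_security  # placeholder: your script's entity ID
    response_variable: result
  - action: tts.speak
    target:
      entity_id: tts.home_assistant_cloud  # placeholder: your tts entity
    data:
      media_player_entity_id: media_player.voice_satellite  # placeholder: your satellite
      message: "{{ result.query_backyard }}"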
I'm not sure why I don't get any pictures in my notifications anymore; I only get the text. Any clue?
alias: AI deurbel Frigate
description: ""
use_blueprint:
  path: valentinfrlch/event_summary.yaml
  input:
    mode: Frigate
    notify_device:
      - f5fc620b4132f43fde52fcfc86def4ae
      - 93870dccbfbc2d6bf7d381d0f2714284
    cooldown: 5
    provider: 01JBPTWC6XXGZM2E1JNQFZW2GH
    camera_entities:
      - camera.camera
    frigate_url: 192.168.1.118:5000
    important: false
    motion_sensors:
      - binary_sensor.reolink_video_doorbell_wifi_persoon
    message: >-
      Summarize in one sentence of at most 10 words. Do not describe the
      scene! If there is a person, describe what they are doing and make sure
      to criticize them in a funny way.
    duration: 5
    remember: true
    preview_mode: Snapshot
    max_frames: 3
    temperature: 0.5
    max_tokens: 20
    input_mode: Frigate
    required_zones:
      - Oprit
      - Weg
    object_type:
      - person
    model: gemini-2.0-flash
It's not showing in my list either. In fact, I don't have Timeline OR Memory…
I’ve been trying to add it for the last hour. Looks like there may be some kind of issue. (Yes, I updated LLM Vision yesterday because I saw it was a feature of the latest builds)
Ahhhh yeah, that was not apparent at all in the docs. ETA? (Read: am I going to blow anything up running this beta? Heck with it, the ability to edit the motion detected subject line alone is worth it ;) )
There will be a second beta by the end of this week. The first one is already out and as far as I can tell there aren’t any major bugs currently. Check the discussion if you’re interested in the beta!
I currently have Frigate events being run through my Node-RED setup, so I'm not looking to manage all of my Frigate events via LLM Vision. However, what I would like to do is enrich some of my existing Node-RED events by calling the LLM Vision service with a URL to an image (either relative or a full URI) and have the LLM Vision service handle the base64 encoding and send the request to the vision service as usual.
I’m not sure if this is currently possible. So here are some examples.
action: llmvision.image_analyzer
data:
  include_filename: false
  target_width: 1280
  max_tokens: 100
  temperature: 0.2
  provider: [LOCAL_OLLAMA_UID]
  model: llama3.2-vision:11b
  message: Is there a package
  image_url: >-
    /api/frigate/notifications/1740403924.78291-7cotr08/snapshot.jpg
# A relative path could be nice; LLM Vision would prepend the Home Assistant http(s)://[host]:[port]
action: llmvision.image_analyzer
data:
  include_filename: false
  target_width: 1280
  max_tokens: 100
  temperature: 0.2
  provider: [LOCAL_OLLAMA_UID]
  model: llama3.2-vision:11b
  message: Is there a package
  image_url: >-
    https://[host]:[port]/api/frigate/notifications/1740403924.78291-7cotr08/snapshot.jpg
# Or just always expect the user to give the full URI
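In the meantime, one possible workaround (a sketch, assuming the Downloader integration is configured with a download_dir of downloads and the snapshot URL is reachable from Home Assistant) is to fetch the image to disk first and pass it as image_file:

- action: downloader.download_file
  data:
    url: https://[host]:[port]/api/frigate/notifications/1740403924.78291-7cotr08/snapshot.jpg
    subdir: llmvision
    filename: snapshot.jpg
    overwrite: true
- delay:
    seconds: 2  # crude wait for the download to finish
- action: llmvision.image_analyzer
  data:
    provider: [LOCAL_OLLAMA_UID]
    model: llama3.2-vision:11b
    message: Is there a package
    image_file: /config/downloads/llmvision/snapshot.jpg  # download_dir + subdir + filename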
Added a run_condition: You can now add custom conditions to the blueprint. All conditions have to be true in order for the blueprint to run. Use this to only run the automation when no one is home or when the alarm is enabled.
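For illustration, a guess at what that could look like in the blueprint YAML (a sketch, assuming run_condition accepts a standard condition list; entity IDs are placeholders):

use_blueprint:
  path: valentinfrlch/event_summary.yaml
  input:
    # ...your other inputs...
    run_condition:
      # Only run when nobody is home
      - condition: state
        entity_id: zone.home
        state: "0"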
How do I actually use this now? I don’t see the option on the blueprint (v 1.3.8)?
thanks!
EDIT: Oh, nvm, I see that I need the beta blueprint.
LLM Vision 1.4.0: Major Update with Exciting New Features
LLM Vision 1.4.0 is here with major improvements and new features, but also some breaking changes. This update enhances usability, adds powerful memory capabilities, and improves event tracking with the new Timeline and Timeline Card. Read on to see what's new!
Important Updates & Breaking Changes
Major Restructure & Frigate Compatibility
The blueprint has been significantly restructured, meaning you'll need to set up some parts again. Frigate mode has been removed due to recurring issues when fetching clips. The blueprint remains compatible using Frigate's camera and binary_sensor ("occupancy") entities.
Deprecated Features
input_mode has been removed
detail has been removed. Use target_width to control the resolution of images.
expose_images_persist has been deprecated. Images are now saved in /www/llmvision/<uid>-<key_frame>.jpg. The path is returned in the response as key_frame.
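For example, a sketch of picking up the returned key_frame path in a notification (the notify service, provider, and camera are placeholders; the path mapping assumes the default /config/www folder, which is served at /local):

- action: llmvision.image_analyzer
  data:
    provider: [PROVIDER_UID]  # placeholder
    model: gpt-4o-mini
    image_entity:
      - camera.front_door  # placeholder
    message: Describe what you see.
  response_variable: response
- action: notify.mobile_app_my_phone  # placeholder: your notify service
  data:
    message: "{{ response.response_text }}"
    data:
      # adjust the mapping if your key_frame path differs
      image: "{{ response.key_frame | replace('/config/www', '/local') }}"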
New Features & Enhancements
Timeline Replaces Event Calendar
Introducing the LLM Vision Timeline Card—a new dashboard card to display events. Install it now: GitHub