Anyway, for some reason it doesn’t work for me.
Edit 3: Got it working by using image_entity; I’ll post details at the bottom. I think I’d still like to get it working with image_file, but this will suffice for now.
Edit 2: It seems like images work when I use image_entity in the YAML, but not when I use image_file. I think if I can figure out how to get from a trigger entity_id to an image_entity (camera name) then I can make this work.
This update looks awesome! Is there a way to diagnose why images aren’t showing up in the timeline card? Events are being added to the calendar and they’re showing up with text descriptions in the timeline card, but I’m not getting the keyframe image.
When I look at the script timeline under changed variables I see:
generated_content:
  response_text: No obvious motion detected.
  key_frame: /config/www/llmvision/46c782f3-0.jpg
That image exists inside that path (I can view it from the file plugin editor), but it’s not showing in the card.
I did notice the documentation says "expose_image" but the YAML configuration offers "expose_images". I tried both, and only expose_images seems to show the key_frame in the generated_content variable, so I don’t know if that matters.
Edit: I set up debug logging and I don’t see any errors in the logs.
To get this working with image_entity, I needed to get the name of the camera sub entity from the device ID of the automation trigger. I’m using this to pass the “sub_entity” name to the script:
sub_entity: >-
  {{ device_entities(device_id(trigger.entity_id)) | select('match', '^camera.*sub$') | list | join('') }}
That results in something like “camera.living_room_camera_sub” depending on the trigger. Then in the script, I use:
image_entity:
  - "{{ sub_entity }}"
And now my images are showing up in the timeline card.
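In case it helps anyone copying this, here is a rough, untested sketch of the automation side that passes the variable into the script. The trigger entity and the script name (script.llm_vision_event) are placeholders for whatever you use; the template is the same one as above.
triggers:
  - trigger: state
    entity_id: binary_sensor.living_room_camera_motion  # placeholder motion sensor
    to: "on"
actions:
  - action: script.llm_vision_event  # placeholder script name
    data:
      sub_entity: >-
        {{ device_entities(device_id(trigger.entity_id)) | select('match', '^camera.*sub$') | list | join('') }}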
Thanks for the detailed post!
This is a bug - it should work with image_file as well. expose_image is a typo in the docs. The correct parameter name is expose_images.
Such an awesome update @valentinfrlch! I am having a play with the integration like this, but for some reason I can’t see anything in the timeline entity. Am I missing something?
alias: LLM Testing
description: ""
triggers: []
conditions: []
actions:
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      include_filename: false
      max_tokens: 100
      temperature: 0.2
      provider: BLABLABLAETC
      image_entity:
        - camera.garden_camera
      message: Describe the image.
      remember: true
      expose_images: true
    response_variable: response
  - action: persistent_notification.create
    metadata: {}
    data:
      title: Front Camera Motion
      message: "{{response.response_text}}"
mode: single
@valentinfrlch Loving the updates you recently made to LLM Vision! Especially the new timeline card that will show the images next to the response.
I have a couple of questions / feature requests though. Do you have a place we should submit them? Adding below in case this is the best place.
- Allow you to select which timeline entity to use. This is helpful if you have different types of events you want to remember/capture (e.g. security camera motion events vs. road cam data analyzer events).
- Timeline Card - Ability to filter what events are displayed based on the entity ID associated with the event (or even the name or something similar).
- Ability to Remember Data Analyzer events. This is essential when you want to then use assist to ask about those data analysis operations later on. (Understood that you can use the Remember action here but it’s a few extra steps to get that data into the remember call)
Thanks again for all your work on this! It’s been invaluable in so many ways!
There is a Git repository for the project - submit an issue of type "feature request" (you’ll see the option when you click New issue).
The only thing I can see is that there are no triggers. If it still doesn’t work, please create an issue here and attach some logs or a trace of the automation.
Yea - just manually triggering the automation whilst having a play/testing. Have submitted a bug report. Thanks!
I have this up and running VERY well now and love it. Thank you for an awesome integration.
Honestly, the key for me was not using the blueprint (although I haven’t tried the updated one).
Using the integration in a custom automation is doing exactly what I want. I just have a question.
When testing prompts in Open WebUI I tend to get better responses; when I pass the same photo and prompt from HA, the responses are good, but not as good as when prompting directly.
I noticed that in Open WebUI the default temperature is 0.8. I adjusted this in my automation, and it is a little better.
Just wasn’t sure if there were some other settings or adjustments I should look into?
Absolutely love this integration.
I use Frigate and Ollama on Unraid with a GTX 1080 SC 8GB and the prompts fire off quickly. (I had to add the environment variable OLLAMA_KEEP_ALIVE=24h.)
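For reference, here is a rough sketch of what that keep-alive setting looks like as a docker-compose service; on Unraid it is added as an extra container variable in the app template instead, and the image tag and port below are just the Ollama defaults.
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    environment:
      # Keep loaded models in memory for 24h instead of the 5-minute default
      - OLLAMA_KEEP_ALIVE=24h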
For anyone else, here is a simple automation you can use if you’re having trouble with the blueprint; maybe it can help someone going forward:
I have been experimenting with playing the description over TTS in my office, if I am in the office when the detection happens. Unfortunately, I have not even touched voice assistants in HA yet to improve upon this.
alias: AI Image Description Front Door - Sean
description: ""
triggers:
  - trigger: state
    entity_id:
      - image.doorbell_camera_person
    to: null
conditions: []
actions:
  - action: llmvision.image_analyzer
    metadata: {}
    data:
      include_filename: true
      target_width: 1280
      max_tokens: 100
      temperature: 0.8
      generate_title: true
      expose_images: true
      model: llava:7b
      message: >-
        PROMPT
      provider: Select your provider
      remember: true
      image_entity:
        - image.doorbell_camera_person
    response_variable: description
  - action: notify.mobile_app_iphone15
    data:
      data:
        image: /api/image_proxy/image.doorbell_camera_person
      title: "{{ description.title }}"
      message: "{{ description.response_text }}"
  - condition: state
    entity_id: input_boolean.office_occupancy
    state: "on"
    enabled: false
  - action: tts.google_translate_say
    metadata: {}
    data:
      entity_id: media_player.rk3326
      message: " {{ description.response_text }} "
    enabled: false
mode: single
I use the LLM Vision card and I see the response but no image. Does it have something to do with my settings in the automation?
I have a new Tapo C200 camera. I’ve set up an automation that triggers when the state of “cell motion detection” becomes “on”.
I want to use the stream or a picture from the camera (I don’t know what’s best - please recommend) together with LLM Vision. Does anyone have an example of how to do this? I can only find examples that use a button as a trigger.
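For reference, an untested sketch along the lines of the other automations in this thread might look like the following; the entity IDs and provider ID are placeholders for whatever the Tapo integration creates on your system, and it passes the camera entity (rather than the stream) to the analyzer.
alias: C200 motion description
triggers:
  - trigger: state
    entity_id: binary_sensor.c200_cell_motion_detection  # placeholder motion sensor
    to: "on"
conditions: []
actions:
  - action: llmvision.image_analyzer
    data:
      provider: YOUR_PROVIDER_ID  # placeholder provider entry ID
      image_entity:
        - camera.c200  # placeholder camera entity
      message: Describe what triggered the motion detection.
      max_tokens: 100
      remember: true
      expose_images: true
    response_variable: response
  - action: persistent_notification.create
    data:
      title: C200 motion
      message: "{{ response.response_text }}"
mode: single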
You could modify the system prompt. You can change it in Memory settings (Memory provider needs to be set up).
The blueprint has been completely rewritten, as there have been lots of problems with Frigate clips specifically. From what I can tell, it is pretty stable now, a lot faster, and also very customizable with the new run_conditions.
@Jovink There is a known bug for image_file inputs. It should work for entities.
@danwie Check the blueprint. I think it does exactly what you want.
Is it possible to do this with a script after every reboot?
What could be the reason that the text description is cut off?
Probably increase max_tokens.
Quick question: it’s been a week or two since my prompt stopped working as expected.
Whenever LLM Vision is triggered, the answer I get starts with my question being rephrased, like “Of course, here’s a description of the people in the shot: tall man standing near blablabla”, when I only want the part with “tall man standing near blablabla”. LLM used: Gemini.
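As a crude, untested workaround sketch (assuming the unwanted preamble always ends at the first colon), the notification template could strip everything up to that colon instead of using response_text directly:
message: >-
  {{ response.response_text.split(':', 1) | last | trim }}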
Right now I am testing Gemma 3 with Home Assistant. I configured it as an assistant and pointed the LLM Vision setup to the same model. What is curious: when I use the Assist function of HA, Ollama loads up Gemma 3 and answers my request. When I initiate an LLM Vision request, Ollama unloads the model and loads it again. So HA and LLM Vision constantly play ping-pong by loading and unloading the same model, which makes the response time very slow.
Also, LLM Vision really needs to pass the keep_alive variable. In HA I can set it to -1 (forever), but when I initiate an LLM Vision request, keep_alive is set to the default (5 min). Yes, I could set it in the Ollama settings, but I don’t want to keep all my models loaded all the time.
Sorry for the lame question: I did the Groq setup, but where does the provider ID come from? I can’t find it.
The LLM Vision integration entry for Groq has no devices or entities.
Is Danish support coming for the UI and Icons?
I’m trying to use an example script of yours, but I get a template error.
The script:
alias: Kenteken check
sequence:
  - service: llmvision.image_analyzer
    data:
      max_tokens: 100
      provider: 01JP73JVPR5YRERKGM6RKVR9FM
      image_file: /config/www/snapshot_oprit/oprit.jpg
      model: gpt-4o
      target_width: 512
      temperature: 0.5
      detail: low
      include_filename: false
      message: >-
        Please check if there is a car in the driveway with the license plate
        "JX-820-Z" and respond with a JSON object. The JSON object should have
        a single key, "car_in_driveway," which should be set to true if - and
        only if - there is a car with the license number provided above in the
        driveway and false otherwise.
    response_variable: response
  - choose:
      - conditions:
          - condition: template
            value_template: >-
              {{ response.response_text | from_json.car_in_driveway }}
            enabled: true
        sequence:
          - service: input_boolean.turn_on
            target:
              entity_id: input_boolean.car_in_driveway
            data: {}
    default:
      - service: input_boolean.turn_off
        target:
          entity_id: input_boolean.car_in_driveway
        data: {}
    enabled: true
mode: single
The error that I get:
Message malformed: invalid template (TemplateAssertionError: No filter named 'from_json.car_in_driveway'.) for dictionary value @ data['sequence'][1]['choose'][0]['conditions'][0]['value_template']
Due to the error I can’t even save the script. I copied and pasted the code, so a typo shouldn’t be in there.
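For what it’s worth, the error seems to come from the filter chain being read as a single filter name. An untested sketch of that condition, with the from_json result parenthesized before reading the key, would be:
value_template: >-
  {{ (response.response_text | from_json).car_in_driveway }}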
It doesn’t work, and I don’t know if it’s a bug. The trace stops when it complains about a “timeline”. I can see the calls getting to Google Cloud but have no idea what that message means. There’s nothing meaningful in the HA documentation about the “timeline” the script is complaining about.
This is what Google is registering around the same time.