What version are you on and what did you do to set up LLM Vision? You can follow the steps in the documentation.
I followed these instructions step by step; they are clear and well written. Ollama is installed locally and added to Home Assistant, and I added the minicpm-v model. Communication with it through the assistant in Home Assistant is working, but when I create any script to get a description, an error appears.
I set up a quick test automation:
- When motion is detected on the front camera
- Analyze the camera stream via LLM (Gemini)
- Save the result to a response variable
- Use the response variable in a text notification and also for Alexa TTS
Everything works well, but when Alexa notifies, it says
"Response underscore text", followed by whatever description the LLM generated.
Also, most of the time Alexa says words like "asterisk", "backslash", etc.
How do I trim all these extra characters and words, especially that "response underscore text" it says every time in the notification?
LLM Analyzer YAML
action: llmvision.stream_analyzer
data:
  remember: false
  duration: 10
  include_filename: false
  max_tokens: 100
  temperature: 0.2
  expose_images: true
  provider: 01JK4774TM8F9YC
  message: >-
    You are my security guard. Describe what you see. Don't mention trees,
    bushes, grass, landscape, driveway, light fixtures, yard, brick, wall,
    garden. Don't mention the time and date. Be precise and short in one or two
    sentences. Only describe the person, vehicle or the animal, otherwise just
    say, "nothing to report"
  model: gemini-2.0-flash-001
  image_entity:
    - camera.front_yard_right_high_resolution_channel
  max_frames: 10
response_variable: frontdoordeliveryvariable
Below is YAML for Echo voice notify
action: notify.alexa_media_s_echo_pop
metadata: {}
data:
  message: "{{ frontdoordeliveryvariable | string }}"
enabled: true
alias: Voice Alert via Alexa Pop
Use {{ frontdoordeliveryvariable.response_text }} as the message input for TTS. frontdoordeliveryvariable is a JSON object.
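For example, the Echo action above could be changed like this (a minimal sketch; the extra replace filter that strips asterisks from the spoken text is my own addition, not part of the original reply):
action: notify.alexa_media_s_echo_pop
metadata: {}
data:
  message: "{{ frontdoordeliveryvariable.response_text | replace('*', '') }}"
enabled: true
alias: Voice Alert via Alexa Pop
Tightening the prompt itself, for example telling the model to avoid markdown, emojis and special characters, also keeps the spoken output clean; the trash bin automation further down does exactly that.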
I’ve added a Gallery section to the website. The goal is to collect examples and use cases for the community in one place and make them easy to copy, paste and use yourself. Have you built something with LLM Vision and want to share it? Then please reach out and I’ll add it to the gallery.
Thank you so much.
It worked after clearing the browser cache.
I loaded an Ollama LLM locally on my PC (Windows 11):
ollama pull llava-phi3
When I open the URL ‘http://127.0.0.1:11434/’ in Chrome it tells me “ollama is running”.
However, when I add an entry in the LLM Vision integration, it rejects me with:
“Could not connect to the server. Check you API key or IP and port”
What’s wrong?
Thanks for your support.
127.0.0.1 always points to the computer running the program. I assume Home Assistant is running on another machine. In this case you will need to find the actual IP address of your computer. You can follow this guide for Windows.
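Concretely (an illustration, assuming Ollama's default port 11434): in the LLM Vision provider setup you would enter the machine's LAN address, for example http://192.168.0.243:11434, rather than http://127.0.0.1:11434.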
I have entered the IPv4 address 192.168.0.243 and get just the same error.
IP address of HA: 192.168.0.19x
Home Assistant is running on the same PC where I have the Ollama server (drive C:), and my AVG antivirus is disabled.
Maybe it is a public vs. private network issue?
Worked like a charm!!!
Appreciate it.
Thank you
Here is the corrected code for this small automation.
- The LLM scans for the garbage bin on my front yard camera.
- Depending on whether the bin is detected or not, Alexa notifies based on the detection result.
I like the way I can direct the LLM to look only at the left side of the picture.
There is an electrical box on the right side of the picture that often produced false positives as a trash bin. Adding "Look only at the left side of the picture" fixed this issue. Awesome!!
alias: Trash Bin Detector
description: ""
triggers:
  - trigger: time
    at: "21:05:00"
  - trigger: time
    at: "21:45:00"
  - type: motion
    device_id: 8efb7999473a61cd5879a6fae738
    entity_id: baa87d878f6d0795854befaaf860
    domain: binary_sensor
    trigger: device
    alias: G5 Camera Feed
conditions:
  - condition: time
    weekday:
      - sun
    after: "21:04:00"
    before: "22:00:00"
actions:
  - delay:
      hours: 0
      minutes: 0
      seconds: 30
      milliseconds: 0
    enabled: false
  - action: llmvision.image_analyzer
    data:
      remember: true
      include_filename: false
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      provider: 01JKPYX533SYCPCCFXP7MW0
      image_entity:
        - camera.front_yard_right_high_resolution_channel
      message: >-
        You are my picture Analyzer. On the left side of the picture if you see
        any trash bin or any garbage container, say "I'm Glad to see that you
        have taken out the Trash". If you don't see any garbage containers then
        say "Today is Sunday! It seems like you have not taken out the Garbage.
        You need to take out the Garbage. Maybe your son can help out too?"
        Followed up by some kind of humor. Avoid special characters like smiley
        face, sun shine or other kinds of emojis.
      expose_images: true
    response_variable: llmvariable
    enabled: true
  - action: notify.mobile_app_iphone_13
    data:
      message: "{{ llmvariable.response_text }}"
    enabled: true
  - action: notify.alexa_media_echo_dot
    metadata: {}
    data:
      message: "{{ llmvariable.response_text }}"
    enabled: true
  - action: notify.alexa_media_echo_show
    metadata: {}
    data:
      message: "{{ llmvariable.response_text }}"
    enabled: true
  - action: notify.alexa_media_echo_pop
    metadata: {}
    data:
      message: "{{ llmvariable.response_text }}"
    enabled: true
mode: single
Now I’m excited to create another automation to check if the garbage truck has arrived and collected the garbage bin or not. I will probably be using the LLM Stream Analyzer.
I never thought garbage bins could be so much fun!
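A rough starting point for that call, reusing the Stream Analyzer parameters from earlier in the thread and the provider and camera entity from the automation above (the prompt, duration and response variable name are placeholders I made up, to be adapted):
action: llmvision.stream_analyzer
data:
  remember: false
  duration: 10
  max_frames: 10
  include_filename: false
  max_tokens: 100
  temperature: 0.2
  expose_images: true
  provider: 01JKPYX533SYCPCCFXP7MW0
  image_entity:
    - camera.front_yard_right_high_resolution_channel
  message: >-
    Watch the street and the curb. If a garbage truck stops and empties the
    bin, say "The garbage has been collected". Otherwise say "No collection
    yet".
response_variable: garbagetruckvariable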
Anybody else seeing this? It only started for me today. I recently updated to LLMVision 1.3.8 and HASS Core 2025.2.3.
action: llmvision.image_analyzer
data:
  include_filename: true
  max_tokens: 100
  temperature: 0.2
  provider: 01JKE9GJSANH5EEMVNVGZG2NBY
  image_file: /config/downloads/Wed12-02-202583802AM.jpg
  message: >-
    Describe the image.
Gives me:
“Failed to perform the action llmvision.image_analyzer. Error: cannot identify image file ‘/config/downloads/Wed12-02-202583802AM.jpg’”
I thought maybe it doesn’t have access to that location, but I confirmed I have that folder location added in allowlist_external_dirs.
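For reference, a minimal allowlist entry in configuration.yaml looks roughly like this (the path is taken from the action above; adjust it to your setup):
homeassistant:
  allowlist_external_dirs:
    - /config/downloads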
Any chance to make this blueprint notify through Telegram?
Nevermind, I tried it a few minutes later and it worked.
I was interested in trying this out. I have HACS installed and I have other things working from it (dahua integration, bubble card, nest protect, etc.) but when I visit:
http://(MY HA IP)/hacs/repository?owner=valentinfrlch&repository=ha-llmvision
I get:
Repository valentinfrlch/ha-llmvision not found
Is this a new issue or something I’m doing wrong?
edit: I’m on HA 2025.1.4
Hi. I installed LLM Vision and am using the blueprint with Frigate for my doorbell. When a person is detected, the alert goes to my phone and opens the recording; however, it does not give me a detailed description of the snapshot. All it says is "Person detected". I am running the HA app on my Galaxy Fold 5. Any suggestions?
I am trying to install the custom repository by clicking the link within the instructions, but I am getting this error → Repository valentinfrlch/ha-llmvision not found. Has something changed?
Does anyone have an idea or an implementation for composing a prompt for an assist pipeline, so that the conversation agent can call LLM Vision and use the response to have a further conversation with the user?