I just want to start off by saying I’m new to Home Assistant and Frigate in general. Over the past couple of days I’ve gotten my setup to the point where Frigate is recording and Home Assistant sends me basic notifications when a person, car, dog, etc. is detected.
I’m now looking into adding an AI overview to expand the setup, but it’s simply not working. I’ve got all the prerequisites set up, including the LLM Vision integration with a Gemini API key, and I’ve used a blueprint to set up my automation.
However, when I manually run it, I get a basic notification on my phone stating:
Home Assistant
Motion detected
has detected activity
and that’s all I get. People coming up to the doorbell don’t trigger an alert or anything.
If it matters, the camera entity I’m using, “Doorbell”, should be a Frigate entity, I believe. I can also select the Reolink doorbell camera itself, but it produces the same result.
I assume the Trigger State or perhaps the Motion Sensor entries are incorrect. With various motion sensor entries, manually running the automation does nothing at all.
Can someone please advise? I feel like I’m losing my sanity here. I’ve tried various Gemini models as well.
I apologize, the above has been fixed. It turns out it takes a couple of minutes to get the notification; it isn’t instant like I was expecting. I’m now trying to get the LLM Vision timeline to work, but it’s not saving events at all.
Was there a change in the language settings?
In the past I got the text in German, but for a few weeks now the text has been English only, even though the prompt is in German. How can I get the German text back?
It probably changed when the system prompt was introduced. The system prompt applies to all calls and is useful for general information and behavior. You can change it in the “Memory” settings; for example, adding a line like “Always respond in German” (or its German equivalent) to the system prompt should pin the output language.
@valentinfrlch When I “Take Control” of the blueprint to analyze it, I see that you are including the Apple iOS-only flag “interruption-level”:
interruption-level: >-
  {{ 'passive' if notification_delivery == 'Dynamic' else 'active' }}
Could you also add the Android equivalents, the ttl: and priority: tags? Otherwise, if my phone is sitting on my desk, I don’t get notifications until I pick it up.
service: notify.mobile_app_*
data:
  message: Test
  data:
    ttl: 0            # deliver immediately instead of batching
    priority: high    # allow the notification to wake the device from Doze
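In the meantime, a combined payload seems to work when I take control of the blueprint myself; as far as I can tell each platform simply ignores the other’s keys (a sketch only, and the entity name is a placeholder):

service: notify.mobile_app_your_phone
data:
  message: Person detected at the front door
  data:
    interruption-level: active  # iOS only; Android ignores this
    ttl: 0                      # Android only; iOS ignores this
    priority: high              # Android only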
As far as I know, the way to do it is to defer the ‘remembering’ until you can evaluate the result, and then use the separate ‘remember’ action if warranted by the result.
This works for me - but I don’t use the blueprint; I rolled my own automation instead.
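Roughly, it looks like the sketch below. Note that the llmvision.remember field names (title, summary) are my assumption from memory - check the action’s schema in Developer Tools before copying:

- action: llmvision.image_analyzer
  data:
    provider: xxx                  # your provider ID
    message: Describe the person at the front door, if any.
    image_entity:
      - camera.front_door          # placeholder entity
    remember: false                # defer remembering until we can judge the result
  response_variable: result
- if:
    - condition: template
      value_template: "{{ 'person' in result.response_text | lower }}"
  then:
    - action: llmvision.remember   # field names below are assumptions
      data:
        title: Person at the front door
        summary: "{{ result.response_text }}"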
I was just going to ask about notification updates - it looks like you made some for Android. Is iOS coming? Specifically, I’d like notifications to persist in the notification tray until I dismiss them.
Thanks, incredibly useful and versatile integration here!
Having some challenges with Gemini not returning the correct response, however. In my case it’s not that it fails to process (although that does occasionally happen), but that it processes what looks like a very clear image and gives the wrong answer to the question of how many bins it can see (3 rather than 4!). Interestingly, when using the Google UI directly it always seems to get the answer correct, although it clearly takes longer to do so than the API response takes from LLM Vision. I switched to the Flash 2.5 model and this is a little better, but it’s still making more mistakes than I would have thought it should, given the relative simplicity of the problem I’m asking it.
Does anyone have any thoughts around tuning the parameters for Gemini such that it tries a bit harder to get the correct answer, rather than responding quickly with an incorrect one?
edit: One thing I’ve just done is increase the resolution of the sampled image from 640x480 up to 2560x1920 (the max of the camera), so we’ll see if this makes a difference…
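For reference, the knobs I’m tweaking all live on the analyzer call itself (a sketch - the provider ID, camera entity, and exact model name are placeholders):

- action: llmvision.image_analyzer
  data:
    provider: xxx
    model: gemini-2.5-flash       # whichever model you are testing
    message: How many bins can you see? Answer with a single number only.
    image_entity:
      - camera.garden             # placeholder
    target_width: 2560            # send the full-resolution frame instead of a downscale
    detail: high                  # request higher-detail analysis
    temperature: 0.1              # lower temperature for more deterministic answers
    max_tokens: 10
  response_variable: bin_count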
edit 2: Still testing, but one thing that’s very useful is adding some retries to the call to handle errors in the response; this has significantly improved stability for me. e.g.:
- repeat:
    sequence:
      - action: llmvision.image_analyzer
        data:
          provider: xxx
          message: >-
            Is there washing hanging on a washing line in the upper left side
            of this image? Please answer with on if you see washing or off if
            you do not. Do not reply with any other value and ensure that the
            response is returned with no extra characters.
          image_entity:
            - camera.patio
          include_filename: true
          max_tokens: 100
          temperature: 0.2
          expose_images: true
        response_variable: washing_status
    until: "{{ washing_status.response_text in ['off', 'on'] or repeat.index >= 4 }}"
In this case, it will try up to four times, or until it gets an “on” or “off” response, whichever happens first. So far it has never taken more than two attempts to get a valid response. It still might not be the correct response, but it does at least filter out the “Event detected” responses and other overtly incorrect answers.
Until December, the code below worked fine and I received a ‘reasonable’ description of the photo. I recently upgraded the analyzer and now I only get the description ‘Event Detected’. What could be the reason?
remember: false
include_filename: false
target_width: 1280
detail: low
max_tokens: 100
temperature: 0.2
expose_images: false
provider: 01JG3R7V7xxxxxPADFM9CXC78
model: gemini-1.5-pro
message: >-
  Describe what you see in one sentence. If you see a person, describe
  what he/she looks like. What is the current date and time?
image_file: /config/www/images/snapshots/snapshot_voordeur_foscam.jpg
Hi @valentinfrlch, thank you for this integration. I’m trying to get it working with my Google Nest cameras; however, I get a description that says the image is dark. Is there any way to make this integration work with my cameras?
As an additional note, the cameras do work. They are dark initially, but when I click on them and press play, the image shows up.
Good morning. I’m trying to use Venice.ai, since I have a lot of free API tokens there. Venice has an OpenAI-compatible endpoint and three different models with vision capabilities: Llama-4-Maverick-17B, Qwen-2.5-VL, and Mistral-3.1-24B. I managed to get a single image analyzed fine, but with multiple images I got the following response:
Error: ‘str’ object has no attribute ‘get’
When I switched to OpenAI’s gpt-4o-mini, it worked fine with the same automation. I’m using the provided blueprint, but I also made my own script with the help of ChatGPT and got the same error with multiple images, while one image worked fine.
My guess is that the problem lies in how the Venice API responds. It would be amazing to get it working, as all my detections would then be free.
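For reference, the failing call is just the standard analyzer with more than one image attached (a sketch - the provider ID, model, and entity names are placeholders):

- action: llmvision.image_analyzer
  data:
    provider: xxx                 # Venice provider ID
    model: qwen-2.5-vl            # use the exact model ID from Venice's docs
    message: Describe what you see.
    image_entity:                 # fine with one entry, errors with several
      - camera.front_door
      - camera.driveway
  response_variable: description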
Edit: it seems Venice doesn’t support streams. I talked to their devs and requested the feature; they will start working on it tomorrow.
I have a Docker installation of Home Assistant. To set up memory in LLM Vision, I put my example.jpg at /local/llmvision/memory/example.jpg, but I get an “invalid path” error message.
I already tried /config/llmvision/memory/example.jpg and got the same error.
SOLVED: The LLM Vision plugin expects the path to the image in the real file system of the Home Assistant container, i.e. not under www/, but directly in its config structure. Since www is only for web access (/local/), that path is not available to the memory function. I created the llmvision folder directly in /config; before, I had it in www.
The cameras need to be “woken up” before the stream can be accessed. I’ll try to implement this in the integration.
@xup Just a guess, but perhaps Venice.ai has a similar limitation to Groq, which only allows one image per request. Google has a free tier that supports multiple images, though.
Yes, it seems like it; I’m talking with their devs now to have them fix it. I’m guessing Google’s free tier requires you to share your data for their model training. The OpenAI API also has free daily tokens if you allow content sharing for evaluation: 250k tokens per day for the regular models and 2.5M tokens per day for the mini models. Not sure I want to share my photos, though. That’s why I want Venice to fix their API: they are a privacy-first service whose business model doesn’t include spying on their users like the others do.
Well, honestly, I’m not sure they can “fix” their API, or want to, because it seems like an intentional choice. Running these models is not cheap, especially with multiple images, which increases the number of tokens significantly.
It’s either share your data so that the service is “free” (the price being your privacy), or pay for it yourself. Nothing is free.
I’d recommend looking into Ollama. There are some smaller models that you might be able to run on an old laptop or PC.
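For example, once an Ollama provider is set up in LLM Vision, the same analyzer call can point at a small local vision model (a sketch - the provider ID is a placeholder, and llava is just one common small vision model):

- action: llmvision.image_analyzer
  data:
    provider: xxx                 # your Ollama provider ID
    model: llava                  # small local vision model; moondream is lighter still
    message: Describe the image in one sentence.
    image_entity:
      - camera.front_door         # placeholder
  response_variable: description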
Venice only has a paid API, no free tier. But I have a lot of included usage (around $20/day) because I was airdropped their token and stake some of it in exchange for ACU (API Compute Units).