LLM Vision: Let Home Assistant see!

Yeah if you click on the album you can see my other pics showing that.
I’m pretty sure it was something to do with the cooldown. If I set the cooldown to 0 it proceeds through the automation, but now I’m stuck with an "Error: Failed to fetch frigate clip". If I check my Frigate detections there is nothing there, which explains why the script can’t find the clip.

The model didn’t run, and the error was something like it couldn’t find the model. I deleted the 11b model and am now using just one model, with the name “llama3.2-vision:90b” in the blueprint; that seems to be working.

Separately, for the card example you have on the instructions website, I modified the script to take a fresh snapshot (button press option), wait 2 seconds, and then run the script. I also added a qualifier to keep the response to 255 characters, since that’s the input_text helper limitation, and got it to work with Ring. So far this is working decently with Ring + Scrypted.
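
In case it helps anyone, here’s a rough sketch of that script. The entity names, the provider ID, the file path, and the `response_text` key are placeholders from my setup, so check them against the llmvision service description in Developer Tools:

```yaml
script:
  analyze_front_door:
    sequence:
      # take a fresh snapshot from the Ring/Scrypted camera
      - service: camera.snapshot
        target:
          entity_id: camera.front_door
        data:
          filename: /config/www/llmvision/front_door.jpg
      # give the snapshot a moment to be written
      - delay: "00:00:02"
      # analyze the saved image with LLM Vision
      - service: llmvision.image_analyzer
        data:
          provider: YOUR_PROVIDER_ID        # config entry from the LLM Vision integration
          image_file: /config/www/llmvision/front_door.jpg
          message: "Describe what is happening in front of the door."
          max_tokens: 100
        response_variable: result
      # store the reply, truncated to the 255-character input_text limit
      - service: input_text.set_value
        target:
          entity_id: input_text.last_llmvision_result
        data:
          value: "{{ result.response_text | truncate(255) }}"
```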

Is it possible to create a card for a Home Assistant dashboard that shows the latest notification from the blueprint? I know the image is saved in www/llmvision and the result from the AI is in the LLM calendar.

I’m hoping to add a clickAction to the notification that takes me to my Alarmo dashboard. There I could display alarm status, live feeds, door sensors, etc., but also the content from the last notification.
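
Something like this is what I’m aiming for (the notify service and dashboard path are placeholders; `clickAction` is the Android companion app key for tap navigation, iOS uses `url`):

```yaml
service: notify.mobile_app_your_phone      # your companion app device
data:
  title: "Front door"
  message: "{{ response_text }}"           # whatever the blueprint/automation produced
  data:
    clickAction: /lovelace-alarmo/home     # Android: path to the Alarmo dashboard view
    url: /lovelace-alarmo/home             # iOS equivalent
```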

Is it possible to include images in the prompt to compare the provided images with the camera input like this person did here?

This is planned for a future release. If someone has some experience with cards, I’d love to collaborate!

I would like to improve the docs regarding Ring support. Could you share how you got your cameras working with LLM Vision?

Thank you for the excellent integration; I just have a question regarding Android notifications. When I receive a notification with a description of the camera view, only the first part of the description shows in the notification. See screenshot attached. When I click on the notification, it takes me to my camera views and I can’t figure out how to see the rest of the description.
Is there a way to show the full scene description, and perhaps a list of previous descriptions, either in the notification itself or in Home Assistant?

Here are more details on Android notifications and creating expandable and large text block notifications.
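
As a rough sketch (the notify target and template are placeholders): if you send the full description as the notification message, the collapsed notification shows only the first line, but expanding it (swipe down / tap the arrow) reveals the whole text:

```yaml
service: notify.mobile_app_your_phone
data:
  title: "Camera event"
  # Android renders long messages as an expandable big-text notification;
  # the full description is visible once the notification is expanded.
  message: "{{ result.response_text }}"
```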

Hi all, I’m using the camera blueprint with Frigate but get this error…

Any idea what I need to do to fix it?

The automation "AI Event Summary Doorbell" (automation.ai_event_summary_doorbell) has an unknown action:llmvision.video_analyzer .

This is likely related to Camera, Frigate: Intelligent AI-powered notifications - #87 by Bogey

Hi all,

Have been playing around with this and Google Gemini, and it works really well.

The other day I had the automation fail on “Model is overloaded”, so LLM Vision couldn’t analyse the image. That makes sense because it’s the free tier, but it broke the automation so I didn’t get any alerts.

Just wondering how people have handled these errors? At the moment the automation fails. I would like it to continue but give me a generic error. I could use continue_on_error: true, but this doesn’t seem the best way forward. Ideally LLM Vision should return something.
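
For illustration, the continue_on_error workaround I mean would look roughly like this (the provider ID, entities, and the `response_text` key are placeholders), falling back to a generic message when the call fails:

```yaml
- service: llmvision.image_analyzer
  continue_on_error: true                  # keep the automation running if the provider errors out
  data:
    provider: YOUR_PROVIDER_ID
    image_entity: camera.front_door
    message: "Describe the scene."
    max_tokens: 100
  response_variable: result
- service: notify.mobile_app_your_phone
  data:
    title: "Front door"
    # use the AI description if we got one, otherwise a generic alert
    message: >-
      {{ result.response_text
         if result is defined and result.response_text is defined
         else 'Motion detected (AI analysis unavailable)' }}
```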

I’ve got the Phi 3 Vision Mini loaded in LM Studio. Prompting it with an image directly gives a detailed analysis. However, despite tying the model to the LLM Vision integration and configuring the blueprint (selecting the server I created with the integration, and trying both the Frigate and the Camera option in the blueprint), the closest I can get is “motion detected”. I get no analysis from the LM Studio server and nothing in the log, just a notification on my Android that says “Motion Detected”. I’m frustrated with it and not sure what I’m missing. If anyone has any ideas I would greatly appreciate it.

I’m having similar issues with this and the Ollama integration. I am curious if you’re also running the LLM on a Mac?

I have Ollama running natively on the Mac and OpenWebUI running on the Mac via Docker. I can access OpenWebUI from any machine on my network. My HA Yellow “sees” Ollama at IP:PORT and sets up in both Assist and LLM Vision, but no communication actually happens. There is no firewall running on the Mac, so I’m not sure what could be stopping it from communicating.

When running Ollama on macOS you need to set the env variable OLLAMA_HOST to “0.0.0.0”. Unfortunately you have to do this every time you restart your Mac. You can do so with this command:

launchctl setenv OLLAMA_HOST "0.0.0.0"

Amazing work on this! Can you please explain some of the “Tap Navigate” options? I can get it to display my dashboard of choice, but I would rather have it show something more related to the clip, and the suggested “e.g. /lovelace/cameras” doesn’t seem to exist in my folder structure. What are people putting here? Thank you!

Does anyone know if it’s possible to only trigger this for a certain zone in Frigate?

Anyone gotten this working with the Amcrest AD410 and the Amcrest integration?

I think I have the LLM side set correctly. Just not understanding the best way to pull the doorbell in through the integration to get this to work.

I use the blueprint and right now it just tells me motion was detected.

Anyone know how I might get the image analysis service to read from a media source (Media source - Home Assistant)?

Thanks

Hello Valentin, a quick question if you don’t mind. I have Extended OpenAI Conversation working well, and I can control all instances without issues. I’ve set several specs in the add-on configuration, and everything works perfectly. However, what code should I put inside “specs” so that when I ask in the chat about a particular camera (or multiple cameras), it responds correctly and doesn’t say it can’t analyze images? If I go to Developer Tools and run tests, the analysis works, but it doesn’t work within the chat. I must be missing something; I think it’s some code I need to add inside specs, but I can’t figure out what it is right now.
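
For reference, a spec entry of the kind I’m describing would look roughly like this. It’s only a sketch: the function name, provider ID, and the `_function_result` return convention are assumptions to double-check against the Extended OpenAI Conversation examples:

```yaml
- spec:
    name: analyze_camera
    description: Analyze a camera's current view and describe what is visible
    parameters:
      type: object
      properties:
        camera_entity:
          type: string
          description: Camera entity_id, e.g. camera.front_door
      required:
        - camera_entity
  function:
    type: script
    sequence:
      - service: llmvision.image_analyzer
        data:
          provider: YOUR_PROVIDER_ID          # LLM Vision config entry
          image_entity: "{{ camera_entity }}"
          message: "Describe what you see on this camera."
          max_tokens: 100
        # Extended OpenAI Conversation returns this variable as the function result
        response_variable: _function_result
```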

Also interested in this - do you have a separate “conversation” for each camera feed? Are you able to reference previous images or specific people/cars/etc. in the conversations?