LLM Vision: Let Home Assistant see!

I have the same issue. What exactly did you do to fix it?
Never mind, it’s done.
Don't use:

media: "/media"

This works:

homeassistant:
  allowlist_external_dirs:
    - /config/media/llmvision/snapshots
  media_dirs:
    llmvision: /config/media/llmvision/snapshots

Hi,
first of all, this is a very nice integration! I started using it a couple of days ago.

Two questions:
I have two cameras in one automation. Google AI tells me what is happening on both camera streams, and the automation saves the keyframe. The strange thing is: why does it seem to use only one camera for this keyframe? Even when the Google analysis says there is nothing to see on cam 1 and activity on cam 2, the saved image is from cam 1.

Next question:
I can't get the image to show on the custom:llmvision-preview-card, and no images appear in the timeline functionality.

The keyframes are stored correctly, in my opinion, because this automation sends me the keyframe image via Telegram.

alias: AI achterdeur
description: ""
triggers:
  - type: motion
    device_id: 65eddb84809cc6c5861fe39898231642
    entity_id: e754dec645b108ebf718301857f5c15b
    domain: binary_sensor
    trigger: device
    for:
      hours: 0
      minutes: 0
      seconds: 5
  - type: motion
    device_id: 0e51d87576bf1d2d5a164b01be3f6343
    entity_id: f8f267a2a3253ae9a66473422c39ff30
    domain: binary_sensor
    trigger: device
    for:
      hours: 0
      minutes: 0
      seconds: 5
conditions: []
actions:
  - action: llmvision.stream_analyzer
    metadata: {}
    data:
      remember: true
      duration: 6
      max_frames: 5
      include_filename: true
      target_width: 1280
      max_tokens: 400
      generate_title: true
      expose_images: true
      provider: **
      message: >-
        Write a short description in dutch language of person(s) detected at the
        driveway, garden, fence or back door, and what is happening, based on
        this camera live feed. The description should highlight observable
        traits like clothing, hair color, accessories, age or gender. Keep the
        response under 1000 tokens.
      image_entity:
        - camera.voortuin_vloeiend
        - camera.achtertuin_vloeiend
    response_variable: llm_response
  - action: telegram_bot.send_message
    metadata: {}
    data:
      config_entry_id: **
      message: "{{ llm_response.response_text }}"
  - action: telegram_bot.send_photo
    metadata: {}
    data:
      config_entry_id: **
      file: /config/{{ llm_response.key_frame.replace('/config', '') }}
max_exceeded: silent
mode: single

Good afternoon. Is it possible to integrate Yandex AI Studio, which partially supports OpenAI API compatibility and the Gemma 3 27B model? Attempting integration via the custom OpenAI provider does not work: it uses the custom endpoint https://llm.api.cloud.yandex.net/v1, but besides the API key it also requires a catalog identifier. Reference documentation: Yandex Cloud Documentation.
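For context, their documentation puts the catalog (folder) ID inside the model URI rather than in a separate field. Below is a rough sketch of a direct call from Home Assistant, assuming the gpt://<folder_id>/<model> URI form from the Yandex docs; the API key, folder ID and command name are placeholders, and this is a plain rest_command, not an LLM Vision provider:

rest_command:
  yandex_gpt_chat:
    url: https://llm.api.cloud.yandex.net/v1/chat/completions
    method: POST
    content_type: application/json
    headers:
      Authorization: "Bearer YOUR_API_KEY"  # placeholder
    payload: >-
      {"model": "gpt://YOUR_FOLDER_ID/yandexgpt/latest",
       "messages": [{"role": "user", "content": "{{ prompt }}"}]}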

The newest release fixed the camera issue I described.
But there are still no pictures in the timeline. New behaviour:

When I click on an event, the AI text appears (just like before).
But in the log:

Login attempt failed

Login attempt or request with invalid authentication from 82-169-61-77.fixed.kpn.net (82.169.61.77). See the log for details.

Did you upgrade the card as well? You might also need to force-refresh the cache in your browser/app. See this issue: Invalid Authentication Error and No images in TimeLind Card - New 1.5.2 release · Issue #65 · valentinfrlch/llmvision-card · GitHub

This is likely a rate-limit issue. See these issues:

Also, try increasing max_tokens.

Thanks for your quick reply. Clearing the cache did the job. Thanks!

When I ditched Ring for ReoLink cameras I really missed the notifications for packages on my porch. Thanks to this thread and community, I was able to automate my own!

To say thank you :clap:, and because I feel it’s my duty to give back, here’s my automation:

- id: notification_package_detected
  alias: Notification Package Detected
  trigger:
    - platform: state
      entity_id: binary_sensor.doorbell_person
      to: 'on'
    - platform: time_pattern
      minutes: '/20'
  action:
    # TAKE CAMERA SNAPSHOT
    - delay: '00:01:00'
    - service: camera.snapshot
      data:
        entity_id: camera.doorbell_fluent
        filename: "/config/www/CCTV/check_for_packages.jpg"
    - delay: '00:00:05'
    # CHECK CAMERA IMAGE FOR PACKAGES
    - service: google_generative_ai_conversation.generate_content
      data:
        prompt: >
          There is a package on or near the steps. Answer as one word: True or False.
        filenames: "/config/www/CCTV/check_for_packages.jpg"
      response_variable: generated_content
    # SET PACKAGE STATUS TO TRUE OR FALSE
    - service: input_text.set_value
      target:
        entity_id: input_text.package_detected
      data:
        # trim stray whitespace so the state compares cleanly to 'True'
        value: "{{ generated_content['text'] | trim }}"
    # TOGGLE INPUT BOOLEAN (templated service name; service_template is deprecated)
    - service: >
        {% if is_state('input_text.package_detected', 'True') %}
        input_boolean.turn_on
        {% else %}
        input_boolean.turn_off
        {% endif %}
      target:
        entity_id: input_boolean.package_detected
    # SEND IOS NOTIFICATION (only when a package was detected; templating the
    # service name fails when the template renders empty, so gate with a condition)
    - condition: state
      entity_id: input_text.package_detected
      state: 'True'
    - service: notify.mobile_app_zachs_iphone
      data:
        title: EZ House Alert
        message: Package Detected at Front Door
        data:
          # /config/www is served at /local/, so reference the snapshot there
          # (the original '{{ filename }}' variable was never defined)
          image: /local/CCTV/check_for_packages.jpg
          entity_id: camera.doorbell_fluent
  mode: single

Thank you! :sparkling_heart:


After updating to version 1.5.2 it stopped using my memory. I have about 60-80 pictures in memory that it no longer uses. Is there some database I need to go back to?

I am trying to add OpenRouter to the LLM Vision integration. When I add the API key and choose the openai/gpt-4o-mini model, I get an error: "Could not connect to the server. Check your API key or IP and port."

I am sure the API key is correct. What could be wrong?

Are images missing in notifications for anyone else? I can see them in the media directory, and they appear in the timeline card with the description. I tested using the Reolink example code (no LLM Vision) with snapshots in the same media directory, and that worked. The snapshots are in media/llmvision/snapshots. I did notice that the notification configuration for LLM Vision indicates those photos should be in subdirectories for each camera, which mine are not.

Am I alone, or does anyone have other tips to debug?
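For what it's worth, taking the blueprint out of the loop and attaching a known file by hand is a useful test. A minimal sketch, assuming the companion app can fetch files under /config/media via a /media/local/... path; the notifier name and file name are hypothetical:

  - action: notify.mobile_app_your_phone  # hypothetical notifier
    data:
      message: Test attachment from the LLM Vision snapshot dir
      data:
        # assumption: /config/media/llmvision/snapshots/... is exposed
        # to the apps as /media/local/llmvision/snapshots/...
        image: /media/local/llmvision/snapshots/front_door.jpg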

I was just trying with Claude Sonnet 4.5, but I'm getting this error:

"Full Response: {"type":"error","error":{"type":"invalid_request_error","message":"temperature and top_p cannot both be specified for this model. Please use only one."}}"

I tried setting temperature to 0 but still get the error. I guess there should be a way to disable one of those two settings? Thanks!

Hi, this project is amazing! It would be nice to have a URL option for the file path, for people who don't run Frigate as a Home Assistant add-on. From what I understand, currently you can only add a local file path. Thank you!

This might be why I keep getting 400 errors with Ollama and LLM Vision. Thanks for the heads-up.

This may be a simple referencing mistake, but do I have the correct format here? Is there something needed before referencing the variable, or an escape sequence of some kind?

I'm trying to get my Alexa device to read {{ response.response_text }}, but the blueprint-based automation keeps erroring out:

"Stopped because of unknown reason at December 4, 2025 at 4:54:04 PM (runtime: 0.01 seconds)"

If I change what it reads to {{ label }}, it works just fine and reads it aloud. I looked all over the blueprint, and it would appear both {{ label }} and {{ response.response_text }} are valid.

This works just fine:

      - action: notify.send_message
        metadata: {}
        data:
          message: "{{ label }}"
        target:
          entity_id: notify.joe_s_echo_speak

This does not work:

      - action: notify.send_message
        metadata: {}
        data:
          message: "{{ response.response_text }}"
        target:
          entity_id: notify.joe_s_echo_speak
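For comparison, the working pattern earlier in this thread pairs the analyzer action with a response_variable of the same name in the same script context; if the blueprint never sets response_variable: response, the template renders nothing and the action errors out. A minimal sketch of the pairing, where the provider ID and camera entity are placeholders and image_analyzer stands in for whichever analyzer the blueprint calls:

      - action: llmvision.image_analyzer
        metadata: {}
        data:
          provider: YOUR_PROVIDER_ID  # placeholder
          message: Describe what the camera sees.
          image_entity:
            - camera.front_door  # hypothetical entity
          max_tokens: 100
        response_variable: response  # must match {{ response.response_text }}
      - action: notify.send_message
        metadata: {}
        data:
          message: "{{ response.response_text }}"
        target:
          entity_id: notify.joe_s_echo_speak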

Just curious, before I drop the cash on a 360 camera, has anyone played with these types of cameras and seen how LLMs interpret the warped footage?

Same on my side. I am on iOS, and for the life of me I cannot get a snapshot to show in my notification.

Is it possible in the blueprint to add a delay between issuing the snapshot and saving it and sending it to analysis? My Arlo cameras take forever to produce one, so it's often the snapshot from the last trigger that ends up being sent for analysis.

I have noticed the same.

I timed it: from sending the command to actually seeing a new snapshot takes 5-7 seconds, so the delay has to come after the snapshot request.
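Until the blueprint supports a configurable delay, the idea can be sketched in a plain automation: request the snapshot, wait longer than the 5-7 seconds the camera needs, then analyze the saved file. This assumes image_analyzer accepts a path via image_file; the camera entity, provider ID and file path are placeholders:

  - action: camera.snapshot
    target:
      entity_id: camera.arlo_front  # hypothetical entity
    data:
      filename: /config/media/llmvision/arlo_front.jpg
  - delay: "00:00:08"  # longer than the 5-7 s the camera needs
  - action: llmvision.image_analyzer
    data:
      provider: YOUR_PROVIDER_ID  # placeholder
      message: Describe what triggered the motion.
      image_file: /config/media/llmvision/arlo_front.jpg
      max_tokens: 100
    response_variable: response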