LLM Vision: Let Home Assistant see!

(Feature request)
Was wondering if it is possible to add some type of zoom capability to the picture on the card. On my phone it is difficult to see the picture due to the screen size. A button to make the image fill the screen would work as well.

I am using the v1.4.1 blueprint and have multiple cameras and motion detectors. I frequently get this message in the trace timeline: “Stopped because only a single execution is allowed at March 21, 2025 at 12:21:34 PM (runtime: 0.01 seconds)”. Will this support running in parallel mode?
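
For reference, the “Stopped because only a single execution is allowed” message is what mode: single produces when a second trigger arrives mid-run. In plain automation YAML the relevant keys look like this (whether this blueprint exposes them is up to its author):

mode: queued   # or parallel; single drops overlapping runs
max: 10        # upper limit on queued/parallel runs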

I tried Google and had good results with it.

You just need to add your Ollama host in LLM Vision.

and then use it in the Blueprint/Automation:
Provider: use your Ollama host
Model: gemma3:12b
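
If you would rather call the action directly than go through the blueprint, a minimal sketch looks like this (the provider ID placeholder and camera.front_door are examples, not values from this thread):

action: llmvision.image_analyzer
data:
  provider: YOUR_OLLAMA_PROVIDER_ID  # placeholder; pick your Ollama entry in the UI to get the real ID
  model: gemma3:12b
  image_entity:
    - camera.front_door  # example entity
  message: Describe what you see in one or two sentences.
  max_tokens: 100
  temperature: 0.2
response_variable: response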

I get “custom element doesn't exist: llmvision-card” after updating HA to 2025.03.1. Does anyone have the same problem?

I have added Ollama as a Docker container, not as an add-on. When I set up the integration between Home Assistant and Ollama I have to pick a model from a list, and Gemma3 is not available. The model I pick is the one that is then used in LLM Vision. How have you installed Ollama?

Yes, I am having the same issue. I tried removing and reinstalling the LLM Vision integration as well as the timeline and memory entries. Watching this thread for a possible solution.

… fiddled and ended up resolving it after noticing that it worked in another browser, so I went ahead and cleared the Chrome cache.

I have a bit of a different setup. I use Ollama on a Mac mini M4, and that’s the only software running on a physical machine. HA and many other containers are hosted on a Kubernetes cluster.

When you add Ollama to LLM Vision there is NO model selection; it is just host:port or IP:port. When you configure the automation from the blueprint, you select the Ollama host you configured in LLM Vision as the Provider, and under Model you enter gemma3:12b or whatever your model is.

In the UI you can override it by entering a higher number in the input box and saving.

Can somebody please tell me how to enter a proper image path into the “Memory” settings?
Is it relative to one of Home Assistant’s paths?

The example data says /config/llmvision/memory/example.jpg. I’m trying to point to /config/www/llmvision/memory/photo.jpg and still get the “One or more image paths are invalid.” warning. Both the path and the file exist (assuming that said /config/ directory is where configuration.yaml is).

Hi @valentinfrlch

I just started playing with it yesterday and it works really well. I am using Claude as the AI service.

I am a little confused about the “asking about events” feature.
I set up the “Anthropic Conversation” integration and I can use it in the assist chat.
I am wondering where to put the “spec” listed here: Asking about Events | LLM Vision | Getting Started
Does it belong in Home Assistant’s general configuration.yaml? I tried to put it there but got lots of error messages in VS Code.

Here are my questions:

  • Where do I put the “spec”?
  • Is it working with the Anthropic Conversation integration?

Thanks a lot for your great work!

It is no longer necessary to install the OpenAI Extended Conversation integration. It now works natively with Assist.

See the updated guide here: Asking about Events | LLM Vision | Getting Started

@pejotigrek Are you running HA in a docker container?

@valentinfrlch Yes, that’s correct. Docker container on an Ubuntu server.

Trying to use the integration to check a license plate to open my garage door.
For simplicity’s sake I was trying to just set it up to return a value in a script. I can’t save the script because I get this error:

Message malformed: invalid template (TemplateAssertionError: No filter named ‘from_json.car_in_driveway’.) for dictionary value @ data[‘sequence’][1][‘choose’][0][‘conditions’][0][‘value_template’]

Here is my current YAML.

alias: Garage Door opener
sequence:
  - action: llmvision.image_analyzer
    data:
      include_filename: false
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      provider: 01JQF0C7AH8A6RW5D8F838N9SZ
      image_entity:
        - camera.garage_snapshot
      message: >-
        Please check if there is a car in the driveway with the license plate
        "1111111" and respond with a JSON object. The JSON object should have
        a single key, "car_in_driveway", which should be set to true if - and
        only if - there is a car with the license number provided above in the
        driveway and false otherwise.
    response_variable: response
  - choose:
      - condition: template
        value_template: >-
          {{ response.response_text |
          from_json.car_in_driveway }}
        enabled: true
        sequence:
          - data: {}
            action: input_boolean.turn_on
            target:
              entity_id: input_boolean.car_in_driveway
    default:
      - data: {}
        action: input_boolean.turn_off
        target:
          entity_id: input_boolean.car_in_driveway
    enabled: true
mode: single
description: ""


@valentinfrlch
Regarding the timeline card: is it in some way possible to add a date or time range selection for the displayed events?
I can see and select all events with a normal calendar card, but without pictures.
If there is no such functionality right now, is it conceivable that you will implement it in the future?
Thanks.

Trying to test this out with Ollama as my provider, but I’m getting the following error when testing in Developer Tools > Actions. I’m assuming this might have to do with my setup; Ollama runs on a different server than HA. I’m not sure where to look to see what is performing the POST and why it’s trying to use localhost.

Failed to perform the action llmvision.image_analyzer. POST predict: Post "http://127.0.0.1:46441/completion": EOF

action: llmvision.image_analyzer
data:
  remember: false
  use_memory: false
  include_filename: false
  target_width: 1280
  max_tokens: 100
  temperature: 0.2
  generate_title: false
  expose_images: false
  provider: 01JQJBSRHQQA3ZB218S6KD1K1C
  message: >-
    Describe this image in a few sentences to be used in a mobile notification
    triggered by motion detected.
  image_entity:
    - camera.doorbell_sub
  model: gemma3:12b

I had this half set up and now want to use the timeline card. However, I have installed and uninstalled it today, and all I get is a spinning circle in the box where the card should appear when adding it to the dashboard. I don’t see anything in the error logs about it.

Hi all, I’m trying to get this working with a Google Nest cam, but it looks like the images LLM Vision gets are all just black. How can I overcome this?

thanks for your help!

How do I translate the title into another language? I changed DEFAULT_TITLE_PROMPT in custom_components/llmvision/const.py, but the title remains in English.

Ollama + Gemma 3:4b

You can change the title prompt (as well as the system prompt) in the Memory provider settings (you may leave the actual memories empty).

The const you changed is just the default value; it doesn’t change the prompt for your integration.

See this page for additional information: Memory | LLM Vision | Getting Started

@PhiSig This was an issue in Ollama. It should be fixed in the latest version; just upgrade Ollama.

@Vital555 Yes this is planned. Might take a while to implement though.

@pejotigrek The paths in Docker containers are different. v1.4.2 should address this by using a built-in method to retrieve the config path (instead of hard-coded paths). There is a beta out now if you want to test it: Release Performance Improvements · valentinfrlch/ha-llmvision · GitHub
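
For context, the mismatch in Docker setups usually comes from the volume mapping: LLM Vision sees the path inside the container, not on the host. A typical compose sketch (the host path is illustrative):

services:
  homeassistant:
    image: ghcr.io/home-assistant/home-assistant:stable
    volumes:
      # Host directory on the left, path seen inside the container on the right.
      # /config/llmvision/memory/... in the Memory settings refers to the container side.
      - /srv/homeassistant/config:/config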
