LLM Vision: Let Home Assistant see!

You may need to dewarp frames after capture with something like FFmpeg. A frontier LLM may be able to do the dewarping.
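If you go the FFmpeg route, one way is to run the correction from Home Assistant as a shell command. A minimal sketch — the snapshot paths are hypothetical, and the `lenscorrection` coefficients (`k1`, `k2`) are placeholders you would tune for your particular lens:

```yaml
# configuration.yaml -- paths are examples; tune k1/k2 for your lens
shell_command:
  dewarp_snapshot: >-
    ffmpeg -y -i /config/www/llmvision/snapshot.jpg
    -vf "lenscorrection=k1=-0.227:k2=-0.022"
    /config/www/llmvision/snapshot_dewarped.jpg
```

You can then call `shell_command.dewarp_snapshot` from an automation between capturing the frame and analyzing it.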

How do you get the snapshots from Scrypted to the LLM Vision addon? I know you can fetch the snapshot via a webhook and the takePicture URL… does this just need to be added to a Home Assistant Generic Camera, or how do you feed LLM Vision the pictures from the Scrypted Snapshots plugin?
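One approach that should work: point a Generic Camera at the Scrypted snapshot URL, then hand that entity to LLM Vision. A sketch, with a placeholder `still_image_url` (substitute your actual takePicture endpoint) and an assumed provider ID:

```yaml
# configuration.yaml -- still_image_url is a placeholder for your endpoint
camera:
  - platform: generic
    name: scrypted_front_door
    still_image_url: "http://SCRYPTED_HOST:PORT/endpoint/takePicture"

# Then, in an automation or script:
action: llmvision.image_analyzer
data:
  provider: YOUR_PROVIDER_ID
  message: "Describe what is happening in this image."
  image_entity:
    - camera.scrypted_front_door
  max_tokens: 100
```

The entity and parameter names are assumptions to illustrate the wiring; check your LLM Vision version's action schema for the exact fields.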

I hope someone might be able to help me.

I am using Pushover to send notifications. I can see the description fine, but I would also like to attach the image that LLM Vision has saved to my folder/timeline. How can I do that?
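The Home Assistant Pushover notify service accepts an `attachment` under `data`, so you can point it at the file LLM Vision wrote. A minimal sketch — the path and filename are assumptions (check where your LLM Vision version actually saves images, e.g. under `/config/www/llmvision/`), and `response` is assumed to be the `response_variable` from a preceding analyzer call:

```yaml
# Notification step after an llmvision analyzer call
action: notify.pushover
data:
  title: "Motion detected"
  message: "{{ response.response_text }}"
  data:
    attachment: /config/www/llmvision/latest.jpg  # assumed save path
```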

I was missing the AI blueprint, it works automatically. What kind of experiences are people having with the default prompt? I’m getting various results by analyzing 10 second clip and then 3 frames with most activity. For example when my child walks out of the door usually the clip consists of a snapshot of nothing and the explanation might be that there’s snow drifting on the porch :smiley:

Having issues using my Reolink cameras as an entity. The entities themselves work fine and are well integrated everywhere in Home Assistant; they're live and have RTSP enabled in the advanced settings.

I should note that the Reolink cameras are connected to a Reolink NVR.

I can only use LLM Vision if I use the image analyzer with a snapshot that a camera has taken. But I'd like to give LLM Vision the Reolink camera entity directly.

The error I'm getting (when using the stream analyzer with a Reolink camera entity) is:
Failed to perform the action llmvision.stream_analyzer. No cameras available - all cameras offline or unavailable
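Until the entity works with the stream analyzer, one workaround is to take the snapshot yourself and hand the file to the image analyzer. A sketch — the entity ID, provider ID, and paths are assumptions:

```yaml
# Script/automation sequence -- entity_id, provider and paths are examples
- action: camera.snapshot
  target:
    entity_id: camera.reolink_driveway
  data:
    filename: /config/www/llmvision/reolink.jpg
- action: llmvision.image_analyzer
  data:
    provider: YOUR_PROVIDER_ID
    message: "What do you see?"
    image_file: /config/www/llmvision/reolink.jpg
```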

Hello,

I have been using LLM Vision with Home Assistant for quite some time. I’m currently experiencing an issue with notifications in combination with my Apple Watch.

Initially, I was able to see snapshot images on my Apple Watch without any problems when receiving motion notifications. After a major update (where the storage path in Home Assistant had to be adjusted), I followed the instructions and changed the storage location accordingly. Since then, snapshot images are no longer displayed on my Apple Watch when I am outside my local Wi-Fi network.

Recently, there was another update that required reverting the storage location change. I have done that as well, but unfortunately the issue still persists. My Apple Watch cannot display snapshots when I am outside my Wi-Fi network.

I have already tried adjusting several settings, but I haven’t been able to resolve the issue. Interestingly, my iPhone always displays the snapshots correctly outside the Wi-Fi network — I never had any issues there.

Does anyone have a suggestion or idea what could be causing this?

Best regards
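One thing worth checking, since it only fails off Wi-Fi: whether the notification's image reference resolves externally. If the snapshot lives under `/config/www/`, referencing it as a `/local/...` path lets the companion app fetch it through your external URL or Nabu Casa. A hedged sketch — service name and path are assumptions for your setup:

```yaml
action: notify.mobile_app_iphone
data:
  message: "Motion detected"
  data:
    # /local/ maps to /config/www/ and is reachable via the external URL
    image: /local/llmvision/latest.jpg
```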

Did you ever figure out which model works best in LocalAI for this? I’m still trying different models.

The Qwen vision models are probably considered the best: Qwen3 VL or the newer Qwen 3.5. The smallest Qwen 3.5 model just came out today.


Perfect, thanks!
I’ll see if I can use Qwen3.5-9B for vision and various chat functions at the same time.

You are welcome.

The new Qwen 3.5 models should be better for general use than the Qwen VL models that many people have been using. Qwen 3.5 is not a vision improvement over 3 VL, but it is claimed to be improved generally.
One issue with Qwen 3.5 is that there are no instruct variants at this point. For vision and Assist use in Home Assistant, thinking has no benefit in the use cases I'm familiar with. In LM Studio I removed the thinking block with AI help.

I find that these new Qwen models may exhibit more personality in Voice, even at lower temperatures, although I haven't tested 9B with voice.

If you get frustrated with Qwen 3.5, I suggest you try Qwen3-VL-8B-Instruct and see if you get cleaner replies. I'm sure that over the next few months people will figure out how to make Qwen 3.5 more instruct-like. Perhaps instruct versions will even be released.

Actually I’m using the 4B for instruct and it’s working quite well.
The bit I can’t get working is vision.
I’m using LocalAI for my setup.

Anyone else concerned about the WebSocket connection for the LLM Vision timeline card? I have a very standard HA install: HAOS on a NUC. I see log errors and performance hits on that connection. I'm going to make a card using the API.

No reason a card can't be made where the server pushes a refresh/update when a new timeline event is recorded.

A WebSocket between the camera and the LLM is interesting, but that's a high-performance system that would ideally be designed so it doesn't affect HA directly.