I have set up the Ollama integration to use an LLM as an assistant.
Everything seems to work fine, except when I actually want to do something, let's say turn the TV on/off. And yes, I have the “Assist” option enabled.
I use two LLMs: llama3.2 and mistral.
llama3.2 tells me it cannot connect to a server when things get specific, for example:
Sorry, I had a problem talking to the Ollama server: POST predict: Post “http://127.0.0.1:37149/completion”: EOF
mistral:
To turn on the lamp naast de bank, you can use the following command in Home Assistant:
HassTurnOn(area="Woonkamer", name="Lamp naast de bank")
The action is not executed. What am I missing here?
I asked the question again, and now I get this response:
It seems like you have asked me to format an answer based on a tool call response. However, I don’t see the tool call response provided. Could you please provide the output of the tool call so that I can assist you in formatting an answer to the original user question?
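As far as I can tell, Home Assistant only executes a tool call when it comes back from Ollama's /api/chat as structured data in message.tool_calls; if the model just writes HassTurnOn(...) as text in message.content, like above, nothing happens. A quick way to see which of the two a model actually does is to call Ollama directly, outside Home Assistant. This is only a rough sketch, assuming the default port 11434 and a made-up turn_on tool rather than the real tools Home Assistant registers:

```python
import json
import requests  # assumes the requests package is available

# Made-up example tool; Home Assistant registers its own intent tools (HassTurnOn etc.).
tools = [{
    "type": "function",
    "function": {
        "name": "turn_on",
        "description": "Turn on a device in a given area",
        "parameters": {
            "type": "object",
            "properties": {
                "area": {"type": "string"},
                "name": {"type": "string"},
            },
            "required": ["name"],
        },
    },
}]

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "mistral",  # or llama3.2, llama3.1:8b, ...
        "messages": [{"role": "user",
                      "content": "Turn on the Lamp naast de bank in the Woonkamer"}],
        "tools": tools,
        "stream": False,
    },
    timeout=120,
)
msg = resp.json()["message"]

# A real tool call shows up in tool_calls; text like "HassTurnOn(...)" only ends up in content.
print("tool_calls:", json.dumps(msg.get("tool_calls"), indent=2))
print("content   :", msg.get("content"))
```

If tool_calls stays empty and the call only appears as plain text in content, the model (or its chat template) isn't really doing tool calling, which would match the mistral replies above.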
I am now using llama3.1:8b, which seems to work a little better. I can now turn devices on/off, but not the ones I asked for.
I have now downgraded to Ollama version 0.51; this seems to work better as well. But every now and then I get the "Sorry, could not talk to Ollama server" message.
OK, I found this message in the Ollama logs: "gpu VRAM usage didn't recover within timeout" seconds=5.250980419
So I think this is a “performance issue”. The GPU is still producing an answer, but Home Assistant has already timed out, I think. Is there a way to change this timeout?
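One way to narrow that down is to time a request to Ollama directly with the same model, outside Home Assistant; if the answer itself takes much longer than Home Assistant waits, it isn't really a connection problem. A minimal sketch, again assuming the default port 11434; keep_alive is a standard Ollama request option that keeps the model loaded so you don't pay the load time on every request:

```python
import time
import requests  # assumes the requests package is available

start = time.monotonic()
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1:8b",
        "prompt": "Turn on the Lamp naast de bank",
        "stream": False,
        "keep_alive": "30m",  # keep the model loaded in VRAM for 30 minutes afterwards
    },
    timeout=300,
)
data = resp.json()
elapsed = time.monotonic() - start

# Durations reported by Ollama are in nanoseconds.
print(f"wall clock      : {elapsed:.1f}s")
print(f"model load time : {data.get('load_duration', 0) / 1e9:.1f}s")
print(f"total (Ollama)  : {data.get('total_duration', 0) / 1e9:.1f}s")
```

If most of the time turns out to be load_duration, the model is being unloaded between requests, and a longer keep_alive probably helps more than any timeout change would.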
Me too! I've tried different models (some with, some without tool support), and every time I just get JSON replies when I ask the AI to turn on a switch (that is exposed). Through the OpenAI conversation agent it works fine, though. I'm curious to learn how to get it working with Ollama as well.
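In case it helps with narrowing that down: whether you get a real tool call back or just JSON text seems to depend on the model's chat template actually handling tools. You can peek at what Ollama knows about a model via its /api/show endpoint; this is only a heuristic sketch, and the exact fields vary between Ollama versions:

```python
import requests  # assumes the requests package is available

resp = requests.post(
    "http://localhost:11434/api/show",
    json={"model": "llama3.1:8b"},  # older Ollama versions may expect "name" instead of "model"
    timeout=30,
)
info = resp.json()

# Newer Ollama builds report a capabilities list here; older ones may not.
print("capabilities:", info.get("capabilities"))

# Rough heuristic: tool-capable chat templates usually reference .Tools somewhere.
template = info.get("template", "")
print("template mentions tools:", ".Tools" in template)
```

If a model doesn't advertise tool support at all, that would at least explain why it falls back to describing the call as JSON instead of making one.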