Custom Integration: Ollama Conversation (Local AI Agent)

Since Ollama does not have an OpenAI-compatible API, I thought I would get ahead of the curve and create a custom integration :sweat_smile:

Simply spin up an Ollama Docker container, install Ollama Conversation and point it to your Ollama server.
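If you want to sanity-check the server before adding the integration, something like this quick sketch (assuming Ollama on localhost and the default port 11434) should list whatever models you have pulled:

# Quick sanity check that the Ollama server is reachable before adding the
# integration. Host and port below are the Docker defaults; adjust as needed.
import requests

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama on the same host, default port

resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10)
resp.raise_for_status()
models = [m["name"] for m in resp.json().get("models", [])]
print("Ollama is up, local models:", models or "none pulled yet")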

Enjoy your fully local AI assistant, with no cloud dependencies! :partying_face:

12 Likes

First off thank you so much for doing this.

Quick question: I want to use a Siri shortcut to launch the Home Assistant Assist feature. This doesn’t work with the custom Ollama agent, as it tries to use the default Assist even though I removed it. Any ideas?

Other than that, it works great within the web UI via text input.

1 Like

Thank you! I was already playing around with Ollama in my home lab and found your add-on from the Ollama repo. It was very easy to get running and works great.

The only downside is that conversation agents can’t (yet?) control anything.

2 Likes

Thanks @aceat64

Control of devices is possible with the right model and prompts; I will be adding functionality for this soon. It is a bit sketchy at the moment, but new features coming to Ollama will make it easier…
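To give an idea of the general approach (a rough sketch, not the integration’s actual code; the model name, URLs and token are placeholders): ask the model for a structured service call via the system prompt, then forward it to Home Assistant’s REST API.

# Rough illustration of prompt-based device control (not the integration's code):
# ask the model for a JSON service call, then forward it to Home Assistant.
# OLLAMA_URL, HA_URL and HA_TOKEN are placeholders for your own setup.
import json
import requests

OLLAMA_URL = "http://localhost:11434"
HA_URL = "http://homeassistant.local:8123"
HA_TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"

SYSTEM = (
    "You control a smart home. Reply ONLY with JSON like "
    '{"domain": "light", "service": "turn_off", "entity_id": "light.living_room"}.'
)

reply = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={"model": "llama2", "system": SYSTEM,
          "prompt": "Turn off the living room light", "stream": False},
    timeout=120,
).json()["response"]

call = json.loads(reply)  # small local models often need retries/validation here
requests.post(
    f"{HA_URL}/api/services/{call['domain']}/{call['service']}",
    headers={"Authorization": f"Bearer {HA_TOKEN}"},
    json={"entity_id": call["entity_id"]},
    timeout=10,
)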

2 Likes

Thanks @Ghzgod

Unfortunately I have no knowledge of iOS integrations with Home Assistant and how they work; I will try to look into it when I get some time.

1 Like

Hey

Got LocalAI running with a ChatGPT frontend. Works really well. I have a 7B and a 13B model running nicely, EXCEPT there is no HA support.

So, how well does this work in HA? For example, can I say “turn off the light and tell me a joke”?

Thanks for the development

1 Like

Thanks for your work @ej52! I was able to install Ollama using Docker and integrate it with Home Assistant. However, when I try to talk to it from Home Assistant, it does not work:
[image omitted]

Any ideas for what I might be doing wrong? In case it is relevant, I’m running Home Assistant and Ollama on a mini PC with an AMD Ryzen 9 6900HX (up to 4.9 GHz), Radeon 680M graphics, 8C/16T, and 32GB DDR5. This is the docker-compose file that I use to run Ollama:

version: "3.9"
services:
    ollama:
        devices:
          - "/dev/dri/renderD128:/dev/dri/renderD128"
          - "/dev/kfd:/dev/kfd"
          - "/dev/dri/card0:/dev/dri/card0"
        volumes:
            - nfs-ollama:/root/.ollama
        ports:
            - 11434:11434
        container_name: ollama
        image: ollama/ollama
        environment:
          OLLAMA_GPU: amd-gpu

volumes:
  nfs-ollama:
    external: true 

Thanks for the help!

1 Like

Hello, I would like to run Ollama on a Synology with 6GB of RAM. Is there a somewhat smaller language model? 7B is way too much :confused:

It cannot control HA devices yet, but I am working on a new version that will add support soon.

2 Likes

I suspect your hardware is taking longer than the predefined 60-second timeout to run the model. I will push an update soon that will allow you to set your own timeout for the server response.
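In the meantime, one way to check is to time a request against the Ollama server directly and see whether generation alone already takes longer than 60 seconds. A quick sketch, assuming the default port; adjust the host and model to match your setup:

# Time a direct request to the Ollama server to see if generation alone
# already exceeds the integration's 60-second timeout.
import time
import requests

OLLAMA_URL = "http://localhost:11434"  # or the address HA uses to reach Ollama

start = time.monotonic()
resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={"model": "llama2", "prompt": "Say hello in one sentence.", "stream": False},
    timeout=300,  # generous client-side timeout so we can measure the real duration
)
resp.raise_for_status()
print(f"Response in {time.monotonic() - start:.1f}s: {resp.json()['response'][:80]}")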

Not that I know of, sorry. Unfortunately, running LLMs still requires some decent hardware to get human-like response times.

Thanks. I have a couple of questions that you might be able to answer.

  1. Does the model have a memory? For example, would it be possible to tell it something like “in the future, if someone says X, do Y instead of what you just did”, and could that have a permanent effect?

  2. I see that in the configuration you can provide a “system prompt”. I imagine that if the prompt is complicated, everything will take longer. Is that right? If so, is it possible to train a new model based on an existing one and a complicated prompt, to avoid that overhead?

  3. Similar to 2, is it possible to train a new model based on an existing one and additional information? For example, could I provide additional information to the new model in the form of PDF files?

Thanks for the great work. This is very interesting and fun to play with.

Hey @ej52, love this project! I was curious, is there any update / ETA on when support would be added for controlling Home Assistant?

Thanks!

This should be the default new AI assistant, because 1) Ollama is the only local AI that is user friendly, 2) it just works without any pain, 3) you don’t need to be a Linux expert to set it up, and 4) the usual models can be loaded into it. https://github.com/jmorganca/ollama/blob/2a2fa3c3298194f4f3790aade78df2f53d170d8e/docs/linux.md

HARDWARE: You need a machine with a GPU that has at least 8GB of VRAM, but if you want to buy something, buy a card with 16GB of VRAM, for example an NVIDIA T4 GPU. (Note that this GPU needs a small cooling fan, as it is a server GPU.)
The 13B model maxes out the T4, but only for the few seconds it takes to generate the answer.

1 Like

I have seen LocalAI models which can control Home Assistant, but I believe this relies on OpenAI functions support, and that’s something LocalAI wraps for Llama LLMs.

I think this is the missing piece for Ollama. LocalAI does not run as well as Ollama, so a combination of both of these seems to be the best solution.

Curious if you have had the chance to report your findings on this yet? I see LocalAI has some documentation on this, but it runs poorly compared to Ollama.

I’m having an issue where, whatever model I try, it tells me the model wasn’t found and to try pulling it first.
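For anyone else hitting this, the model apparently has to be downloaded on the Ollama server itself first (e.g. `ollama pull <model>` on the host). A rough sketch of doing the same through the API, with the model name just an example:

# Pull a model on the Ollama server via its API, then confirm it shows up.
# Roughly equivalent to running `ollama pull llama2` on the server itself.
import requests

OLLAMA_URL = "http://localhost:11434"
MODEL = "llama2"  # example; use the model name you configured in the integration

pull = requests.post(f"{OLLAMA_URL}/api/pull",
                     json={"name": MODEL, "stream": False}, timeout=3600)
print(pull.json())  # expect a success status once the download finishes

tags = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10).json()
print([m["name"] for m in tags.get("models", [])])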

Hi all!
Promising project!

I have Ollama in an LXC container and chat with it locally. However, any request from HA with this integration leads to 100% CPU on the LXC, which never goes down, and a timeout on the HA side.
Also, as I understand it, there are functions that we can use with Ollama, if the model supports them. Right?

It’s not as easy as “I will give it this data and train it on that data”. Model training has nothing to do with the context you’re using in conversation.
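To make the distinction concrete, here is a rough sketch against the Ollama API (host and model name are examples): the “memory” is just the context blob that one response returns and the next request passes back, and the system prompt is re-sent on every call; the model weights never change.

# "Memory" in Ollama is the context returned by one request and passed back
# with the next one; the system prompt is re-sent every call. The model
# weights are never modified. Host and model name below are examples.
import requests

OLLAMA_URL = "http://localhost:11434"
SYSTEM = "You are a terse home assistant."

def ask(prompt, context=None):
    body = {"model": "llama2", "system": SYSTEM, "prompt": prompt, "stream": False}
    if context:
        body["context"] = context  # feed the previous turn back in
    data = requests.post(f"{OLLAMA_URL}/api/generate", json=body, timeout=300).json()
    return data["response"], data.get("context")

answer, ctx = ask("My favourite colour is green. Remember that.")
answer, ctx = ask("What is my favourite colour?", ctx)  # only works because ctx is passed
print(answer)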

I’m having exactly the same problem. Did you find a solution for this?