I know we can use OpenAI for the voice assistant, but is it possible to host our own private OpenAI-type server? It would be cool to keep everything local except for web lookups. I don't know, as I haven't used OpenAI much, but would hosting your own server allow for better context awareness? I'm guessing that in a few years, when we reach the Tony Stark/Jarvis level of AI assistant, the assistant will need its own instance, likely offered by big tech as a premium paid service (and likely harvesting even more user data).
This fellow seems to have done that:
Search YouTube for: Build a LOCAL ChatGPT Voice Assistant For Your Smart Home
Haven’t tried it so YMMV
I've also watched all of his videos, and I'm really interested to hear more about how he set up local AI and used it as a conversation agent in HA. He has promised to release a video detailing his local AI setup, but it hasn't been released yet…
I've also tried following other guides, trying to set up a local LLM in Docker. I can get a model to run and load using a GPU, so it's fairly fast at processing requests through its own UI, but I can't get it to work with HA yet. I've used this video so far, but can't get it to work…
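For anyone in the same boat, this is the sort of thing I mean by running a local LLM in Docker with GPU support (Ollama shown here as an example, which isn't necessarily what the video uses):

```
# Run Ollama with GPU access (needs the NVIDIA Container Toolkit installed)
docker run -d --gpus=all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama ollama/ollama

# Pull a model and test it inside the container
docker exec -it ollama ollama run llama2 "Who is Bruce Wayne?"
```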
Would you mind telling me how fast it is?
I didn't manage to download the model in the Home Assistant VM, so I installed the Ollama app in TrueNAS and connected to that. Works fine.
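If anyone wants to compare numbers, you can hit the Ollama server directly from another machine (11434 is Ollama's default port; swap in your own host IP and model name):

```
# List the models the server has pulled
curl http://192.168.1.50:11434/api/tags

# Time a one-off, non-streaming generation
time curl http://192.168.1.50:11434/api/generate \
  -d '{"model": "llama2", "prompt": "Hello", "stream": false}'
```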
But I expected more from an AMD 5600 machine… though the CPU is only 50% loaded.
Takes about 30 seconds.
Here is a short example video.
Sorry for the late reply, and I should probably clarify a bit more after all my testing…
I managed, after a lot of ball ache, to get acon96's Home-V3 model running in a Docker container, and within the container itself I would get an answer to queries in 5-6 seconds using my GTX 1650 GPU.
However, firstly, 5-6 seconds still isn't fast enough, and secondly (and most importantly), I couldn't get it to control HA devices. I would get responses in Assist to general queries (for example, "Can you tell me who Bruce Wayne is?"), but there is apparently a limit of 32 devices that the Home model can control, and I have 5-6 times that many at home. I followed this guide to get it all set up (just with a different model).
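If anyone wants to experiment with it anyway, something along these lines should serve a GGUF build of the model via llama.cpp's server image (the model file name and paths here are just illustrative; -ngl offloads layers to the GPU):

```
docker run -d --gpus=all \
  -v /path/to/models:/models \
  -p 8080:8080 \
  ghcr.io/ggerganov/llama.cpp:server-cuda \
  -m /models/Home-3B-v3.q4_k_m.gguf \
  --host 0.0.0.0 --port 8080 -ngl 99
```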
So for now, I'm running Whisper and Piper in Docker containers (commands below), plus a custom integration called Conversation Chain, which lets me use a local pipeline as well as a ChatGPT pipeline, meaning the local one gives me responses about my devices and ChatGPT gives me responses about everything else.
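In case it's useful, these are the kinds of commands that get them running (the rhasspy Wyoming images; the model and voice are examples, and HA then points at ports 10300/10200 through the Wyoming Protocol integration):

```
# faster-whisper speech-to-text over the Wyoming protocol
docker run -d -p 10300:10300 \
  -v /path/to/whisper-data:/data \
  rhasspy/wyoming-whisper \
  --model small-int8 --language en

# Piper text-to-speech over the Wyoming protocol
docker run -d -p 10200:10200 \
  -v /path/to/piper-data:/data \
  rhasspy/wyoming-piper \
  --voice en_US-lessac-medium
```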