It’s been great using Voice Assist with the official Ollama integration. I have everything running locally and a used RTX 3060 + llama3.1:8b works well for me.
I’ve been doing a lot of reading to try to find a way to utilize RAG (Retrieval-Augmented Generation) on the back end, and I thought I found it: instead of using the Ollama integration, I could use Home-LLM (HACS) and connect it to OpenWebUI. It appears OpenWebUI exposes its LLMs via the Ollama protocol if you include /ollama at the end of the URI.
My idea: I’d have some basic RAG setup in OpenWebUI that would wrap around Ollama/llama. Part of this works. The Home-LLM integration connects to OpenWebUI and can access “real” Ollama models, but it refuses to use a “workspace model.”
I made a workspace and have some text documents for augmentation. When I point Home-LLM at my workspace model “house” I get this error on first query:
Sorry, there was a problem talking to the backend: HomeAssistantError('Failed to communicate with the API! {"detail":"Model 'house:latest' was not found"} (status code: 400)')
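For anyone debugging the same thing: an Ollama-style endpoint advertises its models at /api/tags, so you can check whether the workspace model ever shows up on OpenWebUI's /ollama proxy before pointing Home-LLM at it. A minimal sketch in Python — the host/port is a placeholder for your setup, and the canned response below is just an illustration of the /api/tags shape:

```python
# To query a real server you would fetch, e.g. (host is a placeholder):
#   urllib.request.urlopen("http://openwebui.local:3000/ollama/api/tags")
# The JSON reply has the same shape as Ollama's /api/tags:
#   {"models": [{"name": "llama3.1:8b", ...}, ...]}

def list_model_names(tags_response: dict) -> list[str]:
    """Return the model names advertised by an Ollama-style /api/tags reply."""
    return [m["name"] for m in tags_response.get("models", [])]

# Canned example reply -- if "house:latest" is missing from this list,
# that would line up with the 400 "Model 'house:latest' was not found".
sample = {"models": [{"name": "llama3.1:8b"}, {"name": "mistral:7b"}]}
print(list_model_names(sample))
print("house:latest" in list_model_names(sample))
```

If the workspace model never appears in that list, it would suggest OpenWebUI only serves its underlying Ollama models over the /ollama proxy, not the workspace wrappers.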
I may try something with AnythingLLM.
My question: does anyone have this sort of thing working? If so, how did you configure it?
Thanks,
Scott