WTH would we not add RAG to LLM integrations?

A lot of the benefits of using RAG (Retrieval Augmented Generation) with LLM Voice Assistants have already been very well explained by John Karabudak: AI agents for the smart home - #24 by JohnTheNerd

Besides improving performance when using an LLM to control Home Assistant, RAG could also be used by Voice Assistants to make conversations more personalized. I would like the Assistant to know about my partner, my occupation, my pets, etc., and mention them when relevant (I’m using Ollama, so privacy is not a concern in this context).

Prompts are not the best vehicle for that kind of information, because agents will try to work as much of that information as possible into every answer (relevant or not), and will get confused if you tell them not to.

What’s more, the same RAG device (to use Home Assistant terminology) could be accessed by multiple agents, reducing redundancy. The opposite would also be true: you could give some agents access to specific RAG devices but not others.
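To make the difference from prompt-stuffing concrete, here is a minimal sketch of retrieval-based personalization: facts live outside the prompt, get scored against the user's question, and only the relevant ones are injected. The word-overlap scoring and all the facts below are toy stand-ins (a real setup would use embeddings, e.g. from an Ollama embedding model); every name here is illustrative.

```python
# Toy RAG-style personalization: store personal facts separately and inject
# only the ones relevant to the current question, instead of putting every
# fact into the system prompt.

FACTS = [
    "My partner's name is Alex.",
    "I work as a nurse on night shifts.",
    "We have a cat called Miso.",
]

# Tiny stopword list so scoring focuses on content words (toy substitute
# for real embedding similarity).
STOP = {"what", "is", "my", "the", "a", "an", "we", "i", "have", "do", "how"}

def words(text: str) -> set[str]:
    """Lowercase, strip trailing punctuation, drop stopwords."""
    return {w.strip(".,!?") for w in text.lower().split()} - STOP

def score(question: str, fact: str) -> float:
    """Toy relevance score: fraction of question content words found in the fact."""
    qw = words(question)
    return len(qw & words(fact)) / max(len(qw), 1)

def retrieve(question: str, top_k: int = 1, min_score: float = 0.1) -> list[str]:
    """Return up to top_k facts that clear the relevance bar (possibly none)."""
    ranked = sorted(FACTS, key=lambda fact: score(question, fact), reverse=True)
    return [fact for fact in ranked[:top_k] if score(question, fact) >= min_score]

def build_prompt(question: str) -> str:
    """Prepend only the retrieved facts, so irrelevant ones never reach the LLM."""
    context = retrieve(question)
    prefix = ("Relevant background:\n" + "\n".join(context) + "\n\n") if context else ""
    return prefix + question

print(build_prompt("What is my cat called?"))
```

The point of `min_score` is the behaviour the post asks for: a question about the router retrieves nothing, so the agent never sees (and never mentions) the partner or the cat.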

I fully agree - instead of creating a huge prompt, RAG could improve things, especially if we want to keep the whole conversation history.

The thing is, SQLite has only “beta” support for vector search.
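That said, for a small fact store you don't strictly need native vector search: plain SQLite can hold embeddings as BLOBs and a brute-force cosine scan in Python is fine at household scale. This is only a sketch; `embed()` is a toy deterministic stand-in for a real embedding model (e.g. one served by Ollama), and a larger setup would swap in something like sqlite-vec or pgvector.

```python
# Brute-force vector search over plain SQLite: embeddings stored as BLOBs,
# similarity computed in Python. Workable while native vector support matures.
import math
import sqlite3
import struct

DIM = 8

def embed(text: str) -> list[float]:
    """Toy 'embedding': normalised character-bucket counts. Replace with a real model."""
    vec = [0.0] * DIM
    for ch in text.lower():
        vec[ord(ch) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def to_blob(vec: list[float]) -> bytes:
    return struct.pack(f"{DIM}f", *vec)

def from_blob(blob: bytes) -> list[float]:
    return list(struct.unpack(f"{DIM}f", blob))

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))  # vectors are pre-normalised

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE facts (text TEXT, embedding BLOB)")
for fact in ["The cat is called Miso", "The router lives in the hallway"]:
    db.execute("INSERT INTO facts VALUES (?, ?)", (fact, to_blob(embed(fact))))

def search(query: str, top_k: int = 1) -> list[str]:
    """Rank all stored facts by cosine similarity to the query embedding."""
    qv = embed(query)
    rows = db.execute("SELECT text, embedding FROM facts").fetchall()
    rows.sort(key=lambda row: cosine(qv, from_blob(row[1])), reverse=True)
    return [text for text, _ in rows[:top_k]]

print(search("The cat is called Miso"))  # -> ['The cat is called Miso']
```

A full table scan per query sounds wasteful, but for the hundreds or low thousands of facts a smart home produces, it is effectively instant and avoids any extension or extra database.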

So I’m thinking about setting up a separate Postgres database (via Supabase) and then potentially building a custom AI API (mimicking the OpenAI API) that already knows about all the devices and the whole chat history of every user (via RAG).
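The core of that proxy idea is small: take an OpenAI-style `messages` list, retrieve context for the latest user turn, and splice it in as an extra system message before forwarding to the real backend (Ollama, OpenAI, whatever). Here is a sketch of just that message-shaping step; `fake_retrieve` is a placeholder for a real per-user lookup against Postgres/pgvector, and the forwarding call itself is omitted.

```python
# Message-shaping core of an OpenAI-compatible RAG proxy: inject retrieved
# context as a system message directly before the final user turn.

def augment_messages(messages: list[dict], retrieve) -> list[dict]:
    """Return a new message list with retrieved context inserted, or the
    original list unchanged when nothing relevant is found."""
    last_user = next(
        (m["content"] for m in reversed(messages) if m["role"] == "user"), ""
    )
    context = retrieve(last_user)
    if not context:
        return messages
    system_msg = {"role": "system", "content": "Relevant context:\n" + "\n".join(context)}
    return messages[:-1] + [system_msg, messages[-1]]

# Hypothetical retriever: in the real setup this would be a vector query
# against Postgres/Supabase scoped to the current user.
def fake_retrieve(query: str) -> list[str]:
    return ["light.kitchen is currently on"] if "light" in query.lower() else []

payload = [
    {"role": "system", "content": "You are a home assistant."},
    {"role": "user", "content": "Is the kitchen light on?"},
]
print(augment_messages(payload, fake_retrieve))
```

Because the proxy only rewrites `messages` and leaves the rest of the request body untouched, any client that speaks the OpenAI chat-completions format (including Home Assistant's OpenAI-compatible integrations) can point at it without changes.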

Or does anyone already have a working RAG setup?