I did it! I tried for days to use a local LLM, but my PC just wasn’t up to the task (cheap server). It can run Whisper and Piper though. So I was forced to go with Google Generative AI, and I don’t regret it. I wanted a smart voice assistant that works in Finnish. Now I’ve got it, and it’s smarter than Alexa, even though it can’t do all the same things, or at least some of them are difficult to set up.
I would recommend this route for a voice assistant in Home Assistant.
It’d be a long manual, so what do you need to achieve? Do you have faster-whisper and Piper yet? I got those running in Docker on my Linux Mint server using the guide in the rhasspy/wyoming-faster-whisper repository on GitHub (the Wyoming protocol server for the faster-whisper speech-to-text system); Piper is in the same list of repositories. You’ll need to edit the commands a little if you want Finnish: basically replace English with Finnish (language code fi), and for the voice use Harri.
After I got those running, I linked the server to Home Assistant through the Wyoming integration. For Google Generative AI you need an API key from Google AI Studio. Just log in with your Google account, generate an API key, and set up the Google Generative AI integration with it. Then add “Answer in Finnish” to the prompt.
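If you want to test the API key and the “Answer in Finnish” instruction outside Home Assistant first, here is a rough sketch that calls the Gemini REST endpoint directly. This is not what the integration does internally; the function names build_request and ask are made up for this example, and the model ID gemini-2.5-flash is just my assumption for a default.

```python
import json
import urllib.request

# Public REST root for the Gemini API (v1beta).
API_ROOT = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(question, model="gemini-2.5-flash",
                  system_prompt="Answer in Finnish."):
    """Return the URL and JSON payload for one generateContent call.

    The function name and the default values are assumptions made for
    this sketch, not anything Home Assistant itself uses.
    """
    url = f"{API_ROOT}/{model}:generateContent"
    payload = {
        # The system instruction plays the role of the prompt field in HA.
        "system_instruction": {"parts": [{"text": system_prompt}]},
        "contents": [{"parts": [{"text": question}]}],
    }
    return url, payload


def ask(question, api_key, **kwargs):
    """Send the question to the API and return the first text reply."""
    url, payload = build_request(question, **kwargs)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # key generated in Google AI Studio
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        reply = json.load(resp)
    return reply["candidates"][0]["content"]["parts"][0]["text"]
```

Calling ask("What is the capital of Finland?", api_key="YOUR_KEY") should come back in Finnish if the key and the instruction are working.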
I run HA OS on a Rock Pi 4B+, and I have a separate PC NAS server where I run Whisper and Piper.
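With the speech services on a different machine than HA, it can save some head-scratching to confirm the ports are reachable before adding them in the Wyoming integration. A minimal sketch, assuming the ports from the rhasspy example commands (10300 for faster-whisper, 10200 for Piper); the helper names are hypothetical, and a successful TCP connect only shows that something is listening, not that it speaks the Wyoming protocol.

```python
import socket

# Ports used in the rhasspy example commands; change these if you
# mapped the containers to different ports (assumption for this sketch).
SERVICES = {"faster-whisper": 10300, "piper": 10200}


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


def check_services(host, services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: is_reachable(host, port) for name, port in services.items()}
```

For example, check_services("192.168.1.50") (a placeholder address for the NAS) returns a dict with True or False for each service.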
This is what I do (minus the voice hardware) and I’m pretty happy with it.
The only thing missing imo is thinking/reasoning controls. I’ve found 2.5 Flash-Lite to be surprisingly capable for the price (not to mention insanely fast), but it can get confused on occasion, and maybe being able to “think” would help with that.
I used to use the free tier but decided to switch over to paid because, as with all things free from for-profit corporations, you’re ultimately the product. Free-tier API requests are logged and used for training, while paid-tier requests are not.
BTW, if you’re sticking with the free tier, I’d highly recommend switching over to 2.5 Flash. It’ll be the default in the upcoming HA 2025.7 anyway and is much better than 2.0 Flash.