Custom Component: Gemini Cloud STT vs Google Cloud STT

After tinkering with Faster-Whisper for a while, I’ve come to the conclusion that local STT is crap. So, I’m sticking with Google until this dinosaur goes extinct.

Now, I’m building a shiny new Gemini Cloud STT to finally retire the ancient Faster-Whisper setup in Home Assistant.

I just started to use your integration and it works flawlessly and very fast, thank you!

Maybe you plan to make a Gemini TTS too?

Is support for other languages planned? Or is it impossible? Thanks for your work.

Actually, I was shocked to find that Gemini now supports transcription in different languages. When I developed the integration (a few days ago), it supported only English transcription. Now it basically works for all languages and even understands “sounds”. You can try the mic in the conversation; it should translate and transcribe your voice.

I will clean up the code later to make it work flawlessly.

Nice, then I can try it later! Thanks man!!

I was thinking that STT → INTENT → TTS consuming three LLM API calls is so… NOT ELEGANT. :rofl: But I think I will do that as a hands-on exercise before moving to a one-shot audio input to audio output assist pipeline.
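To make the difference concrete, here is a rough sketch of the two shapes. It is not the integration's code and it calls no real API: the function names, the `CallCounter` helper, and the stubbed "audio" (plain UTF-8 text standing in for speech) are all made up for illustration. It only counts how many remote calls each design would need.

```python
from dataclasses import dataclass, field


@dataclass
class CallCounter:
    """Records which (hypothetical) remote LLM calls a pipeline makes."""
    calls: list[str] = field(default_factory=list)

    def record(self, stage: str) -> None:
        self.calls.append(stage)


def three_stage_pipeline(audio: bytes, counter: CallCounter) -> bytes:
    """STT -> INTENT -> TTS: one remote call per stage (all stubbed)."""
    counter.record("stt")            # stub for a speech-to-text request
    text = audio.decode("utf-8")     # assumption: fake audio is just text

    counter.record("intent")         # stub for an intent/conversation request
    reply = f"OK: {text}"

    counter.record("tts")            # stub for a text-to-speech request
    return reply.encode("utf-8")


def one_shot_pipeline(audio: bytes, counter: CallCounter) -> bytes:
    """Audio in, audio out: a single (stubbed) remote call."""
    counter.record("audio_to_audio")
    return f"OK: {audio.decode('utf-8')}".encode("utf-8")


if __name__ == "__main__":
    fake_audio = b"turn on the light"
    for pipeline in (three_stage_pipeline, one_shot_pipeline):
        counter = CallCounter()
        pipeline(fake_audio, counter)
        print(pipeline.__name__, len(counter.calls), counter.calls)
```

Both functions return the same reply; the only difference the sketch shows is three round trips versus one, which is the part that feels inelegant.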

Yeah, the one-shot solution would be the absolute best. I didn’t know that was possible with HA. :slight_smile:

I also use this component with a language other than English, and it works perfectly. :+1:

This is amazing, especially the fact that it’s free! Thank you so much!