Speech to Text conversion through microphone

vinith-reddy25 · January 12, 2022, 7:49am

The conversation integration allows us to converse with Home Assistant. We can either press the microphone button in the frontend (supported browsers only, not iOS) or call the conversation/process service with the transcribed text. I understand how to call the conversation/process service with the transcribed text. My question is how speech from the microphone is converted into text. Can someone explain how Home Assistant converts speech to text? What API or code is used for that?

WallyR · January 12, 2022, 10:07am

I think it uses Stanford University’s Genie (formerly known as Ada with Almond).

You might also look at Rhasspy.

Shishir_Chauhan · January 12, 2022, 10:49am

Can someone please clarify this?
I need to know exactly where this speech-to-text conversion happens in HA.

WallyR · January 12, 2022, 11:34am

STT was introduced in HA 0.102.
Here is the release post: 0.102: Official Android App, Almond, Scene editor - Home Assistant

vinith-reddy25 · January 12, 2022, 12:27pm

But it is not using the stt component. I’m sure of this because I added log statements to the stt component and nothing was logged. Maybe it is using a browser API for the conversion, but I’m not clear about that. Can someone clarify?

gcampax · January 13, 2022, 4:08pm

The conversation icon in the Home Assistant web UI uses the Web Speech API in the browser (Chrome only, backed by Google Cloud APIs).
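
For the other path mentioned at the top of the thread (calling the conversation/process service with text that has already been transcribed), a minimal Python sketch might look like the following. It assumes the standard Home Assistant REST pattern of POST /api/services/<domain>/<service> with a long-lived access token; the function name build_conversation_request, the host name, and the token value are hypothetical placeholders, not something taken from this thread.

```python
import json
import urllib.request


def build_conversation_request(base_url: str, token: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a call to the conversation/process service.

    Assumption: the service is reached through the generic REST route
    /api/services/conversation/process and accepts a JSON body with "text".
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/services/conversation/process",
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host and token; replace with your own instance details.
    req = build_conversation_request(
        "http://homeassistant.local:8123",
        "YOUR_LONG_LIVED_TOKEN",
        "turn on the kitchen light",
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

The request is only built here, not sent, so the sketch can be checked without a running instance. In this path the speech-to-text step happens wherever the transcript was produced (for example, the browser), not in Home Assistant itself.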