Voice Assistant using my own Speech-to-text cloud app

I would like to use Assist with my voice. However, third-party STT services (OpenAI, Google, Microsoft) are not an option for me due to privacy concerns.

I have already deployed a web STT service that accepts HTTP POST requests with audio (WAV format, PCM codec) and returns text, which works well. I have also created a custom HA component that should send the audio to my web STT service.

This custom component is exposed in configuration.yaml like so:
```yaml
stt2:
```
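For reference, the platform follows roughly this shape (a minimal sketch assuming Home Assistant's legacy stt.Provider API; the my_stt name and the endpoint URL are placeholders, not the real values):

```python
"""custom_components/my_stt/stt.py - hypothetical platform module."""
from __future__ import annotations

from collections.abc import AsyncIterable

from homeassistant.components.stt import (
    AudioBitRates,
    AudioChannels,
    AudioCodecs,
    AudioFormats,
    AudioSampleRates,
    Provider,
    SpeechMetadata,
    SpeechResult,
    SpeechResultState,
)
from homeassistant.helpers.aiohttp_client import async_get_clientsession

STT_URL = "https://example.workers.dev/stt"  # placeholder endpoint


async def async_get_engine(hass, config, discovery_info=None):
    """Hand the provider to the stt integration when the platform is set up."""
    return MySttProvider(hass)


class MySttProvider(Provider):
    """Forwards Assist's audio stream to an external HTTP STT service."""

    def __init__(self, hass):
        self.hass = hass

    @property
    def supported_languages(self) -> list[str]:
        return ["en-US"]

    @property
    def supported_formats(self) -> list[AudioFormats]:
        return [AudioFormats.WAV]

    @property
    def supported_codecs(self) -> list[AudioCodecs]:
        return [AudioCodecs.PCM]

    @property
    def supported_bit_rates(self) -> list[AudioBitRates]:
        return [AudioBitRates.BITRATE_16]

    @property
    def supported_sample_rates(self) -> list[AudioSampleRates]:
        return [AudioSampleRates.SAMPLERATE_16000]

    @property
    def supported_channels(self) -> list[AudioChannels]:
        return [AudioChannels.CHANNEL_MONO]

    async def async_process_audio_stream(
        self, metadata: SpeechMetadata, stream: AsyncIterable[bytes]
    ) -> SpeechResult:
        # Buffer the PCM chunks, then POST them in a single request.
        audio = b"".join([chunk async for chunk in stream])
        session = async_get_clientsession(self.hass)
        async with session.post(STT_URL, data=audio) as resp:
            if resp.status != 200:
                return SpeechResult(None, SpeechResultState.ERROR)
            text = await resp.text()
        return SpeechResult(text, SpeechResultState.SUCCESS)
```

As I understand it, the dropdown only offers engines that actually registered, so if async_get_engine never runs (or raises), nothing shows up there.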

However, after restarting the whole system I still don’t see my service in the Speech-to-text dropdown.

How do I list my custom component in that dropdown?
Or do you know of any custom integration/add-on that would let me configure my own URL for the STT service?
Thanks!

Wrap it in the Wyoming protocol; it's much easier. The Wyoming client is already in the system, so no additional integration is required.
Use any Wyoming project from GitHub as an example, or see the sketch below.
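To illustrate, a thin Wyoming server that buffers the incoming audio and forwards it to an HTTP STT endpoint could look roughly like this (a sketch assuming the wyoming Python package; the names, port, and URL are placeholders):

```python
"""Sketch of a Wyoming wrapper around an HTTP STT endpoint."""
import asyncio

import aiohttp

from wyoming.asr import Transcript
from wyoming.audio import AudioChunk, AudioStop
from wyoming.event import Event
from wyoming.info import AsrModel, AsrProgram, Attribution, Describe, Info
from wyoming.server import AsyncEventHandler, AsyncServer

STT_URL = "https://example.workers.dev/stt"  # placeholder endpoint

ATTRIBUTION = Attribution(name="example", url="https://example.com")
INFO = Info(
    asr=[
        AsrProgram(
            name="http-stt-proxy",
            description="Forwards audio to an HTTP STT service",
            attribution=ATTRIBUTION,
            installed=True,
            version="0.1.0",
            models=[
                AsrModel(
                    name="remote",
                    description="Remote HTTP model",
                    attribution=ATTRIBUTION,
                    installed=True,
                    version="0.1.0",
                    languages=["en"],
                )
            ],
        )
    ]
)


class HttpSttHandler(AsyncEventHandler):
    """One handler instance is created per TCP connection."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._audio = b""

    async def handle_event(self, event: Event) -> bool:
        if Describe.is_type(event.type):
            # Home Assistant asks for our capabilities before using us.
            await self.write_event(INFO.event())
            return True

        if AudioChunk.is_type(event.type):
            self._audio += AudioChunk.from_event(event).audio
            return True

        if AudioStop.is_type(event.type):
            # All audio received: forward it and reply with the transcript.
            # Note: Wyoming delivers raw PCM; if your endpoint insists on a
            # WAV container, prepend a RIFF header here first.
            async with aiohttp.ClientSession() as session:
                async with session.post(STT_URL, data=self._audio) as resp:
                    text = await resp.text()
            await self.write_event(Transcript(text=text).event())
            return False  # close the connection

        return True


async def main() -> None:
    server = AsyncServer.from_uri("tcp://0.0.0.0:10300")
    await server.run(HttpSttHandler)


if __name__ == "__main__":
    asyncio.run(main())
```

Once something like this is running, you add it under Settings → Devices & Services → Add Integration → Wyoming Protocol with that host and port, and its STT engine appears in the dropdown automatically.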


What do I wrap in the Wyoming protocol, my web STT service? It is a JavaScript app running in a serverless Cloudflare Worker, so I'm not sure whether wrapping it like that is possible, or easy.

And I believe everything else is in place for STT to work end-to-end; I just need to be able to select my component in the dropdown. That would be the easiest approach, but it's not clear to me how to make a custom component show up there. Any help on that would be much appreciated!

If your stt entity is registered in the system, it will be displayed. Look at the logs and debug the code.
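For example, you can turn on debug logging for the stt integration and your component in configuration.yaml (custom_components.my_stt is a placeholder for your component's domain):

```yaml
logger:
  default: warning
  logs:
    homeassistant.components.stt: debug
    custom_components.my_stt: debug
```

If the platform fails to load, the reason will be in the log.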
What integration did you base your solution on?
Look at this project

> What do I wrap in the Wyoming protocol

Your custom integration.
