Configuration Issues: OpenAI TTS Speech Service and Kokoro

I’m trying to get the OpenAI TTS custom integration (sfortis/openai_tts) working with Kokoro but what I’m seeing doesn’t match the readme in the OpenAI TTS Custom Component for Home Assistant repo.

First, a bit of background regarding my setup: I have HAOS installed on a VMware VM running on a Windows 11 box (Intel Core Ultra 7 265 with 32GB RAM). Kokoro is running on the same box in a Docker container and I can access it from other computers on my network by browsing to http://192.168.11.100:8880/web/ and I can generate and play speech.

I added the OpenAI TTS integration from HACS using the instructions provided in the repo readme, specifically:

HACS installation ( preferred! )

Go to the sidebar HACS menu

Search for "OpenAI TTS" in the integrations

Click on the integration and download it. Restart Home Assistant to apply the component.

Next, I added the integration from Settings->Devices & services, clicked Add integration, searched for “tts”, and selected “OpenAI TTS” from the search results. Here’s where I’m running into a problem. According to the docs, the “Add OpenAI TTS Service” UI should have fields for:

API Key (optional for custom endpoints)
API Endpoint
model
voice

I’m only seeing API Key and API Endpoint; model and voice are missing.

Clicking “Submit” adds the integration, but there doesn’t appear to be a way to configure voice and model.

I think everything is up to date:

Home Assistant OS 17.1
Home Assistant Core: 2026.3.0
OpenAI TTS Custom Component for Home Assistant V3.7

Any idea what I might be doing wrong?

OK, I got this working. There are a couple of things missing from the repo’s readme and hopefully this helps someone else down the line.

First, make sure you use the full endpoint URL to the kokoro server. All of the search results I was seeing suggested the URL should be:
http://YOUR_KOKORO_SERVER_IP:8880/v1

The correct URL is:
http://YOUR_KOKORO_SERVER_IP:8880/v1/audio/speech
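You can sanity-check the full endpoint URL with curl before touching the integration. This is a sketch assuming Kokoro’s OpenAI-compatible API accepts the standard model/voice/input fields; the IP address and the voice name (af_heart) are examples from my setup — substitute your own:

```shell
# Request speech from Kokoro's OpenAI-compatible endpoint.
# The IP, port, and voice name are examples; adjust to your setup.
curl -s -X POST http://192.168.11.100:8880/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "voice": "af_heart", "input": "Hello from Kokoro"}' \
  -o test.mp3

# If it worked, test.mp3 should be a playable audio file rather than a JSON error.
file test.mp3
```

If you get audio back here but the integration still fails, the problem is on the Home Assistant side rather than Kokoro.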

Next, add a “TTS agent” to the OpenAI TTS integration:

On the Add TTS Agent UI, enter Kokoro (or whatever strikes your fancy) for the Profile name, select tts-1 from the Model dropdown, and enter the name of the Kokoro voice you want to use in the Voice field.
NOTE: the Add TTS Agent dialog only displays the OpenAI TTS voices, so you’ll need to enter the Kokoro voice manually.

NOTE2: You can see the list of voices by browsing to the Kokoro’s web interface URL: http://YOUR_KOKORO_SERVER_IP:8880/web/
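If you’d rather not open the web UI, the common kokoro-fastapi Docker image also appears to expose a voices listing endpoint — the path below is an assumption about that image’s API, so fall back to the web interface if it 404s:

```shell
# List available voices (endpoint path is an assumption based on kokoro-fastapi;
# check your image's docs if this returns an error).
curl -s http://192.168.11.100:8880/v1/audio/voices
```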

NOTE3: The Kokoro web interface only works on Chromium-based browsers. Firefox will display an “Error generating speech…” message if you try to generate speech from text.

You can leave the other fields at their default values. Click the Submit button to save your changes and restart Home Assistant.

Lastly, go to Settings->Voice assistants and click the voice assistant you want to use with Kokoro.

Scroll down to Text-to-speech and click on the Text-to-speech dropdown. You should see Kokoro as one of the available options.

You can simply use a ready-made Wyoming server for kokoro.

Furthermore, this component does not support streaming (despite claims), which can be a problem for long or slow responses from an LLM.


I’m probably (likely!) being dense here, but I’m confused. The Wyoming Protocol integration is already installed and includes the three default services (Piper, Speech-to-Phrase, Whisper). I assume I’m to add another Wyoming Protocol service using the Add service button?

When I do that, the UI asks for a host and a port. I’m entering http://192.168.11.100 and 8880 respectively, but I’m getting a Failed to connect error.

Wyoming is, first and foremost, a protocol.
In practical terms, this means that various speech recognition or speech synthesis engines are wrapped in it. You can find various projects on GitHub and run them on your own host. After that, they are connected via a Wyoming component (the thing you’re showing in the screenshot).
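One practical consequence: the Wyoming integration’s host and port fields expect a raw TCP service speaking the Wyoming protocol, not an HTTP URL. So entering http://192.168.11.100 with Kokoro’s 8880 (an HTTP port) will always fail — use the bare IP, and point the port at whatever your Wyoming wrapper listens on. A quick check that something is listening (10300 here is only an example; each Wyoming wrapper picks its own port):

```shell
# Use the bare IP (no http://) in the integration's host field.
# Verify a TCP listener is up on the Wyoming port (10300 is an example):
nc -zv 192.168.11.100 10300
```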


Thanks for pointing me in the right direction. I was able to set up a Docker container for roryeckel/wyoming_openai, get it working with Kokoro, and add the openai_streaming service to the Wyoming integration.
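For anyone following along, here is a rough sketch of how such a container might be started. The image tag, port, and environment variable names below are assumptions from memory, not verified — consult the roryeckel/wyoming_openai README for the actual configuration keys before using this:

```shell
# Sketch only: image tag, port, and variable names are assumptions --
# check the roryeckel/wyoming_openai README for the real configuration.
docker run -d --name wyoming-openai \
  -p 10200:10200 \
  -e TTS_OPENAI_URL=http://192.168.11.100:8880/v1 \
  ghcr.io/roryeckel/wyoming_openai:latest
```

Then add the service in the Wyoming integration using the Docker host’s IP and the published port.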

Everything appears to be working as it should and looking at the docker wyoming_openai logs it appears to be chunking/streaming the audio:

INFO:wyoming_openai.handler:Detected 3 ready sentences for immediate synthesis: ['This is a test of the emergenc...', 'I am currently generating this...', 'If you can hear the beginning ...']
INFO:wyoming_openai.handler:Starting concurrent synthesis for 3 sentences
INFO:httpx:HTTP Request: POST http://192.168.11.100:8880/v1/audio/speech "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://192.168.11.100:8880/v1/audio/speech "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://192.168.11.100:8880/v1/audio/speech "HTTP/1.1 200 OK"
INFO:wyoming_openai.handler:Processing final remaining text: 'We are no longer waiting for the full audio file to render before playback begins.'
INFO:wyoming_openai.handler:Starting concurrent synthesis for 1 sentences
INFO:httpx:HTTP Request: POST http://192.168.11.100:8880/v1/audio/speech "HTTP/1.1 200 OK"
INFO:wyoming_openai.handler:Successfully completed incremental streaming synthesis, final timestamp: 21910.42