I followed the guide to set up the local Voice Assistant, but I do not get voice feedback, only text. I run the local assistant without HTTPS, so I have to type requests in chat. I would still expect audio feedback. Is that not possible with the local assistant or without HTTPS? Or what am I missing in my configuration?
TTS is working fine, as I hear the voice if I run the test…
tmjpugh (Tmjpugh), May 17, 2024, 3:54am
A lot of people are reporting “no voice feedback” with the latest version of HA:
opened 10:09AM - 15 May 24 UTC
### The problem
I am using the ESP32-S3-BOX (non 3) firmware from esphome/firmw… are. However, after updating to esphome 2024.5 I get no voice return. The text on the display does come up correctly and opening the audio link in a browser plays the audio as normal.
### Which version of ESPHome has the issue?
2024.5.0
### What type of installation are you using?
Home Assistant Add-on
### Which version of Home Assistant has the issue?
2024.5
### What platform are you using?
ESP32
### Board
_No response_
### Component causing the issue
_No response_
### Example YAML snippet
_No response_
### Anything in the logs that might be useful for us?
```txt
[11:05:18][D][voice_assistant:591]: Speech recognised as: "Tell me a joke."
[11:05:18][D][text_sensor:064]: 'text_request': Sending state 'Tell me a joke.'
[11:05:18][W][component:237]: Component voice_assistant took a long time for an operation (240 ms).
[11:05:18][W][component:238]: Components should block for at most 30 ms.
[11:05:18][D][voice_assistant:563]: Event Type: 5
[11:05:18][D][voice_assistant:596]: Intent started
[11:05:19][D][voice_assistant:563]: Event Type: 6
[11:05:19][D][voice_assistant:563]: Event Type: 7
[11:05:19][D][voice_assistant:619]: Response: "I'm here to assist with your smart home. How can I help you today?"
[11:05:19][D][text_sensor:064]: 'text_response': Sending state 'I'm here to assist with your smart home. How can I help you today?'
[11:05:19][D][voice_assistant:563]: Event Type: 98
[11:05:19][D][voice_assistant:704]: TTS stream start
[11:05:19][D][esp-idf:000][speaker_task]: I (258604) I2S: DMA Malloc info, datalen=blocksize=2048, dma_buf_count=8
[11:05:19][D][esp-idf:000][speaker_task]: I (258612) I2S: I2S0, MCLK output by GPIO2
[11:05:19][D][esp-idf:000][speaker_task]: I (258618) ESP32_S3_BOX: I2S0, MCLK output by GPIO0
[11:05:19][D][esp-idf:000][speaker_task]: I (258622) AUDIO_PIPELINE: link el->rb, el:0x3d85c2c8, tag:raw, rb:0x3d85c438
[11:05:19][D][esp-idf:000][speaker_task]: I (258629) AUDIO_ELEMENT: [raw-0x3d85c2c8] Element task created
[11:05:19][D][esp-idf:000][speaker_task]: I (258635) AUDIO_ELEMENT: [i2s-0x3d85c024] Element task created
[11:05:19][D][esp-idf:000][speaker_task]: I (258640) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8064151 Bytes, Inter:63740 Bytes, Dram:63740 Bytes
[11:05:19][D][esp-idf:000][i2s]: I (258646) AUDIO_ELEMENT: [i2s] AEL_MSG_CMD_RESUME,state:1
[11:05:19][D][esp-idf:000][i2s]: I (258648) I2S_STREAM: AUDIO_STREAM_WRITER
[11:05:19][D][esp-idf:000][speaker_task]: I (258652) AUDIO_PIPELINE: Pipeline started
[11:05:20][W][component:237]: Component voice_assistant took a long time for an operation (280 ms).
[11:05:20][W][component:238]: Components should block for at most 30 ms.
[11:05:20][D][voice_assistant:563]: Event Type: 8
[11:05:20][D][voice_assistant:639]: Response URL: "http://192.168.1.102:8123/api/tts_proxy/8e80ff9caa1ef21e0bcaaea38ac66211b3483bab_en-gb_2cdeae300d_tts.microsoft.wav"
[11:05:20][D][voice_assistant:439]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[11:05:20][D][voice_assistant:445]: Desired state set to STREAMING_RESPONSE
[11:05:20][D][voice_assistant:563]: Event Type: 2
[11:05:20][D][voice_assistant:653]: Assist Pipeline ended
[11:05:21][D][esp-idf:000][speaker_task]: W (260212) AUDIO_PIPELINE: There are no listener registered
[11:05:21][D][esp-idf:000][speaker_task]: I (260219) AUDIO_PIPELINE: audio_pipeline_unlinked
[11:05:21][D][esp-idf:000][speaker_task]: W (260226) AUDIO_ELEMENT: [i2s] Element has not create when AUDIO_ELEMENT_TERMINATE
[11:05:21][D][esp-idf:000][speaker_task]: I (260235) I2S: DMA queue destroyed
[11:05:21][D][esp-idf:000][speaker_task]: W (260243) AUDIO_ELEMENT: [filter] Element has not create when AUDIO_ELEMENT_TERMINATE
[11:05:21][D][esp-idf:000][speaker_task]: W (260251) AUDIO_ELEMENT: [raw] Element has not create when AUDIO_ELEMENT_TERMINATE
[11:05:21][D][esp-idf:000][speaker_task]: I (260291) I2S: DMA Malloc info, datalen=blocksize=2048, dma_buf_count=8
[11:05:21][D][esp-idf:000][speaker_task]: I (260299) I2S: I2S0, MCLK output by GPIO2
[11:05:21][D][esp-idf:000][speaker_task]: I (260309) ESP32_S3_BOX: I2S0, MCLK output by GPIO0
[11:05:21][D][esp-idf:000][speaker_task]: I (260317) AUDIO_PIPELINE: link el->rb, el:0x3d85c2c8, tag:raw, rb:0x3d85c438
[11:05:21][D][esp-idf:000][speaker_task]: I (260325) AUDIO_ELEMENT: [raw-0x3d85c2c8] Element task created
[11:05:21][D][esp-idf:000][speaker_task]: I (260333) AUDIO_ELEMENT: [i2s-0x3d85c024] Element task created
[11:05:21][D][esp-idf:000][speaker_task]: I (260338) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8064243 Bytes, Inter:63832 Bytes, Dram:63832 Bytes
[11:05:21][D][esp-idf:000][i2s]: I (260345) AUDIO_ELEMENT: [i2s] AEL_MSG_CMD_RESUME,state:1
[11:05:21][D][esp-idf:000][i2s]: I (260348) I2S_STREAM: AUDIO_STREAM_WRITER
[11:05:21][D][esp-idf:000][speaker_task]: I (260350) AUDIO_PIPELINE: Pipeline started
[11:05:25][D][voice_assistant:563]: Event Type: 99
[11:05:25][D][voice_assistant:712]: TTS stream end
[11:05:25][D][voice_assistant:310]: End of audio stream received
[11:05:25][D][voice_assistant:439]: State changed from STREAMING_RESPONSE to RESPONSE_FINISHED
[11:05:25][D][voice_assistant:445]: Desired state set to RESPONSE_FINISHED
[11:05:27][D][esp-idf:000][speaker_task]: W (266700) AUDIO_PIPELINE: There are no listener registered
[11:05:27][D][esp-idf:000][speaker_task]: I (266707) AUDIO_PIPELINE: audio_pipeline_unlinked
[11:05:27][D][esp-idf:000][speaker_task]: W (266716) AUDIO_ELEMENT: [i2s] Element has not create when AUDIO_ELEMENT_TERMINATE
[11:05:27][D][esp-idf:000][speaker_task]: I (266723) I2S: DMA queue destroyed
```
### Additional information
_No response_
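For anyone comparing their own device log against the one in this issue, here is a minimal sketch in Python that pulls out the voice_assistant event types, state transitions, and the TTS response URL. The function name `summarize_log` and the regular expressions are my own assumptions based only on the line formats visible in the log above; they are not part of ESPHome or Home Assistant.

```python
import re

# Patterns are assumptions derived from the log lines quoted in the issue.
EVENT_RE = re.compile(r"Event Type: (\d+)")
STATE_RE = re.compile(r"State changed from (\w+) to (\w+)")
URL_RE = re.compile(r'Response URL: "([^"]+)"')


def summarize_log(text):
    """Collect event types, state transitions, and the TTS response URL.

    Hypothetical helper for illustration; not an ESPHome API.
    """
    summary = {"events": [], "transitions": [], "response_url": None}
    for line in text.splitlines():
        if (m := EVENT_RE.search(line)):
            summary["events"].append(int(m.group(1)))
        elif (m := STATE_RE.search(line)):
            summary["transitions"].append((m.group(1), m.group(2)))
        elif (m := URL_RE.search(line)):
            summary["response_url"] = m.group(1)
    return summary


# A few lines copied from the log in the issue.
sample = """\
[11:05:19][D][voice_assistant:563]: Event Type: 98
[11:05:20][D][voice_assistant:563]: Event Type: 8
[11:05:20][D][voice_assistant:639]: Response URL: "http://192.168.1.102:8123/api/tts_proxy/8e80ff9caa1ef21e0bcaaea38ac66211b3483bab_en-gb_2cdeae300d_tts.microsoft.wav"
[11:05:20][D][voice_assistant:439]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[11:05:25][D][voice_assistant:563]: Event Type: 99
[11:05:25][D][voice_assistant:439]: State changed from STREAMING_RESPONSE to RESPONSE_FINISHED
"""

result = summarize_log(sample)
print(result["events"])
print(result["transitions"])
print(result["response_url"])
```

If a log shows a response URL and the transition to RESPONSE_FINISHED, as this one does, that suggests Home Assistant produced the audio and the fault is more likely on the device playback side. That matches the report that the audio link plays normally in a browser.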
OK, I did not think of that as a possibility. I downgraded to 2024.4 and it started working.
Thank you for the heads-up.
Did you ever solve this problem other than by going back to an earlier version of HA? I’m on 2024.12.4 and can’t debug the audio in Assist. It doesn’t hear me when I talk.