How does the ESPHome voice assistant send mic audio to Home Assistant?

Hello All,

Can someone please guide me on how to send mic input from an external voice satellite based on the ESP32-S3 to Home Assistant’s Wyoming integration? I am thinking of maybe using a WebSocket client on the ESP32-S3 to stream mic audio to Home Assistant.

I am trying to create a custom voice satellite using an ESP32-S3 board, as follows:

  1. Capture voice from the mics connected to the ESP32-S3 and perform voice processing such as:
     - NS (Noise Suppression)
     - BSS (Blind Source Separation)
     - VAD (Voice Activity Detection)
     - AGC (Automatic Gain Control)

     using the Audio Front-end Framework (ESP32-S3) described in the latest ESP-SR documentation.

  2. Send the processed voice to HA, which uses the Wyoming protocol.

I have seen the Wyoming protocol description in the rhasspy/wyoming GitHub repository (“Peer-to-peer protocol for voice assistants”).
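
For reference, here is a minimal sketch of the event framing as I read that description: each event is one JSON header line ending in a newline, optionally followed by `payload_length` bytes of binary payload, and audio travels as `audio-start`, `audio-chunk` and `audio-stop` events carrying `rate`, `width` and `channels`. The function names and the `AudioFormat` struct are hypothetical, and I am assuming 16-bit little-endian PCM and that the event data may sit inline in the header line. This only builds the bytes; how they reach Home Assistant (socket setup, which side connects) is not covered here and would need checking against the Wyoming repository.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper type: the audio format fields used by Wyoming audio events.
struct AudioFormat {
    int rate;      // samples per second, e.g. 16000
    int width;     // bytes per sample, e.g. 2 for 16-bit PCM
    int channels;  // e.g. 1 for mono
};

// JSON object describing the audio format, reused by several events.
static std::string format_json(const AudioFormat& f) {
    return "{\"rate\": " + std::to_string(f.rate) +
           ", \"width\": " + std::to_string(f.width) +
           ", \"channels\": " + std::to_string(f.channels) + "}";
}

// "audio-start": a header line only, no payload.
std::string audio_start_event(const AudioFormat& f) {
    return "{\"type\": \"audio-start\", \"data\": " + format_json(f) + "}\n";
}

// "audio-chunk": a header line followed by payload_length bytes of raw PCM.
// Assumption: samples are serialized as 16-bit little-endian.
std::string audio_chunk_event(const AudioFormat& f,
                              const std::vector<std::int16_t>& samples) {
    std::string payload;
    payload.reserve(samples.size() * 2);
    for (std::int16_t s : samples) {
        const std::uint16_t u = static_cast<std::uint16_t>(s);
        payload.push_back(static_cast<char>(u & 0xFF));         // low byte
        payload.push_back(static_cast<char>((u >> 8) & 0xFF));  // high byte
    }
    std::string header =
        "{\"type\": \"audio-chunk\", \"data\": " + format_json(f) +
        ", \"payload_length\": " + std::to_string(payload.size()) + "}\n";
    return header + payload;
}

// "audio-stop": a header line only, marking the end of the stream.
std::string audio_stop_event() {
    return "{\"type\": \"audio-stop\"}\n";
}
```

On the ESP32-S3, the returned bytes would be written to whatever transport ends up being used, one `audio-chunk` per block of processed samples coming out of the audio front end.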

I am not looking for the ESPHome voice assistant or a Wyoming voice satellite running on a Raspberry Pi. Instead, I am developing custom code on a bespoke ESP32-S3 board with the Espressif Audio Front-end enabled for voice processing (using the ESP-IDF framework), and I am trying to understand how to send the processed audio to Home Assistant for wake word detection, STT, intent handling and TTS.


Bumping this in the hope of getting some responses.

I don’t know if it helps, but the ESPHome voice assistant code is here: https://github.com/esphome/esphome/tree/dev/esphome/components/voice_assistant


Same here. I am also looking for guides on implementing a voice assistant in bare C++ instead of ESPHome.