Performance improvements for ESPHome + Voice Assistant + i2s Audio

I’m posting this as a feature request to “sum up” the myriad of issues posted here and there about the aforementioned combination.
After perusing the codebase involved, I can think of several items that could use a few pairs of eyes… This list is likely non-exhaustive, but I only have two eyes myself… :smile:

Please don’t post issues here, to keep this thread about possible improvements.

ESP32-audioI2S library

https://github.com/esphome/ESP32-audioI2S

The library is about a year behind the original in terms of commits (hundreds since).
Of particular interest are a few optimizations in PSRAM usage, which would seem useful with newer boards.

i2s_audio / media_player component

https://esphome.io/components/media_player/i2s_audio

Unavailable with the esp-idf framework.
speaker is too “disconnected” from Home Assistant in my opinion, especially for a voice assistant.
The esp-idf limitation also makes esp-adf unusable with media_player; I didn’t dig into what esp-adf would bring to the table, though.

Other

The voice pipeline setup is a bit messy: start, start_continuous, stop and use_wake_word all seem a bit “disconnected” from one another. How about some higher-level functions that stop any active state and switch to the desired one (listen once without wake word / listen for wake word)? A use_wake_word switch could then call the proper functions automatically, at the right time, so there is no need to juggle init progress, on_client_connected and similar mechanisms to avoid errors on boot; it could also shut down the voice pipeline appropriately on its own when deactivated. Ideally, the use_wake_word config value could be either true/false or the ID of a template switch (to link its value directly).
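To make the idea a bit more concrete, here is a rough C++ sketch of what such a higher-level function could look like. Everything in it is hypothetical: PipelineMode, PipelineController, set_mode and the recorded call list are names made up for illustration, not part of ESPHome. The strings "start", "start_continuous" and "stop" only stand in for the existing low-level actions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical modes; these names are NOT part of ESPHome.
enum class PipelineMode { Idle, ListenOnce, WakeWord };

// Hypothetical wrapper. Instead of touching real hardware, it records
// which low-level action it would issue, in order.
class PipelineController {
 public:
  // Stop whatever is currently active, then start the requested mode.
  void set_mode(PipelineMode target) {
    if (target == mode_) return;  // already in that mode, nothing to do
    if (mode_ != PipelineMode::Idle) calls_.push_back("stop");
    if (target == PipelineMode::ListenOnce) calls_.push_back("start");
    if (target == PipelineMode::WakeWord) calls_.push_back("start_continuous");
    mode_ = target;
  }

  // What a use_wake_word switch could call when toggled.
  // Turning it off only stops the pipeline if wake word mode is active.
  void set_use_wake_word(bool enabled) {
    if (enabled) {
      set_mode(PipelineMode::WakeWord);
    } else if (mode_ == PipelineMode::WakeWord) {
      set_mode(PipelineMode::Idle);
    }
  }

  PipelineMode mode() const { return mode_; }
  const std::vector<std::string>& calls() const { return calls_; }

 private:
  PipelineMode mode_ = PipelineMode::Idle;
  std::vector<std::string> calls_;
};
```

The point of the sketch is that the caller only names the state it wants; the "stop first, then start the right thing" ordering lives in one place instead of being repeated in every YAML automation.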

The voice pipeline also throws a lot of errors when it doesn’t seem warranted (like “wake word not detected” on deactivation of use_wake_word). Likely linked to the aforementioned “disconnect” between processes.

Bonus: TTS queueing, although easy enough to implement with a couple of wait templates (but requiring a bit of yaml…), could really be handled better at a lower level.
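For what it’s worth, lower-level queueing could be as simple as a FIFO that hands out the next URL when playback ends. The sketch below is only an illustration under my own assumptions: TtsQueue, enqueue and on_playback_finished are hypothetical names, not existing ESPHome APIs, and an empty string is used to mean “nothing to play”.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical FIFO for pending TTS announcements (not an ESPHome class).
class TtsQueue {
 public:
  // Queue a URL. Returns the URL to play right away if nothing is
  // playing, or an empty string if it had to wait in line.
  std::string enqueue(const std::string& url) {
    if (!playing_) {
      playing_ = true;
      return url;
    }
    pending_.push_back(url);
    return "";
  }

  // Call when the current playback ends. Returns the next URL to play,
  // or an empty string if the queue is drained.
  std::string on_playback_finished() {
    if (pending_.empty()) {
      playing_ = false;
      return "";
    }
    std::string next = pending_.front();
    pending_.pop_front();
    return next;
  }

  bool is_playing() const { return playing_; }

 private:
  bool playing_ = false;
  std::deque<std::string> pending_;
};
```

This replaces the wait templates with a single “playback finished” hook, which is the part that would be easier to get right inside the component than in YAML.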

Some, if not all, of these appear to be ESPHome feature requests, and those specifically do not go here…
try here for those:
Issues · esphome/feature-requests · GitHub.