Hi everyone,
Following my previous questions about dataset generation, I am now tackling the hardware configuration phase. I want to squeeze every bit of performance out of my setup to ensure the lowest possible latency and the cleanest audio path for microWakeWord.
My Hardware:
- MCU: Waveshare ESP32-S3-Board (v3)
- Mic/DSP: Seeed Studio ReSpeaker Lite (XMOS based)
- Software: ESPHome (Latest 2026 version)
I am looking for community wisdom on achieving maximum efficiency. I’ve seen many generic configs, but I want to avoid common pitfalls like audio buffer under-runs or WiFi jitter.
My Questions for the Power Users:
- **I2S & DMA Buffering:**
  - What are the optimal `i2s_audio` settings for the ReSpeaker Lite?
  - Is the standard `dma_buf_count: 8` and `dma_buf_len: 256` sufficient, or should I increase these to prevent audio crackling (at the cost of latency)?
  - Are there specific `i2s_config` flags recommended for the Waveshare v3 to ensure stable clocking?
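For context, here is the draft `i2s_audio` / `microphone` block I'm starting from. The pin numbers are placeholders (I haven't verified them against the Waveshare schematic), and the DMA values are exactly the defaults I'm asking about:

```yaml
# Draft only -- GPIO assignments are placeholders for my wiring,
# NOT verified for the Waveshare S3 + ReSpeaker Lite combo.
i2s_audio:
  - id: i2s_bus_in
    i2s_lrclk_pin: GPIO5   # WS/LRCLK (placeholder)
    i2s_bclk_pin: GPIO6    # BCLK (placeholder)

microphone:
  - platform: i2s_audio
    id: respeaker_mic
    i2s_audio_id: i2s_bus_in
    i2s_din_pin: GPIO4     # data line from the XMOS (placeholder)
    adc_type: external
    pdm: false             # ReSpeaker Lite outputs standard I2S, not PDM
    sample_rate: 16000     # microWakeWord expects 16 kHz
    bits_per_sample: 32bit
    channel: left
```

Happy to be told any of these options are wrong or outdated for current ESPHome.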
- **XMOS Integration:**
  - For those using the ReSpeaker Lite: are you offloading AEC (Acoustic Echo Cancellation) and NS (Noise Suppression) entirely to the XMOS chip?
  - If yes: does this introduce any startup delay in the audio pipeline when the ESP32 wakes up?
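My current understanding (please correct me if this is wrong) is that AEC/NS live entirely in the XMOS firmware, so nothing about them appears in YAML at all; the ESP32 side just consumes the already-cleaned stream and runs the wake word engine on it. Something like:

```yaml
# Assumption: AEC/NS happen upstream on the XMOS, so the only
# ESP32-side audio processing in YAML is the wake word engine.
# 'okay_nabu' is the stock model; I'll swap in my custom one later.
micro_wake_word:
  models:
    - model: okay_nabu
  microphone: respeaker_mic   # the I2S mic id from my config above
```

If anyone is doing AEC on the ESP32 instead, I'd love to hear why.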
- **Power vs. Latency:**
  - To guarantee instant wake-word reaction, do you strictly force `power_save_mode: none` on the WiFi component?
  - Has anyone managed to balance decent power consumption with sub-200ms response times, or is "High Performance Mode" mandatory for voice satellites?
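This is the WiFi block I'm planning, going all-in on latency. The `fast_connect` line is an assumption on my part (to shave reconnect time after any dropout), not something I've benchmarked:

```yaml
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  power_save_mode: none   # trading power draw for latency -- is this mandatory?
  fast_connect: true      # assumption: skips the scan on reconnect
```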
- **The "Glitchless" LED:**
  - I plan to use the onboard RGB LED (or a ring) for feedback. In the past, I've seen heavy LED effects block the main loop and cause audio stutter. Are there current best practices (e.g., specific light effects or partition allocations) to ensure visual feedback doesn't degrade the listening stream?
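For reference, my current LED draft uses the RMT peripheral so the pixel timing is handled in hardware rather than bit-banged in the loop, plus a slow pulse effect to keep per-tick work minimal. The pin is a guess; I still need to check the Waveshare schematic, and I'm not sure whether current ESPHome wants extra RMT options here:

```yaml
# Placeholder: onboard RGB LED on the Waveshare S3 -- pin is a guess.
light:
  - platform: esp32_rmt_led_strip
    id: status_led
    pin: GPIO38            # placeholder, verify against the board schematic
    num_leds: 1
    chipset: ws2812
    rgb_order: GRB
    default_transition_length: 0s   # avoid transition math on every loop tick
    effects:
      - pulse:
          name: "Listening"
          transition_length: 250ms
          update_interval: 250ms
```

Is offloading to RMT like this enough, or do people also pin effects to the second core?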
The “Holy Grail” Request:
If anyone has a production-stable YAML snippet for the Waveshare S3 + ReSpeaker combo that they are proud of, could you share the `i2s_audio` and `microphone` blocks?
Thanks in advance!