Hi Jakub, thanks for the feedback!
You’re right. The filter_length should only need to cover the actual acoustic echo tail (speaker to mic physical path), not compensate for software buffering
alignment.
The ring buffer reference mode uses a fixed pre-fill delay (currently hardcoded at 80ms) to align speaker samples with mic data. If that delay doesn’t match your
hardware’s actual DMA latency, the AEC filter has to search a wider window, which requires a higher filter_length (and more CPU).
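To make the pre-fill idea concrete, here is a rough Python sketch of a delay line that is pre-filled with silence so the reference lags the speaker output by a fixed amount. This is only an illustration, not the component's actual code: the 16 kHz sample rate and the names delay_samples and DelayedReference are assumptions made for the example.

```python
from collections import deque

SAMPLE_RATE_HZ = 16_000  # assumed for illustration; not stated in this thread


def delay_samples(delay_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a delay in milliseconds to a whole number of samples."""
    return delay_ms * sample_rate_hz // 1000


class DelayedReference:
    """Toy delay line (hypothetical name): pre-filled with silence so that
    each sample read out lags the sample written in by a fixed delay."""

    def __init__(self, delay_ms, sample_rate_hz=SAMPLE_RATE_HZ):
        # The pre-fill is what creates the fixed delay.
        self.buf = deque([0] * delay_samples(delay_ms, sample_rate_hz))

    def push_pop(self, speaker_sample):
        """Write one speaker sample, read back the delayed reference sample."""
        self.buf.append(speaker_sample)
        return self.buf.popleft()


if __name__ == "__main__":
    print(delay_samples(80))  # pre-fill size in samples at the assumed rate
    ref = DelayedReference(delay_ms=1)
    print([ref.push_pop(s) for s in range(1, 20)])
```

With a pre-fill of N samples, the first N reads return silence and speaker sample k comes back out N writes later. Whatever gap remains between that fixed delay and the real hardware latency is what the AEC filter itself has to cover.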
We’ve just added a configurable delay on dev: aec_reference_delay_ms so you can tune the pre-fill for your hardware. If you find the right value for your setup,
filter_length: 4 should work.
intercom_api:
  id: intercom
  mode: full
  microphone: mic_component
  speaker: spk_component
  dc_offset_removal: true
  aec_id: aec_processor
  aec_reference_delay_ms: 80  # Tune for your hardware (10-200ms). Try lower values with filter_length: 4.
  ringing_timeout: 30s
The default is 80ms, which works for most setups with separate I2S buses. Try lowering it (40-60ms) and see if filter_length: 4 gives you good cancellation. The right
value depends on your I2S DMA buffer configuration and codec latency.
We’re also exploring a single-bus duplex approach (i2s_audio_duplex) where mic and speaker share the same I2S bus (same BCLK/LRCLK, separate DIN/DOUT), giving a
sample-aligned reference without the ring buffer delay. It is currently tested with codec hardware (ES8311/ES7210), and we are working on a no-codec variant.
Can you share your hardware setup? (which mic, speaker, codec if any, single or dual I2S bus?) That would help us suggest the best configuration.