Help Needed: XMOS XU316 AEC Not Working on Voice Preview Edition (VPE)

Hi everyone,

Disclaimer: I’m not an expert on XMOS or audio DSP, but I did try hard to run experiments and debug this issue, with help from Claude Code. If I’ve misunderstood something, please correct me!

I’m working on a custom firmware for the Home Assistant Voice Preview Edition (VPE) device, and I’ve hit a wall trying to get the XMOS acoustic echo cancellation (AEC) to work. I’d really appreciate any insights from those who have experience with this hardware.

Background

I’m building a voice assistant that uses LiveKit for real-time communication instead of the standard Home Assistant voice pipeline. My goal is to enable full-duplex conversation - the ability to speak while the assistant is talking (barge-in capability).

After studying the official ESPHome reference implementation carefully, I realized it doesn’t actually use the XMOS AEC. Instead, it uses temporal separation (half-duplex):

Approach                                    ESPHome Reference
Full-duplex with AEC                        No
Half-duplex (don’t listen while speaking)   Yes
Barge-in with new commands                  No
“Stop” interrupt during TTS                 Yes (special wake word)

The voice assistant state machine ensures STT is not running while TTS is playing:

Wake Word → Listen → Process → Speak → (done) → Wake Word
              ↑                           │
              └───── NOT simultaneous ────┘
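
For anyone who prefers code to diagrams, here is a minimal host-side C sketch of that half-duplex loop. The state names and the `va_next`, `va_stt_active` and `va_tts_active` helpers are made up for illustration and are not taken from ESPHome. The only property it shows is that STT and TTS are never active in the same state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states mirroring the diagram above (not ESPHome's names). */
typedef enum {
    VA_WAKE_WORD,   /* waiting for the wake word */
    VA_LISTEN,      /* streaming mic audio to STT */
    VA_PROCESS,     /* waiting for the response */
    VA_SPEAK,       /* playing TTS */
    VA_STATE_COUNT
} va_state_t;

/* Advance one step: Wake Word -> Listen -> Process -> Speak -> Wake Word. */
static va_state_t va_next(va_state_t s)
{
    assert((int)s >= 0 && (int)s < (int)VA_STATE_COUNT);
    return (va_state_t)(((int)s + 1) % (int)VA_STATE_COUNT);
}

/* Half-duplex rule: mic audio is forwarded to STT only while listening. */
static bool va_stt_active(va_state_t s)
{
    return s == VA_LISTEN;
}

/* TTS playback happens only in the speak state. */
static bool va_tts_active(va_state_t s)
{
    return s == VA_SPEAK;
}
```

Because listening and speaking are separate states, no echo cancellation is needed in this design; barge-in would require both predicates to be true at once.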

This works fine for their use case, but I need true AEC for full-duplex operation.

My Setup

Hardware

  • Voice Preview Edition device
  • XMOS XU316 chip (running voice processing firmware v1.3.1)
  • TI AIC3204 DAC (I2C address 0x18)
  • ESP32-S3 as controller

Audio Path

Input:  Physical Mics → XMOS (AEC→IC→NS→AGC) → I2S RX → ESP32
                          ↑
                          │ AEC Reference (?)
                          │
Output: ESP32 → I2S TX → XMOS → AIC3204 DAC → Amp → Speaker

My Implementation

I have a custom ESP-IDF firmware (not ESPHome) that:

  1. Initializes I2C bus for XMOS and DAC control
  2. Resets XMOS via GPIO4 (HIGH→LOW), waits 3 seconds for boot
  3. Configures XMOS pipeline stages via I2C:
    • Channel 0: AGC (stage 4)
    • Channel 1: NS (stage 3)
  4. Initializes AIC3204 DAC with standard configuration
  5. Sets up I2S:
    • TX (to speaker): 48kHz, 32-bit stereo, slave mode, GPIO 7/8/10
    • RX (from mic): 16kHz, 32-bit stereo, slave mode, GPIO 13/14/15
  6. Enables amplifier via GPIO38

This matches the ESPHome reference exactly - same firmware version, same pipeline stages, same I2S configuration, same GPIOs.

The Problem

AEC is not working. When I play audio through the speaker, the microphone picks it up and it passes through the entire XMOS pipeline without being canceled.

Experiments I’ve Tried

Test 1: Pipeline Stage Analysis

I created a test module that:

  • Reconfigures XMOS output channels to different pipeline stages
  • Plays a 3-second speech sample through the speaker
  • Captures stereo audio from XMOS and streams via UDP
  • Analyzes in Audacity

Result: With both channels set to AGC, both clearly contain the speaker audio. The echo is NOT being canceled.
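
One way to put a number on "clearly contains the speaker audio" instead of judging by ear is a normalized cross-correlation between the sample that was played and the captured channel. The sketch below is a host-side helper, not part of the firmware; the function names are hypothetical, and it assumes both buffers are mono, widened to `int32_t`, and already at the same sample rate (the playback here is 48kHz and the capture 16kHz, so one of them would need resampling first).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Squared normalized cross-correlation between the played reference `ref`
 * and the captured channel `cap`, with `cap` assumed to be delayed by
 * `lag` samples. The result is in [0, 1] and does not depend on gain.
 */
static double echo_corr_sq(const int32_t *ref, const int32_t *cap,
                           size_t n, size_t lag)
{
    assert(lag < n);
    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (size_t i = 0; i + lag < n; i++) {
        double x = (double)ref[i];
        double y = (double)cap[i + lag];
        sxy += x * y;
        sxx += x * x;
        syy += y * y;
    }
    if (sxx == 0.0 || syy == 0.0) {
        return 0.0;   /* silence: no evidence either way */
    }
    return (sxy * sxy) / (sxx * syy);
}

/* Scan lags 0..max_lag, return the best score and store its lag. */
static double echo_corr_peak(const int32_t *ref, const int32_t *cap,
                             size_t n, size_t max_lag, size_t *best_lag)
{
    double best = 0.0;
    *best_lag = 0;
    for (size_t lag = 0; lag <= max_lag && lag < n; lag++) {
        double c = echo_corr_sq(ref, cap, n, lag);
        if (c > best) {
            best = c;
            *best_lag = lag;
        }
    }
    return best;
}
```

A peak close to 1 means the capture is essentially a delayed, scaled copy of the playback; a working AEC should push it towards 0. Since the score is gain-independent, it can be compared directly between pipeline stages.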

Test 2: Pre-AGC Stage Levels

Tested NS (stage 3) vs AGC (stage 4) output.

Result: Pre-AGC stages output at very low levels, but when amplified in Audacity, they contain the SAME uncanceled speaker audio. AGC just amplifies it - AEC is not removing it.
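
To compare stage levels without amplifying by hand in Audacity, the per-channel power of the captured stream can be computed directly. This is a small host-side sketch with a hypothetical function name; it assumes the capture is interleaved stereo (`L0 R0 L1 R1 ...`) in signed 32-bit samples, matching the 32-bit stereo I2S format described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Mean-square level of one channel of an interleaved stereo buffer,
 * relative to 32-bit full scale. Returns a value in [0, 1];
 * 10*log10() of the result gives the level in dBFS.
 */
static double channel_power_fs(const int32_t *interleaved, size_t frames,
                               int channel)
{
    assert(channel == 0 || channel == 1);
    if (frames == 0) {
        return 0.0;
    }
    const double full_scale = 2147483648.0;   /* 2^31 */
    double acc = 0.0;
    for (size_t i = 0; i < frames; i++) {
        double s = (double)interleaved[2 * i + (size_t)channel] / full_scale;
        acc += s * s;
    }
    return acc / (double)frames;
}
```

The ratio of the two channel powers is the extra gain applied between the stages; it says nothing about echo by itself, which is why the content still has to be checked separately.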

Test 3: Sample Rate Matching

Changed mic I2S from 16kHz to 48kHz to match speaker I2S, in case XMOS needed matching rates for AEC reference routing.

Result: No effect. Reverted.

Test 4: Init Order Changes

Tried initializing I2S TX before the XMOS reset (like ESPHome does with its priority system), in case the XMOS needs to see active I2S data during boot.

Result: Broke audio playback. I2S TX in slave mode doesn’t work without XMOS providing clocks.

Test 5: I2S TX Restart

Added a restart of I2S TX channel after full initialization.

Result: No effect.

Test 6: XMOS I2C Interface

Verified I can communicate with XMOS via I2C:

  • Successfully read firmware version (servicer 240)
  • Successfully read/write pipeline stages (servicer 241)
  • VNR readings work

Result: Basic I2C communication works, but there’s no AEC-specific control interface exposed.

Test 7: VNR Monitoring

Read Voice-to-Noise Ratio before and after playing audio.

Result: VNR increased from 1 to 16 during playback - XMOS IS detecting signal activity. But this “voice” is the speaker audio being picked up by the mic, not being canceled.

What I’ve Verified Matches ESPHome

Parameter        My Implementation              ESPHome Reference
XMOS Firmware    1.3.1                          1.3.1
Pipeline ch0     AGC (4)                        AGC (4)
Pipeline ch1     NS (3)                         NS (3)
I2S TX rate      48kHz                          48kHz
I2S TX format    32-bit stereo, slave           32-bit stereo, slave
I2S TX GPIOs     7 (BCLK), 8 (WS), 10 (DOUT)    Same
I2S RX rate      16kHz                          16kHz
I2S RX format    32-bit stereo, slave           32-bit stereo, slave
I2S RX GPIOs     13 (BCLK), 14 (WS), 15 (DIN)   Same
XMOS I2C addr    0x42                           0x42
DAC I2C addr     0x18                           0x18

Questions for the Community

  1. Is AEC actually enabled in the XMOS firmware on VPE? Since ESPHome doesn’t use it, maybe it’s disabled or not configured by default?

  2. Are there additional XMOS registers that need to be configured to enable AEC? The ESPHome code only writes pipeline stage registers - is there something else needed?

  3. Is there something special about how the I2S TX reference signal needs to be routed internally in XMOS? The audio reaches the DAC fine, but maybe it’s not being routed to the AEC module?

  4. Has anyone successfully used XMOS AEC on this device? I’d love to see working code or configuration.

My Hypothesis (After Analyzing XMOS Firmware Source with Claude Code)

After reading through the XMOS sln_voice firmware source code, here’s what I suspect might be happening:

  1. AEC is enabled by default - The appconfAUDIO_PIPELINE_SKIP_AEC flag is 0 in app_conf.h, so AEC should be running.

  2. Only two I2C servicers exist - Resource ID 0xF0 (240) for DFU and 0xF1 (241) for configuration. There is no runtime AEC control interface exposed via I2C - you can’t enable/disable or tune AEC at runtime.

  3. AEC reference routing might be the issue - The firmware appears to expect the AEC reference signal via a separate I2S INPUT stream, not by internally tapping its own output to the DAC. From src/ffva/src/main.c:

   if (aec_ref_source == appconfAEC_REF_I2S) {
       // Reference comes from I2S INPUT on Tile 1
   }

If this is correct, AEC would require the VPE hardware to route the speaker audio BACK to XMOS as an I2S input - which may not exist on this board.

If this hypothesis is correct, it would explain why ESPHome uses half-duplex instead of AEC - the hardware simply doesn’t support it.
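
To illustrate why the reference signal matters so much, here is a textbook NLMS (normalized least mean squares) echo canceller in plain C. This is a generic sketch for intuition only, not the XMOS AEC implementation; the names, the 8-tap length and the step size are arbitrary choices for the example. The point is that the canceller can only subtract what it can predict from `ref`, so if the reference never reaches the AEC, the mic signal passes through unchanged.

```c
#include <assert.h>
#include <stddef.h>

#define NLMS_TAPS 8

typedef struct {
    float w[NLMS_TAPS];   /* estimated echo path (speaker -> mic) */
    float x[NLMS_TAPS];   /* recent reference samples, x[0] is the newest */
} nlms_t;

static void nlms_init(nlms_t *f)
{
    for (size_t i = 0; i < NLMS_TAPS; i++) {
        f->w[i] = 0.0f;
        f->x[i] = 0.0f;
    }
}

/*
 * Process one sample. `ref` is what was sent to the speaker, `mic` is what
 * the microphone captured. Returns the mic sample with the estimated echo
 * removed.
 */
static float nlms_step(nlms_t *f, float ref, float mic, float mu)
{
    assert(mu > 0.0f && mu < 2.0f);   /* usual NLMS stability range */

    /* Shift the reference history and insert the newest sample. */
    for (size_t i = NLMS_TAPS - 1; i > 0; i--) {
        f->x[i] = f->x[i - 1];
    }
    f->x[0] = ref;

    /* Predict the echo and measure the reference energy. */
    float echo = 0.0f;
    float energy = 1e-6f;             /* small bias avoids division by zero */
    for (size_t i = 0; i < NLMS_TAPS; i++) {
        echo += f->w[i] * f->x[i];
        energy += f->x[i] * f->x[i];
    }

    float err = mic - echo;           /* echo-cancelled output */

    /* Normalized LMS update of the echo-path estimate. */
    float g = mu * err / energy;
    for (size_t i = 0; i < NLMS_TAPS; i++) {
        f->w[i] += g * f->x[i];
    }
    return err;
}
```

If `ref` is always zero (no reference routed in), `echo` stays zero and `err` equals `mic`, which is the same symptom as in the tests above: speaker audio passing straight through the pipeline.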

Can anyone confirm or deny this? Does VPE have an internal loopback for AEC reference, or is the speaker I2S routed directly to the DAC?

Summary

I’ve spent significant time trying to get AEC working, but no luck. The XMOS chip responds to commands, the audio path works (speaker plays, mic captures), but echo cancellation simply doesn’t happen.

Any help, pointers to documentation, or suggestions would be greatly appreciated. I’m happy to run additional tests or share more details about my implementation.

Thanks in advance!

What was the code that Claude offered?

Thanks for the prompt reply!

The repo is breakds/livekit-on-vpe on GitHub; the code is in the main directory on the dev/breakds/test_aec branch.

Key location: the audio pipeline is implemented as a module in the audio_pipeline subdirectory. Let me know if there’s anything concerning.

Sorry, as a new user I wasn’t able to put more links here, but the code is all in this repo.

I encountered the same problem with the same XMOS version, and I just use the AEC pipeline.