Voice assist with INMP441 mic results in noisy recordings

My setup:

Hardware:

  • Microphone: INMP441
  • Amp: MAX98357
  • Speaker: 3 watt mini speaker 8 ohm
  • Board: az-delivery-devkit-v4

I have been banging my head against this wall for a while. I get lots of crackling noise, both from the speaker and in the recordings of the wake words and commands. I enabled those recordings by adding the following to my Home Assistant configuration.yaml:

assist_pipeline:
  debug_recording_dir: /share/assist_pipeline 

Crackling noise aside, it DOES work, but not as well as I expect! The crackling noise heavily disrupts the speech recognition. I’ve enabled GPU-powered speech-to-text in a Proxmox container that runs Docker, and it is blazing fast and really good, taking into account all the disruptions in the recordings! Amazing that it still sometimes gets the command right! Without the crackling noise, this setup would be so awesome! All this crackling really eats my brain!

I’ve searched the web and found many suggestions: fixing solder joints, cables etc., and splitting I2S into separate in/out buses. I’ve tried everything. Nothing seems to work. Except this one suggestion, which is outside the realm of possibilities of ESPHome.

In that thread Vegethalia writes:

Ok I’ll answer myself, just in case this serves someone else: The problem was that I had configured the I2S in 16 bit mode, and then I was only capturing the MSB of the bit stream. If I configured 24 bits, then the stream was garbled. The solution is to configure I2S in 32 bits mode and when reading, each sample takes 4 bytes and one must be discarted. Like this:

REAL_BYTES_X_SAMPLE=4;

i2s_read(I2S_NUM_0, (void*)dataOrig, buffSizeOrig, &bytesRead, portMAX_DELAY);
uint16_t samplesRead = bytesRead / REAL_BYTES_X_SAMPLE;
for (int i = 0; i < samplesRead; i++) {
    byteIndex = i * REAL_BYTES_X_SAMPLE;
    int32_t value = ((int32_t*)(dataOrig + byteIndex))[0]>>8;
}

Even though you can specify “bits_per_sample:” for your microphone in ESPHome, you can only set it to 16bit or 32bit. You are not able to control how the bytes of each sample are read, as suggested in the post.
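To make the quoted idea concrete, here is a minimal standalone C sketch of just the arithmetic. It assumes each 32-bit I2S slot holds a 24-bit sample in its upper three bytes with one padding byte at the bottom, which is what the >>8 in the quoted code implies. The function names are made up for illustration; they are not part of ESPHome or the ESP-IDF driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: each 32-bit I2S slot carries a 24-bit sample in its upper
 * three bytes, with the lowest byte being padding that is discarded. */

/* Return the signed 24-bit sample stored in one raw 32-bit slot word. */
int32_t slot_to_sample24(uint32_t raw)
{
    uint32_t bits = raw >> 8;             /* drop the padding byte */
    if (bits & 0x800000u) {               /* sign bit of the 24-bit value */
        return (int32_t)bits - 0x1000000; /* sign-extend by hand */
    }
    return (int32_t)bits;
}

/* Reduce one slot word to 16-bit PCM by keeping only its top 16 bits. */
int16_t slot_to_pcm16(uint32_t raw)
{
    int32_t v = (int32_t)(raw >> 16);
    if (v >= 0x8000) {
        v -= 0x10000;
    }
    return (int16_t)v;
}

/* Hypothetical helper: convert a buffer of raw slot words to 24-bit
 * sample values. Returns the number of samples written to out. */
size_t convert_slots(const uint32_t *raw, size_t count, int32_t *out)
{
    for (size_t i = 0; i < count; i++) {
        out[i] = slot_to_sample24(raw[i]);
    }
    return count;
}
```

This only shows the byte handling itself. Whether ESPHome already does something equivalent internally, or offers a place to hook in code like this, is not something the sketch answers.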

After changing the output to a media_player (and changing the framework to arduino), the crackling noise disappeared from the response in the speaker, but the recordings saved by Home Assistant are still crackling. The crackling MUST be something related to the I2S microphone on my system.

I’ve tried different GPIOs for the I2S, making guesses from the pinout for my board, but I’ve not found any pins that make a difference.

Code:

esphome:
  name: esp32-mic-speaker
  friendly_name: esp32-mic-speaker
  on_boot:
     - priority: -100
       then:
         - wait_until: api.connected
         - delay: 1s
         - if:
             condition:
               switch.is_on: use_wake_word
             then:
               - voice_assistant.start_continuous:

esp32:
  board: az-delivery-devkit-v4
  framework:
    type: arduino
    version: recommended

# Enable logging
logger:
  level: VERBOSE

# Enable Home Assistant API
api:
  encryption:
    key: "llorem ipsum"

ota:
  password: "lorem ipsum"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-Mic-Speaker"
    password: "loremipsum"


i2s_audio:
  - id: i2s_out
    i2s_lrclk_pin: GPIO21
    i2s_bclk_pin: GPIO22
  - id: i2s_in
    i2s_lrclk_pin: GPIO26
    i2s_bclk_pin: GPIO25

microphone:
  - platform: i2s_audio
    i2s_audio_id: i2s_in
    id: mic
    adc_type: external
    i2s_din_pin: GPIO13
    pdm: false
    bits_per_sample: 32bit

media_player:
  - platform: i2s_audio
    name: "esp_speaker"
    id: media_player_speaker
    i2s_audio_id: i2s_out
    dac_type: external
    i2s_dout_pin: GPIO33   
    mode: mono
    
voice_assistant:
  microphone: mic
  use_wake_word: false
  noise_suppression_level: 4
  auto_gain: 31dBFS
  volume_multiplier: 2
  media_player: media_player_speaker
  id: assist

switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(assist).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(assist).set_use_wake_word(false);

I’ve purchased other I2S mics, but they are four weeks away. Any suggestions on how to remove the crackling from the INMP441 recordings?

Or is it possible to do some bit mangling like the quoted suggestion above in ESPHome?

I replaced the ESP32 with an ESP32 Devkit V1 (board: esp32doit-devkit-v1) and got it working to a satisfactory level.

I used exactly the same wiring and code with the “az-delivery-devkit-v4” board, but with that board I get lots of “static” and noise.