My setup:
Hardware:
- Microphone: INMP441
- Amp: MAX98357
- Speaker: 3 watt mini speaker 8 ohm
- Board: az-delivery-devkit-v4
I have been banging my head against this wall for a while. I get a lot of crackling noise, both from the speaker and in the wake-word and command recordings that Home Assistant saves when I add the following to my configuration.yaml:
assist_pipeline:
  debug_recording_dir: /share/assist_pipeline
Crackling noise aside, it DOES work, but not as well as I expected! The crackling heavily disrupts the speech recognition. I’ve enabled GPU-powered speech-to-text in a Proxmox container that runs Docker, and it is blazing fast and really good, considering all the disruptions in the recordings! Amazing that it still sometimes gets the command right! Without the crackling, this setup would be so awesome! All this crackling is really eating my brain!
I’ve searched the web and found many suggestions: fixing solder joints and cables, splitting I2S into separate in/out buses, and so on. I’ve tried everything. Nothing seems to work. The one exception is this suggestion, which is beyond what ESPHome lets me do.
In that thread Vegethalia writes:
Ok I’ll answer myself, just in case this serves someone else: The problem was that I had configured the I2S in 16 bit mode, and then I was only capturing the MSB of the bit stream. If I configured 24 bits, then the stream was garbled. The solution is to configure I2S in 32 bits mode and when reading, each sample takes 4 bytes and one must be discarted. Like this:
REAL_BYTES_X_SAMPLE=4;
i2s_read(I2S_NUM_0, (void*)dataOrig, buffSizeOrig, &bytesRead, portMAX_DELAY);
uint16_t samplesRead = bytesRead / REAL_BYTES_X_SAMPLE;
for (int i = 0; i < samplesRead; i++) {
    byteIndex = i * REAL_BYTES_X_SAMPLE;
    int32_t value = ((int32_t*)(dataOrig + byteIndex))[0]>>8;
}
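To make the quoted idea concrete, here is a minimal standalone sketch of just the bit handling, written in plain C with no ESP-IDF or ESPHome calls. It assumes each sample arrives as one 32-bit I2S slot whose top 24 bits carry the audio and whose lowest byte is padding, as the quote describes. The function names `slot_to_s24` and `slots_to_s16` are hypothetical and only for illustration; this is not something ESPHome exposes as far as I know.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: drop the padding byte of one 32-bit I2S slot and
 * return the remaining 24-bit sample, sign-extended in an int32_t.
 * Assumption: the compiler uses an arithmetic right shift for signed
 * values (GCC and Clang do), same as the quoted snippet relies on. */
int32_t slot_to_s24(int32_t slot) {
    return slot >> 8;
}

/* Hypothetical helper: reduce a buffer of 32-bit slots to 16-bit PCM by
 * keeping only the 16 most significant bits of each slot.
 * Returns the number of samples written to out. */
size_t slots_to_s16(const int32_t *slots, size_t count, int16_t *out) {
    assert(slots != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = (int16_t)(slots[i] >> 16);
    }
    return count;
}
```

The point of the sketch is that the sample is taken from the top of the 32-bit slot, so the padding byte never ends up inside the audio data. Whether this is what causes the crackling on my setup is still a guess.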
Even though you can specify “bits_per_sample:” for the microphone, you can only set it to 16bit or 32bit. You cannot control how the samples are read at the level of detail suggested in the post.
After changing the output to a media_player (and changing the framework to arduino), the crackling disappeared from the responses played through the speaker, but the recordings saved by Home Assistant are still crackling… The crackling MUST be something related to the I2S microphone on my system.
I’ve tried different GPIOs for the I2S, making guesses from the pinout for my board, but I’ve not found any pins that make a difference.
Code:
esphome:
  name: esp32-mic-speaker
  friendly_name: esp32-mic-speaker
  on_boot:
    - priority: -100
      then:
        - wait_until: api.connected
        - delay: 1s
        - if:
            condition:
              switch.is_on: use_wake_word
            then:
              - voice_assistant.start_continuous:

esp32:
  board: az-delivery-devkit-v4
  framework:
    type: arduino
    version: recommended

# Enable logging
logger:
  level: VERBOSE

# Enable Home Assistant API
api:
  encryption:
    key: "llorem ipsum"

ota:
  password: "lorem ipsum"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-Mic-Speaker"
    password: "loremipsum"

i2s_audio:
  - id: i2s_out
    i2s_lrclk_pin: GPIO21
    i2s_bclk_pin: GPIO22
  - id: i2s_in
    i2s_lrclk_pin: GPIO26
    i2s_bclk_pin: GPIO25

microphone:
  - platform: i2s_audio
    i2s_audio_id: i2s_in
    id: mic
    adc_type: external
    i2s_din_pin: GPIO13
    pdm: false
    bits_per_sample: 32bit

media_player:
  - platform: i2s_audio
    name: "esp_speaker"
    id: media_player_speaker
    i2s_audio_id: i2s_out
    dac_type: external
    i2s_dout_pin: GPIO33
    mode: mono

voice_assistant:
  microphone: mic
  use_wake_word: false
  noise_suppression_level: 4
  auto_gain: 31dBFS
  volume_multiplier: 2
  media_player: media_player_speaker
  id: assist

switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(assist).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(assist).set_use_wake_word(false);
I’ve ordered other I2S mics, but they are four weeks away. Any suggestions on how to remove the crackling from the INMP441 recordings?
Or is it possible to do some bit mangling like the quoted suggestion above in ESPHome?