ESP32 Voice Assistant - Static Noise on Startup

I have been working on a voice assistant with an ESP32. I got it working, but on startup I get static through the speaker until the first voice reply. After that the static/noise goes away. I have attached a schematic of the wiring.

Also, sometimes the voice response stops.

Watching… I also get this issue and also have an LED attached.

I have the same problem. Has anyone found a fix yet?

I also get this same thing.

I have the same issue. I was thinking: is there a way to send a “start up” sound or message as a workaround?
Thoughts?

So I think I fixed this by changing a few configurations and wiring.

  1. Power the MAX98357A with 5V. Under-powering can cause distortion. It is best to use an external power supply rather than the VCC of the ESP32. However, I’m powering mine via VCC and it is working fine now. This helped to clear up the distortion while playing the wav.
  2. Reduce the gain to 3dB. This also helped to clear up the distortion while playing the wav.
    The gain is 3dB if a 100K resistor is connected between GAIN and Vin.
  3. This is the one that removed the startup noise. I’m not sure if I’m doing this 100% correctly, but it is working well. There is a voltage (~1V) on the pin that I’m using on startup, and after a wav has been sent it is 0V. I’m using an output configuration to set this pin low on boot.
output:
  - platform: gpio
    pin:
      number: GPIO25
      allow_other_uses: true
    id: set_low_speaker
speaker:
  platform: i2s_audio
  id: external_speaker
  dac_type: external
  i2s_audio_id: i2s_out
  i2s_dout_pin:
    number: GPIO25
    allow_other_uses: true

Hope this helps.

Unfortunately your code did not solve my problem, unless there was more to it. Did you need to do anything on the HA side to make this work? I used your code exactly other than changing to the pin I was using. After I loaded the code, everything worked just fine. Then I pulled the plug on the ESP32 and powered back up. After it booted up, the horrible noise resumed until I got it to say something.

I had the same idea, but I cannot see a way to make it say something from the ESP32 code.
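
One possible way to try the “start up” sound workaround, as an untested sketch. It uses ESPHome's media_player.play_media action, so it assumes the device is configured with a media_player (not the speaker component) whose id is media_player_speaker. The URL is a made-up placeholder for a short audio file on your own network. Since the clip can only play once Wi-Fi is connected, this would at best shorten the noise, not prevent it.

```yaml
# Sketch only: play a short clip once at boot so the amp receives real audio early.
# Assumes a media_player with id media_player_speaker is defined elsewhere.
esphome:
  name: office-va
  on_boot:
    priority: -100            # run late in the boot sequence
    then:
      - wait_until: wifi.connected
      - media_player.play_media:
          id: media_player_speaker
          # Hypothetical URL; point this at a real file you host.
          media_url: "http://homeassistant.local:8123/local/startup.mp3"
```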

This worked for me:

output:
  - platform: gpio
    pin: 
      number: GPIO32
      allow_other_uses: true
    id: set_low_speaker
speaker:
  - platform: i2s_audio
    id: my_speaker
    dac_type: external
    i2s_audio_id: i2s_in
    i2s_dout_pin: 
      number: GPIO32 #DIN
      allow_other_uses: true
    mode: mono

Try turning it off in on_boot:

esphome:
  name: office-va
  friendly_name: Office VA
  # Automation to perform every time the device boots
  on_boot:
    priority: 600
    then:
      - output.turn_off: set_low_speaker
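
For anyone trying this, here is how the two snippets above might look combined in one file. It is only a sketch: the pin and ids are copied from those snippets, it assumes an i2s_audio bus with the id i2s_in is defined elsewhere, and whether it silences the noise seems to depend on the board and pin.

```yaml
esphome:
  name: office-va
  on_boot:
    priority: 600
    then:
      - output.turn_off: set_low_speaker   # drive the amp's DIN line low at boot

output:
  - platform: gpio
    id: set_low_speaker
    pin:
      number: GPIO32
      allow_other_uses: true   # the same pin is also claimed by the speaker below

speaker:
  - platform: i2s_audio
    id: my_speaker
    dac_type: external
    i2s_audio_id: i2s_in       # assumes a bus with this id exists in i2s_audio
    i2s_dout_pin:
      number: GPIO32           # DIN of the amplifier
      allow_other_uses: true
    mode: mono
```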

I did add code at startup to turn it off, but still no joy. However, I just solved the issue without adding any new code. The bottom line is that this issue depends on which ESP32 dev board you are using AND which pins you use for the amp I/O. I was thinking about using a tiny relay, controlled by another pin, to disconnect the speaker on startup and connect it the first time speech was needed. While thinking about this I decided to check for voltage on another pin on startup and, lo and behold, there was none, unlike the pin I was using for the amp! So I moved the amp to that pin, updated the speaker out pin in the firmware to match, and voila! No noise on startup.
Thanks to @steriku for mentioning the voltage on startup that causes this.
So, if you are experiencing this, grab your voltmeter and try other GPIO pins!

Can you post your full yaml? I have the same parts and can’t get anywhere with it…

@Rich37804 Here is my current yaml file:

This worked for me as well! Thanks for the code :slight_smile:


esphome:
  name: voice-assist-003
  friendly_name: Voice Assist 003
  on_boot:
    - priority: -100
      then:
        - wait_until: api.connected
        - delay: 1s
        - if:
            condition:
              switch.is_on: use_wake_word
            then:
              - voice_assistant.start_continuous:

esp32:
  board: esp32dev
  framework:
    type: arduino

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "+ONyY="

ota:
  password: ""

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  manual_ip:
    static_ip: xxx.xxx.xxx.xxx
    gateway: xxx.xxx.xxx.xxx
    subnet: xxx.xxx.xxx.xxx

    
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "IBEHERE"
    password: "PASSWORD?????"

captive_portal:

i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO26 #WS 
    i2s_bclk_pin: GPIO25 #SCK

microphone:
  - platform: i2s_audio
    adc_type: external
    pdm: false
    id: mic_i2s
    channel: right
    bits_per_sample: 32bit
    i2s_audio_id: i2s_in
    i2s_din_pin: GPIO33  #SD Pin from the INMP441 Microphone


media_player:
  - platform: i2s_audio
    name: "esp_speaker"
    id: media_player_speaker
    i2s_audio_id: i2s_in
    dac_type: external
    i2s_dout_pin: GPIO27   #  DIN Pin of the MAX98357A Audio Amplifier
    mode: mono


voice_assistant:
  microphone: mic_i2s
  id: va
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  use_wake_word: false
  media_player: media_player_speaker
  
  on_wake_word_detected: 
    - light.turn_on:
        id: led_light
  on_listening: 
    - light.turn_on:
        id: led_light
        effect: "Rainbow Spinner"
 #       red: 71%
 #       green: 0%
 #       blue: 71%

  on_stt_end:
    - light.turn_on:
        id: led_light
        effect: "None"
        red: 0%
        green: 100%
        blue: 0%

  on_error: 
    - light.turn_on:
        id: led_light
        effect: "None"
        red: 100%
        green: 0%
        blue: 0%
    - if:
        condition:
          switch.is_on: use_wake_word
        then:

          - switch.turn_off: use_wake_word
          - delay: 1sec 
          - switch.turn_on: use_wake_word      
  

  on_tts_start:                                    # this is required to play the output on a media player
    - homeassistant.service:
        service: tts.speak
        data:
          media_player_entity_id: media_player.voice_assist_001_esp_speaker    #replace this with your media player entity id
          message: !lambda 'return x;'
          entity_id: tts.piper_2                 #replace this with your piper tts id.




  on_client_connected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.start_continuous:

  on_client_disconnected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.stop:
 
  on_end:
    - light.turn_off:
        id: led_light



binary_sensor:
  - platform: status
    name: API Connection
    id: api_connection
    filters:
      - delayed_on: 1s
    on_press:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - voice_assistant.start_continuous:
    on_release:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - voice_assistant.stop:


switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(va).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
    
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(va).set_use_wake_word(false);

light:
  - platform: neopixelbus
    id: led_light
    type: grb
    pin: GPIO32       # DIN pin of the LED Strip
    num_leds: 24      # change the Number of LEDS according to your LED Strip.
    name: "Light"
    variant: ws2812x
    default_transition_length: 0.5s
      
    effects:
      - addressable_scan:
          name: Scan Effect With Custom Values
          move_interval: 50ms
          scan_width: 2
      - addressable_rainbow:
          name: Rainbow Spinner
          speed: 50
          width: 24
      - addressable_rainbow:
          name: Rainbow Fader
          speed: 50
      - addressable_random_twinkle:
          name: Random Twinkle Effect With Custom Values
          twinkle_probability: 50%
          progress_interval: 32ms
      - addressable_flicker:
          name: Flicker Effect With Custom Values
          update_interval: 16ms
          intensity: 100%

Could you explain, please? I see that both i2s_in and i2s_out are declared in your code for audio:

i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO25
    i2s_bclk_pin: GPIO26
  - id: i2s_out
    i2s_lrclk_pin: GPIO12
    i2s_bclk_pin: GPIO13

Most ESP speaker schematics I’ve seen only have the i2s_in configuration. What is the point of this, and what hardware are you using for this configuration?

Hey Elf,
In i2s_audio you can declare which pins should be used for input audio, coming from a mic, audio jack or something like that. The i2s_out entry tells it which pins are used for outgoing audio, in our case to the amplifier.
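
To make that concrete, here is a rough sketch of how a microphone and a speaker could each point at their own bus through i2s_audio_id. The clock pins are the ones quoted in the question; the two data pins (GPIO33 and GPIO27) are assumptions borrowed from the full yaml earlier in the thread, so adjust them to your wiring.

```yaml
i2s_audio:
  - id: i2s_in                 # bus used by the microphone
    i2s_lrclk_pin: GPIO25
    i2s_bclk_pin: GPIO26
  - id: i2s_out                # bus used by the amplifier
    i2s_lrclk_pin: GPIO12
    i2s_bclk_pin: GPIO13

microphone:
  - platform: i2s_audio
    id: mic_i2s
    adc_type: external
    i2s_audio_id: i2s_in       # selects the first bus
    i2s_din_pin: GPIO33        # assumed mic data pin

speaker:
  - platform: i2s_audio
    id: external_speaker
    dac_type: external
    i2s_audio_id: i2s_out      # selects the second bus
    i2s_dout_pin: GPIO27       # assumed amp data pin
```

The ids are just labels that each component refers to. The full yaml posted earlier gets by with a single bus (i2s_in) shared by the mic and the media player, which is why many schematics only show one.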