MSM261S4030H0 as ESPHome i2s_audio microphone?

I got a set of these: https://www.aliexpress.us/item/3256804659411651.html

I’m trying to use them with the i2s_audio microphone support in ESPHome, but I’m unsure how to go about it. I tried a YAML config like this (on an ESP32 SoC):

i2s_audio:
  i2s_lrclk_pin: GPIO32
  i2s_bclk_pin: GPIO23
  
microphone:
  - platform: i2s_audio
    adc_type: internal
    adc_pin: GPIO35
    on_data:
      - logger.log:
          format: "Received %d bytes"
          args: ['x.size()']  

and I wired it like this:
image

I wired it that way primarily because the component describes i2s_lrclk_pin as potentially being labeled WS. But the board also has an LR pin and a CK pin, and the i2s_audio ESPHome component has three clock options (!), so I’m confused as to which goes where.

I found this datasheet PDF file: https://dl.sipeed.com/MAIX/HDK/Chip_DS/麦克_MSM261S4030H0(使用的).pdf
It shows that WS is Serial Data-Word Select and CK is SCK, the Serial Data Clock. The L/R pin appears to just select left vs. right when tied to ground vs. VDD, respectively, which is why I hooked it up like that. Anyway, with this configuration the ESPHome device doesn’t expose any microphone entity, so I’m definitely missing something.

Anyone know:

  1. Is it possible to configure this type of microphone to work with ESPHome? If so, how?
  2. How do I use a microphone from an ESPHome device? Do I need to define a voice_assistant? Is that the only way?

Thanks!

I’m really stuck on this. I tried it a bunch more tonight and every time I manually trigger the STT process the pipeline seems to get stuck immediately:
image

… and it never comes back from this, it just spins on the Speech-to-text section.

This is what the raw serial-to-USB log looks like when I manually trigger a listen event:

[21:53:37][C][audio:218]:   Internal DAC mode: Left & Right
[21:53:50][D][button:010]: 'Fire Listener' Pressed.
[21:53:50][D][main:282]: Button Pressed
[21:53:50][D][voice_assistant:132]: Requesting start...
[21:53:50][D][voice_assistant:111]: Starting...
[21:53:50][D][voice_assistant:154]: Assist Pipeline running
[21:53:50][D][main:304]: voice_start
[21:53:50][W][light:476]: 'Response Light' - No such effect 'pulse'
[21:53:50][D][light:036]: 'Response Light' Setting:
[21:53:50][D][light:047]:   State: ON
[21:53:51][D][light:036]: 'Response Light' Setting:
[21:53:51][D][light:047]:   State: OFF
[21:53:58][D][sensor:094]: 'AudioTest Uptime Raw': Sending state 28.89900 s with 0 decimals of accuracy
[21:54:05][D][text_sensor:064]: 'AudioTest Uptime': Sending state '28s'
[21:54:20][D][voice_assistant:218]: Assist Pipeline ended
[21:54:20][D][main:085]: voice_end
[21:54:20][E][voice_assistant:231]: Error: pipeline-timeout - Pipeline timeout
[21:54:20][D][voice_assistant:144]: Signaling stop...
[21:54:20][D][main:102]: voice_error
[21:54:20][D][light:036]: 'Response Light' Setting:
[21:54:20][D][light:047]:   State: ON
[21:54:21][D][light:036]: 'Response Light' Setting:
[21:54:21][D][light:047]:   State: OFF
[21:54:21][D][light:036]: 'Response Light' Setting:
[21:54:28][D][sensor:094]: 'AudioTest Uptime Raw': Sending state 58.88900 s with 0 decimals of accuracy

I put a little LED on the proto board so I can see it light up when the voice pipeline is started and finished (or errored). It never seems to receive anything. :-1:

I’ve tried it with:

  • adc_type: internal
  • adc_type: external with pdm: false
  • adc_type: external with pdm: true

The whole YAML looks like this:

i2s_audio:
  i2s_lrclk_pin: GPIO32
  i2s_bclk_pin: GPIO23
  
microphone:
  - platform: i2s_audio
    id: my_mic
    adc_type: internal
    adc_pin: GPIO35
    # adc_type: external
    # i2s_din_pin: GPIO35
    # pdm: false

media_player:
  - platform: i2s_audio
    name: ESPHome I2S Media Player
    id: media_out
    dac_type: internal
    mode: stereo
    
output:
  - platform: esp32_dac
    pin: GPIO26
    id: right_output
  - platform: esp32_dac
    pin: GPIO25
    id: left_output
  - platform: gpio
    pin: GPIO17
    id: light_output

light:
  - platform: binary
    name: "Response Light"
    id: response_light
    output: light_output

voice_assistant:
  microphone: my_mic
  media_player: media_out
  on_start:
    - logger.log: voice_start
    - light.turn_on:
        id: response_light
        effect: pulse
    - delay: 1s
    - light.turn_off: response_light
  on_tts_start:
    - logger.log: tts_start
    - light.turn_on: response_light
    - delay: 1s
    - light.turn_off: response_light
  on_tts_end:
    - logger.log: tts_end
    - media_player.play_media: !lambda return x;
    - light.turn_on: response_light
    - delay: 1s
    - light.turn_off: response_light
  on_end:
    - logger.log: voice_end
    - delay: 1s
    - wait_until:
        not:
          media_player.is_playing: media_out
    - light.turn_off: response_light
  on_error:
    - logger.log: voice_error
    - light.turn_on: response_light
    - delay: 1s
    - light.turn_off: response_light

button:
  - platform: template
    name: "Fire Listener"
    on_press:
      - logger.log: Button Pressed
      - if:
          condition: voice_assistant.is_running
          then:
            - voice_assistant.stop:
          else:
            - voice_assistant.start:

I only ever see ‘voice_start’ but never ‘tts_start’, presumably because I can’t get the microphone to register anything.

Does anyone know a good way to “test” the microphone on a raw build like this?
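
One possible way to check the microphone on its own, without the voice pipeline involved, is to start capture by hand and log each chunk as it arrives. This is only a sketch under a few assumptions: it reuses the i2s_audio block and the my_mic id from the config above, it assumes the mic is wired as an external I2S device with its data line on GPIO35, and the switch name "Mic Capture Test" is made up for illustration. The number logged is the element count of the buffer, which may or may not equal bytes depending on the ESPHome version.

```yaml
# Sketch only: replaces the microphone block above for a standalone test.
# Assumes the i2s_audio block (lrclk/bclk pins) is already defined.
microphone:
  - platform: i2s_audio
    id: my_mic
    adc_type: external        # assumption: digital I2S mic, not the internal ADC
    i2s_din_pin: GPIO35       # assumption: mic data line wired here
    pdm: false
    on_data:
      # x is the buffer of captured audio; log how many elements arrived
      - logger.log:
          format: "Received chunk of size %d"
          args: ['(int) x.size()']

switch:
  - platform: template
    name: "Mic Capture Test"  # hypothetical name
    optimistic: true
    turn_on_action:
      - microphone.capture:
          id: my_mic
    turn_off_action:
      - microphone.stop_capture:
          id: my_mic
```

If the log stays silent while the switch is on, the problem is more likely wiring or pin selection than anything on the Home Assistant side.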

@SpikeyGG Did you get anywhere with this device? I found one in a box of components and I’m trying to get it going, but I’m getting nowhere. It could be because I’m using a Seeed Studio Xiao ESP32C3, which is also a leftover device!

As for the pin definitions, I agree with what you said in the first post. I have also tried the Left/Right setting, but not the PDM choice yet. When I ran a bit of Arduino test code against the microphone and the speaker, it looked like I was getting a response from the microphone, but the speaker sounded terrible.

I do have a working configuration with a different microphone and controller, so I know HA configuration works.

More testing tomorrow…

Yeah, this was happening (the stuck HA Voice Assistant) because my network was blocking the packets from reaching the Home Assistant address. Once I opened the network path with firewall rules HA received the requests.

Thanks for the info, I have a feeling that my issue is with the Xiao, but I don’t have another ESP32 to test with.

I have this microphone working using the code below, and I can trigger it using the wake word. I have the LR pin pulled high, which should select the right channel, but it only works when I set the channel in the config to left.

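i2s_audio:
  # Shared I2S clocks: lrclk is the word select (WS) line, bclk is the bit clock (CK/SCK)
  - i2s_lrclk_pin: GPIO23
    i2s_bclk_pin: GPIO22

speaker:
  - platform: i2s_audio
    id: speaker_out
    dac_type: external
    i2s_dout_pin: GPIO19
    mode: mono

microphone:
  - platform: i2s_audio
    id: microphone_in
    # Data line coming from the microphone's data output
    i2s_din_pin: GPIO21
    adc_type: external
    # Works as left even though the LR pin is pulled high
    channel: left
    pdm: false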
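
The rest of that config isn't shown in the post. As a rough sketch of how the microphone and speaker above might be tied into the voice pipeline with wake word triggering, something like the following could work. It assumes the microphone_in and speaker_out ids from the block above and the voice_assistant component's speaker and use_wake_word options as documented around the time of this thread; check the current ESPHome docs, since these options may have changed.

```yaml
# Sketch only: not taken from the original post.
voice_assistant:
  microphone: microphone_in   # id from the microphone block above
  speaker: speaker_out        # id from the speaker block above
  use_wake_word: true         # assumption: wake word handled by the Assist pipeline
```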

Did you need to port forward on your router? I added your button to manually fire the voice assistant, but I’m getting this error:

[E][voice_assistant:757]: Error: stt-stream-failed - speech-to-text failed

I’m not sure which port to open (3232? 6053?):

[11:39:29][C][mdns:116]:   Hostname: esp32-voice-assistant
[11:39:29][C][esphome.ota:073]: Over-The-Air updates:
[11:39:29][C][esphome.ota:074]:   Address: esp32-voice-assistant.local:3232
[11:39:29][C][esphome.ota:075]:   Version: 2
[11:39:29][C][safe_mode:018]: Safe Mode:
[11:39:29][C][safe_mode:020]:   Boot considered successful after 60 seconds
[11:39:29][C][safe_mode:021]:   Invoke after 10 boot attempts
[11:39:29][C][safe_mode:023]:   Remain in safe mode for 300 seconds
[11:39:29][C][api:139]: API Server:
[11:39:29][C][api:140]:   Address: esp32-voice-assistant.local:6053

I did not need to port forward. However, I opened everything to that ESPHome device’s IP on my router because everything is on VLANs. Unless you’re trying to issue commands over the internet, you shouldn’t need to port forward… I think your local Home Assistant should talk directly to your local ESPHome device but I could be wrong.