ESP32 + Microphone

I am trying to figure out how to use an ESP32 as a voice satellite. Since I’m pretty new to HA, I thought I’d take baby steps and start small.

On the hardware side, I have an ESP32 with an INMP441 microphone connected to it.

I started by trying to control the blue LED on GPIO2 using a switch:

captive_portal:

output:
  - platform: gpio
    id: relay1
    pin: GPIO2    

switch:
  - platform: output
    id: extension1_relay1_switch
    name: "Relay 1"
    output: relay1

That worked fine. Now I want to check whether the mic is active and receiving data. I thought of using the same kind of switch to run the microphone.capture action and log the data:

captive_portal:

output:
  - platform: gpio
    id: relay1
    pin: GPIO2    

switch:
  - platform: output
    id: extension1_relay1_switch
    name: "Relay 1"
    output: relay1
  - platform: template
    id: extension1_relay1_microphone
    optimistic: True
    on_turn_on: 
      then:
        - microphone.capture: 
    on_turn_off: 
      then:
        - microphone.stop_capture: 
          
i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO26 #WS 
    i2s_bclk_pin: GPIO25 #SCK

microphone:
  - platform: i2s_audio
    adc_type: external
    pdm: false
    id: mic_i2s
    channel: right
    bits_per_sample: 32bit
    i2s_audio_id: i2s_in
    i2s_din_pin: GPIO34  #SD Pin from the INMP441 Microphone
    
    on_data:
      - logger.log:
          format: "Received %d bytes"
          args: ['x.size()']

But I don’t see anything in my logs. Any suggestions?
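Two things in the config above are worth checking, and the sketch below shows one way to test them. First, the template switch has only an `id` and no `name`; ESPHome treats such entities as internal, so the switch is not exposed to Home Assistant and may never actually be toggled. Second, the sketch assumes the INMP441's L/R pin is tied to GND, which selects the left channel; if yours is tied to 3.3V, keep `channel: right`. The switch name and id used here are made up for illustration, and the pins are copied from the question.

```yaml
# Sketch only, not a verified fix: pins and the mic/bus IDs come from the question.
logger:
  level: DEBUG

i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO26  # WS
    i2s_bclk_pin: GPIO25   # SCK

microphone:
  - platform: i2s_audio
    id: mic_i2s
    i2s_audio_id: i2s_in
    adc_type: external
    pdm: false
    # Assumption: INMP441 L/R pin wired to GND, which selects the left channel.
    channel: left
    bits_per_sample: 32bit
    i2s_din_pin: GPIO34    # SD pin of the INMP441
    on_data:
      - logger.log:
          format: "Received %d bytes"
          args: ['x.size()']

switch:
  - platform: template
    id: mic_capture_switch      # hypothetical id
    # A name is required for the switch to show up in Home Assistant;
    # an entity with only an id is internal.
    name: "Mic capture test"    # hypothetical name
    optimistic: true
    on_turn_on:
      then:
        - microphone.capture:
    on_turn_off:
      then:
        - microphone.stop_capture:
```

Note that `on_data` only fires while capture is running, and it fires often, so the log will be noisy once the switch is turned on.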

Try looking at one of the other voice device projects for inspiration.

The schematics of the devices are online, so you have working software and reference hardware. You can almost take it as is and modify it for your hardware. At a minimum, you can use it to understand how it can be done.

YAML files can be found at the links below.

See here for some up-to-date working examples.

Unless your ESP has PSRAM, you will be in for a poor experience, so make sure you are using an ESP32-S3 N16R8 board. You may also get away with an N8R2 board.
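For reference, a minimal sketch of how the PSRAM and flash on an N16R8 board might be declared in ESPHome. It assumes a DevKitC-1 style board and the ESP-IDF framework; the N16R8 module uses octal PSRAM, while an N8R2 module would presumably need `mode: quad` and `flash_size: 8MB` instead.

```yaml
# Sketch under the assumptions stated above, not a complete device config.
esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB        # the "N16" part of N16R8
  framework:
    type: esp-idf

psram:
  mode: octal             # the "R8" part: 8 MB octal PSRAM
  speed: 80MHz
```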

These are the ones I got: ELEGOO 3pcs ESP-WROOM-32 Development Board, USB Tyce-C, 2.4GHz Dual Mode WiFi+Bluetooth Dual Core Microcontroller for Arduino IDE, Support AP/STA/AP+STA, CP2102 Serial Chip : Amazon.ca: Electronics

I am not 100% sure of the exact model though. Is there a way perhaps to query the board directly and get the exact model?
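One way to have the board report on itself from within ESPHome is the `debug` component, which logs chip details (model, cores, revision, flash) at boot when the logger is at DEBUG level. A minimal sketch, with the entity name chosen only for illustration:

```yaml
# Sketch: add to an existing ESPHome config, then read the boot log.
logger:
  level: DEBUG

debug:
  update_interval: 5s

text_sensor:
  - platform: debug
    device:
      name: "Device Info"   # hypothetical name; also publishes the chip details to HA
```

An ESP-WROOM-32 module should show up as a plain ESP32, which has no PSRAM.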

That board will cause memory problems if trying to run voice assistant on it.

To have a successful build, use one of these.

https://www.aliexpress.com/item/1005008209644199.html?spm=a2g0o.productlist.main.1.56fb6619b53p1g&aem_p4p_detail=2025110405240416104494560013700000067586&algo_pvid=e93245f2-269a-4e6d-95b3-447b4bf2280a&algo_exp_id=e93245f2-269a-4e6d-95b3-447b4bf2280a-0&pdp_ext_f={"order"%3A"345"%2C"eval"%3A"1"%2C"fromPage"%3A"search"}&pdp_npi=6%40dis!GBP!2.30!2.30!!!20.92!20.92!%402103856417622626448157440e7c39!12000044242268882!sea!UK!2575567091!X!1!0!n_tag%3A-29919%3Bd%3A4b229d5f%3Bm03_new_user%3A-29895&curPageLogUid=jR61bstLTocn&utparam-url=scene%3Asearch|query_from%3A|x_object_id%3A1005008209644199|_p_origin_prod%3A&search_p4p_id=2025110405240416104494560013700000067586_1

DWEII 2PCS ESP32-S3-DevKitC-1-N16R8 ESP32-S3 Development Board Wi-Fi + BLE MCU Module Integrates Complete Wi-Fi and BLE Functions for Arduino : Amazon.ca: Electronics. Would this one be OK too?
Also, what’s the difference between this and the boards I bought?

Yep, that’s the board.

The difference is the 8 MB of PSRAM. Don’t ask me anything more technical, but that and the 16 MB of flash give the code the space it needs to run.

From the ESPHome voice page:

:warning: Warning

Audio and voice components consume a significant amount of resources (RAM, CPU) on the device.

Crashes are likely to occur if you include too many additional components in your device’s configuration. In particular, Bluetooth/BLE components are known to cause issues when used in combination with Voice Assistant and/or other audio components.

If you experience crashes, see the Troubleshooting guide for how to get a backtrace.

I have 4 setups using the code in the post I linked earlier. They work well and have done for some time.

I have been messing about with this since the very early days of voice.

A few tips: solder all connections, and place the end result somewhere with a good Wi-Fi signal. A poor Wi-Fi signal can cause the speech to cut short or crackle.
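To check the signal where the device ends up, ESPHome's `wifi_signal` sensor can report the RSSI to Home Assistant. A small sketch, with the entity name being just an example:

```yaml
# Sketch: reports Wi-Fi signal strength (RSSI, in dBm) once a minute.
sensor:
  - platform: wifi_signal
    name: "Satellite WiFi Signal"   # hypothetical name
    update_interval: 60s
```

Values closer to 0 dBm are stronger, so a reading that drops noticeably after moving the device is a hint to try another spot.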

Thanks! I ordered the one you linked. After the first order, the price is more or less the same on Amazon, so I’ll probably get those as well.
Would it be possible for you to post your YAML, just for reference? Thanks!

It includes some other sensors, but all the connection details are at the start of the YAML.

Thanks, this helps a lot! I’ll try once I receive the boards and post the results here :slight_smile: