Record sound from I2S microphone

Hi community!
Is it possible to record sound from an I2S microphone connected to ESPHome and save the recording, e.g. to the media folder?

Hello,

I thought about the same thing today :wink:

I already have a decibel meter built with an ESP32 and an I2S mic.

I think it's possible, because Assist already records a query and converts it to text.

I’m wondering how to do anything with i²s. I bought a set of i2s microphones and I’ve been trying to get them to work with ESPHome but I haven’t had any luck.

I created this thread a while back and no one has responded to it. :frowning:

Hi there!

You need an INMP441: https://fr.aliexpress.com/item/1005004789925958.html?spm=a2g0o.order_list.order_list_main.49.5d235e5b0WjKGX&gatewayAdapt=glo2fra

and any ESP32, because it doesn't work with an ESP8266.

It's working like a charm with this config:

esphome:
  name: decibel-meter-advanced
  friendly_name: decibel-meter-advanced

esp32:
  board: wemos_d1_mini32
  framework:
    type: arduino

external_components:
  - source: github://stas-sl/esphome-sound-level-meter

# Enable logging
logger:
  level: DEBUG

# Enable Home Assistant API
api:
  encryption:
    key: "exxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx="

ota:
  password: "dxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxf"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_pwd
  manual_ip:
    static_ip: 192.168.1.xxx
    gateway: 192.168.1.xxx
    subnet: 255.255.255.0
    dns1: 192.168.1.xxx
  fast_connect: on
  power_save_mode: none


i2s:
  bck_pin: GPIO5
  ws_pin: GPIO25
  din_pin: GPIO26
  sample_rate: 48000            # default: 48000
  bits_per_sample: 32           # default: 32
  dma_buf_count: 8              # default: 8
  dma_buf_len: 256              # default: 256
  use_apll: true                # default: false

  # right shift samples.
  # for example if mic has 24 bit resolution, and
  # i2s configured as 32 bits, then audio data will be aligned left (MSB)
  # and LSB will be padded with zeros, so you might want to shift them right by 8 bits
  bits_shift: 8                 # default: 0

sound_level_meter:
  id: sound_level_meter1

  # update_interval specifies over which interval to aggregate audio data
  # you can specify default update_interval on top level, but you can also override
  # it further by specifying it on sensor level
  update_interval: 60s           # default: 60s

  # you can disable (turn off) component by default (on boot)
  # and turn it on later when needed via sound_level_meter.turn_on/toggle actions
  is_on: true                    # default: true

  # buffer_size is in samples (not bytes), so for float data type
  # number of bytes will be buffer_size * 4
  buffer_size: 1024             # default: 1024

  # ignore audio data at startup for this long
  warmup_interval: 500ms        # default: 500ms

  # audio processing runs in a separate task, you can change its settings below
  task_stack_size: 4096         # default: 4096
  task_priority: 2              # default: 2
  task_core: 1                  # default: 1

  # see your mic datasheet to find sensitivity and reference SPL.
  # those are used to convert dB FS to dB SPL
  mic_sensitivity: -26dB        # default: empty
  mic_sensitivity_ref: 94dB     # default: empty
  # additional offset if needed
  offset: 0dB                   # default: empty

  # for flexibility sensors are organized hierarchically into groups. each group
  # could have any number of filters, sensors and nested groups.
  # for example, if there is a top level group A with filter A and nested group B
  # with filter B, then for sensors inside group B filters A and then B will be
  # applied:
  # groups:
  #   # group A
  #   - filters:
  #       - filter A
  #     groups:
  #       # group B
  #       - filters:
  #           - filter B
  #         sensors:
  #           - sensor X
  groups:
    # group 1 (mic eq)
    - filters:
        # for now only SOS filter type is supported, see math/filter-design.ipynb
        # to learn how to create or convert other filter types to SOS
        - type: sos
          coeffs:
            # INMP441:
            #      b0            b1           b2          a1            a2
            - [ 1.0019784 , -1.9908513  , 0.9889158 , -1.9951786  , 0.99518436]

      # nested groups
      groups:
        # group 1.1 (no weighting)
        - sensors:
            # 'eq' type sensor calculates Leq (average) sound level over specified period
            - type: eq
              name: LZeq_1s
              id: LZeq_1s
              # you can override update_interval specified on top level
              # individually per each sensor
              update_interval: 1s

            # you can have as many sensors of same type, but with different
            # other parameters (e.g. update_interval) as needed
            - type: eq
              name: LZeq_1min
              id: LZeq_1min
              unit_of_measurement: dBZ

            # 'max' sensor type calculates Lmax with specified window_size.
            # for example, if update_interval is 60s and window_size is 1s
            # then it will calculate 60 Leq values for each second of audio data
            # and the result will be max of them
            - type: max
              name: LZmax_1s_1min
              id: LZmax_1s_1min
              window_size: 1s
              unit_of_measurement: dBZ

            # same as 'max', but 'min'
            - type: min
              name: LZmin_1s_1min
              id: LZmin_1s_1min
              window_size: 1s
              unit_of_measurement: dBZ

            # it finds max single sample over whole update_interval
            - type: peak
              name: LZpeak_1min
              id: LZpeak_1min
              unit_of_measurement: dBZ

        # group 1.2 (A-weighting)
        - filters:
            # for now only SOS filter type is supported, see math/filter-design.ipynb
            # to learn how to create or convert other filter types to SOS
            - type: sos
              coeffs:
                # A-weighting:
                #       b0           b1            b2             a1            a2
                - [ 0.16999495 ,  0.741029   ,  0.52548885 , -0.11321865 , -0.056549273]
                - [ 1.         , -2.00027    ,  1.0002706  , -0.03433284 , -0.79215795 ]
                - [ 1.         , -0.709303   , -0.29071867 , -1.9822421  ,  0.9822986  ]
          sensors:
            - type: eq
              name: LAeq_1min
              id: LAeq_1min
              unit_of_measurement: dBA
            - type: max
              name: LAmax_1s_1min
              id: LAmax_1s_1min
              window_size: 1s
              unit_of_measurement: dBA
            - type: min
              name: LAmin_1s_1min
              id: LAmin_1s_1min
              window_size: 1s
              unit_of_measurement: dBA
            - type: peak
              name: LApeak_1min
              id: LApeak_1min
              unit_of_measurement: dBA

        # group 1.3 (C-weighting)
        - filters:
            # for now only SOS filter type is supported, see math/filter-design.ipynb
            # to learn how to create or convert other filter types to SOS
            - type: sos
              coeffs:
                # C-weighting:
                #       b0             b1             b2             a1             a2
                - [-0.49651518  , -0.12296628  , -0.0076134163, -0.37165618   , 0.03453208  ]
                - [ 1.          ,  1.3294908   ,  0.44188643  ,  1.2312505    , 0.37899444  ]
                - [ 1.          , -2.          ,  1.          , -1.9946145    , 0.9946217   ]
          sensors:
            - type: eq
              name: LCeq_1min
              id: LCeq_1min
              unit_of_measurement: dBC
            - type: max
              name: LCmax_1s_1min
              id: LCmax_1s_1min
              window_size: 1s
              unit_of_measurement: dBC
            - type: min
              name: LCmin_1s_1min
              id: LCmin_1s_1min
              window_size: 1s
              unit_of_measurement: dBC
            - type: peak
              name: LCpeak_1min
              id: LCpeak_1min
              unit_of_measurement: dBC


# automation
# available actions:
#   - sound_level_meter.turn_on
#   - sound_level_meter.turn_off
#   - sound_level_meter.toggle
switch:
  - platform: template
    name: "Sound Level Meter Switch"
    lambda: |-
      return id(sound_level_meter1).is_on();
    turn_on_action:
      then: sound_level_meter.turn_on
    turn_off_action:
      then: sound_level_meter.turn_off
    restore_mode: RESTORE_DEFAULT_ON
  - platform: restart
    name: "Restart Decibel Metre Advance"

status_led:
  pin:
    number: GPIO2 # ESP32 onboard LED
    inverted: true

binary_sensor:
  - platform: status
    name: "Status Wifi Decibel Metre Advance"

sensor:
  - platform: internal_temperature
    name: "Internal Temperature Decibel Metre" 

ENJOY :wink: @Joe3
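
For anyone wondering what the `eq`, `max`, `min` and `peak` sensor types in the config above are computing, here is a rough Python sketch based only on the descriptions in the config comments. It is not the component's actual code: the function names and the toy signal are made up for illustration, and levels are in dB relative to full scale with no microphone calibration applied.

```python
import math

def leq_db(samples):
    """Average (Leq) level of a block of samples in the range [-1, 1]."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")

def window_levels(samples, window_size):
    """Leq of each consecutive, non-overlapping window of window_size samples."""
    return [
        leq_db(samples[i:i + window_size])
        for i in range(0, len(samples) - window_size + 1, window_size)
    ]

def peak_db(samples):
    """Level of the largest single sample in the block."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

# Toy signal: one quiet window followed by one loud window (8 samples each).
quiet = [0.01, -0.01] * 4
loud = [0.5, -0.5] * 4
signal = quiet + loud

levels = window_levels(signal, window_size=8)
l_eq = leq_db(signal)      # 'eq': average over the whole update interval
l_max = max(levels)        # 'max': loudest window
l_min = min(levels)        # 'min': quietest window
l_peak = peak_db(signal)   # 'peak': largest single sample
```

Note that the average sits between the quietest and loudest window, which is why a short loud event can show up clearly in `LZmax_1s_1min` while barely moving `LZeq_1min`.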

Nah, it works. I have both types of i²s sensors actually. Eventually, I got some responses and we figured it out.

Thanks for posting.

It also works with a simple config (YAML) which I found here https://github.com/stas-sl/esphome-sound-level-meter

esphome:
  name: esp32-02
  platform: esp32
  board: esp32dev

external_components:
  - source: github://stas-sl/esphome-sound-level-meter

# Enable logging
logger:
  level: DEBUG

# Enable Home Assistant API
api:
  encryption:
    key: " xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
            
ota:
  password: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
                     
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  fast_connect: true
  manual_ip:
    static_ip: 192.168.178.xx
    gateway: 192.168.178.x
    subnet: 255.255.255.0

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-02 Fallback Hotspot"
    password: "eVv4rVmMxxfi"

captive_portal:

# ---------------------------

# INMP441 I2S MEMS microphone
# VDD  3V3
# GND  GND
# SD   D26
# L/R  GND
# WS   D25
# SCK  D5

status_led:
  pin: 2

i2s:
  bck_pin: GPIO5         # SCK
  ws_pin: GPIO25         # WS
  din_pin: GPIO26        # SD
  sample_rate: 48000            # default: 48000
  bits_per_sample: 32           # default: 32

  # right shift samples.
  # for example if mic has 24 bit resolution, and i2s configured as 32 bits,
  # then audio data will be aligned left (MSB) and LSB will be padded with
  # zeros, so you might want to shift them right by 8 bits
  
  bits_shift: 8                 # default: 0

sound_level_meter:
  id: sound_level_meter1

  # update_interval specifies over which interval to aggregate audio data
  # you can specify default update_interval on top level, but you can also
  # override it further by specifying it on sensor level

  update_interval: 1s           # default: 60s

  # buffer_size is in samples (not bytes), so for float data type
  # number of bytes will be buffer_size * 4

  buffer_size: 1024             # default: 1024

  # see your mic datasheet to find sensitivity and reference SPL.
  # those are used to convert dB FS to dB SPL

  mic_sensitivity: -26dB        # default: empty
  mic_sensitivity_ref: 94dB     # default: empty

  # for flexibility sensors are organized hierarchically into groups.
  # each group can have any number of filters, sensors and nested groups.
  # for example, if there is a top level group A with filter A and nested
  # group B with filter B, then for sensors inside group B filters A
  # and then B will be applied:
  # groups:
  #   # group A
  #   - filters:
  #       - filter A
  #     groups:
  #       # group B
  #       - filters:
  #           - filter B
  #         sensors:
  #           - sensor X

  groups:
    - sensors:
        - type: eq
          name: Leq_1s

# automation
# available actions:
#   - sound_level_meter.turn_on
#   - sound_level_meter.turn_off
#   - sound_level_meter.toggle

switch:
  - platform: template
    name: "Decibel Meter ON / OFF"
    lambda: |-
      return id(sound_level_meter1).is_on();
    turn_on_action:
      then: sound_level_meter.turn_on
    turn_off_action:
      then: sound_level_meter.turn_off
    restore_mode: RESTORE_DEFAULT_ON
  - platform: restart
    name: "Decibel Meter Reset"

sensor:

  - platform: wifi_signal
    name: "WiFi ESP32 02"
    update_interval: 60s  

  - platform: internal_temperature
    name: "Decibel Meter Temperature" 

binary_sensor:
  - platform: status
    name: "Decibel Meter WiFi"

I added the ON/OFF switch, internal temperature, reset switch and WiFi status from the advanced script to the simple script, and it worked without any failures.
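
In case the `bits_shift`, `mic_sensitivity` and `mic_sensitivity_ref` settings look cryptic, here is a small Python sketch of the arithmetic the config comments describe: shifting a left-aligned 32-bit word down to 24 bits, then offsetting dB FS so that the datasheet reference tone reads as its known SPL. This is only the basic textbook conversion under the values used in the config; the component itself may apply further corrections, and the function names and sample value are hypothetical.

```python
import math

BITS_SHIFT = 8                  # as in the config: drop the 8 zero padding bits
MIC_SENSITIVITY_DB = -26.0      # dB FS the mic outputs for the reference tone
MIC_SENSITIVITY_REF_DB = 94.0   # dB SPL of that reference tone
FULL_SCALE = 2 ** 23            # largest magnitude of a signed 24-bit sample

def to_24_bit(raw_32_bit_sample):
    """Right-shift a left-aligned 32-bit I2S word down to its 24 significant bits."""
    return raw_32_bit_sample >> BITS_SHIFT

def dbfs_to_dbspl(db_fs, offset_db=0.0):
    """Offset a dB FS level so the reference tone reads as its known SPL."""
    return db_fs - MIC_SENSITIVITY_DB + MIC_SENSITIVITY_REF_DB + offset_db

raw = 0x12345600                 # made-up raw word with 8 zero padding bits
sample = to_24_bit(raw)          # 0x123456
level_fs = 20 * math.log10(abs(sample) / FULL_SCALE)
level_spl = dbfs_to_dbspl(level_fs)
```

With these numbers a full-scale signal (0 dB FS) would map to 120 dB SPL, and the -26 dB FS reference tone maps back to 94 dB SPL.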

Posting your config would be helpful :slight_smile:

Look in the other thread: MSM261S4030H0 as ESPHome i2s_audio microphone?

OK, so esphome-sound-level-meter is working like a charm right now, but is there a way to record / stream the captured audio to Home Assistant to be played in any media player entity?

Thanks in advance

I’m trying to adapt this sound meter code to the M5 Atom Echo, which has an I2S microphone, to troubleshoot whether the embedded microphone is working at all (with a view to using it as a voice assistant).

I tried to map the GPIOs from the Atom (ref1, ref2) to those in the sound meter template, but I’m unsure which pin maps to which variable.
The code compiles and runs, but the output in the logs is always -inf, so I’m not sure whether I mapped the GPIOs incorrectly or something else is wrong.

[13:23:51][D][sound_level_meter:120][sound_level_met]: Processing time per 1s of audio data (48000 samples): 21 ms
[13:23:52][D][sensor:094]: 'Leq_1s': Sending state -inf dB with 2 decimals of accuracy

So I can’t tell whether the microphone in this unit is bad, as presumed, whether the code is wrong somewhere, or whether there’s some incompatibility between the sound meter library and the Atom Lite base.

  • Any pointers on what to look for?
  • Anyone with success on this platform?

Thanks!

Code
substitutions:
  name: atom-sound-meter
  friendly_name: Sound Meter

esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  name_add_mac_suffix: true
  project:
    name: m5stack.atom-echo-sound-meter
    version: "1.0.1"
  min_version: 2023.11.1

esp32:
  board: m5stack-atom
  framework:
    type: esp-idf

wifi:
  ssid: !mySSID
  password: !myPassword
  use_address: !myIP
  on_connect:
    - delay: 5s  # Gives time for improv results to be transmitted
    - ble.disable:
  on_disconnect:
    - ble.enable:

api:
logger:
ota:
  safe_mode: true

improv_serial:

esp32_improv:
  authorizer: none

button:
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset

external_components:
  - source: github://stas-sl/esphome-sound-level-meter
  - source: github://pr#5230
    components:
      - esp_adf
    refresh: 0s

esp_adf:

status_led:
  pin: 27

i2s:
  bck_pin: GPIO19         # SCK
  ws_pin: GPIO33        # WS
  din_pin: GPIO23        # SD
  sample_rate: 48000            # default: 48000
  bits_per_sample: 32           # default: 32

  # right shift samples.
  # for example if mic has 24 bit resolution, and i2s configured as 32 bits,
  # then audio data will be aligned left (MSB) and LSB will be padded with
  # zeros, so you might want to shift them right by 8 bits
  
  bits_shift: 8                 # default: 0

sound_level_meter:
  id: sound_level_meter1

  # update_interval specifies over which interval to aggregate audio data
  # you can specify default update_interval on top level, but you can also
  # override it further by specifying it on sensor level

  update_interval: 1s           # default: 60s

  # buffer_size is in samples (not bytes), so for float data type
  # number of bytes will be buffer_size * 4

  buffer_size: 1024             # default: 1024

  # see your mic datasheet to find sensitivity and reference SPL.
  # those are used to convert dB FS to dB SPL

  mic_sensitivity: -22dB        # default: empty
  mic_sensitivity_ref: 94dB     # default: empty

  groups:
    - sensors:
        - type: eq
          name: Leq_1s

switch:
  - platform: template
    name: "Decibel Meter ON / OFF"
    lambda: |-
      return id(sound_level_meter1).is_on();
    turn_on_action:
      then: sound_level_meter.turn_on
    turn_off_action:
      then: sound_level_meter.turn_off
    restore_mode: RESTORE_DEFAULT_ON
  - platform: restart
    name: "Decibel Meter Reset"

sensor:
  - platform: wifi_signal
    name: "WiFi ESP32 02"
    update_interval: 60s  

  - platform: internal_temperature
    name: "Decibel Meter Temperature" 

binary_sensor:
  - platform: status
    name: "Decibel Meter WiFi Status"
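
On the -inf readings in the logs above: a level calculation returns -inf when the mean power of the samples is exactly zero, so one plausible explanation (an assumption, not a confirmed diagnosis) is that only zeros are arriving on `din_pin`. It may also be worth checking the microphone type: as far as I know the Atom Echo's built-in mic is a PDM type rather than a standard I2S one, which could lead to the same symptom with this setup. The Python sketch below only illustrates the arithmetic; `diagnose` and the test data are hypothetical helpers, not part of the component.

```python
import math

def leq_db(samples):
    """Leq in dB FS; zero power is mapped to -inf because log10(0) is undefined."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square)

def diagnose(samples):
    """Very rough classification of a captured buffer."""
    if all(s == 0 for s in samples):
        return "all zeros: no data on the data pin (check pins, channel, mic type)"
    if len(set(samples)) == 1:
        return "constant value: line stuck, likely wiring or format mismatch"
    return "signal present"

silent = [0] * 1024                                    # what a dead input looks like
live = [0.001 * ((i % 7) - 3) for i in range(1024)]    # small made-up test signal

print(leq_db(silent), "->", diagnose(silent))
print(round(leq_db(live), 1), "->", diagnose(live))
```

Even a very quiet room should give a finite (if low) value, so a steady -inf points at the input path rather than at the sound level itself.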

What about a waterfall visualization?
Real-time FFT is cool, but sometimes events happen when we’re not looking at the screen, so it would be great to have a feature like this.
Maybe showing the latest x seconds or hours, and streaming everything to hassio to store it.
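
To make the waterfall idea a bit more concrete, here is a minimal Python sketch of the data structure behind such a view: a fixed-length buffer that keeps the most recent spectra, one row per audio frame. Nothing like this exists in the configs in this thread as far as I can tell; the `Waterfall` class, the naive DFT and the test tones are all hypothetical, and a real device would use a proper FFT.

```python
import cmath
import math
from collections import deque

def magnitude_spectrum_db(frame):
    """Naive DFT of one frame; returns dB magnitudes for bins 0..N/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        magnitude = abs(acc) / n
        # Clamp near-silent bins to a floor instead of taking log10 of ~0.
        spectrum.append(20 * math.log10(magnitude) if magnitude > 1e-12 else -240.0)
    return spectrum

class Waterfall:
    """Keeps the most recent max_rows spectra; the oldest row is dropped first."""

    def __init__(self, max_rows):
        self.rows = deque(maxlen=max_rows)

    def push(self, frame):
        self.rows.append(magnitude_spectrum_db(frame))

N = 32  # samples per frame

def tone(bin_index):
    """Sine wave that falls exactly on one DFT bin."""
    return [math.sin(2 * math.pi * bin_index * t / N) for t in range(N)]

waterfall = Waterfall(max_rows=3)
for b in (2, 4, 6, 8):
    waterfall.push(tone(b))

# Strongest bin in each stored row; the first frame has already been dropped.
loudest_bins = [row.index(max(row)) for row in waterfall.rows]
```

Each row could then be drawn as one line of a heat map, with `max_rows` deciding how much history stays visible.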