M5stack M5CoreS3 SE with MicroWakeWord

Got MicroWakeWord working on the M5CoreS3 SE. Speaker output is garbage; the microphone works pretty well. This might be helpful to anyone who already has one, but I wouldn’t recommend buying one for this at the moment, especially with the Seeed ReSpeaker Lite announcement and Nabu Casa announcing that their voice assistant will have an XMOS chip in it like the ReSpeaker Lite. I had timers and text on the screen working, but it kept locking up or stopped responding, so I removed it. I also didn’t install ESP-ADF, yet the tensorflow/lite part still runs. I guess it’s not required? Maybe someone could clear that up for me. I saw the ReSpeaker Lite’s yaml didn’t have it either, but I assumed that was because it’s using the XMOS chip, for obvious reasons.

substitutions:
  name: m5cores3
  friendly_name: M5CoreS3
  loading_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/loading_320_240.png
  idle_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/idle_320_240.png
  listening_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/listening_320_240.png
  thinking_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/thinking_320_240.png
  replying_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/replying_320_240.png
  error_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/error_320_240.png

  loading_illustration_background_color: '000000'
  idle_illustration_background_color: '000000'
  listening_illustration_background_color: 'FFFFFF'
  thinking_illustration_background_color: 'FFFFFF'
  replying_illustration_background_color: 'FFFFFF'
  error_illustration_background_color: '000000'

  voice_assist_idle_phase_id: '1'
  voice_assist_listening_phase_id: '2'
  voice_assist_thinking_phase_id: '3'
  voice_assist_replying_phase_id: '4'
  voice_assist_not_ready_phase_id: '10'
  voice_assist_error_phase_id: '11'  
  voice_assist_muted_phase_id: '12'

  micro_wake_word_model: hey_jarvis  

esphome:
  name: m5cores3
  friendly_name: M5CoreS3
  platformio_options:
    board_build.flash_mode: dio
    #board_upload.maximum_size: 16777216
    #board_build.f_cpu : 240000000L    
  libraries:
    - m5stack/M5GFX@^0.1.11
    - m5stack/M5Unified@^0.1.11
  on_boot:
    priority: 600
    then:
      - delay: 30s            
      - if:
          condition:
            - lambda: return id(init_in_progress);
          then:
            - lambda: id(init_in_progress) = false;      
      - script.execute: draw_display  # Initialize the display
      - delay: 1s  # Wait a moment to ensure the display is ready
      - switch.turn_on: switch_lcd_backlight  # Turn on the backlight switch to set the initial state

esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB
  framework:
    type: esp-idf
    sdkconfig_options:
      # need to set a s3 compatible board for the adf-sdk to compile
      # board specific code is not used though
      CONFIG_ESP32_S3_BOX_BOARD: "y"
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB:      "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B:  "y"

      #CONFIG_LOG_DEFAULT_LEVEL_DEBUG: "y"
      #CONFIG_LOG_DEFAULT_LEVEL: "4"

psram:
  mode: quad
  speed: 80MHz

external_components:
  - source:
      type: git
      url: https://github.com/m5stack/M5CoreS3-Esphome
    components: [ board_m5cores3, m5cores3_audio, m5cores3_display ]
    refresh: 0s
  #- source: github://pr#5230
  #  components: esp_adf
 #   refresh: 0s        
  #- source: github://jesserockz/esphome-components
  #  components: [file]
  #  refresh: 0s
text_sensor:
  - platform: wifi_info
    ip_address:
      name: "${friendly_name} IP Address"
time:
  - platform: homeassistant
    id: homeassistant_time
# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "APIKEY"
  on_client_connected:
    - script.execute: draw_display
  on_client_disconnected:
    - script.execute: draw_display

ota:
  - platform: esphome
    password: "OTAPW"

wifi:
  ssid: !secret wifi_ssid 
  password: !secret wifi_password
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "M5Stack-Cores3 Fallback Hotspot"
    password: "HOTSTPOTPW"
  on_connect:
    - script.execute: draw_display
   # - delay: 5s # Gives time for improv results to be transmitted 
  on_disconnect:
    - script.execute: draw_display   



captive_portal:

# 
# 
# Globals
# 
globals:
  - id: init_in_progress
    type: bool
    restore_value: no
    initial_value: 'true'
  - id: voice_assistant_phase
    type: int
    restore_value: no
    initial_value: ${voice_assist_not_ready_phase_id}

#
# Temp values for the LCD brightness slider
#
number:
  - platform: template
    name: "Display Brightness"
    id: display_brightness
    entity_category: config
    optimistic: true
    min_value: 0
    max_value: 200
    step: 1
    unit_of_measurement: "%"
    initial_value: 127
    icon: "mdi:brightness-6"
    set_action:
      - lambda: |-
          int brightness = (int)(x * 2.55);  
          M5.Display.setBrightness(brightness);
          

# 
# Display
# 
script:
  - id: draw_display
    then:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - if:
                condition:
                  wifi.connected:
                then:
                  - if:
                      condition:
                        api.connected:
                      then:
                        - lambda: |
                            switch(id(voice_assistant_phase)) {
                              case ${voice_assist_listening_phase_id}:
                                id(m5cores3_lcd).show_page(listening_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_thinking_phase_id}:
                                id(m5cores3_lcd).show_page(thinking_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_replying_phase_id}:
                                id(m5cores3_lcd).show_page(replying_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_error_phase_id}:
                                id(m5cores3_lcd).show_page(error_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_muted_phase_id}:
                                id(m5cores3_lcd).show_page(muted_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_not_ready_phase_id}:
                                id(m5cores3_lcd).show_page(no_ha_page);
                                id(m5cores3_lcd).update();
                                break;
                              default:
                                id(m5cores3_lcd).show_page(idle_page);
                                id(m5cores3_lcd).update();
                            }
                      else:
                        - display.page.show: no_ha_page
                        - component.update: m5cores3_lcd
                else:
                  - display.page.show: no_wifi_page
                  - component.update: m5cores3_lcd
          else:
            - display.page.show: initializing_page
            - component.update: m5cores3_lcd

image:
  - file: ${error_illustration_file}
    id: casita_error
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${idle_illustration_file}
    id: casita_idle
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${listening_illustration_file}
    id: casita_listening
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${thinking_illustration_file}
    id: casita_thinking
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${replying_illustration_file}
    id: casita_replying
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${loading_illustration_file}
    id: casita_initializing
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: https://github.com/esphome/firmware/raw/main/voice-assistant/error_box_illustrations/error-no-wifi.png
    id: error_no_wifi
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: https://github.com/esphome/firmware/raw/main/voice-assistant/error_box_illustrations/error-no-ha.png
    id: error_no_ha
    resize: 320x240
    type: RGB24
    use_transparency: true

color:
  - id: idle_color
    hex: ${idle_illustration_background_color}
  - id: listening_color
    hex: ${listening_illustration_background_color}
  - id: thinking_color
    hex: ${thinking_illustration_background_color}
  - id: replying_color
    hex: ${replying_illustration_background_color}
  - id: loading_color
    hex: ${loading_illustration_background_color}
  - id: error_color
    hex: ${error_illustration_background_color}

display:
  - platform: m5cores3_display
    model: ILI9342
    dc_pin: 35
    update_interval: never
    id: m5cores3_lcd
    pages:
      - id: idle_page
        lambda: |-
          it.fill(id(idle_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_idle), ImageAlign::CENTER);
      - id: listening_page
        lambda: |-
          it.fill(id(listening_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_listening), ImageAlign::CENTER);
      - id: thinking_page
        lambda: |-
          it.fill(id(thinking_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_thinking), ImageAlign::CENTER);
      - id: replying_page
        lambda: |-
          it.fill(id(replying_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_replying), ImageAlign::CENTER);
      - id: error_page
        lambda: |-
          it.fill(id(error_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_error), ImageAlign::CENTER);
      - id: no_ha_page
        lambda: |-
          it.image((it.get_width() / 2), (it.get_height() / 2), id(error_no_ha), ImageAlign::CENTER);
      - id: no_wifi_page
        lambda: |-
          it.image((it.get_width() / 2), (it.get_height() / 2), id(error_no_wifi), ImageAlign::CENTER);
      - id: initializing_page
        lambda: |-
          it.fill(id(loading_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_initializing), ImageAlign::CENTER);
      - id: muted_page
        lambda: |-
          it.fill(Color::BLACK);


# 
# Audio
# 
#esp_adf:
board_m5cores3:
m5cores3_audio:
  id: m5cores3_audio_1

microphone:
  - platform: m5cores3_audio
    m5cores3_audio_id: m5cores3_audio_1
    id: m5cores3_mic
    adc_type: external
    i2s_din_pin: 14
    pdm: false

speaker:
  - platform: m5cores3_audio
    m5cores3_audio_id: m5cores3_audio_1
    id: m5cores3_spk
    dac_type: external
    i2s_dout_pin: 13
    mode: mono

# 
# VA
# 
micro_wake_word:
  models: ${micro_wake_word_model}
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
        
voice_assistant:
  id: va
  microphone: m5cores3_mic
  speaker: m5cores3_spk
  use_wake_word: true
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  #vad_threshold: 3
  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - script.execute: draw_display
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - script.execute: draw_display
  on_tts_stream_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - script.execute: draw_display
  on_tts_stream_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: draw_display
  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};  
          - script.execute: draw_display
          - delay: 1s
          - if:
              condition:
                switch.is_off: mute
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: draw_display
  on_client_connected: 
    - if:
        condition:
          switch.is_off: mute
        then:
          - voice_assistant.start_continuous:
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - lambda: id(init_in_progress) = false; 
    - script.execute: draw_display
  on_client_disconnected:
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};  
    - script.execute: draw_display

select:
  - platform: template
    entity_category: config
    name: Wake word engine location
    id: wake_word_engine_location
    optimistic: true
    restore_value: true
    options:
      - In Home Assistant
      - On device
    initial_option: On device
    on_value:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - wait_until:
                lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
            - if:
                condition:
                  lambda: return x == "In Home Assistant";
                then:
                  - micro_wake_word.stop
                  - delay: 30ms
                  - if:
                      condition:
                        switch.is_off: mute
                      then:
                        - lambda: id(va).set_use_wake_word(true);
                        - voice_assistant.start_continuous:
            - if:
                condition:
                  lambda: return x == "On device";
                then:
                  - lambda: id(va).set_use_wake_word(false);
                  - voice_assistant.stop
                  - delay: 30ms
                  - if:
                      condition:
                        switch.is_off: mute
                      then:
                        - micro_wake_word.start      
      


switch:
- platform: template
  name: "LCD Backlight"
  id: switch_lcd_backlight
  restore_mode: "ALWAYS_ON"
  turn_on_action:
    - lambda: |-
        int brightness = id(display_brightness).state; 
        M5.Display.setBrightness(brightness);          
        id(switch_lcd_backlight).publish_state(true);
  turn_off_action:
    - lambda: |-
        M5.Display.setBrightness(0);                  
        id(switch_lcd_backlight).publish_state(false);
- platform: template
  name: Mute
  id: mute
  optimistic: true
  restore_mode: RESTORE_DEFAULT_OFF
  entity_category: config
  on_turn_off:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:      
          - lambda: id(va).set_use_wake_word(true);
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
          - if:
              condition:
                not:
                  - voice_assistant.is_running
              then:
                - voice_assistant.start_continuous
          - script.execute: draw_display
  on_turn_on:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:      
          - voice_assistant.stop
          - lambda: id(va).set_use_wake_word(false);
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: draw_display

@ginandbacon Thank you for sharing your ESPHome config for the CoreS3 SE. I’ve managed to get wake word working on my Atom Echo but so far had no luck with the CoreS3.

Flashing was successful; however, should I expect to see something on the screen after installing this config? Looking at the console logs via USB I can see the device has connected to WiFi, and I’ve added it to HA’s ESPHome Devices along with my encryption key, but so far the device looks dead externally.

I take it you added it to HA after adding it to ESPHome, correct? It should reboot after flashing and show a black background with a face like the S3-Box; it will turn red if it can’t connect to HA, and show something else if it can’t connect to WiFi. I just updated mine to the latest non-beta ESPHome version and it works. I was also trying to get alarms and text-to-speech to display on screen, but it kept acting way too “wonky”, so I may have left something in there that I should have removed. I do know the config below works.

Does the device show as red even when it’s on and has an IP? I found out that if you add the project and version yourself, it does this. I might have taken it out; it’s just something I ran across with this and my Korvo-1, which was odd until I figured it out, since everything worked, it just showed as red in the ESPHome add-on.
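For reference, the “project and version” mentioned above presumably means a manually declared project block under esphome:. A minimal sketch, with placeholder name/version values (the name has to be in the vendor.project format):

esphome:
  name: m5cores3
  friendly_name: M5CoreS3
  project:
    name: "m5stack.cores3-voice-assistant"  # placeholder identifier
    version: "1.0"                          # placeholder version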

Try this. I was having some random issues myself; I think something at the beginning was out of order or causing issues, that or leaving captive_portal: in there when not using arduino for the platform (I am not a developer). The one odd random thing I had is that it stopped doing OTA updates; if I plug it into a PC there are no issues, which is why I commented out those lines at the beginning.

It should show something, unless it doesn’t have internet access: at the beginning it’s downloading the images. There is a way to do this without downloading them every time; I think if you create an images folder in the esphome folder you can just reference them as loading_illustration_file: images/image.jpg (it’s on these forums somewhere).
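A minimal sketch of that approach, assuming the PNGs have already been copied into an images/ folder inside the ESPHome config directory (the folder and file names here are placeholders):

substitutions:
  loading_illustration_file: images/loading_320_240.png
  idle_illustration_file: images/idle_320_240.png
  listening_illustration_file: images/listening_320_240.png
  thinking_illustration_file: images/thinking_320_240.png
  replying_illustration_file: images/replying_320_240.png
  error_illustration_file: images/error_320_240.png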

OTA Error:

INFO Uploading /data/build/m5cores3/.pioenvs/m5cores3/firmware.bin (3502752 bytes)
ERROR Error receiving acknowledge binary size: timed out

substitutions:
  name: m5cores3
  friendly_name: M5CoreS3
  loading_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/loading_320_240.png
  idle_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/idle_320_240.png
  listening_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/listening_320_240.png
  thinking_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/thinking_320_240.png
  replying_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/replying_320_240.png
  error_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/error_320_240.png

  loading_illustration_background_color: '000000'
  idle_illustration_background_color: '000000'
  listening_illustration_background_color: 'FFFFFF'
  thinking_illustration_background_color: 'FFFFFF'
  replying_illustration_background_color: 'FFFFFF'
  error_illustration_background_color: '000000'

  voice_assist_idle_phase_id: '1'
  voice_assist_listening_phase_id: '2'
  voice_assist_thinking_phase_id: '3'
  voice_assist_replying_phase_id: '4'
  voice_assist_not_ready_phase_id: '10'
  voice_assist_error_phase_id: '11'  
  voice_assist_muted_phase_id: '12'

  micro_wake_word_model: hey_jarvis  

esphome:
  name: m5cores3
  friendly_name: M5CoreS3
  platformio_options:
    board_build.flash_mode: dio
    #board_upload.maximum_size: 16777216
    #board_build.f_cpu : 240000000L    
  libraries:
    - m5stack/M5GFX@^0.1.11
    - m5stack/M5Unified@^0.1.11
  on_boot:
    priority: 600
    then:
      - delay: 30s            
      - if:
          condition:
            - lambda: return id(init_in_progress);
          then:
            - lambda: id(init_in_progress) = false;      
      - script.execute: draw_display  # Initialize the display
      - delay: 1s  # Wait a moment to ensure the display is ready
      - switch.turn_on: switch_lcd_backlight  # Turn on the backlight switch to set the initial state

esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB
  framework:
    type: esp-idf
    sdkconfig_options:
      # need to set a s3 compatible board for the adf-sdk to compile
      # board specific code is not used though  
      CONFIG_ESP32_S3_BOX_BOARD: "y"
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB:      "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B:  "y"

      #CONFIG_LOG_DEFAULT_LEVEL_DEBUG: "y"
      #CONFIG_LOG_DEFAULT_LEVEL: "4"

psram:
  mode: quad
  speed: 80MHz

external_components:
  - source:
      type: git
      url: https://github.com/m5stack/M5CoreS3-Esphome
    components: [ board_m5cores3, m5cores3_audio, m5cores3_display ]
    refresh: 0s
  #- source: github://pr#5230
  #  components: esp_adf
 #   refresh: 0s        
  #- source: github://jesserockz/esphome-components
  #  components: [file]
  #  refresh: 0s


# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "API"
    
ota:
  - platform: esphome
    id: my_ota
    password: "OTA"

wifi:
  ssid: !secret wifi_ssid 
  password: !secret wifi_password
  on_connect:
    then:
      - delay: 20ms # Gives time for improv results to be transmitted
      - ble.disable:
      - script.execute: draw_display        
  on_disconnect:
    then:
      - ble.enable:  
      - script.execute: draw_display

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "M5Stack-Cores3 Fallback Hotspot"
    password: "HOTSPOT"

improv_serial:

esp32_improv:
  authorizer: none  
# 
# 
# Globals
# 
globals:
  - id: init_in_progress
    type: bool
    restore_value: no
    initial_value: 'true'
  - id: voice_assistant_phase
    type: int
    restore_value: no
    initial_value: ${voice_assist_not_ready_phase_id}

time:
  - platform: homeassistant
    id: homeassistant_time    

text_sensor:
  - platform: wifi_info
    ip_address:
      name: "${friendly_name} IP Address"    

#
# Temp values for the LCD brightness slider
#
number:
  - platform: template
    name: "Display Brightness"
    id: display_brightness
    entity_category: config
    optimistic: true
    min_value: 0
    max_value: 200
    step: 1
    unit_of_measurement: "%"
    initial_value: 127
    icon: "mdi:brightness-6"
    set_action:
      - lambda: |-
          int brightness = (int)(x * 2.55);  
          M5.Display.setBrightness(brightness);
          

# 
# Display
# 
script:
  - id: draw_display
    then:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - if:
                condition:
                  wifi.connected:
                then:
                  - if:
                      condition:
                        api.connected:
                      then:
                        - lambda: |
                            switch(id(voice_assistant_phase)) {
                              case ${voice_assist_listening_phase_id}:
                                id(m5cores3_lcd).show_page(listening_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_thinking_phase_id}:
                                id(m5cores3_lcd).show_page(thinking_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_replying_phase_id}:
                                id(m5cores3_lcd).show_page(replying_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_error_phase_id}:
                                id(m5cores3_lcd).show_page(error_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_muted_phase_id}:
                                id(m5cores3_lcd).show_page(muted_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_not_ready_phase_id}:
                                id(m5cores3_lcd).show_page(no_ha_page);
                                id(m5cores3_lcd).update();
                                break;
                              default:
                                id(m5cores3_lcd).show_page(idle_page);
                                id(m5cores3_lcd).update();
                            }
                      else:
                        - display.page.show: no_ha_page
                        - component.update: m5cores3_lcd
                else:
                  - display.page.show: no_wifi_page
                  - component.update: m5cores3_lcd
          else:
            - display.page.show: initializing_page
            - component.update: m5cores3_lcd

image:
  - file: ${error_illustration_file}
    id: casita_error
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${idle_illustration_file}
    id: casita_idle
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${listening_illustration_file}
    id: casita_listening
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${thinking_illustration_file}
    id: casita_thinking
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${replying_illustration_file}
    id: casita_replying
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${loading_illustration_file}
    id: casita_initializing
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: https://github.com/esphome/firmware/raw/main/voice-assistant/error_box_illustrations/error-no-wifi.png
    id: error_no_wifi
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: https://github.com/esphome/firmware/raw/main/voice-assistant/error_box_illustrations/error-no-ha.png
    id: error_no_ha
    resize: 320x240
    type: RGB24
    use_transparency: true

color:
  - id: idle_color
    hex: ${idle_illustration_background_color}
  - id: listening_color
    hex: ${listening_illustration_background_color}
  - id: thinking_color
    hex: ${thinking_illustration_background_color}
  - id: replying_color
    hex: ${replying_illustration_background_color}
  - id: loading_color
    hex: ${loading_illustration_background_color}
  - id: error_color
    hex: ${error_illustration_background_color}

display:
  - platform: m5cores3_display
    model: ILI9342
    dc_pin: 35
    update_interval: never
    id: m5cores3_lcd
    pages:
      - id: idle_page
        lambda: |-
          it.fill(id(idle_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_idle), ImageAlign::CENTER);
      - id: listening_page
        lambda: |-
          it.fill(id(listening_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_listening), ImageAlign::CENTER);
      - id: thinking_page
        lambda: |-
          it.fill(id(thinking_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_thinking), ImageAlign::CENTER);
      - id: replying_page
        lambda: |-
          it.fill(id(replying_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_replying), ImageAlign::CENTER);
      - id: error_page
        lambda: |-
          it.fill(id(error_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_error), ImageAlign::CENTER);
      - id: no_ha_page
        lambda: |-
          it.image((it.get_width() / 2), (it.get_height() / 2), id(error_no_ha), ImageAlign::CENTER);
      - id: no_wifi_page
        lambda: |-
          it.image((it.get_width() / 2), (it.get_height() / 2), id(error_no_wifi), ImageAlign::CENTER);
      - id: initializing_page
        lambda: |-
          it.fill(id(loading_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_initializing), ImageAlign::CENTER);
      - id: muted_page
        lambda: |-
          it.fill(Color::BLACK);


# 
# Audio
# 
#esp_adf:
board_m5cores3:
m5cores3_audio:
  id: m5cores3_audio_1

microphone:
  - platform: m5cores3_audio
    m5cores3_audio_id: m5cores3_audio_1
    id: m5cores3_mic
    adc_type: external
    i2s_din_pin: 14
    pdm: false

speaker:
  - platform: m5cores3_audio
    m5cores3_audio_id: m5cores3_audio_1
    id: m5cores3_spk
    dac_type: external
    i2s_dout_pin: 13
    mode: mono

# 
# VA
# 
micro_wake_word:
  models: ${micro_wake_word_model}
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
        
voice_assistant:
  id: va
  microphone: m5cores3_mic
  speaker: m5cores3_spk
  use_wake_word: true
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  #vad_threshold: 3
  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - script.execute: draw_display
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - script.execute: draw_display
  on_tts_stream_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - script.execute: draw_display
  on_tts_stream_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: draw_display
  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};  
          - script.execute: draw_display
          - delay: 1s
          - if:
              condition:
                switch.is_off: mute
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: draw_display
  on_client_connected: 
    - if:
        condition:
          switch.is_off: mute
        then:
          - voice_assistant.start_continuous:
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - lambda: id(init_in_progress) = false; 
    - script.execute: draw_display
  on_client_disconnected:
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};  
    - script.execute: draw_display

select:
  - platform: template
    entity_category: config
    name: Wake word engine location
    id: wake_word_engine_location
    optimistic: true
    restore_value: true
    options:
      - In Home Assistant
      - On device
    initial_option: On device
    on_value:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - wait_until:
                lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
            - if:
                condition:
                  lambda: return x == "In Home Assistant";
                then:
                  - micro_wake_word.stop
                  - delay: 30ms
                  - if:
                      condition:
                        switch.is_off: mute
                      then:
                        - lambda: id(va).set_use_wake_word(true);
                        - voice_assistant.start_continuous:
            - if:
                condition:
                  lambda: return x == "On device";
                then:
                  - lambda: id(va).set_use_wake_word(false);
                  - voice_assistant.stop
                  - delay: 30ms
                  - if:
                      condition:
                        switch.is_off: mute
                      then:
                        - micro_wake_word.start      
      


switch:
- platform: template
  name: "LCD Backlight"
  id: switch_lcd_backlight
  restore_mode: "ALWAYS_ON"
  turn_on_action:
    - lambda: |-
        int brightness = id(display_brightness).state; 
        M5.Display.setBrightness(brightness);          
        id(switch_lcd_backlight).publish_state(true);
  turn_off_action:
    - lambda: |-
        M5.Display.setBrightness(0);                  
        id(switch_lcd_backlight).publish_state(false);
- platform: template
  name: Mute
  id: mute
  optimistic: true
  restore_mode: RESTORE_DEFAULT_OFF
  entity_category: config
  on_turn_off:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:      
          - lambda: id(va).set_use_wake_word(true);
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
          - if:
              condition:
                not:
                  - voice_assistant.is_running
              then:
                - voice_assistant.start_continuous
          - script.execute: draw_display
  on_turn_on:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:      
          - voice_assistant.stop
          - lambda: id(va).set_use_wake_word(false);
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: draw_display

It seems we have a couple of updates here I’m still trying to navigate:

- the board name has changed to m5stack-cores3
- the platform name for the display has changed in ESPHome:

  display:
    - platform: ili9xxx
      model: ILI9342

- This requires the spi component to be specified:

  spi:
    clk_pin: GPIO15
    mosi_pin: GPIO11
    id: display_spi_bus

- the spi bus should be referenced under display as spi_id: display_spi_bus
- it also requires the option invert_colors: true/false to be specified under display (see the combined sketch below)
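Putting those points together, a rough combined sketch (the SPI pins are the ones listed above, the dc_pin comes from the earlier configs, invert_colors may need flipping for your panel, and a cs_pin may also be required depending on the board wiring):

spi:
  id: display_spi_bus
  clk_pin: GPIO15
  mosi_pin: GPIO11

display:
  - platform: ili9xxx
    model: ILI9342
    id: m5cores3_lcd
    spi_id: display_spi_bus
    dc_pin: 35
    invert_colors: true
    update_interval: never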

does it make sense? I’m still trying to get it updated and discover a couple of those differences to get my CoreS3 working

EDIT: Apparently the issue is somehow not being able to load external components :thinking:

EDIT2: I’ve opened this issue with M5Stack so they can update the validate_pillow_installed dependency in their component; I’m still testing whether my fork will work for my use case.

I got the CoreS3 SE working as a voice assistant.
Looking at the number of different external components from multiple sources, I’d call it a FrankenVoice Assistant :wink:

Here’s my repo wake-word-voice-assistants/m5stack-coreS3-se at main · luka6000/wake-word-voice-assistants · GitHub
Treat it as a work in progress but things working so far:

  • speaker
  • microphone
  • adf pipelines
  • micro wake word with 3 models (nabu, jarvis, vad)
  • voice assistant
  • lcd screen
  • software mute switch
  • assist images
  • international font glyphs
  • backlight
  • touchscreen

So it’s pretty much the full package. Let me know how that works for you.

Hi @luka6000
I tried your yaml and it doesn’t build on ESPHome 2025.2.0.
It would be great if you could update it.

Thanks a lot!

Error:

Processing m5s3-voice (board: esp32s3box; framework: espidf; platform: platformio/[email protected])
--------------------------------------------------------------------------------
HARDWARE: ESP32S3 240MHz, 320KB RAM, 16MB Flash
 - framework-espidf @ 3.40408.0 (4.4.8) 
 - tool-cmake @ 3.16.9 
 - tool-ninja @ 1.10.2 
 - toolchain-esp32ulp @ 2.35.0-20220830 
 - toolchain-riscv32-esp @ 8.4.0+2021r2-patch5 
 - toolchain-xtensa-esp32s3 @ 8.4.0+2021r2-patch5
Reading CMake configuration...
Dependency Graph
|-- Improv @ 1.2.4
|-- esp-audio-libs @ 1.1.1
|-- ESPMicroSpeechFeatures @ 1.1.0
Compiling .pioenvs/m5s3-voice/src/main.o
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/micro_log.o
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/micro_op_resolver.o
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/micro_profiler.o
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/micro_resource_variable.o
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/micro_time.o
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/micro_utils.o
src/main.cpp: In function 'void setup()':
src/main.cpp:778:55: error: invalid new-expression of abstract class type 'esphome::aw9523::AW9523GPIOPin'
   aw9523_aw9523gpiopin_id = new aw9523::AW9523GPIOPin();
                                                       ^
In file included from src/esphome.h:20,
                 from src/main.cpp:3:
src/esphome/components/aw9523/aw9523.h:25:7: note:   because the following virtual functions are pure within 'esphome::aw9523::AW9523GPIOPin':
 class AW9523GPIOPin : public GPIOPin
       ^~~~~~~~~~~~~
In file included from src/esphome/core/hal.h:4,
                 from src/esphome/components/light/light_transformer.h:4,
                 from src/esphome/components/light/light_state.h:11,
                 from src/esphome/core/controller.h:11,
                 from src/esphome/components/api/api_server.h:11,
                 from src/esphome/components/api/api_connection.h:8,
                 from src/esphome.h:3,
                 from src/main.cpp:3:
src/esphome/core/gpio.h:61:23: note: 	'virtual esphome::gpio::Flags esphome::GPIOPin::get_flags() const'
   virtual gpio::Flags get_flags() const = 0;
                       ^~~~~~~~~
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/mock_micro_graph.o
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/recording_micro_allocator.o
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/system_setup.o
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/test_helper_custom_ops.o

Yes, I have a dependency on external components that are not yet updated to 2025.2. Nevertheless, I think I’ve managed to get it compiling. I’ve just pushed new code, so give it a try and let me know how it goes.


FYI, in case you aren’t already aware, they added native support for the ES7210 (the audio ADC used for the microphones). They moved completely off ESP-ADF in the last release, although I’m sure you are already aware of this; not that it can’t still be used.

This release includes support for a number of new audio-related components/hardware. These are primarily aimed at supporting hardware found in Espressif’s S3-Box series of products, eliminating the need to use the ESP-ADF and thus offering better integration with ESPHome in general. If you’re using an S3-Box (or one of the variants), we strongly recommend updating your device either OTA or by using our Ready-Made Projects web installer. If you have “taken control” of or “adopted” your S3-Box, we strongly recommend updating your device’s local configuration based on our updated configuration files found here.

In addition, new speaker components have been introduced to provide more advanced functionality when using Voice Assistant. These components extend our work to help you create the ultimate personal voice assistant hardware.

The new speaker media player component adds several features for building a well-rounded audio device. It supports playing two different streams of audio: one for announcements and another for music.

The new mixer speaker component lets you combine the two streams. The mixer even supports audio ducking, so you can lower the volume of the music while your announcement plays!
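For the CoreS3 specifically, a rough sketch of what an ESP-ADF-free audio input section might look like with the native components (the es7210 audio_adc platform plus an i2s_audio microphone). The I2S clock/MCLK pins and the ES7210 options below are assumptions to verify against the CoreS3 schematic and the current ESPHome docs, and an i2c: bus must also be configured for the ES7210:

i2s_audio:
  - id: i2s_shared
    i2s_lrclk_pin: GPIO33   # assumption, check the CoreS3 schematic
    i2s_bclk_pin: GPIO34    # assumption
    i2s_mclk_pin: GPIO0     # assumption

audio_adc:
  - platform: es7210        # microphone ADC on the CoreS3, configured over I2C
    id: es7210_adc
    bit_depth: 16
    sample_rate: 16000

microphone:
  - platform: i2s_audio
    id: core_s3_mic
    i2s_audio_id: i2s_shared
    adc_type: external
    i2s_din_pin: GPIO14     # same data-in pin as the configs earlier in this thread
    pdm: false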

@luka6000, I tried your updated file, but I get an error near the tensorflow part.

Compiling .pioenvs/m5s3-voice/components/esp-nn/src/pooling/esp_nn_max_pool_s8_esp32s3.o
Compiling .pioenvs/m5s3-voice/components/esp-nn/src/pooling/esp_nn_avg_pool_s8_esp32s3.o
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/debug_log.o
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/fake_micro_context.o
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/flatbuffer_utils.o
Archiving .pioenvs/m5s3-voice/esp-idf/esp-nn/libesp-nn.a
Compiling .pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/memory_helpers.o
xtensa-esp32s3-elf-g++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
*** [.pioenvs/m5s3-voice/src/main.o] Error 1
xtensa-esp32s3-elf-g++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
*** [.pioenvs/m5s3-voice/components/esp-tflite-micro/tensorflow/lite/micro/fake_micro_context.o] Error 1
========================= [FAILED] Took 301.95 seconds =========================

any ideas ?

There is nothing in this part of the log.
It’s as if you’d terminated the process. Have you?

Have you changed anything in the yaml?
Just checked 2025.2.1 - still working.

Linking .pioenvs/m5s3-voice/firmware.elf
RAM:   [=         ]  10.1% (used 32948 bytes from 327680 bytes)
Flash: [=====     ]  50.0% (used 4064721 bytes from 8126464 bytes)
Building .pioenvs/m5s3-voice/firmware.bin
Creating esp32s3 image...
Successfully created esp32s3 image.
esp32_create_combined_bin([".pioenvs/m5s3-voice/firmware.bin"], [".pioenvs/m5s3-voice/firmware.elf"])
Wrote 0x3f0740 bytes to file GitHub/wake-word-voice-assistants/m5stack-coreS3-se/.esphome/build/m5s3-voice/.pioenvs/m5s3-voice/firmware.factory.bin, ready to flash to offset 0x0
esp32_copy_ota_bin([".pioenvs/m5s3-voice/firmware.bin"], [".pioenvs/m5s3-voice/firmware.elf"])
========================================================================================================================================== [SUCCESS] Took 74.69 seconds ==========================================================================================================================================
INFO Successfully compiled program.
lu@mac m5stack-coreS3-se % esphome --version                         
Version: 2025.2.1

I didn’t stop the process, and I used your yaml as-is at first (same error); now I’ve added an API key, but I get the same error.

I just saw an update for the ESPHome Builder; I will update it and try again.

Also, do you build using the Home Assistant add-on? On a Raspberry Pi? I’m using a Pi 4.
Can I remove the tensorflow component?

I also found this:

Ah, ok. So you are running out of resources on the Pi, probably memory. I’m using my Mac for this, and my HAss box is a PC with 8 GB of memory, so no issues there either.
Tensorflow is used by microWakeWord, which is the whole point of on-device wake word detection, so you don’t want to remove that. Just use any computer you have; ESPHome works on Windows, Linux and Mac, so pretty much everywhere. If you can’t do that, turn off all the add-ons you have in HAss on the Pi and try again. If that doesn’t do it, you could even turn off Home Assistant for the duration of the compilation, but if you don’t know what I’m talking about, it’s probably better to stay off the Pi for compiling.

I can confirm that it was a resource problem.

I added:

esphome:
  compile_process_limit: 1

It took ages to build on the Pi 4, but it built fine.

Thanks for your support!

I started this thread but have moved on as far as voice assistants go. Just an FYI: in the last ESPHome update they got rid of ESP-ADF on the devices they support. The M5Stack CoreS3 has an ES7210 with dual microphone inputs, so that code can be used now. Not sure about the DAC for audio output, though. They also removed support for armv7 (Pi 3 and below) for the ESPHome Builder.

This release includes support for a number of new audio-related components/hardware. These are primarily aimed at supporting hardware found in Espressif’s S3-Box series of products, eliminating the need to use the ESP-ADF and thus offering better integration with ESPHome in general. If you’re using an S3-Box (or one of the variants), we strongly recommend updating your device either OTA or by using our Ready-Made Projects web installer. If you have “taken control” of or “adopted” your S3-Box, we strongly recommend updating your device’s local configuration based on our updated configuration files found here.