Need help with media player component with speaker platform

I’m banging my head against a wall: I would like to use the media player component with the speaker platform (here), since it’s compatible with the ESP-IDF framework.
I’m unable to find any structured documentation or example, so if I missed something, please let me know.
The issue I have is related to playing media through the media player. All the info I could gather is here, where it says:

Generic media player commands are received by the control function

and

Local file playback is initiated with play_file and adds it to the media_control_command_queue_

So I tried lambda: id(va_media_player).control(play_file(id(timer_finished_wave_file))); and I get error: 'play_file' was not declared in this scope
My second attempt was lambda: id(va_media_player).play_file(id(timer_finished_wave_file)); but I get error: no matching function for call to 'esphome::speaker::SpeakerMediaPlayer::play_file(const uint8_t [26002])'.

Does anyone know how to use it?

The PR you pointed to has not been merged into ESPHome.

Thank you for your reply. I’m a big noob; I only started using HA a few months ago.
Is there a simple way I can use this feature?
I thought that using it as an external component, like:

external_components:
  - source: github://pr#7672
    components:
      - audio
      - speaker
      - i2s_audio
      - speaker
    refresh: 0s

would work, but apparently it doesn’t.

Your full yaml and logs would be needed.

This is the YAML:

esphome:
  name: esp32-s3-my-assistant
  friendly_name: ESP32-S3-My-Assistant
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    - light.turn_on:
        id: led_strip
        blue: 100%
        brightness: 60%
        effect: fast pulse  

esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf

    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
   
psram:
  mode: octal # Please change this to quad for N8R2 and octal for N16R8
  speed: 80MHz

# Enable Home Assistant API
api:
  encryption:
    key: "****************************************"
  on_client_connected:
        then:
          - delay: 50ms
          - light.turn_off: led_strip
          - micro_wake_word.start:
  on_client_disconnected:
        then:
          - voice_assistant.stop: 

logger:

ota:
  - platform: esphome
    password: "********************************"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  reboot_timeout: 15min  
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-S3-Wake-Word"
    password: "**********"
  power_save_mode: none
  enable_on_boot: True
  fast_connect: On
  output_power: 8.5

captive_portal:

button:
  - platform: restart
    name: "Restart"
    id: but_rest

switch:
  - platform: template
    id: mute
    name: mute
    optimistic: true
    on_turn_on: 
      - micro_wake_word.stop:
      - voice_assistant.stop:
      - light.turn_on:
          id: led_strip           
          red: 100%
          green: 0%
          blue: 0%
          brightness: 60%
          effect: fast pulse 
          
      - delay: 2s
      - light.turn_off:
          id: led_strip

      - light.turn_on:
          id: led_strip           
          red: 100%
          green: 0%
          blue: 0%
          brightness: 30%

    on_turn_off:
      - micro_wake_word.start:
      - light.turn_on:
          id: led_strip  
          red: 0%
          green: 100%
          blue: 0%
          brightness: 60%
          effect: fast pulse 
      - delay: 2s
      - light.turn_off:
          id: led_strip
  - platform: template
    id: timer_ringing
    optimistic: true
    internal: False
    name: "Timer Ringing"
    restore_mode: ALWAYS_OFF

binary_sensor:
  - platform: gpio
    id: button01
    name: "Mute Button" # Physical Mute switch
    pin:
      number: GPIO10  #Physical Button connected to this pin.
      inverted: True
      mode:
        input: True
        pullup: True
    on_press: 
      if:
        condition:
          switch.is_on: timer_ringing  #stops timer ringing 
        then:
          - switch.turn_off: timer_ringing
        else:
          - switch.toggle: mute     #mutes the device
   
light:
  - platform: esp32_rmt_led_strip
    id: led_strip
    rgb_order: GRB
    pin: GPIO09
    num_leds: 29
    rmt_channel: 1
    chipset: ws2812
    name: "Led Strip"
    effects:
      - pulse:
      - pulse:
          name: "Fast Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
          min_brightness: 0%
          max_brightness: 100%
      - addressable_scan:
          name: "Scan Effect With Custom Values"
          move_interval: 5ms
          scan_width: 10
        
 # Audio and Voice Assistant Config          
i2s_audio:
  - id: i2s_in # For microphone
    i2s_lrclk_pin: GPIO3  #WS 
    i2s_bclk_pin: GPIO2 #SCK

  - id: i2s_speaker #For Speaker
    i2s_lrclk_pin: GPIO6  #LRC 
    i2s_bclk_pin: GPIO7 #BLCK

microphone:
  - platform: i2s_audio
    id: va_mic
    adc_type: external
    i2s_din_pin: GPIO4 #SD
    channel: left
    pdm: false
    i2s_audio_id: i2s_in
    bits_per_sample: 32bit
    
speaker:
    platform: i2s_audio
    id: va_speaker
    i2s_audio_id: i2s_speaker
    dac_type: external
    i2s_dout_pin: GPIO8   #  DIN Pin of the MAX98357A Audio Amplifier
    channel: mono
    sample_rate: 48000

media_player:
  - platform: speaker
    name: "Speaker Media Player"
    id: va_media_player
    speaker: va_speaker
    sample_rate: 48000

#Wake Word
micro_wake_word:
  on_wake_word_detected:
    
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
        silence_detection: true
    - light.turn_on:
        id: led_strip
        effect: "Scan Effect With Custom Values"
        red: 80%
        green: 0%
        blue: 80%
        brightness: 80%
  models:
    - model: hey_jarvis
    
voice_assistant:
  id: va
  microphone: va_mic
  auto_gain: 31dBFS
  noise_suppression_level: 3
  volume_multiplier: 4.0
  media_player: va_media_player
  on_stt_end:
       then: 
         - light.turn_off: led_strip
  on_error:
          - micro_wake_word.start:  
  on_end:
        then:
          - light.turn_off: led_strip
          - wait_until:
              not:
                voice_assistant.is_running:
          - micro_wake_word.start: 
  
  
  on_timer_finished:
    - micro_wake_word.stop:
    - voice_assistant.stop:

    - wait_until:
        not:
          microphone.is_capturing:

    - wait_until:
        not:
          micro_wake_word.is_running:

    - switch.turn_on: timer_ringing

    - light.turn_on:
        id: led_strip
        effect: "Scan Effect With Custom Values"
        red: 80%
        green: 0%
        blue: 20%
        brightness: 80%
    
    - lambda: id(va_media_player).play_file(id(timer_finished_wave_file));

    - micro_wake_word.start:
    - wait_until:
        and:
          - micro_wake_word.is_running:

    - while:
        condition:
          switch.is_on: timer_ringing
        then:
          - delay: 3s
          - lambda: id(va_media_player).control(play_file(id(timer_finished_wave_file)));      
    - wait_until:
        not:
          speaker.is_playing:
    
    - light.turn_off: led_strip
    - micro_wake_word.start:

external_components:
  - source: github://jesserockz/esphome-components
    components: [file]
  - source: github://pr#7672
    components:
      - audio
      - speaker
      - i2s_audio
      - speaker
    refresh: 0s  

file: 
  - id: timer_finished_wave_file
    file: https://github.com/esphome/firmware/raw/main/voice-assistant/sounds/timer_finished.wav

And the following is the compile log:

INFO ESPHome 2024.12.4
INFO Reading configuration /config/esphome/vs1.yaml...
INFO Updating https://github.com/esphome/esphome.git@pull/7672/head
WARNING GPIO3 is a strapping PIN and should only be used for I/O with care.
Attaching external pullup/down resistors to strapping pins can cause unexpected failures.
See https://esphome.io/guides/faq.html#why-am-i-getting-a-warning-about-strapping-pins
INFO Generating C++ source...
INFO Updating https://github.com/espressif/[email protected]
INFO Updating https://github.com/espressif/[email protected]
INFO Updating https://github.com/espressif/[email protected]
INFO Compiling app...
Processing esp32-s3-my-assistant-7f5600 (board: esp32-s3-devkitc-1; framework: espidf; platform: https://github.com/pioarduino/platform-espressif32.git#51.03.07)
--------------------------------------------------------------------------------
HARDWARE: ESP32S3 240MHz, 320KB RAM, 8MB Flash
 - framework-espidf @ 3.50105.0 (5.1.5) 
 - tool-cmake @ 3.21.3 
 - tool-esptoolpy @ 4.8.1 
 - tool-mklittlefs @ 3.2.0 
 - tool-ninja @ 1.7.1 
 - tool-riscv32-esp-elf-gdb @ 12.1.0+20221002 
 - tool-xtensa-esp-elf-gdb @ 12.1.0+20221002 
 - toolchain-esp32ulp @ 2.35.0-20220830 
 - toolchain-riscv32-esp @ 12.2.0+20230208 
 - toolchain-xtensa-esp32s3 @ 12.2.0+20230208
Reading CMake configuration...
Dependency Graph
|-- esp-audio-libs @ 1.0.0
|-- noise-c @ 0.1.6
|-- ESPMicroSpeechFeatures @ 1.1.0
Compiling .pioenvs/esp32-s3-my-assistant-7f5600/src/main.cpp.o
In file included from src/esphome/components/esp32_rmt_led_strip/led_strip.h:12,
                 from src/esphome.h:25,
                 from src/main.cpp:3:
/data/cache/platformio/packages/framework-espidf@src-6e6cd5244a43bdfe453daac50cfe73b6/components/driver/deprecated/driver/rmt.h:18:2: warning: #warning "The legacy RMT driver is deprecated, please use driver/rmt_tx.h and/or driver/rmt_rx.h" [-Wcpp]
   18 | #warning "The legacy RMT driver is deprecated, please use driver/rmt_tx.h and/or driver/rmt_rx.h"
      |  ^~~~~~~
In file included from src/esphome/components/i2s_audio/i2s_audio.h:5,
                 from src/esphome.h:28:
/data/cache/platformio/packages/framework-espidf@src-6e6cd5244a43bdfe453daac50cfe73b6/components/driver/deprecated/driver/i2s.h:27:2: warning: #warning "This set of I2S APIs has been deprecated, please include 'driver/i2s_std.h', 'driver/i2s_pdm.h' or 'driver/i2s_tdm.h' instead. if you want to keep using the old APIs and ignore this warning, you can enable 'Suppress leagcy driver deprecated warning' option under 'I2S Configuration' menu in Kconfig" [-Wcpp]
   27 | #warning "This set of I2S APIs has been deprecated, \
      |  ^~~~~~~
Compiling .pioenvs/esp32-s3-my-assistant-7f5600/components/esp-tflite-micro/tensorflow/lite/micro/micro_log.cc.o
Compiling .pioenvs/esp32-s3-my-assistant-7f5600/components/esp-tflite-micro/tensorflow/lite/micro/micro_op_resolver.cc.o
In file included from components/esp-tflite-micro/tensorflow/lite/micro/tflite_bridge/flatbuffer_conversions_bridge.h:19,
                 from components/esp-tflite-micro/tensorflow/lite/micro/micro_allocator.h:26,
                 from components/esp-tflite-micro/tensorflow/lite/micro/micro_interpreter.h:26,
                 from src/esphome/components/micro_wake_word/streaming_model.h:8,
                 from src/esphome/components/micro_wake_word/micro_wake_word.h:6,
                 from src/esphome.h:55:
components/esp-tflite-micro/tensorflow/lite/core/api/flatbuffer_conversions.h: In member function 'T* tflite::BuiltinDataAllocator::AllocatePOD()':
components/esp-tflite-micro/tensorflow/lite/core/api/flatbuffer_conversions.h:46:24: warning: 'template<class _Tp> struct std::is_pod' is deprecated: use is_standard_layout && is_trivial instead [-Wdeprecated-declarations]
   46 |     static_assert(std::is_pod<T>::value, "Builtin data structure must be POD.");
      |                        ^~~~~~
In file included from /data/cache/platformio/packages/toolchain-xtensa-esp32s3/xtensa-esp32s3-elf/include/c++/12.2.0/bits/stl_pair.h:60,
                 from /data/cache/platformio/packages/toolchain-xtensa-esp32s3/xtensa-esp32s3-elf/include/c++/12.2.0/bits/stl_algobase.h:64,
                 from /data/cache/platformio/packages/toolchain-xtensa-esp32s3/xtensa-esp32s3-elf/include/c++/12.2.0/deque:60,
                 from src/esphome/components/api/api_frame_helper.h:3,
                 from src/esphome/components/api/api_connection.h:5,
                 from src/esphome.h:3:
/data/cache/platformio/packages/toolchain-xtensa-esp32s3/xtensa-esp32s3-elf/include/c++/12.2.0/type_traits:757:5: note: declared here
  757 |     is_pod
      |     ^~~~~~
Compiling .pioenvs/esp32-s3-my-assistant-7f5600/components/esp-tflite-micro/tensorflow/lite/micro/micro_profiler.cc.o
Compiling .pioenvs/esp32-s3-my-assistant-7f5600/components/esp-tflite-micro/tensorflow/lite/micro/micro_resource_variable.cc.o
/config/esphome/vs1.yaml: In lambda function:
/config/esphome/vs1.yaml:251:33: error: no matching function for call to 'esphome::speaker::SpeakerMediaPlayer::play_file(const uint8_t [26002])'
  251 | 
      |                                 ^                         
In file included from src/esphome/components/speaker/media_player/automation.h:6,
                 from src/esphome.h:82:
src/esphome/components/speaker/media_player/speaker_media_player.h:67:8: note: candidate: 'void esphome::speaker::SpeakerMediaPlayer::play_file(esphome::speaker::MediaFile*, bool)'
   67 |   void play_file(MediaFile *media_file, bool announcement);
      |        ^~~~~~~~~
src/esphome/components/speaker/media_player/speaker_media_player.h:67:8: note:   candidate expects 2 arguments, 1 provided
/config/esphome/vs1.yaml: In lambda function:
/config/esphome/vs1.yaml:263:32: error: 'play_file' was not declared in this scope
  263 |     - wait_until:
      |                                ^        
*** [.pioenvs/esp32-s3-my-assistant-7f5600/src/main.cpp.o] Error 1
========================= [FAILED] Took 11.74 seconds =========================
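
Reading the two errors against the candidate signature quoted in the log: play_file is a member function that expects a MediaFile pointer plus a bool, while id(timer_finished_wave_file) from the external file component is a raw uint8_t array. The following is only a sketch of that mismatch using stand-in types. FakeSpeakerMediaPlayer, the fields of MediaFile, and play_timer_sound are hypothetical names made up for illustration, not the PR's real classes, and wrapping the array by hand like this is not a confirmed way to use the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the PR's MediaFile type. Only the name comes from the log;
// the fields are an assumption made for this sketch.
struct MediaFile {
  const uint8_t *data;
  size_t length;
};

// Stand-in for SpeakerMediaPlayer, reduced to the one member the log quotes.
class FakeSpeakerMediaPlayer {
 public:
  // Same parameter list as the candidate in the log:
  //   void play_file(MediaFile *media_file, bool announcement);
  void play_file(MediaFile *media_file, bool announcement) {
    assert(media_file != nullptr);
    last_length_ = media_file->length;
    last_announcement_ = announcement;
  }
  size_t last_length() const { return last_length_; }
  bool last_announcement() const { return last_announcement_; }

 private:
  size_t last_length_ = 0;
  bool last_announcement_ = false;
};

// What id(timer_finished_wave_file) is according to the error message: a raw
// byte array (26002 bytes in the log, shortened to a 4-byte placeholder here).
static const uint8_t timer_finished_wave_file[4] = {'R', 'I', 'F', 'F'};

// Hypothetical helper showing a call that matches the quoted signature.
inline void play_timer_sound(FakeSpeakerMediaPlayer &player) {
  // player.play_file(timer_finished_wave_file);
  //   rejected: a uint8_t array is not a MediaFile*, and the bool is missing.
  // player.control(play_file(timer_finished_wave_file));
  //   rejected: play_file is a member function, so the bare name is unknown.
  static MediaFile wrapped{timer_finished_wave_file, sizeof(timer_finished_wave_file)};
  player.play_file(&wrapped, /*announcement=*/true);
}
```

So both lambdas fail for the reasons the compiler gives: the first calls play_file as if it were a free function, and the second passes one argument of the wrong type where two are expected.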

If my understanding is correct, you want to be able to use a media player with voice assistant?

If so, I use a modified version of this code, which gives me reliable voice and a media player for announcements etc. It plays music and radio from Music Assistant as well.

Just remembered: you will need to change the framework setting to this for it to compile in the latest ESPHome.

  framework:
    type: esp-idf
    version: 4.4.8
    platform_version: 5.4.0

Thanks for the suggestion. I had a look at the git you provided and tried to modify my YAML accordingly.

I kind of have the same issue, since I need to play a WAV file in the lambda. I don’t see which command I should use to play it.

Going through the git, it seems I should use play_media, but if I try
lambda: id(va_media_player).play_media(id(timer_finished_wave_file));

I get kind of the same error
error: 'class esphome::esp_adf::ADFMediaPlayer' has no member named 'play_media'; did you mean 'play_intent_'?.

I also get another error: components/audio_hal/driver/es8388/es8388.c:29:10: fatal error: board.h: No such file or directory

Sorry, I can’t help with that issue. I played about a bit with that when trying to set up timers, but gave up as I don’t use timers enough to worry about it.

This is a link to the git containing stuff about it if you have not seen it

which then links to this.

You may find something there.

Yes, that was the git I was reading. It has an ADFMediaPlayer::control function that, if I understand correctly, takes media player calls and handles them. But my knowledge is very limited, so I don’t know how to use it.
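
For what it is worth, control is the handler that receives a call object, not something a lambda normally invokes itself. As far as I understand ESPHome's media_player component, a call is built with make_call(), filled in with setters such as set_media_url(), and dispatched with perform(), which then reaches control(). The sketch below only mimics that pattern with simplified stand-ins. SketchMediaPlayer and SketchMediaPlayerCall are hypothetical names, and the URL in the usage note is a placeholder. Note that this route takes a URL, so it would not play the byte array from the file component directly.

```cpp
#include <cassert>
#include <string>

class SketchMediaPlayer;  // defined below

// Simplified stand-in for a media player call: it collects parameters,
// then hands itself to the player when perform() is called.
class SketchMediaPlayerCall {
 public:
  explicit SketchMediaPlayerCall(SketchMediaPlayer *parent) : parent_(parent) {}

  SketchMediaPlayerCall &set_media_url(const std::string &url) {
    media_url_ = url;
    has_media_url_ = true;
    return *this;
  }
  bool has_media_url() const { return has_media_url_; }
  const std::string &media_url() const { return media_url_; }

  void perform();  // defined after SketchMediaPlayer

 private:
  SketchMediaPlayer *parent_;
  std::string media_url_;
  bool has_media_url_ = false;
};

// Simplified stand-in for a media player implementation.
class SketchMediaPlayer {
 public:
  SketchMediaPlayerCall make_call() { return SketchMediaPlayerCall(this); }
  const std::string &last_url() const { return last_url_; }

 protected:
  friend class SketchMediaPlayerCall;
  // Not callable from user code; only reached through perform().
  void control(const SketchMediaPlayerCall &call) {
    if (call.has_media_url()) {
      last_url_ = call.media_url();
    }
  }

 private:
  std::string last_url_;
};

inline void SketchMediaPlayerCall::perform() { parent_->control(*this); }
```

With these stand-ins, player.make_call().set_media_url("http://example.invalid/timer_finished.wav").perform(); ends up in control(), which is the shape a lambda would take if the real component follows the same pattern.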

Mine is limited as well, so we may not find a solution :)

You could ask on the ESPHome Discord channel; the people who know the code are normally there.

Thanks for the hint, I’ll try that