How can I make a button play an on-device media file so that it behaves identically to playing a file through the media player from Home Assistant?

I am using the Muse Luxe YAML (which is conceptually very similar to the Voice Assistant PE YAML).

Playing an audio file through the media player from Home Assistant seems to work fairly reliably.

However, I have a few short sounds and chimes that I want to build directly into the ESPHome firmware and play with a button.

So I added “beep”, “SimpliChime” and “BurglarAlarm” under “files” in the “media_player” section, along with buttons to play them. The relevant section:

media_player:
  - platform: speaker
    name: None
    id: luxe_media_player
    internal: false   
#    task_stack_in_psram: true
#    volume_min: 0.5
#    volume_max: 0.8
    announcement_pipeline:
      speaker: luxe_speaker
      format: WAV
      sample_rate: 48000
      num_channels: 1       
    files:
      - id: little_sound
        file: https://github.com/RASPIAUDIO/esphomeLuxe/raw/refs/heads/main/wav/sounds_timer_finished.wav
      - id: beep
        file: http://10.227.4.10:8123/local/beep.wav
      - id: SimpliChime
        file: http://10.227.4.10:8123/local/SimpliChime.wav
      - id: BurglarAlarm
        file: http://10.227.4.10:8123/local/Burglar-alarm-sound.mp3
    on_announcement: 
      - micro_wake_word.stop:
      - wait_until:
          not:
            micro_wake_word.is_running:
      - microphone.mute: luxe_mic
      - lambda: |-
          id(audio_active) = true;
          id(audio_started) = false;
      - if:
          condition:
            lambda: 'return (id(phase) != 2);'
          then:
            - lambda: |-
                if (id(phase) == 1) id(phase) = 2;
            - script.execute: mute_off
            - script.execute: update_led
                     
    on_idle:
      - if:
          condition:
            lambda: 'return id(audio_active) && id(audio_started);'
          then:
            - wait_until:
                and:
                  - not:
                      media_player.is_announcing:
                  - not:
                      voice_assistant.is_running:
            - if:
                condition:
                  lambda: 'return ((id(phase) == 4) || (id(phase) == 2));'
                then:
                  - lambda: |-
                      id(phase) = 1;
            - microphone.unmute: luxe_mic
            - micro_wake_word.start:
            - script.execute: update_led
            - lambda: |-
                id(audio_active) = false;
                id(audio_started) = false;
    on_play:
      - if:
          condition:
            lambda: 'return !id(mute);'
          then:
            - output.turn_off: dac_mute
            - lambda: id(muteH) = false;



button:
  - platform: template
    name: "Beep"
    on_press:
      - media_player.speaker.play_on_device_media_file:
          media_file: beep
          announcement: true
      # Wait until the beep starts playing
      - wait_until:
          media_player.is_announcing:
      # Wait until the beep stops playing
      - wait_until:
          not:
            media_player.is_announcing:
  - platform: template
    name: "SimpliChime"
    on_press:
      - media_player.speaker.play_on_device_media_file:
          media_file: SimpliChime
          announcement: true
      # Wait until the chime starts playing
      - wait_until:
          media_player.is_announcing:
      # Wait until the chime stops playing
      - wait_until:
          not:
            media_player.is_announcing:
  - platform: template
    name: "Burglar Alarm"
    on_press:
      - media_player.speaker.play_on_device_media_file:
          media_file: BurglarAlarm
          announcement: true
      # Wait until the alarm sound starts playing
      - wait_until:
          media_player.is_announcing:
      # Wait until the alarm sound stops playing
      - wait_until:
          not:
            media_player.is_announcing:

This generally works, but when I click the buttons in quick succession the device crashes. This does NOT happen when I play files through the media_player from HA. So it seems that media_player.speaker.play_on_device_media_file behaves differently from sending an audio file via the media player.

How can I make the behavior exactly identical, but play a file built into the ESPHome project instead of streaming the audio?
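
One possible workaround for the rapid-click case, sketched here for the "Beep" button only, is to move the playback into a script with `mode: single`, so that a second press is ignored while the first run is still active. This is an untested sketch, not a confirmed fix for the crash. The script id `play_beep` and the `2s` timeout are assumptions chosen for illustration. The timeout is there so the script cannot block forever if the announcement never starts or finishes before the first check.

```yaml
script:
  - id: play_beep          # hypothetical id, not part of the original config
    mode: single           # a new execution is skipped while one is still running
    then:
      - media_player.speaker.play_on_device_media_file:
          media_file: beep
          announcement: true
      # Wait until the beep starts playing, but give up after 2s (assumed value)
      - wait_until:
          condition:
            media_player.is_announcing:
          timeout: 2s
      # Wait until the beep stops playing
      - wait_until:
          not:
            media_player.is_announcing:

button:
  - platform: template
    name: "Beep"
    on_press:
      - script.execute: play_beep
```

The same pattern would apply to "SimpliChime" and "Burglar Alarm", each with its own script, since `media_file` takes a fixed file id.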