Voice PE → Play Replies on an External Media Player

najs2000 · January 17, 2026, 6:51pm

Hi all.

Been trying for some time to get the Voice PE to output its TTS to an external media player.

I use a Windows PC (old-school HTPC setup) in my living room with a good sound system, and I wanted the output there instead of the crappy built-in speaker.

Before this I had zero knowledge of ESPHome, but I have been running an HA instance for 4-5 years.

I took a lot of inspiration from this thread: Redirect Voice PE Replies to Sonos (Community Guides, Home Assistant Community).

ChatGPT helped me all the way here with the code.

First I installed HASS.Agent on my PC (called sofa) and got it working.
Then I installed ESPHome and imported the Voice PE.

Voice PE → Play Replies on an External Media Player (No Double Audio)

Goal

Build a Voice PE satellite that:

Listens locally (microphone, wake word, LEDs all work normally)
Runs the full Assist pipeline in Home Assistant
Plays TTS replies on an external media player (e.g. media_player.sofa)
Does not speak locally at the same time
Works reliably, without race conditions or ESPHome YAML errors

In this setup, media_player.sofa is a Windows PC running HASS.Agent, exposed to Home Assistant as a media player.

The Robust Solution (Recommended)

Key Idea

Let Voice PE keep generating TTS, but:

Mute the local Voice PE speaker
Capture the generated TTS URL
Hand off playback to Home Assistant
Let HA play the reply on any media player (here: a Windows PC via HASS.Agent)

This is done using:

an input_text helper as a bridge
a Home Assistant script for playback
a small ESPHome override

ESPHome (Voice PE override)

What this does

Ducks the local mixer
Mutes the local Voice PE speaker
Saves the TTS URL into Home Assistant
Triggers the HA playback script
Restores everything afterward

ESPHome YAML (override only)

substitutions:
  name: home-assistant-voice-095a6b
  friendly_name: Home Assistant Voice 095a6b

packages:
  Nabu Casa.Home Assistant Voice PE: github://esphome/home-assistant-voice-pe/home-assistant-voice.yaml

esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}

api:
  encryption:
    key: ******************

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

# ------------------------------------------------------------
# Voice PE reply redirect (NO double audio + no "blip" after idle)
#
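# Home Assistant prerequisites:
# 1) Helper: input_text.voice_pe_tts_url
# 2) Script: script.voice_pe_play_reply_on_sofa
#    (reads input_text.voice_pe_tts_url and plays it on media_player.sofa)
# 3) ESPHome integration option "Allow the device to perform Home Assistant
#    actions" enabled for this device, otherwise the homeassistant.service
#    calls below are not executed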
# ------------------------------------------------------------

voice_assistant:
  # PRE-MUTE EARLY to prevent the "local starts speaking briefly then stops" issue after idle
  on_intent_progress:
    - if:
        condition:
          lambda: 'return !x.empty();'
        then:
          - logger.log:
              level: DEBUG
              format: "Redirect: pre-muting local Voice PE speaker before TTS begins."
          - media_player.volume_set:
              id: external_media_player
              volume: 0.0

  # Duck hard during TTS and ensure local speaker stays muted
  on_tts_start:
    - logger.log:
        level: INFO
        format: "Redirect: ducking local mixer + muting Voice PE speaker."
    - mixer_speaker.apply_ducking:
        id: media_mixing_input
        decibel_reduction: 51
        duration: 0s
    - media_player.volume_set:
        id: external_media_player
        volume: 0.0

  # Save the TTS proxy URL and trigger HA playback on sofa (Windows PC via HASS.Agent)
  on_tts_end:
    - logger.log:
        level: INFO
        format: "Redirect: saving TTS URL to HA helper + starting sofa playback script."
    - homeassistant.service:
        service: input_text.set_value
        data:
          entity_id: input_text.voice_pe_tts_url
          value: !lambda |-
            return x;

    - homeassistant.service:
        service: script.turn_on
        data:
          entity_id: script.voice_pe_play_reply_on_sofa

  # Restore state when pipeline is fully finished
  # (note: the volume goes back to 1.0, not to whatever it was before)
  on_end:
    - wait_until:
        not:
          voice_assistant.is_running:
    - mixer_speaker.apply_ducking:
        id: media_mixing_input
        decibel_reduction: 0
        duration: 0s
    - media_player.volume_set:
        id: external_media_player
        volume: 1.0

I added this to scripts.yaml:

voice_pe_play_reply_on_sofa:
  alias: Voice PE – Play reply on sofa
  mode: restart
  sequence:
  - variables:
      url: '{{ states(''input_text.voice_pe_tts_url'') }}'
  - condition: template
    value_template: '{{ url.startswith(''http'') }}'
  - target:
      entity_id: media_player.sofa
    data:
      media_content_id: '{{ url }}'
      media_content_type: music
    action: media_player.play_media

Added this to configuration.yaml:

input_text:
  voice_pe_tts_url:
    name: Voice PE last TTS URL
    max: 255

Why This Works (and Why Others Fail)

No unsupported ESPHome YAML
No direct media_player hijacking
No timing race conditions
Works with any HA media player that can play a URL through media_player.play_media, for example:
- Sonos
- Music Assistant
- Chromecast
- Windows PC via HASS.Agent
Voice PE remains fully functional as a satellite

Voice PE still believes it is playing locally, but it is muted, while Home Assistant takes over the actual playback.
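
Because the playback step is an ordinary media_player.play_media call, the target player does not have to be hard-coded. The following is a minimal sketch of a script variant that takes the player as a field. The script name voice_pe_play_reply and the field name target_player are made up for illustration and are not part of the original setup; the helper input_text.voice_pe_tts_url is the one defined earlier.

```yaml
# Hypothetical variant of the playback script: the player is passed in
# as a field instead of being fixed to media_player.sofa.
voice_pe_play_reply:
  alias: Voice PE - Play reply on a chosen player
  mode: restart
  fields:
    target_player:
      description: Media player entity that should speak the reply
      example: media_player.sofa
  sequence:
    # Read the URL that the Voice PE stored in the helper
    - variables:
        url: "{{ states('input_text.voice_pe_tts_url') }}"
    # Stop here if the helper does not hold a usable URL
    - condition: template
      value_template: "{{ url.startswith('http') }}"
    # Play the reply on whichever player was passed in
    - action: media_player.play_media
      target:
        entity_id: "{{ target_player }}"
      data:
        media_content_id: "{{ url }}"
        media_content_type: music
```

With a script like this, the ESPHome side would call service: script.voice_pe_play_reply directly and put target_player: media_player.sofa (or any other player) under data:, since fields passed to a script service call show up as variables inside the script.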

Result

You end up with a clean, professional Voice Assistant setup:

One device listens
Another device speaks
No echo
No hacks
No flakiness

This is effectively how commercial multi-room assistants work, just implemented with full local control.

If you want, this approach can easily be extended to:

restore exact previous volume
room-aware replies
multi-room announcements
Music Assistant ducking
LED sync with external playback
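
As an example of the first item, here is a rough sketch of how the previous volume could be remembered and restored instead of always going back to 1.0. It is untested and rests on a few assumptions: saved_volume is a hypothetical global that does not exist in the Voice PE package, the media player object exposes its current level as .volume in lambdas, and the package merge appends these actions to the package's own on_start and on_end lists. The on_end block here is meant to replace the one in the override above, not to be added next to it.

```yaml
# Sketch only: saved_volume is a hypothetical name, not part of the Voice PE package.
globals:
  - id: saved_volume
    type: float
    restore_value: no
    initial_value: '1.0'

voice_assistant:
  # Remember the current volume when a pipeline run starts.
  # Skip the save if the player is already at zero (for example, still muted
  # from a previous reply), so the stored value is never overwritten with 0.
  on_start:
    - lambda: |-
        float current = id(external_media_player).volume;
        if (current > 0.0f) {
          id(saved_volume) = current;
        }

  # Restore the remembered volume instead of a fixed 1.0.
  on_end:
    - wait_until:
        not:
          voice_assistant.is_running:
    - mixer_speaker.apply_ducking:
        id: media_mixing_input
        decibel_reduction: 0
        duration: 0s
    - media_player.volume_set:
        id: external_media_player
        volume: !lambda 'return id(saved_volume);'
```

The guard in on_start matters because the override mutes the player early in the run; without it, a second run that starts before the restore has happened would save 0 and the speaker would stay silent afterward.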

But as-is, this is already a production-grade solution.

stncttr908 · February 11, 2026, 3:33pm

Would this work with the “$13 Voice Assistant” aka M5 Atom Echo? Thanks for the great work.

dCatfish · March 3, 2026, 8:30pm

Thank you so much! This is awesome. I skipped the HASS.Agent part; I just needed my Home Assistant Voice PE to stop stuttering when replying (because it drove me crazy). Now it plays TTS replies on my Google Nest Mini, or whatever media_player.* I want, with perfect flow and sound. Yess, I love it, VERY NAJS!!!