"Media not found" - Music Assistant/View Assistant/Local Voice Assistant

Hello, I am using Home Assistant, Wyoming-Whisper, Wyoming-Piper, Wyoming-Openwakeword, and Music Assistant in Docker on Fedora. I have followed the official Home Assistant documentation for the Wyoming containers, the official View Assist docs, and the Music Assistant documentation. The music provider I am using is Jellyfin. I have the Jellyfin integration set up in Home Assistant, in addition to adding it to Music Assistant.

I am running HA v2025.11.3, MA v2.6.3, VA 2025.11.2, VACA v0.8.1, Jellyfin v10.11.4, and whatever the latest Wyoming Docker containers are. My Satellite device is an Android 9 tablet on an IoT network that has no WAN access.

I am able to play music in Jellyfin and in Music Assistant in the browser, but I cannot get it to play on the Satellite. I am able to control light switches with the Satellite. If I access Music Assistant from a web browser on the Satellite, I can get music to play in that browser.

When I try to speak to the Satellite itself and say something like “Play some Billy Joel” it responds “Sorry, I couldn’t understand that”.

Docker log output from Whisper (HA and the others do not seem to print a log for this):

homeassistant-whisper      | INFO:faster_whisper:Processing audio with duration 00:02.920
homeassistant-whisper      | INFO:wyoming_faster_whisper.faster_whisper_handler: Play some Billy Joel

Voice Assistant debug log for voice command:

stage: done
run:
  pipeline: 01hj9x2z6gk3wcdrzqadjvmdrh
  language: en
  conversation_id: 02JBF9B2GJYF9YQJV8PTSXPFAR
  satellite_id: assist_satellite.vaca_29a31de2a
  tts_output:
    token: J5-Ux0QzOUka5uAV1GGYaQ.wav
    url: /api/tts_proxy/J5-Ux0QzOUka5uAV1GGYaQ.wav
    mime_type: audio/x-wav
    stream_response: false
events:
  - type: run-start
    data:
      pipeline: 01hj9x2z6gk3wcdrzqadjvmdrh
      language: en
      conversation_id: 02JBF9B2GJYF9YQJV8PTSXPFAR
      satellite_id: assist_satellite.vaca_29a31de2a
      tts_output:
        token: J5-Ux0QzOUka5uAV1GGYaQ.wav
        url: /api/tts_proxy/J5-Ux0QzOUka5uAV1GGYaQ.wav
        mime_type: audio/x-wav
        stream_response: false
    timestamp: "2025-12-01T06:22:13.522582+00:00"
  - type: stt-start
    data:
      engine: stt.faster_whisper
      metadata:
        language: en
        format: wav
        codec: pcm
        bit_rate: 16
        sample_rate: 16000
        channel: 1
    timestamp: "2025-12-01T06:22:13.522654+00:00"
  - type: stt-vad-start
    data:
      timestamp: 1050
    timestamp: "2025-12-01T06:22:14.734299+00:00"
  - type: stt-vad-end
    data:
      timestamp: 3080
    timestamp: "2025-12-01T06:22:16.725141+00:00"
  - type: stt-end
    data:
      stt_output:
        text: " Play some Billy Joel"
    timestamp: "2025-12-01T06:22:17.272255+00:00"
  - type: intent-start
    data:
      engine: conversation.home_assistant
      language: en
      intent_input: " Play some Billy Joel"
      conversation_id: 02JBF9B2GJYF9YQJV8PTSXPFAR
      device_id: 58f1b9d1531cf84a808ea261d0ffb420
      satellite_id: assist_satellite.vaca_29a31de2a
      prefer_local_intents: false
    timestamp: "2025-12-01T06:22:17.272593+00:00"
  - type: intent-end
    data:
      processed_locally: true
      intent_output:
        response:
          speech:
            plain:
              speech: Sorry, I couldn't understand that
              extra_data: null
          card: {}
          language: en
          response_type: error
          data:
            code: no_valid_targets
        conversation_id: 02JBF9B2GJYF9YQJV8PTSXPFAR
        continue_conversation: false
    timestamp: "2025-12-01T06:22:17.292450+00:00"
  - type: tts-start
    data:
      engine: tts.piper
      language: en_US
      voice: en_US-hfc_female-medium
      tts_input: Sorry, I couldn't understand that
      acknowledge_override: false
    timestamp: "2025-12-01T06:22:17.292659+00:00"
  - type: tts-end
    data:
      tts_output:
        media_id: media-source://tts/-stream-/J5-Ux0QzOUka5uAV1GGYaQ.wav
        token: J5-Ux0QzOUka5uAV1GGYaQ.wav
        url: /api/tts_proxy/J5-Ux0QzOUka5uAV1GGYaQ.wav
        mime_type: audio/x-wav
    timestamp: "2025-12-01T06:22:17.293338+00:00"
  - type: run-end
    data: null
    timestamp: "2025-12-01T06:22:17.293404+00:00"
stt:
  engine: stt.faster_whisper
  metadata:
    language: en
    format: wav
    codec: pcm
    bit_rate: 16
    sample_rate: 16000
    channel: 1
  done: true
  stt_output:
    text: " Play some Billy Joel"
intent:
  engine: conversation.home_assistant
  language: en
  intent_input: " Play some Billy Joel"
  conversation_id: 02JBF9B2GJYF9YQJV8PTSXPFAR
  device_id: 58f1b9d1531cf84a808ea261d0ffb420
  satellite_id: assist_satellite.vaca_29a31de2a
  prefer_local_intents: false
  done: true
  processed_locally: true
  intent_output:
    response:
      speech:
        plain:
          speech: Sorry, I couldn't understand that
          extra_data: null
      card: {}
      language: en
      response_type: error
      data:
        code: no_valid_targets
    conversation_id: 02JBF9B2GJYF9YQJV8PTSXPFAR
    continue_conversation: false
tts:
  engine: tts.piper
  language: en_US
  voice: en_US-hfc_female-medium
  tts_input: Sorry, I couldn't understand that
  acknowledge_override: false
  done: true
  tts_output:
    media_id: media-source://tts/-stream-/J5-Ux0QzOUka5uAV1GGYaQ.wav
    token: J5-Ux0QzOUka5uAV1GGYaQ.wav
    url: /api/tts_proxy/J5-Ux0QzOUka5uAV1GGYaQ.wav
    mime_type: audio/x-wav

When I try to use the Home Assistant web UI and type the same request to the voice assistant (I'm not sure what this is officially called), it goes like this:

HA: How can I assist?
Me: Play some Billy Joel on VACA Tablet Media Player
HA: Media not found

Home Assistant log after typing command:

Logger: homeassistant.helpers.template
Source: helpers/template/__init__.py:2333
First occurred: 12:25:24 AM (5 occurrences)
Last logged: 1:32:58 AM

Template variable warning: 'dict object' has no attribute 'media' when rendering '{% if slots.media: %} Playing media {% else: %} Media not found {% endif %}'

Docker log after typing command:

homeassistant              | 2025-12-01 01:32:58.698 WARNING (MainThread) [homeassistant.helpers.template] Template variable warning: 'dict object' has no attribute 'media' when rendering '{% if slots.media: %}
homeassistant              | Playing media
homeassistant              | {% else: %}
homeassistant              | Media not found
homeassistant              | {% endif %}'
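
Looking at that warning, the template dereferences `slots.media` when no `media` slot was matched for the utterance. For reference, a defined-check is the standard Jinja way to guard an optional slot; this is just an illustrative sketch of the pattern, not the actual View Assist template:

```jinja
{# Hypothetical guarded version of the logged template #}
{% if slots.media is defined and slots.media %}
Playing media
{% else %}
Media not found
{% endif %}
```

With the guard in place the template renders "Media not found" cleanly instead of logging a warning when the slot is absent.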

Voice Assistant debug log for text:

stage: done
run:
  pipeline: 01hj9x2z6gk3wcdrzqadjvmdrh
  language: en
  conversation_id: 01KBF9KF0A9DQNC25RXMM683JP
  runner_data:
    stt_binary_handler_id: null
    timeout: 300
events:
  - type: run-start
    data:
      pipeline: 01hj9x2z6gk3wcdrzqadjvmdrh
      language: en
      conversation_id: 01KBF9KF0A9DQNC25RXMM683JP
      runner_data:
        stt_binary_handler_id: null
        timeout: 300
    timestamp: "2025-12-01T06:26:48.458949+00:00"
  - type: intent-start
    data:
      engine: conversation.home_assistant
      language: en
      intent_input: Play some Billy Joel on VACA Tablet Media Player
      conversation_id: 01KBF9KF0A9DQNC25RXMM683JP
      device_id: null
      satellite_id: null
      prefer_local_intents: false
    timestamp: "2025-12-01T06:26:48.458975+00:00"
  - type: intent-end
    data:
      processed_locally: true
      intent_output:
        response:
          speech:
            plain:
              speech: Media not found
              extra_data: null
          card: {}
          language: en
          response_type: action_done
          data:
            targets: []
            success: []
            failed: []
        conversation_id: 01KBF9KF0A9DQNC25RXMM683JP
        continue_conversation: false
    timestamp: "2025-12-01T06:26:50.475817+00:00"
  - type: run-end
    data: null
    timestamp: "2025-12-01T06:26:50.475879+00:00"
intent:
  engine: conversation.home_assistant
  language: en
  intent_input: Play some Billy Joel on VACA Tablet Media Player
  conversation_id: 01KBF9KF0A9DQNC25RXMM683JP
  device_id: null
  satellite_id: null
  prefer_local_intents: false
  done: true
  processed_locally: true
  intent_output:
    response:
      speech:
        plain:
          speech: Media not found
          extra_data: null
      card: {}
      language: en
      response_type: action_done
      data:
        targets: []
        success: []
        failed: []
    conversation_id: 01KBF9KF0A9DQNC25RXMM683JP
    continue_conversation: false

I’m not sure if this is related, but I am seeing this in my HA logs as well:

Logger: frontend.js.modern.202511051
Source: components/system_log/__init__.py:331
First occurred: 1:03:39 AM (6 occurrences)
Last logged: 1:52:55 AM

Uncaught error from Chrome WebView 138.0.7204.244 on Android 9 Error: Failed to execute 'define' on 'CustomElementRegistry': the name "remote-button" has already been used with this registry A.define (node_modules/@webcomponents/scoped-custom-element-registry/src/scoped-custom-element-registry.ts:180:12) /hacsfiles/universal-remote-card/universal-remote-card.min.js:17:6595 _ (/hacsfiles/universal-remote-card/universal-remote-card.min.js:35:4760) /hacsfiles/universal-remote-card/universal-remote-card.min.js:35:9147 /hacsfiles/universal-remote-card/universal-remote-card.min.js:81:3944 /hacsfiles/universal-remote-card/universal-remote-card.min.js:81:3948
Uncaught error from Chrome WebView 138.0.7204.244 on Android 9 Error: Failed to execute 'define' on 'CustomElementRegistry': the name "focus-trap" has already been used with this registry A.define (node_modules/@webcomponents/scoped-custom-element-registry/src/scoped-custom-element-registry.ts:180:12) /hacsfiles/advanced-camera-card/card-2e6f5419.js:75:3686
Uncaught error from Chrome WebView 138.0.7204.244 on Android 9 TypeError: Cannot read properties of undefined (reading 'disconnect') t.value (/hacsfiles/lovelace-layout-card/layout-card.js:1:42249) apply (node_modules/@webcomponents/scoped-custom-element-registry/src/scoped-custom-element-registry.ts:441:42) removeChild (src/panels/lovelace/hui-root.ts:1167:11) _selectView (src/panels/lovelace/hui-root.ts:736:13)
Uncaught error from Chrome 142.0.0.0 on Linux TypeError: Cannot read properties of null (reading 'selected') _itemClicked (src/panels/config/voice-assistants/dialog-expose-entity.ts:133:34) call (node_modules/lit-html/src/lit-html.ts:2109:28)

I suspect this is a known issue, but I haven’t seen any posts as detailed as this. Any help is appreciated; feel free to let me know if more logs or info are needed, and I’ll provide whatever I forgot.

Where is that?

Try Collecting and Sharing Media Files

Using the Media Source documentation linked in the post you shared, I believe what you might be asking for is “media-source://tts/tts.piper”? When I go to Piper and have it say something, it generates the file at http://homeassistant.local:8123/api/tts_proxy/J5-Ux0QzOUka5uAV1GGYaQ.wav, which is the file that plays when I either generate a phrase or the Satellite speaks. Is this related to the Satellite not being able to play from Jellyfin in any way?

Music Assistant >> Settings >> Core >> Stream Server

“Published IP address” must be accessible by the player

As I said in the quote you replied to, Music Assistant is accessible from the Android tablet in the web browser. Is the player you are referring to the Satellite (the tablet)?

So the satellite is a tablet?
Any device using MA should be able to access the IP:port listed as “published IP address”. The device type doesn’t matter.

Do you have “published IP address” set? Are you able to share it?

Yes, the tablet is the satellite. The tablet is utilizing the View Assist Companion App to be a satellite. The published IP address is 192.168.2.5, TCP Port 8098 (default)

I believe the answer to this is no.

TTS uses the local network address setting from HA.

The media provider/media stream will use the address from MA that I pointed to. I am not sure if this is true for Jellyfin; just my observation.

Any clue what the next steps may be? Is there another log I can check, or a debug log I overlooked that I can turn on?

You should be able to access and play this file at the URL provided. If you cannot, something is misconfigured. Start by checking that the IP/hostname in these two locations is accessible from the media player/voice device.

Music Assistant >> Settings >> Core >> Stream Server
“Published IP address” must be accessible by the player

HA >> Settings >> System >> Network >> Local network
The IP/hostname should be accessible from the voice device

url: /api/tts_proxy/J5-Ux0QzOUka5uAV1GGYaQ.wav

This is not accessible by the satellite, is it?
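
One way to check this from another machine on the satellite's network segment is a plain TCP connect test. A minimal sketch, assuming the addresses and ports mentioned in this thread and that HA and Music Assistant share the same Docker host (substitute your own values):

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Addresses/ports taken from this thread -- assumptions, adjust to your setup.
    targets = [
        ("192.168.2.5", 8123, "Home Assistant web/API (serves /api/tts_proxy/...)"),
        ("192.168.2.5", 8095, "Music Assistant web UI"),
        ("192.168.2.5", 8098, "Music Assistant stream server"),
    ]
    for host, port, label in targets:
        state = "open" if port_open(host, port) else "closed/filtered"
        print(f"{host}:{port} ({label}): {state}")
```

If 8123 or 8098 shows closed/filtered from the satellite's network, the TTS or music stream URL handed to the player will fail even though the web UIs may load fine from elsewhere.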

The tablet talks back when I ask it to do things. It’s connected to Home Assistant, and I can see the web UI in the View Assist Companion App. I can also access the web UI in the browser. I will generate a TTS phrase, copy the link into the tablet’s browser, and edit this post with the results. (I don’t see a reason why it wouldn’t work.)

No need.

First, just use Music Assistant directly to play a Jellyfin song on your tablet.

If it can’t find it, it may just be the wrong command. Before you test voice, it would be better to play something manually to verify it can do that.

In Music Assistant (http://192.168.2.5:8095 from my PC, not the satellite), when I set it to “Play on: VACA Tablet Media Player” and select a song, say ‘We Didn’t Start the Fire’ by Billy Joel, the UI looks as if the song is queued and about to play, but I can’t start it and the player never moves past 0:00. Pausing and unpausing has no effect. Switching to “This Device” and playing the same song works. I am also able to visit Music Assistant in the tablet/Satellite’s browser, and it will play the song on the tablet via “This Device”, but not on “VACA Tablet Media Player”.

See. Progress.

8095 is the web UI.
8098 is the streaming server. Did you expose this port in Docker?

FYI: macvlan networking for external access to a Docker container is better (my opinion, not fact).

To stream audio, Music Assistant will use 192.168.2.5:8098.
That’s what I would expect if everything is set up correctly.
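
For reference, a macvlan setup in Docker Compose looks roughly like this. This is a hypothetical sketch: the parent interface name (`eth0`), the `192.168.2.0/24` subnet, the gateway, and the container address are assumptions based on the addresses in this thread, so adjust them to your LAN:

```yaml
# Hypothetical sketch -- parent interface, subnet, and addresses are assumptions.
networks:
  lan:
    driver: macvlan
    driver_opts:
      parent: eth0              # host NIC on the 192.168.2.0/24 LAN (assumption)
    ipam:
      config:
        - subnet: 192.168.2.0/24
          gateway: 192.168.2.1

services:
  music-assistant:
    image: ghcr.io/music-assistant/server
    networks:
      lan:
        ipv4_address: 192.168.2.50   # pick a free address on your LAN
```

With macvlan the container gets its own LAN IP, so the published IP and the bind address are the same thing and no port mapping is needed; one known caveat is that the Docker host itself typically cannot reach a macvlan container directly without extra configuration.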

I have TCP ports 8095 and 8098 open in the firewall. The Docker Compose for Music Assistant requested I use host network mode, so that’s how it’s set up currently. Using nmap, however, TCP 8098 reports closed. Is it UDP? The /settings/editcore/streams page says 8098 should be a TCP port.

I GOT IT TO PLAY!
In Music Assistant > Settings > Core > Stream Server > Advanced Settings, “Bind to IP/Interface” was set to 0.0.0.0 (the default). I set it to 192.168.2.5, tried playing the song again, and would you believe it, it worked. That’s got to be a bug, right? My understanding was that 0.0.0.0 is a wildcard address used when a server is supposed to respond on any interface.

0.0.0.0 works for me, but I use macvlan and not host networking.
Binding to the actual interface it will stream from makes more sense; defaults are just easier.


Okay, so we’ve got it streaming when it’s manually initiated in Music Assistant. However, it still says “Sorry, I couldn’t understand that” when verbally asked to play Billy Joel, and “Media not found” when asked over text.

You’re probably giving it the wrong command. That’s why it “couldn’t understand that”.

I don’t believe Piper TTS supports the commands you are speaking.

There is an example of adding support for that sentence here.

Music Assistant also has some blueprints for voice support that I just discovered.

EDIT: Sorry, I put the wrong link for the blueprints; it’s corrected now.

If the View Assist documentation is to be believed, “Play some Billy Joel” should be a valid command.