HA + MA no voice integration working, throws error

I’ve got Music Assistant working. I have Home Assistant with Voice Assistant working. Unfortunately, HA voice can’t trigger actions with MA.

Config notes: Running Docker on Linux. Fully updated for both HA and MA. The integration is installed in HA, but for some reason I don’t get the side menu option for Music Assistant that people seem to talk about. Controlling MA devices using dashboard media cards works fine. MA finds my HA music players and lets me stream to them from the MA interface. I’ve followed the configuration options for MA to integrate voice assistant in HA by adding the two files from GitHub to the HA config/custom_sentences/en/ folder, as per the instructions.

First, here’s what the voice assistant does. You can see it talks to devices fine, just not to MA:

Screenshot from 2024-12-16 14-38-33

The HA log generates an “Unknown intent MassPlayMediaAssist” error:

2024-12-16 14:39:06.213 ERROR (MainThread) [homeassistant.components.assist_pipeline.pipeline] Unexpected error during intent recognition
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/assist_pipeline/pipeline.py", line 1078, in recognize_intent
    conversation_result = await conversation.async_converse(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<7 lines>...
    )
    ^
  File "/usr/src/homeassistant/homeassistant/components/conversation/agent_manager.py", line 110, in async_converse
    result = await method(conversation_input)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/conversation/entity.py", line 47, in internal_async_process
    return await self.async_process(user_input)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/conversation/default_agent.py", line 368, in async_process
    return await self._async_process_intent_result(intent_result, user_input)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/conversation/default_agent.py", line 441, in _async_process_intent_result
    intent_response = await intent.async_handle(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/usr/src/homeassistant/homeassistant/helpers/intent.py", line 121, in async_handle
    raise UnknownIntent(f"Unknown intent {intent_type}")
homeassistant.helpers.intent.UnknownIntent: Unknown intent MassPlayMediaAssist


Have I missed a step somewhere?

Can you post your config files?

What config files? Other than the two voice integration files for MA, everything for this is configured through the GUI. The only two files I added for this are the two from GitHub, as linked in the instructions here: Voice Control - Music Assistant

You added the custom_sentences.yaml from the MA GitHub, but you will need to create a custom intent for triggering the music_assistant.play_media service. There are no built-in intents for music search, only media_player controls.
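For reference, a minimal sketch of what such a custom intent hookup could look like in configuration.yaml (a sketch only, not the official MA setup; the intent name must match the one used in the custom sentences file, and the entity_id is a placeholder you would replace with your own player):

```yaml
# configuration.yaml — minimal sketch
intent_script:
  MassPlayMediaAssist:
    action:
      - service: music_assistant.play_media
        target:
          entity_id: media_player.your_speaker  # placeholder, replace with your player
        data:
          media_id: "{{ track }}"  # slot captured by the custom sentence
          media_type: track
```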

There’s nothing in the setup instructions for Music Assistant saying that you would have to do that, or exactly how to do it. It only talks about installing the intents file and the responses file, and even calls that setup “Easiest”. I got the impression that the additional intent helpers are handled by the plugin. I’m getting the impression that Music Assistant isn’t quite ready for the average Joe with only 3 years of HA experience.

Music Assistant did build out the ability to search all your providers’ content from HA with the music_assistant.play_media service. It all looks great and should work in theory, and the automation seems like it would be straightforward, but as I posted earlier this morning I have not been able to get this to work (I’m no pro HA user though).
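One way to sanity-check the service itself, independent of voice, is to call it directly from Developer Tools → Services. A sketch (the entity_id and track are hypothetical examples; this assumes music_assistant.play_media accepts a name-based media_id as described above):

```yaml
# Developer Tools → Services, YAML mode
service: music_assistant.play_media
target:
  entity_id: media_player.office_speaker  # hypothetical player
data:
  media_id: "Fly Me to the Moon"  # any search term your providers can resolve
  media_type: track
```

If this works but the voice command does not, the problem is in the intent matching rather than in MA itself.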

https://community.home-assistant.io/t/custom-intents-and-music-assistant/811943

I started the same way you did, by bringing in the Music Assistant provided sentence file, and tried building out the intent based on the slots, but I could not get it to work. Now I’ve eliminated all the bells and whistles and I still can’t get this thing to play music from a voice command.

Well, at least I’m not the only one. lol. With all the hype around it and nothing but positive vibes, I was expecting it to work. Yes, I expect a bit of pain every time I do something new with HA, but oh well. This is the last thing I was working out to be able to replace my Google Home infrastructure entirely with in-house.

I’ll let you know if someone rains down some knowledge or if I stumble into a solution.

I will say if you haven’t tried it yet the M5Stack Atom is awesome. The speaker is unusable on it, but I’ve ordered a dac board to give me a line out.

Then again the rumors of voice hardware coming…

I made some progress. It turns out the native MA integration doesn’t include what’s needed for voice to work. Switching to the HACS version of it made the error stop, but now it gives me:

2024-12-16 17:10:07.091 WARNING (MainThread) [homeassistant.helpers.service] Referenced entities media_player.office_speaker are missing or not currently available

Tomorrow I might try ripping it all out and starting over with just the HACS plugin and see where I get, but clearly there’s a major difference between the two.

I thought about going that route, but they have put out a warning deprecating the HACS integration and moving everything over to Core after 2024.12. Let me know if it works well, but it’s probably a temporary solution.

“Yes, in about a month.”

I’m having the same issue, and I could not get it to work with either the HACS or the HA integration.

Now it’s working using the HACS component

I’m also trying to play tracks on my new Voice Assistant Satellite using voice and MA.

First I’ve tried it directly - to no avail.

Then at some point I moved back to the HACS integration. The best I could get with that was that it said it would now start the song, but it never actually started.

Moved back to the new integration and tried playing around with setting up an intent (my first one actually, so I barely have an idea what I’m doing here):

config/custom_sentences/de/music_assistant_PlayMediaOnMediaPlayer.yaml

language: "de"
intents:
  MusicAssistantPlayMediaOnMediaPlayer:
    data:
      # TARGET AN AREA
      - sentences:
          - "<abspielen> {query};im [der ]<bereich> [((mit)|(unter Verwendung von)) {radio_modus}]"
        expansion_rules:
          abspielen: "((spiel(e)?)|(höre))"

      # TARGET A NAME
      - sentences:
          - "<abspielen> {query};<auf> [dem ]{name} [<wiedergabegeräte>] [((mit)|(unter Verwendung von)) {radio_modus}]"
        expansion_rules:
          abspielen: "((spiel(e)?)|(höre))"
          wiedergabegeräte: "((Lautsprecher)|([Medien-]Player))"
          "auf": "(auf|über)"
        requires_context:
          domain: "media_player"

      # TARGET AN AREA AND A NAME
      - sentences:
          - "<abspielen> {query};im [der ]<bereich> <auf> [dem ]{name} [<wiedergabegeräte>] [((mit)|(unter Verwendung von)) {radio_modus}]"
        expansion_rules:
          abspielen: "((spiele)|(höre))"
          wiedergabegeräte: "((Lautsprecher)|([Medien-]Player))"
          "auf": "(auf|über)"
        requires_context:
          domain: "media_player"

lists:
  query:
    wildcard: true
  radio_modus:
    values:
      - "Radiomodus"

config/intents.yaml

intents:
    - spec:
        name: play_track_on_media_player
        description: Plays any track (name or artist of song) on a given media player
        parameters:
          type: object
          properties:
            track:
              type: string
              description: The track to play
            entity_id:
              type: string
              description: The media_player entity_id retrieved from available devices. 
                It must start with the media_player domain, followed by dot character.
          required:
          - track
          - entity_id
      function:
        type: script
        sequence:
        - service: music_assistant.play_media
          data:
            media_id: '{{track}}'
            media_type: track
          target:
            entity_id: '{{entity_id}}'

But no matter what I do, when not using the HACS version, it always tells me that the player is not paused and can’t play anything. I assume that it is only able to pause and resume with the pre-existing intents. I have no idea why it is not picking up my custom intents.

I might add that when debugging the intent with the dev tools → Assist, it states that it is using the intent HassTurnOn, which obviously won’t work. I’m telling it something like “Spiele XY von Z” (“Play XY by Z”), which should be covered by my custom intent?

I would have assumed that this somehow works out of the box, but it does not, and I’m puzzled as to how I can make it work. And why did it seem to work in the HACS integration but not in the new one?
Any help is greatly appreciated! :slight_smile:

Okay, after a lot of trial and error I made some progress. This is going to be verbose, but I figured it’s better to be as detailed as possible and not just throw around random snippets without any context.

Documentation is lacking :-/

Unfortunately, the documentation of the whole intent part is very sparse, and it is nearly impossible to get something working from the documentation alone. Without reading lots of posts and trying to make sense of them without actually understanding what things really do, I wouldn’t have gotten anywhere. I really wish someone who truly understands the intent pipeline would write better documentation.


Prerequisites

So this is my starting point:

  1. Working Assist pipeline in general for built-in commands
  2. Installed Music Assistant and configured it to use my Voice PE as speaker
  3. Installed HA Music Assistant Integration (not the HACS one)
  4. Installed addon “File Editor”, so you can edit/add configuration files
  5. Just for reference, I’m a German speaker, so you might need to change “de” to your country code in folder names and scripts, and update the sentences used for matching the intent

Set up intent

  1. Open up the file editor
  2. Create folder custom_sentences if it does not exist
  3. Create subfolder de within custom_sentences
  4. Create a new file to hold my intent - I used the one from the MA sample, which is: music_assistant_PlayMediaOnMediaPlayer.yaml
    The structure should then be:
    custom_sentences/de/music_assistant_PlayMediaOnMediaPlayer.yaml
  5. Add a minimal intent configuration to that file (I will extend and post the complete intent once it is working):

./custom_sentences/de/music_assistant_PlayMediaOnMediaPlayer.yaml

language: "de"
intents:
  MassPlayMediaAssist:
    data:
      - sentences:
          - "<play> <track> {track}"
        expansion_rules:
          play: "((spiel)|(spiele))"
          track: "[((der)|(die)|(das)) ](track|song|lied)"

lists:
  track:
    wildcard: true

This should pick up a voice command like this (and then move on to the intent_script below):
"Spiele Song Fly me to the moon" ("Play song Fly me to the moon")

Note: This is a very minimalistic sentence. For this post I’ve removed all the fancies that the example provided by MA has. Funny thing is, with the fancies it didn’t work at all. It was only after stripping it down, while writing this, that it actually worked for the first time. The regular sentence was like this: "<play> <track> {track} [von <artist> {artist}] [((mit)|(im)) {radio_mode}]"

Verify that the intent is picked up

If this works, you can check it in the HA GUI using the dev tools, on the “Assist” tab.
If you enter “Spiele Song Fly me to the moon” into the parser, it should output this:

intent:
  name: MassPlayMediaAssist
slots:
  track: Fly Me To The Moon
details:
  track:
    name: track
    value: Fly Me To The Moon
    text: Fly Me To The Moon
targets: {}
match: true
sentence_template: <play> <track> {track}
unmatched_slots: {}
source: custom
file: de/music_assistant_PlayMediaOnMediaPlayer.yaml

It is super important that match: true; otherwise, while it has applied our new intent to parse the command, it was not successful in matching one of the sentences or its parameters (I don’t know exactly what it didn’t match).


Response

We need to set up a response for our intent next, which will be used to confirm our command using voice output.

  1. Open file editor
  2. Go to custom_sentences/de/
  3. Create new file responses.yaml
  4. Add the following and save file

./custom_sentences/de/responses.yaml

language: "de"
responses:
  intents:
    MassPlayMediaAssist:
      default: "Okay"

This will just confirm our command with “Okay”, which can probably be improved later if desired.
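For example, the confirmation could mention the matched track. As far as I understand, custom responses are templates that can reference the matched slots via a slots variable; a hedged sketch (assuming slots.track is available here):

```yaml
language: "de"
responses:
  intents:
    MassPlayMediaAssist:
      default: "Okay, ich spiele {{ slots.track }}"
```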


Intent Script

So now that we have successfully picked up the voice command, we need to add a matching intent_script to pass the data on to a service. The intent_script must have the same name as the intent, which in our case is MassPlayMediaAssist.

  1. Again open up the file editor
  2. Edit the configuration.yaml file and add the following

./configuration.yaml

intent_script:
  MassPlayMediaAssist:
    action:
      - service: music_assistant.play_media
        target:
          entity_id: media_player.home_assistant_voice_091d31_media_player_2
        data:
          enqueue: replace
          media_id: "{{ track }}"
          #media_id: Punk Rock Song
          media_type: track
          radio_mode: false

Two notes here:

  • You will need to edit the entity_id and provide the one of the speaker you wish to use. That can probably be improved later, but we need a bare minimal setup that actually works before we can move on to more complex things.

  • There is a commented-out media_id which I used to check whether playback works in general, no matter what data the intent passes on to the script. With the more complex sentences I had the issue that the track wasn’t provided for some unknown reason, and the intent_script threw an error with the voice saying “An unknown error has occurred”. I then used a static media_id to check whether it would work if track were set and contained valid data. If playback doesn’t work even then, that needs to be fixed first.
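On making the entity_id dynamic later: one direction that might work is adding a {name} wildcard slot to the sentence and resolving the spoken name to an entity with a template. A sketch only (it assumes the spoken name matches the player’s friendly name, and media_player.your_speaker is a placeholder fallback):

```yaml
intent_script:
  MassPlayMediaAssist:
    action:
      - service: music_assistant.play_media
        target:
          # look up the media_player whose friendly name matches the spoken {name} slot
          entity_id: >-
            {{ states.media_player
               | selectattr('name', 'equalto', name)
               | map(attribute='entity_id') | list
               | first | default('media_player.your_speaker') }}
        data:
          media_id: "{{ track }}"
          media_type: track
```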


Reload configurations

Now we need to apply our changes to the yaml configuration files.
I’m not sure if you need to reboot when adding new files, but maybe do that once.
Then you can usually go to the dev tools, tab “YAML”, and first check the configuration in the first block and then, if all is good, reload all YAML configs in the second block. (If you know exactly which ones to reload, you can reload them selectively, but I don’t know which, so I hit “All”.)


Test it!

Now test it by saying the magic words and make sure you have a matching song. *fingers crossed*


Flow overview

The process in general is somewhat like this:

  1. Speak a command
  2. Some magic happens and, if things go well, our intent is picked to process the command (probably based on the sentence patterns configured in the intent)
  3. Our intent tries to match the voice command against one of the configured sentences and extracts parts of the sentence as our data (track)
  4. The matched data (track) is handed over to the intent_script
  5. The intent_script maps our data to specific arguments and calls the defined service with those arguments
  6. If things go well, the response is called, which outputs a voice confirmation, and then the song is played

Across the intent, intent_script and responses, our configurations are linked by their common name, which in this case is MassPlayMediaAssist - but it can be pretty much anything that is not already in use.


What’s next?

This is super minimalistic, but it works and should get you started.

  • Next are improvements to the intent so that more complex patterns/sentences can actually be matched without things going boom.
  • The entity_id of the media player should be made dynamic rather than static.
  • The response could contain what is actually going to be played next, but that’s up to you.
  • It would be nice to have better error handling and better debugging - if anybody knows more, I’m all ears.
  • In the same vein, it would be nice to have a dedicated output in case the track could not be found but everything else was okay.
  • I really need a way to debug an intent_script, as this is what usually goes wrong - namely the slot exceptions, where I just can’t find out what exactly went wrong; they indicate invalid data being passed to the intent_script or an incorrect mapping (like an incorrect type).
  • In the intent_script, how do I handle the different sentences from the intent, which provide different datasets? Right now it can only play a track and that type is hardcoded - how do I make it more flexible to e.g. play something from an artist?
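For that last point, one direction that might work is branching on which slots were matched. A sketch only (it assumes an optional {artist} wildcard slot in the sentences, and that slots which weren’t matched are undefined inside the script; the entity_id is a placeholder):

```yaml
intent_script:
  MassPlayMediaAssist:
    action:
      - service: music_assistant.play_media
        target:
          entity_id: media_player.your_speaker  # placeholder
        data:
          # if only an artist was spoken, search by artist; otherwise by track
          media_id: "{{ artist if artist is defined and track is not defined else track }}"
          media_type: "{{ 'artist' if artist is defined and track is not defined else 'track' }}"
```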

Issues

  • When the sentence matching messes up and an invalid track is passed to the intent_script, it will throw an exception stating that the track could not be resolved. But it is not cleared from the cache, so even if the next attempt provides a valid track, it will still try the invalid one first and throw an exception again. The only way I found to resolve that is to reboot, which is super bad. Any ideas?
    This is what I get, for example:
    Logger: homeassistant.helpers.script.intent_script_massplaytrack
    Source: helpers/script.py:2032
    First occurred: 16:35:19 (2 occurrences)
    Last logged: 16:38:59
    
        Intent Script MassPlayTrack: Error executing script. Error for call_service at pos 1: Could not resolve ['Fly Me to the Moon im Radiomodus'] to playable media item
        Intent Script MassPlayTrack: Error executing script. Error for call_service at pos 1: list index out of range
    
  • We really, like really, need something to stop the music. Saying stop does nothing for me, unfortunately.

That’s it. Hope that is useful to someone and please share if you have improved on this. :slight_smile: