I have my voice pipeline set up and I am quite happy with the results. The only missing piece is being able to say “play some indie rock in the kitchen”. This is easy on Alexa or Google, but apparently not possible in HA unless I create specific actions for it, which means it only works for a handful of predefined cases. Alexa simply searches Spotify for a playlist and plays it on the Sonos speaker in the given room.
I know it could be possible with Music Assistant, but it is too frustrating to set up in my environment (I run HA on k8s, and Music Assistant just doesn’t play well with a setup like mine). I haven’t tried using Music Assistant with the HA integration only (for both speakers and sources); I’ll try that later. That said, I would prefer a 100% HA-only solution.
Has anyone come up with a good solution using an LLM? I guess it would be a matter of having a service that searches Spotify or Sonos playlists and returns the URL; the assistant would then just need to play it.
The SpotifyPlus integration can do the search and start playback, but I’m not sure how to integrate the voice piece to tell the service what to search for. I’ve never played with voice support before.
The genre search is no longer possible via the Spotify Web API since Spotify removed that feature unexpectedly in November 2024. Prior to that it was very easy with the GetTrackRecommendations service.
Thanks @thlucas. In the end I added a Music Assistant container to my Home Assistant pod and then added a simple script, exposed to the voice assistant, that uses Music Assistant to search for music when I say something like “play X on Y”. It works fine. The main deciding factor is that Music Assistant allows direct OAuth sign-in with Spotify, which makes things a lot easier.
I am going to test SpotifyPlus now; if I can get rid of Music Assistant, all the better. From a voice perspective it’s very easy to integrate things with the LLM: you just need a prompt explaining something like “when someone asks to play something, call this function with two parameters: the search term, which is mandatory, and the speaker or area, which is optional”. Then you just need a script that calls search and finds the right player entity.
Yes, that’s what I am using now (although I created my own script that calls search because I found that one too generic; still, I would love to see a 100% HA solution).
Can you share the script you created?
I have MA running and working with the blueprint/script above, but I cannot get it to work if I say ‘play some {genre} music’.
According to the main MA thread, genre support isn’t in MA yet, so how have you done it?
I am not using their script. Right now I just search for a radio; if you have Spotify, searching for a radio is similar to searching for a genre. You can change the search to a playlist if that suits you better. I have no local music. My script is this:
music_radio_request:
  description: |
    This script is used to play a radio station on a speaker.
    The tool takes the following arguments: 'radio_name', 'player'.
    'radio_name' and 'player' are always required and must always be supplied as arguments to this tool.
    Use this tool whenever the user requests to play a radio station for example
    "Play Radio 105 on the kitchen" or "Play Venus Radio in the living room".
    Use 'radio_name' as provided in the sentence without the "radio" part.
    Use 'player' as provided in the sentence with no prefix or suffix.
  sequence:
    # Ask Music Assistant for the single best radio match
    - action: music_assistant.search
      response_variable: search_result
      data:
        limit: 1
        name: "{{ radio_name }}"
        library_only: false
        media_type:
          - radio
    # Resolve the spoken player name to a Music Assistant media_player entity
    - action: python_script.player_search
      response_variable: player_result
      data:
        name: "{{ player }}"
    - variables:
        media_uri: "{{ search_result['radio'][0]['uri'] }}"
        media_name: "{{ search_result['radio'][0]['name'] }}"
        player_id: "{{ player_result['entity_id'] }}"
    - action: system_log.write
      data:
        level: info
        message: "Play: {{ media_name }} ({{ media_uri }}) on {{ player_id }}"
    - action: media_player.play_media
      data:
        media_content_id: "{{ media_uri }}"
        media_content_type: music
      target:
        entity_id: "{{ player_id }}"
  mode: single
  fields:
    radio_name:
      description: "Name of the radio station"
      example: "Radio 105"
    player:
      description: "Media player name"
      example: "Kitchen"
Then I have a very simple Python script that finds the MA player (I have Sonos and other players active, so I want something that finds the MA player only):
name = data.get("name", "Kitchen").lower().split(".")[-1]
# List all media_player entities known to Home Assistant
entity_ids = hass.states.entity_ids()
entities = [e for e in entity_ids if e.startswith("media_player.")]
states = {e: hass.states.get(e) for e in entities}
# Keep only the players whose app_id is music_assistant
states = {k: v for k, v in states.items() if v.attributes.get("app_id") == "music_assistant"}
logger.debug(f"Found {len(states)} media players, looking for {name}")
# Search for a close match (case-insensitive and partial matching)
candidates = {k: v.attributes.get("friendly_name", k).lower() for k, v in states.items()}
matched_entity = None
for entity_id, friendly_name in candidates.items():
    logger.info(f"Checking {entity_id} ({friendly_name}) for match: {name}")
    if name in friendly_name or friendly_name in name or name in entity_id:
        matched_entity = entity_id
        break
if matched_entity:
    logger.info(f"Best match found: {matched_entity} ({candidates[matched_entity]})")
else:
    logger.warning(f"No suitable match found for '{name}'")
# Returned to the calling script through response_variable
output["entity_id"] = matched_entity
output["states"] = states
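The matching rule in the script above can also be tried outside Home Assistant. What follows is only a minimal sketch of the same substring logic as a plain function; `find_player` and the `players` mapping are hypothetical names made up for illustration, and the entity ids are invented. The one deliberate difference is a guard against an empty name, since an empty string is a substring of everything and would match the first player.

```python
def find_player(name, friendly_names):
    """Return the first entity_id that loosely matches the spoken name.

    friendly_names maps entity_id -> friendly name (hypothetical input shape,
    standing in for the filtered Music Assistant players).
    """
    wanted = name.lower().split(".")[-1].strip()
    if not wanted:
        # "" is a substring of every string, so refuse to match on it
        return None
    for entity_id, friendly in friendly_names.items():
        friendly = friendly.lower()
        # Same three checks as the python_script: partial match either way,
        # or the spoken name appearing inside the entity id
        if wanted in friendly or friendly in wanted or wanted in entity_id:
            return entity_id
    return None


# Invented example players, not taken from a real setup
players = {
    "media_player.kitchen_ma": "Kitchen",
    "media_player.living_room_ma": "Living Room",
}

print(find_player("kitchen", players))
print(find_player("the Living Room", players))
print(find_player("garage", players))
```

Because the first match wins, a short name such as "room" would pick whichever player containing it comes first, so more distinctive player names probably give more predictable results.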
That’s it. Now I can just say “play rock radio in the bathroom” and it works quite well. Favorites are taken first.
Thank you so much for sharing; I’ll dissect it and see what I can use.
The other really simple option is to create some genre-specific playlists and then just trigger those, I suppose.
If you use Spotify this is not needed, but yes, if you have local media that is probably the way. I am pretty sure you could use an LLM for that too, or maybe just a script that creates a playlist every night based on the genre in the media info.
I could write a python script to parse the mp3 tags and generate m3u files which MA uses for playlists. That would save a bunch of time rather than doing it by hand.