Forked-daapd and media player services


I have forked-daapd set up and working: I can control speakers in zones, use the playlist functionality, and even send TTS to the main forked-daapd media player entity. However, the TTS is very slow, and regardless of which speakers I have enabled or what volumes they are set to, it always enables all zones, sets them all to 80%, and then speaks to them, anywhere from 1-2 minutes after I trigger the TTS.

I also cannot send a media_player.play_media service command to an individual zone. Nothing happens, and there is no error in forked-daapd or in HA. If I send the same service command to the main forked-daapd entity (the system entity), all playing zones immediately stop playing, and in the forked-daapd web interface I see all zones join and all volumes go to 80%, but then nothing happens. There's no audio. After a period of time, forked-daapd goes back to playing the queue.

These same services (TTS, volume_set, and play_media) all work on my Sonos speakers. From what I can tell, I should be able to play to a forked-daapd zone individually.

Has anyone gotten forked-daapd working fully with respect to playing to an individual zone through HA services?

Hi, I wrote the forked-daapd integration (but now I realize that there was already a custom component for forked-daapd). I just wanted a way to have simple control of volumes and zones within HA. When I added the TTS feature I wasn’t sure if anyone would use it. Now that there’s a user we can talk about how to change or improve functionality.
For the TTS volume part, the initial design was to have all the zones play at the same level (by default this is 0.8, but you can set this level in the integration options). I realized this might not be ideal for some people but wasn't sure of the alternative. Would playing at each zone's current volume be a better option? What does the Sonos integration do?
The play_media service is only available on the main entity and not on the zones. The reason for this is that the zones are not actually independent - they are all fed through the main forked-daapd instance. Sending a play_media to an individual zone would affect/stop whatever is playing on the other zones. Is that preferable?
I’m not sure why there would be a 1-2 minute delay on the TTS. Does this happen consistently?

I honestly haven’t played around with it too much since I initially configured it. The ability to turn zones on and off is nice and works well.

For Sonos, I can send a TTS message to a player and it will use either the volume I specify in the service call data or whatever volume it was last set to. Sonos also lets me play to individual speakers: if a speaker is grouped with others and I play to the master speaker, all speakers play the TTS; if I play to any other speaker in the group, just that speaker leaves the group and plays the message, and I then have to manually add it back into the group. Alternatively, Sonos exposes a couple of other services that let you save a snapshot of the current group configuration, join/unjoin speakers, play whatever you want, and then restore the group configuration to continue on.

The TTS seemed to always take a minute or two. I'd send a TTS to the main forked-daapd media player, then watch the volume controls on the forked-daapd server change. It would take a bit, all speakers would turn on and volumes would go to 80%, and then after another slight pause it would play the message.

I think what Sonos does is logical. The way you explained it, it sounds like while each media player configured under forked-daapd is its own endpoint with on/off and volume control, it can't actually play something on its own; you have to play to the main media player that's doing the processing. Whereas with Sonos, every speaker is its own player.

The reason I looked into forked-daapd and configured it was to integrate my AirPort Express and the AirPlay speaker endpoints on a couple of computers in different rooms. Basically I wanted to extend some of the functionality I had with Sonos to existing hardware.

Thanks for the info.
I can change the behavior for TTS to use each zone’s current volume. I think it also makes sense to be able to send a volume along with the service call, as you mentioned. I just had a look at the Sonos component and I don’t see where the volume is passed in the call - are you using a script that sends a few service calls together like https://www.home-assistant.io/cookbook/sonos_say/ ?
You have it exactly right about the structure of the forked-daapd player/zones vs the Sonos speakers. That introduces limitations - you can’t play two sources at the same time with one instance of forked-daapd. This would affect behavior like sending a TTS to a specific zone while other zones are playing. We could just go ahead and have the TTS get sent to that zone while pausing the audio elsewhere - let me know your thoughts on that.
As for grouping, I think the Sonos speakers themselves store the group information and coordinate the group. You can use the groups through the Sonos app and not just through HA right? We don’t have speaker groups in forked-daapd, but I might be able to get something similar working. When you have a Sonos speaker group configured in HA, where can you see it? Do you manage it through service calls?

You're right, the volume is actually a separate service call to the media player using media_player.volume_set. I had it confused with a script I created that takes volume as a parameter when called. The Sonos speakers/controller ecosystem maintains the groupings, and they can be managed from the Sonos app.

The forked-daapd endpoints kind of already have this: instead of join/unjoin as in Sonos, it's just a matter of turning an endpoint on or off, which I think automatically joins/unjoins it from the other endpoints on the main daapd server. So a snapshot might just record which endpoints are on/playing, a "join" command would turn on any requested endpoint that isn't already on, an unjoin would turn it off, and a restore would turn endpoints on/off as they were at the snapshot. In HA this would be a bit like creating a scene on the fly. The Sonos join/unjoin/snapshot/restore services actually call Sonos API commands, I believe.

The only way to tell that a Sonos speaker is in a group with others is its sonos_group: attribute, if it lists any speakers besides itself. The obvious indicator is the media player entities themselves, as they play the exact same media and their states are identical.
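The snapshot/join/unjoin/restore idea described above boils down to plain on/off bookkeeping per output. Here is a minimal sketch of that logic; the class, method names, and zone names are made up for illustration and are not part of the actual integration:

```python
# Hypothetical sketch of snapshot/join/unjoin/restore, modeled as plain
# on/off state per forked-daapd output. Purely illustrative.

class ZoneSnapshotter:
    def __init__(self, zones):
        # zones: dict mapping zone name -> bool (output enabled?)
        self.zones = dict(zones)
        self.snapshot_state = None

    def snapshot(self):
        """Remember which outputs are currently on."""
        self.snapshot_state = dict(self.zones)

    def join(self, name):
        """'Joining' is just enabling an output that is off."""
        self.zones[name] = True

    def unjoin(self, name):
        """'Unjoining' is just disabling the output."""
        self.zones[name] = False

    def restore(self):
        """Turn outputs on/off as they were at the snapshot."""
        if self.snapshot_state is not None:
            self.zones = dict(self.snapshot_state)


zones = ZoneSnapshotter({"kitchen": True, "office": False})
zones.snapshot()
zones.join("office")
zones.unjoin("kitchen")
zones.restore()
print(zones.zones)  # back to {'kitchen': True, 'office': False}
```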

As for the TTS: if an endpoint other than the main daapd server receives a TTS message, I think it should stop playback of all zones and play the TTS to just that zone at its current volume (so the TTS would actually get routed to the main daapd server, all other zones would be turned off, and the TTS would play to the requested zone). If the main daapd server receives a TTS, I think it should play to all endpoints (kind of like a PA override input on certain amplifiers). But then I'm not sure how to handle the post-TTS scenario. Should it resume playback? Should it just sit and wait because it finished what it was told to do? Should it turn that endpoint back off if it was off before the TTS? Sonos handles a request to play new media right now by clearing its queue, playing the media file on the speaker and any slaves attached to it, and then sitting and waiting. That's why that cookbook script has all the extra service calls to snapshot the current group and queue config, set the volume, play the TTS, then restore and resume the queue.

Thanks for the info and suggestions. I’ll see if I can work in some changes in the next few weeks when I have some time. Probably won’t make it into 0.115 but maybe we can get something in by 0.116.


@squirtbrnr Just circling back to this. In terms of what we discussed above, I started to implement similar services to what Sonos has, but I realized that most of them are not actually that useful given the differences in the zone architecture.

As for grouping, I’m not sure it makes sense to have that as a service - it’s probably easier to just turn the zones on/off through scripting. I think most of what we mention actually relates to the TTS functionality. I think we would like to be able to specify which speakers to play the TTS to and with what volume.

The Sonos component does the pausing/snapshotting in the TTS through a script which makes it easy to set the volumes and the groups. However, the forked-daapd component tries to do the TTS atomically in one asynchronous function. The reason for this is that if we used a script, each command would be sent to the forked-daapd server asynchronously, but the forked-daapd server has a hard time if you send overlapping calls. With one function we are able to chain the commands together with logic to wait for each command to complete before sending the next one. (Although you seem to be getting delays of a minute or two. This seems like a bug - we can work together to fix this.)
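The difference between firing independent service calls and chaining them in one coroutine can be shown with a toy asyncio example. `send_command` here is a made-up stand-in for a real API call, not part of the integration:

```python
# Toy illustration of why the TTS is done in one chained coroutine:
# awaiting each step guarantees the previous command finished before
# the next one is sent, so calls to the server never overlap.
import asyncio

async def send_command(name):
    await asyncio.sleep(0.01)  # pretend network round trip
    return name

async def tts_sequence(log):
    # Each await completes before the next command is sent.
    log.append(await send_command("pause"))
    log.append(await send_command("set_volumes"))
    log.append(await send_command("play_tts"))
    log.append(await send_command("restore"))
    return log

log = asyncio.run(tts_sequence([]))
print(log)  # ['pause', 'set_volumes', 'play_tts', 'restore']
```

If each call were instead dispatched independently (e.g. separate script steps), nothing would enforce this ordering on the server side.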

Maybe a solution is to have:

  1. a service that stores snapshots of the current zones and their volumes under a given name (or a number);
  2. another service which can restore the snapshot if you want to restore a zone configuration;
  3. a third service which restores the snapshot as the volume/group to use when playing TTS.

Does this make sense to you?
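The three services proposed above amount to a named store of zone-to-volume mappings. A rough sketch, with all names and data shapes assumed for illustration rather than taken from the integration:

```python
# Illustrative sketch of the three proposed services, backed by a dict of
# named snapshots (zone -> volume). Not the actual integration API.

snapshots = {}

def save_snapshot(name, zone_volumes):
    """Service 1: store the current zones/volumes under a name."""
    snapshots[name] = dict(zone_volumes)

def restore_snapshot(name):
    """Service 2: return a stored zone configuration to apply."""
    return dict(snapshots[name])

def tts_with_snapshot(name, message):
    """Service 3: play TTS using a stored zone/volume configuration."""
    config = snapshots[name]
    # ...here the real service would set outputs/volumes per config,
    # then play `message` through the main forked-daapd player...
    return {"zones": dict(config), "message": message}

save_snapshot("evening", {"kitchen": 0.4, "office": 0.2})
result = tts_with_snapshot("evening", "Dinner is ready")
print(result["zones"])  # {'kitchen': 0.4, 'office': 0.2}
```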

Sorry, I haven’t had a chance to think through this. I’ll try and get back to you tonight.

No problem, whenever you have time. I am pretty busy these few weeks so I might not be able to work on it too much.
BTW I added media browser support but would prefer some feedback before I submit a PR. Let me know if you’re willing to give it a try.

Regarding the play_media service, I have the same problem as @squirtbrnr, but it happens on the main device (all zones activate, volume is set to 80%, and the source is changed to my shaircast pipe).

I'm trying to play a specific track from my library on forked-daapd:

entity_id: media_player.forked_daapd_server
media_content_id: 'library:track:25522'
media_content_type: music

Anything wrong in the parameters or is it not supported by your integration?


Yes, I think we have agreed to remove the default volume setting. The question is how to set your own TTS volumes. It’s hard to set the options in the integration because the Zone entities are created dynamically, so we don’t know how many options we need to input and how to map those options to the correct Zone entities. I suggested doing the TTS volume setting by providing a snapshot service. You could snapshot the Zone configuration and then send a TTS call with that Zone configuration and that will take care of the volumes/zones for you.
As for the play_media service, yes it was not implemented for playing any media. The current code only plays a TTS url.
However, the updates I made to implement media browser included updating play_media. The media_content_id string is not exactly in that format - I had to add some extra data into the media_content_id string for better media browser compatibility. Are you interested in trying the updated component out? I would appreciate some quick feedback on it before I submit it as a PR.
Do you have the same problem that @squirtbrnr had where the TTS call was taking a long time to complete?


In fact, I’m not using TTS, my main concern is to be able to play tracks from the library but I’m OK to test your updated component.

As a quick and dirty solution to play tracks from the forked-daapd library, I removed all the TTS stuff in the play media function and it works as I expect:

    async def async_play_media(self, media_type, media_id, **kwargs):
        """Play a URI."""
        if media_type == MEDIA_TYPE_MUSIC:
            _LOGGER.debug("Play Media '%s'", media_id)
            await self._api.add_to_queue(uris=media_id, playback="start", clear=True)
        else:
            _LOGGER.debug("Media type '%s' not supported", media_type) 

Great.
You’ll have to use pip to install an updated version of the pyforked-daapd library from the master branch here https://github.com/uvjustin/pyforked-daapd (it’s still the same version # to avoid any version pinning problems in HA).
The component is in this branch here: https://github.com/uvjustin/home-assistant/tree/forked-daapd-browse-media .
To use a regular forked-daapd uri with this component just add an ampersand before and after. e.g.

entity_id: media_player.forked_daapd_server
media_content_id: 'library:track:25522'
media_content_type: music

becomes

entity_id: media_player.forked_daapd_server
media_content_id: '&library:track:25522&'
media_content_type: music

I've successfully updated the component and it works great. Good work! It's just a bit strange to have to add an ampersand before and after. Why do we need it?

Yes, it's strange. The reason I added it is that, from what I can see, media_content_id and media_content_type are the only fields we can use to pass data from a media browser parent to a media browser child. The URI contains most of the information but not all of it; for example, if we browse Albums by Genre, we need to keep track of the fact that we are browsing by both Album and Genre, and the URI won't carry those details from the parent down to the grandchild. This also makes it easier to set the title: the parent might have some extra information that we would like to include in the child's title. I used ampersands because there are readily available functions to escape and unescape text using ampersands.
When I submit the PR I’ll see if there are any better suggestions for passing information down.
What types of searching would you be interested in adding or removing? For example, Albums by Genre, Tracks by Genre, and Genre might overlap. Maybe we should remove Genre?
If you have a large library, maybe it’s better to split Albums or Artists into some kind of alphabetical index? forked-daapd supports a good amount of search functionality so we just have to think of what kind of searches we would want to translate into a browsing tree.
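To make the "pass browse context down inside media_content_id" idea concrete, here is one possible encoding. This is a made-up scheme for illustration only; the branch's actual ampersand-based format differs:

```python
# Illustrative sketch of stuffing extra browse context (e.g. "we got here
# via Album by Genre") into the media_content_id string and parsing it
# back out later. Made-up encoding, not the branch's actual format.
import json
from urllib.parse import quote, unquote

def pack_media_id(uri, context):
    # Percent-escape a JSON blob so it survives inside the id string.
    return f"{uri}?{quote(json.dumps(context))}"

def unpack_media_id(media_id):
    uri, _, encoded = media_id.partition("?")
    return uri, json.loads(unquote(encoded))

packed = pack_media_id(
    "library:album:42", {"by": ["album", "genre"], "title": "Rock"}
)
uri, context = unpack_media_id(packed)
print(uri, context["by"])  # library:album:42 ['album', 'genre']
```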

OK, I got around to thinking this through. I agree, a script calling the media player turn on/off service for each zone would handle the "grouping". Better yet would be a scene of sorts: scenes currently have the ability to save (snapshot) state, and restoring the snapshot is as simple as calling the scene turn on (or activate) command.

So going back to how forked-daapd works: any TTS input has to be directed to the "main" media player, but which zones are turned on determines where it actually plays. So couldn't that just be a script in HA? First snapshot the current setup (currently playing media/queue status, which media players are on, and their volumes), pause playback, turn on the selected media players (zones) and set their volumes, play the TTS message to the main media player (ignoring volume unless one is specified, in which case set all media players to that volume), then restore the setup and resume playback. That's essentially what my TTS script for Sonos does, the difference being that for forked-daapd I'd send the TTS to a main media player.
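In HA script form, that flow might look something like the sketch below. The entity names and the TTS platform are examples I've made up, not taken from this thread; scene.create is HA's built-in way to snapshot entity states on the fly:

```yaml
# Sketch of the snapshot -> TTS -> restore flow as an HA script.
# Entity names and the TTS platform are example assumptions.
tts_to_zones:
  sequence:
    # 1. Snapshot the current zone states and volumes on the fly.
    - service: scene.create
      data:
        scene_id: before_tts
        snapshot_entities:
          - media_player.forked_daapd_server
          - media_player.forked_daapd_output_kitchen
    # 2. Turn on the zones we want and set their volume.
    - service: media_player.turn_on
      entity_id: media_player.forked_daapd_output_kitchen
    - service: media_player.volume_set
      data:
        entity_id: media_player.forked_daapd_output_kitchen
        volume_level: 0.5
    # 3. Send the TTS to the main player.
    - service: tts.google_translate_say
      data:
        entity_id: media_player.forked_daapd_server
        message: "Dinner is ready"
    # 4. Restore the snapshot.
    - service: scene.turn_on
      entity_id: scene.before_tts
```

Note this is exactly the pattern the Sonos cookbook script uses; whether it works smoothly with forked-daapd depends on the overlapping-calls issue discussed above.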

From what I just described, I don't see a need for any additional service except a scene snapshot/restore. Almost everything is handled by existing media player component services. The only open question is how to handle a TTS call without an accompanying volume set.

I run HA docker. I can install a custom component through the custom component folder, but I don’t think I can with pip. At least I’ve never done it through pip or used pip. I’m not too worried about the media browser side of things right now.

Sorry if I repeated anything, duplicated what was already said, or I’m just not getting it and way off base with what we’re talking about.

I think you just described the same thing I described.
The current TTS implementation does its own snapshotting internally for restoring later. The issue is that between snapshotting and restoring it globally replaces the current volume with whatever the default TTS volume is set to (80% by default). Having a snapshot save/restore feature and being able to send the snapshot along with the TTS call should provide sufficient functionality.
A few more questions for you:

  1. A scene would be the same thing as a snapshot, right?
  2. Would we provide a limited number of slots or allow the snapshots/scenes to be saved with names?
  3. Should we expose those snapshots/scenes as “Sound modes” so we can switch between them using the media player interface?

As for trying out the new component feature, don't worry about it yet. After some initial feedback from @davidlb I can push the updated pip library and bump the version used by HA. Then testing the custom component won't require a custom pip.

  1. Yes. I just used the existing ability to create scenes as an example.
  2. I don’t think it’s necessary to save the snapshot beyond the TTS call. I think saving the scene as a sound config would be a bonus feature. I was thinking more along the lines of save the config and hold it for restoring until something overwrites it.
  3. Hmm, I haven't thought of that one. The existing media player component on the front end just has features as they pertain to media. I guess I'd need to know a little more about how the sound modes would be displayed on the front end.

The snapshots would need to be saved to be recalled or sent to the TTS service. Say you have 8 zones but are currently only playing something on zones 1 and 2. The current TTS implementation already saves the existing player state as a snapshot before and restores it after the TTS is done. The issue is what configuration of zones to play for the TTS call itself. You might not want it to play on the exact same zones and volumes as is currently playing. In our example, maybe you want to play the TTS on all 8 zones. For that you’d need to have that information (the 8 zones and their associated volumes) stored as a snapshot.

Just for reference, I also have the 80% volume and delay problem with TTS. Other than that, everything seems to be working great.
I wanted to try forked-daapd because the current integration for Denon receivers does not allow direct TTS; going through forked-daapd I can circumvent that.