Forked-daapd and media player services

You’re right, the volume is actually a separate service call to the media player using media_player.volume_set. I had it confused with a script I created that took volume as a parameter when called. The Sonos speaker/controller ecosystem maintains the groupings, and they can be managed from the Sonos app. The forked-daapd endpoint kind of already has this; instead of join/unjoin like in Sonos, it’s just a matter of turning the endpoint on/off, which I think would automatically join/unjoin it from the other endpoints on the main daapd server. So a snapshot might just record which endpoints are on/playing, a “join” command would just turn on any requested endpoint that’s not already on, an unjoin would turn it off, and a restore would set the endpoints on/off as they were at the snapshot. This sounds like something that would be done in HA, kind of like creating a scene on the fly. The Sonos join/unjoin/snapshot/restore services actually call Sonos API commands, I believe. When a Sonos speaker is in a group, the only way to tell is the sonos_group: attribute of the speaker, which will list speakers other than itself. The obvious indicator is the media player entities themselves: grouped players play the exact same media and their states are identical.
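
As a rough sketch of that idea (the zone entity names here are hypothetical), “join” and “unjoin” could just be turn_on/turn_off calls in an HA script:

    # Hypothetical zone entity names; "joining" a zone is just turning its output on
    script:
      daapd_join_kitchen:
        sequence:
          - service: media_player.turn_on
            entity_id: media_player.forked_daapd_output_kitchen
      daapd_unjoin_kitchen:
        sequence:
          - service: media_player.turn_off
            entity_id: media_player.forked_daapd_output_kitchen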

As for the TTS: if an endpoint other than the main daapd server receives a TTS message, I think it should stop playback of all zones and play the TTS to just that zone at the current volume (so the TTS would actually get routed to the main daapd server, all other zones would be turned off, and the TTS would play to the requested zone). If the main daapd server receives a TTS, I think it should play to all endpoints (kind of like a PA override input on certain amplifiers). But then I’m not sure how to handle the post-TTS scenario. Should it resume playback? Should it just sit and wait because it finished what it was told to do? Should it turn that endpoint back off if it was off before the TTS? The way Sonos handles it is as if a new media file should be played right now: it clears its queue, plays the media file to the speaker and any slaves attached to it, then sits and waits. That’s why that cookbook script has all the extra service calls to snapshot the current group and queue config, set the volume, play the TTS, then restore and resume the queue.
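
For reference, that cookbook script looks roughly like this (the entity name, volume level, and TTS platform here are just examples):

    # Rough outline of the Sonos TTS cookbook flow: snapshot, announce, restore
    script:
      sonos_tts_announce:
        sequence:
          - service: sonos.snapshot        # save group/queue/volume state
            entity_id: media_player.living_room
            data:
              with_group: true
          - service: media_player.volume_set
            entity_id: media_player.living_room
            data:
              volume_level: 0.5
          - service: tts.google_translate_say
            entity_id: media_player.living_room
            data:
              message: "Dinner is ready"
          - delay: "00:00:05"              # allow the message to finish
          - service: sonos.restore         # restore groups and resume the queue
            entity_id: media_player.living_room
            data:
              with_group: true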

Thanks for the info and suggestions. I’ll see if I can work in some changes in the next few weeks when I have some time. Probably won’t make it into 0.115 but maybe we can get something in by 0.116.


@squirtbrnr Just circling back to this. In terms of what we discussed above, I started to implement similar services to what Sonos has, but I realized that most of them are not actually that useful given the differences in the zone architecture.

As for grouping, I’m not sure it makes sense to have that as a service - it’s probably easier to just turn the zones on/off through scripting. I think most of what we mention actually relates to the TTS functionality. I think we would like to be able to specify which speakers to play the TTS to and with what volume.

The Sonos component does the pausing/snapshotting in the TTS through a script which makes it easy to set the volumes and the groups. However, the forked-daapd component tries to do the TTS atomically in one asynchronous function. The reason for this is that if we used a script, each command would be sent to the forked-daapd server asynchronously, but the forked-daapd server has a hard time if you send overlapping calls. With one function we are able to chain the commands together with logic to wait for each command to complete before sending the next one. (Although you seem to be getting delays of a minute or two. This seems like a bug - we can work together to fix this.)

Maybe a solution is to have:

  1. A service that stores a snapshot of the current zones and their volumes under a given name (or a number).
  2. A service that restores a named snapshot if you want to bring back a zone configuration.
  3. A service that applies a named snapshot as the volume/group configuration to use when playing TTS.

Does this make sense to you?
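
In call form that might look something like this - purely illustrative, since none of these services exist yet and the names are made up:

    # Hypothetical services, for illustration only
    - service: forked_daapd.save_snapshot      # 1) store zones + volumes by name
      data:
        name: announce_all_zones
    - service: forked_daapd.restore_snapshot   # 2) restore a saved configuration
      data:
        name: announce_all_zones
    - service: forked_daapd.tts_with_snapshot  # 3) play TTS using a saved snapshot
      data:
        name: announce_all_zones
        message: "Hello"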

Sorry, I haven’t had a chance to think through this. I’ll try and get back to you tonight.

No problem, whenever you have time. I am pretty busy these few weeks so I might not be able to work on it too much.
BTW I added media browser support but would prefer some feedback before I submit a PR. Let me know if you’re willing to give it a try.

Regarding the play_media service, I have the same problem as @squirtbrnr, but it happens on the main device (all zones activate, the volume is set to 80%, and the source is changed to my shaircast pipe).

I’m trying to play a specific track from my library on forked-daapd:

entity_id: media_player.forked_daapd_server
media_content_id: 'library:track:25522'
media_content_type: music

Is anything wrong with the parameters, or is it not supported by your integration?


Yes, I think we have agreed to remove the default volume setting. The question is how to set your own TTS volumes. It’s hard to set the options in the integration because the Zone entities are created dynamically, so we don’t know how many options we need to input and how to map those options to the correct Zone entities. I suggested doing the TTS volume setting by providing a snapshot service. You could snapshot the Zone configuration and then send a TTS call with that Zone configuration and that will take care of the volumes/zones for you.
As for the play_media service, yes, it was not implemented for playing arbitrary media. The current code only plays a TTS URL.
However, the updates I made to implement media browser included updating play_media. The media_content_id string is not exactly in that format - I had to add some extra data into the media_content_id string for better media browser compatibility. Are you interested in trying the updated component out? I would appreciate some quick feedback on it before I submit it as a PR.
Do you have the same problem that @squirtbrnr had where the TTS call was taking a long time to complete?


In fact, I’m not using TTS; my main concern is being able to play tracks from the library, but I’m OK with testing your updated component.

As a quick and dirty solution to play tracks from the forked-daapd library, I removed all the TTS stuff in the play media function, and it works as I expect:

    async def async_play_media(self, media_type, media_id, **kwargs):
        """Play a URI."""
        if media_type == MEDIA_TYPE_MUSIC:
            _LOGGER.debug("Play Media '%s'", media_id)
            # Clear the current queue, add the URI, and start playback
            await self._api.add_to_queue(uris=media_id, playback="start", clear=True)
        else:
            _LOGGER.debug("Media type '%s' not supported", media_type)

Great.
You’ll have to use pip to install an updated version of the pyforked-daapd library from the master branch here https://github.com/uvjustin/pyforked-daapd (it’s still the same version # to avoid any version pinning problems in HA).
The component is in this branch here: https://github.com/uvjustin/home-assistant/tree/forked-daapd-browse-media .
To use a regular forked-daapd URI with this component, just add an ampersand before and after, e.g.

entity_id: media_player.forked_daapd_server
media_content_id: 'library:track:25522'
media_content_type: music

becomes

entity_id: media_player.forked_daapd_server
media_content_id: '&library:track:25522&'
media_content_type: music

I’ve successfully updated the component and it works great. Good work! It’s just a bit strange to have to add an ampersand before and after. Why do we need it?

Yes, it’s strange. The reason I added it is that, from what I can see, the media_content_id and media_content_type are the only fields that we can use to pass data from a media browser parent to a media browser child. The URI contains most of the information but not all of it - for example, if we browse Albums by Genre, we need to keep track of the fact that we are browsing by both Album and Genre, and the URI won’t carry those details from the parent down to the grandchild. This also makes it easier to set the title - the parent might have some extra information that we would like to include in the child’s title. I used ampersands because there are readily available functions to escape and unescape the text using ampersands.
When I submit the PR I’ll see if there are any better suggestions for passing information down.
What types of searching would you be interested in adding or removing? For example, Albums by Genre, Tracks by Genre, and Genre might overlap. Maybe we should remove Genre?
If you have a large library, maybe it’s better to split Albums or Artists into some kind of alphabetical index? forked-daapd supports a good amount of search functionality so we just have to think of what kind of searches we would want to translate into a browsing tree.

OK, I got around to thinking this through. I agree, a script to call the media player turn on/off service for each zone would handle the “grouping”. Better yet would be a scene of sorts: scenes currently have a service to save (snapshot) the current state, and restoring the snapshot is as simple as calling the scene turn on (activate) service, as in the sketch below.
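
A minimal sketch of that, assuming hypothetical zone entity names - scene.create with snapshot_entities captures their current state, and scene.turn_on restores it (worth verifying how completely a scene can restore media player state):

    # Snapshot the current state of the zone players into an on-the-fly scene
    - service: scene.create
      data:
        scene_id: daapd_before_tts
        snapshot_entities:
          - media_player.forked_daapd_output_kitchen
          - media_player.forked_daapd_output_den
    # ...later, restore the snapshot
    - service: scene.turn_on
      entity_id: scene.daapd_before_tts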

So going back to how forked-daapd works: any TTS input has to be directed to the “main” media player, but which zones are turned on determines where it actually plays. So couldn’t that just be an HA script, something like the sketch below: first snapshot the current setup (currently playing media/queue status, which media players are on, and their volumes), pause playback, turn on the selected media players (zones) and set their volumes, play the TTS message to the main media player (ignoring volume unless one is specified, in which case it sets all media players to that volume), then restore the setup and resume playback. That’s essentially what the TTS script I have for my Sonos does; the difference is that for forked-daapd I’d send the TTS to the main media player.
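
Roughly something like this (entity names hypothetical, and the playing queue itself wouldn’t be captured by the scene, so a full resume might still need help from the integration):

    # Sketch of the forked-daapd TTS flow described above
    script:
      daapd_tts_announce:
        sequence:
          - service: scene.create              # snapshot zone states and volumes
            data:
              scene_id: daapd_before_tts
              snapshot_entities:
                - media_player.forked_daapd_output_kitchen
                - media_player.forked_daapd_output_den
          - service: media_player.media_pause  # pause current playback
            entity_id: media_player.forked_daapd_server
          - service: media_player.turn_on      # enable the announcement zone(s)
            entity_id: media_player.forked_daapd_output_kitchen
          - service: media_player.volume_set
            entity_id: media_player.forked_daapd_output_kitchen
            data:
              volume_level: 0.5
          - service: tts.google_translate_say  # TTS is sent to the main player
            entity_id: media_player.forked_daapd_server
            data:
              message: "Laundry is done"
          - delay: "00:00:05"                  # allow the message to play
          - service: scene.turn_on             # restore zone states
            entity_id: scene.daapd_before_tts
          - service: media_player.media_play   # resume playback
            entity_id: media_player.forked_daapd_server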

From what I just described, I don’t see a need for any additional services except a scene snapshot/restore. Almost everything is handled by existing media player component services. The only open question is how to handle a media player call without an accompanying volume set.

I run HA in Docker. I can install a custom component through the custom_components folder, but I don’t think I can with pip - at least I’ve never done it through pip or used pip. I’m not too worried about the media browser side of things right now.

Sorry if I repeated anything or duplicated what was already said, or if I’m just not getting it and I’m way off base with what we’re talking about.

I think you just described the same thing I described.
The current TTS implementation does its own snapshotting internally for restoring later. The issue is that between snapshotting and restoring it globally replaces the current volume with whatever the default TTS volume is set to (the default is 80%). Having a snapshot save/restore feature and being able to send the snapshot along with the TTS call should provide sufficient functionality.
A few more questions for you:

  1. A scene would be the same thing as a snapshot, right?
  2. Would we provide a limited number of slots or allow the snapshots/scenes to be saved with names?
  3. Should we expose those snapshots/scenes as “Sound modes” so we can switch between them using the media player interface?
As for trying out the new component feature, don’t worry about it yet. After some initial feedback from @davidlb I can push the updated pip library and bump the version used by HA. Then testing the custom component won’t require a custom pip install.
  1. Yes. I just used the existing ability to create scenes as an example.
  2. I don’t think it’s necessary to save the snapshot beyond the TTS call. I think saving the scene as a sound config would be a bonus feature. I was thinking more along the lines of saving the config and holding it for restoring until something overwrites it.
  3. Hmm, I haven’t thought of that one. The existing media player component on the front end just has media-related features. I guess I’d need to know a little more about how the sound modes would be displayed on the front end.

The snapshots would need to be saved in order to be recalled or sent to the TTS service. Say you have 8 zones but are currently only playing something on zones 1 and 2. The current TTS implementation already saves the existing player state as a snapshot beforehand and restores it after the TTS is done. The issue is which configuration of zones to play for the TTS call itself. You might not want it to play on the exact same zones and volumes as are currently playing. In our example, maybe you want to play the TTS on all 8 zones. For that you’d need to have that information (the 8 zones and their associated volumes) stored as a snapshot.

Just for reference, I also have the 80% volume and delay problem with TTS. Other than that, everything seems to be working great.
I wanted to try forked-daapd because the current integration for Denon receivers does not allow direct TTS; going through forked-daapd I can circumvent that.

The 80% volume is not a bug, it was designed that way. If you want to change the set level for now you can go into the integration options and change the TTS volume to your desired level. It’s still a fixed level across all the zones though. I think I will definitely change from this fixed TTS volume to a snapshot method to be more flexible going forward.
For the delay problem, can you tell me when the delay comes? Is it before the text is spoken or after? Does the system eventually return to normal? Do you happen to be changing the volume or other settings while the TTS call is processing?

Thanks @uvjustin for explaining the integration setting! (I overlooked that.)

So the delay is before the actual text is spoken. I enter the text in the media player TTS field and press the Play button, and then it takes about half a minute before the text is spoken. Afterwards the system resets to normal, which is a good thing by the way :slight_smile:

Side note: my only timing reference is between the forked-daapd integration media player and the Chromecast integration media player; I do not have any other TTS-capable devices at home.
The Chromecast one is almost instant.
I would assume forked-daapd needs a bit more time, because it streams to my receiver, and the receiver could still be turned off or set to another channel. But the extra delay seems way off.

Is there any documentation for all this? I would like to set up a simple TTS to a HomePod so that I can send alerts when the alarm is pending and needs to be turned off. Glad to hear that all this work made it into official Home Assistant core, but there’s no documentation on how to use it.
