The automation that plays the sound when the doorbell rings should first save the current state of the media_player. After it has finished playing the doorbell sound, it should restore the media_player’s previous state.
I believe this can be done with a scene, specifically with its snapshot_entities option. It’s explained, along with an example, in the “Creating scenes on the fly” section of the scene documentation.
It would look something like this:
action:
  - service: scene.create
    data:
      scene_id: before_doorbell
      snapshot_entities: media_player.sonos
  # ... Your code to play the doorbell sound ...
  - delay: '00:00:03'
  - service: scene.turn_on
    target:
      entity_id: scene.before_doorbell
I would never have thought of using a scene. It sounds like it works well but…
There are dedicated services for this in the Sonos integration: sonos.snapshot and sonos.restore.
In my experience they have the advantage, albeit a small one, of not needing the delay. They are possibly also more suitable than scenes if you have more than one Sonos and ever group them.
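For comparison with the scene version, here is a rough sketch of the same action sequence using the Sonos integration's sonos.snapshot and sonos.restore services. It reuses the media_player.sonos entity from the example above, which is an assumption about your setup, and the with_group setting is shown only to illustrate the grouping case. Whether you need a wait before the restore step may depend on how the doorbell sound is played, so treat this as a starting point rather than a tested configuration.

```yaml
action:
  # Save what the speaker (and, optionally, its group) is currently doing.
  - service: sonos.snapshot
    data:
      entity_id: media_player.sonos   # assumed entity name, adjust to yours
      with_group: true                # also capture the group layout
  # ... Your code to play the doorbell sound ...
  # Put the speaker back the way it was before the doorbell rang.
  - service: sonos.restore
    data:
      entity_id: media_player.sonos
      with_group: true
```

If you never group your speakers, with_group can be set to false so only the single player's state is captured and restored.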