The automation that plays the sound when the doorbell rings should first save the current state of the media_player. After it has finished playing the doorbell sound it should restore the media_player’s previous state.
I believe this can be done with a scene, specifically with its snapshot_entities option. It’s explained in the scene documentation, under "Creating scenes on the fly", along with an example.
It would look something like this:
action:
  - service: scene.create
    data:
      scene_id: before_doorbell
      snapshot_entities: media_player.sonos
  # ... Your code to play the doorbell sound ...
  - delay: '00:00:03'
  - service: scene.turn_on
    target:
      entity_id: scene.before_doorbell
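For reference, here is a rough sketch of how the whole automation might fit together. It is only an illustration, not tested against your setup: the alias, the trigger entity binary_sensor.doorbell, and the sound file URL are all assumptions that you would replace with your own. The delay should be at least as long as the doorbell sound, so the scene isn't restored while the sound is still playing.

```yaml
automation:
  - alias: "Doorbell sound with restore"   # hypothetical name
    trigger:
      - platform: state
        entity_id: binary_sensor.doorbell  # assumed doorbell sensor
        to: "on"
    action:
      # Save the media player's current state in a temporary scene
      - service: scene.create
        data:
          scene_id: before_doorbell
          snapshot_entities: media_player.sonos
      # Play the doorbell sound (assumed file location)
      - service: media_player.play_media
        target:
          entity_id: media_player.sonos
        data:
          media_content_id: "http://homeassistant.local:8123/local/doorbell.mp3"
          media_content_type: music
      # Wait for the sound to finish (adjust to its length)
      - delay: "00:00:03"
      # Restore the saved state
      - service: scene.turn_on
        target:
          entity_id: scene.before_doorbell
```

How completely the previous state comes back depends on which attributes the media player integration can restore from a scene, so it's worth testing this while something is playing.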