DIY F/LOSS "Sonos-alike" (with Snapcast, Mopidy), and dad jokes via TTS (with ducking/restore)

TRS-80 · August 4, 2024, 10:50pm

This started as a reply to the developer of the Media Player platform for Mopidy, but then got kind of long so I decided to split it off into its own thread.

This post is kind of a rough guide, and touches on a lot of areas (hardware, software, some scripts and automations) so I did not put it into any particular sub-category.

It’s the least I could do.

OK, I think I should probably have a disclaimer, that there are a lot of little pieces to this. But maybe that’s OK for people who are using Mopidy. Also you should be somewhat comfortable at the command line and generally dealing with Unix-y things. “Unix philosophy” is about composing a number of small tools, which each do one thing well, in order to accomplish a greater purpose. Think of it like a box of legos or a model car that you have to put together yourself, rather than a toy car you can buy that is already put together. But this way you can put it together however you like.

I suggest getting each piece working before going on to the next. Each of these pieces has its own documentation, forums, etc., and I am not going to go super in-depth or this will become very long. It’s not too hard but there can be a lot of steps. The pieces are:

Hardware
- Server hardware. I have several Single Board Computers (SBC) running Armbian. These run HA, Mopidy, my NAS, and other services.
- Client hardware. I like HiFiBerry AMP HATs, so I can control the volume in software remotely (as these use ALSA directly). I dislike RPi (generally speaking) but even I cannot seem to get the HATs working on anything but an RPi. Even an old 3 or 3B is fine for something like this and you can get them pretty cheap now. You will spend more on the AMP HAT.
- Nice speakers. This does make a difference. We could have a whole other thread about this, but there are some communities of inexpensive Hi-Fi enthusiasts online; follow the advice of people like that.

Software
- Snapcast - This pipes the audio around the network, in sync. Think of it like DIY Sonos, except F/LOSS and on your own hardware.
  - Snapserver running on the machine with Mopidy. On the NAS is a good place, or else you might have to network mount a drive with your music collection.
  - Snapclient running on the RPi client(s).
- Home Assistant - This is necessary for the announcement automation, the TTS source, and basically tying everything together. Not strictly needed for listening to music (as I feel there are better UIs for Mopidy and Snapcast).
- Mopidy - Will be your music source. Mopidy itself has many plugins for UI, music sources, etc. but you probably already know that.
- Mopidy integration into HA - The thread this post was born out of. Control Mopidy from HA.
- Snapcast integration for HA - Control Snapcast from HA.

I think that’s everything?

Get each of those working by following their docs. But I will highlight some things.

One problem I had was stopping OSMC (Debian based Kodi distro) and snapclient from fighting over the audio device (as they both use ALSA) on my living room client RPi. The solution to this is setting up a ‘dmix’ device in ALSA, which lets several programs share one sound card. You will find a lot about this online, but here is the config that worked for me (taken from here and trimmed down a bit); ALSA reads it from /etc/asound.conf (system-wide) or ~/.asoundrc (per user):

pcm.hifiberry {
  type hw
  card 0
  device 0
}

pcm.dmixer {
  type dmix
  ipc_key 1024
  ipc_perm 0666
  slave {
    pcm "hifiberry"
    channels 2
    # period_time 0
    # period_size 1024
    # buffer_size 8192
    # rate 44100
  }
  bindings {
    0 0
    1 1
  }
}

ctl.dmixer {
  type hw
  card 0
}

pcm.!default {
  type plug
  slave.pcm "dmixer"
}

OK, are you still with me? lol

Finally, with all that working, I could make this script in HA. This is what saves and restores the media player and snapcast states before and after the TTS announcement:

alias: Announce mopidy0 tts (with restore)
sequence:
  - service: snapcast.snapshot
    metadata: {}
    data: {}
    target:
      entity_id:
        - media_player.osmc_snapcast_client
        - media_player.rpi1_snapcast_client
  - service: mopidy.snapshot
    metadata: {}
    data: {}
    target:
      entity_id: media_player.mopidy0
  - service: tts.google_translate_say
    metadata: {}
    data:
      cache: true
      entity_id: media_player.mopidy0
      message: "{{ text_to_speak }}"
    enabled: true
  - delay:
      hours: 0
      minutes: 0
      seconds: 0
      milliseconds: 900
    enabled: true
  - service: media_player.volume_set
    metadata: {}
    data:
      volume_level: "{{ volume }}"
    target:
      entity_id:
        - media_player.osmc_snapcast_client
        - media_player.rpi1_snapcast_client
    enabled: true
  - wait_for_trigger:
      - platform: state
        entity_id:
          - media_player.mopidy0
        to: idle
        for:
          hours: 0
          minutes: 0
          seconds: 2
    timeout:
      hours: 0
      minutes: 1
      seconds: 0
      milliseconds: 0
    continue_on_timeout: false
  - service: snapcast.restore
    metadata: {}
    data: {}
    target:
      entity_id:
        - media_player.osmc_snapcast_client
        - media_player.rpi1_snapcast_client
  - service: mopidy.restore
    metadata: {}
    data: {}
    target:
      entity_id: media_player.mopidy0
    enabled: true
mode: queued
fields:
  text_to_speak:
    selector:
      text:
        multiline: true
    name: Text to speak
    required: false
  volume:
    selector:
      number:
        min: 0
        max: 1
        step: 0.01
    name: Volume
    default: 0.75
    description: >-
      Announcement volume (decimal).  Media state and volume will be restored
      after the announcement.
max: 10

I made it as a script as I now have many different things calling it. I even made another one to play little sounds, for example when an exterior door is opened, etc.
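As a rough sketch of one such caller, an automation that announces an exterior door opening might look like the following. This is only an illustration: binary_sensor.front_door and the announcement text are hypothetical, and it assumes the announce script has the entity ID script.announce_mopidy0 (the name used in the dad joke example in this post).

```yaml
alias: Announce front door opened
description: ""
trigger:
  - platform: state
    entity_id: binary_sensor.front_door   # hypothetical door sensor
    to: "on"
condition: []
action:
  # Hand the text and volume to the announce script, which takes care of
  # the snapshot, TTS, and restore steps.
  - service: script.announce_mopidy0
    data:
      text_to_speak: The front door was opened.
      volume: 0.5   # assumed value, pick whatever suits the room
mode: single
```

Keeping the snapshot/restore logic in one script means each caller only has to supply the two fields.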

Here is an example of a script that calls the announce script:

alias: Tell dad joke in living room
sequence:
  - service: shell_command.get_dad_joke
    data: {}
    response_variable: dad_joke_response
  - service: script.announce_mopidy0
    data:
      volume: "{{ states('input_number.announce_volume') }}"
      text_to_speak: "{{ dad_joke_response['stdout'] }}"
mode: queued

The input_number.announce_volume is just a helper I created to hold a number. I have another automation that runs hourly (at minute 55) and updates it to an appropriate volume for the upcoming hour; the template adds 6 minutes to the current time so that the lookup lands in the next hour:

alias: Announce set volume (hourly)
description: ""
trigger:
  - platform: time_pattern
    minutes: "55"
condition: []
action:
  - service: input_number.set_value
    target:
      entity_id: input_number.announce_volume
    data:
      value: |
        {% set lookup_hour = (now()+timedelta(minutes=6)).hour %}
        {% set lookup_table = {
            0 :0.10,
            1 :0.05,
            2 :0.05,
            3 :0.05,
            4 :0.05,
            5 :0.10,
            6 :0.25,
            7 :0.30,
            8 :0.40,
            9 :0.50,
            10:0.50,
            11:0.50,
            12:0.50,
            13:0.50,
            14:0.50,
            15:0.50,
            16:0.50,
            17:0.50,
            18:0.50,
            19:0.40,
            20:0.40,
            21:0.30,
            22:0.20,
            23:0.20
            } %}
        {{ lookup_table[lookup_hour] }}
mode: single
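For completeness, the input_number.announce_volume helper can be created in the UI (Helpers) or in YAML. A minimal YAML sketch, assuming the same 0 to 1 range and 0.01 step as the script's volume field (the name and mode shown here are just placeholders):

```yaml
# configuration.yaml
input_number:
  announce_volume:
    name: Announce volume   # placeholder display name
    min: 0
    max: 1
    step: 0.01
    mode: slider
```

The entity ID comes from the key (announce_volume), so it lines up with what the automation above writes to and what the calling scripts read.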

OK, for my fellow dad joke aficionados, here is that (shell) command:

shell_command:
  get_dad_joke: 'curl -H "Accept: text/plain" https://icanhazdadjoke.com/'

I had come across a link to that site somewhere, and once I realized they had an API it was all downhill from there.
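If the API is unreachable, the script would announce an empty string. A hedged variation of the dad joke script that stops early in that case is sketched below; it assumes the shell_command response also carries a returncode key next to stdout, so check that against your Home Assistant version before relying on it.

```yaml
alias: Tell dad joke in living room (with failure check)
sequence:
  - service: shell_command.get_dad_joke
    data: {}
    response_variable: dad_joke_response
  # Stop here if curl failed or returned nothing, so no empty
  # announcement interrupts the music.
  - condition: template
    value_template: >-
      {{ dad_joke_response['returncode'] == 0
         and dad_joke_response['stdout'] | trim | length > 0 }}
  - service: script.announce_mopidy0
    data:
      volume: "{{ states('input_number.announce_volume') }}"
      text_to_speak: "{{ dad_joke_response['stdout'] }}"
mode: queued
```

A condition step in the middle of a sequence simply ends the run when it evaluates to false, which is all that is needed here.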

Well, that’s it in broad strokes.

If you really get stuck somewhere, maybe ask a question (on the relevant forum / IRC) and hopefully someone can help you. I don’t log in here all the time. As I said you need to know a little about this stuff or you are going to become very frustrated trying to get this all to work. Take your time, read the docs, proceed slowly, and take detailed notes. Believe me, they will come in handy when you need to revisit some part of your config 2 years later!

Some other things I want to do, going forward:

I read a post from a guy who said he did something like this, except he ramped the music volume down while ramping the announcement volume up. With my approach I had to play with the timings quite a bit (and you may too, as your network and timings might be different). It works pretty well, but sometimes I hear little bits of music here or there. So ramping might be a better way.
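I have not built this yet, so the following is only a sketch of what the fade-out half could look like: a few steps that would go into the announce script after the snapshot calls and before the TTS call. The step count and delay are guesses, only one client is shown, and it assumes the Snapcast client entity exposes a volume_level attribute.

```yaml
# Hypothetical fade-out, placed after the snapshot steps and before the TTS call.
- variables:
    # Read the starting volume once, so later steps do not re-read a value
    # that is changing underneath them.
    start_volume: "{{ state_attr('media_player.rpi1_snapcast_client', 'volume_level') | float(0) }}"
    fade_steps: 5   # assumed number of steps
- repeat:
    count: "{{ fade_steps }}"
    sequence:
      - service: media_player.volume_set
        target:
          entity_id: media_player.rpi1_snapcast_client
        data:
          # repeat.index runs from 1 to fade_steps, so the last step reaches 0.
          volume_level: "{{ (start_volume * (1 - repeat.index / fade_steps)) | round(2) }}"
      - delay:
          milliseconds: 100   # assumed timing, tune for your network
```

Because the snapshot is taken before the fade, the existing restore steps at the end of the script are expected to bring the original volume back.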

Not directly related, but I also want to continue to organize (and expand) my local music library. I just started using MusicBrainz Picard to fix all the metadata, file names, folder organization, etc. It will be a long process, but once organized it opens the door to more voice control and HA being able to find and play certain songs or playlists, etc.

bushvin · August 5, 2024, 8:13am

Awesome write-up!