A Wyoming Satellite Voice Assistant to Control my Home Assistant Smart Home and Play my favourite songs. Go Local be Secure

kuligs2 · August 13, 2024, 8:29pm

Hello, can someone enlighten me on something?

I installed wyoming-satellite on my PC (Arch, btw).

But I noticed that the TTS response, the awake sound, and the other sounds produced by the satellite application only work if nothing is playing in the background (a YouTube video, a music video…). If something is playing, the satellite errors out on the TTS responses…

I have set up MPD (Music Player Daemon), and it can receive TTS while some audio is playing in the background…

I was wondering how I can bind the satellite to send its audio to the MPD server?

Some of you had this line

--snd-command 'paplay --property=media.role=announce --rate=44100 --channels=1 --format=s16le --raw --latency-msec 10' \

The official example is like this:

  --snd-command 'aplay -r 22050 -c 1 -f S16_LE -t raw'

I don’t really understand what it all means, but it has to do with the audio device… but my aplay -L output doesn’t list MPD as an audio device… How do I set up the satellite to send audio to MPD?
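For what it’s worth, a rough reading of those flags: both commands play raw PCM that the satellite writes to their stdin, and -r/--rate, -c/--channels and -f/--format only describe that stream (S16_LE and s16le are both 16-bit little-endian, so 2 bytes per sample). aplay talks to an ALSA device, while paplay hands the audio to the PulseAudio (or compatible) sound server, which can mix it with whatever else is playing. The sketch below is only an illustration under those assumptions; the helper names are made up and are not part of wyoming-satellite.

```shell
#!/usr/bin/env sh
# Sketch only: both helper names are hypothetical.

# Bytes of raw PCM per second: rate (Hz) * channels * bytes per sample.
pcm_bytes_per_second() {
  rate="$1"; channels="$2"; width="$3"
  echo $((rate * channels * width))
}

# Pick a playback command for raw 22050 Hz, mono, 16-bit little-endian audio.
# Prefer paplay (goes through the sound server); fall back to aplay.
pick_snd_command() {
  if command -v paplay >/dev/null 2>&1; then
    echo 'paplay --rate=22050 --channels=1 --format=s16le --raw'
  else
    echo 'aplay -r 22050 -c 1 -f S16_LE -t raw'
  fi
}
```

To try a candidate command by hand, one second of silence in this format is 44100 bytes, for example: head -c 44100 /dev/zero | aplay -r 22050 -c 1 -f S16_LE -t raw. Whichever rate you put in the command has to match what the satellite actually sends.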

ginandbacon · August 29, 2024, 9:17pm

I just pipe all audio output to a smart speaker. Sonos works great, and so does my Sony soundbar. Add the line below to the wyoming-satellite service file:

  --synthesize-command 'examples/commands/synthesize.sh'

Generate a long-lived access token by clicking the user icon at the bottom, then Security, and update the script file referenced above with it. As for running all voice output and music through it, I’m not sure how that would work out. Since that script is defined in the service file, I’m not sure how one would create, say, a switch in HA to toggle it on/off. You would need to SSH in, comment it out, then reload the daemon and restart the service to disable it.
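One possible way around the SSH step, sketched under assumptions: the script could ask Home Assistant for the state of a helper toggle through the REST API and exit early when it is off. The entity input_boolean.satellite_to_speaker, the HA_URL/HA_TOKEN variables and both function names are placeholders, and the state check is a crude text match rather than a real JSON parser.

```shell
#!/usr/bin/env sh
# Sketch: gate the forwarding on a Home Assistant helper instead of editing the service file.
# Assumptions: the toggle entity, HA_URL and HA_TOKEN below are placeholders.

HA_URL="${HA_URL:-http://192.168.0.30:8123}"
HA_TOKEN="${HA_TOKEN:-API_TOKEN}"
TOGGLE="${TOGGLE:-input_boolean.satellite_to_speaker}"

# Reads an /api/states/<entity_id> JSON response on stdin.
# Succeeds only if it contains "state": "on".
state_is_on() {
  grep -q '"state"[[:space:]]*:[[:space:]]*"on"'
}

# Asks Home Assistant for the current state of the toggle.
forwarding_enabled() {
  curl -sS -H "Authorization: Bearer ${HA_TOKEN}" \
    "${HA_URL}/api/states/${TOGGLE}" | state_is_on
}

# Intended use near the top of the synthesize script:
#   forwarding_enabled || exit 0
```

With something like this in place, flipping the helper in the HA UI would decide whether the text gets forwarded, without touching the service file.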

There is a way to do it with a webhook, but I don’t have Telegram set up, so an API key was easier for me personally. You can also use Piper instead of Nabu Casa cloud.

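#!/usr/bin/env sh

# The satellite passes the text to speak on stdin.
text="$(cat)"
echo "Text to speech text: ${text}"

# Long-lived access token from Home Assistant (kept out of the logs on purpose).
token='API_TOKEN'

# Build the JSON payload with jq so quotes and newlines in the text are escaped.
curlData="$(jq -n --arg message "$text" \
  '{entity_id: "media_player.vlc_telnet", message: $message}')"
echo "$curlData" | jq '.'

# Ask Home Assistant to speak the text on the media player.
curl \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $token" \
  -d "$curlData" \
  http://192.168.0.30:8123/api/services/tts/cloud_say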
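For the Piper route mentioned above, a minimal sketch of the same idea using the tts.speak service instead of tts/cloud_say. The entity tts.piper, the media player name, and the HA_URL/HA_TOKEN variables are assumptions to adapt to your own setup, and the function names are made up. The escaping helper is deliberately simple: it handles backslashes and double quotes and folds newlines and tabs into spaces, nothing more.

```shell
#!/usr/bin/env sh
# Sketch: forward TTS text to a local Piper entity through tts.speak.
# Assumptions: tts.piper, media_player.vlc_telnet, HA_URL and HA_TOKEN are placeholders.

HA_URL="${HA_URL:-http://192.168.0.30:8123}"
HA_TOKEN="${HA_TOKEN:-API_TOKEN}"

# Escape backslashes and double quotes; turn newlines, carriage returns and tabs into spaces.
json_escape() {
  printf '%s' "$1" | tr '\n\r\t' '   ' | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the JSON body for the tts.speak service call.
build_payload() {
  printf '{"entity_id": "tts.piper", "media_player_entity_id": "media_player.vlc_telnet", "message": "%s"}' \
    "$(json_escape "$1")"
}

# Send the text to Home Assistant.
speak() {
  curl -sS -X POST \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer ${HA_TOKEN}" \
    -d "$(build_payload "$1")" \
    "${HA_URL}/api/services/tts/speak"
}

# When used as a synthesize command, the text arrives on stdin:
#   speak "$(cat)"
```

Building the payload in a separate function makes it easy to print and inspect the JSON before anything is sent to Home Assistant.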

ginandbacon · August 30, 2024, 1:10am

I’m pretty positive paplay uses PulseAudio. If you just want to get the voice assistant working and are using a 2-mic ReSpeaker, then just copy and paste what’s in the guide. You can sort out all the audio stuff after you get the voice assistant working and showing up in Home Assistant.

kuligs2 · August 30, 2024, 5:51am

Somehow I figured it out, I think… it was weeks ago… and no, I don’t use 2 mics; I have one PC and one mic. I also figured out MPD: you have to run MPD as a user service. Running it as a system service won’t play anything. Also, there is a slight delay when you send the first sound to MPD from HA, and it gets cut off; the second one sends fine and plays fully… I haven’t figured this problem out yet…
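A sketch of the user-service switch described above, for anyone following along. The likely reason it matters is that a sound server such as PulseAudio or PipeWire normally belongs to the logged-in user’s session, so a system-wide MPD often cannot reach it. This assumes the distribution ships a user unit named mpd.service; the run and enable_user_mpd helpers are hypothetical, and setting DRY_RUN only prints the commands.

```shell
#!/usr/bin/env sh
# Sketch: move MPD from a system service to a user service.
# Assumption: a user unit called mpd.service is available.

# Prints the command when DRY_RUN is set, otherwise runs it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

enable_user_mpd() {
  # Stop the system-wide instance so it does not hold the port.
  run sudo systemctl disable --now mpd.service
  # Start MPD inside the user session instead.
  run systemctl --user enable --now mpd.service
}
```

Running DRY_RUN=1 first shows what would be executed before anything on the machine is changed.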

Speaking of which… does anyone know how to train an STT model to be small but precise? There is a Whisper model that is small, loads fast, and converts fast, but it’s not precise. I want a model that is just as fast but precise.

Maybe there is hardware that can do STT on the fly and is more precise than the Whisper models?

ginandbacon · August 31, 2024, 6:04am

Have you tried any of the other built-in models? You can change the model size, but depending on what hardware you’re running HA on, results will obviously vary. If you look at the Whisper add-on’s configuration, it defaults to tiny-int8, which I believe is the smallest Whisper model for the add-on (you may know this already).

This thread also helped me fine-tune openWakeWord: way fewer false positives.

I don’t believe you can train your own actual models. I have a Seeed ReSpeaker Lite on the way, which has an XMOS chip in it for noise/echo cancellation and uses an ESP32-S3. It just doesn’t use ESP-ADF like the S3 Box; it leverages the XMOS chip to do all that. This should help STT, but it’s too early to tell. Nabu Casa has also confirmed that their voice assistant will have an XMOS chip, but that’s about all that’s been said about it. The good thing is you know Nabu Casa’s will just work to the best of its ability. Maybe not from the start, but they will actually work on issues.

My best voice assistant, both distance- and accuracy-wise, is a round USB speakerphone using the Assist Microphone add-on. The obvious downside is that it has to be plugged directly into your HA server and constantly streams for the wake word. My understanding is that the DSP and other noise/echo cancellation technology in a speakerphone just works because it’s built into the device, which is plug and play with any computer, no drivers needed. You can also make it a VLC telnet hub/player so it’s like any casting device for music.

Still, if a TV is on or there are any audible voices, it just doesn’t handle it well; none do in my experience (I also have a Korvo-1 S3 model and 2 Wyoming satellites). Wyoming beats the Korvo, although all work pretty well with no background noise (particularly other human voices), but the conference speakerphone still works best and is super fast. The add-on doesn’t have a ton of configuration options either, which would be nice to have.

The hardest part is going to be voice isolation for STT IMO. I know both Google and Amazon have lost a lot of money (I think billions) on cloud resources for Alexa and Google Assistant. I imagine a lot of that is for voice isolation but obviously I’m also assuming. They are both just remarkably good at isolating the voice that triggered it.

Dark1886 · September 4, 2024, 3:05pm

I have seen someone else mention Nabu Casa creating their own satellite and using an XMOS chip, but what is the source of this? I’d like to stay up to date on what they are doing.

SpencerDub · September 10, 2024, 2:22pm

@Dark1886 This is a post with many links discussing this effort.

The first announcement I saw of the satellite was on The Verge in April of this year.

The satellite was also discussed in the Year of Voice Chapter 7 Livestream, at about 01:58:00.

synesthesiam · September 12, 2024, 9:47pm

It’s important to note that the “ReSpeaker Lite” is a different product than what Nabu Casa is developing, even though they have similar internals.

So we have 3 voice satellites that should be available by the end of the year:

The ReSpeaker Lite by Seeed Studio
The Satellite1 by FutureProofHomes
Nabu Casa’s satellite (codename: VoiceKit)

All three will feature some variant of the ESP32-S3 running ESPHome and the XMOS XU316 chip for audio processing. They will differ, however, in their enclosures, sensors, LEDs, buttons, expansion ports, etc. The hope is that there will be something for everyone.

Sofa_Surfer · October 21, 2024, 8:56am

Hi, I’m building my voice assistant with an RPi 4 and a Raspiaudio Ultra++. I installed the drivers following the instructions on the website (GitHub - waveshareteam/WM8960-Audio-HAT: The drivers of [WM8960 Audio HAT] for Raspberry Pi); in any case it’s a WM8960 sound card, and you can find them easily.
In my case I can install Wyoming and openWakeWord, and it kind of works: I can arecord and aplay, and sometimes my RPi recognises the wake word, but then it gets stuck. I have found that wm8960-soundcard.service fails to start, and it looks like it records the wake word once and then crashes.
The service log shows:

× wm8960-soundcard.service - WM8960 soundcard service
     Loaded: loaded (/etc/systemd/system/wm8960-soundcard.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Fri 2024-10-18 22:33:40 CEST; 3s ago
    Process: 1891 ExecStart=/usr/bin/wm8960-soundcard (code=exited, status=99)
   Main PID: 1891 (code=exited, status=99)
        CPU: 242ms

Oct 18 22:33:38 assistant systemd[1]: Starting wm8960-soundcard.service - WM8960 soundcard service...
Oct 18 22:33:38 assistant wm8960-soundcard[1891]: + exec
Oct 18 22:33:38 assistant wm8960-soundcard[1892]: ++ basename /usr/bin/wm8960-soundcard
Oct 18 22:33:40 assistant systemd[1]: wm8960-soundcard.service: Main process exited, code=exited, status=99/n/a
Oct 18 22:33:40 assistant systemd[1]: wm8960-soundcard.service: Failed with result 'exit-code'.
Oct 18 22:33:40 assistant systemd[1]: Failed to start wm8960-soundcard.service - WM8960 soundcard service.

I am not sure if I’ve done something wrong; at the moment I think it has nothing to do with HA yet.
Does anyone have advice on how to deal with this problem?

Later this evening I can post the Wyoming service config.
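Since arecord and aplay reportedly work even though the service fails, one cheap check is whether the kernel has registered the card at all. The sketch below assumes the card id contains "wm8960" (the exact id may differ on your system) and that /proc/asound/cards is available; card_present is a hypothetical helper name.

```shell
#!/usr/bin/env sh
# Sketch: check whether a sound card is registered with ALSA.
# Assumption: the card id contains the string passed as the first argument.

# Succeeds if NAME appears in the cards file (default: /proc/asound/cards).
card_present() {
  name="$1"
  cards_file="${2:-/proc/asound/cards}"
  grep -q -- "$name" "$cards_file"
}

# Example:
#   card_present wm8960 && echo "card registered" || echo "card missing"
```

If the card is listed, the failing service might not be what blocks the satellite, but that is a guess; journalctl -u wm8960-soundcard.service -b --no-pager should show the full log for the current boot either way.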

pcwii · October 23, 2024, 10:13am

I had a read of the repo you provided and the first thing I noticed was that it is recommended to manually build the application. Did you do that?
The other thing I noticed was that there are a few open issues that describe an issue with the application exiting prematurely. I suggest reading through the open and closed issues and see if there are any tips that get you going. The last advice I can give is to open your own issue for the developer to review and possibly comment on. Let us know how you make out.

Sofa_Surfer · October 23, 2024, 2:26pm

@pcwii thank you for the reply. I am not a programmer; this is just my hobby. I have double-checked all the issues, and it seems there are some very similar ones open. I will try to find a workaround, or carry on with Wyoming satellite and the HA integration if the sound card service is not “essential”, since I could aplay and arecord.
I will update this.

Sofa_Surfer · November 8, 2024, 8:05am

Could you please help me or share your config?
I am stuck here:

Nov 08 08:56:46 assistant run[3769]: INFO:root:Ready
Nov 08 08:56:46 assistant run[3769]: DEBUG:root:Detected IP: 20.20.1.12
Nov 08 08:56:46 assistant run[3769]: DEBUG:root:Zeroconf discovery enabled (name=dca6328950b2, host=None)
Nov 08 08:56:46 assistant run[3769]: DEBUG:root:Connecting to mic service: ['arecord', '-D', 'plughw:CARD=seeed2micvoicec,DEV=0', '-q', '-r', '16000', '-c', '1', '-f', 'S16_LE', '-t', 'raw']
Nov 08 08:56:46 assistant run[3769]: DEBUG:root:Connecting to snd service: ['aplay', '-D', 'plughw:CARD=seeed2micvoicec,DEV=0', '-q', '-r', '22050', '-c', '1', '-f', 'S16_LE', '-t', 'raw']
Nov 08 08:56:46 assistant run[3769]: DEBUG:root:Connecting to wake service: tcp://127.0.0.1:10400
Nov 08 08:56:46 assistant run[3769]: INFO:root:Connected to services
Nov 08 08:56:46 assistant run[3769]: DEBUG:root:Using webrtc audio enhancements
Nov 08 08:56:46 assistant run[3769]: DEBUG:root:Connected to mic service
Nov 08 08:56:46 assistant run[3769]: DEBUG:root:Connected to wake service
Nov 08 08:56:58 assistant run[3769]: DEBUG:root:Server set: 70639530781646
Nov 08 08:56:58 assistant run[3769]: INFO:root:Connected to server
Nov 08 08:56:58 assistant run[3769]: DEBUG:root:Running ['examples/commands/streaming_stop.sh']
Nov 08 08:56:58 assistant run[3783]: Audio streaming to server has stopped
Nov 08 08:56:58 assistant run[3769]: INFO:root:Waiting for wake word
Nov 08 08:56:58 assistant run[3769]: DEBUG:root:Started recording to /home/hass/wyoming-satellite/local/debug-recording/70639535185675-wake.wav
Nov 08 08:57:00 assistant run[3769]: DEBUG:root:Ping enabled
Nov 08 08:58:39 assistant run[3769]: DEBUG:root:Stopped recording to /home/hass/wyoming-satellite/local/debug-recording/70639535185675-wake.wav
Nov 08 08:58:39 assistant run[3769]: DEBUG:root:Started recording to /home/hass/wyoming-satellite/local/debug-recording/70639535185675-stt.wav
Nov 08 08:58:39 assistant run[3769]: DEBUG:root:Detection(name='ok_nabu_v0.1', timestamp=70740442340518, speaker=None)
Nov 08 08:58:39 assistant run[3769]: DEBUG:root:Streaming audio
Nov 08 08:58:39 assistant run[3769]: DEBUG:root:Event(type='run-pipeline', data={'start_stage': 'asr', 'end_stage': 'tts', 'restart_on_end': False, 'snd_format': {'rate': 22050, 'width': 2, 'channels': 1}}, payload=None)
Nov 08 08:58:39 assistant run[3769]: DEBUG:root:Running ['examples/commands/detection.sh']
Nov 08 08:58:39 assistant run[3789]: Wake word detected: ok_nabu_v0.1
Nov 08 08:58:39 assistant run[3769]: DEBUG:root:Muting microphone for 0.8995918367346939 second(s)
Nov 08 08:58:39 assistant run[3769]: DEBUG:root:Running ['examples/commands/streaming_start.sh']
Nov 08 08:58:39 assistant run[3792]: Audio streaming to server has started
Nov 08 08:58:39 assistant run[3769]: DEBUG:root:Connected to snd service
Nov 08 08:58:40 assistant run[3769]: DEBUG:root:Unmuted microphone

my config:

[Unit]
Description=Wyoming Satellite
Wants=network-online.target
After=network-online.target
Requires=wyoming-wakeword.service

[Service]
Type=simple
User=hass
Environment=XDG_RUNTIME_DIR=/run/user/1000
ExecStart=/home/hass/wyoming-satellite/script/run \
    --name 'Assist' \
    --uri 'tcp://0.0.0.0:10700' \
    --mic-command 'arecord -D plughw:CARD=seeed2micvoicec,DEV=0 -q -r 16000 -c 1 -f S16_LE -t raw' \
    --snd-command 'aplay -D plughw:CARD=seeed2micvoicec,DEV=0 -q -r 22050 -c 1 -f S16_LE -t raw' \
    --awake-wav '/home/hass/wyoming-satellite/sounds/awake.wav' \
    --done-wav '/home/hass/wyoming-satellite/sounds/done.wav' \
    --timer-finished-wav '/home/hass/wyoming-satellite/sounds/done.wav' \
    --mic-noise-suppression 2 \
    --mic-auto-gain 5 \
    --wake-uri 'tcp://127.0.0.1:10400' \
    --wake-word-name 'ok_nabu' \
    --debug \
    --debug-recording-dir '/home/hass/wyoming-satellite/local/debug-recording' \
#    --mic-seconds-to-mute-after-awake-wav 0.1 \
    --wake-refractory-seconds 10 \
    --tts-stop-command 'true' \
    --detection-command 'examples/commands/detection.sh' \
    --startup-command 'examples/commands/startup.sh' \
    --streaming-start-command 'examples/commands/streaming_start.sh' \
    --streaming-stop-command 'examples/commands/streaming_stop.sh' \
    --synthesize-command 'examples/commands/synthesize.sh' \
    --stt-start-command 'examples/commands/stt_start.sh' \
    --stt-stop-command 'examples/commands/stt_stop.sh' \
    --tts-start-command 'examples/commands/tts_start.sh' \
    --tts-stop-command 'examples/commands/tts_stop.sh'
#    --event-uri tcp://127.0.0.1:10500 
WorkingDirectory=/home/hass/wyoming-satellite
Restart=always
RestartSec=1

[Install]
WantedBy=default.target

Even though the satellite logs the debug message “Audio streaming to server has started”, in HA I still can’t see “assist in progress”… (but the mute button is working)
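One thing that may be worth ruling out in a long ExecStart like this is a flag given twice: in the config above, --tts-stop-command appears on two lines, and which value wins depends on the program’s argument parser. A small sketch for spotting that; duplicate_flags is a made-up helper, and it only looks at --long-options on lines that are not commented out.

```shell
#!/usr/bin/env sh
# Sketch: list --long-options that occur more than once in a unit file.

duplicate_flags() {
  # Drop comment lines, extract every --option name, print the repeated ones.
  grep -v '^[[:space:]]*#' "$1" \
    | grep -oE -e '--[a-z][a-z0-9-]*' \
    | sort \
    | uniq -d
}

# Example (the path is an assumption):
#   duplicate_flags /etc/systemd/system/wyoming-satellite.service
```

This does not explain the missing “assist in progress” state by itself; it only removes one source of ambiguity before digging further.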

sayam · November 11, 2024, 10:53am

I made my own personal Voice Assistant using the Wyoming Protocol with Pi Zero 2W and a ReSpeaker 2Mic HAT for use with Home Assistant.

It’s got on-device wake word detection, volume ducking, multi-room audio support using Snapcast, and TTS. I used an old Beats Pill as the audio output, which sounds way better than the cheaper DIY options!

I have made an in-depth step-by-step tutorial if anyone’s interested in making their own.

You can check it out: GitHub - sayam93/Pi-Voice-Assistant

Luiblonc · November 13, 2024, 5:16am

Hello, I ran into this nice project on YT, which basically led me here. I have a Pi 5 lying around and also the ReSpeaker 2-mic Pi HAT, and I followed the instructions from the GitHub for the Wyoming Satellite driver and all of the dependencies, but for some reason my Pi 5 does not want to capture audio when I run the arecord test (test.wav). I also used ChatGPT to assist me in this process, and it took me on a wild loop with nothing resolved. My question is: is this project only compatible with the Pi W? Thanks for any feedback!

phil.rendell · November 14, 2024, 1:33pm

I have it running on a Pi 3 B; however, I used a separate mic and speaker using this install process. Try this thread to see if it helps resolve the issues: https://github.com/HinTak/seeed-voicecard/issues/19

chezpaul2 · December 3, 2024, 10:50pm

I have my wyoming-satellite service running on a Raspberry Pi Zero 2 with a ReSpeaker 2-Mic HAT, with the 2mic_leds.service also running. I did not install the Wyoming openWakeWord service because it works fine right now without it. On my Home Assistant I can access the openWakeWord add-on and change settings from there. The whole thing works great right now. So my question to you guys is: “Why would I need to install the openWakeWord service on the Pi Zero?” (My HA is running on an Intel i7, so it has plenty of power, more so than a Pi Zero 2, right?)
I was having problems with the install so I tried without it.

Another question: is there a way to not have to say “Hey Jarvis” when I’m answering a question right away?

So I say: “Hey Jarvis, what’s the temperature in the home?”
It answers: bla bla bla
And I want to say right away: “okay, then turn the thermostat on”

But right now, I have to say Hey Jarvis each time.
What’s the option to change this?

florian-asche · December 15, 2024, 4:26pm

Hi all,

not sure if this is interesting, but I am currently working on a fork of the rhasspy wyoming-satellite. I created a Docker build on GitHub, and you can set up the voice satellite, the 2mic_hat LEDs, and the wake word with just one docker-compose file.

Meaning you just need to install the 2mic_hat driver, and then you can deploy docker-compose and the satellite is ready.

I hesitated to post this because the documentation is still a work in progress.

In the future I also want to merge some PRs from the main repo that look really promising. If Nabu Casa continues the work on the main branch, I may merge my effort.

Have a look at GitHub - florian-asche/pi-voice-assistant: HomeAssistant voice satellite using Wyoming protocol.

sgobat · January 17, 2025, 9:13pm

Shouting out this project by @dreed47

Makes setting up a Wyoming Satellite a breeze! Looking forward to the docker setup from @florian-asche as well.

We need more automated installers like these that also add enhancements like audio ducking, pulse audio, etc.

RobertSorgenfrei · January 18, 2025, 5:12pm

hi @ginandbacon

I’m from Germany and haven’t really found anything on this topic in our forum.

I’m not a programmer, but I’ve successfully got a Wyoming Satellite running. Now I’d really like to send the voice output to my Sonos speakers.

I don’t have the know-how for that…

I think I understand that I need to add the following code to my setup here…
Is that right?

I also know how to create a token in HA, but where can I find the script below? Or where do I have to create it? Even better would be a step-by-step guide from there.

ginandbacon:

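#!/usr/bin/env sh

# The satellite passes the text to speak on stdin.
text="$(cat)"
echo "Text to speech text: ${text}"

# Long-lived access token from Home Assistant (kept out of the logs on purpose).
token='API_TOKEN'

# Build the JSON payload with jq so quotes and newlines in the text are escaped.
curlData="$(jq -n --arg message "$text" \
  '{entity_id: "media_player.vlc_telnet", message: $message}')"
echo "$curlData" | jq '.'

# Ask Home Assistant to speak the text on the media player.
curl \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $token" \
  -d "$curlData" \
  http://192.168.0.30:8123/api/services/tts/cloud_say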

I would be extremely grateful for your help.

Thank you
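On where the script lives: going by the --synthesize-command 'examples/commands/synthesize.sh' line and the WorkingDirectory setting seen elsewhere in this thread, the path is relative to the wyoming-satellite folder. A hedged sketch of putting your edited copy in place; install_synthesize_script and the file name my_synthesize.sh are made up for illustration, and the copy overwrites any example script already at that path.

```shell
#!/usr/bin/env sh
# Sketch: copy an edited script to examples/commands/synthesize.sh and make it executable.
# Assumption: the second argument is the folder where wyoming-satellite was installed.

install_synthesize_script() {
  src="$1"            # the script you saved and edited (token, entity_id, IP address)
  satellite_dir="$2"  # the wyoming-satellite folder
  dest="${satellite_dir}/examples/commands/synthesize.sh"
  mkdir -p "${satellite_dir}/examples/commands"
  cp "$src" "$dest"
  chmod +x "$dest"
  echo "$dest"
}

# Example:
#   install_synthesize_script ./my_synthesize.sh "$HOME/wyoming-satellite"
```

After that, the service file needs the --synthesize-command line in its ExecStart, followed by sudo systemctl daemon-reload and a restart of the satellite service (often wyoming-satellite.service, but the unit name depends on how it was installed).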

RobertSorgenfrei · January 19, 2025, 3:37pm

Hey, I installed the satellite with the script, but I can’t open the menu with the command “m” or “menu”.

Can you help me?