Year of the Voice - Chapter 5

I just flashed a first-gen ESP32-S3-Box with the firmware from the ESPHome Projects site, and all went perfectly. It has been added to the ESPHome Integration page and works great, but it doesn’t show up in the ESPHome Dashboard. Is there something additional that needs to be done for this to occur? I’m wondering if it’s because my ESPs are all on a separate VLAN from HA, so they don’t automatically get picked up. Is there a way to force it? This is the first time I’ve not created an ESPHome device from straight code in the ESPHome Dashboard (that I can think of).

The adoption process won’t start without working mDNS. I had the same problem, not because of a separate VLAN, but because ESPHome was set up for ping-based status instead of mDNS. As soon as I switched that off, the device appeared for adoption.

But you can flash it locally with copy-pasted code from GitHub if you want to tinker :slight_smile:
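
For reference, the ping-based status mentioned above is a dashboard setting. Here is a minimal sketch, assuming ESPHome runs as a Home Assistant add-on: the add-on configuration has a `status_use_ping` option, and turning it off puts the dashboard back on mDNS for device status. The option name is as I understand it from the ESPHome documentation, so check it against your installed version.

```yaml
# ESPHome add-on configuration (YAML view of the add-on's Configuration tab)
# false: the dashboard relies on mDNS to detect devices and their status
# true:  the dashboard pings devices instead
status_use_ping: false
```

For a standalone container, I believe the equivalent is the `ESPHOME_DASHBOARD_USE_PING` environment variable.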

https://www.home-assistant.io/blog/2023/12/13/year-of-the-voice-chapter-5/community-wake-words

doesn’t work …

fwartner/home-assistant-wakewords-collection: Community Collection of Wake-Words for Home Assistant (github.com)

Hope we will soon see ready-made smart speaker hardware projects that use a Raspberry Pi Zero 2 W (or alternative SoCs) in a form factor similar to the Google Nest / Google Home speaker series and offer great sound quality (while still being available at a reasonable price).

Would also be awesome to be able to use a house full of such Home Assistant satellite speakers as a wireless multiroom speaker system with the ability to create groups and synchronize streaming for music playback in high-resolution audio quality such as lossless 24-bit FLAC at 96 or 192 kHz.

Alternatively, a replacement PCB that converts existing products, like Onju-Voice does:


You can already do that, but you lose the ready-made-config experience. Here are the steps:

  • install ESPHome as an add-on or separate container
  • get a VA config for your board from here or write your own
  • replace speaker with media_player in the voice_assistant configuration
  • use this for voice_assistant.on_tts_end
voice_assistant:
  on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.your_desired_speaker # <- change this
          media_content_id: !lambda 'return x;' # x is the URL of the TTS audio
          media_content_type: music
          announce: "true"
  • flash the config to your board
  • go to HA, import the new device via the ESPHome integration if not already there
  • in Integrations > ESPHome > click Configure next to your device and allow it to make HA service calls

Now when you use the ESP device as a microphone, the output will be routed to the media_player of your choice


Thought this was pretty slick: https://www.youtube.com/watch?v=VaQkc-sgc04

PCB swap in Google Home mini to run ESPHome


I found that the word “nevermind” hangs Assist. In text mode it shows “…”, and the next request goes through normally. However, on the S3 Box it hangs in the “thinking” phase and never comes back. Further requests don’t work. I tried to mute/unmute with the programmatic button; it shows the “idle” state but doesn’t react to the wake word.

Also, the hardware Mute button on the S3 Box leads to the same result: it becomes unresponsive after unmuting.

In both these situations, the only way back is a reset.

Awesome! When the ESP32-S3-BOX-3 was called out in the last update, I figured something like this was in the works and grabbed myself one of the devices. This morning, when I had a spare 15 minutes, I plugged it in to my computer and flashed the firmware. Couldn’t have been easier—now I have a little voice assistant sitting on my desk!

It’s great to hear that Mike will continue on the team! This has been a phenomenal and exciting year—we started with almost no voice control, and now I can have a fully-local smart voice assistant integrated with HA. That’s rad, and a testament to his hard work. Congrats, Mike, and thanks for all you’ve done in 2023!

My big hope for 2024 and further development of Assist is that HA can get smarter about intent matching. The Willow project is working on what they call “Willow Auto Correct”, which allows the server to learn from variance in voice commands, so users don’t have to be so precise with their language. That would be a big step up for the usability of HA’s voice functions.


This is awesome! Thanks for sharing; replying on a different media player was at the top of my wishlist. Do you know if there’s a way to do it with the new Raspberry Pi wyoming-satellite firmware? I’d imagine it’d be a tweak on the
--snd-command 'aplay -r 22050 -c 1 -f S16_LE -t raw'
portion of the command.

Sorry, I haven’t played around with wyoming-satellite yet.

Congrats on an amazing achievement. If Nabu Casa develops matching hardware, many others and I would instantly buy it (please do). We need an open-source, 3D-printable smart speaker with the sound quality of a HomePod and the freedom of HA.

OK, thanks. I have mine set up for ping-based status as well; otherwise they all show offline due to the VLAN. I guess I need to create an mDNS rule between the VLANs. I’ll look into it.

That’s probably actually the easier option!

Sorry for being so late getting this reply posted.

Dave, my tutorial WAS for chapter 4.
Today is chapter 5 … which appears to use the new wyoming-satellite per the “remote satellites” link mentioned in Year of the Voice - Chapter 5.

I’m guessing that Mike has been flat out this last week making sure everything is ready for us today.

I noticed several seemingly minor updates by @synesthesiam in 2023.12.2, but none that seemed to include this new functionality.
I had been expecting updates to several of the HA integrations … but maybe they have been in the code for a while and just need wyoming-satellite to use them. Now to re-do my test RasPi.

Definitely. The available options (Raspberry Pi, conferencing speakers, Onju-voice and ESP32-S3 BOX) are getting better, but they are still noticeably lower quality and more expensive than Alexa/Google devices.
ESP32-S3 BOX seems a bit of an overkill (I would be happy with a couple of LEDs instead of a screen) and I understand some of its desirable audio processing features are still proprietary software.

I second this! I appreciate that there has been such a lot of behind-the-scenes work in HA for the voice assist to hook into.
Mike, I hope that working for Nabu Casa has improved your work/family/life balance, and that you will enjoy a well deserved break this holiday season.


Definitely overkill having a screen and low-quality audio. I have just set up the first-gen ESP32-S3-Box since I had one running Willow previously.

I have some ESP32s that were just acting as Bluetooth proxies, and I’m trying to add INMP441 microphones to them, but so far I haven’t had success getting the microphones to work. Not sure what is going on. Once I figure that out, they will be super cheap options for voice control, just without the audio feedback at this stage. It’s easy to add a speaker, but not all that necessary.
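
For anyone trying the same thing, here is a minimal sketch of an INMP441 on a classic ESP32 using ESPHome's `i2s_audio` component. The GPIO numbers and the `mic_i2s` id are placeholders (assumptions, not taken from the post above), and the right `channel` value depends on how the module's L/R pin is wired, so that is one setting worth checking if the microphone stays silent.

```yaml
# Shared I2S bus pins (placeholder GPIOs, adjust to your wiring)
i2s_audio:
  i2s_lrclk_pin: GPIO25   # INMP441 WS
  i2s_bclk_pin: GPIO26    # INMP441 SCK

microphone:
  - platform: i2s_audio
    id: mic_i2s           # hypothetical id, referenced below
    adc_type: external    # external I2S microphone, not the internal ADC
    i2s_din_pin: GPIO33   # INMP441 SD
    pdm: false            # the INMP441 is a standard I2S mic, not PDM
    channel: left         # assumes L/R is tied to GND; try right if silent
    bits_per_sample: 32bit

voice_assistant:
  microphone: mic_i2s
```

With no speaker configured, the device works as a microphone only, which matches the setup without audio feedback described above.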

I have 3 Fire tablets in my home that I use with HA and Alarmo/Konnected. They have microphones, and I have disabled Alexa on them. Would it be possible to use them as sources for voice control of HA? Is there a tutorial somewhere?

I’m not familiar with Fire tablets but if they run Android you should be able to set the default Voice Assistant to HA Assist… I think. I know I can on my regular Android devices. It was shown in one of the previous Year of the Voice chapters.

Thanks, Don! Nabu Casa has been a wonderful place to work. I’ve put in a lot of hours, but all because I wanted to :slightly_smiling_face:

I will be taking time off for the rest of the calendar year. Of course, that probably means I’ll try to relax for two days and then work on a project :stuck_out_tongue:

This is what I did for my Pi Zero 2 W satellite, just LEDs for feedback: wyoming-satellite/docs/tutorial_2mic.md at 95f2e58c3a4d4d91aaebbbd3784ed7c7c6e631c2 · rhasspy/wyoming-satellite · GitHub


I’m probably having other issues because I can’t trigger the new intents even from the text prompt 😢

I get even less!

EDIT: after waiting a while I got:

EDIT 2: more tests…