Year of the Voice - Chapter 5

We’ve reached the end of Home Assistant’s Year of the Voice! It was our goal for 2023 to let users control Home Assistant by speaking in their own language.

At the start of 2023, Home Assistant had basic text-based control for some devices in English only. As the year closes, users can now control and ask questions of their smart homes with voice in over 50 languages across a variety of devices, including:

  • Any ESPHome device with a microphone
  • Android phones, tablets, and smart watches
  • Old school analog phones (with an adapter)

Home Assistant users can now create multiple voice assistants by mixing and matching components of a voice “pipeline”. Home Assistant Cloud subscribers automatically gain access to high-quality voice components in over 130 languages and dialects. Fully local components are available as well, such as our Piper text-to-speech system, allowing for 100% offline voice control.
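
To make the "mix and match" idea concrete, here is a minimal sketch of a pipeline as three swappable stages: speech-to-text, intent handling, and text-to-speech. This is an illustration only and not Home Assistant's actual code; `VoicePipeline`, `fake_stt`, `fake_intents`, and `fake_tts` are hypothetical names, and the toy stages simply pass text around as bytes.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class VoicePipeline:
    """Hypothetical pipeline model: each stage is a plain, swappable function."""

    name: str
    speech_to_text: Callable[[bytes], str]
    handle_intent: Callable[[str], str]
    text_to_speech: Callable[[str], bytes]

    def run(self, audio: bytes) -> bytes:
        # Audio in, text through the middle, audio out.
        text = self.speech_to_text(audio)
        reply = self.handle_intent(text)
        return self.text_to_speech(reply)


# Toy stages that stand in for real engines (assumptions for illustration).
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_intents(text: str) -> str:
    if text == "turn on the lights":
        return "Turned on the lights"
    return "Sorry, I didn't understand"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


local = VoicePipeline("local", fake_stt, fake_intents, fake_tts)

# A second assistant that reuses two stages and swaps only the last one.
shouty = replace(
    local,
    name="shouty",
    text_to_speech=lambda text: text.upper().encode("utf-8"),
)

print(local.run(b"turn on the lights"))
print(shouty.run(b"turn on the lights"))
```

Because each stage is just a field, a second assistant can share most components with the first and differ in only one, which is the gist of having multiple voice assistants.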

In Chapter 4, we added wake word processing directly into Home Assistant by leveraging the openWakeWord project. This allowed tiny voice satellites such as the M5 ATOM Echo Development Kit to offload wake word detection by streaming audio to a Home Assistant server. The community has been hard at work training a variety of custom wake words that everyone can use to make their voice experience unique.

For the final chapter of 2023, we have expanded the available types of voice commands to include weather, temperature, and to-do lists. Voice satellites are now aware of which area they’re in, and more hardware/software options are available too.

Happy holidays!

Assist running on the ESP32-S3-BOX-3.

The S3-BOX-3

Espressif recently released the ESP32-S3-BOX-3, an update of the discontinued ESP32-S3-BOX (and “lite” variant). This “AIoT” development kit contains an ESP32-S3 chip, dual microphones, a small speaker, and a screen. Several docks are available in the box, which expose a USB-C power connector and GPIO pins for expanding the device.

Assist running on the ESP32-S3-BOX-3 with custom artwork.

The ESPHome team has been hard at work adding support for the S3-BOX-3, including the ability to customize the display! Check out the S3-BOX-3 tutorial to get started.

Spend the holidays with Frenck as your voice assistant.

More voice commands

Starting all the way back in Chapter 1, we added voice commands for:

  • Turning lights and other devices on and off
  • Opening and closing doors, windows, etc.
  • Setting the brightness and color of lights
  • Adding items to a shopping list
  • Asking questions, such as which windows are open in an area

For Chapter 5, we’ve extended this list to include:

  • Adding items to a to-do list - “add take out the garbage to my task list”
  • Getting the inside temperature - “what’s the temperature?”
  • Getting the current weather condition - “what’s the weather like?”
  • Canceling a satellite wake-up - “never mind”

Make sure you’ve exposed the devices you want Assist to have access to, and that they are named properly. You can always add an alias when you’d like to refer to a device by something more convenient for voice. For example, adding an alias “Berlin” to a weather entity would allow you to say “what’s the weather like in Berlin?”.
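
As a rough sketch of why exposure and aliases matter, the lookup below only considers exposed entities and treats an alias exactly like a name. It is not Home Assistant's real matching logic; the `Entity` class, the `find_by_spoken_name` helper, and the entity IDs are hypothetical, and exact case-insensitive matching is an assumption made to keep the example short.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical stand-in for an entity; not Home Assistant's own class."""

    entity_id: str
    name: str
    aliases: list[str] = field(default_factory=list)
    exposed: bool = True


def find_by_spoken_name(entities: list[Entity], spoken: str) -> Entity | None:
    """Return the first exposed entity whose name or alias matches, ignoring case."""
    wanted = spoken.strip().casefold()
    for entity in entities:
        if not entity.exposed:
            continue  # Entities that are not exposed are never matched.
        names = [entity.name, *entity.aliases]
        if any(wanted == candidate.casefold() for candidate in names):
            return entity
    return None


entities = [
    Entity("weather.home", "Forecast Home", aliases=["Berlin"]),
    Entity("light.desk", "Desk Lamp", exposed=False),
]

match = find_by_spoken_name(entities, "berlin")
print(match.entity_id if match else "no match")
```

In this sketch, "Berlin" resolves to the weather entity through its alias, while "Desk Lamp" finds nothing because that entity is not exposed.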

Area awareness

Voice satellites can be placed all around the house, and it’s important to keep their area in mind when interpreting commands like “turn on the lights”. This command will now turn on all of the lights in the satellite’s area, and “turn off the lights” will do the opposite. You can still target the lights in a different area, of course, by specifying: “turn on the lights in the bedroom”.

Voice satellites make use of the area they're in.

This is a small start to satellites being aware of their context, and adjusting behavior accordingly.
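
The rule described above can be sketched in a few lines: an area named in the command wins, and otherwise the satellite's own area is used. This is a simplified model rather than the actual implementation; `Light`, `lights_to_control`, and the entity IDs are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Light:
    """Hypothetical light with the area it is assigned to."""

    entity_id: str
    area: str


def lights_to_control(
    lights: list[Light],
    satellite_area: str,
    spoken_area: str | None = None,
) -> list[str]:
    """Pick target lights: an explicitly spoken area overrides the satellite's area."""
    area = spoken_area if spoken_area is not None else satellite_area
    return [light.entity_id for light in lights if light.area == area]


lights = [
    Light("light.kitchen_ceiling", "kitchen"),
    Light("light.kitchen_counter", "kitchen"),
    Light("light.bedroom_lamp", "bedroom"),
]

# "turn on the lights", spoken to a satellite in the kitchen
print(lights_to_control(lights, satellite_area="kitchen"))

# "turn on the lights in the bedroom", spoken to the same satellite
print(lights_to_control(lights, satellite_area="kitchen", spoken_area="bedroom"))
```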

Improved Raspberry Pi satellites

To date, Raspberry Pi-based voice satellites have used Home Assistant’s WebSocket API. This had several limitations, such as requiring an API token, not knowing which area the satellite was in, and not being able to configure it in Home Assistant’s UI.

We’ve extended the Wyoming integration to communicate directly with remote satellites. These satellites are automatically discovered, and can be configured much like ESPHome-based satellites with the ability to set an area and voice pipeline.

Several satellite modes are supported, including:

  • Always-on streaming - satellite streams all audio to Home Assistant
  • Stream on speech - only stream audio once speech is detected
  • Local wake word - only stream audio when a wake word is detected locally
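
The difference between the three modes comes down to what has to happen on the satellite before it starts sending audio. The sketch below models only that decision; `SatelliteMode` and `should_start_streaming` are hypothetical names and do not mirror the satellite software's real settings.

```python
from enum import Enum


class SatelliteMode(Enum):
    """Hypothetical labels for the three modes listed above."""

    ALWAYS_STREAMING = "always-on streaming"
    STREAM_ON_SPEECH = "stream on speech"
    LOCAL_WAKE_WORD = "local wake word"


def should_start_streaming(
    mode: SatelliteMode,
    speech_detected: bool,
    wake_word_detected: bool,
) -> bool:
    """Decide whether the satellite should send audio to Home Assistant."""
    if mode is SatelliteMode.ALWAYS_STREAMING:
        return True  # Everything is streamed; the server does all the work.
    if mode is SatelliteMode.STREAM_ON_SPEECH:
        return speech_detected  # Silence stays on the satellite.
    return wake_word_detected  # Only audio after a local wake word is sent.


# Someone is talking near the satellite, but has not said the wake word.
for mode in SatelliteMode:
    decision = should_start_streaming(
        mode, speech_detected=True, wake_word_detected=False
    )
    print(f"{mode.value}: {decision}")
```

Moving down the list, more of the work happens on the satellite and less audio reaches the server.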

Audio cleanup, such as automatic gain control and noise suppression, may be done in Home Assistant or on the satellite. A Raspberry Pi Zero 2 W has more than enough power to do local audio cleanup and wake word detection, allowing you to have many satellites without straining your Home Assistant server. Reuse your old Raspberry Pis, and start your journey with smart home voice control!

Raspberry Pi Zero 2 W (MSRP: $15 USD).

Stay tuned

Although the Year of the Voice is coming to a close, voice in Home Assistant is just getting started! I, Mike “The Voice” Hansen, will continue at Nabu Casa to improve and extend the voice and natural language capabilities of Home Assistant.

On the roadmap for next year, we’re planning things like local wake word detection on the S3-BOX-3, and integration with large language models (LLMs) like GPT. We’re also still on the hunt for the perfect voice satellite hardware: inexpensive with great audio, but also capable of running open source wake word models locally.

Thank you

Thank you to the Home Assistant community for subscribing to Home Assistant Cloud to support Year of the Voice and development of Home Assistant, ESPHome and other projects in general.

Thanks to our language leaders for extending the sentence support to all the various languages.


This is a companion discussion topic for the original entry at https://www.home-assistant.io/blog/2023/12/13/year-of-the-voice-chapter-5

Thanks to all of the Voice Assist team members and contributors. This project really got me even more engaged with Home Assistant at home.

Happy Holidays everybody and have a sparkling 2024! :sunglasses:

I have a couple of old Pi Zeros (version 1) sitting unused. It’s noted above that the v2 can be used as a voice satellite. Any idea if the v1 can as well, or is it not powerful enough?

They are powerful enough for basic functionality but will be unable to perform local wake word detection. This means they will rely on Home Assistant for wake word detection, like ESPHome devices currently do.

Ok cool. Thanks Paulus.

Thank you great people!
The Box works well.
I ordered a Pi and the recommended HAT. Super excited to ditch Alexa everywhere! :)))

Two quick questions:

  1. How do I use area awareness in custom intent_script?
  2. Where to get Pi satellite firmware? :slight_smile:

Thanks again!

You’re the best!

@sparkydave, 5 days ago @synesthesiam commented over on GitHub for homeassistant-satellite:

Yes, wyoming-satellite will replace this project eventually. It’s not fully stable yet, so expect some issues.

I note that in the first message above, the remote satellites link goes to GitHub for wyoming-satellite … so @formatBCE, the answer is to follow that link :wink:

Looks like I’ll have to edit my RasPi as HA Voice Assist post :frowning:

So is your tutorial not for the setup that was shown in the HA Year of the voice chapter 4?

Can somebody please help me find instructions on how to enable the new goodies, i.e. the new built-in voice commands? I’m running a local pipeline which can turn lights on and off and execute custom sentences; however, I keep getting ‘unexpected error during intent recognition’ when I check the temperature or the weather, or try ‘never mind’. I’m probably missing a link :slight_smile:

I think those were listed as ‘in the pipeline’ as future updates in 2024.

I probably misunderstood this, but :slight_smile:

For Chapter 5, we’ve extended this list to include:

Great work this year!

Simplest for me would be the possibility to redirect the audio reply (TTS) to a selected speaker (media_player) that I already have in the room. That way, I could use the ESP microphone unit I already have (where the audio playback is not sufficient in quality or volume).

I guess, like you mentioned, there hasn’t yet been any update to the voice assistant back-end add-ons, so I’m not sure how these additional features could exist.

All of the chapter 5 commands work for me.
With the exception of the first one as I don’t have a “task list”.
I do have a “grocery list” and using “add milk to the grocery list” does work.
I am running 2023.12.2

I just tested the shopping list voice command and can confirm this worked for me too.

I just flashed a first-gen ESP32-S3-BOX with the firmware from the ESPHome Projects site, and all went perfectly. It has been added to the ESPHome integration page and works great, but it doesn’t show up in the ESPHome Dashboard. Is there something additional that needs to be done for this to occur? I’m wondering if it’s because my ESPs are all on a separate VLAN from HA, so they don’t automatically get picked up. Is there a way to force it? This is the first time (that I can think of) that I’ve created an ESPHome device without writing the code directly in the ESPHome Dashboard.