Year of the Voice - Chapter 4: Wake words

Yeah, I saw that on stream. I’m talking about making it record a command right away, as if someone had already said a wake word, irrespective of whether wake word detection is on or off.

For Rhasspy users: the equivalent of calling /api/listen-for-command.
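
For example, with Rhasspy’s default web port (host and port assumed):

import requests  # POST to Rhasspy's documented HTTP API
requests.post("http://localhost:12101/api/listen-for-command")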

I suppose I could build a simple Wyoming event emitter that sends a wake event in response to a certain nudge, but something built-in would be nicer. Doesn’t seem complicated. Might be useful as a debugging facility as well.
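
Something like this, perhaps (an untested sketch, assuming the wyoming Python package and a peer on the given host/port that would act on an unsolicited detection; normally a wake service only sends this after matching streamed audio):

# Untested sketch: construct and send a Wyoming wake "detection" event.
# Host, port, and wake word name are placeholders.
import asyncio
from wyoming.client import AsyncTcpClient
from wyoming.wake import Detection

async def main() -> None:
    async with AsyncTcpClient("localhost", 10400) as client:
        # Pretend audio just matched the wake word "ok_nabu".
        await client.write_event(Detection(name="ok_nabu").event())

asyncio.run(main())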

Congrats on this huge milestone! I’ve just ordered an M5Stack Atom Echo to experiment with.

As of now, the overall process is still a bit complicated for beginners (different add-ons and integrations need to be installed).

As a next step, it would be nice if the onboarding could be more streamlined, for example with a wizard that leads you through all the steps (installing the containers and integrations in the background) and explains when to use openWakeWord versus Porcupine. Just one flow where an end user can click next → choice, next → choice, next → done.

2 Likes

I have not been able to test it yet, but I have ordered an M5Stack Atom Echo to do so; until then I cannot tell for certain.
When I look at the log of the container it says: INFO:root:Ready.
So I guess it will be okay.

1 Like

How could I use my HA host itself as a voice assistant?
It is in a practical place anyway, and I have a USB mic and a speaker.

Did you figure this out?


Awesome progress! I was initially sceptical about the concept of moving wake word detection into the HA instance, but my initial tests this morning were promising :) The setup itself was straightforward, including pushing the assistant onto the Atom Echo.

It does, however, behave a bit unstably so far, requiring restarts of the Atom Echo after it gets “stuck”, but I guess that’s normal for a first release. (Details: Wake word is rolled out? - #22 by Lakini)

We still only have ‘real world’ benchmarks that feed direct, unprocessed audio to Porcupine; compared with commercial results, that is not the same ‘real world’ as the accuracy of Google or Amazon devices.
Open source sets itself up for a fall, as bring-your-own-mic creates a much wider scope than big data face, since they dictate the hardware.
We also completely lack any decent audio preprocessing: in real use the main sources of noise are other media, and they are dynamic, while static noise is actually a rarity.
Big data goes one step further with profile-based, targeted voice extraction, and the difference in use can be huge.

It’s great that you can train your own wake word and post benchmarks, but real-world use means comparison with the commercial wake words people have actually used. We are missing so much, from the initial open-source audio pipeline to the huge, high-quality datasets they hold, that the results are very likely to compare poorly; the Porcupine benchmarks are already far below the accuracy Google and Alexa units achieve.

Even the beamforming on commercial units is targeted: it locks onto the voice for that command sentence, and simply using a conference speakerphone with no target is not the same.
The mics we have very rarely support far-field use, as the expected setup is broadcast-style.
In commercial setups it is not just the hardware: the algorithms dictate it too, so the models are trained on the exact signature of the device, and on top of that they use transfer learning from captured use to increase accuracy.

We still have a hole in the initial audio processing, and bring-your-own-device compounds it, so in use the results are likely to be similar to what we have had before, which is quite a way off what commercial units achieve.

Some will likely enjoy building and testing, while many who use or have used a commercial system may not, or may not consider it worthwhile.
If the thread fills up with enthusiastic users, then maybe; but more likely things will stay as they are, with a few hobbyists.

You could do on-device training: transfer learning from a large pretrained model into a smaller model biased towards captured use would likely bridge the gap (a rough sketch below).
Home Assistant could also, as with its Green/Yellow devices, create a wake-word-detection device and release open-hardware documentation to dictate the datasets of use, closing the gap even further.
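
To make the first idea concrete, here is an illustrative transfer-learning sketch; the pretrained embedding model and the captured-use feature files are hypothetical placeholders, not an existing project:

# Illustrative only: fine-tune a small head on a frozen, pretrained
# speech-embedding model using clips captured in real use.
import numpy as np
import tensorflow as tf

base = tf.keras.models.load_model("kws_embedding")  # hypothetical pretrained model
base.trainable = False  # keep the large pretrained weights fixed

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # wake word vs. everything else
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])

x = np.load("captured_features.npy")  # spectrogram features from captured use
y = np.load("captured_labels.npy")    # 1 = wake word, 0 = other audio
model.fit(x, y, epochs=5, batch_size=16, validation_split=0.2)
model.save("kws_personalized")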

The gap is big, in both software and hardware, and where open source has a hole it still doesn’t seem to have been addressed…

Mass adoption and happy users will be the test, but I have a hunch on how that will go.

1 Like

It’s a work in progress, not a finished project, and I am curious what the next chapter will bring.
I’m impressed with what has been achieved in a short time, and with the fact that you can run it locally.

After some initial challenges I was able to get the wake word to work on the M5Stack Atom; however, this morning when I came back it no longer responded. I tried restarting it, but no luck. I saw there was a beta update for ESPHome, so I installed that, but unfortunately that didn’t help either. I tried restarting HA, and now my wake-word integration within Wyoming is failing. I only see this in the logs:

Logger: homeassistant.bootstrap
Source: bootstrap.py:508
First occurred: 8:27:42 AM (4 occurrences)
Last logged: 8:30:42 AM

Waiting on integrations to complete setup: wyoming

It never starts up. Any ideas on how to get past this? Do I need to uninstall the integration?

HA finally restarted, and now I also see this error:

Logger: homeassistant.config_entries
Source: config_entries.py:399
First occurred: 8:31:52 AM (1 occurrences)
Last logged: 8:31:52 AM

Error setting up entry openWakeWord for wyoming
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/config_entries.py", line 399, in async_setup
    result = await component.async_setup_entry(hass, self)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/wyoming/__init__.py", line 23, in async_setup_entry
    service = await WyomingService.create(entry.data["host"], entry.data["port"])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/wyoming/data.py", line 38, in create
    info = await load_wyoming_info(host, port)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/wyoming/data.py", line 57, in load_wyoming_info
    async with AsyncTcpClient(host, port) as client, asyncio.timeout(timeout):
  File "/usr/local/lib/python3.11/site-packages/wyoming/client.py", line 35, in __aenter__
    self._reader, self._writer = await asyncio.open_connection(
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/streams.py", line 48, in open_connection
    transport, _ = await loop.create_connection(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/base_events.py", line 1069, in create_connection
    sock = await self._connect_sock(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/base_events.py", line 973, in _connect_sock
    await self.sock_connect(sock, address)
  File "/usr/local/lib/python3.11/asyncio/selector_events.py", line 628, in sock_connect
    return await fut
           ^^^^^^^^^
asyncio.exceptions.CancelledError: Global task timeout

What a great feature!!!

By the way, would it be possible to stream the answer (if any) to other multimedia devices (such as Sonos)? The reason is that the sound quality and power of a device such as an ESP32 would probably not be the best. This might also help spare some ESP32 CPU resources for locally detecting the wake word. On top of that, it’s more sustainable to reuse existing speakers than to add new ones.

Using an ESP32 as the wake word capture device and a Sonos (or another smart speaker) to play the TTS answer might be interesting in some users’ configurations, as sketched below.
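
Until something like that is built in, one rough workaround is to push the response text to the speaker with a TTS service. A minimal sketch against Home Assistant’s REST API, where the URL, token, entity ID, and the google_translate TTS service are assumptions for the example:

# Illustrative only: speak a response on a Sonos player via a TTS service.
import requests

HA_URL = "http://homeassistant.local:8123"
TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"

requests.post(
    f"{HA_URL}/api/services/tts/google_translate_say",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"entity_id": "media_player.sonos_living_room",
          "message": "The front door is locked."},
    timeout=10,
)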

Thanks.

11 Likes

Came here to say this. It’s likely to get buried in this thread; maybe open a separate feature request and I will upvote it!

I like the idea of having simple microphone ESP32s hidden in sneaky places and then having the replies come out of specific multimedia devices or Google Home Minis.

Edit: I opened a feature request.

4 Likes

M5Stack does not have any customer service to speak to: they don’t answer emails, my parcel is lost, and I am losing my money.
Where can I get the ATOM Echo Smart Speaker Development Kit in the EU?

I plugged in my USB microphone and installed the Assist Microphone add-on, as per the section “Turn Home Assistant into a voice satellite”. In the config it is asking for the microphone name or number; how can I find this? Or is there something more I should be doing to get the microphone to work with HA? It is showing up in the hardware list.

1 Like

Are there plans to add support for generic microphone/camera streams, as well as custom output devices?

It would be nice to have a simple UI where you select the device to listen on (microphone, camera, etc.), then choose the device to output sound on.

2 Likes

Make sure to go to Devices & Services and configure the discovered openWakeWord.

You shouldn’t need to mess with that. I will be removing those options in the next update, since there are already audio drop-downs at the bottom that let you pick the default mic/speaker.

Looks like it lost connection to the add-on or Docker container running the wake word service.
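
A quick way to confirm is to check whether the service’s TCP port is reachable at all (a minimal sketch; 10400 is the usual wyoming-openwakeword default, so adjust the host and port to your setup):

# Quick connectivity check for the Wyoming service; host and port assumed.
import socket

try:
    with socket.create_connection(("homeassistant.local", 10400), timeout=5):
        print("openWakeWord port is reachable")
except OSError as err:
    print(f"Cannot reach openWakeWord: {err}")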

Thanks, I jumped over to Discord and was able to get some help resolving this. FYI, it took a restart of the entire system.

1 Like

Tinytronics in NL does have them in stock…

You need the Assist Microphone add-on.