Year of the Voice - Chapter 4: Wake words

Hi Pepe59

On your dedicated Home Assistant satellite, just install Raspberry Pi OS and follow the procedure explained here:
synesthesiam/homeassistant-satellite: Streaming audio satellite for Home Assistant (github.com)

On your HA, install the openWakeWord, porcupine1, and/or snowboy add-on.


Thanks, I’ll try it.

So the link I posted above seems to work with the IP Webcam Android app, which I was using for Frigate anyway. Wake word support is pending, however.

The best thing is you can have it respond rudely too, for that particular wake word.

Hi, the processing times for the M5 Echo on my Raspberry Pi 4 are lengthy (like 37 seconds).
I’ve created an issue for this:

https://github.com/home-assistant/core/issues/102461


Hi,
Just tried to train a new wake word, trying to get a “frenchy” pronunciation for Alexa.
The script errored out after 15 minutes, saying something about not being able to find and open a file…
I think I’ll have to wait for things to become a bit more straightforward.
It works great otherwise (using an Atom Echo), but having to pronounce the wake words with an English accent doesn’t feel natural ^^

When using Home Assistant Core, how do we point it to a Docker container running openWakeWord?
It doesn’t look like you can at the moment.

This is how I’m running Piper and Whisper:
Home Assistant, Piper, and Whisper run in separate Docker containers, as Core does not have “add-ons”.
But I can still go to Home Assistant > Settings > Integrations > Add, and it lets me select Piper/Whisper and input the IP:port of the corresponding container.
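
For anyone setting this up the same way, this is roughly what those two services look like in a compose file (a sketch based on the rhasspy/wyoming-piper and rhasspy/wyoming-whisper images; the ports, voice, and model names are the documented defaults, so double-check against each image’s README):

  piper:
    container_name: piper
    image: rhasspy/wyoming-piper
    volumes:
      - ./config/piper:/data
    command: --voice en_US-lessac-medium
    restart: unless-stopped
    ports:
      - 10200:10200   # Wyoming port for TTS

  whisper:
    container_name: whisper
    image: rhasspy/wyoming-whisper
    volumes:
      - ./config/whisper:/data
    command: --model tiny-int8 --language en
    restart: unless-stopped
    ports:
      - 10300:10300   # Wyoming port for STT

Each one then gets added through that same dialog with its own IP:port.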

openwakeword, however, does not show in the Add Integration list…

Go to your ‘Wyoming Protocol’ integration under ‘Devices and Services’ and ‘add entry’ with the details of your openwakeword container…

Cheers.


While testing further, I have another observation. During the livestream, it was mentioned that false positives did not occur during testing.

In my testing, I do get false positives. This is while using the ATOM Echo M5 development kit. This happens both with openWakeWord 1.8.0 and with Porcupine 1.0.0 (both set to “Alexa”). All settings are left at their defaults.

The typical scenario in which the false positives occur is watching TV (Formula 1): two false positives in one hour.

Hello

A French user like me, with a pitiful English accent? :wink:
You can use the new snowboy add-on (thanks Mike!) and train a language-free custom wake word.
hassio-addons/snowboy at master · rhasspy/hassio-addons (github.com)

You can add a 2nd stage verifier that will reduce false positives.
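
A second-stage verifier helps, but you can also make detection itself stricter. A rough sketch for a Docker setup (hedged: the --threshold and --trigger-level flags come from the rhasspy/wyoming-openwakeword image and mirror the add-on’s options; the values here are untested guesses):

  openwakeword:
    image: rhasspy/wyoming-openwakeword
    # Higher threshold plus two required activations = fewer false
    # positives from the TV, at the cost of more missed wake-ups.
    command: --preload-model 'alexa' --threshold 0.7 --trigger-level 2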

As far as I know, they didn’t say you would not get false positives. See GitHub - dscripka/openWakeWord: An open-source audio wake word (or phrase) detection framework with a focus on performance and simplicity.

Basically it’s this TensorFlow Hub model from a Google paper: [2002.01322] Training Keyword Spotters with Limited and Synthesized Speech Data

It’s basically a way to create a KWS when you don’t have good-quality datasets, and it will never compete with commercial KWS such as Alexa/Google, as they do have the good-quality datasets.

It does well against Porcupine, but Porcupine is very much in the same area: an easy custom KWS model that performed well against many obsolete KWS (Snowboy and others), but against commercial KWS its results are far weaker.

The same is also true of Whisper: for a limited set of languages the Large model posted SotA results, but this is not true of how it is often being used, with the Tiny model, where WER skyrockets.

I guess it is what it is…

Oh nice!!
My English accent is not bad, but it is more a matter of WAF ^^
Will try this add-on btw, and many thanks.

If you can find a French-language TTS model, or create a dataset with one, then it would likely be better.
The keyword could likely be the same, but the language model will bring through the accent more.


Thanks for replying. Regarding the (absence of) false positive wake word detection, my expectations were based on this part of the video: https://www.youtube.com/live/YzgYYkOrnhQ?si=LzCIib2LOcvInfLP&t=1446

My experience has been very different.

Yeah, apparently “It’s a technological marvel that is created with 4 goals in mind”, but hey, nothing like blowing your own trumpet (some have ribs removed to achieve it).
From Mycroft to Rhasspy we have some very enthusiastic hobbyists doing some great work, and a lot of it.
When you compare it to big data, though, it is trailing quite a long way behind.

I have been trying to advocate how easy it is to capture quality data, and how dictating the hardware is a huge advantage that big data has.
A project can roll out and dictate a KW, then quickly capture a quality dataset of real use, where users opt in to send collated packages of data with gold-standard metadata.
What we have is more a demonstration of what could be done than what users of commercial voice systems have come to expect.

Open source has been nothing short of a disaster in the discipline of applying quality datasets and metadata, and the CommonVoice initiative was likely the biggest waste of Mozilla funds ever.
It is riddled with wrongly labelled data and much of it is bad, but it also almost totally lacks quality metadata (a few items have it).
Region, age group, gender, native speaker, and mic hardware (and maybe a few others) are all that is needed; we do not need your name or home address, just metadata, to provide specific language and accent models for KWS & ASR.
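
To make that concrete, the per-clip record could be as small as this (a hypothetical schema sketched for illustration, not any existing CommonVoice or MLCommons format):

  # Hypothetical per-clip metadata for an opt-in voice dataset
  clip: kw_0001.wav
  keyword: "hey jarvis"          # example wake word, not a real collated KW
  region: en-GB
  age_group: 30-39
  gender: male
  native_speaker: true
  mic_hardware: "M5 Atom Echo"
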
The same goes for the MLCommons initiative, which merely force-aligned CommonVoice: at a guesstimate it contains more non-native spoken word than native, with near-zero metadata, so you are unable to filter for specifics.
It just didn’t happen and hasn’t happened, and strangely, open source hasn’t started collating usage data via an opt-in.
Custom KWs have also been created, so the herd will never be able to collate a single KW in any quantity.

When an ASR command sentence activates a skill, without an instant stop or the same skill being instantly repeated, you can likely assume the KW & ASR command sentences were good.
Open source could be creating gold-standard datasets through use, closing the gap via an opt-in collation of only KWs and command sentences, with ever-evolving models based on them shipped OTA.

I noticed stutter when using en-GB. en-US was fine.

This is how I did it.

  openwakeword:
    container_name: openwakeword
    image: rhasspy/wyoming-openwakeword
    volumes:
      # /custom is where user-trained wake word models go
      - ./config/openWakeWord/config:/config
      - ./config/openWakeWord/data:/data
      - ./config/openWakeWord/custom:/custom
    environment:
      - TZ=America/Los_Angeles
    restart: unless-stopped
    # preload the default wake word and scan /custom for extra models
    command: --preload-model 'ok_nabu' --custom-model-dir /custom
    ports:
      - 10400:10400       # Wyoming protocol (TCP)
      - 10400:10400/udp
Then, as @Fraddles says, enter the IP address of your Docker container. I couldn’t get it to work with a non-default port, FWIW, but there’s no conflict with 10400 on my host.
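
For the record, the non-default port attempt was just the standard Docker remap, sketched below with a hypothetical host port 10401 (the container side stays on 10400), with the Wyoming entry then pointed at 10401:

    ports:
      - 10401:10400       # host 10401 -> container's default 10400
      - 10401:10400/udp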

Do you mean here?

I have tried changing options there, but I am finding the stutter isn’t consistent, so it’s hard to test. This is why I thought it might be a hardware resource constraint that was coming and going.

Same problem here. Did you find a solution?

INFO ESPHome 2023.10.1
INFO Reading configuration /config/esphome/m5stack-atom-echo-8a1468.yaml...
INFO Updating https://github.com/esphome/esphome.git@pull/5230/head
INFO Generating C++ source...
Traceback (most recent call last):
  File "/usr/local/bin/esphome", line 33, in <module>
    sys.exit(load_entry_point('esphome', 'console_scripts', 'esphome')())
  File "/esphome/esphome/__main__.py", line 1036, in main
    return run_esphome(sys.argv)
  File "/esphome/esphome/__main__.py", line 1023, in run_esphome
    rc = POST_CONFIG_ACTIONS[args.command](args, config)
  File "/esphome/esphome/__main__.py", line 454, in command_run
    exit_code = write_cpp(config)
  File "/esphome/esphome/__main__.py", line 190, in write_cpp
    return write_cpp_file()
  File "/esphome/esphome/__main__.py", line 208, in write_cpp_file
    writer.write_cpp(code_s)
  File "/esphome/esphome/writer.py", line 342, in write_cpp
    copy_src_tree()
  File "/esphome/esphome/writer.py", line 295, in copy_src_tree
    copy_files()
  File "/esphome/esphome/components/esp32/__init__.py", line 593, in copy_files
    repo_dir, _ = git.clone_or_update(
  File "/esphome/esphome/git.py", line 95, in clone_or_update
    old_sha = run_git_command(["git", "rev-parse", "HEAD"], str(repo_dir))
  File "/esphome/esphome/git.py", line 32, in run_git_command
    raise cv.Invalid(err_str)
voluptuous.error.Invalid: fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

I did, actually. I just tried again a day or so later and it worked fine. I think there was just an issue with it accessing the GitHub repo.