Year of the Voice - Chapter 4: Wake words

I guess I must be missing something here. I already had the Whisper and Piper add-ons installed, so I have just installed the OpenWakeWord one, but it doesn’t show up under the Voice Assistant settings for HA Assist:

@sparkydave Did you install the integration for it?

Hi, I watched the video overview of this feature and have a question. Is it technically possible to add the wake word functionality to the Assist app so that, if given the appropriate permissions to “listen” via the phone’s microphone, your smartphone can listen for the wake word?

Also curious if anyone has tried to “flash” an existing smart speaker to run this feature?

Thank you!

That was the issue, thanks.

Next thing: the list of wake words available from openwakeword is short (only 5 options), not the full batch shown in the announcement video. I thought it was supposed to come with the full list?

There’s a “use wake word” switch for each microphone that you can use to enable and disable wake word monitoring. You can also call the switch.turn_on or switch.turn_off service, so you can control this from an automation, e.g. turn off the wake word switch if the media_player is playing or if a remote button is pressed.

Please excuse the fuzzy image, it was taken from the video stream at low resolution as I have not set this up yet:

[Image: screenshot from the “Year of the Voice - Chapter 4” YouTube stream, 2023-10-13]
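
To make the automation idea above concrete, here is a minimal sketch of a pair of automations that pause wake word monitoring while media is playing and resume it afterwards. The entity IDs (`media_player.living_room`, `switch.esp32_s3_box_use_wake_word`) are assumptions for illustration; substitute the ones that appear in your own installation.

```yaml
# Sketch only: both entity IDs below are hypothetical placeholders.
automation:
  - alias: "Pause wake word while media is playing"
    trigger:
      - platform: state
        entity_id: media_player.living_room
        to: "playing"
    action:
      - service: switch.turn_off
        target:
          entity_id: switch.esp32_s3_box_use_wake_word

  - alias: "Resume wake word when media stops"
    trigger:
      - platform: state
        entity_id: media_player.living_room
        from: "playing"
    action:
      - service: switch.turn_on
        target:
          entity_id: switch.esp32_s3_box_use_wake_word
```

The same pattern should work for a remote button: swap the trigger for whatever event or state change your remote exposes.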

Hi Mike and everyone, great work! Implementing wakeword on the HA machine certainly makes super cheap satellites like ATOM Echo feasible … at the expense of multiple audio streams over wi-fi. Unfortunately some of us have many neighbors, all using the same limited number of wi-fi channels, which makes it impossible to know how congested the radio spectrum is :frowning:

As a current Rhasspy 2.5 user, I have held off on implementing HA voice assistant until wakeword detection was available. Rhasspy already had the option to do wakeword on the server … and turning my current satellites (RasPi 3A, 3B and Zero) into homeassistant-satellite with external wakeword seems like a backwards step.

Looks like I’ll have to go back and catch up on Wyoming, Piper, etc., and try to work out how best to integrate with Rhasspy and HA voice. Am I better off scrapping Rhasspy and swapping to HA Voice assist (sounds easiest)? Or is there an upgrade path which integrates the best of Rhasspy (or at least local wakeword) with HA Voice?

Either way, I am still looking forward to ESP32 S3 satellite devices with local wakeword detection :wink:

Does anyone know where the ReSpeaker driver they mention in the video is?

@bbrendon AFAIK the latest reSpeaker driver is available from HinTak’s fork.

I set up a couple of reSpeaker HATs with Rhasspy 2 years back, but haven’t looked for updates since then … so my understanding may be out of date.

Seeed (in their infinite wisdom) stated “For hardware testing purposes we made a Raspberry Pi OS 5.10.17-v7l+ 32-bit image with reSpeaker drivers pre-installed”, which is now over 5 years old, and they didn’t update their driver.
One user, HinTak, took it upon himself to keep the driver building against newer Raspbian kernels, without otherwise changing the code. He recently commented "upstream seeed studio has stopped updating beyond about 5.13, so won’t build against newer kernels. The solution to that is to use one of my v5.x/v6.x branches. (It gets quite messy with a lot of “if kernel version <…” “>=” , so I make new branches when difference is too large for such constructs).

The new problem since early April is that raspberrypi people decided that 64-bit kernel on 32-bit system is better than 32-bit kernel on 32-bit system. (Better compiler optimization and resource usage etc). But that specifically does not work for out-of-tree device drivers. They are still working on it - raspberrypi/firmware#1795 - but until they figure it out, you can add one line in /boot/config.txt to go back to a 32-bit kernel on 32-bit system setup."

Note also that the reSpeakers I use are Raspberry Pi HATs, but some reSpeaker devices connect by USB … so I assume they use different drivers.

Does anyone know where to place the .tflite wake word model that is created when running HA in docker? The instructions indicate “share/openwakeword” but in docker I don’t have this.

The YouTube video showed that you have to create the openwakeword directory under /share/

Yea, that’s the issue. Those instructions are for HAOS I believe. Docker is different, I think I need a new mapped volume.
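
For a plain Docker setup, one possible approach is a mapped volume plus a custom model directory. The sketch below assumes the `rhasspy/wyoming-openwakeword` image accepts a `--custom-model-dir` argument and that the host folder `./openwakeword-models` (a hypothetical path) holds your `.tflite` files; check the image's own documentation before relying on either.

```yaml
# Sketch only: host path and container path are assumptions, adjust to taste.
services:
  openwakeword:
    container_name: openwakeword
    image: rhasspy/wyoming-openwakeword:latest
    restart: unless-stopped
    ports:
      - "10400:10400"
    volumes:
      # Host folder containing the custom .tflite wake word models
      - ./openwakeword-models:/custom
    # Tell the service where to look for the custom models inside the container
    command: ["--custom-model-dir", "/custom"]
```

After restarting the container, the custom wake word should appear in the wake word list if the model was picked up.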

I installed ESPHome on an esp32-s3-box (not esp32-s3-box-3), using ESPHome beta 2023.10.0b2 with this code https://github.com/esphome/firmware/blob/11ad9052a71285cf7912138acd03016b03270eb5/voice-assistant/esp32-s3-box.yaml but the wake word doesn’t work. I have installed openwakeword and the Wyoming protocol. What should I do? I think it’s properly installed, since the log keeps showing me this:

INFO Successfully connected to esp32-s3-box.local
[14:15:03][I][app:102]: ESPHome version 2023.10.0b2 compiled on Oct 13 2023, 14:14:17
[14:15:03][I][app:104]: Project esphome.voice-assistant version 1.0
[14:15:03][C][wifi:546]: WiFi:
[14:15:03][C][wifi:382]:   Local MAC: [redacted]
[14:15:03][C][wifi:383]:   SSID: [redacted]
[14:15:03][C][wifi:384]:   IP Address: [redacted]
[14:15:03][C][wifi:386]:   BSSID: [redacted]
[14:15:03][C][wifi:387]:   Hostname: 'esp32-s3-box'
[14:15:03][C][wifi:389]:   Signal strength: -62 dB ▂▄▆█
[14:15:03][C][wifi:393]:   Channel: 1
[14:15:03][C][wifi:394]:   Subnet: 255.255.255.0
[14:15:03][C][wifi:395]:   Gateway: 192.168.1.1
[14:15:03][C][wifi:396]:   DNS1: 192.168.1.1
[14:15:03][C][wifi:397]:   DNS2: 0.0.0.0
[14:15:03][C][logger:361]: Logger:
[14:15:03][C][logger:362]:   Level: DEBUG
[14:15:03][C][logger:363]:   Log Baud Rate: 115200
[14:15:03][C][logger:365]:   Hardware UART: USB_SERIAL_JTAG
[14:15:03][C][gpio.binary_sensor:015]: GPIO Binary Sensor 'Mute'
[14:15:03][C][gpio.binary_sensor:016]:   Pin: GPIO1
[14:15:03][C][ledc.output:164]: LEDC Output:
[14:15:03][C][ledc.output:165]:   Pin GPIO45
[14:15:03][C][ledc.output:166]:   LEDC Channel: 0
[14:15:03][C][ledc.output:167]:   PWM Frequency: 1000.0 Hz
[14:15:03][C][ledc.output:168]:   Bit depth: 14
[14:15:03][C][gpio.binary_sensor:015]: GPIO Binary Sensor 'Top Left Button'
[14:15:03][C][gpio.binary_sensor:016]:   Pin: GPIO0
[14:15:03][C][light:103]: Light 'LCD Backlight'
[14:15:03][C][light:105]:   Default Transition Length: 0.0s
[14:15:03][C][light:106]:   Gamma Correct: 2.80
[14:15:03][C][template.switch:068]: Template Switch 'Use wake word'
[14:15:03][C][template.switch:091]:   Restore Mode: restore defaults to ON
[14:15:03][C][template.switch:057]:   Optimistic: YES
[14:15:03][C][psram:020]: PSRAM:
[14:15:03][C][psram:021]:   Available: YES
[14:15:03][C][psram:024]:   Size: 8191 KB
[14:15:03][C][mdns:115]: mDNS:
[14:15:03][C][mdns:116]:   Hostname: esp32-s3-box
[14:15:03][C][ota:097]: Over-The-Air Updates:
[14:15:03][C][ota:098]:   Address: esp32-s3-box.local:3232
[14:15:03][C][ota:101]:   Using Password.
[14:15:03][C][api:138]: API Server:
[14:15:03][C][api:139]:   Address: esp32-s3-box.local:6053
[14:15:03][C][api:141]:   Using noise encryption: YES
[14:15:17][D][voice_assistant:189]: VAD detected speech
[14:15:17][D][voice_assistant:313]: State changed from 4 to 5
[14:15:17][D][voice_assistant:319]: Desired state set to 7
[14:15:17][D][voice_assistant:206]: Requesting start...
[14:15:17][D][voice_assistant:313]: State changed from 5 to 6
[14:15:17][D][voice_assistant:334]: Client started, streaming microphone
[14:15:17][D][voice_assistant:313]: State changed from 6 to 7
[14:15:17][D][voice_assistant:319]: Desired state set to 7
[14:15:17][D][esp-idf:000]: I (34645) wifi:
[14:15:17][D][esp-idf:000]: <ba-add>idx:1 (ifx:0, 90:9a:4a:81:1a:d5), tid:6, ssn:3, winSize:64
[14:15:17][D][esp-idf:000]: 
[14:15:17][D][esp-idf:000]: I (34968) wifi:
[14:15:17][D][esp-idf:000]: <ba-del>idx
[14:15:17][D][esp-idf:000]: 
[14:15:17][D][esp-idf:000]: I (34977) wifi:
[14:15:17][D][esp-idf:000]: <ba-del>idx
[14:15:17][D][esp-idf:000]: 
[14:15:17][D][esp-idf:000]: I (35004) wifi:
[14:15:17][D][esp-idf:000]: <ba-add>idx:0 (ifx:0, 90:9a:4a:81:1a:d5), tid:0, ssn:82, winSize:64
[14:15:17][D][esp-idf:000]: 

Not on the blog, only on the forum

I’m planning on flashing one of those either tonight or over the weekend. I’ll see how I go with mine.

Can you please format your log report using the code formatting tool? It’s horrible to read in its current state.

Fixed. I appreciate it.

So does it work for you? Maybe you could use the same docker image which is used by the add-on?
homeassistant/amd64-addon-openwakeword
https://registry.hub.docker.com/r/homeassistant/amd64-addon-openwakeword

I think the team have done some amazing work here.

I think it may be a good idea for the team to think about producing their own satellite hardware.

Unfortunately I can’t even get mine to compile…

error log
INFO ESPHome 2023.10.0b2
INFO Reading configuration /config/esphome/esp32-s3-box-1.yaml...
INFO Updating https://github.com/esphome/esphome.git@pull/5230/head
WARNING GPIO0 is a Strapping PIN and should be avoided.
Attaching external pullup/down resistors to strapping pins can cause unexpected failures.
See https://esphome.io/guides/faq.html#why-am-i-getting-a-warning-about-strapping-pins
WARNING GPIO45 is a Strapping PIN and should be avoided.
Attaching external pullup/down resistors to strapping pins can cause unexpected failures.
See https://esphome.io/guides/faq.html#why-am-i-getting-a-warning-about-strapping-pins
INFO Generating C++ source...
Traceback (most recent call last):
  File "/usr/local/bin/esphome", line 33, in <module>
    sys.exit(load_entry_point('esphome', 'console_scripts', 'esphome')())
  File "/esphome/esphome/__main__.py", line 1036, in main
    return run_esphome(sys.argv)
  File "/esphome/esphome/__main__.py", line 1023, in run_esphome
    rc = POST_CONFIG_ACTIONS[args.command](args, config)
  File "/esphome/esphome/__main__.py", line 403, in command_compile
    exit_code = write_cpp(config)
  File "/esphome/esphome/__main__.py", line 190, in write_cpp
    return write_cpp_file()
  File "/esphome/esphome/__main__.py", line 208, in write_cpp_file
    writer.write_cpp(code_s)
  File "/esphome/esphome/writer.py", line 342, in write_cpp
    copy_src_tree()
  File "/esphome/esphome/writer.py", line 295, in copy_src_tree
    copy_files()
  File "/esphome/esphome/components/esp32/__init__.py", line 593, in copy_files
    repo_dir, _ = git.clone_or_update(
  File "/esphome/esphome/git.py", line 95, in clone_or_update
    old_sha = run_git_command(["git", "rev-parse", "HEAD"], str(repo_dir))
  File "/esphome/esphome/git.py", line 32, in run_git_command
    raise cv.Invalid(err_str)
voluptuous.error.Invalid: fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

I was able to get it working for me. Here is the pertinent section of my docker compose:

For some reason my piper container gave me an error message until I added

command: ["--voice", "en_US-lessac-medium"]

So I guess I am leaving it in for now. I am still able to change the voice through the Home Assistant GUI. Maybe it was user error on my part. (I’m not good at writing my own docker-compose files.)

version: '3.8'

services:
  openwakeword:
      container_name: openwakeword
      image: rhasspy/wyoming-openwakeword:latest
      restart: unless-stopped
      ports:
        - "10400:10400"

  whisper:
      container_name: whisper
      image: rhasspy/wyoming-whisper:latest
      restart: unless-stopped
      ports:
        - "10300:10300"
      command: ["--model", "tiny-int8", "--language", "en"]
      volumes:
        - ./wyoming:/data

  piper:
      image: rhasspy/wyoming-piper:latest
      container_name: piper
      restart: unless-stopped
      ports:
        - "10200:10200"
      volumes:
        - ./wyoming:/data
      command: ["--voice", "en_US-lessac-medium"]

Don’t forget to use the GUI to add the integrations “Wyoming”, “Piper”, and “Whisper” once you have your docker containers running.

Yeah, I saw that on stream. I’m talking about making it record a command right away, as if someone had already said a wake word, irrespective of whether it’s on or off.

For Rhasspy users: the equivalent of calling /api/listen-for-command.

I suppose I could build a simple Wyoming event emitter that sends a wake event in response to a certain nudge, but something built-in would be nicer. Doesn’t seem complicated. Might be useful as a debugging facility as well.