Home Assistant Voice PE - Custom Wake Words Please!

That’s too bad. I’d pay some $$ to run it through something like ElevenLabs to get a better dataset.

It doesn’t take that much to train, but you do need a good dataset and anywhere from 100-200 epochs (I go for 200).
It runs in the background on my computer with a very modest Quadro 4000 8 GB, which is slightly faster than a GTX 1060 6 GB (maybe 16 hours).
You can check the tflite in GitHub - StuartIanNaylor/tf-kws: tflite keyword, as I just know that model from google-research/kws_streaming at master · google-research/google-research · GitHub, from a lot of trial and error I did about 3 years ago.
The model should be an improvement on the current one, because you can’t just apply reverberation to mono wav files as the devs have done. That shows a complete misconception of how sounds mix at different frequencies and create harmonics, and of why multi-mic smart speakers and speech enhancement are essential to the voice pipeline. Just applying a series of RIRs recorded @1.5m, from forests to big halls, to a smart speaker device is just bizarre and a clear indication of some complete 101 basic errors and misconceptions.
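For reference, the conventional augmentation being criticised here is just a convolution of a mono clip with a recorded room impulse response (RIR). A minimal numpy sketch, with placeholder arrays standing in for real recordings:

```python
import numpy as np

def apply_rir(speech: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Convolve a mono speech clip with a room impulse response (RIR),
    then rescale so the reverberated clip keeps the original peak level."""
    wet = np.convolve(speech, rir)  # 'full' mode: len(speech) + len(rir) - 1 samples
    peak = np.max(np.abs(wet))
    if peak > 0:
        wet = wet * (np.max(np.abs(speech)) / peak)
    return wet

# Placeholder data: 1 s of noise as "speech" and a decaying-echo "RIR" at 16 kHz
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000) * 0.5
rir = np.exp(-np.linspace(0, 8, 4000)) * rng.standard_normal(4000)
wet = apply_rir(speech, rir)
```

This is exactly the "mono wav + RIR" shortcut: it smears a single channel in time, but it cannot reproduce what a real multi-mic array hears, which is the post’s objection.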

A very modest PC will chug away and create models that can be exported for TF4Micro, if you have the patience to just leave it running in the background.
Or if paying $$ it should be pretty rapid, but even then my dataset is very small in comparison to the aptly named big data…
See Datasets: Data characteristics  |  Machine Learning  |  Google for Developers if you want Google’s rule of thumb on data quantity vs model params.

I just did the CRNN with a totally synthetic dataset, but with TTS voices that have better quality and more human-like variation than Piper.
I am training on the Gkws repo with a CRNN because I know it, and I do what I always do: include a false keyword as a honey trap to train against, to reduce false positives.
Classification models are just dumb label classifiers, and unless you provide labels to train and classify against, you will get the results they are getting…
I also don’t know what level the AGC works at on the PE. I use vlevel AGC on my machine, as soundcards and microphones are often very low, but you should try to maximise without clipping the audio to get the best resolution in the end result.
You should be able to try with any mic: just run the kws.py script with the included model, as it’s that hey-jarvis dataset.
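The "maximise without clipping" point is essentially peak normalisation. A minimal numpy sketch (the 0.95 target level and the test tone are arbitrary choices of mine, not anything from vlevel):

```python
import numpy as np

def peak_normalise(audio: np.ndarray, target_peak: float = 0.95) -> np.ndarray:
    """Scale a float mono clip so its loudest sample sits at target_peak,
    lifting quiet recordings without ever pushing past full scale."""
    peak = np.max(np.abs(audio))
    if peak == 0:
        return audio  # silence: nothing to scale
    return audio * (target_peak / peak)

# A very low-level 440 Hz test tone, one second at 16 kHz
quiet = np.sin(np.linspace(0, 2 * np.pi * 440, 16000)) * 0.05
loud = peak_normalise(quiet)
```

Real AGC adapts gain over time rather than over the whole clip, but for preparing a training dataset a one-shot peak normalise like this captures the idea.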


I’ve added a final bit on how to make microWakeWord very accurate, but it’s mostly all about datasets, and the no-brainer is that they should match what the input to the model will be. This does make custom wake words a lot of work, but once done they could be held in a model zoo with some form of peer review to dodge the dross.


Oh man, that is a bummer. I’ve been using an Atom Echo with my own custom OpenWakeWord model (no small feat, as I’m not a tech professional and had to learn a lot) but the Atom Echo just seems to get crappier and crappier with each successive update.

I was looking forward to a voice interface using better hardware, and I had just assumed I’d be able to transfer my model over to the HA Voice PE. I seriously may return this if I can’t use it with my own wakeword model. This sucks :frowning:

I don’t think you can return it, as they did call it a ‘preview edition’, but IMO they amped up the sales pitch of the future of ‘open-source voice assistant’ way too high.
It’s basically a $30 https://www.seeedstudio.com/ReSpeaker-Lite-Voice-Assistant-Kit-Full-Kit-of-2-Mic-Array-pre-soldered-XIAO-ESP32S3-Mono-Enclosed-Speaker-and-Enclosure.html in a different box, with a wheel and a switch for when it goes wrong :slight_smile:

I am presuming they will get a training regime for microWakeWord, but just at the moment what they are doing could not be any more wrong. They have the info to fix things, and that is up to the devs who chose this implementation.

It still takes quite a bit of training and dataset creation, but when sorted it should be possible, and if done right the models should be quite accurate. When that will be, who knows…

The Atom Echo finally rolled out with microWakeWord, but I think before that it was purely a microphone streamer and openWakeWord ran upstream.
I could be wrong though, as openWakeWord builds on some embedding model Google did for synthetic wake words, and I don’t know much more than having had a cursory glance at it. It’s of a size, and uses certain ML layers, that just don’t run on micro.

Hence why I opened this thread - additional choices or customization in this area is important to some folks, myself included. :slight_smile:


I love how Voice PE works, but my wife hates how watching TV invariably makes it think it’s being addressed. While my old Google Home would be set off by TV shows maybe every three weeks, my Voice PE can’t last 3 hours using any wakeword.

I at least need a wakeword sensitivity setting, because right now, my Voice PE is simply unplugged and I had to go back to my Google device.


Any word yet on including at least "alexa" as a default choice? It’s in the repository already… Or is Nabu Casa worried about legal/trademark issues… :thinking: (I can’t blame them!)


Agreed, we need custom wake words soon. Articulation of the wake words is hit and miss on the ESP boards. Both VA Preview devices I have seem to not trigger the wake word as well as the 2 Wyoming satellites I also have configured.

Wow, have to admit I was a little bit shocked to find out that openWakeWord, which is literally the first option in the Home Assistant wake word documentation, is not actually supported on the Home Assistant Voice PE. Had no idea until I got the device and started playing with it. That’s disappointing.


Yeah, they definitely could have done a better job of explaining that they went from openWakeWord (which is usually set up to stream audio to your HA server - very inefficient) to microWakeWord which runs on the actual ESP32 hardware and is far more efficient… but the drawback is that it’s incredibly difficult to train custom wake words for microWakeWord (at least as of today).

If anyone wants to have “Alexa” as a custom wake word on VPE, this might be of value for you:

```yaml
substitutions:
  name: home-assistant-voice-somedigits
  friendly_name: my-vpe
  micro_wake_word_model: alexa
packages:
  Nabu Casa.Home Assistant Voice PE: github://esphome/home-assistant-voice-pe/home-assistant-voice.yaml
esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}
api:
  encryption:
    key: your-key-goes-here

micro_wake_word:
  id: mww
  models:
    # CUSTOM
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/alexa.json
      id: alexa
      probability_cutoff: 0.98
      sliding_window_size: 7
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
```

If you manage to train your own model (which I have attempted; I even got some files out of the process, but it never worked when included in the build), you’d have your json served somewhere and inserted instead of the GitHub link in the config above. I stopped tinkering with the wake word, as Alexa was enough for WAF here. Please note that I have modified probability_cutoff and sliding_window_size to tackle false positives. These values will most probably be different for you, and if you want to go with the defaults (which also worked for me for some months), just remove those two lines. The build will then use the values in the linked json file. Have fun!
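For context on what those two knobs trade off: as I read the microWakeWord docs, the model emits a probability every feature step, and detection fires when the average over the last `sliding_window_size` probabilities exceeds `probability_cutoff`. A hedged Python sketch of that post-processing (my reading, not the actual firmware code):

```python
from collections import deque

def detect(probabilities, probability_cutoff=0.98, sliding_window_size=7):
    """Sketch of microWakeWord-style detection: keep the last N per-step
    probabilities and fire once their average exceeds the cutoff.
    Higher cutoff / longer window = fewer false positives, less sensitivity."""
    window = deque(maxlen=sliding_window_size)
    for i, p in enumerate(probabilities):
        window.append(p)
        if len(window) == sliding_window_size and sum(window) / sliding_window_size > probability_cutoff:
            return i  # step index where detection fires
    return None

# A brief spike does not trigger; a sustained run of high probabilities does
assert detect([0.1, 0.99, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]) is None
assert detect([0.99] * 10) is not None
```

This is why tightening the values tames TV false triggers but can make the device harder to wake.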


Aha! This worked for me! And it even re-integrated into Home Assistant without changing any entity IDs or automations I had already set up. Thanks very much @grantigerbayer!

Oooh interesting, will try.

So, just as a little explainer to myself in the future when I can’t remember how I got all this working…

Train up a new microWakeWord model - I used an Unraid microWakeWord community-template Docker container, selected the ‘advanced’ notebook, and changed my phrase to hey_adam.

I then saved the resultant tflite and json files into the /config/esphome directory of my Home Assistant install.

I then pressed ‘take control’ of my Voice PE in the ESPHome device builder area of Home Assistant.

I then pasted the following code block between the ‘api’ and ‘wifi’ sections of the code, and pressed install.

```yaml
substitutions:
  name: home-assistant-voice-0917b8
  friendly_name: Home Assistant Voice 0917b8
packages:
  Nabu Casa.Home Assistant Voice PE: github://esphome/home-assistant-voice-pe/home-assistant-voice.yaml
esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}
api:
  encryption:
    key: my-encryption-key=

micro_wake_word:
  id: mww
  models:
    # CUSTOM
    - model: /config/esphome/hey_adam.json
      id: hey_adam
      probability_cutoff: 0.98
      sliding_window_size: 7

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
```

This is my json:

```json
{
  "type": "micro",
  "wake_word": "hey_adam",
  "author": "master phooey",
  "website": "GitHub - MasterPhooey/MicroWakeWord-Trainer-Docker",
  "model": "hey_adam.tflite",
  "trained_languages": [
    "en"
  ],
  "version": 2,
  "micro": {
    "probability_cutoff": 0.97,
    "sliding_window_size": 5,
    "feature_step_size": 10,
    "tensor_arena_size": 30000,
    "minimum_esphome_version": "2024.7.0"
  }
}
```


Links please

It’s in the json I pasted, but here is the GitHub; though if you access the Apps tab of the Unraid GUI and search microwakeword, you will get the template:

MasterPhooey/MicroWakeWord-Trainer-Docker


Ok, this worked for me. I had to deactivate the GPU to create a wake word because the 5060 Ti was not detected. Then I downloaded the firmware and flashed everything via ESPHome Web.
Works like a charm.

I was able to successfully build a model for ‘computer’ using the MasterPhooey Docker trainer + the Jupyter notebook advanced model. I ran it on Win11 using Docker Desktop and the suggested run commands. It took maybe 15-20 minutes to run the notebook, but it worked.

Then I forked the repo at GitHub - JohnnyPrimus/Custom_V2_MicroWakeWords and created a new branch for testing Custom_V2_MicroWakeWords/models/computer at new-computer · amac16/Custom_V2_MicroWakeWords · GitHub

Then after some fiddling in ESPHome yaml, I was finally able to get it to accept the new model and build to the VPE. I then went into the VPE entity and changed the wakeword to computer.

Unfortunately the VPE isn’t responding yet to the new wake word. It responds to okay_nabu when I select that one. The VPE says its wake word is ‘computer’, but I’m not sure if that’s definitely on-device, or from the streaming assistant, which is also set to computer.

Here’s my esp micro wakeword section thus far:

```yaml
micro_wake_word:
  id: mww
  models:
    - model: https://raw.githubusercontent.com/esphome/micro-wake-word-models/refs/heads/main/models/v2/alexa.json
      id: alexa
    - model: https://raw.githubusercontent.com/amac16/Custom_V2_MicroWakeWords/247ba6d60018adb944d5d797273015fa723945c4/models/computer/computer.json
      id: computer
    - model: okay_nabu
      id: okay_nabu_model
```

Edit to add: I’ve tried just about everything and cannot get the VPE to respond to the custom micro wake word ‘computer’. It’s telling me its wake word is computer. Everything is set to computer, from the assistant to the device to the openWakeWord mod and all related entities. Out of ideas. :frowning:

You can use the official ESPHome repo instead:

```yaml
substitutions:
  name: home-assistant-voice-dasha
  friendly_name: Voice PE Dasha 1
packages:
  Nabu Casa.Home Assistant Voice PE: github://esphome/home-assistant-voice-pe/home-assistant-voice.yaml

esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}

api:
  encryption:
    key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

micro_wake_word:
  id: mww
  models:
    - model: /config/esphome/oh_kay_dah_shah.json
      id: Ok_Dasha
      probability_cutoff: 0.9
      sliding_window_size: 5

voice_assistant:
  noise_suppression_level: 3
  auto_gain: 20dBFS
  volume_multiplier: 3
```