Hi,
Just tried to train a new wake word, trying to get a “frenchy” pronunciation for Alexa.
The script errored out after 15 minutes, saying something about not being able to find and open a file…
I think I’ll have to wait a bit for things to be a bit more straightforward.
It works great though (using an Atom Echo), but having to pronounce the wake words with an English accent doesn't feel natural ^^
When using Home Assistant Core, how do we point it to a Docker container running openWakeWord?
It doesn't look like you can at the moment.
This is how I'm running Piper and Whisper.
Home Assistant, Piper and Whisper run in separate Docker containers, as Core does not have “add-ons”.
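Something like this minimal docker-compose sketch (assuming the rhasspy Wyoming images; the voice and model flags are just examples, check each image's docs):

```yaml
# Minimal sketch: Wyoming Piper and Whisper as plain containers (no add-ons).
services:
  piper:
    image: rhasspy/wyoming-piper
    command: --voice en_US-lessac-medium   # example voice
    volumes:
      - ./piper-data:/data
    ports:
      - "10200:10200"
    restart: unless-stopped
  whisper:
    image: rhasspy/wyoming-whisper
    command: --model tiny-int8 --language en   # example model
    volumes:
      - ./whisper-data:/data
    ports:
      - "10300:10300"
    restart: unless-stopped
```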
But I can still go to Home Assistant > Settings > Integrations > Add, and it lets me select piper/whisper and input the IP:port of the corresponding container.
openWakeWord does not show in the Add Integration list…
While testing further, I have another observation. During the livestream, it was mentioned that false positives did not occur during testing.
In my testing, I do get false positives. This is while using the ATOM Echo M5 Development Kit, both with openWakeWord 1.8.0 and with Porcupine 1.0.0 (both set to “Alexa”). All settings are left at their defaults.
The typical scenario in which the false positives occur is watching TV, Formula 1: two false positives in one hour.
It's basically a way to create a KWS when you don't have good-quality datasets; it will never compete with commercial KWS such as Alexa/Google, as they have the good-quality datasets.
It does well against Porcupine, but Porcupine is in very much the same area: an easy custom KWS that performed well against many obsolete KWS such as Snowboy, but whose results fall far short of commercial KWS.
The same is true of Whisper: for a limited set of languages the Large model posted SotA results, but that is not true of how it is often being used, with the Tiny model, where WER skyrockets.
If you can find a French-language TTS model and create a dataset with it, the results would likely be better.
The keyword could likely stay the same, but the language model will bring the accent through more.
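A minimal sketch of what I mean (assuming the piper CLI; the French voice name is just an example, and a real dataset would need far more variation):

```python
# Hypothetical sketch: synthesize "Alexa" samples with a French Piper voice
# to seed a wake-word training dataset.
import subprocess

# piper reads the text to synthesize from stdin and writes a wav file
phrases = ["Alexa", "Alexa ?", "Alexa !"]
for i, text in enumerate(phrases):
    subprocess.run(
        ["piper", "--model", "fr_FR-upmc-medium", "--output_file", f"alexa_{i:03}.wav"],
        input=text.encode("utf-8"),
        check=True,
    )
```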
Yeah, apparently “It's a technological marvel that is created with 4 goals in mind”, but hey, nothing like blowing your own trumpet, as some have ribs removed to achieve it.
From Mycroft to Rhasspy we have some very enthusiastic hobbyists doing some great work, and a lot of work.
When you compare it to big data, though, it is trailing quite a long way behind.
I have been trying to advocate how easy it is to capture quality data, and how being able to dictate hardware is a huge advantage big data has.
A project can roll out and dictate a KW, then quickly capture a quality dataset from real use, where users opt in to send collated packages of data with gold-standard metadata.
What we have is more what could be done than what users of commercial voice systems have come to expect.
Open source has been nothing short of a disaster in the discipline of applying quality datasets and metadata; the Common Voice initiative was likely the biggest waste of Mozilla funds ever.
It is riddled with wrongly labelled data, much of it is bad, and, apart from a few items, it totally lacks quality metadata.
Region, age group, gender, native speaker, mic hardware and maybe a few others are all that is needed; we do not need your name or home address, just metadata to provide specific language and accent models for KWS & ASR.
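A hypothetical per-sample record could be that small; the field names below are mine, purely illustrative, not any existing schema:

```yaml
# Hypothetical metadata record attached to one collated audio sample.
sample: kw_000183.wav
keyword: alexa
region: fr-FR
age_group: 30-39
gender: m
native_speaker: false
mic_hardware: atom-echo
```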
The same goes for the MLCommons initiative, which merely force-aligned Common Voice; at a guesstimate it contains more non-native spoken word than native, with near-zero metadata, so you are unable to filter for specifics.
It just didn't happen, still hasn't happened, and strangely open source hasn't started collating usage data via an opt-in.
Also, it has created custom KWs, so the herd will never be able to collate a single KW in any quantity.
When an ASR command sentence activates a skill, without an instant stop or the same skill being instantly repeated, then you can likely assume the KW & ASR command sentences were good.
Open source could be creating gold-standard datasets through use, closing the gap via an opt-in collation of only KW and command sentences, with ever-evolving models based on them shipped OTA.
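As a rough sketch of that opt-in heuristic (all names and thresholds here are mine; nothing is an existing API):

```python
# Hypothetical sketch: keep a KW + command sample only if, within a short
# window, the user neither stops it nor instantly re-issues the same skill.
import time

CONFIRM_WINDOW_S = 10.0
pending = []    # (skill, audio_ref, timestamp) awaiting confirmation
confirmed = []  # samples considered good enough for opt-in collation

def on_skill_activated(skill, audio_ref):
    now = time.monotonic()
    for entry in list(pending):
        if entry[0] == skill and now - entry[2] < CONFIRM_WINDOW_S:
            pending.remove(entry)  # instant repeat: earlier sample was likely bad
    pending.append((skill, audio_ref, now))

def on_stop():
    pending.clear()  # an instant stop invalidates whatever just fired

def flush_confirmed():
    now = time.monotonic()
    for entry in list(pending):
        if now - entry[2] >= CONFIRM_WINDOW_S:
            pending.remove(entry)
            confirmed.append(entry)  # survived the window: assume KW & ASR were good
```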
Then, as @Fraddles says, enter the IP address of your Docker container. I couldn't get it to work with a non-default port, FWIW, but there's no conflict with 10400 on my host.
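For reference, a minimal sketch of the openWakeWord container itself (assuming the rhasspy/wyoming-openwakeword image and its default 10400 port; the preloaded model name is just an example):

```yaml
services:
  openwakeword:
    image: rhasspy/wyoming-openwakeword
    command: --preload-model 'alexa'   # example model name
    ports:
      - "10400:10400"
    restart: unless-stopped
```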
I have tried changing options there, but I am finding the stutter isn't consistent, so it's hard to test. This is why I thought it might be a hardware resource constraint that was coming and going?
INFO ESPHome 2023.10.1
INFO Reading configuration /config/esphome/m5stack-atom-echo-8a1468.yaml...
INFO Updating https://github.com/esphome/esphome.git@pull/5230/head
INFO Generating C++ source...
Traceback (most recent call last):
File "/usr/local/bin/esphome", line 33, in <module>
sys.exit(load_entry_point('esphome', 'console_scripts', 'esphome')())
File "/esphome/esphome/__main__.py", line 1036, in main
return run_esphome(sys.argv)
File "/esphome/esphome/__main__.py", line 1023, in run_esphome
rc = POST_CONFIG_ACTIONS[args.command](args, config)
File "/esphome/esphome/__main__.py", line 454, in command_run
exit_code = write_cpp(config)
File "/esphome/esphome/__main__.py", line 190, in write_cpp
return write_cpp_file()
File "/esphome/esphome/__main__.py", line 208, in write_cpp_file
writer.write_cpp(code_s)
File "/esphome/esphome/writer.py", line 342, in write_cpp
copy_src_tree()
File "/esphome/esphome/writer.py", line 295, in copy_src_tree
copy_files()
File "/esphome/esphome/components/esp32/__init__.py", line 593, in copy_files
repo_dir, _ = git.clone_or_update(
File "/esphome/esphome/git.py", line 95, in clone_or_update
old_sha = run_git_command(["git", "rev-parse", "HEAD"], str(repo_dir))
File "/esphome/esphome/git.py", line 32, in run_git_command
raise cv.Invalid(err_str)
voluptuous.error.Invalid: fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'