Rhasspy offline voice assistant toolkit

I do not know a faster way sadly, but I will adjust my streamer to produce mono :slight_smile:

I tried to install rhasspy-hassio-addon and this is the log:

DEBUG:RhasspyCore:Loaded profile from /share/rhasspy/profiles/en/profile.json
INFO:root:++++ Actor System gen (3, 9) started, admin @ ActorAddr-(T|:1900)
DEBUG:root:Thespian source: /usr/local/lib/python3.6/site-packages/thespian/__init__.py
DEBUG:DialogueManager: -> started
DEBUG:DialogueManager:started -> loading_mqtt
DEBUG:DialogueManager:Loading MQTT first
DEBUG:DialogueManager:Loading...will time out after 30 second(s)
DEBUG:HermesMqtt: -> started
DEBUG:DialogueManager:loading_mqtt -> loading
DEBUG:DialogueManager:Loading actors
DEBUG:DialogueManager:Actors created. Waiting for ['recorder', 'player', 'wake', 'command', 'decoder', 'recognizer', 'handler', 'hass_handler', 'sentence_generator', 'speech_trainer', 'intent_trainer', 'word_pronouncer'] to start.
DEBUG:FuzzyWuzzyIntentTrainer: -> started
DEBUG:HomeAssistantIntentHandler: -> started
DEBUG:PhonetisaurusPronounce: -> started
DEBUG:APlayAudioPlayer: -> started
DEBUG:ARecordAudioRecorder: -> started
DEBUG:JsgfSentenceGenerator: -> started
DEBUG:PocketsphinxSpeechTrainer: -> started
DEBUG:HomeAssistantIntentHandler:started -> started
DEBUG:FuzzyWuzzyRecognizer: -> started
DEBUG:DialogueManager:intent_trainer started
DEBUG:FuzzyWuzzyRecognizer:Loaded examples from /usr/share/rhasspy/profiles/en/intent_examples.json
DEBUG:FuzzyWuzzyRecognizer:started -> loaded
DEBUG:DialogueManager:player started
DEBUG:WebrtcvadCommandListener: -> started
DEBUG:DialogueManager:recorder started
DEBUG:WebrtcvadCommandListener:started -> loaded
DEBUG:DialogueManager:sentence_generator started
DEBUG:DialogueManager:word_pronouncer started
DEBUG:DialogueManager:speech_trainer started
DEBUG:DialogueManager:handler started
DEBUG:DialogueManager:recognizer started
DEBUG:DialogueManager:hass_handler started
DEBUG:DialogueManager:command started
DEBUG:SnowboyWakeListener: -> started
ERROR:SnowboyWakeListener:receiveMessage
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/snowboy/snowboydetect.py", line 18, in swig_import_helper
    return importlib.import_module(mname)
  File "/usr/local/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 658, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 571, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 922, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.6/site-packages/snowboy/_snowboydetect.so: unexpected reloc type 0x03
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/share/rhasspy/rhasspy/actor.py", line 45, in receiveMessage
    self.transition('started')
  File "/usr/share/rhasspy/rhasspy/actor.py", line 76, in transition
    getattr(self, transition_method)(from_state)
  File "/usr/share/rhasspy/rhasspy/wake.py", line 217, in to_started
    self.load_detector()
  File "/usr/share/rhasspy/rhasspy/wake.py", line 283, in load_detector
    from snowboy import snowboydetect, snowboydecoder
  File "/usr/local/lib/python3.6/site-packages/snowboy/snowboydetect.py", line 21, in <module>
    _snowboydetect = swig_import_helper()
  File "/usr/local/lib/python3.6/site-packages/snowboy/snowboydetect.py", line 20, in swig_import_helper
    return importlib.import_module('_snowboydetect')
  File "/usr/local/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named '_snowboydetect'
DEBUG:PocketsphinxDecoder: -> started
INFO:PocketsphinxDecoder:Loading decoder with hmm=/usr/share/rhasspy/profiles/en/acoustic_model, dict=/share/rhasspy/profiles/en/dictionary.txt, lm=/share/rhasspy/profiles/en/language_model.txt
DEBUG:PocketsphinxDecoder:started -> loaded
DEBUG:DialogueManager:decoder started
WARNING:DialogueManager:Actor timeout! Loading anyway...
WARNING:SnowboyWakeListener:Unhandled message in state started: <rhasspy.wake.ListenForWakeWord object at 0x757a2a10>
DEBUG:DialogueManager:loading -> ready
WARNING:SnowboyWakeListener:Unhandled message in state started: <rhasspy.dialogue.Ready object at 0x757afa30>
INFO:DialogueManager:Automatically listening for wake word
DEBUG:DialogueManager:ready -> asleep
INFO:werkzeug: * Running on http://0.0.0.0:12101/ (Press CTRL+C to quit)

After that I tried to add the hotword_snowboy and command_listener components, but got an error:
About

Home Assistant
0.87.1

7:15 AM setup.py (ERROR)
Setup failed for hotword_snowboy: Could not install all requirements.
7:15 AM setup.py (ERROR)
Not initializing hotword_snowboy because could not install requirement snowboy==1.2.0b1
7:15 AM requirements.py (ERROR)
Unable to install package snowboy==1.2.0b1: Failed building wheel for snowboy Failed building wheel for PyAudio Command "/usr/local/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-z67oitta/PyAudio/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-wd5ydi32/install-record.txt --single-version-externally-managed --prefix --compile --user --prefix=" failed with error code 1 in /tmp/pip-install-z67oitta/PyAudio/ 

Configuration invalid
Component not found: command_listener

Could you help me?

Sure, happy to help. From the logs, it looks like you’re trying to run Rhasspy on a Raspberry Pi (or at least an ARM-based system). The key error is unexpected reloc type 0x03, which is related to snowboy (the wake word component). This appears to be a bug in how they build their library, and I can’t fix it because they don’t provide source code!

To just get things up and running, I’d recommend using pocketsphinx instead of snowboy for wake word detection. It’s not as good as snowboy, but it will at least work on all platforms.

I plan to eventually downgrade my Docker base image to something where snowboy will work, but this is not yet possible because of conflicts with some core Rhasspy components. Yay computers.


To everyone: I have a draft of the Rhasspy manual up on readthedocs.org. Please let me know what you think, and if you spot any errors. Thanks!


I installed Hassbian and then installed Rhasspy using the virtual environment method. All went well, but I cannot add the hotword_snowboy custom component. Here is the log:

Not initializing hotword_snowboy because could not install requirement snowboy==1.2.0b1

Unable to install package snowboy==1.2.0b1: Command "/srv/homeassistant/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-e7ilf8oy/snowboy/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-b4vit8m3-record/install-record.txt --single-version-externally-managed --compile --install-headers /srv/homeassistant/include/site/python3.5/snowboy" failed with error code 1 in /tmp/pip-build-e7ilf8oy/snowboy/

How can I fix it?

OK, once you’ve followed the virtual environment instructions, just run the run-venv.sh script. You don’t need any custom components; those are for a much older version of Rhasspy.

The only thing you need to add to Home Assistant are automations to handle the voice commands :slight_smile:


Here is the log when I run run-venv.sh:

pi@hassbian:~/rhasspy $ ./run-venv.sh
Using pre-compiled binaries.
 * Serving Flask app "app.py"
 * Environment: production
   WARNING: Do not use the development server in a production environment.
   Use a production WSGI server instead.
 * Debug mode: off
Traceback (most recent call last):
  File "/home/pi/rhasspy/.venv/bin/flask", line 11, in <module>
    sys.exit(main())
  File "/home/pi/rhasspy/.venv/lib/python3.5/site-packages/flask/cli.py", line 894, in main
    cli.main(args=args, prog_name=name)
  File "/home/pi/rhasspy/.venv/lib/python3.5/site-packages/flask/cli.py", line 557, in main
    return super(FlaskGroup, self).main(*args, **kwargs)
  File "/home/pi/rhasspy/.venv/lib/python3.5/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/pi/rhasspy/.venv/lib/python3.5/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/pi/rhasspy/.venv/lib/python3.5/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/pi/rhasspy/.venv/lib/python3.5/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/pi/rhasspy/.venv/lib/python3.5/site-packages/click/decorators.py", line 64, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
  File "/home/pi/rhasspy/.venv/lib/python3.5/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/pi/rhasspy/.venv/lib/python3.5/site-packages/flask/cli.py", line 767, in run_command
    app = DispatchingApp(info.load_app, use_eager_loading=eager_loading)
  File "/home/pi/rhasspy/.venv/lib/python3.5/site-packages/flask/cli.py", line 293, in __init__
    self._load_unlocked()
  File "/home/pi/rhasspy/.venv/lib/python3.5/site-packages/flask/cli.py", line 317, in _load_unlocked
    self._app = rv = self.loader()
  File "/home/pi/rhasspy/.venv/lib/python3.5/site-packages/flask/cli.py", line 372, in load_app
    app = locate_app(self, import_name, name)
  File "/home/pi/rhasspy/.venv/lib/python3.5/site-packages/flask/cli.py", line 235, in locate_app
    __import__(module_name)
  File "/home/pi/rhasspy/app.py", line 33, in <module>
    from rhasspy.profiles import Profile
  File "/home/pi/rhasspy/rhasspy/__init__.py", line 622
    detected_paths:Set[str] = set()
                  ^
SyntaxError: invalid syntax

But I just need a custom component for Home Assistant to record audio right after the hotword, and then save it in the “config/www” folder, as you recommended here.

The syntax error you’re seeing is because Rhasspy requires Python 3.6 (you probably have 3.5 installed).
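A quick way to confirm this before launching anything is to check the interpreter version. The sketch below is only an illustration (the function name is made up for this post); it relies on the fact that variable annotations such as `detected_paths:Set[str] = set()` were introduced in Python 3.6.

```python
import sys


def supports_variable_annotations(version_info=sys.version_info):
    """Return True if this interpreter can parse `name: Type = value` (3.6+)."""
    return tuple(version_info[:2]) >= (3, 6)


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    if supports_variable_annotations():
        print("OK: running Python %d.%d" % (major, minor))
    else:
        print("Too old: Python %d.%d, need 3.6 or newer" % (major, minor))
```

Run it with the same interpreter the virtual environment uses (python3.5 in the traceback above) to see which case you hit.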

One way to do what you want is to use the custom components as I described (except you should probably try the hotword_pocketsphinx component instead).

You could also do it with the version of Rhasspy you have installed. The easiest way I can think of is:

  1. Override the speech to text system with the command system and set speech_to_text.command.program to something you wrote that just writes the WAV data from stdin to your file (e.g., cat > /config/www/audio.wav).
  2. Turn off the rest of Rhasspy’s pipeline by setting:
    • intent.system and handle.system to dummy
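For step 1, the program can be as small as the `cat` one-liner. If you would rather have a script, here is a minimal Python sketch of the same idea. It assumes Rhasspy writes the WAV data to the program's stdin, as described above; the script name and the way the output path is passed are my own choices for illustration.

```python
import shutil
import sys


def copy_stream(src, dest_path):
    """Copy every byte from the binary stream `src` into the file `dest_path`."""
    with open(dest_path, "wb") as dest:
        shutil.copyfileobj(src, dest)


def main(argv):
    # argv[1] is the output file, e.g. /config/www/audio.wav
    copy_stream(sys.stdin.buffer, argv[1])


# Only run when called as a script with exactly one argument (the output path).
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv)
```

If saved as save_audio.py (a hypothetical name), speech_to_text.command.program would point at something that runs `python3 save_audio.py /config/www/audio.wav`.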

Hope one of these things works :confused:


Here is the log when I run pipenv install:
Creating a virtualenv for this project

Pipfile: /rhasspy-tools/Pipfile
Using /usr/bin/python3.5 (3.5.3) to create virtualenv

Creating virtual environment
Using base prefix ‘/usr’
New python executable in /root/.local/share/virtualenvs/rhasspy-tools-0tc_-tLK/bin/python3.5
Also creating executable in /root/.local/share/virtualenvs/rhasspy-tools-0tc_-tLK/bin/python
Installing setuptools, pip, wheel

done.
Running virtualenv with interpreter /usr/bin/python3.5
:heavy_check_mark: Successfully created virtual environment!
Virtualenv location: /root/.local/share/virtualenvs/rhasspy-tools-0tc_-tLK
Installing dependencies from Pipfile.lock (1c6216)

An error occurred while installing scipy==1.2.1 --hash=sha256:014cb900c003b5ac81a53f2403294e8ecf37aedc315b59a6b9370dce0aa7627a --hash=sha256:281a34da34a5e0de42d26aed692ab710141cad9d5d218b20643a9cb538ace976 --hash=sha256:588f9cc4bfab04c45fbd19c1354b5ade377a8124d6151d511c83730a9b6b2338 --hash=sha256:5a10661accd36b6e2e8855addcf3d675d6222006a15795420a39c040362def66 --hash=sha256:628f60be272512ca1123524969649a8cb5ae8b31cca349f7c6f8903daf9034d7 --hash=sha256:6dcc43a88e25b815c2dea1c6fac7339779fc988f5df8396e1de01610604a7c38 --hash=sha256:70e37cec0ac0fe95c85b74ca4e0620169590fd5d3f44765f3c3a532cedb0e5fd --hash=sha256:7274735fb6fb5d67d3789ddec2cd53ed6362539b41aa6cc0d33a06c003aaa390 --hash=sha256:78e12972e144da47326958ac40c2bd1c1cca908edc8b01c26a36f9ffd3dce466 --hash=sha256:790cbd3c8d09f3a6d9c47c4558841e25bac34eb7a0864a9def8f26be0b8706af --hash=sha256:79792c8fe8e9d06ebc50fe23266522c8c89f20aa94ac8e80472917ecdce1e5ba --hash=sha256:865afedf35aaef6df6344bee0de391ee5e99d6e802950a237f9fb9b13e441f91 --hash=sha256:870fd401ec7b64a895cff8e206ee16569158db00254b2f7157b4c9a5db72c722 --hash=sha256:963815c226b29b0176d5e3d37fc9de46e2778ce4636a5a7af11a48122ef2577c --hash=sha256:9726791484f08e394af0b59eb80489ad94d0a53bbb58ab1837dcad4d58489863 --hash=sha256:9de84a71bb7979aa8c089c4fb0ea0e2ed3917df3fb2a287a41aaea54bbad7f5d --hash=sha256:b2c324ddc5d6dbd3f13680ad16a29425841876a84a1de23a984236d1afff4fa6 --hash=sha256:b86ae13c597fca087cb8c193870507c8916cefb21e52e1897da320b5a35075e5 --hash=sha256:ba0488d4dbba2af5bf9596b849873102d612e49a118c512d9d302ceafa36e01a --hash=sha256:d78702af4102a3a4e23bb7372cec283e78f32f5573d92091aa6aaba870370fe1 --hash=sha256:def0e5d681dd3eb562b059d355ae8bebe27f5cc455ab7c2b6655586b63d3a8ea --hash=sha256:e085d1babcb419bbe58e2e805ac61924dac4ca45a07c9fa081144739e500aa3c --hash=sha256:e2cfcbab37c082a5087aba5ff00209999053260441caadd4f0e8f4c2d6b72088 --hash=sha256:e742f1f5dcaf222e8471c37ee3d1fd561568a16bb52e031c25674ff1cf9702d5 
--hash=sha256:f06819b028b8ef9010281e74c59cb35483933583043091ed6b261bb1540f11cc --hash=sha256:f15f2d60a11c306de7700ee9f65df7e9e463848dbea9c8051e293b704038da60 --hash=sha256:f31338ee269d201abe76083a990905473987371ff6f3fdb76a3f9073a361cf37 --hash=sha256:f6b88c8d302c3dac8dff7766955e38d670c82e0d79edfc7eae47d6bb2c186594! Will try again.
An error occurred while installing scikit-learn==0.20.2 --hash=sha256:05d061606657af85365b5f71484e3362d924429edde17a90068960843ad597f5 --hash=sha256:071317afbb5c67fa493635376ddd724b414290255cbf6947c1155846956e93f7 --hash=sha256:0d03aaf19a25e59edac3099cda6879ba05129f0fa1e152e23b728ccd36104f57 --hash=sha256:1665ea0d4b75ef24f5f2a9d1527b7296eeabcbe3a1329791c954541e2ebde5a2 --hash=sha256:24eccb0ff31f84e88e00936c09197735ef1dcabd370aacb10e55dbc8ee464a78 --hash=sha256:27b48cabacce677a205e6bcda1f32bdc968fbf40cd2aa0a4f52852f6997fce51 --hash=sha256:2c51826b9daa87d7d356bebd39f8665f7c32e90e3b21cbe853d6c7f0d6b0d23b --hash=sha256:3116299d392bd1d054655fa2a740e7854de87f1d573fa85503e64494e52ac795 --hash=sha256:3771861abe1fd1b2bbeaec7ba8cfca58fdedd75d790f099960e5332af9d1ff7a --hash=sha256:473ba7d9a5eaec47909ee83d74b4a3be47a44505c5189d2cab67c0418cd030f1 --hash=sha256:621e2c91f9afde06e9295d128cb15cb6fc77dc00719393e9ec9d47119895b0d4 --hash=sha256:645865462c383e5faad473b93145a8aee97d839c9ad1fd7a17ae54ec8256d42b --hash=sha256:80e2276d4869d302e84b7c03b5bac4a67f6cd331162e62ae775a3e5855441a60 --hash=sha256:84d2cfe0dee3c22b26364266d69850e0eb406d99714045929875032f91d3c918 --hash=sha256:87ea9ace7fe811638dfc39b850b60887509b8bfc93c4006d5552fa066d04ddc7 --hash=sha256:a4d1e535c75881f668010e6e53dfeb89dd50db85b05c5c45af1991c8b832d757 --hash=sha256:a4f14c4327d2e44567bfb3a0bee8c55470f820bc9a67af3faf200abd8ed79bf2 --hash=sha256:a7b3c24e193e8c6eaeac075b5d0bb0a7fea478aa2e4b991f6a7b030fc4fd410d --hash=sha256:ab2919aca84f1ac6ef60a482148eec0944364ab1832e63f28679b16f9ef279c8 --hash=sha256:b0f79d5ff74f3c68a4198ad5b4dfa891326b5ce272dd064d11d572b25aae5b43 --hash=sha256:bc5bc7c7ee2572a1edcb51698a6caf11fae554194aaab9a38105d9ec419f29e6 --hash=sha256:bc5c750d548795def79576533f8f0f065915f17f48d6e443afce2a111f713747 --hash=sha256:c68969c30b3b2c1fe07c1376110928eade61da4fc29c24c9f1a89435a7d08abe --hash=sha256:d3b4f791d2645fe936579d61f1ff9b5dcf0c8f50db7f0245ca8f16407d7a5a46 
--hash=sha256:dac0cd9fdd8ac6dd6108a10558e2e0ca1b411b8ea0a3165641f9ab0b4322df4e --hash=sha256:eb7ddbdf33eb822fdc916819b0ab7009d954eb43c3a78e7dd2ec5455e074922a --hash=sha256:ed537844348402ed53420187b3a6948c576986d0b2811a987a49613b6a26f29e --hash=sha256:fcca54733e692fe03b8584f7d4b9344f4b6e3a74f5b326c6e5f5e9d2504bdce7! Will try again.

Please see the more recent installation instructions from this repository.

The version of Rhasspy you were trying to install is now deprecated. I’ll try and mark the repos better to make this more clear. Sorry for the confusion.

Thanks for your reply.
All I want is here. :frowning:
Could you help me with steps 1 to 5?

OK, I added a copy audio example that does what you’re asking. The profile.json file (under rhasspy/profiles/en) sets the speech to text system to command, and has it call a script named copy-audio.sh. This script just saves the recorded WAV data to an output directory inside the profile. Steps 6 and beyond are disabled in Rhasspy.
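To give a rough idea of what such a profile could contain, the sketch below prints a JSON fragment built from the settings named in this thread (speech_to_text.command.program, intent.system and handle.system). The nested layout and the bare script path are assumptions; this is not a copy of the actual example profile, so compare it against the real profile.json before using it.

```python
import json


def copy_audio_profile(program):
    """Build a profile fragment that routes recorded audio to `program`.

    Assumption: dotted setting names map to nested JSON objects.
    """
    return {
        "speech_to_text": {
            "system": "command",
            "command": {"program": program},
        },
        # Turn off the rest of the pipeline, as suggested earlier in the thread.
        "intent": {"system": "dummy"},
        "handle": {"system": "dummy"},
    }


# "copy-audio.sh" is the script name mentioned above; its location is up to you.
print(json.dumps(copy_audio_profile("copy-audio.sh"), indent=2))
```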

I added a run-venv.sh script inside the example that will run everything out of that directory. If you want to change the wake word to something besides “okay rhasspy”, you’ll need to edit the profile.json and then re-train. Good luck, and let me know how it goes!

As far as I have read it, the documentation is very clear.
I have also managed to use my MQTT streamer to control a service in Home Assistant. I am rather limited on time, but I love the work you are doing :slight_smile:

Hmm, I still have problems with my MQTT audio recording; it does not seem to be getting any input.
I tried to find the cause in the code, but could not quite find it.

I have an ESP32 Matrix Voice and I am using my streamer to publish messages to the hermes audioFrame topic. This works, but since Rhasspy does not seem to get any voice command, I was wondering about the internals.

  • The software I use sends 256 frames of 16000 Hz, 16-bit audio, each in a WAV container as 1 message. This is a continuous stream of lots and lots of small messages.

My question is: does Rhasspy decode all those messages and use the raw audio buffers as recordings, or does it work in another way? I can adapt my software, but sending everything in one shot is not an option, simply because the ESP32 cannot send such large buffers due to memory limitations.
I have implemented it so that it works for Snips, but I would like to know what exactly to send per message (wave files? raw buffers? and how many frames?)

Rhasspy expects each message to contain WAV data. For each message, it unpacks the raw audio data from the WAV, converts it to 16-bit 16 kHz mono (if necessary), and then generates an internal message that flows around an actor network. I recently added an extra buffer between MQTT and the actor system to ensure that message queues don’t overflow, which was definitely happening with many small messages.
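To make the message format concrete, here is a small sketch using Python's standard wave module. It is not Rhasspy's code, just an illustration of wrapping a raw PCM buffer in a WAV container (what the streamer would send) and unpacking it again (what the receiving side does). The function names are made up, and the conversion step for other sample rates or channel counts is left out.

```python
import io
import wave

SAMPLE_RATE = 16000  # Hz
SAMPLE_WIDTH = 2     # bytes per sample (16-bit)
CHANNELS = 1         # mono


def pcm_to_wav(pcm):
    """Wrap raw 16-bit 16 kHz mono PCM bytes in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(CHANNELS)
        writer.setsampwidth(SAMPLE_WIDTH)
        writer.setframerate(SAMPLE_RATE)
        writer.writeframes(pcm)
    return buf.getvalue()


def wav_to_pcm(payload):
    """Extract the raw audio bytes from one WAV message payload."""
    with wave.open(io.BytesIO(payload), "rb") as reader:
        return reader.readframes(reader.getnframes())


message = pcm_to_wav(bytes(960))  # 30 ms of silence
print(len(message), len(wav_to_pcm(message)))
```

With a 960-byte buffer the message is 1004 bytes, because the wave module writes a 44-byte header in front of the audio data.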

I’ve uploaded some test WAV chunks to my mqtt example to show you what Rhasspy expects (see this specifically). The shell scripts in there are using my new mqtt command listener to start/stop the voice command. I’ve tested this with the default webrtcvad system too, so you should keep using the default voice command system (unless you do voice activity detection on your esp32’s!).

Excellent, thank you. I see the WAV files are 1004 bytes, which is 960 bytes of audio plus the WAV header.
If I send a WAV header + 512 bytes, should I adjust the chunk_size?
That might become an issue with the 10, 20 or 30 ms frames required by webrtcvad.
I will first adapt my software to send 960 bytes.

The ESP32 is not powerful enough to do voice activity detection, I think, but I would be happy to be proven wrong :smiley:

Edit: I see a new version has been released. I will try with the new stuff in it :slight_smile:

It should work with 512 bytes without any changes. The webrtcvad component will buffer audio data until it gets at least command.webrtcvad.chunk_size bytes, which it then processes. I haven’t tested any other sizes, though.

Assuming 16-bit 16 kHz mono, valid values for command.webrtcvad.chunk_size would be 320, 640, and 960 bytes. Not sure why, but webrtcvad is pretty whiny about it :smiley:
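Those three sizes are simply the 10, 20 and 30 ms frame lengths mentioned earlier, expressed in bytes. The sketch below shows the arithmetic, plus one possible way to re-chunk 512-byte messages into 960-byte chunks. The ChunkBuffer class is a hypothetical illustration of the buffering idea, not the actual webrtcvad component.

```python
SAMPLE_RATE = 16000    # Hz
BYTES_PER_SAMPLE = 2   # 16-bit mono


def chunk_bytes(frame_ms):
    """Number of bytes in `frame_ms` milliseconds of 16-bit 16 kHz mono audio."""
    return SAMPLE_RATE * frame_ms // 1000 * BYTES_PER_SAMPLE


class ChunkBuffer:
    """Collect audio of any message size and hand it back in fixed-size chunks."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self._pending = bytearray()

    def feed(self, data):
        """Add `data`; return the list of complete chunks now available."""
        self._pending.extend(data)
        chunks = []
        while len(self._pending) >= self.chunk_size:
            chunks.append(bytes(self._pending[:self.chunk_size]))
            del self._pending[:self.chunk_size]
        return chunks


print([chunk_bytes(ms) for ms in (10, 20, 30)])
```

Feeding two 512-byte messages into ChunkBuffer(960) yields nothing after the first message and one 960-byte chunk after the second, with 64 bytes carried over to the next message.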


I have done some tests with a separate system, and voice detection works pretty well.
I have some trouble with the hotword not being detected, though. Is anyone else having issues with that?
After I manually press the wake button, all is good.

I could not get the built-in hotword detection to work, so then I tried your Snowboy addon.
Together with setting the wake detection to hermes, this worked pretty well.

I see that in your addon you can upload models, but these are not persistent. When I upload a model and change the config to use that model, I have to restart the addon. But then the model file is removed again. Is the addon still under development?

It works fine when I use snowboy, I finally have something working in Dutch :slight_smile:
I am using my Matrix Voice to stream audio data via MQTT by the way, all is well :slight_smile:


Added support for Rhasspy in Matrix Voice Audio Streamer: https://github.com/Romkabouter/Matrix-Voice-ESP32-MQTT-Audio-Streamer