Reduced training time for large voice command spaces. The timer example has about 8 million possible sentences. It takes a few minutes just to generate all the sentences on my laptop, but only 2 seconds to train with fsticuffs.
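Roughly speaking, the number of possible sentences is the product of the choices in each slot, so it explodes quickly. A tiny illustration with made-up slot sizes (not the actual timer template):

# Purely illustrative numbers, not the real timer template: the number of
# possible sentences is roughly the product of the choices in each slot.
phrasings = 4            # "set a timer for ...", "start a timer for ...", etc.
optional_words = 2 ** 5  # five optional filler words, each present or absent
hours, minutes, seconds = 10, 60, 60

total = phrasings * optional_words * hours * minutes * seconds
print(f"{total:,} possible sentences")  # about 4.6 million with these made-up numbers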
Thanks @koan for the detailed info.
If even the Hermes stuff is not fully documented, I won't spend another minute on Snips compatibility. And you are right: if Sonos shuts down the Snips Console, no skill developer will maintain their script(s) anymore, which is another reason why it is pointless to spend time on further Snips compatibility.
However, Snips was a good example of how to make such a voice assistant available for makers, so obviously they did a few things right.
I think there's a good opportunity here to work with the HA people to create some kind of open "standard" that works with Rhasspy and Ada. I don't know what that should look like yet; maybe it's just a set of HTTP endpoints and some JSON schemas?
I forgot to mention that in the latest Rhasspy for English, German, and Dutch, you can select Kaldi for speech recognition in the settings to try out the new speech models. These probably won't run well on a Raspberry Pi, but should give you better recognition results.
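If you would rather switch outside the web interface, editing the profile file directly should also work. A rough sketch; the speech_to_text/system key layout is my assumption about the 2.4 profile format, so double-check it against your own profile.json:

import json
from pathlib import Path

# Assumed location and key layout of the local profile overrides;
# verify both against your own setup before running this.
profile_path = Path("profiles/en/profile.json")
profile = json.loads(profile_path.read_text())

# Switch speech recognition to kaldi.
profile.setdefault("speech_to_text", {})["system"] = "kaldi"

profile_path.write_text(json.dumps(profile, indent=4))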
I have just tried Kaldi. I am using Dutch, and I got this:
"KaldiDecoder Missing HCLG.fst Graph not found at /share/rhasspy/profiles/nl/kaldi/model/graph/HCLG.fst. Did you train your profile?"
When I train the profile, it gives me this error:
Training failed: <Task: vocab_dict>: TaskError PythonAction Error
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/doit/action.py", line 424, in execute
    returned_value = self.py_callable(*self.args, **kwargs)
  File "/usr/share/rhasspy/rhasspy/train/__init__.py", line 404, in do_dict
    with open(custom_words, "a") as words_file:
FileNotFoundError: [Errno 2] No such file or directory: 'profiles/nl/kaldi/custom_words.txt'
I did retrain Dutch as well, no errors here. But I already had custom words, I guess.
Going to test the recognition.
Did Rhasspy download extra files when you switched? It sounds like it didn't successfully extract them to the kaldi folder in your nl profile.
Yes it did download the files.
How can I check if they were successfully extracted?
You should see a kaldi directory inside profiles/nl with a few files and a model directory if all went well. You might try creating an empty text file called custom_words.txt inside kaldi just to see if training will proceed…
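Something like this (or a plain touch from a shell) should be enough; the path assumes the Dutch profile lives under profiles/nl, so adjust it to wherever yours actually is:

from pathlib import Path

# Create an empty custom_words.txt so training has something to open.
# Adjust the path to your own profile location (e.g. /share/rhasspy/profiles/nl).
custom_words = Path("profiles/nl/kaldi/custom_words.txt")
custom_words.parent.mkdir(parents=True, exist_ok=True)
custom_words.touch(exist_ok=True)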
The files were downloaded and extracted successfully.
Adding the custom_words.txt to the kaldi dir solved the problem. Thanks!
I think you may have uncovered a bug, actually! Thanks
Glad to help…
And thank you for creating this nice project… it is exactly what I needed: an offline voice assistant, no cloud.
I have followed the discussion about where Rhasspy fits within the development of Almond and Ada, and I must say that I prefer option 2: "Rhasspy does everything but handling intents. Basically what happens now, except you use intent_script rather than events."
Keep up the good work…
Thanks! There actually was an issue related to custom words. It was using the same file as if you'd selected pocketsphinx instead of kaldi, which was not good.
Hoping to get these kinks worked out as more people are trying Rhasspy.
Great news!
After initial testing with a very cheap microphone, I got my hands on a used USB speakerphone. Just got it working today and I am SUPER excited.
So far I have one automation set up, but it works (using the websocket).
Great work on this! Very excited to see what I can do with it.
BTW I made one mistake when changing to the USB speakerphone - in case anyone makes the same mistake…
I thought I would have to change the device for the container from /dev/snd to /dev/bus/usb/00# … still learning Linux. Don't. Keep it /dev/snd, or you'll be scratching your head for an hour wondering where all your devices went, lol.
Cheers!
DeadEnd
As a quick follow-up… I see the TTS components, but not any directions on how to use them. Are these capable of taking a text string and speaking it? In other words, can I send JSON to the websocket (or does it have to be HTTP? I'm inexperienced and not sure of the differences…) to have it spoken? I didn't see anything in the docs explaining this… maybe I should stop being lazy and just take a look at the API…
/api/text-to-speech
POST text and have Rhasspy speak it
So you would use http://localhost:12101/api/text-to-speech… but how should I format the message?
Trying to connect to /api/text-to-speech as a WebSocket connection gives errors:
[INFO:6110451] quart.serving: 172.17.0.1:44238 GET /api/text-to-speech 1.1 500 21 1321
[ERROR:6110450] __main__: MethodNotAllowed(405)
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/quart/app.py", line 1594, in full_dispatch_websocket
    result = await self.dispatch_websocket(websocket_context)
  File "/usr/local/lib/python3.6/dist-packages/quart/app.py", line 1636, in dispatch_websocket
    raise websocket_.routing_exception
  File "/usr/local/lib/python3.6/dist-packages/quart/ctx.py", line 45, in match_request
    self.request_websocket.url_rule, self.request_websocket.view_args = self.url_adapter.match()  # noqa
  File "/usr/local/lib/python3.6/dist-packages/quart/routing.py", line 271, in match
    raise MethodNotAllowed(allowed_methods=allowed_methods)
quart.exceptions.MethodNotAllowed: MethodNotAllowed(405)
So I must be doing something wrong.
I tried using Kaldi - had the same issue with custom_words.txt - manually added it to resolve this… but I am still getting this error:
KaldiDecoder Missing HCLG.fst Graph not found at /profiles/en/kaldi/model/graph/HCLG.fst. Did you train your profile?
My training takes 0.16 seconds… and after, the HCLG.fst is still missing.
I tested it anyway, and it appears it can't recognize any words, which I expect is due to the missing graph file.
Cheers!
DeadEnd
I have a rest command in HA for that:
rest_command:
  rhasspy_speak:
    url: 'http://192.168.43.54:12101/api/text-to-speech'
    method: 'POST'
    payload: '{{ payload }}'
    content_type: text/plain
where payload is the actual text you want Rhasspy to speak.
In HA you can use this with an automation like this:
- id: '1570370359402'
  alias: Lampen
  trigger:
  - event_data: {}
    event_type: rhasspy_Lights
    platform: event
  condition: []
  action:
  - data_template:
      payload: Dat is goed, ik zet de lamp in de {{ trigger.event.data.location }} {{ trigger.event.data.actiontype }}
    service: rest_command.rhasspy_speak
but essentially you can post a sentence to your Rhasspy API.
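Outside Home Assistant the same call is just an HTTP POST with the sentence as a plain-text body. A minimal Python sketch, assuming Rhasspy is reachable at the address above:

import requests

# Post plain text to Rhasspy's text-to-speech endpoint and it will be spoken.
# Adjust the host/port to wherever your Rhasspy instance is running.
response = requests.post(
    "http://192.168.43.54:12101/api/text-to-speech",
    data="Dat is goed, ik zet de lamp aan.".encode("utf-8"),
    headers={"Content-Type": "text/plain"},
)
response.raise_for_status()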
Hi,
Haven't played with Rhasspy for a while because we were (and still are) renovating our flat, but I got a bit of time today to update everything to the latest version.
I was wondering if it is possible to put Rhasspy into a mode where it waits for a response without using the wake word.
For example:
Have an automation in HA that checks whether the lights were turned off after using the bathroom, and if not I will use Rhasspy (if someone's home) to deal with the problem:
- normal TTS: the light in the bathroom is still on, should I turn it off?
- put Rhasspy into a listening mode with a specific event ID (from HA) so it waits for a response; it should be configurable how long Rhasspy stays in the "response mode" before it resumes normal mode.
- if Rhasspy gets a response it can understand, it should fire an event with the same ID it got from Home Assistant. Then in HA you could deal with that event however you like.
I think in the beginning boolean answers would be enough (yes/no, on/off etc. to keep it simple).
In the future you could create special intents categorised by the "event ID".
For example, if the event "remind-later" is fired it should listen for time input configured like the timer example, but only for the possible intents that are assigned to that event ID.
Is that possible and a good solution?
I am trying to install the Rhasspy add-on on a Raspberry Pi Zero and I am getting the following error:
19-11-29 13:18:17 ERROR (SyncWorker_2) [hassio.docker.addon] Can't build 75f2ff60/armhf-addon-rhasspy:2.4.3: The command '/bin/sh -c chmod a+x /run.sh' returned a non-zero code: 139
Can someone help me?
I have read those methods, but as far as I understand this would need a lot of scripting to get basic functionality like I described working. And those response intents would also work in normal operation, which could be quite annoying.
Maybe I should describe my idea in a bit more detail:
Add a special response mode that can be activated with MQTT, REST or anything else. If you want to use this response mode, just send a payload like the following to Rhasspy:
{
    "rhasspy_mode": "response",
    "response_id": "myautomationcallback",
    "tts": "Lights in the bathroom were left on, should I turn them off?",
    "duration": 20,
    "default_callback": "Yes"
}
Rhasspy then enters the response mode; if a TTS string is in the payload it will play it back and then enable a special response listening mode where it won't recognize normal speech intents but only basic responses, unless the response_id matches a specific response intent (for more complex "conversations") which has to be configured beforehand.
default_callback can be set if you want an automatic response without interacting with Rhasspy.
In my example Rhasspy asks if it should turn off the lights and then waits 20 seconds for a response. There is no special response intent configured for the used response_id, so it will only recognize basic boolean responses like yes or no. If Rhasspy detects the spoken user speech/intent (or there is no response at all), it will fire a new event with the configured response_id and the recognized intent (or default callback):
rhasspy_response_myautomationcallback
The event payload contains the detected response intents, which can then be used in specific automations.
If you want more complex responses you can set up specific response intents that only work if Rhasspy was activated in response mode with the correct response_id. Those intents will be ignored during normal wake word operation.
In my opinion a system like this should not be too complicated to implement, but it allows the end user to easily create simple HA automations or Node-RED scripts to set up a system that can "talk" with the user.
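Purely to make the proposed flow concrete (none of this exists in Rhasspy today; the endpoint name and the event behaviour below are invented for illustration), the round trip could look roughly like this from a small Python script:

import requests

# Hypothetical sketch of the proposed response mode; the endpoint name and
# behaviour are made up here, it only mirrors the payload described above.
payload = {
    "rhasspy_mode": "response",
    "response_id": "myautomationcallback",
    "tts": "Lights in the bathroom were left on, should I turn them off?",
    "duration": 20,
    "default_callback": "Yes",
}

# Imaginary endpoint, used only to make the idea concrete.
requests.post("http://localhost:12101/api/response-mode", json=payload)

# Rhasspy would then (hypothetically) speak the prompt, listen for a yes/no
# answer for 20 seconds, and fire an event such as
# rhasspy_response_myautomationcallback that a Home Assistant automation
# could react to.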