@Hitesh_Singh, this may be beyond my skills, but it could be a write-access problem with .config. What is your Docker user, and what are the access permissions on .config? But again, I could be wrong.
DEBUG:__main__:Namespace(host='0.0.0.0', port=12101, profile='fr', set=[], ssl=None, system_profiles='/usr/share/rhasspy/profiles', user_profiles='/profiles')
DEBUG:RhasspyCore:Loaded profile from /profiles/fr/profile.json
DEBUG:RhasspyCore:Profile files will be written to /profiles/fr
DEBUG:root:Loading default profile settings from /usr/share/rhasspy/profiles/defaults.json
DEBUG:WebSocketObserver: -> started
DEBUG:DialogueManager: -> started
DEBUG:DialogueManager:started -> loading_mqtt
DEBUG:DialogueManager:Loading MQTT first
DEBUG:DialogueManager:Loading...will time out after 30 second(s)
DEBUG:HermesMqtt: -> started
DEBUG:HermesMqtt:started -> connecting
DEBUG:HermesMqtt:Logging in as athena
DEBUG:HermesMqtt:Connecting to MQTT broker 192.168.0.105:1883
DEBUG:DialogueManager:loading_mqtt -> loading
DEBUG:DialogueManager:Loading actors
DEBUG:HermesMqtt:Connection successful.
INFO:HermesMqtt:Connected to 192.168.0.105:1883
DEBUG:HermesMqtt:connecting -> connected
DEBUG:DialogueManager:Actors created. Waiting for ['recorder', 'player', 'speech', 'wake', 'command', 'decoder', 'recognizer', 'handler', 'hass_handler', 'sentence_generator', 'speech_trainer', 'intent_trainer', 'word_pronouncer'] to start.
DEBUG:HermesAudioRecorder: -> started
DEBUG:HermesAudioPlayer: -> started
DEBUG:EspeakSentenceSpeaker: -> started
DEBUG:DummyWakeListener: -> started
DEBUG:DummyCommandListener: -> started
DEBUG:FuzzyWuzzyRecognizer: -> started
DEBUG:PocketsphinxDecoder: -> started
DEBUG:HomeAssistantIntentHandler: -> started
DEBUG:PocketsphinxSpeechTrainer: -> started
DEBUG:FuzzyWuzzyIntentTrainer: -> started
DEBUG:PhonetisaurusPronounce: -> started
DEBUG:JsgfSentenceGenerator: -> started
DEBUG:HermesMqtt:Subscribed to hermes/audioServer/default/audioFrame
DEBUG:DialogueManager:recorder started
DEBUG:EspeakSentenceSpeaker:started -> ready
DEBUG:FuzzyWuzzyRecognizer:Loaded examples from /profiles/fr/intent_examples.json
DEBUG:DialogueManager:player started
DEBUG:FuzzyWuzzyRecognizer:started -> loaded
DEBUG:DialogueManager:wake started
DEBUG:DialogueManager:command started
DEBUG:DialogueManager:speech_trainer started
DEBUG:DialogueManager:intent_trainer started
DEBUG:DialogueManager:word_pronouncer started
DEBUG:DialogueManager:sentence_generator started
DEBUG:DialogueManager:speech started
DEBUG:DialogueManager:recognizer started
DEBUG:PocketsphinxDecoder:Loading decoder with hmm=/profiles/fr/acoustic_model, dict=/profiles/fr/dictionary.txt, lm=/profiles/fr/language_model.txt
DEBUG:DialogueManager:handler started
DEBUG:PocketsphinxDecoder:started -> loaded
DEBUG:DialogueManager:decoder started
WARNING:DialogueManager:Actor timeout! Still waiting on ['hass_handler'] Loading anyway...
DEBUG:DialogueManager:loading -> ready
INFO:DialogueManager:Automatically listening for wake word
DEBUG:DialogueManager:ready -> asleep
INFO:__main__:Started
DEBUG:__main__:Starting web server at http://0.0.0.0:12101
Edit: Something that may be of interest: when I use a POST HTTP request to api/listen-for-command, I get these logs:
Edit 2: My bad, bad training led to that. I have managed to fix the issue from the first edit, but my problem is still here: Rhasspy doesn’t listen to MQTT.
From that error message, I would guess that the profile download got interrupted and Rhasspy’s download cache is corrupted. Unfortunately, Rhasspy doesn’t try to verify the files once they’re present, so you would have to either (1) download them manually (which looks like what you did) or (2) delete the download folder in your profile and restart Rhasspy.
For anyone in the future experiencing this problem, I’d recommend grabbing the files from the GitHub release page for your language instead and downloading them manually into the download folder in your profile. For example, the English profile would need cmusphinx-en-us-5.2.tar.gz, en-70k-0.2-pruned.lm.gz, and en-g2p.tar.gz. The .pt files are only needed if you use the flair intent recognizer (which you probably don’t).
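If you want to script it, here is a rough sketch of fetching those files into the profile’s download folder (the release URL and profile path are placeholders; use the actual GitHub release page for your language):

```python
import urllib.request
from pathlib import Path

# Placeholder URL -- substitute the real release page for your language.
BASE_URL = "https://github.com/synesthesiam/rhasspy-profiles/releases/download/v1.0-en"
FILES = ["cmusphinx-en-us-5.2.tar.gz", "en-70k-0.2-pruned.lm.gz", "en-g2p.tar.gz"]

# Placeholder profile path; this is the download folder in your profile.
download_dir = Path("/profiles/en/download")
download_dir.mkdir(parents=True, exist_ok=True)

for name in FILES:
    urllib.request.urlretrieve(f"{BASE_URL}/{name}", str(download_dir / name))
```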
Thanks for sharing. I didn’t find an answer here: https://rhasspy.readthedocs.io/en/latest/wake-word/ . What is the behavior of Rhasspy if you don’t set a wake word system? Does it listen to MQTT, or does it still need to be activated in some way?
I’m asking for debugging purposes; right now Rhasspy still doesn’t listen to my MQTT Hermes messages.
The default is the dummy wake word system. I agree, the documentation should be clearer about that. All of the defaults are present in the defaults.json file in the Rhasspy repo, if you’re curious.
From your logs above, it looks like Rhasspy is connecting to your MQTT broker correctly and subscribing to audio frames. You should be able to go to the web interface, hold down the “Hold to Record” button, and speak a command (then let go). Does this work for you?
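If you’d rather test without the browser, here is a rough equivalent through the HTTP API you mentioned earlier (a sketch; I’m assuming Rhasspy is reachable at localhost:12101, matching your startup logs):

```python
import requests

# Wakes Rhasspy up, records one voice command, and returns the
# recognized intent as JSON once the command is finished.
response = requests.post("http://localhost:12101/api/listen-for-command")
print(response.json())
```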
I tried that when I was playing with wake word services.
However, this morning I tried recording with my browser audio input via MQTT, using the default wake word configuration (dummy). I get these error messages in my logs:
WARNING:HomeAssistantIntentHandler:Empty intent. Not sending to Home Assistant
WARNING:HermesMqtt:Empty intent. Not forwarding to MQTT
DEBUG:__main__:Recorded 120364 byte(s) of audio data
DEBUG:PocketsphinxDecoder:rate=16000, width=2, channels=1.
DEBUG:PocketsphinxDecoder:Decoded WAV in 0.9733586311340332 second(s)
DEBUG:PocketsphinxDecoder:Transcription confidence: 0.017280606925557645
WARNING:PocketsphinxDecoder:Transcription did not meet confidence threshold: 0.017280606925557645 < 0.8
DEBUG:__main__:
DEBUG:__main__:{"text": "", "intent": {"name": "", "confidence": 0}, "entities": [], "speech_confidence": 0, "slots": {}}
WARNING:HomeAssistantIntentHandler:Empty intent. Not sending to Home Assistant
WARNING:HermesMqtt:Empty intent. Not forwarding to MQTT
From what I understand, PocketSphinx does not understand what I’m saying. I will try to tune the confidence threshold.
Edit: Okay, by lowering the Pocketsphinx minimum confidence to 0.01, I got it working when recording via my web browser, thanks a lot! I have a question though: my average confidence is between 0.02 and 0.03 in the logs, is this normal? Also, my intents are not recognized very well, as Rhasspy tends to mix up intents (i.e. giving me a temperature reading when I try to close a curtain).
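For reference, this is roughly how I changed it (a sketch; I’m assuming the setting lives under speech_to_text.pocketsphinx.min_confidence in profile.json, and the profile path is mine):

```python
import json
from pathlib import Path

# Path and setting key are assumptions for my French profile.
profile_path = Path("/profiles/fr/profile.json")
profile = json.loads(profile_path.read_text())

stt = profile.setdefault("speech_to_text", {})
stt.setdefault("pocketsphinx", {})["min_confidence"] = 0.01

profile_path.write_text(json.dumps(profile, indent=4))
```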
Did you try to retrain Rhasspy? I also noticed that it sometimes does this (completely messed up intent understanding), but after retraining it works flawlessly again.
Hi! After some time with my sentences implemented in English, I would like to migrate them to my native language (pt-br). When I train the sentences (Kaldi must be used), I get the following error:
Training failed: Exception("realpath: '${KALDI_PREFIX}/kaldi': No such file or directory\n",)
I downloaded the "kaldi_armhf.tar.gz" version, which I believe is the correct one, and unpacked it, but I don’t know how to proceed. The Kaldi URL mentions another way to install it, so I am afraid of doing it the wrong way and breaking something.
So now I have the version 1.0 that I mentioned above, unpacked in a /tmp/kaldi folder.
Can someone kindly point me to what I should do next in order to compile and install it? Thanks!
Pocketsphinx’s confidence seems to depend on the number of possible sentences, which means the “best” threshold will change if you add more intents or sentences. This is really unfortunate, and I’d love to hear from anyone who knows a better way of getting confidence values out of Pocketsphinx!
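For context, here is roughly how a confidence value comes out of the decoder (a sketch using the old pocketsphinx Python bindings; the model paths are placeholders for the files in a profile):

```python
from pocketsphinx import Decoder

# Paths are placeholders; point them at the files in your profile.
config = Decoder.default_config()
config.set_string("-hmm", "/profiles/fr/acoustic_model")
config.set_string("-dict", "/profiles/fr/dictionary.txt")
config.set_string("-lm", "/profiles/fr/language_model.txt")
decoder = Decoder(config)

# Raw 16 kHz, 16-bit mono PCM audio (placeholder file).
with open("command.raw", "rb") as audio_file:
    audio = audio_file.read()

decoder.start_utt()
decoder.process_raw(audio, False, True)  # process the full utterance at once
decoder.end_utt()

hyp = decoder.hyp()
if hyp is not None:
    # hyp.prob is in log space; convert it with the decoder's logmath
    confidence = decoder.get_logmath().exp(hyp.prob)
    print(hyp.hypstr, confidence)
```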
I agree with @koan that you likely need to just re-train Rhasspy. You may also try using fsticuffs as your intent recognizer instead of fuzzywuzzy (which I saw being loaded from your logs). Fuzzywuzzy will start making mistakes if your sentences are very similar, since it just uses fuzzy string matching. Fsticuffs is more strict, but will have no trouble with similar sentences.
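To see why fuzzy string matching trips over similar sentences, here is a minimal sketch with the fuzzywuzzy library (the example sentences are made up):

```python
from fuzzywuzzy import process

# Hypothetical training sentences from two different intents.
sentences = [
    "quelle est la temperature du salon",  # GetTemperature
    "ferme les volets du salon",           # CloseShutters
]

# A noisy transcription that shares words with both intents can
# easily land on the wrong one; extractOne returns (match, score).
print(process.extractOne("ferme la temperature salon", sentences))
```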
But in my case, on and off should be changed to “acender” and “apagar”. So, I believe a new template can be used to switch “acender” to “on” and “apagar” to “off”. Doing so, HA can understand the task correctly. Can anyone post how you circumvented this? Thanks for your help.
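For illustration, something like this is what I have in mind, assuming Rhasspy supports the word substitution syntax (spoken:substituted) from the training docs in the pt profile (the intent name and sentence here are made up):

```
[ChangeLightState]
(acender:on | apagar:off){state} a luz da sala
```

That way I would say “acender” or “apagar”, but the {state} tag that reaches HA would be “on” or “off”.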
Thanks for your feedback. Well, as OpenFST does not seem to work for me (I get empty intents every time, even after a retrain), I think I’m going to stick with Pocketsphinx.
Here are my logs when I try a “Get Intent”:
ERROR: State ID -1 not valid
ERROR:root:['ferme', 'les', 'du', 'salon', 'baie', 'vitree', 'est']
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/jsgf2fst/fstaccept.py", line 66, in fstaccept
out_sentences = fstprintall(out_fst, exclude_meta=False)
File "/usr/local/lib/python3.6/dist-packages/jsgf2fst/fstaccept.py", line 206, in fstprintall
for arc in in_fst.arcs(state):
File "pywrapfst.pyx", line 1406, in pywrapfst._Fst.arcs (pywrapfst.cc:17426)
File "pywrapfst.pyx", line 1420, in pywrapfst._Fst.arcs (pywrapfst.cc:17365)
File "pywrapfst.pyx", line 2878, in pywrapfst.ArcIterator.__init__ (pywrapfst.cc:31111)
pywrapfst.FstIndexError: State index out of range
Edit: Never mind, after switching some intents around, it works again with OpenFST.
I am having an issue in “pt” when the sentences have “alternatives”, like below:
[GetHumidity]
humid_local = (externa | interna) {name}
qual [é] a umidade <humid_local>
All of them work in English, but none of them work in pt. When the sentence is simple, with no alternatives, it works.
I also tried to debug by watching for the intent arriving on MQTT using “mosquitto_sub”, but nothing arrives.
How can I debug it a little further? Could something be broken when using the pt profile?
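For reference, this is roughly how I am watching the MQTT side, as a Python alternative to mosquitto_sub (a sketch with paho-mqtt; the broker address is my local one, and I’m assuming recognized intents are published under hermes/intent/):

```python
import paho.mqtt.client as mqtt

BROKER = "192.168.0.105"  # my local broker; adjust to your profile

def on_connect(client, userdata, flags, rc):
    # Recognized intents should show up under hermes/intent/<intentName>
    client.subscribe("hermes/intent/#")

def on_message(client, userdata, msg):
    print(msg.topic, msg.payload.decode())

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect(BROKER, 1883)
client.loop_forever()
```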
EDIT: Using v2.22
EDIT 2:
Experimenting in the “Speech” tab, these sentences with alternatives also don’t work, returning:
I was able to get it to work by switching to fuzzywuzzy for intent recognition. The Portuguese speech model does not seem to do very well, and the OpenFST recognizer is too sensitive.
I generated a WAV file with the sentence “qual a umidade externa” using Google WaveNet, and Rhasspy transcribed it as “qual o ar ligue ar de externa” (you can see this in the log). Fuzzywuzzy matches it correctly, but only because there are very few other sentences.
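If you want to reproduce this kind of test yourself, here is a rough sketch against Rhasspy’s HTTP speech-to-text endpoint (the WAV file name and host are assumptions; the port matches the default web server):

```python
import requests

# Read a 16 kHz, 16-bit mono WAV file (path is a placeholder).
with open("qual_a_umidade_externa.wav", "rb") as wav_file:
    wav_data = wav_file.read()

# POST the raw WAV to Rhasspy and print the transcription it returns.
response = requests.post(
    "http://localhost:12101/api/speech-to-text",
    data=wav_data,
)
print(response.text)
```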
The underlying issue here is that all of the Kaldi models I have (Portuguese, Vietnamese, Swedish) are not really intended for speech-to-text, and are trained on (expensive) data I don’t have access to. However, I did manage to find a Portuguese speech dataset that I will try to train a model on – it’s about 10.5K WAV files, which should be plenty!
I hope Fuzzywuzzy will work for you in the meantime.
I tried fuzzywuzzy and it only worked after a new “training”; otherwise it doesn’t work at all. Even then, it worked in a VERY unreliable way, returning a very different intent, one whose sentence doesn’t even sound close to what it understood.
I also tried to play with the “minimum confidence” for fuzzywuzzy, with no success.
I hope that the new Portuguese speech dataset will be successful.
It seems that I will have to switch back to the EN profile. Anyway, thanks for your help and effort developing for other languages. Maybe you have another trick that I can test.
In case it’s useful: for the PT profile I have 11 sentences; for the EN profile, about 25.
After some false starts, I managed to train a custom acoustic model for Portuguese. I’ve updated Rhasspy to use this new model instead of the old Kaldi one, and I’d be interested to hear your feedback.
In my testing, the model does just a little better than the Kaldi one. I’m hoping it’s enough of an improvement that you find it useful. The major roadblock is the small amount of data in the dataset I mentioned. It turns out there are only 8 hours of audio, whereas good models apparently need 100+ hours. Mozilla claims to have 11 hours of Portuguese on their Common Voice website, but it’s not available for download.
Another idea I had was to use public domain audiobooks for training. This has apparently been done for English, but the process looks daunting…
It still makes some mistakes, but in my preliminary testing it is MUCH better than Kaldi. I have to experiment more, though. It also seems to respond a lot faster. This should be the way to go. I hope the needed model hours can be reached somehow.
I will need some days of further experimenting to give you more solid feedback; anyway, it seems promising.
Thanks one more time for your excellent work.
EDIT: It was a gift for my anniversary in the community. THANKS!