Rhasspy offline voice assistant toolkit

Well, here’s an even better gift: as of August 23rd, CMU uploaded an official Portuguese acoustic model! I completely missed that :upside_down_face:

It decoded all of my test phrases perfectly, so I’ve updated Rhasspy again to include that model instead. Some things you’ll need to do to use it:

  1. Update Rhasspy
  2. In Settings, click Re-Download Profile
  3. If you have any custom words, you’ll need to re-do them (CMU uses different phonemes)

Happy Rhasspy-ing

Great! Working even better! And it is also fast.
But I’m having an issue that was already present before:

When I use:

light_state = (ligar){state:on} | (desligar){state:off}

<light_state> [a] <light_name>

And I say “desligar”, it returns “ligar”. I also tried “acenda” and “apague”, and when I say “apague”, it returns “acenda”.
I don’t know if I did something wrong or if this is a bug.

This is really odd behavior. I can see in the new CMU Portuguese dictionary that “desligar”, “ligar”, “acenda”, and “apague” are all present with single pronunciations. Did you make sure to delete them from your custom words in Rhasspy?

I would also watch the Rhasspy log after you upload a WAV file to see what the exact transcription is. It could be the case that fuzzywuzzy is doing something wrong even if Pocketsphinx is hearing you correctly.
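
To illustrate how a fuzzy matcher could confuse “desligar” with “ligar” even when the transcription is right, here is a minimal sketch using only Python's standard `difflib`. It is not Rhasspy's or fuzzywuzzy's actual code; `partial_ratio` is a hypothetical helper that scores the shorter string against every same-length window of the longer one, which is the general idea behind partial matching.

```python
from difflib import SequenceMatcher


def partial_ratio(a: str, b: str) -> float:
    """Best similarity between the shorter string and any same-length
    window of the longer string (hypothetical helper, for illustration)."""
    shorter, longer = sorted((a, b), key=len)
    if not shorter:
        return 0.0
    window = len(shorter)
    return max(
        SequenceMatcher(None, shorter, longer[i:i + window]).ratio()
        for i in range(len(longer) - window + 1)
    )


heard = "desligar"
for candidate in ("ligar", "desligar"):
    full = SequenceMatcher(None, heard, candidate).ratio()
    print(candidate, "partial:", partial_ratio(heard, candidate), "full:", full)
```

Because “ligar” is contained in “desligar”, both candidates get a perfect partial score, so a matcher that relies on partial scores and breaks ties by order could pick the wrong one. A full-string ratio separates them. This would not explain the “acenda”/“apague” mix-up, so the log is still the place to look.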

Would you be willing to send me your sentences.ini file and a WAV file where it messes up? I can help debug.

Yes. I did. This new dictionary seems to be more complete indeed. Many words that I had in custom words were not needed anymore.

> I would also watch the Rhasspy log after you upload a WAV file to see what the exact transcription is. It could be the case that fuzzywuzzy is doing something wrong even if Pocketsphinx is hearing you correctly.

I watched the transcription after holding the “Hold to Record” button. And that was what I got.

> Would you be willing to send me your `sentences.ini` file and a WAV file where it messes up? I can help debug.

Sure! To get it working in the meantime, I had to use many more automations… Aff
But I have a backup of how it was before. Most likely on Sunday I will send you both the sentences.ini and WAV files.
Thanks.

Hello,

I’ve been playing around with rhasspy for a couple of days now and it’s starting to work well. Thank you for your hard work! I do however run into an issue when using a wakeword. This issue is not present without a wakeword.

When I speak a sentence to rhasspy after a wakeword, it finds the intent, but it takes another 18 seconds or so before node-red sees the intent on the websocket. Without a wakeword, just pressing the “wake” button, this happens almost instantly.

Here is the logging for this particular issue:

rhasspy_1        | 2019-09-15T06:30:03.425177817Z DEBUG:DialogueManager:Awake!
rhasspy_1        | 2019-09-15T06:30:03.425327358Z DEBUG:DialogueManager:asleep -> awake
rhasspy_1        | 2019-09-15T06:30:03.426400818Z DEBUG:APlayAudioPlayer:['aplay', '-q', '/usr/share/rhasspy/etc/wav/beep_hi.wav']
rhasspy_1        | 2019-09-15T06:30:03.432346968Z DEBUG:WebrtcvadCommandListener:loaded -> listening
rhasspy_1        | 2019-09-15T06:30:03.433710247Z DEBUG:SnowboyWakeListener:listening -> loaded
rhasspy_1        | 2019-09-15T06:30:03.434661588Z DEBUG:ARecordAudioRecorder:recording -> started
rhasspy_1        | 2019-09-15T06:30:03.435807724Z DEBUG:ARecordAudioRecorder:Stopped recording from microphone (arecord)
rhasspy_1        | 2019-09-15T06:30:03.435942016Z DEBUG:ARecordAudioRecorder:started -> recording
rhasspy_1        | 2019-09-15T06:30:03.436930852Z DEBUG:ARecordAudioRecorder:['arecord', '-q', '-r', '16000', '-f', 'S16_LE', '-c', '1', '-t', 'raw', '-D', 'default:CARD=CameraB409241']
rhasspy_1        | 2019-09-15T06:30:03.437815606Z DEBUG:ARecordAudioRecorder:Recording from microphone (arecord)
rhasspy_1        | 2019-09-15T06:30:03.999560324Z DEBUG:WebrtcvadCommandListener:Voice command started
rhasspy_1        | 2019-09-15T06:30:05.919244630Z DEBUG:WebrtcvadCommandListener:Voice command finished
rhasspy_1        | 2019-09-15T06:30:05.920433450Z DEBUG:WebrtcvadCommandListener:listening -> loaded
rhasspy_1        | 2019-09-15T06:30:05.921690940Z DEBUG:DialogueManager:awake -> decoding
rhasspy_1        | 2019-09-15T06:30:05.922945645Z DEBUG:PocketsphinxDecoder:rate=16000, width=2, channels=1.
rhasspy_1        | 2019-09-15T06:30:05.926208121Z DEBUG:APlayAudioPlayer:['aplay', '-q', '/usr/share/rhasspy/etc/wav/beep_lo.wav']
rhasspy_1        | 2019-09-15T06:30:05.926376885Z DEBUG:ARecordAudioRecorder:recording -> started
rhasspy_1        | 2019-09-15T06:30:05.927012261Z DEBUG:ARecordAudioRecorder:Stopped recording from microphone (arecord)
rhasspy_1        | 2019-09-15T06:30:06.056042450Z DEBUG:PocketsphinxDecoder:Decoded WAV in 0.13323044776916504 second(s)
rhasspy_1        | 2019-09-15T06:30:06.056980948Z DEBUG:PocketsphinxDecoder:Transcription confidence: 0.14622293698088337
rhasspy_1        | 2019-09-15T06:30:06.057129800Z DEBUG:PocketsphinxDecoder:hoe laat is het
rhasspy_1        | 2019-09-15T06:30:06.057985674Z DEBUG:DialogueManager:hoe laat is het (confidence=0.14622293698088337)
rhasspy_1        | 2019-09-15T06:30:06.058028148Z DEBUG:DialogueManager:decoding -> recognizing
rhasspy_1        | 2019-09-15T06:30:06.059501765Z DEBUG:FsticuffsRecognizer:Got 1 intent(s)
rhasspy_1        | 2019-09-15T06:30:06.059538750Z DEBUG:FsticuffsRecognizer:[{'text': 'hoe laat is het', 'intent': {'name': 'GetTime', 'confidence': 1.0}, 'entities': [], 'raw_text': 'hoe laat is het', 'tokens': ['hoe', 'laat', 'is', 'het'], 'raw_tokens': ['hoe', 'laat', 'is', 'het']}]
rhasspy_1        | 2019-09-15T06:30:06.059751726Z DEBUG:DialogueManager:{'text': 'hoe laat is het', 'intent': {'name': 'GetTime', 'confidence': 1.0}, 'entities': [], 'raw_text': 'hoe laat is het', 'tokens': ['hoe', 'laat', 'is', 'het'], 'raw_tokens': ['hoe', 'laat', 'is', 'het'], 'speech_confidence': 0.14622293698088337}
rhasspy_1        | 2019-09-15T06:30:06.059999644Z DEBUG:DialogueManager:recognizing -> handling
rhasspy_1        | 2019-09-15T06:30:06.060397812Z DEBUG:WebSocketObserver:{"text": "hoe laat is het", "intent": {"name": "GetTime", "confidence": 1.0}, "entities": [], "raw_text": "hoe laat is het", "tokens": ["hoe", "laat", "is", "het"], "raw_tokens": ["hoe", "laat", "is", "het"], "speech_confidence": 0.14622293698088337, "slots": {}}
rhasspy_1        | 2019-09-15T06:30:06.062806451Z DEBUG:DialogueManager:handling -> ready
rhasspy_1        | 2019-09-15T06:30:06.062869805Z INFO:DialogueManager:Automatically listening for wake word
rhasspy_1        | 2019-09-15T06:30:06.062885158Z DEBUG:DialogueManager:ready -> asleep
rhasspy_1        | 2019-09-15T06:30:06.062895802Z DEBUG:SnowboyWakeListener:loaded -> listening
rhasspy_1        | 2019-09-15T06:30:06.062906523Z DEBUG:ARecordAudioRecorder:started -> recording
rhasspy_1        | 2019-09-15T06:30:06.062917657Z DEBUG:ARecordAudioRecorder:['arecord', '-q', '-r', '16000', '-f', 'S16_LE', '-c', '1', '-t', 'raw', '-D', 'default:CARD=CameraB409241']
rhasspy_1        | 2019-09-15T06:30:06.062931434Z DEBUG:ARecordAudioRecorder:Recording from microphone (arecord)
rhasspy_1        | 2019-09-15T06:30:24.349774208Z DEBUG:EspeakSentenceSpeaker:['espeak', '-v', 'nl', '--stdout', 'Het is 6:30 AM']
rhasspy_1        | 2019-09-15T06:30:24.375711496Z DEBUG:EspeakSentenceSpeaker:ready -> speaking
rhasspy_1        | 2019-09-15T06:30:24.376597943Z DEBUG:APlayAudioPlayer:['aplay', '-q']
rhasspy_1        | 2019-09-15T06:30:26.105522676Z DEBUG:EspeakSentenceSpeaker:speaking -> ready

Here you can see it wakes up, listens, finds the intent (at 06:30:06.060397812), and starts listening for another wakeword. Then, after 18 seconds (at 06:30:24.349774208), it receives the espeak command from node-red. The delay is NOT in node-red: I’ve monitored the node-red logging, and the websocket message is only received in node-red after those 18 seconds.
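
For anyone who wants to measure this kind of gap from their own logs, here is a small sketch that computes the delay between the two timestamps quoted above. The helper name `parse_ts` is made up for this example. Docker prints nanoseconds, while Python's `datetime` only keeps microseconds, so the fraction is truncated.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse a Docker log timestamp such as 2019-09-15T06:30:06.060397812Z.
    The nanosecond fraction is truncated to microseconds."""
    date_part, fraction = ts.rstrip("Z").split(".")
    return datetime.strptime(f"{date_part}.{fraction[:6]}", "%Y-%m-%dT%H:%M:%S.%f")


# Timestamps taken from the log above.
intent_logged = parse_ts("2019-09-15T06:30:06.060397812Z")    # WebSocketObserver line
espeak_received = parse_ts("2019-09-15T06:30:24.349774208Z")  # EspeakSentenceSpeaker line

delay = (espeak_received - intent_logged).total_seconds()
print(f"{delay:.3f}")  # 18.289
```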

Can anybody shine some light on this issue?

Thank you very much!

PS: I tried multiple wakeword handlers and they all seem to work fine. If I switch from using a wakeword to not using one, without changing any other settings, the problem is gone.

Edit: Added logging when not using wakeword:

rhasspy_1        | 2019-09-15T06:49:53.622047396Z DEBUG:DialogueManager:asleep -> awake
rhasspy_1        | 2019-09-15T06:49:53.622891593Z DEBUG:WebrtcvadCommandListener:loaded -> listening
rhasspy_1        | 2019-09-15T06:49:53.623424212Z DEBUG:APlayAudioPlayer:['aplay', '-q', '/usr/share/rhasspy/etc/wav/beep_hi.wav']
rhasspy_1        | 2019-09-15T06:49:53.631411021Z DEBUG:ARecordAudioRecorder:started -> recording
rhasspy_1        | 2019-09-15T06:49:53.631725979Z DEBUG:ARecordAudioRecorder:['arecord', '-q', '-r', '16000', '-f', 'S16_LE', '-c', '1', '-t', 'raw', '-D', 'default:CARD=CameraB409241']
rhasspy_1        | 2019-09-15T06:49:53.643164143Z DEBUG:ARecordAudioRecorder:Recording from microphone (arecord)
rhasspy_1        | 2019-09-15T06:49:54.178782341Z DEBUG:WebrtcvadCommandListener:Voice command started
rhasspy_1        | 2019-09-15T06:49:56.287289036Z DEBUG:WebrtcvadCommandListener:Voice command finished
rhasspy_1        | 2019-09-15T06:49:56.287641431Z DEBUG:WebrtcvadCommandListener:listening -> loaded
rhasspy_1        | 2019-09-15T06:49:56.293841469Z DEBUG:DialogueManager:awake -> decoding
rhasspy_1        | 2019-09-15T06:49:56.293900786Z DEBUG:APlayAudioPlayer:['aplay', '-q', '/usr/share/rhasspy/etc/wav/beep_lo.wav']
rhasspy_1        | 2019-09-15T06:49:56.301341777Z DEBUG:PocketsphinxDecoder:rate=16000, width=2, channels=1.
rhasspy_1        | 2019-09-15T06:49:56.303845713Z DEBUG:ARecordAudioRecorder:recording -> started
rhasspy_1        | 2019-09-15T06:49:56.306549867Z DEBUG:ARecordAudioRecorder:Stopped recording from microphone (arecord)
rhasspy_1        | 2019-09-15T06:49:56.565217959Z DEBUG:PocketsphinxDecoder:Decoded WAV in 0.2633223533630371 second(s)
rhasspy_1        | 2019-09-15T06:49:56.566176554Z DEBUG:PocketsphinxDecoder:Transcription confidence: 0.5165017610925368
rhasspy_1        | 2019-09-15T06:49:56.570271469Z DEBUG:PocketsphinxDecoder:hoe laat is het
rhasspy_1        | 2019-09-15T06:49:56.570313421Z DEBUG:DialogueManager:hoe laat is het (confidence=0.5165017610925368)
rhasspy_1        | 2019-09-15T06:49:56.570328522Z DEBUG:DialogueManager:decoding -> recognizing
rhasspy_1        | 2019-09-15T06:49:56.570337977Z DEBUG:FsticuffsRecognizer:Got 1 intent(s)
rhasspy_1        | 2019-09-15T06:49:56.570345430Z DEBUG:FsticuffsRecognizer:[{'text': 'hoe laat is het', 'intent': {'name': 'GetTime', 'confidence': 1.0}, 'entities': [], 'raw_text': 'hoe laat is het', 'tokens': ['hoe', 'laat', 'is', 'het'], 'raw_tokens': ['hoe', 'laat', 'is', 'het']}]
rhasspy_1        | 2019-09-15T06:49:56.570353725Z DEBUG:DialogueManager:{'text': 'hoe laat is het', 'intent': {'name': 'GetTime', 'confidence': 1.0}, 'entities': [], 'raw_text': 'hoe laat is het', 'tokens': ['hoe', 'laat', 'is', 'het'], 'raw_tokens': ['hoe', 'laat', 'is', 'het'], 'speech_confidence': 0.5165017610925368}
rhasspy_1        | 2019-09-15T06:49:56.570363174Z DEBUG:DialogueManager:recognizing -> handling
rhasspy_1        | 2019-09-15T06:49:56.570374589Z DEBUG:DialogueManager:handling -> ready
rhasspy_1        | 2019-09-15T06:49:56.570385665Z INFO:DialogueManager:Automatically listening for wake word
rhasspy_1        | 2019-09-15T06:49:56.570394445Z DEBUG:DialogueManager:ready -> asleep
rhasspy_1        | 2019-09-15T06:49:56.571227043Z DEBUG:WebSocketObserver:{"text": "hoe laat is het", "intent": {"name": "GetTime", "confidence": 1.0}, "entities": [], "raw_text": "hoe laat is het", "tokens": ["hoe", "laat", "is", "het"], "raw_tokens": ["hoe", "laat", "is", "het"], "speech_confidence": 0.5165017610925368, "slots": {}}
rhasspy_1        | 2019-09-15T06:49:56.592745005Z DEBUG:EspeakSentenceSpeaker:['espeak', '-v', 'nl', '--stdout', 'Het is 6:49 AM']
rhasspy_1        | 2019-09-15T06:49:56.626688872Z DEBUG:EspeakSentenceSpeaker:ready -> speaking
rhasspy_1        | 2019-09-15T06:49:57.026023920Z DEBUG:APlayAudioPlayer:['aplay', '-q']
rhasspy_1        | 2019-09-15T06:49:59.206410336Z DEBUG:EspeakSentenceSpeaker:speaking -> ready

Glad you’re giving Rhasspy a shot!

I suspect the problem you’re seeing is related to the websocket library I’m using. I’ve taken a shot in the dark and enabled some extra stuff suggested in the documentation, specifically gevent monkey patching and yielding to gevent after intents are detected. Please update and let me know how it goes :slight_smile:
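
To show why yielding can matter, here is a rough sketch of the same idea using Python's standard `asyncio` instead of gevent, so it runs without extra packages. It is not the actual Rhasspy change. All function names are invented for the example, and `await asyncio.sleep(0)` plays the role that `gevent.sleep(0)` would play in gevent code.

```python
import asyncio


async def send_intent(log):
    # Stand-in for the websocket send that was arriving late.
    log.append("intent sent")


async def busy_listener(log, yield_control: bool):
    # Stand-in for work that keeps running after the intent is detected.
    for step in range(3):
        log.append(f"listen {step}")
        if yield_control:
            await asyncio.sleep(0)  # cooperative yield to other pending tasks


async def run(yield_control: bool):
    log = []
    sender = asyncio.create_task(send_intent(log))  # scheduled, not yet run
    await busy_listener(log, yield_control)
    await sender
    return log


print(asyncio.run(run(False)))  # ['listen 0', 'listen 1', 'listen 2', 'intent sent']
print(asyncio.run(run(True)))   # ['listen 0', 'intent sent', 'listen 1', 'listen 2']
```

Without the yield, the pending send only runs once the busy code finishes. With it, the send goes out at the first opportunity.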

Thank you! I will try this when I get home tonight and let you know the results in an edit of this post

Edit:

It’s working now! Awesome!

I have attached my logging because it does show an error, but the error doesn’t seem to interfere with functionality.

Again, thank you!

rhasspy_1        | 2019-09-16T15:16:35.376389788Z DEBUG:SnowboyWakeListener:Hotword detected (snowboy/computer.umdl)
rhasspy_1        | 2019-09-16T15:16:35.376443996Z DEBUG:DialogueManager:Awake!
rhasspy_1        | 2019-09-16T15:16:35.376459365Z DEBUG:DialogueManager:asleep -> awake
rhasspy_1        | 2019-09-16T15:16:35.376473543Z DEBUG:SnowboyWakeListener:listening -> loaded
rhasspy_1        | 2019-09-16T15:16:35.376518942Z DEBUG:APlayAudioPlayer:['aplay', '-q', '/usr/share/rhasspy/etc/wav/beep_hi.wav']
rhasspy_1        | 2019-09-16T15:16:35.376622852Z DEBUG:WebrtcvadCommandListener:loaded -> listening
rhasspy_1        | 2019-09-16T15:16:35.376725316Z arecord: pcm_read:2103: read error: Interrupted system call
rhasspy_1        | 2019-09-16T15:16:35.377042184Z DEBUG:ARecordAudioRecorder:recording -> started
rhasspy_1        | 2019-09-16T15:16:35.385180591Z DEBUG:ARecordAudioRecorder:Stopped recording from microphone (arecord)
rhasspy_1        | 2019-09-16T15:16:35.386392910Z DEBUG:ARecordAudioRecorder:started -> recording
rhasspy_1        | 2019-09-16T15:16:35.386785376Z DEBUG:ARecordAudioRecorder:['arecord', '-q', '-r', '16000', '-f', 'S16_LE', '-c', '1', '-t', 'raw', '-D', 'default:CARD=CameraB409241']
rhasspy_1        | 2019-09-16T15:16:35.396595375Z DEBUG:ARecordAudioRecorder:Recording from microphone (arecord)
rhasspy_1        | 2019-09-16T15:16:35.931805746Z Exception in thread Thread-39:
rhasspy_1        | 2019-09-16T15:16:35.931854781Z Traceback (most recent call last):
rhasspy_1        | 2019-09-16T15:16:35.931867564Z   File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
rhasspy_1        | 2019-09-16T15:16:35.931878946Z     self.run()
rhasspy_1        | 2019-09-16T15:16:35.931889210Z   File "/usr/local/lib/python3.6/dist-packages/gevent/threading.py", line 177, in run
rhasspy_1        | 2019-09-16T15:16:35.931900021Z     super(Thread, self).run()
rhasspy_1        | 2019-09-16T15:16:35.931910545Z   File "/usr/lib/python3.6/threading.py", line 864, in run
rhasspy_1        | 2019-09-16T15:16:35.931921067Z     self._target(*self._args, **self._kwargs)
rhasspy_1        | 2019-09-16T15:16:35.931931223Z   File "/usr/share/rhasspy/rhasspy/audio_recorder.py", line 376, in process_data
rhasspy_1        | 2019-09-16T15:16:35.931942398Z     data = self.record_proc.stdout.read(self.chunk_size)
rhasspy_1        | 2019-09-16T15:16:35.931952927Z AttributeError: 'NoneType' object has no attribute 'stdout'
rhasspy_1        | 2019-09-16T15:16:35.931963221Z 
rhasspy_1        | 2019-09-16T15:16:35.931973215Z DEBUG:WebrtcvadCommandListener:Voice command started
rhasspy_1        | 2019-09-16T15:16:38.172089682Z DEBUG:WebrtcvadCommandListener:Voice command finished
rhasspy_1        | 2019-09-16T15:16:38.172852497Z DEBUG:WebrtcvadCommandListener:listening -> loaded
rhasspy_1        | 2019-09-16T15:16:38.178122984Z DEBUG:ARecordAudioRecorder:recording -> started
rhasspy_1        | 2019-09-16T15:16:38.178213224Z DEBUG:DialogueManager:awake -> decoding
rhasspy_1        | 2019-09-16T15:16:38.178232846Z DEBUG:APlayAudioPlayer:['aplay', '-q', '/usr/share/rhasspy/etc/wav/beep_lo.wav']
rhasspy_1        | 2019-09-16T15:16:38.178248108Z DEBUG:PocketsphinxDecoder:rate=16000, width=2, channels=1.
rhasspy_1        | 2019-09-16T15:16:38.178261316Z DEBUG:ARecordAudioRecorder:Stopped recording from microphone (arecord)
rhasspy_1        | 2019-09-16T15:16:38.178274413Z arecord: pcm_read:2103: read error: Interrupted system call
rhasspy_1        | 2019-09-16T15:16:38.340086495Z DEBUG:PocketsphinxDecoder:Decoded WAV in 0.15467286109924316 second(s)
rhasspy_1        | 2019-09-16T15:16:38.341787739Z DEBUG:PocketsphinxDecoder:Transcription confidence: 0.07572104623112881
rhasspy_1        | 2019-09-16T15:16:38.342316485Z DEBUG:PocketsphinxDecoder:zet de woonkamerlamp aan
rhasspy_1        | 2019-09-16T15:16:38.342958941Z DEBUG:DialogueManager:zet de woonkamerlamp aan (confidence=0.07572104623112881)
rhasspy_1        | 2019-09-16T15:16:38.343630183Z DEBUG:DialogueManager:decoding -> recognizing
rhasspy_1        | 2019-09-16T15:16:38.344787991Z DEBUG:FsticuffsRecognizer:Recognized 1 intent(s)
rhasspy_1        | 2019-09-16T15:16:38.345574202Z DEBUG:FsticuffsRecognizer:[{'text': 'zet de woonkamerlamp aan', 'intent': {'name': 'ChangeLightState', 'confidence': 1.0}, 'entities': [{'entity': 'name', 'value': 'woonkamerlamp', 'raw_value': 'woonkamerlamp', 'start': 7, 'end': 20}, {'entity': 'state', 'value': 'aan', 'raw_value': 'aan', 'start': 21, 'end': 24}], 'raw_text': 'zet de woonkamerlamp aan', 'tokens': ['zet', 'de', 'woonkamerlamp', 'aan'], 'raw_tokens': ['zet', 'de', 'woonkamerlamp', 'aan'], 'slots': {'name': 'woonkamerlamp', 'state': 'aan'}, 'intents': []}]
rhasspy_1        | 2019-09-16T15:16:38.346448515Z DEBUG:DialogueManager:{'text': 'zet de woonkamerlamp aan', 'intent': {'name': 'ChangeLightState', 'confidence': 1.0}, 'entities': [{'entity': 'name', 'value': 'woonkamerlamp', 'raw_value': 'woonkamerlamp', 'start': 7, 'end': 20}, {'entity': 'state', 'value': 'aan', 'raw_value': 'aan', 'start': 21, 'end': 24}], 'raw_text': 'zet de woonkamerlamp aan', 'tokens': ['zet', 'de', 'woonkamerlamp', 'aan'], 'raw_tokens': ['zet', 'de', 'woonkamerlamp', 'aan'], 'slots': {'name': 'woonkamerlamp', 'state': 'aan'}, 'intents': [], 'speech_confidence': 0.07572104623112881}
rhasspy_1        | 2019-09-16T15:16:38.347271623Z DEBUG:DialogueManager:recognizing -> handling
rhasspy_1        | 2019-09-16T15:16:38.347838014Z DEBUG:WebSocketObserver:{"text": "zet de woonkamerlamp aan", "intent": {"name": "ChangeLightState", "confidence": 1.0}, "entities": [{"entity": "name", "value": "woonkamerlamp", "raw_value": "woonkamerlamp", "start": 7, "end": 20}, {"entity": "state", "value": "aan", "raw_value": "aan", "start": 21, "end": 24}], "raw_text": "zet de woonkamerlamp aan", "tokens": ["zet", "de", "woonkamerlamp", "aan"], "raw_tokens": ["zet", "de", "woonkamerlamp", "aan"], "slots": {"name": "woonkamerlamp", "state": "aan"}, "intents": [], "speech_confidence": 0.07572104623112881}
rhasspy_1        | 2019-09-16T15:16:38.349615994Z DEBUG:DialogueManager:handling -> ready
rhasspy_1        | 2019-09-16T15:16:38.351185991Z INFO:DialogueManager:Automatically listening for wake word
rhasspy_1        | 2019-09-16T15:16:38.351355591Z DEBUG:DialogueManager:ready -> asleep
rhasspy_1        | 2019-09-16T15:16:38.352276887Z DEBUG:SnowboyWakeListener:loaded -> listening
rhasspy_1        | 2019-09-16T15:16:38.353071202Z DEBUG:ARecordAudioRecorder:started -> recording
rhasspy_1        | 2019-09-16T15:16:38.353643017Z DEBUG:ARecordAudioRecorder:['arecord', '-q', '-r', '16000', '-f', 'S16_LE', '-c', '1', '-t', 'raw', '-D', 'default:CARD=CameraB409241']
rhasspy_1        | 2019-09-16T15:16:38.361118434Z DEBUG:ARecordAudioRecorder:Recording from microphone (arecord)
rhasspy_1        | 2019-09-16T15:16:38.381045703Z DEBUG:EspeakSentenceSpeaker:['espeak', '-v', 'nl', '--stdout', 'Turning  the .']
rhasspy_1        | 2019-09-16T15:16:38.398969963Z DEBUG:EspeakSentenceSpeaker:ready -> speaking
rhasspy_1        | 2019-09-16T15:16:38.906389136Z DEBUG:APlayAudioPlayer:['aplay', '-q']
rhasspy_1        | 2019-09-16T15:16:39.892537268Z DEBUG:EspeakSentenceSpeaker:speaking -> ready

Strange error, but I’m glad it’s working! Out of curiosity, what microphone are you using?
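
One possible reading of the traceback (an assumption, not confirmed anywhere in this thread) is a race: the reader thread looks at `self.record_proc` just after another thread has reset it to `None` while restarting arecord. A common way to guard against that pattern is to take a local reference under a lock. The sketch below uses made-up classes (`FakeProc`, `Recorder`) and is not the code in `audio_recorder.py`.

```python
import io
import threading


class FakeProc:
    """Stand-in for a subprocess.Popen object; only .stdout is modelled."""

    def __init__(self, data: bytes):
        self.stdout = io.BytesIO(data)


class Recorder:
    def __init__(self, chunk_size: int = 4):
        self.chunk_size = chunk_size
        self.record_proc = None
        self.lock = threading.Lock()

    def start(self, proc):
        with self.lock:
            self.record_proc = proc

    def stop(self):
        with self.lock:
            self.record_proc = None

    def read_chunk(self):
        # Copy the reference once, so a concurrent stop() cannot turn it
        # into None between the check and the read.
        with self.lock:
            proc = self.record_proc
        if proc is None:
            return None  # recorder was stopped; nothing to read
        return proc.stdout.read(self.chunk_size)


recorder = Recorder()
recorder.start(FakeProc(b"abcdefgh"))
print(recorder.read_chunk())  # b'abcd'
recorder.stop()
print(recorder.read_chunk())  # None
```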

Hi,

I was just about to ask a question about microphone settings but you beat me to it. :grinning:
I’m using the ps3 eye CameraB409241. I’m trying to come up with a decent asound.conf so that I can boost the mic a bit. I have the feeling the recording volume is a bit low. What I noticed however is that rhasspy at this moment seems to force 1 channel for recording. Is this correct? Or am I missing a setting somewhere, perhaps I can change this in the profile.json? I think the mic performs better with 4 channels, but still have some more testing to do.

Anyway, that’s my ramblings.

Edit:
Ah yes, I see now that in arecord_cmd 1 channel is indeed hardcoded.

I have the same feeling using PS3 Eye. I saw somewhere that there is a gain = 1.0. It would be nice if this field could come to the frontend, so we can experiment with this.
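
For experimenting, this is roughly what a software gain does to raw 16-bit little-endian samples (the `S16_LE` format arecord is started with in the logs). It is only a sketch: `apply_gain` is a hypothetical helper, and whether the `gain = 1.0` setting mentioned above works this way is an assumption.

```python
import struct


def apply_gain(raw: bytes, gain: float) -> bytes:
    """Multiply signed 16-bit little-endian samples by `gain`,
    clipping to the valid 16-bit range to avoid wrap-around."""
    count = len(raw) // 2
    samples = struct.unpack(f"<{count}h", raw[:count * 2])
    boosted = [max(-32768, min(32767, int(round(s * gain)))) for s in samples]
    return struct.pack(f"<{count}h", *boosted)


quiet = struct.pack("<3h", 1000, -1000, 30000)
print(struct.unpack("<3h", apply_gain(quiet, 2.0)))  # (2000, -2000, 32767)
```

Note that a software gain also amplifies noise and clips loud samples, so a hardware or mixer-level gain is usually preferable when it is available.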

What would be an effective and cheap way to implement some satellites?
Could ESP32 be used for this?
My system is growing and I would like to go one step further.
Have some of you done this?

Just a small update.

I tried the ps3 eye with 4 channels by changing this in audio_recorder.py
(I made a reference in dockercompose.yml to it: /opt/docker/rhasspy/audio_recorder.py:/usr/share/rhasspy/rhasspy/audio_recorder.py)

Unfortunately this stops the mic from working at all. I find this a bit strange because it seems to work well outside the docker on the host system.

Anyhow, I will post updates if I have any.

All of the libraries Rhasspy depends on (porcupine, webrtcvad, pocketsphinx, Kaldi) expect a very specific audio format (16 kHz sample rate, 16-bit samples, mono). For efficiency, I hard-coded that audio format into the recording commands. I had always assumed that the microphone was recording from all channels, but the data was being converted (averaged?) before it reached Rhasspy.
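
As an illustration of the “averaged?” guess, here is a minimal sketch of downmixing interleaved multi-channel 16-bit audio to mono by averaging each frame. This shows the idea only. How ALSA actually converts the PS3 Eye's four channels is not something this thread establishes, and `downmix_to_mono` is a made-up helper.

```python
import struct


def downmix_to_mono(raw: bytes, channels: int) -> bytes:
    """Average each frame of interleaved signed 16-bit little-endian
    samples into a single mono sample (integer floor division)."""
    frame_count = len(raw) // (2 * channels)
    total = frame_count * channels
    samples = struct.unpack(f"<{total}h", raw[:total * 2])
    mono = [
        sum(samples[i * channels:(i + 1) * channels]) // channels
        for i in range(frame_count)
    ]
    return struct.pack(f"<{frame_count}h", *mono)


# Two 4-channel frames.
four_channel = struct.pack("<8h", 100, 200, 300, 400, -100, -200, -300, -400)
print(struct.unpack("<2h", downmix_to_mono(four_channel, 4)))  # (250, -250)
```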

I don’t think PyAudio or the arecord command have the ability to adjust the gain or volume of the microphone. I believe this is usually done through the alsamixer command or via the sound controls of the GUI.

Do you mean a microphone-enabled satellite? I haven’t done this, though I’ve considered buying a Pi Zero W to see if it would work for this purpose. With a power supply, SD card, and extra microphone, though, it would still come to $30-$40 apiece.

If you’re talking about something that plays audio, I think @Romkabouter has something for that.

Yes, a microphone-enabled satellite would be nice. Playing audio would be even nicer, and that can probably still be done on the cheap side using a 5102 DAC with a Pi Zero W. I use a Pi Zero W + 5102 DAC as an LMS (Logitech Media Server) player, so I’m ready to test it with rhasspy… :stuck_out_tongue_winking_eye:

Matrix Voice from @Romkabouter is very nice but it is more expensive, especially if you want to spread many units around your house.

I completely understand. In the meantime I did some testing with the ps3 eye and I think I cannot hear the difference between 1 and 4 channels, so your settings seem completely fine to me.

Correct, alsamixer unfortunately is unable to access the mic. The issue with low recording volume remains, but this has nothing to do with rhasspy. The problem is described exactly as I experience it in this link:

https://stackoverflow.com/questions/26661497/alsa-cannot-read-control-invalid-argument-raspberry-pi

So for now I am completely unable to boost the mic with alsa, but I will keep trying because the mic quality is really good apart from the low volume.

Edit:
Another small update, just in case anyone is interested in ps3 eye together with alsa

amixer -c 4 contents

(-c 4 is ps3eye) gives me

numid=3,iface=CARD,name='Keep Interface'
  ; type=BOOLEAN,access=rw------,values=1
  : values=off
numid=4,iface=MIXER,name='Mic Gain'
  ; type=INTEGER,access=rw---RW-,values=2,min=0,max=255,step=0
  : values=170,170
  | dBscale-min=-10.00dB,step=0.05dB,mute=0
numid=2,iface=MIXER,name='Mic Capture Volume'
  ; type=INTEGER,access=rw---R--,values=4,min=0,max=1,step=0
amixer: Control hw:4 element read error: Invalid argument

numid=1,iface=PCM,name='Capture Channel Map'
  ; type=INTEGER,access=r----R--,values=4,min=0,max=36,step=0
  : values=0,0,0,0
  | container
    | chmap-fixed=FL,FR,FC,LFE

Here you can see that indeed alsa cannot read the ‘Mic Capture Volume’. Perhaps there is still a possibility with the ‘Mic Gain’, but I gotta go to work first.

Edit2:
Back from work.
I get a definite recording volume improvement when using the following on the host system:

amixer -c 4 cset iface=MIXER,name='Mic Gain',index=0 255,255

Now I have to find out how to automatically replicate this inside the Docker container.
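
One way to automate it would be a small startup script that issues the same amixer call. The sketch below only builds and runs that command. Card number 4 and the `Mic Gain` control are specific to the setup shown above, the helper names are invented, and it assumes `amixer` is installed and the sound device is reachable wherever the script runs (for example, inside the container).

```python
import subprocess
from typing import List, Sequence


def build_cset_command(card: int, control_name: str, values: Sequence[int]) -> List[str]:
    """Build the argv for `amixer -c <card> cset iface=MIXER,name=...,index=0 v1,v2`.
    No shell is involved, so the control name needs no extra quoting."""
    element = f"iface=MIXER,name={control_name},index=0"
    return ["amixer", "-c", str(card), "cset", element, ",".join(str(v) for v in values)]


def set_mic_gain(card: int = 4, level: int = 255) -> None:
    """Run amixer; raises CalledProcessError if the control cannot be set."""
    subprocess.run(build_cset_command(card, "Mic Gain", (level, level)), check=True)


print(build_cset_command(4, "Mic Gain", (255, 255)))
```

Calling `set_mic_gain()` at container start would repeat the manual command, provided the control is still exposed by the driver.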

Edit3:
I have now somehow lost the ability to set the mic gain, it’s just not there anymore. I’m about to give up on this microphone. It’s a shame cause quality is good, but volume is just too low.

Anybody have some good microphone experience?

I wonder if using pulseaudio instead of ALSA might make things better. It’s a bit more cumbersome to use pulseaudio through Docker, but I’ve made it work in the past.

If you’re interested, I can compile a different version of the Rhasspy docker image.

I’m most certainly interested! But perhaps I should first perform some tests with pulse and the ps3eye on my host system to see if pulse can access the mixer. Otherwise it might be for nothing.

I hope to try this later today.

Thanks!

Edit:

All right: on the host system with pulseaudio and an alsa backend, I can record and also boost the recording volume with pulseaudio. I have no idea how that works under the hood, because the backend is still alsa. But what would it mean for rhasspy to use pulseaudio? Wouldn’t you have to change the way it records audio?
Isn’t that a little too much hassle for just one stubborn kind of microphone?

I have just one issue: if you select a wake word, the path is incorrect.
Porcupine is correct, but Snowboy should be snowboy/snowboy.umdl, not just snowboy.umdl.
The same goes for Mycroft, which should be precise/hey-mycroft-2.pb

The Rhasspy on my Pi 3 just stopped, and I can’t get it running again by clicking START or REBUILD, or even by rebooting the Pi. How can I get it working again?