What are the best audio parameters for voice assist?

I got a screeching noise when testing the reSpeaker with a RasPi 4. When I first tested audio in and out with the Adafruit Voice Bonnet (a reSpeaker 2-mic clone), it recorded and played fine … but after installing MPD, wyoming-satellite and wyoming-wakeword, the recorded audio was so full of noise that I could barely make out a voice behind it.

Eventually I found that I had mixed up two arecord commands: the one I used for testing (saving to a test.wav file) and the one for wyoming-satellite (which assumes (requires?) “raw” format). It seems that raw format doesn’t sound nice when aplay-ed as a .wav file :wink:
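For anyone hitting the same thing: raw output has no header, so a player has to be told the rate, channel count and sample format, or it guesses. A minimal sketch of wrapping a raw capture in a WAV container with Python's standard `wave` module follows. The function name `raw_to_wav` and the file names are hypothetical, and the defaults simply mirror the `--mic-command` parameters (16000 Hz, 1 channel, S16_LE); it assumes the raw file really was recorded with those settings.

```python
import wave


def raw_to_wav(raw_path, wav_path, rate=16000, channels=1, sample_width=2):
    """Wrap headerless PCM in a WAV container and return the frame count.

    Defaults are an assumption: 16 kHz, mono, 16-bit samples (S16_LE),
    matching the arecord parameters quoted in this post.
    """
    with open(raw_path, "rb") as src:
        pcm = src.read()

    with wave.open(wav_path, "wb") as dst:
        dst.setnchannels(channels)      # -c 1
        dst.setsampwidth(sample_width)  # S16_LE is 2 bytes per sample
        dst.setframerate(rate)          # -r 16000
        dst.writeframes(pcm)            # WAV stores 16-bit PCM little-endian

    return len(pcm) // (channels * sample_width)
```

Calling `raw_to_wav("test.raw", "test.wav")` on a capture made with the mic command should then give a file that plays normally with a plain `aplay test.wav`.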

But it leads me to wonder where these parameters come from, whether they are still optimal, and whether they could be tweaked to improve voice recognition.

--mic-command 'arecord -D plughw:CARD=seeed2micvoicec,DEV=0 -r 16000 -c 1 -f S16_LE -t raw' \
--snd-command 'aplay   -D plughw:CARD=seeed2micvoicec,DEV=0 -r 22050 -c 1 -f S16_LE -t raw' \

I have a number of questions, probably best for @synesthesiam:

  • Why record at 16 kHz but play back at 22050 Hz? In the Rhasspy documentation this was “recommended”, but I suspect the other voice assist modules now require this format?
  • 44.1 kHz is CD quality, nearly 3 times the 16 kHz recording rate, but would that give a better result, or use too much processing time? Would a higher rate (say 32 kHz) give better audio quality, and hence better voice recognition?
  • I assume “-c 1” means only one of the reSpeaker’s 2 mics is used. Is there a way to merge both channels from a reSpeaker 2-Mic board, and would that be better?
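To make the last question concrete, here is a rough sketch of what "merging" could mean in its simplest form: recording with `-c 2` and averaging the left and right samples into one mono stream. The function name `downmix_stereo_to_mono` is hypothetical, and this is plain averaging of interleaved S16_LE samples, not beamforming or noise suppression, so whether it would actually help recognition is an open question.

```python
import array
import sys


def downmix_stereo_to_mono(stereo_bytes):
    """Average interleaved 16-bit little-endian L/R samples into mono bytes."""
    # Drop any trailing partial frame (one stereo frame is 4 bytes).
    usable = len(stereo_bytes) - (len(stereo_bytes) % 4)

    samples = array.array("h")
    samples.frombytes(stereo_bytes[:usable])
    if sys.byteorder == "big":
        samples.byteswap()  # input is little-endian (S16_LE)

    # Samples are interleaved as L, R, L, R, ...; average each pair.
    mono = array.array(
        "h",
        ((samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples), 2)),
    )

    if sys.byteorder == "big":
        mono.byteswap()  # return little-endian bytes again
    return mono.tobytes()
```

Averaging halves the data rate back to what a `-c 1` stream would have, so the result could be fed to anything that expects 16 kHz mono S16_LE.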