Fix: ThirdReality Voice/Music Assistant Dev Edition microphone too quiet for reliable speech recognition
TL;DR
The device ships with microphone capture so quiet that Home Assistant's STT (whether Whisper, Sherpa/Parakeet, or any other model) cannot reliably transcribe normal speech, even from desk-scale distances. The firmware exposes two different "mic gain" controls — mic_gain in /data/conf/sound.json and ALSA PDM Gain via amixer — and both are non-functional. The only working knob is PulseAudio source volume, which must be pushed to ~500% (well past unity) to get usable signal at room-scale distances. The fix requires the Dev Edition with debug board for serial console access. Without that hardware, this is not user-fixable.
Caveat: this writeup is based on a single Dev Edition unit. I'm working from the assumption that what I observed is normal behavior of the current firmware and hardware combination, not a defect specific to my sample. If you have the same model and see materially better mic performance out of the box, please respond — I'd genuinely like to know whether I received a bad unit or this is the baseline experience.
Symptoms
You have this problem if all of the following are true:
- The device shows up in HA via the ESPHome integration and accepts voice queries
- Wake word detection is unreliable at the default gain: it frequently misses phrases that should trigger it, even at close range and normal speaking volume (it becomes reliable once the mic gain is corrected, see below)
- Voice queries either fail outright with stt-no-text-recognized (visible in Settings → Voice Assistants → Debug), or succeed intermittently: same phrase, same conditions, sometimes works, sometimes doesn't
- Short phrases like "turn on the lights" sometimes get through; longer phrases like "what's the temperature outside" rarely do
- The same query spoken into the Home Assistant mobile app's voice button transcribes perfectly through the same pipeline
The pattern is "the model works fine when given decent audio; this device doesn't provide decent audio."
Prerequisites — debug board required
You cannot apply this fix without the Dev Edition variant that includes the debug board.
The Dev Edition ships in two SKUs:
- With Debug Board — exposes a USB-to-serial console port for shell access
- Without Debug Board — Type-C cable for firmware flashing only, no shell
The fix below requires editing files on the device and writing an init script, which requires shell access, which requires the serial console, which requires the debug board. If you have the no-debug-board version, you have no path to apply this workaround. Your options are:
- Wait for ThirdReality to ship a firmware fix
- Return the unit
- Use it only at close conversational range and accept marginal reliability
The remainder of this writeup assumes you have the debug board and basic serial console comfort.
How to prove the issue definitively
Three tests, in order. Each rules out a category of possible cause.
Test 1: Phone vs. speaker comparison (30 seconds)
Open the HA mobile app, tap the microphone icon, say a known-failing phrase. If it transcribes correctly, the entire pipeline downstream of the speaker is innocent — the speaker is the problem. This eliminates STT engine, model, language config, conversation agent, and HA-side issues in one test.
Test 2: Capture and listen to the actual audio
In /config/configuration.yaml, add:
assist_pipeline:
  debug_recording_dir: /share/assist_pipeline
Pre-create the directory (HA does not always do this for you):
mkdir -p /share/assist_pipeline
Restart HA. Trigger a few wake-word voice queries. Pull the resulting stt.wav files from /share/assist_pipeline/<timestamp>/ to your computer and listen.
You will hear: speech that's barely audible at normal listening volume. A human can make out the words. STT models, trained on normal-volume speech, often can't.
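If you would rather put a number on "barely audible" than trust your ears, the peak level of a recording can be read straight from the WAV data. The sketch below is meant to run on your computer, not on the speaker. The function name wav_peak_dbfs is made up for this writeup, and it assumes the debug recordings are 16-bit PCM WAV files with a plain 44-byte header, read on a little-endian machine.

```shell
#!/bin/sh
# Print the peak sample level of a WAV file in dBFS (0.0 = full scale).
# Assumptions: 16-bit little-endian PCM, a plain 44-byte header, and a
# little-endian host (od decodes samples in host byte order).
wav_peak_dbfs() {
    od -An -v -t d2 -j 44 "$1" | awk '
        {
            for (i = 1; i <= NF; i++) {
                v = $i + 0
                if (v < 0) v = -v
                if (v > peak) peak = v
            }
        }
        END {
            if (peak == 0) { print "-inf"; exit }
            printf "%.1f\n", 20 * log(peak / 32768) / log(10)
        }'
}

# Example: wav_peak_dbfs stt.wav
```

Comparing the figure before and after a gain change gives an objective version of the listening test: a control that does nothing should leave the peak essentially unchanged for the same phrase spoken the same way.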
Test 3: Demonstrate each gain control is broken
Connect a serial terminal to the debug board (115200 8N1, USB data cable; root login with no password). Try each documented gain layer and capture a new recording after each change:
Layer 1: mic_gain in sound.json
cat /data/conf/sound.json
# Default: "mic_gain": 30
Edit to 45 or any other value, restart the voice-assistant service, capture a new wav:
sed -i 's/"mic_gain": 30/"mic_gain": 45/' /data/conf/sound.json
/etc/init.d/S99ha-speaker voice-assistant restart
Result: no audible difference. The field exists in the preferences file and accepts edits, but does nothing.
Layer 2: ALSA PDM Gain (amixer numid=7)
amixer -c 0 cget numid=7 # default 14 of max 48
amixer -c 0 cset numid=7 48 # max out
Re-record. Result: no audible difference. The control accepts values across its 0–48 range without affecting captured audio amplitude.
Layer 3: PulseAudio source volume
pactl set-source-volume alsa_input.hw_0_2 200%
Re-record. Result: audibly louder. This is the only working knob.
You've now proven: two documented mic-gain controls exist in firmware, neither does anything, and the only working control is buried in PulseAudio with no firmware-side or HA-side surfacing.
The fix
A persistent boot-time script that boosts PulseAudio source volume after the audio stack is up. The exact value depends on your room and speaking distance, but expect to land somewhere between 400% and 700%.
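It helps to know that PulseAudio percentages are not linear amplitude. As far as I understand PulseAudio's software volume mapping, amplitude scales with the cube of the percentage, so the gain in decibels is 60 * log10(percent / 100). The sketch below applies that formula. pa_percent_to_db is a hypothetical helper name, and the cubic mapping is an assumption you can cross-check against the dB figure that pactl list sources prints next to the percentage.

```shell
#!/bin/sh
# Convert a PulseAudio volume percentage to decibels.
# Assumes PulseAudio's cubic software-volume mapping:
#   amplitude = (percent / 100)^3, so dB = 60 * log10(percent / 100)
pa_percent_to_db() {
    awk -v p="$1" 'BEGIN {
        if (p <= 0) { print "-inf"; exit }
        printf "%.2f\n", 60 * log(p / 100) / log(10)
    }'
}

# Print the gain for the percentages used in this writeup.
for pct in 100 200 300 500 700; do
    printf '%s%% = %s dB\n' "$pct" "$(pa_percent_to_db "$pct")"
done
```

Under that assumption, 500% works out to roughly +42 dB over unity. That is a lot of software gain, which is why hiss and room noise rise along with speech as you push the value up.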
1. Get serial console access
Connect the debug board's UART USB port (not the OTG/flashing port) to a PC with a USB data cable. Open a serial terminal:
- Speed: 115200, 8N1, no flow control
- Tools: PuTTY, screen, minicom, Arduino IDE serial monitor
Power-cycle the speaker. You'll see Linux boot messages, then a root shell with no password.
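On a Linux PC, a USB serial adapter usually shows up as /dev/ttyUSB0 or /dev/ttyACM0, but which name the debug board gets is an assumption here, not something I can promise for every machine. The sketch below just looks for the first such device. first_serial_port is a made-up helper name, and macOS and Windows name their ports differently.

```shell
#!/bin/sh
# Print the first USB serial device found under a directory (default /dev).
# Assumes Linux device naming (ttyUSB* / ttyACM*).
first_serial_port() {
    dir="${1:-/dev}"
    for port in "$dir"/ttyUSB* "$dir"/ttyACM*; do
        if [ -e "$port" ]; then
            printf '%s\n' "$port"
            return 0
        fi
    done
    return 1
}

# Example (interactive, so left commented out):
#   screen "$(first_serial_port)" 115200
```

If more than one adapter is plugged in, unplug the others or pass the device path to your terminal program by hand.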
2. Find the right gain experimentally
While logged in, capture HA debug recordings (Test 2 above) at increasing PulseAudio source volumes:
pactl set-source-volume alsa_input.hw_0_2 300%
# trigger 3-5 voice queries, listen to the new wavs
pactl set-source-volume alsa_input.hw_0_2 500%
# repeat
pactl set-source-volume alsa_input.hw_0_2 700%
# repeat
What to listen for:
- Speech louder, background still quiet → keep going
- Speech louder, hiss/noise also louder → you're past the useful range; back off one step
- Speech crackling/clipped → too far; back off two steps
- STT hit rate becomes reliable at your typical use distance → that's your value
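The "crackling/clipped" case in the list above can also be checked numerically by counting samples that hit full scale. This is a rough sketch to run on your computer against the debug recordings. wav_clipped_count is a hypothetical name, the full-scale threshold is my own choice, and the same assumptions apply as for any quick WAV parsing: 16-bit PCM, a 44-byte header, a little-endian machine.

```shell
#!/bin/sh
# Count samples at full scale in a 16-bit PCM WAV file.
# Assumptions: 44-byte header, little-endian samples and host.
# The threshold (32767 / -32768) is an illustrative choice.
wav_clipped_count() {
    od -An -v -t d2 -j 44 "$1" | awk '
        {
            for (i = 1; i <= NF; i++)
                if ($i + 0 >= 32767 || $i + 0 <= -32768) n++
        }
        END { print n + 0 }'
}

# Example: wav_clipped_count stt.wav
```

A count that rises sharply from one gain step to the next is a hint that you have gone too far, even before the distortion becomes obvious by ear.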
For most rooms with a desk-mounted or shelf-mounted speaker, 500% is a reasonable starting point. Larger rooms or greater speaking distances may need 600–700%.
3. Persist via init script
mount -o remount,rw /
cat > /etc/init.d/S98micvolume <<'EOF'
#!/bin/sh
#
# Boost PDM microphone capture volume via PulseAudio.
#
# This is required because the other gain controls on this firmware
# are non-functional:
# - /data/conf/sound.json mic_gain (silently ignored)
# - amixer -c 0 numid=7 'PDM Gain' (no effect on output amplitude)
#
# PulseAudio source volume is the only mic gain control that works.
#
case "$1" in
    start)
        # Wait up to 20s for pulseaudio to be ready before pactl
        for i in 1 2 3 4 5 6 7 8 9 10; do
            pactl info >/dev/null 2>&1 && break
            sleep 2
        done
        pactl set-source-volume alsa_input.hw_0_2 500% >/dev/null 2>&1
        ;;
    stop|restart|reload|force-reload)
        # Nothing to undo; the volume stays as set until the next boot.
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|reload|force-reload}"
        exit 1
        ;;
esac
exit 0
EOF
chmod +x /etc/init.d/S98micvolume
S98 ordering runs the script before S99ha-speaker (the voice service). PulseAudio starts earlier in the boot sequence, and the wait loop is a safety net.
Adjust 500% to whatever value worked in step 2.
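One weakness of the script as written is that it fails silently if PulseAudio never comes up. A possible variation, sketched here rather than tested on the device, factors the wait into a small retry helper whose exit status says whether the command ever succeeded. wait_for is a name I made up, and the log message is only illustrative.

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_for TRIES DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 on the first success, 1 if every attempt failed.
wait_for() {
    tries="$1"; delay="$2"; shift 2
    while [ "$tries" -gt 0 ]; do
        "$@" >/dev/null 2>&1 && return 0
        tries=$((tries - 1))
        # Only sleep if another attempt is coming.
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}

# Inside the start) branch this could replace the for loop:
#   if wait_for 10 2 pactl info; then
#       pactl set-source-volume alsa_input.hw_0_2 500%
#   else
#       echo "S98micvolume: PulseAudio not ready, mic volume unchanged" >&2
#   fi
```

The helper itself is plain POSIX sh, so it should run under the device's shell, but treat that as an assumption until you have tried it over the serial console.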
4. Reboot and verify persistence
reboot
# wait for boot to complete, log back in via serial
pactl list sources | grep -A 8 "Source #1"
You should see a Volume: line showing your chosen percentage. If the source index is not 1 on your unit, search the output for alsa_input.hw_0_2 instead. Run a voice query through HA and confirm STT works as it did before reboot.
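Matching on the source name is a little more robust than relying on the index. The sketch below reads pactl list sources output on stdin and prints the volume percentage for one named source. source_volume_percent is a hypothetical helper, and the parsing assumes the usual layout where each source has a "Name:" line followed later by a "Volume:" line, plus an awk on the device (BusyBox normally ships one, but I have not confirmed that for this firmware).

```shell
#!/bin/sh
# Read `pactl list sources` output on stdin and print the volume
# percentage of the named source.
# Assumed layout: a "Name: <source>" line, then a "Volume: ..." line
# containing a field such as "500%". "Base Volume:" lines are skipped
# because their first field is "Base".
source_volume_percent() {
    awk -v name="$1" '
        $1 == "Name:" { current = $2 }
        $1 == "Volume:" && current == name {
            for (i = 2; i <= NF; i++)
                if ($i ~ /%$/) { print $i; exit }
        }'
}

# Example on the device:
#   pactl list sources | source_volume_percent alsa_input.hw_0_2
```

If the printed value is not the percentage set in the init script, the script most likely ran before PulseAudio was ready or did not run at all.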
5. Disable debug recording when done
The assist_pipeline.debug_recording_dir setting eats disk space rapidly. Remove it from configuration.yaml once you've tuned things:
# Remove or comment out:
# assist_pipeline:
#   debug_recording_dir: /share/assist_pipeline
Restart HA, then clean up:
rm -rf /share/assist_pipeline
Limitations after fixing
Even at 500% PulseAudio gain, this is not a high-fidelity voice satellite. Realistic expectations:
- Reliable: ~3m (10ft) with normal speaking volume in a quiet room, speaker facing you
- Marginal: same distance with background TV, dishwasher, or HVAC noise
- Unreliable: across-the-room queries in larger spaces
This device has a single PDM microphone with no hardware DSP (no acoustic echo cancellation, no beamforming, no noise suppression).
What ThirdReality should fix
In priority order:
1. Ship with a sane default mic gain
Set the PulseAudio source volume at boot to whatever value works for typical use (around 500%). This single change would resolve the issue out-of-box for every buyer, including those who can't access the serial console. The current default produces audio that STT models cannot reliably transcribe. That's not a configuration choice; it's a defect.
2. Fix or remove the two broken gain controls
Two layers of mic-gain knobs exist that do nothing:
- mic_gain field in /data/conf/sound.json
- ALSA PDM Gain via amixer numid=7
Either wire them up to actually control mic capture volume (mapping to PulseAudio source volume is the obvious implementation), or remove them entirely. Placeholder controls that look like they should solve the problem are worse than no controls at all — users waste time debugging which one to twist.
3. Expose mic gain as a Home Assistant entity
The device already syncs many settings to HA via ESPHome entities (output volume, mute state, wake word selection, thinking sound). Add a Microphone Gain number entity (0–1000% or similar range) wired to the working knob. This single feature would close the gap for all buyers — the no-debug-board SKU included — without requiring serial console access.
4. Add software audio processing
PulseAudio's module-echo-cancel with WebRTC processing provides AGC, noise suppression, and echo cancellation. Load it by default in the device's PulseAudio configuration with sensible parameters. This would significantly improve STT reliability without requiring users to push raw gain to levels that amplify noise floor along with signal.
5. Expose mic gain (and other audio settings) in BLE provisioning
First-time setup should allow tuning mic sensitivity before the device is even on Wi-Fi. Especially critical for the no-debug-board SKU, which currently has zero path to fix this short of returning the unit.
Closing
A voice satellite that requires speaking loudly at close range to be understood by any STT model is not what most buyers expect. The fact that two placeholder gain controls exist that look like they should solve the problem but do nothing is a separate quality issue.
For Dev Edition buyers with debug boards, the workaround above gets you to functional. The vendor needs to ship a firmware update that fixes the broken gain controls and exposes working mic gain to Home Assistant — at minimum.
Tested on:
- Firmware: 1.1.7
- Home Assistant Core: 2026.4.x
- HAOS: 17.x
- Hardware: ThirdReality Voice/Music Assistant Dev Edition with Debug Board
Hardware required for the fix: Dev Edition with Debug Board. The standalone Dev Edition without debug board has no path to apply this workaround.