Voice Assist Satellite with ESP32-C6

I really like voice control and have set up two of the Waveshare ESP32-S3 Speaker development boards as Voice Assist satellites. I wanted to see if I could build a device for less money.

I got:
3-pack of Xiao ESP32-C6 boards for $23
5-pack of MAX98357 I2S audio amp modules for $9.99
4-pack of 4 ohm, 3 watt mini speakers for $9.99

Ignoring the fact that this can build 3 complete devices with parts left over, it's $12.17 for a functional voice satellite running micro wake word!
My AI advisors were saying the -C6 might not be capable and that the -S3 would be better, but without too much trouble I got it all working, with response times that seem the same as with the Waveshare devices!
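
For anyone who wants to check the per-device figure or plug in their own prices, here is a rough sketch of the arithmetic. It assumes one of each listed part per device and rounds each unit price to the nearest cent before adding. The names `PACKS`, `unit_cost` and `per_device_cost` are made up for illustration, and it only covers the three parts listed above (no microphone, wire or enclosure).

```python
from decimal import Decimal, ROUND_HALF_UP

# (pack price in USD, units per pack), taken from the parts list above
PACKS = {
    "Xiao ESP32-C6": (Decimal("23.00"), 3),
    "MAX98357 I2S amp": (Decimal("9.99"), 5),
    "4 ohm 3 W speaker": (Decimal("9.99"), 4),
}


def unit_cost(price, count):
    """Price of a single unit from a multi-pack, rounded to the nearest cent."""
    return (price / count).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def per_device_cost(packs):
    """Sum of the per-unit costs, assuming one of each part per device."""
    return sum((unit_cost(p, n) for p, n in packs.values()), Decimal("0"))


if __name__ == "__main__":
    for name, (price, count) in PACKS.items():
        print(f"{name}: ${unit_cost(price, count)}")
    print(f"Per device: ${per_device_cost(PACKS)}")
```

With the prices above this comes to $7.67 + $2.00 + $2.50 = $12.17 per device.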

I’m definitely planning to make this permanent and portable, and to build several!

I put the YAML and hardware connections on GitHub here


PSRAM is likely the reason, but a voice assistant will still work without it. Sendspin currently needs PSRAM, I believe.

The most important thing for a voice assistant, in my opinion, is having the DAC and ADC on separate I2S buses. I have a Waveshare device where the mic (ADC) and speaker (DAC) share one I2S bus, and you basically can’t give a voice command during music playback. That doesn’t sound like a bad deal until a timer rings on the voice device and it cannot listen for your "turn off" command.

But a $13 voice assistant with good sound is great.

I had a -C6 but no -S3 which is why I decided to try.

There isn’t much price difference between the -C6 and -S3, and I do agree the -S3 should be better; the -C6 sharing the I2S bus isn’t ideal. I will be getting some -S3’s and will probably try a similar speaker/mic/amp setup with them.

This is DIY, so you should be able to wire them as separate I2S buses if you haven’t already.

What is the data rate out of the ESP32? The C6 has a few things over the S3: (1) it uses half the power and (2) it can be a Thread device and not use WiFi.

But Thread does not have WiFi’s bandwidth, hence the question.

Update - I have 2 ESP32C6's as voice satellites and both are working without issue with speaker/mic/LED as an assist satellite (without a media player). No memory or performance issues, and very low power draw (I'm probably going to add a rechargeable battery to one and make it portable.)

Today, I wired up and created YAML for an ESP32-C61-WROOM1-DevKit C1 from Espressif and now have a working Voice Satellite with media player using the -C61. I'll be putting the YAML and more info on GitHub. This C61 has 2 MB of PSRAM, WiFi 6 on 2.4 GHz, and draws significantly less power than the S3. One issue I haven't solved: there does not seem to be any support in ESPHome for the onboard LED of the C61 DevKit. The C61 has an RGB LED driven by GPIO8, but ESPHome does not have RMT hardware support for the C61(?), so the WS2812 on GPIO8 can't be driven by the ESPHome RMT LED strip platform(?). ESPHome also doesn't support neopixelbus here, so I added an external LED that lights when the wake word is detected.

Several weeks ago I made a voice satellite with an ESP32-S3-WROOM1. The S3 uses more than 10x the power of the Xiao C6, and initial indications are that the C61's power draw is also way lower than the S3's. I think the C61 may even respond faster than the S3.