Hi All,
Here is our approach to the subject.
The hardware is based on an XMOS far-field voice processor and takes the form of a HAT for Radxa/RPi/OrangePi.
At the moment we are mainly focusing on the Radxa Zero 3E because it has built-in Ethernet and camera support.
Since the board is meant to work not only as a far-field microphone but also as an advanced alarm sensor, we gave up the ESP32 in favor of more powerful Linux-based processors.
One of the goals is to support high-resolution cameras for motion/face detection.
Main product features:
Based on an XMOS far-field voice processor that supports AEC and beam-forming
Best-in-class array of 4 PDM microphones
6 W high-end DAC that connects directly to a speaker.
3 power supply options:
built-in wide-range DC/DC converter that supplies the voice HAT as well as the Radxa/RPi/OrangePi
USB-C (due to current limitations, not all options may be available). The HAT can also work as a standalone USB device without a Radxa/RPi/OrangePi underneath; Windows/Linux then detects it as a far-field microphone
PoE option; at the moment it supports popular high-power PoE modules from AliExpress of the RT5400 type
Built-in speaker so it can work as a Media Player Device
Socket for mounting the popular HLK-LD2450 microwave sensor.
Security and stability are the priority, so we gave up WiFi in favor of a wired LAN connection. It still works over WiFi, though, after adding a WiFi card to the Radxa Zero 3E or when using the built-in WiFi of an RPi/Orange Pi Zero.
Camera cutout for full RPi.
XMOS programming connector.
Support for popular standards such as local openwakeword and wyoming satellite, or anything else that can be launched on Radxa/RPi/OrangePi. We use I2S, and Linux recognizes it as an ALSA playback/record device
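Since the HAT shows up as an ordinary ALSA card, a small helper can look it up and build the capture command that a tool like wyoming satellite would be given. This is only a rough sketch: the card id "xvfhat" and the sample listing are made up for illustration (the real name depends on the device tree overlay), and 16 kHz mono S16_LE is assumed as the capture format.

```python
import re
import shlex

# Sample text laid out like /proc/asound/cards.
# The card id "xvfhat" is hypothetical, not the real name of this HAT.
SAMPLE_CARDS = """\
 0 [Headphones     ]: bcm2835_headpho - bcm2835 Headphones
                      bcm2835 Headphones
 1 [xvfhat         ]: simple-card - xvf-hat-soundcard
                      xvf-hat-soundcard
"""

# Matches the first line of each card entry: index, [id], driver - name
CARD_LINE = re.compile(r"^\s*(\d+)\s+\[([^\]\s]+)\s*\]:")


def list_cards(text):
    """Return a dict mapping ALSA card id to card index."""
    cards = {}
    for line in text.splitlines():
        match = CARD_LINE.match(line)
        if match:
            cards[match.group(2)] = int(match.group(1))
    return cards


def arecord_command(card_id, rate=16000, channels=1):
    """Build an arecord command line that streams raw PCM from the given card."""
    args = [
        "arecord",
        "-D", f"plughw:CARD={card_id},DEV=0",  # plughw lets ALSA convert rate/format
        "-r", str(rate),
        "-c", str(channels),
        "-f", "S16_LE",
        "-t", "raw",
    ]
    return shlex.join(args)


if __name__ == "__main__":
    cards = list_cards(SAMPLE_CARDS)
    print(cards)
    print(arecord_command("xvfhat"))
```

On a real system you would read /proc/asound/cards instead of the sample string and pass the resulting command to the satellite's microphone option.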
Which XMOS chip exactly are you using?
Do you plan to use the code from here or do you want to use the XVF3800 firmware binaries? In the latter case I would assume that you would have to use this variant.
Just curious: is it the XVF3800 production firmware source code, which one can only access after providing some proof of purchase of matching silicon or a dev kit? See: VocalFusion Software Request Form | XMOS
Yes, but to keep some compatibility we do not, for now, use an MCLK generated on the host. Switching between the three popular host platforms is done by soldering bridge resistors.
And yes, we plan to sell it. We have a few prototypes that we have tested over the last few weeks, and so far we are happy with the results.
As for I2S, the XMOS is a slave to the SBC. MCLK is generated by the XMOS. We tried making it the master, but that didn't work properly with ALSA: we only got two separate record and play devices, and those in turn did not work correctly with wyoming satellite and openwakeword.
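For anyone following the clock discussion: with the SBC as I2S master, the SBC drives BCLK and LRCLK, while MCLK is a separate, faster reference clock. The arithmetic can be sketched as below. The numbers are assumptions for illustration only (2 channels, 32-bit slots, MCLK at 256 times the sample rate), not the confirmed configuration of this board.

```python
def i2s_clocks(sample_rate_hz, bits_per_slot=32, channels=2, mclk_ratio=256):
    """Return the nominal I2S clock frequencies in Hz.

    bits_per_slot, channels and mclk_ratio are illustrative assumptions;
    the real values depend on the firmware and the host's I2S driver.
    """
    lrclk_hz = sample_rate_hz                            # one frame per sample
    bclk_hz = sample_rate_hz * bits_per_slot * channels  # one clock per bit
    mclk_hz = sample_rate_hz * mclk_ratio                # reference clock
    return {"lrclk_hz": lrclk_hz, "bclk_hz": bclk_hz, "mclk_hz": mclk_hz}


if __name__ == "__main__":
    for rate in (16000, 48000):
        print(rate, i2s_clocks(rate))
```

With these assumed values, 48 kHz audio needs a 3.072 MHz BCLK and a 12.288 MHz MCLK.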
On I2C the XMOS is also a slave to the SBC; the bus can be used to control the XMOS as well as for FW updates.
USB-C is connected directly to the XMOS, without a multiplexer to the SBC. So the HAT itself, without an SBC, can act as a standalone far-field mic and speaker for any other device, such as a Windows/Linux PC.
Currently I use the XK-VOICE-L71 attached to a Raspberry Pi 3A+ that I still had lying around. I am running raspios-bookworm.
Initially, I had some plans to build my own PCBs with KiCad 8, probably with separate mic-array and XMOS PCBs, possibly manufactured by JLCPCB.
But since a few boards from other people have already been spotted in the wild (incl. yours), I will wait a bit and see if somebody comes up with such board(s)/solutions.
The long-term plan would be to build some simple replacement PCB(s) which could replace the existing PCB(s) in commercial active/smart speakers.
Maybe it is time to design a general-purpose XMOS breakout PCB with a separate mic-array board so that it could be fitted onto other PCBs easily.
@spchouse
Any reason why you have chosen XVF3800 over XVF3610?
XMOS themselves recommend the XVF3800 with 4 mics and beam-forming for products such as conference speakers.
The XVF3610 would be sufficient for voice assistants.
I still wonder, though, how the XVF3610 firmware differs from the “open-source” XCORE-VOICE (sln_voice repo) with respect to the performance of the ASR stream (cleaned of echoes and the reference audio).
Mainly because of beam-forming and the far-field solution. If the board is mounted somewhere in the corner of a room or in the ceiling as a security sensor, that is a must-have.