ReSpeaker XMOS XVF3800 ESPHome integration

That looks like a cache issue to me. binary_sensor is a crucial element, it cannot be missing.

Solved by declaring an empty binary_sensor section:


binary_sensor:


external_components:
  - source:
      type: git
      url: https://github.com/formatBCE/Respeaker-XVF3800-ESPHome-integration
      ref: main
    components:
      - respeaker_xvf3800
    refresh: 0s

respeaker_xvf3800:
  id: respeaker
  address: 0x2C

Huh, okay. Don’t know why it works for me without that…

Works great! Will this config support the new Sendspin protocol in the near future?

Edit: I wrote my own config for Sendspin: substitutions: # Phases of the Voice Assistant # The voice assistant is re - Pastebin.com

Implementing Sendspin support as we speak. I already have it for Respeaker Lite and Koala, will post updated config shortly.

Done. Sendspin support in repo.

Hi, I'm considering buying the ReSpeaker XMOS XVF3800. The goal is to get rid of my Echo devices and use local voice, maybe with the help of an LLM in the future.

Currently I'm testing all of this with a Voice PE and a Satellite1. Both work fine but are "mediocre" when it comes to wake word detection, especially in the living room (about 20 m²). Is it worth trying out the ReSpeaker at the moment, or should I wait a few months for another device (maybe Nabu Casa comes up with an upgraded device?!) Thanks a lot in advance for your recommendation!

BTW, if I buy one I think I will design a case with a small speaker and 3D-print it. Just for clarification: the microphones are on the side of the board, so the case should have holes on the sides, right?

Mics are on the top, they're facing "up" when it's lying flat.
Functionality-wise, it's definitely much better at wake word detection than Respeaker Lite (I don't have a PE) and SAT1 1st gen. However, it requires very low ambient noise to stop listening for a command: basically, if something is on in the background, it just keeps listening. Hopefully it will be fixed with firmware updates, but it's up to Seeed, since it's closed source…

Thank you very much for your assessment. I think I'll wait a little bit longer to see if something better comes on the market or Seeed fixes this.

Yeah, it has really good potential, but it's not fully there yet.

There is a case variant now: Seeed reSpeaker: XMOS XVF3800 4-Mic AI Voice With XIAO ESP32S3 (Case Version)

Did you mess with the new firmware any?

Nope, no new releases…
I’ll ping them

Despite my last post, I ordered one just out of curiosity -.-
Can I use this manual to install your integration, or is it still outdated like you said in this thread? Smart Home Voice Control with Home Assistant | Seeed Studio Wiki

If it's not usable, can you give me a hint where to find info on installing it? Thanks in advance.

Yeah, you can use that. It's a basic installation of the example config.

Hi,

Did you have to do any parameter tuning after installing in your enclosure?

According to Seeed:

By default, Seeed has already fine-tuned the following parameters:

AUDIO_MGR_REF_GAIN: 8.0
AUDIO_MGR_MIC_GAIN: 90
AUDIO_MGR_SYS_DELAY: 12
PP_FMIN_SPEINDEX: 1300.0
PP_AGCMAXGAIN: 64.0
PP_AGCGAIN: 2.0
AEC_ASROUTGAIN: 1.0

Looking for some starting points on how to tune the parameters…

I don't have an enclosure for it yet. Also, I haven't tuned anything past Seeed's initial tuning, and I don't know if it's even exposed.

Hi, do you know more details about the wake word on the XVF3800?

It's said that the second audio channel of the firmware respeaker_xvf3800_i2s_master_dfu_firmware_v1.0.x_48k.bin contains the wake word, but I cannot find more details about it.
Link:

I was thinking that if I could somehow customize the wake word or just get the wake event in the firmware, I wouldn’t have to do additional wake word engineering on the esp32s3 or host machine.

Uhh, you've misunderstood it.
The second channel carries audio SUITABLE for wake word recognition, not wake word events. There's no wake word recognition on the device itself. You'd expect at least some way to change the wake word in that case, right?
So channel 2 just prepares the audio stream for better wake word recognition (e.g. applying AEC but no NS, and some gain).
You can see in the ESP software that one channel is used for ASR (STT), and the other for microWakeWord.
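
To illustrate the channel split described in this reply, here is a minimal ESPHome sketch, not the repo's actual config. It assumes a stereo I2S microphone with the hypothetical id `i2s_mics` is defined elsewhere, and that channel 0 carries the ASR stream and channel 1 the wake word stream. The channel indices and the wake word model are assumptions, so check them against the example config in the repo.

```yaml
# Sketch only: route each XVF3800 output channel to a different consumer.
# Assumes a stereo i2s_audio microphone with id "i2s_mics" (hypothetical id)
# is configured elsewhere in the file.

micro_wake_word:
  id: mww
  microphone:
    microphone: i2s_mics
    channels: 1          # assumed: second channel, prepared for wake word detection
  models:
    - model: okay_nabu   # example model, swap for whichever wake word you use

voice_assistant:
  id: va
  micro_wake_word: mww
  microphone:
    microphone: i2s_mics
    channels: 0          # assumed: first channel, processed for ASR (STT)
```

Wake word detection still runs on the ESP32-S3 via microWakeWord. The XVF3800 only supplies the two differently processed audio streams.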

Ohh! Thank you so much for the correction. That makes perfect sense now.