I use it with microWakeWord (local, on-device). I’ve used it with openWakeWord too. No trouble in either case.
Does Seeed offer an XMOS firmware with 48 kHz, BTW?
It depends on the hardware. Using a single I2S bus is cheaper, I suppose. It also frees up GPIO pins. So there are at least 2 reasons to go with it.
However, separate I2S connections are easier to handle and tune. So there’s a reason to use 2 as well.
It’s just two different ways to do it, in the end…
That I don’t know. AFAIK the default YAML uses 16 kHz. I don’t think it’s intended to be used at 48 kHz…
Seeed should really provide the schematics for the respeaker-lite just to be sure.
Then we could compile the XMOS FW for 48kHz and possibly also use it for high quality music playback (if the ESP32-S3 is able to cope with this).
We just have to recompile this: sln_voice/examples/ffva at develop · xmos/sln_voice · GitHub
I already did and it works too. Tested with the UA (USB-version) variant and 48kHz.
We would have to use the INT variant. Default here is already 48kHz: sln_voice/examples/ffva/ffva_int.cmake at 06bf254955dfe2b6d1a83d9b2614a4445021f819 · xmos/sln_voice · GitHub
I hope so, but according to their own docs it’s 16 kHz. You’re compiling your own firmware though, so I am probably misreading something… I wouldn’t get my hopes up. Seeed tends to launch a device and publish some examples, always Arduino, plus ESPHome (like the voice one they have) if you are lucky. After that they never tend to touch or add anything else to that product documentation. Since it’s somewhat new and it appears to be popular, I hope I am wrong; that’s just my past experience with some of their hardware.
Also, ironically, I just got mine, so I might have to check out the USB firmware above, even if only via USB. It’s flashed with the I2C DFU firmware right now. Did the listen light work for everyone else using Seeed’s yaml? Everything else is working. Just wondering if it’s bad yaml or defective hardware.
EDIT: I am wondering if this could be accomplished using the external pad 1 or 2 pins… 1 is for the ESP, 2 is for the XMOS, although you are probably aware of this already.
Every single time on the included 5 W speaker so far, during boot or reboots.
They’re saying it’s because of I2S initialization… Probably I should omit the MCLK pin… It’s not needed in most setups.
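For reference, dropping MCLK would look roughly like the sketch below. It reuses the pin numbers from the config posted further down in this thread. It assumes the `i2s_audio_xiao` fork treats `i2s_mclk_pin` as optional the way upstream ESPHome `i2s_audio` does; I have not verified that, nor whether it actually removes the noise at boot.

```yaml
# Sketch only, untested on the ReSpeaker Lite.
# Assumption: i2s_mclk_pin is optional in the i2s_audio_xiao fork,
# as it is in upstream ESPHome i2s_audio.
i2s_audio_xiao:
  i2s_lrclk_pin: GPIO7
  i2s_bclk_pin: GPIO8
  # i2s_mclk_pin: GPIO9   # intentionally left out
```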
Did the listen light work for everyone else using Seeed’s yaml?
Yes, that thing worked. However, voice itself was pretty unstable, hanging the ESP after 3-4 tries.
After that they never tend to touch or add anything else to that product documentation.
Hopefully we can hold them accountable on Discord. It would be a bummer to have good hardware that can’t be used because of a lack of docs.
1 or 2 pins
Yeah, the ESP pad just exposes some unused GPIOs, a nice thing if someone wants to add stuff. I have no idea what to do with the XMOS pins.
Also, there are pins you can solder-bridge to get that USR button working.
I had it lock up 2 or 3 times after installing it in ESPHome, but I rebooted HA and haven’t had that issue since. It kind of sucks that the listen light just doesn’t work, and it’s certainly not user error. I had this out of the box and set up in under 5, maybe 10 minutes at most, so I certainly didn’t mishandle it. I may need to mess with openWakeWord. I only use it for Wyoming satellite at the moment. I’m sure somebody can get this working with microWakeWord, and I may mess with it later.
I got an M5Stack CoreS3 working without esp-adf (compiled with TensorFlow Lite / tflite). By far my round USB speakerphone works best using assist microphone add on… It’s supposed to utilize the DSP to some degree, based on my research.
A close second is the Wyoming satellite, with the S3 Box slightly behind that. The firmware I used and altered (BigBooba or something like that) has a button to hold and speak, and that is super accurate with the TV on because you get to control when it quits listening. Alternatively, you can touch the screen to make it quit listening after triggering it with the wake word and saying the command. It also has 32 buttons, or something close to that, which you can manually configure to run whatever script or automation you want. I also own a Korvo-1, and it’s actually just as good as the S3 Box regarding voice.
I’m still making up my mind about this. The fact that the listen LED doesn’t work at all out of the box kind of gets it off to a rocky start. I may flash the PC DFU firmware and see if it works with Assist Microphone in HA; it should. It seems to want a command sooner than my other ESP32 devices, but it’s also using openWakeWord while the rest are using microWakeWord, so I’m not sure if that has anything to do with it. I need to use Discord more; it seems like that’s where everyone is moving for support or help these days. I also should change the logging, because it’s on the highest level (VERY_VERBOSE, I think) in Seeed’s YAML. I’ve never seen that in an example before. Almost like it was a rush job to get it posted.
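For anyone who wants to turn the logging down: a minimal sketch of the logger block at ESPHome’s default level. The valid levels are NONE, ERROR, WARN, INFO, DEBUG, VERBOSE and VERY_VERBOSE; which one you pick is a matter of taste.

```yaml
# Minimal sketch: quieter logging than VERY_VERBOSE.
logger:
  level: DEBUG   # ESPHome default; use INFO or WARN for even less output
```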
I only use it for Wyoming satellite at the moment.
But Wyoming satellite can use OWW locally; you don’t need the add-on for that. The add-on is only needed for always-streaming satellites…
By far my round USB speakerphone works best using assist microphone add on
That’s for sure.
The firmware I copied (BigBooba or something like that)
It’s BigBobba. He’s using @gnumpi’s ADF implementation, with nice display add-ons for the Box.
I use a Box (not the 3, the 1st one) as a satellite with MWW - it works OK, but the speaker is crap. I also have a Wyoming satellite with a Respeaker HAT on a Pi Zero 2W. It’s not so good for me: it doesn’t hear me from a couple of meters away, despite the mics being exposed…
Apart from that, I have several generic ESP32-S3 satellites with an INMP mic and a MAX98357 DAC, with microWakeWord. It’s usable if it’s quiet in the room… I use gnumpi’s code there too.
I understand this is going to be the hardest part and it’s going to be incremental, especially for 100% local. It’s really my only issue. Music in the background without lyrics does okay, but if anyone is talking, or the TV or music with vocals is on, etc., it just keeps listening. Not complaining because, as I said, this can’t be easy.
No telling how many cloud resources Google and Amazon use for this type of stuff.
Very true. Well, XMOS should mitigate it, with its directional audio support and ML-based speech detection.
MicroWakeWord config for anyone who wants it. You can change the board back to the esp32-s3 version; I accidentally copied/pasted that in while working from another file to create this. The voice pipeline I am using has no openWakeWord specified. The TensorFlow stuff takes up double the flash space of what is on Seeed’s site. It doesn’t use esp-adf though, which is why the VAD line is commented out in the voice_assistant block.
Also, let me know if the LEDs work, because mine just flat out don’t work at all, period. Everything else does though.
substitutions:
  name: respeakerv3
  friendly_name: Respeakerv3
  voice_assist_idle_phase_id: '1'
  voice_assist_listening_phase_id: '2'
  voice_assist_thinking_phase_id: '3'
  voice_assist_replying_phase_id: '4'
  voice_assist_not_ready_phase_id: '10'
  voice_assist_error_phase_id: '11'
  voice_assist_muted_phase_id: '12'
  micro_wake_word_model: okay_nabu
esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  min_version: 2024.7.0
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    priority: 600
    then:
      - delay: 30s
      - if:
          condition:
            - lambda: return id(init_in_progress);
          then:
            - lambda: id(init_in_progress) = false;
            - script.execute: reset_led
  # on_boot:
  #   then:
  #     - if:
  #         condition:
  #           switch.is_on:
  #         then:
  #           - voice_assistant.start_continuous:
esp32:
  board: esp32s3box
  flash_size: 16MB
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
psram:
  mode: octal
  speed: 80MHz
#captive_portal:
# Enable logging
logger:
  #level: VERY_VERBOSE
# Enable Home Assistant API
api:
  encryption:
    key: "API="
ota:
  - platform: esphome
    password: "OTA"
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  on_connect:
    then:
      - delay: 10s # Gives time for improv results to be transmitted
      - ble.disable:
      - delay: 20ms
      - script.execute: reset_led
  on_disconnect:
    then:
      - ble.enable:
      - script.execute: reset_led
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Respeakerv3 Fallback Hotspot"
    password: "HS"
improv_serial:
esp32_improv:
  authorizer: none
#
# Globals
#
globals:
  - id: init_in_progress
    type: bool
    restore_value: no
    initial_value: 'true'
  - id: voice_assistant_phase
    type: int
    restore_value: no
    initial_value: ${voice_assist_not_ready_phase_id}
time:
  - platform: homeassistant
    id: homeassistant_time
text_sensor:
  - platform: wifi_info
    ip_address:
      name: "${friendly_name} IP Address"
external_components:
  - source: github://QingWind6/ESPHome_XIAO-ESP32S3
i2s_audio_xiao:
  i2s_lrclk_pin: GPIO7
  i2s_bclk_pin: GPIO8
  i2s_mclk_pin: GPIO9
microphone:
  - platform: i2s_audio_xiao
    id: xiao_mic
    adc_type: external
    i2s_din_pin: GPIO44
    pdm: false
    bits_per_sample: 32bit
    channel: left
speaker:
  - platform: i2s_audio_xiao
    id: xiao_speaker
    dac_type: external
    i2s_dout_pin: GPIO43
    mode: stereo
micro_wake_word:
  models: ${micro_wake_word_model}
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
voice_assistant:
  id: va
  microphone: xiao_mic
  speaker: xiao_speaker
  use_wake_word: true
  noise_suppression_level: 0
  auto_gain: 31dBFS
  volume_multiplier: 1.0
  #vad_threshold: 3
  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: wakeword
    - script.execute: reset_led
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: pulse
    - script.execute: reset_led
  on_tts_stream_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: pulse
    - script.execute: reset_led
  on_tts_stream_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: reset_led
  on_end:
    - if:
        condition:
          and:
            - switch.is_off: mute
            - lambda: return id(wake_word_engine_location).state == "On device";
        then:
          - wait_until:
              not:
                voice_assistant.is_running:
          - micro_wake_word.start:
  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - script.execute: reset_led
          - delay: 1s
          - if:
              condition:
                switch.is_off: mute
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: reset_led
  on_client_connected:
    - if:
        condition:
          switch.is_off: mute
        then:
          - voice_assistant.start_continuous:
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - lambda: id(init_in_progress) = false;
    - script.execute: reset_led
  on_client_disconnected:
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    - script.execute: reset_led
select:
  - platform: template
    entity_category: config
    name: Wake word engine location
    id: wake_word_engine_location
    optimistic: true
    restore_value: true
    options:
      - In Home Assistant
      - On device
    initial_option: On device
    on_value:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - wait_until:
                lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
            - if:
                condition:
                  lambda: return x == "In Home Assistant";
                then:
                  - micro_wake_word.stop
                  - delay: 30ms
                  - if:
                      condition:
                        switch.is_off: mute
                      then:
                        - lambda: id(va).set_use_wake_word(true);
                        - voice_assistant.start_continuous:
            - if:
                condition:
                  lambda: return x == "On device";
                then:
                  - lambda: id(va).set_use_wake_word(false);
                  - voice_assistant.stop
                  - delay: 30ms
                  - if:
                      condition:
                        switch.is_off: mute
                      then:
                        - micro_wake_word.start
light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: led strip
    entity_category: config
    pin: GPIO1
    default_transition_length: 0s
    chipset: ws2812
    num_leds: 1
    rgb_order: grb
    rmt_channel: 0
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 1
          add_led_interval: 40ms
          reverse: false
      - addressable_color_wipe:
          name: "Power"
          colors:
            - red: 100%
              green: 0%
              blue: 0%
              num_leds: 1
          add_led_interval: 50ms
          reverse: false
output:
  - platform: ledc
    id: light_output
    pin: GPIO21
    inverted: true
script:
  - id: reset_led
    then:
      - if:
          condition:
            - switch.is_on: use_wake_word
            - switch.is_on: use_listen_light
            # - switch.is_off: mute
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 0%
                green: 0%
                brightness: 80%
                effect: none
          else:
            - if:
                condition:
                  - switch.is_on: use_wake_word
                  - switch.is_off: use_listen_light
                  # - switch.is_on: mute
                then:
                  - light.turn_on:
                      id: led_ring
                      blue: 100%
                      red: 0%
                      green: 0%
                      brightness: 30%
                      effect: none
                else:
                  - light.turn_off: led_ring
switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(va).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(va).set_use_wake_word(false);
  - platform: template
    name: Mute
    id: mute
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(va).set_use_wake_word(true);
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - if:
                condition:
                  not:
                    - voice_assistant.is_running
                then:
                  - voice_assistant.start_continuous
            - script.execute: reset_led
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - voice_assistant.stop
            - lambda: id(va).set_use_wake_word(false);
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            - script.execute: reset_led
  - platform: template
    name: Use Listen Light
    id: use_listen_light
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - script.execute: reset_led
    on_turn_off:
      - script.execute: reset_led
  - platform: restart
    name: restart
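If the LED never lights up, one way to separate a wiring or hardware problem from an automation problem is a manual test button that drives `led_ring` directly, bypassing the voice assistant triggers and `reset_led`. This is only a sketch: the button name is made up, and it assumes the `light:` block above is otherwise unchanged.

```yaml
# Diagnostic sketch: drive the LED directly, independent of voice_assistant.
button:
  - platform: template
    name: "LED test"          # hypothetical name, pick your own
    on_press:
      - light.turn_on:
          id: led_ring
          red: 100%
          green: 0%
          blue: 0%
          brightness: 100%
          effect: none
      - delay: 2s
      - light.turn_off: led_ring
```

If the LED stays dark when this button is pressed, the issue is more likely the pin, the chipset settings, or the hardware than the automations.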
Looks like ESP-ADF won’t ever make it into esphome officially:
This is very good news and makes sense. I am sure Espressif went out of their way on software examples for the S3 Box variants. But it was also a combination of the hardware and documentation. It seems like now that the voice team has gotten their feet wet, so to speak, they are ready to move on, which is good IMO.
I remember watching the live stream when they announced it, and the developer did talk a lot about the TensorFlow Lite (tflite) open source code/library he found. Esp-adf didn’t really come up until the S3 Box was using it.
With huge help from Kevin (the MWW dev), I got the Respeaker Lite fully working with microWakeWord too - but I used the version for Voice-Kit (PE), which isn’t merged into ESPHome yet, to get rid of the Seeed-specific I2S implementation. I won’t publish that now, as it will be changing drastically in the next couple of weeks and my code will be obsolete.
The new MWW uses a separate mic stream to keep listening even while voice_assistant is streaming audio. That’s a pretty good solution for the XMOS board used in the PE device (it exposes 2 different audio streams), but not as good for the Respeaker XMOS, because it exposes a single consolidated audio input stream (confirmed with Seeed support). That means the MWW and VA components won’t be able to modify the stream separately (mostly to adjust gain), which can lead to false positives on the wake word. I will test it more, but it looks not bad, actually.
In other news: the mute button, similar to the USR button, can be soldered to an exposed GPIO pin and made accessible in ESP code. Confirmed this with Seeed support too.
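Once a button is wired to a GPIO, reading it in ESPHome could look roughly like this. It is a sketch under assumptions: the GPIO number is a placeholder (use whichever exposed pad you soldered to), it assumes the button pulls the line low when pressed, and the press action toggling the template `mute` switch from the config posted earlier in this thread is just one option.

```yaml
# Sketch only. GPIO3 is a hypothetical placeholder, not a confirmed pin.
binary_sensor:
  - platform: gpio
    name: "Mute button"
    pin:
      number: GPIO3          # assumption: replace with the pad you wired
      mode: INPUT_PULLUP     # assumption: button shorts the pin to GND
      inverted: true
    filters:
      - delayed_on: 20ms     # simple debounce
    on_press:
      - switch.toggle: mute  # "mute" is the template switch id from the earlier config
```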
LED is working for me flawlessly.
Hello
Could you publish your code, just for testing?
Thanks !
Okay, here it is, but remember - it will all be broken very soon. Also, I didn’t have a chance to tidy it up or add the mute/USR buttons. It’s just a working file, with a lot of stuff from my personal setup and added pieces from the HA PE dev YAML.
Edit: I removed the old, misleading YAML. Here’s a link to the newest working config (my post lower in this thread): "ReSpeaker Lite" - new Seeed Studio Voice Assistant Development Kit hardware combine ESP32 with XMOS XU316 DSP chip for advanced audio processing as a ESPHome-based Home Assistant Assist Satellite voice devkit - #87 by formatBCE