"ReSpeaker Lite" - new Seeed Studio Voice Assistant Development Kit hardware combine ESP32 with XMOS XU316 DSP chip for advanced audio processing as a ESPHome-based Home Assistant Assist Satellite voice devkit

nanosonde · September 26, 2024, 2:55pm

Seeed should really provide the schematics for the respeaker-lite just to be sure.
Then we could compile the XMOS FW for 48kHz and possibly also use it for high quality music playback (if the ESP32-S3 is able to cope with this).

We just have to recompile this: sln_voice/examples/ffva at develop · xmos/sln_voice · GitHub

I already did and it works too. Tested with the UA (USB-version) variant and 48kHz.
We would have to use the INT variant. Default here is already 48kHz: sln_voice/examples/ffva/ffva_int.cmake at 06bf254955dfe2b6d1a83d9b2614a4445021f819 · xmos/sln_voice · GitHub

ginandbacon · September 26, 2024, 4:34pm

I hope so, but according to their own docs it’s 16kHz. You’re compiling your own firmware, though, so I am probably misreading something… I wouldn’t get my hopes up. Seeed tends to launch a device and publish some examples (always Arduino, and ESPHome like the voice one they have, if you are lucky). After that they never tend to touch or add anything else to that product documentation. Since it’s somewhat new and appears to be popular, I hope I am wrong; this is just my past experience with some of their hardware.

Also, ironically, I just got mine, so I might have to check out the USB firmware above, even if only via USB. It is flashed with the I2C DFU firmware right now. Did the listen light work for everyone else using Seeed’s yaml? Everything else is working. Just wondering if it’s bad yaml or defective hardware.

EDIT: I am wondering if this could be accomplished using the external pad 1 or 2 pins… 1 is for the ESP, 2 is for the XMOS, although you are probably aware of this already.

ginandbacon · September 26, 2024, 4:45pm

Every single time on the included 5w speaker so far during boot or reboots

formatBCE · September 26, 2024, 4:53pm

They’re saying it’s because of i2s initialization… Probably, i should omit mclk pin… It’s not needed in most of setups.
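For reference, a minimal sketch of an I2S bus definition with the MCLK pin left out. It uses the stock ESPHome `i2s_audio` component, where `i2s_mclk_pin` is optional, rather than the Seeed-specific `i2s_audio_xiao` external component. Whether the Seeed component also accepts a config without MCLK, and whether this cures the noise at boot, are assumptions that would need testing. The pin numbers are the ones from Seeed's YAML.

```yaml
# Sketch only: stock ESPHome I2S bus without a master clock pin.
# Assumes the XMOS side does not need MCLK from the ESP32.
i2s_audio:
  - id: i2s_bus              # hypothetical id
    i2s_lrclk_pin: GPIO7
    i2s_bclk_pin: GPIO8
    # i2s_mclk_pin: GPIO9    # deliberately omitted
```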

formatBCE · September 26, 2024, 4:58pm

Did the listen light work for everyone else using Seeed’s yaml?

Yes, that thing worked. However, voice itself was pretty unstable, hanging ESP after 3-4 tries.

After that they never tend to touch or add anything else to that product documentation.

Hopefully we can hold them accountable on Discord. It would be a bummer to have good hardware and not be able to use it because of a lack of docs.

1 or 2 pins

Yeah, ESP pad just exposes some unused GPIO, nice thing if someone wants to add stuff. Have no idea what to do with XMOS pins.
Also, there’s pins you can jump-solder to have that USR button working.

ginandbacon · September 26, 2024, 5:41pm

I had it lock up 2 or 3 times after installing it in ESPHome, but I rebooted HA and haven’t had that issue since. It kind of sucks that the listen light just doesn’t work, and it’s certainly not user error. I had this out of the box and set up in under 5, maybe 10 minutes at most, so I certainly didn’t mishandle it. I may need to mess with OpenWakeWord. I only use it for Wyoming satellite at the moment. I’m sure somebody can get this working with microWakeWord, and I may mess with it later.

I got an M5Stack CoreS3 working without esp-adf (TensorFlow tflite compiled). By far my round USB speakerphone works best using assist microphone add on… It’s supposed to utilize the DSP to some degree, based on my research.

A close second is the Wyoming satellite, with the S3 Box slightly behind that. The firmware I used and altered (BigBooba or something like that) has a button you hold while you speak. That is super accurate with the TV on, and you get to control when it quits listening. Alternatively, you can touch the screen to make it quit listening after triggering it with the wake word and saying the command. It also has 32 buttons, or something close to that, which you can manually configure to run whatever script or automation you want. I also own a Korvo-1, and it’s actually just as good as the S3 Box regarding voice.

I’m still making up my mind about this. The fact that the listen LED doesn’t work at all out of the box kind of gets it off to a rocky start. I may flash the PC DFU and see if it works with Assist Microphone in HA; it should. It seems to want a command sooner than my other ESP32 devices, but it’s also using openWakeWord while the rest are using microWakeWord, so I’m not sure if that has anything to do with it. I need to use Discord more; it seems like that’s where everyone is moving for support or help these days. I should also change the logging, because Seeed’s YAML sets it to the highest (I think verbose) level. I’ve never seen that in an example before. It’s almost like it was a rush job to get posted.
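A minimal sketch of dialing the log level back down, assuming the only change wanted is a quieter log. `DEBUG` is ESPHome's default level; `VERBOSE` and `VERY_VERBOSE` produce considerably more output and are normally only useful while troubleshooting.

```yaml
# Sketch only: replace the verbose level from the vendor example
# with ESPHome's default log level.
logger:
  level: DEBUG
```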

formatBCE · September 26, 2024, 5:49pm

I only use it for Wyoming satellite at the moment.

But Wyoming satellite can use OWW locally, so you don’t need the add-on for that. The add-on is only needed for always-streaming satellites…

By far my round USB speakerphone works best using assist microphone add on

That’s for sure.

The firmware I copied (BigBooba or something like that)

It’s BigBobba. He’s using @gnumpi’s adf implementation, with nice display add-ons for the Box.

I use Box (not 3, 1st one) as satellite with MWW - it works OK, but speaker is crap. Also i have Wyoming satellite with Respeaker hat on Pi Zero 2W. It’s not so good for me, doesn’t hear me from couple meters despite mics exposed…
Apart from that, i have several generic ESP32-S3 satellites with INMP mic and MAX98357 DAC, with Microwakeword. It’s usable, if it’s quiet in the room… I use gnumpi’s code there too.

ginandbacon · September 26, 2024, 5:55pm

I understand this is going to be the hardest part and that progress is going to be incremental, especially for 100% local. It’s really my only issue. Music in the background without lyrics does okay, but if anyone is talking, or there is TV, music with vocals, etc., it just keeps listening. I’m not complaining because, as I said, this can’t be easy.

There’s no telling how many cloud resources Google and Amazon use for this type of stuff.

formatBCE · September 26, 2024, 6:01pm

Very true. Well, XMOS should mitigate it, with directional audio support and ML algorithm of speech detection.

ginandbacon · September 27, 2024, 9:32am

Here is a MicroWakeWord config for anyone who wants it. You can change the board back to the esp32-s3 version; I accidentally copied and pasted the current one from another file I was using as a template. The voice pipeline I am using has no openWakeWord specified. The TensorFlow code takes up twice as much space on the ROM as the firmware on Seeed’s site. It doesn’t use esp-adf, though, which is why the VAD line is commented out in the ESP voice pipeline.

Also, let me know if the LED works for you, because mine just flat out doesn’t work at all, period. Everything else does, though.

substitutions:
  name: respeakerv3   
  friendly_name: Respeakerv3
  
  voice_assist_idle_phase_id: '1'
  voice_assist_listening_phase_id: '2'
  voice_assist_thinking_phase_id: '3'
  voice_assist_replying_phase_id: '4'
  voice_assist_not_ready_phase_id: '10'
  voice_assist_error_phase_id: '11'  
  voice_assist_muted_phase_id: '12'

  micro_wake_word_model: okay_nabu  

esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  min_version: 2024.7.0
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    priority: 600
    then:
      - delay: 30s            
      - if:
          condition:
            - lambda: return id(init_in_progress);
          then:
            - lambda: id(init_in_progress) = false;    
      - script.execute: reset_led            

  # on_boot:
  #   then:
  #     - if:
  #            condition:
  #              switch.is_on: 
  #            then:
  #              - voice_assistant.start_continuous:

esp32:
  board: esp32s3box
  flash_size: 16MB
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"

psram:
  mode: octal
  speed: 80MHz  

#captive_portal:  


    
# Enable logging
logger:
  #level: VERY_VERBOSE

# Enable Home Assistant API
api:
  encryption:
    key: "API="

ota:
  - platform: esphome
    password: "OTA"

wifi:
  ssid: !secret wifi_ssid 
  password: !secret wifi_password
  on_connect:
    then:
      - delay: 10s # Gives time for improv results to be transmitted
      - ble.disable:
      - delay: 20ms
      - script.execute: reset_led        
  on_disconnect:
    then:
      - ble.enable:  
      - script.execute: reset_led           

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Respeakerv3 Fallback Hotspot"
    password: "HS"

improv_serial:

esp32_improv:
  authorizer: none  
# 
# 
# Globals
# 
globals:
  - id: init_in_progress
    type: bool
    restore_value: no
    initial_value: 'true'
  - id: voice_assistant_phase
    type: int
    restore_value: no
    initial_value: ${voice_assist_not_ready_phase_id}

time:
  - platform: homeassistant
    id: homeassistant_time    

text_sensor:
  - platform: wifi_info
    ip_address:
      name: "${friendly_name} IP Address"    

external_components:
  - source: github://QingWind6/ESPHome_XIAO-ESP32S3      

i2s_audio_xiao:
  i2s_lrclk_pin: GPIO7
  i2s_bclk_pin: GPIO8
  i2s_mclk_pin: GPIO9

microphone:
  - platform: i2s_audio_xiao
    id: xiao_mic
    adc_type: external
    i2s_din_pin: GPIO44
    pdm: false
    bits_per_sample: 32bit
    channel: left

speaker:
  - platform: i2s_audio_xiao
    id: xiao_speaker
    dac_type: external
    i2s_dout_pin: GPIO43
    mode: stereo
    
micro_wake_word:
  models: ${micro_wake_word_model}
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;    

voice_assistant:
  id: va
  microphone: xiao_mic
  speaker: xiao_speaker
  use_wake_word: true
  noise_suppression_level: 0
  auto_gain: 31dBFS
  volume_multiplier: 1.0
  #vad_threshold: 3
  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: wakeword
    - script.execute: reset_led 
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: pulse    
    - script.execute: reset_led
  on_tts_stream_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: pulse    
    - script.execute: reset_led
  on_tts_stream_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: reset_led
  on_end:
    - if:
        condition:
          and:
            - switch.is_off: mute
            - lambda: return id(wake_word_engine_location).state == "On device";
        then:
          - wait_until:
              not:
                voice_assistant.is_running:
          - micro_wake_word.start:
  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - script.execute: reset_led
          - delay: 1s
          - if:
              condition:
                switch.is_off: mute
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: reset_led
  on_client_connected: 
    - if:
        condition:
          switch.is_off: mute
        then:
          - voice_assistant.start_continuous:
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - lambda: id(init_in_progress) = false; 
    - script.execute: reset_led
  on_client_disconnected:
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};  
    - script.execute: reset_led
    
select:
  - platform: template
    entity_category: config
    name: Wake word engine location
    id: wake_word_engine_location
    optimistic: true
    restore_value: true
    options:
      - In Home Assistant
      - On device
    initial_option: On device
    on_value:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - wait_until:
                lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
            - if:
                condition:
                  lambda: return x == "In Home Assistant";
                then:
                  - micro_wake_word.stop
                  - delay: 30ms
                  - if:
                      condition:
                        switch.is_off: mute
                      then:
                        - lambda: id(va).set_use_wake_word(true);
                        - voice_assistant.start_continuous:
            - if:
                condition:
                  lambda: return x == "On device";
                then:
                  - lambda: id(va).set_use_wake_word(false);
                  - voice_assistant.stop
                  - delay: 30ms
                  - if:
                      condition:
                        switch.is_off: mute
                      then:
                        - micro_wake_word.start         

light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: led strip
    entity_category: config
    pin: GPIO1
    default_transition_length: 0s
    chipset: ws2812
    num_leds: 1
    rgb_order: grb
    rmt_channel: 0
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 1
          add_led_interval: 40ms
          reverse: false
      - addressable_color_wipe:
          name: "Power"
          colors:
            - red: 100%
              green: 0%
              blue: 0%
              num_leds: 1
          add_led_interval: 50ms
          reverse: false

output:
  - platform: ledc
    id: light_output
    pin: GPIO21
    inverted: true


script:
  - id: reset_led
    then:
      - if:
          condition:
            - switch.is_on: use_wake_word
            - switch.is_on: use_listen_light
          #  - switch.is_off: mute
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 0%
                green: 0%
                brightness: 80%
                effect: none
          else:
            - if:
                condition:
                  - switch.is_on: use_wake_word
                  - switch.is_off: use_listen_light
                #  - switch.is_on: mute
                then:
                  - light.turn_on:
                      id: led_ring
                      blue: 100%
                      red: 0%
                      green: 0%
                      brightness: 30%
                      effect: none
                else:
                  - light.turn_off: led_ring


switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(va).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(va).set_use_wake_word(false);
      
  - platform: template
    name: Mute
    id: mute
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:      
            - lambda: id(va).set_use_wake_word(true);
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - if:
                condition:
                  not:
                    - voice_assistant.is_running
                then:
                  - voice_assistant.start_continuous
            - script.execute: reset_led
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:      
            - voice_assistant.stop
            - lambda: id(va).set_use_wake_word(false);
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            - script.execute: reset_led      

  - platform: template
    name: Use Listen Light
    id: use_listen_light
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - script.execute: reset_led
    on_turn_off:
      - script.execute: reset_led  


  - platform: restart
    name: restart

nanosonde · September 27, 2024, 10:02am

Looks like ESP-ADF won’t ever make it into esphome officially:

ginandbacon · September 27, 2024, 10:08am

This is very good news and makes sense. I am sure Espressif went out of their way on software examples for the S3 Box variants. But it was also a combination of the hardware and documentation. It seems like now that the voice team has gotten their feet wet, so to speak, they are ready to move on, which is good IMO.

I remember watching the live stream when they announced it, and the developer did talk a lot about the TensorFlow Lite (tflite) open source code/library he found. Esp-adf didn’t really come up until the S3 Box was using it.

formatBCE · September 27, 2024, 2:19pm

With huge help from Kevin (the MWW dev), I got the Respeaker Lite fully working with MicroWakeWord too, but I used the version for the Voice-Kit (PE), which isn’t yet merged into ESPHome, to get rid of the Seeed-specific I2S implementation. I won’t publish that now, as it will be changing drastically in the next couple of weeks and my code will be obsolete.
The new MWW uses a separate mic stream to keep listening even while voice_assistant is streaming audio. That’s a pretty good solution for the XMOS board used in the PE device (it exposes 2 different audio streams), but not as good for the Respeaker’s XMOS, because it exposes a single consolidated audio input stream (confirmed with Seeed support). That means the MWW and VA components won’t be able to modify the stream separately (mostly to adjust gain), which can lead to false positives on the wake word. I will test it more, but it actually looks not bad.

In other news: the mute button, like the USR button, can be soldered to an exposed GPIO pin and made accessible in ESP code. I confirmed this with Seeed support too.

LED is working for me flawlessly.
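Once the mute button is soldered to an exposed GPIO, reading it in ESPHome could look roughly like the sketch below. The pin number, the id `mute_button`, and the button polarity (pulling the pin to ground when pressed) are all assumptions, since the thread does not state them. `mute` is the template switch id from the YAML posted earlier in the thread.

```yaml
# Sketch only: read a soldered-on button and toggle the "mute" switch.
binary_sensor:
  - platform: gpio
    id: mute_button          # hypothetical id
    pin:
      number: GPIO3          # assumption: use whichever exposed GPIO you soldered to
      mode: INPUT_PULLUP
      inverted: true         # assumption: button pulls the pin low when pressed
    on_press:
      - switch.toggle: mute
```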

will35 · September 27, 2024, 2:28pm

Hello

Could you publish your code, just for testing ?

Thanks !

formatBCE · September 27, 2024, 3:17pm

Okay, here it is, but remember - it all will be broken very soon. Also, i didn’t have a chance to tidy it up, and add mute/usr buttons. It’s just working file, with lot of stuff going on from my personal setup, with added pieces from HA PE dev YAML.

Edit: to remove old misleading YAML: here’s link to newest working config (link to my post lower in this thread): "ReSpeaker Lite" - new Seeed Studio Voice Assistant Development Kit hardware combine ESP32 with XMOS XU316 DSP chip for advanced audio processing as a ESPHome-based Home Assistant Assist Satellite voice devkit - #87 by formatBCE

ginandbacon · September 27, 2024, 6:10pm

Thanks. I’m not a developer, I just play one on the HA forums and sometimes at work; I do work in IT, but not as a dev. I can put the pieces of the puzzle together, so what I can do in an hour a real dev could do in 15 minutes or less.

Thanks for the info above, and confirmed that this is going to be for testing. I wouldn’t order it if I hadn’t already, and Seeed will probably never release a single firmware update (I hope I’m wrong, but I doubt I will be).

As for the LED, I fixed that, sort of, with a simple solution that I actually thought of myself. I’m sure others have done it, and I’m sure I’ll get 3 replies from people who have, but it works great when there is no LED to tell whether it even heard the wake word. By far the quickest and best automation idea I’ve had. I’ve got another one for when it’s done, although it’s not really needed…

formatBCE · September 27, 2024, 6:14pm

You can play sound on respeaker itself. Check how ding.mp3 is played in my code for timer. No need to create automation.
P.S. so is LED problem hardware-based for you? Because for me, LED works completely fine…

ginandbacon · September 27, 2024, 6:18pm

Yes, or something wasn’t soldered correctly. It hasn’t worked once. The green LED and red LED on the XIAO S3 work just fine. Since everything else appears to be working without any issues, I’m speculating it’s a hardware or quality control issue. I haven’t tried to flash the PC firmware/DFU, but I was probably going to try it and see if that works with the assist microphone add-on, just to test. I’m not really expecting better results.

I’ll certainly be looking at your code later. This was just a quick solution, and I was sick of looking at the Android app to see if it actually heard the wake word.

formatBCE · September 27, 2024, 6:26pm

I tried DFU - LED is turned off there completely. I2C does expose it on ESP GPIO1. Bummer, if it’s broken…
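To rule out a YAML problem, the LED can be tested on its own with a stripped-down light entry like the sketch below. It is based on the light block from the config posted earlier in the thread; the id and name are hypothetical. On newer ESPHome releases the `rmt_channel` option may not be accepted and may have to be removed.

```yaml
# Sketch only: expose the single on-board RGB LED (GPIO1 with the I2C
# firmware) as a plain light that can be toggled from Home Assistant.
light:
  - platform: esp32_rmt_led_strip
    id: led_test             # hypothetical id
    name: "LED test"         # hypothetical name
    pin: GPIO1
    chipset: ws2812
    num_leds: 1
    rgb_order: GRB
    rmt_channel: 0
```

If the light still stays dark when switched on from Home Assistant with this alone, a hardware fault becomes the more likely explanation.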