Far field satellite with an elegant 3D printed enclosure

Alextrical · March 13, 2024, 9:52am

I can get the speaker mount to match, though out of curiosity do you have a link to the speaker you are using? A make/model, or at the least the size and wattage of the speaker?

aherbjornsen · March 13, 2024, 10:06am

I’ve yet to connect and test it (and I see others have problems getting audio to work), but I found that this model did fit physically:
https://www.aliexpress.com/item/1005003717211529.html

Alextrical · March 19, 2024, 5:40pm

@aherbjornsen I’ve got a speaker on order from AliExpress that got some good reviews
(No technical data sheets, but I don’t feel like paying for a reputable supplier while testing)

I expect I will eventually go for Visaton FRS 5 X units, once the project has its kinks ironed out and can play music

Fabric for covering the unit has been ordered too, this stuff in grey should look nice, and hopefully be photogenic (Black would look nicer IRL, but likely not photograph well)

I got access to the ESP32-Korvo V1.1 and ESP32-LyraTD-MSC dev boards today. Now to look at getting the audio working (if the ESP32 Wrover is a fast enough chip for ESPHome).
Work will begin on getting the aesthetic enclosure closer to being released.

SpencerDub · March 21, 2024, 8:04pm

This is so exciting! Like I commented over in Egg Voice Assistant, I’ve been dreaming of a smart speaker system with pretty much exactly the same criteria you outlined in your post—far field, LED ring display, can play music and sound good. I hadn’t considered a ceiling mount, though—what a neat bonus!

I’m definitely interested in this project and will be keeping my eyes on it.

Alextrical · March 22, 2024, 11:05am

Thank you for your interest, I’m collaborating with a few other smart speaker creators over on the ESPHome and Willow Discord server, to get the schematic put together to meet those goals.
I’m aiming to get the first draft of the KiCad schematics posted to GitHub later today, under the LibreTalk GitHub organization.

tc23 · March 23, 2024, 12:12am

My apologies, as this seems to be focusing a lot on enclosures and the like, without a ton of talk on the setup. Am I correct in that this is mostly just a thought experiment?

Alextrical · March 23, 2024, 8:21am

I think my issue was that I have tried to attack this project from multiple directions at the same time.

I tested both the ESP32-Korvo-V1.1 and ESP32-LyraTD-MSC.
The audio output on the Korvo V1.1 doesn’t seem to work regardless of all the settings I’ve tried (and a lot of others look to have the same issue with it).

The LyraTD-MSC worked well for audio reproduction (TTS playback), and the microphones worked to pick up the wake word and STT. Not perfectly tuned, but better than a wall-mounted tablet (looking at you, Amazon Fire 7"). However, the light array is a bit of a PITA with it not being a NeoPixel.

Then the icing on top is that neither board is using an ESP32-S3.

So I have been focusing my time collaborating with other smart speaker designers, to design something that I believe will suit the needs of most end users. Which has resulted in me spending less time on the Enclosure or Yaml configs for the development boards (mostly this thread’s intent)

The schematic for the PCB we have been working on can be found at
GitHub - Libre-Talk/Wave: Smart Speaker satellite
It’s an early draft without all the nets fully connected, and the PCB is being laid out.

Alextrical · March 23, 2024, 8:50am

thecodingart · March 23, 2024, 12:45pm

So the setup with me appropriately using Sonos speakers as output media players “was” working for me — until I started hitting buffer errors.

At this point, I’m migrating to Raspberry Pis with ReSpeakers. IMO, ESP32-S3 devices don’t have enough memory to properly handle quality voice assistant work and were a terrible recommendation over satellite Pis.

Alextrical · March 23, 2024, 7:05pm

Hmm, sorry to hear that. Oddly, I did get an ESP32 Wrover on the LyraTD_MSC working nicely on my first try with Assist (audio pickup and playback), and that’s a lower spec chip than the S3.

I wonder if that’s an issue related to Bluetooth audio streaming?

If you’re looking to sell the Korvo-1’s, give me a DM

foreverimagining · March 23, 2024, 7:51pm

Actually, yes. There’s a warning all over the ESPHome documentation that says that Bluetooth, audio, and voice assistant functions each “consume a significant amount of resources”, which causes issues when they are used together. I would be willing to bet most boards available for sale are quite underpowered for the purpose. ESP32-S3s may have the issue compounded if you’re using microWakeWord as well, since the requirement for using that is at least 2MB of PSRAM.

thecodingart · March 23, 2024, 7:57pm

This guy details out some of the root issues I’m seeing. If everything works with the Pi I’ll definitely reach out.

But for clarity I’m using a network call to call my Sonos speakers through media player integrations in HA (see my linked config). No BL use at all.
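
For readers who have not seen this pattern, here is a rough sketch of routing the TTS response to a Home Assistant media player instead of a local speaker. It is not the linked config itself, which isn’t reproduced in this thread; the entity ID `media_player.living_room_sonos` and the microphone ID `board_mic` are placeholders.

```yaml
# Sketch only: hand the TTS response URL to a media player in Home Assistant.
voice_assistant:
  microphone: board_mic              # placeholder ID of a microphone defined elsewhere
  on_tts_end:
    # `x` holds the URL of the generated TTS audio
    - homeassistant.service:
        service: media_player.play_media
        data:
          entity_id: media_player.living_room_sonos   # placeholder entity
          media_content_id: !lambda 'return x;'
          media_content_type: music
          announce: "true"
```

For this to do anything, the device also has to be allowed to make service calls in its ESPHome integration options in Home Assistant.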

tc23 · March 23, 2024, 11:20pm

No worries, I didn’t mean to seem dismissive. This looks amazing, I just have been hurt before by good design and not good software/hardware

Alextrical · March 24, 2024, 7:44am

I get where you are coming from. For a project like this, all 3 need to be worked on for a good device: hardware, software and the enclosure.

I’m putting my hope into the work that has been done by the community on the software side, as that has a lot of activity with the Espressif ADF at its core.

Part of me is still worried that an ESP32-S3 isn’t powerful enough to stream music while also being able to use a wake word. I’m keeping an eye on how that is developing.

synesthesiam · March 24, 2024, 6:52pm

Some additional discussion over on the Rhasspy forums: Question: would anyone be interested in a open source DSP mic array? - #2 by synesthesiam - Hardware - Rhasspy Voice Assistant

janstadt · April 2, 2024, 11:01pm

Any updates on the 3D printed enclosure for the Korvo v1.1? I found this one on Thingiverse: ESP32 Korvo (Jarvis Voice Assistant) Puck Enclosure by CleverCasaChronicles - Thingiverse, which is not bad looking and pretty basic to just get going with. Credit to CleverCasaChronicles for this: https://www.youtube.com/shorts/ehUY2GHL8XE

Also, if anyone has been able to get the Korvo to continually listen as well as stay running after boot, that’d be great. I followed this install and it works, but it’s kinda unstable for some reason: Voice Assistant-Add support for the espressif esp32-korvo-v1.1 · Issue #2430 · esphome/feature-requests · GitHub

ginandbacon · April 7, 2024, 3:59am

That’s the thread for the original Korvo that has a vanilla ESP32 in it. It’s the roughly 20 dollar one on AliExpress. I actually just posted the below because it shouldn’t work at all without an S3 with PSRAM. The GPIO pins are different between the 2 models, so I’m surprised it worked at all, but that explains the odd behavior… Also, there is a mix of Korvo V1.1 and Korvo-1 posts, which just makes it a bit more confusing to navigate through.

←
The voice pipeline used should have no value in Home Assistant because it’s defined in the ESPHome configuration file. While I have been working on the Korvo-1, and I know this isn’t the thread for that one, I’m just curious if PSRAM is no longer required for microWakeWord, because everything I’ve read says it’s a requirement. Additionally, the ESPHome documentation says it’s still required, although it may not have been updated.

The micro_wake_word component requires an ESP32-S3 with PSRAM to function
→

I do have the Korvo-1 working with microWakeWord, but I’m having an “odd” issue that I want to get fixed: the firmware doesn’t show, and ESPHome thinks the Korvo-1 is offline and prompts me to install the “newly” found one. It works with zero issues, which is just really odd. In ESPHome it shows up as red (offline), but if I click on logs they show up and I can see everything working, yet ESPHome still thinks it’s offline. I will post the configuration once I figure out what’s going on, because you will more than likely end up with the same issue. I also used the below, so if the GPIO pins don’t match what’s in that other thread then it’s not going to work. Look at the file with pins in the name.

ginandbacon · April 7, 2024, 3:14pm

Here is the YAML; everything works without any issues. I tried looking into the firmware thing, and it looks like it has to be added on the ESPHome side, and a feature request has been put in. Whether it will ever happen is another story, though. I did get some warnings about the microphone and during the TensorFlow part of the compile.

Also, it was actually still working with OpenWakeWord until I looked into it. After looking into the repository I am pointing to, I realized it has an option to work with both. Looking more closely at the logs showed it wasn’t using microWakeWord: the two are totally different (as far as CLI output goes) when being triggered and processing a wake word. Remember to create a new pipeline with no wake word defined. If you use your default, I believe it will use OpenWakeWord.

Another fun thing: since I set the board to esp32-s3-devkitc-1, it formatted the ROM at either 4MB or 8MB. It was saying 19344 bytes was too large for my flash of 18345 bytes, which I interpreted as 19MB and 18MB, which is obviously WAY off. I probably spent a good 2 hours trying to figure this out before I noticed that when I got that message the PSRAM was more than double the size!!! So all I had to do was add one simple line to specify the flash size as 16MB. DOH. Oh well, it’s working, and that is all that matters.
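
For anyone hitting the same flash-size error, a minimal sketch of that one-line fix, assuming the 16MB module described above. The same lines appear in the full config further down; nothing here is new beyond isolating them.

```yaml
# Sketch only: declare the real flash size instead of relying on the board default.
esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB   # matches the 16MB flash on this Korvo-1
  framework:
    type: esp-idf
```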

Remember to add the API key below, and change the wifi section if you’re not using the same secret format I am. For some reason, a few releases ago all the LEDs stopped working. I don’t know if it’s just a weird LED thing or if only one mic is working. I was pretty much relying on that repo for a lot of stuff because he had the actual Espressif files with the includes to the driver files and specific hardware firmware. Stuff like that. So, you could just specify the mic and one of the mic GPIOs, but I doubt it will work out as well as this because it won’t be utilizing all the hardware. I honestly can’t say if all of it is working with this either. The 3.5mm output works fine. I haven’t tried the speaker output, and all my speakers are passive; I also don’t have a JST adapter that will fit it.
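
To illustrate what “just specify the mic and one of the mic GPIOs” could look like, here is a rough sketch using ESPHome’s generic `i2s_audio` microphone platform instead of `esp_adf`. The `GPIOXX` pins are placeholders, not the Korvo-1’s real pinout, and would have to be replaced with values from the board’s pin definitions. Whether the board’s audio ADC produces usable audio this way is untested and, as noted above, doubtful.

```yaml
# Sketch only: generic I2S microphone, bypassing the esp_adf board support.
# Every GPIOXX below is a placeholder and must be replaced with a real pin.
i2s_audio:
  i2s_lrclk_pin: GPIOXX
  i2s_bclk_pin: GPIOXX

microphone:
  - platform: i2s_audio
    id: plain_mic          # hypothetical ID
    adc_type: external
    i2s_din_pin: GPIOXX
    pdm: false
```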

Just remember, you may forever have a new entity wanting to be adopted after this. Just leave it alone: even though it shows up as red, it just works, and unless someone adds the firmware to ESPHome, it will always be that way. What is really odd is that there is a line in there that points to a GitHub URL for the firmware that doesn’t exist. If I comment it out, flashing fails, so it has to be in there, even though the file doesn’t exist in the main ESPHome repository it’s pointed to.

Lastly, it took over 600 seconds to compile, and I am running HA on a 7-year-old Intel NUC with an i5, so if you are using a Raspberry Pi 4 it could take a while longer. Make sure to just wait until it flashes or times out. Actually, if it’s your first time flashing it, you will have to plug it into a PC or the HA server, so you will see the CLI output. The TensorFlow part takes quite some time. You have to hold the boot button when plugging the power cord in to put it into boot mode. The USB cable for communication should already be plugged into your PC when doing this.

substitutions:
  name: "korvo"
  friendly_name: korvo

esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  name_add_mac_suffix: true
  platformio_options:
    board_build.flash_mode: dio
    upload_speed: 460800
  project:
    name: esphome.voice-assistant
    version: "1.0"
  min_version: 2023.11.5
  on_boot:
    - priority: 600
      then:
        - light.turn_on:
            id: led_ring
            brightness: 70%
            effect: connecting

esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
      CONFIG_ESP32_S3_KORVO1_BOARD: "y"
    components:
      - name: esp32_s3_korvo1_board
        source: github://abmantis/esphome_custom_audio_boards@main
        refresh: 0s

psram:
  mode: octal
  speed: 80MHz

external_components:
  - source: github://pr#5230
    components: esp_adf
    refresh: 0s

ota:
logger:
api:
  encryption:
    key: your_api_key  # placeholder: replace with your own API encryption key
  on_client_connected:
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - delay: 1s
            - voice_assistant.start_continuous:
            - delay: 1s
            - voice_assistant.stop:
            - delay: 1500ms
            - voice_assistant.start_continuous:
            - script.execute: reset_led
  on_client_disconnected:
    then:
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 100%
          green: 100%
          brightness: 50%
          effect: connecting

dashboard_import:
  package_import_url: github://esphome/firmware/voice-assistant/esp32-s3-korvo1.yaml@main

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  use_address: 192.168.0.xx
  ap:
  on_connect:
    then:
      - delay: 5s # Gives time for improv results to be transmitted
      - ble.disable:
  on_disconnect:
    then:
      - ble.enable:

improv_serial:

esp32_improv:
  authorizer: none

button:
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset

esp_adf:
  board: esp32s3korvo1

microphone:
  - platform: esp_adf
    id: korvo_mic

speaker:
  - platform: esp_adf
    id: korvo_speaker
    
micro_wake_word:
  on_wake_word_detected:
    # then:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
    - light.turn_on:
        id: led_ring      
        red: 30%
        green: 30%
        blue: 70%
        brightness: 60%
        effect: pulse  # was "fast pulse", which is not defined under light effects below
  model: okay_nabu

voice_assistant:
  id: voice_asst
  microphone: korvo_mic
  speaker: korvo_speaker
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2
  use_wake_word: true
  on_listening:
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: wakeword
  on_tts_start:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: pulse
  on_end:
    - delay: 500ms
    - wait_until:
        not:
          speaker.is_playing:
    - script.execute: reset_led
  on_error:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
    - script.wait: reset_led
    - lambda: |-
        if (code == "wake-provider-missing" || code == "wake-engine-missing") {
          id(use_wake_word).turn_off();
        }

script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 0%
                green: 0%
                brightness: 30%
                effect: none
          else:
            - light.turn_off: led_ring

switch:
  - platform: gpio
    id: pa_ctrl
    pin: GPIO38
    name: "${friendly_name} Speaker Mute"
    restore_mode: ALWAYS_ON

  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(voice_asst).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - script.execute: reset_led

light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Light"
    pin: GPIO19
    num_leds: 12
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 20ms
          reverse: false
      - addressable_color_wipe:
          name: "Connecting"
          colors:
            - red: 60%
              green: 60%
              blue: 60%
              num_leds: 12
            - red: 60%
              green: 60%
              blue: 0%
              num_leds: 12
          add_led_interval: 100ms
          reverse: true

binary_sensor:
  - platform: template
    name: "${friendly_name} Volume Up"
    id: btn_volume_up
  - platform: template
    name: "${friendly_name} Volume Down"
    id: btn_volume_down
  - platform: template
    name: "${friendly_name} Set"
    id: btn_set
  - platform: template
    name: "${friendly_name} Play"
    id: btn_play
  - platform: template
    name: "${friendly_name} Mode"
    id: btn_mode
    on_multi_click:
      - timing:
          - ON for at least 10s
        then:
          - button.press: factory_reset_btn
  - platform: template
    name: "${friendly_name} Record"
    id: btn_record
    on_press:
      - voice_assistant.start:
      - light.turn_on:
          id: led_ring
          brightness: 100%
          effect: "Wakeword"
    on_release:
      - voice_assistant.stop:
      - light.turn_off:
          id: led_ring

sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 8
    attenuation: 11db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
      - above: 2.25
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.8
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF

Alextrical · April 7, 2024, 3:23pm

Sorry for the quietness on this thread, I’ve been spending my time trying to develop a cheap smart speaker solution that hopefully ticks the boxes for the most users possible.

I’m guessing the code above is for the Korvo1 V2.4, not the Korvo V1.1. (That naming convention caught me out too; now I have 3 ESP32-based units that don’t work with ESPHome.) I did have some luck with the ESP32-LyraTD-MSC, though: despite its lower-end CPU, it happily picked up the wake word and even worked as an input and output for Assist, until I tried to transplant an ESP32-S3 N16R8 onto it and it got locked into bootloader mode.

ginandbacon · April 7, 2024, 5:18pm

Yes, it’s for the Korvo-1. Real great naming convention. It’s the one with an S3 with 16MB of ROM and 8MB of PSRAM, so not the Korvo V1.1 or whatever. Technically mine is Korvo-1 v5, and there is also a Korvo-2, which has a screen and camera interface along with 2 mics and a speaker, if I am not mistaken. If you use the code above that I originally posted, it will work with OpenWakeWord; just take out the microWakeWord part. From “on_wake_word_detected:” to “model”, just delete all that and it will work on the Korvo V1.1 using OpenWakeWord. microWakeWord requires PSRAM, and the vanilla ESP32 in that model doesn’t have any. I originally thought it was using microWakeWord; it wasn’t, it was using Open. It works better with micro, just noticeably quicker to allow a voice command, but it still works well with Open. If you do use micro, create a voice pipeline with no wake word. I have heard conflicting information about whether a pipeline with a wake word defined makes it use Open.
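
As a rough sketch of that change, assuming the IDs from the config posted earlier in the thread (`voice_asst`, `korvo_mic`, `korvo_speaker`): the whole `micro_wake_word:` block is dropped, and the voice assistant is left to stream audio to Home Assistant for wake word detection. The LED actions are omitted here for brevity.

```yaml
# Sketch only: the micro_wake_word: block (header through `model`) is removed entirely.
voice_assistant:
  id: voice_asst
  microphone: korvo_mic
  speaker: korvo_speaker
  use_wake_word: true   # audio is streamed to Home Assistant, which runs the wake word engine
```

With this variant the Assist pipeline assigned to the device needs a wake word engine such as openWakeWord selected. The board and pin definitions are a separate question, covered in the next paragraph.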

Another note: the above won’t work for the Korvo V1.1. It’s getting the GPIO and other information from that repo. What you would need is something similar to the below, which I’m unsure how I missed during my searches. The issue is that it’s for the same model I have, so the GPIO pins and drivers/code as defined may be different. Someone should be able to redo the GPIO pins, and there has to be some V1.1 YAML out there. I’ll do some searches, but something like the ESPHome code in the below link, which doesn’t depend on any repositories, is more straightforward. I actually might see how the below compares to the above, but I’ve been impressed by how much more easily micro picks up the trigger word.

gist.github.com

https://gist.github.com/mattkasa/83eb96b1590735f3fd9fbbd14a7ca0a0

esp32-s3-korvo-1.yaml

substitutions:
  name: korvo-1
  friendly_name: "Korvo 1"
  ip_address: 192.168.1.101
  wifi_ssid: iot
  power_save: high

esphome:
  name: ${name}
  platformio_options:

This file has been truncated.

EDIT: Specific model just to make sure there is no confusion