A presentable voice assistant satellite

Just pair the Bluetooth mic/speaker device with the RPi's Bluetooth.

And that will work as a satellite that reacts to wake words?

I don't think anyone has gotten that to work yet.

I've been googling this but can’t seem to find how to do it. Any pointers?

It just acts like a plain microphone/speaker for the system; whatever you run on that system should be able to use it as such.

Debian Bullseye (skip this and use the Bookworm directions below; they work better)

sudo apt update
sudo apt full-upgrade
sudo apt -y dist-upgrade
sudo reboot

Check to see if SAP failed, and if so remove the plugin:
sudo systemctl status bluetooth

● bluetooth.service - Bluetooth service
     Loaded: loaded (/lib/systemd/system/bluetooth.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2024-03-09 19:15:27 EST; 5min ago
       Docs: man:bluetoothd(8)
   Main PID: 4016 (bluetoothd)
     Status: "Running"
      Tasks: 1 (limit: 779)
        CPU: 166ms
     CGroup: /system.slice/bluetooth.service
             └─4016 /usr/libexec/bluetooth/bluetoothd

Mar 09 19:15:27 VA-001 bluetoothd[4016]: Bluetooth daemon 5.55
Mar 09 19:15:27 VA-001 systemd[1]: Started Bluetooth service.
Mar 09 19:15:27 VA-001 bluetoothd[4016]: Starting SDP server
Mar 09 19:15:27 VA-001 bluetoothd[4016]: Bluetooth management interface 1.22 initialized
Mar 09 19:15:27 VA-001 bluetoothd[4016]: profiles/sap/server.c:sap_server_register() Sap driver initialization failed.
Mar 09 19:15:27 VA-001 bluetoothd[4016]: sap-server: Operation not permitted (1)
Mar 09 19:15:27 VA-001 bluetoothd[4016]: Failed to set mode: Failed (0x03)
Mar 09 19:15:27 VA-001 bluetoothd[4016]: Set device flags return status: Invalid Parameters
Mar 09 19:15:27 VA-001 bluetoothd[4016]: Endpoint registered: sender=:1.30 path=/MediaEndpoint/A2DPSink/sbc
Mar 09 19:15:27 VA-001 bluetoothd[4016]: Endpoint registered: sender=:1.30 path=/MediaEndpoint/A2DPSource/sbc

sudo nano /lib/systemd/system/bluetooth.service

Edit the ExecStart=/usr/libexec/bluetooth/bluetoothd line to add --noplugin=sap

Should look like this
ExecStart=/usr/libexec/bluetooth/bluetoothd --noplugin=sap
Ctrl + O, Ctrl + X to write then exit
sudo systemctl daemon-reload
sudo systemctl restart bluetooth
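
The check and fix above can be sketched as two small helpers. This is only a sketch: the function names `sap_failed` and `sap_dropin` are made up for illustration, and instead of editing the packaged unit file it prints a systemd drop-in (an empty `ExecStart=` clears the packaged command before the new one is set). The bluetoothd path is the one shown in the status output above.

```shell
# Succeed if bluetoothd log text on stdin shows the SAP plugin failing to start.
sap_failed() {
    grep -q 'Sap driver initialization failed'
}

# Print a drop-in that replaces ExecStart with the --noplugin=sap variant.
sap_dropin() {
    printf '%s\n' '[Service]' 'ExecStart=' \
        'ExecStart=/usr/libexec/bluetooth/bluetoothd --noplugin=sap'
}

# Usage (not executed here; the drop-in file name nosap.conf is arbitrary):
#   if journalctl -u bluetooth -b --no-pager | sap_failed; then
#       sudo mkdir -p /etc/systemd/system/bluetooth.service.d
#       sap_dropin | sudo tee /etc/systemd/system/bluetooth.service.d/nosap.conf
#       sudo systemctl daemon-reload && sudo systemctl restart bluetooth
#   fi
```

A drop-in has the advantage that a later package upgrade of bluez does not overwrite the change.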

sudo apt -y install --no-install-recommends pulseaudio-module-bluetooth

sudo nano /etc/systemd/system/pulseaudio.service
[Unit]
Description=Pulseaudio sound server
After=avahi-daemon.service network.target

[Service]
ExecStart=/usr/bin/pulseaudio --system --disallow-exit --disable-shm --daemonize=no
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

Save/Exit

Add this to the end of the system.pa file:

sudo nano /etc/pulse/system.pa
.ifexists module-bluetooth-policy.so
load-module module-bluetooth-policy
.endif
.ifexists module-bluetooth-discover.so
load-module module-bluetooth-discover
.endif

Save/Exit

sudo adduser pulse bluetooth
Adding user `pulse' to group `bluetooth' ...
Adding user pulse to group bluetooth
Done.

sudo adduser pi pulse-access
Adding user 'pi' to group `pulse-access' ...
Adding user pi to group pulse-access
Done.

Add this to the end of the default.pa file:

sudo nano /etc/pulse/default.pa
# automatically switch to newly-connected devices
load-module module-switch-on-connect

sudo systemctl restart dbus

sudo systemctl enable pulseaudio
sudo systemctl start pulseaudio

sudo usermod -G bluetooth -a pi
su - $USER
id

rfkill list
0: phy0: Wireless LAN
        Soft blocked: no
        Hard blocked: no
1: hci0: Bluetooth
        Soft blocked: yes
        Hard blocked: no

rfkill unblock all
rfkill list
0: phy0: Wireless LAN
        Soft blocked: no
        Hard blocked: no
1: hci0: Bluetooth
        Soft blocked: no
        Hard blocked: no
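
If you want to script that check rather than read the listing by eye, something like the following should work. It is a minimal sketch that assumes the `rfkill list` layout shown above; the function name `bt_soft_blocked` is hypothetical.

```shell
# Read `rfkill list` output on stdin; succeed only if a Bluetooth entry
# reports "Soft blocked: yes".
bt_soft_blocked() {
    awk '
        /^[0-9]+:/ { in_bt = ($0 ~ /Bluetooth/) }   # device header line
        in_bt && /Soft blocked: yes/ { found = 1 }
        END { exit !found }
    '
}

# Usage (not executed here):
#   rfkill list | bt_soft_blocked && sudo rfkill unblock bluetooth
```

Unblocking only the `bluetooth` type leaves the other radios alone, unlike `rfkill unblock all`.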

bluetoothctl
Agent registered

power on
Changing power on succeeded

agent on
Agent is already registered

default-agent
Default agent request successful

scan on
Discovery started
[CHG] Controller DE:AD:BE:EF:CA:FE Discovering: yes
[NEW] Device DE:AD:BE:EF:FA:CE AudioHeadSet

pair de:ad:be:ef:fa:ce
Attempting to pair with DE:AD:BE:EF:FA:CE
[CHG] Device DE:AD:BE:EF:FA:CE Connected: yes
[CHG] Device DE:AD:BE:EF:FA:CE UUIDs: #########-####-####-####-###############
[CHG] Device DE:AD:BE:EF:FA:CE UUIDs: #########-####-####-####-###############
[CHG] Device DE:AD:BE:EF:FA:CE ServicesResolved: yes
[CHG] Device DE:AD:BE:EF:FA:CE Paired: yes
Pairing successful
[CHG] Device DE:AD:BE:EF:FA:CE ServicesResolved: no
[CHG] Device DE:AD:BE:EF:FA:CE Connected: no

trust de:ad:be:ef:fa:ce
[CHG] Device DE:AD:BE:EF:FA:CE Trusted: yes
Changing DE:AD:BE:EF:FA:CE trust succeeded

connect de:ad:be:ef:fa:ce
Attempting to connect to DE:AD:BE:EF:FA:CE
[CHG] Device DE:AD:BE:EF:FA:CE Connected: yes
Connection successful
[CHG] Device DE:AD:BE:EF:FA:CE ServicesResolved: yes

scan off
quit

sudo reboot

LANG=C pactl list | grep -A2 'Source #' | grep 'Name: ' | cut -d" " -f2
alsa_output.platform-bcm2835_audio.analog-stereo.monitor
bluez_sink.DE_AD_BE_EF_FA_CE.a2dp_sink.monitor
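
Note that both names in that listing end in `.monitor`, which are loopbacks of the playback sinks rather than a microphone. A2DP only carries audio towards the headset, so this is what you would expect if the headset connected with the A2DP profile. A small filter makes that easier to spot; the function name `bt_mic_sources` is made up for this sketch.

```shell
# Read source names (one per line) on stdin and print only Bluetooth
# capture sources, skipping the ".monitor" loopbacks of the sinks.
# Prints nothing (and returns non-zero) when there is no Bluetooth mic.
bt_mic_sources() {
    grep '^bluez_' | grep -v '\.monitor$'
}

# Usage (not executed here):
#   pactl list short sources | cut -f2 | bt_mic_sources \
#       || echo "no Bluetooth microphone source found"
```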

aplay /usr/share/sounds/alsa/Front_Center.wav
Playing WAVE '/usr/share/sounds/alsa/Front_Center.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Mono

arecord -d 5 test.wav

aplay test.wav

General Satellite Setup

sudo apt-get install python3 python3-pip python3-venv \
                     alsa-utils git

sudo apt-get install --no-install-recommends \
                     ffmpeg

cd
mkdir code
cd code

git clone https://github.com/synesthesiam/homeassistant-satellite.git
cd homeassistant-satellite
script/setup

sudo nano /etc/systemd/system/homeassistant-satellite.service

[Unit]
Description=Home Assistant Satellite
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
ExecStart=/home/pi/code/homeassistant-satellite/script/run --host 10.0.1.42 --token <HALongLivedTokenHere> --awake-sound sounds/awake.wav --done-sound sounds/done.wav --mic-device pulse --snd-device pulse
WorkingDirectory=/home/pi/code/homeassistant-satellite
Restart=always
RestartSec=1

[Install]
WantedBy=default.target

Debian Bookworm and PipeWire: the Bluetooth mic works using Profile: Headset Head Unit (HSP/HFP, codec CVSD)

sudo apt install pipewire \
  libspa-0.2-bluetooth \
  pipewire-audio-client-libraries \
  pipewire-media-session- \
  wireplumber \
  bluez \
  blueman \
  pavucontrol
sudo usermod -G bluetooth -a pi
su - $USER
id
rfkill list
0: phy0: Wireless LAN
        Soft blocked: no
        Hard blocked: no
1: hci0: Bluetooth
        Soft blocked: yes
        Hard blocked: no
rfkill unblock all
rfkill list
0: phy0: Wireless LAN
        Soft blocked: no
        Hard blocked: no
1: hci0: Bluetooth
        Soft blocked: no
        Hard blocked: no
bluetoothctl
Agent registered

power on
Changing power on succeeded

agent on
Agent is already registered

default-agent
Default agent request successful

scan on
Discovery started
[CHG] Controller DE:AD:BE:EF:CA:FE Discovering: yes
[NEW] Device DE:AD:BE:EF:FA:CE AudioHeadSet

pair de:ad:be:ef:fa:ce
Attempting to pair with DE:AD:BE:EF:FA:CE
[CHG] Device DE:AD:BE:EF:FA:CE Connected: yes
[CHG] Device DE:AD:BE:EF:FA:CE UUIDs: #########-####-####-####-###############
[CHG] Device DE:AD:BE:EF:FA:CE UUIDs: #########-####-####-####-###############
[CHG] Device DE:AD:BE:EF:FA:CE ServicesResolved: yes
[CHG] Device DE:AD:BE:EF:FA:CE Paired: yes
Pairing successful
[CHG] Device DE:AD:BE:EF:FA:CE ServicesResolved: no
[CHG] Device DE:AD:BE:EF:FA:CE Connected: no

trust de:ad:be:ef:fa:ce
[CHG] Device DE:AD:BE:EF:FA:CE Trusted: yes
Changing DE:AD:BE:EF:FA:CE trust succeeded

connect de:ad:be:ef:fa:ce
Attempting to connect to DE:AD:BE:EF:FA:CE
[CHG] Device DE:AD:BE:EF:FA:CE Connected: yes
Connection successful
[CHG] Device DE:AD:BE:EF:FA:CE ServicesResolved: yes

scan off
quit

If HSP is not showing up:

sudo cp -r /var/lib/bluetooth /var/lib/bluetooth_BACKUP
sudo systemctl stop bluetooth
sudo rm -rf /var/lib/bluetooth/*
sudo systemctl start bluetooth
sudo reboot
sudo apt install gstreamer1.0
gst-launch-1.0 audiotestsrc ! audioresample ! autoaudiosink
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
Redistribute latency...
New clock: GstPulseSinkClock
0:00:00.1 / 99:99:99.

You should hear a beep.

pw-cli list-objects | grep node.name
                node.name = "Dummy-Driver"
                node.name = "Freewheel-Driver"
                node.name = "Midi-Bridge"
                node.name = "v4l2_input.platform-bcm2835-isp.2"
                node.name = "v4l2_input.platform-bcm2835-isp.3"
                node.name = "v4l2_input.platform-bcm2835-isp.6"
                node.name = "v4l2_input.platform-bcm2835-isp.7"
                node.name = "alsa_output.platform-bcm2835_audio.stereo-fallback"
                node.name = "bluez_input.DE_AD_BE_EF_FA_CE.0"
                node.name = "bluez_output.DE_AD_BE_EF_FA_CE.1"

You can switch between the A2DP and HSP profiles like this:

pactl set-card-profile bluez_card.DE_AD_BE_EF_FA_CE headset-head-unit-cvsd
pactl set-card-profile bluez_card.DE_AD_BE_EF_FA_CE a2dp-sink-sbc
pactl list
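
The card and node names above are just the device's MAC address with the colons replaced by underscores. A rough sketch of deriving them, so you don't have to retype the address: the function names are hypothetical, and the numeric suffixes (`.0`, `.1`) are read from the `pw-cli` output rather than guessed.

```shell
# Convert a colon-separated MAC (any case) to the underscore form used in names.
mac_to_id() {
    printf '%s\n' "$1" | tr 'a-f:' 'A-F_'
}

# Card name as used with pactl set-card-profile.
bt_card() {
    printf 'bluez_card.%s\n' "$(mac_to_id "$1")"
}

# Read `pw-cli list-objects` output on stdin and print the bluez node names
# that belong to the given MAC.
bt_nodes() {
    id=$(mac_to_id "$1")
    sed -n 's/.*node\.name = "\(bluez_[a-z]*\.'"$id"'[^"]*\)".*/\1/p'
}

# Usage (not executed here):
#   pactl set-card-profile "$(bt_card de:ad:be:ef:fa:ce)" headset-head-unit-cvsd
#   pw-cli list-objects | bt_nodes de:ad:be:ef:fa:ce
```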
git clone https://github.com/rhasspy/wyoming-openwakeword.git
cd wyoming-openwakeword
script/setup
script/run --uri 'tcp://0.0.0.0:10400' --debug --preload-model 'hey_jarvis'
git clone https://github.com/rhasspy/wyoming-satellite.git
cd wyoming-satellite
script/setup
script/run --name 'bluetooth satellite' --uri 'tcp://0.0.0.0:10700' --mic-command 'parecord --device bluez_input.DE_AD_BE_EF_FA_CE.0 --rate=16000 --channels=1 --format=s16le --raw' --snd-command 'paplay --device bluez_output.DE_AD_BE_EF_FA_CE.1 --rate=22050 --channels=1 --format=s16le --raw' --debug --wake-uri 'tcp://127.0.0.1:10400' --wake-word-name 'hey_jarvis' --done-wav sounds/done.wav --awake-wav sounds/awake.wav
INFO:root:Ready
DEBUG:root:Detected IP: 10.0.42.100
DEBUG:root:Zeroconf discovery enabled (name=############, host=None)
DEBUG:root:Connecting to mic service: ['parecord', '--device', 'bluez_input.DE_AD_BE_EF_FA_CE.0', '--rate=16000', '--channels=1', '--format=s16le', '--raw']
DEBUG:root:Connecting to snd service: ['paplay', '--device', 'bluez_output.DE_AD_BE_EF_FA_CE.1', '--rate=16000', '--channels=1', '--format=s16le', '--raw']
DEBUG:root:Connecting to wake service: tcp://127.0.0.1:10400
INFO:root:Connected to services
DEBUG:root:Connected to mic service
DEBUG:root:Connected to wake service
DEBUG:root:Server set: 3667213960213
INFO:root:Connected to server
INFO:root:Waiting for wake word
DEBUG:root:Ping enabled
DEBUG:root:Detection(name='hey_jarvis_v0.1', timestamp=3674454417437)
DEBUG:root:Streaming audio
DEBUG:root:Event(type='run-pipeline', data={'start_stage': 'asr', 'end_stage': 'tts', 'restart_on_end': False, 'snd_format': {'rate': 22050, 'width': 2, 'channels': 1}}, payload=None)
DEBUG:root:Muting microphone for 0.8995918367346939 second(s)
DEBUG:root:Connected to snd service
DEBUG:root:Unmuted microphone
DEBUG:root:Connected to snd service
DEBUG:root:Unmuted microphone
DEBUG:root:Event(type='transcript', data={'text': ' Turn off the family room light.'}, payload=None)
INFO:root:Waiting for wake word
DEBUG:root:Connected to snd service
DEBUG:root:Event(type='synthesize', data={'text': 'Turned off the lights', 'voice': {'name': 'en_US-lessac-medium'}}, payload=None)
DEBUG:root:Connected to snd service
DEBUG:root:Detection(name='hey_jarvis_v0.1', timestamp=3784489216895)
DEBUG:root:Streaming audio
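
Since the mic and speaker commands are long and easy to mistype, here is a sketch of building them from the node names. The helper names are made up. The playback rate defaults to 22050 because that is the `snd_format` rate reported in the `run-pipeline` event in the log above; treat that default as an assumption and pass your own if your setup differs.

```shell
# Print the recording command for a given PipeWire/PulseAudio source name.
mic_command() {
    printf 'parecord --device %s --rate=16000 --channels=1 --format=s16le --raw\n' "$1"
}

# Print the playback command for a given sink name; the second argument is
# the sample rate (defaults to 22050).
snd_command() {
    printf 'paplay --device %s --rate=%s --channels=1 --format=s16le --raw\n' \
        "$1" "${2:-22050}"
}

# Usage (not executed here):
#   script/run --name 'bluetooth satellite' --uri 'tcp://0.0.0.0:10700' \
#       --mic-command "$(mic_command bluez_input.DE_AD_BE_EF_FA_CE.0)" \
#       --snd-command "$(snd_command bluez_output.DE_AD_BE_EF_FA_CE.1)" \
#       --wake-uri 'tcp://127.0.0.1:10400' --wake-word-name 'hey_jarvis'
```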

Here is my ESPHome config for the Espressif ESP32-S3-Korvo-1 development board. I ended up getting it since an S3 Box is impossible to get. It works, and all the LEDs work as they should, but it could probably use some tweaking to take full advantage of the hardware. Now I just need to design a 3D case. The best part is that it has speaker outputs for low-power or powered speakers and a 3.5mm output for pretty much anything you want to hook it into for sound output. No onboard speaker, though.

substitutions:
  friendly_name: korvo 

esphome:
  name: korvo
  friendly_name: ${friendly_name}
  name_add_mac_suffix: true
  platformio_options:
    board_build.flash_mode: dio
    upload_speed: 460800
  project:
    name: esphome.voice-assistant
    version: "1.0"
  min_version: 2023.11.1
  on_boot:
    - priority: 600
      then:
        - light.turn_on:
            id: led_ring
            brightness: 70%
            effect: connecting

esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
      CONFIG_ESP32_S3_KORVO1_BOARD: "y"
    components:
      - name: esp32_s3_korvo1_board
        source: github://abmantis/esphome_custom_audio_boards@main
        refresh: 0s

psram:
  mode: octal
  speed: 80MHz

external_components:
  - source: github://pr#5230
    components: esp_adf
    refresh: 0s

ota:
logger:
api:
  on_client_connected:
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - delay: 1s
            - voice_assistant.start_continuous:
            - delay: 1s
            - voice_assistant.stop:
            - delay: 2s
            - voice_assistant.start_continuous:
            - script.execute: reset_led
  on_client_disconnected:
    then:
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 100%
          green: 100%
          brightness: 50%
          effect: connecting

dashboard_import:
  package_import_url: github://esphome/firmware/voice-assistant/esp32-s3-korvo1.yaml@main

wifi:
  use_address: 192.168.0.57
  ap:
  on_connect:
    then:
      - delay: 5s # Gives time for improv results to be transmitted
      - ble.disable:
  on_disconnect:
    then:
      - ble.enable:

improv_serial:

esp32_improv:
  authorizer: none

button:
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset

esp_adf:
  board: esp32s3korvo1

microphone:
  - platform: esp_adf
    id: korvo_mic

speaker:
  - platform: esp_adf
    id: korvo_speaker

voice_assistant:
  id: voice_asst
  microphone: korvo_mic
  speaker: korvo_speaker
  noise_suppression_level: 4
  auto_gain: 10dBFS
  volume_multiplier: 1
  use_wake_word: false
  on_listening:
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: wakeword
  on_tts_start:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: pulse
  on_end:
    - delay: 100ms
    - wait_until:
        not:
          speaker.is_playing:
    - script.execute: reset_led
  on_error:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
    - script.wait: reset_led
    - lambda: |-
        if (code == "wake-provider-missing" || code == "wake-engine-missing") {
          id(use_wake_word).turn_off();
        }

script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 0%
                green: 0%
                brightness: 30%
                effect: none
          else:
            - light.turn_off: led_ring

switch:
  - platform: gpio
    id: pa_ctrl
    pin: GPIO38
    name: "${friendly_name} Speaker Mute"
    restore_mode: ALWAYS_ON

  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(voice_asst).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - script.execute: reset_led

light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Light"
    pin: GPIO19
    num_leds: 12
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 20ms
          reverse: false
      - addressable_color_wipe:
          name: "Connecting"
          colors:
            - red: 60%
              green: 60%
              blue: 60%
              num_leds: 12
            - red: 60%
              green: 60%
              blue: 0%
              num_leds: 12
          add_led_interval: 100ms
          reverse: true

binary_sensor:
  - platform: template
    name: "${friendly_name} Volume Up"
    id: btn_volume_up
  - platform: template
    name: "${friendly_name} Volume Down"
    id: btn_volume_down
  - platform: template
    name: "${friendly_name} Set"
    id: btn_set
  - platform: template
    name: "${friendly_name} Play"
    id: btn_play
  - platform: template
    name: "${friendly_name} Mode"
    id: btn_mode
    on_multi_click:
      - timing:
          - ON for at least 10s
        then:
          - button.press: factory_reset_btn
  - platform: template
    name: "${friendly_name} Record"
    id: btn_record
    on_press:
      - voice_assistant.start:
      - light.turn_on:
          id: led_ring
          brightness: 100%
          effect: "Wakeword"
    on_release:
      - voice_assistant.stop:
      - light.turn_off:
          id: led_ring

sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 8
    attenuation: 11db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
      - above: 2.25
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.8
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF
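
To make the button logic in the `sensor:` block easier to follow: all six buttons share one ADC pin, and each press pulls the voltage into its own band. The sketch below maps a reading in millivolts to the button that band belongs to, using the thresholds from the config above. The function name is hypothetical and the band edges are treated as inclusive here, which is an assumption.

```shell
# Map an ADC reading (integer millivolts) to the button band it falls in.
# Readings in the gaps between bands map to "none"; anything at or above
# 2800 mV means all buttons are released.
button_for_mv() {
    mv=$1
    if   [ "$mv" -le 550 ]; then echo volume_up
    elif [ "$mv" -ge 650 ]  && [ "$mv" -le 920 ];  then echo volume_down
    elif [ "$mv" -ge 1020 ] && [ "$mv" -le 1330 ]; then echo set
    elif [ "$mv" -ge 1430 ] && [ "$mv" -le 1770 ]; then echo play
    elif [ "$mv" -ge 1870 ] && [ "$mv" -le 2150 ]; then echo mode
    elif [ "$mv" -ge 2250 ] && [ "$mv" -le 2560 ]; then echo record
    elif [ "$mv" -ge 2800 ]; then echo released
    else echo none
    fi
}

# Usage:
#   button_for_mv 1600    # falls in the 1.43 V to 1.77 V band
```

The gaps between bands act as dead zones, so a noisy reading near an edge does not flip between two buttons.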

For anyone looking for an elegant 3D-printed enclosure (covered with a textile speaker grille) for the Korvo and Korvo1, have a look at my WIP over here: Far field satellite with an Elegant 3d printed enclosures

I’ve been following this thread, but it would be great if someone could wrap it up with some simple-to-follow instructions on how it all turned out and what works best, both for the S3 dev board and the regular ESP board.


Hello, with my ESP32 N16R8 I was able to control the RGB LED and use the wake word, and the actions are carried out correctly, but despite all my attempts I did not manage to output sound via the MAX98357, either in speaker mode or as a media_player. So I would also welcome wiring / code that works with the ESP32-S3 :wink:

Here is a link to the code I use on the N16R8. It works, but it has a few false positives when the radio is on in the background, so I need to fiddle with the settings.

Thanks to BigBobbas

As an update to my crackling speaker issues: I replaced the MAX98357 and it was fine. So it was one non-working amp.


Hi,

I saw your post. I tried the same setup with my ESP32 WROOM-32 but have issues with the media_player: when I play a file, the ESP32 crashes.

Is the ESP32-S3-N16R8 dev board better than the “ESP32 WROOM 32”?

How do you give it the command to stop the music?
I read something about custom commands with Assist. Or is it possible to mute the sound when the wake word is spoken?

Everyone with noise problems: it is the 3.3V or 5V supply voltage. I run an OrangePI 2W with USB audio and it was noisy. The solution is to boost 5V to 8V and then use an analog (linear) voltage regulator such as a 7805. I cut the red and black wires in a USB extension cable, being careful of the data wires, and connected an MT3608 boost converter followed by the 7805 regulator, which supplies smooth voltage to the USB audio module. Be aware that if you need 3.3V, you need a 3.3V regulator, and no boost converter if you have 5V available somewhere. For me the noise disappeared completely. /Mikael
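
As a rough sanity check on the numbers: a linear regulator needs its input to sit above the output by at least its dropout voltage, which is why the 5V is boosted first. The sketch below assumes a dropout of about 2V (a typical datasheet figure for 7805-type parts, not something measured in this thread) and a made-up 100 mA load; the function names are hypothetical.

```shell
# Succeed if the regulator has enough input headroom.
# Arguments: input mV, output mV, dropout mV.
regulator_has_headroom() {
    [ "$1" -ge $(( $2 + $3 )) ]
}

# Heat dissipated in the regulator in mW:
# (input mV - output mV) * load mA / 1000.
regulator_dissipation_mw() {
    echo $(( ($1 - $2) * $3 / 1000 ))
}

# Usage:
#   regulator_has_headroom 8000 5000 2000 && echo "8V in is enough for 5V out"
#   regulator_dissipation_mw 8000 5000 100
```

Boosting higher than needed only turns the extra voltage into heat in the regulator, so staying close to output plus dropout is the sensible range.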


Well, I was wrong. I thought the repository was handling microwakeword and it wasn’t, so here is the updated working ESPHome code for microwakeword. Also, if you flash it with the board set to esp32-s3-devkitc-1, it only formats the flash as either 4MB or 8MB. I was pulling my hair out until I realized the PSRAM was more than double the ROM showing in ESPHome. It said the BIN file was 19xxx bytes and the flash was 18xxx bytes, so I automatically assumed 18MB when it’s nowhere close to that. I just had to specify the flash size, one line (flash_size: 16MB under esp32:).

Remember to change the API key and the Wi-Fi settings, as I have it on a static IP. Also, it points to a nonexistent file for the firmware; because of this, it shows up as red in ESPHome, but everything works, and you will always be prompted to install a “new” device. Just leave it alone, don’t adopt it. From what I understand, it will need to be added on the ESPHome side, and there has been a feature request to add it. Whether or when that will happen, I have no idea. If you comment it out, it fails to flash. Everything works without issues, although I’m pretty positive the buttons don’t do anything. There are some warnings during install, and they are way above my skill set, as I am not a developer, but I can cut and paste other people’s code, mess around with it, and usually get things to work. Most credit goes to whoever wrote the repository I am pointing to, because I never would have been able to do it without that.

substitutions:
  name: "korvo"
  friendly_name: korvo

esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  name_add_mac_suffix: true
  platformio_options:
    board_build.flash_mode: dio
    upload_speed: 460800
  project:
    name: esphome.voice-assistant
    version: "1.0"
  min_version: 2023.11.5
  on_boot:
    - priority: 600
      then:
        - light.turn_on:
            id: led_ring
            brightness: 70%
            effect: connecting

esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
      CONFIG_ESP32_S3_KORVO1_BOARD: "y"
    components:
      - name: esp32_s3_korvo1_board
        source: github://abmantis/esphome_custom_audio_boards@main
        refresh: 0s

psram:
  mode: octal
  speed: 80MHz

external_components:
  - source: github://pr#5230
    components: esp_adf
    refresh: 0s

ota:
logger:
api:
  encryption:
    key: your_api_key
  on_client_connected:
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - delay: 1s
            - voice_assistant.start_continuous:
            - delay: 1s
            - voice_assistant.stop:
            - delay: 1500ms
            - voice_assistant.start_continuous:
            - script.execute: reset_led
  on_client_disconnected:
    then:
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 100%
          green: 100%
          brightness: 50%
          effect: connecting

dashboard_import:
  package_import_url: github://esphome/firmware/voice-assistant/esp32-s3-korvo1.yaml@main

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  use_address: 192.168.0.xx
  ap:
  on_connect:
    then:
      - delay: 5s # Gives time for improv results to be transmitted
      - ble.disable:
  on_disconnect:
    then:
      - ble.enable:

improv_serial:

esp32_improv:
  authorizer: none

button:
  - platform: factory_reset
    id: factory_reset_btn
    name: Factory reset

esp_adf:
  board: esp32s3korvo1

microphone:
  - platform: esp_adf
    id: korvo_mic

speaker:
  - platform: esp_adf
    id: korvo_speaker
    
micro_wake_word:
  on_wake_word_detected:
    # then:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
    - light.turn_on:
        id: led_ring      
        red: 30%
        green: 30%
        blue: 70%
        brightness: 60%
        effect: fast pulse  # note: no effect named "fast pulse" is defined under light: below
  model: okay_nabu

voice_assistant:
  id: voice_asst
  microphone: korvo_mic
  speaker: korvo_speaker
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2
  use_wake_word: true
  on_listening:
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: wakeword
  on_tts_start:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 50%
        effect: pulse
  on_end:
    - delay: 500ms
    - wait_until:
        not:
          speaker.is_playing:
    - script.execute: reset_led
  on_error:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
    - script.wait: reset_led
    - lambda: |-
        if (code == "wake-provider-missing" || code == "wake-engine-missing") {
          id(use_wake_word).turn_off();
        }

script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 0%
                green: 0%
                brightness: 30%
                effect: none
          else:
            - light.turn_off: led_ring

switch:
  - platform: gpio
    id: pa_ctrl
    pin: GPIO38
    name: "${friendly_name} Speaker Mute"
    restore_mode: ALWAYS_ON

  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(voice_asst).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - script.execute: reset_led

light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Light"
    pin: GPIO19
    num_leds: 12
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 20ms
          reverse: false
      - addressable_color_wipe:
          name: "Connecting"
          colors:
            - red: 60%
              green: 60%
              blue: 60%
              num_leds: 12
            - red: 60%
              green: 60%
              blue: 0%
              num_leds: 12
          add_led_interval: 100ms
          reverse: true

binary_sensor:
  - platform: template
    name: "${friendly_name} Volume Up"
    id: btn_volume_up
  - platform: template
    name: "${friendly_name} Volume Down"
    id: btn_volume_down
  - platform: template
    name: "${friendly_name} Set"
    id: btn_set
  - platform: template
    name: "${friendly_name} Play"
    id: btn_play
  - platform: template
    name: "${friendly_name} Mode"
    id: btn_mode
    on_multi_click:
      - timing:
          - ON for at least 10s
        then:
          - button.press: factory_reset_btn
  - platform: template
    name: "${friendly_name} Record"
    id: btn_record
    on_press:
      - voice_assistant.start:
      - light.turn_on:
          id: led_ring
          brightness: 100%
          effect: "Wakeword"
    on_release:
      - voice_assistant.stop:
      - light.turn_off:
          id: led_ring

sensor:
  - id: button_adc
    platform: adc
    internal: true
    pin: 8
    attenuation: 11db
    update_interval: 15ms
    filters:
      - median:
          window_size: 5
          send_every: 5
          send_first_at: 1
      - delta: 0.1
    on_value_range:
      - below: 0.55
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: ON
      - above: 0.65
        below: 0.92
        then:
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: ON
      - above: 1.02
        below: 1.33
        then:
          - binary_sensor.template.publish:
              id: btn_set
              state: ON
      - above: 1.43
        below: 1.77
        then:
          - binary_sensor.template.publish:
              id: btn_play
              state: ON
      - above: 1.87
        below: 2.15
        then:
          - binary_sensor.template.publish:
              id: btn_mode
              state: ON
      - above: 2.25
        below: 2.56
        then:
          - binary_sensor.template.publish:
              id: btn_record
              state: ON
      - above: 2.8
        then:
          - binary_sensor.template.publish:
              id: btn_volume_up
              state: OFF
          - binary_sensor.template.publish:
              id: btn_volume_down
              state: OFF
          - binary_sensor.template.publish:
              id: btn_set
              state: OFF
          - binary_sensor.template.publish:
              id: btn_play
              state: OFF
          - binary_sensor.template.publish:
              id: btn_mode
              state: OFF
          - binary_sensor.template.publish:
              id: btn_record
              state: OFF

Many people have already written that crackling and noise are primarily caused by interference in the wiring between the ESP and the sound module board, because it is not just a relay or LED control wire but a digital serial data bus operating at high frequencies. The best solution is to use no wires at all. I’m sure everyone has these things; they are usually bundled with the boards. I made such an assembly and I’m sharing the pictures. No noise when turned on or in operation, and crystal clear sound in music and voice.




In my opinion, this is more reliable and compact.
The sound module board (amplifier) should be powered directly from 5 V, and the microphone board from 3.3 V. I added a separate Type-C power connector and soldered it on the ESP directly to the inputs of the 3.3 V step-down converter (don’t mess it up). This gives the least voltage drop.

And yes, I confirm that microphones can be easily killed when soldering. In my case, flux got into the hole and I couldn’t get it out with alcohol and other cleaners.

p.s.: And another mounting option (this is not for a speaker, but as an adapter to a 5.1 system)


I have a question. If we use a Bluetooth speaker like the one in this build, why route audio output through an amplifier (which may generate noise and headaches)? Could we instead use the Bluetooth connection (usually available on all ESP32 boards) to connect to the speaker itself for audio output? I’m guessing that this way we could use the volume controls on the speaker without extra programming and remove the problematic amplifier from the build. The ESP32 board would be used for the wake word, incoming audio through the microphone, and the LED lights. Can it be done?

I don’t believe this is currently possible. Most examples I’ve seen or used turn off Bluetooth when microwakeword is enabled. Someone in another thread said you can output audio to a media player, but it has to be defined in the ESPHome configuration, and you may or may not have to allow the device to make HA service calls. Just be sure to enable that on the device after it’s added to HA. You would use one of the two service calls below:

# In some trigger
on_...:
  # Simple
  - homeassistant.service:
      service: notify.html5
      data:
        message: Button was pressed
  # With templates and variables
  - homeassistant.service:
      service: notify.html5
      data:
        title: New Humidity
      data_template:
        message: The humidity is {{ my_variable }}%.
      variables:
        my_variable: |-
          return id(my_sensor).state;

I have had some time to look at this, and the microphone part is not possible unless you have a device with the headset_head_unit profile. PulseAudio does not support the handsfree_head_unit profile, which most Bluetooth speakers have, like mine. I could use the speaker’s audio output with the add-on, but unfortunately I couldn’t get the microphone to work over Bluetooth.

If PulseAudio had ofono as well, it could work, but I guess that is a separate issue with HAOS.

Honestly, I wish I knew, because on my Korvo-1 there is an annoying popping sound before playback. I believe it’s because there is a detect pin: if something is plugged into the 3.5mm jack it’s high, and if not it’s low, and it uses the jack output to non-passive (powered) speakers, which nobody has gotten to work. The S3 box is gone and this is decently priced.

It’s just super annoying, but looking at alternatives, there is the M5Stack CoreS3, which has YAML out there for use as a voice assistant. It also has several modules, although I’m not sure how ESPHome would handle them, as they have an RCA module with left and right channels. They also have one for Ethernet. The problem is it can’t just be plug and play in ESPHome. They do have the code out there, though, and it’s essentially an S3 box made by M5Stack, but you can stack modules, from batteries to LoRa modules. The only two that interested me are below, but like I said, it won’t just be plug and play.

I doubt this will ever do audio over BT. It’s listening for the wake word every 15ms or even higher by default, and it uses the PSRAM to hold on to what it heard just to see if it was triggered, so listening for the wake word is its full-time job. I guess that’s not the case once you’re in the actual voice pipeline, as that’s just STT and TTS, but I’ve always gone by the S3 box examples, and they disable BT once they connect to HA so it won’t take up resources (per comments in the YAML). The thing is, the below might just end up in the same situation, because outside the S3 boxes it seems hit or miss. I have read about lots of people having audio issues of various degrees.

And after typing all that out: that one has audio issues too, according to some GitHub discussion. Try the below; it might work, but I haven’t tried it yet. I’ve watched a few of his videos, so if he says he has it working I don’t doubt him.

Honestly, I ended up spending 10 dollars and I am going to try this, because while voice commands are nice, they certainly still have some issues. I’m sure they will get worked out over time, but 8 dollars for physical keys to run anything would be nice to have around, and you can do that with a direct WiFi num pad; you just need an available USB port on your HA server.

I’ve moved on from these ESP32 options to the Wyoming Satellite. They are much better.

Oddly, I just figured out how to do it. But yeah, Wyoming satellite can actually run openWakeWord and some other stuff like Wyoming and Piper on the Pi, which was never ported to ESP32, and I honestly don’t think it has the power to. Just leaving this here for others in case they run into this. You also have to go to Devices, then ESPHome, click Configure on the voice assistant, and check “Allow this device to make Home Assistant service calls”. After that I unplugged the 3.5mm audio output. Problem solved.


Source

on_tts_end:
  # x holds the URL of the generated TTS audio
  - homeassistant.service:
      service: media_player.play_media
      data:
        entity_id: media_player.vlc_telnet
        media_content_id: !lambda 'return x;'
        media_content_type: music
        announce: "true"