A presentable voice assistant satellite

FredTheFrog · February 26, 2024, 12:14pm

I’ll be honest, I took a day off from fighting with it yesterday. Integrating the changes supplied by @bkprath and @robgough1970 I would hope for a more stable experience with this board. I think the idea of stopping and restarting the voice_assistant is a good move, so long as it doesn’t introduce long delays.

Going to add wires soldered to the D+ and D- signals on that Onn power/USB board, specifically to enable data connections using the external USB-C connector. Ordered a few of these USB-C connectors to use.

bkprath · February 26, 2024, 12:30pm

There is a delay between the completion of a response and the start of detection of the next wake word, so I often have to say the wake word twice. However, I do not know whether that is a result of my change or just the way it works, since I also often have to say the wake word twice when initiating the first command. For me stability was the big issue. With the change I don’t have any lost-in-space moments, the ones that required a restart of the board to get things going again. And audio is a lot better.
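
For readers following along, the stop/restart idea discussed here could be sketched roughly as below. This is only an illustration of the approach, not the actual change referred to above (that code is not shown in this thread), and the 500ms pause is an arbitrary assumption:

```yaml
# Sketch only: cycle the voice assistant after each pipeline run.
# The delay value is a guess, not a tested setting.
voice_assistant:
  id: va
  microphone: external_microphone
  speaker: external_speaker
  use_wake_word: true
  on_end:
    # Stop the component once the pipeline has finished...
    - voice_assistant.stop:
    # ...wait until it has really stopped...
    - wait_until:
        not:
          voice_assistant.is_running:
    - delay: 500ms
    # ...then go back to continuous wake word listening.
    - voice_assistant.start_continuous:
```

A longer delay would presumably widen the gap before the next wake word can be heard, which may be the trade-off described above.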

Rich37804 · February 26, 2024, 12:57pm

Starting and stopping the voice assistant was the issue I was having with the S3 boards. They would work at first, then just not respond after a bit of time.
I removed that code from one of them and left it in the other last night. This morning, the one I removed that code from responded and the one that had the code did not.
This doesn’t seem to affect the other boards but I removed the code anyway.

Rich37804 · February 26, 2024, 2:06pm

Here is a vid of me showing the 4 I have built right after each other. Got a bit of choppiness in one. I talk a bit about what I did that I think helps that speaker buffer issue.

bkprath · February 26, 2024, 6:36pm

Nice video. You mention that you turn off wake word detection when you’re out of the room. Turning wake word detection off and on would actually be similar to my cycling the ESP voice assistant code during command processing. Since I’ve had the system hang problems with the other platforms people are using, it makes sense that some network latency could be what triggers the issue within the ESP voice assistant code. I’m going to see if I get better performance simply by putting my one unit closer to my WiFi router. Which ESP32-S3 are you using? I looked into the code, and the M5StampS3 boards I tried to use are not supported by the code base.
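
One way to check whether moving a unit closer to the router changes anything is to expose the signal strength to Home Assistant. A minimal sketch, assuming the standard ESPHome `wifi_signal` sensor platform; the entity name and update interval are placeholders:

```yaml
# Sketch only: report WiFi signal strength so it can be compared
# before and after moving the satellite.
sensor:
  - platform: wifi_signal
    name: "Satellite WiFi Signal"   # placeholder name
    update_interval: 60s
```

Signal strength is only a rough proxy for latency, so this would hint at a network problem rather than prove one.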

bkprath · February 26, 2024, 6:43pm

Any chance you have a reference for doing the OpenAI integration if HA doesn’t get an answer?

Rich37804 · February 26, 2024, 8:55pm

These are the S3 boards I am using
Amazon.com: AITRIP 3PCS ESP32-S3-DevKitC-1-N16R8 ESP32-S3 Development Board Wi-Fi + BLE MCU Module Integrates Complete Wi-Fi and BLE Functions for Arduino : Electronics
These are the standard esp32 boards I am using:
Amazon.com: 3 Pieces Development Board 2.4GHz WiFi Dual Cores Microcontroller Integrated with Antenna RF Filters Compatible with Arduino IDE(32) : Electronics

Rich37804 · February 26, 2024, 8:57pm

Here is setting up the OpenAI agent:

bkprath · February 27, 2024, 2:26am

Thanks for posting the responses. I’m trying to get the localai set up now. It is an impressive setup. I’m also looking to see if shifting traffic in my network has an effect. I pulled my ESP32-S3-BOX back out to refresh myself on why I shelved it. Turns out it has both the stutter issue and the lost-in-space issue. I gave up before because the lost-in-space issue makes it a useless device. I just made similar changes to the code for it as I did to the code you posted, to see if that helps with the issues. The code is a little different because it has the initial wake word detection onboard the ESP. I much prefer your simple LED feedback to the images they present on the S3-BOX. Thanks again.

michel72 · February 27, 2024, 3:25pm

Hi, after trying for hours and hours with different configs, I tried your config yesterday evening, but no luck either.

I am using these boards (I have both variants):
https://s.click.aliexpress.com/e/_DlkzZON

Slightly modified config:

substitutions:
  voice_assist_idle_phase_id: '1'
  voice_assist_listening_phase_id: '2'
  voice_assist_thinking_phase_id: '3'
  voice_assist_replying_phase_id: '4'
  voice_assist_not_ready_phase_id: '10'
  voice_assist_error_phase_id: '11'
  voice_assist_muted_phase_id: '12'
esphome:
  name: esp32-va-001
  friendly_name: Kantoor Voice assistant
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
      priority: 600
      then:
        - script.execute: control_led
        - delay: 30s
        - if:
            condition:
              lambda: return id(init_in_progress);
            then:
              - lambda: id(init_in_progress) = false;
              - script.execute: control_led

psram:
  mode:  octal
  speed: 80MHz

esp32:
  board:   esp32-s3-devkitc-1
  variant: esp32s3
  framework:
    type: esp-idf
    version: recommended
    components:
      - name:    esphome_board
        source:  github://jesserockz/esphome-esp-adf-board@main
        refresh: 0s
    sdkconfig_options:
      CONFIG_ESP32_S3_BOX_BOARD: "y"
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB:      "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B:  "y"
      CONFIG_AUDIO_BOARD_CUSTOM:           "y"

# Enable logging
logger:
  level: debug

# Enable Home Assistant API
api:
  encryption:
    key: !secret api_key

  on_client_connected:
    - script.execute: control_led
  on_client_disconnected:
    - script.execute: control_led

ota:
  password: !secret ota_password

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

esp_adf:
external_components:
  - source: github://pr#5230
    components:
    - esp_adf 
    refresh: 0s

light:
  - platform: esp32_rmt_led_strip
    rgb_order: GRB
    pin: GPIO18
    num_leds: 8
    rmt_channel: 0
    chipset: WS2812
    name: "Status LED"
    id: led
    default_transition_length: 0s
    effects:
      - pulse:
          name: "extra_slow_pulse"
          transition_length: 800ms
          update_interval: 800ms
          min_brightness: 0%
          max_brightness: 30%
      - pulse:
          name: "slow_pulse"
          transition_length: 250ms
          update_interval: 250ms
          min_brightness: 50%
          max_brightness: 100%
      - pulse:
          name: "fast_pulse"
          transition_length: 100ms
          update_interval: 100ms
          min_brightness: 50%
          max_brightness: 100%

i2s_audio:
  - id: i2s_in
    ## Connect the l/R pin of the INMP441 to ground!
    i2s_lrclk_pin: GPIO3  ## INMP441 - WS
    i2s_bclk_pin: GPIO2  ## INMP441 - SCK
  - id: i2s_out
    i2s_lrclk_pin: GPIO6  ## Max98357 - LRC
    i2s_bclk_pin: GPIO20   ## Max98357 - BCLK

microphone:
  platform: i2s_audio 
  id: external_microphone 
  adc_type: external 
  i2s_audio_id: i2s_in
  i2s_din_pin: GPIO4 ## INMP441 - SD
  channel: left
  pdm: false

speaker:
  platform: i2s_audio 
  id: external_speaker 
  dac_type: external
  i2s_audio_id: i2s_out
  i2s_dout_pin: GPIO8  ## Max98357 - DIN
  mode: stereo 

voice_assistant:
  id: va
  microphone: external_microphone 
  speaker: external_speaker
  use_wake_word: true
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.5

  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - script.execute: control_led

  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - script.execute: control_led

  on_tts_stream_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - script.execute: control_led

  on_tts_stream_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: control_led

  on_error: 
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - script.execute: control_led
          - delay: 1s
          - if:
              condition:
                switch.is_on: use_wake_word
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: control_led

  on_client_connected: 
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.start_continuous
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - script.execute: control_led          

  on_client_disconnected: 
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    - script.execute: control_led

switch:
  - platform: template
    name: Use Wake Word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - if:
                condition:
                    not:
                      - voice_assistant.is_running
                then:
                  - voice_assistant.start_continuous
            - script.execute: control_led          
 
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - voice_assistant.stop
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            - script.execute: control_led          

globals:
  - id: init_in_progress
    type: bool
    restore_value: no
    initial_value: 'true'
  - id: voice_assistant_phase
    type: int
    restore_value: no
    initial_value: ${voice_assist_not_ready_phase_id}
  
script:
  - id: control_led
    then:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - if:
                condition:
                    wifi.connected:
                then:
                  - if:
                      condition:
                          api.connected:
                      then:
                        - lambda: |
                            switch(id(voice_assistant_phase)) {
                              case ${voice_assist_listening_phase_id}:
                                id(led).turn_on().set_rgb(0, 0, 1).set_brightness(1.0).set_effect("none").perform();
                                break;
                              case ${voice_assist_thinking_phase_id}:
                                id(led).turn_on().set_rgb(0, 1, 0).set_effect("slow_pulse").perform();
                                break;
                              case ${voice_assist_replying_phase_id}:
                                id(led).turn_on().set_rgb(0, 0, 1).set_brightness(1.0).set_effect("fast_pulse").perform();
                                break;
                              case ${voice_assist_error_phase_id}:
                                id(led).turn_on().set_rgb(1, 1, 1).set_brightness(.5).set_effect("none").perform();
                                break;
                              case ${voice_assist_muted_phase_id}:
                                id(led).turn_off().perform();
                                break;
                              case ${voice_assist_not_ready_phase_id}:
                                id(led).turn_on().perform();
                                break;
                              default:
                                id(led).turn_on().set_rgb(1, 0, 0).set_brightness(0.2).set_effect("none").perform();
                                break;
                            }
                      else:
                        - light.turn_off:
                            id: led
                else:
                  - light.turn_off:
                      id: led
          else:
            - light.turn_on:
                id: led
                blue: 50%
                red: 50%
                green: 50%
                effect: "fast_pulse"

I connected all pins according to the instructions. The only difference is that I use a PCM5102 board for output.

For me the microphone is the issue. I can’t get any input.

Debug log:

INFO ESPHome 2024.2.0
INFO Reading configuration /config/esphome/esp32-va-001.yaml...
INFO Updating https://github.com/esphome/esphome.git@pull/5230/head
WARNING GPIO3 is a strapping PIN and should only be used for I/O with care.
Attaching external pullup/down resistors to strapping pins can cause unexpected failures.
See https://esphome.io/guides/faq.html#why-am-i-getting-a-warning-about-strapping-pins
INFO Starting log output from 192.168.207.241 using esphome API
INFO Successfully connected to esp32-va-001 @ 192.168.207.241 in 0.191s
INFO Successful handshake with esp32-va-001 @ 192.168.207.241 in 0.310s
[16:22:13][I][app:102]: ESPHome version 2024.2.0 compiled on Feb 27 2024, 16:19:54
[16:22:13][C][wifi:577]: WiFi:
[16:22:13][C][wifi:409]:   Local MAC: 3C:84:27:14:86:7C
[16:22:13][C][wifi:414]:   SSID: [redacted]
[16:22:13][C][wifi:415]:   IP Address: 192.168.207.241
[16:22:13][C][wifi:417]:   BSSID: [redacted]
[16:22:13][C][wifi:418]:   Hostname: 'esp32-va-001'
[16:22:13][C][wifi:420]:   Signal strength: -56 dB ▂▄▆█
[16:22:13][C][wifi:424]:   Channel: 11
[16:22:13][C][wifi:425]:   Subnet: 255.255.255.0
[16:22:13][C][wifi:426]:   Gateway: 192.168.207.1
[16:22:13][C][wifi:427]:   DNS1: 192.168.207.130
[16:22:13][C][wifi:428]:   DNS2: 0.0.0.0
[16:22:13][C][logger:447]: Logger:
[16:22:13][C][logger:448]:   Level: DEBUG
[16:22:13][C][logger:449]:   Log Baud Rate: 115200
[16:22:13][C][logger:451]:   Hardware UART: USB_SERIAL_JTAG
[16:22:13][C][esp32_rmt_led_strip:175]: ESP32 RMT LED Strip:
[16:22:13][C][esp32_rmt_led_strip:176]:   Pin: 18
[16:22:13][C][esp32_rmt_led_strip:177]:   Channel: 0
[16:22:13][C][esp32_rmt_led_strip:202]:   RGB Order: GRB
[16:22:13][C][esp32_rmt_led_strip:203]:   Max refresh rate: 0
[16:22:13][C][esp32_rmt_led_strip:204]:   Number of LEDs: 8
[16:22:13][C][light:103]: Light 'Status LED'
[16:22:13][C][light:105]:   Default Transition Length: 0.0s
[16:22:13][C][light:106]:   Gamma Correct: 2.80
[16:22:13][C][template.switch:068]: Template Switch 'Use Wake Word'
[16:22:13][C][template.switch:091]:   Restore Mode: restore defaults to ON
[16:22:13][C][template.switch:057]:   Optimistic: YES
[16:22:13][C][psram:020]: PSRAM:
[16:22:13][C][psram:021]:   Available: NO
[16:22:13][C][mdns:115]: mDNS:
[16:22:13][C][mdns:116]:   Hostname: esp32-va-001
[16:22:13][C][ota:096]: Over-The-Air Updates:
[16:22:13][C][ota:097]:   Address: esp32-va-001.local:3232
[16:22:13][C][ota:100]:   Using Password.
[16:22:13][C][ota:103]:   OTA version: 2.
[16:22:13][C][api:139]: API Server:
[16:22:13][C][api:140]:   Address: esp32-va-001.local:6053
[16:22:13][C][api:142]:   Using noise encryption: YES
[16:22:17][D][voice_assistant:521]: Event Type: 0
[16:22:17][D][voice_assistant:521]: Event Type: 2
[16:22:17][D][voice_assistant:611]: Assist Pipeline ended
[16:22:17][D][voice_assistant:414]: State changed from STREAMING_MICROPHONE to WAIT_FOR_VAD
[16:22:17][D][voice_assistant:420]: Desired state set to WAITING_FOR_VAD
[16:22:17][D][voice_assistant:172]: Waiting for speech...
[16:22:17][D][voice_assistant:414]: State changed from WAIT_FOR_VAD to WAITING_FOR_VAD
[16:22:17][D][voice_assistant:185]: VAD detected speech
[16:22:17][D][voice_assistant:414]: State changed from WAITING_FOR_VAD to START_PIPELINE
[16:22:17][D][voice_assistant:420]: Desired state set to STREAMING_MICROPHONE
[16:22:17][D][voice_assistant:202]: Requesting start...
[16:22:17][D][voice_assistant:414]: State changed from START_PIPELINE to STARTING_PIPELINE
[16:22:17][D][voice_assistant:435]: Client started, streaming microphone
[16:22:17][D][voice_assistant:414]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[16:22:17][D][voice_assistant:420]: Desired state set to STREAMING_MICROPHONE
[16:22:17][D][voice_assistant:521]: Event Type: 1
[16:22:17][D][voice_assistant:524]: Assist Pipeline running
[16:22:18][D][voice_assistant:521]: Event Type: 9
[16:22:22][D][voice_assistant:521]: Event Type: 0
[16:22:22][D][voice_assistant:521]: Event Type: 2
[16:22:22][D][voice_assistant:611]: Assist Pipeline ended
[16:22:22][D][voice_assistant:414]: State changed from STREAMING_MICROPHONE to WAIT_FOR_VAD
[16:22:22][D][voice_assistant:420]: Desired state set to WAITING_FOR_VAD
[16:22:22][D][voice_assistant:172]: Waiting for speech...
[16:22:22][D][voice_assistant:414]: State changed from WAIT_FOR_VAD to WAITING_FOR_VAD
[16:22:22][D][voice_assistant:185]: VAD detected speech
[16:22:22][D][voice_assistant:414]: State changed from WAITING_FOR_VAD to START_PIPELINE
[16:22:22][D][voice_assistant:420]: Desired state set to STREAMING_MICROPHONE
[16:22:22][D][voice_assistant:202]: Requesting start...
[16:22:22][D][voice_assistant:414]: State changed from START_PIPELINE to STARTING_PIPELINE
[16:22:22][D][voice_assistant:435]: Client started, streaming microphone
[16:22:22][D][voice_assistant:414]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[16:22:22][D][voice_assistant:420]: Desired state set to STREAMING_MICROPHONE
[16:22:23][D][voice_assistant:521]: Event Type: 1
[16:22:23][D][voice_assistant:524]: Assist Pipeline running
[16:22:23][D][voice_assistant:521]: Event Type: 9
[16:22:32][D][voice_assistant:521]: Event Type: 0
[16:22:32][D][voice_assistant:521]: Event Type: 2
[16:22:32][D][voice_assistant:611]: Assist Pipeline ended
[16:22:32][D][voice_assistant:414]: State changed from STREAMING_MICROPHONE to WAIT_FOR_VAD
[16:22:32][D][voice_assistant:420]: Desired state set to WAITING_FOR_VAD
[16:22:33][D][voice_assistant:172]: Waiting for speech...
[16:22:33][D][voice_assistant:414]: State changed from WAIT_FOR_VAD to WAITING_FOR_VAD
[16:22:33][D][voice_assistant:185]: VAD detected speech
[16:22:33][D][voice_assistant:414]: State changed from WAITING_FOR_VAD to START_PIPELINE
[16:22:33][D][voice_assistant:420]: Desired state set to STREAMING_MICROPHONE
[16:22:33][D][voice_assistant:202]: Requesting start...
[16:22:33][D][voice_assistant:414]: State changed from START_PIPELINE to STARTING_PIPELINE
[16:22:33][D][voice_assistant:435]: Client started, streaming microphone
[16:22:33][D][voice_assistant:414]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[16:22:33][D][voice_assistant:420]: Desired state set to STREAMING_MICROPHONE
[16:22:33][D][voice_assistant:521]: Event Type: 1
[16:22:33][D][voice_assistant:524]: Assist Pipeline running
[16:22:33][D][voice_assistant:521]: Event Type: 9
[16:22:38][D][voice_assistant:521]: Event Type: 0
[16:22:38][D][voice_assistant:521]: Event Type: 2
[16:22:38][D][voice_assistant:611]: Assist Pipeline ended
[16:22:38][D][voice_assistant:414]: State changed from STREAMING_MICROPHONE to WAIT_FOR_VAD
[16:22:38][D][voice_assistant:420]: Desired state set to WAITING_FOR_VAD
[16:22:38][D][voice_assistant:172]: Waiting for speech...
[16:22:38][D][voice_assistant:414]: State changed from WAIT_FOR_VAD to WAITING_FOR_VAD
[16:22:38][D][voice_assistant:185]: VAD detected speech
[16:22:38][D][voice_assistant:414]: State changed from WAITING_FOR_VAD to START_PIPELINE
[16:22:38][D][voice_assistant:420]: Desired state set to STREAMING_MICROPHONE
[16:22:38][D][voice_assistant:202]: Requesting start...
[16:22:38][D][voice_assistant:414]: State changed from START_PIPELINE to STARTING_PIPELINE
[16:22:38][D][voice_assistant:435]: Client started, streaming microphone
[16:22:38][D][voice_assistant:414]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[16:22:38][D][voice_assistant:420]: Desired state set to STREAMING_MICROPHONE
[16:22:38][D][voice_assistant:521]: Event Type: 1
[16:22:38][D][voice_assistant:524]: Assist Pipeline running
[16:22:38][D][voice_assistant:521]: Event Type: 9

When I say "Hey Jarvis", nothing happens.
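
Worth noting from the log above: it reports `PSRAM: Available: NO` even though the config requests octal PSRAM. One possible explanation, offered purely as an assumption, is that this board variant carries quad-SPI PSRAM (or none at all) rather than octal. If so, a sketch of the alternative setting would be:

```yaml
# Sketch only: for a module with quad-SPI PSRAM.
# Keep `mode: octal` for modules with octal PSRAM such as the N16R8.
psram:
  mode: quad
  speed: 80MHz
```

After reflashing, the `[C][psram:021]` line in the boot log shows whether PSRAM was detected.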

Rich37804 · February 27, 2024, 3:33pm

Have you tried a different mic?
I can tell you from experience, that these mics can be damaged very easily while soldering.
Out of 10 I bought, I wound up with 6 usable ones.

michel72 · February 27, 2024, 3:41pm

I don’t know how, but after erasing the build files, recompiling, and flashing, it works (somewhat).

I do get many errors:

[16:39:03][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:03][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:03][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:521]: Event Type: 99
[16:39:04][D][voice_assistant:670]: TTS stream end
[16:39:04][D][voice_assistant:285]: End of audio stream received
[16:39:04][D][voice_assistant:414]: State changed from STREAMING_RESPONSE to RESPONSE_FINISHED
[16:39:04][D][voice_assistant:420]: Desired state set to RESPONSE_FINISHED
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop
[16:39:04][D][voice_assistant:349]: Speaker buffer full, trying again next loop

Rich37804 · February 27, 2024, 3:44pm

Sometimes that error will happen with these boards the first couple of times…
What is your HA server on? Hardware wise…
Watch the last video I posted in this thread. I talk about a couple things I think helped with that error.

michel72 · February 27, 2024, 3:45pm

It’s on Proxmox on a modern i7 NUC.

michel72 · February 27, 2024, 3:47pm

The video you mention is for the Wyoming satellite based on a Raspberry Pi Zero W. That actually works quite decently.

Rich37804 · February 27, 2024, 3:49pm

Sorry, not that video. That isn’t me…

FredTheFrog · February 27, 2024, 3:50pm

I’m beginning to wonder about the particular ESP32-S3 board I’m using for my testbed. It seems the Voice Assistant code goes dormant for long periods or is waiting/looping an excessively long time. When the microphone works, it works well. When it doesn’t, it’s usually because the ESP32-S3 isn’t expecting it. The funny thing is, I shifted away from hard-wired/soldered connections to pins and very short jumper wires, specifically to obtain more flexibility/reliability. I have a few more ESP32-S3 boards, and a few more INMP441 microphones. I’ll definitely be trying with them today, and turning on VERY_VERBOSE logging if necessary.
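
For anyone wanting to do the same, turning on VERY_VERBOSE logging could look like the sketch below. The per-tag overrides are an assumption about what would be useful to quieten; only the global level is essential:

```yaml
# Sketch only: raise the global log level, then quieten tags that are
# not relevant to the voice assistant problem.
logger:
  level: VERY_VERBOSE
  logs:
    voice_assistant: VERY_VERBOSE
    wifi: INFO       # assumed to be noise for this investigation
    light: INFO      # likewise an assumption
```

VERY_VERBOSE output is heavy and may itself slow the device, so it is probably best left on only while reproducing the problem.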

All that being said, my summary opinion is: this Onn speaker makes a GREAT voice assistant presentation. The LEDs appear clearly through the grille mesh, the MIC works well through the grille mesh, and the ESP32 and amplifier boards tuck easily and neatly inside the case. After it’s connected/configured and the ESPhome Voice Assistant code is running reliably, it’s a FANTASTIC project.

What I’m observing after switching to a different microphone board (same manufacturer) is no different than before. It appears Voice_Assistant is executing a loop infinitely. No actual wake word detection is occurring. The following block repeats continuously:

[D][voice_assistant:435]: Client started, streaming microphone
[D][voice_assistant:414]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[D][voice_assistant:420]: Desired state set to STREAMING_MICROPHONE
[D][voice_assistant:521]: Event Type: 1
[D][voice_assistant:524]: Assist Pipeline running
[D][voice_assistant:521]: Event Type: 9
[D][voice_assistant:521]: Event Type: 0
[D][voice_assistant:521]: Event Type: 2
[D][voice_assistant:611]: Assist Pipeline ended

I’m going to double-check my connections and pin assignments, even using a digital meter to ensure connectivity. Then I’m moving to a fresh ESP32-S3 board.

Rich37804 · February 28, 2024, 11:34am

I’ve managed to go 2 full days now with only having to reboot one satellite due to the eternal blue light of death.

FredTheFrog · February 28, 2024, 4:37pm

~~I’m dead in the water since switching to a different ESP32-S3 board. It refuses to connect to Wifi, or the Asus ZenWifi guest network is refusing its connections.~~ Either way, it won’t connect.

The blame for the aggravation rests solely with the module/board/ESP32 manufacturer. Two of three ESP32-S3-WROOM-1 boards in a single purchase were BAD. The first would not register correctly with USB, failing to provide a valid Device Descriptor. The second would not connect via WiFi no matter what was attempted. Only the third would actually load and connect. Jarvis is behaving properly for the first time (for me) ever.

Now, if only Voice_Assistant wouldn’t say this when nobody is speaking:

[06:11:35][D][voice_assistant:185]: VAD detected speech

Arh · March 1, 2024, 8:38am

Well I have the new board set up

All is working as expected except that the speaker just crackles. Occasionally there is a bit that’s recognisable, but not often. Mostly, on the longer responses, the last word is legible. The actions are completed as requested 95% of the time. This is using the ESP on-board wake word detection.

This is my code. If anyone has any suggestions for ending the crackling, I would be grateful.

esphome:
  name: s3-test
  friendly_name: s3_test

  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    - light.turn_on:
        id: led_ww
        blue: 100%
        brightness: 60%
        effect: fast pulse

esp32:
  board: esp32-s3-devkitc-1
  variant: esp32s3
  framework:
    type: esp-idf

    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"


psram:
  mode: octal
  speed: 80MHz

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "xxxxxxxxxxx"

  on_client_connected:
        then:
          - delay: 50ms
          - light.turn_off: led_ww
          - micro_wake_word.start:
  on_client_disconnected:
        then:
          - voice_assistant.stop:

ota:
  password: "xxxxxxxxxxxxxxxx"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

button:
  - platform: restart
    name: "Restart"
    id: but_rest

switch:
  - platform: template
    id: mute
    name: mute
    optimistic: true
    on_turn_on:
      - micro_wake_word.stop:
      - voice_assistant.stop:
      - light.turn_on:
          id: led_ww
          red: 100%
          green: 0%
          blue: 0%
          brightness: 60%
          effect: fast pulse
      - delay: 2s
      - light.turn_off:
          id: led_ww
      - light.turn_on:
          id: led_ww
          red: 100%
          green: 0%
          blue: 0%
          brightness: 30%
    on_turn_off:
      - micro_wake_word.start:
      - light.turn_on:
          id: led_ww
          red: 0%
          green: 100%
          blue: 0%
          brightness: 60%
          effect: fast pulse
      - delay: 2s
      - light.turn_off:
          id: led_ww

light:
  - platform: esp32_rmt_led_strip
    id: led_ww
    rgb_order: GRB
    pin: GPIO14
    num_leds: 2
    rmt_channel: 0
    chipset: ws2812
    name: "on board light"
    effects:
      - pulse:
      - pulse:
          name: "Fast Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
          min_brightness: 0%
          max_brightness: 100%

  - platform: esp32_rmt_led_strip
    id: led_w1
    rgb_order: GRB
    pin: GPIO48
    num_leds: 1
    rmt_channel: 0
    chipset: ws2812
    name: "on board light"
    effects:
      - pulse:
      - pulse:
          name: "Fast Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
          min_brightness: 0%
          max_brightness: 100%


 # Audio and Voice Assistant Config
i2s_audio:
  - id: i2s_mic
    i2s_lrclk_pin: GPIO3
    i2s_bclk_pin: GPIO2
  - id: i2s_spk
    i2s_lrclk_pin: GPIO6
    i2s_bclk_pin: GPIO7

microphone:
  platform: i2s_audio
  id: va_mic
  adc_type: external
  i2s_audio_id: i2s_mic
  i2s_din_pin: GPIO4
  channel: left
  pdm: false

speaker:
  platform: i2s_audio
  id: va_spk
  dac_type: external
  i2s_audio_id: i2s_spk
  i2s_dout_pin: GPIO18
  mode: mono

micro_wake_word:
  on_wake_word_detected:
    - voice_assistant.start:
    - light.turn_on:
        id: led_ww
        red: 30%
        green: 30%
        blue: 70%
        brightness: 60%
        effect: fast pulse
  model: hey_jarvis

voice_assistant:
  id: va
  microphone: va_mic
  speaker: va_spk
  noise_suppression_level: 2.0
  #auto_gain: 31dBFS
  volume_multiplier: 2.0
  on_stt_end:
       then:
         - light.turn_off: led_ww

  on_error:
          - micro_wake_word.start:
  on_end:
        then:
          - light.turn_off: led_ww
          - wait_until:
              not:
                voice_assistant.is_running:
          - micro_wake_word.start:



web_server:
  port: 80
  version: 2