Sunton esp32 8048s70c voice assistant

Hi there, I just finished a working config for the Sunton esp32 8048s70c, a cheap 7" touchscreen. It is basically a Box3 clone based on LVGL.

esphome:
  name: wallpanel
  friendly_name: Wallpanel
  platformio_options:
    build_flags: "-DBOARD_HAS_PSRAM"
    board_build.esp-idf.memory_type: qio_opi
    board_build.flash_mode: dio
    board_upload.maximum_ram_size: 524288

esp32:
  board: esp32-s3-devkitc-1
  variant: esp32s3
  flash_size: 16MB
  framework:
    type: esp-idf 
    # Required to achieve sufficient PSRAM bandwidth
    sdkconfig_options:
      COMPILER_OPTIMIZATION_SIZE: y
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: y
      CONFIG_ESP32S3_DATA_CACHE_64KB: y
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: y
      CONFIG_SPIRAM_FETCH_INSTRUCTIONS: y
      CONFIG_SPIRAM_RODATA: y
      CONFIG_ESPTOOLPY_FLASHSIZE_16MB: y # fix warning about 2mb found

psram:
  mode: octal
  speed: 80MHz

# Enable logging
logger:


# Enable Home Assistant API
api:
  encryption:
    key: "qI3Jiy8ILejyQU1eyspzMsAPdRjG/dhUYMtlNQVggF4="

ota:
  - platform: esphome
    password: "7e097303fc5a5f885464acb718a72782"
    id: ota_esphome

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  on_connect:
    - delay: 5s  # Gives time for improv results to be transmitted
    - ble.disable:
  on_disconnect:
    - ble.enable:
    - lvgl.page.show: no_wifi_page_lvgl
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Wallpanel Fallback Hotspot"
    password: "7xzWoZo8ib8H"

improv_serial:

esp32_improv:
  authorizer: none

captive_portal:

i2s_audio:
  - id: audio_out
    i2s_lrclk_pin: 18
    i2s_bclk_pin: 0
  - id: audio_in
    i2s_lrclk_pin: 13  # WS for the microphone
    i2s_bclk_pin: 12   # BCLK for the microphone

i2c:
  sda: 19
  scl: 20
  scan: true

display:
  - id: s3_box_lcd
    platform: rpi_dpi_rgb
    dimensions:
      width: 800
      height: 480
    color_order: RGB
    invert_colors: true
    de_pin: 41
    hsync_pin: 39
    vsync_pin: 40
    pclk_pin: 42
    pclk_frequency: 12MHz # unsure about this
    data_pins:
      red:
        - 14
        - 21
        - 47
        - 48
        - 45
      green:
        - 9
        - 46
        - 3
        - 8
        - 16
        - 1
      blue:
        - 15
        - 7
        - 6
        - 5
        - 4
    update_interval: never
    auto_clear_enabled: false

lvgl:
  top_layer:
    widgets:
      - button:
          id: timer_bg
          align: TOP_MID
          hidden: true
          width: 200
          height: 100
          bg_color: black
          widgets:
            - label:
                id: timer_label
                text: "00:00"
                align: CENTER
                text_color: white
                text_font: font_timer
      - bar:
          id: timer_bar
          width: 800
          height: 20 
          animated: true
          max_value: 100
          anim_time: 300ms
          bg_opa: TRANSP
          value: 0
          indicator:
            bg_color: green
          align: BOTTOM_LEFT
          hidden: true
      - button:
          id: timer_stop
          x: 120
          y: 200
          hidden: true
          bg_color: red
          checkable: true
          width: 100
          height: 50
          widgets:
            - label:
                align: center
                text: "Stop" 
                text_color: black
          on_value:
            - switch.turn_off: timer_ringing
          
  buffer_size: 60%
  pages:
    - id: idle_page_lvgl
      bg_color: 0x000000
      on_long_press:
        - switch.toggle: mute
      widgets:
        - image:
            align: CENTER
            src: casita_idle
    - id: listenig_page_lvgl
      widgets:
        - image:
            align: CENTER
            src: casita_listening
    - id: replying_page_lvgl
      widgets:
        - animimg:
            align: CENTER
            src: [ casita_replying, casita_listening ] 
            duration: 500ms
        - label:
            align: TOP_LEFT 
            width: 780
            pad_top: 20
            id: request
            recolor: true
            text: text_request
            text_font: font_request
        - label:
            align: BOTTOM_LEFT
            width: 780
            id: response
            recolor: true
            pad_bottom: 20
            text: text_response
            text_font: font_response
    - id: thinking_page_lvgl
      widgets:
        - image:
            align: CENTER
            src: casita_thinking
    - id: error_page_lvgl
      bg_color: 0x000000
      widgets:
        - image:
            align: CENTER
            src: casita_error
    - id: no_ha_page_lvgl
      bg_color: black
      widgets:
        - image:
            align: CENTER
            src: error_no_ha
    - id: no_wifi_page_lvgl
      widgets:
        - image:
            align: CENTER
            src: error_no_wifi
    - id: initializing_page_lvgl
      bg_color: 0x000000
      widgets:
        - image:
            align: CENTER
            src: casita_initializing
    - id: timer_finished_page_lvgl
      widgets:
        - image:
            align: CENTER
            src: casita_timer_finished
            on_press:
              - switch.turn_off: timer_ringing
     
touchscreen:
  platform: gt911
  id: main_touchscreen
  address: 0x5D
  update_interval: 16ms
  transform:
    mirror_x: true
    swap_xy: true

speaker:
  - platform: i2s_audio
    i2s_audio_id: audio_out
    dac_type: external
    i2s_dout_pin: 17
    id: box_speaker
    buffer_duration: 600ms

media_player:
  - platform: speaker
    name: jarvis_player
    id: speaker_media_player
    announcement_pipeline:
      speaker: box_speaker
      num_channels: 1
    files:
      - id: timer_finished_wave_file
        file: sound/timer_finished.wav
    on_announcement:
      - if:
          condition:
            microphone.is_capturing:
          then:
            - voice_assistant.stop:
    on_idle:
      - voice_assistant.start_continuous

microphone:
  - platform: i2s_audio
    id: box_microphone
    i2s_audio_id: audio_in
    adc_type: external
    i2s_din_pin: 11
    channel: left
    sample_rate: 16000
    bits_per_sample: 16bit

output:
  - id: backlight_output
    platform: ledc
    pin: 2

light:
  - platform: monochromatic
    id: led
    name: Screen
    icon: "mdi:television"
    entity_category: config
    output: backlight_output
    restore_mode: RESTORE_DEFAULT_ON
    default_transition_length: 250ms

voice_assistant:
  id: va
  microphone: box_microphone
  speaker: box_speaker
  use_wake_word: true
  noise_suppression_level: 2
  #vad_threshold: 3
  auto_gain: 24dBFS
  volume_multiplier: 3.0
  on_start:
    - lvgl.page.show: idle_page_lvgl
  on_listening:
    then:
    - lvgl.page.show: listenig_page_lvgl
  on_idle:
    - lvgl.page.show: idle_page_lvgl
  on_stt_vad_end:
    - lvgl.page.show: thinking_page_lvgl
  on_stt_end:
    - lvgl.widget.enable: request
    - lvgl.label.update: 
        id: request
        text: !lambda return x; 
  on_tts_start:
    - lvgl.widget.enable: response
    - lvgl.label.update: 
        id: response
        text: !lambda return x;
  on_tts_stream_start:
    then:
    - lvgl.page.show: replying_page_lvgl
  on_tts_stream_end:
    then:
    - wait_until:
        speaker.is_stopped
    - lvgl.page.show: idle_page_lvgl
    - lvgl.widget.disable: response
    - lvgl.widget.disable: request
  on_end:
    - wait_until:
        and:
          - not:
              media_player.is_announcing:
          - not:
              voice_assistant.is_running:
    - lvgl.page.show: idle_page_lvgl
  on_error:
    -  then:
          - lvgl.page.show: error_page_lvgl
          - delay: 5s
          - lvgl.page.show: idle_page_lvgl
  on_client_connected:
    - wait_until:
        not: ble.enabled
    - voice_assistant.start_continuous:
    - lvgl.page.show: idle_page_lvgl
  on_client_disconnected:
    - lvgl.page.show: error_page_lvgl
  on_timer_started:
    - lvgl.widget.show:
        - timer_bg
        - timer_bar
  on_timer_cancelled:
    - lvgl.widget.hide:
        - timer_bg
        - timer_bar
  on_timer_tick:
    - lvgl.label.update:
        id: timer_label
        text: !lambda |-
          if (timers.empty()) return std::string("00:00");
          auto timer = timers[0];
          int minutes = timer.seconds_left / 60;
          int seconds = timer.seconds_left % 60;
          char buffer[6];
          snprintf(buffer, sizeof(buffer), "%02d:%02d", minutes, seconds);
          return std::string(buffer);
    - lvgl.bar.update:
        id: timer_bar
        value: !lambda |-
          if (timers.empty()) return 0;
          auto timer = timers[0];
          if (timer.total_seconds == 0) return 0;
          return static_cast<int>(100.0 * timer.seconds_left / timer.total_seconds); 
  on_timer_finished:
    - lvgl.widget.hide:
        - timer_bg
        - timer_bar
    - switch.turn_on: timer_ringing
    - lvgl.widget.show: timer_stop
    - lvgl.page.show: timer_finished_page_lvgl
    - while:
        condition:
          switch.is_on: timer_ringing
        then:
          - media_player.speaker.play_on_device_media_file:
              media_file: timer_finished_wave_file
              announcement: true
          - wait_until:
              media_player.is_announcing:
          - wait_until:
              not:
                media_player.is_announcing:

switch:
  - platform: template
    id: timer_ringing
    optimistic: true
    restore_mode: ALWAYS_OFF
    on_turn_off:
      # Stop playing the alarm
      - media_player.stop:
          announcement: true
      - lvgl.widget.hide: timer_stop
      - lvgl.page.show: idle_page_lvgl
    on_turn_on:
      # Safety timeout: stop ringing automatically after 15 minutes
      - delay: 15min
      - switch.turn_off: timer_ringing
      - lvgl.page.show: idle_page_lvgl

  - platform: template
    name: Mute
    id: mute
    icon: "mdi:microphone-off"
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    entity_category: config
    on_turn_off:
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
            - light.turn_on: led
            - lvgl.page.show: idle_page_lvgl
    on_turn_on:
      - voice_assistant.stop
      - light.turn_off: led
      - lvgl.page.show: idle_page_lvgl
      
image:
  - file: image/error_320_240.png
    id: casita_error
    resize: 640x480
    type: RGB565
    transparency: alpha_channel
  - file: image/idle_320_240.png
    id: casita_idle
    resize: 640x480
    type: RGB565
    transparency: alpha_channel
  - file: image/listening_320_240.png
    id: casita_listening
    resize: 640x480
    type: RGB565
    transparency: alpha_channel
  - file: image/thinking_320_240.png
    id: casita_thinking
    resize: 640x480
    type: RGB565
    transparency: alpha_channel
  - file: image/replying_320_240.png
    id: casita_replying
    resize: 640x480
    type: RGB565
    transparency: alpha_channel
  - file: image/loading_320_240.png
    id: casita_initializing
    resize: 640x480
    type: RGB565
    transparency: alpha_channel
  - file: image/error-no-wifi.png
    id: error_no_wifi
    resize: 640x480
    type: RGB565
    transparency: alpha_channel
  - file: image/error-no-ha.png
    id: error_no_ha
    resize: 640x480
    type: RGB565
    transparency: alpha_channel
  - file: image/timer_finished_320_240.png
    id: casita_timer_finished
    resize: 640x480
    type: RGB565
    transparency: alpha_channel

font:
  - file: 
      type: gfonts
      family: Figtree
      weight: 400
      italic: true
    id: font_request
    size: 18
    glyphsets: GF_Latin_Core
  - file:
      type: gfonts
      family: Figtree
      weight: 400
    id: font_response
    #glyphs: ${allowed_characters}
    size: 20
    glyphsets: GF_Latin_Core
  - file:
      type: gfonts
      family: Figtree
      weight: 400
    id: font_timer
    #glyphs: ${allowed_characters}
    size: 36
    glyphsets: GF_Latin_Core
    
button:
  - platform: restart
    name: "Jarvis Restart"

Just attach a speaker to the speaker output of the board and connect an I2S microphone.
Cheers

It doesn’t work here… the speaker is silent, although the log (and the media player) says it is playing.
The only way I could get my speaker to work was via a mixer:

i2s_audio:
  - id: i2s_out
    i2s_lrclk_pin: GPIO18
    i2s_bclk_pin: GPIO0

speaker:
  - platform: i2s_audio
    id: speaker_id
    dac_type: external
    i2s_dout_pin: GPIO17
    sample_rate: 48000
    channel: stereo
    timeout: never
    buffer_duration: 100ms
  
  - platform: mixer
    id: mixer_speaker_id
    output_speaker: speaker_id
    source_speakers:
      - id: announcement_spk_mixer_input
      - id: media_spk_mixer_input
  - platform: resampler
    id: media_spk_resampling_input
    output_speaker: media_spk_mixer_input
  - platform: resampler
    id: announcement_spk_resampling_input
    output_speaker: announcement_spk_mixer_input

media_player:
  - platform: speaker
    name: "Speaker Media Player"
    id: speaker_media_player_id
    buffer_size: 100000
    media_pipeline:
        speaker: media_spk_resampling_input
        sample_rate: 21000
        num_channels: 2
        format: WAV
    announcement_pipeline:
        speaker: announcement_spk_resampling_input
        sample_rate: 21000
        num_channels: 1
        format: WAV

But a “side effect” of the built-in audio is a nasty tone of a few kHz whenever the backlight is below 100%: noise from the backlight PWM is played on the speaker.

Nice looks good! What microphone did you use?

Do you have a GitHub repo so we can get your graphic files?

Which board version do you have?
Depending on the version, you need to change the bclk pin to 19.



And I used an INMP441 microphone.
The pictures are from the voice-assistant GitHub repo.

It should be the noise from the backlight control MOSFET.
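
If the tone really does come from the backlight PWM, one thing that might be worth trying is moving the PWM frequency above the audible range. This is an untested sketch that reuses the `backlight_output` definition on pin 2 from the first post. The 25 kHz value is only an assumption, and nobody in this thread has confirmed that the backlight driver on this board tolerates it:

```yaml
output:
  - id: backlight_output
    platform: ledc
    pin: 2
    # Assumption: 25 kHz is above what most people can hear.
    # ESPHome's ledc default is 1 kHz, well inside the audible range.
    frequency: 25000Hz
```

Higher PWM frequencies leave the LEDC peripheral fewer bits of resolution, so dimming steps may get coarser.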

Thanks for this extremely useful information. I purchased my board recently and it’s version 1.4. It is also using GPIO 0 for the I2S_BCLK.

When I googled this board I only found the information on the 1.0 version and I thought I needed to use pin 19. This of course did not work and I don’t think audio would work on a 1.0 board at all as this pin is shared with the touch screen.
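
To make switching between board revisions easier, the BCLK pin of the audio output could be pulled out into a substitution. This is a minimal sketch based on the `audio_out` bus from the first post. The name `audio_bclk_pin` is made up for this example, and GPIO19 is only the value found in the 1.0 information, which did not work on a 1.4 board:

```yaml
substitutions:
  # Hypothetical name, not part of the original config.
  # GPIO0 works on a version 1.4 board; the 1.0 information lists GPIO19.
  audio_bclk_pin: GPIO0

i2s_audio:
  - id: audio_out
    i2s_lrclk_pin: GPIO18
    i2s_bclk_pin: ${audio_bclk_pin}
```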

When I boot the board I do initially get static on the speaker but once I play sound everything is fine. The audio amp on this board is very low power so I need max volume just to hear it at all.

Now I just need to get a mic so I can try the voice assistant. They are 3 for $9USD on Amazon!

I’ll check my version when I get home; I guess it could be that I have a different one, yes. My backlight works, though. It’s just as NIUB said: the MOSFET causes noise, which makes the whole thing almost useless. Since I rarely keep my LCDs at full brightness when idle, the speaker would “beep” the whole time except when I use the LCD. I set them up so that the brightness goes up to 100% only on touch (when I use it) and then goes back down after a set time of no use.
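
A minimal sketch of that kind of touch-to-wake dimming, assuming the `led` light and the gt911 touchscreen from the first post (other touchscreen options omitted). The script name `wake_backlight`, the 30 s timeout and the 30% idle brightness are made-up values for illustration only:

```yaml
touchscreen:
  platform: gt911
  id: main_touchscreen
  address: 0x5D
  on_touch:
    # Every touch (re)starts the dimming script
    - script.execute: wake_backlight

script:
  - id: wake_backlight
    mode: restart   # a new touch cancels the pending dim and starts over
    then:
      - light.turn_on:
          id: led
          brightness: 100%
      - delay: 30s   # assumed idle timeout
      - light.turn_on:
          id: led
          brightness: 30%   # assumed idle brightness
```

With `mode: restart`, touching the screen again during the delay restarts the countdown instead of queueing a second run.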

But recently some things were changed/corrected in the speaker part of ESPHome, so it could be that some bug still remains…?

I do hear some constant low level noise on my speaker but I have to hold my ear directly to the speaker to hear it so it’s not annoying. I would attribute this to poor quality audio components. Based on the schematic for this board the audio piece costs 20 cents or less.

My display is version 1.4.
I don’t need sound right now anyway, I just wanted to test it. I guess when/if I need it I’ll rather add an external I2S audio board, since they are dirt cheap on AliExpress and I have good experience with them. The audio chip on this board is some unknown variant anyway…

Have you seen a good one with both a DAC and an amp?

Yes, I use “TENSTAR MAX98357 I2S 3W Class D Amplifier Breakout Interface Dac Decoder Module” AliExpress modules in a couple of speakers (with an ESP32 Wemos D1 Mini) around my house as warning audio players (doorbell, gate open, etc.), and together with two 3W speakers they work quite well. They cost around 2€ apiece and they are well supported in ESPHome.
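
For reference, a MAX98357 module is driven as an external I2S DAC. This is a rough sketch only: the ids `i2s_ext` and `ext_speaker` are invented for the example and all three GPIO numbers are placeholders that have to match your own wiring:

```yaml
i2s_audio:
  - id: i2s_ext
    i2s_lrclk_pin: GPIO25   # LRC on the module (placeholder pin)
    i2s_bclk_pin: GPIO26    # BCLK on the module (placeholder pin)

speaker:
  - platform: i2s_audio
    id: ext_speaker
    i2s_audio_id: i2s_ext
    dac_type: external
    i2s_dout_pin: GPIO22    # DIN on the module (placeholder pin)
    channel: mono
```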

That is a really nice board. I see a few vendors sell 10 pcs for only a few dollars.

You said you used two speakers with it. It looks like it’s mono only and supports a single 3W speaker.

Hi, I just use one speaker. No external DAC.
I tried sound while dimming the backlight and had no problems at all. I have version 1.3.

Yes, it’s mono. I use two speakers because I have a bunch of small 8 ohm 3W speakers, so since the board delivers 3W into 4 ohms, I wired two of them in parallel (two 8 ohm speakers in parallel present a 4 ohm load). That way I get “even more bang for my buck”… :grinning: