ESP32-S3-BOX manual setup

I just got started with voice control, using an ESP32-S3-BOX. I hit some snags when setting it up. I’m going to share my observations in this post in case someone else runs into the same problems…

The tutorial for the ESP32-S3-BOX says that you need “Chrome or Edge browser on a desktop (not Android/iOS)” and then continues with “Make sure this page is opened in a Chromium-based browser”. Note that “this page” is not a link; it refers to the tutorial page itself.

I was using Firefox and, despite the comment “If your browser does not support web serial, you will see a warning message indicating this instead of a button”, I got no such warning. So I missed that I should have opened that page in Chrome.

The next snag was that I was using a USB-C-to-USB-A cable to connect the box. That powered up the box, but no serial port for flashing showed up. After switching to a USB-C-to-USB-C cable, a new /dev/ttyACM0 appeared when I connected the box to my Linux laptop. Other users reported permission issues and conflicts with brltty, but I ran into neither.
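
To check quickly whether a cable exposes a serial port at all, something like the following can help. It is only a minimal sketch: the function names find_serial_ports and can_access are made up for this example, and the /dev/ttyACM* and /dev/ttyUSB* patterns are an assumption about typical Linux device names.

```python
import glob
import os


def find_serial_ports(patterns=("/dev/ttyACM*", "/dev/ttyUSB*")):
    """Return the sorted device paths matching the given glob patterns."""
    ports = []
    for pattern in patterns:
        ports.extend(glob.glob(pattern))
    return sorted(ports)


def can_access(path):
    """True if the current user may read and write the device node."""
    return os.access(path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    ports = find_serial_ports()
    if not ports:
        # Nothing showed up: the cable may carry power only.
        print("No serial port found; the cable may be charge-only.")
    for port in ports:
        status = "ok" if can_access(port) else "no permission"
        print(f"{port}: {status}")
```

Run it once with the box unplugged and once with it plugged in. If a port appears but is reported as "no permission", the user is probably missing from the group that owns the device (often dialout, but that depends on the distribution).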

At this point, following the tutorial in Chrome probably would have worked for me. Instead, I manually installed esphome and checked out the esphome/wake-word-voice-assistants repository from GitHub. I created an esphome-config directory with a secrets.yaml file in it:

wifi_ssid: "<my wifi network name>"
wifi_password: "<my wifi password>"

For the configuration, I created a symlink to esp32-s3-box.yaml in the same directory and made the following changes to that file:

diff --git a/esp32-s3-box/esp32-s3-box.yaml b/esp32-s3-box/esp32-s3-box.yaml
index 47c0b74..de43a58 100644
--- a/esp32-s3-box/esp32-s3-box.yaml
+++ b/esp32-s3-box/esp32-s3-box.yaml
@@ -34,8 +34,8 @@ substitutions:
   micro_wake_word_model: okay_nabu
 
 esphome:
-  name: ${name}
-  friendly_name: ${friendly_name}
+  name: esp32-s3-box
+  friendly_name: ESP32-S3-BOX
   min_version: 2024.9.0
   name_add_mac_suffix: true
   platformio_options:
@@ -61,6 +61,9 @@ esp32:
       CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
       CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
       CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
+    # PO: https://github.com/esphome/wake-word-voice-assistants/issues/43#issuecomment-2552029335
+    version: 4.4.8
+    platform_version: 5.4.0
 
 psram:
   mode: octal
@@ -75,6 +78,9 @@ external_components:
     refresh: 0s
 
 api:
+  # PO: from "esphome dashboard" config
+  encryption:
+    key: "<random 32 bytes, base64 encoded>" # dd if=/dev/random count=32 bs=1 status=none | base64
   on_client_connected:
     - script.execute: draw_display
   on_client_disconnected:
@@ -83,11 +89,18 @@ api:
 ota:
   - platform: esphome
     id: ota_esphome
+    password: "<random 32 characters, hex>"
+
 logger:
   hardware_uart: USB_SERIAL_JTAG
 
 wifi:
+  # PO: from "esphome dashboard" config.
+  ssid: !secret wifi_ssid
+  password: !secret wifi_password
   ap:
+    ssid: "Esp32-S3-Box Fallback Hotspot"
+    password: "<random 12 characters, alphanumeric>"
   on_connect:
     - script.execute: draw_display
   on_disconnect:
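
The three placeholders in the diff above (API encryption key, OTA password, fallback hotspot password) need random values. Here is a minimal sketch for generating them with the Python standard library. The function names are my own, and the lengths simply follow the placeholder descriptions in the diff.

```python
import base64
import secrets
import string


def api_encryption_key():
    """32 random bytes, base64 encoded (like the dd | base64 pipeline)."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def ota_password():
    """32 hexadecimal characters (16 random bytes)."""
    return secrets.token_hex(16)


def ap_password(length=12):
    """Random alphanumeric password for the fallback hotspot."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print("api key:     ", api_encryption_key())
    print("ota password:", ota_password())
    print("ap password: ", ap_password())
```

Paste the printed values into the config in place of the placeholders, or store them in secrets.yaml and reference them with !secret, as is done for the WiFi credentials.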

With this configuration in place, I can flash and update the box manually:

esphome/bin/esphome run --device=/dev/ttyACM0 esphome-config/esp32-s3-box.yaml

Let me repeat: if you open the tutorial in Chrome, none of this is necessary. But I’m glad that I have it set up now, because I can experiment with changes to the configuration. And I learned something :grinning:

You can use a USB-C-to-USB-A cable; it just needs to be a data cable and not a charge-only cable. If you are looking for other things you can do with the S3 Box 3, you might want to check out this project :wink: BigBobbas/ESP32-S3-Box3-Custom-ESPHome on GitHub: Custom ESPHome config for ESP32-S3-Box-3 with sensors and touchscreen

You are probably right. I have no idea where that USB-C-to-USB-A cable came from, so it might very well be just for charging.