I just got started with voice control, using an ESP32-S3-BOX. I hit some snags when setting it up. I’m going to share my observations in this post in case someone else runs into the same problems…
The tutorial for the ESP32-S3-BOX says that you need “Chrome or Edge browser on a desktop (not Android/iOS)” and then continues with “Make sure this page is opened in a Chromium-based browser”. Note that “this page” is not a link; it literally means the tutorial page itself.
I was using Firefox and, despite the comment “If your browser does not support web serial, you will see a warning message indicating this instead of a button”, I got no such warning. So I missed that I should have opened that page in Chrome.
The next snag was that I was using a USB-C-to-USB-A cable to connect the box. That powered up the box, but did not expose the serial port needed for flashing it. In my case, only a USB-C-to-USB-C cable worked. With that, a new /dev/ttyACM0 appeared when I connected the box to my Linux laptop. Other users reported permission issues and conflicts with brltty, but I ran into neither.
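To check whether a cable actually exposes the serial port, a small shell sketch like the following can help. It only tests that the device node exists and that the current user can read and write it. The function name check_port is made up for illustration, and /dev/ttyACM0 is simply the name the box got on my laptop, so yours may differ.

```shell
#!/bin/sh
# check_port is a hypothetical helper, not part of ESPHome or the tutorial.
# It prints "ok", "no-access" or "missing" for the given device node.
check_port() {
    port="$1"
    if [ ! -c "$port" ]; then
        # No character device at this path: the cable may only carry power,
        # or the box got a different name (compare the output of
        # "ls /dev/ttyACM* /dev/ttyUSB*" before and after plugging it in).
        echo "missing"
        return 1
    fi
    if [ -r "$port" ] && [ -w "$port" ]; then
        echo "ok"
    else
        # Usually a group permission problem; "ls -l" on the device
        # shows which group owns it.
        echo "no-access"
        return 2
    fi
}

# /dev/ttyACM0 is the name from this post; pass another path as first argument.
check_port "${1:-/dev/ttyACM0}" || true
```

If this prints "missing" with the box plugged in and powered, the cable is the first thing I would swap.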
At this point, following the tutorial in Chrome probably would have worked for me. Instead, I manually installed esphome and checked out the esphome/wake-word-voice-assistants repository from GitHub. I then created an esphome-config directory with a secrets.yaml file in it:
wifi_ssid: "<my wifi network name>"
wifi_password: "<my wifi password>"
As the device config, I created a symlink in the same directory that points to esp32-s3-box/esp32-s3-box.yaml in the checkout, and made the following changes to that file:
diff --git a/esp32-s3-box/esp32-s3-box.yaml b/esp32-s3-box/esp32-s3-box.yaml
index 47c0b74..de43a58 100644
--- a/esp32-s3-box/esp32-s3-box.yaml
+++ b/esp32-s3-box/esp32-s3-box.yaml
@@ -34,8 +34,8 @@ substitutions:
   micro_wake_word_model: okay_nabu
 esphome:
-  name: ${name}
-  friendly_name: ${friendly_name}
+  name: esp32-s3-box
+  friendly_name: ESP32-S3-BOX
   min_version: 2024.9.0
   name_add_mac_suffix: true
   platformio_options:
@@ -61,6 +61,9 @@ esp32:
       CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
       CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
       CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
+    # PO: https://github.com/esphome/wake-word-voice-assistants/issues/43#issuecomment-2552029335
+    version: 4.4.8
+    platform_version: 5.4.0
 psram:
   mode: octal
@@ -75,6 +78,9 @@ external_components:
     refresh: 0s
 api:
+  # PO: from "esphome dashboard" config
+  encryption:
+    key: "<random 32 bytes, base64 encoded ?>" # dd if=/dev/random count=32 bs=1 status=none | base64
   on_client_connected:
     - script.execute: draw_display
   on_client_disconnected:
@@ -83,11 +89,18 @@ api:
 ota:
   - platform: esphome
     id: ota_esphome
+    password: "<random 32 characters, hex>"
+
 logger:
   hardware_uart: USB_SERIAL_JTAG
 wifi:
+  # PO: from "esphome dashboard" config.
+  ssid: !secret wifi_ssid
+  password: !secret wifi_password
   ap:
+    ssid: "Esp32-S3-Box Fallback Hotspot"
+    password: "<random 12 characters, alphanumeric>"
   on_connect:
     - script.execute: draw_display
   on_disconnect:
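The placeholders in the diff stand for secrets I generated myself. Here is a minimal shell sketch for producing values of the described shapes, assuming head, base64, od and tr are available (they are on a typical Linux system). The variable names are only illustrative.

```shell
#!/bin/sh
# API encryption key: 32 random bytes, base64 encoded (44 characters).
api_key="$(head -c 32 /dev/urandom | base64)"

# OTA password: 32 hex characters, i.e. 16 random bytes.
ota_password="$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')"

# Fallback hotspot password: 12 alphanumeric characters. 512 random bytes
# leave far more than 12 characters after filtering.
ap_password="$(head -c 512 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | head -c 12)"

printf 'api key:      %s\n' "$api_key"
printf 'ota password: %s\n' "$ota_password"
printf 'ap password:  %s\n' "$ap_password"
```

As for the question mark in my placeholder: the ESPHome native API documentation describes the encryption key as a 32-byte base64-encoded string, so the first line should produce a usable value.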
With this configuration in place, I can build and flash the box manually:
esphome/bin/esphome run --device=/dev/ttyACM0 esphome-config/esp32-s3-box.yaml
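Put together, the manual route described in this post can be sketched as the following script. The names install_tools and prepare_config are made up for illustration. That the esphome directory is a Python virtual environment is an assumption inferred from the esphome/bin/esphome path, and the symlink uses an absolute target, which is only one of several ways to do it.

```shell
#!/bin/sh
# One-time installation and checkout (needs network access, so it is only
# defined here and not called automatically).
install_tools() {
    python3 -m venv esphome                 # assumed venv layout
    esphome/bin/pip install esphome
    git clone https://github.com/esphome/wake-word-voice-assistants.git
}

# Create the config directory, a secrets.yaml template and the symlink
# to the config file inside the checkout.
prepare_config() {
    repo_dir="$1"      # checkout of esphome/wake-word-voice-assistants
    config_dir="$2"    # e.g. esphome-config
    mkdir -p "$config_dir"
    if [ ! -e "$config_dir/secrets.yaml" ]; then
        # Write a template that is readable by the current user only;
        # the placeholders still have to be replaced by hand.
        ( umask 077
          printf 'wifi_ssid: "%s"\nwifi_password: "%s"\n' \
              '<my wifi network name>' '<my wifi password>' \
              > "$config_dir/secrets.yaml" )
    fi
    target="$(cd "$repo_dir" && pwd)/esp32-s3-box/esp32-s3-box.yaml"
    ln -sf "$target" "$config_dir/esp32-s3-box.yaml"
}

# Typical order of steps:
#   install_tools
#   prepare_config wake-word-voice-assistants esphome-config
#   esphome/bin/esphome config esphome-config/esp32-s3-box.yaml   # validate only
#   esphome/bin/esphome run --device=/dev/ttyACM0 esphome-config/esp32-s3-box.yaml
```

Running "esphome config" first validates the YAML without touching the box, which is handy after editing the file by hand.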
Let me repeat: when opening the tutorial in Chrome, none of this is necessary. But I’m glad that I have it set up now, because I can experiment with changes to the configuration. And I learned something.