Fix: ThirdReality Voice/Music Assistant Dev Edition stuck with "ESPHome api section" error in Home Assistant
TL;DR
The voice service is hard-gated behind a successful ntpdate against seven hardcoded public NTP servers (the first three of them in China). If your network blocks WAN egress for the device's subnet, which is a normal IoT VLAN setup, the speaker boots, announces it's ready, gets a DHCP lease, and then never opens its ESPHome API port (6053). Home Assistant shows the misleading error "Unable to connect to the ESPHome device. Make sure the device's YAML configuration includes an api section." The fix is to point the device at a local NTP server. You need the Dev Edition with debug board to do this; there's no other way to get a shell on the device.
The contradiction
The ThirdReality Voice/Music Assistant is marketed as a "local-first" satellite — preloaded with Home Assistant Voice Assistant and Music Assistant, sold to users "building a local-first smart home environment" and an "open-source, local-first" experience.
In reality, the device cannot complete first boot without reaching the internet. The product copy emphasizes local audio processing while the on-device code refuses to start the voice service until ntpdate succeeds against one of seven hardcoded public NTP servers (the first three are Chinese: ntp.aliyun.com, ntp.tencent.com, cn.pool.ntp.org; the rest are Cloudflare, Google, pool.ntp.org, and Android fallbacks).
For anyone running the speaker on an isolated IoT VLAN — exactly the recommended practice for any IoT device, and exactly the practice a security-conscious Home Assistant user would follow — the speaker provisions Wi-Fi over BLE, plays its ready announcement, and then sits forever in a connected_no_internet state, never exposing the ESPHome API.
This is a design defect, not a misconfiguration on the user's end. A local-first satellite device should never require WAN connectivity to function. NTP is genuinely useful for log timestamps and certificate validation, but gating the entire voice service behind it — in a while true retry loop with no fallback — is the kind of decision that should not have survived a design review.
Symptoms
- Device provisions over BLE Improv successfully via the HA mobile app
- Device appears in your DHCP table with a valid lease
- Device responds to ping from anywhere on its subnet
- TCP connection to port 6053 fails (nc -zv <speaker-ip> 6053 returns exit 1)
- HA's ESPHome integration shows "Unable to connect to the ESPHome device. Make sure the device's YAML configuration includes an api section" when you try to add it manually
- Auto-discovery (mDNS) never finds the device because the device never advertises a service that doesn't exist
The error message is misleading. The device doesn't run an ESPHome YAML at all — it runs a Python linux-voice-assistant daemon that implements the ESPHome native API protocol. The HA-side error is a generic connection failure dressed up as a YAML configuration problem.
Who this affects
You'll hit this bug if any of these apply:
- Your device's subnet has no WAN egress (IoT VLAN, segmented LAN, air-gapped network)
- Your firewall blocks NTP destinations or outbound UDP 123
- Your DNS doesn't resolve pool.ntp.org and the Chinese NTP servers from the device's subnet
- You're behind a captive portal the device can't navigate
If your speaker has open internet access, you won't see this — ntpdate succeeds, the service starts, and everything works.
Prerequisites
Dev Edition with debug board is required.
The Dev Edition ships in two SKUs:
- With Debug Board — exposes a USB serial console at 115200 baud over one of two USB ports on the debug board PCB. Also exposes a USB-OTG flashing port.
- Without Debug Board — Type-C cable for firmware flashing only, no serial console, no shell access.
The fix below requires editing files on the device, which requires shell access, which requires the serial console, which requires the debug board.
If you have the no-debug-board version and you hit this bug, your only options are:
- Give the device's subnet WAN egress (allow it through your firewall)
- Move the device to a network with internet access
- Return it
The vendor should have flagged this in the product description for the no-debug-board SKU. They didn't.
Diagnosis
1. Get a serial console
Connect a USB cable (data-capable, not charge-only) from the debug board's serial/UART USB port to your PC. The debug board has two USB ports. The UART one enumerates immediately as a USB serial adapter (CH340/CP2102/FT232 style) and appears as COMx on Windows, /dev/ttyUSB0 on Linux, or /dev/cu.usbserial-* on macOS (the exact macOS name depends on the adapter chip and driver). The other is USB-OTG for flashing and only enumerates in burn mode.
Open a serial terminal at 115200 8N1, no flow control. PuTTY, screen, minicom, or the Arduino serial monitor all work.
Power-cycle the speaker. You'll see Linux boot messages stream past, then a prompt. No password.
2. Confirm the symptom
ss -lnt | grep 6053 # nothing — API not listening
date # shows 1970 — clock never synced
ip route # default route should exist; if not, fix DHCP first
ping -c 2 <your-gateway> # confirm subnet connectivity
ping -c 2 8.8.8.8 # confirms WAN egress
If ping 8.8.8.8 fails but the gateway responds, your subnet is isolated and you've hit the bug.
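The checks above can be folded into a single verdict. The following is a minimal sketch, not part of the vendor firmware: classify_state is a hypothetical helper name, and treating any year before 2000 as "clock never synced" is an assumption.

```shell
# classify_state <clock_year> <gateway_ok> <wan_ok> <api_listening>
# Hypothetical helper: turns the four manual checks into one verdict.
# The last three arguments are the literal strings "yes" or "no".
classify_state() {
    year=$1 gw=$2 wan=$3 api=$4
    if [ "$api" = yes ]; then
        echo "api-up"      # port 6053 is listening; nothing to fix
    elif [ "$gw" != yes ]; then
        echo "no-lan"      # fix DHCP / Wi-Fi before anything else
    elif [ "$wan" != yes ] && [ "$year" -lt 2000 ]; then
        echo "ntp-gate"    # isolated subnet plus unsynced clock: this bug
    else
        echo "other"       # something else is blocking the service
    fi
}

# On the speaker's serial console the inputs could be gathered like this
# (left commented out so the sketch has no side effects when sourced):
#   api=no; ss -lnt | grep -q ':6053 ' && api=yes
#   gw=no;  ping -c 2 <your-gateway> >/dev/null 2>&1 && gw=yes
#   wan=no; ping -c 2 8.8.8.8 >/dev/null 2>&1 && wan=yes
#   classify_state "$(date +%Y)" "$gw" "$wan" "$api"
```

A result of ntp-gate corresponds to the situation this write-up describes; any other result points at a different problem.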
3. Look at the gating logic
cat /etc/init.d/ntpdate.sh
You'll find this structure:
while true ; do
    $NTPDATE_BIN -v -b $NTPDATE_OPTS $NTPSERVERS > /dev/null 2>&1
    if [ $? = 0 ]; then
        # ... start voice-assistant and snapclient ...
        break;
    else
        killall -9 ntpd > /dev/null 2>&1
        sleep 1
    fi
done
The voice service is only ever started inside the success branch. There is no fallback. The loop retries every second forever.
4. Check the NTP server list
cat /etc/default/ntpd
Default value:
NTPSERVERS="ntp.aliyun.com ntp.tencent.com cn.pool.ntp.org time.cloudflare.com time.google.com pool.ntp.org time.android.com"
None of which are reachable from an isolated subnet.
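To see which entries fail, and whether the cause is DNS or routing, it can help to probe the servers one at a time. This is a minimal sketch: list_ntp_servers is a hypothetical helper, it assumes the config file is plain shell variable assignments as shown above, and the path passed to it should contain a slash so the shell does not search PATH for it.

```shell
# list_ntp_servers <config-file>
# Prints each entry of NTPSERVERS on its own line. The file is sourced in a
# subshell so its variables do not leak into the caller's environment.
list_ntp_servers() {
    (
        . "$1" || exit 1
        for server in $NTPSERVERS; do
            echo "$server"
        done
    )
}

# On the speaker, each server could then be probed individually. This assumes
# ntpdate -q exits non-zero when it gets no usable reply (commented out so the
# sketch has no side effects when sourced):
#   for s in $(list_ntp_servers /etc/default/ntpd); do
#       if ntpdate -q "$s" >/dev/null 2>&1; then echo "$s reachable"
#       else echo "$s unreachable"; fi
#   done
```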
Fix
1. Set up a local NTP server
You need an NTP server reachable from the device's subnet. Pick whichever is easiest:
- Synology NAS (DSM 6.2+/7.x): Control Panel → Regional Options → NTP Service tab → Enable NTP service. Open UDP 123 in Synology's firewall if active.
- Any always-on Linux box:
sudo apt install chrony
echo "allow <your-subnet>/24" | sudo tee -a /etc/chrony/chrony.conf
sudo systemctl restart chrony
- Docker container, anywhere on the subnet:
docker run -d --restart=always --name ntp -p 123:123/udp cturra/ntp
- Router: pfSense, OPNsense, OpenWrt, MikroTik, EdgeOS, and most consumer routers can serve NTP. Test before assuming.
Important note on UniFi gateways: UniFi Network (as of v9.x) does not expose an NTP-server toggle for any gateway in the USG/UDM/UXG family. The DHCP Option 42 setting only advertises an NTP server IP to clients — it does not host one. You need a separate device.
2. Verify your NTP source from the speaker
Before changing anything on the device, confirm the speaker can reach your chosen NTP server:
ntpdate -q <your-ntp-server-ip>
Expected output:
server <ip>, stratum N, offset +<huge_number>.xxxxx, delay 0.0xxx
The "huge_number" offset is your speaker's clock being roughly 56 years behind real time. Stratum 2 or 3 is normal for a local server.
If ntpdate -q times out or refuses, your NTP server isn't reachable. Fix that before continuing.
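To sanity-check the "roughly 56 years" figure, the offset can be pulled out of the ntpdate -q output and converted to years. A minimal sketch, assuming the output format shown above; both function names are hypothetical, and the sample line is made up for illustration rather than captured from a device.

```shell
# ntp_offset_seconds: reads `ntpdate -q` output on stdin and prints the
# offset from the first "server ..." line, without a leading "+".
ntp_offset_seconds() {
    sed -n 's/^server .*, stratum [0-9]*, offset +\{0,1\}\(-\{0,1\}[0-9.]*\), delay .*$/\1/p' | head -n 1
}

# offset_in_years: converts seconds on stdin to Julian years
# (365.25 days = 31557600 seconds), rounded to the nearest whole year.
offset_in_years() {
    awk '{ printf "%.0f\n", $1 / 31557600 }'
}

# Made-up sample line in the format shown above (not captured from a device).
sample='server 192.0.2.10, stratum 2, offset +1776000000.123456, delay 0.02571'
echo "$sample" | ntp_offset_seconds | offset_in_years   # prints 56
```

On the speaker, the sample line would be replaced by piping the real ntpdate -q output into ntp_offset_seconds.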
3. Edit the speaker's NTP config
The rootfs is UBIFS — real flash storage that persists across reboots. Remount it writable, then write the new config:
mount -o remount,rw /
cat > /etc/default/ntpd <<'EOF'
NTPSERVERS="<your-ntp-server-ip>"
NTPDATE=yes
NTPDATE_OPTS="-4 -t 2"
NTPD=yes
EOF
For redundancy, list multiple NTP servers space-separated:
NTPSERVERS="<ntp1-ip> <ntp2-ip>"
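Because this edit overwrites the vendor file, it may be worth keeping a copy of the original first. The following is a minimal sketch that writes the same four settings shown above; write_ntp_config and the .orig suffix are hypothetical names, not something the firmware provides.

```shell
# write_ntp_config <config-file> <server> [server...]
# Saves a one-time copy of the existing file as <config-file>.orig, then
# writes a new config that lists only the given servers.
write_ntp_config() {
    conf=$1; shift
    if [ $# -lt 1 ]; then
        echo "usage: write_ntp_config <file> <server>..." >&2
        return 2
    fi
    # Never overwrite an existing backup, so the vendor defaults survive
    # repeated runs.
    if [ -f "$conf" ] && [ ! -f "$conf.orig" ]; then
        cp "$conf" "$conf.orig"
    fi
    cat > "$conf" <<EOF
NTPSERVERS="$*"
NTPDATE=yes
NTPDATE_OPTS="-4 -t 2"
NTPD=yes
EOF
}

# Usage on the speaker, after the rootfs has been remounted read-write
# (commented out so the sketch has no side effects when sourced):
#   write_ntp_config /etc/default/ntpd <ntp1-ip> <ntp2-ip>
```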
Do NOT remount the rootfs read-only afterward. The voice-assistant daemon needs to write to its install directory at runtime (creates /usr/lib/python3.11/site-packages/local). PulseAudio also fails to start on a read-only rootfs. Leave it RW.
4. Restart the NTP service
/etc/init.d/S49ntp restart
5. Watch it actually work
tail -f /var/log/messages
You should see, in order:
- ntpdate OK
- System clock jumps from 1970 to current date
- voice-assistant and snapclient services start
- The speaker plays "Your device is ready to connect to Home Assistant" (first time only)
- A log line like:
INFO:__main__:Server started (host=<speaker-ip>, port=6053)
6. Verify from Home Assistant
From a shell on your HA host or any device on the same network:
nc -zv <speaker-ip> 6053
echo $? # 0 = success
Then add the device in HA: Settings → Devices & Services → Add Integration → ESPHome → enter the speaker IP and port 6053, leave the encryption key blank.
Persistence and reboots
The speaker has no working RTC battery. Every boot starts at 1970-01-01 UTC. Your local NTP source must be available every time the speaker boots, or you'll be stuck again.
Recommendations:
- Configure two NTP servers in NTPSERVERS for redundancy
- Use static DHCP reservations for your NTP server(s) so their IPs don't shift
- Keep at least one NTP source on the same subnet as the speaker, in case inter-VLAN routing fails
The /etc/default/ntpd edit persists across reboots since UBIFS is real flash. But firmware updates (OTA in v1.1.3 and later) will overwrite it. After any update, redo the config edit.
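One quick way to notice that an update has reverted the file is to look for one of the vendor defaults. A minimal sketch: ntp_config_reverted is a hypothetical helper, and it assumes the vendor list still contains ntp.aliyun.com as in the default shown earlier.

```shell
# ntp_config_reverted <config-file>
# Succeeds (exit 0) if NTPSERVERS in the file names a vendor default server.
ntp_config_reverted() {
    grep -q '^NTPSERVERS=.*ntp\.aliyun\.com' "$1"
}

# Usage on the speaker (commented out so the sketch has no side effects):
#   ntp_config_reverted /etc/default/ntpd && echo "vendor NTP list is back; redo the edit"
```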
After your first reboot, verify the rootfs comes up RW:
mount | grep " / "
If it shows (ro,...), the voice-assistant will crash in a loop on every boot with OSError: [Errno 30] Read-only file system. In that case, add mount -o remount,rw / to an early init script (e.g., a new /etc/init.d/S00remount-rw).
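Such an init script might look like the sketch below. It is an illustration only: root_mount_mode is a hypothetical helper, the script body is untested on this firmware, and it assumes /proc/mounts uses the usual "device mountpoint fstype options" field order.

```shell
# root_mount_mode [mounts-file]
# Prints "ro" or "rw" for the last filesystem mounted at "/".
# Reads /proc/mounts unless another file is given (useful for testing).
root_mount_mode() {
    awk '$2 == "/" {
             n = split($4, opt, ",")
             for (i = 1; i <= n; i++)
                 if (opt[i] == "ro" || opt[i] == "rw") mode = opt[i]
         }
         END { print mode }' "${1:-/proc/mounts}"
}

# Possible body for the init script (commented out: remounting needs root
# and only makes sense on the speaker itself):
#   case "$1" in
#       start)
#           if [ "$(root_mount_mode)" = ro ]; then
#               mount -o remount,rw /
#           fi
#           ;;
#   esac
```

Checking the mode first keeps the script a no-op on boots where the rootfs already comes up read-write.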
What ThirdReality should fix
In rough order of importance:
- Remove the NTP gate from the voice service. Logs can warn about clock skew. The voice/music daemon should not refuse to start.
- If NTP must stay gated, add a retry limit. After N failed attempts, start the service anyway and let the user deal with timestamp inaccuracy. A while true loop with no escape is unacceptable in shipping firmware.
- Honor DHCP option 42 (NTP server). Or add a setup-time field in the BLE provisioning flow for a local NTP server. Either is a few lines of code and solves this for everyone.
- Reorder the default NTP server list. Putting Chinese servers first means users outside China (almost the entire English-speaking customer base) experience 30+ second NTP delays at every boot when those servers are slow or unreachable.
- Document the WAN-egress requirement on the product page, quick-start guide, and packaging. Users with segmented networks (the natural Home Assistant audience) will hit this and have no idea why their "local-first" device doesn't work.
- Improve the failure mode. When the voice service hasn't started, the device should ideally serve a status page on port 80 or similar so users can see what's wrong without a serial console. As-is, the only signal is silence from port 6053.
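The retry-limit suggestion can be sketched as follows. This is an illustration, not ThirdReality's code: try_sync and RETRY_DELAY are hypothetical names, and a real init script would pass its own ntpdate command line.

```shell
RETRY_DELAY=${RETRY_DELAY:-1}   # seconds between attempts (hypothetical knob)

# try_sync <max_attempts> <command> [args...]
# Runs the command until it succeeds or the attempt budget is spent.
# Returns 0 if it ever succeeded, 1 otherwise; it never loops forever.
try_sync() {
    max=$1; shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@" > /dev/null 2>&1; then
            echo "clock synced after $attempt attempt(s)"
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$RETRY_DELAY"
    done
    echo "no NTP after $max attempt(s); starting services with unsynced clock"
    return 1
}

# In the init script the services would start on either outcome, e.g.:
#   try_sync 30 $NTPDATE_BIN -v -b $NTPDATE_OPTS $NTPSERVERS
#   ... start voice-assistant and snapclient ...
```

The design point is that the return value only affects what gets logged; the voice service starts either way.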
Closing thoughts
Marketing a product as "open source, local-first, distributed voice and music experience" while shipping firmware that refuses to function without phoning home for time is, at best, misleading. The fix is trivial — a config tweak — but only Dev Edition buyers with the debug board have the access required to apply it. Standard Dev Edition buyers (no debug board) have no recourse beyond returning the unit or compromising their network segmentation.
For now, the workaround above gets working hardware out of a returnable product. If you hit this issue, please file it with ThirdReality (their GitHub at thirdreality/voice-music-assistant) and on the Home Assistant community forums. The more reports, the more pressure on the vendor to fix this properly in firmware rather than every buyer reinventing the workaround.
Tested on: firmware v1.1.7, Home Assistant Core 2026.4.x, HAOS 17.x
Hardware needed: ThirdReality Voice/Music Assistant Dev Edition with Debug Board, USB data cable