Hello everyone,
I have been banging my head against the wall trying to get the voice pipeline to work from ESPHome. I have been testing with both custom hardware and an M5Stack Echo.
Idea:
The assist pipeline works completely fine on my computer and phone. I wanted to set up an I2S assist device that takes input from a microphone and sends its speaker output through a PCM5102 to an RCA receiver. Simple, right? I thought so. I also bought an M5Stack Echo for testing, and maybe to have a deskside voice assistant to fiddle with.
I run ESPHome and Home Assistant as containers. They communicate with no problem: I have a few other devices active, such as a DSMR reader, so I pretty much always know whether things are working or something is wrong.
Playing music through the device as a speaker already works very well. No problem there.
Problem
ESPHome provides an example config for the M5Stack Echo, so I tried that after failing with my MSM261S4030H0-based breakout board connected to an ESP32-S3 Xiao.
According to the whisper logs, the generated STT wav file is always empty, on every device I have tried. The M5Stack relies on the pipeline to detect the end of speech, and because nothing is ever detected it simply stays listening forever.
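One way to pin down what "empty" means here is to inspect a captured recording directly. The sketch below is a minimal example using only the Python standard library. It assumes the recording is a 16-bit PCM WAV file (an assumption about the capture format, not something confirmed in the logs), and `wav_summary` is a name made up for this illustration. It reports the frame count, the duration, and the peak sample value, which separates a file with no audio frames from one that contains only silence.

```python
import struct
import wave


def wav_summary(path):
    """Return (frame_count, duration_seconds, peak) for a 16-bit PCM WAV file.

    Assumption: samples are 16-bit little-endian PCM. A zero-byte file has
    no WAV header at all, so wave.open() raises instead of returning.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frame_count = wav.getnframes()
        rate = wav.getframerate()
        raw = wav.readframes(frame_count)

    # Unpack every 16-bit sample (all channels interleaved) to find the peak.
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    peak = max((abs(s) for s in samples), default=0)
    duration = frame_count / rate if rate else 0.0
    return frame_count, duration, peak
```

Calling `wav_summary("/path/to/recording.wav")` (a placeholder path) and getting `(0, 0.0, 0)` would mean a valid header with no audio at all, while a non-zero frame count with a peak near 0 would point to silence being recorded instead.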
Here is the whisper log of a failed attempt, followed by a “normal” successful STT from my computer microphone:
03/26/2024 01:50:24 PM  ERROR:asyncio:Task exception was never retrieved
03/26/2024 01:50:24 PM  future: <Task finished name='wyoming event handler' coro=<AsyncEventHandler.run() done, defined at /usr/local/lib/python3.9/dist-packages/wyoming/server.py:31> exception=AssertionError()>
03/26/2024 01:50:24 PM  Traceback (most recent call last):
03/26/2024 01:50:24 PM    File "/usr/local/lib/python3.9/dist-packages/wyoming/server.py", line 41, in run
03/26/2024 01:50:24 PM      if not (await self.handle_event(event)):
03/26/2024 01:50:24 PM    File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/handler.py", line 63, in handle_event
03/26/2024 01:50:24 PM      assert self._wav_file is not None
03/26/2024 01:50:24 PM  AssertionError
03/26/2024 01:53:15 PM  INFO:faster_whisper:Processing audio with duration 00:01.770
03/26/2024 01:53:16 PM  INFO:wyoming_faster_whisper.handler: Love you.
Attempted Solutions
I have looked at a ton of different threads, but not a single one describes the same issue. At first I thought it was an ESPHome device YAML issue, but that didn’t pan out.
- I have completely opened all firewall rules to and from my server during testing. It is not a communication access issue.
- I have put Home Assistant, whisper, and ESPHome all on the same Docker network.
- I have used the default device YAML provided here: firmware/media-player/m5stack-atom-echo.yaml at fd24297932de474d8a552c1d238ba508153ad687 · esphome/firmware · GitHub, with only the wifi and api settings changed so that it connects to Home Assistant successfully.
- I have manually probed the signal lines with my logic analyzer and confirmed that data is being sent from the I2S modules to the ESP.
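The network checks above can be reproduced with a small script run from inside another container. This is a minimal sketch that only tests whether a TCP connection can be opened. `can_connect` is a hypothetical helper, and the address and port in the usage note are the ones assigned to the whisper service in my compose file. A successful connection shows that the port is reachable, not that audio is actually being delivered.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `can_connect("172.21.0.3", 10300)` run from the homeassistant container would check the path to whisper over the `addons` network.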
I am lost as to what else to try. Literally everything in the ESPHome/Home Assistant communication works, including streaming music to the exact same device as a connected speaker. The only thing that fails is receiving the .wav audio.
docker-compose snippets for HA, esphome, and whisper:
version: '3.7'
services:
  homeassistant:
    container_name: homeassistant
    image: ghcr.io/home-assistant/home-assistant:latest
    volumes:
      - $USERDIR/dockerconfig/homeassistant:/config
      - /etc/localtime:/etc/localtime:ro
      - $USERDIR/data/Music/:/Music
      - $USERDIR/dockerconfig/homeassistant/assist_pipeline:/share/assist_pipeline/
    restart: unless-stopped
    #privileged: true
    networks:
      addons:
        ipv4_address: 172.21.0.2
      web:
        ipv4_address: 192.168.90.17
      zwave:
        ipv4_address: 172.18.0.3
      mqtt:
        ipv4_address: 172.20.0.3
    ports:
      - 8123:8123
    environment:
      - PUID=$PUID
      - PGID=$PGID
      - TZ=$TZ
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=web"
      ## HTTP Routers
      - "traefik.http.routers.hass-rtr.entrypoints=https"
      - "traefik.http.routers.hass-rtr.rule=Host(`home.$DOMAINNAME`)" # || Host(`www.$DOMAINNAME`)"
      - "traefik.http.routers.hass-rtr.tls=true"
      ## Middlewares
      - "traefik.http.routers.hass-rtr.middlewares=chain-authelia@file"
      ## HTTP Services
      - "traefik.http.routers.hass-rtr.service=hass-svc"
      - "traefik.http.services.hass-svc.loadbalancer.server.port=8123"
      ## HTTP Routers - Mobile
      - "traefik.http.routers.hassM-rtr.entrypoints=https"
      - "traefik.http.routers.hassM-rtr.rule=Host(`mhome.$DOMAINNAME`)" # || Host(`www.$DOMAINNAME`)"
      - "traefik.http.routers.hassM-rtr.tls=true"
      ## Middlewares - Mobile
      - "traefik.http.routers.hassM-rtr.middlewares=chain-no-auth@file"
      ## HTTP Services - Mobile
      - "traefik.http.routers.hassM-rtr.service=hass-svc"
  esphome:
    container_name: esphome
    image: ghcr.io/esphome/esphome
    environment:
      - PUID=$PUID
      - PGID=$PGID
      - TZ=$TZ
      - ESPHOME_DASHBOARD_USE_PING=true
    volumes:
      - $USERDIR/dockerconfig/esphome:/config
      - /etc/localtime:/etc/localtime:ro
    restart: unless-stopped
    #privileged: true
    networks:
      web:
        ipv4_address: 192.168.90.140
      addons:
        ipv4_address: 172.21.0.6
    ports:
      - 6052:6052
      - 6123:6123
  whisper:
    container_name: whisper
    image: rhasspy/wyoming-whisper:latest
    #command: --model base-int8 --language fr
    command: --model base-int8 --language en --beam-size 1
    privileged: true
    restart: unless-stopped
    environment:
      - TZ=$TZ
    volumes:
      - $USERDIR/data/Volumes/whisper:/data
    ports:
      - 10300:10300
    networks:
      addons:
        ipv4_address: 172.21.0.3
...
networks:
  web:
    external: true
  zwave:
    external: false
    driver: bridge
    ipam:
      config:
        - subnet: 172.18.0.0/16
          gateway: 172.18.0.1
  mqtt:
    external: true
  addons:
    external: false
    driver: bridge
    ipam:
      config:
        - subnet: 172.21.0.0/16
          gateway: 172.21.0.1
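As a sanity check on the static addresses above, the sketch below verifies that each `ipv4_address` on the two locally defined networks (`addons` and `zwave`) falls inside that network's subnet, and that no address is assigned twice on the same network. The `web` and `mqtt` networks are external, so their subnets are not shown here and they are left out. The helper names are made up for this illustration.

```python
import ipaddress
from collections import Counter

# Subnets of the networks defined in the compose file (external ones omitted).
SUBNETS = {
    "addons": "172.21.0.0/16",
    "zwave": "172.18.0.0/16",
}

# (service, network, address) triples copied from the compose file.
ASSIGNED = [
    ("homeassistant", "addons", "172.21.0.2"),
    ("homeassistant", "zwave", "172.18.0.3"),
    ("esphome", "addons", "172.21.0.6"),
    ("whisper", "addons", "172.21.0.3"),
]


def outside_subnet(assigned, subnets):
    """Return the triples whose address is not inside its network's subnet."""
    bad = []
    for service, network, address in assigned:
        net = ipaddress.ip_network(subnets[network])
        if ipaddress.ip_address(address) not in net:
            bad.append((service, network, address))
    return bad


def duplicate_addresses(assigned):
    """Return (network, address) pairs that are assigned more than once."""
    counts = Counter((network, address) for _, network, address in assigned)
    return [key for key, n in counts.items() if n > 1]


print(outside_subnet(ASSIGNED, SUBNETS))  # prints []
print(duplicate_addresses(ASSIGNED))      # prints []
```

Both lists come back empty for the addresses shown, so the static addressing on these two networks at least looks consistent.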
Can anyone shed light on why everything works except that the streamed wav files are always empty? The only error ever logged is in whisper. The HA logs, ESPHome logs, and device logs show no errors.