Year of the Voice - Chapter 4: Wake words

Could we also use an M5Core2 as a voice satellite with ESPHome?
If yes, could we get the display working at the same time?

Thanks for posting.

Here’s my docker compose snippet using my new wake word, running on an RPi4.

  openwakeword:
    container_name: openwakeword
    image: rhasspy/wyoming-openwakeword:latest
    command: >-
      --preload-model 'ok'
      --custom-model-dir /custom
    volumes:
      - /home/kertz/HALIVE/openwakeword-data:/data
      - /home/kertz/HALIVE/openwakeword-data:/custom
    environment:
      - TZ=Europe/Amsterdam
    restart: unless-stopped
    ports:
      - 10400:10400

I left out the default models as they are always loaded by the rhasspy/wyoming-openwakeword image.
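For completeness: the STT and TTS halves of a fully local pipeline follow the same Wyoming pattern. A hypothetical sketch alongside the service above; the rhasspy/wyoming-whisper and rhasspy/wyoming-piper images and their default ports (10300/10200) are the published ones, but the model choices and volume paths here are placeholders:

  whisper:
    image: rhasspy/wyoming-whisper:latest
    command: --model tiny-int8 --language en   # small model, suits a Pi4
    volumes:
      - ./whisper-data:/data
    ports:
      - 10300:10300
    restart: unless-stopped

  piper:
    image: rhasspy/wyoming-piper:latest
    command: --voice en_US-lessac-medium       # placeholder voice choice
    volumes:
      - ./piper-data:/data
    ports:
      - 10200:10200
    restart: unless-stopped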

Just one warning in the log files, and that is:

Logger: homeassistant.components.esphome.voice_assistant
Source: components/esphome/voice_assistant.py:164
Integration: ESPHome (documentation, issues)
First occurred: 10:09:13 (4 occurrences)
Last logged: 10:09:31

Received unknown pipeline event type: stt-vad-start
Received unknown pipeline event type: stt-vad-end

Does anybody know what this relates to?


It definitely looks like it is an ESPHome issue. Are you running the ESPHome beta channel? I think the beta is required for wake word support until it is officially released.
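For reference, the device-side piece on the beta is small. A hypothetical minimal fragment of a satellite's ESPHome YAML, assuming a microphone component with id `mic` is defined elsewhere; the `on_end` restart shown is just one pattern for keeping the device listening, not necessarily what the official firmware does:

  voice_assistant:
    microphone: mic
    use_wake_word: true   # stream audio to HA; openWakeWord detects the wake word server-side
    on_end:
      # resume listening for the next wake word once a pipeline run finishes
      - voice_assistant.start_continuous

With use_wake_word enabled the device itself never runs a wake word model; it only streams audio, which is why the server-side openwakeword container above matters.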

Yes, I am using the beta version. If you try any other version, you will see many errors listed when using the docker-compose logs command.
I have seen one other post on the ‘Wake word is rolled out?’ community thread with the same warning.
Early days for this function, so best just to ignore it for now; if I see many other people with the same issue, I will create a core issue.

@synesthesiam Mike, do you think it’s worthwhile setting up some recommended products on a Raspberry Pi?

I will reply to you jamesleak70, as the hardware problems are mainly due to a lack of audio algorithms to utilise mic arrays.

Many USB soundcards have a mic input, but they expect broadcast situations of close-field use and are usually poor near/far field. You can digitally increase the volume, but much resolution is lost (low volume = only a few LSBs), so an analogue mic amp, especially one with silicon AGC, makes a huge difference, as mics are designed for close broadcast style, not near/farfield use.

The Max9814 modules have a really good analogue AGC, and probably the easiest approach is to snip some Dupont jumpers and use terminals, or solder to a 3.5mm audio jack lead.
The cheapest soundcard plus the Max9814 will likely far exceed £100+ conference mics (I will get to that later).

The ReSpeaker 2-Mic is an easy all-in-one plug-on HAT for the Pi; you need audio in/out on the same device if you wish to use Speex AEC. The two onboard mics are sort of pointless, and everybody often has them pointing up at the ceiling, but for $10 it’s a really cost-effective HAT. All the other mic-array HATs are pointless, as they don’t ship a single algorithm to use the multiple mics they have.

Conference mics and speakerphones are for the most part useless, as they are built for a completely different scenario than a smart-speaker microphone, or at least they are a whole lot of expenditure for what the above el-cheapo soundcard and Max9814 with an electret can produce.
Conference mics and speakerphones just beam to the most prominent voice in 360°, whilst a smart-speaker mic will lock on to a voice for a command sentence.
There is a whole array of webcams that have dual mics; I have an Anker PowerConf C300 webcam and the pickup is brilliant compared to my Anker PowerConf speakerphone, and this is the same for many.

It still needs some tidying and probably optimising for NEON, but I created a realtime linear-array beamformer for any array.

It does output the TDOA, so you can make an average and also set where it beams, which is totally missing on speakerphones/conference phones.

/tmp/ds-out contains the current TDOA, so poll it to set LEDs. To monitor:

  watch -n 0.1 cat /tmp/ds-out

To fix the beamformer, write a file to /tmp/ds-in:

  echo 1 > /tmp/ds-in

sets the beam to a delay of 1. Delete the file to clear it and go back to using the TDOA.

I still need to sample the DOA output for the VAD/KWS to set the beampoint and to clear it on a command sentence, but a simple delay-sum in the above is working.
As for a good stereo ADC soundcard, which is more useful than a HAT since on a Pi4/5 you could run multiple mics, that is the only one I know of now, and at $10 it’s still cheap.

It uses the mighty ESP32-S3, which for ML & DSP is approx 10x faster than an ESP32 such as an M5Stack.

If someone (as I did try with the Willow guys) would just record the output of what Espressif have done with their Alexa-certified AFE, we could likely replace the Espressif fixed KWS with one trained like has been done above.
Still, though, for home environments to catch up with commercial we need to capture data in use and train locally via on-device training, where through transfer learning you can bias a big pretrained model via a smaller on-device-trained one whose dataset is of local capture.
It is very likely an ESP32 with a stereo ADC and x2 Max9814 with the ESP32-AFE and x2 KWS could get very near commercial offerings, as the training happens upstream on, say, a Pi4/5 and updates go to the ESP32 OTA.
The same scheme could be used with the Pi02W 2-mic HAT, or maybe ‘Home Assistant’ could benefit all by having a default device on sale that models are specifically trained for.


I have both an ESP Muse Luxe and the $13 Atom, and both are behaving the same in regard to an issue I am having. They work fine for wake word detection right after turning on the wake word switch on their device page. However, after 1 or 2 commands they stop responding until I turn wake word off and then back on again. What are some troubleshooting steps I can try?


Did you update? I think some fixes were made for openWakeWord today.

What is your Assist pipeline? Fully local like mine? I have the same behaviour…
HA running on a Pi4 2GB.

I have a local-only pipeline and an M5Stack Echo, and it does the same as well.


Does it work with smartphones and smartwatches also?

[deleted by author]

No. It basically keeps an open line to HA until the hotword is recognized server-side.


Hey, I am trying to use a Jabra SPEAK 510 MS Bluetooth via USB in my HA with the Assist Microphone add-on (version 2.2.0), but it doesn’t work. All the time I see this error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/homeassistant_satellite/__main__.py", line 200, in main
    async for _timestamp, event_type, event_data in stream(
  File "/usr/local/lib/python3.11/dist-packages/homeassistant_satellite/remote.py", line 28, in stream
    async with session.ws_connect(url) as websocket:
  File "/usr/local/lib/python3.11/dist-packages/aiohttp/client.py", line 1141, in __aenter__
    self._resp = await self._coro
                 ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/aiohttp/client.py", line 779, in _ws_connect
    resp = await self.request(
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/aiohttp/client.py", line 560, in _request
    await resp.start(conn)
  File "/usr/local/lib/python3.11/dist-packages/aiohttp/client_reqrep.py", line 899, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/aiohttp/streams.py", line 616, in read
    await self._waiter
aiohttp.client_exceptions.ServerDisconnectedError: Server disconnected
ERROR:__main__:Unexpected error
(the same traceback and "ERROR:__main__:Unexpected error" repeat two more times)

Here is a screenshot of the add-on configuration:

How can I solve this problem?


Is there a way to monitor the ongoing traffic from the Atom Echo to HA for the wake word detection? Would a LAN/WiFi sniffer be an option, e.g.? At the moment I don’t know if wake word detection stops working because the Atom stops sending things to HA, or if things reach HA and then don’t get processed (properly).
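If you build the Atom's firmware yourself with ESPHome, one lower-effort alternative to a packet sniffer is to raise the device's log level and watch what the voice assistant component reports; `logger:` and its `VERY_VERBOSE` level are standard ESPHome options, though this snippet is a sketch rather than the stock firmware's config:

  logger:
    level: VERY_VERBOSE   # logs each voice assistant event the device handles

Then `esphome logs <your-device>.yaml` streams the messages live, which should show whether the Atom stops sending audio or whether HA stops responding to it.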

I’ve also got speakers in every room using Logitech Media Server. The LMS server is a Raspberry Pi, and the players (the speakers) in each room all run from an RPi with a DAC HAT. It’s a waste just using the RPis for the speakers. It would be great to be able to connect the satellite microphones in each room to the speaker’s RPi in order to connect to Home Assistant. This would save money, avoid installing new gizmos, and use the (currently wasted) resources of the RPis. I think you can do this with Rhasspy, but what about with this new HA Assist system?


Hi everyone, could this work with ESP Muse devices?

You can see the traces in the voice assistant debug view.

I have it working on an Echo in my office. It asks questions, along with occupancy and time of day, to figure out if I’m stopping by to grab something before work, working from home, or messing around after work. I’d love to transition that to local hardware and processing.

They are speakers only, aren’t they?

No microphone.

Yes, they have an I2S microphone. Here is the firmware: https://github.com/esphome/firmware/blob/1cc35128b9d3d2e7edf2dd62331a058cc27e754d/voice-assistant/raspiaudio-muse-proto.yaml
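For anyone curious, the mic-to-Assist wiring in that firmware boils down to a few components. A trimmed, hypothetical sketch; the GPIO numbers below are placeholders, not the Muse Proto's real pinout (the linked YAML has those):

  i2s_audio:
    i2s_lrclk_pin: GPIO25   # word select (LRCLK); placeholder pin
    i2s_bclk_pin: GPIO26    # bit clock (BCLK); placeholder pin

  microphone:
    - platform: i2s_audio
      id: va_mic
      i2s_din_pin: GPIO27   # mic data line; placeholder pin
      adc_type: external    # external I2S mic, not the ESP32's internal ADC
      pdm: false

  voice_assistant:
    microphone: va_mic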
