Voice Chapter 7 - Supercharged wake words and timers

@donburch888 @ginandbacon if you want deeper discussions that apply only to the ReSpeaker Lite product specifically, I suggest you post to the separate thread instead, see:

https://community.home-assistant.io/t/respeaker-lite-voice-assistant-kit-seeed-studio-voicekit-combining-xmos-xu-316-and-esp32-s3/756944

Yes it is interesting, showing a wider interest in better cheaper voice assistant devices … but unless Mike says that this is the hardware in Nabu Casa’s VoiceKit I will just wait for Nabu Casa VoiceKit.

FYI, FutureProofHomes has now also announced a similar XMOS and ESP32-based two-board Voice Satellite hardware development kit for Home Assistant that he is calling the “Satellite1 PCB Dev Kit”:

Satellite1 PCB Dev Kit

The Satellite1 PCB Dev Kit contains the two PCBs necessary to build your own completely private voice assistant & multi-sensor with XMOS advanced audio processing & music playback. Add your own speaker and power supplies.

Satellite1 HAT Board:

This board features 4 PDM microphones, 12 NeoPixel LEDs, humidity/temp/lux sensors, 4 buttons (volume up/down, action button & hardware mute), plus the XMOS audio processing chip and a power DAC for an amplified speaker-out connection or a 3.5mm headphone connection. All remaining GPIOs are also exposed.

The Satellite1 Hat connects easily to the Sat1 Core Board but can also be paired with a Raspberry Pi or a PC/Mac via USB! Perfect for all your voice assistant and audio projects!

Satellite1 Core Board:

The Satellite1 Core Board contains the ESP32-S3 n16r8, USB-C Power Delivery and 40-pin connection. This board attaches to the companion Sat1 HAT Board.

Looks like he has posted a future roadmap showing that he is working on a nice enclosure and more:

Noticed that @FutureProofHomes had a preview video on YouTube mentioning this project as “HomeX” 4 months ago (but at that time he had based the prototype on the wyoming-satellite platform running on a Raspberry Pi instead of on Nabu Casa’s upcoming ESPHome-based voice-kit hardware platform, which runs on an ESP32-S3 and uses an XMOS xCORE chip for audio processing):

PS: The new design reminds me of the “Onju Voice” PCB replacement for the Google Nest Mini (2nd gen), which is an open-source hardware project that I hope someone else will pick up and update now:

Thanks for sharing out @Hedda! Happy to answer any questions you guys may have. Ask away.

We’re aiming to launch before Christmas and detailed documentation is coming. Hit me up if you want to help the core-team and have extensive hardware/firmware skills. We’re excited to launch!

@FutureProofHomes Can you tell which exact SKU of XMOS chip you use? The same was also asked here:

Oh, didn’t see the separate thread over here too. Maybe let’s keep the conversation here since it makes more sense?

Can you tell which exact SKU of XMOS chip you use?

Here’s the actual XMOS chip we’re using:

And also wondering if your PCB(s) will be open-source hardware and/or use OSH/OSHW design?

I just updated the repo to clarify our open-source strategy a bit. In a nutshell, upon launch all the firmware (ESP & XMOS) will be open source and all our hardware schematics will be published too. The KiCad proj. files will follow a delayed open source model (I’ll publish those dates for us), at which point we will then put out the proj. files too. Open to folks’ thoughts on this! And again, if you want to work closely with the core-team then please do ping me!

I read that Nabu Casa’s device will have an audio output jack (3.5mm headphone jack) for connecting external speakers.

The Sat1 has this as well. You’ll be able to power a 25W speaker directly from the device OR plug in external amplified speakers via the 3.5mm headphone jack.

Go on! That would make me a nice Christmas gift!

@FutureProofHomes Really excited to see this.

Oddly specific use-case question: do you think enabling the bluetooth_proxy feature on the Sat1 would lead to any performance degradation? I’m building out room presence detection in my home using Bermuda and would love for any ESP32-based satellite like this I use to also act as a proxy, but I’ve heard of other ESP32 devices becoming unreliable when tasked with music + satellite + Bluetooth.

Bermuda implementation on my end is stable with all the other ESPHome bells and whistles turned on. It’s looking good @SpencerDub!

That’s great news! You may very well have my dream voice assistant in the works; it’s like you built it with exactly my desired features in mind. :wink: Excited for your launch!

That’s the goal. Build the holy grail!

Probably one of the last things on your mind right now, but I was wondering if you were going to try to make this a “Made for ESPHome” device, especially given the latest OTA functionality. One device that most people have heard of is the “Everything Presence One” mmWave sensor. With the new OTA update functionality, users can add these devices to HA and do updates without the ESPHome add-on installed. I imagine Nabu Casa’s version will use this functionality. These are a bit different from the ones you can flash on their site. Here are the requirements:

It just seems like this method would be easier for new users and allow this device to be easily installed. It also appears you meet all the requirements, specifically the software being open source. It would just make it more appealing to newer HA users who may not want to mess around with creating a secrets file for various information. Just a thought.


For all projects
Your project is powered by ESPHome (runs ESPHome as its firmware)

Your project is powered by an ESP32 or supported ESP32 variant such as the S2, S3, C3, etc.

Your ESPHome configuration is open source, available for end users to modify/update

Users should be able to apply updates if your project sells ready-made devices

Your project supports adoption via the dashboard_import feature of ESPHome (see Sharing). In particular:

There are no references to secrets or passwords

Network configuration must assume defaults (no static IPs or DNS configured)

The configuration must be valid, compile and run successfully without any user changes after adopting it.

Use of remote packages in the YAML is permitted only if the above criteria are met.

Your product name cannot contain “ESPHome” except in the case of ending with “for ESPHome”
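The adoption requirements above (no secrets or passwords, default network settings) lend themselves to a quick automated check. Here is a minimal sketch in Python, assuming the configuration is available as plain YAML text. The function name `find_adoption_problems` and the patterns it looks for are my own illustration, not an official ESPHome tool, and it matches line by line instead of parsing the YAML, so treat it as a rough pre-flight check only.

```python
import re

# Illustrative patterns only (an assumption, not an official ESPHome rule set).
CHECKS = [
    (re.compile(r"!secret\b"), "references a secret"),
    (re.compile(r"^\s*password\s*:\s*\S"), "sets a password"),
    (re.compile(r"^\s*(manual_ip|static_ip|dns1|dns2)\s*:"),
     "configures static IP or DNS"),
]


def find_adoption_problems(yaml_text):
    """Return (line_number, reason) pairs for lines that look non-compliant."""
    problems = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        # Naively drop trailing YAML comments; a '#' inside a quoted
        # string would also be cut, which is acceptable for a rough check.
        code = line.split("#", 1)[0]
        for pattern, reason in CHECKS:
            if pattern.search(code):
                problems.append((number, reason))
    return problems
```

Running it over a config before publishing gives a list of line numbers to review; an empty list only means none of these particular patterns were found, not that the config meets every requirement.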

Updates via http_request

Update Entities
So, we created update entities. These are similar to the ones that Home Assistant shows now when you have the ESPHome Add-on installed in Home Assistant OS, except those ones show you an update to the version of the ESPHome Add-on and in the background will compile and upload new firmware to your device.

These new update entities are a bit different. If you have acquired a device that was pre-installed with ESPHome, the vendor you acquired the device from is now able to compile the firmware and host it on a website along with a description of the firmware the device can read and present that there is an update available for this device. You do not need to adopt the device into the ESPHome dashboard, and you don’t actually need the ESPHome dashboard installed. Using the new http_request OTA platform, the device will be able to download the firmware and update itself.
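The flow described above boils down to: read a vendor-hosted description of the firmware, compare its version with what is installed, and offer the download if it is newer. Below is a minimal Python sketch of that comparison. It is not ESPHome's actual implementation; the manifest layout and field names (`version`, `builds`, `chipFamily`, `ota.path`) are assumptions for illustration, and `check_for_update` is a hypothetical helper.

```python
import json


def parse_version(text):
    """Turn '2024.10.1' into (2024, 10, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_for_update(manifest_json, installed_version, chip_family):
    """Return the firmware path to download, or None if nothing applies.

    The manifest structure used here is assumed for this sketch.
    """
    manifest = json.loads(manifest_json)
    if parse_version(manifest["version"]) <= parse_version(installed_version):
        return None  # already up to date
    for build in manifest.get("builds", []):
        if build.get("chipFamily") == chip_family:
            return build["ota"]["path"]
    return None  # newer version exists, but not for this chip
```

Comparing versions as tuples of integers avoids the classic string-comparison trap where "1.10.0" would sort before "1.9.0".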

Will definitely look into this! I’m somewhat aware of this program but it looks right up our alley. Thanks for the tip @ginandbacon.

@FutureProofHomes
Wondering whether the Satellite1 will support the following use case. I would like to be able to have separate wake words for separate functions. For example, “Jarvis” to open communication with Home Assistant, “speech” to open communications with a program running on a different server, and so on. I would like one device that serves both as a Home Assistant controller and as a microphone input for a separate program with separate speech-to-text capabilities. Ideally, the voice activity detection and wake word detection would occur on the Satellite1 device, and the digitized audio could be directed to my Python program through a network communication technique such as a websocket. And I would like the option of keeping the speech-to-text going until there is a new command to shut it off. That command could either be detected by the Satellite1 or by my own program, which could signal the Satellite1 via websocket to stop sending speech. That’s my “Holy Grail”.

The Sat1 hardware won’t necessarily unlock that feature, but with a little hacking it should be possible to build what you’re describing today, I think? Perhaps you could use the multi-wake-word and multi-pipeline mapping feature and have one of the wake word/pipelines fire up a UDP stream (you’d have to custom-build a UDP streaming ESP component) targeting the correct STT endpoint for your application.

Just thinking out loud.
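To make the idea a bit more concrete, here is a minimal Python sketch of the receiving side: a listener that collects audio datagrams until a stop marker arrives. Everything specific in it is hypothetical: the `b"STOP"` marker, the assumption that each datagram carries raw audio bytes, and the port in the usage note. The custom ESPHome streaming component that would send the data is not shown and would still need to be written.

```python
import socket

STOP_MARKER = b"STOP"  # hypothetical control message, not a real protocol


def collect_until_stop(datagrams):
    """Join audio chunks from an iterable of datagrams, stopping at STOP_MARKER."""
    chunks = []
    for data in datagrams:
        if data == STOP_MARKER:
            break
        chunks.append(data)
    return b"".join(chunks)


def udp_datagrams(host, port, max_datagram=4096):
    """Yield datagrams received on a UDP socket bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, _sender = sock.recvfrom(max_datagram)
            yield data
```

Something like `audio = collect_until_stop(udp_datagrams("0.0.0.0", 6055))` would then block until the stop marker arrives (the port number is made up). Keep in mind that UDP can drop or reorder packets, so a real implementation would probably want sequence numbers, or a websocket as suggested in the question.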

Oh I see. So this looks like it may be doable. I assume you will be using a highly modified version of voice kit. Or a replacement.

Should actually be more of a slightly modified version of voice-kit. But it’s a bit hard to be 100% sure because everything is happening so fast on both projects. I’ll make a point to catch up with the HA voice guys here really soon to ensure we’re on the same path. Our FutureProofHomes/Satellite1-ESPHome repo should go live really soon.

@FutureProofHomes maybe it would be a good idea to create your repo as a downstream fork of esphome’s voice-kit repository and then just add your own patches and modifications to your fork? That way it should be easier to keep up with the HA voice guys’ upstream repo, and even simpler to submit patches upstream to them. See → GitHub - esphome/voice-kit

Not a bad idea at all. We’re working on getting our GitHub Actions Build & Release strategy dialed in so our ESPWebTools and releases are automated… once our repo stabilizes I’ll see what the viability of a downstream fork might look like depending on the deltas in our repo and firmwares. Thanks for thinking out loud! I like it.

Seems like a few boards are coming, this is another one. More based on the concept of a multi-sensor and local voice assistant…