The era of open voice assistants has arrived

Can this device use a cloud speech-to-text service other than Home Assistant Cloud, for example Google STT? I’ve been using Google’s cloud STT and TTS successfully with an M5Stack Atom Echo (with the Atom software modified to send speech to a Google Home speaker). I don’t need Home Assistant Cloud at all (I have my own domain and SSL certificate), and Google STT is very cheap for my usage (a few dollars over several months).

If an Assist configuration works on the Atom, will it also work on the new HA device, or is that device limited to HA Cloud STT/TTS only?

That’s not the release thread. It is a history and “state of play” blog post.

You can use Vosk (with dict) or rhasspy-speech for this task.
These STT engines run fast even on weak devices.
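
As a rough illustration of the Vosk suggestion, here is a minimal Python sketch. It assumes that "with dict" refers to Vosk's restricted-vocabulary mode, where the recognizer is given a JSON list of allowed phrases. The helper names `build_grammar` and `transcribe`, the chunk size, and the file paths are hypothetical; as far as I know, the phrase list only works with Vosk models that support a dynamic graph (the small models), so treat this as a starting point rather than a tested setup.

```python
import json
import wave


def build_grammar(phrases):
    """Return the JSON phrase list that Vosk accepts as a restricted vocabulary."""
    # Lower-case, collapse whitespace and de-duplicate while keeping order.
    seen = []
    for phrase in phrases:
        normalized = " ".join(phrase.lower().split())
        if normalized and normalized not in seen:
            seen.append(normalized)
    # "[unk]" lets the recognizer report anything outside the phrase list.
    return json.dumps(seen + ["[unk]"])


def transcribe(wav_path, model_path, phrases):
    """Transcribe a 16-bit mono WAV file, limited to the given phrases."""
    # Imported here so build_grammar() works even without Vosk installed.
    from vosk import Model, KaldiRecognizer

    with wave.open(wav_path, "rb") as wav:
        recognizer = KaldiRecognizer(
            Model(model_path), wav.getframerate(), build_grammar(phrases)
        )
        while True:
            chunk = wav.readframes(4000)  # arbitrary chunk size
            if not chunk:
                break
            recognizer.AcceptWaveform(chunk)
        return json.loads(recognizer.FinalResult()).get("text", "")
```

Keeping the phrase list short is what makes this approach cheap on small hardware: the recognizer only has to choose between a handful of commands instead of searching an open vocabulary.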

Or you can offload local processing to a more powerful system, which is what you’d want to do if you want locally hosted AI as well as faster responses to your commands.

For a non-preview edition or a future hardware revision, would you please consider adding an ESP32-H2 (or ESP32-C6) SoC as a “coprocessor” and secondary IoT radio module? The same PCB would then carry both an ESP32-S3 and an ESP32-H2, so that the second SoC could be used as a dedicated Thread Border Router for Home Assistant.

The main real-world reason to add an ESP32-H2 (or ESP32-C6) module, beyond using it as a generic coprocessor MCU for offloading work, is that it has an IEEE 802.15.4 radio. That means it can be used as a “Thread Border Router” (with OpenThread Border Router firmware) for a Thread network used by the Matter integration in Home Assistant.

Adding such an ESP32-H2 (or ESP32-C6) SoC or module with its own antenna would take some space on the PCB, but it should not add much to the BOM cost since ESP32 chips are not expensive, and I think the additional possibilities such an extra ESP32 SoC would open up should more than make up for the slightly higher cost! If you go with the slightly more powerful ESP32-C6, you could perhaps offload some other processes too (like maybe any sensors connected to the Grove port).

Alternatively, in the future you may also be able to use the ESP32-H2 (or ESP32-C6) on one of these devices as a remote Zigbee Coordinator (also known as a Serial-over-IP Zigbee controller adapter) for Home Assistant’s built-in ZHA integration (native Zigbee gateway). See/follow this work-in-progress, but note that the ESP Zigbee radio library for zigpy is still very experimental and not yet fully working with the zha project.

Plug your own speaker into the 3.5mm port.

Excited to try this out! I will probably wait until French is supported though.
Would love a round case design also! :slightly_smiling_face:

That would be a killer device

And if I could put that in a chassis that had bangin’ speakers. Maybe a Squeezelite-compatible player…

Is PoE support on the roadmap?

It’s a real shame. This could have changed the entire voice landscape for a LOT of people.

Looks like I won’t be replacing my Google system in a hurry.

Who will the initial retailers be in Australia?

Loads of stock, buy it today… unless you live in the UK, where both resellers are out of stock and on pre-order!!!

Same with France and Germany.

It seems that Seeed Studio stock is sold out. No more previews for Australia. Super sad I’ve missed out. I’ve been following all the announcements but apparently wasn’t fast enough.

It’s hard to tell the difference since it duplicates almost all of the info contained here.

Either way, it’s not that big of a deal. Just an observation.

I do own a 3D printer. I find that to get a high-quality finish, I need to use spray paint. If you want a black case, you should try painting a white case.

Reduce the layer height to 0.12 or 0.1 mm and the finish will be much better.

Does Voice PE have BLE proxy functionality out of the box?

Why do you ask? Read the source.

They have called it the Preview Edition, but I think the hardware design is probably set in stone.
Some of the sales speak was likely optimism, but at least this time it is sold as a preview of what it says is the future of voice assistants.
Google is some way ahead of that: without doubt, the targeted voice extraction of the later Nest Audio devices outperforms what we just saw in the preview video.
We never got any demonstration of far-field use or of handling third-party media noise, which the current crop of closed-source devices does quite well.
Google has halted all Assistant development apart from local models running on accelerators such as the Tensor chip in its Pixel devices.
Nearly all the big players are moving away from cloud devices as they don’t generate revenue for them, but hardware costs have limited that move to mid-range to flagship phones and tablets, whilst they still sell the original cloud-based devices.

HA seems to be tackling the problem by building up to LLM-driven accelerated devices, but yeah, a stereo mic beamformer, even if powered by that XMOS chip, is a considerable way behind Google & Apple.

There were some basic, elementary 101 errors in its design, and this hoovering up of permissively licensed code to refactor and rebrand just wastes time, whilst training existing software with new language models and capturing voice data would be faster.

I keep repeating a request to allow an opt-in to collect data on-device and submit it in batches to HA, as open source still has some very poor-quality datasets compared to what big data has.
Until that has been overcome, open source will be fudging models with synthetically created data that just isn’t the same as real-world capture.

Even what is being done now is essentially wrong and will create a poor dataset, as I posted in an issue that was just closed as completed…

It’s going to be a long haul before some of the sales speak becomes anywhere near true, but at least there is an effort being made in the open-source community.
Open source is still making 101 errors whilst the likes of Google are past PhD level, with many active employees conducting cutting-edge development in this field.

HA do their best with what they have, even if some of the errors are frustrating at times.

Until now, the ESPHome media player could not play radio streams with AAC+ encoding, only MP3 streams (squeezelite on ESP32 plays AAC+).
Is there any hope that this new hardware also brings improvements to media player capabilities?
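
If you are not sure which encoding a given radio stream uses, a small Python sketch like the one below can help you check before pointing a player at it. It is only an illustration: the function names are hypothetical, the MIME type list is my assumption about what stream servers commonly send, and the byte check only looks at a single frame header (ADTS for AAC, MPEG Layer III for MP3), so it can be fooled by data that starts mid-frame.

```python
def codec_from_content_type(content_type):
    """Map an HTTP Content-Type header value to a coarse codec name."""
    mime = content_type.split(";")[0].strip().lower()
    if mime in ("audio/aac", "audio/aacp", "audio/x-aac"):
        return "aac"
    if mime in ("audio/mpeg", "audio/mp3"):
        return "mp3"
    return "unknown"


def codec_from_frame_header(data):
    """Guess the codec from the first two bytes of an audio frame."""
    if len(data) < 2 or data[0] != 0xFF:
        return "unknown"
    second = data[1]
    # ADTS (AAC): 12-bit sync word followed by layer bits that are always 00.
    if (second & 0xF6) == 0xF0:
        return "aac"
    # MPEG audio: 11-bit sync word, layer bits 01 mean Layer III (MP3).
    if (second & 0xE0) == 0xE0 and ((second >> 1) & 0x03) == 0x01:
        return "mp3"
    return "unknown"
```

For example, `codec_from_content_type("audio/aacp")` returns `"aac"`, which would tell you the stream is one of the AAC+ streams mentioned above rather than MP3.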