Poll: What's your biggest struggle with voice control right now?

mjolsen74 · February 11, 2024, 12:12am

I think the lack of follow-up ability. When you start using an LLM (GPT-3/4) and it asks for more info or a follow-up, you’re out of luck… (I did hear someone say to just say the wake word and answer, but I’m not sure that works.)

Really need local processing for what HA can do and have a fallback option if HA doesn’t know, then let another assistant pipeline work… Meaning, I want to control everything in HA locally, but also use it for conversational things.
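A rough sketch of the "local first, then fall back" idea described above, in Python. Everything here is hypothetical: the handler names and canned replies are made up for illustration, and this is not Home Assistant's actual pipeline API. Each handler returns a reply or `None`, and the first handler that answers wins.

```python
from typing import Callable, Optional

# A handler returns a reply string, or None when it cannot handle the request.
Handler = Callable[[str], Optional[str]]


def local_intents(text: str) -> Optional[str]:
    """Hypothetical stand-in for the local agent: only knows a fixed set of commands."""
    known = {
        "turn on the hallway light": "Turned on the hallway light.",
        "turn off the hallway light": "Turned off the hallway light.",
    }
    # Normalize case and trailing punctuation before the lookup.
    return known.get(text.lower().strip(" .?!"))


def conversational_fallback(text: str) -> Optional[str]:
    """Hypothetical stand-in for an LLM-backed agent: always produces some answer."""
    return f"(fallback) I'd need to look that up: {text}"


def respond(text: str, handlers: list[Handler]) -> str:
    """Try each handler in order and return the first non-None reply."""
    for handler in handlers:
        reply = handler(text)
        if reply is not None:
            return reply
    return "Sorry, I didn't understand that."


if __name__ == "__main__":
    chain = [local_intents, conversational_fallback]
    print(respond("Turn on the hallway light.", chain))
    print(respond("Who wrote Dune?", chain))
```

Ordering the list this way keeps device control local and only hands the request to the conversational agent when the local one has no match.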

Rich37804 · February 26, 2024, 1:28am

Saying the wake word after follow up works.
What also works: “Is the hallway light on?” (answer is yes), (Wake Word) “Turn it off please”

marisa · February 28, 2024, 12:39am

If you don’t mind custom components, there are solutions for this.

I made one for example:

As for the point of this thread: my second most used Alexa feature, setting timers and reminders by voice, isn’t available out of the box, and that’s a problem for me.

I have implemented timers manually, but it was a hassle to set up, and having it built in would be nice.

ra31el71 · March 21, 2024, 1:07pm

My voice assistant doesn’t recognize my daughter’s and wife’s voices, only mine! It seems that female voices are a challenge for it.

Joe31388 · March 21, 2024, 5:36pm

Using the voice assistant on wearOS is very hit or miss. It rarely gets anything wrong but sometimes picks up background noise like a TV. Also, it does very poorly at recognizing when the phrase has ended and continues to listen way longer than needed.

It’s still a very impressive feat and looking forward to it being more polished in the future!

freakshock · March 22, 2024, 1:00pm

Check out the voice reminders blueprint I have just posted.

SNett · April 15, 2024, 5:49am

Anyone know how to add an audio acknowledgement on successful wake word detection?

I’m not always able to look at the LED to see if it’s listening.

For example, (“yes sir?”) after triggering the wake word.

pepe59 · April 15, 2024, 9:19am

Has anyone found a solution to send audio feedback to another player?
On esp32-s3-box3 the sound is very muffled and cuts off the first words.

luk1 · April 19, 2024, 6:48am

My voice commands are often not understood; it works much better with Alexa. I hope this will get better in the future, because I want to get Alexa out of my house.

mongo365 · May 1, 2024, 8:36pm

Is anyone else seeing that they can control almost all of their devices except those using the Zigbee2MQTT Add-On?

tjiho · June 19, 2024, 6:52pm

Your audio lasts 15 seconds, perhaps because HASS doesn’t detect the end of speech and cuts at 15 seconds of listening.

Check that your audio is not too loud, which may be the reason why the VAD doesn’t detect the end of speech.
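To illustrate why loud audio could keep the assistant listening until the 15-second cutoff, here is a toy end-of-speech detector in Python. It assumes a simple fixed loudness threshold, which is not necessarily how the VAD in HA works; the function name, frame size, and threshold values are all made up for the example.

```python
from typing import Iterable, Tuple


def listen(
    frame_levels: Iterable[float],
    frame_ms: int = 100,
    silence_threshold: float = 0.1,
    silence_ms: int = 700,
    max_ms: int = 15000,
) -> Tuple[int, str]:
    """Return (milliseconds listened, reason) for per-frame loudness values in 0.0-1.0."""
    elapsed_ms = 0
    quiet_ms = 0
    heard_speech = False
    for level in frame_levels:
        elapsed_ms += frame_ms
        if level >= silence_threshold:
            # Loud frame: counts as speech and resets the silence counter.
            heard_speech = True
            quiet_ms = 0
        elif heard_speech:
            # Quiet frame after speech: accumulate trailing silence.
            quiet_ms += frame_ms
            if quiet_ms >= silence_ms:
                return elapsed_ms, "silence"
        if elapsed_ms >= max_ms:
            # Hard cutoff, like the 15 seconds mentioned above.
            return elapsed_ms, "timeout"
    return elapsed_ms, "stream ended"


if __name__ == "__main__":
    # 2 s of speech, then quiet background: ends on silence.
    normal = [0.6] * 20 + [0.02] * 200
    # Same speech with the gain too high: background stays above the threshold.
    too_loud = [0.9] * 20 + [0.15] * 200
    print(listen(normal))
    print(listen(too_loud))
```

In the second case the background never drops below the threshold, so the only thing that stops the recording is the timeout.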

styphonthal · June 20, 2024, 12:23am

I would like to be on the positive side of the s3-box3. I am using “bubbas” firmware, with Marissa’s “fall back conversation” (HA → openai gpt3 turbo) and it really has helped. The fall back helps specifically due to openai struggling with basic tasks. The firmware also opens up the speaker instead of the stock sound level which was very quiet.

vv_rockhound · June 20, 2024, 1:57am

Have never been able to get it to work - ever. I suspect I am missing some critical concept or add-on.

Rofo · July 25, 2024, 12:23pm

Where can I get hold of this… Google is not playing ball.

styphonthal · July 25, 2024, 1:01pm

whc2001 · August 30, 2024, 11:11pm

I decided to completely ditch the concept of voice control because it’s simply unviable. I have tried different hardware (computer microphone, headset, ESP32 I2S), different ASR backends (Whisper, Vosk, Rhasspy) and different languages (English, Chinese), and the result below is always the same. I have even tried an official Amazon Echo, and it has trouble distinguishing “On” and “Off”. My conclusion is that voice control is worth nothing, and it’s just MUCH MUCH quicker to take out the phone, open the companion app and click a button.

[Screenshot: 2024-08-30 191026]
[Screenshot: 2024-08-30 190652]

Rofo · August 31, 2024, 9:16am

@marisa - I’m using BigBobbas modified ESPhome code:-

It massively improves on the basic s3 box3 code and supports a timer and exposes a media player.

Like @dza though, I find that the box3 often locks up if it doesn’t quite understand what you are saying.

On-device wake word detection is pretty good, just not quite there to be usable by the rest of the family. You have to learn how to speak to it for it to be more reliable.

I found that pushing beam size to 5 made a huge difference, but I’m just guessing at the settings.

sparkydave · October 9, 2024, 11:18pm

My latest issue is that my ESP32-Box has stopped responding to the wake word. I re-flashed it with the stock code from the ESPhome projects page to see if it would help, but nothing.

In the ESPhome setting page the wake word location selector is greyed out as ‘unavailable’ which seems like the only hint to what is going on.

Pkkrusty · October 22, 2024, 8:16pm

I’ve been trying out a basic ESP32-S3 with INFP mic setup, and the biggest frustration I have right now is the STT errors. I’m curious if there’s anything out there like a simplified model that only recognizes a subset of words associated with an automated home? For example, I ask Jarvis to turn on the living room lamp, and it thinks I said “Turn on the learning room light”. The word “learning” will NEVER be used when I am voice controlling things around the house, so I’d love to not even have that as an option for the Whisper model to choose from.

Of course, with a general LLM, you need the full breadth if you’re going to ask it random questions, but that could be its own pipeline.

*edit Vosk STT seems like a pretty good option for limiting what can be recognized.
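As a stopgap for the "learning room" problem, one option is to post-correct the transcript against a closed word list after STT has run. The sketch below uses Python's standard `difflib` for that. To be clear, this is not what Vosk or Whisper do internally (it does not restrict the model itself), and the word list, function name, and cutoff are assumptions picked for this example.

```python
import difflib

# Hypothetical closed vocabulary: the only words a home command should contain.
VOCABULARY = [
    "turn", "on", "off", "the", "living", "room",
    "lamp", "light", "hallway", "kitchen", "bedroom",
]


def snap_to_vocabulary(transcript: str, vocabulary=VOCABULARY, cutoff: float = 0.5) -> str:
    """Replace each out-of-vocabulary word with its closest vocabulary word, if any."""
    fixed = []
    for word in transcript.lower().split():
        if word in vocabulary:
            fixed.append(word)
            continue
        # get_close_matches returns the best candidates scoring at least `cutoff`.
        match = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        # Keep the original word when nothing is similar enough.
        fixed.append(match[0] if match else word)
    return " ".join(fixed)


if __name__ == "__main__":
    print(snap_to_vocabulary("Turn on the learning room light"))
```

With this word list, "learning" gets pulled to "living" because it is the only entry above the cutoff. Note it cannot recover "lamp" from "light", since "light" is itself a valid word; only restricting the recognizer would help there.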

KE55ARD · December 2, 2024, 1:10pm

I’m having this exact experience, custom board using S3 chip and on board mic. Works absolutely fine as far as detecting “Hey Jarvis” using openwakeword, but then NEVER transcribes what I’m saying correctly.

I’ve installed Rhasspy but I’m not sure what to do from there in terms of setting up Vosk.