2023: Home Assistant's year of Voice

Exactly that. In their own language, as stated in the opening post:

(emphasis mine)

If you want more than that and don’t care about cloud privacy by all means keep using Alexa and/or Google Home. The integrations will continue to be supported as long as the APIs are available.

In short, stop cluttering up this topic with comments beyond the clearly defined scope of this project.

Hardware may be a legitimate concern.

1 Like

Alexa may be in maintenance mode, but they are certainly not going to turn it off. Likewise for Google, though they use the training data on their handset VR, which is certainly core to the Android experience, and they are investing in new devices as well, so I don’t think that is going anywhere. Google certainly is not willing to lose the same amount of money on subsidized hardware as Amazon, and I would expect Alexa devices to be more expensive going forward.

Certainly if they made VR support a paid feature or shut it down there would be a huge amount of user outrage, but it seems extremely clear that HA is not trying to be a competitive smart speaker, so even if those platforms were shut down, Rhasspy would not be a replacement for them.

I do find this direction a bit hard to understand, but the leaders in HA certainly know their user base better than I do, including the distribution of users’ languages, so maybe the TAM is significant. I wish folks luck in any case.

No one knows that. Especially Google is well known for killing unprofitable projects on a whim. I personally think that they will all turn this into a paid subscription based service eventually, maybe with the exception of Apple. But they could just as well shut it all down and write it off.

Uhm, they made this extremely clear right in the opening post. No one ever said they wanted to make a full featured smart speaker. Rhasspy would be a replacement to control HA. It’s you who is misinterpreting this.

Yeah, me too. But I’m biased against voice assistants. I think it’s mostly marketing tbh, like the (failed) Almond thing some time back. But I don’t mind as long as they don’t throw all their dev resources at it, neglecting the more important things, like the badly needed UI improvements people have already mentioned. They hired the Rhasspy guy, so it’s probably going to be mostly him on that part of the project.

[As a sidenote, it drives me crazy how you use ‘VR’ as an abbreviation for voice recognition, as it automatically maps to Virtual Reality in my brain, needing additional mental post processing ;)]

And a lot of volunteers for local language support. 59 at last count.

1 Like

I mean, it’s not going to take away dev time from other core devs.

1 Like

I believe the plan is that the NC devs will mostly only be involved in PR review/approval for this.

Would make sense. Voice processing is a pretty specialized field, it’s hard to just jump in without prior knowledge (which only Mike and his contributors have).

BTW, Polish is supported by the Google homes, one of 61 or so languages that appear to be supported. Have you tried it and it worked badly? Not all the languages work as well as others…

Sounds like an integration. If HA can do it then so can its voice assistant.

Good one. Not sure about this. It’s certainly possible for HA’s voice assistant to fall back on web search for things that don’t correspond to a specific action and try to come up with an answer. But I’m not sure that’s in the plans; it seems unlikely.
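To make the "fall back on web search" idea concrete, here is a minimal sketch of how such routing could look. Everything in it is an assumption for illustration: the sentence templates, the intent names and the `web_search_fallback` placeholder are hypothetical and are not Home Assistant's or Rhasspy's actual intent format.

```python
import re

# Hypothetical sentence templates, invented for this sketch.
INTENTS = [
    (re.compile(r"^turn (?P<state>on|off) (?:the )?(?P<name>.+)$"), "set_power"),
    (re.compile(r"^what(?:'s| is) the temperature in (?:the )?(?P<name>.+)$"),
     "get_temperature"),
]


def web_search_fallback(text):
    # Placeholder: a real fallback would hand the query to some search backend.
    return {"intent": "fallback", "query": text}


def handle(text):
    """Match a spoken sentence to a known intent, else fall back."""
    cleaned = text.strip().lower().rstrip("?.!")
    for pattern, intent in INTENTS:
        match = pattern.match(cleaned)
        if match:
            # Known home-control sentence: return the intent and its slots.
            return {"intent": intent, **match.groupdict()}
    # Nothing matched a specific action, so defer to the fallback.
    return web_search_fallback(cleaned)
```

With this sketch, `handle("Turn on the kitchen light")` resolves to the `set_power` intent, while a general-knowledge question ends up in the fallback branch.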

Well I mean HA has had good TTS options for a while. People use it all the time to announce things from their speakers. So if you’ve connected speakers to HA I don’t see why not.

Possibly. HA has persistent notifications and the conversation integration does responses, so it seems feasible that you could ask it to read them to you. Of course, whether the device shows a yellow ring is a different question, since there’s no specific HA speaker.

Doubtful. These require devices with specific features beyond just a mic to be the source of voice input. Seems unlikely to be a focus.

You could certainly make or buy a device that listens for glass breaking and tells HA though. Kind of unrelated to voice stuff.

Good list though.

Some time ago I bought a Google Home voice assistant in the hope that it would support my language; it didn’t.
Today I use it as a photo frame and for basic phrases to turn on the lights and TV (language barrier).
I am modest and will be very pleased if HA allows me to control lights, sockets and TV devices in my native language. I don’t need to ask what time it is or what the temperature is; I can look it up.
A simple voice control option, not a smart speaker.

For TTS and for Google Assistant on mobile devices, yes. But you can’t talk to the Google speakers in Polish :frowning:

I would imagine we would be seeing some slightly different replies on this post if Amazon or Google moved to a subscription only service for their voice assistants to plug the hole in their finances!

I like the idea of a local voice assistant, but probably won’t use it often. For me, the second word of ‘home automation’ is the important one. This means reducing interaction. Lights, climate control (heating, fans, humidifier and awning) and alerting should operate based on sensors and of course a well-written automation.

Yes I do have several Google ‘smart’ speakers, but I use them mainly to talk to me. The gimmick of asking them questions and getting answers is nice, but a search on my mobile gives me more (and more extensive) answers.

2 Likes

Exactly my thoughts and my use case too: a nice possibility, but not a priority for me.

2 Likes

For example I’m not going to ask HA who was the 23rd president of the USA… But you can ask Siri or Alexa and expect the answer. Of course, from the perspective of home automation, there will be tons of common instructions, like turning the light on. This is core functionality for HA, but only a small subset of the big boys’ assistants.

It’s awesome to see Rhasspy integrate with Home Assistant and I’m thrilled by this.

We indeed need open hardware as well. It’s one thing to build a quick DIY setup with a Raspberry Pi Zero W with a mic and Rhasspy, it’s another to set up a system as good as the commercial standard, with mic arrays for instance. Also it remains to be seen where and how the complex processing of the sound will be done. Ideally we would just have to create an object that registers as a microphone entity in Home Assistant, and the heavy sound processing would be offloaded to the Home Assistant server. That way hardware could be very generic, minimal, easy to DIY and it would be easier (or less hard) to repurpose commercial devices. However I’m just not sure if that design is feasible at all, mainly because at least some sound processing still has to be done on the device I guess.

The dream goal: connect a mic array to an ESP8266 / ESP32, flash it with ESPHome and voilà! Since this kind of device would be in every room, we can take advantage of ESPHome to add any other sensor we would like. I’m just not sure we could have an ESP playing the recorder and speaker role at the same time.
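As a rough illustration of the split described above (a dumb microphone device, heavy processing on the server), here is a small sketch. It is only a sketch under stated assumptions: the frame size, the function names and the simple energy threshold are all hypothetical, the energy check merely stands in for real wake word detection or speech-to-text, and none of it reflects how ESPHome, Rhasspy or Home Assistant actually handle audio.

```python
import math
import struct

FRAME_SAMPLES = 320  # assumed frame size: 20 ms of 16 kHz mono audio


def device_frames(pcm, frame_samples=FRAME_SAMPLES):
    """Device side: only slice raw 16-bit mono PCM into fixed-size frames."""
    step = frame_samples * 2  # two bytes per 16-bit sample
    for start in range(0, len(pcm) - step + 1, step):
        yield pcm[start:start + step]


def frame_rms(frame):
    """Server side: root-mean-square energy of one little-endian 16-bit frame."""
    samples = struct.unpack(f"<{len(frame) // 2}h", frame)
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def server_detect_speech(frames, threshold=500.0):
    """Server side: flag frames whose energy exceeds an assumed threshold."""
    return [frame_rms(frame) > threshold for frame in frames]
```

The point of the design is that the device does nothing beyond cutting audio into frames, so all the expensive work could in principle live on the server. Whether a real ESP can stream audio reliably enough for that is exactly the open question raised above.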

For others, the most important would be to use existing commercial devices.

Given the state of hardware, in a follow-up post it would be great to have a call for volunteers on this front. HA can really make the difference between isolated, small individual efforts and a whole coordinated plan :grinning:

3 Likes

They don’t use supercomputers - they use standard x86-based servers running Linux, just like most of the rest of the world. They have vast numbers of them to cope with the vast numbers of users and concurrent requests, but I suspect that a single user request is handled by a single thread.
What they do have is access to a vast sample of user voices and phrases that they can train their ML models with, and that’s where their power lies. Reproducing that locally will be a challenge, but not impossible as Rhasspy is already demonstrating.

I totally agree with this. Voice Recognition as a remote control is highly overrated - automation should be automatic. However there are times when it’s nice, like opening a gate when a friend is coming over, or changing the temp in a room when you feel colder than normal. Most everything else is handled automatically by automations in our house. This is the same reason we don’t have status displays hung up on walls - we don’t generally use HA as a remote control.

But we don’t use our smart speakers for HA control as much as other things - setting timers in the kitchen while cooking, asking about traffic for commute, or weather forecasts, playing music, or just answering questions that come up during discussions, etc… The smart speaker does way more than just HA control, which is why even if people started getting charged for it (not going to happen at least in Google’s case), many would pay to keep that functionality.

Still, even if you wanted a dedicated voice assistant for narrow HA control, that needs some decent hardware with array mics, and it can’t be hidden if it’s going to work well. An unreliable voice assistant is not going to be used much - it creates a ton of frustration! That’s where I was before moving to Google hubs and HA. Being out in front on counters etc… means it has to look pretty nice, or at least it does in most homes. Having some 3D printed thing that looks out of place on a counter is not going to fly. And since you’ll want it in most rooms, it needs to be pretty cheap as well. That’s hard to do without modest volumes for production.

The software piece is hard to do well, but the hardware here may be just as hard, especially if it doesn’t have a lot of volume. I don’t think it will displace any smart speakers, as the team is very clear about limited functionality and an emphasis on language support instead of equivalent functionality to existing smart speakers, so I’m not sure what the volume will be like for a device like this.

1 Like

Some will use the features and some will not.
You cannot really say much from the current situation. Just think about the media feature in HA.
Before it was introduced many had YouTube, Spotify, Sonos speakers, Mopidy, MPD, SnapCast and a lot of other setups, yet the media features were still introduced and welcomed.

As has been pointed out already, I can’t really see this being of massive use unless there is some decent reasonably priced hardware to place in each room to make use of it.

I think step 1 should be getting a hardware specialist to create some kind of daughter board to replace Amazon Echo / Google Home Mini guts with an ESP32, while making use of the casing, mics, amp, speaker etc they already have… if something like that existed I’m sure people would be much more inclined to put this to use… as not many people want to have Raspberry Pis in each room with an array of expensive add-ons to make them work, when they can get an Echo Dot for £30…

2 Likes