Mosher
Amazing stuff as always!
Thank you
On beta 1 now and I must have missed something. Yes, the light is light.dining_table and its name in the GUI is dining table. I guess it is beta.
EDIT: Actually, I checked. It is shown in the GUI as dining table lights (plural), and asking it to turn on the dining table lights works.
When I saw your first announcement of "Year of the Voice" I was thinking I could maybe contribute here, but the fact is I'm Danish and have lived the last half of my life in Sweden, so my spoken language is a mess; not even Alexa/Google get it right, and I'm too old to try to improve my speech (yeah, I also mumble a lot). So I wonder, is there any hope that HA would be able to interpret Swedish with a strong Danish dialect?
I tried with 3 languages in Google Assistant and it was total chaos. I never knew whether it understood what I said: did it think I spoke Danish or Swedish, or was it so confused that I just got an English answer back? You get my "concern"? Well, it's not really a concern, I dropped it. Besides, starting every command with "Hey Google" or "OK Google" makes me feel like I'm repeating myself in a ridiculous way. I think it's better I stay out of this topic until there comes a voice assistant that learns my voice, only responds to my voice, and interprets commands whether they are Danish, English or Swedish, and then answers back in e.g. English (or a language of choice).
Great work so far! I have a question from the perspective of an integration developer. Will it be possible for integrations to provide intents? For example, the Mazda integration has a button entity that starts the car engine. Could the integration provide a customized intent that enhances the user experience by allowing users to say “Start my car” rather than a generic (built into HA) sentence that I assume would be something like “Activate/press the ‘start engine’ button”? Or could someone theoretically build a joke-telling integration that responds to “tell me a joke”?
Ah, glad you figured it out! This is a great example of what we'll be focusing on next: generating different forms of entity names. For English this is usually just singular and plural, but it gets much more complicated for other languages.
It should be possible to “fine tune” a speech to text model for this purpose. This would be local only though, of course.
Yes! It’s already possible for integrations to register intents, but they can’t easily add custom sentences yet in code.
Thanks, love your work. I've been fiddling with Rhasspy on and off for ages. Never got much done, though. Incentives now.
Yeah, that's a nice idea. Adding support for binding the triggering of the assistant to a watch hardware button would be even better. And perhaps also submitting the query after you have stopped speaking, like Google Assistant does. Then there is no need for a confirmation tap.
Is there a way to activate the voice assistant by speaking, and not just by clicking on the Assist icon?
I watched the live stream and I think it's awesome how Home Assistant is evolving.
I was thinking about voice, though, and everything being local. A few months ago I came across something that might already have local voice control, but it seems to only be available at a premium and is marketed towards the high-end smart home market like Savant, Control4, ELAN, etc. The product I found was Josh.ai. If Home Assistant can develop something similar at a much lower price point, I'd be in.
Also, how would HA's approach be different than Josh.ai's?
Hey, that's great news. Do I get it right that this part only covers the interpretation of the spoken words, but the recognition is left to Google or Apple?
If we could make this work with an offline speech recognition (“whisper” is very good and open source!), that would be a tremendous advantage over other voice assistants, because people do care about privacy!
Wow, This is just the start of the year and this looks already awesome. Wonder what it will look like at the end of 2023. Keep up the good work.
Really looking forward to playing with this and seeing if it can replace Alexa for a lot of things.
One thing Alexa can now do is a timed action - e.g. turn on the lights in 5 minutes. Will this be able to?
And perhaps also Turn on the lights for 20 minutes too.
It seems like most browsers have effectively blocked microphone access over plain HTTP. It sometimes works on localhost, but there are all sorts of hoops to jump through. It may be broken for a while
Yeah, I also figured this was the culprit. That is such a weird decision by the browser makers; it should be a simple "will you allow this, even though it's over HTTP?" prompt. This restriction to force HTTPS if the URL is HTTP is absurd.
PS: it's not only the microphone; it seems to affect the camera, location and notifications as well.
EDIT: You might need to include this option in the cookie (allow the microphone, or ask for permission when clicking the mic, and store the choice in the cookie).
Wow, great work all, super excited to start playing with this.
We have google hubs in almost every room and not a day goes past when at some point in the day I want to collect them all up and set fire to them.
Looking forward to watching this develop.
I "fixed it" in Edge on one of my laptops.
PS: I didn't ask ChatGPT.
I'm not sure whether it's just "rude" or I'm mumbling, but even though it didn't quite get it, it did turn off the light. So is that due to quick machine learning, or a bug? I mean, if it's unsure it could have asked me, right? Not like Google, who just plays something weird from YouTube even though you asked for a local source on VLC.
It's very interesting; however, I don't yet understand the difference between HA Assist and Rhasspy. Maybe they are complementary?
@synesthesiam Hey, I've tried to get speech working in Firefox, though on my end it seems like the ha-voice-dialog never makes it out of the header bar, and there is no mic icon in the pop-up, only an icon for sending a text message. I don't know if it's supposed to be like that, as I never figured out how it worked when Paulus talked to his awful "cube" (in your YouTube video). Was that just attached as a mic device to his HA device (and did that bypass the stupid browsers)?
I’m glad I bet on Home Assistant. A phenomenal transatlantic sailing in the right direction.
I have 12 speakers from the Google stable at home. In January 2019, Google Assistant was to be made available in Polish. The paradox is that for several months the speakers had a Google Assistant that spoke Polish, just like on smartphones. After those few months, the Polish language disappeared from the speakers…
I am currently very excited about the announcements from the Home Assistant and Nabu Casa team. I keep my fingers crossed and intend to follow this trail with you step by step.
Of course. Picard is god.
Rhasspy has voice wake-up; can this be combined with the new Assist feature?
This would be very useful because, in the current situation, as they wrote, if access to Home Assistant is local without a security certificate (HTTP), the microphone is not activated in the browser and this feature cannot be used.
Did you read this? Year of the Voice - Chapter 1: Assist - #22 by boheme61
As they mentioned, they are working on it; this is Chapter 1. Various people volunteered to start with commands/tests in various languages, and it's ongoing. Join in at some of the links in the initial post above.
I believe the voice part through the browser needs some more rethinking with regard to the direction various browsers are going for security reasons (actually to avoid lawsuits, if you ask me). Anyway, before we're there, who knows, there might come a "Yellow Mic / Blue Mic", a generic mic integration, whatever.
So it was a HomePod Paulus talked into, OK, still awful, and I have no idea what it is; it sounds like some device he had integrated. I just have an impatient personality (maybe), and as I had previously played around with text/typed commands, I felt I had to fight the browser companies' rules. Firefox seems a little more tricky; as most know, they go their own way in regard to many things in their browser.
Still, after 30 years there is a combat in this area, although some attempts have become outdated (Java applets, Flash, etc.); things like CSS are still one of the biggest battlefields, and then there are of course the big US companies who need to comply with their products, besides their own reputation…
EDIT: My second "lyric" reply in this topic… 2 out of 3 ain't bad.
Just started playing with this. My first suggestion is that the iOS app needs the ability to long-press on the icon and select to launch assist, so you don’t have to fully open the app and click on the assist button (“Home Screen Quick Actions” I believe they’re called).
Oh, and this quick action should have the option to auto-start listening for voice commands to save typing.
Interesting. I use both Siri and Alexa… I would have thought Amazon Echo would be in the mix here as well. Does anyone know why it isn't? I realize this is just the first month, but I would have assumed Echo would have been the easiest one to start with.
How to change OpenAI conversation language?
You can use Android shortcuts to get to it. Create a shortcut to any Lovelace dashboard and add ?conversation=1,
so for the home page: /lovelace/home?conversation=1
This will be part of the next beta version of the app, if you would like to get it today you will need to sideload the latest beta from github.
This year is going to be great! Would it be possible to add a sentence when a command isn’t recognized? Something like:
Make it bright in the kitchen.
Assist: I don’t recognize that command. What would you like it to do?
Turn on the Kitchen Overheads and the Dining Light to 100%.
Assist: OK. Saying, “Make it bright in the kitchen” will now turn on the Kitchen Overheads and the Dining Light to 100%.
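In the meantime, a rough sketch of how a phrase like that could be wired up today with a custom sentence plus intent_script; the file path, intent name, and the light.kitchen_overheads / light.dining_light entity IDs are all made up, so adjust to your setup:
# config/custom_sentences/en/brightness.yaml (sketch)
language: "en"
intents:
  KitchenBright:
    data:
      - sentences:
          - "make it bright in the kitchen"
# configuration.yaml (sketch)
intent_script:
  KitchenBright:
    action:
      - service: light.turn_on
        target:
          entity_id:
            - light.kitchen_overheads
            - light.dining_light
        data:
          brightness_pct: 100
    speech:
      text: "Kitchen lights set to full brightness"
The interactive "teach it a new phrase" flow would still be much nicer, of course.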
It would be great if you can set a different language for the UI and for the assistant. For instance, I want to have the UI in English but the assistant in Dutch. I don’t like to look at Dutch terms in the UI as it confuses the hell out of me and I can never find what I want. However, when giving commands I prefer to do it in Dutch
Thanks dshokouhi. Your answer worked a treat.
I would also add that after creating the Android shortcut in the Home Assistant Companion app, to add it to your Android home screen, long-press on the Companion app icon and then drag the shortcut onto your home screen.
It would be great if you can set a different language for the UI and for the assistant
I really hope that's their intention, as I'm sure you're far from alone in this situation; everyone who has been working in IT, and a whole bunch of others, prefers an English UI. As for me, I've lived half my life in Denmark and half in Sweden (friends and relatives, my spoken languages), yet I also find it very hard to navigate a non-English OS or software.
EDIT: Have you thought about writing your own intents? Why not create your entirely own language?
Instructions on how to have conversations with your Home Assistant.
Please add bash-like shortcuts for us Linux nerds! Add bash-like shortcuts to Assist text bar
Hi, I've been playing with Assist today and I can't seem to get the temperature state. It is in the intents repository, but HA can't understand the sentence. When I put the hvac file in custom sentences, HA understands fine but outputs an error.
Is getting a state just not implemented yet, or is there a problem on my end? And if it isn't implemented, is there a list of intents that are implemented that we can test?
Here the assistant is also not working; whatever I ask, it answers, but turning off a light is not possible. I run on a Raspberry Pi 4. I also asked how many entities I have, and that is not right: I have a lot more than the 17 it says…
I made a new instance on Synology for testing and there it works. (I made a helper and it is turned on and off.)
See some pictures below:
The example from the article “Assist - Custom Sentences” works well. But I’m trying to control the media player via {name} and my code throws an error.
language: "en"
intents:
  SetVolume:
    data:
      - sentences:
          - "(set|change) {name} volume to {volume} [percent]"
        requires_context:
          domain: "media_player"
lists:
  volume:
    range:
      from: 0
      to: 100
Error handling message: Received invalid slot info for SetVolume (unknown_error) from 192.168.1.1
Update: everything is working. I just forgot to change the value of the entity_id in the intent_script.
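For anyone hitting the same thing, a minimal sketch of the intent_script side (in configuration.yaml) that the custom sentence above hands off to; media_player.living_room_speaker is just a placeholder for your own entity_id:
# configuration.yaml (sketch; replace the placeholder entity_id)
intent_script:
  SetVolume:
    action:
      - service: media_player.volume_set
        target:
          entity_id: media_player.living_room_speaker
        data:
          volume_level: "{{ volume | int / 100 }}"
    speech:
      text: "Volume set to {{ volume }} percent"
The volume slot arrives as 0-100 from the range list, so it is divided by 100 for volume_level.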
I think down the road it would be important to see something along the lines of ‘smart’ commands such as “turn off all lights” to turn off all lights that are exposed to HA.
Or “turn off kitchen” either turning off all devices in area:kitchen, or at least turning off my light group called “kitchen light” with a wild card match.
This is something I wrote a while back. Is this useful to this project? I have several rules defined in JSON that might be usable as intent/action.
GitHub - computermaster0101/Hermes: a system automation tool.
yes, but a phrase that turns on/off all devices from all domains within an area.
It doesn't know what your scenes are called, and you didn't ask it to, e.g., RUN the scene "Sluk alt lys" ("turn off all lights"). You basically just spelled out the name of one of your scenes; it could mean anything, and I would also have answered you "Va?" ("What?").
If it was programmed to understand something in the area of what you want it to do, it would be something like "Sluk alt lys", where "lys" (lights) is the thing it has to turn off ("sluk alt" = turn off all). PS: don't be surprised if it gets confused and doesn't understand you if you say "Sluk for Sluk alt lys"; most likely it already has an intent for "Sluk alt <something>", so saying "sluk alt lys" would turn off all light entities, most likely not your scene "Sluk alt lys".
PS: And you can't turn off something that is not on/running, so if your "scene" is actually an automation that should turn off all your lights, you should not ask anyone to turn the scene/automation off, right? You want it to run/turn on the scene/automation ("Sluk alt lys" / turn off all lights).
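If you want that exact Danish phrase to work, a sketch of a custom sentence that runs the scene directly, assuming your Assist language is set to Danish; the file path, intent name and scene.sluk_alt_lys are placeholders for your own setup:
# config/custom_sentences/da/scener.yaml (sketch)
language: "da"
intents:
  SlukAltLys:
    data:
      - sentences:
          - "sluk alt lys[et]"
# configuration.yaml (sketch)
intent_script:
  SlukAltLys:
    action:
      - service: scene.turn_on
        target:
          entity_id: scene.sluk_alt_lys
    speech:
      text: "Slukker alt lys"
That way the phrase maps to activating the scene rather than trying to "turn off" a scene.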
See my reply here: Year of the Voice - Chapter 1: Assist - #47 by Holdestmade
Thank you, it works …
Thank you! That was the solution!
I saw it was asked here,
Is there a solution to this?
At the moment the situation is that I ask 'OpenAI Conversation' in my language and it answers me in English. How can it be configured so that it answers me in my language (which is set as the default in HA)?
My own language? Wow, great idea! Maybe for now I’ll start with my local dialect
I believe that for the moment you have to press the button. The whole thing is in its infancy, so I think patience is the order of the day.
I can't understand how to make my own sentences. I made a file at /config/custom_sentences/el/media.yaml and then I wrote:
# Example config/custom_sentences/el/media.yaml
language: "el"
intents:
  SetVolume:
    data:
      - sentences:
          - "(set|change) {name} volume to {volume} [percent]"
        requires_context:
          domain: "media_player.spotify_mario_hadjisavvas"
lists:
  volume:
    range:
      from: 0
      to: 100
But it's not working at all…
I want to build it in Greek. I'm using Assist in Greek with the built-in sentences, but I don't know how to make my own.
I want to make not only actions but conversation as well, e.g. "How are you", "Tell me a joke" etc., and I want it to answer with text that I wrote, plus the ability to play an mp3 with pre-recorded answers on specific media players.
I'm trying to find a YouTube tutorial, but nobody has uploaded any video with Assist configuration yet…
Any ideas?
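One way that kind of canned conversational answer could look, as a sketch (untested); media_player.living_room and the mp3 path are placeholders, and the sentence would be written in your own Greek phrasing:
# config/custom_sentences/el/chat.yaml (sketch)
language: "el"
intents:
  TellJoke:
    data:
      - sentences:
          - "tell me a joke"   # replace with the Greek phrase you want to use
# configuration.yaml (sketch)
intent_script:
  TellJoke:
    action:
      - service: media_player.play_media
        target:
          entity_id: media_player.living_room
        data:
          media_content_id: "media-source://media_source/local/joke1.mp3"
          media_content_type: "music"
    speech:
      text: "I only know one joke, and it's in the mp3."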
language: "el"
intents:
  SetVolume:
    data:
      - sentences:
          - "(set|change) {name} volume to {volume} [percent]"
        requires_context:
          domain: "media_player.spotify_mario_hadjisavvas"
lists:
  volume:
    range:
      from: 0
      to: 100
domain: "media_player"
Use the device name or alias when calling it; this is the universal option. Or you can specify a hard binding to one device; you need to make sure there are no duplicates.
- "(set|change) [add any names. It's optional] volume to {volume} [percent]"
slots:
name: "Specify the name of the media player"
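Putting that second option together as a complete file, as a sketch ("Spotify Mario" is just whatever value you want the name slot to carry):
language: "el"
intents:
  SetVolume:
    data:
      - sentences:
          - "(set|change) volume to {volume} [percent]"
        slots:
          name: "Spotify Mario"
lists:
  volume:
    range:
      from: 0
      to: 100
With the first option you keep {name} in the sentence and use requires_context with domain: "media_player", as in the working example further up.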
Really cool, and it works for the most part; I still have to apply a lot of aliases.
One question, I use “Hey Siri assist” but Siri tells me “done”, even if the action fails.
Is there a way to let Siri know if an action has failed or not?
Hi, by the way, do you know if it is possible to add aliases to entities with a YAML file? I don't have that many entities, but going one by one and adding aliases to them in the GUI is pretty tedious. Is there any way to manage aliases in YAML? I know the customize section in configuration.yaml, where I can change friendly_name, the entity picture and some other things, but I don't see aliases there, and trying to add them there doesn't work.
EDIT: SOLVED, see below.
Hi, I am trying to add things to my shopping list using the existing service.
However, after asking the first sentence (via keyboard), it only comes back with "Sorry, I couldn't understand that" regardless of what I type in.
This is what I have in /config/custom_sentences/en/shopping_list.yaml
language: "en"
intents:
  ShoppingListAdd:
    data:
      - sentences:
          - "Add {shop_item} to shopping list"
lists:
  shop_item:
    values:
      - "egg"
      - "wine"
intent_script:
  ShoppingListAdd:
    action:
      service: "shopping_list.add_item"
      data:
        name: "{{shop_item}}"
    speech:
      text: "{{shop_item}} added to shopping list"
Does it still require ‘conversation:’ in configuration.yaml?
EDIT: answering my own issue… my instance was on en_GB and this apparently is not “en”
Working now after I switched to “English”
Is there a way to pass a word/string on to the intent_script?
Example: when asking to add to the shopping list: "Add xyz to shopping list".
Can I programmatically take xyz, assuming it is always in the same location in the sentence, and pass it to the intent_script?
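If your Home Assistant version's sentence matcher supports wildcard lists, a sketch of capturing an arbitrary item and passing it straight through to the same ShoppingListAdd intent_script as above:
language: "en"
intents:
  ShoppingListAdd:
    data:
      - sentences:
          - "Add {shop_item} to [the] shopping list"
lists:
  shop_item:
    wildcard: true
The shop_item slot then arrives in intent_script as {{shop_item}} with whatever was said; otherwise you are limited to the fixed values list shown earlier.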
Curious why DIY simple text matching was used rather than an existing framework such as GitHub - flairNLP/flair: A very simple framework for state-of-the-art Natural Language Processing (NLP).
It seems a great idea to do inference-based control, but looking through the repo I wondered what they are using for NLP, and I'm curious why the NLP bit was done DIY.
@synesthesiam 's code is hardly a recent invention. I imagine he is building on his previous work.
Try to use the exact (!) sentences from the examples; then at least you know it is working. The question mark brings you to the docs, and there you find some examples. I had issues when mine was on en-GB; I switched to en and back and now it works (don't know how).
Am I right that this still uses an external speech engine, meaning internet access?
@synesthesiam 's code is hardly a recent invention. I imagine he is building on his previous work.
It's unlikely to have the research and resources of a current SotA product such as Flair, developed by [Humboldt University of Berlin](https://www.informatik.hu-berlin.de/en/forschung-en/gebiete/ml-en/) and friends, as I posted.
That is my point and why I am curious: with just @synesthesiam and a few others, why develop your own when existing projects already have more resources? Looking at the code, it seems not much more than a basic word stemmer compared to some of the latest and greatest being made with NLP.
Code isn't a recent invention, but from ChatGPT and Whisper to NLP, there are coders and repos with resources that are out of our league.
It just looks like the use of some very old techniques, even if the code is new. I expected some sort of NLP framework, not something homegrown, is all I am saying, and only because I was curious what NLP you might be using, as there are currently some very interesting inventions.
Flair is just one that's getting a lot of Hugging Face traffic; I haven't used it and have no self-interest.
Latest, 2023.3.1
I must be doing something wrong?
Temperature intents aren’t implemented yet in HA. You can ask “what is the X” where X is the name of a sensor, though.
It just looks like the use of some very old techniques, even if the code is new. I expected some sort of NLP framework, not something homegrown, is all I am saying, and only because I was curious what NLP you might be using, as there are currently some very interesting inventions.
I've used Flair before, and while it's a nice framework, it suffers from the same problems as most anything machine-learning related: big dependencies such as PyTorch and scikit-learn that need to be pre-compiled (HA supports more architectures than just amd64 and arm64). While Assist is far less flexible than something like Flair, it is pure Python and runs on every architecture HA supports.
Plus, as we've mentioned before, the plan is for the current sentence templates in Assist to serve as training data for more sophisticated machine learning systems in the future. So it should be possible some day to install a Flair add-on for HA (bundled with all its dependencies and weights), and train it on the sentences generated from Assist templates.
Yeah, I was just going through it, Mike, and my first impression, even if it is your own, was "no AI!?" That's why I was looking: I was wondering what you might be using. For reference, these are Flair's requirements:
python-dateutil>=2.6.1
torch>=1.5.0,!=1.8
gensim>=3.8.0
tqdm>=4.26.0
segtok>=1.5.7
matplotlib>=2.2.3
mpld3==0.3
scikit-learn>=0.21.3
sqlitedict>=1.6.0
deprecated>=1.2.4
hyperopt>=0.2.7
boto3
transformers[sentencepiece]>=4.0.0
bpemb>=0.3.2
regex
tabulate
langdetect
lxml
ftfy
janome
Maybe you used an early version, but the requirements seem pretty standard, and it's quite normal for an ML release to be tied to a framework version. I frequently use miniconda, venv or Docker, as needing different Python bases is not uncommon but usually no problem. Dependencies are only a problem if you ignore such tools?
I was curious, as I thought it had been chosen due to load. With the introduction of ASR such as Whisper, SotA models are gaining more focus, and like GitHub - guillaumekln/faster-whisper: Faster Whisper transcription with CTranslate2, which you have running in Rhasspy 3.0, the papers and models that hit SotA WER are freaking huge, but they have actually been boiled down to less accurate but far faster and smaller models.
Yes, if you go no-holds-barred on achieving SotA (state of the art) then models can be huge, but that is what dev is about, and with Assist entities we are talking about a very small subset of a language model, hence why I was wondering what you were using.
Whisper is an example all on its own, as there is nothing really special about Whisper's ASR; it's the GPT-like beam search that fixes those errors via the NLP of the decoder, as an approximate explanation from memory.
The times Whisper gets it totally wrong but still creates a logical sentence are this process in action.
Because you have a much smaller language subset, I was expecting something lean and mean and very accurate, which is why I was checking what you are using, but dependencies…!?
If Whisper were open source in that sense, then we could likely have an entity-subset LM (language model) for the decoder part; sadly it's not, and as far as I know we get what we are given.
Wav2Vec2 is a close second to Whisper, where the LM or n-gram could be generated on the fly based on enabled entities, hopefully creating a super small and accurate LM.
You could likely even increase accuracy by adding a custom ASR dataset of multi-voice TTS churning out common command sentences, but nowadays reuse of others' work seems to be the norm.
It's things like "custom commands" that dictate large language models, as they depart from a small, known Hass entity model, and we are talking very small: just the enabled entities, creating an LM on the fly. Instead you need a full large language model, because you have to handle everything custom and unknown, such as the huge model that could cover the entities of a shopping list or todo.
Those are surely not Hass inference-based control but skills in their own right, requiring much more complex language models.
I am mainly just getting my head around the Rhasspy 3.0 Dev Preview and thinking the first amazing skill is the Hass Assist inference-based module.
I can add an NLP stage in front of the word stemmer and convert from AI NLP to basic stemming NLP, but the logic bottleneck is always the basic word stemming, losing much of what the NLP AI is trained on that could output a correct YAML intent.
I thought Hass Assist was going to be specific to Hass entities, so lighter-weight, more accurate language-subset models could be created, purely because it isn't a full language model supporting all unknowns.
You could probably bypass Assist as a module but train on the Assist YAML dataset with a tiny entity-only LM NLP.
If I were going to use Assist, I could also make a much lighter-weight and even more accurate ASR by having a specific subset LM that creates better-formatted, entity-specific output and feeds directly into the word stemmer, with better results, negating to an extent the need for additional NLP.
Neither is what I was expecting, so I guess I will just have to wait for a Flair add-on for HA if I want to gain the accuracy I was hoping for; I was likely too enthusiastic.
Cheers for the reply and explanation.
Just wondering how to set up the entities so Assist works as advertised. I asked this:
Here is my window sensor:
Of these questions which were given as examples in the release notes:
Only the first one worked. I either got "Not any" or "I don't understand". So what do I have wrong in my setup? (Note I substituted Study for all of the examples above.)
The basic text stemming is just not capable of that without exact definitions for each permutation.
Meta AI just released their LLaMA model, which is much smaller than ChatGPT and does run on a Pi, even if extremely slowly (10 s/token).
Also, Stanford took the model and refined it to produce Alpaca in an extremely cost-effective manner.
GitHub - tatsu-lab/stanford_alpaca: Code and documentation to train Stanford's Alpaca models and generate the data.
https://crfm.stanford.edu/2023/03/13/alpaca.html
GitHub - ggerganov/llama.cpp: Port of Facebook's LLaMA model in C/C++ (aka whisper.cpp, but for LLaMA).
Have a look at GitHub - KoboldAI/KoboldAI-Client, where GPT/LLaMA is being interfaced to fantasy erotic novels in AI Dungeon adventures. That is a concept I never thought of, but wow, the possibilities seem extremely varied and interesting, even if they might not be your thing.
Yeah, all local models. This gives a bit of an idea of the model sizes for LLaMA/Alpaca:
(Link preview: torrents of the LLaMA HFv2 model weights, including a 4-bit "LLaMA-HFv2-4bit" version, for ooga's webUI, Kobold and Tavern.)
A Pi 4 is likely just too slow, but it could actually run with 4 GB. There are Arm-based SBCs such as the RK3588 variants that will run it much better, but for GPU-based setups or Apple silicon, advanced local AI has moved from the future to now.
Much of it is the development of training via supervised and reinforcement learning, which for a project like this would be documentation and peer review of forum solutions.
If they are doing it for Dungeons & Dragons AI fantasy chat, then that's just one strange and varied example of a specific knowledge domain.
Anyone know of an existing issue when entity names have an apostrophe? Looking at the following picture, the alias works but the name does not.
P.S. Sorry for the deleted post above. Pressed the wrong button.
Is “Study” an area? Questions like “are all windows closed in the study” will look at the entities in that area.
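If the sensor isn't already recognised as a window, a sketch of forcing its device_class via customize (binary_sensor.study_window is a placeholder); the entity also needs to be assigned to the Study area and exposed to Assist:
homeassistant:
  customize:
    binary_sensor.study_window:
      device_class: window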
Yes, I have made each room an area.
Edit: also why would the response be YES to both questions?
Edit2: Same behaviour with 2023.5.0
Can you share how you fixed it? I can’t seem to make it work.
Josh.ai just announced 3rd party integration toolings that might unlock HA’s ability to work with Josh Equipment…