Yep - Rene is my star user!
Cool! You might want to have a look at snowboy.kitt.ai for the hotword detection. They provide a Python example, and you can let the user decide which hotword they’d like to use. jarvis.sh uses the Bing API for the actual recognition of spoken words, but I am not very happy with sending my voice into the “cloud” either.
HTH, Hannes
In this code it is also possible to set your own hotword.
At first, when I was testing in English, I used "hello", then I used the Dutch "hallo", and now I use "Appie".
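A minimal sketch of how a configurable hotword could be checked once the speech has been turned into text. The function name `extract_command` and its default value are made up for illustration and are not taken from the actual app:

```python
def extract_command(text, hotword="appie"):
    """Return the words spoken after the hotword, or None if it was not said.

    Hypothetical helper: the name and the default hotword are assumptions.
    """
    words = text.lower().split()
    hotword = hotword.lower()
    if hotword not in words:
        return None
    # Keep only what was said after the first occurrence of the hotword.
    position = words.index(hotword)
    return " ".join(words[position + 1:])
```

Switching from "hello" to "hallo" to "Appie" would then only be a change of the `hotword` argument.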
At the moment I use Google (so I also send it to the cloud), but with SpeechRecognition you can use several speech-to-text engines.
I could make it optional.
Local Dutch versions are probably no good. I think English is the only language that is good enough locally, but when it is complete people can decide for themselves.
I am just the craziest one.
Others would probably use HA directly for the stuff that I do in AppDaemon.
I know that almost all components in HA can be rebuilt with AppDaemon, and because I like to play around and love the freedom, I use AppDaemon for unusual things.
This is more than awesome.
At the end of the day, the commands I use with Alexa are few.
Having the possibility to talk in my own language while shutting out profit-minded corporations, the NSA and the CIA is top of the pops!
@ReneTode This is cool. You should definitely take a look at Mycroft (https://mycroft.ai/), which is an Open Source equivalent to Alexa/Google Assistant. They use the same speech recognition APIs internally.
I started work on a Home Assistant skill, but haven’t had time to finish it yet: Mycroft AI
Nice project, but I need a new RPi for that, because they only provide an SD card image.
I think I'll stick with SpeechRecognition for now and see what I can do with that.
I used the Google part today and that works, but I need to learn more because sometimes it hangs. I haven't played around with timeouts etc. yet.
I will try out Sphinx too, because it runs locally.
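For the occasional hang: SpeechRecognition's `Recognizer.listen` accepts `timeout` and `phrase_time_limit` arguments, which are probably the first thing to try. As a more generic fallback, here is a sketch that runs any blocking call with a deadline. The helper name `call_with_deadline` is an assumption, and the approach only stops waiting; it does not kill the stuck call.

```python
import concurrent.futures


def call_with_deadline(func, seconds, *args, **kwargs):
    """Run func(*args, **kwargs) in a worker thread and wait at most `seconds`.

    Returns the result, or None if the call did not finish in time.
    The worker thread itself is not killed; we only stop waiting for it.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=seconds)
    except concurrent.futures.TimeoutError:
        return None
    finally:
        # Do not block here waiting for a call that may be hanging.
        executor.shutdown(wait=False)
```

A recognition call could then be wrapped as `call_with_deadline(recognizer.recognize_google, 10, audio)`, treating `None` as "nothing understood this time".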
SpeechRecognition can work with:
CMU Sphinx (works offline)
Google Speech Recognition
Google Cloud Speech API
Wit.ai
Microsoft Bing Voice Recognition
Houndify API
IBM Speech to Text
So I need some testing time.
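To make the engine optional, the list above could be mapped to the matching `Recognizer` methods. This is only a sketch: the method names are the ones found in SpeechRecognition 3.x releases from around the time of this thread, and newer releases may have renamed or dropped some, so check them against the installed version. The `ENGINE_METHODS` table and the `transcribe` helper are made up for illustration.

```python
# Engine name -> method name on speech_recognition.Recognizer.
# Assumption: names as in SpeechRecognition 3.x; verify for your version.
ENGINE_METHODS = {
    "sphinx": "recognize_sphinx",              # CMU Sphinx (works offline)
    "google": "recognize_google",              # Google Speech Recognition
    "google_cloud": "recognize_google_cloud",  # Google Cloud Speech API
    "wit": "recognize_wit",                    # Wit.ai
    "bing": "recognize_bing",                  # Microsoft Bing Voice Recognition
    "houndify": "recognize_houndify",          # Houndify API
    "ibm": "recognize_ibm",                    # IBM Speech to Text
}


def transcribe(recognizer, audio, engine="google", **kwargs):
    """Call the recognizer method that belongs to the chosen engine.

    Extra keyword arguments (language, API keys, ...) are passed through
    unchanged, because every engine expects different ones.
    """
    if engine not in ENGINE_METHODS:
        raise ValueError("unknown engine: %s" % engine)
    method = getattr(recognizer, ENGINE_METHODS[engine])
    return method(audio, **kwargs)
```

With a real recognizer this would be used as, for example, `transcribe(recognizer, audio, engine="google", language="nl-NL")`.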
Mycroft have a Debian repo too, which contains armhf packages which should run on the Pi. Their custom HW platform is based on the Pi so it should work pretty well. See the “Installing using apt” section of their getting started page: https://docs.mycroft.ai/development/getting.started
Thx.
I can't find anything about it, except that they use Google as the default STT.
Do you know if you can set another STT?
I was just looking around in the SpeechRecognition docs. (Always useful AFTER you install things.)
Sphinx has no Dutch available without doing a large amount of work.
Google wants you to get an account and grants 50 requests a day (without a key it works, but that access can be revoked).
I don't know if that will cause trouble.
Cloud Speech, Wit, Bing and IBM need you to set up an account (and you probably need to pay or give credit card data),
so those are no favourites anyway.
Houndify too, and they only have 2 or 3 languages.
So I'll probably stick with Google for now as well, but will try to get Sphinx to understand enough Dutch.
Mycroft uses Sphinx for the wake word detection, which is done locally. The rest of the phrase is then sent to the cloud for processing. I think the default is Google, but they have support for others, which can be set up in the config file. It’s been a while since I looked at it. Probably best to ask on their support forum: https://community.mycroft.ai/
Because I actually want commands, I will probably go to Sphinx too.
But if Mycroft uses Google too, it has the same problem: you need an account and/or it is restricted.
Unless Google has given up the restrictions, in which case SpeechRecognition has no problem either.
I actually see no real advantage in using Mycroft over doing it directly with this code.
In Mycroft I need to create skills, and that's actually all I need to do with this program right now.
But it helped me learn how to program the skills part.
Cool, each to their own. I look forward to seeing your results. Good luck!
By the way, @aimc,
this kind of app doesn't get reset like it should.
If I take the app out of the config it says it deleted the entry, but it stays running.
So I end up restarting AppDaemon over and over now.
Also, the connections to ALSA etc. are not reset, so if I edit the code I can't save it without errors and restarting AppDaemon afterwards.
Is there something we can do about that?
I’ve seen it too - I think I need to add a way for the app to clean the thread up.
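One generic way such a cleanup could look is a worker thread that checks a stop flag, so that whatever unloads the app can ask it to finish and release its audio handles. This is plain Python threading, not AppDaemon's actual mechanism; the class name `StoppableListener` and the idea that AppDaemon would call `stop()` when the app is removed are assumptions.

```python
import threading


class StoppableListener:
    """Runs `work` repeatedly in a background thread until stop() is called."""

    def __init__(self, work, interval=0.1):
        self._work = work
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # Check the flag between iterations so the loop can end cleanly.
        while not self._stop.is_set():
            self._work()
            # wait() doubles as an interruptible sleep.
            self._stop.wait(self._interval)

    def stop(self, timeout=2.0):
        """Ask the thread to finish; return True if it actually ended."""
        self._stop.set()
        self._thread.join(timeout)
        return not self._thread.is_alive()
```

This only helps if each single `work()` call returns within a reasonable time, which is another reason to put timeouts on the listening and recognition calls.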
Can you explain to me what you are trying to do, or are doing now?
A wake-up word on the Pi, which transmits the command to voice recognition software in the cloud, which sends the commands back to the Pi?
Have I understood correctly?
I have taken some existing parts and put them together (Python SpeechRecognition, including Google STT, Sphinx STT and others, plus Google TTS and AppDaemon).
I use Google for the speech recognition, so that means the wake-up word and the commands are sent to Google. (Not that you notice that, unless you don't have network on your RPi, and you need that for HA anyway.)
Google doesn't send back commands, just the text of what was spoken. The program makes commands out of the text.
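A small sketch of that last step, turning the returned text into a command. The phrases, the `(entity, action)` pairs and the function name `text_to_command` are invented for illustration; the real app may match text in a completely different way.

```python
# Hypothetical phrase table: spoken phrase -> (entity, action).
COMMANDS = {
    "turn on the light": ("light", "turn_on"),
    "turn off the light": ("light", "turn_off"),
    "what time is it": ("clock", "say_time"),
}


def text_to_command(text, commands=COMMANDS):
    """Return the (entity, action) pair for the first known phrase found in
    the recognized text, or None if nothing matches."""
    # Lower-case and collapse whitespace so small differences do not matter.
    normalized = " ".join(text.lower().split())
    for phrase, action in commands.items():
        if phrase in normalized:
            return action
    return None
```

The returned pair could then be handed to whatever actually switches the device, for example a service call made from the AppDaemon app.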
I am going to try to use Sphinx, which works locally.
I’m still hoping for continued development of Lucida.ai, but this is also really great! I wonder if it is possible to have multiple RPis with microphones and have voice control all over the house (per room)?
Any suggestions on the microphone hardware?
Do you think the Pi should be able to connect to a Bluetooth mic?
That wouldn't be a problem. They only need to be connected to a single network.
I am absolutely going that way, and I know that's not very hard to accomplish.
Just think of this:
Any RPi that can understand your voice can translate that into a simple text file.
Any RPi connected to a server RPi can save that file on the server RPi.
Any server RPi can translate a text file into a command, which can do anything.
The slave RPis wouldn't be any different from a sensor: if the sensor sends a new value, you can make an automation for that.
And the sensor would be your voice, giving the sensor value.
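The text file idea above can be sketched roughly like this, assuming every room RPi can write to a directory that the server RPi can read (for example a network share). The function names and the one-file-per-room layout are assumptions for illustration.

```python
import os
import tempfile


def save_utterance(inbox, room, text):
    """Room RPi side: store the recognized text as <inbox>/<room>.txt.

    The text is written to a temporary file first and then renamed, so the
    server never reads a half-written file.
    """
    os.makedirs(inbox, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=inbox, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    final_path = os.path.join(inbox, room + ".txt")
    os.replace(tmp_path, final_path)
    return final_path


def collect_utterances(inbox):
    """Server RPi side: read and remove all pending files.

    Returns {room: text}, much like reading new values from a set of sensors.
    """
    pending = {}
    for name in sorted(os.listdir(inbox)):
        if not name.endswith(".txt"):
            continue
        path = os.path.join(inbox, name)
        with open(path, encoding="utf-8") as handle:
            pending[name[:-len(".txt")]] = handle.read().strip()
        os.remove(path)
    return pending
```

The server would call `collect_utterances` on a timer and turn each text into a command, with the room name telling it where the command came from.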
Do you want high end, or cheap?
If you say high end, I say: don't go RPi.
If you say cheap, I say any USB webcam with a decent microphone.