I posted this guide on the Rhasspy support forum, but was asked to post it here as well. So here it is!
Part 1 - the basics and a first example
I can still recall my struggles getting Rhasspy to work with HomeAssistant, so I wanted to share a few lines for anyone who has just installed Rhasspy and wants to set up a first voice command.
This is not a guide to installing Rhasspy; there are a lot of good guides for that out there on the net. This is just a starter to get you going and give you a working example to build upon. I hope you find it useful.
Our starting point is right after you have installed Rhasspy. It doesn’t matter how: Docker, as a HA-AddOn, or whichever way you chose. I’m also assuming you have a running HA instance and know where to set up your automations in HA.
If you now open your web admin from Rhasspy, you’ll find this menu in the upper left.
The menu contains the following items from top to bottom:
- Home: Here you find a status page with some testing possibilities
- Sentences: Here you set up your sentences, meaning the words you speak to Rhasspy after the wake word
- Slots: Slots are lists of things or devices that you can load into your sentences, so you don’t need to write nearly identical sentences
- Words: This page is about the words and their pronunciation in Rhasspy
- Settings: Here you change the setup for all the different parts of Rhasspy
- Documentation: This opens the Rhasspy documentation (Note: this is the offline documentation that was installed together with Rhasspy)
Now, choose the settings page in the menu, and you should be presented with this screen:
Let’s check a few things first, to get the setup right.
- siteId
  This is a good time to name your Rhasspy instance. Give it a descriptive name you can remember later. If you ever change to a setup with Rhasspy satellites, you’ll need it, and as we use this name later on in automations, it makes sense to set it now.
- MQTT
  These are the settings for your MQTT broker. If you use HA-OS, a supervised install or a standalone HA-core installation, you will normally have an MQTT broker configured already. In my case I’m running HA-OS, so I already use the mosquitto broker from the HA-AddOn store. If you have a broker running, change the setting to “external” and fill in the data for your broker:
  - Host: Fill in the IP address or the domain name of your MQTT broker; with HA-OS it is the same as your HA address. Example: 192.168.178.100 or homeassistant.local
  - Port: The default port is 1883
  - User: I recommend setting up a new user for your broker in the settings of HA. If you use the AddOn, you find these under settings > people > users.
  - Password: Same as above
  If you don’t have an MQTT broker running, leave the setting at “internal”.
- Audio Recording, Wake Word, Speech to Text, Intent Recognition, Text to Speech, Audio Playing and Dialogue Management are out of the scope of this guide. If you need help with these, please refer to the documentation of Rhasspy, which you can find here. As you can see I went with the recommended options.
- Intent Handling
  This is the important part: here we set up HA as our intent handler. This means that what you speak to Rhasspy gets “translated” and then sent to HA to actually do something, like switching a light.
  So choose “HomeAssistant” and restart Rhasspy to reflect your change. There are two ways for Rhasspy to talk to HA. One is with intents, the other one is with events. As I couldn’t get intents to work correctly, and after reading some tutorials, I chose the event way. In the end it doesn’t make a huge difference in function, but events are definitely easier to handle.
  - Hass URL: Fill in the URL of your HA instance
  - Access Token: Set up an access token in HA under your user profile and fill it in here
  - Set the intent handling to Send events to Home Assistant (/api/events)
  - Save your settings and let Rhasspy restart
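To make the Hass URL and Access Token settings a bit more tangible, here is a minimal Python sketch of the kind of HTTP request the “Send events” option implies: a POST to Home Assistant’s /api/events/<event_type> REST endpoint, authenticated with the access token. This is not Rhasspy’s actual code. The function name build_event_request, the example URL, the placeholder token and the intent name GetDate (the one used later in this guide) are assumptions for illustration. The request is only built here, not sent.

```python
import json
import urllib.request


def build_event_request(hass_url, access_token, intent_name, event_data=None):
    """Build (but do not send) a POST request that fires a Home Assistant event."""
    # The event type is the intent name prefixed with "rhasspy_".
    event_type = f"rhasspy_{intent_name}"
    url = f"{hass_url.rstrip('/')}/api/events/{event_type}"
    # Home Assistant expects the event data as a JSON object in the body.
    body = json.dumps(event_data or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values; replace them with your own HA address and token.
request = build_event_request(
    "http://homeassistant.local:8123", "YOUR_LONG_LIVED_TOKEN", "GetDate"
)
print(request.get_method(), request.full_url)
```

If you want to try it against your own instance, you could pass the request to urllib.request.urlopen. With a valid token, HA should then fire the event, which can be handy for checking the URL and token independently of Rhasspy.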
Now that we have our setup complete, we can start right into writing up our first sentence. Open the sentences page (via the menu) and you’ll see the default sentences.ini file presented in your editor window. Delete all the entries; we don’t need them for now, and later on we can write our own sentences that really fit our needs.
Now add the following to the editor window:
[GetDate]
what date is today
This is very small, but it shows the principles involved in training Rhasspy and sending something to HA. So what are we looking at?
- The first line, [GetDate], is the name of our intent.
- The second line is the sentence we need to speak to tell Rhasspy what we want.
Think of it the following way:
- You speak your wake word, Rhasspy wakes up and sends a short signal, so you know it is now listening and you can speak.
- Rhasspy tries to match what you said against the sentences set here and “translates” it to a command (the first line).
- Summed up: you speak, Rhasspy translates that to a command, and this is sent to HA to do something. This is what we call an “intent”.
As you might guess, having to get the sentence exactly right is not always easy or welcome, so it is possible to set more than one sentence. But in the end, Rhasspy always “translates” these to one command.
Change the text in the editor by adding a third line
[GetDate]
what date is today
give me the date
Now we can speak either of the two sentences, and Rhasspy always “translates” it to just one command, namely [GetDate]. Just to make it clearer: you need to speak one of the sentences, and Rhasspy will “answer” with that one command.
We will come back to our sentences file later, but for now, save it and let Rhasspy re-train, so it knows the sentences we just added.
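If it helps to see the “several sentences, one command” idea in code, here is a small Python sketch that reads the same two-sentence file and maps each sentence to its intent. It is only an illustration: Rhasspy’s real training and recognition are far more involved, and the names load_sentences and recognize are made up for this example. It also only handles plain sentences, not the richer sentences.ini syntax.

```python
import configparser

SENTENCES_INI = """
[GetDate]
what date is today
give me the date
"""


def load_sentences(ini_text):
    """Return a dict that maps every sentence to the intent it belongs to."""
    # allow_no_value lets configparser accept lines that are just a sentence.
    parser = configparser.ConfigParser(allow_no_value=True, delimiters=("=",))
    parser.optionxform = str  # keep the sentences exactly as written
    parser.read_string(ini_text)
    mapping = {}
    for intent in parser.sections():
        for sentence in parser[intent]:
            mapping[sentence.strip().lower()] = intent
    return mapping


def recognize(mapping, spoken_text):
    """Look up the intent for an exactly matching sentence, or None."""
    return mapping.get(spoken_text.strip().lower())


mapping = load_sentences(SENTENCES_INI)
print(recognize(mapping, "give me the date"))    # GetDate
print(recognize(mapping, "what date is today"))  # GetDate
```

Both sentences end up at the same intent name, which is exactly what HA will see later on.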
Now we have to do something in HA, as Rhasspy has already done its part of the job. Move over to HA and set up an automation. I’ll show the YAML version of the automation here, just because it makes explaining what’s going on behind the scenes easier. You can always create this automation in the UI editor of HA; it’s entirely your choice.
Let’s see what an automation could look like with the sentences we added before:
automation:
  - id: Rhasspy GetDate
    alias: Rhasspy GetDate
    mode: single
    trigger:
      - platform: event
        event_data: {}
        event_type: rhasspy_GetDate
    action:
      - service: mqtt.publish
        data:
          topic: hermes/dialogueManager/endSession
          payload_template: '{"sessionId": "{{ trigger.event.data._intent.sessionId }}", "text": "Today is {{ states.sensor.date.state }}"}'
We’ll go through each line now, to explain what’s happening here (if there is more to explain, we’ll come to that later):
- id
  Give your automation a “speaking” id. As you move on, you’ll likely get a lot of automations for Rhasspy, and it is easy to lose the big picture. So choose a good name; in my case I start all automations regarding Rhasspy with “Rhasspy”. That makes it easier in the end: for example, if you search for an automation in HA’s automation window, you’ll have all the Rhasspy entries “grouped” together, as they all start with, you might guess it, “Rhasspy”.
- alias
  I just copy the id to the alias. This is an optional step, but it makes things clearer down the road.
- mode
  This is the mode in which your automation is run. In our case single is the right choice, as you likely won’t want the date told more than once. This setting will come in handy if you have a command that should be repeated. For example, if you later want to set your TV volume, you might want to run the automation a few times to increase the volume. Then this will change (don’t worry, we will come to an example later).
- trigger
  This is the part where we use our command from before.
  - platform: event
    As you might remember, we configured Rhasspy to send an event instead of an intent to HA, so we need to use the event platform in HA to recognize it.
  - event_data
    For now we don’t need this, but it will come in handy later on, when your automations get more complicated. Just leave the two brackets empty.
  - event_type
    This identifies what Rhasspy sends to HA. As you can see, it is the command we configured before, [GetDate]. It is always prefixed with “rhasspy_” and followed by the actual command “GetDate”. In combination that makes rhasspy_GetDate. Easy, isn’t it?
- action
  This is where we configure what HA should do if this automation gets triggered (aka you spoke something that Rhasspy identified and sent to HA).
  - service: mqtt.publish
    We want HA to publish something (the answer) on the MQTT topic, so it is sent back to Rhasspy.
  - data
    - topic
      This is the topic Rhasspy listens to; in our case we want to close the session with an answer to our question. I added a few lines about sessions in Rhasspy at the end of this guide, if you’re interested in what’s happening with sessionIds and so on.
    - payload_template
      Here we tell Rhasspy which session we are in (yes, there could be more than one), and what we want Rhasspy to tell us back (aka the answer).
      As you can see, we just set up a “text”, and it will be sent back to Rhasspy.
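The payload is the part where a typo bites most easily, so here is a short Python sketch of what the rendered payload_template boils down to: a JSON object with a sessionId and a text, published on the endSession topic. The function name build_end_session_payload, the session id and the date are placeholders I made up; in the real automation, HA fills them in from the event and from the date sensor.

```python
import json

# Topic used by the automation to close the session with a spoken answer.
END_SESSION_TOPIC = "hermes/dialogueManager/endSession"


def build_end_session_payload(session_id, text):
    """Return the JSON string that tells Rhasspy what to say in which session."""
    # json.dumps takes care of quoting, so a quote inside the text
    # cannot break the JSON the way it could in a hand-written template.
    return json.dumps({"sessionId": session_id, "text": text})


# Placeholder values for illustration only.
payload = build_end_session_payload("example-session-id", "Today is 2024-01-31")
print(END_SESSION_TOPIC)
print(payload)
```

If Rhasspy stays silent after an automation ran, comparing the published message with this shape (both keys present, valid JSON) is a reasonable first check.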
This is, in essence, what we need for Rhasspy and HA to work together. It is a very simple example, but the way things go should be clear:
- Rhasspy wakes up
- You speak your sentence
- Rhasspy tries to find out what you want from it, and “translates” your sentence into a command
- This command is sent to HA as an event, using the Hass URL and the /api/events option we configured in the settings
- HA picks up the command and looks for an automation that fits (actually it’s the other way around, but let’s not get too techy here) => the one whose trigger is named after the command you sent
- HA runs the automation and publishes an “answer” over MQTT
- Rhasspy identifies the session and speaks the text from the MQTT topic back to you
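For those who prefer reading code, the steps above can be sketched as a tiny Python simulation. Nothing here talks to Rhasspy or HA; the dictionaries stand in for the trained sentences and the HA automations, the event data is simplified, and all names (handle_sentence, answer_get_date, the fixed date) are assumptions for illustration.

```python
import json

# Sentences Rhasspy was trained with, mapped to their intent.
SENTENCES = {"what date is today": "GetDate", "give me the date": "GetDate"}


def answer_get_date(event_data):
    """Stand-in for the HA automation action: build the MQTT answer."""
    return {
        "topic": "hermes/dialogueManager/endSession",
        "payload": json.dumps(
            {"sessionId": event_data["sessionId"], "text": "Today is 2024-01-31"}
        ),
    }


# Automations keyed by the event type that triggers them.
AUTOMATIONS = {"rhasspy_GetDate": answer_get_date}


def handle_sentence(sentence, session_id):
    """Walk one sentence through recognition, event and automation."""
    intent = SENTENCES.get(sentence)
    if intent is None:
        return None  # Rhasspy did not recognize the sentence
    event_type = f"rhasspy_{intent}"  # the event sent to HA
    automation = AUTOMATIONS.get(event_type)
    if automation is None:
        return None  # no automation listens for this event
    return automation({"sessionId": session_id})


message = handle_sentence("give me the date", "example-session-id")
print(message["topic"])
print(message["payload"])
```

The two places where the chain can break in the sketch (unknown sentence, no matching automation) are also the two things worth checking first when a real command does nothing.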
Now save your automation, reload the automations in HA and move back to Rhasspy.
For testing purposes, the “Home” page comes in handy. Call it by pressing the “Home” button. If you take a look under the status bar, you’ll see the line that starts with the “Recognize” button. This is where we’ll test our command and the connection with HA.
Type in one of the sentences exactly as you configured it. In our example, type “give me the date” and push “Recognize”. If everything works, you should be presented with the command you configured for this sentence in a red box; here it will be “GetDate”. This means your sentence is recognized and “translated” correctly to a command. Yeah! Right now, we haven’t sent anything out; this happens just “inside” Rhasspy, to check whether a sentence works.
If you want to take a look, push the button “Show JSON”, and you’ll see exactly what Rhasspy is sending over MQTT.
For our guide we are happy right now: our first intent was recognized by Rhasspy. So let’s move a step further and check the box on the right that says “Handle”. If you now push “Recognize” again, Rhasspy doesn’t only recognize your intent, it additionally sends out the command (the JSON you can take a look at) to HA. Move over to the automation list in HA and you should see that the automation “Rhasspy GetDate” was executed. It should show a timestamp for the last execution (shouldn’t be too long ago, depending on how long you needed to switch over to HA).
Note: you won’t hear a spoken answer from Rhasspy, this is purely to check the connection to HA!
If this works correctly, now is the time to check whether your voice command and the answer work as well. Leave the “Home” page open and speak your wake word followed by one of the sentences. You should now see your spoken sentence in the “Recognize” field, followed by the command in the red box. And while you’re reading, you should hear your answer from HA spoken through Rhasspy.
Congratulations, your first voice command works! Rhasspy is doing its job and HA is ready to answer your questions or to do something for you. Pat yourself on the shoulder, you did great!
You think we’re done here? Nope, that’s only half the way, but don’t worry, from here on it’s more about expanding than doing something totally new. The next steps are to refine the sentences and send something to HA that actually does something, like switching a light.