The Journey to Create my own Alexa Skill (Jarvis for the house) - Building a Custom Alexa Skill - Part 1

I’m going to attempt to share how I created my own Alexa skill for my house. I chose to go down this route because I was tired of using the “alexa-home” nodes and having to trigger things with “keywords”. “Alexa, reset bathroom lights on” doesn’t have the same ring to it as “Alexa, reset bathroom lights” (and with a custom skill you can get a response that is more than just a “bing”).

I’m going to document this journey in “real time”, as I am hoping that others will find value in it. Some things to note: I’m not looking for criticisms. I have a long history in IT (came from software development), I know when I’m cutting corners, and I am fully aware of what I am doing and when. If you should go down this route of creating a custom skill, I’m happy to share with you what I have picked up along the way, and I will see if I can help you out, but I have a wife, a family and a day job :smiley: .

Let’s start with the “system background” . . . My setup consists of:

  1. An old Dell laptop with Ubuntu installed on it, running Supervisor on “bare metal” (yes, this is an unsupported install)
  2. Zigbee / deconz dongle
  3. Eero mesh network
  4. Crappy Frontier Modem (ISP)
  5. A lot of zigbee devices and a couple of wi-fi ones (but that really doesn’t matter for this walk through)
  6. Running everything through containers

Design Concerns for me:
I’m cheap . . . so what can I do “shopping in my own closet”? I haven’t finished my skill yet (will it ever be finished?), so it is definitely not ready for general use. I wanted to make this as cheaply / easily as possible, thus I chose to stay on the “Amazon happy path for skill creation”. I also want to strive to keep as much as possible local. Thus, I only want to use the Alexa skill as an “interpreter”, with as much processing as possible occurring on my own internal hardware.
I wanted the node-red implementation to be as “simple” as possible, so that future growth / flexibility is there.

Hardware configuration:
First thing you will need to do is ensure that you have “ingress” into your Node-RED API (port 1880), as you will need external access to http://yourDomain:1880/endpoint/. I can’t tell you explicitly how to do that on your modem / router, but I can tell you that on my setup, port forwarding 1880 (external) to 1880 (internal) on the same IP address my HA system is exposed on also exposes the Node-RED API. We have chosen to go this route because we will end up using the HTTP node (as it allows GET) as opposed to the HA Websocket node (which only allows POST). We want GET because it will allow us to return data to our Alexa skill, so that she can talk to us!!
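To see why the GET-with-a-response part matters, here is a minimal sketch of the round trip in plain Python: a throwaway local HTTP server stands in for the Node-RED “http in” endpoint, and a GET to it returns a JSON payload the skill could speak back. The path and payload are made up for illustration; Node-RED would be doing this for real at http://yourDomain:1880/endpoint/.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class EndpointHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Node-RED would run your flow here; we just return a canned reply
        # that the Alexa skill could read back to the user.
        body = json.dumps({"speech": "The kitchen lights are now on"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

# Stand-in for the Node-RED endpoint, bound to a random free port.
server = HTTPServer(("127.0.0.1", 0), EndpointHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/endpoint/lights"
with urllib.request.urlopen(url) as resp:
    reply = json.loads(resp.read())
server.shutdown()

print(reply["speech"])
```

This is exactly what a POST-only webhook can’t give you cheaply: a body coming back that the skill can turn into speech.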

Once you have that done, go ahead and make your Alexa Developer account (covered all over the internet, not gonna do so here). Once you have your account, create a new skill: Choose custom and hosted endpoint (this is very important).

Alexa skills contain essentially 2 parts, as I have “designed” them in my brain. The first is the “speech app”. This is where you find the ability to interact with a computer via your voice. It is an interpreter and pattern matcher (more on that later). Based upon what the interpreter matches, it knows WHICH code that you have written to call. This code, since you chose the “hosted” option, will live in an AWS Lambda instance (easier for me, as I was having HTTPS issues and I didn’t want to recreate some backend “things” like data storage and session handling).
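To make the two-part split concrete, here is a bare-bones sketch of the Lambda half, written against the raw JSON event shape Alexa sends (no ask-sdk helpers). The intent name and the speech strings are my hypothetical examples, not anything Amazon generates for you:

```python
# Minimal Lambda entry point: dispatch on the request type the
# "speech interpreter" side matched, and hand back speech to say.
def lambda_handler(event, context=None):
    request = event["request"]
    if request["type"] == "LaunchRequest":
        speech = "Jarvis online. What do you need?"
    elif request["type"] == "IntentRequest":
        intent = request["intent"]["name"]
        speech = f"Handling {intent}"  # real code would dispatch per intent
    else:
        speech = "Goodbye."
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }

# Simulate the kind of event Alexa would POST to the Lambda:
fake_event = {"request": {"type": "IntentRequest",
                          "intent": {"name": "HouseLightsIntent"}}}
print(lambda_handler(fake_event)["response"]["outputSpeech"]["text"])
```

The interpreter decides WHICH intent fired; the Lambda decides what happens and what gets said back.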

Once your skill is created, we will now set up VS Code to support your development needs. First you need to ensure that you are NOT running the most recent version of VS Code. It has to be version 1.5 or less. Once you have that installed, go to the marketplace and install git and the Alexa SDK. Restart VS Code. On the left hand side, you should see the Alexa icon; click that, and you will see an option pop up that asks if you want to create / download a skill. Choose download, log in with your developer acct info, and if git has properly installed (not gonna walk you through configuring git; again, that is all over the internet), you will download your code. If you get a . . . err . . . git error, add the path to git.exe (not any git bash terminals) into your PATH variable. Restart VS Code and “Bob’s your uncle” (go back and try to pull down your skill again; it should work. If it doesn’t, don’t call me, as I don’t know git).

Side note, and you WILL run into this later: once you commit your code locally and you go to push it . . . Amazon gets a little wonky and you will NOT see your files synced, especially if you look in the code editor in the browser (see issue #355 on the alexa/ask-cli GitHub repo: “[ask-cli v2] After git push origin master, message The master branch of your skill is ahead of the dev version on Alexa console”). To fix this, after a commit, type:

git checkout dev
git pull --rebase
git merge master
# fix any merge conflict if you have any
git push --no-verify

Apparently the way branches are named for our Lambda functions is a little wonky.

Once you have reached this point, you will have a skill “skeleton” in AWS that you can play with.

So let’s talk “code”. Remember that “model” I mentioned earlier, about how there are 2 parts to the app? We have the “speech interpreter”, and we have code that executes based upon what the “speech interpreter” tells it to do.

Let’s create a SLOT first. When logged into your developer console, on the left hand side you will see a menu option called “Assets”; open that and go to “Slot Types”. Think of a Slot Type as a . . . well . . . a constraint for a variable. Create a slot type for what you “want”. For example: I want to be able to say “Alexa, ask House Jarvis to turn kitchen on”. Let’s pause here and break down this sentence:
“Alexa, {ask | tell | open} {unique app name} [to turn {slot1} {slot2}]”. We will only really care about the things in the []. Amazon refers to what is in the [] as an utterance. You get to define “patterns” for the utterances. In my case I have “turn {entity} {action}”. For my “action” slot, I have predefined words that work for my family, things such as “on”, “off”, “dim”, “brighten”. This list of words is a constraint for that slot, and for the slot to be valid, it will need to contain one of those words. You can also use the built-in AMAZON. slot types as a shortcut . . . for example, why create your own list of colors for a slot when there is an AMAZON.Color slot with more colors in the list than you could probably EVER come up with? Accepted values for Amazon slot types can be found on the internet.
By defining the SLOT type first, then we can use it in our intent.
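A custom slot type really boils down to a named list of approved values that a spoken word must match. Here is a tiny Python sketch of that idea (the slot name and values are my examples from above, not Amazon’s actual schema):

```python
# A custom slot type is conceptually just a constrained value list.
ACTION_SLOT = {
    "name": "ActionType",
    "values": ["on", "off", "dim", "brighten"],
}

def slot_is_valid(slot_type, value):
    """A spoken value only 'matches' if it is in the approved list."""
    return value.lower() in slot_type["values"]

print(slot_is_valid(ACTION_SLOT, "dim"))     # True
print(slot_is_valid(ACTION_SLOT, "purple"))  # False
```

That “purple” failure is exactly why you’d reach for AMAZON.Color instead of hand-rolling a color list.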

Now, go back to your left nav and click on “Interaction Model” and then on “Intents”. You will see 5 predefined intents that Amazon gives you . . . don’t worry about them for now; you will actually need several of them later.

Now click on “Add Intent”, make sure the Create custom intent radio button is clicked, and enter a name for your intent. The first Intent I did I called “HouseLightsIntent” (hold on to this name, you WILL need it later). This is where you get to start figuring out “how” people will talk to your skill. To get you started, in my “HouseLightsIntent”, I have “utterances” or phrases such as:

  • to set {entity} to {color}
  • {entity} to {action}
  • set {entity} lights {action}
  • ^^This one is my favorite due to how flexible it is . . . one “utterance” and I can turn any thing in my house on or off (as long as it is supported by HA)

Reminder . . . the things in the {} are slot values and contain “an approved list of things”. My entity list is the “human words” for my lights and switches. We will make a mapper in Node-RED in a follow-up post. Once you define your “utterances” and you put the {} in . . . the console will ask you if you want to “create” a new slot. Click add and then scroll to the bottom of the page. You will see the same slot name, and then you need to choose your slot type (this is why we created our slot types first!). NOTE: the skills console does a lot of “auto saving”. If you don’t see your slot type listed in the drop down, scroll to the top of the screen and click “build model” (it will save your model at this point if you need it saved).
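To preview where this is going, here is a hypothetical sketch of the “mapper” idea: pull the {entity} and {action} slot values out of an Alexa IntentRequest and translate the human word into an HA entity_id before calling the Node-RED GET endpoint. The entity map, slot names, and URL are placeholders for what the follow-up post will actually build in Node-RED:

```python
# Human words -> HA entity_ids (placeholder mapping for illustration).
ENTITY_MAP = {"kitchen": "light.kitchen", "bathroom": "light.bathroom"}

def build_node_red_url(event, base="http://yourDomain:1880/endpoint/lights"):
    """Turn an Alexa IntentRequest's slots into a Node-RED GET URL."""
    slots = event["request"]["intent"]["slots"]
    entity = slots["entity"]["value"].lower()
    action = slots["action"]["value"].lower()
    entity_id = ENTITY_MAP[entity]  # real code should handle unknown entities
    return f"{base}?entity_id={entity_id}&action={action}"

# Shape of the slot data inside a real IntentRequest, faked here:
fake_event = {"request": {"intent": {"slots": {
    "entity": {"value": "Kitchen"},
    "action": {"value": "on"},
}}}}
print(build_node_red_url(fake_event))
```

Because it’s a GET, whatever the Node-RED flow returns from that URL can flow back up to the Lambda and come out of Alexa’s mouth.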

Once you have an intent built (AND I RECOMMEND ONLY DOING ONE AT A TIME FOR YOUR OWN SANITY), click save model and build model. This is what “teaches” Amazon how you will talk to the skill (the utilization of ML and AI is beyond the scope of this).

This is as good a place as any to pause . . . I’m on a roll and want to get back to things. I’ll post more later.
