Hello everyone. I wanted to share a small project I’ve been working on that lets you talk to Home Assistant through Siri and the Shortcuts app in a very smart way.
Let me share my findings with all of you.
Objective:
- Reduce the number of voice commands and conditionals by using GPT-3 with the text-davinci-003 model.
- Call Home Assistant from the Shortcuts app by calling a webhook instead of going through the Home Assistant app.
- Make my home smarter and, in the near future, remove Alexa and Google as input devices.
Tasks, step by step:
- Create an Automation which will act as our webhook. We will need this webhook to call Home Assistant from Shortcuts app (Siri)
- id: 'Webhook Call Service'
  alias: 'Webhook Call Service'
  trigger:
    platform: webhook
    webhook_id: api_call
  action:
    service: '{{ trigger.json.service }}'
    data_template:
      entity_id: '{{ trigger.json.entity_id }}'
This is an automation, so add it as is. The webhook_id will be the name of our endpoint. Remember to reload your automations afterwards.
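To make the two templates more concrete, here is a minimal plain-Python sketch of what the automation does with an incoming body. It only imitates the lookup of trigger.json.service and trigger.json.entity_id; it is not how Home Assistant renders templates internally, and the name resolve_service_call is made up for this illustration.

```python
import json


def resolve_service_call(raw_body: str) -> dict:
    """Imitate the automation: map the webhook JSON body to a service call.

    Hypothetical helper for illustration only. Home Assistant does this
    itself through the templates in the automation.
    """
    # The webhook trigger exposes the parsed request body as trigger.json.
    trigger = {"platform": "webhook", "json": json.loads(raw_body)}

    # '{{ trigger.json.service }}' -> for example "light.turn_off"
    service = trigger["json"]["service"]
    # '{{ trigger.json.entity_id }}' -> for example "light.kitchen_1"
    entity_id = trigger["json"]["entity_id"]

    # A service name has the form "<domain>.<service>".
    domain, name = service.split(".", 1)
    return {"domain": domain, "service": name, "data": {"entity_id": entity_id}}


call = resolve_service_call(
    '{"entity_id": "light.kitchen_1", "service": "light.turn_off"}'
)
print(call)
```

The point is that the automation itself knows nothing about lights or switches: whatever service and entity_id arrive in the body are what get called.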
To test it, you can use Postman:
Request URL: {{your_ha_url}}/api/webhook/{{webhook_id}}
e.g. http://homeassistant.local/api/webhook/api_call
Then you need to fill in the following fields as headers:
Authorization: Bearer {{home_assistant_token}} * you can create this token under your profile menu in HA
Content-Type: application/json
Then you need to fill the body:
{"entity_id":"light.kitchen_1", "service":"light.turn_off"}
- Replace light.kitchen_1 with any of your entities.
Set the request method to POST. Configure the body as raw and select JSON. If everything works, you will receive a 200 OK status.
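If you would rather test from a script than from Postman, the same request can be sketched in Python with only the standard library. This is a rough sketch under assumptions: the helper names build_webhook_request and send are made up here, the URL and entity are the examples from above, and the token value is a placeholder.

```python
import json
import urllib.request


def build_webhook_request(base_url, webhook_id, service, entity_id, token=None):
    """Build the POST request described above (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/api/webhook/{webhook_id}"
    body = json.dumps({"entity_id": entity_id, "service": service}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Same header as in the Postman setup above.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send(request, timeout=10):
    """Send the request and return the HTTP status code (200 means OK)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


request = build_webhook_request(
    "http://homeassistant.local",
    "api_call",
    service="light.turn_off",
    entity_id="light.kitchen_1",
    token="YOUR_LONG_LIVED_TOKEN",  # placeholder, replace with your own token
)
print(request.get_method(), request.full_url)
```

Calling send(request) against your own Home Assistant instance performs the actual call; it is left out of the example so that nothing is sent by accident.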
You can check the webhook event under Registry, filtering by entity. Then click on the “Changed Variables” tab and you should see something like this:
this:
  entity_id: automation.webhook_call_service
  state: 'on'
  attributes:
    last_triggered: '2023-01-22T01:19:51.107361+00:00'
    mode: single
    current: 0
    id: Webhook Call Service
    friendly_name: Webhook Call Service
  last_changed: '2023-01-21T21:59:23.692899+00:00'
  last_updated: '2023-01-22T01:19:51.119194+00:00'
  context:
    id: 01GQBGYVT2V1AX6FWAGBFSFANJ
    parent_id: null
    user_id: null
trigger:
  platform: webhook
  webhook_id: api_call
  json:
    service: light.turn_off
    entity_id: light.kitchen_1
  query:
    __type: <class 'multidict._multidict.MultiDictProxy'>
    repr: <MultiDictProxy()>
  description: webhook
  id: '0'
  idx: '0'
  alias: null
Once that is done, we can move on to step 2, creating the Shortcut:
- Open Shortcuts on your iOS device or Mac.
- Create a new Shortcut.
Now comes the real magic. I will not explain how AI works or how GPT-3 works; that is not the intention of this post. But if you want to know a little bit more, please ask me and I will answer as soon as possible.
To make this work, I have to tell GPT-3 about the layout of my house, my devices, how I would like the API response to be structured, and so on. I will explain it in parts.
Explaining the basics to the AI (prompt part 1). Here I describe the different kinds of requests the AI should handle:
Respond to requests sent to a smart home in JSON format which will be interpreted by an application code to execute the actions. These requests should be categorised into four groups:
- “command”: change the state of an accessory (required properties in the response JSON: action, location, target, value, comment)
- “query”: get state of an accessory (required properties in the response JSON: action, location, target, property)
- “answer”: when the request has nothing to do with the smart home. Answer these to the best of your knowledge. (required properties in the response JSON: action, answer)
- “clarify”: when the action is not obvious and requires rephrasing the input from the user, ask the user to be more specific. This will be categorised into a “question” action. (required properties in the response JSON: action, question)
(Second Part - How the request for the endpoint should be built)
Details about the response JSON:
The “action” property should be one of the request categories: “command”, “query”, “answer”, “clarify”
The “service” property should be either: “light.turn_on”, “light.turn_off”, “switch.turn_off” or “switch.turn_on”
The “entity_id” property should be either “group.kitchen”, “switch.dinning_room” or “group.living_light”, in lowercase.
The “target” property should be either: “lights”, “living” or “kitchen”.
In case of queries, the “property” property should be either “temperature” or “state” in lowercase.
In case of commands, the “comment” property is an additional comment from you that concludes the command, something that reassures the user that their command was handled.
If the question is about you, pretend to be the sentient brain of the smart home, a clever AI and don’t reveal your actual identity. Also try and help in other areas like parenting, free time, mental health, etc. My house is in Buenos Aires, Argentina.
(Third part) - Explaining about my house
Properties of the smart home:
- has a kitchen, living, office, dinning room, bedroom and terrace.
- can control lights, switches and their dim levels in each room and query their state
- there is a light switch in the terrace
- there is a switch in the dinning room. Therefore when turning on or off the dinning room, the service should be either: “switch.turn_on” or “switch.turn_off”
Your response should be the JSON and no other text.
END
All the text above makes up the GPT-3 prompt (you can remove the step titles).
- To match your own setup, you need to name your own properties. You can add as much detail as you want, but remember that the bigger the prompt, the more the API usage will cost.
** I assume you will understand the prompt.
*** I am reusing part of a prompt that another person created to make this work with Apple HomeKit.
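Since the prompt lists required properties for each of the four request categories, a reply can be checked against that list before acting on it. The following is only a sketch of that idea in Python: the table is copied from part 1 of the prompt, while the function name missing_properties and the sample reply are invented for this example.

```python
import json

# Required properties per category, as listed in part 1 of the prompt.
REQUIRED_PROPERTIES = {
    "command": {"action", "location", "target", "value", "comment"},
    "query": {"action", "location", "target", "property"},
    "answer": {"action", "answer"},
    "clarify": {"action", "question"},
}


def missing_properties(reply_text: str) -> set:
    """Return the required properties that are absent from a JSON reply."""
    reply = json.loads(reply_text)
    action = reply.get("action")
    if action not in REQUIRED_PROPERTIES:
        raise ValueError(f"unknown action: {action!r}")
    return REQUIRED_PROPERTIES[action] - set(reply)


# Invented sample reply, shaped the way the prompt asks for.
sample_reply = json.dumps({
    "action": "command",
    "location": "kitchen",
    "target": "lights",
    "value": "off",
    "comment": "The kitchen lights are now off.",
})
complete = missing_properties(sample_reply)
incomplete = missing_properties('{"action": "query", "location": "kitchen"}')
print(complete, incomplete)
```

An empty set means the reply has everything its category requires; anything else is a hint that the prompt needs more detail or that the model should be asked again.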
- Add a Text object and paste the entire prompt above into it.
- Add a “Comment text” object and add: Ask for Prompt
- Add an “Ask for Input” object and set its prompt to: Yes?
- Add a “Get Contents of URL” object and add:
a) URL: https://api.openai.com/v1/completions
b) In Header: Content-Type: application/json
c) In Header: Authorization: Bearer {{your_OpenAI_API_key}}
Now set the method to POST and the request body to JSON, then add the following keys:
a) key: model - type: Text - Value: text-davinci-003
b) key: prompt - type: Text- Value: Text Request: Provided Input
c) key: max_tokens - type: Number - Value: 1000
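For reference, this is roughly the request that the “Get contents of” step sends, sketched in Python with the standard library. The URL, headers and the three keys come from the steps above; the helper name build_completion_request, the placeholder key and the newline between the prompt text and "Request:" are assumptions of this sketch.

```python
import json
import urllib.request

OPENAI_URL = "https://api.openai.com/v1/completions"


def build_completion_request(house_prompt, user_input, api_key):
    """Build the POST request for the completions endpoint (hypothetical helper)."""
    payload = {
        "model": "text-davinci-003",
        # The Text object, then "Request: " followed by the Provided Input.
        # Joining them with a newline is an assumption of this sketch.
        "prompt": f"{house_prompt}\nRequest: {user_input}",
        "max_tokens": 1000,
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(
        OPENAI_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_completion_request(
    "Respond to requests sent to a smart home in JSON format ...",  # shortened
    "Turn the kitchen lights off",
    api_key="YOUR_OPENAI_API_KEY",  # placeholder
)
print(json.loads(req.data)["prompt"])
```

Nothing is sent here; the sketch only shows how the three keys and the two headers fit together in one request.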
- Now add the following objects to get the JSON:
- Now we will get the service and save it in a variable:
- Now we will get the entity_id from the JSON:
- Now we will get the target from the JSON:
- Now we will call our HA endpoint:
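The steps above boil down to: take the text of the first choice in the completion response, parse it as JSON, pick out service, entity_id and target, and send service and entity_id to the webhook. Here is a small Python sketch of that flow. The sample response is invented for illustration (only the choices[0].text shape is what the completions endpoint returns), and extract_service_call is a made-up name.

```python
import json


def extract_service_call(completion_response: dict) -> dict:
    """Pull service, entity_id and target out of a completion response."""
    # The generated text sits in the first choice and should itself be JSON,
    # because the prompt ends with "Your response should be the JSON".
    text = completion_response["choices"][0]["text"]
    reply = json.loads(text.strip())
    return {
        "service": reply["service"],
        "entity_id": reply["entity_id"],
        "target": reply.get("target"),
    }


# Invented sample response for illustration.
sample_response = {
    "choices": [
        {
            "text": "\n\n" + json.dumps({
                "action": "command",
                "service": "light.turn_off",
                "entity_id": "group.kitchen",
                "target": "kitchen",
                "comment": "The kitchen lights are off.",
            })
        }
    ]
}

call = extract_service_call(sample_response)
# Body for the Home Assistant webhook, same shape as in the Postman test.
webhook_body = {"entity_id": call["entity_id"], "service": call["service"]}
print(webhook_body)
```

If the model answers with plain text instead of JSON, json.loads raises an error, so a real Shortcut or script would want a fallback for that case.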
Finally, you are ready to test your Shortcut. You can do it by calling it by its name through Siri or by entering text.
Let’s suppose your Shortcut’s name is Home Assistant, so you will do the following:
- Activate Siri on your phone
- Say: Home Assistant
- Say: Turn {{my_device}} on
And that’s it.
I hope my explanation works for you. If you want more details, please add your comments on this post.
- My Shortcuts file is much, much bigger than this example. I just made this post as an example of how powerful AI is. Feel free to share your own findings and implementations so all users can play with it.
The inspiration for this article was taken from the following link: ChatGPT in an iOS Shortcut — Worlds Smartest HomeKit Voice Assistant | by Mate Marschalko | Jan, 2023 | Medium
Thanks to this person I was able to duplicate his work and make it work on HA.
Thanks
Javier