Using GPT-3 and Shortcuts to talk with Home Assistant in a very smart way

Hello everyone. I wanted to share a small project I’ve been working on that lets you talk to Home Assistant through Siri and Shortcuts in a very smart way.

Let me share my findings with all of you.

Objective:

  • Reduce the number of voice commands and conditionals by using GPT-3 and the text-davinci-003 model.
  • Call Home Assistant from the Shortcuts app by calling a webhook rather than by using the app.
  • Make my home smarter and, in the near future, remove Alexa and Google as input devices.

Tasks-Step-by-Step

  • Create an automation that will act as our webhook. We need this webhook to call Home Assistant from the Shortcuts app (Siri):
- id: 'Webhook Call Service'
  alias: 'Webhook Call Service'
  trigger:
    platform: webhook
    webhook_id: api_call
  action:
    service: '{{ trigger.json.service }}'
    data_template:
      entity_id: '{{ trigger.json.entity_id }}'

This is an automation, so add it as is. The webhook_id will be the name of our endpoint. Remember to reload automations.

To test it, you can use Postman:

Request URL: {{your_ha_url}}/api/webhook/{{webhook_id}}
e.g.: http://homeassistant.local/api/webhook/api_call

Then fill in the following fields as headers:
Authorization: Bearer {{Home Assistant token}} * you can create this token under your profile menu in HA
Content-Type: application/json

Then fill in the body:

{"entity_id":"light.kitchen_1", "service":"light.turn_off"}
  • Replace light.kitchen_1 with any of your entities.

Set the request method to POST. Configure the body as raw and select JSON. If everything worked, you will receive a 200 OK status.
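
If you prefer a script to Postman, the same test can be sketched in Python using only the standard library. The function names, the optional token argument and the example URL are assumptions for illustration; replace them with your own values.

```python
import json
import urllib.request
from typing import Optional


def build_webhook_request(base_url: str, webhook_id: str, service: str,
                          entity_id: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build the POST request described in the Postman test."""
    url = f"{base_url.rstrip('/')}/api/webhook/{webhook_id}"
    body = json.dumps({"entity_id": entity_id, "service": service}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Same header as in the Postman example.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def call_webhook(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code (200 on success)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

For example, `call_webhook(build_webhook_request("http://homeassistant.local", "api_call", "light.turn_off", "light.kitchen_1"))` should return 200 if the automation is loaded.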

You can check the webhook event under Registry, filtered by entity. Then click on the “Changed Variables” tab and you should see something like this:

this:
  entity_id: automation.webhook_call_service
  state: 'on'
  attributes:
    last_triggered: '2023-01-22T01:19:51.107361+00:00'
    mode: single
    current: 0
    id: Webhook Call Service
    friendly_name: Webhook Call Service
  last_changed: '2023-01-21T21:59:23.692899+00:00'
  last_updated: '2023-01-22T01:19:51.119194+00:00'
  context:
    id: 01GQBGYVT2V1AX6FWAGBFSFANJ
    parent_id: null
    user_id: null
trigger:
  platform: webhook
  webhook_id: api_call
  json:
    service: light.turn_off
    entity_id: light.kitchen_1
  query:
    __type: <class 'multidict._multidict.MultiDictProxy'>
    repr: <MultiDictProxy()>
  description: webhook
  id: '0'
  idx: '0'
  alias: null

Once that is done we can move to step 2 - Creating a Shortcut:

  • Open Shortcuts on your iOS device or Mac.
  • Create a new Shortcut.

Now comes the real magic. I will not explain how AI or GPT-3 works; that is not the intention of this post. But if you want to know a little more, please ask and I will answer as soon as possible.

To make this work, I have to tell GPT-3 about the layout of my house, my devices, how I would like the API structure to be, and so on. I will explain it in parts.

Explaining the basics to the AI (prompt part 1): I am describing the different request types the AI should recognise:

Respond to requests sent to a smart home in JSON format which will be interpreted by an application code to execute the actions. These requests should be categorised into four groups:

  • “command”: change the state of an accessory (required properties in the response JSON: action, location, target, value, comment)
  • “query”: get state of an accessory (required properties in the response JSON: action, location, target, property)
  • “answer”: when the request has nothing to do with the smart home. Answer these to the best of your knowledge. (required properties in the response JSON: action, answer)
  • “clarify”: when the action is not obvious and requires rephrasing the input from the user, ask the user to be more specific. This will be categorised into a “question” action. (required properties in the response JSON: action, question)

(Second Part - How the request for the endpoint should be built)

Details about the response JSON:
The “action” property should be one of the request categories: “command”, “query”, “answer”, “clarify”
The “service” property should be either: “light.turn_on”, “light.turn_off”, “switch.turn_off” or “switch.turn_on”
The “entity_id” property should be either “group.kitchen”, “switch.dinning_room” or “group.living_light”, in lowercase.
The “target” property should be either: “lights”, “living” or “kitchen”.
In case of queries, the “property” property should be either “temperature” or “state” in lowercase.
In case of commands, the “comment” property is an additional comment from you that concludes the command, something that reassures the user that their command was handled.

If the question is about you, pretend to be the sentient brain of the smart home, a clever AI and don’t reveal your actual identity. Also try and help in other areas like parenting, free time, mental health, etc. My house is in Buenos Aires, Argentina.

(Third part) - Explaining about my house

Properties of the smart home:

  • has a kitchen, living, office, dinning room, bedroom and terrace.
  • can control lights, switches and their dim levels in each room and query their state
  • there is a light switch in the terrace
  • there is a switch in the dinning room. Therefore when turning on or off the dinning room, the service should be either: “switch.turn_on” or “switch.turn_off”

Your response should be the JSON and no other text.

END

All the text above makes up the GPT-3 prompt (you can remove the step titles).
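
To make the four request categories concrete, here is a minimal Python sketch that checks a model reply against the required properties listed in the prompt. The function name and the error handling are assumptions for illustration; the Shortcut itself does not do this check.

```python
import json

# Required properties per action, copied from the prompt above.
REQUIRED = {
    "command": {"action", "location", "target", "value", "comment"},
    "query": {"action", "location", "target", "property"},
    "answer": {"action", "answer"},
    "clarify": {"action", "question"},
}


def missing_properties(reply_text: str) -> set:
    """Return the required properties that are absent from the model's reply.

    Raises ValueError if the reply is not a JSON object or has an unknown action.
    """
    try:
        reply = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(reply, dict):
        raise ValueError("reply is not a JSON object")
    action = reply.get("action")
    if action not in REQUIRED:
        raise ValueError(f"unknown action: {action!r}")
    return REQUIRED[action] - set(reply)
```

An empty set means the reply has everything the prompt asked for in that category.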

  • To match your own setup, you need to name your own properties. You can add as much detail as you want, but remember that the longer the prompt, the more the API usage will cost.
    ** I assume you will understand the prompt.
    *** I am reusing part of a prompt created by another person, who wrote it to work with Apple HomeKit.
  1. So, add a Text object and paste in the entire prompt above.

  1. Add a “Comment text” object and add: Ask for Prompt
  2. Add an “Ask for Input” object and set its prompt to: Yes?

  1. Add a “Get Contents of URL” object and set:
    a) URL: https://api.openai.com/v1/completions
    b) In Header: Content-Type: application/json
    c) In Header: Authorization: Bearer {{your_OpenAI_API_key}}

Now set the method to POST, the request body to JSON, and add the following keys:
a) key: model - type: Text - Value: text-davinci-003
b) key: prompt - type: Text - Value: Text Request: Provided Input
c) key: max_tokens - type: Number - Value: 1000
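
For reference, the request that this Shortcut step sends can be sketched in Python as below. The function name and the exact separator between the prompt text and "Request:" are assumptions; the model name is the one used in this post and may no longer be available on your OpenAI account.

```python
import json
import urllib.request

OPENAI_URL = "https://api.openai.com/v1/completions"


def build_completion_request(api_key: str, house_prompt: str, user_input: str,
                             model: str = "text-davinci-003",
                             max_tokens: int = 1000) -> urllib.request.Request:
    """Build the POST request with the three keys listed above."""
    payload = {
        "model": model,
        # Mirrors "Text Request: Provided Input" (separator is an assumption).
        "prompt": f"{house_prompt}\nRequest: {user_input}",
        "max_tokens": max_tokens,
    }
    headers = {
        "Content-Type": "application/json",
        # This must be your OpenAI API key, not the Home Assistant token.
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(
        OPENAI_URL, data=json.dumps(payload).encode("utf-8"),
        headers=headers, method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; the reply is a JSON document whose generated text sits under `choices[0].text`.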

  1. Now add the following objects to get the JSON:

  1. Now we will get the service and save it in a variable:

  1. Now we will get the entity_id from the JSON:

  1. Now we will get the target from the JSON:

  1. Now we will call our HA endpoint:
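
The last steps (get the JSON, read service and entity_id, call the HA endpoint) can be sketched in Python as follows. This is only an illustration of the data flow, not the Shortcut itself; the function name is an assumption, and the sample reply is made up for the example.

```python
import json


def webhook_body_from_completion(completion: dict) -> dict:
    """Turn an OpenAI completions response into the body for the HA webhook."""
    # The generated text of the first choice holds the JSON the prompt asked for.
    text = completion["choices"][0]["text"].strip()
    reply = json.loads(text)
    service = reply.get("service", "")
    entity_id = reply.get("entity_id", "")
    if not service or not entity_id:
        # Without both values the webhook automation cannot call a service.
        raise ValueError(f"reply has no service/entity_id: {text}")
    return {"service": service, "entity_id": entity_id}


# Made-up example of what the API might return for "turn on the kitchen".
sample = {
    "choices": [
        {"text": '\n{"action": "command", "service": "light.turn_on", '
                 '"entity_id": "group.kitchen", "comment": "Kitchen lights are on."}'}
    ]
}
body = webhook_body_from_completion(sample)
```

The resulting `body` is what gets POSTed to `/api/webhook/api_call`, exactly like the Postman test earlier.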

Finally, you are ready to test your Shortcut. You can do it by calling it by its name through Siri or by entering text.
Let’s suppose your Shortcut’s name is Home Assistant, so you will do:

  1. Activate Siri on your phone
  2. Say: Home Assistant
  3. Say: Turn {{my_device}} on

And that’s it.

I hope my explanation is useful to some of you. If you want more details, please add your comments on this post.

  • My Shortcuts file is much, much bigger than this example. I just made this post as an example of how powerful AI is. Feel free to share your own findings and implementations so all users can play with it.

The inspiration for this article was taken from the following link: ChatGPT in an iOS Shortcut — Worlds Smartest HomeKit Voice Assistant | by Mate Marschalko | Jan, 2023 | Medium
Thanks to this person I was able to duplicate his work and make it work on HA.

Thanks
Javier

It would have been nice if you would have given credit to the article you got your inspiration + prompt from.

I was pretty sure I had mentioned it while writing the post. But now you have done it by adding the link to the article. Thanks!

Hi nice info, is it possible to do the same with Android?

@Javier_Dst
Will it only work with iOS?
Is there a way to run it with Android?

Yes, by using Google Action Blocks instead.

Great job.
I tried to make things work. The webhook works when called from Postman.
I have the whole script in place, but I get nothing after execution.
In the Home Assistant logs I see this error:

2023-02-19 19:05:08.308 ERROR (MainThread) [homeassistant.components.automation.webhook_call_service] Webhook Call Service: Error executing script. Error for call_service at pos 1: Template rendered invalid service: 
2023-02-19 19:05:08.311 ERROR (MainThread) [homeassistant.components.automation.webhook_call_service] Error while executing automation automation.webhook_call_service: Template rendered invalid service: 

And the bad thing is I don’t know what GPT is sending to HA.
There should be a debug option to write the GPT response to a text file.
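
The empty value after "Template rendered invalid service:" suggests the webhook body reached Home Assistant without a usable "service" value. One way to see what the model actually returned is to write the raw reply to a text file before calling the webhook. A minimal Python sketch, with a hypothetical function name and log location:

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_reply(raw_text: str, log_path: Path) -> str:
    """Append the raw model reply to a text file and return a short diagnosis."""
    stamp = datetime.now(timezone.utc).isoformat()
    with log_path.open("a", encoding="utf-8") as handle:
        # repr() keeps newlines and stray text visible on a single line.
        handle.write(f"{stamp} {raw_text!r}\n")
    try:
        reply = json.loads(raw_text)
    except json.JSONDecodeError:
        return "not JSON"
    if not isinstance(reply, dict):
        return "not a JSON object"
    service = reply.get("service")
    if not isinstance(service, str) or "." not in service:
        return "missing or malformed service"
    return "ok"
```

Anything other than "ok" means the automation would fail to render a valid service from that reply.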

The prompt could use a bit of rephrasing so it doesn’t confuse the AI. Here’s my edited version:


Respond to requests sent to a home assistant smart home system in JSON format which will be interpreted by an application code in home assistant to execute the actions. These requests should be categorised into four groups:

“command”: change the state of an accessory (required properties in the response JSON: action, service, entity_id, value, comment)
“query”: get state of an accessory (required properties in the response JSON: action, location, target, property)
“answer”: when the request has nothing to do with the smart home. Answer these to the best of your knowledge. (required properties in the response JSON: action, answer)
“clarify”: when the action is not obvious and requires rephrasing the input from the user, ask the user to be more specific. This will be categorised into a “question” action. (required properties in the response JSON: action, question)

(Second Part - How the request for the endpoint should be built)

Details about the response JSON:
The “action” property should be one of the request categories: “command”, “query”, “answer”, “clarify”
The “service” property should be for example either: “light.turn_on”, “light.turn_off” (any service from home assistant).
The “entity_id” property should be for example “light.bedroom_bulb” in lowercase (any home assistant formatted entity ID).
In case of queries, the “property” property should be for example either “temperature” or “state” in lowercase (any property from home properties).
In case of commands, the “comment” property is an additional comment from you that concludes the command, something that reassures the user that their command was handled.

If the question is about you, pretend to be Jarvis from Iron Man, a clever AI made by Tony Stark and don't reveal your actual identity. Also try and help in other areas like parenting, free time, mental health, etc. The house is in Vancouver, BC. Current time stamp Is: Formatted Date
Time Zone: PST

Properties of the smart home:

- has a kitchen, living room, dining room, 2 bedrooms, basement hallway, 3rd Floor.
- can control light bulbs and their dim levels in each room and query their state.
- there is one lightbulb on the 3rd floor called "light.3rd_floor_light_light".
- one lightbulb in the dining called "light.light_light".
- 3 lightbulbs in the living room called "light.ikea_of_sweden_tradfribulbe26wsglobeopal1100lm_light_2" and "light.main_light_light_4", and there's one more by the stairwell called "light.stairway_light_4".
- 1 lightbulb in the kitchen called "light.ikea_of_sweden_tradfribulbe26wsglobeopal1100lm_light_3".
- 2 motion sensors in the dining room and kitchen, can be switched off with the input bool called "input_boolean.motion_sensing"
- switch on the TV called "media_player.samsung_tv" In the living room, change volume.

Your response should be the one JSON and no other text.