Hi everyone. I wanted my default voice assistant (the one I use to control HA entities) to also be able to query OpenAI, so that I can ask random trivia questions from my Wear OS smartwatch. Normally you would have to switch to the OpenAI voice assistant for this to work, but I found a workaround that lets me do both with the same assistant.
Here’s a video that shows it working:
For instance, I can now ask: “Ask OpenAI who is Barack Obama?” and get the answer back.
For this to work I use this awesome custom component, which extends the conversation agent with regex capabilities and lets us retrieve specific values from our conversation queries.
In the example above, the conversation agent extracts the “who is Barack Obama” part, forwards it to the target conversation agent, and stores the response.
Setup instructions:
Install Yarvis via HACS: add the repository as a custom repository, then install the integration.
Set up Yarvis via the integrations menu and set the intents to trigger on, e.g.:
```yaml
AskOpenAI:
  sentences:
    - Ask OpenAI to\s(?P<query>.+)
    - Ask Open AI to\s(?P<query>.+)
    - Ask OpenAI\s(?P<query>.+)
    - Ask Open AI\s(?P<query>.+)
```
Add a voice assistant (Settings > Voice assistants) or change your current one to use the Yarvis conversation agent.
Set up an intent_script (and a template sensor) in your configuration.yaml and replace the agent_id; a sketch follows below.
The agent_id can be found on the debug page for your target conversation agent (Settings > Voice assistants > OpenAI (or whatever you called the OpenAI assistant) > three-dot menu > Debug).
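Something along these lines should work as a starting point. Treat it as a minimal sketch rather than an exact config: the event name, sensor name, and the agent_id placeholder are just examples, and it assumes a HA version where conversation.process can return its result via response_variable.

```yaml
intent_script:
  AskOpenAI:
    action:
      # Forward the captured query to the target (OpenAI) conversation agent
      - service: conversation.process
        data:
          agent_id: REPLACE_WITH_YOUR_AGENT_ID
          text: "{{ query }}"
        response_variable: openai_result
      # Fire a custom event carrying the answer so the template sensor below can store it
      - event: openai_response_received
        event_data:
          response: "{{ openai_result.response.speech.plain.speech }}"
    speech:
      # Speak the stored answer back
      text: "{{ state_attr('sensor.openai_response', 'response') }}"

template:
  - trigger:
      - platform: event
        event_type: openai_response_received
    sensor:
      - name: OpenAI Response
        # Keep the state short (states are limited to 255 characters);
        # the full answer lives in the attribute instead
        state: "{{ now().isoformat() }}"
        attributes:
          response: "{{ trigger.event.data.response }}"
```

The query variable comes from the named capture group (?P<query>...) in the Yarvis sentences above.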
Restart HA or reload your YAML configuration to finish.
You should now be able to use your default voice assistant (the one that has Yarvis set as the conversation agent) to both control your HA entities and ask OpenAI questions without switching.
Changelog
2023-07-23: Initial version
2023-07-24: Switched to a template sensor triggered by a custom event to hold the full response, instead of input_text, which is limited to 255 characters.
Thanks, I tried to do this recently but couldn’t figure out how to trigger a wildcard sentence.
Small steps on the road to the ideal smart assistant. I think HA will need to deal with this issue soon — how to combine the locality and privacy of HA Assist/intents, with the desire for a conversational (meaning: GPT-powered) assistant, without manually switching back and forth.
Adding wake word processing emphasizes this issue, since a wake word can only trigger one assistant. For now, “Ask OpenAI” is great, but it’s too similar to Alexa’s antiquated paradigm of “ask [skill] to do [x]”.
Have you tried saving the answer as an attribute of a template sensor? That way you wouldn’t have the 255-character limit. You could use the conversation trigger in a template sensor, so it fires on the response. Just off the top of my head, but it could be worth a try.
Oh my gosh, thank you for reaching out on my post. I was already halfway through trying to implement exactly this before IRL obligations took over. So excited to try this out!!
Is there any way to use Yarvis as a fallback to the default agent? I want sentences that are specific to Home Assistant to be checked first, and then send all unmatched requests to OpenAI to get closer to a natural conversation.
E.g. “Hey Hal, turn on the lights” > matches an Assist sentence and turns on the lights.
“Hey Hal, open the pod bay doors, please” > forwarded to the GPT API > response: “I’m sorry, Dave, I’m afraid I can’t do that.”
Well, actually there is; I’m doing just that with the help of a custom intent and Node-RED.
In short, I created a custom intent with a wildcard, passing everything Assist can’t handle on to Node-RED. In Node-RED I then use my own OpenAI script and send the response back to HA.
I’m currently working on doing it with Extended OpenAI Conversation, so it should be possible with HA only. I’m just not done yet.
The reason I go through Assist first is the faster response time, and to save money on the OpenAI API.
That sounds great! Could you show us the custom intent with the wildcard? How do you know when there is no match? Is that in the ESP32 firmware or in the YAML intent?
This will help with the custom intent with a wildcard, but not with knowing when there is no match. I made one change: my sentence is “chat {question}”, so it will only use OpenAI when I start a sentence with “chat”.
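In case it helps, the stripped-down sentence file could look roughly like this (a sketch; the file path and intent name are placeholders, and {question} is declared as a wildcard list so it matches everything after “chat”):

```yaml
# config/custom_sentences/en/chat.yaml (path and intent name are just examples)
language: "en"
intents:
  ChatWithOpenAI:
    data:
      - sentences:
          - "chat {question}"
lists:
  question:
    # A wildcard list matches any free-form text in that slot
    wildcard: true
```

The ChatWithOpenAI intent can then be handled by an intent_script that forwards {{ question }} on to whatever backend you use, be that Node-RED or conversation.process.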
Hello, I have seen this post and liked the idea, but like most of you I’d prefer not to have a special trigger word for OpenAI. I liked the idea of having a fallback for the default agent.
So I gave it a try and implemented it as a custom component. I just finished it 10 minutes ago and haven’t had much time for testing, but I thought you should see it as soon as possible.
Just have a look at the code or try it out as you like. For me it seems to work quite well. Please read the README before setting it up to understand how it works.
It simply lets you set up a list of agents that are called one after the other until one of them returns a successful result.
UPDATE: I have changed the repo URL and updated it in this post as well.
This looks super cool! I’m really interested in your approach to creating a fallback mechanism for the default agent without the need for a special trigger word for OpenAI. I’ve had a look at your GitHub repo, and the concept of having a list of agents that are called sequentially until a successful result is obtained seems very practical.
I do have a question about how fast the fallback to OpenAI occurs. Does the system have to wait for a complete response from the initial assistant before it switches to OpenAI, and if so, does this mean there might be a significant delay before receiving an answer? I’m curious about how this impacts the overall responsiveness.
Hi, thanks for your feedback.
As I understand the agent API, I need to wait for the agent’s response to get its response object, including the error property.
Fortunately, the default agent seems to be quite fast when it comes to intent recognition, or at least at finding that there is no match.
So I don’t feel any delay. I could add some debug logging to get an overview of the delays.
However, in my opinion, the delay of the default agent is negligible compared to the performance of ChatGPT itself.
——————
Update:
If performance is that critical for you, one could call all agents in parallel and take the first (ordered) response. But I don’t find that practical, and it could create strange side effects, and costs (OpenAI API).
Another issue I have seen is that the OpenAI integration caches its own conversation history by saving all input and output messages into a history object. With my component, the OpenAI agent doesn’t get all messages, only those which are passed on as a fallback. So you can’t ask OpenAI things related to the history (“what was the last thing I turned off?”).
I have improved the integration a bit and just released version 0.1.2.
Unfortunately there was a breaking change when I switched from the “main” branch to GitHub Releases, and I additionally renamed the Git repository (from hass to hacs).
So please make sure to remove the integration and delete the repo from HACS, then add it again using the following repo. From now on, I’ll try to reduce breaking changes as much as possible.
For anyone else who wants to get more examples, here is my last Assist test conversation.
(I have told OpenAI to answer as a pirate.)
Funnily enough, I made the same thing a couple weeks ago!
I implemented it a little differently, mainly due to this being the first custom integration I’ve ever written.
Great minds think alike I guess!
One important feature I included was debugging. I wanted the ability to see which conversation agent responded, and even what each conversation agent’s failures were, right in the final response, assuming debugging is enabled.
Thought you might find my implementation interesting as well.
Just wanted to chime in and thank you both, this was one of the last remaining blockers from my perspective in getting to a fully functional VA model for the house.
I ended up trying both repos linked here last night and did notice one behavioral difference I thought would be worth raising, as I’m not sure if it’s working as designed (WAD) or not.
For @t0bst4r’s implementation, when I use a custom sentence I’ve implemented that utilizes a wildcard (e.g. “play classical radio on the living room speakers”, where ‘classical radio’ is a wildcard slot), I found the chained agent skips Home Assistant and goes right to the backup agent (OpenAI in my case), failing to play the music as expected. All other custom sentences that do not have wildcard slots seem to work as expected.
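For reference, that kind of wildcard sentence looks roughly like this in a custom sentences file (simplified; the intent and list names are placeholders):

```yaml
language: "en"
intents:
  PlayRadioStation:
    data:
      - sentences:
          - "play {station} on the living room speakers"
lists:
  station:
    # Wildcard slot: matches any free-form text, e.g. "classical radio"
    wildcard: true
```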
When using @marisa’s implementation, the behavior is as expected: a request for “play classical music on the living room speakers” is recognized as an HA intent and the appropriate script is executed.
Not sure if anyone else is seeing this behavior or if it’s WAD, but I wanted to share. Again, thank you both for sharing your work here!
If I had to take a guess as to the reason, I assume it’s because @t0bst4r’s implementation doesn’t pass a language along, whereas my implementation does. Intents are based on language, and custom intents probably just don’t trigger unless a language is provided.
Interesting! I’m fairly certain my other custom intents that did not have wildcard slots triggered fine, so I wonder if it’s tied to that wildcard? Either way, I appreciate the work and thought!