Ask OpenAI questions from your default conversation agent!

Hi everyone. I wanted to be able to use my default voice assistant (which is the one I use to control HA entities) to also query OpenAI to be able to ask random trivia questions from my Wear OS smartwatch. Normally, you would have to switch the voice assistant to the OpenAI one for this to work. I found a workaround however which allows me to do both with the same assistant.
Here’s a video that shows it working:

I am now able to ask, for instance: "Ask OpenAI who is Barack Obama?" and get the answer back.
For this to work I use this awesome custom component which extends the conversation agent with regex capabilities and allows us to retrieve specific values from our conversation queries.

In the example above, the conversation agent captures the "who is Barack Obama" part, forwards it to the target conversation agent, and stores the response.

Setup instructions:

  1. Install Yarvis via HACS by adding the repository as a custom repository and then installing it.
  2. Set up Yarvis via the Integrations menu and set the intents to trigger on, e.g.:
AskOpenAI:
  sentences:
    - Ask OpenAI to\s(?P<query>.+)
    - Ask Open AI to\s(?P<query>.+)
    - Ask OpenAI\s(?P<query>.+)
    - Ask Open AI\s(?P<query>.+)
  3. Add a voice assistant (Settings → Voice assistants) or change your current one to use the Yarvis conversation agent:

  4. Set up an intent_script (and a template sensor) in your configuration.yaml and replace the agent_id:
    The agent_id can be found in the debug view for your target conversation agent:
    (Settings → Voice assistants → OpenAI (or whatever you called the OpenAI assistant) → three-dot menu → Debug)

intent_script:
  AskOpenAI:
    action:
      # Forward the captured query to the target (OpenAI) conversation agent.
      # Replace agent_id with the id of your own OpenAI agent (see step 4).
      - service: conversation.process
        data:
          text: "{{ query }}"
          language: EN
          agent_id: 50005158a4b775603223d530315c184e
        response_variable: agent
      # Fire an event carrying the full response so the template sensor
      # below can store it as an attribute.
      - event: openai_response_received
        event_data:
          full_response: "{{ agent.response.speech.plain.speech }}"
    speech:
      # Read the stored response back as the assistant's spoken answer.
      text: "{{ states['sensor.openai_query_response'].attributes.full_response }}"

template:
  - trigger:
      platform: event
      event_type: openai_response_received
    sensor:
      - name: "OpenAI query response"
        # The state only records when the last response arrived; the full text
        # lives in the attribute, which is not subject to the 255-character state limit.
        state: "{{ now() }}"
        attributes:
          full_response: "{{ trigger.event.data.full_response }}"

Restart HA or reload your YAML configuration to finish.
You should now be able to use your default voice assistant (the one that has Yarvis set as the conversation agent) to both control your HA entities and ask OpenAI questions without switching.
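To verify the setup without a voice satellite, you can also call the pipeline from Developer Tools → Services. A minimal sketch (not part of the original guide; the agent_id value is a placeholder for your Yarvis conversation agent and can be omitted if Yarvis is your default agent):

service: conversation.process
data:
  text: "Ask OpenAI who is Barack Obama"
  # Optional: target a specific agent; omit if Yarvis is already your default conversation agent.
  agent_id: 1234567890abcdef1234567890abcdef  # placeholder, replace with your own agent id

After the call, the full_response attribute of sensor.openai_query_response should update with the answer.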

Changelog

  • 2023-07-23: Initial version
  • 2023-07-24: Used template sensor triggered by custom event to hold full response instead of input_text which is limited to 255 characters.

Thanks, I tried to do this recently but couldn’t figure out how to trigger a wildcard sentence.

Small steps on the road to the ideal smart assistant. I think HA will need to deal with this issue soon — how to combine the locality and privacy of HA Assist/intents, with the desire for a conversational (meaning: GPT-powered) assistant, without manually switching back and forth.
Adding wake word processing emphasizes this issue, since a wake word can only trigger one assistant. For now, "Ask OpenAI" is great, but it's too similar to Alexa's antiquated paradigm of "ask [skill] to do [x]".


Have you tried saving the answer as an attribute of a template sensor? That way you wouldn't have the 255-character limitation. You could use the conversation trigger in a template sensor, so it fires on the response. Just off the top of my head, but it could be worth a try. :slight_smile:

Great guide btw. :+1: Much appreciated! :wave:

oh my gosh, thank you for reaching out on my post. I was already half way through trying to implement this exactly before irl obligations took over. So excited to try this out!!


Great idea. I have edited the post to change the input_text to a template sensor with an attribute holding the full response from OpenAI.


Is there any way to use Yarvis as a fallback for the default agent? I want sentences that are specific to Home Assistant to be checked first, and then send all unmatched requests to OpenAI to get closer to a natural conversation.

I.e. Hey Hal, turn on the lights > matches an Assist sentence and turns on the lights
Hey Hal, open the Pod bay doors, please > forwarded to the GPT API > response: "I'm sorry, Dave, I'm afraid I can't do that."

No, I don't think that's possible currently. There is no way to combine or fall back across multiple voice pipelines.

I agree 1000% that this is needed!

Well, actually there is; I'm doing just that with the help of a custom intent and Node-RED.

In short, I created a custom intent with a wildcard, passing everything Assist can't handle on to Node-RED. In Node-RED I then run my own OpenAI script and send the response back to HA.

I'm currently working on doing it with the Extended OpenAI Conversation integration, so it should be possible with only HA. I'm just not done yet.

The reason I go through Assist first is the faster response time, and to save money on the OpenAI API.

That sounds great! Could you show us the custom intent with the wildcard? How do you know when there is no match? Is that in the ESP32 firmware or in the YAML intent?

This will help with the custom intent with a wildcard, not with knowing when there is no match. I made one change: the sentence is "chat {question}", so it only uses OpenAI when I start a sentence with "chat."
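For reference, a minimal sketch of what such a custom sentence with a wildcard slot can look like (the file path, intent name, and slot name are illustrative, not taken from the post above):

# config/custom_sentences/en/chat.yaml
language: "en"
intents:
  AskChatGPT:
    data:
      - sentences:
          - "chat {question}"
lists:
  question:
    wildcard: true  # captures the rest of the spoken sentence into the 'question' slot

The captured question slot can then be forwarded to OpenAI with an intent_script similar to the AskOpenAI one earlier in this thread.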

Hello, I have seen this post and liked the idea, but like most of you I'd prefer not to have a special trigger word for OpenAI. I liked the idea of having a fallback for the default agent.
So I gave it a try and implemented that as a custom component. I just finished this 10 minutes ago and didn't have much time for testing, but I thought you should see this as soon as possible :slight_smile:

Just have a look at the code or try it out as you like. For me it seems to work quite well. Please read the README before setting it up, to understand how it works.

It simply lets you set up a list of agents which are called one after the other until an agent returns a successful result.


UPDATE: I have changed the repo URL and updated it in this post as well



This looks super cool! :smile: I’m really interested in your approach to creating a fallback mechanism for the default agent without the need for a special trigger word for OpenAI. I’ve had a look at your GitHub repo, and the concept of having a list of agents that are called sequentially until a successful result is obtained seems very practical.

I do have a question about how fast the fallback to OpenAI occurs. Does the system have to wait for a complete response from the initial assistant before it switches to OpenAI, and if so, does this mean there might be a significant delay before receiving an answer? I’m curious about how this impacts the overall responsiveness.

Great job on this, and thanks for sharing!


Hi, thanks for your feedback.
As I understand the API of the agent, I need to wait for the response of the agent to get its response object including the error property.

Fortunately the default agent seems to be quite fast when it comes to intent recognition - or at least to find that there is no match.
So I don’t feel any delay. I could add some debug logging to get an overview of the delays.

However, in my opinion, the delay of the default agent is negligible compared to the performance of ChatGPT itself.

——————
Update:

  1. If performance is that critical for you, one could call all agents in parallel and take the first (ordered) response. But I don't find that practical, and it could create strange side effects - and costs (OpenAI API calls).

  2. Another issue I have seen is that the OpenAI integration caches its own conversation history by saving all input and output messages into a history object. With my component, the OpenAI agent doesn't get all messages, but only those which are passed in as a fallback. So you can't ask OpenAI things related to the history ("what was the last thing I turned off?").

Great work! This should be a core feature.
This way you could add multi-language support too.


That’s a great idea, @DonNL !

I have improved the integration a bit and just released version 0.1.2.
Unfortunately there was a breaking change when I switched from the "main" branch to GitHub Releases, and additionally I renamed the Git repository (from hass to hacs) :slight_smile:

So please make sure to remove the integration and delete the repo from HACS. Then add it again using the following repo. From now on, I’ll try to reduce breaking changes as much as possible :smiley: .

For anyone else who wants to get more examples, here is my last Assist test conversation.
(I have told OpenAI to answer as a pirate :slight_smile:)

You can find all the instructions to set this up in the README.md of the repo.


Funnily enough, I made the same thing a couple weeks ago!

I implemented it a little differently, mainly due to this being the first custom integration I’ve ever written.

Great minds think alike I guess!

One important feature for me that I included was debugging. I wanted the ability to see which conversation agent responded, and even what each conversation agent's failure was, right in the final response, assuming debugging is enabled.

Thought you might find my implementation interesting as well.


Just wanted to chime in and thank you both, this was one of the last remaining blockers from my perspective in getting to a fully functional VA model for the house.

I ended up trying both repos linked here last night and did notice one behavioral difference I thought would be worth raising, as I'm not sure if it's working as designed (WAD) or not.

For @t0bst4r's implementation, when I use a custom sentence I've implemented that utilizes a wildcard (e.g. "play classical radio on the living room speakers", where "classical radio" is a wildcard slot), I found the chained agent skips Home Assistant and goes right to the backup agent (OpenAI in my case), failing to play the music as expected. All other custom sentences that do not have wildcard slots seem to work as expected.

When using @marisa's implementation, the behavior is as expected: a request for "play classical music on the living room speakers" is recognized as an HA intent and the appropriate script is executed.

Not sure if anyone else is seeing this behavior or if it's WAD, but I wanted to share. Again, thank you both for sharing your work here!

If I had to take a guess as to the reason, I assume it's because @t0bst4r's implementation doesn't pass a language along, whereas my implementation does. Intents are based on language, and custom intents probably just don't trigger unless a language is provided.

Interesting! I'm fairly certain my other custom intents that did not have wildcard slots triggered fine, so I wonder if it's tied to that wildcard? Either way, I appreciate the work and thought!