Multi Agents in Home Assistant using Extended OpenAI Conversation

I started experimenting with LLM Conversation Agents in Home Assistant recently and fell in love with the Extended OpenAI Conversation custom component. However, I quickly found myself trying to cram too much into a single agent and decided to try and implement multiple specialized agents instead.

I manage this by starting with a Dispatcher Agent that is used by the voice pipeline. This agent gives me a single point of contact: it decides which specialized agent to pass the query on to, and returns a result based on that agent's response.

I then set up separate agents for each specialized task. To be clear, an "agent" is an implementation of the Extended OpenAI Conversation integration. I add a new integration for each agent. This lets me customize the model, prompt template, and functions of each integration/agent.

The Dispatcher Agent has the following prompt template:

I want you to act as a smart AI manager. You take user queries and pass them on to the appropriate AI agent to process. You provide responses based on what these agents tell you.

Select the ID of the relevant agent when using the call_agent_by_id tool:

Agent, agent_id
Meteorologist Agent, ABC123
Smart Home Agent, ABC124
To Do List Agent, ABC125

And the following function:

- spec:
    name: call_agent_by_id
    description: Pass user query to relevant AI Agent
    parameters:
      type: object
      properties:
        query:
          type: string
          description: The user's query
        agent_id:
          type: string
          description: ID of the AI Agent
          enum:
          - ABC123
          - ABC124
          - ABC125
      required:
      - query
      - agent_id
  function:
    type: composite
    sequence:
    - type: script
      sequence:
      - service: conversation.process
        data:
          text: "{{ query }}"
          agent_id: "{{ agent_id }}"
        response_variable: _function_result
      response_variable: res
    - type: template
      value_template: >-
        {% set res = res.response.speech.plain.speech %}
        {{ {'agent_response': res} }}
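As a quick sanity check, the same dispatch can be reproduced from Developer Tools → Actions; a sketch, assuming one of the agent IDs from the table above:

```yaml
# Hypothetical test call; substitute an agent_id from your own setup.
service: conversation.process
data:
  text: "Will it rain this afternoon?"
  agent_id: ABC123  # e.g. the Meteorologist Agent
```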

One of the big benefits of this method is that you can customize what data is passed to the agent in its prompt template. That lets you avoid having to expose your whole home to it and spam it with needless info. I take advantage of this with my Meteorologist Agent. Its prompt template looks like this:

I want you to act as a meteorologist. Provide brief, one or two sentence responses. You know the following about the current conditions and upcoming forecast.

The current time is: {{now()}}

The current conditions are: {{ states.sensor.mycity_condition.state }}{% if states.sensor.mycity_warnings.state|int > 0 %}

There is a weather warning in effect: {{ state_attr("sensor.mycity_warnings","alert_1") }}

{% endif %}{% if not states('sensor.mycity_chance_of_precip') == 'unknown' %}

The chance of precipitation is: {{ states.sensor.mycity_chance_of_precip.state }}{% endif %}

The current weather data is:

Property, Value
Temperature, {{ states.sensor.mycity_temperature.state }}
Humidity, {{ states.sensor.mycity_humidity.state }}
Humidex, {{ states.sensor.mycity_humidex.state }}
Wind Gust, {{ states.sensor.mycity_wind_gust.state }}
Wind Speed, {{ states.sensor.mycity_wind_speed.state }}
UV Index, {{ states.sensor.mycity_uv_index.state }}

I also gave it functions simply because of how the weather entity and the weather.get_forecasts service work:

- spec:
    name: get_hourly_forecast
    description: Get an hourly weather forecast
  function:
    type: script
    sequence:
      - service: weather.get_forecasts
        metadata: {}
        data:
          type: hourly
        target:
          entity_id: weather.mycity
        response_variable: _function_result

- spec:
    name: get_daily_forecast
    description: Get a daily weather forecast
  function:
    type: script
    sequence:
      - service: weather.get_forecasts
        metadata: {}
        data:
          type: daily
        target:
          entity_id: weather.mycity
        response_variable: _function_result
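For reference, weather.get_forecasts returns its data keyed by entity; a rough sketch of the response shape the agent receives (field names vary by weather integration, so treat this as illustrative only):

```yaml
# Illustrative only; actual fields depend on the weather provider.
weather.mycity:
  forecast:
    - datetime: "2024-08-15T15:00:00+00:00"
      condition: partlycloudy
      temperature: 24.1
      precipitation_probability: 30
```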

I think the rest is self-explanatory from here. My Smart Home Agent is just the default config for the Extended OpenAI Conversation integration. The To Do List Agent is just the functions from the shopping_list example.


This is an awesome approach, thanks for taking the time to share!

So if I am understanding correctly, you are using four different instances of extended openai?

Yes, that's right. I went into "Devices & Services" and clicked "Add Integration" for each "Agent". Each time I get a new "Service" under the "Extended OpenAI Conversation" integration. The API key is the same between each but they are all configured differently.

How can you expose different entities to different agents? My main problem is the prompt size due to all the entities exposed. I want to expose only what is needed for each agent to do its job.

Entities only need to be exposed if you want to control them. You can manually reference whatever entities you want in the system prompt. None of the entities in my Meteorologist Agent system prompt are exposed. Iterating over a list of exposed entities like the default prompt does is just one option. I'm experimenting with adding entity ids to an input_text helper and iterating over that instead. That way I can group entities together based on whatever criteria I want.

Either way, I felt too constrained by the function calling ability of Extended OpenAI Conversation so I've started moving my Agents over to Python and Langchain. Instead of the conversation.process action, I created a rest_command that can call an endpoint my Agents respond to. GitHub - peveleigh/eveagents: EveAgents are a suite of LLM agents designed to act as personal assistants.
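The rest_command itself isn't shown here, but a minimal sketch might look like the following, assuming a hypothetical local endpoint (the URL, command name, and payload are placeholders, not taken from the EveAgents repo):

```yaml
# configuration.yaml - hypothetical rest_command; URL and payload are assumptions.
rest_command:
  call_agent_endpoint:
    url: "http://localhost:8000/agent"
    method: post
    content_type: "application/json"
    payload: '{"query": "{{ query }}"}'
```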

Sorry, but I'm a little inexperienced…
If I want to turn on a light with an agent, do I need to have that light exposed and referenced in the prompt? Or just exposed? Or just referenced?

To turn on a light, it needs to be both exposed and referenced in a prompt so the agent knows what entity_id to use. The default prompt loops over every exposed entity and lists them all.

If you just want to provide the agent with some info, like a temperature reading or whether a door is open or closed, you don't need to expose it. But you still need to reference it in a prompt and provide the relevant info.

If you reference a light in a prompt but don't expose it, the agent will still think it can use it. However, if it tries, HA throws an error, presumably as a safety precaution.

Understood.
I've referenced all the lights in one room. If I ask to turn on/turn off/change the color of one or more of them, it works. But if I ask to turn on all the lights in the room, it returns an error:

2024-08-15 13:23:37.279 ERROR (MainThread) [custom_components.extended_openai_conversation] Error code: 500 - {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID req_ea4c34bc5c8616469c6b5264f6a021cf in your email.)', 'type': 'server_error', 'param': None, 'code': None}}

Any tips? What am I doing wrong?

Doing this manually is a pain ahahah.
How can I iterate over a list of exposed entities for just one domain? I want to create an agent to control lights, another for media, and so on, while keeping the area information of those devices.

Not sure what the issue is with calling them all at once. It's either trying to make multiple function calls and failing, or trying to control a fictitious light group.

You can iterate over domains like so:

{% for light in states.light %}
{{ light.entity_id }},{{ light.name }},{{ light.state }}
{% endfor %}
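Since the question also asked about keeping area information, Home Assistant's area_name() template helper can be folded into the same loop; a sketch:

```jinja
{% for light in states.light %}
{{ light.entity_id }},{{ light.name }},{{ area_name(light.entity_id) }},{{ light.state }}
{% endfor %}
```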

You can also do something like this:

{% set exposed_doors = states.input_text.exposed_doors.state.split(',') -%}
{% for entity in exposed_doors -%}
{% set entity = states['binary_sensor'][entity] -%}
{{ entity.entity_id }},{{ entity.name }},{{ entity.state }}
{% endfor -%}

Where input_text.exposed_doors is a csv string of your door entity_ids.
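If you define the helper in YAML rather than the UI, a minimal sketch might look like this (the name and initial value are examples; note the entries are object_ids without the domain prefix, to match the states['binary_sensor'][entity] lookup):

```yaml
# configuration.yaml - example helper holding a CSV of binary_sensor object_ids.
input_text:
  exposed_doors:
    name: Exposed Doors
    max: 255
    initial: "front_door,back_door,patio_door"
```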

Got it working with multiple agents, but it's slow compared to a single agent. I think it's because of the multiple calls?

The prompt, on the other hand, is smaller.

Can I not make a spec with the entities I need? So when I call the spec, the agent can see the entities it needs.

I don't know what I'm doing wrong. All agents work as expected on their own, but if I ask the agent manager it fails.

OK, I found a solution: create an assistant in the Voice Assistant settings for each Extended OpenAI Conversation agent. Then, in Developer Tools, fire the Conversation: Process action, choose the assistant, switch to YAML, and copy the agent_id. Then paste the agent_id into the prompt of the agent manager. Hope that helps.
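In other words, once the assistant is selected in UI mode, switching the action editor to YAML mode reveals the underlying ID; roughly like this (the agent_id value is a placeholder):

```yaml
# Developer Tools → Actions, YAML mode, after picking the assistant in the UI.
service: conversation.process
data:
  text: "test"
  agent_id: 1234abcd5678  # copy this value into the dispatcher prompt's agent table
```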

This is great! I have a couple of questions:
How do you invoke the Langchain agents? From the Assist interface in Home Assistant? Are you using an intent to trigger sending your query to the Langchain agent?
How have you set up the Voice Assistant? Is it, e.g., an OpenAI bot where you use an intent script to invoke the rest command?