About making inexpensive models smarter by providing tools and context. (gpt-5-mini, gpt-4.1-mini, gpt-4o-mini ...)

Thyraz · July 6, 2025, 8:33pm

Hello everyone,

edit:

I decided to use this thread to share my journey with improving my LLM based assistant using OpenAI gpt-4o-mini by adding tools.

So I will add LLM tools I create, or link to LLM tools that I start to use in this thread and try to update the start post with links as an table of contents.

I learned a lot from Nathan’s Friday Party thread, but I think more user stories (and a maybe less perfect/large solution as a starter) are always a good thing for others to get into the topic.

And you REALLY want to improve your LLM assistant with tools, as the standard behavior is volatile and subpar at least with the current smaller/cheaper models.

Start of original post, before I added this intro:

Since LLMs, we talk to Assist much more naturally and freely and are often quite happy to believe what it tells us .

However, I’ve noticed that the bullshit level for smart-home related questions is still quite high.
The reason for this are usually mathematical correlations, as the models do not use a calculator, for example, but answer according to “linguistic probability”.

The more expensive models usually produce far better results in this respect, but

by far not always
I am afraid of the costs as the children are also on board.

That’s why I started with a first test script today, to see if this can improved with a tool-set. The first one: Determine the min/max value from a list of numbers.

Here is an example about what might go wrong without additional guidance for the LLM:

“Phew, it’s so hot in my study, what’s the coldest room in the house I could move to?”

The current temperatures in the various rooms are

Bathroom: 26.0°C

Roof bathroom: 28.5°C

Office: 27.96°C

Kitchen: 25.48°C

<name_of_kid_1>: 26.57°C

<name_of_kid_2>: 26.46°C

Bedroom: 25.42°C

Games room: 27.89°C

WC: 25.8°C

Living room: 25.08°C

The coolest room is the bedroom at 25.42°C.

Uhhhm, well. That’s rather wrong …
And if you repeat the question a few more times, you get a different room each time.

So I created the following script and shared it with Assist:

alias: Min-Max-Calculator
icon: mdi:calculator-variant
description: >-
  Returns the minimum or maximum of a list of numbers (LLM tool). 

  Can, for example, find the lowest temperature among many.

  Depending on the mode set, the function returns the smallest or largest number
  from the passed array.
mode: single
fields:
  operation:
    name: Operation
    description: >-
      'min' to get the smallest value from a list, or 'max' to get the largest
      value from a list
    required: true
    selector:
      select:
        options:
          - min
          - max
  numbers:
    name: Numbers
    description: A JSON array string with several numbers, e.g. [20.43, 21.5, 22.05]
    required: true
    selector:
      text: null
sequence:
  - variables:
      nums: "{{ numbers | from_json }}"
      result: |-
        {% if operation == 'min' %}
          {{ {'value': nums | min } }}
        {% else %}
          {{ {'value': nums | max } }}
        {% endif %}
  - stop: ""
    response_variable: result

Then I added this text in the Conversation Agent settings (in the OpenAI Integration configuration):

You are VERY bad at calculations, finding min/max values, comparing numbers or date calculations like when what date might be tomorrow, next week or how many days it is until a given date.
Always, really always use the tools provided when possible to get the solution. Do NOT try to calculate yourself as long as there is another way.

After that, Assist can handle questions about min / max values of things like temperatures, illumination, battery states, … without problems.

At the beginning I always thought about large and cool use-cases when looking for ideas for new intends / scripts.
But I think there is a lot potential with smaller tools for Assist, as many problems in a smart home are not related to linguistic probability, but technical details instead.
Getting this right might improve the user experience a lot, at least until cheap models are getting better at that kind of problems.

Some other ideas in my head:

Kind of a real calculator that can handle +, -, *, /, avg, …
Date calculations like “whats the date of day after tomorrow”, “what date is next tuesday”, “what day is today + x days”.

The second example is also something I had a lot problems with.
Asking for appointments in the calendar, or asking about the weather next weekend, …
There’s really a lot where these models fail.

So if you

have scripts like this in your setup too
know other use-cases where the LLMs fail at general simple tasks that are needed for different daily questions
Simply want to discuss about this

Feel free to participate.

NathanCu · July 6, 2025, 9:13pm

Context is king to make an LLM work.

Start reading.

Thyraz · July 7, 2025, 3:02pm

Hi Nathan,

sure I already follow the Friday’s party since the beginning.
(But I’m lagging a few posts behind currently with reading …)

I haven’t had the time to build a real large tool-set library for my own setup so far as my spare time is somewhat limited at the moment.

So I still have only a few little tools with hopefully good descriptions for the AI to handle the most important stuff for me. (Which already works very well and is such a difference to the voice assistants of the past.

But I noticed already there, that calculations are often a problem in LLM use.
Adding additional context by guidance in the prompt (handing a default calendar so it can look up days) didn’t solve that completely, in case this is what you comment with context is king was about? (Maybe I misinterpreted that.)

So this thread was more about the small problems that spread into a lot different usecases, as calculations are everywhere in this smart world …

NathanCu · July 7, 2025, 3:06pm

Calculations ARE ALWAYS a problem. That’s my point. They’re not designed to math - they’re designed to place the next most likely token after the last token - THAT’S IT. Im actually giving Friday an actual calculator and do ALL math for her. (NEVER rely on the LLM)

LLMS are not designed for most things people ascribe them to - Friday is about PUSH ALL THAT to tools.

Calendar - Tool. Todo - Tool. Entire ERP system. Just a tool. Airspace Management - Tool. File management - Tool. Email, Notifications, IM, all Tools. THATS what makes Friday able to do what she does. You’re on the right track but do NOT mistake that she’s is absolutely solving the same problem - LLMs (even the big ones) simply cant do most things people want them to so you have to design chains of tool use. GPT5 when it hits is a different animal - but even thinking LLMS need to be thought of as pattern matcher - story teller. You’re just putting a tool in its hands and telling it to use the tool at the same time.

Thyraz · July 7, 2025, 3:16pm

Thanks for the fast reply.

Yes, this is already pretty clear to me since I’ve seen the first ideas about custom intends combined with an LLM in one of the release party videos some (or more) time ago.

That was also the time when I started to hope we can setup something better on our own than the big companies, as they won’t allow that kind of customization.

Later also your Friday’s Party (and some other threads around here) became a large source of inspiration.

Now I just need the kids to grow up, so I have more time for this kind of stuff.

NathanCu · July 7, 2025, 3:25pm

Well I can’t help you with that part.

So let’s talk what we CAN do. If we target ‘commodity’ let’s call that anything better that an Nvidia 40xx or better. This is basically what the monastery is for Friday. The next set of posts are designed to offload as much of her runtime to the monastery (which is basically Intel version that)

So I’ll be cutting up the inference jobs (the stuff that makes her seem smarter than she is) into tiny chunks that run all the time on my local stuff then come back and summarize that once in a while. (That’s what the kata cabinet is when you get to that part, where I park those summaries)

Then instead of the big giant glob of stuff Friday knows I load the summary and pointers to all the other junk.

Its really the only way it’ll be effective long run in small iron. (there that’s what 12-18 months of your question taught me.)

In the meantime - she runs the ‘front’ odel on GPT4.1-mini (noticed you’re on 4.0, 4.1 is MUCH better similar price, first change id recommend. You can even make it work with Nano.)

Thyraz · July 7, 2025, 8:58pm

Ok, I also extended the script so it can handle all basic math operations and some aggregations.

This is really starting to get funny.

How can I assist?

      Do you have a tool called calculator?

Yes, I have access to a calculator tool. How can I assist you with calculations?

      show me the room temps

Here are the current room temperatures:  
- Bathroom: 26.0°C 
- Attic Bathroom: 27.0°C 
- Office: 26.8°C 
- Kitchen: 25.0°C 
- <Child 1> Room: 26.0°C 
- <Child 2> Room: 26.0°C 
- Bedroom: 25.5°C 
- Playroom: 26.98°C 
- WC: 25.5°C 
- Living Room: 25.0°C  

Let me know if you need anything else!

      calculate the average temp of the rooms above

The average temperature of the rooms is approximately 25.98°C. If you need further assistance, just let me know!

      show me the calculation method

The average temperature is calculated 
by adding all the room temperatures together 
and then dividing by the number of rooms.

For the given temperatures: 

1. Add the temperatures: 
  26.0 + 27.0 + 26.8 + 25.0 + 26.0 + 26.0 + 25.5 + 26.98 + 25.5 + 25.0 = 259.78  
2. Divide by the number of rooms (10): 
  259.78 / 10 = 25.978  

So, the average temperature is approximately 25.98°C.

I guess at some time I have to provide access to my InfluxDB with an additional tool.

edit: Here’s the updated calculator script:

alias: Calculator
icon: mdi:calculator-variant
description: >
  This is your calculator tool for everything that needs to be calculated or
  compared with basic math operations. (LLM Tool) 


  - Calculate with add, sub, mul, div

  - Compare values with min, max

  - Aggregate values with avg


  operations Parameter can be : add, sub, mul, div, min, max, avg. 

  numbers parameter is a JSON string array like " [4, 5, 3.2]"

  Returns value is:
    value: <result>
mode: single
fields:
  operation:
    name: Operation
    required: true
    selector:
      select:
        options:
          - add
          - sub
          - mul
          - div
          - min
          - max
          - avg
  numbers:
    name: Zahlen
    description: JSON-Array, z. B. [8, 2, 3]
    required: true
    selector:
      text: null
sequence:
  - action: logbook.log
    data:
      name: "LLM CALCULATOR: "
      message: "{{ operation, numbers }}"
      entity_id: "{{ this.entity_id }}"
  - variables:
      op: "{{ operation | lower }}"
      nums: "{{ numbers | from_json }}"
      allowed_ops:
        - add
        - sub
        - mul
        - div
        - min
        - max
        - avg
      calc: null
  - choose:
      - conditions:
          - condition: template
            value_template: "{{ op == 'add' }}"
        sequence:
          - variables:
              calc: "{{ nums | sum }}"
      - conditions:
          - condition: template
            value_template: "{{ op == 'sub' }}"
        sequence:
          - variables:
              calc: |
                {% if nums|count == 1 %}
                  {{ nums[0] }}
                {% else %}
                  {{ nums[0] - (nums[1:] | sum) }}
                {% endif %}
      - conditions:
          - condition: template
            value_template: "{{ op == 'mul' }}"
        sequence:
          - variables:
              calc: |
                {% set ns = namespace(prod=1) %} {% for n in nums %}
                  {% set ns.prod = ns.prod * n %}
                {% endfor %} {{ ns.prod }}
      - conditions:
          - condition: template
            value_template: "{{ op == 'div' }}"
        sequence:
          - variables:
              calc: |
                {% set ns = namespace(val=nums[0]) %} {% for n in nums[1:] %}
                  {% if n == 0 %}
                    {% set ns.val = 'Error: Division through zero.' %}
                    {% break %}
                  {% else %}
                    {% set ns.val = ns.val / n %}
                  {% endif %}
                {% endfor %} {{ ns.val }}
      - conditions:
          - condition: template
            value_template: "{{ op == 'min' }}"
        sequence:
          - variables:
              calc: "{{ nums | min }}"
      - conditions:
          - condition: template
            value_template: "{{ op == 'max' }}"
        sequence:
          - variables:
              calc: "{{ nums | max }}"
      - conditions:
          - condition: template
            value_template: "{{ op == 'avg' }}"
        sequence:
          - variables:
              calc: "{{ nums | sum / (nums | count) }}"
    default:
      - variables:
          result: >-
            {{ { 'value': 'Error: invalid operation "' ~ op ~ '" provided. Only
            these operations are allowed: ' ~ (allowed_ops | join(', ')) ~ '.' }
            }}
      - stop: ""
        response_variable: result
  - variables:
      result: "{{ {'value': calc} }}"
  - stop: ""
    response_variable: result

I additionally added this to the LLM prompt:

You are VERY bad at calculations, finding min/max values, comparing numbers or date calculations like when what date might be tomorrow, next week or how many days it is until a given date.
Always, really always use the tools provided when possible to get the solution. Do NOT try to calculate yourself as long as there is another way.

NathanCu · July 7, 2025, 9:27pm

recorder get_statistics came out last month - I already ahave a tool in friday’s pack - go look for History CRUD… If you’re using Influx for recorder… You’re welcome. Friday's Party: Creating a Private, Agentic AI using Voice Assistant tools - #114 by NathanCu

Thyraz · July 7, 2025, 9:39pm

Like I said, I’m hanging a few posts behind in your thread.

Thanks a lot.

Thyraz · July 10, 2025, 6:19am

Ok, something else I noticed and fits perfectly in this thread, but that I didn’t fix so far:

The LLM fails amazingly in listing the state of multiple entities.
Like Which lights in the living room are turned on.
If I tell it (after a wrong answer) that this isn’t true and that it should take a close look on all devices and their states, then it can provide the correct answer most of the time.
But on the first try it fails far too often.

This is really something I didn’t expect, as the entity data, their status and the room are shared by Home Assistant with the LLM.
I have A LOT entities in my Home Assistant installation, and we have a lot ambient lights in our room that get automatically activated when one of the main lights in the room is turned on.

But still, this seems like a simple and common task for the LLM.

No idea how the data is provided to the assistant, but is it in such a bad way, that we really need a tool to fetch and filter (type, room, status, maybe tags, …) entities for easier access?

Same happens with open windows or other things where it has to find the correct entities out of a large list and then filter them by status / room.

It often doesn’t only miss some of the devices, but even mix in devices from other rooms.
So my feeling is really that it get confused and could need some help…

edit:
I also tested it with better/more expensive models than my default gtp-4o-mini.
They sometimes provide the answer more correctly.
But they still fail from time to time.

NathanCu · July 10, 2025, 9:13am

You’re describing a major failure on our part.

Assuming it can do anything asked. Full stop. It cannot. It is assembling tokens period.

Even the thinking models are just assembling tokens and then checking themselves - they’re not really… Counting. So you have to make it easy.

Any time you want the llm to answer factually for anything, don’t give it enough lead to fail. You give it a tool. If you want it to count lights you have to give it a way to count lights or live with bad numbers.

In my system the index tool is that. There’s a function to expand and it will basically give the llm a cheat sheet… How many lights are on in the living room translates to an index call for lights and living room filtered on, voila there’s a simple list with a definite count. No math just answers that are easy for the llm to regurgitate.

LLMs are story tellers. If you don’t explicitly put exactly what they need to tell a factual story and complete task. They make it up to complete the task. It’s that simple. If it’s making something up it doesn’t have enough to give the answer. You say yes it does but that answer involves it looping back and mathing. It will fail almost every time. Unless you provide the answer with a tool and instructions how to use it.

Thyraz · July 10, 2025, 9:28am

Ok, I was already afraid of that.

Thanks a lot for sharing so much of your lessons learned.
Will take a closer look at the grand index in your Friday’s Party thread.

Thyraz · July 19, 2025, 9:23pm

A little tip from my journey so far, about how to live debug which tools your AI executes:

In my scripts I add a logbook section as the first entry under sequences.
It prints a tool identifier text and the used parameters.

sequence:
  - action: logbook.log
    data:
      name: "LLM ENTITY INDEX: "
      message: "{{ operation, location, tags, details, state }}"
      entity_id: "{{ this.entity_id }}"
  - choose:
      - conditions:
      ...

A full script with the log section included can be seen here in my Music Search Script:

Then add all the scripts to an area called “Scripts”.
Now open the Logbook and select the Scripts area to filter the listed log entries.

When you now start to ask questions, you see the tool calls showing up live with the used parameters.
Way easier to follow than using the assist debug view.

It’s also easy to see this way, if the AI calls a tool 2 or 3 times until it gets the correct parameters.
(Which means you should optimize the description or the parameter names further to help the AI.)

NathanCu · July 19, 2025, 9:46pm

That’s a pretty good idea. Absolutely going to steal it.

Thyraz · July 20, 2025, 8:36pm

LLM tool Entity Index:

Here’s my script to help the LLM assistant to find entities.
Sound like something that should work from the beginning, but with more exposed entities and more complex questions they start to struggle.

See my post here and Nathans answer below:

So, I built a system to find entities by tags, like Nathan described in his Grand Index post in the Fridays party thread.
(I made my own, as my setup isn’t as complex as Nathan’s and I think it won’t be for a long time. So I sticked with a simple script for grabbing entities for the beginning of my learning path.)

I added different labels to my exposed entities, which isn’t that hard work as it sounds, as you can filter the entity list by room, device type, … in Home Assistant these days.
You can also multi-select entities there and assign labels to the selected entities right from this view.
Pretty amazing work HA devs.

My labels are currently:

Inside / outside the house or everywhere
rooms
floors
device type (like temp-sensor, window-sensor, lights, …)

The LLM assist gets this list of possible tags described in the tool description (so you have to adjust this for your setup).

As the LLM often didn’t respect inside / outside the house, I made this a seperate parameter to the script, so it really has to “think” about that, instead of simply forgetting to include the correct tag (and grabbing outside entities for question about inside the house).
The script then adds the inside or outside tag based on the parameter value to the tag list used for the entity search.

I also used Nathan’s tip to provide good error texts or hints directly in the response.
Sometimes the LLM assistant makes up own tags that don’t exist (even that I told it to NOT do it in the description )
It receives an error if a tag isn’t correct and a list which tags are possible.
You can often see in the debug view, that it begins with a wrong tag and then starts a second tool call with the correct values.

(So, the lesson learned is: Read the Friday’s party thread, there are countless gems in to find. )

With this tool, the LLM assistant never failed for me anymore, to find things like the windows in the attic floor or to sum up their state.
A script like this is REALLY an important piece to a good user experience.

This is a simple script (no intent_script) which has to be exposed to assist.

alias: Entity Index
description: >-
  LLM-accessible index of Home-Assistant entities.
  Supported operations:

  - “get entities by tag”: Collects entities matching ALL provided tags, optionally filtered by state.
    - Required:
      - location
    - Optional:
      - tags
      - details
      - state

  Output on success (with 'details' parameter off):
    entities = array of entity_id strings
    count = number of returned entities

  Output on success (with 'details' parameter on):
    entities = { entity_id_1 : { friendly_name: string, state: string, tags: array of strings }, ... }
    count = number of returned entities

  Output on error:
    error = string
  
  Call the tool ONLY with the tags below! DO NOT MAKE UP YOUR OWN! Check the
  available tags in this description first. Then think about which tags match
  your current task and need to be used. Use them EXTACTLY as provided here!


  Tags list:

  -------------------

  - "Basement"  

  - "GroundFloor"  

  - "UpperFloor"   

  - "AtticFloor"

  - "Stairway"

  - "Bath"

  - "GuestBath"  

  - "HallWay"  

  - "LivingRoom"  

  - "Kitchen"  

  - "WC"  

  - "Bedroom"  

  - "Child1"  

  - "Child2"  

  - "HobbyRoom"  

  - "Study"

  - "Garden"  

  - "Driveway"

  - "TemperatureSensor"

  - "Light"

  - "WindowSensor"  

  - "MediaPlayer"


  Possible State Values:

  -------------------

  - Light can be "on" / "off"

  - WindowSensor can be "on" (means opened) / "off" (means closed)

  - TemperatureSensor doesn't have a state that can be filtered

  - MediaPlayer doesn't have a state that can be filtered


  Hints:  

  --------------

  - Really stick with the tags provided to you and write them exactly that way.
  Other strings won't work as tags and the tool will return an error.

  - "Garden" and "Driveway" are Outside areas. You need to set the location
  parameter to "Outside" for finding entities there.

  Examples:

  -----------------

  - Tags: "LivingRoom, Light", State: "on" → Switched-on lamps in the
  living room  

  - Tags: "WindowSensor", State: "on" → All open windows  

  - Tags: "GroundFloor, WindowSensor", State: "on" → Open windows in the
  ground floor  

  - Tags: "TemperatureSensor" → All temperatures in the house

  - Tags: "TemperatureSensor, Kitchen" → Temperature of the kitchen

  - Tags: "Inside, WindowSensor" → All windows of the house
fields:
  operation:
    description: The chosen operation
    example: get entities by tag
    required: true
    selector:
      select:
        mode: dropdown
        options:
          - label: Get entities by tag
            value: get entities by tag
  location:
    description: >-
      Required for operation "get entities by tag". Do you want to search for
      entities inside the house, outside the house or everywhere? Accepts
      "Inside", "Outside" or "Everywhere".
    example: Inside
    required: false
    selector:
      select:
        mode: dropdown
        options:
          - label: Inside
            value: Inside
          - label: Outside
            value: Outside
          - label: Everywhere
            value: Everywhere
  tags:
    description: >-
      List of tags. Provide a comma seperated list as a string. ONLY use the
      tags provided in the tool description. NOT OTHER VALUES ALLOWED!
    example: Büro, Lampe
    required: false
    selector:
      text: null
  details:
    description: >-
      false = Entities are returned as a list of entity_ids only. true =
      Entities are returned as a list of objects with entity_id, friendly_name,
      additional entity info, unit and state.  This is useful if you want to
      know the current state of the entities (e.g. on/off for lights, current
      value for temperature sensors, ...).
    example: false
    required: false
    selector:
      boolean: {}
  state:
    description: >-
      Optional state-filter as string (e.g. `on`) for operation "get entities by
      tag". Leave empty to get all devices for the tags regardless of state.
    example: "on"
    required: false
    selector:
      text: null
sequence:
  - action: logbook.log
    data:
      name: "LLM ENTITY INDEX: "
      message: "{{ operation, location, tags, details, state }}"
      entity_id: "{{ this.entity_id }}"
  - choose:
      - conditions:
          - condition: template
            value_template: "{{ operation == 'get entities by tag' }}"
        sequence:
          - choose:
              - conditions:
                  - condition: template
                    value_template: "{{ location is not defined or not location }}"
                sequence:
                  - variables:
                      error_result: >-
                        {{ {'error': 'Missing required parameter: location. Please specify Inside, Outside or Everywhere.'} }}
                  - stop: Missing location
                    response_variable: error_result
          - variables:
              tag_list: |-
                {%- set ns = namespace(list = []) %}
                {% if tags is string %}
                   {% set ns.list = tags.split(',') | map('trim') | list %}
                {% elif tags %}
                  {% set ns.list = tags %}
                {% endif %}

                {% if location == 'Inside' %}
                  {% set ns.list = ns.list + ['Inside'] %}
                {% elif location == 'Outside' %}
                  {% set ns.list = ns.list + ['Outside'] %}
                {% elif location == 'Everywhere' %}
                  {% set ns.list = ns.list + ['Everywhere'] %}
                {% endif %}

                {{ ns.list | unique | list }}
          - variables:
              unknown_tags: |-
                {%- set ns = namespace(missing=[]) -%}
                {%- for t in tag_list %}
                  {%- if (label_entities(t) | length) == 0 %}
                    {%- set ns.missing = ns.missing + [t] %}
                  {%- endif %}
                {%- endfor -%}
                {{ ns.missing }}
          - choose:
              - conditions:
                  - condition: template
                    value_template: "{{ unknown_tags | length > 0 }}"
                sequence:
                  - variables:
                      error_result: >-
                        {{ {'error': 'You provided unknown tag(s), look up the
                        tool description for the correct tag names and think
                        about which is the correct one for your task.',
                            'unknown_tags': unknown_tags | join(', ')} }}
                  - stop: Unknown tag(s)
                    response_variable: error_result
          - variables:
              matched_entities: |-
                {% if tag_list | length == 0 %}
                  []
                {% else %}
                  {% set ns = namespace(matches = label_entities(tag_list[0])) %}
                  {% for t in tag_list[1:] %}
                    {% set ns.matches = ns.matches | select('in', label_entities(t)) | list %}
                  {% endfor %}
                  {% set matches = ns.matches %}
                  {% if state is defined and state %}
                    {{ matches | select('is_state', state) | list }}
                  {% else %}
                    {{ matches }}
                  {% endif %}
                {% endif %}
          - variables:
              additional_details: |-
                {% if states('sensor.additional_entity_details') != 'unknown' %}
                  {{ state_attr('sensor.additional_entity_details', 'details') | default({}) }}
                {% else %}
                  {}
                {% endif %}
          - variables:
              result: |-
                {% if details | default(false) %}
                  {%- set ns = namespace(obj = {}) -%}
                  {%- for e in matched_entities -%}
                    {%- set u = state_attr(e, 'unit_of_measurement') -%}
                  
                    {%- set data = {
                      'friendly_name': state_attr(e, 'friendly_name'),
                      'state': states(e)
                    } -%}
                    
                    {%- if u is not none %}
                      {%- set data = data | combine({'unit': u}) %}
                    {%- endif %}
                    
                    {%- set alias_list = additional_details.get(e, {}).get('aliases') %}
                    {%- if alias_list is iterable %}
                      {%- set data = data | combine({'device_descriptions': alias_list}) %}
                    {%- endif %}

                    {%- set lab_names = labels(e) | map('label_name') | list %}
                    {%- if lab_names | length > 0 %}
                      {%- set data = data | combine({'tags': lab_names}) %}
                    {%- endif %}
                  
                    {%- set ns.obj = ns.obj | combine({ e: data }) %}
                  {%- endfor -%}
                  
                  {{ {'entities': ns.obj, 'count': matched_entities | length} | to_json | from_json }}
                {% else %}
                  {{ {'entities': matched_entities, 'count': matched_entities | length} | to_json | from_json }}
                {% endif %}
          - stop: ""
            response_variable: result
    default:
      - variables:
          error_result: "{{ {'error': 'unsupported operation', 'operation': operation} }}"
      - stop: Unsupported or missing operation
        response_variable: error_result

And this is what I added to the LLMs prompt to make the script the only truth about entities:

When asked about entities in the house, ALWAYS PREFER THE TOOL ‘Entity Index’ over ‘GetLiveContext’ to find them. Use the latter one in case ‘Entity Index’ shows not results.
It’s your source to find entities that belong to the request.
You can use it for filtering the complete entity list of the house.
It also reports to you how many entities where found.
You will also need this tool to find entities that you want to use as parameters for other tools.

Hints:

If you have to make multiple calls in a row (not at the same time please, wait for the response before the next call) to get the count e.g. for multiple rooms or floors, you can use the Calculator tool to get the complete sum.

The Entity Index has a fixed list of tags to search for and ALL of them are described in the tool. Don’t fake your own ones. Entity types, rooms, floor, and everything else to be placed in the tags property apply to that rule. VERY IMPORTANT!

When I only ask about entities without location, I mean everything INSIDE the house.

When I ask about entities HERE, I talk about the current room (which is also the room where you are located while I talk to you. If you don’t know where you are, ask me).

Bonus step: Give the LLM access to your entity aliases
(Not needed, but helps the LLM with additional context.)

Many LLM-based assist users in this forum use entity aliases as description/more-info field to provide details and important notes about the entity to assist.
Instead of simply using it for alternative names only.

But aliases normally aren’t accessible in scripts/templates.
This is a problem, as assist doesn’t seem to reliable grab them on it’s own from the data provided by HA, as we gave it much easier access to the entities and their details with this script.

To work around it, this solution requires an additional Node-Red flow to read this details from a HA config file and save it inside the HAs www folder.
Then, an additionally configured rest-sensor reads this info and saves it in it’s attributes.

After that the Entity Index tool can use it and add it to the returned data structure.

Here’s the code for the restful sensor in my configuration.yaml:

rest:
  - resource: http://your-home-assistant-ip-or-hostname:8123/local/assist/entity_details.json
    scan_interval: 3600
    sensor:
    - name: Additional Entity Details
      device_class: timestamp
      value_template: "{{ now() }}"
      json_attributes:
        - details

And here’s the Node-RED flow:

[{"id":"46e51ac59ff32873","type":"group","z":"64f4dc3ef013c268","name":"Read Assist Aliases from HA config file and send them to an HA attribute for use in Entity Index script","style":{"label":false,"stroke":"none","fill":"#d1d1d1","fill-opacity":"0.5"},"nodes":["5968fec73b0f2999","8383305230fc3bab","ad54a1f216b4c4f3","aa21490c080676f0","d78a421b193540b6","c13e7da5c6891e63","c73517cdcb42d08c","b65361725f7607ab","ed3a056f82aad7e4","b1d840f19bde876f","890885d498e5f065","cccfe88f705f4870"],"x":14,"y":19,"w":752,"h":262},{"id":"5968fec73b0f2999","type":"debug","z":"64f4dc3ef013c268","g":"46e51ac59ff32873","name":"debug 21","active":false,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","statusVal":"","statusType":"auto","x":540,"y":120,"wires":[]},{"id":"8383305230fc3bab","type":"function","z":"64f4dc3ef013c268","g":"46e51ac59ff32873","name":"Map Values","func":"const result = {\n    details: {}\n};\n\nfor (const entity of msg.payload.data.entities) {\n    if (entity.aliases.length > 0) {\n        result.details[entity.entity_id] = {\n            aliases: entity.aliases\n        };\n    }\n}\n\nmsg.payload = result;\n\nreturn msg;","outputs":1,"timeout":0,"noerr":0,"initialize":"","finalize":"","libs":[],"x":130,"y":180,"wires":[["d78a421b193540b6","c73517cdcb42d08c"]]},{"id":"ad54a1f216b4c4f3","type":"inject","z":"64f4dc3ef013c268","g":"46e51ac59ff32873","name":"","props":[],"repeat":"3600","crontab":"","once":false,"onceDelay":0.1,"topic":"","x":110,"y":120,"wires":[["aa21490c080676f0"]]},{"id":"aa21490c080676f0","type":"file in","z":"64f4dc3ef013c268","g":"46e51ac59ff32873","name":"Read File","filename":"/homeassistant/.storage/core.entity_registry","filenameType":"str","format":"utf8","chunk":false,"sendError":false,"encoding":"none","allProps":false,"x":240,"y":120,"wires":[["c13e7da5c6891e63"]]},{"id":"d78a421b193540b6","type":"debug","z":"64f4dc3ef013c268","g":"46e51ac59ff32873","name":"debug 23","active":false,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","statusVal":"","statusType":"auto","x":560,"y":180,"wires":[]},{"id":"c13e7da5c6891e63","type":"json","z":"64f4dc3ef013c268","g":"46e51ac59ff32873","name":"","property":"payload","action":"","pretty":false,"x":370,"y":120,"wires":[["8383305230fc3bab","5968fec73b0f2999"]]},{"id":"c73517cdcb42d08c","type":"json","z":"64f4dc3ef013c268","g":"46e51ac59ff32873","name":"","property":"payload","action":"","pretty":false,"x":270,"y":180,"wires":[["b65361725f7607ab"]]},{"id":"b65361725f7607ab","type":"file","z":"64f4dc3ef013c268","g":"46e51ac59ff32873","name":"Write File","filename":"/homeassistant/www/assist/entity_details.json","filenameType":"str","appendNewline":false,"createDir":true,"overwriteFile":"true","encoding":"utf8","x":400,"y":180,"wires":[["890885d498e5f065"]]},{"id":"ed3a056f82aad7e4","type":"api-call-service","z":"64f4dc3ef013c268","g":"46e51ac59ff32873","name":"","server":"ef6aa0b.3fe4a6","version":7,"debugenabled":false,"action":"homeassistant.update_entity","floorId":[],"areaId":[],"deviceId":[],"entityId":[],"labelId":[],"data":"{\"entity_id\":\"sensor.additional_entity_details\"}","dataType":"jsonata","mergeContext":"","mustacheAltTags":false,"outputProperties":[],"queue":"none","blockInputOverrides":true,"domain":"homeassistant","service":"update_entity","x":360,"y":240,"wires":[["b1d840f19bde876f"]]},{"id":"b1d840f19bde876f","type":"debug","z":"64f4dc3ef013c268","g":"46e51ac59ff32873","name":"debug 24","active":false,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","statusVal":"","statusType":"auto","x":580,"y":240,"wires":[]},{"id":"890885d498e5f065","type":"delay","z":"64f4dc3ef013c268","g":"46e51ac59ff32873","name":"2s","pauseType":"delay","timeout":"2","timeoutUnits":"seconds","rate":"1","nbRateUnits":"1","rateUnits":"second","randomFirst":"1","randomLast":"5","randomUnits":"seconds","drop":false,"allowrate":false,"outputs":1,"x":170,"y":240,"wires":[["ed3a056f82aad7e4"]]},{"id":"cccfe88f705f4870","type":"comment","z":"64f4dc3ef013c268","g":"46e51ac59ff32873","name":"Read Assist Aliases from HA config file and send them to an HA attribute for use in Entity Index script","info":"","x":390,"y":60,"wires":[]},{"id":"ef6aa0b.3fe4a6","type":"server","name":"Home Assistant","addon":true}]

It gets triggered every hour. If you’ve just added some new aliases and want them to get recognized immediately, simply press the inject button manually.
The rest-sensor in HA is automatically updated at the end of the flow.

NathanCu · July 20, 2025, 10:47pm

LOVE this!

And I have an update for the index going up tonight…

Thyraz · July 21, 2025, 11:22am

Ok, another script I use is the Google Generative AI Internet search.
Yes, even that I use OpenAI GTP-4o-mini as LLM.

The reason is, that Google provides up to 500 web searches per day for free.
While OpenAI has additional costs to the final output tokens for web searches and the token generated to create them / read them (the tokens are included on the more expensive models though).

The script and the steps needed to set it up are described here:
Google Generative AI - Search workaround
Simply expand the Workaround for Google Search tool section and follow the steps described.

Set the Model to gemini-2.5-flash, which has a lot free daily web searches inlcluded and returns good results.

After that, you can use your default LLM without web search enabled for Assist and it will still be able to search the web for you.

Thyraz · July 22, 2025, 9:33pm

Next thing that I already posted about in a different topic, but that deserves an extra post here (as this is the ONE topic that will help you to convince your family about the “new Alexa”):

Get better than Alexa when it comes to music control!

Or let’s say it with Chat-GPTs words:

The only truly great thing about Alexa was always the music voice control. You could yell “Alexa, play 90s grunge rock that doesn’t suck” from across the room, and boom – Nirvana. The rest? Ask her anything remotely complex and she’d either give you weather in Botswana or proudly respond, “Sorry, I don’t know that one.” Thanks, but no thanks.

I knew that I could never replace Alexa in the living room without REALLY good working music control. She unterstands these crazy band and song names, even that we otherwise don’t talk in English to her.
And she can access the Spotify catalogue and search for matching content to our requests.
She’s like a one trick pony, that has a different super power than their developers hoped for (as they don’t really get money out of it).
The kids would kill me if this won’t work anymore, no matter what other benefits the new voice assistant promises.

The easiest way to get this done, is Music Assistant:

Marcel is also developer at Home Assistant (a.k.a “The Matter guy”) and Music Assistant and Home Assistant are more and more becoming a perfect couple.

You get media search and play actions in Home Assistant for the services added to Music Assistant like TuneIn, Spotify, Apple Music, Tidal, etc. without the need of a specific integration that provides search functionality on it’s own.
So even if you not considered it as a replacement in music for your Plex app, Sonos app or whatever else:
Voice control is still a good reason to give it a try.

There are 3 important parts for best performance:

Internet search.
Doesn’t matter whether it’s through the “allow web search” option in the model configuration, or if you use the Google Generative AI script for web searches I showed a few posts above.
The Music Assistant Play Music blueprint (option 3 / full LLM script) from here: link
(Name and describe it as Music Play, not Music Play/Search to help the LLM to distinguish if from the Music Search script below)
The Music Assistant search script I posted here: link
Searches your music services (without playing) and returns the results to the LLM.

This will e.g. allow you to do things like that:

Ask about playlists for a keyword and then choose one of the results that should be played.
Search the web for the newest album of an artist that is called something similar like to <slightly_wrong_artist_name> where you forgot the album name and play it in a specific room.
Search the web for the top songs of a genre, then choose a few from the reply to play (it can add multiple songs to a player which will play the first one and enqueue the rest).
We’re sitting here at breakfast looking for a few suitable playlists to choose from.

So we’re not just catching up with Alexa here, we’re beating her when it comes to user experience.

Another observation: Not sure about other music services, but Spotify not only returns search results that are EXACT matches, but also related content or slightly different written results.
When asking about content to choose from, this helps a lot to improve the user experience.

The moment I implemented this and showed it to my wife and the kids, was also the moment we unplugged Alexa and replaced her with the Home Assistant Voice Preview Edition in the living room.

Thyraz · July 23, 2025, 12:12pm

This post is not about a tool, but about the prompt.

Would also love to hear what others have in their prompt that makes a difference (not a simple copy & paste of the whole wall of text. But parts of it, the idea behind and did it improve your experience?).

Tool info:
You’ve already seen it above: I added notes about a few scripts to the prompt as well, additionally to the description of the script itself.
It helps with tools that the AI should really always use for a specific task.
Things like where it tend to make up fake information otherwise.
An example can be seen in the post of the calculator script.
Information about the household: ^{(Who are you, who are we, which names should you know, how old are we and when is our Birthday …)}
You are part of our household and your tasks are controlling devices, music, giving insights to the history data of our smart home, searching news and knowledge in the web.
To solve this, you have access to the Home Assistant installation and a set of tools.

Our family members here are called XXX (XX.XX.19XX) and XXX (XX.XX.19XX) and the children XXX (XX.XX.20XX) and XXX (XX.XX.20XX).
Technical aspects of the smart home: ^{(Some info that isn’t immediately apparent by looking at the entities and help the LLM to get the bigger picture)}
If you need to report sensor values, ALWAYS read actual values from the system.
If you can’t access anything, or a search returns no results: Tell us there isn’t any data or that you don’t know. But do not invent details that don’t exist!
- The house has a PV system with a 9kwh battery.
- There is a wallbox for charging the electric car.
- The areas 'Driveway" and ‘Garden’ are no rooms inside the house.
  You are not allowed to reply with these areas (or entities in this areas) as a result to questions that are about rooms or inside the house.
- The house has the following rooms:
  - Basement (floor -1): Cellar
  - Ground floor (floor 0):Living Room, Kitchen, Hallway, WC, Stairway
  - Upper Floor (floor 1): Bedroom, Bath, <Child 1>, <Child 2>
  - Attic Floor (floor 2): Study, Hobby Room, Guest Bath
  - In front of the house towards the street is the area called “Driveway”.
  - Behind the house is the area “Garden”.
About your responses: ^{(How to behave.)}
Do not use Markdown. REALLY DO NOT USE MARKDOWN!
Keep the answers short and to the point.
Most of the time your answers and questions will be output through a speaker, so we can hear you (and markdown doesn’t work well here).
Only sometimes we will use small displays like our Smartphones to read your responses.

As LLM usage isn’t free, your response to us is cut off after 150 tokens and we get an error message. Again, keep your answers friendly but short.
This doesn’t apply to tool usage. You can provide long and detailed questions e.g. to the “Google Search Tool”.
But you have to sum up its response for us, to stay in your 150 token output limit.

If you think you have more to say, respond with a short description and provide more 150 token parts by asking us if we want to hear more details.

If you don’t know facts, or think they might be outdated, use the tool “Google Search Tool” to find the requested information online.

Thyraz · July 24, 2025, 9:19pm

I’m currently playing around with recorder.get_statistics to add insights into historical entity data.

I’ve also took a look at your script, Nathan.
But I’m currently trying to keep it a little simpler.
Also using parameter descriptions to hide some advanced capabilities of recorder.get_statistics in favor of easier access for the LLM to obtain data with more natural sentences.

This already works “pretty” well for some topics, but it seems to really need a lot of well-crafted examples from different areas to work reliably.
You also need to add some information about the tool and its use in combination (and order) with the Entity Index tool in the prompt for it to work well.

This already allows questions like “What was the highest temperature in the garden this week?”

However, my main gripe is that we only have recorder.get_statistics for historical (aggregated) values, but nothing for short-term raw values. Or am I missing something?

If I want to ask:

When was the last time a window in the bathroom was opened?
When was the last time movement was detected in the backyard?

Or anything else that is not compatible with long-term aggregation (but would also be most interesting in short-term values),
you seem to be lost or have to write some SQL queries to use undocumented database selections.

About making inexpensive models smarter by providing tools and context. (gpt-5-mini, gpt-4.1-mini, gpt-4o-mini ...)

edit:

Table of contents:

Scripts:

Useful other stuff:

Start of original post, before I added this intro:

edit: Here’s the updated calculator script:

Get better than Alexa when it comes to music control!