Discussion thread for tools to make inexpensive models smarter (gpt-4o-mini in my case)

Hello everyone,

Since the arrival of LLMs, we talk to Assist much more naturally and freely, and we are often quite happy to believe what it tells us :wink:.

However, I’ve noticed that the bullshit level for smart-home related questions is still quite high.
The reason is usually anything involving math: the models don’t use a calculator, for example, but answer according to “linguistic probability”.

The more expensive models usually produce far better results in this respect, but

  • far from always
  • I am afraid of the costs, as the children are also on board. :stuck_out_tongue:

That’s why I started with a first test script today, to see if this can be improved with a tool-set. The first one: determine the min/max value from a list of numbers.

Here is an example of what can go wrong without additional guidance for the LLM:

“Phew, it’s so hot in my study, what’s the coldest room in the house I could move to?”

The current temperatures in the various rooms are

  • Bathroom: 26.0°C
  • Roof bathroom: 28.5°C
  • Office: 27.96°C
  • Kitchen: 25.48°C
  • <name_of_kid_1>: 26.57°C
  • <name_of_kid_2>: 26.46°C
  • Bedroom: 25.42°C
  • Games room: 27.89°C
  • WC: 25.8°C
  • Living room: 25.08°C

The coolest room is the bedroom at 25.42°C.

Uhhhm, well. That’s rather wrong …
And if you repeat the question a few more times, you get a different room each time. :face_with_spiral_eyes:

So I created the following script and shared it with Assist:

alias: Min-Max-Calculator
icon: mdi:calculator-variant
description: >-
  Returns the minimum or maximum of a list of numbers (LLM tool). 

  Can, for example, find the lowest temperature among many.

  Depending on the mode set, the function returns the smallest or largest number
  from the passed array.
mode: single
fields:
  operation:
    name: Operation
    description: >-
      'min' to get the smallest value from a list, or 'max' to get the largest
      value from a list
    required: true
    selector:
      select:
        options:
          - min
          - max
  numbers:
    name: Numbers
    description: A JSON array string with several numbers, e.g. [20.43, 21.5, 22.05]
    required: true
    selector:
      text: null
sequence:
  - variables:
      nums: "{{ numbers | from_json }}"
      result: |-
        {% if operation == 'min' %}
          {{ {'value': nums | min } }}
        {% else %}
          {{ {'value': nums | max } }}
        {% endif %}
  - stop: ""
    response_variable: result

Then I added this text in the Conversation Agent settings (in the OpenAI Integration configuration):

You are VERY bad at calculations, finding min/max values, comparing numbers, or date calculations like what date tomorrow or next week is, or how many days it is until a given date.
Always, really always, use the tools provided when possible to get the solution. Do NOT try to calculate yourself as long as there is another way.

After that, Assist can handle questions about min / max values of things like temperatures, illumination, battery states, … without problems.

At the beginning I always thought about large and cool use-cases when looking for ideas for new intents / scripts.
But I think there is a lot of potential in smaller tools for Assist, as many problems in a smart home are not about linguistic probability but about technical details. :wink:
Getting this right might improve the user experience a lot, at least until cheap models get better at this kind of problem.

Some other ideas in my head:

  • A kind of real calculator that can handle +, -, *, /, avg, …
  • Date calculations like “what’s the date of the day after tomorrow”, “what date is next Tuesday”, “what day is today + x days”.

The second example is also something I had a lot of problems with.
Asking for appointments in the calendar, or asking about the weather next weekend, …
There’s really a lot where these models fail.
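
I haven’t built the date tool yet, but a minimal sketch could look something like this (untested, the field name is just a placeholder; it simply shifts today’s date by a number of days):

alias: Date-Calculator
icon: mdi:calendar-clock
description: >-
  Returns the date and weekday that lies a given number of days from today (LLM tool).

  Example: 2 gives the day after tomorrow, -1 gives yesterday.
mode: single
fields:
  days_offset:
    name: Days offset
    description: Number of days from today, can be negative, e.g. 2 for the day after tomorrow
    required: true
    selector:
      number:
        min: -366
        max: 366
sequence:
  - variables:
      result:
        # now() and timedelta() are available in Home Assistant templates
        date: "{{ (now() + timedelta(days=days_offset | int)).date() | string }}"
        weekday: "{{ (now() + timedelta(days=days_offset | int)).strftime('%A') }}"
  - stop: ""
    response_variable: result

Questions like “what date is next Tuesday” would still need an extra field or a second tool, as they are an offset to a weekday rather than to today.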

So if you

  • have scripts like this in your setup too,
  • know other use-cases where LLMs fail at simple general tasks that come up in everyday questions,
  • or simply want to discuss this :stuck_out_tongue:

Feel free to participate. :slightly_smiling_face:

Context is king to make an LLM work.

Start reading.

2 Likes

Hi Nathan,

sure, I’ve been following Friday’s Party since the beginning. :wink:
(But I’m currently lagging a few posts behind with my reading …)

I haven’t had the time to build a really large tool-set library for my own setup so far, as my spare time is somewhat limited at the moment.

So I still have only a few little tools with hopefully good descriptions for the AI, to handle the most important stuff for me. (Which already works very well and is such a difference to the voice assistants of the past. :slightly_smiling_face:)

But I already noticed there that calculations are often a problem when using LLMs.
Adding additional context via guidance in the prompt (handing over a default calendar so it can look up days) didn’t solve that completely, in case that is what your “context is king” comment was about? (Maybe I misinterpreted it.)

So this thread was more about the small problems that spread into a lot of different use-cases, as calculations are everywhere in this smart world …

1 Like

Calculations ARE ALWAYS a problem. That’s my point. They’re not designed to do math - they’re designed to place the next most likely token after the last token - THAT’S IT. I’m actually giving Friday an actual calculator that does ALL math for her. (NEVER rely on the LLM.)

LLMs are not designed for most things people ascribe to them - Friday is about PUSHING ALL THAT to tools.

Calendar - Tool. Todo - Tool. Entire ERP system - just a tool. Airspace Management - Tool. File management - Tool. Email, Notifications, IM, all Tools. THAT’S what makes Friday able to do what she does. You’re on the right track, but make no mistake, she is absolutely solving the same problem - LLMs (even the big ones) simply can’t do most things people want them to, so you have to design chains of tool use. GPT-5, when it hits, is a different animal - but even thinking LLMs need to be thought of as pattern matchers - story tellers. You’re just putting a tool in its hands and telling it to use the tool at the same time.

1 Like

Thanks for the fast reply. :slight_smile:

Yes, this has been pretty clear to me since I saw the first ideas about custom intents combined with an LLM in one of the release party videos some (or more) time ago.

That was also when I started to hope we could set up something better on our own than the big companies offer, as they won’t allow that kind of customization.

Later also your Friday’s Party (and some other threads around here) became a large source of inspiration.

Now I just need the kids to grow up, so I have more time for this kind of stuff. :upside_down_face:

1 Like

Well I can’t help you with that part. :rofl:

So let’s talk about what we CAN do. If we target ‘commodity’, let’s call that anything at an Nvidia 40xx level or better. This is basically what the monastery is for Friday. The next set of posts is designed to offload as much of her runtime as possible to the monastery (which is basically the Intel version of that).

So I’ll be cutting up the inference jobs (the stuff that makes her seem smarter than she is) into tiny chunks that run all the time on my local hardware, then come back and summarize that once in a while. (That’s what the kata cabinet is, when you get to that part - where I park those summaries.)

Then, instead of the big giant glob of stuff Friday knows, I load the summary and pointers to all the other junk.

It’s really the only way it’ll be effective in the long run on small iron. (There, that’s what 12-18 months of your question taught me.) :rofl:

In the meantime - she runs the ‘front’ model on GPT-4.1-mini (I noticed you’re on 4o-mini; 4.1-mini is MUCH better at a similar price, that’s the first change I’d recommend. You can even make it work with Nano.)

1 Like

Ok, I also extended the script so it can handle all basic math operations and some aggregations.
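
For anyone who wants to build something similar, here is a simplified sketch of what such an extended calculator script could look like (untested as posted here; the option names are only an example):

alias: Calculator
icon: mdi:calculator
description: >-
  Performs a basic calculation on a list of numbers (LLM tool).

  'add', 'avg', 'min' and 'max' work on the whole list; 'subtract', 'multiply'
  and 'divide' are applied left to right, e.g. [10, 2, 2] with 'divide' gives 2.5.
mode: single
fields:
  operation:
    name: Operation
    description: One of add, subtract, multiply, divide, avg, min, max
    required: true
    selector:
      select:
        options:
          - add
          - subtract
          - multiply
          - divide
          - avg
          - min
          - max
  numbers:
    name: Numbers
    description: A JSON array string with several numbers, e.g. [20.43, 21.5, 22.05]
    required: true
    selector:
      text: null
sequence:
  - variables:
      nums: "{{ numbers | from_json }}"
      result:
        value: >-
          {% if operation == 'add' %} {{ nums | sum }}
          {% elif operation == 'avg' %} {{ nums | average }}
          {% elif operation == 'min' %} {{ nums | min }}
          {% elif operation == 'max' %} {{ nums | max }}
          {% else %}
            {# subtract / multiply / divide: fold the list from left to right #}
            {% set ns = namespace(acc=nums[0]) %}
            {% for n in nums[1:] %}
              {% if operation == 'subtract' %} {% set ns.acc = ns.acc - n %}
              {% elif operation == 'multiply' %} {% set ns.acc = ns.acc * n %}
              {% else %} {% set ns.acc = ns.acc / n %}
              {% endif %}
            {% endfor %}
            {{ ns.acc }}
          {% endif %}
  - stop: ""
    response_variable: result

The description fields matter most: the clearer the operations and the expected input format are described, the more reliably a cheap model picks the right tool and fills in the fields.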

This is really starting to get funny. :partying_face:

How can I assist?

      Do you have a tool called calculator?

Yes, I have access to a calculator tool. How can I assist you with calculations?

      show me the room temps

Here are the current room temperatures:  
- Bathroom: 26.0°C 
- Attic Bathroom: 27.0°C 
- Office: 26.8°C 
- Kitchen: 25.0°C 
- <Child 1> Room: 26.0°C 
- <Child 2> Room: 26.0°C 
- Bedroom: 25.5°C 
- Playroom: 26.98°C 
- WC: 25.5°C 
- Living Room: 25.0°C  

Let me know if you need anything else!

      calculate the average temp of the rooms above

The average temperature of the rooms is approximately 25.98°C. If you need further assistance, just let me know!

      show me the calculation method

The average temperature is calculated by adding all the room temperatures together and then dividing by the number of rooms.

For the given temperatures: 

1. Add the temperatures: 
  26.0 + 27.0 + 26.8 + 25.0 + 26.0 + 26.0 + 25.5 + 26.98 + 25.5 + 25.0 = 259.78  
2. Divide by the number of rooms (10): 
  259.78 / 10 = 25.978  

So, the average temperature is approximately 25.98°C.

I guess at some point I’ll have to provide access to my InfluxDB with an additional tool. :exploding_head:

1 Like

recorder get_statistics came out last month - I already have a tool in Friday’s pack - go look for History CRUD… If you’re using Influx for recorder… You’re welcome. Friday's Party: Creating a Private, Agentic AI using Voice Assistant tools - #114 by NathanCu

1 Like

Like I said, I’m lagging a few posts behind in your thread. :smile:

Thanks a lot. :+1:

1 Like

Ok, something else I noticed that fits perfectly in this thread, but that I haven’t fixed so far:

The LLM fails amazingly often at listing the state of multiple entities.
Like “Which lights in the living room are turned on?”.
If I tell it (after a wrong answer) that this isn’t true and that it should take a close look at all devices and their states, then it can provide the correct answer most of the time.
But on the first try it fails far too often.

This is really something I didn’t expect, as the entity data, their status and the room are shared with the LLM by Home Assistant.
I have A LOT of entities in my Home Assistant installation, and we have a lot of ambient lights in our rooms that get activated automatically when one of the main lights in the room is turned on.

But still, this seems like a simple and common task for the LLM.

No idea how the data is provided to the assistant, but is it in such bad shape that we really need a tool to fetch and filter entities (by type, room, status, maybe tags, …) for easier access? :face_with_monocle:

The same happens with open windows or other things where it has to find the correct entities in a large list and then filter them by status / room.

It often doesn’t just miss some of the devices, it even mixes in devices from other rooms.
So my feeling really is that it gets confused and could use some help …

edit:
I also tested it with better/more expensive models than my default gpt-4o-mini.
They get the answer right more often.
But they still fail from time to time.
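
In case someone wants to experiment in the meantime, a rough, untested sketch of such a fetch-and-filter tool could look like this (the names and fields are just placeholders):

alias: Entity-Filter
icon: mdi:filter
description: >-
  Lists all entities of a given domain in a given area that are currently in a
  given state (LLM tool). For example all lights in the living room that are on.
mode: single
fields:
  domain:
    name: Domain
    description: The entity domain to search, e.g. light, binary_sensor, cover
    required: true
    selector:
      text: null
  area:
    name: Area
    description: The name of the area/room, e.g. Living room
    required: true
    selector:
      text: null
  state:
    name: State
    description: The state to filter on, e.g. on, off, open, closed
    required: true
    selector:
      text: null
sequence:
  - variables:
      # Iterates over all states, so it may be slow with very many entities
      matches: >-
        {{ states
           | selectattr('domain', 'eq', domain)
           | selectattr('entity_id', 'in', area_entities(area))
           | selectattr('state', 'eq', state)
           | map(attribute='name')
           | list }}
      result:
        count: "{{ matches | count }}"
        entities: "{{ matches }}"
  - stop: ""
    response_variable: result

That way the model only has to read back a short, pre-filtered list instead of scanning hundreds of exposed entities itself.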

You’re describing a major failure on our part.

Assuming it can do anything asked. Full stop. It cannot. It is assembling tokens, period.

Even the thinking models are just assembling tokens and then checking themselves - they’re not really… Counting. So you have to make it easy.

Any time you want the LLM to answer factually about anything, don’t give it enough lead to fail. You give it a tool. If you want it to count lights, you have to give it a way to count lights or live with bad numbers.

In my system the index tool is that. There’s a function to expand it, and it basically gives the LLM a cheat sheet… “How many lights are on in the living room” translates to an index call for lights in the living room filtered on, and voila, there’s a simple list with a definite count. No math, just answers that are easy for the LLM to regurgitate.

LLMs are story tellers. If you don’t explicitly give them exactly what they need to tell a factual story and complete the task, they make it up to complete the task. It’s that simple. If it’s making something up, it doesn’t have enough to give the answer. You say yes it does, but that answer involves it looping back and doing math. It will fail almost every time, unless you provide the answer with a tool and instructions on how to use it.

1 Like

Ok, I was already afraid of that. :smile:

Thanks a lot for sharing so much of your lessons learned.
Will take a closer look at the grand index in your Friday’s Party thread. :+1: