Solved: Using GPT-4o-Mini-Search-Preview with Voice Assist PE – Easy Workaround

OpenAI just released the API model gpt-4o-mini-search-preview.
Unfortunately, I’m receiving an error message from the OpenAI Conversation integration in my Assist pipeline when I try to use it:

Error talking to OpenAI

Has anybody been successful with gpt-4o-mini-search-preview?

I would love to use the web search functionality.

Thank you!

1 Like

What is the error message in your log?

“Error talking to OpenAI” will be accompanied by an HTTP status code telling you why.

Hi @NathanCu,

It just says ‘error’ in the Assist logs, but I found out it’s even happening in the OpenAI playground:
“The requested model ‘gpt-4o-mini-search-preview’ does not exist.”

(accessed via https://platform.openai.com/docs/models/gpt-4o-mini-search-preview)

Maybe it’s a regional thing and the model is not activated in Germany yet? Can you use it in the OpenAI playground?

Thanks!

I’m able to use the model (I’m in the US), but the tokens-per-minute (TPM) rate limit for this model is 6,000, and the requests coming from HA all seem to be just above that at ~6,200, presumably because the integration includes all exposed entities in the prompt.

Typical error: Rate limited by OpenAI: Error code: 429 - {'error': {'message': 'Request too large for gpt-4o-mini-search-preview in organization org-[orgID] on tokens per min (TPM): Limit 6000, Requested 6680. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.', 'type': 'tokens', 'param': None, 'code': 'rate_limit_exceeded'}}

Edit: added error message

2 Likes

It seems like the current integration (and the OpenAI Extended Conversation integration) does not support this model, since the model does not accept the temperature and top_p parameters the integration sends. Here’s the response:

Error talking to OpenAI: Error code: 400 - {'error': {'message': 'Model incompatible request arguments supplied: temperature, top_p', 'type': 'invalid_request_error', 'param': None, 'code': None}}

And here’s the response with OpenAI Extended Conversation integration:

Error code: 404 - {'error': {'message': 'functions is not supported in this model. For a list of supported models, refer to https://platform.openai.com/docs/guides/function-calling#models-supporting-function-calling.', 'type': 'invalid_request_error', 'param': None, 'code': None}}
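
For reference, the model does accept a request once those parameters are stripped. Here’s a minimal, untested sketch as a Home Assistant rest_command; the web_search_options / user_location block is the optional location hint from OpenAI’s docs, and the API key, country, and city values are placeholders:

```yaml
# Untested sketch: a minimal Chat Completions request the search-preview
# model accepts - just model + messages, no temperature/top_p/functions.
rest_command:
  openai_web_search:
    url: https://api.openai.com/v1/chat/completions
    method: post
    timeout: 30
    headers:
      Authorization: "Bearer YOUR_OPENAI_API_KEY"
      Content-Type: application/json
    payload: >-
      {
        "model": "gpt-4o-mini-search-preview",
        "messages": [
          {"role": "user", "content": {{ prompt | to_json }}}
        ],
        "web_search_options": {
          "user_location": {
            "type": "approximate",
            "approximate": {"country": "DE", "city": "Munich"}
          }
        }
      }
```
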
2 Likes

If anybody has a nice solution to integrate AI web search into Assist, I would be very interested.

I have tried to avoid starting my MCP journey so far. :smile:

This is the #1 use case right now for MCP. If someone made Brave Search an MCP-enabled Docker add-on, it would be downloaded a billion times.

And I’m not even going to try without MCP.

Using GPT-4o-Mini-Search-Preview with Voice Assist PE – Easy Workaround

This is a game changer for me. I found a very simple and effective way to integrate gpt-4o-mini-search-preview with my Voice Assist PE.

What can I do with it?

Voice Assist PE can now answer ANY question using web search. Pretty cool!

Examples:

  • “How did Bayern play today?”
  • “Is the cocktail bar Mojito open today?”
  • “What’s the weather tomorrow?” (the location can be passed along in the API request)

To trigger gpt-4o-mini-search-preview, you just need to use a specific trigger word or sentence.


How it works:

The core idea is simple: I set up a separate gpt-4o-mini-search-preview flow outside the normal assist pipeline.

  1. Define a trigger phrase, e.g.:
    • “I have a question: How did Bayern play today?”
    • “Question: How did Bayern play today?”
  2. Voice command with trigger phrase triggers an automation
  3. Automation saves the prompt to an input_text helper
  4. Node-RED reads the prompt and sends an OpenAI API request
    • Model: gpt-4o-mini-search-preview
    • Location (if needed) is defined in the request
  5. Node-RED saves the response to a second input_text helper
  6. Automation detects the updated response and continues
  7. Voice Assist PE announces the response using assist_satellite.announce via TTS
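
The automation looks like this:
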
alias: "Voice: Web Search"
description: ""
triggers:
  - trigger: conversation
    command:
      - I have a question {websearch_prompt}
      - Question {websearch_prompt}
conditions: []
actions:
  - action: input_text.set_value
    metadata: {}
    data:
      value: "{{ trigger.slots.websearch_prompt }}"
    target:
      entity_id: input_text.voice_web_search_prompt
  - wait_for_trigger:
      - trigger: state
        entity_id:
          - input_text.voice_web_search_prompt_response
  - action: assist_satellite.announce
    metadata: {}
    data:
      message: "{{ states('input_text.voice_web_search_prompt_response') }}"
    target:
      entity_id: assist_satellite.home_assistant_voice_xxxx_assist_satellit
mode: single

Node-RED flow:

Since you’re already using Node-RED, you could replace the Home Assistant automation with a sentence node and handle the dynamic response directly in Node-RED.

https://zachowj.github.io/node-red-contrib-home-assistant-websocket/node/sentence.html#dynamic-response-examples

1 Like

That’s true. Thank you for the tip.

I was wondering if it could be done completely in HA… but I couldn’t find a solution to run the API request.
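
For what it’s worth, one possible route (untested) might be a rest_command like the sketch earlier in the thread, called with a response_variable (supported for rest_command since HA 2023.9) so the automation can announce the result directly, replacing the wait-and-announce steps above:

```yaml
# Untested sketch: call the rest_command defined earlier and announce
# the answer directly - no Node-RED round trip needed.
- action: rest_command.openai_web_search
  data:
    prompt: "{{ trigger.slots.websearch_prompt }}"
  response_variable: search_response
- action: assist_satellite.announce
  data:
    # Assumes rest_command returns the parsed JSON body in `content`.
    message: "{{ search_response.content.choices[0].message.content }}"
  target:
    entity_id: assist_satellite.home_assistant_voice_xxxx_assist_satellit
```

That would drop the Node-RED round trip and the second input_text helper entirely.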

That’ll need a small tweak to the OpenAI conversation handler, just like what they did for o1/o3. Unfortunately, the model needs tweaks to what’s being sent and doesn’t just ‘slot in’, so we’ll have to wait for the conversation agent to catch up.

That said, I doubt they’ll miss it… I know continued conversation is scheduled for 2025.4; I’d love to see this hit too. Search is one of the few things I’m missing right now, tbh.

1 Like

That’s pretty radical - why? I personally would use any solution that works and integrates nicely, such as an HA script exposed to Assist.

I’ve been working on building it for half a year. For local, you need an Ollama installation that supports OpenAI API calling, then you need to install a tool to do the search and something to publish it. Look, what I’m saying is there’s an easy way and a hard way.

For cloud, 4o-mini search is about to be a giant easy button.

To get the same locally, your LLM either needs to support search itself (you installed the tool) or be given access to a tool through MCP.

I’ve done all of them; MCP is far and away the easiest and most lightweight.

Someone could (I’ve almost nailed it; I’m working on it, but my docker-fu sucks) create a container that has Brave Search and an MCP proxy. Fire it up and publish it to HA. Boom, search.
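
A rough, untested sketch of what that container could look like, assuming the community supergateway package (to bridge the stdio MCP server to SSE for HA’s Model Context Protocol integration) and the @modelcontextprotocol/server-brave-search package; names and flags should be verified:

```yaml
# Untested sketch: run the Brave Search MCP server behind an SSE bridge
# so HA's Model Context Protocol integration can connect to it.
services:
  brave-search-mcp:
    image: node:20-alpine
    command: >
      npx -y supergateway
      --stdio "npx -y @modelcontextprotocol/server-brave-search"
      --port 8000
    environment:
      - BRAVE_API_KEY=your_brave_api_key
    ports:
      - "8000:8000"
    restart: unless-stopped
```
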

Want DB access? A container with a SQL tool.
Want tool A, B, C, or even an entirely different LLM?

It’s absolutely the way we’ll be doing this in a year or so… so I’m building toward that.

1 Like