Next iteration of our Voice Assistant is here - Voice chapter 10

But it looks like Voice is going to stay a black box. We can see all sorts of events and errors in the Assist debug log, but we can’t actually use any of them. It’s a bit of a deal breaker.

Great work from the team!

Is there any update on voice matching, please? All the new stuff you’re adding is great, but voice matching is a must-have for me, living in a family where everyone wants to listen to their own music on their own accounts. I can’t move away from Google until Voice/Assist supports this.

Many thanks

Great updates! Hopefully we can get a timer entity when one is set so we can display it on the dashboard

Yes, thanks! I’ve got 8 models currently in “trial” mode, aiming to narrow it down to one, though I might land on three depending on the context. I’m offloading to a 3090 GPU running Whisper and Piper on Proxmox. So far, I’m leaning towards Qwen, Llama, Phi, and Mistral. I’ll check out Friday’s party; I must’ve missed the invite on that.

Yes, we need the whole Timers integration! :hourglass_flowing_sand:


and Alarm clock integration too :alarm_clock:

That’s why it helps to read the docs :wink:

The UI will almost certainly never show all the possible options available in yaml, especially when it comes to cards. Might as well go direct to the source to see what you can do

You can do that already with a custom sentence/intent.

Custom sentences and intents in home assistant
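
As a rough sketch of what that looks like: a custom sentence file maps a spoken phrase to an intent name, and an `intent_script` entry says what that intent does. The intent name `StartWashingMachineTimer`, the file name, and the `timer.washing_machine` helper are all made up for illustration, so adjust them to your own setup.

```yaml
# config/custom_sentences/en/washing_machine.yaml  (file name is an assumption)
language: "en"
intents:
  StartWashingMachineTimer:        # hypothetical intent name
    data:
      - sentences:
          # (a|b) = alternatives, [..] = optional words
          - "(set|start) [my|the] washing machine timer"
```

```yaml
# configuration.yaml
intent_script:
  StartWashingMachineTimer:        # must match the intent name above
    action:
      - action: timer.start        # older releases use "service:" instead of "action:"
        target:
          entity_id: timer.washing_machine   # assumed timer helper
        data:
          duration: "00:45:00"
    speech:
      text: "Washing machine timer started for 45 minutes."
```

Because the timer here is a regular timer helper, it is an ordinary entity that can be shown on a dashboard.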

So happy to see voice development continue. Can’t wait to try a few things.
Well done team!

+1, it would be nice to be able to enable Assist debug logging for troubleshooting voice and LLM commands.

ESPHome can send an event to Home Assistant from any stage of the voice assistant pipeline. Just configure it to suit your needs.
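
Something along these lines, as a minimal sketch rather than a tested config. It assumes a device that already has a working `voice_assistant:` block; the event names `esphome.va_stt_end` and `esphome.va_error` are invented for the example (ESPHome requires custom event names to start with `esphome.`).

```yaml
# ESPHome device config (fragment)
voice_assistant:
  # ...existing microphone/speaker settings stay as they are...
  on_stt_end:
    - homeassistant.event:
        event: esphome.va_stt_end         # hypothetical event name
        data:
          text: !lambda 'return x;'       # x holds the transcribed text
  on_error:
    - homeassistant.event:
        event: esphome.va_error           # hypothetical event name
        data:
          code: !lambda 'return code;'
          message: !lambda 'return message;'
```

On the Home Assistant side, an automation can then react to the event:

```yaml
# configuration.yaml (or automations.yaml without the top-level key)
automation:
  - alias: "Notify on voice assistant error"
    triggers:
      - trigger: event
        event_type: esphome.va_error
    actions:
      - action: persistent_notification.create
        data:
          title: "Voice assistant error"
          message: "{{ trigger.event.data.code }}: {{ trigger.event.data.message }}"
```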

I used this to get the text from STT. But after ‘Ask a question’ was added, it was no longer necessary.

But as I understand it, that’s thanks to ESPHome, not Home Assistant.

Also - and this may come as a shock - not all of us use ESPHome. :scream:

That page you linked mentions HA as a prerequisite for ESPHome’s Voice Assistant stuff. And anyone who uses Voice Assistant Preview Edition uses ESPHome.

But not everyone uses Voice Assistant Preview Edition and Open Voice is more than VPE.

In fact, Assist predates VPE by quite a long way. It has never allowed users to leverage events and errors - and there are plenty of them:

stage: done
run:
  pipeline: 01h0amejr3nf5059gavrx5jm7g
  language: en
  conversation_id: 01JZ86PBWFHMAPA1Q6W0EVMCQN
  runner_data:
    stt_binary_handler_id: null
    timeout: 300
events:
  - type: run-start
    data:
      pipeline: 01h0amejr3nf5059gavrx5jm7g
      language: en
      conversation_id: 01JZ86PBWFHMAPA1Q6W0EVMCQN
      runner_data:
        stt_binary_handler_id: null
        timeout: 300
    timestamp: "2025-07-03T13:12:54.927548+00:00"
  - type: intent-start
    data:
      engine: conversation.home_assistant
      language: en
      intent_input: undo the trittle.
      conversation_id: 01JZ86PBWFHMAPA1Q6W0EVMCQN
      device_id: null
      prefer_local_intents: false
    timestamp: "2025-07-03T13:12:54.927637+00:00"
  - type: intent-end
    data:
      processed_locally: true
      intent_output:
        response:
          speech:
            plain:
              speech: No area named undo
              extra_data: null
          card: {}
          language: en
          response_type: error
          data:
            code: no_valid_targets
        conversation_id: 01JZ86PBWFHMAPA1Q6W0EVMCQN
        continue_conversation: false
    timestamp: "2025-07-03T13:12:55.155924+00:00"
  - type: run-end
    data: null
    timestamp: "2025-07-03T13:12:55.155987+00:00"
intent:
  engine: conversation.home_assistant
  language: en
  intent_input: undo the trittle.
  conversation_id: 01JZ86PBWFHMAPA1Q6W0EVMCQN
  device_id: null
  prefer_local_intents: false
  done: true
  processed_locally: true
  intent_output:
    response:
      speech:
        plain:
          speech: No area named undo
          extra_data: null
      card: {}
      language: en
      response_type: error
      data:
        code: no_valid_targets
    conversation_id: 01JZ86PBWFHMAPA1Q6W0EVMCQN
    continue_conversation: false

This appears to have been an early design decision.

When it comes to follow-up on missing parameters, it would be cool if we could set some defaults. For example, ‘Ok Nabu, dim the lights in the living room.’

In a separate place, maybe we could define some rules for how it would determine the amount. I’m not a programmer, but perhaps something like this:

defaults:

lights:
  - light.living_room_pot_lights
    - brightness:
      - if (input_boolean.party_mode == on && command == dim) # allow for 'dim' vs 'brighten' type commands
          set to 20%
      - elif (input_select.home_mode == 'day' && light.living_room_pot_lights.brightness > 75%)
          change by 25% # allow for 'change by' commands (if dim was requested, it will go down by 25%; if brighten was requested, up by 25%)
      - elif (input_select.home_mode == 'day' && light.living_room_pot_lights.brightness > 40%)
          set to 35% # allow for 'set to' commands
      - elif ...
      - else: ask # allow for an 'ask' command that will ask for confirmation every time

timers:
  - timer.washing_machine    # if a timer is set with the tag washing machine
      0:45:00                # eg. Ok Nabu, set my washing machine timer
  - timer.dryer
      1:00:00
  - timer.kitchen.device # add some defaults on a per device level?
      ask                # eg. When using the kitchen speaker, always ask

Again, this sort of thing can be done with intent scripts.
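
For the lights example, a sketch of how the first few rules might translate, assuming a custom sentence already maps ‘dim the lights in the living room’ to a hypothetical `DimLivingRoomLights` intent. The entity names are taken from the post above; the thresholds are approximations (the `brightness` attribute runs 0-255, so 75% is roughly 191).

```yaml
# configuration.yaml
intent_script:
  DimLivingRoomLights:               # hypothetical intent name
    action:
      - choose:
          # Party mode on: set to 20%
          - conditions:
              - condition: state
                entity_id: input_boolean.party_mode
                state: "on"
            sequence:
              - action: light.turn_on
                target:
                  entity_id: light.living_room_pot_lights
                data:
                  brightness_pct: 20
          # Day mode and brighter than ~75%: go down by 25%
          - conditions:
              - condition: state
                entity_id: input_select.home_mode
                state: "day"
              - condition: numeric_state
                entity_id: light.living_room_pot_lights
                attribute: brightness
                above: 191
            sequence:
              - action: light.turn_on
                target:
                  entity_id: light.living_room_pot_lights
                data:
                  brightness_step_pct: -25
        # Anything else: fall back to a fixed level (an assumption, not the 'ask' rule)
        default:
          - action: light.turn_on
            target:
              entity_id: light.living_room_pot_lights
            data:
              brightness_pct: 35
    speech:
      text: "Dimming the living room lights."
```

The ‘ask’ fallback is not covered here; that part would need a follow-up question instead of a fixed default.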

Agreed. At least with reentrant scripts and event triggers you can, in theory, have a script call out and ask for clarification if the required info isn’t there (condition X), and we can now ask a question and receive answers with clarified slots. You could then restart the script with the correct data, which would skip the question, etc.

A workflow, à la HA Voice.