Multi-part responses in custom intents

Is it somehow possible to give multi-part responses in a custom intent? Something like:
→ “Do something that takes long”

“OK, I am starting immediately”
(busy calculations)
“OK, I am done”

My current understanding is to pass the result in the speech/text, and this gets spoken at the end.

If you use assist_satellite.announce or tts.speak instead of set_conversation_response, then it is quite possible. See the adjacent topic.
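For example, a minimal sketch of that pattern in an intent_script (the satellite entity id is just a placeholder, and the delay stands in for the long-running work):

```yaml
ExampleIntent:
  action:
    # Announce immediately; assist_satellite.kitchen is a placeholder entity id
    - action: assist_satellite.announce
      target:
        entity_id: assist_satellite.kitchen
      data:
        message: "OK, I am starting immediately"
    # Stand-in for the slow part (MQTT round trip, calculations, ...)
    - delay: "00:00:10"
  speech:
    text: "OK, I am done"
```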

so, you mean I could throw in an announce at the beginning of my intent and THEN return my output for the actual intent???

I’m intrigued and will try that tomorrow

I wasn’t able to retrieve the device_id in an intent. The samples in your link refer to an automation. :frowning:

You can work it backwards from the entity_id. Also, using devices in your automation will lead to pain, especially in voice processing, because the LLM context does not contain device_ids.
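Working it backwards from the area slot could look something like this (a sketch: `area_entities` is a built-in template function, and picking the first media_player in the area is an assumption about the setup):

```yaml
- variables:
    # First media_player entity in the area the command came from
    satellite_player: >-
      {{ area_entities(area)
         | select('match', 'media_player\.')
         | list | first }}
```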

Tell us what you are ACTUALLY trying to accomplish, and someone can help you work it out.

as I said, in the opener:
I have an intent that waits for an external process to finish (it’s an MQTT command and reply) with a timeout of 10 seconds. In the meantime I would like to respond with “OK, understood, please wait…” on that speaker as part of the reply - which follows after 10s (max).

What call makes the call into your planned intent, and what is it capable of sending? What triggers this?

I am not sure I understand the question.
The intent is triggered with “Hey Nabu, play .”
The code then sends an MQTT command, waits for an MQTT reply (a publish to homeassistant), uses the data to generate an AI reply, and puts that into the speech: text

PlaylistSelect:
  action:
    - variables:
        area: "{{ area }}"
        QueueCtrl: "{{ QueueCtrl | default('queue') }}"
        Playlist: "{{ Playlist | default('Everything')}}"
    - variables:
        play_request_dict: >-
          {{ {
            "QueueCtrl": QueueCtrl,
            "Playlist": Playlist
          } }}
    - variables:
        request_id: "r{{ now().strftime('%Y%m%d%H%M%S%f') }}"
    - service: script.log
      data:
        message: "[PlaylistSelect] Requesting playlist '{{ Playlist }}' with QueueCtrl='{{ QueueCtrl }}' in area '{{ area }}'"
################
## HERE would be nice to inform the user that it might take some time
#################
# this might take some time to complete
    - service: script.smartflat_mqtt_request_and_respond
      data:
        cmd_topic: "PlaylistSelect"
        payload: "{{ play_request_dict | tojson }}"
        task_name: "PlaylistSelect"
        Description: "Inform the user about the prepared songs. Summarize the information about the included songs into a very short sentence, choosing two or three of the result parameters."
        request_id: "{{ request_id }}"
      response_variable: reply_payload
    - variables:
        result: >-
          {{ (reply_payload.result if reply_payload is mapping else (reply_payload | default(''))) | default('') }}
    - stop: ""
      response_variable: result
## now provide the response from the external tool (delivered by smartflat_mqtt_request_and_respond)
  speech:
    text: "{{ action_response }}"

Move your automation from intent_script to the GUI. Then you will have access to the standard trigger data, including the device ID {{ trigger.device_id }}.
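In YAML form, such a GUI automation might look roughly like this (a sketch; the sentence trigger exposes the satellite that heard the command as `trigger.device_id`):

```yaml
automation:
  - trigger:
      - platform: conversation
        command: "Do something that takes long"
    action:
      # Announce on the same satellite that heard the sentence
      - action: assist_satellite.announce
        target:
          device_id: "{{ trigger.device_id }}"
        data:
          message: "OK, I am starting immediately"
```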

And also, LLM still hasn’t learned that we’ve moved from service to action


It won’t do so reliably for another 18 months at the current training pace. Just enough time for the pattern to change again…


There is also a way for intent_script, but it is more complicated and requires filtering by name. Here is a minimal example. If an announcement action is required, then the domain in the request must be changed.

Also, the final speech block cannot be used here, otherwise there will be problems with playback.

  CheckConnection:
    data:
      - sentences:
          - "sound check"
        requires_context:
          area:
            slot: true 
        slots:
          domain: media_player

CheckConnection:
  action:
    action: tts.speak
    data:
      media_player_entity_id: "{{ targets.entities | select('search', '(?i)esp32va') | first }}"
      message: "It's {{ area }}"
    target:
      entity_id: tts.piper

But the UI option is much easier to edit and check, so I wouldn’t waste time on the legacy method.


So, I finally took the time to build and test this, and I can confirm it works within intent_scripts as well.
As outlined by @mchk, it requires the “area” slot in the intent to provide the area where the user’s request was spoken.
This needs to be mapped to an output device (either via the search in the example above or with a mapping table).
You can then use tts.speak in the intent, do your stuff, and (at least for me) then use the final speech block of the intent to return the actual output. If the intent finishes too quickly, it will abort the tts.speak and speak its text instead.
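Putting it together, the pattern looks roughly like this (a sketch; the esp32va search comes from the earlier example, and the delay is a stand-in for the MQTT round trip):

```yaml
PlaylistSelect:
  action:
    # Early feedback on the speaker in the room the request came from
    - action: tts.speak
      target:
        entity_id: tts.piper
      data:
        media_player_entity_id: >-
          {{ targets.entities | select('search', '(?i)esp32va') | first }}
        message: "OK, understood, please wait"
    # Placeholder for the MQTT request/reply round trip
    - delay: "00:00:10"
  speech:
    # Spoken at the end; if the intent finishes too quickly,
    # this aborts the tts.speak and is spoken instead
    text: "Your playlist is ready"
```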