I want to be able to walk up to my Home Assistant Voice PE and say, “hey Jarvis, trust no one” and have it respond verbally with a specific message. This is part of a scavenger hunt for my kids to play on Christmas day.
I have the Home Assistant Voice PE and it’s set up to use the “hey Jarvis” wake word. In my Settings/Assistants menu I have an assistant named “Jarvis” that is set as the preferred assistant. Its conversation agent is configured to use an Ollama LLM model (gpt-oss:120b-cloud). Speech-to-text and text-to-speech are both configured to use Home Assistant Cloud.
Currently, if I do this, Jarvis just thinks I’m having a conversation with him, thinks I’m being weird, and responds like any other LLM would. But if I open a conversation from a dashboard using the Assist button and type the sentence “trust no one”, then it triggers the automation appropriately and Jarvis responds with the message in my automation. Do I need to change how the speech-to-text is configured?
Here’s the automation as it looks right now:
alias: Trust No One
description: ""
triggers:
  - trigger: conversation
    command: trust no one
conditions: []
actions:
  - action: tts.speak
    metadata: {}
    target:
      entity_id: tts.home_assistant_cloud
    data:
      cache: true
      media_player_entity_id: media_player.home_assistant_voice_095147_media_player
      message: >-
        “In a chamber of storms, glass and steel clash, and rain falls sideways
        in the dark. What enters filthy emerges pure. Look within the metal
        mouth.”
mode: single
Interestingly, I didn’t think the automation was triggering at all when I said “hey Jarvis, trust no one”. But it now appears that it is indeed triggering, and the only audible response I’m getting is “Done”.
Text-to-speech (TTS) 'Speak' on Home Assistant Cloud
Executed: December 6, 2025 at 1:54:24 PM
Result:
params:
  domain: tts
  service: speak
  service_data:
    cache: true
    media_player_entity_id: media_player.home_assistant_voice_095147_media_player
    message: >-
      “In a chamber of storms, glass and steel clash, and rain falls sideways in
      the dark. What enters filthy emerges pure. Look within the metal mouth.”
    entity_id:
      - tts.home_assistant_cloud
  target:
    entity_id:
      - tts.home_assistant_cloud
running_script: false
alias: Trust No One
description: ""
triggers:
  - trigger: conversation
    command: trust no one
conditions: []
actions:
  - action: tts.speak
    metadata: {}
    target:
      entity_id: tts.home_assistant_cloud
    data:
      cache: true
      media_player_entity_id: media_player.home_assistant_voice_095147_media_player
      message: >-
        In a chamber of storms, glass and steel clash, and rain falls sideways
        in the dark. What enters filthy emerges pure. Look within the metal
        mouth.
  - set_conversation_response: ""
mode: single
I’m not a dev, so this might not be 100% accurate… but as I understand it:
The conversation response always plays at the end of the automation. Your TTS action was fired, but the automation was considered “finished” before the audio had been fully processed. That sets up a kind of race over which audio clip is ready to play first, and the short, cached “Done” wins. The audio you wanted to play is then cancelled because something is already playing.
Really, if you like the voice used by your default conversation agent and the sentence should play out of the device you are speaking to, you can use the conversation response instead of the tts.speak action:
alias: Trust No One
description: ""
triggers:
  - trigger: conversation
    command: trust no one
conditions: []
actions:
  - set_conversation_response: |
      In a chamber of storms, glass and steel clash, and rain falls sideways
      in the dark. What enters filthy emerges pure. Look within the metal
      mouth.
mode: single
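If it helps on the speech-to-text side, here is a rough sketch of the same automation with a few alternative phrasings, in case the spoken phrase gets transcribed a little differently for different speakers. The extra command variants are my own guesses and not something seen in the trace above, so treat them as placeholders. As far as I know, the conversation trigger accepts a list of sentences, with (a|b) for alternatives and [word] for optional words.

```yaml
alias: Trust No One
description: ""
triggers:
  - trigger: conversation
    command:
      # Original phrase from the automation above
      - "trust no one"
      # Assumed variant: speech-to-text might return a digit instead of the word
      - "trust no 1"
      # Assumed variant: an optional leading word, written with [ ]
      - "[please] trust no one"
conditions: []
actions:
  # The response is spoken by the device that heard the phrase
  - set_conversation_response: |
      In a chamber of storms, glass and steel clash, and rain falls sideways
      in the dark. What enters filthy emerges pure. Look within the metal
      mouth.
mode: single
```

You can try each variant by typing it into the Assist dialog on a dashboard before testing it by voice.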