Here are further details, with formatting, plus some added thoughts after watching the firmware logs across multiple tests (and also diving into the firmware itself, but with no luck).
Continuous Conversation Mode for Voice PE: Firmware Limitations
Current Workaround (Limited Success)
I have a “quasi” continued conversation mode working with this automation:
- alias: Voice assistant follow-up
  description: Keeps the conversation going if Follow Up Mode is enabled
  trigger:
    - platform: state
      entity_id: assist_satellite.home_assistant_voice_yourid
      from: responding
      to: idle
    - platform: state
      entity_id: assist_satellite.home_assistant_voice_yourid
      from: responding
      to: idle
  condition:
    - condition: state
      entity_id: input_boolean.follow_up_mode
      state: 'on'
  action:
    - service: assist_satellite.start_conversation
      data:
        start_message: ''
        preannounce: false
      target:
        entity_id: '{{ trigger.entity_id }}'
  mode: single
  id: 702e8dd4a3c748e8a33f779162978101
And these automations to toggle follow-up mode by voice:
- alias: 'Voice: Stop Follow Up Mode'
  trigger:
    - platform: conversation
      command:
        - stop follow up
        - stop follow up mode
        - stop listening
        - stop talking
        - end conversation
        - conversation over
        - disregard
        - nevermind
        - cancel conversation
        - enough
  action:
    - service: input_boolean.turn_off
      target:
        entity_id: input_boolean.follow_up_mode
    - service: persistent_notification.create
      data:
        title: Follow Up Mode
        message: Follow up mode has been disabled.
  mode: single
  id: a6f2309be431465084e8e81ec687e65a
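The matching “start” automation is just the mirror image of the one above. This is a rough sketch, not a tested config: the trigger phrases are placeholders I picked for illustration, and it assumes input_boolean.follow_up_mode already exists as a Toggle helper.

```yaml
# Sketch: voice command that switches follow-up mode back on.
# The phrases below are illustrative placeholders; pick whatever feels natural.
- alias: 'Voice: Start Follow Up Mode'
  trigger:
    - platform: conversation
      command:
        - start follow up
        - start follow up mode
        - keep listening
  action:
    # Assumes the input_boolean.follow_up_mode helper has already been created.
    - service: input_boolean.turn_on
      target:
        entity_id: input_boolean.follow_up_mode
  mode: single
```

Try to keep the start phrases clearly distinct from the stop phrases, so a misheard command does not flip the mode the wrong way.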
This approach is great for commands like “turn off the light. Done. Turn on the TV. Done. Play some music” - all without constantly repeating the wake word. It also lets you immediately cut off the assistant while it’s speaking.
The Downside: This does NOT carry over context. Every single call is a new conversation, so you’re basically talking to an LLM with the memory of the dude from Memento.
Home Assistant Voice PE Analysis: Question Detection Behavior
Based on detailed log analysis, I’ve found evidence of how Voice PE decides when to keep listening:
State Transition Patterns
When analyzing device logs, two distinct patterns emerge:
Pattern 1: Responses WITH questions
[20:51:07] Response: "What makes you think of purple today?"
[20:51:12] State changed from RESPONSE_FINISHED to START_MICROPHONE
Pattern 2: Responses WITHOUT questions
[20:51:19] Response: "Purple is now a key data point in my brain!"
[20:51:23] State changed from RESPONSE_FINISHED to IDLE
Technical Analysis
The firmware appears to:
Perform linguistic analysis on response text
Identify question patterns (question marks, interrogative phrases)
Force different state transitions based solely on this detection
Make this decision extremely quickly (within milliseconds)
Proposed Enhancement
A simple firmware modification could add a configuration option:
voice_assistant:
  # Existing configuration...
  continued_conversation:
    enabled: true
    mode: "always" # Options: "questions_only", "always", "disabled"
    timeout: 5s
This would enable users to choose their preferred conversation style without complex workarounds.
Conclusion
I have literally tried everything, from carrying over conversation IDs across states to flashing the firmware with custom code. I cannot overcome this design limitation, and it’s REALLY holding the device back from being a true conversational AI home assistant: real D&D games, proper long-form discussions, none of it constantly interrupted by “HEY JARVIS”.
I hope you can look into where this needs to be changed, because once it gets into firmware design, I can’t crack it. The issue appears to be that the state transitions happen TOO quickly at the firmware level for Home Assistant automations to reliably intercept: multiple state transitions within milliseconds.
I love your product but tying continuous conversations to questions doesn’t make sense for natural spoken word conversation.
Thank you for all you do!