I’m planning to set up a voice-triggered automation in Home Assistant (HA) using a Voice PE, but I might be missing something. I can’t figure out how to trigger an automation using a voice command, such as “OK Nabu, run Automation Beta Gamma.” I’ve searched but couldn’t find clear documentation or settings to make this work.
Additionally, I’m looking for guidance on activating the microphone programmatically, i.e. virtually triggering the wake word (“OK Nabu”) without actually speaking it. My goal is to integrate this into a broader workflow:
A camera motion sensor triggers an automation.
The automation captures a snapshot and sends it to a local LLM/LMM (Large Language Model/Multimodal Model) for analysis.
The analysis result is converted to speech (TTS) and played back.
------- Up to here is all done -------
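For reference, the working part looks roughly like this. This is a minimal sketch: all entity IDs are placeholders, and the LLM call is illustrated with the LLM Vision integration's `image_analyzer` service (mentioned further down in this thread), so its exact parameter and response field names should be checked against that integration's docs:

```yaml
alias: "Motion -> snapshot -> LLM -> TTS"
trigger:
  - platform: state
    entity_id: binary_sensor.camera_motion   # placeholder motion sensor
    to: "on"
action:
  - service: camera.snapshot
    target:
      entity_id: camera.front_door           # placeholder camera
    data:
      filename: /config/www/snapshot.jpg
  - service: llmvision.image_analyzer        # from the LLM Vision integration
    data:
      provider: YOUR_PROVIDER_ID             # assumption: see the LLM Vision docs
      message: "Describe what triggered the motion."
      image_file: /config/www/snapshot.jpg
    response_variable: analysis
  - service: tts.speak
    target:
      entity_id: tts.piper                   # placeholder TTS engine
    data:
      media_player_entity_id: media_player.voice_pe
      message: "{{ analysis.response_text }}"  # assumption: response field per LLM Vision docs
```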
After the TTS, I’d like HA to listen for a voice response (e.g., a yes/no or boolean-type response) to proceed with the next steps.
I haven’t been able to find documentation on enabling the microphone or setting up this kind of voice interaction. If anyone has experience with this or can point me to the right resources, I’d really appreciate the help!
You don’t need the wake word for your task. A trigger can call the required action directly.
If you need to activate the V:PE with a trigger, you will need to add a button to the firmware that starts the voice assistant service. Basic knowledge of ESPHome is required.
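A minimal ESPHome sketch of such buttons, assuming your device config already includes the `voice_assistant` component (the Voice PE firmware does); `voice_assistant.start` and `voice_assistant.stop` are standard ESPHome actions:

```yaml
# Template buttons that start/stop a listen without the wake word.
button:
  - platform: template
    name: "Start Listening"
    on_press:
      - voice_assistant.start:
  - platform: template
    name: "Stop Listening"
    on_press:
      - voice_assistant.stop:
```

Once flashed, the buttons show up as entities in HA, so an automation can press them via the `button.press` action.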
I use LLM Vision for AI image analysis—totally worth the coffees I donated.
Thank you for the “Adding a custom sentence to trigger an automation” link; it’s very insightful.
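For anyone else reading, that docs page boils down to a `conversation` trigger; a minimal sketch (the target automation's entity ID is hypothetical):

```yaml
alias: "Run Automation Beta Gamma by voice"
trigger:
  - platform: conversation
    command:
      - "run automation beta gamma"
action:
  - service: automation.trigger
    target:
      entity_id: automation.beta_gamma   # hypothetical target automation
```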
However, I’m not sure how to make the V:PE start and stop listening from an automation.
My proposed workflow is as follows:
Detection: A camera detects a person, triggering the LLM Vision automation.
Processing: LLM Vision processes the image, and the resulting AI output is sent to the Alexa Media Player.
Listening Initiation: The automation instructs V:PE to start listening.
User Interaction: The user responds verbally with a Yes/No answer.
Listening Termination: V:PE is commanded to stop listening.
Action Execution: A subsequent action is performed based on the user’s response.
This cannot be done in the current state of HA; there is no mechanism for building dialog chains. Perhaps in the future this will be solved as LLM support develops, using the assist_satellite.start_conversation action.
For now, it is possible to automatically start a listen and then have the reply match one of the custom sentences. This works reliably; a rough sketch is below.
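A minimal sketch of that approach, reusing the ESPHome "Start Listening" button from the earlier post. All entity IDs are placeholders, the delay is a crude stand-in for waiting out the TTS, and the yes/no sentences are just examples:

```yaml
# Two automations: the first announces and opens a listen window on the
# V:PE; the second matches the spoken reply as a custom sentence.
- alias: "Announce and listen"
  trigger:
    - platform: state
      entity_id: binary_sensor.camera_motion
      to: "on"
  action:
    - service: tts.speak
      target:
        entity_id: tts.piper
      data:
        media_player_entity_id: media_player.voice_pe
        message: "Someone is at the door. Should I turn on the porch light?"
    - delay: "00:00:04"   # rough wait for the TTS to finish playing
    - service: button.press
      target:
        entity_id: button.voice_pe_start_listening   # the ESPHome button above

- alias: "Handle the yes reply"
  trigger:
    - platform: conversation
      command:
        - "yes"
        - "yes please"
  action:
    - service: light.turn_on
      target:
        entity_id: light.porch   # placeholder next step
```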