I just got my Voice PE, and let me first say, it’s a great piece of hardware! Really liking it so far.
The feature I’d love to have is to automatically continue a conversation. For example, if the AI responded with “do you want me to turn on the lights?” I don’t want to have to say “Hey Jarvis” and then “Yes”. It should immediately start listening again after it has finished streaming the response. As a nice-to-have, it would be cool if this could be configured to happen only when the response contains a question, and not otherwise.
This could be implemented by simply adding a switch or toggle to have the Voice PE start listening manually. Then an automation could easily be set up to say, for 5 seconds after a response has been given, turn on “listen” again. This would also give the ability to start listening after an announcement for example.
Use the HACS integration “Extended OpenAI Conversation” as your voice assistant.
Under the “Extended OpenAI Conversation” integration, click Configure and change the default prompt from
The current state of devices is provided in available devices.
Use execute_services function only for requested action, not for current states. Do not execute service without user’s confirmation. Do not restate or appreciate what user says, rather make a quick inquiry.
to
The current state of devices is provided in available devices.
Use execute_services function only for requested action, not for current states. Do execute service without user’s confirmation.
That stops the assistant from asking you again whether you want to turn the light on. It’s not the best solution, but maybe a start.
With these two buttons I can simulate a wake word or stop the assistant (i.e. stop it from listening).
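The button definitions themselves are not shown in this post. As a rough sketch only, two template buttons in the ESPHome config could look like the following. The names `Wakeword` and `Stop` are assumptions chosen to match the entity IDs used in the automation, and this sketch ignores the LED and phase handling that the stock Voice PE firmware does around these actions.

```yaml
# Hypothetical sketch, not the poster's actual config.
button:
  - platform: template
    name: "Wakeword"          # assumed name; yields an entity like button.<device>_wakeword
    on_press:
      - voice_assistant.start:   # start listening as if the wake word had been heard
  - platform: template
    name: "Stop"              # assumed name; yields an entity like button.<device>_stop
    on_press:
      - voice_assistant.stop:    # abort the current listening session
```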
With that, a simple automation does the trick (listening for 6 seconds after the assistant has finished speaking):
alias: continuous conversation
description: "lets the Voice PE listen for 6 seconds after it finished speaking"
triggers:
  - trigger: state
    entity_id:
      - assist_satellite.home_assistant_voice_093a54_assist_satellit
    from: responding
    to: idle
conditions: []
actions:
  - action: button.press
    metadata: {}
    data: {}
    target:
      entity_id: button.home_assistant_voice_093a54_wakeword
  - delay:
      hours: 0
      minutes: 0
      seconds: 6
  - if:
      - condition: state
        entity_id: assist_satellite.home_assistant_voice_093a54_assist_satellit
        state: listening
    then:
      - action: button.press
        metadata: {}
        data: {}
        target:
          entity_id: button.home_assistant_voice_1_stop
mode: single
The only problem is that if the Voice PE is in a noisy room, or if music or the TV is on, Assist keeps listening and replying with nonsense in a continuous loop.
Is there any way to prevent this from happening? Or how do you work around it?
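One possible mitigation, sketched here under assumptions, is to skip the automatic re-listen whenever a known noise source is active. The entity ID `media_player.living_room_tv` is hypothetical; substitute whatever player runs near your Voice PE. This block would replace the empty `conditions: []` in the automation above.

```yaml
# Sketch: only re-listen when the (hypothetical) TV is not playing.
conditions:
  - condition: not
    conditions:
      - condition: state
        entity_id: media_player.living_room_tv   # assumed entity, adjust to your setup
        state: playing
```

This does not help with noise that Home Assistant cannot see (conversation in the room, a radio without an integration), so it only reduces the loop rather than ruling it out.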
As a bit of a novice, where do I actually add this? I tried adding it as a template under Helpers, but that doesn’t give me the option to edit it in YAML, and I can’t see an entity called voice. How do I actually go about adding this to the Voice PE configuration?
I worked around this a little by adding a sensor that shows what I said (text_request). If I say “thanks” or “stop” or similar, the automation that wakes Assist won’t trigger.
code is borrowed here:
However, to extend the voice_assistant from github://esphome/home-assistant-voice-pe/home-assistant-voice.yaml with on_stt_end and on_tts_start, I had to fully redefine the voice_assistant: config section and append/replace the changes for on_stt_end: and on_tts_start: (I did not manage to use !extend).
voice_assistant:
  id: va
  microphone: asr_mic
  media_player: nabu_media_player
  micro_wake_word: mww
  use_wake_word: false
  noise_suppression_level: 0
  auto_gain: 0 dbfs
  volume_multiplier: 1
  on_client_connected:
    - lambda: id(init_in_progress) = false;
    - micro_wake_word.start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: control_leds
  # […]
  on_timer_tick:
    - script.execute: control_leds
  on_stt_end:
    - text_sensor.template.publish:
        id: text_request
        state: !lambda return x;
  on_tts_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - script.execute: control_leds
    # Start a script that would potentially enable the stop word if the response is longer than a second
    - script.execute: activate_stop_word_if_tts_step_is_long
    - text_sensor.template.publish:
        id: text_response
        state: !lambda return x;
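The `text_sensor.template.publish` actions above need matching template text sensors with the IDs `text_request` and `text_response`. Their definitions are not included in this post, so the following is a minimal sketch; the `name` values are assumptions, picked so that the resulting entity IDs end in `_text_request` and `_text_response`.

```yaml
# Sketch of the sensors the publish actions refer to (names are assumed).
text_sensor:
  - platform: template
    id: text_request
    name: "Text request"     # what the speech-to-text step understood
  - platform: template
    id: text_response
    name: "Text response"    # what the assistant is about to say
```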
The automation that wakes the assistant now has a condition, so it won’t listen after I say “thanks” or something like that (in German):
condition: template
value_template: >-
  {%- set request = states("sensor.home_assistant_voice_1_text_request") | lower
  | regex_replace('[^\w\s]', '') -%}
  {{
    request != "danke" and
    request != "ja danke" and
    request != "schon gut" and
    request != "is gut" and
    request != "ist gut" and
    request != "ok" and
    request != "merci" and
    request != "jut"
  }}
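The same idea could cover the original wish to keep listening only when the assistant asked something. This is a sketch that assumes the `text_response` sensor published in `on_tts_start` shows up as `sensor.home_assistant_voice_1_text_response` (by analogy with the request sensor; the actual entity ID may differ) and simply checks whether the response ends in a question mark.

```yaml
# Sketch: continue the conversation only if the last response ended with "?".
condition: template
value_template: >-
  {%- set response = states("sensor.home_assistant_voice_1_text_response") | trim -%}
  {{ response[-1:] == "?" }}
```

Home Assistant states are limited to 255 characters, so a very long response may not come through intact and the check could miss its closing question.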
Apologies for seeming thick, but what exactly do you mean by “adopting the configuration”? As in installing the Voice PE device? I can see it as a device in ESPHome, but there is nothing to configure there. If I check the YAML files under /config/esphome/, there is no config file either. Should I add a new config file there with this info? Any help would be greatly appreciated.
If the device is discoverable and you see it in ESPHome, it should be adoptable:
Adopting creates a configuration file in your ESPHome configuration directory and lets you (in fact requires you to) compile the firmware yourself. It also deactivates automatic updates from precompiled images.
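For orientation, an adopted device config usually looks roughly like the sketch below: a small file that pulls in the upstream package, below which your own overrides (such as the redefined `voice_assistant:` section) are added. The device name, the package label, and the secrets are assumptions; the file generated for your device will contain its own values, including an API encryption key that is left out here.

```yaml
# Sketch of an adopted config in /config/esphome/ (values are placeholders).
substitutions:
  name: home-assistant-voice-093a54          # assumed device name
  friendly_name: Home Assistant Voice 093a54 # assumed friendly name

packages:
  # Upstream firmware definition; the package label is an assumption.
  voice_pe: github://esphome/home-assistant-voice-pe/home-assistant-voice.yaml

esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

# Custom overrides go below, e.g. the text_sensor and voice_assistant sections.
```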