Continue conversation automatically on Home Assistant Voice PE

I just got my Voice PE, and let me first say, it’s a great piece of hardware! Really liking it so far.

The feature I’d love to have is to automatically continue a conversation. For example, if the AI responded with “do you want me to turn on the lights?” I don’t want to have to say, “Hey Jarvis” and then “Yes”. It should immediately start listening again once it has finished streaming the response. As a nice-to-have, it would be cool if this could be configured so that it only happens when the response contains a question.

This could be implemented by simply adding a switch or toggle that makes the Voice PE start listening on demand. An automation could then turn “listen” back on for, say, 5 seconds after a response has been given. This would also make it possible to start listening after an announcement, for example.

This is exactly the problem I have. I haven’t found a solution yet.

Brilliant suggestion, all for it.
It would make things easier, as right now I also need an additional “hey jarvis” to reply.

I have a workaround under review:

  1. Use the HACS integration “Extended OpenAI Conversation” as your voice assistant.
  2. Under the integration “Extended OpenAI Conversation”, click on Configure and change the default prompt from

The current state of devices is provided in available devices.
Use execute_services function only for requested action, not for current states.
Do not execute service without user’s confirmation.
Do not restate or appreciate what user says, rather make a quick inquiry.

to

The current state of devices is provided in available devices.
Use execute_services function only for requested action, not for current states.
Do execute service without user’s confirmation.

That stops the assistant from asking you again whether you want to turn the light on. It’s not the best solution, but maybe a beginning :wink:

Thanks

i adopted the configuration and added the following to the voice PE configuration:

button:
  - platform: template
    name: Wakeword
    id: wakeword_trigger_external
    on_press:
        - voice_assistant.start:
            wake_word: !lambda return "okay_nabu"; # fake
  - platform: template
    name: Stop
    id: stop_trigger_external
    on_press:
        - voice_assistant.stop:

with these two buttons i can simulate a wake word or stop the assistant (i.e. stop it from listening).
with that, a simple automation does the trick (it listens for 6 seconds after the assistant has finished speaking):

alias: continuous conversation
description: "lets the voice PE listen for 6 seconds after it finished speaking"
triggers:
  - trigger: state
    entity_id:
      - assist_satellite.home_assistant_voice_093a54_assist_satellit
    from: responding
    to: idle
conditions: []
actions:
  - action: button.press
    metadata: {}
    data: {}
    target:
      entity_id: button.home_assistant_voice_093a54_wakeword
  - delay:
      hours: 0
      minutes: 0
      seconds: 6
  - if:
      - condition: state
        entity_id: assist_satellite.home_assistant_voice_093a54_assist_satellit
        state: listening
    then:
      - action: button.press
        metadata: {}
        data: {}
        target:
          entity_id: button.home_assistant_voice_1_stop
mode: single

I also did the same.

The only problem is that if the Voice PE is in a noisy room, or if music or the TV is on, Assist keeps listening and replying with nonsense in a continuous loop.

Do you have any way to prevent this from happening? Or how do you work around it?

Any suggestion would absolutely be appreciated.

This is brilliant, thank you!
I’ll test this during the day!

As a bit of a novice, where do I actually add this? I tried adding this as a template under Helpers, but it doesn’t give me the option to edit it as YAML, and I can’t see an entity called voice. How do I actually go about adding this to the Voice PE configuration?

Thanks in advance!

after adopting the configuration, you should end up with a configuration like this:

substitutions:
  name: "home-assistant-voice-1"
  friendly_name: Home Assistant Voice
packages:
  Nabu Casa.Home Assistant Voice PE: github://esphome/home-assistant-voice-pe/home-assistant-voice.yaml
[…]

… and you can then extend the configuration as mentioned above.
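
For anyone unsure where the extra block goes, here is a rough sketch of the adopted file with the two template buttons from earlier in the thread appended at the end. It assumes that a top-level button: section in your own file is merged with whatever the package already defines, which is how ESPHome packages normally treat lists. The […] comment stands for the rest of your adopted file, which stays untouched.

```yaml
substitutions:
  name: "home-assistant-voice-1"
  friendly_name: Home Assistant Voice
packages:
  Nabu Casa.Home Assistant Voice PE: github://esphome/home-assistant-voice-pe/home-assistant-voice.yaml
# […] keep everything else that the adoption generated unchanged

# appended: the two template buttons from earlier in the thread
button:
  - platform: template
    name: Wakeword
    id: wakeword_trigger_external
    on_press:
      - voice_assistant.start:
          wake_word: !lambda return "okay_nabu";  # placeholder value, no real wake word was heard
  - platform: template
    name: Stop
    id: stop_trigger_external
    on_press:
      - voice_assistant.stop:
```

After compiling and installing this, the two buttons should show up as entities on the Voice PE device in Home Assistant.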

i worked around this a little by adding a sensor that shows what i said (text_request). if i say „thanks“ or „stop“ or whatever, the automation that wakes the assistant won’t trigger.

code is borrowed here:

however, to extend the voice_assistant from github://esphome/home-assistant-voice-pe/home-assistant-voice.yaml with on_stt_end and on_tts_start, i had to fully redefine the voice_assistant: config section and append/replace the changes for on_stt_end: and on_tts_start: (i did not manage to use !extend).

voice_assistant:
  id: va
  microphone: asr_mic
  media_player: nabu_media_player
  micro_wake_word: mww
  use_wake_word: false
  noise_suppression_level: 0
  auto_gain: 0 dbfs
  volume_multiplier: 1
  on_client_connected:
    - lambda: id(init_in_progress) = false;
    - micro_wake_word.start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: control_leds
# […]
  on_timer_tick:
    - script.execute: control_leds
  on_stt_end:
    - text_sensor.template.publish:
        id: text_request
        state: !lambda return x;
  on_tts_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - script.execute: control_leds
    # Start a script that would potentially enable the stop word if the response is longer than a second
    - script.execute: activate_stop_word_if_tts_step_is_long
    - text_sensor.template.publish:
        id: text_response
        state: !lambda return x;
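
The two text_sensor.template.publish actions above need template text sensors with the ids text_request and text_response declared in the same file. A minimal sketch of such a declaration follows. The name values are assumptions, and the entity id Home Assistant ends up with depends on your device name.

```yaml
text_sensor:
  - platform: template
    id: text_request        # referenced by on_stt_end above
    name: "Text request"    # assumed name, pick your own
  - platform: template
    id: text_response       # referenced by on_tts_start above
    name: "Text response"   # assumed name, pick your own
```

Keep in mind that Home Assistant limits state values to 255 characters, so very long responses may not come through as a usable state.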

the automation to wake the assistant now has a condition, so it won’t listen after i say „thanks“ or something like that (in german):

condition: template
value_template: >-
  {%- set request = states("sensor.home_assistant_voice_1_text_request") | lower
  | regex_replace('[^\w\s]', '') -%}
  {{
  request != "danke" and
  request != "ja danke" and
  request != "schon gut" and
  request != "is gut" and
  request != "ist gut" and
  request != "ok" and
  request != "merci" and
  request != "jut"
  }}
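
The original request, continuing only when the assistant actually asked something, could probably be approximated the same way by looking at the response instead of the request. A rough sketch, assuming the text_response sensor is exposed as sensor.home_assistant_voice_1_text_response (a hypothetical entity id, adjust it to yours) and that a trailing question mark is a good enough signal:

```yaml
condition: template
value_template: >-
  {%- set response = states("sensor.home_assistant_voice_1_text_response") | trim -%}
  {{ response.endswith("?") }}
```

This can sit next to the request check in the automation’s conditions list, so the assistant only listens again when both conditions pass.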

Thanks diplix!

Apologies for seeming thick, but what exactly do you mean by “adopting the configuration”? As in installing the Voice PE device? I can see it as a device in ESPHome, but there is nothing to configure there. If I check the YAML files under /config/esphome/, there is no config file either. Should I add a new config file there with this info? Any help would be greatly appreciated.

if the device is discoverable and you see it in esphome, it should be adoptable:


adopting creates a configuration file in your esphome configuration directory and lets you (requires you to) compile the firmware (and also deactivates auto updates from precompiled images).

Perfect, thank you very much! I get it now.