Use Wyoming Piper TTS

I have Ollama with the llama 3.2 and llava models up and running. LLM Vision, Wyoming Piper, and Whisper are also working in HA (offloaded to an external server). Everything works fine: TTS/STT works, and vision can tell me what it sees in a video stream. I am using a Wyoming satellite for TTS and STT. I want to ask "What is happening around the house" and have the response come from the Wyoming satellite, but for the love of all I cannot get it to work. This is my latest script; the main issue seems to be that the satellite is not a media player.

alias: surroundings check
description: ""
triggers:
  - command:
      - What is happening around the house
      - " What's happening around the house"
      - check surroundings
    trigger: conversation
conditions: []
actions:
  - data:
      duration: 10
      max_frames: 3
      include_filename: true
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      generate_title: false
      expose_images: false
      provider: 01JJ2KWA02PY7PK4GSDFGHGFDXHED
      remember: true
      image_entity:
        - camera.192_168_188_22
      model: llava:7b
      message: Describe what you see in one sentence.
    response_variable: response
    action: llmvision.stream_analyzer
  - data:
      message: "{{ response }}"
      entity_id: tts.piper_2
    action: tts.speak
  - data:
      level: info
      message: "Response from stream analyzer: {{ response }}"
    action: system_log.write
mode: single
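
For reference, tts.speak needs the TTS entity as its target plus a media_player entity to play on, and the Wyoming satellite doesn't expose one. A minimal sketch of the shape the call expects (media_player.living_room_speaker is a hypothetical entity, not part of my setup):

  - action: tts.speak
    target:
      entity_id: tts.piper_2
    data:
      # hypothetical media player; the satellite has no such entity
      media_player_entity_id: media_player.living_room_speaker
      message: "{{ response.response_text }}"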

Try this instead of calling the TTS service:

  - set_conversation_response: "{{ response.response_text }}"
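
In context, with the same response_variable as the script above, the end of the actions list becomes something like this (a minimal sketch reusing your camera, provider, and model):

actions:
  - action: llmvision.stream_analyzer
    data:
      image_entity:
        - camera.192_168_188_22
      provider: 01JJ2KWA02PY7PK4GSDFGHGFDXHED
      model: llava:7b
      message: Describe what you see in one sentence.
    response_variable: response
  # replaces the tts.speak step entirely
  - set_conversation_response: "{{ response.response_text }}"

set_conversation_response hands the text back to the conversation pipeline, so the answer is spoken on whichever satellite issued the command; no media player is needed.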

All I am getting is “Done” either through TTS or chat.

alias: surroundings check
description: ""
trigger:
  - platform: conversation
    command:
      - What is happening around the house
      - " What's happening around the house"
      - check surroundings
condition: []
action:
  - service: llmvision.stream_analyzer
    data:
      duration: 10
      max_frames: 3
      include_filename: true
      target_width: 1280
      max_tokens: 100
      temperature: 0.2
      generate_title: false
      expose_images: false
      provider: 01JJ2KWA02PY7PK4B6QR2QXHED
      remember: true
      image_entity:
        - camera.192_168_188_22
      model: llava:7b
      message: Describe what you see in one sentence.
    response_variable: response
  - service: conversation.set_conversation_response
    data:
      text: "{{ response.response_text }}"
  - service: system_log.write
    data:
      level: info
      message: "Response from stream analyzer: {{ response.response_text }}"
mode: single

You copied the code incorrectly. set_conversation_response is a standalone script action, not a service, so there is no conversation.set_conversation_response to call.
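
The difference, side by side:

  # Wrong: there is no conversation.set_conversation_response service
  - service: conversation.set_conversation_response
    data:
      text: "{{ response.response_text }}"

  # Right: set_conversation_response is its own script action
  - set_conversation_response: "{{ response.response_text }}"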

This works for me:

actions:
  - action: llmvision.stream_analyzer
    data:
      image_entity:
        - camera.192_168_1_241
      duration: 2
      max_frames: 2
      target_width: 1280
      max_tokens: 180
      temperature: 0.2
      generate_title: false
      expose_images: false
      remember: false
      model: llava-llama3:8b-v1.1-q4_0
      provider: 01JHDY8JTRD8434QPVWMWF4AJQ
      include_filename: false
      message: What is the situation in the room? describe it briefly
    response_variable: llm
  - set_conversation_response: "{{ llm.response_text }}"
mode: single

It works, thanks. I guess I got stuck in a loop.