[Custom Component] extended_openai_conversation: Let's control entities via ChatGPT

bartonbrownings · May 29, 2024, 6:15pm

Is the intended behavior to create a snapshot only for the camera it infers from the prompt? I am only seeing one picture in the tmp folder.

valentinfrlch · May 29, 2024, 7:15pm

Well that depends. The input the snapshot service takes is a list of camera entity ids.
Which cameras are considered depends on the phrasing of your question and how GPT interprets it. What happens when you ask: “What’s going on around the house?” (This should prompt GPT to look at all cameras.)

And just to be sure:

What modifications have you made to the spec?
Have you changed anything in the repeat sequence?
Does that part still look like this:

      sequence:
        - service: camera.snapshot
          metadata: {}
          data:
            filename: /config/www/tmp/{{repeat.item}}.jpg
          target:
            entity_id: "{{repeat.item}}"
      for_each: "{{ entity_ids }}"

You mentioned that some of your cameras don’t support the snapshot service. Do only images from cameras that support snapshots show up in the /tmp folder?

The actual error message you’re seeing is from gpt4vision and occurs when the file is not found. This could also be due to an empty line in the service call configuration. Does the image_file: still look like this:

image_file: |-
  {%for camera in entity_ids%}/config/www/tmp/{{camera}}.jpg
  {%endfor%}
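
For reference, a sketch of what that template should render to, assuming GPT passes two hypothetical entity ids (`camera.nursery` and `camera.basement` are placeholders, not taken from your setup). Each path should sit on its own line, with no blank lines in between:

```yaml
# Hypothetical input: entity_ids = ["camera.nursery", "camera.basement"]
# Expected shape of the rendered value: one file path per line.
image_file: |-
  /config/www/tmp/camera.nursery.jpg
  /config/www/tmp/camera.basement.jpg
```

If a line comes out blank or a path points to a file that was never written, gpt4vision is likely to report the file as not found.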

bartonbrownings · May 29, 2024, 7:42pm

- spec:
    name: describe_camera_feed
    description: Get a description what's happening on security cameras around the house
    parameters:
      type: object
      properties:
        message:
          type: string
          description: The prompt for the image analyzer
        entity_ids:
          type: array
          description: List of camera entities
          items:
            type: string
            description: Entity id of the camera
      required:
      - message
      - entity_ids
  function:
    type: script
    sequence:
    - repeat:
        sequence:
          - service: camera.snapshot
            metadata: {}
            data:
              filename: /config/www/tmp/{{repeat.item}}.jpg
            target:
              entity_id: "{{repeat.item}}"
        for_each: "{{ entity_ids }}"
    - service: gpt4vision.image_analyzer
      metadata: {}
      data:
        provider: OpenAI
        model: gpt-4o
        max_tokens: 100
        target_width: 1000
        temperature: 0.3
        image_file: |-
          {%for camera in entity_ids%}/config/www/tmp/{{camera}}.jpg
          {%endfor%}
        message: "{{message}}"
      response_variable: _function_result



This is my current format. The only image that gets snapped is from my nursery camera, even if I ask for the basement camera. I already take snapshots from both cameras every minute, so I know both are capable.

valentinfrlch · May 29, 2024, 8:38pm

I assume you have something similar to this in your prompt template in your Extended OpenAI Conversation configuration (it’s included by default):

Available Devices:
```csv
entity_id,name,state,aliases
{% for entity in exposed_entities -%}
{{ entity.entity_id }},{{ entity.name }},{{ entity.state }},{{entity.aliases | join('/')}}
{% endfor -%}
```

This template lists all entities that you have exposed to Assist.

Have you exposed all your camera entities to Assist?
You need to do this, otherwise GPT has no way of knowing what entity_id to use when calling the spec.

solo24 · May 30, 2024, 3:32am

Can someone point me in the right direction on how to add this spec (the describe_camera_feed spec from bartonbrownings) to my installation? I thought this was a script, so I tried adding it to the scripts.yaml file, but it never shows up as a service. I was able to get the basic script below to show up as a service. I do have ha-gpt4vision installed and running.

test_camera_snapshot: # This is a separate script
  alias: Yay Test Camera Snapshot
  sequence:
    - service: camera.snapshot
      data:
        entity_id: camera.living_room_cam
        filename: /config/www/tmp/living_room_cam_test.jpg

valentinfrlch · May 30, 2024, 5:27am

Even though it looks similar, it’s not a script. It’s a spec for Extended OpenAI Conversation. Check the original post:

The configuration instructions are also explained in detail in this wiki:

DaOsmo76 · June 5, 2024, 11:27am

Just want to add that I’m using both paid OpenAI and this Extended OpenAI Conversation integration with groq.com. I’m using almost the same instructions in my prompt as with OpenAI.

Can use models:
LLaMA3 8b
LLaMA3 70b
Mixtral 8x7b
Gemma 7b

sayanova · June 6, 2024, 1:54am

Thanks, I have tested Groq. It is mind-blowingly fast on text alone, but the bottleneck is in STT and TTS. Still, it’s a good chance to test how local LLM models perform.

hitnrun30 · June 6, 2024, 5:33pm

Wow, this is a home run

Herian · June 10, 2024, 7:45pm

I’d like to use TTS to listen to my Assist responses on a media player. Can I retrieve the response somehow? I’m trying to do it with the help of ChatGPT, but it’s not working lol

valentinfrlch · June 11, 2024, 7:29pm

Assuming your setup in your automation/script looks something like this:

service: conversation.process
data:
  agent_id: agent_id
  text: "{{trigger.event.data.text}}"
response_variable: response

You can access the response like this:

"{{ response.response.speech.plain.speech }}"
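
To then play that text on a media player, one option is to pass it to the `tts.speak` service. A minimal sketch, assuming a TTS entity and a media player exist in your setup (`tts.home_assistant_cloud` and `media_player.living_room` are placeholder names, and `agent_id` still needs to be replaced with your own agent):

```yaml
# Sketch of an action sequence for an automation or script.
# All entity ids below are placeholders, not taken from this thread.
- service: conversation.process
  data:
    agent_id: agent_id
    text: "{{ trigger.event.data.text }}"
  response_variable: response
- service: tts.speak
  target:
    entity_id: tts.home_assistant_cloud   # your TTS entity
  data:
    media_player_entity_id: media_player.living_room   # where to play it
    message: "{{ response.response.speech.plain.speech }}"
```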

meni123 · June 13, 2024, 7:54am

@valentinfrlch I have a Frigate-compatible integration installed. It provides an entity that captures a picture every time something is detected passing near the camera, and the picture can be viewed through this entity: image.xxxxxx_person
How do I get it to check the image from this entity?

valentinfrlch · June 13, 2024, 3:32pm

This is not currently possible. However, I think this would be a great addition, as many people use Frigate (myself included). Can you open a feature request on GitHub? I’ll try to add this in the next release.

hitnrun30 · June 13, 2024, 4:38pm

I need help. The program I am using saves the files to the /media/wyze folder, not the media folder inside the Home Assistant config folder. When I put /media/wyze/ before the camera name, I get this error:

Something went wrong: Cannot write /media/wyze/camera.doorbell.jpg, no access to path; allowlist_external_dirs may need to be adjusted in configuration.yaml

What am I doing wrong? Note that I added that folder to the allowlist and it still didn’t work.

valentinfrlch · June 14, 2024, 7:45am

I will look into this. Just know that the error does not come from gpt4vision but from Home Assistant. If you can’t get it to work with the allowlist, maybe you can change where the file gets saved.

hitnrun30 · June 14, 2024, 2:25pm

I am working with the Wyze add-on developer and asked him the same thing; he can’t save the files into a folder within the Home Assistant folder.

So currently the files are in the /media/wyze folder, which is the media folder at the same level as homeassistant. I also added this to the configuration:

allowlist_external_dirs:
- /media/wyze

and now my error is
Something went wrong: invalid_image_path

valentinfrlch · June 14, 2024, 4:07pm

This is an error thrown by gpt4vision. Usually this means there is no file at the path provided. Did you provide the path to the file with the .jpg suffix?
In your case it should look something like this:

/media/wyze/your_image_name.jpg
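
It may also be worth double-checking where the allowlist entry sits: `allowlist_external_dirs` is an option of the `homeassistant:` block in configuration.yaml rather than a top-level key. A minimal sketch, assuming /media/wyze is visible at that path from inside Home Assistant:

```yaml
# configuration.yaml (sketch; merge into an existing homeassistant: block if you have one)
homeassistant:
  allowlist_external_dirs:
    - /media/wyze
```

A restart may be needed before the change takes effect.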

hitnrun30 · June 14, 2024, 5:15pm

I used the same spec as above but with my folder.

- spec:
    name: describe_camera_feed
    description: Get a description what's happening on security cameras around the house
    parameters:
      type: object
      properties:
        message:
          type: string
          description: The prompt for the image analyzer
        entity_ids:
          type: array
          description: List of camera entities
          items:
            type: string
            description: Entity id of the camera
      required:
      - message
      - entity_ids
  function:
    type: script
    sequence:
    - repeat:
        sequence:
          - service: camera.snapshot
            metadata: {}
            data:
              filename: /media/wyze/{{repeat.item}}.jpg
            target:
              entity_id: "{{repeat.item}}"
        for_each: "{{ entity_ids }}"
    - service: gpt4vision.image_analyzer
      metadata: {}
      data:
        provider: OpenAI
        model: gpt-4o
        max_tokens: 100
        target_width: 1000
        temperature: 0.3
        image_file: |-
          {%for camera in entity_ids%}/media/wyze/{{camera}}.jpg
          {%endfor%}
        message: "{{message}}"
      response_variable: _function_result

I would like to exclude some of the cameras.

When I test this template:

{%for camera in states.camera%}
  /media/wyze/camera.{{camera.name}}.jpg
{%endfor%}

I get these values.

/media/wyze/camera.Ender 3.jpg
/media/wyze/camera.doorbell.jpg
/media/wyze/camera.garage-back-door.jpg
/media/wyze/camera.jeffrey-cam.jpg
/media/wyze/camera.temp.jpg
/media/wyze/camera.backyard-cam.jpg
/media/wyze/camera.Cura Print Thumbnail.jpg
/media/wyze/camera.OctoPrint Camera.jpg
/media/wyze/camera.driveway-cam.jpg

I only want it to look for the highlighted ones

So how do I change this piece of the spec

{%for camera in entity_ids%}/media/wyze/{{camera}}.jpg
{%endfor%}

to something like this template?

{%for camera in states.camera | rejectattr('name', 'contains', ' ')%}/media/wyze/{{camera.entity_id | regex_replace(find='_',replace='-')}}.jpg
{%endfor%}
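
One possible way to combine the two is to keep looping over `entity_ids` and apply the replacement to each id with Jinja's built-in `replace` filter. This is only a sketch: it assumes the files in /media/wyze are named after the entity id with underscores turned into hyphens (for example camera.garage-back-door.jpg), which has not been confirmed in this thread:

```yaml
# Sketch only: file naming in /media/wyze is an assumption.
image_file: |-
  {%for camera in entity_ids%}/media/wyze/{{ camera | replace('_', '-') }}.jpg
  {%endfor%}
```

Which cameras end up in `entity_ids` is decided by GPT from the entities exposed to Assist, so leaving unwanted cameras unexposed should keep them out of the list.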

CircuitSetup · June 17, 2024, 2:17pm

Is there a way to use this with Willow on the ESP32-S3-BOX3, or does the BOX3 have to run ESPHome?

valentinfrlch · June 17, 2024, 2:44pm

Can you open a bug report here?

If this is an issue with gpt4vision, I’ll try to fix it. Maybe even integrate Wyze directly.
In the meantime, if you are using image entities, the latest release of gpt4vision (v0.3.8) now supports them too.