[Custom Component] extended_openai_conversation: Let's control entities via ChatGPT

I have no idea what I'm doing. I just can't get this to work…

```yaml
- spec:
    name: play_apple_TV
    description: Use this function to play Apple TV.
  function:
    type: script
    sequence:
      - service: media_player.select_source
        target:
          entity_id: media_player.android_tv_192_168_86_42
        data:
          source: com.apple.atve.androidtv.appletv
```

It works for me perfectly.

```yaml
- spec:
    name: send_tts_message_to_kitchen_speaker
    description: Use this function to send a TTS message to the kitchen speaker.
    parameters:
      type: object
      properties:
        message:
          type: string
          description: Message you want to send
      required:
        - message
  function:
    type: script
    sequence:
      - service: tts.cloud_say
        data:
          entity_id: media_player.kitchen2
          message: "{{ message }}"
```
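For anyone wondering how the `message` parameter flows into the script: with standard OpenAI function calling, the model answers with a function-call payload whose arguments get rendered into the `{{ message }}` template. Roughly like this illustrative example (not a captured trace):

```json
{
  "name": "send_tts_message_to_kitchen_speaker",
  "arguments": "{\"message\": \"Dinner is ready in five minutes\"}"
}
```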

Thanks

Hello, I've been trying to find a way to redirect all the voice responses to a specific media player entity, media_player.living, but with no luck. The entity is exposed.
ChatGPT always responds on the device the input is coming from (which has bad speakers…). Is this an available option?

Also interested in this.

Is there a way to close the assist conversation with a command/event? Or if not that, then at least clear the conversation context for the GPT agent?

Nice idea! The question is, is it necessary to send all the entity states at all, or would it be more efficient to explain how to get the information that is required? In my house we make heavy use of KNX, i.e., even after I remove lots of entities, each request already consumes more than 5500 tokens.

In theory this is possible, but you would have to name your entities so that they can be read by GPT seamlessly.

By declaring the entities (along with their aliases) in the prompt, GPT can just ‘pick’ which one to read or target in a service call by ‘transcribing’ your request into a useful command. If you do not declare them, you lose this transcription ability: every call by GPT would be a shot in the dark, calling random entities. Some calls might succeed, but without an extremely careful naming scheme for your entities, most will interrogate a non-existent entity.

On top of that, your audio request needs to be spot on; any mispronunciation would kill the service call.

I would be very happy to be proven wrong though, as this would indeed be the ultimate way to save tokens. However, at best, I suspect you would lose a lot of flexibility in your requests.
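To make the trade-off concrete, here is the kind of rendered entity declaration the default prompt injects (the entities and aliases below are invented for illustration). The aliases column is what lets GPT map loose phrasing like “the big lamp” onto a concrete entity_id:

```csv
entity_id,name,state,aliases
light.living_room_floor,Living Room Floor Lamp,off,big lamp/floor lamp
climate.bedroom,Bedroom Thermostat,heat,bedroom heating
```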

Which model did you choose exactly?
I tried TheBloke/laser-dolphin-mixtral-2x7b-dpo-GGUF but I get some errors using it:

{"error":{"code":500,"message":"runtime error: invalid memory address or nil pointer dereference","type":""}}

Oh, interesting. I just started getting this error yesterday too, after I updated LocalAI. It's not just this model, however; it's all my models.

@stban1983 there is a bug. The dev is going to release a new LocalAI version today, so probably update tomorrow, since it takes about a day for the images to build.

OK great, thanks @Anto79-ops.
I tried with localai/localai:v2.8.0-ffmpeg-core (instead of 2.9.0).
No more errors, but no answers! :grin:

```bash
# note: "temperature" belongs at the top level of the request body,
# not inside a message object (where it would be silently ignored)
curl http://192.168.0.130:8081/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "dolphin-mixtral-8x7b",
    "messages": [{"role": "user", "content": "How are you doing?"}],
    "temperature": 0.1
}'
```

Can you elaborate a little on your LocalAI setup? I'm testing it with no GPU, only a plain CPU.

Sure! I use both CPU and GPU (it has dual 4060 cards) and 64 GB of RAM on an AMD Ryzen 7 7770 with Ubuntu.

It runs 14b models really fast, and 8x7b models well too.

@stban1983 there is a workaround for this issue: add

```yaml
    environment:
      - DEBUG=true
```

to your docker-compose.yaml file, like the example below, and it fixes the issue:

```yaml
  localai:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: ["gpu"]
    image: quay.io/go-skynet/local-ai:master-cublas-cuda12-ffmpeg
    environment:
      - DEBUG=true
    tty: true
    restart: always
    ports:
      - 8080:8080
    env_file:
      - .env
    volumes:
      - ./models:/models
      - ./images/:/tmp/generated/images/
    command: ["/usr/bin/local-ai"]
```

Thanks a lot! It's working.
Now I've got to integrate it with my Home Assistant!

Hi,

I also have the problem that GPT (both 1106 and 0125)

  1. invokes execute_services to get the state of entities, or even
  2. changes the state of entities although I am only asking a question about the state

I tried several different prompts (I am obviously not an expert), but maybe some background info may help: the component specifies the function execute_services, but it does not define which services are available, right? Is this already part of the model? Or does the AI just guess?

When asking questions about my smart home, the AI often tries to invoke a service ‘get_state’ (which I assume does not exist, since HA returns an error). When I ask, e.g., “Which room has the highest temperature?”, it sometimes works; sometimes GPT tries to execute “get_highest_temperature”. That is, it hallucinates service calls (which is mentioned on the OpenAI page, by the way). Could it help to specify which function calls are available? Any other ideas or results on what could help? At the moment the component is unfortunately not usable due to the problems mentioned above. Nevertheless, I would like to improve it, since it sometimes returns amazing results.

I'm not familiar with ChatGPT and I'm not a programmer. Can GPT learn? It picked the wrong entity for me; after I pointed it to the right one, it worked, but only as long as the chat stays open. After that, it forgets again.

What I would recommend is to try manipulating the prompt template (settings → voice assistants → select your assistant → “conversation agent” gear icon → prompt template) to explain to it exactly what you want. It will probably take trial and error to find the right prompt which improves your use case. Some things I would try are:

  • add a sentence like “only use the execute_services function when the user is commanding a change of state, not at any other time”
  • “do not change the state of entities unless explicitly asked by the user”
  • “only use the following functions: (list your functions). Do not attempt other functions”

etc. Basically, try to give the model more information to nudge it in the right direction (see the sketch below for how these lines might sit in the prompt template). Even if this helps, it will probably not help 100%, and different models will behave differently.
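For illustration, here is roughly how such guardrail lines could sit at the top of the prompt template, above the device list (the exact wording is just a starting point to iterate on, not tested phrasing):

```text
You are a smart home assistant for Home Assistant.
Only use the execute_services function when the user is commanding a change of state, not at any other time.
Do not change the state of entities unless explicitly asked by the user.
If the user asks about the state of a device, answer from the "Available Devices" list below instead of calling a function.
```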

The component specifies the function execute_services, but it does not define which services are available, right? Is this already part of the model? Or does the AI always guess it?

I believe the model “guesses” about the services, or to be more precise, it already knows a lot about Home Assistant services (try asking ChatGPT about Home Assistant services), so the only new information you are giving it is access to this particular function “execute_services”. But it is common to automatically provide the model with information about entities so that it knows about your particular Home Assistant setup. You could do something similar by actually enumerating the services that are available, although I haven’t seen anyone do that yet! For entities, that’s the part of the prompt that looks like this:

Available Devices:
```csv
entity_id,name,state,aliases
{% for entity in exposed_entities -%}
{{ entity.entity_id }},{{ entity.name }},{{ entity.state }},{{entity.aliases | join('/')}}
{% endfor -%}
```

So you can see how one could extend that template to cover services as well.
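As far as I know, Home Assistant templates don't expose a ready-made services list the way exposed_entities works for entities, so the simplest sketch is a hand-maintained block appended to the prompt, mirroring the "Available Devices" section (the rows below are examples to adapt, not an authoritative list):

Available Services:
```csv
domain,service,description
light,turn_on,Turn a light on (brightness via service_data.brightness_pct)
light,turn_off,Turn a light off
media_player,select_source,Change the app/source on a media player
tts,cloud_say,Send a TTS message to a speaker
```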

No, it will not “learn” permanently. But you can add something like “use service (long German name) for commands to do with the motion sensor” to the prompt (see my response above for how to find and edit it, or the other documentation for extended_openai_conversation), and that should solve your problem permanently. It will “learn” temporarily, because it keeps your ongoing conversation in its context window if it fits, but once you close the conversation, that is gone forever. Things you add to the prompt, however, will always be present (though they will also always count against your token use).

Thank you. I will try it

EDIT:
Thank you very much. It worked.

But if I execute the same command again in the same chat, GPT tries to do something different: it accesses the sensor again. I think because the commands come in quick succession, it assumes the first attempt didn't work and tries something else. Can I get it to do nothing else in that case?

Dimming the lights doesn't work reliably. Sometimes GPT says that it adjusted the value, but nothing happened. And if I want to dim all the lights in a room, it also turns on the light that is not dimmable. Has anyone had the same experience and maybe found a solution for this?