OpenAI ChatGPT assistant + TTS (of your choice) + support context

Hello, I think this can be useful for someone.

In my case I wanted to trigger a service that would call the OpenAI ChatGPT API while using a context.

You need an OpenAI account and an API key (paid).

This can be useful, for example, with a prompt like the one below: it could play a TTS announcement advising you to turn on the heater… well, you know, the possibilities are endless.


What do you recommend I do? Act like a nice AI assistant advisor and reply with short sentences!

This is one possible response I received; it was forwarded to the TTS service and then played on my local media player.

It is recommended that you turn on the heater to increase the room temperature. Additionally, you may want to consider wearing warmer clothing or using a blanket to stay warm.
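A minimal sketch of how a call like that could be wired into an automation. The `chatgpt.ask` service name and the temperature sensor entity are assumptions for illustration, not the component's actual names; check the repository for the real service and fields:

```yaml
# Hypothetical automation: ask the assistant for advice when the room gets cold.
# chatgpt.ask and sensor.living_room_temperature are assumed names.
automation:
  - alias: "Cold room advice"
    trigger:
      - platform: numeric_state
        entity_id: sensor.living_room_temperature
        below: 18
    action:
      - service: chatgpt.ask
        data:
          message: >-
            The living room is at {{ states('sensor.living_room_temperature') }} °C.
            What do you recommend I do? Act like a nice AI assistant advisor
            and reply with short sentences!
```

The numeric_state trigger fires once when the temperature crosses below the threshold, so the announcement won't repeat every minute while the room stays cold.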

Check the GitHub repository.


Oh this is gonna be fun!

I’ve always wanted TTS prompts to be less repetitive. They get ignored. Now… I can add sarcasm. Or have them crafted in iambic pentameter in the style of a Shakespearean tragedy.

Prompt: “Sarcastically explain to my housemates why leaving the garage door open is a terrible idea in the winter.”

Thanks for putting this together. Works easily.

I consider myself a beginner-to-intermediate user… could I have an example of how I could use this in an automation? I cannot figure out how to pass the prompt to it.

Miguel, thank you!
I’ve been using your integration with the gpt-3.5-turbo-instruct model for funny and creative TTS announcements.

In the context prompt I include all types of information about relevant sensor states, and in the message I ask the AI to write the text to announce in the house when certain triggers occur.
It’s been perfect and I even created an input_select to choose the personalities and moods for the messages.
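As a sketch of how an input_select could feed into the prompt via a template (the `chatgpt.ask` service name and the entity names are assumptions; substitute the component's real service):

```yaml
# Hypothetical script: the input_select state is interpolated into the prompt
# so the same trigger produces announcements in different personalities.
input_select:
  announcer_mood:
    options:
      - cheerful
      - sarcastic
      - dramatic

script:
  announce_laundry_done:
    sequence:
      - service: chatgpt.ask  # assumed service name
        data:
          message: >-
            Write a short house announcement in a
            {{ states('input_select.announcer_mood') }} tone:
            the washing machine has finished its cycle.
```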

Working from there, I tried to adapt the code to allow input from images. It would be great to announce or describe the person who just rang the doorbell.

I ended up finding that there are slight differences in the API for GPT-4 Turbo with vision, mainly in the HTTP request and in the response format.

So, I thought it would be preferable to create a new custom_component with an image_analyzer service that loads a JPEG photograph from Home Assistant, sends it to OpenAI, and passes the response to the TTS service.
I took advantage of the text file feature to save the response for possible use by other scripts. So the response from the AI image analysis is both spoken and saved to the text file.

Here are the files and details. Just set it up and then call the service like this:

service: gpt4vision.image_analyzer
data:
  message: >-
    This photo was taken from a home surveillance camera.
    Write a short description of the person in the photo.
  max_tokens: 300
  entity_id: media_player.google_nest_kitchen
  image_file: '/config/www/images/doorbell_snapshot.jpg'
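For context, a doorbell automation could take the snapshot first and then call the service with it. This is only a sketch: `binary_sensor.doorbell` and `camera.doorbell` are assumed entity names, while `camera.snapshot` is the standard Home Assistant snapshot service:

```yaml
# Hypothetical doorbell automation: save a snapshot, then describe the visitor.
automation:
  - alias: "Announce doorbell visitor"
    trigger:
      - platform: state
        entity_id: binary_sensor.doorbell  # assumed entity
        to: "on"
    action:
      - service: camera.snapshot
        data:
          entity_id: camera.doorbell  # assumed entity
          filename: /config/www/images/doorbell_snapshot.jpg
      - service: gpt4vision.image_analyzer
        data:
          message: >-
            This photo was taken from a home surveillance camera.
            Write a short description of the person in the photo.
          max_tokens: 300
          entity_id: media_player.google_nest_kitchen
          image_file: /config/www/images/doorbell_snapshot.jpg
```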

Oh awesome!

I haven’t had time to keep up with recent models, but I’ve already updated my component to use the recent turbo model; I just haven’t pushed it yet.

Will try to do it someday :grin:

You need to read the doc file on that GitHub page.

For example, you can set up a new time-pattern trigger and then call the ask service with your message as instructions; from there it’s a matter of creativity. If you follow my instructions, you can specify which media player it will play on.
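A sketch of what that could look like (the `chatgpt.ask` service name, the sensor, and the `entity_id` field for choosing the media player are assumptions; use the exact names from the doc file on GitHub):

```yaml
# Hypothetical time-pattern automation: a generated morning announcement
# every day at 08:00, played on a chosen media player.
automation:
  - alias: "Morning AI announcement"
    trigger:
      - platform: time_pattern
        hours: "8"
        minutes: "0"
    action:
      - service: chatgpt.ask  # assumed service name
        data:
          message: >-
            It is {{ states('sensor.outdoor_temperature') }} °C outside.
            Write a cheerful one-sentence good-morning announcement.
          entity_id: media_player.google_nest_kitchen  # assumed target field
```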