Gemini Pro API

Has anyone had success in calling the new Google AI Studio (previously MakerSuite) Gemini API?

I would like to use the gemini-pro LLM. At present I can only get the integration to use the PaLM Bison model.

When configuring the Google Generative AI Conversation integration, I am unable to use the Gemini model

models/gemini-pro

but can still use the bison model

models/chat-bison-001

The following is the error message:

Sorry, I had a problem talking to Google Generative AI: 404
models/gemini-pro is not found for API version v1beta2, or is not supported for generateMessage. Call ListModels to see the list of available models and their supported methods.

Do I need to generate a new API key?
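
In case it helps with debugging: the error suggests calling ListModels, which you can do from inside Home Assistant with a RESTful sensor pointed at the public models endpoint. This is only a sketch; it assumes the v1beta REST endpoint applies to your key, and YOUR_API_KEY is a placeholder:

# configuration.yaml sketch: list the models your API key can access.
# Assumptions: the v1beta REST endpoint matches your key, and
# YOUR_API_KEY is a placeholder. A sensor state is capped at 255
# characters, so a long model list may be truncated.
sensor:
  - platform: rest
    name: google_generative_ai_models
    resource: "https://generativelanguage.googleapis.com/v1beta/models?key=YOUR_API_KEY"
    value_template: >-
      {{ value_json.models | map(attribute='name') | join(', ') }}
    scan_interval: 86400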


I'm the author of the Google Generative AI Conversation integration. It requires a few small changes to switch to Gemini Pro. I also plan on adding support for Gemini Pro Vision in the next core release.


Wonderful! Thank you for the fantastic integration. I look forward to the upgraded support.

Is it possible to make a wish for function calling, like it is used in Extended OpenAI Conversation? So it would be possible to control entities?


Google Generative AI: Add support for Gemini Pro by tronikos · Pull Request #105787 · home-assistant/core · GitHub fixes this.

Google Generative AI: Add a service for prompts consisting of text and images using Gemini Pro Vision by tronikos · Pull Request #105789 · home-assistant/core · GitHub adds support for Gemini Pro Vision.

@bloody2k I don't have such plans.


I turned it into an "anything-sensor" yesterday, as a proof of concept.

Looks like the update will be available in 2024.1! I found it in the beta release notes.

I also stumbled across this inspiration for the future of LLMs controlling Home Assistant!

So far no luck with gemini-pro in 2024.1.0. I tried removing and re-configuring the integration. Do I need a new API key? Unsure what I'm doing wrong.

The PRs haven't been merged yet. Hopefully they will make it in the next monthly release.


@tronikos Can we pull your repo from git via HACS to get your fixes early?

Also, I'm guessing @User87 wants function calling like the OpenAI conversation agent, to control entities and send commands. It would be amazing not to have to pay for OpenAI credits.


You can, but it's much easier to wait. It will be included in the 2024.2 release.


You mean 2024.2 of course 🙂

One question though: Gemini Pro is not directly available in my country yet. Would it be possible to use the Vertex AI API with Gemini Pro as the model instead, as described here: https://ai.google.dev/examples?hl=en&keywords=vertexai

Yes I meant 2024.2. Fixed.

I don't think it will work with the Vertex AI API without significant changes that I don't plan on making.

It's a pity, because the direct version does not work almost anywhere in Europe, while the Vertex AI version would.

Is it working now in Europe? According to "Google Bard update: Image generation and Gemini Pro adds more languages", it's supposed to have been globally available since February 1.

I have the service working perfectly in Developer Tools / Services. I can't figure out how to store the response in something I can use. Any chance you could provide an example?

Here is a script that takes a text prompt, a camera to take a snapshot from, and a media player to play back the response:

alias: Gemini Pro Vision
fields:
  prompt:
    selector:
      text:
        multiline: true
    name: prompt
  media_player:
    selector:
      entity:
        filter:
          - domain: media_player
    name: media player
  camera:
    selector:
      entity:
        filter:
          - domain: camera
    name: camera
sequence:
  # Save a snapshot from the selected camera
  - service: camera.snapshot
    data:
      entity_id: "{{ camera }}"
      filename: /media/snapshot.jpg
  # Send the prompt and the snapshot to Gemini Pro Vision;
  # the response is stored in `content`
  - service: google_generative_ai_conversation.generate_content
    data:
      prompt: "{{ prompt }}"
      image_filename: /media/snapshot.jpg
    response_variable: content
  # Speak the generated text on the selected media player
  - service: tts.speak
    target:
      entity_id: tts.piper
    data:
      media_player_entity_id: "{{ media_player }}"
      message: "{{ content.text }}"
      cache: false
  # Return the generated content as the script's own response
  - variables:
      content: "{{ content }}"
  - stop: end
    response_variable: content
mode: single
icon: mdi:message-image
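
If it helps, here is how the script might be called once saved. This is only a sketch: script.gemini_pro_vision and the entity IDs are placeholders for your own. Because the script ends with stop and a response_variable, a caller can also read the generated text back out of the script's response:

# Example action in an automation (script name and entity IDs
# are placeholders):
- service: script.gemini_pro_vision
  data:
    prompt: "Briefly describe what you see."
    camera: camera.front_door
    media_player: media_player.living_room
  response_variable: result
# Under the assumptions above, result.text holds the generated text.
- service: notify.mobile_app_iphone
  data:
    message: "{{ result.text }}"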

No, it does not work, at least not the API version. I'm still getting an error about the user's location.

Yeah, and I can't use the Nest API as an image_filename: either. I don't totally understand response_variable: or blueprints either. I think what you provided is a blueprint? This is what I came up with, as I believe I also have to store the thumbnail in a response_variable:?

- id: 'c12'
  alias: Doorbell Camera Snapshot Notification
  trigger:
    platform: device
    device_id: feb17d26775a5xxxxxxxxxxxxxxxxx
    domain: nest
    type: camera_person
  action:
    - service: >-
        {%- if is_state('input_boolean.home', 'off') or
               not is_state('device_tracker.iphone', 'home') and
               is_state('sensor.ipad_ssid', 'wirelessfun') -%}
              notify.mobile_app_iphone
        {% else %}
              notify.ios
        {% endif %}
      data:
        message: Person Detected at the Front Door.
        data:
          image: >-
            /api/nest/event_media/{{ trigger.event.data.device_id }}/{{ trigger.event.data.nest_event_id }}/thumbnail
      response_variable: thumbnail
    - service: google_generative_ai_conversation.generate_content
      data:
        prompt: "Very briefly describe what you see in this image from my doorbell camera. Your message needs to be short enough to fit in a phone notification. Do not describe stationary objects or buildings."
        image_filename: "{{ thumbnail }}"
      response_variable: content
    - service: tts.speak
      target:
        entity_id: tts.piper
      data:
        media_player_entity_id: media_player.google_mini
        message: "{{ content.text }}"
        cache: false
    - variables:
        content: "{{ content}}"
    - stop: end
      response_variable: content    
  mode: queued
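
To clarify response_variable a bit: it simply names a variable that receives a service call's response, and later steps in the same sequence can read it with templates. Only services that actually return data (like generate_content) populate it; a notify call doesn't return anything to capture. A minimal sketch, with placeholder entity IDs:

# Minimal sketch of response_variable (entity IDs are placeholders).
sequence:
  - service: google_generative_ai_conversation.generate_content
    data:
      prompt: "Describe this image very briefly."
      image_filename: /media/snapshot.jpg
    # The service response is stored in `content`
    response_variable: content
  - service: notify.mobile_app_iphone
    data:
      # Later steps read it like a normal variable
      message: "{{ content.text }}"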

@tronikos I have no idea how you coded the integration, but I wonder if it would be possible to specify the endpoint manually? It seems the European endpoint blocks the integration at the moment, but the US one should work: python - Google Generative AI API error: "User location is not supported for the API use." - Stack Overflow