The Gemini API is a powerful AI-driven tool from Google, designed to enhance applications with natural language understanding, text generation, and multimodal capabilities. It enables developers to integrate AI-powered responses into their apps, improving automation and user interactions. Webhooks play a crucial role in real-time communication between applications, allowing external systems to send instant updates or trigger events based on specific actions. For example, a webhook can notify a smart home system when a new message arrives, enabling text-to-speech (TTS) to vocalize the message through a speaker. TTS technology converts text into natural-sounding speech, making it ideal for accessibility, virtual assistants, and hands-free applications. By combining the Gemini API, webhooks, and TTS, developers can create seamless, AI-powered voice experiences that enhance user engagement and automation.
In short, I figured out a way to run boring TTS messages through the Gemini API (have it spice them up and make them unique) and then send them to my HomePod mini.
In this example I have a webhook being triggered from my iPhone for two things:
- When my wake-up alarm turns off
- When I receive a message from my boss
Step One
Set up a webhook node as shown below
Step Two
Set up the webhook automation on your iPhone as shown below (test the webhook to ensure it's working)
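If you want to test the webhook from a computer before touching the iPhone, here is a minimal sketch of the request the phone would send. The base URL and webhook ID are hypothetical placeholders, and the `/api/webhook/<id>` path assumes you are using a Home Assistant webhook; the only thing taken from this flow is the `message` field, which the function node later reads as `msg.payload.message`.

```javascript
// Hypothetical values: replace with your own Home Assistant address and webhook ID.
const HA_BASE_URL = "http://homeassistant.local:8123";
const WEBHOOK_ID = "my-tts-webhook";

// Build the URL and fetch options for a test call to the webhook.
function buildWebhookTest(message) {
  return {
    url: `${HA_BASE_URL}/api/webhook/${WEBHOOK_ID}`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // The flow reads msg.payload.message, so send a "message" field.
      body: JSON.stringify({ message }),
    },
  };
}

const example = buildWebhookTest("Test message from my desk");
console.log(example.url);
// To actually send it (Node 18+): fetch(example.url, example.options)
```

If the debug node shows the message after sending this, the webhook side is working and any remaining problem is in the iPhone automation.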
Step Three
Drag in a function node and enter the code below
// Extract the received message from the webhook payload
let userMessage = msg.payload.message || "No message received.";

// Define API request parameters
msg.method = "POST";
msg.url = "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-lite:generateContent?key=YOUR-API-KEY-HERE";
msg.headers = { "Content-Type": "application/json" };

// AI prompt - format the text message dynamically
msg.payload = JSON.stringify({
  "contents": [
    {
      "parts": [
        {
          "text": `Summarize the text message, keep it short and sweet, and leave out confidential information.\n\nMessage: "${userMessage}"`
        }
      ]
    }
  ]
});
return msg;
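One optional tweak: the Gemini API also accepts the key in an `x-goog-api-key` header, which keeps it out of the URL (and out of anything that logs URLs). Here is a rough sketch of the same request built that way. The helper name `buildGeminiRequest` and the `GEMINI_API_KEY` environment variable are my own placeholders, not part of the original flow.

```javascript
// Build the Gemini request without putting the API key in the URL.
function buildGeminiRequest(userMessage, apiKey) {
  const prompt =
    "Summarize the text message, keep it short and sweet, and leave out " +
    `confidential information.\n\nMessage: "${userMessage}"`;
  return {
    method: "POST",
    url: "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-lite:generateContent",
    headers: {
      "Content-Type": "application/json",
      "x-goog-api-key": apiKey, // key travels in a header instead of ?key=
    },
    payload: JSON.stringify({ contents: [{ parts: [{ text: prompt }] }] }),
  };
}

// In a function node you might use it like this
// (GEMINI_API_KEY is a hypothetical environment variable name):
//   Object.assign(msg, buildGeminiRequest(
//     msg.payload.message || "No message received.",
//     env.get("GEMINI_API_KEY")));
//   return msg;
const req = buildGeminiRequest("Meeting moved to 3pm", "YOUR-API-KEY-HERE");
console.log(req.headers["x-goog-api-key"]);
```

If you go this route, leave the URL field of the http request node empty so it picks up `msg.url` and `msg.headers` from the function node.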
(Optional step: drag in a debug node to print the raw text message to the console and make sure things are working properly. Refer to the first screenshot to see my flow.)
Step Four
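Drag in an http request node and set the URL to https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-lite:generateContent?key=YOUR-API-KEY-HERE. Set the method to POST (or "- set by msg.method -") and set Return to "a parsed JSON object" so the next step can read msg.payload.candidates.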
Step Five
Drag in a function node and paste in this code
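// Pull the generated text out of the Gemini response
if (msg.payload && Array.isArray(msg.payload.candidates) && msg.payload.candidates.length > 0) {
  msg.payload = msg.payload.candidates[0].content.parts[0].text;
} else {
  msg.payload = "Error: No valid response from AI.";
}
return msg;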
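If you ever see the error message even though the API call succeeded, the response probably arrived as a raw string or with a slightly different shape. Here is a more defensive sketch of the same extraction. The helper name `extractGeminiText` is my own, and joining multiple parts together is an assumption about how you would want a multi-part reply read out.

```javascript
// Return the generated text from a Gemini response, or a fallback string.
function extractGeminiText(response) {
  const fallback = "Error: No valid response from AI.";
  let data = response;
  // The http request node returns a string unless it is set to parse JSON.
  if (typeof data === "string") {
    try {
      data = JSON.parse(data);
    } catch (err) {
      return fallback;
    }
  }
  const parts = data?.candidates?.[0]?.content?.parts;
  if (!Array.isArray(parts)) return fallback;
  // Join every text part and trim stray whitespace before it goes to TTS.
  const text = parts.map((p) => p.text || "").join("").trim();
  return text || fallback;
}

const sample = {
  candidates: [{ content: { parts: [{ text: "Your boss says hi.\n" }] } }],
};
console.log(extractGeminiText(sample));
// In a function node: msg.payload = extractGeminiText(msg.payload); return msg;
```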
Step Six
Drag in an action node / call service node (depends on your version), set the action to "media_player.volume_set" from the drop-down box, and set the entity to your media player device. We want to make sure the alert always plays at a specific volume rather than whatever volume it was previously set to, or else you're gonna have some fun moments. Be sure to add the code below to the data field as well:
{
  "volume_level": 0.7
}
Step Seven
Okay, I'm getting tired, so step seven is gonna be like the next 40 steps in one lol. Refer to the first image for where to place your nodes. Drag in a delay node and set it to 1 second, then drag in another call service / action node (pay attention here) and copy the settings as seen below.
Great, now everything is working for ya. I chose this specific flow because it highlights how to incorporate webhooks from an iPhone to trigger the flow, but you can also use the Gemini API to customize every TTS message from Home Assistant if you really want to, and even prompt it to sound more like Jarvis if that's your thing. Idk, it just keeps things more natural instead of it saying the same exact words all the time. The best part is that, unlike ChatGPT, the Gemini API is free, with a rate limit of 1-30 messages per minute depending on which model you use.
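For the Jarvis idea, the only thing that really changes is the prompt text. Here is a rough sketch of a prompt builder with a persona you can swap out. The function name, the wording of the prompt, and the example persona are all just my guesses at something that works, so tune them to taste.

```javascript
// Build a prompt that asks Gemini to rephrase an announcement in a given voice.
function buildPersonaPrompt(message, persona) {
  return (
    `Rewrite the following announcement in the style of ${persona}. ` +
    "Keep it to one or two short sentences suitable for text-to-speech." +
    `\n\nAnnouncement: "${message}"`
  );
}

const prompt = buildPersonaPrompt(
  "The washing machine has finished.",
  "a polite British AI butler"
);
console.log(prompt);
// Use the result as the "text" value in the function node from Step Three.
```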
If you don't know where to grab a Gemini API key, here's the link:
[Gemini API](https://ai.google.dev/)
I'm a bit late to Home Assistant and hope to be of value to the community!!!