Building an AI-Powered Motion Video Search System with Home Assistant, AppDaemon, and LLM Vision

A few weeks ago, I came across the UniFi AI Key, a system that intelligently analyzes security footage and makes it searchable using AI. The idea fascinated me: imagine being able to instantly find motion events like a person near the front door, a package delivery, or an unknown vehicle pulling up, just by typing a simple search query.

Instead of purchasing the UniFi AI Key, I decided to take on the challenge of building my own AI-powered motion search system from scratch using Home Assistant, AppDaemon, LLM Vision, and Google Gemini/Ollama.

This article outlines the entire process, with full code samples and integration details.

Key Components Used

Home Assistant - Smart home automation platform.
AppDaemon - Python automation for Home Assistant.
LLM Vision - AI-powered image analysis for motion detection.
Google Gemini/Ollama - AI model for analyzing motion events.
SQLite Database - Stores motion event descriptions and video links for future searches.

How It Works

Motion Detection & Recording - Cameras detect motion and automatically record short video clips.

AI Analysis (Google Gemini/Ollama via LLM Vision) - The AI analyzes what caused the motion and generates a concise, human-readable description.

Storing Data in a Custom Database (SQLite) - Motion videos and AI-generated descriptions are saved for future searchability (an example record is sketched after this list).

AI-Powered Search - Users can find specific events by searching in natural language (e.g., "Show me all people approaching my front door").

Web Interface for Searching & Filtering - A simple front-end UI lets users browse, filter, and search motion events by date, time, and AI description.
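
To make the data flow concrete, here is the shape of the record each motion event ultimately produces (a minimal sketch in Python; the field names match the SQLite table defined in Step 3, but the values are invented):

# Hypothetical example of one stored motion event (values invented for illustration)
motion_event = {
    "id": 42,
    "timestamp": "2025-02-10 14:32:08",  # set by SQLite's CURRENT_TIMESTAMP
    "video": "http://192.168.1.100:8123/local/videos/front_door_motion_20250210_143208.mp4",
    "ai_message": "A courier in a yellow jacket approached the front door carrying a package.",
}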

Step 1: Setting Up Motion Detection & Recording
The first step is configuring Home Assistant to detect motion and record video.

Home Assistant Automation (Triggers AI Processing)

This automation listens for motion detection events and triggers a script to process the video.

alias: Ring Door Bell Camera - Video on motion
description: ""
triggers:
  - platform: state
    entity_id:
      - binary_sensor.door_bell_person_detection
    to: "on"
actions:
  - metadata: {}
    service: script.camera_ring_video_ai_notification
mode: single
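
Before wiring this to a real camera, it helps to smoke-test the pipeline by calling the processing script directly. A minimal AppDaemon sketch (a hypothetical helper used only for testing, not part of the final setup):

import appdaemon.plugins.hass.hassapi as hass

class TestMotionTrigger(hass.Hass):
    def initialize(self):
        # Call the processing script 5 seconds after this app loads,
        # so the whole chain can be exercised without real motion
        self.run_in(self.trigger_script, 5)

    def trigger_script(self, kwargs):
        self.call_service("script/camera_ring_video_ai_notification")
        self.log("Manually triggered the motion-processing script.")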

Step 2: Capturing Video & AI Processing
When motion is detected, a script handles the AI analysis, saves the video, and triggers notifications.

Home Assistant Script

sequence:
  - variables:
      timestamp: "{{ now().strftime('%Y%m%d_%H%M%S') }}"
      camera_name: front_door
  - metadata: {}
    service: camera.snapshot
    target:
      entity_id: camera.front_door
    data:
      filename: ./www/snapshots/{{ camera_name }}_motion_{{ timestamp }}.jpg
  - metadata: {}
    service: camera.record
    target:
      entity_id: camera.front_door
    data:
      filename: ./www/videos/{{ camera_name }}_motion_{{ timestamp }}.mp4
      duration: 10
  - delay: "00:00:12"
  - metadata: {}
    service: google_generative_ai_conversation.generate_content
    data:
      prompt: >-
        Motion detected. Analyze the video feed from my driveway camera and describe what caused the motion alarm.
        - Focus only on **moving objects** (people, animals, or vehicles).
        - **Do NOT describe stationary objects, buildings, or parked cars** (including my blue Mercedes-Benz B-Class or Mitsubishi Colt).
        - If a vehicle caused motion, describe it and extract the **license plate** (if visible).
        - If no clear motion is detected, return: "No Obvious Motion Detected."
        - Keep the response brief and suitable for a phone notification.
      image_filename:
        - ./www/snapshots/{{ camera_name }}_motion_{{ timestamp }}.jpg
    response_variable: generated_content
  - event: ai_video_event
    event_data:
      video: >-
        http://192.168.1.100:8123/local/videos/{{ camera_name }}_motion_{{ timestamp }}.mp4
      ai_message: "{{ generated_content['text'] }}"
mode: single
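
The ai_video_event payload can also be fired by hand, which is handy for testing the database app in Step 3 before any real motion occurs. A hypothetical AppDaemon smoke test (the file name and message are made up):

import appdaemon.plugins.hass.hassapi as hass

class FireTestEvent(hass.Hass):
    def initialize(self):
        # Fire a fake ai_video_event shortly after startup
        self.run_in(self.fire, 2)

    def fire(self, kwargs):
        self.fire_event(
            "ai_video_event",
            video="http://192.168.1.100:8123/local/videos/front_door_motion_test.mp4",
            ai_message="Test: a person walked up to the front door.",
        )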

Step 3: Storing AI Results in a Database
Since Home Assistant does not store AI-generated metadata, I built an AppDaemon app to save motion event details in an SQLite database.

Python AppDaemon Code (Stores Video Links & AI Messages) - motion_video_saver.py

import appdaemon.plugins.hass.hassapi as hass
import sqlite3

class SaveVideoAI(hass.Hass):
    def initialize(self):
        self.listen_event(self.store_ai_results, "ai_video_event")
        self.ensure_db_exists()

    def ensure_db_exists(self):
        """Ensure that the database and table exist."""
        db_path = "/homeassistant/custom_snapshots.db"

        try:
            with sqlite3.connect(db_path) as conn:
                cursor = conn.cursor()
                cursor.execute("""
                    CREATE TABLE IF NOT EXISTS motion_videos (
                        id INTEGER PRIMARY KEY AUTOINCREMENT,
                        timestamp DATETIME DEFAULT CURRENT_TIMESTAMP,
                        video TEXT,
                        ai_message TEXT
                    )
                """)
                conn.commit()
            self.log("Database and table verified.")
        except Exception as e:
            self.log(f"Error creating database: {str(e)}", level="ERROR")

    def store_ai_results(self, event_name, data, kwargs):
        """Store AI snapshot data in the database."""
        db_path = "/homeassistant/custom_snapshots.db"

        # Retrieve the video URL and AI message from the event data
        video = data.get("video", "")
        ai_message = data.get("ai_message", "")

        try:
            with sqlite3.connect(db_path) as conn:
                cursor = conn.cursor()
                cursor.execute(
                    """
                    INSERT INTO motion_videos (video, ai_message)
                    VALUES (?, ?)
                    """,
                    (video, ai_message),
                )
                conn.commit()

            self.log("Saved video data successfully.")

        except Exception as e:
            self.log(f"Error saving video data: {str(e)}", level="ERROR")

Step 4: Creating an API for Searching Videos
To make the stored data accessible, I built an API using AppDaemon.

Python API to Fetch Data - video_api.py

import appdaemon.plugins.hass.hassapi as hass
import sqlite3
import requests

GEMINI_API_KEY = "YOUR API KEY"
GEMINI_URL = f"https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key={GEMINI_API_KEY}"

class VideoAPI(hass.Hass):

    def initialize(self):
        self.log("Video API running...")
        self.register_endpoint(self.handle_request, "videos")
        self.register_endpoint(self.ai_search, "video_search")

    def handle_request(self, data, args):
        """Fetch all videos from the database"""
        db_path = "/homeassistant/custom_snapshots.db"

        try:
            conn = sqlite3.connect(db_path)
            cursor = conn.cursor()
            cursor.execute("SELECT id, timestamp, video, ai_message FROM motion_videos ORDER BY timestamp DESC")
            rows = cursor.fetchall()
            conn.close()

            videos = [{"id": row[0], "timestamp": row[1], "video": row[2], "ai_message": row[3]} for row in rows]
            return {"videos": videos}, 200
        except sqlite3.OperationalError as e:
            self.log(f"Database error: {e}", level="ERROR")
            return {"error": "Database query failed."}, 500

    def ai_search(self, data, args):
        """Process natural language search using Google Gemini AI"""
        query = data.get("query", "").strip()
        self.log(f"Received AI search query: {query}", level="INFO")

        # An empty query returns the full, unfiltered list
        if not query:
            return self.handle_request(data, args)

        # Load every stored event so Gemini can rank them against the query
        db_path = "/homeassistant/custom_snapshots.db"
        conn = sqlite3.connect(db_path)
        cursor = conn.cursor()
        cursor.execute("SELECT id, timestamp, video, ai_message FROM motion_videos ORDER BY timestamp")
        rows = cursor.fetchall()
        conn.close()

        if not rows:
            return {"error": "No videos found in the database."}, 200

        # Collect the stored AI descriptions for the prompt
        ai_messages = [row[3] for row in rows]

        formatted_messages = "\n".join([f"- {msg}" for msg in ai_messages])
        prompt_text = (
            f"You are analyzing AI-generated security camera messages. Given the following motion detection descriptions:\n\n"
            f"{formatted_messages}\n\n"
            f"Identify which messages are most relevant to the query: '{query}'. Return only the messages that match."
        )

        # Gemini REST payload: a single text part containing the prompt
        payload = {
            "contents": [
                {
                    "parts": [
                        {"text": prompt_text}
                    ]
                }
            ]
        }
        headers = {"Content-Type": "application/json"}

        try:
            response = requests.post(GEMINI_URL, json=payload, headers=headers)
            response.raise_for_status()
            gemini_data = response.json()

            self.log(f"Gemini API Response: {gemini_data}", level="INFO")

            # Keep only the videos whose descriptions Gemini flagged as relevant
            relevant_videos = self.filter_videos_based_on_ai(gemini_data, rows)

            return {"videos": relevant_videos}, 200
        except requests.exceptions.RequestException as e:
            self.log(f"Error calling Gemini API: {e}", level="ERROR")
            return {"error": "AI search request failed."}, 500

    def filter_videos_based_on_ai(self, gemini_data, videos):
        """Filter videos based on AI response"""
        if "candidates" not in gemini_data:
            self.log("No candidates returned by Gemini AI.", level="WARNING")
            return []

        recommended_messages = gemini_data["candidates"][0]["content"]["parts"][0]["text"].split("\n")
        cleaned_messages = [msg.strip().lstrip("0123456789.- ") for msg in recommended_messages]

        self.log(f"Processed AI recommendations: {cleaned_messages}", level="INFO")

        # Keep rows whose stored description contains one of the recommended lines
        filtered_videos = [
            {
                "id": video[0],
                "timestamp": video[1],  
                "video": video[2],   
                "ai_message": video[3]
            }
            for video in videos if any(msg.lower() in video[3].lower() for msg in cleaned_messages)
        ]

        self.log(f"Filtered videos count: {len(filtered_videos)}", level="INFO")

        return filtered_videos
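
Both endpoints can be exercised without the web UI using a small requests client (a sketch; adjust the host to your AppDaemon instance, and note that these endpoints are unauthenticated unless you secure the AppDaemon API):

import requests

BASE = "http://homeassistant.local:5050/api/appdaemon"

# List every stored motion event
all_videos = requests.get(f"{BASE}/videos", timeout=10).json()
print(f"{len(all_videos['videos'])} events stored")

# Natural-language search via the Gemini-backed endpoint
result = requests.post(
    f"{BASE}/video_search",
    json={"query": "person at the front door"},
    timeout=60,
)
print(result.json())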

NOTE: Save the Python scripts in this folder on your HA instance: \\your_IP\addon_configs\xxxxxx_appdaemon\apps

Also add the scripts to your apps.yaml file. See the example below:

motion_video_saver:
  module: motion_video_saver
  class: SaveVideoAI  

video_api:
  module: video_api
  class: VideoAPI

Step 5: Creating a Searchable Web UI - videos.html
I built a simple web interface to browse, filter, and search motion events.

Features:

Paginated Results (15 videos per page)
AI-powered Search
Filter by Date & Time
Note: Save the HTML file in the www folder of AppDaemon and serve it from there: http://homeassistant.local:5050/local/videos.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Motion Video Search (AI Powered)</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            background-color: #121212;
            color: #ffffff;
            margin: 20px;
            padding: 0;
            text-align: center;
        }
        input, select {
            padding: 10px;
            margin: 5px;
            font-size: 16px;
            border: 1px solid #444;
            border-radius: 5px;
            background-color: #222;
            color: #ffffff;
        }
        button {
            padding: 10px 15px;
            background-color: #007BFF;
            color: white;
            border: none;
            border-radius: 5px;
            cursor: pointer;
            margin: 5px;
        }
        .grid {
            display: flex;
            flex-wrap: wrap;
            justify-content: center;
            gap: 10px;
        }
        .card {
            border: 1px solid #333;
            border-radius: 10px;
            overflow: hidden;
            width: 250px;
            text-align: left;
            background-color: #1e1e1e;
            box-shadow: 2px 2px 10px rgba(255, 255, 255, 0.1);
            cursor: pointer;
        }
        .card video {
            width: 100%;
            height: 150px;
            object-fit: cover;
        }
        .card p {
            padding: 10px;
            font-size: 14px;
            color: #bbb;
        }
        .fullscreen-container {
            display: none;
            position: fixed;
            top: 0;
            left: 0;
            width: 100%;
            height: 100%;
            background: rgba(0, 0, 0, 0.9);
            justify-content: center;
            align-items: center;
            flex-direction: column;
        }
        .fullscreen-container video {
            max-width: 90%;
            max-height: 80%;
            border-radius: 5px;
        }
        .fullscreen-container p {
            color: white;
            margin-top: 10px;
            font-size: 16px;
        }
        .close-btn {
            position: absolute;
            top: 20px;
            right: 30px;
            font-size: 30px;
            color: white;
            cursor: pointer;
        }
        .pagination {
            margin: 20px;
        }
    </style>
</head>
<body>

    <h1>Find Anything (AI Powered)</h1>
    
    <input type="text" id="searchInput" placeholder="Enter a keyword or ask a question...">
    <button onclick="performSearch()">Search</button>

    <br><br>

    <label for="dateFilter">Select Date:</label>
    <input type="date" id="dateFilter">

    <label for="startTime">Start Time:</label>
    <input type="time" id="startTime">

    <label for="endTime">End Time:</label>
    <input type="time" id="endTime">

    <button onclick="applyFilters()">Apply Filters</button>
    <button onclick="clearFilters()">Clear Filters</button>

    <div id="videoGrid" class="grid"></div>

    <div class="pagination">
        <button onclick="prevPage()" id="prevBtn" disabled>Previous</button>
        <span id="pageInfo"></span>
        <button onclick="nextPage()" id="nextBtn" disabled>Next</button>
    </div>

    <div id="fullscreenContainer" class="fullscreen-container" onclick="closeFullscreen()">
        <span class="close-btn">&times;</span>
        <video id="fullscreenVideo" controls></video>
        <p id="fullscreenMessage"></p>
    </div>

    <script>
        let videos = [];
        let filteredVideos = [];
        let currentPage = 1;
        const itemsPerPage = 15;
        const API_URL = "http://homeassistant.local:5050/api/appdaemon/videos";
        const AI_SEARCH_URL = "http://homeassistant.local:5050/api/appdaemon/video_search";

        async function fetchVideos() {
            try {
                const response = await fetch(API_URL);
                if (!response.ok) {
                    throw new Error("Failed to fetch videos");
                }
                const data = await response.json();

                if (!data.videos || !Array.isArray(data.videos)) {
                    console.error("Invalid API response format:", data);
                    alert("Error loading videos. Please check your API response.");
                    return;
                }

                videos = data.videos.sort((a, b) => new Date(b.timestamp) - new Date(a.timestamp));
                filteredVideos = [...videos];
                currentPage = 1;
                displayVideos();
            } catch (error) {
                console.error("Error fetching videos:", error);
                alert("Error loading videos. Please check your connection.");
            }
        }

        function updatePagination() {
            document.getElementById("pageInfo").textContent = `Page ${currentPage} of ${Math.ceil(filteredVideos.length / itemsPerPage) || 1}`;
            document.getElementById("prevBtn").disabled = currentPage === 1;
            document.getElementById("nextBtn").disabled = currentPage * itemsPerPage >= filteredVideos.length;
        }

        function nextPage() {
            if (currentPage * itemsPerPage < filteredVideos.length) {
                currentPage++;
                displayVideos();
            }
        }

        function prevPage() {
            if (currentPage > 1) {
                currentPage--;
                displayVideos();
            }
        }

        async function performSearch() {
            const query = document.getElementById("searchInput").value.trim();
            if (!query) {
                filteredVideos = [...videos];
                currentPage = 1;
                displayVideos();
                return;
            }

            try {
                const response = await fetch(AI_SEARCH_URL, {
                    method: "POST",
                    headers: { "Content-Type": "application/json" },
                    body: JSON.stringify({ query: query })
                });

                if (!response.ok) {
                    throw new Error("Failed to fetch AI search results");
                }

                const data = await response.json();

                if (!data.videos || !Array.isArray(data.videos)) {
                    console.warn("AI search response does not contain 'videos' array:", data);
                    alert("AI search did not return any videos.");
                    return;
                }

                filteredVideos = data.videos.sort((a, b) => new Date(b.timestamp) - new Date(a.timestamp));
                currentPage = 1;
                displayVideos();
            } catch (error) {
                console.error("Error performing AI search:", error);
                alert("Error searching videos. Please check your connection.");
            }
        }

        function applyFilters() {
            const selectedDate = document.getElementById("dateFilter").value;
            const startTime = document.getElementById("startTime").value;
            const endTime = document.getElementById("endTime").value;

            filteredVideos = videos.filter(video => {
                const videoDate = new Date(video.timestamp);
                // Build the date string from local time; toISOString() uses UTC
                // and can shift events across midnight in non-UTC timezones
                const videoDateStr = `${videoDate.getFullYear()}-${String(videoDate.getMonth() + 1).padStart(2, "0")}-${String(videoDate.getDate()).padStart(2, "0")}`;
                
                if (selectedDate && videoDateStr !== selectedDate) {
                    return false;
                }

                if (startTime || endTime) {
                    const videoTime = videoDate.toTimeString().split(" ")[0].slice(0, 5);
                    if (startTime && videoTime < startTime) return false;
                    if (endTime && videoTime > endTime) return false;
                }

                return true;
            });

            currentPage = 1;
            displayVideos();
        }

        function clearFilters() {
            document.getElementById("dateFilter").value = "";
            document.getElementById("startTime").value = "";
            document.getElementById("endTime").value = "";
            filteredVideos = [...videos];
            currentPage = 1;
            displayVideos();
        }

        function displayVideos() {
            const grid = document.getElementById("videoGrid");
            grid.innerHTML = "";

            const startIndex = (currentPage - 1) * itemsPerPage;
            const endIndex = startIndex + itemsPerPage;
            const paginatedVideos = filteredVideos.slice(startIndex, endIndex);

            paginatedVideos.forEach(video => {
                // Escape quotes so AI messages containing them don't break the inline onclick handler
                const safeMessage = (video.ai_message || "").replace(/"/g, "&quot;").replace(/'/g, "\\'");
                grid.innerHTML += `
                    <div class="card">
                        <video src="${video.video}" controls muted onclick="openFullscreen('${video.video}', '${safeMessage}')"></video>
                        <p><strong>Timestamp:</strong> ${video.timestamp}</p>
                        <p><strong>AI Message:</strong> ${video.ai_message}</p>
                    </div>`;
            });

            updatePagination();
        }

        // Fullscreen playback handlers referenced by the onclick attributes above
        function openFullscreen(videoSrc, message) {
            document.getElementById("fullscreenVideo").src = videoSrc;
            document.getElementById("fullscreenMessage").textContent = message;
            document.getElementById("fullscreenContainer").style.display = "flex";
        }

        function closeFullscreen() {
            const video = document.getElementById("fullscreenVideo");
            video.pause();
            video.removeAttribute("src");
            document.getElementById("fullscreenContainer").style.display = "none";
        }

        window.onload = fetchVideos;
    </script>

</body>
</html>

Conclusion
This AI-powered motion search system has been a fun and rewarding project. Let me know if you’d like to build your own AI-powered motion search system!

This is some fantastic work you’ve done! Would you like to collaborate? I am currently developing a card for LLM Vision, and adding search functionality would be the next logical step.

Just FYI, Frigate has something similar built in. It stores its own embeddings in a vector DB and allows semantic search based on descriptions and images. The description box can be filled out manually or populated by generative AI (Gemini, OpenAI, Llama, etc.).