Project Fronkensteen — Five AI Personas Run My House from a Raspberry Pi 5
Hey everyone,
I’ve been building something ridiculous for the past month and I think it’s finally ready to share. It’s called Project Fronkensteen (yes, pronounced Fronkensteen), and it’s a full AI voice assistant system where five characters with distinct personalities run my smart home.
The cast: Rick Sanchez, Quark, Deadpool, Cosmo Kramer, and Doctor Portuondo — a Cuban psychoanalyst from a Spanish TV show who always responds in Spanish. Portuondo runs on Claude Opus 4.6 when you ask him for a therapy session; all other modes and agents run on Llama 4 Maverick via OpenRouter.
The whole thing runs on a Raspberry Pi 5 with a 2TB NVMe. Not a rack. Not a cloud bill. When the daily budget runs dry, the system auto-downgrades and recovers at midnight.
GitHub: mmadalone/Project_Fronkensteen
The stuff I’m most proud of
A dispatcher that picks who talks to you. Every voice interaction goes through a priority routing chain (agent_dispatcher.py): first it checks for a handoff request (“pass me to Deadpool”), then conversation continuity (same agent if you spoke to them recently), then topic affinity (keyword match from L2 memory), then time-of-day era defaults configured via the dispatcher_profile blueprint and input_select.ai_dispatcher_era_* helpers (four eras: morning, afternoon, evening, late night — each maps to a persona or “rotate”), then user persona preference from memory, and finally random fallback. It auto-discovers personas from Assist Pipelines — any pipeline named "<Persona> - <Variant>" registers automatically. The blueprint calls pyscript.agent_dispatch and uses the result to select which conversation agent processes the LLM query.
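If you want the shape of it, here is a minimal Python sketch of that chain. The helper tables, regex, and era mapping are mine for illustration, not the actual agent_dispatcher.py API:

```python
import random
import re

# Illustrative only: the alias table, era mapping, and regex are assumptions.
HANDOFF_RE = re.compile(r"\b(?:pass|hand)\s+me\s+(?:over\s+)?to\s+(\w+)", re.IGNORECASE)
ALIASES = {"deadpool": "deepee"}                            # from the post: deadpool -> deepee
ERA_DEFAULTS = {"morning": "quark", "afternoon": "deadpool",
                "evening": "rick", "late_night": "rotate"}  # hypothetical era mapping

def dispatch(utterance, era, last_agent, seconds_since_last,
             topic_hits, preferred, known_agents):
    """Pick a persona for this interaction, highest priority first."""
    if (m := HANDOFF_RE.search(utterance)):            # P0: explicit handoff request
        name = m.group(1).lower()
        return ALIASES.get(name, name)
    if last_agent and seconds_since_last < 300:        # P1: conversation continuity
        return last_agent
    if topic_hits:                                     # P2: topic affinity from L2 memory
        return topic_hits[0]
    default = ERA_DEFAULTS.get(era, "rotate")          # P3: time-of-day era default
    if default != "rotate":
        return default
    return preferred or random.choice(known_agents)    # P4: preference, P5: random
```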
Live voice handoff. Say “pass me to Deadpool” mid-conversation. The dispatcher detects the pattern via regex (P0), resolves aliases (deadpool → deepee), fires an ai_handoff_request event. voice_handoff.py catches it, switches the satellite’s pipeline select entity, plays a greeting in the target persona’s voice, and opens the mic via assist_satellite.start_conversation.
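Roughly, the handler side looks like this in pyscript terms. The event payload shape, entity naming, and greeting text are assumptions, not the real voice_handoff.py:

```python
# Pyscript-style sketch of the handoff handler; payload fields and entity
# naming are assumptions for illustration.
@event_trigger("ai_handoff_request")
def on_handoff(target=None, satellite=None, greeting=None, **kwargs):
    # Point the satellite at the target persona's Assist pipeline
    select.select_option(
        entity_id=f"select.{satellite}_assist_pipeline",
        option=f"{target} - Default",
    )
    # Greet in the new voice, then leave the mic open for the reply
    assist_satellite.start_conversation(
        entity_id=f"assist_satellite.{satellite}",
        start_message=greeting or f"{target} here. Talk to me.",
    )
```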
The entire personality shifts with time of day — words AND voice. Two layers working together:
The agent prompts use Jinja now().hour templates to shift personality across 5-6 time blocks per character (a minimal sketch of the logic follows the list):
- Rick — a drunk level variable: before 5am “still completely hammered, barely coherent, slurring hard”; 5-8am “severely hungover, groaning, light-sensitive”; 9-11am “hungover but functional, grumpy, needs coffee”; noon-4pm “casually drinking, slightly slurring”; 5-8pm “noticeably drunk, adding stutters”; after 9pm “completely hammered, heavy slurring.” Burp frequency scales too — one per response during the day, two or three mid-word at night.
- Quark — before 5am “bar closed hours ago, counting the day’s latinum alone in the dim light”; 5-8am “barely open, nursing a raktajino, speaking slowly”; 9-11am “warming up, running tallies, getting sharper”; noon-4pm “peak bar hours — alert, charming, fast-talking, always angling”; 5-9pm “winding down but still sharp, slightly more candid”; after 10pm “bar is closing, tired, a little philosophical, oddly sincere.”
- Deadpool — before 9am “surprisingly subdued — morning voice, still sharp, dry wit, quiet sarcasm, like a merc before his first chimichanga”; 9am-noon “energy building, tangents starting, like a puppy with a knife collection that just had coffee”; noon-4pm “peak Deadpool — maximum chaos, fourth wall obliterated”; 5-8pm “mission mode — still chaotic but with direction, weirdly effective”; after 9pm “dramatic whispers, late-night commentary, paranoid that Wolverine is hiding in the hallway.”
- Kramer — before 5am “still up, pacing the hallway in a robe, having increasingly unhinged 3am ideas”; 5-8am “barely awake, groggy, shuffling in a bathrobe, yawning mid-sentence”; 9-11am “warming up, starting to get ideas, talking faster”; noon-4pm “fully wired, every idea is a million dollar idea, cannot sit still”; 5-8pm “maximum Kramer — sliding through doors, talking at full speed, interrupting himself”; after 9pm “winding down but restless, philosophical, sudden bursts of insight followed by yawning.”
- Portuondo — before 5am “Havana at 3am, slow, heavy with meaning, like smoke from the last cigar, the whisky cup is almost empty”; 5-8am “nursing your first whisky — from a cup, never a glass, measured, precise, almost gentle”; 9am-noon “sharp, clinical, intellectually on fire, the devastating question like a scalpel”; 1-4pm “the second whisky is poured, patience for nonsense has shortened, warm but relentless”; 5-8pm “the fires are fully lit, shouts when needed, laughs loudly, swears freely, may throw someone out of the session”; after 9pm “Havana at 2am lives in your voice — slower, deeper, almost hypnotic, but the insight is sharper than ever.”
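Here is the sketch promised above: the same hour-to-directive logic in plain Python, using Rick's schedule. In the real system this lives as Jinja now().hour blocks inside the agent's system prompt:

```python
from datetime import datetime

# Python rendering of Rick's Jinja time blocks; directives are quoted from
# the list above, the data structure is mine.
RICK_BLOCKS = [
    (5,  "still completely hammered, barely coherent, slurring hard"),
    (9,  "severely hungover, groaning, light-sensitive"),
    (12, "hungover but functional, grumpy, needs coffee"),
    (17, "casually drinking, slightly slurring"),
    (21, "noticeably drunk, adding stutters"),
    (24, "completely hammered, heavy slurring"),
]

def drunk_level(hour=None):
    hour = datetime.now().hour if hour is None else hour
    for upper, directive in RICK_BLOCKS:
        if hour < upper:
            return directive
    return RICK_BLOCKS[-1][1]
```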
The voice itself shifts in parallel via a patched fork of ElevenLabs Custom TTS (thanks Loryan Strant). The voice_mood_modulation blueprint runs one instance per character, fires hourly, and writes stability values + audio tag prefixes for each of five time blocks. From the live config:
- Rick — early morning: stability 0.05 + [groaning] [tired]; afternoon: 0.45; evening: 0.30 + [slurring]; late night: 0.20 + [slurring]
- Quark — morning: 0.55 + [excited]; afternoon: 0.60; late night: 0.45 + [whispers]
- Deadpool — afternoon: 0.30 + [excited]; evening: 0.25 + [laughs]; late night: 0.45 + [whispers]
- Portuondo — morning: 0.60; evening: 0.35 + [thick Cuban accent] [excited]; late night: 0.50 + [thick Cuban accent] [whispers]
- Kramer — morning: 0.30 + [excited]; afternoon: 0.25; late night: 0.45 + [whispers]
So when Rick says something at 11pm, the LLM generates hammered text with [burps] and [slurring] tags, and the TTS engine renders it at stability 0.20, low enough that the voice itself wobbles. Both layers hit at once.
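For the voice layer, the selection amounts to a lookup like this. The stability values and tags are Rick's from the config above; the hour boundaries and the settings dict shape are my guess:

```python
# Rick's values from the live config; block boundaries are assumptions.
RICK_VOICE = [
    (9,  0.05, "[groaning] [tired]"),  # early morning
    (17, 0.45, ""),                    # afternoon
    (21, 0.30, "[slurring]"),          # evening
    (24, 0.20, "[slurring]"),          # late night
]

def voice_settings(hour):
    for upper, stability, tags in RICK_VOICE:
        if hour < upper:
            return {"stability": stability, "tag_prefix": tags}
    return {"stability": 0.45, "tag_prefix": ""}
```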
Agents share context through a silent whisper network. After every voice interaction, the active agent writes mood observations (keyword-detected: frustrated/stressed/tired/happy), topic tracking, and interaction logs to shared L2 memory via pyscript.agent_whisper. Before the next interaction, any agent reads this context back via pyscript.agent_whisper_context and injects it into their system prompt. It’s broadcast — not bilateral. Zero LLM calls, pure keyword matching, entries expire (moods: 1 day, topics: 3 days, interactions: 2 days).
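A toy model of the broadcast semantics (the real agent_whisper services persist to L2 SQLite memory, not an in-process list; the keyword lists here are illustrative):

```python
import time

# TTLs below are the expiries quoted above; everything else is a sketch.
TTL = {"mood": 1 * 86400, "topic": 3 * 86400, "interaction": 2 * 86400}
MOOD_KEYWORDS = {"frustrated": ("ugh", "annoying"), "tired": ("exhausted", "sleepy")}
whispers = []

def detect_mood(text):
    text = text.lower()
    return next((mood for mood, words in MOOD_KEYWORDS.items()
                 if any(w in text for w in words)), None)

def agent_whisper(agent, kind, payload):
    whispers.append({"agent": agent, "kind": kind, "payload": payload,
                     "expires": time.time() + TTL[kind]})

def agent_whisper_context():
    """Every agent reads all live entries: broadcast, not bilateral."""
    now = time.time()
    return "\n".join(f"({w['agent']}) {w['kind']}: {w['payload']}"
                     for w in whispers if w["expires"] > now)
```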
Agents roast each other — and sometimes it escalates. reactive_banter.yaml fires after one agent responds, probabilistically having another agent chime in with in-character commentary. Step 10 of the blueprint: if enable_theatrical_escalation is on, there’s a configurable probability (default 10%) that the banter fires an ai_theatrical_request event, which triggers theatrical_mode.py — a full multi-agent debate where 2-5 personas take turns arguing a topic on separate speakers. Budget-gated at 70% floor.
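The escalation gate itself is tiny. A sketch, with illustrative parameter names:

```python
import random

# Sketch of step 10's gate; parameter names are mine, the numbers are the
# defaults quoted above (10% probability, 70% budget floor).
def maybe_fire_theatrical(enabled, budget_pct, probability=0.10):
    if not enabled or budget_pct < 70:
        return False
    return random.random() < probability
```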
Notifications follow you between rooms. notification_follow_me.yaml — 4,956 lines. Intercepts Android notifications, determines the user’s room via FP2 presence sensors, routes to the nearest voice satellite, and has the conversation agent summarize who messaged and what they said. LLM-powered announcements, contact matching, burst combining, escalating reminder loops, dedup, per-sender cooldown, multi-player volume ducking with atomic refcount coordination across 7+ automations.
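The refcount trick is the part worth stealing. A toy version (in HA it is coordinated through helper entities shared across the automations, not Python dicts):

```python
# Several automations can duck the same player; volume is restored only
# when the last one releases. Function shapes are illustrative.
duck_counts = {}
saved_volume = {}

def duck(player, current_volume, ducked_volume=0.2):
    duck_counts[player] = duck_counts.get(player, 0) + 1
    if duck_counts[player] == 1:
        saved_volume[player] = current_volume  # first ducker saves the level
    return ducked_volume

def unduck(player):
    duck_counts[player] -= 1
    if duck_counts[player] == 0:
        return saved_volume.pop(player)        # last one out restores volume
    return None                                # others still ducking
```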
Bedtime is a negotiation. bedtime_routine_plus.yaml (2,897 lines) walks through TV/IR shutdown, lights, music via Music Assistant, and a countdown timer. It has a bathroom guard — a configurable binary sensor (typically FP2 bathroom zone) that pauses the routine if you’re in the bathroom. enable_countdown_negotiation lets the LLM negotiate extra time. Multi-language yes/no classification. The companion bedtime_winddown.yaml detects 4 scenarios (sleepy_tv, bed_tv, bed_idle, bed_non_sleepy) with cooldown curves and budget gating.
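The yes/no classification is plain keyword matching. A minimal sketch; the real word lists are configurable and much richer than this:

```python
# Illustrative multi-language yes/no word lists (English/Spanish/Catalan).
YES = {"yes", "yeah", "sure", "ok", "sí", "si", "vale", "claro", "d'acord"}
NO = {"no", "nope", "nah"}

def classify(answer):
    words = set(answer.lower().strip(".!?").split())
    if words & YES and not words & NO:
        return "yes"
    if words & NO:
        return "no"
    return "unclear"   # re-prompt, or hand the ambiguity to the LLM
```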
Memory that actually persists. memory.py — SQLite with FTS5 full-text search + sqlite-vec semantic embeddings. 29 services. The agents remember conversations, preferences, contact history, and can semantic-search across everything. Auto-archiving with LLM compression, relationship graphs, todo list sync, per-contact message history, budget cost breakdowns per model.
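If you have never used FTS5 and sqlite-vec together, the two search paths look like this. A self-contained demo with a fake 4-dim embedding, not memory.py's actual schema:

```python
import sqlite3
import sqlite_vec  # pip install sqlite-vec

# Demo only: real embeddings come from a model, and the real schema is richer.
db = sqlite3.connect(":memory:")
db.enable_load_extension(True)
sqlite_vec.load(db)
db.enable_load_extension(False)

db.execute("CREATE VIRTUAL TABLE mem_fts USING fts5(content)")               # keyword search
db.execute("CREATE VIRTUAL TABLE mem_vec USING vec0(embedding float[4])")    # semantic search

db.execute("INSERT INTO mem_fts(rowid, content) VALUES (1, 'user prefers jazz after dinner')")
db.execute("INSERT INTO mem_vec(rowid, embedding) VALUES (1, ?)",
           [sqlite_vec.serialize_float32([0.1, 0.9, 0.2, 0.4])])

# Full-text hit
print(db.execute("SELECT content FROM mem_fts WHERE mem_fts MATCH 'jazz'").fetchall())
# Nearest-neighbor hit
print(db.execute(
    "SELECT rowid, distance FROM mem_vec WHERE embedding MATCH ? ORDER BY distance LIMIT 3",
    [sqlite_vec.serialize_float32([0.1, 0.8, 0.2, 0.5])],
).fetchall())
```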
Presence-triggered proactive announcements. proactive_unified.yaml — 1,642 lines, 8 collapsible sections. Triggers when you enter a zone (FP2 presence sensors, configurable minimum duration to avoid walk-by triggers). Two message modes: template (reads time-of-day messages from input_text helpers, zero LLM cost) and LLM (generates in-character via conversation agent with custom prompts and extra context entities). Budget gate auto-forces template mode below a configurable floor. Dispatcher integration for agent/voice selection. TTS collision handling (dedup, queue, or barge-in). “Keep nagging” mode repeats at a cooldown interval while you’re present (max nags configurable). Full weekend override profile (separate schedule, cooldown, and prompt). Bedtime question feature: after the TTS message, optionally opens the Assist Satellite mic with a yes/no bedtime question — if you say yes, a bedtime script fires. Privacy-gated. Identity-gated (confidence >= 50). Bedtime-skip when bedtime is active. Consolidates 3 older proactive blueprints.
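The budget gate is a one-liner in spirit (the floor is configurable in the blueprint; 40 below is just an example):

```python
# Below the floor, LLM mode silently degrades to free template messages.
def pick_message_mode(configured_mode, budget_pct, floor=40):
    if configured_mode == "llm" and budget_pct < floor:
        return "template"   # zero-cost messages read from input_text helpers
    return configured_mode
```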
Per-person room identity. presence_identity.py implements an Anchor-and-Track algorithm fusing FP2 mmWave zones, WiFi device tracking, voice satellite activity, and Markov transition priors. This powers a 3-tier privacy gate — 33 features gated across tiers (T1 intimate, T2 personal, T3 ambient) with hysteresis to prevent flapping. Personal notifications don’t play if the wrong person might be listening.
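The hysteresis is the classic two-threshold gate: unlock high, relock low, so brief confidence dips don't toggle features on and off. The thresholds below are illustrative:

```python
# Two-threshold hysteresis gate; 50/40 are example values, not the real tiers.
class TierGate:
    def __init__(self, unlock_at=50, relock_at=40):
        self.unlock_at, self.relock_at = unlock_at, relock_at
        self.open = False

    def update(self, confidence):
        if not self.open and confidence >= self.unlock_at:
            self.open = True
        elif self.open and confidence < self.relock_at:
            self.open = False
        return self.open
```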
Budget tracking with teeth. Every LLM/TTS/STT call tracked per-agent. Three hardcoded tiers in common_utilities.py: essential (always allowed), standard (blocked below 30%), luxury (blocked below 60%). When budget exhausts fully, the system falls back to HA Cloud TTS + basic intent handler. Music composition routes to local FluidSynth MIDI synthesis when API budget gets tight.
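The tier check is deliberately dumb. A sketch (tier membership per feature lives in common_utilities.py in the real system):

```python
# Floors from the post: standard blocks below 30%, luxury below 60%.
TIER_FLOORS = {"essential": 0, "standard": 30, "luxury": 60}

def allowed(feature_tier, budget_pct_remaining):
    return budget_pct_remaining >= TIER_FLOORS[feature_tier]
```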
Anti-ADHD focus guard. focus_guard.py has 6 nudge types: time check (zone > threshold hours), meal reminder (last meal > 4h), calendar warning (appointment approaching), social nudge (partner home but you’re solo), break suggestion (same zone > 2h), and bedtime approach (within 1h of bedtime). The meal reminder has a follow-up: if ai_focus_meal_ask_eaten is on, it waits 2 seconds then opens the satellite mic via assist_satellite.start_conversation so you can say “I ate at 2pm.”
Self-healing. system_recovery.py — 7 recovery playbooks (pyscript reload, memory probe, config entry reload, alert-only for JSON/helpers). Circuit breaker: tracks recovery attempts per category within a 1-hour window, stops retrying after 3 attempts (configurable). system_health.py validates 7 subsystems every 30 minutes with weighted scoring.
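The circuit breaker is a sliding-window attempt counter. A sketch using the defaults from above:

```python
import time

# Per-category attempt log; trip after 3 attempts in a 1-hour window.
attempts = {}

def should_attempt(category, max_attempts=3, window=3600):
    now = time.time()
    recent = [t for t in attempts.get(category, []) if now - t < window]
    if len(recent) >= max_attempts:
        return False          # breaker open: stop retrying, alert instead
    recent.append(now)
    attempts[category] = recent
    return True
```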
What’s in the box
- 77 automation blueprints + 34 script blueprints
- 37 pyscript modules exposing 158 services
- 43 YAML packages
- 22 voice pipelines across 3 languages (English, Spanish, Catalan)
- 217 helper entities + 65 runtime state sensors
- A 6-tab management dashboard
- Comprehensive docs: architecture, 423 helpers documented, 163 per-blueprint READMEs
Installation
Available via HACS as a custom repository:
- Add https://github.com/mmadalone/Project_Fronkensteen in HACS (category: Integration)
- Install, restart
- Settings > Integrations > Add Integration > Project Fronkensteen
- 5-step setup wizard: pick your features, enter your name, select a speaker
- Restart again
The installer copies everything to the right places and lets you choose which of 8 feature groups to install. There’s also a 13-step manual install guide if you prefer.
You’ll need: Pyscript (HACS), an OpenRouter or OpenAI API key, ElevenLabs for multi-voice TTS, and ideally an Aqara FP2 for presence features. The installer bundles patched forks of both Extended OpenAI Conversation (4-layer speech sanitizer for tool-call leaks) and ElevenLabs Custom TTS (voice mood modulation) — do NOT install those from HACS. Full list: PREREQUISITES.md
What’s NOT ready yet
Full transparency:
Blueprint deployment status: Of 111 total blueprints (77 automation + 34 script), 87 have live instances running daily on my system. 24 have never been deployed — they’re built but untested. I’d love help testing them.
Deployed but untested (blocked by ElevenLabs subscription renewal — early April):
- Therapy Mode — Code complete, wired up, has a live instance, but the continuous voice conversation loop needs ElevenLabs credits I don’t have right now.
- User Interview — Same situation. 9-category preference elicitation that feeds into 14 modules. Has an instance but never run end-to-end.
Blueprints with no instance (19 automation + 5 script):
- Lighting/Scene: circadian_lighting, ambient_music_autoplay, scene_preference_apply, zone_preactivation — no color_temp bulbs to test with
- Bedtime: bedtime_winddown, bedtime_last_call, bedtime_advisory_actions, calendar_alarm, wake_up_guard_external_alarm
- Presence: away_state_actions, coming_home, routine_deviation_actions, routine_stage_actions
- Music: music_compose_batch_trigger, music_weekly_refresh, music_assistant_follow_me_idle_off, music_compose_approve
- Budget: budget_cost_alert
- Voice: voice_pe_resume_media, voice_pin_action, llm_voice_script
- Other: automation_trigger_mon, bedtime_media_play_wrapper, wakeup_chime
If you deploy any of these and they work (or don’t), I’d genuinely appreciate hearing about it.
Not fully verified:
- Offline resilience — Implemented, no recorded test results.
- Cross-user memory isolation — Implemented, needs dual-occupancy testing.
- Scene Learner — Built but no hardware to test against.
How this exists (a confession)
I should be honest: I don’t know how to code. Not really. I’m just a guy with ideas, some understanding of how software fits together, and a stubborn refusal to accept that my house can’t be smarter than me.
Every line of code in this project was generated by Claude (Anthropic). Every image by Google Gemini. I’m the architect who can’t hold a hammer — I know what I want the building to look like, I can read blueprints, I can tell you when something’s wrong, but I’m not the one pouring the concrete.
This project is a love letter to the Home Assistant community and a tribute to every developer whose work I climbed on top of. The people who built pyscript, Extended OpenAI Conversation, Music Assistant, ESPHome, microWakeWord, ElevenLabs Custom TTS, sqlite-vec, and dozens more — without them this would be nothing. I didn’t invent anything here. I just connected things that smarter people made and pointed an AI at the gaps.
If you’re someone who doesn’t code but has ideas — this is proof that the barrier is lower than you think. The tools exist. The community exists. You just need to be stubborn enough to keep asking “but what if it could also do this?”
Links
- GitHub: mmadalone/Project_Fronkensteen
- Issues / Feedback: GitHub Issues
- Custom wake words (Hey Rick / Yo Rick / Hey Quark / Yo Quark): mmadalone/microwakeword — currently using “Okay Nabu” because mine need better training
Fair warning: features are still very much work in progress and need polishing. Things will be rough around the edges. If you poke around, find bugs, have suggestions, or just want to roast the architecture — you’re more than welcome. Feedback is genuinely appreciated and kindly requested. Come break things.
Rule of Acquisition #9: Opportunity plus instinct equals profit.
