Hello everyone!
I’m new here in the community and also relatively new to using Home Assistant. If I make any mistakes in the explanation, I apologize in advance.
I’d like to share a project I’ve been developing over the last few months. I call it “Monstrão”, not because of its size, but because of the “brain” it has (and the robust look it ended up getting).
The Problem
Most smart intercoms only send a notification to your phone or make a call.
If you’re in a meeting, sleeping, or driving, the delivery driver leaves and you miss your package.
Also, I didn’t want a system that would confirm to strangers whether I’m home or not.
The Solution
I built a custom intercom using an ESP32-S3, integrated with Google Gemini and Home Assistant.
It doesn’t just ring an internal bell.
It talks to the person at the gate and makes decisions.
Hardware and “Root-Level” Engineering
Enclosure and Protection
To house everything, I used an IP65-rated enclosure to ensure weather resistance.
It may not be the most elegant design in the world, but it’s extremely functional and secure.
The Secret Behind the Sound
Since the intercom sits directly on the street, I needed the AI audio to be loud and clear.
So I did a “transplant”.
I used the full acoustic chamber from a cheap TWS Bluetooth speaker.
I removed the battery and original electronics, keeping only the 3W / 4Ω speaker inside its original acoustic chamber.
This provided sound pressure that would be impossible with a loose speaker inside the box.
I2S Audio
The audio leaves the ESP32-S3 digitally over I2S, then goes through a PCM5102A DAC and a PAM8406 amplifier. This setup eliminated the Wi-Fi interference noise.
Intelligence Highlights
Autonomous Delivery Logic
If a delivery driver arrives and the house is empty, the AI identifies who the package is for, checks groups.yaml, and negotiates:
“The resident is not home, but can you leave it at house 416 C?”
The address is only given if the driver confirms they can take it there.
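As a rough illustration of that flow (not the actual code from the repository), the decision could be sketched in Python like this. The dictionary is a hypothetical stand-in for groups.yaml, whose real structure is not shown here, and the function name, the resident key, and the spoken phrases are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical stand-in for groups.yaml: maps a recipient to a fallback address.
FALLBACK_ADDRESSES = {"resident_a": "house 416 C"}


def delivery_reply(recipient: str, house_empty: bool,
                   driver_confirmed: Optional[bool]) -> str:
    """Return the next thing the intercom says to a delivery driver.

    driver_confirmed stays None until the driver has answered the offer.
    """
    if not house_empty:
        return "One moment, please."
    fallback = FALLBACK_ADDRESSES.get(recipient)
    if fallback is None:
        # No alternative drop-off is known for this recipient.
        return "Would you like to record a message?"
    if driver_confirmed is None:
        # Make the offer without disclosing the address yet.
        return "The resident is not home. Can you leave it with a neighbor?"
    if driver_confirmed:
        # The address is only revealed after the driver agrees.
        return f"Please leave it at {fallback}."
    return "Would you like to record a message?"
```

The point of the sketch is the ordering: the address string is only ever built in the branch where the driver has already said yes.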
“Do Not Disturb” System
Imagine being tired and sleeping, and suddenly a loud announcement plays:
“THERE IS A VISITOR AT THE GATE.”
To avoid this, I created one helper per resident, named like input_boolean_nao_perturbe_nome_do_morador (where nome_do_morador is the resident’s name), so each resident can enable it when they don’t want to be disturbed.
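A minimal sketch of how those helpers could gate the announcements, written in Python just to show the logic. It assumes the helpers follow the entity ID pattern input_boolean.nao_perturbe_<name> and that their states arrive as the usual "on"/"off" strings. The resident names and the function name are made up for the example.

```python
def residents_to_announce(residents, states):
    """Return the residents whose Do Not Disturb helper is not switched on."""
    allowed = []
    for name in residents:
        entity_id = f"input_boolean.nao_perturbe_{name}"  # assumed naming pattern
        # A missing helper is treated as "off", so that resident is still notified.
        if states.get(entity_id, "off") != "on":
            allowed.append(name)
    return allowed


# Example: Ana has Do Not Disturb enabled, Bruno has no helper state at all.
states = {"input_boolean.nao_perturbe_ana": "on"}
print(residents_to_announce(["ana", "bruno"], states))
```

Treating a missing helper as "off" is a design choice for the sketch: it is safer to announce once too often than to silently drop a visitor.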
Privacy
The system never reveals names to strangers or confirms whether someone is home.
For visitors, it offers the option to record a message.
This message is transcribed by Gemini and sent as a private notification.
Access
Residents can enter using:
• NFC with their phone
• Facial recognition
The main access method is the NFC tag.
Facial recognition works as a last resort, since image processing with Gemini is slower.
Resilience
The system includes:
• Do Not Disturb mode (23:00 – 07:00)
• Offline fallback, playing pre-recorded audio messages if the internet goes down.
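Both items in the list above boil down to small checks. Here is a hedged Python sketch of them, not the real implementation. The only subtle part is that the 23:00 to 07:00 window wraps past midnight. Treating 07:00 itself as outside the window, and the names in_quiet_hours and pick_audio_source, are assumptions for the example.

```python
from datetime import time

QUIET_START = time(23, 0)
QUIET_END = time(7, 0)


def in_quiet_hours(now: time, start: time = QUIET_START,
                   end: time = QUIET_END) -> bool:
    """True if `now` falls inside the quiet window (start inclusive, end exclusive)."""
    if start <= end:
        # Ordinary window within a single day.
        return start <= now < end
    # Window wraps past midnight, e.g. 23:00 to 07:00.
    return now >= start or now < end


def pick_audio_source(internet_up: bool) -> str:
    """Choose live AI speech when online, pre-recorded audio otherwise."""
    return "gemini" if internet_up else "prerecorded"
```

For example, in_quiet_hours(time(23, 30)) and in_quiet_hours(time(6, 59)) are both true, while in_quiet_hours(time(12, 0)) is false.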
Internal Announcements
I don’t rely only on the phone.
I plugged a cheap USB sound card into the PC that runs Home Assistant and play audio through it via VLC Telnet, which allows real-time announcements inside the house.
Documentation and Code
I organized everything in my GitHub repository, including:
• hardware photos
• demonstration videos
• wiring diagram
• ESPHome YAML
• Home Assistant automations
• AI prompt
I hope you like the solution.
I’m completely open to suggestions to improve the “Monstrão.”


