I wanted to share the steps I found to build a facial recognition doorbell cam system without the usual requirement of dedicated hardware running Frigate to analyze raw video streams.
What you need:
- A cheap camera or doorbell, such as Blink or Nest
- Anything running Home Assistant, even a Raspberry Pi
- The Double Take add-on
- A Double Take detector plugin. I chose Amazon Rekognition.
- MQTT
What you don’t need:
- Frigate or any beefy CPU/GPU
The main idea here is to use the terrific Double Take add-on to manage our facial recognition training and detection, but to bypass the video processing input step (so we can do this on the cheapest hardware).
My configs contain lines for both Blink and Nest. Delete whatever you don’t need.
First of all, here is my Double Take configuration, which creates fake cameras for Blink and Nest that are really just JPEGs captured on my Home Assistant box. I chose Amazon Rekognition for my “detector” since it offloads work from my Raspberry Pi to the cloud, but feel free to try one of the others. (Rekognition requires an AWS access key. Get one by signing up for AWS and going to the IAM console.)
cameras:
  blink-front-door:
    snapshot:
      url: http://homeassistant.local:8123/local/blink-doorbell.jpg
  nest-front-door:
    snapshot:
      url: http://homeassistant.local:8123/local/nest-doorbell.jpg
detectors:
  rekognition:
    aws_access_key_id: !secret aws_access_key_id
    aws_secret_access_key: !secret aws_secret_access_key
    aws_region: us-west-2
    collection_id: double-take
    # require opencv to find a face before processing with detector
    opencv_face_required: true
mqtt:
  host: core-mosquitto.local.hass.io
  username: <xxx>
  password: <xxx>
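The two `!secret` references above need matching entries in a secrets file. Here is a minimal sketch of what those entries could look like. I am assuming the add-on reads Home Assistant's own /config/secrets.yaml (that is my understanding for the add-on install, but check the Double Take docs for your setup), and the values shown are made-up placeholders, not real keys.

```yaml
# /config/secrets.yaml (assumed location when running Double Take as an add-on)
# Placeholder values only: paste in the access key pair you created in the AWS IAM console.
aws_access_key_id: "AKIAXXXXXXXXXXXXXXXX"
aws_secret_access_key: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
```

The key names on the left must match whatever follows `!secret` in the Double Take config. You could move the MQTT username and password into the same file the same way.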
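Now the next trick is to get snapshots from your camera into the URL above (http://homeassistant.local:8123/local/blink-doorbell.jpg) and then poke Double Take’s REST API to process it. Home Assistant serves anything saved under /config/www at /local/, which is why the automations below write their JPEGs there. Here are the configuration.yaml additions I made for both Nest and Blink cameras: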
# For Nest only
homeassistant:
  allowlist_external_dirs:
    - "/config/nest/event_media"
# For Nest only
folder_watcher:
  - folder: /config/nest/event_media
    patterns:
      - "*-camera_person.jpg"
rest_command:
  # Commands to trigger Double Take processing via REST API
  process_blink_front_door:
    url: "http://homeassistant.local:3000/api/camera/blink-front-door"
    verify_ssl: false
  process_nest_front_door:
    url: "http://homeassistant.local:3000/api/camera/nest-front-door"
    verify_ssl: false
# For Nest only, to get the latest photo into the public folder
shell_command:
  cp: /bin/cp -f {{source}} {{dest}}
Then here’s the automation I created to copy Nest photos into the Double Take fake camera. (I created it in the Home Assistant UI, but I’m just dumping the YAML.)
alias: Ring doorbell facial recognition
description: ""
trigger:
  - platform: event
    event_type: folder_watcher
    event_data:
      event_type: created
condition: []
action:
  - service: shell_command.cp
    data:
      source: "{{ trigger.event.data.path }}"
      dest: /config/www/nest-doorbell.jpg
  - service: rest_command.process_nest_front_door
    data: {}
mode: single
And here is the more complicated automation for Blink, since I had to take some extra steps to get it to capture a live picture instead of serving an old cached one. (Ignore my device IDs. Create this automation through the Home Assistant UI and it will fill in your IDs.)
alias: Blink doorbell facial recognition
description: ""
trigger:
  - type: motion
    platform: device
    device_id: 7a644eb4712c4c53f97e3d7c1644a8af
    entity_id: c86282f4fa7c7ddfc5042d6dc7d6c49a
    domain: binary_sensor
condition: []
action:
  - service: blink.trigger_camera
    data: {}
    target:
      device_id: 7a644eb4712c4c53f97e3d7c1644a8af
  - delay:
      hours: 0
      minutes: 0
      seconds: 5
      milliseconds: 0
    enabled: true
  - service: blink.blink_update
    data: {}
  - service: camera.snapshot
    data:
      filename: /config/www/blink-doorbell.jpg
    target:
      device_id: 7a644eb4712c4c53f97e3d7c1644a8af
  - service: rest_command.process_blink_front_door
    data: {}
mode: single
Now all you have to do is go into the Double Take UI and create some profiles on the Training tab. When a face is seen, Double Take sends the results to Home Assistant via MQTT, which should automatically create a new entity with a name like “sensor.double_take_david” that contains the timestamp of the last time it saw David.
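Once that entity exists, you can hang an automation off it. Here is a rough sketch of one that pops up a notification whenever the sensor changes. It assumes your entity really is named sensor.double_take_david and that its state is the last-seen timestamp as described above. The alias, title, and message text are just things I made up for illustration, so adjust to taste.

```yaml
alias: Notify when David is at the door
description: ""
trigger:
  # Fires whenever the Double Take sensor's state (the last-seen timestamp) changes
  - platform: state
    entity_id: sensor.double_take_david
condition:
  # Skip the placeholder states the entity can report, e.g. after a restart
  - condition: template
    value_template: "{{ trigger.to_state.state not in ['unknown', 'unavailable'] }}"
action:
  - service: persistent_notification.create
    data:
      title: Doorbell
      message: "Double Take saw David at {{ trigger.to_state.state }}"
mode: single
```

Swap persistent_notification.create for whatever notify service you normally use if you want this on your phone instead of in the Home Assistant sidebar.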