Facial recognition without NVR or additional hardware

Hello community, I want to add local facial recognition to a camera in HA without adding hardware or degrading the performance of my existing setup. Is this a realistic requirement? How would you implement a solution?

I run HA on a Home Assistant Yellow with an 8GB CM4 Lite and a 500GB NVMe SSD. I have two cameras available to use: a Ring internal camera integrated via the Ring-MQTT add-on, and an Aqara G3 integrated via the HomeKit integration. Both devices are mains powered and expose motion sensors in HA. The Aqara exposes a camera entity directly; for the Ring, I used the Camera integration to create a camera entity from Ring-MQTT’s RTSP source URL. Online specs tell me both devices use the H.264 video codec.

I have two use cases and could use either one camera per use case or one camera for both use cases. I have no requirements per se to record a continuous camera stream.

First, I’d use a motion detection event to trigger a check for whether a specific person is present: a man (me), a woman (my spouse) or a child (one of my kids). A match would trigger personalised TTS on a Google Nest speaker.

Second, I’d use a motion detection event to trigger a check for whether any person is present, and then re-check at a fixed interval (say every 180 seconds). When a person is no longer detected, this would indicate the room is no longer occupied (and the facial recognition can stop processing video).
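To make the second use case concrete, here is a minimal Python sketch of the re-check loop. It is only an illustration of the logic, not a Home Assistant API: `occupancy_checks`, `person_present` and `sleep` are hypothetical names, and `person_present` stands in for whatever ends up grabbing a snapshot and running detection on it.

```python
from typing import Callable


def occupancy_checks(
    person_present: Callable[[], bool],
    sleep: Callable[[float], None],
    interval_s: float = 180.0,
    max_checks: int = 100,
) -> int:
    """Poll after a motion event until no person is detected.

    person_present: hypothetical callable that takes a snapshot and
        returns True if a person is found in it.
    sleep: injected so the loop can be tested without real waiting
        (pass time.sleep in real use).
    Returns the number of checks performed.
    """
    checks = 0
    while checks < max_checks:  # safety cap so the loop always ends
        checks += 1
        if not person_present():
            break  # room has cleared, stop processing
        sleep(interval_s)  # wait before the next re-check
    return checks
```

In real use this would be started by the motion sensor's state change, with `time.sleep` (or an automation delay) in place of the injected `sleep`.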

Both use cases are for when we’re at home by ourselves and we have no pets, so in theory each motion detection event can only have been triggered by one of us.

I’m assuming I need to run facial recognition on still images at spaced-out intervals (should I look at using DeepStack?). For my use cases it’s fine if it takes many seconds from a motion detection event to an image being delivered for recognition and a person recognition event finally registering in HA. So will my hardware resource requirements be a fraction of those for, say, running Frigate on a live stream?
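As a rough back-of-the-envelope check on that question, this Python sketch compares how many images per hour each approach would hand to a detector while the room is occupied. The 5 fps figure for a live-stream detector is my assumption, not a measured value, and the comparison ignores per-image cost (face recognition on one still may cost more than object detection on one frame), so treat it as an order-of-magnitude guide only.

```python
def frames_per_hour_snapshots(interval_s: float) -> float:
    """Images analysed per hour when taking one snapshot every interval_s seconds."""
    return 3600 / interval_s


def frames_per_hour_stream(fps: float) -> float:
    """Frames analysed per hour when running detection on a stream at fps."""
    return 3600 * fps


snapshots = frames_per_hour_snapshots(180)  # one still every 180 seconds
stream = frames_per_hour_stream(5)          # ASSUMED 5 fps detection rate
print(f"snapshots/hour: {snapshots:.0f}")
print(f"stream frames/hour: {stream:.0f}")
print(f"ratio: {stream / snapshots:.0f}x")
```

Under those assumptions that is 20 stills per hour versus 18,000 stream frames per hour.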

Additional details: the Ring-MQTT docs state “Ring enforces a time limit on active live streams and terminates them, typically after approximately 10 minutes, although sometimes significantly less and sometimes a little more.” and “Home Assistant uses the LL-HLS protocol for streaming in the web browser which splits the existing stream into segments for delivery over HTTP/HTTPS. While HLS streaming is extremely reliable and widely compatible with various web browsers and network setups, it typically adds 3-5 seconds of delay”.

FWIW, Ring-MQTT exposes a snapshot sensor and controls for the Snapshot mode and Live stream in HA (see screenshot below), but the video quality on the Aqara is better and it has a slightly wider viewing angle, which is useful. I have successfully run the camera.snapshot service on both devices but, so far, the camera.record service only works on the Ring device; on the Aqara it fails with “Failed to call service camera.record. camera.kitchen_camera_g3 does not support record service”.
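Since camera.snapshot works on both devices, an external script could trigger it through Home Assistant's REST API (`POST /api/services/<domain>/<service>` with a long-lived access token). The Python sketch below only builds the request, it does not send it. The base URL, token and file path are placeholders I made up, `build_snapshot_request` is a hypothetical helper, and I believe the target folder also has to be allowed in HA's `allowlist_external_dirs`.

```python
import json
import urllib.request


def build_snapshot_request(
    base_url: str, token: str, entity_id: str, filename: str
) -> urllib.request.Request:
    """Build (but do not send) a REST call to the camera.snapshot service."""
    body = json.dumps({"entity_id": entity_id, "filename": filename}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/services/camera/snapshot",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",  # long-lived access token
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values: adjust host, token and path for a real setup.
request = build_snapshot_request(
    "http://homeassistant.local:8123",
    "YOUR_LONG_LIVED_TOKEN",
    "camera.kitchen_camera_g3",
    "/config/www/snapshots/kitchen.jpg",
)
# urllib.request.urlopen(request) would actually send it.
```

The saved still could then be passed to whichever recognition service is chosen, which avoids relying on camera.record for the Aqara.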

I don’t believe it will be possible on that hardware. But perhaps you can add a Coral TPU.
