Home safety and security are a high priority for most households. With machine learning and deep learning, we can detect people in video frames and track them by analyzing a live CCTV camera feed. We can also recognize whether a detected person is a member of the household or someone else, such as a delivery person or a domestic worker. This helps detect threats or theft by non-family members and lets the system send an alert to the homeowner.
Whenever the system detects a person in a frame of the live stream, it searches the detected face against a collection of known faces, i.e., the faces of all family members. If the detected person is not a family member, it can send a voice notification to Alexa.
Infrastructure Overview
The system architecture is built on several Amazon Web Services: Amazon Kinesis Video Streams, Amazon Rekognition, AWS Lambda, and Amazon Simple Notification Service (SNS).
Amazon Kinesis Video Streams
Amazon Kinesis Video Streams lets us securely ingest live video streams from connected devices for video analytics, machine learning (ML), and other processing. It also supports video playback for live and on-demand viewing. With Kinesis Video Streams, we can build applications with ultra-low-latency live streaming for video analysis, such as face recognition and verification, baby monitoring, home surveillance, and traffic monitoring.
Please refer to this tutorial to learn about live video streaming from a CCTV IP camera. It shows how to ingest the live CCTV camera feed into a Kinesis video stream.
Amazon Rekognition
Amazon Rekognition is a deep-learning-powered video analysis service that tracks people, detects activities, and recognizes objects in live streams, returning labels for activities, people, faces, and objects along with timestamps. With Amazon Rekognition Video, you receive a detailed analysis of each detected face, including the coordinates of its bounding box, confidence scores, facial attributes, and pose.
For face identification, we need a collection of known faces. Amazon Rekognition searches each input face against this collection.
Create a face collection –
This is a private collection that holds facial information for the known people we want Amazon Rekognition to recognize. We create a face collection with the AWS Command Line Interface (CLI) by specifying the collection's name. The following command creates a face collection called my-family:
$ aws rekognition create-collection --collection-id my-family
Response:
{ "StatusCode": 200, "CollectionArn": "aws:rekognition:us-east-2:291596880574:collection/my-family", "FaceModelVersion": "5.0" }
Add faces to the collection –
The following AWS CLI command adds the face from an image stored in an Amazon S3 bucket to the face collection:
$ aws rekognition index-faces --collection-id my-family --image '{"S3Object":{"Bucket":"BucketName","Name":"Path_to_Image"}}' --external-image-id "PersonName"
Response:
{ "FaceRecords": [ { "Face": { "FaceId": "3ea1d3bb-f95a-4927-9b50-6e3f09b83442", "BoundingBox": { "Width": 0.2959272563457489, "Height": 0.33254632353782654, "Left": 0.2970787584781647, "Top": 0.17190377414226532 }, "ImageId": "e8746906-f2dd-333a-a341-33bdb0deee3d", "ExternalImageId": "Bhavika", "Confidence": 99.99665832519531 }, "FaceDetail": { "BoundingBox": { "Width": 0.2959272563457489, "Height": 0.33254632353782654, "Left": 0.2970787584781647, "Top": 0.17190377414226532 }....
Amazon Rekognition doesn’t save the actual face images in the collection. Instead, it extracts facial information from the face images and stores this information in a database. This facial information is used in searching a collection for a matching face.
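To make the shape of the index-faces response a little more concrete, here is a minimal Python sketch that pulls the face ID, external image ID, and confidence out of such a response. The helper name summarize_indexed_faces and the trimmed sample dictionary are assumptions made for illustration; only the field names are taken from the response shown above.

```python
def summarize_indexed_faces(response):
    """Return (FaceId, ExternalImageId, Confidence) for each indexed face."""
    summary = []
    for record in response.get("FaceRecords", []):
        face = record["Face"]
        summary.append(
            (face["FaceId"], face.get("ExternalImageId"), face["Confidence"])
        )
    return summary


# Trimmed copy of the response shown above (bounding boxes omitted).
sample_response = {
    "FaceRecords": [
        {
            "Face": {
                "FaceId": "3ea1d3bb-f95a-4927-9b50-6e3f09b83442",
                "ImageId": "e8746906-f2dd-333a-a341-33bdb0deee3d",
                "ExternalImageId": "Bhavika",
                "Confidence": 99.99665832519531,
            }
        }
    ]
}

for face_id, name, confidence in summarize_indexed_faces(sample_response):
    print(name, face_id, confidence)
```

Keeping the external image ID next to the face ID is handy later, because the face ID alone does not tell us which family member was matched.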
Create the Stream Processor –
Amazon Rekognition provides a stream processor that reads live video from a Kinesis video stream and writes the analysis results to a Kinesis data stream.
In the section above, we created the Kinesis video stream. Now, let's create the Kinesis data stream.
Go to the Amazon Kinesis Data Streams console and create a new data stream by entering a stream name and the number of shards you need.
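If you would rather script this step than use the console, the following is a rough sketch of the same thing in Python. It assumes you pass in a boto3 Kinesis client (for example boto3.client("kinesis")); the function name create_data_stream and the default of one shard are illustrative choices, not part of the original setup.

```python
def create_data_stream(kinesis_client, stream_name, shard_count=1):
    """Create a Kinesis data stream and return its ARN once it exists.

    kinesis_client is expected to be a boto3 Kinesis client; it is passed in
    so that this sketch does not depend on AWS credentials at import time.
    """
    kinesis_client.create_stream(StreamName=stream_name, ShardCount=shard_count)
    # Stream creation is asynchronous, so wait until the stream is usable.
    kinesis_client.get_waiter("stream_exists").wait(StreamName=stream_name)
    summary = kinesis_client.describe_stream_summary(StreamName=stream_name)
    return summary["StreamDescriptionSummary"]["StreamARN"]
```

The returned ARN is the value the stream processor needs for its output stream.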
Now we are ready to create a stream processor. The stream processor holds the details of the Kinesis video stream it reads video from and the Kinesis data stream it writes the analysis results to.
The following AWS CLI command creates a stream processor:
$ aws rekognition create-stream-processor \
    --input '{"KinesisVideoStream":{"Arn":"<Kinesis video stream ARN>"}}' \
    --name my-stream-processor \
    --role-arn "<role ARN>" \
    --stream-processor-output '{"KinesisDataStream":{"Arn":"<Kinesis data stream ARN>"}}' \
    --settings '{"FaceSearch":{"CollectionId":"my-family", "FaceMatchThreshold": 85.5}}'
Response:
{ "StreamProcessorArn": "arn:aws:rekognition:us-west-2:123456789012:streamprocessor/my-stream-processor" }
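For readers who prefer Python over the CLI, the sketch below builds the same request as a dictionary of keyword arguments for boto3's create_stream_processor call. The function name build_stream_processor_request and the placeholder ARNs are assumptions for illustration; substitute the ARNs of your own streams and IAM role.

```python
import json


def build_stream_processor_request(name, video_stream_arn, data_stream_arn,
                                   role_arn, collection_id, threshold=85.5):
    """Build keyword arguments for Rekognition's create_stream_processor."""
    return {
        "Name": name,
        "Input": {"KinesisVideoStream": {"Arn": video_stream_arn}},
        "Output": {"KinesisDataStream": {"Arn": data_stream_arn}},
        "RoleArn": role_arn,
        "Settings": {
            "FaceSearch": {
                "CollectionId": collection_id,
                # Minimum similarity (0-100) for a face to count as a match.
                "FaceMatchThreshold": threshold,
            }
        },
    }


# Placeholder ARNs, mirroring the placeholders in the CLI command above.
request = build_stream_processor_request(
    name="my-stream-processor",
    video_stream_arn="<Kinesis video stream ARN>",
    data_stream_arn="<Kinesis data stream ARN>",
    role_arn="<role ARN>",
    collection_id="my-family",
)
print(json.dumps(request, indent=2))
```

With boto3 installed and credentials configured, the request could then be sent with boto3.client("rekognition").create_stream_processor(**request).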
Start Stream Processor –
The following start-stream-processor command starts the specified video stream processor.
$ aws rekognition start-stream-processor --name my-stream-processor
Amazon Rekognition can now analyze the video from the Kinesis video stream and write the analysis results to the Kinesis data stream. The Rekognition service processes frames from the Kinesis video stream and matches the detected faces against the people in the face collection.
Creating a Lambda function to process Kinesis Data Stream –
AWS Lambda is a serverless compute service that runs code without provisioning or managing servers. It lets us run and scale code for virtually any type of application or backend service on high-availability compute infrastructure.
Our Lambda function processes the JSON output that Amazon Rekognition writes to the Kinesis data stream.
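A minimal sketch of such a handler is shown below; it is one possible shape, not necessarily the exact function used in this project. It assumes the Lambda function is triggered by the Kinesis data stream, that each record carries the stream processor's JSON output with a FaceSearchResponse list, and that the SNS topic ARN is supplied through a hypothetical ALERT_TOPIC_ARN environment variable. The helper name find_unknown_faces is also an illustrative choice.

```python
import base64
import json
import os


def find_unknown_faces(event):
    """Return the DetectedFace entries that matched nobody in the collection."""
    unknown = []
    for record in event.get("Records", []):
        # Kinesis delivers each record's payload base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        for result in payload.get("FaceSearchResponse", []):
            # An empty MatchedFaces list means no known face was similar enough.
            if not result.get("MatchedFaces"):
                unknown.append(result.get("DetectedFace", {}))
    return unknown


def lambda_handler(event, context):
    """Publish an SNS alert when a face outside the collection is seen."""
    unknown = find_unknown_faces(event)
    if unknown:
        import boto3  # imported lazily; available in the Lambda Python runtime

        boto3.client("sns").publish(
            TopicArn=os.environ["ALERT_TOPIC_ARN"],  # hypothetical variable name
            Subject="Unknown person detected",
            Message=f"{len(unknown)} unrecognized face(s) seen on the camera feed.",
        )
    return {"unknown_faces": len(unknown)}
```

Separating the parsing logic from the SNS call keeps find_unknown_faces easy to test locally with a hand-built event, without AWS credentials.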