Objective
The objective of the project is to detect a human face in an image taken from an image acquisition device and then to match that face against the images stored in a database.
Introduction
The Problem
Face recognition has recently been attracting much attention in the network multimedia information access community. Areas such as network security, content indexing and retrieval, and video compression benefit from face recognition technology because "people" are the center of attention in much video content. Network access control via face recognition not only makes it virtually impossible for hackers to steal a user's "password", but also improves the user-friendliness of human-computer interaction. Indexing and/or retrieving video data based on the appearance of particular persons is useful for users such as news reporters, political scientists, and moviegoers. For videophone and teleconferencing applications, face recognition also supports a more efficient coding scheme.
The project is divided into two stages. The first stage is to detect a human face in a given image captured by the webcam; this stage is called face detection. In the second stage we match the captured image against our stored database of images; this stage is called face recognition.
Face detection is a technique for automatically locating human faces in digital images. The system relies on a two-step process: it first detects regions that are likely to contain human skin in the color image, and then extracts information from these regions that might indicate the location of a face. Skin detection is performed using a skin filter that relies on color and texture information. Face detection is then performed on a grayscale image containing only the detected skin areas, where a combination of thresholding and mathematical morphology is used to extract object features that indicate the presence of a face. As test results show, the face detection process works predictably and fairly reliably. In this project the skin filter was designed using basic mathematical and image-processing functions in MATLAB, and modifications to the filter algorithm were made to offer a subjective improvement in the output. The second step takes the marked skin regions and removes the darkest and brightest regions from the map.
[Figure: block diagram of face recognition, showing the stages image acquisition, smoothing, detection, and recognition, together with the skin colour, skin region, face, image database, and detected area blocks]
BLOCK DIAGRAM OF FACE RECOGNITION
DESCRIPTION
IMAGE ACQUISITION
The basic goal of this step is to acquire an image from a webcam installed on the user's system. Some basic commands for acquiring an image are described as follows.
1. The first step is to get information about the adaptors installed on the user's computer; for this, the following command is typed at the MATLAB prompt.
imaqhwinfo
Output of this command
InstalledAdaptors: {'coreco' 'winvideo'}
MATLABVersion: '7.6 (R2008a)'
ToolboxName: 'Image Acquisition Toolbox'
ToolboxVersion: '3.1 (R2008a)'
Looking at the output we infer that 'coreco' and 'winvideo' are the installed adaptors for the webcam. We select the 'winvideo' adaptor for further processing. To get the device ID of the installed adaptor, we simply type the following command at the MATLAB prompt.
imaqhwinfo('winvideo')
Result of this command
AdaptorDllName: [1x81 char]
AdaptorDllVersion: '3.1 (R2008a)'
AdaptorName: 'winvideo'
DeviceIDs: {[1]}
DeviceInfo: [1x1 struct]
From the result of this command we see that the device ID for this installed adaptor is 1. With this device ID and the installed adaptor we can now communicate with our webcam.
2. Determining the video format
To determine which video formats an image acquisition device supports,
look in the DeviceInfo field of the data returned by imaqhwinfo. The
DeviceInfo field is a structure array where each structure provides
information about a particular device. To view the device information for a
particular device, you can use the device ID as a reference into the structure
array.
imaqhwinfo('winvideo',1)
Result of this command
DefaultFormat: 'YUY2_160x120'
DeviceFileSupported: 0
DeviceName: 'HP Webcam'
DeviceID: 1
ObjectConstructor: 'videoinput('winvideo', 1)'
SupportedFormats: {1x5 cell}
From the result of this command we infer that the default video format of our webcam is 'YUY2_160x120' (one of the five supported formats).
3. Create a Video Input Object
In this step we create the video input object that the toolbox uses to
represent the connection between MATLAB and an image acquisition
device. Using the properties of a video input object, we can control many
aspects of the image acquisition process.
v = videoinput('winvideo',1,'YUY2_160x120')
Result of this command
Summary of Video Input Object Using 'HP Webcam'.
Acquisition Source(s): input1 is available.
Acquisition Parameters: 'input1' is the current selected source.
10 frames per trigger using the selected source.
'YUY2_160x120' video data to be logged upon START.
Grabbing first of every 1 frame(s).
Log data to 'memory' on trigger.
Trigger Parameters: 1 'immediate' trigger(s) on START.
Status: Waiting for START.
0 frames acquired since starting.
0 frames available for GETDATA.
This command creates a video input object 'v', which will be used to activate the webcam and take a snapshot. For this purpose we type the following command
preview(v);
After the successful execution of this command the webcam is automatically activated for taking the snapshot.
To take the snapshot, type the following command
d=getsnapshot(v);
As this command is executed, an image is captured by the webcam and stored in the image array 'd'. To view this image, type the following command at the MATLAB prompt.
figure,imshow(d);
The image shown above is in YCbCr format.
YCbCr Format
YCbCr or Y′CbCr, sometimes written YCBCR or Y′CBCR, is a family of color spaces used as part of the color image pipeline in video and digital photography systems. Y′ is the luma component, and Cb and Cr are the blue-difference and red-difference chroma components. Y′ (with prime) is distinguished from Y, which denotes luminance: light intensity is non-linearly encoded using gamma correction.
Y′CbCr is not an absolute color space; it is a way of encoding RGB information. The actual color displayed depends on the actual RGB colorants used to display the signal.
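The relationship between the two encodings can be made concrete. The following Python/NumPy sketch shows the widely used ITU-R BT.601 "studio swing" RGB-to-YCbCr conversion (the same convention MATLAB's rgb2ycbcr follows); the function name is our own, for illustration only:

```python
import numpy as np

# ITU-R BT.601 "studio swing" RGB -> YCbCr conversion.
# rgb: floats in [0, 1]; returns Y in [16, 235] and Cb, Cr in [16, 240].
def rgb_to_ycbcr(rgb):
    r, g, b = rgb
    y  =  16.0 +  65.481 * r + 128.553 * g +  24.966 * b
    cb = 128.0 -  37.797 * r -  74.203 * g + 112.000 * b
    cr = 128.0 + 112.000 * r -  93.786 * g -  18.214 * b
    return np.array([y, cb, cr])

print(rgb_to_ycbcr((1.0, 1.0, 1.0)))  # white -> [235. 128. 128.]
print(rgb_to_ycbcr((0.0, 0.0, 0.0)))  # black -> [ 16. 128. 128.]
```

Note that neutral greys map to Cb = Cr = 128, which is why the chroma channels separate colour information from brightness.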
To convert the above image into RGB format, simply type the following commands at the MATLAB prompt.
d=ycbcr2rgb(d);
figure,imshow(d);
RGB Format
The RGB color model is an additive color model in which red, green, and blue light are added together in various ways to reproduce a broad array of colors. The name of the model comes from the initials of the three additive primary colors: red, green, and blue.
The main purpose of the RGB color model is the sensing, representation, and display of images in electronic systems, such as televisions and computers, though it has also been used in conventional photography. Even before the electronic age, the RGB color model had a solid theory behind it, based on the human perception of colors.
RGB is a device-dependent color model: different devices detect or reproduce a given RGB value differently, since the color elements (such as phosphors or dyes) and their response to the individual R, G, and B levels vary from manufacturer to manufacturer, or even within the same device over time. Thus an RGB value does not define the same color across devices without some kind of color management.
Creating The Database
The purpose of creating a database is to match the snapshot against the images stored in the database.
The code for creating the database is as follows:
rootname='ah';
extension='.jpg';
for i=1:10
    filename=[rootname,int2str(i),extension];
    pause(1);              % wait one second between snapshots
    d=getsnapshot(v);      % capture a frame from the webcam
    d=ycbcr2rgb(d);        % convert the YCbCr frame to RGB
    figure,imshow(d);
    imwrite(d,filename);   % save the image into the database
end
Explanation
Here rootname='ah' specifies the base name of each file that will be present in the database, and extension='.jpg' specifies the file format in which the files will be saved.
A for loop then iterates ten times; the size of the database can be increased by increasing the number of iterations. On each iteration a snapshot is taken from the webcam after an interval of one second, converted into an RGB image, and finally saved into the database with the command imwrite(d,filename).
[Figure: the image database]
Processing The Database
In this step every image present in the database is normalized to compensate for the effects of lighting and background, and is then processed to detect the skin region. After the skin region has been detected, the image is smoothed to reduce the effect of noise, and possible face candidates are located. The next step is to detect the mouth and eye regions in the image; if both are found, we infer that a face has been found, and the coordinates of the region are stored. The detected region is cropped, resized, and finally saved in our database.
STEP 1: Light Compensation and Normalization
Faces=[];
numFaceFound=0;
for i=1:10
filename=[rootname,int2str(i),extension];
I=imread(filename);
x=I;
I=double(I);
H=size(I,1);
W=size(I,2);
R=I(:,:,1);
G=I(:,:,2);
B=I(:,:,3);
YCbCr=rgb2ycbcr(I);
Y=YCbCr(:,:,1);
minY=min(min(Y));
maxY=max(max(Y));
Y=255.0*(Y-minY)./(maxY-minY);
YEye=Y;
Yavg=sum(sum(Y))/(W*H);
T=1;
if (Yavg<64)
T=1.4;
elseif (Yavg>192)
T=0.6;
end
if (T~=1)
RI=R.^T;
GI=G.^T;
else
RI=R;
GI=G;
end
C=zeros(H,W,3);
C(:,:,1)=RI;
C(:,:,2)=GI;
C(:,:,3)=B;
figure,imshow(C/255);
title('Lighting compensation');
Explanation
In this step the file is first read using the imread() function. The height and width of the image are stored in the variables 'H' and 'W' respectively, and the red, green, and blue components of the image are stored in 'R', 'G', and 'B'. The image 'I' is converted into YCbCr format and stored in the variable 'YCbCr', and the luma (brightness) component of this YCbCr image is stored in the variable 'Y'. The minimum and maximum values of 'Y' are stored in 'minY' and 'maxY', and these are used to normalize the image to a 0-255 scale; the normalized result is stored back in 'Y'. A variable 'T' is initialized to 1. If the average value of Y is less than 64, the image is dark, so to increase its brightness T is raised to 1.4; if the average value of Y is greater than 192, the image is very bright, so to reduce its brightness T is lowered to 0.6.
If the value of 'T' is not equal to 1, the red and green components are raised to the power of T.
Finally a zero array 'C' of the same size as the image is created, and its red, green, and blue channels are set to RI, GI, and B respectively.
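The brightness test above can be sketched in Python/NumPy for clarity (an illustrative translation of the MATLAB logic, not the project's actual code; the function name choose_T is ours):

```python
import numpy as np

# Sketch of the report's exposure test: normalise luma to [0, 255],
# then pick the exponent T from the average normalised luma.
# T > 1 brightens a dark image, T < 1 darkens a bright one.
def choose_T(Y):
    Y = Y.astype(np.float64)
    Yn = 255.0 * (Y - Y.min()) / (Y.max() - Y.min())
    Yavg = Yn.mean()
    if Yavg < 64:
        return 1.4      # dark image: brighten
    elif Yavg > 192:
        return 0.6      # bright image: darken
    return 1.0          # leave unchanged

print(choose_T(np.array([0, 0, 0, 255])))           # mostly dark -> 1.4
print(choose_T(np.array([0, 255, 255, 255, 255])))  # mostly bright -> 0.6
```

Raising the pixel values to a power T > 1 pushes them above the display range, so the compensated image appears brighter when shown with imshow(C/255).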
STEP 2: Extracting Skin
YCbCr=rgb2ycbcr(C);
Cr=YCbCr(:,:,3);
S=zeros(H,W);
[SkinIndexRow,SkinIndexCol] =find(10<Cr & Cr<45);
for i=1:length(SkinIndexRow)
S(SkinIndexRow(i),SkinIndexCol(i))=1;
end
figure,imshow(S);
title('skin');
The image is now converted back to YCbCr format because lighting has the least effect on the Cb and Cr components. We choose the Cr component for skin-region detection. First we create an image array of the same dimensions as our image, initialized to 0.
By experiment we found that the value of the Cr component for normal skin under normal conditions ranges from 10 to 45. So we search for coordinates where the value of Cr is greater than 10 but less than 45; the row numbers are stored in SkinIndexRow and the column numbers in SkinIndexCol. Next, the pixels whose Cr value lies between 10 and 45 are turned white (1). One such figure is shown below.
[Figure: skin]
STEP 3: Removing Noise
SN=zeros(H,W);
for i=1:H-5
for j=1:W-5
localSum=sum(sum(S(i:i+4, j:j+4)));
SN(i:i+5, j:j+5)=(localSum>12);
end
end
figure,imshow(SN);
title('skin with noise removal');
There may be some stray white dots outside the face region, or even black dots inside it, so we need to smooth this image. We again create an image array of the same height and width, initialized to black (0). We compute the local sum of each 5x5 block, and if the sum is greater than 12 the corresponding pixels are turned white. This step essentially removes noise from the skin region, so we get an image of skin with the noise removed, as shown below.
[Figure: skin with noise removal]
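The 5x5 local-sum smoothing can likewise be sketched in Python/NumPy (an illustrative translation of the MATLAB loop above, mirroring its behaviour, including the slightly larger write window; the function name is ours):

```python
import numpy as np

# Sketch of the report's noise removal: for each 5x5 window of the
# skin mask S, count the skin pixels; if more than 12 of the 25 are
# skin, mark the corresponding output window as skin in SN.
def remove_noise(S, block=5, thresh=12):
    H, W = S.shape
    SN = np.zeros((H, W))
    for i in range(H - block):
        for j in range(W - block):
            local_sum = S[i:i+block, j:j+block].sum()
            SN[i:i+block+1, j:j+block+1] = float(local_sum > thresh)
    return SN

S = np.zeros((20, 20)); S[5, 5] = 1           # one stray white dot
print(remove_noise(S).sum())                  # isolated dot removed -> 0.0
print(remove_noise(np.ones((20, 20))).sum())  # solid region kept -> 400.0
```

An isolated white pixel never contributes more than 1 to any 5x5 sum, so it is wiped out, while a solid skin region easily exceeds the threshold of 12 and survives.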
STEP 4: Finding Skin Color Blocks
L = bwlabel(SN,8);
BB = regionprops(L, 'BoundingBox');
bboxes= cat(1, BB.BoundingBox);
widths=bboxes(:,3);
heights=bboxes(:,4);
hByW=heights./widths;
lenRegions=size(bboxes,1);
foundFaces=zeros(1,lenRegions);
rgb=label2rgb(L);
figure,imshow(rgb);
title('face candidates');
There may be more than one face in an image, although this is not permitted in our case because we would then be unable to recognize a single user. We label the different skin regions and show them as possible face candidates. For this, three MATLAB built-in functions are used: bwlabel, regionprops, and cat. bwlabel() labels the different regions, regionprops() measures a set of properties for each labeled region, and cat(dim,array) concatenates arrays along the dimension dim.
The heights and widths of these regions are stored, and the height-to-width ratios are calculated for use in the next part of the code. L contains the labeled regions; it is converted to RGB and displayed. One example is shown below.
for i=1:lenRegions
% 1st criterion: height-to-width ratio, computed above.
if (hByW(i)>1.75 || hByW(i)<0.75)
% this cannot be a face region; discard
continue;
end
% Impose a minimum face dimension constraint
if (heights(i)<20 && widths(i)<20)
% this cannot be a face region. discard
continue;
end
The code is largely self-explanatory: if the height-to-width ratio is greater than 1.75 or less than 0.75 (values determined by experiment), the region cannot be a face and is discarded, and we move on to the next region. Secondly, if the region is smaller than 20x20 pixels it is again discarded. If a region satisfies both conditions, its x and y coordinates and its height and width are found to form a bounding box.
% get current region's bounding box
CurBB=bboxes(i,:);
XStart=CurBB(1);
YStart=CurBB(2);
WCur=CurBB(3);
HCur=CurBB(4);
This part of the code performs the above task: XStart and YStart store the starting point of the box, and WCur and HCur store the dimensions of the current box.
% crop current region
rangeY=int32(YStart):int32(YStart+HCur-1);
rangeX= int32(XStart):int32(XStart+WCur-1);
RIC=RI(rangeY, rangeX);
GIC=GI(rangeY, rangeX);
BC=B(rangeY, rangeX);
figure, imshow(RIC/255);
title('Possible face channel');
Now the identified box is cropped from the main image and shown below.
[Figure: possible face channel]
% 2nd criterion: existence & localisation of mouth
M=zeros(HCur, WCur);
theta=acos( 0.5.*(2.*RIC-GIC-BC) ./ sqrt( (RIC-GIC).*(RIC-GIC) + (RIC-BC).*(GIC-BC) ) );
theta(isnan(theta))=0;
thetaMean=mean2(theta);
[MouthIndexRow,MouthIndexCol] =find(theta<thetaMean/4);
for j=1:length(MouthIndexRow)
M(MouthIndexRow(j),MouthIndexCol(j))=1;
end
Hist=zeros(1, HCur);
for j=1:HCur
Hist(j)=length(find(M(j,:)==1));
end
wMax=find(Hist==max(Hist));
wMax=wMax(1); % just take one of them.
if (wMax < WCur/6)
%reject due to not existing mouth
continue;
end
figure, imshow(M);
title('Mouth map');
Again an image array M is created. The variable theta stores the inverse cosine of 0.5*(2*RIC-GIC-BC) ./ sqrt((RIC-GIC).^2 + (RIC-BC).*(GIC-BC)), the expression given in the original paper "A simple and accurate face detection algorithm in complex background" by Yu-Tang Pai, Shanq-Jang Ruan, Mon-Chau Shie, and Yi-Chi Liu.
This part applies adaptive thresholding on theta: pixels whose theta falls below mean2(theta)/4 are marked as mouth candidates.
A row histogram of the mouth map is then computed and the row with the maximum count, wMax, is taken; if wMax is less than one sixth of the region width WCur, no mouth is considered to exist and the region is discarded.
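The angle theta used here is the hue angle of the HSI colour model: lip pixels are redder than the surrounding skin, so their theta is small. A Python/NumPy sketch of the formula (our own function name; the same expression as the MATLAB line above):

```python
import numpy as np

# theta = arccos( 0.5*(2R - G - B) / sqrt((R-G)^2 + (R-B)(G-B)) ),
# the HSI hue angle. Reddish pixels (lips) have a small theta, so
# the report keeps pixels with theta below mean(theta)/4.
def hue_angle(R, G, B):
    num = 0.5 * (2.0 * R - G - B)
    den = np.sqrt((R - G) ** 2 + (R - B) * (G - B))
    with np.errstate(invalid='ignore', divide='ignore'):
        theta = np.arccos(num / den)
    return np.nan_to_num(theta)  # grey pixels (den == 0) -> 0, as in the MATLAB code

print(np.degrees(hue_angle(1.0, 0.0, 0.0)))  # pure red   -> 0.0
print(np.degrees(hue_angle(0.0, 1.0, 0.0)))  # pure green -> ~120
```

Pure red gives an angle of 0, green about 120 degrees, which is why a low threshold on theta isolates the reddest pixels of the face candidate.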
Then we check for the existence of eyes.
eyeH=HCur-wMax;
eyeW=WCur;
YC=YEye(YStart:YStart+eyeH-1, XStart:XStart+eyeW-1);
E=zeros(eyeH,eyeW);
[EyeIndexRow,EyeIndexCol] =find(65<YC & YC<80);
for j=1:length(EyeIndexRow)
E(EyeIndexRow(j),EyeIndexCol(j))=1;
end
EyeExist=find(Hist>0.3*wMax);
if (~(length(EyeExist)>0))
continue;
end
foundFaces(i)=1;
numFaceFound=numFaceFound+1;
end
The eye map E marks pixels of the upper part of the region whose normalized luma value lies between 65 and 80; if enough such evidence is found, we infer that eyes exist.
If both a mouth and eyes are found in this region, the number of faces found is incremented by 1.
Face Recognition
The basic goal of this step is to compare the captured image with the images stored in the database; if the captured image matches any of the stored images, we conclude that the face has been recognized.
To recognize the captured image against the database of stored images, the following steps are followed.
STEP 1: Initializing the Webcam
In this step the webcam is initialized and captures the image of the person whose face is to be matched against the stored images of the database.
To initialize the webcam, type the following commands at the MATLAB prompt.
imaqhwinfo
Output of this command
InstalledAdaptors: {'coreco' 'winvideo'}
MATLABVersion: '7.6 (R2008a)'
ToolboxName: 'Image Acquisition Toolbox'
ToolboxVersion: '3.1 (R2008a)'
imaqhwinfo('winvideo')
Output of this command
AdaptorDllName: [1x81 char]
AdaptorDllVersion: '3.1 (R2008a)'
AdaptorName: 'winvideo'
DeviceIDs: {[1]}
DeviceInfo: [1x1 struct]
imaqhwinfo('winvideo',1)
Output of this command
DefaultFormat: 'YUY2_160x120'
DeviceFileSupported: 0
DeviceName: 'HP Webcam'
DeviceID: 1
ObjectConstructor: 'videoinput('winvideo', 1)'
SupportedFormats: {1x5 cell}
v = videoinput('winvideo',1,'YUY2_160x120')
This command creates a video input object 'v' which will be used later to take the snapshot.
Result of this command
Summary of Video Input Object Using 'HP Webcam'.
Acquisition Source(s): input1 is available.
Acquisition Parameters: 'input1' is the current selected source.
10 frames per trigger using the selected source.
'YUY2_160x120' video data to be logged upon START.
Grabbing first of every 1 frame(s).
Log data to 'memory' on trigger.
Trigger Parameters: 1 'immediate' trigger(s) on START
Status: Waiting for START
0 frames acquired since starting.
0 frames available for GETDATA
After the successful execution of the above commands, an image is captured which will be used to match against the stored images of the database.
STEP 2: Image Processing on the Captured Image
The main aim of this step is to process the captured image through a sequence of steps before matching it with the stored images of the database.
The sequence of image-processing steps is as follows:
a) SMOOTHING
During this step the image is normalized to compensate for the effect of lighting in order to produce a clear image.
b) SKIN
During this step the skin part of the image is detected and converted into a black-and-white image.
c) SKIN REGION
During this step the possible face candidates in the image are found; the face candidate colored blue has the highest probability of being a face.
d) Detecting Face
During this step the face in the captured image is detected; it will be used for matching against the stored images of the database.
e) Cropping the Face Region
The basic goal of this step is to crop the detected face region, which will be used for matching against the stored images of the database.
f) Resizing the Image
Face recognition can only be performed on images of the same size; otherwise an error will occur. To prevent such errors, the captured image is resized to the same scale as the stored images in the database.
After the captured image has gone through all the above-mentioned steps, it is matched against the stored images of the database; if it matches any of the stored images, we conclude that the face has been recognized successfully.
To match the face against the stored images of the database, type the following code at the MATLAB prompt.
im2=imread('2.jpg');   % the captured image to be matched
for i=1:30
    filename=[rootname,int2str(i),extension];
    im1=imread(filename);
    % cast to double before subtracting: uint8 arithmetic clips
    % negative differences to zero
    D = sqrt(sum((double(im2(:)) - double(im1(:))).^2)) / sqrt(sum(double(im1(:)).^2));
    if(D<0.4)
        if(i>=1 && i<=10)
            fprintf('face is correctly matched with akash \n');
            figure,imshow(filename);
        elseif(i>=11 && i<=20)
            fprintf('face is correctly matched with hemant \n');
            figure,imshow(filename);
        else
            fprintf('face is correctly matched with abhay \n');
            figure,imshow(filename);
        end
    else
        fprintf('face is not correctly matched with database \n');
        figure,imshow('failed.jpg');
    end
end
Explanation
The above code compares two images: im1 is an image from the database, while im2 is the captured image that is to be matched against the stored images.
The expression for D takes the pixel-wise difference of the two images and normalizes its magnitude by the magnitude of the stored image, storing the result in the variable 'D'. (The images should be cast to double before subtracting, since uint8 arithmetic clips negative differences to zero.) If the value of D is less than the threshold of 0.4 (a value determined by our experiments), we conclude that the image has been successfully recognized against the database, and a figure window is opened displaying the matched person. If the value of 'i' is between 1 and 10 the face is matched with Akash, if it is between 11 and 20 with Hemant, and if it is between 21 and 30 with Abhay.
If instead the value of D is greater than the 0.4 threshold, the face does not match the stored images of the database; in that case a figure window displaying 'FAILED' is opened, indicating that the face was not recognized.
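The matching criterion itself can be sketched and checked in Python/NumPy (an illustrative translation, with the cast to float made explicit, since unsigned 8-bit subtraction would clip negative differences to zero):

```python
import numpy as np

# Sketch of the report's matching criterion: the Euclidean distance
# between the two images, normalised by the magnitude of the stored
# image. The 0.4 threshold is the report's experimentally chosen value.
def match_distance(im1, im2):
    a = im1.astype(np.float64).ravel()
    b = im2.astype(np.float64).ravel()
    return np.sqrt(np.sum((b - a) ** 2)) / np.sqrt(np.sum(a ** 2))

stored   = np.full((4, 4), 100, dtype=np.uint8)
captured = np.full((4, 4), 110, dtype=np.uint8)
D = match_distance(stored, captured)
print(D)   # 0.1 -> below the 0.4 threshold, so "matched"
```

With a stored image of constant value 100 and a captured image of constant value 110, every pixel differs by 10, giving D = 10/100 = 0.1, comfortably below the 0.4 threshold.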
Conclusion
Our work on the project proved fruitful: we were able to extract a human face from an input image captured by an image acquisition device at a particular instant of time and then match it against the image database. Face recognition was performed successfully, and it can be applied in various applications used for authentication.
Future Enhancements
Face detection is the first step of face recognition; it has many uses of its own, e.g. counting the number of people in a scene or tagging people, as commonly seen on Facebook™.
Face recognition can be used for password authentication, protecting the user from hackers by keeping the password secure.
Another very useful enhancement would be checking the attendance of students and authenticating employees in an organization.
It can be used to match images captured by CCTV at various public places against images in criminal records.
Face recognition can be used for verifying visas, and in an e-passport microchip to verify that the holder is the rightful owner of the passport.
So this project can be further enhanced to incorporate many functionalities; it can also be made faster and smaller.
References
For help regarding image processing and image acquisition, please visit www.mathworks.com.
For detailed knowledge of the various built-in functions used in the code above, please see the Image Acquisition Toolbox and Image Processing Toolbox documentation in MATLAB's Help.
And last but not least, our best friend: http://www.google.co.in.