INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)

A NEW FACE RECOGNITION SCHEME FOR FACES WITH EXPRESSIONS, GLASSES AND ROTATION

Walaa M Abdel-Hafiez1, Mohamed Heshmat2, Moheb Girgis3, Seham Elaw4

1, 2, 4 Faculty of Science, Mathematical and Computer Science Department, Sohag University, 82524, Sohag, Egypt
3 Faculty of Science, Department of Computer Science, Minia University, El-Minia, Egypt

ABSTRACT

Face recognition is one of the most active research areas in computer vision. The purpose of this work is to develop an algorithm that recognizes a person by comparing the characteristics of his or her face, which may show expressions, glasses and/or rotation, to those of known faces in a database. The work provides a simple and efficient technique for recognizing human faces. The new method is based on variance estimation of the three color components of face images and on extraction of the main facial features: the eyes, nose and mouth. The technique used to extract facial features is based on feature location with respect to the face dimensions. The proposed algorithm has been tested on various face images, and its performance was found to be good in most cases. Experimental results show that our method of human face recognition achieves very encouraging results, with good accuracy, high speed and simple computations.

Keywords: Face Recognition, Facial Features Extraction, Color Spaces, Variance Estimation.

I. INTRODUCTION

Face recognition has been used in various applications where personal identification is required, such as visual attendance systems, in which students are identified and recognized by their faces. Face recognition [1-4] has also been used in gaming applications, security systems, credit-card verification, criminal identification and teleconferencing. In short, face recognition applications are widely used in many corporate and educational institutions.

Human faces are complex objects with features that can vary over time. However, humans have a natural ability to recognize faces and identify persons at a glance. This natural ability extends beyond face recognition: we are equally able to quickly recognize patterns, sounds or smells. Unfortunately, machines do not have this ability, hence the need to simulate recognition artificially in our attempts to create intelligent autonomous machines. Facial feature recognition is a popular application of artificial intelligence systems. Face recognition by machines has important real-life applications, such as electronic and physical access control, national defense and international security. Simulating our natural face recognition ability in machines is difficult but not impossible. Throughout our lifetime, many faces are seen and stored naturally in our memories, forming a kind of database. Machine recognition of faces also requires a database, which consists of facial images and may include different face images of the same person. The development of intelligent face recognition systems requires providing sufficient information and meaningful data while the machine learns a face.

Face recognition can be defined as the visual perception of familiar faces, or as biometric identification by scanning a person's face and matching it against a database of known faces. In both definitions, the faces are identified as familiar or known faces. One of the main challenges in building an automated system that performs face recognition and verification is face detection and facial feature extraction. Although people are good at face identification, recognizing human faces automatically by computer is a very difficult task.
Face recognition is complicated by many factors, such as differences in facial expression, the direction of lighting during imaging, and variations in posture, size and angle. Even for the same person, images taken under different surrounding conditions may differ. The problem is so complicated that the achievements of automatic face recognition by computer are not yet as satisfactory as those of fingerprint recognition [5]. The objective of facial feature localization is to detect the presence and location of features after the locations of faces have been extracted by a face detection method. The challenges associated with face and facial feature detection methods can be attributed to the following factors [6]:

• Intensity: There are three types of intensity: color, gray, and binary.
• Pose: Face images vary due to the relative camera-face pose (frontal, 45º, profile), and some facial features such as an eye may become partially or wholly occluded.
• Structural components: Facial features such as beards, mustaches, and glasses may or may not be present.
• Image rotation: Face images vary directly with rotation.
• Poor quality: Image intensity becomes unusual in poor-quality images, for instance blurry images, distorted images, and images with noise.
• Facial expression: The appearance of a face depends on the person's facial expression.
• Unnatural intensity: Cartoon faces and faces rendered from 3D models have unnatural intensity.
• Occlusion: Faces may be partially occluded by other objects such as a hand or a scarf.
• Illumination: Face images vary due to the position of the light source.

Phimoltares et al. [6] presented algorithms for all types of face images in the presence of several image conditions. There are two main stages in their method. In the first stage, the faces are detected from an original image by using Canny edge detection and their proposed average face templates. In the second stage, a proposed neural visual model (NVM) is used to recognize all possibilities of facial feature positions.
Finally, to improve the results, image dilation is applied to remove some irrelevant regions.

Nikolaidis and Pitas [7] proposed a combined approach for facial feature extraction and determination of gaze direction that employs improved variations of the adaptive Hough transform for curve detection, minima analysis of feature candidates, template matching for inner facial feature localization, active contour models for inner face contour detection and projective geometry properties for accurate pose determination. Koo and Song [8] suggested defining 20 facial features. Their method detects the facial candidate regions with a Haar classifier, detects the eye candidate region and extracts eye features by a dilation operation, and then detects the lip candidate region using these features. The relative color difference of a* in the L*a*b* color space was used to extract the lip feature and to detect the nose candidate region, and the 20 features were detected from a 2D image by analyzing the end of the nose. Yen and Nithianandan [9] presented an automatic facial feature extraction method based on the edge density distribution of the image. In the preprocessing stage, a face is approximated by an ellipse, and a genetic algorithm is applied to search for the best-matching ellipse region. In the feature extraction stage, a genetic algorithm is applied to extract the facial features, such as the eyes, nose and mouth, in predefined sub-regions. Gu et al. [5] proposed a method to extract feature points from faces automatically. It provides a feasible way to locate the positions of the two eyeballs, the near and far corners of the eyes, the midpoint of the nostrils and the mouth corners in a face image.

Srivastava [10] proposed an efficient algorithm for a facial expression recognition system, which performs facial expression analysis in near real time from a live webcam feed. The system is composed of two entities: a trainer and an evaluator. Each frame of the video feed is passed through a series of steps, including Haar classifiers, skin detection, feature extraction and feature point tracking, and a learned support vector machine model is created to classify emotions, achieving a tradeoff between accuracy and result rate. Radha and Nallammal [11] described a comparative analysis of face recognition methods based on the curvelet transform: principal component analysis (PCA), linear discriminant analysis (LDA) and independent component analysis (ICA). The algorithms were tested on the ORL database. Kumar et al. [12] presented an automated system for human face recognition against a real-time background for a large homemade dataset of persons' faces. AdaBoost with a Haar cascade is used to detect human faces in real time, and a simple, fast PCA and LDA are used to recognize the detected faces. In their case, the matched face is then used to mark attendance in the laboratory.

El-Bashir [13] introduced a method for face recognition. After a preprocessing and normalization stage, PCA is applied to recognize a specified face. If the face is not recognized correctly, more features are extracted, namely face color and moment invariants, and the face is recognized again using a decision tree. Javed [14] proposed a computer system that can recognize a person by comparing the characteristics of the face to those of known individuals. He focused on frontal two-dimensional images taken in a controlled environment, i.e. the illumination and the background were constant, and used the PCA technique. The system gives good results, especially with angled face views. Pattanasethanon and Savithi [15] presented a novel technique for facial recognition that implements the successive mean quantization transform and the sparse network of winnows with the assistance of eigenface computation. After the frame of the input image, or of images from a webcam, has been limited, the image is cropped into an oval or ellipse shape. Then the image is transformed into gray scale and normalized in order to reduce color complexity. They also focused on special characteristics of the human face such as the nostril and oral areas, compared the images obtained by webcam with images in a database, and reported good accuracy with low time.

Wu et al. [16] presented a system that can automatically remove eyeglasses from an input face image. The system consists of three modules: eyeglasses recognition, localization and removal. Given a face image, an eyeglasses classifier is first used to determine whether a pair of eyeglasses is present. Then, a Markov chain Monte Carlo method is applied to locate the glasses by searching for the global optimum of the posterior. Finally, a novel example-based approach synthesizes an image with the eyeglasses removed from the detected and localized face image. The experiments demonstrated that their approach produces good-quality face images with the eyeglasses removed.

Chen and Gao [17] presented a local attributed string matching (LAStrM) approach to recognize face profiles in the presence of interferences. Conventional profile recognition algorithms depend heavily on the accuracy of the facial area cropping. However, in realistic scenarios the facial area may be difficult to localize due to interferences (e.g., glasses, hairstyles). The proposed approach is able to efficiently find the most discriminative local parts between face profiles, addressing the recognition problem with interferences. Experimental results on two profile image databases (Bern and FERET) have shown that the proposed matching scheme is robust to interferences compared with several primary approaches.

This paper presents a new face recognition method for faces with expressions, glasses and/or rotation. The proposed method uses variance estimation of the RGB components to compare the extracted faces with the faces in the database. In addition, it uses the Euclidean distance between the facial features of the face extracted from the test image and those of the database faces that pass the variance test. The rest of this paper is organized as follows. Section II describes the methodology of the proposed method with its stages: variance estimation, feature extraction, method representation and the proposed algorithm. Section III presents the results and method analysis. Section IV draws the conclusion of this work and possible points for future work.

II. METHODOLOGY

Face and facial feature detection algorithms are applied to detect generic faces in face images. Most automatic face recognition approaches are based on frontal images. Facial profiles, on the other hand, provide complementary information about the face that is not present in frontal views. Fusing frontal and profile views makes the overall personal identification technique more reliable and efficient. The proposed face recognition method is based on the average variance of the three components of RGB face images and on the extraction of the main facial features. The features under consideration are the eyes, nose and mouth. The technique used to extract facial features is based on feature location with respect to the dimensions of the face image.

Given a face image, obtained from a camera or preprocessed previously, our goal is to identify it using a database of known human faces. The algorithm is therefore divided into three main steps:

First: variance estimation of the face images.
Second: facial feature extraction. An effective method, which we developed previously in [18], is used to extract facial features such as the eyes, nose and mouth depending on their locations with respect to the face region.
Third: similar face identification, or image searching. The goal of this step is to scan the database of known faces to find the faces most similar to the test face.

1. Variance Estimation

Variance is very cheap to compute and is an important constraint for establishing similarity between two images. Let x be a vector of dimension n. The variance of x is calculated as follows:

\[ \mathrm{var} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2, \tag{1} \]

where \bar{x} is the mean value of x. However, two images that have the same variance do not necessarily have the same contents. Different images may have the same variance, because the variance depends only on the values of the image pixels and their mean. The variance is therefore used first to filter the database of faces and extract the faces whose variance is equal or close to that of the input face image; another test is then required to detect the faces most similar to the test face [18]. In an RGB color image, each pixel has three values, representing the red, green, and blue components. To compute the variance of an RGB image, the variance of each color is calculated separately. This gives three variance values, one for the red values, one for the green values and one for the blue values [18], which are calculated as follows:

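\[ v_{red} = \frac{1}{n} \sum_{i=1}^{n} (x_i^{r} - \bar{x}^{r})^2, \quad v_{green} = \frac{1}{n} \sum_{i=1}^{n} (x_i^{g} - \bar{x}^{g})^2, \quad v_{blue} = \frac{1}{n} \sum_{i=1}^{n} (x_i^{b} - \bar{x}^{b})^2, \tag{2} \]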

To simplify the comparison, the average of the three values is computed as follows:

\[ v = \frac{v_{red} + v_{green} + v_{blue}}{3}, \tag{3} \]
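As a minimal sketch of eqs. (1) to (3), the average variance of an RGB image could be computed as below. The function names and the representation of an image as a flat list of (R, G, B) tuples are assumptions made for illustration only; they are not taken from the paper.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), assumed to be in 0..255


def channel_variance(values: Sequence[float]) -> float:
    """Variance of one channel, dividing by n as in eq. (1)."""
    n = len(values)
    if n == 0:
        raise ValueError("cannot compute the variance of an empty channel")
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def average_rgb_variance(pixels: Sequence[Pixel]) -> float:
    """Average of the three per-channel variances, eqs. (2) and (3)."""
    v_red = channel_variance([p[0] for p in pixels])
    v_green = channel_variance([p[1] for p in pixels])
    v_blue = channel_variance([p[2] for p in pixels])
    return (v_red + v_green + v_blue) / 3.0


# Hypothetical four-pixel image: only the red channel varies.
image = [(10, 0, 50), (20, 0, 50), (30, 0, 50), (40, 0, 50)]
print(average_rgb_variance(image))
```

In this toy image the red variance is 125 and the green and blue variances are 0, so the average is 125/3. Two very different images can produce the same value, which is why the variance is used only as a first filter.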

2. Facial Features Extraction

In this part of the work, the aim is to compare two color face images, decide whether they belong to the same person, and measure the similarity between them using the Euclidean distance. The RGB (Red, Green and Blue) color space, fig. 1, which is used here, is an additive color system based on the tri-chromatic theory. It is often found in systems that use a CRT to display images. The RGB color system is very common and is used in virtually every computer system as well as in television, video, etc. [19], [20].

Figure 1: RGB color model

In RGB color model, any source color (F) can be matched by a linear combination of three color primaries, i.e. Red, Green and Blue, provided that none of those three can be matched by a combination of the other two, see fig. 1. Here, F can be represented as:

\[ F = rR + gG + bB, \tag{4} \]

where r, g and b are scalars indicating how much of each of the three primaries (R, G and B) are contained in F. The normalized form of F can be as follows:

\[ F = R'R + G'G + B'B, \tag{5} \]

where

\[ R' = \frac{r}{r+g+b}, \quad G' = \frac{g}{r+g+b}, \quad B' = \frac{b}{r+g+b}. \tag{6} \]
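The normalization in eq. (6) can be illustrated with a short sketch. The function name and the handling of a black pixel (r + g + b = 0, where the ratios are undefined) are assumptions; the paper does not state whether the normalized form is used in the later distance computation.

```python
from typing import Tuple


def normalize_rgb(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Return (R', G', B') as defined in eq. (6)."""
    total = r + g + b
    if total == 0:
        # Assumption: a black pixel is mapped to zeros, since eq. (6)
        # is undefined when r + g + b = 0.
        return (0.0, 0.0, 0.0)
    return (r / total, g / total, b / total)


# Hypothetical pixel with twice as much green as red or blue.
print(normalize_rgb(50, 100, 50))
```

For any non-black pixel the three normalized values sum to 1, so they describe the proportions of the primaries independently of overall brightness.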

To extract facial features, we used our method proposed in [18], which is based on feature location with respect to the whole face region. The candidate regions of the left eye, right eye, nose and mouth were determined by training, and the obtained dimensions of each region were then applied to several other faces of the same size; the results were very good, as shown in fig. 2. Given a face image 200 pixels high and 200 pixels wide, after training with many images, we found that the candidate region of the eyes is located between rows 60 and 95, with columns 25 to 80 for the right eye and columns 115 to 170 for the left eye. The candidate region of the nose is located between rows 110 and 145 and columns 75 and 125, and the candidate region of the mouth is located between rows 145 and 185 and columns 60 and 135. When applying the dimensions obtained by training to many face images, we found that they were suitable for any face image with the same width and height, even if it has an expression, as shown in fig. 2.

Figure 2: Examples of Feature extraction

This feature extraction technique can be generalized to any face image size by expressing the candidate region of each feature in terms of the height and width of the face image, as follows:

• Right eye: rows from (height/3.3) to (height/2.1); columns from (width/8) to (width/2.5)
• Left eye: rows from (height/3.3) to (height/2.1); columns from (width/1.7) to (width/1.17)
• Nose: rows from (height/1.8) to (height/1.38); columns from (width/2.67) to (width/1.6)
• Mouth: rows from (height/1.38) to (height/1.08); columns from (width/3.33) to (width/1.48)
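The generalized regions above can be sketched as a small lookup of divisors. The names below are hypothetical, and truncating the quotients to integers is an assumption, since the paper does not say how fractional row and column indices are rounded.

```python
from typing import Dict, Tuple

Region = Tuple[int, int, int, int]  # (row_from, row_to, col_from, col_to)

# Divisors copied from the list above: (row_from, row_to, col_from, col_to).
REGION_DIVISORS: Dict[str, Tuple[float, float, float, float]] = {
    "right_eye": (3.3, 2.1, 8.0, 2.5),
    "left_eye": (3.3, 2.1, 1.7, 1.17),
    "nose": (1.8, 1.38, 2.67, 1.6),
    "mouth": (1.38, 1.08, 3.33, 1.48),
}


def candidate_regions(height: int, width: int) -> Dict[str, Region]:
    """Candidate region of each feature for a face image of the given size."""
    regions: Dict[str, Region] = {}
    for name, (r0, r1, c0, c1) in REGION_DIVISORS.items():
        # Assumption: fractional indices are truncated toward zero.
        regions[name] = (int(height / r0), int(height / r1),
                         int(width / c0), int(width / c1))
    return regions


for name, region in candidate_regions(200, 200).items():
    print(name, region)
```

For a 200 x 200 image this gives rows 60 to 95 and columns 25 to 80 for the right eye, matching the trained values quoted earlier. The other regions come out within a few pixels of the trained values, because the divisors are rounded approximations of them.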

3. Method Representation

The proposed algorithm consists of three parts. Firstly, variance estimation is applied to extract the database images whose variance is close to that of the test image. Secondly, the feature extraction method is used to extract facial features from the face images. Finally, the Euclidean distance of the facial features is computed by the following equation:

\[ d = \left|\text{test feature}[R] - \text{matched feature}[R]\right| + \left|\text{test feature}[G] - \text{matched feature}[G]\right| + \left|\text{test feature}[B] - \text{matched feature}[B]\right| \tag{7} \]

Eq. (7) is first applied to find the distance between the right eye region of the test image and the right eye region of each image whose variance is close to that of the test image (as returned by the variance test). It is then applied to the left eye, nose and mouth regions, and the four distance values are summed. From these sums, it can be decided which of the images with a close variance value is the most similar to the test image. The steps of the proposed algorithm are shown in fig. 3.

Step 1: Read the input image.
Step 2: Read the database of images, calculate the variance of each image using eqs. (2) and (3), and put the variance values in an array.
Step 3: Calculate the variance of the test image using eqs. (2) and (3).
Step 4: Compare the variance of the test image with that of each image in the database, and keep in an array the locations of the images most similar to the test image, i.e. those that satisfy the condition (−600 ≤ variance difference ≤ 600).
Step 5: For i = 1 to the number of similar images extracted in step 4:
a) Extract the facial features from each image according to location (right eye, left eye, nose, mouth).
b) Calculate the Euclidean distance between the 3-arrays containing the RGB color values of each feature using eq. (7), plus the Euclidean distance between the 3-arrays containing the RGB color values of the whole test image and of each similar image from step 4.
Step 6: Detect the minimum distance (d) and the location of the image that has the minimum distance from step 5.
Step 7: Display the best-matched image from the database.
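The steps above can be sketched as follows, under several stated assumptions. Images are represented as lists of rows of (R, G, B) tuples. Each "3-array" in eq. (7) is interpreted as the mean R, G and B value of a region, because the paper does not specify how regions of different sizes are compared. Region indices are truncated to integers. Note also that eq. (7) as printed is a sum of absolute differences, although the text calls it a Euclidean distance; the sketch follows the printed formula. All names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Image = List[List[Pixel]]      # rows of pixels

VARIANCE_TOLERANCE = 600.0     # Step 4 range; the paper notes it can be changed

# Divisors (row_from, row_to, col_from, col_to) from the feature extraction section.
REGION_DIVISORS: Dict[str, Tuple[float, float, float, float]] = {
    "right_eye": (3.3, 2.1, 8.0, 2.5),
    "left_eye": (3.3, 2.1, 1.7, 1.17),
    "nose": (1.8, 1.38, 2.67, 1.6),
    "mouth": (1.38, 1.08, 3.33, 1.48),
}


def mean_rgb(pixels: Sequence[Pixel]) -> Tuple[float, float, float]:
    """Mean value of each color channel over a set of pixels."""
    n = len(pixels)
    return (sum(p[0] for p in pixels) / n,
            sum(p[1] for p in pixels) / n,
            sum(p[2] for p in pixels) / n)


def average_variance(image: Image) -> float:
    """Average of the three channel variances, eqs. (2) and (3)."""
    pixels = [p for row in image for p in row]
    means = mean_rgb(pixels)
    per_channel = [sum((p[c] - means[c]) ** 2 for p in pixels) / len(pixels)
                   for c in range(3)]
    return sum(per_channel) / 3.0


def region_pixels(image: Image,
                  divisors: Tuple[float, float, float, float]) -> List[Pixel]:
    """Pixels of one candidate region; truncating to int is an assumption."""
    height, width = len(image), len(image[0])
    r0, r1, c0, c1 = divisors
    rows = image[int(height / r0):int(height / r1)]
    return [p for row in rows for p in row[int(width / c0):int(width / c1)]]


def rgb_distance(a: Sequence[Pixel], b: Sequence[Pixel]) -> float:
    """Eq. (7) applied to mean channel values (assumed interpretation)."""
    return sum(abs(x - y) for x, y in zip(mean_rgb(a), mean_rgb(b)))


def total_distance(test: Image, candidate: Image) -> float:
    """Step 5: four feature distances plus the whole-image distance."""
    d = rgb_distance([p for row in test for p in row],
                     [p for row in candidate for p in row])
    for divisors in REGION_DIVISORS.values():
        d += rgb_distance(region_pixels(test, divisors),
                          region_pixels(candidate, divisors))
    return d


def best_match(test: Image, database: Sequence[Image]) -> Optional[int]:
    """Steps 2 to 6: variance filter, then minimum-distance search."""
    v_test = average_variance(test)
    similar = [i for i, img in enumerate(database)
               if abs(average_variance(img) - v_test) <= VARIANCE_TOLERANCE]
    if not similar:
        return None  # no database image passed the variance test
    return min(similar, key=lambda i: total_distance(test, database[i]))
```

`best_match` returns the index of the best-matched database image, or `None` when no image passes the variance test. The sketch recomputes the database variances on every call for brevity; Step 2 of the algorithm stores them in an array once.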

Figure 3: The proposed algorithm's steps

III. RESULTS AND DISCUSSION

The experiments were performed on a computer with a 2.20 GHz processor and 4 GB of RAM, using several color images containing faces and a database of 10 different images with different sizes, as shown in fig. 4. The test images include different images of the same person under different conditions. Some images have expressions, glasses and some rotation, as shown in fig. 5. The images in the database were chosen carefully so that they are standard and, where possible, have no expressions. The proposed algorithm gives good results in recognizing all the test images that belong to the same person in the database, with different expressions, glasses and rotation. Even if the gaze direction is different, the proposed algorithm succeeds in returning the location of the correct image in the database.

Figure 4: The used database

Figure 5: The used database and some of test images

International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367(Print), ISSN 0976 - 6375(Online), Volume 5, Issue 4, April (2014), pp. 11-23 © IAEME

Table 1 shows some of the results obtained using 150 test RGB images of 10 different persons and a database of 10 standard RGB images of those persons, which is shown in fig. 4. The first column of the table shows the test face. The next columns show the results obtained by applying the classical method, the variance estimation formula, the feature extraction method and the proposed method, together with the time taken by each method. The classical method is the general method of comparing two images pixel by pixel and computing the sum of the differences of all pixels. It operates on the whole image without partitioning. The variance estimation is applied by using eqs. (2) and (3). We have displayed the results of the variance separately to show how efficient and important the variance computation is in comparing the similarity between images.
When we used the variance estimation as a first test in the proposed method, we noticed that it gives the correct image location in the database if the test image and the matched database image have similar illumination and background conditions. We have also applied the feature extraction method separately to study how efficient it is in face recognition. Facial features are extracted, the Euclidean distance is computed for each feature, and the differences are summed. By comparing the difference between the test image and all the images in the database, the matched image is detected as the image with the minimum difference. It is noticed from the table that the classical method and the variance estimation method take less time than the two other methods.

The proposed method is executed as follows. The first test (the variance test), with a variance difference range of [-600, 600], is applied to detect the images whose variance is close to that of the test image. (It should be noted that the variance difference range is arbitrary and can be changed.) The algorithm returns the locations of the faces whose variance is close to that of the test face. To find out which of them is the same as, or the closest to, the test face, the facial features of the test face and of the obtained face images are extracted, and the Euclidean distance of their RGB components is calculated by eq. (7). The face image with the minimum distance (d) is considered the best-matched image, and its location is returned. The search efficiency is evaluated by how many distance (d) computations are performed on average compared with the size of the database.
In the proposed method, the total number of distance calculations is small, because the variance test first selects the face images whose variance is close to that of the input face image, and the distance computation is then performed only on those images, whose number is always small compared with the database size. However, the execution time of the proposed method is high compared with the other methods, because it works in two stages, variance estimation and facial feature extraction, each of which takes some time. The execution time depends on the database size. On average, the execution time is 0.43 seconds for the classical method, 0.1 seconds for the variance estimation method, 0.22 seconds for the facial feature extraction method and 1.06 seconds for the proposed method, as shown in fig. 6. In most cases, the proposed algorithm gives good results. However, in some cases the results are not good, because the proposed algorithm is affected by illumination conditions in some images and by zooming and large rotation in others; see Table 2.

Figure 6: The time chart of the proposed algorithm and comparison methods (vertical axis: time in seconds, 0 to 2; horizontal axis: images 1 to 18)
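For comparison, the classical method used as a baseline in this section can be sketched as below. The paper describes it only as comparing two images pixel by pixel and summing the differences, so the use of absolute differences, the requirement that both images have the same dimensions, and the function name are all assumptions.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Image = List[List[Pixel]]      # rows of pixels


def classical_difference(a: Image, b: Image) -> int:
    """Sum of absolute channel differences over all pixel pairs."""
    same_shape = len(a) == len(b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(a, b))
    if not same_shape:
        # Assumption: the baseline is only defined for equally sized images.
        raise ValueError("images must have the same dimensions")
    return sum(abs(ca - cb)
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b)
               for ca, cb in zip(pa, pb))


# Hypothetical 1 x 2 images that differ only in the first pixel.
first = [[(0, 0, 0), (10, 10, 10)]]
second = [[(1, 2, 3), (10, 10, 10)]]
print(classical_difference(first, second))
```

Unlike the proposed method, this comparison visits every pixel of every database image, with no variance test to reduce the number of candidates.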

Table 1: Some results of the proposed algorithm and the comparison methods

Table 2: Some false positive results of the proposed algorithm and the comparison methods

IV. CONCLUSION

In this paper, a new method of face recognition, for faces with expressions, glasses and/or rotation, based on variance estimation and facial feature extraction is proposed. It can be used in face recognition systems such as video surveillance, human computer interfaces, image database management and smart home applications. The proposed algorithm has been tested using a database of faces and the results showed that it is able to recognize a variety of different faces in spite of different expressions, rotation and illumination conditions. Zoomed images and their effect on the recognition of humans need further investigation. 