Face Recognition Technology and Practical Design

Face recognition technology has attracted significant attention and investment from major technology companies, including Google, Facebook, Alibaba, Tencent, and Baidu. This surge of interest has also given rise to numerous startups, such as Face++ (Megvii), SenseTime, LinkFace, and CloudWalk, which are actively building applications in video surveillance, criminal investigation, identity verification for internet finance, and automated border control. This article provides an overview of the development of face recognition technology and offers practical design guidance based on the author's experience in the field.

I. Overview

Machine learning problems can often be reduced to finding the most suitable transformation function. This is evident in speech recognition, where the goal is to find a transformation that maps the input signal into a semantic space, and in image recognition, where the input image must be mapped into a feature space that uniquely determines the corresponding identity. Face recognition is often assumed to be an easier problem than Go, but that assumption rests on a superficial understanding of the two tasks: judged by the size of the input space, face recognition is actually the more complex one.

Comparing Face Recognition and Go

In the computer’s “eyes,” a Go board is a 19x19 matrix in which each element takes one of three values {0, 1, 2}, so the input vector can take 3^361 possible values. In contrast, a 512x512 color input image for face recognition is a 512x512x3 matrix in which each element ranges from 0 to 255, yielding 256^786432 possible input values. Although both tasks amount to finding an appropriate transformation function, the input space for face recognition is vastly larger than that of Go.
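
To make the scale of this gap concrete, the back-of-the-envelope calculation below (plain Python, using only the matrix sizes stated above) compares the two input spaces on a logarithmic scale: the Go board has on the order of 10^172 configurations, while the 512x512x3 image space has on the order of 10^1.9 million.

    import math

    # Go: 19x19 board, each point empty/black/white -> 3^361 configurations.
    go_states_log10 = 361 * math.log10(3)

    # Face image: 512x512 RGB pixels, each channel value in 0..255 -> 256^786432.
    img_states_log10 = 512 * 512 * 3 * math.log10(256)

    print(f"Go board:  ~10^{go_states_log10:.0f} possible inputs")    # ~10^172
    print(f"512x512x3: ~10^{img_states_log10:.0f} possible inputs")   # ~10^1893917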

Ideal Transformation Function

The ideal transformation function would produce a feature space in which classes are well separated: the variation within a class (images of the same person) is minimized while the distance between classes (different people) is maximized. In practice this ideal is undermined by factors such as illumination, facial expression, and pose (FIG. 1), which can make two images of the same person differ more than images of two different people (FIG. 2).
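
As an illustration of this separation criterion (a minimal sketch, not a method from this article), the snippet below computes the mean intra-class and inter-class Euclidean distances for a set of feature vectors; a good transformation function drives the first number down and the second up. The features and labels here are random placeholders.

    import numpy as np

    def intra_inter_distances(features, labels):
        """Mean pairwise distance within the same identity vs. across identities."""
        intra, inter = [], []
        n = len(features)
        for i in range(n):
            for j in range(i + 1, n):
                d = np.linalg.norm(features[i] - features[j])
                (intra if labels[i] == labels[j] else inter).append(d)
        return np.mean(intra), np.mean(inter)

    # Toy example: 128-D "features" for 4 images of 2 people (random placeholders).
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(4, 128))
    ids = np.array([0, 0, 1, 1])
    same, diff = intra_inter_distances(feats, ids)
    print(f"mean intra-class distance: {same:.2f}, mean inter-class distance: {diff:.2f}")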

Historical Development of Face Recognition Algorithms

The history of face recognition algorithms is largely a struggle against the factors that degrade recognition accuracy. The earliest approaches, introduced in the 1960s, relied on the geometry and topology of facial features, but they were quickly found to break down under changes in pose and expression. The “eigenface” method, which applies principal component analysis to capture the statistical structure of face images, appeared in the 1990s and brought a significant improvement in accuracy, yet it too remained limited in handling variations in pose and expression.
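
As a rough illustration of the eigenface idea (a sketch in the spirit of the approach, not the historical implementation), the snippet below projects flattened face images onto a PCA subspace and matches a probe to its nearest gallery image in that subspace; the random arrays stand in for real, aligned grayscale face crops, and scikit-learn is assumed to be available.

    import numpy as np
    from sklearn.decomposition import PCA

    # Placeholder data: 100 "face images" of size 32x32, flattened to 1024-D vectors.
    rng = np.random.default_rng(0)
    gallery = rng.random((100, 32 * 32))
    probe = rng.random((1, 32 * 32))

    # Eigenfaces: learn a low-dimensional linear subspace from the gallery...
    pca = PCA(n_components=50).fit(gallery)
    gallery_proj = pca.transform(gallery)
    probe_proj = pca.transform(probe)

    # ...and identify the probe as the nearest gallery image in that subspace.
    dists = np.linalg.norm(gallery_proj - probe_proj, axis=1)
    print("best match index:", int(np.argmin(dists)))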

Recent Advances in Face Recognition

Advances in machine learning theory prompted the exploration of new approaches to face recognition, including genetic algorithms, support vector machines (SVMs), boosting, manifold learning, and kernel methods. The sparse representation method, introduced in 2009, became a hot topic thanks to its mathematical elegance and its robustness to occlusion. The field reached a consensus that, on top of hand-crafted local descriptors, subspace feature extraction and feature selection achieved the best recognition performance.
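
The sparse representation idea can be sketched as follows (an illustrative sketch in the spirit of sparse-representation-based classification, not code from this article): a probe face is expressed as a sparse linear combination of all gallery faces, and the identity whose gallery images reconstruct the probe with the smallest residual wins. The data below are random placeholders, and scikit-learn's orthogonal matching pursuit stands in for the sparse solver.

    import numpy as np
    from sklearn.linear_model import OrthogonalMatchingPursuit

    # Placeholder gallery: 5 identities x 10 images each, flattened to 256-D vectors.
    rng = np.random.default_rng(0)
    n_ids, per_id, dim = 5, 10, 256
    gallery = rng.random((n_ids * per_id, dim))
    labels = np.repeat(np.arange(n_ids), per_id)
    probe = gallery[3] + 0.01 * rng.normal(size=dim)   # a noisy copy of gallery image 3

    # Solve for a sparse coefficient vector over the whole gallery (the "dictionary").
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=10, fit_intercept=False)
    omp.fit(gallery.T, probe)          # dictionary columns are the gallery faces
    coef = omp.coef_

    # Classify by the per-identity reconstruction residual.
    residuals = []
    for c in range(n_ids):
        mask = (labels == c)
        recon = gallery[mask].T @ coef[mask]
        residuals.append(np.linalg.norm(probe - recon))
    print("predicted identity:", int(np.argmin(residuals)))   # expected: 0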

State-of-the-Art Methods

The current state of the art in face recognition relies on convolutional neural networks (CNNs) such as DeepID, VGG-Net, ResNet, and Google's Inception architecture. These models achieve high recognition accuracy and have been widely adopted across applications. A popular recipe for improving performance combines multi-patch division, CNN feature extraction, multi-task learning, and feature fusion.
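
The multi-patch-plus-fusion recipe can be sketched as follows (a minimal sketch, not the pipeline of any particular paper): regions are cropped around facial landmarks, each region is passed through a CNN embedding function, and the per-patch vectors are concatenated and L2-normalized into one descriptor. Here cnn_embed is a hypothetical placeholder for whatever trained network is available.

    import numpy as np

    def cnn_embed(patch):
        """Hypothetical CNN embedding; stands in for a trained network's forward pass."""
        rng = np.random.default_rng(abs(hash(patch.tobytes())) % (2 ** 32))
        return rng.normal(size=128)

    def crop(img, cx, cy, size=48):
        """Crop a square patch centred on a landmark, clipped to the image bounds."""
        x0, y0 = max(cx - size // 2, 0), max(cy - size // 2, 0)
        return img[y0:y0 + size, x0:x0 + size]

    def multi_patch_descriptor(img, landmarks):
        """Concatenate per-patch CNN features and L2-normalize the fused vector."""
        feats = [cnn_embed(crop(img, x, y)) for x, y in landmarks]
        fused = np.concatenate(feats)
        return fused / np.linalg.norm(fused)

    # Placeholder aligned face image and five landmark positions (eyes, nose, mouth corners).
    face = np.zeros((128, 128, 3), dtype=np.uint8)
    pts = [(44, 52), (84, 52), (64, 76), (50, 100), (78, 100)]
    print(multi_patch_descriptor(face, pts).shape)   # (640,) = 5 patches x 128-D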

Practical Recognition Program

A practical recognition pipeline is shown in FIG. 5. It includes face detection, facial landmark localization, face alignment, and anti-spoofing (liveness) modules that guard against impersonation attacks. Infrared and 3D cameras have also become a popular means of improving recognition accuracy.
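
As one concrete example of the alignment step (a sketch assuming the detector and landmark model have already supplied the two eye centres; not the exact design of the flowchart), the face can be rotated and scaled with OpenCV so that the eyes land on canonical positions in a fixed-size crop:

    import cv2
    import numpy as np

    def align_by_eyes(img, left_eye, right_eye, out_size=112, eye_y=0.35, eye_dx=0.32):
        """Rotate/scale the face so both eyes land on fixed canonical positions."""
        (lx, ly), (rx, ry) = left_eye, right_eye
        angle = np.degrees(np.arctan2(ry - ly, rx - lx))       # in-plane rotation
        desired_dist = (1 - 2 * eye_dx) * out_size             # wanted eye distance
        scale = desired_dist / np.hypot(rx - lx, ry - ly)
        center = ((lx + rx) / 2.0, (ly + ry) / 2.0)             # midpoint of the eyes
        M = cv2.getRotationMatrix2D(center, angle, scale)
        # Translate so the eye midpoint ends up at the canonical location.
        M[0, 2] += out_size * 0.5 - center[0]
        M[1, 2] += out_size * eye_y - center[1]
        return cv2.warpAffine(img, M, (out_size, out_size))

    # Placeholder frame and eye coordinates as a landmark detector might return them.
    frame = np.zeros((480, 640, 3), dtype=np.uint8)
    aligned = align_by_eyes(frame, left_eye=(250, 220), right_eye=(330, 224))
    print(aligned.shape)   # (112, 112, 3)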

Conclusion

Face recognition technology has made significant progress in recent years, with accuracy exceeding 99% on the public LFW benchmark. However, there is still a long way to go before performance is practical in the field, especially in 1:N identification scenarios where N is large. The future of the technology lies in developing new models, expanding training data, and improving metric learning.
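
One way to see why large N is hard (a back-of-the-envelope illustration, not a result from this article): even with a very low per-comparison false accept rate, the chance of at least one false match in a 1:N search compounds with the gallery size, assuming independent comparisons.

    # Probability of at least one false match in a 1:N search,
    # assuming independent comparisons at a fixed per-pair false accept rate.
    far = 1e-5                      # per-comparison false accept rate
    for n in (1_000, 100_000, 10_000_000):
        p_any_false = 1 - (1 - far) ** n
        print(f"N = {n:>10,}: P(at least one false accept) ~ {p_any_false:.3f}")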
