IBM Israel Research Seminars

 

I will describe a method for simultaneously detecting faces and estimating their pose in real time. To exploit the synergy between these two tasks, we train a learning machine to map input images to points in a low-dimensional space. In this output space we embed a face manifold that is parameterized by facial pose parameters (e.g. pitch, yaw, and roll). A convolutional network is trained to map face images to the points on the face manifold that correspond to their pose, and to map non-face images to points far from the manifold. After training, detection is performed by checking whether the distance of the output point from the manifold is below a threshold. If the point is close to the manifold, indicating that a face is present in the image, its pose parameters can be inferred from the position of the projection of the point onto the manifold.
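The detection rule described above can be illustrated with a minimal sketch. It assumes a toy one-parameter manifold, the unit circle F(theta) = (cos theta, sin theta) in a 2-D output space, standing in for a pose manifold over a single angle. The manifold shape, the function names, and the threshold value are all hypothetical choices for illustration, not the parameterization used in the talk.

```python
import math


def project_to_manifold(g):
    """Project output point g onto the toy manifold F(theta) = (cos theta, sin theta).

    Returns (theta, distance): the angle of the closest manifold point and
    the Euclidean distance from g to that point.
    """
    x, y = g
    radius = math.hypot(x, y)
    theta = math.atan2(y, x)        # angle of the closest point on the circle
    distance = abs(radius - 1.0)    # distance from g to the unit circle
    return theta, distance


def detect(g, threshold=0.25):
    """Return (is_face, pose). The threshold is an arbitrary illustrative value."""
    theta, distance = project_to_manifold(g)
    if distance < threshold:
        return True, theta          # close to the manifold: face, pose = theta
    return False, None              # far from the manifold: no face
```

Here `g` plays the role of the network output for one image window. A point such as `(0.9 * math.cos(0.5), 0.9 * math.sin(0.5))` lies close to the circle and is accepted with an estimated angle near 0.5 radians, while a point such as `(3.0, 3.0)` is rejected.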

To map input images to points in the low-dimensional space, we employ a convolutional network architecture. Convolutional networks are specifically designed to learn invariant representations of images. They can easily learn the type of shift-invariant local features that are relevant to face detection and pose estimation. More importantly, they can be replicated over large images (applied to every sub-window in a large image) at a small fraction of the cost of applying more traditional classifiers to every sub-window in an image. This is a considerable advantage for building real-time systems.
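The cost advantage of replicating a convolutional network over a large image can be seen in a small sketch. The code below uses a hypothetical two-layer network with 3x3 kernels and a tanh non-linearity (so each output sees a 5x5 window). It is not the architecture from the talk. Running the network once over the whole image gives the same values as running it separately on every 5x5 sub-window, but the first layer is computed only once rather than once per overlapping window.

```python
import math


def conv2d_valid(image, kernel):
    """'Valid' 2-D correlation followed by tanh, on lists of lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            math.tanh(sum(image[r + i][c + j] * kernel[i][j]
                          for i in range(kh) for j in range(kw)))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def network(image, k1, k2):
    """Toy two-layer convolutional network (illustrative only)."""
    return conv2d_valid(conv2d_valid(image, k1), k2)


def per_window(image, k1, k2, size=5):
    """Apply the network independently to every size x size sub-window.

    Assumes 3x3 kernels, so each 5x5 window yields a single output value.
    """
    rows = len(image) - size + 1
    cols = len(image[0]) - size + 1
    return [
        [
            network([row[c:c + size] for row in image[r:r + size]], k1, k2)[0][0]
            for c in range(cols)
        ]
        for r in range(rows)
    ]
```

For an image larger than the window, `network(image, k1, k2)` returns one output per sub-window position in a single pass, and `per_window(image, k1, k2)` returns the same grid of values while recomputing the first-layer responses for every overlapping window.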

The system is designed to handle a very large range of poses without retraining. Its performance, tested on three standard datasets (frontal views, rotated faces, and profiles), is comparable to that of previous systems designed to handle only one of these datasets.

About the Speaker
Dr. Margarita Osadchy is a lecturer at the University of Haifa. She received her Ph.D. degree with honors in computer science in 2002 from the University of Haifa. From 2001 to 2004 she was a postdoctoral fellow at the NEC Research Institute and then in the Department of Computer Science at the Technion. In 2005, she joined the Department of Computer Science at the University of Haifa.

Dr. Osadchy's research interests are computer vision and machine learning. She has been working on developing new, effective methods of object detection and recognition in realistic environments. She proposed a new method (called Anti-faces) for the detection of image classes and event recognition in video sequences. The proposed algorithm dramatically increases detection rate and speed for complex image classes that often occur in real-life problems. This work led to a novel direction of incorporating statistics of natural images into various learning mechanisms. She also developed a robust method for simultaneous face detection and head pose estimation. The proposed system is the first detector in the world that allows such large variations in pose (yaw, pitch, and roll) and runs in near real time. Her other work is in illumination-insensitive image representation and recognition, object categorization, and privacy-preserving face detection.