Facing up to new technology
18 February 2004
The human face has 80 muscles that work in tandem to create a seemingly infinite array of expressions that dramatically change the way we look from moment to moment. While humans have a relatively easy time matching our contorting faces to names, computers are notoriously bad at it. If software could automatically and accurately identify people’s faces, though, myriad applications might emerge — from intelligent surveillance systems to software that would help us navigate massive collections of photographs.
Berkeley Ph.D. candidate Tamara Miller and computer-science professor David Forsyth are tackling the latter objective in an effort to advance the science of computerized face recognition as a whole.
The researchers developed a system that automatically associates 45,000 face images culled from online news articles with the names of the individuals in the photos. In their current demonstration, a user is presented with a cluster of photos depicting a single individual — top United Nations weapons inspector Hans Blix, for instance. The more someone appears in the news, the larger the cluster of images. Clicking on a particular photo links the image to its associated news article.
“The system enables you to browse the news by faces and bring up articles related to the people you see,” Miller says.
The software is remarkably adept at identifying dozens of images of, say, Colin Powell even when the photos depict the Secretary of State from a variety of angles, under different lighting conditions, and with dozens of very different expressions on his face.
“Most photos in the news aren’t mugshots, with the person looking right into the camera,” says Forsyth, a researcher with the Center for Information Technology Research in the Interest of Society (CITRIS). “People do all kinds of remarkable things with their faces. For example, we have piles of photographs of George Bush biting his upper lip when he’s nervous.”
One potential application of the technology would be a tool to automatically organize and enable easy searches of photographic archives without depending solely on text annotations. Also, while the Berkeley research is not focused on surveillance, Forsyth imagines it could lead to a system that would analyze video footage taken during or before a criminal activity to flag possible suspects.
The process of linking a massive collection of faces with names begins with extracting the faces from the rest of a photograph. Software written by Miller then corrects, or rectifies, the position of each face so that it matches a “canonical” pose that can be compared with other faces. The rectifying software runs on the Millennium Cluster, a CITRIS testbed of more than 1,000 individual PCs that work in parallel to solve computationally intensive problems.
“The rectifying software finds the eyes, nose, and mouth and conducts the transformation between the original and the canonical pose,” Miller says.
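To make the idea of a transformation between the original and the canonical pose concrete, here is a minimal sketch in Python. It is not the researchers' software. The article does not say what kind of transformation is used, so the sketch assumes the simplest option: a similarity transform (rotation, uniform scaling, and shift) computed from the two eye positions alone. The canonical eye coordinates and the function names are hypothetical, chosen only for illustration.

```python
# Hypothetical canonical eye positions in a 100x100 face crop (assumed values,
# not taken from the article). Points are stored as complex numbers x + yj.
CANONICAL_LEFT_EYE = complex(30, 40)
CANONICAL_RIGHT_EYE = complex(70, 40)


def similarity_from_eyes(left_eye, right_eye):
    """Return (a, b) so that z -> a*z + b maps the detected eyes onto the canonical eyes.

    Multiplying by the complex number `a` rotates and uniformly scales a point;
    adding `b` shifts it.
    """
    p1 = complex(*left_eye)
    p2 = complex(*right_eye)
    if p1 == p2:
        raise ValueError("eye landmarks must be distinct")
    a = (CANONICAL_RIGHT_EYE - CANONICAL_LEFT_EYE) / (p2 - p1)
    b = CANONICAL_LEFT_EYE - a * p1
    return a, b


def rectify_point(point, transform):
    """Apply the transform to one (x, y) landmark and return the new (x, y)."""
    a, b = transform
    z = a * complex(*point) + b
    return (z.real, z.imag)


# Example: a face rotated 90 degrees, with the eyes stacked vertically.
transform = similarity_from_eyes((100, 100), (100, 180))
rectified_left = rectify_point((100, 100), transform)
rectified_right = rectify_point((100, 180), transform)
```

In this example the two detected eyes land on the assumed canonical positions, and the same transform could then be applied to the nose, the mouth, or every pixel of the face. A real system would likely use more landmarks and a more flexible transformation.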
The identification process is helped along by extracting names from the captions that accompany the photos. Labeling the photos based only on the captions is not possible, however, because (to cite just one reason) there may be several people in a particular photograph. While humans can determine who is who in a photo by reading the caption, computers are tripped up by the syntax of the text. Instead, each face in a photo is associated with all of the names in a caption. Then the computer compares the face with already established clusters of named faces to statistically determine if it tagged the subject photo with the correct name.
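The naming step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method from the researchers' paper: it assumes each face has already been reduced to a list of numbers (a feature vector), that each named cluster is summarized by its average vector, and that "statistically determine" can be approximated by picking the nearest cluster. The function name and the toy data are hypothetical.

```python
import math


def assign_name(face_vector, caption_names, cluster_centroids):
    """Pick the caption name whose cluster centroid is closest to the face.

    face_vector: list of floats describing one rectified face (assumed representation).
    caption_names: every name extracted from the photo's caption.
    cluster_centroids: dict mapping a name to the average vector of its cluster.
    Returns None when no caption name has an established cluster yet.
    """
    candidates = [name for name in caption_names if name in cluster_centroids]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda name: math.dist(face_vector, cluster_centroids[name]),
    )


# Toy data: two established clusters and one new face from a two-name caption.
centroids = {
    "Person A": [0.0, 0.0],
    "Person B": [10.0, 10.0],
}
best = assign_name([9.0, 11.0], ["Person A", "Person B"], centroids)
```

Here the face starts out associated with both names in the caption, and the comparison against the clusters settles on "Person B". Restricting the candidates to the caption's names is what lets the text narrow the choice even though the caption alone cannot say who is who.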
In a recent scientific paper, Forsyth, Miller, and their colleagues report that the system is correct 95 percent of the time. Sometimes, though, “one innocent error by the program could cause considerable offense,” Forsyth says. For example, due to a mistake extracting names from the caption, the system incorrectly labeled a photo of the current German Justice Minister as Adolf Hitler.
While the kinks in the software are still being worked out, including its inability to label faces photographed in profile, the development of a massive image database that can be automatically labeled is a leap forward for computerized face recognition.
“One problem in face-recognition research is that the experimental datasets of images that people use are often very different from the real world,” says Forsyth. “It’s a bit like studying animal behavior in a zoo. You can do it, but you can never be certain about what you’ve learned. Our dataset is more realistic because it contains faces captured ‘in the wild.’”
This article is reprinted with permission from the January issue of Lab Notes, published by the College of Engineering.