Mechanisms of Face Perception

Doris Y. Tsao¹ and Margaret S. Livingstone²

¹Centers for Advanced Imaging and Cognitive Sciences, Bremen University, D-28334 Bremen, Germany; email: [email protected]
²Department of Neurobiology, Harvard Medical School, Boston, Massachusetts 02115; email: [email protected]

Annu. Rev. Neurosci. 2008. 31:411–37

First published online as a Review in Advance on April 2, 2008

The Annual Review of Neuroscience is online at neuro.annualreviews.org

This article's doi: 10.1146/annurev.neuro.30.051606.094238

Copyright © 2008 by Annual Reviews. All rights reserved

0147-006X/08/0721-0411$20.00

Key Words

face processing, face cells, holistic processing, face recognition, face detection, temporal lobe

Abstract

Faces are among the most informative stimuli we ever perceive: Even a split-second glimpse of a person's face tells us his identity, sex, mood, age, race, and direction of attention. The specialness of face processing is acknowledged in the artificial vision community, where contests for face-recognition algorithms abound. Neurological evidence strongly implicates a dedicated machinery for face processing in the human brain to explain the double dissociability of face- and object-recognition deficits. Furthermore, recent evidence shows that macaques too have specialized neural machinery for processing faces. Here we propose a unifying hypothesis, deduced from computational, neurological, fMRI, and single-unit experiments: that what makes face processing special is that it is gated by an obligatory detection process. We clarify this idea in concrete algorithmic terms and show how it can explain a variety of phenomena associated with face processing.
Downloaded from www.annualreviews.org by Harvard University on 01/25/12. For personal use only.



Contents

INTRODUCTION
    Detection
    Measurement and Categorization
COMPUTER VISION ALGORITHMS
    Detection
    Measurement
    Categorization
    Invariance
    Summary
HUMAN BEHAVIOR AND FUNCTIONAL IMAGING
    Norm-Based Coding
    Detection
    Holistic Processing of Faces
HUMAN FUNCTIONAL IMAGING
    Measurement and Categorization
    Invariance
    Summary
MONKEY fMRI AND SINGLE-UNIT PHYSIOLOGY
    Detection
    Holistic Processing of Faces
    Anatomical Specialization of Face Cells
    The Functional Significance of the Anatomical Localization of Face Processing
    Time Course of Feature-Combination Responses
    Norm-Based Coding
    Invariance
    Summary

INTRODUCTION

The central challenge of visual recognition is the same for both faces and objects: We must distinguish among often similar visual forms despite substantial changes in the image arising from changes in position, illumination, occlusion, etc. Although face identification is often singled out as demanding particular sensitivity to differences between objects sharing a common basic configuration, in fact such differences must be represented in the brain for both faces and nonface objects. Most humans can easily identify hundreds of faces (Diamond & Carey 1986), but even if one cannot recognize a hundred different bottles by name, one can certainly distinguish them in pairwise discrimination tasks. Furthermore, most of us can recognize tens of thousands of words at a glance, not letter by letter, a feat requiring expert detection of configural patterns of nonface stimuli. Thus, face perception is in many ways a microcosm of object recognition, and the solution to the particular problem of understanding face recognition will undoubtedly yield insights into the general problem of object recognition.

The system of face-selective regions in the human and macaque brain can be defined precisely using fMRI, so we can now approach this system hierarchically and physiologically to ask mechanistic questions about face processing at a level of detail previously unimaginable. Here we review what is known about face processing at each of Marr's levels: computational theory, algorithm, and neural implementation.

Computer vision algorithms for face perception divide the process into three distinct steps. First, the presence of a face in a scene must be detected. Then the face must be measured to identify its distinguishing characteristics. Finally, these measurements must be used to categorize the face in terms of identity, gender, age, race, and expression.

Detection

The most basic aspect of face perception is simply detecting the presence of a face, which requires the extraction of features that it has in common with other faces. The effectiveness and ubiquity of the simple T-shaped schematic face (eye, eye, nose, mouth) suggest that face detection may be accomplished by a simple template-like process. Face detection and identification have opposing demands: The identification of individuals requires a fine-grained analysis to extract the ways in which each face differs from the others despite the fact that all faces share the same basic T-shaped configuration, whereas detection requires extracting what is common to all faces. A good detector should be poor at individual recognition and vice versa.

Another reason why detection and identification should be separate processes is that detection can act as a domain-specific filter, ensuring that precious resources for face recognition [e.g., privileged access to eye movement centers (Johnson et al. 1991)] are used only if the stimulus passes the threshold of being a face. Such domain-specific gating may be one reason for the anatomical segregation of face processing in primates (it is easier to gate cells that are grouped together). A further important benefit of preceding identification by detection is that detection automatically accomplishes face segmentation; i.e., it isolates the face from background clutter and can aid in aligning the face to a standard template. Many face-recognition algorithms require prior segmentation and alignment and will fail with nonuniform backgrounds or varying face sizes.
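The gating idea can be made concrete with a minimal sketch. Everything named here (`recognize`, the `toy_*` helpers, and their decision rules) is a hypothetical stand-in chosen for illustration, not a method from the literature reviewed; the only point is the control flow, in which measurement and classification never run unless the detector returns a segmented face.

```python
from typing import Callable, Optional, Sequence

Image = Sequence[Sequence[float]]  # hypothetical: a 2-D grid of gray values


def recognize(image: Image,
              detect: Callable[[Image], Optional[Image]],
              measure: Callable[[Image], Sequence[float]],
              classify: Callable[[Sequence[float]], str]) -> Optional[str]:
    """Run measurement and classification only if detection succeeds."""
    face = detect(image)      # a segmented (ideally aligned) crop, or None
    if face is None:
        return None           # gate closed: no recognition resources are spent
    return classify(measure(face))


# Toy stand-ins, assumptions for illustration only.
def toy_detect(image):
    # Treat an image as a "face" if its top row is darker than its bottom row.
    return image if sum(image[0]) < sum(image[-1]) else None


def toy_measure(face):
    # One number per row: the mean gray value.
    return [sum(row) / len(row) for row in face]


def toy_classify(features):
    return "A" if features[0] < 0.5 else "B"
```

A call such as `recognize(img, toy_detect, toy_measure, toy_classify)` returns a label for inputs that pass the detector and `None` otherwise, so a nonface stimulus never reaches the identification stage.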

Measurement and Categorization

After a face has been detected, it must be measured in a way that allows for accurate, efficient identification. The measurement process must not be so coarse as to miss the subtle features that distinguish one face from another. On the other hand, it must output a set of values that can be efficiently compared with stored templates for identification. There is a zero-sum game between measurement and categorization: The more efficient the measurement, the easier the classification; conversely, less efficient measurement (e.g., a brute force tabulation of pixel gray values) makes the classification process more laborious.

COMPUTER VISION ALGORITHMS

A comprehensive review of computer algorithms for face recognition can be found in Zhao et al. (2003) and Shakhnarovich & Moghaddam (2004). Our goal here is to discuss algorithms that offer special insights into possible biological mechanisms.

Detection

How can a system determine if there is a face in an image, regardless of whose it is? An obvious approach is to perform template matching (e.g., search for a region containing two eyes, a mouth, and a nose, all inside an oval). In many artificial face-detection systems a template is swept across the image at multiple scales, and any part of the image that matches the template is scored as a face. This approach works, but it is slow.

To overcome this limitation, Viola & Jones (2004) introduced the use of a cascade of increasingly complex filters or feature detectors. Their reasoning was that the presence of a face can be ruled out most of the time with a very simple filter, thus avoiding the computational effort of doing fine-scale filtering on uninformative parts of the image. The first stage in their cascade consists of only two simple filters, each composed of a few rectangular light or dark regions (Figure 1a). Subsequent stages of filtering are performed only on regions that score positive at every preceding stage. This cascade approach proved just as accurate as single-step face-detector algorithms but 10 times faster.
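The early-rejection logic of such a cascade can be sketched as follows. This is an illustration under stated assumptions, not the Viola-Jones detector itself: the two rectangle tests (`eyes_darker_than_cheeks`, `bridge_brighter_than_eyes`) are hand-written stand-ins with fixed zero thresholds, whereas the published detector learns its features and thresholds from training data. The integral image is used so that each rectangle sum costs four lookups.

```python
import numpy as np


def integral_image(img):
    """Summed-area table with a zero border, so any box sum costs 4 lookups."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width], read from the integral image."""
    b, r = top + height, left + width
    return ii[b, r] - ii[top, r] - ii[b, left] + ii[top, left]


def eyes_darker_than_cheeks(ii, top, left, size):
    """Hypothetical rectangle test: the eye band is darker than the band below it."""
    h = size // 4
    upper = box_sum(ii, top + h, left, h, size)
    lower = box_sum(ii, top + 2 * h, left, h, size)
    return lower - upper > 0


def bridge_brighter_than_eyes(ii, top, left, size):
    """Hypothetical rectangle test: the middle third of the eye band is brighter than its flanks."""
    h, w = size // 4, size // 3
    left_eye = box_sum(ii, top + h, left, h, w)
    bridge = box_sum(ii, top + h, left + w, h, w)
    right_eye = box_sum(ii, top + h, left + 2 * w, h, w)
    return 2 * bridge - (left_eye + right_eye) > 0


def cascade_detect(img, size, stages):
    """Return (top, left) of windows passing every stage; a window is dropped at its first failure."""
    ii = integral_image(img)
    hits = []
    for top in range(img.shape[0] - size + 1):
        for left in range(img.shape[1] - size + 1):
            # all() short-circuits, so later stages never run on rejected windows.
            if all(stage(ii, top, left, size) for stage in stages):
                hits.append((top, left))
    return hits
```

Because most windows fail the first cheap test, the later (in a real system, more expensive) stages run on only a small fraction of the image, which is the source of the speedup described above.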

Sinha's face-detection algorithm (Sinha 2002a) is based on the observation that qualitative contrast relationships between different parts of a face are highly conserved, even under different lighting conditions (Figure 1b). Even though any single contrast relationship between two facial regions would be inadequate to detect a face, a set of such relationships could be adequate (because probabilities multiply). A subset of Sinha's directed contrasts ([r2, r3] and [r4, r5]) are equivalent to the first stage of the Viola-Jones face detector.
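A minimal sketch of a ratio-template test follows. The region names and the particular list of relations are hypothetical and do not reproduce the r0 to r11 layout of Figure 1b; only the principle is taken from the text, namely that each relation records which of two regions is darker, not by how much. If one assumes, for illustration, that a nonface patch satisfies each relation independently with probability 0.5, then six relations are all satisfied by chance with probability 0.5^6, about 0.016.

```python
# Hypothetical (darker, brighter) region pairs; not the template of Sinha (2002a).
RELATIONS = [
    ("left_eye", "forehead"),
    ("right_eye", "forehead"),
    ("left_eye", "left_cheek"),
    ("right_eye", "right_cheek"),
    ("mouth", "left_cheek"),
    ("mouth", "right_cheek"),
]


def satisfied(region_means, relations=RELATIONS):
    """Count the relations whose ordinal contrast (darker < brighter) holds."""
    return sum(region_means[dark] < region_means[bright] for dark, bright in relations)


def looks_like_face(region_means, relations=RELATIONS, min_fraction=1.0):
    """Accept only if at least min_fraction of the ordinal relations hold."""
    return satisfied(region_means, relations) >= min_fraction * len(relations)


# Mean brightness per region for a toy face, and the same face under dimmer light.
face = {"forehead": 0.8, "left_eye": 0.3, "right_eye": 0.3,
        "left_cheek": 0.7, "right_cheek": 0.7, "mouth": 0.4}
dim_face = {name: 0.2 * value for name, value in face.items()}
```

Scaling every region by the same factor leaves all orderings intact, so `looks_like_face(face)` and `looks_like_face(dim_face)` agree, which is the sense in which qualitative contrast relations tolerate lighting changes.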

Effective primitives for face detection can also be computed using an information theory approach by identifying fragments (subwindows) of face images that are maximally informative about the presence or absence of a face (Ullman et al. 2002). The resulting fragments consist of medium-resolution face parts, e.g., an eye, rather than the whole face, so in this algorithm, face detection is triggered by detection of a threshold number of such fragments.

All three algorithms discussed above use basic feature detectors much simpler than a whole face (rectangle features in the Viola-Jones algorithm, qualitative contrast ratios between pairs of face regions in the Sinha algorithm, and face parts in the Ullman algorithm). Yet, all three algorithms perform holistic detection, that is, they obligatorily detect faces as correctly arranged wholes. This is because all three algorithms detect overlapping constellations of elemental features that cover the whole face. The feature overlaps implicitly enforce the correct overall arrangement of features.

Measurement

Once a face has been detected, it may need to be identified or classified. Algorithms for the identification of individual faces are generally either feature-based or holistic. In feature-based methods, fiducial points (e.g., eyes, mouth, nose) are identified and used to compute various geometric ratios. As long as the features can be detected, this approach is robust to position and scale variations. In holistic methods, the entire face is matched to memory templates without isolating specific features or parts. One advantage of holistic methods is that all parts of the face are used, and no information is discarded.

The simplest holistic recognition algorithm is to correlate a presented image directly to a bank of stored templates, but having templates for every face is expensive in time and memory space. Turk & Pentland (1991) developed the eigenface algorithm to overcome these limitations. The eigenface algorithm exploits the fact that all faces share a common basic structure (round, smooth, symmetric, two eyes, a nose, and a mouth). Thus the pixel arrays defining various faces are highly correlated, and the distinguishing characteristics of a face can be expressed more efficiently if these correlations are removed using principal components analysis (PCA). When PCA is performed on a large set of faces, the eigenvectors with largest eigenvalues all look like faces, and hence are called "eigenfaces" (Figure 2a). An arbitrary face can be projected onto a set of eigenfaces to yield a highly compressed representation; good face reconstructions can typically be obtained with just 50 eigenfaces and passable ones with just 25. In other words, something as ineffable as an identity can be reduced to 25 numbers (Figure 2b).

Eigenface: an eigenvector of the covariance matrix defined by a set of faces that allows a compressed representation

PCA: principal components analysis

Caricature: an artistic technique to enhance the recognizability of a face by exaggerating features distinguishing that face from the average face
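The projection and reconstruction steps of the eigenface approach can be sketched in a few lines of NumPy. This is a minimal illustration under assumptions: faces are taken to be already aligned and flattened into rows of a matrix, the function names are ours, and the eigenvectors are obtained from a singular value decomposition of the mean-centered data rather than from an explicit covariance matrix (the two give the same axes).

```python
import numpy as np


def fit_eigenfaces(faces, k):
    """faces: (n_images, n_pixels) array. Returns the mean face and the top-k eigenfaces."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # Rows of vt are orthonormal eigenvectors of the pixel covariance matrix,
    # ordered by decreasing singular value (hence decreasing eigenvalue).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]


def project(face, mean, eigenfaces):
    """Compress a face to k coefficients, one per eigenface."""
    return eigenfaces @ (face - mean)


def reconstruct(coeffs, mean, eigenfaces):
    """Approximate the face as the mean plus a weighted sum of eigenfaces."""
    return mean + coeffs @ eigenfaces
```

With `k = 25`, `project` returns the 25 numbers referred to in the text, and `reconstruct` maps them back to an image; the reconstruction is exact only when the face lies in the span of the retained eigenfaces.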

PCA on sets of faces varying in both expression and identity generates some principal components that are useful for only expression or only identity discrimination and others that are useful for both (Calder et al. 2001). This partial independence of PCs can successfully model the independent perception of expression and identity (Cottrell et al. 2002).

The eigenface algorithm does not perform well if the sample face is not accurately aligned in scale and position to the template eigenfaces. Human face perception, however, is tolerant to changes in both scale and position. Moreover, if a face is transformed further along the morph line representing the deviation of that face from the average face, the transformed face is easily recognized as the same individual; this is the basis of caricature (Leopold et al. 2001). The process of morphing one individual into another (Wolberg 1996) involves both an intensity transform (which eigenfaces model very well) and a simultaneous geometric transform (Figure 3a). Because eigenfaces represent axes of intensity values on a fixed spatial basis, the eigenface approach does not interpret caricature transformations as the same individual.

Figure 1
(a) The two most diagnostic features defining a face comprise the first level of the detection cascade in the Viola-Jones algorithm for face detection. From Viola & Jones 2004. (b) The Sinha algorithm for face detection, showing the ratio-templates defining a face. From Sinha 2002a.

Figure 2
The eigenface algorithm for face recognition. (a) The first 25 eigenvectors computed from the Yale face database (a collection of 165 face images). (b) Eigenface reconstructions of 5 different images, using the 25 eigenfaces shown in panel a. Note that nonface images can have nontrivial projections onto eigenfaces. Courtesy of C. DeCoro.

Figure 3
A computational approach that can represent both spatial and intensity variations. (a) The computer graphics technique of morphing, in which the identity of one individual can be continuously transformed into that of another, provides insights about the nature of the face template. In the middle row, the individual outlined in red is continuously morphed into the individual outlined in green, which requires both a geometric transform and an intensity transform. The top and bottom rows show pure geometric transforms (morphing of the mesh) of the same 2 faces (the top row shows the geometric distortion of the red face into the shape of the green face, and the bottom row shows the distortion of the green face into the shape of the red face). The middle row shows a weighted intensity average of the aligned meshes from the top and bottom rows. From Wolberg 1998. (b) Bags of Pixels variant on the eigenface algorithm. The (x, y) coordinate of each pixel is elevated to the same status as the intensity value. (c) Adding or subtracting traditional eigenfaces to an average face produces only intensity variations at each pixel. Adding or subtracting eigenfaces computed using Bags of Pixels, however, can produce geometric variations in addition to intensity variations. From Jebara 2003.

Jebara (2003) proposed a clever way to get around the spatial rigidity of the original eigenface approach: Instead of performing PCA on the intensity values, the size of the representation is tripled, so each pixel conveys not only the image intensity value but also the intensity value's (x, y) location. Then PCA can be done on the triple-sized image containing a concatenation of (x, y, I) values (Figure 3b,c). The power of this approach is that spatial coordinates are treated just like intensity coordinates, and thus the resultant eigenfaces represent both geometric and intensity variations. The fact that this bags of pixels approach performs three orders of magnitude better than standard eigenface analysis on face sets with changes in pose, illumination, and expression is computational proof of the importance of representing geometric variations in addition to intensity variations.
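Why elevating coordinates to the status of intensities matters can be seen in a deliberately reduced, one-dimensional toy example. This is not Jebara's algorithm: it averages two images rather than running PCA, it uses (x, I) pairs instead of (x, y, I) triples, and it assumes the correspondence between the two bags is given by sorting on x. It shows only that a linear combination in the coordinate-augmented space can move a feature, whereas a linear combination of intensities on a fixed grid can only cross-fade it.

```python
def intensity_average(img_a, img_b):
    """Pixel-wise mean on a fixed grid: only intensities can vary."""
    return [(a + b) / 2 for a, b in zip(img_a, img_b)]


def to_bag(img):
    """Represent a 1-D image as (x, I) pairs for its nonzero pixels, ordered by x.
    Sorting by x is a stand-in for solving the correspondence between two bags."""
    return [(float(x), float(i)) for x, i in enumerate(img) if i > 0]


def bag_average(bag_a, bag_b):
    """Mean of corresponding (x, I) pairs: positions are averaged just like intensities."""
    return [((xa + xb) / 2, (ia + ib) / 2)
            for (xa, ia), (xb, ib) in zip(bag_a, bag_b)]


def render(bag, width):
    """Draw a bag back onto a pixel grid, assigning each pair to its nearest pixel."""
    img = [0.0] * width
    for x, i in bag:
        img[round(x)] += i
    return img


left = [0, 1, 1, 0, 0, 0, 0, 0]    # bright bar at x = 1..2
right = [0, 0, 0, 0, 0, 1, 1, 0]   # the same bar shifted to x = 5..6

cross_fade = intensity_average(left, right)
shifted = render(bag_average(to_bag(left), to_bag(right)), len(left))
```

Here `cross_fade` contains two half-bright bars at the original positions, while `shifted` contains a single full-brightness bar midway between them, which is the kind of geometric variation that intensity-only eigenfaces cannot express.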

Categorization

Turk and Pentland used a simple Euclidean distance metric on face eigen-coordinates to perform recognition. More powerful classifiers that have been applied to the problem of face recognition include Fisher linear discriminants (Belhumeur et al. 1997), Bayesian estimation (Moghaddam et al. 2000), and support vector machines (Shakhnarovich & Moghaddam 2004). These classification techniques can be regarded as second-tier add-ons to the basic eigenface measurement system. Measurement yields analog descriptions, whereas classification is nonlinear and yields discrete boundaries between descriptions.
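The Euclidean-distance rule amounts to nearest-neighbor lookup in eigen-coordinate space, as in the sketch below. The gallery layout (a dictionary from a label to a stored coefficient vector) and the optional `max_distance` rejection threshold are our assumptions for illustration, not details taken from Turk and Pentland.

```python
import math


def nearest_identity(probe, gallery, max_distance=float("inf")):
    """Return the label whose stored coefficients are closest to the probe,
    or None if even the best match is farther than max_distance."""
    name = min(gallery, key=lambda label: math.dist(probe, gallery[label]))
    return name if math.dist(probe, gallery[name]) <= max_distance else None


# Hypothetical gallery of eigen-coordinates for two known individuals.
gallery = {
    "anna": [1.0, 0.0, 2.0],
    "ben": [-1.0, 3.0, 0.5],
}
```

The measurement (the coefficient vector) is analog, while the `min` over stored vectors is the nonlinear step that produces a discrete answer, mirroring the distinction drawn in the text.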

Separating the process of measurement from the process of classification gives a computational system maximum flexibility because different categorizations (e.g., emotion, identity, gender) can all operate on the same set of basic eigenvector projections. Gender determination can be based on large eigenvalue eigenvectors, whereas identification of individuals relies on lower-value eigenvectors (O'Toole et al. 1993). Furthermore, because classifications are necessarily nonlinear, the independence of classification mechanisms from measurement mechanisms would be very exciting from an experimental point of view because the templates for measurement could thus be linear, and therefore their detailed structure could be mapped. We will return to the idea of linear measurement mechanisms when we discuss tuning properties of face cells.

Invariance

Developing position and scale invariant recognition is a huge challenge for artificial face-recognition systems. Initial attempts to compute a meaningful set of eigen-coordinates for a face required that the face be accurately aligned in scale, position, and rotation angle to the template eigenfaces. However, if, as we propose, face detection precedes measurement, the detector can determine the location, size, and rotation angle of the eyes and face outline and then use these to normalize the input to face-measurement units.
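One simple way a detector's output could be used for normalization is sketched here: two detected eye positions fix a similarity transform (scale, rotation, and translation) that maps the face into template coordinates. The canonical eye positions are arbitrary values chosen for illustration, and using only the eyes (ignoring the face outline) is a simplifying assumption. Points are treated as complex numbers so that scale and rotation combine into a single multiplication.

```python
def eye_alignment(left_eye, right_eye,
                  canon_left=(30.0, 40.0), canon_right=(70.0, 40.0)):
    """Return a function mapping image points (x, y) into template coordinates,
    such that the detected eyes land on the canonical eye positions."""
    e1, e2 = complex(*left_eye), complex(*right_eye)
    c1, c2 = complex(*canon_left), complex(*canon_right)
    a = (c2 - c1) / (e2 - e1)   # complex factor encoding scale and rotation
    b = c1 - a * e1             # translation

    def transform(point):
        z = a * complex(*point) + b
        return (z.real, z.imag)

    return transform
```

For example, `eye_alignment((100.0, 100.0), (100.0, 180.0))` describes a face rotated by 90 degrees and twice the template size; the returned function sends the point midway between the detected eyes to (50, 40), midway between the canonical eyes.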

Summary

The main lesson we can extract from artificial systems for face processing is that detection and recognition are distinct processes, with distinct goals, primitives (coarse contrast relationships vs. detailed holistic templates), and computational architectures (filter cascade vs. parallel measurements). By preceding recognition, detection can act as a domain-specific filter to gate subsequent processing and can include alignment and segmentation, preparing the face representations for subsequent measurement. The effectiveness of the eigenface algorithm for face recognition shows that faces can be represented by their deviation from the average in a compressed subspace. To characterize faces most effectively, this subspace needs to include spatial variations as well as intensity variations.

Some machine vision models of recognition use common meta-algorithms to learn the primitives for both detection and recognition of faces (Riesenhuber & Poggio 2000, Ullman 2007). Thus the two processes may share core computational principles. Whether biological systems use discrete steps of detection, measurement, and classification to recognize faces is a question that can only be resolved empirically.

HUMAN BEHAVIOR AND FUNCTIONAL IMAGING

The extensive behavioral literature on face perception provides a rich source of clues about the nature of the computations performed in processing faces (Figure 4). One of the hallmarks of face processing is that recognition performance drops substantially when faces are presented upside down (Figure 4a) or in negative contrast, and both effects are much smaller for objects (Kemp et al. 1990, Yin 1969). We propose that both these properties can be explained if only upright, positive-contrast faces gain access to the face-processing system, i.e., if an upright, positive-contrast template is used for face detection. This template may be innate in humans, as evidenced by the tendency for newborns to track normal schematic faces longer than scrambled schematic faces (Johnson et al. 1991, Simion et al. 1998).

Figure 4
Behavioral observations on the nature of human face processing. (a) Flip the page upside down. The Thatcher Illusion shows that faces are obligatorily processed as wholes (an identical pair of features such as the upright and inverted mouth can appear similar or dramatically different depending on the surrounding context). From Thompson 1980. (b) Robustness of face identification to caricature. (c) Adaptation: Run your eyes along the 5 red dots for a minute, and then shift your gaze to the single red dot. From Afraz & Cavanagh 2008. (d) Robustness to compression. From Sinha et al. 2006. (e) The importance of external features. From Sinha & Poggio 1996. (f) Robustness to low resolution. From Sinha 2002b.

Prosopagnosia: highly specific inability to recognize faces, due to either congenital brain miswiring ("developmental prosopagnosia") or focal brain lesions ("acquired prosopagnosia")

Norm-Based Coding

Caricatures are remarkably powerful in evoking recognition (Figure 4b): Caricatured faces are often more identifiable than veridical photographs (Lee et al. 2000). This finding has led to the proposal that faces are represented in terms of their deviation from the norm, or average, face (Leopold et al. 2001, Rhodes et al. 1987). Furthermore, the existence of face aftereffects (Figure 4c) shows that the face norm is adaptable (Webster & MacLin 1999). Because such face aftereffects transfer across retinal positions (Leopold et al. 2001) and image sizes (Jeffery et al. 2006), they apparently do not reflect adaptation to specific low-level image features, but instead indicate adaptation of higher-level representations. This face identity aftereffect was interpreted as indicating that adaptation to a given face shifts the norm or average face in the direction of the adapting face, making faces on the opposite side of the norm more distinctive (i.e., more different from the norm). To explain these results Rhodes & Jeffery (2006) propose that face identity is coded by pairs of neural populations that are adaptively tuned to above-average and below-average values along each dimension of face space.
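The geometry of norm-based coding can be written down in a few lines, treating a face as a point in an abstract face space. This is a minimal sketch under assumptions: the two-dimensional coordinates, the linear update rule in `adapt_norm`, and its `rate` parameter are hypothetical choices made for illustration and are not a model of the neural populations proposed by Rhodes & Jeffery.

```python
def exaggerate(face, norm, k):
    """Move a face along the line through the norm: k > 1 gives a caricature,
    0 < k < 1 an anti-caricature, k = 0 the norm itself, k < 0 an anti-face."""
    return [n + k * (f - n) for f, n in zip(face, norm)]


def adapt_norm(norm, adaptor, rate=0.2):
    """Shift the norm part of the way toward an adapting face (rate is hypothetical)."""
    return [n + rate * (a - n) for n, a in zip(norm, adaptor)]


def distinctiveness(face, norm):
    """Euclidean distance of a face from the norm in face space."""
    return sum((f - n) ** 2 for f, n in zip(face, norm)) ** 0.5


norm = [0.0, 0.0]
face = [1.0, 2.0]
caricature = exaggerate(face, norm, 1.5)    # farther from the norm, same direction
anti_face = exaggerate(face, norm, -1.0)    # opposite side of the norm
adapted_norm = adapt_norm(norm, anti_face)  # norm after adapting to the anti-face
```

In this toy setting, adapting to the anti-face moves the norm away from the original face, so `distinctiveness(face, adapted_norm)` exceeds `distinctiveness(face, norm)`, matching the interpretation of the identity aftereffect given above.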

Opposite adaptation can occur simultaneously for upright and inverted faces, consistent with the idea that distinct neural pathways underlie the coding of (and adaptation to) upright versus inverted faces (Rhodes & Jeffery 2006). Finally, although norm-based coding can work only for classes of stimuli that have similar enough first-order shape that a norm can be defined, this situation may not be unique to faces. Rhodes & McLean (1990) showed evidence for norm-based coding for images of birds, and adaptation effects can also be observed for simple shapes such as taper and overall curvature (Suzuki & Cavanagh 1998). Thus adaptive norm-based coding may be a general feature of high-level form-coding processes.

Detection

As argued in the modeling section, it is computationally efficient to separate detection and recognition and to have detection precede recognition because detection can act as a domain-specific filter to make the recognition process more efficient (by focusing recognition on regions actually containing faces). That there are also separate detection and recognition stages in human face processing fits with one of the most striking findings from the neuropsychology literature: Patient CK, who was severely impaired at object recognition, including many basic midlevel visual processes, was nonetheless 100% normal at face recognition (Moscovitch et al. 1997). His pattern of deficits indicated that face processing is not simply a final stage tacked onto the end of the nonface object recognition pathway but rather a completely different pathway that branches away from object recognition early in the visual hierarchy, and it is this branching off that we propose to equate with the detection process. CK's dissociation is illustrated by his perception of the painting of a face made up of vegetables by Arcimboldo: CK sees the face but not the constituent vegetables.

CK's ability to recognize famous or familiar faces was at least as good as normal controls, until the faces were shown upside down, and then his performance became much worse than that of controls. Conversely, patients with prosopagnosia perform better than controls in recognizing inverted faces (Farah et al. 1995). This double dissociation of the inversion effect is consistent with the existence of a face-specific processing system that can be accessed only by upright faces, present in CK and absent in prosopagnosics. Presumably, CK can process objects using only the face-specific system, prosopagnosics have a general object-recognition system but not the face-specific system, and normal subjects have both systems. The general nonface object system is not as good at processing faces as the face-specific system (hence the inversion effect in normal subjects), is missing in CK (hence his disproportionate deficit for inverted faces), and is the only way prosopagnosics can process any face (hence their relatively superior performance with inverted faces because their general object system gets extra practice processing faces).

Holistic Processing of Faces

Face processing is said to be distinct from nonface object processing in that it is more holistic; that is, faces are represented as nondecomposed wholes rather than as a combination of independently represented component parts (eyes, nose, mouth) and the relations between them (Farah et al. 1998). Evidence for holistic processing of faces comes from a number of behavioral paradigms, of which the two most cited are the part-whole effect (Tanaka & Farah 1993) and the composite effect (Young et al. 1987). In the part-whole effect, subjects are better at distinguishing two face parts in the context of a whole face than in isolation. In the composite effect, subjects are slower to identify half of a chimeric face aligned with an inconsistent other half-face than if the two half-faces are misaligned (Young et al. 1987). As with the part-whole effect, the composite effect indicates that even when subjects attempt to process only part of the face, they suffer interference from the other parts of the face, suggesting a lack of access to parts of the face and mandatory processing of the whole face.

Inversion effect: some objects are recognized better when they are upright than when they are inverted; this is especially true for faces and words

One interpretation of the uniqueness of face processing is that it uses special neural machinery not shared by other kinds of objects, an idea that is consistent with functional imaging studies, as described below. Another interpretation is that holistic processing is characteristic of any kind of object that must be distinguished on a subordinate level, especially objects with which the subject is highly trained or familiar (Diamond & Carey 1986). It is not yet clear what the perceptual phenomenology of holistic processing implies either mechanistically or computationally. We suggest that holistic face processing can be explained by an obligatory detection stage that uses a coarse upright template to detect whole faces (Figure 5). This model explains the composite effect because an aligned chimera would be detected as a whole face and therefore would be processed as a unit by subsequent measurement and classification stages.
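The gating account can be made concrete with a small sketch. The Python below is a toy illustration under stated assumptions, not the authors' implementation: the four-element feature vectors, the template values, the function name `identify`, and the tie rule are all hypothetical. It shows only the logic described above: a stimulus detected as a whole face is projected onto the identity templates in full, whereas a stimulus that is not detected as a whole can be scored on its attended part alone.

```python
import numpy as np

# Hypothetical 4-element feature vectors: the first two entries stand for the
# top half of a face, the last two for the bottom half. Values are illustrative.
templates = {
    "Bill": np.array([1.0, 0.0, 1.0, 0.0]),
    "Jesse": np.array([0.0, 1.0, 0.0, 1.0]),
}
top_half = np.array([True, True, False, False])  # the attended part

# Chimera: Bill's top half joined to Jesse's bottom half.
chimera = np.array([1.0, 0.0, 0.0, 1.0])


def identify(stimulus, templates, attended, detected_as_whole):
    """Project a stimulus onto each template and pick a winner.

    If the detection stage has segmented the stimulus as a whole face,
    every feature enters the projection (obligatory holistic processing).
    Otherwise only the attended features are projected.
    Returns (winner, scores); winner is None when the top score is tied.
    """
    mask = np.ones(stimulus.shape, dtype=bool) if detected_as_whole else attended
    scores = {name: float(stimulus[mask] @ t[mask]) for name, t in templates.items()}
    best = max(scores.values())
    winners = [name for name, s in scores.items() if s == best]
    return (winners[0] if len(winners) == 1 else None), scores


# Aligned chimera: detected as a whole face, so the unattended half interferes.
aligned_winner, aligned_scores = identify(chimera, templates, top_half, True)
# Misaligned chimera: not detected as a whole, so the top half is read alone.
misaligned_winner, misaligned_scores = identify(chimera, templates, top_half, False)
```

In this toy case the aligned chimera produces equal projections onto both templates, so the winner-take-all step has no clear winner, while the misaligned chimera is identified from its top half. That is the direction of the composite effect, though the numbers here are invented for illustration.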

However, we cannot rule out alternatives, such as one-stage models in which both face detection and identification are carried out by the same set of face-selective cells. In this case, to explain holistic properties of face processing, we would have to postulate that individual face cells, unlike nonface cells, are selective not just for local features but for whole faces or that the readout of face information must comprise all or most of the population code. Either or both of these models would produce the behavioral holistic effects, even without an antecedent detection gate. The key evidence favoring our early detection gating hypothesis over a single-stage system comes from the identification of a series of face-selective areas in the macaque (Pinsk et al. 2005, Tsao et al. 2003) and the finding that an area early in this hierarchy already consists entirely of face-selective cells (Tsao et al. 2006); both these results are discussed more extensively below.

Although faces are unique in the degree to which they are processed holistically, other nonface objects can also show holistic effects, especially well-learned categories; for review see Gauthier & Tarr (2002). Words may approach faces in the degree to which they are processed holistically: Coltheart et al. (1993) found that some acquired dyslexics can read whole words and understand their meanings but cannot distinguish individual letters making up the words. And Anstis (2005) showed that word recognition can show the composite effect, in that observers cannot tell whether two words have the same or different top halves.

HUMAN FUNCTIONAL IMAGING

Positron emission tomography studies initially showed activation of the fusiform gyrus in a variety of face-perception tasks (Haxby et al. 1991, Sergent et al. 1992), and fMRI subsequently revealed more specificity in these cortical regions for faces with demonstrations of fusiform regions that responded more strongly to faces than to letter strings and textures (Puce et al. 1996), flowers (McCarthy et al. 1997), everyday objects, houses, and hands (Kanwisher et al. 1997). Although face-specific fMRI activation can also be seen in the superior temporal sulcus (fSTS) and in part of the occipital lobe [the occipital face area (OFA)], the most robust face-selective activation is consistently found on the lateral side of the right mid-fusiform gyrus, the fusiform face area (FFA) (Kanwisher et al. 1997) (Figure 6). The fact that this part of the brain is activated selectively in response to faces indicates that activity in this region must arise at or subsequent to a detection stage.

Many studies support the idea that the FFA is activated specifically by faces and not by the low-level stimulus features usually present in faces; that is, activity in the FFA indicates that stimuli have been detected as faces. The FFA shows increased blood flow in response to a wide variety of face stimuli: front and profile photographs of faces (Tong et al. 2000), line drawings of faces (Spiridon & Kanwisher 2002), and animal faces (Tong et al. 2000). Furthermore, the FFA BOLD signal to upright Mooney faces (low-information two-tone pictures of faces; Mooney 1957) is almost twice as strong as to inverted Mooney stimuli (which have similar low-level features but do not look like faces) (Kanwisher et al. 1998). Finally, for bistable stimuli such as the illusory face-vase, or for binocularly rivalrous stimuli in which a face is presented to one eye and a nonface is presented to the other eye, the FFA responds more strongly when subjects perceive a face than when they do not, even though the retinal stimulation is unchanged (Andrews et al. 2002, Hasson et al. 2001).

STS: superior temporal sulcus

FFA: fusiform face area

Blood-oxygen-level-dependent (BOLD) signal: hemodynamic signal measured in fMRI experiments. Active neurons consume oxygen, causing a delayed blood flow increase 1–5 s later

Expertise hypothesis: face-processing mechanisms are used to process any stimuli that share a common shape and with which the observer has visual expertise

Distributed coding: representation scheme using distributed activity of coarsely tuned units. A key challenge for this idea is specifying how distributed codes can be read out

Although the FFA shows the strongest increase in blood flow in response to faces, it does also respond to nonface objects. Therefore, two alternatives have been proposed to the idea that activity in the FFA represents face-specific processing. First is the expertise hypothesis. According to this idea, the FFA is engaged not in processing faces per se, but rather in processing any sets of stimuli that share a common shape and for which the subject has gained substantial expertise (Tarr & Gauthier 2000). Second is the distributed coding hypothesis: In an important challenge to a more modular view of face and object processing, Haxby et al. (2001) argued that objects and faces are coded via the distributed profile of neuronal activity across much of the ventral visual pathway. Central to this view is the suggestion that nonpreferred responses, for example, to objects in the FFA, may form an important part of the neural code for those objects. The functional significance of the smaller but still significant response of the FFA to nonface objects will hopefully be unraveled by the combined assaults of higher-resolution imaging in humans and single-unit recordings in nonhuman primates.

Measurement and Categorization

Does the human brain use separate systems for face measurement and face classification? Some fMRI evidence suggests that it does. For example, in a study of morphing between Marilyn Monroe and Margaret Thatcher, adaptation strength in the OFA followed the amount of physical similarity along the morph line, while in the FFA it followed the perceived identity (Rotshtein et al. 2005), suggesting that the OFA performs measurement and the FFA performs classification. However, another study indicates that release from adaptation occurs in the FFA when physical differences are unaccompanied by changes in perceived identity (Yue et al. 2006).

Figure 5
We propose that holistic (composite) effects of face processing can be explained by a detection stage that obligatorily segments faces as a whole. Subjects are asked to identify the top (faces) or left (car) part of each chimera (third and fourth rows) or simply to identify the object (first and second rows). Four face (a) and car (b) stimuli are detected, projected onto holistic templates, and then identified through a winner-take-all mechanism. The numbers in the third and fourth columns indicate the result of projecting each stimulus, after detection, onto the respective templates. Aligned faces are obligatorily detected as a whole, but misaligned faces and cars are not, and therefore their attended parts can be processed independently. According to this hypothesis, the essential difference between face (a) and nonface (b) processing occurs at the detection stage (red boxes). Subsequent measurement and classification could use similar mechanisms. Column headings in the figure: Stimulus, After detection, Bill template and Jesse template (a) or Beetle template and Isuzu template (b), Winner take all.

Figure 6
Face-selective regions in one representative subject. Face-selective regions (yellow) were defined as regions that respond more strongly to faces than to houses, cars, and novel objects (p < 10⁻⁴). Labels in the figure: right and left hemispheres, FFA (rFFA), LO-faces, pSTS. From Grill-Spector et al. 2004.

fMRI adaptation: controversial technique for deducing tuning properties of single cells from the magnitude of the BOLD signal, which averages activity of tens of thousands of cells

According to Bruce & Young (1986), the processing of facial expression (one form of categorization) and facial identity (another form of categorization) takes separate routes. Haxby and colleagues (2000) proposed a neural basis for this model. According to this idea, the inferior occipital gyri are involved in early perception of facial features (i.e., measurement). The pathway then diverges, with one branch going to the superior temporal sulcus, which is proposed to be responsible for processing changeable aspects of faces including direction of eye gaze, view angle, emotional expression, and lip movement. The other projection is to the lateral fusiform gyrus, which is responsible for processing identity. A recent review has challenged the Bruce and Young model, arguing that changeable aspects and invariant identity may instead be processed together and rely on partially overlapping visual representations (Calder & Young 2005).

Invariance

Several studies have used fMRI adaptation for face identity in the FFA and found invariance to image size (Andrews & Ewbank 2004) and spatial scale (Eger et al. 2004). Thus representations in the FFA are not tied to low-level image properties, but instead show at least some invariance to simple image transformations, though not to viewpoint (Pourtois et al. 2005).

Summary

Behavioral studies complement computational approaches by indicating that specialized machinery may be used to process faces and that a face-detection stage gates the flow of information into this domain-specific module. The filters, or templates, used by this detection stage require an upright, positive contrast face, with the usual arrangement of features, and images that do not fit the template are analyzed only by the general object-recognition system. Even images that pass into the face-specific module are probably also processed in parallel by the general system, but the face module appears to process images differently from the general object system: Face processing is holistic in the sense that we cannot process individual face parts without being influenced by the whole face. We suggest that this difference arises early in the face processing pathway. The face-detection stage may, in addition to gating access, obligatorily segment faces as a whole for further processing by the face module. Finally, substantial recent evidence suggests that face identity is coded in an adaptive norm-based fashion.

Human imaging studies converge on the conclusion that faces are processed in specific locations in the temporal lobe, but the degree of specialization for faces within these locations is debated. The modular interpretation is consistent with neurological findings and, as described below, with single-unit recordings in macaques. The role of experience in determining both the localization of face processing and its holistic characteristics is also debated. And the relationship, if any, between modular organization and holistic processing is completely unexplored. Only a few visual object categories show functional localization in fMRI: faces, body parts, places, and words (for review see Cohen & Dehaene 2004, Grill-Spector & Malach 2004). Faces, bodies, and places are all biologically significant, and their neural machinery could conceivably be genetically programmed, but the use of writing arose too recently in human history for word processing to be genetically determined. Therefore, at least one kind of anatomical compartmentalization must be due to extensive experience. We have suggested that the existence of discrete brain regions dedicated to face processing implies an obligatory detection stage and that an obligatory detection stage results in holistic processing. What we know about word processing suggests that it too displays holistic properties, and it is localized, interestingly, in the left hemisphere in an almost mirror symmetric location to the position of the FFA in the right hemisphere (Cohen & Dehaene 2004, Hasson et al. 2002).

Inferotemporal cortex (IT): ventral temporal lobe, including the lower bank of the STS and outer convexity, specialized for visual object recognition

Gnostic unit (or grandmother cell): a hypothetical cell responding exclusively to a single high-level percept in a highly invariant manner

MONKEY fMRI AND SINGLE-UNIT PHYSIOLOGY

Detection

The seminal finding by Gross and his colleagues (1969, 1972) that there exist cells in inferotemporal cortex (IT) that are driven optimally by complex biologically relevant stimuli, such as hands or faces, was novel and initially not well accepted, despite the fact that Konorski (1967) had predicted the existence of face-selective cells, or gnostic units, and that they would be found in IT. Although IT cells do not generally appear to be detectors for complex objects, there are consistently observed populations of cells selectively responsive to faces, bodies, and hands, suggesting that faces, bodies, and hands are treated differently from other types of complex patterns, consistent with their also being among the only object categories, aside from words and numbers, that show localization in human fMRI. But the strong possibility remained that these cells were not really tuned to biologically relevant objects, but rather to some more abstract basis set, in which all possible shapes are represented by different cells and some cells were tuned to particular parameters that happened to fit the face or hand stimuli better than any of the other objects tested. Foldiak et al. (2004) recently provided evidence that face selectivity is not just an incidental property of cells tuned to an exhaustive set of image features: They presented 600–1200 stimuli, randomly chosen from several image archives, to cells recorded from both the upper and the lower bank of the STS and found that the distribution of tuning to these images showed bimodality, i.e., cells were either predominantly face selective or not face selective. It is not unprecedented to have specialized neural systems for socially important functions: Birds have evolved specialized structures for the perception and generation of song, and in humans there are specialized parts of the auditory and motor systems devoted specifically to language.

Direct evidence that some face cells are used for face detection comes from a microstimulation study by Afraz et al. (2006). Monkeys were trained to discriminate between noisy pictures of faces and nonface objects. Through systematic sampling, Afraz et al. identified cortical locations where clusters of face-selective cells could be reliably recorded. When they stimulated these regions and observed the monkeys’ perceptual choices, they found a shift in the psychometric curve favoring detection of a face.

Holistic Processing of Faces

In general, face cells require an intact face and are not selective just for individual features (Bruce et al. 1981; Desimone et al. 1984; Kobatake & Tanaka 1994; Leonard et al. 1985; Oram & Perrett 1992; Perrett et al. 1982, 1984; Scalaidhe et al. 1999; Tsao et al. 2006). Figure 7 shows nonlinear combinatorial response properties of a face-selective cell recorded in IT by Kobatake & Tanaka (1994). Out of a large number of three-dimensional objects, this cell responded best to the face of a toy monkey (panel a), and by testing various simplified two-dimensional paper stimuli, they determined that the cell would also respond to a configuration of two black dots over a horizontal line within a disk (panel b) but not in the absence of either the spots or the line (panels c and d) or the circular outline (panel e). The contrast between the inside and the outside of the circle was not critical (panel g), but the spots and the bar had to be darker than the disk (panel h). Thus the cell responded only when the stimulus looked like a face, no matter how simplified.

The response selectivity of face cells indicates that they must not only combine features nonlinearly but also require them to be in a particular spatial configuration. However, such spatial-configuration selective responses and nonlinear combination of features are not restricted to face cells, as such behavior has been reported for other kinds of complex object-selective cells in the temporal lobe (Baker et al. 2002, Kobatake & Tanaka 1994, Tanaka et al. 1991). Even earlier in the temporal pathway, nonlinear combinatorial shape selectivity can be seen (Brincat & Connor 2004).

Anatomical Specialization of Face Cells

Most studies on face cells reported face-selective cells scattered throughout the temporal lobe, though they tended to be found in clusters (Perrett et al. 1984). Because other kinds of shape selectivities also tend to be clustered (Desimone et al. 1984, Fujita et al. 1992, Tanaka et al. 1991, Wang et al. 1996), it was assumed that within the temporal lobe there was a columnar organization for shape, in which face columns represented just one of many shape-specific types of columns. However, this view was inconsistent with emerging evidence from human neurology and functional imaging that human face processing was localized to specific, reproducible regions of the temporal lobe. The apparent discrepancy was resolved by two recent studies by Tsao et al. (2003, 2006), who found that in monkeys, as in humans, face processing, as revealed by functional imaging, is localized to discrete regions of the temporal lobe, and they further showed that even at the single-unit level, face processing is highly localized (Figure 8; note also Figure 7, top).

Tsao et al. used functional imaging to localize regions in the macaque temporal lobe that were selectively activated by faces, compared with nonface objects, and then they recorded almost 500 single units within the largest of these face-selective regions in two monkeys. They found a remarkable degree of face selectivity within this region; 97% of the cells were face selective, on average showing almost 20-fold larger responses to faces than to nonface objects. The region where they recorded was quite posterior in the temporal lobe (6 mm anterior to the interaural canal, corresponding to posterior TE/anterior TEO). The fact that an area consisting almost entirely of face-selective cells exists so early in the ventral stream provides strong support for the hypothesis that the face pathway is gated by an obligatory detection stage.

Figure 7
Holistic face detection. (Top) Recording site and receptive-field location of a face cell. (a–h) Response selectivity. Scale bars in the figure: 10°, 20°, 1 s, and 50 i s⁻¹. From Kobatake & Tanaka (1994).

In light of the clear large-scale organization of face processing in macaques revealed by Tsao et al. and recently by Pinsk et al. (2005), we reexamined all previous physiological studies that mapped out locations of face-selective cells, and by remapping their face-cell localizations onto a common map, we found that, taken en masse, these studies do show a concentration of face selectivity in two major regions of the temporal lobe, regions that correspond to the middle and anterior face patches described by Tsao and colleagues using functional imaging (Figure 8d).

Figure 8
Mapping face and object selectivity in the monkey brain. (a) Five stimulus categories included faces, four nonface object categories (hands, gadgets, fruits, and bodies), and grid scrambled patterns. (b) Map of faces > objects. (c) Map of objects > scrambled. (d) Meta-analysis showing the location of physiologically identified face-selective cells; studies identified by first author and date (Perrett 1985, 1987; Rolls 1984; Desimone 1984; Yamane 1988; Hasselmo 1989; Harries 1991; Tanaka 1991; Kobatake 1994; Foldiak 2004; Eifuku 2004; De Souza 2005; Tsao 2006). Five hundred face-selective cells were recorded by Tsao et al. 2006 at the location indicated by the pink asterisk. (e) Responses of 182 neurons from M1’s middle face patch to 96 images of faces and nonface objects (axes: image number, cell number). (f) Average normalized population response to each image (axes: image number, mean response; images grouped as faces, bodies, fruit, gadgets, hands, and scrambled patterns). Panels a–c, e, f are from Tsao et al. 2006.

The Functional Significance of the Anatomical Localization of Face Processing

The cerebral cortex is functionally parcellated: Neurons concerned with similar things are organized into areas and columns, each having extensive interconnections and common inputs and outputs (Mountcastle 1997). It is not surprising that face processing, being an important, identifiable and discrete form of object recognition, is also organized into anatomically discrete processing centers. Individual neurons connect with only a small fraction of the rest of the neurons in the brain, usually to nearby cells, because longer axons delay neural transmission, are energetically expensive, and take up space. Barlow (1986) has noted that facilitatory interactions within a functional area or column could underlie Gestalt linking processes: clustering cells concerned with color or motion might facilitate interactions between parts of the visual field having common color or motion. However, enriched local inhibitory interactions and sharpening of tuning might be an even more important function of colocalization because inhibitory neurons are always local, and long-range intracortical connections are invariably excitatory (Somogyi et al. 1998). Wang et al. (2000) recorded responses in anterior IT to a set of complex stimuli before, during, and after applying the GABA antagonist bicuculline near the recording electrode. In many cases, for both face-selective and nonface-selective cells, blocking local inhibition revealed responses to previously nonactivating stimuli, which were often activating stimuli for neighboring cells. This suggests that neighboring cells refine each other’s response selectivity by mutual inhibition.

Time Course of Feature-Combination Responses

Although a large fraction of the information about which face stimulus was shown is carried by the earliest 50 ms of the response of face-selective cells (Tovee et al. 1993), several studies have shown that the information carried by the early part of the response is different from the information carried by later spikes. In particular, the earliest spikes in a response are sufficient for distinguishing faces from other object categories, but information about individual facial identity does not develop until ∼50 ms later (Sugase et al. 1999, Tsao et al. 2006).

Similarly, responses in IT to nonface stimuli also become more selective, or sparser, over time (Tanaka et al. 1991, Tamura & Tanaka 2001). Similar temporal dynamics indicative of early detection activity followed by later individual identification activity have been observed for face-selective MEG responses in human occipitotemporal cortex (Liu et al. 2002). The observations that global information precedes finer information are consistent with a role for local inhibition in sharpening tuning within a local cluster of cells having similar response properties. Such response dynamics suggest a feedback or competitive process, whereby cells that respond best to a given stimulus inhibit nearby cells, resulting in a winner-take-all situation.
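How mutual inhibition could turn an initially broad response into a sparse one can be sketched with a generic rate model. The Python below is a minimal illustration under assumptions of our own choosing, not a model taken from the studies cited: the three input strengths, the inhibition weight of 1.5, the Euler step, and the function name `simulate_competition` are all hypothetical. Each cell relaxes toward its rectified input minus inhibition proportional to the summed activity of the other cells.

```python
import numpy as np


def simulate_competition(inputs, inhibition=1.5, dt=0.1, steps=400):
    """Euler-integrate dr/dt = -r + max(0, input - inhibition * other cells' rates).

    Returns an array of shape (steps + 1, n_cells) holding the rates over time.
    """
    inputs = np.asarray(inputs, dtype=float)
    rates = np.zeros_like(inputs)
    history = [rates.copy()]
    for _ in range(steps):
        others = rates.sum() - rates  # summed activity of the other cells
        drive = np.maximum(0.0, inputs - inhibition * others)
        rates = rates + dt * (-rates + drive)
        history.append(rates.copy())
    return np.array(history)


# Three cells with similar tuning receive graded input from the same stimulus.
history = simulate_competition([1.0, 0.8, 0.6])
early = history[1]   # first time step: every cell responds, in proportion to its input
late = history[-1]   # steady state: only the best-driven cell remains active
```

With these assumed parameters the earliest response is graded across all three cells, and the late response is confined to the cell with the strongest input. The sketch shows only that local inhibition is sufficient for such sharpening; it says nothing about whether feedback or lateral competition is the actual mechanism.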

Norm-Based Coding

Recently an idea has emerged for both face processing and general object coding in the temporal lobe: that firing rate represents the magnitude of deviation from a template or norm for that property. Cells in V4 can be tuned to curvature, but the optimal values for curvature are most often found at either extreme or zero curvature, with few cells tuned to intermediate curvature (Pasupathy & Connor 2001). Kayaert and colleagues (2005a) found norm-based tuning for shapes in IT; neurons tuned to different shapes tended to show monotonic tuning, with maximum responses to extreme values of those shapes. Lastly, Leopold et al. (2006) recorded from face-responsive cells in anterior IT and found that most cells were tuned around an identity-ambiguous average human face, showing maximum firing to faces farthest from an average face (i.e., tuning was V-shaped around the average). Freiwald et al. (2005), on the other hand, reported that many cells in the macaque middle face patch showed monotonic tuning curves to different feature dimensions in a large cartoon face space, with the maximum response at one extreme and the minimum response at the opposite extreme (Figure 9). This ramp-shaped tuning is consistent with the model proposed by Rhodes et al. (2004) for explaining the face-adaptation effect (Figure 5b): each face feature axis is coded by two opponent cell populations; thus the face norm would be implicitly represented as the virtual point of intersection between face cell populations with opponent ramp-shaped feature tuning curves. For both faces and nonface objects, many cells show tuning to several feature dimensions, and the tuning is separable, or independent, for the different tuning axes (Freiwald et al. 2005, Kayaert et al. 2005b).

Figure 9. Tuning of face cells to a cartoon face space. (a) Three example dimensions of the 19-dimensional cartoon space. Each row shows example values for one parameter, with all other parameters fixed at their mean. (b) Tuning curves of two example cells to each of the 19 feature dimensions (horizontal axis: feature value; vertical axis: firing rate in Hz). Maximal, minimal, and mean values from shift predictor are shown in gray. Stars mark significant modulation. From Freiwald et al. 2005.
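The opponent-coding idea in the preceding paragraph can be made concrete with a small numerical sketch. Everything specific here is an assumption made for illustration: the linear ramp shape, the baseline and slope values, the feature range, and the modeling of adaptation as a multiplicative gain change on one pool. None of these numbers come from Freiwald et al. (2005) or Rhodes et al. (2004). The sketch only shows that the norm need not be stored explicitly, because it is the feature value at which the two ramps cross.

```python
def ramp_pair(x, baseline=20.0, slope=10.0, gain_pos=1.0, gain_neg=1.0):
    """Firing rates (Hz) of two opponent pools for a feature value x.

    One pool fires more as x increases, the other as x decreases.
    All parameter values are hypothetical.
    """
    r_pos = gain_pos * max(0.0, baseline + slope * x)
    r_neg = gain_neg * max(0.0, baseline - slope * x)
    return r_pos, r_neg


def implied_norm(baseline=20.0, slope=10.0, gain_pos=1.0, gain_neg=1.0):
    """Feature value at which the two ramps intersect (equal firing rates).

    Solves gain_pos * (baseline + slope * x) = gain_neg * (baseline - slope * x),
    which is valid while neither ramp is clipped at zero.
    """
    return baseline * (gain_neg - gain_pos) / (slope * (gain_pos + gain_neg))


# Unadapted: the ramps cross at the centre of the feature axis.
print("norm before adaptation:", implied_norm())

# Assumed effect of adapting to faces with a positive feature value:
# the pool preferring positive values loses gain, and the crossing point moves.
shifted = implied_norm(gain_pos=0.8)
print("norm after adaptation: ", round(shifted, 3))
print("rates at shifted norm: ", ramp_pair(shifted, gain_pos=0.8))
```

Under these assumptions the crossing point moves toward the adapting value, so the previously average face now falls on the opposite side of the norm, which is the direction of the face aftereffect described in the text.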

Invariance

Face-selective cells in the temporal lobe are usually position and scale invariant in their ability to detect and distinguish faces, but they are seldom view and angle invariant (Desimone et al. 1984; Perrett et al. 1984, 1985, 1989, 1991; Rolls & Baylis 1986; Tanaka et al. 1991; Tovee et al. 1994; Tsao et al. 2006). The marked view selectivity of some IT cells may reflect a role in interpreting social gestures (who is looking at whom) (Argyle & Cook 1976, Bertrand 1969). De Souza et al. (2005) recently found a striking pattern of view selectivity in rostral versus caudal anterior STS. In caudal anterior STS, they found mirror-symmetric view-tuned cells, but in rostral anterior STS, view tuning was not mirror symmetric. Furthermore, view angle and gaze direction interacted: neurons were selective for a particular combination of face view and direction of gaze and were often strongly modulated by eye contact.

Recordings from the medial temporal lobe of human epilepsy patients have revealed the existence of cells that respond to familiar individuals in a highly invariant manner (Quiroga et al. 2005), as expected of a grandmother cell. For example, some cells responded to multiple pictures of a well-known individual as well as to a letter string of the person’s name but were unresponsive to all other images. Such individual-specific cells have not been found in the lateral inferior temporal lobe, where most face cells in monkeys have been recorded, although as a population, cells in the anterior inferior temporal gyrus of the macaque can represent view-invariant identity (Eifuku et al. 2004).

Summary

The correlation between fMRI localization of face processing in macaques and the strong clustering of physiologically identified face-selective cells supports the idea of domain specificity, suggested by neurological findings and fMRI studies in humans. The strength and predominance of face selectivity within the middle face patch are not consistent with either the expertise hypothesis or the distributed coding model. The existence of neurons located at an early stage of form processing in the macaque brain that respond selectively to faces supports the idea that face processing begins with a detection stage, and the response properties of face cells indicate that this stage is highly nonlinear.

However, face cells seem to measure different face variables independently and linearly, so how can this be reconciled with evidence that face perception in humans is holistic; i.e., how can we explain the composite effect and the part-whole effect neurally? We suggest that both these apparently nonlinear perceptual effects are consistent with a linear neural measuring stage if the preceding detection stage is holistic and nonlinear. One surprising result from physiological studies on face processing is the preponderance of view-selective units, but what role they play in face processing is still unclear.
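The suggestion above, a linear measuring stage preceded by a holistic, nonlinear detection stage, can be sketched as a two-stage computation. This is a hypothetical illustration, not the authors' model: the function names, the use of a minimum over template matches as the gate, the threshold, and the weights are all assumptions introduced here. The sketch shows that the output can be exactly linear in the measured features whenever the gate is open, yet depend on the whole pattern because one poorly matching part closes the gate.

```python
def detect_face(template_matches, threshold=0.8):
    """Holistic, nonlinear gate (assumed form).

    template_matches: match scores in [0, 1] for coarse parts of a face template.
    Taking the minimum acts like an AND: one badly matching part vetoes detection.
    """
    return 1.0 if min(template_matches) >= threshold else 0.0


def measure(features, weights, baseline=10.0):
    """Linear measuring stage: a weighted sum of feature values plus a baseline."""
    return baseline + sum(w * f for w, f in zip(weights, features))


def face_cell_response(template_matches, features, weights):
    """Measurement is passed on only if the detection gate is open."""
    return detect_face(template_matches) * measure(features, weights)


weights = [5.0, -3.0]          # hypothetical tuning to two feature dimensions
aligned = [0.9, 0.95, 0.85]    # all template parts match: gate open
misaligned = [0.9, 0.3, 0.85]  # one part fails (e.g., halves offset): gate closed

print(face_cell_response(aligned, [0.2, 0.4], weights))
print(face_cell_response(aligned, [0.4, 0.4], weights))
print(face_cell_response(misaligned, [0.4, 0.4], weights))
```

With the gate open, changing the first feature changes the response in proportion to its weight; with the gate closed, the same feature change has no effect. This is one way, under the stated assumptions, that a linear measuring stage could coexist with apparently nonlinear, whole-face perceptual effects.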

FUTURE DIRECTIONS

1. Is face processing unique? We do not yet understand the details of how either faces or nonface objects are represented in the brain. Perceptual studies have shown major differences in the ways that faces and objects are recognized, but there are nevertheless similarities in the response properties between face-selective cells and object-selective cells in IT. Both face- and object-selective cells in IT show tuning characteristics of a norm-based code. A variety of evidence suggests that our perception of faces is holistic, but processing of some nonface objects, like words, also shows important context effects. One fact is clear: The basic computational challenges to face processing are common to all object recognition (namely, detection, measurement, and classification). What is a face template in computational and neural terms, and how does it differ from a chair template? A truly satisfying answer to the question of whether face processing is unique will come only when we understand the precise neural mechanism underlying both face and nonface object recognition.

2. Is face processing modular? Perhaps the most striking result to come from the neurobiological research on face perception in the past decade is that specialized machinery is used for processing faces. Evidence reveals a fundamental specialization both at the gross anatomical level and at the level of single cells. It will be exciting to move forward along this pathway to understand how these face cells are used for different high-level percepts and behaviors; e.g., conveying invariant identity, expression, direction of attention, social dominance. But we believe that equally important new insights will come from looking back, asking how these cells acquire their face selectivity, that is, undertaking a systematic study of the face-detection process.

3. What makes face processing special? We have proposed that what is special about face processing is that it is gated by an obligatory detection process. Such a design would be computationally elegant (by allowing for fast domain-specific filtering, segmentation, and alignment prior to fine-grained identification) and could explain the existence of face cells, face areas, prosopagnosia, and holistic processing. This detection-gating hypothesis naturally leads to the idea that there are two distinct classes of face cells: face-recognition cells, which encode different kinds of face templates, and face-detector cells, which (contrary to their name) could perform the triple function of detection, segmentation, and alignment. However, it is also possible that detection and discrimination are carried out by the same cells (either simultaneously or sequentially). Either way, we should at least be able to find out the answer. Because we know that face-selective cells are coding faces, we can distinguish detection-related activity from discrimination-related activity, which is impossible when one is studying a cell whose form specialization is unknown. Perhaps what is truly special about face processing is that it is now amenable to being understood. We have a beautiful hierarchy, a gift from nature, and we should exploit it in both directions.

DISCLOSURE STATEMENT

The authors are not aware of any biases that might be perceived as affecting the objectivity of this review.

LITERATURE CITED

Afraz SR, Cavanagh P. 2008. Retinotopy of the face aftereffect. Vision Res. 48:42–54

Afraz SR, Kiani R, Esteky H. 2006. Microstimulation of inferotemporal cortex influences face categorization. Nature 442:692–95

Andrews TJ, Ewbank MP. 2004. Distinct representations for facial identity and changeable aspects of faces in the human temporal lobe. Neuroimage 23:905–13

Andrews TJ, Schluppeck D, Homfray D, Matthews P, Blakemore C. 2002. Activity in the fusiform gyrus predicts conscious perception of Rubin’s vase-face illusion. Neuroimage 17:890–901

Anstis S. 2005. Last but not least. Perception 34:237–40

Argyle M, Cook M. 1976. Gaze and Mutual Gaze. Cambridge, UK/New York: Cambridge Univ. Press. xi, 210 pp.

Baker CI, Behrmann M, Olson CR. 2002. Impact of learning on representation of parts and wholes in monkey inferotemporal cortex. Nat. Neurosci. 5:1210–16

Barlow HB. 1986. Why have multiple cortical areas? Vision Res. 26:81–90

Belhumeur PN, Hespanha JP, Kriegman DJ. 1997. Eigenfaces vs Fisherfaces: recognition using class specific linear projection. IEEE Trans. Patt. Anal. Mach. Intell. 19:711–20

Bertrand M. 1969. The Behavioural Repertoire of the Stumptail Macaque: A Descriptive and Comparative Study. Basel, Switz.: Karger

Brincat SL, Connor CE. 2004. Underlying principles of visual shape selectivity in posterior inferotemporal cortex. Nat. Neurosci. 7:880–86

Bruce C, Desimone R, Gross CG. 1981. Visual properties of neurons in a polysensory area in superior temporal sulcus of the macaque. J. Neurophysiol. 46:369–84

Bruce V, Young A. 1986. Understanding face recognition. Br. J. Psychol. 77(Pt. 3):305–27

Calder AJ, Burton AM, Miller P, Young AW, Akamatsu S. 2001. A principal component analysis of facial expressions. Vision Res. 41:1179–208

Calder AJ, Young AW. 2005. Understanding the recognition of facial identity and facial expression. Nat. Rev. Neurosci. 6:641–51

Cohen L, Dehaene S. 2004. Specialization within the ventral stream: the case for the visual word form area. Neuroimage 22:466–76

Coltheart M, Curtis B, Atkins P, Heller M. 1993. Models of reading aloud: dual-route and parallel-distributed-processing approaches. Psychol. Rev. 100:589–608

Cottrell GW, Branson KM, Calder AJ. 2002. Do expression and identity need separate representations? Presented at Annu. Meet. Cogn. Sci. Soc., 24th, Fairfax, Va.

Desimone R, Albright TD, Gross CG, Bruce C. 1984. Stimulus-selective properties of inferior temporal neurons in the macaque. J. Neurosci. 4:2051–62

De Souza WC, Eifuku S, Tamura R, Nishijo H, Ono T. 2005. Differential characteristics of face neuron responses within the anterior superior temporal sulcus of macaques. J. Neurophysiol. 94:1252–66

Diamond R, Carey S. 1986. Why faces are and are not special: an effect of expertise. J. Exp. Psychol. Gen. 115:107–17

Eger E, Schyns PG, Kleinschmidt A. 2004. Scale invariant adaptation in fusiform face-responsive regions. Neuroimage 22:232–42

Eifuku S, De Souza WC, Tamura R, Nishijo H, Ono T. 2004. Neuronal correlates of face identification in the monkey anterior temporal cortical areas. J. Neurophysiol. 91:358–71

Farah MJ, Wilson KD, Drain HM, Tanaka JR. 1995. The inverted face inversion effect in prosopagnosia: evidence for mandatory, face-specific perceptual mechanisms. Vision Res. 35:2089–93

Farah MJ, Wilson KD, Drain M, Tanaka JN. 1998. What is “special” about face perception? Psychol. Rev. 105:482–98

Foldiak P, Xiao D, Keysers C, Edwards R, Perrett DI. 2004. Rapid serial visual presentation for the determination of neural selectivity in area STSa. Prog. Brain Res. 144:107–16

Freiwald WA, Tsao D, Tootell RB, Livingstone MS. 2005. Single-unit recording in an fMRI-identified macaque face patch. II. Coding along multiple feature axes. Soc. Neurosci. Abstr. 362.6

Fujita I, Tanaka K, Ito M, Cheng K. 1992. Columns for visual features of objects in monkey inferotemporal cortex. Nature 360:343–46

Gauthier I, Tarr MJ. 2002. Unraveling mechanisms for expert object recognition: bridging brain activity and behavior. J. Exp. Psychol. Hum. Percept. Perform. 28:431–46

Grill-Spector K, Knouf N, Kanwisher N. 2004. The fusiform face area subserves face perception, not generic within-category identification. Nat. Neurosci. 7:555–62

Grill-Spector K, Kushnir T, Edelman S, Avidan G, Itzchak Y, Malach R. 1999. Differential processing of objects under various viewing conditions in the human lateral occipital complex. Neuron 24:187–203

Grill-Spector K, Malach R. 2004. The human visual cortex. Annu. Rev. Neurosci. 27:649–77

Gross CG, Bender DB, Rocha-Miranda CE. 1969. Visual receptive fields of neurons in inferotemporal cortex of the monkey. Science 166:1303–6

Gross CG, Rocha-Miranda CE, Bender DB. 1972. Visual properties of neurons in inferotemporal cortex of the macaque. J. Neurophysiol. 35:96–111

Hasson U, Hendler T, Ben Bashat D, Malach R. 2001. Vase or face? A neural correlate of shape-selective grouping processes in the human brain. J. Cogn. Neurosci. 13:744–53

Hasson U, Levy I, Behrmann M, Hendler T, Malach R. 2002. Eccentricity bias as an organizing principle for human high-order object areas. Neuron 34:479–90

Haxby JV, Gobbini MI, Furey ML, Ishai A, Schouten JL, Pietrini P. 2001. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293:2425–30

Haxby JV, Grady CL, Horwitz B, Ungerleider LG, Mishkin M, et al. 1991. Dissociation of object and spatial visual processing pathways in human extrastriate cortex. Proc. Natl. Acad. Sci. USA 88:1621–25

Haxby JV, Hoffman EA, Gobbini MI. 2000. The distributed human neural system for face perception. Trends Cogn. Sci. 4:223–33

Jebara T. 2003. Images as bags of pixels. Presented at IEEE Int. Conf. Comp. Vis. (ICCV’03), 9th, Nice, France

Jeffery L, Rhodes G, Busey T. 2006. View-specific coding of face shape. Psychol. Sci. 17:501–5

Johnson MH, Dziurawiec S, Ellis H, Morton J. 1991. Newborns’ preferential tracking of face-like stimuli and its subsequent decline. Cognition 40:1–19

Kanwisher N, Tong F, Nakayama K. 1998. The effect of face inversion on the human fusiform face area. Cognition 68:B1–11

Kanwisher NG, McDermott J, Chun MM. 1997. The fusiform face area: a module in human extrastriate cortex specialized for face perception. J. Neurosci. 17:4302–11

Kayaert G, Biederman I, Op de Beeck H, Vogels R. 2005a. Tuning for shape dimensions in macaque inferior temporal cortex. Eur. J. Neurosci. 22:212–24

Kayaert G, Biederman I, Vogels R. 2005b. Representation of regular and irregular shapes in macaque inferotemporal cortex. Cereb. Cortex 15:1308–21

Kemp R, McManus C, Pigott T. 1990. Sensitivity to the displacement of facial features in negative and inverted images. Perception 19:531–43

Kobatake E, Tanaka K. 1994. Neuronal selectivities to complex object features in the ventral visual pathway of the macaque cerebral cortex. J. Neurophysiol. 71:856–67

Konorski J. 1967. Integrative Activity of the Brain: An Interdisciplinary Approach. Chicago: Univ. Chicago Press. xii, 531 pp.

Lee K, Byatt G, Rhodes G. 2000. Caricature effects, distinctiveness, and identification: testing the face-space framework. Psychol. Sci. 11:379–85

Leonard CM, Rolls ET, Wilson FA, Baylis GC. 1985. Neurons in the amygdala of the monkey with responses selective for faces. Behav. Brain Res. 15:159–76

Leopold DA, Bondar IV, Giese MA. 2006. Norm-based face encoding by single neurons in the monkey inferotemporal cortex. Nature 442:572–75

Leopold DA, O’Toole AJ, Vetter T, Blanz V. 2001. Prototype-referenced shape encoding revealed by high-level aftereffects. Nat. Neurosci. 4:89–94

Liu J, Harris A, Kanwisher N. 2002. Stages of processing in face perception: an MEG study. Nat. Neurosci. 5:910–16

McCarthy G, Luby M, Gore J, Goldman-Rakic P. 1997. Infrequent events transiently activate human prefrontal and parietal cortex as measured by functional MRI. J. Neurophysiol. 77:1630–34

Moghaddam B, Jebara T, Pentland A. 2000. Bayesian face recognition. Pattern Recognit. 33:1771–82

Mooney CM. 1957. Age in the development of closure ability in children. Can. J. Psychol. 11:219–26

Moscovitch M, Winocur G, Behrmann M. 1997. What is special about face recognition? Nineteen experiments on a person with visual object agnosia and dyslexia but normal face recognition. J. Cogn. Neurosci. 9:555–604

Mountcastle VB. 1997. The columnar organization of the neocortex. Brain 120(Pt. 4):701–22

Oram MW, Perrett DI. 1992. Time course of neural responses discriminating different views of the face and head. J. Neurophysiol. 68:70–84

O’Toole A, Abdi H, Deffenbacher K, Valentin D. 1993. Low dimensional representation of faces in high dimensions of the space. J. Opt. Soc. Am. A 10:405–10

Pasupathy A, Connor CE. 2001. Shape representation in area V4: position-specific tuning for boundary conformation. J. Neurophysiol. 86:2505–19

Perrett DI, Harries MH, Bevan R, Thomas S, Benson PJ, et al. 1989. Frameworks of analysis for the neural representation of animate objects and actions. J. Exp. Biol. 146:87–113

Perrett DI, Oram MW, Harries MH, Bevan R, Hietanen JK, et al. 1991. Viewer-centred and object-centred coding of heads in the macaque temporal cortex. Exp. Brain Res. 86:159–73

Perrett DI, Rolls ET, Caan W. 1982. Visual neurones responsive to faces in the monkey temporal cortex. Exp. Brain Res. 47:329–42

Perrett DI, Smith PA, Potter DD, Mistlin AJ, Head AS, et al. 1984. Neurones responsive to faces in the temporal cortex: studies of functional organization, sensitivity to identity and relation to perception. Hum. Neurobiol. 3:197–208

Perrett DI, Smith PA, Potter DD, Mistlin AJ, Head AS, et al. 1985. Visual cells in the temporal cortex sensitive to face view and gaze direction. Proc. R. Soc. London B Biol. Sci. 223:293–317

Pinsk MA, Desimone K, Moore T, Gross CG, Kastner S. 2005. Representations of faces and body parts in macaque temporal cortex: a functional MRI study. Proc. Natl. Acad. Sci. USA 102:6996–7001

Pourtois G, Schwartz S, Seghier ML, Lazeyras F, Vuilleumier P. 2005. View-independent coding of face identity in frontal and temporal cortices is modulated by familiarity: an event-related fMRI study. Neuroimage 24:1214–24

Puce A, Allison T, Asgari M, Gore JC, McCarthy G. 1996. Differential sensitivity of human visual cortex to faces, letterstrings, and textures: a functional magnetic resonance imaging study. J. Neurosci. 16:5205–15

Quiroga RQ, Reddy L, Kreiman G, Koch C, Fried I. 2005. Invariant visual representation by single neurons in the human brain. Nature 435:1102–7

Rhodes G, Brennan S, Carey S. 1987. Identification and ratings of caricatures: implications for mental representations of faces. Cogn. Psychol. 19:473–97

Rhodes G, Jeffery L. 2006. Adaptive norm-based coding of facial identity. Vision Res. 46:2977–87

Rhodes G, Jeffery L, Watson TL, Jaquet E, Winkler C, Clifford CW. 2004. Orientation-contingent face aftereffects and implications for face-coding mechanisms. Curr. Biol. 14:2119–23

Rhodes G, McLean IG. 1990. Distinctiveness and expertise effects with homogeneous stimuli: towards a model of configural coding. Perception 19:773–94

Riesenhuber M, Poggio T. 2000. Models of object recognition. Nat. Neurosci. 3(Suppl.):1199–204

Rolls ET, Baylis GC. 1986. Size and contrast have only small effects on the responses to faces of neurons in the cortex of the superior temporal sulcus of the monkey. Exp. Brain Res. 65:38–48

Rotshtein P, Henson RN, Treves A, Driver J, Dolan RJ. 2005. Morphing Marilyn into Maggie dissociates physical and identity face representations in the brain. Nat. Neurosci. 8:107–13

Scalaidhe SP, Wilson FA, Goldman-Rakic PS. 1999. Face-selective neurons during passive viewing and working memory performance of rhesus monkeys: evidence for intrinsic specialization of neuronal coding. Cereb. Cortex 9:459–75

Sergent J, Ohta S, MacDonald B. 1992. Functional neuroanatomy of face and object processing. A positron emission tomography study. Brain 115:15–36

Shakhnarovich G, Moghaddam B. 2004. Face recognition in subspaces. In Handbook of Face Recognition, ed. SZ Li, AK Jain. Berlin: Springer-Verlag

Simion F, Valenza E, Umilta C, Dalla Barba B. 1998. Preferential orienting to faces in newborns: a temporal-nasal asymmetry. J. Exp. Psychol. Hum. Percept. Perform. 24:1399–405

Sinha P. 2002a. Qualitative representations for recognition. In Lecture Notes in Computer Science, pp. 249–62. Berlin: Springer-Verlag

Sinha P. 2002b. Recognizing complex patterns. Nat. Neurosci. 5(Suppl.):1093–97

Sinha P, Balas BJ, Ostrovsky Y, Russell R. 2006. Face recognition by humans: nineteen results all computer vision researchers should know about. Proc. IEEE 94(11):1948–62

Sinha P, Poggio T. 1996. I think I know that face. Nature 384:404

Somogyi P, Tamas G, Lujan R, Buhl EH. 1998. Salient features of synaptic organisation in the cerebral cortex. Brain Res. Brain Res. Rev. 26:113–35

Spiridon M, Kanwisher N. 2002. How distributed is visual category information in human occipito-temporal cortex? An fMRI study. Neuron 35:1157–65

Sugase Y, Yamane S, Ueno S, Kawano K. 1999. Global and fine information coded by single neurons in the temporal visual cortex. Nature 400:869–73

Suzuki S, Cavanagh P. 1998. A shape-contrast effect for briefly presented stimuli. J. Exp. Psychol. Hum. Percept. Perform. 24:1315–41

Tamura H, Tanaka K. 2001. Visual response properties of cells in the ventral and dorsal parts of the macaque inferotemporal cortex. Cereb. Cortex 11:384–99

Tanaka JW, Farah MJ. 1993. Parts and wholes in face recognition. Q. J. Exp. Psychol. A Hum. Exp. Psychol. 46A:225–45

Tanaka K, Saito H, Fukada Y, Moriya M. 1991. Coding visual images of objects in the inferotemporal cortex of the macaque monkey. J. Neurophysiol. 66:170–89

Tarr MJ, Gauthier I. 2000. FFA: a flexible fusiform area for subordinate-level visual processing automatized by expertise. Nat. Neurosci. 3:764–69

Thompson P. 1980. Margaret Thatcher: a new illusion. Perception 9:483–84

Tong F, Nakayama K, Moscovitch M, Weinrib O, Kanwisher N. 2000. Response properties of the human fusiform face area. Cogn. Neuropsychol. 17:257–79

Tovee MJ, Rolls ET, Azzopardi P. 1994. Translation invariance in the responses to faces of single neurons in the temporal visual cortical areas of the alert macaque. J. Neurophysiol. 72:1049–60

Tovee MJ, Rolls ET, Treves A, Bellis RP. 1993. Information encoding and the responses of single neurons in the primate temporal visual cortex. J. Neurophysiol. 70:640–54

Tsao DY, Freiwald WA, Knutsen TA, Mandeville JB, Tootell RB. 2003. Faces and objects in macaque cerebral cortex. Nat. Neurosci. 6:989–95

Tsao DY, Freiwald WA, Tootell RB, Livingstone MS. 2006. A cortical region consisting entirely of face-selective cells. Science 311:670–74

Turk M, Pentland A. 1991. Eigenfaces for recognition. J. Cogn. Neurosci. 3:71–86

Ullman S. 2007. Object recognition and segmentation by a fragment-based hierarchy. Trends Cogn. Sci. 11:58–64

Ullman S, Vidal-Naquet M, Sali E. 2002. Visual features of intermediate complexity and their use in classification. Nat. Neurosci. 5:682–87

Viola P, Jones M. 2004. Robust real-time face detection. Int. J. Comp. Vision 57:137–54

Wang G, Tanaka K, Tanifuji M. 1996. Optical imaging of functional organization in the monkey inferotemporal cortex. Science 272:1665–68

Wang Y, Fujita I, Murayama Y. 2000. Neuronal mechanisms of selectivity for object features revealed by blocking inhibition in inferotemporal cortex. Nat. Neurosci. 3:807–13

Webster MA, MacLin OH. 1999. Figural aftereffects in the perception of faces. Psychon. Bull. Rev. 6:647–53

Wolberg G. 1998. Image morphing: a survey. Vis. Comput. 14:360–72

Yin R. 1969. Looking at upside-down faces. J. Exp. Psychol. 81:141–45

Young AW, Hellawell D, Hay DC. 1987. Configurational information in face perception. Perception 16:747–59

Yue X, Tjan BS, Biederman I. 2006. What makes faces special? Vision Res. 46:3802–11

Zhao W, Chellappa R, Phillips PJ, Rosenfeld A. 2003. Face recognition: a literature survey. ACM Comput. Surv. 35:399–458
