
Neuropsychologia 51 (2013) 1787–1793


The neural representation of face space dimensions

Xiaoqing Gao ⁎, Hugh R. Wilson
Centre for Vision Research, York University, 4700 Keele Street, Toronto, Ontario, Canada M3J 1P3

Article info

Article history:
Received 1 April 2013
Received in revised form 17 June 2013
Accepted 1 July 2013
Available online 10 July 2013

Keywords:
Facial identity
Face space dimension
PCA
Multi-voxel pattern analysis
fMRI

0028-3932/$ - see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.neuropsychologia.2013.07.001

⁎ Corresponding author. Tel.: +1 416 736 2100x33325; fax: +1 416 736 5857. E-mail address: [email protected] (X. Gao).

Abstract

Functional neural imaging studies have identified a network of brain areas that are more active to faces than to other objects. However, it remains largely unclear how these areas encode individual facial identity. To investigate the neural representations of facial identity, we constructed a multidimensional face space structure whose dimensions were derived from geometric information of faces using Principal Component Analysis (PCA). Using fMRI, we recorded participants' neural responses when viewing blocks of faces that differed only on one dimension within a block. Although the response magnitudes to different blocks of faces did not differ in a univariate analysis, multi-voxel pattern analysis revealed distinct patterns related to different face space dimensions in brain areas that have a higher response magnitude to faces than to other objects. The results indicate that dimensions of the face space are encoded in the face-selective brain areas in a spatially distributed way.

© 2013 Elsevier Ltd. All rights reserved.

1. Introduction

Recognizing faces is among the most important basic skills for social interaction. Although a typical human adult can identify a person from his/her face in a fraction of a second, this seemingly simple ability surpasses any computer system in efficiency and robustness. Valentine (1991) suggested that underlying the ability to individuate faces is a system that encodes faces as points in a multidimensional space (the face space). This hypothesis has received support from numerous behavioral studies (e.g., Webster, Kaping, Mizokami, & Duhamel, 2004; Leopold, Rhodes, Müller, & Jeffery, 2005; Rhodes et al., 2011; Said & Todorov, 2011). However, little is known about how this multidimensional face space is represented in the human brain.

Neuroimaging studies have identified a network of brain areas that are involved in face perception. By comparing the magnitude of brain response to faces with the response to other categories of objects (e.g., houses), studies consistently report that the fusiform face area (FFA; Kanwisher, McDermott, & Chun, 1997; Sergent, Ohta, & MacDonald, 1992) and the occipital face area (OFA; Gauthier et al., 2000; see Pitcher, Walsh, & Duchaine, 2011 for a review) are more active to faces than to other objects. Although the heightened response to faces in these brain areas does not directly indicate the function of individuating faces, later studies have shed light on the role of these areas in encoding individual facial identity. Grill-Spector, Knouf, and Kanwisher (2004) reported a positive correlation between blood oxygen level-dependent (BOLD) response magnitudes in the FFA and behavioral performance in identifying faces. Using the fMRI-adaptation paradigm, which exploits the observation that the BOLD response is reduced after prolonged presentation of a stimulus, studies have shown that the FFA and the OFA are sensitive to changes of facial identity (Rotshtein, Henson, Treves, Driver, & Dolan, 2005; Grill-Spector et al., 1999; Loffler, Yourganov, Wilkinson, & Wilson, 2005). Another line of evidence comes from studies of prosopagnosia patients with focal lesions in the FFA (e.g., Barton, Press, Keenan, & O'Connor, 2002) or the OFA (e.g., Bouvier & Engel, 2006; Rossion et al., 2003; Steeves et al., 2006). Furthermore, temporarily disrupting the function of the OFA with transcranial magnetic stimulation (TMS) reduced people's accuracy in recognizing individual faces (Pitcher, Walsh, Yovel, & Duchaine, 2007). Collectively, these findings suggest important roles for the FFA and the OFA in individuating faces. Therefore, the FFA and the OFA are good candidates for the current investigation of the neural representations of the multidimensional face space.

A face space structure consists of two basic elements: the origin of the space and the dimensions. Valentine (1991) suggests that the origin of the face space represents the central tendency of all the faces encountered in one's life. Two types of encoding mechanisms have been proposed in relation to the origin of the face space. One hypothesis suggests that individual facial identities are encoded relative to the origin of the space (norm-based coding). The other hypothesis suggests that faces are encoded relative to existing exemplars (exemplar-based coding), without reference to the origin of the space. A recent study (Loffler et al., 2005) demonstrated that BOLD responses to faces in the FFA increase with increasing distance between the face and the origin of the face space (the average face), as would be predicted by the norm-based coding hypothesis but not by the exemplar-based coding hypothesis. The results indicate that the distance between an individual face and the origin of the face space is encoded as BOLD response amplitude in the FFA.


Valentine (1991) suggests that the dimensions of the face space are formed through experience with faces, but no specific mechanism was proposed. The neural representation of the face space dimensions in the human brain remains largely unclear. Recent neuroimaging studies investigating the neural representations of individual facial identities have found that individual facial identities are encoded as patterns of neural responses in distributed cortical areas. Kriegeskorte, Formisano, Sorger, and Goebel (2007) recorded neural responses to one female and one male face, both in three-quarter view. The two faces elicited different neural response patterns in the anterior inferotemporal cortex (aIT). To maximize the discriminability of the facial identities, Natu et al. (2010) included face/anti-face pairs in their experiments. They found that a pattern classifier could reliably discriminate the neural response patterns to different facial identities in the ventral temporal cortex, including the fusiform gyrus and the lateral occipital areas, despite changes of point of view of the faces. A recent study (Nestor, Plaut, & Behrmann, 2011) has confirmed the role of the FFA in individuating faces, as a searchlight analysis revealed that the fusiform area is one of the most informative areas in the neural response patterns for discriminating four facial identities with varying facial expressions. Since the face space dimensions are the basic elements encoding individual facial identity, one possibility is that the face space dimensions are also encoded as distributed patterns of neural activation in the face-selective cortical areas. Alternatively, it is possible that different face space dimensions are encoded at different loci in the brain, with the collective pattern of activation of these loci encoding individual facial identity. To test these two hypotheses, we took both a univariate approach and a multivariate approach to analyze the neural responses to changes of facial identities along different face space dimensions.

We defined the face space dimensions based on statistical regularities of a set of faces. Specifically, we ran Principal Component Analysis (PCA) on the geometric information of a set of male Caucasian faces. We used the average face as the origin and used the resulting Principal Components (PCs) as dimensions to set up a face space structure. PCs have proved effective in encoding images of faces for computer recognition (Sirovich & Kirby, 1987; Turk & Pentland, 1991) and in modeling human perception (Hancock, Burton, & Bruce, 1996; O'Toole, Deffenbacher, Abdi, & Bartlett, 1991). The dimensions represented by the PCs are orthogonal. They do not represent local facial features, such as the eyes or the nose; instead, they represent the global configuration of the faces, which has been demonstrated to be important in face recognition (e.g., Tanaka & Farah, 1993; Maurer, Le Grand, & Mondloch, 2002).
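As a concrete illustration of this construction, here is a minimal numpy sketch (not the authors' Matlab code; the random matrix merely stands in for the 41 × 37 matrix of face parameters):

```python
import numpy as np

def face_space_pca(faces):
    """PCA on face parameter vectors (one face per row).

    Returns the mean face (the origin of the face space), the principal
    components (orthonormal rows, i.e. the face space dimensions), and
    the fraction of variance each PC explains.
    """
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # SVD of the centered data: rows of Vt are the PCs, ordered by
    # decreasing explained variance.
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    var_explained = s ** 2 / np.sum(s ** 2)
    return mean_face, Vt, var_explained

# Stand-in data with the paper's shape: 41 faces x 37 parameters.
rng = np.random.default_rng(0)
faces = rng.normal(size=(41, 37))
origin, pcs, var = face_space_pca(faces)
```

Because the PCs come out of an SVD, their mutual orthogonality (the property the paper relies on) holds by construction.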

One important feature of the PCA approach is that the PCs explain different amounts of variation in the face set. Therefore, some PCs are more "prominent" than others, as they explain more of the variation in the face set. In the current study, besides investigating how the brain encodes the face space dimensions defined by PCA, we are also interested in comparing the brain response to PCs of different importance. We compared brain responses between two PCs, one with a high eigenvalue (PC1) and one with a low eigenvalue (PC16). In the current set of faces, PC1 explained eight times as much of the variance as PC16. By collecting both behavioral and functional neural imaging data, we are able to measure both perceptual sensitivity and neural sensitivity to changes of facial identities along two face space dimensions that differ statistically.

Fig. 1. A face space structure constructed based on PCA. The origin of the space (red) is the average of 41 Caucasian male faces. The two dimensions of the space are derived from PC1 (green) and PC16 (blue) of the 41 Caucasian male faces. In each direction of each dimension, three faces were created at a distance of 0.1, 0.17, or 0.24 from the average face, with distance defined as the Euclidean distance between two faces in the 37-dimensional face space as a fraction of the mean head radius of the original 41 faces. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

2. Material and methods

2.1. Participants

Participants were nine adults (4 females, mean age = 31 years, SD = 3.6 years). All participants except one male were right-handed. None of the participants reported any history of psychiatric or neurological disorders, or current use of any psychoactive medications. The data from one male participant were excluded from the final data analysis because this participant had unusually large ear canals, which caused artifacts in the BOLD signal in the ventral part of the temporal lobe. The study was approved by the York University Research Ethics Board. We obtained informed written consent from all the participants.

2.2. Stimuli

2.2.1. Synthetic faces

We used synthetic faces derived from digital photographs of 41 Caucasian males. A detailed description of the design of the synthetic faces has been reported in a previous study (Wilson, Loffler, & Wilkinson, 2002). Briefly, each synthetic face is defined by 37 parameters: 23 define the head shape and hairline, while the remaining 14 define the locations and sizes of the facial features. All 37 measures were normalized such that a unit change on each measure represents a percentage of the mean head radius of the 41 synthetic faces. The reconstructed synthetic faces were grayscale and were filtered with a bandpass difference-of-Gaussians (DOG) filter centered on 10 cycles per face with a bandwidth of two octaves, to keep the most important information for facial identity (Gao & Maurer, 2011; Gold, Bennett, & Sekuler, 1999; Näsänen, 1999). The synthetic faces capture the major geometric information of individual faces while leaving out fine details such as color and skin texture. They simplify the representation of real faces compared with pixel-based coding, yet still carry sufficient information about individual identity, as demonstrated by high accuracy in matching identities between synthetic faces and photographs of individual faces (Wilson et al., 2002).

2.2.2. Face space structure

We submitted the 41 synthetic faces to PCA. Unlike the original 37 parameters, which have a certain degree of correlation among them, the resulting 37 PCs are orthogonal to each other, making them good candidates for face space dimensions. We set up a multidimensional face space structure centered on the mean of the 41 synthetic faces, with the 37 PCs as the dimensions. Distance in this face space structure is defined as the Euclidean distance between two faces in the 37-dimensional face space as a fraction of the mean head radius of the faces.
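The distance definition can be written directly; this is an illustrative sketch, and `pc_unit` below is a hypothetical unit-length PC direction used only for the example:

```python
import numpy as np

def face_distance(face_a, face_b, mean_head_radius=1.0):
    """Distance between two faces as defined in the paper: Euclidean
    distance in the 37-dimensional parameter space, expressed as a
    fraction of the mean head radius of the face set."""
    diff = np.asarray(face_a, dtype=float) - np.asarray(face_b, dtype=float)
    return float(np.linalg.norm(diff) / mean_head_radius)

# A stimulus face placed 0.1 units along a unit-length PC direction sits
# at distance 0.1 from the origin (the average face).
origin = np.zeros(37)
pc_unit = np.zeros(37)
pc_unit[0] = 1.0          # hypothetical unit PC direction
stim = origin + 0.1 * pc_unit
```

With parameters already normalized to the mean head radius (as in the paper), the radius term is effectively 1 and the distance reduces to the plain Euclidean norm.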

2.2.3. Experimental stimuli

We created synthetic faces along two dimensions (PC1 and PC16). PC1 explained 13.2% of the total variance in the original 41 faces, while PC16 explained only 1.7%. In each direction (+ or −) of each PC dimension, the synthetic faces had distances of 0.1, 0.17, and 0.24 from the average face. We chose these three distances because a previous study (Wilson et al., 2002) showed that the discrimination threshold at 75% accuracy for the synthetic faces was at a distance of 0.06. Therefore, in the current stimuli, the faces closest to the average face (a distance of 0.1) can still be discriminated from it, while the two most similar faces (a distance of 0.07 apart) can be discriminated from each other. There were 12 faces in total (Fig. 1): 0.1PC1+, 0.17PC1+, 0.24PC1+, 0.1PC1−, 0.17PC1−, 0.24PC1−, 0.1PC16+, 0.17PC16+, 0.24PC16+, 0.1PC16−, 0.17PC16−, and 0.24PC16−. The face stimuli were generated in Matlab with custom-written code and presented using the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997). They were back-projected onto a projection screen by an MRI-compatible LCD projector and viewed by the participant through a mirror placed within the RF head coil at a viewing distance of 43 cm. The face stimuli subtended on average 10.9 × 8.1 degrees of visual angle at this viewing distance. The position of each stimulus was randomly jittered by one degree of visual angle (44 pixels) away from the center of the screen to prevent apparent motion between trials.

2.3. Procedures

2.3.1. Face discrimination task

We used a block design for the face discrimination task. There were four types of block conditions: PC1+, PC1−, PC16+, and PC16−. Within each block, faces were from the same direction of the same dimension and differed only in their distance from the average face. Participants completed six face discrimination runs in the scanner. Each run contained 12 blocks, consisting of three repetitions of each of the four block types. The blocks were presented in a pseudo-random order such that adjacent blocks were always different. Within each block, there were 10 presentations of faces, with a 20% probability that a face would be identical to the previous one. Participants performed a one-back task in which they pressed a predefined key when the current facial identity matched the previous one. On average, participants achieved an accuracy of 0.83 (SD = 0.03) on the one-back task. Within a block, each presentation began with a fixation cross lasting 300 ms, followed by a face presented for 1200 ms. Each block was 15 s long. The face blocks were separated by resting periods of 15 s, during which participants were instructed to keep looking at a fixation cross at the center of the screen. Each run started and ended with a 15 s resting block, making the total length of each run 375 s. Following the face discrimination runs, participants completed one control run with the same task structure as the face discrimination runs, except that the face images were Fourier phase-scrambled. These phase-scrambled images have the same Fourier power spectrum as the face images but lack their spatial structure.
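The block randomization and one-back sequence just described can be sketched as follows (an illustrative Python sketch, not the authors' Matlab presentation code):

```python
import random

CONDITIONS = ["PC1+", "PC1-", "PC16+", "PC16-"]

def block_order(n_reps=3, seed=1):
    """Pseudo-random order of condition blocks (n_reps of each) with the
    paper's constraint that adjacent blocks always differ."""
    rng = random.Random(seed)
    while True:
        order = CONDITIONS * n_reps
        rng.shuffle(order)
        if all(a != b for a, b in zip(order, order[1:])):
            return order

def one_back_sequence(faces, n_trials=10, p_repeat=0.2, seed=2):
    """Face presentations for one block: each face has a 20% chance of
    being identical to the previous one (the one-back target)."""
    rng = random.Random(seed)
    seq = [rng.choice(faces)]
    for _ in range(n_trials - 1):
        if rng.random() < p_repeat:
            seq.append(seq[-1])                              # one-back repeat
        else:
            seq.append(rng.choice([f for f in faces if f != seq[-1]]))
    return seq
```

Rejection sampling (reshuffle until no adjacent repeats) is a simple way to satisfy the constraint; with 3 repetitions of 4 conditions, valid orders are common enough that it terminates quickly.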

2.3.2. Functional localizer task

The functional localizer task had three types of block conditions: faces, houses, and Fourier phase-scrambled images. Participants completed two face localizer runs with a one-back task. Each run contained 12 blocks, consisting of four repetitions of each of the three block types. Each block had 10 presentations with a total length of 15 s. As in the face discrimination task, blocks were separated by 15 s resting periods, and each run started and ended with a 15 s resting period. The total length of each face localizer run was 375 s.

2.3.3. Retinotopic mapping

We ran a standard retinotopic mapping procedure (Warnking, 2002) with rotating wedges (one clockwise run and one counterclockwise run) and expanding/contracting rings.

2.3.4. Perceptual sensitivity

To measure perceptual sensitivity to changes of facial identities along PC1 and PC16, we tested a different group of ten adults (5 males, mean age = 30.6 years, SD = 7.5 years) with a delayed match-to-sample task using the method of constant stimuli. On each trial, following a 500 ms fixation cross, a target face was displayed for 500 ms. The target face was then replaced by a mask of white Gaussian noise, which lasted for 300 ms. After the noise mask, two faces were displayed side by side until the participant pressed a key to answer. The participant was instructed to press a predefined key to indicate which of the two faces was exactly the same as the target face. The target face differed from the distracter face on only one PC dimension, with the amount of difference varied at four levels (0.03, 0.06, 0.09, and 0.12). For each direction of each PC dimension, there were 15 repetitions at each difference level, giving 240 trials in total (2 PC dimensions (PC1 vs. PC16) × 2 directions (±) × 4 levels of difference × 15 repetitions). We calculated the proportion of correct responses at each level of difference on each PC dimension, averaged across the two directions of that dimension, and then fit Weibull functions to estimate the threshold level of difference at 75% accuracy.

Table 1
Number of voxels and decoding accuracy (in parentheses) for each ROI.

Participant  lFFA         lOFA         rFFA         rOFA         lV1          rV1
M1           9 (0.389)    N.A.         41 (0.347)   364 (0.444)  364 (0.708)  438 (0.875)
F1           35 (0.306)   58 (0.375)   33 (0.403)   166 (0.306)  451 (0.625)  229 (0.667)
F2           110 (0.181)  536 (0.417)  522 (0.292)  97 (0.486)   295 (0.583)  235 (0.5)
M2           18 (0.347)   60 (0.333)   19 (0.403)   25 (0.347)   190 (0.611)  215 (0.583)
M3           134 (0.208)  106 (0.222)  197 (0.278)  68 (0.292)   216 (0.514)  290 (0.556)
F3           353 (0.417)  165 (0.611)  586 (0.389)  191 (0.514)  276 (0.944)  250 (0.944)
M4           120 (0.306)  166 (0.417)  117 (0.306)  337 (0.347)  413 (0.417)  364 (0.583)
F4           10 (0.389)   N.A.         20 (0.25)    N.A.         180 (0.458)  244 (0.361)

Note: Decoding accuracies were calculated in classifying the four block conditions (PC1+, PC1−, PC16+, and PC16−). N.A.: no voxel survived an FDR-corrected t-test with q < 0.05.
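The Weibull threshold estimation can be sketched as follows, assuming a standard two-alternative Weibull with a 50% guess rate and a crude grid-search fit in place of the maximum-likelihood fitter the authors may have used:

```python
import numpy as np

def weibull_2afc(x, alpha, beta):
    """Weibull psychometric function for a 2AFC task (50% guess rate)."""
    return 1.0 - 0.5 * np.exp(-(np.asarray(x, dtype=float) / alpha) ** beta)

def fit_threshold(levels, p_correct, criterion=0.75):
    """Grid-search least-squares Weibull fit; returns the difference
    level at the criterion accuracy (75% in the paper)."""
    p_correct = np.asarray(p_correct, dtype=float)
    best, best_err = (0.1, 2.0), np.inf
    for a in np.linspace(0.01, 0.3, 200):
        for b in np.linspace(0.5, 6.0, 100):
            err = np.sum((weibull_2afc(levels, a, b) - p_correct) ** 2)
            if err < best_err:
                best, best_err = (a, b), err
    a, b = best
    # Invert the Weibull at the criterion accuracy:
    # c = 1 - 0.5*exp(-(x/a)^b)  =>  x = a * (ln(0.5/(1-c)))^(1/b)
    return a * np.log(0.5 / (1.0 - criterion)) ** (1.0 / b)
```

As a sanity check, data generated from a known Weibull (alpha = 0.08, beta = 2) should yield a 75% threshold near 0.08 · (ln 2)^(1/2) ≈ 0.067.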

2.4. MR acquisition

We collected all the data in a 3 T SIEMENS MAGNETOM TrioTim scanner (software version B17, Siemens, Erlangen, Germany) using a 12-channel phased-array head coil.

2.4.1. Anatomy

Anatomical data were collected using a high-resolution T1-weighted magnetization-prepared gradient-echo sequence (MP-RAGE) with the following parameters: 192 sagittal slices, TR = 1900 ms, TE = 2.5 ms, slice thickness = 1 mm, FA = 9°, FoV = 256 × 256 mm², matrix size = 256 × 256.

2.4.2. Functional

BOLD data were collected using an echo planar imaging (EPI) sequence with the following parameters: 21 oblique slices covering the ventral part of the temporal lobe, TR = 1500 ms, TE = 30 ms, FA = 90°, slice thickness = 3 mm, in-plane resolution = 3 × 3 mm², zero gap, FoV = 210 × 210 mm², matrix size = 64 × 64, interleaved acquisition.

2.5. Preprocessing

We ran the following data preprocessing procedures using the FMRIB Software Library (FSL, version 4.1.6): (1) 3D motion correction using rigid-body translation and rotation via an intra-modal volume linear registration (FSL command: mcflirt); (2) slice timing correction for interleaved acquisition using sinc interpolation (FSL command: slicetimer); and (3) skull stripping (FSL command: bet). For the face discrimination runs, no further preprocessing was performed. For the functional localizer runs and the retinotopic runs, we spatially smoothed the data with a Gaussian kernel with FWHM = 6 mm (FSL command: fslmaths). For each run, we discarded the first 5 volumes (7.5 s) to avoid the effects of magnetic saturation.
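One practical detail worth noting when reproducing the smoothing step: FSL's `fslmaths -s` option takes a Gaussian sigma in mm rather than an FWHM, so the 6 mm FWHM quoted here must be converted first (FWHM = 2·sqrt(2·ln 2)·sigma):

```python
import math

def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian FWHM (6 mm in the paper) to the sigma that
    FSL's `fslmaths -s` expects: FWHM = 2 * sqrt(2 * ln 2) * sigma."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

# 6 mm FWHM corresponds to a sigma of about 2.548 mm.
```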

We created a three-dimensional surface model of each individual brain from the T1-weighted high-resolution structural image using Freesurfer (Dale, Fischl, & Sereno, 1999; Fischl, Sereno, & Dale, 1999). To map the functional data to the cortical surface, we first aligned the functional images with the high-resolution structural image using a 12-degree-of-freedom affine transformation (FSL command: flirt); we then used the two-surface method in the SUMA software package to map the functional data to the cortical surface (Argall, Saad, & Beauchamp, 2006). The two-surface method maps the absolute maximum value along the segment connecting the white matter surface and the pial surface to the surface node; thereby, the activation along the entire gray matter thickness is mapped.

2.6. Analysis

2.6.1. Functional localizer

We estimated the voxel-wise BOLD response amplitudes to each of the three categories (faces, houses, and scrambled images) by fitting the data with a general linear model (GLM) convolved with a hemodynamic response function (canonical difference-of-gammas HRF, SPM8) with temporal derivatives. We tested the face > house contrast to obtain the t statistical map for each individual. We identified voxels that had significantly higher response amplitudes for faces than for houses in lateral occipital and ventral temporal areas, with t thresholds


corresponding to a false discovery rate (FDR) of q < 0.05. Table 1 summarizes the number of voxels in the face-selective areas for each individual. We used the face-selective areas identified here as region-of-interest (ROI) masks for the subsequent analyses. We also estimated the BOLD response amplitude (% signal change) to changes of facial identities along different face space dimensions, averaged across the voxels in each ROI.

Fig. 2. BOLD responses to changes of facial identity in each direction (±) of PC1 and PC16 in the face-selective ROIs.
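The FDR thresholding at q < 0.05 can be sketched with the standard Benjamini-Hochberg procedure; the paper specifies the q level but not the exact implementation, so treating it as Benjamini-Hochberg is an assumption:

```python
import numpy as np

def fdr_bh(p_values, q=0.05):
    """Benjamini-Hochberg FDR: boolean mask of tests (voxels) whose
    p-values survive at rate q. Find the largest k with
    p_(k) <= (k/m) * q; everything up to rank k survives."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= (np.arange(1, m + 1) / m) * q
    mask = np.zeros(m, dtype=bool)
    if below.any():
        k = int(np.max(np.nonzero(below)[0]))
        mask[order[: k + 1]] = True
    return mask
```

Applied to a voxel-wise p-value map from the face > house contrast, the mask marks the voxels entering the ROI.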

2.6.2. Retinotopic maps

We estimated the voxel-wise amplitude and phase of the BOLD signal at the stimulation frequency of the retinotopic mapping procedure by Fourier transformation. The hemodynamic response delay was adjusted by combining the estimated phase maps from stimuli moving in opposite directions (Warnking, 2002). The final phase map was thresholded to keep the voxels representing the top 5% of the amplitude estimates at the stimulation frequency. Individual phase maps were mapped to their corresponding cortical surfaces. Boundaries between the retinotopic areas were then manually labeled following standard procedures (Warnking, 2002).

2.6.3. BOLD response patterns

We estimated the voxel-wise BOLD response amplitudes for each stimulus block by fitting a GLM convolved with a hemodynamic response function with temporal derivatives to the preprocessed time series. Unlike in the localizer runs, where we estimated the BOLD response amplitude for each category, here we estimated the BOLD response amplitude for each experimental block. There were 72 beta coefficients in total for each voxel, representing the estimated BOLD response amplitudes for 18 blocks (3 repetitions × 6 runs) of each of the four block conditions (PC1+, PC1−, PC16+, and PC16−). The resulting beta coefficient maps were used as patterns for the subsequent multi-voxel pattern analysis.
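A sketch of how such block regressors are built: a boxcar per block convolved with a difference-of-gammas HRF. The gamma parameters below follow common SPM-style defaults (6 s peak, 16 s undershoot, 1/6 undershoot ratio), which are an assumption rather than the paper's exact HRF:

```python
import math
import numpy as np

def double_gamma_hrf(tr, duration=32.0):
    """Difference-of-gammas HRF sampled at the TR (assumed SPM-style
    shape: 6 s peak minus a 16 s undershoot scaled by 1/6)."""
    t = np.arange(0.0, duration, tr)
    def gpdf(t, shape, scale=1.0):
        return (t ** (shape - 1) * np.exp(-t / scale)
                / (scale ** shape * math.gamma(shape)))
    h = gpdf(t, 6.0) - gpdf(t, 16.0) / 6.0
    return h / h.max()

def block_regressor(n_vols, tr, onsets_s, dur_s):
    """Boxcar for one stimulus block convolved with the HRF, one value
    per volume; columns like this (one per block) form the GLM design
    whose fitted betas are the pattern-analysis features."""
    box = np.zeros(n_vols)
    for on in onsets_s:
        box[int(on / tr):int((on + dur_s) / tr)] = 1.0
    return np.convolve(box, double_gamma_hrf(tr))[:n_vols]

# One 15 s block starting at 15 s, in a 375 s run at the paper's TR.
reg = block_regressor(n_vols=250, tr=1.5, onsets_s=[15.0], dur_s=15.0)
```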

2.6.4. Multi-voxel pattern analysis (MVPA)

For each of the four face-selective ROIs, we conducted MVPA based on the beta coefficient maps of the selected voxels using the Matlab-based Princeton MVPA toolbox with a linear support vector machine (SVM) classifier with c = 1 (Chang & Lin, 2011). Before submitting the data to MVPA, we normalized the data to have zero mean and a standard deviation of one. Decoding accuracy was calculated through an 18-fold leave-one-out cross-validation. We separated the beta maps by their corresponding block conditions (PC1+, PC1−, PC16+, and PC16−) and sorted them according to the temporal order of the blocks. We then grouped the beta maps according to their temporal ranks, so that beta maps from different block conditions with the same temporal rank were in the same group. In total, we had 18 groups; within each group, there was one beta map for each of the four block conditions. For each cross-validation iteration, we trained the SVM on the data from 17 groups and tested it on the remaining group. The final accuracy was calculated by averaging the prediction accuracy across the 18 cross-validation iterations (Table 1). We also estimated prediction accuracy on the control run, in which participants saw only phase-scrambled face images. For the control run, we trained the SVM classifier on the 18 groups of beta maps from the face discrimination blocks and tested the classifier on the beta maps of the control run. The mean prediction accuracy for the control run in each of the four ROIs did not differ significantly from chance (0.25), ps > 0.05. The null results of the control run suggest that the Fourier power spectrum of the face images does not provide information for decoding the changes of facial identities along different PC dimensions. However, differences in low-level image features such as local contrast and edges may still provide information for decoding. To test this possibility, we ran an MVPA with voxels (Table 1) from the primary visual cortex (V1).
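The grouping-by-temporal-rank cross-validation can be sketched as follows. A nearest-centroid classifier stands in for the paper's linear SVM (c = 1) purely to keep the sketch dependency-free; the fold structure is the part being illustrated:

```python
import numpy as np

CONDS = ["PC1+", "PC1-", "PC16+", "PC16-"]

def leave_one_group_out_accuracy(betas, labels, groups):
    """18-fold cross-validation as in the paper: beta maps are grouped
    by temporal rank (one map per condition per group); each fold
    trains on 17 groups and tests on the held-out group."""
    accs = []
    for g in np.unique(groups):
        train, test = groups != g, groups == g
        # Class centroids from the training folds (SVM stand-in).
        cents = {c: betas[train & (labels == c)].mean(axis=0) for c in CONDS}
        for x, y in zip(betas[test], labels[test]):
            pred = min(cents, key=lambda c: np.linalg.norm(x - cents[c]))
            accs.append(pred == y)
    return float(np.mean(accs))

# Illustrative run on synthetic, well-separated voxel patterns.
rng = np.random.default_rng(0)
labels = np.array(CONDS * 18)
groups = np.repeat(np.arange(18), 4)
class_means = {c: 3.0 * rng.normal(size=20) for c in CONDS}
betas = np.array([class_means[l] + 0.5 * rng.normal(size=20) for l in labels])
acc = leave_one_group_out_accuracy(betas, labels, groups)
```

Holding out whole temporal-rank groups (rather than single maps) keeps one exemplar of every condition in each test fold, so chance level stays at 0.25 per fold.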

3. Results

3.1. Perceptual sensitivity

In the behavioral test, on average, a difference of 0.08 was needed to achieve an accuracy of 75% in detecting changes of facial identity on PC1, while only a difference of 0.06 was needed on PC16 to achieve the same accuracy. This result indicates that observers are more sensitive to physical changes on PC16 than on PC1 (p = 0.02, two-tailed t-test).

3.2. BOLD response amplitude

As shown in Fig. 2, we plotted the percentage of BOLD signal change, averaged across voxels in each ROI, in response to changes of facial identity in each direction (±) of PC1 and PC16. For each ROI, we ran a repeated-measures ANOVA on the BOLD response amplitude with dimension (PC1 vs. PC16) and direction (+ vs. −) as repeated measures. For all four ROIs, none of the main effects or interactions was significant (all ps n.s.). The results suggest that changes of facial identity along different face space dimensions do not modulate BOLD response amplitude in face-selective areas. To test whether changes of facial identity along different face space dimensions modulate BOLD response magnitudes in areas beyond the face-selective ROIs, we ran a whole-brain univariate general linear model analysis with subject as a random effect in the model. No voxel survived an FDR-corrected threshold of q = 0.05. This result suggests that BOLD response magnitudes at the single-voxel level do not provide reliable information to differentiate changes of facial identity along different face space dimensions in any brain area.

3.3. Multi-voxel pattern analysis

Using MVPA, we found that the patterns of activation in all four ROIs provided information for decoding the block conditions (PC1+, PC1−, PC16+, and PC16−). The mean decoding accuracies for voxels in all four ROIs were significantly higher than chance level (0.25), as indicated by one-tailed t-tests (means = 0.32, 0.40, 0.33, 0.39; p = 0.02, 0.01, 0.003, 0.003, uncorrected, for lFFA, lOFA, rFFA, rOFA, respectively; Fig. 3). The decoding accuracy of lFFA became marginally significant (p = 0.08) after Bonferroni correction, while the others remained significant at the p < 0.05 level. To test whether the patterns of activation in different ROIs provide redundant or complementary information for encoding the face space dimensions, we calculated the MVPA decoding accuracy with an ROI mask that contained the voxels of all four of the original face-selective ROIs. The decoding accuracy of the combined ROI was significantly higher than chance (0.41, p = 0.003, one-tailed t-test). However, the decoding accuracy of the combined ROI was not significantly different from the decoding accuracy of any of the four face-selective ROIs (ps > 0.05, corrected for multiple comparisons).
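The group-level significance test used above can be illustrated as follows: one-tailed one-sample t-tests of per-subject decoding accuracies against chance (0.25), Bonferroni-corrected over the four ROIs. The per-subject accuracy values below are invented for illustration; they are not the study's data.

```python
import numpy as np
from scipy import stats

chance = 0.25
n_rois = 4
# Hypothetical per-subject decoding accuracies for each ROI.
roi_accuracies = {
    "lFFA": [0.28, 0.35, 0.30, 0.33, 0.31, 0.35],
    "lOFA": [0.42, 0.38, 0.41, 0.37, 0.44, 0.39],
    "rFFA": [0.31, 0.34, 0.33, 0.32, 0.35, 0.34],
    "rOFA": [0.40, 0.37, 0.41, 0.36, 0.42, 0.38],
}

results = {}
for roi, accs in roi_accuracies.items():
    t, p_two = stats.ttest_1samp(accs, chance)
    # Convert the two-tailed p to a one-tailed test of "greater than chance".
    p_one = p_two / 2 if t > 0 else 1 - p_two / 2
    # Bonferroni correction across the four ROIs, capped at 1.
    results[roi] = min(p_one * n_rois, 1.0)

for roi, p in results.items():
    print(f"{roi}: corrected one-tailed p = {p:.4f}")
```

With the made-up numbers above, the ROIs with higher mean accuracies survive correction while borderline ones may not, mirroring the pattern the paper reports for lFFA.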

Fig. 3. MVPA decoding accuracy in the face-selective ROIs (lFFA, lOFA, rFFA, rOFA, and combined). The dashed line represents chance level (0.25). *p < 0.05; **p < 0.01 (one-tailed t-tests against chance, corrected for multiple comparisons).

Fig. 4. MVPA decoding accuracy for PC1 and PC16 in lFFA, lOFA, rFFA, and rOFA. The dashed line represents chance level (0.5). *p < 0.05 (two-tailed t-tests between decoding accuracy for PC1 and PC16).

X. Gao, H.R. Wilson / Neuropsychologia 51 (2013) 1787–1793

To test whether low-level image features such as local contrast and edges provide information for discriminating the changes of facial identity along different PC dimensions, we ran an MVPA analysis with voxels from V1. Interestingly, BOLD response patterns in V1 could also be decoded at above-chance accuracies (left V1, mean accuracy = 0.61; right V1, mean accuracy = 0.63; ps < 0.01, one-tailed t-tests against chance). We also analyzed the pixel-level differences among face images representing changes on different PC dimensions. For each stimulus block, we calculated an average image of the ten stimulus presentations, with the location of each image randomly jittered by one degree of visual angle (44 pixels). We applied an SVM classifier to the resulting average images representing pixel-level changes in each direction of each PC dimension. The classifier achieved an above-chance average accuracy of 28.1% across ten simulated runs (SD = 2.25%, p < 0.01, one-tailed t-test against chance). However, if we aligned the face images in each block by removing the location jitter, the decoding accuracy of the classifier rose to 100%.
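The pixel-level control analysis can be sketched along these lines. Everything here is a synthetic stand-in: small random "templates" play the role of the two stimulus conditions, and the image size, jitter range, and block counts are illustrative assumptions rather than the study's parameters (which used 44-pixel jitter on full-size stimuli).

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
size, jitter, n_blocks, n_per_block = 32, 4, 12, 10

# Two fixed pixel patterns standing in for stimuli from two PC directions.
templates = [rng.normal(size=(size, size)) for _ in range(2)]

X, y = [], []
for b in range(n_blocks):
    cond = b % 2
    # Average ten presentations, each randomly shifted (location jitter),
    # as in the paper's block-average images.
    avg = np.zeros((size, size))
    for _ in range(n_per_block):
        dy, dx = rng.integers(-jitter, jitter + 1, size=2)
        avg += np.roll(templates[cond], (dy, dx), axis=(0, 1))
    X.append((avg / n_per_block).ravel())
    y.append(cond)

X, y = np.array(X), np.array(y)

# Train a linear SVM on all but the last two block-average images,
# then test on the held-out pair.
clf = SVC(kernel="linear", C=1.0).fit(X[:-2], y[:-2])
acc = clf.score(X[-2:], y[-2:])
print("held-out accuracy:", acc)
```

Reducing `jitter` toward zero in this sketch makes the averaged images converge on the templates themselves, which is the aligned case where the paper's classifier reached 100%.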

We also ran MVPA for PC1 and PC16 separately to calculate the accuracy in classifying changes of facial identity in the two directions (±) within each dimension. The decoding accuracies for both PC1 and PC16 in all four ROIs were above chance (ps < 0.01, corrected for multiple comparisons; Fig. 4). The decoding accuracy between the two directions of PC16 was significantly higher than the decoding accuracy between the two directions of PC1 in lOFA (p < 0.05, corrected for multiple comparisons). There was no significant difference between the decoding accuracies for PC1 and PC16 in any other ROI.

4. Discussion

The face space hypothesis suggests that individual facial identities are encoded as distances (from the origin) and directions in a multidimensional space. A previous study (Loffler et al., 2005) demonstrated that the distance between an individual face and the origin of the face space (the average face) is encoded as BOLD response amplitude in the FFA. Here we show that directions in the face space, as represented by the face space dimensions, are encoded as patterns of neural activity in the face-selective areas, including the FFA and OFA. The current findings confirm the role of the FFA and OFA in encoding individual facial identities. The fact that multi-voxel response patterns, but not single-voxel response magnitudes, differed for different face space dimensions indicates that information regarding face space dimensions is encoded in a spatially distributed way.

The current finding of distributed neural representations of face space dimensions is consistent with a recent study showing that cells in the middle face patch in macaque monkeys are selective to geometric changes of schematic faces along different feature dimensions (Freiwald, Tsao, & Livingstone, 2009). Although the nature of the multi-voxel patterns measured by fMRI is still not well understood (Op de Beeck, 2010; Kriegeskorte, Cusack, & Bandettini, 2010), one possible explanation is that the current findings reflect an uneven distribution of cells that are tuned to different face space dimensions. As a result, voxels sampled by fMRI show biases toward different face space dimensions.

Previous studies have shown that individual facial identities are encoded as patterns of neural activation in distributed brain areas (Kriegeskorte et al., 2007; Natu et al., 2010; Nestor et al., 2011). The current study had two advantages over these studies. In the previous studies, facial identities were kept constant despite changes on other dimensions (e.g., facial expression (Nestor, Plaut, & Behrmann, 2011), face point of view (Natu et al., 2010)). Therefore, the neural activities represent encoding of fixed facial identities. In the current study, the facial identities always varied along a certain face space dimension within a block, with the amount of variation as large as the difference between two facial identities (the maximum variation was at the 34th percentile of the pairwise differences among the original 41 facial identities). Therefore, the neural activities did not represent fixed facial identities as points in the face space structure. Instead, what was measured in the current study were the neural responses to changes of facial identity along face space dimensions. This methodology allowed us to investigate the neural representation of the dimensions forming the face space structure rather than the neural representation of individual facial identities.

Another advantage is that defining the face space dimensions based on PCA allowed us to link image statistical properties to perceptual and neural sensitivities. We studied changes of facial identity along two dimensions, with PC1 explaining eight times as much of the variance in the face set as PC16 did. We found that observers were more sensitive to physical changes on PC16 than on PC1. This result is not surprising if we scale the discrimination thresholds relative to the variance on PC1 and PC16. The standard deviation of the original 41 faces was 0.081 on PC1 and 0.022 on PC16. Therefore, the relative thresholds, when scaled to the corresponding standard deviation on each PC, would be 0.97 for PC1 and 2.72 for PC16. Observers can detect changes of less than 1 standard deviation on PC1, while they need a difference of more than 2 standard deviations to detect changes on PC16. We found that the neural sensitivity to PC1 and PC16 was consistent with the perceptual sensitivity. In left OFA, the decoding accuracy was significantly higher for changes on PC16 than on PC1 when the same amount of physical difference was available. However, we did not see such a difference between PC1 and PC16 in the other three face-selective ROIs. The current findings suggest that for the same amount of physical difference, there is higher perceptual and neural sensitivity to changes on PC dimensions with smaller eigenvalues. However, this conclusion is limited by the fact that only one PC dimension with a high eigenvalue and one with a low eigenvalue were used. We cannot rule out the possibility that the relation between perceptual and neural sensitivity to PC dimensions and the eigenvalues of those dimensions may not be monotonic.
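The threshold-scaling arithmetic above is simply each discrimination threshold divided by the standard deviation of the 41 faces on the corresponding PC. Using the rounded thresholds quoted in the Results (0.08 and 0.06), the computation reproduces the reported relative thresholds up to rounding (0.99 vs. the reported 0.97 for PC1, presumably because the paper used unrounded thresholds):

```python
# Discrimination thresholds (rounded values from the Results) and the
# standard deviations of the 41 faces on each PC, as reported above.
pc1_threshold, pc16_threshold = 0.08, 0.06
pc1_sd, pc16_sd = 0.081, 0.022

rel_pc1 = pc1_threshold / pc1_sd    # threshold in units of PC1 SD
rel_pc16 = pc16_threshold / pc16_sd  # threshold in units of PC16 SD

print(f"relative threshold on PC1:  {rel_pc1:.2f} SD")
print(f"relative threshold on PC16: {rel_pc16:.2f} SD")
```

The comparison makes the point concrete: under 1 SD suffices on PC1, while nearly 3 SD are needed on PC16, even though the raw physical threshold on PC16 is smaller.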

The fact that pooling all the voxels in the four ROIs did not increase the MVPA decoding accuracy suggests that similar information regarding the face space dimensions is present in all four ROIs as distributed patterns. Different ROIs may carry redundant information about changes on different face space dimensions. It is possible that these face-selective areas carry the same information but use it for different stages of processing. Because pooling the voxels in all four ROIs increased the dimensionality of the observations, it might require more observations than we currently have to train the classifier to achieve the same level of decoding accuracy. On the other hand, the result might simply reflect a ceiling effect of the sensitivity of the methodology used in the current study, which could be limited by the resolution (both spatial and temporal) of fMRI.

The fact that information can be extracted from early visual areas for accurate decoding of changes of facial identity along different PC dimensions suggests that low-level features of the stimuli may provide information for discriminating among these changes. The pixel-level image analysis provided evidence for the existence of such low-level features, although decoding accuracy based on these low-level features was low when the locations of the images were jittered. However, since the participants were allowed to view the images freely, the effect of location jittering may have been compromised by eye movements. If the images were well aligned, a classifier was able to decode the face images at 100% accuracy. Therefore, higher decoding accuracy based on low-level features could be achieved if the images were realigned through eye movements. To remove the potential confound of low-level features, future studies could vary the size or the point of view of the face images. On the other hand, it is possible that the face-specific information in V1 may reflect a feedback mechanism from higher-level visual areas. Such feedback connections from higher visual areas to early visual areas have been suggested in previous studies (Bar, 2003; Galuske, Schmidt, Goebel, Lomber, & Payne, 2002; Hupé et al., 1998; Rossion, Dricot, Goebel, & Busigny, 2011).

A previous study (Loffler et al., 2005) has shown that the BOLD response to faces in the FFA increases with increasing distance between the face and the average face. It would also be interesting to investigate whether distance in the face space affects BOLD response patterns. Although we had faces at three levels of distance from the average face, because of the nature of the block design we were not able to separate the patterns of responses to faces at different distances from the average face. Therefore, we were not able to investigate how changes in BOLD response patterns are related to changes in the distance from the average face. Future studies with an event-related design would be able to provide more information on how neural response patterns encode distance in the face space structure.

5. Conclusions

We defined face space dimensions based on statistical regularities in a group of Caucasian male faces. We found that changes of facial identity along different face space dimensions were represented as distributed neural response patterns in the FFA and the OFA. Within the current stimulus set, perceptual sensitivities to changes of facial identity were linked to statistical properties of the face space dimensions, such that human observers were more sensitive to physical changes on dimensions along which faces vary less. A parallel difference was present in neural sensitivity: higher neural sensitivity to changes of facial identity was observed on dimensions along which faces vary less. Collectively, the current findings provide evidence for the representation of face space dimensions in the face-selective cortical areas.

Acknowledgments

This research was supported in part by CIHR grant #172103 and a grant from the Canadian Institute for Advanced Research to HRW.

References

Argall, B. D., Saad, Z. S., & Beauchamp, M. S. (2006). Simplified intersubject averaging on the cortical surface using SUMA. Human Brain Mapping, 27, 14–27.

Bar, M. (2003). A cortical mechanism for triggering top-down facilitation in visual object recognition. Journal of Cognitive Neuroscience, 15, 600–609.

Barton, J. J. S., Press, D. Z., Keenan, J. P., & O'Connor, M. (2002). Lesions of the fusiform face area impair perception of facial configuration in prosopagnosia. Neurology, 58, 71–78.

Bouvier, S. E., & Engel, S. A. (2006). Behavioral deficits and cortical damage loci in cerebral achromatopsia. Cerebral Cortex, 16, 183–191.

Brainard, D. H. (1997). The psychophysics toolbox. Spatial Vision, 10, 433–439.

Chang, C. C., & Lin, C. J. (2011). LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2, 27.

Dale, A. M., Fischl, B., & Sereno, M. I. (1999). Cortical surface-based analysis. I. Segmentation and surface reconstruction. NeuroImage, 9, 179–194.

Fischl, B., Sereno, M. I., & Dale, A. M. (1999). Cortical surface-based analysis. II: Inflation, flattening, and a surface-based coordinate system. NeuroImage, 9, 195–207.

Freiwald, W. A., Tsao, D. Y., & Livingstone, M. S. (2009). A face feature space in the macaque temporal lobe. Nature Neuroscience, 12, 1187–1196.

Galuske, R. A. W., Schmidt, K. E., Goebel, R., Lomber, S. G., & Payne, B. R. (2002). The role of feedback in shaping neural representations in cat visual cortex. Proceedings of the National Academy of Sciences of the United States of America, 99, 17083–17088.

Gao, X., & Maurer, D. (2011). A comparison of spatial frequency tuning for the recognition of facial identity and facial expressions in adults and children. Vision Research, 51, 508–519.

Gauthier, I., Tarr, M. J., Moylan, J., Skudlarski, P., Gore, J. C., & Anderson, A. W. (2000). The fusiform "face area" is part of a network that processes faces at the individual level. Journal of Cognitive Neuroscience, 12, 495–504.

Gold, J., Bennett, P. J., & Sekuler, A. B. (1999). Identification of band-pass filtered letters and faces by human and ideal observers. Vision Research, 39, 3537–3560.

Grill-Spector, K., Kushnir, T., Edelman, S., Avidan, G., Itzchak, Y., & Malach, R. (1999). Differential processing of objects under various viewing conditions in the human lateral occipital complex. Neuron, 24, 187–203.

Grill-Spector, K., Knouf, N., & Kanwisher, N. (2004). The fusiform face area subserves face perception, not generic within-category identification. Nature Neuroscience, 7, 555–562.

Hancock, P. J. B., Burton, A. M., & Bruce, V. (1996). Face processing: Human perception and principal components analysis. Memory and Cognition, 24, 26–40.

Hupé, J. M., James, A. C., Payne, B. R., Lomber, S. G., Girard, P., & Bullier, J. (1998). Cortical feedback improves discrimination between figure and background by V1, V2 and V3 neurons. Nature, 394, 784–787.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. The Journal of Neuroscience, 17, 4302–4311.

Kriegeskorte, N., Cusack, R., & Bandettini, P. (2010). How does an fMRI voxel sample the neuronal activity pattern: Compact-kernel or complex spatiotemporal filter? NeuroImage, 49, 1965–1976.

Kriegeskorte, N., Formisano, E., Sorger, B., & Goebel, R. (2007). Individual faces elicit distinct response patterns in human anterior temporal cortex. Proceedings of the National Academy of Sciences of the United States of America, 104, 20600–20605.

Leopold, D. A., Rhodes, G., Müller, K. M., & Jeffery, L. (2005). The dynamics of visual adaptation to faces. Proceedings of The Royal Society B: Biological Sciences, 272, 897–904.

Loffler, G., Yourganov, G., Wilkinson, F., & Wilson, H. R. (2005). fMRI evidence for the neural representation of faces. Nature Neuroscience, 8, 1386–1390.

Maurer, D., Le Grand, R., & Mondloch, C. J. (2002). The many faces of configural processing. Trends in Cognitive Sciences, 6, 255–260.

Näsänen, R. (1999). Spatial frequency bandwidth used in the recognition of facial images. Vision Research, 39, 3824–3833.

Natu, V., Jiang, F., Narvekar, A., Keshvari, S., Blanz, V., & O'Toole, A. J. (2010). Dissociable neural patterns of facial identity across changes in viewpoint. Journal of Cognitive Neuroscience, 22, 1570–1582.

Nestor, A., Plaut, D. C., & Behrmann, M. (2011). Unraveling the distributed neural code of facial identity through spatiotemporal pattern analysis. Proceedings of the National Academy of Sciences of the United States of America, 108, 9998–10003.

Op de Beeck, H. P. (2010). Probing the mysterious underpinnings of multi-voxel fMRI analyses. NeuroImage, 50, 567–571.

O'Toole, A. J., Deffenbacher, K., Abdi, H., & Bartlett, J. C. (1991). Simulating the "other-race effect" as a problem in perceptual learning. Connection Science, 3, 163–178.

Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442.

Pitcher, D., Walsh, V., & Duchaine, B. (2011). The role of the occipital face area in the cortical face perception network. Experimental Brain Research, 209, 481–493.

Pitcher, D., Walsh, V., Yovel, G., & Duchaine, B. (2007). TMS evidence for the involvement of the right occipital face area in early face processing. Current Biology, 17, 1568–1573.

Rhodes, G., Jaquet, E., Jeffery, L., Evangelista, E., Keane, J., & Calder, A. J. (2011). Sex-specific norms code face identity. Journal of Vision, 11(1), 1–11.

Rossion, B., Caldara, R., Seghier, M., Schuller, A., Lazeyras, F., & Mayer, E. (2003). A network of occipito-temporal face-sensitive areas besides the right middle fusiform gyrus is necessary for normal face processing. Brain, 126, 2381–2395.

Rossion, B., Dricot, L., Goebel, R., & Busigny, T. (2011). Holistic face categorization in higher order visual areas of the normal and prosopagnosic brain: Toward a non-hierarchical view of face perception. Frontiers in Human Neuroscience, 4, 225.

Rotshtein, P., Henson, R. N. A., Treves, A., Driver, J., & Dolan, R. J. (2005). Morphing Marilyn into Maggie dissociates physical and identity face representations in the brain. Nature Neuroscience, 8, 107–113.

Said, C. P., & Todorov, A. (2011). A statistical model of facial attractiveness. Psychological Science, 22, 1183–1190.

Sergent, J., Ohta, S., & MacDonald, B. (1992). Functional neuroanatomy of face and object processing: A positron emission tomography study. Brain, 115, 15–36.

Sirovich, L., & Kirby, M. (1987). Low-dimensional procedure for the characterization of human faces. Journal of the Optical Society of America A, 4, 519–524.

Steeves, J., Culham, J. C., Duchaine, B., Cavina, C., Valyear, K., Humphry, S. I., et al. (2006). The fusiform face area is not sufficient for face recognition: Evidence from a patient with dense prosopagnosia and no occipital face area. Neuropsychologia, 44, 596–609.

Tanaka, J. W., & Farah, M. J. (1993). Parts and wholes in face recognition. The Quarterly Journal of Experimental Psychology Section A, 46, 225–245.

Turk, M., & Pentland, A. (1991). Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3, 71–86.

Valentine, T. (1991). A unified account of the effects of distinctiveness, inversion, and race in face recognition. The Quarterly Journal of Experimental Psychology, 43, 161–204.

Warnking, J. (2002). fMRI retinotopic mapping – step by step. NeuroImage, 17, 1665–1683.

Webster, M. A., Kaping, D., Mizokami, Y., & Duhamel, P. (2004). Adaptation to natural facial categories. Nature, 428, 557–561.

Wilson, H. R., Loffler, G., & Wilkinson, F. (2002). Synthetic faces, face cubes, and the geometry of face space. Vision Research, 42, 2909–2923.