Journal of Neuroscience Methods 166 (2007) 278–292

Reliability and validity of structural equation modeling applied to neuroimaging data: A simulation study

Aurélie Boucard, Alain Marchand, Xavier Noguès*

Centre de Neurosciences Intégratives et Cognitives, UMR 5228, CNRS, Université Bordeaux 1, Bâtiment B2, Avenue des Facultés, 33405 Talence, France

Received 15 February 2007; received in revised form 12 July 2007; accepted 19 July 2007

Abstract

Structural equation modeling aims at quantifying the strength of causal relationships within a set of interacting variables. Although the literature emphasizes that large sample sizes are required, this method is increasingly used with neuroimaging data from a limited number of subjects to study the relationships between cerebral structures. Here, we use a simulation approach to evaluate its ability to provide accurate information under the constraints of neuroimaging. Artificial samples representing the activity of a virtual set of structures were generated under both recursive and non-recursive connectivity models. Structural equation modeling was performed on these samples, and the quality of the analyses was evaluated by directly comparing the estimated path coefficients with the original ones. The validity and the reliability are shown to decrease with sample size, but the estimated models respect the relative strength of path coefficients in a large percentage of cases. The "smoothing method" appears to be the most appropriate to prevent improper solutions. Both the experimental error and the external structures influencing the network have a weak influence. Accordingly, structural equation modeling can be applied to neuroimaging data, but confidence intervals should be presented together with the path coefficient estimates.

© 2007 Elsevier B.V. All rights reserved.

Keywords: Structural equation modeling; Connectivity; Neuroimaging; Validity; Simulation

0165-0270/$ – see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.jneumeth.2007.07.011

* Corresponding author. Tel.: +33 5 40 00 87 41; fax: +33 5 40 00 87 43. E-mail address: [email protected] (X. Noguès).

Structural equation modeling (SEM) was introduced for the analysis of neuroimaging data in 1991 (McIntosh and Gonzalez-Lima, 1991). Since the beginning of the 2000s, the number of articles and of teams using this method has been growing rapidly. The aim of this approach is to describe functional networks of brain structures involved during a task. It is a way to address effective connectivity, which "is a statement about the direct effect one region has on another, accounting for mutual or intervening influences" (McIntosh, 2000). Provided that the anatomical connections within a cluster of structures are known, it allows an estimation of the strength by which each structure is influenced by its input structures. SEM exploits the linear correlation (or covariance) between activity measurements obtained from different structures to derive structural coefficients (scores), which are assumed to reflect the strength of the underlying causal processes.

Initially, this method was developed by Jöreskog and Sörbom and was essentially used to process econometric, sociometric and psychometric data (Bollen and Scott Long, 1993), i.e. disciplines allowing the use of large sample sizes. It was then introduced in neuroscience by McIntosh and Gonzalez-Lima (1991, 1992) in rodents. It is currently widely used in human neuroimaging (Kilpatrick and Cahill, 2003; McIntosh et al., 1994; Schlösser et al., 2003; Zhuang et al., 2005), and a recent study specifies how to use it with electrophysiological data (Astolfi et al., 2005).

Although the aim of SEM seems to perfectly address the issue of network description, it is often claimed that this method may be used only with large samples (Kline, 2005). This assumption cannot be met in neuroimaging studies using animals. Indeed, the cost and time required to obtain a representation of the active brain hamper data collection on a large number of subjects. Given that the measure of brain activity is performed ex vivo, this leads authors to use inter-individual variability as the source of variance and to estimate one model per group. In humans, functional neuroimaging methods make it possible to use 100 or more measures for each region of interest. Thus these methods



allow the calculation of one model per subject and therefore the observation of inter-individual variability in path coefficients. However, some authors still prefer to use inter-individual variability to estimate one model per group. These two approaches have been discussed earlier (Goncalves et al., 2001; Mechelli et al., 2002).

Interestingly, there are no strong theoretical arguments precluding the use of SEM analysis for small sample sizes. Theoretically, there should be at least as many cases as variables, or more exactly "as many distinct response patterns as variables" (Wothke, 1993), or else the covariance matrix will necessarily contain collinearities and produce improper solutions.

Instead, the main limit of small sample sizes in SEM analysis may pertain to the ability of the sample to represent the population. Moreover, in rodents, where brain activity is generally evaluated by semi-automated methods, i.e. by measuring optical density or counting immuno-positive cells on slices, using small sample sizes leads to the magnification of the experimental error occurring during biological processing of the tissue.

Supporting these views, Boomsma (1982) showed in a Monte-Carlo study that the robustness of the analyses might be questioned when performed with samples smaller than 100 cases. Thus, the basic issue for neuroimaging is not "how big [a sample] is big enough?" (Tanaka, 1987), but rather "how reliable and valid are the results obtained in the specific conditions of this discipline?".

The other studies addressing the problem of small sample size focus either on the dependence of goodness-of-fit (GOF) indexes on the sample size (Bentler, 1990; Bollen, 1990; Marsh et al., 1988; Tanaka, 1987), or on the occurrence of "improper solutions". Schematically, GOF indexes measure the adequacy between the estimation of a structural network and the initial – empirical – data set. As shown later, the interest of GOF indexes is limited in neuroimaging, and they will not be discussed here. Improper solutions are cases of mathematical failure related to sampling and measurement error (Wothke, 1993) which prevent the computation of structural coefficients. They occur more frequently when the sample size is small (Anderson and Gerbing, 1984). Fortunately, solutions to this problem have been proposed (Browne and Mels, 1999; Jöreskog and Sörbom, 2001; Kline, 2005).

Given the constraints characterizing neuroimaging research, the aim of the present article is to evaluate the reliability and the validity of SEM when the coefficients have been estimated under these constraints. Reliability can be defined as "the degree to which the scores are free from random measurement error" (Kline, 2005), and from random sampling error. Reliability is low when the scores computed from different samples have a high variance. Validity is the ability of the scores to "measure what they are supposed to measure" (Kline, 2005), in other words, their ability to approach the true underlying coefficients in absolute terms.

Beyond validity and reliability, i.e. parameters which both relate to the absolute deviation of the estimations from the actual coefficients, it is pertinent to have an indicator of how the estimation of the structural coefficients of a network respects the relative strength of connection of the actual population network.


For example, we could be interested in knowing whether a structure (for example the hippocampus) is more influenced by one of its afferent structures (the entorhinal cortex) during a task, rather than by another afferent structure (the parietal cortex). We therefore used an index measuring how well SEM analysis respects the relative strength of the coefficients.

With real data, the actual path coefficients are unknown. They can only be estimated by SEM analysis. Consequently, the validity and the accuracy of the relative strength of the coefficients are impossible to measure, and the reliability is difficult to evaluate when using a one-model-by-group approach, because it would require replicating the same experiment a number of times.

To address this issue, we propose a computational approach, which relies on specifying a priori structural equation models and generating simulated case samples with these models. The ability of SEM to identify the coefficients can then be evaluated by directly comparing the estimated coefficients with the real (and known) ones. This procedure allows a systematic study of the effects of changes within model parameters, sample properties, or SEM methods. The efficacy of this approach has been demonstrated by Guadagnoli and Velicer (1988), who showed that some classically used rules in principal component analysis were founded whereas some others were not. A similar approach has also been used to study the ability of the χ²DIFF method to detect differences between two models when sample size is low (Protzner and McIntosh, 2006).

Here, we first study the effect of sample size on the validity, the reliability and the accuracy of the relative strength of the coefficients of SEM analyses when a solution is found, and on the number of improper solutions. In a second part, we evaluate and compare the efficiency of some methods used to force the SEM algorithm to reach a solution when an improper solution occurs. Finally, we study the effects of two sources of variance which may impair the quality of structural coefficient estimation. The first source is the set of structures not included in the model, but which influence some of the endogenous variables. The second one is the "experimental error" introduced when trying to evaluate brain activity within each structure. The parameters used in these experiments have been chosen by reference to neuroimaging data on rodents, e.g. immunohistochemistry. However, the conclusions are likely to be applicable to human neuroimaging such as PET scan or magnetic resonance imaging.

1. General methods

1.1. Types of variables

The first step of SEM to study a network of brain structures is to select the set of structures which are supposed to interact during a task. Anatomical paths linking the structures are then postulated. Each structure is supposedly characterized by a single scalar variable reflecting its activity. Among these structures, some do not receive any input from other variables of the model and represent the input to the network. These structures are termed "exogenous variables". The other structures, i.e. those receiving input from other structures of the selected set, are the "endogenous variables". Although the set of selected structures is considered as a "functional cluster" (as defined by Tononi et al., 1998), this cluster cannot be completely disconnected from the rest of the brain. Therefore, some structures not included in the model may influence endogenous variables, since only a subset of all brain structures has been selected. In our study, these structures will be labeled the "neglected input", and their influence will be examined as well. In the SEM model, the activity of each structure is supposed to result from a linear combination of endogenous, exogenous and neglected influences. Finally, a random experimental error is supposed to affect each measured variable. It may reflect the variability inherent to the biological material (sampling of brain slices), or to the measures of activity such as cell counting (immunohistochemical data) or optical density measurement (autoradiographic data or enzymology) (Fig. 1).

Fig. 1. Principle of implementation of the models. Path coefficients (p.c.) are chosen. A different random value is attributed to each exogenous variable (Ex.) and to each "neglected variable" (Negl.), creating a variance (Var.) for these variables within the sample. The value of each endogenous variable (End.) is then defined as a linear combination of its afferences (including exogenous variables, neglected inputs, and other endogenous variables), weighted by the respective path coefficients. Finally, a random experimental error (exp.er.) is added to each endogenous and exogenous variable.

1.2. The models

We constructed two models without any feed-back loop, models 1 and 2, and three models with one feed-back loop, models 3, 4 and 5. In SEM terminology, the former are termed "recursive models" because they can be solved "recursively". By contrast, the latter are said to be "non-recursive". Each model contained six variables (structures), among which three were exogenous and three were endogenous. Three different (uncorrelated) neglected inputs, respectively influencing the three endogenous variables, were added to generate the primary data files.

As shown in Fig. 2, all the models are very similar in structure and path coefficients. However, basic differences were introduced in some parameters, i.e. the feed-back loop locus and the number of variables involved in the loop for non-recursive models. In order to facilitate the comparison between the models, the structural coefficients were arranged so that their mean value was equal to 0.9 for each model, whatever the number of connections. The linear combination of all inputs to a given structure, weighted by their structural coefficients, defined the structural equation for this structure.

Fig. 2. The five models and the path coefficients used for the study.
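To make the definition of a structural equation concrete, consider a hypothetical endogenous structure E1 (not one of the structures of Fig. 2) receiving two exogenous inputs X1 and X2 with path coefficients p1 and p2, plus a neglected input N1. Its structural equation would read:

E1 = p1 × X1 + p2 × X2 + N1

and the measured value of E1 is this combination plus the experimental error term described in Fig. 1.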


For data generation, each exogenous and neglected variable was implemented as a normally distributed random number. The variances of the exogenous variables were set to 1. Except for experiment 3 (study of the effects of the neglected input), the variances of the neglected variables were individually adjusted so as to add 20% to the variance of each endogenous variable. Structural equations were then used to calculate the values of endogenous variables. For non-recursive models, an iterative procedure was conducted up to convergence. Before the beginning of the simulation, the assumption of equilibrium (Kline, 2005) was verified for each non-recursive model.

After this procedure, a normally distributed random experimental error was added to each variable (both exogenous and endogenous). Except for experiment 3 (study of the effects of experimental error), the variances of these random errors were individually fixed so as to add 5% to the variance of each variable.
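The generation procedure described above can be sketched as follows; the authors used Octave, but the same steps are shown here in Python on a hypothetical two-input fragment (the structure and coefficient values are illustrative, not one of the five models of Fig. 2):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 150_000  # size of the case database (Section 1.3)

# Hypothetical recursive fragment: two exogenous structures (x1, x2)
# feeding one endogenous structure (y); coefficients are illustrative.
b1, b2 = 0.9, 0.9

x1 = rng.standard_normal(N)   # exogenous variables, variance set to 1
x2 = rng.standard_normal(N)
signal = b1 * x1 + b2 * x2    # structural equation without noise terms

# Neglected input: variance adjusted to add 20% to the endogenous variance.
negl_sd = np.sqrt(0.20 * signal.var())
y = signal + negl_sd * rng.standard_normal(N)

# Experimental error: adds 5% to the variance of every measured variable.
def add_exp_error(v, frac=0.05, rng=rng):
    return v + np.sqrt(frac * v.var()) * rng.standard_normal(len(v))

x1m, x2m, ym = (add_exp_error(v) for v in (x1, x2, y))
```

For a non-recursive model, the structural equations would instead be iterated to convergence before the error terms are added, as described in the text.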

1.3. Data generation, file processing and structural equation modeling

The first step of data generation consisted in generating 150,000 subjects for each model, each with six measurements (structures). According to the required sample size, subjects were drawn from this database, and the covariance matrices were calculated. These early steps were performed using Octave 2.0.16 (© J.W. Eaton, GNU General Public Licence). SEM analyses were performed with Systat 9.01 (SPSS Inc.), which uses the "reticular action model or near approximation" algorithm (RAMONA; Browne and Mels, 1999). All the SEM analyses were performed automatically by using Systat command files. The maximum iteration number was fixed to 999 for each analysis to ensure convergence in all cases.
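The drawing and covariance steps of this pipeline can be sketched in the same way (`data` stands in for one model's case database; the function name is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.standard_normal((150_000, 6))  # placeholder for one model's database

def sample_covariance(data, n, rng):
    """Draw n cases without replacement and return their 6 x 6 covariance matrix."""
    idx = rng.choice(len(data), size=n, replace=False)
    return np.cov(data[idx], rowvar=False)

cov20 = sample_covariance(data, 20, rng)  # one matrix for a sample size of 20
```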

1.4. Indexes of quality

Three indexes were used to evaluate the quality of the SEM results. To evaluate the validity and the reliability of the estimations, an "error index" (Er) was computed. Er is the mean of the absolute values of the differences between the estimated (k) and the real (κ) coefficients:

Er = (Σi=1..NK |ki − κi|) / NK

with NK being the number of path coefficients in the model. This index gives a precise view of the absolute error of the coefficients estimated from one sample and should be as small as possible. Conversely, the validity of the method is inversely related to the average value of Er computed from several analyses, and its reliability is inversely related to the variance of Er.

Basically, the use of a quadratic estimator (i.e. Σ(k − κ)²/N) is more common (Astolfi et al., 2005). However, given that the weight of large differences is amplified, a high value of such an estimator may mean either that the error is concentrated on a single coefficient or on a few coefficients while the others have been well estimated, or that the error is homogeneously distributed among the whole set of estimated coefficients. Consequently, we used Er, which reflects the actual mean of the errors, and, separately, we studied the relative concentration of the error on a small number of structural coefficients (Ec). Ec is defined as follows:

Ec = 100 × [NK Σ(ki − κi)² − (Σ |ki − κi|)²] / [NK (Σ |ki − κi|)² − (Σ |ki − κi|)²]

This index is designed to be independent from Er. When the error is concentrated on a few coefficients, the first terms of the numerator and denominator tend to become equivalent and the index tends toward 100. Conversely, if the error is distributed among all the coefficients, the numerator and therefore the index tend toward zero. We presented this index in the results only when the studied factor influenced it significantly.

Finally, to evaluate whether the estimation of the model respects the relative strength of the coefficients, a third index, i.e. the ARSC index (for accuracy of the relative strength of the coefficients), was computed as the Pearson correlation coefficient between the estimated and the real coefficients for each sample: rk,κ. Compared to Er, which assesses the absolute errors, this index reflects whether the relative magnitude of the structural coefficients within a network is well represented (rk,κ ≈ 1) or not (rk,κ ≪ 1) by the set of coefficient estimates. Depending on the studied model, these Pearson correlation coefficients are calculated on six (models 2 and 3) or seven (models 1, 4 and 5) couples of data.
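The three indexes can be written down compactly; a minimal sketch (the function name is hypothetical), with k the estimated and kappa the true coefficients:

```python
import numpy as np

def quality_indexes(k, kappa):
    """Er (mean absolute error), Ec (error concentration, 0-100) and
    ARSC (Pearson r between estimated and true path coefficients)."""
    k, kappa = np.asarray(k, float), np.asarray(kappa, float)
    nk = len(k)
    abs_err = np.abs(k - kappa)
    er = abs_err.mean()
    s1, s2 = abs_err.sum(), (abs_err ** 2).sum()
    ec = 100 * (nk * s2 - s1 ** 2) / (nk * s1 ** 2 - s1 ** 2)
    arsc = np.corrcoef(k, kappa)[0, 1]
    return er, ec, arsc
```

With the whole error concentrated on a single coefficient, Ec reaches 100; with the same total error spread evenly over all coefficients, Ec is 0.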

1.5. Statistical analyses

All the indexes tested in this study presented either a high heteroscedasticity (heterogeneity of variance) and/or a dissymmetry in distribution. Given that no mathematical transformation could normalize these data, non-parametric statistics were used. Most of the analyses (Kruskal–Wallis test, Friedman test, Mann–Whitney U) were performed with Systat 9.01 (SPSS Inc.). The Nemenyi test (the non-parametric version of the Tukey test) was performed according to Zar (1999). When addressing variances, Bartlett's test for homoscedasticity and post hoc pairwise comparisons of variances were computed as described by Zar (1999).

Since non-parametric statistics were used, multiway analyses of variance could not be used, and the same statistical tests had to be run separately on the five models. The Bonferroni procedure was therefore applied to correct for tests repeated five times, so that results were considered significant when the p value was equal to, or below, 0.0102, which corresponds to an actual alpha error risk of 0.05.
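The corrected per-test threshold can be checked in one line; the form below is the Dunn–Šidák variant, which yields exactly a 0.05 family-wise risk over five tests:

```python
# Per-test alpha such that the actual (family-wise) alpha error risk
# over five repeated tests remains 0.05.
family_alpha, n_tests = 0.05, 5
per_test_alpha = 1 - (1 - family_alpha) ** (1 / n_tests)
print(round(per_test_alpha, 4))  # -> 0.0102
```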

2. Experiment 1: decrease in sample size reduces SEM accuracy, but respects the relative strength of path coefficients

The present experiment explored how decreasing sample size may increase the proportion of improper solutions and decrease the reliability and validity of the estimated coefficients, and assessed whether the results respected the relative strength of path coefficients.


2.1. Experimental design

In this experiment, we manipulated the size of the sample used to perform the SEM analysis. This factor had seven levels (number of cases = 10,000, 1000, 100, 25, 20, 15, and 10). The structural coefficients were estimated with the maximum Wishart likelihood (MWL) estimation method.

Cases were generated as described in Section 1. Samples were built by randomly selecting an appropriate number of cases from the database (without replacement), and the covariance matrices were computed. For each sample size, 25 different samples were selected, and the corresponding covariance matrices were stored and further processed for SEM analysis. When some covariance matrices led to improper solutions, the procedure was repeated with a new series of 25 covariance matrices, until at least 25 analyses were successfully achieved. Thus, for any sample size, the reliability, validity and ARSC of the analyses were assessed on 25–49 samples.
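The repeat-until-25 protocol can be sketched as follows; `fit_sem` is a hypothetical stand-in for the Systat/RAMONA analysis (the real criterion for an improper solution is internal to that software):

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.standard_normal((1000, 6))  # stand-in for one model's case database

def fit_sem(cov):
    """Hypothetical fit: returns coefficient estimates, or None on failure."""
    return None if np.linalg.det(cov) <= 0 else np.diag(cov)

def collect_analyses(data, n, rng, batch=25):
    """Run analyses on series of 25 covariance matrices until at least 25
    succeed, yielding between 25 and 49 usable analyses per sample size."""
    results = []
    while len(results) < batch:
        for _ in range(batch):
            idx = rng.choice(len(data), size=n, replace=False)
            fit = fit_sem(np.cov(data[idx], rowvar=False))
            if fit is not None:
                results.append(fit)
    return results

runs = collect_analyses(data, 20, rng)
```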

2.2. Results

The percentage of analyses leading to a solution increases as a function of the sample size, so that we can deduce that the percentage of analyses leading to an improper solution becomes very low or null above a sample size of 100 (Fig. 3).

Fig. 3. Percentage of the analyses that do not lead to an improper solution as a function of the sample size.

The reliability of the technique for the different sample sizes (Fig. 4) was tested by comparing the variances of Er. Bartlett's test showed a significant heteroscedasticity in the Er parameter between the various sample sizes for each model (Bartlett χ²(6) > 262 for each model; p < 0.0001), which corresponded to a decrease in the reliability. Indeed, all the pairwise comparisons of variances between sample sizes of 10,000, 1000, 100 and 25 were significant (p < 0.01 for all the models). In models 1, 4 and 5, there was a significant difference between the sample size of 10 and all the other groups. In model 2, there was a significant difference between the sample size of 15 and all the other sample sizes except size 10. In model 3, a significant difference was observed between the sample sizes of 20 and 25, but the comparisons between the groups of smaller sample sizes were not significant. Altogether, these results indicate a continuous decrease in reliability when sample size decreases. Fig. 4 also indicates that, for small sample sizes, the reliability is lower in models 4 and 5, i.e. the models containing pairs of structures harboring direct reciprocal connections.

Fig. 4. Evolution of the validity and the reliability as functions of the sample size. The central bar represents the median of Er. Upper and lower boxes: second and third quartiles. Whiskers show the minimum and the maximum values observed. The lower the median, the higher the validity. The lower the variability (represented here by the quartiles and extreme values), the higher the reliability. When the maximum value exceeds the maximum value of the ordinate axes, the actual value is given. +Significant difference in reliability (Bonferroni corrected p < 0.05); *significant difference in validity (Bonferroni corrected p < 0.05).

To assess validity, the mean of Er was considered (Fig. 4). The Kruskal–Wallis test showed a significant decrease in the validity as sample size decreased (KW's H > 138 for each model; p < 0.0001). Post hoc pairwise comparisons were performed with the multiple comparison test for unequal sample sizes (Zar, 1999). Schematically, the large sample sizes (i.e. 10,000, 1000 and 100) can be distinguished from the other ones, which present a lower validity. Within this latter set, we observed significant differences between the sample sizes of 25 and 10 for models 4 and 5. Finally, Fig. 4 displays outlier values of Er for models 4 and 5. Interestingly, they are due to outlier estimations of a few easily detectable coefficients within a model (generally 1 or 2). The presence of a suspect estimated path coefficient may suggest the requirement of a replication of the experiment.

Together, these data show that the validity of the results continuously decreases when sample size decreases, a characteristic particularly obvious for the smallest sample size (n = 10) in models 4 and 5. An example of two representative model estimations is shown in Fig. 5. For sample sizes above 25, both the validity and the reliability are very high.

Fig. 5. A representative example of a model estimation with sample sizes of 10,000 (third column) and 20 (fourth column). The second column indicates the actual coefficients which the model has been calculated with. The example has been chosen so that the Er and Ec parameters were the closest to the mean of all the analyses conducted for a given sample size.

The effect of sample size on the error concentration index (Ec; cf. Fig. 6) appeared significant for models 2, 3, 4 and 5 (KW's H > 21.5 in each case; p < 0.005), but not for model 1 (KW's H = 3.25; p = 0.78). Specifically, post hoc pairwise comparisons showed that, on average, from n = 25 to n = 10, Ec gradually increased, indicating that when the sample size becomes small, the error is not equally distributed between all the estimated coefficients.

As shown in Fig. 7, all the structural coefficients estimated with 10,000, 1000 and 100 cases are correlated with the real coefficients (r > 0.5). This correlation is still present for sample sizes of 15, 20 and 25 (less than 12.1%, 10.3%, and 7.4% of the correlation coefficients <0.5, respectively), but this percentage increases for a sample size of 10 (26.2%).

In summary, this experiment demonstrates the decrease in reliability and validity of SEM analyses when sample size drops to low values. However, at least for sample sizes of 20 or more, it also demonstrates that the accuracy of the relative strength of the coefficients is retained about 90% of the time when the initial covariance matrix leads to a solution. The following experiment investigates the case of covariance matrices leading to improper solutions.

3. Experiment 2: the smoothing method is more appropriate than the GLS method in circumventing improper solutions

As demonstrated in the first experiment, SEM analysis is often interrupted by the occurrence of "improper solutions" when sample size is small. According to Wothke (1993), improper solutions often occur with small sample size because random sampling combinations sometimes lead to collinearities in the covariance matrix.

At least two solutions have been proposed to force programs to come to a solution: smoothing the ridge of the initial covariance matrix (Jöreskog and Sörbom, 2001), and using a generalized least squares (GLS) estimation method instead of a maximum Wishart likelihood method (MWL; Browne and Mels, 1999).

3.1. Methods

For each of the five models, the sample size was set to 20 cases. The first 25 samples leading to improper solutions were selected. Two different methods were used to reach a solution on these samples: the MWL method performed after smoothing the ridge of the initial covariance matrix, and the GLS method on the initial covariance matrix.

Fig. 6. Evolution of the "error concentration index" (Ec) as a function of sample size. *Significant difference in Ec (Bonferroni corrected p < 0.05).

3.1.1. Smoothing the ridge
A series of 30 "smoothed" covariance matrices was computed by multiplying the diagonal (the ridge) of the initial covariance matrix by a constant greater than unity:

Dp = Di + Di × 10−4 × (10^(1/4))^p

with Di being the diagonal of the initial covariance matrix, Dp being the diagonal of the pth smoothed covariance matrix, and p an integer varying from 1 to 30. Basically, this method is identical to the procedure used in the "ridge option" of Lisrel 8 (Joreskog and Sorbom, 2001), where Dp = Di + Di × 10−4 × 10^p. The only difference resides in the size of the step of the constant's increase, which is four times smaller in our procedure than in Lisrel's.

For each of the 25 samples, SEM analyses of these 30 matrices were computed, starting with p = 1 and using the MWL algorithm. The structural coefficients obtained with the first smoothed matrix leading to an admissible solution were stored for analysis.
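The selection loop above can be sketched as follows. `fit_is_admissible` is a hypothetical stand-in for the SEM fitting routine; for illustration it is replaced by a simple positive-definiteness check:

```python
import numpy as np

def smoothed_covariances(cov, n_steps=30):
    # Yield the series of ridge-smoothed matrices
    # D_p = D_i + D_i * 1e-4 * (10**(1/4))**p, for p = 1..n_steps.
    cov = np.asarray(cov, dtype=float)
    diag_i = np.diag(cov).copy()
    for p in range(1, n_steps + 1):
        smoothed = cov.copy()
        np.fill_diagonal(smoothed, diag_i * (1.0 + 1e-4 * (10 ** 0.25) ** p))
        yield p, smoothed

def first_admissible(cov, fit_is_admissible):
    # Return the first smoothing step whose matrix yields an
    # admissible solution, as in the selection rule above.
    for p, smoothed in smoothed_covariances(cov):
        if fit_is_admissible(smoothed):
            return p, smoothed
    return None

# Toy check: a collinear (singular) covariance matrix becomes positive
# definite as soon as its ridge is slightly inflated.
singular = np.array([[1.0, 1.0], [1.0, 1.0]])
is_pos_def = lambda m: bool(np.all(np.linalg.eigvalsh(m) > 0))
p, fixed = first_admissible(singular, is_pos_def)
```

In practice the admissibility check would be the MWL fit itself, and the stored output would be the structural coefficients of the first admissible step.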

3.1.2. GLS estimation method
This procedure consists in applying a generalized least squares (GLS) algorithm instead of an MWL one to the initial covariance matrix (i.e. without smoothing its ridge).
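For reference, the two discrepancy functions being compared can be written down directly. A sketch of their standard forms (e.g. as given by Browne and Mels, 1999), with S the sample covariance matrix and Sigma the model-implied one:

```python
import numpy as np

def f_ml(S, Sigma):
    # Maximum (Wishart) likelihood discrepancy.
    p = S.shape[0]
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    _, logdet_S = np.linalg.slogdet(S)
    return logdet_Sigma - logdet_S + np.trace(S @ np.linalg.inv(Sigma)) - p

def f_gls(S, Sigma):
    # Generalized least squares discrepancy: residual covariances
    # weighted by the inverse of the sample covariance matrix.
    p = S.shape[0]
    R = np.eye(p) - Sigma @ np.linalg.inv(S)
    return 0.5 * np.trace(R @ R)

# Both discrepancies vanish when the model reproduces the data exactly,
# and are positive otherwise.
S = np.array([[1.0, 0.3], [0.3, 1.0]])
Sigma_wrong = np.array([[1.0, 0.0], [0.0, 1.0]])
```

The GLS route thus changes the weighting of the residuals rather than the data, which is why it can converge on matrices that defeat the MWL fit.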


3.2. Results

When the initial covariance matrix led to improper solutions, a solution was always found with both the GLS and the smoothing methods. Interestingly, the sensitivity of the results to small variations in the smoothing constant was low for the first steps following the solution, i.e. if it remained close to the threshold allowing a solution (data not shown).

When comparing the variance of the GLS solution with that of the MWL solution performed on the smoothed covariance matrix (Fig. 8), it appears that smoothing the ridge of the covariance matrix provides significantly more reliable results than the GLS estimation method for models 1, 2, 3 and 5 (Snedecor F > 3.10; F0.01(24,24) = 2.97; p < 0.01). Model 4 presents the inverse, but not significant, pattern. Therefore, an MWL solution obtained after smoothing the ridge of the covariance matrix can be considered more reliable than the GLS solution. The validity is significantly better for the smoothing method on models 1, 2 and 4 (Fr > 6.75; p < 0.01 in each case; Friedman test), but the difference is not significant for model 5 and the inverse pattern is observed for model 3 (Fr = 9.00; p < 0.01).

The ARSC index (Fig. 9) is high in each model when the smoothing method is used (the mean values for each model are located between 0.60 and 0.95 for the smoothing method and between 0.40 and 0.85 for the GLS method). The ARSC index in model 3, which presents a better validity when data are processed with the GLS method, is also higher on average with the GLS method, but remains very high when the smoothing method is used (mean = 0.71). Moreover, the ARSC index is lower than 0.5 in very few cases with both methods (one case with the smoothing method, and two cases with the GLS method). Finally, if the five models are taken together, the ARSC indexes are lower than 0.5 in 10.4% of the cases with the smoothing method, and in 24.8% of the cases with the GLS method.

Fig. 7. Distribution of the ARSC index by sample size for all the analyses for each type of model. Horizontal axis: correlation coefficient between real coefficients and their estimations (ARSC index). One symbol represents one analysis. Closed squares: model 1; closed triangles: model 2; open triangles: model 3; open squares: model 4; open circles: model 5.

Fig. 8. Effect of the method used to prevent the occurrence of improper solutions on the validity and reliability of the estimation. Central bar: median of Er. Upper and lower boxes: second and third quartiles. Whiskers: minimum and maximum values observed. OK: analyses which did not require a specific method to give an interpretable solution; the MWL method was used. Smooth: the ridge of the initial covariance matrix was smoothed before computing an MWL solution. GLS: a GLS solution was computed without smoothing the ridge of the initial covariance matrix. +Significant difference in reliability (Bonferroni corrected p < 0.05); *significant difference in validity (Bonferroni corrected p < 0.05).

Fig. 9. Distribution of the ARSC index for all the analyses for each type of model, which did not lead to an improper solution (OK), or which were preceded by the smoothing of the ridge of the initial covariance matrix (smooth), or which were conducted with a GLS algorithm (GLS). Closed squares: model 1; closed triangles: model 2; open triangles: model 3; open squares: model 4; open circles: model 5.

Therefore, these results demonstrate that the smoothing method is actually more efficient than the GLS method when the sample size is small.

Finally, we compared the estimations obtained with the smoothing method on covariance matrices leading to improper solutions with the ones obtained on matrices that did not require this procedure (data of the first experiment; sample size of 20; cf. Figs. 8 and 9). There were no significant differences in the reliability for models 1, 2, 4 and 5 (F test for the comparison between two variances). The reliability is significantly higher for the matrices that initially led to improper solutions for model 3 (F > 4.16; F0.01(32,24) = 2.87; p < 0.01). The validity is significantly higher in the matrices which did not require the smoothing of their diagonal for model 3 only (p < 0.001; Mann–Whitney U).

It can thus be concluded that the analyses that required the ridge of the covariance matrix to be smoothed provided results that are as accurate as those provided by analyses that did not require this procedure.

4. Experiment 3: neglected input and experimental error have little influence on SEM accuracy if their magnitude is low

The aim of this experiment was to study how two sources of variance may influence the SEM analysis result, i.e. the structures that are not included in the model but which influence some structures of the model, and the experimental error.

Performing a SEM study requires the a priori selection of a set of variables. In neuroimaging, this set of variables represents a "functional cluster" of structures, i.e. a set of structures particularly interconnected during the period of interest. However, functional clusters cannot be considered as completely isolated from the rest of the brain (Tononi et al., 1998). Some surrounding structures which do not belong to the functional cluster may have an influence on some structures of the model. In this study we labeled them "neglected input". They introduce a part of variance that is not explained by the other variables of the model, but that is propagated through the model.

Another source of variance which may putatively impair SEM analysis is the experimental error. This source of variance is introduced at several steps during the processing of biological material. It is different from the variance caused by the neglected structures because this new source of variance has no consequences on the interaction between the structures of the model.

4.1. Methods

For the study of the neglected input influence, the same set of 25 samples of 20 cases was used in each experimental condition. This procedure permits the use of repeated-measures statistics. Covariance matrices leading to improper solutions were processed with the smoothing method by smoothing their ridge as described above in the second experiment.

The percentage of influence of the neglected input was manipulated so that it added 0, 5, 10, 25, or 50% of the variance to the endogenous variables. The other parameters were set to the values given in Section 1.

Twenty-five other samples of 20 cases were used for the evaluation of the experimental error effect. The influence of the neglected input structures was fixed as described in Section 1. The percentage of experimental error was manipulated so that it added 0, 2.5, 5, 10, 25, 50 or 100% of variance to the variables. The other parameters were set as described in Section 1.
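The paper does not spell out its generator; a sketch of how an independent noise source can be scaled to add a given percentage of variance (the `fraction` parameter is our own naming):

```python
import numpy as np

def add_variance_fraction(signal, fraction, rng):
    # Add an independent Gaussian source whose variance equals
    # `fraction` of the signal's variance (e.g. 0.5 for the 50%
    # experimental-error condition).
    noise_sd = np.sqrt(fraction * np.var(signal))
    return signal + rng.normal(0.0, noise_sd, size=signal.shape)

rng = np.random.default_rng(0)
activity = rng.normal(0.0, 1.0, size=100_000)
noisy = add_variance_fraction(activity, 0.50, rng)
ratio = np.var(noisy) / np.var(activity)  # close to 1.5
```

The same scaling applies whether the added source stands for a neglected input (which then propagates through the model) or for experimental error (which does not).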

4.2. Results

The number of improper solutions tended to decrease when the influences of the neglected input or the experimental error increased. These results can be explained by the fact that the influence of both the neglected input structures and the experimental error may lead to a decrease in the risk of obtaining collinearities in the covariance matrix (Wothke, 1993). The Bartlett test showed a significant decrease in the reliability of estimations with the increase in neglected input influence, but only for the non-recursive models (Bartlett χ2 > 46; p < 0.001 in each non-recursive model, cf. Fig. 10). The reliability of the estimation also tended to decrease in each model when the experimental error increased (Bartlett χ2 > 47; p < 0.001 in each model; Fig. 11). Post hoc pairwise comparisons showed that this decrease was significant essentially when the influence of the neglected structures or experimental error was relatively high (when it added 50% of the variance).

The validity of estimations was significantly impaired in each model both when the neglected input influence increased (Fr > 23 in each model; p < 0.001) and when the percentage of experimental error increased (Fr > 82 in each model; p < 0.001). Interestingly, only experimental errors adding 50 and 100% of variance led to a significantly lower validity in models 1, 2, 3 and 5 compared to samples with lower percentages of experimental error. The effect is significant when experimental errors add 25% of variance for model 4. The same pattern of results is observed for the increase in neglected input influence (Nemenyi test). Neither the increase in experimental error nor the magnitude of neglected input influenced the Ec index (Friedman test).

Fig. 10. Evolution of the validity and of the reliability as functions of the magnitude of the neglected input. Central bar: median of Er. Upper and lower boxes: second and third quartiles. Whiskers: minimum and maximum values observed. +Significant difference in reliability (Bonferroni corrected p < 0.05); *significant difference in validity (Bonferroni corrected p < 0.05).

The analysis of the ARSC index yielded interesting information in the study of the influence of neglected input. As shown in Fig. 12a, despite the significant decrease in validity, an actual decrease in ARSC was observed only when the neglected inputs added at least 25% of the variance in models 1, 4 and 5, whereas in models 2 and 3 the ARSC index remained high even when the neglected inputs added 50% of the variance.

For the experimental error, the analysis of the ARSC index supports the results on validity. This index was essentially influenced by the amount of experimental error when it added 25% (model 5) or 50–100% of the variance (models 1, 2, 3 and 4; Fig. 12b).

5. Discussion

This study aimed at evaluating the validity and the reliability of the results of structural equation modeling analyses, and how well this method respects the relative strength of connections when it is performed in conditions compatible with research in neuroimaging, in particular in animal ex-vivo imaging. In a first experiment, we evaluated the increase in the quality of results when the sample size increases. For the first time, we demonstrate that analyses conducted with small sample sizes provide meaningful information on the connectivity within a functional network in a large percentage of cases. Moreover, this study confirms and displays the high validity and reliability of this method for sample sizes over 100. This supports the view that the one-model-by-subject approach (Goncalves et al., 2001; Mechelli et al., 2002) is valid and reliable. In a second set of experiments we compared the efficiency of two methods used to circumvent the occurrence of improper solutions. We demonstrate that the "ridge" method (i.e. smoothing the diagonal of the initial covariance matrix) is not only the best suited to achieve a solution, but also that this method leads to results that are as reliable and valid as those obtained on matrices that do not require this procedure. Finally, this study shows that both the experimental error and the structures that are not included in the model, but that actually influence the functioning of the network, can decrease the efficiency of SEM, but only when they contribute a large amount of variance to the data. Moreover, the loss in efficiency that they induce is very low. This study is based on five different models representing both recursive and non-recursive models. The set of models that have been chosen samples the two prototypical classes of models encountered in neuroimaging, and the conclusions can therefore be applied to empirical data. Together, these results support the view that structural equation modeling can be used to describe networks of brain structures involved during a task. However, they have some consequences for SEM practice: (1) when the imaging method allows one to get 100 images (or more) per structure and per subject, the one-model-by-subject approach should be preferred to the one-model-by-group; (2) when the one-model-by-group approach is adopted (consequently leading to small sample sizes), the confidence intervals of the structural coefficients should be presented along with the estimated coefficients; (3) when an improper solution occurs, smoothing the ridge of the covariance matrix should be preferred to the GLS solution, even if it is not included in the software. The present results also suggest that additional research will have to be conducted (1) to provide a better evaluation of how feed-back loops impair coefficient estimation (both as a function of the size of the loop, and when loops are fitted into each other), and (2) to complete the statistical methods dedicated to comparing structural equation models with tools taking into account the probability that samples actually represent the population.

Fig. 11. Evolution of the validity and the reliability as functions of the percentage of variance added by experimental error. Central bar: median of Er. Upper and lower boxes: second and third quartiles. Whiskers: minimum and maximum values observed. When the maximum value exceeds the maximum value of the ordinate axes, the actual value is given. +Significant difference in reliability (Bonferroni corrected p < 0.05); *significant difference in validity (Bonferroni corrected p < 0.05).

5.1. Presenting confidence intervals in addition to the structural coefficients

Fig. 12. Evolution of the percentage of analyses leading to ARSC indexes lower than 0.5 as a function of the magnitude of the neglected input (a), and experimental error (b). Closed squares: model 1; closed triangles: model 2; open triangles: model 3; open squares: model 4; open circles: model 5; stars: means.

First, it should be noted that SEM results in neuroimaging should be taken in a different way than the results of SEM analyses performed in other disciplines where sample sizes are larger. Because of the spatial resolution of neuroimaging and the knowledge of brain neuroanatomy, the brain network is relatively well known compared to the architectures of structural equation models in other disciplines. Indeed, in social sciences and econometrics, where this method has been developed, this architecture generally constitutes the core of the hypothesis which has to be tested. Thus, large sample sizes are required in order to infer the exact path coefficients and further compare various hypothetical structures reflecting competing theories. By contrast, the basic issue in neuroimaging is not to compare hypothetical anatomical architectures, but rather to identify which connections are functionally activated within a known network of structures. Hence, the actual issue is not the architecture of the anatomical network, or the exact path coefficients, but rather their relative magnitude. Accordingly, neuroscientists generally report the range within which the structural coefficients are located rather than their precise value (Grady et al., 2003; Kilpatrick and Cahill, 2003; McIntosh and Gonzalez-Lima, 1991). The present study has demonstrated that the estimated coefficients are generally close to, but not exactly the same as, the real coefficients. This supports the practice consisting in giving ranges of structural coefficients. An alternative way of presenting results would be to provide the confidence intervals of the structural coefficients, which are given by SEM software packages, together with their estimations. Although the confidence interval may slightly vary between software packages (Gonzalez and Griffin, 2001), presenting the confidence interval together with the estimated path coefficient may give a means to estimate the accuracy of the relative strength of coefficient estimations in real data.
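When the software does not report such intervals, one simple way to obtain them is a case-resampling bootstrap. A sketch using an ordinary regression slope as a stand-in for a path coefficient (the function name and setup are illustrative, not the authors' procedure):

```python
import numpy as np

def bootstrap_slope_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    # Percentile bootstrap confidence interval for the slope of y on x,
    # resampling cases (subjects) with replacement.
    rng = np.random.default_rng(seed)
    n = len(x)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        xb, yb = x[idx], y[idx]
        slopes[b] = np.cov(xb, yb, bias=True)[0, 1] / np.var(xb)
    return np.quantile(slopes, [alpha / 2.0, 1.0 - alpha / 2.0])

# With n = 20 cases, as in the simulations above, the interval is wide,
# which is exactly the information worth reporting with the estimate.
rng = np.random.default_rng(1)
x = rng.normal(size=20)
y = 0.6 * x + rng.normal(scale=0.5, size=20)
lo, hi = bootstrap_slope_ci(x, y)
```

The width of the interval, not just the point estimate, tells the reader how far the relative ordering of coefficients can be trusted at small sample sizes.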

5.2. Sample representativity and the issue of statistical significance

The present study demonstrates a decrease in both validity and reliability when the sample size decreases, leading in some cases to a loss of the ARSC. Since the same models were used to generate several samples and conduct the SEM analysis, the occasional loss of accuracy can only result from the fact that small samples do not always accurately represent the population. This suggests that the χ2 test that is sometimes used as a significance test for model estimation could be unsuited (this test uses a χ2 value as a GOF index). Indeed, the χ2 test is a test for the adequacy between the empirical covariance matrix and the covariance matrix implied by the estimated path coefficients. This test is based on the assumption that the sample actually represents the population and cannot be used to determine whether or not this assumption is justified. Supporting this view, the χ2 values obtained on small sample sizes did not correlate with Er values, i.e. the objective index of SEM accuracy. The χ2DIFF test, initially aimed at comparing hypothetical architectures from a single sample, is based on this same assumption. It is now used to test the significance of changes induced by various experimental conditions. Protzner and McIntosh (2006) showed that the χ2DIFF test allows detecting differences between networks from different experimental conditions, thus demonstrating the power of this procedure. However, given that small samples may sometimes fail to correctly represent the population, the χ2DIFF test may sometimes detect differences between two sample models whereas the underlying path network is the same. Therefore, a remaining issue is to evaluate the type I error risk associated with the sampling bias, i.e. the probability that an observed difference has been obtained by chance.

5.3. Small feed-back loops may impair SEM accuracy when the sample size is low

If the results of this study obtained with a sample size of 20 (variance added by the experimental error = 5%; variance added by the neglected input = 20 or 25%) are taken together, less than 9% of analyses have an ARSC index lower than 0.5. However, this percentage is heterogeneous among the models. It is less than 1.3% for models 1, 2 and 3 taken together, whereas it reaches 20.3% for models 4 and 5 taken together. This strongly suggests that the presence of direct reciprocal paths between two variables within a model impairs the accuracy of the estimations. Preliminary investigations suggest that this loss of efficiency is limited to situations where two structures are bi-directionally coupled, and does not concern loops involving three or more structures (e.g. model 3). Since many connections in the brain are bi-directional, this observation may have important consequences for the practice and interpretation of results from neuroimaging when the one-model-by-group approach is used.

In practice, some studies either present a feedforward architecture, or a predominant direction of connection. For example, the auditory system model (McIntosh and Gonzalez-Lima, 1993) contains 9 structures, 20 connections, and only 3 feed-back loops involving pairs of structures. The proportion of feed-back loops between pairs is higher in studies bearing on more central networks. For example, a model of amygdala interactions (Kilpatrick and Cahill, 2003) presents 9 structures, 27 connections, and 7 pairs with feed-back loops. Finally, as shown by Rajah et al. (1999), the number of feed-back loops between pairs will be largely increased when interhemispheric interactions are studied.

Therefore, it may be useful to develop methods allowing the detection of estimation errors, or more accurate methods for the estimation of feed-back loop coefficients. As shown by Fig. 4, some of the Er values are extremely high in models 4 and 5. This usually resulted from the fact that one of the path coefficients within the loop had been largely overestimated, whereas the other had been underestimated. This suggests a way to detect some of the estimation errors. Although it is difficult to define a formal limit permitting one to decide whether a structural coefficient estimation is plausible or not, unstandardized coefficients higher than 5 can be considered implausible in the brain and should be discarded, whereas they can be considered suspect or extremely high when they exceed 3.
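The heuristic limits above can be captured in a small screening helper (the function name is our own; the thresholds are the ones proposed in the text):

```python
def screen_coefficient(value):
    # Heuristic screening of an unstandardized path coefficient:
    # |b| > 5 is implausible in the brain and should be discarded;
    # |b| > 3 is suspect (typical of over/under-estimated loop pairs).
    if abs(value) > 5:
        return "implausible"
    if abs(value) > 3:
        return "suspect"
    return "plausible"
```

Applied to both coefficients of a two-structure loop, such a check flags the characteristic pattern of one largely overestimated and one underestimated path.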

Alternatively, two other solutions have been proposed to estimate path coefficients in feed-back loops when the sample size is low. The first consists in using a two-stage least-squares procedure instead of an MWL algorithm for the estimation of the coefficients of the loop (Bollen, 1996). The second consists in estimating the coefficient which is supposed to be the highest within the loop, and the other in a second analysis (McIntosh and Gonzalez-Lima, 1992). However, since no simulation study has yet been carried out with a sample size compatible with ex vivo neuroimaging, the efficiency of these methods is not known. We are presently investigating several methods in order to find which of them is the most appropriate.

Until an efficient solution to this problem has been found, it may be prudent not to present estimated coefficients within two-structure feed-back loops, but only the functional connectivity between the two structures, as indicated by their correlation coefficient or their partial correlation coefficient. This presentation is less informative than structural coefficients (when their estimation is correct), but it is probably less prone to error.
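A sketch of this fallback: report the partial correlation between the two loop structures, with the other structures controlled for, read off the inverse covariance (precision) matrix (the toy network below is illustrative only):

```python
import numpy as np

def partial_corr(data, i, j):
    # Partial correlation between columns i and j of `data`
    # (cases x structures), controlling for all other columns,
    # obtained from the inverse covariance (precision) matrix.
    P = np.linalg.inv(np.cov(data, rowvar=False))
    return -P[i, j] / np.sqrt(P[i, i] * P[j, j])

# Toy example: a common input z drives both x and y, so their marginal
# correlation is strong but their partial correlation given z is ~0.
rng = np.random.default_rng(2)
z = rng.normal(size=5000)
x = z + 0.5 * rng.normal(size=5000)
y = z + 0.5 * rng.normal(size=5000)
data = np.column_stack([x, y, z])
r_xy = float(np.corrcoef(x, y)[0, 1])
r_xy_given_z = float(partial_corr(data, 0, 1))
```

Unlike a path coefficient, this quantity is symmetric and carries no directional claim, which is precisely why it is safer when loop coefficients cannot be trusted.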

5.4. Neglected structures: the case of structures that are afferent to several structures

In the present study, we treated the neglected inputs (and experimental error) as uncorrelated. Further simulation experiments will have to explore whether the findings of this study are sustained when some or all of these sources of variability are intercorrelated. However, this question raises a more basic theoretical issue. In real data, correlations between neglected input activities, between experimental errors, or between neglected input and experimental error will increase multicollinearity between the measured variables. An unknown part of this multicollinearity should thus be attributed to experimental error; another unknown part should be attributed to the network of active connections between structures, and a third part to the global brain activity. Specifically removing the global brain activity to measure local activities has already been shown to constitute a methodological issue in its own right (Andersson, 1997; Andersson et al., 2001), and the functional meaning of the global activity remains unclear. Therefore, addressing the issue of mathematical correlations between neglected input activities and between experimental errors when SEM is used to study brain networks would require a preliminary clarification of the concept.

Nevertheless, preliminary data show that neglected structures which influence two structures in the network have little influence on the coefficient estimation when the two innervated structures are directly linked. Moreover, if they are separated by a third, or a third and a fourth structure, the bias rapidly drops.

5.5. Data distribution and anatomical knowledge

The present simulation has been carried out on data which are normally distributed. Some of the biological indicators of brain activity do not present such a distribution. Therefore, special attention should be paid to the choice of the biological marker, which should be normally distributed or appropriately transformed.

A proper use of SEM to process neurobiological data also relies on the accuracy of the knowledge of neuroanatomical networks. The results of the present study are based on this assumption. As discussed above, networks of brain structures are relatively well known compared to other disciplines using SEM, and small errors in path specification may not dramatically influence SEM analysis (Protzner and McIntosh, 2006). Interestingly, some recent studies provide methods to help in specifying anatomical networks in a rigorous manner. These combine the use of anatomical databanks and data-driven resampling methods (Stein et al., 2007) that contribute to selecting the most critical connections for SEM.

5.6. SEM in the real brain world: brain functioning is dynamic but SEM remains useful

In summary, assuming that the anatomical network has been well specified, and that the marker of activity provides normally distributed data, this study shows that SEM is a valid method for the analysis of effective connectivity in the one-model-by-subject approach. In that case, structural coefficients can be directly interpreted in terms of the relative strength by which structures influence the others.

In the one-model-by-group approach (i.e. with low sample sizes), path coefficient estimation is less accurate but still gives information on the relative strength of connections; as discussed above, however, path coefficients of connections which are involved in two-structure feed-back loops must be interpreted more cautiously. Finally, a last issue should be pointed out. It is known that brain functioning is dynamic, i.e. that effective connectivity may change through time during a task, and that the course of this kinetic may depend on its interaction with the environment. The main weakness of SEM for the study of brain connectivity is that it assumes a static architecture of the network. In order to describe this kinetic during a task, some authors (Friston et al., 2003; Lee et al., 2006; Penny et al., 2004) have developed "dynamic causal modeling", which allows the processing of data from neuroimaging in humans. Although some authors still prefer to use SEM in humans, the use of methods allowing the study of brain dynamics will likely grow.

However, the use of SEM analysis will probably increase in animal research with the aim of understanding the basic processes governing the connectivity reorganization associated with brain damage. Experimental lesions being of course precluded in humans, animal model research will remain the only way to study the effects of experimental selective brain lesions on connectivity reorganization. SEM will thus remain necessary until anatomic methods allowing the recording of this kinetic have been developed. Up to now, some pilot studies have investigated the dynamics of structure interaction by intra-cerebral EEG recordings in rodents (Chabaud et al., 1999). Given that the electrophysiological approach does not allow the recording of enough sites at the same time, this approach may only be used to support SEM analysis of neuroimaging data. So, at least in animal studies, SEM will remain the most suited method for studying how various tasks or brain pathologies induce different effective connectivity, and how local brain lesions induce reorganization in brain functional networks.

Acknowledgements

The authors would like to thank Markus Brauer for the organization of SEM training schools, O. Faivre for his critique of the early version of this manuscript, and Christine Schwimmer for editing the manuscript. The idea of presenting the confidence intervals along with the path coefficients was kindly suggested by an anonymous reviewer. This work was supported by CNRS UMR 5228.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.jneumeth.2007.07.011.

References

Andersson JLR. How to estimate global activity independent of changes in local activity. Neuroimage 1997;60:237–44.

Andersson JLR, Ashburner J, Friston K. A global estimator unbiased by local changes. Neuroimage 2001;13:1193–206.

Anderson JC, Gerbing DW. The effect of sampling error on convergence, improper solutions, and goodness-of-fit indexes for maximum likelihood confirmatory factor analysis. Psychometrika 1984;49(2):155–73.

Astolfi L, Cincotti F, Babiloni C, Carducci F, Basilisco A, Rossini PM, et al. Estimation of the cortical connectivity by high-resolution EEG and structural equation modeling: simulations and application to finger tapping data. IEEE Trans Biomed Eng 2005;52(5):757–68.

Bentler PM. Comparative fit indexes in structural models. Psychol Bull 1990;107(2):238–46.

Bollen KA. An alternative two stage least squares (2SLS) estimator for latent variable equations. Psychometrika 1996;61(1):109–21.

Bollen KA. Overall fit in covariance structure models: two types of sample size effects. Psychol Bull 1990;107(2):256–9.

Bollen KA, Scott Long J. Testing structural equation models. London: Sage Publications; 1993. p. 320.

oomsma A. The robustness of Lisrel against small sample sizes in factor anal-ysis models. In: Joreskog KG, Wold H, editors. Systems under indirectobservation. Amsterdam: North-Holland Publ. Comp; 1982. p. 149–73.

rowne MW, Mels G. Path analysis (RAMONA). Systat 9 statistics II. Chicago:SPSS Inc; 1999, 161–219.

habaud P, Ravel N, Wilson DA, Gervais R. Functional coupling in rat cen-tral olfactory pathways: a coherence analysis. Neurosci Lett 1999;276:17–20.

riston KJ, Harrison L, Penny W. Dynamic causal modeling. Neuroimage2003;19:1273–302.

oncalves MS, Hall DA, Johnsrude IS, Haggard MP. Can meaningful effectiveconnectivities be obtained between auditory cortical regions? Neuroimage2001;14:1353–60.

onzalez R, Griffin D. Testing parameters in structural equation modeling: every“one” matters. Psychol Meth 2001;6(3):258–69.

rady CL, McIntosh AR, Craik FIM. Age-related differences in the functionalconnectivity of the hippocampus during memory encoding. Hippocampus2003;13:572–86.

uadagnoli E, Velicer WF. Relation of sample size to the stability of componentpatterns. Psychol Bull 1988;103(2):265–75.

oreskog K, Sorbom D. Lisrel 8: user’s reference guide. Lincolnwood: SSI;2001. p. 378.

ilpatrick L, Cahill L. Amygdala modulation of parahippocampal andfrontal regions during emotionally influenced memory storage. Neuroimage2003;20:2091–9.

line RB. Principles and practice of structural equation modeling. New York:The Guilford Press; 2005. p. 366.

ee L, Friston K, Horwitz B. Large-scale neural models and dynamic causalmodeling. Neuroimage 2006;30:1243–54.

arsh HW, Balla JR, McDonald RP. Goodness-of-fit indexes in confirma-tory factor analysis: the effect of sample size. Psychol Bull 1988;103(3):391–410.

cIntosh AR. Towards a network theory of cognition. Neural Networks2000;13:861–70.

cIntosh AR, Gonzalez-Lima F. Network analysis of functional auditorypathways mapped with fluorodeoxyglucose: associative effects of a toneconditioned as a Pavlovian excitor or inhibitor. Brain Res 1993;627:129–40.

McIntosh AR, Gonzalez-Lima F. Structural modeling of functional neural pathways mapped with 2-deoxyglucose: effects of acoustic startle habituation on the auditory system. Brain Res 1991;547:295–302.

McIntosh AR, Gonzalez-Lima F. The application of structural modeling to metabolic mapping of functional neural systems. In: Gonzalez-Lima F, Finkenstadt T, Scheich M, editors. Advances in metabolic mapping techniques for brain imaging of behavioral and learning functions. Dordrecht: Kluwer Acad. Publ.; 1992. p. 219–55.

McIntosh AR, Grady CL, Ungerleider LG, Haxby JV, Rapoport SI, Horwitz B. Network analysis of cortical visual pathways mapped with PET. J Neurosci 1994;14(2):655–66.

Mechelli A, Penny WD, Price CJ, Gitelman DR, Friston KJ. Effective connectivity and intersubject variability: using a multisubject network to test differences and commonalities. Neuroimage 2002;17:1459–69.

Penny WD, Stephan KE, Mechelli A, Friston KJ. Comparing dynamic causal models. Neuroimage 2004;22:1157–72.

Protzner AB, McIntosh AR. Testing effective connectivity changes with structural equation modeling: what does a bad model tell us? Hum Brain Mapp 2006;27:935–47.

Rajah MN, McIntosh AR, Grady CL. Frontotemporal interactions in face encoding and recognition. Cogn Brain Res 1999;8:259–69.

Schlosser R, Gesierich T, Kaufmann B, Vucurevic G, Hunsche S, Gawehn J, et al. Altered effective connectivity during working memory performance in schizophrenia: a study with fMRI and structural equation modeling. Neuroimage 2003;19:751–63.

Stein JL, Wiedholz LM, Bassett DS, Weinberger DR, Zink CF, Mattay VS, et al. A validated network of effective amygdala connectivity. Neuroimage 2007;36:736–45.

Tanaka JS. "How big is big enough?": sample size and goodness of fit in structural equation models with latent variables. Child Dev 1987;58:134–46.

Tononi G, McIntosh AR, Russell DP, Edelman GM. Functional clustering: identifying strongly interactive brain regions in neuroimaging data. Neuroimage 1998;7:133–49.

Wothke W. Nonpositive definite matrices in structural equation modeling. In: Bollen KA, Long JS, editors. Testing structural equation models. London: Sage Publications; 1993. p. 256–93.

Zar JH. Biostatistical analysis. 4th ed. Upper Saddle River: Prentice Hall; 1999. p. 663.

Zhuang J, LaConte S, Peltier S, Zhang K, Hu X. Connectivity exploration with structural equation modeling: an fMRI study of bimanual motor coordination. Neuroimage 2005;25:462–70.