Does existing measurement theory support the use of composite and causal indicators in information systems research?

Andrew M. Hardin Department of Management, Entrepreneurship, and Technology Lee Business School University of Nevada, Las Vegas

Jerry Cha-Jan Chang Department of Management, Entrepreneurship, and Technology Lee Business School University of Nevada, Las Vegas

Abstract

Despite lingering concerns surrounding the use of composite and causal indicators, a significant number of information systems researchers continue to employ them in their work. The authors suggest that misunderstandings about the appropriateness of implementing composite and causal indicators can be traced to the absence of measurement theory supporting their use. Recommendations on how researchers might design studies that avoid the use of these respective indicators are also provided.

Keywords: Formative measurement, causal measurement, causal indicator, linear composite, composite indicator, ANOVA, regression, linear mixed modeling, research design

ACM Categories: H.1.0, K.6

General terms: Formative and reflective measurement, causal and composite indicators, measurement theory

Introduction

Information systems researchers routinely incorporate composite indicators within models estimated using component-based partial least squares (PLS). These models continue to be deployed despite two recent MIS Quarterly articles that strongly discourage their use (Bollen 2011; Diamantopoulos 2011). Likewise, in spite of serious concerns raised by methods experts, causal indicators are frequently modeled and estimated using covariance-based SEM (Bagozzi 2011; Edwards 2010). In many cases, no theoretical reasoning is offered by researchers to justify their decision to implement these respective indicators. This absence of justification reveals the apparent lack of recognition that measurement specification is a theoretical rather than statistical issue (Bollen 2011). In this article we address some of the theoretical concerns surrounding the implementation of composite and causal indicators in measurement contexts, and, because of these concerns, provide suggestions on how researchers might design studies that avoid their use.

Given the confusion surrounding terminology associated with composite and causal indicators, it is essential to first define the terms used in the current manuscript. We agree that the term “formative” is misleading and should be retired from the literature (Bollen 2011). Instead we use the term “composite indicators” when referring to indicators specified as part of a linear composite variable. We use “causal indicators” when referring to indicators that are implemented in models in which a construct level disturbance term is specified.


We use the term “effect indicators” when referring to indicators influenced by latent variables within reflective measurement models. We also emphasize that the causal measurement models discussed in the current manuscript are conceptually distinct from the causal models described by Blalock (1963) and others. We do not endorse the term “measurement” when referring to models that employ either composite or causal indicators. In the current article we refer to these indicators in measurement terms only because they are frequently referred to as such in the literature.

What does the prior literature say?

Hardin et al. (2011a) provides a detailed review of the early causal modeling literature. Conclusions drawn from this literature review suggest a conceptualization of causal indicators as simply that: causes of other variables. The authors found little evidence to support the notion of using causal indicators for measuring unobserved variables. Hardin et al. (2011a) cites the model in Figure 1 as an illustration of Blalock’s early application of causal modeling (Blalock 1963).

Figure 1. Race as a Causal Indicator (adapted from Hardin et al. 2011a)

Race is depicted in Figure 1 as a cause of the unmeasured variable1, exposure to discrimination. The conceptual placeholder for exposure to discrimination is specified as a predictor of discontent with the world. Discontent with the world is in turn suggested to cause political liberalism. Note that although race is modeled as a cause of exposure to discrimination, both alienation score and voting behavior are modeled as being caused by discontent with the world and political liberalism, respectively.

                                                            1 We would like to thank a reviewer for pointing out that exposure to discrimination could also be considered undefined other than in terms of what it is related to in Blalock’s model.

Psychological states such as discontent with the world are expected to cause individuals to respond to alienation questions in a specific fashion. Similarly, a person’s political liberalism causes them to vote in a particular manner.

There is an important distinction between these relationships that goes beyond the direction of causality alone. Race, an observed variable, is said to cause the previously undefined, unmeasured variable exposure to discrimination. The observed variables alienation score and voting behavior are specified as being caused by the theoretically defined, latent constructs discontent with the world and political liberalism respectively. While proponents of composite indicators and causal measurement models may suggest that these relationships are conceptually similar, we submit that they are most definitely not. Specifying a theoretically defined, latent construct as a cause of an observed variable (reflective measurement) is conceptually distinct from specifying an observed variable as a composite indicator defining an unobserved concept. In the latter case, the specification is clearly not consistent with any established measurement theory.

Blalock’s research model implies that although exposure to discrimination is unmeasured, it can nonetheless be added to the research model as a conceptual placeholder. Blalock never suggests that race captures the concept of exposure to discrimination, acknowledging that other factors likely create variation in the variable. Even if one were to interpret Blalock’s model as an attempt to measure exposure to discrimination, given the use of a single composite indicator and no construct level disturbance term, exposure to discrimination would be equivalent to race. Not even proponents of composite indicators would support such a conceptualization. On the other hand, supporters of causal measurement models would most likely suggest that a disturbance term is needed to capture other unidentified factors that define exposure to discrimination. Unfortunately, the specification of a disturbance term has its own limitations, as it concedes that critical factors defining exposure to discrimination have not been accounted for, even though fully capturing the meaning of the construct is a long-standing requirement for causal measurement models (Jarvis et al. 2003).

Hardin et al. (2011a) also describes findings from a review of the socio-economic status (SES) literature. Based on this review, the authors found that early measures of SES were based on predetermined, fixed weights rather than weights that vary as a function of the variables they are used to predict (Chapin 1933).

[Figure 1 elements: Race, Alienation Score, Voting Behavior, Exposure to Discrimination, Discontent with the World, Political Liberalism]


For example, Chapin (1933) discusses how SES composites are created using indicators multiplied by specific, predetermined weights2. This method of estimation exemplifies the conceptual distinction between fixed weights and weights that are calculated based upon the nomological net within which they are estimated. Fixed weights are predetermined and remain consistent despite changes in contexts or nomological networks. This distinction addresses the issues of interpretational confounding raised in the heavily cited special issue on “formative measurement.” Specifically, Howell and his colleagues emphasize that because construct definitions rely on indicator weights, and these weights can vary as they are estimated across research models and contexts, interpretational confounding is a significant problem in studies employing causal indicators (Howell et al. 2007a; Howell et al. 2007b).

                                                            2 In Chapin’s early work SES was frequently measured by inventorying items in the home. When calculating an SES index some items, such as televisions, were weighted more heavily than other items. A standardized score sheet was used to insure that scores were comparable across households.

An IS Example of Causal Models Involving Unmeasured Variables

Providing an information systems themed illustration of Blalock’s model, consider the situation depicted in Figure 2. In this hypothetical situation, a researcher is interested in the potential relationship between advanced systems usage and performance. However, no measure of advanced systems usage is currently available. Using a similar rationale as Blalock (1963), the researcher theorizes that access to social networking technology, and the encouragement of social networking, are causes of advanced systems usage, and further, indirectly related to performance outcomes. In other words, the researcher is proposing that these stimuli operate on performance through an as of yet unmeasured concept, advanced systems usage. The path weights for the causal model could then be estimated using the statistical procedures described by Blalock (1963).

It is easy to see that this conceptualization is very different from one suggesting that access to social networking technology and the encouragement of social networking measure the advanced systems usage construct. One obvious limitation of the latter conceptualization would be the implication that the two indicators referencing social networking completely capture the advanced systems usage construct. Accounting for the use of other innovative technologies would obviously be necessary to fully capture advanced systems usage. Following Blalock’s conceptualization in Figure 1, the advanced systems usage construct instead acts simply as a conceptual placeholder indicating the researcher believes the advanced systems usage concept intervenes between the social networking related stimuli and the performance outcome variable.

Figure 2. Causal Indicators in an IS Context

Once the researcher conceptually defines the advanced systems usage construct, items can be developed and a reflective measurement model can be used to capture the variable. At that point the model could be conceptualized as one in which access to social networking technology and the encouragement of social networking, among other potential factors, are expected to predict advanced systems usage, which in turn influences downstream performance. Depending on the theoretical perspective employed, the model could alternatively be conceptualized as one in which advanced systems usage plays a mediating role between the social networking variables and performance outcomes.

So why is it important to review this early literature?

Much of the seminal research supporting the use of composite and causal indicators references the early SES and causal modeling literatures (Hardin et al. 2011a). For example, one highly influential structural equations textbook frequently mentions this early literature when discussing the implementation of causal indicators in latent variable models (Bollen 1989). Because the interpretation of the causal modeling literature appears to be somewhat less straightforward than previously suggested, continued review of this early literature may help guide the development of measurement theory in this area (Hardin et al. 2011a; Hardin et al. 2011b).

Gaining a better understanding of the early literature on causal modeling is especially critical given that the literature on composite and causal indicators abruptly transitions to an exhaustive discussion on the statistical ramifications of employing these indicators in structural equation models.
This gap in theory development leaves contemporary researchers in the undesirable position of implementing a measurement approach that has not been sufficiently developed. The lack of theory supporting the use of composite and causal indicators is in stark contrast to reflective measurement, which is buttressed by an extensive body of theory-based research. Classical test theory and item response theory are two well-established examples of psychometric theories supporting the implementation of reflective measurement. No such theory guides the use of composite and causal indicators.

So what do the experts have to say?

Given the dearth of theory underlying the use of composite and causal indicators, it should come as no surprise that even SEM experts do not agree on their implementation. For instance, while some experts continue to support the use of causal indicators (Bollen 2011; Diamantopoulos 2011), others recommend against their use, except in very rare circumstances (Bagozzi 2011; Edwards 2010). Further, even proponents of causal measurement models discourage the common practice of employing composite indicators. In a recent issue of MIS Quarterly, Bollen (2011) clearly dismisses the use of these indicators because they are estimated in models that do not include a construct level disturbance term. The equations below illustrate the distinction between linear composites and causal measurement models:

Composite Model: C = aC + γC1x1 + γC2x2 + … + γCnxn (1)

Causal Measurement Model: η = γ1x1 + γ2x2 + … + γnxn + ζ (2)

Equation (1) depicts the composite model. This model represents the highly unrealistic case in which the latent construct representing the unidimensional concept is an exact linear function of its indicators. Equation (2) represents the causal measurement model. This model includes a construct level disturbance term. Bollen describes the meaning of the disturbance term as: "The presence of the error (ζ) in equation (2) is in recognition that other variables not in the model are likely to determine the values of the latent variable. It seems unlikely that we could identify and measure all of the variables that go into the latent variable and the error collects together all of those omitted causes" (Bollen 2011, p. 8).
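To make the distinction concrete, the short sketch below (our illustration, not Bollen's) generates data according to equations (1) and (2). The intercept, weights, and disturbance variance are arbitrary values chosen only for demonstration.

import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Two observed indicators (values are simulated for illustration only).
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

# Equation (1): the composite is an exact weighted sum of its indicators;
# once the weights are fixed, C contains nothing beyond x1 and x2.
a_C, g1, g2 = 0.0, 0.6, 0.4          # illustrative intercept and weights
C = a_C + g1 * x1 + g2 * x2

# Equation (2): the causal measurement model adds a construct-level
# disturbance (zeta) standing in for all omitted causes of eta.
zeta = rng.normal(scale=0.5, size=n)
eta = g1 * x1 + g2 * x2 + zeta

print(np.var(C - (a_C + g1 * x1 + g2 * x2)))  # ~0: composite fully determined
print(np.var(eta - (g1 * x1 + g2 * x2)))      # ~0.25: variance left to zeta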

We will have more to say about the conceptual interpretation of the disturbance term in causal measurement models later in the manuscript. However, the implications of Bollen's remarks on the use of composite models should not go unnoticed by those who frequently employ such models in their work. Bollen, a staunch proponent of causal measurement models, is recommending against the use of composite models. Yet, despite expressing these concerns in a research commentary published in MIS Quarterly, Bollen's reproach has done little to stem the tide of composite indicators employed by IS researchers3. We believe there are numerous reasons for this continued trend, including the relative ease with which linear composites are employed in statistical packages, and a tendency of those who support their use to encourage other authors to incorporate them in their studies. Although we find these atheoretical justifications disheartening, identifying their origins is beyond the scope of the current essay.

So how can we develop theory in this area?

Developing theory that supports the use of composite and causal indicators is a daunting task. As noted above, the early literature on causal modeling provides very little useful guidance given that it does not appear to discuss the use of causal indicators in a measurement context. Further, psychometric theory has historically treated new construct development as a process involving constructs measured by effect indicators. For example, theory supporting reflective measurement suggests that because latent variables are unobservable, researchers must attempt to measure them by looking at evidence of their effects. The SEM literature explains how accounting for the error associated with observed variables helps to improve the precision of reflective measurement (Bollen 1989). Ironically, this latter benefit is eliminated in models implementing composite and causal indicators. Neither linear composites nor causal measurement models account for measurement error at the indicator level. Linear composites represent the even more unrealistic scenario in which the focal construct is perfectly represented by error free indicators. How these respective models treat measurement error is a critical issue that must be addressed before theory can be developed in this area.
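The consequence of ignoring indicator-level error can be illustrated with a small simulation of our own construction (variable names and effect sizes are hypothetical): when error-laden indicators are combined into a composite that is treated as error free, the estimated relationship with an outcome is attenuated relative to the true latent relationship.

import numpy as np

rng = np.random.default_rng(1)
n = 5000

# True latent variable and an outcome it influences.
xi = rng.normal(size=n)
y = 0.8 * xi + rng.normal(scale=0.6, size=n)

# Three indicators observed with indicator-level measurement error.
X = np.column_stack([xi + rng.normal(scale=0.7, size=n) for _ in range(3)])

# A unit-weighted composite treats the indicators as if they were error free.
composite = X.mean(axis=1)

def std_slope(x, y):
    # Slope of y on a standardized predictor (simple OLS).
    x = (x - x.mean()) / x.std()
    return np.cov(x, y)[0, 1] / np.var(x)

print(std_slope(xi, y))         # close to the true effect of the latent variable
print(std_slope(composite, y))  # attenuated because indicator error is ignored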

                                                            3 While we might not have expected to see a change in the published literature yet, our experience as editors and reviewers reveals that IS studies continue to employ composite indicators.

The DATA BASE for Advances in Information Systems 59 Volume 44, Number 4, November 2013

  

We expand our discussion of how these models treat error in the following section.

Linear composites and causal measurement models

As described above, there are two models frequently discussed in the literature. One is modeled without a disturbance term, and is considered to be a linear composite rather than a latent variable (Bollen 2011). Such models are conceptually similar to the fixed weight composites designed to measure factors such as socio-economic status (SES). Neither of these composites includes an error term at the indicator or construct level, and to maintain their conceptual definitions across contexts, these respective composites should be formed by summing the observed scores of each indicator multiplied by theoretically defined, predetermined weights (Hardin et al. 2011a). While a psychometric perspective is not currently available to inform researchers on how these weights should be determined, some studies have at least offered a conceptually palatable alternative. For example, Hardin et al. (2011a) proposes that weights should be derived by examining prior theory, or through meta-analyses based on studies developed for purely predictive purposes.4 In the case of the former, existing theory could be used to identify the respective magnitudes of the weights associated with the indicators contributing to the composite. Chapin's early work on SES that proposed weighting some possessions more heavily than others serves as a particularly relevant example of how theory might inform the magnitudes of indicator weights. To be generalizable to other settings, however, these weights must be consistently applied. If the weights are allowed to vary as a result of changes to model specifications or a change in research contexts, models employing composite indicators will suffer from the problem of unstable definitions and interpretational confounding (Howell et al. 2007a; Howell et al. 2007b; Wilcox et al. 2008).
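As a rough sketch of what a fixed-weight composite might look like in practice, the following code applies the same predetermined weights in two hypothetical samples; the weights themselves are invented for illustration and would, following Hardin et al. (2011a), come from prior theory or meta-analysis.

import numpy as np

rng = np.random.default_rng(2)

# Predetermined, theoretically derived weights (illustrative values only).
FIXED_WEIGHTS = np.array([0.5, 0.3, 0.2])

def fixed_weight_composite(indicators):
    # Sum the observed indicator scores using the same predetermined weights
    # in every study, so the composite's conceptual definition does not shift.
    return indicators @ FIXED_WEIGHTS

# Two hypothetical samples drawn from different research contexts.
sample_a = rng.normal(size=(200, 3))
sample_b = rng.normal(loc=1.0, size=(150, 3))

# The definition of the composite is identical in both contexts; only the
# observed indicator values differ.
ses_a = fixed_weight_composite(sample_a)
ses_b = fixed_weight_composite(sample_b)
print(ses_a.mean(), ses_b.mean())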

The second model includes a construct level disturbance term. The disturbance term is suggested to represent all of the omitted factors affecting the latent construct. We agree with Bollen that it is extremely unlikely researchers can account for all relevant aspects of a construct when defining a new causal measurement model.

                                                            4 Predictive studies are defined as studies where the researcher’s goal is to provide specific entities with information about a model estimated in a particular context. One example is attempting to identify the factors that predict performance in a single organization.

This would be analogous to specifying a research model in which 100% of the variance in the dependent variable is explained. However, it is extremely important to distinguish between the ability of the disturbance term to statistically account for the unexplained variance in the model, and the incomplete definition that results when conceptually relevant factors are omitted from the model. While the disturbance term satisfies the statistical requirement of capturing the unexplained variance in the model, it does not satisfy the conceptual need to account for all relevant factors of the construct. Without the inclusion of all the relevant indicators, a causal measurement model can never be comprehensively defined (Diamantopoulos et al. 2001).

Not only is the size of the disturbance term inversely related to the completeness of the construct definition, but the researcher also has very little information about what is being captured by the term. In other words, what part of the construct definition is missing? Further, because the disturbance term can change depending on the research model within which it is estimated, the magnitude of the disturbance term, and therefore the incompleteness of the definition, is model dependent. This means that what is missing (i.e., what is being captured by the disturbance term) may be different across nomological networks and research settings. The existence of a disturbance term therefore contributes to the instability of the causal measurement model across studies. A fully defined causal measurement model would be one where the value of the disturbance term is zero (Diamantopoulos 2006). However, even in this improbable scenario, the magnitude of the disturbance term will almost certainly change when the causal measurement model is estimated in a different context (Hardin et al. 2008a; Hardin et al. 2008b).

Following Bollen (2011), model fit can be used to compare causal measurement model specifications. We agree that this is an appropriate tool for determining which model best fits the data; however, comparison of model fit provides little useful information for establishing the conceptual definitions of constructs. In other words, statistically speaking one model may fit better than another, but it may also be the case that the best fitting model is conceptually deficient. This point gets at the heart of the disagreement between Howell et al. (2007) and Bollen (2007) on interpretational confounding. Bollen contends that it is misspecification that leads to interpretational confounding, while Howell et al. maintains that interpretational confounding is a conceptual issue. We agree with Howell et al. on this point; even well-fitting models can differ conceptually when they are estimated across different contexts (Kim et al. 2010).


Causal indicators are, by definition, error free (Diamantopoulos 2006). However, measurement theory stresses the improbability of error free measures. Even objective items such as income are not error free because they are frequently based on subject recall (Edwards 2010). Even if correctly recalled, respondents can still make mistakes when writing down an amount. Blalock's example of race as a causal indicator is not exempt from such criticism. Respondents can easily mark the wrong box, not locate an appropriate category, or, given contemporary privacy concerns, simply refuse to answer the question. Given the unlikelihood of error free objective measures, it seems even more improbable that perceptual items measured using Likert-type scales could ever be considered appropriate as causal indicators. Further, in models employing effect indicators, measurement error for each indicator is used to help determine whether or not an indicator should be retained. No such benefit is provided when applying causal indicators.

Another conceptual issue related to measurement error arises during estimation. To achieve model identification, causal measurement models commonly require the specification of both cause and effect indicators. In many cases, indicators are perceptual and measured using similar Likert-type scales within a single survey. Causal indicators are then statistically modeled without error and effect indicators with error. It is obviously illogical and inconsistent to suggest that data collected using the same types of survey questions and scales are expected to contain error when modeled as reflective, yet to be error free when modeled as causal. It is not reasonable to expect survey respondents who are not privy to the measurement model specification to answer some questions with absolute certainty. Proponents of causal measurement models who suggest specifying existing effect indicators as causal indicators, especially after data has been collected, are arguing for the acceptance of this absurd assumption. Without the ability to guarantee error free responses, developing rigorous causal measurement models seems unlikely.

An equality or inequality in arguments?

Although there is significant disagreement on this topic, there is also significant agreement. To our knowledge there is no dispute that measurement model specification should be theory driven. There also appears to be no disagreement that causal measurement models are defined by their indicators, and as a result, that comprehensiveness of indicator coverage is critical. There is no disagreement that causal indicators should generally be uncorrelated, or that they are modeled as error free.

Researchers agree that causal measurement models need at least two reflective indicators or endogenous variables to be statistically identified. The contention that SEM-estimated indicator weights are dependent on endogenous constructs is also well documented (Hardin et al. 2008a; Hardin et al. 2008b; Hardin et al. 2011a; Howell et al. 2007a; Howell et al. 2007b; Wilcox et al. 2008).

Despite this agreement however, significant disagreement remains. For example, the impact of measurement misspecification on structural parameter bias remains a contentious issue. Jarvis et al. (2003), one of the most frequently cited articles on causal measurement models, proposed decision rules for determining whether existing constructs should be measured with cause or effect indicators. These rules were applied to published research to show that measurement misspecification was widespread in the marketing literature. While the article was reportedly focused on the development of conceptual criteria to guide measurement specification, a careful review reveals that the majority of the arguments are statistical rather than conceptual. As a result, the article is focused mainly on establishing the statistical impact of measurement misspecification on structural parameters, and on achieving model identification.

Within the IS literature, one heavily cited article used the decision rules put forth by Jarvis et al. to argue that IS research suffers from a similar level of construct misspecification (Petter et al. 2007). Like Jarvis et al. (2003), the article claims that construct misspecification leads to serious structural parameter bias. Interestingly, the severity of the structural parameter bias alleged by these two articles has since been disputed (Aguirre-Urreta et al. 2012). Therefore, at a minimum, this issue appears to be unresolved.

Beyond these measurement misspecification concerns, whether or not linear composites can be validated also appears to be unsettled. For example, Petter et al. (2007) proposes a series of steps for validating linear composites and causal measurement models. However, a recent MIS Quarterly article by Bollen (2011) suggests that because of the absence of a construct level disturbance term, linear composites cannot be validated. Because of these inconsistencies, it remains unclear what impact construct misspecification has on structural parameters, or what measurement validation represents in these contexts. We respectfully suggest that if constructs cannot be validated, researchers cannot adequately determine their meaning, and arguments surrounding the impact of construct misspecification on structural parameter bias are moot.


So what should we do about it?

Given the obdurate positions of the respective protagonists engaged in this debate, resolving it any time soon seems unlikely. Therefore, rather than adamantly recommending for or against the use of causal measurement models, we provide a few straightforward recommendations on how researchers might avoid this conflict until it is ultimately resolved. The methods discussed below are merely examples and are in no way meant to be all-inclusive. We encourage readers to consider these and other alternative methods when designing their future studies.

Table 1 provides a list of potential alternatives to research models that use causal or composite indicators. The first alternative involves specifying models in a manner similar to path analysis. Using this approach, researchers simply specify the respective observables as individual predictors in the model. This method avoids the practice of combining indicators to form unmeasured variables and therefore the concerns associated with causal measurement model specification. Weaknesses include the inability to account for measurement error and the potential to increase model complexity.
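A minimal sketch of this first alternative, assuming a simulated dataset and hypothetical variable names, simply enters each observable as its own predictor rather than bundling the indicators into an unmeasured composite:

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 300

# Hypothetical observed variables that might otherwise be combined into a composite.
df = pd.DataFrame({
    "income": rng.normal(50, 10, n),
    "education": rng.normal(16, 2, n),
    "prestige": rng.normal(40, 8, n),
})
df["outcome"] = (0.02 * df["income"] + 0.10 * df["education"]
                 + 0.01 * df["prestige"] + rng.normal(0, 1, n))

# Each observable enters the model as an individual predictor; no unmeasured
# variable is formed from the indicators.
X = sm.add_constant(df[["income", "education", "prestige"]])
model = sm.OLS(df["outcome"], X).fit()
print(model.params)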

The second alternative proposes the use of single item reflective constructs in place of causal indicators. Edwards (2010) reports on the use of "facet constructs," defined as either single item or multiple item reflective constructs that cause an outcome. We agree with this specification provided that the causes are not conceptualized as measuring the outcome variable. If the construct is specified in the latter fashion, it is conceptually nothing more than a second order formative construct with first order reflective measures, and we would recommend against its use.

The third alternative is based upon the recommendations of Hardin et al. (2011a). Predetermined weights are used to create fixed-weight composites that maintain their conceptual definitions across research models and contexts. Such an approach avoids the problems associated with interpretational confounding because the indicator weights are not allowed to vary as a function of the model within which they are estimated. The weakness of this approach lies in deriving the predetermined weights. Hardin et al. (2011a) suggests relying on prior theory and meta-analysis methods to determine the most appropriate weights.

Table 2 describes techniques that avoid the use of latent variables. As IS editors and reviewers, we have found that a large number of quantitative studies use either covariance- or component-based SEM.

While it is conceivable that the nature of IS research dictates the frequent use of SEM and/or PLS, we frequently encounter studies where the use of these methods seems forced, and alternative designs would have been either preferred or at least equally effective for addressing the research question.

The first approach in Table 2 discusses the use of traditional statistical techniques such as regression and ANOVA, or newer methods such as linear mixed modeling (LMM) or hierarchical linear modeling (HLM). The main strength of these methods is that they permit researchers to avoid latent variable measurement concerns. They are also frequently used in studies published in highly respected journals outside of the IS discipline. The weakness of these methods is that they often rely on the use of summed composites, which makes the conceptual assumption that indicators should be equally weighted.
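For readers unfamiliar with these techniques, the sketch below fits a linear mixed model with a random intercept using statsmodels; the data, variable names, and effect sizes are simulated purely for illustration.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n_groups, n_per_group = 30, 8
groups = np.repeat(np.arange(n_groups), n_per_group)

# Hypothetical hierarchical data: individuals nested within teams.
team_effect = rng.normal(0, 0.5, n_groups)[groups]
support = rng.normal(size=groups.size)
performance = 0.4 * support + team_effect + rng.normal(0, 1, groups.size)
df = pd.DataFrame({"team": groups, "support": support, "performance": performance})

# Linear mixed model with a random intercept per team; no latent variables
# or measurement model are required.
lmm = smf.mixedlm("performance ~ support", df, groups=df["team"]).fit()
print(lmm.summary())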

The second approach describes a situation in which the researcher designs the study to avoid using measurement models. Experimental treatments are the independent variables predicting objective dependent variables. The main weakness of this approach is that it may require the researcher to alter the research design. However, changing the design can also be positive. Using statistical techniques such as LMM and HLM to estimate longitudinal and hierarchical models can sometimes lead to a deeper understanding of the phenomena being investigated. Looney and Hardin employ this type of design in their studies on financial decision support systems (Hardin and Looney 2012; Looney and Hardin 2009).
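A design of this kind might be analyzed as in the following sketch, where a manipulated treatment predicts an objective dependent variable via one-way ANOVA; the factor levels and effect sizes are hypothetical.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(5)

# Hypothetical experiment: a manipulated treatment predicts an objective outcome.
df = pd.DataFrame({"treatment": np.repeat(["control", "feedback", "simulation"], 40)})
effects = {"control": 0.0, "feedback": 0.3, "simulation": 0.6}
df["allocation"] = df["treatment"].map(effects).astype(float) + rng.normal(0, 1, len(df))

# One-way ANOVA on the manipulated factor; the independent variable is
# assigned rather than measured, so no measurement model is involved.
fit = smf.ols("allocation ~ C(treatment)", data=df).fit()
print(anova_lm(fit, typ=2))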

Conclusion

This essay endeavored to shed light on the lack of theory underlying the use of composite and causal indicators in the information systems literature. Consistent with Bollen (2011), we proposed that the term "formative measurement" is misleading and should be retired from the literature. We used the term composite indicator when referring to indicators that are specified as part of a linear composite variable, and causal indicator when referring to indicators that are part of a causal measurement model in which a construct level disturbance term is specified.

We also discouraged describing models that employ composite and causal indicators in measurement terms.

We reexamined the conclusions drawn from the literature review conducted by Hardin et al. (2011a) and expanded upon their findings. Specifically, we pointed out that there is an important distinction between the cause and effect relationships described in Blalock’s early causal models.


Table 1: Alternative approaches to employing causal or composite indicators

Alternative Approach 1: Use observed variables as individual predictors in the research model.
Strengths: Eliminates problems associated with defining an unobserved variable based upon composite or causal indicators.
Weaknesses: Can increase model complexity. Does not account for measurement error associated with observables.
Comments: Factors such as income and education can be specified as individual predictors in research models.

Alternative Approach 2: Create single item reflective constructs using the observed variables.
Strengths: Specifies the construct consistent with reflective measurement. Can account for item level measurement error.
Weaknesses: Loadings and unique variances cannot be estimated. Reliability must be estimated.
Comments: Factors such as income and education can be specified as single item reflective constructs in research models.

Alternative Approach 3: Create fixed-weight composites using predetermined weights.
Strengths: Maintains the consistency of the conceptual definition across research contexts and nomological networks.
Weaknesses: Difficult to ascertain the fixed weights.
Comments: Researchers define fixed-weight composites based upon a predetermined weighting scheme. Meta-analyses or prior theory may be used to determine the most theoretically relevant weights.

Table 2: Alternative approaches to employing models that use latent variables

Alternative Approach 1: Redesign studies to take advantage of techniques such as regression or ANOVA, or newer techniques such as linear mixed modeling (LMM) or hierarchical linear modeling (HLM).
Strengths: Avoids problems related to the use of latent variables. LMM and HLM allow for analyses of fixed and random effects, and time-series and multi-level data.
Weaknesses: Does not account for measurement error. Not a direct replacement for SEM models, and therefore influences the research design.
Comments: Researchers can design studies so that multiple regression or ANOVA, or newer techniques such as linear mixed modeling (LMM) or hierarchical linear modeling (HLM), are appropriate.

Alternative Approach 2: Avoid the use of research designs that require measuring variables.
Strengths: Avoids problems related to the use of latent variables.
Weaknesses: Only applies to situations where conditions can be manipulated.
Comments: Researchers can employ designs that use the respective treatments as the independent variables and objective variables as the dependent variable.

We also reiterated that early work on SES was based on the use of fixed weights rather than weights that vary as a function of the variables they are used to predict. We clarified that although a construct level disturbance term serves the statistical purpose of accounting for unexplained variance, it cannot resolve the incomplete conceptual definition that results from missing causal indicators. We called into question the logic of using surveys and estimating models that include both measures that contain error and ones that are error free. The use of causal indicators was suggested to preclude the ability to account for measurement error, one of the primary benefits of using latent variables.

We used this discussion to suggest that the current implementation of composite and causal indicators is not supported by established measurement theory, and advocated that developing measurement theory to support their use is a necessary first step.

We also emphasized that although there is a significant amount of disagreement on this topic, there is also significant agreement. For example, both sides of this discussion agree that the development of measurement models should be theory driven. Causal measurement models are defined by their indicators, and comprehensiveness of indicator coverage is crucial. Causal indicators are modeled as error free. At least two reflective indicators or endogenous variables are necessary to identify a causal measurement model. Causal indicator weights are dependent on endogenous constructs.


We discussed how one of the original papers on this topic, Jarvis et al. (2003), developed a set of guidelines for determining measurement specifications that are mostly unsubstantiated by psychometric theory. Despite this lack of theoretical grounding, a heavily cited article published in the IS literature relied on these guidelines. Similar to Jarvis et al. (2003), this article also suggested that measurement misspecification leads to severe structural parameter bias. The conclusions drawn in these articles were later questioned by other researchers. We also noted that, despite suggestions to the contrary, significant disagreement on the validation of linear composites remains.

Finally, we presented some alternatives to composite and causal measurement models. Strengths and weaknesses for each alternative were discussed. Weaknesses included increased model complexity, and the inability to account for measurement error. Strengths included the ability to avoid the criticisms levied on composite and causal measurement models, and potentially gaining a richer understanding of the data through the use of alternative research designs.

We are optimistic that this article will lead to further consideration of the literature on which composite and causal measurement models are based. We hope that IS researchers will contemplate the use of alternative approaches in their future studies until theory can be developed to guide the implementation of these respective models.

References

Aguirre-Urreta, M. I., and Marakas, G. M. "Revisiting bias due to construct misspecification: Different results from considering coefficients in standardized form," MIS Quarterly (36:1) 2012, pp 123-138.

Bagozzi, R. P. "Measurement and Meaning in Information Systems and Organizational Research: Methodological and Philosophical Foundations," MIS Quarterly (35:2) 2011, pp 261-292.

Blalock, H. M. "Making causal inferences for unmeasured variables from correlations among indicators," American Journal of Sociology (69:1) 1963, pp 53-62.

Bollen, K. Structural Equations with Latent Variables John Wiley and Sons, New York, 1989.

Bollen, K. "Evaluating effect, composite, and causal indicators in structural equation models," MIS Quarterly (35:2) 2011, pp 359-372.

Chapin, S. F. The measurement of social status: By the use of the social status scale The University of Minnesota Press, Minnesota, 1933.

Diamantopoulos, A. "The error term in formative measurement models: Interpretation and modeling implications," Journal of Modeling in Management (1:1) 2006, pp 7-17.

Diamantopoulos, A. "Incorporating Formative Measures into Covariance-Based Structural Equation Models," MIS Quarterly (35:2) 2011, pp 335-358.

Diamantopoulos, A., and Winklhofer, H. M. "Index construction with formative indicators," Journal of Marketing Research (38) 2001, pp 269-277.

Edwards, J. R. "The fallacy of formative measurement," Organizational Research Methods OnlineFirst (14:2) 2010, pp 370-388.

Hardin, A., Chang, J., and Fuller, M. "Clarifying the use of formative measurement in the IS discipline," Journal of the Association for Information Systems (9:1) 2008a, pp 544-546.

Hardin, A., Chang, J., and Fuller, M. "Formative versus reflective measurement: Comment on Marakas, Johnson, and Clay (2007)," Journal of the Association for Information Systems (9:9) 2008b, pp 519-534.

Hardin, A., Chang, J., Fuller, M., and Torkzadeh, G. "Formative measurement and academic research: In search of measurement theory," Educational and Psychological Measurement (71:2) 2011a, pp 281-305.

Hardin, A., and Looney, C. "Myopic Loss Aversion: Demystifying the Key Factors Influencing Decision Problem Framing," Organizational Behavior and Human Decision Processes (11) 2012, pp 311-331.

Hardin, A., and Marcoulides, G. A. "A commentary on formative measurement," Educational and Psychological Measurement (71:5) 2011b, pp 753-764.

Howell, R. D., Breivik, E., and Wilcox, J. B. "Is formative measurement really measurement? Reply to Bollen (2007) and Bagozzi (2007)," Psychological Methods (12:2) 2007a, pp 238-245.

Howell, R. D., Breivik, E., and Wilcox, J. B. "Reconsidering formative measurement," Psychological Methods (12:2) 2007b, pp 205-218.

Jarvis, C. B., MacKenzie, S. B., and Podsakoff, P. "A critical review of construct indicators and measurement model mis-specification in marketing and consumer research," Journal of Consumer Research (30) 2003, pp 199-218.

Kim, G., Shin, B., and Grover, V. "Investigating two contradictory views of formative measurement in information systems research," MIS Quarterly (34:2) 2010, pp 1-xxx.

Looney, C., and Hardin, A. "Decision Support for Retirement Portfolio Management: Overcoming Myopic Loss Aversion via Technology Design," Management Science (55:10) 2009, pp 1688-1703.

Petter, S., Straub, D., and Rai, A. "Specifying formative constructs in information systems research," MIS Quarterly (31:4) 2007, pp 623-656.

Wilcox, J. B., Howell, R. D., and Breivik, E. "Questions about formative measurement," Journal of Business Research (61:12) 2008, pp 1219-1228.

About the Authors

Andrew Hardin is the Director of the Center for Entrepreneurship and an Associate Professor in the Lee Business School at the University of Nevada, Las Vegas. Professor Hardin's research is focused on organizational collaboration and virtual work, financial decision support systems, and research methodologies. His work has been published in journals such as Management Science, MIS Quarterly, Organizational Behavior and Human Decision Processes, Journal of Management Information Systems, European Journal of Information Systems, Journal of the Association for Information Systems, The DATA BASE for Advances in Information Systems, Group Decision and Negotiation, Small Group Research, and Educational and Psychological Measurement. Hardin currently serves as Senior Editor for the Information Systems Journal and The DATA BASE for Advances in Information Systems, Senior Associate Editor for the European Journal of Information Systems, and Guest Associate Editor for MIS Quarterly.

Jerry C. J. Chang is an Associate Professor of MIS in the Lee Business School, University of Nevada, Las Vegas. He received his Ph.D. in MIS from the University of Pittsburgh. His research interests include instrument development, IS performance measurement, self-efficacy, software piracy, use and impact of IT, culture aspects of IS, and management of IS. His work has appeared in MIS Quarterly, Journal of Management Information Systems, European Journal of Information Systems, Journal of the Association for Information Systems, Educational and Psychological Measurement, Information & Management, Decision Support Systems, The DATA BASE for Advances in Information Systems, Communications of the ACM, Journal of Computer Information Systems, and International Journal of Information Technology Project Management.
