
FORMATIVE VARIABLES ARE UNREAL VARIABLES:

WHY THE FORMATIVE MIMIC MODEL IS INVALID

John W. Cadogan, Nick Lee and Laura Chamberlain

AMS Review, Vol. 3 (1), 38-49

ISSN: 1869-814X

Abstract

In this rejoinder we provide a response to the three commentaries written by Diamantopoulos,

Howell, and Rigdon (all this issue) on our paper The MIMIC Model and Formative Variables:

Problems and Solutions (also this issue). We contrast the approach taken in the latter paper

(where we focus on clarifying the assumptions required to reject the formative MIMIC model)

by spending time discussing what assumptions would be necessary to accept the use of the

formative MIMIC model as a viable approach. Importantly, we clarify the implications of entity

realism, and show how it is entirely logical that some theoretical constructs can be considered to

have real existence independent of their indicators, and some cannot. We show how the

formative model only logically holds when considering these ‘unreal’ entities. In doing so, we

provide important counter-arguments to many of the criticisms made in Diamantopoulos’

commentary, and the distinction also helps clarify a number of issues in the commentaries of

Howell and Rigdon (both of which in general agree with our original paper). We draw together

these various threads to provide a set of conceptual tools researchers can use when thinking

about the entities in their theoretical models.

Keywords: Formative Variables; Measurement; Composites; Indicators; Theory; Causality;

Ontology; Philosophy

We are grateful for being given the chance to add a rejoinder to the comments that have been

made about our original contribution to the AMS Review. Our paper, The MIMIC Model and

Formative Variables: Problems and Solutions, which the commentators refer to as LCC, appears

to have split the commentators into different camps. First, Diamantopoulos has sought to find

ways of refuting our claims that the formative MIMIC model tells us little about formative

variables. Reflecting on his work, and the conclusions one must inevitably draw if one accepts

his arguments, his stance can be summed up in the following way:

It is entirely possible for a singular entity, with singular conceptual content, to also be

multifaceted in conceptual content. Likewise, it is possible for a grouping of conceptually

different entities, that is, a grouping of multiple entities that potentially have conceptually

orthogonal meanings, to also have singular, equivalent, conceptual content. In other words,

there is no such thing as either unidimensionality or multidimensionality of variables:

whether an entity is unidimensional or multidimensional is in the hands of the individual

researcher, such that if a researcher wishes to do so, she can decide that a variable is both

unidimensional and multidimensional at the same time. As such, the MIMIC model is a

usable tool for modeling formative variables.

On this reading, Diamantopoulos’ stance appears to be illogical and contradictory. Of course, we

(and we suspect many others) cannot agree with such a view. However, while this view might

appear to be an extreme caricature, drawn by us to illustrate a point, in this rejoinder we will

demonstrate that it is the unavoidable conclusion one comes to if one follows Diamantopoulos’

arguments and concurs with his reasoning that (a) the formative MIMIC model can provide

information on formative measures, and beyond that (b) that a latent variable, η, can be measured

using both formative indicators and reflective indicators, simultaneously. Part of the error in

Diamantopoulos’ comment on LCC seems to be a failure to appreciate the subtlety of the

differences in the ontologies underpinning the formative and the reflective variable models. We

expand on this issue in some depth below, and in particular the position – entrenched in

Diamantopoulos’ comment – that the MIMIC model’s interpretation is a purely conceptual

decision; that the researcher can choose whether the MIMIC model conceptually represents a

formative model, a reflective model, or even both simultaneously.

Howell’s comment on LCC is the polar opposite of Diamantopoulos’. In fact, Howell

provides additional evidence to support our claims that the MIMIC model says nothing about

formative variables. In so doing he fatally undermines Diamantopoulos’ arguments. For instance,

he makes reference to work by Treiblmaier, Bentler, and Mair (2011), who demonstrate that

when reflective items are added to a formative variable (to create a MIMIC model), the meaning

of η, the focal latent variable in the MIMIC model, is not derived from the presumed formative

indicators. Instead, Howell points out that, in line with LCC’s arguments, Treiblmaier, Bentler,

and Mair’s (2011) use of covariance algebra and path tracing rules shows that η in a MIMIC

model is just a common factor explaining the covariance among the MIMIC model’s reflective

items.
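The algebra behind this point can be written out in standard SEM notation (our sketch of the general logic, not Treiblmaier, Bentler, and Mair’s (2011) exact derivation):

```latex
% MIMIC model: eta is regressed on the x's and measured by the y's
\eta = \sum_{i=1}^{n} \gamma_i x_i + \zeta, \qquad y_j = \lambda_j \eta + \epsilon_j

% Path tracing (assuming mutually uncorrelated errors): the covariance
% between any two reflective items depends only on their loadings and
% the variance of eta
\mathrm{Cov}(y_j, y_k) = \lambda_j \lambda_k \, \mathrm{Var}(\eta), \quad j \neq k
```

On this reading, η is identified by what the ys share, regardless of whether the xs are labeled ‘formative indicators’ or ordinary exogenous predictors.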

Rigdon’s position is also at odds with Diamantopoulos’: he argues that the MIMIC model

cannot simultaneously represent reflective and formative measurement. Rigdon’s ultimate goal,

however, is to move the debate into different territory, and so he spends relatively little time

reflecting on the specific issues presented in LCC. Rigdon instead considers a number of other

important questions, tangential to those covered in LCC. Importantly, Rigdon argues that

researchers are working at three levels of abstraction: at the most abstract level (and the most

important from a theory development perspective), are theoretical concepts; at the least abstract

are observed behaviors used to make inferences about the theoretical concepts. At the

intermediate level of abstraction, researchers develop representations of the theoretical concepts

with the observed variables, using common factors, weighted composites, or other approaches to

generate factor (e.g., using factor analysis) or composite (e.g., using partial least squares) proxy

variables. Using this view of conceptual variables as either highly abstract, observed, or proxy

representations, Rigdon discusses issues raised in LCC, such as fixing weights, as well as issues

beyond LCC, such as the pro-factor bias in the psychological measurement literature.

REFLECTING ON WHERE THE COMMENTARIES LEAVE US

Where do the commentaries leave the field? In our opinion, despite Howell’s strong endorsement of our position and Rigdon’s confirmation that our reasoning is sound, the situation remains rather unsatisfactory. Readers of Diamantopoulos’ commentary,

for instance, may wonder why anyone would question the utility of the MIMIC model. If

Diamantopoulos is correct, then LCC are trying – as he suggests – to ‘kill’ the formative

measurement model. As such, they must be stopped, because by undermining the ability of the

MIMIC model to handle formative variables, LCC are inhibiting scientific progress.

Therefore, it is somewhat understating the case to say that LCC’s arguments did not

convince Diamantopoulos that MIMIC models are inappropriate for modeling formative

variables. As a result, there is a chance that some readers will be swayed by Diamantopoulos’

rejection of LCC’s claims. Indeed, perhaps Diamantopoulos is right, and LCC, in challenging a

well-established methodological tool, have joined Howell (2013), Rigdon (2013), Borsboom

(e.g., Borsboom 2005), and the like, forming a body of – his word – ‘misguided’ academics who

cannot see some obvious truth, who are failing to grasp something fundamental about “what

things are”, and “how we measure them”.

Perhaps one of the problems with LCC’s approach is that, in focusing on why MIMIC models do not make sense for formative variables, it did not make explicit the assumptions one

would need to make if one believes that the formative MIMIC model is a viable method of

empirically representing formative variables. It may be that additional light can be shed on the

issue of whether formative MIMIC models are valid, by examining the basic precepts one needs

to accept in order to allow formative MIMIC models to remain in our panoply of acceptable

methodological approaches. Diamantopoulos, for instance, criticizes LCC’s logic using arguments that rely on certain assumptions about the nature of reality: if those assumptions are valid, then the arguments they prop up stand a greater chance of dismantling LCC’s

thesis. However, should it be the case that the assumptions propping up the formative MIMIC

model are untenable, then critiques of LCC’s logic that rely on those assumptions must be

dismissed.

Accordingly, we use this rejoinder to provide an overview of the assumptions that

implicitly or explicitly are used when defending the formative MIMIC model, or when attacking

the logic of those who question its validity. In this respect, the following assumptions appear to

be critical to the formative MIMIC model agenda:

Table 1 about here

Figure 1 about here

We identified these assumptions and implications by referring to Diamantopoulos’ (2013)

commentary on LCC. However, there may be additional issues and assumptions that are not

covered here that appear in Diamantopoulos’ writings, and in the writings of others who discuss

the formative model. Even so, it seems the issues above form an intersecting set of beliefs that

are used to (a) justify the formative MIMIC model’s capability to model formative variables, and

(b) defend the model from attack by the likes of LCC.

Do the assumptions hold up to scrutiny? We suggest not. Our reasoning depends on

understanding what the entity realism ontology implies, and so we now discuss the nature of real

entities.

TESTING THE ASSUMPTIONS UNDERPINNING THE FORMATIVE MIMIC

MODEL

Real and unreal variables

The most basic assumption underpinning the arguments of those advocating the MIMIC

model as a way of representing formative variables is that, for a formative variable, the focal

latent variable η is a separate entity from its formative indicators. For instance, Diamantopoulos

(2013) is explicit on this front, stating that “[o]ne can very well defend a claim that, in a

formative measurement model, the latent variable is a separate entity from its indicators”. If the

logic of this assumption is undone, such that it is found that η is indistinguishable from its

formative indicators, then, as we demonstrate later, the remaining assumptions in Table 1 are in

turn invalidated.

One way to approach the question of whether a formed η is the same thing as its

indicators, or whether it is somehow made up of different conceptual ‘stuff’ from the indicators

that form it, is to consider the issue of entity reality. At first, it seems strange even to

contemplate a discussion about real things, since to do so implies that there are unreal things. We

are scientists – at least, we aspire to adopt scientific principles and approaches – so why would

we contemplate talking about unreal things? Much depends on what is meant by real.

From an entity realist perspective, a real entity is simply an entity that is assumed to

actually exist, independent of measurement or examination, and which should be taken literally

(Borsboom, Mellenbergh, and van Heerden 2003). One can view real entities as being

fundamental, or unidimensional, since they are singular in conceptual meaning, and cannot be

broken into smaller, more fundamental conceptual entities. The converse are variables that one

might term unreal. However, the use of the term unreal is rather pejorative, and so while we do

not abandon the word unreal, we qualify its use by explicitly stating that the term unreal is used

as a way of identifying a variable that does not conform to the definition of a real entity. What,

then, is an unreal variable? Borsboom et al. (2003, p. 207) describe an entity of this kind as being

“a fiction, constructed by the human mind”, an operational or “numerical trick … a (possibly

weighted) sumscore and nothing more”, which does not have or require existence independent of

measurement.

Clearly, from this perspective, an unreal variable could produce numerical magnitudes as

scores, and those scores are real in the sense that they exist on paper, in our minds, or in a

database somewhere. However, the variable that is formed cannot represent a real entity with

genuine existential properties that transcend its indicators / components (see MacCorquodale and

Meehl 1948): it is just a mathematical structure applied to a set of more basic entities. There is

no fundamental, singular, real entity that the unreal variable equates to, or represents, or maps to

at a conceptual level: unreal variables are just what their mathematical structures imply, “merely

names attached to certain convenient groupings of terms” (MacCorquodale and Meehl 1948, p.

99).

Take, for example, an individual’s socio-economic status (SES). The indicators of an

individual’s SES can be defined as comprising education, income, and occupation, and it is

probably not too controversial to suppose that education, income and occupation are real entities

in their own right. However, if we define SES as being a function of education, income, and

occupation, does it make sense to consider SES a real entity, or rather just some summary of the

three defining indicators? In other words, is there a convincing existence proposition for SES

that goes beyond the existence of the defining components?

Simple conceptual thinking will show that it is not possible to provide a convincing

existence proposition for SES, without changing its definition in some way. Without a

convincing existence proposition, SES is simply the combination of the three items. Of course,

one could certainly conceptualize something akin to ‘perception of SES’, which could be the

“subjective evaluations [of others] that confer status [on an individual]”, as Blalock (1975, p. 365) indeed does. But the latter (real) entity is fundamentally different from the simple combination

of characteristics (unreal) that characterizes the typical SES definition (Borsboom et al. 2003). A

look at the body of existing marketing and business research will turn up many theoretical

entities which might appear in light of the present discussion to be more amenable to an unreal

definition. In LCC we also gave examples, including our advertising expenditure and coercive

power examples (of which more later).

Are formative variables unreal? Reassessing the meaning of the error term

Are formative variables (i.e., the ones used in formative MIMIC models) conceptually identical to the unreal variables discussed above? This is important, since we will show that in

order for the formative MIMIC model assumptions to hold, the formative variable must be real.

If they are unreal, the formative MIMIC model cannot be considered a valid tool. To resolve this

issue, we first defer to Diamantopoulos et al. (2008, p. 1205), who define a formative variable

model as one in which “the indicators determine the latent variable which receives its meaning

from the former”, and specify the mathematical structure of the formative variable as a weighted

sumscore, as follows:

(Equation 1)

η = Σᵢ₌₁ⁿ γᵢxᵢ + ζ

where η represents the formed variable, xi are the more fundamental variables (or indicators) that

define η, the γi are the weights which define the contribution of the indicators to the formed

variable, and ζ is a disturbance term. Moreover, in the context of Equation 1, Diamantopoulos et

al. (2008, p. 1211) articulate that “it is not possible to separate the construct's meaning from the

indicators’ content”.
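As a purely computational matter, the structure in Equation 1 is nothing more than a weighted sum. A minimal sketch (the SES indicator scores and equal weights below are hypothetical illustrations, and the ζ term is set aside):

```python
# Equation 1 as a weighted sumscore, using the paper's SES example.
# The scores and weights are hypothetical, chosen only for illustration.

# Standardized indicator scores for one individual
indicators = {"income": 0.8, "education": 1.2, "occupation": 0.5}

# Fixed weights specified by the researcher as part of the construct definition
weights = {"income": 1 / 3, "education": 1 / 3, "occupation": 1 / 3}

# eta is fully determined by the indicators and weights: once they are
# specified, there is no residual conceptual content left over.
eta = sum(weights[k] * indicators[k] for k in indicators)

print(round(eta, 4))
```

The point the sketch makes concrete is that η here is an arithmetic consequence of its inputs, not a separate quantity waiting to be discovered.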

At this stage, comparing Diamantopoulos et al.’s (2008) definition of a formative variable

and its associated mathematical structure with the entity realist understanding of what

distinguishes real and unreal variables, it becomes hard to reconcile the notion that the formative

variable described by Equation 1 is anything other than an unreal variable. That is, an unreal

variable is simply a convenient group of variables, combined using some mathematical rule, and

Equation 1 appears to say that a formative variable is just a sumscore of variables combined

using some mathematical rules.

However, might it be that the ζ term in Equation (1) somehow imbues the formative

variable with some surplus meaning beyond the indicators that are used to define η? If it does,

this might make the formative variable a real entity. Diamantopoulos (2013), borrowing from

Grace and Bollen (2008), certainly believes this to be the case. In order to provide an answer to

this question, we need to work out the meaning of the ζ term. Let us start by approaching this

question from a purely conceptual perspective (later, we will also consider pragmatic issues of

data availability for a given conceptualized set of indicators).

According to Diamantopoulos (2006, p. 11), “the error term in a formative measurement

model represents the impact of all remaining causes other than those represented by the

indicators included in the model”. Ignoring for now what LCC (along with Rigdon and Howell’s

commentaries) showed was a misuse of the term ‘cause’, let us test this idea by looking once

again at SES. First, we define SES as being a formed variable, comprising income, education and

occupation. Where is the error here? Given that “the indicators determine the latent variable …

[and] its meaning” (Diamantopoulos et al. 2008, p. 1205), it is clear that, at a conceptual level of

analysis, the definition of SES contains no conceptual error: the indicators (income, education

and occupation) define the construct, and so the formative variable perfectly mirrors its defining

factors. According to this definition of SES, the conceptual meaning of SES does not reside at all

outside of (or transcend) the income, education and occupation factors that define it. The logic of

entity realism means that SES is simply a composite of three indicators and is an unreal variable.

In fact, all formative variables defined using a set of explicitly identified factors are unreal, since

there is no error possible in their conceptual definition.

But maybe this is a special case. What about the case where the indicators of the

formative variable are not explicitly identified? Maybe error can occur here, and give the

formative variable surplus meaning? To test this idea, let us redefine SES, so that it is now

potentially a different formative variable: we shall call the new variable SES, but its conceptual

content may not be same as the previous SES idea (which was simply income, education and

occupation). Specifically, we now define SES as “the set of social and economic factors that

contribute to a person’s social standing”. Here we have not explicitly listed the individual factors

that comprise the set of things that contribute to social standing. But, if we were interested in

actually doing something with the new SES variable (e.g., using it in a model), we would want to

create that list, and that theoretical list, by definition, contains every single social and economic

factor that contributes to one’s social standing. How can it not? So once again, the definition of

SES is error-free, because there are no “remaining” social and economic factors that (a) can sit

outside the list, and (b) can contribute to one’s social standing. As a result, we can conclude that

the conceptualization of the formative SES variable, even when defined more vaguely (without a

priori specifying the individual factors that define the variable), is error-free. More generally,

even if the explicit identities of all of the individual factors comprising the formative variable are

not provided, by definition the formative variable is error free at the conceptual level. This

conclusion might seem very obvious, and one might wonder why we need to verbalize

something so trivial. The reason is that the idea of error in the formative model has been taken to

mean error in the conceptualization of the formative variable. This idea is obviously incorrect.

What, then, is Diamantopoulos (2006, p. 11) referring to when he talks about the

error term representing factors “other than those represented by the indicators included in the

model”? It cannot be a conceptual error as shown above. That leaves only an operational error –

factors which should be empirically captured, but which have not been. If one defines SES as

“the set of social and economic factors that contribute to shaping a person’s social standing”, but

only operationalizes some of those factors, one is committing an error of omission at the

operational level. Operational errors are likely to result in omissions in measurement, and thus

will lead to errors in the calculation of the numerical value of the formative variable. Now,

obviously, operational and measurement errors should not be defining features of a theoretical

variable’s conceptual definition, and so Equation 1’s error term makes little sense. In fact, it

cannot exist as an error of conceptualization. In the context of a larger model, the error term

could also represent the error in prediction of another entity, such as ‘perception of an

individual’s SES’. Such an entity could be conceptualized as a real entity, and the so-called

formative indicators would be causal influences on this entity. However, ‘perception of SES’ is

not the same entity as the SES unreal entity as defined above. Perception of SES could have

many other causes than those conceptualized to form SES itself, and these are quite correctly

represented by an error term. But neither of those interpretations of the error term are

representative of it as ‘surplus meaning’ in the definition of a formative variable. There can be

no surplus conceptual meaning in a formative variable.

As we explain shortly, it is for this reason, among others, that we also suggest that the

traditional formative model diagrammatic picture (Figure 1), together with the addition of the

error term in Equation 1, is hindering researchers’ ability to place formative variables in their

correct ontological position. In sum, then, the conclusion we draw from the analysis above is

that: (a) formative variables are defined by the variables chosen to form them, and so they are

always error-free at the conceptual level, and contain no surplus conceptual meaning that

transcends the formative variable’s defining factors, and (b) as a result, all formative variables

are unreal.

THE IMPLICATIONS OF AN UNREAL FORMATIVE VARIABLE

1. A formative variable is not a separate entity from its formative indicators. So far, we

have demonstrated that formative variables are, by definition, and despite the modeling notation

adopted in Equation 1, error free at the conceptual level, and that this piece of information gives

us some confidence in stating that formative variables are unreal. That is, formative variables do

not possess the properties that real variables possess. Formative variables do not actually exist,

independent of their indicators. They are simply convenient groupings of variables, and have no

special meaning that transcends those variables that define them (see also Borsboom et al., 2003;

MacCorquodale and Meehl, 1948).

This means that the first assumption underpinning the formative MIMIC model (see

Table 1) is not valid: that is, a formative variable is not a separate entity from its formative

indicators. It also means that Diamantopoulos’ (2013) claim that “in a formative measurement

model, the latent variable is a separate entity from its indicators” is wrong. As we now show, all

the remaining assumptions outlined in Table 1 are also invalidated.

2. The formative indicators cannot cause the formative variable. The second assumption

underpinning the formative MIMIC model is that the formative indicators in the MIMIC model

have a causal effect on η, the formative variable. Yet, as discussed in LCC, the presence of

causality in a formative MIMIC model would necessitate that the cause and the effect are

separate material entities (e.g., Sosa and Tooley 1993). However, since we have just

demonstrated that formative indicators are not separate entities from the formative variable η, by

implication, it is impossible for the formative indicators to have a causal effect on η.

3. The formative MIMIC model cannot tell us what the relationship is between a

formative indicator and the formative η entity. Indeed, contrary to the notion of causality

between formative indicators and the formative η, it is simply the case that the set of formative

indicators, manipulated according to their defining mathematical structure, are the formative η.

Furthermore, it makes no sense to be ‘seeking’ the mathematical rules for summing the

indicators, because this assumes that the rules themselves have a real existence. That is, if one

wanted to estimate the magnitude of the relationship between a formative indicator and the

formative η, one would have to make the assumption that there is a true value for the relationship

that exists beyond (i.e., transcends) any value that a researcher imposes on that relationship.

Accordingly, it would be incumbent on the person wishing to perform the estimation to

demonstrate that there is a true value to the relationship.

To illustrate, let us return to SES, defined as an individual’s income, educational level,

and occupation. One could model the formative indicators of SES in a MIMIC model. According

to the generally-accepted logic of the formative MIMIC model, by running the model, the

researcher would generate estimates of the contribution that each formative indicator makes to

SES. It is quite possible that income might return a zero relationship with the SES formative η

variable. Yet, before the researcher redefines SES so that it no longer includes income, they

should first explain how there comes to be a true value for the relationship between income and

SES that can be estimated, especially given that, as we have shown, formative entities are simply

convenient groupings of variables, and are unreal. As Borsboom et al. (2003, p. 209) point out,

“Estimation is a realist concept: Roughly speaking, one could say that the idea of estimation is

meaningful only if there is something to be estimated”.

The alternative is that there is no true value that can be estimated for the value of a

relationship between a formative variable and a formative indicator. Rather, statistical estimation

procedures will throw up different values across samples, leading to inconsistencies across

studies (Hardin et al. 2011) – which Wilcox, Howell, and Breivik (2008) point out is exactly the

case in research using SES. The view that we should estimate the relationships between

formative indicators and latents in a MIMIC model requires commitment to the idea that the

conceptual content of SES should be determined by a data set, and the acceptance that SES’s

definition will likely be different from one study to the next. One reason why we should not

estimate formative indicator weightings, therefore, is that if we do, the SES variable (or any

formative variable) cannot be compared across studies, since SES would have a different

meaning in each study. This issue is at the root of Borsboom et al.’s (2003, p. 209)

recommendation that in formative cases, “the term parameter estimation should be replaced by

the term parameter determination”. That is, as LCC suggest, the γ parameters in Equation 1 are

weights that should be predefined by the researcher, not sought by a statistical package (the

reader is referred to the original LCC article for other reasons why pre-specified weights should

be used when constructing formative variables).
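The cross-study instability that motivates pre-specified weights can be illustrated with a simulation (ours, not LCC’s: the data are randomly generated, and the weighting scheme is a hypothetical stand-in for any data-driven estimation procedure):

```python
# Why pre-specified weights travel across studies while estimated weights
# do not. All data are simulated; the estimator is a made-up stand-in.
import random

def draw_sample(n, seed):
    rng = random.Random(seed)
    # (income, education, occupation) scores for n respondents
    return [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]

def estimated_weights(sample):
    # Stand-in estimator: weight each indicator by its mean absolute
    # score in this particular sample, normalized to sum to one.
    means = [sum(abs(row[j]) for row in sample) / len(sample)
             for j in range(3)]
    total = sum(means)
    return [m / total for m in means]

FIXED_WEIGHTS = [1 / 3, 1 / 3, 1 / 3]  # part of the construct definition

# Two "studies" drawing from the same population
study1 = estimated_weights(draw_sample(200, seed=1))
study2 = estimated_weights(draw_sample(200, seed=2))

# The estimated weights disagree across the two studies, so the composite's
# meaning shifts from study to study; the fixed weights cannot.
print(study1, study2, FIXED_WEIGHTS)
```

Under fixed weights the composite is the same object in every study by definition; under estimation, each study effectively defines a different variable.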

For Diamantopoulos (2013), one of the troubling consequences of drawing the conclusion

that there is no true value of the indicator weighting to be estimated, and that the weight is part of

the variable definition, is that the recommendations of LCC “no longer address the formative

measurement model… but simply discuss alternative ways of constructing (fixed-weight)

composites”. For some who have grown up under the umbrella of classical test theory, the

conclusion that the formative model is not a measurement model may sit uncomfortably, since it

goes against much of what they have been taught. Perhaps it is for this reason that

Diamantopoulos (2013) is so dismissive of the idea that there can be no causal relationship

between the formative indicators and the formative η in the formative MIMIC model. Indeed

Diamantopoulos (2013) remains staunchly faithful to the notion that the formative MIMIC model

is a measurement model performing a much needed scientific job: “measurement models are

exactly that: theories to be tested (and possibly refuted) with empirical data”. Unfortunately,

although it might satisfy a conditioned response, a desire to undertake testing of hypotheses

about formative indicators and their relationships with the formative variable, the fact remains

that these relationships should not be estimated: rather, the relationships should be part of the

construct definition, and so cannot be inferred by statistical approaches or MIMIC models.

4. It is not possible for a single variable to have both formative and reflective content-

valid indicators. Advocates of the formative MIMIC model sometimes argue that for any given

MIMIC model, there is no single correct way of interpreting what is being measured and that, in

fact, one can measure a variable using formative or reflective approaches (or even both) if one

wishes. As Diamantopoulos (2013) states, in a MIMIC model with exogenous observable indicators (the xs) and endogenous observable reflective indicators (the ys), it is possible to view “both the xs and the ys as content-valid indicators of η, with the xs being formative and the ys being

reflective” (see Figure 2 for a diagrammatic representation of such a model). Similarly, Bollen

(2007, p. 221) asks the reader to “consider a multiple indicator–multiple cause (MIMIC) model

in which there are both causal (formative) and effect (reflective) indicators of a single latent

variable”. Bollen (2007) provides an illustration of such a model, “in which η1 is home value;

the causal (formative) indicators are square footage of house, age, lot size, number of rooms, and

so on; and the effect (reflective) indicators are appraised value, owner estimate, and assessed

value of the home.”

Figure 2 about here

In this interpretation of the formative MIMIC model, if the MIMIC model’s reflective

items are unidimensional (as reflective items should be), and if its formative items are

multidimensional (as formative items should be), and if (as Diamantopoulos suggests) η is a

valid representation of the formative items, and the reflective items are valid representations of

η, then the single entity, η, is simultaneously unidimensional and multidimensional. Such a state

of affairs is a logical impossibility, and so we conclude that a single variable cannot have both

formative and reflective content valid indicators.

In practice, MIMIC models that claim to ‘measure’ a single construct cannot have

both formative and reflective content-valid indicators of the variable: it is just impossible, so

something else must be going on in the MIMIC model (we provide detailed explanations in

LCC). For instance, in Bollen’s (2007) example of home value, rather than the formative and

reflective indicators being measures of a single “home value” η variable, a more plausible

interpretation is that the reflective indicators measure perceptions of home value, whereas the

MIMIC model’s formative items are best conceptualized, not as variables that define home

value, but rather as exogenous potential causes of home value perceptions (along with many

other factors that are not assessed in the questionnaire). All instances of MIMIC models that

proclaim to simultaneously measure a single entity with both formative and reflective indicators

are making a similar error in that the MIMIC model is just a model in which a common factor is

predicted by some exogenous variables (see also the comments of Howell, and Rigdon, in this

issue). Of course, this is just what the MIMIC model should do – after all it is a Multiple

Indicators, Multiple Causes model. As we have shown above, in LCC, and elsewhere (e.g.

Cadogan and Lee, 2013), formative variables cannot have direct causes (i.e., one cannot have

direct paths influencing a formative η variable – it is a logical impossibility). The MIMIC model is a fine tool, but only if it is not used to attempt to operationalize a formative variable. That is, the

MIMIC model works as a modeling tool when one does not assume that the exogenous xs in

Figure 2 are formative indicators of η.

ADDITIONAL ISSUES

We have clarified the logical implications of rejecting the use of the formative MIMIC models

here and in LCC, and in this rejoinder we have also now explicated the logical consequences of

accepting the utility of formative MIMIC models. For us, the natural outcome of this – the

abandonment of formative MIMIC models – is self-evident. Furthermore, the discussion

presented on the distinction between what we called real and unreal entities should clear up

many of the issues raised in the commentaries on LCC.

Beyond these issues, there is also a need to provide some brief responses to a number of

specific points raised by the three commentators. First, although at no point in LCC do we say

that the Advertising Expenditure (AE) measure was developed by Diamantopoulos and

Winklhofer (2001), it appears that Diamantopoulos (2013) got the impression that we did. We

should have made authorship of the scale more obvious, and we should also have been more

explicit when describing the AE variable and its origins. That said, the way that LCC attributed

authorship and discussed operationalization of the original AE measure has no bearing on the

logic being used by LCC to make its point. It is certainly not reasonable to conclude that, as a

result, “the entire discussion of the model discussed in LCC’s Figure 3 becomes rather irrelevant

and potentially misleading” (Diamantopoulos, 2013). Underpinning LCC’s Figure 3 example is

some fundamental logic that can be applied to any formative variable, not just the example used.

If readers follow Diamantopoulos’ implied suggestions, and ignore the Figure 3 example, they

will miss out on an illustration of how a MIMIC model cannot be a formative latent variable

model.

Diamantopoulos’ comment also makes much of the fact that LCC argued that the same

observed items could be considered indicators of two different constructs (access to coercive

tools, and perceived coercive power). However, two key mistakes are made by Diamantopoulos’

comment in this regard. First, LCC makes very clear in its Table 1 that these items, while only partly sharing the same wording, were different in at least two ways: a) the item

stems were different (‘how much capability does your supplier have to take an action’, versus

‘what is your ability to take the action’), and b) the unit of analysis and thus the data source was

different (the buyer rates their supplier, versus a supplier rates themselves). So, quite apart from

anything else, the same items were not used at all, and it is unclear how Diamantopoulos came to

that conclusion1. That said, some may be surprised to note that it is hardly a new idea that even

the very same item, administered to the very same respondent, and using the same measurement

model (e.g. reflective) could be used to measure more than one construct. Hayduk (1996, see e.g.

p. 25-30) discusses this at length when he demonstrates how a consideration of the error of a

measurement item should be informed by one’s construct definition. Hayduk and Littvay (2012)

elucidate this with an example of how a respondent’s extent of agreement/disagreement with the

statement ‘sex with a consenting adult of the same sex is wrong’ could be defined as a measure

of a) the true extent of agreement/disagreement with the statement ‘sex with a consenting adult

of the same sex is wrong’, b) the respondent’s personal commitment to a traditional reading of a

religious text, or c) the respondent’s parents’ commitment to a traditional reading of a religious

text, as the error variance fixed by the researcher increases. If Hayduk and Littvay (2012) are right in this respect, then even if Diamantopoulos’ (2013) criticism of our coercive power example had been correctly targeted, the nature of the criticism would appear to be unfounded.

1 Of course, we assume here that Diamantopoulos’ lack of clarity was unintentional.

Further, we wonder whether Diamantopoulos (2013) really meant that abandoning the

formative model “de facto means that the only auxiliary theory available to researchers would be

the reflective model”? There are a number of problems with this statement. First, it is unclear

where Diamantopoulos saw LCC recommending abandonment of the formative model, since

nowhere in LCC can this be found. Rather, LCC, and this rejoinder, recommends abandoning the

formative MIMIC model as an operational tool. It also appears that it is wrong to assume that the

only two models available for use by researchers are reflective and formative. Rigdon et al.

(2011, p. 1592-1594) explain the potential harm of this dichotomous view, giving a brief

introduction to the multiple different measurement models available to the social researcher, and

summing the problem up as follows: “we encourage researchers to consider the broad range of

possible relations between measures and constructs. This range is limited only by our ability to

imagine and implement measurement models. Computational advances over the years have

dramatically expanded this potential range, but the scope of our imagination broadens more

slowly,” (p. 1542). For Rigdon et al. (2011), the seemingly-ingrained belief that there are only

two measurement models possible, formative and reflective, is an abject failure of imagination.

Diamantopoulos also claims that LCC fails due to what is termed the application of ‘reflective logic’ to issues of formative models, citing Bagozzi (2011) as making this suggestion in relation to other articles in the area. While we will not speak for other authors, it is

not clear to us what the ‘reflective logic’ referred to as underpinning LCC could be, and we

believe it is inadvisable to set up what looks like a situation of paradigmatic incommensurability

when it is clearly not the case. In fact, this rejoinder has shown how to differentiate between real

and unreal entities, and how the reflective model can be used with real entities, and the formative

model should be used in the case of unreal entities. We would suggest that, unlike many in the

pro-formative MIMIC modeling camp, we have demonstrably moved beyond the assumption

that all entities must be conceptualized in an entity realist manner which, if we are forced, is the

only thing we can think of which could be called a ‘reflective logic’ (cf. Borsboom et al. 2003).

In fact, in light of the real / unreal distinction, the so-called key difference cited in

Diamantopoulos’ comment of exogenous (reflective) versus endogenous (formative) indicators is

nothing of the sort. In an unreal case (logically the only case where formative models fit), it

makes no sense for indicators to be either endogenous or exogenous. Rather, the indicators are

analogous to pieces of a mosaic: they are simply part of the construct (hence our

recommendation for various composite methods).

Moving to Howell’s comment on LCC, we appreciate his suggestion that the term

MIMIC should, strictly speaking, only refer to models with multiple reflective indicators, rather

than conceptually distinct outcomes. However, we also note that it is not uncommon for the

latent variable in a MIMIC model to be interpreted in terms of the formative items, and for the

reflective variables chosen to estimate the MIMIC model to not be representative of the domain

of the theoretical entity being modeled, but to be more appropriately considered as endogenous

consequences, just as in the AE example given in LCC (for further examples, see amongst others

Diamantopoulos and Siguaw, 2006; Molina-Castillo et al. 2012; Sanchez-Perez and Iniesta-Bonillo, 2004; Uhrich and Benkenstein, 2010).

Of course, when a real entity is defined, then as pointed out by Howell, there can be

causes conceptualized. However, in many cases these ‘MIMIC-type’ models are used in practice

to operationalize formatively-conceptualized variables (such as the examples we gave in LCC),

which does not make sense conceptually. Therefore, like LCC, Howell shows that it is illogical

to use a MIMIC model to operationalize a formative variable. The ‘formative measurement’

issue indeed goes away in the special case explained by Howell (2013) (i.e., when “(1) the latent

variable is interpreted only in terms of the content of its reflective indicators; (2) the Xs are

interpreted as causes, predictors, or covariates (call them what you wish), but not as measures;

and (3) the error term is interpreted as all sources of variation in the reflectively measured latent

variable not included in the model”), but only because a formative variable is not represented

here – as echoed by Rigdon in his comment.

Rigdon in fact passes over much of the detail of LCC, and instead brings his guns to bear

on a number of what he calls “more fundamental issues” concerning “psychological

measurement” (Rigdon 2013). We leave readers to draw their own conclusions on the issues

outside the scope of LCC, and Rigdon provides ample excellent discussion to do so. As for why

we focused on what Rigdon considers a rather unimportant issue, the simple reason is that the

issues covered here and in LCC are problems that are compromising the rigor of research right

now. Whether or not there are more important issues in a conceptual sense to be dealing with, in

a practical sense, researchers at this moment in time are building and testing models which are

empirically incompatible with their conceptualization because of the problems we pointed out in

LCC.

We also agree with Rigdon’s criticism of the terminological confusion currently

obfuscating measurement theory and practice. His suggestions to solve this are a great start, and

have much in common with Bagozzi’s (1984) holistic construal. However, Rigdon’s proxy

variable solution does not take into account the key issue of unreal variables. Could the

representation of an unreal variable (say SES as discussed above) in an empirical model be

considered a proxy? One could argue not, since there is no entity, or ‘conceptual variable’, in

Rigdon’s terms, of SES that is being approximated. As far as empirical models go, it makes sense to refer to empirical proxies of real entities, but this is harder to justify in unreal entity cases.

We would also venture that not all measurement situations can be covered by the term

‘psychological measurement’ as used by Rigdon. For example, concepts such as ‘strategic

planning’, ‘market information dissemination’, and – again – ‘advertising expenditure’, do not

necessarily fall under a psychological measurement banner. As alluded to in LCC, other fields of

research, such as clinical diagnosis, and health economics, exhibit a number of different and

potentially useful modeling methods that do not rely on the assumption of real entities.

Sometimes, a variable is just a collection of indicators (unreal), not a proxy for an unobservable

real entity. In fact, those were the situations in which we argued that a formative approach may

be conceptually justified. Similarly, whether or not we can formulate a convincing existence

proposition for a theoretical entity should not be confused with whether one is using a ‘factor

model’ or not. Factor models are one way of modeling real entities, but there are other models.

The distinction is critical, and it means that Rigdon’s (2013) apparent concern that we lack “the

psychological laws that would enable” the fixing of weights is a little misdirected. As we have

already discussed herein, for an unreal entity, the definitional rules of combining the observed

indicators to create the composite (e.g. the weightings) are sufficient (MacCorquodale and

Meehl, 1948). There are no psychological laws to be discovered in such situations (Borsboom et

al. 2003).
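As a minimal sketch of this point (the indicator names, weights, and function name below are hypothetical, chosen only to illustrate a stipulated combination rule, not any published SES index):

```python
# Sketch: an 'unreal' composite is exhausted by its components and a
# stipulated combination rule. The weights here are definitional
# choices, not parameters awaiting discovery by some psychological law.

def ses_composite(income, education_years, occupation_rank,
                  weights=(0.5, 0.3, 0.2)):
    """Weighted-sum composite; the weights are part of the definition."""
    w_inc, w_edu, w_occ = weights
    return w_inc * income + w_edu * education_years + w_occ * occupation_rank

# The composite changes only when a component changes:
base = ses_composite(1.0, 1.0, 1.0)     # 0.5 + 0.3 + 0.2 = 1.0
raised = ses_composite(2.0, 1.0, 1.0)   # raising income raises the index
```

Nothing about the composite exists 'behind' this function: change the weights and you have, by definition, a different variable.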

Finally, we agree that understanding of causality is a significant problem in measurement

theory, and application – as exhibited in the confusion between formative and causal that LCC

and this rejoinder has repeatedly tried to point out. However, in such cases, while the actual

observed variables and latent proxies for real conceptual entities cannot logically have causal

relationships, it is appropriate to consider an empirical model of theoretical real entities as

representative of an underlying conceptual model with causal content. In a measurement sense,

while the actual path from the ‘real’ conceptual entity may be unobserved, we remain

philosophically and theoretically wed to the idea that in some way the conceptual variable has

some causal impact on the observed variable(s) we are using to measure it (Borsboom, 2005).

We would again say that there is no self-evident reason that we could not also consider it

possible for an observed variable to somehow have a similarly-unobserved causal impact on a

conceptual variable, but even though the observed variable would be exogenous in this case, it is

not the same as calling the model a formative one: as already explained, the exogenous variable

would be an exogenous cause – one entity having a causal relationship with a separate, distinct

other entity.

In fact, we would suggest that greater clarity of the difference between empirical and

conceptual models as Rigdon suggests might also help researchers distinguish when it is right to

talk about causality, and when it is not. For example, in the unreal case, discussion of causality is

not relevant (because there is no underlying conceptual entity with existential properties): as we

show above, what Rigdon might term the empirical proxy is the (unreal) theoretical variable.

Importantly though, Rigdon’s argument that the empirical proxy cannot have causes or effects

remains solid (because it is simply an empirical thing, and not a conceptual entity with

antecedents and consequences). We alluded to exactly this problem in explaining why the

MIMIC model does not work conceptually for the formative case. But perhaps more troublingly,

it also puts researchers who use formative models in a difficult position in terms of placing

such variables in an empirical model. Cadogan and Lee (2013) begin to address this with their

discussion on using formative variables as endogenous, but there is much more to be done here.

CONCLUSIONS: WHERE DO WE GO FROM HERE?

We believe that the present exchange between us, Diamantopoulos, Howell, and Rigdon, has

moved forward thinking on a number of conceptual and practical issues concerning formative

models, and social scientific measurement in general. As such, we are indebted to all parties

(including the editorial team at AMS Review) for the chance to engage in this exercise, and

question our own thoughts and arguments in such an open forum. The ideas presented in LCC

have been moved on in a number of important ways by each of these three comments. However,

despite the many thousands of words that have been expended on this topic, we believe the entire

issue comes down to one seminal point: the distinction between real and unreal entities. If one

accepts that this is a valid distinction in a conceptual sense, then the implications drawn in LCC

and herein should be relatively self-evident. It is only when one begins to conflate the unreal

with the real entity that confusion begins to reign. So, one last time, let us clarify the distinction,

by way of presenting a potential thought process researchers may engage in.

The foundations of scientific research rest on theories which attempt to explain

phenomena (whether these are astronomical events, chemical reactions, human relations, animal

behavior, or anything else). Unless it deals solely with things that can be directly measured2, a

theory at least in part consists of a number of theoretical entities, which are linked together by

2 While we are not experts in medical research, an example of such a theory might be that the number of cigarettes smoked is associated with probability of death before a certain age. However, it may also be the case that even such a theory could be extended with the inclusion of unobservable mediating processes / variables needing indirect measurement. Furthermore, the notion of cause is itself unobservable. Some authors would also argue even something as seemingly observable as a probability of death or number of cigarettes is fundamentally unobservable due to measurement error, but such a view would only strengthen our thesis, so it is not necessary to dwell on it here.

way of some (most usually causal) logic. The first port of call in formalizing any scientific

theory beyond mere idle speculation must be to conceptualize those theoretical entities.

Theoretical entities begin as ideas we may have about how to explain phenomena, which

we give names to – such as ‘job satisfaction’, ‘customer orientation’, ‘service quality’,

‘socioeconomic status’, ‘advertising expenditure’ and the like. However, the conceptualization

process should be far more rigorous than simply giving a theoretical idea a name, and the most

important decision at this stage is akin to asking the question ‘is this theoretical idea something

more than just a name?’ In other words, can we define a convincing existence proposition for

this theoretical entity?

While the question of entity realism might initially seem to be hard to answer, in our view,

there are two potential questions that can shed light on this issue:

1. Is it easy to define some ‘class’ of entity it may belong to? For example, could you call it

a type of ‘attitude’, ‘perception’, ‘emotion’, ‘elementary particle’, and so forth? And,

could this theoretical entity exist, if a researcher had not first conceptualized it?

2. Is it easy to define the theoretical entity in a way that does not involve it as some

combination of other more fundamental components? Pragmatically, can you define it in

a way that does not imply for example certain observable items, or subdimensions?

If it is easy to answer ‘yes’ to both these questions, it is likely that one can define it as a real

entity, and create a convincing existence proposition. If not, one should think carefully about

assuming that the theoretical entity is real. However, the answers to such questions are

interlinked rather than sequential. Take the example of SES as an illustration. First, it is not easy

to see exactly ‘what’ SES could be. One could perhaps be rather pedantic, and call it a

‘demographic characteristic’, and argue that such characteristics of course do exist – such as

gender. But, as part of this class of entity, could SES have existed if a researcher had not

conceptualized it first? In our view, the answer would be no. An individual’s SES (note, as

distinct from the perception of SES) is a concept created by interested observers (e.g.

sociologists), and is not an inherent feature of the world around us. Contrast this with a

perception, an emotion, or an elementary particle (e.g. an electron). While our scientific theories

may be wrong, the current understanding is that such things as these are inherent features of the

world around us, whether or not we are there to name and theorize about them. It is harder

(although perhaps not impossible) to convincingly argue the same for SES.

But even if one can debate the answers to the first point (and doing so is in itself a

valuable exercise), it is the answer to the second that will settle the issue. Once one tries to define

SES more formally, one starts to run up against the problem that it is defined as some

combination of other things – income, occupation, education. Defining it in this way then makes

it even harder to answer the first question positively. As such, here, we are clearly dealing with a case where it is impossible to define a convincing existence proposition without some

redefinition of the very nature of the theoretical entity – in other words, without changing what it

is. We suspect there are many such cases in marketing and social research.

All the central ideas we presented in LCC and in this rejoinder are founded on the

distinction between real and unreal cases. It is our view, and that of others (e.g. Borsboom et al.

2003), that the only logical definition of the formative model is as a way of modeling unreal

cases. If one accepts the arguments that lead to this viewpoint, then the substantive problems we

pointed out in this exchange are of great importance to conducting rigorous research, namely:

1. Using covariance-based MIMIC models does not properly represent the formative case.

2. The definition of a formative variable must include the components and the way they are

combined together (e.g. weightings).

3. Using exogenous indicators to model a real entity is not the same as using formative

components to model an unreal variable.

Avoiding these issues, and maintaining that the ‘formative model’ is analogous to a real entity

with exogenous indicators is, in our view, misleading, and actually harmful to the formative

model. Further, we do not understand why the unreal case – which is essentially analogous to

operationalism – seems to be anathema to so many. We showed in LCC that fields such as

clinical diagnosis and health economics understand, accept, and provide a number of options to

model unreal entities, and they are fundamental parts of theoretical explanations in many other

fields. For example, electrical resistance is not a real entity, being as it is nothing more than the

ratio of voltage across an object to current through it (MacCorquodale and Meehl, 1948).

Another example is momentum, being mass multiplied by velocity. The fact that these variables

are operationally defined does not appear to have impeded their central role in various models

and explanations. However, note a number of key characteristics that exactly mirror our

definition of the unreal entity, and ensure that such cases are useful across studies:

1. The definition includes the components, and the combination rule of those components

(ratio for resistance, multiplication for momentum)3

2. One cannot logically influence the amount of the unreal entity without influencing one or

more of the individual components (Cadogan and Lee, 2013). How can one cause a

change in momentum without changing velocity or mass?
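The momentum example in point 2 can be written directly as code (our own illustration, not drawn from any of the cited papers):

```python
# Sketch: momentum as an operationally defined ('unreal') composite.
# Its definition IS the combination rule: mass multiplied by velocity.

def momentum(mass, velocity):
    """Combination rule: multiplication, in MacCorquodale and Meehl's sense."""
    return mass * velocity

# Any change in the composite must come through a component:
p_before = momentum(2.0, 3.0)   # 6.0
p_after = momentum(2.0, 4.5)    # 9.0 - only velocity changed
```

Estimating empirical ‘weights’ for mass and velocity, or dropping one of them as a poor ‘indicator’, would not measure momentum better; it would redefine what momentum is.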

If sciences such as physics and medicine can rely on what we term unreal entities as fundamental

parts of their models, what is the reason they seem so hard to accept in other fields (e.g.,

marketing)? It is our view that the acceptance that formative variables are unreal would do much

to enhance current research practice in marketing and similar domains of study. Furthermore,

clearly defining a formative variable model as being appropriate to the unreal case and not the

real case would remove much of the confusion and misconception regarding the formative model

that abounds in the literature, and assure the formative model of a key place in all good

researchers’ toolkits. We understand Diamantopoulos’ (2013) skepticism of whether we really

believe that formative models are not inferior. However, we consider that our clarification of

when formative models can be used is far more likely to ensure that “formative models are here

to stay”, than the confusion that currently exists.

3 Just imagining trying to use a MIMIC-type model to operationalise mass and velocity as indicators of some latent variable called ‘momentum’, and using the estimated beta values to determine whether either of the two indicators should be removed will illuminate once and for all the absurdity of the MIMIC approach to formative models.

References

Bagozzi, R.P. (1984). A Prospectus for Theory Construction in Marketing. Journal of Marketing, 48 (1), 11-29.

Bagozzi, R.P. (2011). Measurement and Meaning in Information Systems and Organizational

Research: Methodological and Philosophical Foundations. MIS Quarterly, 35 (2), 261-

292.

Blalock, H.M. (1975). The Confounding of Measured and Unmeasured Variables. Sociological

Methods and Research, 3 (4), 355-383.

Bollen, K.A. (2007) Interpretational Confounding is Due to Misspecification, Not Type of

Indicator: Comment on Howell, Breivik, and Wilcox (2007). Psychological Methods, 12

(2), 219-228.

Borsboom, D. (2005). Measuring the Mind: Conceptual Issues in Contemporary Psychometrics.

Cambridge, UK: Cambridge University Press.

Borsboom, D., Mellenbergh, G.J. & van Heerden, J. (2003). The Theoretical Status of Latent

Variables. Psychological Review, 110 (2), 203-219.

Cadogan, J.W., & Lee, N. (2013). Improper Use of Endogenous Formative Variables. Journal of

Business Research, 66, 233-241.

Diamantopoulos, A. (2006). The Error Term in Formative Measurement Models: Interpretation

and Modeling Implications. Journal of Modelling in Management, 1 (1), 7-17.

Diamantopoulos, A. (2013). MIMIC Models and Formative Measurement: Some Thoughts on

Lee, Cadogan, and Chamberlain. Academy of Marketing Science Review. This issue.

Diamantopoulos, A., Riefler, P. & Roth, K.P. (2008). Formative Indicators: Introduction to the

Special Issue. Journal of Business Research, 61 (12), 1203-1218.

Diamantopoulos, A. & Siguaw, J. (2006). Formative versus Reflective Indicators in

Organizational Measure Development: A Comparison and Empirical Illustration. British

Journal of Management, 17, 263-282.

Diamantopoulos, A. & Winklhofer, H.M. (2001). Index Construction with Formative Indicators:

An Alternative to Scale Development. Journal of Marketing Research, 38, 269-277.

Grace, J.B. & Bollen, K.A. (2008). Representing General Theoretical Concepts in Structural

Equation Models: The Role of Composite Variables. Environmental and Ecological

Statistics, 15 (2), 191-213.

Hardin, A.M., Chang, J.C.-J., Fuller, M.A., & Torkzadeh, G. (2011). Formative Measurement

and Academic Research: In Search of Measurement Theory. Educational and

Psychological Measurement, 71 (2), 281-305.

Hayduk, L.A. (1996). LISREL: Issues, Debates and Strategies. Baltimore, Maryland: John

Hopkins University Press.

Hayduk L.A., & Littvay, L. (2012). Should Researchers Use Single Indicators, Best Indicators,

or Multiple Indicators in Structural Equation Models? BMC Medical Research

Methodology, 12: 159.

Howell, R. (2013). Conceptual Clarity in Measurement: Constructs, Composites, and Causes: A Commentary on Lee, Cadogan and Chamberlain (2013). Academy of Marketing Science Review. This issue.

Lee, N., & Cadogan, J.W. (2013). Problems with Formative and Higher-Order Reflective

Variables. Journal of Business Research, 66, 242-247.

MacCorquodale, K. & Meehl, P.E. (1948). On a Distinction between Hypothetical Constructs

and Intervening Variables. Psychological Review, 55, 95-107.

Molina-Castillo, F.-J., Calantone, R.J., Stanko, M.A., & Munuera-Aleman, J.-L. (2012). Product

Quality as a Formative Index: Evaluating an Alternative Measurement Approach. Journal

of Product Innovation Management.

Rigdon, E.E. (2013). Lee, Cadogan, and Chamberlain: An Excellent Point . . . But What about that Iceberg? Academy of Marketing Science Review. This issue.

Rigdon, E.E., Preacher, K.J., Lee, N., Howell, R.D., Franke, G. R., & Borsboom, D. (2011).

Avoiding Measurement Dogma: A Response to Rossiter. European Journal of Marketing,

45 (11/12), 1589-1600.

Sanchez-Perez, M. & Iniesta-Bonillo, M.A. (2004). Consumers Felt Commitment Towards

Retailers: Index Development and Validation. Journal of Business and Psychology, 19

(2), 141-159.

Sosa, E. & Tooley, M. (1993). Introduction. In E. Sosa & M. Tooley (Eds.). Causation, Oxford:

Oxford University Press.

Treiblmaier, H., Bentler, P.M., & Mair, P. (2011). Formative Constructs Implemented via

Common Factors. Structural Equation Modeling, 18 (1), 1-17.

Uhrich, S., & Benkenstein, M. (2010). Sport Stadium Atmosphere: Formative and Reflective

Indicators for Operationalizing the Construct. Journal of Sport Management, 24, 211-237.

Wilcox, J.B., Howell, R.D. & Breivik, E. (2008). Questions about Formative Measurement.

Journal of Business Research, 61(12), 1219-1228.

Table 1: Assumptions Underpinning and Associated with the Formative MIMIC Model

(1) In a formative variable model (see Figure 1), where η is the focal latent variable, η

is a separate entity from its formative indicators.

(2) The formative indicators in the MIMIC model have a causal effect on η. That is,

variance in a formative indicator, via a path of causal mechanisms, leads to a change

in η.

(3) MIMIC model estimation can provide information on the magnitude of the

relationship between the formative indicator entities and the η entity.

(4) It is possible for a variable to have both formative and reflective content valid

indicators of the same variable.

Figure 1: A Traditional Diagrammatic Representation of a Formative Variable

Figure 2: The Formative MIMIC Model