FORMATIVE VARIABLES ARE UNREAL VARIABLES:
WHY THE FORMATIVE MIMIC MODEL IS INVALID
John W. Cadogan, Nick Lee and Laura Chamberlain
AMS Review, Vol. 3 (1), 38-49
ISSN: 1869-814X
Abstract
In this rejoinder we provide a response to the three commentaries written by Diamantopoulos,
Howell, and Rigdon (all this issue) on our paper The MIMIC Model and Formative Variables:
Problems and Solutions (also this issue). We contrast the approach taken in the latter paper
(where we focus on clarifying the assumptions required to reject the formative MIMIC model)
by spending time discussing what assumptions would be necessary to accept the use of the
formative MIMIC model as a viable approach. Importantly, we clarify the implications of entity
realism, and show how it is entirely logical that some theoretical constructs can be considered to
have real existence independent of their indicators, and some cannot. We show how the
formative model only logically holds when considering these ‘unreal’ entities. In doing so, we
provide important counter-arguments to many of the criticisms made in Diamantopoulos’
commentary, and the distinction also helps clarify a number of issues in the commentaries of
Howell and Rigdon (both of which in general agree with our original paper). We draw together
these various threads to provide a set of conceptual tools researchers can use when thinking
about the entities in their theoretical models.
Keywords: Formative Variables; Measurement; Composites; Indicators; Theory; Causality;
Ontology; Philosophy
We are grateful for being given the chance to add a rejoinder to the comments that have been
made about our original contribution to the AMS Review. Our paper, The MIMIC Model and
Formative Variables: Problems and Solutions, which the commentators refer to as LCC, appears
to have split the commentators into different camps. First, Diamantopoulos has sought to find
ways of refuting our claims that the formative MIMIC model tells us little about formative
variables. Reflecting on his work, and the conclusions one must inevitably draw if one accepts
his arguments, his stance can be summed up in the following way:
It is entirely possible for a singular entity, with singular conceptual content, to also be
multifaceted in conceptual content. Likewise, it is possible for a grouping of conceptually
different entities, that is, a grouping of multiple entities that potentially have conceptually
orthogonal meanings, to also have singular, equivalent, conceptual content. In other words,
there is no such thing as either unidimensionality or multidimensionality of variables:
whether an entity is unidimensional or multidimensional is in the hands of the individual
researcher, such that if a researcher wishes to do so, she can decide that a variable is both
unidimensional and multidimensional at the same time. As such, the MIMIC model is a
usable tool for modeling formative variables.
On this reading, Diamantopoulos’ stance appears to be illogical and contradictory. Of course, we
(and we suspect many others) cannot agree with such a view. However, while this view might
appear to be an extreme caricature, drawn by us to illustrate a point, in this rejoinder we will
demonstrate that it is the unavoidable conclusion one comes to if one follows Diamantopoulos’
arguments and concurs with his reasoning that (a) the formative MIMIC model can provide
information on formative measures, and beyond that (b) that a latent variable, η, can be measured
using both formative indicators and reflective indicators, simultaneously. Part of the error in
Diamantopoulos’ comment on LCC seems to be a failure to appreciate the subtlety of the
differences in the ontologies underpinning the formative and the reflective variable models. We
expand on this issue in some depth below, and in particular the position – entrenched in
Diamantopoulos’ comment – that the MIMIC model’s interpretation is a purely conceptual
decision; that the researcher can choose whether the MIMIC model conceptually represents a
formative model, a reflective model, or even both simultaneously.
Howell’s comment on LCC is the polar opposite of Diamantopoulos’. In fact, Howell
provides additional evidence to support our claims that the MIMIC model says nothing about
formative variables. In so doing he fatally undermines Diamantopoulos’ arguments. For instance,
he makes reference to work by Treiblmaier, Bentler, and Mair (2011), who demonstrate that
when reflective items are added to a formative variable (to create a MIMIC model), the meaning
of η, the focal latent variable in the MIMIC model, is not derived from the presumed formative
indicators. Instead, Howell points out that, in line with LCC’s arguments, Treiblmaier, Bentler,
and Mair’s (2011) use of covariance algebra and path tracing rules shows that η in a MIMIC
model is just a common factor explaining the covariance among the MIMIC model’s reflective
items.
Rigdon’s position is also at odds with Diamantopoulos’: he argues that the MIMIC model
cannot simultaneously represent reflective and formative measurement. Rigdon’s ultimate goal,
however, is to move the debate into different territory, and so he spends relatively little time
reflecting on the specific issues presented in LCC. Rigdon instead considers a number of other
important questions, tangential to those covered in LCC. Importantly, Rigdon argues that
researchers are working at three levels of abstraction: at the most abstract level (and the most
important from a theory development perspective), are theoretical concepts; at the least abstract
are observed behaviors used to make inferences about the theoretical concepts. At the
intermediate level of abstraction, researchers develop representations of the theoretical concepts
with the observed variables, using common factors, weighted composites, or other approaches to
generate factor (e.g., using factor analysis) or composite (e.g., using partial least squares) proxy
variables. Using this view of conceptual variables as either highly abstract, observed, or proxy
representations, Rigdon discusses issues raised in LCC, such as fixing weights, as well as issues
beyond LCC, such as the pro-factor bias in the psychological measurement literature.
REFLECTING ON WHERE THE COMMENTARIES LEAVE US
Where do the commentaries leave the field? In our opinion, and despite the strong
endorsement of our position by Howell, and the confirmation that our reasoning is sound by
Rigdon, the situation remains rather unsatisfactory. Readers of Diamantopoulos’ commentary,
for instance, may wonder why anyone would question the utility of the MIMIC model. If
Diamantopoulos is correct, then LCC are trying – as he suggests – to ‘kill’ the formative
measurement model. As such, they must be stopped, because by undermining the ability of the
MIMIC model to handle formative variables, LCC are inhibiting scientific progress.
Therefore, it is somewhat understating the case to say that LCC’s arguments did not
convince Diamantopoulos that MIMIC models are inappropriate for modeling formative
variables. As a result, there is a chance that some readers will be swayed by Diamantopoulos’
rejection of LCC’s claims. Indeed, perhaps Diamantopoulos is right, and LCC, in challenging a
well-established methodological tool, have joined Howell (2013), Rigdon (2013), Borsboom
(e.g., Borsboom 2005), and the like, forming a body of – his word – ‘misguided’ academics who
cannot see some obvious truth, who are failing to grasp something fundamental about “what
things are”, and “how we measure them”.
Perhaps one of the problems with LCC’s approach is that, in focusing on why MIMIC
models do not make sense for formative variables, it did not make explicit the assumptions one
would need to make if one believes that the formative MIMIC model is a viable method of
empirically representing formative variables. It may be that additional light can be shed on the
issue of whether formative MIMIC models are valid, by examining the basic precepts one needs
to accept in order to allow formative MIMIC models to remain in our panoply of acceptable
methodological approaches. Diamantopoulos, for instance, uses arguments to criticize LCC’s
logic that rely on certain assumptions about the nature of reality: if the latter assumptions are
valid, then the arguments they prop up stand a greater chance of being able to dismantle LCC’s
thesis. However, should it be the case that the assumptions propping up the formative MIMIC
model are untenable, then critiques of LCC’s logic that rely on those assumptions must be
dismissed.
Accordingly, we use this rejoinder to provide an overview of the assumptions that
implicitly or explicitly are used when defending the formative MIMIC model, or when attacking
the logic of those who question its veracity. In this respect, the following assumptions appear to
be critical to the formative MIMIC model agenda:
Table 1 about here
Figure 1 about here
We identified these assumptions and implications by referring to Diamantopoulos’ (2013)
commentary on LCC. However, there may be additional issues and assumptions that are not
covered here that appear in Diamantopoulos’ writings, and in the writings of others who discuss
the formative model. Even so, it seems the issues above form an intersecting set of beliefs that
are used to (a) justify the formative MIMIC model’s capability to model formative variables, and
(b) defend the model from attack by the likes of LCC.
Do the assumptions hold up to scrutiny? We suggest not. Our reasoning depends on
understanding what the entity realism ontology implies, and so we now discuss the nature of real
entities.
TESTING THE ASSUMPTIONS UNDERPINNING THE FORMATIVE MIMIC
MODEL
Real and unreal variables
The most basic assumption underpinning the arguments of those advocating the MIMIC
model as a way of representing formative variables is that, for a formative variable, the focal
latent variable η is a separate entity from its formative indicators. For instance, Diamantopoulos
(2013) is explicit on this front, stating that “[o]ne can very well defend a claim that, in a
formative measurement model, the latent variable is a separate entity from its indicators”. If the
logic of this assumption is undone, such that it is found that η is indistinguishable from its
formative indicators, then, as we demonstrate later, the remaining assumptions in Table 1 are in
turn invalidated.
One way to approach the question of whether a formed η is the same thing as its
indicators, or whether it is somehow made up of different conceptual ‘stuff’ from the indicators
that form it, is to consider the issue of entity reality. At first, it seems strange even to
contemplate a discussion about real things, since to do so implies that there are unreal things. We
are scientists – at least, we aspire to adopt scientific principles and approaches – so why would
we contemplate talking about unreal things? Much depends on what is meant by real.
From an entity realist perspective, a real entity is simply an entity that is assumed to
actually exist, independent of measurement or examination, and which should be taken literally
(Borsboom, Mellenbergh, and van Heerden 2003). One can view real entities as being
fundamental, or unidimensional, since they are singular in conceptual meaning, and cannot be
broken into smaller, more fundamental conceptual entities. The converse are variables that one
might term unreal. However, the use of the term unreal is rather pejorative, and so while we do
not abandon the word unreal, we qualify its use by explicitly stating that the term unreal is used
as a way of identifying a variable that does not conform to the definition of a real entity. What,
then, is an unreal variable? Borsboom et al. (2003, p. 207) describe an entity of this kind as being
“a fiction, constructed by the human mind”, an operational or “numerical trick … a (possibly
weighted) sumscore and nothing more”, which does not have or require existence independent of
measurement.
Clearly, from this perspective, an unreal variable could produce numerical magnitudes as
scores, and those scores are real in the sense that they exist on paper, in our minds, or in a
database somewhere. However, the variable that is formed cannot represent a real entity with
genuine existential properties that transcend its indicators / components (see MacCorquodale and
Meehl 1948): it is just a mathematical structure applied to a set of more basic entities. There is
no fundamental, singular, real entity that the unreal variable equates to, or represents, or maps to
at a conceptual level: unreal variables are just what their mathematical structures imply, “merely
names attached to certain convenient groupings of terms” (MacCorquodale and Meehl 1948, p.
99).
Take for example, an individual’s socio-economic status (SES). The indicators of an
individual’s SES can be defined as comprising education, income, and occupation, and it is
probably not too controversial to suppose that education, income and occupation are real entities
in their own right. However, if we define SES as being a function of education, income, and
occupation, does it make sense to consider SES a real entity, or rather just some summary of the
three defining indicators? In other words, is there a convincing existence proposition for SES
that goes beyond the existence of the defining components?
Simple conceptual thinking will show that it is not possible to provide a convincing
existence proposition for SES, without changing its definition in some way. Without a
convincing existence proposition, SES is simply the combination of the three items. Of course,
one could certainly conceptualize something akin to ‘perception of SES’, that is, the “subjective
evaluations [of others] that confer status [on an individual]”, as Blalock (1975, p. 365) indeed
does. But the latter (real) entity is fundamentally different from the simple combination
of characteristics (unreal) that characterizes the typical SES definition (Borsboom et al. 2003). A
look at the body of existing marketing and business research will turn up many theoretical
entities which might appear in light of the present discussion to be more amenable to an unreal
definition. In LCC we also gave examples, including our advertising expenditure and coercive
power examples (of which more later).
Are formative variables unreal? Reassessing the meaning of the error term
Are formative variables (i.e., those used in formative MIMIC models) conceptually
identical to the unreal variables discussed above? This is important, since we will show that in
order for the formative MIMIC model assumptions to hold, the formative variable must be real.
If it is unreal, the formative MIMIC model cannot be considered a valid tool. To resolve this
issue, we first defer to Diamantopoulos et al. (2008, p. 1205), who define a formative variable
model as one in which “the indicators determine the latent variable which receives its meaning
from the former”, and specify the mathematical structure of the formative variable as a weighted
sumscore, as follows:
(Equation 1)
η = Σᵢ₌₁ⁿ γᵢxᵢ + ζ
where η represents the formed variable, xi are the more fundamental variables (or indicators) that
define η, the γi are the weights which define the contribution of the indicators to the formed
variable, and ζ is a disturbance term. Moreover, in the context of Equation 1, Diamantopoulos et
al. (2008, p. 1211) articulate that “it is not possible to separate the construct's meaning from the
indicators’ content”.
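To make the sumscore structure of Equation 1 concrete, it can be sketched in a few lines of code. This is purely an illustration: the indicator values and the equal weights below are hypothetical, and the disturbance term is omitted, since (as we argue below) once the weights are fixed the composite is fully determined by its indicators.

```python
import numpy as np

# Minimal sketch of Equation 1 with the disturbance term omitted: once
# the weights (gamma) are fixed, the 'latent' eta is a deterministic
# function of its indicators -- a weighted sumscore and nothing more.
# Indicator values and weights below are purely illustrative.

def formative_composite(x, gamma):
    """Compute eta = sum_i gamma_i * x_i for one observation."""
    x = np.asarray(x, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    return float(gamma @ x)

# Hypothetical standardized SES indicators: education, income, occupation
x = [0.5, 1.2, -0.3]
gamma = [1/3, 1/3, 1/3]   # equal, researcher-defined weights

eta = formative_composite(x, gamma)
```

Nothing about eta here transcends the three inputs and the rule for combining them, which is the point at issue.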
At this stage, comparing Diamantopoulos et al.’s (2008) definition of a formative variable
and its associated mathematical structure with the entity realist understanding of what
distinguishes real and unreal variables, it becomes hard to reconcile the notion that the formative
variable described by Equation 1 is anything other than an unreal variable. That is, an unreal
variable is simply a convenient group of variables, combined using some mathematical rule, and
equation 1 appears to say that a formative variable is just a sumscore of variables combined
using some mathematical rules.
However, might it be that the ζ term in Equation (1) somehow imbues the formative
variable with some surplus meaning beyond the indicators that are used to define η? If it does,
this might make the formative variable a real entity. Diamantopoulos (2013), borrowing from
Grace and Bollen (2008), certainly believes this to be the case. In order to provide an answer to
this question, we need to work out the meaning of the ζ term. Let us start by approaching this
question from a purely conceptual perspective (later, we will also consider pragmatic issues of
data availability for a given conceptualized set of indicators).
According to Diamantopoulos (2006, p. 11), “the error term in a formative measurement
model represents the impact of all remaining causes other than those represented by the
indicators included in the model”. Ignoring for now what LCC (along with Rigdon and Howell’s
commentaries) showed was a misuse of the term ‘cause’, let us test this idea by looking once
again at SES. First, we define SES as being a formed variable, comprising income, education and
occupation. Where is the error here? Given that “the indicators determine the latent variable …
[and] its meaning” (Diamantopoulos et al. 2008, p. 1205), it is clear that, at a conceptual level of
analysis, the definition of SES contains no conceptual error: the indicators (income, education
and occupation) define the construct, and so the formative variable perfectly mirrors its defining
factors. According to this definition of SES, the conceptual meaning of SES does not reside at all
outside of (or transcend) the income, education and occupation factors that define it. The logic of
entity realism means that SES is simply a composite of three indicators and is an unreal variable.
In fact, all formative variables defined using a set of explicitly identified factors are unreal, since
there is no error possible in their conceptual definition.
But maybe this is a special case. What about the case where the indicators of the
formative variable are not explicitly identified? Maybe error can occur here, and give the
formative variable surplus meaning? To test this idea, let us redefine SES, so that it is now
potentially a different formative variable: we shall call the new variable SES, but its conceptual
content may not be the same as the previous SES idea (which was simply income, education and
occupation). Specifically, we now define SES as “the set of social and economic factors that
contribute to a person’s social standing”. Here we have not explicitly listed the individual factors
that comprise the set of things that contribute to social standing. But, if we were interested in
actually doing something with the new SES variable (e.g., using it in a model), we would want to
create that list, and that theoretical list, by definition, contains every single social and economic
factor that contributes to one’s social standing. How can it not? So once again, the definition of
SES is error-free, because there are no “remaining” social and economic factors that (a) can sit
outside the list, and (b) can contribute to one’s social standing. As a result, we can conclude that
the conceptualization of the formative SES variable, even when defined more vaguely (without a
priori specifying the individual factors that define the variable), is error-free. More generally,
even if the explicit identities of all of the individual factors comprising the formative variable are
not provided, by definition the formative variable is error free at the conceptual level. This
conclusion might seem very obvious, and one might wonder why we need to verbalize
something so trivial. The reason is that the idea of error in the formative model has been taken to
mean error in the conceptualization of the formative variable. This idea is obviously incorrect.
What, then, is Diamantopoulos (2006, p. 11) referring to when he talks about the
error term representing factors “other than those represented by the indicators included in the
model”? It cannot be a conceptual error as shown above. That leaves only an operational error –
factors which should be empirically captured, but which have not been. If one defines SES as
“the set of social and economic factors that contribute to shaping a person’s social standing”, but
only operationalizes some of those factors, one is committing an error of omission at the
operational level. Operational errors are likely to result in omissions in measurement, and thus
will lead to errors in the calculation of the numerical value of the formative variable. Now,
obviously, operational and measurement errors should not be defining features of a theoretical
variable’s conceptual definition, and so Equation 1’s error term makes little sense. In fact, it
cannot exist as an error of conceptualization. In the context of a larger model, the error term
could also represent the error in prediction of another entity, such as ‘perception of an
individual’s SES’. Such an entity could be conceptualized as a real entity, and the so-called
formative indicators would be causal influences on this entity. However, ‘perception of SES’ is
not the same entity as the SES unreal entity as defined above. Perception of SES could have
many other causes than those conceptualized to form SES itself, and these are quite correctly
represented by an error term. But neither of those interpretations of the error term are
representative of it as ‘surplus meaning’ in the definition of a formative variable. There can be
no surplus conceptual meaning in a formative variable.
As we explain shortly, it is for this reason, among others, that we also suggest that the
traditional formative model diagram (Figure 1) and the error term in Equation 1 are hindering
researchers’ ability to place formative variables in their correct ontological position. In sum,
then, the conclusion we draw from the analysis above is
that: (a) formative variables are defined by the variables chosen to form them, and so they are
always error-free at the conceptual level, and contain no surplus conceptual meaning that
transcends the formative variable’s defining factors, and (b) as a result, all formative variables
are unreal.
THE IMPLICATIONS OF AN UNREAL FORMATIVE VARIABLE
1. A formative variable is not a separate entity from its formative indicators. So far, we
have demonstrated that formative variables are, by definition, and despite the modeling notation
adopted in Equation 1, error free at the conceptual level, and that this piece of information gives
us some confidence in stating that formative variables are unreal. That is, formative variables do
not possess the properties that real variables possess. Formative variables do not actually exist,
independent of their indicators. They are simply convenient groupings of variables, and have no
special meaning that transcends those variables that define them (see also Borsboom et al., 2003;
MacCorquodale and Meehl, 1948).
This means that the first assumption underpinning the formative MIMIC model (see
Table 1) is not valid: that is, a formative variable is not a separate entity from its formative
indicators. It also means that Diamantopoulos’ (2013) claim that “in a formative measurement
model, the latent variable is a separate entity from its indicators” is wrong. As we now show, all
the remaining assumptions outlined in Table 1 are also invalidated.
2. The formative indicators cannot cause the formative variable. The second assumption
underpinning the formative MIMIC model is that the formative indicators in the MIMIC model
have a causal effect on η, the formative variable. Yet, as discussed in LCC, the presence of
causality in a formative MIMIC model would necessitate that the cause and the effect are
separate material entities (e.g., Sosa and Tooley 1993). However, since we have just
demonstrated that formative indicators are not separate entities from the formative variable η, by
implication, it is impossible for the formative indicators to have a causal effect on η.
3. The formative MIMIC model cannot tell us what the relationship is between a
formative indicator and the formative η entity. Indeed, contrary to the notion of causality
between formative indicators and the formative η, it is simply the case that the set of formative
indicators, manipulated according to their defining mathematical structure, is the formative η.
Furthermore, it makes no sense to be ‘seeking’ the mathematical rules for summing the
indicators, because this assumes that the rules themselves have a real existence. That is, if one
wanted to estimate the magnitude of the relationship between a formative indicator and the
formative η, one would have to make the assumption that there is a true value for the relationship
that exists beyond (i.e., transcends) any value that a researcher imposes on that relationship.
Accordingly, it would be incumbent on the person wishing to perform the estimation to
demonstrate that there is a true value to the relationship.
To illustrate, let us return to SES, defined as an individual’s income, educational level,
and occupation. One could model the formative indicators of SES in a MIMIC model. According
to the generally-accepted logic of the formative MIMIC model, by running the model, the
researcher would generate estimates of the contribution that each formative indicator makes to
SES. It is quite possible that income might return a zero relationship with the SES formative η
variable. Yet, before the researcher redefines SES so that it no longer includes income, they
should first explain how there comes to be a true value for the relationship between income and
SES that can be estimated, especially given that, as we have shown, formative entities are simply
convenient groupings of variables, and are unreal. As Borsboom et al. (2003, p. 209) point out,
“Estimation is a realist concept: Roughly speaking, one could say that the idea of estimation is
meaningful only if there is something to be estimated”.
The alternative is that there is no true value that can be estimated for the relationship
between a formative variable and a formative indicator. Rather, statistical estimation
procedures will throw up different values across samples, leading to inconsistencies across
studies (Hardin et al. 2011) – which Wilcox, Howell, and Breivik (2008) point out is exactly the
case in research using SES. The view that we should estimate the relationships between
formative indicators and latents in a MIMIC model requires commitment to the idea that the
conceptual content of SES should be determined by a data set, and the acceptance that SES’s
definition will likely be different from one study to the next. One reason why we should not
estimate formative indicator weightings, therefore, is that if we do, the SES variable (or any
formative variable) cannot be compared across studies, since SES would have a different
meaning in each study. This issue is at the root of Borsboom et al.’s (2003, p. 209)
recommendation that in formative cases, “the term parameter estimation should be replaced by
the term parameter determination”. That is, as LCC suggest, the γ parameters in Equation 1 are
weights that should be predefined by the researcher, not sought by a statistical package (the
reader is referred to the original LCC article for other reasons why pre-specified weights should
be used when constructing formative variables).
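The sample-to-sample instability that motivates ‘parameter determination’ over ‘parameter estimation’ is easy to demonstrate by simulation. The sketch below is entirely hypothetical: weights are ‘estimated’ by an ordinary least-squares regression of a simulated outcome on three indicators, once per sample; the estimated weights differ from sample to sample, whereas researcher-determined fixed weights, being part of the variable’s definition, cannot.

```python
import numpy as np

# Sketch of 'parameter determination' vs 'parameter estimation'
# (Borsboom et al. 2003): if weights are estimated from data, each
# sample yields different weights -- and hence a composite with a
# different meaning. All data below are simulated for illustration.

rng = np.random.default_rng(0)

def estimated_weights(n):
    """'Estimate' indicator weights by OLS regression of a simulated
    outcome y on three simulated indicators."""
    X = rng.normal(size=(n, 3))                          # three indicators
    y = X @ [0.5, 0.3, 0.2] + rng.normal(size=n)         # noisy outcome
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

w1 = estimated_weights(100)          # weights from sample 1
w2 = estimated_weights(100)          # weights from sample 2: different
fixed = np.array([1/3, 1/3, 1/3])    # researcher-determined weights
```

Two studies using w1 and w2 would, on this view, be working with two differently defined composites, whereas the fixed weights define the same variable in every study.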
For Diamantopoulos (2013), one of the troubling consequences of drawing the conclusion
that there is no true value of the indicator weighting to be estimated, and that the weight is part of
the variable definition, is that the recommendations of LCC “no longer address the formative
measurement model… but simply discuss alternative ways of constructing (fixed-weight)
composites”. For some who have grown up under the umbrella of classical test theory, the
conclusion that the formative model is not a measurement model may sit uncomfortably, since it
goes against much of what they have been taught. Perhaps it is for this reason that
Diamantopoulos (2013) is so dismissive of the idea that there can be no causal relationship
between the formative indicators and the formative η in the formative MIMIC model. Indeed
Diamantopoulos (2013) remains staunchly faithful to the notion that the formative MIMIC model
is a measurement model performing a much needed scientific job: “measurement models are
exactly that: theories to be tested (and possibly refuted) with empirical data”. Unfortunately,
although it might satisfy a conditioned desire to test hypotheses about formative indicators and
their relationships with the formative variable, the fact remains that these relationships should
not be estimated: rather, the relationships should be part of the construct definition, and so
cannot be inferred by statistical approaches or MIMIC models.
4. It is not possible for a single variable to have both formative and reflective content-
valid indicators. Advocates of the formative MIMIC model sometimes argue that for any given
MIMIC model, there is no single correct way of interpreting what is being measured and that, in
fact, one can measure a variable using formative or reflective approaches (or even both) if one
wishes. As Diamantopoulos (2013) states, in a MIMIC model with exogenous observable indicators
(the xs) and endogenous observable reflective indicators (the ys), it is possible to view “both the
xs and the ys as content-valid indicators of η, with the xs being formative and the ys being
reflective” (see Figure 2 for a diagrammatic representation of such a model). Similarly, Bollen
(2007, p. 221) asks the reader to “consider a multiple indicator–multiple cause (MIMIC) model
in which there are both causal (formative) and effect (reflective) indicators of a single latent
variable”. Bollen (2007) provides an illustration of such a model, “in which η1 is home value;
the causal (formative) indicators are square footage of house, age, lot size, number of rooms, and
so on; and the effect (reflective) indicators are appraised value, owner estimate, and assessed
value of the home.”
Figure 2 about here
In this interpretation of the formative MIMIC model, if the MIMIC model’s reflective
items are unidimensional (as reflective items should be), and if its formative items are
multidimensional (as formative items should be), and if (as Diamantopoulos suggests) η is a
valid representation of the formative items, and the reflective items are valid representations of
η, then the single entity, η, is simultaneously unidimensional and multidimensional. Such a state
of affairs is a logical impossibility, and so we conclude that a single variable cannot have both
formative and reflective content-valid indicators.
In practice, MIMIC models that claim to ‘measure’ a single construct cannot have
both formative and reflective content-valid indicators of the variable: it is just impossible, so
something else must be going on in the MIMIC model (we provide detailed explanations in
LCC). For instance, in Bollen’s (2007) example of home value, rather than the formative and
reflective indicators being measures of a single “home value” η variable, a more plausible
interpretation is that the reflective indicators measure perceptions of home value, whereas the
MIMIC model’s formative items are best conceptualized, not as variables that define home
value, but rather as exogenous potential causes of home value perceptions (along with many
other factors that are not assessed in the questionnaire). All instances of MIMIC models that
proclaim to simultaneously measure a single entity with both formative and reflective indicators
are making a similar error in that the MIMIC model is just a model in which a common factor is
predicted by some exogenous variables (see also the comments of Howell, and Rigdon, in this
issue). Of course, this is just what the MIMIC model should do – after all it is a Multiple
Indicators, Multiple Causes model. As we have shown above, in LCC, and elsewhere (e.g.
Cadogan and Lee, 2013), formative variables cannot have direct causes (i.e., one cannot have
direct paths influencing a formative η variable – it is a logical impossibility). The MIMIC model
is a fine tool, but only if it is not used to attempt to operationalize a formative variable. That is, the
MIMIC model works as a modeling tool when one does not assume that the exogenous xs in
Figure 2 are formative indicators of η.
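For concreteness, the model in Figure 2 can be written in its conventional form (this is the standard MIMIC specification, as in Bollen (2007); the notation here is ours):

```latex
% Structural part: eta is predicted by the exogenous x's
\eta = \gamma_1 x_1 + \gamma_2 x_2 + \dots + \gamma_q x_q + \zeta

% Measurement part: eta is reflectively measured by the y's
y_j = \lambda_j \eta + \varepsilon_j, \qquad j = 1, \dots, p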
ADDITIONAL ISSUES
We have clarified the logical implications of rejecting the use of the formative MIMIC models
here and in LCC, and in this rejoinder we have also now explicated the logical consequences of
accepting the utility of formative MIMIC models. For us, the natural outcome of this – the
abandonment of formative MIMIC models – is self-evident. Furthermore, the discussion
presented on the distinction between what we called real and unreal entities should clear up
many of the issues raised in the commentaries on LCC.
Beyond these issues, there is also a need to provide some brief responses to a number of
specific points raised by the three commentators. First, although at no point in LCC do we say
that the Advertising Expenditure (AE) measure was developed by Diamantopoulos and
Winklhofer (2001), it appears that Diamantopoulos (2013) got the impression that we did. We
should have made authorship of the scale more obvious, and we should also have been more
explicit when describing the AE variable and its origins. That said, the way that LCC attributed
authorship and discussed operationalization of the original AE measure has no bearing on the
logic being used by LCC to make its point. It is certainly not reasonable to conclude that, as a
result, “the entire discussion of the model discussed in LCC’s Figure 3 becomes rather irrelevant
and potentially misleading” (Diamantopoulos, 2013). Underpinning LCC’s Figure 3 example is
some fundamental logic that can be applied to any formative variable, not just the example used.
If readers follow Diamantopoulos’ implied suggestions, and ignore the Figure 3 example, they
will miss out on an illustration of how a MIMIC model cannot be a formative latent variable
model.
Diamantopoulos’ comment also makes much of the fact that LCC argued that the same
observed items could be considered indicators of two different constructs (access to coercive
tools, and perceived coercive power). However, two key mistakes are made by Diamantopoulos’
comment in this regard. First, LCC makes very clear in its Table 1 that these items, while partly
worded the same, were different in at least two ways: a) the item
stems were different (‘how much capability does your supplier have to take an action’, versus
‘what is your ability to take the action’), and b) the unit of analysis and thus the data source was
different (the buyer rates their supplier, versus a supplier rates themselves). So, quite apart from
anything else, the same items were not used at all, and it is unclear how Diamantopoulos came to
that conclusion.1 That said, some may be surprised to note that it is hardly a new idea that even
the very same item, administered to the very same respondent, and using the same measurement
model (e.g. reflective) could be used to measure more than one construct. Hayduk (1996, see e.g.
p. 25-30) discusses this at length when he demonstrates how a consideration of the error of a
measurement item should be informed by one’s construct definition. Hayduk and Littvay (2012)
elucidate this with an example of how a respondent’s extent of agreement/disagreement with the
statement ‘sex with a consenting adult of the same sex is wrong’ could be defined as a measure
of a) the true extent of agreement/disagreement with the statement ‘sex with a consenting adult
of the same sex is wrong’, b) the respondent’s personal commitment to a traditional reading of a
religious text, or c) the respondent’s parents’ commitment to a traditional reading of a religious
text, as the error variance fixed by the researcher increases. If Hayduk and Littvay (2012) are
right in this respect, then even if Diamantopoulos’ (2013) criticism of our coercive power
example had been correctly targeted, the nature of the criticism would appear to be unfounded.
1 Of course, we assume here that Diamantopoulos’ lack of clarity was unintentional.
Further, we wonder whether Diamantopoulos (2013) really meant that abandoning the
formative model “de facto means that the only auxiliary theory available to researchers would be
the reflective model”? There are a number of problems with this statement. First, it is unclear
where Diamantopoulos saw LCC recommending abandonment of the formative model, since
nowhere in LCC can this be found. Rather, LCC and this rejoinder recommend abandoning the
formative MIMIC model as an operational tool. It also appears wrong to assume that the
only two models available for use by researchers are reflective and formative. Rigdon et al.
(2011, p. 1592-1594) explain the potential harm of this dichotomous view, giving a brief
introduction to the multiple different measurement models available to the social researcher, and
summing the problem up as follows: “we encourage researchers to consider the broad range of
possible relations between measures and constructs. This range is limited only by our ability to
imagine and implement measurement models. Computational advances over the years have
dramatically expanded this potential range, but the scope of our imagination broadens more
slowly,” (p. 1594). For Rigdon et al. (2011), the seemingly ingrained belief that there are only
two measurement models possible, formative and reflective, is an abject failure of imagination.
Diamantopoulos also claims that LCC fails due to what is termed the application of
‘reflective logic’ to issues of formative models, which Diamantopoulos cites Bagozzi (2011) as
suggesting in relation to other articles in the area. While we will not speak for other authors, it is
not clear to us what the ‘reflective logic’ referred to as underpinning LCC could be, and we
believe it is inadvisable to set up what looks like a situation of paradigmatic incommensurability
when it is clearly not the case. In fact, this rejoinder has shown how to differentiate between real
and unreal entities, and how the reflective model can be used with real entities, and the formative
model should be used in the case of unreal entities. We would suggest that, unlike many in the
pro-formative MIMIC modeling camp, we have demonstrably moved beyond the assumption
that all entities must be conceptualized in an entity realist manner which, if we are forced, is the
only thing we can think of which could be called a ‘reflective logic’ (cf. Borsboom et al. 2003).
In fact, in light of the real / unreal distinction, the so-called key difference cited in
Diamantopoulos’ comment of exogenous (reflective) versus endogenous (formative) indicators is
nothing of the sort. In an unreal case (logically the only case where formative models fit), it
makes no sense for indicators to be either endogenous or exogenous. Rather, the indicators are
analogous to pieces of a mosaic: they are simply part of the construct (hence our
recommendation for various composite methods).
Moving to Howell’s comment on LCC, we appreciate his suggestion that the term
MIMIC should, strictly speaking, only refer to models with multiple reflective indicators, rather
than conceptually distinct outcomes. However, we also note that it is not uncommon for the
latent variable in a MIMIC model to be interpreted in terms of the formative items, and for the
reflective variables chosen to estimate the MIMIC model to not be representative of the domain
of the theoretical entity being modeled, but to be more appropriately considered as endogenous
consequences, just as in the AE example given in LCC (for further examples, see amongst others
Diamantopoulos and Siguaw, 2006; Molina-Castillo et al. 2012; Sanchez-Perez and
Iniesta-Bonillo, 2004; Uhrich and Benkenstein, 2010).
Of course, when a real entity is defined, then as pointed out by Howell, there can be
causes conceptualized. However, in many cases these ‘MIMIC-type’ models are used in practice
to operationalize formatively conceptualized variables (such as the examples we gave in LCC),
which does not make sense conceptually. Therefore, like LCC, Howell shows that it is illogical
to use a MIMIC model to operationalize a formative variable. The ‘formative measurement’
issue indeed goes away in the special case explained by Howell (2013) (i.e., when “(1) the latent
variable is interpreted only in terms of the content of its reflective indicators; (2) the Xs are
interpreted as causes, predictors, or covariates (call them what you wish), but not as measures;
and (3) the error term is interpreted as all sources of variation in the reflectively measured latent
variable not included in the model”), but only because a formative variable is not represented
here – as echoed by Rigdon in his comment.
Rigdon in fact passes over much of the detail of LCC, and instead brings his guns to bear
on a number of what he calls “more fundamental issues” concerning “psychological
measurement” (Rigdon 2013). We leave readers to draw their own conclusions on the issues
outside the scope of LCC, and Rigdon provides ample excellent discussion to do so. As for why
we focused on what Rigdon considers a rather unimportant issue, the simple reason is that the
issues covered here and in LCC are problems that are compromising the rigor of research right
now. Whether or not there are more important issues in a conceptual sense to be dealing with, in
a practical sense, researchers at this moment in time are building and testing models which are
empirically incompatible with their conceptualization because of the problems we pointed out in
LCC.
We also agree with Rigdon’s criticism of the terminological confusion currently
obfuscating measurement theory and practice. His suggestions to solve this are a great start, and
have much in common with Bagozzi’s (1984) holistic construal. However, Rigdon’s proxy
variable solution does not take into account the key issue of unreal variables. Could the
representation of an unreal variable (say SES as discussed above) in an empirical model be
considered a proxy? One could argue not, since there is no entity, or ‘conceptual variable’, in
Rigdon’s terms, of SES that is being approximated. As far as empirical models go, it makes
sense to refer to empirical proxies of real entities, but is harder to justify in unreal entity cases.
We would also venture that not all measurement situations can be covered by the term
‘psychological measurement’ as used by Rigdon. For example, concepts such as ‘strategic
planning’, ‘market information dissemination’, and – again – ‘advertising expenditure’, do not
necessarily fall under a psychological measurement banner. As alluded to in LCC, other fields of
research, such as clinical diagnosis, and health economics, exhibit a number of different and
potentially useful modeling methods that do not rely on the assumption of real entities.
Sometimes, a variable is just a collection of indicators (unreal), not a proxy for an unobservable
real entity. In fact, those were the situations in which we argued that a formative approach may
be conceptually justified. Similarly, whether or not we can formulate a convincing existence
proposition for a theoretical entity should not be confused with whether one is using a ‘factor
model’ or not. Factor models are one way of modeling real entities, but there are other models.
The distinction is critical, and it means that Rigdon’s (2013) apparent concern that we lack “the
psychological laws that would enable” the fixing of weights is a little misdirected. As we have
already discussed herein, for an unreal entity, the definitional rules of combining the observed
indicators to create the composite (e.g. the weightings) are sufficient (MacCorquodale and
Meehl, 1948). There are no psychological laws to be discovered in such situations (Borsboom et
al. 2003).
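The point about definitional rules can be sketched very simply. In the illustration below, the indicators and the weights are invented for the sake of the example (they are not drawn from any published SES index); what matters is only that the composite is exhausted by its components plus a stipulated combination rule:

```python
# Hypothetical illustration: an 'unreal' composite is fully defined by its
# components and a fixed combination rule. There is no error term, and no
# psychological law to be discovered about the weights -- they are stipulated.

def ses_composite(income, education_years, occupation_rank,
                  weights=(0.5, 0.3, 0.2)):
    """Weighted-sum composite; the weights are fixed by definition,
    not estimated (the values here are invented for illustration)."""
    w_inc, w_edu, w_occ = weights
    return w_inc * income + w_edu * education_years + w_occ * occupation_rank

# One cannot move the composite without moving at least one component:
base = ses_composite(1.0, 1.0, 1.0)
higher = ses_composite(2.0, 1.0, 1.0)  # only the income component changed
```

On this view, asking what 'causes' the composite, over and above what causes its components, has no answer, because there is nothing else the composite could be.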
Finally, we agree that understanding of causality is a significant problem in measurement
theory and application – as exhibited in the confusion between ‘formative’ and ‘causal’ that LCC
and this rejoinder have repeatedly tried to point out. However, in such cases, while the actual
observed variables and latent proxies for real conceptual entities cannot logically have causal
relationships, it is appropriate to consider an empirical model of theoretical real entities as
representative of an underlying conceptual model with causal content. In a measurement sense,
while the actual path from the ‘real’ conceptual entity may be unobserved, we remain
philosophically and theoretically wed to the idea that in some way the conceptual variable has
some causal impact on the observed variable(s) we are using to measure it (Borsboom, 2005).
We would again say that there is no self-evident reason that we could not also consider it
possible for an observed variable to somehow have a similarly-unobserved causal impact on a
conceptual variable, but even though the observed variable would be exogenous in this case, it is
not the same as calling the model a formative one: as already explained, the exogenous variable
would be an exogenous cause – one entity having a causal relationship with a separate, distinct
other entity.
In fact, we would suggest that greater clarity of the difference between empirical and
conceptual models as Rigdon suggests might also help researchers distinguish when it is right to
talk about causality, and when it is not. For example, in the unreal case, discussion of causality is
not relevant (because there is no underlying conceptual entity with existential properties): as we
show above, what Rigdon might term the empirical proxy is the (unreal) theoretical variable.
Importantly though, Rigdon’s argument that the empirical proxy cannot have causes or effects
remains solid (because it is simply an empirical thing, and not a conceptual entity with
antecedents and consequences). We alluded to exactly this problem in explaining why the
MIMIC model does not work conceptually for the formative case. But perhaps more troublingly,
it also puts researchers who use formative models in a difficult position in terms of placing
such variables in an empirical model. Cadogan and Lee (2013) begin to address this with their
discussion on using formative variables as endogenous, but there is much more to be done here.
CONCLUSIONS: WHERE DO WE GO FROM HERE?
We believe that the present exchange between us, Diamantopoulos, Howell, and Rigdon, has
moved forward thinking on a number of conceptual and practical issues concerning formative
models, and social scientific measurement in general. As such, we are indebted to all parties
(including the editorial team at AMS Review) for the chance to engage in this exercise, and
question our own thoughts and arguments in such an open forum. The ideas presented in LCC
have been moved on in a number of important ways by each of these three comments. However,
despite the many thousands of words that have been expended on this topic, we believe the entire
issue comes down to one seminal point: the distinction between real and unreal entities. If one
accepts that this is a valid distinction in a conceptual sense, then the implications drawn in LCC
and herein should be relatively self-evident. It is only when one begins to conflate the unreal
with the real entity that confusion begins to reign. So, one last time, let us clarify the distinction,
by way of presenting a potential thought process researchers may engage in.
The foundations of scientific research rest on theories which attempt to explain
phenomena (whether these are astronomical events, chemical reactions, human relations, animal
behavior, or anything else). Unless it deals solely with things that can be directly measured2, a
theory at least in part consists of a number of theoretical entities, which are linked together by
way of some (most usually causal) logic. The first port of call in formalizing any scientific
theory beyond mere idle speculation must be to conceptualize those theoretical entities.
2 While we are not experts in medical research, an example of such a theory might be that the number of cigarettes smoked is associated with the probability of death before a certain age. However, it may also be the case that even such a theory could be extended with the inclusion of unobservable mediating processes / variables needing indirect measurement. Furthermore, the notion of cause is itself unobservable. Some authors would also argue that even something as seemingly observable as a probability of death or number of cigarettes is fundamentally unobservable due to measurement error, but such a view would only strengthen our thesis, so it is not necessary to dwell on it here.
Theoretical entities begin as ideas we may have about how to explain phenomena, which
we give names to – such as ‘job satisfaction’, ‘customer orientation’, ‘service quality’,
‘socioeconomic status’, ‘advertising expenditure’ and the like. However, the conceptualization
process should be far more rigorous than simply giving a theoretical idea a name, and the most
important decision at this stage is akin to asking the question ‘is this theoretical idea something
more than just a name?’ In other words, can we define a convincing existence proposition for
this theoretical entity?
While the question of entity realism might initially seem to be hard to answer, in our view,
there are two potential questions that can shed light on this issue:
1. Is it easy to define some ‘class’ of entity it may belong to? For example, could you call it
a type of ‘attitude’, ‘perception’, ‘emotion’, ‘elementary particle’, and so forth? And,
could this theoretical entity exist, if a researcher had not first conceptualized it?
2. Is it easy to define the theoretical entity in a way that does not involve it as some
combination of other more fundamental components? Pragmatically, can you define it in
a way that does not imply for example certain observable items, or subdimensions?
If it is easy to answer ‘yes’ to both these questions, it is likely that one can define it as a real
entity, and create a convincing existence proposition. If not, one should think carefully about
assuming that the theoretical entity is real. However, the answers to such questions are
interlinked rather than sequential. Take the example of SES as an illustration. First, it is not easy
to see exactly ‘what’ SES could be. One could perhaps be rather pedantic, and call it a
‘demographic characteristic’, and argue that such characteristics of course do exist – such as
gender. But, as part of this class of entity, could SES have existed if a researcher had not
conceptualized it first? In our view, the answer would be no. An individual’s SES (note, as
distinct from the perception of SES) is a concept created by interested observers (e.g.
sociologists), and is not an inherent feature of the world around us. Contrast this with a
perception, an emotion, or an elementary particle (e.g. an electron). While our scientific theories
may be wrong, the current understanding is that such things as these are inherent features of the
world around us, whether or not we are there to name and theorize about them. It is harder
(although perhaps not impossible) to convincingly argue the same for SES.
But even if one can debate the answers to the first point (and doing so is in itself a
valuable exercise), it is the answer to the second that will settle the issue. Once one tries to define
SES more formally, one starts to run up against the problem that it is defined as some
combination of other things – income, occupation, education. Defining it in this way then makes
it even harder to answer the first question positively. As such, here, we are clearly dealing with
a case where it is impossible to define a convincing existence proposition without some
redefinition of the very nature of the theoretical entity – in other words, without changing what it
is. We suspect there are many such cases in marketing and social research.
All the central ideas we presented in LCC and in this rejoinder are founded on the
distinction between real and unreal cases. It is our view, and that of others (e.g. Borsboom et al.
2003), that the only logical definition of the formative model is as a way of modeling unreal
cases. If one accepts the arguments that lead to this viewpoint, then the substantive problems we
pointed out in this exchange are of great importance to conducting rigorous research, namely:
1. Using covariance-based MIMIC models does not properly represent the formative case.
2. The definition of a formative variable must include the components and the way they are
combined together (e.g. weightings).
3. Using exogenous indicators to model a real entity is not the same as using formative
components to model an unreal variable.
Avoiding these issues, and maintaining that the ‘formative model’ is analogous to a real entity
with exogenous indicators is, in our view, misleading, and actually harmful to the formative
model. Further, we do not understand why the unreal case – which is essentially analogous to
operationalism – seems to be anathema to so many. We showed in LCC that fields such as
clinical diagnosis and health economics understand, accept, and provide a number of options to
model unreal entities, and they are fundamental parts of theoretical explanations in many other
fields. For example, electrical resistance is not a real entity, being as it is nothing more than the
ratio of voltage across an object to current through it (MacCorquodale and Meehl, 1948).
Another example is momentum, being mass multiplied by velocity. The fact that these variables
are operationally defined does not appear to have impeded their central role in various models
and explanations. However, note a number of key characteristics that exactly mirror our
definition of the unreal entity, and ensure that such cases are useful across studies:
1. The definition includes the components, and the combination rule of those components
(ratio for resistance, multiplication for momentum).3
2. One cannot logically influence the amount of the unreal entity without influencing one or
more of the individual components (Cadogan and Lee, 2013). How can one cause a
change in momentum without changing velocity or mass?
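The momentum example makes point 2 concrete. Because the composite is nothing over and above its combination rule, any change in it decomposes exactly into changes in its components:

```latex
p = m v \quad\Longrightarrow\quad \mathrm{d}p = v\,\mathrm{d}m + m\,\mathrm{d}v
```

There is no residual term here: a change in p that leaves both m and v untouched is not merely unobserved, it is undefined.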
If sciences such as physics and medicine can rely on what we term unreal entities as fundamental
parts of their models, what is the reason they seem so hard to accept in other fields (e.g.,
marketing)? It is our view that the acceptance that formative variables are unreal would do much
to enhance current research practice in marketing and similar domains of study. Furthermore,
clearly defining a formative variable model as being appropriate to the unreal case and not the
real case would remove much of the confusion and misconception regarding the formative model
that abounds in the literature, and assure the formative model of a key place in all good
researchers’ toolkits. We understand Diamantopoulos’ (2013) skepticism of whether we really
believe that formative models are not inferior. However, we consider that our clarification of
when formative models can be used is far more likely to ensure that “formative models are here
to stay”, than the confusion that currently exists.
3 Just imagine trying to use a MIMIC-type model to operationalize mass and velocity as indicators of some latent variable called ‘momentum’, and then using the estimated beta values to determine whether either of the two indicators should be removed: this will illuminate once and for all the absurdity of the MIMIC approach to formative models.
References
Bagozzi, R.P. (1984). A Prospectus for Theory Construction in Marketing. Journal of Marketing,
48 (1), 11-29.
Bagozzi, R.P. (2011). Measurement and Meaning in Information Systems and Organizational
Research: Methodological and Philosophical Foundations. MIS Quarterly, 35 (2), 261-
292.
Blalock, H.M. (1975). The Confounding of Measured and Unmeasured Variables. Sociological
Methods and Research, 3 (4), 355-383.
Bollen, K.A. (2007) Interpretational Confounding is Due to Misspecification, Not Type of
Indicator: Comment on Howell, Breivik, and Wilcox (2007). Psychological Methods, 12
(2), 219-228.
Borsboom, D. (2005). Measuring the Mind: Conceptual Issues in Contemporary Psychometrics.
Cambridge, UK: Cambridge University Press.
Borsboom, D., Mellenbergh, G.J. & van Heerden, J. (2003). The Theoretical Status of Latent
Variables. Psychological Review, 110 (2), 203-219.
Cadogan, J.W., & Lee, N. (2013). Improper Use of Endogenous Formative Variables. Journal of
Business Research, 66, 233-241.
Diamantopoulos, A. (2006). The Error Term in Formative Measurement Models: Interpretation
and Modeling Implications. Journal of Modelling in Management, 1 (1), 7-17.
Diamantopoulos, A. (2013). MIMIC Models and Formative Measurement: Some Thoughts on
Lee, Cadogan, and Chamberlain. Academy of Marketing Science Review. This issue.
Diamantopoulos, A., Riefler, P. & Roth, K.P. (2008). Formative Indicators: Introduction to the
Special Issue. Journal of Business Research, 61 (12), 1203-1218.
Diamantopoulos, A. & Siguaw, J. (2006). Formative versus Reflective Indicators in
Organizational Measure Development: A Comparison and Empirical Illustration. British
Journal of Management, 17, 263-282.
Diamantopoulos, A. & Winklhofer, H.M. (2001). Index Construction with Formative Indicators:
An Alternative to Scale Development. Journal of Marketing Research, 38, 269-277.
Grace, J.B. & Bollen, K.A. (2008). Representing General Theoretical Concepts in Structural
Equation Models: The Role of Composite Variables. Environmental and Ecological
Statistics, 15 (2), 191-213.
Hardin, A.M., Chang, J.C.-J., Fuller, M.A., & Torkzadeh, G. (2011). Formative Measurement
and Academic Research: In Search of Measurement Theory. Educational and
Psychological Measurement, 71 (2), 281-305.
Hayduk, L.A. (1996). LISREL: Issues, Debates and Strategies. Baltimore, Maryland: Johns
Hopkins University Press.
Hayduk L.A., & Littvay, L. (2012). Should Researchers Use Single Indicators, Best Indicators,
or Multiple Indicators in Structural Equation Models? BMC Medical Research
Methodology, 12: 159.
Howell, R. (2013). Conceptual Clarity in Measurement: Constructs, Composites, and Causes: A
Commentary on Lee, Cadogan and Chamberlain (2013). Academy of Marketing Science
Review. This issue.
Lee, N., & Cadogan, J.W. (2013). Problems with Formative and Higher-Order Reflective
Variables. Journal of Business Research, 66, 242-247.
MacCorquodale, K. & Meehl, P.E. (1948). On a Distinction between Hypothetical Constructs
and Intervening Variables. Psychological Review, 55, 95-107.
Molina-Castillo, F.-J., Calantone, R.J., Stanko, M.A., & Munuera-Aleman, J.-L. (2012). Product
Quality as a Formative Index: Evaluating an Alternative Measurement Approach. Journal
of Product Innovation Management.
Rigdon, E.E. (2013). Lee, Cadogan, and Chamberlain: An Excellent Point . . . But What about
that Iceberg? Academy of Marketing Science Review. This issue.
Rigdon, E.E., Preacher, K.J., Lee, N., Howell, R.D., Franke, G. R., & Borsboom, D. (2011).
Avoiding Measurement Dogma: A Response to Rossiter. European Journal of Marketing,
45 (11/12), 1589-1600.
Sanchez-Perez, M. & Iniesta-Bonillo, M.A. (2004). Consumers Felt Commitment Towards
Retailers: Index Development and Validation. Journal of Business and Psychology, 19
(2), 141-159.
Sosa, E. & Tooley, M. (1993). Introduction. In E. Sosa & M. Tooley (Eds.). Causation, Oxford:
Oxford University Press.
Treiblmaier, H., Bentler, P.M., & Mair, P. (2011). Formative Constructs Implemented via
Common Factors. Structural Equation Modeling, 18 (1), 1-17.
Uhrich, S., & Benkenstein, M. (2010). Sport Stadium Atmosphere: Formative and Reflective
Indicators for Operationalizing the Construct. Journal of Sport Management, 24, 211-237.
Wilcox, J.B., Howell, R.D. & Breivik, E. (2008). Questions about Formative Measurement.
Journal of Business Research, 61(12), 1219-1228.
Table 1: Assumptions Underpinning and Associated with the Formative MIMIC Model
(1) In a formative variable model (see Figure 1), where η is the focal latent variable, η
is a separate entity from its formative indicators.
(2) The formative indicators in the MIMIC model have a causal effect on η. That is,
variance in a formative indicator, via a path of causal mechanisms, leads to a change
in η.
(3) MIMIC model estimation can provide information on the magnitude of the
relationship between the formative indicator entities and the η entity.
(4) It is possible for a single variable to have both formative and reflective content-valid
indicators.