
Ordinal preference elicitation methods in health economics and health services research: using discrete choice experiments and ranking methods

Shehzad Ali* and Sarah Ronaldson

York Trials Unit, Department of Health Sciences, University of York, Lower Ground Floor, ARRC Building, Heslington, York YO10 5DD, UK

Introduction: The predominant method of economic evaluation is cost–utility analysis, which uses cardinal preference elicitation methods, including the standard gamble and time trade-off. However, such an approach is not suitable for understanding trade-offs between process attributes, non-health outcomes and health outcomes in order to evaluate current practices, develop new programmes and predict demand for services and products. Ordinal preference elicitation methods, including discrete choice experiments and ranking methods, are therefore commonly used in health economics and health services research.

Areas of agreement: Cardinal methods have been criticized on the grounds of cognitive complexity, difficulty of administration, contamination by risk and preference attitudes, and potential violation of underlying assumptions. Ordinal methods have gained popularity because of reduced cognitive burden, a lower degree of abstract reasoning, reduced measurement error, ease of administration and the ability to use both health and non-health outcomes.

Areas of controversy: The underlying assumptions of ordinal methods may be violated when respondents use cognitive shortcuts, cannot comprehend the ordinal task or interpret attributes and levels, use ‘irrational’ choice behaviour or refuse to trade off certain attributes.

Current use and growing areas: Ordinal methods are commonly used to evaluate preferences for attributes of health services, products, practices, interventions and policies and, more recently, to estimate utility weights.

Areas for on-going research: There is growing research on developing optimal designs, evaluating the rationalization process, using qualitative tools for developing ordinal methods, evaluating consistency with utility theory, appropriate statistical methods for analysis, generalizability of results and comparing ordinal methods against each other and with cardinal measures.

British Medical Bulletin 2012; 1–24
DOI: 10.1093/bmb/lds020

© The Author 2012. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: [email protected]

*Correspondence address. E-mail: shehzad.ali@york.ac.uk

British Medical Bulletin Advance Access published August 2, 2012

Keywords: ordinal preference methods/stated preference/economic evaluation/discrete choice experiments/best-worst scaling/ranking exercises

Accepted: July 4, 2012

Introduction

Health economic evaluation involves valuation of health states, services, products, interventions, practices and policies. The predominant methods of economic evaluation in the literature are cost-effectiveness analysis (CEA) and cost–utility analysis (CUA).1 In the case of CEA, the outcome measure is programme-specific, such as the reduction in blood pressure,2 number of positive cases detected,3 change in asthma episode days4 and gain in life years.5 CEA can be useful when the decision maker is interested in comparing alternatives within a particular field, for instance, cancer screening. However, for a decision maker with a broad health sector mandate, the more appropriate method of economic evaluation is CUA because it uses a generic measure of health outcome that can be compared across programmes.1 The most commonly used outcome measure in CUA is the quality-adjusted life year (QALY), which combines survival and health-related quality of life (HRQoL). The QALY measure is preference-based, i.e. it incorporates the preference or desirability of the general public for the health-related consequences (such as health states) of the alternative programmes being evaluated. The preference elicitation methods commonly used for QALY-based analysis are the standard gamble (SG) and time trade-off (TTO) (see Whitehead and Ali6 for further details on the QALY-based approach to economic evaluation).

While the QALY-based approach is attractive because of its generic nature, it is not suitable for all evaluation studies. For instance, when comparing health-related programmes (in this context, the term ‘programme’ is used in a generic sense to represent health-related services, products, interventions, practices and policies), the traditional QALY approach assumes that consumer preference [i.e. the preference of individuals whose welfare is being evaluated, including patients, service providers, carers, decision makers, the general public and other consumers of health-related programmes] is influenced only by outcomes that have a direct impact on HRQoL and/or survival. This can be a limitation since process characteristics (or attributes) and non-health outcomes can play a crucial role in consumer preference and may be of interest to the decision maker evaluating or developing health-related programmes. Such process attributes may include waiting time to access healthcare,7–9 choice of medical staff,10 choice of service level and type of provider,8,11,12 distance to health facility,13 time of appointment,14 availability of home visits, cleanliness of wards and toilets,15 mode of treatment administration,16 frequency of medication17 and out-of-pocket cost.16,18–20 Similarly, the non-health outcomes that may influence consumer preference may include the impact on education,21 dignity,22 safety and ability of social participation,23 availability of non-health-related services,24 control over daily life22 and impact on family members.25 The traditional QALY-based approach ignores such attributes and does not lend itself directly to establishing trade-offs between health outcomes, non-health outcomes and process attributes. Moreover, in many situations, the QALY is not the direct outcome of interest for comparing health programmes. For instance, a decision maker may be interested in evaluating the relative value of characteristics (or attributes) of health-related programmes (such as attributes of health insurance packages)26,27 to predict the demand for services with different specifications. Here, the QALY is not the direct outcome of interest; rather, it is the relative preference for process and outcome attributes that is of interest (such as waiting time to access health services, type and level of services covered, reimbursement of service fees and level of insurance contribution). Other examples of situations where the QALY-based approach is not appropriate include evaluation of relative preferences for attributes of treatments (such as mode of drug administration, frequency of use, clinical efficacy, side effects and cost),17,28 or establishing willingness-to-pay for healthcare products or services (such as out-of-hours clinics).11 In such situations, alternative preference elicitation methods are required to directly compare attributes of alternative programmes.

The two commonly used alternative preference-based approaches in health economics are discrete choice experiments (DCEs) and ranking methods. These approaches are commonly used to evaluate how individuals value and trade off characteristics of health-related programmes, to predict the demand for services or products with various characteristics and to establish willingness-to-pay for programmes with different characteristics.1,29 Such information may be crucial to decision makers for evaluating current practices and developing new products and services based on relative preference for levels of health, non-health and process attributes. Examples include evaluating patient preference for treatment of diabetes,30 community preference for health service providers,11 consultants’ or service providers’ preference for job characteristics,31–33 public preference for resource allocation decisions in the healthcare system,34 healthcare decision makers’ attitude towards the trade-off between equity and cost-effectiveness,35 community preference for insurance packages,26 preference of the general population for long-term care24 and patients’ willingness-to-pay for alternative services.16,18,19 DCE and ranking studies can be used as stand-alone studies, or as part of full economic evaluations.1,29

The aim of this article is to introduce the concepts and methods of DCEs and ranking studies, and to critically and comprehensively appraise these methods in relation to their use in economic evaluation and health services research. This article is structured as follows: first, we will introduce the concept of preference elicitation methods, followed by a discussion on the need for alternative elicitation methods. Then, we will introduce and critically appraise DCE and ranking studies and discuss the future research frontier for these preference elicitation methods. Further references are provided to direct the interested reader.

What are preference elicitation methods?

Economists assume that individuals prefer one health programme (or health state) over another if they derive more satisfaction (utility) or less dissatisfaction (disutility) from it. For instance, when presented with a choice between two treatments that differ in terms of levels of process and outcome attributes (such as level of efficacy, mode of administration and level of risk of adverse events), individuals choose the alternative that provides them with the greatest utility. While the utility of each choice alternative is not observed by researchers, the choice decisions made by individuals can reveal the underlying utility or value they associate with each alternative. This forms the basis for the use of utility-based preference methods. These methods include ‘stated preference elicitation methods’ and ‘revealed preference methods’. The former ask individuals to state their preference from a set of two or more alternatives in a hypothetical situation. Stated preference elicitation involves trade-offs being made which reveal the value (or utility) that individuals associate with health programmes or attributes of health services and products.

On the other hand, revealed preference methods involve the valuation of alternative programmes by evaluating actual consumer behaviour in real-life market settings, which can in turn reveal the trade-offs that individuals actually made.36 For instance, an analysis of actual market data to evaluate individual choices of dental health plans in real life can reveal how individuals trade off different attributes of health plans to maximize their utility. However, conducting valuations based on revealed preferences is restricted by the actual existing markets for goods and services (or, in our case, health-related programmes); this poses limitations, particularly in the health sector, where open markets for goods and services (such as antibiotics for infections) often do not exist. Also, in most countries with public and private insurance, consumers rarely face market prices for goods and services.37 Moreover, revealed preference methods suffer from problems related to lack of consumer knowledge about products and services (including incomplete disclosure about healthcare products/services) or subsidised prices at the point of use.38 It is also difficult to imagine how revealed preference could be used for the valuation of health states, which are not usually chosen voluntarily. (Individuals usually do not choose their own health state, i.e. there is no market for health states, although individuals may choose a process that leads to a health state, for instance, poor diet.) Finally, revealed preference methods are not useful for evaluating hypothetical scenarios, interventions that are under development or policies for which market data do not exist.39 Hence, ‘stated preference’ methods are widely used in the healthcare field.

Stated preference methods are especially useful when a product, service or policy is not currently available in real life, or where the aim is to evaluate preferences across currently available alternatives. Stated preference methods can help to understand the mechanism of the choice process and the trade-offs that individuals are willing to make; this is done by creating hypothetical choice scenarios with different levels of attributes. (Note that some critics have argued that stated preference methods do not represent actual choice decisions, i.e. individuals may behave differently when faced with actual choice decisions in real-life markets compared with hypothetical scenarios in stated preference exercises.) It should be noted that some studies have combined the revealed and stated preference approaches.40,41 However, most of these studies have been conducted outside of the health economics field.

Stated preference elicitation methods can be understood as either cardinal or ordinal (Fig. 1). Cardinal methods generate preferences in a quantitative form based on answers from respondents, i.e. they give direct estimates of the degree to which one health programme (or health state) is preferred over another.42 The most common cardinal methods include the SG and TTO. During these tasks, individuals trade off between probabilities, uncertainties or risks associated with health states or services.43 The methods are most often used to value health states by asking individuals to trade off years of perfect health (TTO) or risk of immediate death (SG) against remaining in a particular health state for a period of time. Cardinal measures also include the visual analogue scale (VAS), which is a rating exercise to value health status on a scale between 0 and 100. The VAS has been criticized on theoretical and empirical grounds44 and is less frequently used as a valuation tool. In health economics, cardinal methods are primarily used to produce quantitative weights (i.e. utility values) for health states that are subsequently used to calculate QALYs. See Whitehead and Ali6 for a detailed account of cardinal methods in economic evaluation.

Ordinal preference elicitation methods, in contrast to cardinal methods, are concerned with ordering preferences of two or more alternatives without directly establishing the degree of preference of one alternative over the other. Commonly used ordinal preference methods include DCEs and ranking exercises.45 These methods are relatively easier for respondents to understand and also easier to administer compared with the SG and TTO. The methods have sometimes been used to value health states, but more commonly to evaluate healthcare services, products, practices, interventions and policies.46 Ordinal methods can also be used to evaluate willingness-to-pay or willingness-to-accept for a change in the level of an attribute of a service or product.47 Ordinal measures can be transformed into cardinal measures for use in economic evaluation48 or are useful as stand-alone valuation tools in their own right.

Why use alternative preference elicitation methods?

Fig. 1 Simplified classification of preference elicitation methods used in health economics.

The most appropriate valuation method (cardinal or ordinal) is debated; methods differ in relation to their mode of administration, ease of comprehension, underlying assumptions, scale of measurement, nature of trade-offs, use of props and diagrams, time allowed for reflection and individual versus group interviews.45 Different elicitation methods can generate different health valuations or preferences, and there is substantial evidence to suggest that this is the case.43

However, there has been no consensus on the most preferred method.49 Some studies argue that the more commonly used valuation methods, such as the SG or TTO, may not be the most appropriate or feasible in all situations. Brazier and Ratcliffe45 pointed out that the SG, which has often been described as the gold standard for outcome measurement, may in practice violate the underlying theoretical axioms of expected utility theory (EUT), in which the SG is rooted.50,51 (EUT postulates that individuals choose between alternatives to maximize their expected utility. The expected utility of an alternative is calculated by estimating the utility of each possible outcome, multiplying it by its probability of occurrence and summing across outcomes; these expected utilities are then compared to maximize the utility gain.) In addition, the values generated by the SG can be contaminated by risk attitude (i.e. depending on how risk averse individuals are) and loss aversion (i.e. the tendency to strongly prefer avoiding losses to acquiring gains); similarly, the TTO can be contaminated by time preference (i.e. individuals have a different, usually higher, preference for health now over future health), duration effects and possibly the unwillingness of some respondents to sacrifice any of their life expectancy.52 Bleichrodt53 has explored why people may violate expected utility and the direction of bias associated with the SG and TTO methods.

Ordinal methods overcome some of the biases of the SG and TTO methods and produce more consistent responses than the TTO and SG.42 These methods have therefore become increasingly popular in recent years.54,55 They offer advantages in the form of relative ease of comprehension and administration, in addition to greater reliability in terms of reduced measurement error.45 There are practical advantages of ordinal methods over more commonly used methods such as the SG and TTO, including the reduced cognitive burden posed to respondents and a lower degree of abstract reasoning, which are particularly important in settings with limited educational and numeracy attainment.52 Because of this lower complexity, ordinal methods are expected to reflect decision maker preferences more accurately.56 For example, Coast et al.57 point out the questionable appropriateness of complex tasks such as the SG and TTO for an older population group. It is less complex for respondents to state their preference for health state X versus health state Y (ordinal techniques) than to quantify the magnitude of the preference for one scenario over the other (cardinal techniques). Moreover, some cardinal tasks may cause distress to respondents, particularly when there is reference to early death. Finally, cardinal methods are primarily useful for the evaluation of health states; they have found limited use in understanding trade-offs between health and non-health outcomes (or processes), which are crucial to the evaluation of health services and products. A great advantage of ordinal methods is that they allow investigation of such trade-offs. In summary, the concerns over the use of cardinal methods, the difficulty of administration of the SG and TTO methods and the desirable properties of ordinal methods have contributed towards the increasing recent use of ordinal techniques.57

What are ordinal preference elicitation methods?

The main ordinal stated preference methods are summarized below (Fig. 1).

Discrete choice experiments

DCEs ask individuals to choose between two or more alternatives, where each alternative is usually described using two or more attributes (see the Discrete choice experiment section). The alternatives to choose from may include health states, health services, products, practices, interventions or policies. The attributes can be process or outcome attributes and may or may not be health-related (for instance, cure rates and waiting times).

Ranking

A ranking exercise asks individuals to place a set of attributes or alternatives in order (partial or complete), for example from what they consider to be the best to the worst. As in the case of DCEs, the attributes can be health or non-health-related.

Ordered categorical responses

An alternative method of valuation is the use of an ordered categorical response scale. This approach involves defining a set of response alternatives and then asking respondents to rate each health state individually according to how good or bad they consider the value associated with that alternative to be. Hence, no direct comparison is made between alternatives, which distinguishes this approach from ranking. It is relatively less commonly used in health economics; hence, the focus of further discussion will be on the first two approaches.

A useful document providing good practice guidelines for conducting ordinal preference studies is a report produced by the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Good Research Practices for Conjoint Analysis Task Force.58 The document provides an impetus for good quality applied research; it also includes a useful checklist for researchers interested in conducting or appraising ordinal preference methods.

History of ordinal preference elicitation methods

Until recent years, the main purpose of ordinal ranking techniques was to provide a ‘warm-up’ exercise, in order to prepare respondents for the main task of cardinal valuation. However, the use of ordinal preference methods, either as stand-alone exercises or as part of cost-effectiveness or cost–benefit analysis, has become common in recent years. This is partly attributable to the strong methodological foundation that exists for estimating cardinal values from ordinal information, primarily originating in psychology but also commonly applied in areas of environmental economics, transport research, consumer marketing and political science.49 This has facilitated much wider use of ordinal methods in economic evaluation.

Work on ordinal preference elicitation methods can be traced back to the 1920s, when work in the area was pioneered by Thurstone,59 who proposed the law of comparative judgement and the random utility theory (RUT) that laid the foundations of the process of pairwise comparisons. Thurstone’s paired comparison approach was applied to estimate health valuations by Fanshel and Bush.60 Other approaches to modelling ordinal data, such as the Bradley–Terry model,61 have also been proposed. This approach proposed the use of a logistic functional form instead of the normal density function, which facilitated valuation functions to be estimated from ordinal data.45 Kind62 subsequently compared the Thurstone and Bradley–Terry models for scaling health indicators.

Louviere and Woodworth63 pioneered multi-attribute DCEs by proposing the use of experimental design theory to construct choice sets of profiles (or alternatives). This work was based on multi-attribute utility theory and was found to be consistent with the conditional logit models proposed by McFadden.64 More recently, ordinal methods such as DCEs, which have roots in RUT, were introduced to health economics by Propper,65 and advocated by Ryan and colleagues.9,66 It should be noted that in the initial years, the early role of DCEs was to go beyond the conventional QALY-based approach to incorporate process characteristics and non-health outcomes in order to understand the trade-offs between levels of attributes of health services or goods.48 However, in more recent years, it has been recognized that ordinal preference methods can contribute to both CEA and cost–benefit analysis. As a result, the valuation of health states and process outcomes has been investigated in several studies.42,46,67,68 When used for the valuation of health states, ordinal methods may compare health states defined by generic standardized multi-attribute descriptive systems, such as the EQ-5D, SF-6D or the HUI, or by condition-specific vignettes.42,49,52 See Louviere and Lancsar48 for a detailed history of choice experiments in health.

Discrete choice experiment

The DCE is the most common type of ordinal preference method used in health economics and health services research. Carson and Louviere69 defined the DCE as ‘a general preference elicitation approach that asks agents to make choice(s) between two or more discrete alternatives where at least one attribute of the alternative is systematically varied across respondents in such a way that information related to preference parameters of an indirect utility function can be inferred’. Less formally, a DCE can be understood as a survey-based method to evaluate the relative importance of health states or attributes of services, interventions, practices and policies, and to establish the trade-offs that individuals are willing to make between levels of attributes in order to maximize their utility from the available alternatives.70,71 This approach can then estimate the value of given alternatives to the respondents.1

In the literature, several generic terms have been used to refer to DCEs. These include, but are not limited to, conjoint analysis, conjoint experiments, choice experiments, paired comparisons and stated preference methods.46 However, these terms do not always refer to the same approach and there is much debate about their correct use. See Carson and Louviere69 for a detailed discussion on nomenclature of stated preference methods.

The use of DCE in health economics and health services research

The DCE approach has been used in several areas of health economics and health services research. We have provided several examples in the Introduction section of the kind of questions that can be answered using DCEs. Broadly speaking, DCEs can be used to elicit preferences for health-related products and services, to value health outcomes and health states, to quantify trade-offs between attributes, to evaluate non-health outcomes (for instance, GP remuneration packages)72 and to predict expected uptake of policies and products.39 In effect, any research question that requires preference elicitation based on hypothetical choice scenarios can be addressed using DCEs. Bekker-Grob et al.46 found that between 1990 and 2008, there was limited focus on valuing health states using DCE methods. However, more recently, DCEs have also been used to estimate cardinal values on the health utility scale for generic and condition-specific health states.42,73

As mentioned earlier, DCEs are also useful tools for understanding the trade-offs between health and non-health outcomes, which may be difficult to study using the traditional cardinal methods.74 A typical example is the use of cost as one of the attributes in the experiment to investigate the willingness-to-pay (WTP) of service users for attributes of services and interventions.47,75 Finally, interest has been growing in the use of the DCE approach to explore the equity dimension of health and healthcare. One such study is Lancsar et al.,76 which used the DCE method to derive distributional weights for QALY gains using society’s health values. Such an approach can be used to understand equity–efficiency trade-offs.77

Generic and alternative-specific DCEs

For practical purposes, DCEs can be understood as either ‘generic’ or ‘alternative-specific’. A generic DCE involves comparing alternatives that are unlabelled (for instance, comparing healthcare providers A and B that differ only in terms of the specified attributes and levels), while an alternative-specific DCE compares labelled alternatives that have inherent meaning to the respondents (for instance, chiropractor or physiotherapist, both of which are then defined further in terms of, say, waiting time, distance and costs). The advantage of using alternative-specific labels is respondent familiarity with the context, which reduces the cognitive burden; the disadvantage is that respondents may partly ignore the trade-offs between the specified attributes in the DCE and instead focus more on the unspecified attributes of the alternative.78 The generic design is far more common in the health economics and health services literature. Hence, the focus of the remaining part of this section is on generic DCEs. However, the underlying principles of the two designs are the same.


A hypothetical example of DCE

In a typical DCE study, the context of the experiment is described to the respondents at the start of the survey questionnaire. For instance, in a study evaluating patients’ preferences for out-of-hours general practice, the context and setting of the service are first described to the respondents. Based on this context, individuals are presented with one or more choice questions (or choice sets), each with two or more hypothetical alternatives (or scenarios) to choose from. Each alternative can be completely described by a set of attributes. For example, a DCE survey evaluating treatment strategies in primary care for a particular condition may have 10 choice questions, each with alternatives A and B. Each alternative is then defined by three attributes, i.e. cure rate, waiting time and cost to the patient. The alternatives differ in terms of the ‘level’ of one or more attributes. For example, in choice question 1, the hypothetical alternative A may offer a cure rate of 70%, a waiting time of 1 week and a personal cost of £50. Against this, treatment alternative B may offer a better cure rate of 90%, a longer waiting time of 2 weeks and a higher cost of £75. Respondents are then asked to choose between alternatives A and B. This requires them to trade off the better cure rate of treatment B against the shorter waiting time and lower personal cost of treatment A. The process is then repeated for the 10 choice questions in the survey with varying levels of the three attributes. In the process of making these choice decisions, individuals reveal the relative significance (preference) of levels of attributes in their decision-making process to maximize their utility.
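As a simple illustration of how such responses are typically organized for analysis, the sketch below lays out the hypothetical cure rate, waiting time and cost example in ‘long’ format, with one row per alternative per choice question. The column names and the second choice question are illustrative assumptions, not part of any particular study.

```python
# Illustrative sketch only: one way to lay out responses to the hypothetical
# two-alternative DCE above in "long" format (one row per alternative per
# choice question), ready for a choice model. Column names are assumptions.
import pandas as pd

rows = [
    # respondent, question, alternative, cure rate (%), wait (weeks), cost (GBP), chosen?
    (1, 1, "A", 70, 1, 50, 1),   # respondent 1 picked A in question 1
    (1, 1, "B", 90, 2, 75, 0),
    (1, 2, "A", 80, 3, 25, 0),   # hypothetical levels for a second question
    (1, 2, "B", 60, 1, 40, 1),
]
long_data = pd.DataFrame(
    rows,
    columns=["resp_id", "question", "alt", "cure_rate", "wait_weeks", "cost", "chosen"],
)
print(long_data)
```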

The number of attributes and choice scenarios (questions) varies between studies. A recent survey of DCEs found that most DCE studies use 2–6 attributes (but can have more), typically including some of the following: aspects of healthcare products, services and practices, monetary measures, time, level of associated risk or health status domains.46 However, any other attributes that are relevant to the research question can be included in the study. The number of choice scenarios depends on the study design (discussed later). The above-mentioned survey found that most studies used no more than 16 scenarios per respondent. An example of a DCE study from Scott et al.11 is presented in Figure 2.

Designing the DCE

A DCE study involves several stages, including conceptual development, design, conduct and statistical analysis.32,37 The first stage involves defining the research aims of the study and conceptualizing the choice process. For example, in a DCE study of out-of-hours care, Scott et al.11 specified that the aims of the study were to elicit the preferences of the community for different models of out-of-hours care, and to evaluate the relative importance of attributes of out-of-hours care. Once the research objectives are established, the choice process is conceptualized; this involves considering the decision-making context for individuals making the choice, evaluating the alternatives currently available (and those that may become available or are being developed) and investigating the attributes that are likely to be important to users or decision makers and/or those that may be amenable to policy change. This process is likely to involve some or all of the following: literature review; individual and focus group interviews with patients, carers, patient groups, local organizations and the general population; expert clinical opinions; and discussions with policy-makers, senior officials and service providers.79,80 As relevant, the investigators should ensure appropriate representation of opinions, including those from diverse socio-demographic groups. The relevance of alternatives to respondents and the empirical context of products and services should be considered. It is also important to consider whether individuals are familiar with the decision-making context or may require specialized knowledge to be able to make an informed decision. Through this process, important attributes relevant to the decision-making context are selected. Following this, the levels of each attribute are selected based on the range of values or situations that the attribute may take within the context of the research question. This is also informed by the above-mentioned process of literature reviews and qualitative research. It is important to be aware that the number of attributes and levels should be balanced against what is cognitively feasible for respondents to handle in a DCE.81

Fig. 2 Example of a choice scenario (question) from a DCE (source: Scott et al.11 with permission from Elsevier).

Based on the attributes and levels, hypothetical alternatives are produced and grouped into choice questions to form the survey questionnaire. A full factorial survey design would include all possible combinations of attributes and levels to generate all possible alternatives (x^y, where x is the number of levels and y the number of attributes). However, this usually results in a large number of choice questions that would be too tedious for the respondents and also too expensive to administer.32 Hence, experimental design methods are used to create smaller and more manageable sets of alternatives, known as fractional factorial designs. A recent review of DCEs in health economics found that all DCEs conducted between 2001 and 2008 used a fractional factorial design.46 For an efficient design, it is important to aim for an orthogonal and balanced design, i.e. the attributes should be statistically independent of each other, and their levels should occur equally frequently in the survey. It is common practice to use software (including SPEED, SPSS, SAS and Sawtooth) to develop the most efficient designs.
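The following sketch illustrates the profile-generation step under stated assumptions (three hypothetical attributes with three levels each): a full factorial design simply enumerates all x^y combinations, from which design software would then select a smaller, orthogonal and balanced fraction.

```python
# Minimal sketch of the profile-generation step described above, assuming three
# illustrative attributes with three levels each (the attribute names and levels
# are hypothetical, not taken from any particular study). A full factorial design
# enumerates all x^y profiles; specialised software would then select a smaller
# fractional subset that is orthogonal and balanced.
from itertools import product

levels = {
    "cure_rate": [60, 70, 90],   # per cent
    "wait_weeks": [1, 2, 4],
    "cost": [25, 50, 75],        # GBP
}

full_factorial = [dict(zip(levels, combo)) for combo in product(*levels.values())]
print(len(full_factorial))  # 3^3 = 27 candidate profiles
print(full_factorial[0])    # e.g. {'cure_rate': 60, 'wait_weeks': 1, 'cost': 25}
```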

Analysing the DCE

A DCE can be understood in the light of two underlying theories: Lancaster’s theory of value and the RUT. Lancaster’s theory of value82 argues that individuals view a commodity or service as a combination of several component attributes (i.e. its characteristics), such that the total utility (satisfaction or desirability) of a commodity is a function of separate partial utilities associated with the attributes composing the commodity. The RUT argues that the utility value that an individual attaches to an alternative in a choice scenario cannot be observed by a researcher, but it can be summarized by two components: an explainable (or systematic) component and an unexplainable (or random) component.64 The explainable component comprises the attributes of the alternatives and the individual’s characteristics, while the unexplainable component comprises the unobserved factors that influence the choice. It is assumed that respondents know the nature of their utility function. Researchers do not observe this function, but they observe the choices made by individuals. Based on the principle of utility maximization, we assume that an individual will choose alternative A over B in a choice question if the utility derived from A is greater than that from B. This ordinal choice decision is used to derive a cardinal valuation by making the assumption that choices over sets of attributes are related to underlying latent cardinal values that are distributed around mean levels for each item. A choice-based statistical model is then employed to analyse the influence of attribute levels on choice decisions. Subsequently, the relative significance of attributes can be evaluated by taking the ratio of the coefficients of each possible pair of attributes. This provides the marginal rate of substitution (MRS), i.e. the rate at which an individual is likely to give up one level of an attribute in exchange for another attribute.
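To make the link between the RUT decomposition, the choice model and the MRS concrete, the sketch below simulates two-alternative choice data with a random (Gumbel-distributed) utility component, estimates attribute coefficients by maximum likelihood under a conditional logit model and takes a ratio of coefficients to obtain a marginal willingness-to-pay. All attribute names and values are hypothetical assumptions; applied studies would typically use dedicated routines in standard statistical packages.

```python
# Illustrative sketch of the RUT decomposition and the MRS calculation: a
# conditional (McFadden) logit fitted by maximum likelihood to simulated
# two-alternative choice data. All numbers and variable names are hypothetical.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n_choices, true_beta = 2000, np.array([0.04, -0.30, -0.02])  # cure %, wait, cost

# Simulate attribute levels for alternatives A and B and utility-maximizing choices.
x_a = np.column_stack([rng.uniform(50, 95, n_choices),
                       rng.integers(1, 5, n_choices),
                       rng.uniform(20, 80, n_choices)])
x_b = np.column_stack([rng.uniform(50, 95, n_choices),
                       rng.integers(1, 5, n_choices),
                       rng.uniform(20, 80, n_choices)])
random_part = rng.gumbel(size=(n_choices, 2))                 # unexplainable component
utility = np.column_stack([x_a @ true_beta, x_b @ true_beta]) + random_part
chose_a = (utility[:, 0] > utility[:, 1]).astype(float)

def neg_log_likelihood(beta):
    """Conditional logit: with two alternatives, P(choose A) = logistic(V_A - V_B)."""
    v_diff = (x_a - x_b) @ beta
    log_p_a = -np.logaddexp(0.0, -v_diff)   # log P(A)
    log_p_b = -np.logaddexp(0.0, v_diff)    # log P(B)
    return -np.sum(chose_a * log_p_a + (1.0 - chose_a) * log_p_b)

beta_hat = minimize(neg_log_likelihood, x0=np.zeros(3), method="BFGS").x
print("estimated coefficients:", beta_hat.round(3))
# The MRS between two attributes is the ratio of their coefficients; dividing by
# the (negative) cost coefficient gives a marginal willingness-to-pay estimate.
print("WTP per 1% gain in cure rate (GBP):", round(beta_hat[0] / -beta_hat[2], 2))
```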

DCEs are commonly analysed using logit or probit statistical models. A literature review of DCE studies published between 1990 and 2008 found that most studies used McFadden’s multinomial (conditional) logit or random effects probit models.46 More recently, however, mixed logit models, which take account of preference heterogeneity, have gained popularity in this area.83 It is also common practice to evaluate the interaction between respondent-level characteristics and attributes in the DCE in predicting choice decisions. This relationship is explored by introducing interaction terms in the statistical model.

Challenges and limitations of DCEs

Potential methodological limitations of DCEs have been identified. Ryan et al.84 conducted a qualitative think-aloud study to understand the rationalizing process used by respondents while making their choices. They found that many respondents concentrated primarily on one or two attributes to make their choices, as this simplified the task for them. Lloyd43 adds that individuals employ cognitive heuristics (or shortcuts) to reduce the cognitive burden of the task. As a result, they may be less willing to accept trade-offs on the attributes they have singled out to simplify the task. This is formally known as non-compensatory decision-making.71 This in turn has an impact on the validity of the results of the DCE. Hence, understanding the decision-making process is crucial in the development of a DCE.

Another challenge is that respondents may make judgements about the quality of an intervention (above and beyond what is specified in the choice question) based on known attributes or past experiences.84 For instance, if a respondent believes that costlier alternatives tend to have better quality, then they may prefer the more costly options even when all options are qualitatively equivalent in terms of the specified attributes. Respondents may also have difficulty understanding risk information in DCEs (for instance, the risk of an adverse event). In addition, individuals may see events as more likely if they are familiar, and respondents tend to value hazards more highly for other people than for themselves.

Some authors have argued that the conventional methods used to derive welfare measures from DCEs (in particular, willingness-to-pay estimates) may not be consistent with the underlying RUT, and that these could potentially be improved by using alternative methods of deriving welfare measures.85,86 Similarly, Flynn et al.54 argue that when valuing health states using a DCE, some of the underlying assumptions may be violated if respondents refuse to accept that any health state can be worse than death. The stability of individual preferences over time has also been debated in the literature, which can potentially have an impact on the inferences drawn from DCEs.

Another challenge is that respondents may not interpret attributes and levels in the way the researchers intended, which will have an impact on the interpretation of the valuation estimates from DCEs. Quite often, DCEs include one or more ‘consistency checks’ to assess the rationality and internal validity of responses.71 One of these checks is the dominance test, which refers to choice questions in which one of the alternatives is clearly dominant over the other (i.e. at least one attribute in the dominant alternative is better than in the dominated one and no attribute is worse). Individuals who fail this test are often referred to as ‘irrational’. Another check is the test–retest reliability check, which is conducted by repeating a choice scenario later in the survey to check for consistency of responses. While some investigators prefer to drop respondents who fail one or more checks from the analysis, others argue that removal of ‘irrational’ responses may also result in removal of valid responses.84 It has also been argued that ‘irrational’ behaviour may imply that choice behaviour in DCEs is not consistent with utility theory. This is an ongoing area of research. Some studies have also criticized DCEs for being too restrictive in terms of the inclusion of attributes.87 However, despite its limitations and challenges, the DCE is a fast-growing area for applied economists and there will inevitably be further advances in the field in the years to come.
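The logic of the two consistency checks described above can be sketched as follows; the attribute names, the assumed direction of ‘better’ for each attribute and the toy responses are illustrative assumptions only.

```python
# Sketch of the dominance and test-retest checks, applied to a toy set of
# responses. The data layout and the assumption that a higher cure rate, a
# shorter wait and a lower cost are better are for illustration only.
def dominates(a, b):
    """True if profile a is at least as good as b on every attribute and better on one."""
    no_worse = (a["cure_rate"] >= b["cure_rate"] and a["wait_weeks"] <= b["wait_weeks"]
                and a["cost"] <= b["cost"])
    strictly_better = (a["cure_rate"] > b["cure_rate"] or a["wait_weeks"] < b["wait_weeks"]
                       or a["cost"] < b["cost"])
    return no_worse and strictly_better

# Dominance test: a question in which A clearly dominates B; choosing B is flagged.
alt_a = {"cure_rate": 90, "wait_weeks": 1, "cost": 25}
alt_b = {"cure_rate": 70, "wait_weeks": 2, "cost": 50}
respondent_choice = "B"
fails_dominance = dominates(alt_a, alt_b) and respondent_choice != "A"

# Test-retest check: the same scenario repeated later in the survey.
first_answer, repeat_answer = "A", "B"
fails_retest = first_answer != repeat_answer

print("fails dominance test:", fails_dominance)
print("fails test-retest check:", fails_retest)
```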


Ranking exercises

It has been argued that some of the limitations of DCEs can potentially be overcome by using ranking methods. For instance, Flynn et al.54 suggest that comparative inferences about attributes, such as ‘waiting time for an appointment is more important to patients than continuity of care’, cannot be drawn directly from DCEs because DCEs produce comparative estimates for levels of attributes (i.e. scale values) and not for attributes per se. Moreover, some authors argue that the ‘pick one’ nature of the tasks in a DCE may be less efficient than some of the ranking exercises; ranking data can provide more statistical information than choice experiments.54 However, like DCEs, ranking exercises have limitations, some of which are discussed below. Currently, ranking exercises are less commonly used in health economics than DCEs, partly because they are relatively new to the discipline. It should be noted that some authors regard ranking methods as a type of DCE exercise.69 However, for empirical purposes, we prefer to review them separately to distinguish these methods. Two of the more common ranking approaches are the complete ranking exercise and the best-worst scaling (BWS) method.

Complete ranking exercise

The complete ranking exercise is sometimes referred to as the contingent ranking method, especially when cost is one of the attributes varied across alternatives. [Note that the contingent ranking approach is different from the contingent valuation method (not discussed in this article); the latter directly asks individuals about their willingness-to-pay or willingness-to-accept compensation for public goods and services.] The exercise requires respondents to rank order all available alternatives from the most preferred to the least preferred option, providing a complete ordering of all alternatives. Fok et al.88 argue that complete ranking may produce more efficient preference estimation compared with DCEs. However, they point out that the sorting task may be cognitively too complex or time-consuming for respondents, resulting in biased preference estimates. The complexity of the task increases as the number of alternatives increases.69 One potential solution is to take into account the ranking capabilities of individuals by including individual-specific functions in the statistical model that describe the ranking capabilities of respondents. Fok et al.88 claim that such models would perform well when at least some respondents are able to rank more than one alternative. However, the validity of such models needs to be further investigated. Another potential solution is to use only a small number of alternatives that respondents can cognitively manage. For instance, Slothuus et al.89 constructed four cards, each containing five attributes with different levels, to establish the willingness-to-pay for alleviation of rheumatoid arthritis symptoms. Similarly, Ratcliffe et al.42 elicited preferences for 11 sexual health states using a ranking exercise. However, the use of limited alternatives may not always be appropriate and may adversely affect statistical estimation, especially when several attributes and levels are being evaluated.

Another potential problem with complete ranking is that respondents may find it harder to distinguish between less-preferred alternatives and in turn produce biased rankings at the bottom end. To overcome this, Chapman and Staelin90 have suggested using only the first few ranks for estimation. Others have suggested two-stage approaches whereby respondents first exclude the alternatives they do not prefer and then rank order the remaining alternatives for estimation purposes.91 However, there is no consensus on whether these solutions can overcome the limitations of complete ranking exercises. Finally, McCabe et al.68 note that some of the underlying assumptions of complete ranking exercises may be unrealistic, resulting in incorrect inferences.
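One common way of preparing complete rankings for analysis, consistent with the idea of using only the first few ranks, is to ‘explode’ each ranking into a sequence of first choices from successively smaller choice sets. The sketch below illustrates this expansion under assumed alternative labels; it is a generic illustration rather than the specific approach of the studies cited above.

```python
# Illustrative sketch (not the specific method of any study cited above): a
# complete ranking is often "exploded" into a sequence of first choices from
# shrinking choice sets (the rank-ordered logit idea). Keeping only the
# top-ranked choices simply truncates this sequence. Alternative names are
# hypothetical.
def explode_ranking(ranking, max_depth=None):
    """Turn a ranking (best first) into (chosen, remaining choice set) pairs."""
    depth = max_depth or len(ranking) - 1
    pseudo_choices = []
    remaining = list(ranking)
    for _ in range(min(depth, len(ranking) - 1)):
        pseudo_choices.append((remaining[0], tuple(remaining)))  # best of what is left
        remaining = remaining[1:]
    return pseudo_choices

ranking = ["clinic C", "clinic A", "clinic D", "clinic B"]  # most to least preferred
print(explode_ranking(ranking))                  # full explosion: three pseudo-choices
print(explode_ranking(ranking, max_depth=2))     # only the first two ranks used
```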

Best-worst scaling

Another approach to ranking, far more popular than complete ranking in health economics, is the BWS method. This method was developed by Louviere and Woodworth,92 was first used empirically by Finn and Louviere93 and was further developed by Marley et al.94 There are three types of BWS methods: the ‘object case’, the ‘profile case’ and the ‘multi-attribute case’ (see Flynn95 for a detailed discussion with examples). The object case asks respondents to choose the best and worst attributes (without levels). The profile case asks them to choose the best and worst of the attributes with levels, while the multi-attribute case (closest to traditional DCEs) asks them to choose the best and worst profiles (alternatives) from a set of multi-attribute profiles. The type most commonly used in health economics is the profile case BWS. In this method, respondents are presented with one scenario (or profile) at a time. Each scenario has three or more attributes with different levels, and respondents are asked to select the best and worst attribute levels within each scenario, rather than between scenarios as is the case in DCEs. In effect, respondents are asked to pick the pair of attribute levels that are furthest apart on the latent utility scale.54 For instance, Potoglou et al.22 investigated individuals’ social care-related quality of life on nine domains (or attributes), such as food and drink, accommodation, social participation, control over daily life etc. Each attribute had 2–4 levels. In the choice task, the respondents were presented with several scenarios, one at a time, where each scenario included all nine attributes, each at one level. The respondents then traded off between attributes and levels to state the best and worst attribute levels, in turn revealing their underlying utility function. BWS has been used for the valuation of health states and health interventions.7,22,54 This approach can give additional information to that available from a DCE as it uses the utility of a single level of an attribute as a benchmark, rather than an entire scenario.54
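As a simple first-pass summary of profile-case BWS data, analysts sometimes count how often each attribute level is chosen as best minus how often it is chosen as worst; the sketch below illustrates this under hypothetical attribute levels (model-based analysis would instead use choice models, as for DCEs).

```python
# Sketch of a simple descriptive summary sometimes applied to profile-case BWS
# data: best-minus-worst counts for each attribute level. The attribute levels
# and responses below are hypothetical.
from collections import Counter

# Each tuple is one completed scenario: (level picked as best, level picked as worst).
responses = [
    ("control over daily life: high", "social participation: low"),
    ("food and drink: adequate", "social participation: low"),
    ("control over daily life: high", "accommodation: poor"),
]

best_counts = Counter(best for best, _ in responses)
worst_counts = Counter(worst for _, worst in responses)
all_levels = set(best_counts) | set(worst_counts)
bw_scores = {lvl: best_counts[lvl] - worst_counts[lvl] for lvl in all_levels}

for level, score in sorted(bw_scores.items(), key=lambda kv: -kv[1]):
    print(f"{level}: {score:+d}")
```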

Empirical studies using BWS methods in health economics are relatively few in number, but the literature is growing. Recently, Potoglou et al.22 compared DCE and BWS head-to-head in their study (described above). They found that the preference weights from BWS and DCE reveal broadly similar patterns in preferences and, in the majority of cases, the preference weights are not significantly different. However, they note that this broad equivalence may not hold true in all contexts, as warned by Flynn.95 Further developments in the field of BWS are occurring rapidly, which may enable more widespread use of BWS in economic evaluations.

For completeness of discussion, it should be noted that further variations of the BWS approach also exist. One of these is the ‘best-worst discrete choice experiment’, in which individuals are first asked to pick the best and worst options from the attribute levels in a scenario. Once the best and worst are identified, respondents are subsequently asked to choose the best and worst among the remaining attribute levels. This process is repeated until a complete order is established.12 For instance, for scenarios with five attribute levels, it would take two cycles to establish a complete order. The resulting data can then be analysed using sequential multinomial models.

Conclusion

Economic evaluation involves valuation of health services, healthcare products, practices, policies, interventions and health states. In practice, this valuation process relies on utility-based stated preference elicitation methods that can be cardinal (SG and TTO) or ordinal. Cardinal methods are most commonly used to value health states. However, they have been criticized on the grounds of cognitive complexity, difficulty of administration, potential contamination by risk attitudes, gambling effects, risk aversion, time preference and duration effects, and possible violation of the underlying EUT. As a result, ordinal preference methods have gained significant popularity in health economics and health services research in the last two decades. This can be attributed to the reduced cognitive complexity of ordinal preference tasks, the lower degree of abstract reasoning required, reduced measurement error, ease of administration and the ability to use both health and non-health outcomes. Furthermore, ordinal methods overcome some of the potential biases and limitations of the traditional cardinal approaches.

This article has discussed two common ordinal preference elicitation methods: DCEs and ranking methods. These methods have been used to evaluate utilities of health and non-health outcomes, to quantify trade-offs between attributes, to estimate willingness-to-pay and to predict expected uptake of policies and products. The usefulness of ordinal methods for valuation purposes is facilitated by the ability to estimate cardinal values from the ordinal information obtained from DCEs and ranking methods. Ordinal valuation methods can be used as stand-alone studies or they can form part of CEA, CUA or cost–benefit analysis.

This article highlights that, while ordinal methods are very attractive, they are not without methodological challenges. Studies have found that during DCEs, respondents employ cognitive heuristics (or shortcuts) to reduce the cognitive burden of the task and often ignore attributes that are of less interest. Moreover, respondents may not understand the ordinal choice task as intended, may provide irrational responses96,97 or may make assumptions about unspecified attributes beyond the observed choice scenarios. Finally, some authors have challenged the underlying theoretical assumptions of DCEs when used in practice. These challenges may have implications for the validity of the estimates obtained from DCEs.

As an alternative, ranking exercises are also becoming popular for valuation purposes. The most common ranking exercise in health economics is BWS, which identifies the attribute levels that are furthest apart on the latent utility scale. BWS has the advantage of being able to compare attributes (and not only levels, as in a DCE). BWS is relatively new to the discipline and further developments in the field are occurring rapidly.

This article provides a detailed discussion of the methods and challenges of using DCEs and ranking exercises for valuation purposes. These methods have gained significant popularity since their introduction to health economics in 1990. However, this is a growing area of research and further developments can be expected in years to come.


References

1 Drummond MF, Sculpher MJ, Torrance GW. Methods for the Economic Evaluation of Health Care Programmes. USA: Oxford University Press, 2005.
2 Logan AG, Milne B, Achber C et al. Cost-effectiveness of a worksite hypertension treatment program. Hypertension 1981;3:211–8.
3 Hull R, Hirsh J, Sackett DL et al. Cost effectiveness of clinical diagnosis, venography, and noninvasive testing in patients with symptomatic deep-vein thrombosis. N Engl J Med 1981;304:1561–7.
4 Sculpher MJ, Buxton MJ. The episode-free day as a composite measure of effectiveness: an illustrative economic evaluation of formoterol versus salbutamol in asthma therapy. Pharmacoeconomics 1993;4:345.
5 Mark DB, Hlatky MA, Califf RM et al. Cost effectiveness of thrombolytic therapy with tissue plasminogen activator as compared with streptokinase for acute myocardial infarction. N Engl J Med 1995;332:1418–24.
6 Whitehead SJ, Ali S. Health outcomes in economic evaluation: the QALY and utilities. Br Med Bull 2010;96:5–21.
7 Flynn T, Louviere J, Peters T et al. Estimating preferences for a dermatology consultation using best-worst scaling: comparison of various methods of analysis. BMC Med Res Methodol 2008;8:76.
8 Kiiskinen U, Suominen-Taipale AL, Cairns J. Think twice before you book? Modelling the choice of public vs private dentist in a choice experiment. Health Econ 2010;19:670–82.
9 Ryan M. Using conjoint analysis to take account of patient preferences and go beyond health outcomes: an application to in vitro fertilisation. Soc Sci Med 1999;48:535–46.
10 Gerard K, Lattimer V, Turnbull J et al. Reviewing emergency care systems 2: measuring patient preferences using a discrete choice experiment. Emerg Med J 2004;21:692.
11 Scott A, Watson MS, Ross S. Eliciting preferences of the community for out of hours care provided by general practitioners: a stated preference discrete choice experiment. Soc Sci Med 2003;56:803–14.
12 Lancsar E, Louviere JJ. Estimating individual level discrete choice models and welfare measures using best-worst choice experiments and sequential best-worst MNL. University of Technology, Centre for the Study of Choice (Censoc), 2008:1–24.
13 Hjelmgren J, Anell A. Population preferences and choice of primary care models: a discrete choice experiment in Sweden. Health Policy 2007;83:314–22.
14 Rubin G, Bate A, George A et al. Preferences for access to the GP: a discrete choice experiment. Br J Gen Pract 2006;56:743.
15 Hanson K, McPake B, Nakamba P et al. Preferences for hospital quality in Zambia: results from a discrete choice experiment. Health Econ 2005;14:687–701.
16 Doyle S, Lloyd A, Birt J et al. Willingness to pay for obesity pharmacotherapy. Obesity 2012; doi:10.1038/oby.2011.387.
17 King MT, Hall J, Lancsar E et al. Patient preferences for managing asthma: results from a discrete choice experiment. Health Econ 2007;16:703–17.
18 Bech M, Kjaer T, Lauridsen J. Does the number of choice sets matter? Results from a web survey applying a discrete choice experiment. Health Econ 2011;20:273–86.
19 Meenakshi J, Banerji A, Manyong V et al. Using a discrete choice experiment to elicit the demand for a nutritious food: willingness-to-pay for orange maize in rural Zambia. J Health Econ 2012;31:62–71.
20 Benjamin L, Cotte FE, Philippe C et al. Physicians' preferences for prescribing oral and intravenous anticancer drugs: a discrete choice experiment. Eur J Cancer 2012;48:912–20.
21 Kolstad JR. How to make rural jobs more attractive to health workers. Findings from a discrete choice experiment in Tanzania. Health Econ 2011;20:196–211.
22 Potoglou D, Burge P, Flynn T et al. Best-worst scaling vs. discrete choice experiments: an empirical comparison using social care data. Soc Sci Med 2011;72:1717–27.
23 Ryan M, Netten A, Skatun D et al. Using discrete choice experiments to estimate a preference-based measure of outcome—an application to social care for older people. J Health Econ 2006;25:927–44.


24 Nieboer AP, Koolman X, Stolk EA. Preferences for long-term care services: willingness to pay estimates derived from a discrete choice experiment. Soc Sci Med 2010;70:1317–25.
25 Ungar WJ. Challenges in health state valuation in paediatric economic evaluation: are QALYs contraindicated? Pharmacoeconomics 2011;29:641–52.
26 Becker K, Zweifel P. Age and choice in health insurance: evidence from a discrete choice experiment. Patient: Patient-Centered Outcomes Res 2008;1:27–40.
27 Brau R, Lippi Bruni M. Eliciting the demand for long-term care coverage: a discrete choice modelling analysis. Health Econ 2008;17:411–33.
28 Lancsar EJ, Hall JP, King M et al. Using discrete choice experiments to investigate subject preferences for preventive asthma medication. Respirology 2007;12:127–36.
29 McIntosh E, Clarke PM, Frew EJ. Applied Methods of Cost–Benefit Analysis in Health Care. Oxford, United Kingdom: Oxford University Press, 2010.
30 Hauber A, Mohamed A, Johnson F et al. Treatment preferences and medication adherence of people with Type 2 diabetes using oral glucose-lowering agents. Diab Med 2009;26:416–24.
31 Ubach C, Scott A, French F et al. What do hospital consultants value about their jobs? A discrete choice experiment. BMJ 2003;326:1432.
32 Mangham LJ, Hanson K, McPake B. How to do (or not to do) ... Designing a discrete choice experiment for application in a low-income country. Health Policy Plan 2009;24:151.
33 Kruk ME, Johnson JC, Gyakobo M et al. Rural practice preferences among medical students in Ghana: a discrete choice experiment. Bull World Health Organ 2010;88:333–41.
34 Green C, Gerard K. Exploring the social value of health-care interventions: a stated preference discrete choice experiment. Health Econ 2009;18:951–76.
35 Ratcliffe J, Bekker HL, Dolan P et al. Examining the attitudes and preferences of health care decision-makers in relation to access, equity and cost-effectiveness: a discrete choice experiment. Health Policy 2009;90:45–57.
36 Mark TL, Swait J. Using stated preference and revealed preference modeling to evaluate prescribing decisions. Health Econ 2004;13:563–73.
37 Viney R, Lancsar E, Louviere J. Discrete choice experiments to measure consumer preferences for health and healthcare. Exp Rev Pharmacoecon Outcomes Res 2002;2:319–26.
38 Donaldson C, Gerard K, Mitton C et al. Economics of Health Care Financing: The Visible Hand. London: Macmillan, 1993.
39 Lancsar E, Louviere J. Conducting discrete choice experiments to inform healthcare decision making: a user's guide. Pharmacoeconomics 2008;26:661–77.
40 Adamowicz W, Louviere J, Williams M. Combining revealed and stated preference methods for valuing environmental amenities. J Environ Econ Manage 1994;26:271–92.
41 Whitehead JC, Pattanayak SK, Van Houtven GL et al. Combining revealed and stated preference data to estimate the nonmarket value of ecological services: an assessment of the state of the science. J Econ Surv 2008;22:872–908.
42 Ratcliffe J, Brazier J, Tsuchiya A et al. Using DCE and ranking data to estimate cardinal values for health states for deriving a preference-based single index from the sexual quality of life questionnaire. Health Econ 2009;18:1261–76.
43 Lloyd AJ. Threats to the estimation of benefit: are preference elicitation methods accurate? Health Econ 2003;12:393–402.
44 Torrance GW, Feeny D, Furlong W. Visual analog scales. Do they have a role in the measurement of preferences for health states? Med Decis Making 2001;21:329–34.
45 Brazier J, Ratcliffe J. Measuring and Valuing Health Benefits for Economic Evaluation. USA: Oxford University Press, 2007.
46 de Bekker-Grob EW, Ryan M, Gerard K. Discrete choice experiments in health economics: a review of the literature. Health Econ 2012;21:145–72.
47 Ratcliffe J. The use of conjoint analysis to elicit willingness-to-pay values. Int J Technol Assess Health Care 2000;16:270–90.
48 Louviere J, Lancsar E. Choice experiments in health: the good, the bad, the ugly and toward a brighter future. Health Econ Pol Law 2009;4:527–46.
49 Craig BM, Busschbach JJV, Salomon JA. Keep it simple: ranking health states yields values similar to cardinal measurement approaches. J Clin Epidemiol 2009;62:296–305.
50 Hershey JC, Kunreuther HC, Schoemaker PJH. Sources of bias in assessment procedures for utility functions. Manage Sci 1982;936–54.


51 Schoemaker PJH. The expected utility model: its variants, purposes, evidence and limitations. J Econ Lit 1982;529–63.
52 Brazier J, Rowen D, Yang Y et al. Using rank and discrete choice data to estimate health state utility values on the QALY scale. 2009.
53 Bleichrodt H. A new explanation for the difference between time trade-off utilities and standard gamble utilities. Health Econ 2002;11:447–56.
54 Flynn TN, Louviere JJ, Peters TJ et al. Best-worst scaling: what it can do for health care research and how to do it. J Health Econ 2007;26:171–89.
55 Salomon JA. Reconsidering the use of rankings in the valuation of health states: a model for estimating cardinal values from ordinal data. Popul Health Metr 2003;1:12.
56 Moshkovich HM, Mechitov AI, Olson DL. Ordinal judgments in multiattribute decision analysis. Eur J Oper Res 2002;137:625–41.
57 Coast J, Flynn T, Sutton E et al. Investigating Choice Experiments for Preferences of Older People (ICEPOP): evaluative spaces in health economics. J Health Serv Res Policy 2008;13(suppl. 3):31–37.
58 Bridges JFP, Hauber AB, Marshall D et al. Conjoint analysis applications in health—a checklist: a report of the ISPOR Good Research Practices for Conjoint Analysis Task Force. Value Health 2011.
59 Thurstone LL. A law of comparative judgment. Psychol Rev 1927;34:273.
60 Fanshel S, Bush JW. A health-status index and its application to health-services outcomes. Oper Res 1970;1021–66.
61 Bradley RA, Terry ME. Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika 1952;39:324–45.
62 Kind P. A comparison of two models for scaling health indicators. Int J Epidemiol 1982;11:271–75.
63 Louviere JJ, Woodworth G. Design and analysis of simulated consumer choice or allocation experiments: an approach based on aggregate data. J Market Res 1983;350–67.
64 McFadden D. Conditional Logit Analysis of Qualitative Choice Behavior. Berkeley, California, USA: University of California, Institute of Urban and Regional Development, 1973.
65 Propper C. Contingent valuation of time spent on NHS waiting lists. Econ J 1990;100:193–9.
66 Ryan M, Hughes J. Using conjoint analysis to assess women's preferences for miscarriage management. Health Econ 1997;6:261–73.
67 Kind P. Applying paired comparisons models to EQ-5D valuations - deriving TTO utilities from ordinal preference data. EQ-5D Concepts and Methods: A Developmental History. The Netherlands: Springer, 2005:201–20.
68 McCabe C, Brazier J, Gilks P et al. Using rank data to estimate health state utility models. J Health Econ 2006;25:418–31.
69 Carson RT, Louviere JJ. A common nomenclature for stated preference elicitation approaches. Environ Res Econ 2011;49:539–59.
70 Ryan M, Farrar S. Using conjoint analysis to elicit preferences for health care. BMJ 2000;320:1530.
71 Ryan M, Gerard K. Using discrete choice experiments to value health care programmes: current practice and future research reflections. Appl Health Econ Health Policy 2003;2:55–64.
72 Scott A. Eliciting GPs' preferences for pecuniary and non-pecuniary job characteristics. J Health Econ 2001;20:329–47.
73 Bansback N, Brazier J, Tsuchiya A et al. Using a discrete choice experiment to estimate societal health state utility values. HEDS Discussion Paper 10/03. 2010:1–34.
74 Ryan M, Skatun D, Major K. Using discrete choice experiments to go beyond clinical outcomes when evaluating clinical practice. Using Discrete Choice Experiments to Value Health and Health Care. 2008:101–16.
75 Hole AR. A comparison of approaches to estimating confidence intervals for willingness to pay measures. Health Econ 2007;16:827–40.
76 Lancsar E, Wildman J, Donaldson C et al. Deriving distributional weights for QALYs through discrete choice experiments. J Health Econ 2011;30:466–78.


77 Buxton MJ, Chambers JD. What values do the public want their health care systems to use in evaluating technologies? Eur J Health Econ 2011;12:285–8.
78 Amaya-Amaya M, Gerard K, Ryan M. Discrete choice experiments in a nutshell. Using Discrete Choice Experiments to Value Health and Health Care. Dordrecht: Springer, 2008:13–46.
79 Coast J, Horrocks S. Developing attributes and levels for discrete choice experiments using qualitative methods. J Health Serv Res Policy 2007;12:25–30.
80 Coast J, Al-Janabi H, Sutton E et al. Using qualitative methods for attribute development for discrete choice experiments: issues and recommendations. Health Econ 2012;21:730–41.
81 Louviere JJ, Carson R, Pihlens D. Design of discrete choice experiments: a discussion of issues that matter in future applied research. J Choice Model 2011;4:1–8.
82 Lancaster KJ. A new approach to consumer theory. J Polit Econ 1966;74:132–57.
83 Hole AR. Modelling heterogeneity in patients' preferences for the attributes of a general practitioner appointment. J Health Econ 2008;27:1078–94.
84 Ryan M, Watson V, Entwistle V. Rationalising the 'irrational': a think aloud study of discrete choice experiment responses. Health Econ 2009;18:321–36.
85 Lancsar E, Savage E. Deriving welfare measures from discrete choice experiments: inconsistency between current methods and random utility and welfare theory. Health Econ 2004;13:901–37.
86 Santos Silva JMC. Deriving welfare measures in discrete choice experiments: a comment to Lancsar and Savage (2). Health Econ 2004;13:913–8.
87 Lagarde M, Blaauw D. A review of the application and contribution of discrete choice experiments to inform human resources policy interventions. Hum Resour Health 2009;7:62–72.
88 Fok D, Paap R, Van Dijk B. A rank ordered logit model with unobserved heterogeneity in ranking capabilities. J Appl Econom 2012;27:831–46.
89 Slothuus U, Larsen ML, Junker P. The contingent ranking method—a feasible and valid method when eliciting preferences for health care? Soc Sci Med 2002;54:1601–9.
90 Chapman RG, Staelin R. Exploiting rank ordered choice set data within the stochastic utility model. J Market Res 1982;19:288–301.
91 Chiang J, Chib S, Narasimhan C. Markov chain Monte Carlo and models of consideration set and parameter heterogeneity. J Econom 1999;89:223–48.
92 Louviere J, Woodworth G. Best-worst scaling: a model for largest difference judgments. Working paper, Faculty of Business, University of Alberta, 1990.
93 Finn A, Louviere JJ. Determining the appropriate response to evidence of public concern: the case of food safety. J Public Policy Market 1992;11:12–25.
94 Marley AAJ, Flynn TN, Louviere JJ. Probabilistic models of set-dependent and attribute-level best-worst choice. J Math Psychol 2008;52:281–96.
95 Flynn TN. Valuing citizen and patient preferences in health: recent developments in three types of best-worst scaling. Exp Rev Pharmacoecon Outcomes Res 2010;10:259–67.
96 Fischer GW, Hawkins SA. Strategy compatibility, scale compatibility, and the prominence effect. J Exper Psychol Hum Percep Perform 1993;19:580.
97 Lancsar E, Louviere J. Deleting 'irrational' responses from discrete choice experiments: a case of investigating or imposing preferences? Health Econ 2006;15:797–811.
