Regression methods for covariate adjustment and subgroup analysis for non-censored...

HEALTH ECONOMICS

Health Econ. 13: 461–475 (2004)

Published online 15 December 2003 in Wiley InterScience (www.interscience.wiley.com). DOI:10.1002/hec.843

COST-EFFECTIVENESS ANALYSIS

Regression methods for covariate adjustment and subgroup analysis for non-censored cost-effectiveness data

Andrew R. Willan (a, b, *), Andrew H. Briggs (c) and Jeffrey S. Hoch (d)

(a) Program in Population Health Sciences, Hospital for Sick Children, Toronto, Canada
(b) Department of Public Health Sciences, University of Toronto, Canada
(c) Health Economics Research Centre, University of Oxford, UK
(d) Department of Epidemiology and Biostatistics, University of Western Ontario, Canada

Summary

The current interest in undertaking cost-effectiveness analyses alongside clinical trials has led to the increasing availability of patient-level data on both the costs and effectiveness of intervention. In a recent paper, we show how cost-effectiveness analysis can be undertaken in a regression framework. In the current paper we develop a direct regression approach to cost-effectiveness analysis by proposing the use of a system of seemingly unrelated regression equations to provide a more general method for prognostic factor adjustment with emphasis on sub-group analysis. This more general method can be used in either an incremental cost-effectiveness or an incremental net-benefit approach, and does not require that the set of independent variables for costs and effectiveness be the same. Furthermore, the method can exhibit efficiency gains over unrelated ordinary least squares regression. Copyright © 2003 John Wiley & Sons, Ltd.

Keywords cost-effectiveness analysis; regression; clinical trials; covariate adjustment

Introduction

It is becoming increasingly common to conduct prospective cost-effectiveness analyses of medical interventions as an integrated component of a randomized controlled trial. This added dimension to trial-based evaluation has motivated the development of new statistical methodology designed to answer questions of economic policy in addition to clinical questions. Initially attention was focussed on making inference related to estimating the incremental cost-effectiveness ratio (ICER) and calculating the appropriate confidence intervals [1–10]. Most recently, due to the acknowledged statistical problems associated with ratio statistics, attention has shifted to making inference about incremental net benefit (INB) using tests of hypothesis and confidence intervals [11–21]. However, the analysis of INB requires the specification of the willingness-to-pay (WTP) for a unit of effectiveness (denoted as \(\lambda\)), or at the very least, the analysis must be presented as a function of \(\lambda\) so that readers can apply the WTP most appropriate for them.

Regardless of whether an ICER or an INB approach is taken, the between-treatment differences with respect to cost and effectiveness, together with their respective variances and covariance, must be estimated. Concern is often raised regarding the possible existence of patient subgroups, defined by baseline factors, such as age or severity of disease, between which the true cost-effectiveness varies significantly. Recently we have proposed the use of regression methods for the INB directly to adjust the analysis for baseline prognostic factors (covariates) and to allow a robust method for exploring important sub-group effects [22]. In the current paper we further develop a direct regression approach to cost-effectiveness analysis by proposing the use of a system of seemingly unrelated regression equations to provide a more general method for prognostic factor adjustment with emphasis on sub-group analysis. This more general method can be used in either an ICER approach or an INB approach, and does not require that the set of covariates for costs and effectiveness be the same, while still allowing for the formal testing of restrictions across the seemingly unrelated regression models. Furthermore, the method can exhibit efficiency gains over unrelated ordinary least squares regression if different sets of covariates are used in each equation.

Received 26 July 2002; accepted 23 May 2003.

*Correspondence to: 555 University Avenue, Toronto, ON M5G 1X8, Canada. E-mail: [email protected]

The paper is structured as follows. The model and general methodology are outlined next. Then an application to a specific trial, with emphasis on treatment group interactions, is given. Simulation results addressing the issue of using least squares methods for skewed cost data are reported next.

Methods

The model

Consider a two-arm randomized controlled trial in which patients are randomized between Treatment, denoted by T, and Standard, denoted by S. Let \(c_i\) be the total cost for the duration of interest for patient \(i\). Let \(E(c_i) = \nu_j\) if patient \(i\) is randomized to arm \(j\), \(j = T\) or \(S\), and let \(\Delta_c = \nu_T - \nu_S\). Likewise, for patient \(i\), let \(e_i\) be the measure of effectiveness. Let \(E(e_i) = \mu_j\) if patient \(i\) is randomized to arm \(j\), and let \(\Delta_e = \mu_T - \mu_S\). Let \(n_j\), \(j = T\) or \(S\), be the respective sample sizes. We assume that all patients are followed for the duration of interest or until death; i.e. there is no censoring. Whether an ICER or an INB approach is used, a cost-effectiveness analysis requires that the following parameters be estimated: \(\Delta_e\), \(\Delta_c\), \(V(\hat{\Delta}_e)\), \(V(\hat{\Delta}_c)\) and \(\mathrm{Cov}(\hat{\Delta}_e, \hat{\Delta}_c)\), where \(V\) and \(\mathrm{Cov}\) are the variance and covariance functions, respectively, and a hat indicates an estimator. For non-censored data these parameter estimators are functions of the simple statistics of means, variances and covariances [20]. In this paper we provide estimators for the required parameters while adjusting for baseline covariates, such as age, sex and severity of disease.

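For patient \(i\), let

\[
e_i = \Delta_e t_i + \varphi_0 + \varphi_1 u_{1i} + \varphi_2 u_{2i} + \cdots + \varphi_{p_e} u_{p_e i} + \varepsilon_{ei}
\]

and

\[
c_i = \Delta_c t_i + \theta_0 + \theta_1 w_{1i} + \theta_2 w_{2i} + \cdots + \theta_{p_c} w_{p_c i} + \varepsilon_{ci}
\]

where \(t_i = 1\) if patient \(i\) is randomized to Treatment and 0 otherwise; \(u_{1i}, u_{2i}, \ldots, u_{p_e i}\) are the \(p_e\) covariates for effectiveness; \(w_{1i}, w_{2i}, \ldots, w_{p_c i}\) are the \(p_c\) covariates for cost; and the pairs \((\varepsilon_{ei}, \varepsilon_{ci})'\) are i.i.d. with mean \((0, 0)'\) and variance

\[
\Sigma = \begin{pmatrix} \sigma_e^2 & \sigma_{ec} \\ \sigma_{ec} & \sigma_c^2 \end{pmatrix}.
\]

In matrix form, the models can be written as

\[
\mathbf{e} = U\boldsymbol{\upsilon} + \boldsymbol{\varepsilon}_e
\]

where \(\mathbf{e} = (e_1, \ldots, e_{n_T+n_S})'\), \(\boldsymbol{\upsilon} = (\Delta_e, \varphi_0, \varphi_1, \ldots, \varphi_{p_e})'\), \(\boldsymbol{\varepsilon}_e = (\varepsilon_{e1}, \ldots, \varepsilon_{e,n_T+n_S})'\), and

\[
U = \begin{pmatrix}
t_1 & 1 & u_{11} & \cdots & u_{p_e 1} \\
t_2 & 1 & u_{12} & \cdots & u_{p_e 2} \\
\vdots & \vdots & \vdots & & \vdots \\
t_{n_T+n_S} & 1 & u_{1,n_T+n_S} & \cdots & u_{p_e,n_T+n_S}
\end{pmatrix}.
\]

Similarly we define \(\mathbf{c} = W\boldsymbol{\eta} + \boldsymbol{\varepsilon}_c\), where \(\mathbf{c} = (c_1, \ldots, c_{n_T+n_S})'\), \(\boldsymbol{\eta} = (\Delta_c, \theta_0, \theta_1, \ldots, \theta_{p_c})'\), \(\boldsymbol{\varepsilon}_c = (\varepsilon_{c1}, \ldots, \varepsilon_{c,n_T+n_S})'\), and

\[
W = \begin{pmatrix}
t_1 & 1 & w_{11} & \cdots & w_{p_c 1} \\
t_2 & 1 & w_{12} & \cdots & w_{p_c 2} \\
\vdots & \vdots & \vdots & & \vdots \\
t_{n_T+n_S} & 1 & w_{1,n_T+n_S} & \cdots & w_{p_c,n_T+n_S}
\end{pmatrix}.
\]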

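The models for effectiveness and cost can be written as a system of seemingly unrelated regression equations [23] as

\[
\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon} \qquad (1)
\]

where

\[
\mathbf{y} = \begin{pmatrix} \mathbf{e} \\ \mathbf{c} \end{pmatrix}, \quad
X = \begin{pmatrix} U & 0 \\ 0 & W \end{pmatrix}, \quad
\boldsymbol{\beta} = \begin{pmatrix} \boldsymbol{\upsilon} \\ \boldsymbol{\eta} \end{pmatrix}, \quad
\boldsymbol{\varepsilon} = \begin{pmatrix} \boldsymbol{\varepsilon}_e \\ \boldsymbol{\varepsilon}_c \end{pmatrix}
\]

and 0 represents matrices of appropriate dimensions in which all elements are zero. The vector of parameters \(\boldsymbol{\beta}\) can be estimated by the ordinary least squares (OLS) solution given by

\[
\hat{\boldsymbol{\beta}}_{ols} = (X'X)^{-1}X'\mathbf{y}, \qquad (2)
\]

with variance-covariance matrix given by

\[
V(\hat{\boldsymbol{\beta}}_{ols}) = [X'(\Sigma^{-1} \otimes I)X]^{-1}
\]

where \(I\) is the identity matrix of dimension \(n_T + n_S\). The vector of residuals is given by

\[
\hat{\boldsymbol{\varepsilon}} = [I - X(X'X)^{-1}X']\mathbf{y}
\]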

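and the elements of \(\Sigma\) can be consistently estimated [23] by

\[
\hat{\sigma}_e^2 = \sum_{i=1}^{n_T+n_S} \hat{\varepsilon}_i \hat{\varepsilon}_i \Big/ \left[ n_T + n_S - (p_e + 2) \right],
\]

\[
\hat{\sigma}_c^2 = \sum_{i=n_T+n_S+1}^{2(n_T+n_S)} \hat{\varepsilon}_i \hat{\varepsilon}_i \Big/ \left[ n_T + n_S - (p_c + 2) \right]
\]

and

\[
\hat{\sigma}_{ce} = \sum_{i=1}^{n_T+n_S} \hat{\varepsilon}_i \hat{\varepsilon}_{n_T+n_S+i} \Big/ \left[ n_T + n_S - \max_{j=e,c}(p_j + 2) \right].
\]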

If the covariates for effectiveness and cost are the same (i.e. \(U = W\)), then the OLS estimator is the Best Linear Unbiased Estimator (BLUE). If one set of covariates is a subset of the other, then the OLS estimator for the smaller equation is BLUE. In all other situations, depending on the data, efficiency gains are possible by using the generalized least squares (GLS) solution, given by

\[
\hat{\boldsymbol{\beta}}_{gls} = [X'(\Sigma^{-1} \otimes I)X]^{-1} X'(\Sigma^{-1} \otimes I)\mathbf{y}
\]

with variance-covariance matrix given by

\[
V(\hat{\boldsymbol{\beta}}_{gls}) = [X'(\Sigma^{-1} \otimes I)X]^{-1}.
\]

The matrix \(\Sigma\) can be estimated from the OLS solution as shown above. Whether or not the GLS solution provides efficiency gains depends on the data [23]. In general, the efficiency gains increase as the absolute value of \(\rho = \sigma_{ec}/(\sigma_e \sigma_c)\), the correlation between effectiveness and cost, increases. Also, the efficiency gains increase as the independence between the set of regression variables for cost and the set of regression variables for effectiveness increases. Efficiency gains will be reduced if the treatment variable is correlated with the covariates for either cost or effectiveness (confounding) and if the covariates for cost and the covariates for effectiveness are correlated.
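The estimation steps described above (OLS on the stacked system, a residual-based estimate of \(\Sigma\), then GLS) can be sketched in a few lines. The following is a minimal illustration in Python with NumPy, not the authors' implementation; the function name `sur_fit`, the simulated data and every variable name are assumptions made for the example.

```python
import numpy as np


def sur_fit(e, c, U, W):
    """Fit the two-equation system by OLS, then by feasible GLS.

    e, c : outcome vectors of length n (effectiveness, cost)
    U, W : design matrices with p_e + 2 and p_c + 2 columns; in each,
           column 0 is the treatment indicator and column 1 the intercept.
    """
    n = len(e)
    ku, kw = U.shape[1], W.shape[1]

    # Stack the system as y = X b + eps with a block-diagonal X.
    y = np.concatenate([e, c])
    X = np.block([[U, np.zeros((n, kw))],
                  [np.zeros((n, ku)), W]])

    # OLS on the stacked system (same as equation-by-equation OLS).
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ b_ols
    re, rc = resid[:n], resid[n:]

    # Residual-based estimate of Sigma with the divisors used in the text.
    s_ee = re @ re / (n - ku)
    s_cc = rc @ rc / (n - kw)
    s_ec = re @ rc / (n - max(ku, kw))
    sigma = np.array([[s_ee, s_ec], [s_ec, s_cc]])

    # GLS with weight matrix (Sigma^{-1} kron I_n).
    omega_inv = np.kron(np.linalg.inv(sigma), np.eye(n))
    v_gls = np.linalg.inv(X.T @ omega_inv @ X)
    b_gls = v_gls @ X.T @ omega_inv @ y
    return b_ols, b_gls, v_gls, sigma


# Hypothetical simulated trial: different covariates in each equation.
rng = np.random.default_rng(0)
n = 200
t = rng.integers(0, 2, n).astype(float)
age = rng.normal(0.0, 1.0, n)
sev = rng.normal(0.0, 1.0, n)
errs = rng.multivariate_normal([0.0, 0.0], [[1.0, -0.5], [-0.5, 1.0]], n)
e = 1.0 * t + 0.5 * age + errs[:, 0]
c = -2.0 * t + 0.8 * sev + errs[:, 1]
U = np.column_stack([t, np.ones(n), age])
W = np.column_stack([t, np.ones(n), sev])

b_ols, b_gls, v_gls, sigma = sur_fit(e, c, U, W)
# Same covariates in both equations: GLS should collapse to OLS.
b_ols_same, b_gls_same, _, _ = sur_fit(e, c, U, U)
```

With these design matrices, `b_gls[0]` and `b_gls[3]` are the adjusted estimates of \(\Delta_e\) and \(\Delta_c\), and `v_gls[0, 3]` is their estimated covariance. The Kronecker product is formed explicitly for readability; for large trials one would avoid building the full \(2n \times 2n\) matrix.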

Relationship with other models

If the covariates for effectiveness and cost are the same (i.e. \(U = W\); \(p_e = p_c\)), then the OLS estimator, given in Equation (2), is identical to the OLS solution for the multivariate regression model [24] given by

\[
(\mathbf{e}, \mathbf{c}) = U(\boldsymbol{\upsilon}, \boldsymbol{\eta}) + (\boldsymbol{\varepsilon}_e, \boldsymbol{\varepsilon}_c).
\]

Also, consider the approach taken by Hoch et al. [22] in which net benefit is defined for each patient as

\[
b_i(\lambda) = \lambda e_i - c_i
\]

where \(\lambda\) is the willingness-to-pay for a unit of effectiveness. If the covariates for effectiveness and cost are the same, then the OLS estimator, given in Equation (2), is related to the OLS solution for the regression model given by

\[
b_i(\lambda) = \gamma_t t_i + \gamma_0 + \gamma_1 u_{1i} + \cdots + \gamma_{p_e} u_{p_e i} + \varepsilon_i
\]

as follows:

\[
\hat{\gamma}_t = \lambda \hat{\Delta}_e - \hat{\Delta}_c
\]

and

\[
\hat{\gamma}_k = \lambda \hat{\varphi}_k - \hat{\theta}_k, \quad k = 0, 1, \ldots, p_e.
\]
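The relationship between the net-benefit regression and the separate effectiveness and cost regressions follows from the linearity of least squares, and it can be checked numerically. The sketch below uses simulated data; the data-generating values, the helper `ols` and the choice \(\lambda = 1000\) are all assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 120
t = rng.integers(0, 2, n).astype(float)       # treatment indicator
x = rng.normal(size=n)                        # one shared covariate
U = np.column_stack([t, np.ones(n), x])       # same design for both outcomes
e = 0.3 * t + 0.2 * x + rng.normal(size=n)    # simulated effectiveness
c = 500 * t + 100 * x + rng.normal(scale=300, size=n)  # simulated cost


def ols(y, X):
    """Least squares coefficients for y regressed on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


lam = 1000.0
coef_e = ols(e, U)            # (Delta_e, phi_0, phi_1)
coef_c = ols(c, U)            # (Delta_c, theta_0, theta_1)
coef_nb = ols(lam * e - c, U) # (gamma_t, gamma_0, gamma_1)

# gamma_hat equals lam * (effectiveness coefficients) - (cost coefficients)
same = np.allclose(coef_nb, lam * coef_e - coef_c)
```

The equality holds for any \(\lambda\), which is why the net-benefit regression need not be refitted for each willingness-to-pay value when the two covariate sets coincide.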

Cost-effective parameters

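If we define \(\hat{\boldsymbol{\beta}} = \hat{\boldsymbol{\beta}}_{ols}\) or \(\hat{\boldsymbol{\beta}}_{gls}\), depending on which solution is being used, then

\[
\hat{\boldsymbol{\beta}} = (\hat{\Delta}_e, \hat{\varphi}_0, \ldots, \hat{\varphi}_{p_e}, \hat{\Delta}_c, \hat{\theta}_0, \ldots, \hat{\theta}_{p_c})'.
\]

Thus the first element of \(\hat{\boldsymbol{\beta}}\) is the covariate-adjusted estimator of \(\Delta_e\) and the \((p_e + 3)\)th element is the covariate-adjusted estimator of \(\Delta_c\). Let \(v_{mk}\) be the \((m, k)\)th element of \([X'(\hat{\Sigma}^{-1} \otimes I)X]^{-1}\). Then

\[
\hat{V}(\hat{\Delta}_e) = v_{1,1},
\]

\[
\hat{V}(\hat{\Delta}_c) = v_{p_e+3,\,p_e+3}
\]

and

\[
\widehat{\mathrm{Cov}}(\hat{\Delta}_e, \hat{\Delta}_c) = v_{1,\,p_e+3}.
\]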

Incremental cost-effectiveness ratio

The adjusted ICER is estimated by \(\hat{\Delta}_c / \hat{\Delta}_e\), with the Fieller [5,6] solution for the \(100(1 - \alpha)\%\) confidence limits given by

\[
\frac{\hat{\Delta}_c}{\hat{\Delta}_e} \left\{ \frac{ (1 - z_{1-\alpha/2}^2 a_{ec}) \pm z_{1-\alpha/2} \sqrt{ a_e + a_c - 2a_{ec} - z_{1-\alpha/2}^2 (a_e a_c - a_{ec}^2) } }{ 1 - z_{1-\alpha/2}^2 a_e } \right\}
\]

where

\[
a_e = v_{1,1} / \hat{\Delta}_e^2, \quad a_c = v_{p_e+3,\,p_e+3} / \hat{\Delta}_c^2, \quad a_{ec} = v_{1,\,p_e+3} / (\hat{\Delta}_e \hat{\Delta}_c)
\]

and \(z_{1-\alpha/2}\) is the \(100(1 - \alpha/2)\) percentile of a standard normal random variable.
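The Fieller limits are easy to mistype, so a small worked sketch may help. The function below follows the formula in the text; its name, argument names and the decision to raise an error when the limits do not form a bounded interval are assumptions of this example. The inputs in the usage lines are the Model 1 estimates for Whites reported in Table 4 and the Model 2 GLS estimates for African Americans reported in Table 6.

```python
from math import sqrt
from statistics import NormalDist


def fieller_limits(d_e, d_c, v_e, v_c, cov_ec, alpha=0.10):
    """Return (ICER, lower limit, upper limit) using the Fieller solution."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    a_e = v_e / d_e**2
    a_c = v_c / d_c**2
    a_ec = cov_ec / (d_e * d_c)
    denom = 1 - z**2 * a_e
    disc = a_e + a_c - 2 * a_ec - z**2 * (a_e * a_c - a_ec**2)
    if denom <= 0 or disc < 0:
        # In these cases the Fieller set is not a bounded interval.
        raise ValueError("limits do not form a bounded interval")
    centre = 1 - z**2 * a_ec
    half = z * sqrt(disc)
    icer = d_c / d_e
    lims = (icer * (centre - half) / denom, icer * (centre + half) / denom)
    return icer, min(lims), max(lims)


# Whites, Model 1 (Table 4): roughly -640 with 90% limits near -1439 and -278.
icer_w, lo_w, hi_w = fieller_limits(98.1, -62750, 1306, 5.538e8, -3.425e5)

# African Americans, Model 2 GLS (Table 6): limits near -608 and 317.
icer_a, lo_a, hi_a = fieller_limits(52.66, -9528, 303.5, 1.726e8, -8.071e4)
```

Because the published inputs are rounded to four significant digits, the computed limits agree with the reported ones only up to rounding.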

Incremental net benefit

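Letting \(\lambda\) denote the willingness-to-pay for a unit of effectiveness, the adjusted \(\mathrm{INB}(\lambda)\) is estimated by

\[
b_\lambda = \lambda \hat{\Delta}_e - \hat{\Delta}_c
\]

with variance

\[
s_\lambda^2 = \lambda^2 v_{1,1} + v_{p_e+3,\,p_e+3} - 2\lambda v_{1,\,p_e+3}.
\]

The confidence limits for \(\mathrm{INB}(\lambda)\) are given by \(b_\lambda \pm z_{1-\alpha/2} s_\lambda\), and if \(b_\lambda / s_\lambda\) exceeds \(z_{1-\alpha}\), the hypothesis \(\mathrm{INB}(\lambda) = 0\) can be rejected in favour of \(\mathrm{INB}(\lambda) > 0\), at the level \(\alpha\), leading to the conclusion that Treatment is cost-effective compared to Standard.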
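As an illustration of the net-benefit calculation, the sketch below evaluates \(b_\lambda\), its confidence limits and the one-sided p-value for a given \(\lambda\). The function name and argument names are assumptions; the usage lines plug in the Model 1 estimates reported in Table 4, so the results are subject to the rounding of those published values.

```python
from math import sqrt
from statistics import NormalDist


def inb(lam, d_e, d_c, v_e, v_c, cov_ec, alpha=0.10):
    """Return (estimate, lower limit, upper limit, one-sided p-value) for INB(lam)."""
    b = lam * d_e - d_c
    s = sqrt(lam**2 * v_e + v_c - 2 * lam * cov_ec)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_one_sided = 1 - NormalDist().cdf(b / s)   # tests INB <= 0 against INB > 0
    return b, b - z * s, b + z * s, p_one_sided


# African Americans, Model 1 (Table 4).
aa = dict(d_e=35.61, d_c=-5072, v_e=426.7, v_c=1.809e8, cov_ec=-1.119e5)
lower_1000 = inb(1000, **aa)[1]   # below zero
lower_3358 = inb(3358, **aa)[1]   # close to zero, where the lower limit crosses the axis
lower_5000 = inb(5000, **aa)[1]   # above zero

# Whites, Model 1 (Table 4): lower limit is already positive at lam = 0.
lower_w0 = inb(0, d_e=98.1, d_c=-62750, v_e=1306, v_c=5.538e8, cov_ec=-3.425e5)[1]
```

The value of \(\lambda\) at which the lower 90% limit crosses zero coincides with the upper Fieller limit of the ICER, which is why the two approaches lead to the same conclusion at a given willingness-to-pay.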

Subgroup analysis

The regression model allows for sub-group analysis. Suppose investigators were interested in determining whether a certain sub-group, say males for sake of argument, have the same incremental net benefit as females. To accomplish this let \(u_{1i}\) be 1 if the \(i\)th patient is male and 0 if female, and let \(u_{2i}\) equal \(t_i \times u_{1i}\). The covariate \(u_{2i}\) is the interaction between sex and treatment arm. Let \(w_{1i} = u_{1i}\) and \(w_{2i} = u_{2i}\). The parameter of interest is \(\lambda\varphi_2 - \theta_2\). If \(\lambda\hat{\varphi}_2 - \hat{\theta}_2\) is positive, then males (since they are coded 1) are observed to have greater \(\mathrm{INB}(\lambda)\). The opposite is true if \(\lambda\hat{\varphi}_2 - \hat{\theta}_2\) is negative. The hypothesis that \(\mathrm{INB}_{males}(\lambda) = \mathrm{INB}_{females}(\lambda)\) can be rejected at the level \(\alpha\) if

\[
\left| \frac{ \lambda\hat{\varphi}_2 - \hat{\theta}_2 }{ \sqrt{ \lambda^2 v_{4,4} + v_{p_e+6,\,p_e+6} - 2\lambda v_{4,\,p_e+6} } } \right| \text{ exceeds } z_{1-\alpha/2}
\]

where \(p_e\) now is the number of covariates plus the number of interaction terms, i.e. the number of ‘u-terms’ in the regression model, so that \(\hat{\varphi}_2\) is the 4th and \(\hat{\theta}_2\) the \((p_e + 6)\)th element of \(\hat{\boldsymbol{\beta}}\). When testing for interactions both interaction terms, the one for effectiveness and the one for cost, should be included. Observing that \(\hat{\varphi}_2\), \(\hat{\theta}_2\) or both are not statistically significant does not mean that they are known to be zero with 100% certainty, and the additional test of the hypothesis \(\lambda\varphi_2 - \theta_2 = 0\) is required to determine if there is a significant interaction between the variable in question and randomization group.
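The interaction test can be sketched as follows. The function takes the two interaction estimates with their variances and covariance directly, so it does not depend on where they sit in \(\hat{\boldsymbol{\beta}}\). The usage lines use the Model 1 interaction estimates and standard errors reported in Table 3. The covariance between the two interaction estimates is not reported in the paper; here it is assumed to equal \(\hat{\rho}\) times the product of the standard errors, with \(\hat{\rho}\) recovered from Table 4, which is what one expects when both equations share the same covariates. Function and variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def interaction_p_value(lam, phi2, theta2, var_phi2, var_theta2, cov_phi2_theta2):
    """Two-sided p-value for H: lam * phi2 - theta2 = 0 (normal approximation)."""
    est = lam * phi2 - theta2
    se = sqrt(lam**2 * var_phi2 + var_theta2 - 2 * lam * cov_phi2_theta2)
    return 2 * (1 - NormalDist().cdf(abs(est / se)))


# Model 1, race-by-randomization-group interaction (Table 3).
phi2, se_phi2 = -62.48, 41.63        # effectiveness interaction and its SE
theta2, se_theta2 = 57680.0, 27100.0  # cost interaction and its SE

# Assumption: common correlation between paired coefficients, taken from
# the Table 4 estimates for Whites.
rho = -3.425e5 / sqrt(1306 * 5.538e8)
cov = rho * se_phi2 * se_theta2

p_values = {
    lam: interaction_p_value(lam, phi2, theta2, se_phi2**2, se_theta2**2, cov)
    for lam in (0, 50, 100, 500)
}
```

Under these assumptions the computed p-values are close to those in Table 7 (about 0.033 at \(\lambda = 0\) and about 0.030 at \(\lambda = 50\)); small differences reflect the rounding of the published inputs.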

Example

The randomized trial

The Program in Assertive Community Treatment (PACT) is one of the most studied models of care for persons with severe and persistent mental illnesses (SPMI) [25–28]. Lehman et al. [29] found that an assertive community treatment (ACT) program, relative to usual community services, reduced psychiatric inpatient days, emergency room visits, days homeless, and days in jail for homeless persons with SPMI in Baltimore, Maryland (USA). The study’s rationale was that by providing potentially more expensive but coordinated, community-based care through the ACT program, homeless persons with severe mental illnesses would spend more days in stable community housing, with savings realized by shifting the patterns of care from higher cost crisis-oriented inpatient and emergency services to lower cost, ongoing ambulatory services. The results suggest that in the city of Baltimore, ACT was effective in achieving important outcomes warranting an examination of the cost-effect trade-off. Lehman et al. [30] conducted an economic evaluation of the ACT program as it was implemented.



Methods and data

Direct treatment costs across the one-year intervention period were examined from the perspective of the state mental health authority. Housing status was chosen as the main effectiveness measure because of its established validity as a primary outcome for homeless persons with SPMI [31]. A day of stable housing was defined as living in a non-institutionalized setting not intended to serve the homeless (e.g. independent housing, living with family, etc.). Subjects randomized to the usual care condition had access to services usually available to homeless persons in the city of Baltimore. Lehman et al. [30] offer more details about the study’s methodology.

One hundred forty-eight persons who were homeless with SPMI were randomized to either the experimental ACT program or to usual community services. Subjects were recruited during a 19-month period in 1991 and 1992 from inner-city psychiatric hospitals, primary health care agencies, shelters, missions and soup kitchens. Baseline data collection included an assessment of overall mental health functioning using the Global Assessment of Functioning (GAF) Scale [32]. For this paper, we obtained complete data on 73 participants randomly assigned to the ACT program and 72 randomly assigned to usual care (comparison) services.

Results

Baseline group comparisons examined differences between the two intervention groups on demographics, diagnoses and histories of homelessness at baseline [30]. Table 1 presents an abbreviated set of results. The two groups appear comparable with respect to age and GAF scores. In contrast, there was a greater than expected percentage of African Americans randomized to the comparison condition (p < 0.01). However, as pointed out by Altman [33], the confounding effect of a baseline covariate has more to do with the magnitude of the between-group difference and the magnitude of its effect on the outcome variable, rather than with the statistical significance of the between-group difference. Consequently, it is advisable to use a regression model to examine the effects of covariates that are suspected of affecting the outcome.

Table 2 provides a brief statistical summary of the cost and effect data and provides a conventional cost-effectiveness analysis of the data by looking at the incremental costs and effects between the two groups. ACT subjects had lower costs and more days of stable housing, suggesting this was the dominant treatment. Due to the significant difference in the subjects between treatment arms with respect to race, a stratified analysis is also reported in Table 2. The ACT program is observed to have a greater effect among White subjects, in whom there is a larger decrease in average cost ($62 700 vs $5 070) and a larger increase in average effect (98.1 vs 35.6 days).

Table 1. Subject characteristics by treatment group

| Characteristic | ACT (a) subjects (n = 73) | Comparison subjects (n = 72) |
|---|---|---|
| Age (years): mean (SD) | 39.0 (9.43) | 36.0 (8.30) |
| GAF (b) score: mean (SD) | 37.9 (9.08) | 35.3 (9.06) |
| African American* | 62% | 83% |

(a) Assertive community treatment. (b) Global assessment of functioning. *p < 0.01.

Table 2. Simple statistics from economic analysis

| | Mean | SE = \(\sqrt{\mathrm{Var(mean)}}\) |
|---|---|---|
| ACT subjects: cost (a) | 51 880 | 7 156 |
| ACT subjects: effectiveness (b) | 211.9 | 12.22 |
| Comparison subjects: cost | 67 400 | 9 020 |
| Comparison subjects: effectiveness | 159.2 | 12.41 |
| All subjects: cost difference | −15 520 | 11 500 |
| All subjects: effectiveness difference | 52.67 | 17.42 |
| Stratified analysis, African American subjects: cost difference | −5 072 | 13 110 |
| Stratified analysis, African American subjects: effectiveness difference | 35.61 | 21.34 |
| Stratified analysis, White subjects: cost difference | −62 750 | 25 070 |
| Stratified analysis, White subjects: effectiveness difference | 98.1 | 32.68 |

(a) In US dollars. (b) In days of stable housing. All values to four significant digits.

Table 3. Parameter estimates for regression Model 1

| Coefficient [standard error] (2-sided p-value) | Effectiveness | Cost |
|---|---|---|
| Randomization group | 98.10 [36.14] (0.006643) | −62 750 [23 530] (0.007665) |
| Intercept | 132.7 [30.24] (0.00001154) | 112 200 [19 690] (1.193 × 10^−8) |
| Race | 31.92 [33.12] (0.3353) | −53 810 [21 570] (0.01260) |
| Race × randomization group | −62.48 [41.63] (0.1334) | 57 680 [27 100] (0.03334) |

Note: All values to four significant digits.

This observation, together with the fact that there was a disproportionate number of African Americans in the usual care group, was the motivation for fitting a regression model including an indicator variable for race and its interaction with randomization group (Model 1). The indicator for race was 1 for African Americans, 0 otherwise, and the indicator for randomization group was 1 for the ACT group, 0 otherwise. The results are shown in Table 3. Because the coefficients for race and its interaction were significant for cost, there is some evidence for concluding that the treatment effect on cost depends on race, the implication being that a separate cost-effectiveness analysis is required for each race. The cost-effectiveness parameter estimates for each race are shown in Table 4. The main effect parameter estimates from regression Model 1 are those applicable to Whites because they are coded 0. The simplest way to get the parameter estimates for African Americans is to reverse the coding and rerun the regression procedure. From the parameter estimates in Table 4, the estimate for the ICER for African Americans is −142 with 90% confidence limits −1213 and 3358; the corresponding values for Whites are −640, −1439 and −278. Since the confidence interval falls entirely within the dominant quadrant for Whites, it can be concluded that the treatment is cost-effective for them regardless of the willingness-to-pay for a homeless day. However, for African Americans, although treatment is observed to dominate, the upper confidence limit of $3358 per homeless day implies that conclusive evidence for cost-effectiveness is limited to willingness-to-pay values in excess of this value. These conclusions can be seen also in the INB plots in Figure 1. For clarity, and because they are of secondary interest, the upper limits of INB are not shown. Since treatment is observed to dominate in both racial groups, the estimate of INB(\(\lambda\)) is positive regardless of \(\lambda\). For Whites the lower limit is always positive, implying that the hypothesis INB(\(\lambda\)) ≤ 0 can be rejected for any value of \(\lambda\). For African Americans the lower limit crosses the horizontal axis at $3358, and the hypothesis INB(\(\lambda\)) ≤ 0 cannot be rejected for \(\lambda\) ≤ 3358 at the 5% level. (The use of 90% confidence intervals provides for 5%, one-sided tests of hypotheses.)
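Reversing the coding and rerunning the regression is equivalent to adding the interaction coefficient to the treatment coefficient, with the variance of the sum obtained from the coefficient covariance matrix. The sketch below checks this on simulated data for a single outcome; the data, the helper `fit` and the variable names are assumptions for illustration and do not use the trial data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 145
t = rng.integers(0, 2, n).astype(float)   # 1 = treatment arm
g = rng.integers(0, 2, n).astype(float)   # 1 = subgroup of interest
y = 100 + 90 * t + 30 * g - 60 * t * g + rng.normal(scale=50, size=n)


def fit(group):
    """OLS with columns (treatment, intercept, group, treatment x group)."""
    X = np.column_stack([t, np.ones(n), group, t * group])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (n - X.shape[1])
    return beta, s2 * xtx_inv


b, v = fit(g)              # treatment coefficient applies to the group coded 0
b_rev, v_rev = fit(1 - g)  # reversed coding: it now applies to g == 1

effect_g1 = b[0] + b[3]                       # main effect plus interaction
var_g1 = v[0, 0] + v[3, 3] + 2 * v[0, 3]      # variance of that sum
```

In Table 3, for example, 98.10 + (−62.48) = 35.62 and −62 750 + 57 680 = −5 070, which match the African American estimates in Table 4 up to rounding.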

Figure 1. Incremental net benefit and lower 90% confidence limit versus lambda, Model 1

Table 4. Cost-effectiveness parameter estimates for regression Model 1

| | African Americans | Whites |
|---|---|---|
| \(\hat{\Delta}_e\) | 35.61 | 98.1 |
| \(\hat{\Delta}_c\) | −5 072 | −62 750 |
| \(\hat{V}(\hat{\Delta}_e)\) | 426.7 | 1306 |
| \(\hat{V}(\hat{\Delta}_c)\) | 1.809 × 10^8 | 5.538 × 10^8 |
| \(\widehat{\mathrm{Cov}}(\hat{\Delta}_e, \hat{\Delta}_c)\) | −1.119 × 10^5 | −3.425 × 10^5 |

Note: All values to four significant digits.

Since the terms for race and race by treatment are not significant for effectiveness, one may be tempted to fit a model without those two terms (Model 2). The results are shown in Table 5 for OLS and GLS. The standard errors for the parameters for costs are slightly smaller for GLS, implying relatively modest efficiency gains. The efficiency gains are modest, even though the absolute value of the estimated \(\rho\) of −0.40 is large, because the set of regression variables for cost and the set of regression variables for effectiveness are correlated. Firstly, for obvious reasons, the treatment variable is in both sets. Secondly, race, which is part of the regression equation for cost, but not effectiveness, is confounded with the treatment variable (see Table 1). Efficiency gains will often be modest in this context, and an important motivation for using the method of seemingly unrelated regression equations is to provide an estimate of \(\mathrm{Cov}(\hat{\Delta}_e, \hat{\Delta}_c)\), which is required for statistical inference in cost-effectiveness analyses.

Table 5. Parameter estimates for regression Model 2

| Coefficient [SE] (2-sided p-value) | Effectiveness (OLS/GLS) | Cost (OLS) | Cost (GLS) |
|---|---|---|---|
| Randomization group | 52.66 [17.42] (0.002508) | −62 750 [22 020] (0.00438) | −50 870 [22 000] (0.02079) |
| Intercept | 159.2 [12.36] (0.00001154) | 112 200 [18 310] (8.866 × 10^−10) | 105 300 [18 290] (8.658 × 10^−9) |
| Race | | −53 810 [19 750] (0.006432) | −45 460 [19 710] (0.02110) |
| Race × randomization group | | 57 680 [24 820] (0.02012) | 41 340 [24 780] (0.09520) |

Table 6. GLS cost-effectiveness parameter estimates for regression Model 2

| | African Americans | Whites |
|---|---|---|
| \(\hat{\Delta}_e\) | 52.66 | 52.66 |
| \(\hat{\Delta}_c\) | −9 528 | −50 870 |
| \(\hat{V}(\hat{\Delta}_e)\) | 303.5 | 303.5 |
| \(\hat{V}(\hat{\Delta}_c)\) | 1.726 × 10^8 | 4.841 × 10^8 |
| \(\widehat{\mathrm{Cov}}(\hat{\Delta}_e, \hat{\Delta}_c)\) | −8.071 × 10^4 | −8.071 × 10^4 |

The cost-effectiveness parameter estimates using GLS for this model are shown in Table 6. From these estimates, the estimate for the ICER for African Americans is −180.9 with 90% confidence limits −608 and 317; the corresponding values for Whites are −966, −2227 and −294. The INB plots for Model 2 can be found in Figure 2. As with Model 1, the confidence interval falls entirely within the dominant quadrant for Whites, and it can be concluded that the treatment is cost-effective for them regardless of the willingness-to-pay for a homeless day. For African Americans, the upper confidence limit of $317 per homeless day implies that conclusive evidence for cost-effectiveness is limited to willingness-to-pay values in excess of this value. However, this threshold value for the willingness-to-pay is less than a tenth of the value for Model 1. Consequently, the health policy implications for these data are very sensitive to the inclusion of two non-significant parameters in the regression model. For a willingness-to-pay of $500, there is no conclusive evidence for cost-effectiveness for African Americans using Model 1; however, there is using Model 2. The question is which model should be used. As discussed later, when testing interactions to identify sub-groups for which the INB differs significantly, the appropriate covariates and their interactions for both effectiveness and cost must be included. That is, the relevant test of hypothesis is \(\lambda\varphi_2 - \theta_2 = 0\). The fact that \(\hat{\varphi}_2\) is not significantly different from zero does not mean that we know it is zero with 100% confidence. The p-values for various values of \(\lambda\), using the parameter estimates from Model 1, are given in Table 7.

Finally, a model was fitted considering all three covariates (Model 3). The results are presented in Table 8. The negative coefficients for the interaction terms for effectiveness indicate that the benefit of ACT was observed to be greater for Whites, younger subjects and subjects with lower GAF scores. On the cost side, the coefficients for the interaction terms are all positive, indicating that the reduction in costs from ACT was observed to be greater for Whites, younger subjects and subjects with lower GAF scores. Taken together, the signs of the interaction terms indicate that for any given \(\lambda\) the INB(\(\lambda\)) is observed to be greater for Whites, younger subjects and subjects with lower GAF scores. The p-values for the test of hypothesis for the INB(\(\lambda\)) interactions are given in Table 9. For many plausible values of \(\lambda\) there is a significant interaction between randomization group and each of the three covariates. Note that for GAF, neither \(\hat{\varphi}_6\) nor \(\hat{\theta}_6\) is significant at the 5% level. However, for many values of \(\lambda\), \(\lambda\hat{\varphi}_6 - \hat{\theta}_6\) is significant. The presence of the interactions makes summarizing the results difficult. Even if age and GAF scores were dichotomized, eight INB(\(\lambda\)) plots would be required to illustrate the results.

Figure 2. Incremental net benefit and lower 90% confidence limit versus lambda, Model 2

Table 7. P-values as a function of \(\lambda\) for testing H: \(\lambda\varphi_2 - \theta_2 = 0\) in regression Model 1

| \(\lambda\) | p-value |
|---|---|
| 0 (costs) | 0.03334 |
| 50 | 0.02994 |
| 100 | 0.02767 |
| 200 | 0.02541 |
| 500 | 0.02727 |
| 1000 | 0.03864 |
| 2000 | 0.06073 |
| 5000 | 0.09320 |
| ∞ (effectiveness) | 0.1334 |

Note: All values to four significant digits.

Table 8. Parameter estimates for regression Model 3

| Coefficient [SE] (2-sided p-value) | Effectiveness (OLS/GLS) | Cost (OLS/GLS) |
|---|---|---|
| Randomization group | 298.1 [105.4] (0.004669) | −255 000 [68 850] (0.0002125) |
| Intercept | −67.10 [70.37] (0.3404) | 204 700 [45 990] (8.507 × 10^−6) |
| Race | 32.06 [32.73] (0.3273) | −54 330 [21 390] (0.01108) |
| Age | 2.608 [1.531] (0.08842) | −1 319 [1 000] (0.1872) |
| GAF | 2.994 [1.400] (0.03254) | −1 262 [915.2] (0.1680) |
| Race × randomization group | −61.53 [41.16] (0.1349) | 60 210 [26 900] (0.02519) |
| Age × randomization group | −2.368 [2.008] (0.2382) | 2 861 [1 312] (0.02920) |
| GAF × randomization group | −32.64 [1.941] (0.09270) | 2 276 [1 269] (0.07282) |

Table 9. P-values for testing INB interactions in regression Model 3 as a function of \(\lambda\)

| \(\lambda\) | Race, H: \(\lambda\varphi_4 - \theta_4 = 0\) | Age, H: \(\lambda\varphi_5 - \theta_5 = 0\) | GAF, H: \(\lambda\varphi_6 - \theta_6 = 0\) |
|---|---|---|---|
| 0 (costs) | 0.02519 | 0.02920 | 0.07282 |
| 50 | 0.02255 | 0.02769 | 0.06234 |
| 100 | 0.02083 | 0.02698 | 0.05468 |
| 200 | 0.01924 | 0.02731 | 0.04501 |
| 500 | 0.02158 | 0.03622 | 0.03639 |
| 1000 | 0.03297 | 0.06041 | 0.03965 |
| 2000 | 0.05581 | 0.1041 | 0.05143 |
| 5000 | 0.09070 | 0.1658 | 0.06981 |
| ∞ (effectiveness) | 0.1349 | 0.2382 | 0.09270 |

Simulations

Criticism is often levelled at the application of least-squares methods to cost data, as we have proposed in the Methods section, because of the right skewing which is often present [34]. Evidence of skewing is given in Figure 3, which contains a histogram of the residuals from Model 3. Residuals are observed twice as far to the right of zero (their mean) as they are to the left. Transformations, such as the log or square root, are often proposed. However, such transformations provide estimates for the between-treatment difference on a scale not relevant to policy makers [35]. In this section simulations are used to demonstrate the robustness of least-squares methods in the presence of right-skewing, in particular for cost data that are log-normal.

Data were simulated for a hypothetical clinical trial designed to compare the mean cost between two arms: treatment and standard. One continuous and one binary covariate were considered. The error term was log-normal with a standard deviation equal to the mean cost in the standard arm. The relationship between the covariates and cost was set such that a one standard deviation increase in the continuous covariate or the presence of the binary covariate increased cost by 10% of the mean cost in the standard arm. Four sample sizes were considered: 50, 100, 500 and 1000 per arm. The means (\(\mu_S\) or \(\mu_T\)) of the continuous covariate were either the same in each arm or differed by one standard deviation. The probabilities (\(\pi_S\) or \(\pi_T\)) that the binary covariate was present were either set at 0.25 for both arms or set to 0.25 in the standard arm and 0.5 in the treatment arm. If \(\pi_S \neq \pi_T\) or \(\mu_S \neq \mu_T\), then the covariates are confounded with treatment arm. The conditional mean costs were either set to 8000 in both arms, or set to 8000 in the standard arm and 12000 in the treatment arm. (The actual amounts are arbitrary given that the distribution of the error term and the relationship between the covariates and cost are all relative to the mean cost in the standard arm.) Consequently, there were 32 situations (4 × 2 × 2 × 2) for which data were simulated. For each situation, 5000 simulations were generated, for which the following were calculated:

* the average estimated between-arm difference in cost
* the proportion of times the null hypothesis of no treatment effect was rejected at the 5% level, 2-sided
* the proportion of times the 2-sided, 95% confidence interval covered the true between-arm difference
* the proportion of times the 2-sided, 95% confidence interval covered the true coefficient for each covariate.
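One cell of such a simulation can be sketched as follows. This is an illustrative reimplementation, not the authors' code, and several details are assumptions: the log-normal error is centred to have mean zero, its shape is fixed by requiring a standard deviation equal to the standard-arm mean (a coefficient of variation of 1), the covariates are balanced across arms, the normal critical value 1.96 is used, and the default number of replicates is reduced to 500 to keep the run short.

```python
import numpy as np


def simulate_coverage(n_per_arm=100, true_diff=4000.0, n_sims=500, seed=0):
    """Return (mean estimated difference, CI coverage) for one scenario."""
    rng = np.random.default_rng(seed)
    mean_s = 8000.0                        # conditional mean cost, standard arm
    sigma2 = np.log(2.0)                   # log-scale variance giving SD = mean
    mu_log = np.log(mean_s) - sigma2 / 2   # log-scale mean giving mean = mean_s
    n = 2 * n_per_arm
    t = np.repeat([0.0, 1.0], n_per_arm)   # treatment indicator
    estimates, covered = [], 0
    for _ in range(n_sims):
        x = rng.normal(size=n)                             # continuous covariate
        d = rng.binomial(1, 0.25, size=n).astype(float)    # binary covariate
        # Centred log-normal error with standard deviation mean_s.
        err = rng.lognormal(mu_log, np.sqrt(sigma2), size=n) - mean_s
        cost = mean_s + true_diff * t + 0.1 * mean_s * (x + d) + err
        X = np.column_stack([t, np.ones(n), x, d])
        xtx_inv = np.linalg.inv(X.T @ X)
        beta = xtx_inv @ X.T @ cost
        resid = cost - X @ beta
        s2 = resid @ resid / (n - X.shape[1])
        se = np.sqrt(s2 * xtx_inv[0, 0])
        estimates.append(beta[0])
        covered += int(abs(beta[0] - true_diff) <= 1.96 * se)
    return float(np.mean(estimates)), covered / n_sims


mean_est, coverage = simulate_coverage()
```

Changing `n_per_arm` and `true_diff`, and drawing the covariates with arm-specific means or probabilities, would give the other scenarios described above; the exact figures obtained will depend on the assumed error shape and the seed.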

Figure 3. Histogram of residuals (horizontal axis: residual in $1,000, in bins of width 50 from −150 to 300; vertical axis scaled from 0 to 70)

The results are contained in Table 10 for the situation where the between-arm conditional mean costs are equal, and in Table 11 where they differ. From the tables it can be seen that the between-arm difference in conditional mean cost is estimated without noticeable bias, although the 95% confidence interval tends to be a little conservative, with coverage averaging around 98%. Consequently, the true level of the 2-sided, 5% level test is about 2%. The coverage for the covariate coefficients is very close to the nominal value, although for the situation for which \(\pi_S \neq \pi_T\), coverage is a little conservative. We conclude, therefore, that least-squares methodology is a valid approach for cost data that are skewed even to the point of being log-normal. However, data sets in which there are extreme outliers may not be amenable to least-squares methodology, and alternative approaches, based on dealing with outliers directly, should be employed.

Table 10. Simulation results for between-arm difference in mean cost set to zero

| Sample size/arm | Entry | \(\mu_T = \mu_S\), \(\pi_T = \pi_S\) | \(\mu_T = \mu_S\), \(\pi_T = 2\pi_S\) | \(\mu_T \neq \mu_S\), \(\pi_T = \pi_S\) | \(\mu_T \neq \mu_S\), \(\pi_T = 2\pi_S\) |
|---|---|---|---|---|---|
| 50 | 1 | 17.4 | 12.9 | −4.5 | 0.8 |
| 50 | 2 | 0.018 | 0.022 | 0.014 | 0.020 |
| 50 | 3 | 0.982 | 0.978 | 0.986 | 0.980 |
| 50 | 4 | 0.960 | 0.975 | 0.954 | 0.973 |
| 50 | 5 | 0.955 | 0.951 | 0.944 | 0.942 |
| 100 | 1 | 7.8 | 3.0 | 16.2 | −16.5 |
| 100 | 2 | 0.015 | 0.023 | 0.017 | 0.020 |
| 100 | 3 | 0.985 | 0.977 | 0.983 | 0.980 |
| 100 | 4 | 0.957 | 0.977 | 0.961 | 0.971 |
| 100 | 5 | 0.950 | 0.950 | 0.947 | 0.947 |
| 500 | 1 | −3.8 | −0.1 | 1.1 | −16.1 |
| 500 | 2 | 0.018 | 0.025 | 0.024 | 0.031 |
| 500 | 3 | 0.982 | 0.975 | 0.976 | 0.969 |
| 500 | 4 | 0.957 | 0.970 | 0.954 | 0.966 |
| 500 | 5 | 0.946 | 0.951 | 0.951 | 0.946 |
| 1000 | 1 | −15.9 | 5.8 | −2.7 | −8.3 |
| 1000 | 2 | 0.024 | 0.029 | 0.025 | 0.030 |
| 1000 | 3 | 0.976 | 0.971 | 0.975 | 0.970 |
| 1000 | 4 | 0.961 | 0.966 | 0.957 | 0.966 |
| 1000 | 5 | 0.952 | 0.948 | 0.947 | 0.945 |

Note: Entry 1: average estimated between-arm difference in cost. Entry 2: the proportion of times the null hypothesis of no treatment effect was rejected, 5% level 2-sided. Entry 3: the proportion of times the 95% confidence interval covered the between-arm difference. Entry 4: the proportion of times the 95% confidence interval covered the coefficient for the continuous covariate. Entry 5: the proportion of times the 95% confidence interval covered the coefficient for the binary covariate.

Table 11. Simulation results for between-arm difference in mean cost set to 4000

| Sample size/arm | Entry | \(\mu_T = \mu_S\), \(\pi_T = \pi_S\) | \(\mu_T = \mu_S\), \(\pi_T = 2\pi_S\) | \(\mu_T \neq \mu_S\), \(\pi_T = \pi_S\) | \(\mu_T \neq \mu_S\), \(\pi_T = 2\pi_S\) |
|---|---|---|---|---|---|
| 50 | 1 | 3 985 | 4 004 | 3 998 | 4 006 |
| 50 | 2 | 0.995 | 0.996 | 0.996 | 0.993 |
| 50 | 3 | 0.985 | 0.976 | 0.986 | 0.981 |
| 50 | 4 | 0.947 | 0.970 | 0.956 | 0.975 |
| 50 | 5 | 0.955 | 0.948 | 0.943 | 0.943 |
| 100 | 1 | 3 997 | 3 999 | 4 010 | 4 000 |
| 100 | 2 | 0.996 | 0.994 | 0.995 | 0.995 |
| 100 | 3 | 0.982 | 0.975 | 0.985 | 0.977 |
| 100 | 4 | 0.961 | 0.970 | 0.958 | 0.976 |
| 100 | 5 | 0.953 | 0.952 | 0.947 | 0.944 |
| 500 | 1 | 3 986 | 4 000 | 3 999 | 3 995 |
| 500 | 2 | 0.997 | 0.998 | 0.998 | 0.997 |
| 500 | 3 | 0.980 | 0.976 | 0.975 | 0.976 |
| 500 | 4 | 0.963 | 0.970 | 0.959 | 0.968 |
| 500 | 5 | 0.952 | 0.956 | 0.947 | 0.945 |
| 1000 | 1 | 4 003 | 4 002 | 3 992 | 4 003 |
| 1000 | 2 | 0.998 | 0.999 | 0.998 | 0.998 |
| 1000 | 3 | 0.975 | 0.972 | 0.975 | 0.974 |
| 1000 | 4 | 0.959 | 0.971 | 0.964 | 0.973 |
| 1000 | 5 | 0.953 | 0.951 | 0.952 | 0.944 |

Note: Entries are defined as in Table 10. Entry 1: average estimated between-arm difference in cost. Entry 2: the proportion of times the null hypothesis of no treatment effect was rejected, 5% level 2-sided. Entry 3: the proportion of times the 95% confidence interval covered the between-arm difference. Entry 4: the proportion of times the 95% confidence interval covered the coefficient for the continuous covariate. Entry 5: the proportion of times the 95% confidence interval covered the coefficient for the binary covariate.

Discussion

The current interest in undertaking cost-effectiveness analyses alongside clinical trials has led to the increasing availability of patient-level data on both the costs and effectiveness of intervention that can be employed to inform an economic evaluation of the treatment under evaluation. Most such trial-based (or ‘stochastic’) cost-effectiveness analyses proceed by averaging over the treatment arms of the trial in order to estimate the ICER for the patient population included in the trial. More recently, the net-benefit statistic has been employed by analysts in preference to the ICER due to its desirable statistical properties.

In a recent paper, we extended the net-benefit framework to show how cost-effectiveness analysis could be undertaken in a regression framework (see Reference [22]). It was argued that the advantage of formulating the problem in this way was that the analyst could control for observed heterogeneity between patients, thereby correcting for any imbalances in the data and improving the precision of the cost-effectiveness estimates. Furthermore, by including interaction terms between the treatment indicator and the prognostic factors, the direct regression approach provides a robust method to explore potential sub-group effects.
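As a rough illustration of the net-benefit regression just described, the sketch below regresses patient-level net benefit on a treatment indicator, one prognostic factor and their interaction. It assumes the usual definition of net benefit, λ × effect minus cost; the function name `net_benefit_regression` and the single-covariate design are hypothetical choices made for the example.

```python
import numpy as np

def net_benefit_regression(cost, effect, treat, covariate, lam):
    """OLS of net benefit on treatment, a covariate and their interaction.

    Returns (beta, se), with beta ordered as
    [intercept, treatment, covariate, treatment x covariate].
    """
    cost = np.asarray(cost, dtype=float)
    effect = np.asarray(effect, dtype=float)
    treat = np.asarray(treat, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    nb = lam * effect - cost                         # patient-level net benefit
    X = np.column_stack([np.ones_like(nb), treat, covariate, treat * covariate])
    beta, *_ = np.linalg.lstsq(X, nb, rcond=None)
    resid = nb - X @ beta
    dof = len(nb) - X.shape[1]
    cov_beta = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    return beta, np.sqrt(np.diag(cov_beta))
```

With a binary covariate, `beta[1]` estimates the incremental net benefit in the subgroup coded 0 and `beta[1] + beta[3]` in the subgroup coded 1, so the interaction coefficient `beta[3]` carries the sub-group effect.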

In this paper, we further extend the regression approach to cost-effectiveness analysis by proposing the use of a system of seemingly unrelated regression equations on the cost and effectiveness components of the analysis. This approach retains the advantage of offering analysts the opportunity to covariate-adjust their analysis and to explore potential treatment interactions with those covariates while offering two further advantages. Firstly, the method does not require that the set of covariates for cost and effectiveness be the same, as is required by the methods proposed by Hoch et al. [22] and multivariate regression techniques [24].
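A two-equation seemingly unrelated regression can be sketched with a standard two-step feasible GLS estimator. The code below is an illustrative outline under that assumption, not a reproduction of the estimation code used for the analyses in this paper; the function name `sur_fgls` and its argument names are hypothetical.

```python
import numpy as np

def sur_fgls(y_cost, X_cost, y_eff, X_eff):
    """Two-step feasible GLS for a cost equation and an effect equation.

    The two design matrices may contain different covariates.
    Returns the stacked coefficients (cost first, then effect) and
    their estimated covariance matrix.
    """
    n = len(y_cost)
    # Step 1: equation-by-equation OLS to obtain residuals.
    b_c, *_ = np.linalg.lstsq(X_cost, y_cost, rcond=None)
    b_e, *_ = np.linalg.lstsq(X_eff, y_eff, rcond=None)
    resid = np.column_stack([y_cost - X_cost @ b_c, y_eff - X_eff @ b_e])
    # Step 2: estimate the 2 x 2 cross-equation error covariance.
    sigma_inv = np.linalg.inv(resid.T @ resid / n)
    # Step 3: GLS on the stacked system with a block-diagonal design.
    k_c, k_e = X_cost.shape[1], X_eff.shape[1]
    X = np.zeros((2 * n, k_c + k_e))
    X[:n, :k_c] = X_cost
    X[n:, k_c:] = X_eff
    y = np.concatenate([y_cost, y_eff])
    omega_inv = np.kron(sigma_inv, np.eye(n))        # inverse error covariance
    xt_oi = X.T @ omega_inv
    cov = np.linalg.inv(xt_oi @ X)
    beta = cov @ (xt_oi @ y)
    return beta, cov
```

The off-diagonal blocks of the returned covariance matrix hold the covariances between cost and effect coefficients, which are what joint inference across the two equations requires. When both equations use the same design matrix, the coefficients returned coincide with the separate OLS estimates.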

Secondly, even when the regression models employ the same covariates (i.e. when the OLS and GLS estimates correspond exactly) the method has the advantage that it allows the testing of joint hypotheses concerning functions of the coefficients across the two equations. This is important since, as we demonstrate, the appropriate test concerning net-benefits being positive involves both cost and effectiveness estimates, and it would be incorrect to infer that net-benefit is not significant simply because the corresponding coefficients from the cost and effect equations were individually insignificant. This joint test was achieved in our previous paper by running a regression on the net-benefit directly. The advantage of the method proposed here is that it is not specific to either the ICER or net-benefit approaches to cost-effectiveness analysis: the estimators can be used to provide a covariate adjusted estimator for the ICER, or, for a given value of λ, a covariate adjusted estimator for INB(λ), along with the respective confidence intervals.
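The point about joint inference can be made concrete with a small calculation. The sketch below assumes the usual definition INB(λ) = λΔe − Δc, with variance λ²var(Δe) + var(Δc) − 2λcov(Δe, Δc), and a normal-approximation interval; the function name `inb_with_ci` and the numbers in the usage example are hypothetical.

```python
import math

def inb_with_ci(delta_e, delta_c, var_e, var_c, cov_ec, lam, z=1.96):
    """Incremental net benefit and a normal-approximation interval.

    delta_e, delta_c: estimated treatment coefficients from the effect
    and cost equations; var_e, var_c, cov_ec: their variances and covariance.
    """
    inb = lam * delta_e - delta_c
    var = lam ** 2 * var_e + var_c - 2.0 * lam * cov_ec
    half_width = z * math.sqrt(var)
    return inb, inb - half_width, inb + half_width

if __name__ == "__main__":
    # Hypothetical estimates: neither coefficient is individually significant
    # (0.1 / 0.06 and 800 / 500 are both below 1.96 in absolute value).
    inb, lower, upper = inb_with_ci(
        delta_e=0.1, delta_c=-800.0,
        var_e=0.06 ** 2, var_c=500.0 ** 2, cov_ec=0.0, lam=10000.0,
    )
    print(inb, lower, upper)
```

In this made-up example the interval for INB(λ) excludes zero even though neither the cost nor the effect coefficient would be declared significant on its own, which is the situation the joint test is designed to handle.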

Of course, the testing of sub-group effects should be undertaken with due care and diligence. While providing a robust method of testing potential sub-group effects in cost-effectiveness analysis by comparing net-benefits between groups, the methods proposed here are subject to the usual problems that lead many commentators to caution against the use of sub-group analysis in clinical trials. If such sub-group analyses did not form part of the original trial design, then examination of cost-effectiveness differences by patient characteristics must be considered exploratory. Nevertheless, sub-group analysis in cost-effectiveness analysis may be considered more important than the corresponding use of sub-group analysis in clinical evaluation. While in clinical evaluation it may be considered that qualitative differences in treatment effects (i.e. effects that differ in sign) between subgroups of patients are rare compared to quantitative differences (where effects differ only in magnitude), in economic evaluation quantitative differences in clinical effect or cost can lead to qualitative differences in the cost-effectiveness between patient subgroups.
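A short numerical illustration of this last point, using entirely hypothetical figures and assuming INB(λ) = λΔe − Δc: two subgroups both benefit clinically from treatment at the same extra cost, yet the sign of the incremental net benefit differs between them.

```python
def incremental_net_benefit(delta_e, delta_c, lam):
    """INB for a willingness-to-pay value lam."""
    return lam * delta_e - delta_c

lam = 10000.0  # hypothetical willingness to pay per unit of effect
# Hypothetical subgroups: (incremental effect, incremental cost).
subgroups = {"A": (0.10, 800.0), "B": (0.05, 800.0)}

for name, (delta_e, delta_c) in subgroups.items():
    print(name, incremental_net_benefit(delta_e, delta_c, lam))
```

Subgroup A has an incremental net benefit of 200 and subgroup B of -300: the clinical effects differ only in magnitude, but treatment is cost-effective in A and not in B at this value of λ.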

One potential concern with the use of linear regression methods for cost data is the tendency for cost data to exhibit high degrees of skewness, which can violate the assumptions on which inference for the estimates is based. Therefore, we used simulation methods to examine the effect of skewness on statistical inference based on least-squares methodology. Apart from the confidence intervals for treatment effect being a little conservative (i.e. a little too wide), there appears to be no real cause for concern, even when cost data are log-normal and the total sample size is as small as 100. The results of this experiment agree with other recent analyses of this problem that have also concluded that least squares methods are generally robust to skewed data [36,37]. However, a potential area for future research is whether the use of other techniques, such as generalized linear models, can offer efficiency gains for modeling cost data. Adaptation of these simulation methods could be used to determine power curves as a function of effect-size and sample size to facilitate sample size determination in the design of cost-effectiveness studies.

Overall, we believe that seemingly unrelated regression methods provide important advantages for the analysis of uncensored cost and effectiveness data in clinical trial based cost-effectiveness analyses. More research is required to examine whether these techniques can be generalized to include more sophisticated models such as those that might be employed to handle censored data. In addition, the role of sub-group analysis in economic evaluations should be explored in terms of its importance for policy purposes in comparison to the usual reluctance to undertake sub-group analyses in clinical evaluation.

Acknowledgements

The authors wish to thank Professor Paul Fenn for his suggestion to examine seemingly unrelated regression in the context of our problem and for his comments on a previous draft, Dr Gary Foster for producing the figures and Ms Susan Tomlinson for her meticulous proof reading. Of course, all remaining errors are our own responsibility. AHB is funded through a UK Department of Health, Public Health Career Scientist award; however, the opinions expressed are those of the authors and should not be attributed to any funding agency.

References

1. O’Brien BJ, Drummond MF, Labelle RJ, Willan AR. In search of power and significance: issues in the design and analysis of stochastic cost-effectiveness studies in health care. Med Care 1994; 32: 150–163.

2. Mullahy J, Manning W. Statistical issues of cost-effectiveness analysis. In Valuing Health Care, Sloan F (ed). Cambridge University Press: Cambridge, 1994; 149–184.

3. van Hout BA, Al MJ, Gordon GS, Rutten FFH. Costs, effects and C/E ratios alongside a clinical trial. Health Econ 1994; 3: 309–319.

4. Wakker P, Klaassen MP. Confidence intervals for cost/effectiveness ratios. Health Econ 1995; 4: 373–381.

5. Willan AR, O’Brien BJ. Confidence intervals for cost-effectiveness ratios: an application of Fieller’s theorem. Health Econ 1996; 5: 297–305.

6. Chaudhary MA, Stearns SC. Confidence intervals for cost-effectiveness ratios: an example from a randomized trial. Statist Med 1996; 15: 1447–1458.

7. Mullahy J. What you don’t know can’t hurt you? Statistical issues and standards for medical technology evaluation. Med Care 1996; 34(12 Suppl): DS124–DS135.

8. Manning WG, Fryback DG, Weinstein MC. Reflecting uncertainty in cost effectiveness analysis. In Cost Effectiveness in Health and Medicine, Gold MR, Siegel JE, Russell LB, Weinstein MC (eds). Oxford University Press: New York, 1996.

9. Briggs AH, Wonderling DE, Mooney CZ. Pulling cost-effectiveness analysis up by its bootstraps; a non-parametric approach to confidence interval estimation. Health Econ 1997; 6: 327–340.

10. Polsky D, Glick HA, Willke R, Schulman K. Confidence intervals for cost-effectiveness ratios: a comparison of four methods. Health Econ 1997; 6: 243–252.

11. Phelps CE, Mushlin AI. On the (near) equivalence of cost-effectiveness and cost-benefit analysis. Int J Technol Assess Health Care 1991; 7: 12–21.

12. Ament A, Baltussen R. The interpretation of results of economic evaluation: explicating the value of health. Health Econ 1997; 6: 625–635.

13. Stinnett AA, Mullahy J. Net health benefits: a new framework for the analysis of uncertainty in cost-effectiveness analysis. Med Decision Making 1998; 18(Suppl): S68–S80.

14. Tambour M, Zethraeus N, Johannesson M. A note on confidence intervals in cost-effectiveness analysis. Int J Technol Assess 1998; 14: 467–471.

15. van Hout BA, Al MJ, Gordon GS, Rutten FFH. Costs, effects and C/E ratios alongside a clinical trial. Health Econ 1994; 3: 309–319.

16. Briggs A, Fenn P. Confidence intervals or surfaces? uncertainty on the cost-effectiveness plane. Health Econ 1998; 7: 723–740.

17. Briggs AH. A Bayesian approach to stochastic cost-effectiveness analysis. Health Econ 1999; 8: 257–261.

18. Lothgren M, Zethraeus N. Definition, interpretation and calculation of cost-effectiveness acceptability curves. Health Econ 2000; 9: 623–630.

19. Heitjan DF. Fieller’s method and net health benefit. Health Econ 2000; 9: 327–335.

20. Willan AR. Analysis, sample size and power for estimating incremental net health benefit from clinical trial data. Controll Clin Trials 2001; 22: 228–237.

21. Willan AR, Lin DY. Incremental net benefit in randomized clinical trials. Statist Med 2001; 20: 1563–1574.

22. Hoch JS, Briggs AH, Willan AR. Something old, something new, something blue: a framework for the marriage of health econometrics and cost-effectiveness analysis. Health Econ 2002; 11: 415–430.

23. Greene WH. Econometric Analysis (2nd edn). Macmillan: New York, 1993; 486–489.

24. Timm NH. Multivariate Analysis with Applications in Education and Psychology. Brooks/Cole: Monterey, 1975; 307–313.

25. Stein LI, Test MA. Alternative to mental hospital treatment: I. Conceptual model, treatment program, and clinical evaluation. Arch Gen Psychiatry 1980; 37: 392–397.

26. Olfson M. Assertive community treatment: an evaluation of the experimental evidence. Hospital Community Psychiatry 1990; 41: 634–641.

27. Burns BJ, Santos AB. Assertive community treatment: an update of randomized trials. Psychiat Serv 1995; 46: 669–675.

28. Scott JE, Dixon LB. Assertive community treatment and case management for schizophrenia. Schizophrenia Bull 1995; 21: 657–668.

29. Lehman AF, Dixon LB, Kernan E et al. A randomized trial of assertive community treatment for homeless persons with severe mental illness. Arch Gen Psychiatry 1997; 54: 1038–1043.

30. Lehman AF, Dixon LB, Hoch JS et al. Cost-effectiveness of assertive community treatment for homeless persons with severe mental illness. Br J Psychiatry 1999; 174: 346–352.

31. Newman SJ. The severely mentally ill homeless: housing needs and housing policy. Occasional Paper No. 12. The Johns Hopkins University Institute for Policy Studies, 1992.

32. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders DSM-III-R (3rd revised edn). American Psychiatric Association: Washington, DC, 1987.

33. Altman DG. Comparability of randomised groups. The Statistician 1985; 34: 125–136.

34. O’Hagan A, Stevens JW. Assessing and comparing costs: how robust are the bootstrap and methods based on asymptotic normality? Health Econ 2003; 12: 33–49.

35. Manning WG, Mullahy J. Estimating log models: to transform or not to transform? J Health Econ 2001; 20: 461–494.

36. Lumley T, Diehr P, Emerson S, Chen L. The importance of the normality assumption in large public health data sets. Ann Rev Public Health 2002; 23: 151–169.

37. Thompson SG, Barber JA. How should cost data in pragmatic randomised trials be analysed? Br Med J 2000; 320: 1197–1200.


