
Incremental value of exercise electrocardiography and thallium-201 testing in men and women for the presence and extent of coronary artery disease

Anthony P. Morise, MD,ᵃ George A. Diamond, MD,ᵇ Robert Detrano, MD, PhD,ᶜ and Marco Bobbio, MDᵈ

Morgantown, W. Va., Los Angeles and Torrance, Calif., and Turin, Italy

Our goal was to assess the incremental value of exercise testing in men and women for the diagnosis and extent of coronary artery disease. With data from one center, incremental logistic algorithms were developed and evaluated in a separate set of 865 patients from four centers. Variables included were pretest (age, sex, symptoms, diabetes, smoking, and cholesterol concentration); exercise electrocardiogram (ECG) (ST-segment depression [millimeters], ST-segment slope, peak heart rate, and change in systolic blood pressure); and thallium-201 scintigram (defect presence, reversibility, and intensity of hypoperfusion). End points were coronary disease presence (50% diameter stenosis) and extent (multivessel disease). Accuracy and incremental value were assessed by receiver operating characteristic (ROC) curve analysis. Incremental ROC curve areas for disease presence were pretest 0.75 ± 0.02, post-exercise ECG 0.82 ± 0.01, and post-thallium scintigram 0.85 ± 0.01 and for disease extent were pretest 0.71 ± 0.02, post-exercise ECG 0.76 ± 0.02, and post-thallium scintigram 0.78 ± 0.02 (p < 0.005 for all increments). Incremental increases in accuracy were similar for men and women. We conclude that when multivariable algorithms derived from one center were applied to a separate group, there was a significant incremental increase in accuracy associated with exercise testing for the presence and extent of coronary disease. This increase in accuracy was similar for men and women. (AM HEART J 1995;130:267-76.)

From ᵃthe Section of Cardiology, Department of Medicine, West Virginia University School of Medicine, Morgantown; ᵇthe Division of Cardiology and Department of Medicine, Cedars-Sinai Medical Center, and the School of Medicine, University of California, Los Angeles; ᶜthe Division of Cardiology, Department of Medicine, St. Johns Cardiac Research Center, Torrance; and ᵈthe Division of Cardiology, University of Turin.

Supported in part by an Individual Investigator (R01) grant from the Agency for Health Care Policy Research (HS-06065), Rockville, Md.; by a Specialized Center of Research (SCOR) grant from the National Institutes of Health (HL-17651), Bethesda, Md.; and by a Grant-in-Aid from the American Heart Association, West Virginia Affiliate. Received for publication Oct. 4, 1994; accepted Jan. 26, 1995. Reprint requests: Anthony P. Morise, MD, Section of Cardiology, Health Sciences Center South, West Virginia University School of Medicine, Morgantown, WV 26506. Copyright © 1995 by Mosby-Year Book, Inc. 0002-8703/95/$3.00 + 0 4/1/64255

Diagnostic tests are frequently compared with other diagnostic tests without consideration of their hierarchical place in the flow of diagnostic decision making. This evaluation in isolation of other clinically relevant and available data gives little insight into the value of a test when considered in its proper clinical context. An increasing number of studies have evaluated the incremental value of test data over data obtained before or at the same time as the index test.¹⁻⁸ This evaluation is appropriate because clinicians make decisions on the basis of integration of all available data with previously determined data (clinical and other test data), often influencing the subsequent tests obtained and their interpretation.

In this respect, previous studies have shown that there is significant incremental value to the use of exercise testing for the diagnosis of coronary disease when considered in its appropriate clinical context.⁵,⁶ Likewise, other reports confirm the incremental prognostic value of exercise testing.¹⁻⁴,⁷,⁸ Renewed concern about the accuracy of noninvasive diagnostic testing in women has led to considerable interest in assessing the accuracy of diagnostic methods separately in men and women.⁹ Previously, we developed and validated an incremental multivariable algorithm designed to estimate the probability of coronary disease presence.⁵ This algorithm has several sex-specific aspects concerning pretest and exercise electrocardiogram (ECG) variables.

The purpose of the current study was threefold: first, to refine the original algorithm concerning disease presence; second, to present algorithms regarding the estimation of multivessel disease presence; and finally, to determine the incremental value of exercise testing (exercise ECG and thallium-201 scintigraphy) for disease presence and extent in men and women separately.

METHODS

To estimate the pretest, post-exercise ECG, and post-thallium-201 scintigram probabilities of coronary disease, multivariable algorithms were derived from patient data at one center (West Virginia University Medical Center).


August 1995 268 Morise et al. American Heart Journal

Derivation subgroups. All patients referred to the stress laboratory at West Virginia University Medical Center between 1981 and 1992 were screened. Only patients referred for the express purpose of evaluating the presence of coronary disease were considered. Patients could not have a history of myocardial infarction or coronary arteriography. Patients who met these criteria and who underwent coronary angiography ≤3 months after stress testing were included in this study. Although it was policy to encourage referring physicians to discontinue medications influencing heart rate or blood pressure before diagnostic exercise tests, no patient was excluded from this study because he or she was taking these medications. Two derivation groups were assembled: (1) the first (chronological) 590 patients were considered the derivation group for clinical and exercise ECG algorithms, and (2) 213 patients treated between 1988 and 1992 constituted the derivation group for thallium scintigram algorithms. The first group was previously described.⁵ Data from the thallium group were collected at the onset of use of single photon emission computed tomographic (SPECT) imaging at West Virginia University Medical Center. Because this study involved only a review of data previously acquired for clinical indications, Human Subjects Committee review was not required. However, all data were handled so as to ensure patient confidentiality.

Baseline clinical information. Data were collected from patients in the derivation population during a pre-exercise test interview. Variables included the patient's age, sex, symptoms, current cigarette smoking, and diabetes mellitus. Chest pain was classified according to the four categories of Diamond.¹⁰ Resting systolic blood pressure was measured with a cuff sphygmomanometer. If the patient consented, a specimen of blood was collected to measure total serum cholesterol concentration.

All resting ECGs were categorized as normal, equivocal, or abnormal. Equivocal ECGs had ST-T changes with no significant downward displacement of the ST segment (80 msec from the J-point when compared with baseline between two PR segments for at least three cycles) that would render them difficult to interpret. Abnormal ECGs had ST-T changes that would render an exercise ST response uninterpretable. These included ECG patterns of left ventricular hypertrophy, left bundle branch block, digitalis effect, Wolff-Parkinson-White syndrome, or other significant downward displacement of the ST segment.

Exercise tests. The majority of patients underwent exercise testing by the Bruce treadmill protocol (6% Naughton treadmill and 4% arm ergometer protocols). The following data were collected during the exercise test: peak exercise heart rate and systolic blood pressure, any ST-segment depression (millimeters), and ST-segment slope. Exercise-related ST-segment changes were measured 80 msec after the J-point irrespective of ST slope and were compared with the baseline between two PR segments. Peak exercise ST-segment slope was assessed visually and was qualitatively categorized as upsloping, horizontal, or downsloping. Change in systolic blood pressure was the difference between exercise and resting systolic blood pressures. All of the studies were read by one of the authors (A.P.M.) in a blinded fashion. Positive ST-segment criteria consisted of ≥1 mm horizontal or downsloping ST-segment depression 80 msec after the J-point for three consecutive cycles. When referred to in Results under the multivariable algorithm, negative ST-segment responses included those with <1 mm horizontal or downsloping ST-segment depression or <1.5 mm upsloping ST-segment depression.
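The ST-segment criteria above can be expressed as two small predicates. The following Python sketch is illustrative only: the function names and the string labels for slope are assumptions introduced here, and the requirement of three consecutive cycles is assumed to have been checked before the measurement is passed in.

```python
# Illustrative sketch of the ST-segment criteria described in the text.
# Function names and slope labels are hypothetical, not taken from the article.

VALID_SLOPES = {"upsloping", "horizontal", "downsloping"}


def st_response_is_positive(depression_mm: float, slope: str) -> bool:
    """Positive response: >=1 mm horizontal or downsloping ST depression
    measured 80 msec after the J-point (persistence over three consecutive
    cycles is assumed to be verified by the reader of the tracing)."""
    if slope not in VALID_SLOPES:
        raise ValueError(f"unknown slope category: {slope}")
    return slope in {"horizontal", "downsloping"} and depression_mm >= 1.0


def st_response_is_negative(depression_mm: float, slope: str) -> bool:
    """Negative response as used in the multivariable algorithm:
    <1 mm horizontal or downsloping, or <1.5 mm upsloping depression."""
    if slope not in VALID_SLOPES:
        raise ValueError(f"unknown slope category: {slope}")
    if slope == "upsloping":
        return depression_mm < 1.5
    return depression_mm < 1.0
```

Under these definitions, upsloping depression of 1.5 mm or more is neither positive by the first criterion nor negative by the second, which matches the wording of the text.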

Exercise-induced angina and exercise capacity expressed as metabolic equivalents (METs) were not incorporated into the current algorithms. Previous work in our laboratory has demonstrated that exercise-induced angina is not an independent predictor of coronary disease presence or extent⁶ and that exercise capacity is an independent predictor of coronary disease extent but not presence.⁶ However, this work also demonstrated that despite this independent predictability, the incorporation of METs into a multivariable model added no additional (i.e., incremental) diagnostic accuracy to the other variables considered.

Thallium-201 scintigraphy. All of the thallium studies at the derivation institution were read by one of the authors (A.P.M.) in a blinded fashion. Only thallium studies obtained after August 1988 at the derivation institution were used in the derivation set. The SPECT images were divided into five segmental regions: anterior, septal, lateral, inferior, and apical. If a defect was found in any segment, it was categorized according to its reversibility and the intensity of hypoperfusion.¹¹ Quantitative or even semiquantitative methods were not used, for the following reasons: (1) there was nonuniformity of quantitative methods applied to the studies throughout the 5-year period of data collection at the derivation institution; (2) the validation patients were treated at different centers, and the data from them were derived by different methods of acquisition (SPECT and planar), analysis (quantitative methods and visual evaluation), and interpretation; and (3) quantitative and segmental data from the validation group were unavailable to us for this study. Given these limitations we limited our analysis to simple qualitative variables.

Three thallium variables were considered in the current study. Scan abnormalities were considered in the following manner: (1) the presence of no defect was considered as a yes-or-no binary variable; (2) defect reversibility was coded as 3 for any reversible defect (complete or partial), 2 for only irreversible defects, and 1 for no defect; (3) the degree of maximal exercise hypoperfusion of the most intense defect was coded as 3 for marked, 2 for moderate, 1 for mild, and 0 for no defect.¹¹ Because of the absence of comparable data in the validation set, we do not report the evaluation of any semiquantitative segmental scores or thallium lung-heart ratio.
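The coding scheme for the three thallium variables can be sketched as follows. This is a minimal illustration, not the authors' software: the function name, the input layout (one dictionary per defect), and the choice to code the binary variable as 1 when a defect is present are assumptions made here.

```python
# Illustrative coding of the three qualitative thallium variables.
# Input layout and names are hypothetical; numeric codes follow the text.

INTENSITY_CODE = {"mild": 1, "moderate": 2, "marked": 3}


def code_thallium_scan(defects):
    """`defects` is a list of dicts such as
    {"reversible": True, "intensity": "moderate"}; an empty list is a normal scan."""
    if not defects:
        # No defect: binary 0 (assumed direction), reversibility 1, intensity 0.
        return {"defect_present": 0, "reversibility": 1, "max_hypoperfusion": 0}

    any_reversible = any(d["reversible"] for d in defects)
    return {
        "defect_present": 1,
        # 3 = any reversible defect (complete or partial), 2 = only irreversible.
        "reversibility": 3 if any_reversible else 2,
        # Intensity of the most intense defect: 3 marked, 2 moderate, 1 mild.
        "max_hypoperfusion": max(INTENSITY_CODE[d["intensity"]] for d in defects),
    }
```

A scan with one mild reversible defect and one marked irreversible defect would therefore carry reversibility 3 and maximal hypoperfusion 3, because the two codes are taken over all defects rather than from a single segment.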

Coronary angiography. Angiograms were read by two cardiologists in a blinded fashion. Differences were resolved by consensus. Coronary artery disease was defined as the presence of at least one vessel with ≥50% luminal diameter narrowing. Coronary artery disease extent was defined as involvement of at least two vessels (i.e., multivessel disease). Left main artery disease with ≥50%

Volume 130, Number 2 American Heart Journal Morise et al. 269

Table I. Characteristics of patient populations

| | Derivation 1, men | Derivation 1, women | Derivation 2 | Validation, men | Validation, women |
|---|---|---|---|---|---|
| No. | 326 | 264 | 213 | 577 | 288 |
| Age ± SD (yr) | 53 ± 12 | 56 ± 12* | 56 ± 12 | 56 ± 10 | 57 ± 11 |
| Women/men (%/%) | -- | -- | 51/49 | -- | -- |
| Symptoms (%): typical | 22 | 24 | 24 | 40 | 30† |
| Symptoms (%): atypical | 40 | 41 | 25 | 27 | 40‡ |
| Symptoms (%): nonanginal | 29 | 29 | 37 | 18 | 24* |
| Symptoms (%): none | 9 | 6 | 14 | 15 | 6‡ |
| Diabetes (%) | 21 | 23 | 20 | 13 | 18 |
| Smoking (%) | 39 | 29† | 39 | 29 | 22* |
| Abnormal resting ECG (%) | 35 | 12 | 20 | 21 | 26 |
| Positive exercise ECG (%) | 35 (n = 290) | 33 (n = 232) | 16 (n = 165) | 45 (n = 528) | 36† (n = 265) |
| Peak heart rate ± SD (beats/min) | 140 ± 23 | 142 ± 23 | 142 ± 21 | 143 ± 25 | 145 ± 23 |
| Change in systolic blood pressure ± SD (mm Hg) | 38 ± 20 | 32 ± 20† | 35 ± 20 | 38 ± 24 | 30 ± 21† |
| Abnormal thallium scintigram (%) | -- | -- | 69 | 71 | 48‡ |
| Disease presence (%) | 46 | 35† | 50 | 66 | 40‡ |
| Multivessel disease (%) | 39 | 27† | 32 | 45 | 26‡ |
| Multivessel disease, of all with disease (%) | 85 | 77 | 64 | 69 | 66 |
| Vessel (no.) disease (%): one | 13 | 14 | 25 | 20 | 14 |
| Vessel (no.) disease (%): two | 13 | 9 | 16 | 12 | 9 |
| Vessel (no.) disease (%): three | 17 | 10 | 7 | 33 | 18 |
| Vessel (no.) disease (%): any left main artery | 3 | 2 | 1 | --§ | --§ |

--, Not evaluated. *p < 0.05 vs men. †p < 0.01 vs men. ‡p < 0.0001 vs men. §Left main artery disease was not differentiated in data set and is combined with second- and third-vessel disease.

stenosis was considered at least two-vessel disease. The designation of left main artery or three-vessel disease was not used because of the lack of specific data concerning left main artery involvement in one of the validation centers.

Multivariate algorithm. For logistic regression analysis we used Number Cruncher Statistical System software (J.L. Hintze, Kaysville, Utah; version 5.3, 1988). For the final algorithms, we chose variables that were good multivariable predictors (p < 0.05) in a manner similar to our previous method.⁵ We allowed the separate determination of preexercise (pretest), post-exercise ECG, and post-thallium scintigram probabilities to reflect the actual flow of clinical decision making. Therefore there are equations at each of three incremental levels for disease presence and disease extent. To carry over information from the pretest evaluation to the post-exercise ECG evaluation, we incorporated pretest probability into the post-exercise ECG analysis with the other exercise ECG variables rather than use the individual pretest variables. This approach tended to minimize variable number for each analysis and is similar to an approach that we have used previously.⁵,⁶ We have noted that this approach leads to no loss in discriminant accuracy (unpublished data). The same was done with the post-thallium scintigram analysis except that post-exercise ECG probability instead of pretest probability was used.
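The three-level structure described above, in which each stage receives the previous stage's probability as a single input alongside the new test variables, can be sketched in Python. This is a minimal sketch under stated assumptions: the text does not say how the prior probability enters the model, so entering it on the log-odds scale is an assumption, and every coefficient and variable name below is a hypothetical placeholder rather than a value from the Appendix.

```python
import math

# Sketch of a staged (incremental) logistic model. How the prior probability
# enters the next stage is an ASSUMPTION here (log-odds scale); all
# coefficients passed in are hypothetical placeholders.


def logistic(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def logit(p: float) -> float:
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def next_stage_probability(prior_probability, intercept, prior_weight,
                           coefficients, values):
    """Combine the previous stage's probability with new test variables.

    `coefficients` and `values` are dicts keyed by variable name."""
    z = intercept + prior_weight * logit(prior_probability)
    z += sum(coefficients[name] * values[name] for name in coefficients)
    return logistic(z)


# Usage with made-up numbers: pretest -> post-exercise ECG -> post-thallium.
pretest = 0.30
post_ecg = next_stage_probability(
    pretest, intercept=-0.5, prior_weight=1.0,
    coefficients={"st_depression_mm": 0.6}, values={"st_depression_mm": 2.0})
post_thallium = next_stage_probability(
    post_ecg, intercept=-0.8, prior_weight=1.0,
    coefficients={"max_hypoperfusion": 0.7}, values={"max_hypoperfusion": 3})
```

The design point illustrated is the one made in the text: each stage needs only one carried-over input plus the new variables, which keeps the number of variables per analysis small.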

Validation group. These patients' data were assembled for the purpose of validating other testing modalities and were, therefore, suitable for validating our algorithms. For our incremental analysis, only patients who underwent exercise thallium studies were included. This group of 865 included 301 patients from the Cleveland Clinic (Cleveland, Ohio), 55 from the Veterans Administration Medical Center (Long Beach, Calif.), 329 from Cedars-Sinai Medical Center, and 180 additional patients from West Virginia University Medical Center that were not involved in the derivation of the algorithms. Like the derivation cohort, these patients had possible coronary disease and underwent exercise testing and subsequent coronary angiography. The angiographic criterion for significant disease in this group was the same as that stated earlier in this report.

Statistical comparison and analysis. The ability of a method to resolve a group of patients into those with and without disease defines discriminant accuracy. The area under a receiver operating characteristic (ROC) curve is an index of discriminant accuracy that ranges from 0 to 1. This index indicates the probability that a method will perfectly resolve a group into diseased and nondiseased subgroups. It is particularly suited for men and women who have undergone catheterization because it is not affected by differences in disease prevalence and selection bias.¹² With the probability data, the methods of Hanley and McNeil¹³,¹⁴ were used to generate the areas under the


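Table II. Accuracy of exercise ECG and thallium-201 scintigraphy in derivation and validation groups

| | Disease presence, derivation | Disease presence, validation | Disease extent, derivation | Disease extent, validation |
|---|---|---|---|---|
| Exercise ECG: sensitivity | 52% | 57% | 49% | 66% |
| Exercise ECG: specificity | 78% | 78% | 73% | 73% |
| Exercise ECG, men: sensitivity | 56% | 58% | 54% | 66% |
| Exercise ECG, men: specificity | 82% | 79% | 77% | 72% |
| Exercise ECG, women: sensitivity | 46% | 55% | 39% | 63% |
| Exercise ECG, women: specificity | 74% | 78% | 69% | 73% |
| Thallium scintigraphy, reversible defect: sensitivity | 68% | 64% | 72% | 67% |
| Thallium scintigraphy, reversible defect: specificity | 51% | 73% | 48% | 64% |
| Thallium scintigraphy, any defect: sensitivity | 79% | 80% | 84% | 83% |
| Thallium scintigraphy, any defect: specificity | 41% | 58% | 38% | 49% |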

ROC curve¹³ and to compare them.¹⁴ We used True Epistat (version 5.0) to do so and to generate the actual curves. ROC curve areas are expressed as areas ± SE. Mean ± SD were compared by the t test. Comparison of proportions was made by a nonparametric two-sample test of proportions. Values of p < 0.05 were considered significant.
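For readers who want to see the shape of these calculations, the following Python sketch computes an ROC curve area as the proportion of diseased/nondiseased pairs ranked correctly (ties counted as one half), the Hanley and McNeil approximation to its standard error, and the z statistic for comparing two areas derived from the same patients. It is not the True Epistat implementation; function names are introduced here, and the correlation `r` between the two areas must be supplied by the reader, for example from the table published by Hanley and McNeil.

```python
import math

# Sketch of ROC area, its approximate standard error, and a z statistic for
# two correlated areas. Names are illustrative; `r` is supplied by the caller.


def roc_area(diseased_scores, nondiseased_scores):
    """Fraction of (diseased, nondiseased) pairs in which the diseased
    patient has the higher probability; ties count as one half."""
    wins = 0.0
    for d in diseased_scores:
        for n in nondiseased_scores:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(diseased_scores) * len(nondiseased_scores))


def hanley_mcneil_se(area, n_diseased, n_nondiseased):
    """Approximate standard error of an ROC area."""
    q1 = area / (2.0 - area)
    q2 = 2.0 * area ** 2 / (1.0 + area)
    variance = (
        area * (1.0 - area)
        + (n_diseased - 1) * (q1 - area ** 2)
        + (n_nondiseased - 1) * (q2 - area ** 2)
    ) / (n_diseased * n_nondiseased)
    return math.sqrt(variance)


def z_for_correlated_areas(area_1, se_1, area_2, se_2, r):
    """z statistic for two areas estimated in the same patients."""
    return (area_1 - area_2) / math.sqrt(
        se_1 ** 2 + se_2 ** 2 - 2.0 * r * se_1 * se_2
    )
```

Because both areas in an incremental comparison come from the same validation patients, the correlation term reduces the denominator, so ignoring it (setting `r` to 0) would generally make the comparison more conservative.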

RESULTS

Derivation and validation subgroups. Table I lists the characteristics of the derivation and validation groups. Derivation group 1 was used for the pretest and post-exercise ECG algorithms, and derivation group 2 was used for the post-thallium-201 scintigram algorithms. The two derivation groups differed with respect to clinical variables and chronology. However, because the exercise ECG and thallium scintigram analyses were by design separate, these differences were not expected to affect our results. In derivation group 1, women were older, included fewer smokers, and had a lower prevalence of coronary disease than men.

Compared with the derivation groups, the validation group had a lower percentage of women (33% vs 45%), patients with diabetes (15% vs 22%), current cigarette smokers (27% vs 35%), and nonanginal pain (20% vs 29%) and a higher percentage of patients with typical and atypical angina (68% vs 63%), uninterpretable resting ECGs (23% vs 25%), and positive exercise ECGs (42% vs 34%). The validation group also had a higher prevalence of coronary disease (57% vs 41%), multivessel disease (39% vs 34%), and three-vessel disease (28% vs 14%). The

Table III. Accuracy of exercise ECG and thallium-201 scintigraphy by diseased vessel number in derivation groups

| | One-vessel disease | Two-vessel disease | Three-vessel disease* |
|---|---|---|---|
| Positive exercise ECG: No. | 80 | 68 | 95 |
| Positive exercise ECG: sensitivity | 39% | 52% | 63% |
| Positive exercise ECG: specificity | 67% | 68% | 71% |
| Thallium scintigraphy, any defect: No. | 54 | 35 | 18 |
| Thallium scintigraphy, any defect: sensitivity | 76% | 86% | 72% |
| Thallium scintigraphy, any defect: specificity | 33% | 34% | 31% |

*Includes patients with significant left main artery disease.

percentage of abnormal exercise thallium study results and the intensity of exercise as assessed by the heart rate and blood pressure variables were similar. Because of differences between the derivation and validation groups, it was believed that the validation group would present a suitable challenge to the generalizability of the derived algorithms. Within the validation group, women were older than the men and had a different mix of symptoms. They also had fewer positive exercise ECG results and thallium test results, probably because of the lower prevalence of coronary disease.

Accuracy of exercise tests. Table II summarizes the sensitivity and specificity for exercise ECG and thallium scintigraphy in the derivation (SPECT) and validation (SPECT and planar) groups. Although the sensitivity of exercise ECG was higher in the validation group, the specificity was very uniform. Conversely, although specificity of thallium scintigraphy was higher in the validation group, the sensitivity was very uniform. The accuracy of exercise ECG was lower in women than men, although this difference was less apparent in the validation group. Table III summarizes the sensitivity and specificity of exercise ECG and SPECT thallium scintigraphy for detecting single-, double-, and triple-vessel or left main artery disease in the derivation groups. Although the sensitivity of either technique varied for the number of vessels involved, the specificity remained uniform.
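The sensitivity and specificity figures summarized in Tables II and III follow from the standard two-by-two definitions. The Python sketch below shows the calculation for paired test and angiographic results; the function name and input layout are assumptions introduced for illustration, and the sketch assumes at least one diseased and one nondiseased patient.

```python
# Sketch of the standard sensitivity/specificity calculation.
# The function name and list-of-booleans layout are illustrative assumptions.


def sensitivity_specificity(test_positive, disease_present):
    """Both arguments are equal-length sequences of booleans, one per patient.
    Assumes at least one patient with and one without disease."""
    pairs = list(zip(test_positive, disease_present))
    true_pos = sum(1 for t, d in pairs if t and d)
    false_neg = sum(1 for t, d in pairs if not t and d)
    true_neg = sum(1 for t, d in pairs if not t and not d)
    false_pos = sum(1 for t, d in pairs if t and not d)
    sensitivity = true_pos / (true_pos + false_neg)   # among diseased
    specificity = true_neg / (true_neg + false_pos)   # among nondiseased
    return sensitivity, specificity
```

Sensitivity is computed only among patients with disease and specificity only among those without, so the same nondiseased patients can serve as the reference group when sensitivity is tabulated separately by number of diseased vessels.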

Derivation of algorithms. All variables included in the final algorithms were significant independent predictors of the respective end point, either coronary disease presence or extent. The post-exercise ECG and post-thallium scintigram algorithms incorporated prior probability, respectively referred to as pretest and post-exercise ECG probabilities. Details concerning the variable coefficients and the use of the algorithms are provided in the Appendix.

Pretest algorithms were developed for probability


[Figure 1: two ROC plots, panels A and B, each with curves labeled 1, 2, and 3; horizontal axis: Specificity, from 1.00 to 0.00.]

Fig. 1. ROC curves of three incremental models for coronary disease presence (A) and extent (B) derived from validation population. Performance of pretest model (1) was significantly improved by addition of exercise ECG variables. Likewise, performance of combined pretest and exercise ECG model (2) was improved by addition of thallium-201 scintigraphy variables (3).

estimation with and without a total cholesterol concentration value, because this value is frequently not available at the initial clinical examination. Other variables included were age, sex, symptom score, current diabetes, and current cigarette smoking. Cigarette smoking was not a significant predictor of coronary disease presence in men, but it was a significant predictor of coronary disease extent in men and women. When cholesterol concentration was included, smoking lost its significance as a predictor of disease presence and extent.

Previously, we developed an exercise ECG algorithm for estimating the probability of coronary disease presence that had four equations depending on the patient's sex and the interpretability of the resting ECG.⁵ For the current study, we combined these four equations into a single equation with a separate equation for coronary disease presence and extent. In summary, we found that, although ST-segment variables were independent predictors of disease presence and extent in men and women, there were differences in men and women with respect to the details. Variables for a negative ST-segment response (negative correlation) and ST-segment slope (positive correlation) were predictors in both men and women. However, quantitative ST-segment depression (millimeters) was predictive only in men, as was demonstrated in a previous report.⁵ In this regard, millimeters ST-segment depression is included in the equations in the Appendix but is always

Table IV. Incremental ROC curve area analysis

| | Disease presence | Disease extent |
|---|---|---|
| Pretest | 0.73 ± 0.02 | 0.67 ± 0.02 |
| p (increment) | p < 0.00001 | p < 0.00001 |
| Post-exercise ECG | 0.78 ± 0.01 | 0.73 ± 0.02 |
| p (increment) | p < 0.00001 | p < 0.005 |
| Post-thallium scintigraphy | 0.83 ± 0.01 | 0.78 ± 0.02 |

coded as 0 for women. Peak heart rate and change in systolic blood pressure were included in the algorithms for disease presence and extent, respectively. As noted earlier in this report, exercise-induced symptoms and exercise capacity were not included because of their lack of incremental predictive value for disease presence or extent.⁵,⁶
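The sex-specific handling of quantitative ST-segment depression can be illustrated with a short Python sketch. Only the rule that millimeters of ST-segment depression is coded as 0 for women comes from the text; the function name, the dictionary keys, and the way the other variables are passed through are hypothetical, and the actual coefficients reside in the Appendix.

```python
# Sketch of assembling exercise ECG inputs with sex-specific ST coding.
# Names and layout are hypothetical; only the "0 for women" rule is from the text.


def exercise_ecg_inputs(sex, st_depression_mm, negative_st_response,
                        st_slope, peak_heart_rate, delta_systolic_bp):
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    return {
        # Quantitative depression enters only for men; always 0 for women.
        "st_depression_mm": st_depression_mm if sex == "male" else 0.0,
        # These variables are considered for both men and women.
        "negative_st_response": 1 if negative_st_response else 0,
        "st_slope": st_slope,
        "peak_heart_rate": peak_heart_rate,          # disease presence algorithm
        "delta_systolic_bp": delta_systolic_bp,      # disease extent algorithm
    }
```

With this coding a single equation can serve both sexes, since the millimeter term simply drops out for women while the negative-response and slope terms still contribute.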

Thallium scintigraphy variables were chosen by considering the significance of their independence and their effect on accuracy assessed by ROC curve area measurements performed in derivation group 2, in a manner similar to an approach used previously.⁶ All thallium variables were good univariate predictors, and when all three thallium variables were evaluated with post-exercise ECG probability, the ROC curve area increased from 0.80 to 0.84 for disease presence and from 0.77 to 0.82 for disease extent (p < 0.05). Of the three variables evaluated together, maximal hypoperfusion was the only independent predictor of disease presence and extent. Despite the univariate correlation of thallium defect reversibility, when analyzed with other thallium variables, its


Table V. Incremental ROC curve area analysis by sex

| | Disease presence, men | Disease presence, women | Disease extent, men | Disease extent, women |
|---|---|---|---|---|
| Pretest | 0.71 ± 0.02 | 0.72 ± 0.03 | 0.66 ± 0.02 | 0.68 ± 0.03 |
| p (increment) | p < 0.00001 | p < 0.001 | p < 0.00001 | p < 0.05 |
| Post-exercise ECG | 0.77 ± 0.02 | 0.76 ± 0.02 | 0.72 ± 0.02 | 0.69 ± 0.03 |
| p (increment) | p < 0.00001 | p < 0.05 | p < 0.01 | p < 0.001 |
| Post-thallium scintigraphy | 0.83 ± 0.02 | 0.81 ± 0.02 | 0.77 ± 0.02 | 0.77 ± 0.03 |

Table VI. Comparison of studies using multivariable analysis of clinical and exercise test variables pertaining to extent of coronary artery disease

| Reference | Year of publication | Sample size | Men/women (%) | Analysis by sex | Extent | Prevalence | Validation | Incremental study |
|---|---|---|---|---|---|---|---|---|
| 15 | 1981 | 1351 | 100/0 | Y | MVD | 44 | N | Y |
| 16 | 1982 | 141 | 82/18 | N | MVD | 48 | N | N |
| 17 | 1985 | 171 | 100/0 | Y | MVD | 45 | N | Y |
| 18 | 1991 | 1074 | 77/23 | N | TLD* | 19 | Y | N |
| 19 | 1991 | 6435 | Unknown | N | TLD | 33 | Y | Y |
| 20 | 1992 | 688 | 77/23 | N | TLD | 28 | N | N |
| 21 | 1992 | 680 | 73/27 | N | TLD | 31 | N | Y |
| 22 | 1992 | 607 | 100/0 | Y | TLD | 16 | N | N |
| 7 | 1993 | 834 | 74/26 | N | TLD* | 27 | N | N |
| 6 | 1994 | 800 | 65/35 | N | TLD* | 18 | N | Y |
| 23 | 1994 | 411 | 81/19 | N | TLD | 19 | Y | Y |
| Current study | 1994 | 590 | 55/45 | Y | MVD* | 34 | Y | Y |

MVD, Multivessel disease; N, no; TLD, triple-vessel or left main artery disease; Y, yes. *Angiographic standard was ≥50% stenosis. Otherwise standard was 70% to 75% stenosis.

diagnostic effect was negligible. This finding probably reflects the characteristics of the referral population: many of the patients who underwent thallium studies were referred for angiography because of a defect's reversibility and not because of the intensity of exercise hypoperfusion. Nevertheless, because data on hypoperfusion intensity are less available in the validation group than are data concerning defect reversibility, equations considering reversibility were also developed, were used in our validation evaluation, and are included in the Appendix.

Validation of algorithms and incremental value comparisons. The equations in the Appendix were used to generate probability data for the validation population. The results of the first three comparisons that follow are shown in Fig. 1.

Pretest versus exercise ECG. Table IV indicates that clinical variables discriminate disease presence and extent moderately well. However, the incremental increase in accuracy over pretest data attributed to exercise ECG was very significant.

Exercise ECG versus thallium-201 scintigraphy. Table IV indicates that there was a significant and substantial incremental increase in accuracy when thallium scintigraphy was added to exercise ECG.

Presence versus extent of coronary disease. Starting with the pretest evaluation in Table IV, the algorithms for disease presence had higher ROC curve areas than those for disease extent. Therefore the variables used were better able to discriminate disease presence than disease extent.

Men versus women. In Table V, similar observations concerning disease presence and extent were made for men and women.
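As a rough illustration of how ROC curve areas such as those compared above can be obtained from a set of predicted probabilities, the following Python sketch computes the area as the proportion of diseased/nondiseased patient pairs in which the diseased patient received the higher probability (ties counted as one half). The function name `roc_area` and the six-patient data set are hypothetical and are not taken from the study; the standard errors and the significance tests of the increments are not reproduced here.

```python
def roc_area(probabilities, has_disease):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    diseased = [p for p, d in zip(probabilities, has_disease) if d]
    healthy = [p for p, d in zip(probabilities, has_disease) if not d]
    if not diseased or not healthy:
        raise ValueError("need at least one patient with and one without disease")
    score = 0.0
    for p_dis in diseased:
        for p_non in healthy:
            if p_dis > p_non:
                score += 1.0
            elif p_dis == p_non:
                score += 0.5
    return score / (len(diseased) * len(healthy))


# Hypothetical probabilities for six patients (1 = angiographic disease).
disease = [1, 1, 1, 0, 0, 0]
pretest = [0.60, 0.40, 0.50, 0.50, 0.30, 0.45]
post_ecg = [0.70, 0.50, 0.60, 0.40, 0.20, 0.55]

area_pretest = roc_area(pretest, disease)
area_post_ecg = roc_area(post_ecg, disease)
print(f"pretest area {area_pretest:.2f}, post-exercise ECG area {area_post_ecg:.2f}")
print(f"incremental increase {area_post_ecg - area_pretest:.2f}")
```

In this toy example the increment is simply the difference between the two areas; whether such a difference is statistically significant requires a separate test that accounts for both curves being derived from the same patients.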

DISCUSSION

Previously, we demonstrated that there was an incremental increase in diagnostic accuracy for clinical, exercise ECG, and thallium-201 scintigraphy variables concerning the presence of angiographic coronary artery disease. 5 The current study extends these findings to coronary disease extent. In addition, by our multivariable algorithms, there was similar incremental diagnostic value in both men and women for both disease presence and extent.

The current study (as reflected in Table V) does suggest that exercise ECG has similar accuracy in men and women. However, numerous studies and our own Table II suggest that exercise-induced ST-segment depression is less reliable in women than in men. 9 However, the results displayed in Table V do not reflect only ST-segment changes. The pretest probability calculation contains an adjustment for sex such that probability is reduced for women. In addition, millimeters ST-segment depression is always coded as 0 for women and as the actual millimeters of depression for men. Both women and men had negative ST-segment responses, peak heart rate, and change in systolic blood pressure considered in the algorithms. Therefore the post-exercise ECG results in Table V reflect much more than simple ST-segment change, with several adjustments that would tend to minimize those raw sex-related differences. In other words, interpretation on the basis of this probabilistic approach would tend to negate differences in test accuracy.

Table VII. Comparison of variables evaluated in studies using multivariable methods pertaining to extent of coronary artery disease

Reference | Age | Sex | Symptoms | Smoking | Diabetes mellitus | Total cholesterol | Resting ECG | Other | ST (mm) | ST slope | Ex-induced angina | Ex ΔBP | Ex capacity | Peak ex HR | RPP | Thallium
15 | Y | -- | Y | Y | N | -- | Y | -- | Y | -- | N | Y | Y | --
16 | N | Y | Y | -- | -- | -- | N | -- | N | Y | N | Y | Y | N | Y | --
17 | Y | -- | Y | N | N | -- | Y | Family history | Y | -- | N | -- | Y | N | Y | Y
18 | Y | Y | Y | N | Y | Y | Y | Resting BP | Y | Y | Y | Y | Y | Y | -- | --
19 | Y | Y | Y | Y | Y | -- | N | Previous MI | -- | -- | -- | -- | -- | -- | -- | --
20 | Y | N | -- | N | Y | -- | -- | High BP | Y | -- | N | Y | Y | Y | Y | Y
21 | Y | Y | Y | N | Y | -- | Y | -- | -- | -- | -- | -- | -- | --
22 | Y | -- | N | -- | -- | -- | Y | -- | Y | N | -- | Y | -- | -- | -- | --
7 | N | N | -- | -- | N | -- | -- | -- | Y | -- | N | N | N | Y | -- | Y
6 | Y | Y | Y | N | Y | Y | -- | -- | Y | Y | N | -- | Y | Y | -- | --
23 | Y | Y | Y | N | Y | -- | -- | -- | Y | -- | N | N | N | -- | Y | Y
Present | Y | Y | Y | Y | Y | Y | N | -- | Y | Y | N | Y | -- | N | -- | Y

BP, Blood pressure; Ex, exercise; HR, heart rate; MI, myocardial infarction; N, not a significant predictor; RPP, rate-pressure product; ST, any ST-segment depression; Y, significant predictor; --, variable not evaluated.

Previous studies using multivariable methods have demonstrated numerous clinical and exercise test variables that are predictors of coronary disease presence and extent. We have previously reviewed the studies that evaluated disease presence. 5 We found at least 11 studies in addition to the current study which considered disease extent. 6, 7, 15-23 The characteristics and findings of these 12 studies are presented for comparison in Tables VI and VII.

Previous studies (Table VI)

Sex. Table VI indicates that 9 of 12 studies evaluated data from both men and women. Except for the current study and one other, those with both sexes used predominantly (>70%) male populations. Of 9 studies with data for both men and women, only 1 (the current study) analyzed data separately in men and women. Therefore, women were underrepresented and not examined as a separate group in most previous studies.

Prevalence. The prevalence of disease in more than one coronary vessel in the 12 studies ranged from 18% to 48%. However, the definition of extent (multivessel vs three-vessel or left main artery disease) and percentage stenosis (50% vs 75%) differed in many of the studies. The Coronary Artery Surgery Study 24 found that the prevalence of multivessel and three-vessel or left main artery disease in 8157 patients with stable chest pain and no myocardial infarction was, respectively, 45% to 50% and 33% to 38% for men and 15% to 20% and 9% to 14% for women. The ranges reflect differences resulting from the angiographic standard chosen (50% vs 70%). Overall, prevalences for the two definitions of coronary disease extent were 35% to 40% for multivessel disease and 25% to 30% for three-vessel or left main artery disease. Of the studies in Table VI, all had a prevalence for their respective definition of disease extent that was close to these estimated prevalence ranges derived from the Coronary Artery Surgery Study. 24

Incremental consideration. As stated earlier, the interpretation of exercise test data within the context of other known clinical data is an important and often underappreciated consideration. 8 Therefore, incremental evaluations are necessary for determining the true clinical relevance of exercise test variables. Table VI indicates that 7 of the 12 studies can be considered incremental.

Validation. When establishing that certain variables are predictors of dependent variables such as disease extent, it is equally important to validate these findings in a separate population, ideally one that is geographically or institutionally separate. Only 3 of the 12 studies, including the current study, carried out validation studies, and only 2 of these were incremental studies.

Variables evaluated (Table VII)

Clinical variables. The majority of the studies found age, sex, symptoms, history of diabetes, and cholesterol to be independent predictors of coronary disease extent. There was less uniformity for smoking and resting ECG status.

ST-segment depression. Concerning the exercise ECG variables, millimeters ST depression was a good predictor in 9 of 10 studies. Only the current study evaluated ST-segment depression separately in men and women. We found that millimeters ST-segment depression was a predictor in men but not in women.

ST-segment slope. ST-segment slope was a good predictor in four of five studies. However, the definition of this variable was far from uniform. Again, only the current study evaluated ST-segment slope separately in men and women. However, we found that ST-segment slope was a predictor in both men and women.

Non-ST-segment exercise variables. Only two of nine studies found exercise-induced angina to be a good predictor of extent. However, there was general agreement that exercise-induced changes in heart rate or systolic blood pressure were good predictors. Exercise capacity, usually defined as exercise duration or METs achieved, was found to be a good predictor in five of eight studies, but only two of the positive studies used an incremental study design. One study demonstrated independent predictability for exercise capacity but a lack of an increase in incremental diagnostic accuracy compared with other exercise test variables. 6

Previous studies of the assessment of coronary disease prognosis have identified variables not included in our algorithms, such as exercise-induced angina and exercise capacity. 25, 26 Although this approach seems to be at odds with our results, especially concerning disease extent, analyses dealing with disease severity need not necessarily yield the same results as analyses dealing with prognosis. The anatomic extent of disease and the prognostic effect of that disease are separate clinical questions that may be predicted by different variables. Therefore, because our study and those dealing with prognosis are not truly comparable, neither ours nor the others are incorrect, and our findings are not necessarily incompatible with the others.

Thallium-201 scintigraphy. All six studies (three incremental) that evaluated thallium scintigraphy found it to be an independent predictor. However, the variables evaluated in each study differed. The thallium variables in the current study were, by design, simple and qualitative compared to other quantitative methods and were taken from the early SPECT experience of the derivation institution. We realize that this put the thallium study in a disadvantageous position, as evidenced by the lower specificity (Table II). Nevertheless, we sought to determine if these limited qualitative data from the early experience of one institution would add incremental information in a diverse set of patients. In this respect, there was a significant incremental increase in discriminant accuracy for the validation group (Table IV) and for men and women (Table V). It seems logical that with the consideration of more quantitative thallium variables, the incremental increase might have been even greater, especially concerning disease extent. However, a recent report 23 suggests that quantitative thallium variables add little information to clinical and exercise variables for identifying cases of severe coronary disease. Although this study differed from the current study in several ways (Table VI), its results do raise concern about the incremental value of quantitative thallium studies for determining disease extent.

Limitations. In addition to limitations mentioned earlier, this study was limited by the variable data available for the validation group. Variables such as hypertension, obesity, estrogen status, semiquantitative thallium variables, and thallium lung-heart ratio were available in the derivation set, but no such data existed in the validation set. Because of this void in the validation group, we do not present results for those variables in this report.

In addition, as indicated in Table VI, the angiographic criterion for significant disease in many of the studies was 70% rather than 50% diameter stenosis. Coronary Artery Surgery Study 24 data suggest that this difference might affect prevalence rates by approximately 5%. Although we have data concerning this variable for the derivation groups, we have no data concerning this variable for the validation group. We did evaluate the derivation group by this angiographic criterion (data not presented). As expected, prevalence rates according to 70% rather than 50% stenosis were lower (disease presence 41% vs 35%). Logistic regression analysis, however, according to 70% stenosis did not reveal any differences in the variables selected or their relative significance. While the logistic coefficients in the Appendix would have been different, whether this more restrictive angiographic criterion would have affected our conclusions based on Tables IV and V is unknown.

Published accuracy results for exercise ECG 27 indicate that the sensitivity of this test at the derivation institution (Table II) was substantially lower (52% vs 68%). On the other hand, the specificities are comparable. This lower sensitivity is likely attributable to medications that were unaccounted for in our analysis. We have limited data for the derivation group and no data for the validation group concerning medication use. We do know that 20% of the derivation group were receiving β-adrenergic blocking agents at the time of their exercise test. The frequencies were similar in men and women and in patients with and without angiographic disease. Nevertheless, the presence of β-adrenergic blocking agents in some patients is a likely explanation for the lower sensitivity in our study. The influence of other antianginal medications (such as calcium-channel blocking agents and nitrates) on our findings is unknown.

Conclusions. An appreciation of the concept of the incremental value of a diagnostic test is important to understanding the effect of a test in specific clinical scenarios. An increasing number of tests are being subjected to this scrutiny. 8 In addition to demonstrating overall incremental value, this study demonstrates that there is incremental value in exercise testing for assessment of both the presence and the extent of coronary disease and that this incremental value extends to both men and women. Therefore, despite the presence of well-established differences in the accuracy of the exercise ECG for men and women, multivariable methods offer a means to rectify results for men and women by considering sex-related differences and incorporating them into a model.

The clinical relevance of these findings is that the results of exercise testing for men and women should be subjected to probabilistic analysis before final interpretation. Simply designating a test result as positive or negative is insufficient without some consideration of the clinical context and the effect of previous clinical impressions on the specific test results. Although in many situations a probabilistic assessment of test results will have little clinical


18. Detrano R, Janosi A, Steinbrunn W, Pfisterer M, Schmid J, Meyer MM, Guppy KH, Abi-Mansour P. Algorithm to predict triple-vessel/left main patients without myocardial infarction: an international cross validation. Circulation 1991;83:III-89-96.

19. Pryor DB, Shaw L, Harrell FE, Lee KL, Hlatky MA, Mark DB, Muhlbaier LH, Califf RM. Estimating the likelihood of severe coronary artery disease. Am J Med 1991;90:553-62.

20. Christian TF, Miller TD, Bailey KR, Gibbons RJ. Noninvasive identification of severe coronary artery disease using exercise tomographic thallium-201 imaging. Am J Cardiol 1992;70:14-20.

21. Hubbard BL, Gibbons RJ, Lapeyre AC, Zinsmeister AR, Clements IP. Identification of severe coronary artery disease using simple clinical parameters. Arch Intern Med 1992;152:309-12.

22. Ribisl PM, Morris CK, Kawaguchi T, Ueshima K, Froelicher VF. Angiographic patterns and severe coronary artery disease. Arch Intern Med 1992;152:1618-24.

23. Christian TF, Miller TD, Bailey KR, Gibbons RJ. Exercise tomographic thallium-201 imaging in patients with severe coronary artery disease and normal electrocardiograms. Ann Intern Med 1994;121:825-32.

24. Chaitman BR, Bourassa MG, Davis K, Rogers WJ, Tyras DH, Berger R, Kennedy JW, Fisher L, Judkins MP, Mock MB, Killip T. Angiographic prevalence of high-risk coronary artery disease in patient subsets (CASS). Circulation 1981;64:360-7.

25. Mark DB, Hlatky MA, Harrell FE, Lee KL, Califf RM, Pryor DB. Exercise treadmill score for predicting prognosis in coronary artery disease. Ann Intern Med 1987;106:793-800.

26. Froelicher V, Morrow K, Brown M, Atwood E, Morris C. Prediction of atherosclerotic cardiovascular death in men using a prognostic score. Am J Cardiol 1994;73:133-8.

27. Detrano R, Gianrossi R, Froelicher V. The diagnostic accuracy of the exercise electrocardiogram: a meta-analysis of 22 years of research. Progress in Cardiovascular Medicine 1989;32:173-206.

APPENDIX

The basic logistic equation is as follows:

$$\text{Probability} = \frac{1}{1 + e^{-(a + bx + \cdots)}}$$

where a = intercept; b = β estimate; and x = variable value.

The following is a list of equations suitable for entry into a handheld calculator or a spreadsheet program. Variable codes are included after each set of equations for the three incremental stages.

Pretest probability

Disease presence

p = 1/(1 + EXP(-(-8.116 + 0.0744 × AGE - 0.935 × SEX + 0.6752 × SYM + 0.7472 × DM + 0.0076 × CHOL)))

p = 1/(1 + EXP(-(-7.403 + 0.0883 × AGE - SEX + 0.7762 × SYM + 0.7855 × DM + 0.7211 × SMOKEW)))

Disease extent

p = 1/(1 + EXP(-(-7.899 + 0.0844 × AGE - 1.11 × SEX + 0.2976 × SYM + 0.9704 × DM + 0.0072 × CHOL)))

p = 1/(1 + EXP(-(-7.602 + 0.0997 × AGE - 0.908 × SEX + 0.428 × SYM + 0.9941 × DM + 0.5562 × SMOKE)))

Pretest variable coding

AGE: Years
SEX: Men = 0; women = 1
SYM (symptoms): Typical = 4; atypical = 3; nonanginal = 2; asymptomatic = 1
CHOL (cholesterol): Total (mg/dl)
DM (diabetes): Yes = 1; no = 0
SMOKE (men and women): Yes = 1; no = 0
SMOKEW (women only): Yes = 1; no = 0; men = 0

Post-exercise ECG probability

Disease presence

p = 1/(1 + EXP(-(0.0259 + 4.328 × PRE + 0.4396 × MM + 1.285 × DWN - 0.492 × NEG - 0.0163 × PHR)))

Disease extent

p = 1/(1 + EXP(-(-2.359 + 4.853 × PRE + 0.3566 × MM + 0.9036 × DWN - 0.0092 × CSBP)))

Post-exercise ECG instructions and coding

PRE = Pretest probability. Requires previous calculation and is a number between 0 and 1.
MM = Millimeters ST depression. Coded as 0 for women and for all subjects with uninterpretable resting ECGs. Coded as actual millimeters (0 to 10) of any ST depression 80 msec from the J-point for men with interpretable resting ECGs.
DWN = Downsloping ST segments. Coded as 1 if resting ECG is interpretable and the exercise response is downsloping; otherwise coded as 0.
NEG = Negative ST response. Coded as 1 if resting ECG is interpretable and ST is <1 mm with horizontal and downsloping ST segments and <1.5 mm with upsloping ST segments; otherwise coded as 0.
PHR = Peak heart rate (beats per minute).
CSBP = Change in systolic blood pressure. Peak exercise minus rest systolic blood pressure.

Post-thallium-201 scintigram probability

Disease presence

p = 1/(1 + EXP(-(-4.64 + 4.83 × POST + 1.68 × HYPO + 2.45 × DEF)))

p = 1/(1 + EXP(-(-3.29 + 4.845 × POST + 0.6244 × REV)))

Disease extent

p = 1/(1 + EXP(-(-4.168 + 3.7 × POST + 1.21 × HYPO + 1.535 × DEF)))

p = 1/(1 + EXP(-(-2.853 + 3.531 × POST + 0.53 × REV)))

Post-thallium-201 scintigram coding

POST = Post-exercise ECG probability. Requires previous calculation and is a number between 0 and 1.
HYPO = Hypoperfusion. Degree (coded 0 to 3) of hypoperfusion of most intense defect.
DEF = No defect. Coded 1 if no defect and 0 for any defect.
REV = Reversibility. Coded 3 for any reversible defect, 2 for fixed defects only, and 1 for no defect.
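For readers who prefer a program to a calculator, the following Python sketch chains the three incremental stages for disease presence, using the cholesterol form of the pretest equation, the post-exercise ECG equation, and the reversibility form of the post-thallium equation as printed above. The function names, the `slope` labels ("down", "horizontal", "up"), and the two example patients are hypothetical conveniences that do not appear in the Appendix; the sketch is illustrative only and is not a validated clinical tool.

```python
import math


def logistic(z):
    """Basic logistic equation: 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))


def pretest_presence(age, sex, sym, dm, chol):
    """Pretest probability of disease presence (cholesterol equation).

    sex: men = 0, women = 1; sym: 4 typical, 3 atypical, 2 nonanginal,
    1 asymptomatic; dm: 1 yes, 0 no; chol: total cholesterol (mg/dl).
    """
    return logistic(-8.116 + 0.0744 * age - 0.935 * sex + 0.6752 * sym
                    + 0.7472 * dm + 0.0076 * chol)


def code_st_response(sex, interpretable, st_mm, slope):
    """Return (MM, DWN, NEG) following the Appendix coding rules.

    slope is an assumed label: "down", "horizontal", or "up".
    """
    if not interpretable:
        return 0.0, 0, 0
    # MM is always 0 for women; actual millimeters (0 to 10) for men.
    mm = 0.0 if sex == 1 else min(max(st_mm, 0.0), 10.0)
    dwn = 1 if slope == "down" else 0
    # Negative response: <1 mm (horizontal/downsloping) or <1.5 mm (upsloping).
    limit = 1.5 if slope == "up" else 1.0
    neg = 1 if st_mm < limit else 0
    return mm, dwn, neg


def post_ecg_presence(pre, mm, dwn, neg, phr):
    """Post-exercise ECG probability of disease presence."""
    return logistic(0.0259 + 4.328 * pre + 0.4396 * mm + 1.285 * dwn
                    - 0.492 * neg - 0.0163 * phr)


def post_thallium_presence(post, rev):
    """Post-thallium probability of disease presence (reversibility equation)."""
    return logistic(-3.29 + 4.845 * post + 0.6244 * rev)


# Two hypothetical 55-year-old patients with typical angina, no diabetes,
# cholesterol 220 mg/dl, 2 mm horizontal ST depression, peak heart rate 140,
# and a reversible thallium defect (REV = 3).
for label, sex in (("man", 0), ("woman", 1)):
    pre = pretest_presence(age=55, sex=sex, sym=4, dm=0, chol=220)
    mm, dwn, neg = code_st_response(sex, True, 2.0, "horizontal")
    post = post_ecg_presence(pre, mm, dwn, neg, phr=140)
    final = post_thallium_presence(post, rev=3)
    print(f"{label}: pretest {pre:.2f}, post-ECG {post:.2f}, "
          f"post-thallium {final:.2f}")
```

Each stage takes the previous stage's probability as an input (PRE, then POST), which is how the algorithms express incremental value. The example also shows the sex-specific coding in action: the same 2 mm of ST depression enters the equation as MM = 2 for the man and MM = 0 for the woman.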