CLINICAL EPIDEMIOLOGY ROUNDS


The first round in this series (Can Med Assoc J 1981; 124: 555-558) presented 10 reasons to read clinical journals and introduced a flowchart of guides for reading them (Fig. 1) that suggests four universal guides for any article (consider the title, the authors, the summary and the site) and points out that further guides for reading (and discarding) articles depend on why they are being read.

This round will present guides for reading articles that describe diagnostic tests, both old and new. First, however, we must give some nominal definitions.

The serum level of thyroxine (T4) can be measured in at least four circumstances, and it is important for us to tell them apart. First, a group of passers-by in a shopping plaza or the members of a senior citizens' club may be invited to have a free T4 test; this testing of apparently healthy volunteers from the general population for the purpose of separating them into groups with high and low probabilities for thyroid disease is called screening. Second, patients who come to a clinician's office for any illness may have a T4 test routinely added to whatever laboratory studies are undertaken to diagnose their chief complaints; this testing of patients for disorders that are unrelated to the reason they came to the clinician is called case finding. Third, a T4 test may be specifically ordered to explain the exact cause for a patient's presenting illness; this, of course, is diagnosis. Finally, a T4 test may be ordered for a patient who is taking a replacement hormone or who has previously received therapeutic radioiodine in order to test for achievement of a treatment goal.

Reprint requests to: Dr. R.B. Haynes, McMaster University Health Sciences Centre, Rm. 3V43D, 1200 Main St. W, Hamilton, Ont. L8N 3Z5

This round will deal mostly with diagnosis, and later rounds will take up the other three uses of paraclinical data such as a T4 determination.

Guides for reading articles about diagnostic tests

When encountering an article that looks like it might be describing a

FIG. 1-The first steps in how to read articles in a clinical journal. [Flowchart: look at the TITLE (interesting or useful?); review the AUTHORS (good track record?); read the SUMMARY (if valid, would these results be useful?); consider the SITE (if valid, would these results apply in your practice?).]

CMA JOURNAL/MARCH 15, 1981/VOL. 124 703

consider the diagnostic test: Does it have something to offer that the gold standard does not? For example, is it less risky, less uncomfortable or less embarrassing for the patient, less costly or applicable earlier in the course of the illness? Again, if the proposed diagnostic test offers no theoretical advantage over the gold standard, why read further?

Having satisfied yourself that it's

*Of course, the gold standard mustn't include the diagnostic test result as one of its components, for the resulting "incorporation bias" would invalidate the whole comparison.3

Table I-Elements of the proper clinical evaluation of a diagnostic test

1.

Table II-Fourfold table demonstrating "blind" comparison with "gold standard"

                                      Gold standard
Test result                           Patient has        Patient does not
(conclusion drawn from the            the disease        have the disease
results of the test)

Positive: patient appears             True positive      False positive       a + b
to have the disease                   (a)                (b)

Negative: patient appears             False negative     True negative        c + d
not to have the disease               (c)                (d)

Stable properties:
a/(a + c) = sensitivity
d/(b + d) = specificity

Frequency-dependent properties:
a/(a + b) = positive predictive value*
d/(c + d) = negative predictive value
(a + d)/(a + b + c + d) = accuracy
(a + c)/(a + b + c + d) = prevalence

*Positive predictive value can be calculated other ways too. One of them uses Bayes' theorem:

positive predictive value = (prevalence)(sensitivity) / [(prevalence)(sensitivity) + (1 - prevalence)(1 - specificity)]
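As a rough illustration of the definitions in Table II, the following Python sketch computes each property from the four cells and checks that the Bayes' theorem form gives the same positive predictive value. The class name, function name and the cell counts (90, 30, 10, 170) are hypothetical and chosen only for the example; they do not come from any study cited here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourfoldTable:
    """Cells of Table II (hypothetical helper class)."""
    a: int  # true positives
    b: int  # false positives
    c: int  # false negatives
    d: int  # true negatives

    @property
    def total(self) -> int:
        return self.a + self.b + self.c + self.d

    @property
    def sensitivity(self) -> float:
        return self.a / (self.a + self.c)

    @property
    def specificity(self) -> float:
        return self.d / (self.b + self.d)

    @property
    def positive_predictive_value(self) -> float:
        return self.a / (self.a + self.b)

    @property
    def negative_predictive_value(self) -> float:
        return self.d / (self.c + self.d)

    @property
    def accuracy(self) -> float:
        return (self.a + self.d) / self.total

    @property
    def prevalence(self) -> float:
        return (self.a + self.c) / self.total


def ppv_from_bayes(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Positive predictive value via the Bayes' theorem form in the Table II footnote."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Hypothetical counts, for illustration only.
table = FourfoldTable(a=90, b=30, c=10, d=170)
print(f"sensitivity = {table.sensitivity:.2f}")
print(f"specificity = {table.specificity:.2f}")
print(f"positive predictive value = {table.positive_predictive_value:.2f}")
print(f"negative predictive value = {table.negative_predictive_value:.2f}")
print(f"accuracy = {table.accuracy:.2f}")
print(f"prevalence = {table.prevalence:.2f}")
print(f"PPV via Bayes = {ppv_from_bayes(table.prevalence, table.sensitivity, table.specificity):.2f}")
```

The two routes to the positive predictive value agree because the Bayes form is simply a/(a + b) rewritten in terms of prevalence, sensitivity and specificity.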

tive diagnostic test result, in what proportion, a/(a + b), have we correctly predicted, or "ruled in", the correct diagnosis? This proportion, a/(a + b), again usually expressed as a percentage, goes by the name positive predictive value.

Similarly, we want to know how well a negative test result correctly predicts the absence of, or "rules out", the disease in question. This proportion, d/(c + d), is named the negative predictive value.

Another property of interest is the overall rate of agreement between the diagnostic test and the gold standard. Table II reveals that this could be expressed by the fraction (a + d)/(a + b + c + d); this rate is usually called accuracy.*

If a diagnostic test's predictive value constitutes the focus of our clinical interest, why waste time considering its sensitivity and specificity? The reason is a fundamental one that has major implications, not just for the rational use of diagnostic tests, but also for the basic education of clinicians. Put simply, a diagnostic test's positive and negative predictive values fluctuate widely, depending on the proportion of truly diseased individuals among patients to whom the test is applied - in Table II this is the proportion (a + c)/(a + b + c + d), a property called prevalence.

Although a diagnostic test's sen-

*Galen and Gambino,4 who have written a very thorough and easily understood book on this topic, call this property "efficiency". We won't.

Table III-Postexercise electrocardiogram as a predictor of coronary artery stenosis when the disease is present in half the men tested5

Positive predictive value = a/(a + b) = 55/62 = 89%
Negative predictive value = d/(c + d) = 84/133 = 63%
Sensitivity = a/(a + c) = 55/104 = 53%
Specificity = d/(b + d) = 84/91 = 92%
Prevalence = (a + c)/(a + b + c + d) = 104/195 = 53%

standard arteriographic results (a + c)/(a + b + c + d) or 104/195 or 53% of the patients had marked coronary artery stenosis - a highly selected group of patients indeed. What would happen if enthusiasts adopted the multistage stress test for wider use in an effort to detect significant coronary disease in men who want to take up jogging or other sports, regardless of whether they had any chest pain?* Would a positive stress test still be useful?

The results of applying this test to a less carefully selected group of men are entirely predictable (Table IV). If the true prevalence of marked coronary artery stenosis, as assessed by the gold standard of arteriography, was only 1/6 (104/624 or 17%) rather than better than 1/2 (104/195 or 53%), the test's positive predictive value would fall from 89% to 57% and its negative predictive value would rise from 63% to 91% - the reverse of the original situation.†

*The authors of the work cited in this example made no such recommendation.5

†This hypothetical case closely approximates what actually happened among women in the study cited here.5 Roughly one sixth had 75% stenosis or more and the stress test had a sensitivity of 50%, a specificity of 78% (values close to those observed among men), and positive and negative predictive values of 33% and 88% respectively. The authors concluded: "In women, a positive exercise test is of little value in predicting the presence of significant coronary artery disease, whereas a negative test is quite useful in ruling out the presence of significant disease."

Now, we said that this result could be forecast from Table III, and it is this forecasting feature that permits a reader to translate the results of a diagnostic test evaluation to his or her own setting. All that are needed are a rough estimate of the prevalence of the disease in one's own practice (from personal experience) or practices like it (from other articles) and some simple arithmetic. For example, as we've charitably estimated for Table IV, approximately one sixth of all men (both symptomatic and asymptomatic) sent for coronary arteriography from a primary care setting might ultimately be found to have coronary artery stenosis. Thus, if we started with the original number of patients with coronary artery disease (104), five times this number (520) would be free of the disease. Because sensitivity remains constant, 55 (53%) of the 104 diseased men would have positive exercise ECGs. Similarly, because specificity remains at 92%, 478 of the 520 nondiseased men would have negative tests. The rest of the table can then be completed by adding or subtracting to fill in the appropriate boxes, and the predictive values and accuracy can then be calculated. In this or any other example, then, the positive predictive value falls and the negative predictive value rises when a diagnostic test developed for patients with a high prevalence of the target disorder is subsequently applied to patients with a lower prevalence of the disorder.
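The arithmetic just described can be sketched in a few lines of Python. The function name is hypothetical. The sketch assumes, as the text appears to, that specificity is carried forward as the rounded 92% (the unrounded 84/91 would put 480 rather than 478 men in cell d).

```python
def forecast_fourfold(n_diseased, n_nondiseased, sensitivity, specificity):
    """Rebuild cells a, b, c, d for a new mix of diseased and nondiseased
    patients, holding sensitivity and specificity constant."""
    a = round(sensitivity * n_diseased)      # true positives
    c = n_diseased - a                       # false negatives
    d = round(specificity * n_nondiseased)   # true negatives
    b = n_nondiseased - d                    # false positives
    return a, b, c, d


# Table IV setting: 104 diseased men and five times as many (520) nondiseased.
a, b, c, d = forecast_fourfold(
    n_diseased=104,
    n_nondiseased=520,
    sensitivity=55 / 104,  # from Table III
    specificity=0.92,      # rounded value quoted in the text
)

ppv = a / (a + b)
npv = d / (c + d)
prevalence = (a + c) / (a + b + c + d)

print(f"a={a} b={b} c={c} d={d}")
print(f"positive predictive value = {a}/{a + b} = {ppv:.0%}")
print(f"negative predictive value = {d}/{c + d} = {npv:.0%}")
print(f"prevalence = {a + c}/{a + b + c + d} = {prevalence:.0%}")
```

Changing `n_nondiseased` to match the mix of patients in one's own practice gives the corresponding forecast of the predictive values.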

Our analysis derives its relevance from the very real differences in prevalence of various disorders in primary and tertiary care settings. But individual clinicians seldom work at more than one level of specialization and so it might be assumed that a given clinician need not be concerned about the effect of shifts in disease prevalence on his or her interpretation of diagnostic tests. This assumption is quite incorrect, however. We have already mentioned the difference in prevalence among men and women in the same clinical setting. Patients usually have a variety of easily discernible features that permit a fairly precise estimate of the diagnosis before any diagnostic tests are performed. For example, a 30-year-old man with a history of nonanginal chest pain has a low likelihood of coronary artery stenosis (Diamond and Forrester6 put this likelihood at 5%), whereas a 62-year-old man with typical angina has a very high likelihood of coronary stenosis (94%).6 When these "pretest likelihoods" or "prevalences" are fed into our diagnostic test model for exercise electrocardiography, the information provided by this test varies greatly. For the younger man it can be calculated that the likelihood of coronary artery stenosis is 26% if the exercise test is positive (positive predictive value) and 3% if the test is negative (this is the complement of the negative predictive value, or 1 - d/[c + d]). The exercise test is of little value here: a negative test merely informs us of the obvious (ischemic heart disease is unlikely in this man) and a positive test does not imply a sufficiently high probability of the disease to justify invasive testing under most circumstances.

The exercise test is also not very helpful for the 62-year-old man with typical angina. If the exercise test is positive the likelihood of disease rises only from 94% to 99%. If the test is negative the likelihood falls only to 89%, hardly reassuring enough to forgo further testing.

Table IV-Postexercise electrocardiogram as a predictor of coronary artery stenosis when the disease is present in one sixth of the men tested1

Positive predictive value = a/(a + b) = 55/97 = 57%
Negative predictive value = d/(c + d) = 478/527 = 91%
Sensitivity = a/(a + c) = 55/104 = 53% (as in Table III)
Specificity = d/(b + d) = 478/520 = 92% (as in Table III)
Prevalence = (a + c)/(a + b + c + d) = 104/624 = 17%

The important use of the exercise test (or any other test) lies in its application in cases of uncertainty. Let us consider another example, that of a 45-year-old man with atypical angina. Clinical studies demonstrate that such a patient has a 46% likelihood of coronary artery stenosis.6 Should he go on to angiography or not? If an exercise test is done and is positive, the likelihood of ischemic heart disease can be calculated to be 85%, and he should therefore have an angiogram if clinically warranted. If an exercise test is negative, however, the likelihood of significant coronary stenosis drops to 30% and the need for further investigation diminishes.

Thus, the exercise test is of value, but only for selected patients for whom the likelihood of coronary artery disease is neither high nor low. To act on the results of the exercise test in the last two circumstances makes little sense because it provides little information beyond that already apparent from the clinical presentation.
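The three worked examples (pretest likelihoods of 5%, 46% and 94%) can be reproduced with Bayes' theorem. The sketch below is only an illustration: it assumes the rounded sensitivity (53%) and specificity (92%) quoted for the exercise test in men, and the function names are hypothetical.

```python
SENSITIVITY = 0.53  # rounded value from Table III
SPECIFICITY = 0.92  # rounded value from Table III


def likelihood_if_positive(pretest: float) -> float:
    """Probability of disease after a positive test (positive predictive value)."""
    true_pos = pretest * SENSITIVITY
    false_pos = (1 - pretest) * (1 - SPECIFICITY)
    return true_pos / (true_pos + false_pos)


def likelihood_if_negative(pretest: float) -> float:
    """Probability of disease after a negative test
    (the complement of the negative predictive value)."""
    false_neg = pretest * (1 - SENSITIVITY)
    true_neg = (1 - pretest) * SPECIFICITY
    return false_neg / (false_neg + true_neg)


patients = {
    "30-year-old man, nonanginal chest pain": 0.05,
    "45-year-old man, atypical angina": 0.46,
    "62-year-old man, typical angina": 0.94,
}

for label, pretest in patients.items():
    pos = likelihood_if_positive(pretest)
    neg = likelihood_if_negative(pretest)
    print(f"{label}: pretest {pretest:.0%}, positive test {pos:.0%}, negative test {neg:.0%}")
```

The spread between the two post-test likelihoods is widest for the man with a pretest likelihood near 50%, which is the point the text makes about cases of uncertainty.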

Having discussed the fourfold comparison with a gold standard, what about the element of "blindness"? This simply means that those who are carrying out or interpreting the results of the diagnostic test should not know whether the patient being tested really does or does not have the disease of interest; that is, they should be "blind" to each patient's true disease status. Similarly, those who are applying the gold standard should not know the diagnostic test result for any patient. It is only when the diagnostic test and gold standard are applied in a blind fashion that we can be assured that conscious or unconscious bias (in this case the "diagnostic suspicion" bias) has been avoided.7 As you may recall, this bias was discussed in an earlier round on clinical disagreement.8

2. Did the patient sample include an appropriate spectrum of mild and severe, treated and untreated disease, plus individuals with different but commonly confused disorders?

Florid disease (such as long-standing rheumatoid arthritis) usually presents a much smaller diagnostic challenge than the same disease in an early or mild form; the real clinical value of a new diagnostic test often lies in its predictive value among equivocal cases. Moreover, the apparent diagnostic value of some tests actually resides in their ability to detect the manifestations of therapy (such as radiopaque deposits in the buttocks of ancient syphilitics) rather than disease, and the reader must be satisfied that the two are not being confused.

Finally, just as a duck is not often confused with a yak even in the absence of chromosomal analyses, the ability of a diagnostic test to distinguish between disorders not commonly confused in the first place is scant endorsement for its widespread application. Again, the key value of a diagnostic test often lies in its ability to distinguish between otherwise commonly confused disorders, especially when their prognoses or therapies differ sharply. It is this discriminating property that makes the T4 determination so helpful in sorting out tense, anxious, tremulous and perspiring patients into those with abnormal thyroid function and those with other disorders.

3. Was the setting for the study, as well as the filter through which study patients passed, adequately described?

In the previous round we saw how the proportion of hypertensive patients with surgically curable lesions varied almost 10-fold depending on whether the same diagnostic tests were applied in a general practice or in a tertiary care centre. Because a test's predictive value changes with the prevalence of the target disease, the article ought to tell you enough about the study site and patient selection filter to permit you to calculate the diagnostic test's likely predictive value in your own practice.

The selection of control subjects who do not have the disease of interest should be described as well. Although lab technicians and janitors may be appropriate control subjects early in the development of a new diagnostic test (especially with the declining use of medical students as laboratory animals), the definitive comparison with a gold standard demands equal care in the selection of patients with and without the target disease. The reader deserves some assurance that differences in diagnostic test results are due to a mechanism of disease and not simply to differences in such features as age, sex, diet and mobility of case and control subjects.

4. Was the reproducibility of the test result (precision) and its interpretation (observer variation) determined?

Validity of a diagnostic test demands both the absence of systematic deviation from the truth (that is, the absence of bias) and the presence of precision (the same test applied to the same unchanged patient must produce the same result). The description of a diagnostic test ought to tell readers how reproducible they can expect the test results to be. This is especially true when expertise is required in performing the test (for example, ultrasonography currently has enormous variation in the quality of its results when performed by different operators) or in interpreting it (as you may recall from an earlier round, observer variation is a major problem for tests involving x-rays, electrocardiography and the like).9

5. Was the term "normal" defined sensibly?

If the article uses the word "normal" its authors should tell you what they mean by it. Moreover, you should satisfy yourself that their definition is clinically sensible. Several different definitions of normal are used in clinical medicine; we contend that some of them probably lead to more harm than good. We have listed six definitions of normal in Table V and acknowledge our debt to Tony Murphy for pointing out most of them.2,10

Perhaps the most common definition of normal assumes that the diagnostic test results (or some arithmetic manipulation of them) for everyone, for a group of presumably normal people or for a carefully characterized "reference" population will fit a specific theoretical distribution known as the normal or gaussian distribution. One of the nice properties of the gaussian


raised to the power of the number of independent diagnostic tests performed. Thus, a patient who undergoes 20 tests has only 0.95^20 or about 1 chance in 3 of being called normal; a patient undergoing 100 such tests has only about 6 chances in 1000 of being called normal at the end of the work up.*
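The figures quoted above follow from raising 0.95 to the number of tests. A minimal Python check, assuming (as the 0.95 in the text implies) that each independent test calls 95% of disease-free patients normal; the function name is hypothetical:

```python
def chance_called_normal(n_tests: int, p_normal_per_test: float = 0.95) -> float:
    """Probability that a disease-free patient is called normal on every one
    of n_tests independent tests."""
    return p_normal_per_test ** n_tests


for n in (1, 20, 100):
    print(f"{n:3d} tests: {chance_called_normal(n):.4f}")
```

For 20 tests the result is roughly 0.36 (about 1 chance in 3), and for 100 tests roughly 0.006 (about 6 chances in 1000).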

Other definitions of normal, in avoiding the foregoing pitfalls, present other problems. The risk factor approach is based upon studies of precursors or statistical predictors of subsequent clinical events; by this definition, the normal range for serum cholesterol concentration or blood pressure consists of levels that carry no additional risk of morbidity or mortality. Unfortunately, however, many of these risk factors exhibit steady increases in risk throughout their range of values; indeed, it has been pointed out that the normal serum cholesterol concentration, by this definition, might lie below 150 mg/dl (3.9 mmol/l). Another shortcoming of this risk factor definition becomes apparent when we examine the consequences of acting upon a test result that lies beyond the normal range: Will altering a risk factor really change the risk? Recent experience with the

*This consequence of such definitions helps explain the results of a randomized trial of multitest screening at the time of admission to hospital that found no patient benefits but increased health care costs.12

Table V-Properties and consequences of different definitions of "normal"

Property                              Term                   Consequences of its clinical application

The distribution of diagnostic        Gaussian               Ought to occasionally obtain minus
test results has a certain shape                             values for hemoglobin level etc. All
                                                             diseases have the same prevalence.
                                                             Patients are normal only until they
                                                             are assessed.

Lies within a preset percentile       Percentile             All diseases have the same
of previous diagnostic test                                  prevalence. Patients are normal only
results                                                      until they are assessed.

Carries no additional risk of         Risk factor            Assumes that altering a risk factor
morbidity or mortality                                       alters risk.

Socially or politically               Culturally desirable   Confusion over the role of medicine
aspired to                                                   in society.

Range of test results beyond          Diagnostic             Need to know predictive values for
which a specific disease is,                                 your practice.
with known probability,
present or absent

Range of test results beyond          Therapeutic            Need to keep up with new knowledge
which therapy does more good                                 about therapy.
than harm

used in the first guide to reading about a diagnostic test: comparison with a gold standard. The "known probability" with which a disease is present is our old friend the positive predictive value.

This definition is illustrated in Fig. 2, where we see the usual overlap in diagnostic test results between patients shown, by application of a gold standard, to be disease-free or diseased (the a, b, c and d in Fig. 2 correspond to cells a, b, c and d of Tables II to IV). The known probability (or predictive value) with which a disease is present or absent depends on where we set the limits for the normal range of diagnostic test results. If we simply wanted to maximize the number of times the diagnostic test result was correct, we'd set the limits for normal at the dotted line where the curves cross, but that might not be very helpful clinically. If we lower these normal limits to point X, cell c approaches zero, sensitivity and negative predictive values approach 100% and we can use the normal diagnostic test result to rule out the disease (because nobody with the disease has test results below X). Similarly, if we raise the limits of normal for the diagnostic test result to point Y, cell b approaches zero, specificity and positive predictive values approach 100% and we can use the abnormal diagnostic test result to rule in the disease (because no nondiseased patients have test results above Y). Thus, this definition has clinical utility and is a distinct improvement over the definitions described earlier. However, it does require that clinicians keep track of both the predictive values of individual diagnostic tests and the test levels at points X and Y that apply in their own practices.
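The effect of moving the limit of normal between points X and Y can be sketched with a toy data set. Everything in the block below is hypothetical: the test values, the cut-offs and the function name are invented for illustration, and a result is taken to be abnormal when it is at or above the cut-off.

```python
def fourfold_at_cutoff(diseased, disease_free, cutoff):
    """Return cells (a, b, c, d) when results >= cutoff are called abnormal."""
    a = sum(value >= cutoff for value in diseased)       # true positives
    c = len(diseased) - a                                # false negatives
    b = sum(value >= cutoff for value in disease_free)   # false positives
    d = len(disease_free) - b                            # true negatives
    return a, b, c, d


# Hypothetical, overlapping test results (arbitrary units).
disease_free = [2, 3, 4, 5, 6, 7]
diseased = [5, 6, 7, 8, 9, 10]

# Point X: low cut-off; nobody with the disease has a result below it.
a_x, b_x, c_x, d_x = fourfold_at_cutoff(diseased, disease_free, cutoff=5)
sensitivity_x = a_x / (a_x + c_x)
npv_x = d_x / (c_x + d_x)

# Point Y: high cut-off; no disease-free patient has a result at or above it.
a_y, b_y, c_y, d_y = fourfold_at_cutoff(diseased, disease_free, cutoff=8)
specificity_y = d_y / (b_y + d_y)
ppv_y = a_y / (a_y + b_y)

print(f"X: c={c_x}, sensitivity={sensitivity_x:.0%}, negative predictive value={npv_x:.0%}")
print(f"Y: b={b_y}, specificity={specificity_y:.0%}, positive predictive value={ppv_y:.0%}")
```

At the low cut-off a normal result rules the disease out; at the high cut-off an abnormal result rules it in; results between the two cut-offs remain equivocal.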

The final definition of normal sets its limits at the point beyond which specific treatments have been shown to do more good than harm, and is indicated in Fig. 2 as point Z. This therapeutic definition is attractive because of its link to action. The therapeutic definition of the normal range of blood pressure, for example, avoids the hazards of labelling patients as diseased17 unless they are going to be treated. The use of this definition requires that clinicians keep abreast of advances in therapeutics and become adept at sorting out therapeutic claims; a later article in this series of Clinical Epidemiology Rounds is devoted to this topic.

When reading a report of a new diagnostic test, then, you should satisfy yourself that the authors have defined what they mean by normal and that they have done so in a sensible and clinically useful fashion.

6. If the test is advocated as part of a cluster or sequence of tests, was its contribution to the overall validity of the cluster or sequence determined?

In many conditions an individual diagnostic test examines but one of several manifestations of the underlying disorder. For example, in diagnosing deep vein thrombosis impedance plethysmography examines venous emptying, whereas leg scanning with iodine-125-labelled fibrinogen examines the turnover of coagulation factors at the site of thrombosis.18 Furthermore, plethysmography is much more sensitive for proximal than distal venous thrombosis, whereas the reverse is true for leg scanning. As a result, these tests are best applied in sequence: if the plethysmogram is positive, the diagnosis is made and treatment begins at once; if it is negative, leg scanning begins and the diagnostic and treatment decisions await its results.

This being so, it is clinically nonsensical to base a judgement of the value of leg scanning on a simple comparison of its results alone against the gold standard of venography. Rather, its agreement with venography among suitably symptomatic patients with a negative impedance plethysmogram is one appropriate assessment of its validity and clinical usefulness. Another valid assessment would be the agreement of results of the combination of leg scanning and impedance plethysmography with venography.

In summary, any single component of a cluster of diagnostic tests should be evaluated in the context of its clinical use.
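To make the distinction concrete, the sketch below contrasts three ways of scoring leg scanning against venography: alone in all patients, only among patients with a negative plethysmogram, and in combination with plethysmography. The ten patient records are entirely hypothetical and are not taken from the study cited; the function and variable names are likewise invented for the example.

```python
def sensitivity_specificity(pairs):
    """pairs: iterable of (test_positive, disease_present) booleans."""
    pairs = list(pairs)
    a = sum(test and disease for test, disease in pairs)          # true positives
    b = sum(test and not disease for test, disease in pairs)      # false positives
    c = sum(not test and disease for test, disease in pairs)      # false negatives
    d = sum(not test and not disease for test, disease in pairs)  # true negatives
    return a / (a + c), d / (b + d)


# Hypothetical records: (plethysmogram positive, leg scan positive, venogram positive)
records = [
    (True, True, True),
    (True, False, True),
    (True, False, True),
    (False, True, True),
    (False, True, True),
    (False, False, True),
    (False, False, False),
    (False, False, False),
    (False, True, False),
    (False, False, False),
]

# 1. Leg scan alone against venography in every patient (the misleading comparison).
scan_alone = sensitivity_specificity((scan, veno) for _, scan, veno in records)

# 2. Leg scan against venography only where the plethysmogram was negative.
scan_after_negative_ipg = sensitivity_specificity(
    (scan, veno) for ipg, scan, veno in records if not ipg
)

# 3. The combination (either test positive) against venography.
combination = sensitivity_specificity((ipg or scan, veno) for ipg, scan, veno in records)

print("scan alone:              sensitivity %.2f, specificity %.2f" % scan_alone)
print("scan after negative IPG: sensitivity %.2f, specificity %.2f" % scan_after_negative_ipg)
print("combination:             sensitivity %.2f, specificity %.2f" % combination)
```

In this invented data set the leg scan looks poor when judged alone, because the plethysmogram has already detected the cases it misses; judged in its place in the sequence, or as part of the combination, it fares better.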

7. Were the tactics for carrying out the test described in sufficient detail to permit their exact replication?

If the authors have concluded that you should use their diagnostic test, they have to tell you how to use it; this description should cover patient issues as well as the mechanics of performing the test and interpreting its results. Are there special requirements for fluids, diet or physical activity? What drugs should be avoided? How painful is the procedure and what is done to relieve any pain? What precautions should be taken during and after the test? How should the specimen be transported and stored for later analysis? These tactics and precautions must be described if you and your patients are to benefit from this diagnostic test.

8. Was the "utility" of the test determined?

The ultimate criterion for a diagnostic test or any other clinical maneuver is whether the patient is better off for it. If you agree with this point of view you should scrutinize the article to see whether the authors went beyond the foregoing issues of accuracy, precision and the like to explore the long-term consequences of their use of the diagnostic test.

In addition to telling you what happened to patients correctly classified by the diagnostic test, the authors should describe the fate of the patients who had false-positive results (those with positive test results who really did not have the disease) and those with false-negative results (those with negative test results who really did have the disease). Moreover, when the execution of a test requires a delay in the initiation of definitive therapy (while the procedure is being rescheduled, the test tubes are incubating or the slides are waiting to be read) the consequences of this delay should be described.

FIG. 2-Diagnostic and therapeutic definitions of "normal".

For example, we are part of a team that has studied the value of noninvasive tests in the diagnosis of patients with clinically suspected deep leg vein thrombosis, and have tested the policy of withholding anticoagulants from patients with a negative impedance plethysmogram (a quick test) until or unless the 125I-fibrinogen leg scan becomes positive.18 The scan takes several hours to several days to become positive when venous thrombi are small or confined to the calf; it is therefore important to determine and report whether any patients suffer clinical embolic events during this interval (fortunately, they do not). Moreover, comparisons of these investigations against the gold standard of venography have included documentation of the consequences of treating patients with false-positive results and withholding treatment from those with false-negative results. The resemblance of this approach to the "therapeutic" definition of normalcy is worth noting.*

Use of these guides to reading

By applying the foregoing guides you should be able to decide if a diagnostic test will be useful in your practice, if it won't or if it still hasn't been properly evaluated. Depending on the context in which you are reading about the test, one or another of the eight guides will be the most important one and you can go right to it. If it has been met in a credible way, you can go on to the others; if the most important guide hasn't been met you can discard the article right there and go on to something else. Thus, once again, you can improve the efficiency with which you use your scarce reading time. When trying to pick the best test from an array of competing diagnostic tests you could carry out on a given patient, these guides will help you compare them with each other. On the basis of this comparison you can pick the one that will best meet your clinical requirements.

The next round will consider articles that describe the clinical course and prognosis of disease.

*In this regard, we think it's a shame that the term "diagnostic efficacy" has crept into the literature, especially since it is used as a synonym for accuracy rather than utility.

References

1. SACKETT DL: Clinical diagnosis and the clinical laboratory. Clin Invest Med 1978; 1: 37-43

2. MURPHY EA: The Logic of Medicine, Johns Hopkins, Baltimore, 1976: 117-160

3. RANSOHOFF DF, FEINSTEIN AR: Problems of spectrum and bias in evaluating the efficacy of diagnostic tests. N Engl J Med 1978; 299: 926-930

4. GALEN RS, GAMBINO SR: Beyond Normality: The Predictive Value and Efficiency of Medical Diagnoses, Wiley, New York, 1975: 30-40

5. SKETCH MH, MOHIUDDIN SM, LYNCH JD, ZENCKA AE, RUNCO V: Significant sex differences in the correlation of electrocardiographic exercise testing and coronary arteriograms. Am J Cardiol 1975; 36: 169-173

6. DIAMOND GA, FORRESTER JS: Analysis of probability as an aid in the clinical diagnosis of coronary-artery disease. N Engl J Med 1979; 300: 1350-1358

7. SACKETT DL: Bias in analytic research. J Chronic Dis 1979; 32: 51-63

8. Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ont.: Clinical disagreement: II. How to avoid it and how to learn from one's mistakes. Can Med Assoc J 1980; 123: 613-617

9. Idem: Clinical disagreement: I. How often it occurs and why. Ibid: 499-504

10. MURPHY EA: The normal, and the perils of the sylleptic argument. Perspect Biol Med 1972; 15: 566-582

11. MEADOR CK: The art and science of nondisease. N Engl J Med 1965; 272: 92-95

12. DURBRIDGE TC, EDWARDS F, EDWARDS RG, ATKINSON M: An evaluation of multiphasic screening on admission to hospital. Precis of a report to the National Health and Medical Research Council. Med J Aust 1976; 1: 703-705

13. KANNEL WB, DAWBER TR, GLENNON WE, THORNE MC: Preliminary report: the determinants and clinical significance of serum cholesterol. Mass J Med Technol 1962; 4: 11-29

14. OLIVER MF, HEADY JA, MORRIS JN, COOPER J: A co-operative trial in the primary prevention of ischaemic heart disease using clofibrate. Report from the Committee of Principal Investigators. Br Heart J 1978; 40: 1069-1118

15. MENCKEN HL: A Mencken Chrestomathy, Knopf, Westminster, 1949

16. LALONDE M: A New Perspective on the Health of Canadians. A Working Document, Department of National Health and Welfare, Ottawa, Apr 1974: 58

17. HAYNES RB, SACKETT DL, TAYLOR DW, GIBSON ES, JOHNSON AL: Increased absenteeism from work following the detection and labelling of hypertensives. N Engl J Med 1978; 299: 741-744

18. HULL R, HIRSH J, SACKETT DL, POWERS P, TURPIE AGG, WALKER I: Combined use of leg scanning and impedance plethysmography in suspected venous thrombosis. An alternative to venography. N Engl J Med 1977; 296: 1497-1500
