

Journal of Clinical Epidemiology 61 (2008) 110-118

A systematic review identifies five “red flags” to screen for vertebral fracture in patients with low back pain

Nicholas Henschke*, Christopher G. Maher, Kathryn M. Refshauge

Back Pain Research Group, University of Sydney, PO Box 170, Lidcombe NSW 1825, Sydney, Australia

Accepted 20 April 2007

Abstract

Objective: To determine the accuracy of clinical features in diagnosing vertebral fracture in low back pain patients and assess the psychometric properties of the Quality Assessment of Studies of Diagnostic Accuracy Included in Systematic Reviews (QUADAS) scale.

Study Design and Setting: A diagnostic systematic review was performed on all available records in MEDLINE, CINAHL, and EMBASE. Studies were considered eligible if they investigated clinical features associated with vertebral fracture in a cohort of low back pain patients. All eligible studies were assessed for methodological quality using the QUADAS scale, and two authors extracted true-positive, true-negative, false-positive, and false-negative data for each clinical feature.

Results: Twelve studies were identified by the review, investigating 51 clinical features. Five clinical features were useful to raise or lower the probability of vertebral fracture: age >50 years (likelihood ratio [LR]+ = 2.2, LR- = 0.34), female gender (LR+ = 2.3, LR- = 0.67), major trauma (LR+ = 12.8, LR- = 0.37), pain and tenderness (LR+ = 6.7, LR- = 0.44), and a distracting painful injury (LR+ = 1.7, LR- = 0.78). The QUADAS had low internal consistency, and only three items had high inter-rater reliability. There was inadequate reporting of many methodological quality items.

Conclusion: Five clinical features were identified that can be used to screen for vertebral fracture. The psychometric properties of the QUADAS scale raise concerns about its use to rate the quality of low back pain diagnosis studies. © 2008 Elsevier Inc. All rights reserved.

Keywords: Back pain; Fracture; Red flags; Diagnosis; QUADAS; Systematic review

1. Introduction

It is widely agreed that acute low back pain is common, can be seriously disabling, and imposes an enormous social and economic burden on the community. To improve the management of back pain, clinical practice guidelines have been developed in at least 12 countries [1]. A common theme among the guidelines is that acute low back pain should be managed in primary care because it is generally benign, and the few cases of serious disease can be readily detected with a clinical assessment [1]. The exclusion of specific pathologies is one of the primary purposes of the clinical assessment, and the clinical guidelines recommend that the identification of “red flags” is the ideal method to accomplish this purpose [1]. “Red flags” are features of the patient’s medical history and clinical examination thought to be associated with a high risk of serious disorders, such as infection, inflammatory disease, cancer, or fracture [2].

* Corresponding author. Tel.: +61-2-9351-9673; fax: +61-2-9351-9681.

E-mail address: [email protected] (N. Henschke).

0895-4356/08/$ - see front matter © 2008 Elsevier Inc. All rights reserved.

doi: 10.1016/j.jclinepi.2007.04.013

Vertebral fracture is associated with significant pain and disability [3] and with increased mortality [4]. The prevalence of vertebral fracture in patients presenting to primary care practitioners with acute low back pain has been estimated to be between 0.5% [5] and 4% [6], yet it is estimated that only 30% of vertebral fractures are diagnosed in clinical practice [7] because the presentation is similar to that of nonspecific low back pain [7,8]. Vertebral fracture not only requires specific appropriate treatment, but is a contraindication to spinal manipulative therapy, a common treatment that is endorsed in clinical practice guidelines for acute nonspecific low back pain [9]. Therefore, accurate diagnosis in primary care is essential to prevent poor outcomes [10].

As a first step in identifying fracture in patients presenting with acute low back pain, clinical guidelines [11-14] generally recommend the following red flags: recent history of trauma [12,13]; prolonged use of corticosteroids [11,13]; age >50 years [11,13,14]; and structural deformity [11,12,14] (Table 1). The inclusion of these features in the guidelines is often justified by reference to previous guidelines [14], unpublished data [11], or single studies of questionable methodological quality [12] (Table 1). No study has reviewed the available literature in a systematic manner. Without evaluation of the diagnostic accuracy of the red flags, their usefulness in clinical practice will remain uncertain. This review incorporated a sensitive search strategy and quality assessment of primary studies using a validated tool [15] as recommended in guidelines for performing diagnostic systematic reviews [16,17].

Table 1

“Red flags” for fracture suggested in clinical guidelines for the management of patients with acute low back pain

| Clinical guideline [reference] | Suggested “red flags” for fracture | Source of data |
| --- | --- | --- |
| Agency for Health Care Policy and Research, USA (1994) [11] | Age > 50 years; Corticosteroid use | Unpublished data [6] |
| National Health and Medical Research Council, Australia (2003) [12] | Major trauma; Minor trauma (if age > 50 years and prolonged use of corticosteroids) | Scavone et al [29] |
| European Commission Research Directorate General (2004) [14] | Age > 50 years; Corticosteroid use; Structural deformity | Royal College of General Practitioners [13] |
| Accident Compensation Corporation, New Zealand (2003) [41] | Significant trauma; Patient over 50 years | No information |
| Royal College of General Practitioners, United Kingdom (1996) [13] | Significant trauma; Mild trauma (age > 50); History of prolonged steroid use; Osteoporosis; Age > 70 years | Previous guidelines [11], Waddell [45] |

To determine the accuracy of the clinical examination available to primary care practitioners, we conducted a systematic review of studies evaluating clinical features for diagnosing fracture in low back pain patients [18]. A secondary aim was to determine the psychometric properties of the Quality Assessment of Studies of Diagnostic Accuracy Included in Systematic Reviews (QUADAS) scale [15] when used to rate the quality of retrieved studies.

2. Study design

2.1. Data sources

A systematic literature search was performed to identify all relevant original, peer-reviewed articles evaluating vertebral fractures in patients presenting with low back pain. The primary search was performed from the earliest available dates to 5 February 2007, on the MEDLINE, EMBASE, and CINAHL electronic databases. A subject-specific search strategy was used, combining sensitive searches of the diagnostic (index) tests available to primary care practitioners, and the target disease (low back pain) [19]. The search strategy is available as an Appendix on the journal’s website at www.elsevier.com. The index tests included information from the history and physical examination, diagnostic imaging, and laboratory tests. Non-English language reports were included, but articles were excluded from analysis if appropriate translation was not available.

From the results of the electronic search, the bibliographies of all systematic reviews and eligible diagnostic and screening studies were reviewed. Eligible studies were entered into the Science Citation Index to identify any articles in which they had been cited. Contact was made with experts on diagnostic testing, and low back pain, to identify unpublished studies or articles missed by the search process and to review the list of identified studies to ensure that the search was comprehensive.

2.2. Study selection

The titles and abstracts of the studies identified by the search were screened by two authors to exclude those that were clearly outside the scope of the review. To determine eligibility for the analysis, articles were included if they satisfied the following criteria: (a) reported on a cohort of patients presenting with low back pain or trauma; (b) confirmed the diagnosis of fracture with an appropriate reference standard, that is, diagnostic imaging; (c) evaluated the diagnostic performance of a clinical test available to primary care practitioners; and (d) reported results in sufficient detail to allow reconstruction of contingency tables of the raw data.

2.3. Study quality assessment

In studies of diagnostic accuracy, there are several potential threats to internal and external validity [20]. Studies with methodological shortcomings may overestimate the accuracy of a diagnostic test [21]. All eligible studies identified by the search underwent methodological quality assessment using the 14-item QUADAS tool [15]. Item 3 on the QUADAS scale relates to the use of an adequate reference standard, but as this was already a criterion for inclusion in the review, it was omitted from the analyses. Two authors (N.H., C.G.M.) scored the remaining 13 items for each eligible study with any disagreements resolved via consensus. Kappa (κ) statistics were calculated to assess inter-rater reliability [22] using MedCalc v.9.3 (Mariakerke, Belgium).
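The κ statistic described above can be illustrated with a minimal sketch. This is not the MedCalc implementation; it assumes unweighted Cohen's κ for two raters, and the function name `cohen_kappa` and the example ratings are hypothetical. The example shows how two raters can agree on 11 of 12 studies and still obtain κ = 0 when almost every rating falls in one category, and why κ is undefined when both raters uniformly give the same answer.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items.

    Returns None when chance agreement equals 1 (both raters used one
    identical category throughout), because kappa is then undefined.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    if expected == 1.0:
        return None
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of one QUADAS item across 12 studies (not review data).
rater_1 = ["+"] * 11 + ["-"]
rater_2 = ["+"] * 12
kappa_skewed = cohen_kappa(rater_1, rater_2)      # high agreement, kappa of 0
kappa_uniform = cohen_kappa(rater_2, rater_2)     # undefined, returns None
```

Returning `None` rather than raising an error is a design choice made here so that a whole table of items can be processed in one pass.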

2.4. Data extraction

Two authors independently extracted the following data from each eligible article: author, year, journal, setting (i.e., primary care, secondary care), index tests, reference standard, number of patients, prevalence of fracture, true-positive, true-negative, false-positive, and false-negative results for the index tests. Disagreements were resolved via discussion and consensus. Because there were empty cells in the contingency table, a value of 0.5 was added to each cell to circumvent computational problems [17]. From the extracted data, sensitivity, specificity, and positive (LR+) and negative (LR-) likelihood ratios and their 95% confidence intervals (95% CI) were calculated using the score method [23] due to the low prevalence of fracture. We considered clinical features to be useful for raising the index of suspicion of fracture if the positive likelihood ratio and lower bound of 95% CI were greater than 1. The negative likelihood ratios were considered useful to lower the suspicion if the point estimate and upper bound of 95% CI were significantly below 1. It was our intention to pool the results and perform a meta-analysis if sufficient clinical and statistical homogeneity existed among the studies. Diagnostic accuracy analyses were performed using CIA v.2.0 (Southampton, UK).
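The calculations described above can be sketched as follows. This is an illustration under stated assumptions, not the CIA v.2.0 implementation: the function names and the example counts are hypothetical, the 0.5 correction is applied to all four cells as the text describes, and the Wilson score interval is shown only for a single proportion such as sensitivity. The score-based intervals for the likelihood ratios themselves are not reproduced here.

```python
import math


def diagnostic_accuracy(tp, fp, fn, tn, correction=0.5):
    """Sensitivity, specificity, LR+ and LR- from a 2x2 contingency table.

    `correction` is added to every cell so that a table with an empty
    cell still yields finite likelihood ratios.
    """
    tp, fp, fn, tn = (cell + correction for cell in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1.0 - specificity),
        "lr_neg": (1.0 - sensitivity) / specificity,
    }


def wilson_interval(successes, total, z=1.959964):
    """Wilson score interval for a proportion (95% for the default z)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts chosen for illustration; they are not extracted study data.
result = diagnostic_accuracy(tp=17, fp=42, fn=9, tn=803)
empty_cell = diagnostic_accuracy(tp=0, fp=3, fn=13, tn=295)  # stays finite
sensitivity_ci = wilson_interval(17, 26)
```

The second call shows the purpose of the correction: with no true positives the uncorrected LR+ would be exactly zero and its logarithm undefined, whereas the corrected table still gives a small positive ratio.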

3. Results

3.1. Search results

The electronic database search retrieved 6,027 articles (Fig. 1). After review of the titles, 5,272 articles were excluded because they were clearly outside the scope of the review. The remaining 755 articles were classified into study types to identify those evaluating a cohort of patients [21]. There were 175 review articles, of which four [24-27] were systematic reviews related to fracture or back pain, but did not focus on diagnosis or clinical features of fracture.

The titles and abstracts of the 66 cohort studies were reviewed, and 12 met the eligibility criteria for data extraction. Of the excluded articles, 20 did not investigate populations with low back pain or trauma, and 34 did not assess the diagnostic accuracy of any clinical features in patients with low back pain or trauma.
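The selection counts reported in the text and in Fig. 1 can be checked with simple arithmetic. The sketch below uses only those reported counts; the variable names are illustrative.

```python
# Counts as reported in the text and in Fig. 1.
titles_reviewed = 6027
excluded_on_title_or_abstract = 5272
by_study_type = {
    "reviews": 175,
    "case reports": 156,
    "case series": 319,
    "case control": 39,
    "cohort": 66,
}
excluded_not_lbp_or_trauma = 20
excluded_no_clinical_features = 34

# Articles left after title screening, and cohort studies left after exclusions.
remaining_after_screening = titles_reviewed - excluded_on_title_or_abstract
classified_total = sum(by_study_type.values())
eligible_studies = (
    by_study_type["cohort"] - excluded_not_lbp_or_trauma - excluded_no_clinical_features
)
```

Both the remaining articles and the sum of the study-type counts come to 755, and the cohort studies reduce to the 12 eligible studies.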

3.2. Study characteristics

The 12 eligible studies assessed a total of 7,147 patients presenting to various settings with low back pain or trauma (Table 2). Of these patients, 424 (5.9%) were diagnosed with a vertebral fracture. Only one eligible study recruited patients in a community primary care setting [28]. The other studies reported on hospital inpatients and outpatients [29], patients recruited from secondary referral centers [30], or patients presenting to accident and emergency departments [31-39]. Some studies only reported results for the diagnostic accuracy of the clinical features in those patients who were evaluated with the reference standard, and excluded those patients who were not [28,32,36,38]. All studies used x-ray as the reference standard for detecting vertebral fracture, although some included the use of additional imaging procedures, for example, computed tomography [35,37,39].
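The totals quoted above follow directly from the patient and fracture counts listed in Table 2. A short sketch, using only the figures in that table, reproduces them; the variable names are illustrative.

```python
# (study, patients, fractures) as listed in Table 2.
studies = [
    ("Deyo and Diehl, 1986 [28]", 311, 13),
    ("Scavone et al., 1981 [29]", 871, 26),
    ("van den Bosch et al., 2004 [30]", 2007, 83),
    ("Terregino et al., 1995 [31]", 183, 17),
    ("Samuels et al., 1993 [32]", 99, 15),
    ("Reinus et al., 1998 [33]", 482, 10),
    ("Holmes et al., 2003 [34]", 2404, 152),
    ("Hsu et al., 2003 [35]", 100, 29),
    ("Gibson et al., 1992 [36]", 108, 7),
    ("Gestring et al., 2002 [37]", 71, 10),
    ("Frankel et al., 1994 [38]", 167, 15),
    ("Durham et al., 1995 [39]", 344, 47),
]

total_patients = sum(patients for _, patients, _ in studies)
total_fractures = sum(fractures for _, _, fractures in studies)
overall_prevalence_pct = 100.0 * total_fractures / total_patients

# Per-study prevalence, for comparison with the percentages in Table 2.
per_study_pct = {name: 100.0 * f / n for name, n, f in studies}
```

The overall figure is a crude pooled proportion across very different settings, which is one reason the per-study values range so widely.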

3.3. Psychometric properties of the QUADAS scale

Percent agreement between raters and κ statistics for the individual items are shown in Table 3. Inter-rater reliability was slight or poor for most items; only three items had at least moderate reliability [40]. The internal consistency (of the consensus scores for each item) was poor, with a Cronbach’s alpha value of 0.04. Four items had a frequency of endorsement >90%, suggesting a ceiling effect.
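For readers unfamiliar with the internal consistency statistic reported above, a minimal sketch of Cronbach's alpha follows. It uses the standard formula based on item variances and the variance of the total score. The numeric coding of the ratings (1 for yes, 0 otherwise) and the example scores are assumptions made for illustration; the review does not state how the consensus scores were coded.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows, one row of item scores per study."""
    if len(scores) < 2:
        raise ValueError("need at least two rows")
    n_items = len(scores[0])
    if n_items < 2 or any(len(row) != n_items for row in scores):
        raise ValueError("every row needs the same number (at least two) of items")
    # Sample variance of each item across studies, and of the summed score.
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary, alpha is undefined")
    return n_items / (n_items - 1) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical coded ratings (1 = yes, 0 = no or unclear), not review data.
consistent_items = [[1, 1], [0, 0], [1, 1]]
unrelated_items = [[1, 0], [0, 0], [1, 1], [0, 1]]
alpha_high = cronbach_alpha(consistent_items)
alpha_low = cronbach_alpha(unrelated_items)
```

When items move together, the variance of the total score exceeds the sum of the item variances and alpha approaches 1. When items are unrelated, the two quantities are similar and alpha falls toward 0, which is the pattern the reported value of 0.04 indicates.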

3.4. Study quality assessment

There was poor reporting of the details of the index tests and the reference standard, or whether the tests were interpreted in a blinded fashion, and therefore, it appeared that no eligible study fulfilled all of the QUADAS criteria (Table 3). Most studies only provided index test results for the subset of patients who received the reference standard, subjecting their study sample to partial or differential verification bias [15]. Only one study [31], recruiting accident and emergency department patients with blunt trauma, prospectively applied the same reference standard to their entire cohort of subjects.

Search strategy: MEDLINE n = 5435; EMBASE n = 635; CINAHL n = 259; Other n = 4.

Titles reviewed: n = 6027.

Excluded articles on basis of title/abstract: n = 5272.

Remaining articles by study type: Reviews n = 175; Case reports n = 156; Case series n = 319; Case control n = 39; Cohort n = 66.

Cohort studies excluded because not low back pain/trauma patients: n = 20.

Patients with low back pain/trauma: n = 46.

Excluded because no clinical features studied: n = 34.

Eligible studies for data extraction: n = 12.

Fig. 1. Study selection process and reasons for exclusion.

3.5. Index test results

The eligible studies retrieved from the search process were deemed to be too heterogeneous with respect to clinical setting, methodological quality as assessed by QUADAS, and diagnostic accuracy of the features investigated to conduct a meta-analysis on the results. Data on a total of 50 clinical features, grouped into nine categories, were extracted from the 12 eligible articles (Table 4). The categories were age >50 years, gender, trauma, corticosteroid use, altered consciousness, other injury, pain or tenderness, altered neurological signs, and deformity. Global clinician judgment was also assessed by two studies and included in the table.

Of the nine categories of clinical features, five significantly increased the probability of fracture when present, and when absent, significantly decreased the probability. Two studies [28,30] had consistent results for older age, with LR+ in the range of 1.7 to 3.7 [30], and LR- in the range of 0.32 to 0.49 [30]. Female gender had consistent results in two studies (LR+ = 1.3, LR- = 0.65 [30]; LR+ = 2.3, LR- = 0.67 [32]) with females aged >75 years (LR+ = 4.4, LR- = 0.62 [30]) also significantly altering the likelihood of vertebral fracture. Major trauma (LR+ = 12.8, LR- = 0.37 [29]), or trauma with neurological signs (LR+ = 14.4, LR- = 0.73 [36]) were also significant features of vertebral fracture. The presence or absence of a distracting and painful injury in the accident and emergency setting also significantly affected the suspicion of vertebral fracture (LR+ = 1.7, LR- = 0.78 [34]). Pain and tenderness were assessed by numerous studies either as individual items or in combination. When assessed individually, the results were contradictory across studies [31,34], but when used in combination were effective in raising or lowering the suspicion of vertebral fracture (LR+ = 7.2, 6.7 [35,38]; LR- = 0.42, 0.44 [35,38]). The presence of muscle spasm or back bruising was not significant.
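The decision rule stated in the data extraction section can be applied mechanically to the values in Table 4. The sketch below is one reading of that rule, with the assumption that "significantly below 1" means the upper bound of the 95% CI for LR- is below 1. The class and function names are hypothetical; the numbers are taken from Table 4.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    lr_pos: float
    lr_pos_low: float   # lower bound of the 95% CI for LR+
    lr_neg: float
    lr_neg_high: float  # upper bound of the 95% CI for LR-


def raises_suspicion(f: Feature) -> bool:
    """LR+ and its lower confidence bound both exceed 1."""
    return f.lr_pos > 1.0 and f.lr_pos_low > 1.0


def lowers_suspicion(f: Feature) -> bool:
    """LR- and its upper confidence bound are both below 1 (assumed reading)."""
    return f.lr_neg < 1.0 and f.lr_neg_high < 1.0


# Selected rows from Table 4.
features = [
    Feature("Age > 50 years [28]", 2.2, 1.4, 0.34, 0.75),
    Feature("Significant trauma [28]", 1.9, 0.7, 0.88, 1.05),
    Feature("Major trauma [29]", 12.8, 8.3, 0.37, 0.57),
    Feature("Using corticosteroids [28]", 0.0, 0.0, 1.01, 1.03),
    Feature("Distracting painful injury [34]", 1.7, 1.3, 0.78, 0.88),
]

useful_both_ways = [
    f.name for f in features if raises_suspicion(f) and lowers_suspicion(f)
]
```

Under this reading, significant trauma and corticosteroid use fail the rule because their intervals include 1, while the other three rows satisfy it in both directions.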

Corticosteroid use and altered consciousness did not significantly alter the probability of fracture, whether these features were present or absent. Abnormal neurological signs were significant in two studies [29,35], but not in others [28,34,36]. Clinician judgment, based on a positive or equivocal clinical exam (LR+ = 2.9, LR- = 0.00 [32]), was significant in one of the two studies that evaluated it. The presence of a structural deformity (LR+ = 21.6, 46.4 [31,35]) significantly increased the probability of a fracture when present, but when absent, did not lower the suspicion of fracture.

Table 2

Study characteristics

| Authors | Design | Setting | Patients | Prevalence of fracture (n) | Reference standard |
| --- | --- | --- | --- | --- | --- |
| Deyo and Diehl, 1986 [28] | Prospective | Walk-in clinic of university hospital | 311 patients seeking treatment for low back pain (LBP) who received lumbar spine x-rays | 4.2% (13) | x-ray |
| Scavone et al., 1981 [29] | Retrospective chart review | Hospital inpatients and outpatients | 871 patients receiving lumbar spine x-rays | 3.0% (26) | x-ray |
| van den Bosch et al., 2004 [30] | Retrospective chart review | Hospital radiology department | 2,007 primary care referrals for x-ray of patients with LBP | 4.1% (83) | x-ray |
| Terregino et al., 1995 [31] | Prospective | Accident and emergency department | 183 blunt trauma patients able to be evaluated clinically | 9.3% (17) | x-ray |
| Samuels et al., 1993 [32] | Retrospective chart review | Accident and emergency department | 99 blunt trauma patients receiving x-rays | 15.2% (15) | x-ray |
| Reinus et al., 1998 [33] | Prospective, consecutive | Accident and emergency department | 482 patients receiving lumbar spine x-rays | 2.1% (10) | x-ray |
| Holmes et al., 2003 [34] | Prospective | Accident and emergency department | 2,404 blunt trauma patients receiving x-rays | 6.3% (152) | x-ray |
| Hsu et al., 2003 [35] | Prospective | Accident and emergency department | 100 multi-trauma patients | 29.0% (29) | x-ray/CT/MRI |
| Gibson et al., 1992 [36] | Prospective, consecutive | Accident and emergency department | 108 acute LBP patients receiving lumbar spine x-ray | 6.5% (7) | x-ray |
| Gestring et al., 2002 [37] | Prospective | Accident and emergency department | 71 blunt trauma patients requiring computed tomography (CT) and x-ray | 14.1% (10) | x-ray/CT |
| Frankel et al., 1994 [38] | Prospective, consecutive | Accident and emergency department | 167 blunt trauma patients receiving surveillance thoracolumbar x-rays | 9.0% (15) | x-ray |
| Durham et al., 1995 [39] | Retrospective chart review | Accident and emergency department | 344 blunt trauma patients receiving x-ray | 13.7% (47) | x-ray/CT |

4. Discussion

Clinical guidelines for the management of low back pain advocate the use of red flags to raise the index of suspicion concerning serious spinal pathology. This study provides the first diagnostic systematic review of these red flags and other clinical features for identifying vertebral fracture in low back pain patients. It contains a more detailed analysis of the red flags by summarizing the diagnostic accuracy quantitatively and exploring methodological quality of the primary studies. By closely adhering to guidelines developed for performing diagnostic systematic reviews [17,18], it identified a number of shortcomings in the available evidence for this important area of primary care practice. Most of the clinical features identified as being useful were evaluated by only one study, and the studies were generally of low methodological quality, or did not sufficiently report on the methodological quality items. The inclusion of many of the red flags recommended in clinical practice guidelines is not supported by the literature.

This review identified five clinical features that, when present, should alert clinicians to a greater possibility of vertebral fracture. Conversely, when these features are absent, there is a smaller probability of vertebral fracture. These features are age >50 years, female gender, major trauma, pain and tenderness, and a distracting painful injury. The results were inconsistent for abnormal neurological signs and a positive or equivocal clinical examination.
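How much a feature shifts the probability of fracture depends on the pre-test probability. The sketch below applies the standard odds form of Bayes' theorem to a likelihood ratio. The pre-test values of 0.5% and 4% are the primary care prevalence estimates cited in the Introduction, and the likelihood ratios are those reported for major trauma. Combining them is an illustrative assumption, since that feature was evaluated in a hospital population rather than in primary care.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood ratio cannot be negative")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Prevalence estimates from the Introduction; LRs for major trauma from the Results.
for prevalence in (0.005, 0.04):
    if_present = post_test_probability(prevalence, 12.8)
    if_absent = post_test_probability(prevalence, 0.37)
    print(f"pre-test {prevalence:.1%}: present {if_present:.1%}, absent {if_absent:.1%}")
```

Because the calculation works on odds, the same likelihood ratio produces a much larger absolute change at a pre-test probability of 4% than at 0.5%.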

Age >50 years is consistently included as a red flag for fracture in clinical guidelines [11,13,14,41], and two studies were identified that evaluated age >50 as a clinical feature [28,30]. These studies reported an increasing likelihood of fracture with increasing age over 50 years, with the highest LR in females aged >75 years (LR+ = 4.4) [30]. However, due to differences in the clinical setting and age cutoffs used in the two studies, we were unable to pool the results. Older females and those who have used corticosteroids are predisposed to osteoporosis [28], and thus to an increased risk of fracture even from minor trauma [7,8]. Only one study investigated corticosteroid use [28], and the results did not significantly raise or lower the probability of fracture.

Table 3

Study quality assessment using the QUADAS scale

| QUADAS item | Inter-rater reliability κ | Percent agreement (%) |
| --- | --- | --- |
| Spectrum composition | 0.18 | 50 |
| Selection criteria | 0.85 | 92 |
| Disease progression bias | 0.63 | 92 |
| Partial verification bias | 0.25 | 50 |
| Differential verification bias | -0.01 | 25 |
| Incorporation bias | (a) | 100 |
| Test execution details | 0.04 | 25 |
| Reference execution details | 0.43 | 83 |
| Test review bias | 0.00 | 8 |
| Diagnostic review bias | 0.27 | 83 |
| Clinical review bias | 0.00 | 92 |
| Uninterpretable results | 0.00 | 67 |
| Withdrawals | -0.09 | 83 |

| Authors | Item ratings, in the item order above |
| --- | --- |
| Deyo and Diehl, 1986 [28] | + - + - + - ? ? + - + |
| Scavone et al., 1981 [29] | + ? + + - - ? ? + - - |
| van den Bosch et al., 2004 [30] | + ? + + + - + ? + - |
| Terregino et al., 1995 [31] | + + + + + + - - ? ? + - - |
| Samuels et al., 1993 [32] | + - + - + + - - ? ? + - - |
| Reinus et al., 1998 [33] | + + + ? - - + + ? - - |
| Holmes et al., 2003 [34] | - + + - ? + - - + - - |
| Hsu et al., 2003 [35] | + + + ? - - ? ? + - - |
| Gibson et al., 1992 [36] | + + - - + - - ? - - |
| Gestring et al., 2002 [37] | - - + ? ? + - + ? - - |
| Frankel et al., 1994 [38] | + + ? ? + - - ? - - |
| Durham et al., 1995 [39] | + + + ? - - ? ? + - - |

Note: +, yes; -, no; ?, unclear. Rows with fewer than 13 ratings are incomplete in the available text, so their ratings cannot be matched to individual items.

(a) The κ statistic could not be calculated for this item because both reviewers uniformly responded yes.

Trauma is an obvious mechanism of injury that can lead to fracture. In the studies not performed in an accident and emergency setting, major or significant trauma provided inconsistent results [28,29]. Despite guidelines suggesting that minor trauma in patients aged >50 years [12,13] should increase the suspicion of fracture, this was not evaluated by the studies included in this review. Studies of a cohort of patients with low back pain or trauma in accident and emergency departments were considered eligible for this review because it was thought that the inclusion of these studies would provide further insights into the clinical presentation of patients with vertebral fractures. Signs such as trauma, pain, and tenderness, and a distracting painful injury were useful in the accident and emergency setting to alert clinicians to the possibility of vertebral fracture.

Most of the eligible studies did not provide sufficient details regarding the execution of the index and reference tests (Table 3), which is important when attempting to apply the results in clinical practice, or validate them in further studies. This is most relevant in tests that require a subjective interpretation, such as the reading of x-rays, determining the presence of abnormal neurological signs [35] or defining major trauma [29]. Without adequate reporting, we cannot know whether the unreported or unclear items on the QUADAS reflect true methodological flaws in the studies. Additionally, our evaluation of the psychometric properties of the QUADAS scale raises questions about its use to rate the quality of low back pain diagnosis studies. Previous studies that have also evaluated the QUADAS scale have reported inconsistent results regarding its reliability [42,43]. As this scale has been proposed for use in future Cochrane Collaboration guidelines [43] on performing diagnostic systematic reviews, further assessment of the psychometric properties of the scale is needed.

The optimal design for assessing the accuracy of a diagnostic test is considered to be a prospective blind comparison of the test (index test) with a reference standard in a consecutive series of patients from a relevant clinical population [20,21]. First-contact primary care settings in the community are often the first line of screening when patients present with low back pain, and so it is in this setting that accurate diagnostic tools are vital. The accident and emergency departments often use a liberal imaging policy for patients presenting with low back pain or following trauma [38] that reduces the need for accurate screening in the clinical examination. The results of this review must be considered with respect to the setting in which the features were assessed, and also the relevance of the clinical populations studied. No eligible study fulfilled this optimal design, and only one study included in the review was performed on a series of low back pain patients seeking treatment at a walk-in clinic [28].

Despite five individual clinical features being identified by this review as useful for the identification of patients


Table 4

Clinical signs and accuracy data extracted from eligible studies (a)

Clinical Features LR+ (95% CI) LR- (95% CI) Sensitivity (%) Specificity (%)

Age

Age > 50 years [28] 2.2 (1.4 to 2.8) 0.34 (0.12 to 0.75) 79 64

Age 55 years [30] 1.7 (1.5 to 1.9) 0.35 (0.22 to 0.54) 82 51

Age 65 years [30] 2.5 (2.1 to 2.8) 0.32 (0.21 to 0.47) 78 68

Age 75 years [30] 3.7 (2.9 to 4.5) 0.49 (0.37 to 0.62) 59 84

Gender

Female [30] 1.3 (1.1 to 1.4) 0.65 (0.45 to 0.90) 72 43

Female [32] 2.3 (1.1 to 4.3) 0.67 (0.37 to 0.97) 47 80

Female 55 years [30] 2.0 (1.6 to 2.4) 0.54 (0.40 to 0.70) 63 69

Female 65 years [30] 2.8 (2.2 to 3.3) 0.52 (0.40 to 0.66) 59 79

Female 75 years [30] 4.4 (3.3 to 5.7) 0.62 (0.50 to 0.73) 45 90

Male 55 years [30] 1.1 (0.7 to 1.7) 0.98 (0.86 to 1.07) 19 83

Male 65 years [30] 1.7 (1.1 to 2.7) 0.92 (0.81 to 0.99) 18 90

Male 75 years [30] 2.7 (1.6 to 4.6) 0.90 (0.81 to 0.97) 15 95

Trauma

Significant trauma [28] 1.9 (0.7 to 4.6) 0.88 (0.59 to 1.05) 21 89

Major trauma [29] 12.8 (8.3 to 18.7) 0.37 (0.20 to 0.57) 65 95

Minor trauma [29] 1.1 (0.6 to 1.9) 0.97 (0.71 to 1.15) 27 76

Trauma (acute fracture) [33] 1.1 (0.5 to 2.0) 0.94 (0.49 to 1.31) 40 64

Severe mechanism of injury [34] 1.7 (1.4 to 1.9) 0.62 (0.50 to 0.75) 61 64

Trauma [36] 2.1 (1.9 to 2.6) 0.00 (0.00 to 0.70) 100 52

Trauma & neurological signs [36] 14.4 (2.7 to 69.9) 0.73 (0.37 to 0.94) 29 98

Corticosteroids

Using corticosteroids [28] 0.0 (0.0 to 37.2) 1.01 (1.01 to 1.03) 0 99

Altered consciousness

Amnesia [31] 0.8 (0.4 to 1.1) 1.42 (0.80 to 2.13) 47 37

Loss of consciousness [31] 0.9 (0.5 to 1.4) 1.09 (0.59 to 1.68) 53 43

Decreased level of consciousness [34] 1.3 (1.0 to 1.6) 0.92 (0.82 to 1.01) 28 78

ETOH or drug intoxication [34] 0.9 (0.6 to 1.3) 1.02 (0.94 to 1.08) 15 84

Other injury

Lower extremity injury [31] 0.5 (0.1 to 1.6) 1.14 (0.85 to 1.32) 12 77

Upper extremity injury [31] 0.8 (0.2 to 2.5) 1.04 (0.77 to 1.18) 12 85

Distracting painful injury [34] 1.7 (1.3 to 2.0) 0.78 (0.67 to 0.88) 41 75

Pain/tenderness

Low back pain with radiation [29] 0.0 (0.0 to 0.9) 1.16 (1.16 to 1.20) 0 86

Sciatica [29] 0.4 (0.1 to 2.1) 1.06 (0.89 to 1.10) 4 91

Hip/leg pain [29] 0.0 (0.0 to 1.5) 1.10 (1.09 to 1.12) 0 91

Pain [31] 3.9 (1.9 to 7.1) 0.60 (0.35 to 0.85) 47 88

Spinal pain [34] 1.1 (1.0 to 1.2) 0.79 (0.60 to 1.01) 72 35

Back pain/tenderness [35] 7.2 (3.3 to 16.2) 0.42 (0.25 to 0.62) 62 91

Pain or tenderness [38] 6.7 (4.4 to 10.2) 0.44 (0.32 to 0.58) 60 91

Local tenderness [29] 1.9 (1.2 to 2.6) 0.68 (0.44 to 0.93) 50 73

Tenderness [31] 8.0 (3.8 to 15.9) 0.50 (0.28 to 0.74) 53 93

Tenderness [34] 1.0 (0.9 to 1.1) 1.04 (0.80 to 1.33) 71 28

Tenderness on palpation [37] 0.9 (0.4 to 1.7) 1.11 (0.56 to 1.72) 40 54

Muscle spasm [29] 1.3 (0.4 to 3.2) 0.98 (0.78 to 1.06) 12 91

Back bruising [35] 4.8 (0.6 to 35.9) 0.95 (0.79 to 1.02) 7 99

Altered neurological signs

Neuromotor deficits [28] 0.0 (0.0 to 2.0) 1.12 (1.12 to 1.17) 0 89

Abnormal deep tendon reflex [29] 1.1 (0.4 to 2.8) 0.99 (0.79 to 1.08) 12 89

Sensory deficit [29] 2.2 (1.1 to 3.9) 0.83 (0.61 to 0.99) 27 88

Muscle weakness/atrophy [29] 2.2 (1.0 to 4.1) 0.86 (0.65 to 1.00) 23 90

Abnormal neurologic exam [34] 1.0 (0.6 to 1.6) 1.00 (0.94 to 1.05) 9 91

Abnormal neurology [35] 9.8 (3.2 to 30.5) 0.61 (0.42 to 0.78) 41 96

Neurological signs, SLR < 40 [36] 2.4 (0.6 to 6.7) 0.81 (0.41 to 1.06) 29 88

Deformity

Deformity or neurologic deficit [31] 46.4 (2.3 to 929.1) 0.86 (0.72 to 1.04) 12 100

Step deformity [35] 21.6 (1.2 to 388.9) 0.86 (0.74 to 1.00) 14 100

Global clinician judgement

Positive or equivocal clinical exam [32] 2.9 (2.5 to 4.0) 0.00 (0.00 to 0.31) 100 66

Positive or equivocal clinical exam [39] 1.2 (0.9 to 1.4) 0.72 (0.45 to 1.09) 72 39

Abbreviation: LR, likelihood ratio.

^a Shading represents significant results.
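The likelihood ratios in Table 4 are related to the reported sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The Python sketch below is illustrative only and is not part of the review's methods; the function name `likelihood_ratios` is hypothetical. Because it starts from the rounded percentages in the table, its results can differ slightly from the tabulated values (for major trauma [29] it gives an LR+ of about 13.0 against the 12.8 reported), which were presumably calculated from the extracted 2 x 2 counts.

```python
def likelihood_ratios(sensitivity_pct, specificity_pct):
    """Return (LR+, LR-) from sensitivity and specificity given in percent.

    Hypothetical helper for illustration; a zero denominator is
    reported as infinity rather than raising an error.
    """
    sens = sensitivity_pct / 100.0
    spec = specificity_pct / 100.0
    # LR+ = P(feature present | fracture) / P(feature present | no fracture)
    lr_pos = sens / (1.0 - spec) if spec < 1.0 else float("inf")
    # LR- = P(feature absent | fracture) / P(feature absent | no fracture)
    lr_neg = (1.0 - sens) / spec if spec > 0.0 else float("inf")
    return lr_pos, lr_neg


# Major trauma [29] in Table 4: sensitivity 65%, specificity 95%.
lr_pos, lr_neg = likelihood_ratios(65, 95)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
```

An LR+ well above 1 raises the probability of fracture when the feature is present, and an LR- well below 1 lowers it when the feature is absent.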

with vertebral fracture, testing a combination of individual features could yield a more sensitive screening test. However, no study in this review investigated the combination of more than two features. Deyo and Diehl [44] evaluated a combination of four features to identify patients presenting with cancer as a cause of back pain, finding that this combination increased sensitivity to 100%. As no clinical guidelines recommend the routine use of x-rays [1], clinicians need to be able to confidently exclude patients without vertebral fracture, so that patients do not undergo unnecessary diagnostic testing while no fracture is missed.
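How far a single clinical feature moves a clinician toward or away from a diagnosis of fracture can be made concrete by applying its likelihood ratio to a pre-test probability in odds form (post-test odds = pre-test odds x LR). The Python sketch below is a minimal illustration under stated assumptions: the 4% pre-test probability is a hypothetical value chosen only for the example, the function name `post_test_probability` is hypothetical, and the likelihood ratios are those reported for major trauma [29] in Table 4.

```python
def post_test_probability(pre_test_prob, likelihood_ratio):
    """Apply a likelihood ratio to a pre-test probability via odds."""
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must lie strictly between 0 and 1")
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical pre-test probability of vertebral fracture (assumption).
PRE_TEST = 0.04

# Likelihood ratios for major trauma [29], taken from Table 4.
p_if_present = post_test_probability(PRE_TEST, 12.8)
p_if_absent = post_test_probability(PRE_TEST, 0.37)

print(f"Major trauma present: {p_if_present:.1%}")
print(f"Major trauma absent:  {p_if_absent:.1%}")
```

Under these assumptions, the presence of major trauma raises the probability of fracture substantially, whereas its absence lowers an already small probability only modestly, which is consistent with the argument that single features are of limited value for ruling fracture out.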

There is limited support for the use of some of the commonly suggested “red flags” to alert clinicians to the possibility of vertebral fracture in low back pain patients. Most clinical practice guidelines do not provide evidence-based recommendations to screen for vertebral fracture. The limited number of studies and inadequate reporting of methodological quality items indicate a clear need for future high-quality studies on screening for vertebral fracture in low back pain patients.

Acknowledgments

N.H. is under scholarship awarded by the National Health & Medical Research Council of Australia. C.M. is a senior research fellow funded by the National Health & Medical Research Council of Australia.

References

[1] Koes BW, van Tulder MW, Ostelo R, Kim Burton A, Waddell G. Clinical guidelines for the management of low back pain in primary care: an international comparison. Spine 2001;26:2504-13.

[2] Waddell G. The back pain revolution. 2nd edition. Edinburgh: Churchill Livingstone; 2004.

[3] O’Neill TW, Cockerill W, Matthis C, Raspe HH, Lunt M, Cooper C, et al. Back pain, disability, and radiographic vertebral fracture in European women: a prospective study. Osteoporos Int 2004;15:760-5.

[4] Cooper C, O’Neill T, Silman A. The epidemiology of vertebral fractures. European Vertebral Osteoporosis Study Group. Bone 1993;14(Suppl 1):S89-97.

[5] Suarez-Almazor ME, Belseck E, Russell AS, Mackel JV. Use of lumbar radiographs for the early diagnosis of low back pain. Proposed guidelines would increase utilization. JAMA 1997;277:1782-6.

[6] Deyo RA, Rainville J, Kent DL. What can the history and physical examination tell us about low back pain? JAMA 1992;268:760-5.

[7] Papaioannou A, Watts NB, Kendler DL, Chui KY, Adachi JD, Ferko N. Diagnosis and management of vertebral fractures in elderly adults. Am J Med 2002;113:220-8.

[8] Grigoryan M, Guermazi A, Roemer FW, Delmas PD, Genant HK. Recognizing and reporting osteoporotic vertebral fractures. Eur Spine J 2003;12(Suppl 2):S104-12.

[9] Terret A, Kleynhans A. Complications for manipulation of the low back. Chiropr J Aust 1992;22:129-39.

[10] Borenstein DG. Epidemiology, etiology, diagnostic evaluation, and treatment of low back pain. Curr Opin Rheumatol 2001;13:128-34.

[11] Bigos S, Bowyer O, Braen G. Acute low back problems in adults. Clinical practice guideline no. 14. Rockville, MD: Agency for Health Care Policy and Research, Public Health Service, U.S. Department of Health and Human Services; 1994.

[12] NHMRC. Evidence-based management of acute musculoskeletal pain. Bowen Hills, Queensland: Australian Academic Press; 2003.

[13] Waddell G, Feder G, McIntosh A, Lewis M, Hutchison A. Low back pain evidence review. London: Royal College of General Practitioners; 1996.

[14] van Tulder M, Becker A, Bekkering T, Breen A, Gil del Real MT, Hutchinson A, et al. European guidelines for the management of acute nonspecific low back pain in primary care, 2004. Available at www.backpaineurope.org. Accessed May 1, 2005.

[15] Whiting P, Rutjes AWS, Reitsma JB, Bossuyt PMM, Kleijnen J. The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Med Res Methodol 2003;3:25.

[16] de Vet HC, van der Weijden T, Muris JW, Heyrman J, Buntinx F, Knottnerus JA. Systematic reviews of diagnostic research. Considerations about assessment and incorporation of methodological quality. Eur J Epidemiol 2001;17:301-6.

[17] Irwig L, Macaskill P, Glasziou P, Fahey M. Meta-analytic methods for diagnostic test accuracy. J Clin Epidemiol 1995;48:119-30. [discussion 131-2].

[18] Irwig L, Tosteson AN, Gatsonis C, Lau J, Colditz G, Chalmers TC, et al. Guidelines for meta-analyses evaluating diagnostic tests. Ann Intern Med 1994;120:667-76.

[19] Deville WL, Buntinx F, Bouter LM, Montori VM, de Vet HC, van der Windt DA, et al. Conducting systematic reviews of diagnostic studies: didactic guidelines. BMC Med Res Methodol 2002;2:9.

[20] Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration. Ann Intern Med 2003;138:W1-W12.

[21] Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JH, et al. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA 1999;282:1061-6.

[22] Chmura Kraemer H, Periyakoil VS, Noda A. Kappa coefficients in medical research. Stat Med 2002;21:2109-29.

[23] Newcombe RG. Interval estimation for the difference between independent proportions: comparison of eleven methods. Stat Med 1998;17:873-90.

[24] van Tulder MW, Assendelft WJ, Koes BW, Bouter LM. Spinal radiographic findings and nonspecific low back pain. A systematic review of observational studies. Spine 1997;22:427-34.

[25] Cummings SR, Bates D, Black DM. Clinical use of bone densitometry: scientific review. JAMA 2002;288:1889-97.

[26] Jarvik JG, Deyo RA. Diagnostic evaluation of low back pain with emphasis on imaging. Ann Intern Med 2002;137:586-97.

[27] Woltmann A, Buhren V. Shock trauma room management of spinal injuries in the framework of multiple trauma. A systematic review of the literature. Unfallchirurg 2004;107:911-8.

[28] Deyo RA, Diehl AK. Lumbar spine films in primary care: current use and effects of selective ordering criteria. J Gen Intern Med 1986;1:20-5.

[29] Scavone JG, Latshaw RF, Rohrer GV. Use of lumbar spine films. Statistical evaluation at a university teaching hospital. JAMA 1981;246:1105-8.

[30] van den Bosch MAAJ, Hollingworth W, Kinmonth AL, Dixon AK. Evidence against the use of lumbar spine radiography for low back pain. Clin Radiol 2004;59:69-76.

[31] Terregino CA, Ross SE, Lipinski MF, Foreman J, Hughes R. Selective indications for thoracic and lumbar radiography in blunt trauma. Ann Emerg Med 1995;26:126-9.

[32] Samuels LE, Kerstein MD. ‘Routine’ radiologic evaluation of the thoracolumbar spine in blunt trauma patients: a reappraisal. J Trauma 1993;34:85-9.

[33] Reinus WR, Strome G, Zwemer FL Jr. Use of lumbosacral spine radiographs in a level II emergency department. Am J Roentgenol 1998;170:443-7.

[34] Holmes JF, Panacek EA, Miller PQ, Lapidis AD, Mower WR. Prospective evaluation of criteria for obtaining thoracolumbar radiographs in trauma patients. J Emerg Med 2003;24:1-7.

[35] Hsu JM, Joseph T, Ellis AM. Thoracolumbar fracture in blunt trauma patients: Guidelines for diagnosis and imaging. Injury 2003;34:426-33.

[36] Gibson M, Zoltie N. Radiography for back pain presenting to accident and emergency departments. Arch Emerg Med 1992;9:28-31.

[37] Gestring ML, Gracias VH, Feliciano MA, Reilly PM, Shapiro MB, Johnson JW, et al. Evaluation of the lower spine after blunt trauma using abdominal computed tomographic scanning supplemented with lateral scanograms. J Trauma 2002;53:9-14.

[38] Frankel HL, Rozycki GS, Ochsner MG, Harviel JD, Champion HR. Indications for obtaining surveillance thoracic and lumbar spine radiographs. J Trauma 1994;37:673-6.

[39] Durham RM, Luchtefeld WB, Wibbenmeyer L, Maxwell P, Shapiro MJ, Mazuski JE. Evaluation of the thoracic and lumbar spine after blunt trauma. Am J Surg 1995;170:681-4.

[40] Landis JR, Koch GG. A one-way components of variance model for categorical data. Biometrics 1977;33:671-9.

[41] ACC. The New Zealand acute low back pain guide. Wellington, New Zealand: Accident Compensation Corporation; 2003.

[42] Hollingworth W, Medina LS, Lenkinski RE, Shibata DK, Bernal B, Zurakowski D, et al. Interrater reliability in assessing quality of diagnostic accuracy studies using the QUADAS tool. A preliminary assessment. Acad Radiol 2006;13:803-10.

[43] Whiting PF, Weswood ME, Rutjes AWS, Reitsma JB, Bossuyt PNM, Kleijnen J. Evaluation of QUADAS, a tool for the quality assessment of diagnostic accuracy studies. BMC Med Res Methodol 2006;6:9.

[44] Deyo RA, Diehl AK. Cancer as a cause of back pain: frequency, clinical presentation, and diagnostic strategies. J Gen Intern Med 1988;3:230-8.

[45] Waddell G. An approach to backache. Br J Hosp Med 1982;28:187-94.

Appendix 1. Search strategy

MEDLINE/CINAHL (30-3-05)

1. medical history taking.mp. or exp Medical History Taking/
2. Physical examination.mp. or exp Physical Examination/
3. exp RADIOGRAPHY/ or radiography.mp.
4. x-ray.mp. or exp X-Rays/
5. back pain/ra
6. low back pain/ra
7. spine/ra
8. spinal diseases/ra
9. lumbar vertebrae/ra
10. magnetic resonance imaging.mp. or exp Magnetic Resonance Imaging/
11. exp Tomography, X-Ray Computed/ or computed tomography.mp.
12. nuclear medicine.mp. or exp Nuclear Medicine/
13. single photon emission computed tomography.mp. or exp Tomography, Emission-Computed, Single-Photon/
14. back pain/ri
15. low back pain/ri
16. spine/ri
17. spinal diseases/ri
18. lumbar vertebrae/ri
19. radionuclide imaging.mp. or exp Radionuclide Imaging/
20. scintigraphy.mp.
21. bone scan.mp.
22. exp "Laboratory Techniques and Procedures"/ or laboratory tests.mp. or exp Diagnostic Tests, Routine/
23. erythrocyte sedimentation rate.mp. or exp Blood Sedimentation/
24. complete blood count.mp.
25. questionnaires.mp. or exp QUESTIONNAIRES/
26. clinical history.mp.
27. blood cell count.mp. or exp Blood Cell Count/
28. or/1-27
29. dorsalgia.ti,ab.
30. back pain.mp. or exp Back Pain/
31. backache.ti,ab.
32. (lumbar adj pain).ti,ab.
33. (spinal adj pain).ti,ab.
34. coccyx.ti,ab.
35. coccydynia.ti,ab.
36. sciatica.mp. or exp SCIATICA/
37. spondylosis.ti,ab.
38. Lumbago.ti,ab.
39. sacroiliac joint.mp. or exp Sacroiliac Joint/
40. spine.mp. or exp SPINE/
41. or/29-40
42. fracture.mp. or exp FRACTURES/ or exp SPINAL FRACTURES/

EMBASE (30-3-05)