
Do Guidelines for the Diagnosis and Monitoring of Diabetes Mellitus Fulfill the Criteria of Evidence-Based Guideline Development?

Eva Nagy,1 Joseph Watine,2 Peter S. Bunting,3 Rita Onody,1 Wytze P. Oosterhuis,4 Dunja Rogic,5 Sverre Sandberg,6 Krisztina Boda,7 and Andrea R. Horvath1*

BACKGROUND: Although the methodological quality of therapeutic guidelines (GLs) has been criticized, little is known regarding the quality of GLs that make diagnostic recommendations. Therefore, we assessed the methodological quality of GLs providing diagnostic recommendations for managing diabetes mellitus (DM) and explored several reasons for differences in quality across these GLs.

METHODS: After systematic searches of published and electronic resources dated between 1999 and 2007, 26 DM GLs, published in English, were selected and scored for methodological quality using the AGREE Instrument. Subgroup analyses were performed based on the source, scope, length, origin, and date and type of publication of GLs. Using a checklist, we collected laboratory-specific items within GLs thought to be important for interpretation of test results.

RESULTS: The 26 diagnostic GLs had significant shortcomings in methodological quality according to the AGREE criteria. GLs from agencies that had clear procedures for GL development, were longer than 50 pages, or were published in electronic databases were of higher quality. Diagnostic GLs contained more preanalytical or analytical information than combined (i.e., diagnostic and therapeutic) recommendations, but the overall quality was not significantly different. The quality of GLs did not show much improvement over the time period investigated.

CONCLUSIONS: The methodological shortcomings of diagnostic GLs in DM raise questions regarding the validity of recommendations in these documents that may affect their implementation in practice. Our results suggest the need for standardization of GL terminology and for higher-quality, systematically developed recommendations based on explicit guideline development and reporting standards in laboratory medicine.

© 2008 American Association for Clinical Chemistry

Clinical practice guidelines are systematically developed statements to assist practitioner and patient decisions about appropriate health care for specific clinical circumstances (1). The methodological quality of practice guidelines (GLs)8 has been widely criticized (2). As effective treatment requires effective diagnosis, recommendations for the clinical use of tests should also fulfill the criteria of evidence-based guideline development (3). Assuring methodological quality of GLs requires that the potential biases of GL development have been addressed adequately and that the recommendations are valid and feasible in practice. Therefore, the aim of our current study was to investigate whether GL development teams use appropriate and explicit methods for making diagnostic recommendations and whether diagnostic GLs meet basic reporting standards. For these assessments, we chose laboratory diagnosis and monitoring of diabetes mellitus (DM), one of the global health problem areas in which GLs are most widely used worldwide. In previous work using the AGREE (Appraisal of Guidelines Research and Evaluation) instrument (4), we showed that the methodological quality of 4 DM GLs, issued by prestigious authorities between 1999 and 2003, was rather low (5). The objectives of the current study were to examine whether similar findings were present in a larger sample of recently published DM GLs, and if so, to identify some of the reasons for such findings. We were also interested in differences between GLs that are primarily diagnostic compared to those that are combined with therapeutic recommendations. Owing to the high number and heterogeneous nature of diagnostic questions and recommendations addressed in DM guidelines, and the fact that AGREE is designed to assess the methodological quality of GLs only, our study did not investigate the accuracy of the content of guidelines.

1 Department of Clinical Chemistry, University of Szeged, Medical Faculty, Szeged, Hungary; 2 Laboratoire de Biologie Polyvalente, Hôpital Général, Rodez, France; 3 Department of Pathology and Laboratory Medicine, The Ottawa Hospital, Ottawa, Ontario, Canada; 4 Department of Clinical Chemistry, Atrium Medical Centre, Heerlen, The Netherlands; 5 Institute of Clinical Laboratory Diagnosis, Zagreb University School of Medicine, Clinical Hospital Center, Zagreb, Croatia; 6 Laboratory of Clinical Biochemistry, Haukeland University Hospital, Bergen, Norway; 7 Department of Medical Informatics, University of Szeged, Medical Faculty, Szeged, Hungary.

* Address correspondence to this author at: Department of Clinical Chemistry, University of Szeged, Medical Faculty, Somogyi Bela ter 1, Szeged, H-6725 Hungary. E-mail [email protected].

Received August 6, 2008; accepted August 7, 2008.
Previously published online at DOI: 10.1373/clinchem.2008.109082

8 Nonstandard abbreviations: GL, guideline; DM, diabetes mellitus; AGREE, Appraisal of Guidelines Research and Evaluation; D, AGREE domain; I, AGREE item.

Clinical Chemistry 54:11 1872–1882 (2008)

Evidence-based Laboratory Medicine and Test Utilization

Materials and Methods

SEARCH STRATEGY

We carried out a systematic literature search to retrieve diagnostic GLs in DM. The aim of the search was to obtain a representative sample of GLs, published in English between 1 January 1999 and 31 December 2007, that can be easily accessed and are therefore likely to be read and used in many countries. In PubMed, 1 reviewer (J. Watine) applied a broad search strategy using the Clinical Queries filter “systematic[sb],” which is capable of retrieving systematic reviews and/or GLs (6). This term was combined with the laboratory-specific MeSH terms “Clinical Laboratory Techniques”[MeSH] AND systematic[sb] OR “Laboratory Techniques and Procedures”[MeSH] AND systematic[sb] (7). Another independent reviewer (E. Nagy) searched in electronic journals using the keywords “guideline” AND “diabetes,” and in dedicated GL databases and websites of professional organizations. The databases searched are shown in Data Supplement 1, which accompanies the online version of this article at http://www.clinchem.org/content/vol54/issue11.

SELECTION OF GUIDELINES ELIGIBLE FOR THE STUDY

Based on the titles and/or abstracts, references were screened for relevance (E. Nagy, J. Watine). Using this subset, 2 independent reviewers (E. Nagy, A.R. Horvath) applied the following inclusion criteria: the publication fulfilled the definition of GLs (1) and dealt with the use of laboratory tests for the diagnosis or monitoring of DM, and the GL was publicly available in a peer-reviewed journal and/or in nationally or internationally endorsed GL databases. If several updates of the GL were available during the studied time period, only the latest version was selected. All these criteria had to be met for enrollment into the study.

We excluded publications that contained therapeutic recommendations only, were primarily focused on technical/analytical/quality control/standardization/quality management issues, referred to special patient groups, or offered local protocols on best practice (i.e., restricted to 1 particular health care setting).

EVALUATION OF THE METHODOLOGICAL QUALITY OF GUIDELINES

It has been shown that compliance with the AGREE criteria of most GL development programs is high (8), and, therefore, we used AGREE, a standardized, generic, and validated checklist (4, 9, 10), along with its accompanying Training Manual, for the assessment of GLs. AGREE arranges 23 criteria, hereafter referred to as AGREE items I1 through I23, into 6 key domains (D1 through D6). Selected GLs were randomly allocated to 2 assessment teams with 4 reviewers per team. Reviewers were trained to use AGREE in a pilot study before conducting this larger survey (5). To make the appraisal process as objective as possible, reviewers were provided with all supplementary files referenced by each GL and found in the public domain (E. Nagy), including background supporting materials, technical papers, or general GL development manuals issued by the respective GL agency.

Reviewers independently assessed the fulfillment of the AGREE criteria on a 4-point Likert scale. Disagreements in 2 or more scores between appraisers were resolved by discussion and consensus. An independent reviewer’s opinion (A.R. Horvath) was required in 1 case only for reaching consensus. Domain scores were expressed in percentages, and a final conclusion was reached about the acceptability of the GL according to the instructions of AGREE. A GL was “strongly recommended” if the majority of items scored 3 or 4 and most domain scores (i.e., at least 4 of 6) were ≥60%. A GL was “not recommended” if the majority of items rated 1 or 2 and most of the domain scores (i.e., 4 or more of 6) were ≤30%. Guidelines were “recommended with provisos or alterations” when the GL rated high (3 or 4) or low (1 or 2) on a similar number of items and most domain scores were between 30% and 60%. For investigating whether diagnostic GLs meet additional reporting standards (11) that are not covered in depth in the AGREE, we assessed the presence of the following items: (a) an evidence table, (b) a description of the grading system, (c) graded recommendations, and (d) an expiry or review date. Additionally, we assessed whether the GL contained data thought to be important for test interpretation (3), such as (e) prevalence, (f) diagnostic accuracy of tests, (g) preanalytical, and (h) analytical specifications. All reviewers checked the availability of these items, and results were summarized by 1 independent assessor (E. Nagy).
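The scoring and classification steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the standardized domain score follows the calculation described in the AGREE Instrument (the obtained score rescaled between the minimum and maximum possible scores), whereas the function names, the use of integer consensus item scores, the inclusive 60% and 30% cut-offs, and the fall-through to the intermediate category are assumptions made for the example.

```python
from typing import Sequence


def standardized_domain_score(ratings: Sequence[Sequence[int]],
                              scale_min: int = 1,
                              scale_max: int = 4) -> float:
    """Rescale the summed ratings of one domain to a 0-100% score.

    `ratings` holds one sequence of item scores per appraiser.
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    obtained = sum(sum(r) for r in ratings)
    min_possible = scale_min * n_items * n_appraisers
    max_possible = scale_max * n_items * n_appraisers
    return 100.0 * (obtained - min_possible) / (max_possible - min_possible)


def overall_assessment(item_scores: Sequence[int],
                       domain_scores: Sequence[float]) -> str:
    """Apply the three-way rule to one guideline (hypothetical helper).

    Assumes integer item scores (1-4) and six domain scores in percent.
    """
    high_items = sum(1 for s in item_scores if s >= 3)
    low_items = sum(1 for s in item_scores if s <= 2)
    high_domains = sum(1 for d in domain_scores if d >= 60)
    low_domains = sum(1 for d in domain_scores if d <= 30)
    if high_items > len(item_scores) / 2 and high_domains >= 4:
        return "strongly recommended"
    if low_items > len(item_scores) / 2 and low_domains >= 4:
        return "not recommended"
    # Simplification: everything else falls into the intermediate category.
    return "recommended with provisos or alterations"


# Four appraisers rating a three-item domain (made-up ratings).
example = [[3, 3, 3], [2, 2, 2], [3, 2, 3], [2, 3, 2]]
print(standardized_domain_score(example))
```

The rescaling makes domains with different numbers of items comparable, which is why the recommendation rule can be stated in terms of percentages rather than raw sums.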

We created 5 subgroups of GLs based on their source, scope, length, and origin and whether they were supplemented with a guideline methods manual. We also investigated the quality of guidelines according to the date and type of publication. In the statistical analyses (K. Boda), the mean item and standardized domain scores of GL subgroups were compared by the Kruskal–Wallis test. Pair-wise comparisons were carried out using the Mann–Whitney U-test with Bonferroni correction. The frequency of reporting laboratory-specific information in different guideline subgroups was compared with the Fisher exact test. The level of significance was set at P < 0.01 because of multiple comparisons. All analyses were performed using SPSS for Windows, version 13.
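As an illustration of the last of these tests, the following is a minimal pure-Python sketch of a two-sided Fisher exact test on a 2 × 2 table. The analyses in the study were run in SPSS; the function name and the counts in the example are hypothetical and serve only to show how reporting frequencies in two GL subgroups could be compared.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Probability of x "reported" cases in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Hypothetical counts: item reported / not reported in two subgroups.
p_value = fisher_exact_two_sided(8, 2, 7, 9)
print(f"P = {p_value:.3f}; significant at 0.01: {p_value < 0.01}")
```

An exact test is a natural choice here because subgroups of 7 to 19 guidelines are too small for the large-sample approximation behind a chi-square test.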

Results

Of 2630 references retrieved in a broad search, we found 497 GLs to be related to laboratory medicine (Fig. 1). After screening for relevance, we subjected 54 GLs to the selection criteria described. Fig. 1 shows the reasons for excluding 28 GLs; 26 GLs became eligible for critical appraisal (12–37). The most important characteristics of GLs are summarized in Table 1. Nine GLs originated from the USA, 3 from Canada, 7 from the UK, 1 each from Australia, New Zealand, and South Africa, and 4 were international. All but 2 GLs were developed in the last 6 years.
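The counts in this selection process can be checked with simple arithmetic. The short Python sketch below uses the numbers reported in the text and in Fig. 1; the variable names are illustrative, and the assignment of n = 3 to the "technical or analytical paper" category is inferred from the figure, where that count is printed apart from its label.

```python
# Counts as reported in the text and Fig. 1.
medline = 222
guideline_databases = 295
overlap = 20
lab_related = medline + guideline_databases - overlap

excluded_not_relevant = 443
read_in_full = lab_related - excluded_not_relevant

exclusion_reasons = {
    "technical or analytical paper": 3,  # inferred placement of "n = 3"
    "duplicate publication": 1,
    "inappropriate topic": 3,
    "no recommendation": 4,
    "newer updates available": 6,
    "special patient group": 2,
    "local protocol": 6,
    "parts of 1 guideline (4-part guideline merged into 1)": 3,
}
excluded_after_reading = sum(exclusion_reasons.values())
selected = read_in_full - excluded_after_reading

# Countries of origin as listed in the Results.
origins = {"USA": 9, "Canada": 3, "UK": 7, "Australia": 1,
           "New Zealand": 1, "South Africa": 1, "International": 4}

print(lab_related, read_in_full, excluded_after_reading, selected)
print(sum(origins.values()))
```

Both the flow counts and the country tally agree with the 26 guidelines taken forward for appraisal.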

CRITICAL APPRAISAL OF GUIDELINES

Based on the assessment of methodological quality, 22 GLs were recommended by reviewers, of which only 11 were strongly recommended and the rest “with provisos and alterations.” Four GLs had 4 or 5 domains with scores ≤30%, and reviewers did not recommend their use (Table 1).

The domain and item scores of individual GLs are shown in Table 1 and online Data Supplement 2, respectively. Table 2 summarizes the mean item scores and the number and proportion of GLs scoring >3 on the 4-point Likert scale. Overall, the best-performing domain was D1, “scope and purpose” (77%; Table 1), with a high proportion of GLs scoring above 3 for all items (Table 2). Although D4, “clarity and presentation,” scored highly (76%; Table 1), I18 within this domain performed poorly, as only 10 GLs (38%) were supported with tools for application (Table 2).

Domains 2 and 3, which explored the process of GL development, showed lower scores (Table 1). Nine GLs (35%) scored ≥60% in “stakeholder involvement” and 14 (54%) in “rigor of development” domains. Of individual items in D2, only a small proportion of GLs gave information about the composition and affiliations of the guideline development group, provided some information on patient involvement in the development process, defined their target users clearly, and pilot tested the GL by target users before publication (Table 2). In D3, there are notable shortcomings in using systematic methods for searching the evidence or at least giving some information about literature retrieval, describing clearly the criteria for selecting the evidence, indicating the methods used for formulating recommendations, and giving information on the peer review and updating process. The lowest scores were achieved with “applicability” (34%) and “editorial independence” (39%) domains, in which each item performed very poorly (Table 2 and online Data Supplement 2).

QUALITATIVE ANALYSIS OF GUIDELINES

Date of publication. Quality of GLs was also investigated according to the date of publication to see whether any improvement can be observed over time. Only the highest scoring D1 and D4 showed some marginal development in quality over the time scale investigated (Table 1). GLs seem to have become more specific in stating their objectives (I1) and in creating more focused clinical questions (I2), and the recommendations in GLs have become more easily identifiable (I17) (online Data Supplement 2). However, the poor performance in D6 showed further deterioration from 2005 onward, with failures to report editorial independence and conflict of interest in the majority of GLs (Table 1).

Fig. 1. Searching and selecting guidelines.
Flow diagram: broad search and selection for laboratory-related guidelines in MEDLINE (n = 222) and electronic guideline databases (n = 295; overlap 20), n = 497; screening of titles/abstracts for relevance, with n = 443 excluded for not being relevant to the laboratory management of DM; full articles read for detailed evaluation, n = 54; excluded, n = 28 (technical or analytical paper, n = 3; duplicate publication, n = 1; inappropriate topic, n = 3; no recommendation, n = 4; newer updates available, n = 6; special patient group, n = 2; local protocol, n = 6; parts of 1 guideline, n = 3 [4-part guideline merged into 1]); guidelines selected for appraisal, n = 26 (12–37).

Table 1. Critical appraisal of diabetes mellitus guidelines by the AGREE instrument.

| Guideline | Date of issue | Source a | Scope b | Length, pages | GL manual c | Origin | D1 d, % | D2, % | D3, % | D4, % | D5, % | D6, % | Overall assessment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (12) | 1999 | Both | Diagnostic | >100 | No | USA | 89 | 40 | 69 | 35 | 17 | 21 | Recommend with alteration |
| (13) | 2001 | Database | Combined | 51–100 | Yes | UK | 56 | 75 | 74 | 71 | 8 | 71 | Strongly recommend |
| (14) | 2002 | Journal | Combined | 1–10 | No | North America | 47 | 6 | 17 | 33 | 14 | 4 | Would not recommend |
| (15) | 2002 | Both | Diagnostic | 11–50 | No | USA | 53 | 23 | 31 | 67 | 11 | 17 | Recommend with alteration |
| (16) | 2002 | Database | Combined | >100 | Yes | UK | 92 | 85 | 87 | 98 | 33 | 42 | Strongly recommend |
| (17) | 2002 | Database | Combined | >100 | Yes | UK | 92 | 88 | 90 | 98 | 33 | 42 | Strongly recommend |
| (18) | 2002 | Database | Combined | 1–10 | No | South Africa | 14 | 23 | 6 | 56 | 0 | 0 | Would not recommend |
| (19) | 2002 | Both | Diagnostic | 1–10 | No | Canada | 81 | 21 | 40 | 73 | 19 | 13 | Recommend with alteration |
| (20) | 2003 | Database | Combined | >100 | No | Canada | 86 | 33 | 60 | 90 | 25 | 42 | Recommend with alteration |
| (21) | 2003 | Database | Combined | >100 | Yes | New Zealand | 86 | 83 | 76 | 96 | 56 | 100 | Strongly recommend |
| (22) | 2003 | Database | Diagnostic | 51–100 | Yes | USA | 97 | 21 | 77 | 90 | 39 | 88 | Strongly recommend |
| (23) | 2003 | Database | Diagnostic | >100 | Yes | USA | 94 | 23 | 74 | 81 | 42 | 83 | Strongly recommend |
| (24) | 2003 | Database | Diagnostic | 51–100 | Yes | International | 100 | 33 | 29 | 52 | 58 | 42 | Recommend with alteration |
| (25) | 2004 | Both | Combined | 1–10 | Yes | USA | 97 | 27 | 64 | 79 | 6 | 75 | Recommend with alteration |
| (26) | 2004 | Database | Combined | 1–10 | No | USA, Canada | 42 | 27 | 6 | 65 | 0 | 0 | Would not recommend |
| (27) | 2004 | Database | Combined | >100 | Yes | UK | 97 | 88 | 92 | 98 | 72 | 92 | Strongly recommend |
| (28) | 2005 | Database | Diagnostic | 11–50 | Yes | Canada | 72 | 35 | 13 | 85 | 69 | 29 | Recommend with alteration |
| (29) | 2005 | Database | Combined | 51–100 | Yes | International | 58 | 46 | 55 | 79 | 44 | 96 | Recommend with alteration |
| (30) | 2005 | Database | Diagnostic | >100 | Yes | Australia | 97 | 73 | 90 | 81 | 39 | 21 | Strongly recommend |
| (31) | 2006 | Database | Diagnostic | 51–100 | Yes | International | 78 | 15 | 26 | 69 | 25 | 21 | Wouldn’t recommend |
| (32) | 2007 | Both | Combined | >100 | Yes | USA | 64 | 42 | 39 | 69 | 17 | 46 | Recommend with alteration |
| (33) | 2007 | Both | Combined | 11–50 | Yes | USA | 61 | 31 | 39 | 92 | 39 | 0 | Recommend with alteration |
| (34) | 2007 | Database | Diagnostic | 0–50 | Yes | International | 86 | 27 | 55 | 60 | 28 | 9 | Recommend with alteration |
| (35) | 2007 | Database | Combined | 51–100 | Yes | UK | 97 | 71 | 64 | 90 | 56 | 21 | Strongly recommend |
| (36) | 2007 | Database | Combined | 51–100 | Yes | UK | 97 | 69 | 67 | 88 | 56 | 21 | Strongly recommend |
| (37) | 2007 | Database | Combined | 51–100 | Yes | UK | 75 | 71 | 67 | 81 | 72 | 29 | Strongly recommend |
| Mean | | | | | | | 77 | 45 | 54 | 76 | 34 | 39 | |
| Range | | | | | | | 14–100 | 6–88 | 6–92 | 33–98 | 0–72 | 0–100 | |

a Both, journal and electronic guideline database.
b Combined, diagnostic and therapeutic recommendations.
c GL development manual or technical document was available before GL publication.
d D1, scope and purpose; D2, stakeholder involvement; D3, rigor of development; D4, clarity and presentation; D5, applicability; D6, editorial independence.

Type of publication. We investigated how GL developers defined the type of their publications and whether these reflected the methods used for their development. There was diversity in definitions: 19 publications were labeled as GLs or recommendations, of which 7 stated that they were evidence-based, 4 position statements or reports, and 3 guidance documents (Table 3). Among the 7 evidence-based GLs, 5 had evidence summaries and 6 graded recommendations. Three GLs that had evidence tables did not define their publications as evidence-based GLs (22, 23, 25). More than two-thirds of GLs (n = 18, 69%) defined their grading system, but only 16 (62%) graded their final recommendations (Table 3).

Table 2. Performance of AGREE item scores in diabetes mellitus guidelines.

All GLs (n = 26)

| Domain and item | Mean score | Score >3, n | Rate with score >3, % |
|---|---|---|---|
| D1 a | | | |
| 1 The overall objective of the guideline is specifically described. | 3.34 | 19 | 73 |
| 2 The clinical questions covered by the guideline are specifically described. | 3.16 | 17 | 65 |
| 3 The patients to whom the guideline is meant to apply are specifically described. | 3.45 | 21 | 81 |
| D2 | | | |
| 4 The guideline development team involves all relevant professional groups. | 2.53 | 9 | 35 |
| 5 The patients’ views and preferences have been sought. | 2.12 | 9 | 35 |
| 6 The target users of the guideline are clearly defined. | 3.03 | 15 | 57 |
| 7 The guideline has been piloted among target users. | 1.75 | 6 | 23 |
| D3 | | | |
| 8 Systematic methods were used to search for evidence. | 2.50 | 10 | 38 |
| 9 The criteria for selecting the evidence are clearly described. | 2.42 | 11 | 42 |
| 10 The methods used for formulating the recommendations are clearly defined. | 2.30 | 7 | 27 |
| 11 The health benefits, side effects, and risks have been considered. | 3.14 | 17 | 65 |
| 12 There is an explicit link between the recommendations and the supporting evidence. | 3.12 | 19 | 73 |
| 13 The guideline has been externally reviewed by experts before its publication. | 2.58 | 11 | 42 |
| 14 A procedure for updating the guideline is provided. | 2.31 | 10 | 38 |
| D4 | | | |
| 15 The recommendations are specific and unambiguous. | 3.57 | 22 | 85 |
| 16 The different options for management of the condition are clearly presented. | 3.37 | 21 | 81 |
| 17 The recommendations are easily identifiable. | 3.75 | 24 | 93 |
| 18 The guideline is supported with tools for application. | 2.43 | 10 | 38 |
| D5 | | | |
| 19 Potential barriers in applying the recommendations have been discussed. | 1.88 | 4 | 15 |
| 20 Potential cost implications of applying the recommendations have been considered. | 2.21 | 6 | 23 |
| 21 The guideline presents key review criteria for monitoring and/or audit purposes. | 1.95 | 8 | 31 |
| D6 | | | |
| 22 The guideline is editorially independent from the funding body. | 2.60 | 11 | 42 |
| 23 Conflicts of interest of guideline development members have been recorded. | 1.95 | 8 | 31 |

a D1, scope and purpose; D2, stakeholder involvement; D3, rigor of development; D4, clarity and presentation; D5, applicability; D6, editorial independence.

Procedure for updating guidelines. Item 14 investigates whether GL developers describe the procedures for updating recommendations, including the timescale, responsibilities, and methods used. Fifteen GLs (58%) gave a timescale or expiration date, of which 1 GL provided this information in a separate GL development manual of the issuing authority (Table 3). The most frequent review intervals were 3 and 4 years. Only 10 GLs (38%) provided adequate information on the updating process (Table 2).

Reporting of laboratory-specific information in diagnostic guidelines. We investigated whether GLs covered essential laboratory-specific information, such as prevalence/pretest probability and diagnostic accuracy data or preanalytical and analytical factors critical for the correct interpretation and application of laboratory results in clinical practice (Table 3). About 60% of the GLs mentioned these factors. Reporting these pieces of information was more frequent in diagnostic compared to combined GLs, but the difference was not statistically significant in the various GL subgroups, as discussed below (online Data Supplement 3).

SUBGROUP ANALYSIS

GLs were grouped according to source, scope, length, origin, and availability of a guideline methods manual, to investigate whether there were statistically significant differences in GL quality in these subsets. Results are shown in Table 4 and online Data Supplements 3 and 4.
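The subgroup summaries in Table 4 (mean domain score, SE, and range) can be recomputed from the individual domain scores in Table 1. The sketch below does this for the D1 scores of the 10 diagnostic GLs, using Python's standard `statistics` module. The scores are transcribed from Table 1, taking SE as the sample standard deviation divided by the square root of n is an assumption about how the table was computed, and the helper name is illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# D1 ("scope and purpose") scores, %, of the diagnostic GLs in Table 1,
# keyed by reference number.
d1_diagnostic = {12: 89, 15: 53, 19: 81, 22: 97, 23: 94,
                 24: 100, 28: 72, 30: 97, 31: 78, 34: 86}


def summarize(scores):
    """Return mean, standard error of the mean, and (min, max)."""
    values = list(scores)
    se = stdev(values) / sqrt(len(values))  # assumed SE definition
    return mean(values), se, (min(values), max(values))


mean_d1, se_d1, range_d1 = summarize(d1_diagnostic.values())
print(f"mean {mean_d1:.1f}%, SE {se_d1:.2f}, range {range_d1[0]}-{range_d1[1]}")
```

Under these assumptions the result is close to the Table 4 entry for diagnostic GLs in D1 (mean 85%, SE 4.5, range 53–100); small differences in the last digit are expected because Table 1 reports rounded domain scores.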

Subgrouping by source. Grouping GLs by source of publication revealed that 1 GL was published in a peer-reviewed journal, 19 were available in electronic GL databases, and 6 in both sources. The GL that was published exclusively in a peer-reviewed journal (14) was not recommended for use by the assessors. None of the 6 GLs published in both peer-reviewed journals and GL databases were strongly recommended. GLs published in electronic guideline databases only received a more favorable overall assessment. A notable difference, at a level of significance of P < 0.05, could be observed in D5 only, favoring the electronic GLs (Table 4).

Subgrouping by scope. The rate of occurrence of strongly recommended GLs was higher for the combined (50%) than for the diagnostic (30%) GLs, but the rate of GLs not recommended was also higher in the combined group. The difference was moderate (P < 0.05) in D2 only, with combined GLs scoring higher (Table 4). Moderate differences were also found in 4 individual items (online Data Supplement 4). Diagnostic GLs defined their objectives better (I1) and considered the cost implications of the recommendations more frequently (I20), whereas combined GLs defined their target users (I6) and their updating processes more precisely (I14) than diagnostic ones.

Subgrouping by length. A clear relationship could be demonstrated between GL length and methodological quality (Table 4). Most GLs that were not recommended were shorter, and all strongly recommended guidelines were longer than 50 pages. Significant differences between these subgroups could be found for most domains, with higher quality of the longer GLs. Moderate differences (P < 0.05) could be observed with the “applicability” and “clarity and presentation” domains. However, the best-performing GLs, scoring ≥50% in the “applicability” domain (21, 24, 27, 28, 35–37), were generally longer than 50 pages, and all were published in electronic databases (Table 1).

Subgrouping by origin. The majority of the strongly recommended GLs (7 of 11) originated from the UK; the other 4 were from New Zealand, Australia, and the USA (Table 1). Significant differences (P < 0.01) could be observed in fulfilling the criteria of D2, with higher scores for the British GLs. In the “rigor of development” and “clarity and presentation” domains, the difference was moderate (P < 0.05) (Table 4).

Subgrouping by availability of guideline methods manual. Two-thirds of GLs had some accompanying manuals describing the methods of their development in some form. All strongly recommended GLs had such a manual (Table 4). All domain scores were better in the subset where these manuals were available, and the differences were highly statistically significant (P < 0.01) in D4, D5, and D6. In D1, D2, and D3, the P values were also significant, but at values somewhat >0.01.

Discussion

In the current study, we made special efforts to retrieve and use all available background technical materials when appraising GLs to avoid a biased assessment of methodological quality by the AGREE Instrument. Our evaluation revealed that diagnostic recommendations in the field of DM suffer from the same methodological weaknesses as GLs developed by prestigious


Table 3. Qualitative analysis of reporting diabetes mellitus guidelines.

| Guideline | Type of publication as described by authors | Evidence table | Description of grading system | Graded recommendations | Review date, year | Prevalence/Pretest probability | Diagnostic accuracy | Preanalytical information | Analytical information |
|---|---|---|---|---|---|---|---|---|---|
| (12) | Review of the evidence and recommendations | + | – | – | – | – | + | – | |
| (13) | National clinical guidelines | – | + | + | 3 | + | + | – | + |
| (14) | Consensus report | – | – | – | 1 | – | – | – | – |
| (15) | Guidelines and recommendations | – | + | + | – | + | + | + | |
| (16) | Clinical guidelines and evidence review | + | + | + | 4 | – | + | – | |
| (17) | Clinical guidelines and evidence review | + | + | + | 4 | – | – | – | |
| (18) | Guideline | – | – | – | – | – | – | – | – |
| (19) | Clinical practice guidelines | – | + | + | – | + | + | + | + |
| (20) | Clinical practice guidelines | – | + | + | – | + | + | + | + |
| (21) | Evidence-based best practice guidelines | – | + | + | 3 | + | – | + | |
| (22) | Recommendation and rationale statement | + | + | + | – | + | + | + | + |
| (23) | Recommendation and rationale statement | + | + | + | – | + | + | + | + |
| (24) | Report | – | – | – | – | + | + | – | – |
| (25) | Clinical practice guidelines | + | – | – | 5 | – | – | – | – |
| (26) | Guidelines | – | – | – | – | – | – | – | – |
| (27) | Clinical guidelines and evidence review | + | + | + | 4 | – | + | + | |
| (28) | Guidelines and protocols | – | – | – | 3 | – | – | + | – |
| (29) | Global guideline | – | + | – | 3–5 | + | + | + | + |
| (30) | Evidence-based guidelines | + | + | + | 3 | + | + | + | + |
| (31) | Report | – | – | – | – | + | + | + | – |
| (32) | Medical guidelines (evidence based) | – | + | + | – | + | – | + | – |
| (33) | Position statement | – | + | + | 1ᵃ | + | + | + | + |
| (34) | Guideline | – | + | + | 3 | – | – | – | – |
| (35) | Guidance | – | + | + | continuous | – | – | + | + |
| (36) | Guidance | – | + | – | continuous | + | + | + | + |
| (37) | Guidance | – | + | + | continuous | + | – | + | + |
| Percentage of GLs fulfilling criteria | | 31 | 69 | 62 | 58 | 58 | 58 | 62 | 58 |

ᵃ Information on updating is provided in a separate guideline development manual.


Table 4. Subgroup analysis.

Domain columns (D1 to D6) give the mean domain score, % (SE); range.

| Subgroup | D1ᵃ | D2 | D3 | D4 | D5 | D6 | Strongly recommended, n (%) | Recommended with alteration, n (%) | Wouldn’t recommend, n (%) |
|---|---|---|---|---|---|---|---|---|---|
| Source | | | | | | | | | |
| Guideline database (n = 19) | 80 (5.2); 14–100 | 52 (6.1); 15–88 | 58 (6.5); 6–92 | 80 (3.3); 52–98 | 40 (5.1); 0–72 | 45 (7.6); 0–100 | 11 (58) | 5 (26) | 3 (16) |
| Journal and GL database (n = 7)ᵇ | 70 (7.1); 47–97 | 27 (4.6); 6–42 | 43 (6.8); 17–69 | 64 (8.3); 33–92 | 18 (3.9); 6–39 | 25 (10.0); 0–75 | 0 (0) | 6 (86) | 1 (14) |
| P | 0.209 | 0.055 | 0.169 | 0.083 | 0.018ᶜ | 0.152 | | | |
| Scope | | | | | | | | | |
| Diagnostic (n = 10) | 85 (4.5); 53–100 | 31 (5.2); 15–73 | 50 (8.2); 13–90 | 69 (5.3); 35–90 | 35 (5.8); 11–69 | 34 (8.9); 9–88 | 3 (30) | 6 (60) | 1 (10) |
| Combined (n = 16) | 73 (6.2); 14–97 | 54 (6.8); 6–88 | 56 (6.9); 6–92 | 80 (4.5); 33–98 | 33 (6.1); 0–72 | 43 (8.8); 0–100 | 8 (50) | 5 (31) | 3 (19) |
| P | 0.286 | 0.023ᶜ | 0.660 | 0.097 | 0.776 | 0.551 | | | |
| Length | | | | | | | | | |
| 1–50 pages (n = 9) | 61 (8.5); 14–97 | 24 (2.7); 6–35 | 30 (7.0); 6–64 | 68 (5.8); 33–92 | 21 (7.4); 0–69 | 16 (8.0); 0–75 | 0 (0) | 6 (67) | 3 (33) |
| >50 pages (n = 17) | 86 (3.5); 56–100 | 56 (6.2); 15–88 | 67 (4.8); 26–92 | 80 (4.1); 35–98 | 41 (4.6); 8–72 | 52 (7.2); 21–100 | 11 (65) | 5 (29) | 1 (6) |
| P | 0.009ᵈ | 0.003ᵈ | 0.001ᵈ | 0.051 | 0.018ᶜ | 0.001ᵈ | | | |
| Origin | | | | | | | | | |
| North America (n = 12) | 74 (5.7); 42–97 | 27 (2.8); 6–42 | 44 (7.1); 6–77 | 72 (5.7); 33–92 | 25 (5.5); 0–69 | 35 (9.3); 0–88 | 2 (17) | 8 (66) | 2 (17) |
| British (n = 7) | 87 (5.9); 56–97 | 62 (3.2); 69–88 | 67 (4.5); 64–92 | 82 (3.8); 71–98 | 42 (8.9); 0–72 | 44 (10.1); 0–92 | 7 (100) | 0 (0) | 0 (0) |
| Other (n = 7) | 74 (11.3); 14–100 | 43 (9.8); 15–83 | 48 (11.2); 6–90 | 70 (5.9); 52–96 | 36 (7.6); 0–58 | 41 (15.4); 0–100 | 2 (28.5) | 3 (43) | 2 (28.5) |
| P | 0.355 | 0.001ᵈ | 0.028ᶜ | 0.037ᶜ | 0.112 | 0.606 | | | |
| Manual | | | | | | | | | |
| Yes (n = 19) | 84 (3.5); 56–100 | 53 (5.9); 15–88 | 62 (5.3); 52–98 | 82 (3.0); 50–98 | 42 (4.5); 6–72 | 49 (7.4); 0–100 | 11 (58) | 7 (37) | 1 (5) |
| No (n = 7) | 59 (10.5); 14–89 | 25 (4.0); 6–40 | 33 (9.5); 6–69 | 60 (7.7); 33–90 | 12 (3.6); 0–25 | 14 (5.6); 0–42 | 0 (0) | 4 (57) | 3 (43) |
| P | 0.013ᶜ | 0.015ᶜ | 0.022ᶜ | 0.010ᵈ | 0.001ᵈ | 0.004ᵈ | | | |

ᵃ D1, scope and purpose; D2, stakeholder involvement; D3, rigor of development; D4, clarity and presentation; D5, applicability; D6, editorial independence.
ᵇ One guideline (14) was published in journal only.
ᶜ P ≤ 0.05.
ᵈ P ≤ 0.01.
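Each domain entry in Table 4 condenses the scores of one subgroup into a mean, a standard error (SE), and a range. The following is a minimal sketch of that arithmetic in Python, assuming the SE is the sample standard deviation divided by the square root of the subgroup size. The function name `summarize` and all score values are hypothetical and are not data from this study.

```python
import math
import statistics


def summarize(scores):
    """Return (mean, standard error, minimum, maximum) for domain scores in %."""
    n = len(scores)
    mean = statistics.mean(scores)
    # Standard error of the mean: sample SD divided by sqrt(n) (assumed definition)
    se = statistics.stdev(scores) / math.sqrt(n)
    return mean, se, min(scores), max(scores)


# Hypothetical D5 scores (%) for two made-up subgroups; NOT data from the study.
with_manual = [42, 56, 31, 72, 6, 44, 39]
without_manual = [0, 12, 25, 8, 15]

for label, scores in [("manual: yes", with_manual), ("manual: no", without_manual)]:
    mean, se, lowest, highest = summarize(scores)
    print(f"{label}: mean {mean:.0f}%, SE {se:.1f}, range {lowest}-{highest}")
```

The sketch only reproduces the descriptive summary; it does not perform the significance test behind the P values in the table.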


authorities in many other disciplines (2, 5, 38). Subgroup analyses of our study demonstrated that longer and electronically published GLs and the availability of GL development manuals yielded higher methodological scores in most AGREE domains (Table 4). One simple explanation is the lack of space available in journals for detailed and accurate reporting (39). Poor methodological scores could just as well reflect faulty methods that could lead to biased and/or inaccurate conclusions. Paradoxically, lengthy GLs are thought to be less practical for daily use (39), so one may argue that length of GLs adversely affects implementation. In our case, GLs that achieved high scores for “applicability” were indeed longer documents, but they also covered additional information on organization, cost implications, and monitoring of the use of recommendations in practice. All these tools help GL implementation, and thus we cannot confirm that lengthy GLs are not applicable in practice. The Conference on Guideline Standardization defined a standard for GL reporting to promote quality and facilitate implementation (11). Such GL reporting standards have not yet been adopted by most journals, and peer reviewers also rarely use the AGREE or other criteria for systematic assessment of recommendations before publication (40–42). These shortcomings suggest the need for GL reporting standards, similar to the STARD document for reporting diagnostic accuracy studies (43), and clear publication and peer review policies for GLs by major medical journals.

In our study, the quality of purely diagnostic GLs was not significantly different from that of combined GLs (Table 4). Our additional evaluation in Table 3 showed that nearly half of all GLs do not report preanalytical, analytical, and diagnostic accuracy data (3), which may lead to inappropriate interpretation of test results in clinical practice (44). Fulfilling these criteria would be desirable in any GLs that deal with laboratory testing–related issues, since it is expected that practice recommendations are developed in a multidisciplinary process (45). Unfortunately, this could not be confirmed by our study, as only 41% of the criteria were fulfilled in D2, which explored the involvement of all relevant stakeholders in the GL development process.
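Domain percentages in an AGREE appraisal are standardized scores: the AGREE Instrument (4) sums the item ratings given by all appraisers within a domain and scales that total between the minimum and maximum possible totals. The following is a minimal sketch in Python, assuming the 4-point rating scale of the original instrument and the 4 items of D2. The function name `agree_domain_score`, the number of appraisers, and the ratings are hypothetical and are not values from this study.

```python
def agree_domain_score(ratings, scale_min=1, scale_max=4):
    """Standardized AGREE domain score in percent.

    ratings: one list per appraiser, each holding that appraiser's
    item ratings for a single domain.
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    if any(len(r) != n_items for r in ratings):
        raise ValueError("every appraiser must rate every item")
    obtained = sum(sum(r) for r in ratings)
    min_possible = scale_min * n_items * n_appraisers
    max_possible = scale_max * n_items * n_appraisers
    return 100 * (obtained - min_possible) / (max_possible - min_possible)


# Hypothetical ratings of the 4 stakeholder-involvement items by 2 appraisers.
ratings_d2 = [
    [2, 3, 1, 2],  # appraiser 1
    [3, 2, 2, 2],  # appraiser 2
]
print(f"D2 score: {agree_domain_score(ratings_d2):.1f}%")  # prints "D2 score: 37.5%"
```

Because the score is scaled against the lowest possible total, a GL rated at the bottom of the scale on every item scores 0%, not 25%.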

All GLs that scored better in the comparison by origin were from agencies with detailed GL manuals that provided a clear description and standards for the development process (Table 1). The availability of a GL manual, however, does not always guarantee that GL teams follow those processes consistently, and it has been shown that it is often not clear how decisions are made by the GL team when arriving at final recommendations (8). The substantial heterogeneity, in both how the type of publication is defined and the adherence to this definition in the final presentation of the GL, suggests that there is likely a disparity between the methodology GL developers describe and what is actually followed in practice (Table 3). We found several GLs that described a grading system but did not grade their final recommendations. The lack of evidence tables in GLs that claim to be evidence-based may also point to potential deviations from the processes set in GL manuals. Therefore it is advisable that diagnostic GL development teams adhere to preset methodology and document the procedures followed explicitly.

We could not demonstrate major improvements in GL quality for most domains, and in the “editorial independence” domain, deterioration in scores was observed over time. We further evaluated the quality of GLs over time in some cases where the authorities issued several GLs [e.g., National Institute for Clinical Excellence (NICE), WHO, International Diabetes Federation (IDF)] within the time scale investigated (data not shown). The NICE GL in 2004 is of higher quality than the NICE 2002 version due to improvements in the “applicability” and “editorial independence” domains. It is noteworthy that many international organizations have improved the rigor of their guideline development process and are moving toward international standardization (11, 46–48). Surprisingly, the international WHO and IDF GLs in 2006 and 2007 had lower scores in most domains than the 2003 and 2005 versions, despite the fact that both agencies released guideline development manuals in 2003 (http://whqlibdoc.who.int/hq/2003/EIP_GPE_EQC_2003_1.pdf, http://www.idf.org). Therefore, we assume that the lower AGREE scores are due to incomplete reporting of some methodological details rather than to a failure to follow the methodology described in the manuals. Explicit reporting of methodology and adherence to that methodology are particularly important for influential agencies (e.g., American Diabetes Association and WHO) whose recommendations are followed or adapted worldwide.

There are several limitations in our study. Because we evaluated English publications only, our results may suffer from language bias. However, several publications, including our own review of the topic, confirm no significant differences in the quality of English vs non-English publications of guidelines or trials (2, 49, 50). Because most national DM GLs are based on or strongly influenced by international recommendations primarily published in English, we believe our results are likely to be generalizable.

Our study evaluated different publications that were defined in various ways by their authors. Such heterogeneity of definitions (such as guideline, guidance, protocols, position statement, recommendation and rationale statement, consensus report) may highlight different approaches in formulating recommendations for practice. We also found several GLs that, while having proof of using evidence-based methods, failed to define their publication as such (22, 23, 25). This suggests that the definitions used in the international guideline community may be confusing for both GL developers and users, and that simplification and standardization of terminology is needed. One may argue that AGREE can be used for assessing evidence-based GLs only. However, AGREE is a generic and widely accepted toolbox (8) that can investigate the GL development process irrespective of whether it applies evidence- or consensus-based methodology (4). In fact, most evidence-based GLs have a substantial element of consensus-based judgment, especially when evidence is conflicting or lacking. In the latter case, GL developers should still search for and appraise the “best available” evidence before they conclude that the best they can do is to reach consensus.

Our study does not determine whether there are relationships between the methodological quality of GLs and the validity of their content. The AGREE Instrument or other GL appraisal tools can investigate neither the accuracy of the content of recommendations nor their impact on patient outcomes (51, 52). Another shortcoming of all critical appraisal tools is that they do not differentiate between whether the publication fails certain criteria due to lack of reporting or to poor methodology and design. Therefore, our results should not be interpreted as criticisms of the truth of scientific statements or the validity of recommendations made in a given publication. However, the demonstrated shortcomings in reporting and/or the methodology applied by different GL developers could lead to distrust in and/or misuse of recommendations (53). With such shortcomings, the energy put into developing scientifically accurate but otherwise poorly presented GLs could end up being wasted, whereas inaccurate but otherwise nicely presented GLs might be promoted and used widely. This is why we advise that GLs be critically evaluated for both methodology and content before recommendations are used in clinical practice (38).

In conclusion, our results suggest the need for systematically developed, explicit recommendations based on evidence-based guideline development and reporting standards in laboratory medicine. Our study also highlights the need for simplification and standardization of GL terminology. Further studies are needed to explore in depth the relationship between the scientific validity and the methodological quality of diagnostic recommendations in DM.

Author Contributions: All authors confirmed they have contributed to the intellectual content of this paper and have met the following 3 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; and (c) final approval of the published article.

Authors’ Disclosures of Potential Conflicts of Interest: No authors declared any potential conflicts of interest.

Role of Sponsor: The funding organizations played no role in the design of study, choice of enrolled patients, review and interpretation of data, or preparation or approval of manuscript.

Acknowledgments: Several authors of this article (J. Watine, W. Oosterhuis, D. Rogic, S. Sandberg, P. S. Bunting, A. R. Horvath) are members of the Committee on Evidence-Based Laboratory Medicine of IFCC, and this work was carried out in collaboration with the IFCC Task Force on the Global Campaign for Diabetes Mellitus.

References

1. Field MJ, Lohr KN, eds. Guidelines for clinical practice: from development to use. Washington, DC: National Academy Press; 1992. 426 p.

2. Horvath AR, Nagy E, Watine J. Critical appraisal of guidelines. In: Price CP, Christenson RH, eds. Evidence-based Laboratory Medicine: Principles, Practice, Outcomes. 2nd edition. Washington: AACC Press; 2007;295–319.

3. Oosterhuis WP, Bruns DE, Watine J, Sandberg S, Horvath AR. Evidence-based guidelines in laboratory medicine: principles and methods. Clin Chem 2004;50:806–18.

4. The AGREE Collaboration. Appraisal of Guidelines for Research and Evaluation (AGREE) Instrument. http://www.agreecollaboration.org (Accessed December 2007).

5. Horvath AR, Nagy E, Watine J. Quality of guidelines for the laboratory management of diabetes mellitus. Scand J Clin Lab Invest Suppl 2005;240:41–50.

6. Hunt DL, McKibbon KA. Locating and appraising systematic reviews. Ann Intern Med 1997;126:532–8.

7. Horvath AR, Pewsner D. Systematic reviews in laboratory medicine: principles, processes and practical considerations. Clin Chim Acta 2004;342:23–39.

8. Van der Wees PJV, Hendriks EJM, Custers JWH, Burgers JS, Dekker J, de Bie RA. Comparison of international guideline programs to evaluate and update the Dutch program for clinical guideline development in physical therapy. BMC Health Services Research 2007;7:191. http://www.biomedcentral.com/1472-6963/7/191 (Accessed December 2007).

9. The AGREE Collaboration. Development and validation of an international appraisal instrument for assessing the quality of clinical guidelines: the AGREE project. Qual Saf Health Care 2003;12:18–23.

10. MacDermid JC, Brooks D, Solway S, Switzer-McIntyre S, Brosseau L, Graham ID. Reliability and validity of the AGREE instrument used by physical therapists in assessment of clinical practice guidelines. BMC Health Services Research 2005;5:18. http://www.biomedcentral.com/1472-6963/5/18 (Accessed December 2007).

11. Shiffman RN, Shekelle P, Overhage JM, Slutsky J, Grimshaw J, Deshpande AM. Standardized reporting of clinical practice guidelines: a proposal from the Conference on Guideline Standardization. Ann Intern Med 2003;139:493–8.

12. Woolf SH, Davidson MB, Greenfield S, Bell HS, Ganiats TG, Hagen MD, et al. American Academy of Family Physicians and American Diabetes Association. The benefits and risk of controlling blood glucose levels in patients with type 2 diabetes mellitus: a review of the evidence and recommendations. AAFP Policy Action April 1999. http://www.aafp.org/online/etc/medialib/aafp_org/documents/clinical/clin_recs/diabetespolicy.Par.0001.File.tmp/clinicalrecs_diabetespolicy05.pdf (Accessed December 2007).

13. Scottish Intercollegiate Guidelines Network. Management of diabetes CPG55. 2001. http://www.sign.ac.uk/guidelines/fulltext/55/index.html (Accessed December 2007).

14. Reece EA, Homko C, Miodovnik M, Langer O. A consensus report of the Diabetes in Pregnancy Study Group of North America Conference, Little Rock, Arkansas, May 2002. J Matern Fetal Neonatal Med 2002;12:362–4.

15. Sacks DB, Bruns DE, Goldstein DE, Maclaren NK, McDonald JM, Parrott M. Guidelines and recommendations for laboratory analysis in the diagnosis and management of diabetes mellitus. Clin Chem 2002;48:436–72.

16. McIntosh A, Hutchinson A, Home PD, Brown F, Bruce A, Damerell A, Davis R, et al. (2001) Clinical guidelines and evidence review for type 2 diabetes: management of blood glucose. Sheffield: ScHARR, University of Sheffield. http://www.nice.org.uk/cat.asp?c=36733 (Accessed December 2007).

17. McIntosh A, Hutchinson A, Feder G, Durrington P, Elkeles R, Hitman GA, et al. (2002) Clinical guidelines and evidence review for type 2 diabetes: lipids management. Sheffield: ScHARR, University of Sheffield. http://www.nice.org.uk/cat.asp?c=38551 (Accessed December 2007).

18. Society for Endocrinology, Metabolism and Diabetes of South Africa. Revised SEMDSA Guidelines for diagnosis and management of type 2 diabetes mellitus for primary health care in 2002. http://www.semdsa.org.za/guidelines.htm (Accessed December 2007).

19. Berger H, Crane J, Farine D, Armson A, De La Ronde S, Keenan-Lindsay L, et al. Screening for gestational diabetes mellitus. J Obstet Gynaecol Can 2002;24:894–912.

20. Canadian Diabetes Association Clinical Practice Guidelines Expert Committee. Canadian Diabetes Association 2003 Clinical Practice Guidelines for the Prevention and Management of Diabetes in Canada. http://www.diabetes.ca/cpg2003/downloads/cpgcomplete.pdf (Accessed December 2007).

21. New Zealand Guidelines Group. Management of Type 2 Diabetes. http://www.nzgg.org.nz/guidelines/dsp_guideline_popup.cfm?guidelineCatID=32&guidelineID=36 (Accessed December 2007).

22. U.S. Preventive Services Task Force. Screening for type 2 diabetes mellitus in adults: recommendations and rationale. February 2003. Agency for Healthcare Research and Quality, Rockville, MD. http://www.ahrq.gov/clinic/uspstf/uspsdiab.htm (Accessed December 2007).

23. U.S. Preventive Services Task Force. Screening for gestational diabetes mellitus: recommendations and rationale. http://www.ahrq.gov/clinic/uspstf/uspsgdm.htm (Accessed December 2007).

24. World Health Organization. Screening for type 2 diabetes: report of a World Health Organization and International Diabetes Federation meeting. Geneva, 2003. http://whqlibdoc.who.int/hq/2003/WHO_NMH_MNC_03.1.pdf (Accessed December 2007).

25. Snow V, Aronson MD, Hornbake ER, Mottur-Pilson C, Weiss KB. Clinical Efficacy Assessment Subcommittee of the American College of Physicians. Lipid control in the management of type 2 diabetes mellitus: a clinical practice guideline from the American College of Physicians. Ann Intern Med 2004;140:644–9.

26. Kaiser Permanente, Care Management Institute’s Diabetes Guidelines workgroup. Guidelines for the Management of Adult Diabetes in Primary Care. http://members.kaiserpermanente.org/kpweb/pdf/feature/247clinicalpracguide/CMI_AdultDiabetesGuideline_public_web_033104.pdf (Accessed December 2005).

27. National Collaborating Centre for Women’s and Children’s Health (NICE guidance). Type 1 diabetes: diagnosis and management of type 1 diabetes in children and young people. http://www.nice.org.uk/page.aspx?o=213575 (Accessed December 2007).

28. Guidelines and Protocols Advisory Committee (British Columbia Ministry of Health). Diabetes Care, revised 2005. http://www.health.gov.bc.ca/gpac/guideline_diabetes.html (Accessed December 2007).

29. IDF Clinical Guidelines Task Force. Global guideline for type 2 diabetes. Brussels: International Diabetes Federation, 2005. http://www.idf.org/home/index.cfm?node=1457 (Accessed December 2007).

30. National Health and Medical Research Council. National evidence based guidelines for the management of type 2 diabetes mellitus. http://www.health.gov.au/nhmrc/publications/pdf/cp86.pdf (Accessed November 2007).

31. World Health Organization. Definition and diagnosis of diabetes mellitus and intermediate hyperglycemia: report of a WHO/IDF Consultation. Geneva, 2006. http://www.who.int/diabetes/publications/Definition%20and%20diagnosis%20of%20diabetes_new.pdf (Accessed December 2007).

32. American Association of Clinical Endocrinologists. American Association of Clinical Endocrinologists medical guidelines for clinical practice for the management of diabetes mellitus. Endocr Pract 2007;13(Suppl 1):1–68.

33. American Diabetes Association. Standards of medical care in diabetes. Diabetes Care 2007;30:S4–41.

34. IDF Clinical Guidelines Task Force. Guideline for management of postmeal glucose. Brussels: International Diabetes Federation, 2007. http://www.idf.org/home/index.cfm?node=1620 (Accessed December 2007).

35. Sowerby Centre for Health Informatics at Newcastle. PRODIGY Guidance: Diabetes type 2: lipid management. http://www.prodigy.nhs.uk/guidance.asp?gt=Diabetes%20-%20lipid%20management (Accessed December 2007).

36. Sowerby Centre for Health Informatics at Newcastle. PRODIGY Guidance: Diabetes type 2: renal disease. http://www.prodigy.nhs.uk/guidance.asp?gt=Diabetes%20-%20renal%20disease (Accessed December 2007).

37. Sowerby Centre for Health Informatics at Newcastle. PRODIGY Guidance: Blood glucose. http://www.prodigy.nhs.uk/guidance.asp?gt=Diabetes%20-%20renal%20disease (Accessed December 2007).

38. Qaseem A, Vijan S, Snow V, Cross T, Weiss KB, Owens DK, for the Clinical Efficacy Assessment Subcommittee of the American College of Physicians. Glycaemic control and type 2 diabetes mellitus: the optimal hemoglobin A1c targets. A guidance statement from the American College of Physicians. Ann Intern Med 2007;147:417–22.

39. Deeks JJ. Word limits best explain failings of industry supported meta-analyses [Letter]. BMJ 2006;333:1021.

40. Fervers B, Burgers JS, Haugh MC, Brouwers M, Browman G, Cluzeau F, Philip T. Predictors of high quality clinical practice guidelines: examples in oncology. Int J Qual Health Care 2005;17:123–32.

41. Miller J, Petrie J. Development of practice guidelines [Commentary]. Lancet 2000;355:82–3.

42. van Tulder MW, Tuut M, Pennick V, Bombardier C, Assendelft WJJ. Quality of primary care guidelines for acute low back pain. Spine 2004;29:E357–62.

43. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration. Clin Chem 2003;49:7–18.

44. Skeie S, Nordin G, Oosterhuis WP, Araczki A, Horvath AR, Perich C, et al. Post-analytical external quality assurance of blood glucose and HbA1c: an international survey. Clin Chem 2005;51:1145–53.

45. Grimshaw GM, Khunti K, Baker R. Diagnosis of heart failure in primary care: an assessment of international guidelines. Br J Gen Pract 2001;51:384–6.

46. Raine R, Sanderson C, Black N. Developing clinical guidelines: a challenge to current methods. BMJ 2005;331:631–3.

47. Oxman AD, Fretheim A, Schunemann HJ, SURE. Improving the use of research evidence in guideline development: introduction. Health Res Policy Syst 2006;4:12.

48. Schunemann HJ, Fretheim A, Oxman AD. Improving the use of research evidence in guideline development: I. Guideline for guidelines. Health Res Policy Syst 2006;4:13.

49. Moher D, Fortin P, Jadad AR, Juni P, Klassen T, Le Lorier J, et al. Completeness of reporting of trials published in languages other than English: implications for conduct and reporting of systematic reviews. Lancet 1996;347:363–6.

50. Burgers JS, Grol R, Klazinga NS, Makela M, Zaat J. Towards evidence-based clinical practice: an international survey of 18 clinical guideline programs. Int J Qual Health Care 2003;15:31–45.

51. Vlayen J, Aertgeerts B, Hannes K, Sermeus W, Ramaekers D. A systematic review of appraisal tools for clinical practice guidelines: multiple similarities and one common deficit. Int J Qual Health Care 2005;17:235–42.

52. Burgers JS. Guidelines quality and guidelines content: are they related [Editorial]. Clin Chem 2006;52:3–4.

53. Woolf SH, Grol R, Hutchinson A, Eccles M, Grimshaw J. Potential benefits, limitations, and harms of clinical guidelines. BMJ 1999;318:527–30.
