Ability of expert physicians to structure clinical guidelines: reality versus perception

Erez Shalom MSc,1 Yuval Shahar MD PhD,2 Meirav Taieb-Maimon PhD,3 Susana B. Martins MD MSc,4

Laszlo T. Vaszar MD,5 Mary K. Goldstein MD MSc,6 Lily Gutnik BA7 and Eitan Lunenfeld MD8

1PhD Student, Medical Informatics Research Center, 2Professor and Head of the Medical Informatics Research Center, 3Lecturer, 7Medical Student, Department of Information Systems Engineering, Ben Gurion University of the Negev, Beer Sheva, Israel; 4Research Health Science Specialist, 6Director, Geriatrics Research Education and Clinical Center (GRECC), Veterans Administration Palo Alto Health Care System, Palo Alto, CA, USA; 5Researcher, Mayo Clinic Arizona, USA; 8Professor and Head of OB/GYN Division, Soroka Medical Center, Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, Israel

Keywords

clinical decision support systems, clinical guidelines, cognitive aspects, educational measurement, knowledge acquisition, knowledge representation

Correspondence

Erez Shalom, Medical Informatics Research Center, Department of Information Systems Engineering, Ben Gurion University of the Negev, Beer Sheva 84105, Israel. E-mail: [email protected]

Accepted for publication: 24 April 2009

doi:10.1111/j.1365-2753.2009.01241.x

Abstract

Rationale, aims and objectives Structuring Textual Clinical Guidelines (GLs) into a formal representation is a necessary prerequisite for supporting their automated application. We had developed a collaborative guideline-structuring methodology that involves expert physicians, clinical editors and knowledge engineers, to produce a machine-comprehensible representation for automated support of evidence-based, guideline-based care. Our goals in the current study were: (1) to investigate the perceptions of the expert physicians and clinical editors as to the relative importance, for the structuring process, of different aspects of the methodology; (2) to assess, for the clinical editors, the inter-correlations among (i) the reported level of understanding of the guideline-structuring ontology's (knowledge scheme's) features, (ii) the reported ease of structuring each feature and (iii) the actual objective quality of the structuring.

Methods A clinical consensus regarding the contents of three guidelines was prepared by an expert in the domain of each guideline. For each guideline, two clinical editors independently structured the guideline into a semi-formal representation, using the Asbru guideline ontology's features. The quality of the resulting structuring was assessed quantitatively. Each expert physician was asked which aspects were most useful for formation of the consensus. Each clinical editor filled in questionnaires relating to: (1) the level of understanding of the ontology's features (before the structuring process); (2) the usefulness of various aspects in the structuring process (after the structuring process); (3) the ease of structuring each ontological feature (after the structuring process). Subjective reports were compared with objective quantitative measures of structuring correctness.

Results Expert physicians considered having medical expertise and understanding the ontological features as the aspects most useful for creation of a consensus.
Clinical editors considered understanding the ontological features and the use of the structuring tools as the aspects most useful for structuring guidelines. There was a positive correlation (R = 0.87, P < 0.001) between the reported ease of understanding ontological features and the reported ease of structuring those features. However, there was no significant correlation between the reported level of understanding the features – or the reported ease of structuring by using those features – and the objective quality of the structuring of these features in actual guidelines.

Conclusions Aspects considered important for formation of a clinical consensus differ from those for structuring of guidelines. Understanding the features of a structuring ontology is positively correlated with the reported ease of using these features, but neither of these subjective reports correlated with the actual objective quality of the structuring using these features.

Journal of Evaluation in Clinical Practice ISSN 1356-1294

© 2009 The Authors. Journal compilation © 2009 Blackwell Publishing Ltd, Journal of Evaluation in Clinical Practice 15 (2009) 1043–1053 1043

1. Introduction: automated support to guideline-based care

1.1. Automated support to clinical-guideline application

Clinical guidelines (GLs) are 'systematically developed statements to assist practitioner and patient decisions about appropriate health care for specific clinical circumstances' [according to the definition of the Institute of Medicine (IOM) [1]]. They are thus intimately related to the dissemination of evidence-based medicine (EBM). Extensive evidence confirms that application of state-of-the-art clinical GLs can improve the quality of medical care [2], increase survival [3,4] and reduce costs [5].

During the past 25 years, there have been several efforts to support complex GL-based care over time in an automated fashion. Such automated support requires formal GL-modelling methods. Most current modelling methods use knowledge acquisition tools for eliciting the medical knowledge needed for filling the knowledge roles (KRs) (e.g. eligibility conditions) of the GL specification ontology (i.e. the key concepts, properties and relations among the GL's concepts) chosen by each modelling method [6]. A review by De Clercq et al. [7] identified the four main areas involved in the development of GL-based decision support systems: (1) GL modelling and representation; (2) GL specification; (3) GL verification and testing; and (4) GL application. A recent review [8] examines the various approaches to run-time guideline application and their relationship to the GL modelling method.

In summary, GL representation is a critical issue for implementation of GLs within a computer-based clinical decision support system [9].

In the present research, we focused on the subjective aspects (from the point of view of a clinically oriented editor) of GL representation, specification and acquisition, and their relation to objective measures of the specification correctness.

1.2. Incremental clinical-guideline specification

In most GL modelling methods, the process of specification of the GL into a formal language is not sufficiently smooth and transparent. There is an unclear division of responsibility in the GL specification task between the knowledge engineers (KEs), who are typically knowledgeable regarding the syntax and semantics of the GL representation technique, and the physicians, who are knowledgeable regarding the semantics of the GL. The core of the problem is that physicians cannot (and need not) program in GL specification languages, while programmers and KEs do not sufficiently understand the clinical semantics of the GL. Patel et al. [10] showed that physicians interpret information differently from KEs and concluded that the developmental process for an encoded representation must involve the active participation of both physicians and computer scientists at each stage in the evolution of the guideline's translation. Thus, converting GLs into machine-comprehensible formats must capitalize on the relative strengths of both types of expert – the medical know-how of the physician and the semantic understanding of the GL specification format of the KE.

To facilitate collaboration between these two very different types of users, and the iteration inherent in such a process, we have developed an architecture and a set of tools, known as the Digital electronic Guideline Library (DeGeL) [11], which supports GL classification, semantic mark-up, context-sensitive search, browsing, run-time application and retrospective quality assessment. The DeGeL architecture includes the URUZ mark-up tool, a web-based knowledge acquisition tool for structuring GLs, which we used in the current project (see Fig. 1).

Figure 1 The Uruz web-based guideline mark-up tool in the DeGeL architecture. The tool's basic semi-structuring interface is uniform across all guideline ontologies. The target ontology selected by the medical expert, in this case Asbru, is displayed in the upper left tree; the guideline source is opened in the upper right frame. The expert physician highlights a portion of the source text (including tables or figures) and drags it for further modification into the bottom frame's Editing Window tab, labelled by a semantic role chosen from the target ontology (here, the Asbru Plan-Body textual content). Contents can be aggregated from different source locations [11].


The current study formed part of our broader project, whose objective is to support GL-based EBM. One of the key steps in the project included defining a methodology for clinical GL structuring and for an evaluation of the resultant structured content [12]. Within the framework of the broader project, we have developed a three-phase, nine-step methodology for the structuring of procedural and declarative knowledge of the GLs and have evaluated this methodology with encouraging qualitative and quantitative results. Our methodology includes specification and conversion of the GL's free-text representation, first by expert physicians (EPs), who create a clinical consensus specific to the chosen ontology (see below), then by clinical editors (CEs), who perform the actual structuring, or mark-up, of the text of the GL, and finally by KEs, who convert the GL into a machine-comprehensible representation, enabling automated support of GL-based care. (In this study, mark-up means structuring of the GL's text by labelling portions of text with semantic labels from the chosen target GL specification language (ontology), sometimes even modifying the text; see Fig. 1 for a mark-up example using the URUZ tool.) KEs are also involved in supporting the clinical consensus formation, in terms of the chosen GL ontology, and the mark-up by the CEs.

Our specification methodology therefore starts with an initial, mandatory creation of an ontology-specific consensus (OSC), which is a structured document that describes schematically the interpretation of the GL, using the semantic KRs of the chosen GL ontology, agreed upon by both the EPs and the KEs, and which includes the clinical directives of the GL and the semantic logic of the specification language [13]. The OSC is usually created collaboratively by senior EPs and KEs. The tasks of the CEs include learning the GL specification language and tools, receiving some training from the KEs in the knowledge acquisition tool, and creating a marked-up GL document by using the knowledge acquisition tool, the OSC and their own knowledge. Lastly, to facilitate evaluation of each mark-up, a domain EP and a KE together create a gold standard mark-up document, which is a semi-formal marked-up version of the GL that describes the best structuring of the GL.
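An OSC entry can be pictured as a mapping from ontology KRs to the clinically agreed content. The following minimal Python sketch is purely illustrative; the keys and the clinical text are invented and are not taken from the study's consensus documents:

```python
# Illustrative fragment of an ontology-specific consensus (OSC) entry:
# each Asbru KR is mapped to the clinically agreed interpretation.
osc_entry = {
    "plan": "Hypothyroidism management",
    "filter-condition": "TSH above the upper limit of normal",
    "complete-condition": "TSH within normal range on two consecutive tests",
    "process-intention": "titrate levothyroxine dose",
    "outcome-intention": "restore euthyroid state",
}

# A clinical editor consults entries of this kind when labelling
# guideline text with the corresponding KR during mark-up.
print(sorted(osc_entry))
```

In such a representation, the gold standard mark-up would simply be another document of the same shape, authored jointly by a senior EP and a KE.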

Thus, for the application of our methodology, three different roles are required – EPs, CEs and KEs – who together perform the different tasks. Figure 2 summarizes their different roles in the methodology and their main tasks.

However, the relative contribution of each of the different roles (especially the EP and the CE roles) to the overall methodology is not known, and it is not clear to what extent the roles' subjective understanding of various aspects of the methodology is associated with objective improvements in the quality of the structuring. Thus, in the current study, we examined the cognitive aspects of our methodology.

1.3. The Asbru guideline representation ontology

In the overall project, and in particular in this study, we used the Asbru ontology [14] as the underlying GL-representation language, owing to our familiarity with it and to its well-developed, documented and formalized syntax and semantics.

The Asbru specification language includes semantic KRs organized into the following KR classes: (1) the GL's transition conditions (e.g. filter conditions, which represent obligatory eligibility criteria, such as being pregnant; set-up conditions, which pose additional conditions that need to be made true, such as obtaining a blood test; complete conditions, which determine when GL application finishes, based on predefined criteria; and abort conditions, which determine when GL application should be aborted, based on predefined criteria); (2) the plan-body, that is, the GL's core control structures (e.g. sequential [including unordered], concurrent [including any-order] or repeating actions, or sub-guidelines; the control structure might also be 'To Be Defined', i.e. not yet defined); (3) the GL's intentions (e.g. process and outcome intentions, which, correspondingly, describe the GL's objectives with respect to the physician's actions and the patient's resulting state); and (4) the contexts of the activities in the GL (e.g. the actors [physician, nurse] and the clinical context [hospital, ambulatory, home]). KRs such as the GL's strength of recommendation and its level of evidence are also part of the Asbru ontology.
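The KR classes above can be sketched as a simple data structure. The Python sketch below is only an illustration of how the classes relate; the class and field names are our own shorthand, not part of the Asbru specification itself:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class ControlStructure(Enum):
    # Plan-body control structures named in the Asbru ontology
    SEQUENTIAL = "sequential"        # includes 'unordered'
    CONCURRENT = "concurrent"        # includes 'any-order'
    REPEATING = "repeating"
    SUBGUIDELINE = "sub-guideline"
    TO_BE_DEFINED = "to-be-defined"  # control structure not yet defined

@dataclass
class TransitionConditions:
    filter: list[str] = field(default_factory=list)    # obligatory eligibility criteria
    setup: list[str] = field(default_factory=list)     # conditions to be made true
    complete: list[str] = field(default_factory=list)  # when GL application finishes
    abort: list[str] = field(default_factory=list)     # when GL application is aborted

@dataclass
class GuidelinePlan:
    name: str
    conditions: TransitionConditions
    control: ControlStructure
    process_intentions: list[str] = field(default_factory=list)
    outcome_intentions: list[str] = field(default_factory=list)
    actors: list[str] = field(default_factory=list)  # e.g. physician, nurse
    clinical_context: Optional[str] = None           # e.g. hospital, ambulatory

# A (hypothetical) plan instance combining the four KR classes:
plan = GuidelinePlan(
    name="PID initial therapy",
    conditions=TransitionConditions(filter=["patient is pregnant"]),
    control=ControlStructure.SEQUENTIAL,
    actors=["physician"],
)
```

A real Asbru plan would, of course, carry far richer temporal annotations; the sketch only mirrors the four KR classes enumerated in the text.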

1.4. The subjective–objective comprehension problem

Despite the comprehensive structuring process described above, several questions naturally arise: how well do EPs and CEs understand the GL structuring process and the ontological framework underlying it? How well do the subjective perceptions of the CEs (of their own understanding) materialize in the harsh reality of their actual GL structuring performance?

Thus, the current paper poses several specific research questions aimed at elucidating these issues and introduces a detailed methodology for answering them. It then presents the results of applying the methodology in the case of several large GLs (each GL might be composed of hundreds of sub-plans and ontological KRs) and several different CEs. We aimed to answer the following research questions (see Section 3, as part of the results description, for specific details):

Figure 2 Main roles and tasks in the overall guideline specification and evaluation methodology.

(i) Which general aspects (e.g. clinical knowledge, knowledge of the ontology or knowledge of the tools) are considered by the EPs and CEs as most helpful in the task of creation of the OSC and in the task of marking up a GL, respectively? What is the relationship between these two sets of aspects?
(ii) What is the inter-correlation among the two subjective reports described immediately below (No. 1 and No. 2) and the one objective measure (No. 3):

1 The subjective report of the CEs regarding their comprehension level of each of the Asbru ontology KRs before the mark-up.
2 The subjective report of the CEs regarding the ease of structuring the GL in URUZ using each of the Asbru ontology KRs after the mark-up.
3 The objective quality of the structuring that the CEs had achieved for each of the Asbru ontology KRs.

2. Methods

The current study was performed in the context of a broader study, whose goal was to develop and assess a methodology for GL structuring by collaboration among medical domain experts, CEs and KEs, and for evaluation of the structured contents. We briefly review the objectives and goals of the broader project, to put the current study in context, to enable a better understanding of its significance, and to better understand its methodology, which relies on an existing process.

2.1. Our previous research – a methodology for GL specification

In previous research [12] we defined a methodology that includes all necessary activities before, during and after the mark-up process, and supports specification and conversion of the GL's free-text representation through semi-structured and semi-formal representations into a machine-comprehensible representation (see Fig. 3 for all methodology phases). Within this methodology, the activities in the mark-up process include three main phases:
1 Preparations before the mark-up activities: choosing the specification language (GL ontology), learning the specification language, selecting a GL for specification, creating an OSC with the EP, acquiring training in the mark-up tool by the CEs and preparing a gold standard mark-up document with a senior EP.
2 During the mark-up activities: classifying the GL according to a set of semantic indices (e.g. diagnosis, treatment), and performing the actual specification process (the mark-up) using the tools and the consensus. This activity is performed by the CE.
3 After the mark-up activities: evaluation of the results of the GL specification, with the EP who assisted in creation of the gold standard mark-up.

We evaluated this methodology through a process that also forms the basis for the current study, which focuses on an analysis of the subjective data collected in the process. Three GLs, from three different clinical disciplines, were selected for use as the textual source for structuring by the CEs: pelvic inflammatory disease [15,16], chronic obstructive pulmonary disease [17] and hypothyroidism (HypoThrd) [18]. Five EPs, four CEs (three senior physicians and one intern) and two KEs participated in this study. In the first stage, an OSC was created with the EPs to form a consistent, agreed interpretation of each GL that can support the KRs required by the chosen GL specification language (ontology) [13].

After learning the Asbru GL specification language and the DeGeL framework, and receiving some training in the URUZ mark-up tool, each of the CEs created a semi-formal Asbru mark-up using the URUZ tool, the OSC document and his or her own knowledge (thus, for each GL, two mark-ups were created by two different CEs). In order to evaluate each of the mark-ups, a gold standard mark-up was created. The gold standard is also a semi-formal mark-up, which describes the best structuring of the GL, and is created by a senior EP and a KE working together. Each of the mark-ups created by the CEs was compared with a gold standard mark-up.

Figure 3 The three main phases of the overall guideline specification methodology before, during and after the mark-up, and the activities performed in each phase. Note the descriptions under each activity. Activity six (creation of a gold standard) can be performed initially, or in parallel with activities seven and eight (the editors' mark-up). Note also the participants in each activity. The four questionnaires upon which the current study is based are shown in their respective locations in the process (see Section 2.2).

To obtain meaningful qualitative and quantitative results, objective measures were defined for the evaluation of the elicited knowledge in each of the mark-ups. The objective measures were defined in two main categories: a completeness measure of the acquired knowledge, that is, how much content from the gold standard exists (or not) in each of the semi-formal mark-ups of each CE (for example, a predefined set of plans); and a soundness, or correctness, measure, that is, how correct the acquired knowledge is, from the two aspects of (1) clinical semantics and (2) Asbru ontology semantics. The correctness scores were always assigned by comparing the content of the mark-up by a CE with the content of the gold standard.
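The two measure categories can be illustrated with a minimal sketch. The KR identifiers and the 0–1 scoring scheme below are assumptions made for illustration only, not the study's actual scoring protocol:

```python
def completeness(gold_krs: set, editor_krs: set) -> float:
    """Fraction of gold-standard KR instances that the editor's mark-up recreated."""
    if not gold_krs:
        return 1.0
    return len(gold_krs & editor_krs) / len(gold_krs)

def mean_correctness(scores: dict) -> float:
    """Mean of per-KR correctness scores, each assigned by comparing the
    editor's content with the gold standard (here assumed on a 0-1 scale)."""
    return sum(scores.values()) / len(scores)

# Hypothetical KR instances in a gold standard and in one editor's mark-up
gold = {"filter:pregnant", "abort:fever>39C", "plan:antibiotics"}
editor = {"filter:pregnant", "plan:antibiotics"}

print(completeness(gold, editor))  # 2 of 3 gold KRs recreated
```

In the study itself, correctness was judged separately for clinical semantics and for Asbru ontology semantics; a fuller sketch would therefore keep two score dictionaries per mark-up.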

The results of the first study, which established the validity of the framework, were quite encouraging. The gold standard mark-up included 196 guideline plans and sub-plans, containing altogether 326 instances of ontological KRs (e.g. eligibility conditions). With respect to completeness, 97% of the plans and 91% of the KR instances of the guidelines were recreated by the CEs. With respect to correctness, there was often significant variability between the CE pairs structuring each guideline; but for all GLs and CEs, the specification quality was significantly higher than random, using a proportionality test (P < 0.01). Procedural KRs were more difficult to mark up than declarative KRs. The detailed quantitative analysis is outside the scope of the current paper, which focuses on an analysis of the subjective measures, and can be found elsewhere [12].
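A test of this kind can be sketched with an exact binomial test as a stand-in for the proportionality test; the counts and the 25% chance baseline below are hypothetical illustrations, not figures from the study:

```python
from scipy.stats import binomtest

# Hypothetical example: 91 of 100 KR instances structured correctly,
# tested against an assumed chance baseline of 25% correct.
result = binomtest(k=91, n=100, p=0.25, alternative="greater")
print(result.pvalue < 0.01)  # quality significantly above chance
```

Any one-sided test of an observed proportion against a chance baseline would serve the same purpose; the point is only that the null hypothesis is "no better than random structuring".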

Note that in our methodology, the CE is the main actor: the CE has to learn the specification language and the specification tools, and finally perform the mark-up. Therefore, it is crucial to know which aspects most help the CEs in all of those tasks, what their attitudes are regarding the specification language and its associated tools, and how these attitudes are correlated with the quality of their structuring. In addition, it is also useful to understand which aspects seem important to the EPs.

2.2. The methodology of the current study

As explained in Section 1.4, in the current study we focused on the subjective aspects involved in the use of the GL specification methodology (the detailed research questions are listed as part of the results). Our investigation included the administration of four questionnaires, one to the EPs and three to the CEs, at different phases of the overall study, and an analysis of their content and the correlations among them. The questionnaires and the findings are described below.

2.2.1. Questionnaire No. 1: usefulness of different aspects of the methodology for OSC creation

This 14-point questionnaire retrospectively assessed the aspects that helped the EPs in the task of creating the OSC. It was filled in by the EPs after they created the OSC and comprised questions judged to be relevant to the task of creating the OSC (it included aspects such as the EPs' medical knowledge and the URUZ mark-up tool). For each aspect, a description and an example were given, and the EPs were asked to score the level of its contribution to the task of creating the OSC on a scale of -3 (interfered greatly) to +3 (contributed greatly).

2.2.2. Questionnaire No. 2: usefulness of different aspects of the methodology for performing the mark-up

This questionnaire retrospectively assessed the aspects that helped the CEs in the task of creating the mark-ups. It was filled in by the CEs after they created the mark-up and included the same 14 aspects as used in Questionnaire No. 1. Similarly, for each aspect, a description and an example were given, and the CEs were asked to score the contribution of that aspect to the task of performing the mark-up on a scale of -3 to +3.

2.2.3. Questionnaire No. 3: comprehension of the Asbru KRs

This questionnaire prospectively assessed (before creation of the mark-up) the CEs' comprehension of the Asbru KRs. It was filled in by the CEs before they created the mark-up and covered 25 Asbru KRs. For each KR, a description and an example were given, and the CEs were asked to score their level of understanding of that KR on a scale of -3 (very difficult to understand) to +3 (very easy to understand).

2.2.4. Questionnaire No. 4: ease of structuring the Asbru KRs

This questionnaire retrospectively assessed the CEs' reported ease of structuring the Asbru KRs, relating to the same 25 KRs as given in Questionnaire No. 3. It was filled in by the CEs after they created the mark-up. Again, a description and an example were given for each KR, and the CEs were asked to score the level of ease of structuring it on a scale of -3 (very difficult to structure) to +3 (very easy to structure).
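Questionnaire scores of this kind are typically summarized per aspect as a mean with a standard deviation. A minimal sketch follows; the rater scores are invented, though chosen to reproduce a "2.75 ± 0.5"-style summary of the sort reported in the results below:

```python
from statistics import mean, stdev

# Hypothetical -3..+3 scores given by four raters to a single aspect
scores = [3, 3, 2, 3]

# Mean and sample standard deviation, as in "mean +/- SD" reporting
print(f"{mean(scores):.2f} \u00b1 {stdev(scores):.2f}")
```

This is the form in which the per-aspect usefulness scores are presented in Tables 1 and 2.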

The comparisons performed among the answers to the questionnaires are summarized in Fig. 4.

3. Results

The research questions posed in this study were organized into three groups as follows:

A. Aspects that most helped the EPs and CEs to create the OSC and the mark-up
(A.1) What is the extent to which general aspects, such as clinical knowledge, understanding the overall process methodology and understanding the Asbru specification language, help the EPs who created the OSCs and the CEs who created the mark-ups?
(A.2) What is the correlation between the values assigned to the various aspects by the two EPs creating a consensus for each GL, and what is the correlation between the values assigned to each aspect by the two CEs structuring each GL?
(A.3) Is there a significant positive correlation between the values assigned to the various aspects by the EPs for the task of creating the OSC and those assigned by the CEs for the task of performing the mark-up?


B. Reported level of ease in understanding and structuring the Asbru ontology KRs
(B.1) How well do CEs understand each of the Asbru KRs before performing the mark-up?
(B.2) What is the level of ease of structuring the GLs according to each of the Asbru KRs?
(B.3) Is there a significant positive correlation between the perceived ease of understanding the KRs and the perceived ease of using them in the mark-up?

C. Correlation between the subjective and the objective measures
(C.1) Is there a significant positive correlation between the perceived level of comprehension of the Asbru KRs before the mark-up and the actual quality of the structuring of the KRs in the mark-ups?
(C.2) Is there a significant positive correlation between the CEs' subjective reporting with regard to the ease of structuring the GL by using each of the Asbru KRs and the actual quality of the structuring of the KRs in the mark-ups?

The results are organized according to the respective research questions A to C.

A. Aspects that most helped the EPs and CEs to create the OSC and mark-up

• Methods of measurement for research question AMethod for A.1: To pinpoint the aspects that were most helpful increating the OSC and performing the mark-up, we analysed theanswers to questionnaires no. 1 and no. 2, respectively.Method for A.2: To determine whether there is a significant corre-lation among the values assigned to the various aspects by thedifferent EPs and CEs for the tasks of creating the OSC andcarrying out the mark-up, respectively, we performed a series ofPearson correlation tests between the scores assigned to eachaspect by each pair of EPs performing the OSC task, and alsobetween the scores assigned to the aspects by each pair of CEsperforming the mark-up task.Method for A.3: To determine whether there is a significant positivecorrelation between the values assigned to the various aspects by theEPs for the task of creating the OSC and those assigned by the CEsfor the task of carrying out the mark-up, we performed a Pearsoncorrelation test between the vector of the mean scores of the aspects(across all EPs) assigned to the OSC task and the vector of the meanscores of the aspects (across all CEs) assigned to the mark-up task.• Results for research question AResults for A.1: Table 1 summarizes the main aspects consideredhelpful – or not helpful – by the EPs in the creation of the OSC

Figure 4 Summary of the comparisons per-formed among the answers to the question-naires and the objective scores assigned tothe marked-up guideline documents. EP,expert physician; CE, clinical editor; OSC,ontology-specific consensus; KR, knowledgerole.

Physicians ability to structure GLs E. Shalom et al.

© 2009 The Authors. Journal compilation © 2009 Blackwell Publishing Ltd1048

(answers to questionnaire no. 1). The ‘EPs’ own medical expertise’ was listed as the most helpful aspect (2.75 ± 0.5). The next most helpful aspects were: ‘reading the guideline source before making an OSC’ (2.5 ± 0.58), ‘knowing the Asbru ontology’s hybrid model’ (2.25 ± 1.51) and ‘knowing Asbru’s procedural and declarative KRs’ (2.25 ± 0.5). The DeGeL tools, however (aspects 9–14 in Table 1), had a mean usefulness score of less than 1 (the mean across all EPs and aspects was 1.3 ± 0.58).

Table 2 summarizes the main aspects that were considered as helpful – or not helpful – by the CEs in performing the mark-up (answers to questionnaire no. 2). The aspect ‘knowing Asbru’s declarative aspects’ was considered by the CEs as most helpful (2.75 ± 0.5). The next most helpful aspects were ‘Asbru’s procedural aspects’ (2.5 ± 0.58), ‘knowing the IndexiGuide tool’ (2.5 ± 0.58) and ‘knowing the URUZ main interface’ (2.25 ± 0.96). However, the DeGeL framework and the Spock run-time application aspects had a mean usefulness score of less than 1, and the aspect of ‘having several sources’ was actually listed as interfering with the CEs’ work (mean = -0.5). (The mean across all CEs and aspects was 1.57 ± 0.38.)

Results for A.2: The results of the correlation tests between the aspect scores of each pair of EPs for creating the OSC task, and also between the aspect scores of each pair of CEs for performing the mark-up task, were inconclusive: the correlations were significant for only about half of the pairs. However, when we examined the results of each EP and each CE, it was evident that the EPs and the CEs were internally very consistent with regard to what they perceived as the most useful and the least useful aspects. In contrast, there was some variability between the scores of the EPs and between the scores of the CEs for the aspects with the intermediate mean values. Therefore, we excluded the intermediate values and recalculated the correlations. This time, the correlations for all the pairs were positive and significant (0.69 < R < 0.89, P < 0.05).

Table 1 Mean ± SD scores assigned by the expert physicians to the usefulness of each aspect with respect to the task of creating an ontology-specific consensus (OSC)

| # | Aspect name | Aspect description | Mean | SD |
|---|-------------|--------------------|------|----|
| 1 | Having medical expertise | Your own expertise regarding the GL’s domain | 2.75 | 0.50 |
| 2 | Reading the GL sources before creating the OSC | Reading the textual content of the GL | 2.50 | 0.58 |
| 3 | Knowing the multiple-representation model | Understanding the hybrid-Asbru representation model | 2.25 | 1.50 |
| 4 | Understanding the Asbru procedural KRs | Understanding the semantics of the Asbru procedural operators (e.g. do-in-parallel) | 2.25 | 0.50 |
| 5 | Understanding the Asbru declarative KRs | Understanding the semantics of the Asbru declarative KRs (e.g. filter condition) | 2.25 | 0.50 |
| 6 | Understanding ontologies | Understanding the key concepts and relations of ontologies | 1.50 | 1.29 |
| 7 | Having more than one source | Using more than one textual source to structure the GL | 1.50 | 1.73 |
| 8 | Understanding DeGeL | Understanding the overall DeGeL library | 1.25 | 1.50 |
| 9 | Understanding the main interface of URUZ | Understanding the use of the tool for GL mark-up | 0.50 | 0.58 |
| 10 | Using the URUZ plan-body wizard | Understanding the use of the tool for structuring a GL into a tree of plans | 0.50 | 0.58 |
| 11 | Using the IndexiGuide GL semantics classification* | Understanding the use of the tool for semantic indexing of the GL | 0.50 | 1.00 |
| 12 | Using the Vaidurya GL search and retrieval engine† | Understanding the use of the tool for search and retrieval of the GL | 0.25 | 0.50 |
| 13 | Understanding the use of the vocabulary server‡ | Understanding the use of the tool for finding terms in standard vocabularies such as LOINC, CPT | 0.25 | 0.50 |
| 14 | Understanding the Spock GL run-time application engine§ | Understanding the use of the tool for applying the GLs | 0.00 | 0.00 |
| | Mean score | | 1.30 | 0.28 |

*The IndexiGuide tool enables the CE to classify GLs according to one or more semantic indices [11].
†The Vaidurya engine enables context-sensitive search and retrieval of GLs from the DeGeL library [19].
‡The vocabulary server enables linking of knowledge to standard terms (e.g. LOINC vocabulary) [20].
§The Spock engine enables run-time guideline application at the point of care [21].
CE, clinical editor; GL, clinical guideline; KR, knowledge role; SD, standard deviation.

Table 2 Mean ± SD scores assigned by the CEs to the usefulness of each aspect for the task of performing a mark-up

| # | Aspect name* | Mean | SD |
|---|--------------|------|----|
| 1 | Understanding the Asbru procedural KRs | 2.75 | 0.50 |
| 2 | Understanding the Asbru declarative KRs | 2.50 | 0.58 |
| 3 | Using IndexiGuide | 2.50 | 0.58 |
| 4 | Having medical expertise | 2.25 | 0.96 |
| 5 | Understanding the main interface of URUZ | 2.25 | 0.96 |
| 6 | Using the URUZ plan-body wizard | 2.00 | 2.00 |
| 7 | Reading the GL sources before creating the OSC | 1.75 | 1.26 |
| 8 | Understanding ontologies | 1.75 | 1.50 |
| 9 | Using Vaidurya | 1.75 | 1.26 |
| 10 | Knowing the multiple-representation model | 1.25 | 1.50 |
| 11 | Understanding the use of the vocabulary server | 1.25 | 1.26 |
| 12 | Understanding DeGeL | 0.25 | 0.96 |
| 13 | Understanding Spock | 0.25 | 0.50 |
| 14 | Having more than one source | -0.50 | 2.08 |
| | Mean score | 1.57 | 0.38 |

*See Table 1 for a description of the aspects.
CE, clinical editor; GL, clinical guideline; KR, knowledge role; OSC, ontology-specific consensus; SD, standard deviation.



Results for A.3: Finally, the correlation between the mean scores of the aspects related to the OSC and the aspects related to the mark-up process was found to be positive, but weak and insignificant (R = 0.23, P = 0.214).
• Intermediate conclusion for research question A

The main aspects important to the EPs were their own medical and ontological knowledge, while those crucial to the CEs were the different DeGeL tools relevant to GL specification. Furthermore, there was only a very weak, insignificant correlation between the aspects considered useful for creating a consensus and the aspects considered useful for performing the mark-up.

B. Reported ease of understanding and structuring the Asbru KRs by the CEs
• Methods of measurement for research question B
Method for B.1: To pinpoint the KRs that were perceived as most easy/difficult for the CEs to understand before the mark-up, we used questionnaire no. 3.
Method for B.2: To determine which KRs were perceived as most easy/difficult to structure by the CEs after the mark-up, we used questionnaire no. 4.
Method for B.3: To determine whether there was a significant positive correlation between the perceived ease of understanding the KRs and the perceived ease of using them in the mark-up, we performed a Pearson correlation test between the mean scores across all CEs (i.e. the average, for each KR, of the scores assigned by each of the CEs in the questionnaire) for the two questionnaires.
• Results for research question B
Results for B.1 and B.2: The mean understanding and structuring scores of each KR across all CEs are presented in Table 3. The KRs that were reported as easiest to understand were also reported as the easiest to structure. The Actors and To-Be-Defined KRs were listed in both questionnaires as the easiest to understand and to structure (3.00 ± 0), as was the Clinical context KR (2.75 ± 0.5). Surprisingly, KRs of the Intentions class were assigned low scores – perhaps owing to incomplete understanding of their semantics.
Results for B.3: Most importantly, we found a significant and high positive correlation (R = 0.87, P < 0.001) between the mean scores (across all CEs) of the answers to the two questionnaires.
• Intermediate conclusions for research question B
The CEs displayed consistent behaviour: KRs that they considered easy to understand were also considered easy to structure. Thus, intuitive ‘real world’ declarative KRs such as Actors, Clinical context and Filter conditions were easy to understand and to structure, whereas less intuitive procedural KRs with complex semantics, such as Intentions, Switch case and Plan body (especially the subplan KRs), were difficult to structure and to understand.

Table 3 Mean ± SD CEs’ scores for understanding (column I) and structuring (column II) of the various KRs, sorted by understanding score

| KR ID | KR description* | I: Understanding, mean | SD | II: Structuring, mean | SD |
|-------|-----------------|------------------------|----|-----------------------|----|
| 1 | Actors | 3.00 | 0.00 | 3.00 | 0.00 |
| 2 | To-Be-Defined | 3.00 | 0.00 | 3.00 | 0.00 |
| 3 | Clinical context | 2.75 | 0.50 | 2.75 | 0.50 |
| 4 | Level of evidence | 2.75 | 0.50 | 1.75 | 1.26 |
| 5 | Strength of recommendation | 2.75 | 0.50 | 1.50 | 1.00 |
| 6 | Simple action | 2.50 | 0.58 | 2.50 | 1.00 |
| 7 | Complete condition | 2.50 | 1.00 | 2.25 | 0.96 |
| 8 | Abort condition | 2.50 | 1.00 | 2.00 | 0.82 |
| 9 | Filter condition | 2.25 | 0.96 | 1.50 | 1.73 |
| 10 | Subplans – sequential order | 1.50 | 2.38 | 1.25 | 2.36 |
| 11 | Reactivate condition | 1.50 | 1.91 | 0.75 | 1.26 |
| 12 | Cyclical plan | 1.25 | 0.96 | 2.00 | 0.82 |
| 13 | Subplans – any order | 1.25 | 2.87 | 1.00 | 2.00 |
| 14 | If-Then-else | 1.25 | 2.87 | 0.75 | 1.89 |
| 15 | Subplans – parallel order | 1.00 | 2.83 | 2.00 | 0.82 |
| 16 | Suspend condition | 1.00 | 1.41 | 0.75 | 1.26 |
| 17 | Guideline knowledge | 0.75 | 1.89 | 0.50 | 2.38 |
| 18 | Plan activation | 0.50 | 2.65 | 0.75 | 2.06 |
| 19 | Intentions – overall outcome | 0.50 | 2.89 | 0.50 | 2.89 |
| 20 | Subplans – unordered | 0.25 | 2.75 | 0.75 | 1.89 |
| 21 | Set-up condition | 0.00 | 1.15 | 0.50 | 1.29 |
| 22 | Intentions – intermediate outcome | 0.00 | 2.31 | 0.25 | 2.63 |
| 23 | Intentions – overall process | 0.00 | 2.31 | -0.25 | 2.06 |
| 24 | Switch-case | -0.50 | 2.52 | 0.50 | 2.38 |
| 25 | Intentions – intermediate process | -0.50 | 1.91 | 0.00 | 2.31 |
| | Mean score | 1.35 | 1.20 | 1.29 | 0.97 |

*See Section 1.3 for explanations and description of the knowledge roles.
CE, clinical editor; KR, knowledge role; SD, standard deviation.



C. Results for subjective and objective comparison
• Methods of measurement for research question C
Methods for C.1 and C.2: For each CE, we conducted two Pearson correlation tests to examine whether there was a significant positive correlation between: (1) the CE’s reported comprehension level of the Asbru KRs before performing the mark-up and the actual quality of structuring of these KRs in the resulting mark-up; and (2) the CE’s reported ease of structuring the Asbru KRs after performing the mark-up and the actual quality of structuring of these KRs in the resulting mark-up.
• Results for research question C
Results for C.1 and C.2: For all of the CEs (other than for one CE in the case of one particular GL), the correlations between their subjective estimates – of both their comprehension of the Asbru KRs before the mark-up and the reported ease of structuring the Asbru KRs after the mark-up – and the objective measures of their structuring quality were low and not significant (P > 0.05).

Table 4 shows the subjective mean scores assigned by each CE in his or her report of the comprehension of the Asbru KRs and of the ease of structuring the Asbru KRs, on a scale of [-3, 3] (based on questionnaires no. 3 and no. 4). The mean correctness score over all KRs for each CE is also shown.

The subjective self-assessment of some of the CEs appeared to contradict their performance. For example, CE1 apparently underestimated his comprehension, with a rather modest understanding score of 0.3 and an average ease-of-structuring score of 0.5, but his correctness score was 91%. CE2, on the other hand, apparently overestimated himself, with a rather high mean comprehension score of 2.5 and an average ease-of-structuring score of 2.6, but his correctness score was only 76%.
• Intermediate conclusion for research question C

There was typically no correlation between the CEs’ estimates of their comprehension level of the specification language’s KRs – or of the ease of structuring these KRs – and the actual mark-up results for these KRs.

4. Discussion

The current study was performed as part of a larger project in which we are developing a comprehensive methodology for GL structuring and for assessment of the structuring quality. In the current study, we investigated several important cognitive aspects of such a methodology, which are independent of any particular GL representation framework and are thus relevant to most existing GL-based automated support architectures. In particular, we

examined the inter-correlations between two aspects of GL specification (i.e. ontological comprehension and ease of use) and one objective quantitative measure (i.e. correctness) in the context of the performance of the two main tasks involved in the GL specification process: creating the OSC and structuring the GL’s contents.

4.1. Creating an ontology-specific consensus and performing mark-ups

For creating the OSC, aspects such as medical knowledge and understanding of the Asbru semantics were considered more helpful by the EPs than expertise with the different DeGeL tools. For the CEs, the most helpful aspects for performing the mark-ups included understanding the Asbru GL ontology semantics and (in contrast to the task of creating the OSC by the EPs) familiarity with the GL editing tools. Tools directly helpful for actual editing, such as URUZ, were considered very useful, while tools that are only marginally relevant to the GL specification process, such as the Spock GL application tool, were considered to be of low usefulness.

Thus, we may conclude that creating an OSC and performing the actual mark-ups are very different tasks, each requiring different skills: the more theoretical skills helped the EPs in the creation of an OSC, while the more computer-oriented skills, such as the ability to use specific editing tools, aided the CEs in the mark-up process. However, the most important skill for both tasks was understanding the semantics of the GL specification language – Asbru, in this case. These results, which demonstrate the need for different skills for each task, strengthen our conclusion that senior EPs should be involved in the creation of the OSC, while CEs, who need not be domain experts but who could perhaps have a more computational orientation, should perform the mark-up.

4.2. Understanding the Asbru guideline ontology before and after the mark-up

Many studies in various sub-domains of cognitive psychology show that a person’s knowledge base leads him or her to hold a certain set of attitudes and beliefs that are subsequently manifested in his or her behaviour [22]. For the participants of this study, perceptions of the difficulty of understanding the ontological KRs prior to the mark-up task were positively correlated with their reported perceived level of ease of structuring these KRs upon completion of the mark-up task. This correlation supports the possibility that their attitudes regarding task difficulty shaped their

Table 4 Comparison between the subjective scores and the objective measures

| | CE1 | CE2 | CE3 | CE4 |
|---|-----|-----|-----|-----|
| Specialty | Senior physician | Senior physician | Intern | Senior physician |
| Subjective mean score (-3 to +3): reported comprehension of Asbru KRs before mark-up | 0.3 | 2.4 | 2.9 | 1.5 |
| Subjective mean score (-3 to +3): reported ease of structuring Asbru KRs after mark-up | 0.5 | 2.6 | 2.3 | 0.9 |
| Objective measure*: correctness of mark-up (%) | 91 | 73 | 95 | 84 |

*See Section 2.1 for more details about the objective measures and how they were calculated.
CE, clinical editor; KR, knowledge role.



behaviour (i.e. their reported task difficulty after completion of the task). In both phases (before and after mark-up), the CEs reported intuitive KRs such as Actors, Clinical context and Filter conditions as easy to understand and to structure, whereas more ‘abstract’ and non-intuitive KRs, such as Intentions, Switch case and Guideline knowledge, were reported as being more difficult to understand and to structure. Thus, the CEs displayed consistent behaviour in their reports: the KRs that they declared more comprehensible before the mark-up were also reported as being easier for them to structure.

Based on these results, we conclude that additional emphasis should be placed on teaching the semantics of the KRs that were reported as difficult to understand. These include task-specific declarative KRs, such as Intentions, and procedural concepts, such as Switch case. Perhaps a short one-time test should be administered to the CEs, before starting any mark-up, to measure their ontological knowledge, so as to ensure that they understand all the KRs at a sufficient level. Furthermore, perhaps a small simulation of a sample mark-up task should be created to assess the CEs’ knowledge and understanding of the mark-up task.

4.3. CEs’ perception of their ontological comprehension and ease of mark-up versus the mark-up correctness

Neither the CEs’ estimates of their comprehension level of the specification language’s KRs nor their reports of the ease of structuring these KRs were correlated with the correctness levels of the GL mark-ups performed by the same CEs using these KRs. As we cannot rely on the CEs’ self-estimates of their level of ontological comprehension of specific KRs, or on the CEs’ retrospective self-reports of their level of ease of structuring specific KRs, we cannot use such reports as substitutes for an actual assessment of the quality of each mark-up. To assess the quality of a GL’s mark-up by a particular CE, it might be best to assess the level of correctness of samples of the GL text for which a gold standard had previously been created. In practice, medical students or residents might well be more suitable players for the mark-up task, as they may have more time available than senior experts and as ontological comprehension skills obviously do not require clinical expertise.

In addition, although the number of CEs was small, from a cognitive-psychological point of view, several theories of expertise, as these are manifested in the context of GL representation [23], might be well exemplified within the results. For example, CE2 reported a high comprehension of the Asbru ontology’s KRs and subsequently viewed the actual mark-up task as not difficult (Table 4), but, in practice, he had the lowest mean correctness score in the mark-up. This might be an example of the ‘errors of overconfidence’ phenomenon found among experts [24].

It would seem that effective structuring of clinical GLs requires a basic level of clinical knowledge, together with ontological comprehension and perhaps a computational orientation (owing to the need to understand the procedural semantics of the target ontology). It was indeed noteworthy that CE3, an intern with significant previous computational training and experience, had the highest mean correctness score (95%); the other three CEs were senior expert physicians. As it appears that the mark-up task requires both clinical and computational knowledge, it is likely that the expert

physicians lacked some of the computational skills needed to attain maximally correct results, even though they had a high, well-organized level of clinical knowledge and comprehension. Furthermore, the clinical aspects of the task were sufficiently basic that domain-specific medical expertise was not required, mostly owing to the existence of a detailed OSC. The required clinical aspects were thus well within the expertise of an intern or a resident physician, and perhaps even that of an advanced medical student.

The possibility of using CEs who have only basic clinical skills but a deeper understanding of the target GL specification ontology has encouraging implications for the prospects of large-scale representation of procedural clinical knowledge.

4.4. Limitations of the current study

An apparent limitation of this study lies in the small numbers of GLs, EPs and CEs, which is a common limitation in KA evaluations. However, it is important to point out that although the number of EPs and CEs is small, the number of GLs is misleading; in fact, the amount of clinical knowledge processed in this study is quite significant: the study included 196 plans and subplans, and 326 KRs.

Thus, the study’s implications regarding the level of training needed for mark-up and the soundness of self-assessment by the CEs are not decisive. However, we think that the study’s suggestions of (1) the possibility of using CEs with only basic clinical skills but with a deeper understanding of the target GL specification ontology, and (2) the lack of a significant correlation between the reported subjective level of understanding of the ontological features – or the reported subjective ease of structuring using these features – and the objective quality of the structuring of these features in actual GLs, are worth further investigation. As it is not easy to incorporate many EPs and CEs in any study, a future project might focus on only these points.

4.5. Conclusions

In summary, we drew the following conclusions from the current study:
1 EPs consider having medical expertise and understanding of the ontological features as most useful for creation of the OSC.
2 CEs consider understanding of the ontological features and use of the structuring tools as most useful for structuring the GLs.
3 There is a significant positive correlation between the reported level of understanding the ontological features and the reported ease of structuring those features.
4 There is no significant correlation between the reported subjective level of understanding of the ontological features – or the reported subjective ease of structuring using these features – and the objective quality of the structuring of these features in actual GLs.

Acknowledgements

We thank Dr L. Basso and Dr H. Kaizer for their efforts and contribution to this research in the early evaluations of the mark-up tools. This research was supported in part by NIH award number R01-LM-06806, by an IBM faculty fellowship (Y.S.), and by the



Deutsche Telekom Labs at Ben-Gurion University of the Negev. The views expressed are those of the authors and not necessarily those of the Department of Veterans Affairs.

References

1. Field, M. & Lohr, K. (1990) Clinical Practice Guidelines: Directions for a New Program. Washington, D.C.: National Academy Press.
2. Grimshaw, J. M. & Russell, I. T. (1993) Effect of clinical guidelines on medical practice: a systematic review of rigorous evaluations. Lancet, 342, 1317–1322.
3. Micieli, G., Cavallini, A. & Quaglini, S. (2002) Guideline compliance improves stroke outcome – a preliminary study in 4 districts in the Italian region of Lombardia. Stroke, 33, 1341–1347.
4. Quaglini, S., Ciccarese, P., Micieli, G. & Cavallini, A. (2004) Non-compliance with guidelines: motivations and consequences in a case study. In Proceedings of the Symposium on Computerized Guidelines and Protocols (CGP 2004) (eds K. Kaiser, S. Miksch & S. W. Tu), pp. 75–87. Amsterdam: IOS Press.
5. Quaglini, S., Cavallini, A., Gerzeli, S. & Micieli, G. (2004) Economic benefit from clinical practice guideline compliance in stroke patient management. Health Policy (Amsterdam, Netherlands), 69 (3), 305–315.
6. Peleg, M., Tu, S. W., Bury, J., et al. (2003) Comparing computer-interpretable guideline models: a case-study approach. Journal of the American Medical Informatics Association, 10 (1), 52–68.
7. De Clercq, P., Blom, J., Korsten, H. & Hasman, A. (2004) Approaches for creating computer-interpretable guidelines that facilitate decision support. Artificial Intelligence in Medicine, 31 (1), 1–27.
8. Isern, D. & Moreno, A. (2008) Computer-based execution of clinical guidelines: a review. International Journal of Medical Informatics, 77 (12), 787–808.
9. Wang, D., Peleg, M., Bu, D., et al. (2003) GESDOR – a generic execution model for sharing of computer-interpretable clinical practice guidelines. In Proceedings of the AMIA Symposium (ed. M. Musen), pp. 694–698. Washington, D.C./Bethesda, MD: American Medical Informatics Association.
10. Patel, V. L., Allen, V. G., Arocha, J. F. & Shortliffe, E. H. (1998) Representing clinical guidelines in GLIF: individual and collaborative expertise. Journal of the American Medical Informatics Association, 5 (5), 467–483.
11. Shahar, Y., Young, O., Shalom, E., Galperin, M., Mayaffit, A., Moskovitch, R. & Hessing, A. (2004) A framework for a distributed, hybrid, multiple-ontology clinical-guideline library and automated guideline-support tools. Journal of Biomedical Informatics, 37 (5), 325–344.
12. Shalom, E., Shahar, Y., Taieb-Maimon, M., et al. (2008) A quantitative evaluation of a methodology for collaborative specification of clinical guidelines at multiple representation levels. Journal of Biomedical Informatics, 41 (6), 889–903.
13. Shalom, E., Shahar, Y., Lunenfeld, E., et al. (2006) The importance of creating an ontology-specific consensus before a mark-up-based specification of clinical guidelines. 17th European Conference on Artificial Intelligence (ECAI-06).
14. Shahar, Y., Miksch, S. & Johnson, P. (1998) The Asgaard project: a task-specific framework for the application and critiquing of time-oriented clinical guidelines. Artificial Intelligence in Medicine, 14, 29–51.
15. eMedicine Website (2005) Pelvic Inflammatory Disease. Available at: http://emedicine.medscape.com/article/256448-overview (last accessed 19 August 2009).
16. Centers for Disease Control and Prevention (CDC) Website (2002) Sexually Transmitted Diseases Treatment Guidelines. Available at: http://www.cdc.gov/mmwr/preview/mmwrhtml/rr5106a1.htm (last accessed 19 August 2009).
17. Veterans Affairs (VA) Website (2005) Inpatient Management of COPD: Emergency Room and Hospital Ward Management (B1). Available at: http://www.healthquality.va.gov/Chronic_Obstructive_Pulmonary_Disease_COPD.asp (last accessed 19 August 2009).
18. The American Association of Clinical Endocrinologists (AACE) Website (2002) American Association of Clinical Endocrinologists Medical Guidelines for Clinical Practice for the Evaluation and Treatment of Hyperthyroidism and Hypothyroidism. Available at: http://www.aace.com/pub/pdf/guidelines/hypo_hyper.pdf (last accessed 19 August 2009).
19. Moskovitch, R. & Shahar, Y. (2009) Vaidurya: a multiple-ontology, concept-based, context-sensitive clinical-guideline search engine. Journal of Biomedical Informatics, 42 (1), 11–21.
20. German, E., Leibowitz, A. & Shahar, Y. (2009) An architecture for linking medical decision-support applications to clinical databases and its evaluation. Journal of Biomedical Informatics, 42 (2), 203–218.
21. Young, O., Shahar, Y., Liel, Y., Lunenfeld, E., Bar, G., Shalom, E., Martins, S. B., Vaszar, L. T., Marom, T. & Goldstein, M. K. (2007) Runtime application of Hybrid-Asbru clinical guidelines. Journal of Biomedical Informatics, 40 (5), 507–526.
22. Buchman, T. G., Patel, V. L., Dushoff, J., Ehrlich, P. R., Feldman, M., Levin, B., Miller, D. T., Rozin, P., Levin, S. A. & Fitzpatrick, S. (2006) Enhancing the use of clinical guidelines: a social norms perspective. Journal of the American College of Surgeons, 201, 826–836.
23. Patel, V. L., Arocha, J. F., Diermeier, M., How, J. & Mottur-Pilson, C. (2001) Cognitive psychological studies of representation and use of clinical practice guidelines. International Journal of Medical Informatics, 63 (3), 147–167.
24. Leprohon, J. & Patel, V. L. (1995) Decision-making strategies for telephone triage in emergency medical services. Medical Decision Making, 15, 240–253.

