Using a standardised patient assessment to measure professional attributes


Marta van Zanten,1 John R Boulet,1 John J Norcini2 & Danette McKinley1

INTRODUCTION Professionalism is an important topic in medical education today. While much work has focused on defining professionalism and teaching medical students the appropriate interpersonal behaviours, relatively little research has looked at meaningful ways of assessing the relevant attributes.

METHOD The Educational Commission for Foreign Medical Graduates (ECFMG®) clinical skills assessment (CSA®) uses standardised patients (SPs) to evaluate the readiness of graduates of international medical schools to enter accredited graduate training programmes in the USA. Doctor interpersonal skills, including professional qualities such as rapport, are evaluated as part of the CSA. Attentiveness, attitude and empathy, all facets of professional behaviour, are specifically targeted as part of the assessment.

RESULTS To date, over 35 000 candidates have been assessed, encompassing more than 370 000 individual SP encounters. Based on a 1-year cohort of examinees, the reliability of the individual professionalism-related component scores ranged from 0.61 to 0.70. Doctors who had graduated from medical school more recently, or were younger, generally obtained higher ratings. Professional qualities were only marginally related to measures of basic science and clinical science proficiency. Female candidates were rated significantly higher than male candidates on all dimensions.

CONCLUSIONS While some professional behaviours are probably best measured using formats such as surveys, self-assessment and critical incident techniques, certain aspects of the domain can be reliably and validly measured in SP examinations.

KEYWORDS education, medical, undergraduate/*standards; professional competence/standards; interpersonal relations; educational measurement/methods; empathy; attitude of health personnel.

Medical Education 2005; 39: 20–29
doi:10.1111/j.1365-2929.2004.02029.x

INTRODUCTION

In recent years, patient advocates, medical educators and doctors themselves have been concerned with the apparent decline in doctor professionalism, due in part to the corporate transformation of the health care system and increased patient workloads. In response, a broad coalition of organisations has placed a strong emphasis on returning professionalism to the medical vocation. Institutions such as the Liaison Committee on Medical Education (LCME) and the Accreditation Council for Graduate Medical Education (ACGME), through its Outcome Project, have called for the training and assessment of professional behaviours of medical students and residents.1 Given the strong positive relationship between doctor interpersonal skills and patient compliance, such efforts should lead to better health care outcomes.2,3

Although there is agreement that professional attributes are important for doctors, no common definition of the term exists. Several medical specialty and accrediting organisations have developed their own descriptions of professionalism and have used an assortment of methods to assess and measure the associated behaviours. In 2002, Arnold4 reported that about half of all medical schools had formal definitions of professionalism and assessment methods in place to test students’ qualities and skills. Certification and licensure organisations, most notably the American Board of Internal Medicine (ABIM), have specifically defined elements of professionalism,5 including altruism, accountability, excellence, duty, service, honour, integrity and respect for others. Challenges to professionalism have also been identified, including abuse of power, arrogance, greed, misrepresentation, impairment, lack of conscientiousness and conflicts of interest. The Association of American Medical Colleges (AAMC) stated that doctors must be able to demonstrate altruistic, knowledgeable, skilful and dutiful behaviours at the time of medical school graduation.6 Although the term ‘professionalism’ is not specifically used, these ideals relate directly to the constructs described above.

1 Educational Commission for Foreign Medical Graduates (ECFMG), Philadelphia, Pennsylvania, USA
2 Foundation for Advancement of International Medical Education and Research (FAIMER), Philadelphia, Pennsylvania, USA

Correspondence: Marta van Zanten, Educational Commission for Foreign Medical Graduates (ECFMG), 3624 Market Street, Philadelphia, Pennsylvania 19104, USA. Tel: 00 1 215 823 2226; Fax: 00 1 215 386 3309; E-mail: [email protected]

© Blackwell Publishing Ltd MEDICAL EDUCATION 2005; 39: 20–29

In addition to the recent focus placed on defining professionalism and promoting the teaching of the appropriate behaviours to medical students and residents, the assessment of construct-related attributes is a critical component of most initiatives. However, the traditional ways of assessing professionalism involve numerous difficulties, such as recognising the context-dependency of professional behaviours, choosing whether to assess knowledge, behaviour or both, deciding who should judge, and developing a valid and reliable measurement tool.4,7,8 More important, it can be argued that many aspects of professionalism cannot be evaluated through the measurement of specific behaviours and/or knowledge.

Some authors have attempted to evaluate professionalism as a part of clinical performance, with supervising doctors, peers, nurses and patients evaluating doctors using a variety of rating scales. Professionalism can be assessed as a comprehensive entity, or by breaking it down into specific elements of professional behaviour, such as humanism, empathy or ethics, which can be measured independently. Successful methods for measuring components of professionalism include self-assessment surveys, critical incident techniques, longitudinal studies, evaluation of video-taped patient visits and standardised patient-based objective structured clinical examinations (OSCEs).9–16 While strengths and weaknesses exist for each of these assessment methods and techniques,4,8 it is clear that, depending on the domain being measured and the evaluation technique employed, several aspects of professional behaviour can be reliably and validly assessed.

Standardised patients (SPs) have been widely used to test the clinical skills of doctors, including measures of interpersonal and communication skills, although their use for specifically evaluating the professionalism of doctors is somewhat limited. The authors of a study that did use a 3-station OSCE to measure medical students’ professional behaviours reported that the interrater reliability of the cumulative professionalism score (intraclass correlation coefficient) was 0.65. However, there was a lack of intercase correlation between professionalism ratings and separate ratings of communication skills.17 Standardised patients are more often used to measure the professional characteristics of a doctor–patient interaction, such as empathy or ethics. In a study by Colliver et al.,12 SPs used a checklist to indicate whether Year 4 medical students were ‘empathic’ in encounters. The empathy measure correlated most highly with communication items that related to making the patient feel comfortable and at ease. Singer et al.18 developed 2 ethics OSCE stations which were part of a pre-internship examination in Canada. The authors reported that there was satisfactory interrater agreement and that the stations appeared to be measuring the intended traits.18 Although these studies were relatively small scale, they provided some evidence that certain aspects of professionalism can be effectively measured using SP-based assessments.

Overview

What is already known on this subject

Although several characteristics of professional behaviour can be assessed using a variety of techniques, problems exist and some aspects of this construct are difficult to measure.

What this study adds

We provide evidence for the effectiveness of using standardised patients to evaluate some professional attributes, such as empathy and respect.

Ratings of professional behaviours varied as a function of doctor, patient and case characteristics, supporting the use of the assessment method.

Suggestions for further research

Valid and reliable methods for assessing other aspects of professionalism, such as accountability, altruism and ethical behaviour, are still needed.

Purpose

The purpose of this paper is to evaluate the effectiveness of the Educational Commission for Foreign Medical Graduates’ (ECFMG®) clinical skills assessment (CSA®) for evaluating attributes related to professionalism. While a simulated clinical environment is less useful for measuring many of the described professional behaviours such as altruism, ethics and accountability, it is likely that certain aspects within the broader domain, such as empathy and respect, can be effectively assessed using this type of tool. Comparing ratings of professional behaviours as a function of candidate, patient and case characteristics can provide valuable data to support the use of the assessment instrument.

METHODS

ECFMG certification

The ECFMG is responsible for certifying graduates of international medical schools who wish to pursue graduate medical education in the USA. The certification process involves a number of steps, including a verification of a medical diploma, passing scores on the US Medical Licensing Examinations (USMLE) Step 1 (basic science) and Step 2 (clinical science) examinations, an acceptable score on the Test of English as a Foreign Language (TOEFL), and finally a passing score on the CSA.

Clinical skills assessment (CSA)

The CSA is a performance examination that requires candidates to demonstrate their clinical skills in a simulated medical environment. Candidates interact with 10 or 11 SPs, who are lay people trained to realistically portray common clinical complaints. The candidates treat the SPs as they would actual patients, gathering relevant patient data, performing a focused physical examination and writing up their findings in the form of a clinical note. Candidates have 15 minutes to assess each of the SPs and 10 minutes after encounters to write up their findings. The test specifications ensure that candidates encounter a varied mix of clinical problems and SPs.19 From a scoring perspective, the examination is divided into 2 conjunctive components, the integrated clinical encounter (ICE) and doctor–patient communication (COM). The ICE portion comprises history taking and physical examination skills, scored analytically by SPs who complete case-specific checklists, and the written summarisation of patient findings, scored holistically by trained doctor raters. Interpersonal skills (IPS) are assessed holistically by the SPs across 4 dimensions, using a 4-point Likert scale ranging from poor to excellent. The specific domains assessed include ‘interviewing and collecting information’, ‘counselling and delivering information’, ‘rapport’ and ‘personal manner’. Satisfaction is also rated by SPs using a 5-point scale, similar to the scale used for rating IPS, but including an option of ‘don’t know/neutral’. The satisfaction ratings provided by SPs are for research purposes and are not counted as part of a candidate’s CSA score. Spoken English proficiency is also assessed in every encounter. The 4 IPS ratings and the 1 spoken English proficiency rating from each of the 10 SPs are combined to form the overall COM score. Candidates must achieve passing scores on both the ICE and COM portions of the examination to pass the CSA.
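The conjunctive scoring rule described above can be illustrated with a short sketch. It is not the ECFMG scoring algorithm: the text does not say how the 50 communication ratings are combined or what the passing standards are, so the plain mean in `com_score` and the cut scores `ICE_CUT` and `COM_CUT` are hypothetical placeholders, as is the assumption that spoken English is rated on the same numeric range as IPS.

```python
from statistics import mean

# Hypothetical cut scores; the paper does not report the actual standards.
ICE_CUT = 60.0
COM_CUT = 3.0


def com_score(encounters):
    """Combine the 4 IPS ratings and 1 spoken English rating per encounter.

    Assumption: a plain mean over all ratings. The real combination rule
    is not described in the text.
    """
    ratings = []
    for enc in encounters:
        ratings.extend(enc["ips"])      # 4 IPS ratings from this SP
        ratings.append(enc["english"])  # 1 spoken English rating from this SP
    return mean(ratings)


def passes_csa(ice, com):
    """Conjunctive rule: each component must meet its own standard."""
    return ice >= ICE_CUT and com >= COM_CUT


# Hypothetical candidate: 10 encounters, all ratings equal to 3.
encounters = [{"ips": [3, 3, 3, 3], "english": 3} for _ in range(10)]
print(passes_csa(ice=72.0, com=com_score(encounters)))
```

The point of the conjunctive design is that a strong COM score cannot compensate for a failing ICE score, or the reverse.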

CSA IPS dimensions and professionalism

While the CSA IPS rating scale was not designed to focus specifically on assessing professionalism, certain aspects of the assessment do overlap with elements in widely used definitions of the trait. For this research, we will focus on using elements of professionalism as defined by the ABIM Project Professionalism and the AAMC Medical School Objectives Project (MSOP) that overlap with the CSA IPS assessment instrument. A comparison of CSA IPS dimensions, Project Professionalism and MSOP definitions of doctor professional attributes is presented in Table 1. These descriptions of professional attributes were chosen because the definitions were developed to be broad and inclusive enough to be used by a wide variety of medical groups, including educators and assessors.

The first CSA IPS dimension, ‘Skills in interviewing and collecting information’, includes actions desirable in any skilful and professional doctor, but it does not contain specific elements of professional behaviour as defined by Project Professionalism or MSOP.

The second element in the CSA IPS scale, ‘Skills in counselling and delivering information’, is related to the Project Professionalism element of ‘Honour and integrity’, defined as being fair, truthful and straightforward with patients. Not allowing patients to voice their wishes or contribute to decision making is labelled as one of Project Professionalism’s identified ‘Abuses of power’, and is also a criterion used by SPs when evaluating the IPS 2 dimension. The MSOP definition of professionalism also contains IPS 2 skills under the definition ‘Physicians must be skilful’. For example, the MSOP definition states that doctors must be prepared to discuss different options with patients in an honest and objective way, clearly an attribute that overlaps with a rating of skills in counselling and delivering information.

The CSA IPS 3 rating of ‘Rapport’ overlaps most closely with the 2 chosen definitions of professionalism. The ‘Rapport’ rating assesses a doctor’s ability to be on his or her way to establishing a caring relationship with a patient. The Project Professionalism ‘Respect for others’ criteria speak directly to these elements. Additionally, under ‘Abuses of power’, arrogance is identified specifically as making empathy for the patient difficult. ‘Rapport’ elements are also mentioned in the MSOP document. The first element, ‘Physicians must be altruistic’, describes the need for doctors to be compassionate, respectful and empathetic and to avoid being judgmental.

Table 1 Comparison of CSA IPS criteria with definitions of professionalism

| CSA IPS dimension | Project Professionalism | MSOP |
|---|---|---|
| Skills in interviewing and collecting information: Effective use of open-ended and closed questions, clarity of questions, avoidance of jargon in questioning, and effective use of verification, summarisation and transition phrases | No direct relationship | No direct relationship |
| Skills in counselling and delivering information: Ability to check for a patient’s understanding, to counsel a patient when appropriate, tactfulness, avoidance of jargon in presenting possible diagnoses, linkage of symptoms or concerns to closing information, and the ability to leave a patient with a clear understanding of what will happen next | Honour and integrity: Being fair, truthful and straightforward with patients. Abuses of power: Interactions with patients and colleagues; not allowing patients to voice their wishes or contribute to decision making | Physicians must be skilful: Doctors must be prepared to discuss different options with patients in an honest and objective way |
| Rapport: Ability to be on his or her way to establishing a caring relationship with a patient. Attentiveness, body language, including appropriate eye contact and respect of personal space, attitude, and demonstration of empathy and support for the patient’s concerns | Respect for others: The essence of humanism. Abuses of power: Arrogance; offensive display of superiority and self-importance; denotes haughtiness, vanity, insolence and disdain; arrogance makes empathy for the patient difficult | Physicians must be altruistic: Doctors must be compassionate, respectful and empathetic and avoid being judgmental |
| Personal manner: Suitable introduction, appropriate mood and demeanour, and skills in conducting a physical examination respectful of a patient’s comfort and modesty | Respect for others: The essence of humanism | Physicians must be altruistic: Compassionate treatment of patients and respect for their privacy and dignity |
| Satisfaction: Patient’s overall approval of a doctor | Inferred relationship | Inferred relationship |


The fourth IPS dimension, ‘Personal manner’, assesses criteria such as an appropriate introduction, a candidate’s mood and demeanour in the encounter, and skills in conducting a physical examination that is respectful of a patient’s comfort and modesty. While the Project Professionalism definition does not specifically mention these elements, certainly ‘Respect for others’ would overlap with these concepts. The MSOP document does speak directly to the same behaviours mentioned in the IPS 4 dimension. Under ‘Physicians must be altruistic’ the call is for ‘compassionate treatment of patients, and respect for their privacy and dignity’.

The satisfaction rating, although not part of a candidate’s CSA score, provides information regarding a patient’s overall approval of a doctor. For this rating, SPs are allowed to be subjective and may also use their opinion of a candidate’s medical knowledge or ability. Most elements of professionalism in the Project Professionalism and MSOP definitions overlap to some degree with the CSA definition of patient satisfaction.

Sample

The analysis sample included candidates who were tested between 1st February 2002 and 31st January 2003. This 1-year cohort included 9262 candidates, of whom 1516 were repeaters. To eliminate the potential confounding effects of repeat candidate performance, only data for the 7746 first-time takers were used in the analyses. Over 20% (n = 1594) of the sample were US international medical school graduates, US citizens who had attended non-LCME accredited medical schools located outside the US, Canada and Puerto Rico. Over 58% of the sample was male (n = 4515) and most candidates (77.8%) claimed that English was not their native language. A relatively large percentage (22.6%) of the sample comprised people who had been citizens of India while at medical school. The next most prevalent cohorts came from the US (20.6%), Pakistan (7.2%), China (3.0%) and the Philippines (3.0%). The mean age of the candidates was 30.5 years (SD = 5.3 years). Data from over 77 000 SP encounters were available for analysis.

CSA case content

Cases used for a CSA administration are selected based on detailed test specifications, including criteria based on medical content and SP characteristics (e.g. age, gender). Cases are assigned to 1 of 5 categories based on the patient’s primary reason for visiting the doctor: abdominal, chest, constitutional, miscellaneous and neurological/psychological. Cases labelled as abdominal include gastrointestinal, genitourinary and most gynaecological complaints. Chest cases consist of cardiovascular and respiratory reasons for visit. Constitutional cases represent complaints of vague origin, such as weight gain or loss, fever, fatigue, etc. Cases categorised as miscellaneous include musculoskeletal complaints and routine visits such as for insurance physicals. Neurological/psychological cases represent complaints such as headaches, confusion and dementia.

Cases are also classified by acuity to 1 of the following 3 categories: acute, subacute and chronic. Acute cases are defined as problems that occurred in the preceding 48 hours, subacute as complaints that began within the preceding 2 weeks, and chronic as ongoing problems that have lasted for more than 2 weeks.
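The acuity rule can be written as a simple threshold function. The sketch below is illustrative only: the function name and the use of hours as the unit are our own choices, and the treatment of the exact boundaries (48 hours counted as acute, exactly 2 weeks counted as subacute) is an assumption, since the definitions above do not say which side the boundaries fall on.

```python
def classify_acuity(hours_since_onset):
    """Map time since symptom onset (in hours) to a CSA acuity category.

    Assumption: boundary values fall in the shorter-duration category.
    """
    if hours_since_onset < 0:
        raise ValueError("onset time cannot be negative")
    if hours_since_onset <= 48:
        return "acute"       # began within the preceding 48 hours
    if hours_since_onset <= 14 * 24:
        return "subacute"    # began within the preceding 2 weeks
    return "chronic"         # has lasted for more than 2 weeks


for hours in (24, 72, 24 * 30):
    print(hours, classify_acuity(hours))
```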

Analyses

Generalisability studies were conducted to investigate the sources of measurement error on the interviewing, counselling, rapport, personal manner and patient satisfaction ratings. Because different SPs perform different cases, and some SPs perform multiple cases, the model for the generalisability study was a person × SP(case) design. Variance components were estimated and these values were used to calculate generalisability coefficients for a 10-case assessment.
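The text does not print the coefficient that was used. One standard form for a person × SP(case) design with n cases, each portrayed by a single SP, counts the case, SP-within-case and residual components as absolute error. It is given here as an assumption rather than as the authors' stated method, with σ²_p, σ²_c, σ²_{sp:c} and σ²_e denoting the person, case, SP(case) and error components, and n = 10:

```latex
\[
\hat{\rho}^{2} = \frac{\sigma^{2}_{p}}{\sigma^{2}_{p} + \sigma^{2}_{\Delta}},
\qquad
\sigma^{2}_{\Delta} = \frac{\sigma^{2}_{c} + \sigma^{2}_{sp:c} + \sigma^{2}_{e}}{n},
\qquad
\mathrm{SEM} = \sqrt{\sigma^{2}_{\Delta}}
\]
```

Under this form, adding encounters shrinks the error term in proportion to 1/n, which is why a sufficient number of encounters can yield acceptable precision even when single-encounter ratings are noisy.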

Several analyses were conducted to investigate the associations between professional behaviours, as measured in the CSA, and candidate background/performance variables. Specifically, Pearson correlations were used to summarise the magnitude of relationships between interpersonal skills/satisfaction ratings and variables related to candidate background, including performance on other certification examinations and the integrated clinical encounter portion of the CSA. Here, IPS and satisfaction ratings, averaged over the 10 CSA clinical encounters, were used. The magnitude of these associations can provide some evidence to support the use of the CSA scales to measure select professional attributes.
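The two steps described here, averaging each candidate's ratings over encounters and then correlating the averages with a background variable, can be sketched as follows. The ratings and TOEFL scores are hypothetical values invented for illustration, and `pearson_r` is our own helper rather than anything taken from the study.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: rapport ratings (1-4 scale) over 10 encounters per candidate.
rapport_ratings = {
    "A": [3, 3, 4, 3, 3, 4, 3, 3, 3, 4],
    "B": [2, 3, 2, 3, 2, 2, 3, 2, 3, 2],
    "C": [4, 3, 4, 4, 3, 4, 4, 3, 4, 4],
    "D": [3, 2, 3, 3, 3, 2, 3, 3, 2, 3],
}
toefl_scores = {"A": 600, "B": 560, "C": 630, "D": 590}  # hypothetical

candidates = sorted(rapport_ratings)
mean_rapport = [mean(rapport_ratings[c]) for c in candidates]  # average over encounters
toefl = [toefl_scores[c] for c in candidates]
r = pearson_r(mean_rapport, toefl)
print(f"r = {r:.2f}")
```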

Comparisons between ratings for select candidate cohorts were also made. Here, analysis of covariance (ANCOVA) was used to test for differences in mean scores. Because the abilities of the candidates may have been different for the comparison groups, a covariate (Test of English as a Foreign Language, TOEFL) was added to the model. The TOEFL score was chosen as a covariate based on the moderate correlations between this measure and the IPS and satisfaction ratings. Least square means are reported for all analyses involving candidate groups. These are the selected group mean scores, after adjusting for potential differences in English ability. Based on the literature, it was hypothesised that certain candidate groups (e.g. female doctors) would be better able to demonstrate professional behaviours.
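To make the adjustment concrete, the sketch below computes covariate-adjusted group means for a one-way design with a single covariate and a common within-group slope. This is the textbook form of least square means in that simple case; the software and exact model used in the study are not described, so this should be read as an illustration. The data are hypothetical.

```python
from statistics import mean


def adjusted_means(groups):
    """Covariate-adjusted group means for a one-way ANCOVA.

    groups maps a group name to a list of (covariate, outcome) pairs.
    Each group mean is shifted along the pooled within-group regression
    slope to the grand mean of the covariate.
    """
    grand_x = mean([x for obs in groups.values() for x, _ in obs])
    sxy = sxx = 0.0
    group_stats = {}
    for name, obs in groups.items():
        gx = mean([x for x, _ in obs])
        gy = mean([y for _, y in obs])
        group_stats[name] = (gx, gy)
        sxy += sum((x - gx) * (y - gy) for x, y in obs)
        sxx += sum((x - gx) ** 2 for x, _ in obs)
    slope = sxy / sxx  # pooled within-group slope of outcome on covariate
    return {name: gy - slope * (gx - grand_x)
            for name, (gx, gy) in group_stats.items()}


# Hypothetical (TOEFL, rapport rating) pairs for two candidate groups.
data = {
    "female": [(580, 3.0), (600, 3.1), (620, 3.2)],
    "male": [(600, 2.9), (620, 3.0), (640, 3.1)],
}
for group, value in adjusted_means(data).items():
    print(f"{group}: {value:.2f}")
```

In this invented example the male group has the higher TOEFL mean, so adjusting to a common TOEFL level widens the gap between the groups relative to the raw means.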

Analysis of variance (ANOVA) was used to test for mean differences in ratings as a function of case characteristics. Here, it was hypothesised that, based on certain case attributes (e.g. acuity of condition), it may be more, or less, difficult to exhibit professional behaviours. As a result, one might expect differences in mean scores.
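For readers who want the mechanics, a one-way ANOVA F statistic can be computed from between-group and within-group sums of squares, as in the sketch below. The three rating lists are hypothetical stand-ins for encounters grouped by a case characteristic such as acuity; they are not study data.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of ratings."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean([v for g in groups for v in g])
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares: ratings around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ratings grouped by case acuity (acute, subacute, chronic).
ratings_by_acuity = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova_f(ratings_by_acuity)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```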

RESULTS

The variance components for the IPS and satisfaction ratings are presented in Table 2.

The reliabilities of the interviewing, counselling, rapport, personal manner and satisfaction ratings, over 10 encounters, were 0.63 (standard error of measurement [SEM] = 0.19), 0.70 (SEM = 0.22), 0.68 (SEM = 0.19), 0.61 (SEM = 0.20) and 0.68 (SEM = 0.26), respectively. Based on the generalisability studies, the choice of CSA case (type of encounter) had little impact on the variability of the ratings. That is, mean ratings did not vary appreciably as a function of the case. However, depending on the individual component, the choice of SP performing the case accounted for approximately 15–17% of the variability in ratings. This is evident by the non-zero SP(case) (SP nested in case) variance component. The final component (error) includes the 2-way interaction [P × SP(case)] and all other unaccounted sources of score variability.
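The reported coefficients can be approximately recovered from the Table 2 variance components. The sketch below assumes an absolute-error (dependability-type) coefficient in which the case, SP(case) and error components are each divided by the 10 encounters. The paper does not state its formula, so this is an inference: under this assumption the SEMs match the published values after rounding and the coefficients agree to within about 0.02, which is plausibly due to rounding of the published components.

```python
from math import sqrt

# Variance components as reported in Table 2: (person, case, SP(case), error).
TABLE_2 = {
    "interviewing": (0.06, 0.00, 0.07, 0.29),
    "counselling": (0.11, 0.02, 0.09, 0.39),
    "rapport": (0.08, 0.00, 0.07, 0.30),
    "personal manner": (0.06, 0.01, 0.08, 0.31),
    "satisfaction": (0.14, 0.00, 0.21, 0.46),
}


def dependability(person, case, sp_in_case, error, n_cases=10):
    """Return (coefficient, SEM) under the assumed absolute-error formula."""
    abs_error = (case + sp_in_case + error) / n_cases
    return person / (person + abs_error), sqrt(abs_error)


def sp_share(person, case, sp_in_case, error):
    """Proportion of single-encounter rating variance attributable to SP(case)."""
    return sp_in_case / (person + case + sp_in_case + error)


for name, comps in TABLE_2.items():
    coef, sem = dependability(*comps)
    print(f"{name:16s} coefficient={coef:.3f}  SEM={sem:.3f}  "
          f"SP share={sp_share(*comps):.1%}")
```

Computed this way, the SP(case) share for the four IPS dimensions falls at roughly 15–17%, in line with the text, while the share for the satisfaction rating comes out noticeably higher.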

The correlations between CSA IPS/patient satisfaction ratings and other measures are presented in Table 3.

Doctors who had graduated from medical school more recently, or were younger, generally achieved higher interviewing, counselling, rapport, personal manner and patient satisfaction ratings. There were moderate positive correlations between TOEFL scores and rated attributes related to professionalism. While the correlations between the CSA measures and the USMLE Step Examination scores were positive, at most, there was only 3.2% shared variance. There were also moderate correlations between IPS/patient satisfaction ratings and other CSA measures (data gathering, patient note). Here, professional attributes, at least those measured via the CSA, were associated with better history taking, physical examination and written communication skills.
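The shared variance figure follows from squaring the largest correlation with a USMLE Step score in Table 3. The short check below uses the published Step 1 and Step 2 correlations; the dictionary layout and function name are our own.

```python
# Correlations with USMLE Step 1 and Step 2 scores, as reported in Table 3.
STEP_CORRELATIONS = {
    "interviewing": (0.13, 0.18),
    "counselling": (0.10, 0.16),
    "rapport": (0.04, 0.12),
    "personal manner": (0.04, 0.11),
    "satisfaction": (0.08, 0.17),
}


def shared_variance(r):
    """Proportion of variance two measures share, given their correlation r."""
    return r * r


largest = max(abs(r) for pair in STEP_CORRELATIONS.values() for r in pair)
print(f"largest r = {largest:.2f}, shared variance = {shared_variance(largest):.1%}")
```

The largest value is the 0.18 correlation between interviewing and Step 2, which squares to the 3.2% quoted in the text.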

Table 2 Variance components for IPS and satisfaction ratings

| Component | Interviewing† | Counselling*† | Rapport*† | Personal manner*† | Satisfaction‡ |
|---|---|---|---|---|---|
| Person | 0.06 | 0.11 | 0.08 | 0.06 | 0.14 |
| Case | 0.00 | 0.02 | 0.00 | 0.01 | 0.00 |
| SP(Case) | 0.07 | 0.09 | 0.07 | 0.08 | 0.21 |
| Error | 0.29 | 0.39 | 0.30 | 0.31 | 0.46 |
| ρ² (SEM) | 0.63 (0.19) | 0.70 (0.22) | 0.68 (0.19) | 0.61 (0.20) | 0.68 (0.26) |

* Directly related to professionalism definitions. † 1–4 scale. ‡ 1–5 scale.

A comparison between the performances of male and female candidates on the IPS/patient satisfaction measures is presented in Table 4.

Based on Table 4, female candidates were rated significantly higher than male candidates on all CSA IPS dimensions. This would suggest that female candidates exhibited more professional behaviours and/or fewer ineffectual ones. The most noteworthy difference was on the rapport dimension, where women were rated, on average, almost 0.3 standard deviations higher than men. Overall, controlling for English ability (TOEFL), the SPs were more satisfied with women doctors than men.
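The effect sizes in Table 4 are expressed in standard deviation units. Assuming each is the difference in adjusted means divided by a standard deviation (the paper does not say which standard deviation was used), the implied value can be back-calculated from the published figures. Because the means are rounded to 2 decimal places, the results are rough indications only.

```python
# Adjusted means and effect sizes as reported in Table 4: (female, male, effect size).
TABLE_4 = {
    "interviewing": (3.03, 2.97, 0.21),
    "counselling": (2.91, 2.88, 0.08),
    "rapport": (3.07, 2.98, 0.28),
    "personal manner": (3.12, 3.06, 0.19),
    "satisfaction": (3.62, 3.52, 0.24),
}


def standardised_difference(mean_a, mean_b, sd):
    """Mean difference expressed in standard deviation units."""
    return (mean_a - mean_b) / sd


def implied_sd(mean_a, mean_b, effect_size):
    """Standard deviation implied by a mean difference and its effect size."""
    return (mean_a - mean_b) / effect_size


for name, (female, male, es) in TABLE_4.items():
    print(f"{name:16s} difference={female - male:.2f}  "
          f"implied SD={implied_sd(female, male, es):.2f}")
```

For rapport, the 0.09-point difference and the 0.28 effect size imply a standard deviation of roughly 0.32 rating points, which shows how small the absolute gap is on the 1–4 scale.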

The manifestation of professional behaviours may be differentially related to the condition of the patient or type of complaint. Mean IPS/satisfaction ratings by case content category are presented in Table 5.

Based on separate ANOVAs for each of the measured dimensions, there were significant differences in all ratings as a function of the primary case category. In general, ratings were lowest for neurological cases, regardless of the particular dimension being measured. For example, counselling ratings for neurological cases were 0.34 SDs lower than the average ratings provided for the remaining case categories. There were also significant differences in satisfaction ratings between all case category aggregations (F = 1115.5, P < 0.01). Here, patient satisfaction ratings were significantly lower for abdominal cases. On average, satisfaction ratings were 0.30 SDs lower if an abdominal case was encountered as opposed to any other type.

Table 3 Correlation between CSA interpersonal skills/satisfaction ratings and candidate background/performance variables

| | Years since graduation | Age | TOEFL | Step 1 | Step 2 | Data gathering | Patient note |
|---|---|---|---|---|---|---|---|
| Interviewing† | −0.17 | −0.16 | 0.32 | 0.13 | 0.18 | 0.45 | 0.38 |
| Counselling*† | −0.19 | −0.14 | 0.35 | 0.10 | 0.16 | 0.32 | 0.32 |
| Rapport*† | −0.13 | −0.07 | 0.29 | 0.04 | 0.12 | 0.36 | 0.32 |
| Personal manner*† | −0.09 | −0.07 | 0.20 | 0.04 | 0.11 | 0.34 | 0.29 |
| Satisfaction‡ | −0.20 | −0.13 | 0.39 | 0.08 | 0.17 | 0.47 | 0.40 |

* Directly related to professionalism definitions. † 1–4 scale. ‡ 1–5 scale.

Table 4 Means comparisons by candidate gender (controlling for TOEFL)

| | Female | Male | Significance | Effect size |
|---|---|---|---|---|
| Interviewing† | 3.03 | 2.97 | P < 0.01 | 0.21 |
| Counselling*† | 2.91 | 2.88 | P < 0.01 | 0.08 |
| Rapport*† | 3.07 | 2.98 | P < 0.01 | 0.28 |
| Personal manner*† | 3.12 | 3.06 | P < 0.01 | 0.19 |
| Satisfaction‡ | 3.62 | 3.52 | P < 0.01 | 0.24 |

* Directly related to professionalism definitions. † 1–4 scale. ‡ 1–5 scale.

Table 5 Mean ratings by primary case category

| | Abdominal | Chest | Constitutional | Miscellaneous | Neurological |
|---|---|---|---|---|---|
| Interviewing† | 3.00 | 3.09 | 3.06 | 2.97 | 2.88 |
| Counselling*† | 2.90 | 3.01 | 2.96 | 2.95 | 2.66 |
| Rapport*† | 3.01 | 3.10 | 3.09 | 3.00 | 2.91 |
| Personal manner*† | 3.09 | 3.15 | 3.17 | 3.08 | 2.93 |
| Satisfaction‡ | 3.35 | 3.79 | 3.65 | 3.51 | 3.48 |

* Directly related to professionalism definitions. † 1–4 scale. ‡ 1–5 scale.

The mean IPS/satisfaction ratings by case acuity are provided in Table 6.

Based on the ANOVA, there were significant differences in mean IPS and satisfaction ratings as a function of the acuity of the case. On average, candidates received lower IPS and satisfaction ratings if they encountered a patient with an acute condition. The mean satisfaction rating for subacute cases was 0.28 SDs higher than that for acute cases.

DISCUSSION

The reliability of the individual professionalism-related component scores was found to be moder-ate, ranging from 0.61 to 0.70. There was, however,some variability in the ratings as a function of whichSP was performing the case. This would be expectedin that individual SPs, although extensively trainedin the assessment of humanistic attributes, wouldstill have varying expectations with respect to theprovision of health care. Here, especially if theassessment of professional attributes has some highstakes consequences, it may be necessary to adjustratings, as is currently done for the ECFMG CSA,based on the leniency or stringency of the particularassessor. Nevertheless, provided that there are suffi-cient numbers of patient encounters, it is stillpossible to get reasonably precise estimates ofgeneral attributes related to professionalism.Although the CSA does not, and was not intendedto, measure all attributes related to professionalism,there are certainly some parts of the interpersonalskills and patient satisfaction ratings that can beused to reliably assess certain parts of the broaderdomain.

The relationships between IPS/patient satisfaction ratings and external variables provide discriminant and criterion-related evidence to suggest that valid measures of professional attributes are being obtained. Certainly, based on the constructs being measured, one would not expect professionalism to be highly related to basic science or clinical science performance. However, one would anticipate that interviewing and counselling skills, and rapport, would be related to English ability, as evidenced by the correlations with TOEFL scores. In terms of internal CSA measures such as data gathering, one would anticipate that candidates with better interviewing skills would obtain more information, resulting in higher checklist performance. Similarly, candidates who obtain more relevant information, perhaps as a function of more advanced humanistic and professional skills, would be better prepared to summarise their findings in the clinical note.

We found significant differences in the IPS/patient satisfaction ratings afforded to male and female candidates. Gender differences have, however, been noted in the literature.11,13–15,20–22 In addition, for SP assessments, there is little evidence to suggest that these ratings are biased.21,23–27 Our data would suggest that female candidates are better able to display behaviours related to professionalism, at least in the simulated environment. For future studies it would be informative to gather more specific data, over encounters, relating to the particular behaviours associated with professionalism, or non-professionalism (e.g. presence or absence of empathetic comments). This information could be used to explore the relationship between patient conditions, or case content, and specific humanistic attributes. Unfortunately, the global nature of the rating schema used for the CSA does not allow for the categorisation of specific behaviours. Here, it would be useful to rate candidates on more specific actions. For example, one could quantify respectfulness during the physical examination (draping), use of open-ended questions, verification of information, etc. and use these measures to delineate specific differences between male and female performance.
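One way such behaviour-level data might be recorded and summarised is sketched below. This is a hypothetical schema, not an instrument used in the CSA: the record fields, the function name and the example records are all invented for illustration, and the three behaviours are simply those named above (draping, open-ended questions, verification of information).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EncounterRecord:
    """One SP encounter scored on specific, observable behaviours (hypothetical schema)."""
    candidate_gender: str        # e.g. "F" or "M"
    draped_respectfully: bool    # respectful draping during the physical examination
    open_ended_questions: int    # count of open-ended questions asked
    verified_information: bool   # candidate checked information back with the patient


def summarise_by_gender(records):
    """Per gender: rate of each yes/no behaviour and mean count of open-ended questions."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.candidate_gender].append(rec)
    summary = {}
    for gender, recs in groups.items():
        n = len(recs)
        summary[gender] = {
            "n_encounters": n,
            "draping_rate": sum(r.draped_respectfully for r in recs) / n,
            "mean_open_ended": sum(r.open_ended_questions for r in recs) / n,
            "verification_rate": sum(r.verified_information for r in recs) / n,
        }
    return summary


# Made-up records purely to exercise the function; not data from the study.
example = [
    EncounterRecord("F", True, 3, True),
    EncounterRecord("F", True, 1, False),
    EncounterRecord("M", False, 2, True),
    EncounterRecord("M", True, 2, True),
]
for gender, stats in sorted(summarise_by_gender(example).items()):
    print(gender, stats)
```

Recording discrete behaviours in this way would allow group comparisons on each behaviour separately, which the global rating scales do not support.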

Ratings of professional attributes also varied as a function of characteristics of the clinical encounter. The 4 interpersonal skills ratings were lowest for neurological/psychological cases. It is reasonable that candidates would have more difficulty counselling and discussing treatment options with patients who display neurological or psychological deficiencies such as dementia or panic attacks. Establishing rapport with these types of patients may also be more challenging than with mentally stable persons. In addition, it would appear to be more problematic for candidates to show their professional attributes while interacting with a patient with an acute manifestation. Because almost all cases involving a patient in moderate to severe pain are labelled acute, this finding could be attributed to the challenge of relating to a patient in an empathic manner and conducting a respectful and compassionate physical examination when pain is an issue. Overall, the noted differences in ratings as a function of case characteristics provide some evidence to support the validity of the CSA for measuring certain professional attributes.

Table 6 Mean dimension ratings by case acuity

                   Acute   Chronic   Sub-acute
Interviewing†      2.95     3.03       3.02
Counselling*†      2.82     2.93       2.93
Rapport*†          2.99     3.00       3.06
Personal manner*†  3.06     3.08       3.11
Satisfaction‡      3.46     3.49       3.71

* Directly related to professionalism definitions. † 1–4 scale. ‡ 1–5 scale.

Compared to other means of assessment such as surveys and direct observation, SP assessments, such as the CSA, provide several advantages for assessing professional behaviours. In the CSA, doctors are systematically rated while working up a variety of patients with a range of complaints. Here, all doctors are assessed under the same set of standardised conditions. Additionally, the person carrying out the assessment of professional behaviour is the one who has the most personal perspective, the patient. Nevertheless, regardless of the types of encounters modelled, the training of the SPs, or the particular rating rubrics employed, it is impossible to measure all aspects of professionalism. For example, a doctor may appear to be altruistic yet have motives that are essentially selfish. The primary disadvantage of using SPs to rate professional behaviours is the simulated context for the assessment. The simulated setting is not only costly and logistically difficult to fabricate, but also relies on the expectations and health beliefs of a cohort of trained patients. Although there has been some research to suggest that doctors will perform in the same way with real and simulated patients,28 socially desirable response sets are certainly possible within an examination environment. The CSA, in particular, is a somewhat artificial setting for the evaluation of professionalism in that all its encounters involve the doctor meeting the patient for the first time, whereas with real patients a doctor can build up rapport over numerous visits. In addition, there are relatively few cases where ethical issues (e.g. maintaining confidentiality) are addressed. Despite these drawbacks, SP assessments can provide a reasonably good setting for the evaluation of select behaviours. While some aspects of professionalism are best measured using other techniques and assessment instruments, our analyses suggest that certain characteristics of professionalism can be reliably and validly measured as part of the performance-based CSA.

Contributors: MVZ and JRB designed the study and wrote the paper jointly. JRB and DM performed the data analysis. JJN reviewed the manuscript for intellectual content regarding the assessment of professionalism.

Acknowledgements: none.

Funding: this study was funded internally by the Educational Commission for Foreign Medical Graduates.

Conflicts of interest: none.

Ethical approval: not required.

REFERENCES

1 ACGME Outcome Project. http://www.acgme.org. 2003.

2 Ong LM, de Haes JC, Hoos AM, Lammes FB. Doctor–patient communication: a review of the literature. Soc Sci Med 1995;40 (7):903–18.

3 Beck RS, Daughtridge R, Sloane PD. Physician–patient communication in the primary care office: a systematic review. J Am Board Fam Pract 2002;15 (1):25–38.

4 Arnold L. Assessing professional behaviour: yesterday, today and tomorrow. Acad Med 2002;77 (6):502–15.

5 American Board of Internal Medicine. Project Professionalism. 2001;1–41.

6 Medical School Objectives Project. Learning objectives for medical student education – guidelines for medical schools. Report 1 of the Medical School Objectives Project. Acad Med 1999;74 (1):13–8.

7 Misch DA. Evaluating physicians’ professionalism and humanism: the case for humanism ‘connoisseurs’. Acad Med 2002;77 (6):489–95.

8 Ginsburg S, Regehr G, Hatala R et al. Context, conflict, and resolution: a new conceptual framework for evaluating professionalism. Acad Med 2000;75 (10):6–11.

9 Arnold EL, Blank LL, Race KE, Cipparrone N. Can professionalism be measured? The development of a scale for use in the medical environment. Acad Med 1998;73 (10):1119–21.

10 Clack GB, Head JO. Gender differences in medical graduates’ assessment of their personal attributes. Med Educ 1999;33 (2):101–5.

11 Hojat M, Gonnella JS, Mangione S et al. Empathy in medical students as related to academic performance, clinical competence and gender. Med Educ 2002;36 (6):522–7.

12 Colliver JA, Willis MS, Robbs RS, Cohen DS, Swartz MH. Assessment of empathy in a standardised patient examination. Teach Learn Med 1998;10 (1):8–11.

13 Hojat M, Gonnella JS, Nasca TJ, Mangione S, Veloski JJ, Magee M. The Jefferson Scale of Physician Empathy: further psychometric data and differences by gender and specialty at item level. Acad Med 2002;77 (10):58–60.

14 Hojat M, Gonnella JS, Nasca TJ, Mangione S, Vergare M, Magee M. Physician empathy: definition, components, measurement and relationship to gender and specialty. Am J Psychiatry 2002;159 (9):1563–9.


15 Barnsley J, Williams AP, Cockerill R, Tanner J. Physician characteristics and the physician–patient relationship. Impact of sex, year of graduation and specialty. Can Fam Physician 1999;45:935–42.

16 Bylund CL, Makoul G. Empathic communication and gender in the physician–patient encounter. Patient Educ Couns 2002;48 (3):207–16.

17 Prislin MD, Lie D, Shapiro J, Boker J, Radecki S. Using standardised patients to assess medical students’ professionalism. Acad Med 2001;76 (10):90–2.

18 Singer PA, Cohen R, Robb A, Rothman A. The ethics objective structured clinical examination. J Gen Intern Med 1993;8 (1):23–8.

19 Educational Commission for Foreign Medical Graduates. Clinical Skills Assessment (CSA) Candidate Orientation Manual. Philadelphia: Educational Commission for Foreign Medical Graduates 2002.

20 Colliver JA, Marcy ML, Travis TA, Robbs RS. The interaction of student gender and standardised patient gender on a performance-based examination of clinical competence. Acad Med 1991;66 (9):31–3.

21 Chambers KA, Boulet JR, Furman GE. Are interpersonal skills ratings influenced by gender in a clinical skills assessment using standardised patients? Adv Health Sci Educ Theory Pract 2001;6 (3):231–41.

22 Roter DL, Hall JA, Aoki Y. Physician gender effects in medical communication: a meta-analytic review. JAMA 2002;288 (6):756–64.

23 Colliver JA, Vu NV, Marcy ML, Travis TA, Robbs RS. Effects of examinee gender, standardised patient gender, and their interaction on standardised patients’ ratings of examinees’ interpersonal and communication skills. Acad Med 1993;68 (2):153–7.

24 Rutala PJ, Witzke DB, Leko EO, Fulginiti JV. The influences of student and standardised patient genders on scoring in an objective structured clinical examination. Acad Med 1991;66 (9):28–30.

25 Gispert R, Rue M, Roma J, Martinez-Carretero JM. Gender, sequence of cases and day effects on clinical skills assessment with standardised patients. Med Educ 1999;33 (7):499–503.

26 Furman G, Colliver JA, Galofre A. Effects of student gender and standardised patient gender in a single case using a male and a female standardised patient. Acad Med 1993;68 (4):301–3.

27 Rothman AI, Cohen R, Ross J, Poldre P, Dawson B. Station gender bias in a multiple-station test of clinical skills. Acad Med 1995;70 (1):42–6.

28 Whelan GP, McKinley DW, Boulet JR, Macrae J, Kamholz S. Validation of the doctor–patient communication component of the Educational Commission for Foreign Medical Graduates Clinical Skills Assessment. Med Educ 2001;35 (8):757–61.

Received 3 October 2003; editorial comments to authors 25 November 2003; accepted for publication 9 March 2004

© Blackwell Publishing Ltd MEDICAL EDUCATION 2005; 39: 20–29
