Evaluating portfolio assessment systems: what are the appropriate criteria?

Christine Webb, Ruth Endacott, Morag A Gray, Melanie A Jasper, Mirjam McMullan and Julie Scholes

Purpose. The purpose of this paper is to discuss how portfolio assessment processes should be evaluated.

Background. Articles in the nursing literature discuss the use of validity and reliability as criteria for evaluating portfolio assessment processes, and recommendations include tighter specification of grading criteria, a standardized national approach to assessing clinical competence in nursing students, and inter-rater reliability checks. On the other hand, some general practitioner educators suggest that these may not be the appropriate criteria because the nature of the evidence in portfolios is descriptive and judgement-based rather than quantifiable.

Method. Drawing on multi-method case study data from a recent study evaluating the use of portfolios in the assessment of learning and competence in nursing education in England, we suggest that criteria developed to evaluate qualitative research may be more appropriate for evaluating portfolio assessment processes.

Discussion. Multiple sources of evidence from the varied perspectives of students, teachers, practice assessors and external examiners are tapped as part of the portfolio assessment process. Tripartite meetings between students, teachers and clinical assessors to review placements are crucial in verifying both the written evidence and students' ability to communicate and critically analyse their performance. The variety of evidence collected would potentially allow monitoring, using qualitative research evaluation criteria, both of the portfolios themselves and the systems by which they are monitored and evaluated. However, not all this information is collected consistently and systematically, as called for in curriculum documents.

Conclusions. Use of qualitative research evaluation criteria offers a potentially productive way forward in evaluating portfolio assessment processes but some aspects of current practice need to be tightened, particularly double marking, internal moderation and external examining.

© 2003 Elsevier Ltd. All rights reserved.

Christine Webb BA, MSc, PhD, RN, RSCN, RNT, Professor of Health Studies, Institute of Health Studies, University of Plymouth, 44 Chandlers Walk, Exeter EX2 6AS, UK. Tel./fax: +01392-426321; E-mail: profcwebb@blueyonder.co.uk

Ruth Endacott MA, PhD, RN, Cert Ed, Dip N, Professor of Clinical Nursing, La Trobe University, P.O. Box 199, Bendigo, Victoria, Australia

Morag A Gray PhD, MN, Dip CNE, Cert.Ed, RGN, RCNT, RNT, ILTM, Head of Curriculum Development, Reader and Senior Teaching Fellow, Faculty of Health and Life Sciences, Napier University, 74 Canaan Lane, Edinburgh EH9 2TB, UK

Melanie A Jasper BNurs, BA, MSc, PhD, RGN, RHV, RM, NDN Cert, PgCEA, ILTM, Principal Lecturer, Institute of Medicine, Health and Social Care, University of Portsmouth, St George's Building, 141 High Street, Portsmouth PO1 2HY, UK

Mirjam McMullan MSc, BSc, SRCh, Research Assistant, Institute of Health Studies, University of Plymouth, Plymouth PL4 8AA, UK

Julie Scholes MSc, D.Phil, RN, Professor of Nursing, Centre for Nursing and Midwifery Research, University of Brighton, Westlain House, Falmer, Brighton BN1 9PT, UK

(Requests for offprints to CW)

Manuscript accepted: 17 June 2003

Nurse Education Today (2003) 23, 600–609. doi:10.1016/S0260-6917(03)00098-4


Introduction

The purpose of this article is to discuss implications of our study evaluating the use of portfolios in the assessment of learning and competence in nursing, midwifery and health visiting on behalf of the former English National Board for Nursing, Midwifery and Health Visiting. Our purpose here is to provide a conceptual discussion of evaluating the portfolio assessment process; therefore, the article does not use the conventional format of a research report, and full details of the study may be found elsewhere (Endacott et al. 2002). In summary, following a national survey of English Higher Education Institutions (HEIs), case studies of four were carried out using a variety of data collection strategies: focused interviews with nurse teachers, clinical assessors and students, non-participant observation in HEIs and clinical areas, and document analysis. Both pre- and post-registration programmes were studied, and diploma and baccalaureate levels were included.

The article first considers debates in the literature about the validity and reliability of assessment by portfolio in nursing and general practice education, and suggests that criteria for evaluating qualitative rather than quantitative research may be more appropriate because of the nature of the evidence in portfolios. It then draws on data from the study to illustrate how portfolio assessment processes were conducted in the case study HEIs, and concludes by suggesting what changes might be needed if the proposed alternative criteria were to be adopted.

Conceptualising rigour in portfolios

A portfolio, as it is used for formative and summative assessment in UK nurse education, may be defined as:

a collection of evidence, usually in written form, of both the products and processes of learning. It attests to achievement and personal and professional development, by providing critical analysis of its contents. (McMullan et al. 2003)

The data collected are usually descriptive (reflective accounts, statements of evidence to support claims for skill achievement, etc.) and judgements made on competence and learning are at best at the ordinal level, such as pass/refer/fail. The validity and reliability of this method of assessment have been much debated, particularly in the UK, where portfolios are the required approach based on research by Bedford et al. (1993) carried out for the previous English National Board for Nursing, Midwifery and Health Visiting (now replaced by the Nursing and Midwifery Council).


Validity and reliability of portfolio assessment

Validity may be defined as the measurement of what is claimed to be measured, while reliability refers to constancy of measurement. (Carter 1991, pp. 179–180)

Nurse education

Questions of the validity and reliability of portfolios as a tool for summative clinical assessment of nursing students are frequently raised in the literature, but rarely are attempts made to resolve the issues involved. Most recently in the UK, Calman et al. (2002) have published an article based on their study of assessing practice in student nurses for the National Board for Nursing, Midwifery and Health Visiting for Scotland. The research brief included assessing the 'methods of measuring progress in achieving competence... for reliability and validity' (p517). None of the seven Scottish higher education institutions (HEIs) studied used any formal checks on validity and reliability, although some reported an intention to do so using questionnaires to assess student and mentor satisfaction with the portfolio method.

In a review of the assessment of practice for the English National Board for Nursing, Midwifery and Health Visiting, Gerrish et al. (1997) also raise major concerns about validity and reliability, and point to differences in the evidence of achievement in written accounts in student portfolios and in observation by mentors, and also highlight the role of the internal monitoring panels and external examiners in this process. However, a variety of terms was used in various ways by educators to refer to accuracy, internal coherence, defensibility and pragmatic usefulness of assessment strategies, and Gerrish et al. conclude that the interpretation and application of grading criteria need further examination.

Several writers have described attempts to devise grading criteria for assessing portfolios in nurse education. Jasper (1995) gives a table of 'portfolio marking criteria at level two' (equivalent to the second year of a three-year undergraduate/first degree/bachelor level programme) to illustrate practice in a post-registration programme at a UK university, and each of the 10 criteria shown is marked against a 10-point scale in which 0–3 indicate 'little or no evidence' and 7–10 'evidence of the criterion being met in all its aspects' (p84). However, these are essentially subjective pointers whose validity and reliability could only be judged if operational definitions of 'little or no evidence', etc. were provided. This is implied in Jasper's statement that:

However, the value of the portfolio lies in the nature of the process, rather than the end product per se. The portfolio itself merely documents the process, rather than supplying any measure of quality. Quality and standards of practice need to be verified in another way, namely in this case, by a practice assessment document. The sufficiency of the knowledge base can be tested via an examination (p85).

This suggests that clinical skills and performance in practice settings should be assessed separately from the reflective aspects recorded in a portfolio.

Karlowicz (2000, p82), writing from the United States of America, considers that attempts to demonstrate the validity of portfolio scoring systems fail because of:

Vague terminology in the definition of scores, complexity of the scoring rubric, lack of understanding of the levels of student performance, and a lack of agreement regarding characteristics most valued by the evaluators. (p83)

As a result, it is not possible to show that portfolio scores correlate with those on other tests such as the standardized examinations used at state level in the US for registration as a nurse.
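To make the kind of criterion-related check described by Karlowicz concrete, the minimal sketch below, written in Python with entirely hypothetical scores rather than data from that study, computes the Pearson correlation between a set of portfolio scores and standardized examination scores; a coefficient near zero would correspond to the absence of association that Karlowicz reports.

import math

def pearson_r(xs, ys):
    # Pearson correlation between two equal-length lists of scores.
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical portfolio marks and licensure-style examination scores for eight students
portfolio_scores = [62, 70, 55, 80, 74, 68, 59, 77]
exam_scores = [71, 65, 60, 69, 75, 58, 73, 66]
print(round(pearson_r(portfolio_scores, exam_scores), 2))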

In summary, issues of validity and reliability in the assessment of student nurse clinical competence using portfolios have been discussed widely in the literature in the UK, USA and Australia, but solutions to the problems raised are underdeveloped.

General practice education in medicine

Three groups of workers in medical education in the UK have attempted to take forward portfolio use as part of developing general practice in a more patient-centred direction that includes communication skills enhancement.

Mathers et al. (1999) conducted a mixed method evaluation of portfolio use in a continuing medical education programme for GPs in Scotland. They attempted to assess the scheme's efficiency and effectiveness, but found these to be 'rather blunt instruments when it came to establishing the finer detail of the actual learning processes' (p527). Although participants considered that the scheme was valid, Mathers et al. conclude that:

There is clearly a need to develop objective outcome measures for each part of the process and criteria other than self report for evidence of the application of learning to practice (p529).

Snadden (1999) considers that educators using portfolios for assessment need 'robust methods', but questions whether efforts to use the concepts of validity and reliability constitute an attempt to 'measure the unmeasurable':

In so far as we are dealing in health care education with non-standardized situations, then standardized measurements may not be appropriate and we are hindered by the 'reductionist philosophy that underpins our discipline' (p479).

Snadden concludes that a 'mental shift' is needed to 'a more holistic approach to assessment' that fits with a portfolio approach to gathering evidence.

This conclusion has also been reached by Pitts and colleagues, via a series of articles tracing the progress of their attempts to study and improve assessor reliability in portfolio assessment by general practice trainers. An evaluation in 1999 of inter-rater reliability (i.e., consistency) of assessor judgement concluded that safe summative judgements were not made (Pitts et al. 1999). Levels of reliability resembled those of other methods of assessment and the development of a portfolio assessment guide did not remove the subjective element. Pitts et al. conclude, as do Mathers et al. (1999), that:


Any assessment method is only ever a compromise. There may be a danger in trying to force portfolios into a technical-rational world as a political expedient, when changing the culture to one where professionals are encouraged to learn, develop and change themselves should be the aim. (Pitts et al. 1999, p428)

Thus, 'any assessment method is a compromise' and 'traditional' approaches to assessing portfolios are likely to prove inappropriate (p428).

In a later study, Pitts et al. (2002) used statistics to compare initial individual assessor judgements with those reached after discussion between random pairs of assessors. This discussion led to improved reliability from fair (κ = 0.26) to moderate (κ = 0.5). However, the judgements in question were only about whether a student should pass or fail. Nevertheless, the authors conclude that inter-assessor discussion may be a way forward in 'making explicit the otherwise unstated assumptions, beliefs and values of the assessors' (p200) and moving away from positivist and reductionist assumptions that do not fit with a portfolio approach to assessment. Such debate might well reflect an ideal position but may not be sufficiently realistic to provide a pragmatic solution in the face of large student numbers in a three-year nursing programme and also the extensive volume of evidence presented in these portfolios.
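The kappa statistic cited here expresses agreement between two assessors, corrected for the agreement expected by chance. The minimal sketch below, written in Python with invented pass/fail decisions rather than data from Pitts et al., illustrates how values such as κ = 0.26 or κ = 0.5 are typically computed; it shows the statistic itself, not the procedure Pitts et al. used.

from collections import Counter

def cohens_kappa(rater_a, rater_b):
    # Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_o is the observed
    # proportion of agreement and p_e is the agreement expected by chance
    # from each rater's marginal pass/fail proportions.
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((counts_a[c] / n) * (counts_b[c] / n)
              for c in set(counts_a) | set(counts_b))
    return (p_o - p_e) / (1 - p_e)

# Hypothetical summative decisions by two portfolio assessors on eight students
assessor_1 = ['pass', 'pass', 'fail', 'pass', 'fail', 'pass', 'pass', 'fail']
assessor_2 = ['pass', 'fail', 'fail', 'pass', 'pass', 'pass', 'pass', 'pass']
print(round(cohens_kappa(assessor_1, assessor_2), 2))

On the interpretive bands in common use, values between 0.21 and 0.40 are read as fair agreement and values between 0.41 and 0.60 as moderate, which is how the figures cited above are labelled.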

However, it is difficult to envisage what conventional tests of validity and reliability could be brought to bear on this kind of data. The concepts of face and content validity may be appropriate, but there is no 'gold standard' against which 'higher' levels of validity could be judged. Where this was attempted in the USA by comparing portfolio outcomes with standardized examination results (Karlowicz 2000), a correlation was not found. Reliability testing would require more than one rater to assess students and their 'scores' to be compared statistically, as attempted by Pitts et al. (1999, 2002). Repeated testing of the same student over time to evaluate score stability is clearly impossible because of maturational effects. Because of the nature of the data and inevitable reliance on professional judgement, this does not seem a useful approach. Therefore, different criteria may be needed.

Alternative conceptualisations of rigour in portfolios

Discussing evaluation of qualitative research, Denzin and Lincoln (1998, p276) consider the approach of using criteria of evaluation as stemming from a positivist methodology. A postpositivist position also calls for criteria of evaluation for qualitative research, but argues that these should be different in this 'alternative paradigm'. Finally, in a poststructuralist approach, a different set of evaluation criteria for qualitative research would focus on subjectivity, emotionality and feelings.

The second, postpositivist approach of using evaluation criteria specifically developed for qualitative research may offer a way forward for evaluating the rigour of portfolios in nursing student assessment. The approach generally relies on establishing an 'audit trail' (Guba & Lincoln 1985, 1989) to demonstrate the trustworthiness of the research, and the suggested criteria are credibility, transferability, dependability and confirmability (see Fig. 1 for definitions of these terms).

Elaborating on these concepts, Morse (1989, pp76–77) discusses 'criteria of adequacy and appropriateness of data', 'verification using secondary informants', 'use of multiple raters' and sufficient documentation of the 'audit trail' to allow stakeholders to reconstruct the process by which the conclusions were reached. In the case of portfolios, using such an approach might demonstrate that the evidence contained within them is sufficiently rigorous to demonstrate student competence.

In interpreting the findings of our recent study, we attempted to take forward ideas about the inappropriateness of 'traditional' evaluation criteria by applying notions of rigour developed for qualitative research methods.

The reality of assessment in the study

We mapped our data against ideas about evaluating rigour in qualitative research, as shown in Fig. 1. This demonstrates that the case study HEIs were all gathering a variety of evidence in their portfolio assessment processes that could be used to support claims of rigour using criteria proposed for qualitative research. However, the adequacy of some of this evidence must be explored further. In the evidence below, the initial capital letter (A, B, C, D) refers to the case study site, and the following information identifies the data source.

Fig. 1 Categories of data from the study mapped against criteria for evaluating qualitative research (based on Guba & Lincoln 1985, 1989).

Monitoring the process

Practice varied in relation to who 'marked' portfolio content and whether there was systematic double-marking to check the dependability of marks awarded. For example, at case study B, a Community Practice Teacher considered that their role was simply to verify that the reflections were an accurate record of events rather than to make any statement on the quality of what was written:

The portfolio and the learning contract are presented to the assessor as something completely different. We do not clap eyes on the portfolio at all. It is totally separate. I didn't get to see my last student's portfolio let alone get the chance to read it!


At another site, the reflective account written by the student was marked separately in two parts, rather than two markers cross-checking:

On the placement the assessor signs off the portfolio including action plans, clinical skills, the factual part of the reflection. Then the portfolio comes to the personal tutor, who checks, counter-signs the reflective parts against Steinaker and Bell. (C/Pre-registration/Teacher)

Thus, the intended verification and use of multiple raters were not achieved.

Participants at case site D were committed to facilitating discussion about the assessment process and this was achieved through tripartite meetings that included the student, the assessor and the lecturer:

So at the tripartite we discuss how they (the student) have done and the practice assessor will ask them about their practice and about the skills but we are also checking their level of knowledge... we are checking that against their written work and their reflection. The student is discussing that and reflecting as they are talking and then we take the portfolio away. The assessor and I will agree whether they have met their outcomes and their practice and that they are safe practitioners who behave professionally. We will have seen the portfolio and their communication ability and that they are acting within the limits of competency... (D/Pre-registration/Teacher)

Well-conducted tripartite meetings such as these would allow confirmability of assessment decisions.

However, at another HEI the reality of tripartite meetings did not match the ideal spelled out in curriculum documents:

The tripartite meetings in practice are very difficult to manage within the student's final week. The link teachers have a large number of areas... it's a very tall order in reality. I would say we achieve 50% success. (C/Teacher)

At this site, a change had been made from personal teachers attending tripartite meetings to using link teachers (a teacher who liaised with a certain geographical or specialist clinical area). This still meant, however, very partial success, and where the meetings did take place it would be with a person whom the student might be meeting for the first time. This was not likely to promote the kind of discussion described in case study D above.

Moderation and external examining as safeguards of the rigour of portfolio assessment were mentioned in several articles discussed earlier. However, in the present study moderation took a variety of forms and was not always carried out as envisaged in curriculum documents. In case study D portfolios were:

Marked by the student's academic mentor and then subject to internal moderation, which occurs across pathways and perceived to be a valuable method of achieving consistency of marks. (D/Fieldnotes)

On the other hand at site C:

A random sample of portfolios comes to the moderation meeting. The assessors don't come. The moderation process is fraught with difficulty. (C/Teacher)

This difficulty seemed to be associated with heavy workloads in supervising and assessing large numbers of students, and the consequent difficulty in arranging meetings at which the appropriate teaching staff were able to be present.

Informal approaches were used in cases of doubt:

We have a team meeting monthly... we 'share' portfolios, sometimes 'blind'. If you are concerned about one you ask someone else for his or her opinion. (C/Teacher)

Here the moderation process seemed to be applied ad hoc to resolve difficult decisions, rather than being used systematically as a form of cross-checking on marking.

With regard to external examiner involvement, this took a 'hands off' form rather than being a systematic appraisal of the evidence in portfolios. At site C all portfolios were collected in when external examiners visited for the examination board, and they could make their own selection for monitoring. Lecturers would also draw attention to any problems of the kind discussed above. This also happened at sites A and B:

They would gather all the portfolios and theoretical assessments so the external examiner could sample them... They were not in a position to change marks but could comment about the overall parity and equity of marking and relative standards to their own institution. (B/Fieldnotes)

Sampling by teachers was used to select portfolios sent to external examiners at site D:

External examiners are sent 10% of everything, and all the 'refers' and 'fails'. (D/Teacher)

Teacher interviewees related this process to their own experience as external examiners, which allowed them to see the deficiencies in mechanisms and how these might be improved:

Their job is to verify that we are actually doing the process correctly. I used to be an external myself. I would go through a sample of clinical documentation, and I would find, even though it had been moderated and looked at by the personal tutor (that)... the whole thing is fraught with holes... The whole problem comes down to volume. (B/Education Manager)

In some cases, all relevant portfolios were collected together for sampling by external examiners on the day they came for the examination board meeting. The short amount of time available then allowed only small samples or particularly problematic portfolios to be scrutinized.

However, in some systems it was possible for external examiners to participate in or even change decisions:

External satisfied with the number of portfolios she is sent. Over the time that she has been external examiner she has seen an improvement in the level of reflection and the student guidelines have evolved to become much more consistent. In this exam board as a result of their portfolio, 5 students had their classification raised and 2 had their classification reduced. Of the 5 who had their classification raised, 3 were raised into 1st class honours. (D/Examination Board/Fieldnotes)

Thus, although processes described in curriculum documents were intended to allow the credibility, transferability, dependability and confirmability of portfolio evidence to be established, in reality these were not implemented as intended.

The bottom line – can students fail their portfolio?

The 'bottom line' in using any method of assessment is whether it allows discrimination between students who should pass and those who should fail. In all case studies it emerged that the formative aspects of portfolios were used to identify students who were 'struggling' in various ways so that, when it came to the summative point, the problems would have been overcome:

It is unlikely that any student would fail the portfolio per se. The process of using the portfolio may help the (practice assessor) or lecturer to identify a student who was struggling, but the whole process of using the portfolio incrementally, developmentally within a supportive relationship should ensure that a student does not hand in a piece of work that will not pass. (A/Field note)

For pre-registration programmes, there was evidence from site D that the portfolio could be used to 'fail' a student:

I think it picks up the weak students. It picks up those who don't have the knowledge in the wards and those who do well. It's good feedback. (D/Teacher)

However, in Case Study C, no example could be identified by the staff and students interviewed of a student who had failed and had to leave the programme because of portfolio deficiencies. If a section had not been achieved, the student would be required to make it up in the next placement:


She didn't get all the statements [clinical skills] and was referred. She carried them over to the next year. (C/Pre-registration/Student/2nd year)

When asked about the validity and discriminating power of the portfolio in an interview, the Leader for Learning Disabilities at site C said:

Even quite poor students seem to achieve the minimum.

A teacher at site C was also unable to recall an instance of failure in practice, and was unable to say what would happen in a case of repeated failure, but on different elements:

Researcher: Would it be possible then for a student to be discontinued on the basis of the practice assessment of the portfolio? Has that ever happened?

Interviewee: I'm trying to think back to previous courses. No, because what would happen is that they would have to be referred on a particular element in two terms and that, as far as I'm aware, has not happened. Maybe for non-submission. I'm thinking of a couple of students who discontinued. But the lack of submission of the practice portfolio was only one of the elements in a catalogue of things that would have led to dismissal anyway. One was to do with some quite serious charges of professional misconduct and the other one was generally, sort of, non-attending, non-compliance with a whole range of the assessment elements.

Thus, elements of the portfolio, including those assessing clinical performance, could be used to identify an unsatisfactory student, but other criteria would come into play; it would not be the portfolio itself that was decisive, and it was unlikely that any student would fail the portfolio per se. The process of using the portfolio may help the assessor or teacher to identify a student who is struggling, but the whole process of using the portfolio incrementally and developmentally within a supportive relationship should ensure that a student did not hand in a piece of work that would not pass. The university teacher role appeared to be a quality assurance one in terms of verifying completion and assessing consistency between assessors.


Discussion

Monitoring the rigour of portfolio use has to be done at two levels – that of the processes used to assess the portfolio itself and that of the overall monitoring processes of the assessment scheme.

With regard to the former – student assessment using the portfolio – there was widespread agreement by study participants that the tripartite meeting was crucial. Here, ideally, the student, personal teacher and placement mentor would meet together, consider the evidence presented in the portfolio, and have a real discussion about its contents and what had occurred on the placement. In this way the student could demonstrate their communication, reflective and analytical skills, the mentor could give feedback on past performance and guidance on future learning needs and verify that the events described had actually taken place, and the teacher could be assured of the rigour of the student's clinical assessment.

The relationship between student and practice assessor was also fundamental. Typically on community specialist practice programmes, this would be a one-to-one relationship lasting over a period of months, and the teacher would have observed continuously how the student worked and they would have reflected together on their work throughout the period. On midwifery programmes this close working relationship was also evident, although placement lengths were shorter. In hospital-based placements on pre-registration programmes, the nature of ward work meant that an on-going one-to-one relationship of this kind was not as common and thus this 'knowing' the student and its guarantee of the authenticity of what was written in the portfolio were not as easy to achieve. The large numbers of students on pre-registration programmes added to the difficulties of implementing the rigorous processes set out in curriculum documents. Our findings here echo those of Pitts et al. (2002), who report that discussions between assessors allow unstated assumptions and diverse views to be aired, leading to a consensus judgement about the evidence in a portfolio.


Grading criteria had been developed in all the cases studied, in some instances in an attempt to distinguish between performance at diploma and first degree level, or to award the class of a degree. They were based on a variety of frameworks, including those of Benner (1984) and Steinaker and Bell (1979). However, the problem of vagueness of criteria reported in the literature discussed earlier also applied to these.

At the level of monitoring the overall assessment scheme, any scheme or list of criteria for assessing the portfolio process is clearly only as rigorous as the contributing data. If double marking, moderating between markers and external examining are not carried out systematically, then they cannot guarantee the safety of the system. Here, our evidence showed considerable shortfalls in internal monitoring between teachers and systematic scrutiny of portfolios by external examiners. In some examples, there were very large numbers on the adult branch of pre-registration programmes but numbers on the other branches were much smaller. Nevertheless, this did not seem to mean that internal and external moderation were done more rigorously in smaller branches.

Evidence was being collected by all HEIs that could be used to evaluate their portfolio assessment schemes using qualitative research evaluation criteria. This approach is also suggested as a productive way forward by those who have attempted to improve rigour in general practice schemes (Mathers et al. 1999; Pitts et al. 2002; Snadden 1999). Audit trails were extensive, both in the portfolios themselves, and in curriculum documents and assessment handbooks. Evidence was triangulated and was verified by more than one informant. The basis for assessment of rigour was already laid, but some aspects – notably second marking, internal monitoring and external examining – were not necessarily being carried out as projected in curriculum documents. These aspects need to be applied more systematically, whether a qualitative or quantitative approach to quality assurance is used.

Conclusion

The nature of the data collected in portfolios means that it is not possible to apply the concepts of validity and reliability without close specification of detailed and objective criteria for grading this evidence. Reports in the literature suggest that grading criteria developed so far are too vague to eliminate subjectivity. However, in professional education it is inevitable that professional judgements are involved in assessing student competence because the activities being assessed in themselves involve the use of judgement. It is precisely this that underlies the use of portfolios to assess the process of a student's development of ability to make these clinical judgements as they progress through the education programme.

Portfolios usually contain qualitative rather than quantitative evidence and assessors make qualitative judgements about this evidence, taking into account what they have learned about the student in a more or less extensive relationship throughout the placement. It seems, therefore, that criteria for judging the worth of qualitative research are appropriate for judging the worth of portfolios themselves and the systems set up to monitor the overall assessment process. However, if these alternative criteria are to be used, then the evidence that feeds into the process must be systematically gathered. Our study suggests that there is some way to go in achieving this, particularly with regard to tripartite meetings, internal monitoring of portfolio assessment, and external examiners' contribution to overall scrutiny of the assessment schemes. However, the elements of a decision trail that currently exist in a form that could be used to evaluate the rigour of the portfolio assessment process are:

• Explicit marking/grading criteria
• Evidence from a variety of sources, including assessor observations from multiple or extended placements, skills checklists and students' reflective accounts
• Internal quality assurance processes, including double marking and moderation processes
• External quality assurance processes, including external examiner reports and national quality audit schemes

Acknowledgements

The study was funded by the former English National Board for Nursing, Midwifery and Health Visiting, whose responsibility for its oversight then passed to the Dept of Health. The views expressed in this paper are those of the authors and do not necessarily reflect the opinions of the funding bodies.

References

Benner P 1984 From Novice to Expert. Addison Wesley, Menlo Park, CA

Bedford H, Phillips T, Robinson J, Schostak J 1993 Assessing Competencies in Nursing and Midwifery Education. English National Board for Nursing, Midwifery and Health Visiting, London

Calman L, Watson R, Norman I, Redfern S, Murrells T 2002 Assessing practice of student nurses: methods, preparation of assessors and student views. Journal of Advanced Nursing 38(5): 516–523

Carter DE 1991 Descriptive research. In: Cormack DFS (ed). The Research Process in Nursing, 2nd ed. Blackwell Scientific Publications, Oxford, pp. 178–185

Denzin NK, Lincoln YS 1998 The art of interpretation, evaluation and presentation. In: Denzin NK, Lincoln YS (eds). Collecting and Interpreting Qualitative Material. Sage, Thousand Oaks, CA, pp. 275–281

Endacott R, Jasper M, McMullan M, Miller C, Pankhurst P, Scholes J, Webb C 2002 Evaluation of the use of portfolios in the assessment of learning and competence in nursing, midwifery and health visiting. Unpublished report to the Department of Health, Leeds

Gerrish K, McManus M, Ashworth P 1997 Levels of achievement: a review of the assessment of practice. English National Board for Nursing, Midwifery and Health Visiting, London

Guba E, Lincoln YS 1985 Effective Evaluation: Improving the Usefulness of Evaluation Results through Responsive and Naturalistic Approaches. Jossey Bass, San Francisco

Guba E, Lincoln YS 1989 Fourth Generation Evaluation. Sage, Newbury Park, CA

Jasper M 1995 The use of a portfolio in assessing professional education. In: Gibbs G (ed). Improving Student Learning Through Assessment and Evaluation. Oxford Centre for Staff Development, Oxford, pp. 78–87

Karlowicz KA 2000 The value of student portfolios to evaluate undergraduate nursing programs. Nurse Educator 25(2): 82–87

Mathers NJ, Challis MC, Howe AC, Field NJ 1999 Portfolios in continuing medical education – effective and efficient? Medical Education 33: 521–530

McMullan M, Endacott R, Gray MA, Jasper M, Miller CML, Scholes J, Webb C 2003 Portfolios and assessment of competence: a review of the literature. Journal of Advanced Nursing 41: 283–294

Morse JM 1989 Designing funded qualitative research. In: Denzin NK, Lincoln YS (eds). Strategies of Qualitative Inquiry. Sage, Thousand Oaks, CA, pp. 56–85

Pitts J, Coles C, Thomas P 1999 Educational portfolios in the assessment of general practice trainers: reliability of assessors. Medical Education 33: 515–520

Pitts J, Coles C, Thomas P, Smith F 2002 Enhancing reliability in portfolio assessment: discussions between assessors. Medical Teacher 24(2): 197–201

Snadden D 1999 Portfolios – attempting to measure the unmeasurable? (Commentary). Medical Education 33: 478–479

Steinaker NW, Bell MR 1979 The Experiential Taxonomy. Academic Press, New York
