Use of Criteria in Assessing Teaching Portfolios: Judgemental practices in summative evaluation



Kari Smith (a) and Harm Tillema (b)*

(a) Oranim Academic College, Israel and University of Bergen, Norway; (b) Leiden University, The Netherlands

How is the quality of teaching portfolios being assessed in summative assessment contexts? This

question is of special importance in the growing debate on standards and criteria in assessment. In

our study we looked for contexts in which portfolios are used for summative assessment purposes

and gauged the issues raised by assessors with respect to grading and judging performance. We

examine the appraisal of these portfolios by looking at the explicitness of guidelines/instructions/

framework used by 35 portfolio assessors in examining their actual evaluations of the teaching

portfolios in programs of teacher education for English as a Second Language. The study utilises

different tools to gauge the process of raising standards and appraisal of portfolios. Using a

questionnaire, conceptions of portfolio standards and criteria use were collected; selected

interviews were then used to comment on and elaborate the understanding of the quantitative data. The findings

were synthesised into a possible typology for clarification and improvement of actual assessment

practices of portfolios. In this way we attempted to make transparent how criteria are being

deployed.

Keywords: Portfolio; Assessment; Appraisal; Summative evaluation

*Corresponding author. Department of Education, Leiden University, PO Box 9555, NL 2300 RB

Leiden, The Netherlands. Email: [email protected]

Scandinavian Journal of Educational Research

Vol. 51, No. 1, February 2007, pp. 103–117

ISSN 0031-3831 (print)/ISSN 1470-1170 (online)/07/010103-15

© 2007 Scandinavian Journal of Educational Research

DOI: 10.1080/00313830601078696

Raising Standards in Portfolio Appraisal

Raising standards in the appraisal of portfolios is an issue of considerable importance

(Shephard, 2000). Assessors of teaching portfolios are increasingly faced with two

inherently contradictory forces that drive judgemental evaluations in actual

assessment practices. On the one hand there is the need for creating and complying

with standards for assessment that operate to promote uniformity of grading and a

transparent appraisal of performance in the expectation that this will serve to certify

achievement. On the other hand, critical questions are raised about separating

goals aimed at appraising achievement versus improving quality of student learning

and development (Cochran-Smith & Fries, 2002; Darling-Hammond & Snyder,

2000; Delandshere & Arens, 2003). The position of portfolio assessment, which is

widely used nowadays in teaching and teacher education (Burns, 1999), comes into

this debate by asking: do we improve its feasibility for achievement measurement

(complying with performance standards) in order to certify prospective and

experienced teachers? Or do we appraise achievements that signify the quality of

(monitoring) their development and learning (growth in professional expertise)?

Teachers’ portfolios for documentation of professional competence are increasingly

required as a regular condition for certification (in pre-service teaching education as

well as for advanced teaching certificates) (Zeichner & Wray, 2000). There is,

however, less research on the appraisal of portfolios and the judgemental processes

involved on the part of the assessors of portfolios, especially when they are used for

summative assessment purposes (Burns, 1999; Smith & Tillema, 2001). What

standards or criteria are applied, and what is the extent of agreement about the

utilisation of these criteria (Heilbronn, Jones, Bubb, & Totterdell, 2002; Zuzowsky &

Libman, 2002)?

The focus of the present study is the transparency of criteria use in appraisal of

portfolios in summative assessment contexts, i.e., explicitness in grading perfor-

mance to certify achievements (Burns, 1999; Cochran-Smith, 2001; Murray, 2001).

The context that gave rise to this study is the ambivalent position felt both by

assessors and those being assessed about grading the actual practical teaching

performance of pre-service student teachers of English in Israel and in the

Netherlands. The quality of rating and criteria use (Tillema, 2003; Zuzowsky &

Libman, 2002) is usually determined by some internal, mostly institution-specific

standards or benchmarks that prescribe how the assessors should appraise to meet certain

specifications. An earlier inventory of perceptions of grading and evaluation (Smith

& Tillema, 1998, 2001) showed substantial reluctance on the part of teacher

educators to maintain strict criteria in rating portfolios in order not to disrupt the

process of competence development in their students. These teacher educators

clearly recognise a dilemma between their position as assessor and mentor.

Furthermore, our previous data (Tillema & Smith, 2000) show a current lack of

explicitness in determining quality of portfolios due to heavy reliance on context and

circumstance in portfolio construction. Therefore, we recognise a need to clarify and

evaluate actual dilemmas in judgemental processes of grading portfolios in order to

achieve greater explicitness and transparency in use of criteria. The summative

appraisal process makes this need even more urgent because it is required to utilise

portfolio criteria as careful contextual benchmarks for the scrutiny of accomplish-

ments in light of explicitly stated achievement goals.

Teachers’ portfolios for summative evaluation of professional competence have become

increasingly common for certification in pre-service teacher education and for

advanced teaching certificates (Shulman, 1998; Tucker, Stronge, & Gareis, 2002;

Wade & Yarbrough, 1996). There is, however, less research on quality of assessment

of portfolios, and there seems to be confusion among teacher educators about how to


assess portfolios summatively (Burroughs, 2001; Smith & Tillema, 2003). This lack

of clarity regarding summative assessment stems, in our view, from the ambitious

assumption that the same portfolio can be and actually is used for learning

(professional development) purposes as well as summative assessment purposes

(Snyder, Lippincott, & Bower, 1998; Tillema & Smith, 2000; Zeichner & Wray,

2000). This causes a conflict of interest for the portfolio compiler, who is required to

be aware of external assessment criteria and standards when compiling the portfolio,

but also for the teacher educator as an assessor, who plays the dual role of being the

supporter of a professional development process as well as the judge of the final

portfolio product. This conflict may well affect the use and conflation of selected

criteria in the appraisal of a portfolio. Teacher educators as well as students of

teaching seem to be in need of a better understanding of how the portfolio is

beneficial not only as a learning and development tool, but also as an assessment

tool.

Summative Appraisal of Portfolios

Summative assessment of the portfolio for certification purposes is expected to be,

and should be, carried out in light of explicit standards for teaching. This requisite

explicitness of standards has positive as well as negative impacts on the judgemental

process of appraising portfolios in student teachers’ learning and teachers’

professional development. On the one hand it specifies what will be appraised and

to what degree one has accomplished certain agreed standards of performance; on

the other hand it narrows the range of permissible exemplifications of teaching

activity (Heilbronn et al., 2002; Zuzowsky & Libman, 2002). It is therefore of

interest to examine the assessors’ process of using standards and covering of criteria

to determine how they are interpreted and applied in concrete evaluations of

individual portfolios.

Scrutinising actual appraisal processes in summative assessment of portfolios in

greater depth presupposes the existence of a common core of standards on teaching

knowledge and skills in order to determine the criteria against which they can be

documented and appraised in the portfolio. The idea inherent to summative

assessment is that against such core competences actual accomplishments can be

evaluated through a judgemental process of criteria application. The portfolio, then,

is the tool used to provide evidence of the attainment of standards (Delandshere &

Arens, 2001; Tillema, 1998). In short, standards direct the content to be specified in

a portfolio which then needs to be appraised according to certain criteria to

determine the portfolio’s quality or worth. It is here that already marked differences

and even disagreement in the summative debate are to be found on what constitutes

core content or standards to be rated in the portfolio (Murray, 2001; Yinger &

Hendricks-Lee, 1998). An inventory of available Internet sites on criteria for

teaching portfolios would show enormous variety (Smith & Tillema, in press). At

best, one could argue that the need we recognise for reaching an agreement on


explicit criteria could spark off a useful dialogue in the profession; the different

perspectives on what constitutes the core of the teaching profession would contribute

to making deliberate choices. Agreement on core issues to be rated in a professional

portfolio certainly would serve the interest of an accountability, i.e., summative,

perspective (Cochran-Smith & Fries, 2002). Moreover, agreed standards to be

covered by the portfolio could serve as guidelines for teacher educators, assessors

and student teachers when actually working with the portfolio, as they provide a

focus for assessment which can be communicated to all stakeholders. And lastly, the

explicit standards serve as goals for professional development activities (Darling-

Hammond et al., 1998; Delandshere & Arens, 2001, 2003). A state of agreement on

standards and criteria is, however, yet to be found in the teaching profession

(Zuzowsky & Libman, 2002). On the contrary, in the literature we find strong

criticism of extensive use of standards for teaching (Yinger & Hendricks-Lee, 1998).

Some of the main concerns are:

• There is no consensus about the core teaching knowledge (Murray, 2001), which makes it impossible to introduce a prototype of portfolio for professional development and for summative assessment purposes.

• Standards lead to a narrow interpretation of teaching (Cochran-Smith, 2001), and teachers are discouraged from documenting their own initiatives and creativity in the portfolio if these do not align with the explicit standards.

• Teacher knowledge is non-cognitive knowledge (Van Manen, 1999), not only technical knowledge reflected in performances (Delandshere & Arens, 2001). Tacit knowledge of teaching based on experience, personal beliefs and values is a major part of teaching which cannot be documented in uniform portfolio entries.

• There is a lack of construct validity (Burroughs, 2001) with emphasis on performance and elimination of theoretical knowledge, which is a dangerous dualism (Delandshere & Arens, 2001). The current trend expressed in standards for teaching diminishes (and at times even ignores) the importance of theoretical knowledge of teaching and about teaching. Most portfolio frameworks put emphasis on the performance aspect of teaching and do not ensure documentation of underlying understanding based on solid theoretical knowledge of the more technical aspects of the profession. The balance between theory and practice has, as a result of the quest for standards, ended up placing too much weight on performance.

• Standards might lead to dangerous consequences (Apple, 2001) such as eliminating consideration of alternatives necessary for change (Delandshere & Arens, 2001). When portfolios for assessment are mainly directed by explicit standards, there is a danger that creative teachers refrain from presenting alternatives which disagree with the standards. However, a continuous flow of courageous alternatives is needed to ensure a dynamic educational system which is in constant search for reform.

• Elimination of differences in the way teaching is represented is reductionist (Delandshere & Arens, 2001). Teaching is contextual and what is considered “good” teaching in one setting is not necessarily the best approach in a different setting. Teachers differ in personalities, strengths and weaknesses, and a portfolio framework which does not allow for differences constrains teachers’ professional development if the portfolio entries are compiled in accordance with the explicit standards to ensure positive assessment.

• Standards are developed by likeminded volunteers (Delandshere & Arens, 2001). In situations where portfolios are used for summative assessment purposes in light of standards one needs to question who has decided on the standards and, furthermore, on the portfolio framework. The question concerning who is involved in the process of developing the portfolio framework becomes a crucial one.

These arguments and positions make it cumbersome to engage in a discussion on

shared and explicit criteria for grading, at least at the level of standards, i.e., the

content domains of achievement. Despite this current state of affairs in the debate on

standards (which, taken literally, would deny the possibility of summative

assessment) a direct concern in the actual appraisal of portfolios is how the

portfolios’ quality is being measured or evaluated as a product offered for scrutiny to

raters or assessors of the portfolios. How, then, we could ask, is the quality of the

documented accomplishments shown in the portfolio rated, i.e., what criteria are

being used as the “yardstick” against which the accomplishments presented in a

portfolio are evaluated? It is in this actual judgemental process itself that it is

determined in what way and to what degree certain standards are met, i.e., based on

the evidence presented. This judgemental process calls for a careful contextualised

appraisal on the part of the assessors in which criteria are constructed and applied

often in situ to value or give merit to a portfolio.

Therefore, the present study intends to examine how portfolios are assessed in

different summative contexts, by studying how the quality of a portfolio is merited in

actual judgemental practices. It is our objective to arrive at a typology of the distinct

uses of criteria in portfolio appraisal. Such a typology is likely to

contribute to a clearer understanding of portfolio use and thereby strengthen its

positive impact on the quality of teacher education.

The Study

Setting

The focus of this study is appraisal criteria used for summative portfolio assessment

of teaching practice in pre-service teacher education. As context for comparison we

have chosen two 4-year elementary school teacher education programmes for

teachers of English as a foreign language, one in Israel and one in the Netherlands.

The participants were 35 teacher educators who act as regular assessors of their

students’ teaching portfolios.


Design and Instruments

The study uses both a structured questionnaire and an open interview to gauge the

conceptions of assessors with regard to the use of criteria to rate their students’

portfolios. The questionnaire (see Appendix 1) revolved around three main issues

and contained 17 items with reference to:

(a) the purpose of a portfolio; to determine its usage as a summative or formative

tool, consisting of five items which were formulated as factual statements to be

rated on a 5-point Likert scale by agreement (fully agree to fully disagree),

(b) the process of grading a portfolio; to determine the main criteria used during the

appraisal process, consisting of seven items, which were rated by adherence to a

main focus in appraisal; and

(c) measuring the quality of a particular portfolio; to determine problems or issues

in establishing the final grading of an individual portfolio, which was rated on

five items representing main concerns of the assessor (fully applicable to not at

all applicable).
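The three-part structure of the questionnaire described above can be sketched as a small data structure. This is only an illustration: the item labels are borrowed from the row labels of Table 1 rather than from the item wording in Appendix 1, the class and function names are hypothetical, and the numeric coding of the scale endpoints is not stated in the text, so only the 1 to 5 range is checked.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Section:
    """One of the three questionnaire issues with its rated items."""
    issue: str
    scale_anchors: str       # wording from the text; numeric coding is not stated there
    items: Tuple[str, ...]   # short labels from Table 1, not the Appendix 1 wording


QUESTIONNAIRE = (
    Section(
        "Purpose of a portfolio",
        "fully agree to fully disagree",
        ("Establishing development", "Promoting learning", "Providing feedback",
         "Monitoring actual levels", "Documenting performance"),
    ),
    Section(
        "Process of grading a portfolio",
        "adherence to a main focus in appraisal",
        ("Self-development", "Prior knowledge", "Reflection",
         "Evaluation of course learning", "Establishing performance level",
         "Change of beliefs", "Certification"),
    ),
    Section(
        "Measuring the quality of a particular portfolio",
        "fully applicable to not at all applicable",
        ("Reliable", "Valid", "Clarity", "Authentic", "Giving evidence"),
    ),
)


def total_items(sections) -> int:
    """Count the items across all sections."""
    return sum(len(section.items) for section in sections)


def is_valid_rating(value: int) -> bool:
    """A response is valid if it is an integer on the 5-point scale."""
    return isinstance(value, int) and 1 <= value <= 5
```

Counting the items per section gives five, seven and five, which matches the 17 items reported for the questionnaire.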

Data were collected with a 5-point rating scale to examine the degree to which

assessors adhered to specific conceptions of criteria use. This inventory also included

selected interviews with 16 out of the 35 teacher educators who showed a typical

profile in answering the questionnaire items. They were selected for the interviews to

elaborate our understanding of the context and setting of criteria use. The purpose of

this phase of the study was to collect data on the variety of criteria use as well as the

grounded reasons and encountered problems assessors perceive in using these

criteria. The data were analysed to arrive at a typology of criteria use. The present

article discusses the findings in order to propose a tool for scrutinising the actual

practice of criteria use of assessors in teacher education.

Findings

No large demographic differences were found between the teacher educators from

the two countries, which enabled us to analyse the data as one group. Our

participants had on average 15 years of teaching experience, with at least 4.5 years as

an assessor of portfolios. On average, each assessor rated 24.5 student portfolios a

year. A salient difference between the Dutch and Israeli groups was the setting of

appraisal: Dutch assessors worked mainly in dyads while their Israeli colleagues rated

students’ portfolios individually.

The questionnaire data in Table 1 are presented under three main categories: the

purpose of grading portfolios, the process of utilising criteria, and issues in measuring

criteria levels and the quality of a portfolio.

As is apparent from Table 1, we find the highest importance attached to portfolio

appraisal as a Tool for Self-development (4.37) and Establishing Development

(4.14). The lowest scores were found for portfolio ratings which concern: Evaluation


of Course Learning (2.74) and Clarity in Rating (2.51) (sic). More specifically,

especially under the purpose category of criteria use in portfolio grading no large

variation was found, meaning all selected purposes are somehow relevant and under

scrutiny in appraisal of portfolios. However, under the categories: process and

quality of appraisal, greater targeting was found. The main issue under the process of

using portfolio criteria is determination of self-development and student’s reflection

while establishing performance levels and certification rating is rated somewhat

lower. Based on t-test differences found (see Table 1) it seems as if the Israeli

assessors give higher value to establishing performance levels (and reflection) than

their Dutch colleagues. Under the category of measuring quality we find most

divergence between our participants, but both groups are outspoken in their concerns about

giving authentic evidence in the portfolio. The main problem seems to be

establishing student attainments in an authentic way (i.e., avoiding a check-box

approach) and valid measurement. Assessors clearly find these issues problematic in

their own practice. Moreover, the data represented in Table 1 seem to indicate that

there may be a major dilemma present between, on the one hand, establishing actual

levels for certification in a reliable way and, on the other hand, using the portfolio for

supporting further development, reflection and learning.
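As a minimal sketch of how the summary figures discussed above relate to one another, the overall item means from Table 1 can be averaged per category and ranked. The values are copied from the table; the dictionary layout and function names are assumptions made for illustration, and the unweighted averaging of item means is our reading of how the category means were obtained, not a procedure stated in the text.

```python
from statistics import mean

# Overall item means as reported in Table 1 (5-point scale).
TABLE_1 = {
    "Purpose of grading": {
        "Establishing development": 4.14,
        "Promoting learning": 4.11,
        "Providing feedback": 3.49,
        "Monitoring actual levels": 3.83,
        "Documenting performance": 3.63,
    },
    "Process of rating": {
        "Self-development": 4.37,
        "Prior knowledge": 3.37,
        "Reflection": 3.71,
        "Evaluation of course learning": 2.74,
        "Establishing performance level": 3.11,
        "Change of beliefs": 3.00,
        "Certification": 3.40,
    },
    "Quality of measurement": {
        "Reliable": 3.11,
        "Valid": 3.57,
        "Clarity": 2.51,
        "Authentic": 4.03,
        "Giving evidence": 3.60,
    },
}

# Category means as printed in Table 1, kept for comparison.
REPORTED_CATEGORY_MEANS = {
    "Purpose of grading": 3.84,
    "Process of rating": 3.39,
    "Quality of measurement": 3.36,
}


def category_means(table):
    """Average the item means within each category, rounded to two decimals."""
    return {category: round(mean(items.values()), 2)
            for category, items in table.items()}


def ranked_items(table):
    """Return (item, mean) pairs over all categories, highest mean first."""
    pairs = [(item, value)
             for items in table.values()
             for item, value in items.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

Under this reading the recomputed category means agree with the printed ones, and the ranking places Self-development and Establishing development at the top and Evaluation of course learning and Clarity at the bottom, as described in the text.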

Table 1. Descriptive statistics inventory on portfolio criteria use (5-point Likert scale)

Category / item                   Overall mean   SD      Dutch assessors   Israeli assessors   t-test p
Purpose of grading                3.84
  Establishing development        4.14           .772    4.04              4.40                n.s.
  Promoting learning              4.11           .631    4.12              4.10                n.s.
  Providing feedback              3.49           1.040   3.40              3.70                n.s.
  Monitoring actual levels        3.83           .747    3.72              4.10                n.s.
  Documenting performance         3.63           .910    3.56              3.80                n.s.
Process of rating                 3.39
  Self-development                4.37           .690    4.32              4.50                n.s.
  Prior knowledge                 3.37           1.031   3.80              2.30                .001
  Reflection                      3.71           .926    3.60              4.00                .100
  Evaluation of course learning   2.74           1.197   2.92              2.30                n.s.
  Establishing performance level  3.11           1.132   2.72              4.10                .001
  Change of beliefs               3.00           .804    2.92              3.20                n.s.
  Certification                   3.40           .881    3.32              3.60                n.s.
Quality of measurement            3.36
  Reliable                        3.11           .758    2.88              3.70                .002
  Valid                           3.57           .850    3.40              4.00                .05
  Clarity                         2.51           .951    2.52              2.50                n.s.
  Authentic                       4.03           .822    3.80              4.60                .007
  Giving evidence                 3.60           .695    3.36              4.30                .002

The interviews which were conducted subsequently were meant to further illuminate and clarify these questionnaire findings. The interview questions revolved around two issues concerning the actual appraisal process of our assessors:

(a) their orientation towards justification or warranting the quality of the portfolio, which was categorised as:

• failure to judge a portfolio altogether
• use of the appraisal process for developmental learning purposes, or
• maintaining a strict use of benchmark criteria to judge (grade) the portfolio, and

(b) the style of mentoring or teaching adopted by the assessor. In the dual position of the assessor who is also a teacher educator of students, the following three positions can be categorised, i.e., as either

• instructional (teaching)-oriented
• relational (personal)-oriented
• situational (or goal)-oriented.

Table 2 presents some typical summary sentences from the interviews, categorised

under (a) the judgemental orientation of the assessors, and (b) the preferred

mentoring style to deal with student development. Table 2 also indicates (in bold)

the presumed criteria in use by the assessor under each category.

Since it was clear from these interviews that each of these 16 teacher educators

adhered to a certain justification belief and style of mentoring, we analysed whether

the Israeli and Dutch teacher context differed in this respect. For this purpose each

interviewed teacher educator was specifically asked to rate their position on a scale

constituted by the three positions for belief and style (and rated as strongly versus not

strongly adhering to a certain position). Differences between Israeli and Dutch

educators were found: t = −2.18, p < .03 (analysed as categorised responses: Mann-

Whitney U = 62.50, p < .05) for purpose and t = 2.09, p < .04 (Mann-Whitney

U = 64.50, p < .06) for process respectively. The Dutch teacher educators (with

mean 1.72 on a 3-point scale) were more often critical about applying judgemental

criteria while the Israeli teacher educators (mean 2.44) on the other hand, adhered to

a more judgemental view. With respect to style of mentoring, the Dutch teacher

educators were inclined to a more relational and situational style of mentoring (mean

2.16) while the Israeli colleagues were more oriented towards an instructional style

(1.86). The correlation between justification of criteria and style of mentoring was

substantial, −.56, indicating a negative relation between a critical attitude towards

appraisal and an instructional mentoring style. The more judgementally oriented

teacher educators preferred the more instructional mentoring style; this relation was

especially strong for the Israeli teacher educators.
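For readers less familiar with the statistics reported above, the following sketch shows how a Mann-Whitney U statistic (with midranks for tied ratings) and a Pearson correlation can be computed. It is not the authors' analysis: the individual ratings are not reported in the article, so the two small groups below are invented placeholders used only to make the functions runnable, and all names are hypothetical.

```python
from math import sqrt


def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two Mann-Whitney U values, using midranks for ties."""
    pooled = sorted(list(group_a) + list(group_b))
    midrank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i+1 .. j; assign their average rank.
        midrank[pooled[i]] = (i + 1 + j) / 2
        i = j
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(midrank[value] for value in group_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# HYPOTHETICAL placeholder ratings on a 3-point scale, NOT data from the study.
group_a = [1, 2, 2, 1, 3]
group_b = [3, 2, 3, 3]

if __name__ == "__main__":
    print("U =", mann_whitney_u(group_a, group_b))
    print("r =", pearson_r([1, 2, 3], [3, 2, 1]))
```

A negative correlation, such as the one reported between a critical attitude towards appraisal and an instructional mentoring style, means that higher values on one rating tend to go with lower values on the other.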

Interpreting Criteria Use in Portfolio Appraisal: Towards a typology

Based on the questionnaire and interview findings it becomes evident that the actual

practice of appraising and grading of the portfolio product seems to take on different

forms and is conducted through various approaches, and is based on different

orientations and beliefs that govern the use of criteria. Our inventory showed a

110 K. Smith and H. Tillema

Table 2. Typical interview responses categorised under judgemental orientation and mentoring style (in bold presumed criteria in use)

Type of

mentoring

Type of appraisal orientation

Denial of appraisal Appraisal for learning and development Judgmental evaluation

Instructional Grading portfolios is sometimes so

undoable, we have no fixed way of

judging them. We have difficulty in

appraising them because no ageement

exist as to its measurement. What does it

signify?. A lesson plan does not tell me

enough I have to see and observe real

lessons, We have to be aware of the danger

of misjudging qualities. Students can talk

and window dress a lot but the real thing

is seeing it for your self. It is difficult to give

proper feedback based on portfolio alone,

You create a dependency and smooth

talking while at the same time the real

ground for judging performance is missing

1.The portfolio shows whether the students

are able to apply the theory learned in

class to their classroom practice and

the learning process of doing so. 2. I select

a choice of different assignments the stu-

dents chooses to do and the assessment

rubric used to assess it. 3. I mainly look

for the learning process of the student

from the 1st assignment in which she

relates to my on-going comments. Trying

to improve the work and better apply

theory to practice. 4. The report would be

done on an assessment rubric the

student would receive and an accompanying

note. 5. Every part of the rubric has

points allotted to the work done in the

assignment.

1.The portfolio presents a profile of the student’s

teaching activities. 2. All students have to include

lesson plans. observation reports and

reflections. They can choose to add teaching

material they develop themselves. 3. I am looking

for individual progress. creativity in teaching

and learning from mistakes. 4. I write

comments in the portfolio and add a written

summary. 5. The compulsory parts are given

a percentage for each item and the part the

students choose has a fixed maximum grade.

Use

ofC

riteriain

Assessin

gT

each

ing

Portfolios

111

Table 2. (Continued.)

Type of

mentoring

Type of appraisal orientation

Denial of appraisal Appraisal for learning and development Judgmental evaluation

Relational Many of my colleagues worked with

portfolios. So I wanted to try it out. I am

happy I did. For me as a mentor practice

teaching became more than only

teaching. The most important part of the

portfolio is the reflections of the student

on their observations of the mentor. the

teaching of their peers and their own. The

assessment is mainly in my written

comments. a kind of my reflection on

their reflection. It is difficult to give the

portfolio a grade. so I have a pass and

fail.

The portfolio is an excellent tool for assessing

learning. It assesses the effort, dedication,

progress, and involvement with the

learning in relation to the standards of

the course. It is an excellent tool for

getting to know the student well and in

depth based on the reflections.

1.They provide evidence of progress: so even if the

observations showed initial performance. the

portfolio can show the progress of the student

teacher’s thinking. the ‘‘zone of proximal

development’’ as it were. So contribute to our

evaluation of how she is developing. 2. I use

selected journal entries with responses from

their tutor and personal reflections on both

entries and responses; I take the conclusions on

professional development ‘‘Where am I. where am

I going from here?’’3. I’m looking for a) evidence that

the student teacher knows how to reflect critically

and constructively on her own experiences

(including both positive and negative self-criticism

and suggestions as to directions for ‘next time’. b)

and evidence of professional progress from the

first year entries/lesson plans and reflections to the

last. and the final concluding section. I would assess

basically on these two things. plus evidence that the

writer has invested time. Thought and work in

writing components of the portfolio and in their

teaching and own development in general.

1. It provides a comprehensive reflection

of the learning process when learning

how to teach. 2. The portfolio should

include various aspects of teaching. lesson

plans. worksheets. tests. and reflections

on all these. 3. I am looking for progress

and for independent critical reflection.

4. An assessment page is given to the

student with detailed comments on each

entry and a translation into a number

(I have no choice). 5. I work with a

rubric which represents the criteria

of the portfolio.

Situational Judging portfolios is difficult because each

one of us will deal with it differently.

May be we should be better communicating

about it and see how we evaluate the

portfolios. But the important thing is the

dynamics; it is not a static panel that

judges fixed situations on a common

ground. It is the interests of the students

that is at stake here

1. They provide a picture of the student’s progress over time which is so relevant to practical work. They also give the student the possibility of presenting teaching aids, etc. 2. Reflections on lessons observed, reflections on lessons taught, lesson plans, teaching aids prepared and used by student, mentor’s report. 3. Reflectivity, progress, clarity. 4. Through a pre-prepared page. 5. Each segment is allotted a grade, and they all add up.

1. When I did not use the portfolio, I looked mainly at the teaching skills. Now I look for understanding of teaching as well. 2. The portfolio should include a certain number of artifacts chosen by the student. 3. I would develop the criteria together with the students, it would help them understand the goal of the portfolio and to choose the artifacts. 4. I report the assessment orally in a meeting, and then we decide on the grade together. 5. I have to give a grade, and I like to do it together with the student.

coloured palette offering diverse ways of establishing the quality of a portfolio (Table 2). We found portfolio ratings of both product and process, of content and procedure, and of knowledge, performance and reflection; they were either developmental or selective, and either integrative or piecemeal in use. From these findings, three main strands can be distinguished in how criteria are used to judge the merits of a portfolio, which together form a typology of criteria use.

(a) Criteria as Judging Evidence

In this most prevalent case, the portfolio was viewed as a product: a collection of materials presented to be rated and evaluated as such. The presented material was taken at face value, without much consideration for its origin, the process by which it was collected, or the purpose for which the portfolio was constructed. The rating of a portfolio was primarily a matter of connoisseurship, i.e., based on hidden, assessor-dependent criteria. In the most positive instances we found that benchmarks were specified within a norm or arbitrary standard, i.e., competencies to be considered in the portfolio, or a specification of the evidence required for each entry in the portfolio. These normative criteria specified what was, as a minimum, to be included in the portfolio. Sometimes they were indicated at quite a detailed level, by specifying content areas, materials to be included and rating scales.

(b) Criteria as Rules of Accountability

In this more grounded type, the portfolio was rated as a product against regulative standards that were set at an institutional level and were not assessor-dependent. Ultimately, criteria referred to performance assessment goals whose attainment the portfolio had to evidence. These standards were specified beforehand, referring to a programme of requirements or a completed curriculum. Criteria were used to detect compliance with the admission levels set for professional certification (either for entry or retention in the profession). These (public or explicit) criteria were specified in reference to external standards, often set by a certifying or selecting board or agency. They referred to content domains or performance levels belonging to a professional level of functioning.

(c) Criteria as Critical Appraisal

In this more sophisticated type, not only was the portfolio as a product in itself taken for scrutiny but also its contextual background, its origin and its construction. The portfolio was regarded as an outcome of a process that served to meet specified purposes. Its setting of construction, including the institutional constraints, as well as its realised outcomes, were under scrutiny relative to the process of collection that had taken place. The portfolio was appraised in light of the context in which it had been compiled. It was a form of auditing, i.e., a weighing of evidence relative to the objectives that needed to be reached. Criteria could be negotiated and constructed with regard to the purposes acknowledged and accepted by those involved in the appraisal process: both assessors and those being assessed. The constraints under which the portfolio was constructed were taken into account. This often led to “meta-criteria” for auditing the portfolio construction, i.e., trust in the outcomes presented, credibility of evidence, groundedness of the materials, and unity of the product. These criteria were preferably negotiated beforehand instead of after the collection process.

In this ideal case, criteria operated as instruments of quality improvement. Criteria now acted as common and shared dimensions for development intended to (gradually) improve the quality of the portfolio product, i.e., the performance of its collector. Shared dimensions were extracted from the literature or professional debate on competence in a profession and could include, for instance, richness of content, compliance with standards, performance evidence, and growth in professional development.

Conclusion

The focus of this paper has been the appraisal of portfolios in summative assessment contexts, more specifically in pre-service teacher education in Israel and in the Netherlands. The most common practice of teacher educators in their role as assessors is to exercise judgemental, usually normative, evaluation based on pre-decided, more or less explicit criteria. These are represented in a list which allows for “check-box” appraisal. We found predominantly criteria use as judging evidence (the first level of our typology). The use of a check-box approach was most salient in the many assessment practices we examined. This present practice is problematic in that portfolio appraisal is made dependent on certain contexts and certain assessors, and reflects a specific criteria use which does not apply across other settings. However, uniform pre-decided criteria, a kind of one-size-fits-all, are perhaps not the right answer to this present condition of diversity; it may even be hard to assume that there are criteria equally applicable in various settings. Portfolio compilation and certification in a strict standard-directed context constrains creativity, individuality and innovation (Burroughs, 2001) and is not akin to learning to teach and teaching. Less uniformity in criteria use, together with explicitness and transparency, may be a better way to deal with criteria use in summative assessment (Dottin, 2001; International Task Force on Assessment Centers, 2000). As an alternative, we suggest that summative appraisal take place in dialogue with the portfolio stakeholders. Crucial to our argument is that a portfolio is constructed for a certain purpose. Therefore, its compiler, its requirer and its facilitator (practice teacher or teacher educator) all have an interest in establishing the worth and merit of the portfolio product. In this arrangement, it is the assessor who enquires, detects and examines the portfolio content in relation to the specific context and objectives expressed by the “owners” of the portfolio, i.e., the student teacher as well as the other parties involved. The appraisal then becomes a careful scrutiny of accomplishments, accounting for process as well as product, and the portfolio collector is invited to explain and defend her/his work. This “extended” summative appraisal is more like an auditing process than a check-box judgement and measurement.

A major issue that still needs to be examined and discussed, given our position, is the applicability and feasibility of this form of auditing in summative contexts for accountability purposes. In looking at criteria use this way, we feel supported by the practice found among the participants of our study. It became evident that, for them, the appraisal of portfolios is not only a matter of rating an artefact but primarily of meriting a practice (a manner of assessment that has been established at a particular institutional level). Judgement of an individual portfolio product is embedded in a specific practice (i.e., procedures in context). Therefore, it is our contention that portfolio appraisal needs to take this practice into account.

In evaluating assessment practices, the concept of an audit as a way to look at appraisal may serve as a helpful approach to determine the quality of procedures and instruments relative to their purpose (Herriot, 1989; Tillema, 2003). An audit primarily indicates contributions and improvements made in the achievements of the portfolio compiler, relative to the goals set, and thus legitimises the outcomes of the portfolio. Furthermore, an audit can scrutinise existing portfolio practices, such as the way assessors perform their appraisals, as well as ascertain prospects for certification and licensing (Darling-Hammond & Snyder, 2000). In this respect an audit combines an accountability perspective with an improvement perspective (Smith & Tillema, 1998) and thus may resolve the dilemma in which teacher educators as assessors find themselves.

References

Apple, M. W. (2001). Markets, standards, teaching, and teacher education. Journal of Teacher Education, 52(2), 182–196.

Burns, C. W. (1999). Teaching portfolio and the evaluation of teaching in higher education: confident claims, questionable research support. Studies in Educational Evaluation, 25, 131–142.

Burroughs, R. (2001). Composing standards and composing teachers. The problem of National Board Certification. Journal of Teacher Education, 52(2), 223–232.

Cochran-Smith, M. (2001). The outcomes question in teacher education. Teaching and Teacher Education, 17(5), 527–546.

Cochran-Smith, M., & Fries, M. K. (2002). The discourse of reform in teacher education: Extending the dialogue. Educational Researcher, 31(6), 26–28.

Darling-Hammond, L. (2000). Teacher quality and student achievement: A review of state policy evidence. Education Policy Analysis Archives, 8(1), 23–36.

Darling-Hammond, L., Diez, M. E., Moss, P., Pecheone, R., Pullin, D., Schafer, W., & Vickers, L. (1998). The role of standards and assessment: A dialogue. In M. Diez (Ed.), Changing the practice of teacher education: Standards and assessment as a lever for change. Washington, DC: AACTE Publications. (ERIC Document Reproduction Service no. ED 417 157).

Darling-Hammond, L., & Snyder, J. (2000). Authentic assessment of teaching in context. Teaching and Teacher Education, 16, 523–545.

Delandshere, G., & Arens, S. A. (2001). Representations of teaching and standard-based reform: are we closing the debate about teacher education? Teaching and Teacher Education, 17, 547–566.

Delandshere, G., & Arens, S. A. (2003). Examining the quality of the evidence in pre-service teacher portfolios. Journal of Teacher Education, 54(1), 57–73.

Dottin, E. (2001). The development of a conceptual framework. Lanham, MD: University Press of America.

Heilbronn, R., Jones, C., Bubb, S., & Totterdell, M. (2002). School-based induction tutors, a challenging role. School Leadership & Management, 22(4), 34–45.

Herriot, P. (1989). Assessment and selection in organizations: Methods and practice for recruitment and appraisal. Chichester, UK: John Wiley.

International Task Force on Assessment Centers. (2000). Guidelines and ethical considerations for assessment center operations. Public Personnel Management, 29(3), 315–331.

Murray, F. B. (2001). The overreliance of accreditors on consensus standards. Journal of Teacher Education, 52(2), 211–222.

Shephard, L. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4–15.

Shulman, L. S. (1998). Teacher portfolios: A theoretical activity. In N. Lyons (Ed.), With portfolio in hand: Validating the new teacher professionalism (pp. 23–37). New York: Teachers College Press.

Smith, K., & Tillema, H. (1998). Evaluating portfolio use as a learning tool for professionals. Scandinavian Journal of Educational Research, 41(2), 193–205.

Smith, K., & Tillema, H. (2001). Long-term influences of portfolios on professional development. Scandinavian Journal of Educational Research, 45(2), 183–203.

Smith, K., & Tillema, H. (2003). Clarifying different types of portfolio use. Assessment & Evaluation in Higher Education, 26(6), 625–648.

Smith, K., & Tillema, H. (in press). Portfolio assessment, in search of criteria. Teaching & Teacher Education.

Snyder, J., Lippincott, A., & Bower, D. (1998). The inherent tensions in the multiple uses of portfolios in teacher education. Teacher Education Quarterly, 25(1), 45–60.

Tillema, H. (1998). Design and validity of a portfolio instrument for professional training. Studies in Educational Evaluation, 24(3), 263–278.

Tillema, H. (2003). Auditing assessment practices; establishing quality criteria in the appraisal of competencies in organisations. International Journal of Human Resource Development and Management, 3(4), 359–369.

Tillema, H., & Smith, K. (2000). Learning from portfolios: Differential use of feedback in portfolio construction. Studies in Educational Assessment, 26, 193–210.

Tucker, P. D., Stronge, J. H., & Gareis, C. R. (2002). Handbook on teacher portfolios for evaluation and professional development. New York: Eye on Education.

Van Manen, M. (1999). Knowledge, reflection and complexity in teacher practice. In M. Lang, J. Olson, H. Hansen, & W. Bunder (Eds.), Changing schools/changing practices: Perspectives on educational reform and teacher professionalism (pp. 65–75). Leuven, Belgium: Garant.

Wade, R. C., & Yarbrough, D. B. (1996). Portfolios: A tool for reflective thinking in teacher education. Teaching and Teacher Education, 12(1), 63–79.

Yinger, R., & Hendricks-Lee, M. (1998). Professional development standards as a new context for professional development in the US. Teachers & Teaching, 4(2), 273–299.

Zeichner, K., & Wray, S. (2000). The teaching portfolio in US teacher education programs: What we know and what we need to know. Teaching and Teacher Education, 17, 613–621.

Zuzowsky, R., & Libman, Z. (2002, August). Standards of teaching performance and teacher tests; where do they lead us. Paper presented at ATEE conference Warsaw.

Appendix A. Questionnaire Items on Portfolio Use

Its purpose—development or certification

- Portfolio is a tool to highlight progression in development
- Portfolio is a tool to promote further learning
- Portfolio is a tool for providing functional feedback
- Portfolio is a tool to monitor actual competence levels
- Portfolio is a tool to document performance

Its process—appraising the portfolio

Portfolio appraisal is meant to:

- use it as a tool for self-development
- assess prior knowledge
- share and reflect good practice
- evaluate training courses
- enhance one’s performance
- change a student’s attitudes or beliefs
- gain accreditation or certification.

Its assessment qualities—the portfolio

- is a reliable measure of competence in relation to required standards
- gives a clear and consistent understanding of student qualities
- is clear how the portfolio is being assessed
- gives an authentic reflection of student growth during a period of time
- presents evidence of a student’s current competence.
