Use of Criteria in Assessing Teaching Portfolios: Judgemental practices in summative evaluation
Kari Smith (a) and Harm Tillema (b)*
(a) Oranim Academic College, Israel and University of Bergen, Norway; (b) Leiden University, The Netherlands
How is the quality of teaching portfolios being assessed in summative assessment contexts? This
question is of special importance in the growing debate on standards and criteria in assessment. In
our study we looked for contexts in which portfolios are used for summative assessment purposes
and gauged the issues raised by assessors with respect to grading and judging performance. We
examined the appraisal of these portfolios by looking at the explicitness of the guidelines, instructions
and frameworks used by 35 portfolio assessors in their actual evaluations of teaching
portfolios in programmes of teacher education for English as a Second Language. The study utilises
different tools to gauge the process of raising standards and the appraisal of portfolios. Using a
questionnaire, conceptions of portfolio standards and criteria use were collected; selected interviews
were then used to elaborate our understanding of the quantitative data. The findings
were organised into a possible typology for clarification and improvement of actual assessment
practices of portfolios. In this way we attempted to achieve transparency about how criteria are
being deployed.
Keywords: Portfolio; Assessment; Appraisal; Summative evaluation
*Corresponding author. Department of Education, Leiden University, PO Box 9555, NL 2300 RB
Leiden, The Netherlands. Email: [email protected]
Scandinavian Journal of Educational Research
Vol. 51, No. 1, February 2007, pp. 103–117
ISSN 0031-3831 (print)/ISSN 1470-1170 (online)/07/010103-15
© 2007 Scandinavian Journal of Educational Research
DOI: 10.1080/00313830601078696
Raising Standards in Portfolio Appraisal
Raising standards in the appraisal of portfolios is an issue of considerable importance
(Shephard, 2000). Assessors of teaching portfolios are increasingly faced with two
inherently contradictory forces that drive judgemental evaluations in actual
assessment practices. On the one hand, there is the need to create and comply
with standards for assessment that promote uniformity of grading and a
transparent appraisal of performance, in the expectation that this will lead to
certification of achievement. On the other hand, critical questions are raised about separating
goals aimed at appraising achievement from goals aimed at improving the quality of student learning
and development (Cochran-Smith & Fries, 2002; Darling-Hammond & Snyder,
2000; Delandshere & Arens, 2003). Portfolio assessment, which is
widely used nowadays in teaching and teacher education (Burns, 1999), enters
this debate with the question: do we improve its feasibility for achievement measurement
(complying with performance standards) in order to certify prospective and
experienced teachers? Or do we appraise achievements that signify the quality of
(monitoring) their development and learning (growth in professional expertise)?
Teachers’ portfolios for documentation of professional competence are increasingly
required as a regular condition for certification (in pre-service teacher education as
well as for advanced teaching certificates) (Zeichner & Wray, 2000). There is,
however, less research on the appraisal of portfolios and the judgemental processes
involved on the part of the assessors of portfolios, especially when they are used for
summative assessment purposes (Burns, 1999; Smith & Tillema, 2001). What
standards or criteria are applied, and what is the extent of agreement about the
utilisation of these criteria (Heilbronn, Jones, Bubb, & Totterdell, 2002; Zuzowsky &
Libman, 2002)?
The focus of the present study is the transparency of criteria use in appraisal of
portfolios in summative assessment contexts, i.e., explicitness in grading performance
to certify achievements (Burns, 1999; Cochran-Smith, 2001; Murray, 2001).
The context that gave rise to this study is the ambivalence felt both by
assessors and by those being assessed about grading the actual practical teaching
performance of pre-service student teachers of English in Israel and in the
Netherlands. The quality of rating and criteria use (Tillema, 2003; Zuzowsky &
Libman, 2002) is usually determined by some internal, mostly institution-specific
standard or benchmark that prescribes how assessors should appraise in order to meet certain
specifications. An earlier inventory of perceptions of grading and evaluation (Smith
& Tillema, 1998, 2001) showed substantial reluctance on the part of teacher
educators to maintain strict criteria in rating portfolios in order not to disrupt the
process of competence development in their students. These teacher educators
clearly recognise a dilemma between their position as assessor and mentor.
Furthermore, our previous data (Tillema & Smith, 2000) show a current lack of
explicitness in determining quality of portfolios due to heavy reliance on context and
circumstance in portfolio construction. Therefore, we recognise a need to clarify and
evaluate actual dilemmas in judgemental processes of grading portfolios in order to
achieve greater explicitness and transparency in use of criteria. The summative
appraisal process makes this need even more urgent because it requires that
portfolio criteria be used as careful contextual benchmarks for the scrutiny of accomplishments
in light of explicitly stated achievement goals.
Teachers’ portfolios for summative evaluation of professional competence are becoming
increasingly common for certification in pre-service teacher education and for
advanced teaching certificates (Shulman, 1998; Tucker, Stronge, & Gareis, 2002;
Wade & Yarbrough, 1996). There is, however, less research on quality of assessment
of portfolios, and there seems to be confusion among teacher educators about how to
assess portfolios summatively (Burroughs, 2001; Smith & Tillema, 2003). This lack
of clarity regarding summative assessment stems, in our view, from the ambitious
assumption that the same portfolio can be and actually is used for learning
(professional development) purposes as well as summative assessment purposes
(Snyder, Lippincott, & Bower, 1998; Tillema & Smith, 2000; Zeichner & Wray,
2000). This causes a conflict of interest for the portfolio compiler, who is required to
be aware of external assessment criteria and standards when compiling the portfolio,
and also for the teacher educator as an assessor, who plays the dual role of being the
supporter of a professional development process as well as the judge of the final
portfolio product. This conflict may well affect the use and conflation of selected
criteria in the appraisal of a portfolio. Teacher educators as well as students of
teaching seem to be in need of a better understanding of how the portfolio is
beneficial not only as a learning and development tool, but also as an assessment
tool.
Summative Appraisal of Portfolios
Summative assessment of the portfolio for certification purposes is expected to be,
and should be, carried out in light of explicit standards for teaching. This requisite
explicitness of standards has positive as well as negative impacts on the judgemental
process of appraising portfolios in student teachers’ learning and teachers’
professional development. On the one hand it specifies what will be appraised and
to what degree one has accomplished certain agreed standards of performance; on
the other hand it narrows the range of permissible exemplifications of teaching
activity (Heilbronn et al., 2002; Zuzowsky & Libman, 2002). It is therefore of
interest to examine the assessors’ process of using standards and covering of criteria
to determine how they are interpreted and applied in concrete evaluations of
individual portfolios.
Scrutinising actual appraisal processes in summative assessment of portfolios in
greater depth presupposes the existence of a common core of standards on teaching
knowledge and skills in order to determine the criteria against which they can be
documented and appraised in the portfolio. The idea inherent to summative
assessment is that against such core competences actual accomplishments can be
evaluated through a judgemental process of criteria application. The portfolio, then,
is the tool used to provide evidence of the attainment of standards (Delandshere &
Arens, 2001; Tillema, 1998). In short, standards direct the content to be specified in
a portfolio which then needs to be appraised according to certain criteria to
determine the portfolio’s quality or worth. It is here that already marked differences
and even disagreement in the summative debate are to be found on what constitutes
core content or standards to be rated in the portfolio (Murray, 2001; Yinger &
Hendricks-Lee, 1998). An inventory of available Internet sites on criteria for
teaching portfolios would show enormous variety (Smith & Tillema, in press). At
best, one could argue that the need we recognise for reaching an agreement on
explicit criteria could spark off a useful dialogue in the profession; the different
perspectives on what constitutes the core of the teaching profession would contribute
to making deliberate choices. Agreement on core issues to be rated in a professional
portfolio certainly would serve the interest of an accountability i.e., summative
perspective (Cochran-Smith & Fries, 2002). Moreover, agreed standards to be
covered by the portfolio could serve as guidelines for teacher educators, assessors
and student teachers when actually working with the portfolio, as they provide a
focus for assessment which can be communicated to all stakeholders. And lastly, the
explicit standards serve as goals for professional development activities (Darling-Hammond
et al., 1998; Delandshere & Arens, 2001, 2003). A state of agreement on
standards and criteria is, however, yet to be found in the teaching profession
(Zuzowsky & Libman, 2002). On the contrary, in the literature we find strong
criticism of extensive use of standards for teaching (Yinger & Hendricks-Lee, 1998).
Some of the main concerns are:
• There is no consensus about the core teaching knowledge (Murray, 2001), which
makes it impossible to introduce a prototype portfolio for professional
development and for summative assessment purposes.
• Standards lead to a narrow interpretation of teaching (Cochran-Smith, 2001), and
teachers are discouraged from documenting their own initiatives and creativity in
the portfolio if these do not align with the explicit standards.
• Teacher knowledge is also non-cognitive knowledge (Van Manen, 1999), not only
technical knowledge reflected in performances (Delandshere & Arens, 2001).
Tacit knowledge of teaching based on experience, personal beliefs and values is a
major part of teaching which cannot be documented in uniform portfolio entries.
• There is a lack of construct validity (Burroughs, 2001) with emphasis on
performance and elimination of theoretical knowledge, which is a dangerous
dualism (Delandshere & Arens, 2001). The current trend expressed in standards
for teaching diminishes (and at times even ignores) the importance of theoretical
knowledge of teaching and about teaching. Most portfolio frameworks put
emphasis on the performance aspect of teaching and do not ensure documentation
of underlying understanding based on solid theoretical knowledge of the
more technical aspects of the profession. The balance between theory and practice
has, as a result of the quest for standards, ended up placing too much weight on
performance.
• Standards might lead to dangerous consequences (Apple, 2001), such as a lack of
consideration of alternatives necessary for change (Delandshere & Arens, 2001).
When portfolios for assessment are mainly directed by explicit standards, there is a
danger that creative teachers refrain from presenting alternatives which disagree
with the standards. However, a continuous flow of courageous alternatives is
needed to ensure a dynamic educational system which is in constant search for
reform.
• Elimination of differences in the way teaching is represented is reductionist
(Delandshere & Arens, 2001). Teaching is contextual and what is considered
“good” teaching in one setting is not necessarily the best approach in a different
setting. Teachers differ in personalities, strengths and weaknesses, and a portfolio
framework which does not allow for differences constrains teachers’ professional
development if the portfolio entries are compiled in accordance with the explicit
standards to ensure positive assessment.
• Standards are developed by likeminded volunteers (Delandshere & Arens, 2001).
In situations where portfolios are used for summative assessment purposes in light
of standards, one needs to question who has decided on the standards and,
furthermore, on the portfolio framework. The question concerning who is
involved in the process of developing the portfolio framework becomes a crucial
one.
These arguments and positions make it cumbersome to engage in a discussion on
shared and explicit criteria for grading, at least at the level of standards, i.e., the
content domains of achievement. Despite this current state of affairs in the debate on
standards (which, taken literally, would deny the possibility of summative
assessment) a direct concern in the actual appraisal of portfolios is how the
portfolios’ quality is being measured or evaluated as a product offered for scrutiny to
raters or assessors of the portfolios. How, then, we could ask, is the quality of the
documented accomplishments shown in the portfolio rated, i.e., what criteria are
being used as the “yardstick” against which the accomplishments presented in a
portfolio are evaluated? It is in this actual judgemental process itself that it is
determined in what way and to what degree certain standards are met, based on
the evidence presented. This judgemental process calls for a careful contextualised
appraisal on the part of the assessors, in which criteria are constructed and applied,
often in situ, to value or give merit to a portfolio.
Therefore, the present study intends to examine how portfolios are assessed in
different summative contexts, by studying how the quality of a portfolio is valued in
actual judgemental practices. It is our objective to arrive at a typology of the distinct
uses of criteria in portfolio appraisal. Such a typology is likely to
contribute to a clearer understanding of portfolio use and thereby strengthen its
positive impact on the quality of teacher education.
The Study
Setting
The focus of this study is appraisal criteria used for summative portfolio assessment
of teaching practice in pre-service teacher education. As context for comparison we
have chosen two 4-year elementary school teacher education programmes for
teachers of English as a foreign language, one in Israel and one in the Netherlands.
The participants were 35 teacher educators who act as regular assessors of their
students’ teaching portfolios.
Design and Instruments
The study uses both a structured questionnaire and an open interview to gauge the
conceptions of assessors with regard to the use of criteria to rate their students’
portfolios. The questionnaire (see Appendix 1) revolved around three main issues
and contained 17 items with reference to:
(a) the purpose of a portfolio; to determine its usage as a summative or formative
tool, consisting of five items which were formulated as factual statements to be
rated on a 5-point Likert scale by agreement (fully agree to fully disagree),
(b) the process of grading a portfolio; to determine the main criteria used during the
appraisal process, consisting of seven items, which were rated by adherence to a
main focus in appraisal; and
(c) measuring the quality of a particular portfolio; to determine problems or issues
in establishing the final grading of an individual portfolio, which was rated on
five items representing main concerns of the assessor (fully applicable to not at
all applicable).
Data were collected with a 5-point rating scale to examine the degree to which
assessors adhered to specific conceptions of criteria use. This inventory also included
selected interviews with 16 out of the 35 teacher educators who showed a typical
profile in answering the questionnaire items. They were selected for the interviews to
elaborate our understanding of the context and setting of criteria use. The purpose of
this phase of the study was to collect data on the variety of criteria use as well as the
grounded reasons and encountered problems assessors perceive in using these
criteria. The data were analysed to arrive at a typology of criteria use. The present
article discusses the findings in order to propose a tool for scrutinising the actual
practice of criteria use of assessors in teacher education.
Findings
No large demographic differences were found between the teacher educators from
the two countries, which enabled us to analyse the data as one group. Our
participants had on average 15 years of teaching experience, with at least 4.5 years as
an assessor of portfolios. On average, each assessor rated 24.5 student portfolios a
year. A salient difference between the Dutch and Israeli groups was the setting of
appraisal: Dutch assessors worked mainly in dyads while their Israeli colleagues rated
students’ portfolios individually.
The questionnaire data in Table 1 are presented under three main categories: the
purpose of grading portfolios, the process of utilising criteria and issues in measuring
criteria levels and quality of a portfolio.
As is apparent from Table 1, we find the highest importance attached to portfolio
appraisal as a Tool for Self-development (4.37) and Establishing Development
(4.14). The lowest scores were found for portfolio ratings which concern: Evaluation
of Course Learning (2.74) and Clarity in Rating (2.51) (sic). More specifically,
especially under the purpose category of criteria use in portfolio grading no large
variation was found, meaning all selected purposes are somehow relevant and under
scrutiny in appraisal of portfolios. However, under the categories: process and
quality of appraisal, greater targeting was found. The main issue under the process of
using portfolio criteria is determination of self-development and student’s reflection
while establishing performance levels and certification rating is rated somewhat
lower. Based on the t-test differences found (see Table 1), it seems that the Israeli
assessors give higher value to establishing performance levels (and reflection) than
their Dutch colleagues. Under the category of measuring quality we find the most
divergence between the two groups, but both are outspoken in their concerns about
giving authentic evidence in the portfolio. The main problem seems to be
establishing student attainments in an authentic way (i.e., avoiding a check-box
approach) and valid measurement. Assessors clearly find these issues problematic in
their own practice. Moreover, the data represented in Table 1 seem to indicate that
there may be a major dilemma present between, on the one hand, establishing actual
levels for certification in a reliable way and, on the other hand, using the portfolio for
supporting further development, reflection and learning.
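The category means printed in Table 1 (3.84, 3.39 and 3.36) appear to be the unweighted averages of the item means listed under each category. The following minimal sketch checks that arithmetic. It assumes the item values transcribed from the "Overall Mean" column of Table 1; the names `ITEM_MEANS`, `REPORTED` and `category_means` are illustrative and do not come from the study.

```python
from statistics import mean

# Item means transcribed from the "Overall Mean" column of Table 1.
ITEM_MEANS = {
    "Purpose of grading": {
        "Establishing development": 4.14,
        "Promoting learning": 4.11,
        "Providing feedback": 3.49,
        "Monitoring actual levels": 3.83,
        "Documenting performance": 3.63,
    },
    "Process of rating": {
        "Self-development": 4.37,
        "Prior knowledge": 3.37,
        "Reflection": 3.71,
        "Evaluation of course learning": 2.74,
        "Establishing performance level": 3.11,
        "Change of beliefs": 3.00,
        "Certification": 3.40,
    },
    "Quality of measurement": {
        "Reliable": 3.11,
        "Valid": 3.57,
        "Clarity": 2.51,
        "Authentic": 4.03,
        "Giving evidence": 3.60,
    },
}

# Category means as printed in Table 1.
REPORTED = {
    "Purpose of grading": 3.84,
    "Process of rating": 3.39,
    "Quality of measurement": 3.36,
}


def category_means(items):
    """Average the item means within each category, rounded to two decimals."""
    return {
        category: round(mean(scores.values()), 2)
        for category, scores in items.items()
    }


if __name__ == "__main__":
    for category, value in category_means(ITEM_MEANS).items():
        print(f"{category}: computed {value:.2f}, reported {REPORTED[category]:.2f}")
```

Because the check works on rounded item means rather than raw responses, agreement to two decimals is the most it can show.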
Table 1. Descriptive statistics inventory on portfolio criteria use (5-point Likert scale)

Category and item                  Overall   SD      Dutch       Israeli     t-test
                                   mean              assessors   assessors   p
Purpose of grading                 3.84
  Establishing development         4.14      .772    4.04        4.40        n.s.
  Promoting learning               4.11      .631    4.12        4.10        n.s.
  Providing feedback               3.49      1.040   3.40        3.70        n.s.
  Monitoring actual levels         3.83      .747    3.72        4.10        n.s.
  Documenting performance          3.63      .910    3.56        3.80        n.s.
Process of rating                  3.39
  Self-development                 4.37      .690    4.32        4.50        n.s.
  Prior knowledge                  3.37      1.031   3.80        2.30        .001
  Reflection                       3.71      .926    3.60        4.00        .100
  Evaluation of course learning    2.74      1.197   2.92        2.30        n.s.
  Establishing performance level   3.11      1.132   2.72        4.10        .001
  Change of beliefs                3.00      .804    2.92        3.20        n.s.
  Certification                    3.40      .881    3.32        3.60        n.s.
Quality of measurement             3.36
  Reliable                         3.11      .758    2.88        3.70        .002
  Valid                            3.57      .850    3.40        4.00        .05
  Clarity                          2.51      .951    2.52        2.50        n.s.
  Authentic                        4.03      .822    3.80        4.60        .007
  Giving evidence                  3.60      .695    3.36        4.30        .002

The interviews which were conducted subsequently were meant to further
illuminate and clarify these questionnaire findings. The interview questions revolved
around two issues concerning the actual appraisal process of our assessors:
(a) their orientation towards justification or warranting the quality of the portfolio,
which was categorised as:
• failure to judge a portfolio altogether
• use of the appraisal process for developmental learning purposes, or
• maintaining a strict use of benchmark criteria to judge (grade) the portfolio, and
(b) the style of mentoring or teaching adopted by the assessor. In the dual position
of the assessor who is also a teacher educator of students, the following three
positions can be categorised, i.e., as either
• instructional (teaching)-oriented
• relational (personal)-oriented
• situational (or goal)-oriented.
Table 2 presents some typical summary sentences from the interviews, categorised
under (a) the judgemental orientation of the assessors, and (b) the preferred
mentoring style to deal with student development. Table 2 also indicates (in bold)
the presumed criteria in use by the assessor under each category.
Since it was clear from these interviews that each of these 16 teacher educators
adhered to a certain justification belief and style of mentoring, we analysed whether
the Israeli and Dutch teacher context differed in this respect. For this purpose each
interviewed teacher educator was specifically asked to rate their position on a scale
constituted by the three positions for belief and style (and rated as strongly versus not
strongly adhering to a certain position). Differences between Israeli and Dutch
educators were found: t = −2.18, p < .03 (analysed as categorised responses: Mann-Whitney
U = 62.50, p < .05) for purpose and t = 2.09, p < .04 (Mann-Whitney
U = 64.50, p < .06) for process, respectively. The Dutch teacher educators (with
mean 1.72 on a 3-point scale) were more often critical about applying judgemental
criteria while the Israeli teacher educators (mean 2.44) on the other hand, adhered to
a more judgemental view. With respect to style of mentoring, the Dutch teacher
educators were inclined to a more relational and situational style of mentoring (mean
2.16) while the Israeli colleagues were more oriented towards an instructional style
(1.86). The correlation between justification of criteria and style of mentoring was
substantial, −.56, indicating a negative relation between a critical attitude towards
appraisal and an instructional mentoring style. The more judgementally oriented
teacher educators preferred the more instructional mentoring style; this relation was
especially strong for the Israeli teacher educators.
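For categorised responses such as these, the rank-based comparison reported above pools the ratings of both groups, ranks them, and compares the rank sums. A minimal sketch of that computation follows, assuming midranks for tied ratings. The function name `mann_whitney_u` and the example ratings are hypothetical and chosen only for illustration; they are not the interview data and do not reproduce the reported statistics.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two samples, using midranks for tied values."""
    pooled = sorted(
        [(value, "a") for value in group_a] + [(value, "b") for value in group_b]
    )
    ranks = [0.0] * len(pooled)
    start = 0
    while start < len(pooled):
        end = start
        while end < len(pooled) and pooled[end][0] == pooled[start][0]:
            end += 1
        # Tied values occupy 1-based ranks start+1 .. end; give each the average.
        midrank = (start + 1 + end) / 2
        for index in range(start, end):
            ranks[index] = midrank
        start = end
    rank_sum_a = sum(
        rank for rank, (_, label) in zip(ranks, pooled) if label == "a"
    )
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


if __name__ == "__main__":
    # Hypothetical ratings on a 3-point scale, NOT data from the study.
    ratings_group_a = [1, 2, 2, 1, 2]
    ratings_group_b = [3, 2, 3, 3]
    print(mann_whitney_u(ratings_group_a, ratings_group_b))
```

The two values always sum to the product of the group sizes; the smaller one is conventionally the statistic that is compared against critical values or a normal approximation to obtain a p value.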
Table 2. Typical interview responses categorised under judgemental orientation and mentoring style (in bold presumed criteria in use)

Type of mentoring: Instructional

Denial of appraisal: Grading portfolios is sometimes so undoable, we have no fixed way of judging them. We have difficulty in appraising them because no agreement exists as to its measurement. What does it signify? A lesson plan does not tell me enough; I have to see and observe real lessons. We have to be aware of the danger of misjudging qualities. Students can talk and window dress a lot but the real thing is seeing it for yourself. It is difficult to give proper feedback based on portfolio alone. You create a dependency and smooth talking while at the same time the real ground for judging performance is missing.

Appraisal for learning and development: 1. The portfolio shows whether the students are able to apply the theory learned in class to their classroom practice and the learning process of doing so. 2. I select a choice of different assignments the students choose to do and the assessment rubric used to assess it. 3. I mainly look for the learning process of the student from the 1st assignment in which she relates to my on-going comments. Trying to improve the work and better apply theory to practice. 4. The report would be done on an assessment rubric the student would receive and an accompanying note. 5. Every part of the rubric has points allotted to the work done in the assignment.

Judgmental evaluation: 1. The portfolio presents a profile of the student’s teaching activities. 2. All students have to include lesson plans, observation reports and reflections. They can choose to add teaching material they develop themselves. 3. I am looking for individual progress, creativity in teaching and learning from mistakes. 4. I write comments in the portfolio and add a written summary. 5. The compulsory parts are given a percentage for each item and the part the students choose has a fixed maximum grade.

Type of mentoring: Relational

Denial of appraisal: Many of my colleagues worked with portfolios. So I wanted to try it out. I am happy I did. For me as a mentor practice teaching became more than only teaching. The most important part of the portfolio is the reflections of the student on their observations of the mentor, the teaching of their peers and their own. The assessment is mainly in my written comments, a kind of my reflection on their reflection. It is difficult to give the portfolio a grade, so I have a pass and fail.

Appraisal for learning and development (first response): The portfolio is an excellent tool for assessing learning. It assesses the effort, dedication, progress, and involvement with the learning in relation to the standards of the course. It is an excellent tool for getting to know the student well and in depth based on the reflections.

Appraisal for learning and development (second response): 1. It provides a comprehensive reflection of the learning process when learning how to teach. 2. The portfolio should include various aspects of teaching, lesson plans, worksheets, tests, and reflections on all these. 3. I am looking for progress and for independent critical reflection. 4. An assessment page is given to the student with detailed comments on each entry and a translation into a number (I have no choice). 5. I work with a rubric which represents the criteria of the portfolio.

Judgmental evaluation: 1. They provide evidence of progress: so even if the observations showed initial performance, the portfolio can show the progress of the student teacher’s thinking, the “zone of proximal development” as it were. So contribute to our evaluation of how she is developing. 2. I use selected journal entries with responses from their tutor and personal reflections on both entries and responses; I take the conclusions on professional development “Where am I, where am I going from here?” 3. I’m looking for a) evidence that the student teacher knows how to reflect critically and constructively on her own experiences (including both positive and negative self-criticism and suggestions as to directions for “next time”), b) and evidence of professional progress from the first year entries/lesson plans and reflections to the last, and the final concluding section. I would assess basically on these two things, plus evidence that the writer has invested time, thought and work in writing components of the portfolio and in their teaching and own development in general.

Type of mentoring: Situational

Denial of appraisal: Judging portfolios is difficult because each one of us will deal with it differently. Maybe we should be better communicating about it and see how we evaluate the portfolios. But the important thing is the dynamics; it is not a static panel that judges fixed situations on a common ground. It is the interests of the students that is at stake here.

Appraisal for learning and development: 1. They provide a picture of the student’s progress over time which is so relevant to practical work. They also give the student the possibility of presenting teaching aids, etc. 2. Reflections on lessons observed, reflections on lessons taught, lesson plans, teaching aids prepared and used by student, mentor’s report. 3. Reflectivity, progress, clarity. 4. Through a pre-prepared page. 5. Each segment is allotted a grade, and they all add up.

Judgmental evaluation: 1. When I did not use the portfolio, I looked mainly at the teaching skills. Now I look for understanding of teaching as well. 2. The portfolio should include a certain number of artifacts chosen by the student. 3. I would develop the criteria together with the students; it would help them understand the goal of the portfolio and to choose the artifacts. 4. I report the assessment orally in a meeting, and then we decide on the grade together. 5. I have to give a grade, and I like to do it together with the student.

Interpreting Criteria Use in Portfolio Appraisal: Towards a typology
Based on the questionnaire and interview findings, it becomes evident that the actual
practice of appraising and grading the portfolio product takes on different
forms, is conducted through various approaches, and is based on different
orientations and beliefs that govern the use of criteria. Our inventory showed a
colourful palette of diverse ways of establishing the quality of a portfolio
(Table 2). Portfolio ratings of both product and process, of content and procedure,
of knowledge, performance and reflection were found, being either developmental or
selective, and utilised in an integrative or piecemeal way. Based on what we found, several main
strands could be highlighted in the utilisation of criteria through which the merits of
a portfolio are judged. As a typology of criteria use we can distinguish between three
strands.
(a) Criteria as Judging Evidence
In this most prevalent case, the portfolio was being viewed as a product, as a
collection of materials presented to be rated and evaluated as such. The presented
material was taken at face value without much consideration for its origination or its
process of collection, nor for the purpose according to which the portfolio was
constructed. The rating of a portfolio was primarily a matter of connoisseurship, i.e.,
based on hidden, assessor-dependent criteria. In the most positive instances we found
that benchmarks were specified within a norm or arbitrary standard, i.e.,
competencies to be considered in the portfolio, or a specification of evidence
required for each entry in the portfolio. These normative criteria specified what was,
as a minimum, to be included in the portfolio. Sometimes they were indicated at
quite a detailed level, by specifying content areas, materials to be included and rating
scales.
(b) Criteria as Rules of Accountability
In this more grounded type, the portfolio was rated as a product against regulative
standards being set at an institutional level, and not assessor-dependent. Ultimately,
criteria referred to performance assessment goals of which the portfolio had to
provide evidence of attainment. These standards were specified beforehand, referring
to a programme of requirements or a curriculum completed. Criteria were used to
detect compliance with the admission levels set for professional certification (either
for entry or retention in the profession). These (public or explicit) criteria were
specified in reference to external standards, often set by a certifying or selecting
board or agency. They referred to content domains or performance levels belonging
to a professional level of functioning.
(c) Criteria as Critical Appraisal
In this more sophisticated type, not only was the portfolio as a product in itself taken
for scrutiny but also its contextual background, its origination as well as its
construction. The portfolio was regarded as an outcome of a process that served to
meet specified purposes. Its setting of construction, including the institutional
constraints, as well as its realised outcomes, were under scrutiny relative to the
process of collection that had taken place. The portfolio was appraised in light of the
context in which it had been compiled. It was a form of auditing, i.e., a weighing of
evidence relative to the objectives that needed to be reached. Criteria could be
negotiated and constructed with regard to the purposes acknowledged and accepted
by those involved in the appraisal process: both assessors and those being assessed.
The constraints under which the portfolio was constructed were taken into account.
This often led to ‘‘meta-criteria’’ for auditing the portfolio construction, i.e., having
trust in the outcomes presented, credibility of evidence, groundedness of the
materials, unity of the product. These criteria were preferably negotiated beforehand
instead of after the collection process.
In this ideal case, criteria operated as a means of quality improvement. Criteria now acted as
common and shared dimensions for development intended to (gradually) improve
the quality of the portfolio product, i.e., the performance of its collector. Shared
dimensions were extracted from the literature or professional debate on competence
in a profession and could include, for instance: richness of content, compliance with
standards, performance evidence, and growth in professional development.
Conclusion
The focus of this paper has been the appraisal of portfolios in summative assessment
contexts, more specifically in pre-service teacher education in Israel and
in the Netherlands. The most common practice of teacher educators in their role as
assessors is to exercise judgemental, usually normative, evaluation based on
pre-decided, more or less explicit criteria. These are represented in a list which allows for
‘‘check-box’’ appraisal. We predominantly found criteria used as judging evidence (the
first level of our typology). The use of a check-box approach was most salient in the
many assessment practices we examined. This present practice is problematic in that
portfolio appraisal is made dependent on certain contexts and certain assessors, and
reflects a specific criteria use which does not apply across other settings. However,
uniform pre-decided criteria, a kind of one-size-fits-all, are perhaps not the right
answer to this present condition of diversity; it may even be hard to assume that there
are criteria equally applicable in various settings. Portfolio compilation and
certification in a strict standard-directed context constrains creativity, individuality
and innovation (Burroughs, 2001) and is not akin to learning to teach and teaching.
Less uniformity in criteria use, combined with greater explicitness and transparency, may be a
better way to deal with criteria use in summative assessment (Dottin, 2001;
International Task Force on Assessment Centers, 2000). As an alternative, we
suggest that summative appraisal takes place in dialogue with the portfolio
stakeholders. Crucial to our argument is that a portfolio is constructed for a certain
purpose. Therefore, its compiler, its requirer, its facilitator (practice teacher or
teacher educator) all have an interest in establishing the worth and merit of the
portfolio product. In this arrangement, it is the assessor who enquires, detects and
examines the portfolio content in relation to the specific context and objectives
expressed by the ‘‘owners’’ of the portfolio, i.e., the student teacher as well as the
other parties involved. The appraisal then becomes a careful scrutiny of
accomplishments, accounting for process as well as product, and the portfolio
collector is invited to explain and defend her/his work. This ‘‘extended’’ summative
appraisal is more like an auditing process than a check-box judgement and
measurement.
A major issue that still needs to be examined and discussed, given our position, is
the applicability and feasibility of this form of auditing in summative contexts for
accountability purposes. When we look at criteria use this way we feel supported by
the practice found among the participants of our study. It became evident that the
appraisal of portfolios for them is not only a matter of rating an artefact but primarily
meriting a practice (a manner of assessment that has been established at a particular
institutional level). Judgement of an individual portfolio product is embedded in a
specific practice (i.e., procedures in context). Therefore, it is our contention that
portfolio appraisal needs to take these practices into account.
In evaluating assessment practices, the concept of an audit as a way to look at
appraisal may serve as a helpful approach to determine the quality of procedures and
instruments relative to their purpose (Herriot, 1989; Tillema, 2003). An audit
primarily indicates contributions and improvements made in the achievements of the
portfolio compiler, relative to the goals set, and thus legitimises the outcomes of the
portfolio. Furthermore, an audit can scrutinise existing portfolio practices, such as
the way assessors perform their appraisals, as well as ascertain prospects for
certification and licensing (Darling-Hammond & Snyder, 2000). In this respect an
audit combines an accountability perspective with an improvement perspective
(Smith & Tillema, 1998) and thus may resolve the dilemma in which teacher
educators as assessors find themselves.
References
Apple, M. W. (2001). Markets, standards, teaching, and teacher education. Journal of Teacher
Education, 52(2), 182–196.
Burns, C. W. (1999). Teaching portfolio and the evaluation of teaching in higher education:
confident claims, questionable research support. Studies in Educational Evaluation, 25,
131–142.
Burroughs, R. (2001). Composing standards and composing teachers. The problem of National
Board Certification. Journal of Teacher Education, 52(2), 223–232.
Cochran-Smith, M. (2001). The outcomes question in teacher education. Teaching and Teacher
Education, 17(5), 527–546.
Cochran-Smith, M., & Fries, M. K. (2002). The discourse of reform in teacher education:
Extending the dialogue. Educational Researcher, 31(6), 26–28.
Darling-Hammond, L. (2000). Teacher quality and student achievement: A review of state policy
evidence. Education Policy Analysis Archives, 8(1), 23–36.
Darling-Hammond, L., Diez, M. E., Moss, P., Pecheone, R., Pullin, D., Schafer, W., & Vickers,
L. (1998). The role of standards and assessment: A dialogue. In M. Diez (Ed.), Changing the
practice of teacher education: Standards and assessment as a lever for change. Washington, DC:
AACTE Publications. (ERIC Document Reproduction Service no. ED 417 157).
Darling-Hammond, L., & Snyder, J. (2000). Authentic assessment of teaching in context. Teaching
and Teacher Education, 16, 523–545.
Delandshere, G., & Arens, S. A. (2001). Representations of teaching and standard-based reform: are
we closing the debate about teacher education? Teaching and Teacher Education, 17, 547–566.
Delandshere, G., & Arens, S. A. (2003). Examining the quality of the evidence in pre-service
teacher portfolios. Journal of Teacher Education, 54(1), 57–73.
Dottin, E. (2001). The development of a conceptual framework. Lanham, MD: University Press of
America.
Heilbronn, R., Jones, C., Bubb, S., & Totterdell, M. (2002). School-based induction tutors, a
challenging role. School Leadership & Management, 22(4), 34–45.
Herriot, P. (1989). Assessment and selection in organizations: Methods and practice for recruitment and
appraisal. Chichester, UK: John Wiley.
International Task Force on Assessment Centers. (2000). Guidelines and ethical considerations
for assessment center operations. Public Personnel Management, 29(3), 315–331.
Murray, F. B. (2001). The overreliance of accreditors on consensus standards. Journal of Teacher
Education, 52(2), 211–222.
Shephard, L. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7),
4–15.
Shulman, L. S. (1998). Teacher portfolios: A theoretical activity. In N. Lyons (Ed.), With portfolio
in hand: Validating the new teacher professionalism (pp. 23–37). New York: Teachers College
Press.
Smith, K., & Tillema, H. (1998). Evaluating portfolio use as a learning tool for professionals.
Scandinavian Journal of Educational Research, 41(2), 193–205.
Smith, K., & Tillema, H. (2001). Long-term influences of portfolios on professional development.
Scandinavian Journal of Educational Research, 45(2), 183–203.
Smith, K., & Tillema, H. (2003). Clarifying different types of portfolio use. Assessment &
Evaluation in Higher Education, 26(6), 625–648.
Smith, K., & Tillema, H. (in press). Portfolio assessment, in search of criteria. Teaching & Teacher
Education.
Snyder, J., Lippincott, A., & Bower, D. (1998). The inherent tensions in the multiple uses of
portfolios in teacher education. Teacher Education Quarterly, 25(1), 45–60.
Tillema, H. (1998). Design and validity of a portfolio instrument for professional training. Studies
in Educational Evaluation, 24(3), 263–278.
Tillema, H. (2003). Auditing assessment practices; establishing quality criteria in the appraisal of
competencies in organisations. International Journal of Human Resource Development and
Management, 3(4), 359–369.
Tillema, H., & Smith, K. (2000). Learning from portfolios: Differential use of feedback in
portfolio construction. Studies in Educational Assessment, 26, 193–210.
Tucker, P. D., Stronge, J. H., & Gareis, C. R. (2002). Handbook on teacher portfolios for evaluation
and professional development. New York: Eye on Education.
Van Manen, M. (1999). Knowledge, reflection and complexity in teacher practice. In M. Lang,
J. Olson, H. Hansen, & W. Bunder (Eds.), Changing schools/changing practices: Perspectives on
educational reform and teacher professionalism (pp. 65–75). Leuven, Belgium: Garant.
Wade, R. C., & Yarbrough, D. B. (1996). Portfolios: A tool for reflective thinking in teacher
education. Teaching and Teacher Education, 12(1), 63–79.
Yinger, R., & Hendricks-Lee, M. (1998). Professional development standards as a new context for
professional development in the US. Teachers & Teaching, 4(2), 273–299.
Zeichner, K., & Wray, S. (2000). The teaching portfolio in US teacher education programs: What
we know and what we need to know. Teaching and Teacher Education, 17, 613–621.
Zuzowsky, R., & Libman, Z. (2002, August). Standards of teaching performance and teacher tests;
where do they lead us. Paper presented at ATEE conference Warsaw.
Appendix A. Questionnaire Items on Portfolio Use
Its purpose: development or certification
• Portfolio is a tool to highlight progression in development
• Portfolio is a tool to promote further learning
• Portfolio is a tool for providing functional feedback
• Portfolio is a tool to monitor actual competence levels
• Portfolio is a tool to document performance
Its process: appraising the portfolio
Portfolio appraisal is meant to:
• use it as a tool for self-development
• assess prior knowledge
• share and reflect good practice
• evaluate training courses
• enhance one’s performance
• change a student’s attitudes or beliefs
• gain accreditation or certification.
Its assessment qualities: the portfolio
• is a reliable measure of competence in relation to required standards
• gives a clear and consistent understanding of student qualities
• is clear how the portfolio is being assessed
• gives an authentic reflection of student growth during a period of time
• presents evidence of a student’s current competence.