Some Guidance for Systematic Project Evaluation

Patrick Boyle & Mark Griffiths, Q Associates Sydney, Australia

March 2014 (Version 7)

Introduction

The main purpose of this informal paper is to provide some guidance for participants in the OLT-funded Workshop on Leading and Managing Projects, to help them plan, review and implement evaluation effectively for the success of their projects. It includes some general advice on important evaluation matters, good practice principles and a set of questions (section 4) intended to help evaluation planning, including critical reflection on existing evaluation plans and intended approaches. Over the last seven years, feedback from workshop participants and others who have used these questions within and/or beyond their projects suggests that they are valuable. Evaluation is now a complex field of scholarship and professional practice with a vast literature. In the space and time available we can only touch on some important topics. We have sought to ensure that the ideas and views presented accurately reflect those of respected experts and current notions of international good practice in evaluation. For this kind of paper, citing numerous sources is not appropriate. However, scholars and practitioners whose ideas are reflected strongly in the guidance are Michael Scriven, Michael Patton and John Owen. People who wish to look further into evaluation would be well served by some of these authors' many works. Two other valuable sources of knowledge and good ideas for project evaluation are Funnell and Rogers (2011), Purposeful program theory: effective use of theories of change and logic models, and the Journal of the Australasian Evaluation Society.

1. Three Important General Matters

Awareness of a common tension

It is worth raising an issue that sometimes arises for project leaders. When preparing for and carrying out evaluation, it is often necessary to manage tensions between the important factors of rigor (or richness), timeliness and cost. This is particularly true with large-scale evaluations and/or those concerned with complex and sophisticated projects. For example, some people involved in a project might argue for a sophisticated evaluation strategy and methodology because they believe that technical rigor is important. Evaluation approaches with high rigor almost always involve more time and cost. Other stakeholders may be content with evaluation which is "good enough" to monitor and improve key aspects of the project, and to generate credible findings and overall judgments. While most OLT projects will not require complex evaluation, we suggest that it is useful to keep this potential issue in mind, particularly when deciding on the budget for evaluation work, holding discussions with prospective external evaluators and planning for evaluation.

Have an overall evaluation approach and engage in evaluation planning

We advocate strongly that project leaders/teams have some kind of coherent evaluation approach or strategy, which usually incorporates a philosophical element or perspective on evaluation. This in turn will help generate ideas for evaluation planning and decisions about methods. While we recommend evaluation planning, by this we mean an active, continuing and adaptive approach. This may or may not mean that a highly formal and relatively fixed evaluation plan is a good idea; that will depend on the project context, the approach to the work and the perspectives of the project leadership and team. Good planning is often dynamic and adaptive and can occur, and be documented relatively informally, in stages. We refer above to the importance of having an "evaluation approach or strategy". In some program and project contexts (e.g. health intervention; school improvement), particularly high stakes ones, these can be quite complex. With OLT projects, complex evaluation approaches are usually not necessary or expected. Nevertheless, it is beneficial for a project if some kind of overall approach to evaluation is developed early and evaluation planning is taken seriously as an integral part of leading and implementing the initiative. At the start of section 2 (Some Good Practice Principles for Evaluation) we provide guidance for developing an overall approach. This can be seen as an early element in evaluation planning, which is addressed throughout the paper and particularly via section 4.

Engage effectively with key stakeholders

Usually, nothing is more important for effective and beneficial project evaluation than engaging well with key stakeholders. Clearly, the observations and judgments of stakeholders are almost always essential data to help answer evaluation questions, particularly for summative evaluation. However, stakeholder engagement is important for other reasons. It is often very valuable to ensure early on that key stakeholders understand at least the goals and importance of the project. In addition, finding out what stakeholders would regard as success and what they would value most out of a project can be very useful for two reasons. First, this knowledge will help to maximize project success, including the choice and implementation of activities and ways of doing things. Second, it can provide valuable guidance for focusing evaluation activities. Engagement with stakeholders is discussed further later in the paper.

2. Some Good Practice Principles for Evaluation

Elsewhere, particularly in North America, the transdisciplinary field of evaluation has become increasingly sophisticated over the last thirty years or so. One marker of this has been the emergence of good practice principles, diverse models for practice, and standards. Some of these are widely respected (e.g. Centers for Disease Control and Prevention, Framework for program evaluation in public health, USA, 1999; Joint Committee on Standards for Educational Evaluation, The Program Evaluation Standards, Thousand Oaks, CA: Sage, 1994). Partly because of increasing demands for high quality evaluation in Australia, mainly by funding bodies, the field here is beginning to follow a similar trend towards greater sophistication and professionalism. The Australasian Evaluation Society is playing a key role in facilitating this evolution.

When an evaluation brief is complex, reference to guidelines such as those cited above can be very helpful. However, for many projects good evaluation can be planned and carried out with the aid of a relatively simple set of good practice principles, such as the one provided below. Consultation with people who have expertise and successful experience with relevant evaluation work is also usually a good idea. Following the summary below, we elaborate on these principles, particularly 1 to 7.

Some Good Practice Principles for Evaluation (Project Context)

1. Have the Project Team agree an overall evaluation approach/philosophy, including the evaluation purposes that are important (e.g. summative, formative, learning). Do this early in the life of the project!

2. Engage effectively with key stakeholders: early; systematically and positively; to understand their needs and views of project success; to seek their judgments of merit; and to keep them informed.

3. Determine the relative importance of project merit and worth.

4. Ensure intended project logic* is clearly understood and articulated.

* Essentially how a project is intended to work, with an emphasis on the key inputs/activities, their interrelationships and likely project success factors.

5. Establish the emphases/main foci for evaluation, helped by 1-4 above. For formative evaluation be guided by the project's critical success factors; and for summative, the most important results expected of the project and the views of stakeholders concerning merit/success.

6. Formulate good evaluation questions (or criteria) to refine the foci and frame evaluation, guided by 1-5 above.

7. Identify and use suitable indicators and data/evidence (in order to help answer the evaluation questions).

8. Develop and implement effective methods and processes for data collection and review (e.g. survey instruments; document analysis).

9. Draw credible and useful findings/judgments based on sound evidence, stakeholder observations and judgments, and/or logic.

10. Engage in action and effective two-way communication about evaluation throughout the project.

3. Advice Aligned with Good Practice Principles

3.1 Agree an overall evaluation approach or philosophy

One way to arrive at an overall evaluation approach is to have the project team consider and agree answers to a small number of questions, such as the ones we propose below. Facilitating such a process allows the characteristics and a description of the approach to be derived. This type of activity can also enhance team understandings about the project and support team building. As noted in the introduction, having such an overall approach provides a foundation for evaluation planning and implementation. Note that what we provide below is a simulation of a process for arriving at an overall approach to evaluation, and the 'answers' to the questions provided are for illustration only.

Question 1: In terms of achievements or results, in broad terms what is most important for our project?

The agreed 'answer' to this question could include one or more of:

the assessment of the challenges and needs of different stakeholders

the derivation and dissemination of new or improved knowledge

development of high quality products (e.g. learning resources)

achievement of specific impacts (e.g. changes in practices)

improved understanding (learning) about a state of affairs

the development and trialing of a new or improved process and assessment of its effectiveness

Question 2: In addition to the OLT, who are the most important stakeholders in our project and what are their primary expectations and values likely to be (including their views of what project success might look like)?

The answers to this question would clearly include the key stakeholders (e.g. OLT; project team and possibly partners or close collaborators; a wider discipline community; a deans' council; one or more professional associations). The suspected interests and values of these entities (i.e. before further consultation within the project) might be quite varied. They might include: evidence requirements concerning project achievements (e.g. impacts); wanting to be involved or informed as the project proceeds; a desire to 'get something' out of the project such as career development capital; needing to feel that their position in relation to an issue is respected; an expectation that the project is disseminating effectively in order to achieve wider engagement; or that the project has resulted in learning along the way (e.g. about what worked well and what didn't, and why this knowledge is valuable).

Question 3: In light of the answers to questions like 1) and 2), for our project what key purposes and benefits do we need or want evaluation to serve and provide?

The answers to this question might include:

Overall, we want our evaluation-related activities to enhance project success

Our primary evaluation purposes need to be formative and summative

Evaluation will be important and helpful at all stages, for different reasons

Improving how we are operating as we go and gathering evidence of the quality of resources as we develop them will both be particularly important. So too will be front-end evaluation of our project design and implementation strategy and the early determination of what our primary stakeholders are likely to value most from the project (i.e. what success would look like to them)

We need our summative evaluation to be professional and provide a clear and accurate evaluative story of the project, in the most positive way possible

If we do our evaluation-related work well it’s likely we can enhance stakeholder engagement in positive ways, which is an important effect we are aiming for

Question 4: In light of what we need from evaluation, what philosophy or key principles will we adopt as we plan and implement our evaluation-related work? By way of example, the answers to this question might be along the lines of:

Evaluation needs to be part of the fabric of our project implementation

Engaging constructively with stakeholders will be a critical factor for getting the most from our evaluation work and for achieving the project’s objectives

We will explore the potential of methods such as constructive inquiry and action research to inform how we go about our evaluation work

Throughout the project we will focus evaluation by formulating and using evaluation questions appropriate to different purposes and project stages

We will use a blend of formal and informal activities/techniques for evaluation

Our approach to evaluation will be open, critically reflective and constructive

We will communicate effectively with relevant people about our evaluative findings and any significant decisions or actions that result

We will commission the external evaluator as soon as possible and will ensure that person’s philosophy on evaluation is broadly aligned with ours

3.2 Be clear about the main purposes evaluation is to serve

Historically, and particularly since Michael Scriven introduced the terms (1967), the two primary purposes of evaluation have been formative and summative. For projects, formative evaluation focuses mainly on determining findings that help improve the project in progress. Examples of aspects of a project to be improved include communication, team functioning, how well an enabling process is working, versions of resources being developed, and participants' understanding and skills in relation to project requirements. Formative evaluation can also help to discover success factors and problems for the kinds of effects intended by a project. Learning about such factors can help improve strategy and implementation for the existing project and for future initiatives. (See also the comments below on the possible learning and success maximization purposes for evaluation.) Summative evaluation is concerned mainly with judgments about the overall merit of a project, usually once it has been completed. It can also be important for phase evaluation in large or complex projects. Summative evaluation generally serves the wider purpose of providing evidence-based judgments to satisfy the reporting, accountability and decision making requirements of sponsors or funding bodies. It can also contribute to the achievement of wider engagement with ideas or practices (beyond a project), public relations and marketing, scholarly work (e.g. publications) and the enhancement of individuals' reputations.

Figure 1: An expanded view of the main purposes of evaluation

Other valuable purposes for evaluation, in terms of both process and outcomes, are emerging. Figure 1 provides an illustration of how evaluation purposes can relate to each other. In the present context it's worth emphasizing success optimization. Briefly, this purpose concerns evaluation-related activities intended to enhance the success of a project in particular ways. Clearly, formative evaluation is concerned with helping a project be successful. However, there are some evaluation-related activities concerned with maximizing success which have (historically) not been thought of as being simply formative. Examples of these are: pre-implementation review of project design and logic; stakeholder or community readiness assessment; and inquiry and discussion processes to achieve deep and respectful engagement with stakeholders. The potential benefits of such activities vary and some will be discussed in the workshop. For example, processes that enable deeper and respectful engagement with stakeholders, particularly around evaluative matters, can result in increased buy-in, greater willingness to contribute and (ultimately) more positive perceptions of a project. There are clearly overlaps between evaluation purposes, but consideration of the different emphases is important for good evaluation strategy and planning.

Good data and evidence are required for both formative and summative evaluation, and we focus on this in section 3.7. Formative evaluation usually requires more refined data (focused on particular project aspects), can usually be facilitated by people within the project, and is often less formal in terms of its processes. Summative evaluation, particularly where stakes are high, requires more comprehensive, though often less detailed, data. Good validity of data is particularly important for summative evaluation, and external input to evaluation design, implementation and judgments is always desirable, and sometimes a requirement.

3.3 Engage effectively with important stakeholders

Effective engagement with key stakeholders is usually a critical success factor for projects which have goals relating to change or enhancement of practices. In relation to evaluation, engagement with key stakeholders should be as early as possible in the life of a project. In the past, this has not been a common practice; stakeholders are often seen simply as sources of feedback late in a project. Early engagement provides opportunities to identify stakeholders’ needs and views of the potential value of a project, and to communicate with them about project goals and why they are important in the process for achieving them. In short, effective engagement with stakeholders demonstrates that project teams respect their needs and perspectives. Constructive early engagement can be extremely important for evaluation planning, and ultimately, for optimizing the success of a project. It can enable:

1. learning and communication about the project and its importance, which in turn can help reduce the need for troubleshooting later;

2. discovery of what particular stakeholder groups would value most as project outcomes (their views of success);

3. identification of questions about merit and worth people might want asked as part of evaluation;

4. finding out how stakeholders might like to contribute to the project and/or be informed about progress; and

5. seeding of feelings among stakeholders that they are being taken seriously.

In most professional evaluation contexts, the people responsible for evaluation are still expected to think and do things independently (e.g. ask particular questions; make well-reasoned judgments). This need doesn't diminish the importance and value of engaging seriously with stakeholders throughout projects, as part of evaluation-related work.

3.4 Consider and decide on the relative importance of merit and worth

Having decided on the main purposes of evaluation and consulted key stakeholders to understand what matters to them, it is sometimes important to determine whether it's useful, for evaluation purposes, to distinguish between the merit and worth of the project. Frequently, merit is the main emphasis in project evaluation. Judgments about merit are generally concerned with aspects such as: the extent to which objectives were achieved; whether an intended impact has been achieved (e.g. changes in particular practices); or the perceived quality of resources developed (e.g. a curriculum model; a set of videos to enhance student learning). For many the distinction between merit and worth is blurred, but for others it is a useful one. In the purer sense, worth is usually seen in terms of significance, value for investment, or relative importance in a bigger picture. By way of illustration, particular stakeholders might judge a new resource or process resulting from a project to be of high merit (i.e. high quality; excellent efficacy for purpose). At the same time they might consider the worth of these outcomes (significance; value given the cost) to be limited. In more recent times, there is a growing expectation in many contexts that project evaluation provides some judgments about worth as well as merit.

3.5 Ensure the intended project logic (or project strategy) is clear

Our notion of intended project logic (IPL) is essentially the same (in meaning and application) as program logic, an increasingly important element in evaluation theory and practice. It isn't practical to attempt a detailed treatment of this here, so we limit our comments to general advice. In Australia, some but not all expert evaluation practitioners use project/program logic as part of their approach. If project leaders decide that developing and using an IPL is a good idea, assistance might be available from the external evaluator. In any case, one valuable reference source is Funnell and Rogers (2011), cited earlier. Examples of representations of IPLs, which can vary considerably, will be presented in the workshop if time permits.

Developing a clear IPL involves thinking ahead to help maximize project success. It can be regarded as one way of articulating overall strategy for the project. Starting from its objectives, the IPL should make clear how the project is meant to achieve its principal outcomes/effects, that is, how and why it is intended to work. This isn't necessarily complex, but an IPL should indicate the important likely causes of success (i.e. critical success factors). Such factors can relate to people, actions, conditions, resources, interdependencies, etc. Many project/program logics are not explicit about critical success factors, but including them is usually beneficial. For example, knowledge of them can help decision making about what to focus more or less energy on. Thinking through any key assumptions being made and significant risks that could influence a project's success can be valuable for identifying critical success factors. To illustrate this further, consider a case where one important objective of a project is to achieve change, in specified positive ways, in the ways students engage in their learning. Such change is an intended outcome or desired effect of the project. To achieve this, the project will need to put in place certain enabling conditions and processes (e.g. positive engagement of academic staff; effective staff development activities; new student learning activities). When these conditions and processes are being considered, there may well be risks (or assumptions) that also need attention (e.g. current levels of staff motivation to be involved). Developing an IPL can help to identify these conditions, processes and other factors and make clear how they will collectively lead to the project's intended results (success).

Representations of an IPL can vary. They can be purely descriptive or tabular, but simple semi-graphic formats often make them clearer. They can resemble cause-effect diagrams, flowcharts or process maps. The main thing to ensure is that the IPL illuminates clearly the planned elements and means by which the project is to achieve its important outcomes, and any interdependencies between these elements.
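As a rough, hypothetical illustration only (the objective, factors and outcomes below are invented for this sketch and are not drawn from any particular OLT project), a very simple IPL can also be captured as structured text or a small data structure, which some teams find easier to review and update than a diagram:

```python
# Hypothetical, simplified intended project logic (IPL), for illustration only.
# It names the intended inputs/activities, the outcomes they should lead to,
# and the critical success factors (CSFs) and assumptions/risks worth watching.

intended_project_logic = {
    "objective": "Improve how students engage with their learning in first-year units",
    "inputs_activities": [
        "Staff development workshops for participating academics",
        "Design and pilot of new scenario-based learning activities",
    ],
    "intended_outcomes": [
        "Academics adopt the new activities in their units",
        "Positive change in how students engage with their learning",
    ],
    "critical_success_factors": [
        "Active engagement of academic staff",
        "Effective communication with participating students",
    ],
    "assumptions_risks": [
        "Staff have sufficient time and motivation to be involved",
    ],
}

# A representation like this can double as a checklist for formative evaluation:
# each critical success factor suggests something to monitor while the project
# is in progress.
for csf in intended_project_logic["critical_success_factors"]:
    print(f"Formative focus: is this in place and working? -> {csf}")
```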

Both the process and the outcome of developing a clear IPL are valuable. As an outcome, it provides a clear description/representation of how the project is intended to work, and this can be useful for several reasons. First, early in a project, particularly a complex one, it can be very helpful to spend time with people responsible for the project, and those who will be impacted by it, to provide them with a better understanding of its logic and design. Getting people's views can also be valuable for refining the IPL, including the methods and activities planned. Second, the IPL (sometimes) illuminates critical success factors and significant risks for a project (e.g. the importance of effective communication with participating students; the active engagement of associate deans). This knowledge provides valuable initial guidance for formative evaluation. Because formative evaluation is concerned mainly with improving the project as it is happening, it makes sense to ensure that factors important for success are the main focus of formative evaluation. Third, later in projects, when carrying out summative evaluation, feedback from stakeholders on the merit of a project will often relate to how and why aspects of the project could have worked better. The IPL provides a reference frame that helps those doing evaluation to make sense of, and form judgments about, differences between project intent and how it actually worked.

The process of developing an IPL, or reviewing and improving the existing logic or strategy, can also be very helpful. Doing so is, in effect, a formative evaluation activity of a macro kind. By systematically and critically reflecting on how the project is intended to achieve its goals, project leaders and teams will often discover success factors, assumptions or risks they hadn't thought of previously. This can inform improvement of project activities, priorities and sometimes even the objectives.

3.6 Formulate good evaluation questions to focus and frame evaluation

The general characteristics of evaluation questions (EQs) are covered in the workshop. Focusing attention on what's important is one of their most important purposes. EQs should reflect what matters most in terms of project outcomes and effects, as well as processes and other enablers (e.g. good communication). Decisions within a project team about what matters most need to be shaped by the project's objectives, its design and intended logic, and the views of success held by key stakeholders. Posing and obtaining answers to EQs facilitates improving, and being able to communicate about, these things that matter most. Some evaluation questions can be relevant for both formative and summative purposes, and examples of such questions will be discussed in the workshop. Below, some examples of questions are provided that would be mainly relevant to either formative or summative evaluation.

3.6.1 Evaluation questions principally for the formative purpose

In general, EQs for formative evaluation aim to help find out how aspects of a project are going and how they can be improved. They should be shaped mainly by the critical success factors identified for the project (e.g. clear and effective project logic and design; good communication with the students participating in a pilot; the need for the video production quality to be high; effective team performance). Examples of EQs for the formative purpose are provided below.

Is our project logic/design clear and well understood, at least within the project team?

With our priority objectives, what are the significant risks to our desired progress over the next 3/6 months?

Are we engaging effectively with the important people we need to?

How effective is our communication with the students in our pilot scenario-group learning sessions?

In terms of pedagogical qualities, how do academics rate the preliminary video clips we have developed?

3.6.2 Evaluation questions principally for the summative purpose

Summative evaluation is primarily concerned with the making and reporting of judgments about the merit, including the value, of a project overall. For this purpose, having appropriate EQs (or evaluation criteria) is very important. They have a framing and focusing purpose, serving as a cue for keeping project effort targeted on what is most important. If particular outcomes are considered to be critical for project success, summative EQs need to enable the extent of achievement of these to be discovered. If the perceptions of certain stakeholders concerning the significance of the project in a bigger picture are important, questions that focus on these perceptions are required. Some examples of EQs for the summative purpose are provided below, and others will be examined in the workshop.

To what extent were the Project’s main objectives achieved?

How valuable are the web-based resources developed by the Project for enhancing online learning and teaching?

What levels of engagement with and uptake of the work/outcomes of the Project have been achieved across the Australian HE sector?

Has the Project achieved any very significant unintended outcomes and how valuable are these?

How effective was the overall strategy for achieving the Project’s objectives?

What was learned during the Project that is considered valuable for: (a) advancing the work of the Project beyond its life; and/or (b) informing the implementation of other projects having similar contexts or challenges?

How well was the Project led and managed?

Deriving answers to EQs, particularly the higher level kinds normally used for summative evaluation, almost always requires judgments, based on the examination and interpretation of data/evidence gathered via particular methods (e.g. interviews). It is important to emphasize that rich answers to EQs can rarely be obtained simply by the presentation of data/evidence. Subjective judgments and values are always involved, more or less, even where relatively concrete reference points exist (e.g. explicit performance standards).

When developing EQs, it is important to seek to ensure that the answers they will generate will be useful for purposes such as improvement, learning, forming recommendations, or decision making. It is also necessary to keep in mind the relative ease or difficulty of obtaining data and other information to help answer questions.

3.7 Identify and use good indicators and data/evidence

Indicators

Indicators are commonly used where a variable cannot be observed or measured directly. They serve as proxies for the variable of interest. For example, a person's score on a test that measures their attitude to something (e.g. fear of crime) is only an indicator of 'attitude'. The person's attitude (the variable of interest) is latent, that is, it cannot be seen or measured directly. Student satisfaction with teaching effectiveness is often used as one indicator of teaching quality. As a general principle, indicators are known correlates of a variable of interest.

The comments above relate mainly to the concept of an indicator. In practice, indicators have a data component. Data or evidence provide the magnitude (or qualitative) aspect of an indicator. For example, students' ratings gathered by questionnaires often provide the data for student satisfaction indicators. Choosing appropriate indicators, and data/evidence well matched to them (e.g. valid measurements), is important for helping to answer EQs.

When planning evaluation, and particularly when forming EQs, it is important to think ahead about the best kinds of indicators or measures that might be used. For example, EQs concerning the worth or significance of a project for a university will likely need indicators such as the 'perceptions of the project's worth by important constituents' (e.g. academic staff; senior managers). Questions about a project's merit (e.g. the extent of changes achieved in the quality of student engagement) might be best answered by reference to indicators such as 'examiners' and teachers' judgments of the extent and nature of changes', and/or 'shifts in student engagement profiles as measured by reputable instruments'. Answering a question concerning the level of uptake of new policies might be helped by reference to an indicator such as 'the extent of documented decisions in faculties that are demonstrably based on the policies'.

Indicators, along with associated data/evidence, are identified and gathered to help answer EQs and draw conclusions. The following simple example illustrates this point and connects some of the ideas just discussed.

Summative EQ: Has the introduction of the new student support modules developed by the Project resulted in significant enhancement in the quality of the student experience?

Example of an indicator and data used to help answer the question:

Properly gathered student perceptions data (using refined questions) on whether and how their experience has been enhanced by the support modules.

In this example, an indicator (‘student perceptions of the quality of their experience’) has been used as a proxy for the variable ‘quality of the student experience’. Like many educational and social variables, the ‘quality of the student experience’ is subjective and latent (i.e. inside the person and inaccessible to an outsider).
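To connect these ideas, a minimal sketch of the EQ-indicator-data chain is shown below; it simply restates the worked example above as a small record (the field names and the nominated collection method are illustrative choices, not prescribed by this paper):

```python
# Illustrative only: the summative EQ, its indicator and the planned data/evidence,
# kept in one record so the chain from question to evidence stays explicit.
eq_plan = {
    "evaluation_question": (
        "Has the introduction of the new student support modules resulted in "
        "significant enhancement in the quality of the student experience?"
    ),
    "variable_of_interest": "Quality of the student experience (latent)",
    "indicator": "Student perceptions of the quality of their experience",
    "data_evidence": "Student perceptions data gathered using refined questions",
    "collection_method": "End-of-semester student survey",  # hypothetical choice
}

for field, value in eq_plan.items():
    print(f"{field}: {value}")
```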

Data/evidence

Discussions about data and evidence in evaluation can get complicated. In this paper we can only provide some general ideas and advice, hopefully in a balanced way, and encourage people to look further when necessary. Our advice starts with a general view, which some experts might find a little unpalatable. This can be summarized using three short statements rather than a longer passage. We hope this makes our overall message a little clearer, even if the cost is some overlap or redundancy.

1) As a general rule, it is important that data/evidence have at least good levels of validity for their purposes in evaluation. Essentially, in the context of evaluation, good validity of data means they are sufficiently accurate, meaningful and credible for the people using them, for whatever purposes (e.g. improving, monitoring, judging; in general, answering evaluation questions).

2) Sometimes, the intended outcomes of a project, and as a result the evaluation questions that have been set, will require that both the data needed to draw conclusions and the methods for obtaining the data are complex (if possible at all in projects having limited resources and duration). A general example of this would be where an intended outcome is a demonstrable improvement in the quality of learning achieved by students as a result of the project intervention.

3) Most of the time, data/evidence which are ‘good enough’ for helping to answer evaluation questions* can be gathered with relative ease.

* The importance of setting realistic project goals and (then) evaluation questions, in the first place, is worth emphasizing. The process of developing intended project logic can be valuable for critically examining and refining goals, and subsequently, providing guidance for the formulation of sensible evaluation questions (given these goals and the realistic possibilities of the project).

The point made in 3) above is worth emphasizing. As an extension of it, we advise against being influenced too much by people who are overly zealous about conceptual and/or technical rigor, particularly when it comes to measurements as one kind of data/evidence. This can sometimes go as far as a misplaced obsession with technical details that usually have little relevance to effective practical evaluation (e.g. expecting refined evidence of the dimensionality of a scale; unrealistic and/or unnecessary expectations about the precision of measurements). We have both seen examples where this has gone as far as having an intimidating effect on people in project teams. We do not mean to dismiss the sometimes complex conceptual and technical requirements of obtaining data/evidence with sufficient validity for answering particular EQs. As we point out in 2) above, sometimes data challenges are difficult. However, for most projects of the kind supported by the OLT, provided the objectives are realistic and the EQs (or success criteria) set are sensible, obtaining appropriate data/evidence to serve evaluation purposes should be manageable.

More on validity and data suitability

There is a vast literature on the conceptual and technical aspects of validity, which incorporates reliability. In some projects there will be a need to delve into this body of knowledge, or seek assistance from someone who can cut through it and provide relevant advice. As noted above, for most projects this should not be necessary. The ideas on validity we discuss here are limited, but over time we have found that most people find this level of advice helpful. We start by restating what we mean by “good validity”.

Essentially, in the context of project evaluation, good validity of data means they are sufficiently accurate, meaningful and credible for the people using them, for whatever purposes (e.g. improving, monitoring, judging; in general, helping to answer evaluation questions).

There are many ways to demonstrate or advocate the validity of data. Some approaches rely on quite complex statistical methods, measurement theory assumptions, and project (or research) designs that are often not necessary or practical for most OLT projects. This is not the place to discuss such approaches. Clear and well-reasoned connectedness between data considered to be meaningful and useful (in logical terms) for helping to answer an EQ, and the sources of such data, can be a reasonable basis for asserting the validity of the data. A simple example of this can be seen in the proposition that the judgments of relevant employees (e.g. academic staff) in a university have suitable validity for helping to evaluate the merit or significance of particular project effects (e.g. new policies that have been introduced). The advocated validity of the judgments, in this context, is based on the reasoning that the academic staff are well placed to observe and experience "policy changes". As a result, their judgments about the merit or significance of the new policies, from their (stakeholder) perspective, can be seen as being accurate, meaningful and credible, provided proper methods to gather the data are used.

Another way to advocate the validity of data for a certain purpose is through reference to credible bodies of research. An example of this can be seen in the now widely accepted view that properly derived student ratings of teaching have at least moderate validity for helping to assess the overall effectiveness of teaching. This view gained credibility because it is grounded (by reference) in a vast research literature on student evaluation of teaching. It's interesting to note that in the earlier days of this research, many scholars argued that such student ratings had validity based on logic alone, because students are well placed (and better placed than anyone else) to observe and experience teaching directly.

In general terms, the following kinds of data, with their sources evident in the descriptions, are regarded as having at least reasonable levels of validity for most evaluation purposes in learning and teaching development projects.

1. Observations, perceptions and judgments of well-placed people on particular aspects of the project (e.g. merit of outcomes; usefulness of resources developed; significance or value of the project).

2. Material evidence of project achievements or productivity such as products, resources or artifacts (e.g. applications, guidelines, models, staff/student learning programs, software, templates, rubrics, videos)

3. Measurements or other evidence having at least reasonable validity on various variables useful for evaluation (e.g. counts or patterns of usage; data on uptake; unsolicited evaluative feedback; demand for learning about outcomes; records of activities that show engagement with a process)

In relation to "observations, perceptions and judgments" (1 above), such data can be collected in quantitative form (i.e. measurements, such as ratings) or in other forms (e.g. own-voice perceptions through means such as focus groups or interviews).

The use of uncomplicated data for helping to answer EQs

The tables below provide illustrations of the use of relatively uncomplicated "perceptions" data to help answer EQs. Table 1 presents data relating to the following higher level summative EQ.

EQ: How valuable/significant has the Project been in the eyes of important stakeholders?

All of the numbers in the table (1-7) relate to items in a questionnaire. One example of these items (see row 5 in the table) is: “The Project has enabled the identification of some of the most important issues that need to be addressed in XXX YYY in Australia”. All of the items were based on positive mode Likert-style statements, so the “% positive” data represent the sum of the response percentages from the “Strongly Agree” and “Agree” scale categories for each item. The overall power and usefulness of the data in Table 1 come from both the generally high positive percentages, and the fact that the responses represent the perceptions of a number of different project stakeholders. The data in tables 1 and 2 have been modified slightly to simplify and clarify the illustrations.

Table 1: Illustration of the use of perceptions data to help answer a summative EQ
(cell values are the % positive responses from each stakeholder group)

Project Aspect/PEQ Item          | Associate Deans | 'Industry' | Academics | Students
1) Value of outcomes             | 95              | 100        | 90        | 80
2) Significance for XX YY future | 97              | 100        | 80        | 100
3) Project management            | 98              | 100        | 100       | 90
4) Value for learning            | 100             | 100        | 100       | 100
5) Identifying key issues        | 100             | 100        | 95        | 100
6) Ideas for curriculum develop. | 98              | 100        | 90        | 100
7) Importance of follow-up       | 100             | 100        | 100       | 90

Table 2 provides data gathered (via survey) to help answer a formative evaluation question during a project concerned with the development of online learning modules.

EQ: Do academic staff believe the pilot online modules reflect good pedagogical practice?

Table 2: Illustration of data useful for helping to answer a formative EQ (and inform improvements)

Pedagogical Quality Criteria                           | Strongly Disagree | Disagree | Agree | Strongly Agree
1) Relevant to student learning goals                  | 0.0%              | 10.0%    | 70.0% | 20.0%
2) Learning activities are effective for PQR..         | 0.0%              | 15.0%    | 50.0% | 35.0%
3) Student guidance is sufficient to enable ….         | 0.0%              | 35.0%    | 60.0% | 5.0%
4) Balance between 'instruction' and activity is good  | 10.0%             | 50.0%    | 40.0% | 0.0%
5) Overall the module is effective for ……………           | 0.0%              | 5.0%     | 75.0% | 20.0%

* The response data on quality criteria 3 and 4 suggest that improvement is possible on these aspects of the module in question.

Having identified the sufficiently valid data/evidence that can be gathered or accessed, it is important to select and develop appropriate methods for collection, where collection is necessary. This means paying attention to having clear questions, prompts and criteria to guide information collection, and practical, controlled and ethical ways of doing so. Some further suggestions about data and data collection are provided later in the paper.
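As a minimal sketch (the response counts below are invented for illustration and are not the data behind Table 1 or Table 2), the '% positive' figures used in Table 1, and the kind of improvement flag noted under Table 2, can be derived from raw Likert responses with very simple arithmetic:

```python
# Illustrative only: invented raw Likert responses for one questionnaire item,
# counted by scale category. Not the actual data behind Table 1 or Table 2.
responses = {
    "Strongly Disagree": 0,
    "Disagree": 7,
    "Agree": 12,
    "Strongly Agree": 1,
}

total = sum(responses.values())

# "% positive" as used in Table 1: the combined Agree and Strongly Agree share.
percent_positive = 100 * (responses["Agree"] + responses["Strongly Agree"]) / total

# A simple formative flag of the kind noted under Table 2: if disagreement
# exceeds some threshold, the aspect is a candidate for improvement.
percent_negative = 100 * (responses["Disagree"] + responses["Strongly Disagree"]) / total
needs_attention = percent_negative >= 30  # threshold chosen purely for illustration

print(f"% positive: {percent_positive:.1f}")
print(f"% negative: {percent_negative:.1f} (flag for improvement: {needs_attention})")
```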

3.8 Draw credible and useful findings and judgments

Findings need to be credible in all serious evaluation contexts. This is particularly the case for summative evaluation in high stakes situations, for example, where findings or judgments are intended to inform decision making about funding, continuity, or a person's career. To be credible, findings need to be demonstrably based on valid information and evidence, as discussed earlier, and on effective ways of analyzing information and deriving conclusions. Particularly when findings include judgments, the bases for these (e.g. values, criteria and/or logic used) should be made clear. For example, a judgment (or recommendation) about the best way to implement the use of quality indicators at academic program level might be based partly on relevant literature, partly on knowledge of successful approaches used elsewhere, and partly on the views of key stakeholders.

Findings also need to be useful for their intended purposes. They need to tell users something that they need to know in order to act or make a reasonable judgment. For summative evaluation this means that findings need to address the interests or needs of sponsors or decision makers. These will typically relate to matters such as:

the demonstrable ways that outcomes of a project are or appear to be valuable for improving practices, environments, experiences, designs, etc.;

the extent to which objectives that were the basis for funding a project have been achieved;

the evident value of a project by reference to the benefits gained for the investment made, in the eyes of important stakeholders; and

the need for a clear case, based on merit and/or imperatives, for sustaining implementation or providing ongoing support for follow-up initiatives.

For formative evaluation, to be useful, findings need to indicate how improvement or change can enhance project success and performance while it is in progress. Usually, making sure that information sought and obtained is specific for its purpose is the key to effective formative evaluation. For example, asking members of an important committee whether the format and content of the project updates they receive from the project team are appropriate would be strengthened by asking them to nominate any other specific information they would find valuable.

3.9 Engage in action and effective two-way communication

Historically, evaluation has been intended to inform improvement, judgment and decision making, but it is not often seen as encompassing these activities and outcomes. However, in many contexts, including project evaluation, it is important to act on evaluation findings. In particular, significant findings from formative evaluation should be used to make improvements wherever possible. It is often a good idea also to communicate findings and actions taken (or planned) to relevant stakeholders (e.g. academic staff; students). Both of these actions are necessary to satisfy the need for "closing the loop".

Summative evaluation usually concerns an overall project, but in large-scale or long projects summative evaluation might be carried out for a particular important stage (e.g. during or after the trial of a new process being developed). In either case, clear and accurate reporting of outcomes, findings and recommendations to relevant stakeholders, particularly decision makers and sponsors, is important.

3.10 Aspire to professional level evaluation

Overall, seek to be as professional as possible with evaluation. This includes ensuring sufficient rigor with aspects of methodology (e.g. overall evaluation approach; instrument and process design; data collection). We advise that concerns for rigor need to be balanced with the need to ensure the workability of processes, the utility value of information gathered and findings (for their purposes), and consideration of timelines and available resources. Use good practice principles as far as possible to guide evaluation work and seek expert assistance if necessary.

4. Questions to Help Evaluation Planning and Review

The set of questions that follows is intended to stimulate reflection about, and help with, planning and review of evaluation for projects. While not exhaustive, the questions provide a good basis for guiding evaluation for educational development projects. Collectively, the questions do not reflect any particular evaluation philosophy or strategy, although there is an overall orientation to success maximization and emphases on stakeholder engagement, evaluation questions, and the utility of evaluation questions and results for achieving desired project results. The questions have been shaped by the thinking and principles for practice of a number of highly respected scholars and practitioners. Scriven and Eisner, for example, both emphasize the importance of bringing clear values and judgments to evaluation (contrasted with a view of evaluation based on the 'impartial' examination of data). In other words, subjectivity is an inherent part of evaluation. Scriven, who introduced the terms formative and summative evaluation to the literature, stresses the importance of and differences between these two key purposes. He is also adamant that unintended outcomes of projects and programs should be discovered and judged through evaluation. Owen and Patton both stress the importance of ensuring that evaluation questions, information and conclusions have high utility value for helping with decision making and/or enabling change. The effective and continual engagement of key stakeholders in evaluation is considered to be particularly important by Owen and Patton.

Contact for Clarifications

Any questions concerning the ideas in this paper can be referred to Patrick Boyle ([email protected]).

Questions to Help Evaluation Planning and Review

1. Effective Stakeholder Engagement

1.1 Who are the most important stakeholders in/for this project?
1.2 What are the best ways to communicate with different stakeholders?
1.3 Which project objectives are likely to be most important to each group?
1.4 For each key stakeholder group, what level of understanding of the project's objectives, logic and activities is desirable?
1.5 For each group, which intended outcomes are they likely to value most?
1.6 Are there particular things that specific stakeholders would value from the project? (What would 'very successful' look like to them?)
1.7 Which particular factors (actions; conditions; supports; stage outcomes) do key stakeholders regard as being critical for project success?
1.8 How might each stakeholder group be engaged constructively to help evaluation of the project and/or the achievement of success?

2. Most Important Objectives and Outcomes

2.1 Which of the project's stated objectives/intended outcomes are critical?
2.2 What other outcomes are highly desirable (in light of stakeholder views)?
2.3 Do key stakeholders, including the project team, have sufficiently clear understandings of the project's main objectives?
2.4 Is there a need to adjust project objectives, logic and/or activities (and to advise key stakeholders of changes)?
2.5 In broad terms, what indicators and evidence would be most powerful for demonstrating the achievement of the most important objectives?

3. Evaluation Questions (EQs)

- Questions that help frame and focus evaluation, guide information gathering and derive findings.

- Examples provided are indicative only and are at the more general level; most would generate more specific and/or follow-up questions that would be answered by gathering information via appropriate methods (e.g. survey).

- In general, conclusions (‘answers’) related to EQs are derived from consideration of gathered data and evidence, particularly material evidence (e.g. products), measurements and qualitative data, and stakeholders’ perceptions and judgments about merit and significance/value.

3.1 Examples: principally for the summative evaluation purpose

1) To what degree have the main project objectives been achieved?
2) With reference to Objective 4, is student engagement with their learning activities changing in the ways intended? What is the best evidence to show the nature and significance of the effect?

3) What valuable unplanned outcomes or achievements have resulted from the project?

4) What has been learned from the project that is useful for future stages, follow-up initiatives, or implementations elsewhere?

5) Have substantial goals, plans or strategies been established to enable continuity/sustainability of the project’s work/effects?

6) How valuable are the outcomes of the XYZ Project in the bigger picture of Medical Education in Australia?

7) How effectively was the project led and managed?

3.2 Examples: principally for the formative evaluation purpose

1) Are we on track with our deliverables for Stage 2?
2) Is our consultation process with students working well for them, particularly in terms of assisting them with use of the new SAFL application?
3) Which of the unresolved aspects of the new E-P do academic staff believe need improving quickly?
4) How effective is our communication process with the Senior Management Group for achieving what we need to?
5) Are members of the Project Team working effectively, as a team, and according to our agreed values?
6) What improvements do our collaborating academics want to the support being provided by our assessment advisors?
7) How can we improve the method (e.g. criteria) we are using to quality assure the videos we are developing?
8) Have the student ratings of the key features of the pilot CPL tool changed significantly during the semester, and what can we learn from them to further improve it?

9) What are the main things we have learned from the pilot run of the Scenario Based Learning sessions (particularly in relation to evident success factors and inhibitors)?

4. Concerning Data, Evidence & Sources

One of the most important good practice principles for evaluation is the need to obtain good data/evidence to answer evaluation questions.

4.1 What valid (including useful) kinds of data/evidence can be obtained which would help answer each EQ? (i.e. credible data that will inform improvement; help draw conclusions; result in learning; demonstrate success)

4.2 Where primary data/evidence (e.g. direct measurements; concrete outcomes) might not be available, do valuable kinds of secondary evidence exist (e.g. explicit indications of intention to adopt the model developed; logical inferences based on knowledge of outcomes in similar contexts)?

4.3 What kinds of data/evidence having reasonable validity are accessible, given the circumstances of the project?

5. Concerning Methods for Gathering Data

For each EQ or important objective, what are the best feasible methods for gathering data/evidence to assist evaluation?

6. Concerning Evaluative Conclusions and Communication

6.1 What are the most powerful/useful conclusions that can be drawn (or we would like to draw) concerning the project objectives and the needs and interests of key stakeholders?

6.2 How can stakeholders be involved constructively in the drawing of conclusions?

6.3 How does the available evidence help demonstrate that particular outcomes have been achieved (and their merit and/or worth)?

6.4 What is the main demonstrable value of the project, particularly in terms of its contribution to the field in which it has been working?

6.5 In what ways are the outcomes important for key stakeholders and the Australian HE sector more generally?

6.6 What are the most effective ways for dissemination of the outcomes of the project, particularly to achieve active positive involvement of key people?

6.7 What are the best ways that outcomes can be used to facilitate desired effects beyond the project (e.g. increased engagement and buy-in; change; sustained effort; follow-up projects)?

Some Methods for Gathering Data/Evidence

- Direct measurement/assessment via instruments designed to determine, for example, perceived merit or value, attitude change, or shifts in practices
- Surveys (to assess stakeholders' perceptions and judgments)
- Structured interviews or similar activities
- Focus groups or similar interactive free-voice processes
- Logging of feedback received from stakeholders or other relevant parties, particularly unsolicited feedback (e.g. emails; network postings)
- Web analytics
- Desk-based investigation/review (e.g. distillation of information from files/documents such as reports or minutes)
- External expert reviewers (to assess merit of outcomes such as resources)
- Project achievement logs or internal evaluation reports
- Extraction of data or trends (e.g. in nationally administered survey results; evidence-based changes in practices; usage patterns)