A Research Synthesis of the Evaluation Capacity Building Literature

http://aje.sagepub.com/American Journal of Evaluation

http://aje.sagepub.com/content/early/2012/01/27/1098214011434608The online version of this article can be found at:

DOI: 10.1177/1098214011434608

published online 27 January 2012American Journal of EvaluationLesesne

Susan N. Labin, Jennifer L. Duffy, Duncan C. Meyers, Abraham Wandersman and Catherine A.A Research Synthesis of the Evaluation Capacity Building Literature

Published by:

http://www.sagepublications.com

On behalf of:

American Evaluation Association

can be found at:American Journal of EvaluationAdditional services and information for

http://aje.sagepub.com/cgi/alertsEmail Alerts:

http://aje.sagepub.com/subscriptionsSubscriptions:

http://www.sagepub.com/journalsReprints.navReprints:

http://www.sagepub.com/journalsPermissions.navPermissions:

What is This?

- Jan 27, 2012OnlineFirst Version of Record >>

at American Evaluation Association on February 9, 2012aje.sagepub.comDownloaded from

A Research Synthesis ofthe Evaluation CapacityBuilding Literature

Susan N. Labin1, Jennifer L. Duffy2, Duncan C. Meyers2,Abraham Wandersman2, and Catherine A. Lesesne3

Abstract

The continuously growing demand for program results has produced an increased need for eva-luation capacity building (ECB). The Integrative ECB Model was developed to integrate concepts fromexisting ECB theory literature and to structure a synthesis of the empirical ECB literature. The studyused a broad-based research synthesis method with systematic decision rules and demonstrates theviability of the method for producing a reliable analysis of disparate data from a variety of designs.There was a high degree of consistency in what was reported in the empirical literature and thetheoretical literature in terms of strategies and outcomes. Reported outcomes at the individual levelincluded attitudes, knowledge, and behaviors and at the organizational level included practices,leadership, culture, mainstreaming, and resources. Collaborative processes and programmaticoutcomes emerged as important issues for ECB models and practice. The consistency between theempirical and the theoretical literature indicates that the field is ready to develop common mea-sures, use stronger designs, and report more systematically. This synthesis provides an overview ofexisting data and an empirical basis for refining strategies and common measures for enhancing theresearch and practice of ECB to achieve ECB and programmatic goals and outcomes.

Keywords

ECB, evaluation capacity building, research synthesis, capacity building, systematic review

For at least the past decade, evaluation capacity building (ECB) has been attracting the interest of

evaluators committed to increasing stakeholder understanding of evaluation and building evaluation

culture and practice in organizations (Boyle, Lemaire, & Rist, 1999; Compton, Baizerman, & Stockdill,

2002; Fetterman, Kaftarian, & Wandersman, 1996; Milstein & Cotton, 2000). Consequently, there is a

growing theoretical and empirical ECB literature (Compton et al., 2002; Cousins, Goh, Clark, & Lee,

1 Washington, DC, USA2 Department of Psychology, University of South Carolina, Columbia, SC, USA3 ICF Macro, Atlanta, GA, USA

Corresponding Author:

Susan N. Labin, 8517 Rayburn Road, Bethesda, MD 20817, USA

Email: susan@susanlabin.com

American Journal of Evaluation00(0) 1-32ª The Author(s) 2012Reprints and permission:sagepub.com/journalsPermissions.navDOI: 10.1177/1098214011434608http://aje.sagepub.com

2004; Preskill & Boyle, 2008). However, to date there has not been a systematic review of the empirical

ECB literature. The purpose of this article is to address this gap.

Synthesizing and taking stock of the existing empirical literature on ECB has a number of

intended benefits. A systematic synthesis provides an evidence base about how ECB is being prac-

ticed in the field. The results can illuminate the current landscape of ECB by describing the contexts

and settings where it occurs, the strategies used, the outcomes reported, and how the strategies have

been evaluated. Such a descriptive base is an important step in the development of ECB and can

contribute to the improvement of ECB practice by increasing information sharing for the many

evaluators who have incorporated ECB into their practice (according to an American Evaluation

Association [AEA] membership survey, at least half of those responding reported they included

ECB in their practice; AEA, 2008). This descriptive base can also be of value to organizations and

funders that embark on the journey of ECB for improving programs and organizations.

Building on the work of others who have contributed to developing a language of and indicators

for ECB (Compton et al., 2002; Preskill & Boyle, 2008), this synthesis systematically codifies the

literature. Common terminology and indicators are the basis for describing ECB processes and are

prerequisites for evaluating ECB efforts; all of which will enhance both the practice and science of

ECB. As with all syntheses (Labin, 2008), this synthesis is intended to clarify existing knowledge,

raise questions, and reveal gaps to inform future practice and research.

Defining ECB

Developing a working definition of ECB was an essential first step for synthesizing the literature on

ECB. Various definitions of ECB have been proposed (Boyle et al., 1999; Preskill & Boyle, 2008;

Schaumberg-Muller, 1996; Stockdill, Baizerman, & Compton, 2002). Our review of these

definitions identified common features. For example, each identifies ECB as an activity separate

from actually conducting evaluations. There were also differences among definitions-some focus

primarily on ECB as an activity at the organizational level (Stockdill et al., 2002), while others are

concerned with capacity building at the individual and organizational levels (Preskill & Boyle, 2008;

Schaumberg-Muller, 1996).

In addition to examining explicit definitions of ECB, we drew on the work of collaborative,

participatory, and empowerment evaluation, which are precursors or approaches that include aspects

of ECB. For example, in empowerment evaluation, building evaluation capacity in order to improve

program outcomes has been a central and explicit principle since its inception in the early 1990s

(Fetterman et al., 1996). Empowerment evaluation and other participatory and collaborative approaches

emphasize how goals similar to those of ECB could be achieved through participatory means and thus,

provided much of the foundation for what has became known as ECB (Cousins & Whitmore, 1998;

Fetterman & Wandersman, 2005; Love, 2006; O’Sullivan, 2004; Rodriguez-Campos, 2005). Based

on our review of these frameworks and approaches, we developed the following working definition

of ECB:

Evaluation capacity building (ECB) is an intentional process to increase individual motivation,

knowledge, and skills, and to enhance a group or organization’s ability to conduct or use evaluation.

The Integrative ECB Model

This working definition was used to identify the types of cases to include in the synthesis. In order to

determine what information would be systematically extracted from the cases, we developed an

Integrative ECB Model (Figure 1) based on existing ECB frameworks and a review of both the

2 American Journal of Evaluation 00(0)

theoretical and empirical ECB literature (Baizerman, Compton, & Stockdill, 2002b; Cousins et al.,

2004; Duffy & Wandersman, 2007; Milstein & Cotton, 2000; Owen, 2003; Preskill & Boyle, 2008;

Suarez-Balcazar et al., 2010). In particular, we used Preskill and Boyle’s (2008) Multidisciplinary

Model of ECB. We developed the Integrative ECB Model to ensure that the synthesis reflected and

integrated key elements in existing theory and empirical literature on ECB and did not restrict

information to that from any one existing framework or subset of the literature. Our intention was

to maximize the breadth of information to be extracted, advance the development and use of

common terminology, and operationalize the study’s working definition of ECB.

A basic logic model of Needs-Activities-Outcomes (Kellogg Foundation, 2001; United Way,

1996, 2008; University of Wisconsin-Extension, 2003) was used to organize and portray the key

circumstances, activities, processes, and outcomes of ECB. The logic model structure implies a

causal direction from left to right, that is, needs will affect the strategies and strategies will affect

outcomes achieved. Furthermore, implementation and evaluation descriptors may mediate the

effects of strategies on outcomes.

Individual levelo Attitudeso Knowledgeo Skills/Behaviors

Organizational level o Processes, Policies, and

Practices (PPP)o Leadershipo Organizational Cultureo Mainstreamingo Resources

Program Outcomeso Developmento Implementation o Results

Negative outcomes

Lessons Learned

ShortLong-Term &

SustainableStrategieso Theoryo Modeo

o Typeo Content

Implementationo Target

Population OrganizationDomain

o Timing, Frequency,Dosage

o Mid−CourseCorrectionsBarriers

Evaluation of ECBo Approacho Designo Measureso Data typeo Timeframeo Internal or External

Reasons–Motivationso Audience: Internal−

Externalo Assumptionso Expectations

Goals–Objectives

Contexto Needs Assessmento Tailored

Resources and Strengthso Individual−Attitudeso Organizational

Resources: staff/time/moneyEvaluationexpertisePractices,leadership,

streaming

III. RESULTS: OUTCOMES

•ooo

II. ACTIVITIES: WHAT and HOW

•ooo Level: Individual-

Organizationaloo

o•oooooo

I. NEED:WHY

culture, main-

Figure 1. Integrative evaluation capacity-building model.*Collaborative and participatory aspects and processes should be included in defining and operationalizingnearly all elements of the model.

Labin et al. 3

Our goal was to create a model that included key activities and processes in ECB. We realize that

our model is a simplification and may not have identified every interactive process that is part of

ECB. However, we believe we have captured the major activities and processes, thus allowing for

systematic extraction and coding of data.

Need for ECB––Why

The first column of the model relates to the need for conducting ECB and who and what motivates

the interest in ECB (Milstein & Cotton, 2000; Preskill & Boyle, 2008). ECB may be driven by

factors internal to the organization (such as a leader’s desire to increase evaluation within the orga-

nization), by external factors (such as funder requirements), or by a combination of internal or exter-

nal factors. Preskill and Boyle emphasize the importance of considering three elements related to the

need for ECB: (a) motivation for ECB; (b) assumptions and expectations about ECB; and (c) iden-

tification of goals and objectives for ECB. They point out that ‘‘Understanding the organization’s

motivation for engaging in ECB . . . provides insight into who should participate and which teaching

and learning strategies might be most beneficial’’ (p. 446). Related to the motivation for ECB are

assumptions that may underlie the desire to engage in ECB. Preskill and Boyle suggest that when

these assumptions are not shared among the key people involved in ECB, the success of the effort

may be inhibited. They also note that the explication of specific objectives is important for the suc-

cessful design and implementation of ECB efforts.

Conducting a needs assessment and tailoring ECB efforts to the particular population and con-

text can affect the selection and implementation of the ECB strategies. The existing characteristics

of an organization have also been hypothesized to affect the type of strategies utilized and their

efficacy. Some of these specific factors noted in the literature include attitudes toward evaluation,

availability of resources for ECB (staff, time, and financial), internal evaluation expertise, and

organizational practices and capacities such as support for evaluation and ECB from leadership,

from the organizational culture, and through mainstreaming or making evaluation a routine part

of the organization (Milstein & Cotton, 2000; Owen, 2003; Preskill & Boyle, 2008; Suarez-

Balcazar et al., 2010). These factors have been hypothesized to affect organizational learning and

the extent to which the outcomes of ECB will become sustainable (Preskill & Boyle, 2008). In the

Integrative ECB Model, many of these preexisting characteristics that can facilitate the ECB pro-

cess are defined as strengths and resources.

ECB Activities––What and How

The second column of the Integrative ECB Model categorizes the activities of the ECB strategies,

implementation specifics, and evaluation of the ECB efforts. Various aspects of strategies are

defined to capture important dimensions that define their nature and effectiveness. Some ECB

efforts are justified by an underlying theory or approach such as empowerment evaluation or orga-

nizational learning. Preskill and Boyle (2008) identified a number of types of theories which can

inform the design and implementation of ECB strategies, including theories about evaluation, adult

learning theory, and theories related to organizational change and development.

ECB strategies may be provided through multiple modes, such as face-to-face meetings, telecon-

ferences or phone calls, e-mail or other web-based mechanisms, and through written materials such

as evaluation manuals. Some may use a mode of exclusively face-to-face efforts, while others might

utilize only distal modes such as phone or web. Strategies can be directed at the individual level for

learning and behavior change and at the organizational level. The ECB literature addresses how the

level at which strategies are directed affects outcomes (Preskill & Boyle, 2008; Stockdill et al.,

2002). Types of strategies refer to the mechanisms of delivery, that is, training, technical assistance

(TA), and experiential involvement or participation in evaluation activities (Duffy & Wandersman,

2007; Milstein & Cotton, 2000). Furthermore, there is also discussion in both the empirical and

theoretical literature about the substance or content of the ECB activities. At the individual level,

strategy content focuses on attitudes and evaluation curriculum, for example, designing evaluations

and analyzing data. At the organizational level, strategy content focuses on collective activities such

as using evaluation as part of organizational processes and practices, providing leadership support

for evaluative activities, fostering a learning culture, mainstreaming evaluation by making it a more

routine aspect of how the organization functions, or increasing resources for evaluation (Compton et

al., 2002; Duffy & Wandersman, 2007; Sanders, 2003; Suarez-Balcazar et al., 2010).

Implementation variables are not only important descriptors of ECB efforts, but they also may

mediate effects of the strategies on outcomes (Durlak & DuPre, 2008; Rapkin & Trickett, 2005).

Standard implementation variables were included in the model—population, organizational setting,

domain or area—in order to both describe existing ECB practice and to determine whether these

factors mediate the effects of strategies on outcomes (Preskill & Boyle, 2008). As with most inter-

ventions, timing, frequency, and dosage of the interventions may also affect the outcomes. Mid-

course corrections to accommodate population and contextual issues are often cited as an

important implementation variable for increasing the likelihood of positive outcomes (Wandersman,

Imm, Chinman, & Kaftarian, 2000). Conversely, barriers encountered may hinder the likelihood of

achieving positive outcomes.

The activities of the ECB efforts include not only strategies but also evaluations of those efforts.

The nature of the evaluations is important for developing indicators, measuring, and interpreting out-

comes (Preskill & Boyle, 2008). The descriptive categories of the evaluations represent basic com-

mon and important features of evaluations including the approach, design, measures, data type, time

frame, and who conducted the evaluation.

ECB Results—Outcomes

Both individual and organizational outcomes of ECB have been hypothesized. Individual-level

outcomes include improved attitudes, knowledge, and skills as evidenced in behaviors such as enga-

ging in various evaluation activities (Duffy & Wandersman, 2007; Owen, 2003; Preskill & Boyle,

2008; Suarez-Balcazar et al., 2010). Preskill and Boyle hypothesize that individual learning is

affected by the organizational context and Taylor-Ritzler, Suarez-Balcazar, and Garcia-Iriarte

(2010) found that supportive organizational characteristics were necessary for individual learning

and behavior change. Organizational-level outcomes in the Integrative ECB Model consist of five

organizational characteristics that are important for a successful ECB effort. Processes, policies, and

practices (PPP) relate to the doing and using of evaluation, both hallmarks of ECB (GAO, 2003;

Gibbs, Napp, Jolly, Westover, & Uhl, 2002; Preskill & Boyle, 2008; Suarez-Balcazar et al.,

2010). Leadership is included because of its well-established importance for organizational change

(Kotter, 1996; Milstein & Cotton, 2000). Organizational culture is the collective values, attitudes,

goals, and practices that can support or hinder organizational change and is considered an essential

outcome or indicator of successful ECB (Boyle, 1999; Cousins et al., 2004; GAO, 2003; Owen,

2003; Preskill & Torres, 1999; Suarez-Balcazar et al., 2010). We selected indicators of

organizational culture that have been identified as conducive for ECB such as being committed

to learning and using data (Preskill & Torres, 1999; Robinson & Cousins, 2004). Mainstreaming

or the routinization of evaluation is considered an essential element in the sustainability of ECB,

which is considered by some the most important long-term goal (Boyle, 1999; Preskill & Boyle,

2008; Sanders, 2003; Stockdill et al, 2002). Resources to support evaluation are hypothesized to

be important if ECB is going to be successful (Boyle et al., 1999; Gibbs et al., 2002; Milstein &

Cotton, 2002; Preskill & Boyle, 2008). The five organizational characteristics are elements in the

Labin et al. 5

Integrative ECB Model and are derived from the ECB literature as intended and necessary outcomes

to document successful ECB practice (Duffy & Wandersman, 2007).

Program outcomes refer to the outcomes that are the mission of the organization that is hosting

the ECB effort. For example, a health program may want to improve prenatal and newborn health

outcomes. Program outcomes were neither extracted from the literature nor included in the synth-

esis, but rather were added to the model because of reports in the empirical literature that program

outcomes improved as a result of the ECB. Improved program outcomes have been a long-standing

rationale for conducing ECB (Wandersman, et al., 2005). Negative outcomes were included in order

to track any unintended negative consequences of ECB. Lessons learned as comments by the authors

were collected from the cases.

Research Questions

Using the Integrative ECB Model as a guide, we identified the following research questions:

1. What are the needs preceding ECB efforts?

2. What strategies are being used for ECB and what implementation variables are being reported?

3. What evaluation approaches and methods are being used to assess ECB efforts?

4. What outcomes of ECB are being reported at the individual and organizational levels?

In addition to these four descriptive questions, we examined several relationships posited by the

logic model structure, that is, the causal direction from left to right.

1. How do strategies vary by the presence of pre-existing resources?1

2. How do the outcomes vary by strategies?

Method

We employed a broad-based research synthesis method (Labin, 2008) that uses systematic decision

rules that are derived from principles of meta-analysis and that distinguish synthesis from traditional

literature reviews. However, meta-analysis usually includes only randomized control trials (RCTs;

Cooper & Hedges, 1994; Higgins & Green, 2011; Labin, 2008), whereas broad-based synthesis

methods include studies with a variety of designs and types of data. Broad-based synthesis methods

have been developed and used by the Government Accountability Office (GAO, 1987, 1989, 1992a,

1992b) and the Centers for Diseases Control and Prevention (The Guide to Community and Preven-

tive Services, 2011). While broad-based methods are appropriate in well-developed fields to incor-

porate data from a variety of designs, they are especially and perhaps uniquely well suited for fields

early in their development such as ECB.

Research synthesis, both meta-analysis and broad-based, involves a series of steps, each of which

is governed by systematic decision rules:

1. Define the research questions.

2. Collect information sources.

3. Select information sources based on inclusion criteria.

4. Extract and code data.

5. Analyze data.

6. Present findings.

The Integrative ECB Model displayed in Figure 1 served as the source of the research questions.

To guide the collection and selection of sources, the research team conducted electronic searches

using the nine phrases identified in Table 1. Seven social science electronic databases were searched

including ERIC, Psychinfo, and Sociological Abstracts. Team members also conducted a manual

review of the tables of contents (from 1998 through August 2008) for seven evaluation journals.

We were aware that including only published articles might create ‘‘publication bias’’ (Begg,

1994), which assumes significant findings are more likely to be published. Our inclusion criteria sti-

pulated that the documents must be empirical examples of ECB that also met our definition of ECB.

The search was wide and included studies published as articles or book chapters using all types of

data and methods including case studies, narrative descriptions, qualitative data, and quantitative

methods. Articles did not have to use the term ‘‘ECB.’’ However, articles that were solely about ECB

theory, theoretical frameworks, assessment methods, fictional examples, or descriptions of ECB in

general were excluded. We also excluded articles describing evaluation courses and articles discuss-

ing ECB efforts directed at building national capacity, which were determined to be substantially

different and beyond the scope of this study. However, we did include studies of ECB funded by

federal agencies. The initial searches yielded over 500 results and a manual scan of titles and

abstracts reduced this number to 149 (Table 2). A more in-depth reading of abstracts and in some

instances the full text of the documents produced a final sample of 61 unique cases that met the

inclusion criteria and were coded.

For the purpose of this study, the authors identified each case as a single real-world example of

the practice of ECB. Some cases focused on a single ECB practitioner’s work with a single organi-

zation (Appendix A, e.g., Cohen, 2006). Others focused on a team of ECB practitioners working

with multiple organizations as part of a single ECB initiative, such as a multiyear project intended

to build evaluation capacity throughout a statewide network of substance abuse prevention agencies

(Appendix A, e.g., Stevenson, Florin, Mills, & Andrade, 2002). Sometimes one example of ECB

practice was discussed in multiple articles (Appendix A, e.g., Huffman, Lawrenz, Thomas, &

Table 1. Collect Information Sources: Searches

Databases Searched: Dissertations and ThesesEBSCO Business Source PremierERICPsychInfoSociological AbstractsSocial Work AbstractsWeb of Science

Search terms used: Developing evaluation capacityEmpowerment evaluationEvaluation capacity buildingEvaluation capacity developmentEvaluation skill buildingEvaluation technical assistanceEvaluation trainingEvaluative inquiryInsourcing

Journals for which tables of contents weremanually reviewed (from 1998 to 2008)

American Journal of EvaluationCanadian Journal of Program EvaluationEvaluationEvaluation and Program PlanningEvaluation Journal of AustralasiaJournal of Multidisciplinary EvaluationNew Directions for Evaluation

Labin et al. 7

Clarkson, 2006; Huffman, Thomas, & Lawrenz, 2008), for which information from all the related

articles were coded as a single case. Other times a single article presented information on more than

one example of ECB practice (sometimes comparing and contrasting the approaches and results);

each of these examples was coded as a separate case (Appendix A, e.g., Hoole & Patterson,

2008). The determination of whether ECB examples were treated as a single case including multiple

organizations or multiple separate cases was based on whether the authors described their ECB prac-

tice as part of a single effort or multiple separate efforts. Appendix A provides a list of all the cases

used in the analysis and the articles reviewed in order to code each case.

Coding was conducted by a team of three researchers (authors of this article including one

senior evaluation researcher and two advanced doctoral-level students). The Integrative ECB

Model provided the major concepts that were to be coded. A coding form was developed to

structure the systematic extraction and coding of data from each case. The coding form oper-

ationalized the elements in the model and provided indicators and response categories for each

element.2

The items on the coding form were either binary (e.g., whether or not the ECB effort was eval-

uated) or categorical (e.g., whether the target population consisted of professional evaluators, pro-

gram practitioners/staff, a combination of the two). Many items on the coding form also included an

‘‘other’’ response option that allowed the coder to write-in additional relevant information. These

‘‘other’’ responses were included for nearly every element in the model, for example, preexisting

strengths, strategies, target populations, evaluation descriptors, outcomes, and barriers. Coders uti-

lized the ‘‘other’’ option to write in a response only if the case reported information that was related

to an item on the coding form but was not captured by any of the response options on the coding

form. Each of the ‘‘other’’ responses was assessed by several researchers and was either collapsed

into one of the existing coding categories or remained a true ‘‘other’’ response. With a few

Table 2. Select Information Sources Based On Inclusion Criteria

Selection Stage Included Excluded

1. Initial search of databases andtables of contents

149 articles identified based ondefinition of ECB

Articles that did not meet ECBdefinition were excluded fromfurther review.

2. Review of abstracts 101 items met preliminaryinclusion criteria

66 items excluded because they did notmeet criteria for an empirical case ofECB

3. Coding of articles 85 book chapters and articleswere coded, yielding a totalof 79 cases of ECBa

7 dissertations were excluded(precluded from coding process dueto length)6 items were excluded because onreview of full article they did notmeet ECB case criteria3 items were excluded becauseinsufficient information to code thecase was provided

4. Exclusion of classroom-basedcases

61 cases were included in finalanalyses after excludingclassroom-based cases.

18 cases were excluded after decisionthat classroom-based casesrepresented were substantivelydifferent from other types of ECBcases.

aSome cases of ECB had multiple articles or chapters written about them, and some articles described multiple distinct casesof ECB. Multiple articles on a single case were treated as one case and multiple cases were coded within a single article whenappropriate. See Appendix A.

exceptions, the vast majority of the information fell into the existing response categories. Coders

also collected comments that related to lessons learned and conclusions by the authors.

A coding manual that operationalized each of the items on the coding form was used to ensure

that coders followed systematic coding rules to extract data for each case from the associated arti-

cle/articles. The manual directed coders to avoid making assumptions and to only code material that

was explicitly reported. While no assumptions were made about the presence of a concept or item,

coders generally could code the presence of a given concept even if the authors did not use the same

terminology as the coding manual. For example, mainstreaming would be coded as an outcome if it

was reported that the ECB effort was successful in making evaluation an integral part of how the

organization functions even if none of the specific indicators of mainstreaming or the specific term

was used. In a few instances, use of the same terminology as the coding manual was required. For

example, use of ‘‘written documents’’ was only coded if there was explicit mention of manuals or

other written documents created for or accessed by the project.

Both the coding form and the accompanying coding manual were refined through an iterative

piloting process. On several occasions at least two of the coding team rated a subsample of cases

and all discrepancies were discussed and decision rules were established, refined, and recorded in

the coding manual. During this piloting process, we dropped several items in the model because

there was insufficient detail in the reviewed articles for them to be reliably coded. These dropped

items were (a) the reasons for conducting the ECB effort; (b) the implementation variables of time,

frequency and dosage; and (c) whether it was an internal or external evaluation. A subset of 10 cases

was selected as a purposeful sample on which to assess intercoder reliability and agreement. Purpo-

seful sampling was used in order to obtain a group of cases differing on major concepts of interest

and varying in terms of the level of detail reported. Having such a diverse sample of cases, as sug-

gested by Lipsey and Wilson (2001), provided an opportunity for coders to discuss difficulties

encountered and develop additional decision rules for any discrepancies in how items were coded.

This process was the basis for further refinement of the coding manual.

All three researchers participated in coding the reliability sample; two raters coded each case. The

overall interrater reliability calculated for Cohen’sk among all raters, interpreted using the ranges estab-

lished by Landis and Koch (1977), was in the substantial range (k ¼ .66), with specific results ranging

from moderate (k ¼ .46) to almost perfect (k ¼ .83). Cohen’s k coefficient was chosen as a statistical

measure of intercoder agreement given the categorical nature of the items in the coding form.3

The k statistic may understate interrater agreement and result in a misclassification to a lower

rating when there is little or no variability in ratings. To address this possibility of misclassified rat-

ings, percent agreement was calculated as an additional measure of interrater agreement. Percent

agreement was calculated by dividing the number of agreements between coders by the sum of the

number of agreements plus the number of disagreements. The resulting percent agreement numbers

are much higher than those for the k statistics. The overall percent agreement was 85% with specific

levels ranging from 70.3% to 94.8%.

After coder reliability and agreement were established, and adjustments were made to the coding

process, the remaining cases were coded by one coder on the team. Each coder extracted data via the

coding form and the data were scanned and compiled electronically to ensure accuracy.

Notwithstanding the relatively strong overall levels of reliability and interrater agreement

achieved in this review, there were several issues that posed challenges. First and foremost was the

high degree of variability in the level of detail with which the ECB cases were described in the orig-

inal articles. A second factor was that for many articles the narratives did not follow the structure of a

traditional research article—they did not have methods or results sections––and concepts of interest

were dispersed throughout the articles in such a way that the coder had to review the article multiple

times in order to code information for that case. Notwithstanding these challenges, it is likely that the

coder agreement statistics understate the reliability of the coding because the coding procedure was

Labin et al. 9

refined during the coding of the reliability sample, and the coding manual was improved with more

decision rules. Due to the enhanced detail of the revised manual—based on the resolution of discre-

pancies through group discussions––it is reasonable to assume that the coding of the remaining cases

(n ¼ 51; 83%) was characterized by an even higher degree of intercoder agreement and a higher

degree of accuracy than reflected in the k statistics and the percent agreement percentages for the

reliability sample.

The data from this systematic synthesis, as data for syntheses in general, are only as comprehensive

as the reporting for each case in the published articles. It seems reasonable to assume that the major

elements such as strategies and outcomes would have been reported and described by the authors. Nev-

ertheless, it is possible there was some underreporting of major issues and their relevant detail.

The coding results for the elements in the model were the basis of descriptive analyses—fre-

quencies, percentages, mean number of responses—to answer the first four research questions

(Tables 3–10). These distributions were assessed for the appropriateness of cross-tab analyses

to address the relational research questions (Questions 5 and 6). Limited variation in distributions

and small cell frequencies posed analytical constraints. However, the data for several of the

descriptive variables was judged adequate for relational analyses (Tables 11–14). Nevertheless,

given the limitations of the data the findings from the relational analyses are considered

exploratory.

Results

Goals, Contexts, and Strengths

The first column of the model refers to establishing the need for the ECB effort or why ECB was

initiated. In spite of the importance of this information to the design of ECB efforts (Preskill &

Boyle, 2008), this information was infrequently and incompletely reported. A majority of the cases

mentioned having goals and objectives (Table 3), but there were not enough specifics to reliably

code any further details. While needs assessments were not common (30%), a majority did report

tailoring the ECB effort to fit the context (64%).

A relatively small percentage of cases (slightly less than 15%) reported positive individual-level

attitudes toward evaluation as a strength preceding the ECB effort. However, a majority of cases

indicated some organizational-level strengths (62%) preceding the ECB effort. Indicators defining

organizational strengths are specified in Table 3. ‘‘Other’’ strengths noted included relationship

issues, such as those between the evaluators and the organizations.

Strategies Used and Implementation Variables

In the majority of cases (80%; Table 4), some theory or principles were reported as guiding the ECB

strategies used and about half of these provided some specifics on that theory. Among those that

reported specifics on the theory, two thirds explicitly mentioned participatory, collaborative, or

empowerment evaluation, of which empowerment evaluation was the most frequently reported

(about half). The type of strategy most commonly used was training (77%; Table 5). There was also

frequent use of multiple strategies and frequent use of involvement in an evaluation. Additional

information sometimes provided details on how the strategies were implemented, especially

including collaborative and interactive processes such as involving stakeholders or creating a

‘‘learning team.’’

Nearly all cases (91.8%) reported specific content for the individual strategies. Virtually all of

these efforts were directed toward multiple competencies in evaluation. Of those that targeted

‘‘doing an evaluation,’’ more focused on collecting data than analyzing data (77% compared to

50%). The single most frequently mentioned ‘‘other’’ content that strategies targeted was

collaborative and interactive processes. The least frequently reported content of individual-level

strategies was directed toward influencing attitudes about evaluation.

Half of the cases reported some content for organizational-level strategies of which the most fre-

quently reported (over one third) was ‘‘organizational practices’’ and the least frequently reported

was ‘‘how to build leadership support’’ (Table 6).

Three types of barriers were coded: external, individual, and organizational (Table 7). About half

of the cases reported individual-level barriers of which attitudes were the most prevalent (44%). This

was most often specified as the conflict between evaluating and providing service. However, other

details about the individual-level barriers indicated that not only was there difficulty applying new

knowledge, but there was also difficulty learning the evaluation subject matter. Organizational-level

barriers were the most commonly reported type of barriers (60%); and the most common of these

were resources (30/37¼ 81.8% of the organizational barriers). ‘‘Other’’ responses focused on collabora-

tive and interactive challenges such as those between stakeholders, evaluators and clients, funders, orga-

nizations, and the community.

Evaluation Approaches and Methods Used To Assess ECB Efforts

As shown in Table 8, about half of the cases reported some evaluation of their ECB efforts. More

than half of the cases were classified as case studies, which were in-depth studies or descriptions.

Only two cases mentioned a comparison group. Only one case mentioned instruments. Discussion

of measurement of the outcomes was extremely rare. Forty percent indicated data collections were

conducted over more than a 2-year period, which is consistent with the duration of ECB projects.

A range of data collection methods was reported, however, almost a third did not mention any

specific data collection methods (Table 8). The majority of data collection methods were used in

Table 3. Goals and Resources, Context, and Strengths

Ma n %

ECB goals or objectivesb

Any goals or objective reported 45 73.8Context

Needs assessment conducted 18 29.5ECB effort tailored to fit context 39 63.9Not reported 16 26.2

Strengths: Individual-level 9 14.8Individual positive attitudes 9 14.8

Strengths: Organization-level 2.3 38 62.3Resources (staff/time/money) 22 36.1Internal evaluation expertise 12 19.7Organizational practices 8 13.1Leadership support 20 32.8Evaluation culture 13 21.3Mainstreaming 7 11.5Other strengthsc 4 6.6

The total number of cases is 61. Totals may exceed 61 because subcategories are not mutually exclusive and multipleresponses are possible.aMean number of organizational strengths reported from total number of possible strengths (seven) for cases where anorganizational strength was reported.bGoals and objectives were not sufficiently reported to allow further specification.cOther strengths focused on relationships such as between evaluators and the organizations.

Labin et al. 11

combination with other methods. For example, the most prevalent data collection method was a sur-

vey and two thirds of those cases reporting using surveys also used additional data collection meth-

ods. ‘‘Other’’ methods were used to provide feedback and monitoring such as narratives, field notes,

photos, visuals, as well as briefings and discussions. Notwithstanding the variety of data collection

methods reported, a minimal amount of quantitative data was reported.

Outcomes Reported at the Individual and Organizational Levels

Nearly all (about 90%; Table 9) of the cases reported some individual-level outcomes in terms of

changes in attitudes, knowledge, or behaviors, of which positive attitude outcomes were the least

frequently reported (36%). About half of the cases (51%) reported knowledge outcomes and a

Table 4. Theory and Modes of ECB Strategies

Underlying theory/principles guiding strategies 49 80.3Underlying theory/principles guiding strategies not reported 12 19.7Total 61 100Modes of strategies reported

Face-to-face only 29 47.5Face-to-face combined with other modes 27 44.3Combination not including face-to-face 1 1.6Not reported 4 6.6

Total 61 100

TA ¼ technical assistance.

Table 5. Type, Level, and Content of ECB Strategies

Any of thisType of Strategy

Reported

StrategyReportedExclusively

n % n %

Type of strategies 59 96.7Training 47 77.0 10 16.4TA/coaching/support 38 62.3 1 1.6Involvement in evaluation 41 67.2 5 8.2

Individual-level content reported 56 91.8Attitudes 14 23.0 0 0.0Terms, approaches, or methods 23 37.7 0 0.0Logic models 32 52.5 1 1.6Evaluation plan 50 82.0 2 3.3How to do an evaluation? 44 72.1 1 1.6Interpretation and use of data 32 52.5 0 0.0

Organization-level content reported 31 50.8Organizational practices 24 39.3 8 13.1Building leadership support 6 9.8 0 0.0Building evaluation culture 15 24.6 4 6.6Mainstreaming 15 24.6 2 3.3

The total number of cases is 61. Totals may exceed 61 because subcategories are not mutually exclusive and multipleresponses are possible.

significant majority reported behavioral outcomes (80%). The most frequently reported knowledge

outcomes were learning elements of an evaluation plan, how to do an evaluation, and terms,

approaches, and methods. ‘‘Other’’ knowledge outcomes were interpersonal and included group

facilitation, leadership, and how to work in a group or team. Planning and doing an evaluation were

the most common behavioral outcomes and were reported more frequently than the corresponding

knowledge outcomes of planning and doing. In knowledge outcomes, collecting and analyzing data

as part of doing an evaluation were specified with about the same frequency, but in behavior out-

comes there were more reported outcomes for collecting than for analyzing data. ‘‘Other’’ behavior

outcomes focused on collaborative skills such as involving stakeholders.

In the majority of cases, some organization-level outcomes were reported (77%; Table 10) and

93% of these cases also reported individual outcomes. Organizational outcomes were defined and

divided into five organizational characteristics as shown in Table 10. ‘‘Other’’ outcomes mentioned

pertained not to the host organization or target of the ECB but rather to the programs under their

purview. These program outcomes included developing the program’s theory of change and improv-

ing program implementation and program results.

Table 6. Implementation––Target Population of ECB

Characteristics of Target Population N %

Intended target of ECBa

Individual only 16 26.2Organization only 9 14.8Individuals and organizations 36 59.0Not reported 0 0.0

Types of individual participantsStaff only 47 77.0Students only 0 0.0Staff with evaluators/other 9 14.8Students with staff/evaluators 5 8.2

Types of organizationsNonprofit Only 24 39.3Government Only 7 11.5School/school district only 9 14.8University only 4 6.6Other type 1 1.6Multiple types reported 9 14.8Not reported 7 11.5

DomainEducation 15 24.6Health 18 29.5Child and youth development 4 6.6Justice 1 1.6Community development 6 9.8Other domain 3 4.9Multiple domainsb 12 19.7Not reported 2 3.3

CountryUnited States 42 68.9Outside United States 18 29.5Not reported 1 1.6

aFive of the cases targeting organizations included communities.bEight of the multiple domain cases included education and/or health and/or other domains.

Labin et al. 13

How Type and Number of Strategies Vary by the Presence of Preexisting Resources

To explore the data for patterns of relationships between Needs in Column I and Activities in

Column II in the Model (Figure 1), we examined how the type and number of ECB strategies

employed varied by the reported presence of preexisting resources––financial, staff, and technolo-

gical.4 About 36% of the cases reported the presence of resources, but there does not appear to be a

discernable pattern of relationship between the reporting of preexisting resources and the number of

strategies (Table 11). This may suggest that the number and type of strategies was decided for rea-

sons other than material resources or perhaps there are differences in perceived project resources

versus resources from a larger entity that is funding the ECB effort.

However, there is some data suggesting that using TA cooccurs with a greater reporting of

resources (Table 12). For example, for those cases not reporting resources, those using TA have a

lower percentage (62%) compared to those using training (80%) or experiential involvement

(72%). Furthermore, if we calculate the percentage of cases reporting resources for those using each

type of strategy, those using TA report existing resources (14/38 ¼ 37%) with a slightly greater fre-

quency than those who reported using training (15/46 ¼ 33%) and those using experiential

involvement (13/41 ¼ 32%). While the percentage differences are small, these findings suggest that

TA may occur in situations with more resources. Given the potential importance of TA for sustain-

able outcomes of ECB, the need for resources for TA warrants further exploration.

How Types of Outcomes Vary by Type of ECB Strategies

We also examined the relationship between Strategies in Column II with Outcomes in Column III of the

Model. For cases reporting individual-level outcomes, over a third utilized all three types of strategies—

training, TA, and experience (Table 13); and for these cases, behavioral outcomes were the

Table 7. Implementation Adjustments and Barriers Reported

Ma n %

Midcourse corrections were made 17 27.9External barriers reported 1.3 9 14.8

Timelines too limited 6 9.8External evaluation requirements don’t fit internal needs 7 11.5

Individual-level barriers reported 1.4 34 55.7Attitudes/beliefs as barriers 27 44.3Lack of participation 11 18.0Difficulty applying new knowledge/skills 7 11.5Other individual-level barrier 3 4.9

Organization-level barriers reported 2.1 37 60.7Lack of resources 30 49.2Staff turnover 14 23.0Lack of internal evaluation expertise 8 13.1Lack of organizational practices 5 8.2Lack of leadership support 7 11.5Lack of evaluation culture 5 8.2Lack of mainstreaming 0 0.0Other 7 11.5

The total number of cases is 61. Totals may exceed 61 because subcategories are not mutually exclusive and multipleresponses are possible.aMean number of barriers reported from total number of possible barriers for cases where a barrier of that type wasreported, that is, total possible for external barriers (2), individual-level (5), organization-level (8).

most frequent type of individual outcome. If only one type of strategy was utilized, it was most

often training; and for these cases, knowledge outcomes were the most frequent type of indi-

vidual outcome.

Looking at the organizational outcomes for those cases that utilized all three types of strategies,

we see that PPP is the most frequent type of outcome followed by mainstreaming and resources. Cul-

ture and leadership exhibit the lowest frequencies of outcomes. This same pattern for the most (PPP,

mainstreaming, and resources) and least (culture and leadership) frequent outcomes is observed for

those cases that utilized only training as a strategy as well as for most of the other cases.

How Types of Outcomes Vary by Content of Strategies?

Organizational outcomes are reported with a higher frequency for those cases that report

organizational content than those that report only individual-level content (Table 14). Individual-

Table 8. Evaluation of ECB

Characteristics of Evaluations of ECB N %

Evaluation of ECB workAny evaluation reported 30 49.2Not reported 31 50.8

Evaluation approach/ theory specifiedAny approach or theory reported 12 19.7Not reported 49 80.3

Evaluation designPrepost with comparison group 1 1.6Post only with comparison group 1 1.6Prepost with no comparison group 9 14.8Retrospective pre-post, no comparison group 2 3.3Post only with no comparison group 8 13.1Case study 35 57.4Not reported 5 8.2

Time period of data collectionLess than or equal to 6 months 6 9.8Between 6 months and 1 year 4 6.6Between 1 year and 2 years 10 16.4Between 2 years and 3 years 8 13.1More than 3 years 16 26.2Not reported 17 27.9

Data collection methodsSurvey 26 42.6Interview 19 31.1Third party observation 12 19.7Document review 11 18.0Focus groups 7 11.5Other methods 7 11.5Not reported 20 32.8

Quantitative data reportedIndividual-level outcomes 13 21.3Organization level-outcomes 6 9.8

The total number of cases is 61. Totals may exceed 61 because subcategories are not mutually exclusive and multipleresponses are possible.

Labin et al. 15

level outcomes of positive attitudes and behavioral changes are reported with a lower frequency for

those reporting only individual-level content compared to those cases reporting both individual- and

organizational-level content.

Lessons Learned

Coders extracted ‘‘other’’ responses related to lessons learned. The most frequent of these ‘‘other’’

responses fell into four areas: (a) barriers due to time and resource constraints, (b) collaborative and

experiential principles of the strategies, (c) collaborative and relationship building content of stra-

tegies, and (d) appropriate evaluation content of strategies. Various dimensions of time were men-

tioned and the most frequently mentioned was how ECB and organizational change is something

that happens over time, for example, 2 years.

In the area related to theories and principles underlying the strategies, a notable finding is the high

frequency of positive reporting of empowerment evaluation and other experiential, participatory,

and active learning approaches. Building relationships, collaborating, and networking were cited

as needed content for the individual-level ECB strategies. Other issues related to evaluation content

for individual-level strategies were often mentioned with the most prevalent aspect reported being

Table 9. Individual-Level Outcomes

Outcome Categories Ma n %

Any individual-level outcomes reported 56 91.8Attitudes 22 36.1Knowledge 2.5 31 50.8

Terms, approaches, and methods 15 24.6How to hire/work with an evaluator? 3 4.9Logic models 11 18.0Elements of an evaluation plan 17 27.9How to do an evaluation? 15 24.6

How to manage evaluation? 1 1.6How to collect data? 11 18.0How to analyze data? 10 16.4Reporting data? 5 8.2

Interpretation and use of data 8 13.1Other knowledge-related outcome 3 4.9General mention of knowledge 6 9.8

Behaviors and Skills 2.8 49 80.3Hired &/or worked with an evaluator 19 31.1Developed logic models 22 36.1Planned/designed an evaluation 35 57.4Did an evaluation 38 62.3

Managed an evaluation 5 8.2Collected data 29 47.5Analyzed data 19 31.1Reported data 17 27.9

Interpreted and used data 24 39.3Other behavior-related outcome 0 0.0General mention of behavior 1 1.6

The total number of cases is 61. Totals may exceed 61 because subcategories are not mutually exclusive and multipleresponses are possible.aMean number of outcomes reported from total number of possible outcomes for cases where an outcome of that type wasreported, that is, total possible outcomes for knowledge (8), behaviors and skills (7).

the importance of using process evaluation and immediate findings to show program progress and

outcomes, for example, with the use of logic models, outputs, short-term, and intermediate out-

comes. The importance of tailoring strategies to the situation and attending to the organizational cul-

ture were both mentioned as important lessons learned.

Table 10. Organization-Level Outcomes

Outcome Categories Ma n %

Any organization-level outcomes 47 77.0Processes, policies, and practices 2.7 44 72.1

Improved Organizational Capacity 15 24.6Improved Ability to Get Funding 16 26.2Increase in Doing Evaluation 26 42.6Increase in Using Evaluation 24 39.3Plan for future ECB Activities 14 23.0Plan for future evaluations 20 32.8Other process practice or policies 5 8.2

Leadership: Increased Support 1 8 13.1Organizational Culture 1.6 17 27.9

Learning, participation, collaboration 10 16.4Evaluation/data use 16 26.2Other organizational culture 2 3.3

Mainstreaming 1.8 33 54.1Evaluation is more routine 11 18.0Jobs redesigned to include evaluation 16 26.2New relationships among organizations 18 29.5Ongoing learning opportunities 13 21.3Other mainstreaming outcome 0 0.0General mention of mainstreaming 2 3.3

Resources 1.9 28 45.9Written evaluation materials 6 9.8Technology 14 23.0More resources for evaluation 12 19.7Internal evaluation expertise 18 29.5Other resource/resources 2 3.3

The total number of cases is 61. Totals may exceed 61 because subcategories are not mutually exclusive and multipleresponses are possible.aMean number of outcomes reported from total number of possible outcomes for cases where an outcome of that type wasreported, that is, total possible outcomes for PPP (7); leadership (1); culture (3); mainstreaming (6); resources (5).

Table 11. Number of Strategies (Training, Technical Assistance, & Experiential Involvement) by the Presenceof Preexisting Resources

Resources: StaffTime, Money

All CasesNone

Reported

Any OneStrategyReported

Any TwoStrategiesReported

All ThreeStrategiesReported

Total % WithinResources Categoriesn % n % n % n % n %

Resources reported 22 36.1 2 9.1 6 27.3 6 27.3 8 36.4 100.0Resources not reported 39 63.9 0 0.0 10 25.6 13 33.3 16 41.0 100.0Total 61 100.0 2 3.3 16 26.2 19 31.1 24 39.3 100.0

Labin et al. 17

Discussion

The findings of this synthesis of the empirical ECB literature reflect the emergent nature of the field

but also suggest that ECB efforts have a relatively common set of principles and strategies and have

achieved some short-term and intermediate-term outcomes at both the individual and organizational

levels. In spite of the variation in the narrative reporting of ECB, there is a high degree of consis-

tency between the concepts in the theoretical literature and those found in the empirical literature.

This consistency confirms that the ECB field has developed beyond its infancy and demonstrates the

field’s need and readiness for more systematic reporting practices and evaluation methods. In the

discussion below, we offer additional suggestions for using the findings from the synthesis in imple-

menting and evaluating ECB.

Collaboration

As mentioned earlier, ECB has deep roots in collaborative, participatory, and empowerment evalua-

tion approaches (Cousins et al., 2004; Cousins & Whitmore, 1998; Fetterman & Wandersman, 2005;

Preskill & Torres, 2009). In this empirical synthesis, collaboration emerged as the essential thread in

the fabric of ECB efforts, warranting its explicit inclusion as a key concept in ECB models, efforts,

and evaluations. We found collaborative issues reported as an aspect of existing strengths, strategies,

barriers, individual-level outcomes, and organizational-level outcomes. Collaborative issues were

not explicitly included in the original Integrative ECB Model but have been added. Collaborative

and participatory processes involve the ways in which people interact and can be considered part

of the human relations of ECB. There was also convergence of evidence from the synthesis that sug-

gests the need for more attention to the other human relation dimensions of ECB such as targeting

attitudes, leadership, and a supportive organizational culture.

Collaboration between funders and projects may also be something to explore. Funders were not

reported as being participants in the ECB efforts, but there was mention of their importance to the

efforts. Adequate resources are needed not only to begin ECB efforts, but also to sustain them. If

funders were included as target participants in the ECB efforts, it could increase their first-hand

knowledge of ECB efforts and requirements, which, in turn, could affect expectations and funding

cycles and reduce related resource and staff-turnover barriers. These hypotheses merit further

exploration.

Strategies and Outcomes

The synthesis findings confirm the importance of participatory processes in ECB strategies. Consis-

tent with the emphasis in the literature (Fetterman & Wandersman, 2005; Patton, 2012), experiential

learning through involvement or participation in an evaluation was a major type of strategy or

Table 12. Types of Strategies by the Presence of Preexisting Resources

Resources: Staff, Time, Money

All CasesNone

Reported TrainingTechnicalAssistance

ExperientialInvolvement

n % n % n % n % n %

Resources reported 22 36.1 2 9.1 15 68.2 14 63.6 13 59.1Resources not reported 39 63.9 0 0.0 31 79.5 24 61.5 28 71.8Total 61 100.0 2 3.3 46 75.4 38 62.3 41 67.2

The total number of cases is 61. Totals may exceed 61 because categories are not mutually exclusive and multiple responsesare possible.

gies Indiv

ttitude

ulture

1100.0

5100.0

report

2100.0

es,polic

is61.T

sponse

19 at American Evaluation Association on February 9, 2012aje.sagepub.comDownloaded from

ulture

4100.0

report

es,polic

is61.T

sponse

delivery mechanism and was associated with a high frequency of individual behavioral outcomes as

was the use of the combination of all three types of strategies, that is, experiential, training, and TA.

Training was associated with a high frequency of individual knowledge outcomes. This suggests that

the optimal use of multiple strategies—training, participating in an evaluation, and TA—for achiev-

ing individual knowledge and behavioral outcomes.

Multiple strategies may also be optimal for achieving organizational outcomes such as PPP, cul-

ture, and mainstreaming. TA may play a critical role in developing and sustaining organizational

changes. Therefore, various stakeholders, including funders, may want to consider the need for

ongoing resources for TA.

In this synthesis, we examined specific components of evaluation such as planning, doing, and

using data––key elements in definitions of ECB. Stevenson, Florin, Mills, and Andrade (2002) have

concluded that ECB should focus on selected components of evaluation (i.e., planning and designing

evaluation) and less on other skills (i.e., data analysis). The findings in this synthesis suggest support

for Stevenson and his colleagues. For example, collecting and analyzing data were targeted with

about the same frequency in the strategies as they were reported as knowledge outcomes. However,

behavior outcomes of collecting data were more frequent than those for analyzing data. Therefore,

while the content may equally address collecting and analyzing data (and knowledge outcomes are

commensurately equal), there are more behavioral outcomes of collecting data than of conducting

analyses. Furthermore, there were reports of barriers or difficulties in learning the subject matter and

of needing help with analyzing data. Thus, more attention in the future to which evaluation skills are

most important and feasible seems warranted.

Notwithstanding the conceptual literature that stresses the importance of motivation (Preskill &

Boyle, 2008) and attitudes (Owen, 2003) toward evaluation and ECB, this empirical synthesis did

not find much attention directed to targeting such attitudes. Attitudes were not often reported as a

preexisting strength, as content for strategies, or as an outcome. Conversely, negative attitudes

were frequently noted as a barrier and the importance of attitudes was mentioned in other com-

ments. All of these findings suggest the need for ECB efforts to pay more attention to targeting

the development of positive attitudes toward evaluation and understanding how to overcome neg-

ative attitudes.

Individual, Organizational, and Program Outcomes

Of the five types of organizational ECB outcomes examined––PPP, leadership, culture, mainstreaming,

resources—PPP is the outcome that occurred with the greatest frequency. The indicators for

PPP included ‘‘doing’’ and ‘‘using’’ evaluation, which are essential elements in the working defini-

tion of ECB used in this synthesis as well as in other definitions and frameworks of ECB. Main-

streaming was the second most frequently cited organizational outcome and its indicators reflect

changes needed for the sustainability of ECB such as making evaluation more routine and changing

the way people do their jobs. The frequency with which change occurs for each of the five charac-

teristics of organizational capacity may reflect varying levels of difficulty in achieving change. For

example, that mainstreaming occurs less frequently than PPP suggests that it is more difficult to

achieve. Similarly, leadership is the least frequently reported outcome and may be one of the more

difficult to affect.

While all the factors reinforce each other, leadership, culture, and resources may be factors that

play a particularly important role in supporting the other organizational factors. It is difficult to envi-

sion PPP or mainstreaming advancing without at least certain minimal levels of these other factors.

Yet, leadership was the least frequently targeted organizational factor and the least frequently

reported organizational outcome. This suggests that more attention should be paid to defining, tar-

geting, developing and measuring leadership.

Labin et al. 21

There may be optimal sequencing regarding the targeting of particular organizational character-

istics before others. For example, perhaps PPP should be targeted before mainstreaming, consistent

with the concept that ECB is developmental and occurs in stages (Gibbs et al., 2002; Love, 1991;

Owen, 2003).

Another issue to consider is the sequence and relationship of individual outcomes to organiza-

tional outcomes. The findings here indicate that individual outcomes of attitudes and behaviors were

more frequent when both individual and organizational content or change was targeted in the stra-

tegies than when only individual content or change was targeted (Table 14). This is consistent with

Preskill and Boyle’s (2008) discussion of the importance of organizational factors for individual

learning to take place. It also supports Taylor-Ritzler, et al. (2010) who found that an organizational

environment conducive to evaluation was necessary to increase individual motivation and behavior

change.

The organizations that are the target of the ECB efforts are generally service organizations

that are responsible for developing and implementing programs, predominately in the health

and education fields. An important purpose of ECB is to support the efforts of organizations

to improve their programs and program outcomes for their service populations. While program

outcomes were not in the original Integrative ECB Model, they emerged as an important issue

from the synthesis and were added to the final Integrative ECB Model (Figure 1). Achieving

program outcomes has long been an important and explicit goal of the evaluation field, of

empowerment evaluation (Fetterman & Wandersman, 2005), and of ECB efforts in general. The

explicit inclusion of program outcomes to the model is empirically justified and hopefully their

inclusion will be helpful in orienting ECB theory, models, and strategies toward achieving pro-

gram outcomes.

Evaluation of ECB

The information in the original articles describing the evaluations of each ECB case was limited and

often communicated in qualitative narratives. A strength of the evaluations reported was the large

number of cases that used a variety of data collection methods. However, there are weaknesses in

the evaluations, especially in regard to very limited reporting of measures and quantitative data. This

seems somewhat surprising for a field embedded in evaluation and populated by evaluators. It may

be useful to understand what is driving the current state of evaluation of ECB in order to inform sug-

gestions of how to improve it.

We do know that ECB is a complex phenomenon involving issues of individual learning,

organizational change, sustained change, and program processes and outcomes. It is important to

continue to look at all of these areas and build on existing knowledge (Compton et al., 2002;

Cousins, et al., 2004; Preskill & Boyle, 2008). Nevertheless, the challenges to evaluating ECB

derive from the complex interplay within and across individual and organizational factors, the

difficulty of isolating causal factors of change both for the organization and for program outcomes,

and the long-term goals of sustainable change (Wing, 2004). Furthermore, comparative designs are

difficult to implement long-term because of the need to keep the treatment and comparison groups

sufficiently separate and distinct over time. The complexity of these issues is a challenge for mean-

ingful evaluations of ECB that adequately represent the important aspects and effects of ECB and

not just the easy-to-measure effects.

Added to this complexity is the fact that ECB is a relatively new area, still being defined and

without agreed upon measures. As with all fledging endeavors, it is important to use evaluation

methods that will encourage and not squelch development. However, there are numerous principles

and methods that evaluators have in their toolbox that can enhance the state of ECB evaluation. The

first priority is to focus on developing, using, and reporting on ECB with common terminology,

indicators, and rigorous measures. The second priority is improvement of evaluation designs with

the judicious use of comparison or control groups for short-term and intermediate-term outcomes.

Improving designs also includes planning now for longitudinal studies with longer time frames in

order to address the definitions and goals of ECB as sustainable organizational change (Baizerman,

Compton, & Stockdill, 2002b). Such efforts could propel the field forward and foster an evidence-

based practice for ECB.

The Integrative Model and Synthesis Method: Implications forMeasurement and Practice

The Integrated ECB model proved to be useful and the findings of the synthesis indicated the need

for a couple of additions to the model of important reoccurring themes in the empirical literature.

These additions included explicit inclusion of collaborative and participatory aspects and processes

to most elements of the model and inclusion of program outcomes.

The methodology produced a reliable empirical knowledge base from a diverse body of literature.

Therefore, this study demonstrates the viability of a systematic synthesis research method for a

developing field of inquiry using narrative accounts of programs and evaluations as well as the fea-

sibility of codifying such knowledge. The systematic coding process created an operationalization of

concepts that provides an empirical basis for developing measures to be used in the field and in

future evaluations. The strength of this study is its ability to summarize existing knowledge, but its

inherent dependence on what is reported in the literature is also a limitation of this and all syntheses.

The Integrative ECB Model, the questions derived from it, and the analyses conducted were all

developed with the goal of painting a reliable description of the current empirical ECB field and a

preliminary assessment of some basic relationships. Certainly this, as one piece of research, has its

particular limitations. There are undoubtedly more elements to be added to the model, more analyses

to be conducted to answer additional questions, and more ECB efforts to be reviewed.

One means to address these limitations would be to think of this synthesis as a database. As most

databases, their first iteration is not their last nor is any one set of analyses definitive; rather it is a

resource to be refined, added to, and accessed for further analyses. Expanding on the idea of a data-

base are other means of developing knowledge and practice that are being used in other areas and

that could be used to develop the ECB field. One is a clearinghouse for ECB program materials and

measures, whereas another is online access to detailed information about ECB efforts, implementa-

tion, and outcomes such as those used in projects and sites compiling effective practices for specific

areas, for example, substance abuse or HIVAIDS (GAO, 2009; KIT Solutions, 2011; Tanglewood

Research, 2011). An ECB website could be established and users could submit entries of additional

ECB examples for review. The coding categories for the elements of the Integrative ECB Model

could be posted to encourage common language and collaborative input, and to contribute to the

refinement of existing measures (Preskill & Torres, 2009; Stevenson, et. al, 2002; Taylor-Ritzler,

Suarez-Balcazar, & Garcia-Iriarte, 2010; Volkov & King, 2007). Focusing on developing, using, and

reporting with common measures would allow the field to define and assess strategies and outcomes

and empirically examine the causal sequences and relationships hypothesized in the various theore-

tical ECB models. This would enhance the empirical data to guide ECB practice and evaluation.

In summary, we have demonstrated that the field of ECB has reported many outcomes, but it still

has much room for growth and refinement. Our analysis highlights the frequency and types of indi-

vidual and organizational outcomes reported while also noting some preliminary results of relation-

ships between strategies and outcomes. Next steps for the field include more complete articulations

of the ECB process and more rigorous measurement and evaluation of these processes. This would

contribute to the ECB field becoming increasingly effective at achieving its long-term goals includ-

ing positive program outcomes.

Labin et al. 23

s,P.S.

unity-

e,13,85–104.

elopin

27,257.

(2005).

hoolco

ion.Pr

9,49–54.

ilson,M

tobuild

ion.Eva

ng,28,329–334.

leofva

41–47.

ndon,P.R

ofbuild

sonnel

),125–142.

elopin

ention

llabora

tion.AID

n,17(4

),317–333.

(2002).

23,1–17.

ampbel

ein,L.

.,Bonte

(2004).

sexual

,34,251–262.

(2007).

116,61.

er,P.,

tooutc

ration

vention

support

,41,143–157.

etor’

111,85–93.

1 (con

(continued

elopin

public

):33–40.

udon,R

ith,I.,

S)1995-2

93,47–62.

udon,R

(2001).

Smith,M

7–S2

Bonnet

2001).

2–77.

2008).

,32,478–506.

Solıs

2004).

munity,

p.89–117).

28,91–101.

iential

tional

n,25,286.

havior

(pp.321–350).

gton,D

.(2005).

p.92–122).

2 (con

(continued

aspohle

.(2003).

moting

munity-

26,37–52.

2003).

county

hoolre

initia

99,33–49.

ion:en

munity

cultura

vention.Jo

ity,26,53–71.

(2008).

119,93–113.

(2006).

munitie

tion.N

109,73–85.

llabora

29,358–368

land,S.

(2002).

elopin

ind.Can

17,103–119.

L.E.(1

schoolim

ion:ex

anitoba

schoolim

,155–178.

ell-Jo

p.73–91).

schooldis

63–80.

talhea

lth.Ps

abilita

28,234–241.

1 (con

(continued

(2005).

acity-

munity

ts.Eva

11,390–414.

,B.E.,

Johnso

(2005).

munity

p.155–82).

llan-W

ten,S.

,pitfa

22,99–124.

3,45–56.

foundla

nd.Can

22,41–56.

cDonal

rs,P.,

rd,B.(2

public

9,9–29.

Lennie

(2005).

schoolbre

,5,18–26.

in,B.,

vention.N

forEva

7–46.

oon,S.

erly,40,121–128.

.(2005).

n,17,400–404.

.,Burg

2007).

initio

,30,231–236.

.(2005).

schoole

ety’

20,125–155.

1 (con

(continued

’Sulli

moting

tion:Fi

munity-

es.Eva

8,372–387.

van,J.

es:Pro

moting

within

tion.Eva

,21,21–29.

culture

initio

isofits

within

,3,43–47.

eg,J.,

on,B.,

yson,K

,31,22–33.

ck,B.J.,

t,P.J.

(1999).

s’ev

Public

lth.Can

,137–154.

(1996).

counta

ilylit

llabora

ach.Eva

,19,263–272.

applie

53–64.

k,24,120–127.

en,E.M

‘‘Im

ciplin

ion.’’

alofEngi

):171–182.

2002).

),233–243.

illo,N

2003)Im

anoutc

initia

ity,26,5–20.

llins,

‘‘Adap

.’’A

alofEva

):387.

jian,K

.(2002).

95,39–56.

Studyi

28,45–59.

1 (con

(continued

Support

4,225.

2001).

entions:

munity

action

ion.Can

16,79–98.

(2005).

anitar

3,78–112.

ion.Eva

,1,221–228.

2006).

elopin

stitute

t,36(2

),161–172.

entifie

Authors’ Note

Dr. Lesesne was previously with the Centers for Disease Control and Prevention, Division of Reproductive

Health, where she was the Project Officer for this project. She contributed to the initiation and development

of the project during her tenure at CDC and its completion while at ICF Macro. The findings and conclusions

in this article are those of the authors and do not necessarily represent the official position of the U.S. Centers

for Disease Control and Prevention.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publi-

cation of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publica-

tion of this article: This study was supported by CDC contract # 200-2008-M-27814.

1. The data on organizational strengths other than resources was judged insufficient for relational analyses.

2. Details on coding of items are available by request from the first author.

3. Given that a balanced contingency table (i.e., categories used by one rater are directly comparable to the

categories used by a second rater) is required to properly calculate k via computer-aided statistical software,

pseudo-observations were utilized. Before the pseudo-observations were created, a weight of one (1) was

assigned to each of the codes that were given for each case. To create pseudo-observations, a very small

weight (0.0000000001) was assigned to the range of possible values. Even though the small weight of the

pseudo-observations did not greatly impact the k statistic that was calculated, these observations ensured

that the same categories were used by each coder, which allowed for balanced contingency tables.

4. Information on goals and context (needs assessments and tailoring) were not reported in sufficient detail to

provide meaningful information to conduct relational analyses with strategies.

References

American Evaluation Association. (2008). American evaluation association internal scan report to the

membership, by goodman research group. Retrieved from AEA website: http://www.eval.org

Baizerman, M., Compton, D. W., & Stockdill, S. H. (2002a). Editors’ notes. New Directions for Evaluation, 93,

1–6.

Baizerman, M., Compton, D. W., & Stockdill, S. H. (2002b). New directions for ECB. New Directions for

Evaluation, 93, 109–119.

Begg, C. B. (1994). Publication bias. In H. Cooper & L. V. Hedges (Eds.), The handbook of research synthesis

(pp. 399–409). New York, NY: Sage.

Boyle, R. (1999). Professionalizing the evaluation function: Human resource development and the building of

evaluation capacity. In R. Boyle & D. Lemaire (Eds.), Building effective evaluation capacity (pp. 135–151).

New Brunswick, NJ: Transaction.

Boyle, R., Lemaire, D., & Rist, R. C. (1999). Introduction: Building evaluation capacity. In R. Boyle &

D. Lemaire (Eds.), Building effective evaluation capacity: Lessons from practice (pp. 1–19). New

Brunswick, NJ: Transaction.

Centers for Disease Prevention and Control. (2011). The guide to community and preventive services: The com-

munity guide. Retrieved from http://www.thecommunityguide.org/index.html

Compton, D. W., Baizerman, M., & Stockdill, S. H. (2002). Special issue: The art, craft, and science of evalua-

tion capacity building. New Directions for Evaluation, 93, 1–120.

Cooper, H., & Hedges, L. (1994). The handbook of research synthesis. New York, NY: Russell Sage.

Cousins, J. B., Goh, S. C., Clark, S., & Lee, L. E. (2004). Integrating evaluative inquiry into the organizational

culture: A review and synthesis of the knowledge base. Canadian Journal of Program Evaluation, 19, 99–141.

Cousins, J. B., & Whitmore, E. (1998). Framing participatory evaluation. New Directions for Evaluation, 80, 5–23.

Duffy, J. L., & Wandersman, A. (2007, November). A review of research on evaluation capacity-building stra-

tegies. Paper presented at the annual conference of the American Evaluation Association, Baltimore, MD.

Durlak, J. A., & DuPre, E. P. (2008). Implementation matters: A review of research on the influence of imple-

mentation on program outcomes and the factors affecting implementation. American Journal of Community

Psychology, 41, 327–350.

Fetterman, D., Kaftarian, S., & Wandersman, A. (Eds.). (1996). Empowerment evaluation. Thousand Oaks, CA:

Fetterman, D., & Wandersman, A. (Eds.). (2005). Empowerment evaluation principles in practice. New York,

NY: Guilford Press.

General Accounting Office. (1987). Drinking-age laws: An evaluation synthesis of their impact on highway

safety (GAO/PEMD-87-10). Washington, DC: Author.

General Accounting Office. (1989). Prospective evaluation methods: The prospective evaluation synthesis

(GAO/PEMD-10.1.10). Washington DC: Author.

General Accounting Office. (1992a). Cross-design synthesis (GAO/PEMD-92-18). Washington, DC: Author.

General Accounting Office. (1992b). The evaluation synthesis (GAO/PEMD-10.1.2). Washington, DC: Author.

Government Accountability Office. (2003). Program evaluation: An evaluation culture and collaborative part-

nerships help build agency capacity (GAO-03-454). Washington, DC: Author.

Government Accountability Office. (2009). Program evaluation: A variety of methods can help identify effec-

tive interventions (GAO-30-10, 2009). Washington, DC: Author.

Gibbs, D., Napp, D., Jolly, D., Westover, B., & Uhl, G. (2002). Increasing evaluation capacity within commu-

nity based HIV prevention programs. Evaluation and Program Planning, 25, 261–269.

Higgins, J., & Green, S. (Eds.). (2011, March). Defining types of studies. Cochrane Handbook for Systematic

Reviews of Interventions. Version 5.1.0. Retrieved from www.cochrane-handbook.org

Kellogg Foundation, W. K. (2001). Logic model development guide. Retrieved from: http://www.wkkf.org/

Pubs/Tools/Evaluation/Pub3669.pdf

KIT Solutions. (2011). COMET and iGTO. Retrieved March 15, 2011, from http://www.kitsolutions.net/

our-clients-who-we-serve#all

Kotter, J. (1996). Leading change, Cambridge, MA: Harvard Business School Press.

Labin, S. N. (2008). Research synthesis: Toward broad based evidence. In N. L. Smith & P. R. Brandon (Eds.),

Fundamental issues in evaluation (pp. 89–110). New York, NY: Guilford Press.

Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics,

33, 159–174.

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.

Love, A. (1991). Internal evaluation: Building organizations from within. Thousand Oaks, CA: Sage.

Love, A. (2006). Developing evaluation capacity in extension 4-H field faculty: A framework for success.

American Journal of Evaluation, 22, 257–269.

Milstein, B., & Cotton, D. (2000). Defining concepts for the presidential strand on building evaluation capacity.

Working paper circulated in advance of the November 2000, meeting of the American Evaluation Association.

O’Sullivan, R. G. (2004). Practicing evaluation: A collaborative approach. Thousand Oaks, CA: Sage.

Owen, J. M. (2003). Evaluation culture: A definition and analysis of its development within organizations.

Evaluation Journal of Australasia, 3, 43–47.

Patton, M. Q. (2012). Essentials of utilization-focused evaluation. Thousand Oaks, CA. Sage.

Preskill, H., & Torres, R. (1999). Evaluative inquiry for learning in organizations. Thousand Oaks, CA: Sage.

Preskill, H., & Torres, R. (2009). Readiness for Organizational Learning and Evaluation (ROLE). In D. Russ-Eft

& H. Preskill (Eds.), Evaluation in organizations (2nd ed., pp. 491–504). Boston, MA: Perseus Books.

Labin et al. 31

Preskill, H., & Boyle, S. (2008). A multidisciplinary model of evaluation capacity building. American Journal

of Evaluation, 29, 443–459.

Preskill, H., Zuckerman, B., & Matthews, B. (2003). An exploratory study of process use: Findings and impli-

cations for future research. American Journal of Evaluation, 24, 423–442.

Rapkin, B. D., & Trickett, E. J. (2005). Comprehensive dynamic trial designs for behavioral prevention research

with communities: Overcoming inadequacies of the randomized controlled trial paradigm. In E. Trickett &

W. Pequenaut (Eds.), Increasing the community impact of HIV prevention interventions (pp. 249–277). New

York, NY: Oxford University Press.

Robinson, T. T., & Cousins, J. B. (2004). Internal participatory evaluation as organizational learning: A long-

itudinal case study. Studies in Educational Evaluation, 30, 1–22.

Rodriguez-Campos, L. (2005). Collaborative evaluations. Tamarac, FL: Llumina Press.

Sanders, J. R. (2003). Mainstreaming evaluation. New Directions for Evaluation, 99, 3–6.

Schaumberg-Muller, H. (1996). Evaluating capacity building: Donor support and experiences. Report for the

DAC (Development Assistance Committee) Expert Group on Aid Evaluation, OECD (Organization for

Economic Cooperation and Development). Copenhagen, Denmark: DANIDA (Danish International Devel-

opment Assistance). Retrieved from: http://www.oecd.org/dataoecd/20/52/16546669.pdf

Stockdill, S. H., Baizerman, M., & Compton, D. W. (2002). Toward a definition of the ecb process: A conver-

sation with the ECB literature. New Directions for Evaluation, 93, 7–25.

Stevenson, J. F., Florin, P., Mills, D. S., & Andrade, M. (2002). Building evaluation capacity in human service

organizations: A case study. Evaluation and Program Planning, 25, 233–243.

Suarez-Balcazar, Y., Taylor-Ritzler, T., Garcia-Iriarte, E., Keys, C., Kinney, L., & Rush-Ross, H., . . . Curtin, G.

(2010). Evaluation capacity building: A cultural and contextual framework. In F. Balcazar, Y. Suarez-

Balcazar, T. Taylor-Ritzler & C. B. Keys (Eds.), Race, culture and disability: Rehabilitation science and prac-

tice. Sudbury, MA: Jones & Bartlett Learning.

Taylor-Ritzler, T., Suarez-Balcazar, Y., & Garcia-Iriarte, E. (2010). Results and implications of a mixed-methods

ECB model validation study. Paper presented at the American Evaluation Association annual meeting, San

Antonio, TX.

Tanglewood Research Inc. (2011). The evaluation lizard. Retrieved from http://evaluationlizard.com/welcome.

United Way of America. (1996). Measuring program outcomes: A practical approach. Arlington, VA: Author.

United Way of America. (2008). Logic model handbook. Arlington, VA: Author. Retrieved from: http://www.

vsuw.org/file/logic_model_handbook_updated_2008.pdf

University of Wisconsin Cooperative Extension. (2003). Evaluation logic model bibliography. Retrieved from:

http://www.uwex.edu/ces/pdande/evaluation/evallogicmodel.html

Volkov, B., & King, J., (2007). Checklist for building evaluation capacity. Retrieved http://www.wmich.edu/

evalctr/archive_checklists/ecb.pdf

Wandersman, A., Imm, P., Chinman, M., & Kaftarian, S. (2000). Getting to outcomes: Methods and tools for

planning, evaluation and accountability. Rockville, MD: Center for Substance Abuse Prevention.

Wandersman, A., Snell-Johns, J., Lentz, B. E., Fetterman, D. M., Keener, D. C., Livet, M., . . . Flasphohler, P.

(2005). The principles of empowerment evaluation. In D. Fetterman & A. Wandersman (Eds.), Empowerment

evaluation principles in practice (pp. 27–41). New York, NY: Guilford Press.

Wing, K. T. (2004). Assessing the effectiveness of capacity-building initiatives: Seven issues for the field.

Non-Profit and Voluntary Sector Quarterly, 33, 153–160.

A Research Synthesis of the Evaluation Capacity Building Literature

Documents

Transcript of A Research Synthesis of the Evaluation Capacity Building Literature

Mission Karmayogi - Capacity Building Commission

Building First Nations Capacity Through Teacher Efficacy

capacity building in pharmacy education in resource-poor ...

Leadership Capacity Building for Manufacturing and Manufacturing ...

Playford Literacies: building literacy capacity in learning communities.

Capacity Building Catalogue - Talal Abu-Ghazaleh Academy

Guidebook for Capacity Building in the Engineering ...

USP Presentation -CBD Regional Capacity Building 25 Nov ...

Evaluation of UNESCO's work in capacity building in the basic ...

Building Capacity Building Relationships

Journal of Economic and Social Policy Economic Gardening: Capacity Building for Stronger Regions Recommended Citation Economic Gardening: Capacity Building for Stronger Regions

Counter Narco-Terrorism Program Building Force Capacity of ...

ADMINISTRATIVE CAPACITY BUILDING SELF ...

Building Research Capacity

(Capacity Building) - NKPSIMS

Capacity building in local self-governments

Institutions-taking-root-building-state-capacity-in-challenging ...

Building Capacity to use m-Health in Maternal, Newborn and ...

Of Capacity Building &Training Of Panchayati Raj Institutions ...

Regional Capacity Building for Energy Efficiency in Central ...