Developing procedural flexibility: Are novices prepared to learn from comparing procedures?



British Journal of Educational Psychology (2012), 82, 436–455. © 2011 The British Psychological Society

www.wileyonlinelibrary.com

Bethany Rittle-Johnson¹*, Jon R. Star² and Kelley Durkin¹

¹Department of Psychology and Human Development, Peabody College, Vanderbilt University, Nashville, Tennessee, USA

²Graduate School of Education, Harvard University, Cambridge, Massachusetts, USA

Background. A key learning outcome in problem-solving domains is the development of procedural flexibility, where learners know multiple procedures and use them appropriately to solve a range of problems (e.g., Verschaffel, Luwel, Torbeyns, & Van Dooren, 2009). However, students often fail to become flexible problem solvers in mathematics. To support flexibility, teaching standards in many countries recommend that students be exposed to multiple procedures early in instruction and be encouraged to compare them.

Aims. We experimentally evaluated this recommended instructional practice for supporting procedural flexibility during a classroom lesson, relative to two alternative conditions. The alternatives reflected the common instructional practice of delayed exposure to multiple procedures, either with or without comparison of procedures.

Sample. Grade 8 students from two public schools (N = 198) were randomly assigned to condition. Students had not received prior instruction on multi-step equation solving, which was the topic of our lessons.

Method. Students learned about multi-step equation solving under one of three conditions in math class for about 3 hr. They also completed a pre-test, post-test, and 1-month-retention test on their procedural knowledge, procedural flexibility, and conceptual knowledge of equation solving.

Results. Novices who compared procedures immediately were more flexible problem solvers than those who did not, even on a 1-month retention test. Although condition had limited direct impact on conceptual and procedural knowledge, greater flexibility was associated with greater knowledge of both types.

Conclusions. Comparing procedures can support flexibility in novices and early introduction to multiple procedures may be one important reason.

*Correspondence should be addressed to Bethany Rittle-Johnson, Department of Psychology and Human Development, Peabody College, Vanderbilt University, 230 Appleton Place, Peabody #0552, Nashville, TN 37203, USA ([email protected]).

DOI:10.1111/j.2044-8279.2011.02037.x

Proficiency in problem-solving domains requires that people develop procedural flexibility, where learners know multiple procedures and use them appropriately to solve a range of problems (Baroody & Dowker, 2003; Blote, Van der Burg, & Klein, 2001; Kilpatrick, Swafford, & Findell, 2001; Siegler, 1996; Star, 2005). To support flexibility, teaching standards in numerous countries recommend that students be introduced to multiple procedures early in instruction and be encouraged to compare the procedures. Unfortunately, experimental evidence supporting the effectiveness of comparing procedures early in instruction is sparse. The purpose of the current study was to evaluate the effectiveness of comparison of multiple procedures for novices, in particular Grade 8 students learning to solve multi-step linear equations.

The importance of procedural flexibility

Procedural flexibility is a key feature of competence in many domains, including mathematics (see Siegler, 1996; Star, 2005; Verschaffel, Luwel, Torbeyns, & Van Dooren, 2009 for reviews). People who develop procedural flexibility are more likely to use or adapt existing procedures when faced with unfamiliar problems and to have a greater understanding of domain concepts (e.g., Blote et al., 2001; Hiebert et al., 1996). For example, knowledge of multiple procedures for multi-digit arithmetic calculations was related to greater accuracy on transfer problems and greater conceptual knowledge of arithmetic (Carpenter, Franke, Jacobs, Fennema, & Empson, 1998). Furthermore, flexibility is a salient characteristic of experts in a domain (Dowker, 1992; Star & Newton, 2009).

Despite general agreement on the importance of procedural flexibility, definitions and measures range from simply knowing multiple procedures, to selectively choosing to use a procedure based on problem characteristics, person characteristics, and/or context (see Verschaffel et al., 2009 for a review). As in many past studies, we operationalized procedural flexibility as selection of procedures based on problem characteristics; we also used multiple measures of flexibility (Blote, Klein, & Beishuizen, 2000; Blote et al., 2001; Kilpatrick et al., 2001; Klein, Beishuizen, & Treffers, 1998; Star & Rittle-Johnson, 2008; Torbeyns, Ghesquiere, & Verschaffel, 2009). First, we assessed students’ knowledge of multiple procedures and the appropriateness of these procedures for efficiently solving particular problem types. Second, we assessed students’ flexible use of procedures – whether students selected and implemented the most appropriate procedure for a given problem based on the ease and accuracy of particular procedures (Beishuizen, van Putten, & van Mulken, 1997; Blote et al., 2001).

Supporting procedural flexibility

To support procedural flexibility, expert mathematics teachers often have students share and compare multiple solution procedures (e.g., Ball, 1993; Lampert, 1990), as do teachers in high-performing countries such as Japan (Richland, Zur, & Holyoak, 2007). Recent experimental evidence confirms that comparing procedures can promote flexibility (Rittle-Johnson & Star, 2007, 2009; Star & Rittle-Johnson, 2009). In particular, asking students to identify similarities and differences in two solution procedures for solving the same problem, including when one procedure is more efficient, increases procedural flexibility. Because of the potential advantages of discussing and comparing multiple solution procedures, reform efforts in numerous countries emphasize the importance of using this instructional practice (e.g., Australian Education Ministers, 2006; Brophy, 1999; Kultusministerkonferenz, 2004; National Council of Teachers of Mathematics, 2000; Singapore Ministry of Education, 2006; Treffers, 1991).

When in the learning process is comparison of procedures appropriate? Teaching standards in some countries specify that students should compare procedures early in the learning process (Becker & Selter, 1996; Klein et al., 1998; National Council of Teachers of Mathematics, 2000), whereas standards in other countries do not specify when this instructional practice should be used (e.g., Australian Education Ministers, 2006; Singapore Ministry of Education, 2006). The goal of the current study was to explore whether comparison of procedures can be effective for novices in a domain.

Although there is strong experimental evidence that students with some prior knowledge in a domain develop greater flexibility if they compare procedures (Rittle-Johnson & Star, 2007, 2009; Star & Rittle-Johnson, 2009), the effectiveness of comparing procedures for novices is much less clear. First, a recent study found that students who were not familiar with one of the target procedures learned less if they compared procedures, rather than studied the procedures separately (Rittle-Johnson, Star, & Durkin, 2009). Prior knowledge of solution methods, not general math ability, was the important predictor in this study. Second, comparing procedures requires that students be introduced to multiple procedures, and several lines of inquiry advise against early introduction to multiple procedures. Novices in a domain can easily be overwhelmed with information, so learning theories such as cognitive load theory specify that novices should not be exposed to too much information at once (e.g., Sweller, van Merrienboer, & Paas, 1998). For example, learners with low prior knowledge learned more when introduction of a second skill (graphing linear functions) was delayed, rather than presented simultaneously with the first skill (using spreadsheets) (Clarke, Ayres, & Sweller, 2005). Evidence such as this has led to the instructional design principle to carefully sequence instruction to allow for gradual build-up of multiple ideas, rather than simultaneous presentation of multiple ideas (Carnine, 1997). In addition, teachers often voice concern that introducing multiple procedures to low-knowledge and low-ability students will confuse the students (Leikin, 2003; Silver, Ghousseini, Gosen, Charalambous, & Strawhun, 2005). Overall, there are several reasons to predict that multiple procedures should not be introduced simultaneously.

Because of these concerns, we compared the effectiveness of novices comparing procedures (i.e., immediate-CP [comparison of procedure]) to two common, alternative instructional approaches that delay introduction to multiple procedures. The first is providing instruction on one procedure before comparing it to alternative procedures (i.e., delayed-CP), and the second is providing instruction on one procedure before introducing alternative procedures, without comparison (i.e., delayed-exposure).

In our delayed-CP condition, rather than supporting comparison of procedures immediately, we first developed students’ familiarity with one procedure. After gaining familiarity with one procedure, students should benefit from comparing it to alternatives, as has been found in previous studies with more experienced learners (Rittle-Johnson & Star, 2007, 2009; Star & Rittle-Johnson, 2009). To support learning of a single procedure, we had students compare examples of one procedure used to solve different problems because this can help learners abstract a more general solution procedure that is less tied to specific problem features (e.g., Gentner, Loewenstein, & Thompson, 2003) and has been shown to support learning of equation solving (Rittle-Johnson & Star, 2009). Next, students compared the procedure to alternatives. Overall, the delayed-CP condition could prepare novices to learn from comparing procedures.

In our delayed-exposure condition, students learned one procedure and later were introduced to alternative procedures, and they were never encouraged to compare the problems or procedures. This condition parallels a traditional instructional approach of students studying and practicing one procedure at a time (Hiebert et al., 2003). For example, comparison of multiple procedures was used in less than 5% of problems in representative 8th-grade mathematics lessons from countries around the world, with the exception of Japan (Hiebert et al., 2003). This approach was also the norm for our participants.

Table 1. Alternative solution procedures for three types of linear equations

Equation typeᵃ: Type I, a(x + b) = c, where a and c are integers and c is evenly divisible by a
  Sample solution via distribute-first procedure: 3(x + 1) = 15; 3x + 3 = 15; 3x = 12; x = 4
  Sample solution via shortcut procedure (Divide-first): 3(x + 1) = 15; x + 1 = 5; x = 4

Equation typeᵃ: Type II, a(x + b) = c, where a is a fraction
  Sample solution via distribute-first procedure: 1/4(x + 8) = 5; 1/4x + 2 = 5; 1/4x = 3; x = 12
  Sample solution via shortcut procedure (Multiply-first): 1/4(x + 8) = 5; x + 8 = 20; x = 12

Equation typeᵃ: Type III, a(x + b) + d(x + b) = c, where a and d are integers and c is evenly divisible by (a + d)
  Sample solution via distribute-first procedure: 2(x + 1) + 3(x + 1) = 10; 2x + 2 + 3x + 3 = 10; 5x + 5 = 10; 5x = 5; x = 1
  Sample solution via shortcut procedure (Combine-first): 2(x + 1) + 3(x + 1) = 10; 5(x + 1) = 10; x + 1 = 2; x = 1

ᵃ“x” stands for a variable and other letters were replaced with numbers.

Overall, the delayed-CP and delayed-exposure conditions are representative of common alternatives to immediately introducing and comparing procedures with novices. These two alternatives reflect a natural consequence of delayed introduction to multiple procedures – less exposure to multiple procedures. Because of limited classroom time to focus on a particular topic, when teachers delay introduction to multiple procedures, they often spend less time on multiple procedures (e.g., Blote et al., 2001; Klein et al., 1998).

Importance of algebra

In the current study, we experimentally evaluated the impact of these three different learning conditions for developing procedural flexibility with novice equation solvers. Many in mathematics education consider linear equation solving a foundational skill (National Mathematics Advisory Panel, 2008). Regrettably, students often memorize rules and do not learn flexible and meaningful ways to solve equations (Kieran, 1992).

Examples of different procedures for solving three types of linear equations are shown in Table 1. Each of these equation types can be solved by the distribute-first procedure, which is a conventional and commonly taught procedure for solving linear equations that applies to most equations. However, each type can also be solved by an alternative procedure that treats expressions such as (x + b) as a composite variable. These alternatives are arguably shortcuts: they are more efficient because they involve fewer steps and fewer computations; thus, they may be executed faster and with fewer errors. These non-conventional procedures can push children to understand important problem features and to reflect on when different procedures are most efficient.
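The step-count difference between the two approaches can be illustrated with a minimal Python sketch for equations of the form a(x + b) = c (Types I and II in Table 1). The function names `distribute_first` and `shortcut` are hypothetical and introduced only for illustration; they are not part of the study materials, and the step strings are formatted naively (a negative b prints as `x + -3`).

```python
from fractions import Fraction

def distribute_first(a, b, c):
    """Solve a(x + b) = c by distributing first; return (x, list of steps)."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    steps = [f"{a}x + {a * b} = {c}"]   # distribute a across (x + b)
    rhs = c - a * b
    steps.append(f"{a}x = {rhs}")       # subtract a*b from both sides
    x = rhs / a
    steps.append(f"x = {x}")            # divide both sides by a
    return x, steps

def shortcut(a, b, c):
    """Solve a(x + b) = c by treating (x + b) as a composite variable."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    inner = c / a
    steps = [f"x + {b} = {inner}"]      # divide (or multiply) both sides first
    x = inner - b
    steps.append(f"x = {x}")            # subtract b from both sides
    return x, steps

if __name__ == "__main__":
    # Type I and Type II sample equations from Table 1.
    for a, b, c in [(3, 1, 15), (Fraction(1, 4), 8, 5)]:
        x_d, steps_d = distribute_first(a, b, c)
        x_s, steps_s = shortcut(a, b, c)
        print(f"distribute-first: {steps_d} ({len(steps_d)} steps)")
        print(f"shortcut:         {steps_s} ({len(steps_s)} steps)")
```

Both functions reach the same solution, but the shortcut needs one fewer written step for these equation types, which is the sense in which the text calls it more efficient.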

Current study

We worked with 8th-grade students learning to solve equations during two 80- to 90-min classroom lessons. Students had received very little prior instruction on equation solving. They were randomly assigned to immediate comparison of procedures, delayed comparison of procedures or delayed exposure to multiple procedures. Before and after the intervention, we assessed students’ flexibility for equation solving as well as their conceptual and procedural knowledge for equation solving, and we included a 1-month retention test. Conceptual knowledge was defined as ‘an integrated and functional grasp of mathematical ideas’ (Kilpatrick et al., 2001, p. 118), including the ability to recognize and explain key domain concepts (Carpenter et al., 1998; Hiebert & Wearne, 1996). Procedural knowledge was defined as the ability to execute action sequences to correctly solve problems (Hiebert & Wearne, 1996; Rittle-Johnson, Siegler, & Alibali, 2001).

Students in all conditions studied worked examples of hypothetical students’ solution procedures and responded to explanation prompts with a partner. Using worked examples ensured exposure to multiple procedures and can make learning more effective and efficient, particularly when learners are prompted to generate explanations (see Atkinson, Derry, Renkl, & Wortham, 2000 for a review). Working with a partner tends to support greater learning than working alone (e.g., Johnson & Johnson, 1994).

Method

Participants

All 250 students from 11 Grade-8 classrooms at two US middle schools participated. Fifty-two students were dropped from the analysis; 47 of the students because they were absent for at least 1 day of the intervention, two because they were absent for both the post-test and retention test, and three because they were unable to complete our materials because of significant learning disabilities or very limited English proficiency. Of the remaining 198 students, 99 were female, 77% were Caucasian (14% African American, 7% Hispanic, and 2% Asian), and the average age was 14.1 years (range 13.2–15.9 years). At one school, 46% of students were eligible for free or reduced lunch; at the other, 25% were eligible. All teachers were using a general Grade-8 mathematics curriculum that gave very limited attention to equation solving (Bailey et al., 2004). Teachers reported introducing students to one- and two-step equation solving for a few days before the study started.

Design

Within each classroom, students were randomly paired with another student and the pairs were randomly assigned to the immediate-CP condition (n = 67), the delayed-CP condition (n = 62), or the delayed-exposure condition (n = 69). During the 2-day intervention, all students studied worked examples with a partner and answered explanation prompts.
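The within-classroom pairing and assignment described here could be sketched in Python as follows. This is an illustrative sketch only: the function name `assign_pairs` is hypothetical, and rotating through a shuffled condition order to keep group counts balanced is an assumption, since the article does not specify how the random assignment was balanced. Forming a triad when a class has an odd number of students follows the Procedure section.

```python
import random

CONDITIONS = ["immediate-CP", "delayed-CP", "delayed-exposure"]

def assign_pairs(students, rng):
    """Randomly pair students in one classroom, then assign each pair a condition.

    Returns a list of (group, condition) tuples. With an odd class size,
    the leftover student joins the last pair to form a triad.
    """
    shuffled = students[:]
    rng.shuffle(shuffled)
    # Consecutive students in the shuffled list form a pair.
    groups = [shuffled[i:i + 2] for i in range(0, len(shuffled) - 1, 2)]
    if len(shuffled) % 2 == 1 and groups:
        groups[-1].append(shuffled[-1])
    # Assumption: cycle through a randomly ordered condition list so that
    # conditions receive roughly equal numbers of groups.
    order = CONDITIONS[:]
    rng.shuffle(order)
    return [(group, order[i % len(order)]) for i, group in enumerate(groups)]

if __name__ == "__main__":
    rng = random.Random(2012)  # fixed seed so the example is reproducible
    classroom = [f"student_{i:02d}" for i in range(1, 24)]  # hypothetical roster
    for group, condition in assign_pairs(classroom, rng):
        print(condition, group)
```

Because condition is assigned to the pair rather than to the individual, both partners always work from the same packet, which is why condition is later treated as a dyad-level variable.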

Materials

All materials are available at: http://gseacademic.harvard.edu/contrastingcases/curriculum.html

A. Immediate-Comparison-of-Procedures Condition

Alex’s distribute-first way:
2(x – 3) = 8
2x – 6 = 8      Distributed 2
2x = 14         Added __________ on Both
x = 7           Divided by __________ on Both

Morgan’s multiply/divide-first way:
2(x – 3) = 8
x – 3 = 4       Divided by 2 on Both
x = 7           Added __________ on Both

1. How do you know that both ways to solve the problem are correct?
2. Alex and Morgan divided both sides by 2, but in different steps. Why is the divide step OK to do in either step?

B. Delayed-Comparison-of-Procedures Condition (Day 1)

Alex’s distribute-first way:
2(x – 3) = 8
2x – 6 = 8      Distributed 2
2x = 14         Added _________ on Both
x = 7           Divided by _________ on Both

Alex’s distribute-first way:
4(x + 2) = 12
4x + 8 = 12     Distributed 4
4x = 4          Subtracted _________ on Both
x = 1           Divided by _________ on Both

1. How do you know that Alex solved both problems correctly?
2. On the second step, why did Alex add on both sides in the first problem and subtract on both sides in the second problem?

C. Delayed Exposure Condition

Alex’s distribute-first way:
2(x – 3) = 8
2x – 6 = 8      Distributed 2
2x = 14         Added _________ on Both
x = 7           Divided by _________ on Both

1. How do you know that Alex solved the problem correctly?

-------------------------------- NEXT PAGE ------------------------------

Alex’s distribute-first way:
4(x + 2) = 12
4x + 8 = 12     Distributed 4
4x = 4          Subtracted _________ on Both
x = 1           Divided by _________ on Both

2. Why did Alex subtract on both sides as the second step?

Figure 1. Sample page of the intervention packet for each condition on Day 1. On Day 2, the immediate-CP and delayed-CP conditions studied example pairs like that shown in Panel A.

Intervention

Each packet contained 18 worked examples illustrating solutions to instances of the three problem types illustrated in Table 1, with the packets divided into two sections to distribute the material over two intervention sessions. The packets differed in whether the shortcut procedures were included on the first day and in whether and how the worked examples were paired. Exposure to different problem types was the same across packets.

In the immediate-CP packet, worked examples were presented in pairs, and each pair contained the same equation solved using the distribute-first and a shortcut procedure (see Figure 1). In the delayed-CP packet, examples were also presented in pairs, but on Day 1, each pair contained two different types of equations, each solved using the distribute-first procedure. On Day 2, the packet was identical to the immediate-CP packet. In the delayed-exposure packet, worked examples were presented one at a time, and the examples were the same as those presented in the delayed-CP condition. A natural consequence of delayed exposure to multiple procedures was that students in the latter two conditions saw five examples of shortcut procedures, compared to nine in the immediate-CP condition.

An explanation prompt accompanied each worked example. In the immediate-CP and delayed-CP conditions, prompts focused on comparing the feasibility and efficiency of the solution steps for the given problem(s) (see Figure 1). Prompts in the delayed-exposure packets focused on the feasibility of the solution steps for the given problem.

To facilitate learning of the procedures across conditions, the packets also included (1) five guided practice problems that prompted students to use a procedure demonstrated in one of the worked examples to solve an isomorphic equation and (2) 12 independent practice problems that prompted students to solve equations using any method(s) they chose. In addition, two brief homework assignments were distributed at the end of each day, each with six equations to solve.

All of the materials were modified from the ones used in Rittle-Johnson et al. (2009) to make them more accessible to novices. We no longer included equations with variables on both sides, introduced new equation types more slowly, had six fewer worked examples and explanation prompts, and added 30 min to the intervention time. On the worked examples, we also added descriptive labels for each procedure to help highlight that the same procedure was being demonstrated on multiple problems.

Assessment

The same assessment was used as an individual pre-test, post-test, and retention test. Sample items are outlined in Table 2. Flexibility knowledge was an independent measure of students’ ability to generate multiple methods for solving a problem and to evaluate non-standard solution steps. Flexible use was assessed by how frequently the appropriate shortcut procedures were used on the procedural knowledge problems. The measure of procedural knowledge was students’ ability to solve equations that had problem features that were both familiar and unfamiliar (e.g., additional terms or new operators). The conceptual knowledge items gauged students’ verbal and non-verbal knowledge of algebra concepts. Six of the items were taken from Rittle-Johnson and colleagues (2009) and six were new. The new items were meant to better assess conceptual knowledge in a sample with limited equation-solving experience. At the beginning of the assessment, students also received five warm-up problems, such as simplifying an expression using the distributive property.

Procedure

Data collection primarily took place over four consecutive classroom periods within students’ regular mathematics classes. On the first and fourth day, students were given 45 min to complete the pre-test and post-test, respectively. They spent the second and third days completing the intervention during their full math period, which lasted an average of 84 min (range 70–91 min). The second day began with a scripted 10-min lesson on how to simplify expressions and to solve an equation using the distributive property as well as suggestions for how to work productively with a partner. Next, students each received the appropriate packet and completed it with their assigned partner for the remainder of the class period. When there were an uneven number of students in a class, one group of students worked in a triad (N = 10). If a student’s partner was absent, they joined another pair assigned to the same condition for that day. The following day began with a brief, scripted lesson about using the distributive property to solve two equations with fractions. Next, students worked with their partner on the second half of the packet. Immediately before the end of the class period, an 8-min summary lesson was given to the entire class, emphasizing that there are multiple ways to solve an equation. Finally, students completed the retention test 25–27 days later.

Table 2. Sample assessment items

I. Procedural flexibility

  a. Flexibility knowledge (α = .82)

    i. Generate multiple procedures (n = 4)
       Sample item: Solve this equation in two different ways: 18 = 3(x + 2)
       Scoring: 1 pt for two correct, unique algebraic solutions.

    ii. Evaluate non-standard procedures (n = 2)
       Sample item: Adam’s first step:
           2(s + 3(s – 1)) = 18
           s + 3(s – 1) = 9
         a. What step did Adam use to get from the first line to the second line?
         b. Do you think that this is a good way to start this problem? (a) a very good way; (b) OK to do, but not a very good way; (c) not OK to do.
         c. Explain your reasoning.
       Scoring: Part a: 1 pt for correct choice. Part b: 2 pts for choice a, 1 pt for choice b. Part c: 2 pts if accurately evaluate efficiency or justify why OK to do; 1 pt if specify that step is valid.

  b. Flexible use (α = .88)
       Scoring: Frequency of using shortcut procedures on procedural knowledge items.

II. Procedural knowledge (α = .83)

  a. Familiar problem features (n = 3)
       Sample item: 3(h + 2) + 4(h + 2) = 35
       Scoring: 1 pt for each correct answer.

  b. Unfamiliar problem features (n = 6)
       Sample item: 3(2x + 3x – 4) + 5(2x + 3x – 4) = 48
       Scoring: 1 pt for each correct answer.

III. Conceptual knowledge (α = .50) (n = 12)
       Sample item:
           98 = 21x
           98 + 2(x + 1) = 21x + 2(x + 1)
         (a) Without solving the equations, decide if these equations are equivalent.
         (b) Explain your reasoning.
       Scoring: 1 pt for selecting yes and 1 pt for mentioning equivalence of equations.

Note: Cronbach’s alphas are the average of post-test and retention test.

Coding

Assessment

The assessment items were scored according to the guidelines in Table 2, and percentage correct was calculated. We also coded students’ solution procedures on procedural knowledge items, based on whether their first step was (1) distributing across parentheses, (2) using one of the shortcut steps that had been demonstrated in the worked examples, (3) using an unusual or incorrect algebraic step, (4) using an informal, non-algebraic approach, such as guess-and-test or unwind, or (5) not attempting the problem. If students used an algebraic procedure to solve at least one equation at pre-test, we categorized them as attempting algebra at pre-test. A second person coded the solution methods and explanation qualities across the assessments for 20% of participants, with Cohen’s Kappa coefficients ranging from .72 to 1.0.
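For readers unfamiliar with the agreement statistic reported here, the following minimal Python sketch computes Cohen's kappa for two coders who each assign one category per item. The function name `cohens_kappa` and the example codes are hypothetical illustrations; the article reports only the resulting coefficients, not the underlying codes.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: chance-corrected agreement between two coders.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the agreement expected if the coders assigned categories independently
    at their own marginal rates.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_expected == 1:
        return 1.0  # both coders used a single identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)

if __name__ == "__main__":
    # Hypothetical first-step codes (1-5, matching the five categories above).
    first_coder = [1, 1, 2, 2, 3, 4, 5, 1, 2, 3]
    second_coder = [1, 1, 2, 3, 3, 4, 5, 1, 2, 2]
    print(round(cohens_kappa(first_coder, second_coder), 2))
```

Kappa is lower than raw percentage agreement because it discounts the matches two coders would produce by chance given how often each uses every category.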

Intervention

For each student, we tallied the number of practice problems completed and coded the frequency of using the shortcut procedures. We also coded students’ written explanations based on types of comparisons made and general features of the explanations. Two independent coders coded this information for 20% of the sample, and Cohen’s Kappa coefficient ranged from .77 to .94.

Data analysis

Missing data

Twelve students were absent at pre-test, 7 different students were absent at post-test, and 19 different students were absent at retention test. Statisticians strongly recommend imputing values for missing data, rather than omitting participants with missing data, because it leads to more precise and unbiased conclusions (Peugh & Enders, 2004; Schafer & Graham, 2002). Non-missing data are used to estimate missing values through an iterative process using Maximum Likelihood Estimation; see Schafer and Graham (2002) for an accessible overview of handling missing data. We imputed missing data using the missing value analysis module of SPSS 18.0. Findings using a case-wise deletion approach yielded the same basic findings.

Multi-level models

Because students worked with a partner during the intervention, their subsequent performance may not be independent of one another, which would violate the assumption of independence in analysis of variance (ANOVA) models. To test for non-independence in partner scores on the post-test and retention test, we calculated intra-class correlations, controlling for the predictor variables (Kenny, Kashy, & Cook, 2006). Intra-class correlations ranged from −.19 to +.40, with significant intra-class correlations on multiple measures, indicating non-independence in the data. Therefore, we used multi-level linear models. We specified the use of restricted maximum likelihood (REML) estimation and compound symmetry for the variance–covariance structure in the models (Kenny et al., 2006). The significance tests used the Satterthwaite (1946) approximation to estimate the degrees of freedom.
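As a rough illustration of what an intra-class correlation for partner scores measures, the sketch below computes the standard one-way ANOVA estimate for groups of size two, (MSB − MSW) / (MSB + MSW). This is a simplified, unadjusted version: the analysis described above controlled for the predictor variables, which this sketch does not do, and the function name `dyad_icc` and the example scores are hypothetical.

```python
def dyad_icc(pairs):
    """Unadjusted intra-class correlation for dyads.

    `pairs` is a list of (score_partner_1, score_partner_2) tuples.
    Uses one-way ANOVA mean squares with group size k = 2:
        ICC = (MSB - MSW) / (MSB + (k - 1) * MSW)
    Values near 0 suggest partners' scores are independent; positive values
    mean partners resemble each other; negative values mean they diverge.
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two dyads")
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    # Between-dyad mean square: 2 * squared deviations of dyad means, df = n - 1.
    msb = 2 * sum(((a + b) / 2 - grand_mean) ** 2 for a, b in pairs) / (n - 1)
    # Within-dyad mean square: each dyad contributes (a - b)^2 / 2, df = n.
    msw = sum((a - b) ** 2 / 2 for a, b in pairs) / n
    if msb + msw == 0:
        raise ValueError("no variance in scores")
    return (msb - msw) / (msb + msw)

if __name__ == "__main__":
    # Hypothetical post-test percentages for five dyads.
    scores = [(40, 45), (70, 60), (55, 50), (20, 35), (80, 85)]
    print(round(dyad_icc(scores), 2))
```

A clearly non-zero value of this kind is what motivates modelling students as nested within dyads rather than treating each student as an independent observation.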

Our model had two levels – the individual level and the dyad level. Effects of students’ pre-test knowledge were tested in the individual level of the model (i.e., their pre-test conceptual, procedural and flexibility knowledge scores, and whether they used algebra on the pre-test). Effect of condition was tested at the dyad level, and we specified the immediate-CP condition as the referent condition. We ran a separate model for each outcome.

In our initial models, we explored whether use of algebra at pre-test interacted with condition in predicting post-test or retention test outcomes, as it had in Rittle-Johnson et al. (2009), but none of the interaction terms were significant. Thus, we did not include the interaction term in the final models.

ResultsPre-test knowledgeAt pre-test, students had some flexibility knowledge (M = 28% correct), proceduralknowledge (M = 15% correct), and conceptual knowledge (M = 23% correct). However,students had little knowledge of algebraic methods (see Table 3). Only 56% of studentsattempted to use any algebraic procedure at least once, whether correctly or incorrectly,and less than 25% used a correct algebraic procedure. One-way ANOVAs confirmed thatthere were no significant differences between conditions on any of the pre-test measures,p’s from .11 to .86.

Effects of condition at post-test and retention testStudents in the three conditions differed in procedural flexibility, particularly in flexibleuse, at both post-test and retention test (see Figure 2). At post-test, there were nodifferences in procedural knowledge, but there were differences at retention test (seeFigure 2). The three conditions did not differ in conceptual knowledge at either timepoint.

Table 3. Use of procedures by condition (percentage of problems). Each shortcut was appropriate on a third of the problems

| Assessment | Condition | Distribute-first | Divide-first shortcut | Multiply-first shortcut | Combine-first shortcut | Other algebra | Non-algebra | Blank |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Pre-test | Immediate-CP | 7.9 | 0.9 | 0.2 | 0.2 | 20.5 | 26.9 | 43.2 |
| Pre-test | Delayed-CP | 3.6 | 1.2 | 1.0 | 0.0 | 20.4 | 30.5 | 43.8 |
| Pre-test | Delayed-exposure | 5.0 | 1.4 | 0.4 | 0.2 | 19.4 | 35.4 | 38.7 |
| Intervention Day 1 | Immediate-CP | 50.0 | 18.2 | n/a | 17.2 | 5.5 | 2.7 | 6.5 |
| Intervention Day 1 | Delayed-CP | 83.6∗∗∗ | 0.3∗∗∗ | n/a | 1.9∗∗∗ | 2.2 | 3.8 | 8.3 |
| Intervention Day 1 | Delayed-exposure | 84.5∗∗∗ | 1.9∗∗∗ | n/a | 0.0∗∗∗ | 6.5 | 1.7 | 5.3 |
| Intervention Day 2 | Immediate-CP | 29.6 | 12.9 | 23.6 | 16.9 | 4.7 | 3.0 | 9.2 |
| Intervention Day 2 | Delayed-CP | 49.5∗ | 3.0∗∗∗ | 16.7 | 6.7∗∗ | 8.1 | 2.4 | 13.7 |
| Intervention Day 2 | Delayed-exposure | 44.4 | 4.6∗∗∗ | 20.3 | 9.2∗ | 9.4 | 2.2 | 9.9 |
| Post-test | Immediate-CP | 23.8 | 9.6 | 4.7 | 12.1 | 23.3 | 7.3 | 19.2 |
| Post-test | Delayed-CP | 26.6 | 3.0∗∗ | 2.8 | 3.9∗∗∗ | 30.7 | 9.9 | 23.3 |
| Post-test | Delayed-exposure | 29.0 | 3.1∗∗ | 4.1 | 7.4 | 27.7 | 7.2 | 21.8 |
| Retention test | Immediate-CP | 21.0 | 8.2 | 3.6 | 11.7 | 23.2 | 10.4 | 23.3 |
| Retention test | Delayed-CP | 25.9 | 2.2∗ | 1.2∗ | 2.2∗∗∗ | 25.0 | 14.4 | 29.4 |
| Retention test | Delayed-exposure | 24.7 | 2.9∗ | 1.8 | 4.8∗ | 24.6 | 11.8 | 29.3 |

Note: Condition differs from immediate-CP at ∗p ≤ .05; ∗∗p ≤ .01; ∗∗∗p ≤ .001. See Table 1 for an example of each shortcut.

First, consider students’ flexibility knowledge. As shown in Figure 2, students in the immediate-CP condition scored the highest and students in the delayed-CP condition scored the lowest. The omnibus test for the effect of condition indicated that it had a marginal effect on flexibility knowledge at post-test, F(2, 99) = 2.52, p = .085, and a significant effect at retention, F(2, 99) = 4.28, p = .017. Parameter estimates from the models indicated how much the delayed-CP and delayed-exposure conditions differed from the immediate-CP condition (see Table 4), and we calculated Cohen’s d to measure the size of these effects. For example, the parameter estimate for the delayed-CP condition was −8.28 at post-test, indicating that flexibility scores were about 8 points lower than in the immediate-CP condition, and Cohen’s d was −.39, a medium-sized effect. At retention, flexibility scores in the delayed-CP condition were about 10 points lower, also a medium effect size, d = −.50. Flexibility scores in the delayed-exposure condition were not significantly lower than in the immediate-CP condition; however, the small-to-medium effect sizes suggest that the lack of reliable differences may be due to low power (d’s = −.35 and −.32 for post-test and retention test, respectively).
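The effect sizes reported for flexibility knowledge are Cohen's d values. The paper does not state which standard deviation entered the calculation, so the sketch below shows only the conventional pooled-SD form; the function names and the example numbers are hypothetical and are not taken from the study.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(mean_group, mean_reference, sd_pooled):
    """Standardized mean difference; negative when the group scores below the reference."""
    return (mean_group - mean_reference) / sd_pooled


# Hypothetical numbers: a comparison group scoring 8 points below the
# reference group, with both groups having SD = 20 and n = 60.
sd = pooled_sd(20.0, 60, 20.0, 60)
print(cohens_d(42.0, 50.0, sd))
```

By Cohen's conventional benchmarks, absolute values near .2, .5, and .8 are read as small, medium, and large effects.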

Next, consider students’ flexible use of procedures. On the procedural knowledge items, choosing the appropriate shortcut for a particular problem indicated more flexible use of solution procedures. The omnibus test indicated that flexible use depended on condition at post-test, F(2, 109) = 4.86, p = .010, and at retention test, F(2, 106) = 6.04, p = .003. Parameter estimates indicated that flexible use scores in the delayed-CP condition were about 14 points lower than in the immediate-CP condition at both post-test and retention, a medium-to-large effect (d’s = −.61 and −.67, respectively; see Table 4). Flexible use scores were 9 points lower in the delayed-exposure condition than in the immediate-CP condition at post-test and 10 points lower at retention (d’s = −.44 and −.50, respectively; see Table 4). Follow-up analyses on the frequency of using each type of shortcut indicated that immediate-CP supported greater adoption of the divide-first shortcut than either of the other conditions and greater adoption of the combine-first shortcut than the delayed-CP condition (see Table 3).

This greater use of shortcut procedures was also reflected in the size of individual children’s repertoire of procedures. At the retention test, students in the immediate-CP condition used an average of 1.4 (SD = 1.2) correct algebraic procedures, whereas students in the delayed-CP and delayed-exposure conditions used an average of 0.9 (SD = 0.9) and 1.0 (SD = 0.9), respectively, β’s = −.50 and −.37, p’s < .05. Overall, immediate comparison of procedures supported more flexible use of procedures after a delay.

Figure 2(A). (A) Flexibility knowledge, (B) flexible use of procedures. Values are estimated marginal means with standard error bars.

Greater flexibility was related to greater procedural and conceptual knowledge. When flexible use and flexibility knowledge at post-test were added as predictors in the model of procedural knowledge at post-test, both were substantial predictors, β’s = .32 and .26, respectively, p’s ≤ .001. This was also true at retention; both measures of flexibility at retention were related to procedural knowledge at retention, β’s = .37 and .39, respectively, all p’s < .001. When we conducted parallel models with conceptual knowledge as the dependent variable, flexible use and flexibility knowledge were both significant predictors of conceptual knowledge at post-test, β = .09, p = .06 and β = .10, p = .04, respectively, and at retention, β’s = .12 and .22, respectively, p’s < .05.

Figure 2(B). (C) Procedural knowledge, at post-test and retention test. Values are estimated marginal means with standard error bars.

Finally, consider the direct effect of condition on procedural knowledge. At post-test, there was no main effect of condition, p = .42 (see Figure 2). However, at retention test, condition did influence procedural knowledge, F(2, 100) = 3.83, p = .025. In particular, procedural knowledge scores in the delayed-CP condition were 11 points lower than in the immediate-CP condition (d = −.48), although scores in the delayed-exposure condition were not significantly lower (d = −.17). Was the effect of condition on procedural knowledge mediated by its effect on flexible use? Recall that condition had a significant impact on flexible use at post-test. In turn, flexible use at post-test was related to procedural knowledge at retention, β = .356, p < .001. To test for mediation, we added students’ flexible use at post-test to the model predicting procedural knowledge at retention, and the main effect of condition was no longer significant, F(2, 104) = 1.63, p = .201. A Sobel test confirmed that flexible use was a mediator, z = 2.62, p = .009.
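The Sobel statistic used in the mediation analysis can be sketched as follows. This is the standard first-order form, with a the path from condition to the mediator and b the path from the mediator to the outcome; the paper does not report the unstandardized paths and standard errors that entered its test, so the inputs in the first call are hypothetical. The second call only converts the reported z into a two-tailed p value.

```python
import math


def sobel_z(a, se_a, b, se_b):
    """First-order Sobel z for the indirect effect a * b."""
    return (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)


def two_tailed_p(z):
    """Two-tailed p value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical path estimates and standard errors (not values from the study)
print(sobel_z(a=2.0, se_a=1.0, b=3.0, se_b=1.0))

# Two-tailed p for the z reported in the text
print(round(two_tailed_p(2.62), 3))
```

With z = 2.62 the two-tailed p value rounds to .009, which matches the value reported in the text.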

Effect of condition on intervention activities

Practice problems

Students solved an average of 11 of the 12 independent practice problems and this did not vary by condition. However, frequency of use of the shortcut procedures did vary by condition, as shown in Table 3. Students in the immediate-CP condition used two of the three shortcut procedures more frequently. In turn, overall frequency of using shortcut procedures during the intervention was predictive of accuracy on the procedural knowledge assessment at both post-test and retention test, β = .14, p = .017 and β = .13, p = .043, respectively, with pre-test measures in the model.

Table 4. Parameter estimates for post-test and retention test outcomes

| Assessment | Predictor | Flexibility knowledge | Flexible use | Procedural knowledge | Conceptual knowledge |
| --- | --- | --- | --- | --- | --- |
| Post-test | Intercept | 53.14 (2.95)∗∗∗ | 27.29 (3.45)∗∗∗ | 42.86 (3.09)∗∗∗ | 31.20 (1.58)∗∗∗ |
| Post-test | Condition (reference = immediate-CP): Delayed-CP | −8.28 (3.82)∗ | −14.07 (4.56)∗∗ | −5.31 (3.99) | −0.69 (1.98) |
| Post-test | Condition (reference = immediate-CP): Delayed-exposure | −5.94 (3.75) | −8.55 (4.49)† | −2.66 (3.92) | −0.78 (1.94) |
| Post-test | Not use algebra at pre-test | −9.83 (3.15)∗∗ | −6.93 (3.34)∗ | −16.01 (3.31)∗∗∗ | −2.35 (1.86) |
| Post-test | Pre-test conceptual | 0.30 (0.12)∗ | 0.38 (0.13)∗∗ | 0.23 (0.13) | 0.44 (0.07)∗∗∗ |
| Post-test | Pre-test procedural | 0.24 (0.07)∗∗ | 0.23 (0.08)∗∗ | 0.54 (0.08)∗∗∗ | 0.14 (0.04)∗∗ |
| Post-test | Pre-test flexibility | 0.50 (0.09)∗∗∗ | 0.19 (0.10) | 0.42 (0.09)∗∗∗ | 0.22 (0.05)∗∗∗ |
| Retention test | Intercept | 48.72 (2.61)∗∗∗ | 22.77 (3.20)∗∗∗ | 38.70 (3.19)∗∗∗ | 32.52 (2.03)∗∗∗ |
| Retention test | Condition: Delayed-CP | −9.83 (3.38)∗∗ | −14.36 (4.26)∗∗ | −10.74 (4.07)∗∗ | −1.05 (2.64) |
| Retention test | Condition: Delayed-exposure | −3.95 (3.31) | −10.04 (4.19)∗ | −2.32 (3.99) | −0.18 (2.59) |
| Retention test | Not use algebra at pre-test | −9.55 (2.81)∗∗ | −4.67 (3.04) | −13.41 (3.53)∗∗∗ | −5.69 (2.15)∗∗ |
| Retention test | Pre-test conceptual | 0.44 (0.11)∗∗∗ | 0.20 (0.12) | 0.21 (0.14) | 0.42 (0.08)∗∗∗ |
| Retention test | Pre-test procedural | 0.32 (0.07)∗∗∗ | 0.23 (0.07)∗∗ | 0.50 (0.08)∗∗∗ | 0.23 (0.05)∗∗∗ |
| Retention test | Pre-test flexibility | 0.30 (0.08)∗∗∗ | 0.22 (0.09)∗ | 0.23 (0.10)∗ | 0.15 (0.06)∗ |

Note. Unstandardized coefficients are shown with standard errors in parentheses. Pre-test conceptual, procedural, and flexibility knowledge were grand mean centered. †p ≤ .06; ∗p ≤ .05; ∗∗p ≤ .01; ∗∗∗p ≤ .001.
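To make the unstandardized coefficients in Table 4 concrete, the sketch below turns the post-test flexibility knowledge column into a prediction function. The coefficient values are those reported in Table 4; the function and key names are ours, and the coding is an assumption: the "Not use algebra at pre-test" row is treated as a 0/1 indicator, and pre-test scores are entered as deviations from the grand mean.

```python
# Post-test flexibility knowledge column of Table 4 (unstandardized coefficients)
COEF = {
    "intercept": 53.14,
    "delayed_cp": -8.28,
    "delayed_exposure": -5.94,
    "no_algebra_at_pretest": -9.83,
    "pretest_conceptual": 0.30,
    "pretest_procedural": 0.24,
    "pretest_flexibility": 0.50,
}

CONDITIONS = ("immediate_cp", "delayed_cp", "delayed_exposure")


def predict_flexibility_posttest(condition, used_algebra,
                                 conceptual_c=0.0, procedural_c=0.0,
                                 flexibility_c=0.0):
    """Predicted post-test flexibility knowledge (percentage points).

    The *_c arguments are pre-test scores centered at the grand mean,
    so 0.0 stands for a student with an average pre-test score.
    """
    if condition not in CONDITIONS:
        raise ValueError(f"unknown condition: {condition}")
    score = COEF["intercept"]
    # immediate-CP is the reference condition, so it adds nothing
    if condition != "immediate_cp":
        score += COEF[condition]
    if not used_algebra:
        score += COEF["no_algebra_at_pretest"]
    score += COEF["pretest_conceptual"] * conceptual_c
    score += COEF["pretest_procedural"] * procedural_c
    score += COEF["pretest_flexibility"] * flexibility_c
    return score


# Average-pre-test student who used algebra at pre-test, in each condition
for cond in CONDITIONS:
    print(cond, round(predict_flexibility_posttest(cond, used_algebra=True), 2))
```

Read this way, the intercept is the expected score of an immediate-CP student with average pre-test scores who attempted algebra at pre-test, and each condition coefficient shifts that expectation.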

Explanations on worked examples

A majority of the intervention was spent studying and explaining worked examples. Students answered 94% of the 18 available explanation prompts and this did not vary between conditions.

We coded the students’ explanations for different types of comparisons as well as for general characteristics, as described in Table 5. Students discussed their explanations with their partner, so partners’ written explanations were very similar. We only coded the written explanations of one randomly selected member of each pair and used ANOVAs to evaluate the impact of condition on the frequency of each explanation type. In general, the condition manipulations had their intended effects. Both the immediate-CP and delayed-CP conditions supported much more comparison and evaluation of examples. The immediate-CP condition supported the most comparison of the efficiency of two procedures and the fact that they led to the same answers, as well as general evaluation of the efficiency of a particular procedure. The delayed-CP condition supported the greatest comparison of solution steps and evaluation of problem features. Finally, the delayed-exposure condition led students to reference mathematical properties more often.

Table 5. Percentage of intervention explanations containing each feature, by condition

| Explanation characteristic | Sample explanations | Immediate-CP | Delayed-CP | Delayed-exposure |
| --- | --- | --- | --- | --- |
| 1. Any comparison^a | At least one comparison | 56 | 56 | 7 |
| Compare efficiency^b | ‘Morgan’s way because it has less steps’. | 16 | 7 | 0 |
| Compare solution steps^b | ‘Alex distributed and Morgan combined like terms’. | 30 | 36 | 1 |
| Compare problem features^a | ‘If the x + − numbers are the same, it will work all the time’. | 18 | 24 | 5 |
| Compare answers^b | ‘Both got the same answer’. | 12 | 7 | 2 |
| 2. Any evaluation^a | | 47 | 41 | 31 |
| Evaluate efficiency^b | ‘James’ way was just faster’. | 41 | 25 | 19 |
| Evaluate problem features^c | ‘Heather’s problem has easier numbers’. | 8 | 16 | 13 |
| 3. Reference mathematical properties^a | ‘They combined like terms’. | 7 | 12 | 21 |

Notes: Because of the multiple tests required with 10 explanation categories, we used a Bonferroni correction to adjust the critical p-value to .005.
^a Delayed-exposure differs from both delayed-CP and immediate-CP at p ≤ .001.
^b All three conditions differ from each other at p ≤ .005.
^c Delayed-CP differs from immediate-CP at p ≤ .001.
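The Bonferroni adjustment described in the note to Table 5 is simple arithmetic, sketched below. The family-wise alpha of .05 is an assumption on our part: the note states only the 10 categories and the adjusted critical value of .005, which is what .05 divided by 10 gives. The function names are hypothetical.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test critical p value under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


def is_significant(p_value, family_alpha=0.05, n_tests=10):
    """True when p_value does not exceed the Bonferroni-adjusted threshold."""
    return p_value <= bonferroni_alpha(family_alpha, n_tests)


print(bonferroni_alpha(0.05, 10))   # adjusted critical value for 10 tests
print(is_significant(0.001))        # a result at p = .001 passes the threshold
print(is_significant(0.01))         # a result at p = .01 does not
```

The correction keeps the chance of any false positive across the whole family of tests at or below the family-wise alpha, at the cost of lower power for each individual comparison.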

Summary

Overall, students in the immediate-CP condition had greater flexibility knowledge, flexible use, and retention of procedural knowledge than students in the delayed-CP condition. Knowledge gains in the immediate-CP condition were more similar to those of students in the delayed-exposure condition, except that immediate-CP led to more flexible use of procedures. Although condition had less direct impact on procedural and conceptual knowledge, greater flexibility was associated with greater procedural and conceptual knowledge. The immediate-CP condition may have supported learning during the intervention by increasing use of the shortcut procedures on the intervention practice problems and by increasing attention to the accuracy and efficiency of the procedures illustrated in the worked examples.

Discussion

Immediate comparison of procedures was effective for supporting flexibility in novices, relative to more common instructional approaches involving delayed exposure to multiple procedures. In turn, greater flexibility was related to greater procedural and conceptual knowledge. Overall, the findings suggest that comparing procedures can be effective for novices. We discuss when and why comparing procedures may be effective, considerations for when multiple procedures should be introduced, and limitations and future directions.

When and why comparing procedures may be effective

Novices can learn effectively from comparing procedures under sufficiently supportive conditions, such as covering a limited amount of material in a lesson. Thus, novices are able to learn by making analogies between two unfamiliar procedures (i.e., mutual alignment: Gentner et al., 2003; Kurtz, Miao, & Gentner, 2001). During mutual alignment, people notice potentially relevant features in two unfamiliar examples by identifying their similarities and then focusing attention on and making sense of these similarities (Gentner et al., 2003). Indeed, novices who compared procedures often made comparisons between the two examples, focusing on comparing problem features, solution steps, answers, and the relative efficiency of the methods.

Although immediately comparing procedures was effective with novices, sequential study of examples with delayed exposure to multiple procedures was almost as effective, differing only in flexible use of procedures. Less flexible use may have arisen because students in the delayed-exposure condition had less exposure to the shortcut procedures. In classrooms, less exposure to alternative procedures is a common consequence of delayed exposure to multiple procedures (e.g., Blote et al., 2001; Klein et al., 1998). Potential advantages of the delayed-exposure condition include reduced cognitive load (students needed to attend to and process fewer things at any given time) and greater attention to mathematical properties when studying the worked examples. For novices, the advantages of comparing procedures over sequential study of the procedures are limited. The advantages grow as students increase their knowledge of relevant procedures (Rittle-Johnson & Star, 2007, 2009; Rittle-Johnson et al., 2009; Star & Rittle-Johnson, 2009).

Our attempt to develop familiarity with one procedure and then have students compare this familiar procedure to an unfamiliar one (the delayed-CP condition) was unsuccessful. In theory, supporting familiarity with one procedure so that students can learn new procedures via analogy to that procedure should be effective; it is how more experienced students appear to learn from comparing procedures (e.g., Rittle-Johnson & Star, 2007). Future research needs to evaluate the effectiveness of alternative instantiations of delayed comparison of procedures that may be beneficial.

Finally, although condition directly impacted flexibility, it had limited direct impact on conceptual and procedural knowledge. The immediate-CP group did have greater retention of procedural knowledge than the delayed-CP group, although not than the delayed-exposure group. Rather than a strong direct effect, condition had an indirect effect on these outcomes; condition had a small-to-medium sized effect on flexibility, and in turn, greater flexibility was related to greater conceptual and procedural knowledge. Often, the benefits of flexibility emerge over time as children become more skilled at executing newly learned procedures, problems become more complex, and efficiency becomes more important (e.g., Blote et al., 2000; Lemaire & Siegler, 1995).

When to introduce multiple methods

Early introduction to multiple procedures may be important for supporting procedural flexibility, particularly flexible use of procedures. Students in the immediate-CP condition were introduced to multiple procedures on the first day while students in the other two conditions were not introduced to multiple procedures until the second day. Similarly, in two classroom studies, immediate and explicit attention to multiple procedures was associated with greater procedural flexibility than delayed exposure to multiple procedures for 2nd graders learning multi-digit arithmetic (Blote et al., 2001; Klein et al., 1998). Limiting initial instruction to a single procedure may discourage reflection on the strengths and weaknesses of the procedure, may lead to strong preferences for using that procedure that are difficult to overcome, and can limit exposure to alternative procedures. In contrast, immediate introduction supports exploration and use of a variety of procedures, including when and why one might choose a particular procedure to solve a given problem.

Limitations and future directions

To test the generalizability of these findings, several lines of future research are needed. First, the relative importance of frequent exposure to multiple procedures and immediate comparison of multiple procedures for novices needs to be teased apart. Although more frequent exposure to multiple procedures often results from immediate comparison of procedures, teachers could increase exposure to alternative procedures without the use of comparison. For more experienced students, comparing procedures leads to greater flexibility than equivalent exposure to alternative solution procedures presented sequentially (Rittle-Johnson & Star, 2007; Rittle-Johnson et al., 2009; Star & Rittle-Johnson, 2009), but we need to evaluate whether the same is true for novices. Second, the impact of comparing procedures needs to be studied with a greater variety of types of procedures. For example, equation-solving procedures are algorithmic: the steps are well defined and executing them correctly will lead to a single, correct answer. The need for flexible knowledge of procedures is more salient when none of the procedures can consistently produce the correct answer. Third, future research should use measures of flexibility that encompass broader individual and environmental characteristics that can influence procedure choice and reflect students’ broader adaptive expertise (Verschaffel et al., 2009). Fourth, the impact of comparing procedures needs to be studied with students with learning disabilities. According to teacher reports, of the few students who struggled with our materials, several had a diagnosed learning disability or very limited English proficiency. Finally, the external validity of these findings is limited by the fact that this was a researcher-led intervention. We will need to better understand how classroom teachers can support effective comparisons in novices and more experienced learners.

In conclusion, comparing multiple algebraic procedures can improve procedural flexibility for novices. In turn, greater flexibility is related to greater procedural and conceptual knowledge. This study suggests that even for novices, it can pay to compare.

Acknowledgements

This research was supported with funding from the Institute of Education Sciences, US Department of Education, grants R305H050179 and R305B040110. The opinions expressed are those of the authors and do not represent views of the US Department of Education. Thanks to the students and teachers at Harpeth Valley and Walter J Baird Middle Schools for participating in this research. Thanks to Kristen Tremblay, Holly Harris, Anna Krueger, Vivien Haupt, Chrissy Tanner, and Meredith Murray for help in collecting and coding the data.

References

Atkinson, R. K., Derry, S. J., Renkl, A., & Wortham, D. (2000). Learning from examples: Instructional principles from the worked examples research. Review of Educational Research, 70, 181–214. doi:10.2307/1170661

Australian Education Ministers. (2006). Statements of learning for mathematics. Carlton South Vic, Australia: Curriculum Corporations.

Bailey, R., Day, R., Frey, P., Howard, A., Hutchens, D., McClain, K., . . . Price, J. (2004). Mathematics: Applications and concepts, course 3. New York: Glencoe.

Ball, D. L. (1993). With an eye on the mathematical horizon: Dilemmas of teaching elementary school mathematics. The Elementary School Journal, 93, 373–397. Retrieved from: http://www.jstor.org/stable/1002018

Baroody, A. J., & Dowker, A. (2003). The development of arithmetic concepts and skills: Constructing adaptive expertise. Mahwah, NJ: Erlbaum.

Becker, J. P., & Selter, C. (1996). Elementary school practices. In A. J. Bishop, K. Clements, C. Keitel, J. Kilpatrick, & C. Laborde (Eds.), International handbook of mathematics education (pp. 511–564). Dordrecht, The Netherlands: Kluwer.

Beishuizen, M., van Putten, C. M., & van Mulken, F. (1997). Mental arithmetic and strategy use with indirect number problems up to one hundred. Learning and Instruction, 7, 87–106. doi:10.1016/S0959-4752(96)00012-6

Blote, A. W., Klein, A. S., & Beishuizen, M. (2000). Mental computation and conceptual understanding. Learning and Instruction, 10, 221–247. doi:10.1016/S0959-4752(99)00028-6

Blote, A. W., Van der Burg, E., & Klein, A. S. (2001). Students’ flexibility in solving two-digit addition and subtraction problems: Instruction effects. Journal of Educational Psychology, 93, 627–638. doi:10.1037//0022-0663.93.3.627

Brophy, J. (1999). Teaching. Education Practices Series No. 1, International Bureau of Education. Retrieved from http://www.ibe.unesco.org

Carnine, D. (1997). Instructional design in mathematics for students with learning disabilities. Journal of Learning Disabilities, 30, 130–141. doi:10.1177/002221949703000201

Carpenter, T. P., Franke, M. L., Jacobs, V. R., Fennema, E., & Empson, S. B. (1998). A longitudinal study of invention and understanding in children’s multidigit addition and subtraction. Journal for Research in Mathematics Education, 29, 3–20. doi:10.2307/749715

Clarke, T., Ayres, P., & Sweller, J. (2005). The impact of sequencing and prior knowledge on learning mathematics through spreadsheet applications. Educational Technology Research and Development, 53, 15–24. doi:10.1007/BF02504794

Dowker, A. (1992). Computational estimation strategies of professional mathematicians. Journal for Research in Mathematics Education, 23, 45–55. doi:10.2307/749163

Gentner, D., Loewenstein, J., & Thompson, L. (2003). Learning and transfer: A general role for analogical encoding. Journal of Educational Psychology, 95, 393–405. doi:10.1037/0022-0663.95.2.393

Hiebert, J., Carpenter, T. P., Fennema, E., Fuson, K. C., Human, P., Murray, H., . . . Wearne, D. (1996). Problem solving as a basis for reform in curriculum and instruction: The case of mathematics. Educational Researcher, 25, 12–21. doi:10.2307/1176776

Hiebert, J., Gallimore, R., Garnier, H., Givvin, K. B., Hollingsworth, H., Jacobs, J., . . . Kersting, N. (2003). Teaching mathematics in seven countries: Results from the TIMSS 1999 video study (No. NCES 2003–013). Washington, DC: U.S. Department of Education, National Center for Education Statistics.

Hiebert, J., & Wearne, D. (1996). Instruction, understanding, and skill in multidigit addition and subtraction. Cognition and Instruction, 14, 251–283. doi:10.1207/s1532690xci1403_1

Johnson, D. W., & Johnson, R. T. (1994). Learning together and alone: Cooperative, competitive and individualistic learning (4th ed.). Boston, MA: Allyn and Bacon.

Kenny, D. A., Kashy, D. A., & Cook, W. L. (2006). Dyadic data analysis. New York, NY: Guilford Press.

Kieran, C. (1992). The learning and teaching of school algebra. In D. Grouws (Ed.), Handbook of research on mathematics teaching and learning (pp. 390–419). New York: Simon & Schuster.

Kilpatrick, J., Swafford, J. O., & Findell, B. (Eds.). (2001). Adding it up: Helping children learn mathematics. Washington, DC: National Academy Press.

Klein, A. S., Beishuizen, M., & Treffers, A. (1998). The empty number line in Dutch second grades: Realistic versus gradual program design. Journal for Research in Mathematics Education, 29, 443–464. doi:10.2307/749861

Kultusministerkonferenz. (2004). Bildungsstandards im fach mathematik fur den primarbereich [educational standards in mathematics for primary schools]. Luchterhand: Munchen-Neuwied.

Kurtz, K., Miao, C.-H., & Gentner, D. (2001). Learning by analogical bootstrapping. The Journal of the Learning Sciences, 10, 417–446. doi:10.1207/S15327809JLS1004new_2

Lampert, M. (1990). When the problem is not the question and the solution is not the answer: Mathematical knowing and teaching. American Educational Research Journal, 27, 29–63. doi:10.3102/00028312027001029

Leikin, R. (2003). Problem-solving preferences of mathematics teachers: Focusing on symmetry. Journal of Mathematics Teacher Education, 6, 297–329. doi:10.1023/A:1026355525004

Lemaire, P., & Siegler, R. S. (1995). Four aspects of strategic change: Contributions to children’s learning of multiplication. Journal of Experimental Psychology: General, 124, 83–97. doi:10.1037//0096-3445.124.1.83

National Council of Teachers of Mathematics. (2000). Principles and standards for school mathematics. Reston, VA: NCTM.

National Mathematics Advisory Panel. (2008). Foundations of success: The final report of the national mathematics advisory panel. Washington, DC: U.S. Department of Education.

Peugh, J. L., & Enders, C. K. (2004). Missing data in educational research: A review of reporting practices and suggestions for improvement. Review of Educational Research, 74, 525–556. doi:10.3102/00346543074004525

Richland, L. E., Zur, O., & Holyoak, K. J. (2007). Cognitive supports for analogies in the mathematics classroom. Science, 316, 1128–1129. doi:10.1126/science.1142103

Rittle-Johnson, B., Siegler, R. S., & Alibali, M. W. (2001). Developing conceptual understanding and procedural skill in mathematics: An iterative process. Journal of Educational Psychology, 93, 346–362. doi:10.1037//0022-0663.93.2.346

Rittle-Johnson, B., & Star, J. R. (2007). Does comparing solution methods facilitate conceptual and procedural knowledge? An experimental study on learning to solve equations. Journal of Educational Psychology, 99, 561–574. doi:10.1037/0022-0663.99.3.561

Rittle-Johnson, B., & Star, J. R. (2009). Compared with what? The effects of different comparisons on conceptual knowledge and procedural flexibility for equation solving. Journal of Educational Psychology, 101, 529–544. doi:10.1037/a0014224

Rittle-Johnson, B., Star, J. R., & Durkin, K. (2009). The importance of prior knowledge when comparing examples: Influences on conceptual and procedural knowledge of equation solving. Journal of Educational Psychology, 101, 836–852. doi:10.1037/a0016026

Satterthwaite, F. E. (1946). An approximate distribution of estimation of variance components. Biometrics Bulletin, 2, 110–114. doi:10.2307/3002019

Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7, 147–177. doi:10.1037//1082-989X.7.2.147

Siegler, R. S. (1996). Emerging minds: The process of change in children’s thinking. New York: Oxford University Press.

Silver, E. A., Ghousseini, H., Gosen, D., Charalambous, C., & Strawhun, B. (2005). Moving from rhetoric to praxis: Issues faced by teachers in having students consider multiple solutions for problems in the mathematics classroom. Journal of Mathematical Behavior, 24, 287–301. doi:10.1016/j.jmathb.2005.09.009

Singapore Ministry of Education. (2006). Secondary mathematics syllabuses.

Star, J. R. (2005). Reconceptualizing procedural knowledge. Journal for Research in Mathematics Education, 36, 404–411. Retrieved from: http://www.jstor.org/stable/30034943

Star, J. R., & Newton, K. J. (2009). The nature and development of expert’s strategy flexibility for solving equations. ZDM Mathematics Education, 41, 557–567. doi:10.1007/s11858-009-0185-5

Star, J. R., & Rittle-Johnson, B. (2008). Flexibility in problem solving: The case of equation solving. Learning and Instruction, 18, 565–579. doi:10.1016/j.learninstruc.2007.09.018

Star, J. R., & Rittle-Johnson, B. (2009). It pays to compare: An experimental study on computational estimation. Journal of Experimental Child Psychology, 101, 408–426. doi:10.1016/j.jecp.2008.11.004

Sweller, J., van Merrienboer, J. J. G., & Paas, F. G. W. C. (1998). Cognitive architecture and instructional design. Educational Psychology Review, 10, 251–296. doi:10.1023/B:TRUC.0000021808.72598.4d

Torbeyns, J., Ghesquiere, P., & Verschaffel, L. (2009). Efficiency and flexibility of indirect addition in the domain of multi-digit subtraction. Learning and Instruction, 19, 1–12. doi:10.1016/j.learninstruc.2007.12.002

Treffers, A. (1991). Didactical background of a mathematics program for primary education. In L. Streefland (Ed.), Realistic mathematics education in primary school (pp. 21–56). Utrecht, The Netherlands: Freudenthal Institute.

Verschaffel, L., Luwel, K., Torbeyns, J., & Van Dooren, W. (2009). Conceptualizing, investigating, and enhancing adaptive expertise in elementary mathematics education. European Journal of Psychology of Education, 24, 335–359. doi:10.1007/BF03174765