
Learning and Individual Differences 32 (2014) 40–53


Development of a new reading comprehension assessment: Identifying comprehension differences among readers☆

Sarah E. Carlson a,⁎, Ben Seipel b, Kristen McMaster c

a University of Oregon, Center on Teaching and Learning, United States
b California State University, Chico, United States
c University of Minnesota, Twin Cities, United States

☆ This research was supported by Grant #R305C050059 from the Institute of Education Sciences (IES), U.S. Department of Education, to the University of Minnesota and through the Interdisciplinary Education Sciences Predoctoral Training Program, "Minnesota Interdisciplinary Training in Education Sciences (MITER)," for data collection and resources, as well as by Grant #R305b110012 from the IES, U.S. Department of Education, to the Center on Teaching and Learning at the University of Oregon, through a Postdoctoral Fellowship for writing resources. The opinions expressed are those of the authors and do not necessarily represent views of the IES or the U.S. Department of Education.

⁎ Corresponding author at: University of Oregon, 1600 Millrace Drive, Suite 207, Eugene, OR 97403, United States. Tel.: +1 541 346 8363; fax: +1 541 346 8353. E-mail address: [email protected] (S.E. Carlson).

http://dx.doi.org/10.1016/j.lindif.2014.03.003
1041-6080/© 2014 Elsevier Inc. All rights reserved.

Article info

Article history:
Received 28 September 2012
Received in revised form 9 January 2014
Accepted 17 March 2014
Available online xxxx

Keywords:
Reading comprehension assessment
Individual differences
Reading comprehension processes

Abstract

The purpose of this study was to evaluate the Multiple-choice Online Cloze Comprehension Assessment (MOCCA), designed to identify individual differences in reading comprehension. Data were collected with two sets of 3rd through 5th grade students during two years: 92 students participated in Year 1 and 98 students participated in Year 2 to address primary research questions, and an additional 94 (N = 192) students participated in Year 2 to address the limitation of test administration time. Participants were group administered the MOCCA and a standardized reading proficiency assessment, and individually administered other reading measures. Preliminary analyses indicated that the MOCCA produced reliable and valid scores as a new reading comprehension assessment for identifying types of comprehension processes used during reading, as well as for identifying individual differences in the types of comprehension processes used during reading. Findings are discussed in terms of developing a new measure to identify cognitive reading comprehension processes used during reading. Future research is needed to provide additional support for the technical adequacy of the assessment.

© 2014 Elsevier Inc. All rights reserved.

1. Introduction

Many students struggle with reading, and in particular, reading comprehension. As students advance in school, they transition from learning to read (e.g., learning to decode and developing fluency and comprehension skills) to reading to learn (e.g., using comprehension skills to learn from text; Chall, 1996). This transition is often most evident in the upper elementary grades, when many readers begin to encounter difficulties with new comprehension requirements (Shanahan & Shanahan, 2008, 2012).

Assessments are needed to determine why readers experience comprehension difficulties in order to develop appropriate instruction to meet their individual needs, yet few such assessments are available. Thus, the purpose of this study was to report preliminary findings from a new reading comprehension assessment, the Multiple-choice Online Cloze Comprehension Assessment (MOCCA), developed to identify individual differences in reading comprehension. In this paper, we first discuss theories of reading comprehension that guided the development of MOCCA. Second, we describe existing reading comprehension assessments used to measure specific aspects of comprehension, and how they have informed the development of MOCCA. Finally, we report initial evidence of the reliability and validity of MOCCA, and discuss how the present study extends the reading comprehension assessment literature.

1.1. Reading comprehension theories and assessments

Reading comprehension is a complex and multidimensional construct; thus, the development of reading comprehension assessments should be guided by theory (August, Francis, Hsu, & Snow, 2006; Fletcher, 2006). Reading comprehension theories help identify constructs that work during the process of comprehension and specify the relationships among them, so that researchers can better operationalize the dimensions to be assessed.

Reading comprehension theories suggest that successful reading comprehension involves the extent to which a reader can develop a coherent mental representation of a text through developing a coherent situation model (e.g., Graesser, Singer, & Trabasso, 1994; Kintsch, 1998; McNamara, Kintsch, Songer, & Kintsch, 1996; van den Broek, Rapp, & Kendeou, 2005). A situation model is comprised of the situations that take place in a text (e.g., time, space, characters, and causality) (van Dijk & Kintsch, 1983; Zwaan, Magliano, & Graesser, 1995). For instance, a reader may track causality by keeping track of the goal of the text (Trabasso & van den Broek, 1985; van den Broek, Lynch, Naslund, Ievers-Landis, & Verduin, 2003). The following example describes a causal connection: "Jimmy wanted to buy a bike. He got a job and earned enough money. He went to the store to buy the bike. Jimmy was happy." In this example, a reader could make a causal connection by generating an inference that Jimmy was happy because he reached his goal and bought a bike.

Researchers have found that many poor comprehenders (i.e., readers with adequate word reading skills but with poor comprehension skills compared to peers with similar word reading skills) fail to make causal inferences while reading, which may stem from failure to track causal relations and goals in a text (e.g., Cain & Oakhill, 1999, 2006; McMaster et al., 2012; Rapp, van den Broek, McMaster, Kendeou, & Espin, 2007; van den Broek, 1997). To provide appropriate instruction to improve such inference generation, it is important that reading comprehension assessments identify the specific processes with which poor comprehenders struggle.

Researchers have assessed reading comprehension processes to understand how readers build connections (i.e., inferences) and track relations during reading to develop a coherent representation of a text, and have assessed reading comprehension products to evaluate the result of the representation of the text. The products are the 'end result' of reading, or what the reader learned or stored in memory from the text after reading (i.e., offline). Reading products are typically assessed using recall, questioning activities, and traditional multiple-choice assessments.

In contrast, reading processes occur during the act of reading (i.e., online) and can be somewhat more difficult to assess because the examiner must infer what is taking place during reading. Methods to assess online reading comprehension processes include eye-tracking methods, reading time measures, and think-aloud tasks (e.g., Ericsson & Simon, 1993; Kaakinen, Hyönä, & Keenan, 2003; Linderholm, Cong, & Zhao, 2008). Think-aloud tasks, for example, are used to identify specific reading comprehension processes (e.g., causal, bridging, and elaborative inferences; paraphrases) that readers use during reading (Ericsson & Simon, 1993). Findings from think-aloud studies indicate that readers use different types of comprehension processes during reading to develop coherent situation models (e.g., Laing & Kamhi, 2002; Trabasso & Magliano, 1996a,b; van den Broek, Lorch, Linderholm, & Gustafson, 2001). Although think-aloud data provide fruitful information about the processes that readers use during comprehension, they are laborious, time consuming, and impractical for practitioners to use to identify reading comprehension differences among their students for instructional purposes.

1.2. Identifying comprehension differences

Researchers who have assessed reading comprehension processes using think-aloud methods have identified individual processing differences among readers at different levels of comprehension skill (McMaster et al., 2012; Rapp et al., 2007). Specifically, McMaster et al. (2012) administered a think-aloud task to fourth grade readers at different levels of comprehension skill (i.e., good, average, and poor). They identified two types of poor comprehenders: (1) paraphrasers: poor comprehenders who mostly paraphrased during reading; and (2) elaborators: poor comprehenders who elaborated about the text, including information that was connected to background knowledge that was not always relevant to the text. These findings were consistent with previous research that found similar types of poor comprehenders, and support other researchers' conclusions that poor comprehenders may struggle with reading in different ways (Cain & Oakhill, 2006; Nation, Clarke, & Snowling, 2002; Perfetti, 2007; Rapp et al., 2007).

McMaster et al. (2012) also found that the two types of poor comprehenders responded to intervention in different ways. Specifically, they compared two questioning interventions: one that prompted readers to answer causal questions (Why questions that prompted readers to make causal connections during reading), and one that prompted readers to answer general questions (questions that prompted readers to make any kind of connections during reading). The researchers found that paraphrasers benefited more from the general questioning intervention than elaborators did, whereas elaborators benefited more from the causal questioning intervention than paraphrasers did. These findings suggest that different types of poor comprehenders may respond differently to intervention.

Though researchers have employed methods to assess reading comprehension processing differences among readers (e.g., think-aloud tasks), most traditional school-based reading comprehension assessments (e.g., reading proficiency assessments, standardized measures) have not been designed to detect such processes or to identify individual comprehension differences. In addition, many of these methods assess the product of reading comprehension rather than the process, limiting the types of conclusions that can be drawn about how readers comprehend differently. For example, Keenan, Betjemann, and Olson (2008) found that commonly used standardized reading comprehension assessments measure aspects of reading such as decoding and word recognition, but not necessarily reading comprehension, and what is measured varies depending on the age of the reader. Thus, such traditional assessments may be insufficient for identifying specific reading comprehension differences; yet, educators often make instructional decisions based on their outcomes (Keenan et al., 2008).

Researchers have begun to develop other methods to help address the shortcomings of traditional reading assessments and measure how readers comprehend text rather than only assessing the product of comprehension. For instance, Magliano and colleagues developed the Reading Strategy Assessment Tool (RSAT; Magliano, Millis, RSAT Development Team, Levinstein, & Boonthum, 2011), which measures a subset of the comprehension processes found to lead to a coherent representation of a text. RSAT is an automated computer-based assessment in which readers read texts one sentence at a time, and are asked either indirect questions (i.e., "What are your thoughts regarding your understanding of the sentence in the context of the passage?") or direct questions (i.e., Why questions related to a target sentence). Readers type their responses, which are later analyzed for types of comprehension processes (e.g., paraphrases, bridging inferences, elaborations) and content words (e.g., nouns, verbs, adjectives, adverbs) used during reading.

Magliano et al. (2011) identified unique types of comprehension processes that readers used during reading using RSAT, and also found that RSAT predicted scores on measures of reading comprehension. However, the measure is limited in several ways. First, RSAT uses an open-ended response format in which participants type their responses to questions, limiting its use to older participants who have developed appropriate typing skills. Second, the linguistic algorithms used to identify the types of comprehension processes produced in responses may be limited in capturing the quality of responses and identifying individual profiles of readers. Finally, like think alouds, the open-ended response task used in RSAT can produce a large amount of variability in how readers interpret the task instructions, especially the instructions for answering the indirect question, which could be interpreted differently from reader to reader. Thus, it seems useful to develop an assessment that capitalizes on the strengths of RSAT (e.g., identifying comprehension processes during reading), but is also familiar to readers in terms of testing format, efficient for educators to administer and score, and can be used for making instructional decisions with children.

Other recently developed assessments, such as the Diagnostic Assessment of Reading Comprehension (DARC; August et al., 2006) and The Bridging Inferences Test, Picture Version (Bridge-IT, Picture Version; Pike, Barnes, & Barron, 2010), measure individual differences in reading comprehension processes for readers in Grades 2–6. The DARC requires readers to remember newly read text, connect to and integrate relevant background knowledge, and generate bridging inferences (August et al., 2006). Despite its usefulness for identifying certain types of comprehension processes, the DARC uses unfamiliar pseudo-word relational statements embedded in texts. Readers are only asked to judge if such statements are true or false, and the assessment does not identify whether readers build a coherent representation of a text. The Bridge-IT, Picture Version also assesses children's ability to generate bridging inferences during reading, as well as the ability to suppress irrelevant text information (Pike et al., 2010). In addition, this assessment involves a task in which readers choose the last sentence of a narrative text, and each text is accompanied by a related picture, an inconsistent picture, or no picture. Similar to the DARC, the Bridge-IT, Picture Version is limited in its utility for distinguishing between different comprehension processes used to develop a coherent representation of a text, and for identifying individual comprehension differences.

In sum, researchers have developed assessments that target identifying comprehension processes; however, few reading comprehension assessments are available for educators and practitioners to use to easily assess differences in readers' comprehension processes among children at various levels of comprehension skill. Limitations of previously developed assessments provide a rationale for developing new assessments that address the needs of readers who struggle with reading comprehension in different ways. Furthermore, developing reading comprehension assessments that focus on efficiently identifying the specific reading comprehension processes used to develop a coherent representation of a text may be useful for identifying different types of comprehenders for the purposes of instruction.

1.3. Designing a reading comprehension assessment

In addition to variation in purpose and utility for educational decision making, reading comprehension assessments vary across many dimensions, including response format (e.g., cloze, multiple-choice, open-ended), presentation format (e.g., paper–pencil and computer-based), and the components of reading comprehension measured (e.g., literal comprehension, inferential processes, main idea identification) (Eason & Cutting, 2009; Keenan et al., 2008). Each dimension presents a challenge for assessment development.

In designing an assessment, the developer must make decisions about each dimension, which requires careful consideration of the benefits and drawbacks of the options under each dimension. For instance, multiple-choice tests are efficient for administering in group settings and are familiar to readers; however, traditional multiple-choice tests require readers to choose only one correct choice, and the alternative choices are mainly distracters without diagnostic meaning (Cutting & Scarborough, 2006). Additionally, multiple-choice questions are traditionally presented after an entire text, thus measuring the product of comprehension rather than the processes used to build a coherent representation of the text. Open-ended questions allow readers to demonstrate comprehension processes used to build a coherent text representation; however, open-ended assessments, like think alouds, are time consuming and difficult to score (e.g., Magliano et al., 2011). Modified cloze tests, such as the maze task in which every nth word is deleted and replaced with three options for the reader to select, are efficient to administer and score, and have been demonstrated to provide a general indicator of reading proficiency (Deno, 1985; Espin & Foegen, 1996; Fuchs & Fuchs, 1992; Wayman, Wallace, Wiley, Ticha, & Espin, 2007). In addition, maze tasks are often timed, which does not allow the reader to build a complete and coherent mental representation of the text. In fact, researchers have provided evidence that such approaches assess decoding or sentence-level comprehension, rather than discourse-level comprehension (Francis, Fletcher, Catts, & Tomblin, 2005; Keenan et al., 2008; Nation & Snowling, 1997). Further, maze tasks were designed primarily for progress monitoring in reading rather than for assessing processes that take place during reading, and are thus limited in their diagnostic utility for comprehension (Wayman et al., 2007).

In the present study, we developed and evaluated an assessment to measure comprehension processes that readers use during reading (i.e., online), capitalizing on the benefits of existing measures (e.g., efficient and familiar presentation formats), but also addressing the shortcomings of existing measures (i.e., identifying specific online reading comprehension processes and individual processing differences used to develop a coherent representation of a text). The resulting tool is the Multiple-choice Online Cloze Comprehension Assessment (MOCCA). MOCCA is a paper and pencil assessment that consists of short narrative texts (seven sentences long). For each text, the sixth sentence is deleted and readers are required to choose among four multiple-choice responses to complete the sixth sentence of the text. The best response requires the reader to make a causal inference that results in a coherent representation of the text. Unlike traditional multiple-choice assessments, MOCCA was designed with alternate response types that represent specific reading comprehension processes used during reading (i.e., causal inferences, paraphrases, local bridging inferences, and lateral connections). Fig. 1 provides an item from MOCCA, with each response type labeled for the comprehension process it identifies. Instructions and additional items from MOCCA can be found in Appendix A.

1.4. Study purpose and research questions

The purposes of this study were to evaluate the initial technical adequacy of MOCCA and examine its capacity to identify differences in reading comprehension processes among readers. Included in this examination was whether MOCCA can be used to identify subtypes of poor comprehenders similar to those identified in previous research using think-aloud approaches (McMaster et al., 2012; Rapp et al., 2007). Our research questions included: (1) Does MOCCA produce scores that are reliable (internally consistent) depending on the amount of time provided during test administration (i.e., timed vs. untimed) and depending on the difficulty and discrimination levels of the items? (2) Does MOCCA produce scores that are valid (in terms of criterion validity)? and (3) To what extent does MOCCA distinguish among the comprehension processes of good, average, and poor comprehenders, including subtypes of poor comprehenders, during reading, depending on the amount of time provided during test administration?

2. Methods

2.1. Participants

To address our research questions, data were collected across two years. Specifically, 92 third, fourth, and fifth grade students in Year 1 and 98 third, fourth, and fifth grade students in Year 2 completed the MOCCA (timed version) and a full battery of additional reading-related assessments (as described under Measures). In Year 2, an additional 94 third, fourth, and fifth grade students, along with the other 98 students (N = 192), were provided additional testing time (untimed version) to complete as many items as possible. This additional time was provided to address limitations from initial findings, which indicated that many MOCCA items were incomplete during Year 1 because insufficient time was provided to complete all MOCCA items. Thus, additional time was provided to conduct more accurate item analyses. We also used the additional time from the participants in Year 2 who were also administered the additional reading measures (N = 98) to address whether adding time would provide more accurate validity and comprehension processing information among the comprehension groups who took the MOCCA. Participant demographic information is presented in Table 1.

Participants (N = 92 in Year 1; N = 98 in Year 2) were also screened and divided into good, average, and poor comprehender groups using percentile scores from three measures: (1) the Computerized Achievement Levels Tests (CALT; Northwest Evaluation Association, 2001); (2) Dynamic Indicators of Basic Early Literacy Skills (DIBELS), 6th Ed. Oral Reading Fluency (ORF) (Good & Kaminski, 2002); and (3) the Curriculum Based Measurement (CBM) Maze Task (Deno, 1985; Espin & Foegen, 1996; Fuchs & Fuchs, 1992). Specifically, we used percentile ranges from the CALT to determine the groups, and DIBELS ORF and CBM Maze scores to corroborate the CALT. Poor comprehenders (Year 1 n = 25; Year 2 n = 24) were at the 25th percentile; average comprehenders (Year 1 n = 22; Year 2 n = 26) were at the 50th percentile; and good comprehenders (Year 1 n = 45; Year 2 n = 48) were at the 75th percentile on the CALT.

Fig. 1. Sample MOCCA item.


2.2. Measures and materials

2.2.1. Screening measures

The CALT (Northwest Evaluation Association, 2001) is a standardized computer adaptive reading comprehension assessment that is administered to students one to two times each year. The CALT measures literal, inferential, and vocabulary components of reading comprehension. The CALT has a reported reliability range of r = .76 to .87 (Northwest Evaluation Association, 2001). Percentile scores were provided by the school district.

DIBELS ORF (Good & Kaminski, 2002) is an individually administered assessment of students' accurate and fluent reading of connected text. ORF is typically used to identify readers who may need additional instructional support, and to monitor progress in oral reading fluency. ORF consists of a set of standardized passages that participants read aloud for 1 min each. Words omitted, substituted, or hesitated upon by the reader for longer than 3 s are counted as errors. The score is the number of words read correctly in 1 min. Reliability coefficients for ORF have been reported as r = .65 to .98 (Good & Kaminski, 2002). Participants who read below 75 words per min were not included in the current study, to ensure that reading comprehension problems did not stem from fluency difficulties. ORF scores were provided by the school district.

Table 1
Participant demographic information.

Variable             Year 1 (Primary sample;    Year 2 (Primary sample;    Year 2 (Exploratory sample;
                     N = 92)        %           N = 98)        %           N = 192)        %
Grade
  Third              8              8.7         5              5.1         13              6.8
  Fourth             40             43.5        41             41.8        95              49.5
  Fifth              44             47.8        52             53.1        84              43.8
Sex
  Female             65             70.6        66             67.3        111             57.8
  Male               27             29.4        32             32.7        81              42.2
Disability type
  Autism             1              1.1         1              1.0         4               2.0
  EBD                1              1.1         3              3.0         3               1.6
  OHD                0              0.0         1              1.0         3               1.6
  SLD                1              1.1         1              1.0         8               4.2
  SLI                5              5.4         5              5.1         6               3.1
  None               84             91.3        87             88.9        168             87.5
ELL                  19             20.7        14             14.3        34              17.7
Free/reduced lunch   45             48.9        39             39.8        78              40.6
Race/ethnicity
  Missing            0              0.0         0              0.0         1               0.6
  Native American    1              1.1         0              0.0         0               0.0
  Asian              11             12.0        17             17.3        25              13.0
  Black              21             22.8        13             13.3        28              14.6
  Hispanic           16             17.4        12             12.1        31              16.1
  White              43             46.7        56             56.6        107             55.7

Note. EBD = emotional/behavioral disorders, OHD = other health disabilities, SLD = specific learning disabilities, SLI = speech/language impairment, ELL = English language learners.

The CBM Maze task is a modified cloze task used to index overall reading proficiency (Deno, 1985; Espin & Foegen, 1996; Fuchs & Fuchs, 1992). The CBM Maze task is group administered. For each text, every seventh word is deleted and replaced with three options, only one of which makes syntactic and semantic sense. The reader is to select the word that best fits the sentence, and the score is the number of correct word selections. In this study, participants read three texts used in previous research (e.g., McMaster et al., 2012). Participants were given 1 min for each text. The total number of correct words selected in 1 min was summed for each participant, and scores from the three passages were averaged. The CBM Maze has a reported reliability range from r = .60 to .80 for elementary school participants (Wayman et al., 2007).

2.2.2. Measure under development

MOCCA was developed as a paper and pencil, multiple-choice, online (i.e., during reading), cloze comprehension assessment designed to identify comprehension processes used during the reading of narrative texts. Narrative text items were written so that the causal structure of each text (plot, nature of events, tense of language) was centered on a main goal that motivated subgoals and events in the text (e.g., Trabasso & van den Broek, 1985). Instead of deleting every nth word, as in traditional cloze tasks (e.g., CBM Maze), one line of each text was deleted. The deleted line was always the sixth sentence and occurred before the last line of the text. Participants were required to choose one of four response types to complete the deleted line of each text. Response types were developed for each short narrative based on previous findings regarding the types of cognitive processes produced by different comprehenders (good, average, and two types of poor comprehenders; McMaster et al., 2012; Rapp et al., 2007) during a think-aloud task. First, typical cognitive processes produced during think-aloud tasks were identified. Of these processes, the most frequently produced types were bridging inferences, elaborative inferences, associations, and paraphrases or text repetitions. The following response types were then developed: (1) a causally coherent inference best completes the missing sentence with a causally coherent statement (goal completion statement; e.g., Trabasso & van den Broek, 1985), thus closing the causal chain; (2) a local bridging inference is a semantically correct lure that connects to information in the sentence immediately preceding the missing sentence and is related to the goal of the story, but does not close the causal chain; (3) a lateral connection is a lure with semantic overlap with the content in the narrative (e.g., the action), but is unrelated to relevant text information; this response type combined the associations, elaborations, and self-explanations found during think-aloud studies, which involve readers accessing background knowledge but do not always close the causal chain of the text; and (4) a paraphrase is a lure designed to paraphrase or repeat the earlier goal statement, or a combination of the goal and subgoal statements presented in the text.

MOCCA includes 40 narrative text items, each with four multiple-choice responses. The 40 items are divided into two separate sections of 20 items each, to provide a break for participants. Participants answer MOCCA items during two 10 min testing periods. The average Flesch–Kincaid (Kincaid, Fishburne, Rogers, & Chissom, 1975) grade level for the items (i.e., texts and responses) is 4.4. Each item has a title, is seven sentences long, and averages 80.5 words. MOCCA is coded and scored for the number of times the reader selects each response type. Specifically, each response type (causally coherent inference, local bridging inference, lateral connection, and paraphrase) is coded with a 1 or a 0 to indicate whether or not the response type was chosen. Total scores for each response type are then calculated.
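For readers who want the scoring rule in computational form, the following is a minimal Python sketch of the response-type totals described above. The type labels and data layout are our own illustrative assumptions, not part of the published instrument.

```python
# Minimal sketch of MOCCA response-type scoring; labels are hypothetical.
from collections import Counter

RESPONSE_TYPES = (
    "causally_coherent_inference",  # the 'correct' response type
    "local_bridging_inference",
    "lateral_connection",
    "paraphrase",
)

def score_mocca(chosen_types):
    """chosen_types: one response-type label per answered item.

    Returns one total per response type, i.e., the number of times
    the reader selected each type across the items attempted.
    """
    totals = Counter(chosen_types)
    return {rt: totals.get(rt, 0) for rt in RESPONSE_TYPES}

# Example: a reader who answered five items
print(score_mocca([
    "causally_coherent_inference",
    "paraphrase",
    "causally_coherent_inference",
    "lateral_connection",
    "causally_coherent_inference",
]))
# {'causally_coherent_inference': 3, 'local_bridging_inference': 0,
#  'lateral_connection': 1, 'paraphrase': 1}
```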

2.2.3. Additional reading measures

Participants completed additional measures to examine other aspects of reading (decoding, comprehension processes, and working memory) during individual testing sessions. Scores from these measures were used to establish the relation between MOCCA response types and other reading-related measures.

To measure word reading or decoding, participants were administered the Woodcock Johnson (WJ) III Word Identification (ID) and Word Attack subtests (Woodcock, McGrew, & Mather, 2001). During these subtests, participants read lists of increasingly difficult real and nonsense words aloud. Words read correctly are scored as 1, and words read incorrectly are scored as 0. The ceiling for both the WJ ID and Word Attack is six consecutive scores of 0. The score for each subtest is the total number of words read correctly. Both tasks take approximately 5 min to complete. The reported reliabilities for the WJ III subtests range from r = .80 to .90 and higher, and the subtests have been normed from Kindergarten through Grade 12 (Woodcock et al., 2001).

To measure comprehension processes used during reading, participants were administered a think-aloud task. This task involves asking the reader to read a text out loud one sentence at a time. After each sentence, the reader is asked to say whatever comes to mind, even if it seems obvious, because there is no right or wrong answer. After reading the entire text, the reader is asked two yes/no questions about the story. Participants read two narrative texts for the think-aloud task administered in the current study. For each text, each sentence was printed on a separate index card. This task was first demonstrated by the experimenter with a practice text, and then any questions were answered before moving on to the experimental texts. Each text (practice and experimental) was 15–21 sentences long, with a Flesch–Kincaid (Kincaid et al., 1975) reading level of 4.0. Each story during the think-aloud task took approximately 10 min to complete. Scoring for the think-aloud task was conducted using a coding protocol and is described in more detail in the Think-aloud coding section. The texts used for the think-aloud task are presented in Appendix B.

To measure working memory, participants were administered a sentence span task, which measures both the storage and processing components of working memory. In this task, participants listen to unrelated sets of sentences and are asked to remember different aspects of each (Daneman & Carpenter, 1980). The sentence span task used in the current study was modified for American children by Swanson, Cochran, and Ewers (1989). First, participants listen to sets of sentences (ranging from 2 to 5 sentences) read by an experimenter. Second, participants answer a comprehension question about one of the sentences. Third, participants are asked to recall the last word from each sentence in order from the first to the last sentence. All components (recall of the last words and the question answer) must be answered correctly to move on to the next difficulty level. The task materials consist of 6 practice sentences and 28 test sentences. Each sentence is between six and 10 words long. All of the final words and answers to the comprehension questions are nouns, and none of the words are repeated in any of the sets. Words recalled and responses to the questions are answered out loud and scored for accuracy. The task lasts approximately 10 min.

2.3. Procedures

In both Years 1 and 2, there were two testing phases. In Phase 1, participants completed two group-administered assessments (CBM Maze and MOCCA) during scheduled class periods. In Phase 2, participants were individually administered additional assessments (WJ Word ID and Word Attack, think aloud, and sentence span) in designated areas of the school during regular school hours or during an after-school program. Participating teachers helped arrange individual testing schedules based on availability. Students received a $5.00 gift card from Target for participating in this study, and teachers were compensated for their time with gift cards from Amazon.com to supplement their classroom libraries.

2.3.1. Group testing

Trained project staff administered the CBM Maze and MOCCA. Staff consisted of two doctoral candidates in Educational Psychology and 11 undergraduate Psychology majors, all of whom had previous experience working with children in school settings. Staff members were trained to administer and score the CBM Maze and MOCCA during three 1-hour sessions. Project staff practiced administering and scoring each assessment with each other and project supervisors until each was correct and consistent.

In Years 1 and 2, project staff administered both assessments during one session that lasted approximately 45–60 min. First, they administered the CBM Maze by reading the directions and providing two sample items. Participants were told they would be reading three narrative texts for 1 min each, and whenever they encountered three words in parentheses within a sentence, they would need to circle the word that belonged in the sentence, even if they were unsure of the answer. All questions about this task were answered when practicing with the sample items.

Second, project staff administered the MOCCA by reading the directions, providing two sample items, and answering any questions. Participants were told they would be reading short narrative stories and that each story was missing a sentence. The participants were instructed to read each story silently and choose the one of four choices below each story that best completed the story. Participants were given 10 min to complete up to 20 items in the first section of the assessment. Then, participants were asked to stop, take a short break, and then turn to the second section and complete up to 20 more items for another 10 min. The MOCCA took approximately 30 min to administer.

During Year 2 only, participants were given additional time to complete all MOCCA items. This time was added in Year 2 because findings from Year 1 indicated that few participants completed all MOCCA items in the allotted time limit. Thus, after completing the initial timed version of the MOCCA (used to replicate findings from Year 1), Year 2 participants were instructed to return to unfinished items in both sections and complete as many items as possible. The untimed version lasted approximately 20 min. These data were used for item analyses, including item discrimination and difficulty.

2.3.2. Individual testing

Project staff also administered and scored all individually administered assessments. Staff members were trained to administer and score each assessment with each other and the project supervisors until each was correct and reliable. Project staff individually administered assessments during one session that lasted approximately 30 min. Each session took place in designated locations in the participants' schools, arranged before the testing day. Individual testing was audio recorded for future data coding, scoring, and entering.

2.4. Think-aloud coding

Trained project staff transcribed the think-aloud sessions for coding. Think-aloud protocols were scored for the types of comprehension processes used during reading. First, think-aloud responses were parsed into idea units (generally, phrases including a subject and verb). Second, each idea unit was coded using a scheme adapted from previous research to examine the types of comprehension processes used during reading to build a coherent representation of a text (e.g., McMaster et al., 2012; Rapp et al., 2007; van den Broek et al., 2001). Coded variables included: associations (concepts from background knowledge brought to mind by the text); evaluative comments (opinions about the text); bridging inferences (connecting contents of the current sentence with local/near or global/distant text information); valid elaborative inferences (explanations about the contents of the current sentence using background knowledge relevant to the text); invalid elaborative inferences (explanations about the contents of the current sentence using background knowledge irrelevant to the text); valid predictive inferences (anticipations of what will occur next in the text that are connected to relevant text information); invalid predictive inferences (anticipations of what will occur next in the text that are not connected to relevant text information); metacognitive responses (reflections of understanding or agreement with the text); paraphrases/text repetitions (putting the current sentence or part of the current sentence into his/her own words, or restating the text verbatim); and affective responses (emotions related to contents of the text).

For Year 1, there were eight independent judges, and for Year 2, twelve independent judges who coded the think-aloud responses. During both years, judges were paired into dyads to assess interrater agreement of the think-aloud coding. Interrater agreement was calculated using a randomly selected 20% of the transcripts. There was an average of 90% agreement for Year 1 and an average of 93% agreement for Year 2. Disagreements between judges were resolved by discussion.
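A percent-agreement check of this kind is straightforward to script. The sketch below, with hypothetical function and variable names, shows one way to draw a 20% reliability sample and compute agreement between two judges; it illustrates the idea rather than reproducing the authors' actual procedure.

```python
import random

def percent_agreement(codes_a, codes_b):
    """Percent agreement between two judges' idea-unit codes."""
    assert len(codes_a) == len(codes_b)
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)

def reliability_sample(transcripts, proportion=0.20, seed=1):
    """Randomly select a proportion of transcripts for double-coding."""
    k = max(1, round(proportion * len(transcripts)))
    return random.Random(seed).sample(transcripts, k)

# Example: two judges agree on 9 of 10 idea units
judge_a = ["paraphrase", "association", "bridging"] * 3 + ["evaluative"]
judge_b = judge_a[:-1] + ["affective"]
print(percent_agreement(judge_a, judge_b))  # 90.0
```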

3. Results

Data were analyzed separately for participants in Years 1 and 2 to address each of our research questions. First, separate analyses were conducted to assess the internal consistency of the MOCCA (both timed and untimed data) as well as the difficulty and discrimination levels of the items. Second, we assessed the criterion validity of the MOCCA using the Year 1 (timed) and Year 2 (timed and untimed) datasets. Third, separate analyses were conducted to identify different types of comprehenders in Years 1 and 2, to determine whether Year 1 results replicated in Year 2, and to determine whether providing additional time to complete more MOCCA items better distinguishes between the types of comprehenders, including subgroups of poor comprehenders.

3.1. Reliability and validity of the MOCCA

3.1.1. Internal consistency

Cronbach's alpha was used to determine the internal consistency of each of the MOCCA response types (i.e., causally coherent inference, paraphrase, local bridging inference, lateral connection). Specifically, we were interested not only in determining the internal consistency of the correct response type (i.e., causally coherent inference), but also in determining whether responses for each item cohere with the overall response for each type. Thus, Cronbach's alpha tests were used to measure the reliability of each response type. Cronbach's alpha ranges between 0 and 1 (Cronbach, 1951), and can be interpreted as follows: excellent = α ≥ .90; good = .80 ≤ α < .90; acceptable = .70 ≤ α < .80; questionable = .60 ≤ α < .70; poor = .50 ≤ α < .60; and unacceptable = α < .50 (Streiner, 2003).

Across years and timed vs. untimed test administrations, we found the same pattern: the causally coherent inference and paraphrase response type reliabilities fell within the excellent to good ranges of internal consistency, whereas the local bridging inference and lateral connection response type reliabilities did not.

To determine whether the low alphas for the local bridging inference and lateral connection response types were due to a lack of internal consistency, or were instead due to lower variances because these response types were chosen less often than the causally coherent inference and paraphrase response types, additional analyses were conducted. Specifically, correlation analyses were conducted between the mean of each response type chosen by the participants (i.e., the average proportion of the total items chosen for a particular response type) and participants' response type total score means (i.e., the average proportion of the total responses chosen for a particular response type). Analyses were conducted for both Years 1 and 2 (timed) and Year 2 (untimed). Coefficient alphas and the proportions of total response type means and variances are provided in Table 2.

The average proportion of each response type total was highly correlated with the corresponding average participant response type total score. For instance, the average proportion of when the causally coherent inference was chosen was highly correlated with the total score for the causally coherent inference (Year 1 timed r = .42; Year 2 timed r = .42; Year 2 untimed r = .52; ps < .01). In addition, the patterns of correlations for Year 1 (timed) showed similar patterns as the coefficient alphas (i.e., the correlations for the causally coherent inference > paraphrase > local bridging inference > lateral connection). In Year 2 (timed and untimed), the correlations for the lateral connection were slightly higher than the correlations for the local bridging inference. Correlation coefficients are provided in Table 3.

Finally, as shown in Table 2, the variances for the average participant total score and the average proportion of the total score across the response types showed similar patterns as found between the coefficient alphas, as well as the correlations. That is, the variance of the causally coherent inference total score was greater than any of the other three total scores; the variance for the paraphrase total score was greater than the variance for the local bridging inference and lateral connection response types; and so on. Thus, the initial coefficient alpha patterns appear to be due to increasing restriction of range on the total scores rather than a lack of internal consistency.

Table 2
Coefficient alphas, means and variances for the MOCCA response types across years and administration times.

Year and administration time                 Causally coherent inference   Paraphrase   Local bridging inference   Lateral connection

Year 1 (timed)
  Coefficient α                              .92                           .85          .63                        .62
  Proportion of total response type M        .52                           .08          .06                        .03
  Proportion of total response type σ²       .25                           .07          .05                        .03
  Participant response type total score M    20.79                         3.36         2.35                       1.35
  Participant response type total score σ²   71.57                         16.65        5.48                       3.13

Year 2 (timed)
  Coefficient α                              .93                           .86          .32                        .43
  Proportion of total response type M        .50                           .07          .06                        .03
  Proportion of total response type σ²       .25                           .06          .05                        .03
  Participant response type total score M    20.02                         2.71         2.20                       1.05
  Participant response type total score σ²   71.91                         14.14        2.79                       1.69

Year 2 (untimed)
  Coefficient α                              .94                           .87          .65                        .68
  Proportion of total response type M        .61                           .13          .10                        .05
  Proportion of total response type σ²       .24                           .15          .09                        .05
  Participant response type total score M    24.22                         5.27         4.13                       2.14
  Participant response type total score σ²   103.02                        29.16        9.55                       5.90


3.1.2. Item discrimination and difficulty levels

To explore discrimination and difficulty levels for the MOCCA items (N = 40), we used MOCCA data from the Year 2 (untimed) dataset, because these participants (N = 192) were allowed additional time to complete more of the MOCCA items and thus provide more accurate item information, whereas in the timed administrations of MOCCA (Years 1 and 2) few participants completed all of the items. Item discrimination and difficulty data are listed in Table 4.

3.1.2.1. Discrimination levels. An item analysis was conducted on the MOCCA items (N = 40 items) from the Year 2 (untimed) dataset using Iteman, Version 3.5 (1989), a software program designed to provide item analyses using classical test theory (CTT) (Iteman for Windows, Version 3.5, Assessment Systems Corp.). Discrimination and point biserial correlations range between −1 and +1. Values of r = .40 and above are considered desirable, and values of r = .20 and below are considered low (Ebel, 1954), indicating items that should be revised. Discrimination and point biserial indices for the MOCCA items ranged from r = .15 to .89.

3.1.2.2. Difficulty levels. Item difficulty was calculated as the proportion of participants who answered the item correctly (i.e., chose the causally coherent response type). Item difficulty indices range between 0 and 100; the higher the value, the easier the question. Thus, items at 85% and above are classified as easy, items at 51–84% are classified as moderate, and items below 50% are classified as difficult (Lord, 1952). Item difficulty ranged from 36% to 81% for the MOCCA items.
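Both indices are easy to state computationally. The sketch below implements the difficulty index and a corrected item-total point-biserial (each 0/1 item correlated with the total over the remaining items). Whether Iteman applies this particular correction is our assumption, so treat this as an illustration of CTT item analysis rather than a reproduction of the software's output.

```python
import numpy as np

def item_difficulty(correct):
    """Difficulty index: proportion choosing the causally coherent response
    for each item (columns) across participants (rows)."""
    return np.asarray(correct, dtype=float).mean(axis=0)

def corrected_point_biserial(correct):
    """Point-biserial correlation of each 0/1 item with the total score
    computed from the remaining items (item removed from its own total)."""
    X = np.asarray(correct, dtype=float)
    totals = X.sum(axis=1)
    r = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        rest = totals - X[:, j]   # total score excluding item j
        r[j] = np.corrcoef(X[:, j], rest)[0, 1]
    return r
```

Applied to a 192 × 40 matrix of correct/incorrect responses, these two functions would yield the difficulty and point-biserial columns of a table such as Table 4.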

Table 3
Correlation coefficients between the average proportion of totals by MOCCA response type across years and administration times.

Year and administration time     Causally coherent inference   Paraphrase   Local bridging inference   Lateral connection

Year 1 (timed)
  Causally coherent inference    .423⁎⁎
  Paraphrase                     −.172⁎⁎                       .368⁎⁎
  Local bridging inference       −.083⁎⁎                       .122⁎⁎       .249⁎⁎
  Lateral connection             −.092⁎⁎                       .099⁎⁎       .152⁎⁎                     .245⁎⁎

Year 2 (timed)
  Causally coherent inference    .424⁎⁎
  Paraphrase                     −.166⁎⁎                       .374⁎⁎
  Local bridging inference       −.056⁎⁎                       .027         .183⁎⁎
  Lateral connection             −.154⁎⁎                       .012         .066⁎⁎                     .203⁎⁎

Year 2 (untimed)
  Causally coherent inference    .519⁎⁎
  Paraphrase                     −.323⁎⁎                       .399⁎⁎
  Local bridging inference       −.226⁎⁎                       .080⁎⁎       .254⁎⁎
  Lateral connection             −.276⁎⁎                       .106⁎⁎       .140⁎⁎                     .270⁎⁎

Note. ⁎⁎ p < .01.

3.1.3. Validity

We calculated Pearson's r correlation coefficients to examine criterion-related validity for the MOCCA response types during Year 1 (timed) and Year 2 (timed and untimed). Correlations were computed between each of the MOCCA response types (causally coherent inference; paraphrase; local bridging inference; lateral connection) and the CALT, CBM Maze, DIBELS ORF, WJ ID, WJ Attack, and WM Words. Correlation coefficients between the MOCCA response type scores for Years 1 and 2 and scores on the measures considered for criterion-related validity ranged between r = −.37 and .75 (ps < .05 and < .001). Correlation coefficients for the validity analysis are provided in Table 5.
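A criterion-validity table of this form can be assembled with ordinary Pearson correlations. The sketch below uses hypothetical column names for the response-type totals and criterion measures; it illustrates the analysis, not the authors' code.

```python
import pandas as pd
from scipy.stats import pearsonr

def validity_table(df, response_types, criteria):
    """Pearson r (and p) between each MOCCA response-type total score and
    each criterion measure; df holds one row per participant."""
    rows = []
    for rt in response_types:
        for crit in criteria:
            r, p = pearsonr(df[rt], df[crit])
            rows.append({"response_type": rt, "criterion": crit,
                         "r": round(r, 3), "p": round(p, 3)})
    return pd.DataFrame(rows)

# Hypothetical usage, one call per year/administration dataset:
# validity_table(df, ["cci", "paraphrase", "local_bridge", "lateral"],
#                ["calt", "cbm_maze", "dibels_orf", "wj_id", "wj_attack",
#                 "wm_words"])
```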

3.2. Capacity of the MOCCA: Distinguishing among comprehension processes

3.2.1. Comprehension skill

To determine whether MOCCA can distinguish among the comprehension processes of different types of readers, depending on the administration time, we compared the MOCCA response types selected by good, average, and poor comprehenders. Comprehension groups were identified using scores from the CALT, a state standardized reading comprehension assessment. Specifically, we conducted repeated measures analyses of variance (RM-ANOVAs) with MOCCA response type (causally coherent inferences; paraphrases; local bridging inferences; lateral connections) as the within subjects variable and comprehension group (good, average, poor) as the between subjects variable, as well as simple main effects to assess the interaction between response type and comprehension group. Descriptive statistics (means and SDs) and simple main effects for Year 1 (timed) and Year 2 (timed and untimed) for the MOCCA response types selected by good, average, and poor comprehenders are presented in Table 6.


Table 4
Discrimination and difficulty indices for MOCCA items (N = 40 items) for the Year 2 untimed test administration (N = 192 participants).

Item   Proportion correct: Difficulty index   Discrimination index   Point biserial index
1      .69                                    .18                    .15
2      .63                                    .49                    .39
3      .63                                    .46                    .34
4      .73                                    .70                    .60
5      .69                                    .51                    .45
6      .74                                    .66                    .59
7      .81                                    .57                    .59
8      .36                                    .50                    .37
9      .80                                    .53                    .54
10     .47                                    .43                    .33
11     .74                                    .53                    .53
12     .74                                    .62                    .59
13     .61                                    .67                    .55
14     .46                                    .61                    .51
15     .50                                    .65                    .51
16     .60                                    .72                    .63
17     .60                                    .76                    .60
18     .61                                    .76                    .60
19     .45                                    .61                    .46
20     .58                                    .87                    .69
21     .74                                    .55                    .47
22     .69                                    .55                    .48
23     .78                                    .62                    .62
24     .72                                    .44                    .47
25     .80                                    .44                    .49
26     .54                                    .68                    .57
27     .76                                    .64                    .63
28     .66                                    .76                    .62
29     .56                                    .63                    .54
30     .54                                    .70                    .57
31     .60                                    .65                    .53
32     .56                                    .78                    .64
33     .54                                    .83                    .70
34     .58                                    .74                    .65
35     .50                                    .89                    .74
36     .59                                    .77                    .64
37     .37                                    .74                    .56
38     .43                                    .69                    .57
39     .43                                    .56                    .48
40     .40                                    .67                    .53

Table 5
Correlation coefficients between MOCCA response types and other reading measures across years and administration times.

Year and administration time   Measure      Causally coherent inference   Paraphrase   Local bridging inference   Lateral connection

Year 1 (timed)                 CALT total   .719⁎⁎⁎                       −.309⁎⁎      −.292⁎⁎                    −.322⁎⁎
                               CBM Maze     .728⁎⁎⁎                       −.110        −.102                      −.101
                               DIBELS       .589⁎⁎                        −.153        −.103                      −.169
                               WJ ID        .583⁎⁎                        −.232⁎       −.265⁎⁎                    −.225⁎
                               WJ Attack    .388⁎⁎                        −.110        −.217⁎                     −.138
                               WM Words     .308⁎                         −.035        −.054                      −.036

Year 2 (timed)                 CALT total   .747⁎⁎⁎                       −.197~       −.258⁎⁎                    −.371⁎⁎
                               CBM Maze     .626⁎⁎                        .066         −.035                      −.210⁎
                               DIBELS       .647⁎⁎                        −.043        −.102                      −.221⁎
                               WJ ID        .523⁎⁎                        −.142        −.054                      −.278⁎⁎
                               WJ Attack    .277⁎⁎                        −.129        .024                       −.151
                               WM Words     .413⁎⁎                        −.166        −.069                      −.118

Year 2 (untimed)               CALT total   .636⁎⁎                        −.396⁎⁎      −.475⁎⁎                    −.557⁎⁎
                               CBM Maze     .421⁎⁎                        −.164        −.280⁎⁎                    −.346⁎⁎
                               DIBELS       .565⁎⁎                        −.243⁎       −.347⁎⁎                    −.424⁎⁎
                               WJ ID        .431⁎⁎                        −.219⁎⁎      −.212⁎                     −.374⁎⁎
                               WJ Attack    .232⁎                         −.174        −.076                      −.180
                               WM Words     .346⁎                         −.284⁎⁎      −.233⁎                     −.287⁎⁎

Note. ~ p = .05. ⁎ p < .05. ⁎⁎ p < .01. ⁎⁎⁎ p < .001.



Results from the RM-ANOVA revealed statistically significant interactions of response type by comprehension group across years and administration times. Simple main effects of response type by comprehension skill revealed that good comprehenders chose more causally coherent inferences than did the average and poor comprehenders during both years and administration times. In addition, the poor and average comprehenders varied slightly in the other types of MOCCA responses (i.e., paraphrase, local bridging inference, lateral connection) chosen; however, both groups chose each of these more often than did the good comprehenders. Although these findings were promising for distinguishing between readers at different levels of comprehension skill, we were also interested in determining whether MOCCA could distinguish between subtypes of poor comprehenders, as seen in previous research using other more laborious methods (i.e., think alouds).
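The design here is a mixed one: response type varies within subjects and comprehension group between subjects. The sketch below shows how such an analysis might be run with the pingouin package on hypothetical long-format data; the authors do not report their software, so this is an illustration only.

```python
import pandas as pd
import pingouin as pg

# Hypothetical long-format data: one row per participant x response type
# (only two response types shown to keep the example small).
df = pd.DataFrame({
    "id":            [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "group":         ["good"] * 6 + ["poor"] * 6,
    "response_type": ["cci", "paraphrase"] * 6,
    "count":         [30, 2, 28, 3, 26, 4, 14, 6, 12, 7, 13, 5],
})

# Mixed-design ANOVA: response type (within) x comprehension group (between)
aov = pg.mixed_anova(data=df, dv="count", within="response_type",
                     subject="id", between="group")
print(aov.round(3))

# Bonferroni-adjusted follow-up comparisons, analogous to the simple
# main effects reported in Table 6
posthoc = pg.pairwise_tests(data=df, dv="count", within="response_type",
                            subject="id", between="group", padjust="bonf")
```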

3.2.2. Poor comprehenders

Thus, to determine whether MOCCA can distinguish between the subtypes of poor comprehenders found in previous research (i.e., paraphrasers and elaborators; McMaster et al., 2012), we first analyzed outcomes from the think-aloud task.

3.2.2.1. Think-aloud subtypes. A cluster analysis was conducted to determine whether poor comprehenders cluster into two types (paraphraser and elaborator), as seen in previous research (e.g., McMaster et al., 2012). We used Ward's method (Ward & Hook, 1963), a hierarchical method that groups a larger number of participants into smaller groups with similar characteristics. This method forms groups by minimizing the sum of squared within-group deviations from the group mean of each variable, for all variables in each group at the same time (Ward & Hook, 1963). For the current study, we used a two-cluster solution with the following think-aloud variables: associations; evaluations; connective inferences (local and global connectives); paraphrases (paraphrases and verbatim text repetitions); elaborative inferences (elaborative and predictive inferences); and metacognitive comments (understandings and text agreements).


Table 6
Simple main effects for repeated measures analysis of variance for the MOCCA response types by comprehender group across years and administration times.

Columns report Mean (SD) for: Good (Yr1 n = 45; Yr2 n = 48), Average (Yr1 n = 22; Yr2 n = 26), Poor (Yr1 n = 25; Yr2 n = 24).

Year and administration time  MOCCA response type            Good            Average         Poor

Year 1 (timed)                Causally coherent inferences   28.80a (6.84)   16.64b (4.88)   14.16b (4.69)
                              Paraphrases                     2.60a (2.96)    3.27a (4.74)    5.32b (4.38)
                              Local bridging inferences       1.62a (1.56)    2.32b (3.08)    3.04b (2.85)
                              Lateral connections             1.62a (1.47)    2.95b (2.59)    2.96b (3.03)

Year 2 (timed)                Causally coherent inferences   25.54a (6.28)   16.77b (7.89)   12.50b (4.88)
                              Paraphrases                     2.08a (2.27)    3.54a (6.29)    3.08b (2.26)
                              Local bridging inferences       1.21a (1.18)    1.54a (1.27)    2.17b (1.97)
                              Lateral connections             1.31a (0.98)    2.12b (1.68)    2.17b (1.97)

Year 2 (untimed)              Causally coherent inferences   32.17a (6.23)   24.92b (9.69)   21.38b (7.15)
                              Paraphrases                     2.65a (2.76)    5.85b (7.48)    6.04b (3.62)
                              Local bridging inferences       2.08a (1.40)    2.85a (2.60)    4.75b (3.22)
                              Lateral connections             1.56a (1.05)    2.96b (2.16)    3.63b (2.45)

Note. Means in a row that do not share subscripts are significantly different at p < .05 using a Bonferroni adjustment. Year 1 timed (N = 92), and Year 2 timed and untimed (N = 98). The F-tests for the interaction of MOCCA response types by comprehender group were F = 28.46 (Year 1 timed); F = 27.12 (Year 2 timed); and F = 15.79 (Year 2 untimed); all ps < .001.



Results from the cluster analysis replicated findings from previous research and yielded two groups of poor comprehenders (paraphrasers and elaborators) in both Years 1 and 2 (McMaster et al., 2012; Rapp et al., 2007). An RM-ANOVA comparing the think-aloud response types used by the good, average, and two types of poor comprehenders revealed a statistically significant interaction of comprehender type in Year 1 and Year 2. Descriptive statistics, simple main effects, and F statistics for think-aloud response types by comprehender group administered in Years 1 and 2 are found in Table 7.

Simple main effects revealed statistically significant differences between the two types of poor comprehenders, and between each subtype and the average and good comprehenders, in the types of think-aloud responses they used (McMaster et al., 2012; Rapp et al., 2007) in each year. That is, the most distinctive feature of the elaborators was that they made reliably more invalid elaborative and predictive inferences than did the paraphrasers, average comprehenders, and good comprehenders during both Years 1 and 2. The paraphrasers, on the other hand, made more paraphrases and text repetitions than did the elaborators, average comprehenders, and good comprehenders during both Years 1 and 2.

3.2.2.2. MOCCA subtypes. To examine whether MOCCA can identify subtypes of poor comprehenders similar to those identified by the think-aloud results, an additional cluster analysis was conducted with the MOCCA results. We used a two-cluster solution with the following MOCCA response type variables: causally coherent inference, paraphrase, local bridging inference, and lateral connection. This analysis also yielded two groups of poor comprehenders for Years 1 (timed) and 2 (timed and untimed): paraphrasers, poor comprehenders who chose the causally coherent inference response type, but not as consistently as the other comprehender groups, and who chose the paraphrase response type more than the other response types; and elaborators, poor comprehenders who chose the causally coherent inference type, but not as consistently as the average and good comprehenders, and who also chose the lateral connection response type some of the time. The same labels for the subtypes of poor comprehenders were used for the MOCCA findings to be consistent with the think-aloud findings above.

Table 7
Means, SDs, and simple main effects for repeated measures analyses of variance for the think-aloud response types chosen by good, average, and two types of poor comprehenders across years.

Columns report Mean (SD) for: Poor: Elaborators (Yr1 n = 19; Yr2 n = 14), Poor: Paraphrasers (Yr1 n = 6; Yr2 n = 10), Average (Yr1 n = 22; Yr2 n = 26), Good (Yr1 n = 45; Yr2 n = 48).

Year and administration time  Think-aloud response type                       Poor: Elaborators  Poor: Paraphrasers  Average          Good

Year 1 (timed)                Valid elaborative and predictive inferences     29.39a (6.05)      16.83a (11.44)      30.09a (10.83)   28.42a (14.12)
                              Invalid elaborative and predictive inferences    3.58a (3.72)       0.83b (0.75)        1.36b (1.94)     1.51b (2.81)
                              Paraphrases and text repetitions                 4.74a (4.56)      23.67b (8.07)        8.18a (9.66)     7.49a (9.38)

Year 2 (timed)                Valid elaborative and predictive inferences     24.00a (5.90)      15.30a (3.53)       20.92a (10.92)   18.17a (13.06)
                              Invalid elaborative and predictive inferences    5.57a (5.30)       2.00b (1.76)        2.31b (3.17)     1.40b (2.67)
                              Paraphrases and text repetitions                 2.43a (2.21)      14.90b (6.52)        6.92a (9.96)     5.42a (8.01)

Note. Means in a row that do not share subscripts are significantly different at p < .05 using a Bonferroni adjustment. Year 1 timed (N = 92) and Year 2 timed (N = 98). The F-tests for the interaction of think-aloud response types by comprehender group were F = 4.77, p = .001 (Year 1) and F = 3.46, p = .006 (Year 2).


An RM-ANOVA comparing MOCCA response types used by the good, average, and two types of poor comprehenders revealed a statistically significant interaction of comprehender type in Year 1 (timed) and Year 2 (timed and untimed). Descriptive statistics, simple main effects, and F statistics for the MOCCA response types chosen by good, average, and two types of poor comprehenders in Years 1 and 2 are listed in Table 8.

Simple main effects revealed statistically significant differences between the two types of poor comprehenders, and between each subtype and the average and good comprehenders, in the MOCCA response types they chose. Specifically, one unique feature was that the elaborators chose the lateral connection response type more often than did the good comprehenders during Year 2, when additional time was provided to complete more MOCCA items (i.e., Year 2 untimed). However, across Years 1 and 2, the elaborator groups did not choose this response type more often than the other distractor response types (i.e., paraphrase and local bridging inference) when they did not choose the causally coherent inference.

The paraphrasers, on the other hand, chose the paraphrase response type more often than did the elaborators, average, and good comprehenders during Year 1 (timed) and Year 2 (untimed). In addition, the paraphrasers did choose the paraphrase response type more often than the other distractor response types (i.e., local bridging inference and lateral connection) when not choosing the causally coherent inference.


Table 8
Means, SDs, and simple main effects for repeated measures analyses of variance for MOCCA response types chosen by good, average, and two types of poor comprehenders across years and administration times.

Columns report Mean (SD) for: Poor: Elaborators (Yr1 n = 20; Yr2 n = 15), Poor: Paraphrasers (Yr1 n = 5; Yr2 n = 9), Average (Yr1 n = 22; Yr2 n = 26), Good (Yr1 n = 45; Yr2 n = 48).

Year and administration time  MOCCA response type            Poor: Elaborators  Poor: Paraphrasers  Average          Good

Year 1 (timed)                Causally coherent inferences   14.25a (3.87)      13.80a (7.76)       16.64a (4.88)    25.80b (6.84)
                              Paraphrases                     4.65a (4.21)       8.00b (4.42)        3.27a (4.74)     2.60a (2.96)
                              Local bridging inferences       2.65a (2.46)       4.60b (4.04)        2.32a (3.08)     1.62a (1.56)
                              Lateral connections             2.50a (2.86)       4.80b (3.34)        2.95a (2.59)     1.62a (1.47)

Year 2 (timed)                Causally coherent inferences   13.33a (4.72)      11.11a (4.99)       16.77a (7.89)    25.54b (6.28)
                              Paraphrases                     2.80a (2.31)       3.56a (2.24)        3.54a (6.29)     2.08a (2.27)
                              Local bridging inferences       2.27a (1.98)       2.00a (2.06)        1.54a (1.27)     1.21a (1.18)
                              Lateral connections             2.07a (1.57)       2.56a (2.65)        2.12a (1.68)     1.31a (0.98)

Year 2 (untimed)              Causally coherent inferences   22.13a (7.71)      20.11a (6.33)       24.92a (9.69)    32.17b (6.23)
                              Paraphrases                     5.33a (2.66)       7.22b (4.76)        5.85a (7.48)     2.65c (2.76)
                              Local bridging inferences       5.20a (3.26)       4.00b (3.20)        2.85c (2.60)     2.08c (1.40)
                              Lateral connections             3.67a (2.47)       3.56a (2.56)        2.96a (2.16)     1.56b (1.05)

Note. Means in a row that do not share subscripts are significantly different at p < .05 using a Bonferroni adjustment. Year 1 timed (N = 92) and Year 2 timed and untimed (N = 98). The F-tests for the interaction of MOCCA response types by comprehender group were F = 19.14 (Year 1 timed); F = 18.21 (Year 2 timed); and F = 10.68 (Year 2 untimed); all ps < .001.



4. Discussion

In this study, we examined a new reading comprehension assessment (MOCCA) designed to identify individual differences in comprehension processing. The MOCCA was developed to assess the processes of reading comprehension used when reading narrative texts. Assessing reading comprehension processes has been useful in previous research for identifying individual comprehension differences among readers (e.g., McMaster et al., 2012; Rapp et al., 2007), which may in turn be useful for identifying appropriate instructional methods to address specific comprehension needs (e.g., McMaster et al., 2012). The MOCCA was developed because few reading comprehension assessments are currently available to practitioners for assessing reading comprehension processes for instructional purposes. Overall, this study provides initial evidence that the MOCCA can produce reliable and valid information that can be used to identify the types of processes that readers at different levels of comprehension skill use during reading.

4.1. Reliability and validity of the MOCCA

Findings from the current study indicate that the MOCCA demonstrates moderate to high internal consistency as a reading comprehension assessment. Specifically, the MOCCA yielded reliable responses for the causally coherent inference and paraphrase response types. The findings for the causally coherent inference response type, developed to complete the missing sentences in the MOCCA with a causal, goal-completing statement, support previous research on how readers develop a coherent representation of a text (e.g., Graesser et al., 1994; Trabasso & van den Broek, 1985). In addition, these findings extend previous research by demonstrating that MOCCA may be used to identify such comprehension processes during reading in place of other, more laborious methods (i.e., think-aloud tasks). However, it would be worth replicating these findings with a much larger, nationally representative sample.

The results for the local bridging inference and lateral connection response types, however, were not reliable. Additional analyses revealed that the lower alpha coefficients were due to restriction of the range of total scores: these response types were chosen less often, or not at all, during the test administrations, regardless of whether additional time was provided.
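For concreteness, coefficient alpha (Cronbach, 1951) for a single response type can be computed as in the following minimal sketch (not the authors' code); the score matrix and its dimensions are illustrative assumptions.

```python
# Coefficient alpha for one MOCCA response type. `scores` is assumed to be
# a (participants x items) matrix of 0/1 indicators of whether each
# participant chose that response type on each item.
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Illustration with random, independent choices (alpha near zero); a rarely
# chosen response type additionally leaves little item or total-score
# variance to work with, which is the restriction-of-range issue noted above.
rng = np.random.default_rng(0)
rare = (rng.random((92, 40)) < 0.05).astype(float)
print(round(cronbach_alpha(rare), 2))
```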

Participants may have chosen these response types less often because they do not best reflect the comprehension processes readers engage in during reading. For instance, selection of the lateral connection response type may have been low because of the inherent challenge of detecting when readers elaborate, draw on associations, or provide self-explanations during reading (Magliano et al., 2011; Millis, Magliano, Wiemer Hastings, Todaro, & McNamara, 2007). That is, such lateral connections, especially elaborations that connect to irrelevant background knowledge, are less constrained by semantic context than other textual processes. Thus, the MOCCA lateral connection response type may not accurately reflect what readers engage in during reading. Future studies that develop new response types that more reliably reflect readers' comprehension processes are warranted.

Nevertheless, these preliminary data provide a positive step toward developing a new way to identify and assess the types of processes that readers use during reading. Future studies are needed to collect additional data to refine the current MOCCA items and response types and to examine other response modes. For instance, a study that uses the MOCCA texts without the multiple-choice response types, instead asking participants to complete the missing sentence themselves, could be one approach for identifying other potential MOCCA response types that represent readers' comprehension processes.

In addition to the reliability analyses, the data from the untimed administration of the MOCCA allowed us to conduct an item analysis to examine the discrimination and difficulty levels of the items. Item discrimination and difficulty indices were calculated using the total number of correct (i.e., causally coherent inference) response types. We found that many of the items fell within acceptable ranges for both discrimination and difficulty. However, some items were limited in their range of difficulty, limiting their ability to discriminate at the low and high ends; thus, items that did not fall within acceptable ranges should be revised. For instance, items with low discrimination may not have functioned as well as the other items because their response types were not strongly aligned with the corresponding definitions. That is, the response types may have overlapped with one another, which may have confused readers and made it hard to choose the best response to complete the missing sentence. For instance, in some cases the paraphrase response type paraphrases or repeats the goal sentence of the text, whereas in other cases it paraphrases a combination of the goal and subgoal sentences in the text. This inconsistency in some of the paraphrase response types, and potentially others, across MOCCA items warrants additional revision and refinement of the low-functioning items.
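The classical indices reported in the item analysis can be sketched as follows. This is not the authors' analysis code (they used the Iteman 3.5 program); it is an illustrative reimplementation, using an item-rest point-biserial and an upper-lower 27% discrimination index in the spirit of Ebel (1954).

```python
# Classical item statistics: difficulty (proportion correct), an upper-lower
# discrimination index, and a point-biserial correlation per item.
import numpy as np

def item_stats(responses: np.ndarray):
    """responses: (n_examinees, n_items) 0/1 matrix, where 1 means the
    examinee chose the causally coherent inference on that item."""
    n, k = responses.shape
    totals = responses.sum(axis=1)
    difficulty = responses.mean(axis=0)  # proportion correct per item

    # Discrimination: proportion-correct gap between the top and bottom 27%
    # of examinees ranked by total score (cf. Ebel, 1954).
    order = np.argsort(totals)
    g = max(1, int(round(0.27 * n)))
    low, high = order[:g], order[-g:]
    discrimination = responses[high].mean(axis=0) - responses[low].mean(axis=0)

    # Point-biserial: correlation of each item with the rest-of-test score
    # (the item itself is excluded from the criterion to avoid inflation).
    pbis = np.array([
        np.corrcoef(responses[:, j], totals - responses[:, j])[0, 1]
        for j in range(k)
    ])
    return difficulty, discrimination, pbis
```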

Finally, the MOCCA produced scores that were significantly correlated with other reading measures. Specifically, the causally coherent inference response type was significantly positively correlated with all of the other reading measures administered in this study. These results provide initial support for the claim that readers must be able both to decode and to comprehend in order to perform well on the MOCCA. Additionally, the other MOCCA response types (paraphrases, local bridging inferences, lateral connections) were negatively correlated or uncorrelated with the other reading measures. These findings suggest that the better students perform on other reading measures, the less likely they are to choose the less desirable response types on the MOCCA (i.e., paraphrase, local bridging inference, lateral connection). However, additional administrations of the MOCCA and other reading measures should be conducted with a much larger sample of participants to further examine the validity of the measure.
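A minimal sketch (not the authors' code) of the validity correlations summarized in Table 5; the data and column names below are synthetic, illustrative assumptions.

```python
# Pearson correlations between each MOCCA response-type count and an
# external reading measure, per student.
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

rng = np.random.default_rng(2)
n = 92
df = pd.DataFrame({
    "causal_inference":   rng.integers(0, 40, n),
    "paraphrase":         rng.integers(0, 10, n),
    "local_bridging":     rng.integers(0, 10, n),
    "lateral_connection": rng.integers(0, 10, n),
    "calt_total":         rng.normal(200, 15, n),  # stand-in external measure
})

for rtype in ["causal_inference", "paraphrase",
              "local_bridging", "lateral_connection"]:
    r, p = pearsonr(df[rtype], df["calt_total"])
    print(f"{rtype}: r = {r:+.3f}, p = {p:.3f}")
```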

4.2. Identifying comprehension differences among readers

Our findings also indicate that the MOCCA identifies reading comprehension processing differences between comprehenders. Specifically, we found that, overall, good comprehenders chose the causally coherent inference response type more than did the average and poor comprehenders, and average and poor comprehenders chose the other response types (paraphrases, local bridging inferences, lateral connections) more than did the good comprehenders. These findings extend previous research indicating that different comprehenders use different comprehension processes during reading (e.g., McMaster et al., 2012; Rapp et al., 2007). The response types developed for the MOCCA are important because they are analogous to the types of comprehension processes that readers use to develop a representation of a text (e.g., Graesser & Clark, 1985; Graesser et al., 1994; Kintsch, 1998; Magliano et al., 2011; McMaster et al., 2012; McNamara et al., 1996; Rapp et al., 2007; van den Broek et al., 2001, 2006). Such processes have traditionally been identified using methods, such as think-aloud tasks, that are time consuming and inefficient.

In addition, many traditional reading comprehension assessments (e.g., standardized comprehension measures) fail to identify specific comprehension processes among readers; instead, they measure total reading comprehension scores, or the product of comprehension, and do not provide information about why a reader struggles with comprehension. Like other unique assessments (e.g., RSAT; Magliano et al., 2011), the MOCCA was developed using theories of reading comprehension to identify how (i.e., the processes by which) readers develop a coherent representation of a text, or situation model, during reading rather than after reading. However, unlike other assessments, MOCCA was developed to expend fewer resources during test administration and scoring; provide readers with a familiar testing format (i.e., multiple choice); provide educators with shorter administration times than other, more laborious methods (i.e., think-aloud tasks); and provide an assessment that can be used in a variety of academic settings (e.g., Magliano et al., 2011). In addition, we have initial evidence that the MOCCA can be used instead of think-alouds to efficiently identify individual reading comprehension differences (e.g., types of poor comprehenders) during reading for the purpose of making instructional decisions (e.g., August et al., 2006; Magliano et al., 2011; McMaster et al., 2012; Pike et al., 2010).

In addition, in this study, two subtypes of poor comprehenders emerged from both the think-aloud and MOCCA data. First, our findings support previous research that has found subtypes of poor comprehenders using a think-aloud task: (1) paraphrasers, or poor comprehenders who generally repeat the text and make fewer responses that are inferential, associative, or metacognitive; and (2) elaborators, or poor comprehenders who make more inferential responses that connect to background knowledge but do not appear to support overall comprehension of a text (McMaster et al., 2012; Rapp et al., 2007). These data support the notion that poor comprehenders may struggle with comprehension in different ways (e.g., Cain & Oakhill, 2006; Nation et al., 2002; Perfetti, 2007) and further raise the question of whether poor comprehenders respond to interventions in different ways (e.g., McMaster et al., 2012). Therefore, it is important to assess reading comprehension processes using appropriate measures to identify comprehenders who may benefit from different types of instructional approaches.

We also found subtypes of poor comprehenders using the MOCCA: (1) paraphrasers, poor comprehenders who chose the paraphrase response type more than the local bridging and lateral connection response types when they were not choosing the causally coherent inference response type; and (2) elaborators, poor comprehenders who chose the lateral connection response type much of the time when they were not choosing the causally coherent inference type. These preliminary findings extend previous research using other types of methods to identify subtypes of poor comprehenders (e.g., Magliano et al., 2011; McMaster et al., 2012); however, the findings did not consistently yield identical groups of poor comprehenders, or similar patterns, in both years of this study. One limitation of these findings is the small sample size of the poor comprehender subgroups. Sample sizes for the poor comprehender group were small because of limited access to a population of poor readers (i.e., participants were selected from only one suburban school district). De Ayala (2009) recommends that, to detect a group difference with new assessments, the sample size be at least 10 times the number of estimated parameters. A nominal model has 2(A − 1) parameters per item, so with I items and A alternatives per item the recommended minimum sample size is 10 × 2I(A − 1); for example, with I = 20 items and A = 4 alternatives, this is 10 × 2 × 20 × 3 = 1200 (De Ayala, 2009). Thus, future research with a much larger sample may provide a better estimate of whether the MOCCA can be used to predict subtypes of poor comprehenders.
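Restated as a math block (our notation, paraphrasing De Ayala's rule of thumb rather than quoting it):

```latex
% Minimum sample size for a nominal model with I items and A alternatives,
% each item contributing 2(A-1) parameters; worked with I = 20, A = 4:
N \;\ge\; 10 \times 2I(A-1) \;=\; 10 \times 2 \times 20 \times 3 \;=\; 1200
```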

Another limitation in identifying subtypes of poor comprehenders using the MOCCA is that the MOCCA and the think-aloud task may measure reading comprehension processes differently. During the MOCCA, readers must choose one of four response types, each mimicking a reading comprehension process, to complete a missing sentence in a text. In a think-aloud task, readers are prompted to talk aloud during reading, and their responses are later coded for the types of comprehension processes used. Thus, direct comparisons between the MOCCA and a think-aloud task for identifying subtypes of poor comprehenders may be difficult. Further research examining other MOCCA response types that are analogous to other types of think-aloud responses may provide useful information for continuing to develop the MOCCA as an appropriate measure for identifying subtypes of poor comprehenders.

5. Conclusion

The purposes of this study were to examine MOCCA, a new reading comprehension assessment designed to identify specific comprehension processes used during reading, and to identify individual differences in the types of processes that different comprehenders, in particular poor comprehenders, use during reading. The results from this study support our purpose for developing a reading comprehension assessment around how readers use cognitive processes to build a coherent representation of a text. That is, the development of MOCCA supports the need for assessments that are based on cognitive theories of reading comprehension (e.g., Johnson-Laird, 1983; Kintsch & van Dijk, 1978; Oakhill & Cain, 2007; Perfetti, Landi, & Oakhill, 2005; van den Broek, Young, Tzeng, & Linderholm, 1999). For instance, previous research has shown that readers engage in different types of processes to develop a coherent representation of a text, or situation model, during reading, and that processes that help readers track causality in a text are particularly important in this development (e.g., Graesser et al., 1994; Trabasso & van den Broek, 1985). Thus, the causally coherent inference response type developed for the MOCCA provides initial evidence supporting our purpose.

In addition, a unique feature of MOCCA is the qualitative nature of the differences between response types within each item. Results from this study provide preliminary support for MOCCA as a reliable and valid reading comprehension assessment that not only identifies the types of cognitive processes that readers use during reading, but also distinguishes between the types of comprehension processes that good, average, and subtypes of poor comprehenders use during reading. MOCCA items were designed with alternative answers based on findings from psychological studies of reading comprehension to help identify such differences (e.g., Kintsch & van Dijk, 1978; Magliano et al., 2011; Trabasso & van den Broek, 1985; van den Broek et al., 1999, 2001).

These results extend the literature on the cognitive processes of reading comprehension and on reading comprehension assessment. That is, the MOCCA differs from more traditional school-based reading comprehension assessments because of its diagnostic qualities. These findings have implications for educational research focused on using appropriate assessments to develop appropriate instruction and interventions for improving struggling readers' comprehension.

Acknowledgments

We wish to thank the undergraduate and graduate students who took part in this research project and provided support with collecting, scoring, and coding data. We would also like to acknowledge and thank Dr. Mark Davison and Dr. Gina Biancarosa for their ongoing support and suggestions for data analyses, and Dr. Susan Goldman for her suggestions for future revisions of the MOCCA.

Appendix A. MOCCA practice item and corresponding response types

NOTE: Instructions and practice items follow the format used during the MOCCA administration. The following includes a practice item and two additional MOCCA items. Children see the items one at a time during the MOCCA administration.

Practice Story: Janie and the Trip to the Store

Janie's dad was heading to the store.
Janie wanted to go with him.
She wanted to get a treat at the store.
Janie had saved up some money.
At the store, there was lots of candy to choose from.
MISSING SENTENCE
Janie was happy.

CHOICES:
A) She picked out her favorite candy bar. [Causally coherent inference]
B) Janie worked at the store. [Local bridging inference]
C) Janie's dad was upset with her choice. [Lateral connection]
D) Janie wanted to go to the store. [Paraphrase]

The Gift

Maria was still too young to work and earn money.
Whenever birthdays came up, she relied on her older siblings to buy the gift.
Maria's mother was having a birthday soon, and Maria wanted to buy her own gift.
Because she was good at doing chores, Maria decided to ask some neighbors if they would pay her to clean.
She went to all of the neighbors' houses to offer her help, but no one wanted her help.
MISSING SENTENCE
Her mother said, "The best gift of all is to know that you appreciate me."

CHOICES:
A) One nice neighbor offered to give Maria money to buy a present. [Local bridging inference]
B) So, Maria decided to do all her mother's chores for her birthday. [Causally coherent inference]
C) Maria wanted to buy her own gift for her mother's birthday party. [Paraphrase]
D) Maria walked downtown to look for a job to earn money for the gift. [Lateral connection]

Kayla at the Restaurant

Kayla is going to meet a friend for lunch at her favorite restaurant.
Kayla loves French fries at this restaurant and plans to order them.
Rachel, Kayla's friend, arrives at the restaurant.
Kayla and Rachel hadn't seen each other in months and were happy to have lunch.
Kayla and Rachel sat at their table and began to talk.
MISSING SENTENCE
Kayla is happy with her choice and hands the menu back to the waiter.

CHOICES:
A) Kayla loves the French fries served at this restaurant. [Paraphrase]
B) Rachel tells Kayla that she just bought a new house. [Local bridging inference]
C) Kayla likes how the restaurant is decorated with flowers. [Lateral connection]
D) Kayla orders the French fries when the waiter arrives. [Causally coherent inference]

Appendix B. Think-aloud texts

NOTE: Both think-aloud texts are administered one sentence at a time.


B.1. Brian's magical skill

Brian liked to perform for people. His teacher said that she was going to choose someone to be in a school show that would be in the spring. Brian wanted to be chosen. He sang a song for his teacher. It was not very good. His teacher did not choose him. Brian decided to put on a magic act. He bought some fancy magic cards and a hat. He tried to do some tricks on a table. The tricks were difficult to do. Brian wanted to learn other kinds of activities. He asked a magician if he would teach him. The magician agreed to teach him. Brian visited every day for a month. He watched how each routine was done and practiced a lot. Brian learned how to perform different kinds of magic. He selected tricks that he could do best. He put together a good act and showed his teacher. He made some pretty flowers come out of her ear and then made the flowers disappear. The magic act was a hit and was selected for the show.

Comprehension Questions:

1) Does Brian like to perform? _______
2) Did Brian learn magic by himself? _______

B.2. Candle crafts for school

One day, Sally's class had show-and-tell. Her best friend showed a picture that she had painted. She told the class how she had made the picture and everyone liked it. Then Sally decided that she wanted to make something special for show-and-tell. So, she went to the library and checked out a book. Sally read that candles could be made by melting crayons and pouring them into a cup. Finally, she decided she wanted to make a red candle. Sally found some crayons and a cup at home. She put the crayons into the cup. She melted the wax in the cup and held a string to make a wick. The wax hardened quickly. At last, the beautiful candle was finished. She put her new candle in a holder and began decorating it with ribbons. The next day, Sally carried the candle to school. When she arrived, she asked her teacher if she could be in show-and-tell. She won the show-and-tell grand prize for her candle. Her friend was happy for her and they celebrated after school.

Comprehension Questions:

1) Did Sally go to the library? ______
2) Did Sally dislike show-and-tell? ______

References

August, D., Francis, D. J., Hsu, H.-Y. A., & Snow, C. E. (2006). Assessing reading comprehension in bilinguals. The Elementary School Journal, 107, 221–238.

Cain, K., & Oakhill, J. (1999). Inference making ability and its relation to comprehension failure in young children. Reading and Writing: An Interdisciplinary Journal, 11, 489–503.

Cain, K., & Oakhill, J. (2006). Assessment matters: Issues in the measurement of reading comprehension. British Journal of Educational Psychology, 76, 697–708.

Chall, J. S. (1996). Stages of reading development (2nd ed.). Fort Worth, TX: Harcourt-Brace.

Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.

Cutting, L. E., & Scarborough, H. S. (2006). Prediction of reading comprehension: Relative contributions of word recognition, language proficiency, and other cognitive skills can depend on how comprehension is measured. Scientific Studies of Reading, 10, 277–299.

De Ayala, R. J. (2009). The theory and practice of item response theory. New York: The Guilford Press.

Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19, 450–466.

Deno, S. L. (1985). Curriculum-based measurement: The emerging alternative. Exceptional Children, 52, 219–232.

Eason, S. H., & Cutting, L. E. (2009). Examining sources of poor comprehension in older poor readers: Preliminary findings, issues, and challenges. In R. K. Wagner, C. S. Schatschneider, & C. Phythian-Sence (Eds.), Beyond decoding: The behavioral and biological foundations of reading comprehension (pp. 263–283). New York, NY: Guilford.

Ebel, R. L. (1954). Procedures for the analysis of classroom tests. Educational and Psychological Measurement, 14, 352–364.

Ericsson, K. A., & Simon, H. (1993). Protocol analysis: Verbal reports as data. Cambridge, MA: MIT Press.

Espin, C. A., & Foegen, A. (1996). Validity of general outcome measures for predicting secondary students' performance on content-area tasks. Exceptional Children, 62, 497–514.

Fletcher, J. M. (2006). Measuring reading comprehension. Scientific Studies of Reading, 10, 323–330.

Francis, D. J., Fletcher, J. M., Catts, H. W., & Tomblin, J. B. (2005). Dimensions affecting the assessment of reading comprehension. In S. G. Paris & S. A. Stahl (Eds.), Children's reading comprehension and assessment (pp. 369–394). Mahwah, NJ: Erlbaum.

Fuchs, L. S., & Fuchs, D. (1992). Identifying a measure for monitoring student reading progress. School Psychology Review, 58, 45–58.

Good, R. H., & Kaminski, R. A. (Eds.). (2002). Dynamic indicators of basic early literacy skills (6th ed.). Eugene, OR: Institute for the Development of Educational Achievement.

Graesser, A. C., & Clark, L. F. (1985). Structures and procedures of implicit knowledge. Norwood, NJ: Ablex.

Graesser, A. C., Singer, M., & Trabasso, T. (1994). Constructing inferences during narrative text comprehension. Psychological Review, 101, 371–395.

Johnson-Laird, P. N. (1983). Mental models: Towards a cognitive science of language, inference, and consciousness. Cambridge: Cambridge University Press.

Iteman, Version 3.5 (1989). Conventional item analysis program. St. Paul, MN: Assessment Systems Corporation.

Kaakinen, J. K., Hyönä, J., & Keenan, J. M. (2003). How prior knowledge, working memory capacity, and relevance of information affect eye-fixations in expository text. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 447–457.

Keenan, J. M., Betjemann, R. S., & Olson, R. K. (2008). Reading comprehension tests vary in the skills they assess: Differential dependence on decoding and oral comprehension. Scientific Studies of Reading, 12, 281–300.

Kincaid, J. P., Fishburne, R. P., Rogers, R. L., & Chissom, B. S. (1975). Derivation of new readability formulas (Automated Readability Index, Fog Count, and Flesch Reading Ease formula) for Navy enlisted personnel (Research Branch Report 8-75). Memphis, TN: Naval Air Station, Chief of Naval Technical Training.

Kintsch, W. (1998). Comprehension: A paradigm for cognition. New York: Cambridge University Press.

Kintsch, W., & van Dijk, T. A. (1978). Toward a model of text comprehension and production. Psychological Review, 85, 363–393.

Laing, S. P., & Kamhi, A. G. (2002). The use of think-aloud protocols to compare inferencing abilities in average and below-average readers. Journal of Learning Disabilities, 35, 436–447.

Linderholm, T., Cong, X., & Zhao, Q. (2008). Differences in low and high working memory capacity readers' cognitive and metacognitive processing patterns as a function of reading for different purposes. Reading Psychology, 29, 61–85. http://dx.doi.org/10.1080/02702710701568587

Lord, F. M. (1952). The relationship of the reliability of multiple-choice tests to the distribution of item difficulties. Psychometrika, 18, 181–194.

Magliano, J. P., Millis, K. K., RSAT Development Team, Levinstein, I., & Boonthum, C. (2011). Assessing comprehension during reading with the Reading Strategy Assessment Tool (RSAT). Metacognition and Learning, 6, 131–154. http://dx.doi.org/10.1007/s11409-010-9064-2

McMaster, K. L., van den Broek, P., Espin, C. A., White, M. J., Rapp, D. N., Kendeou, P., Bohn-Gettler, C. M., & Carlson, S. (2012). Making the right connections: Differential effects of reading intervention for subgroups of comprehenders. Learning and Individual Differences, 22(1), 100–111. http://dx.doi.org/10.1016/j.lindif.2011.11.017

McNamara, D. S., Kintsch, E., Songer, N. B., & Kintsch, W. (1996). Are good texts always better? Interactions of text coherence, background knowledge, and levels of understanding in learning from text. Cognition and Instruction, 14, 1–43.

Millis, K. K., Magliano, J. P., Wiemer Hastings, K., Todaro, S., & McNamara, D. (2007). Assessing and improving comprehension with latent semantic analysis. In T. K. Landauer, D. S. McNamara, D. Simon, & W. Kintsch (Eds.), Handbook of latent semantic analysis (pp. 207–225). Mahwah, NJ: Erlbaum.

Nation, K., & Snowling, M. (1997). Assessing reading difficulties: The validity and utility of current measures of reading skill. British Journal of Educational Psychology, 67, 359–370.

Nation, K., Clarke, P., & Snowling, M. J. (2002). General cognitive ability in children with reading comprehension difficulties. British Journal of Educational Psychology, 72, 549–560.

Northwest Evaluation Association (2001). Computerized Achievement Levels Tests (CALT). Lake Oswego, OR: Independent School District 271, Bloomington, MN.

Oakhill, J., & Cain, K. (2007). Introduction to comprehension development. In K. Cain & J. Oakhill (Eds.), Children's comprehension problems in oral and written language: A cognitive perspective (pp. 3–40). New York: Guilford Press.

Perfetti, C. A. (2007). Reading ability: Lexical quality to comprehension. Scientific Studies of Reading, 11, 357–383.

Perfetti, C., Landi, N., & Oakhill, J. (2005). The acquisition of reading comprehension skill. In M. J. Snowling & C. Hulme (Eds.), The science of reading: A handbook (pp. 227–247). Oxford: Blackwell.

Pike, M. M., Barnes, M. A., & Barron, R. W. (2010). The role of illustrations in children's inferential comprehension. Journal of Experimental Child Psychology, 105, 243–255.

Rapp, D. N., van den Broek, P., McMaster, K. L., Kendeou, P., & Espin, C. A. (2007). Higher-order comprehension processes in struggling readers: A perspective for research and intervention. Scientific Studies of Reading, 11(4), 289–312. http://dx.doi.org/10.1080/10888430701530417

Shanahan, T., & Shanahan, C. (2008). Teaching disciplinary literacy to adolescents: Rethinking content-area literacy. Harvard Educational Review, 78, 40–59.

Shanahan, T., & Shanahan, C. (2012). What is disciplinary literacy and why does it matter? Topics in Language Disorders, 32, 1–12.

Streiner, D. L. (2003). Starting at the beginning: An introduction to coefficient alpha and internal consistency. Journal of Personality Assessment, 80, 99–103.

Swanson, H. L., Cochran, K. F., & Ewers, C. A. (1989). Working memory in skilled and less skilled readers. Journal of Abnormal Child Psychology, 17, 145–156.

Trabasso, T., & van den Broek, P. (1985). Causal thinking and the representation of narrative events. Journal of Memory and Language, 24, 612–630. http://dx.doi.org/10.1016/0749-596x(85)90049-x

Trabasso, T., & Magliano, J. P. (1996a). Conscious understanding during text comprehension. Discourse Processes, 21, 255–288.

Trabasso, T., & Magliano, J. P. (1996b). How do children understand what they read and what can we do to help them? In M. Graves, P. van den Broek, & B. Taylor (Eds.), The first R: A right of all children (pp. 160–188). New York: Columbia University Press.

van den Broek, P. W. (1997). Discovering the cement of the universe: The development of event comprehension from childhood to adulthood. In P. W. van den Broek, P. Bauer, & T. Bourg (Eds.), Developmental spans in event comprehension and representation: Bridging fictional and actual events (pp. 321–342). Hillsdale, NJ: Erlbaum.

van den Broek, P., Rapp, D. N., & Kendeou, P. (2005). Integrating memory-based and constructionist processes in accounts of reading comprehension. Discourse Processes, 39, 299–316. http://dx.doi.org/10.1080/0163853x.2005.9651685

van den Broek, P., Young, M., Tzeng, Y., & Linderholm, T. (1999). The landscape model of reading: Inferences and the online construction of memory representation. In S. R. Goldman & H. van Oostendorp (Eds.), The construction of mental representations during reading (pp. 71–98). Mahwah, NJ: L. Erlbaum Associates.

van den Broek, P., Lorch, R. F., Linderholm, T., & Gustafson, M. (2001). The effects of readers' goals on inference generation and memory for texts. Memory and Cognition, 29, 1081–1087.

van den Broek, P., Lynch, J. S., Naslund, J., Ievers-Landis, C. E., & Verduin, C. J. (2003). Children's comprehension of main ideas in narratives: Evidence from the selection of titles. Journal of Educational Psychology, 95, 707–718.

van den Broek, P., McMaster, K. L., Rapp, D. N., Kendeou, P., Espin, C. A., & Deno, S. L. (2006, June). Connecting cognitive science and educational practice to improve reading comprehension. Paper presented at the Institute of Education Sciences Research Conference, Washington, DC.

van Dijk, T. A., & Kintsch, W. (1983). Strategies of discourse comprehension. New York: Academic Press.

Ward, J. H., & Hook, M. E. (1963). Application of an hierarchical grouping procedure to a problem of grouping profiles. Educational and Psychological Measurement, 23, 69–81.

Wayman, M., Wallace, T., Wiley, H. I., Ticha, R., & Espin, C. A. (2007). Literature synthesis on curriculum-based measurement in reading. Journal of Special Education, 41, 85–120.

Woodcock, R., McGrew, K. S., & Mather, N. (2001). Woodcock–Johnson III tests of cognitive abilities and achievement. Itasca, IL: Riverside Publishing.

Zwaan, R. A., Magliano, J. P., & Graesser, A. C. (1995). Dimensions of situation model construction in narrative comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 386–397.