American Reading Forum Presentation - Text Complexity of Standardized Writing Test Anchor Sets

Standardized Writing Tests: Does Complexity Matter?

Sarah Pennington and Allison Papke

University of South Florida

Rationale

● Florida students in grades 4, 8, & 10 are assessed on FCAT Writes

● Scores range from Unscorable to 6

● Scorers use a holistic rubric that evaluates the draft on four areas:

Focus

Organization

Support

Conventions

Scoring

● A six at 8th grade level:

Focused, purposeful writing

Sticks to main idea with logical organization

Specific support that is relevant, illustrates the point being made

Clarity of ideas and how they are presented

Mature command of language

Varied sentence structure

Few errors in mechanics

Incoming: Common Core Writing Standards

The CCSS in writing reflect increasing expectations in the cognitive complexity of student writing as students progress.

● Writing standard W.1.2 (Informative/Expository writing)

4th grade: Level 2 complexity (Basic application)

8th & 10th grades: Level 4 complexity (Extended thinking & complex reasoning) (FSU, 2013).

Research Questions

1. Does the text complexity of student samples from the FCAT Writes increase as the assigned score of the writing sample increases?

2. Does the text complexity of student samples from the FCAT Writes increase as the grade level of the student increases?

Data Set

● FCAT Writes Anchor Sets

2011 (N=53)

4th grade - Expository (N=18)

8th grade - Expository (N=18)

10th grade - Expository (N=17)

2012 (N=53)

4th grade - Narrative (N=17)

8th grade - Persuasive (N=18)

10th grade - Persuasive (N=18)

Complexity Measures

● Computerized Propositional Idea Density Rater (CPIDR; Brown, Snodgrass, Kemper, Herman, & Covington, 2008)

Analysis of proposition density within text

● Lexile Analyzer (Metametrix, 2013)

Measure of text complexity used within CCSS

Measure of text difficulty focusing on level of words used and how the words are combined to make sentences

Related Measures?

CPIDR and Lexile results

Pearson Correlation .249 (all data)

Pearson Correlation .334 (2012)

Pearson Correlation .159 (2011)
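As a reminder of what these coefficients measure, here is a minimal sketch of the Pearson computation in Python. The function name `pearson_r` and the sample values are hypothetical and chosen for illustration only; they are not the study's data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Sum of cross-products and sums of squares of the deviations
    cross = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cross / sqrt(ss_x * ss_y)

# Hypothetical idea-density and Lexile values for five writing samples
idea_density = [0.48, 0.52, 0.50, 0.55, 0.47]
lexile = [610, 700, 820, 760, 650]
r = pearson_r(idea_density, lexile)
print(f"r = {r:.3f}")
```

A value near 0 means the two measures rank the samples differently; a value near 1 means they largely agree.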

MANOVA Results

Analysis of both years combined:

No significant interaction (Grade * Score)

Grade differences significant F=4.237 (4,174) p<.01

Score differences significant F=4.303 (10,172) p<.001
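The slides do not say which MANOVA test statistic was used. As a minimal sketch, assuming Wilks' lambda and a simplified one-way design (grade only, not the Grade * Score factorial reported here), the statistic for two dependent variables such as CPIDR and Lexile could be computed as below. All function names and data values are hypothetical.

```python
from math import sqrt

def mean_vector(rows):
    """Mean of each of the two measures across (cpidr, lexile) rows."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)

def sscp(rows, center):
    """Sums of squares and cross-products around `center`: (sxx, sxy, syy)."""
    sxx = sum((r[0] - center[0]) ** 2 for r in rows)
    syy = sum((r[1] - center[1]) ** 2 for r in rows)
    sxy = sum((r[0] - center[0]) * (r[1] - center[1]) for r in rows)
    return (sxx, sxy, syy)

def det2(m):
    """Determinant of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]]."""
    return m[0] * m[2] - m[1] ** 2

def wilks_one_way(groups):
    """Wilks' lambda and its exact F for a one-way MANOVA with two measures."""
    all_rows = [row for rows in groups.values() for row in rows]
    total = sscp(all_rows, mean_vector(all_rows))
    within = (0.0, 0.0, 0.0)
    for rows in groups.values():
        part = sscp(rows, mean_vector(rows))
        within = tuple(a + b for a, b in zip(within, part))
    lam = det2(within) / det2(total)  # lambda = |W| / |T|
    n, g = len(all_rows), len(groups)
    # Exact F transformation that holds when there are two dependent variables
    f_stat = (1 - sqrt(lam)) / sqrt(lam) * (n - g - 1) / (g - 1)
    return lam, f_stat, (2 * (g - 1), 2 * (n - g - 1))

# Hypothetical (idea density, Lexile) pairs keyed by grade; not the study's data
samples = {
    4: [(0.45, 600), (0.47, 640), (0.44, 620), (0.48, 610)],
    8: [(0.50, 800), (0.49, 760), (0.52, 820), (0.51, 790)],
    10: [(0.53, 880), (0.50, 900), (0.55, 860), (0.52, 910)],
}
lam, f_stat, df = wilks_one_way(samples)
print(f"Wilks' lambda = {lam:.3f}, F{df} = {f_stat:.2f}")
```

With two dependent variables and three grades, the numerator degrees of freedom come out to 2(3 - 1) = 4, which matches the numerator reported for the grade effect. The denominator differs in this sketch because the one-way design has no Score factor.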

MANOVA - Differences

Grade level Differences (significant @ .05):

CPIDR Lexile

4 - 8

4 - 10 4 - 10

MANOVA - Differences

Score level Differences (significant @ .05):

CPIDR: 1 - 4

Lexile: 1 - 3

MANOVA Results

Analysis of 2011 samples:

No significant interaction (Grade * Score)

Grade differences not significant

Score differences significant F=2.073 (10,68) p<.05

MANOVA - Differences

Score level Differences (significant @ .05):

CPIDR: 1 - 4

Lexile: None

MANOVA Results

Analysis of 2012 samples:

Interaction significant F=2.283 (20,68) p<.01

Grade differences significant F=7.005 (4,68) p<.001

Score differences significant F=3.935 (10,68) p<.001

MANOVA - Differences

Grade level Differences (significant @ .05):

CPIDR Lexile

4 - 8 4 - 8

4 - 10 4 - 10

8 - 10

MANOVA - Differences

Score level Differences (significant @ .05):

CPIDR: None

Lexile: 1 - 3

Implications

● No consistent change in complexity as score or grade level increases

What are the factors that influence scores? (What are we scoring the students on?)

Is an overall holistic rubric an effective method of assessing work on a high-stakes test?

Words and Reality

● According to FLDOE, “The quality of the response, rather than the appearance or length of the response, is part of Florida's scoring criteria” (2013, p. 2).

● Pearson Correlation between score and number of words for all grade levels in both years: .837
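One way to put the .837 in context is to square each reported coefficient, which gives the proportion of variance the two variables share. The sketch below uses only coefficients reported in this presentation; the dictionary labels and variable names are ours.

```python
# Pearson coefficients as reported on the slides
reported = {
    "CPIDR vs. Lexile (all data)": 0.249,
    "CPIDR vs. Lexile (2012)": 0.334,
    "CPIDR vs. Lexile (2011)": 0.159,
    "Score vs. number of words": 0.837,
}

# r squared: proportion of variance shared by the two variables
shared_variance = {label: round(r ** 2, 3) for label, r in reported.items()}

for label, r2 in shared_variance.items():
    print(f"{label}: r^2 = {r2}")
```

Score and word count share about 70% of their variance, while the two complexity measures share roughly 3% to 11% with each other, depending on the year.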

Strengths

● Use of multiple measures

● Use of multiple years of released anchor sets

Limitations

● 2011 FCAT Writes: 1 scorer

● 2012 FCAT Writes: 2 scorers

● 2012 scoring based on higher expectations: increased attention to mechanics & quality of details (FLDOE, n.d.)

● Prompts differ by grade level each year

Selected References

Brown, C., Snodgrass, T., Kemper, S., Herman, R., & Covington, M. (2008). Automatic measurement of propositional idea density from part-of-speech tagging. Behavior Research Methods, 40(2), 540-545.

Florida Department of Education (n.d.). FCAT Writing. Retrieved November 28, 2013 from http://fcat.fldoe.org/fwinfopg.asp

Florida Department of Education (2005). FCAT and FCAT 2.0 Writing Rubrics. Retrieved March 2, 2013 from http://fcat.fldoe.org/rubrcpag.asp

Florida Department of Education, Office of Assessment (2013). 2013 FCAT 2.0 writing frequently asked questions. Retrieved November 30, 2013 from http://fcat.fldoe.org/fcat2/pdf/13fcat2writing.pdf

Florida State University (2013). CPALMS: Where educators go for bright ideas. Retrieved November 28, 2013 from http://www.cpalms.org

Metametrix (2013). Lexile Analyzer. Durham, NC: Metametrix. Retrieved March 3, 2013 from http://www.lexile.com/analyzer/