
ARTICLE IN PRESS

Contemporary Educational Psychology xxx (2007) xxx–xxx

www.elsevier.com/locate/cedpsych

Modeling latent true scores to determine the utility of aggregate student perceptions as classroom indicators in HLM: The case of classroom goal structures

Angela D. Miller a,*, Tamera B. Murdock b

a Educational and Counseling Psychology, 239 Dickey Hall, University of Kentucky, Lexington, KY 40506, USA
b Department of Psychology, 4825 Troost, University of Missouri-Kansas City, Kansas City, MO 64110, USA

Abstract

Measures of classroom climate such as classroom goal structures are often assessed through students’ perceptions; the aggregated means within classrooms are then sometimes labeled as “classroom characteristics.” The validity of these constructs is limited by the reliability of the measure at both the student and classroom level; yet, few studies accurately assess reliability when multilevel models are used. We demonstrate the use of a three-level hierarchical linear model to estimate latent true score measures of students’ perceptions of goal structures, appropriately adjusted for their nested structure. To investigate the distinctiveness of goal structures from teacher characteristics, we examined the inter-correlations among the student and classroom level variables, and predictors of each.

© 2006 Elsevier Inc. All rights reserved.

Keywords: Hierarchical linear modeling; Goal structures; Teacher characteristics

0361-476X/$ - see front matter © 2006 Elsevier Inc. All rights reserved.

doi:10.1016/j.cedpsych.2006.10.006

* Corresponding author. E-mail address: [email protected] (A.D. Miller).

Please cite this article in press as: Miller, A. D., & Murdock, T. B., Modeling latent true scores to determine ..., Contemporary Educational Psychology (2007), doi:10.1016/j.cedpsych.2006.10.006

1. Introduction

Contemporary theories of achievement motivation, including expectancy-value theory, goal theory, and self-determination theory, presume that the students’ motivated behavior is affected by the context of their classroom, and more specifically, by the characteristics, attitudes and behaviors of teachers within the classroom (Eccles, Lord, & Midgley, 1991; Kaplan & Midgley, 1999; Midgley, Feldlaufer, & Eccles, 1989). One of the most widely adopted motivational frameworks and the focus of this article is achievement goal theory (Ames & Archer, 1988; Dweck & Leggett, 1988; Elliott & Dweck, 1988). Achievement goal theory includes both situational and personal components. Personal goals are the purposes for which students engage in academic endeavors, whereas classroom goal structures are the goal-related messages that are more salient in a classroom setting.

Researchers working within this theoretical framework assume that teachers influence students’ motivation by the goal structure they establish and promote in the classroom (Ames, 1992; Ames & Archer, 1988). By reinforcing the importance of learning and understanding the material, recognizing student effort, and avoiding student comparisons, teachers promote a mastery-oriented environment (i.e., a mastery goal structure). When students are mastery-oriented, their focus is on the task at hand. They are presumed to be interested in improving themselves through effort, and generally, they believe that their intellectual ability is fluid and malleable. As such, when mastery-oriented students are confronted with challenges, they respond by increasing their effort and persistence and by trying new learning strategies. Empirically, the increased use of learning strategies (Ames & Archer, 1988; Young, 1997), higher self-efficacy (Midgley, Anderman, & Hicks, 1995; Roeser, Midgley, & Urdan, 1996), and personal goals that are mastery-oriented (Nolen & Haladyna, 1990; Young, 1997) have been associated with perceived mastery goal structures.

Conversely, teachers who stress the importance of grades and recognize students for outperforming their classmates create a competitive, performance-oriented classroom (i.e., a performance goal structure). Performance goal structures and goal orientations are presumed to be less adaptive than mastery goals because they increase students’ focus on themselves and their ability, thereby creating anxiety in situations where students may have doubts about their abilities to perform. Moreover, performance goal orientations are often found in students who believe that their intelligence is fixed. Accordingly, these students respond to challenging material as challenges to their sense of self, and are more likely to be debilitated in such situations. Perceived performance goal structures have been associated with numerous maladaptive outcomes including cheating (Anderman, Griesinger, & Westerfield, 1998), negative affect (Kaplan & Midgley, 1999; Anderman, 1999), personal extrinsic or performance orientations (Urdan, 2004; Wolters, 2004), and various behavioral problems (Kaplan & Maehr, 1999; Roeser & Eccles, 1998).

Despite the prominence of goal theory in motivational research, there are several conceptual and methodological limitations to the current body of scholarship in this area. In this article, the principal limitation of interest is the use of student perceptions to measure goal structures and, more specifically, the trend of aggregating these perceptions for use as a classroom indicator. The use of student perception data creates several statistical problems. First, when measures of perceived goal structure are used as individual predictors of student outcomes, there is a violation of the assumption of independent observations. Second, when perceived goals are aggregated within a classroom or teacher and used as a context measure, the reliability of scores is influenced both by the measure itself and by the number of students reporting within a given classroom, yet most researchers do not report the appropriate reliability statistic. These methodological problems limit the conceptual implications of goal research. In the following sections, we examine more specifically how these problem areas are apparent in the achievement goal literature, followed by an introduction of the specific hierarchical linear modeling (HLM) methodology used in this paper to examine and further explicate our current understanding of achievement goal structures.

1.1. Methodological issues in the achievement goal literature

A recent review of the literature on goal structure effects revealed that 16 of the 31 field studies which claim that mastery versus performance goal structures have differential effects on student motivation simply correlated individual students’ perceptions of the classroom goal structure with one or more motivation outcomes (Miller, 2006). Perceptions of classroom goal structure have been correlated with individual students’ reports of motivation as measured by effort, persistence, and help-seeking (Kaplan & Maehr, 1999; Roeser et al., 1996). However, the fact that a student’s individual perceptions correlate with his or her self-reported motivational outcome is partially a result of the (correlated) measurement error that comes from using a common method to collect all of the data (i.e., self report). In other words, a portion of the shared variance among the constructs is due to the common self-reporting perspective. Resultant correlations may therefore overestimate context effects by ascribing all of the variance to something about the teacher, when in fact, much of it may be due to the individual. In short, these methods do not allow researchers to estimate the extent to which goal structures actually vary between classrooms. Without knowing the extent to which the variance in perceived goal structure is a function of the students within each classroom versus the practices of the teacher, it becomes difficult to make practice recommendations to teachers. Moreover, these analyses do not take into consideration that multiple students are reporting perceptions of the same teacher and classroom. It is difficult to make classroom level conclusions based on this type of analysis because of this violation of the independent observations assumption of the inferential statistics used in these analyses. Simply put, these types of studies ignore the nesting of students within classrooms, and claims based on students’ perceptions about differences between classroom contexts cannot be made without also attending to variance within classrooms in individual characteristics such as students’ interests, skills, and goals.
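To make concrete what it means to estimate how much perceived goal structure varies between classrooms, the following minimal Python sketch simulates student ratings and recovers the between-classroom share of variance with the balanced one-way random-effects ANOVA estimator of the intraclass correlation. It is an illustration only and not the analysis used in this article: the function names, the number of classrooms and students, and the variance values are all hypothetical.

```python
import random
from statistics import mean

def simulate_classrooms(n_classes, n_students, tau, sigma2, seed=0):
    """Simulate ratings with between-class variance tau and within-class variance sigma2.

    All values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_classes):
        class_effect = rng.gauss(0.0, tau ** 0.5)
        data.append([3.0 + class_effect + rng.gauss(0.0, sigma2 ** 0.5)
                     for _ in range(n_students)])
    return data

def anova_icc(data):
    """Intraclass correlation from a balanced one-way random-effects ANOVA."""
    k = len(data)       # number of classrooms
    n = len(data[0])    # students per classroom (balanced design assumed)
    grand = mean(x for row in data for x in row)
    class_means = [mean(row) for row in data]
    ms_between = n * sum((m - grand) ** 2 for m in class_means) / (k - 1)
    ms_within = sum((x - m) ** 2
                    for row, m in zip(data, class_means)
                    for x in row) / (k * (n - 1))
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)

# Hypothetical design: 57 classrooms, 12 students each, true between-class share 0.20.
scores = simulate_classrooms(n_classes=57, n_students=12, tau=0.2, sigma2=0.8)
icc = anova_icc(scores)
print(f"Estimated share of variance between classrooms: {icc:.2f}")
```

A simple student-level correlation cannot provide this quantity, because it pools the within-classroom and between-classroom variance into a single estimate.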

Inquiry aimed at disentangling the impact of goal structures (or any other teacher behavior) on students is by its very nature multilevel insofar as teachers are matched with students according to classroom groupings (i.e., students are nested within teachers). The advent of HLM and other methods of mixed level modeling has made it possible to more accurately estimate teacher effects when nesting occurs. Nine studies have explicitly attempted to tease out the influence of goal structures from that of individual raters using HLM analyses (Anderman et al., 2001; Anderman & Young, 1994; Kaplan, Gheen, & Midgley, 2002; Karabenick, 2004; Ryan, Gheen, & Midgley, 1998; Turner et al., 2002; Urdan, 2004; Urdan, Midgley, & Anderman, 1998; Wolters, 2004). Although two of these HLM studies measured goal structures with teachers’ reports (Anderman et al., 2001; Anderman & Young, 1994), all of the others assessed goal structures at the classroom level by aggregating students’ perceptions of goal structures within each classroom. Analyses were then conducted to determine the effects of the aggregate goal structure measure on student motivation, as measured by cognitive, affective, and behavioral indices, such as persistence, effort, help-seeking, and personal goal orientations. Across these studies, however, the conclusions were based on relationships with aggregated measures of classroom mastery and classroom goal structure which were never demonstrated to be reliable and which were potentially even biased.

1.2. Lack of reliability and potential bias in aggregated measures

From a phenomenological perspective, students’ perceptions are the most pertinent source of data on classroom constructs because a given student’s behavior is presumed to be affected by his or her interpretation of the classroom context, more so than by an objective indicator of that context. At the same time, however, if we are going to make recommendations to teachers for how to change their practices, we need to identify some commonalities among what students in their classrooms are reporting. This “common perspective” is measured in HLM by aggregating the individual student perceptions within any class to the classroom level. However, from a measurement standpoint, the use of aggregate student perceptions as classroom indicators in hierarchical analyses raises several additional issues.

First, there is measurement error that is typically miscalculated and/or ignored. That is, researchers aggregate using multiple students in the classroom as “observers” of the environment (Turner et al., 2002; Urdan et al., 1998). These aggregate measures are labeled as measures of classroom environment and used at the classroom level to predict student-level outcomes, without calculating the reliability of the scores at the classroom level. Whereas the traditionally reported Cronbach’s α reliability is an estimate of consistency of responses across items, the classroom level reliability also reflects the inter-rater agreement among students sharing the same classroom; and as such, is affected by the number of student respondents per classroom (among other things). This classroom level reliability will affect correlations of this classroom indicator with other variables in a research study. Thus, faulty interpretations of the reliability can potentially result in erroneous interpretations of the relations among variables. For example, the influence of goal structure could be underestimated if the researcher relies only upon an acceptable Cronbach’s α coefficient as an indicator of reliability. Assuming the reliabilities to be adequate and then finding goal structures to be nonsignificant predictors of the outcome variable has been common in many studies (Anderman et al., 2001; Kaplan et al., 2002; Turner et al., 2002; Wolters, 2004), often contrary to the researchers’ hypotheses. However, if aggregate perceptions were used, Cronbach’s α is not sufficient, and possibly inadequate reliability of construct scores could be attenuating inter-correlations with other variables in the model. Thus, conclusions drawn could be unfounded and alternative conclusions possible.
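The attenuation argument can be illustrated numerically. The sketch below is a simplified illustration under stated assumptions, not the estimation procedure used in this article: it treats the reliability of a classroom mean as the Spearman-Brown function of a hypothetical intraclass correlation and the number of student raters, and it applies the classical attenuation relation to a hypothetical true correlation. The function names and all numeric values are invented for the example.

```python
def aggregate_reliability(icc, n_students):
    """Spearman-Brown reliability of a classroom mean based on n_students ratings."""
    return n_students * icc / (1.0 + (n_students - 1) * icc)

def attenuated_correlation(true_r, rel_x, rel_y):
    """Expected observed correlation given the reliabilities of both measures."""
    return true_r * (rel_x * rel_y) ** 0.5

# Hypothetical values: intraclass correlation of .10, true correlation of .50,
# and a perfectly reliable outcome measure (rel_y = 1.0).
for n in (5, 12, 30):
    rel = aggregate_reliability(icc=0.10, n_students=n)
    observed = attenuated_correlation(true_r=0.50, rel_x=rel, rel_y=1.0)
    print(f"n={n:2d}  reliability={rel:.2f}  expected observed r={observed:.2f}")
```

Under these assumptions, the same underlying relation looks weaker when fewer students report per classroom, which is one way a real classroom effect could appear nonsignificant.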

Furthermore, the student-level reliability in a hierarchical model should also be verified because traditional reliability estimates (i.e., Cronbach’s α) do not consider the nested structure of the data (students grouped within teachers or classrooms). As has been seen in other areas, such as the use of SES as an individual or school indicator (Raudenbush & Bryk, 2002), there can be aggregation bias, meaning that results of the analysis depend upon the degree to which the goal structures actually vary from classroom to classroom and the precision of the estimate of each classroom’s measure of goal structure. Because these measures of classroom goal structure are determined by aggregating individual student perceptions, the measurement precision is heavily reliant on the number of students sampled per classroom and their level of agreement on the construct.

Another troublesome issue that continually plagues researchers who endorse achievement goal theory is the inconsistency in results across studies. Goal theory posits that mastery goal structures are good and should increase positive outcomes whereas performance goal structures are less adaptive and should be related to more negative outcomes. However, classroom mastery goals rarely correlate with expected student outcomes such as achievement or adaptive motivational outcomes while classroom performance goals often correlate with negative student outcomes. Furthermore, performance goal structures do not always lead to maladaptive student behaviors as posited by achievement goal theory (Miller, 2006). One way in which we may begin to explain some of the inconsistent findings is to examine more closely the measurement issues associated with using student perception data, especially the use of the aggregate perceptions at the classroom level. As such, the first purpose of this study is to demonstrate how an HLM measurement model can be used to determine both student level and classroom level reliabilities when the structure of the data is nested.

A second difficult issue in the achievement goal literature is the challenge of distinguishing mastery goal structures from other aspects of teacher behavior/classroom environment such as teacher competence and teacher respect. These constructs have also been measured through student perception and are defined, respectively, as the students’ perceptions of the teacher’s ability to communicate subject matter effectively while managing a classroom appropriately, and of the teacher’s ability to maintain positive rapport with students. Although these constructs are infrequently included in goal structure studies, when they are, extremely high correlations have been reported between the teacher variables and perceived mastery goal structures (Murdock, Hale, & Weber, 2001; Roeser et al., 1996). In these studies, the classroom perception variables have been measured at the student level without accounting for nesting. If we assume these constructs are classroom level phenomena, then validity of scores depends on demonstrating their separateness at the classroom level as evidenced by low to moderate correlations among them. Accordingly, in this study we not only model latent true score measures of classroom goal structure at both the student and classroom levels, but we also investigate their correlations with three other classroom level variables (i.e., teacher respect, teacher competence, and teacher interest). Teacher respect and competence, as defined previously, are included because of their inclusion in studies in which the authors noted unusually high correlations. Teacher interest, defined as the students’ perceptions of their teacher’s interest in teaching as a profession and level of engagement with students, is included for exploratory purposes. Some goal theorists have hinted that students are unable to separate their personal feelings about their teachers from their teacher’s behavior (Urdan, 2004). If this hypothesis is true, then we would expect students’ perceptions of teacher interest to be highly correlated with their perceptions of mastery goal structures. If all of these teacher characteristics and goal structures are distinct constructs, this should be evident in their inter-correlations.

1.3. What is an HLM measurement model?

Most scholars in educational psychology are familiar with a more traditional two-level HLM that models student-level outcomes in terms of both student characteristics and classroom or school effects. Typically in these models, the lowest level consists of student variables (e.g., personal goal orientation and gender) and at the next level of the model are characteristics of the context in which students are nested (e.g., classroom goal structures). In this case, the student outcome of personal mastery goal orientation could be modeled in terms of both the student level predictor of prior achievement and the teacher level predictors of classroom goal structures. The primary advantage of using HLM is the ability to include both student and classroom level variables in a model that utilizes simultaneous estimation in an attempt to disentangle the teacher and student effects while also accounting for the nesting of students within classrooms.

All traditional HLM analyses are carried out in a two-step method. First, an unconditional model is examined in which the variance of the outcome variable is partitioned into between-class and within-class components. Next, a conditional model is estimated in which predictors at one or all levels are added in an attempt to account for any between-class variance that may exist. In this article, this analysis method is extended to a three-level model in which the individual item responses of participants are modeled at the lowest level. Also, in this case, the outcome is not limited to one variable; all latent constructs are estimated and examined simultaneously. This will be further clarified in the methods section with a detailed description of the specific model examined. This HLM measurement model (Raudenbush & Bryk, 2002) includes item responses nested within students who are then in turn nested within classrooms. Through the two-step method described previously, both reliability and validity issues can be examined via the same model. This type of model is appropriate in the analysis of classroom environment characteristics, such as goal structures, when the method of data collection involves perceptions of a shared environment. In this situation, each respondent is viewed as an independent observer of the environment and their responses are aggregated across all “student observers” to be used as an indicator of the environment.
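As a rough illustration of the nesting just described, the unconditional three-level model for a single construct can be sketched as follows. This is a simplified univariate form under the usual normality assumptions, not the multivariate specification estimated in this study. The notation is assumed here for illustration: $Y_{ijk}$ is the response to item $i$ by student $j$ in classroom $k$, $\pi_{jk}$ is the student's latent true score, $\beta_k$ is the classroom's latent true score, and $\sigma^2$, $\tau_\pi$, and $\tau_\beta$ are the item-level, student-level, and classroom-level variance components.

```latex
\begin{align*}
\text{Level 1 (items):}      \quad & Y_{ijk} = \pi_{jk} + \varepsilon_{ijk}, & \varepsilon_{ijk} &\sim N(0, \sigma^2) \\
\text{Level 2 (students):}   \quad & \pi_{jk} = \beta_{k} + r_{jk},          & r_{jk} &\sim N(0, \tau_\pi) \\
\text{Level 3 (classrooms):} \quad & \beta_{k} = \gamma + u_{k},             & u_{k} &\sim N(0, \tau_\beta) \\[4pt]
\text{Variance partition:}   \quad & \operatorname{Var}(Y_{ijk}) = \sigma^2 + \tau_\pi + \tau_\beta
\end{align*}
```

In this sketch the unconditional step amounts to estimating the three variance components, and the conditional step would add predictors to the level 2 or level 3 equations.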

Before using these aggregate measures as classroom indicators in other more traditional two-level HLM analyses, more information should be obtained about the measurement of classroom constructs by means of students’ perceptions. Because students’ responses within a given classroom are in regard to a shared environment, some of the variance in the responses is most likely shared among members of that common environment. Thus, at the student level, the HLM reliability coefficient will account for the fact that students within classrooms should be more similar to one another than those in different classrooms. As such, in the case of classroom goal structures, we can assume that multiple student perceptions of the same classroom teacher will to some extent be shared regardless of the different individual characteristics represented in that classroom. At the classroom level, the reliability coefficient will indicate the extent of the reliability of the classroom aggregate construct and if this coefficient is not satisfactory, then the aggregation of student responses to the classroom level is not warranted. Thus, if the sample of students from a given classroom is truly representative of the classroom and the students do express similar perceptions of their classroom environment, then the aggregate perception measure is viable as a classroom indicator. In sum, this methodological approach allows us to step back and examine more thoroughly the psychometrics of the classroom goal structure constructs which have been widely used across the achievement goal theory literature.
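The two reliability coefficients discussed above can be sketched for a balanced design, in which every student answers the same number of items and every classroom has the same number of students. The formulas are the textbook form for a three-level measurement model with item-level variance `sigma2`, between-student variance `tau_pi`, and between-classroom variance `tau_beta`. These names and the numeric values are assumptions made for illustration, not estimates from this study.

```python
def student_level_reliability(tau_pi, sigma2, n_items):
    """Reliability of a student's scale score within a classroom (balanced design)."""
    return tau_pi / (tau_pi + sigma2 / n_items)

def classroom_level_reliability(tau_beta, tau_pi, sigma2, n_items, n_students):
    """Reliability of the classroom aggregate across student 'observers' (balanced design)."""
    error_variance = (tau_pi + sigma2 / n_items) / n_students
    return tau_beta / (tau_beta + error_variance)

# Hypothetical variance components for a four-item scale.
sigma2, tau_pi, tau_beta = 1.0, 0.5, 0.1
print(f"student level: {student_level_reliability(tau_pi, sigma2, n_items=4):.2f}")
for n_students in (5, 12, 30):
    rel = classroom_level_reliability(tau_beta, tau_pi, sigma2,
                                      n_items=4, n_students=n_students)
    print(f"classroom level with {n_students:2d} students: {rel:.2f}")
```

The sketch shows why an acceptable item-based coefficient does not guarantee a usable classroom indicator: the classroom coefficient also depends on how much classrooms differ and on how many students report per classroom.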

Although this type of HLM measurement model is new to the arena of achievement goal theory and likely to the whole of achievement motivation literature, it is not a new technique. Multivariate hierarchical measurement models have been used in a range of areas to estimate constructs at the individual and organizational levels. For example, Raudenbush, Rowan, and Kang (1991) examined a multilevel, multivariate model of school climate using teacher reported data on principal leadership, staff cooperation, and other teacher influences. Cheong and Raudenbush (2000) modeled child and adolescent problem behaviors, investigating individual and contextual factors that contribute to the undesired behaviors. A similar model has also been applied to neighborhood settings, using multiple informants from each neighborhood to rate neighborhood social control (Raudenbush & Sampson, 1999). However, the use of these models in the educational psychology literature has not been advanced, perhaps due to limitations of sample size or the paucity of studies considering the common hierarchical data structure of students grouped within classrooms. As mentioned previously, the unit of analysis choice has traditionally been at the level of the student, while the clustering of students within classrooms and teachers has been ignored. Furthermore, when HLM models have been used and multiple student ‘observer’ reports are employed, the psychometric data are usually presented only at the student level with scale reliabilities calculated on the items answered at the individual level without consideration of the nested structure or the possible lack of inter-rater agreement among ‘observer’ reports.

1.4. Summary and research questions

Measures of classroom climate constructs such as classroom goal structures are usually based on students’ self-reports of what they perceive to be true within a classroom. Often, they are then aggregated at the classroom level with the mean score within the class labeled as a “classroom characteristic” (Kaplan et al., 2002; Karabenick, 2004; Wolters, 2004). Typically, the only reliability data reported are the Cronbach’s α coefficients. However, within the same classroom, individual student perceptions of the teacher and environment (i.e., classroom context) also vary. Researchers using aggregate measures of classroom context should consider that the number of students who responded per classroom as well as their level of inter-rater agreement affects the reliability of that classroom indicator. A larger, more representative sample of the classroom environment will reduce the standard error of measurement, which in turn will enhance the estimated reliability of the aggregate measure. These sources of error are not captured in the reporting of the coefficient α. In short, individual measurement error, in addition to other measurement obstacles, including the number of items on the scale as well as the number of “student observers” per classroom, complicates the use of aggregate student perceptions as classroom indicator measures of classroom environment.

The purpose of this study is to investigate the reliability of classroom aggregate measures as well as the relations among the latent ‘true’ score measures of classroom environment at both the student and classroom level. Three specific questions will be addressed:

(1) Are aggregate measures of classroom goal structures reliable indicators of classroom context?

(2) How much of the variance in the latent constructs of classroom environment is within the class (between students in a single class) versus between classes (between teachers)?

(3) Is there evidence for the discriminant validity of the latent constructs of classroom goal structures versus other measures of teacher characteristics?

By way of examining these methodological limitations of using aggregate student perceptions as indicators of classroom goal structures, we also aim to provide some explanation of the conceptual limitations of achievement goal theory. The following conceptual questions accompany and are directly associated with the methodological questions above:


(1a) What is the potential impact of unreliable aggregate measures of classroom goal structures?

(2a) Do aggregate measures of goal structures and teacher characteristics provide evidence of actual teacher differences between classrooms or are the perception data merely a function of the intra-student differences?

(3a) Is classroom mastery goal structure an independent construct or, as suggested by some goal theorists, more likely a reflection of good teaching?

Thus, the overarching goal of this paper, in addition to demonstrating the benefits of a statistical methodology that is infrequently used in educational psychology, is to provide evidence of its usefulness in terms of the theoretical dilemmas associated with the incorporation of goal structure measures in achievement goal theory research studies.

2. Methods

2.1. Participants

The sample consisted of 689 high school students from 57 math (n = 33) and science (n = 24) classrooms (57 unique teachers) in a Midwestern semi-urban middle-class high school (25% of the student population received free or reduced lunch). No students in this sample reported on both their science and math teacher. The school district’s population is approximately 83% Caucasian, 7% African-American, 5% Hispanic, 3% Asian, and 2% other. The sample used in this study is representative of the district population.

2.2. Measures

All scale measures were on a five-point rating scale anchored with “not at all true” and “very true” except for prior achievement, which had nine choices ranging from “mostly A’s” to “D’s and F’s.” Although these measures are technically ordinal, they are being treated as interval, which is common in the educational psychology literature. Assuming reasonably normal distributions, the HLM analyses should not be affected (Raudenbush & Bryk, 2002).

2.3. Classroom climate measures

These measures were indicators of the classroom climate and were assessed using students as “observers” of the classroom and of their teacher. Because these individual perceptions at the student level are being aggregated to serve as classroom characteristics at a higher level of the model, it is appropriate to evaluate the HLM reliabilities. Reliability estimates for the total sample, as well as at the student and classroom levels, are reported in Table 1.

2.3.1. Classroom goal structure

Four items assessing classroom mastery goal orientation (e.g., “My teacher wants us to understand our work, not just memorize it.”) and seven items assessing classroom performance goal orientation (e.g., “My teacher tells us how we compare to other students.”) were adopted from the Patterns of Adaptive Learning Survey (PALS; Midgley et al., 2000).

2.3.2. Teacher pedagogical competence

Ten items comprising three subscales measuring teacher organization, teacher skill, and evaluation fairness (e.g., “My teacher shows good examples to get across difficult points.”) were utilized from Murdock, Briggs, and Olson (2002).

2.3.3. Teacher respect

Seven items were included to assess students’ perceptions of the extent to which their teacher behaved in ways that demonstrated respect for students (e.g., “This teacher shows respect towards students;” “This teacher embarrasses or insults students” (reflected)). The scale was developed by Murdock et al. (2002) based on qualitative work by Gorham and Christophel (1992).

2.3.4. Teacher interest

The level of teacher-communicated interest was measured with five items adapted from Murdock et al. (2002) based on Gorham and Christophel (1992). Sample items included, “This teacher introduces interesting ideas about the subject,” and “The teacher makes presentations that are dry and boring” (reflected).

2.3.5. Teacher expectations

Three items evaluating students’ perceptions of their teacher’s expectations for their future success (e.g., “How good does your math/science teacher think you are at math/science?”) were adapted from Jussim and Eccles’s (1992) measure of teacher expectations.

2.4. Student level predictors

These constructs were assessed using student self-report. As all variables in this section are reported at the student level, Cronbach’s α is the appropriate reliability estimate.

2.4.1. Personal goal orientation

Four items assessing mastery goal orientation (e.g., “I like math work I’ll learn from even if I make a lot of mistakes;” α = .85) and nine items assessing performance goal orientation (e.g., “One of my goals is to show others that I am good at math/science;” α = .92) from PALS (Midgley et al., 2000) were used.

2.4.2. Prior achievement

One item assessed students’ prior achievement in math/science class: “Usually in math/science class I get: mostly A’s . . . D’s and F’s.”

2.5. Teacher/classroom level predictors

Teachers’ self-reported information, including years of teaching experience and area of certification, was collected with open-ended questions. The level of the course (remedial, regular, or advanced) was noted based upon the district’s course title. Since these were individual report variables at the classroom level, Cronbach’s α is the appropriate reliability estimate of scores.

2.5.1. Teacher self-efficacy

This 21-item scale (Woolfolk & Hoy, 1990), originally based on the teacher efficacy scale developed by Gibson and Dembo (1984), was composed of two subscales: personal teaching efficacy (12 items, α = .81) and general teaching efficacy (9 items, α = .78).

2.5.2. Content area knowledge

Three items (α = .81) were developed to measure teachers’ beliefs about their content area (mathematics or science) knowledge (e.g., “I know enough about this specific area of mathematics to teach this class effectively.”).

2.6. Procedure

Mathematics and science teachers in the participating school district were recruited by soliciting their participation through a short presentation about the purposes of the study at a departmental meeting. For each of the teachers who agreed, one of their classes was selected to participate. Classes were selected to cover a range of class topics and grade levels.

All selected classes were visited by a graduate student who explained the purpose of the study to students and gave them a letter in order to obtain parent permission. Classes with 80% return rates were entered into a drawing. Students and the teacher in winning classrooms each received $10 cash. Graduate students administered the surveys 2 weeks after the permission slips were collected.

3. Data analysis

3.1. Description of modeling techniques

A three-level hierarchical linear model as described previously was estimated using HLM 6 software. Level 1 was a measurement model of the variation within each student’s responses to the items on the classroom climate measures; it captures item inconsistency, which is the variation around the individual’s “true score” or true perception of the classroom. Level 2 captures the variation among the student respondents within a classroom around the classroom’s “true score” or, in other words, individual variations in perceptions of the classroom. Finally, level 3, the highest level of the model, captures the variation across the classrooms in which the students are nested, in terms of how classrooms vary around the grand mean of the sample (see Fig. 1).

Five classroom environment constructs were included in the model: mastery goal structure (4 items), performance goal structure (7 items), teacher competence (13 items), teacher respect (7 items), and teacher interest (5 items). Additionally, teacher expectancy (3 items) was modeled for comparative purposes. Therefore the level 1 file contained 26,871 observations (689 students × 39 items), level 2 had 689 cases (students), and level 3 had 57 cases (teachers). Dummy coding was used to label each item according to the construct to which it belongs. In this manner, the analysis is multivariate, with all six classroom environment constructs as the outcome variables.
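The stacked level 1 file described above can be pictured with a short sketch. This is only an illustration of the long-format layout with one indicator column per construct; the column names, the helper function, and the example student are hypothetical, and the item counts are those stated in the paragraph above.

```python
# Item counts per construct as stated in the text (they sum to 39).
ITEMS_PER_CONSTRUCT = {
    "mastery_goal_structure": 4,
    "performance_goal_structure": 7,
    "teacher_competence": 13,
    "teacher_respect": 7,
    "teacher_interest": 5,
    "teacher_expectancy": 3,
}
CONSTRUCTS = list(ITEMS_PER_CONSTRUCT)


def build_level1_rows(student_id, classroom_id, responses):
    """Stack one student's item responses into long-format level 1 rows.

    `responses` maps each construct name to that student's list of ratings.
    Each output row holds one rating plus six 0/1 indicators, exactly one
    of which is 1 (the construct the item belongs to).
    """
    rows = []
    for construct in CONSTRUCTS:
        for item_number, rating in enumerate(responses[construct], start=1):
            row = {
                "classroom": classroom_id,
                "student": student_id,
                "item": f"{construct}_{item_number}",
                "y": rating,
            }
            for name in CONSTRUCTS:
                row[f"a_{name}"] = 1 if name == construct else 0
            rows.append(row)
    return rows


# Hypothetical student who answers 3 on every item.
example = {name: [3] * k for name, k in ITEMS_PER_CONSTRUCT.items()}
rows = build_level1_rows(student_id=1, classroom_id=1, responses=example)

rows_per_student = len(rows)                 # one row per item
total_level1_rows = 689 * rows_per_student   # size of the level 1 file
```

Each student contributes 39 rows, so 689 students yield the 26,871 level 1 observations reported above.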

Fig. 1. Diagram of HLM measurement model: level 1 (items), level 2 (students), level 3 (classrooms).


3.2. The model

Following the notation of Raudenbush and Bryk (2002), the level 1 measurement model can be described by the equation

$$Y_{ijk} = \sum_{p} \pi_{pjk}\, a_{pijk} + e_{ijk},$$

where $Y_{ijk}$ is the response of student $j$ to item $i$ in classroom $k$; $a_{pijk}$ is an indicator variable that takes on the value of 1 if item $i$ belongs to construct $p$ (and 0 otherwise), with $p = 1, \ldots, 6$ for the classroom context constructs listed above; $\pi_{pjk}$ is the latent score for student $j$ in classroom $k$ on construct $p$; and $e_{ijk}$ is the error term.

As previously mentioned, all six constructs were evaluated simultaneously through the use of the indicator variables, which take on different values for each construct. In this manner, the unconditional model was the means by which the reliability and inter-correlations of the constructs were examined, and through the conditional model, construct validity was examined with the addition of student and classroom level predictors. In this example analysis, all of the items comprising a given construct are equally weighted. Essentially, this is the same as constraining the factor loadings on a given factor to be equal in a confirmatory factor analysis model. Although a more rigorous item analysis method, such as item response theory (IRT), could be employed here, the equal weighting is appropriate for this example because the common usage of these scales is mean scale scores, and the purpose of this example is to determine whether the mean of mean scale scores is an appropriate indicator of classroom environment.

Level 2 of the model represents the latent true scores across students within classrooms

$$\pi_{pjk} = \beta_{pk} + r_{pjk},$$

where $\beta_{pk}$ is the mean latent score on construct $p$ in classroom $k$ and $r_{pjk}$ is the student individual effect. As can be seen from the model formula above, there were six beta values for this model: one mean latent score for each of the six classroom environment constructs.

Level 3 of the model produces the latent true scores across classrooms

$$\beta_{pk} = \gamma_{p} + u_{pk},$$

where $\gamma_{p}$ is the grand mean of the latent score of construct $p$ and $u_{pk}$ is the classroom effect. Again, there were six grand mean latent scores: the average of each of the six classroom constructs across all 57 classrooms in the data set.
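For orientation, the three levels can be collapsed into a single expression by substituting the level 3 equation into level 2 and the result into level 1. This is only an algebraic restatement of the equations above, not an additional modeling assumption:

$$Y_{ijk} = \sum_{p} a_{pijk}\,\bigl(\gamma_{p} + u_{pk} + r_{pjk}\bigr) + e_{ijk}$$

Here $\gamma_{p}$ is the grand mean of construct $p$, $u_{pk}$ the classroom effect, $r_{pjk}$ the student effect, and $e_{ijk}$ the item-level error. Because exactly one indicator $a_{pijk}$ equals 1 for any given item, each response is the grand mean of its construct plus a classroom deviation, a student deviation, and item error.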

The unconditional model (model with no predictor variables) permits the researcher to examine the psychometric properties and inter-correlations of the constructs at both the student and classroom levels. Among other things, these analyses will determine: (1) the level of reliability at the classroom level, which is highly dependent upon both the number of students per classroom and their level of inter-rater agreement; (2) the inter-correlations of the constructs after accounting for the lack of independence among observations, which may be helpful in examining the validity of goal structure constructs; and finally (3) the amount of variance in the given constructs that lies between and within classrooms. Significant variance between classrooms will indicate actual differences between classrooms that may be attributable to teacher differences, whereas high levels of within-classroom variance would indicate that the constructs are more a function of who is in the classroom than of actual differences occurring in the classroom.
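To make the three-way partition of variance concrete, the sketch below simulates responses for a single construct from the unconditional model and recovers the item, student, and classroom variances with simple moment estimators for a balanced design. It is an illustration only: the variance values are hypothetical, the simulated responses are continuous rather than five-point ratings, and HLM 6 estimates these components by likelihood methods rather than by the moment calculations used here.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def simulate(n_classes, n_students, n_items,
             var_class, var_student, var_item, grand_mean=3.0, seed=0):
    """Draw Y[k][j][i] = grand_mean + u_k + r_jk + e_ijk for one construct."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_classes):
        u = rng.gauss(0.0, var_class ** 0.5)        # classroom effect
        classroom = []
        for _ in range(n_students):
            r = rng.gauss(0.0, var_student ** 0.5)  # student effect
            classroom.append([
                grand_mean + u + r + rng.gauss(0.0, var_item ** 0.5)  # item error
                for _ in range(n_items)
            ])
        data.append(classroom)
    return data


def variance_components(data):
    """Moment estimates for a balanced design (not HLM's likelihood method)."""
    n_students = len(data[0])
    n_items = len(data[0][0])
    # Item variance: spread of a student's items around that student's mean.
    var_item = mean([sample_variance(items) for c in data for items in c])
    # Student means within a classroom vary by var_student + var_item / p.
    student_means = [[mean(items) for items in c] for c in data]
    within = mean([sample_variance(ms) for ms in student_means])
    var_student = within - var_item / n_items
    # Classroom means vary by var_class + within / n.
    class_means = [mean(ms) for ms in student_means]
    var_class = sample_variance(class_means) - within / n_students
    return var_class, var_student, var_item


# Hypothetical variance components; many classrooms so the estimates settle.
true_values = (0.30, 0.70, 1.50)
data = simulate(2000, 12, 4, *true_values)
estimates = variance_components(data)
```

With only 57 classrooms, as in the present sample, the classroom-level estimate from such a calculation would be considerably noisier than in this large simulated example.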

After full consideration of the three major properties of the unconditional model described above, predictor variables in the form of student and teacher characteristics were added to the model, making it conditional, in an attempt to account for variance in the latent true scores and thereby help to discern differences and similarities between the goal structure and teacher characteristics of the classroom. At the student level (level 2), personal goal orientation and prior achievement were added. Achievement goal theory posits that classroom goal structures influence the personal goal orientations that students adopt. Therefore, we expected that personal mastery goal orientation would be related to mastery goal structure and that personal performance goal orientation would be correlated with performance goal structure. Previous research has also shown that prior achievement is related to students’ perceptions of the classroom goal structure, with higher achieving students endorsing mastery goal orientation. At the classroom level (level 3), teacher self-efficacy and content area knowledge were added as predictors of the classroom constructs. This analysis was exploratory in nature; we expected a relationship between teacher self-efficacy and students’ perceptions of teacher competence. Results pertaining to these two constructs may assist in discerning differences between the mastery goal structure and teacher competence constructs.

4. Results

4.1. Unconditional model

The unconditional HLM model provides three main characteristics of the latent classroom constructs: (1) reliabilities at each level of the model; (2) inter-correlations of the latent constructs at each level of the model; and (3) the amount of between-classroom variance in each construct.

The unconditional measurement model partitions the variance of the items from a given scale into three components: ‘true’ score variance at the student level ($\sigma^2_{\text{student}}$), ‘true’ score variance at the classroom level ($\sigma^2_{\text{classroom}}$), and item variance ($\sigma^2_{\text{item}}$), modeling latent constructs at both the student and classroom levels free of measurement error. The estimated variances of these constructs are then used to calculate internal consistency estimates at both levels (Hox, 2002; Raudenbush & Bryk, 2002; Raudenbush et al., 1991):

$$\text{Student level} = \frac{\sigma^2_{\text{student}}}{\sigma^2_{\text{student}} + \dfrac{\sigma^2_{\text{item}}}{p}}$$

$$\text{Classroom level} = \frac{\sigma^2_{\text{classroom}}}{\sigma^2_{\text{classroom}} + \dfrac{\sigma^2_{\text{student}}}{n} + \dfrac{\sigma^2_{\text{item}}}{p \cdot n}}$$

where $p$ is the number of items in the scale and $n$ is the average number of students in the classroom.

In an HLM analysis, the classroom level internal consistency is dependent on the number of items in the scale, the inter-correlation of the items, the level of agreement among students within the classroom, and the number of students sampled within the classroom, while the internal consistency at the student level is only dependent on the number of items and the inter-correlation of the items.
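The two internal consistency formulas translate directly into code. The sketch below is a minimal illustration; the function names and the variance components used in the example are hypothetical and are not estimates from this study.

```python
def student_reliability(var_student, var_item, n_items):
    """Student-level internal consistency: depends on items only."""
    return var_student / (var_student + var_item / n_items)


def classroom_reliability(var_class, var_student, var_item, n_items, n_students):
    """Classroom-level internal consistency: depends on items and raters."""
    return var_class / (
        var_class
        + var_student / n_students
        + var_item / (n_items * n_students)
    )


# Hypothetical variance components for a 4-item scale, 12 students per class.
var_class, var_student, var_item = 0.30, 0.70, 1.50

student_rel = student_reliability(var_student, var_item, n_items=4)
class_rel = classroom_reliability(var_class, var_student, var_item,
                                  n_items=4, n_students=12)

# Doubling the number of students changes only the classroom-level value.
class_rel_24 = classroom_reliability(var_class, var_student, var_item,
                                     n_items=4, n_students=24)
# Doubling the number of items raises both values.
student_rel_8_items = student_reliability(var_student, var_item, n_items=8)
```

The number of students enters only the classroom-level function, which mirrors the point made in the paragraph above.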

As seen in Table 1, the internal consistency at each level of the model differs from the typically reported Cronbach’s α. Across all of the scales, the reliability of the latent variable modeled at the student level is below Cronbach’s α, for which nesting was not considered. Whereas Cronbach’s α is the average internal consistency of the responses to the items with all students treated as independent, the student level coefficients are the mean reliability within each classroom, which takes into account that observers within classrooms are not independent of one another. As with Cronbach’s α, reliability at this level is most strongly affected by the number of items on the scale. This is evident in the poor reliability of the three-item teacher expectations scale and the modest reliabilities of the four- and five-item mastery goal structure and teacher interest scales.


At the classroom level, reliabilities were adequate for all constructs other than teacher expectations, as we expected. Reliability at this level is an estimate of the inter-rater reliability within the classroom. Whereas all of the other constructs assess students’ perceptions of how their teacher behaves within the classroom generally, the teacher expectation items referred to the individual student’s perception of how the teacher views him or her individually. Thus, consistency within the classrooms should not be anticipated. Both mastery and performance goal structure perceptions, although their reliabilities are slightly lower, still maintain acceptable scale consistency at the classroom level. Perceptions of teacher competency, although still within an acceptable range, show a substantial attenuation of reliability relative to the standard Cronbach’s α.

Next, correlations among the latent level 2 and level 3 variables were calculated from the unconditional model. These correlations are examined for evidence of discriminant validity among the constructs. This procedure produces correlation estimates that account for error variance. As such, the correlations should be larger than zero-order correlations among student scores and aggregated teacher scores, where measurement error attenuates the relationships. Moreover, at the student level, the correlation coefficient now takes classroom grouping into account. Thus, this correlation coefficient reflects the average within-classroom correlation among the student-level (level 2) variables.

As can be seen in Table 2, several of the student-level correlations are much larger when classroom grouping and measurement error are taken into account. With the exception of the measure of performance goal structures, the magnitude of the correlations among the other variables increased by between .15 and .25. Relations among mastery goal structure, teacher interest, and to a lesser extent teacher respect, were now so high as to suggest that at this level they were not different constructs. Further analysis using student and teacher level predictors is needed in order to determine if these constructs are distinguishable from each other. The addition of predictors to the model at level 2 (student characteristics) and level 3 (teacher/classroom characteristics) can perhaps provide some evidence of discriminant validity among the five outcome classroom environment constructs. This part of the analysis is exploratory in nature.

The classroom level correlation structure is not equivalent to the individual level because the measurement error is different at each level depending on the number of scale items at the individual level and the number of respondents at the classroom level. It is also important to note that correlations may legitimately vary between the individual and classroom level. For example, for a classroom with a high correlation between teacher competency and mastery goal structure, this does not imply that all students within that classroom who perceived the teacher to have high competency also rated the classroom as high mastery goal structure. Once again, however, there were very high correlations between mastery goal structures and the other teacher variables, implying redundancy among these constructs.

Table 1
Cronbach’s α and HLM internal consistency estimates at the student and classroom levels

| Scale | Number of items | α | HLM student-level | HLM classroom-level |
|---|---|---|---|---|
| Performance goal structure | 7 | .80 | .75 | .76 |
| Mastery goal structure | 4 | .80 | .65 | .77 |
| Teacher competence | 10 | .90 | .80 | .77 |
| Teacher respect | 7 | .86 | .70 | .80 |
| Teacher interest | 5 | .77 | .66 | .79 |
| Teacher expectations | 3 | .72 | .40 | .50 |

Table 2
Zero-order and HLM correlations at the student and classroom levels

| | Student (level 2): Zero-order (a) | Student (level 2): HLM | Classroom (level 3): Zero-order (b) | Classroom (level 3): HLM |
|---|---|---|---|---|
| Performance goal with | | | | |
| Mastery goal | .088 | .075 | .125 | .189 |
| Teacher competency | −.002 | .024 | −.029 | .057 |
| Teacher respect | −.215 | −.308 | −.169 | −.133 |
| Teacher interest | .126 | .107 | .251 | .327 |
| Mastery goal with | | | | |
| Teacher competency | .788 | .991 | .896 | .920 |
| Teacher respect | .655 | .884 | .797 | .823 |
| Teacher interest | .782 | .998 | .913 | .961 |
| Teacher competency with | | | | |
| Teacher respect | .735 | .887 | .862 | .893 |
| Teacher interest | .769 | .987 | .853 | .912 |
| Teacher respect with | | | | |
| Teacher interest | .631 | .875 | .704 | .706 |

(a) Student level without consideration of classroom membership.
(b) Classroom level means.

Finally, results from the unconditional model are also used to calculate the amount of between-class variance in each of the constructs. If these constructs are actually measures of important aspects of what happens in the classroom, there should be some variation across teachers. In any HLM analysis, an intra-class correlation coefficient is calculated at the unconditional model stage. This calculation is the basis of most multilevel modeling approaches. This simple ratio of variances permits the researcher to determine how much of the variance in the dependent variable(s) actually lies between the groups in the analysis (in this case, between classrooms). Between-class variances (intra-class correlations) were as follows: classroom mastery goals, 29%; classroom performance goals, 25%; teacher competency, 26%; teacher respect, 32%; and teacher interest, 30%. These results suggest that students do perceive teachers differently across classrooms on these variables, but that a larger portion of the variance comes from perceptions within classrooms, meaning that a large portion of the variance in these climate constructs was due to the similarities and differences among the student ‘observers’ themselves. These individual differences could include motivational, social, and cognitive differences. Several studies have already examined predictors of students’ perceptions of goal structures (Anderman & Young, 1994; Urdan, 2004; Wolters, 2004).

Note that the estimated amounts of between-classroom variance presented here are larger than they would have been had observed rather than latent scores been used, because we partitioned out the item error. Analyses based on observed scores incorporate the error term into the within-classroom variance, falsely inflating the within-classroom variance and attenuating the share of variance that lies between classrooms. To correct for this, some researchers have begun to report reliability-corrected intra-class correlations (Urdan, 2004; Wolters, 2004).
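The attenuation described here can be illustrated with a short calculation. The sketch assumes that an observed scale mean carries item error equal to the item variance divided by the number of items; the function names and variance components are hypothetical and are not the study's estimates.

```python
def latent_icc(var_class, var_student):
    """Share of latent true-score variance that lies between classrooms."""
    return var_class / (var_class + var_student)


def observed_icc(var_class, var_student, var_item, n_items):
    """Same ratio for observed scale means, whose within-classroom
    variance also contains item error (var_item / n_items)."""
    return var_class / (var_class + var_student + var_item / n_items)


# Hypothetical variance components for a 4-item scale.
var_class, var_student, var_item = 0.30, 0.70, 1.50

icc_latent = latent_icc(var_class, var_student)
icc_observed = observed_icc(var_class, var_student, var_item, n_items=4)
```

Under these illustrative values the latent intra-class correlation is .30, while the observed-score version falls to roughly .22, because the item error is counted as within-classroom variance.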

4.2. Conditional model

The final step of this study was an exploratory analysis examining possible predictors of the latent constructs to provide preliminary discriminant and/or convergent validity evidence for the constructs. Recall that mastery goals were very highly correlated with teacher competency and interest at both the classroom and individual levels (r > .90), and that there were large amounts of within-class variability in these two constructs. As mentioned previously, some researchers have suggested that perhaps mastery goal structures cannot be separated from teaching ability (Murdock et al., 2001; Roeser et al., 1996); however, achievement goal theory does not assume any relationship between teacher quality and the goal structure supported in the classroom. With the addition of individual and classroom level predictors in the conditional model, an examination of the similarities and differences of these constructs was conducted.

In the student-level analyses, we entered prior achievement as measured by self-reported previous grades in mathematics or science courses (prior achievement) and personal goal orientations (mastery and performance) as these two student variables are hypothesized to be related to classroom goal structures. At the classroom level, teacher characteristic variables of years of teaching experience, the level of the course, which was labeled as remedial, regular, or advanced (dummy coded), self-reported teaching efficacy, and self-reported content area knowledge were entered. Selection of these characteristics was guided by the assumption that teaching efficacy and content area knowledge should be more related to the students’ perceptions of teacher competence. These predictors are plausible constructs which may help to disentangle teacher characteristic constructs from goal structure constructs.

There were many similarities in predictors of mastery goal structure, teacher competency, respect, and interest at the classroom level. First, there was a significant inverse relation between total years of teaching experience and all four of these latent constructs, indicating that the more experienced the classroom teacher, the lower students rated them on competence, respect, and interest promotion; they were also less likely to be seen as promoting a mastery-oriented classroom goal structure. As seen in Table 3, the unstandardized coefficients are of similar magnitude across all four latent constructs. This comparison can be made because this is a multivariate analysis. Furthermore, the teaching experience variable did not predict performance goal structure. In contrast, whereas course level (e.g., advanced, remedial) predicted the extent to which students viewed the classroom as performance oriented, with students in advanced courses viewing it as less performance focused, this variable did not predict perceived mastery goals, teacher competence, respect, or interest promotion. Self-reported teaching efficacy and self-reported content knowledge did not predict significant amounts of variance in any construct.

At the student level, there was again a similar pattern of predictors for the teacher constructs of competence, respect, and interest and mastery goal structure. Personal mastery goal orientation positively predicted all of these constructs, whereas personal performance goal orientation negatively predicted respect and interest and positively predicted performance goal structure. It is logical that personal performance predicted performance goal structure and personal mastery predicted mastery goal structure as students adopting the goal are more likely to pick up on those cues in the classroom environment. However, it is interesting to note that students who more highly endorse personal performance goals were more likely to rate the teacher lower on respect or ability to generate interest. Or, perhaps it is more logical to think that students who perceived their teacher as less respectful and less interesting were more likely to highly endorse performance goal orientation. As all of these data were correlational, causation cannot be established.

Table 3
Unstandardized predictors, t values, and variance accounted for in the latent classroom constructs

| Dependent variables | Performance goal structure | t | Mastery goal structure | t | Teacher competency | t | Teacher respect | t | Teacher interest | t |
|---|---|---|---|---|---|---|---|---|---|---|
| Fixed effect | 1.62 | | 2.62 | | 2.75 | | 2.95 | | 2.44 | |
| Classroom level predictors | | | | | | | | | | |
| Years teaching experience | −.003 | −.521 | −.016* | −2.042 | −.018** | −2.780 | −.016* | −2.44 | −.017* | −2.283 |
| Remedial course | .205 | 1.014 | −.209 | −.0832 | −.135 | −.665 | −.305 | −1.44 | −.208 | −.894 |
| Advanced course | −.416** | −3.223 | −.003 | −.022 | .112 | .853 | .108 | .793 | .081 | .544 |
| Student level predictors | | | | | | | | | | |
| Gender | −.231*** | −3.918 | −.019 | −.280 | .005 | .095 | .114* | 2.046 | −.030 | −.475 |
| Prior achievement in mathematics/science | .053** | 2.667 | −.012 | −.513 | −.017 | −.894 | −.017 | −.882 | .001 | .037 |
| Personal mastery goal orientation | .000 | .015 | .423*** | 10.111 | .376*** | 11.314 | .300*** | 8.961 | .380*** | 9.936 |
| Personal performance goal orientation | .246*** | 7.425 | −.074 | −1.885 | −.052 | −1.682 | −.118*** | −3.77 | −.084* | −2.340 |
| Variance explained | | | | | | | | | | |
| Classroom level | 25% | | 10% | | 21% | | 25% | | 13% | |
| Student level | 36% | | 36% | | 35% | | 34% | | 33% | |

* p < .05. ** p < .01. *** p < .001.

As with all goal theory studies, gender was also entered in the model and, similar to previous findings, it was a significant predictor of performance goal structure and teacher respect, with boys being more likely to rate the classroom as higher on performance goal structure and girls more likely to rate the teacher as more respectful. Prior achievement was a significant positive predictor of performance goal structure and nonsignificant for all other constructs. This is again consistent with the separateness of performance goals, but not necessarily mastery goals, from all of the other constructs.

The amount of variance explained in the latent constructs was very similar at the student level, at approximately 35%, whereas at the classroom level it ranged from 10% to 25%. Classroom level predictors were most effective in predicting performance goal structure and teacher respect and least effective in predicting mastery goal structure. Given the similarity of variance accounted for at the student level, it appears that students’ personal goal orientations are important indicators of how students will rate the teacher and the classroom goal structure.

5. Discussion

The aim of this article was to demonstrate the usefulness of HLM for modeling classroom context constructs based on aggregate student reports. We specifically applied this technique to determine the reliability and validity of aggregate measures of classroom goal structures, which are frequently used in achievement motivation research. In comparison to traditional psychometric approaches, the HLM measurement model allows for the computation of level 2 (in this case student) and level 3 (in this case classroom) latent ‘true’ scores. The model adequately handles the correlated error resulting from the nonindependent observations from multiple students reporting on the same classroom environment. Results from these analyses can then be used to understand the interrelations among constructs at each level and help to determine the extent to which various constructs overlap with one another at the student and classroom levels. They also afford us a more accurate estimate of the extent to which our measures of classroom level phenomena capture variance that is a function of the individual differences of students within the classroom versus actual perceived differences between classrooms.

5.1. HLM reliability

Examination of reliability coefficients at both the student and teacher levels illustrates the importance of using a hierarchical data analysis technique for nested data. Although many of the construct reliabilities at the student level were comparable when nesting was not considered, the reliabilities dropped for several of the constructs at the student level when nesting was considered, including the measure of classroom mastery goal structure. As such, when nested data are not treated accordingly, the researcher may overestimate how accurately he or she assesses the impact of individual student perceptions of mastery goals on various outcomes.

In addition, these results indicate that the use of aggregated student perceptions as classroom environment measures should be undertaken with caution. It is not safe to assume that adequate reliability, as measured by Cronbach’s α, will translate into a reliable aggregate, as is illustrated by our results showing differing reliability at the student and classroom levels. The use of an aggregate student perception measure at the classroom level in a regression model assumes a high inter-rater agreement of the observers. Although the strong classroom level reliabilities found in these data suggest that most of these constructs can be adequately aggregated within classrooms, classroom reliability is more strongly influenced by the number of students per classroom than by the number of scale items (Raudenbush et al., 1991). For example, in the present example, if the average number of students were reduced from 12 to 8 per classroom, the calculated reliability of the perceived mastery goal structure would drop to .68, assuming no change in the variances. In actuality, the reliability would most likely be even lower because the variance would also change. In the HLM goal structure studies reviewed here, the average classroom size ranged from 6 to 18 students per classroom; and while the present study found adequate reliability for an average sample size of 12, there are multiple studies whose classroom average was much lower. Thus, researchers should not assume that the reported classroom level reliability in one study will provide a basis for estimating the reliability in another study unless these studies have similar classroom sample sizes.
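The drop from 12 to 8 students can be checked approximately from figures reported in this article. The sketch below is a back-of-the-envelope reconstruction, not the authors' calculation: the variance components themselves are not reported, so relative values are inferred from the mastery goal structure intra-class correlation (.29) and its student-level reliability (.65), under the assumption that the latent classroom and student variances are scaled to sum to 1.

```python
def classroom_reliability(var_class, var_student, item_error, n_students):
    """Classroom-level reliability; item_error stands for var_item / p."""
    return var_class / (var_class + (var_student + item_error) / n_students)


# Reported values for mastery goal structure.
icc = 0.29                 # between-classroom share of latent variance
student_rel = 0.65         # HLM student-level reliability (Table 1)

# Assumption: latent variances scaled so that classroom + student = 1.
var_class = icc
var_student = 1.0 - icc
# Solve student_rel = var_student / (var_student + item_error) for item_error.
item_error = var_student * (1.0 - student_rel) / student_rel

rel_12 = classroom_reliability(var_class, var_student, item_error, n_students=12)
rel_8 = classroom_reliability(var_class, var_student, item_error, n_students=8)
```

Under these assumptions the reliability is about .76 with 12 students per classroom (close to the .77 in Table 1, with the gap plausibly due to rounding of the inputs) and about .68 with 8 students, in line with the figure given above.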

In a related issue, the sample size of published HLM studies has varied greatly and the reporting of power is not, at present, common in these studies. Although the conditional model was not the main focus of these analyses, if power were estimated for the model with 57 classrooms with an average size of 12 students, intra-class correlation of .20 and a medium effect size, the power is approaching an acceptable level (>.70). However, there are many HLM studies with fewer classrooms and smaller numbers of students per classroom. As the appearance of HLM analyses in educational research increases, these important issues of reliability and power need to be addressed and uniform reporting standards established.
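
One rough way to see why nested samples carry less information than their raw size suggests is the Kish design effect, 1 + (m - 1) * rho, where m is the average cluster size and rho is the intra-class correlation. The sketch below applies it to the figures quoted above (57 classrooms, 12 students each, intra-class correlation of .20). It is an approximation for equal-sized clusters, it is not the power calculation referred to in the text, and the function names are illustrative only.

```python
def design_effect(cluster_size, icc):
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Approximate number of independent observations the nested sample is worth."""
    total_n = n_clusters * cluster_size
    return total_n / design_effect(cluster_size, icc)


# Figures quoted in the text: 57 classrooms, 12 students each, ICC = .20.
deff = design_effect(12, 0.20)
n_eff = effective_sample_size(57, 12, 0.20)

print(f"design effect: {deff:.2f}")
print(f"nominal N: {57 * 12}, approximate effective N: {n_eff:.1f}")
```

With these inputs the 684 students behave, for student level estimates, roughly like a simple random sample of a little over 200, which illustrates why studies with fewer or smaller classrooms can be underpowered.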

5.2. Between versus within classroom variance

Researchers commonly report intra-class correlations and examine predictors of this between-class variance in classroom studies. In the current study, although there was substantial variance between classrooms on all of the classroom level variables, there was substantially (approximately three times) more variance within classrooms. This is not surprising given the assumption that within any classroom setting there is considerable variability in how the teacher is perceived (Church, Elliot, & Gable, 2001; Skinner & Belmont, 1993), suggesting that some of students’ perceptions are not only a function of the teacher’s behavior, but also a result of the students’ own personal histories and expectations. Longitudinal studies that examine both students and classrooms as units over time will help us to better understand the extent to which students’ view of the classroom can be altered by the behavior of the teacher (e.g., Anderman, 2003).
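
The relation between this variance ratio and the intra-class correlation can be made explicit. The sketch below uses the standard definition, between-class variance divided by total variance, with the between-class variance set to 1 as an arbitrary unit. Because "three times more" can be read as a within-to-between ratio of either 3:1 or 4:1, both readings are shown; neither is taken from the study's actual variance estimates.

```python
def intraclass_correlation(between_var, within_var):
    """Share of total variance that lies between classrooms."""
    return between_var / (between_var + within_var)


# Between-class variance fixed at 1.0 as an arbitrary unit (illustrative only).
icc_ratio_3 = intraclass_correlation(1.0, 3.0)  # within = 3 x between
icc_ratio_4 = intraclass_correlation(1.0, 4.0)  # within = 4 x between

print(f"within = 3 x between -> ICC = {icc_ratio_3:.2f}")
print(f"within = 4 x between -> ICC = {icc_ratio_4:.2f}")
```

Either reading leaves roughly three quarters or more of the variance in perceptions at the student level.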

5.3. Discriminant validity

Measures of perceived classroom goal structures have been found to correlate highly with other aspects of the classroom environment when these constructs are measured only at the individual level. As a result, some researchers question the uniqueness of mastery goal structures from other aspects of the classroom (Murdock et al., 2001; Roeser et al., 1996). Findings from our unconditional HLM model suggest that this lack of distinctiveness between these constructs is even greater when variables are modeled with nesting taken into account. At the student level, when the nested data structure was considered, mastery goal structures shared over 98% of the variance with two other indicators of classroom context. In the conditional model, not only were similar amounts of variance in the latent constructs predicted, but also the predictors were of the same direction and similar magnitude. As a whole, these data suggest that future work needs to be done on perhaps both the conceptual as well as operational definitions of classroom goal structures that are presumed to be determinants of student motivation. Some scholars do see this as a conceptual problem. Patrick (2004), for example, stated that goal structure studies have become “selective,” focusing too much on evaluative components of the construct and ignoring other more social elements of the classroom which are also hypothesized to contribute to the classroom goal structure. Others argue that the items commonly used to measure goal structures are not pure measures of goal structures but are instead confounded with student interpretation, and they suggest a redevelopment of these measures (Urdan, 2004). Whereas the purpose of this study was to demonstrate a statistical technique which could benefit educational psychologists while also shedding light on a commonly used goal structure measure, the results of our evaluation of the PALS measure of goal structure do support the views of these scholars who have called for a reexamination of the construct.

Our findings are limited, however, by the nature of the predictors that we used in the conditional model. All of the predictors used at the classroom level were based on teacher self-report. It would be useful to use an independent observer’s report of teacher attributes in order to attempt to better discriminate between teacher characteristics and goal structure measures. For example, in this study teachers’ self-efficacy and reported content area knowledge were not significant predictors of the classroom context as perceived by students; however, these variables could be biased by social desirability in teachers’ responses. An observer report of teacher strategy use or a standardized test score on content knowledge may provide more informative data for a model of this type. Future studies focusing on the construct validity of mastery goal structure variables are clearly needed.

Finally, we consider the impact of these measurement inconsistencies on our efforts to understand classroom effects on motivation. The impact that teachers have on students is extremely complex to conceptualize and to measure quantitatively. Student perceptions of the classroom are of utmost importance but complicated to incorporate into analyses. While no definitive answers can be reached with the preliminary exploratory analyses undertaken here, this study does bring to attention some of the potential explanations for the mixed and sometimes even counterintuitive results of goal theory studies. For example, while the aggregate student perception measure was found to be reliable in this study with an average classroom size of 12 students sampled, we know that several of the studies reviewed prior to conducting this study used much smaller samples. The reliability of their aggregate measures is certainly drawn into question and may explain why the goal structure measure sometimes predicts student outcomes while, in other similarly constructed studies, the construct is a nonsignificant predictor of the same outcome.

6. Conclusion

The multilevel measurement model used here has allowed an initial exploration of classroom climate constructs that is new to achievement motivation research. With this approach we have been able to: (1) examine the psychometric properties of goal structure classroom environment measures at both the student and the classroom level without a unit of analysis problem; (2) partition the variance in each of these constructs into classroom level and student level components; and finally (3) examine the dimensionality of these constructs, investigating the similarities and differences of classroom goal structure measures and measures of teacher characteristics. Although this study approaches motivation through the lens of goal theory, this is just one of the arenas in which these aggregated measures of context are used, and thus the implications from this study extend to other areas of educational psychology.

References

Ames, C. (1992). Classrooms: goals, structures, and student motivation. Journal of Educational Psychology, 84, 261–271.

Ames, C., & Archer, J. (1988). Achievement goals in the classroom: students’ learning strategies and motivation processes. Journal of Educational Psychology, 80, 260–267.

Anderman, E. M., Eccles, J. S., Yoon, K. S., Roeser, R., Wigfield, A., & Blumenfeld, P. (2001). Learning to value mathematics and reading: relations to mastery and performance-oriented instructional practices. Contemporary Educational Psychology, 26, 76–95.

Anderman, E. M., Griesinger, T., & Westerfield, G. (1998). Motivation and cheating during early adolescence. Journal of Educational Psychology, 90, 84–93.

Anderman, E. M., & Young, A. J. (1994). Motivation and strategy use in science: individual differences and classroom effects. Journal of Research in Science Teaching, 31, 811–831.

Anderman, L. H. (1999). Classroom goal orientation, school belonging and social goals as predictors of students’ positive and negative affect following the transition to middle school. Journal of Research and Development in Education, 32, 89–103.

Anderman, L. H. (2003). Academic and social perceptions as predictors of change in middle school students’ sense of school belonging. Journal of Experimental Education, 72(1), 5–22.

Cheong, Y. F., & Raudenbush, S. W. (2000). Measurement and structural models for children’s problem behaviors. Psychological Methods, 5, 477–495.

Church, M. A., Elliot, A. J., & Gable, S. L. (2001). Perceptions of classroom environment, achievement goals, and achievement outcomes. Journal of Educational Psychology, 93, 43–54.

Dweck, C. S., & Leggett, E. L. (1988). A social cognitive approach to motivation and personality. Psychological Review, 95, 256–273.

Eccles, J. S., Lord, S., & Midgley, C. (1991). What are we doing to early adolescents? The impact of educational contexts on early adolescents. American Journal of Education, 99, 521–542.

Elliott, E. S., & Dweck, C. S. (1988). Goals: an approach to motivation and achievement. Journal of Personality and Social Psychology, 54, 5–12.

Gibson, S., & Dembo, M. H. (1984). Teacher efficacy: a construct validation. Journal of Educational Psychology, 76, 569–582.

Gorham, J., & Christophel, D. (1992). Students’ perceptions of teachers as motivating and demotivating factors in college classes. Communications Quarterly, 40, 239–252.

Hox, J. (2002). Multilevel analysis: Techniques and applications. Mahwah, New Jersey: Lawrence Erlbaum.

Jussim, L., & Eccles, J. S. (1992). Teacher expectations II: Construction and reflection of student achievement. Journal of Personality and Social Psychology, 63, 947–961.

Kaplan, A., Gheen, M., & Midgley, C. (2002). Classroom goal structure and student disruptive behaviour. British Journal of Educational Psychology, 72, 191–212.

Kaplan, A., & Maehr, M. L. (1999). Achievement goals and student well-being. Contemporary Educational Psychology, 24, 330–358.

Kaplan, A., & Midgley, C. (1999). The relationship between perceptions of the classroom goal structure and early adolescents’ affect in school: the mediating role of coping strategies. Learning and Individual Differences, 11, 187–212.

Karabenick, S. A. (2004). Perceived achievement goal structure and college student help seeking. Journal of Educational Psychology, 96, 569–581.

Midgley, C., Anderman, E., & Hicks, L. (1995). Differences between elementary and middle school teachers and students: a goal theory approach. Journal of Early Adolescence, 15, 90–113.

Midgley, C., Feldlaufer, H., & Eccles, J. S. (1989). Change in teacher efficacy and student self- and task-related beliefs in mathematics during the transition to junior high school. Journal of Educational Psychology, 81, 247–258.

Midgley, C., Maehr, M. L., Hruada, L. Z., Anderman, E., Anderman, L., & Freeman, K. E. (2000). Patterns of Adaptive Learning Survey (PALS) Manual. Ann Arbor: University of Michigan.

Miller, A. D. (2006). Teacher-student relationships in classroom motivation: A critical review of goal structures. Washington DC: Paper presented at the meeting of the American Psychological Association.

Murdock, T. B., Briggs, W., & Olson, E. A. (2002). Instructional and motivational predictors of effort and achievement among undergraduate college students. Unpublished manuscript.

Murdock, T. B., Hale, N. M., & Weber, M. J. (2001). Predictors of cheating among early adolescents: academic and social motivations. Contemporary Educational Psychology, 26, 96–115.

Nolen, S. B., & Haladyna, T. M. (1990). Personal and environmental influences on students’ beliefs about effective study strategies. Contemporary Educational Psychology, 15, 116–130.

Patrick, H. (2004). Re-examining classroom mastery goal structure. In P. R. Pintrich & M. L. Maehr (Eds.). Motivating students, improving schools: The legacy of Carol Midgley (vol. 13, pp. 233–264). New York: Elsevier.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Application and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage.

Raudenbush, S. W., Rowan, B., & Kang, S. J. (1991). A multilevel, multivariate model for studying school climate with estimation via the EM algorithm and application to U.S. high-school data. Journal of Educational Statistics, 16, 295–330.

Raudenbush, S. W., & Sampson, R. J. (1999). Econometrics: toward a science of assessing ecological settings, with application to the systematic social observation of neighborhoods. Sociological Methodology, 29, 1–41.

Roeser, R. W., & Eccles, J. S. (1998). Adolescents’ perceptions of middle school: Relation to longitudinal changes in academic and psychological adjustment. Journal of Research on Adolescence, 8, 123–158.

Roeser, R. W., Midgley, C., & Urdan, T. C. (1996). Perceptions of the school psychological environment and early adolescents’ psychological and behavioral functioning in school: the mediating role of goals and belonging. Journal of Educational Psychology, 90, 408–422.

Ryan, A. M., Gheen, M. H., & Midgley, C. (1998). Why do some students avoid asking for help? An examination of the interplay among students’ academic efficacy, teachers’ social-emotional role, and the classroom goal structure. Journal of Educational Psychology, 90, 528–535.

Skinner, E. A., & Belmont, M. J. (1993). Motivation in the classroom: reciprocal effects of teacher behavior and student engagement across the school year. Journal of Educational Psychology, 85, 571–581.

Turner, J. C., Midgley, C., Meyer, D. K., Gheen, M., Anderman, E. M., Kang, Y., et al. (2002). The classroom environment and students’ reports of avoidance strategies in mathematics: a multimethod study. Journal of Educational Psychology, 94, 88–106.

Urdan, T. (2004). Using multiple methods to assess students’ perceptions of classroom goal structures. European Psychologist, 9, 222–231.

Urdan, T., Midgley, C., & Anderman, E. M. (1998). The role of classroom goal structure in students’ use of self-handicapping strategies. American Educational Research Journal, 35, 101–122.

Wolters, C. A. (2004). Advancing achievement goal theory: using goal structures and goal orientations to predict students’ motivation, cognition, and achievement. Journal of Educational Psychology, 96, 236–250.

Woolfolk, A. E., & Hoy, W. K. (1990). Prospective teachers’ sense of efficacy and beliefs about control. Journal of Educational Psychology, 82, 81–91.

Young, A. J. (1997). I think, therefore I’m motivated: the relations among cognitive strategy use, motivational orientation and classroom perceptions over time. Learning and Individual Differences, 9, 249–283.
