EDRS Price MF-$0.75 HC$8.05 - CiteSeerX

DOCUMENT RESUME

ED 044 371 SP 004 390

AUTHORTITLE

INSTITUTION

REPORT NOPUB DATENOTE

EDRS PRICEDESCRIPTORS

Morsh, Joseph E.; Wilder, Eleanor. W.Identifying the Effective Instructor: A Review ofthe Quantitative Studies. 1900-1952.Air Force Personnel and Training Research Center,Chanute AFB, Ill.AFPTRC-TR-54-445,'

159p.

EDRS Price MF-$0.75 HC$8.05*Educational Res, arch, *Effective Teaching,*Evaluation Criteria, *Teacher Characteristics,*Teacher Evaluation

ABSTRACTThis research review contains summary and synthesis

of 360 references selected from over 900 in 1) Education Index, 2)Psychological Abstracts, and 3) some 40 reviews and bibliograpriies,28 of which were selected for inclusion in the 392-item bibliographyat the end of this review. Principal findings of the citedquantitative research studies are summarized in the introductorysection. Concluding implications for further research, presented as aguide in Air Force technical training research projects, are alsoexpected to assist other investigators in the field. The descriptionof research studies and tabular material are presentedchronologically (1900-1952) under each topic heading. Topics underthe major heading of "Criteria for Instructor Effectiveness" arerating the effectiveness of instructors, administrator rating, peerrating, student rating, self-rating, objective observation ofperformance, and student change as a measure. Topics under "ThePredictors--Traits and Qualities Assumed to be Related to InstructorEffectiveness" ate intelligence, education, scholarship, age andexperience, knowledge of subject tatter and present professionalinformation and teacher examination scores, extracurricularactivities and genera' culture test scores, socioeconomic status andsex and marital status, teaching aptitude and attitude towardteaching and interest, voice and speech characteristics, photograph,statistical analyses of abilities, personality studies area tests. (JS)

I

rI.

RESEARCH

uttetinAPPTRC-TR-54.44

Identifying the Effective Instructor: A ReviewOf the Quantitative Studies, 1900-1952

By Joseph E. Marsh

And Eleanor W. Wilder

A

D4 PAIllmtht Of motAttM. /DuCATKI441**01.1111

Oh Pet OF 10vC411,4DOCANt*/ Dki DUN 144100m,440410CW, 4.8 DtarolD Moo till PelsOri out014CAINAtrOD 0I1GA4l1 fva fl Poghti OF*Ity 0* 04*,00** VA D DO 140i 14 MSU"' DIPME4rt ti0At &Mt Of 104S/CAllOof PO*0011: *WC,

Pyfinte4 Iw. 100

qzo. AIR FORCE PERSONNEL & TRAINING RESEARCH CENTERes) tAtIttAND Alk POW IASI IAN ANtONI0 1111A%

O

Cl.

HIAOCK/AltigH

AIR FORCE PERSONNEL AND TRAINING RESEARCH CENTER

Atr Research and Oevelopmenr Commend

Wetland Air Forte lose,

Col. Herbert N. CowtCermoonder

Dr. Arthin W. Melton

Irethrittel Dia est

Dr. Chart.' W. liferNW?.hitch 4.41 Devol*polgehl

Dr. Henry S. OdberI

Nit.,tfw11 it tel infermsren

ein RESEARCH OCIOSIt it 5474411,14lietin A ArPTRC-T11.54-44

The armed setv!ces new require that large numbers slpersennel he rapidlytrained to a high degree of technical competence in the operation andmaintenance of a variety of complex electronic and mechanical equipment.The rapidity and efkctiveness of training test In large PreASSite 111 the

ClUslity of instruction. Hence considerable Importance Is *seamed by theproblems of locating individuals /soon* militry personnel who may betrained to become effective ihstruttors and of evaluating instructors onthe Jon. As a first step, the criteria suitable for determining instructorproficiency must be ascertained, and the factors employed to preCct inssttuctor effectiveness evaluated. In this Research bulletin over threehundred sixty civilian studies which :elate to the evaluation and predictHon of instructor proficiency are reviewed and interpreted.

Or. Joseph E. Much Is a Realm at the Personae! Research laherateryof this Centtt. Mrs. Elesnor W. Wilder was formerly a *ember of theTraining lids Research Lebnistory. The survey of studies was made aspart of the regular research of the Training Aids Research Laboratory.

t-4

1- ILENTIFYING THE EFFECTIVE INSTRUCTeR:

tr\ A REIEW OF THE QUANTITATIVE STUDIES

04 1900-1957

OO

Project No, 7114Isek No. 772h3

By Joseph r. MorahAnd Eleanor W. Wilder

Training i.ide Research LaboratoryIORCE RERSONNEL TRQUING REWRCH CENTERAir Research and Development Cc*wrdChanute Force Ease, Illinois

Approved by:Richard W. Faublon, Col, USAF, DirectorArthur A, Lumsdaine, Technical Directcs.

Training tide Research Laboratory

4

ACKNCWLEDOWNTS

This review is the outgrowth of a working bibliography which was assem-

bled in connection with Human Resources Research Center Project 507-010-O005,

"The Identificaticn of the Charnoteriatice end Teaching Procedures of Suc-

cessful Technical School Instri.zetor." The authors are grateful to Dr. George

B. ;Anon and to Dr. Robert A. Swenson who read the manuscript and offered

many constructive criticisma. Special acknowledgment is due to Mr, James R.

Berkshire who worked closely with the senior author inthe,preparation of the

manuscript. Mrs, Katherine M. Zawadke supervised the (letting up of the ta-

bles,

if

TABLE OF CONTENTS

Page

List of Tables iv

Introduction 1

Principal Findings of Cited Research Studies 2

Criteria 2

Predictors 5

Criteria of Instructor Effectiveness 7

Rating the Effectiveness of Instructors 10iiministrative Rrting of instructor Effectiveness 15:eer Retin6 of Instructor Effectiveness 23Student Rating of Instructor Effectiveness 273elf-Reting of Instructor Effectiveness 40Objective Observation of Instructor Performance 42

'ItInge is a Measure of Instructor Effectiveness 50

The PredictorsTraits and Qualities Assumed to be Relatedto Instructor Effectiveness 59

Intelligence as Related to Instructor Effectiveness 60Education 83 Related to /hstr.actor Effectiveness 66Scholarship as Related to Instruetor Effectiveness 701,ge and Experience as Related to Instructor Effectivon.98 791-nevlet-Ige of Subject Matter, Present Professional Information,

and Teacher Examination Scores as Related to InstructorEffectiveness 84

Extracurricular Activities and General Culturc Test Scoresversus Instructor Effectiveness 87

Socioeconomic Status, Sex, and Marital Status versus Instructor:.:fez:tiventse 91

The Relation of Teaching Aptitude, Attitude 'reward Teaching,' endInterest to Instructor Effectiveness 96

The Relation of Voice and Speech Characteristics to InstructorEffectiveness 101

The Photograph as a Predictor of Instructor Effectiveness 104

Statistical Analyses of Instructor Abilities 105Opinion Studies of the Personality Characteristics of Effective

and Ineffective Instructors 108Personality Testa of Teachers 114

Implications for Further Research 118

iii

Criterion Researa

Table of Contents (Cont.)

Predictor Reseatph

Bibliography

Revibwe and Bibliographies

Page

119122

125

149

LIST tYF TABLES

Table Page

I Reliability of Administrative Ratihg of Instructors 36

2 Correlatioh of Winidtrative Rating with Other Measuresof Instructor Effootiveneee 19

3 CcTreletdone Between liatinga a Teacher Charecteriatice a 22

4 Reliability of Peer Rating of Instructora ,,, 25

5 Correlation of Peer Rating with Other Measures ofInstructor Effestiveneec 26

6 Reliability of Student Rating of Instructora 29

7 Cbrtelateon of Student Rating with Other Meaeures ofInstructor Effectiveness 32

8 intercorrelations by Trait of Student gating ofInstructors 33

9 Correlation 01Grades Received by Studeni,6 with TheirRating of Their Instructora 35

10 Relationship df Teacher Factors to Student Rating . 36

11 Relationship of Student Factors to Student Rating 38

Z2 Relationship of Selfasting to Other Measures of InstructorEffectiveness 41

iv

List of Tables (Cont.)

Table

13 Reliability of Various Methods of Observing Teaching

Page

Effectiveness 41

14 Reliability of Measures of Student Gain 56

15 Correlation of Measures of Student Gain with OtherMeasures of Teacher Effectiveness 58

16 Correlations Between A.C.E. Psychological Examinationand Various Measures of Teacher Effectiveness 62

17 Correlations Between Various Psychological Examinatonsand Measures of Teacher Effectiveness for Groups ofTeachers of 90 or gore 64

le Relation of Education to Instructor Effectiveness . I 67

19 NwrItione: Qualifications of "Best," White, HighSchool Teachers 69

20 Relation of Practice Teaching Grades or Ratings toScholarship 72

21 Relaion of Practice Teaching Grades or Ratings toTeaching Effectiveness in the Field 74

7...-leticn of Scholarship to Teaching Effectiveness inthe Field 76

23 Age and Experience as Related to Teaching Effectiveness . 80

24 Teaching Experience of Military end Civilian instructorsin Air Force Technical Schools

25 Relation of Scores on SubjectMatter Testa to Measuresof Instructor Effectiveness 85

Relation of Scores on Professional Information Teststo Measures of Instructor Effrotivences 86

27 Relation of Exttacurricular Activities to InstructorEffectiveness 89

28 Relation of Scores on the Cooperative Gentre.1 CultureTest to Heasures of Instructor Effectiveless . . . 91

List of Tables (Cont.)

Table Page

29 Sex of Instructor as Related to Instluctor Efftotivenecs , . 94

30 Relation of Scores or. Measures of Teaching Aptitude toTeaching Effectiveness 97

31 Relation of Interest Test Scores to Teaching Effectiveness . 4 100

32 Relation of Ratings of Voice and Teaching Ability 102

33 Opinion Studies of Traits, Qualities, and Characteristicsof Successful Teachers 109

34 The Five Most and the Five Least Important of 46 TeacherTraits as Ranked by Four Groups of Judges 110

35 Relation of Personality Measures to Measures of InstructorEffectiveness 115

36 Relation of Social Adjustment Measures to Measures ofInstructor Effectiveness 118

vi

IDENTIFYING THE EFFECTIVE INSTRUCTOR:A REVIEW OF THE QUANTITATIVE STUDIES

1900-1952

INTRODUCTION

The equipments of modern warfare are highly technical. Successfulprosecution of a war demands that thousands of young men be able to main-tain and operate electronic and mechanical. devices that are oftenextremPlycomplex. Since these men, upon induction, do not have the skills and knowl-edges necessary to such tasks, the armed forces are required to establisheubstenlel training programs aimed at making satiefactory technicians outof raw rec:uits.

Fast and effective training requires at its core skilled instruction.The problem of how to select personnel who can successfully acaompliah thisaccelerated instruelonal job is thus crucial to the armed forces. Methods

of training these potential instructors most rapidly and efficiently mustalso be devellped. Research in the area of selection and training of in-structors has, therefore, very high probability of payoff in terms of a moreefficient military organization. The first step would appear to be that ofdetermining wht:t is now known concerning the problems involved.

While the research literature wee ?sing surveyed as background materialit became apparent that a summary of the findings of the quantitative studieshad potential value for anyone concerned with instructor selection and train-ing problem, not only in the Air Force, but also in the other services andin civilian institutions, schools, and cottages. With these wider implica-

tions in mind, r comprehensive and critical review of pertinent research re-ports has been prepared.

Over the pelt fifty years a considerable literature has been built upconcerning the problems associated with teacher effectiveness. Many of thearticles that have appeared merely reflect expressions of opinion in theform of "armehab ' analyses of teaching. Others, often written by the orig-inal investigators, deal with theoretical coLsideretions arising out of re-search studiee. Undoubtedly many of these general discussions are.vorthl ofattention. Inasmuch as the more pregnant theoretical implications usuallyform en integral part of reports of actual research investigations, it was .decided to include in this review only those studies that involved a quanti-tative attack on problem concerned with teaching effectiveness. Some ex-ception was made in the ease of a few of the most recent theoreti.al discus-siooa by leading investigators in the field, Limiting the scope of thereview ih this manner reduces the bulk of material to be handled withoutseriously limiting the analyses of the problems of assessing teaching effec-tiveness or neglecting the prowess that has been made in solving.theseproblems.

In the search for quantitative studies over 900 references were examined.Of these, over 360 were abstracted for inclusion in the review. To obtainthese references the following sources were used Educational Index, Ely-choloecal Abstracte, and some 40 reviews and bibliographies, including thecomprehensive Domas-Tiedeman (380) bibliography. A selected list of 28 ofthese reviews and bibliographies is included with the references accompany-ing this report. While no assurance can be given that all important researchon instructor effectiveness has been covered, the reviewers had availablethe extensive facilities of the library of the University of Illinois as wellas other sources of information.

Findings are presented as given in the original reports, even though insome cases the research designs are obviously faulty, or insufficient num-bers of subjects have been used to allow statistically significant generali-zations to be drawn. The discussions of research studies and the tabularmaterial are presented chronologically under each topio heading, except ina few instances where some specific feature of the investigations is empha-sized (e.g., in Table 30 order of presentation is chronological for eachtest). The chronological order mables the reader to judge results in termsof the tendency in later work to use more precise statistical methods, im-proved research designs, and to report more meticulously the conditionsunder which an experiment was conducted.

An attempt has been made to include in the tables all information con-sidered necessary for interpretation of results. In the column describingthe samples used in the various studies, besides the size of the sample,level of teaching position is stated wherever known. Other data on whicha sample was selected are also given, such as: the sample was a dichotomousone of good-poor teachers, or, it was composed of only inexperienced teach-ers. In cases where this additional information is not included, it may beassumed that the sample was indeterminate except for the particular variablecited.

Fromtheerrays of results that have been assembled, the reviewers haveset down what appearad in their opinion to be the moat probable generaliza-tions arising from the data and have drawn certain conclusions from these toserve as a guide in Air Force technical training research projects. It isanticipated that these facts end conclusions may also assist other investi-gators in research planning in this field,

PRINCIPAL FINDINGS OF CITED RESEARCH STUDIES

Criteria

The main findings of the quantitative studies reviewed in tile presentreport will be summarized.

2

Surveys of Hating devices. Surveys of appointment blanks and rating

scales in use have failed to provide means for identifying the significant

items to be used in setting up instructor rating devices. The most fre-

quently mentioned qualities on existing teacher appointment blanks are abil-

ity to discipline; ability to teach, scholarship) and personality, There

is no general agreement as to what constitutes the essential characteristicsof a competent teacher. Similarly; items on present rating scales tend tobe subjective, undefined) and varied, there being no consistency as to what

traits a supervisor might be expected to observe and evaluate,

Administrative ratings. Administrative over-all opinion constitutes

the most widely used measure of instructional competence. Available studies

show in general that teachers can be reliably rated by administrative andsupervisory personnel (usually with r's of .70 or above), For the most part)

administrative ratings do not produce very high correlations with measuresof student gain. Intercorrelatienn of rated traits or categories appear togive evidence that traits which ,re more objectively observable or are moreindependent of opinion tend t be Jess prone to logical error or halo effectthan are those traits which arc more intangible an hence more subjectivelyestimated. The implication seems clear that by and large ratings made bythe same person are apt to be contaminated by halo and that in many such in-stances a single rating of over-all effectiveness may be as useful as anevaluation based on a composite of a number of ratings of separate traits.

Peer ratings, Peer ratings have been little used, For administrativepurposes they are probably not too useful since teachers have certain mis-

givings about passing judgment on fellow teachers. From a research stand-point in using peer opinion, ranks 1/111 probably give better results thanratings. There is considerable agreement between supervisors and fellow in-structors in ratings of instructors. As in the case of administrative rat-ings, considerable corrfiation is found among ratings given different traitsby the same peer raters. That is, halo influences peer ratings just as it

does administrative ratings.

Student 'ratings. The use of student ratings of instructor effectivenessappears to be growing. Such ratings tend to show fair consistency, theirreliability, as with other ratings, increasing with the number of ratingspooled in fairly good accordance with the Spearman-Brown formula. When stu-dent ratings have been compared with other measures of instructor effective-ness) rather diverse results have been found depending in part upon the cri-teria employed. Considerable halo effect is usually found when studentsrate their instructors on several traits. Whether or not grades received bystudents affect their ratings apparently depends upon the instructional sit-uation, Results may indicate that if the instructor favors the brighterstudents he will be approved by them and a positive correlation between stu-dent ratings and grades will result. If he teaches for the weaker studentshe will be disapproved by the brighter students and a negative coefficientwill be obtained. By and large such factoro as diz3 of class, sex of stu-dents, age or maturity of .students, and intelligance or mental age of stu-dents seem to have little bearing on student ratings. Research has been too

3 POOR ORIGINAL COPY - BEST

AVAILABLE AT TIME FILMED

sporadic and results too inconclusive to allow generalizations to be made

concerning the influence on student ratings of other factors such as age

and sex of teaAer) length of students' acquaintance with the teacher)

length of time teacher has taught in the school or taught a student, pleas-

urable personal relationships between student and teacher) end whether or

not subject taught by rated teacher is students' favorite subject. There

is considerable expressed opinion but little redearoh evidence that student

ratings will ,onixibute to instructor improvement or could be used to im-

prove supervisory ratings.

Self-ratinga. While there is some tendency for instructors to overrate

themselveo) self-ratings show negligible relationship with administrativeratings) student ratings) or measures of student gains, On the basis of the

few available studies of self-ratings of instructors, the obvious, undis-

guised self-rating technique would seem to offer little encou.7agement for

evaluative or research purposes.

§2ptemetic observations. Systematic observation techniques to deter-

mine differences in performance of effective and ineffective instructorshave been largely neglected in research in the instructor area, Most of

the 'observations made have been dependent upon the subjective judgment, of

the observer. In general, the reliability of planned observational record-ing compares favorably with other methods of instructor evaluation. The

most general criterion of validity of observation has been face validity.No single) specific, observable teacher act has yet been found whose fre-quency or per cent of occurrence is invariably significantly correlated with

student achievement. There seems to be some suggestion) however) that ques-tions based on student interest end experience rather than assigned subjectmatter) the extent to which the instructor challenges students to supportideas) and the amount of spontaneous student discussion may be related to

student gains. Apparently there are no optimum time expenditures for par-ticular class activities; a good instructor may function successfully with-in a wide range of time expenditures. A factor analysis of a number ofinstructor and student behaviors resulted in three factors: (a) understand-ing) friendliness) and responsiveness on the part of the instructor) (b)systematic and responsible instructor behavior) and (c) the instructors'stimulating and original behavior.

Student gains. Of the several methods used to measure student change)residual student gain, that is, the di5:ferencebetwe.en actual gain and pre-dicted gain) is becoming ziore widely used as a criterion of instructor effec-

tiveness. With all its difficulties it appears to offer one of the bestcriteria thus far used. As compared with commonly reported test reliabilitycoefficients those obtained in gains studies have been low. The great die-.

crepancies in the findings of investigators who have examined the studentgains criterion emphasize the extreme variability in relationship with othercriteria used to indicate instructor ability. Within the 1Wts of meas-ures so far used, the relationship between administrative opinion of an

4

instructor's competence and the amount of subject matter that the.instruc-tors will impart to his students cannot be predicted.

Predictors

Intelligence. Whether or not intelligence is an important variablein the success of the instructor apparently depends upon the situation.In general there appears to be only a slight relationship between intelli-gence and rated success of an instructor. Correlation coefficients forhigh school teachers tend to be somewhat higher and somewhat less variablethan those reported for elementary teachers. For all practical purposes,however) this variable appears to be of little value as a single predictorof rated instructor competence.

Education. Considered as a group, the investigations of semester hoursor years of education as relntcd to instructor efficiency have indicatedthat any relationship that L''y exist is slight. Beyond certain more or lessobvious knowledge requirements, greater or lesser education of a teacher intends of courses or semester hours seems to be unimportant in discriminat-ing between good are5 poor teachers.

Scholarship. Implications of studies reviewed with respect to scholar-ship are quite clear. Grades a student will obtain in a practice teachingcourse may to some extent be predicted by the grades that student obtainedin college. Accurate prediction of success in practice teaching, however,cannot be made on the basis of an individual's scholastic record in highschool. Almost all available studies report low positive correlation co-efficients between measures of on-the-job performance of teachers and ear-lier scholarship as reflected in over all achievement in high school orcollege, or in standing obtained in specific college courses (includingpractical teaching courses). There appears to be some relationship, butit is small. No investigator has shown that the attainment of a particularstanding in high school or college or the mastery of any single course orgroup of courses is essential to teaching competence. The positive corre-lation coefficients usually found probably reflect primarily the relation-ship of general intelligence to both academic and teaching success.

,26211911x1prience.increases at first ratherto five years or beyond.may show little change inyears, after which, as In

It appears that a teacher's rated effectiveness:7apidly with experience and then more slowly upThere is then a levelling off, and the teacherre.,ed performance for the next fifteen or twentymost occupations) there tends to be a decline.

Knowledge of subject matter. Whether or not knowledge of subject mat-ter is related to instructor competence seems to be a function of the par-ticular teaching situation. Some studies suggest that too much knowledgeon the part of the teacher may result in teaching "over the heads" of stu-dents.

5

Professional information, Scores on tests of professional informationappear to bear some slight relationship to supervisory ratings or rankingsof instructor competence, ContraAtotory results have been obtained, however).

when suoh scores are correlated with pupil gain.

Extracurricular activities, In general, investigators have found lowpositive relationship between an individual's participation as a student inextracurricular activities and his later instructor effectiveness.

General culture, Studies reviewed appear to indicate that the relationof Cooperative General Culture Telt scores to instructor effectiveness dif-fers little from those reported fo: other subject matter tests.

Socioeconomic, status. Studies of the relationship of soCioeoonomiostatus (ad measured by such devices as the Sims SocioZconomic Scales) tocriteria of instructor'effectiveness show little) unless it is that thosefrom higher status groups have greater probabilities of success in lifethan those less fortunate.

Sex. No particular differences have been shown when the relative ef-feotiveness of men and women teachers has been compared.

Marital status. Despite some prejudice to the contrary there appearsto'be no evidence that married teachers are in any way inferior to unmarriedteachers.

Teaching aptitude. Results obtained from measures designed toprediotteachingability show great disparity, Data thUs far available either failto establish the existence of any specific aptitude for teaohing with anydegree of certainty or indicate that tests used were inappropriate to itsmeasurement.

Teachin3 attitude. Attitude toward teachers and teaching as indicatedby the Yeag'Scale devised for its measurement seems to bear a small.butpositive relationship to teacher success measured in terms of pupil gains.

Intereet in teaching, In most of the studies reviewed) interest:inteaching 1,73 measured by interest teat scores which indicated similarity ofinterest of teachers and persons undergoing the interest teat,. 'Correlationsresulting from the use of several standard interest tests either clusteraround zero or are so inconsistent as to render such tests of rather doubt-ful value as predictors of teaching success, The common factors that werefound thrcugh factor analyses to underlie the reasons given for chooding theteachili, profession are perhaps provocative of further research but werebased on too few cases to justify any clear-cut interpretation.

Voice and speech characteristics. On the basis of studies reviewed)in generfl, it appears that the quality of the teficher's voice is not con-sidered .;oo important by school administrators, teachers, or students. Ia

6

one study, however, certain speech factore were found to be correlated sig-

nificantly with student gains and with effectiveness ratings of supervisors,The intercorrelations of the speech factors, however, were so high thatgeneral speech ability based on a single factor is probably as useful as acomposite of judgments based on several speec.factors,

The photograph. Studies of the use of the photograph as 'a predictorof instructor effectiveness have failed to demonstrate that photographshave any predictive value.

Statistical analyses of instructor abilities. Such instructor factorsas empathy) professional maturity, general knowledge) mental ability) socialadjustment, and the like have been identified through factor analyses byvarious investigators. The statistical analyses so far reported, however)suffer from inadequacies of criteria, testing instruments, or number ofcases,

Opinion studies of instructor personality characteristics. The attemptsmade to'identify characteristics of successful and unsuccessful instructorsby making lists of traits tiased on opinion appear larggly sterile in termsof usability for evaluation or selective purposes.

Causes of teacher failure. In most of the studies of unsuccessfulteachLrs poor maintenance of discipline and lack of cooperation tend to befound as the chief causes of failure. Heald, educational background, train-ing, age).and knowledge of subject matter, on the other hand) appear to berelatively unimportant factors in terms of teacher failure,

Personality tests. Results obtained with personality tests of teach-ers have shown wide variation when correlated witli other measures. SomeBO-called personality tests appear to show significant correlations withcertain measures of instructor effectiveness. Until carefully controlled,well-designed studie3 employing adequate numbers of instructors have beenmade, however, the -,droblem of determining the personality patterns of ef-fectiveteachers must still remain unsolved.

CRITERIA OF INSTRUCTOR EFFECTIVENESS

By common definition a criterion is any standard used for judging. For

the scientist, however) such a definition is inadequate. A criterion whichis to be used for scientific judgments cannot be just any standard. Itshould be the beot possible standard for the particular class of judgmentsthat are to be made. This means that the scientist must be able to justifyhis choice of a criterion by demonstrating its logical relevance to the prob-lem at hand and by showing that it possesses measurement characteristicswhich are technically adequate.

So long as the investigator restricts his research to laboratory stud-ies the establishment of a justifiable criterion usually presents no greatdifficulties, A criterion for memory, for instance, may be the recitationwithout error of a list of nonsense syllables, or the criterion of learningmay be a specified minimum of blind alleys a rat enters while traversing a

maze, The moment research is moved into less rigidly controlled life situa-tions, however, the investigator is confronted with criterion problems whichare seldom simple and often impossible of completely adequate solution. The

determination of a scientifically justifiable criterion of instructor ef-fectiveness presents such problems.

Every educational system and every training program has certain goals.The first requirement for choosing a criterion of instructor effective:Jessie that these goals be defined, The measure of a particular teacher's ef-fectiveness is then the extent to which that teacher facilitates the stu-dents' progress toward these goals, Since in any system there are usuallyseveral educational goals, a measure appropriate to each goal is indicated,The construction of a single, over-all criterion of instructor effectivenesswould require that these various measures should be weighted. into this cri-terion in accordance with supportable value judsmente as to their relativeimportance.

Obviously, the fulfilling of the requirements for such a criterion ofinstructor effectiveness is a large order. The comparative student changesthat would require measurement in certain educational systems, or at cer-tain stages in a particular curriculum) might quite defensibly include suchaspects as: changes in knowledges of.specifio subject matter, improved suc-cess in subsequent schooling) improved personal adjustment, or increasedsuccess in life. It is conceivable, also, that the effective teacher con-tributes to ohanges in other teachersi pupils through individual guidance;assistance in planning the school program) good influence on group morale)and the like, thus creatinq effects that cannot be. isolated. or ascribed toany one teacher.

In.the studies reviewed, the criterion, problems have been handled .withwidely varying degrees of sophistication, Measures found acceptable as ori-teria of instructor effectiveness by one investigator are often oonsideredas unvalidated potential predictors by others, In order to provide for com-perisonS among studies and-for appraisal of research progress, the reviewershave grouped together what appeared to them to be comparable studies, Thebasis for these groupings rests on the use by the investigators of similarcriteria, or where no measures appeared to merit designation as a criterion,of similar potential predictors.

The largest grouping covers studies in which ratings or rankings ofteachars have been used as criteria. Most commonly the reporting investi-gator does not deal explicitly with the problem of the relevance of suchcriteria to teacher effectiveness. In the opinion of the reviewers, if one

8

is concerned with teacher effectiveness as the changes brought about by ,theteacher in the teacher's own pupils, then ratings and rankings are less rele-vant than either measures of student change or controlled observations ofstudent behavior. Ratings are someone's estimate of the effects on studentsof those teacher characteristics the rate.. heppened to observe, and which

he deemed important. Without demonstration that these estimates have re-lationship to student achievement, they cannot really be considered as sat-isfactory substitutes for measures of pupil change, On the other hand, if

one ie considering that part of teacher effectivene(a which the teacher

contributes to the growth of all pupils by participation in the efforts ofthe educational group, then ratings or rankings woult seem to be somewhatmore relevant, In this latter case the influence of the teacher is a func-tion of the quality of the teacher's relations with students in general,with other teachers, supervisors, and the community. Differential effec-

tiveness is a matter of differential contribution totee over-all goals ofthe school or educational system, Since such contribt .ion is almost in-

evitably in a cooperative setting, and since its effec:e are diffuse and(almost certainly) unmeasurable, there would appear to be logical justifi-cationfor an attempt to get estimates of effectiveness in this area by theuse of ratings or rankings obtained from other people in the educational

situation.

Another section covers studies in which observational measures ofteacher performance have been used, It is plausible that changes in stu-dents should be related to whet+ the tealher does and how he does it. Fur-

thermore, it seems reasonable that careful and objective observation ofthe teacher's behavior in the teaching situation could provide a measure ofthe teacher's effectiveness, A number of investigators have thus attemptedto achieve objectivity in a criterion by the use of observational measuresof teacher performance. However, before any method of objectively evaluat-ing effective performance on the part of a given teacher can become usefel:such method must be proved to be capable of measuring kinds of teacherhaviOr related to the type and amount of change the teacher produces inher pupils.

Studies that used measures of pupil change as a criterion are alsogrouped together. Granting that many of the pupil changes that would in-dicate a teacher's effectiveness are in behaviors that are not measurable,or at least have not jet been measured, there is at least one area in which

measurements have been made. This is the area of student changes in'knowledge of subject matter, While adherents of various educational philosophiesmight disagree as to the importance of changes in subject matter knowledgerelative to other kinds of desired changes, it seems probable that all wouldagree that such changes have some importance and that they are relevant to

the problem of teacher effectiveness.

The last section of the review covers other instructor or student vari-ables or measures that were included in the studies read. These the re-

viewers have classified as "possible or potential predictors" regardless ofwhat they were designated by the original authors. They are so classified

9

because many of them, if their correlation with an adequate criterion ofinstructor effectiveness could be demonstrated, would be useful in the se-leotion of personnel for teacher training or for assignment to teachingpositions. Within the section the various classes of potential predictorsor correlates are placed together to allow comparisons to be made and; wherepossible, conclusion's to be drawn.

Rating the Effectiveness of Instructors

An Appraisal of Instructor Rating Methods

In attempts to evaluate instructors systematically many kinds of ratingmethods have been used (361). In a great many of the studies reviewed; in-vestigators adopting rating as a criterion of teaching effectiveness haveaccepted the rating scale or method in use in a particular school situation.The types of rating scales which have been most favorably received by schooladministrators are the graphic; the check list, and to a lesser extent therank order or order of merit. Consequently, these scales account for nearlyall of the studies using rating as a criterion. In a few studies, however;the paired-cOmparison; critical- incidents, or forced-choice type of ratingscales have been used.

The reason for the varying degrees of popularity of the different typesof rating scales for administrative use is obvious. Ease of administrationplus assurance that the administrator can follow his subjective leaningsappear to have been the factors given the greatest weight in the choice ofa rating method.

Since the results obtained in rating teaching effectiveness depend inpart on the adequacy of the methods used, a brief appraisal of some of themore usual,methoda that have been applied to instructor rating seems appro-priate to the purposes of this review.

The graphic rating scale is simple, comprehensible; easy to administer;free from direct quantitative terms, and discriminates as finely as therater desires. It is also very susceptible to leniency effects,

The check list; on superficial appraisal; appears to be a simply con-structed device though it is cumbersome to administer. To achieve a tech-nically sound instrUment however; it is necessary to d& more than justcompile a collection of random statements. A thorough job analysis shouldbe undertaken and as with other rating methods; comparative evaluation mustbe made of the various behaviors to discover those elements which determinegood and poor instructors.

The rank-order technique while offering a simple means of evaluatinginstructors, lacks the popular appeal of the above two methods. From a

10

research point of view its chief drawbacks are that it does not indicatethe magnitude of the differences between persons rated nor does it indi-cate the differences between groups. This device has sometimes been usedto validate other methods.

Although some investigators have claimed that the paired-comparisontechnique tends to be more accurate than rank-order or rating-scale methods,it has lacked favor among administrators, first, because it is extremelytime consuming and laborious especially when used in rating large groupsand; second; because there is usually a very high correlation between pairedcomparisons and rankings. It is also somewhat more resistant to manipula-tion by the rater than rank-order or graphic rating scales. Some investi-gators have recommended this device as a criterion of validity againstwhich less rigorous methods of rating may be checked.

The search for more stable rating methods has led to the developmentof the critical-incidents and forced-choice techniques. Of these theforced-choice technique appears to be the more promising. The unique fea-ture of this technique is that it limits the rater's control of the finalresult of his rating, thus effectively reducing biasability (272). Limit-

ing the rater's control helps also to counteract another weakness usuallyassociated with rating, that is, the raters tendency to become more andmore lenient with repeated ratings. Nonbiasability effectively minimizesthe effects of this changing frame of reference on the pert of the rater.

The critical-incidents method, devised by Flanagan (113, 114), wasdeveloped as a means of identifying the important and valid behaviors onwhich rating should be made. So for it has not shown much promise in therating of instructors, Domes (104) and Jensen (167) in attempts to usethis method in school situations have demonstrated, perhaps unintentionally,the principal weakness of the method. When the "oritical incidents" havebeen collected some attempt must be Lade to organize them so that they may

be used Oonveniently. The resulting categories appear, however, as a listof vague generalities which might have been jotted down without going throughall the elaborate process of accumulating the inoidents, After Domes bad

collected 1000 and Jensen had assembled 500 critical incidents, they foundthey were unable to fit them into categories except as they represented ef-fective or ineffective behavior and so presented them in their reports.Charters and Waples (74), incidentally, encountered the same difficultywhen they. attempted. to organize lists of characteristics essential for suc-cessful teaching. Another principal weakness of the critical-incidentstechnique is that it depends entirely on the conception of effectiveneesheld by those who report the incidents. In applying the technique to teach-ing, its validity depends on the opinions of effective teaching held by theparticular superintendents, teachers, students, or others from whOse re-ports incidents are sought.

High reliability in terms of agreement among raters depends upon pre-dee definitions of traits being rated so that raters have a common under-otarlding of what is being rated, and sufficient frequency of occurrence of

11

the behavior, trait, or quality so that systematic) extensive observations

may be made. Wrightstone (361) reports studies by several investigatorswhich tend to show that the following traits can be more reliably rated:efficiency, originality, perseverance, quickness) judgment, energy, scholar-

ship, leadership, and intelligence. Such traits as courage, selfishness)cheerfulness, kindness) judicial sense) and tact proved not to be so reliably

rated, These findings are perhaps specific to the raters, the rating scaleused, the ratees, and the situation. It should be pointed out that it isdoubtful if such literary traits as those exemplified here can be suffi-ciently well-defined to be useful nor can they be agreed upon by differentraters) except perhaps as they uniformly reflect halo from an agreed on

reputation. Aach (11) has shown that the content and functional value ofa trait changes. with the context of other traits, Gaining an impression

of another person is not a process of fixing each trait in isolation and

noting its meaning but rather a summation of the effects of these traits.For this reason it is probably more accurate to judge whole impressionsrather thanartificially isolated traits, Carefully planned studies, how-ever) might well enat21 predictions to be made as to what types of traits

and behaviors can be more reliably rated than others.

The reliability and validity of ratings tend to be reduced by several

sources of error. Among these should be included judgments based on insuf-ficient evidence) lack of training of the rater) and poor rating devices.Subjective rating scales depend 1P.rgely upon memory and therefore are sub-jeot to errors by forgetting.

Another source of error lies in the feot that some raters tend to over-rate and some to underrate, while still others tend to rate everyone nearthe middle of the scale, Thus, ratings made by different raters may reflectdifferences in rating habits rather than differences among the people rated,

Perhaps the greatest sources of error are those of "halo effect," firstnoted byiVells 049), and "logical error," Halo effect ie the tendency ofthe rater to rate one trait or quality high (or low) because another traitor quality has been rated high (or low) or because the rater knows that the

individual rated excels (or is particularly weak) in some respect, Logical

error arises from presuppositions in the minds of the raters and lack of

definiteness of the trait being rated,

Ratings also tend to become more and more meaningless with repeateduse, This is well illustrated by the results of repetition of the samescale in rating Army officers, In 1922, 25% of Arm/ captains were ratedas excellent; by 1940 the percentage had reached (.).-.',6; while in 1945, 95%

of captains received an excellent rating (15). Inorsased leniency with re-peated ratings is probably not directly a function of the type of ratingscale but rather due to the operation of social and situational pressure'.With repeated ratings there tends to be a changing Trams of reference onthe part of the raters, It should also be noted that leniency tendenoy isnot as serious a drawback under research conditions as contrasted with op-erational conditions,

12

The reliability of rating scales is increased by pooling the rating ofseveral judges. Ar shown in findings reported by Bryan (61), Remmers et al.(270), and others) reliability in ratings increases with the number of rat-ings pooled in fairly good accordance with tie Spearman-Brown formula.Bendig (29).in a study in 1952 of inter-judge versus antra -judge reliabilityof the order -of -merit method found the relationship between these two typesof reliabilities to be U-shaped. The groups'of judges with the most highlyreliable and most highly unreliable intra-judge-reliability showed the mostgroup agweement. Furfey inWrightstone (361) showed that reliability was,increased 'also by subdividing traits and havingratings made on the sub-traits.

In the next sections the results of studies dealing with ratings made,by administrators, fellow teaohers) the teacher himself) and students arereviewed. In interpreting the results of these studies the-milny sources oferror in rating methods must be constantly borne in mind. By and large in-vestigators have tended to ignore the problems of correcting for the varioussources'of error and have worked with ratings as though they were already aperfected criterion.

Surveys of Types and Content of Scales

In an attempt,to determine what characteristics of instructors are con-sidered desirable or essentially authorities in the field of education)several studies have been made of appointment blanks or rating scales asused by teacher-training inatitutions) uniiersity departments of education)or state departments of public instruction. In most cases the procedureconsisted of collecting the forms used, tabulating the items on the ratingeheetsvand determining the total frequency a given trait or quality wasmentioned on the rating devices used by all the institutions surveyed.

In 1920 Osburn (250) attempted to determine the desirable personal 0ar-acteristics of the teacher by studying appointment blanks used by 121 teacher-training institutions. The outatanding.finding of this investigation was thelack.of agreement as to what constitutes the essential personal characteris-tics.of a competent teacher. The universities tended to be in somewhat closeragreement than the normal schools. Ability to discipline) ability to teach)scholarship) and personality were the most frequently mentioned qualities.

A critical analysis of rating sheets in use for rating student teachersin institutions, of the North Central. Association of Secondary Schools andColleges was made by Smith (318) in 1936. Of the 128 institutions replyingto a request for information) 103. made use of soap form of rating sheet. Ap-

proximately 77% of these depended solely upon persona] opinions of the raters.In 1941 Samuelson (291) reported a survey of rating scales in use in approxi-mately 50 teachers, colleges and schools in 29 states. The investigator'schief finding %ls the variety of practices and methods of measurement employed.

13

Graphic scales, usw. ly with five-point scale division, predominated, al-though descriptive balles, letter scales, and numerical scales were also

used. In 1940 Schellhammer (294) also examined rating procedures in 109teacher-training irstitutione, The Lem) he found, varied from a singlerating of 9 items to a comprehensive scoring of 72 items on a seven-pointscale, Intelligence and health items appeared moat frequently, with noother item appearing more than 11 timce, Thie seemed to indicate that thereis no general agreement as-to which characteristics the supervisor might beexpected to observe and evaluate, Peterson and C00% (255) in 1930, Dean(99) in 1939, and Woollner (358) in 1941 also surveyed rating proceduresused in teacher-training institutions,

Barr and Emans (19), in 1930, in order to determine what qualitiesare prerequisite to success in teaching, studied 209 teacher rating scalescollected from cities of more than 25,000 population, from state departmentsof public instruction and from university departments of education, Theyreported that the 6939 items found in the rating scales tended to be highlysubjective and undefined, The scales also varied widely in content'and or-ganization, many being either quite superficial or apparently representingspecial points of view or systems of teaching

In 1945 Reavis and Cooper (262) surveyed rating methods in use it 123city school systems, They reported that the most notable characteristic ofthe rating devices employed was their lack of uniformity, The instruments

varied in type, in number of items to.be rated, in speoifio oharaoteristionincluded, end in individual reeporeihle for the rating, In one oity teach.ere were'rated "only by degrees hold," A total of 1538 items were inoludedin the scales used, Of these only 2,6 appeared on more than one device,

It would seem that the survey method might provide an obvious way' ofdetermining the significant items to be used in setting up instruotor rat-ing devices, The etudies summarised, however, appear largely sterile, The

meaningless sort of results obtained are probably due to the failure of thesurveyore to develop a rationale vhioh could be imposed on the materialssurveyed, The reliability of the categorising of descriptive terms fortraits or characteristics would have to be tested, Single judgments, or evenjudgmento based on a group of closely associated judges, would not suffift.'blither, agreement should be tested for fitting the categories into the ra-tionale by a series of independent judges, much in the same way that the re-liability of the categorisation of behaviors by independent observers instudied in time-sampling studies. Suoh surveys of content are not sot toproduce results worth the effort until, through empirical or other means,hypotheses concerning what teaching characteristics should be rated tre firstformulated and then these hypotheses are checked by reference to institu-tional practice,

Types of Paters

Rating devices not only differ in form and content but they are alsodesigned to be used by different classes of raters, An instructor's

lh

competence, for instance, may be rated by his supervisor or by an outsideexpert, by his fellow instructors, by his students, by himself, or by somecombination of these. Most instructor ratings heretofore have been madeby administrative personnel, but in recent years student ratings of theirinstructors have been receiving more and more widespread use.

Administrative Ratiagpf Instructor Effectiveness

Ae has been repeatedly shown by surveys, many school systems employunstructured rating procedures, the most widely used measure of an in-strudtorls competence being the over-all opinion of the principal, super-visor, superintendent, or school inspector. On the basis of judgment ofsuch administrative personnel, instructors may be selected, hired, pro-moted, or fired. To the best, of the reviewers' knowledge, a rating formfor teachers was first used administratively An Milwaukee in 1896 (170).By 1900'school systems in a number of other cities were also using ratingforma.

Demonstrated lack of agreement among administrators, however, chdtheundepen4able nature of subjective opinions in general have ledfrequent attempts to put instructor rating on a sounder footing thvauti.,the use of more analytic administrative rating devices. One of theearliest attempts to quantify instructor behavior was the tents'ive schoefor the measurement of teachi4 efficiuncy outlined by Elliott kiC6) in1910. He based hi.' method on the promf.se that the teacher was an flooto-personality"--executive, projecting, supervising, professional - technical,

social, physical, moral, and dynamic.

Investigations of the reliability, validity, and halo effect of ad-miniztrative ratings ubflizing rating devices will be examined ir. thisreport.

Reliability of AdminillalimblADLALItIlryetor Effectiveness

Reliability can be measured (a) between raters, (b) for a single raterfrom one rating scale or item to another (which may reflect halo effect),and (c) between ratings by the same rater from one occasion to another.The available studies appear to show that teachers can be reliabl:/ ratedby administrative and supervisory personnel, the preponderance of relia-bility coefficients reported being .70 or above. As shown in Table 1,

there is considerable variation, coefficients of reliability for ratedgeneral effectiveness ranging from .17 to .98. When traits or qualitiesother than general ability are rated, the reliabilities tend to be some-what lower than those found for general effectiveness Barr (16), Board-

man (39)'., Part of the range of reliability coefficients can be ascribed

15

Igigraggsg.L. tta..Lat tangs

Nur In (25n) w student

boohoo (1721) IN high onfel

Jacobs (1511) 100 'loan. aty(30 god. 50 peer)

togs (1120)

barn (1711)

&lip 4 Iletvuoat(19)0)

tailor (19)0)

twang (U)l) 113 high 'Owl

Prow 1 Salt(1132)

Cali 1 Carton(17)))

77 elementary, 1 or.enrol, no

15 lemen.art 1 re,..e xteriense

1 elesurntary, rrstrerte%.8

(let reported)(110. repertma)

110 etstiont

Eh elementary10 lemestarr

40, is, 66 1 rt.on. tv

veropertoll)IP/11411). a rr. .

up. t Ithe18 itot4murr, / gr.

towline*112 slastatar., S rt.

tsp4tiIM0:11 tlwarattr, /rt.

gaga tit sea

itLij.(1)9) 44 olonottorr

kitip (1933) 151

(Vatoralltt WM) to 41011st-o7

bast I tor.(14)71

100 eleoestary

;0 alracmtary

ts

Sreelkowt 12 Mgt leshel

1.0. Offiea .1 kg all grltais ies 10115ro(1%0)

Hess 11910 W fits.:all 1 briar LW flirty iltiew-

two191 /171,4 inFttle-

Cato

It $3. 6Z rosterp inatrcetart

90 gyro, iwrin.e.

155 CIOP, looton.tent

telitonity et Mairlotraliv SAW of instructora

TM/ f( rtinfOrapAir (54 items)

Itareing (sip* position)

Grtftle (7 !too) 1 rankingGraphic I ge wr.1 meritsRanting & grJetel merits

GratAte (it lens)

Grathle (41 items)

°vapid* (9 ties.)*amoral writ

1.rephie ilwCosh!. PO RooGothic 10 (teas

1.ankingIntim I turning

Organist 19 !taw)

*aural wit'Po State Sopartinant Sts

ration 1 tot Nalco;lamed

aler-Sorwase 1 argAlishoIwale

1nrcloltatol tole

-.5c »ore 1,11111

fertursee kilo

4 witty (144,0ueilri 'Woo)Omni c knit

Swam

Larking

Paridt4

)rwhi wale.Sato./ tree IniMeS Obi

ten.11Rased Cr. rifles bib.

Nei

Par9.6 wile (10 Stow)

OraptS1 111 new)

two craws. fell(

tiring grattetornrtocachlAt learieloctelp

*apis (.6 item)

tarrril Writ*manic (1) 'taw)twoorol writ**Au INeetal writ

tsit)

Supervisor

I,l 547

PrincipalPrimipa1Principal

krerritior

limetraloor

OboarrorbSaportiter

Suprctwrtwartiaer1.parrf oar

Print is 1lawational

specialist

144rtiow

/*writercparti Jan

baps rt o sr

kg. rti oar

Cot Knit

06Wrrst

taw.vitor

lawiriataitilaaa

Sward gat

Sward set

SwarTiont

WoottonWooster

Callao asst

lkortateerest

baportiwe

kr* rat me

Sword oar

Swami rot

144.reivot*semi orStwavticat13.wrrtioor!were *et

Nano s: nLtliAly/atm two tit forint ratans

Split halt (of raters)

Sum num., graphic to.rSabi rater, grphil ra. gees merit

Saw rotor, rar.kirg IP*. gamma merit

Aerating, same ,.tan

Pirating, sus r.t:to

mstatteg, pea *NooseRotating, 1.41111 14Orgit

tenting. 5,10 rotorbattwn two et/to-eat MooSplit halt

Itorank,c4, i rt. laterLew ratoi, maim vv. :Wing

sandal, 1 rr. lat.?, 0114 rata,9) 01105$ /iffarlat rotas 72 runt

1111t114

Selman two litterot Mints

Sta rater, tiffany: S alos

"stating, 1 rt. toutSaw wale. 1 111/oriatt Wowtatt

Imo Walt, li:farcat *Swett,.

&No rata. raritsci II. rating

keratt..g, was ratans

Amigo istetwoetlatiaa botaw.are..?9 (prisaipal. witiotast priatipa1.

sapertiwt)Waipalngs polattra tr. walottst prin-

caerate j, prtheipal amt/e, eesietest

prtnaipal re. cuperstest

Oestrogens y wittielant (I. Ltspactwi)Split halt (1 inapottaes)

4111 half (. Awe)

LRAM tem Ittlreteet Solari

lletvol two tinstoil ttt4t0

Sue rata?, lit twee* Wal

tiring pecnitiowr N. tamehthg Nett-story

lartweetw littareal ratan

.

.92

.9.

.70

.54

.79

.9)

..16 tto .77.16

.96$72

.42

.15

OM 1 .90

.05

.11, 0, 44

.17

.1i

.91

.67

.31

.16

.71

.15

.so

.8/

.50

13

art

.91

.70

.65

0?11% 9.1,4 (1 Wirt, II .1%6kilt half ratace, 13 .71 to .arm Tali f nvort. 90 .41

Split Ulf rWtort. a tO .Ito .9)fewer two lIttoreall More Qs 135) 00

I Official tali-4. (In !se lwartia el* v4 run row SOS .711.1.1 ratty rstrvse tr eta toictat St v4 4E4 It 114 rust rat attoNtirgi V1 fo11e.-0 ratings loom Natthanttal, aN 14 tie. 111.5 Wawa sta taro rating. rarest Pro t awstto la NI ran At V'S tint Pa.leww Ltd 11 rftte tali ',Ira rot LW went trilwo-ar.)

ta tilt rts/k th tasehor ws stwrtei fa two .trusts. by 0$ Irtsv.ims ecrtromeerto an* ore s -.0 of site Pa loan?.e 19 .0 04 LI 1141 ors tors rate W Shay tar * Oft sere evortstes.`selt Om 1.151 9,40 the FOR sad semi tINO.

I Mats. 011 1. set I --tram seteols.

111 llialros seetitiaits owe wortoolcol 4lpwroastra.r foroda.

16

Table 1 (don.)

--2113111112113--

Barter (11/10

Brandt 11%53

Ts ether avrar3 I 1114 of Rtt:i Mee run rlalldiSI.:1 IP .19

.77, .7), .2142, ad.27, At. .11

60 (30 Astoria, 20enrage, 20 par)

7 lessontary, I nos

IP elosenta7, radii

li Fareonalitt Stens Poerrisa

Pie t ir4 ea ) different Sattrintoolontatlas vs. retina en Suporri Der6th scale

Rat 1r4 tor. 3 di tf rent Svy4 r. l earlanai,s vs. rating on1.h scale

.11)11111/1

Masa 2 titers to. 110411. 2 ethers(14 c'. J a 60)

Poratihts (1 yr. later)Poratinte (I yr. later)

dersitnds (11 sr. later)

22 high sehnol Aft. I oust. saw scat SUperviser RerstLAda yr. later) .1212 elemotary, 1 tr.

experion-toRats 6 rent. am scale livrerwi ger

ISPersttro, yr. .72

horse*. (1M) St elowatart Otatali (11 item kiarlr.tendent Patine' vs. tollev-vp, saw ester .35 to .5764 Ile tart Irtptis (13 Star SaorintenEent Paint: r.. fellontt, lifforent rotor -.15 to .1761 elorsentartd arNio (12 items Superintendent PAW es. let follinv-to .21 to .15El olsanta, rt,e.a. ;AP item Aspitliftclerst. 114411.4. e. 2,4 falso-wp .22 to .3511 slasents Grey (12 Items 11vior6ntandsnt let fellow* vo. Ir4 follow* .15 to .1:61 Odra/ to Central writ II grephlo bytnrintondent Sane rota, terala seat ve. gratrte .)6, to 1711 el Oenertl halt I partd 6sprlsitordoret tams r,rate morel halt vs. tetra .)e, to .51

61 flar6rrts.ry.Cli=rten (11 item)1 Fara as- hpeantetvant

eavaiecolLone rata, trtikl to. ist54 4 en pert. .00 tt .54;Arl see

Late 111513 2 Aigt, Khali 1 tr.evert ebee

Rettig u N atolls- trim 441611114 6 trsiede scale

Sur rotor. daferett *ales .10

Vase (1151) 112 co!eantary Patty 11m:erre blank' SW rase, differtmt andel .23over-a11 rating

165 PLO wheal Rating, daterver 1164* rttnelpo1 tom rotor differsort Kale' .23avr-al rvting

1414 (1112) 75 hist alai, 1 n. :sting ssete,tvtil'1). ir.1 Sue rater. dittertet erase .5)storionto 6 grpht

traphis listeriser Sane otal. 2 different retort .55

&Whoa (1451) 16 sa. 1 yr. taped. Rattra "aaostat411.3A 1 is met eNkot Mao toter, different onslee .0&se

11 1(rapt

Rstint osemlatlite 1144rtrtndorttrepItte

S.. eater, /Lateran% beds,robe, yr. asportse.

Ot llllll nuns. Iii the faits" eVesy O. ream as the andel Wing irttottoll Ihr 04 lather et the trod of the fltrel, Star oftAteldhel 164 tollreor titilyi wens tea/11*A fa), and the tie r Waal ten Inv taints rtrttet Sire I san0to )I 711:. let the nut ht.1-* el teat la 11 Iwo tor the Peed Pelleo-ay.)

it Ibis stay Ita tesehor nite *arta on tel atati no tig 60 *Satins mperittondas ays vote wtstgainted stub lito latch*.

Is at 04 SI tee et.e.-4 rote retell the 14r1 Ste tr, w. are instertrtordont leo rani the the lira and votnd limos.

d talmota I- aa 3: roes savvae.

OS kalsrone teaftotente yore (errata by SpeaanArva, foralls.

to method. 4here correlations vers obtained when the safit raters vs td

different methods or scales, the coefficients tended to be high, e.g.,the r of .96 is of this type. Where correlations were obtained betweentwo different raters using the same method, coefficients were confider -ably lower, e.g., the r of .32 in the Hamrin study (143). Hampton (142)in her study of administrative ratings made in 1951 found that "correla-tions between successive trait ratings of the same persons were differettifrom zero at the one per cent level, trait by trait, when the raters werethe same and nominally equal to zero when the raters changed."

The reviewers found some confusion among authors as to whether re-liability or validity was involved in certain of the correlation co-efficients computed. When raters are of equivalent prestige, s'Atus, orstanding, the reviewers have assumed that consistency cf ratings, i.e.,reliability, is intended. Such studies are reported in this section.When raters are of obviously unequal prestige, of different °leases, orthe comparisons are with an entirely different order of criterf,on variable,the studies are included in the following section on validity.

17

In order to insure that raters were confronted with a common situation,Shiels (308), in 1915, asked 110 principals to rate the same ten case stud-ies of teachers for instruction and discipline on a five-point scale. Higherreliability would be expected than would be the case in rating real teacherssince the judges wore basing their opinion: on identical data. The ratings,however, showed considerable variation and range, there being no instanceof 100% agreement. There was, moreover, less than 75% agreement in all butfour caoos in rating inettuction, and in all but two caeas in rating die-cipline.

In Barr's study (16) similar results were found. lihen 60 visitingsuperintendents observed, for two different periods of 30 min. each, theteaching of one teacher relatively unknown to them and then rated theteaching effectiveness of that teacher, grear, divergence of opinion wasfound. The cupetintendents spread their ratings on all traits over atleast 9 points of a 10-point scale and to: more than 50% of the itemsover all 10 points. One superintenJent commented on the poornees or theteaching, while another remarkld that he wished he could employ the tetcherin his school. Correlations between first and second observation by thesuperintendents also proved in general to be low. Barr stated that anoutstanding fact brought out by this study was that supervisors cannotagree when asked to analyze a teaching situation about which they have noadvance information. He conclude,: further that ""conventional supervisionis highly subjective."

Correlation of Administrative Ratings Pith Other Measures of Instructorffectivenees

A number of investigatori over the past thirty years have made com-parisons of various criteria of instructor effectiveness. Their studieshave been summarized in Table 2. The correlation coefficients, where re-ported, range from -.61 to .82, the former being determined by Jones(172) when he compared principals' ratings of 13 teachers with gains madeby their pupils in English and the latter by Nanning& (238) when he com-pared principals' with assistant principals' ratings of 15 high schoolteachers. In some instances rather substantial coefficients were ob-tained when ratings of various types of administrators were comparedte.g., Brandt (51), Bryan (61), Nanninga (238), Tiege (333)J. In these

f"and other cases where relatively high correlations were reported,opportunities for collaboration, prior discussion, or other sourcesof lontamination of data were not completely ruled out. For the mostpart administrative ratings do not produce very high correlations withmeasures of student gains :e.g., Brandt (51), Taylor (331)).

In KnudseAls and Stephens' (180 analysis of 57 published devicesfor rating teaching, they discoveral that often the validity of the devicewas implied in the assumption of Cie competence of its designers to selectsignificant traits. Forty gave nr statistical evidence of validity or

18

Table 2

Correlation of Adnini ttttt its Ratty with Other Mosoutes cf Tootroetor Effsetivenses

. InveItIts0 Tosc114r Natio

Morton (1919) 71 high ochool. city151 high schoo).

1)3 tlementory, composite of3 cities

115 elemmntary, each of )cities

11111 (1921)

went (1922)

Postipan (1928)

Pshninga (1928)

?logs (141)

Board Ba.1 (1929)

ULM 41910)

Taylor (19)0)

Simmons ,(19)2)

Case Cernoll (19)))

Crams Ow)

Nasvol1or PAM

lissom (1937)

Moe 11937)

)9 elwoontary high

88 hiah school

atm.smart Cprrol.tion

Inspector rating (4 categories ) vs. salary .16 to .21

Inspector rating (4 cotegorioe) vs. salary .26 to .)9

AdaLtistrater ratite vs. pupil gain .45

Administrator robin/ vs, pupil gain .45,..24, .19

school Supervisor rating vs. studeit rank .74 (4'4Supervisor retire vs. peer rating .96

Sui..or:aor vs. follow teacher raringSutorvloor vs. studont ranking

tr'..cIp.1 vs. 9 graduate student ratings1..i.tarl principl .e. 9 gradusts

01.0en% ratingstver.ge principal, issittint pr.noirmlVtt 9 wigs:, student ratings

Ten:ipal te. so Pietant rine:pal rating

15 high school, 5 years orme m' in lard school

25 elo.+ntary, I t a.

-portent.

470 .lecotary

21 mot voNoti

c.. :ant

101'olesehtory

77 elemontary

slotontery

103 elemantaty

Welsmentary

'112 elemontaiy, 2 yr.experisnoto

)2 tellies

Sid slossetsrp

IS Als% aehoel

I* M. no Kbeo

41 Jr. kik% Wool

11 sr. MSS sthsel

IA it, MO stbooll

A* et. We WoolId et. kWh easel

Supervisor vs. principal toting

Srporvisor mt4he vs. ptinll gat.

Sopervlsor rating vs. student ranking

Printi).1 .s. retouch doporthentrealm

Two prlreipal elYAine vs. 2 researchdeportment ratird

feint's) 192) -4 rc.kine vs% reessethdopartmott 192e° rating

istimatod teaming ability vi. pap:1arithmetic gate

1.tleated totth!lot ability vt. pupilregains pip

listing by ) ompervis,rs vs. pupil gain

Sopervisor vs. a obesrvore (kW-

2 atom-tr. (Almy-Se:onsen) tutor-Hoer (10 points)

'2 abutter. (Toreforsom) re. insporvioor(10 cvat.,tol

Rack big sr bean vo. ranking try poor* antstudents

.61

.56

.47

.76

.S6

.62

.70

.14

*tloao aereesiant*

.51

at;

.4%

.02 to .15

.0 to .24

P. relationship*

.97

.48

.41

1411 SA 00 0146044 let 10 II

.toe of pore 1.6saw to

tioteelsi let ICIall but toe ofpoorovlot 10 ea4440t libt

Asporistondesk, pitAlot!, assistant $53#tiselps1 (impolite. othoeitteem Ti.poet rsaklas es porsona;114)

Impervioer rating ts. pupil WA *Ss relstiesskle

fhiporlrOsedos4 es. sosIstask ;risotto-1 .41(mood OfosrlIftg ability)

Omptrillsodosk to. 'emitter* ',theist .63(petrel tootele4 telliti)

Soporisiteedeet Ti. eattetstA rietitel Ai(U hoes)

Siporisksedook to. sttoorlps: (gabbed .63tsbout wilt')

. oSeportotendesk Ti. stlrocaal (11 =it..) AMPrisetpal TO. &ulnae' piLetteet 44fissioral Oswalt* *11114)

19

late/tinter

Cooke (19)7)

knotsvor (1%0)

Sorter (1942)

&enterer (190)

-LsOsks (1945)

°ethos (190.

Lino (1946)

Jon.. (1916)

to m.a (1w.)

Oalt 1 Osier (1947)

nosh (1949)

*sea (049;

Sighlsa1 I bortekin (1931)

Lases (19$))

16b.6 (1131)

Tusher ma.

22 sr. high *duel

41 Jr. blob school

22 sr. high Mool

0 Jr. high schei

22 sr. high school

41 Jr. high school

Tolls 2 (CM.)

to 411 elementary 4. Li*school

3) igh school11 high

hochool

27 Mats% teachers

66 hale high wheel

)1 elmetars, 1-room

$7 elomatsry, I- 4 S.r101

53 lit& slosh won, )fl. expor:sese

IA high *hoot7 high sehr,11) MO mho.) (18.4a)(lushor unreports4)high oebool

$1 hips whet.) sum, 1yr. experience

22 high school moms, 1yr. awe IOC

1141 String instrattoro

Pspll gals is ettitelo.

U elementary

olossetars, 1 -Tees

1f tverml

to 16 high seheel

19 slonsetarp. 1 ft.'venom

4)5 47 toetsioal soboolS

33 hi* pones,. 1 yr.emporium

7) high Well 1 11111.empotlesee

.11PPriallpal re. assistant principal(11 Roos)

Prineiptl so. assistant prineipel(11 itenr)

Aroma cininisanter (morel nerit)vs. student toting

iterate eadaistrstor (gutos1 snit)so. 'intuit rating

Are rtoe odnististratsr (10 traits) vu.etsdeet rstiog

Averse odnialstrstor (10 traits) vo.student ntias

Corrolstiee_

.21

.52

.36

4)

. 36

.55

Coecasito principal se. tuahor tole- .C4 to .49Wing

buporsiur se. rspil nib* .404/terms 2 supervisor so. pupil riling .4)

Sopervisor ratio. so. pupil rating 0421no sonorous*

lesiorviur maim to. pupil rating %light watt's'

04sposite ...penises rating se. pupil ..14 to .04gOn

Cespoti:o saslaletrotor (3 pollee) se. .40pupil 011

kapott. Pkinistritor (1 sesloo) vs. .10pupil gals

Ces'osite supervisor rating vs. pupil riskCeepoeits overview. rating so. pupil gals

04..041 fn. wspervisor nitro°Maid retina it. pupil pisPrinoipal ratiso so. pupil Cain 041116)Primmipal rstilo re. puµl osia (01

tobJeste)

Seperwieor vs. istervin (1 qrsl.)bspervimcr te. isteeviu, Bert (0 kW.)bupersion to. Walloon* (1 pal.)

Two supervisor Wing so. MAMA rating

19109Plotabkot rotis4 (3 tutors) to.pupil eim

Oeperistoniset Ming (S tutor?, ti.mil WA

$0,41rdfoir 'lithe (II yr. Wore) vs.gills

bspumiser mina (11 rt. beton) To.pupil els

bupsnisor (1 IOW) el. thin, Mindtoilet male

6.sorrtiost (I tusk) is. ishotrialemolosios ?sale

lepersioot so. atoms post 'Miss. .11

Prismdrel one-all to. skirts? ittisos .69Principal se. sawyer (sew awls)

Primtpal %cooptabilito to. suporrisse .),

. 1s1

49-.27..41.1.20 and .10

. 1. to 41

.14 to .44

.17 to 41

.00, .os

.so to .11a

.1)

. 14

04, 4). bf$

14:11

20

*repels to. I iparwetwt onside 1 .11

reliability; 11 mentioned correlatioas between ratings of the same teacherby different judges; 4 quoted correlations of weights assigned to variousitems on a given device by different judges; 3 gave correlations betweensuccessive judgments of the same judge; 2 included intercorrelations ofscores assigned by different judges; and 2 mentioned correlations of scoreson items with scores on general merit.

Intercorrelations of Rated Traits or CateAores

Some, if not all, of the studies reported in this section appear togive evidence of the presence of the halo offect which tends to bias rat-ings in general. A number of the investigators whose studies are reviewedhere have called attention to this factor as at least a.partial cause ofthe large correlation coefficients found when ratings of several traitsby the same rater are compared. Nher investigators report high correla-tions without comment.

In any interpretation of these studies it is important to recognizethat, for instance, a coefficient 4f..90 between rated "efficiency" andrated "use ofiLethods".does not mean that good itathods lead to efficiency;it merely *means that raters.terd to rate a given p6rson at the same rela-tive level on the two traits. it should be noted that two kinds of inter-

correlation may indicate halo effect. The first kind is the correlationfound when ratings of two traits by the same rater are plotted againsteach other. Tha second kind of correlation is that found when mean rat-ings of two Traits of seveeal'instructore are plotted against eacn other.

In Table 3 12 studies are summarized in which correlations were core-puted fin. the Bryan (61) and Brookoirer (55) stucLes act'tal coefficientswere not reported) between some rating of general teaching merit and rat-ings on aqme other teaching characteristic where the two types of ratingswere made by the same rater. It will be-noted that,, in general,. the co-

effihente tend to be high, probably indicating operation ,of considerablehalo effect. In some cases the relationships are quite as ridiculous aethose Knight (178) found and commented on in his study of peer ratings.Knight obtained a correlation coefficient of .94 between general teachingability and intellectual ability and oni of .79 between teaching abilityand skill in discipline when these were rated tr.fellow teachers. He

also frund a correlation of .86 between ratings on skill in disciplineand in.A4llectual ability. In poinCipg out the absurdity of these correla-tions, Knight said, '!Ore this really the truth, what a prodigy of in-tellect the 'strict,' but ofteh dale teacher would bell., Further, "If

we thus generalised, we would also hold that Grant, admittedly a past

master in control, also towered aboveincoln in mental stature."

In the case of certain traits, however,.the correlation coefficientsare low. For instance Ruediger and,Strayer (283) report a coefficient of.04 between general. merit and health and .20 between general merit and

21

Lout isstsr

hustiser 'Iltrtm (1910)

Sons (1M)

teetres (1919)

haled b Mts. (1129)

Odsevsalor (1936)

1Prish

briskets? (1945)

Table

Carrelstiess ikettesas Lstlass et feasts? Chsrastariertisi

flasher mewls

204 (sppm's.) slesostary aistsreisor (Is ts 26school sysisss)

)43 MO school143 ki Wool325 MO oohed311 MO school376-330 hige. school

12) stedeat

520 asestetsevR1 sIsssr.tui464 ilsaantai

Ihipstriser (27 systems)

$1441V110? (12 *stems)

PrIsa !pal

160 lossstiry !wordier

47 Jr. 11 22 sr. kip Susreiser & ;spitsestarol

44 sill kW soltssl thstrIstestse

au a ort., (1947) Ito /IAN thstrstses Overdoes

"Wises", 4. (Mt) 131 resort isetrecters Soperstser

Ksiotele 111513

%Ken (1919)

olusstasty

&I ilsseetir?

644reisof (ips4t,issale)

lepotrIsst (painedssipsti see)

91 hip *skeet, ettet Itse.stlesal *Walls%

151 ht. sebel. rsral lesestlerre opsetilisi

"lig k ',tussah (1952) 110 *mike* balostyl sot

22

lrait6, Bran tad Ceerelatise

Noma *sit Inahoping erdsr .56hashing skill .54Initiative .50Rultk .047 otter traits .02 to .44

/sot rust less] skill .90IstsJ ps9t1 mussAtavistic's et mils .60etssIth 111lb ether traits to .71

bode posertssketiqui of tuskless

17$79

Ps steal is: 55Csstml erns withaltrykotioaal spiritSold tattIllosss

I6,1

3724

II AU? treats .;. is .72Ps roma: it .65

10 otter trstts OrelAIIP ate.MALI 3implIfV1,0MUM, Mato its thlato.te

tordAt et mettle oltgaitieLet17waists sts

16 Utters.% testis .26 Is .00

tottyes ptspersttaim, to slim's. KUAppeaetas10 et2sr traits

turas *Pars*PittipltaaStealth k snail*

emus traitsbasses statuestarslodees *1 sibisi

*MOPSKIM9 Aber testis

appearance; and Boyce (48) found a correlation of .18 between general meritand health. On the other hand Boyce reported a correlation of .90 betweengeneral merit and instructional skill. This suggests that traits which aremore objectively observable or are more independen!, of opinion are lessprone to logical error or halo effect than are those traits which are moreintangible and hence more subjectively estimated. The implication seemsclear that, by and large, ratings made by the same person are apt to be con-taminated by halo and that in many such instances a single rating of over-all effectiveness may be as useful ae an evaluation based on a compositeof a number of ratings on separate traits.

Peer Rating of Instructor Effectiveness

Apparently little use has. been made of the practice of having teach-ers rate their fellow teachers. Roberts and Draper (299) in 1927 obtainedmaterial on the scope and character of the work of the principal fromprincipals' reports from 441 high schools having an enrollment from 5 to4000 pupils in all sections of the United States. Only 12 principalsasked teachers to rate each other and 379 did not require such ratings; noanswer to this question was given by the remainder of the principals. A

survey made by Reavis and Cooper (262) in 1945 on rating methods in usein city school syetema showed that in only two systems was teacher opinionused as part of the rating set-up.

In a number of studies, however, lists of desirable traits of teachershave been compiled by teachers themselves (53, 120, 173, 215, 303). A de-tailed analysis of these and other related studies is included in the sec-tion on Opinion Studies of the Personality Characteristics of Effective andIneffective Teetructors.

Superficially at least, the most obvious way to discover how a nAneptla a job is to ask a fellow employee. It would seem that fellow-teacheropinion should provide a valid ,tease re of instructor competence. Therating a teacher makes of a tellow teacher, howaver, is probably rarelybased on first-hand observation but rests rore often on hearsay and repu-tation. Even if he does have opportunity to observe other teachers' per-formance in the classroom, he may not know what is important to look for.

Furthermore, peer ratings have never been popular. This is probablydue to the dislike of persons to evaluate or to Le evaluated by their closeassociates. The raters can never be absolutely cemein that uncomplimentaryopinions do not get back to the person rated, nor are they always sure justhow their ratings will be used. They are loath, for instance, to acceptany responsibility for separating even an incompetent fellow worker fromhis job.

23

For administrative purposes, therefore, peer ratings of instructorsare probably not too useful since teachers tend to have certain misgivingsabout passing judgment on fellow teachers. When ob13.1e4 to rate theirfellow teachers, they are apt to do what is popularly called a 'snow job."They are careful to give only favorable ratings, thus avoiding any reper-cussions if t'sir ratings became known to the ore rated. This means ofcourse that the instructor-rater keeps his more candid opinions to him-self. From a research standpoint, in using peer opinion, ranks mightgive better results than ratings, especially if steps are taken to assurethe raters of the anonymity of the results.

ReliabilityqflerbIlingof Instructor Effectiveness

Not many data were found on reliability of peer rating of teachers.Four studies in which reliabilities were obtained for fellow-teacherratings are phesented in Table 4. In these studies the N used was thenumber of instructors rated.

Correlation of Peer Rating_ligh Other Measures of Instructor Effectiveness

Several investigators have been interested in showing the relation-ship between peer rating and other measures of instructor effectiveness.The rationale for making such comparisons appears to be that of lendingsupport to the validity of the measure used in a particular study. Appar-ently there is considerable agreement in opinions of supervisors and fel-low instructors. This would seem to indicate that the reputation of anindividual is a common element in influencing the judgment of all who areassociated with the teacher whether pupils, fellow teachers, or supervisors.

In the four available studies where correlations were computed, thecoefficients ranged from .53 to .96. These four studies have been su-merited in Table 5, together with three reports where noncorrelationalmethods were used in comparing peer ratings with other measures of in-etruotor effectiveness,

latualmelWomuLtuslating of Instructor Effectiveness

As in the case of intercorrelaticns between traits rated by the sameperson for administrative ratings, close .-elationship is found for ratingsgiven different traits by the same peer raters in the few studies avail-able.

In 1922 Knight (178) in a study of 153 elementary and high schoolteachers found that mutual judgments o: teachers with respect to generalteaching ability correlated with their judgments of intellectual ability

Table

14

Reliability of Peer Rating ofTnstructors

Investigator

Teacher sample

No. of raters

Measure of reliability

CarrelatIone

K4ght (4922)

153 elementary & high

school

30

Split half (of raters)

.90b

Guthrie (1927)

16 college

5Average of pairs of raters

.26

Boardman (1928)

68 high school

88

Split half (of rankers)

.72b

Odenweller (1936)

100 elementary

3Average interccrrelation

among the 3 raters°

.42

49 elementary

3Average intereccrelation

among the 3 raters

.el

48 elementary

3Average intercorrelatian

.29

among the 3 ratersc

aThe correlation coefficient is based on the teacher sample, the number of instructors rated.

bUncorrelated.

C Ratings of

personality.

Table 5

Correlation of Peer Rating with Other Measures of Instructor Effectiveness

Investigator

Teacher sample

Ratings compared with peer r.,ting

Correlation

Knight (1922)


school

Supervisor estimates (over-all)

.96 (mean r)


school

Pupils' estimates (ranking)

.58 (mean r)

Boardman (1928)

88high school

Supervisor ranking (average sigma

position)

.68

Pupils' ranking (average sigma

position)

.66

Greene ,1933)

32 college

Ranking by dean

All but two of

peers' 1st 10

on dean's list

of 10 best.

Odenweller (936)

560 elementary

Superintendent, principal, &

assistant principal (teaching

effectiveness) vs. peer rank-

ing on personality.

.53

A".%bert (1941)

1 :Ugh school

1 administrator vs. 140 pupils

and vs. 16 teachers (on a rat-

ing scale devised by author)

Pupils agree

more closely

with teachers

than with

administrator.

Ferguson & Havoc

1 high school

304 pupils & unknown N of peers

Peers rated 5

(1942)

rated 12 personality traits

higher, 4

same, 3 lower.

Highland & Berkshire

635 AF technical

Average rankings of peers vs.

(1951)

school

rankings

.94 and with judgment of skill in discipline .79, while judgments of skillin discipline correlated with intellectual ability .86. lie concluded thatin judging particular traits, "general estimate' (i.e., halo) influencesthe ratings to such a degree that judgments of particular traits are inthemselves of little practical use,

Odenweller (247) also noted that in his study correlations are marked-ly. higher when both traits are judged by the same set of judges than whenone is judged by one set of judges and the other by another.

Student Rating of Instructor Effectiveness

In recent years certain educators have been quite voluble in advo-cating the use of student rating in evaluating the effectiveness of in-structors. It is maintained that such ratings tend to raise standardsof .instruction by providing a basis for weeding out incompetent instruc-tore and for improving the effectiveness of good instructors, These rat-ings, it is sale., provide administrators with a means for-securing depen-dable information which they should possess as to the opinions 'of studentswith respect to every member of the teaching staff.

That student ratings,. within the limits of their reliability, arevalid measures of student opinion of instructors cannot be questioned.It is probably true also that students being in a more or less close re-lationship with their instructors are in a better position than anyoneelse to make certain judgments of them. Whether Jr not these studentratings are in turn related to over-all effectiveness of the instructorin the teaching situation has not been demonstrated. There may oe acloser relationship between pupils' success in school and their reactionto the teacher than there is between their success and methods of teach-ing or the so-called important physical aspects of the school environmentand teaching aids.

While the practice of obtaining student ratings appears to be grow-ing, their disadvantages have frequently been pointed out. Some adminis-trators oppose them because of the cost in time or money or beca se oftheir possible disruptive effects upon student and staff morale. Amonginstructors there is considerable opposition to student ratings. Cer-tain instructors fear the misuse of student opinion as a basis for ad-vancement or separation of personnel. They point out also that studentratings may make instructors emotional, self-conscious, or resentful andthat attempts to cater to student opinion may produce changes in unde-sirable directions. Students may lose respect for their instructors bybeing encouraged to set themselves up as judges of instructor com-petence. Instructors contend that student ratings are unreliable becauseof immaturity and prejudices of the raters who are influenced by grades,interest in specific subject matter, reputation of particular instructors,difficulty or ease of course material, and the like. M,ny students alsoare unfavorably disposed to rating their instructors. They consider such

27

ratings a waste of their time unless administrative action results.Students themselves point out that the preferred instructor is oftenyoung, genial, and entertaining, while the serious, more experienced in-dividual who stresses subject matter and insists upon certain standardsof deportment and effort is rarely popular.

Quite a number of investigators have reported studies of studentrating as a measure of instructor effectiveness and also as a means ofinstructor improvement. Among these are the studies of Bryan (61, 62,63, 64, 65, 66), Starrak (322), Riley et al. (276), Goodhartz (131),and Remmers and his associates (264, 265, 266, 267, 268, 269, 321, 348).Galt and Grier (126) in a report of an investigation of flying instructorsstate that they found student rating useful and suggest that such ratingsmight well be looked into further. In a very recent study Flesher (115)has suggested that the question of whether or not ratings of an instructormight be inferred from their students' rating of the course taught by theinstructor might well bear investigation. Flesher contends'that studentrating of courses tends to be more objective and frank and hence, more-valid than their ratings of instructors. In a limited test of this hypothe-sis done as a by-product of another study, Flesher obtained correlationsranging from .60 to .82 between course ratings and instructor ratings,with mean ratings for courses tending to be lower and more vari'able.

Reliability of Student Ratin of Instructors

It might be expected that higher reliability coefficients would beobtained for composite student ratings than for composite administratorratings of instructors because of the usually much larger numbers ofstudent raters as compared with administrators making the ratings. Asshown by the investigations summarized in Table 6, however, there isconsiderable variation in the reliability of student rating.

It will be noted that two kinds of correlational studies have beenincluded in Table 6. In most of the studies the correlation coefficientsare based on the number of instructors. This obviously,is.the proper Nwhere reliability of students ratings in differentiating instructoreffectiveness is required. In four studies, Remmers and Brandenburg,(267),Root (281), Smeltzer and Harter (315),and Amatora (4), the reliabilitycoefficients show the consistency with which the same students rate aparticular instructor, using either the same or differentirating devices.These studies give no information as to the reliability of student ratingswith reference to the instructor differentiation problem since the N usedis-the number of student raters and not the instructors rated.

In addition to the studies report,A in Table 6, a number. of investi-gators have reported findings which have a bearing on the reliability of

28

Table 6

Reliability of Student Rating of Instratore

Imrastin Paths? soggIL Student tango AWL raga 22:le 110=

-111.KRUM (1922) 11 elementary Pupil ranking

-_Charm halt it

cadent rating200 (approx.) .77

15 high school 40 *most dependable* per Nil ranking Chau* halt .52!miser

13 high school 40 Nod dependable* perteeth*?

Pupil ranking Chalet halt .91

Guthrie (1927) 17 colitis 285 (approx.) (lvirgg*1.25 per toathari

Average rank Chance halt ofetudent rating

.79

(Numberunreported)oollege

365 freshmen Rank Chance half ofstudent rating

.56

101 tallest 85 advanced lank graph!.rating

Average rank vs.average rating

.61

87 college 215 (approx.) Rank Rorenking (1 oo.later)

.09

Reamers 6 Srandetburo (1927) ) college 301 331 33 Purdue (10items)

Reratinge .83, .50, .641

Board/on (1928) $8 high school 4 sohools Yank Chant.' halfamigo rank*by student

.78

Root (1931) 1 college 200 Check list, 42items

Iterating .95b

Wilton (1932) 97 college (Not reported) Graphic rating

(35 items)

Two studentgroups for etch

.65 to .88

Lnetractr (351'57)

Bowan (1934) 21 student 1 to 40 per teacher Graphic rating(7 !temp)

Chants hallstudent ratings

.91

30 student 10 to 42 per toschsr Purdue (10items)

Chants half eachitem (10es)

.58 to .92

30 student 10 to 42 par teacher Purdue (10items)

Average chancehalf of studentratings

.84

Remmers (1934) 57 student (Not reported) Purdue (pres-entation)

20 studentrating (20 ea)

.11 to .50

Padua (stis.elation)

20 presentationsstudent rating

-.09 to .47

(20 es)Purdue (inter.

set)20 prmienteticts

strodant rating.02 to .35

37 ogles. (Not reported) Pardus () abovetraits)

20 presentationsstudent rating

.43, .35, .29

(lump of 20Lem)

Stltase 6 Barter (1934) 5 college 12 elapses 45 itemes(5-

point)Signed vi. anon -pout rating,(12 ;le)

.63 to 491

Senn (1937) 22 sr. high 600 (average 66 per 11 items (5- Average *banes .69 to .91school feather) point) halt of ratings

(1144)41 Jr. high 900 (average 71 per 11 Haas (5- Avenge ahants .61 to .94

School !gather) point) half Of ratings(U ea)

Asthma & irmentreat (1936) 46 college 17 to 121 per teacher Purdue (10Rene)

fifth placementvi. rating 4,students

.75

23 college (Not reported) Pardee (10its's)

Raratings (6 to7 yr. intervals)

.69

lryan (1941) $6 highschool (16schools)

30 per tarn.? 10 items (S-potlit)

Average Ghana*halt

.13 to .92

4 Onoorrootod reliability seeffidentil correlation based on lastrnoter I except stk.

b Oorreiitton bawl on otodoet

4 Oorrooted to equal that of 25 ratings.'

29

?able 6 (cont.)Reliability.

ttaktike Luslau..mult Student sample .11111... Wathod coeffigient

Davenport (1944) 46 high 1250 (approx.) 23 items (5- Average of 48 .86

school point) pairs ofratings

12,0 (approx.) Pupil ranking Average of 41 .93pairs ofrankings

Student ranking .46TO. rating

Cook & Leeds (1947) 100 elementary 20 per teacher 50 item (Ti.- Split halt of .93°No-7) average

ratings

Oalt & Orier (1947) 277 flying 3.25 So each Graphic (18 Class-to-class .36

instructor class items)

Amatora (1950) (Not report- 1174 elementary 7 intrassalsa Oraphic vs. Cheo's .74 to 44 4

ee) ldet A7 intrascales Graphic vs. Check .75 to .6I4

ListISOT check lists .72 to .41!

Total scores Graphic vs. 2 .90 4 .91'°hook Bits

Total score. Two check lists .79

Drucker & Remmars (101) 17 college 6 students (8 alumni, Puidue (10 6 student vs. e .40 to .64each inetitution) items) aluxml (10 es)

4 Unoorrected ratability coefficients) correlation tassel on instructor except at ILL

b Correlation based on student rater M.

° Corrected to "soil that of 2$ ratings.'

student rating but in which correlation coefficients are not reported. In 1926Fritz (123) found. hat 89 students varied widely in their ability to duplicatetheir judgments on two ratings of one teacher obtained on a seven-part scalea week apart. In 1942 Porter (257) cound, in having pupils- rate some 27 stu-dent teachers, that some classes were considerably more lenient than othei,sePorter gave no statistical basis for his finding nor did he consider thatthe difference might be due to teacher merit ratherthan leniency of pupils,if a teacher taught a better. lesson .in one class than in another. He con-cluded also that pupils tended to ageee'closely in judgments of Vest andpoorest teachers but varied widely in their judgment of the middle group,a finding usually associated with the use of rating scales.

In 1929 Remmers (264), using the Purdue Rating Scale for Instructora,,and in 1934 Starrak (322), analyzing ratings'by students of the entire fac-ulty of Iowa StAte University, reported that reliabilities obtained comparedfavorably with those of the best standardized objective tests. In 1932Flinn (116) found that when an instructor was rated by four different super-visors and four ditferent groups of pupils during a ten-year period thepupil ratings were much more uniform than were the ratings of suriervisors.Flinn's result may simp,ly reflect the fact that the standard erroF'of anarithmetic mean is a function of the number of cases on which 4.t is basedand that a mean based on four 'different supervisors could fluctuate morewidely than one based on a presumably' larger group of pupils;. In 1941Albert (1) obtained consistent resuTtke when 78 high school teachers were

30

rated by their 1578 pupils. In 1946 Remmers' et al, (268) asked 559 engineersto ude the Purdue Rating Scale in rating the Net and worst instructors eachhad 1..n college. The mean differences between beet and worst instructors,as rated by the total group on the'10 traits of the scale and based on atotal possible score of 100, ranged from 17.5 for personal appearance to59.4 fpn stimulating intellectual curiosity. The average difference be-tween means for the 10 characteristics was 39.6. 'These results are not

too meanirigfill in the absence of standard deViationsot the ratings ofbest and worst teachers.

Correlation of Student Rating with Other Measures of Instructor Effective-ness

A number of invu,,Ligators haVe compared the results ofstudent ratingof inetructori with toe obtained from adminietratiye and fellow-teacherratings. Some have rf1Dorte.dittle obtained.wrelatidns as "validity co-efficients." In a faeinstances;e.g., tin (203) and Remmers et ale (269),pupil gain has been used as the criterion with which comparisons were made.

Table 7 summarizes 21 studies, in 12 DI. which correlation coefficientswere reported. The emsiderable differendessah hagnitude of the coefficientsobtiained may be padtty explained in terns of the criteria employed,and in part they.may be a ir,incAion of the small numbers of teachers involvedin mqAt of thog investigations. In gdneral, the coefficients reported arequite high Mere ratingo of teaching efficiency were used for both groupsof. judges.. When a numbe4of traps were'reeed,tholoever, quite a wide range

in 'coefficients resulted. Thilimak,have peqn due to.the.differing inter-"

pretation pl.t0ed on the meartvg of ttle, tra, by different raters. Re-,

sults are not alweys' comparablwarom study to study because of the lack ofstatistical controls. It was not always possible to tell frth the reports,for,example, when pupils ranked their teachered.f corrections were made forpip' of voupp. ;41ight (178) applied arch correction, se did Boardman (39.)

who changed his ranks to nigma positions. BOA' got quite high correlations.Greene's study (135) which showedr a high relatiohihip between the teacher'ssalary .and ranking by pupil§ may mean onl'r that pupils were influenced byacademic position.

Davenport (92) obtaineg a low norrelation between teachers self-

ratings and pupils- ratings .4' tpaehing.on comRarable scales. He found

a zero relationship between pupils' ranktu of; their teachers and theteacher's self-rtihg., Davenport,suggests that a teacher's actual teach-ingmay well bt:ctifferent frcp hoar philospOOY of teaching, simply becausesuch factors as .siz4 of clEiss or other cldasroom factors force her to

compromise.

It is interesting to note that ir, the two.studies where pupil gain wasone of the measures, only slight relationship was found., In the Line'study (203) the low corralat/onmight be due to the small. number of teachers

31 POOR ORIGINAL COPY - BEST


Tetra 7

Cerrelatiom of Otedart latiag with Other *warn of lass rotor liffeotielassa

limbo etailats 771 lotimandisstar ..I2131111611116.1 ...31KU1211U -....661SiS13331... ..,....41.1MMUSCU...---... C0f21455,911...

imight (1M) 314aosaia....a/ 11 40 Iftaktai her 'VAN (froml au") '54Oupsrvicer rating (Amoral .76

merit)

Outhrio (1127) (hobo, sire- 16.5 Ranking mimeo fa. adocaoall .53parted) collap students

boards= (1920 M high achool (lot reported) lankingl tanking! by alderriser 44Ranking' by poor .66

131% (19)0) U high school 44-217 &taking laporoloor Willa er leas agroment1

Once (OM 53 college 11 Rankled Rank of salary .66^or rsaldrd All let oae atLarking 47 loan Doke. 1st 10

11 toe et porelet 10 same itotadontso 1st 10

Rome (1934) 30 %alma 0-40 Pardue RatLog 5011/ Critio Imam, sans lode ..50 to .47

1114ftsk (1934) (Now cars- (lot reported) Ors=la (17 item) iduastianel opeothlifta 75% if "tufted"ported) wet lye Woo= "sleaa=Una faulty apossaarei 256

.4ftergteft ofopinion"

*yr (1957) 23 fr. high =heal 34 152 Oralids (11 item) Aroma addidotrutere .56(ganaml writ)

image 3 adadaistesters .36(10 traits)

al 5r. high school 10.151 Oral t a (11 lieu) Acing* it adathistratora .53(moral merit)

Morays I administrators(10 traits) .55

frookever (1940) 35 high =heel 12-57 Turtle Rating kale doserinthaeant (ass seals) -.Cd12 high oohool 12-57 Tadao Rating kale aroma 3 administrators 35

albrt (1%1) 1 Ugh school 140 Graphic Unintotratore (Saw golf) NFU' scat16 pears (saw scale) sore eloody

with touterthan withsdalal ftrator 6 .

*rd. xs_pa (1941) 40 etodoat reported) ?oda. luting kale 3 rreperviDer Coma seals) -.09 to .916median .60

forgoes& keg. (3%) 1 high =teal 306 Personality (12 her (ear arch) fors raise rhotraits) traits Mahar,

fear the saw 6three lower thanpapils.

Partse (1062) 77 Kudrna (not mortal) Chock list Critic teacher tutted "Class ageasenat

Dampert (1144) 44 Ugh =heel 94.0 Oraphis (15 Base) golf, %lot I Teach. .25gm 'heater*Teach"

P.0.1 lankia$ 3olf, ate I lochs .02

In Nada (1146) 50 Ugh *oboe: 5-6 Nokia( Id wattles} specialist Pea of 2 irerating oi bade of data trignificant atcollected to inePrdere .01 Ural.4 outobispachles Nap .06 to 34

Um (1146) See high fool 5-6 Ranking Coq:aria= 5 eapsrelears .2*17 high =heel Pupil gain .06

Galt 6 Cris. (110) 1141 3.12 Oraphis (11 itoas) Ilopsnrisona--Coate -a11 .06langotel.ors

iVerriftflying prolicioacy

ro-.0ftr-all .04tooling prof tailor,

Ra=n. pia. (1541) as redo 25 peer (tot reported) Ford= Rating Scale Pupal gain CR fee 2 of 12(ds. lob) trait* sisal-

bunt et .01kola a st .03heel.

20 good* Za re" (Rot reported) Purdue Retied Seals Paoli gala Cl for fa traitsignificant at.01 leant 1 st.02 lona.

Ceepse 4 Lois (1931) 30 moll-11141 (Mot "aorta!) Chock 11r1 Nkomo. of =wroth .536lass-liked Iss sip, %ewes Inapotice Ile calm tionaltipsftelonts lersehach

115.a° -U...6 Toteseherie farralatlas seaffialemt.

32

used in this part of the study or to some selective fa6tor in the mannerof choosing which students would rate each teacher. The traits ore whi.ch

differences were significant at the .01 level of Remmers' study (269)were: rating as compared to other instructors in the university and caveof communal apparatus. Those significant at tha .02 level were: super-vision during tests and dailies, knowledg6 of chemistry, returning testsand dailies, should instructor be kept if suitable replacements areavailably.

Intercorrelations of Student Rating of Instructors

Ten studies in which intercorrelRtione were obtained between ratingsby students for more than one trait ar, presented in Table 8. kdivergence

tnvestimator

Remmers & Brandenburg (1927)

Stalnaker & Remmers (1928)

Remmers (1929)

Boardman (1930)

Bowman (194)

neormrs(1930

Starrak (1934)

11,inemtament--7-t(193

Table 8

/ntsroorrelations by Trait of Student Rating of Instructors

Number studentspitcher sample per teaeher Type of ratine Correlation

2 teller

1 college

115 college

87 high school

21 student30 student

64 student & 76college

(Humber unre-ported)entirecollege faculty

32 Purdue scale (10 traits)

94

(Not reported) Purdue scale

(Not reported) Teaching efficiency Y0.1Work hardest ferLike beetDisciplineLearn most

S-40(Not

Seven traitsreported) Purdle stale (10 traits)

10 Presentation of subject matter vs.interest in subject

Stimulating intellectual curiosityvs. interest in subject

Presentation of subject matter vs.stimulating intellectual curiosity

(Not reported) Graphic (17 items)

46 college 17-121 Purdue scalePersonal appearance vs.. sympa-thetic attitude (lowest x)

Stimulating intell. curiosity vs.presentation of subject matter(highest x)

-.02 to .62.25 (virago forall 10 traits)!

-.07 to .72.37 (average forall traits)

.45 (average forall traits)

.73

.82

.75

.89

.12 to .79

.49 to .90

-.005 & .18

.02 It .12

.1201. .19

-.06 to .6347. (average

.for all traits)

.06 to .87

(31 of the 45Zee above .64)

Smalsried & Remmers (1943) 40 student 20-35 Purdue Kale .29 to .88(28 of 45 remabove .60)

Rerrimson ;1949) (900 ratings) (150 total)

0.4t0 r.0",) (got reported) General toting vs. groups of item .06 to .3301 to .66

EffectivenessGeneral merit s. personalityVoice merit vs. voice

.66a

.57a

Mona - -items on

Kale rated bystudents

: o' contingency.

POOR ORIGINAL COPY - 6E01AVAILABLE AT TIME FILMED

33

of results was evident in the various studies as to how much halo effectwas present, even in cases where investigators used the same Guile.Remmers and his associates (264, 266, 267, 321) in their several studieson the Purdue Rating Scale for InstrUctors show very little halo effect.As can be seen from Table 8 they reported consistently low correlations.In one study (321) only seven of the 45 intercorrelations proved to beabove .50.

In the report of his study made in 1934, Remmers (266) says that hisresults emphasize the relative independence of the traits: interest insubject, Trait 1; presentation of subject matter, Trait 5; stimulatingintellectual curiositytrait 10. In thip study.Remmers, in addition tothe correlations reported,in Table'8,'determined halo effect by taking"five samplings of intercorrelations of five randomly selected pupilsagainst five other pupile.for Trait 1 versus Trait 5 4nd Trait 1 versusTrait 10." (Correlations were not computed between 'L,.its 5 and 10 forsome reason.) These were the 3 of the 10 traits appearing on the Pur-due scale that were indicated by students'as being the most important.Remmers averaged the is without, conversion to Fisher z's and withoutregard to the varying numbers of teachers involved in each r and then"corrected for attenuation." The resulting "true" correlation of .34,it seems to the reviewers, may be regarded with more than a little sus-picion. In the case of collegs students, Remmers reported average re'scorrected for attenuation of .52, .38, and .49 for Traits 1 vs. 5, 1 vs.10, and 5 vs, 10; reppectively.

In 1936 Heilman and Armentrout (148) also using the Purdue scale,found considerable halo effect and Smalzried and Remmers (314) in theirfactor analysis study of the Purdue scale, made in 1943, report that 28of the 45 intercorrelations were above .60. Other'investigators usingdifferent scales mention that quite a bit of halo effect was found.Bowman (47), in fact, in,a third o4 a series of studies on student rat-ing used an over-al] rating becaude of the high intercorrelations amongtraits found in his first two studies.

Influence of Grades Received by Students on Their Rating Instructors

The meaning of students" ratings of inptrUctors is doendent to someextent on whether or not such ratings are related to grades received bystudents from the instructor concerjed. If grades,received are.relatedto students' ratings? presumbly instructors who gavehigh grades wouldbe expected to receive higher ratingp from their students than those whogave low grades. The presence or absence of the relationships here con-sidered thus bears significantly on the validity assigned to students'ratings of their instructors.

The array of correlation coefficients presented in Table 9 is some-what bewildering, particularly in the presence therein of coefficients

34

Table 9

Correlation of Grades Received by btudent with Their bating of Treir Instructors

InvpetIgetor Teacher goal*bobber students

Asbdemic measure Correlation_

Remmers (1930) 7 student i 4 college 16-32 Pail*, peals (individualitems)

Students dividedinto two groups on

.46 to .09(biserial

15Lisle of grades -.71 to

Starreb (1934) Entire faculty of

one oolloge(Mot reported) Oraphic scale (17 items) Oredee

Bowan (19)4) 9 rtoloot 0-40 12 charectseistics and..rollfsmos Woven

pole 4 etudent

.004 to .65-.t9 to .)6

Immo geode

daintier S. Herter (1934) 5 sollsgs (lot reportei) C/r1Pht0.00410 (49 items)anonyncds

Final oxaminatior-.20 to .10

Signed Anal essalnstfon -.14 to .17

((roue (1935) (lot reported) (lot reporto) Analyst' of *beet, &*worsts teacher

Credo Xs significantcorrelation

bellaan 4 Arsentrout 46 collets 17-121 Purdue faca;heilfZuer rity -.04

(19)6)iairsunie in (Wine ..24

Bryan (1937) 22 Or. hits school 20-152 Oonsro1 tooling ability Geodes .07

41 Jr. high (school .15

Obtained by oo6putios tba roan of 411 the cruise assigned by each teacher for three quarters.

of substantial magnitude, but in both positive and negative directions. How-

eirer, a hypothesis advanced by Remmers et al. (269) in 1949 makes such re-sults plausible. These authors explain the apparently contradictory resultsobtained bettAreen this study and one by Remmers (265) at an earlier dcte interms of methodology. In the earlier study the ins.ructor was kept constantwhile students were varied in terms of grades and presumably scholasticability. In the 1949 study the instructors were varied on the basis ofwhether or not their classes fell short or exceeded their predicted grade- -presumably a measure of instructor ability. They point out that gradesobtained under a single instructor and due to student differences may beeither positively or negatively related to student ratings but that gradesreflecting instructor differences rather than student differences are posi-tively related to the ratings given instructors.

If one assumes that good students will approve of instructors who con-duct their teaching at a high level (and over the heads of the poorer stu-dents), then, a positive correlation between student ratings and gradeswould result. Conversely, if the instructor pitches his teaching at thelevel of the weaker studente, the brighter students will disapprove and anegative correlation will result. This hypothesis would account both forthe rahge of coefficients obtained and for the fact that when correla-tions are not computed separately for each instructor, coefficients ofnegligible magnitude are found.

In those studies where grades were assigned "subjectively," i.e.,where the instructor was directly responsible for the grade a student

35

received, the relationship between grade and rating may reflect the students?response to the instructor's affective attitude. The relationship betweenstudent ratings and objective grades, on the other tand, may provide anindication of the students' reaction toward teaching competence. Anotherdistinction among studies in this area is whether the correlation is betweenmean grades and ratings (where classes are the unit) as in the study ofHeilman and Armentrout (148) or between individual ratings and grades(where the student is the unit) as in the report of Smeltzer and Harter (315).

i.fluence of Teacher Factors on Student Rating_cf Instructor'Effefltiveness

In addition to the grades a student receives a number of other factorshave been investigated as having a possible influence on student rating ofteachers. Among factors considered hAxe been age and sex of teacher, lengthof students' acquaintance with teacher, length of time teacher had taughtin the school or had taught pupil, pleasurable personal relationship betweenstudent and teacher, and whether or not subject taught by rated teacher wasstudents? favorite subject. In view of the fact that research involvingthese factors has been rather sporadic and that some contradictory resultshave been reported generalizations cannot well be made. The few availablestudies are briefly summarized in Table 10.

Brookover (54, 55) in his two studies found what are apparently some-what contradictory results. This Light be explained by the fact that themeasuring devices used by 3rookover differed fur the two studies. Brook-over concluded that the nature of the pupils? personal relationships withtheir teachers affects their ratings of the teachers' abilities. This

Invorticator Teacher sample

Table 10

Relationship of Teachers Factors to Student Rating

Dueler studentsfir tesehar student ratim__ Teacher foctor Relationetio

group (1935) (got raportA) (gct reported) ?elect best 4 poorest Taught otutantts Mae relationshipteacher favorite subject between favorite

subject II subjecttaught b7 butteacher

Rsilean 4 imontrout (1936) 1.6 college 17-121 Purdue stale Itcperionce, age, 4 sex. go reliable differ-4,0400.

greokower (1940) 37 high school 12-57 Purdue 4 lemon -to- parson Age 6 sax

Davenport (1944) 51 high school 66.6 Graphic scale (15 items) goober smasaters*Flow Teachers Tomb" rtudent had been

taught Dy testbsr

Dreamt. (1945) 66 high .spool axle (lot reported) General asrit AgeLength of aoquaintanoewith pupil

length of tins teacherbad tsught in school

toll in onarunityPupil pin Pleasurable pera7

relatir, dr

No relationship

lloPittificant re-lationship

Positive relationshipPositive relationship

Fesitive relationship

No relationshipLow, but aignificant

negative

conclusion may be based on a form of halo effect, or more, generally, apersistent rosponse set on the part of the pupils. It is interesting tz:

note that in the Brookover 1940 study, ratings of 39 teachers by theirstudents on a scale measuring pleasant personal relationship yielded acorrelation coefficient of .64 when correlated with superintendents'ratings. Boardman (40), in a study reported in 1930 in which pupils'rankings of teaching efficiency were correlated with their rankings ofteachers in terms of for whom they worked hardest, the teacher likedmost, the teat:her having the best order or discipline, and the teacherfrom whom they learned most, found that when other factors were heldconstant pupijet liking fl the teacher 'was the. largest single factor indetermining judgment of to .cher efficiency.

In a longitudinal study of student rating'i in which there was someturnover from year to year, Starrak (322), in 1934, found that ratingscores of teachers tended to increase with successive ratings. This changewas gradual, teachers viginally placed in the lowest quarter moving to .thesecond or third quarter by the end of attwo-year peribd. eWhether thie,im-provement was due to some general biasing factor such as teachersi repu-tations.among students) or due to increased erfectiveness of the teachersbecause of added experience is not clear.

Influence of Student Factors on Student Ratings of Instructor Effectiveness

As in the case of teacher factors, the studies concerned with studentfactors other than grades have been sporadic and not too clearly defined.Often they are just a by-product of studies concerned with other aspectsof student ratings. Available studies have been suMmarized in Table 11.Information on four factors was consiflered: size of class, sex of stu-dents, age or'waturl.ty of students, and intelligence or mental am of.

students. By and large the results of the various st.udleelshow ghat thesefactors have litt],e bearing on student rating. The cLrVilihear resultsfound by StaYrak in'regard to influence of size- of class ,re cf someln-

tereet. It is unfortunate that Heiman and Armentrout did not test For

curvilinearity as the size of the claws in their etZdy ranged from 17' to121. Starrak concluded: "On the liaeis of the ritinui 2Q studonts seemto be the optimum number for a college class." .Alth9ugh his study wasextensive (ratings were made quarterly on all inetruotore of the collegeand cover several years with a total of 40,000 ratings), it is difficultto see how the optimum size of a ()lass could be seleced merely'on thebasis of student ratings.

In the case of the influence of the sex of the pupils it might wellbe expected that girls and boys would differ in their ratings of teachersof .certain subject matter. It is possible that a won,,n teacher better

understands the emotions and thinking of girl students while a man teaohermight deal better with boys and that these differences might vary for dif-ferent student age groups. To a limited extent the few ,studies'on thla

37

/art stlutotteet?,rto (1e71)

lessors (129)

So.otto (1114i

Stunt (1)

tolls 11

Oil/Atter stip et Stelent looters

Palm, etudes%toothy otaaelo at (What

to Stodge% Itatle.

11!111 tutor kan1111401..

al soli's, 16.5

...227aL.d111ka.

esspl..10 stodsnls vs. atm:4Notodotto

Lail* by ow r..dote hoot lover p.11011114

111 lollop 74.1 Perth', male MIA et lases Saa11 awes (lilathin 10 Nistle) serfto flea Air 4f rot.5.455 fttLervit ropol.t tees% to

11 Astons 1041 Calurs1 sorts kit vs. girls .114

intirs sallies totally (Set ssposess) Creeds (17 etas) Sits it shoo wAt sea` ,

Valais 6 troontrpot MU)

bye* Mr)

&tiro 'lollop fotsItt46 s.116.1ti vollogo

H tr. 11th uses!41 so. Otio 5e59o1

(lot resnA)17422

10.19

Ottp01e CU Imo)

Port.' soilsPoodiso foals

Ortpodo (11 Uses)

12 Its MO 104411 10411 tete (U Stag)

U 8f. 1341 5ot461 10111 Crept. (U Item)

U of. MU othoo1 10.118 loo% sprit35. R. 5s59.1 82.151 %meal sort%

fortes/6 6 Nets (100) 1 Uri Wood Ilrutdo (11 ltoos)

boveruft (10461 51 U. 618441 (tot poport011 Optahla (63 item)

bother (1141) (9o0 mug') (590 ulss) OropOlo (6 UAW

'rotor 6 toasts (1151) odUes 1.81% ottollooto

Misr HoltMisr

Iota/ (1011)

131 demi

38

61840%isUhl

llotfu bon)

*IC Mtfotot D.!". !Mel)

Ibtssity

Ms of flossJr. ktleg. vo. of. tolls.'

Iqs sv. One

kto M. 5irlsfivefold el. 119.4

knot )poysetlosboor

aJolt.eolftoto 64 oft.

Mor?Wpm is poemWow', ta tobloot

MIR we trotto sotanitMAIhrt, ON II

deed

(*Up 4666

As vs. woo

nesse. sl. es,? and wore VAC

10 tendel glveNoulateet1p lowertot lnp

Ut tls Polo oestdp

iii of at. oollogo!Igor b.%ottolftostalp

ILselit Met Witt.wet top ossotot o161, ea ropingtooOof Itsit. (103of %eta} ornttio(1) Do latfornemu to otttrtostUot

11,110tUtp 000fttsex otreo1 .64.11

IlauslUtp 000ffIl.Moot mei .61.06

.10:11

butottotoo ftetWisest

trio idol to tateill %Uttar, %Woemu 411 Ut1.Sou nootolltoottrt .t/M

614019t (AIProd) Uos foals.WU du nail

aotatototipilhoortr."

tRit 44roartt

.66 M .41 (A Ass)

(1yAr 'AAA Nonarts WA) IstatitMal U All 111114

Mob Al son Sm.led) Neihmatet AI WA

variable appear to support these generalizations though the most out-standing result is the lack of differences between ratings by the twogroups.

Inventigations in which maturity or age of students was one of thevariables studied appear to be unanimous in the conclusion that thisfactor influenced ratings very little. It should be pointed out thoughthat in almost every pase a every limited range in age of students wasstudied. Usually aniinveatigation covered the range within a particularcollege or high school or was concerned with first year students as com-parod with advanced students regardless of age. The study by Drucker andRemmers (105) is an exception in that it dealt with the relationship be-tween ratings byistudents and ratings by alumni of at least ten years'standing. This 'study it particularly relevant to the frequently raisedobjection to student ratings that students are too,innsture to rate theirinstructors and that many years later, as alumni., students will have.different values and willevaluate their former instrhotors on a' differ-ent and presumably better basis. Pdsicive relationship of some magnitudewas found. What differences did occur showed that the students ranked theirInstructors higher than ,did the alumni. The4difference was significant forthree traits. It is possible that this might reflect a change in the teach-ers, i.e., that they be9ame more effe8tive, rather than a change in opinionof students as they get older. There was high agreement between the stu-dents and alumni as to the relative importance of the ton traits on thescale. The Pearson product-moment correlation coefficient between medianranking4 of these ten traits by the 251 students and 138 alumni was .92.

UsingitudentRatforlstrLnvroemen

There appears to be considerable opinion that, properly used, stu-dent rating has value in bringing about instructor improvement. For ex-

ample, Schutte (296), Clem (77), Flinn (116), Riley et (276), andStuit are', Ebel (327), after having students rate instructors on one formor another, state (generally without adequate research evidence) that stu-dent letting enables instructors to evaluate their courses and teachingperformances and that students' opinions often provide a better basis forself-study and instructor self-improvement than do the opinions of super-visors.

At thetend of both the first and second semesters Bryan (62), in 1938,asked pupils to rate 29 junior high school teachers. He urged a 9-item, 5-

point scale, defined in descriptive phrases. Imp?ovement revealed by theratings was reported in terms of the percentage of items showing a differ-ence between the first and second ratings. In this and subsequent articles(63, 64, 650 66).he indicated that most teachers find the student, ratingshelpful orl'at least, not harmful. This expressed attitude of the teach-ers, however, may reflect a positive bias, in that participation of theteachers in the study was voluntary; thus, the population studie0 ray havebeen one that already believed in the helpfulness of students' ratings.

39

In 1941 Ward et al. (348), using the Purdue Hating Scale for Instructors,asked students to rate 40 practice teachers at the end of one month of in-struction and again at the end of tha semester. The ratings were used indiagnosing the weaknesses of the practice teachers and as stimuli for im-provement. On the retest 39 of the 40 teachers showed a gain in rating. Ap-parently no use was made of a control group of practice teachers who did notget information concerning themselves from student ratings against whichchanges in the experimental group could have been compared.

Porter (257), who based his opinion on a consideration of pupil ratingsof 27 student teachers obtained in 1942, suggested that supervisors' ratingsmay be made more objective by making use of pupil ratings. PresumablyPorter intended that supervisors should utilize pupil evaluation of practiceteaching to support their own evaluation of practice teachers. Whether ornot supervisory estimates thereby become rore objective has not been estab-lished.

Self - Rating of Instructor Effectiveness

Few studies of self-appraisal by teachers have been reported in theMerature, Surveys of rating practices in the schools also show thatself-ratings are sparingly used.

In 1927 Roberts and Draper (279) reported results of a study of prin-cipals' reports obtained from 441 high schools with enrollments ranging from5 to 4000 pupils in all sections of the United States. Of the 398 reportingon the use of self-ratings, principals indicated that in 86 schools teach-ers were required to rate themselves, in 3 schools it was suggested thatthey do so, and in 309 schools no such rating was required,

In 1945 Reavis and Cooper (262) surveyed 123 cities in 34 states andthe District of Columbia. Only one of these required a report of self-appraisal filed for administrative evaluation.

Table 12 summarises seven investigations. In six of these investiga-tions, oor:elations were determined between self ratings and certain othermeasures of effectiveness. Alministrative ratings, pupil ratings, or pupilgain show negligible relationships with teachers' self-ratings. Seven of

the 10 coefficients for different schools reported by Cooke 01) were .21or less. Even the largest, an x of .94, is not significant, having beenobtained with an E of only 25 teachers. The only coefficients signifi-cantly different from zero (at the .01 level) are those obtained by Flory(117) between self-ratings and ratings by friends, Unfortunately Florydid not report the difference between means of self-ratings and ratingsof friends; hence, he provided no information pertinent to the questionas to the tendency to overrate oneself. The close agreement between self-rating and principal's rating in the study by Fichandler (111) might beexplained in part by the teacher's familiarity with the principal's forger

rating.

40

Table 12

Relationship of Self-Rating to Other Measures of Instructor Effectiveness

Mmvestigator

Fiery

(193

0)

Teacher sample

35 student

99 students

Measures compared

Self-rat

179.

triends (same

Self-rating vs.

friends (same

composite rating 5

sole)

composite rating 2

scale)

Correlation

coefficient

.56

.49

tam

an(1930)

116 with 1 sem. experi-

ence

Freyd's graphic self-rating scale vs.

superintendent or principal's ratings

.09

Barr et al. (1935)

66 elementary

TorgersoE self-rating scale vs. pupil

gain (Stanford Achievement)

-.01

Cooke (1937)

9 to 48 elementary

& high school

Self-rating vs. composite of 3 ratings

by principal

.02 to .49

Devenpori. (1944)

44 sr. high school

Self-ratings ("Row I Teach") vs. pupils

ratings ("Row Teachers Teach")

.25

Self-resings vs. pupils' ranking (how

well teacher liked)

.02

Fuller (1946)

45 student

10-point graphic self-rating scale vs..

complasite supervisor rating

.09

The tendency for individuals to overrate themselves is exemplifiedin the atudy of Knight and Franzen (179) who, in 1922, asked 110 studentsto rate themselves in terms of order of interests and also to rate "ideal"and "typical" junior students. The correlation coefficients obtained be-tween self-rating and rating for the ideal was .46 and between order ofinterests for ideal and typical students Was The authors concludethat the data show a well-marked tendency for a person to overrate him-self when he compares himself with others and that the tendency still per-sists when th© judgment is independent of comparison with others.

In only rare instances are an individual's own estimates of his can -petencd acceptedlat full value by his superiors. The educational fieldappears.to,be no exception in this respect. On the basis of the few a"ail-able studies of self - ratings of instructors as well ao from self-ratings ingeneral, k/he obvious, undisguised self- rating scale technique would seem tooffer, little encouragement for furlher investigation. It is possible, how-ever, that thire may be some justification for further exploratory workwith more aubtle.selfrrating instmLents.

Objective Observation of Instructor_bdormance

The emphasis of present day teacher - training institutions appears tobe lea upon selection of a particular kind of person than upon trying toteach Methods, of peeormanee that will insure success in the classroom.The establishment o2 departments of instructor training at various AirForce bases attests to the adherence to this apl'oach in the Air Force.Potential` instructors are given training in methodology and provided withthe 9pportunity 'o practice the appnwed techniques under eiruiated class-room conditions. l., keeping with this emphasis upon instructor performance,it might be expected 'at an inatitctor'e effectiveness might be evaluatedby observing whet the instructor actually does in the classroom, prol4dethat' the observed behaviors are validated against other criteria.

Investigations wing observational methods to determine differences inpe,7ormanc,7t tifect::e and ineffect4ve teachers.have been fpw in numberatiA have vricri widely in lesign. B6bwnell (59) points out this lack,stating VA: the use of th:, technique of continuous, or a series of spaced,obaerviti.111 intended to dItect'changes in some form of behavior as beengr,:ssly neglected in the research work in this area.

Unforttrately, also, most of these studies have leaned rathet heavilyupo the elbjective judEment'Of the observers. In many cases the investi-gat himse;':, and sometimes al administrative orficial, did the observingthol0 'then are a number atedies in which specially trained independentobservers have been employed. The ob,, vational methods used includechit. ; variations of the time-sivtpring technique or check-lie, records of4he ireaehce, absence, or duration of particular activities. In a very

fey cases photographic, phontgraphid, stphographic.reports, and frequencyaunts have also been utilised. Studief in which a rating scale was core.pleted by an individual after observing a classroom situation are not in-cluded in this section,

42 POOR ORIGINAL COPY. 6f*IAVAltABLE AT TIME FILMED

Reliability of Objectiye Observation

In only a few of the studies using the observational approach was thequestion of the reliability of the method considered. Too often it isthought enough to say that the observer has had practice in observing, orreliability was assumed on the basis of the fact that the observer wassupposedly an "expert" in the educational field. These assumptions aremade particularly, of course, in cases where the investigator or an ad-ministrator was the observer,

Where reliability was computed, the criterion most generally usedwas agreement of independent observers determined by use of a correlation

. coefficient or percentage of agreement on the basis of an item ; -by -itemcomparison of records. In a few casas occasion-to-occasion reliabilitywas computed for the same observer. In Table 13 reliability coefficientsare listed, In general, it may be said that the reliability of plannedobservational recording compares Lvorably with that of other methods. .

Anderson and Brewer (7) found that a total of from 300 to 400 minutes ofobservation yielded a high degree of consistency in the sampling of teach-ers' behavior and that observers sere more reliable in recording "domina-tion" than "integration."

Validity of_Objgctive Observation

The moat general criterion of validity of observation has been facevaLidity. In a few studies, however, different methods of evaluatingthe same lessons were compared. In 1930 McAfee (208), who evaluatedteacher efficiency by counting the number of good teaching practices andthe number of poor practices as recorded by one observer on a ailedrating sheet, obtained a correlation coefficient of .4) between thisevaluation and supervisory ratings for a group of 98 teachers. Shannon(304), in 1936, compared three methods for measuring efficiency in teach-ing. One of these was based on an attention score obtained by dividingtotal minutes of observed pupil attention (determined by pupil's 1.e.sturalattitudes and movements) by total possible minutes of pupil attention. The

other two, which were subj,ative, although accomplished by the same individ-uals as the attention score, consisted of five-point ratings made on asoon card containing 43 rubrics grouped under five headings, and rankingof the teaching performance of each teacher within tAs group. the observer-raters were 14 graduate students who had had experience in supervision, andthe teaohere studied were 111 student teachers divided into eight homo-geneous groups, Correlations between score-carS ratings end attentionscores ranged from.07 to .61 and between rankings and attention scores from-.16 to .73 while the correlations betwaen score-card ratings and rankingranged from .38 to .97. It appears that while pupils' attention scoresare more reliable (see Table 13) than the score-card ratings or ranking theydo not compare as closcly with the ratings or ranking as the le:c,ter two

43

Table 15

Reliability of Various Methods of Observing Teaching Effectiveness

Investigator

Teacher sample

WrIghtstone

LIM)

(Not reported)

Shannon (1936)

111 students

Jayne (1945)

Observational Method

Recording

by use of catego-

ries or ',odes of

behaviors

Attention scores

Rating on score card

Ranking

28 elementary

Specific obier7able

rural

activities (check

lint)

10 elementary

rural

Anderson at el.

3 elementary

Ayans (1952)

48 elementary

Time-sampling_of

dominative and

integrative be-

havior

Classroom observation

scale lcheik list

of 26 behavior di-

mensions)

Over-all assessment

Measure of reliability

3 observers vs. 3

observers

3 observers vs. 3

observers

One observer (odd-even

days) for 5 classes

2 observers vs. 2

observers

2 observers vs. 2

observers

2 observers vs. 2

observers

Scores from 2 obser-

vations

Observer vs.

observer

Scores from 2,obser-

vations for 4

observers

Intercorrelation be-

tween observers

Reliability

coefficient

.91 (initiati -;

.93 (cocnern-

tion)

.82 to .96

.67 to .84

.49 to .84

.26 to .92

.41 to .95

.66 to .99

.85 to .93

(domination)

(Per cent agree-

80

ment

86

.74 to .91to (inte)-

grative)

(Per cent agree-

ment 74 to 7?)

.72 to .85

.68 to .84

compare with each other. Since the rankings determined by the two sub-jective measures bear higher correlations than comparisons involvingattention scores, Shannon concludes (gratuitously, it appears to the re-viewers) that "the more subjective means are the better ones of the threeincluded in this investigation."

In a later paper in 1942, Shannon (307) made another study of thevalidity of attention scores. Two seventh and eighth grade classes com-posed of 47 boys and 53 girls were used. Observations were made by threegraduate stuaents while material was read to the class. Pupils werelater given multiple-choice tests covering the material read. Correla-tions between attention scores and test scores were: for boys, .6 ?; for

girls, .34; for total group, .59. The respective correlations betweentest scores and intelligence were .37, .40, and .37, while attention andintelligence correlated .14, .34, and .21. The author concluded, "As-suming that the material read...was new to the children the evidence isdamaging to the validity of the attention measurement. That it has aslight degree of validity is clear, but that it has enough validity towarrant its use in judging classroom activity is worse than doubtful."It aypears to the reviewers that Shannon was unduly pessimistic. Resultsshowing an attention measure which is somewhat more closely related tostudent performance than it is to intelligence have implications justify-ing further research. Strictly speaking, Shannon's study does not pertainto the teaching but rather to pupil factors effecting learning, since theteaching was the same for all pupils.

Some Significant Observational Studies

The findings of a number of studies using the observational rrethodwill to reviewed at some length because tke results appear distinctlyencouraging.

One of the earlier observation studies was that of Barr (16), in1929, who set forth to observe characteristic differences in teachingperformance of good and poor teachers of the social studies. A group of47 superior teachers was selected, on the basis of superintendents' andstate inspectors' ratings, from cities with a population of 4000 andover, Similarly, 47 poor teachers were selected from cities of less than4000, excluding teachers from one- and two-room rural schools. Thesuperior teachers were from the "promoted" group, with better trainingand more experience than the poor teachers. The poor teachers wererated C- or below, and 5C1 did not return to their teaching positionsthe following year. The median experience of the good teachers was12.3 years, while that of the poor teachers was 3.7 years. An obviousdefect of the design of this study was the failure to hold teachingsituation constant by holding type of school constant.

45

Teaching methods were studied by using a combination of subjectiveand objective devices. These included! (1) an annotated stenographic re-port, (2) a time-chart record of one or more recitations, (3) an atten-tion chart for one or more recitations, (4) a time-distribution study ofthe major activities of the recitation periods for one week, (5) a check-list record of one recitation, (6) a comprehensive questionnaire upon thevarious practices of each teacher, (7) superintendents' estimates of theteachers' strengths and weaknesses, (8) the teacher's 'self-analysis ofher teaching.

Barr found the usual subjectively determined qualitative differencesbetween good and poor teachers. Strong points of superior social studyteachers included, for instance, knowledge of subject matter, good tech-nique in asking questions, ability to stimulate interest, and socializa-tion of class work. Elements of weakness included such items as no pro-vision for individual differences, formal textbook teaching, no interestin work, no daily preparation, weak discipline, and no knowledge of sub-ject matter. Barr mentioned 52 separate traits in listing the personalqualities of ecod and poor teachers, including Personal appearance, sin-cerity, energy and vitality, and speaking voice. Berr's results may besomewhat suspect since his evaluation of the qualitative differences mayhave been unintentionally contaminated by foreknowledge of the identityof the good and poor teachers. With respect to quantitative differenceshe found that correlatiors between time distributions of various aspectsof class Ictivities and supervisory 'ratings ranged from -.23 to .17. +Re-lationshirs between particular items on the time-chart record and estimatesof teaching success were also found to be small. Barr concludes that. it isdoubtful 'whether tite e:-.pended in class upon such iteile as those reportedin this study are reliable indices of teaching ability.' indicates thatwithin very broad limits there appear to be no optimum time expenditures forclass actifitiee and that good teachers function successfully within a widerange of time expenditures.

Olson and Wilkinson (248), in 1938, attempted to investigate teacherpersonality as revealed by the amount and kind of verbal direction used inbehavioral control. They used time-sampling records of responses of 30student teachers, 25 women and 5 mdh, to a constant group of children, 13first grade, 13 third grade, and 13 fifth grade pupils, in a one-roomsituation. Each of these grade groups was divided into two subgroups orclasses, equated as nearly as possible for ability. fach teacher wasobserved with each one of the subgroups at :east once. Ten five-minutesamples per teacher were obtained for each class taught. The frequencyand methods of redirzeting children's attention were observed. Distinc-tion was made between language ar.i gestural responses and between positive,directive verbal responses as opposed to negative responses. A 'blanketsabre° was also obtained by noting each five - minute period in which theteacher adjpsted to the cla.ls as a whole, rather than to an individual incontrolling behavior when the attention of an individual child needed to beredirected. Observations were made by a critic teacher. Teacher efficienoy

46

was obtained for each grade, based on independent judgments of schoolprincipal and critic teacher together with average ratings obtained onLeonard's Rating Sheet for Predicting Teaching Success. The coefficientof correlation between the two raters was .73 for the total score on thescale. The correlation between rated teacher efficiency.and total teach-er responbe score was -.06, between teacher efficiency and positive teach-er response, .69, and between teacher efficiency and blanket response,y.62. then correlatiOns were computed between teacher responses of thefive most able teachets and pupils' scores on the Haggerty-Olson-WickmanBehavior Rating Scale, Schedule B, (pupils were rated by principal andcritic teacher) the resulting coefficient,was .69. For the five leastable the qoeffioient was .30. Olson and Wilkinson felt their resultsindicated thrit there as better distribution ofiattention.in terms ofpuloil'need An the case of the able teachers ande quantitative analysisah9wed that the less able teachers tended to avoid contact with themore difficult cases. Conclusions based on correlations involving twoOs of five each, however, cannot be taken too seriously.

Jayne (366), in 1945,'compared pdpil changes with specific observableteacher acWitiess He used 28 teachers of tlostker's (282) study, and anacklitional'10 teachers and 95 pupils; Pupil gain for the 28 teachers wasmeasured' by eomputidg rotidual gain (actual gain minus predicted gain)for 0114881 in social studies on the basis of eight tests, six'w whichwere published t'ests,,andtwo,composed for tflt particglir course Of study.For theAlfdditiOnal teachers, gain whe measured after each class hadhad, a ilebson oniAlaskagby computing posttest minus pretest and recalltest minus prated,. In this study no single, specific observable teaehoract was found,whose frequency or per cent of occurrence was invariablysignificantly correlated with pupil gain. "There ie." Jayne etatee. "ingeneeel, littl, relptienship between specific observable teacher actsand the ptipillgain criterion." The results, however, vAried greatly fordifferent methods of adeessing pupil gain.

Jayne noted that analysis of the coefficients of correlation seemedto indicate that the most significant positive correlations with pupilgain were those having to do with extent to which questiops were based onpupil interest and experience rather than om assigned text, the extentto which &!ie teacher challenged pupils to support ideas, and amount ofspontaneous pupil dis,Jussion. A composite index score, called "Indexof Meaningful Discussion," based on seven items, correlated .80 withpupil.gAin based, on a composite of eielt tests and 109 with pupil gainbased on two tests constructed for the particular course for the 28teachers from the Rostker study; however, this store yielded negativecoefficients of -.67 for ifmediete recall find -.68 for delayed recallfor the 10 additional teachers. Jayne eAplains this by +.heifact. that theaim ee.the lessons.in the first study (Rostker'e) and the scco:id weredifferent. The teaching in the first study was of wider scope, whilethat of the second was steed toward recallgimaking discussion of textbook

47

material essential. Accordingly, Jaynl made up a second composite of itemsrelating to mere recall of assigned material. This yielded higher coeffi-cients for the group of 10 teachers (.82 for immediate recall and .53 fordelayed recall) than it did for the 28 teachers (.19 for the composite ofeight tests and -.35 for the course Lusts). From this it would seem thatteaching procedures that were appropriate and effective under conditionsof the first study may have been inappropriate end ineffective under con-ditions of the second study.

Anderson, Brewer, and Resd have made a series of rather exhaustivestudies of teachers' classroom behavior. Iri the first of their studies in1945 Anderson and Brewer (6) investigated dominative and socially int,e-grative behavior of kindergarten teachers. A total of 101 children in twoschools were observed to determine pupil reaction to the differential be-havior of teachers. Among other results, teachers were found to use domin-ation of individual children more consistently than integrative contacts;teachers tended to dominata boys more often than girls; the number ofteachdr-pupil contact's per hour had little relation to the numbers ofchildren in,the room; for a mental hygiene point, of view, there was "better"tltaphing ill the morning than in the afternoon, It thus appears that in-Oividual children may live in vastly different psychologicAl environ-ments in the same schoolroom.

In a subsequent monograph in 1946 Anderson and Brewer (7) discussedresults of observations of teachers' aminative and integrative contactsin second, fou'rth, and sixth grades. The categories of teacher behaviorobserved }ere largely descriptive and represented activities that rade adifference in the behavior of the children. Fourteen statistically sig-nkficaht differences between children in the two second grade classroomswere found, These were reported to be consistent with the personalitydifferences of the teachers. Pupils of Me more' integrative teachershowfdlsigpificantly lower frequencies bf looking uptiplaying with foreignobjects, in genera) lees conforming and nonconforming behavior, and morespontditeity, initiative, and socialibehavior'ttm did those of the domin-ative teacher. Teacher contacts in thd sixthigrede situation were asfrequent as they were in the second and fourth grades.

In a third monographtin 1946 Andeflon, Brewer, and Reed (8) reporton follow-up :Audios of the effects of dominative and integrative con-tacts on ahildrentelbehavior. Tpe dominating, teacher was, a year later,ftill dominatibg, but the children who had passed on into the thirdGride'no longer showed the undesiraliad personality patterns formerly noted.Two third grade teachers were also observed, ona'bt whom had twice as manyfrequerfoies of domination in conflict contacts with individual, childrenand over four times as many such coOacts !with groups of children as theother teacher. Within the validity of certain mental hygiene assumptions,observations of the teachers' classroom behavior reveal .'d certain strop;points and certain weak points. The authors suggest that the weak pointsare such that they aie amenable to correction by instituting teacher in-service training programs. As a result of the work by Andersor Ai Al.

48

diectssed in the above references, a scale for recording dominative and inte-grative behaviors of tv.chers has been prepared and is to be published in aforthcoming issue of the plied Psychology Monographs.

:n 1952 Ryans (288, 289) reported two studies concerned with factoranalysis of teacher behaviors, one of elementary women teachers (275 thirdand fourth graded and one of high school teachers (115 men and 134 women).These investigations are part of the "Teacher Characteristic Study" being con-ductod by The American Council on Education and the Grant Foundation. Thepurposes of this breeder project as outlined are "(1) to try to determine thepermalaaty patterns of teachers,(at'elementary and secondary school levels)and 2 t* explore the possibility of developing measures that will reflect,and rediA, such patterns as may be found." The research is limited to thestudy of the personal qualities of the teacher on the assumption that cer-tainminiza of intelligence and knowledge of subject matter (and perhapsknowledge of 'techniques" of teaching) are primary requisites for teaching.In the part reported by Ryans, observers trained. over a period of five weeksrecorded observations on a specially devised Classroom Observation Scale.This settle covered 26 behavior dimensions relating directly to teacher be-havior and pupil behavior (presumably reflecting teacher behavior). Eachof these dimensions of behavior was described in terms of opposite polesand was assesset )n a four-point scale. Each elementary teacher was ob-served by at least three different observers on different occasions. Eachhigh school teacher was observed by at least two different observers andsometime by three. Data were factor analyzed by the centroid method.The factors obtained for the two groups of teachers did not duplicate eachother entirely although there are points of similarity. Ryans (287) be-lieves that three correlated factors may serve satisfactorily to describeteacher behavior at both levelst (1) understanding, friendliness, and re-erorsivenees on the part of the teacher; (2) systematic and responsibleteacher behavior; and (3) the teacher's stimulating and original behavior.The three factors show somewhat different relationships in the two schoolsituations, Factors 1 and 3 are m,:et highly correlated in the elementaryschool situation with Factor 2 being relatively independent. In theseocn!ar:t school situation Factcrs 2 and 3 are most highly related withFactor 1 being relatively independent.

The work reviewed in the foregoing section constitutes a preliminaryattack iihich pr6bises to be one of the most productive in this area, Sys-tematic observation should prove fruitful both as a source of rationalehypotheses concerning the nature of teacher effectiveness hnd as a tech-nique for testing such hy;othoses. The relevant categories for observa-tion will of course depend on the farticularbaituation being investigated.Thus, ln Air Force schools, for instance, the observational technique willprotably employ vategories which differ from the categories of observationdeveloped for elementary and secondary school teacher behavior. The dif-ferentiatEt of those behavior- categories which are related to instructoreffectiverxeb from those which are ittaterial remains to be investigated.

49

Another approach to the investigation of the effectiveness of instruc-tors should also be explored further. It is that in which teacher factors,situation, or method are systematically varied as was done, for example,in studies (204, 355) of so-called authoritarian-democratic teaching. It

has teen suggested that the experimental classroom in which factors as-sociated with teaching can be manipulated under controlled conditions mayoffer greater potentialities for achieving successful results than do thecorrelational studies of teaching competence in situ.

Student Change as a Measure of Instructor Effectiveness

Most educational authorities hold that the primary responsibility ofthe instructor is to bring about change in the knowledge, skills, under-standings, attitudes, appreciations, interest, and motivation. of his stu-dents. For advocates of this point of view the determination of instruc-tor effectiveness is logical and straightforward. consists of measuringthe changes that are produced in students as a result of the insLructorlsefforts.

The importance of pupil achievement as a measure of teaching abilityhas long been recognized. As early as 1921 Courtis (85) pointed out thesignificance of student gains as a criterion of teaching efficiency, as wellas the importance of holding constant extraneous factors. He pointed outthat a comparison of pupils? learning curves for incidental learning withtheir curves for direct instruction would provide a means of evaluatingteacher competence. In a later article (86) he cautioned that any methodof measuring teaching effectiveness must involve the use of a "single-variable"measure. He held that it was neces3ary to measure the change in the rate ofgrowth whichtakes place in thestudent when the amount of quality of teach-ing is the only variable in which change occurs. 'ourtis then defined goodor poor teaching by the periods when the actual-growth curve showed markeddeviation from the theoretical growth curve. To illustrate the method, anobserved growth curve of a particular function was compared with a theoreti-cal growth curve for the same function as defined by GompertOs formula ex-pressing the general law of biologic growth. The author maintained that,while much research remained to be done, an exact scientific method hadbeen devised by which the effects of teaching might be precisely measured.

Unfortun,tely the possibility of comparing curves of " incidental learn-ing" with curves of learning from "direct instruction" seems much furtheraway today than it did to Courtis in 1921. While there has been immenseprogress in the science of measurement, this progress has brought a reali-zation of the difficulties invol,'ed in charting intellectual growth curves,particularly in an area as ill-defined as "incidental learning."

The first reported attempt to use student change is n measure of in-structor effectiveness appears to have been that of Hill (155) in 1921.

50

This and subsequent studies can, for purposes of discussion, be dividedinto five classes, according to the kind of measure of student changethat was used or suggested: raw, gain (posttest minus pretest scores);achievement or accomplishment quotient; miscellaneous measures; correctedraw gain (raw gain corrected for initial intelligence, grade, or othervariable); and residual gain (actual gain minus predicted gain).

Among the ir,..ves:Igators using raw gain as their criterion or as oneof their criteria are Baird and Bates (14), Barr at al. (20), Betts (33),Bimson (34), Bowden (46), Brookover (55), Hartmann (146), and Hill (155).Use of raw gain as a criterion is manifestly inadequate. Teaching isonly one among many factors operating to produce changes in students.It is necessary, consequently, to hold constant all factors other than theeffects of the particular teaching situation being studied. Since theearly 19309s raw gain has rarely been used or, if used, was one of sev-eral gain criteria.

The accomplishment (41:1ti .4 or rat*o which is the ratio a pupil's,educational age or quotient, as measured by standardized achievementteats, bears to his mental age or quoticrt, as measured by standardizedintelligence tests, has been widely used as a so-called objective measureof teaching efficiency. This ratio allegedly indicates the extent towhich a child is "working up to his ability." Goodenough (129) pointsout, however, that there are several sources of error which are likelyLo reinforce rather than cancel each other both for individual cases.andin group data. The errors arise from lack of knowledge as to the absolutezero point.in the two measures, from unequal variability, and fromfailureto allow for regression due to errors of measurement. As Goodenough (129)says "...in spite of repeated demonstrations of the unsound assumptionupon which the method is based, it has proved to be one of the most per-sistent die-hards in the history of educational psychology." The accom-plishment or achievement quotient has been used by Barr et al. (20), Coy(88), Crabbs (89), Simmons (310), and Stephens and Lichtenstein (323).

Certain investigators have attempted to use other student measures ascriteria of instructor effectiveness. Thus, in 1934 Davis (96) used pupilachievement in terms of passing or failing state high school examinations;in 1934 Frederick and Hollister (121) used numbers of honor grades and fail-ing grades; in 1935 Lancelot (192) utilized persistence in taking advancedcourses and grades received in those courses; in 1938 Beaumont (26)em-ployed number and achieveMent of students taking advanced courses; in 1945Cheydleur (75) used ranking of instructors according to the ratio of classaverage to group average in college French. While some differences amonginstructors were found, the outcome of none of these studies appeared tobe very significant.

The validity of these student measures as criteria of instructor ef-fectiveness may:well be questioned. Whether or not a given student passesor fails a state elamination, or achieves honor or failing grades, dependsupon many faators besides his teacher. The same is true of the ratio a

51

class average bears to a group average. Where pupils from different schoolsare compared, some means must be found for controlling such variables assize and type of school, equipment and library facilities, and the like.In all cases where groups or classes of pupils are compared, such pupilfactors as intelligence, motivation, interest, and aptitude of pupil for aparticular subject must also be controlled. The reliability and validity ofthe examinations on which the student9s grade is based must also be takeninto consideration. The number of students or theie persistence in takingadvanced courses and the grades achieved in these courses may depend uponthe enthusiasm of the instructor or the interest he is able to build up inhis students in the elementary courses or it may be a function of the repu-tation or competence of instructors teaching the advanced courses. Anysimple measure of student gains that fails to take into account the com-p71.exities involved will almost inevitably produce misleading results.

While not strictly concerned with gains Seyfert and Tyndal (302), in1934, used a rather unique approach in attempting.to evaluate differencesin teaching ability. The subjects were two general science teachers whohad previously been rated best and poorest of a group of seven teachersby superintendent, principal, and supervisors. Four groups of studentswere used: two groups of girls matched for age and score on the TermanIntelligence Test and tINo mixed groups with age and score on the RulonScience Teat held constant. Student achievement was determined in termsof the menidal age nec..issary in order that a student of the less able oftwo teachers may achieve the same score level as a corresponding studentof a better teacher. The difference in teaching ability between the twoteachers was found to be equivalent to about three months of mental growthon the part of the students.

Lancelot (191) says that mere acquisition tests are not sufficient todetermine student gains becalse of the discrepancy between acquisition ofknowledge on the one hand and its retention on the other. fie feels thata better and relatively sound criterion of teaching ability consists inthe degree of retention by the students of knowledge taught. While theo-retically this may be true, use of amount of retention as a criterion posesthe additional problem of finding some method for holding interveninglearning constant.

The first studies to measure student gains by partialling out factorsother than achievement were those of Mose et al. (235) in 1929, Taylor (331)in 1930, and Betts (32) in 1933. Moss et al. in studying the efficiency ofchemistry instructors used classes equated for intelligence and previoustraining in chemistry. Taylor corrected for intial score, age, and in-telligence. Betts, besides using a measure of gain in reading indicatedby the mean of the final scores on the Stanford Achievement Test, studiedthe relationship of various teacher measures with standard deviation ofthe class and measures of heterogeneity and homogeneity of achievementwhich were obtained by combining pupil mean final score and standard devia-tion by formulas. He also computed correlations ,with these teacher meas-ures after partialling out factors of age, initial score, and standarddeviation. He obtained much higher correlations for his teacher measures

52

(intelligence, professional inforration, vocabulary) when the criterion of"hetero-achievement" was used. He points out the pitfalls of judging gainby score alone or by heterogeneity (standard deviation) of the group. The

latter "can be secured by causing dull pupils to forget some of the thingsthey knew initially aid by inducing superior pupils to learn. If bothaverage achievement and heterogeneity of pupil groups are taken in combina-tion, such an influence serves to reduce the composite score because a max-imum composite can be obtained only by increasing both concurrently."

In 1945 Soltal (41) used he ratio of mean pupil achievement to itsstandard deviation as a measure of teaching effectiveness. In comparingsix teachers of United States History for matched groups, of pupils, hereported that one teacher excelled, having a ratio of teaching effective-ness more than four times greater than the teacher next in line, while theratios of the other five were close together, In interpreting Bolton'sfindings one should avoid the fallacy of the tobacco company that adver-tises cigarettes which contain "five times less acid tar." The use ofratios based on educational or psychological test scores involves assump-tions untrue of such scores, namely that their lower limit represents anabsolute zero point and that intervals between scores are equal, We cannever say that one person is four times as intelligent, knows twice asmuch history, or is four times more effective as a teacher than some otherperson. Other investigators who have used corrected raw gain includedBimson (34), Day (Q8), and. Georges (127).

Of the several methods used to measure pupil change, residual pupilgain (i.e., the difference between actual gain and predicted gain) is be-coming more widely used as a criterion of instructor effectiveness. Thismethod is really a more refined example of the corrected raw gain criterionalready discussed. Its main advantage is that a more adequate attempt ismade to hold constant student factors other than the effect of the instruc-tor. The chief disadvantages are its dependence upon the availability ofvalid instruments for measuring student growth, the excessive time requiredto obtain the necessary data, and the rather elaborate statistical assump-tions and analysis involved. With all its difficulties, however, thisappears to be one of the best criteria of instructor effectiveness.

Several versions of residual pupil gain where gain was predicted onthe bases of such student factors as initial scores or intelligence quo-tients have been used by Gotham (132), Jayne (166), Jones (172), LaDuke(188), Lins (203), Remmers et al, (269), Riesch (275), Rolfe (280),Rostker (282), and Von Haden (344). These studies will be considered onsubsequent pages.

Difficulties of the Gains Criterion

Tyler (338) and others, however, have pointed out the difficultieswhich attend the use of student gains as a criterion. In the first place,

53

as was noted earlier, what is meant by gain must be adequately defined. An

instructor is called upon to perform many duties and to accomplish manychanges in his students that are not measurable in terms of subject-matterachievement. Therefore, any measure or measures of student change based. ongain in subject matter alone represents only a small area of the instructor'stotal effectiveness. Thic ,bjection probably applies less or may not beapplicable at all to the Air Force situation, *.i.n which the instructor'schief concern is the teaching of course material of a technical nature.

Determination of gains attributable solely to the teacher is depend-ent on the availability of valid instruments for measuring such growth.If more than just subject-matter learning is to be used, mere use ofachievement test data is manifestly inadequate.

As a practical solution most studies measure student gain on thebasis of subject matter learned on the assumption that it is, if, not thetotal gain, at least probably representative of the major part of theteauheOs job. Even assuming that the type of gain that is to be measuredis known, there arc still difficulties in obtaining a valid measure. If

gains of classes under different schools are compared, use of standardizedachievement tests may only reflect the differences in the teaching programin use in the different schools and not the ability of the different teach-ers. In this connection, tests designed to measure the learning achievedin a given course of study are probably more adequate than the more generalstandardized achievement tests. The nature of the subject matter selectedmay also make a difference. A gain in spelling may be a less complex meas-ure than again in arithmetic. Judging the effectiveness of a teacher whois teaching several subjects, such as is usual in the elementary gradep,on the basis of the gain of his students in a single subject field is ob-viously inadequate.

As another difficulty, en instructor whose students obtained highinitial scores might show up poorly under a gains measure even if correc-tion were made for the high scores. This is because of the limited gainpossible in the case of high original scores and the increased improbabil-ity of making e given gain as the initial score becomes higher. Every testhas a ceiling, a maximum or perfect score beyond which no one can go. If

a student's score is near the top on the initial test he cannot gain asmuch as the person whose score falls near the bottom. This difficultycan be overcome if regression equations are used to obtain predicted finalscores and if the tests used have high enough ceilings. Analysis ofcovariance may also counteract this difficulty.

The gain of a.student with a high initial score for his grade groupis also limited to some extent by the general teaching situation. In mostschools for each subject and each grade there is a definite rahge of diffi-culty of material to be taught. This in effect iriposes a test ceiling forthat particular grade in terms of the subject content considered to fall

54

within its range. For this reason a student who has already made inroadsinto the subject-matter content for his grade will appear to be making lessprogress than a student of lower initial achievement.

Reliability of Student Gain

In Table 14 appear :eliability coefficients of measures of pupilgain as reporCed by five investigators. It will be noted that, as com-pared with conmon].y reported test reliabilities, most of the coefficientsappear to be rather low. Taylor (331) explained the reliability coeffi-cient of .26 for reading progress in terms of the slight numerical changesin scores that took place. Rolfe (280) reported a reliability coefficientof .82 for the initial composite of three Hill tests and a coefficient ofa8 for the final. Hill composite, yet the reliability of the change wasonly .19. In general, reliabilities for gain tended to be lower thanthose reported for either initial or final scores. Rostker (282) sug-gested that this.may have been due to the fact that the gain reliabilitycoefficients contain errors of measurement derived from both the initialand final applications of the tests used. In addition, the reliabilityof a gains measure is dependent not only on the reliabilitk.of initial andfinal measures but also on the correlation between them. The higher thecorrelation of these variables the lower the reliability of the gainsmeasure.

In general, the statistical computations involved in the estimation ofthe reliability of student gains are equivalent to those involved in esti-mating the reliability of differences between test scores. Methods arediscussed and relevant formulas are given, for example, in Lindquist (202).

Interpretation of a reliability coefficient rests on the assumptionthat it has been obtained as the result of correlating comparable measuresof the same thing and that the variable errors are uncorrelated with them-selves and with the true scores. If errors are correlated, it follows thatthe obtained reliability coefficient will be spuriously high. In thisconnection it should be noted that all the correlations reported in Table 14are split-half. These coefficients show the uniformity of,the effect of theinstructor within a single class: They do not give any information as to theconsistency of instructor effectiveness in different classes. Coefficientsof reliability obtained by the split-half method will be increased by anynoninstructor variables that affect a whole class, while class-to-classcorrelations would be decreased by such influences.

An investigator may be interested in the effect of the instructor upona class as a whole or upon certain types of students within a class. Sincemost research in this area,hasbeen concerned with the effectiveness of theinstructor with respect to a class, measures used. in determining pupil gainhave usually consisted of means for groups of students. The reliability ofaverage measures of pupil gain based on a group of pupils may differ fromreliability of gain determined for individual pupils.

55

Table 14

Reliability of Measures of Student Cain

Investigator-

NuMber of classes

Pupil measures

Measure of gain

Reliabilitye

coefficient

Taylor (1930)

105 elementary

classes

Arithmetic (Woody-McCall)

Achievement quo-

tient

.76

Reading (Thorndike)

Achievement quo-

tient

.26

LaDuke (1945)

31 elementary

Appreciatecab

Residual gain

.64

classes

Attitude

Residual gain

.62

(Community

Information

Residual gain

.72

living)

Interest

Residual gain

.53

Rolfe (1945)

47 elementary

UnitbComposite (2 tests)

Residual gain

.46

classes

(Citizenship)

Wrightstone Composite (3

tests)

Residual gain

.69

Hill Composite (3 tests)

Residual gain

.19

UWE (Composite all 8 tests)

Residual Gain

.68

Roatker (1945)

28 elementary

Undtb

Composite (2 tests)

Re3idual gain

.37

classes (Social

study)

Wrightstone Composite (3

tests)

Residual gain

.75

Hill Composite (3 tests)

Residual gain

.44

Stephens &

86 elementary

Arithmetic (Stanford

Mcdified achieve-

.63

Lichtenstein

classes

Achievement)

ment quotient

(1947)

a Uncorrected, split-half of each class or of samples therefrom.

b Tests specially constructed for the-particular course of study.

Correlation of Student Gain with Other Measures of Instructor Effectiveness

Investigatione in which attempts have been made to relate measures ofstudent gain to other presumed measures of instructor effectiveness havebeen summarized in Table 15. Reported coefficients range from -.61 to .81.In more than half of these studies one or more negative correlation coeffi-cients were obtained, This extreme variability may mean that measuresused were inadequate or that the gains criterion is depenaent on factorsother than the teacher such as subject matter taught or pupils' academiclevel. On the other hand, in view of the statistical pitfalls awaiting anunwary user of the student gains criterion, certain of the studies whichshow low or negative re)ationships may merely be reflecting inadequateresearch design.

In five of these studies, Simons (310), Bimson (34), Brookover (55),Von Haden (344), T.emmers et al, (269), correlation coefficients were notcomputed, were not significant, or were not available to the reviewers.Bimson consistentAy found that pupils of teachers rated above the medianmade higher gains than pupils of lower rated teachers, but that greaterprogress was made by pupils of, lowest intelligence. It should be pointedout that Bimson (34) determined a progress quotient by dividing the dif-ference between pretest and posqest scores by I.Q. This procedure appearshighly questionable since it penelizes the brighter students who tend tomake high initial scores. Due to test ceiling the possible gains of thesestudents are less than possible gains of duller students. This in turnfavors the instructor whose effors are directed toward the students oflow I.Q. The "little relationship" reported in Table 15 for Jayne's study(166) is based on the fact that Jayne found significant only 20 or aboutsix per cent of 336 correlations between frequency scores of observable,instructor activity and pupil gains. In the report reviewed, Brookover(55) failed to include statistical analyses which were evidently made inthe original doctoral dissertation from which the article was drawn. Thenegative association which Brookover found between mean gains in pupils'history information and the pleasurable personal-relationship which theteacher has with his pupils is what might be expected. The instructorwho spends his time being a "good fellow" with the students probably tosome extent neglects to impart subj6et matter information.

As one examines the results of correlational studies such as someof those summarized in Table 15, one wonders what thinking lay behind theinvestigations. Some of the variables intercorrelated are so unreasonableand arbitrary that one suspects they, were computed simply because data oncertain variables were available or could Oa readily obtained. In someinstances, certainly, there exist no psychological nor educational groundson which relationship between student gain and some of the variables usedmight reasonably be expected to exist. Computation of such correlationswere obviously largely a waste of time and their reporting makes no contribu-tion to our understanding of the relationship of stuient gains to ratedeffectiveness of instructors.

57

Table 15

Correlation of Measures of Student Dein with Other Reasurea of Teacher Effectivenees

lindatildtgy --1!4411.1. 145149

Hill (1921) 115 elementary

Cribb' (925)

Baird f Sates (1927)

Taylor (1930)

Stamina (1932)

Barr. yj. (1935)

Jody (190)

'rend (1947)

Rogers, (1949)

Ton Haden (1945)

Ling (1946)

Dimon (1957)

Elementary, ruralElementary, tartanSlamsdary, ruralEl...Adam urban

470 elementary

105 elementary

40 elementary

66 elementary

13 high school65 high school

9 (laDuks)

17 (Reetker)

53 ch.:dairylaboratory (/IIesseed prediotionp25.under predin-lion)

50 doing. (28szcoel prdietion)20 under prediction,

stay 1st

Arithmetic, pine ship,spelling

ReedingReadingComposite 5 idled@

( reading arithmetic,pewanehip,

deposit ion)

Readirg

ReadingArittastioReading

(Mot reported)

Arithmetic

Adthastie

Arithmetic

Engltah15 high schonl sui.jects

Corameantty

Social stud le I

Chsalstry

17 high ideal 6 high school rids :smien, I yr. avert-secs

17 high 'oboe, 6 t.ida school subjectswiden, 1 yr. evid-ence

15 de (idol Algebra, generalscience, history

lirookover (1945) 66 high Moil ale U. I. illstory Word-tics

66 Met *dial malt

66 high Mod sale

Gotham (1943) 57 eldidary,11-reat

447110 (1945) 3d elemeralarr natal

U. 8. Nielor7

U. 8. History

Citieenehip course

Ikeial dupes

pkaeury et rail wan

Posttest sinus pretest

Achievement quotientAchievement qactientAchievement qu,tiontAthisvmmant quo,isnt

Achievement quotient

Posttest Maus pretest

(Initial soon., ageintolligand heldcomitent)


Posttest minas pretest

Arhirvescens quotient

Andorran:art quotient

!leaf dual 6414Reddwil gain

Residual gala(original duly)

ROlithal Sob(cniginal study)

Residual wain

Other traits.

Reeldual gala

Residual gain


/Slave of teactioulteraunueAdmin,stretor toting NS/mike) .45Administrator to try Cary)ActalhIstrator rating Detroit) .19

.24

average ranking (5 impended's)Ranking (1. euperrieor)Estinating testi ins it Lenore/Estimating ta..ching in genial

.77-.36

.32-.26

principal rating (gambrel welt) .14

Ccapoolte addaidreter ranking .246 education tpracialty rating ,ce

cula2s1s. Idatm 'ft re tcr mnkirs .24education eydialty rating .10

Ilealni 4rator rating*

Superintendert rating (com-posite, 7 808.08,

Superintended retina (cam-pmate, 7 scals)

Suprintendant rating sath of7 sul..

Superviecr ratingSupervisor refit;

Supervisor follow-up (8 yr.)

Srapordsor follaw-up (11 yr.)

Studsot rating 32 traits,Liam of camearnal epparatusRating Co oared with Pardue .01

instructor tautAmadedge of shauistry .02Returning dailies 6 tests .02Mould instructor be kept .00eepervaion during tests .ceCyrano of deigned wo-k MI

Negligiblerelation

-.61-.28 6 .10

.35

.016

Not significant

Supervisor ratings of personal Nod If )4 losdsta items signifidat

Coaposil 5 supervisor ratings .19Pupil evaluation of teacher .06

effectiveness

Supervisor ratings

Posttest minus pretest Pupils' pleasant personalralatioas

Potted dna pretestPoetise% Mad protest

Residual pia

Resided gas

Addatetrator Mime

Pupil suttee id ability

Superideded, experrieer,observer (5 Hales)

Preview et obesreablosat irate'

Higher r.:14idolisesshow soonelatedlymore milprogress

Da "Witt111600awe rolaLanett"

No sijaiftWttrelationship

Lora irrodtareclat tisehlp

.10

141113 tele-ttearklybetweeneclat"ebeeralesate 6 pupildata

belesive K 1. and 6-rece wawa&b Level at oestidonee of date:4w a wan ratings between inet.eetere vb... 4141110. obtained "Mei la IMmlatry kidder Mae predated sat Mods

vbcee elasee" obtained end*. lover than yradistld.*via ends teeseend 6 Mlle Copirstas Social Studies test

Iledsbers* nodal adjeetnent InventoryWeed RUM Conduit TMlrefterst kale ter Ileaserind Mask ?owed ?wisher

58

:able 15 (Cori.)

1nvestizater Sub 16: MAISEL 21111.

i.esiduel pia- Inter-nation tort

Residual gain(0capr. ension

:fgVtd:1!°211for,nation. Interest)

Residual pinResidual pin

Residual gain

Residual gain

feeithal gain°s ual gaika

Rs s idual gain:Residual gateResidual gain°

WALL sg teacher effect isppins

LaDuine (1945)

Rostker (1945)

Rollo (1945)

Much (1949)

31 elementary, 1 -root

241 tleaontary, rusk

47 elenentar2. 1- 6Lama

22 sIonentral

Community /icing course

kohl .1AdiesSnoi.1 studios

Citisonship *auto

1:blovenent in socialstudio,

hrsorklityRight condo :6Social MI:arta:eraatt itoloCaeposite ail 5

Etitsrintamdent riling ca.se) .17Overvieor tosobr rating (5 -.03

Kale.)Superintendent rating (2 scale.) .02Supervisor teartter ralins -.25

ecelai)

Investisatcr rating () soles) .23, .26Supervisor rating (3 so.1,1) -.ca, -.01, .15

3 %Aim sale. .36. .37, .43

Superintendent nett:4 .22

Superintenimt rating .20Superintendent rting .35Superintendent rating .24Superintendent rating .elSuperintendent rat (AS at

1111111OUPII

a lacolasivo of 1- and 2-roon schcalsb of __. ofa.sfea Oa CurmaNdIOCID Oa .i.hOrlIr011 L wan tretrici Instotorr cl.813; atitn>4 wics hip I, sro

shoo, 'laws obtained grades later Van predicted.Tuts sac:: Townsend 6 Willie tanporstivo Social Studies Toot

Wasbburno Soohl Adjustatert loventorpWood Right Conduct testRsoserst Scale for lieasurits attitude Tested, roact.1,

The Greif. -Iscrel.7:ncie5 in the findinc,s ;11-1c, x-

amined the student gains criterion emphasize the extreme variability In re-lationship among criteria used tu indicate instructor ability. Al-Tarently,

at :least within the limits cf the measures so far used, the re/atiorv;hipbetween administrative opinion of a teachervs competence and the amount ofsubject matter that teacher will impart to her students cannot he predicted.While there may be no single measure that correlates consistently withmeasures of student change, it appears, as Jayne (166) has po5.,,ed out,that a composite index may be found which has high correlati,n pith thestudent gains criterion.

THE PREDICTORS--TRAITS AND QUALITIES ASSUMED TO BE RELATEDTO INSTRUCTOR EFFECTIVENESS

is might be expected many reaearch investigations have been concernedwith measuring or assaying these abilities) traits, qualities, and person-ality characteristics which are assumed to contribute to success in teach-ing. Assumptions are usually implicit also that the effect of a traittends to be constant, that potential instructors can be selected on thebasis of these traits, and that effective and ineffective instructors canto differentiated in terms of patterns of traits. Traits related tofailure have also been investigated and are summarized in a later section.

Among the traits and qualities of teachers that have ben investigated,studies most frequently have been concerned with the following characteris-ticJ: intelligence, scholastic achievement (academic level reached orgrades obtained), knowledge of subject matter, age and expe:ience, cultural

59POOR ORIGINAL COPY -BEST


backgryand, tcaCoing ability, teaching aptitude, professional attitude towardand interest in teaching, emotional stability and social adjustment, and per-

sonality. Attempts have been made to evaluate and relate a teacherts per-sonality in general to teaching success and also to indicate the relation-ship to teaching ability of such allegedly specific personality traits asaggressivenez..1 and control, appearance, considerateness, cooperativeness,enthusiasm, motivation, objectivity, and reliability. Some factor analysis

stu,.;ies have al-o been made (12, 70, 142, 149, 189, 220, 277, 265, 288, 289,295, 314) in order to determine to what extent various factors contributeto teaching 'ffectiveness.

In the following pages the available quantitative studies relative tothese traits and qualities will be summarized. In considering the variouscorrelation coefficients reported it should be remembered that their moan -iigfulness may be limited by the use of unvalidated criteria such as ratings,and their magnitudes may be limited by unreliabilities of the criterion aswell as of the predi. .9 POOR ORIGINAL COPY dESI

AVAILABLE TIME FILMED

Intelligence as Related to Instructor Effectiveness

It .3uld appear at first glance that of the desirable teacher charac-teristics one a the most important should be intellectual brightness.That there might be a relationship between teaching ability and intelli-gence was realized even before the Stanford revision of the Binet-SimonIntelligence Test popularized the and the Army Alpha provided aneasily accessible measure. This implicit hypothesis that teaching effec-tiveness and intelligence are related is reflected in the correlationsbetween ratings of these two teacher variables; such correlations may behigh because of halo effect, or more accurately, because of the logicalerror of assuming that intelligence and fetcher merit are related.

In 1912 for instance, Boyce (48), basing his findings on thn rankingsof 325 secondary school teachers by 27 administrators, reported a correla-tion coefficient of .71 between ranking on general merit and ranked esti-mate of intellectual capacity. As late as 1929 Baird and Bates (gi) Se-cured subjective ratings of intelligence of 444 elementary schoo. teacherbmade by their principals with a five-point scale. When general meritratings were correlated with estimates of seneraI intelligence a correla-tion caefficleAt of .58 was obtained. The corresponding coeffil.ient forsocial intelligence was .57. When these coefficients are compared withthose obtained by using more objective measures of intelligence (seeTable 17), the presence of the halo effect in these estimates of intel-ligence becomes apparpnt,

IntellkssoLkst Scores AS Related to Instructor Effectivenes.

In 55 of the available studies which have appeared in the last 25years, attempts have been made to relate objective measures of intelli-gence of the teacher to various measures or estimates of teaching

60

effectiveness. Intelligence test scores have been correlated with practiceteaching ratings or grades, various administrative ratings, student ratings,and pupil gains. In the studies mentioned, 17 different intelligence exami-nations (in some cases two or more) were employed. The American Councilon Education Psychological Examination was used in 12. studies and the Army

Alpha in? studies.

In 15 studies (12, POI 52, 58, 37, 119, 125, 172, 203, 208, 247, 261,275, 280, 323) negative correlations *ere renorted th largest being those

of Riesch (275) r n -.34, Jones (172) r = -.26 and Stephens ana Lichtenstein

(323) r 3 -.24. All three of these coefficients were obtained when intelli-gence of the teacher los correlated with student gains. In.16 investiga-

tions (20, 32, 39, 56, 57, 79, 119, 133, 191, 172, 184, 185, 188, 256, 282,320) positive correlations with r = .30 or more ars reported between teach-ers' intelligence test scores ana various criteria of teacher effectiveness.The highest relationship, a correlation coefficient of .57 with studentgains, was reported by Rostker (282) fur a group of 28 teachers. (LuDuke in

Reference 188 mentioned a coefficient of .61 in the conclusion of his study,but no zero-order coefficient of this magnitude appears elsewhere in hisreport. Between a composite measure of student gains and teacher intelli-gence he found a coefficient of .43.) Among the 55 available studies inwhich correlations are reported between'intelAgence scores and variouscriteria of teacher effectiveness, the number of subjects is often so small- -in one instance, in part of Jones' (172), study, as few as six- -that thecorrelation coefficients reported have little meaning.

In Table 16 are shown correlation coefficients obtained between scoreson the American Council of Educstion'Psychological Examinati6h and severalcriteria of instructor effectiveness, It will'be observed that the corre-lation coefficients reported vary.from -.26 to .57. This would appear toindicate that whether or not intelligelice is an important variable in thesuccess of the teacher depends upon the situation.

In Table 17 appear the 24 studies (8 have 2 entries)'in which find-ings are given for,90.or more teachen4. The first 18 entries are con-cerned with student-teacher groupe. With the exception of the VYle (261),Breckinrido (52), and Fuller (125), investigations most of the studiesreport a low positive correlation betweenantelligence and practiceteaching grade or rating. The lhst ll entries relatd bo groups of,teachersin the regular school situation. Except for Somers (320), Kriner (185)9and Gould (133), these latter investigations appear to show that there isonly 4 slight relationship between the intelligence and rated success ofa teacher.

It was noted earlier that student grade or achievement is sometimes'negatively related to the rating of teachers by students and sometimespositively, because some teachers may be batter for bright students kndothers for dull students. Similarly, the telattonship of instructor In-telligence to instructor competence may be positive, negative, or non-existent depending upon motivation and ability of students, subject matter,

61

Table 16

Corre/atiori Between A.C.E. Psychological Examination and

Various Measures of Teach..?.r Effectiveness

Investigator

Teacher sample

Effectiveness measure

Correlation

Martin (1944)

123 with 1 yr. experience

Superintendent rating

.02

Practice teaching

.14

LaDuke (1945)

31 elementary, 1 room

Pupil gain

.43

Rolfe (1945)a

47 al-mentary, 1- & 2-roam

Pupil gain (social stud-

les)

-.10

Rtetker (194.5')a

28 elementary, ruralb

Pupil gain (social stud-

ies)

.57

3eagpe (1945)

31 student

Practice teaching rating

.12

Seagpe (1946)

25 elementary, 2 yr. expert-

Supervisor ranking

.10

ence

a Bare (280) and &otter (282) r?so found correlations between Teachers College Psycho-

logical...F.-x=0.3=U= scores and pupil achievement, the coefficients being .05 and .40, respectively_

to eldition to the A.C.E., Jones (172) and Lins (203) used the Hermon-Nelson Intelligence Test.

Jones reports correlations of .11 an

.02 between the Henson-Nelscn Intelligence Test and super-

visors' ratings awl pupil gains respectively.

Line obtained correlations of .15, -.25, and .04

between intelligence scores on this test end the criteria stated above.

bExclusive of 1- and 2-room schools.

Coefficient of contingency.

Table 16 (Cont.)

Investigator

Teacher sample

Effee_Ttireness measure

Correlation

Jones

(194

6)a

31 high school


.10

19 high school

Pupil gain

-.26

6 E

zIgl

ish

Pupil gain

-.24

1.1no

(194

6)a

56high school women, 1 yr.

experience

Five observers* ratings

.24

168 high school women, 1 yr.

experience

Student rating

-.12

17 high school women, 1 yr.

experience

Pupil gain

-.01

Gould (1947)

96Lble4 school, 1 yr. experi-

ence

Administrator rating

.53C

Stephen. &

20 elementary

Pupil gain

-.24

Lichtenstein (1947)

Bach (1952)

76 high school, 1. sem. experi-

ence

Practice teaching

.002

55high school, 1 sem- experi-

Princirel rating (2

-.01 &

Mee

different blanks)

-.10

2 different superintend-

-.01 &

ents* ratings same

scale

-.08

a Rolfe (06) and Rastiaer (282) alao found correlations between Teachers College Psycho-

logical Examlnatias scores and pupil achievement, the coefficienta being .05 and .40, respectively.

In addition to the A-C., Jones (172) and Line (203) used the Benmon-Belson Intelligence Test.

Jones reports correlaticms of .21 and .02 between the Sew-Belsou Intelligence Test and super-

visors* ratings rood pupil gains respectively.

tins obtained correlations of .15, -.25, and .04

between intelligence scores on this teat and the criteria stated above.

b Exclusive of 1- and 2-room schools.

Coefficient of cc:itineracy.

1441e 17

Cor7v1etion4 Mtvren 74ric.us fyychoIosioal Eusisstione 441 Iteasuree of TeacherEffectiveness for Groups of fea:Sere of 90 or Sort

vatltator

:Titre, (1922)

4:64 rs (1923)

Cooper (1921)

1444rin (1927)

(1411)

6solts (1921)

Merit $ 011111441 (1429)

Brew (19V)

Collins (19)0)

1:1144s (11)0)

lesits4, 1 Inter (19)0)

Orectintido (19)1)

Prow (1932)

Case 6 Corne11 (193))

Sent (1117)

(199)

Ilertis 1144)

taller (1946)

tft. itney (192t)

&mart (1411)

Estrin (19r)

tilt (1421)

4411411 (10)0

I/1(8 1,13e (39)0)

Initar (19)C)

lertesteriedt (1131)

Ones (1931)

Cote 1 ttettll (1999)

rums, ow)

0144411er (19)x)

tether 1937)

wrtln 111410

04411 (11147)

Jtecktt. 044:14 Perobolotical 4441 AffectIvisniss aat tr's awlatidiSt444nt teachers

7110 Arty Alpha I Therndite Rating .16

156 Cospoeita toots) Ming .91

107 Thurston' Orad4 .22°

101 Any Alpha Mtlnt .10

351 (14 sr.) Detroit kivneed Oriole105 ( vast 44) Detroit 4dvencecl Orsde -.01

101 Ara 4104 Geode .09Terms Grste .1)

IS) Otis s

Thersdilte

Oritde

Cr414

.5n

.30

104 41wrote It4 Grote .34

116 (444140 Drees Univerbity WU* .11

1C0 11,4 etas. luting .24

(toalleeled)( ArtncoAll

Erg Alphatrip 11444

Grade .09

154 Ther4diks Grad. .31136

sfc. (.lyres.)

114r4dies

Te non

1st it.

er414

.24

.

59/ Mier Melba's Salim .10

1H Gate state GM. .19

11) 1 qtheIngtee1 Grid I .14Lostraitie4

93 ftychelagits1torinettan

Sating .06

45 tiller Sys1ssies it4 .td

It-earrice tescHar

1.3 rsorMite 6 erg, L1rhe 4444rionr .0)

110 .irk 1 rr. 44torier.t4 Covelito i1 lute) Sopervisor Mina .43

Ice with 1 tr. 4speriorce 441 1104 &weds, ecif tont rat L4 .04

49 41tS 1 rt. egariest4 Niro!! Alltsoed trim lettl s tt tot t .0).119 t IT. erforisect Detroit kirk.ced Prtscif43 ettleate .04

SISIWttI n Vioncvs /44rwiser ntise .011112 olesentary Ostrow CRIIIRSt t111fie .90

141 410 / tr. evers...g. 092r 11.44 ri .dent .00

116 Stet setool, 1 see.oner14444

r15

Dram Oniterlitt

Arey SliiiisIsiottster rsitr4

idertrletto fitted

.15

.09

1)7 Them! OS kialiviettitet 141141 .11

500 (attest.) e1grettatt TOMS% S.44rviset tatitti1 yr. Seporlis

alC (4(41et.) 410114ti terw 6441',/k* rstiseI rt. lisleriewsslsrentaft. 1 IT. forma EsT4 esrytt .05

441 by

15) 01140sitel 1 le. ISOisotoel

041 Katt bldseirtenterd 1 KU-.441 ratted

.9)

!Ur Nem a WIWI AI560 414444940, ter1441 sealideetrette redtted .01 1 01

94 4911 1 r. everiese ihnroteee Serortatrehml MINI .35

It) 410 1 tr. tliortabes rarseiaberal Mk:1m et ester 44114 .01Elsainst144

9$ irlth 1 H. entrisoves E470,414141,1tzsainst lea

11141,41strstst retied

(t47'"--..111444411 6. 6aKittealg

64

classroom conditions, and other factors. In correlating instructor in-telligence with effectiveness, the assumption is implicit that the effectof intelligence is constant regardless of time, type of student, natureof subject matter, educational objectives, classroom climate, and thelike. The variety of relationships found by investigators in this areaprovides strong support for questioning this assumption. In some casestoo match intelligence on the part of the teacher may constitute somewhatof a handicap. This is understandable when one considers the possibilitythat some teachers may not be able to "get down" to the level of thestudent. In a technical school situation this might very well be thecase, especially where civilians having considerable technical or academie'training are employed.

Considering tho more or less restricted range into which the in-telligence of a public school teacher may be expected to fall (intelli-gence quotients with a range of 103 to 126 and an average of 114 as re-ported in findings with the Army Alphal); for all practical purposes thisvariable is of little value as a single predictor of rated teacher success,inasmuch as it would be used with a population already selected on thebasis of intelligence.

Although no particular relationship is Clown between into Aigcnoe vfteachers in general and teaching competence, it is possible that in thecase of teachers of more advanced subjeot matter a significant relaticn-ship might be found. The investigations of Knight (178), Jones (171),Boardman (39), Ullman (339, 340), and Jones (172) who worked with highschool teaohere might be expected to throw some light on the possibility.With the exception of the correlation reported by Jones (172) who obtaineda coefficient of -.26 when he correlated intelligence of 19 high schoolteachers with student gains, correlations ranged from .10 t, .45, thelatter coefficient being obtained by Knight (178), apparentl 4th lessthan 38 subjects, It is seen that these correlation coefficients tenduo be somewhat higher and somewhat less variably than those reported forelementary teachers.

In 1927 Pyle (260) pointed out 111 we find that intelligence asdetermined by various types of psychological experiments is a just-barely-perceptible factor in teaching success." The studies involving groups ofteachers of 90 or mere which were eummarized in Table 17 have largely sup-ported this generalization to the extent that low positive correlations haveusually been reported. Of 42 product- moment correlation coefficients be-tween some measure of intelligence of the teacher and some criterion ofteaching success, 37 were positive and ranged from zero to .48 while only5 were negative, the largest of those latter being -.08.

Intelligence test scores are probably of little value as indicatorsof success failure with respect to teachers of the lower academicgrades, This is probably due to the narrow range of scores involved, the

1Army A)pha scores range from 97 to 148 with an average score of 122.

65

teachers from v.hom intelligence test scores have been obtained for researchpurposes constituting a highly selected sample of the total population. Insome teaching situations the intelligence factor, however, may make somecontribucion when used with measures of other instructor variables as apredictive device. If one considers the mean scores of instructors teach-ing very diverse subject matter (e.g., calculus vs. trade school) signifi-cant differences in intelligence between instructor groups may appear. Anintelligence tests score below the minimum found for an instructor of cer-tain subject matter might well predict lack of success in teaching, forinstance in the more complex levels of such a field as mathematics.

In the Air Force technical schools there is some indication that in-telligence may be somewhat more Important as an instructor variable. Morshand Swanson (232) reported a correlation coefficient of .46 (signifi-cantly different from zero at the .01 level) between Army General Classi-fication Test scores and supervisors' ratings of 38 instructors of recip-rocating engine courses on the Instructor Description Form (154).

The restriction of range of intelligence which may have kept thecorrelation coefficients low when obtained with elementary or high schoolteachers may not occur in the instructor population of the Air Forcewhere the range of intelligence may be much greater than that of civilianteachers. It may be expected, however, that intelligence will bear adiffering relationship to teaching success, depending upon the complexityof the course material and the level of student aptitude and experiencecompared with that of the instructor. Consequently, great care must betaken in generalizing from one course to another. The correlation of in-structor intelligence with the criterion of student gains might well bequite different for high level courses, such as the weather courses, inwhich the students are highly selected, as compared with a course such assheet metal,

Education as Related to Instructor Effectiveness

From 1905 to 1951 some 26 studies were made of the relation of amountor kind of education of a teacher to success in the classroom. In 9 ofthese studies statistical relationships between some criterion of instructorefficiency and amount of education were determined. These investigations?Ave been summarised in Table 18.

Results of these studies are difficult to interpret. In the greatmajority of the investigations, the range of education is given but thevariability in the amount of education for the teachers studied is notindicated. As in the case of intelligence the restriction of range inthe amount of education tends to lower the obtained correlation. Alsothe criterion used in most of these studies is highly suspect and anyrelationship found may primarily reflect contaminatic in the criterion.The two hie-est correlations were one of .42 found by Enight (178) and oneof .41 reported by Davis and French (97). In the Knight study the education

66

lrsmtlaater

*dos (1400

Boogies, 4 ftzspr (1910)

Vora (1914)

Mita (1111)

taiga (ay)

Loss (OA)

5.1,10 IS

*lotion it lioNalts to Inv/treater 12tootIroisoo

Soutar molt

SOt olgoostsr7

4% o1 osontii2

busty if olloitivoida

Ikaorriar ranking

300 (vireo.) Mott school Suorintendost I,sung

Off total rattail

ol000mtary b ties "Awl Pollee toietor rating

P.41,. ago ter Atlas

4oporvloor attires

4441-.1Por tottato

1$4, olosontuaUV lip *Owl

olosarlora4.14 Up oe,,a1

laroloolooloo 1 Boor (19O) 4041 r;1113 Ir. ilp 1414.-1

Dario 1 Preset (IOW 4154

hoebo (1941)

Role, ALA. (1119)

broom (1112)

4aris (1934)

Olosoil1or tint)

idetitwit (193 )

?Nit iitin

P etts4 (1w)

I.U. 11145)

%al (1%&)

5C evel 50 ttot.).144,7

(tabu Ittypo. old)

10.3

124) IAA teal

400 elelreUtt

(Ilatot irropotte4)ale

Loa us% woo)

(11a9at an sr to4oll000lory I blot ...oil

4 oloratatt. I., 4.4wor

139 olosistart l literRho*:

lulbsu I titStaistois (I%) olsustary

h owl (1%1)

tam. laa (%M)

116,61 bosom (1951)

112,24 (taI)

beilisors tostirtilsoll

Sopordoor patina!up rel oar ro t

Sop 1.1 ter riltimarotolose toting

Frt,clnl mashyPates* rarkis1Ofttolol Miss

Pei solos: rot

sildstotritor

tool

irititpol, assist**iria:lpl, I 0.4irritett %Oise

tro.t.to (*solid VW*

Prirolool ratite

Pr!: ran1.1.4

p411 pia

Pala la 105 anwrortaratietwo a re of1 yr.

11.4141 get*

..1414=1-4.11104116...

anent of trolalag

Asemmt et tisiAtag

Mould of Ittlalaa

Laval of tollptrolalag

Yowl protortisea1otole

Moat pot000looalMW

hootiotort so simalot

Oioirl rotes fists;stow

AI-at of Orel/dadtirtiaAtt* of profs..

Dismal nb3.otstwat of pr trilateral

1.10,11P4

111 etCrool.o.

&mint!

}toff healtro MN

14 Whim% oollogtetfttlif

PO.D. no. V.I.

Volts to "Natio!abaft 110 laaolas torota tsat4

loorsol 4 It.

Coraati I rololt4

NPINI Woe% toltraloSog 0.11}orttan% sod ototatiestvra:101

turent W ltalatalultra 1.514

frtlalos Oars 149.foetal

tiortubvitpropto Masai

92241 K irotrootosalalas,

Correlatloo

Conoso sreduato logosoeoctotol is towerdada

Sorsa school palateoffostioop oollses,

stet, kisb Wool snit.1stot

Coopriota oltgally latarot of UGH havingpro to s distil 1,511114.

.21 oItsisitar;

IktO school

.34 doesotirl

.36 MO school

.11 I -.22

.92 1 ...III

..135.19.14.19

.0

Crilleil lotto§ tosoirool boon* geak.,Vachon ookbortowel poet tioshor

bolts too Vissitioost

90.31 dwelt toies1

iltwohorsIleIN

Oiolig itte110449.71 chow III554.

3.0 (tritIal roils)

listings twasil Si4.1#44 tot loodarobtu tripe! Olson

loro 115111 101

Do rolotioottip

blal000lda tr.Oifleort al .09 1-441

So 4ithisoe4 estop%ihot WAS 'SU rio1-his'.)woo@ samottegyUor

st

M 41olaHlopy 4moritisato

rlisos otitisPeril

tan et Vaal (1.4)has St oiraulag (14)

e a per*Womb'

tatormadua

soUat kers& Mi{ hem MU 1111 awl area.

11.1.4 sego. walla2511 24.4 awe swifts

4% 1.7. isitiorteripo, 1.1. Imitators

hosiviii. Wirekrorrt sit retire

tan a 64.801.4tape Naala .04

et

191 olaaalarf Capootto OrtrosisratIodo

low44 atlas/ .11

6?

measure was that of amount of in-service training taken and the relationshipmay only reflect the extent to which raters look with high favor on suchtraining. Davi& and French compared official ratings reported to a stateeducational department with amount of professional training. Here againthe raters were probably aware of the amount of training each teacher had,and such knowledge may well have influenced their ratings.

Another source of error in studies comparing &mount of edt:aticnwith teaching efficiency is that often the factorn of age and years ofteaching are not held constant. Frequently the teachers with the "poorer"educational background as defined in the different studies belong to thegroup of older teachers so that factors other than amount of educationmay be operating to result in their getting a lower rating.

Some of the studies reported are too old to have much significancefor present day education. The variables, elementary teaching and col-lege. education for instance, have changed rtdically since 1905. Thestudies are of some historical interest, however, and may also be usedto see if any changes have occurred. It VI interesting to note that in1905 Meriam (225) said, "Professional work in Normal Schools does notcontribute as much as one would expect, t,lough Normal School graduatesdo better than teachers in city training schools, and these in turnbetter than teachers with no professional education." Then in 1938 Allen(2) in a study of 60 superior and 60 interior teachers makes the followingsimilar statement, "After a relatively high minimal background has beenreached in such items ad are normally stressed in substantial teacher-training programs, further addition to these backgrounds are not necessarilythe things which differentiate superior from inferior teachers."

In 1944 Daniel (91) reported a study in which educational levels ofteachers rated "excellent" were compared with the percentage of all teach-ers of their &tate having the same educational level, He asked a largesampling of superintendents, supervisors, principals, teachers, pupils,And patrons of schools in South Carolina to indicate their "best" teachers.In Table 19 is shown the percentage of "best" teachers as indicated bypupils and patrons (parents) for the various educational levels and per-centages of the teacher population for the state as a whole. Unfortunately,these data do not necessarily show that teachers with better education arereally better teaches. They may have been rated "best" because of theireducation.

In 1951 Ryans (286) found no significant differences when 275 elementaryteachers were divided into groups based on amount of college training. Thecriterion of teaching effectiveness was factor scores obtained when compositeobserver rating was factor analyzed by the centroid method, The nontingencycoefficient based on 191 cases was .11.

Considered as a group the investigations of semester hours or years ofeducation as related to instructor efficiency have shown that any relationshipthat may exist is slight. Results of these studies suggest that further

68

Table 19

Educational Qualifications of "Best," White, High School Teachersa

South Carolina"Best" teachers leachers_

Educational level ti------3 _E_

High school graduation or less 1 0.5 0.5

2 years of college 2 0.9 0.33 years of college 3 1.5 0.9Bachelor's degree 45 21.8 61.5

Bachelor's degree plus 98 47.6 24.4Master's degree 20 9,7 10.2Master's degree plus 37 18.0 2.0

aTaken from a study by DaLiel (91).

search along lines followed here for factors which differentiate the effec-tive from the ineMctive teacher "411 probably not be too rewarding.

Such variables as "years of education" or "semester hours" lack mean-ing unless psychological or educational changes induced in individualsundergoing training can be measured. Whether or not a teacher has had acourse in educational psychology has little significance because of thevariation in such courses from college to college and even from instructorto instructor within a given college. We learn from these studies, whatwe might have suspected from the beginning, that the amounts of educationor semester hours are meaningless variables in relation to measures ofteacher effectiveness. progress in research in this area can be madeonly when more specific and detailed measures of the effects of trainingare developed as variables and substituted for thegross indications ofeducational achievement used heretofore. More meaningful variables mightbe provided, for example, by using direct measures of the outcomes to beexpected from given amounts of training of a specific kind such as mightbe associated with child psychology, psychology of learning, or othersubject matter courses.

On the basis of what has been reported to date, however, it can onlybe said that beyond certain more or less obvious knowledge requirements,greater or lesser education of the teacher in terms of courses or semesterhours seems to be unimportant. Where any substantial relationship hasbeen shown, the possibility of contamination of data has not been eliminatedsince a school administrator's rating of a teacher ray be influenced bywhat he knows about that teacher's training. There is some svgestionfrom the text of a number of articles that the primary uotivation for

69

research lay in the educator's enthusiasm for some particular courseor combination of courses in hi.s institution. It is thus perhaps in-evitable that some of the results received somewhat less critical inter-pretation here than they deserved.

Scholarshi as Related to Instructor Effectiveness

In ttilit search for variables which might be usdd as bases for the pre-diction of teaching effectiveness, one of the most obvious indicator,. interms of accessibility and objectivity would appear to be that of previousscholarship. The hypotheuis is rather widely held that the individualwho is himself a good student of mathematics, for instance, can impart hismathematical inforhation to others. In line uith this assumptionoin AirForce technical schools instructbrs'are frequently selec 'ted on the basisof grades they obtained in particular subject matter courses. Anotherschool of thought maintains that knowledge of subject matter is not asimportant as knowledge of teaching methodology, thus assuming that thestudent teacher who excels in practice' teaching or in courses inemethodswill automatically become a good teacher.

In the attempt to relate scholarship to teaching competence two types ofstudies have been made. The first of these involves be investigation ofacademic grades received by student teachers as they are related to stand-inglh practice tbching. The shond type concerns the competence' of teach-ers in the school situation as related to their earlier scholarship in termof grades received in schoolor college, including geperal scholarship,standing in academic major, professional education and methods courses, withparticular emphasis on grades in practice teaching.

The usual measure of scholarship is expressed in terms of grade-pointaverage or gradp-point ratio, which is grade weighed by the ntnber ofhours or units credit in the.course. In Tables 20 and 22,,varioue designa-tions used by investigators (general scholarship, marks, average grades,honor point ratio, academic average, etc.) have all been interpreted bythe reviewers as the college scholarship variable.

Practice Teaching Grades versus Scholarship

Many attempts have been made to relate practice teaching grades ttscholarship in an effort to obtain some basis for forecasting success inpractice teaching. by implication, a goodistanding in practice teachingwould indicate probable success later in the school situation itself.

Of some 31 studies of teachers in training available to the reviewers,23 report correlations obtained between eome measure of average collegegrades and grades or ratings in practice teaching, 16 i '4port correlationsbetween standing in spedifit college courses and practice teaching, and

70

9 report correlations found between high school scholarship and practiceteaching. The results of these studies are summarized in Table 20. It

will be noted that tno cOrrelaion coefficients shown are all positive, andin several instances where comparativdly.large groups are involved they arequite substantial. There is.greatar variability in thy case of the coeffi-cients found when grades in specific courses are compared with practiceteaching than when the average for all college courses is so compared. Thisvariability probably has little meaning. due to differencee in sizes ofgroups used and in methods ot.obtaining the original data.

The implication is quite clear, however, that grades a student willobtain in a.praotice teaching course may to some extent be predicted bythe grades that student obtained in college. Vnfortunately, these is noindication in the studies reviewed that steps were taken to keep themeasures of practice teachingiekperimbntally indeperdent and uncontaminated.In other words, persons assigning practice teaching grades,were apparentlynot kept unaware of the grades obtained. by the students ,in othet collegecourses. This mould that the Positive correlations in Table ".0 may beattributable in part to tho operation of logical error qr halo effect. Theinstructor who grades hie student on practice teaching may give highergrades to the student hd knows to have received higher grades in his pre-vious college work. Cn the othershend, in the, light of the posItive co-.efficients fbund regardless of the course oK courees correlated with prac-tice teaching general scholarship may be the determining factor. It is

probable, too, that both performance in practice teaching and generalecholarship are hlated tq intelligence level. The imphrtance of thisrelationship depends, nowever, on the extents to which practice teachinggrades predict later uUCCOOS as a ,teacher. The,research on this questionis reviewed in the next section.

With the exception of one study, Somers' (320), the coefficients re-ported for high school standing, thqugh positive, are rather low. Fromthis'it Mould appear that while alms positive relationship is found forgroups,.little predictiOn of success in practice teaching may be made onthe basis of en individualts scholaaeic record in high school. Althoughagain the investigators do not state Whether or not the persons assigningthe practice teaching' grades were kept unaware of the student's highschool standing,, the probabilities are tnatshalo effect was not present toany great eftent here, It is doubt/NI if, in most college situations,college instructors are aware.of their ottldents' hi3h school grades. How-sver,:it is also true that there is very little variability in the highschool grades. of cbllege students, since the better students tend to 4oon to college. This lett°, factor would operate to lower the6orrelationcoefficients obtained.

ve.L.....ausTeashalgjaiccess in the Field

The second broad approach in relating'sbholarship to teaching abilityis that of cosidering high school or college reccode of teachers who arc

71

fable 20

Bastion of Practice 7eac!ing Grade. trIstings to Scholarship

Irrreotha91--.--

Voaber of High school

alL113PStraSil

.27

.A4

College grade Major

Litila

.19

.70

Educational

method Other courses

Haan II Bliley (1916)

Fordyce (1919)

Whltney (1922)

Scams (1923,

Cooper (1924)

...-011128LatelLY.

40

323

780

156

107

.24

.61

.39

.336

.57

.21

Hearin (1927) we .45

Shalt, (1928) 108 .08 .43

Zaht (1928) 200 .32 .3C (Psychology)

Broom (1929) 148 .21

Murris (1929; 60 .55

Ullinn (1930) 116 .26 .22 .46

Whitney & Frasier (1930) 100 (selected) .47

70 (control) .52

Breckinridge (1931) 420 (beginning course) .26

(advanced course) .12

Neal & Mead (1931) 64 .37 .49

Broom (1932) 235 (grade) .58

232 (grade) .k5235 (rating) .39 .04

Brom & Ault (1937) 55 ( rating publicschool)

.11

48 (rating publicschool)

.44

63 (rating college) .5368 (rating college) .22

Cost & Cornel1.(1933) 700 (approx.) .09

006d (i933) 90 .35

Hatcher 1934) 20 .25

Butler (1935) 242 .40 .23

118 .46 .43

griper (1935) 55 .33 .52

Bent (1937) 577 .21 .46 .45 .27 .29 (English)

Layton (1939) 705 .48t528 (19)6) .45.

417 1937 .46°

'Martin (1944) 123 .07 .12

Hult (1945) 100 .49 .45 .51 .36 (Minor sub)ect)16 .35 .1467 .16 (Minor sub3ect)

Soave (1945) 25 .52 .4723 .5)

Fuller (1946) 85 .0353 .62

Schvarts (1950) 34 .32 .56

Bach (1952) 76 .62 .19

4 Coefficient of mean square contingency

b College leaving examination

72

now on the job. Some 49 ouch research studies have been examined. The re-

sults of these investigations are discussed in the following pages undertwo headings: (1) Practice Teaching Grades versus Teaching Success in the

Field and (2) Other Academic Grades versus Teaching Success in the Field.The latter section includes general college average grades in major sub-ject, education courses and other specific college courees and high school.grades or rank.

Practice Teaching Grades versus Teachin Success in f:1-.4 Field

In 31 available studies practice teaching grades or ratings were cot-pared with some criterion of on-the-job teaching success. In 29 of thesesummarized in Table 21, the correlation coefficients ranged from -.17 to.84. The .84 coefficient was obtained by Tudhope (336) in a study of 50male teachers in England. This investigator's data probably reflect con-tamination due to the rating of teachers 'in service by the same officialinspectors who participated' in assigning practice teaching grades.

As indicated in Table 21 with two exceptions, Broom and Ault ,(58)and Jones (172), all of the available studies reported a positive rela-tionship between practice teaching grades and criteria of success in thefield. Most of the correlation coefficients are low, however, only sixbeing .40 or better.

Upon examination of Table 21, it will be noted that many of the in-vestigators used a teacher population of under two years" experience. It

might be expected that if.grade in praCtice teaching was predictive oflater success in teaching, a lairger correlation would be found in thosestudies with the less experienced teachers. Presumably after about twoyears of experience, a selective facitor has entered the picture, thefailures and teachers who have not adjusted to the teaching situatiohhaving been eliminated. This hypothesis does net standup under the 4e-sults as presented in Table 21, however, as many of'the studies with in-experienced teachers report extremely low correlations. In fact, thoseearrelations reported in studies whose. population included the more ex-perienced teachers are equally as high as many reported in studies withinexperienced teachers. These results might be partially ekpIained bythe inadequacy of the criteria used. In the great majority of thesestudies some form of administrative rating was employed. SLICE) there

appears to be a definite tendence of administrators t$o withhold highratings from beginning teachers their ratings may be forted toward thelower end of the scale, thus curtailing the range of the sample studied.in only one of'the studies, Seagoe (298), were the teachers ranked ratherthan rated. Seagoe obtaineda correlation coefficient of .49 using thecriterion of teachers ranked within their own faculty, the.ranks beingconverted to percentile scores for analysis. In two of the studies of in-experienced teachers less fallible criteria were used. Coxe and Cornell(87) reported a eorrelation coefficient of .28 (N = 112) for trained-ob-eerver rating while Lins (203) obtained a coefficient of .25 (N = 58) forobserver rating and a coefficient. of .21 (N = 17) for pupil gain when thesemeasures were correlated with grade's in practice teaching.

73

Table 21

Relation of Prentice Teaching Oradea or Utile. to Teecnine Effectiveness in the Yield

Invostimetor Toombs:. sample Pamirs of effectirenett- Correlation

x.rim (1905) 1195 elementary Normal school principalestimation

.44

Moody (1910 107 men Salary .23

527 women Salary .25

Whitney (1922) 780 with 1 sem. experience Supervisor rating .24

Somers (1923) 110 with 1 yr. experience Principal rating .70

Hearin (1927) 108 with 1 yr. experience Supervisor ratings .06 (let critic teacher).23 (2nd critic teacher.

Anientrout (1928) 200 with 1 yr. experience Superintendent rating .29Superintendent rating .406

Pyle (1928) 99 with 2 yr. experience Administrator rating .15

Multi (1918) 58 with 2 yr. experience Superintendent rating .12

Wrgenhorot (1930) 191 with 1 yr. experience SuperinteRiont rating .23

McAfee (1930) 98 elementary Supervisor retina .16112 elementary tlisaroom observes .26

Ullman (1930) 116 high school, 1 arm.experience

Average principal 4sucerintendert rating

.36

Bossing (1931) 100 high sch161 Administrator rating .69

Broom (1932) 238 Administrator citing .26

Broom 4 Ault (1932) 38 to 6) with 1 yr.experience

Rating, sent DepartmentEducation

.02 to .30

29 to 38 with 1 yr.experience

Ratings sent CollegePlacement

-.17 to .10

Coxe 4 Cornell (17'3? 300 (approx.) elementary,1 yr. experience

Administrator rating .13

400 (approx.) elementary,2 yr. experience


112 elementary, 2 yr. experi-ence

Composite observer rating .28

Reiner (19)6) 55 with 1 yr. experience Administrator ...tine .39

Hardesty (1935) 231 Superintendent rating .07

Odin:roller (1936) 560 elementary Supervisor rating .19

Law (1937) 42 (4-yr. course) 1 yr.experience


94 (2-yr. course) 1 yr.experience


Saniford, it al. (1937) 242 Cemposite 7 tnepectors .35

Stewart (1940) Rural (number not re-ported)

Superintendent rating .21

Pudhop. (1942) 93 with 3 yr. experienceplus

Inspector rating .81

Karlin (1944) 123 with 1 yr. experience Superintendent rating .18

Seeps. (1946) 25 elementary, 2 yr.experience

Supervisor ranking (per-centile)

.49

Jones (1946) 52 high school Supervisor rating -.0432 high school Pupil gain .13

Line (1946) 58 high school women, 1yr. experience

Composite rating (5observer)

.23

50 high 'school women, 1yr. experience

Student rating .06

17 high school women, 1hr. experience

Pupil gain .21

Could (1947) 11) with 1 yr. experience Principal rating .666

Stephens 4 GichtenAtein (1947) 86 elementary Pupil gain .01

Schwaris (1950) 18 with 2 yr. experience Supervisor rating .06

Bach (1952) 73 high school, 1 see.experience

Principal rating (2 dlr.ferent scales)

.06 and .20

Superintendent rating 2different raters, samescale)

.18 and .12

I Coefficient of mean equate contingency.

74

As part of a study concerned with the relation of practice teachingsuccess to other measures of teaching ability, Bach (12) in 1952, soughtan answer to the question, "Is there any agreement in the factor patternsof critic teacher and principal ratings?" A device consisting of 13 itemsarranged on a five-point scale was used. Ratings were made by the criticteacher while the student was engaged in practice teaching. After fourmonths in an actual teaching situation, ratings were again made by thebeginning teacher's principal. As a reault of factor analysis four fac-tors were found for each of these ratings as follows: For practiceteaching rating--pupil response, technical competence, relations withothers, and personal appealfor beginning teacher rating--technical com-petence, cooperative attitude, initiative, and personal appeal. In con-clusion Bach states:

"There is considerable agreement between two of the four commonfactors found in the analyses of the practice teaching and beginningteacherratingss'but there are nonetheless important differendes.These two factors are interpreted as Technical Competence and Per-sonal Appeal. The correlations between these two factors were .27fcr the practice teaching rating and -.02 for the beginning teacherrating. High positive relationships are also found between threepairs of factors in the praCtice teaching analysis but only one largepositive and three small negative relationships are found between thefactors in.the beginning teacher analysis. The above differences leadto the conclusiori'that iri site of the similarity of name in the twofactors common to each analysis, critic teachers and principals areemphasizing 'different characteristics or abilities in the people theytrain and hire, or else they place different values upon and seek dif-,ferent combinations of the same abilities" (12).

From the results reported in this section, one.could anticipate thatresearch with Air Force personnel might show some relationships betweenstanding in instructor training courses and subsequent performance as aninstructor. If such correlations were shoWn for Air Force technical train-ing school instructors, however, the information would become availabletoo late to have much. practical predictive application for the instructorsample used.but might have implicationa for future instructor samples.

Other Academic Grades versus Teaching Success in the Field

In 35 available studies correlations are reported which are based on.scholarship or grades received by teachers while students as compared withvarious criteria of the effectiveness of teachers in service. These in-vestigations have been summarized in Table 22.

With respect to general college average the correlation coefficients,with the exception of 4 studies, Meriam (225), Coxe and Cornell (87),Jones (172) and Bach (12), are all positive but range from'zero (Broomand Ault in Reference 58) to .73 lSomers, Reference 320). For the most

75

Table 22

Relation of Scholarship to Teaching Effectiveness in the Field

Tusher 'Lamle Amur, of offootIvonstt'Rrik

RAWCollege grade

Image &laxeueation

gther oftr000

Karina (1905) 1185 elementary Normal school principalestimation

-loam.q(Parythol.)

(gusher unreported)eleuentary

Kredy (1918) 527 warm Salary .231C9 are Salary .35

Ritter (1918) 1436 elementary& Ugh school Official rating .65

Whitney (1922) 760 with 1 son perienCO Adeinistrator retina .09 .07

Welt (1922) 19 slesentary Peer rating .15

6 high school her rating .6053 eleeentary& high school Peer rating .33

Jones (1923) 45 high school town 3uperrieor rating .46445 high school women 3upervieor rating .45444 high school men Supervieor rating .294

Sores (1923) 110 with 1 yr. experience Principal rating .77 .73

Hearin (1927) 108 with 1 yr. experience Supervisor retina .05

McAfee (1910) 98 elementary Superintendent rating .15

U2 elementary fteerver rating .40

Mod (1930) 116 high school, 1 sea.experience

Administrator rating .30 .20 .30

Wagenhers4 (1930) 191 with 1 yr. experience Administrator rating .01

Anderson (1931) 460 teaching certificate Superintendent rating .10 .19

110 Bachelor Degree Saperintendeat rating .22 .21

Bossing (1931) 100 high school Adadniatrator rating .17 .19

Breckinridge (1931) 215 Principal rating .35

Trines (1931) 26k elementary & high school Supervisor ruing184 elementary& high school38 elementary & high school

Supervisor ratingSupervisor rating

.61.ex°

Broom (1932) 240 Administrator rating .19

237 Administrator rating .19

Broom& Ault (1932) 81 with 1 yr. experience Administrator rating .24(official)

50 with 1 'To experience Administrator rating46 1 yr. porton:. for college placement .ao

Come & Cornell (19)3) 500 (approx.) elementary Supervisor rating -.0141 yr. experienxe

400 (approx.) elementary Superthin.. rating -.0141 yr. experience

112 elementary 2 yr. evert- Campmate observer rating .08 .10*

OCICI

Peterson, 11_11. (1934) 63 &spin-vigor rating .1247 to 204 Salary .22 to

.71

Briber (1915) 55 with 1 yr. experience Supordoor »Uma .36 .49

Phillips (1935) 17) elementary 6 5r. highschool

average adainietratorToting

.14

Hardesty (1934) 231 Superintendent rating .15 .09

Odenweller (1936) 560 elementary idainistrator .08 .29 .26

Jones (1923) used seder grades and Cos and Cornell (1933) used second semester millioninent as sweeurse of scholarship.

bBased oa giedoe (T s 49)6 Awed on students placed scholastically in approxieete top half of class (i .62)f Weed en thetas deft-

nite1y placed ii top half of elm. (g s .61).

* Coeffielent of Jean square contingency.

76

Investigator

Seiner (1937)

Teacher sample

42 in 4-yr. courte,with 1 yr. experience

94 in 2-yr. course,with 1 yr. experience

fable 22 (Cont.)

Seaga:elf effectivenes;

Supervisor rating

21421

.27

Supervisor rating .3)

Sandiford, et al. (19)7) 242 Inspector rating

Stewart (1940)

Martin (1914)

Jones (1948;

Line (1946)

Seaga. (1946)

Gould (1947)

StepheLs & Lichtenstein(1947)

Esp 'chide (1948)

Schwarts (1950)

Each (1952)

84

193 rural71 rural

12) with 1 yr. experience

54 high school51 high school50 high school4) high school

33 high school32 high school30 high school28 high school10 English

58 high school women,1 yr. experience




17 high school women,1 yr. *aperients


25 elementary, 2 yr.experienco


86 elementary

46 physical education,1 yr. experience

18 with 2 yr. experinace

70 high school, 1 sm.experience

Superintendent ratingSuperintendent rating


Principal ratingPrincipal ratingPrincipal ratingPrincipal ratingPupil gainPupil gainPupil gainPupil gainPupil gall:,

Comp:lilts administrator

Composite administrator

Pupil evaluation

Pupil evaluation

Pupil gain

Pupil gain

Supervisor rank (per-.centIls)

Principal rating

Pupil gain

Principal rating

Supervisor rating

Principal rating (2different scale)

Superintendent rating(2 different raters,sue scale)

. 33

.07

. 19

-.22

-.43

.33

.06

.69

SonolarshIpCollege grads Education

MEW-.46

.40

.25

.22

.15

.24

-.08

.31

.03 .05

.53

.0)

.44c

.01

&.121 courses

.05

-.08

Cther coureeg_

.23 English)

.13 Social

.40 .4S Echoes)ee)

Studio).33 .47

.28 (English)

.22 SocialStudies)

.19 .20 (English).13 History).20 Geog-

raphy).24 (Special-

Lets)

.40

.26

.23 .29 .35 (minorsubject)

.12

.24

-.01-.06.08

.o3

.13 .01 (.doorsubject)

.55 ,52 .44 (minorsubject)

-.15 .01

.24

.02

.09

-.02-.08-.01

-.73 (Intro-duction toteaching)

.01 (Educationpsychology)

.19 (History ofeducation)

.15 (Methods ofteachingreading)

Jimmie (1923) need senior grades and Coxs and Cornell OM) used second smatter achlevemont as measures of soholarship.

Based on grades (x r .39); bud oa students placed sebolastical1y In approximate top half of claw' (j : .62); based oa students defi-nitely plated in top half of class (1 : .011).

* Coefficient of mean 'guars contingency.

77

part the coefficients tend to be low. In only 9 studies were they as great

as .40 or above, and even within some of these studies great variation isshown in the size of coefficients obtained, e.g., Knight (178), Jones (171)McAfee (208)1, Peterson et al. (254), Lins (203). While the over-all resultsare not such as to permit any very confident .interpretations, it would ap-pear that some relationship exists. It may be suspected that the commonrelationship of general intelligence to both academic and teaching successis involved.

In two studies critical ratios rather than correlation coefficientswere reported. In 1937 Stuit (326) found average college grades of 100"superior" teachers as rated by superintendents and principals to be sig-nificantly higher than for 46 "poor" teachers (CR 2.8). Shannon (305) who,in 1940, compared 111 "highly successful," 111 "average," and 37 "failing"teachers selected from among teachers who were graduated from a stateteachers college during the period 1898 to 1934, also found success in thefield to be related to college scholarship (CR's 2.3 to 8.2).

In 16 of the studies reported in Table 22, investigators attempted todetermine whether or not teaching effectiveness in the field might be pre-dicted from achievement in one or more college courses apart from prac-tice teaching. Correlations between field performance of a teacher andhis grades in specific college courses yielded coefficients which tendedto be low but positive. In only five investigations, Jones (172), Broomand Ault (58), Seagoe (299), Stephens and Lichtenstein (323), and Bach(12), are negative coefficients reported, these appearing among positiverelationships also found in these same studies. The results in the caseof specific courses appear to be much the same as those obtained whenpractice teaching grade or rating is compared with teaching effectivenessin the field.

The relationship of high school grades or ranks to success in teach-ing was studied in 13 of the investigations. As will be seen from Table22 the correlation coefficients (except for those reported in the Jones"1946 study) are all positive but vary from .07 to .81. The relativelyhigh coefficients reported by Somers (.77), Kriner (.81 and .62), andLins (.69) appear to be somewhat out of line with results obtained byother investigators.

In the great majority of the studies concerned with the relationshipof scholarship and teaching effectiveness, the question of whether or notratings by administrators were influenced by knowledge of the teachers'college scholastic rocord is not considered. It should be pointed outthat in the case of supervisors' ratings no investigator could be certainjust what knowledge might contaminate the criterion nor could this becontrolled. The question concerning contamination of ratings by knowledgeof high school grades should also be raised but the probability is remote,however, that many supervisors are aware of the high school grades of theteachers they rate.

78

Considerable effort has been expended by investigators in attemptingto discover the relationships existing between on-the-job performance ofteachers and earlier scholarship as reflected in over-all achievement inhigh school or college or standing obtained in specific college courses.The outcome of all of this research appears to be that there is some rela-tionship but that it is probably small. So far none of these investigatorshas shown that the attainment of a particular standing in high school orcollege or the mastery of any single course or group of courses is essen-tial to teaching competence. General college scholarship and scholarshipin specific college courses are both correlated to some extent with prac-tice teaching grades. Intelligence test scores are also correlated withpractice teaching grades. Investigators have apparently treated subjectmatter knowledge as if it were a discrete variable. Scholarship in specificcollege courses, however, is probably just a less reliable measure of gen-eral scholarship, or perhaps somewhat more indirectly, of intelligence.Zero-order correlations will not indicate whether subject natter knowledgeper se is related to teaching effectiveness or whether subject matterknowledge, general college scholarship, and intelligence are interrelatedvariables. The lack of any substantial communality of content objectivesof courses that, bear the same title under different instructors or in dif-ferent college:: makes it unlikely that a course selected by title onlywill be found essential to teaching competence.

Age and Experience as Related to Instructor Effectiveness

The relations of age and of experience to instructor effectivenessare reviewed together because of the obviously close relationship betweenthese two variables. In 1928 for instance, Bathurst (23) obtained a co-efficient of .88 when he correlated them.

In Table 23 are listed 17 studies in which correlation coefficientshave been reported. (Bathurst?s study is included since he used Knight'sProfessional Aptitude Test not as a measure of "aptitude" but as a cri-terion of teaching effectiveness.) It will be noted that these coeffi-cients range from -.38 to .53. This suggests either that the importanceof age and experience in teaching effectiveness depends upon the partic-ular teaching situation involved or that product-moment correlations pro-vide an inadequate indication of any nonlinear relationships that mayexist.

That the relationship between age or experience and estimates ofinstructor effectiveness may be curvilinear is suggested by the studiesof Ruediger and Strayer (283), Young (362, 363), and Davis (96). Ruediger

and Strayer, in 1910, used supervisors' estimates of 204 elementary teach-ers while Young, as reported in 1937 and 1939, used principals' ratings of1521 teachers. These investigators reported improvement in instructoreffectiveness up to 5 years, no improvement from 5 to 20 years, and somedecline thereafter. Davis, in 1934, on the basis of an investigation in-volving approximately 1700 high school teachers, his criterion beingpupil success in passing State Board tests, concluded that pupils taught

79

lneeetiAater

Karim (1905)

Knight (1921)

Somers (1923)

Lang (1924)

&antrum (1928)

Barthelmeesi Boyer (1928)

Davis 4 'mob (1928)

Bathurst (1928)

Bathurst (1929)

Odenoeller (1929)

ArInrr (1931)

Pelts (1945)

Jones (1946)

Stephens & Lichtenstein (1947)

Riesch (1949)

Hymns (1951)

Table 23

Age and Experience is Masted to Teaching Effectiveness

Teacher sample

38? elementary

(Mumter unreported)elementaey 4 highschool


154 elementary

113 high school

88 high school

5002 elementary1220 Jr. high school

2156

171 high school

300 elementary

560 elementary

262 (131 best f 131worst)

47 elementary, 1- 4 g-room

54 high school

33 high school

40 (approx.) eleven -tary, normal schoolgrade

23 (approx.) elemen-tary, city schoolgrade

22 elementary, city &rural

1$ elementary, rural

203 elemanteni

a Pearson cos s g coefficient!.

b Coefficient of contiNioncY.

Aga or experience

Experience (0 to 16 yre.)

AgeAgeExperienceExperienca

Age

Tvperience (presentschool only)

Experience (presentschool only)

AgeExprionee

Experience (0 to 30 yrs.)Experience (0 to 30 yrs.)

Experience

Age

Experience

Age (experience factoredout)

Experience (age factoredout)

Age

Experience

Ago (experience factoredout)

Experience (age factoredout)

Age (18 to (6 yr.)Experience (1 to 7 ye.)

Experience

Experience

Experience

Age (20 toExperience (1 to 30 yr.)

Experience

Experience

54 Tr.)

Agar

Experience (0 to 9 Yr.)

AgeExperience (4 to 24 yr.)

Age (20 to 68 yr.)

Experience (1 to 43 yr.)

Ago (20 to 68 yr.)Experience (1 to 43 yr.)Age

Experience

Experience (divided intogroups of 1 to 4 yr.,5 to 9 yr., 10 or moreyr.)

80

WOUrf of teacher effect.eeneee prreletio0

Supervisor ranking .10

Fellow tsacher rating .14

Superdsor rating .03

Pallor teacher rating -.04Supervisor rating .14

Stpervisor rating .07

Supervisor rating .26 4 .39

Supervisor rating .46 4 .42

Composite ranking (supervisor, .34a/sedate teacher 4 pupil .39

rating)

Principal ranking .27Principal ranking .36

Official rating

Knight Professional AptitudeTest

Knight Professional Aptitude .15

TestKnight Professional Aptitude -.15Test

Knight Professional Aptitude .21

Test

.23

.06


Knight Profsegional AptitudeTeat



-.03

.06

-.17

.18

Ranking (supervisor, principal, .15

assistant principal) .15

Superintendent opinion (elereu- .10a

tary teochers)

Superintedent opinion (hr.r .26a

school teachers)Superintendent opinion (total .10a

group)

Residual pupil gain (33e 1' an,8th grad.)

Supervisor rating (k -

M -DI/ink)

Supervisor rating (WiscorsirM-Blank)

rupil achievement goctindPupil achievement quotient

Pupil achievement quotientPupil achievement quotient

Supervisor rating (Wisconsin8.41ank)

topervisor rating (Wisconsin/441118:1)

Residual pupil galaResidual pupil gainSupervisor rating (W!sconein

8411ank)Supervisor rating (Wisconsin

M-Blank)

Composite observer rating

.<

. o

.04

.41

.53

-.38-.21

.21.t

by teachers with one years' experience but no better tnan pupils taught

by teachers with two years of experience.

In 1929 Birkelo (36) using student ratings of elementary and highschool teachers apparently showed increased instructor effectivenessWith age, a result in agreement with that found by Daniel (91) in 1944.The significance of these findings as well as those of Ruediger andStrayer (283) and Young (362, 363) just mentioned is somewhat doubtful,however, since the proportion of each age or experience group in thetotal samples used is unknown.

In the few

was

to study the, lationship between length oftime teacher was employed in the school and efficiency ratings, highercorrelations were found, as might be expected. In 1924 Lang (193) re-ported correlations ranging from .26 to .46 between supervisory rating

of teaching efficiency and the teacher's local experience. In 1934Davis (96) in a study of teaching efficiency based on the per cent ofeach teacher's pupils passing state tests in high school subjects statedthat teachers with longer tenure in a given school were more successfulin passing pupils through state tests than were teachers who had beenemployed in the same school for a shorter period of time. However, theschools which had the highest percentage of pupil, passing the statetests were those schools with marked* high teacher turnover, Becauseof these confusing results Davis concludes, "It would seem more likelythat the tenure of the teacher is e result of her success as measuredby State Board tests than that success in State Board tests is a resultof increased tenure." In 1945 Brookover (55) found that length of ac-quaintance with pupil and length of time teacher had taught in theschools, all well as age of teacher, were positively related to pupilratings.

Several investigators in this area reported no significant differ-ences. In 1936 Heilman and Armentrout (148) reported results of ratingson the Purdue Scale of 46 college teachers by 2115 students in 50 classes.In'terms of experience teachers were divided into four groups, 7 to 12years of'experience, 12 to 17, 17 to.27, and 27 or more years of experi-ence. Instructors were also divided into age groups by five-year in-tervals. No reliable differences in rating scores were found in eithercase. In 1946 Blair (37) compared 92 teachers with less than 10 years ofexperience with 113 teachers with 10 cr more years of experience in termsof the number of "poor" answers on the multiple-choice Rorschach test.He also compared 107 teachers under 35 years of age with 98 teachers over35 years of age. Differences were not significant in either comparison.

Englehart and Tucker (108), in 1936, asked 224 high school pupils tochoose their best and worst teachers and to check their appropriate traitson a list. Their findings with respect to age are summarized as follows:

81

Good teachers Poor teachers

e No. No.

204tg 29 28 23.9 27 25.3

30 to 39 68 58.1 47 43.9

40 to 49 16 13.7 26 24.3

50 or above 5 4.3 7 6.5

No, significance test with respect to the differences in percentages

was applied. In 1946 Nemec (244) made a study of a, group of 265 pro-

bationary teachers who failed to receive.certificates at the end of atwoJyear probationary period because of unfavorable supervisory reports.When these teachers were divided into two groups (ages 19 to 22 and 23years and over) according to the age at which they began teaching,Nemec found no differences which were significant at the .05 level.Ryans (286) in a factor analysis study of trained observers' ratings ofteachers on the basis of directly observable teacher behaviors foundthat teachers (N = 60) with 1 to.4 years of experience were significantlydiffetent from teachers (N = 32) with 5 to 9 years of experience at the.01 level for two factors, which he named "controlled pupil activityand business-like approacti". and "teacher calm and consistent; liked be-

cause human," and for the total rating. Differences were significant

at the .05 level for two factors he called "pupil participation andteacher open-mindedness" and "sociability." The teachers with 5 to9 years of experience were oignificantly different from the teachers(N = 111) with 10 or more years of experience at the .01 level forfactcts "pupil participation and teacher open-mindedness" and "teachercalm and consistent, liked becauLi human" and at the .05 level fortotal rating by the observers. The teachers with 1 to 4 years of ex-perience were significantly different from the teachers with 10 or moreyears of experience at the .01 level for " controlled pupil activityand busineds-like approach."

The research findings of Davis (96), Meriam (225), Ruediger andStrayer (283), Ryans (286), and Young (362, 363) imply that teachingeffectiveness bears a curvilinear relationship to age or experience.The zero or near zero correlation coefficients reported by Bathurst(23, 24), Jones. (172), Knight (178), Odenweller (247), Riesch (275)bRolfe (280), instead.of showing lack of relationship, proba6ly indicatethe inapplicability of the Pearsorl product-moment correlation method tothe nonrectilihear data involved. It appears that a teacher's rated'effdctiveness increases at first rather rapidly with experience and thenmore slowly up to 5 years or beyond. There is then a leveling off and

ithe teacher may show little change h rated performance for the next 15or 20 yeafs, after which, as in most occupations, there tends to be adecline. It must be borne in mind, however, that ratings it such studiesas the foregoing may suffer from the "logical error" which results froman implici+ assumption that, the young, inexperienced teachers can notbe as good as those of 5 or more years of experience.

82

In interpreting the alleged decline in teaching effectiveness after20 years or more of experience, the effect on ratings of the physicaland mental changes acco.panying aging in general must be considered. It

is quite conceivable that while the ratings of students and supervisorsmight favor the younger and more vivacious teacher, the real effective-ness of teachers in bringing about student changes might not be relatedto age at all. There are as yet, however, no adequate studies of thisrelationship.

The research findings on age and experience have some interestingand rather important implications for the Air Training Command. In a

study of the correlates of instructor morale in Air Force technicalschools, Richey and Berkshire (273) reported percentages with respectto experience of 3117 military and 797 civilian instructors as shownin Table 24. If more valid techniques eventually confirm the findings

',able 24

Teaching Experience of Military and Civilian InstructorsIn Air Force Technical Schoolsa

Experience

Less than 6 mos.6 mos. to 1 yr.1 or 2 yr.3 or 6 yr.5 yr. or more

Military Civilian

24.2% 5.4%41.3 11.825.4 18.97.0 17.5

2.1 46.4

aFrom Richey and Berkshire (273).4of previous investigations that an instructor continues to improve forthe first five years, the great majority of military instructors have notreached the period of greatest effectiveness. The present rotation policymay be manifestly working against best utilisation of instructor poten-tiality in Air Force technical schools in that military personnel arenot permitted to function as instructors long enough for them to achievemaximum officienoy. Any interpretation of the results of these studiesfor the military situation, however, must take into account thJ factthat military instructors may repeat the same subject matter as many as25 times a year as contrasted with public school teachers who repeatthe same subject matter only once or twice a year. It thus may wellbe that military instructors reaoh their peak in a shorter period oftime than public school instructors.

83

Knowledge of Sub ect Matter Present Professional Information,And Teacher Examination Scores as Related to Instructor

Effectivenens

Knowledge of Sub ect Matter as Related to Instructor Effectfveness

It is frequently stated that the good teacher is the one "who knowshis stuff," that knowledge of subject matter being taught is the primerequisite of teaching success. With respect to this hypothesis the re-viewers considered the findings of some 20 studies where various criteriaof instructor competence were correlated with one or more measures ofprofessional information or subject matter knowledge.

Much variability is evident among the coefficients found when scoreson subject-matter teats are correlated with criteria of instructor compe-tence. As shown in Table 25, these vary from -.69 to .58. It would ap-pear that whether or not knowledge of subject matter is related to in-structor competence is a function of the particular teaching situation.The negative relationships found in some studies suggest that too muchknowledge on the part of the teacher may result in teaching "over theheads" of the students.

Two minor studies are not included in Table 25 because correlationcoefficients were not computed. Madsen (213) in 1927, found that interms of scores received on a test of elementary grade subjects, allexcept 1 of 31 teacher failures were found to be in the lowest l( ofa group of teachers studied. Allen (2), in 1938, using a Lest that in-cluded subject-matter knowledge, reported a low relationship betweentest results and teacher success for a group of 60 very superior and 60very inferior teachers as rated by three supervisors. Only languageusage and spelling significantly differentiated superior from inferiorteachers.

Professional Information as Related to Instructor Effectiveness

On the basis of the nine available studies which have been summarizedin Table 26, scores on tests of professional information tend to bearsome slight relationship to several measures of instructor competence.With two exceptions, Rolfe (280) and Stephens and Lichtenstein (323), allthe coefficients are positive. However, only two investigators, Crabbs(89), Betts (32), report any coefficients greater than .40.

National Teacher Examination Scores as Related to Instructor Effectiveness

Flanagan (112), in 1941, obtained a correlation coefficient of .51 be-tween scores on the Common Examination of the National Teacher Examinationand superintendents' ratings. He also reports coefficients significant at

84

InvestItstgy.

Potts (19)9

Coos 6 Cornell (153):

Ear', It sl. (1913)

truer (1HS)

lirlrnr (1111)

itaetin (191.t)

hells (1941)

loot:or (1949

torn (1966)

line (1'461

Seams (1966)

Ora 111 (190)

toptureu I lienterstots (190)

Table 23

'elation or Sterol en :vb.latt-liattar feet' to *mum of Instructor Santarem'.

tuelar easels Ten Nora* of effectinoess Correlettom

61 elemmotort St. "'get-cotter oreobulary 11101 goth (nollog) .01 to .46

SOO (approx.) elantary, 1 tr. avert. ?runlet LoglIs. Supervisor rating .00ems WhIpplo tenth,. Sopervisor rating -.02

400 (approx.) elementary. 2 rr. overt- Tre II ler lotgllsO Srporrilor rating .COenc Whipple Iteagirtg Superoloor toting .06

112 almontav, 2 yr. overtime Ionian loglith towpath I titer-a rret1,0g

.11

losappla Reuling Compeethe 'WormretIng

.17

66 elomotery Stanford nithat lc upt1 gato (eritroetic) .12A.Q.

Stanford Aril. Meth c Np11 gain (esp. S.Q.rev path)

Stinfer4 nttloottit Suporthtsonet Toting(cap. 7 salts)

SS oil% 1 tr. egori*nes CronCoop. Wish

Seyern 'stingSvonlIor

.$o

.4oCoop. 141.*Farr Icoatintunco Siporykikot,=Csor. General 3:14r.. Sopsnlior :nth* .31

41 (4-7r. cense) 1 yr. erpsrisore Gros 4411th Suforoisor roithi .31Coop. inglio4, WIT-tiara,

Oerortl Saone?toperolsor rating .33..30

(2-rr. court.) 1 tr. orporienco Cross tnilloh )ape'. tin rstiog.Coop. hells', Litertturo,

Genera/ Seisms!opinion ratios .24, .15, .21

123 vitt. 1. tr. experience Coop.? Literstere. 21no Sonnet *Min rating 10 1l .03Atte, Anomie, Satial .C.Y. .13Studio% Ilatbellitiel

Toottor College inglion. froperintandatt sotto') .C9

e) elesentert, 1. 6 2 -roam inaritan Cowtry Clothe I Peril 4414 (aittlirtectp) ..v)Ctavarnaatit

iirthorsolats. Yob. Probs. Nil path (4itlienhit) .01

21 elementary, ro.n1 ANiriean Cowart fines Itooremont

P.01 gith (Oeciilsteals.)

.9.vrtotstoro (Itcseern

01.111.7)holt gsla tseetal

studies).S.

611 MO school, Menthe Compretettlso Papereivor toting .1231 Sign Wool Peat ir.g Cooprebentlat Pupil pis (mint

nijocts).13

13 410 is (Iod119) iotlng Cooprotenleo Fall pin (14110) -.69

44 410 ocomi moan, 1 tr. etterimit Com . Coelho Cspostte overdo°,rating

.H77 4144 or4001 noon, 1 pt. sapariabc,11 610 nl000l man, 1 tr. eryerlenec

Corp. trythoCoop. tryllso

Pv11 praleattheYostil tai (*orlon

.t1On)roto)

Si 54* posemit 14444. 1 tr. earthiest,

19 1444 Othoel -Ion, 1 tr. isperleon

Coop. loading

Coco. is dl's

Compitthe tvorelostNair*,

9.01 m'ot'es.14.

.5413 Mph 'wheel stem, 1 rt. lintort.rivi Coup. loading Pepll golo (onion

tortjecte).10

RI with I tr. emporteme sop, diathorat lee. type rot On making .C.11 269Uteri' SrtiOntO, Serial .1), .12

ConteoperolyStraits

113 nth 1 tr. emporleme

70 foldnon111,7

Coop. Dmemporarykffelt,

Arithounte terdouontalo.Itrittortie imooning

Suedind Cooprototoloattutee Clap. Soutane*

Stricter.

rinnyel ratio*

Fell tutu (tritmotio)

Pool) rats

Pupil pip

.964

.9,..u,

.is

ftells.n of 1 GM room ott.colt.

I floettteint of Mesa Soars nosittnpoccp

85

table 26

Relation of Scores on Professional Information Tests to Measures of Instructor Effectiveness

Iniestitetor Toscher temple Tee% Measure of effectiveness Correletlgi,

Crstbe (1925) (Number unreported) Steele-Marring Pupil gain .05elementary SupervOlor rtnir.A .L1

5eardman (1922) SA high school Professional information (unpub-lithed)

COmposits rank (super-visor, teacher. pupil)

.2(

Procedures .26

Ullman (1930) 116 high school, 1 gem.erperience

Odell (principles of teaching) Average euporittendent kprincipal rating

.11

Meter (otlectises of teething) ererage superintendent 4principal rating

.09

Pitts (19)3) 61 elementary Professional information (tcsEc-ite 16 tests)

Pupil gain .10 to .L6

Garr, st S. (19)5) 66 elementary torgereon Professional Informs. Pupil gain A.Q. .23Lion

Torgerton Professional lnlerma-tion

Pupil gain (co 'unitek its gain).

.0a

tergerson Trofeselonal Informs. 9umerintepdant rating .16Lion (composite 7 ecielep)

Martin (1944) 12) with 1 yr. experience Tescher'Coilege Elementary Superintendent reline, .02

'elf* (1945) 47 elementary, 1-2-room

Leverenz-Steinmets (edzestIonorientation)

Pupil gain .06

keetker (1915) 22 elementary, morel Gersten...Steinmetz (adocetionorientation)

Pupil PIA oo

Stephens 4 Lichtenstein (1947) 35-U elementary (nor-mal wool)

Professional termination Pupil gain -.11

21.16 elementery (eityechoml)

Prefeseionsl 'emanation Pupil gain

a &atheist of 1- and t-rsea se/rels.

the .05 level. between total s,cores on the Common Examination of theNational' Teacher Examination and the proportion of students reportingthe' particular teacher's nacre. in response to the question: Mhichteachers seemed to havp'a broad knowledge of other subjects besidesthe,one you, had with them?" On thy. other hand, when.Lins (203), in1946, correlated National Teachers Examination scores. withevaluation of.,their teachers he.obtained a correlation coefficientof -.30 significant at the .01.1evel.of confidence. When Lins used acomposite gain criceridn he. found a coefficient Of..45. The latterfigure, however, is probably not significant since only seven teacherswere involved.

In 1951 Ryans (286) correlated scores obtained by 192 elementaryand-165 secondary teachers.on the.General Frinciplea and Eethods ofTeaching test or the 1949 National Teachers Examination-Battery withtwo kinds of ratings made'by principals. For the'elementary teachersthe correlation coefficients obtained between examination scores. andprincipals' ratings on an observation blank was .17, and between exam-ination soorPs and principals' ratings of over-all.effectivenesd,..23.The corresponding coefficients for the secondary teachers were .13 and.15. The principals' ratings on the two blanks correlated..83 for bothgroups of teachers. When an analysis was.madeof.examination scores Ob-,tained by.the upper. and lower 27% of the teachers, differences significantat the .01 level were obtained with respect to 52 "high".and 5? "low"

86

elementary teachers, but the differences were not significant at the .05

level for the 45 "high" and 45 "100 secondary teachers.

Despite the more or less unpromising results that have been reportedby investigators of the relationship of professional information andknowledge of subject matter to instructor effectiveness, this might stillbe a field in which useful research work can be done. The restriction inrange of information of elementary school teachers might account for some.of:the lo4 correlation coefficients. It is possible, too, that the par-ticular subject matter' involved may be a factor in determining the rela-tionship between an instructorts competehce and his knowledge of subjectmatter and/or professional information. It appears that in teaching car-

taim technipal school subjects, at least,Ithe amount of technical Worms-tion possessed by the instructor may be important. Morsh and Swaneon (232)

in a small exploratOry study found a corre2ation coefficient of .45 (sig-

nificantly different from zero at the .01 level of confidence) betweenpower plant proficiency examination scores and supervisors' ratings of 73instructors on a.forced-choice form.

An Air Force technical school instructor must possess a certain mini-

mum of technIcal information. lie must be familiar with certain facts;

must poseess the requisite skills, and must vanderstand the. proceduresinvolved ih 1,'e specialty he is teaching in. drder to iTpart these facts,ski113, and techniques to his students. The differential between instruc-tors' knowledge as compared with that of their student° is also an im-

portant consideration. The instructor with wide experience and backgroundor technical information which goes far beyond that of his students mayhave,' a same difficulty as that of the overly.intelligentinstructor'incommunicating at the student ieVel.. On the other hand, an instructor-who has the bare minimum of the kribwledge requiements may be put in anembarrassing position or mayactually lose the respect of older, experi-enced students wbo know more then the instructor about the subject athand, The extent and implications of the differences between subject-matter knowledge of instructors and the knowledge of Their students mayvary from course to course in ways only to be determined through invent-gation.

ipatiar-A04Extracurzivitiesaraleulturel Test ScoresVersur Instructor Ef ectiveness

Extracurricular Activities

There is rather wliespreJd belief among school adminii3trators that ateacher who,has taken part inl'acblvities outside the classroom in highschool or college thereby becomes a more rounded person and makes a bet-ter teacher. In two investigations (292, 3b5) critical ratios were com-puted between teaching effectiveness of groups of teachers who as studentshad participated in extraclassroom activities as compared with teacherswho had been nonparticipants.

87

Sandiford et al. (292), in 1937, compared the top and bottom third ofthe group when 336 student teachers were ranked according to teachinggrades. Significant critical ratios favoring the top third were foundwith respect to several extracurricular activities. In terms of numberof extracurricular participations; Shunon (305), in 1940, reported sig-nificant critical ratios wren 86 most successful men teachers were com-pared with 24 failures and when 111 most successful men and women teacherswere compared with 37 failures.

Since the less able student cannot keep up with his studies if heparticipates and hence refrains from participation or is not allowed toparticipate in extracurricular activities, it is necessary to partial outscholastic ability if the relationships found by Sardiford, Shannon, andothers are to be atributed to the student's becoming a "more roundedperson." As they stand, these results merely reflect the tendency forthe brighter students both to get higher grades in all college subjects(including student teaching) and to participate more in extracurricularactivities.

Several investigators (171, 182, 185, 196, 218, 298, 299, 319, 320,324, 344) have reported correlations found between teacher.% or studentteacher participation in extracurricular activities and ratings of teach-ing effectiveness. As will be seen from Table 27, in the nine studiesof teachers on the job the correlation coefficients range from -.06 to.46. Iri general, investigators found low Oositive relationships betweenextracurricular activity and instructor effectiveness. On the basis ofthe results of the studies reviewed, there appears to be slight justifi-cation for further search for selection or evaluation measures in termsof the amount of extracurricular participation of a teacher while a stu-dent, in high school or college.

General Culture Test Scores

Six investigators attempted to correlate scores on the CooperativeGeneral Culture Test with measures of teacher competence.. The resultsare carkedly inconsistent, with a rather strong negative relationshipbeing indicated in several instances. These studies are summarised inTable 28. In addition to the studies reported in Table 28, several in-vestigators (125, 161, 184, 218,.298) correlated total scores on theCooperative General Culture Test with student teaching grades. Correla-tion coefficients obtained ranged from -.02 in the Seagoe study (298)with 31 student teachers to .21 in the Kriner study (184) with 55 stu-dent teachers. The studies reviewed appear to indicate that the relationsof Cooperative General Culture Test scores to instructor effectiveness dif-fer little from those reported for other subject matter tests.

88

Table 27

Relation of Extracurricular.Activities to Instructor Effectiveness

Investigator

Teacher sample

Type of activity

Measure of

effectiveness

Correlation

coefficient

Jones (1923)

45 high school women

Extracurricular

Supervisor rating

.05

(1920)

45 high school women

Extracurricular

Supervisor rating

-.06

(1921)

44 high school men

Extracurricular

Supervisor rating

.27

(1920-21)

Somers (1923)

110 with 1 yr. experi-

ence

E;:tracu=icular

Principal rating

41

Kriner (1931)


Extracurricular

Supervisor rating

-.04

school

Held student office

Supervisor rating

.10

Soderquist

(1935)

482 adult education

277 who held stu-

dent office vs,

205 who did not


.25'

Kriner (1937)

42 (4-yr. course) 1

yr. experience

Extracurricular

Supervisor rating

.46

94 (2-yr. course) 1

yr. experience

Extracurricular

Supervisor rating

.25

Stewart (1940)

145 rural

Extracurricular


.24

aBi-serial r.

Tble 27 (Cont.)

Investigator

Teacher sample

Type of activity

Measure of

effectiveness

Correlation

coefficient

Martin (1944)

123 with 1 yr. experi-

Extracurricular

Supervisor rating

.22

ence

Number offices held

Supervisor rating

.18

Seugoe (1946)

25 elementary, 2 yr.

experience

Officer membership

ratio


.16

Memberships


.c6

Von Haden

(1946)

58 high school women,

1 yr. experience

High school extra-

curricular

Supervisor rating

.19

50 hign school women,

1 yr. experience

High school extra-

curricular

Pupil evaluation

.17

17 high school women,

1 yr. experience

High school extra-

curricular

Pupil gain

.06

aBi-serial r.

Table 28

Relatioti of Scores on the Cooperative General Culture TestTo Measures of Ins ructor ffectivensss

CorrelationMeasure ofInvestigsla Teacher,_ effectiveness coefficient

Krizier (1935) 55 with 1 yr. experi-ence

Supervisor rating .30

}Carter (1937) 94 (2-yr. course) 1year experience


Martin (1944)

42 (4-yr. course) 1year experience

123 with 1 yr. expeti -ence

Supervisor rating

Superintendentrating

.22

.11

Seagoe (1946) 25 elementary, 2 yr,experience

Supervisor rank-ing

-.01

Jones (1946) 50 high school Principal rating .03

30 high school Pupil gain -.2313 English Pupil gain -.58

Line (1946) 57 high school women,1 yr. experience

Composite super-visor rating

.05


Pupil evaluation -.34


Pupil gain .23

Socioeconomic Status Sex, and Marital StatusVersus Instructor Effectiveness

Socioeconomic Status of Instructor

In 1930 Ullman (339), in an attempt to predict teaching success,among other measures used the Sims Score Card to determine socioeconomicstatus of 116 junior and senior high school teachers with one semesterexperience. Near zero coefficients resulted when socioeconomic: statusscores were correlated with social intelligence, general intelligence,knowledge of principles of teaching, knowledge of aims of secondary educa-tion, self-rating, academic marks, education marks, major s.,,bject marks,and practice teaching rating. In the case of teaching interest, asmeasured by the Strong Interest Blank, a coefficient of -.25 was obtained.

91

This negative relationship appears reasonable considering the low salariesof teachers and the opportunity for individuals of high socioeconomic statusto enter, professions requiring more costly preparation, but there is noreason why economic status should be related to the other variables. The

correlation between socioeconomic status and rated success in the field was

.19. Any such low positive coefficient may mean only that supervisors areinfluenced somewhat by the socioeconomic standing of their teachers. It,

could mean, too, that persons from the higher socioeconomic group do makebetter teachers because of greater social poise.

Kriner (182), in 1931, made a study of 131 best and 131 poorest teach-eYs within a school system as judged by superintendents. He found thathigh school teachers who came from a rural area and whose fathers werefarmers and elementary school teachers who came from urban communities andwhose fathers were businessmen had the best chance for teacher success.Either type of teacher, especially the elementary, was handicapped iftheir fathers were artisans and especially handicapped if their fatherswere laborers. To enter the teaching profession because of financialreasons or compulsion predicted substantially against teaching success.Size of family affected teacher success probably as a by-product of finan-

cial reasons. Travel and past illness had little if any relationship toteacher success. Krinerls results are probably note specific with teachers.They may simply be demonstrating the truism that those from the higherstatus groups have greater probabilities of success in life than thoseless fortunate.

Phillips (256) secured ratings by superintendents and principals of173 elementary and junior high school teachers. He also administered the

Sims Socio-Economic Scale to the same-group. The resulting correlationcoefficient between these measures was .05. When the ratings were con-verted to sigma scores, Phillips reports a correlation of .22 for theentire group and a critical ratio of 3.5 for two groups of 43 teacherseach standing at the extremes of teaching ability as rated administratively.

Rolfe (280) computei correlations between achievement in citizenshipof 338 seventh and eighth grade pupils from one- and two-room rural schoolsand various measures of their 47 teachers. He reported a correlation co-efficient of -.15 between the teachers' Sims Socio- Economic Status scoresand pupils gains.

The results obtained with the Sims Socio-Economic Scale, like thosefound with the Cooperative General Culture Test, seem to provide littleincentive for further research in this area.

With the exception of Rolfels (280) study the criterion used in tnesestudies was supervisory ratings, which are often negatively correlated withstudent gain. It is possible that with other criteria and with otherhypotheses involving socioeconomic status of teachers research of more

probable producti. ewes might be undertaken. Socioeconomic status of the

teacher is probably not of significance in itself but only as it might

92

be reflected in various "psychological" dimensions of teachers. For in-

stance, if extreme upward social motility has characterized a giventeacher and this motility has resulted in insecurity and anxiety on theteacher's part, this 'dlight in turn be reflected in the teacher's patternof classroom behavior or in the adjustments the teacher makes to admin-istrative personnel, fellow teachers, and pupils. Instead of looking forpeople who have exhibited this social motility, or who possess a certainsocioeconomic statue, investigation might be directed towcrd the mani..feet degree of anxiety or insecurity.

Sex of Instructor as Related to Instructor Effectiveness

In Table 29 are summarized the ten available studies in which sex ofinstructors was related to instructor effectiveness. It will be notedthaticriteria of effectiveness employed included student ratings, studentdesignation of'best teacher, average class marks, administrative ratings,and success or failure on the job. One investigator used three criteria:pupil gain, pupil ratings, and administrative ratings. Six of these stud-ies appear to favor women, three show no differences between effectivenessof men and women, and two studies favor men. In studies conducted priorto 1940, in no instance apparently was the significance of the obtaineddifference between teaching effectiveness of men and women teachers tested.In the four later investigations significance was determined but in onlyone etudy, that of Cheydleur (75), was a significant difference found, acritical ratio of 6.6 being reported in favor of women instructors.

As indicated by the foregoing studies the question as to whether or'not women teachers are superior to men teachers has been considered forsome years. The problem may not be merely one of academic interest outmay have practical or economic implications for some school and collegeadministrations. No particular, differences have been shown when therelative effectiveness of men and women teachers has been compared. In

view of the results found, it may well be that consideration should begiven to assessing the effectiveness of women instructors in Air Forcetechnical schools. In case of full scale mobilization women, bothcivilian and WAF, would seem to offer an invaluable potential sour.e ofinstructional personnel. Employment of greater numbers of women instruc-tors than at present would release'like numbers of technical srecialistswho would then be available for combat serport in their specialty.

The Relation of Marital Status to Instructor Effectiveness

While in some parts of the country there has been considerable appo-sition, generally for economic reasons, to the holding of teaching posi-tiors by married women, there appears to be little evidence that Parriedteachers are in any way inferior to unmarried teachers. The reviewersfound only three investigators who had made any objective study of thquestion. In 1934 Peters (253) conducted a rather comprThensive study

93

Investigator

Boyce (1912)

Nannings (1924)

Birkelo (1929)

'412

Eerda (1935)

Heilman &

Armentrout

(1936)

Engelhart &

Tucter (1936)

Table 29

Sex of Instructor as Belated teInstructor,Effectiveness

Teacher sample

343 high school

336 men, 896 high school

108 men, 2179 elementary

(Number uarrported)

13 men, 14 women

(Number unreported)

None

women

women

31 men, 15 college women

(Nraber unreported)

Differences in

favor of

Measure of effectiveness

Men Women Neither

Administrator ratings

(79 superintendents,

reports of failure)

Percentage of 614 college

students designating as

best pre - cortege teachers

Pupil rating

Pupil gala

67 superintendents opinions

Mean student ratings

Percentage of high school

students designating as

beat and poorest teacher

ever had

X Xe

x

X X x

aWen teachers excelled men-in 13 subject fields; men excelled women in 7 subject fields.

bCritical ratio

Investigator

Davenport (1944)

Cbeydleur (/945)

Nemec (1946)

Cooper & Lewis

(1951)

Table 29 (Cont.)

Teacher sample

*51 high school

61 men & 9 warn, college

French

74 men, 191 women probationary

longer than 2 yrs.

153 student

72 high school

Differences in

favor of

Measure of effectiveness

Men Women Neither

Student ratings or rewkiner

X

Average cites marks

705

Teaching certificate fi-

Xnatty granted

Student ratings

X X

a Women teachers excelled men in 1, subject fields; men excelled women in 7 subject fields..

b Critical ratio= 6.6.

of the status of the diarried woman teacher. He matched according to age,education, teaching situation, and so on, 110 married with 110 singleelementary school teachers and compared the gain of 2195 pupils of theformer group with that of 2250 pupils of the latter group. Supervisoryratings (made by superintendents or principals) were obtained for 1123married teachers and 1123 single teachers matched on the same variablesas above. Differences in achievement and mental growth of pupils of themarried women teachers as compared with the single teachers as shown byscores on the Otis Clesvification Test Parts I (achievement) and II(mental growth) were .86 ± .29 and .60 t .23, respectively. These dif-ferences in favor of the pupils of the married teachers were just underthree times the probable error of the differences or on the border lineof being significant. Differences in supervisory ratings of married andunmarried teachers were too small to be significant.

In 1951 Ryans (286) compared 99 single women with 107 married womenthird and fourth grade teachers with respect to ratings made by trainedobservers. Dimensions observed included 20 items relating to directlyobservable teacher behavior and 6 items referring to pupil ',-ehavior.Comparison of mean criterion scores with respect to marital status revealedne differences that were significant at or near the .05 level of confi-dence. When the relation of marital status to pupil behavior alone wasstudied for the 206 teachers, a coefficient of mean square contingencyof .11 was obtained.

The Relation of Teaching Aptitude, Attitude Toward Teaching,And Interest to Instructor Effectiveness

Teaching Aptitude versus Instructor Effectiveness

The results of the ten investigators using several measures desit,nedto predict teaching ability show great disparity. In Table 30 entrieshave been arranged according to teaching aptitude test instead of chron-ologically in.order to improve comparability of studies. As will be seenfrom T:J.)1.3 30, correlation coefficients between various criteria of effec-tiveness and the Knight aptitude test ranged from -.10 to .78, the largestbeing reported by Cooke using nine teacher subjects. The Morris TraitIndex-L test, apparently devised to indicate leadership aspects of teach-ing aptitude, gave correlation coefficients between scores on this testand various criteria of teaching competence from -.17 to .23. In thecase of the Core- Orleans Aptitude Test the range of coefficients withvarious criteria of teaching efficiency was -.32 to .51. Do,,d (100)

suggests that the Coxe-Orleans test measures qualities related to generalscholarship rather than to teaching success as revealed by supervisors'ratings of practice teaching. The range for the Stanford aptitude testwas -.15 to .14. For the George Washington University Aptitude Test acoefficient of -.19 was reported by Seagoe (299).

96

Table 30

Relation of Scores on Measures of Teaching Aptitude to Teaching Effectiveness

Investigator Teacher sample Measure of effectiveness Correlation

Knight (1922) 33 elementary7 high echool

33 elementary7 high school

Fellow teacher ratingyellow teacher ratingSupemieor retire.Supervisor rating

Knight Aptitude-Elementary

.45

.15

.77

.00

Tiege (1928) 25 elementary, 1 see. experi-ence


Bathurst (1929) (bomber unreported)elementary


Barr, et al. (1935) 66 elementary Pupil gain A.Q. -.01Pupil gain (composite A.Q. -.10

& raw gain)Superintendent rating (com-

posite 7 scales).2e

Cooks (1937) 27, 18, 9 elementary & high Self-rating .21, .22, .38school Supervisor rating .32, .12, .78

Mills Trait Index-1.

Barr, it al. (1935) 66 lemantary Pupil gain A.Q. -.11Pupil gain (composite A.Q. -.04& raw gain)

Superintendent rating (cods..posits 7 scales)

.08

Phillips (1935) 173 elementary & Jr. high ecaool Superintendent & principalrating

.20

S. ',ma score rating .23

Rolls (1945) 47 elementary, 1- & 2-room Pupil gain -.17

Rostke: (1945) 28 elementary, rural.' Pupil gain (social studies) .20

Sear.' (1946) 25 elementary, 2 yr. experience Administrator rating .00

Coxs-Orleans Aptitude

Come & Cornell (1933) 500 (approx.) elementary, 1 yr.experience

Supervisor rating -.03

400 (approx.) elementary, 2 yr.experience


112 elementary, 2 yr. experience Composite observer rating .08

Phillips (1935) 173 elementary & Jr. highschool

Average superintendent rat -.ing

.16

Sigma score rating .28

Cooke (1937) 9-48 elementary & high school Self-rating -.32 to .04Supervisor rating -.12 to .51

Soave (1946) 25 elementary, 2 yr. experience Supervisor ranking .01

Stanford Aptitude

(3 subtests)

Aostker (1945) 26 elementary, rural.' Pupil gain .02, .04, .10

Solts (1945) 47 elementary, 1- k 2.room Pupil gain .15, -.13, .08

Soave (1146) 25 elementary, 2 yr. 044A110*. Supervisor ranking .02, .04, .14

Imorge WashingtonUniversity Aptitude

Set800 (1946) 25 elementary, 2 yr. experience Supervisor ranking -.19

a ixolseive of 1- and 2-room schools.

In 1952 Jarecke (165) made an initial report of a teaching judgmenttest he had devised which follows a somewhat different pattern from othertests of this type. Jarecke's instrument is a situational test of aforced-choice ranking type. A list of problem situations typical in thedaily life of a teacher is presented. There are five alternate solutions

offered for each situation. Solutions are to be ranked in order of favor-ableness. All solutions are of the nonoptimum or poor type on the theorythat good teachers could discriminate between varying degrees of poor alter-natives while poor teachers would tend to rank higher the one they them-selves might employ. Jarecke reports very high correlations (.68 to .93)when scores on the teaching judgment test were correlated with variouscriteria of teaching effectiveness. Unfortunately, however, these re-ported correlations are spuriously high because the population on whichthey were obtained included the population on which the scoring key wasbased, thus making it difficult to evaluate the test on the basis of pres-ent data.

At first glance it might appear informative to examine the factorsthat have been considered worth including in tests of teaching aptitudetogether with the underlying rationale and implicit hypotheses. The re-viewers are of the opinion, however, that rationale or hypotheses or themethods used to implement them have been inadequate. If one knew whatkinds of things were important to instructor effectiveness and were ableto construct devices for measuring both the instructors' knowledge ofthese things and the probability of their shaping their behavior in ac-cordance with them in an instructional situation, the use of aptitudetests would seem to be a reasonable approach.

It may be that there is a specific aptitude for teaching which isrelated to effectiveness of teacher performance. Data thus far avail-able, however, either fail to establish the existence of any such apti-tude with any degree of certainty or indicate that the tests used wereinappropriate to its measurement.

Teaching Attitude versus Instructor Effectiveness

Attitude toward teachers and teaching, as indicated by the YeagerScale devised for its measurement, appears to bear a small but positiverelationship to teacher success measured in terms of pupil gains. Rolfe

(280) administered a battery of tests to 47 rural teachers. He reporteda correlation coefficient of .22 between pupil achievement in citizen-ship and teacher scores on the Yeager Scale. He also found a coefficientof .38 between this success criterion and teachers' scores on the HartmannSocial Attitude Test. With 28 teachers as subjects, Rostker (282) re-ported a coefficient of .45 between teachers' Yeager scores and measurablechanges produced in their pupils in social studies. LaDuke (188), whocorrelated scores of 31 rural teachers on the Yeager test with "objective"tests of pupil gain in attention, appreciation, information, interest, anda composite of these, found coefficients ranging from zero to .20.

98

Interest in Teaching versus Instruc-,or.Effectiveness

Operationally, interest in teaching may be quite different from atti-

tWer toward teachers or tea6hing: That an effective teacher shonid beinterested in teaching would appear to be so obvious ae to be axiomatic.A fewlinYestigators haV? attempted, to show that among successful teachersinterest! in teaching developed during. the teachcrsi.secondaryschool peri-

od or before. In the majority,N. investigations, however, interest inteaching was measured try interest test scores which indicate similarity

of interests of teachers and persons undergoing the interest test. The

results of these studies are shown in table.31.

As will be teen from Table 31 those correlations resulting'from theuse of the Strong interest test or modifications of,it and-the test used

by Ccx and (tornell ($o all tend to c)usteY around zerq. The Link Ac=tivitigs and Interest. Inventory on the other hand shows such inconsisten-cies in the light of the yather bf_nrse data available as to render it alsoof somewhat dbubtful value.

The Kriner (182) study which produced such high correlations was basedon recall by the teachers as to their interests when t.hey were!in high

school. Obviously there is noway of keeping such opinions free from theinfluence Of later experience of success or failure, thus making the cor-relations obtained practically meaningless. The Lins .(2D3) investigation,

on the other, hand, was a follow-up study. Students listed their choice's

as to occupatiOns when they first entered college and these.dhoices wereCorrelated against rating received some years later.

In 1952 Ringness (277) reports a study in which he attempted:(1) to dis=cover, if possible, any coMmon factors that may underlie the reasons givenby underOduates for the choice of teaching as a profession; (2) to de-termine Whether;the'anWers given'to:essentiaady;the same questions in twodiffereht types'Of testing devices reveal cOmparable data; and (3) to in-vestigates the relationship between the reasons giVen for choide of profes-sion 'and subsequentkteachihg 'success as measured by criteria of efficiencyand acceptability. A paired-coMparison and a ranking questionnaire wereused to determine the reasons fdr choice of teaching as a career. Data

wqre analyzed by'the centroid method of factor analysis to find the .com-mon factors: Siltyj-three'men and 37 women sttdent teachers comprisedthe sample used in Parts.One and ;Two Of the etudy, and 16 Men and 1E1women with one-year experience were used in the last part of the study.Criterion of teaching success was an "acceptability'? rating by the super-intendent. This was an over-all rating Made after an interview of thesuperintendent by the investigator in which questions were asked1whichrelated not only to teaching efficiency but also to personality Greats ofmany kinds. In the factor analysis study factorSidentified!as interestsin working' onditions; in people, in security, and in subject matter areato be taught seemed to be generally emphasized. Desire for professiohaladvancement did not appear to be a general characteristic of the factor

99

Table 31

Relation of Interest Test Scores to Teaching Effectiveness

Ars &saw_ Tomalley eanols liguatjtj effeotiveneso

Supervisor rating


Test

.02

.74'1.50

.894 .9214

494 .69b

-.644 -.62b

Ullman (1930)

Irtner (1931)

116 high school, 1 sea. experience

76 poor & 74 best elementary

54 poor & 56 best high sobool

Cowdery 6 Strong

Interest in teachingwhile in' high echo.]

Sather teach now than doanything .lea

Interest in teaching as acareer while in highschool

Ito interest is teachingwhile In high school

Cox, & Cornell (1933) 500 (approx.) elementary, 1 yr.experience

Supervisor rating Coxe & Cornell -.01

40 (approx.) elementary, 2 yr.experience

Supervisor rating Cone & Cornell .07

112 elementary, 2 yr. experience Composite 2 observerratings

Cox, & Cornell .10

Barre et al. (1935) 66 elementary Pupil gain A.Q. Strong interest .03Pupil gain (coscoeite Strong interest .06A.Q. A rev gain)

Superintendent rating Strong lig...lent -.11(composite 7 scales)

Phillips (1935) 173 elementary & 3r. tigh acbool Average superintendent Phillip' & Manson .16& principal rating.

Sigma score rating Phillips & Hanson .10

Sandirmd, et al. (1937) Top and bottom thirds 420 Practice teaching grade Expect to teach a life- 3.5 (Critical(approx.) student time ratio)

Seats. (1946) 25 Ilementiry, 2 yr. experience Supervisor ranking Strong interest -.08

/once (1946) 49 high school Supervisor ranking Link Activities & In-terest

.36

27 high school Pupil gain Link Activities & In-terest

.07

12 English Pupil gain Link Activitibe & In-terest

-.34

Line (1946) 41 high school women, 1 yr.experience

Supervisor rankings Teaching listed as firstchoice of occupationwhen teacher enteredcollege

.Z7

Espenschade (1946) 46 Aveisai eddcation, 1 yr.exporter'''.

Supervisor rating Strong interest .07

a Pearson eoe r r formals, eleamltary teacher staple.

b v..,ann cos r r formula, high school teacher ample.

100

structure, nor did desire for service to society or prestige and respectof the profession. Factors bearing similar labels were found in analyzing

the results for men and women. However, these factors were only broadlyalike and had somewhat different arrangements and loadings of the vari-ables. An interest in "security," for example, as interpreted from themen's data is not precisely that interpreted from the women's data. Cor-

relations between reasons for choice of teaching as a profession and ac-ceptability ratings differed slightly between the men and women subjects.Items which had a c rrelatikal of .30 or higher, in either the paired-com-parison or ranking questionnaire, for the women were: "relatively good

financial reward," "ease of getting a position," "clean, attractivephysical surroundings," "short working hours," "frequent vacations," and"environment of interesting co-workers." Items which had a correlationof .30 or higher for the men were: "security against job loss and layoffs,""clean, attractive physical surroundings," "opportunity for professional ad-vancement," "opportunity to serve society," "ease of getting a position,"and "opportunity to pursue a favorite interest." Multiple-correlation co-efficients between acceptability ratings and raw scores in the men's paired-comparison questionnaire were .64, and for the women's questionnaire .44.Multiple correlation coefficients between acceptability ratings and rawscores for the ranking questionnaire were .76 for men and .78 for women.It appears to the reviewers that Ringness may have gone somewhat furtherin his interpretation of his data than the size of his N's justifies.

The Relation of Voice and Speech CharacteristicsTo Instructor Effectiveness

Shannon (303), in 1928, reported that the teacher2s voice was placedeleventh in order of importance among qualities listed by 3317 high schoolpupils and ninth in importance by 107 university students. One hundredtwenty-four crfl.: I teachers placed voic3 second among personal and socialtraits considered essential ',c effectiveness that wer found to be weak

in student teachers under their direction. Voice did not appear amongthe 15 most important qualities mentioned by 97 supervisors.

In 1951 Richey and Fox (274) had 1883 high school boys and 2022 highschool girls in Indiana check characteristics that pertained to their best-liked and least-liked teachers. Among characteristics of the best-likedteachers, the item, "had a pleasant speaking voice," was marked by 76% ofthe boys and by 84% of the girls. Of the characteristics of the least-liked teachers, "had bad speaking voice" was designated by 39% of theboys and by 37% of the girls.

In other investigations discussed under the section on Opinion S,udiesvoice was mentioned among the ten most important teaching characteristicsin eight studies of high school pupils, nine studies of college students,and two studies of administrative groups. Voice was not included amongthe first ten traits in opinion studies of two grade school groups, four

102.

studies of high school groups, seven college student studies, one study ofadministrative opinion, and two opinion studies of teachers themselves.

In 1929 Barr (16) studied the characteristic differences in the teach-ing performance of 47 good and 47 poor teachers of the social studies.Twelve of the good and 17 of the poor teachers were listed as having goodvoices. Twenty-five good teachers and 7 poor teachers showed "conversa-tional manner." A repetition of the study with another group of teachersproduced similar results.

In 1941 Baxter (25) in an investigation of teacher-pupil relationshipsreported results when 42 teachers were studied by two observers. Voice andmanner of effective teachers were said to be original and intriguing whilenoneffective teachers showed voice and manner that were prosaic and color-less.

In 1943 Henrikson (150) made some comparisons of ratings of voice andteaching ability. Teachers were selected at random from the files of aplacement bureau. Results are shown in Table 32.

Table 32

Relation of Ratings of Voice and Teaching Ability

Variables No. of casesCorrelationcoefficient

Voice rated by supervisor of practice teach-ing vs. voice rated by school supervisor 433 .20

Teaching ability rated by school supervisorvs. voice rated by practice teachingsupervisor 433 .20

Teaching ability rated by practice teachinggrade vs. voice rated by supervisor 432 .27

Teaching ability vs. voice rated by samejudge:

Training school supervisor 434 .62

Public school supervisor 580 .58

102

The last two correlation coefficients in Table 32appear to be arather neat demonstration of the inability of judges to separate supposedlydifferent characteristics of individuals, viz., teaching ability and voice.Other good examples of the same inability on the part of raters can be beenin the investigations of Martin (218) and Eenrikson (151). In 1944 whenMartin correlated su,erintendente ratings of 123 teachers after theirfirst year of teaching with the same superintendents' evaluation of voiceand mechanics of speech, she obtained a correlation coefficient of .58.In a later study in 1949 Henrikson (151) investigated relations betweenpersonality, speech characteristics (voice, pitch, rate, quality), andteaching effectiveness of college teachers as rated on a five-point scaleby 150 college students enrolled in a speech course. He reported coeffi-cients of contingency ranging from .42 to .66 and chi-square values showingsignificant relationships between various qualities of instructors asdetermined by the student ratings.

From the studies reviewed above it appears, in general, that thequality of the teacher's voice is not considered too important by schooladministrators, teachers, and students. Halo effect or "logcal error,"which so often has been found a contaminating factor in ratings, alsoappeared to be present to a large extent in these studies.

A study, made by McCoard (210) in 1944 on speech factors as relatedto teaching efficiency, appears somewhat more promising. Speech effec-

tiveness of 40 teachers in one-room schools was measured by having 22speech teachers rate each teacher on a seven-point scale on each of 14speech factors. Recc:.Ings were made while_ each teacher read standardizedmaterial for three minutes and also spoke for three minutes on an assignedtopic. A special pronunciation test was also administered. Correlations

were obtained between the gains of 338 seventh and eighth grade pupils ina citizenship test and their teachers' speech scores. In the reading ex-

periment 12 of the 14 ratings on speech factors and the total speechscore were significantly correlated with student gains at the .01 level,and the correlations of the other two speech factors with student gainswere significant at the .05 level of confidence. The coefficients ranged

from .34 to .46. In the speaking experiment two speech factors, vari-ation in pitch and variation in quality, had correlations with student

gains that were significant at the .01 level. Eight speech factors and

the total were significantly related te gains at the .05 level of con-

fidence.

Correlations obtained between a composite of effectiveness ratingsby supervisors and reading scores were all significant at the .05 leveland all but two were significant at the .01 level of confidence. Thecorrelation between total speech scores (reading and speaking combined)and supervisors' ratings was .49.. Intercorrelations among variousspeech factors (pitch, quality, volume, rate, phrasing, distinctness,etc.) centered around .90 which led the author to conclude that evenwith trained judges an indication of general speech ability based on asingle factor will give as good results as a totalof judgments onseveral factors. McCoard reported correlation coefficients between

10.3

pronunciation test scores and other teacher measures as follows: pupil gain.02, supervisors' ratings .40, total reading score .49, total speaking score

.40.

In 1950 Huckleberry (159) investigated the possible relationship ofspeech to student teaching. He also attempted to develop means of iden-tifying significant speech qualities of student teachers and observed theeffect of improvements of speech on student teaching competency. Threespeech teachers rated recordings of 54 volunteer subjects (24 in the ex-perimental group and 30 in the control group) in terms of articulation,pronunciation, voice quality, voice pitch, inflection, rate, rhythm, andconviction. Huckleberry concluded that positive change in student teach-

ing proficiency, as observed by critic teachers, was directly associatedwith positive change in rated speech proficiency. The reviewers comparedthe correlation coefficients of his experimental and control groups, how-ever, and found the differences were not statistically significant.

While research on voice and speech characteristics tends to be some-what scanty, this area appears promising for research in the Air Forcetechnical school situation. It is possible that voice, apart from othervariables, plays an important part in supervisors ratings. It may be,

too, that speech characteristics constitute a crucial instructor variable,that in addition to certain subject-matter knowledge or other prerequisites,the competent instructor is the one whose voice appeals to his class. It

may be, on the other hand, that "actions speak louder than words," thatthe instructor who "knows his stuff" and is able to demonstrate his knowl-edge has little need for words. A potentially fruitful research approachto this problem might be first, to determine the extent to which studentgains in Air Force schools are related to the instructors' oral presenta-tion; and second, to determine whether or not this ability can be measuredprior to selection for the instructor assignment.

The Photograph as a Predictor of Instructor Effectiveness

Many school administrators require a photograph of the applicant toaccompany letters of application for teaching positions. In order to de-termine the validity of this alleged aid to selection, Tiegs (333), in1928, evaluated photographs as a means for teacher selection. He re-ported that rankings by five judges of teaching effectiveness of 25 ele-mentary school teachers on the basis of photographs gave rise to inter-correlations among them ranging from .00 to .50. Official ratings ofthe 25 teachers given by superintendents, after the rating forms had beenchecked by principal and general supervisor, when compared with rankingsby photograph produced a correlation coefficient of -.08.

Johns and Worcester (169), in 1930, also attempted to submit the photo-graph to an experimental check. In their study 6 faculty members of ateachers college ranked 6 men school superintendents or principals, 6 highschool, 6 elementary school, and 6 kindergarten and primary women teachers

104

on teaching effectiveness. Photographs of these 24 administrators or

teachers were then mailed to 748 judges: 61 superintendents, 38 school

board secretaries, and 49;placement bureau secretaries, The judges were

asked to,rank members, of each,group from the photographs as to their de-

sirability as teachers. The results showed every photograph assignedevery rank from one to six by every class'of judge. 'Correlations be-

tween composite rankings of judges of photographs and faculty committeerankings were: for superintendents and principals, -.10; for high schoolteachers, .14; for elementary school teachers, -.01; and for kindergartenprimary teachers, .37., No ong judge of photograph.; in the whole 148agreed With the faculty committee ratings for'any of the groups ranked.

Statistical Analyses of Instructor Abilities

Nine studies (12, 70, 142, 149, 189, 277, 295, 287, 288, 289, 314) reportresults of factor analyses of data from presumed measures of teaching abili-ties. In 1932 Butsch (70) by means of a tetrad difference analysis found ageneral factor among the intercorrelations of judgments of teacher traits.In 1943 Smalzried and Remmers (314) applied the Tilt tone method of factor

analysis to student ratings of 40 practice teachers on the Purdue RatingScale for Instructors. Two factors emerged Which they designated "empathy"and "professional maturity." Items which had greater saturation of "empathy"

were fairness in grading, personal appearance, sympathetic attitude towardstudents, and liberal and progressive attitude. The items with the greaterloading-for "professional maturity" were self-reliance, confidence, and pre-sentation of subject matter. The other items of the scale show lower andmore nearly equal saturation with both basic factors.

Hellfritzsch (149), in 1945, reported a factor analysis of some 27teacher variables using data from the Rostker (282) and Rolfe (280) ,studies.He concluded that four independent primary teacher abilities satisfactorilyexplain the intercorrelations observed between a battery of measures com-monly used in investigations of the nature, measurement, and prediction ofteaching ability. These he identified as: general knowledge and mentalability; teacher rating scale factor; personal, emotional, and social ad-justment; eulogizing attitude toward the teaching profession.' The fourfactors were uncorrelated with each other. Each of the several teachermeasures was dependent primarily upon only one of the factors. Hellfritzschalso stated his study revealed that no single teacher measure of those heused could validly be substituted for the actual measurement of pupilgrowth in evaluating the ability of teachers to teach. Supervisory ratings,he found, were only slightly related to observed pupil growth in socialstudies and, hence, Hellfritzsch concluded were of doubtful value as ameasure ofteaching effectiveness conceived in terms of ability to pro-mote pupil growth.

In 190 Schmid (295) conducted an investigation to determine by meansof,factor analysis if a few common factors might adequately summarizeareas of personality and ability of prospective teachers. Scores were ob-tiined by meatis of the Washburne Social Adjustment. Inventory, Mooney

105

Problem Check-List, Minnesota Multiphasic Personality Inventory, and per-sonal data from student files with respect to 24 traits for 51 male and 51female student teachers. The size of the total group tested varied from80 to 101 for the different variables. Schmid hypothesized that the fac-tor patterns would differ for males as compared to females and rar. separateanalyses by sex. Unfortunately this reduced the number of individualsrepresented by each correlation coefficient to such low figures (40 to 51)as to make the results of his analyses highly tentative. Factor analysisof female scores yielded four common factors, identified as "problems inresponse set," "professional maturity," "introversion," and "social ad-justment." In general, Schmid says, these factors failed to cut acrossareas measured by the personality measures he used perhaps indicating thatthese instruments are measuring different aspects of personality. Factor

analysis of the male scores resulted in two common factors, "social andeducational adjustment" and a "personality-psychological" factor. The

factor pattern of the male students showed a marked discrepancy from thatof the female students.

In 1951 Lamke (189), in a factor analysis of personality characteristicsas measured by Cattell's 16 Persohality Factor Test for 10 good and 8 poorhigh school teachers with one year's experience, found that responses ofgood and poor teachers did not fall into two well-defined and characeristicpatterns. There was some indication that some good teachers differed fromsome of the poor teachers on the responses associated with Cattell's Sourcetraits F (surgency vs. desurgency or anxious agitated melancholy), H (ad-vent.irous cyclothemia vs. withdrawn schizothemia), and N (sophisticationvs. simplicity). The reviewers are inclined to doubt the significance bothstatistical and practical of factor analytic studies based on 18 cases.

Ryans as part of the "Teacher Characteristic Study" has made a factoranalysis of trained observer ratings of elementary and secondary teach-ers on a classroom observation scale containing 20 items referring toteacher behaviors and 6 referring to pupil behavior. Results of thisstudy have been published in a number of different references (287, 288,289). A detailed account of the factors found is given in the sectionon Objective Observation of Instructor Performance.

In 1951 Hampton (142) published the results of a factor analysis ofsupervisory ratings of elementary teachers. Two different scales, apaired-comparisop scale and a graphic rating scale, were used. Hamptonconcluded that a'general factor did not account for the intercorrelationsof the ratings on either instrument. Furthermore, that a greater numberof factors was needed, namely six as compared with three, to account forthe intercorrelations of the same traits on the paired-comparison instru-ment than was needed to account for the intercorrelations of the ratingson the graphic scale.

In 1952 Bach (12) used the factor analysis approach in a study of therelationship of critic teacher ratings as student teachers and supervisory

106

ratings of the same subjects after they had had actual teaching experience.Bach found four factors for each of the two ratings, but only two of theseappeared to be similar.

In 1952 Ringness (277) factor analyzed data concerning reasons giveeby teachers for choice of teaching as a career. This material is dis-cussed in detail in the section on The Relation of Teaching Aptitude Atti-tude Toward Teaching and Interest to Instructor Effectiveness.

Two investigators (220, 285) have reported results of item analyses ofinstructor traits, one in terms of student change and the other in termsof principals' assessments.

In 1940 Mathews (220) made an item analysis of measures of teachingability in relation to student change. By means of a battery of tests hederived a composite index of the changes produced in seventh and eighthgrade pupils by 57 rural school teachers of social studies. The teachers

were given a battery of 11 psychological, subject matter, and adjustmenttests. Of the 1675 items in all tests given the teachers only 68 items,or slightly over 4%, possessed statistical significance in terms of pupilchange. Mathews concludes that the findings cast serious doubt on thevalidity of the tests studied as measures of teaching ability when pupilchange is used as a criterion.

Ryans (285), in 1951, applied analyses of internal consistency and ex-ternal validation procedures to test items measuring the professional in-formation of 192 elementary and 165 secondary teachers with one or moreyears' experience. He used three teacher measures: (a) scores on theGeneral Principles and Methods of Teaching Test of the 1949 NationalTeacher Examination battery; (b) principals' assessments by means of anobservation blank of teacher behavior in terms of pupil behavior, teacherpersonal-social behavior in the classroom, and teacher behavior indica-tive of intellectual and educational background; (c) principals' generalevaluation of teachers' over-all effectiveness on a graphic rating scale.The two principals' ratings produced an intercorrelation coefficient of.83 for both elementary and secondary teacher groups which might be ex-pected because of the common factors involved. Upper and lower ; ofteachers were segregated and 'analyses of the three measures and itemdiscrimination indexes for the teachers' test leere computed for thesegroups. The General Principles and Methods of Teaching Test, Ryans con-cluded, appeared to be made up of items that functioned satisfactorilyfrom the standpoint of interna2 consistency. However, when the test itemswere analyzed against either of the principals' ratings less than 20% ofthe 45 items discriminated significantly at the .05 level or better betweenhigh and low elementary teachere. Only 5% of the items discriminated be-tween high and low secondary teachers. Ryans attributes these somewhatunsatisfactory results to ". . . the doubtful validity and reliability ofthe assessments upon which the external criteria were based, the low re-liability of individual items, and'the fact, that understanding of education-al concepts comprises only one segment cf over-all teaching effectiveness. . ."

1C,7

In the opinion of the reviewers all the studies of the foregoingtypes so far reported suffer from inadequacies of criteria, tests, ornumbers of cases. it still seems possible that a more adequately de-

signed study might yield results of considerable basic importance to thesolution of problems of evaluating and selecting instructors.

Opinion Studies of the Personality CharacteristicsOf Effective and Ineffective Instructors

.For over fifty years attempts have been made to identify the person-ality characteristics of successful and unsuccessful teachers by makinglists of traits based on opinions. In most cases these lists have beenrade up of subjectively estimated characteristics of such a vague, gen-eral nature as to render any precise measurement of them impossible. Oneof the earliest studies of this kind was that made by Kratz (181) in 1896.When 2411 pupils were asked to indicate the characteristics of their bestteachers, the factors most frequently mentioned were: helped in studies,personal appearance, good, kind, pleasant, happy, jolly, patient, polite,neat. In 1929 Charters and Waples (74) collected some 2800 teacher.traitsas reported by 27 teachers, 14 parents, 10 pupils, 3 teacher agency execu-tives, and 2 professors of education. It might be thought that this ex-haustive and comprehensive list would be the list to end all lists. How-ever, more gapers using this approach have appeared since 1929 than everappeared before that date.

In the search for traits, qualities, and characteristics of the suc-cessful teacher, almost no stone has been left unturned. Table 33, lintsall available studies categorized according to the group from whom opin-ions were solicited. The studies are arranged in chronological orderunder each category.

Several of the opinion studies that are scmewhat interesting becauseof the novelty of the approach employed, the date of the study, or themagnitude 'f the effort involved will be briefly reviewed.

In 19C0 Bell (27), in a study of the teacher's influence, reportedresults of a questionnaire ccmpleted by 543 men and 488 women normalschool students. In indicating characteristics of those teachers thatwere most helpful the students' answers fell into four groups: (1)

moral influence; (2) personal interest, kindness, encouragement, sym-pathy; (3) intellectual influence; (4) self reliance. Almost all stu-dents indicated that they had had a teacher whom they positively dislikedor hated, The disliked tea& .ra were retorted to have a malevolent atti-tude, either active or passive, resulting in such behavior as unjustt.unishment, sarcasm, insult, and ridicule.

Shannon (303), in 1928, made a nDst comprehensive investigation ofopinions of the persoral and social traits of successful and unsuccessful

108

11/16 33

Optatea 3Ledits of ?mitt. Gual1t)49,. tra gorketertgles of Came.] Toatbers

Wm/Art etbc41 turilto eitgiald

?muth1W.1 et t26

r4111 114Mkt (334)171169 357)

11311140a1%7

Delsb14.16

11IS

114.4

11t)

avug1M 91)1941° Copps r (14)

%Irmo mu11126016 (34)LAI 011)tliatio (70Wean MOOrteall WI)

Lass24(1145'Mirt( )16

0440ar64 111)hattaiiii jada adds"t lksilipt IP)Allah &Al. (no

hatu-k(6 Data Papal' algaeNAM/ (In) 1131/batator VIM) MI

(p 1131110

Bain (SI 1144VIttp 1134 1%11/1, (31 1141

k46 (4.1) 101111:4 OS) 111Otaniberls 12

1101116=

Jr.466 (ra

1131111111911ratr(v1 ) 1,11WA

trim (14.) 143)Bart MS) 11)41legeltart 4 NM" OW 1916hrt111 164) '4114t1 (1

19001144

&SU1641.1 (9

:Ma asJO

1%3IMO19411

1141

144

1449.14 (191) 19011A49 6 144 (170 1911

Mt; m

164661 KAKI Asilleloo Del 41.41691?.4444'.' ..sum.

1100

1149191/

114430611Il ma

:119"133A6

)

tiros (116

gkiWOL61111tWilUgd1114944())) 107Proldsrilt0 11t7runt

)491ss(74) 1911

(173) 191'1Catt.61 11)1alga (111) 11)1066161

1

(11) 1%4

Unit 146e.14/0 aUgam

164441.4 f303) 1E4Imams 1/041 1%1

psislot 47 Wake 1talaltt 6We1411 ted

424r1.1m1 6 lapin (74) 109Cattell (72) 1131?HMO. 1134) 1937

MakowaAMMAJmAwbwAsuos. 1114ttattort 111144 (741 1921irta yrsuwra 1%111441444 ( 1141

1944

1117

193/1%11144

itu (116) 1119Shwas (306, 116:41 (11) 1%4

%1

UnagmaJOIWW.U.unklu?moll (r) -ubnulausAinsW 4n114:4.4 (170 11111461

U1)(1)

bighutuaLassula1996616 (170 11:2

..arters 44,91.4 (74) 1919

;'11)Its

(kw,. *I atttilrea 11n4 144sAtt%141164/4 44 marl

twif46 1491,1 174) 191

InelmlIng Onto I.

secbndary school teachers. He interviewed 97 "selected' supervisors; huhad 3311 high school pupils and 107 university students list good and badqualities of teachers; and he asked 124 critic teachers to 1.st ,personal'and social traits. found to toe, weak in student teachers under their., difeic-

tibn.l!SAannon also studied de,problem by making analyses of traits useilon rating scales, recommendation procedures, reasons fur teacher cailure,trait's considered in tcrtification and cedes of praessional ethics 'forteacher0. Among teacher traits Shannon found -to be considered most im-porcant were such qualities as stimulative power, forcefulness, sympattiv

self,corltroloand fairness.

In'1929 Jordan (173), in a study of personal and social traits asrglated to high school teaching, used a questionnaire of 46 traits. The15 trait's considered of most importance, the 16 of medium importance,,and the 15 of least importance were checked by 150 high school pupils,120 teachers, 100 supervisors, and 120 school patrors. As an exampleof the outcome of typical studies of this kind, the 5 most important andthe 5 least imortant traits as listed by the various groups are givenin table 34. The rather remarkable agreement among the four groupsstudtvitsugges!s the probable existence of powerful cultural stereotypesirlael region where the study was ponductW. This conclusion is em-phasized by the comparative lack of importance indicated by other studiesdf certain factors judged.among the most. important in Jordan's study.

109

Table 34

The Five Most and the Five Least Important of 46 Teacher TraitsAs Ranked by Four Groups of Judgesa

Most important trait

Pupils Teachers Supervisors Patrols

1. Fair Intelligent Tactful Intelligent.

2. Intelligent Tactful Intelligent Fair3. Interesting Healthy Fair Broad-minded

4. Broad-minded Broad-minded Cooperative Tactful5. Cheerful Cooperative Healthy Patient

Least important trait

42. Dignified Trustful Ready ofspeech

In touchwith life

43. In touch Willing to Of broad in- Trustfulwith life lead terests

44. Thoughts cen- Reverent Thoughts Proud oftered out-side of self

centeredoutside ofself

profession

45. Reverent Modest Willing tolead

Of broad in-terests

46. Proud of pro- Thoughts cen- Modest Tilling tofession tered out-

side ofself

lead

--_-_-----_-_

aJordan (173).

411.10M1

In 1929 Klopp (177) gave results obtained by asking sumner school pu-pils in junior and senior high schools to compare 81 practice teacherswith an "ideal teachers' on 10 traits. A majority of the pupils rated theirstudent teachers as equal to the ideal teacher on eight of these traits(kindness, neatness, fairness, patience, approachableness, sense of humor,enthusiasm, willtngness to help). Percentages for the different traitsranged fromi56% to 78%. The majority rated their teachers below the idealteacher for thoroughness (55%) and discipline (62%).

In 1932 Kyte (187) asked 69 supervisors to analyze their most seriousproblem teacher. The.supervisors rated their unsuccessful teachers on 53characteristics. Among these deficiencies judgeA most important weredeficiencies in leadership, In influence on mils' habits, in selectionof method, in cot. .' Of class, and in work responsibility.

110

In 1936 Engelhart and Tucker (108) asked 224 high school pupils to

check a list containing 1C0 positive traits and their corresponding oppo-

sites for the teacher they considered best and also for the one consider-

ed the poorest.' Of the 1C0 traits, 46 were found to correlated signifi-cantly and positively with quality of teaching. The highest tetrachoric

coefficient of correlation was .93, the 46th was .32. Of the 46 traits

correlated 25 were .72 or above. The traits showing tetrachoric coeffi-cients of .80 or higher were good judgment .93, clear in explanation .88,respecting others' opinions ,86, sincere .83, impartial .83, fair .82,appreciative .80, interested in pupils '.80, broad-minded .80.

Tostlebe (334) made an analysis of the relative importance of varioustraining factors to success in the one-room'rural school. A check list

of 135 items arranged as a four-point scale was, marked by 40 specialistsin the field of teacher training and 40 county superintendents. Split-

half reliability coefficient for the specialists was .85 and for thecounty superintendents .86. A correlation coefficient of .81 was ob-

tained between the judgments of the 40 specialists and the 40 superin-tendents. A weighted :ndex was obtained for each of the 135 success

factors which were then divided into fourths. The'type of successfactors which most predominated in the top fourth were those, centeringabout assignments, individual differences; study periods, mastery offundamentals, unit method of instruction, adjusting programs, teacher'spersonal self, and the relationships of the teacher to child and parents.

Daniel (91) compiled opinions of 20e superintendents, 267 principals,29 supervisors, e46 white teachers, 602 Negro teachers, 1659 white eighth

grade pupils, 523 Negro eighth grade pupils, 998 white eleventh gradepupils, 378 Negro eleventh grade pupils, 1351 white patrons, and 973Negro patrons. Each of the above individuals indicated the qualificationsof the teacher whom they considered best Within their experience. All

groups followed remarkably similar patterns giving first place to quali-

tips related to professional interest and competency, followed by per-sonal qualities.

In 1948 Witty (357) listed in order of frequency traits found in14,000 letters 5utmitted by pupils from Grades 1 to 12 in a contestwhich reluired them to describe the teacher who had helped them most.In a second study of 33,000 such letters the list renained substantiallythe same. The 12 most frequently mentioned traits in order were: coop-

erative and democrati'c attitude, kindliness and consideration of the in-dividual, patience,, wide interests, personal appearance and pleasingmanperilairness and impartiality, sense of humor, goodidispositton andconsistent behavior, interest in pupils' problems, fleXibility, use ofrecognition and praise, unusual proficiency in teaching.

,Undesirable characteristics were also analyzed in the second study.In order of frequency the 12 most often mentio:.ed negative factort were:

Letferedand intolerant, unfair and inclined to have favorites, dis-inclined to enow interest. in the pupil and to take time to help.him,'

unreasonable in demands, tendency to be gloomy and unfriendly, :arcastic

and inclined to use ridicule, unattractive appearance, impatient and in-flexible, tendency to talk exceesively, inclined to talk down to pupils,overbearing and conceited, lacking in sense of humor.

In a study made at Brooklyn College and reported by Ooodhartz (131)in 1948 and by Riley et al. (276) in 1950, 6681 students at Brooklyn Col-lege selected from a 10-item list, 3 qualities which they considered tobe of the greatest importance in a teacher in the biological and physicalsciences, the social sciences, and the arts. This study has a certainunique, value in that it secured opinions concerning teachers of differentsubject matter and did not assume that all good teachers would have thesame qualities regardless of the subject they taught.

In 1949 Irwin and Irwin (162) obtained an appraisal of certain teachertraits by 415 senior high school students by having them list words thatmight be used in describing good and bad teachers.

Using 694 students from four college classes Bradley (50), in 1950,using an unstructureu, open-end questionnaire technique, found that withrespect to college teachers and their teaching, students like such fac-tors as "teaching efficiency," %este stuaental needs," "puts subjectmatter across," "facilitates learning." These were mentioned 1649 times,or more than all other factors put together. Similarly, in terms of dis-like, the negativesof these factors appeared 1507 times, again more oftenthan all the other negative characteristics combined.

The results of all of this effort in conducting opinion studies of in-structor personality characteristics appear to be largely sterile interns of usability for evaluative or selective purposes. It seems quitepossible that anyone who had passed th.ugh the average American schoolsystem could sit at his desk and devise an "armchair" list of charactcr-istics of the effective as opposed to those of the ineffective teacherthat would be quite as useful as any list thus far developed. The trend

in present day research in the area of selection and evaluation of per-sonnel is definitely directed away from opinion studies as sources ofideas concerning the requirements of teaching and toward the use of psy-chological theory and rationale in the development of systematic sets ofhypotheses to be tested with objective tests and observational techniques.

Carefully designed opinion studies of personality characteristicsof instructors might lead to some understanding of why supervisors' rat-ings of instructor effectiveness, which are based on opinion, fail tocorrelate with the student gains criterion. Investigation might aleo bedirected toward the problem of providing sounder bases for supervisorjudgment. It is possible that in such studies the use of some of themore recent methodological refinements such as Stephenson's Q-techniqueor Csttell's R-technique might be productive of more operationally usefulresults.

112

It should be pointed out perhaps that mere collection of great massesof data does not necessarily produce a more effective study. Adequate

sampling might have eliminated, for example, in the studies of Witty(356, 357), the arduous task of going through 33,000 or even 14,000 let-tbrs withput sacrifice of any meaningful finding.

Causes of teacher Failure

In a number of studies attempts have been made to set forth the causesof .teacher failure. Several of these (60, 68, 187, 213, 216, 234, 237,278, 111) merely report summaries of superintendent's' reasons for.dismis-sal of unsatisfactory teachers or give superintendents' opiKlions as towhat,constitute the chief weaknesses of failirg teachers. Other investi-gators, Andersen (5) and Morrison (231), include. reasons for failure asreported by school board members. Mott (236) queried 200 teachers ofagriculture, while James (164) canvassed opinions of college freshmen,schobl administrators, and teachers themselves. School principals wereincluded in Littler's (205) survey. 'McLaughlin (212) made a case studyof 98 effective and 16 ineffective female elementary school teachers.

The first such report available to the reviewers, that of Littler (205),in 1914, mentioned weaknesses in maintaining disciplines in teaching skill,interest, personality, effort, and cooperation as the most important causesoffteacher failure. Subsequent studies have more or less reiterated insomewhat variedterms the findings of this earlier report. Poor mainte-

nance of discipline and 1pck of cooperation tend to be listed among thechief causes of failure or dismissal in most of these studies. Health,educational background, training,'age, and knowledge of subject matter, onthe other hand, appear to be relatively unimportant factors. These in-vestigations are marked by a complete absence of operational definitionso: the terms used, so that any estimate as to the importance of the vari-dus factors depends entirely upon the personal likes and dislikes, pre -cbnceptions and misconceptions of the judges and upon their individualinterpretation of the terms. In none of these studies was any attemptmade to observe unsuccessful teachers systematically in order to deter-mine those specific behavlr,....s which differentiate the ineffective fromthe successful teacher. Another important consideration in evaluatingthese 1 odies of the causes of teacher failure is that the. stated causesmay have been concocted. efts., the decision to reliePe the teacher offurther duties had been made.

As in the case of opinions regarding the unsuccessful teacher, manyjudgments have also been made as to what constitutes good teaching prac-tice. No one knows, however, towhiat extent manifestly undesirable be-hav2br may be offset by presumably desirable factors. In othei words, no

brie has determined what constitute the allowable instructor idiosyncracies.A potentially fruitful approach to the problems of determining instructoreffettiveness might well be the'investigation, through objective observa-tion techniques, of behavior characteristics commonly deemed to constitute

113

unsound teaching practices by educational authorities. Then study should

be made of the extent to which such pedagogically undesirable behaviors

may be present without appreciably reducing the efficiency of an instruc-

tor in terms of pupil gain.

Personality Te:ts of Teachers

Investigations of the relations of personality test scores to meas-ures of teaother success have yielded widely varying results. In Table

35 are summarized results of studies it which attempts have been madeto related various personality measures to measures of instructor effec-tiveness. The material has been grouped according to the personalitymeasure used. It will be noted that correlation coefficients computedbetween scores obtained on the several sections of the Bernreuter Per-sonality Inventory and various criteria e instructor effectivenessrange, for "neurotic tendency" from -.31 to .17, for "self-sufficiency"from -.24 to .20, for "dominance-submission" from .CO to .33, for "ex-troversion-introversion" from -.14 to .01. Correlation coefficientsfor the Bernreuter-Flanagan self-confidence scale vange from -.38 to .00and for the Bernreuter-Flanagan sociability scale from -.26 to -.06.

High scores on the Bell Adjustment Inventory and on the ThurstonePersonality Schedule are associated with poor adjustment so that negativecoefficients with effectiveness might be expected. As reported by vari-

ous investigators these range from -.04 to -.40. The positive coeffi-

cients given in the Gould (133) study probably indicate only that hereversed the direction of his scores so that the results among severalsets of variables would have comparable directions. Although thetetrachoric correlation of .52 found by Cooper and Lewis (83) betweenpupil rating and absence of neurotic sign on the Rorschach is higherthan is usually found with supposedly more "depeldable" data, the authorspoint out that extent of overlapping prohibits the use of neurotic signsfor individual prediction. An important feature of the Cook and Leeds(80) and Leeds (198) studies was the use of item analysis against theexternal criterion of teachers designated by their principals as he

best and worst in the schools in getting along with children.

Ryans (286), in 1951, as part of the "Teacher Characteristic Study"referred to earlier, studied the relationship of scores on the ThurstoneTemperament Schedule for the upper and lower 27% of a group of 275 ele-mentary teachers selected on the basis of composite observer ratinv.These ratings had been factor analysed by the centroid method and yieldedfive oblique factors which appeared to refer to: (a) pupil participationand teacher open-mindedness; (b) controlled pupil activity and business-like approach; (c teacher calm and consistent, liked because "human ;"(d) sociability; (e) appearance end attractiveness. (This last factorwas not Tied in the analysis.) Differences for the "vigorous" categoryof the Thurstone Temperament Schedule were significant at the .01 levelfor Factor (a); for the "impulsive" category at the .05 level for Factor

Palle IS

flalstIont of Personality loo.nre 1 17.4ourn at Iretrreter ttfectlrtetes

ttocktc.nult- Pelualtr_ljlettivenete syrialtice

Derszystar ?groom lit, %Iwo:44r,

S 0 1i. A.laysint 1194/ 107 etedent Pretties teechtny prod, ..21 .03 ..Lk .3)

22111122 Osss) vs learetry i jr. MO Antp raoriniandent 6 principal .09 .01, .07acneal matins'

,,Porintendent miry Wye Kerte) .12 .11 .21

2a11ase I kneentrovt (21)6) 21 4124. Stsesrt rating (Pardso) -.39 -.11 .11

Ssedifer4 (11)0 IA/ (apprsa.) vtadoct intik, ...ening Andes .12 Isleektie ea, /2 p Al .2)

Vrl 11 Iirt 11413 OS relent Itraclice teaching retry (raper- -.11 .1. ...041 .26 .02 ..61ei Per'

Pratice tea:Sing reline (trine .09 .t) .01 .00 .431 .11t owe,)

tartan (141) 132 with 2 to 1 fr. tn'ettt,tt S'orint.neert reline Sersrester $ trusty 1.07) ittall 93.31 /pal 6 eetel1.4.171 etable 91.11 'reel &eits:lost.

c.o... Ims) r eleoert.ry, 1. I Pail inane .11 -.11 .06:eepeeite 1) teacher ssavetes .13 .11 .1)Ceepealte ( eurarintemisni, furor-

titer, ebeertor) I 21114 1.1119.16 Al .11

I Mine /cells .17 .1 .21

Late 11451 61 ..-16-2 1 and 2-roa Peril pin -.12 -.31 .0Partaer (:43) 29 e:,enelary, tarsi. 1..11 este ..II .*, .19 .hkW. (145) )1 not et Pretties teacsins rietir4 122994e 0940

a iyeksdado (M$)

99 e1s ertarr, I rr. ever-fen.,

9 4: aft, giqieleal ed.***** .

Moist 'trete, ry.144 ( es r tort ile)

Iteirietretse retina

.)0 .26

.01 .11 .12 .22I yr. exyerience

$411 127ettsent Ireentery2

hillipe ((I7) I') lseeetat7 I jr. kir. 42 *1 Icersye sated- (eneent II tritely.) ..011:taw4.0,-istisrent rot 4 (siva

ware.).07

0.604 (141) )1 rt seri Prectice teacticis retina .40

Caere (7544) 21 lloarttc?. '3 yr.. 9errienes 16nirietreter tarAin4 (pyres-11 II) .11

Jones (146) 41 410 fe1/44 1 Jetertleet titi.45 ItIt Mel tt tett] /Ile .remu tato) s1 sit 1 r. seralere friss 4.1 1.4:44: .371

tetteter4 Per soff.11t) tette,*

title' (last) Is sells t 'Attie* tes,eilig *lit.* .)4Savo (MS / I7 nil ye 4abletrter tuna* (rite,. tilt) .31

Oes14 0411 11) sit* 1 yr. egerteece Marital rating .1)5

ttssai (trs) r) 4,asseetsr/ 'RAY Cassestle redid If eiteerverl ttat kg of Plat ttiptiti V7 e...os 3r57 sp bst !sr

Sterse%act

ilistr (AA) rS 45. ssretee lbw (evert ese 14 ,s. tricssr1oves1) 13t1 7. 4044e5 pelt ItelsiNst.(14. of opine* se (01 t 4.I.emote) * .01)

137 pus rifts 1st %Mee e' sell 1.121(/r. of °peer* fee1.2j.etre

beer 1 14.41.. (7537) )0 IRS.49.4 4. 9 )0 laws% of 1y. hill :stingIto se sod iIss 131 "west

nIsirS4* of I. 4141 1141440 sely1a.

1146 4-srof as Os 1141 111.tetteer4 trtfrkaft WIWI feet eliettaert.

thetfteteell et molmoner.

115

moon)

bosses of .1/ tvetn-allere* te Sip theelel

tylselatt, 40- Is roles ire AltUfer.syst

)(feints/I st 04 et4t tAttIstersimit (SHY:Yes

%log IlleOlint S11/41,1 ttstletreaps Ice fen. eir 1-119*4

V' set 114.Wines

Investigator 7Nerir earls

Callis (1952) 62 elementary

Taylor (1951)

Calif. (052)

56 Auden!. union

31 etvds.rt Malin

77 elementary

!Mole 35 (Cont.)

Cried., t Newt:aro (1942) 166 student

Seaga (1915) V Rodent

Rodent

Seated (1946) 25 with 2 yr. overlent*

Sroebsvar (1940) 39 MO school

lirooterver (1915)

Conk II thado (1947)

Laois (1950)

66 high etheol. male

100 Punselettol

Canis (1954) 77 elammatary

(d,; for the "dominant" categoryfor tof,a1 rating, and fo ratingfor the "sociable" factor at therating of pupil behavior.

111414r, Of OffOttiTOO410

Student rating

Otter's.* rating

Practice teething rating (ever-visor)

17 idel-re. lov-rtrkod leaelAr

Stadent rating°boomer retiedPrincipal Teeing

Prattle* teaching

Prattle* teething rating

Practice teaching eating

Administrator ranking (percentile)

Main I et rater ranking (Percentile)

Stedont satins (Purdue scale)

(Soil gains to Meter, thlarmation

Peptl rating personal effective-beef

Printipalal rating - personal off's-titaness

txperts1 sting personal *fret-tie gene Of

Student ratingOneerver ratingPrinsipal relict

Correlation

Anxiety .141toR5lit, -.12anxiety .13Soul nit, ..16theist;neetility .41

Minneoeta 19211I1ktoit

9 outocores ..34 to 414

*Aerie (only aletifitant be-tom significant more .01.05 of the 9 martens)

it..,8 .11.11.31

fome-Wademarch toperament Stale

Sega% the f their*.

Quali tatty. .6)estimate

No- Amine .30

Qualitat ire .65ettimate

IStrfAralt .52

Kik alatOosO waintras

Stmlent person- .64to-person in-terattioa ra-tings

Pupil rating kw eignifi. Mg.personal rel.

Leeds ?weber- ASPupil lam-tory

Lode tuber- .13Pupil Layse-tery

Lsod o teacher- .19Pupil Invertory

Mm.timeta .19

Mita& .10esetor, (nil) .19

at the .01 level for Factor. (a), (d)of pupil behavior (taken separately).05 level for Factors (a), (d), and

Other personality testa given to teachers have included the PresseyX-0 Test (271), the hudisill scale for measurement of the personality ofelementary teachers (132), the Occupational Personality Inventory (101,102), tests of Cattell's primary source traits (297), Cattell's 16 Per-sonality Factor Test (189), Johnson Temperament Analysis, Minnesota Per-sonality Scale, and Minnesota P-S-E Test (337). Correlation coefftcientLwhere reported tend to be low and are probably not significant except pc!haps for some of those found by Schwartz (297). Using 3G teachers, hereports coefficients ranging from -.32 to .28 when teas of "primarysource traits" were correlated with practice teathing rating, and coeffi-

cients from -.60 to .31 (II = la) when the "pilmary solace traits" werecorrelated with supervisors' ratings.

116

Lemke (189), in 1951, attempted to find out if the personalities ofgood and poor teachers as evaluated by Cattell's 16 Personality FactorTest were characteristically different. He used Fisher's discriminantfunction and factorandlysis in the examination of his data. Results

of the analysis by either method failed to reveal a characteristic per-sonality pattern for either the good or the poor teachers. Lemke says

the Tesponse patterns of the teachers studied on thej personality fac-tor test suggest that "Itois possible that.personality traits need to be'balanced in a certain way for the teacher to be superior. Lackingthis balances, perhaps the teacher is likely to be only average; with

a certain makeup she may be poor." Considering the results of-the fac-

tor analysis of the responses to this test, Iamke concludes:

"Using Cattell's terminology,_it appears that good teachers arelikely, more than poor teacher's, tb be gregarious, adventurchis,frivolous, to have abundant' emotional responses, strong Artistic orsentimental interests, to be interested in the opposite sex, to bepoished, fastidious and cool. Poor teachers are more likely thangood teacfiers.to busy, cautious, conscientious, to lac) emotionalresponse and artistic onsentimental interests, to have a compara-tively slight interest in the opposite sex, to be clumsy, easilypleased, and more attentive to people." (Lemke, Reference 189 )

Other measures) related to personality tests, which have been stud-ied by a number of investigators are those pertaining to various aspects

of social adjustment. The results of the studies dealing with theseveriablas'are shown in Table 36. It will be. seen that most of the corre-lation coefficients foUnd between social adjustment measures and othermeasures of instructor effectiveness tend to cluster around zero. Some..

exceptions are evident in tho.case of the Washburne Social AdjustmentInventory endolackson!eNSocial Proficiency Test. Corkelations rangedfrorz.,.4 reported by Gotham to -.60 Illund by,Schwartz when scores on theWashburne inventory were correlated with ratings.1.IaDuke obtained acorrelatibn coefficient of -.37 when he correlated scores on the Jacksontest with pupil gains. The extreme variability of results found with theWashburne Social Adjustment Inventory and the gtnerally insIgnificantrelationships shown by other "social" tests suggest that such measureshave little to contribut4'ai predictors of instructor effectiveness.

Results obtained with personality tests of teachers have in.gereralshown wide "ria4on when dorrelateo. with measures of teacher effective-ness. Correlations range from rtIther large positive or negative relation-

ships to'sero or near zero relationships' depending upon the particular.situation andithe teacher measures used. There.are many conceivablA kindsof effectiveness even for teachers 6f the same subject or grade level inthe nine kind of community and therefore there will probably be differentpatterns of teacher personality'for such effectiveness. As Lemke andothers have pointed out, success jn teaching maybe a "balance" and topredict success it may be necessary to understand what is required for

117

the balance. Study of the association ofitraits, one by one, with successwill not suffice. The problem of determihing the personality patterns:ofthe effective teachers still remains unsolved, despite the fact that someso-called personality (and other) measures apparently show significdntcorrelations (either positive or negative) with certain measures of in-structor effectiveness. Carefully controlled, well-desighed studies em-ploying adequate numbers of instructors are neede0 to determine whatmeasures or combinations of measures have definite predictive value.There is probably'even a greater nee' for the development of adequaterationales, frameworks, and systems of hypotheOs which are based onthe'best available theories concerning social iilteraction, irterp(rsonalrelationships, motivation, and learning. Through research Wort thesetheories may then be related to specified dimensions of teacher personal-ity and performance.

IMPLICATIONS FOR FURTHER RESEARCH

After scrutiny of several hundred research studies pertaining more orless directly to the identification of instructor effectiveness, the re-viewers have arrived at certain conclusions with respect to the areas in

Mat )6It/lathe of Serial 1.4)%stsoot Rearms ti 1toarurre of Inotreeter Martinets'

Isivostitstor toOrhr molg Ton matt atonr4 it fftetifttatot Oernlatiae

Perris (1921)

Parr. tub (19)5)

WAAL' (1%5)

kits (OM

60 student

66 12091151r7

57 ltatntory; 1. and 2-mem

1ntef7. 1- and 2-rss

vSywattg4 (d.yis.d by author)

1%)s. Serial !Atoll.

traelbarto Serial A4Nteart !Rmstor?

Waortexte Serial Kfettratt :Arm-targ

Predict tosedia4 yell

fepil gaSoportetonin ratingtiAg

feeepolit 7 solos)Popil gainCdtpotitt rttlit (7

otgloo1C,40it rating (1suit')Pori) gala

.16

.

.1)10

.06

.1.0

.10

.C.S

tattier (1949)

Oval. (1147)

21 loesetarr, ?oral.

U) leith 1 ft. orporionee

440.1rarnr Serial 11)ootsent Noon-tor,

lfaorberea Serial id)aotaorit (Ann-tery

Pop 11 gala

Priselgel rating

.14

)Sa

Mooch (1941) 91mototary ia Abort. Serial 14)osteart Irrro- Pupil gall .oator, tsttrintondftf rating .0;)

Srbrarti (11i0) .11 wit% I yr. fogerionea llagAbarno Setts) Ilaysttatt.f Lem-tory

Sogerrioer feting .60)4 ttodtat Vorl.S4rso Serial 11 )osteorg larrort-

toryheath teething rating .22

Wets (MS )1 alogestarg, 1 -rose haseals Serial Proficonty fart' gala (ceogeolta) .)7Soage (111)) 71 ttident 41111.4162 keetirrel %Urn, Solo Prottit teleking feting .19

Strang at /lung R.2 seals Proffitt tuella( feting .01

%ages 11%6) Si oloretarg, I gr.omporiont

leogtSViltreet P-P male

Sep rel ore roririm ( for-toot 11o)

.02

.04

1101114......1011.tthfite it 1. ar4 !rear arblola.tooffieleAt of owofingtsfy.

118

which further research is needed and in which the probabilities of securingworth-while results appear greatest. In certain other areas, however, the

available studies seem to demonstrate beyond reasonable doubt tha. research

has alread:;, proceeded for a considerable distance up a blind alley. The

problems which in the opinion of the reviewers appear worthy of further re-search fall into both the main categories into which the review is organized

those problems having to do' with the search for more adequate criteria ofinstructor effectiveness and those problems concerning discovery or im-provement of predictors of the Criteria.

Criterion Research

The changes induced in the students by the instruotor appear to con-stitute the Most important component of any criteria of'instructor effec-

tiveness. As Orleans et al. (249), Evans (382), and others have pointed

out, the ideal criterion of the effective instructor is probably a compositeof several measures. For Air Force.instructors it seems obvious that therelative gains in subject-matter knowledge of groups of students underdifferent instructors should be 'a most important element in this composite.

The Air Force technical schools because of the large numbers of personnelinstrueting in the same subject-matter fields offer an ideal situation inwhich to make a thorough investigation of this, criterion.

The results obtained from any simple use,efiraw gains scores are cer-tain to be.misleading; -Th4 adequate use:of the;gains criterion requiresthe control of 'such variables as student:aptitude?, ability and motivation,the effects of distractions, diverse classroom conditions, cultural dif-ferencep in different localities, and the like.

Theireliability of a medsur; of instructor effectiveness should be thereliability of thiat effect on different'. or successive classes and not thusplit -half reliability determined from the same olass in which'situationaland temporal varianceqmore properly reviewed as error variance) increasesthe estimated reliability. This involves rather elaborate design and sta-tistical manipvlation much beyond.the scope of the average school system

or the average,supervisorts 'apabilities. As a practical measurement de-vice, srart from its use 4n an experimental situation, the measurement ofstudent gains affords a costly, unwieldy,, and laborious method of evalu-sting instructors. If 'it can be shown that student gains correlated ade-qately with some other moreieasiik obtained measures, these latter couldbe used, for most research ,and Winistratiye purposes as substitutes.

The ddmand continues for more Objective measures to be used for in-structor selection and evaluation. Precise methods of direct observationhave been little used in.deterImiting 'instructor efActiveness, probably'because of.the inherent diffiOulties in their, epplicatior. Such observa-

tions require study as potential Predibtors'of other criteria of instruc-tor performance; measures of Observable behavior which turn out to be validcould then, 'in turn, be further used as criteria for future research or for

119

practical application as evaluation indexes. Exploratory studies designedto investigate various techniques of instructor observation are thus ur-gently needed. The utilization of tape recorders, photographic, and otherrecording devices in connection with observation of instructors has notbeen thoroughly investigated. While some work has been directed towardobserving instructors in a classroom situation there appear to be few,if any, studies of methods for making reliable observations of instruc-tor and student behavior in the laboratory or shop. In this connectionthe methods of Olson and Wilkinson (248), by means of which they attemptedto determine differences among teachers in terms of the amount and kindof verbal direction used in controlling behavior of elementary schoolpupils, appear worthy of further investigation. Their techniques, ifmodified to suit adult students, might well produce results of value inthe evaluation of instructors and instructional methods in Air Forcetechnical training schools. Observation to be of research value, how-ever, must be repeatable by other scientists. Judaments of instructorsthat depend for their accuracy on the intuition or diagnostic skill of alone observer are not adequate data for research. This may mean thatevery possibility of success is eliminated, but it still remains to bedemonstrated that behaviors which can be reliably observed by differentobservers and which are reliably associated with different occasions (aretypical of the instructor) are not related to effectiveness.

The relatively high coefficients obtained by Shannon (307), when hecorrelated student attention scores with scores on achievement tests, alsosuggest a lead which might prove useful if applied to students in an AirForce situation, despite Shannon's rather low opinion of his findings. (Seethe section on Objective Observation of. Instructor Performance.)

There are, however, other aspects of the instructor's performance thatmay play some part in his over-all effectiveness as a member of a groupwith a common goal. For instance, the instructor has certain administrativeand clerical responsibilities that, while they do not add to student gains,are important to the orderly administration of the training courses. Fur-ther, it is possible for instructors to contribute to a greater or lesserdegree to improvement of the curriculum and to the development and promo-tion of better methods of presentation. Estimates of the extent to whichdifferent instructors make such contributions are probably best obtainedfrom supervisors' ratings of instructors.

Additionally, it seems possible that the behaviors and expressed atti-tudes of the instructor could have a marked effect on the willingness ofboth his students and fellow instructors to work together to accomplish agroup mission. In other words, the influence of the instructor on schoolmorale may also be an aspc,et of his effectiveness. This aspect would prob-ably best be reflected in ratings of.the instructor made by his fellow in-structors and by his students. Nothing is known of the amount of inter-relationship or the extent of independent reliable variance likely to lafound in such measures in Air Force schools. Considerable research effortwould be necessdry to determine the weightings that should be used in anycomposite criterion of instructor effectiveness.

120

The general unsatisfactory nature of past rating methods has stimulatedthe search for more satisfactory techniques. Among rating methods theforced-choice technique evidently offers some promise for operational usesince it tends to reduce Useability. Considerable research would be re-quired, however, to determinepe value of forced-choice scales devisedf6r use, by student raters, fellow, teachers, or as self- rating scales. It

must be determined also whether or not repeated ratings on forced-choiceforms, like those on graphic scales, tend to become progressively morelenient and less valid.

Little practical use has been made of fellow teacher ratings in civil-ian institutions. While an instructor's opinions of hi's fellow,instz'uc-tors may be biassed, it is more than probable that through his day -to -days,

close contacts with them he knows what kind of instructors they are. Hisrelationships with his fellow instructors being different from those ofthe supervisors or students will enable him to know them in a somewhat'different way and his judgments of them ''ill be baLed on this differentpoint of view. Peer ratings of instructors in the,Air Force should receivefurther investigation, either through forced-choice or other methods, inthe expectation that they might be used to corroborate supervisor ratingsor as a part of a composite to b ng about a more adequate rating of in-structors thah supervisor rattnc, used alone.

Student ratings are being more and more.widely in civilian schoolsand colleges, a tend in.keepingyJ+h he present day tendencies- .t, give'greater emphasi_ to the democratic ,...ecess in education. The argument is)frequently advanced that in the Air Force technical schools the phases eraso short that the student has Insufficient time to get well enough ac-quainted with his instruotoor to make adequate judgmant,of him. Thetotalhours an Air Force technical school.Student spends with his instructor,however, are often considerably greater than the time a college studentspends with his instructor during a one semester college course. As inthe cage of.peer ratings, student ratings have played no great part inthe evaluation,of Air Force technical training school instructors. Thor-

ough study would be required to determine their utility for selfLimprove-ment of instructors and also to discover theiy value as a criterion per seor as a predictor of gains or other criteria of instructor b2fetiveness.

It is possible that the use of a composite criterion will obf&ure pat-terns and significant elements or specific aspects of effectiveness. It

may be difficult to add together, say by means of regression equation tech-different'components of teacher effectiveness so that a high,degree

of ovite'component is allowed to counterbalance a lowidegree of another, whenboth may be equally important in their own way. Thus, it may be necessaryto develop hew ways of combining of otherwise utilizing several criteria.The develoyMent of such a composite will require CIA) best availab1d.judg-Mention the part' of psychologists and schoOl administrators as to the rela-tive weights.to be assigned considering the interrelations found.

121

Since many of the studies reviewed have been concerned with ratings, inthe foregoing discussion of implications for further research on criteria,the reviewers haveiemphasized methodological considerations. 'The major prob-

lems of research. on criteria, however, may not be methodological but ratherconceptual or definitional problems, The objectives of training programsneed to be defined, students' achievements of these objectives insofar asthey can be' measured need to be ascertained, and the effects of instructorson these achievements need to be isolated.

It,is contended by same educational authorities that no kind of ratingon any kind of scale by any kind of person is likely to provide an accept-able criterion until it can be shown to be related to student change inthe direction of the educational objectives of the sphool or training prd-gram. This is an extreme posMon which would appear to rule out, as un-acceptable, ratings which tend to show negligible correlations with stuldent gains. While it appeare'feasonable that measurable student changesshould constitute a part (perhaps the largest part) of a total criterion okteacher effectiveness, it is also possible that ratings may reflect areasof effectiveness not directly measurable, The question of whether ratingsare acceptable as a part of a total criterion depends on whether there arelogical grounds'forbelieving that the teacher can contribute to the ac-complishment pf school objectives in addition to his effects on his own.students. If it seems possible for teachers to contribute differentiallyto ,the group efforts through work on the curriculum, through development ofimproved methods, through theirdinfluence on group morale, etc., then theuecontributions should be-a part of 'any tOtal criterion of effectiveness.If it likewise-Sems possible that ratings might reflect the quality of ateacher's participation in the group ,effort, then the use of ratings as anelement, in a total criterion of effectiveness is justifies.

To the reviewers the ma30r, problem connected with ratings is not thejustification of their use, but nath6r the improvement of their accuracy.

(Predictor Research

Resea'rch on predictors necessitates formulation of hypotheses and thedevelopment of conceptual frameworks tased on the best available psychologi-cal'and educational theories. These hypotheses will reflect the rationalethat certain traits or behavior of an instructor may be expected to be re-lated to and hence may be used as predictors of instructor competence. f'or

example, hypotheses might be set up with respect to the relation of instruc-tors' intelligence to instructor.effectiveness for different kinds of sub-ject matter. Similar hypotheses might be generated for age, experience,extracurricular activities, sex, verbal fadility, and other instructor vari-ables.

. The deferential relations of instructor intelligence to instructoreffectiveness for different kind's of subject matter should be determined.Likewise the bptimal relations between.instructor intelligence and the ap-titude and experience levels.of students shouldice investigated. The

122

student gains criterion might be used to determine the value of intelli-gence AS a predictor of instructors/ competence in courses of differingcomplexity. It is quite possible that the intelligence factor when usedwith other instructor measures might contribute materially to an instruc-tor selection battery.

A number of investigators have shown a relationship between instruc-tor effectiveness and age or experience which appears to be curvilinear.Teachers tend to reach maximum rated efficiency after five or more yearsof teaching experience. In the Air Force, however, extremely few (approxi-mately two per cent) airman instructors remain in a teaching assignmentfor as long as five years. If the Air Force is indeed losing the majorityof airman instructors before they reach their period of maximum efficiency,a change in policy might be anticipated.

The evidence suggests that the kind and number of activities a teach-er has engagedin may have some relation to his effectiveness as a teach-er. This finding, as shown with respect to certain specific extracurricu-lar activities in the case of some civilian school teachers, might alsoapply to Air Force technical school instructors. A study might be madeto determine if past participation and interest in specific activitiesor in many varied activities) are related to an instructor's success intraining student airmen to become proficient in varied technical schoolspecialties.

No fundamental differences in instructional effectiveness between menand women teachers have been demonstrated. Although these findings wereobtained in quite different training situations from Air Force technicalcourses, the possibility of utilizing WAF instructors should not be over-looked.

The rather interesting findings of McCoard (210), with respect to verbalfacility suggest several potentially fruitful areas of research: (a) todetermine the relationship between the verbal facility and technical in-formation an instructor shows in the classroom as compared with his abilityto demonstrate equipment and. procedures in the technical laboratory or shop;(b) to determine the extent to which an instructor's aoility to organizeand present verbal material is related to the subject-matter gains of hisstudents; (c) to find out if verbal facility can be measured and used aspart of the instructor selection procedure.

The investigations of factor analysis of instructor abilities so faravailable are somewhat vitiated due to inadequacies of criteria, measuringdevices, or numbers of cases used. A more adequately designed investiga-tion might yield factorial results which might prove of considerable valuetoward the solution of instructor selection and evaluation problems in theAir Force.

The personality patterns of the successful instructors have not yetbeen determined. This does not mean, however, that this alizoach should beabandoned. Carefully controlled, well-designed experiments employing

123

adequate numbers of instructors would be needed in which plausible measuresor combinations offisuch measures are investigated. Certain tests used inpreliminary studies, have shown promise. These should be used. in more thor-oughgoing experiments. The search should continue also for new and untriedmeasuring instruments in the hope that some device will be discovered whichwill enable the Air Force to predict teaching success of instructors .111training and ti., evaluate instructors on the job.

It should be pointed out that the importance of many of, the problemssuggested by this Research Bulletin has been recognized by the Air Force Per-sonnel and Training Research Center, and preliminary experiments in severalof these areas are now underway.

124

BIBLIOGRAPHY

1, ALBERT) H,R, Jai analysis of teacher rating by pupils in San Antonio,Texas. Eduo, Adm,22prvis,) 1941) 27, 267-274,

2, ALIEN Ha., Earmarks of a good teacher. Amer, Sch, Bd J,) 1938, 96(A 25 -26; 92.

3, ALMY1 n.c., and SORENSON) E. A teacher-rating scale of determined re-liability and validity. Edw. Adm. alkervis,) 1930, 16, 179..186.

4. AMATORA, S,M. A diagnostic teacher-rating scale. J. Psychol,, 1950,

30, 395 -399.

5. ANDERSEN) W.N. The selection of teachers. Eduo, Adm. Su ervis" 1917,3, 83 -90.

6. ANDERSON) H.H., and BREWER, HELEN M. Studies of teachers' classroompersonalities) Dominative and socially integrative behavior ofkindergarten teaohers. A2212., PeyoLol, Mono6E.) 1945; No. 6.

7. ANDERSON) Hal.) and BREWER, J.13, Studies of teachers' classroom person-alities) II: Effeovs of teachers, dominative and integrative contactson children's classroom behavior. ApplIpmh211_Monogr.) 1946) No.8.

8, AN1ERSOT4 H.H.) BREWER, J.E.) and REED, MARY FRANCES, Studies of teach-ers' classroom personalities, III: Follow-up studies of the effectsof dominative and integrative contaots on children's behavior. Appl.Psychol, Monogr., 1946) No, 11.

9. AMMON) H.J. Correlation betv'en academic achievement and teachingsuccess. Elem. Soh. J., 1931, 32, 22-29,

10, ARMENTRCUT) W.D. The rating of teachers by training teachers and super-intendents. Elam, Sob. J., 1928, 29, 511-516.

11, ASCII) S.E. Forming impressions of personality. J` Psycho)"1946) 41) 258-290.

12. BACH, J.0, Practice teaching success in relation to other measures ofteaching ability. J. exp. Edw.) 1952, 21, 57-80.

13. Unit) DI,E, Reply to Travers' A critical review of the validity andrationale of the forced-choice technique. Psyohol, Bull.) 1951, 48,421.434.

14, BAIRD, J., and BATES, G. The basis of teacher rating. Edue. Adm.Superris,) 1929, 15, 175-183,

125

15. EARNER) M, ELIZABETH, Summary of the relation of personality adjust-ments of teachers to their efficiency in teaching, J, eduo. Rea.,

1948) 41) 664-675,

16, BA2R, A.8, Oharao:srietio differences in theleuhlmplEformance offladanftkscE2;9achers of the social etudics. Bloomington) Ill:

Publio School Publishing 77710797

17. BARR) A.S. Teaching competencies. In W,S, Monroe MO) Eno clo ediaof educational research, New York: Nhomillan) 1950, Pp, 1 -1 5

18, BARR, A,S,) BEOHDOLT, B,V.) OW) W.W.) GAGE) Nat.) ORLEANS) J,S,,REMMERS) H,H,) and RYANS, D,G, Report of the committee on the cri-

teria of teacher effectiveness. Rev, Eduo, Res,) 1952, 22, 238-

263,

19, BARR) A.S., and EMANS, L.M, What qualities are prerequisite to successin teaching? Nation's Sob., 1930i 6 (3), 60-64,

20, BARR, A.S., =GERSON) T.L., JOHNSON, 0.E., LYON, V,E,0 and WALVOORDA.0, The validity of certain instruments employed inhe measure-ment of teaching ability. In Helen M. Walker (Ed.), The measure-

ment of teaohingsffioienoz. New York: Macmillan) 1933. Pp, 73-

11

21, BARTHELMESS) HARRIET M.) and BOYER, P.A. A study of the relation be-tween teaching effioienoy and amount of college credit earned while

in service. Edud. Adm. Supervis,, 1928, 14) 521-535,

22. BARTHELMESS) HARRIET M,) and BOYER, P.A. Relation between teaching and

amount of college credit earned while in service. Penn. Soh. J,,

1929, 77, 291.

23, BATHURST, J.E. Do teachers improve with experience? Personnel J.,

1928; 7,

24,

25.

261

BATBRAISTI J,E. Relation of efficiency to experience and age among ele-

mentary teachers. J, educ, Res.) 1929, 19, 314-316,

BAXTER) BERNICE. Teacher-pupil re1ationshiPe,.1941,

BEAUMONT) H. A euggested method for measuringteaching introductory coursee. J, eduo. Ps

612,

New York: Macmillan)

the effectiveness ofyshol,) 1938, 29, 607-

27, BELL, 8. A study of the teacher's influence. Pedag. Semin,) 19CO3 7,492.525.

28. BELL, VIOLA M. Traits of teachers. Eduo, Rea, Bull., 1932, 11, 281 -286,

126

29. BENDIG) A,W, Inter-judge vs intra-judge reliability in the order-of-

merit method, Amer, J. Psyohol., 1952, 650 84-89,

30. BENDIG, A.W, A preliminary study of the effect of academic level) sex)and course variables on student rating of psychology instructors.J. PsychoL) 1952, 34) 21-26,

31. BENT, R,K, Relationships betweer qualify!ng exeminationa) variousother faotors) and etudent teaching performance at the University

of Minnesota. J12222_Eduo., 1937, 5, 251-255.

32, BETTS, G,L, The education of teachers evaluated through measurementof teaching ability. In National surve of the education of teach-

ers. U,S, Off, Educ, Bull., 1933, No. 10 5 . Pp. 7-153.

33. BETTS) G,L, Pupil achievement and the NS trait in teachers, In Helen

M. Walker (Ed.), The measurement of teaching effillimay, New York:

Macmillan, 1935. Pp. 144-237,

34, BIMSON, 0,11. Do "good" teachers produce "good" results? North Cen-

tral Ass. Quart,) 1937, 12, 271 -276.

35. BIRD, GRACE E. Pupils' estimates of teachers, J. eduo. Psychol.,

1917, 8, 35.40.

36. BIRKELO) C.P. What characteristics in teachers impress themselvesmost upon elementary end high school students? Educ. Adm. Suplay2.)

1929, 15, 453-456,

37. BLAIR, G,M, Personality adjustments of teachers as measured by the

multiple choice Rorschach test. J. eduo, Res.) 19460 39, 652-657,

38. BLISS, W.B. How much mental ability does a teacher need? J. eduo.

Res" 1922, 6, 33.410

39, BOARDMAN) C,W, Professional testa as measures of teaching efficiencyin high school, Teach, Coll, Contr, Eduo., 1928, No. 327.

40, BOARDMAN) C.W. An analysis of pupil ratings of high school teachers.

Eduo. Adrl rviSupea,) 1930, 16, 440 -

41, BOLTON, F,B, Evaluating teaching effectivenessscores on achievement tests, J, eduo. Res,)

42, BOND, J.A. Strengths and weaknesses of studentRes,) 1951, 45) 11-22,

43, BOOK, W,F, The high eohool teacher from the pupil's point of view.

L12$4_220E., 1905, 12, 239.288,

thro'igh the use of

19450 38, 691-696.

teachers, J. eduo,

127

44. BOSSING) Na. Teacher-aptitilde tests and teacher selection. U.S.Off. Eduo. Bull.) 1931, No, 12, 117-133.

45, BOWLED) W.A. Students' ratings of qualities considered deairabl3

in college professors. Soh. & Soc.) 1940) 51, 253.256,

46. BOWDON) A.O. Change - -the test of teaching. Soh , & Soc., 1934) 40)

133.136,

0. BOWMAN) E,C. Pupil ratings of student teachers. Univer. Ind. Eduo.

Soh, Bull., 1934) 11, 28-37, Also Eduo. Adm, Supervis.) 19547%:14i=1767-

48. BOYCE) A.C. Qualities of merit in secondary school teachers. J. eduo.

Psychol.) 1912, 3, 144-157,

49, BCWE, A.C. Method for measuring teachers' efficiency. Nat, Soo.Stud. Eduo. Years.) 1915, 14 (Part II)) 1-83.

50. BRADLEY) GLADYCE H, What do college students like and dislike aboutcollege teachers and their teaching? Eduo. Adm. Supervis., 1950,

36, 113-120,

51. BRANDT) W.J. A follow-up of some earlier Wisconsin atudiec of teach-ing ability. J. Eulllus.) 1949) 18, 1-29.

52, RRECRINRIEGE) ELIZABETH, A study of the relation of preparatory schoolrecords and intelligence test scores to teaching success. Eduo.

Adm, Supervie.) 1931, 17, 649-660,

53. BREED, P.S. Factors contributing to success in college training, J.

eduo. Res.) 1927, 16, 247-253,

54. BROOKOVER, W.B. Person-person interaction between teachers and pupilsand teaching effectiveness. J. eduo, Res.) 1940) 34) 272.287.

55, BROOMER) W,B. The relation of social factors to teaching ability,12122alu2.) 1945) 13, 191-205,

56. BROOM, M.E. The predictive value of three specified factors for suc-cess in practice-teaching, Eduo, Adm, SuperVis,) 1929, 15, 25-29.

57. BROOM, M.E. A note on predicting teaching success, Educe Adm.Supervis.) 1932, 18, 64-67,

58. BROOM, MX.) and AULT, J.W. How may we measure teaching success?Eduo, Adm. Sunery12.. 1932, 18, 250-256.

59. BROWNELL, W.A. A critique of research on learning and on instructionin the school, lat. 12p, Stud. Edw.) 1951, 50 (1), 52-66,

128

60, BRUBAOHEE, A. 4. Why teaohers' fail.

593 -600,

New York State Eduo,) 1931, 18,

61. BRYAN, 11,C, Pupil rating of seoondary saho6i te*chero, Teach. Coll.

22BILLIARR.) 1937, No: 708.

62, BRYAN, R.C. Pupil ratings of secondary-schnol teachers. So, h, Rev.)

1938, 46, 357 -36 ?.

63, BRYAN, R.C. Eighty-six teachers try evaluating student reactions to

themaelvea. Eduo! Adm, §u22012,) 1941) 27, 513-526,

64. BRYAN, R.C. Reliability) validity) and needfulness of written student

reactions to teachers. Eduo. Adm. SaInis.) 1941) 27, 655-665.

65, BRYAN, B.C. Why student reactions to teachers should be evaluated.Eduo, Adm. Super's,, 1941) 27, 590-603.

66. BRYAN, R.C. Benefits reported by teachers who obtained written stu-dent reactions. Eduo. Adm, SuEtrvis,) 1942) 28). 69 -75.

67, BUCKINGHAM) B.F. Opinion and practice as to the rating of teachers.Edw. Res, Bull,, 1922, 1, 171-174.

68. BUELLESFIELD) H. Causes of failures among teachers. Eduo. Adm.

Supervis.) 1915, 1, 459-1,52,

69, BUTLER, F.A. Prediction of success in practice teaching. Eduo. Adm,

Elam12., 1935, 21, 448.456.

70, BUTSCH, R,L.C. The two - factor theory applied to measurement of teachertraits. Edw. Adm. Supervis,) 1932, 18, 257-276,

71, CAMS) R,, et al. Studies on the effectiveness of teaching. In Dif-

ferential characteristics of the mare effective and less effectiveteaoheralLEgEummatotnipultalleR. Washington: Divisionof Fie34 Studies and Training, Extension Service) U. S. Departmentof Agriculture) February 1953,

72, CATTELL) R,B, The assessment of teaching ability. Brit. J. educ.

Psycho1,) 1931, 1) 48.72,

73. CHAMPLIN) C.D. The preferred college professor. Soh. & Soo.) 1928,

27, 175-177.

74, CHARTERS, 1#.W., and WAPLES) 14 The commonwealth teacher - training study.

Chicago: Univer, Chicago Press) 1929.

129

15. CHEYDLEUR) F,D, Judging teachers of basic) French couraes by objectivemeans at the University of Wieconsin-1919-1943. J. educ. Rea.,

1945) 39) 161-192,

76. UM) F.L. Soholarehip in relation to teaching efficiency. Soh. Rev,Mona,' 1915, 6, 64-70.

77, DIEM, 0,M, What do my students think about my teaching? Soh. & Sot.,

1930, 31, 96-100.

78. CLINTON) R.J, Qtvalitits college students desire in college instructors.Soh, & Soo.) 1930, 32, 702,

79. COLLINS) LA, Relation of intelligenoc to auccese in teaching. Soh.

& Communitz) 1930, 16, 155-157.

80, COOK) W.V.; and LEEDS, 0.11, Measuring the teaching personality. Edao.psyohol, Measmt, 1947) 7, 399.410.

81. COCK, Dal, How do teachers rate themselves? Eduo, Adm. Supervis.)1937) 23, 473.476.

82. COOPER) HAZEL 3. Correlation of the grades in practice teaching re-ceived by seniors in a college for teachers) with their scores inthe Thurston Group intelligence Tcpt, Pedag, Semin.) 1924) 31,176-182.

83, COOPER) J.0.) and LEWIS) R.B. RuantlAttive Rorschach factors in theevaluation of teacher effectivenes. J, eduo, Res.) 1951, 44, 703-707.

84, COPPER) F.LeR, Who is a good teacher! Education) 1928, 49) 111-117.

85. CCURTIS) S.A. Standards of teschint ability. Edltjar.) 1921, 62,183 -186.

86. CCHRTIS, S.A. The measurement of iN effect of teaching, Soh, & Sot.)1928, 28, 52-56; 84.88.

87. COX, W,W,, and CORNELL) ETHEL L, The prclpoels of teaching ability ofstudents in Nev York state norm/ scho,)Is. Unlver, State of NevYork Bull.) 1933, No, 1033.

88. COY) GENEVIEVE L. A study of various factors vhioh influence the useof the accompliehment quotient as a mcaeure of teaching efficiency.J educ. Res.) 1950) 21, 29.42,

89. CRABBS, LELAH H. *raving efficiency in supervision and teaching,Tetch, Coll. Contr. Mc., 1925, no. 17y.

230

90. DALDY) DOROTHY M. A stuij of adaptability in a group of teachers.Brit. 42...sksIlL/m2r1., 1937, 7, 1.22,

91, DANIEL, J.McT, rxcellttnt teaohers.eirldualifications.Report of the inveatigation of educational qualifications of teach-ers in South Carolina. Columbia: Univer. of South Carolina) 19Vq.

92, DAVENPORT) K, An investigation into pupil rating of oert%in teachingpractices, Puxdue Univer. Stud. higher Edw., 1944, No, 49.

93, DAVIES, J,E. What are the traits of the good teacher from the stand-

point of junior high school pupils? Soh. & Soo,' 1933, 38, 649.652.

94, DAVIS) 0.0. The high school as judged by its students, Proo. North

Central Ass, Coll. Sec. Sc:'., 1924, 29, 71.144.

95. DAVIS) C.O. Our beat teache Soh. Rev.) 1926, 34) 754.759. Also

Sch. & Soc., 1926, 21.,

96. DAVIS, H.MoV, The use of state high school examinations as an instru-ment for judging the work of teachers. Teach, Coll, Contr. Edw.)

1934, No, 611.

97, DAVIS) 3.B,, and FRENCH) L.C. Teacher rating. Pittsburgh Univer. Soh.

Educ. J,, 1928, 3, 57; 60.64,

98, DAY, L.C. The teaching quotient, Elem. Sch. J., 1933, 33, 604.607.

99. DEAN) C,D. Current trends in rating student - teachers, Edo° Adm.

Supervis., 1939, 25, 687.694

100, DODD, M.R. A study of teaching aptitude. J. eau°, Rees) 1933, 26,

517-521.

101. DODGE, A.F, What are the personality traits of the successful teacher?

14.122112EL01014, 1943, 27, 325'337.

102. DODGE, A.F. A study of the persomlity traits of successful teachers.

PAguatipne, 1948, 27, 107-112,

103. DOLOH) E.W., Jr. Pupils' judgments of their teachers, plAteallElq.)

1920, 27, 195.199.

104. 'AVIS) S.J. Report of en explcratory!leisfaqsl!LpAagtllnsl.Onmbridge, Maas,: Peabooly House, 195r).

105, DRUCKER) A,J,, and RENNERS) MI, Do alumni end students differ in

their attitudes toward instructors? J. e3uc, Paychol., 1951, 421

129.143,

131

106. ELLIOTT) E.C. Outline of a tentative eoheme for the measurement ofteaching efeara57AHNOWTWW. tam' a t,NTIFIROCER677§10

107, FLIOTT) E.C. How shall the merit of teachers be tested and recorded?rduo. A. Supervis.) 1915, 1, 291-299.

108, EKO8LHART) M.D., and TUCKER, L.R. Traits related to good and poor'teaching. Sch. Rev., 1956) 44, 28.33.

109, ESPENSCHADE) A, Seleotion of women major students in physical educa-tion. Res. Quart, Amer. Ass Hlth.) 1948) 19, 70-76.

110, FERGUSON) H., and HOVDE) H.O. Improving teaching personality by pupilrating. Soh. Rev.) 1942) 50, 459.445,

111, FICHAHDLER) A. A study in self - appraisal. Soh, & Soc., 1916, 4)1000-1002,

112. FLANAGAN, J.O. A preliminary study of the validity of the 1940 edi-tion of the National Teacher Examinations, Soh. & Soo.) 1941) 54)59.64.

113. FLANAGAN, J.C. Critical requirements; A new approach to employeeevaluation. Personnel itylhol,) 1949) 2, 419-426.

114, FLANAGAN) J.C. The critical requirements approach to educational ob-jectives, Soh, & Soc., 1950, 71, 321.324.

115. FLESHEB) V.P. Inferential study rating of instructors. MO. Ree,Bull.) 1952, 31 57-62.

116, FLINN, VEE. Teacher rating by pupils. Eduo, Method) 1932, 11,'290-294.

117. FLORY) C.D. Personality rating of prcsiective teachers, Educe Adm.Supervis., 1930) 16, 155.143,

118, FCRLICE) O. Note on the correlations between general teaching perand acme specific teaching qualities. Nat, Soo, Stud. Eduo Yearb.)1919, 18 (Pert I)) 549.351,

119. FRASSR, O.M. Intelligence as a faotor in determining student teach.

ing success. E129.1-15Agia22VLI1201 1929, 15, 62)629.

120. FRAZIER, O.N. Teaching as rated by pupils, school officers and otherteachers, Hawaii educ. Rev., 1927, 15, 170.171; 176.

121, FREDERICK) R,W., aria HOLLISTER, P.C. The relationship between theacademic success of pupils and 'he practice teaching grade receivedby their teachers. Edw. Adm. Igtnlit.0 1934, 20, 468A71.

132

122. FRED, N grephic scale. J. educ, Psychol., L923, 14,

123. IRITZ. M,}, The variablIC4 of judgment in thl rating of teachersby students. Ecluo, A6A,Superyi,J., 1926, Ye, 630-634,

124. FRY) J.0, All superior officers, Infantry J.) 1948) 63, 21.26.

125, FULLER) ELT2,ABETH M, The use of measures of ability and general ad-justment in the preservice selection of nursery school-kindergar.ten.Wmury teachers, J. educ, PAythol,) 1946, 37, 321.334,

126, GALT) J.r., and GRIFRI D,J. Evaluation and selection of flying in-btructors. In N. E. Miner (Ed.), Psychological research aullottraining. AAF Av:.4tion Psychology Program Research Report No,-87-

IM, PP. 209'3510

127. =BUS) J.S. Determining instructional effioienoy, ?ch. pev., 1931,

39, 64.66.

128. WESE, W,J., and STEVENS, s.N, The application blank can be made

predictilo of teaching success. Amer, Sch. Ed J., 1939, 98 (4),

23.25.

129. GOODENOUGH) YLCCENCE L. Mental testing) its histouiprinciplea, andupliesticna. tisv vorkt Rinehart, 1949,

130. 0CMI:JUG3) FWBENCE Li, FULLER) ELIZABETH M,) ALA O1S(11, EDNA. The

use or the Goodenough Speed-of-Association Test it the prew-Viceselection of narsery school-kindergarten-primary teachers. J. ethic.

Pellhcl,, 1946) 3 ?, 335-346,

131. ONDLARTI, A,S. Student attitudes and opinions relating to teaching

t, Brooklyn College, Sch, & Soc., 1948, 68, 345.349

132. GOMM, R,E, Personality and teaching efficiency. J, exp. Educ,,

1945) 14, 157-165,

133 GOUD, O. The predictive value of certain selective measures. Lduo.

Admi_Suplvis,) 1941) 33, 208.212

134. GRAVES, NELM N. What traits do high school pupils admire in teach-

ers? Schmid.. 1925, 8) 103-105,

135. GAME, H.W. A comparison of student ratings, administrative ratings,ratings by colleague's, and relative salaries as criteria of teach-

ing emelience, ita Ye, State Coll, Bull.) 1933, No. 5.

136. 01:F.:3R, C., and NEWBUPN, H.', Temperament in prospective teachers,

J. clue. ,(f e., 1942, 35,

133

137. GRIM) P.R., and HOYT, C;J. Appraisal of teaching competeno;., Eduo.

Rea, Bull., 1952) 31, 85-91.

138, GUILFORD) J.P. Peolyametrice, New York: McGraw -Hill, 1936.

139. (Omitted)

140. GUTHRIE) E.R, Measuring student opinion of teachers. Soh. & Soc.,

1927, 25, 175-176.

141. HAGGARD, W,W, Some freshmen describe the desirable college teaoher.Soh, & Soc., 1943) 381 238.240,

142, HAMPTON, NELLIE DELIGHT, An analysie of supervisory ratings of ele-mentary teachers graduated from Iowa State Teaohere College. J.

exp. Educ., 1951, 20, 17)-215,

143, EAMBIN) S.A. A comparative study of ratAngp of teachers -in- trainingand teachere-in-barvice. Ell!, Soh, .1.1 1927, 28, 39.44,

144. HARDESTY' C,D. Can teaching success be rated? Nation's Soh,' 1935,15 (1)) 27.28,

145, HART, F.W, Teachers and teaching by ten thousand lOgh9chool seniors.New York:Rgaillan) 1956.

146. HARTMANN) O.V. Measuring teaching effioienoy among college inetruc-tors, Arch, Peychol., 1933, 23, No. 154.

147. HATCHER, MATTDE% Qualities of personality compared with success inpractice- teaching. Peabody J. Eduo.' 1934) 11, 246-253.

148. HEIMAN) J.1),) and ARMENTROUT, W,D, The rating of college teacherson ten traits by their otudente. J. th2L1220221., 1936, 27,197-216.

149, BELURITZSCH) A.O. A factor analysie of teacher abilities, J, eu.tauq,) 1945) 14) 166.199.

150, HENP/KSON, E.H. Comparisons of ratings of voice and teaehlug ability,J, educ, Pe42121., 1943) 34) 121-123,

151, AgNRIMSON, E.H. 3oce relations between peroonality) speech character -tetice and teaching effectiveness of oollegn teachers, SpeechMonogr,' 1949) 16, 221.226,

152. EDDA' F.J. Some aspects of the reiative instructional efficIenoy ofmen end vcoen teachers, J, educ, Rees, 1935, 29, 196-203.

134

153. HIGCNS, SISTER, Reducing the variability of supervisors! judgments:An experimeatal study. John Hopkins Univer, Stud/ Eduo,, 1936,

No, 23,

154, HIGHLAND, RoWoo and BERKSHIRE, J.R. A methodological study of forced-ohcAcla2rformance rating, San Antonio, Tex,: Human ResourcesResearoh Center, Laokland Air Force Base, May 1951. (Research Bul-letin 51-9.)

155, HILL, C,W, The efficiency rating, of teachera. Elem. Soh. J., 1921,21, 438-443,

156, HILL, L.B. Teaching qualities in forcer graduates as guides in im-proving atudent-teaching. Eduo, Adm, Supervis.) 1929, 15, 362 -366.

157, HINES, 11,C, Merit syatems in the larger cities, Amer Soh, Bd J.,1924, 68 (5), 52; 111-112; 115.

158, HOPPOCK, R, No Y. U. students grade their professore. 22hilija2g.,1947, 66, 70-72.

159, HUCKLEBERRY, A,W. The relationship between change in speeoh profi-ciency and obange in student teaching proficieroy, appech MonosE.,

1950, 17, 378-389.

160, EUDELSON, E, The validity of student rating of instructors, Soh. 4PAS., 1951, 73, 265.266,

161. HUM, ESTHER, Stue,y of achievement in educational psychology. J.

exallup,) 1945, 13, 174-190.

162, IRWIN, CLAIRE C., and J.R. An appraisal of certain teachertraits by representative high sc.hool students. Oath, Edw. Rev.,1949, 47, 667.675,

163, JACOBS, C.L. The relation of the teacher's education to her effec-tiveness, Teach, Coll, Coaltr Ed1-.2., 1928, No. 774

164. JAMES, !LW, Causes of teacher - failure in AlAtama. PeabDdy J, Educe,

1930, 7, 269.271,

165, JARECXE, WA, Evaluating teaching success through the use of teach-ing judgment teat. J, educ. Res., 1952, 43, 683.694

166. JATNE, C.D1 A study of the relationship between teaching proceduresand educational outcomes. J, expi_ Edw., 1945, 14, 101434.

167. JENSEN, Me Determining critical requxrementa for teachers. J.

tULELVe, 1951, 20, 79 -85,

135

168. JERSILD, i..T. Characteristics of teachers who are "liked best" and

"disliked mast." Il2211EAu0 1940, 9, 139-151

169, JOHNS, W,B., and WORCESTER) D.A. The value of the photograph in the

selection of teachers. J. 42111EYS121., 1930) 14) 54.62.

170, JOHNSTON) J.H. Teachel rating in large cities. Soh. Rev.) 1916, 24)

641.647,

171, JONES, E,S, The prediction of teaching success for the college stu-dent. Soh. & Soo,' 1923, 18, 685-690.

172, JONES, R.DeV. The prediction of teaching effioienoy from objective

ncasures. Jo exp. Eduo., 1946) 15, 85-99.

173, JORDAN) F. A study of personal and social tratte in relation to high-school teaching, Jo eduo, Soolol.) 1929, 3, 2743,

174, KELLY) 11,1,.. and ANDERSON) RUTH E. Greet teachers and eame methods

of producing them. J. eduo. Rep.) 1929, 20, 2-30.

175, KEPEART) A.P. What kind of a teacher? Amer. Soh, Bd J., 1922, 64(3), 47.48; 84; 87.

176, KING, teR,A, The present status of teacher rating. Amer. Bob. J.,

1925, 70 (2), 44.46; 154; 157.

177. KLOPP, W,J, Eveluation of teacher traits by vacation-school pupils.Soh. Rev., 1929, 37, 457.459.

178, ;mar, P.B. Qualities related to success in teaohLng, Teach. Coll.

Contr. Edlo., 1922, No. 120.

179. MiCHT, P.B. and FRANZRN, R.B. Pitfalls in rating schemes, il_1(129.,

Peychol., 1922, 13, 204-213.

180. KWVESEN) and 111221IENS, STEW. An mallets of fifty-seven de-vice. for rating teaching. Ell:11111.12,, 1931, 9, 15.2h.

181, DM, Helga Characteristics of the best tesohlrchildren. tomalca., 1896, 3, 43.1418,

182, DINER, Stift Pre-training raotors prediotive of

Penn, State l, Stud, Edw., 1931, No. 1,111101.1011110...Col 4111.

183. DINO, ILL. Preliminary report on a five-year study of teachers

college admissions, Edue. r.4s, Supervie., 1933, 19, 691.693,

as recognised by

teacher Goose.

136

184, DINER) H,L, Second report on a five-year study of teachers college

admissions, Educ. Adm, Supervis.) 1935, 21, 56.60.

185. MINER) H.L. Five-year study of teacher college admissions. Educ.

A641, Supervise) 1937, 23, 192 -199.

186. KROUS) 0,T, A study of traits and qualities of teachers and theireffeotiveneas in teaohing) based upon the estimates of their stu-

dents, In Stanford Univer,, Abatraota of diasertations . , 19j4,-

, Fp. 182-185,

187, KYTE) 0.0, The problem teacher in the grades--a composite picture,

Nation's Soh,) 1932, 9 (5), 55 -60.

188. 'ADM) 0.1,, The measurement of teaching ability, Study number three.

J. exp, Eduo.) 1945) 14)75-100.

189, LAMKE, Tdt. Personality and teaching success, J, exp, Eduq.) 1951,

20, 217-259,

190. LAMSON) EDNA E. Some college students describe t.io desirable collegeteacher. Soh, & Soo,' 1942) 56, 615.

191, LANCELOT, Standards of measuremen: of teaching ability, 8A, &Soo., 193), 34) 236-238,

192. LANCELOT) W,H, A study of teaching effioiency as 'notated by certainpermanent outcomes, In Helen M, Walker (Ed.)) The measurement oftelacInfLealoitpoz, New Yorks Macmillan, 19,3767-370-7-----

193, LAM, A,R, A study of amount of treining and experience as objectivefactors entering into a basis for determining teachers' salaries.Unpublished dootorle diaertation) Stanford Univer" 1924,

194, LARSON) A,H,) end MARZOLF) 8,3, Attitude of teachers college students

toward teaching. Educ, Ada, Supervia,, 1945, 29, 454-438.

195, LOVE) D,W, Emotional differences between superior and inferior teach.

ere. Nato elem Principal) 1936, 15, 395.401,

196, IAVION) J,A, A study of factors useful in choosing candidates for theteaching profession, lir_ittLVedLLLidPs'Chol., 1939, 9, 131.145,

197, LAMM S,R, The Bernreuter Personality Inventory in the selectionof teachers. Educ, Ada, Supervise, 1934, 20, 59..65,

198. LEEDS) C.E. A scale for measuring teacher-pupil attitudes and teacher-pupil rapport. flgsMajITN., 1950, 64) No. 6 (Whole No. 312),

13?

199. LEIPOLD, L.E. Teacher traits that pupils like or dislike, Clearing

House) 1949) 24) 164.166,

200, LIGHT, U.L. High-soLool pupils rate teachers. Soh, Eev) 1930, 38)28.32,

201. LINDGREN) H.C. The inoomplete sentences tests as a means of courseevaluation, Edultmylholl Meaemt) 1952, 12, 217.225,

202, LINNUIST) E.F. Edusetiorle1 measurement, Washingtoni American

0ounoll on Educat!ion) 931.

203, LINE, L.J. The prediction of teaching effioienoy.1946) 15) 2-60.

2014, L/PPITT) B. An exporimental study of the effeot of demooratio andauthoritarian group atmosphere, In Studies in topological and

veotor psyoUlstz. I, Univer, Iowa Studiee in bhild Welfare) 1940)

16. Pp, 45-65.

205, LITTLER, S. Oev.Ae of failure among elementary echool teachers. Soh.

Home Educ.) 1914) 33, 255-256,

206, WIEMAN, W.W. What college freshmen think of their high school teach-

ers. Sell, Executives Magatine) 1931) 50) 527-528,

207, LYNCH, J.M. Teacher rating trends psychologically examined, Amer.

Soh, Bd J., 1942, 104 (6), 27.28.

208. McAVEE) L.0, Tne reliability of the evidences of teaching efficiency

secured in txtension visitation. Elea, Soh. 47#) 1930) 30, 746.754,

209, MCOARTHA) 0,W, The practice of teacher evaluation in the Southeastin 1948. , eduo, Pee.) 1950, 44) 122.128.

210, McOOARD) W.P. Speech factors as related to teaching effloienoy,Mesh Mpnogr,) 1944, 11, 53.64,

211. MacDONALD, MARION E. Students' opinione as regards desirable and un-desirable quslifications and prdotices of their teachers in teacher-

training institutions, Educ, Ada Superrie,, 1931, 17, 139-146.

212, MoLAUORIAN) JO, A case study of teachers Judged auccessful and 'len-

t:successful, In Stanford Univers) Abstracts of dissertations 1

1910-31. Pp, 206.210,

215. MADSEN, I.N. The prediction of teaching success, ugg,tilltAgeL/211.,1927, 13, 39-47,

138

214, MAJOR) C.L. The percentile ranking on the Ohio State University psy-chological test en a factor in forecasting the success of teachersin training, 9s14 & Semi) 1938) 47) 582.584,

215, MALAN, C,T. What are the most desirable character traits of tem:t-ern? Education, 1931, 52, 220 -226,

P16. MANAHAN) J.L.) and JARMAN) A.M. A comparison of superior and inferiorteachers, Amer. Soh, Bd J,, 1935, 90 (4), 23.24,

217, MARKT) ANNA 11,) and GILLILAND, A.R. A critical analysis of theGeorge Washington University Teaching Aptitude Tent, Educ, Adm.SuTervis,) 1929, 15, 660-666,

218, MARTIN, LYCIA 0, The rediction of success for students in teachereducation, New York: Teachars Coll.) Columbia niver.) 1944,

219, MARTINDALE) F,E, Situational factors in teacher placement and ouo-ctss, gleall_Eduo., 1951, 20, 121-177.

220, MAT1EWS, L,H. An item analysis of measures of teaching ability. J.

educ, Rea.) 1940, 33, 576-560.

221, MEAD) A,R, qualities of merit in good and poor teachers. J. eduo.

Res., 1929, 20, 239-259

222, HEAD) A,11,) and HOLLEY) C,E, Forecasting success in practice teach-ing, J, edua, Pnyclhol.) 1916, 7, 495-497.

223. MECHAM, G.P. Si study of emotional instability of teachers and theirpupils. George Peabody Coll, Conti., Educ,) 1941) Ho, 312,

224, WNW, T,K,N., end SUM) N,M, Intelligence and teaching ability,Indian J, Payehol., 1946, 21, 33-58.

225, MERIAM) J.L. Normal school education and efficiency in teaching.Teseh, Coll Contr. Eduo., 1905, No. 1.

226, MESSIER ) W,A, Are you the "best teacher," Grade Teach., 1932, 49,800-801; 829.

227. WCRAELIS, end TYLER, Fa, MNPI and student teaching. J, spill.

Psyahol.) 1951) 35, 122.124,

228. MOODY, P.E. The correlation of the professional training with theteaching suceless of normal-school graduates, Soh, Rev.) 1918, 26,180-198,

139

229, MORGAN, A.L, Present status of teacher rating in the United States.

Tex, Outlook, 1944, 28 (2), 21-22,

230, MORRIS, ELIZABETH H. Personal traits end suocees in teaching, Teach.

Coll. Contr. Edud., 1929, No. 342,

231, MORRISON, R,H. Factors ceusing failure in teaching. J., eduo, Res.,

1927, 16, 98-105,

232. MORSH, J,E,, and SWANSON, R,A, The relation of technical knowledge

and other_vaiableatoirtutee. SanAntonio, Tex.t Human Resources Research Center, Laokland Air Force

Base,'April 1952, (Research Note Tech 52-10

233, MORTON, R, L, Qualities of merit in secondary teachers, Edmo, Adm.

Supervia., 1919, 5, 225-238.

234, MOSES, CLEAR V. Why high-school teachers fail, Sch. Home Eduo,, 1914,

33, 166.169.

235, MOSS, F.A,, LOMAN, W,, and HUNT, THELMA, Impersonal measurement of

teaching. Educ, Rea,' 1929, 10, 40.50,

236, MOTT, S.B. Teacher failures in the public schools. Adric. Moot1950, P2, 208; 210; 213.

237, NANNINOA, S.P. Teacher failures in high school, Bch, & Scc,, 1924,

19, 79-82.

238, NANNINCA, S.P. Intimates of teachers in service Lade by graduate stu-dents as compared with estimates bade by principal end assistantprincipal, Soh, 'liev 1928, 56, 622-626,

259, National Education Association. Teachers' salaries and salary trends

in 1923. Nat, Educ, Rea, Bull*, 1923, X, 1-115.

240, National Education Assooiation, Practices affecting teacher personnel.

Nat, Educ, Res, Bull., 1928, 6, 205-255.

241, National Education Association, Administrative practices effecting

classroom teachers. Part II, The retention, promotion, and im-provement of teachers. Nst,Eduo. Res, Bull., 1932, 101 33.76.

242. National Education Association. Teacher personnel procedureal Esploy-

nent conditions in service. Nat, Eduo. Res. Bull., 1942, 20, 81-

116,

243, NEEL, MARY 0., and MEAD, A.R. Correlations between certain group res-

tore in preparation of secondary school teachers, Eduo,

Supervis., 1931, 17, 675-676.

140

244. NEMEC, Loip G. Relationship between teaoher certiricatioh and educa-tion:ip Wisconsin: A stua:,* of their effects on beginning teachers.

1946, 15) 101-132,

245, NEWMARK, D, 'Students' opinions of their best and poorestteachera.Elem, Sch. J., 1929, 29, 576-585

246, NUTTING; E.P. Does quantitative training produce better teaching?Soh. Executive, 1943) 62 (6), 21.

247, .00ENWEUER, A.L. Predicting the quality of teaching, the predictivevalue of certain'traits for 'affectivenesa in teaching. Teacn. Coil

Contr, Edqc., 1936, So, 676.

248. OLSON, W,C,) and WILKINSON) MURIEL M. Teacher pereonali% as revealedby the enount and kind of verbal direction used in behavior control.Educ. Adm. Stm.rvis.) 1938, 24, 81.0.

249, ORLEANS) J.S., CLARKE, Ti,, OSTREXCHER, L,, end STANDIEE, L. Some pre-

liminary thoughts on the criteria of teacher effectiveness. J.

educ. Rea.) 1952, 45) 641-648.

250, %%BURN, W.J, personal characteristics of the teacher, EdlAc. Adm.

Superiis 1;20) , 74-85.

251. PAYNE) E.G. Scholarship and success in teaching.1918, 9, 217-219.

252, PETERS) C.C., and VAN %looms) W.R, Statistical procedures an4 their

mathematical bases. New York-: M-Grave -Hill, 1940.

253, PETERS) D.W. The status of the married wman teacher. Teach. ;MA,

Contr. Educ., 1954, No. 603.

254. PETEa800,.H.A., °BOURN) 0,, WALLACE) HAZEL, end SMITH) O.W. Relationof scholarship during college career to success in teaching judged

by salary. Educ. Ads, Supervis.; 1934) 20, 625428.

255, PETERSON; ODA X,, and 000k, W.A. Score Garde and rating sheets in

teacher training. Ed_ uc. Method, 1930, 9, 522-330.

256. PHILLIPS) W.S. An analysis of certain characteristics of active andpiospeotive teachirs. 0*arge Peabody Coll, Contra,, 1935, No161.

257. POHTiR) JW,A, Pupil evaluation of practice teaching. J1 educ. Rea.,

1942,. 35, 700-704.

258. tOSE7)"0.Wd A new answer to an old problva--shall we rate teachers'

AAr;;Sel_i,Ed.ts, 1944, 108 (5), 314-35.

141

259. PRESTON, H2O. The development of a procedure for evaluating officersin the United States Air Force. Pittsburgh: American Institute

for Research, 19

260. PYLE, W.H. Intelligence and teaching, an experimental study. Educ.

Adm. Supervis., 1927, 13, 433-448.

261, PYIE, W,H, The relation between intelligence and teaching success:

A supplementary study. Educ. Adm. Supervio., 1928, 14, 257-267.

262. REAVIS, .W.C, and COOPER, D.H. Evaluation of teacher merit in city

school systems. Suppl. eduo. Monogr., 1945, No. 59.

263. REED, ANNA Y. The effective and the ineffective college teacher.New York: American Book, 1935.

264, REMMERS, H.H. The college professor as the student tees him. Purdue

Univer. Stud. higher Educ,, 1929, No. 11.

265. REMMERS, H.H. To what extent do grades influence student ratings ofinstructors? J. educ. Res., 1930, 21, 314-317.

266. :HEMMERS, H.H. Reliability and halo effect of high school and collegestudents' judgments of their teachers. J. appl. Psychol., 1934, 18,619-630.

267, REMMERS, H.R., and BRANDENBURG, G.C. Experimental data on the Purdue

Rating Scale for Instructors. Educ. Adm. Supervis., 1927, 13, 519-

527.

268, REMMERS, Mi., DAVENPORT, K.S., and POTTER, A.A. The best and the

worst teachers of engineering. Purdue Univer. Stud, higher Educ.,

1946, No, 57.

269. REMMERS, H.H., MARTIN, F.D., and ELLIOTT, D.N. Are studentst ratingsof instructors related to their grades? In H.H. Re'imers (Ed.),Student achievement and instructor evaluation in chemistry. Purdue

UniverLaAId.1:10., 1949, No. 66, Pp. 17-26.

270. REMMERS, H.H,, SHOCK, N.W., and KELLY, E.L. An empirical study of thevalidity of the Spearman -Brown formula as applied to the Purdue Rat-ing Scale. J. educ. Paychol., 1927, 18, 187-195.

271. RETAN, G,A. Emotional instability and teaching success, J. eduo. Rest,

1943, 37, 135-141.

272. RICHARDSON, M.W. Forced-choice performance reports: A modern merit-

rating method. Personnel, 1949, 26, 205-212.

142

273. RICHEY, LW.; and BERESEIRE, J.R, Job satisfaction of Air Force tech-

nical school instructors. San Antonio, Tex.: Human Resources Re-

Search Center, Lackland Air Force Base) November 1952, (Research

Note Tech 52-12.)

274. RICHEY, R.W and FOX, W.H. A study of high school students with re-gard to teachers and teaching. Ind. Univer. Sch, Edw. Bull., 1951,

27, 9-63,

275, RIESCH, K.P. A study of some factors in pupil growth. .1,e25112Educ.,

1949, 18) 31-55.

276, RILEY, J.W., Jr., RYAN, B.F., and LIFSHITZ, MARCIA. The student looks

at his teacher. New Brunswick, N.J.: Rutgers Univer. Prlss, 1950.

277. RINGNESS) T.A. Relationships between certain attitudes towards teach-ing and teaching success. J. exp. Educ., 1952, 21,

278, RITTER) E.L. Rating of teachers in Indiana. Elem. Sch. J., 1918,

18, 740-756.

279. ROBERTS, A.C., and DRAPER, E.M. The.higlh-ischool principal ae adminis-

trator, supervisor, and director.of extracurricular activities.New York: Heath, 1927,

280. ROLFE, J.F. The measurement of teaching ability. Study number two.

J. exp. Educ., 1945) 14, 52p74.

281. ROOT, A.R. Student ratings of teachers. T. higher Educ., 1931, 2,

311-315.

282. ROSTRERo.L.E. The measurement of teaching ability. Study number one.

J. exp. Educ., 1945) 14, 6-51.

283. RUEDIGER) M., and STRAYER, G,D, The qualities of merit in teachers,

J.educ. PsychOl., 1910, 1, 272-278.

284. RUGG, H. IS the rating of human-character practicable? J. educ.plyshol.) 1921, 12, 425-438; 485-501; 1922, 13, 81-93,

285. RYANS, D.G.. The results of internal cOnsistenoy and external valida-tion procedures applied in the analysis of test items measuring

professional information. Edw. Tsychol,. Measmt) 1951, 11) 549-

286. RYANS, D,G, A study of the extent ofassociation of.certain profes-sional and personal data with judged effectiveness' of teacher be-

J. exp. Educ,, 1951, 20, 67-77.

1L)

287, RYANS, D.C. A study of criterion data (a factor analysis of teacherbehaviors in the elementary school). Educ. psychol. Measmt) 1952,

12, 333-344.

288, RYANS, D,G., and WANTT) E, Investigations of personal and socialcharacteristics of teachers. J, Teach. Educ., 1952, 3, 228-231,

289, RYANS, D,G., and WANDT, E. A factor analysis of observed teacherbehaviors in the secondary school: A study of criterion data.Educ. psychol. Measmt, 1952, 12, 574-586.

290, KYLE, FLORENCE. Qualities that students admire in teachers, Calif.

Quart. secondary Educ,, 1928, 4) 82-85.

291. SAMUELSON, t.E,, The evaluation of teachers and teaching. Soh. Execu-

tive, 1941) 60 (9)) 15-16; 27.

292, SANDIFORD, B., CAMERON, M.A., CONWAY, C.B., and LONG, J.A, Forecast-

ing teaching ability. Univer. Toronto ethic. Res. Bull., 1937,

No. 8.

293. SCHAFFIF, A.E.F. The pupil checks the teacher. Soh, Executives

Magazine, 1931, 51, 151-153,

294. SOHELLHAMMER, F.M. Rating the practice teacher. Soh. Executive,1940) 60, (2), 32-33,

295, SCHMID, J., Jr. Factor analyses of prospective teachers' differences,J. exp. Educ., 1950, 18, 287-319,

296, SOHUTTE) T.H. The teacher) through the students' eyes. Amer. Sch.

Bd J., 1926, 73 (3), 72 -79,

297. SCHWARTZ, A.N. A study of the discriminating efficiency of certaintests of the primary source personality traits of teachers. J.

exp. Educ., 1950, 19, 63-93.

298, SEAGCE, MAY V. Prognostic teats and teaching success, J, edge. Ras.,

1945, 38, 685-690.

299, SEAGOE) MAY V. 'Prediction of in-,Tervice success in teaching. J.

educ, Res.) 1946) 39, 658-663.

300. SEELEY, L.C. Preliminary validation of the instructors evaluationreport. Washington: OffiA,e of Naval Research, April 1949,--Tech-nical Report, SDC 383-1-9.)

301, SENOR, H,L, Status teacher-rating in 1936. Amer. Soh, Bd J., 1936,

93 (3), 56.

144

302. SEYFERT, W.C., and TYNDAL, B.S. An evaluatf.on of differences in teach-

ing ability. J. eduo, Res.) 1934, 28, 10-15.

303. Su. :1011) J.R. 2ersonal and social tr,Ats requisite for high gradeteaching in sema/m22221. Terre Haute, Ind.: State Normal

Press, 1928.

30. SHANNON, J.R. A comparison of three means for measuring effiolenoyin teaching. J. eduo. Res.) 1936) 29, 501-508.

305, SK,NNON) J.R. A comparison of highly successful teachers, failingteachers) and average teachers at the time of their graduation fromIndiana State Teachers College. Eduo. Adm. Supervis., 1940) 26,43-51.

306. SHANNON) J.R. Elements of excellence in teaching. Eduo. Adm, Supervie.)

1941, 27, 168.176.

307. SHANNON, J,R, A measure of the validity of attention scores, J. educ.

Res., 1942, 35, 623-631,

308. SHIM) A, The rating of teachers in the New York City public schools.Soh. & Soo., 1915, 2, 752-754,

309. SHULTZ, I.T. A descriptive and vredictizejltly of a class in aschool of education. Unpublished doctor's dissertation, Univer.

of Pennsylvania) 1928.

310. SIMMONS) EDNA. Correlation of administrative ratinesLIItchers andpupil achievement. Unpublished doctor's dissertation) GeorgePeabo4 College for Teachers, 1932,

311. SIMON, D.L. Personal reasons for the dismissal of teachers in smallerschools. J. eduo. Res., 1936, 29, 585-588.

312. SIMPSON, R.B., MIER, E.L., and JONES, S. A study of resourcefulnessin attacking professional problems. Soh. Rev., 1952, 60, 535-540.

313. SISSON, E.D. Forced - choice- -the new Army rating, Personnel Psyohol.,

1948, 1, 365-381.

314. SMALZRIED, LT., and REMMERS, H,H, A factor analysis of the PurdueRating Scale for Instructors. J. eduo, Ps y21121., 1943, 34, 363-367.

315, SMELTZER) C.H., and HARTER) R.S. Comparison of anonymous and signedratings of teachers, Eduo. Outlook, 1934, 8, 76-84.

316. SMITE) A.A. What is good college teaching? 2..114Aller Eduo., 1944)

15, 216-218.

145

317. SMITH, A.A. What traits do high school pupils admire in teachers?High Soh. J., 1945) 28, 279 -286.

318. SMITH, E.L. A,critical analysis of rating sheets now in use for rat-ing student-teaohers, Eduo. Adm, Supervis.) 1936, 22, 179-189,

319, SODERQUIST) H.0. Partibipation in extracurricular activities in highschool or college and subsequent success in teaching adults. Soh.& soo,, 1935, 42, 607-609.

320, SOMERS) G.T. Pedagogical prognosle. Prediotirig the success of pro-spective teachers. Teach, Coll. Contr. Eduo.) 1923, No. 140,

321. STAMMER) j,M.) and REMMERS) H.H. Can students discriminate traitsassociated with success in teaching? J. aPpl. Psychol.) 1928, 12,602-610.

322. STARRAK) J.A. Student rating of instruction. J. higher Edw.) 1934,5, 88-90.

323, STEPHENS, J,M,) and LIIIHTENSTEIN) A,. Factors associated with successin teaching grade five arithmetic. J. eduo. Res,, 1947) 40) 683-694,

324, STEWART, MAY L. A study of success and failure in eleven ,years ofrural teacher. tkeining, Educ. Adm, Supervis.) 1940) 26, 372-378.

325, STOLUROW) L,M,) IRION, A.L.) and PASCAL) G.R. .The selection and train-ing of gunnery instructors; In N. Hobbs (Ed.), Psychological re-search, on flexible gunnery trainla, AAF Aviation Psychology Pro-gram Research Report No, 11, 1947, Pp. 337-301.

326. STUIT, D.B. Scholarship as a factor in teaching success. Sch. & Soot,1937, 46, 382.384,

327, STUIT, D.B., and EBEL, R.L. Instructor .rating at a large state uni-versity. Coll, & Univer.) 1952, 27, 247_254,

328, STUMP, N.F. A comparative study of two teaching aptithde tests. J.

educ, Psychol., 1937, 28) 595-600.

329. TAYLCR) E,K,) and WHERRY, R.J. A study, of leniency in two rating sys-tems, 'Persohnel Peychol., 1951, 4, 39-47,

330.' TAYLCR) H.R. The influence of the teacher on relative class standingin arithmetic fundamentals and reading comprehension. Nat. Soc.Stud. Eduo. Yearb. 1928, 27 (Part II), 97-110.

331, TAYLOR, H.R. Teacher influence on class achievement: A study of therelationship of estimated teaching ability to pupil aohievement inreading and arithmetic. Oenet,22/9123) 1930, 7, 81-175.

332, TBURSTONE) L.L. The method of paired comparisone for social values.J. abnorm, soc. Pichol.) 1927, 21, 384.400.

333. TIEGS, E.W, An evaluation of some techni uee of teacher selection,Bloomington) Ill.: Public School Publishing Co.) 1928.

334, TOSTIEBE, M,F, Analysis of the relative importance of the successfactors common in the training of teachers for the one-room ruralschool, J, educ, Res., 1937, 30, 397.402.

335. TRAXIER, A.E. Are students in teachers colleges greatly inferior inability? Sch, & Soo.. 1946) 63, 105-107.

336, TUDHOPE, W.B. A study k the training college final teaching mark asa criterion of future success in the teaching profession, Brit.J. educ. Ptychol,) 1942) 12, 167-171; 1943) 13, 16.23.

337. TYLER, F.T. Personality tests and teaching ability. Caned. J. Psychol"

1949, 3, 30 -37.

338. TYLER, LEONA E, The psychology of human differences, New York:

Appleton-Century-Crofts) Inc.) 1917.

339. ULLMAN' R.R. The prediction of teaching success. Educ, Adm, Supervis.,

1930, 16, 598-608.

340, ULLMAN) R,R, The prognostic value of certain factors related to teach-ing success. Ashland, Ohio: A,L. Garber) 1931,

341. U. S. Office of Indian Affairs, A rating scale for Indian serviceteachers. ?roar. Educ.) 1940) 17, 363.366,

342, UPSHALL) C,C, A ten-year study of two groups of teachers college stu-dents of contrasting ability, J. Amer, Ass, Collegiate Registrars,1942; 18, 36.44.

343. UPSHALL) 0,0, The validity of composite faculty judgment as a methoaof identifying undesirable prospeotive elementary sohool teachers.J. educ, Res,, 1942, 35, 694499.

344, VCN HADEN) H,I, An evaluation of. certain types of personal data em-ployed in the prediotia) of teaching efficiency, J. Exp. Educ.,1946) 15, 61.84.

345, WADDELL, C.W. The prognostic value of Army Alpha scores for successin practiceteaching. Educ, Adm, Supervis.) 1927, 13, 577.592.

147

546. WAGENKORST, L.H. The relation between ratings of student teachers incollege and success in first year of teaching. Eduo, Adm. Supervis.,

1930, 16, 249-255,

547, WARD, L.B., and KIRK, S,A. Studies in the selection of students for ateachers college. J. educ. Res., 1942, 35, 665-672.

548. WARD, W.D., REMNERS, 11,13,1 and SCHMAIZRIED, N.T, The training of teach-ing-personality by means of student-ratings. Soh. & Soc., 1941, 53,

189-192.

549, WELLS, F.L, A statistical study of literary merit. Arch,Psyoholn

1907, No. 7.

350. WHERRY, R.J. Comparative validit of the WD AGO Form No. 6 and the

FOL-2 according to various breakdowns.; ashington: Personnel

Reseqrch Section, Adjutant General's Office, December 1945, (Re-

port No. 671.)

351. WHITNEY) F.L. Theinalliagauxyparation and teachin skill ofstate normal sohool graduates in the United States.) Minneapolis:Univer. of Minnesota, 1922.

352. WHITNEY, F.L. The prediction of teaching success, J. Eduo. Res.Monogr., 1924, No. 6.

353. WHITNEY, F.L., and FRASIER, C.M. The relation of intelligence to stu-dent teaching success, Peabody J. Educ;., 1930, 8, 3-6.

354. WILSON, W.R. Students rating teachers. J. 1:1gber Eduo,, 1932, 3,

75 -82.

355. WITHALL, J. The development of a technique for the measurement ofsocial-emotional climate in classrooms. J. exp. Educ,, 1949, 17,347-561,

356. WITTY, P.A. An analysis of the personality traits of the effectiveteacher. J, eduo, Res" 1947, 40, 662-671.

357. WITTY, P.A. Evaluation of studies of the characteristics of the ef-fective teacher, In Improving educational Re-

rrt Amer. eduo. res. Ass.) 194E7157:78.42A.--

358. WCELLNER) R.O. Evaluation of apprentice teachers. Soh. Rev.) 1941)49, 267 -271.

359. WRIGHTSTONE, J,W, Measuring teacher corduot of sass disouasion.Elem. Sch. J.r 1954) 540 454-460.

148

360.7 WRIGHTSTONE, J.W. Constructing an

OO11. Bee,, 1935, 37, 19:

361. VRIGHTSTOVE; J.W. Rating methods.

pedia of educational research,961,964,

observational technique, Teaoh.

In W.S, Monroe 0E60) Enoyclo-New York: Macmillan) 1950. Pp.

362. YOUNG, F, Efficienoy ofhigh school teaohers as. measured by prin-

cipals' ratings. Tex. Outlook) 1937, 21.(1)),24-25,

363. YOUNG, F, Some factors Westing teaching efficiency, J. educ.

Res.) 1939, 32, 649.652,

364. ZANT) J.H, Predicting success in practice-teaching. Edits, Adm,

Supervia.) 1928, 14, 664-670.

Reviews and Bibliographies

365. ANIERSON, E.W. Techniques of research used in the field of teacher

personnel. Rev. educ. Rea.) 1934) 4,

366. BARR- A.S. Measurement of teaching ability. Rev, educ. Res,. 1940)

10, 182-184; 267:268.

367. BARR).A.S, The measurement. and prediction of teaching efficiency.

Rev, educ. Res,. 1943) 13, 218.223.

368. BARR, A.S, The measurement and prediction of teaoningRev. educ.: Res,) 19461 16) 203,208.

369 BARR) A.S, .The measurement and. prediction. or teaching efficiency:A summary of investigations. J. exp. Educ,) 1948) 16, 203-283.

370. BARR) A.S.. Measurement and prediction of teaching success. Rev,

educ, Ras.) 1949, 19, 185-190.

571. BARRi A,S, The measurement of teacher characteristics and predic-tion of teachingefficienoy. Rev, educ. Res.) 1952, 22, 169-174.

372. BARR, A.S., and DOUGLAS, LOIS, The pre-training selection of teach-es, J. educ, Red" 1934) 28, 92-117.

575. BEECHER) WE, Evaluation of teaching and concepts.

Syracuse UniVer, Press) 1949.

374. BETTS, G,L, The education of teachers evaluated through *measurementof teaching ability. In.Nittional surve of the education ofteach-ere. U, .S, Off. Edtio. Buil.) 1955, No, 10, 5 ) 7.155,

149

375, BUTSCH) R.L. Teacher rating, Rev. educ. Res.) 1931, 1, 99-107; 149-152.

376. BYERS) B,H. Speech in the prediction of teaching success. Peabody

J. Eduo., 1950, 28, 80-87.

377 COLE, LUELLA, The back ound for colle e teaching. New York: Farrar

and Rinehart, 19 O.

378. CCBEY) S.M. The present state of ignoran0e.about the factors affect-ing teabher success. Educ. Adm. Supervis,) 1932,E 18, 481-490.

379, COREY) S.M. What are_the.laotors,involved in the succesa of highschool teachers? North Central Ass, Quart., 1935) L0, 224-231.

380. DOMAS) S,J.) and TIEDMAN) D.V; Teacher competence: An annotatedbibliography. J. exp. Edw.) 1950, 19) 101218,

381. DURFLINGER) G.W. A study of recent finlings. on the prediction ofteaching success. Educ. Adm, SU ervis,) 1948).34) 321-336.

382. EVANS, KATHLEEN M, A critical survey of methods. of assessing teach-ing ability. Brit..J. educ. Psychol.).1951) 21, 89=95;

383. FRAZIER, B.W. Education of teachers: Selected bibliography, October1, 1935 to January 1),1941, U. S. Oft. Educe Bull. 1941),No, 2,

384. JEWETT) IDA A. Summary of studies onteacher zelectiOn, In M. M.Stroh) I.A. Jewett) and V,M, Butler.. Better selection of betterteachers. Washington: Delta.Kaha Gatma'Soo., 19 3v PP. 2-91.

365. LANCASTER, J.H. A.guide. to .the literatUre:on the education of teach-ers, Eduo. Adm, Supervis., 1933,,19,

386. MAHLER) W.R. Twenty years of merit' rating, 1926 - 1946.:. New York: The

Psychological Corp.) 19117.

387. MONRCE) W.S. Contrplle0.1experimentation as d means of evaluating meth-.ods of teaohing, Rev. educ.; Res., 1934; 4) 36-42,

388. TORCERSON) T.L. The measurement and prediction of teaching ability.Rev. educ, Res., 1934) 4) 261-266; 329-330.

389, TORGERSON) T.L. Measurement and prediction of teaching ability. Reveduc. Res., 1937, 7, 24-246;..319-320.

390. TRW, W.0.) andMcLOUTH) FIMENCE. An improvement: card for student-teachers. Edub. Adm. Supervis., 1929, 15, 1-10; 127 -133,

150

391. U, S. Office of Education. Bibliography of research studies in educa-tion. U. S. Off. Educ. Bull,, Nos. 22, 1928; 36, 1929; 23, 1930;

13, 1931; 16, 1932; 6, 1933; 7, 1934; 5, 1935; 5, 1936j 6, 1937; 5,1938; 5, 1939; 5, 1940; 5, 1941.

392. YAUKEY, J,V., and ANDERSON, P.L. A review of the literature on thefactors conditioning teaching success, Educ. Adm, Supervis" 1933,19, 511-520,

Manuti.2291.veduber 1953.

151

EDRS Price MF-$0.75 HC$8.05 - CiteSeerX

Documents

Transcript of EDRS Price MF-$0.75 HC$8.05 - CiteSeerX