Online and paper evaluations of courses: a literature review and case study


To cite this article: Keith Morrison (2013) Online and paper evaluations of courses: a literature review and case study, Educational Research and Evaluation: An International Journal on Theory and Practice, 19:7, 585-604

To link to this article: http://dx.doi.org/10.1080/13803611.2013.834608


Online and paper evaluations of courses: a literature review and case study

Keith Morrison

Macau University of Science and Technology, Taipa, Macau, China

This paper reviews the literature on comparing online and paper course evaluations in higher education and provides a case study of a very large randomised trial on the topic. It presents a mixed but generally optimistic picture of online course evaluations with respect to response rates, what they indicate, and how to increase them. The paper presents a case study of 1 university and finds that means for paper course evaluations tend to be higher than for online evaluations, that the standard deviations of online evaluations are typically larger than for paper evaluations, that online evaluations take longer to complete than their paper counterparts, that students prefer online evaluations, and that factor analysis shows similar and different numbers of factors for the 2 types of evaluation, even with the same instrument and the same population. Caution is advocated in assuming that the same online and paper evaluations yield similar results.

Keywords: online evaluation; course evaluation; SET; paper evaluation; quality assurance

Introduction

Since Hmieleski and Champagne (2000) reported that 98% of course evaluations were paper-based, online course evaluations have become bedded down in many universities. Increasing attention has been paid to their advantages and disadvantages, as well as to their similarities to, and differences from, paper evaluations. Around the same time, reports were published (Hastie & Palmer, 1997; Hoffman, 2003; Seal & Przasnyski, 2001; Sorenson & Johnson, 2003) to show that online course evaluations were on the increase.

Given the rapid growth in the use of online course evaluations in higher education, it is timely to review the key literature on this matter. This paper reviews relevant literature and relates it to a case study of a randomised trial using one of the largest data sets on the topic from a single institution.

Online course evaluations: advantages, disadvantages, and challenges

The advantages of online course evaluations over paper versions are several:

. Students prefer them to paper versions, particularly as they gain more experience with online evaluations.1



. They allow students more time and space to enter written comments, resulting in more online comments, of greater richness, length, thoughtfulness, usefulness, and formative potential.2

. Students can complete them at their own convenience (Dommeyer et al., 2002; Dommeyer et al., 2004; Donovan et al., 2007; Johnson, 2003) as computers are accessible any time (Crews & Curtis, 2011; McCracken & Kelly, 2011).

. They reduce the number of people involved in their administration (Lalla & Ferrari, 2011; McCracken & Kelly, 2011).

. Time savings can be made in respect of: (a) printing, distribution, and collection; (b) storage; (c) data entry, scanning, transcription, retyping (e.g., to ensure anonymity by removing handwriting) (Hmieleski & Champagne, 2000); (d) processing time (Kronholm, Wisher, Curnow, & Poker, 1999, indicate a 97 per cent cost saving in this respect, with average processing time moving from 16 hr down to a few seconds, and Kasiar et al., 2002, report staff workload decreasing from an average of 30 hr to 1 hr for downloading scores and comments); (e) class time which had been devoted to completing paper versions.3

. They reduce financial costs (e.g., time, paper).4

. They reduce transcription errors and failure of scanners to recognise or print written comments correctly (Lalla & Ferrari, 2011).

. They can guarantee privacy and anonymity, thereby increasing response rates (Crews & Curtis, 2011; Donovan et al., 2007; Ranchod & Zhou, 2001; Ravelli, 2000).

. They are faster to complete (Kasiar et al., 2002).

. They overcome attendance problems (students being absent from the class when the in-class course evaluations are completed; Perrett, 2013).

. They can reduce tutor and peer influence, that is, the tutor’s face-to-face social presence or, indeed, tutor adjustment of results (malpractice).5

. They enable faster reporting of results and increased timeliness of feedback.6

On the other hand, several disadvantages of online versions have been reported:

. They have lower response rates than paper versions.7

. Many faculty prefer paper versions, believing them to be more accurate than online versions (H. M. Anderson et al., 2005; Crews & Curtis, 2011; Dommeyer et al., 2002).

. Students report having no time to complete them outside class time, and may prefer to complete them in class time (H. M. Anderson et al., 2005; Stowell et al., 2012).

. They may provoke “flaming” and over-harsh comments about faculty (Barkhi & Williams, 2010).

. They require easy, secure, and anonymised computer access (H. M. Anderson et al., 2005; Ballantyne, 2003; Crews & Curtis, 2011; Donovan et al., 2006; Sorenson & Reiner, 2003; Stowell et al., 2012).

. Students may forget to complete the online version (Fike et al., 2010; Stowell et al., 2012).

. Non-response bias: Some students – individually or by group characteristics – may not respond (e.g., males may respond more than females to online evaluations; Bothell & Henderson, 2003).

. Students may be unable to ask for help if the online entry encounters technical problems (Donovan et al., 2007; Fike et al., 2010).


Table 1 summarizes the advantages and disadvantages of online evaluations.

Large variations are reported in response rates to online course evaluations, both between online and paper versions and within online versions (Table 2).

Ravelli (2000) notes that a poor online response rate (< 35% in his report) was not necessarily a negative matter; students simply felt that it was unnecessary to report if they were satisfied with the instructor, that is, there was no reason to write.

Avery et al. (2006) comment that it is hardly surprising that response rates to online evaluations are lower than for paper versions, as the latter are typically completed in class, whilst online versions are completed in students’ own time. This raises an important point: Most comparisons of paper versions and online versions are unfair, as they do not control for the time of completion. A true comparison of response rates would require a factorial design, to compare:

(a) paper versions completed in class;
(b) paper versions completed in students’ own time;
(c) online versions completed in class;
(d) online versions completed in students’ own time.

In terms of response rates, comparisons are made typically between (a) and (d), and this is unfair, not least as in-class evaluations are done with a captive audience. There are no reported examples of comparisons of: (a) and (b); (a) and (c); (b) and (c); (b) and (d); (c) and (d).
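To make the comparison space concrete, the following minimal sketch (illustrative only; the condition labels are taken from the list above, and the code is not part of the original study) enumerates the six pairwise comparisons implied by a 2 × 2 factorial design of mode (paper/online) by completion setting (in class/own time), of which only one pair is typically reported.

```python
from itertools import combinations

# The four cells of the 2 x 2 design: mode of evaluation x setting of completion.
conditions = {
    "a": "paper, completed in class",
    "b": "paper, completed in students' own time",
    "c": "online, completed in class",
    "d": "online, completed in students' own time",
}

# All six pairwise comparisons; the literature typically reports only (a, d).
for left, right in combinations(conditions, 2):
    flag = "  <- the comparison usually reported" if (left, right) == ("a", "d") else ""
    print(f"({left}) vs ({right}): {conditions[left]} vs {conditions[right]}{flag}")
```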

Much research compares the results of online and paper versions to determine whether the means of paper versions differ from those in online versions. The results are mixed; as Hardy (2003) remarks: “the ratings may be lower or higher or the same” (p. 33). Some research reports that paper evaluations are more positive than online evaluations (Barkhi & Williams, 2010; Donovan et al., 2006); others report the opposite (Baum et al., 2001; Carini, Hayek, Kuh, Kennedy, & Oumiet, 2003; Norris & Conn, 2005; Thorpe, 2002).

Table 1. Claimed advantages and disadvantages of online evaluations.

Claimed advantages | Claimed disadvantages
Students prefer them. | Lower response rates and large variation in response rates.
24-hr availability for students to complete online. | Response bias (some individuals or groups may not respond).
They give students more time and space to write comments. | Faculty prefer paper-based versions.
Students’ comments are richer, longer, and more thoughtful. | Students write over-harsh comments and ratings are more extreme.
Can be completed at students’ own convenience. | Students have little out-of-class time to complete them.
Reduce loss of class time on completion. | Students forget to complete them.
Guarantee privacy and anonymity. | Require secure arrangements for online anonymity and privacy.
Faster to complete. | Students are unable to seek help if the online system encounters problems.
Reduce costs (administration, people, time, money, distribution, data entry, processing, transcription errors, reporting). | Require follow-up reminders to ensure good response rates.
Overcome student non-attendance at class. | Require incentives to complete.
Reduce tutor and peer influence. | Peer influence is unchecked/uncontrolled.


Some research reports that, regardless of lower response rates, there are no statistically significant differences between the means of paper and online evaluations.8

Barkhi and Williams (2010) report that online evaluations produce more extreme evaluations than their paper counterparts, that is, the extremes of scales are used and harsh written responses are received. Gamliel and Davidovitz (2005) report that standard deviations are higher for online evaluations, though Fike et al. (2010) report similar distributions of scores.

Barkhi and Williams (2010) report that, though means for online evaluations are statistically significantly lower than for paper evaluations, when these data are controlled for course and instructor, no statistically significant difference is found between the means. Fike et al. (2010) report that online means were “marginally lower” (p. 50) (1.6% lower) than those for paper evaluations, but that the difference is “of little practical significance” (p. 50). Similarly, Guder and Malliaris (2010) found “slight drops” (p. 136) in online scores, but these could have been due to “random fluctuations” and, anyway, were slight (p. 136).

Whilst much research indicates lower response rates to online evaluations, the effects of these on reliability, mean scores, and distributions are mixed (Guder & Malliaris, 2010). It cannot be said unequivocally that online evaluations are any more or less reliable than paper evaluations. Indeed, Avery et al. (2006, p. 30) report that moving to online versions is unlikely to exert an overall effect on course evaluation scores. Nulty (2008) reports that “paper surveys are not intrinsically better than online” (p. 303), and Lalla and Ferrari (2011) find that changing the mode of collection of evaluations is inconsequential.

It appears that factors other than mode of evaluation (paper/online) may exert an effect on response rates and scores (e.g., J. Anderson et al., 2006; Beran & Violato, 2005; Fike et al., 2010; Layne et al., 1999). Layne et al. (1999) suggest that it is (a) academic area; (b) time pressure (e.g., at the end of a term when students are under examination pressure); (c) ease of access to computers (see also Moss & Hendry, 2002); and (d) computer literacy of students rather than survey mode which affect the results (see also Avery et al., 2006). McGhee and Lowell (2003) report that psychometric properties of student ratings also play a part but that “differences observed ... of online and paper-based systems were more likely due to differences in instructional formats and student populations than to ratings modality” (p. 47).

Table 2. Response rates (percentages) for paper and online evaluations.

Paper-based evaluation | Online evaluation | Source
92% | 23% | Ha et al. (1998)
65% | 31% | Cummings and Ballantyne (1999)
60.6% | 47.8% | Layne et al. (1999)
– | < 35% | Ravelli (2000)
33.3% | 32.6% | Watt et al. (2002)
70% | 29% | Dommeyer et al. (2004)
– | 31% to 89% | H. M. Anderson et al. (2005)
72.9% | 48.5% | Avery et al. (2006)
– | 12% to 84% | J. Anderson et al. (2006)
56% | 33% | Nulty (2008)
68.29% | 54.14% | Fike et al. (2010)
– | 25% lower than for paper-based | Guder and Malliaris (2010)
20% lower than for online | – | Ernst (2006); Stowell et al. (2012)


Indeed, one can suggest that online and paper-based evaluations may not necessarily yield similar data, and that this might affect the reliability and validity of the responses. For example, students may perceive evaluations completed out of class to be less important than those completed in class, or online evaluations may be more a measure of student willingness to complete in their own time, or their preference for an online medium, or their nervousness/ease when facing computers. Or, for example, in-class paper evaluations may be more a measure of students’ desire to complete them quickly and leave the class than to give serious thought to some of their responses (cf. McGhee & Lowell, 2003).

J. Anderson et al. (2006) indicate that student and faculty engagement exert an influence on response rates, and that instructor factors rather than, for example, class size, discipline, or online distribution affect evaluation scores. McPherson, Jewell, and Kim (2009) report that: (a) students’ expected grade affects evaluation results (e.g., by inflating grade expectations, an instructor can increase his/her evaluation scores, p. 48); (b) evaluation scores decrease with class size (p. 48) and are negatively related to the age of the instructor (p. 48). Donovan et al. (2006) indicate that smaller classes tend to have higher response rates. Grade expectation is also reported by Avery et al. (2006), Isely and Singh (2005), and Johnson (2003) to influence evaluation scores. Ballantyne (2003), Guder and Malliaris (2010), Perrett (2013), and Robinson, White, and Denman (2004) suggest that faculty support for online evaluations and the actual effort that they put into supporting it (e.g., discussing it in class, showing students how to access the software, indicating the importance of student feedback, sending emails to remind students to complete), exert an influence on response rates. Ballantyne (2003) suggests that differences found are not due to whether the evaluations are conducted on paper or online, but to the environment, types of student, and whether or not the classes are on campus.

Response rates and how to increase them

There is plentiful advice on (a) what are acceptable response rates to online evaluations and (b) how to increase response rates.

For (a), what constitutes an “acceptable rate” is open to wide interpretation. Whilst large samples may be preferred to small samples, the acceptable rate depends on the confidence level and interval that is required (Cohen, Manion, & Morrison, 2011; Nulty, 2008; Stowell et al., 2012) and the population size (Cohen et al., 2011). Richardson (2005) suggests that a 50% response rate is acceptable; Nulty (2008) reports that a 60 or 70% response rate is desirable and acceptable. He contends that, for classes of 20 or below, the response rate needs to be above 58%, that a 47% response rate is only acceptable for classes of over 30, and that a 20% response rate is only acceptable for classes of over 100.
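The dependence of an acceptable response rate on confidence level, confidence interval, and class (population) size can be made concrete with the standard finite-population sample-size formula. The sketch below is illustrative only and is not Nulty’s or Cohen et al.’s calculation: it assumes the maximally conservative proportion p = 0.5 and the usual normal-approximation formula, then computes the number of completed evaluations needed for a given class size, margin of error, and confidence level.

```python
import math

def required_responses(class_size: int, margin: float = 0.10, confidence: float = 0.95) -> int:
    """Minimum completed evaluations for a class of `class_size` students.

    Uses the normal-approximation sample size n0 = z^2 * p(1-p) / e^2 with the most
    conservative p = 0.5, then applies the finite-population correction
    n = n0 / (1 + (n0 - 1) / N).  Illustrative only: Nulty (2008) uses more liberal
    assumptions, so his published thresholds differ from these outputs.
    """
    z = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}[confidence]  # common z-values
    n0 = (z ** 2) * 0.25 / (margin ** 2)
    n = n0 / (1 + (n0 - 1) / class_size)
    return math.ceil(n)

for size in (20, 30, 100):
    n = required_responses(size)
    print(f"class of {size}: {n} responses needed ({100 * n / size:.0f}% response rate)")
```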

For (b), there are many possible ways to increase response rates:

. Timing: (a) complete the online evaluation in class (H. M. Anderson et al., 2005; Crews & Curtis, 2011); (b) schedule time in a computer laboratory for completing the evaluation (Crews & Curtis, 2011); (c) make explicit the time for completion (Stowell et al., 2012); (d) extend the duration or time period for the online survey’s completion (Nulty, 2008; Richardson, 2005);

. Incentives: (a) provide incentives for completion (Avery et al., 2006; Crews & Curtis, 2011; Nulty, 2008; Robinson et al., 2004), though this may compromise anonymity (Avery et al., 2006); (b) offer grade incentives (Dommeyer et al., 2004) (though Ballantyne, 2003, and Dommeyer et al., 2004, question the ethics and legality of this); (c) offer the chance to take part in a lucky draw (Stowell et al., 2012); (d) count the evaluation as an assignment (Crews & Curtis, 2011); (e) award bonus points/“class participation” points for completion (Crews & Curtis, 2011; Johnson, 2003); (f) require students to complete some, but not all, course evaluations (Crews & Curtis, 2011, suggest 70%);

. Negative incentives: (a) withhold early access to grades (Crews & Curtis, 2011); (b) indicate to students that they are only able to register for an examination after they have completed the evaluation (Lalla & Ferrari, 2011);

. Make online entry easy: (a) show students how to access and complete the online survey (Crews & Curtis, 2011); (b) ensure easy access to high quality technology (Lalla & Ferrari, 2011); (c) “familiarize students with the online environment by using online teaching methods” (Nulty, 2008, p. 306); (d) email the online evaluation link to students (Crews & Curtis, 2011);

. Encouragement: (a) encourage students to complete the evaluation (Crews & Curtis, 2011); (b) stress the importance and value of student feedback and how it will be used (Crews & Curtis, 2011; Johnson, 2003; Sorenson & Reiner, 2003; Stowell et al., 2012); (c) encourage staff to press the importance of completing the evaluations and to show their students their personal interest in having them complete evaluations (Johnson, 2003; Lalla & Ferrari, 2011);

. Questionnaire design: (a) make the questionnaires brief to avoid evaluation fatigue (Nulty, 2008; Sax, Gilmartin, & Bryant, 2003); (b) keep the questionnaire short and easy to understand (Moss & Hendry, 2002); (c) tell the students how long it will take to complete the questionnaire (and ensure it takes no more than around 10 min) (Moss & Hendry, 2002);

. Reminders: (a) send reminders/repeat reminders to students to complete the evaluations (Cartwright, 1999; Crews & Curtis, 2011; Dommeyer et al., 2004; Nulty, 2008; Robinson et al., 2004) (this can be done through automatic reminders; H. M. Anderson et al., 2005). Moss and Hendry (2002) indicate that a reminder after 2 days is more effective than after 5 days; (b) assure students of anonymity (Layne et al., 1999; Nulty, 2008; Oliver & Sautter, 2005); (c) introduce humorous slides into PowerPoint lectures to remind students to complete the evaluations (Crews & Curtis, 2011); (d) employ pre-notification of the evaluation (though Sheehan, 2001, indicates mixed results here).

For institutions that are moving towards increasing online course evaluations, the situation is encouraging, for reports indicate that, after an initial period of “bedding down”, response rates increase (Avery et al., 2006; Ballantyne, 2003; Johnson, 2003). Johnson (2003) indicates a rise from 40% to 60% response over a 3-year period; Ballantyne (2003) reports a rise from 30% to 72% in one school and from 40% to 95% in another.

The need for fair and complete comparisons

Whilst much research suggests that lower response rates do not seem to affect average scores on course evaluations, not all of the surveys conducted use comparative methods (e.g., controls, matching, and random allocation) to ascertain this. Hence, there are questions to be asked about the reliability of the evidence. Some studies use surveys (often small scale) of faculty rather than of students; others use small samples of students; others use classes, sections, or courses as the unit of analysis; others survey non-matched or non-randomised groups. Some studies use data gathered over time but without mention of matching students or randomisation of allocation. Some studies used data gathered in class and out of class. The only studies found with explicit random assignation of students to paper or online evaluations were Barkhi and Williams (2010), Fike et al. (2010), Gamliel and Davidovitz (2005), Heath et al. (2007), and Layne et al. (1999).


Some studies did not make clear whether there had or had not been random assignation. Only one case was found (Johnson, 2003) in which students completed both formats, that is, the same students in both conditions. In other words, the number of studies of comparisons between paper and online course evaluations with strict controls is limited. Table 3 provides outline details of 24 studies of comparisons between paper and online course evaluations.

Only one study was found (Layne et al., 1999) which compared the factor structures of paper and online evaluations, that is, to discover whether, even though the same questions were used in paper and online evaluations, the factor structure revealed by the data differed. This is an under-researched area. Whilst Layne et al. (1999) report that the two modes of evaluation do not differ in their factor patterns (p. 221), if it can be shown that the factor structure of one format differs from that of the other, then this might signal that comparisons between paper and online versions which report only means may not catch the full significance of these two different versions. On the one hand, if the same instrument is used for both paper and online versions, then this can be argued to render the factor structure irrelevant. On the other hand, if differences are found in the factor structure between the same version when completed in two different formats, then this suggests that attention has to be given to the validity and reliability of the format chosen, to alert researchers to the possible bias in the instrument, that is, to see the “fit” between the phenomenon under investigation (the course evaluation), the instrument that is measuring it, and the students who are completing it. Put simply, if different factor structures are found, then the instrument may be biased such that it may be dangerous to assume that the same instrument necessarily measures the same thing to the same students.

To build on the literature reviewed here, the present report has four purposes:

(1) to add to the current research on differences of means and standard deviations between groups completing paper and online course evaluations, with the largest study of its kind found anywhere;

(2) to report the results of a large-scale, randomised trial in one university, of paper and online course evaluations, in terms of means, standard deviations, factor structures, and student preferences for either paper or online course evaluations;

(3) to report the differences in time taken to complete paper and online course evaluations;

(4) to draw implications for the practice of course evaluations.

A case study of one university

The present study addressed five research questions:

(1) What are the differences in time taken to complete paper and online course evaluations?

(2) How easy do respondents find it to complete online evaluations?
(3) What are the student preferences for paper and online course evaluations?
(4) What does a comparison of the means and standard deviations of the paper and online course evaluations indicate?
(5) What does a comparison of the factor structures of the paper and online course evaluations indicate?


The study had been conducted originally to provide data for the university in its consideration of whether or not to move to online course evaluations. A signal feature of this present study is its very large database: 36,396 cases. This places it amongst the largest of its kind reported in the world, and the size gives confidence to the results. Further, in response to the note earlier that there are very few studies which randomly assign students from the same group to one or other of paper or online versions of an evaluation, the present study randomly assigned students from the same group (i.e., from the same individual module/course) to one or other of the paper and online versions of the evaluation instrument, thereby enabling a fairer comparison to be made.

Method

An experiment was conducted in summer 2010 with the data collected over a 2-week period. The study introduced online course evaluations as an innovation as, up until then, the university had not used online evaluations for any of its courses; they had all been paper only, and completed in class time.

Table 3. Key studies of online and paper course evaluations.

Research | Faculty/staff respondents | Student respondents | Classes/sections/courses
Cummings and Ballantyne (1999) | | 280 |
Layne et al. (1999) | | 2453 |
Dommeyer et al. (2002) | 53 professors | |
Moss and Hendry (2002) | | 1350 |
Johnson (2003) | | > 6000 (exact number unidentified) | 194 courses
Thorpe (2002) | | 404 |
Dommeyer et al. (2004) | 16 instructors | |
Robinson et al. (2004) | | 3000 |
H. M. Anderson et al. (2005) | 28 instructors | Unidentified number of students | 9 courses
Beran and Violato (2005) | 1800 full-time faculty | 25,000 |
Gamliel and Davidovitz (2005) | | 198 |
Avery et al. (2006) | | 3037 | 29 courses
J. Anderson et al. (2006) | | > 4500 surveys (to fewer than 4500 students) |
Donovan et al. (2006) | 11 instructors | 413 |
Donovan et al. (2007) | | 851 |
Heath et al. (2007) | | 342 |
McPherson et al. (2009) | 70 instructors | | 997 classes
Barkhi and Williams (2010) | | 1846 |
Fike et al. (2010) | | 4550 |
Guder and Malliaris (2010) | | 17,161 | 657 sections
Crews and Curtis (2011) | 49 instructors/faculty | |
Lalla and Ferrari (2011) | | 10,485 |
Perrett (2013) | | | 36 graduate sections, 45 undergraduate sections
Stowell et al. (2012) | 32 instructors | 2057 |


All the undergraduate students in a university in China were approached to participate in the study (8734 students) for all their courses, that is, a population rather than a sample. The average ages of these students varied by year group (e.g., freshmen, sophomores, seniors, etc.) but, overall, were overwhelmingly in the age group 18 to 21. The students were studying a range of majors: business, management, information technology, law, humanities, arts, tourism, Chinese medicine, that is, a comprehensive range of subjects. Additionally, all the students followed an extensive program of General Education, languages, mathematics, and physical education.

Students within each course (in other institutions this might be termed a “module”, that is, a single, credit-bearing taught element of a whole degree) were randomly assigned to either an “online” or a “paper” group, that is, two groups for each course: one half of the course’s students completed the course evaluation online in their own time, and the other half completed the paper version during the classroom session.
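The within-course random split described above can be illustrated with a minimal sketch. This is not the university’s actual allocation procedure, and the data (hypothetical course codes and student IDs) are invented for illustration; it simply shows one way of randomly halving each course group into an “online” and a “paper” condition.

```python
import random

def assign_within_course(rosters: dict[str, list[str]], seed: int = 2010) -> dict[str, dict[str, str]]:
    """Randomly split each course roster into an 'online' half and a 'paper' half."""
    rng = random.Random(seed)
    assignment: dict[str, dict[str, str]] = {}
    for course, students in rosters.items():
        shuffled = students[:]            # copy so the original roster is untouched
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        assignment[course] = {s: "online" for s in shuffled[:half]}
        assignment[course].update({s: "paper" for s in shuffled[half:]})
    return assignment

# Hypothetical rosters, purely for illustration.
rosters = {"GE101": [f"s{i:03d}" for i in range(1, 41)],
           "LAW205": [f"s{i:03d}" for i in range(41, 76)]}
groups = assign_within_course(rosters)
print({c: sum(v == "online" for v in g.values()) for c, g in groups.items()})
```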

Staff and students were prompted to participate in the study by email and notices posted around the university indicating that the research would be conducted, its purpose, how and when it would be conducted, how to find out more about it, and its importance. Students were requested on several occasions, before and during the course evaluation period, to complete the online evaluations in their own time. Paper evaluations were completed in class at the end of each course, as was the normal practice in the university.

The course evaluation instrument was that used in normal, everyday practice in the university. It comprised 18 five-point Likert scales ranging from a score of 1 (not at all) to 5 (a very great deal), together with a space for students to write any other word-based comments that they wished (the latter do not feature in the present report). In addition to the 18 standard course evaluation questions, every student was asked to indicate how long it had taken to complete the course evaluation questionnaire. The students who completed the online version were also asked whether they preferred the online version or the paper version (the students who had completed the paper version were not asked this, as they had had no experience of an online version prior to or at the time of the experiment). Additionally, the students who had completed the online version were asked how easy/difficult they found it to complete.

Every student was asked to complete either the online or paper version of the course evaluation, for every course that he/she was taking at the time, that is, several evaluations per student. This was conducted at the end of the semester, which conformed to standard practice in the university. The data were processed using the Statistical Package for the Social Sciences (SPSS), and comparisons were made between the online and paper versions of the course evaluation using the following statistics (a computational sketch follows the list):

. Cronbach alphas of reliability for online and paper versions;

. frequencies, percentages, means, standard deviations, for both online and paper versions, to enable comparisons to be made between the two versions;

. t tests and Mann-Whitney U tests of difference between online and paper versions;

. principal components analysis of factors and factor loadings for the online and paper versions, for comparisons between the two.
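As a minimal sketch of how these per-course comparisons can be computed (illustrative only, not the study’s SPSS syntax; the column names `course`, `mode`, and `q01`–`q18` are hypothetical), the following Python code computes Cronbach’s alpha for a set of items and, for each item, an independent-samples t test and a Mann-Whitney U test between the online and paper groups of one course.

```python
import numpy as np
import pandas as pd
from scipy import stats

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of Likert scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def compare_course(df: pd.DataFrame, items: list[str]) -> pd.DataFrame:
    """Per-item t tests and Mann-Whitney U tests between online and paper groups."""
    online = df[df["mode"] == "online"]
    paper = df[df["mode"] == "paper"]
    rows = []
    for q in items:
        t, p_t = stats.ttest_ind(online[q], paper[q], equal_var=False, nan_policy="omit")
        u, p_u = stats.mannwhitneyu(online[q].dropna(), paper[q].dropna())
        rows.append({"item": q, "t": t, "p_t": p_t, "U": u, "p_U": p_u,
                     "mean_online": online[q].mean(), "mean_paper": paper[q].mean(),
                     "sd_online": online[q].std(), "sd_paper": paper[q].std()})
    return pd.DataFrame(rows)

# Usage, assuming one course's responses with a 'mode' column and items q01..q18:
# items = [f"q{i:02d}" for i in range(1, 19)]
# print(cronbach_alpha(course_df[items]))
# print(compare_course(course_df, items))
```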

Results

In total, 36,396 course evaluations were received, from 173 courses, a response rate of 63.85%. Of these, 22,189 course evaluations (60.97%) were from students completing the online version, and 14,207 course evaluations (39.01%) were from students completing the paper version.


Some responses were incomplete, and some were spoiled data, resulting in 36,244 usable course evaluations.

The average time taken to complete the online version was 7.33 min, and for a paper version it was 2.88 min. Of the 11,979 responses received to the question of how easy/difficult the students found it to complete the online version, 4190 students (35.0%) responded “very easy”, 7251 (60.5%) responded “easy”, 433 (3.6%) responded “difficult”, and 105 (0.9%) responded “very difficult”.

Of the 11,815 responses received to the question of which version was preferred (online or paper), 7101 (60.1%) preferred an online version to a paper version and 4711 (39.9%) preferred a paper version to an online version.

Reliability testing

Reliability tests were conducted on each course using the Cronbach alpha. For 167 out of the 173 courses (96.5%), the alphas were > .95; in only one case did the alpha fall below .67. In nearly all of the cases where differences were found in the alphas between the online and paper versions, these were smaller than .01, that is, the differences were extremely small. Whether the online or paper version was used made a negligible difference to reliability.

Means and standard deviations

Whilst it was possible to calculate means and standard deviations for all completed responses, it was not possible to conduct difference tests where comparative data were missing (see below). For some statistics, data were usable from 173 courses, and for other statistics data from 163 courses were usable.

In analysing the means of the courses, 2149 means (74.18%) for the paper version were higher than those of the online version, 696 means (24.04%) for the online version were higher than those of the paper version, 52 means (1.79%) were the same in both versions, with 0.93% spoiled data. In other words, nearly three quarters of all the means were higher for the paper version than for the online version. Table 4 shows the number of times that the online and paper evaluations varied according to the means and standard deviations (SD) found.

In comparing the means with standard deviations for paper and online evaluations, with 2906 usable pieces of data, Table 4 indicates several points:

Table 4. Comparing means and standard deviations for online and paper evaluations.

 | Online SD higher than paper SD | Paper SD higher than online SD | Online and paper SD the same | Total
Online mean higher than paper mean | 130 | 596 | 1 | 727
Paper mean higher than online mean | 1588 | 516 | 7 | 2111
Online and paper means the same | 24 | 35 | 9 | 68
Total | 1742 | 1147 | 17 | 2906


. The majority (1742/2906 = 59.94%) of the standard deviations for the online version were higher than those of the paper versions, regardless of which means were higher, lower, or the same for the online and paper versions.

. A sizeable minority (1147/2906 = 39.47%) of the standard deviations of the paper version were higher than those of the online version, regardless of which means were higher, lower, or the same for the online and paper versions.

. Where the means of the paper version were higher than those of the online version, the standard deviations of the online version were higher (1588: 54.65% of the total number of occasions).

. Where the means of the paper version were higher than those of the online version, the standard deviations of the paper version were higher (516: 17.76% of the total number of occasions).

. Where the means of the online version were higher than those of the paper version, the standard deviations of the online version were higher (130: 4.47% of the total number of occasions).

. Where the means of the online version were higher than those of the paper version, the standard deviations of the paper version were higher (596: 20.51% of the total number of occasions).

The chi-square statistic was calculated for the data in Table 4, yielding a highly statistically significant result (χ2 = 3946.37; df = 4; p < .0001), that is, the differences found in the distributions were statistically significant and not random.
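For readers who wish to see how a chi-square test of association is computed from a cross-tabulation such as Table 4, the sketch below applies scipy’s chi2_contingency to the cell counts as printed in the table. It illustrates the procedure only; it is not a re-analysis of the study and is not guaranteed to reproduce the reported statistic, which was computed in SPSS on the original data.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Cell counts as printed in Table 4 (rows: which mean was higher; columns: which SD was higher).
table4 = np.array([
    [130, 596, 1],    # online mean higher
    [1588, 516, 7],   # paper mean higher
    [24, 35, 9],      # means the same
])

chi2, p, dof, expected = chi2_contingency(table4)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4g}")
```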

Whether the evaluation was conducted in a paper or online version accounted for 20.47% ({1742 – 1147}/2906 × 100) of the number of differences found in the standard deviations, whereas whether the means of the paper or online evaluations were higher accounted for 47.63% ({2111 – 727}/2906 × 100) of the number of differences found in the standard deviations. Difference in the means between the paper and online evaluations appears to be more strongly associated with the standard deviations than is whether the evaluations were conducted online or on paper; the majority of standard deviations were higher for online evaluations, a difference of 20.47% (59.94% minus 39.47%) between the two versions being found here.

Difference testing was conducted in order to discover if there were statistically significant differences (p < .05): (a) between the means of each course in respect of online and paper versions (t test); (b) between the online and paper versions of each course, using a distribution-free statistic (Mann-Whitney U test). Out of 3114 variables used (173 courses × 18 variables [one for each of the 18 evaluation questions]), statistically significant differences between the means of the online versions and the paper versions were found for 558 variables (17.89%) using t tests, that is, 82.11% of means had no statistically significant differences between the online and paper versions. Using the Mann-Whitney U test, statistically significant differences were found between the ranking of the variables of the online versions and the paper versions for 599 variables (19.24%), that is, 80.76% of the variables had no statistically significant differences between the online and paper versions.

In 12 out of the 173 courses (6.9%), statistically significant differences were found between the means of the online and paper versions in respect of all the variables for a course. Of those courses in which five or fewer variables were statistically significantly different from each other in each course, 119 means (88.1%) were higher for the paper version than for the online version, and 16 means (11.9%) were higher for the online version.

Hence, whilst paper versions had a very clear majority of higher means than online versions, in over 80% of these cases the differences were not statistically significant, that is, they could have been caused by random fluctuations and random patterning.


Principal components analysis

Principal components analysis (PCA) was conducted (with direct oblimin rotation, given the high alphas of reliability) for all those courses for which more than 30 responses were received for each of the online and paper versions, that is, a minimum total of 60 responses. PCA was conducted in order to discover whether there were differences in the factor structure and factor loadings for the online and paper versions. This was only conducted for 77 courses, as, for other courses, there were too few cases to make it possible to compute the statistic reliably, or other matters (e.g., spoiled or incomplete data) prevented SPSS from computing the statistic. In comparing the online and paper versions, the results showed the following (a computational sketch of such a per-course comparison appears after this list):

. There were only 3 cases where the number of the factors and their nature (i.e., what each factor was about) remained stable and where there was more than one factor for each version, for example, the same number of factors were found and similar amounts of variance were explained by each factor, but with factor loadings varying.

. There were 2 cases where the number of the factors and their nature (i.e., what each factor was about) remained the same but substantial differences were found in amounts of variance explained and in the factor loadings.

. There were 3 cases where different numbers of factors were extracted, though the total amount of variance explained was similar.

. There were 35 cases where different numbers of factors were extracted and the total amount of variance explained was dissimilar.

. For 38 courses, a single factor was extracted in both the online and paper versions with the factor loadings varying within these.

. For 15 courses, the online version extracted more factors than the paper version.

. For 20 courses, the paper version extracted more factors than the online version.

. For 3 courses where more than one factor was extracted for each course, the number of factors extracted was the same.
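The per-course factor comparison referred to above can be sketched as follows. This is an illustration only, not the study’s SPSS analysis: it extracts principal components from the correlation matrix of the 18 items separately for the online and paper groups, retains components with eigenvalues above 1 (the Kaiser criterion), and reports unrotated loadings and variance explained; the direct oblimin rotation used in the study is omitted for brevity, and the data frame layout is hypothetical.

```python
import numpy as np
import pandas as pd

def pca_summary(items: pd.DataFrame) -> dict:
    """Unrotated principal components of the item correlation matrix (Kaiser criterion)."""
    corr = np.corrcoef(items.to_numpy(), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]              # largest eigenvalues first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                           # Kaiser criterion
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return {
        "n_factors": int(keep.sum()),
        "variance_explained": eigvals[keep] / len(eigvals),
        "loadings": pd.DataFrame(loadings, index=items.columns),
    }

def compare_factor_structures(course_df: pd.DataFrame, items: list[str]) -> None:
    """Compare the number of retained components for the online and paper groups of one course."""
    for mode in ("online", "paper"):
        summary = pca_summary(course_df.loc[course_df["mode"] == mode, items])
        print(mode, "- factors:", summary["n_factors"],
              "- variance explained:", np.round(summary["variance_explained"], 3))

# Usage (hypothetical data layout):
# compare_factor_structures(course_df, [f"q{i:02d}" for i in range(1, 19)])
```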

The incidence of differences found in (a) the number of factors found, (b) the nature of each factor, (c) factor loadings, and (d) amounts of variance explained, suggests that caution has to be exercised in making assumptions of similarity between the results of online and paper evaluations, even when they use the same instrument with the same population. This is an important finding.

Discussion

The research here was conducted with students for whom the use of an online course evaluation was a complete novelty. This might have influenced the results through the Hawthorne effect, for example, in the higher number of students completing the online version and the novelty effect of this method. The Hawthorne effect might have been operating further, as it was noticeable that, even though paper evaluations were conducted in class and in class time, 1.56 times as many online evaluations were received as paper evaluations.

In both cases (paper and online), the time taken to complete the evaluations was very short: 2.88 and 7.33 min, respectively; students took more than 2.5 times as long to complete the online version as did those completing the paper version.


Whether the longer time was due to unfamiliarity with the medium, difficulty with the medium, or students’ concern to complete the online evaluation carefully is an open question that the research did not explore.

The very short time taken to complete the paper version raises questions about how carefully the students completed the evaluation, or whether, after several years of usage, they were so familiar with it that they took only a short time to complete it, or, indeed, how seriously they took the study. In both cases, given the short time taken to complete the evaluations, the question can be raised of how seriously the students took the entire exercise (evaluation fatigue can set in if students have too many evaluations to complete; Sax et al., 2003).

Further reliability queries arose. For example, it was discovered that some course evaluations were not conducted entirely properly: (a) Some teaching groups were not briefed correctly about the course evaluations by their course tutors; (b) some evaluations were conducted in a perfunctory manner in class time; (c) some evaluations were all done on paper, or all done online, despite two groups having been arranged from each course to complete the course evaluation (in which case the data could not be used).

Despite these concerns, however, the very large response – both in respect of students and in coverage of courses – gives considerable credibility to the results.

The majority (95.5%) of the 11,979 respondents to the question about how easy they found it to complete the online evaluation reported it to be “very easy” or “easy” to complete, whilst 4.5% found it “difficult” or “very difficult” to complete. This underlines the importance of the literature cited earlier which indicates that rendering it easy to complete an online evaluation contributes to its success.

The present study generally confirms literature cited earlier that finds larger standard deviations and greater variance of standard deviations for online evaluations in comparison to paper evaluations. It also confirms the literature cited earlier which reported very mixed findings in respect of whether means are higher or lower depending on the medium used (paper or online). Whilst the present study found that 74.18% of the means in the paper version were higher than those of the online version and, conversely, that 24.04% of the online means were higher than those of the paper versions, difference tests indicated that, overall, few statistically significant differences were found in the means between the online and paper versions. Even if such differences were found, then, on their own, these do not indicate which version should be preferred. However, given that 39.01% of responses were to the paper version and 60.97% of responses were to the online version, and that 60.1% of responses indicated a preference for the online version, in comparison to 39.9% of responses indicating a preference for the paper version, it seems as though the online version has greater strength here.

The results here indicated that the means of paper versions were slightly higher than those of online versions, even though only in 17.89% of cases were these statistically significant. Nevertheless, the percentage is not small; more than one in six means were different depending on whether they were completed online or on paper. Whilst this does not tell us whether paper or online versions are to be preferred, it does alert us to expect some different results dependent on the medium used.

This difference is compounded by two other factors: (a) many of the standard deviations were higher for online versions than for paper versions, though, as noted earlier, this appeared to relate more to the means than to the differences in the medium of completion (paper or online); (b) the number of factors found, the foci of the factors (i.e., what they were about), factor loadings, and variance explained, differed between online and paper evaluations, even when the same instrument was used with the same population.


This suggests that online and paper versions are not entirely commensurate; despite their identical contents, they give rise to different results. As before, whilst this does not tell us which version is more reliable or preferable, it may advise us to expect different results depending on the medium used. This raises issues of validity and the credence that can be placed in the results.

It is perhaps unsurprising that so many courses (38) had only a single factor, as the items were measuring different facets of the same latent factor. Given this, it is unsurprising, also, that the reliability alphas (mentioned earlier) were exceptionally high, regardless of the medium used (online/paper), as the alpha is a measure of homogeneity, and one would expect a single-factor structure to have high homogeneity.

The findings concerning differences found between online and paper evaluations have to be tempered by two other considerations.

First, difference in the means between the paper and online evaluations appeared to be more strongly associated with the standard deviations than was whether the evaluations were conducted online or on paper.

Second, one cannot attribute causality to the findings, for example, one cannot say that it is the medium (paper or online) which was causing the differences found. The results concern association rather than causation. The differences found between the means, standard deviations, and time taken for both versions may not be caused by the medium but by other factors. For example, savings of time in using the paper version in comparison to the online versions might have led to more positive feelings in the students, so they voted more positively, or, conversely, the extra time taken (and in students’ own time) may have led them to feel more negatively about completing the online evaluations in comparison to the paper versions. In other words, convenience, time, and disruption to personal ease and increased irritation (i.e., “bother”) might have caused the differences found rather than the medium being the cause. The medium of completion might simply be the context of, or conduit for, other independent variables to work in the situation which bring about differences found in the results of the two media of evaluation. Alongside this suggestion, one has to recall that 60.1% of the students who responded to the question about preference for medium used preferred the online medium. The findings of the present study support those several other studies reported earlier, which suggested that factors other than the medium of completion (paper or online) might be exerting an influence on course evaluations and their results.

Whilst the results concerning student preference for online versions accord with the literature reviewed in the earlier part of this paper, they contradict the literature which suggested that online versions are quicker to complete (e.g., Kasiar et al., 2002), though, in the present study, this may have been due to the novelty of the online situation for the students. Further, in only a small number of cases was tutor influence seen to impede the smooth operation of the present study; hence, the literature which suggests that tutor influence can be reduced in online course evaluations is supported.

Whilst a considerable body of literature was cited earlier, showing that response rates to online evaluations are lower than those for paper versions, this was not supported by the present findings. However, in the present study this could have been due to the considerable encouragement of, and perhaps even pressure placed on, students to complete the evaluations, or the novelty value of the online version, or Chinese students’ reluctance to deny their tutors’ requests, or, indeed, the presence of several of the incentives and steps taken to ensure high numbers of online responses outlined in the earlier part of the paper. In other words, a fair comparison cannot be made to the (Western) literature in this instance.


Further, the students’ response rates, ratings, and comments might have been affected if they completed their evaluations under duress or pressure, or only out of a sense of duty (a signal feature of Chinese culture), or because the in-class evaluations were completed during lesson time, even though the tutor had permanently left the room (that is, the class completed the evaluation and a student took the forms to a central collection point in the university).

Moreover, given the Chinese (Confucian) culture of respect for teachers and the emphasis placed on harmony and positive relationships between teacher and students, the students may have been unwilling to be too negative about their teachers.

Despite the very large size of the present study, it must be noted that the number of responses from the online students was 1.56 times that for those completing the paper evaluations, and this might suggest that making comparisons between the two groups (online and paper) might be invidious. This may be due to the liking of Chinese students for online communication: East Asia (e.g., South Korea, Japan, China, Macau, Hong Kong, Taiwan) at the time of writing has the world’s highest proportion of internet users (double that of Europe and nearly quadruple that of North America) and cellphone users (http://www.internetworldstats.com/stats.htm). However, the large numbers of students involved (22,189 and 14,207, respectively) can be regarded as a sufficient counterbalance to the difference in proportions.

Conclusions

This present study addressed five research questions. Whilst the main body of the paper addresses these in detail, in summary the following results were found.

The average time for paper and online evaluations was very short (2.88 min and 7.33 min, respectively). The majority (95.5%) of the 11,979 respondents to the question about ease of completion found it “very easy” or “easy” to complete the evaluation, whilst 4.5% found it “difficult” or “very difficult” to complete. Of the 11,815 responses received to the question about the preferred version, 60.1% preferred an online version to a paper version and 39.9% preferred a paper version to an online version. Attention was drawn to the fact that this was the first time that the university had used online methods of course evaluation, and hence the Hawthorne effect and novelty factors might compromise the trust that can be placed in the findings here. The study found that 74.18% of the means for the paper version were higher than those of the online version, 24.04% for the online version were higher than those of the paper version, and 1.79% were the same in both versions, that is, nearly three quarters of all the means were higher for the paper version than for the online version. However, of these differences found, the t test found that 82.11% of the means had no statistically significant differences between the online and paper versions, and the Mann-Whitney U test found that 80.76% of the variables had no statistically significant differences between their results, that is, they could have been caused by random fluctuations and random patterning.

With regard to standard deviations, 59.94% of the standard deviations for the online version were higher than those of the paper versions, and 39.47% of the standard deviations of the paper version were higher than those of the online version. The differences found in the distributions of the standard deviations were highly statistically significant (p < .0001) and not random. It was found that difference in the means between the paper and online evaluations appears to be more strongly associated with the standard deviations than is whether the evaluations were conducted online or on paper.


Principal components analysis revealed several differences in the number of factors found, factor structures, factor loadings, and amounts of variance explained. These differences suggest that caution has to be exercised in making assumptions of similarity between online and paper evaluations, even when they use the same instrument with the same population.

It must be noted that, for the students, this was their first foray into online evaluations, and this might have influenced their results. Additional and ongoing data collection over a longer period of time might be important, after the online system has had an opportunity to become embedded in the everyday workings of the university in question. Whilst this study used random allocation of students, in which each course group was split into two halves, the response rate was uneven (more students completed the online evaluation than the paper evaluation). Further, the course evaluations were completed under different conditions, the paper version being completed in class and in class time, and the online version being completed out of class in students’ own time, rendering comparisons imprecise and perhaps unfair. The data were gathered for undergraduate students only, and not postgraduate students, and the literature cited earlier suggests that differences in responses to course evaluations between these two levels of students might be found. Further, this study has not reported the results of the students’ written comments; that is beyond the scope of the present paper.

Despite the caveats outlined here and in the discussion section, the large number of respondents gives considerable weight to the findings. This, coupled with the very high reliability coefficients reported (alphas of > .95 for 96.5% of the courses), accords the present study considerable reliability.
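For readers unfamiliar with the reliability coefficient reported here, the sketch below shows how Cronbach's alpha might be computed for one course's respondent-by-item ratings; the rating matrix is hypothetical and the resulting figure bears no relation to the study's results.

    # Illustrative sketch only: Cronbach's alpha for a respondents-by-items matrix.
    import numpy as np

    def cronbach_alpha(item_scores):
        scores = np.asarray(item_scores, dtype=float)
        n_items = scores.shape[1]
        item_variances = scores.var(axis=0, ddof=1)
        total_variance = scores.sum(axis=1).var(ddof=1)
        return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

    rng = np.random.default_rng(2)
    ratings = rng.integers(1, 6, size=(120, 15))  # hypothetical 120 respondents x 15 items
    # Note: uncorrelated random items give a low alpha; real scale items are correlated.
    print(f"alpha = {cronbach_alpha(ratings):.3f}")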

Finally, it was contended that caution has to be exercised in assuming that online and paper evaluations will necessarily yield similar results, even when the same instrument is used with the same population. Whilst this does not indicate whether one version is preferable to another, it suggests that reactivity to the medium might influence the results and, hence, the weight that can be placed on them. Ultimately, decisions on whether to opt for online or paper evaluations might be taken on grounds of a range of cost savings rather than for educational reasons, and both the literature review and the data in the present study indicate that, when time and timeliness are at a premium, these are important considerations.

Notes

1. H. M. Anderson, Cain, and Bird (2005); Donovan, Mader, and Shinsky (2007); Fike, Doyle, and Connelly (2010); Johnson (2003); Layne, DeCristoforo, and McGinty (1999); Stowell, Addison, and Smith (2012).

2. Baum, Chapman, Dommeyer, and Hanna (2001); Collings and Ballantyne (2004); Crews and Curtis (2011); Dommeyer, Baum, Chapman, and Hanna (2002); Dommeyer, Baum, Hanna, and Chapman (2004); Donovan, Mader, and Shinsky (2006, 2007); Fike et al. (2010); Guder and Malliaris (2010); Ha, Marsh, and Jones (1998); Hardy (2003); Hmieleski and Champagne (2000); Johnson (2003); Kasiar, Schroeder, and Holstad (2002); Kuhtman (2004); Layne et al. (1999); Ravelli (2000); Rhea, Rovai, Ponton, Derrick, and Davis (2007); Sorenson and Johnson (2003); Sorenson and Reiner (2003); Stowell et al. (2012); Tucker, Jones, Straker, and Cole (2003).

3. H. M. Anderson et al. (2005); Bothell and Henderson (2003); Crews and Curtis (2011); Cummings, Ballantyne, and Fowler (2000); Dommeyer et al. (2002); Dommeyer et al. (2004); Donovan et al. (2006, 2007); Hmieleski and Champagne (2000); Johnson (2003); Kuhtman (2004); Layne et al. (1999); McCracken and Kelly (2011); Nulty (2008); Sorenson and Reiner (2003); Stowell et al. (2012).


4. Ballantyne (2003); Bothell and Henderson (2003); Crews and Curtis (2011); Fraze, Hardin, Brashears, Smith, and Lockaby (2000); Hmieleski and Champagne (2000); Johnson (2003); Kronholm et al. (1999); McCracken and Kelly (2011); Sorenson and Reiner (2003).

5. H. M. Anderson et al. (2005); Badri, Abdulla, Kamali, and Dodeen (2006); Barkhi and Williams (2010); Crews and Curtis (2011); Dommeyer et al. (2004); Donovan et al. (2007).

6. H. M. Anderson et al. (2005); J. Anderson, Brown, and Spaeth (2006); Crews and Curtis (2011); Donovan et al. (2007); Hmieleski and Champagne (2000); McCracken and Kelly (2011); Watt, Simpson, McKillop, and Nunn (2002).

7. Avery, Bryant, Mathios, Kang, and Bell (2006); Chang (2003); Crews and Curtis (2011); Dommeyer et al. (2002); Dommeyer et al. (2004); Guder and Malliaris (2010); Laubsch (2006); Layne et al. (1999); Liegle and McDonald (2005); Norris and Conn (2005); Richardson (2005); Schawitch (2005); Sorenson and Johnson (2003); Sorenson and Reiner (2003); Stowell et al. (2012); Thorpe (2002).

8. Avery et al. (2006); Barkhi and Williams (2010); Carini et al. (2003); Dommeyer et al. (2002); Dommeyer et al. (2004); Donovan et al. (2006); Fike et al. (2010); Gamliel and Davidovitz (2005); Guder and Malliaris (2010); Ha et al. (1998); Handwerk, Carson, and Blackwell (2000); Heath, Lawyer, and Rasmussen (2007); Johnson (2003); Layne et al. (1999); Perrett (2013); Sorenson and Johnson (2003); Stowell et al. (2012); Thorpe (2002); Turhan, Yaris, and Nural (2005).

References

Anderson, H. M., Cain, J., & Bird, E. (2005). Online student course evaluations: Review of literature and a pilot study. American Journal of Pharmaceutical Education, 61, 34–43.

Anderson, J., Brown, G., & Spaeth, S. (2006). Online student evaluations and response rates reconsidered. Innovate, 2. Retrieved from http://www.innovateonline.info/index.php?view=article&id=301 (The article is reprinted here with permission of the publisher, The Fischler School of Education and Human Services at Nova Southeastern University.)

Avery, R. J., Bryant, W. K., Mathios, A., Kang, H., & Bell, D. (2006). Electronic course evaluations: Does an online delivery system influence student evaluations? The Journal of Economic Education, 37, 21–37.

Badri, M. A., Abdulla, M., Kamali, M. A., & Dodeen, H. (2006). Identifying potential biasing variables in student evaluation of teaching in a newly accredited business program in UAE. International Journal of Educational Management, 20, 43–59.

Ballantyne, C. (2003). Online evaluations of teaching: An examination of current practice and considerations for the future. New Directions for Teaching and Learning, 96, 103–112. doi:10.1002/tl.127

Barkhi, R., & Williams, P. (2010). The impact of electronic media on faculty evaluation. Assessment & Evaluation in Higher Education, 35, 241–262.

Baum, P., Chapman, K., Dommeyer, C., & Hanna, R. (2001, June). Online versus in-class student evaluations of faculty. Paper presented at the Hawaii Conference on Business, Honolulu, HI.

Beran, T., & Violato, C. (2005). Ratings of university teacher instruction: How much do student and course characteristics really matter? Assessment & Evaluation in Higher Education, 30, 593–601.

Bothell, T. W., & Henderson, T. (2003). Do online ratings of instruction make sense? New Directions for Teaching and Learning, 96, 69–79. doi:10.1002/tl.124

Carini, R. M., Hayek, J. C., Kuh, G. D., Kennedy, J. M., & Ouimet, J. A. (2003). College student responses to web and paper surveys: Does mode matter? Research in Higher Education, 44, 1–19.

Cartwright, D. W. (1999, May–June). Assessing distance learning using a website survey. Paper presented at the Association for Institutional Research Annual Forum, Seattle, WA.

Chang, T. (2003, April). The results of student ratings: The comparison between paper and online survey. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL.

Cohen, L., Manion, L., & Morrison, K. R. B. (2011). Research methods in education (7th ed.). Abingdon, UK: Routledge.

Collings, D., & Ballantyne, C. S. (2004, November). Online student survey comments: A qualitative improvement? Paper presented at the 2004 Evaluation Forum, Melbourne, Victoria, Australia. Retrieved from http://our.murdoch.edu.au/Educational-Development/_document/Publications/Eval_forum_paper.pdf

Crews, T. B., & Curtis, D. F. (2011). Online course evaluations: Faculty perspective and strategies for improved online response rates. Assessment & Evaluation in Higher Education, 36, 865–878.

Cummings, R., & Ballantyne, C. (1999, October). Student feedback on teaching: Online! On target? Paper presented at the Annual International Conference of the Australasian Evaluation Society, Perth, Western Australia.

Cummings, R., Ballantyne, C., & Fowler, L. (2000, August). Online student feedback surveys: Encouraging staff and student use. Paper presented at the Teaching Evaluation Forum, Perth, Western Australia.

Dommeyer, C. J., Baum, P., Chapman, K. S., & Hanna, R. W. (2002). Attitudes of business faculty towards two methods of collecting teaching evaluations: Paper vs. online. Assessment & Evaluation in Higher Education, 27, 455–462.

Dommeyer, C. J., Baum, P., Hanna, R. W., & Chapman, K. S. (2004). Gathering faculty teaching evaluations by in-class and online surveys: Their effects on response rates and evaluations. Assessment & Evaluation in Higher Education, 29, 611–623.

Donovan, J., Mader, C. E., & Shinsky, J. (2006). Constructive student feedback: Online vs. traditional course evaluations. Journal of Interactive Online Learning, 5, 283–296.

Donovan, J., Mader, C., & Shinsky, J. (2007). Online vs. traditional course evaluation formats: Student perceptions. Journal of Interactive Online Learning, 6, 158–180.

Ernst, D. (2006, October). Student evaluations: A comparison of online vs. paper data collection. Paper presented at the annual conference of EDUCAUSE, Dallas, TX.

Fike, D. S., Doyle, D. J., & Connelly, R. J. (2010). Online vs. paper evaluations of faculty: When less is good. The Journal of Effective Teaching, 10(2), 42–54.

Fraze, S., Hardin, K., Brashears, T., Smith, J., & Lockaby, J. (2002, December). The effects of delivery mode upon survey response rate and perceived attitudes of Texas Agri-Science teachers. Paper presented at the National Agricultural Education Research Conference, Las Vegas, NV.

Gamliel, E., & Davidovitz, L. (2005). Online versus traditional teaching evaluation: Mode can matter. Assessment & Evaluation in Higher Education, 30, 581–592.

Guder, F., & Malliaris, M. (2010). Online and paper course evaluations. American Journal of Business Education, 3, 131–138.

Ha, T. S., Marsh, J., & Jones, J. (1998, May). A web-based system for teaching evaluation. Paper presented at the New Challenges and Innovations in Teaching (NCITT) 1998 Conference, Hong Kong.

Handwerk, P., Carson, C., & Blackwell, K. (2000, May). On-line vs. paper-and-pencil surveying of students: A case study. Paper presented at the 40th Annual Meeting of the Association for Institutional Research, Cincinnati, OH.

Hardy, N. (2003). Online ratings: Fact and fiction. New Directions for Teaching and Learning, 96, 31–38. doi:10.1002/tl.120

Hastie, M., & Palmer, A. (1997, July). The development of online evaluation instruments to complement web-based educational resources. Paper presented at the Third Australian World Wide Web Conference, Lismore, NSW, Australia.

Heath, N. M., Lawyer, S. R., & Rasmussen, E. B. (2007). Web-based versus paper-and-pencil course evaluations. Teaching of Psychology, 34, 259–261.

Hmieleski, K., & Champagne, M. V. (2000, September/October). Plugging into course evaluation. The Technology Source. Retrieved from http://technologysource.org/article/plugging_in_to_course_evaluation/

Hoffman, K. M. (2003). Online course evaluation and reporting in higher education. New Directions for Teaching and Learning, 96, 25–29. doi:10.1002/tl.119

Isely, P., & Singh, H. (2005). Do higher grades lead to favorable student evaluations? The Journal of Economic Education, 36, 29–42.

Johnson, T. D. (2003). Online student ratings: Will students respond? New Directions for Teaching and Learning, 96, 49–59. doi:10.1002/tl.122

Kasiar, J. B., Schroeder, S. L., & Holstad, S. G. (2002). Comparison of traditional and web-based course evaluation processes in a required, team-taught pharmacotherapy course. American Journal of Pharmaceutical Education, 66, 268–270.

Kronholm, E. A., Wisher, R. A., Curnow, C. K., & Poker, F. (1999, May). The transformation of a distance learning enterprise to an internet base: From advertising to evaluation. Paper presented at the Northern Arizona University NAU/Webb99 Conference, Flagstaff, AZ.


Kuhtman, M. (2004). Review of online ratings of instruction. College and University Journal, 80, 64–67.

Lalla, M., & Ferrari, D. (2011). Web-based versus paper-based data collection for the evaluation of teaching activity: Empirical evidence from a case study. Assessment & Evaluation in Higher Education, 36, 346–365.

Laubsch, P. (2006). Online and in-person evaluations: A literature review and exploratory comparison. MERLOT Journal of Online Learning and Teaching, 2, 62–73.

Layne, B. H., DeCristoforo, J. R., & McGinty, D. (1999). Electronic versus traditional student ratings of instruction. Research in Higher Education, 40, 221–232.

Liegle, J., & McDonald, D. (2005). Lessons learned from online vs. paper-based computer information students' evaluation system. Information Systems Education Journal, 3(37), 3–14.

McCracken, B., & Kelly, K. (2011). Online course evaluations: Feasibility study project plan and draft report. Retrieved from http://www.docstoc.com/docs/138693848/ONLINE-COURSE-EVALUATIONS---San-Francisco-State-University

McGhee, D. E., & Lowell, N. (2003). Psychometric properties of student ratings of instruction in online and on-campus courses. New Directions for Teaching and Learning, 96, 39–48. doi:10.1002/tl.121

McPherson, M. A., Jewell, R. T., & Kim, M. (2009). What determines student evaluation scores? A random effects analysis of undergraduate economics classes. Eastern Economic Journal, 35, 37–51.

Moss, J., & Hendry, G. (2002). Use of electronic surveys in course evaluation. British Journal of Educational Technology, 33, 583–592.

Norris, J., & Conn, C. (2005). Investigating strategies for increasing student response rates to online-delivered course evaluations. Quarterly Review of Distance Education, 6, 13–29.

Nulty, D. D. (2008). The adequacy of response rates to online and paper surveys: What can be done? Assessment & Evaluation in Higher Education, 33, 301–314.

Oliver, R. L., & Sautter, E. P. (2005). Using course management systems to enhance the value of student evaluations for teaching. Journal of Education for Business, 80, 231–235.

Perrett, J. J. (2013). Exploring graduate and undergraduate course evaluations administered on paper and online: A case study. Assessment & Evaluation in Higher Education, 38, 85–93. doi:10.1080/02602938.2011.604123

Ranchod, A., & Zhou, F. (2001). Comparing respondents of e-mail and mail surveys: Understanding the implications of technology. Marketing Intelligence and Planning, 19, 245–262.

Ravelli, B. (2000, June). Anonymous online teaching assessments: Preliminary findings. Paper presented at the Annual Conference of the American Association for Higher Education, Charlotte, NC.

Rhea, N., Rovai, A., Ponton, D., Derrick, G., & Davis, J. (2007). The effect of computer-mediated communication on anonymous end-of-course teaching evaluations. International Journal on E-Learning, 6, 581–592.

Richardson, J. T. E. (2005). Instruments for obtaining student feedback: A review of the literature. Assessment & Evaluation in Higher Education, 30, 387–415.

Robinson, P., White, J., & Denman, D. W. (2004, October). Course evaluations online: Putting a structure into place. Paper presented at the 32nd Annual Association for Computing Machinery (ACM) Special Interest Group on University and College Computing Services (SIGUCCS) Conference on User Services, Baltimore, MD.

Sax, L. J., Gilmartin, S. K., & Bryant, A. N. (2003). Assessing response rates and non-response bias in web and paper surveys. Research in Higher Education, 44, 409–432.

Schawitch, M. (2005, June). Online course evaluations: One institute's success in transitioning from a paper process to a completely electronic process. Paper presented at the Association for Institutional Research Forum, Atlanta, GA.

Seal, K. C., & Przasnyski, Z. H. (2001). Using the world wide web for teaching improvement. Computers and Education, 36, 33–40.

Sheehan, K. (2001). Email survey response rates: A review. Journal of Computer-Mediated Communication, 6(2). Retrieved from http://jcmc.indiana.edu/vol6/issue2/sheehan.html

Sorenson, D. L., & Johnson, D. (Eds.). (2003). Online student ratings of instruction [Special issue]. New Directions for Teaching and Learning, 96, 1–112.

Sorenson, D. L., & Reiner, C. (2003). Charting the uncharted seas of online student ratings of instruction. New Directions for Teaching and Learning, 96, 1–24. doi:10.1002/tl.118


Stowell, J. R., Addison, W. E., & Smith, J. L. (2012). Comparison of online and classroom-based student evaluations of instruction. Assessment & Evaluation in Higher Education, 37, 465–473.

Thorpe, S. W. (2002, June). Online student evaluation of instruction: An investigation of non-response bias. Paper presented at the 42nd Annual Forum of the Association for Institutional Research, Toronto, Ontario. Retrieved from http://www3.airweb.org/forum02/550.pdf

Tucker, B., Jones, S., Straker, L., & Cole, J. (2003). Course evaluation on the Web: Facilitating students and teacher reflection to improve learning. New Directions for Teaching and Learning, 96, 81–93. doi:10.1002/tl.125

Turhan, K., Yaris, F., & Nural, E. (2005). Does instructor evaluation by students using a web-based questionnaire impact instructor performance? Advances in Health Sciences Education, 10, 5–13.

Watt, S., Simpson, C., McKillop, C., & Nunn, V. (2002). Electronic course surveys: Does automating feedback and reporting give better results? Assessment & Evaluation in Higher Education, 27, 325–337.
