Modelling Students at Risk


© Blackwell Publishing Ltd/University of Adelaide and Flinders University of South Australia 2004.

MODELLING STUDENTS AT RISK

DIANE M. DANCER

University of Sydney

DENZIL G. FIEBIG

University of New South Wales

Using a sample of several hundred students we model progression in a first-year econometrics course. Our primary interest is in determining the usefulness of these models in the identification of ‘students at risk’. This interest highlights the need to distinguish between students who drop the course and those who complete but who ultimately fail. Such models allow identification and quantification of the factors that are most important in determining student progression and thus make them a potentially useful aid in educational decision making. Our main findings are that Tertiary Entrance Rank (TER), mathematical aptitude, being female and attendance in tutorials are all good predictors of success but amongst these factors only attendance is significant in discriminating between students who fail and those who discontinue. Also, there are differences across degree programs and, in particular, students in Combined Law are very likely to pass but, if they are at risk, they are much more likely to discontinue than to fail.

I. INTRODUCTION

As funding of public education comes under increased scrutiny and universities are called upon to be more accountable for cost-effective use of public funds, there is an increasing need on the part of policy makers and universities to better understand student progression rates. At the system level, there is a considerable interest in comparing indicators of non-completion of degree programs across universities and different programs within a university. Smith and Naylor (2001) provide such an analysis for the UK. These aggregate statistics are derived from decisions made by individual students at the course level and this is where our interest lies. In addition to providing valuable course-specific information, analysis of the type we propose aims to give some insight into overall non-completion rates but from a micro-level perspective.

There is considerable research where some measure of student performance in a particular course has been modelled; see for example Anderson et al. (1994), Borg et al. (1989), Morgan et al. (1985), Reid (1983), Siegfried (1980) and Watkins (1979). These studies ignore those students who discontinue and hence their results are potentially biased by sample selection problems. Douglas and Sulock (1995) avoid this criticism by first modelling dropouts in order to use a Heckman (1979) estimation procedure to guard against potential selectivity bias in their models of performance.

Correspondence: Diane Dancer, Econometrics and Business Statistics HO4, University of Sydney, Sydney NSW 2006, d.dancer@econ.usyd.edu.au

* The authors gratefully acknowledge the helpful comments received from Bob Bartels, Pierre Uldry, two anonymous referees and the Editor, Jonathan Pincus. The second author acknowledges the hospitality of CHERE where much of this paper was written.


Studies of persistence tend to focus on a student’s decision of whether to drop out of university or not; see for example, Tinto (1975, 1993). Empirical work such as Lam (1984) and Pascarella and Terenzini (1980) has tended to investigate this decision by concentrating on the binary choice problem of whether students drop out during their first year or not and has ignored whether those who persist actually pass their courses or not. Dancer and Doran (1990) studied this conditional division into pass or fail given continuation. They use a probit analysis to model the probability of successfully completing first-year studies in the Faculty of Economics at the University of New England.

One question that has not been emphasised by these studies is whether it is possible to distinguish between different types of students at risk, where we define students ‘at risk’ to be those who either discontinue the course or who persevere but ultimately fail. A complete analysis of progression distinguishes three outcomes: the student discontinues the course or the student continues and ultimately either passes or fails. Our first objective is to identify the key factors that determine the probability of each of these three outcomes and, in particular, to explore differences between the two types of students classified as being at risk.

Using a sample of several hundred students enrolled in the Faculty of Economics at the University of Sydney, we model progression in the first-year course Econometrics I, which, at the time of this study, was a compulsory subject for most students in the Faculty of Economics and Business. Smith and Naylor (2001) find that the majority of students in their UK sample who do not complete their degree drop out in the first year of study. Thus, first-year subjects such as Econometrics I are especially important in the overall analysis of student progression.

Some knowledge of the key factors affecting students’ choices is important because discontinuing (or failing) a course involves costs to the student and to the university. For the student it may lower morale and self-esteem and may restrict their choices in later years leading to a potential delay in graduation. Such delays adversely affect aggregate progression rates, which, in turn, may be costly for the university in terms of reputation and lost government grants. From an institution’s point of view, Thomas et al. (1996) argue that student withdrawals raise questions regarding admissions procedures, course information and student care.

If students at risk can be identified early in a course, then extra help and resources can be directed towards those students in an effort to improve their chances of successfully completing the course and ultimately their degree. However, there are different kinds of help that can be given to students by the faculty, department, academic staff, student services and counselling and this aid may depend on the type of student at risk. Thus, our second objective is to use our results to comment on current teaching and admission practices as well as potential policy initiatives aimed at improving retention rates.

Students are classified into three categories according to whether they drop the course, complete the course but fail or whether they pass. While the last two categories could be considered as ordered, the presence of students who drop implies a situation where the dependent variable is categorical and unordered. Accordingly, a multinomial logit model is developed to investigate the key factors determining the ultimate outcome for each student. Because of access to extra survey information over and above that usually made available on university databases, we have an unusually rich set of explanatory variables to carry out this analysis.

II. BACKGROUND TO MODELLING

Students enter university with a diverse range of abilities, backgrounds, expectations regarding the value of their degree program and commitment to further education. In Australia, students are admitted to university largely on the basis of their high school performance. They are ranked on the basis of their Tertiary Entrance Rank (TER) score and for high demand programs, such as those provided by the Faculty of Economics and Business at the University of Sydney, there is intense competition for a limited number of places. It could be argued that this type of admission process produces a relatively homogeneous intake of students that helps to reduce the mismatch of student abilities and degree requirements. However, the TER is an overall measure of ability that is calculated as a weighted average of scores obtained in high school subjects and may not capture a student’s preparedness for a particular university course.

Smith and Naylor (2001) find this lack of course-specific preparation to be a significant factor in explaining the probability of withdrawal for some but not all disciplines. Relevant for our study is the fact that the social sciences were one of the areas where it was found to be a contributing factor. In addition to the TER score, we have measures of mathematical ability, which is probably the most relevant ability indicator for econometrics. In fact, Dancer and Doran (1990) find, using students from the University of New England, that there is a significant difference in performance between students who attempted the two highest levels of mathematics at the final examination at the end of secondary schooling and those who attempted the lower levels of mathematics.

The admission process does have the potential to produce a mismatch between programs into which university students are admitted and those they would prefer to enter. McInnis et al. (2000) find that in 1999, 20% of commencing students in Australia hoped to change programs after their first year. This only serves to exacerbate the propensity for program mismatch caused by students making their program choices without full information. Power and Robertson (1987) follow a cohort of South Australian students and find that over 60% of them report a lack of knowledge about the university and its courses, and a significant number report that the course in which they enrolled was not what they expected. Power and Robertson (1987) suggest that many school-leavers make uninformed and inappropriate choices, resulting in low commitment, poorer performance and a higher potential to withdraw. In an attempt to capture some of these factors, our sample of students was asked about their motivation for enrolling in an economics-related degree.

Once enrolled in a degree program, students choose their courses and make decisions about effort and persistence subject to the constraints imposed by the institutional framework. Because Econometrics I is compulsory, students must repeat the subject if they fail or discontinue. This implies strong incentives for students to persist and to pass the course. Students who fail must pay to repeat the course, and, because of pre-requisite structures, they may not finish their degree in the minimum time. Relative to non-core subjects, one would expect students in core courses to be more likely to persist.

The process of modelling performance is typically thought of as a production function for new knowledge; see for example Anderson et al. (1994), Douglas and Sulock (1995) and Siegfried and Fels (1979). In empirical work, student grades are taken as a (crude) measure of new knowledge acquired. The student’s ability, their level of effort and their commitment then determine performance.

Clearly students have a range of abilities implying some are more capable than others and hence are better equipped to cope with the demands of a university education. As has been mentioned, we have the student’s TER and measures of their mathematical ability. But one would expect that educational outcomes are also determined by how well the student uses their innate abilities. Do they apply themselves? Somewhat surprisingly some studies including Borg et al. (1989) do not find statistically significant effort effects. Schmidt (1983) argues that this puzzling result occurs when total time devoted to the course is used as the proxy for effort. Students must balance competing demands on their time to decide on how much total time to spend on each of their courses. But they also have some scope to allocate time to alternative study modes for any one course. The results of Schmidt (1983) show that total time spent on an economics course does not affect performance but components of that time do show up as significant determinants. Our data has several measures of effort and thus allows us to avoid the problem identified by Schmidt (1983).

A range of other socio-economic variables has typically been included in studies of student performance and persistence at university. In summarising a range of studies on teaching economics, Siegfried (1979) notes the existence of persistent gender effects in terms of performance with males doing better. For their analysis of UK dropouts, Smith and Naylor (2001) also find significant gender effects across a wide range of disciplines. Here females tended to be more persistent than males.

Variables relating to family background, such as educational attainment and occupation of parents, are also commonly used as control variables. These are taken to represent the student’s level of commitment to higher education and also the financial resources available to them. Power and Robertson (1987) find that students dependent on part-time work rather than on the financial support of their parents are more likely to drop out. Conversely, living at home may involve travelling long distances to university that may seriously affect the amount of time spent on coursework and ultimately on performance. Being physically remote from the university may also lead to problems of social isolation. According to the influential work of Tinto (1975, 1993), a student’s social and academic integration into university life is the major determinant of completion. Given these findings, it is significant that McInnis et al. (2000) highlight a trend of less attachment to university life largely because of increased time spent in paid employment by full-time students.

While the decision to persist in a degree program is potentially influenced by local labour market conditions, see for example Smith and Naylor (2001), such indicators need to vary sufficiently over the sample to be useful in the analysis. They are unlikely to be appropriate if all students are attending the same university as they are in the current work.

III. A MODELLING FRAMEWORK

For modelling purposes there is no observable index representing the degree to which a student is at risk. All we can observe is whether a particular student falls into one of three distinct categories: discontinue, fail, and pass. Our dependent variable is discrete. While it could be considered that a student who fails is, in some sense, lower than a student who passes, the ordering of discontinue relative to the other two categories is not clear. This implies that the dependent variable is an unordered, polychotomous variable. If in fact fail and discontinue can be pooled as one category then the problem reduces to describing a binary outcome.

We assume that each student has an unobserved utility associated with each of the discrete outcomes. The utility index is assumed to depend on personal characteristics and their expected grade. Individual students then choose the alternative with the highest utility. With a linear random utility model we have:

$$U_{ij} = \beta_j' x_i + \varepsilon_{ij}; \quad j = 0, 1, 2$$

and under the assumption that $(\varepsilon_{i0} - \varepsilon_{ij})$ follows a logistic distribution, this random utility framework motivates the use of the multinomial logit model. Under this model specification, the probability that the $i$th student falls into the $j$th category is given by:

$$P_{ij} = \frac{\exp(\beta_j' x_i)}{\sum_{k=1}^{m} \exp(\beta_k' x_i)}, \quad i = 1, \ldots, n, \quad j = 1, \ldots, m. \tag{1}$$

If the $m$th category is taken to be the numeraire then $\beta_m$ is normalised to zero. The interpretation of the coefficients is facilitated by considering the log odds ratio defined by:

$$\log\left(\frac{P_{ij}}{P_{ik}}\right) = (\beta_j - \beta_k)' x_i. \tag{2}$$

Thus if $\beta_{jr} > \beta_{kr}$, then an increase in the level of characteristic $r$ will increase the log odds of the student being in category $j$ rather than $k$.

One problem with the multinomial logit model is the assumption that the disturbances $\varepsilon_{ij}$ are independent and identically distributed. As can be seen from equation (2), the odds ratio only depends on alternatives $j$ and $k$ and is independent of the other alternatives. This odds ratio only holds if the disturbances are independent and this property is referred to as the Independence of Irrelevant Alternatives (IIA). If the disturbances are not independent, this implies that students view some subsets of alternatives as having more in common among themselves than they do with alternatives not belonging to the subset. Some of the alternatives are thus more closely related than others in ways that are not captured by the explanatory variables.

A natural alternative is to choose the multinomial probit (MNP) model, which provides a more flexible approach to capturing the pattern of substitution between alternatives. Despite this advantage relative to the multinomial logit (MNL) model, the MNP specification has historically been used much less in modelling because of a considerable difference in computational burden. This argument in favour of the MNL model has less force since the advent of practical simulation estimators for the MNP. However, Keane (1992) stresses another practical problem associated with the MNP. He demonstrates the existence of identification problems when choices are being explained by characteristics of the individual that do not vary across choices. This is exactly the model structure we face and attempts to estimate MNP models failed.

Staying within the logit framework, it is possible to specify a two-level nested model. There are two nested structures that could be considered here. First, at the branch level, the highest division in the tree structure, the student faces two alternatives – ‘at risk’ and ‘pass’. Conditional on a student being ‘at risk’, there are then two further alternatives – ‘discontinue’ or ‘fail’. The second nesting structure would again have two alternatives at the branch level; namely the student either discontinues or continues. If the student ‘continues’, there are two further alternatives; namely, ‘fail’ or ‘pass’.

In terms of the process we have described, the first of these nesting structures is more appropriate. If in fact all at risk students can be treated as the same then the model collapses to a binary logit represented by the initial division. This pooling of at risk categories is more problematic in the second of the nesting structures.

Another approach that provides a more general framework than the multinomial logit, and hence avoids the IIA property, is the random parameter or mixed logit model. In fact, the recent results of McFadden and Train (2000) provide strong support for this type of approach for discrete choice problems. Whether any of these extensions to the multinomial logit model is warranted for our data will need to be investigated. If the multinomial model is to be used, then the potentially restrictive IIA property needs to be tested.
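As a purely numerical illustration of the multinomial logit probabilities in equation (1), the log odds in equation (2) and the IIA property, the following minimal Python sketch may be helpful. The coefficient values, the two regressors and the function names are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math

def mnl_probabilities(betas, x):
    """Equation (1): P_ij = exp(beta_j' x_i) / sum_k exp(beta_k' x_i)."""
    scores = [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def log_odds(betas, x, j, k):
    """Equation (2): log(P_ij / P_ik) = (beta_j - beta_k)' x_i."""
    return sum((bj - bk) * v for bj, bk, v in zip(betas[j], betas[k], x))

# Hypothetical coefficients, NOT the paper's estimates.
# Rows: discontinue (numeraire, normalised to zero), fail, pass.
# Columns: constant, number of tutorials attended.
betas = [[0.0, 0.0], [-1.0, 0.2], [-2.0, 0.5]]
x = [1.0, 9.0]  # a hypothetical student who attended nine tutorials

probs = mnl_probabilities(betas, x)
pass_vs_fail = math.log(probs[2] / probs[1])

# IIA: removing 'discontinue' from the choice set leaves the
# pass-versus-fail log odds unchanged.
restricted = mnl_probabilities(betas[1:], x)
pass_vs_fail_restricted = math.log(restricted[1] / restricted[0])

print([round(p, 3) for p in probs])
print(round(pass_vs_fail, 6), round(pass_vs_fail_restricted, 6))
```

In this sketch the pass-versus-fail log odds computed from the full choice set, from the restricted choice set and directly from the coefficient difference all coincide, which is exactly the restriction that a test of IIA examines.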


IV. DATA

The data used in this study relate to all students enrolled in 1996 in the year-long Econometrics I course at the University of Sydney. There were 1054 students enrolled, some for the first time and some repeating the subject. The university database provided the student’s TER score, the degree in which they were enrolled, their age and their gender. Students in Econometrics I were streamed into three groups and this grouping was also provided by the university database. The streaming was performed on the basis of the level of mathematics undertaken in high school. Stream A students had usually completed 4 unit mathematics, Stream B – 3 unit mathematics and Stream C – 2 unit mathematics. Streaming is utilised for a number of reasons. It allows groups of approximately the correct size to fit into lecture theatres and it allows different styles of teaching to cater for the different mathematical abilities. However, all three streams have the same syllabus, assessment and final examination. The department database also provided information on the number of tutorials attended throughout the year, whether a student discontinued and the various assessment marks for the course.

An additional source of information was a survey that was conducted in Week 4 of Semester I. Week 4 was chosen because there were still a number of students enrolling in the course up to this point. All students were contacted either through tutorials or by mail. The survey was designed to collect data not typically available from the usual sources and included information about the students and their family background.

Because of non-response to the survey, 160 students were deleted leaving 894 students with relatively complete survey information. For these remaining students a further problem was the unavailability of the TER and mathematics mark from the Higher School Certificate (HSC) in 147 cases. These cases were typically complete except for these two variables and hence were not deleted. To avoid the missing mathematics marks, it was decided to use the ‘stream’ variable as a proxy for the mathematics ability. It must be acknowledged that the use of the ‘stream’ variable comes at a cost. There were a number of different tutors teaching classes in each stream. Thus, it would be expected that a tutor effect would be confounded with the ‘stream’ variable. In order to counteract the missing TER, we used the modified zero-order technique. A dummy variable was constructed to indicate the presence of a missing TER. This was then included as an explanatory variable in all of the models. The same method was employed for the variable, Yr12hrs, which also had some missing values. While we know of no formal evaluations of the use of the modified zero-order technique in discrete choice models, we suspect that the method is likely to perform well in our type of situation where there are a number of explanatory variables but only a few have missing observations and where the number of affected cases is substantial. See Greene (2000) and Maddala (1977) for further details of this method in the context of linear regression models.
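The construction of the missing-value indicator can be sketched as follows. This is a minimal Python illustration under the assumption, following the usual textbook description of the modified zero-order method, that missing values of the regressor are replaced by zero while the accompanying dummy variable absorbs the average effect for the affected cases. The TER values and the function name are hypothetical.

```python
def modified_zero_order(values):
    """Return (filled, missing_flag) for a regressor with missing entries.

    Assumption: missing entries (None) are replaced by zero, and a dummy
    equal to 1 marks the affected cases so that the model can estimate a
    separate shift for them.
    """
    filled = [0.0 if v is None else float(v) for v in values]
    missing = [1 if v is None else 0 for v in values]
    return filled, missing

# Hypothetical TER scores; None marks a student whose TER is unavailable.
ter = [85.2, None, 91.0, 78.5, None]
ter_filled, ter_missing = modified_zero_order(ter)

# Each row of the design matrix: constant, TER (zero-filled), Missing TER dummy.
design = [[1.0, t, m] for t, m in zip(ter_filled, ter_missing)]

for row in design:
    print(row)
```

Both columns then enter the model together, so that cases with a missing TER are retained in the estimation sample rather than deleted.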

Of the usable sample of 894 students, 53.7% were male and 46.3% were female; 23.0% were in Stream A, 29.0% were in Stream B and 48.0% were in Stream C. The large percentage of students in Stream C was the result of two factors – the increased enrolment in 1996 by the University of Sydney and the continued decline of students attempting the highest level of mathematics at school. Students were enrolled in a variety of degrees within the Faculty. There were two major degrees: the Bachelor of Commerce and the Bachelor of Economics. The proportion of the sample enrolled in the Commerce degree was 51.6% and in the Economics degree 23.4%. The next largest category was the Agricultural Economics degree which accounted for 11.1% of the total enrolment. The degrees of Arts/Commerce and Combined Law accounted for 6.6% and 7.4% respectively.


The variables used in this paper are defined in Table I where we have also provided the means for students in the three categories of discontinue, fail and pass. Students were deemed to have discontinued the course if they did not attempt the final examination that was at the end of the second semester. Of the 894 students, 107 (12.0%) discontinued, 150 (16.8%) failed and the remaining 637 (71.2%) passed. In broad terms there were three sets of factors that were identified as candidates to explain a student’s propensity to fall into one of these three categories.

i. Ability

The TER is an overall indicator of a student’s aptitude. More specifically, their mathematical and quantitative skills are proxied by the stream in which they were placed. To a certain extent the degree program is also an indicator of aptitude because there are marked differences in the entrance requirements for the different degrees. For example, Combined Law has one of the highest TER requirements at the University of Sydney.

ii. Effort and commitment

Indicators of effort were typically derived from the survey where students were asked questions such as the number of hours they spent studying in Year 12 outside of required school hours and whether they attended lectures always, mostly or sometimes/never. They were asked about their motivation for enrolling in an Economics related degree; the five possible responses being the TER score, did Economics at school, career, other reasons and a combination of reasons. Tutorial attendance in Semester I was also viewed as a good measure of a student’s level of effort and commitment.

iii. Socio-economic

The gender and age of the student were included here. From the survey we also had information on the education level of the student’s parents. It could be argued that age and family education might also be interpreted in terms of commitment. In the survey, students were asked how long it took on average to travel to the university. Presumably, long commutes may have a deleterious impact on the time a student has available to meet their university obligations. It may also affect how well a student integrates into university life.
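The three-way outcome that these factors are intended to explain can be coded along the following lines. This is a minimal Python sketch of the classification rule described in this section, in which a student who does not attempt the final examination is treated as having discontinued. The pass mark of 50, the function name and the example records are assumptions made for illustration only, since the pass threshold is not stated here.

```python
from collections import Counter

PASS_MARK = 50  # hypothetical threshold; the pass mark is not stated in the text

def classify(final_exam_attempted, course_mark):
    """Assign a student to one of the three outcome categories."""
    if not final_exam_attempted:
        return "discontinue"
    return "pass" if course_mark >= PASS_MARK else "fail"

# Hypothetical records: (attempted final examination?, overall course mark)
records = [(True, 72), (False, 0), (True, 41), (True, 50), (False, 35)]

outcomes = Counter(classify(attempted, mark) for attempted, mark in records)
shares = {category: count / len(records) for category, count in outcomes.items()}

print(outcomes)
print(shares)
```

Note that under this rule a student who sits the final examination is never classified as discontinue, whatever their attendance during the year.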

Comparing the three groups of students based on the means shown in Table I, the discontinue and fail groups are very similar in terms of TER, age, gender, and the distribution over streams, mother’s education, motivation and attendance at lectures. Moreover, as a subgroup, the discontinue and fail groups tend to be distinct from the pass group for most of these variables. When considering the distribution over the streams, the percentage in the pass group for both Streams A and B is higher than the fail and discontinue groups. However, this trend is reversed for Stream C. A similar occurrence appears with the distribution over attendance at lectures. Compared to the fail and discontinue groups, the pass group has a higher percentage always attending lectures although the percentages are similar if we aggregate always attending with mostly attending lectures. The percentages for sometimes/never attending lectures are relatively small.

Interestingly, when we compare the marks obtained on the mid-semester exam, the means for the discontinue and fail groups are very similar; 16.08 compared to 17.21. This would seem to indicate that there is very little difference in performance at this early stage in the semester. However, when compared to the pass group with an average of 22.18, it would seem to indicate that both the discontinue and fail groups have already fallen behind the students who ultimately pass the course.

The pattern of the means for TutAttend is not unexpected. Pass students attended more tutorials on average than fail students who attended more than discontinue students. Figure 1 provides further insights into these differences in attendance. The three groups are very similar in terms of the percentages attending five or fewer tutorials. The most pronounced difference is in terms of the percentage of students who attended nine or more tutorials where there is a sharp decline after 9 tutorials for those who fail or discontinue. Except for the bulge around 6 or 7 tutorials, the distributions of the discontinue and fail students are similar especially when compared with the pass students. In particular, notice that a substantial percentage of the discontinue students attended over half of the tutorials. This is important because we plan to use TutAttend as a proxy for student effort. Such an interpretation would be jeopardised if a low value for TutAttend simply reflected that students had already dropped the course. Better proxies could be attendance records in other courses that the student completed or the proportion of econometrics tutorials attended while still active in the course. Because neither of these was available, we have used Semester I tutorial attendance for Econometrics I remembering that it is a year-long course. Our choice is supported by the fact that of the students who discontinue, only one student did not attempt the mid-semester examination in Week 7, 70% attempted the Semester I examination and 45% actually attended some tutorials in Semester II.

Table I Means and definitions of variables

Variables                                                 Discontinue  Fail         Pass
TER – Tertiary entrance rank                              85.21        84.69        87.84
TutAttend – number of tutorials attended in Semester 1    7.78         8.76         9.69
Age                                                       18.92        18.50        18.51
Travel Time – time spent travelling to university         51.93        45.91        47.97
Yr12hrs – hours studying per week in Year 12              14.88        16.04        18.22
Female                                                    0.39         0.38         0.49
Arts/Com – Arts/Commerce degree                           0.06         0.08         0.06
AgEcon – Agricultural Economics degree                    0.11         0.19         0.09
Econ – Economics degree                                   0.29         0.23         0.23
Commerce – Commerce degree                                0.46         0.49         0.53
ComLaw – Combined Law degree                              0.08         0.01         0.09
Stream A                                                  0.15         0.14         0.26
Stream B                                                  0.20         0.23         0.32
Stream C                                                  0.65         0.63         0.42
Mother’s education – primary                              0.08         0.06         0.06
Mother’s education – secondary                            0.54         0.52         0.48
Mother’s education – tertiary                             0.38         0.42         0.46
Motivation – TER score                                    0.03         0.09         0.05
Motivation – Economics at school                          0.07         0.07         0.05
Motivation – career reasons                               0.64         0.60         0.68
Motivation – other reasons                                0.06         0.05         0.03
Motivation – combination of reasons                       0.20         0.19         0.19
Always attends lectures                                   0.55         0.60         0.70
Mostly attends lectures                                   0.39         0.38         0.26
Sometimes/never attends lectures                          0.06         0.02         0.04

Figure 1. Tutorial attendance

Another indicator of effort is the variable Yr12hrs. Here the pattern matches prior expectations, with the ranking in terms of this variable going from a low of 14.88 for the discontinue group, increasing to 16.04 for the fail group and increasing again to 18.22 for the pass group.

When considering the different degrees, it appears that the proportions for a student enrolled in an Arts/Commerce degree or a Commerce degree do not differ greatly across the groups. However, there are some discernible patterns for the Economics, Agricultural Economics and Combined Law degrees. Students enrolled in the Economics degree are over-represented in the discontinue group, those in Agricultural Economics are over-represented in the fail group while the Combined Law students are under-represented in the fail group.

In summary, there appear to be several marked differences between the pass group and the fail and discontinue groups, but fewer differences between the fail and discontinue groups. For a more complete delineation of these differences we turn to the econometric analysis.

V. ECONOMETRIC ANALYSIS

Multinomial logit results for two models are presented in Table II. The general specification, denoted by Model 1, includes a full set of explanatory variables. Model 2 contains a subset of these variables that could reasonably be available at the start of the semester. Hence a comparison of Models 1 and 2 provides an indication of the increased explanatory power associated with the extra survey questions and being able to observe tutorial attendance as the semester progresses.

Initially, the Small and Hsiao (1985) test was used to test the IIA property. For the general specification, Model 1 in Table II, the value of the Small and Hsiao test statistic is 26.37 with an associated p-value of 0.28. For Model 2 the statistic is 13.73 with a p-value of 0.25. For these calculations the restricted choice set was obtained by removing the failures. Qualitatively the same results were obtained when the discontinue group was removed. Secondly, the models were also estimated using the random parameter logit model with the choice specific constants specified as random. Here the additional parameters characterising the random parameter distributions were always insignificant and hence support the conclusion drawn from the Small and Hsiao tests that it is reasonable to use the multinomial logit model. Moreover, estimation of alternative nested structures yielded inadmissible values of the inclusive value parameters and attempts to estimate the MNP model failed because of the fragility of the identification noted

2004 MODELLING STUDENTS AT RISK 167

© Blackwell Publishing Ltd/University of Adelaide and Flinders University of South Australia 2004.

Table II

Multinomial logit estimates

Variables

Model 1 Model 2

Fail vs Discontinue

Pass vs Discontinue

Pass vs Fail

Fail vs Discontinue

Pass vs Discontinue

Pass vs Fail

Constant 0.77 (3.08)

6.58** (2.68)

7.36***(2.48)

3.17 (2.58)

0.17 (2.10)

3.34* (2.01)

Female

0.09 (0.28)

0.36 (0.24)

0.45** (0.20)

0.05 (0.27)

0.50** (0.22)

0.55***(0.19)

Arts/Com 0.54 (0.61)

0.52 (0.56)

1.06**(0.44)

0.51 (0.60)

0.43 (0.52)

0.93**(0.42)

AgEcon 0.68 (0.45)

0.12 (0.42)

0.56* (0.32)

0.83* (0.43)

0.30 (0.39)

0.54* (0.30)

Commerce 0.13 (0.35)

0.35 (0.31)

0.48* (0.26)

0.26 (0.33)

0.05 (0.27)

0.31 (0.25)

ComLaw

1.75** (0.87)

0.97* (0.54)

0.77 (0.78)

1.75** (0.85)

0.59 (0.47)

1.16 (0.77)

Stream A 0.19 (0.41)

1.43*** (0.36)

1.24***(0.30)

0.10 (0.40)

1.07*** (0.32)

0.98***(0.29)

Stream B 0.30 (0.34)

0.87*** (0.30)

0.58** (0.25)

0.28 (0.34)

0.75*** (0.28)

0.48**(0.24)

TER

0.01 (0.03)

0.04* (0.02)

0.05** (0.02)

0.002 (0.03)

0.05** (0.02)

0.05***(0.02)

Missing TER

0.80 (2.37)

3.34 (2.07)

4.13** (1.76)

0.03 (2.23)

4.47** (1.85)

4.50***(1.64)

Age

0.10 (0.09)

0.02 (0.08)

0.08 (0.08)

0.16* (0.08)

0.16** (0.07)

0.005 (0.07)

TutAttend 0.22*** (0.07)

0.60*** (0.07)

0.38***(0.06)

Mother’s education – primary

0.24 (0.56)

0.35 (0.48)

0.11 (0.42)

Mother’s education – secondary

0.18 (0.29)

0.31 (0.26)

0.13 (0.21)

Travel Time

0.01** (0.005)

0.01** (0.005)

0.0007 (0.004)

Motivation – TER score 1.24* (0.74)

0.92 (0.72)

−0.33 (0.42)

Motivation – Economics at school

0.24 (0.59)

0.27 (0.54)

0.03 (0.43)

Motivation – career reasons

−0.16 (0.34)

0.07 (0.30)

0.23 (0.26)

Motivation – other −0.05 (0.70)

−0.61 (0.63)

−0.56 (0.54)

Always attends lectures 0.52 (0.81)

−0.61 (0.62)

−1.13 (0.69)

Mostly attends lectures 0.62 (0.80)

−0.70 (0.62)

−1.32* (0.69)

Yr12 hrs 0.01 (0.01)

0.03** (0.01)

0.02* (0.01)

Missing Yr12hrs 0.85 (0.86)

1.32* (0.80)

0.47 (0.52)

Log likelihood −598.5 −668.1934R2 (LR test) 0.16 (224.5) 0.05 (85.23)

Note: Standard errors are in brackets. *** significant at 1% level, ** significant at 5%, * significant at 10%.

168 AUSTRALIAN ECONOMIC PAPERS JUNE

© Blackwell Publishing Ltd/University of Adelaide and Flinders University of South Australia 2004.

by Keane (1992). While, a priori it is natural to be suspicious of the MNL model with its IIAproperty, it proved to be a reasonable representation of these data.

A number of other sensitivity issues were addressed. Firstly, a number of students have been excluded from the analysis because of non-response to the survey. The basic characteristics of the students who did not respond to the survey and those who did respond are remarkably similar when considering the gender, the degree and the stream for the students. Further, when students who completed the course are classified as non-respondents and respondents, there is no significant difference between the means of the mid-semester test and the means of the first semester examination. In addition, Model 2 was also re-estimated using all 1054 students. The results from this are very similar to the reported results for Model 2. Consequently, it is assumed that there is no selection bias introduced by ignoring those students who did not respond to the survey.

We have previously mentioned that endogeneity may be a problem with the variable, TutAttend. Douglas and Sulock (1995) overcome this problem by prorating their effort variables. This option is not available for our students as we do not know the exact date of their withdrawal. Making the extreme assumption that students withdrew in the week following their last tutorial attendance allows us to calculate a proxy for the percentage of tutorials attended. When this proxy is used in the analysis, the results are qualitatively the same as those presented thus indicating that endogeneity from this source is not a concern.
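One possible reading of this proxy is sketched below. The function name `attendance_proxy`, the week-by-week attendance record and the treatment of students who never attended are all assumptions made for illustration; the text does not say exactly how the proxy was computed.

```python
from typing import Sequence


def attendance_proxy(attended: Sequence[bool], completed: bool) -> float:
    """Percentage of available tutorials attended (hypothetical helper).

    attended[i] is True if the student went to the tutorial in week i + 1.
    For students who did not complete, the withdrawal date is unknown, so we
    assume (as the text does) that they left in the week after their last
    attendance; tutorials after that week are not counted as available.
    """
    n_attended = sum(attended)
    if completed:
        available = len(attended)
    elif n_attended == 0:
        return 0.0  # never attended, so no tutorials are counted as available
    else:
        last_week = max(i for i, present in enumerate(attended) if present)
        available = last_week + 1
    return 100.0 * n_attended / available if available else 0.0


record = [True, True, False, True, False, False]
print(attendance_proxy(record, completed=False))  # 3 of the first 4 weeks
print(attendance_proxy(record, completed=True))   # 3 of all 6 weeks
```

Under this reading a student who stops attending early is not penalised for tutorials held after the assumed withdrawal week, which is the sense in which the assumption is extreme.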

Both Models 1 and 2 have reasonable R² values for this type of data while the LR tests indicate significant relationships. These measures of fit involve comparisons with a base specification containing intercepts but no explanatory variables. In comparing the significance of the additional explanatory variables contained in Model 1 relative to Model 2, the LR test statistic is 139.31 which when compared to a chi-square critical value with 24 degrees of freedom yields a p-value that is less than 0.001. The addition of the survey and TutAttend variables adds significantly to the explanatory power of the model.

Can the two groups of students ‘at risk’ be considered as one or are there significant differences between them that justify the current treatment of them as separate groups? In order to test the null hypothesis that the two groups can be pooled into one homogeneous group, the test due to Cramer and Ridder (1991) was used. The test statistic is asymptotically distributed as a chi-square with degrees of freedom equal to the number of restrictions imposed.

For Model 2, the Cramer and Ridder test statistic is 16.75 with 10 degrees of freedom and a p-value of 0.080. Thus the difference between the two groups of ‘at risk’ students is significant at a 10% but not 5% level of significance. The evidence is clearer for Model 1 where the test statistic is 38.14 with 22 degrees of freedom and a p-value of 0.018. On balance, the statistical evidence favours a distinction between students who discontinue and those who fail, particularly when the model incorporates the survey variables and the number of tutorials attended in Semester I. We now turn to a discussion of these and other estimation results.
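The p-values quoted for the LR test and the Cramer and Ridder tests can be checked from the reported statistics and degrees of freedom. The sketch below is not the authors' code: the helper `chi2_sf_even_df` is a hypothetical name, and it relies on the closed form of the chi-square upper tail that holds only for even degrees of freedom, which happens to cover all three tests.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For df = 2m the survival function has the closed form
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0          # k = 0 term
    for k in range(1, df // 2):     # k = 1, ..., m - 1
        term *= half / k
        total += term
    return math.exp(-half) * total


# (statistic, degrees of freedom) as reported in the text
tests = {
    "LR, Model 1 vs Model 2": (139.31, 24),
    "Cramer-Ridder, Model 2": (16.75, 10),
    "Cramer-Ridder, Model 1": (38.14, 22),
}
for name, (stat, df) in tests.items():
    print(f"{name}: chi2({df}) = {stat}, p = {chi2_sf_even_df(stat, df):.4f}")
```

Rounded to three decimals, the two Cramer and Ridder p-values come out at 0.080 and 0.018, and the LR p-value is far below 0.001, in line with the figures in the text.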

Consider first the results for Model 1 and recall equation (2) where we noted that the estimated coefficients do not represent marginal responses. Instead, a positive coefficient in the first column indicates that students with higher values of the variables will be more likely to fail rather than discontinue the course. Similarly, in column 3 a positive coefficient indicates that higher values of the variable indicate a greater likelihood of passing rather than failing.
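As a reminder of what the three columns measure, the log-odds form of the multinomial logit can be written as below. The notation is assumed here for illustration and need not match the symbols used in equation (2): $x$ is the vector of explanatory variables, and $\beta_F$ and $\beta_P$ are the coefficient vectors for fail and pass, with discontinue as the base outcome.

```latex
\ln\frac{P(\text{fail})}{P(\text{discontinue})} = x'\beta_F, \qquad
\ln\frac{P(\text{pass})}{P(\text{discontinue})} = x'\beta_P, \qquad
\ln\frac{P(\text{pass})}{P(\text{fail})} = x'(\beta_P - \beta_F).
```

The third column of each model is therefore the second column minus the first. For TutAttend in Model 1, for example, 0.60 − 0.22 = 0.38.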

In order to gain further insight into these results, they will be discussed in conjunction with simulated probabilities that have been produced for some representative students. An initial base case is defined as a female student, aged 18 years, enrolled in a Commerce degree, in Stream C, who attended 10 tutorials in Semester I, had no travel time to university, spent 15 hours studying outside of school in Year 12, whose motivation was a combination of reasons, whose mother had a tertiary education and who always attended lectures. In Figure 2, the predicted probabilities of discontinuing, failing or passing are provided for a range of TER values for this base student. Then in Table III, probabilities are presented for each of the three groups (discontinue, fail and pass) for different scenarios compared with the base case. The base case is the same as used in Figure 2 but now the TER score is fixed at 85.

TER is widely recognized as a strong predictor of success at University. Our results are consistent with this belief. As illustrated by Figure 2, students with higher TER scores are much more likely to pass and less likely to fail or discontinue. However, TER does not help very much in discriminating between the fail and discontinue groups. From Table II and Figure 2, a higher TER implies a lower probability of failing relative to discontinuing but this impact is small and not precisely estimated.

Assuming stream is a proxy for quantitative and mathematical ability, the estimates for the stream variables are as expected. Tables II and III illustrate that a Stream A or B student is much more likely to pass and less likely to fail or discontinue compared to the base student who is in Stream C. Again though, stream is not a significant determinant of differences between students who fail and those who discontinue.

Figure 2. Probabilities for the base student

Table III. Probabilities associated with different types of student

Student characteristics                         Discontinue   Fail    Pass
Base case (TER = 85)                            0.057         0.197   0.746
Changes from base
  Male not female                               0.072         0.273   0.655
  Travel time of 30 minutes not zero            0.075         0.190   0.735
  Motivation – TER not combination of factors   0.022         0.263   0.716
  TutAttend = 11 not 10                         0.034         0.148   0.818
  Combined Law not Commerce (TER = 98)          0.072         0.034   0.893
  Stream A not C                                0.017         0.070   0.913
  Stream B not C                                0.027         0.126   0.847
  Yr12hrs = 25 not 15                           0.046         0.175   0.779

Overall ability as measured by TER and mathematical aptitude as proxied by stream are significant and important in predicting success in Econometrics I. The one other factor associated with ability that has proven to be statistically significant is whether the student is enrolled in Combined Law. Notice that the negative estimate for pass relative to discontinue for Combined Law in Table II does not imply that these students are less likely to pass. Rather it is the odds ratio of pass relative to discontinue that falls when a Combined Law student is compared to an Economics student. Similarly, the odds ratio of pass relative to discontinue for our base student in Table III is (0.746/0.057) = 13.2 while for our Combined Law student this ratio falls to 12.4. The interesting result here is that Combined Law students, who require extremely high TER scores to gain admittance, are highly likely to pass but if they are at risk they tend to discontinue rather than fail. It may be that these students have relatively high expectations from their University course and are more prone to become discouraged. Alternatively, some may enrol in the course more because they have the TER score than because they are truly interested in the program.
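The odds comparison can be reproduced from the rounded probabilities in Table III. The short sketch below is illustrative only and its variable names are assumptions. With the rounded entries the base-case odds of pass relative to discontinue come out at about 13.1 rather than the 13.2 quoted above, a gap that is presumably due to rounding in the published probabilities.

```python
# Predicted probabilities (discontinue, fail, pass) copied from Table III.
table_iii = {
    "base": (0.057, 0.197, 0.746),          # Commerce, TER = 85
    "combined_law": (0.072, 0.034, 0.893),  # Combined Law, TER = 98
}

ratios = {}
for label, (p_disc, p_fail, p_pass) in table_iii.items():
    ratios[label] = {
        "pass_vs_disc": p_pass / p_disc,
        "fail_vs_disc": p_fail / p_disc,
    }
    print(f"{label}: pass/discontinue = {ratios[label]['pass_vs_disc']:.1f}, "
          f"fail/discontinue = {ratios[label]['fail_vs_disc']:.2f}")
```

The odds of failing relative to discontinuing fall from roughly 3.5 for the base student to below 0.5 for the Combined Law student, which is the "discontinue rather than fail" pattern described in the text.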

The TutAttend variable is highly significant and indicates that the more tutorials a student attended, the higher the probability of passing relative to discontinuing, of failing relative to discontinuing, and of passing rather than failing. The magnitude of these impacts is illustrated in Table III, which reports the estimated change in probabilities due to increasing tutorial attendance from 10 to 11.
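The link between the TutAttend coefficients in Table II and the corresponding row of Table III can be illustrated with a small calculation. In a multinomial logit, changing one covariate by a given amount rescales each outcome's probability by the exponential of its coefficient times the change, after which the probabilities are renormalised. The function name `shift_probabilities` is hypothetical, and the inputs are the rounded published values, so this is a sketch rather than a reproduction of the authors' simulation.

```python
import math


def shift_probabilities(base, coefs, delta):
    """Update MNL probabilities when one covariate changes by `delta`.

    `base` maps outcome -> probability before the change; `coefs` maps
    outcome -> coefficient of the covariate relative to the reference
    outcome (the reference itself has coefficient 0).
    """
    weights = {k: p * math.exp(coefs[k] * delta) for k, p in base.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Base case from Table III and Model 1 TutAttend coefficients from Table II.
base = {"discontinue": 0.057, "fail": 0.197, "pass": 0.746}
tut_attend = {"discontinue": 0.0, "fail": 0.22, "pass": 0.60}

new = shift_probabilities(base, tut_attend, delta=1)  # 10 -> 11 tutorials
print({k: round(v, 3) for k, v in new.items()})
```

The result is approximately 0.034, 0.148 and 0.818 for discontinue, fail and pass, which agrees with the "TutAttend = 11 not 10" row of Table III.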

Like TutAttend, Yr12hrs is interpreted as a proxy for a student’s level of motivation. The estimated coefficients indicate that the more hours a student worked in Year 12, the more likely the student is to pass relative to discontinuing, to fail relative to discontinuing and to pass relative to failing. In terms of the scenario represented in Table III, studying 25 hours per week in their final year of high school rather than 15 hours leads to only modest changes in predicted probabilities associated with passing, failing or discontinuing.

The estimated coefficients for travel time are negative and significant for the two comparisons relative to the discontinue base. The longer a student spends travelling to university the more likely they are to discontinue relative to either failing or passing. In terms of the magnitude of these changes, Table III indicates that an increase of 30 minutes in travel time reduces the probability of passing for the base student from 0.746 to only 0.735 and has little impact on the probability of failing, but the estimated probability of discontinuing increases from 0.057 to 0.075. Thus, travel time is one factor that is important in discriminating between students with a tendency to discontinue relative to failing.

The only motivation variable that is significant is the TER score for fail relative to discontinue. This implies that a student who chose an economics-related degree based on their TER score is more likely to fail than discontinue. In terms of the scenario in Table III, the probability of failing changes from 0.197 to 0.263.

The estimated gender effect indicates that females are more likely than males to pass relative to failing or discontinuing. In Table III, changing the base student from a female to a male decreases the probability of passing from 0.746 to 0.655 with this change being balanced by increases in the probability of failing and discontinuing.

Comparing the estimation results for the variables common to both Models 1 and 2, a relatively stable pattern of signs, magnitudes and significance levels is observed. Thus, without the benefit of the extra explanatory variables collected in the survey and of monitoring tutorial attendance, the inferences associated with the remaining variables do not seem unduly affected by omitted variable biases. The joint significance of the additional variables contained in Model 1 has previously been established. Thus, there is predictive power in these additional explanatory variables.

VI. CONCLUSION

An attempt has been made to isolate factors that are important in determining whether a student is likely to pass, fail or discontinue Econometrics I at the University of Sydney. Our results indicate that high school performance measured by TER score and mathematical aptitude are good predictors of success in Econometrics I. Conversely students with lower TER scores and who have weaker mathematical abilities are more at risk of failing. There are two implications to be drawn from this result. Firstly, any pressure to increase access to a university education raises a dilemma. Under current entry procedures, increasing student places means admitting students with lower TER scores. This raises the potential conflict between the goals of increasing access and decreasing non-completion. The solution may lie in alternative entrance procedures that better match students to programs but these invariably come with increased administration costs. Secondly, the policy of streaming students in Econometrics I by their mathematical ability allows for more efficient targeting of this at-risk group. For example, students with weaker mathematical backgrounds are provided with longer tutorials.

However, TER and mathematical ability are not significant in discriminating between students who fail and those who discontinue. Therefore, those students targeted for extra or remedial work to improve their chances of passing are not necessarily the same students who may require counselling in order to help them cope with university pressures.

Students enrolled in the Combined Law degree are very likely to pass Econometrics I and, if they are at risk, they are much more likely to discontinue than to fail. It is possible that this propensity to discontinue is a reflection of unfulfilled expectations about the course or that their choice of degree was driven more by the sense of achieving a high TER score than by a true interest in the program. This interpretation is consistent with McInnis et al. (2000) who suggest that approximately a third of students have made ‘poor choices and are reluctant participants even after six months at university’. Again, alternative entrance procedures may help to alleviate such problems.

Recent work by Rodgers (2002) provides an especially relevant comparison with the current analysis because it considers performance in a basic statistics course in an Australian university. Her work indicates that while attendance is positively correlated with performance, schemes to induce students to attend more tutorials do not necessarily improve performance. Our results confirm the first of Rodgers’ conclusions: the more tutorials an Econometrics I student attended, the higher the probability of passing relative to either failing or discontinuing. Students who attend fewer tutorials early in the course are also more likely to discontinue than to fail. This suggests that attendance should be carefully monitored with students contacted early if their attendance becomes spasmodic. However, given the second conclusion of Rodgers (2002) it is not obvious that the solution is for the student to improve their attendance record.

The estimated gender effect indicates that females are more likely than males to pass relative to failing or discontinuing and that this effect is substantial. Rodgers (2002) also found that females performed better, although the estimated effect she reported was not statistically significant. Such findings are somewhat at odds with studies of performance in economics courses surveyed by Siegfried (1979). However, these studies are now somewhat dated and it could be that the type of female student taking economics and business courses has changed over time.

Because TER, mathematical background, degree program and gender are all known at the start of semester they have the potential to be used early on to identify students at risk. These indicators can identify broad groups of potential at-risk students and help target extra interventions. Amongst the extra variables that were collected as part of this study, tutorial attendance seems to be the most useful addition to this base list of indicators, providing a means to accurately identify individual students who may need assistance. Some of the other socio-economic variables and some indicators of commitment and effort were significant but, with the possible exception of travel time, their impact on probabilities was small. Thus, it is not clear whether the extra effort in collecting this information is justified.


The analysis presented suggests that more work needs to be done to better understand factors that lead students to drop or fail a course. Replication of the study in other institutions and for other courses would be useful in determining the robustness of the results. Especially useful would be consideration of later-year subjects where problems are potentially different because of the self-selected nature of continuing students. It is quite possible that qualitative data drawn from case studies could also provide useful complementary information to the quantitative analysis provided here. We are not only thinking of students who drop courses. It would also be helpful to identify and interview students who have successfully managed to overcome difficulties in their University education. How did these students manage? What resources – both at university and in the community – did they utilise and find most helpful? It is likely that our quantitative models would be helpful in identifying such students.

REFERENCES

Anderson, G.A., Benjamin, D. and Fuss, M.A. 1994, ‘The Determinants of Success in University Introductory Economics Courses’, Journal of Economic Education, vol. 25, pp. 99–119.

Borg, M.O., Mason, P.M. and Shapiro, S.L. 1989, ‘The Case of Effort Variables in Student Performance’, Journal of Economic Education, vol. 20, pp. 308–313.

Cramer, J.S. and Ridder, G. 1991, ‘Pooling States in the Multinomial Logit Model’, Journal of Econometrics, vol. 47, pp. 267–272.

Dancer, D.M. and Doran, H.E. 1990, ‘Prediction of the Probability of Successful First Year University Studies in Terms of High School Background: With Application to the Faculty of Economic Studies at the University of New England’, Working Papers in Econometrics and Applied Statistics No. 48, University of New England.

Douglas, S. and Sulock, J. 1995, ‘Estimating Educational Production Functions with Correction for Drops’, Journal of Economic Education, vol. 26, pp. 101–112.

Greene, W.H. 2000, Econometric Analysis, 4th ed, Macmillan Publishing Company, New Jersey.

Heckman, J.J. 1979, ‘Sample Selection Bias as a Specification Error’, Econometrica, vol. 47, pp. 153–161.

Keane, M.P. 1992, ‘A Note on Identification in the Multinomial Probit Model’, Journal of Business and Economic Statistics, vol. 10, pp. 193–200.

Lam, Y. 1984, ‘Predicting Dropouts of University Freshmen: A Logit Regression Analysis’, Journal of Educational Administration, vol. 22, pp. 74–82.

Maddala, G.S. 1977, Econometrics, McGraw-Hill International Editions, Singapore.

McFadden, D. and Train, K. 2000, ‘Mixed MNL Models for Discrete Response’, Journal of Applied Econometrics, vol. 15, pp. 447–470.

McInnis, C., James, R. and Hartley, R. 2000, ‘Trends in the First Year Experience’, Report 00/6, Evaluations and Investigations Programme, Higher Education Division, DETYA, Canberra.

Morgan, R.G., Cornick, M.F. and Kauder, W.F. 1985, ‘Predicting Student Success in Intermediate Accounting I’, Journal of Education for Business, vol. 61, pp. 80–84.

Pascarella, E.T. and Terenzini, P.T. 1980, ‘Predicting Freshman Persistence and Voluntary Dropout Decisions from a Theoretical Model’, Journal of Higher Education, vol. 51, pp. 60–75.

Power, C. and Robertson, F. 1987, ‘Selection, Entry Requirements and Performance in Higher Education’, CTEC Performance in Higher Education Study, Report No. 2, National Institute of Labour Studies Incorporated, Flinders University, South Australia.

Reid, R. 1983, ‘A Note on the Environment as a Factor Affecting Student Performance in Principles of Economics’, Journal of Economic Education, vol. 14, pp. 18–22.

Rodgers, J.R. 2002, ‘Encouraging Tutorial Attendance at University did not Improve Performance’, Australian Economic Papers, vol. 41, pp. 256–266.

Schmidt, R. 1983, ‘Who Maximizes What? A Study in Student Time Allocation’, American Economic Review (Papers and Proceedings), vol. 73, pp. 23–28.

Siegfried, J.J. 1979, ‘Male-Female Differences in Economic Education’, Journal of Economic Education, vol. 10, pp. 1–11.

Siegfried, J.J. 1980, ‘Factors Affecting Student Performance in Law School Economics Courses’, Journal of Economic Education, vol. 12, pp. 54–60.

Siegfried, J.J. and Fels, R. 1979, ‘Research on Teaching College Economics: A survey’, Journal of Economic Literature, vol. 17, pp. 923–969.

Small, K.A. and Hsiao, C. 1985, ‘Multinomial Logit Specification Tests’, International Economic Review, vol. 26, pp. 619–627.

Smith, J.P. and Naylor, R.A. 2001, ‘Dropping out of University: A Statistical Analysis of the Probability of Withdrawal for UK University Students’, Journal of the Royal Statistical Society, A, vol. 164, pp. 389–405.

Thomas, M., Adams, S. and Birchenough, A. 1996, ‘Student Withdrawal from Higher Education’, Educational Management and Administration, vol. 24, pp. 207–221.

Tinto, V. 1975, ‘Dropout from Higher Education: A Theoretical Synthesis of Recent Research’, Review of Educational Research, vol. 45, pp. 89–125.

Tinto, V. 1993, Leaving College: Rethinking the Causes and Cures of Student Attrition, (2nd ed), University of Chicago Press, Chicago.

Watkins, D. 1979, ‘Prediction of University Success: A Follow-Up Study of the 1977 Internal Intake to the University of New England’, Australian Journal of Education, vol. 23, pp. 301–303.