
Too little and too much trust: Performance measurement in Australian higher education

Abstract:

A striking feature of contemporary Australian higher education governance is the strong

emphasis on centralized, template-style, metric-based and consequential forms of performance

measurement. Such emphasis is indicative of a low degree of political trust among the central

authorities in Australia in the intrinsic capacity of universities and academics to do their work

efficiently and effectively. At the same time, it is indicative of a political trust in highly

centralized and top-down forms of performance measurement, and their capacity to effectively

control and coordinate the work of universities and academics. In this paper we argue with

regard to performance measurement that these patterns of trust and mistrust embody

contradictory assumptions regarding the agency and motivations of universities and academics,

and prevent adequately coming to terms with the unintended effects that the current

performance measurement regime has on universities and the academics working in them.

This is an accepted manuscript of a paper published in 'Critical Studies in Education' on 26/08/2014. The paper is available online at: http://www.tandfonline.com/doi/full/10.1080/17508487.2014.943776


Introduction

In 2005, Max Corden, a distinguished economist, began a reflection on higher education

governance in Australia in the following way:

The Soviet System crashed in 1985, thanks principally to Mr. Gorbachev. I shall call it the “Moscow System”. It became clear – at least to those to whom it had not been clear before – that the Soviet central planning system had been a failure. There was apparently no one left to defend it. Thus, it was a surprise that, just after that time, a mini-version of this system, with all the mentality that goes with it – but applying only to higher education – was apparently being constructed in Canberra. I shall call this little capital city Moscow on the Molonglo, the Molonglo being the river that runs through Canberra, and which was dammed to make the Canberra Lake.

(Corden, 2005, p. 7)

Corden went on to discuss examples and effects of over-centralized planning and management

of higher education in Australia, beginning with the Dawkins reforms under a Labor

Government in the late 1980s, and continuing through successive government changes up to

and including the Liberal (Conservative) government under the Education Minister Brendan

Nelson, in 2004. More recently (and yet more changes of government and minister later),

Corden’s theme is echoed in the comments of one of the academics we were interviewing in

2012 as part of a research project investigating knowledge-building in physics and history at

Australian schools and universities (discussed further below). Reflecting upon the changing

conditions for academic work in Australia, and associating the strong emphasis on centrally set,

measurable performance targets, not with a useful form of accountability, but rather with

“micro-managerial Stakhanovism”, he said:

So yeah, that’s what—that’s the thing that I most resent about the current um, performance management culture, is that it is far too dirigiste, far too micro-controlled. (…) Well, I couldn’t tell you how it changed over time because regrettably I have known nothing else but what I call the Stakhanovist output model. (…). I mean, as a historian I can say this with a certain sense of irony but the first two countries, the first two men to introduce these systems were Joseph Stalin with the four year plans or the five year plans and Hermann Göring with the five year plans. And we all know where those stories ended.

(Interview 34, Historian, HEJV 080612).

In this paper we want to consider the ways in which academic knowledge work in Australian

universities is now being governed and managed and the impact of those ways on that

knowledge work. In employing the term ‘knowledge work’ we are drawing attention to the

substantive work of academics in their teaching and research activities, the ways in which they

contribute to the knowledge of the population via the students they teach, and their own


intellectual inquiry and outputs. The contrast here is with considering knowledge and its effects

via more ‘black box’ terms such as ‘productivity’, ‘efficiency’ or ‘cost’ that leave the distinctive

content of the activity unexamined.

This is a period which has already seen considerable global discussion of some of the political

and policy changes impacting on universities and their work. Previous research has drawn

attention to broad directions of these changes: to the spread of what has been called ‘New

Public Management’ and its associated forms of policy by measurement (Hood, 1991; Hood &

Peters, 2004; Lingard, 2011); to the associated rise of what has been labelled the ‘audit society’

with its new forms of accountability (Power, 1997; Shore, 2008); and to the way universities

have been increasingly impacted upon by new global rankings and benchmarking exercises

(Marginson & Considine, 2000; Whitley & Gläser, 2007; Blackmore, Brennan, & Zipin, 2010;

Lauder, Young, Daniels, Balarin, & Lowe, 2012). In this paper we focus more specifically on the

constitution and use of performance measurement in the governance of Australian higher

education, and from a perspective whose central interest concerns the substantive knowledge-

building and dissemination dimension of academic work. We emphasize three interrelated

features of performance measurement which we suggest in Australia take a more intense and

less qualified form than they do in other countries with which Australia often benchmarks itself:

a rather centralized form of ‘command and control’; a preference for ‘one-size-fits-all’

templates; and a use of numbers in a way that constantly tends to reduce measures of quality to

measures of quantity.

The form of this paper is broadly discursive. Using comparative examples from other higher

education systems, as well as ‘on the ground’ experiences of those teaching and researching in

Australian universities, we explore the potentially perverse and distorting effects of the specific

form of performance measurement in use in Australian higher education today. The paper

arises from the research interests and the professional experiences of both authors.1 It also

draws on interview evidence with Australian academics, taken from a current Australian

Research Council funded research project in which the authors are involved. In this project,

Knowledge Building across Schooling and Higher Education, we are focusing on questions about

knowledge in contemporary times. In particular we are interested in the extent to which

1 One author worked for six years in a senior university research leadership role where she was involved in various ways in developing institutional processes or practices that would comply with Australian government policy and funding requirements and templates, including those associated with the national research quality assessment, ERA. The work included regular meetings with representatives of all faculties to review ‘bottom up’ and ‘top down’ research performance pressures in different kinds of disciplines. The other author previously worked in a professional cross-university role concerned with higher degree students and external and internal policies related to these.


disciplines remain epistemically important in the changing world; the ways in which they are

affected by concerns about competencies and graduate attributes, interdisciplinarity,

collaboration with industry and the like; and the impact of new, performance-based governance

mechanisms on the work of both schools and universities. To give focus and manageability to

this inquiry, we have confined our empirical study to those working in two fields: physics and

history, chosen as core disciplines respectively of the sciences and the humanities. We

interviewed both academics and teachers but for the present paper our interview evidence of

‘on the ground’ perspectives is drawn from the 53 academics we interviewed, who were

sourced from twelve different Australian universities, and who covered a range of different

career stages. We use verbatim brief quotes from the interviews to illustrate the arguments we

are making here, but the argument itself reflects our overall analysis of this substantial body of

data. The semi-structured open-ended interviews were framed to have interviewees talk about

their knowledge work and their perspective on their discipline as a changing entity, and

included questions about the research and teaching they are doing, what matters to them in

their work, and their perception of changes, including the changes to the assessment of their

own work performance.

Our main argument is that the current emphasis on centralized, top-down and formulaic forms

of performance measurement in the governance of academic work in Australia embodies a

peculiar tension, even contradiction. On the one hand, this emphasis is symptomatic of the ‘New

Public Management’ belief that academics, just as other professional actors, are “in need of

monitoring and incentive structures, if their performance is to be improved” (Lewis, 2013, p.

73). This belief and the governance practice it supports speak of little, maybe too little, trust in

academics and their intrinsic motivations. Moreover, as has been noted by Michael Power, this

lack of trust may itself also prove to be costly, for trust, while itself involving an (albeit implicit

and informal) form of accountability, “releases us from the need for checking” (Power, 1997, p.

2).2 On the other hand, the strong emphasis on performance measurement mechanisms in

Australian higher education is indicative of a strong, arguably too strong, degree of trust among

the relevant authorities in performance measurement as an effective instrument of governance,

where the use and usefulness of performance measures is often simply taken for granted.

2 To clarify, we do not advocate here the naïve view that trust alone is sufficient to coordinate the work of universities and academics alike. Rather, and in broad reference to Power’s line of argument developed in The Audit Society (1997), we hold that both trust and (some formal forms of) accountability have their place and legitimacy, but that the present period has seen an unbalanced (and largely unchecked) expansion of performance assessment mechanisms and the associated regimes of accountability.


The resulting tension between various forms of trust and mistrust, we ultimately argue, is

indicative of a contradictory ‘managerial’ stance toward the governance of academic work.

Fundamental to this stance is an understanding of academics (and also of whole universities) as

extrinsically motivated strategic actors, who require various forms of control (in the form of

performance measurement) as well as sanctions, positive or negative, in order to do their job

properly. Yet on the other hand, within the very same stance, the possibility that these actors

may resort to overly strategic responses to the performance measurement regimes they are

subjected to in their work is commonly downplayed and ignored. The same applies also to any

effects resulting from these strategic responses that may be contrary to the desired effects of

the performance measurement system.

Notwithstanding our opening quotes, our concern in this paper is not to label these

developments politically in particular ways, or to denigrate by association, but rather to

illustrate the ways in which these measures are detrimental to the work of universities. We do

not underestimate the difficulty of managing the huge and complex organizations that

universities have become, nor deny the need for some defensible, non-arbitrary criteria and

mechanisms for making appointments, distributing research funds, and the like. But the

discussion here is an attempt to make explicit some of the tensions and contradictions inherent

in the ways in which performance measurement is currently used in Australia, and to gesture

toward the kinds of impact the particular configuration of performance measures in Australia is

having on the central knowledge work of universities and on the academics working in them.

Performance Measurement in Australian Higher Education: Key Characteristics

Recent times have seen a proliferation of institutionalized forms of performance measurement

and of associated performance management instruments in a wide range of public institutions.

The underlying idea, often loosely referred to as ‘New Public Management’ (see Hood, 1991;

Hood & Peters, 2004), is that the introduction of competition-based governance mechanisms

into the administration of publicly funded institutions, and a greater focus on (quasi-) markets,

will lead to improvements in these institutions’ accountability and efficiency. In practical terms,

NPM is mainly based on two core pillars: first, the development and use of formal measurement

systems to evaluate performances and to distribute funding according to the results (or

‘outputs’) achieved, and second, the devolution of responsibilities for the management and

administration of those activities to the various organizational actors whose activities are to be

evaluated. The first countries to introduce NPM reforms were New Zealand and Australia in the


early 1980s. Subsequently the core ideas spread, first throughout the Anglosphere, and then, to

various degrees, throughout much of the rest of the world (see Pollitt & Bouckaert, 2004).

With specific regard to higher education, the proliferation of consequential performance

measures can be regarded as being symptomatic of a wider loss of trust within liberal

democracies in universities and academic work (see Weingart, 2013), the latter being painted as

largely irrelevant, ‘esoteric’, and conducted in an inefficient manner.

policymakers, the conviction is common that the rigid application of consequential performance

measurement regimes to universities by a central government authority constitutes an effective

remedy for this situation (see Marginson, 1997; Marginson & Considine, 2000; Lewis,

2013).

It was Labor Education Minister John Dawkins in 1988 who first comprehensively introduced

performance-based funding mechanisms into the governance of universities in Australia. He

justified this governance reform in terms of overcoming perceived inefficiencies in the

Australian higher education system and of making possible a drastic expansion of the provision

of university education. The associated policy whitepaper identified as a key mechanism a

“funding system that responds to institutional performance and the achievement of mutually

agreed goals”, and a system where funds are allocated on the basis of “a range of output, quality

and performance measures” (Dawkins, 1988, p. 85). Since then, Australian universities have

been subjected to an increasingly complex array of governmental performance measurement

instruments, with the allocation of funds attached to a range of predefined output measures.

The universities themselves have reacted to this increase in scrutiny and the increased

competition for research funding mainly through internally replicating the measures and the

associated performance-based funding systems that have been imposed on them from without,

often in a quite undiscerning manner (see Marginson & Considine, 2000; Gläser, Lange, Laudel, &

Schimank, 2010; Lewis & Ross, 2011).

As we will elaborate in the following sections, what sets the current situation in Australian

higher education governance apart from those in other countries is not so much the emphasis

on institutionalized forms of performance measurement and management per se, but rather the

tendency to exaggerate certain elements of this. The situation in Australia is distinguished by a

governance system for Australian universities that is very strongly centralised in terms of

control and coordination; by a strong tendency toward using ‘one-size-fits-all’, output-focused

indicators in the measurement of performances; and, related to this, by assessment that is

almost exclusively geared towards quantitative indicators, eschewing other, more devolved and

locally contextualised and quality-focused forms of evaluation.


‘Command and Control’

If I was strategic I would probably be really pursuing a project on the environment that I

was invited to because there’s potentially a lot of funding in that area.

(Interview 25, Historian, HNJV 270312)

At the structural level, a decisive step toward a centrally administered, competition-based mode

of control of university-based research activities was the creation of the Australian Research

Council (ARC) in 1988. To the present day, the ARC is responsible for providing competitive

grants for basic and applied research activities in all fields except those with a clinical direction

undertaken in medicine and the health sciences. Responsibility for competition-based research

funding for the latter fields is with the National Health and Medical Research Council (NHMRC),

which was formed in 1937 and gained the status of an independent statutory agency in 1992

(Larkins, 2011, p. 168).

The formal performance measurement machinery in Australia tends to be organized in a highly

centralized and top-down manner, both at the nation-system level and at the university level,

where the centre sets the “rules of the game” (Marginson, 1997, p. 65) by determining to a

significant extent the relevant performance criteria and targets. Moreover the institutionalized

forms of performance measurement common in Australian higher education, particularly those

assessing research, also tend to be highly formalised, repeated at relatively frequent intervals,

and consequential for those who are evaluated – an example of what Richard Whitley (2007)

would call a ‘strong’ research evaluation system.

Through the ARC and NHMRC, the Australian federal government has attained a degree of

control over the strategic funding of research undertaken at Australian universities that is

arguably stronger than that of governments in many other liberal democracies.3 A number of

factors contribute to this.

First, like many other OECD countries, the Australian government over the course of the last

two to three decades has significantly reduced the discretionary funding that is allocated to

3 It is true that part of the government strategy is intended to force universities to find funding elsewhere – industry, benefactors, international sources – and Australian universities have been doing this with different degrees of success. It is also true that state governments are significant funders (some more than others). Yet the broad point about the degree to which Australian higher education is over-heavily oriented to the rules of the game at the centre still holds. One of the puzzling issues is why annual league tables of ARC and NHMRC outcomes have so much salience, when that funding is actually reducing as a proportion of overall university funding.


universities to distribute internally for their research, while increasing the funds that are

allocated, on a competitive basis, through the competitive grant system. Moreover the

government-sanctioned competitive grant schemes provided by the ARC and the NHMRC have a

flow-on strategic importance because ‘Category 1’ research funds of this kind are differentially

important (compared with other research income) in gaining further infrastructure funding

from government and status in the comprehensive governmental research ‘quality’ assessment

‘Excellence in Research for Australia’ (ERA).

Second, with regard to the decision-making processes of its major research funding bodies, the

Australian government holds considerably more sway than many of its counterparts in other

liberal democracies. While both the ARC and the NHMRC were granted the status of

independent statutory authorities in 2001 and 1992 respectively, they have been subjected to

an increasingly hands-on form of governmental control in recent times, and, at least occasionally,

have been subjected to direct political interference. In the instance of the NHMRC, a policy

change in 2006 gave the Government the right to appoint the NHMRC’s CEO (see Larkins, 2011,

p. 169), which, at least indirectly, increased the power of the central authorities over the

strategic direction and inner operations of the funding body. In the instance of the ARC, one

important change was the Government-mandated requirement to include ‘research areas of

national priority’ (see Larkins, 2011, p. 79) from 2002 onward as criteria within ARC

assessment processes. This change heralded the move toward a decidedly more ‘substantive’

(Whitley, 2011) form of the governance of research, where the government attempts to steer

research activities through the setting of research objectives and priorities, rather than leaving

this mostly to the researchers themselves.

Adding to this move toward a more substantive form of research governance, the Australian

Government can also exert influence over the actual ARC grant allocation processes. As with

other national research councils, ARC grants are awarded following multiple forms of peer

review and ranking, overseen by central academic panels (the ‘College of Experts’ in the case of

the ARC). But, in contrast to other national research councils, all final decisions concerning the

award of individual research grants require approval by the responsible federal minister (Yates,

2004, p. 110). And there have been highly controversial, publicly known instances where that

right of veto has been exercised, under Education Minister Brendan Nelson (see Haigh, 2006).

Such a level of direct government interference in the research grant decision-making process of

major public funding bodies, in the instance above somewhat ironically exercised by a self-

proclaimed liberal government, would be politically unthinkable and/or formally impossible in

many other liberal democracies with comprehensive public science systems. For example, in the


case of the major national research funding body in Germany (the German Research

Foundation), formally a self-governing organization, the academic researchers on the various

boards can outvote the representatives of the federal ministries in all final decisions concerning

the allocation of research grants.

Third, compounding the degree of centralised governmental control over the broad direction

and allocation of research funding, Australia has by international standards an unusually sparse

research funding landscape in terms of diversity of funding sources. This contrasts markedly

with the multiple large funding bodies of the USA, and also with the situation in most European

countries, where researchers can apply for research funds both to their national funding

bodies and to the European Research Council. Naturally, one cannot really fault the

Australian government for this situation – the level of philanthropic investment into research in

Australia remains comparatively meagre, as does, by international standards, the level of

private investment into research and development (see Larkins, 2011, p. 257).

‘One-Size-Fits-All’

I just finished an article. It took me two years to write it. Of about 20,000 words for

Germany’s most prestigious review in ancient history. It’s a 50 page, double whopper. It’s

on something that hasn’t been studied before and it will be cited for a long time to come

(…). What does it get me here? It gets me one abstract point. As opposed to colleagues for

example who publish a thousand words in five articles and get five points.

(Interview 34, Historian, HEJV 080612)

The centralized and top-down organisation of the research governance and evaluation process

in Australia is directly reflected in the pronounced tendency to rely upon predefined ‘one-size-

fits-all’ performance indicators that can be applied by a central authority uniformly across a

large number of disparate domains (Butler, 2003; Gläser & Laudel, 2007). More localised,

qualitative and dynamic indicators and more consultative forms of assessment are largely

eschewed. Competitive research income, for example, is a measure used as a proxy of quality

across universities of widely different size, history and demographics; and research income and

quantity of publications or citations are increasingly used to measure and compare

performances across academic disciplines, regardless of long-entrenched differences in

research and publication practices (within disciplines as well as between them).


One example is the current governmental system for the assessment of the quality of university-

based research activities, ‘Excellence in Research for Australia’ (ERA). Compared to research

assessment systems developed in other countries, ERA relies disproportionately heavily on a

range of quantified output measures (e.g., number of publications, citation numbers, research

income – the latter actually being an input), and makes comparatively little use of direct expert

peer-review.4 It is assumed that the quantitative output measures are themselves proxies for

peer-reviewed quality, but while they are built on peer assessment they conflate differences

between sub-fields when pulled out for a template comparison. This particular focus on uniform

output measures, it has been noted in the higher education literature, is not only problematic

with regard to the measurement of research, but also does not properly “account for the

multiple-product character of universities, various organizational missions, and effects on the

quality of services produced” (Enders, de Boer, & Weyer, 2013, p. 21).

Moreover, there is little scope within ERA for those who are subjected to assessment to provide

any sort of contextualising information in addition to the reported outputs. And the central

assessment authorities in turn provide little contextualizing information about the assessment

scores back to the evaluated universities, or at least this is not formally provided for, which

means that the flow of information between evaluating and evaluated agencies tends to remain

of a rather one-dimensional, ‘flat’ nature. This contrasts markedly with the approach regarding

the evaluation of research taken in the Netherlands, for example. In the Netherlands, one of the

countries that pioneered institutionalized forms of research assessment on the European

continent, the process of evaluation remains far more consultative, and the measures more

multi-dimensional, than their counterparts in Australia. In the Netherlands, not only do the

evaluated agents (usually university-based research programs or institutes) have the

opportunity to contextualise their reported outputs and to raise concerns, both in writing and in

person, to the evaluators (see KNAW, 2009, pp. 15-16), but the evaluating bodies also provide a

qualitative rationale of half a page for every score achieved (p. 21). Finally, in the Netherlands,

the actors responsible for the evaluation process also have some leeway in adapting the

evaluation process in a way that reflects specific disciplinary traditions, or, for that matter,

interdisciplinary research approaches (KNAW 2010, p. 6). This flexibility with regard to

interdisciplinary research in particular is significantly restricted in the Australian environment

4 As in other countries, ERA does use academic expert panels (the ‘Research Evaluation Committees’) for final review of the data and submissions, but the disciplinary spread, workload and unselective submission process reduce the weight of the panel stage as an independent ‘peer review’ part of the process. There is also some peer review of textual and creative arts items, but within constraints that will be referred to below.


with its stronger focus on rigid disciplinary classifications and one-size fits all templates (see

Woelert & Millar, 2013).

‘What Can’t Be Counted Doesn’t Count’

There’s a tremendous drive to put a number to everything. So we are rated, we have a Q

index or whatever it is, and there’s something else now, some other index which I’ve

forgotten, and then we have the ERA type rating, of 1 to 5 for groups of people. The

university is rated in a certain way, the students now, we have a rating system on your

lectures and I think in many cases it’s an attempt to put a number to something you can’t

put a number to.

(Interview 32, Physicist, PESQ 260412)

Although measurement or counting is a central element of new public management, and

although league tables and benchmarking are a widely noted phenomenon in higher education

today, what is striking about their use in Australia is the extent to which there is a

conflation of quality and quantity; and the extent to which there is an attempt to use numbers

produced in one context or for one purpose in another to which they are not equally well fitted.

The distinctive form of ERA is a classic example. In most other countries where such large-scale,

nationwide research assessments take place (and, notably, they do not take place in the country

still widely regarded as the current benchmark standard, the USA) they are focused on attempts

to elicit and assess evidence of quality, or ‘excellence’ in the rhetoric of the Australian scheme.

Most ask for a portfolio submission designed to demonstrate and illustrate the achievements of

that department or individual. ERA requires that all research outputs be submitted, counted,

averaged. Although ‘quality’ rankings are produced from the exercise, the mechanism is

weighed down and distorted by the vast workload (at both the submission and evaluation end)

and the multiple technical complexities required to attempt to measure everything. The

technical rules are focused on making sure no researcher employed by a university escapes the

net, and that no low-level publications that could be submitted escape scrutiny.

This approach once again signifies the mistrust Australian governments have in relation to

universities. The relevant government department responsible for higher education (whose

name frequently changes) already receives annual research reports from all universities

accounting for publications and funding, together with verification that they have done this accurately,

and the department and ARC and NHMRC themselves also audit the claims made by universities.


Yet, this kind of data is now required to be resubmitted and recategorized5 just for ERA

purposes. The technical requirements to re-submit in line with a different set of technical norms

impose huge extra work and costs on the universities that must be set against the funding

available for core activities such as research and teaching. Furthermore, the approach taken

(micro-scrutiny of all activity) also epitomises the underlying concern with exposing

weaknesses, thus demonstrating the ‘stick’ mentality that underpins ERA.

Another element of the ERA approach is an overly simplistic rendering of the relationship

between quantity and quality. One fundamental tenet of ERA is that appropriate indicators for

the assessment of the ‘quality’ of research activities must be “quantitative”, that is, “objective

measures that meet a defined methodology that will reliably produce the same result,

regardless of when [or] by whom principles are applied” (Australian Research Council, 2012a, p. 1).

Despite this concern of ERA with science-based measurement, following strong lobbying from the

relevant councils it was forced to include some direct peer review of outputs in fields in the

humanities, social sciences and creative arts. For this part of the process the ERA submission rules

required that 30% of all items included in the numerical submission be uploaded to a

repository to enable review (see Australian Research Council, 2012b, p. 7). But an official

interpretation of the rules for the 2012 ERA confirmed that universities were not allowed here

to simply select the best work for the repository, but rather had to submit a ‘representative’

sample – interpreted as being similar proportions of conference papers, book chapters and the

like as had been part of the numerical reportage. This effectively turns a quality assessment (an

effort to discover ‘is this discipline producing leading work in an international perspective?’)

into a different question: ‘where does the average or median publication output of this

department rank on a scale of world average/above world average?’.

This is not comparing like with like when set against the UK, NZ or Hong Kong assessments,

for example, where a more selective submission of a smaller number of best-work publications

is required for evaluation. Within a discipline or within disciplinary knowledge production, both

in the sciences and the HASS disciplines, research is built using different kinds of

communications (early conference papers, work in progress, later more substantially theorised

work) as the major work develops over time. An approach that assumes that using the average

or median work is an appropriate way of measuring what is being achieved in terms of quality

in research, or that assumes that potentially a research group could have a profile consisting

5 For example, within Australian universities, research income is usually assigned to the administrative unit – research group, centre, department or faculty – and for annual reporting purposes it usually only has to be accounted for at overall university level; but for ERA, it has to be unbundled and rebundled and submitted against disciplinary codes.


entirely of highest ranking journal publications, shows an approach to research that detaches

the measure from the activity of knowledge production in a way that is likely to be distorting of

that activity. For example, some universities have responded internally to the ERA assessment

regime by restricting what kind of conference paper communication can now be undertaken by

their academics – to prevent the ‘dilution’ of their publication output, as it were. This already

indicates that the focus on the total number of outputs, or on artificially constructed averages, is

likely to produce gaming responses in various forms, as we will illustrate in the next section.

Performance Measures and Academic Knowledge Work

The preceding discussions have highlighted some of the salient characteristics of the form of

performance measurement commonly in use in Australian higher education today. In this

section we explore some of the implications, actual and potential, of this form of performance

measurement for academic work in Australia.

The academics we interviewed were generally supportive of the notion that academic work

should be subjected to various forms of critical evaluation, and also demonstrated considerable

recognition of the legitimacy of some attention to the societal relevance and impact of their

research. Yet there was concern that the current forms of measuring and managing academic

work in Australia were distorting and potentially counterproductive to the aim of building good

research and teaching.

(…) you need to produce a certain amount of research, so it has to be a sufficiently safe in

that you can predict at the outset that the grant is going to fund work that is doable and

produce a result. There might be targets in science that are actually much more interesting

and fundamentally significant, that you wouldn’t have a go at because you’ve got to keep

publishing or perish.

(Interview 42, Physicist, PESN 120912)

Among the physicists we interviewed there was a strong understanding that academic work has

always been competitive and open to measurement and comparisons of various kinds, and that

this fulfils a necessary self-regulating function. But there was also considerable concern

regarding the mechanical use of one-dimensional quantity measures by management at various

levels outside the physics department as proxies for the quality of academic work, and a

concern that this use can create a distortion of the process and purpose of academic work:


(…) and the metrics now have gone absolutely berserk in terms of research, you know with

ERA and everything. Everything now is focused down on those KPIs and achieving KPIs and

there’s this, I think it’s ridiculous that there’s this sort of blinkered view of that’s what the

world is all about, you must publish papers, you must meet these KPIs. (…) I think in a lot of

people’s minds objectives and outcomes have been replaced by KPIs. The KPIs are just a

measure of an outcome, of trying to achieve an objective and in my mind, in research terms

the objective is simple, to do high quality, impacting research.

(Interview 40, Physicist, PNSN, 260712)

One repeatedly voiced concern was that while quantitative performance data (e.g. number of

publications or citations) potentially provide useful clues to research quality, when they are

used as an abstract rating that speaks for itself, they constitute an inadequate tool for assessing

research performances. In one instance, a Head of Department from a research-intensive

Australian university began by talking about the fact that he does refer to quantitative

publication and citation data in making decisions about appointments and promotion, but he

then went on to talk of his misgivings about the use of bald measures as an abstracted quality

assessment tool:

I mean, I think if somebody writes very few papers and hardly gets them cited, you know,

that does tell you something important, okay? So with suitable interpretation and suitable

averaging and a suitably intelligent interpretation of the data [it is useful… but…]. You see

CVs of scientists who have 500 papers. Whenever I see that, I know that you know, they’re

very good at collaborating with other people. That’s what it tells me. Or they’re a lab

director or something like that. (…).There are others who’ve written very few papers and

have won Nobel Prizes because the papers they’ve written have been very important. So

what our system doesn’t capture very well is the importance of papers. It uses citations to

gauge impact, and it equates impact with importance. And sometimes it is, and sometimes

it isn’t.

(Interview 13, Physicist, PESV 181111)

And with regard to teaching, another physicist commented on the narrowing of useful

information that came with the transition toward a centrally administered, more standardised

form of teaching quality assessment:


The fact that you impose a generalised instrument across the campus means that it is not

as good as the highly specific ones that we had for physics in the old days. So we now have

a cruder instrument that’s a bit more um machine-readable and more quantitative.

(Interview 22, Physicist, PESN 150312)

A number of academics from both physics and history noted that the problem with current

forms of research performance measurement was not merely that they had some inadequacies

as an assessment tool, but that they also have come to drive behaviours in a range of

undesirable ways.

One example is the effect of strategic drives to maximise research productivity at the

expense of teaching quality:

I can’t stand the ARC system where some people can carve out careers purely on research.

You know I come from the United States where even the big superstar professors, they all

teach.

(Interview 44, Historian, HNJN 140912).

Another is the concern about driving quantity at the expense of quality and apparent

productivity at the expense of actual productivity:

And the second thing, is this, um, what I think this unhealthy trend and again it is sector

wide, to set targets, which I think is, well counterproductive and creates dynamics that will

ultimately subvert the quality of the work.

(Interview 34, Historian, HEJV 080612)

Then, you know, there’s other people that will just push out whatever dross they can get

together so they have high publication counts. And, people know that but the metrics hide

that quality aspect of research. It’s the same problem across the board.

(Interview 23, Physicist, PESN 150312)

A related concern involves the effects on publication and collaboration practices:

I certainly reviewed in recent times sort of papers that are very similar to something else in

the scripts published with minor changes in what I would see as an attempt to increase the

number of publications. And I guess subconsciously too, instead of putting together one big

publication that has a lot of stuff in it, there is a drive more and more to break it down into


smaller publications because with each publication well, you’ve got another publication

and you’ve got more chance of a citation and so I would say yes, that drive to put a number

to everything has a flow on effect in a range of ways.

(Interview 32, Physicist, PESQ 260412)

There’s quite a game now trying to get yourself in collaboration so you can get on a paper,

so that you have twenty papers even though you’ve only done the same amount of work

you’d normally do.

(Interview 43, Physicist, PESN 130912)

These strategic forms of adaptation to output measurement regimes and the associated

performance targets mentioned here are commonly referred to in the public administration

literature as “output distortion” (Bevan & Hood, 2006). Generally speaking, output distortions

occur when attempts to achieve higher output targets come at the cost of significant but

unmeasured aspects of performance. If the focus of performance measurement is almost

exclusively on quantity of outputs (as tends to be the case in Australian higher education), then

this, it has been noted, “may lead agents to increase the volume of outputs at the expense of

quality” (Holmstrom & Milgrom, 1991, p. 25). And, notably, the observations made by some of

our interviewees regarding the occurrence of ‘output distortions’ in Australian higher education

align with the findings of some earlier bibliometric studies indicating that such adaptation

effects indeed have been occurring in Australia as a result of basing performance evaluations,

and ultimately (some) research funding, on raw publication counts (Butler, 2003).

The phenomenon of ‘output distortion’ brings us to the discussion of a more deep-seated

problem with the current form of performance measurement in vogue in Australian higher

education. In some instances, the use of heavily centralized and consequential forms of

performance measurement indeed has proven to bring about efficiency increases as regards the

production of outputs (see Verbeeten, 2008). Yet there is also a wide range of evidence, both

historical and recent, that the same performance measurement regimes can stimulate a range of

strategic adaptations commonly referred to as ‘gaming’ which may ultimately have a

detrimental effect on the actual outcomes achieved (Smith, 1995; Hood, 2006; Pollitt, 2013). For

example, the setting of minimum performance targets has a tendency to produce ‘threshold’

and ‘ratchet effects’ (see Hood, 2006). Minimum targets may lift poor performers up but may

also demotivate high performers (threshold effect), and actors may decide not to perform to the


best of their capacity in one year if “next year’s targets are based on last year’s performance”

(ratchet effect) (Pollitt, 2013, p. 352). Again, our project interviews produced a number of

unsolicited comments about such ‘gaming’ practices being learnt or observed today in Australia.

There are indications that in Australian higher education, such ‘gaming’ of performance

measures has also occurred at the level of whole universities and their reporting to

government. With the 2012 round of ERA, gaming could be achieved, for example, by submitting

a small ‘discipline’ group that appears to comprise only the very leading performers, while

others who work with that team are allocated elsewhere; or by changing staff coding so that

people working in lower level research roles in research teams are no longer classified as

researchers but as professional support and do not need to be included. Further significant

scope for strategic responses within the context of ERA lies within the disciplinary coding that is

applied to publication outputs, and related to this, within the actual discipline codes that are

ultimately submitted by the universities as units of assessment. Comparative analysis of 2012

and 2010 ERA data has shown a significant reduction of such units of assessment. A more

benevolent reading of this change would be that universities have begun focusing their research

portfolios around areas of particular strengths, but it may also directly reflect the fact that

“universities undertook a significant strategic rationalisation of the disciplines they chose to be

assessed” (Larkins, 2012, p. 3).

It is obvious that such strategic responses jeopardize both the validity and legitimacy of

performance measurement as an objective governance mechanism. This may explain why

occurrences of gaming commonly tend to be ignored for some time, or are generally not readily

acknowledged publicly by the assessing authorities. In the case of ERA, for example, the CEO of

the Australian Research Council, Prof. Aidan Byrne, has recently publicly responded to

allegations that ERA was systematically gamed by universities by stating that he was “pretty

convinced that institutions are using ERA in a positive way to develop their research”

(Trounson, 2013).6

‘Too little and too much trust’

6 To give Prof. Byrne some credit, in the same context of discussion he also admitted to some problems regarding the rigid Field of Research (FoR) classification system underpinning ERA. Also, in 2012, Prof. Byrne publicly indicated that Australian universities may have strategically reclassified their research outputs in the ERA 2012 round to attain better scores (see Trounson, 2012).


In contemporary higher education, performance measurement systems are employed to ‘steer’

and ‘audit’ and to drive productivity and efficiency. But universities work with complex

pressures and agendas, including concerns about income, teaching, research, public profile, and

international benchmarking. We have argued that the use of the particular kinds of performance

measurement systems and management approaches discussed in this paper produces

unintended and perverse effects, especially on the particular kind of work that is

fundamental to universities: knowledge work.

In this article, we have drawn attention to the way in which the approach to managing academic

performances in Australia today seems to embody some conflicting assumptions about the

nature of the knowledge workers and their motivations: an anomaly we referred to as ‘too little’

and ‘too much’ trust. Here of course we were not talking about a psychological category of trust,

or attempting to judge the actual beliefs and motivations of the policy makers and managers,

but pointing to a conception of motivation and steering evident in the policies and strategies

themselves. On the one hand there are multiple micro-management measures, with a heavy

‘stick’ component, seemingly premised on an assumption that such external drivers are

essential for good teaching and research to take place. In the case of Australia, the extent of the

centralization of performance measurement and of the consequential impact of it, and the

extent of the micro-management and preference for dealing with abstracted numerical

measures are notable. On the other hand, there is too little recognition of the creative agency of

the academics (and also of whole universities) in finding ways to meet the letter of the

indicators in the most strategic way, even if this means undermining their apparent intent. What

counts as a publication point, what ERA chooses to count, the ever-rising expectations of

research productivity and its effect on teaching commitment are all part of the latter dynamic.

We have tried to show from our interviews how some of the unintended and perverse effects of

the machinery of performance measurement currently in place in Australia are being seen and

interpreted by the academics themselves. What these interviews have highlighted is that this

machinery is distorting in its attempt to squeeze knowledge work that is highly diversified,

complex and dynamic and thus also somewhat unpredictable as to its results into one common

and inflexible grid of metrics. We have further illustrated by way of selected comparative

examples that while some form of performance measurement is now common in academic work

almost everywhere, that the kinds of top-down, micro-managed and template approach often

used in Australian higher education is not the only way to proceed.


Acknowledgements: We want to acknowledge financial support from the ARC for the project

which we draw on in this paper: DP 110102466, Knowledge Building in Schooling and Higher

Education: policy strategies and effects. We also want to acknowledge our many fruitful

discussions with our research colleagues on this project, Victoria Millar and Kate O’Connor, and

excellent research assistance for the qualitative analysis stage of the project from

Huong Thi Lan Nguyen.

References:

Australian Research Council (2012a). ERA indicator principles. Retrieved from http://www.arc.gov.au/pdf/era12/ERA_2012_Indicator_Principles.pdf.

Australian Research Council (2012b). ERA 2012 submission guidelines. Retrieved from http://www.arc.gov.au/pdf/era12/ERA2012_SubmissionGuidelines.pdf.

Bevan, G., & Hood, C. (2006). What’s measured is what matters: Targets and gaming in the English public health care system. Public Administration, 84(3), 517–538. doi: 10.1111/j.1467-9299.2006.00600.x

Blackmore, J., Brennan, M. & Zipin, L. (Eds.) (2010). Re-positioning university governance and academic work. Rotterdam: Sense.

Butler, L. (2003). Modifying publication practices in response to funding formulas. Research Evaluation, 12(1), 39-46. doi: 10.3152/147154403781776780

Christensen, T. (2011). University governance reforms: Potential problems of more autonomy? Higher Education, 62(4), 503–517. doi: 10.1007/s10734-010-9401-z

Corden, W. M. (2005). Australian universities: Moscow on the Molonglo. Quadrant Magazine, XLIX(11), 7–20.

Dawkins, J. S. (1988). Higher education: A policy statement (Parliamentary paper 202/88). Canberra: Department of Employment, Education and Training.

Enders, J., de Boer, H., & Weyer, E. (2013). Regulatory autonomy and performance: The reform of higher education re-visited. Higher Education, 65(1), 5–23. doi: 10.1007/s10734-012-9578-4

Gläser, J., Lange, S., Laudel, G., & Schimank, U. (2010). The limits of universality: How field-specific epistemic conditions affect authority relations and their consequences. In R. Whitley, J. Gläser, & L. Engwall (Eds.), Reconfiguring knowledge production: Changing authority relationships in the sciences and their consequences for intellectual innovation (pp. 219–324). Oxford: Oxford University Press.


Gläser, J., & Laudel, G. (2007). Evaluation without evaluators: The impact of funding formulae on Australian university research. In R. Whitley & J. Gläser (Eds.), The changing governance of the sciences: The advent of research evaluation systems (pp. 127–151). Dordrecht: Springer.

Haigh, G. (2006). The Nelson touch. Research funding: The new censorship. The Monthly, 12. Retrieved from: http://www.themonthly.com.au/monthly-essays-gideon-haigh-nelson-touch-research-funding-new-censorship-214.

Holmstrom, B., & Milgrom, P. (1991). Principal-agent analyses: Incentive contracts, asset ownership, and job design. Journal of Law, Economics, & Organization, 7, 24-52.

Hood, C. (1991). A public management for all seasons? Public Administration, 69(1), 3–19. doi: 10.1111/j.1467-9299.1991.tb00779.x

Hood, C. (2006). Gaming in targetworld: The targets approach to managing British public services. Public Administration Review, 66(4), 515–521. doi: 10.1111/j.1540-6210.2006.00612.x

Hood, C., & Peters, G. (2004). The middle aging of New Public Management: Into the age of paradox? Journal of Public Administration Research and Theory, 14(3), 267–282. doi: 10.1093/jopart/muh019

KNAW – Royal Netherlands Academy of Arts and Sciences (2009). Standard evaluation protocol 2009-2015: Protocol for research assessment in the Netherlands. Retrieved from: https://www.knaw.nl/en/news/publications/standard-evaluation-protocol-sep-2009-2015?set_language=en

Larkins, F. P. (2011). Australian higher education research policies and performance 1987-2010. Carlton: Melbourne University Press.

Larkins, F. P. (2012). ERA 2012 (part 2): Discipline research profile changes 2010-2012. Retrieved from: http://www.lhmartininstitute.edu.au/userfiles/files/Blog/FLarkins_HE%20Research%20Policy%20Analysis_ERA2012_pt2_Mar2013.pdf

Lauder, H., Young, M., Daniels, H., Balarin, M., & Lowe, J. (Eds.) (2012). Educating for the knowledge economy? Critical perspectives. London: Routledge.

Lewis, J. (2013). Academic governance: Disciplines and policy. New York: Routledge.

Lewis, J., & Ross, S. (2011). Research funding systems in Australia, New Zealand and the UK: Policy settings and perceived effects. Policy & Politics, 39(3), 379–398. doi: 10.1332/030557310X520270

Lingard, B. (2011). Policy as numbers: Ac/counting for educational research. Australian Educational Researcher, 38(4), 355-382. doi: 10.1007/s13384-011-0041-9

Marginson, S. (1997). Steering from a distance: Power relations in Australian higher education. Higher Education, 34(1), 63–80. doi: 10.1023/A:1003082922199


Marginson, S., & Considine, M. (2000). The enterprise university: Power, governance and reinvention in Australia. Cambridge: Cambridge University Press.

Pollitt, C. (2013). The logics of performance management. Evaluation, 19(4), 346–363. doi: 10.1177/1356389013505040

Pollitt, C., & Bouckaert, G. (2004). Public management reform: A comparative analysis. Oxford: Oxford University Press.

Power, M. (1997). The audit society: Rituals of verification. Oxford: Oxford University Press.

Shore, C. (2008). Audit culture and illiberal governance: Universities and the politics of accountability. Anthropological Theory, 8(3), 278–298. doi: 10.1177/1463499608093815

Smith, P. (1995). On the unintended consequences of publishing performance data in the public sector. International Journal of Public Administration, 18(2-3), 277–310. doi: 10.1080/01900699508525011

Trounson, A. (2012, December 12). Quantity deserves reward in ERA. The Australian. Retrieved from: http://www.theaustralian.com.au/higher-education/quantity-deserves-reward-in-era/story-e6frgcjx-1226534762281

Trounson, A. (2013, April 26). Academics lose out in ERA juggling. The Australian. Retrieved from: http://www.theaustralian.com.au/higher-education/academics-lose-out-in-era-juggling/story-e6frgcjx-1226629409408

Verbeeten, F. H. M. (2008). Performance management practices in public sector organizations: Impact on performance. Accounting, Auditing & Accountability Journal, 21(3), 427–454. doi: 10.1108/09513570810863996

Weingart, P. (2013). The loss of trust and how to regain it: Performance measures and entrepreneurial universities. In L. Engwall & P. Scott (Eds.), Trust in universities (pp. 83–95). London: Portland Press.

Whitley, R., & Gläser, J. (Eds.) (2007). The changing governance of the sciences: The advent of research evaluation systems. Dordrecht: Springer.

Whitley, R. (2007). Changing governance of the public sciences: The consequences of establishing research evaluation systems for knowledge production in different countries and scientific fields. In R. Whitley & J. Gläser (Eds.), The changing governance of the sciences: The advent of research evaluation systems (pp. 3–27). Dordrecht: Springer.

Whitley, R. (2011). Changing governance and authority relations in the public sciences. Minerva: A Review of Science, Learning, and Policy, 49(4), 359–385. doi: 10.1007/s11024-011-9182-2

Woelert, P., & Millar, V. (2013). The ‘paradox of interdisciplinarity’ in Australian research governance. Higher Education, 66(6), 755–767. doi: 10.1007/s10734-013-9634-8.

Yates, L. (2004). What does good education research look like? Maidenhead: Open University Press.