A comparison of nearest neighbours, discriminant and logit models for auditing decisions


INTELLIGENT SYSTEMS IN ACCOUNTING, FINANCE AND MANAGEMENT
Intell. Sys. Acc. Fin. Mgmt. 15, 23–40 (2007)
Published online in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/isaf.283

Copyright © 2007 John Wiley & Sons, Ltd.

A COMPARISON OF NEAREST NEIGHBOURS, DISCRIMINANT AND LOGIT MODELS FOR AUDITING DECISIONS

CHRYSOVALANTIS GAGANIS,a FOTIOS PASIOURAS,b CHARALAMBOS SPATHISc AND CONSTANTIN ZOPOUNIDISa*

a Financial Engineering Laboratory, Department of Production Engineering and Management, Technical University of Crete, University Campus, Chania, 73100, Greece
b School of Management, University of Bath, Bath, BA2 7AY, UK
c Division of Business Administration, Department of Economics, Aristotle's University of Thessaloniki, 54124, Thessaloniki, Greece

SUMMARY

This study investigates the efficiency of k-nearest neighbours (k-NN) in developing models for estimating auditors' opinions, as opposed to models developed with discriminant and logit analyses. The sample consists of 5276 financial statements, out of which 980 received a qualified audit opinion, obtained from 1455 private and public UK companies operating in the manufacturing and trade sectors. We develop two industry-specific models and a general one using data from the period 1998–2001, which are then tested over the period 2002–2003. In each case, two versions of the models are developed. The first includes only financial variables. The second includes both financial and non-financial variables. The results indicate that the inclusion of credit rating in the models results in a considerable increase both in terms of goodness of fit and classification accuracies. The comparison of the methods reveals that the k-NN models can be more efficient, in terms of average classification accuracy, than the discriminant and logit models. Finally, the results are mixed concerning the development of industry-specific models, as opposed to general models. Copyright © 2007 John Wiley & Sons, Ltd.

1. INTRODUCTION

All UK companies are required by company law to prepare financial statements for their shareholders. These statements, which are under the legal responsibility of the directors, must comply with law and accounting standards (e.g. UK Generally Accepted Accounting Principles (GAAP), International Financial Reporting Standards (IFRS)1). With the exception of very small2 companies, financial statements must then be audited by UK registered auditors. At the end of the examination, the auditor must prepare a report that contains a clear expression of opinion on the financial statements and on any further matters required by statute or other requirements applicable to the particular engagement (Statement of Auditing Standards—SAS 600.5).

* Correspondence to: C. Zopounidis, Financial Engineering Laboratory, Department of Production Engineering and Management, Technical University of Crete, University Campus, Chania, 73100, Greece. E-mail: [email protected]
1 As of 2005, listed companies in the EU have to prepare their financial statements in accordance with IFRS. The remaining companies can prepare their statements using either national GAAP or IFRS. In the UK, SAS 600 was also superseded by International Standard on Auditing (UK and Ireland) (ISA UK and Ireland) 700 'The Auditors' Report on Financial Statements' with effect for accounting periods commencing on or after 15 December 2004. In this study we discuss SAS 600 because it is relevant to the period of our empirical analysis.
2 Turnover less than 1,000,000 GBP, balance sheet total less than 1,400,000 GBP, fewer than 50 employees.


An unqualified opinion is expressed when the financial statements 'give a true and fair view'3 and have been prepared in accordance with the identified financial reporting framework. A qualified opinion is issued when either of the following situations occurs: (a) there is a limitation on the scope of the auditors' examination that prevents them from obtaining sufficient evidence to express an unqualified opinion (SAS 600.7) or (b) the auditors disagree with the treatment or the disclosure of a matter in the financial statements (SAS 600.8); and, in the auditors' judgment, the effect of the matter is or may be material to the financial statements and, therefore, those statements may not or do not give a true and fair view of the matters on which the auditors are required to report or do not comply with relevant accounting or other requirements.

It should be mentioned at this point that explanatory paragraphs are not considered qualified audit opinions in the UK. Specifically, in forming their opinion, auditors should consider whether the view given by the financial statements could be affected by inherent uncertainties. In the case of an inherent uncertainty that (a) in the auditors' opinion is fundamental and (b) is adequately accounted for and disclosed in the financial statements, the auditors should include an explanatory paragraph referring to the fundamental uncertainty in the section of their report setting out the basis of their opinion. When adding such an explanatory paragraph, auditors should use words that clearly state that their opinion on the financial statements is not qualified in respect of its contents (SAS 600.6). In that sense, the explanatory paragraph is included as part of the basis for the auditors' opinion so as to make it clear that it refers to a matter which they have taken into account, but it does not qualify their opinion. By contrast, when the auditors conclude that the estimate of the outcome of a fundamental uncertainty is materially misstated or that the disclosure relating to it is inadequate, they issue a qualified opinion.

The issuance of the wrong type of report can have consequences for the auditors, since disciplinary action can be taken against them (Fearnley and Hines, 2003). However, several reasons make the discrimination between financial statements that should receive qualified and unqualified opinions a controversial and challenging issue. For example, there might be difficulties in collecting, analysing, and synthesizing large quantities of data from several sources. Furthermore, Shepherd et al. (2003), who summarize the literature on the effects of expertise in the area of judgment/decision making, highlight both advantages and disadvantages. On the one hand, individuals often acquire important advantages as they gain increasing experience in performing various tasks (e.g. Frederick and Libby, 1986; Choo and Trotman, 1991; Frederick, 1991); on the other hand, they may suffer from overconfidence, which could lead to serious errors or 'overfitting' the world by drawing conclusions from a small sample of observations and overgeneralizing from them (e.g. Oskamp, 1982; Fischhoff, 1982; Mahajan, 1992). The empirical results of Shepherd et al. (2003) also indicate that greater experience at the venture capital task may not always result in better decisions. Hence, by extrapolation, we can assume that auditors following a human expert approach may also suffer from overconfidence or 'overfitting'.

Consequently, researchers have developed classification models to help auditors in forming their opinion. By using such models, auditors can simultaneously screen a large number of firms and direct their attention to those having a higher probability of receiving a qualified audit opinion, thus saving time and money. Furthermore, auditors can use these models to predict what opinion other auditors would issue in similar circumstances, when evaluating potential clients, in peer reviews, to control quality within firms and as a defence in lawsuits (Laitinen and Laitinen, 1998).

3 Under ISA 700, the terms used to express the auditor's opinion are 'give a true and fair view' or 'present fairly, in all material respects' and are equivalent.


Most of the previous studies in the field used discriminant analysis (DA; e.g. Levitan and Knoblett, 1985; Mutchler, 1985), probit analysis (e.g. Dopuch et al., 1987; Lennox, 2000) and logit analysis (LA; e.g. Menon and Schwartz, 1987; Keasey et al., 1988; Spathis, 2003; Caramanis and Spathis, 2006). Only more recently have a few studies employed multicriteria decision-aid techniques (e.g. Spathis et al., 2002, 2003; Pasiouras et al., 2007), and artificial intelligence (AI) and machine-learning4 techniques, such as neural networks (e.g. Fanning et al., 1995; Fanning and Cogger, 1998; Lenard et al., 1995; Gaganis et al., 2007), support vector machines (Doumpos et al., 2005), and decision trees (Kirkos et al., 2007). Thus, there is a small, although growing, strand of the literature that employs AI techniques in auditing. However, their application in modelling auditors' opinions is by far limited, compared with similar problems in finance and accounting, such as bankruptcy and credit risk assessment. Nevertheless, as Baldwin et al. (2006) highlight, 'The very nature of auditing provides motivation for the use of AI' and 'AI applications . . . should be investigated to the fullest possible extent'.

In this study, we develop, for the first time in auditing, k-nearest neighbours (k-NN) models and compare them with traditional techniques, such as DA and LA. k-NN is an AI, and in particular a machine-learning instance-based, technique that predicts the class value of an unlabelled example by analysing its k neighbouring examples (Lai and Tsai, 2004; Leban et al., 2006). Recent applications of k-NN in the field of finance and accounting can be found in bankruptcy prediction (e.g. Tam and Kiang, 1992), credit risk modelling (Henley and Hand, 1996; West, 2000; Fritz and Hosemann, 2000; Doumpos and Pasiouras, 2005), stock returns modelling (Hellstrom and Holmstrom, 2000), interest rate modelling (Nowman and Saltoglu, 2003) and the prediction of acquisition targets (Pasiouras et al., 2005).

We also examine whether the inclusion of non-financial variables, such as credit ratings and the industry in which firms operate, has an impact on the classification ability of the models. Finally, we examine the development of general models, using samples pooled across industries, as opposed to industry-specific models. Considering the differences in the financial statements of firms across industries, as well as the differences in the environment in which they operate, the development of industry-specific models might be more appropriate than the development of general ones. For similar reasons, industry-specific models have been proposed in bankruptcy (Altman, 1983) and in acquisitions prediction (Barnes, 1990, 2000).

The rest of the paper is organized as follows. Section 2 describes the dataset and variables, while Section 3 presents the classification techniques used in the study. Section 4 discusses the empirical results. Finally, the last section outlines the concluding remarks along with some directions for future research.

2. SAMPLE AND VARIABLES

2.1. The Dataset

The data, both financial statements and auditors' opinions,5 for this study were obtained from the Financial Analysis Made Easy (FAME) database of Bureau van Dijk, which contains data relating to the UK and Ireland. The final sample consists of 485 firms with qualified financial statements and 970 firms with unqualified financial statements. This sample was constructed as follows.

Table I. Observations by year and sector

              Manufacturing            Trade (retail and wholesale)       Total
Year      Qual.  Unqual.  Total       Qual.  Unqual.  Total       Qual.  Unqual.  Total
1998        43     266     309          29     109     138          72     375     447
1999        69     453     522          43     225     268         112     678     790
2000       122     561     683          69     287     356         191     848    1039
2001       175     592     767          85     322     407         260     914    1174
2002       163     583     746          85     312     397         248     895    1143
2003        53     353     406          44     233     277          97     586     683
Total      625    2808    3433         355    1488    1843         980    4296    5276

4 Machine learning usually refers to the changes in systems that perform tasks associated with AI (Nilsson, 1996). Briscoe and Caelli (1996) characterize machine learning as a relatively new branch of AI, while Nilsson (1996) points out that AI research has been concerned with machine learning since the beginning. In any case, as Carbonell et al. (1983) mention, machine learning constitutes an integral part of AI, and its methodology has changed in line with the major concerns in the field.
5 The only audit information available in FAME is whether the auditor issued a qualified or unqualified opinion. Hence, we had no further information to distinguish whether qualifications are due to disagreements (e.g. accounting treatment or disclosure), limitations on scope (i.e. lack of audit evidence) or going-concern issues.
6 Total assets above 27 million euros, turnover above 40 million euros and more than 250 employees.
7 We classified firms on the basis of UK SIC 2003. One of the objectives of this study was to examine whether industry-specific models should be preferred to general ones. The manufacturing and trade sectors were the only ones that had enough observations with qualified statements to allow a sufficient estimation and testing of the models with the employment of a holdout sample from a future period.

We first collected a sample of 308 manufacturing and 177 trade firms that met the following four requirements: (a) they had financial statements in FAME for at least 1 year over the period 1998–2003; (b) they received a qualified audit opinion for at least 1 year over the period of our analysis; (c) they were characterized as large firms;6 (d) they were operating in the manufacturing or trade (i.e. retail and wholesale) sectors.7

Then, an additional set of 1000 manufacturing and trade firms that did not receive a qualified opinion was randomly downloaded from FAME. From these, 30 firms were dropped from the sample due to missing information (i.e. n.a.) with respect to auditors' opinions, resulting in a sample of 630 manufacturing firms and 340 trade firms with unqualified opinions.

Some of the firms received qualified opinions more than once over the period of our analysis, while some of the firms with unqualified opinions were not included in the sample for all years, due to missing financial data. Consequently, the final sample consisted of 980 financial statements with qualified opinions and 4296 financial statements with unqualified opinions, giving a total of 5276 observations. Table I presents the observations by year and sector.

Training and Testing Subsamples

An important issue of concern while assessing the classification ability of a model is to ensure that it has not overfit the training (estimation) dataset. As Stein (2002) mentions, 'A model without sufficient validation may only be a hypothesis'. However, it is well known that the accuracies are biased upward when classification models are used to reclassify the observations of the training sample (Fanning and Cogger, 1994). Thus, it is necessary to use a testing sample and classify a set of observations not used during the development of the model.

As Barnes (1990) points out, due to inflationary effects, technological changes and numerous other reasons (e.g. changing accounting policies), it is unreasonable to expect that the distributional cross-sectional parameters of the financial ratios will remain stable over time. Thus, a superior


approach would require the evaluation of the model in a future period, since this approach more closely reflects a 'real world' setting. As Espahbodi and Espahbodi (2003) mention:

After all, the real test of a classification model and its practical usefulness is its ability to classify objects correctly in the future. While cross-validation and bootstrapping techniques reduce the over-fitting bias, they do not indicate the usefulness of a model in the future.

Therefore, in this study, in order to consider the case of population drifting (i.e. change of the population over time) and determine whether the models remain stable over different time periods, we split the sample into two distinct datasets. The first consists of data from the period 1998–2001 and serves as a training sample. The second contains data from the subsequent two years (i.e. 2002 and 2003) and serves as a testing sample.
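The out-of-time split described above can be sketched as follows. This is a minimal illustration, assuming firm-year observations are held as records with a "year" field; the record structure and field name are our assumptions, not FAME fields.

```python
def temporal_split(observations, last_train_year=2001):
    """Split firm-year observations into a training subsample
    (years <= last_train_year) and a testing subsample (later years)."""
    train = [obs for obs in observations if obs["year"] <= last_train_year]
    test = [obs for obs in observations if obs["year"] > last_train_year]
    return train, test

# One observation per year over 1998-2003, as in the study's sample period.
obs = [{"year": y} for y in range(1998, 2004)]
train, test = temporal_split(obs)
print(len(train), len(test))  # 4 2
```

Because the cutoff is by calendar year rather than a random draw, the testing sample reflects the population-drift scenario the authors describe.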

2.2. Variable Selection

Table II presents a list of the explanatory variables included in the models. The financial variables cover liquidity, profitability, growth, and financial leverage. We also include a variable indicating the overall strength of the firm, as measured by the credit risk assessment of a rating agency. Additionally, in the case of the general model we use a dummy variable indicating whether the firm operates in the manufacturing or the trade sector.

Table II. Variable descriptions

Dependent variable
  Auditors' opinion  Dummy variable taking the value of 1 for a qualified audit
                     opinion and 0 for an unqualified audit opinion

Independent variables
  Financial
    PROF       Return on total assets
    LIQ        Quick ratio, calculated as (Current assets − Stock)/Current liabilities
    GROWTH     Annual change in total assets, calculated as [(Total assets in year t)
               − (Total assets in year t − 1)]/(Total assets in year t − 1)
    LEV        Shareholders' funds/Total assets
  Non-financial
    RATING     Variable taking the value of 1 if the firm is classified in the Secure
               risk group, 2 if in the Stable group, 3 if in the Normal group,
               4 if in the Caution group, and 5 if in the High Risk group
    INDUSTRY   Dummy variable taking the value of 1 if the firm operates in the
               manufacturing sector and 0 if it operates in the trade sector

Financial Variables

The choice of financial ratios that should be included in the model is a challenging issue. First, it is unclear which variables are related to the probability of qualified audit opinions. Second, there are numerous ratios that can be used as proxies for the same financial attributes (i.e. profitability, liquidity), and it is often unclear which the best alternatives are. Third, including a large set of financial variables in the model not only increases the time and cost required for data collection and preparation, but it can also lead to multicollinearity (Spathis et al., 2003). As Hamer (1983) mentions, the variable set should be constructed on the basis of (a) minimizing the cost of data collection and (b) maximizing the applicability of the model. However, it is not easy to determine how many ratios a particular model should contain. Too few and the model will not capture all the relevant information. Too many and the model will not only be overfitting the training sample, but it will most likely have onerous data input requirements as well (Kocagil et al., 2002).
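As a sketch of how the financial variables of Table II could be computed from raw accounts, consider the following. Function and argument names are illustrative assumptions of ours, not FAME field names.

```python
def liq(current_assets, stock, current_liabilities):
    """Quick ratio: (Current assets - Stock) / Current liabilities."""
    return (current_assets - stock) / current_liabilities

def growth(total_assets_t, total_assets_prev):
    """Annual change in total assets relative to the previous year."""
    return (total_assets_t - total_assets_prev) / total_assets_prev

def lev(shareholders_funds, total_assets):
    """Shareholders' funds to total assets."""
    return shareholders_funds / total_assets

print(liq(500.0, 100.0, 200.0))  # 2.0
print(growth(1100.0, 1000.0))    # 0.1 (10% asset growth)
print(lev(400.0, 1000.0))        # 0.4
```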

Some studies start from a large list of potential variables, which are later reduced by means of statistical screening such as stepwise analysis (e.g. Laitinen and Laitinen, 1998). However, Palepu (1986) criticizes such an approach and argues that 'this method of variable selection is arbitrary and leads to the statistical overfitting of the model to the sample at hand'. Therefore, to avoid such criticisms, and at the same time enhance the applicability of the model, we select one variable from each of the above-mentioned categories (i.e. profitability, liquidity, etc.) after considering previous studies (e.g. Laitinen and Laitinen, 1998; Spathis et al., 2002; Ireland, 2003; Doumpos et al., 2005; Pasiouras et al., 2007) and data availability. The discussion that follows outlines the variables that we use and their expected relation to auditors' opinions.

LIQ denotes the liquidity position of the firm, as measured by the quick ratio.8 Spathis (2003) mentions that the possibility of a qualified report increases when the financial health of the firm deteriorates. On the other hand, high liquidity might increase the likelihood of a qualified opinion, as assets might be overstated (Ireland, 2003). Prior empirical research in the UK indicates that companies with poor liquidity are more likely to receive going-concern modifications than other companies do; however, liquidity does not have a significant impact on non-going-concern-related modifications (Ireland, 2003). Laitinen and Laitinen (1998) also report that there were no significant differences in terms of liquidity between firms that received qualified and unqualified opinions.

PROF represents the profitability of the firm, as measured by return on assets. Numerous previous studies indicate that firms which receive qualified opinions or have falsified financial statements are less profitable (Loebbecke et al., 1989; Summers and Sweeney, 1998; Laitinen and Laitinen, 1998; Beasley et al., 1999; Spathis, 2002; Spathis et al., 2002; Pasiouras et al., 2007). As Spathis (2002) argues, the profitability orientation is tempered by managers' own utility maximization, defined by job security.

GROWTH is defined as the annual change of total assets over the last year. Trends in ratios and financial accounts have been found to be important in the past. For example, Laitinen and Laitinen (1998) report that the lower the growth of the firm, the higher the probability that the audit report is qualified, and Dopuch et al. (1987) found the change in the ratio of total liabilities to total assets to be one of the most significant variables.

LEV corresponds to the financial leverage of the firm, as measured by the shareholders' funds9 to total assets ratio. Obviously, higher values of LEV indicate that the company relies on shareholders' funds rather than on external debt (e.g. loans) to finance its total assets and, therefore, it is less likely to fail. We include this variable in the analysis because numerous studies indicate that firms with a high probability of default are more likely to receive qualified opinions (Bell and Tabor, 1991; McKeown et al., 1991; Reynolds and Francis, 2001). Ireland (2003) also found that UK companies with high gearing are more likely to receive both going-concern10 and non-going-concern modifications than other companies are. Similarly, Laitinen and Laitinen (1998) found that the higher the share of equity in the balance sheet, the higher the probability that the audit report is non-qualified. Therefore, we expect a negative relationship between LEV and the probability of a qualified opinion.

8 As an anonymous reviewer suggested, we could use the current ratio as an alternative. We believe that since the two ratios are usually highly correlated, the preference of one ratio over another might not have a significant impact on the classification results. Nevertheless, the results of previous UK studies are in favour of the use of the quick ratio. For example, Pasiouras et al. (2007) report a significant correlation (0.996) and point out that the quick ratio was finally included in the model because it is more stringent. Doumpos et al. (2005) also considered both current and quick ratios. However, univariate tests performed with t-statistics and support vector machines indicated that only the quick ratio was significant and considered in the final model. Finally, Ireland (2003) also relied on the quick ratio to measure liquidity.
9 Shareholders' funds is calculated as: Issued Capital + Share Premium Account + Revaluation Reserves + Profit (Loss) Account + Other Reserves.

Non-Financial Variables

The use of credit rating is also justified on the basis of the above-mentioned relationship between probability of default and audit opinion. Some previous studies have used Altman's Z-score as a proxy of default (e.g. Reynolds and Francis, 2001; Spathis et al., 2003; Kirkos et al., 2007); however, such an approach may not be the most appropriate one. Altman's Z-score was developed for a particular industry (i.e. manufacturing), under different economic conditions (i.e. in the 1960s) and for a specific country (i.e. the USA). Hence, without the necessary modifications, the model may not be appropriate at present or for UK firms. Therefore, we rely on the credit risk assessments of CRIF Decision Solutions Limited, who provide ratings for UK firms. The Quiscore estimated by CRIF corresponds to the likelihood of default for the 12 months following the date of its calculation. CRIF classifies firms into the following five risk groups on the basis of their Quiscore: Secure (score between 81 and 100), Stable (score between 61 and 80), Normal (score between 41 and 60), Caution (score between 21 and 40), and High Risk (score between 0 and 20). According to CRIF, failure is very unusual for firms in the Secure group and it normally occurs only as a result of exceptional market changes, whereas firms classified in the High Risk group are unlikely to be able to continue their operation unless significant remedial action is taken. Firms classified in the other three groups pose an intermediate level of credit risk. In this study, we use a variable (RATING) taking values between 1 (Secure) and 5 (High Risk) as a proxy for the five risk groups, and we expect it to be positively related to the likelihood of a qualified audit opinion.
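The mapping from a Quiscore to the five-level RATING variable follows directly from the score bands quoted above; a minimal sketch, assuming only that scores lie on the 0-100 scale:

```python
def quiscore_to_rating(score):
    """Map a CRIF Quiscore (0-100) to the RATING variable (1 = Secure ... 5 = High Risk)."""
    if not 0 <= score <= 100:
        raise ValueError("Quiscore must lie in [0, 100]")
    if score >= 81:
        return 1  # Secure (81-100)
    if score >= 61:
        return 2  # Stable (61-80)
    if score >= 41:
        return 3  # Normal (41-60)
    if score >= 21:
        return 4  # Caution (21-40)
    return 5      # High Risk (0-20)

print([quiscore_to_rating(s) for s in (95, 70, 50, 30, 10)])  # [1, 2, 3, 4, 5]
```

Note that the ordering is deliberately reversed relative to the score: a higher RATING value means higher credit risk, consistent with the expected positive relation to qualified opinions.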

We also use a dummy variable, in the case of the general model, indicating whether the firm is operating in the manufacturing sector (INDUSTRY = 1) or the trade sector (INDUSTRY = 0). As De Beelde (1997) argues, industry-specific knowledge and audit programs optimized for these industries might result in an advantage over competing audit firms and, consequently, explain higher market shares. Hence, auditors may specialize so as to increase their ability to detect qualified statements through expertise (Carcello et al., 1992; O'Keefe et al., 1994). Such specialization might be especially attractive, given the focus on inherent risk analysis and industry knowledge found in the professional literature (Pomeranz, 1992). This potentially also explains why research in audit quality and auditors' industry specialization reveals that audits performed by industry-specialist auditors are of higher quality (Bonner and Lewis, 1990; Ashton, 1991).

3. CLASSIFICATION METHODS

The problem considered in this study is a classification one that, in general, involves the assignment of a set of alternatives (e.g. financial statements), described along a set of independent variables (e.g. financial ratios), into two or more predefined mutually exclusive groups (e.g. qualified/unqualified). The following subsections discuss the three classification techniques that we use in this study, namely k-NN, DA, and LA.

10 Auditors are required to add an explanatory paragraph in their report whenever there is 'substantial doubt' about a client's ability to continue its operations as a going concern. Hence, the going-concern uncertainty opinion is issued by the auditor to a client company when that company is at risk of failure or exhibits other signs of distress that threaten its ability to continue as a going concern.

3.1. Nearest Neighbours

Nearest neighbours is a non-parametric, instance-based technique that, as mentioned earlier, has been employed in various problems in finance and accounting. The nearest-neighbour rule classifies an object (e.g. firm, financial statement, etc.) to the class of its nearest neighbour in the measurement space using some kind of distance measure, such as local metrics (Short and Fukunaga, 1980), global metrics (Fukunaga and Flick, 1984), or the Mahalanobis or Euclidean distance. The latter is the most commonly used and is also the one that we use here.

In the present study, we use a modification of the nearest-neighbour rule, the k-NN method, which classifies an object (e.g. firm, financial statement) to the class (i.e. qualified or unqualified) most heavily represented among its k nearest neighbours.

Assuming a firm is described by the feature vector <g1(x), g2(x), . . . , gn(x)>, where gr(x) denotes the value of the rth characteristic of firm x, the distance between two instances xi and xj is estimated as follows:

$$d(x_i, x_j) \equiv \sqrt{\sum_{r=1}^{n} \left(g_r(x_i) - g_r(x_j)\right)^2} \quad (1)$$

Then, the algorithm for approximating a discrete-valued function of the form f : ℝⁿ → C, where C is a finite set of classes {C1, C2, . . . , Cq}, proceeds as follows:

Step 1. For each training example (i.e. firm) <x, f(x)>, add the firm to the list of training examples.
Step 2. Given a query firm x to be classified, let x1, x2, . . . , xk denote the k instances from the training examples that are nearest to x.
Step 3. Return

$$F(x) \leftarrow \arg\max_{c \in C} \sum_{i=1}^{k} \delta(c, f(x_i))$$

where δ(a, b) = 1 if a = b and δ(a, b) = 0 otherwise.

Thus, the algorithm returns the value F(x) as an estimate of f(x), which is the most common value of f among the k training examples nearest to x.
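The distance computation in equation (1) and the majority vote in Step 3 can be sketched in a few lines of Python. This is an illustrative implementation, not the authors' code, and the toy firms and ratio values are invented:

```python
import math
from collections import Counter

def euclidean(xi, xj):
    # d(xi, xj) = sqrt(sum_r (g_r(xi) - g_r(xj))^2), as in equation (1)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def knn_classify(train, query, k):
    """train is a list of (feature_vector, label) pairs; the query is
    assigned to the class most heavily represented among its k nearest
    training examples (the argmax over classes in Step 3)."""
    neighbours = sorted(train, key=lambda ex: euclidean(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy illustration: two ratios per firm, 1 = qualified, 0 = unqualified
train = [((1.70, -5.4), 1), ((1.38, 5.0), 0), ((0.86, 6.3), 0), ((1.65, -2.3), 1)]
print(knn_classify(train, (1.6, -4.0), k=3))  # -> 1
```

With k = 3, the two nearest qualified firms outvote the single unqualified neighbour, so the query is labelled qualified.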

3.2. Discriminant Analysis

DA seeks to obtain a linear combination of the independent variables. The objective of this method is to classify observations into mutually exclusive groups as accurately as possible by maximizing the ratio of among-groups to within-groups variance. The discriminant function is of the form

Z = b0 + b1x1 + b2x2 + . . . + bmxm

where xj is the jth independent variable, bj is the coefficient for the jth independent variable, b0 is the intercept, and Z is the discriminant score that maximizes the distinction between the two groups. A given financial statement will be classified as qualified if Z > Zc (the critical Z), and as unqualified if Z < Zc.
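Applying a fitted discriminant function amounts to computing Z and comparing it with the cut-off. A minimal sketch follows; the coefficients and the cut-off Zc below are hypothetical placeholders, not the estimates reported in the tables:

```python
def discriminant_score(x, b0, b):
    # Z = b0 + b1*x1 + b2*x2 + ... + bm*xm
    return b0 + sum(bj * xj for bj, xj in zip(b, x))

def classify(x, b0, b, z_cut):
    # qualified if Z > Zc, unqualified if Z < Zc
    return "qualified" if discriminant_score(x, b0, b) > z_cut else "unqualified"

# Hypothetical two-variable model with cut-off Zc = 0
b0, b = 0.1, [0.5, -0.3]
print(classify([2.0, 1.0], b0, b, z_cut=0.0))  # Z = 0.8 -> "qualified"
```

In practice Zc is chosen from the training data (e.g. to balance the two misclassification costs) rather than fixed at zero.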



Table III. Descriptive statisticsᵃ (training sample, N = 3450)

Continuous variables:

| Variable | Manuf. Qual. | Manuf. Unqual. | K–W (p-value) | Trade Qual. | Trade Unqual. | K–W (p-value) | Pooled Qual. | Pooled Unqual. | K–W (p-value) |
|---|---|---|---|---|---|---|---|---|---|
| LIQ | 1.70 (4.68) | 1.38 (3.13) | 0.183 | 1.65 (4.02) | 0.86 (0.66) | 0.000 | 1.68 (4.46) | 1.20 (2.59) | 0.181 |
| LEV | 28.18 (33.95) | 35.49 (26.82) | 0.000 | 26.98 (29.52) | 30.86 (23.20) | 0.016 | 27.76 (32.42) | 33.94 (25.76) | 0.000 |
| GROWTH | 12.81 (77.75) | 14.14 (64.10) | 0.005 | 10.54 (63.33) | 17.99 (55.23) | 0.000 | 12.00 (72.90) | 15.43 (61.29) | 0.000 |
| ROA | −5.42 (39.22) | 4.97 (19.73) | 0.000 | −2.32 (32.90) | 6.30 (9.15) | 0.000 | −4.32 (37.10) | 5.42 (16.97) | 0.000 |

Dummy variables (number of observations):

| Credit rating | Manuf. Qual. | Manuf. Unqual. | Trade Qual. | Trade Unqual. | Pooled Qual. | Pooled Unqual. |
|---|---|---|---|---|---|---|
| Secure | 34 | 295 | 22 | 80 | 56 | 375 |
| Stable | 30 | 402 | 29 | 199 | 59 | 601 |
| Normal | 44 | 585 | 30 | 302 | 74 | 887 |
| Caution | 81 | 414 | 54 | 302 | 135 | 716 |
| High Risk | 220 | 176 | 91 | 60 | 311 | 236 |

ᵃ Standard deviation in parentheses.

3.3. Logit Analysis

In LA, the probability of a financial statement receiving a qualified opinion, based on a set of independent variables, is given by the function

$$P_i = \frac{1}{1 + e^{-Z_i}}$$

where

$$Z_i = \ln\left(\frac{P_i}{1 - P_i}\right) = b_0 + b_1 x_{i1} + b_2 x_{i2} + \dots + b_m x_{im} + \varepsilon$$

Here, P_i is the probability that financial statement i will receive a qualified opinion, b_0 is the intercept term, and b_j (j = 1, . . . , m) represents the coefficients associated with the corresponding independent variables x_ij (j = 1, . . . , m) for each firm. The coefficient estimates are obtained by maximizing a log-likelihood function. The model is then used to estimate the group-membership probabilities for the financial statements of all firms under consideration. Each financial statement is classified as qualified or unqualified using an optimal cut-off point, chosen so as to minimize type I and type II errors.
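The two equations above combine into a short scoring routine. This sketch is illustrative only: the coefficients are hypothetical, and the 0.5 cut-off is a default, whereas the study tunes the cut-off to balance type I and type II errors:

```python
import math

def qualified_probability(x, b0, b):
    # Z_i = b0 + b1*x_i1 + ... + bm*x_im ;  P_i = 1 / (1 + exp(-Z_i))
    z = b0 + sum(bj * xj for bj, xj in zip(b, x))
    return 1.0 / (1.0 + math.exp(-z))

def classify(x, b0, b, cutoff=0.5):
    # the cut-off would normally be tuned on the training sample
    return "qualified" if qualified_probability(x, b0, b) > cutoff else "unqualified"

# With Z = 0 the model is indifferent between the two groups: P = 0.5
print(qualified_probability([0.0, 0.0], 0.0, [0.3, -0.2]))  # 0.5
```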

4. EMPIRICAL RESULTS

4.1. Univariate Results

Table III shows summary statistics (mean and standard deviation) of the three training samples, while distinguishing between financial statements with qualified and unqualified audit opinions.



The results of the Kruskal–Wallis11 test are also presented, indicating whether the differences between qualified and unqualified statements are statistically significant.

Consistent with our expectations, both manufacturing firms and trade firms with qualified statements have lower PROF, LEV and GROWTH. LIQ, as measured by the quick ratio, is statistically significant only in the case of the trade sector, and appears to be higher for firms with qualified financial statements. As expected, riskier firms appear to be associated with qualified audit opinions. More specifically, approximately 50% of the observations with qualified financial statements are classified in the High Risk group and only 9% in the Secure group. By contrast, the corresponding figures for observations with unqualified financial statements are 8% (High Risk) and 13% (Secure).
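The Kruskal–Wallis statistic behind these p-values can be computed by hand from the pooled ranks. The bare-bones version below omits the tie correction and the chi-square p-value that a statistics package would add, and the sample values are invented:

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction): rank the pooled
    observations, then
        H = 12 / (N (N + 1)) * sum_j R_j^2 / n_j - 3 (N + 1),
    where R_j is the rank sum of group j and N the pooled sample size."""
    pooled = sorted(v for g in groups for v in g)
    rank = {}
    i = 0
    while i < len(pooled):            # tied values share their average rank
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2.0
        i = j
    n = len(pooled)
    s = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    return 12.0 * s / (n * (n + 1)) - 3.0 * (n + 1)

# Invented LIQ values for two tiny groups of statements
qualified = [1.70, 1.65, 1.68]
unqualified = [0.86, 0.90, 1.20]
print(round(kruskal_wallis_h(qualified, unqualified), 3))  # 3.857
```

A large H (relative to a chi-square distribution with one degree of freedom for two groups) corresponds to a small p-value, i.e. a significant difference between qualified and unqualified statements.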

4.2. Results from Classification Models

As mentioned previously, the objective of the present study is the assessment of the relative efficiency of models developed through k-NN, DA and LA. Consequently, all models are developed using a common training sample and the same set of input variables, and are then evaluated using a common testing sample.

We first develop the models using only financial variables as inputs. Table IV presents the coefficients of DA and LA and the accompanying statistics. All models are statistically significant

11 Eisenbeis (1977) points out that deviation from the normal distribution in economics and finance appears more likely to be the rule rather than the exception. Therefore, we preferred to rely on a non-parametric test, rather than a parametric one (i.e. t-test).

Table IV. Coefficientsᵃ of the variables and summary statistics of the DA and LA models developed with financial variables

| Variable | Manuf. DA | Manuf. LA | Trade DA | Trade LA | General DA | General LA |
|---|---|---|---|---|---|---|
| LIQ | 0.101 (0.404) | 0.049 (10.479)*** | 0.235 (0.676) | 0.561 (38.981)*** | 0.139 (0.506) | 0.085 (21.065)*** |
| LEV | −0.180 (−0.543) | −0.070 (17.654)*** | −0.016 (−0.420) | −0.100 (11.987)*** | −0.017 (−0.508) | −0.070 (24.360)*** |
| GROWTH | −0.000 (−0.110) | −0.000 (0.002) | −0.004 (−0.238) | −0.020 (2.442) | −0.020 (−0.106) | −0.010 (1.079) |
| PROF | −0.023 (−0.729) | −0.130 (38.104)*** | −0.026 (−0.634) | −0.280 (27.285)*** | −0.025 (−0.714) | −0.150 (58.558)*** |
| Constant | 0.406 | 0.147 (5.042)** | 0.273 | −0.190 (2.887)* | 0.371 | 0.123 (5.140)** |

Summary statistics:

| Statistic | Manuf. DA | Manuf. LA | Trade DA | Trade LA | General DA | General LA |
|---|---|---|---|---|---|---|
| Wilks' λ | 0.960 | — | 0.940 | — | 0.958 | — |
| χ² | 92.332 | 102.192 | 72.071 | 123.506 | 149.124 | 172.194 |
| Sig. (p-value) | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Canonical correlation | 0.199 | — | 0.245 | — | 0.206 | — |
| Pseudo R² | — | 0.033 | — | 0.082 | — | 0.037 |

* Statistically significant at 10% level; ** statistically significant at 5% level; *** statistically significant at 1% level.
ᵃ Standardized coefficients are presented in parentheses in the case of DA. The results of the Wald test are presented in parentheses in the case of LA.



at the 1% level, with chi-square values equal to 102.192 (Manufacturing), 123.506 (Trade) and 172.194 (General) for the LA models, and Wilks' λ equal to 0.960, 0.940 and 0.958 for the corresponding DA models. Hence, we reject the null hypothesis that all the coefficients are zero at the 1% level in all cases. The pseudo R², which indicates how well the data fit the presumed underlying theoretical distribution in the LA models, equals 0.033 (Manufacturing), 0.082 (Trade) and 0.037 (General). The corresponding figures for the canonical correlation, which is used to assess the goodness of fit in the case of the DA models, are 0.199, 0.245 and 0.206.

LIQ, which is positively related to the probability of a qualified audit opinion, is statistically significant in all logit models and relatively important in the discriminant models (based on the standardized coefficients). LEV and PROF are both negatively related to the probability of a qualified audit opinion and statistically significant in all logit models, while they also carry high standardized coefficients in the discriminant models. Consistent with our expectations, GROWTH is negatively related to the probability of a qualified opinion; however, it is insignificant in all the LA models and is the least important variable in the DA models.

Table V presents the classification results obtained through k-NN, DA and LA in the training sample. The development of the k-NN models greatly depends on finding an appropriate value for the parameter k, the number of nearest neighbours that should be considered. Improper selection of this parameter may result in overfitting or underfitting of the models. One way to select the number of neighbours is to split the training sample into two parts, where one is used for estimation and one for validation. Alternatively, a cross-validation approach, using the observations in the training sample, can be employed (e.g. Nowman and Saltoglu, 2003; Doumpos and Pasiouras, 2005).

In this study, we follow the latter approach. More specifically, the observations in the training sample12 are randomly split into 10 almost-equal datasets (i.e. 10-fold cross-validation), each one having a similar distribution for the dependent variable. The model is developed using nine sets and

12 These observations are from 1998–2001 and correspond to 2281 for the Manufacturing model, 1169 for the Trade model and 3450 for the General model.

Table V. In-sample (training) correct classification accuracies of models developed with financial variables

| Model | Unqualified (%) | Qualified (%) | Average (%) |
|---|---|---|---|
| k-NN Manufacturing | 60.90 | 74.57 | 67.74 |
| k-NN Trade | 77.09 | 54.87 | 65.98 |
| k-NN General | 69.45 | 74.96 | 72.21 |
| DA Manufacturing | 69.10 | 51.80 | 60.45 |
| DA Trade | 72.10 | 52.70 | 62.40 |
| DA General | 69.80 | 51.30 | 60.55 |
| LA Manufacturing | 70.50 | 51.60 | 61.05 |
| LA Trade | 73.40 | 58.40 | 65.90 |
| LA General | 71.00 | 51.50 | 61.25 |



the tenth is used to obtain initial estimates of the error rates. The process is repeated 10 times, each time using a different set for validation. Finally, the results of the 10 iterations are averaged to calculate the error rate. The performance of the model in the validation set is used to assess its generalization ability.

As in previous studies (e.g. Liu, 2002), the number of nearest neighbours is adjusted and the performance of the model is evaluated each time. In this study, we experiment with various values for k and select the one that provides the highest average classification accuracy in the validation subsample. Finally, we set k = 27 in the case of the manufacturing model, k = 51 in the case of the trade model and k = 13 in the case of the general model. We then re-estimate the models using the entire training sample.
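The selection procedure above can be sketched as follows. The sketch simplifies in two respects: it uses plain random folds and overall accuracy for brevity, whereas the study stratifies the folds by the dependent variable and averages the per-class accuracies; the two-cluster dataset is a toy:

```python
import random
from collections import Counter

def knn_predict(train, query, k):
    # squared Euclidean distance preserves the nearest-neighbour ordering
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    near = sorted(train, key=lambda ex: dist(ex[0], query))[:k]
    return Counter(lbl for _, lbl in near).most_common(1)[0][0]

def cross_val_accuracy(data, k, folds=10, seed=0):
    """Accuracy of k-NN over 10 folds: train on nine parts, score the
    tenth, rotate, and pool the correct counts."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    parts = [shuffled[i::folds] for i in range(folds)]
    correct = 0
    for f in range(folds):
        train = [ex for g in range(folds) if g != f for ex in parts[g]]
        correct += sum(knn_predict(train, x, k) == y for x, y in parts[f])
    return correct / len(data)

def select_k(data, candidates):
    # keep the k with the highest cross-validated accuracy
    return max(candidates, key=lambda k: cross_val_accuracy(data, k))

# Two well-separated toy clusters; any small odd k performs well here
data = [((i / 10.0, 0.0), 0) for i in range(20)] + \
       [((5.0 + i / 10.0, 4.0), 1) for i in range(20)]
best = select_k(data, [1, 3, 5, 7])
print(best, cross_val_accuracy(data, best))
```

The chosen k is then used to re-run the classifier on the entire training sample, as in the study.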

The results in Table V indicate that all the models are able to distinguish between qualified and unqualified financial statements, with the average classification accuracy ranging between 60.45% (DA Manufacturing) and 72.21% (k-NN General). Generally, the models developed through k-NN achieve higher classification accuracies, although in some cases the differences are relatively small. Of particular importance is the fact that the k-NN models are, in most cases, more efficient in identifying financial statements that should receive qualified audit reports.

Table VI presents the results using the observations from the period 2002–2003 (i.e. the testing sample). As expected, the classification accuracies decrease in the testing sample. The highest average accuracy is achieved by the k-NN Trade model (65.66%) and the lowest by the DA Manufacturing model (55.20%). As in the training sample, the k-NN models are more efficient in all cases, mainly owing to their ability to identify financial statements that should receive qualified audit reports. By contrast, with the exception of the LA Trade model, both DA and LA correctly classify less than 50% of the qualified financial statements.

With respect to the development of industry-specific or general models, the results are mixed. Specifically, whereas the General model is more efficient than the Manufacturing model, it is less efficient than the Trade model, irrespective of the method that we use.

Table VII presents the coefficients of the DA and LA models when we consider the variables RATING and INDUSTRY. As before, all the models are statistically significant at the 1% level. However, the comparison with the models developed with financial variables alone indicates a

Table VI. Correct classification accuracies in holdout sample of models developed with financial variables

| Model | Unqualified (%) | Qualified (%) | Average (%) |
|---|---|---|---|
| k-NN Manufacturing | 56.52 | 67.13 | 61.82 |
| k-NN Trade | 77.06 | 54.26 | 65.66 |
| k-NN General | 66.98 | 57.97 | 62.48 |
| DA Manufacturing | 67.30 | 43.10 | 55.20 |
| DA Trade | 71.90 | 45.70 | 58.80 |
| DA General | 68.70 | 44.90 | 56.80 |
| LA Manufacturing | 68.40 | 44.90 | 56.65 |
| LA Trade | 73.80 | 52.70 | 63.25 |
| LA General | 70.20 | 44.90 | 57.55 |



Table VII. Coefficientsᵃ of the variables and summary statistics of the DA and LA models developed with financial and non-financial variables

| Variable | Manuf. DA | Manuf. LA | Trade DA | Trade LA | General DA | General LA |
|---|---|---|---|---|---|---|
| LIQ | 0.054 (0.214) | 0.086 (18.119)*** | 0.201 (0.580) | 0.903 (67.920)*** | 0.087 (0.317) | 0.152 (33.045)*** |
| LEV | 0.010 (0.298) | 0.010 (24.181)*** | 0.012 (0.316) | 0.010 (7.346)*** | 0.011 (0.313) | 0.009 (30.881)*** |
| GROWTH | −0.001 (−0.075) | −0.001 (1.537) | −0.005 (−0.268) | −0.004 (12.475)*** | −0.002 (−0.144) | −0.002 (8.924)*** |
| PROF | −0.006 (−0.173) | −0.006 (10.265)*** | −0.009 (−0.220) | −0.011 (5.125)** | −0.007 (−0.194) | −0.007 (15.473)*** |
| RATING | 0.866 (1.081) | 0.837 (321.314)*** | 0.868 (1.059) | 0.871 (103.265)*** | 0.871 (1.084) | 0.783 (396.089)*** |
| INDUSTRY | — | — | — | — | −0.232 (−0.110) | −0.218 (7.645)*** |
| Constant | −3.374 | −3.328 (257.941)*** | −3.462 | −4.053 (101.460)*** | −3.261 | −3.014 (283.525)*** |

Summary statistics:

| Statistic | Manuf. DA | Manuf. LA | Trade DA | Trade LA | General DA | General LA |
|---|---|---|---|---|---|---|
| Wilks' λ | 0.798 | — | 0.863 | — | 0.829 | — |
| χ² | 513.54 | 507.84 | 170.98 | 251.058 | 645.05 | 658.46 |
| Sig. (p-value) | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Canonical correlation | 0.449 | — | 0.370 | — | 0.413 | — |
| Pseudo R² | — | 0.191 | — | 0.183 | — | 0.159 |

* Statistically significant at 10% level; ** statistically significant at 5% level; *** statistically significant at 1% level.
ᵃ Standardized coefficients are presented in parentheses in the case of DA. The results of the Wald test are presented in parentheses in the case of LA.

considerable increase in the goodness of fit, as measured by the canonical correlation and the pseudo R² in all cases. RATING is statistically significant in all LA models and the most important variable in the DA models (based on the standardized coefficients). Consistent with our expectations, high-risk firms that have a higher probability of default are more likely to receive a qualified audit opinion. INDUSTRY is also statistically significant, indicating that firms operating in the manufacturing sector are less likely to receive qualified audit opinions.

Turning to the classification results in the training sample (Table VIII), the comparison with the corresponding accuracies in Table V reveals a considerable increase in all cases. The model that experiences the highest increase is the Manufacturing DA model, which now achieves an average classification accuracy of 70.95%, compared with the 60.45% achieved by the corresponding model in Table V. Thus, credit ratings can play an important role in the detection of financial statements that should receive qualified audit opinions. Although in many cases the DA and LA models experience higher increases than the corresponding k-NN models,13 the latter retain their ability to classify financial statements more efficiently in all cases.

13 In the models developed with financial and non-financial variables we set k = 21 (Manufacturing model), k = 21 (Trade model) and k = 51 (General model).



Table IX presents the classification accuracies achieved in the testing sample. The results are quite satisfactory, with average accuracies above 70% in all cases. DA and LA achieve similar average classification accuracies that are in most cases lower than those achieved by k-NN. Furthermore, all the models now correctly classify a satisfactory proportion of firms that received a qualified audit opinion. Hence, the inclusion of the credit rating plays an important role in distinguishing between qualified and unqualified financial statements, and especially in identifying the qualified ones.

5. CONCLUSIONS AND SUGGESTIONS FOR FURTHER RESEARCH

In this study we introduced k-NN as an alternative AI machine-learning method for the development of classification models for auditing decisions. We also developed models with DA and LA, which

Table VIII. In-sample (training) correct classification accuracies of models developed with financial and non-financial variables

| Model | Unqualified (%) | Qualified (%) | Average (%) |
|---|---|---|---|
| k-NN Manufacturing | 79.01 | 71.15 | 75.08 |
| k-NN Trade | 84.84 | 63.27 | 74.05 |
| k-NN General | 80.89 | 67.72 | 74.30 |
| DA Manufacturing | 68.6 | 73.3 | 70.95 |
| DA Trade | 66.1 | 65.9 | 66.00 |
| DA General | 67.0 | 71.7 | 69.35 |
| LA Manufacturing | 68.4 | 74.1 | 71.25 |
| LA Trade | 73.1 | 70.8 | 71.05 |
| LA General | 67.4 | 70.7 | 69.05 |

Table IX. Correct classification accuracies in holdout sample of models developed with financial and non-financial variables

| Model | Unqualified (%) | Qualified (%) | Average (%) |
|---|---|---|---|
| k-NN Manufacturing | 77.78 | 72.22 | 75.00 |
| k-NN Trade | 86.42 | 70.54 | 78.48 |
| k-NN General | 81.56 | 71.01 | 76.29 |
| DA Manufacturing | 68.60 | 74.5 | 71.55 |
| DA Trade | 67.50 | 73.6 | 70.55 |
| DA General | 66.50 | 74.5 | 70.50 |
| LA Manufacturing | 67.80 | 75.00 | 71.40 |
| LA Trade | 74.10 | 72.90 | 73.50 |
| LA General | 66.60 | 73.60 | 70.10 |



are considered traditional techniques in classification problems in finance and accounting, and are frequently used for benchmarking purposes.

The sample consisted of 980 qualified and 4296 unqualified statements from 1455 private and public UK companies belonging to the manufacturing and trade sectors. Two industry-specific models (i.e. one for manufacturing and one for trade) and a general model were developed using data from 1998 to 2001. The models were then tested using data from 2002 and 2003. In each case, two versions of the models were developed, the first including financial variables alone and the second a combination of financial and non-financial variables.

The results can be summarized as follows. First, k-NN models appear to be more efficient, in terms of average classification accuracy, than DA and LA models. Second, the inclusion of credit ratings in the models resulted in a considerable increase in terms of both the goodness of fit and the classification accuracy. Third, the results are mixed with respect to the development of industry-specific models, as opposed to general models.

Despite its contribution, our study is not without limitations. One of the drawbacks of the k-NN method is that it does not reveal which variables contribute to the decision to classify a financial statement as qualified or unqualified. Hence, the method operates as a 'black box', similar to the neural networks approach. Nevertheless, the main purpose of this kind of research is not to provide evidence of the association between published audit reports and company characteristics. Instead, attention is on whether financial statements can be accurately classified as qualified or unqualified. Therefore, the coefficient estimates, their significance levels, and even their signs are less important (Dietrich, 1984).

Future research could extend our study in numerous directions. First, one could use alternative classification techniques, such as multidimensional scaling. Furthermore, the results of different methods could be combined in an integrated model, an approach that has yielded promising results in bankruptcy prediction (Jo and Han, 1996; Gaganis et al., 2005), credit risk assessment (Doumpos, 2002) and acquisitions prediction (Tartari et al., 2003). Finally, it would be worthwhile to consider additional non-financial variables, such as managers' experience, auditors' reputation and size, audit and non-audit fees, the firm's market share, organizational capacity and technological advantage.

ACKNOWLEDGEMENTS

We would like to thank an anonymous reviewer, Bob Berry (Editor), and participants at the 28th European Accounting Association Annual Congress (2005) for valuable comments and suggestions that helped us improve earlier versions of this paper.

REFERENCES

Altman E. 1983. Corporate Financial Distress: A Complete Guide to Predicting, Avoiding and Dealing with Bankruptcy. Wiley.

Ashton A. 1991. Experience and error frequency knowledge as potential determinants of auditor expertise. The Accounting Review 66: 216–239.

Baldwin AA, Brown CE, Trinkle BS. 2006. Opportunities for artificial intelligence development in the accounting domain: the case for auditing. Intelligent Systems in Accounting, Finance and Management 14: 77–86.

Barnes P. 1990. The prediction of takeover targets in the U.K. by means of multiple discriminant analysis. Journal of Business Finance and Accounting 17(1): 73–84.



Barnes P. 2000. The identification of U.K. takeover targets using published historical cost accounting data. Some empirical evidence comparing logit with linear discriminant analysis and raw financial ratios with industry-relative ratios. International Review of Financial Analysis 9(2): 147–162.

Beasley SM, Carcello JV, Hermanson DR. 1999. Fraudulent financial reporting: 1987–1997: an analysis of US public companies. Research Report, COSO.

Bell T, Tabor R. 1991. Empirical analysis of audit uncertainty qualifications. Journal of Accounting Research 29: 350–370.

Bonner S, Lewis B. 1990. Dimensions of auditor expertise. Journal of Accounting Research, Supplement 28: 1–28.

Briscoe G, Caelli T. 1996. A Compendium of Machine Learning Volume 1: Symbolic Machine Learning. Ablex Publishing Corporation.

Caramanis C, Spathis Ch. 2006. Auditee and audit firm characteristics as determinants of audit qualifications: evidence from the Athens stock exchange. Managerial Auditing Journal 21(9): 905–920.

Carbonell JG, Michalski RS, Mitchell TM. 1983. Machine learning: a historical and methodological analysis. The AI Magazine 4(3): 69–79.

Carcello J, Hermanson R, McGrath N. 1992. Audit quality attributes: the perceptions of audit partners, preparers, and financial statement users. Auditing: A Journal of Practice and Theory 11: 1–15.

Choo F, Trotman KT. 1991. The relationship between knowledge structure and judgments for experienced and inexperienced auditors. Accounting Review 66: 464–485.

De Beelde I. 1997. An exploratory investigation of industry specialization of large audit firms. The International Journal of Accounting 32(3): 337–355.

Dietrich JR. 1984. Discussion of methodological issues related to the estimation of financial distress prediction models. Journal of Accounting Research 22(Supplement): 83–86.

Dopouch N, Holthausen R, Leftwich R. 1987. Predicting audit qualifications with financial and market variables. Accounting Review 62(3): 431–454.

Doumpos M. 2002. A stacked generalization framework for credit risk assessment. Operational Research: An International Journal 2(2): 261–278.

Doumpos M, Pasiouras F. 2005. Developing and testing models for replicating credit ratings: a multicriteria approach. Computational Economics 25(4): 327–341.

Doumpos M, Gaganis Ch, Pasiouras F. 2005. Explaining qualifications in audit reports using a support vector machine methodology. Intelligent Systems in Accounting, Finance and Management 13: 197–215.

Eisenbeis RA. 1977. Pitfalls in the application of discriminant analysis in business, finance, and economics. Journal of Finance (June): 875–900.

Espahbodi H, Espahbodi P. 2003. Binary choice models for corporate takeover. Journal of Banking and Finance 27: 549–574.

Fanning KM, Cogger KO. 1994. A comparative analysis of artificial neural networks using financial distress prediction. International Journal of Intelligent Systems in Accounting, Finance and Management 3: 241–252.

Fanning K, Cogger K. 1998. Neural detection of management fraud using published financial data. International Journal of Intelligent Systems in Accounting, Finance and Management 7(1): 21–41.

Fanning K, Cogger K, Srivastana R. 1995. Detection of management fraud: a neural network approach. International Journal of Intelligent Systems in Accounting, Finance and Management 4(2): 113–126.

Fearnley S, Hines T. 2003. The regulatory framework for financial reporting and auditing in the United Kingdom: the present position and impending changes. The International Journal of Accounting 38: 215–233.

Fischhoff B. 1982. For those condemned to study the past: heuristics and biases in hindsight. In Judgement Under Uncertainty: Heuristics and Biases, Kahneman D, Slovic P, Tversky A (eds). Cambridge University Press: Cambridge, UK.

Frederick DM. 1991. Auditors' representation and retrieval of internal control knowledge. Accounting Review 66: 240–258.

Frederick DM, Libby R. 1986. Expertise and auditors' judgments of conjunctive events. Journal of Accounting Research 24: 270–290.

Fritz S, Hosemann D. 2000. Restructuring the credit process: behaviour scoring for German corporates. International Journal of Intelligent Systems in Accounting, Finance and Management 9: 9–21.

Fukunaga K, Flick T. 1984. An optimal global nearest neighbour metric. IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-6(3): 314–318.



Gaganis Ch, Pasiouras F, Tzanetoulakos A. 2005. A comparison and integration of classification techniques for the prediction of small UK firms failure. The Journal of Financial Decision Making 1(1): 55–69.

Gaganis Ch, Pasiouras F, Doumpos M. 2007. Probabilistic neural networks for the identification of qualified audit opinions. Expert Systems with Applications 32: 114–124.

Hamer MM. 1983. Failure prediction: sensitivity of classification accuracy to alternative statistical methods and variable sets. Journal of Accounting and Public Policy 2: 289–307.

Hellstrom T, Holmstrom K. 2000. The relevance of trends for predictions of stock returns. International Journal of Intelligent Systems in Accounting, Finance and Management 9: 23–34.

Henley W, Hand D. 1996. A k-NN classifier for assessing consumer credit risk. Statistician 45: 77–95.

Ireland J. 2003. An empirical investigation of determinants of audit reports in the UK. Journal of Business Finance and Accounting 30(7–8): 975–1015.

Jo H, Han I. 1996. Integration of case-based forecasting, neural network, and discriminant analysis for bankruptcy prediction. Expert Systems with Applications 11(4): 415–422.

Keasey K, Watson R, Wynarzcyk P. 1988. The small company audit qualification: a preliminary investigation. Accounting and Business Research 18: 323–333.

Kirkos E, Spathis Ch, Manolopoulos Y. 2007. Data mining techniques for the detection of fraudulent financial statements. Expert Systems with Applications 32: 995–1003.

Kocagil AE, Escott Ph, Glormann F, Malzkorn W, Scott A. 2002. Moody's RiskCalc for private companies: UK. Moody's Investors Service, Global Credit Research.

Lai Ch-Ch, Tsai M-Ch. 2004. An empirical performance comparison of machine learning methods for spam e-mail categorization. In Proceedings of the Fourth International Conference on Hybrid Intelligent Systems. IEEE Computer Society: Washington, DC; 44–48.

Laitinen EK, Laitinen T. 1998. Qualified audit reports in Finland: evidence from large companies. European Accounting Review 7(4): 639–653.

Leban G, Zupan B, Vidmar G, Bratko I. 2006. VizRank: data visualization guided by machine learning. Data Mining and Knowledge Discovery 13(2): 119–136. DOI: 10.1007/s10618-005-0031-5.

Lenard M, Alam P, Madey G. 1995. The application of neural networks and a qualitative response model to the auditor's going concern uncertainty decision. Decision Sciences 26(2): 209–227.

Lennox C. 2000. Do companies successfully engage in opinion-shopping? Evidence from the UK. Journal of Accounting and Economics 29: 321–337.

Levitan A, Knoblett J. 1985. Indicators of exceptions to the going concern opinion decision. Auditing: A Journal of Practice and Theory 5(1): 26–39.

Liu Y. 2002. The evaluation of classification models for credit scoring. Institut für Wirtschaftsinformatik, Georg-August-Universität Göttingen.

Loebbecke J, Eining M, Willingham J. 1989. Auditor's experience with material irregularities: frequency, nature, and detectability. Auditing: A Journal of Practice and Theory 9: 1–28.

Mahajan J. 1992. The overconfidence effect in marketing management predictions. Journal of Marketing Research 29: 329–342.

McKeown JC, Mutchler JF, Hopwood W. 1991. Towards an explanation of auditor failure to modify the audit opinions on bankrupt companies. Auditing: A Journal of Practice and Theory 10: 1–13.

Menon K, Schwartz H. 1987. An empirical investigation of audit qualification decisions in the presence of going-concern uncertainties. Contemporary Accounting Research 3(2): 303–315.

Mutchler J. 1985. A multivariate analysis of the auditor's going concern opinion decision. Journal of Accounting Research 23(2): 668–682.

Nilsson NJ. 1996. Introduction to machine learning (an early draft of a proposed textbook). Robotics Laboratory, Department of Computer Science, Stanford University, 4 December. http://robotics.stanford.edu/people/nilsson/MLDraftBook/MLBOOK.pdf [15 April 2007].

Nowman K, Saltoglu B. 2003. Continuous time and nonparametric modelling of U.S. interest rate models. International Review of Financial Analysis 12: 25–34.

O'Keefe T, Simunic D, Stein M. 1994. The production of audit services: evidence from a major public accounting firm. Journal of Accounting Research 32: 241–261.

Oskamp S. 1982. Overconfidence in case-study judgments. In Judgment Under Uncertainty: Heuristics and Biases, Kahneman D, Slovic P, Tversky A (eds). Cambridge University Press: Cambridge, UK; 287–293.

Palepu KG. 1986. Predicting takeover targets: a methodological and empirical analysis. Journal of Accounting and Economics 8: 3–35.



Pasiouras F, Tanna S, Zopounidis C. 2005. Application of Quantitative Techniques for the Prediction of Bank Acquisition Targets. World Scientific.

Pasiouras F, Gaganis Ch, Zopounidis C. 2007. Multicriteria decision support methodologies for auditing decisions: the case of qualified audit reports in the UK. European Journal of Operational Research 180(3): 1317–1330.

Pomeranz F. 1992. The Successful Audit. Irwin: Homewood, IL.

Reynolds J, Francis J. 2001. Does size matter? The influence of large clients on office-level auditor reporting decisions. Journal of Accounting and Economics 30: 375–400.

Shepherd DA, Zacharakis A, Baron RA. 2003. VCs' decision processes: evidence suggesting more experience may not always be better. Journal of Business Venturing 18(3): 381–401.

Short R, Fukunaga K. 1980. A new nearest neighbor distance measure. In Proceedings of the Fifth IEEE Computer Society Conference on Pattern Recognition. IEEE Computer Society Press: Washington, DC; 81–86.

Spathis C. 2002. Detecting false financial statements using published data: some evidence from Greece. Managerial Auditing Journal 17(4): 179–191.

Spathis C. 2003. Audit qualification, firm litigation, and financial information: an empirical analysis in Greece. International Journal of Auditing 7(1): 71–85.

Spathis Ch, Doumpos M, Zopounidis C. 2002. Detecting falsified financial statements: a comparative study using multicriteria analysis and multivariate statistical techniques. The European Accounting Review 11(3): 509–535.

Spathis Ch, Doumpos M, Zopounidis C. 2003. Using client performance measures to identify pre-engagement factors associated with qualified audit reports in Greece. The International Journal of Accounting 38: 267–284.

Stein R. 2002. Benchmarking default prediction models: pitfalls and remedies in model validation. Moody's KMV Technical Report #030124, June.

Summers SL, Sweeney JT. 1998. Fraudulently misstated financial statements and insider trading: an empirical analysis. The Accounting Review 73(1): 131–146.

Tam K, Kiang M. 1992. Managerial applications of neural networks: the case of bank failure predictions. Management Science 38(7): 926–947.

Tartari E, Doumpos M, Baourakis G, Zopounidis C. 2003. A stacked generalization framework for the prediction of corporate acquisitions. Foundations of Computing and Decision Sciences 28(1): 41–61.

West D. 2000. Neural network credit scoring models. Computers and Operations Research 27(11–12): 1131–1152.