The Essential Elements of Faceted Thesauri

This article was downloaded by: [Dalhousie University]
On: 28 January 2014, At: 11:29
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954
Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Cataloging & Classification Quarterly
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/wccq20

The Essential Elements of Faceted Thesauri
Louise F. Spiteri PhD, Dalhousie University, School of Library and Information Studies, Halifax, Nova Scotia, Canada
Published online: 24 Oct 2008.

To cite this article: Louise F. Spiteri PhD (2000) The Essential Elements of Faceted Thesauri, Cataloging & Classification Quarterly, 28:4, 31-52, DOI: 10.1300/J104v28n04_05

To link to this article: http://dx.doi.org/10.1300/J104v28n04_05




The Essential Elements of Faceted Thesauri

Louise F. Spiteri

ABSTRACT. The goal of this study is to evaluate, compare, and contrast how facet analysis is used to construct the systematic or faceted displays of a selection of information retrieval thesauri. More specifically, the study seeks to examine which principles of facet analysis are used in the thesauri, and the extent to which different thesauri apply these principles in the same way. A measuring instrument was designed for the purpose of evaluating the structure of faceted thesauri. This instrument was applied to fourteen faceted information retrieval thesauri. The study reveals that the thesauri do not share a common definition of what constitutes a facet. In some cases, the thesauri apply both enumerative-style classification and facet analysis to arrange their indexing terms. A number of the facets used in the thesauri are not homogeneous or mutually exclusive. The principle of synthesis is used in only 50% of the thesauri, and no one citation order is used consistently by the thesauri. [Article copies available for a fee from The Haworth Document Delivery Service: 1-800-342-9678. E-mail address: [email protected] <Website: http://www.haworthpressinc.com>]

KEYWORDS. Faceted thesauri, thesaurus construction, evaluation, facet analysis

INTRODUCTION

In this study, facet analysis is defined as the sorting of terms in a given field of knowledge into “homogenous, mutually exclusive facets, each derived from the parent universe by a single characteristic of division . . . every distinctive logical category should be isolated, every new characteristic of division should be clearly indicated.”1 A facet consists therefore of a group of terms that represents one, and only one, characteristic of division of a subject field (e.g., STUDENTS, CURRICULUM, TEACHING METHODS are facets of the subject field Education). In this context, mutually exclusive means that each facet should represent a characteristic of division that is not found in any other facet. In other words, no two facets may contain terms that could represent the same concepts.

Louise F. Spiteri, PhD, is Assistant Professor, Dalhousie University, School of Library and Information Studies, Halifax, Nova Scotia, Canada.

Cataloging & Classification Quarterly, Vol. 28(4) 1999. © 1999 by The Haworth Press, Inc. All rights reserved.

Facet analysis has been used in the construction of information retrieval (IR) thesauri since the publication of the Information Retrieval Thesaurus of Education Terms in 1968.2 A faceted thesaurus consists typically of the following components: an introduction; a systematic or faceted display; and an alphabetical display. In the faceted display, the indexing terms form the faceted classification. The growth in the number of faceted thesauri has not generally, however, been accompanied by any comprehensive attempts to evaluate their structure and use of facet analysis; most evaluation studies have been conducted on individual thesauri.3,4,5,6,7,8 While these studies provide valuable insights into the construction of individual faceted thesauri, it is very difficult to compare and contrast the way that facet analysis is used amongst the different thesauri. Furthermore, in the case of many of these studies, mention is made of ‘systematic’ or ‘hierarchical’ displays, but the specific structure of these displays is not discussed. Lastly, the reviewers do not always indicate the criteria against which they measure the quality of the thesauri; thus there is no common frame of reference by which to compare the thesauri.

Van Dijk and Sager conducted two comprehensive evaluation studies of several thesauri.9,10 While very detailed in their analyses, both studies looked at a variety of IR thesauri, both faceted and non-faceted, and thus the criteria they used were not always specific enough to the unique structure of faceted thesauri.

OBJECTIVES

The goal of this study is to evaluate, compare, and contrast the way that facet analysis is used to construct the systematic or faceted displays of a selection of IR thesauri. More specifically, the study seeks to answer the following questions:


a. Which principles of facet analysis are used in the thesauri?
b. Do the thesauri apply the same principles of facet analysis?
c. Do the thesauri apply these principles in the same way?
d. Are these principles applied consistently with those developed by Ranganathan and the Classification Research Group?

The purpose of this study is not to criticize the structure of any one faceted thesaurus, but to study the usage of facet analysis in thesaurus construction. There is no recognized model for the application of facet analysis to IR thesauri: national and international guidelines for thesaurus construction make minimal mention of the use of facet analysis.11 Ranganathan’s principles for facet analysis are extremely detailed and complex, and are written in a style that is not always easy to understand. There is also a lack of consistency between Ranganathan and the Classification Research Group in their treatments of fundamental categories and the citation order of facets. Consequently, thesaurus designers might not have access to clear guidelines pertaining to the application of facet principles to IR thesauri. It is hoped that an examination of the way that IR thesauri currently use facet analysis might lead to the formulation of a model for incorporation into guidelines and models for thesaurus construction.

METHODOLOGY

A measuring instrument (Appendix 1) was designed for the purpose of evaluating the structure or composition of faceted thesauri.12 This instrument was designed to look at all three sections of a faceted thesaurus. Each section is divided into a series of criteria and their corresponding measures; for example, in section one of the instrument, Criterion 1 is divided into Measures 1a-1c. The criteria and measures used in the instrument were derived from the following categories of sources:

a. Introduction and alphabetical display of faceted thesauri: guidelines and manuals for thesaurus construction.13,14,15,16,17,18,19

b. Faceted display of thesauri: the principles of facet analysis expounded by Ranganathan20,21 and the Classification Research Group (CRG).22,23,24,25



Because this study is interested in the use of facet analysis in IR thesauri, rather than their overall structure, only the sections in the instrument pertaining to the introduction and the faceted display of faceted thesauri were used. The introduction is important to include in this study because it is here that thesaurus designers tend to define the principles they used to construct the thesaurus.

The criteria used in section one of the instrument are measured along an ordinal scale and determine the extent to which the introductory section of a faceted thesaurus explains its underlying principles. This section uses the following scoring mechanism:

No explanation = 1
Some explanation = 2
Detailed explanation = 3

Section two of the measuring instrument uses an ordinal scale to measure the extent to which the IR thesauri apply the principles of facet analysis to the systematic/faceted display. The measures are scored as follows:

Never = 1
Sometimes = 2 (less than 70% of the time)
Most times = 3 (70% of the time or more)
Always = 4 (100% of the time)
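The four-point scale above can be read as a simple mapping from counts to scores. The following Python sketch is offered as an illustration only and is not part of the original instrument: the function name `ordinal_score` and its two parameters are assumptions, as is the reading that "Sometimes" covers any non-zero share below 70%.

```python
def ordinal_score(conforming: int, examined: int) -> int:
    """Map a count of conforming items to the 1-4 scale of section two.

    `conforming` is the number of examined items (e.g., broad facets) that
    satisfy a measure; `examined` is the number of items looked at.
    Both names are hypothetical and used for illustration.
    """
    if examined <= 0 or not 0 <= conforming <= examined:
        raise ValueError("need examined > 0 and 0 <= conforming <= examined")
    if conforming == 0:
        return 1  # Never
    if conforming == examined:
        return 4  # Always (100% of the time)
    # Integer comparison avoids floating-point error at the 70% boundary.
    if conforming * 10 >= examined * 7:
        return 3  # Most times (70% of the time or more)
    return 2  # Sometimes (less than 70% of the time)


print(ordinal_score(1, 5))   # 2
print(ordinal_score(4, 5))   # 3
print(ordinal_score(5, 5))   # 4
```

Under this reading, a thesaurus in which 1 of 5 examined broad facets meets a measure would score 2 on that measure.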

For each measure, a formula was used based upon an examination of 30% of the faceted display, since some of the thesauri have extensive faceted displays that cover hundreds of pages. For example, if the faceted display has 15 main classes (or broad facets), 5 of these classes were examined in depth (i.e., every third class).
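The selection of every third class amounts to a systematic sample. The sketch below illustrates the idea in Python; the function name `sample_classes`, the placeholder class labels, and the choice to start counting at the third class are assumptions, since the study does not state where the count begins. Note that 5 of 15 classes is one third of the display, slightly more than the stated 30%.

```python
def sample_classes(classes, step=3):
    """Return every `step`-th class, beginning with the `step`-th one.

    The starting point is an assumption made for this illustration.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    return classes[step - 1::step]


# Placeholder labels standing in for the 15 main classes of a faceted display.
main_classes = [f"Class {n}" for n in range(1, 16)]
sampled = sample_classes(main_classes)

print(len(sampled))  # 5
print(sampled)       # ['Class 3', 'Class 6', 'Class 9', 'Class 12', 'Class 15']
```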

The 14 thesauri (Appendix 2) used in this study were limited to faceted thesauri published since 1980 and available via the Novanet Library Consortium at Dalhousie University. It should be noted that Appendix 2 indicates also the abbreviated names for the thesauri that shall be used henceforth. Only the print versions of the thesauri were examined. Although some of the thesauri in the population may be available online, the introductory and explanatory texts necessary to the evaluation of the thesauri are not always available in the online versions. The following criteria were used to determine whether or not thesauri are faceted:



a. Thesauri are described in the bibliographies and the research literature as being faceted, or as using facet analysis; and

b. Thesauri state in their introductions that they are faceted or that they use facet analysis.

ASSUMPTIONS

Ranganathan and the CRG believe that a subject should be divided initially into broad facets, or fundamental categories, which are roughly equivalent to a main class in enumerative classification systems. Each broad facet is broken down into its component sub-facets, each of which represents a single characteristic of division of the broad facet. Sub-facets should thus also be homogeneous and mutually exclusive.

Ranganathan suggests 5 characteristics of division by which to derive facets, namely Personality, Matter, Energy, Space, and Time (PMEST). The CRG, on the other hand, believes that no single set of characteristics/categories can be applied to all subjects; instead, each subject area should be divided into categories that are appropriate to its nature. Since both Ranganathan and the CRG agree about the essential qualities of a facet (i.e., single characteristic of division, exclusivity, and homogeneity), it is these basic qualities that will be looked for in the faceted display of IR thesauri.

One of the most attractive features of facet analysis is its “synthetic” ability, i.e., the ability to combine terms from different facets to express compound subjects. Ranganathan and the CRG both agree that a citation order should be used to determine the combination of terms, but they do not agree upon the type of order. Ranganathan believes that facets should be ordered in the sequence P-M-E-S-T, whereas the CRG prefers a more flexible order that is determined by the needs and structure of the subject being classified. In this study, although the use of citation order was measured, no one type of citation order was advocated or looked for.
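Synthesis under a citation order can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not a rendering of Ranganathan's or the CRG's actual notation: the function name `synthesize`, the sample terms, their facet assignments, and the colon separator are all hypothetical. A CRG-style flexible order could be modelled by passing a different `citation_order`.

```python
# Ranganathan's sequence of fundamental categories, as named in the text.
PMEST = ["Personality", "Matter", "Energy", "Space", "Time"]


def synthesize(terms, citation_order=PMEST, separator=": "):
    """Combine (term, facet) pairs into one string following a citation order.

    The separator and the pair-based input format are assumptions made
    for this illustration.
    """
    rank = {facet: position for position, facet in enumerate(citation_order)}
    unknown = [facet for _, facet in terms if facet not in rank]
    if unknown:
        raise ValueError(f"facets missing from citation order: {unknown}")
    # sorted() is stable, so terms from the same facet keep their input order.
    ordered = sorted(terms, key=lambda pair: rank[pair[1]])
    return separator.join(term for term, _ in ordered)


# Hypothetical terms and facet assignments, supplied in arbitrary order.
compound = synthesize([
    ("1990s", "Time"),
    ("Canada", "Space"),
    ("Libraries", "Personality"),
    ("Cataloguing", "Energy"),
])
print(compound)  # Libraries: Cataloguing: Canada: 1990s
```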

FINDINGS

Section 1: Introductory Sections of the Thesauri

Why is it important to evaluate the introduction? For one thing, all classification systems reflect bias: the bias of their creators, of the subject area, of scientific and educational consensus pertaining to the way that a subject area should be categorized and organized, etc. It is important to explain as much as possible the influences that underlie the structure of the faceted display so that searchers know these in-built biases. Secondly, many searchers, and possibly some indexers, might not be very familiar with either the tenets of facet analysis, or the way in which a faceted classification system works.

Criterion 1: Explanation of the Facet Principles and Processes Used to Construct the Thesaurus

Measure 1a: To what extent does the introduction explain what facet analysis is and the way that it has been used in the thesaurus? A detailed explanation may include an outline of the basic concepts of facet analysis (e.g., subject division, synthesis, fundamental categories, characteristics of division, facets). Most of the thesauri explain to some degree the facet principles used to construct the faceted display. The notable exceptions are MOYS and IBE, which provide no such explanation. For the most part, the thesauri concentrate upon stating that facet analysis was used to arrange related terms. Only AAT, REFUGEE, and YOUTH explain in more detail the principles of the mutual exclusivity and homogeneity of facets.

There is little consistency amongst the 14 thesauri when it comes to defining what constitutes a facet. IBE and UNICEF, for example, define facets as groups that cover related concepts. In BINDING and GENRE, facets are “gathering terms” used to arrange the hierarchical relationships amongst broader and narrower terms. ROOT and YOUTH both state that facets are fundamental categories, but do not explain what this means. In AAT, facets are homogeneous, mutually exclusive units of information that share characteristics that demonstrate their differences from each other. It is unfortunate, however, that AAT uses both the terms “facets” and “hierarchies,” which serves to further confuse the issue. Facets represent the broad categories into which the subject area is divided, whilst hierarchies represent the subdivisions of the facets (referred to as sub-facets by CRG and Ranganathan). This can be confusing, because the term “hierarchy” in classification usage traditionally implies a genus-species or whole-part division, which is not the basis for division used in facet analysis.

Measure 1b: This measure looks at the degree to which the introduction explains the characteristics of division used to derive the broad facets (BFs) in the thesaurus. ASIS, BINDING, GENRE, PHYSICS, ROOT, and SPORTS merely list the BFs they use, but provide no further explanation about the way they were derived. The remaining thesauri provide detailed explanations about the way that the BFs were derived and the nature of their contents.

Measure 1c: This measure looks at the extent to which the introduction explains the characteristics of division used to derive the sub-facets of the BFs. ASIS, BINDING, GENRE, and UNICEF do not explain the principles used to derive their sub-facets. The remaining thesauri all provide detailed explanations of the way that sub-facets were derived, with the exception of PHYSICS and SPORT: the former confines its explanation to the provision of the main facet labels used in the thesaurus; the latter discusses the way that only three sub-facets were derived by way of example.

Criterion 2: Explanation of the Use of Synthesis in the Faceted Display

Measure 2a: This measure looks at the extent to which the introduction explains the way that synthesis is used to express compound concepts in the thesaurus. Only 7 of the thesauri (50%) mention that they use synthesis to allow for the combination of terms from different parts of the faceted display. Explanations provided in each of these 7 cases are detailed and include helpful examples of the way that terms can be combined. ASIS, BINDING, BUILDING, GENRE, IBE, PHYSICS, and INFORMATICS make no mention of the use of synthesis.

Measure 2b: This measure looks at the extent to which the introduction explains the citation order used to arrange sub-facets in the faceted display and whether all the BFs arrange their facets in the same order. Not surprisingly, those thesauri that do not use synthesis do not mention citation orders. Of the thesauri that do use synthesis, only ROOT and UNICEF fail to provide an explanation of citation order. AAT explains that indexers should follow the instructions for combining terms given in each facet, as the citation order can vary from facet to facet. MOYS uses the Universal Decimal Classification approach of linking terms from different parts of the thesaurus via a semi-colon; the citation order of the terms depends upon the alphabetical order of the notation corresponding to each term. For example, if one were to combine the two terms “Schools” (notation KN 184.2) and “Sexual Discrimination” (notation KM 208.2), “Sexual Discrimination” would precede “Schools.” SPORT also bases citation order upon the alphabetical order of the notation corresponding to the indexing terms. YOUTH uses the shunting approach of the PRECIS system to order its indexing terms, thereby generating strings of terms for a classified catalogue, e.g.:

Junior Youth Clubs       QXT
Youth Club Programmes    QV
Management               GB

Three combinations or strings of terms would be generated from the notation, namely:

1. GB:QV:QXT
2. QV:QXT:GB
3. QXT:QV:GB
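The notation-based rule attributed to MOYS in Measure 2b, in which combined terms are cited in the alphabetical order of their notations and joined by a semi-colon, can be sketched in Python as follows. The function name `moys_compound` is hypothetical, and a plain string comparison of the notations is an assumption; the thesaurus itself may order numeric parts differently.

```python
def moys_compound(terms):
    """Order (label, notation) pairs by notation and join the labels.

    Sorting the notations as plain strings is an assumption of this sketch.
    """
    ordered = sorted(terms, key=lambda pair: pair[1])
    return "; ".join(label for label, _ in ordered)


# The two terms and notations cited in the discussion of MOYS.
heading = moys_compound([
    ("Schools", "KN 184.2"),
    ("Sexual Discrimination", "KM 208.2"),
])
print(heading)  # Sexual Discrimination; Schools
```

Because “KM” sorts before “KN,” the sketch reproduces the order described for the MOYS example.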

Section 2: Faceted Displays of the Thesauri

The faceted display of a thesaurus should analyse and break down the subject area of the thesaurus into its conceptual elements. The faceted display should show the way that descriptors are related both to the subject area and to one another.

Criterion 3: Analysis of the Subject Area of the Thesaurus into Its Component Parts

Measure 3a: This measure looks at the extent to which the BFs in the faceted display represent clearly only one characteristic of division of the subject. In other words, are the BFs homogeneous? Not all of the thesauri examined used facet analysis to derive their BFs. MOYS, ROOT, UNICEF, and YOUTH admit to using main classes based upon traditional disciplines, such as “economics,” “history,” and so forth, rather than broad facets derived by the strict application of facet analysis. In these four thesauri, therefore, the definition of a single characteristic of division must be necessarily broad, e.g., the class “History,” though broad, can be used to represent a single characteristic of division of the subject Youth, i.e., a gathering place for all descriptors that pertain to an historical study of this area.



The problem posed here could be one of using two different approaches to the classification of indexing terms. Main classes generally represent a somewhat loose association of terms based upon their relationship to an academic discipline, rather than to a characteristic of division of the subject. Furthermore, this approach is rather deductive: it lends itself to an easy division of the subject area, but does it reflect an in-depth analysis of the component characteristics of a subject area? Finally, main classes are often derived from the application of hierarchical principles of division (i.e., genus-species and enumeration), an approach not advocated by facet analysis, which looks instead at the division of a subject area into its component parts in a non-hierarchical/linear order.

Some of the BFs used in the thesauri do not, in fact, represent single characteristics of division. SPORTS does a rather poor job of creating homogeneous broad facets: only 1 of its 5 BFs is homogeneous (Organizations). The other BFs represent several different characteristics of division, e.g., Leisure Facilities and the Physical Environment, which includes concepts pertaining to sports facilities, sports equipment, building materials, and construction elements. In PHYSICS the first BF in the thesaurus is actually called Physics; given that the entire thesaurus is about Physics, the scope of this BF is rather broad and lacks definition. This BF contains descriptors pertaining to theories, different types of physics, and so forth, which means that it does not represent a single characteristic of division.

Two of the main classes in YOUTH represent more than one characteristic of division, namely, “History. Philosophy. Ethics and Religion,” as well as “Arts. Philology.” It seems that YOUTH has classed together different disciplines into single main classes; in the case of the latter class, the disciplines are not even related closely. Only one of the 24 BFs used in ROOT is not homogeneous, namely, Social Sciences and Humanities, which represents two areas of study, rather than a single discipline. Only 6 thesauri have completely homogeneous BFs: ASIS, IBE, INFORM, MOYS, REFUGEE, and UNICEF.

Measure 3b: This measure looks at the extent to which BFs are mutually exclusive. In the case of most of the thesauri, all the BFs are mutually exclusive. This might sound surprising, given that some of these thesauri had non-homogeneous BFs. It is possible, however, for a BF to represent different characteristics of division that do not, in fact, appear in any other BF, which is what seems to have happened here. Noticeable exceptions are BUILDING and INFORMATICS; in the latter thesaurus, the term “data transmission systems” appears in the two BFs Systems and Impact. The worst two offenders are ASIS and GENRE, where terms appear frequently in different BFs; in fact, each of the BFs contains terms that overlap in at least one other BF.
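A check for mutual exclusivity of this kind reduces to finding terms that occur in more than one facet. The Python sketch below illustrates one way to do so; it is not the procedure used in the study. The function name `overlapping_terms` is hypothetical, and apart from “data transmission systems” in the BFs Systems and Impact, every term and facet in the sample data is invented for illustration.

```python
from collections import defaultdict


def overlapping_terms(facets):
    """Return {term: [facet names]} for terms found in more than one facet.

    `facets` maps a facet name to its terms. Terms are compared after
    trimming whitespace and lower-casing, which is an assumption here.
    """
    homes = defaultdict(set)
    for facet, terms in facets.items():
        for term in terms:
            homes[term.strip().lower()].add(facet)
    return {term: sorted(names) for term, names in homes.items() if len(names) > 1}


# Only "data transmission systems" is taken from the findings above;
# the remaining entries are hypothetical filler.
sample = {
    "Systems": ["data transmission systems", "operating systems"],
    "Impact": ["data transmission systems", "privacy"],
    "Agents": ["programmers"],
}
overlaps = overlapping_terms(sample)
print(overlaps)  # {'data transmission systems': ['Impact', 'Systems']}
```

An empty result would indicate that the examined facets are mutually exclusive at the level of exact term matches; it would not detect different terms that represent the same concept.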

Measure 3c: This measure looks at the extent to which BFs are divided into sub-facets. All the thesauri divide their BFs into sub-facets to varying degrees. AAT, BUILDING, IBE, MOYS, PHYSICS, ROOT, UNICEF and YOUTH subdivide all of their BFs. This division into sub-facets is important as it helps define more clearly the specific nature of the relationships amongst individual indexing terms. In INFORMATICS, the BF Agents contains an undivided list of descriptors. In SPORTS, class V merely lists the names of organizations. In BINDING, the BFs Styles and Binders’ Evidence are not sub-divided and contain merely an alphabetical listing of indexing terms. This is not, perhaps, a terribly crucial point with thesauri whose indexing terms are relatively few in number (as is the case with BINDING), but one must be careful to avoid the situation where a faceted display starts to look just like an alphabetical display, relying only upon very broad categories of terms that appear to share only marginal relationships to one another. The worst two offenders are ASIS and BINDING, where only 50% and 31% respectively of the BFs are subdivided.

Measure 3d: This measure looks at the extent to which the sub-facets are homogeneous. Most of the thesauri do a good job of creating sub-facets that are homogeneous: in the cases of BINDING, INFORM, MOYS, PHYSICS, REFUGEE, and ROOT, all the sub-facets examined represented single characteristics of division. In UNICEF, the sub-facet Child Development is problematic, as it is a subdivision of the BF Child Development and Environment. Since child development is the focus of the entire BF, it is not clear what this sub-facet represents. In YOUTH, 248 out of the 250 sub-facets examined are not homogeneous, e.g., the sub-facet Youth Clubs is divided “By religion, race, sex,” which represents three characteristics of division. In SPORT the sub-facet By capital and investment and insurance contains concepts that might be related but that are not homogeneous.

Measure 3e: This measure looks at the extent to which the sub-facets are mutually exclusive. All the sub-facets examined in AAT, BINDING, IBE, MOYS, PHYSICS, REFUGEE, ROOT, SPORTS, UNICEF and YOUTH are mutually exclusive. The sub-facets of the remaining thesauri vary, however, in their degree of mutual exclusivity. BUILDING, GENRE, and INFORMATICS contain a number of sub-facets where terms overlap: e.g., in INFORMATICS, Programming appears as a sub-facet of the BF Methods and Models and as an indexing term under the sub-facet Systems. In the case of ASIS, all the 19 sub-facets examined overlapped to one degree or another: the term “facsimile transmissions,” for example, appears under sub-facets Communications Activities and Information and Library Operations.

Measure 3f: This measure looks at the extent to which the sub-facets are labeled clearly. This measure might have cosmetic value only, as it looks at the extent to which facet labels indicate the characteristic of division that each sub-facet is meant to represent. This measure has value in that it helps ensure a degree of transparency in the structure of the faceted display.

Most of the thesauri do a good job of identifying clearly the characteristics of division represented by their sub-facets. ROOT, for example, uses very clear labels such as By process or By type. Of the 61 sub-facets examined in UNICEF, only 2 appeared to lack clarity in their names, e.g., the sub-facet Development in the BF Child Development and Environment: it is not clear what characteristic was used to derive this sub-facet. IBE could do a better job of differentiating amongst the meanings of the sub-facets Educational Personnel, Instructional Staff, and Teachers, because their names do not identify clearly the way that the characteristics of division used by these sub-facets differ from one another.

Criterion 4: Arrangement of Facets and Indexing Terms in Faceted Displays in Clear and Specific Orders

This section of the instrument looks at the degree to which the faceted displays arrange their sub-facets into a clearly discernible citation order. This section looks also at the order in which indexing terms are arranged within a single sub-facet (order in array). Imposing an order of arrangement is linked to two objectives:

a. Order of sub-facets and/or of indexing terms could be used to reflect the way that the internal components of a subject area are inter-related. For example, if indexing terms are arranged in a general-specific order, users might be better able to determine the level of specificity of a particular indexing term with regard to its relationship to the subject area. This internal order might also make for a more predictable faceted display. If all the BFs arrange their sub-facets in a general-specific order, for example, searchers and indexers might find it easier to locate a specific indexing term. For example, indexing terms relating to curriculum design would probably appear earlier in the faceted display than terms relating to special education, as the latter reflects a higher degree of specificity.

b. Order is linked closely to the concept of synthesis, whereby terms from different facets are linked in a consistent, prescribed order. Predictability in the internal arrangement of BFs and sub-facets is a tool of convenience to the users, as it sets up a pattern for them to follow. For this reason, no matter what arrangement is used, it should be used consistently, so that users know what to expect. In other words, all BFs should arrange their sub-facets in the same order, as this sets up a predictable pattern regarding the internal structure of the faceted display.

Measure 4a: This measure looks at the extent to which the BFs use a citation order to arrange their sub-facets. Citation order, in fact, is used by very few of the thesauri: most do not appear to use a particular order for the arrangement of their facets, other than alphabetical. An alphabetical citation order is certainly a convenient device, but is not preferred by either Ranganathan or the CRG, since it does not reflect the structural relationships among sub-facets. An alphabetical citation order is used exclusively in AAT, ASIS, BINDING, BUILDING, and GENRE. PHYSICS, SPORTS, UNICEF, and YOUTH do not appear to use any particular citation order, including alphabetical. YOUTH, for example, orders two of its sub-facets within the BF Leisure class as follows:

Play                Sport
(By personnel)      (By clubs)
(By provision)      (By persons)
(By equipment)      (By activities)
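Whether a sequence of sub-facet labels is alphabetical is easy to verify mechanically. The Python sketch below applies such a check to the two YOUTH sequences shown above; the function name `is_alphabetical` and the case-insensitive comparison are assumptions of this illustration.

```python
def is_alphabetical(labels):
    """Return True if the labels already appear in alphabetical order.

    Comparison ignores case; that choice is an assumption of this sketch.
    """
    keys = [label.casefold() for label in labels]
    return keys == sorted(keys)


# Sub-facet sequences as displayed for the YOUTH sub-facets Play and Sport.
play = ["By personnel", "By provision", "By equipment"]
sport = ["By clubs", "By persons", "By activities"]

print(is_alphabetical(play))   # False
print(is_alphabetical(sport))  # False
```

Both sequences fail the test, which is consistent with the observation that YOUTH does not appear to use even an alphabetical order for these sub-facets.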

The remaining thesauri use a citation order for some of their BFs: of the 8 BFs examined in ROOT, only one appears to have an obvious order which, in this case, conforms to the normal “operating procedure” for a research study:



Research
(By organization)
(By method)
(By activity)
(By result)

MOYS uses a general-specific order for 70% of its BFs. REFUGEE uses the order “Agents-Actions-Properties-Parts” for 80% of its BFs. IBE uses both alphabetical and general-specific order for most of its BFs. REFUGEE uses the following citation order for 4 of the 5 BFs examined: AGENTS--ACTIONS--PROPERTIES--MATERIALS--PARTS--TYPES.
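The proportion of BFs that follow a prescribed citation order can be computed once each BF's sub-facets are listed in display order. The Python sketch below illustrates this using the six-category order reported above; the function name `follows_order`, the rule that sub-facets outside the prescribed order are ignored, and all five sample BFs are hypothetical and do not reproduce the contents of any thesaurus.

```python
# The citation order reported for 4 of the 5 BFs examined.
CITATION_ORDER = ["Agents", "Actions", "Properties", "Materials", "Parts", "Types"]


def follows_order(sub_facets, citation_order=CITATION_ORDER):
    """Return True if the listed sub-facets respect the citation order.

    Sub-facets that do not occur in `citation_order` are skipped, which is
    an assumption made for this illustration.
    """
    rank = {name: position for position, name in enumerate(citation_order)}
    positions = [rank[name] for name in sub_facets if name in rank]
    return positions == sorted(positions)


# Hypothetical BFs; the labels "BF 1" to "BF 5" are placeholders.
examined = {
    "BF 1": ["Agents", "Actions", "Parts"],
    "BF 2": ["Agents", "Properties", "Types"],
    "BF 3": ["Actions", "Materials", "Parts", "Types"],
    "BF 4": ["Agents", "Actions", "Properties"],
    "BF 5": ["Parts", "Agents", "Actions"],
}
conforming = [name for name, subs in examined.items() if follows_order(subs)]
print(f"{len(conforming)} of {len(examined)} BFs follow the order")  # 4 of 5 BFs follow the order
```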

Measure 4b: This measure looks at the extent to which the BFs arrange their indexing terms in a specific order (i.e., order in array). Once again the thesauri vary greatly in the ways in which they order the indexing terms within individual BFs. UNICEF and INFORMATICS could not be measured here, since the faceted display of the former does not contain indexing terms, and because the graphical display of the latter means that indexing terms are arranged in a circular manner around the sub-facets, which makes it impossible to determine the direction of the terms.

AAT, ASIS, BINDING, GENRE, IBE, and PHYSICS all use alphabetical order to arrange the indexing terms within their sub-facets. This order is certainly predictable, and is an order accepted by both Ranganathan and the CRG. ROOT and SPORT sometimes order indexing terms in alphabetical order, but this is not done consistently; in fact in the latter thesaurus, only 12 of the 106 sub-facets examined order their indexing terms. YOUTH does not appear to order indexing terms in any consistent fashion.

Measure 4c: This measure looks at the extent to which each BF imposes a consistent citation order upon its sub-facets. Ideally, each BF should order its sub-facets in a consistent manner for the sake of predictability and consistency. Since most of the thesauri examined do not use a citation order of any kind (other than alphabetical), it is not surprising that there is little consistency in the order imposed upon each BF. For example, 3 different BFs may use a different citation order, depending upon the nature of the subject, but the order within each BF should be internally consistent. AAT, ASIS, BINDING, and GENRE use alphabetical order to arrange the indexing terms in all the BFs examined. MOYS and IBE use a general-specific order to arrange the indexing terms in some of their BFs, but no discernible order for other BFs. When the general-specific order is used, however, it is applied consistently. Since PHYSICS, YOUTH, SPORT, ROOT, and UNICEF do not use citation orders at all, consistent application of citation order becomes moot.

CONCLUSIONS AND RECOMMENDATIONS

An examination of the 14 thesauri reveals a number of interesting and perhaps somewhat troubling patterns in the way in which they apply the principles of facet analysis. The average score for the introductory section of the thesauri (Appendix 3, Table 1) is 10 out of a possible maximum of 15, with the lowest scores going to ASIS (6), IBE and PHYSICS (7), and the highest score of 15 going to AAT, BINDING, REFUGEE, and YOUTH. The average score for the faceted display (Appendix 3, Table 2) is 28 out of a possible maximum of 36, with the lowest scores going to BUILDING (23) and ASIS and SPORT (24) and the highest scores going to AAT, IBE, MOYS, and REFUGEE (31). AAT appears to be the most structurally sound in its application of the principles of facet analysis, and does an extraordinary job of explaining its structure and use. ASIS, in particular, could profit, perhaps, from a clearer delineation of its facets, as the degree of overlap amongst them appears to be unusually high.

Perhaps the most important point to emerge from the examination of the introductory section of the thesauri is the fact that the 14 thesauri do not appear to share a common interpretation of what constitutes a facet. Only AAT mentions the underlying concepts of homogeneity, mutual exclusivity, and characteristics of division that are generally associated with the definition of a facet. This lack of consistency pertaining to the nature of facets, which is the bedrock upon which facet analysis rests, does not bode well for the consistency of the structure of faceted displays in IR thesauri. If designers cannot even agree about what a facet is, how can they possibly build faceted displays with consistent structures?

Only 8 of the thesauri examined actually apply the principle of synthesis, which is the important second half of the principle of facet analysis (i.e., analytico-synthetic classification). Strictly speaking, can a faceted thesaurus that does not use synthesis be called faceted? One wonders, however, about the ability to apply fully the measure of synthesis to a post-coordinate thesaurus. Since most post-coordinate thesauri consist of single-concept indexing terms, the assumption is that these terms can be combined by indexer and searcher alike. From the indexing point of view, indexing terms are included in the descriptor field of the record as separate terms, e.g., ARCHITECTURE. CHURCHES. Granted, the two terms are not intrinsically linked, but they can be retrieved via the use of the Boolean operator AND. In a post-coordinate thesaurus, it is not necessary to create strings of indexing terms. It could therefore be argued that synthesis is inherent to these types of thesauri, and hence mention of this principle in the thesaurus could be redundant. As for citation order, its necessity in a post-coordinate thesaurus might be moot, given that most information retrieval systems do not distinguish between “churches AND architecture” and “architecture AND churches.” The frequent lack of citation order is therefore not surprising; the variety of citation orders used would seem to support the CRG viewpoint that no one order is necessarily suitable to all subjects. Without a doubt, alphabetical order emerges as the most popular way to order sub-facets and their indexing terms.
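The point that a post-coordinate system does not distinguish between the two query orders can be illustrated with a minimal sketch of Boolean AND retrieval. The records, the index structure, and the function names below are hypothetical and do not describe any particular retrieval system; the sketch assumes only that each record carries a set of single-concept descriptors and that AND is implemented as set intersection, which is commutative.

```python
# Hypothetical records: record id -> single-concept descriptors assigned by the indexer.
records = {
    1: {"ARCHITECTURE", "CHURCHES"},
    2: {"ARCHITECTURE", "BRIDGES"},
    3: {"CHURCHES", "MUSIC"},
}


def build_index(records):
    """Map each descriptor to the set of record ids that carry it."""
    index = {}
    for record_id, descriptors in records.items():
        for descriptor in descriptors:
            index.setdefault(descriptor, set()).add(record_id)
    return index


def search_and(index, *terms):
    """Boolean AND: intersect the record sets of all query terms."""
    postings = [index.get(term.upper(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


index = build_index(records)
print(search_and(index, "churches", "architecture"))  # {1}
print(search_and(index, "architecture", "churches"))  # {1}
```

Because intersection gives the same result whichever term comes first, a citation order has nothing to act upon in this kind of search.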

It is obvious that there is as yet no consensus amongst thesaurus designers about the best way in which to apply facet analysis; since national and international guidelines for thesaurus construction have failed also to reach a consensus, this situation is hardly surprising. What seems to be evident is that the terms “facet” and “facet analysis” mean different things to different thesaurus designers, which naturally results in thesauri that are inconsistent in their structure. This degree of inconsistency places the onus upon indexers and searchers to shift mental gears every time they move from one thesaurus to another. Granted that subject areas are different, but surely designers could create faceted displays whose structures are more transparent and predictable than is currently the case?

This type of transparent thesaurus structure is, no doubt, a pipe dream, but small steps can be taken to help narrow the differences amongst the thesauri. The first step should involve a standardized and “formal” definition of what constitutes a facet. This definition could be included in national and international guidelines for thesaurus construction, as they often serve as a template for thesaurus designers. Vickery’s definition (provided in the INTRODUCTION section) is elegant and simple enough to serve this function.

The second step is for thesaurus designers to pay closer attention to the concepts of homogeneity and mutual exclusivity. As things stand, it could be difficult and confusing to use a faceted display if an indexing term appears in more than one facet--where does one look? Is this a case of a homograph, or is the same term being used in the same semantic context in different parts of the display? Overlap amongst the facets blurs the internal organization and integrity of a subject field. In the face of the exponential growth of published information, it becomes increasingly important to define clearly the context and meaning of indexing terms.
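A designer could flag candidate violations of mutual exclusivity before publication with a simple overlap check, sketched below. The facet labels and terms are hypothetical, and the function name is an assumption made for illustration. The check can only report that a term occurs in more than one facet; deciding whether a given case is a homograph or a genuine overlap still requires human judgment.

```python
from collections import defaultdict

# Hypothetical faceted display: facet label -> indexing terms listed under it.
facets = {
    "AGENTS": ["architects", "builders", "cranes"],
    "MATERIALS": ["brick", "stone", "timber"],
    "OBJECTS": ["churches", "cranes", "bridges"],
}


def find_overlaps(facets):
    """Return {term: sorted facet labels} for every term listed in more than one facet."""
    locations = defaultdict(set)
    for label, terms in facets.items():
        for term in terms:
            locations[term.lower()].add(label)
    return {term: sorted(labels) for term, labels in locations.items() if len(labels) > 1}


print(find_overlaps(facets))  # {'cranes': ['AGENTS', 'OBJECTS']}
```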

The third step involves questioning the importance of applying the principle of citation order to a post-coordinate thesaurus. Post-coordination provides wonderful flexibility to both the indexer and the searcher, but the downside is that it can lead also to several false drops if search engines do not provide searchers with the ability to determine the sequence of words. Devices such as proximity operators or exact phrase searching mimic the function of citation order. If IR thesauri are to leave citation order in the hands of the searchers, it is important that information retrieval systems provide this service instead, especially since databases are growing to the point where the precision of searches becomes increasingly problematic.
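The difference between an unordered AND search and an exact phrase search can be shown with a minimal sketch. The two document texts and the function names are hypothetical, and the tokenizer is deliberately crude; the sketch assumes only that a phrase search requires the query words to be adjacent and in the stated sequence.

```python
import re

# Hypothetical document texts.
documents = {
    1: "A survey of church architecture in medieval England",
    2: "The architecture of the database was presented in a converted church",
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def search_and(documents, *terms):
    """Unordered AND: every term must occur somewhere in the document."""
    wanted = {term.lower() for term in terms}
    return {doc_id for doc_id, text in documents.items() if wanted <= set(tokenize(text))}


def search_phrase(documents, *terms):
    """Exact phrase: the terms must occur adjacently and in the given sequence."""
    phrase = [term.lower() for term in terms]
    size = len(phrase)
    hits = set()
    for doc_id, text in documents.items():
        tokens = tokenize(text)
        if any(tokens[i:i + size] == phrase for i in range(len(tokens) - size + 1)):
            hits.add(doc_id)
    return hits


print(sorted(search_and(documents, "church", "architecture")))     # [1, 2]
print(sorted(search_phrase(documents, "church", "architecture")))  # [1]
```

Document 2 is a false drop for the AND search: both words occur, but not in the intended relationship. The phrase search excludes it by letting the searcher fix the sequence of words, which is the role a citation order would otherwise play.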

Received: March, 1999
Accepted: August, 1999

REFERENCES

1. B. C. Vickery, Faceted Classification Schemes. New Brunswick, NJ: Graduate School of Library Science Rutgers, 1966.

2. G. C. Barhydt & C. T. Schmidt, Information Retrieval Thesaurus of Education Terms. Cleveland, OH: Western Reserve University, Center for Documentation and Communication Research, 1968.

3. Jean Aitchison, “The EUDISED Thesaurus: A Preliminary Appraisal.” Education Libraries Bulletin 18(1975): 12-25.

4. K. G. B. Bakewell, “The Unesco: IBE Education Thesaurus (Second Edition): A Description and Assessment.” Education Libraries Bulletin 19(1976): 9-19.

5. Britton Goudie, “Moys Classification and Thesaurus for Legal Materials.” The Indexer 18(1993): 208-209.

6. Joyce Line, “ROOT Thesaurus.” The Indexer 15(1986): 122.

7. Harsha Parekh, “The Unesco: IBE Thesaurus--An Evaluation.” in Seminar on Thesaurus in Information Systems. Bangalore, December 1-5, 1975: C40-C52, 1975.

8. Helga J. Perry, “DHSS-DATA Thesaurus: The Thesaurus of the Department of Health and Social Security Library, London.” The Indexer 15(1986): 56-57.

9. Marcel Van Dijk, Definition of the Essential Characteristics of Thesauri. Bruxelles: Bureau Marcel Van Dijk, 1976.

10. J. C. Sager, Guidelines for the Establishment of Comparison and Compatibility Matrices Between Thesauri in the Social Sciences. Manchester: Centre for Computational Linguistics, University of Manchester, Institute of Science and Technology, 1981.

11. Louise F. Spiteri, “The Use of Facet Analysis in Information Retrieval Thesauri: An Examination of Selected Guidelines for Thesaurus Construction.” Cataloging & Classification Quarterly 25(1997): 21-38.

12. Louise F. Spiteri, Design of an Instrument to Measure the Structural Quality of Faceted Thesauri. Toronto, ON: University of Toronto, 1996.

13. Jean Aitchison & Alan Gilchrist, Thesaurus Construction: A Practical Manual. Second Edition. London: Aslib, 1987.

14. British Standards Institution, British Standard Guide to the Establishment and Development of Monolingual Thesauri. BS 5723:1987. London: British Standards Institution, 1987.

15. International Organization for Standardization, Documentation--Guidelines for the Establishment and Development of Monolingual Thesauri. ISO 2788-1986. Geneva: International Organization for Standardization, 1986.

16. National Information Standards Organization, Guidelines for the Construction, Format and Management of Monolingual Thesauri: An American National Standard. NISO Z39.19-1994. Bethesda, MD: NISO Press, 1994.

17. F. W. Lancaster, Thesaurus Construction and Use: A Condensed Course. PGI-85/WS/11. Paris: Unesco, 1985.

18. Maxine MacCafferty, Thesauri and Thesauri Construction. London: Aslib, 1977.

19. Elizabeth Orna, Build Yourself a Thesaurus: A Step-by-Step Guide. Norwich: Running Angel, 1983.

20. S. R. Ranganathan, Elements of Library Classification. Bombay: Asia Publishing House, 1962.

21. S. R. Ranganathan, Prolegomena to Library Classification. New York: Asia Publishing House, 1967.

22. Derek Austin, “Prospects for a New General Classification.” Journal of Librarianship 1(1969): 149-169.

23. Classification Research Group, “The Need for a Faceted Classification as the Basis of All Methods of Information Retrieval.” In Theory of Subject Analysis, edited by Lois Mai Chan et al. Littleton, CO: Libraries Unlimited, 1985.

24. Bernard I. Palmer & A. J. Wells, The Fundamentals of Library Classification. London: George Allen & Unwin, 1951.

25. B. C. Vickery, Faceted Classification: A Guide to the Construction and Use of Special Schemes. London: Aslib, 1960.

APPENDIX 1
Measuring Instrument

SECTION 1: INTRODUCTION

SCORE: 1 = No explanation; 2 = Some explanation; 3 = Detailed explanation

CRITERION 1: The introduction explains clearly the facet principles and processes used to construct the thesaurus

MEASURE 1a: Defines and explains the process of facet analysis (SCORE: 1 2 3)
MEASURE 1b: Defines and explains the broad facets used to divide the subject area (SCORE: 1 2 3)
MEASURE 1c: Explains the principles of division used to derive the sub-facets (SCORE: 1 2 3)

CRITERION 2: The introduction explains clearly the use of synthesis in the thesaurus

MEASURE 2a: Explains how compound terms can be formed (SCORE: 1 2 3)
MEASURE 2b: Explains the order in which single descriptors can be combined to express compound concepts (SCORE: 1 2 3)

SECTION 2: FACETED DISPLAY

SCORE: 1 = Never; 2 = Sometimes; 3 = Most times; 4 = Always

CRITERION 3: The faceted display analyzes the subject area into its component parts

MEASURE 3a: Broad facets are homogeneous (SCORE: 1 2 3 4)
MEASURE 3b: Broad facets are mutually exclusive (SCORE: 1 2 3 4)
MEASURE 3c: Broad facets are subdivided into sub-facets (SCORE: 1 2 3 4)
MEASURE 3d: Sub-facets are homogeneous (SCORE: 1 2 3 4)
MEASURE 3e: Sub-facets are mutually exclusive (SCORE: 1 2 3 4)
MEASURE 3f: Sub-facets are labelled clearly (SCORE: 1 2 3 4)

CRITERION 4: Facets and descriptors are arranged in specific orders

MEASURE 4a: Broad facets arrange their sub-facets in a clear, non-alphabetical order (SCORE: 1 2 3 4)
MEASURE 4b: Sub-facets arrange their descriptors in a clear order (SCORE: 1 2 3 4)
MEASURE 4c: All sub-facets within a broad facet are arranged in an internally-consistent manner (SCORE: 1 2 3 4)

APPENDIX 2
Thesauri Evaluated

Art & Architecture Thesaurus. New York: Oxford University Press, 1994. [AAT]

ASIS Thesaurus of Information Science and Librarianship. Medford, N.J.: American Society for Information Science, 1993. [ASIS]

Binding Terms. Chicago: Association of College and Research Libraries, 1988. [BINDING]

BSI: Root Thesaurus. Milton Keynes: British Standards Institution, 1988. [ROOT]

Building Services Thesaurus. Bracknell, Berkshire: BSRIA, 1993. [BUILDING]

Classification/Thesaurus for Sport and Physical Recreation. London: Sports Council, 1981. [SPORTS]

Genre Terms. Chicago: Association of College and Research Libraries, 1983. [GENRE]

International Thesaurus of Refugee Terminology. Dordrecht: M. Nijhoff Publishers, 1989. [REFUGEE]

Moys Classification and Thesaurus. London: Bowker-Saur, 1992. [MOYS]

Physics Thesaurus. London: British Library Bibliographic Services Division, 1981. [PHYSICS]

Thesaurus for Informatics. Rome: Intergovernmental Bureau for Informatics, 1980. [INFORMATICS]

Thesaurus on Youth. Leicester: National Youth Bureau, 1981. [YOUTH]

Unesco: IBE Education Thesaurus. Paris: Unesco, 1991. [IBE]

Unicef Thesaurus. New York: United Nations Children’s Fund, 1988. [UNICEF]

APPENDIX 3
Tables of Findings

TABLE 1. Introduction of Thesauri

Thesaurus    1a   1b   1c   2a   2b  Total/15
AAT           3    3    3    3    3     15/15
ASIS          2    1    1    1    1     06/15
BINDING       2    1    3    3    1     15/15
BUILDING      2    2    1    1    3     09/15
GENRE         2    1    2    2    1     09/15
IBE           1    3    1    1    1     07/15
INFORM.       2    3    3    2    1     11/15
MOYS          1    3    3    3    1     11/15
PHYSICS       2    1    2    1    1     07/15
REFUGEE       3    3    3    3    3     15/15
ROOT          2    1    3    1    1     10/15
SPORTS        2    1    2    1    3     09/15
UNICEF        2    3    1    1    1     08/15
YOUTH         3    3    3    3    3     15/15

APPENDIX 3 (continued)

TABLE 2. Faceted Display of Thesauri

Thesaurus    3a   3b   3c   3d   3e   3f   4a   4b   4c  TOTAL/36
AAT           3    4    4    3    4    4    1    4    4     31/36
ASIS          4    1    2    3    1    4    1    4    4     24/36
BINDING       3    4    2    4    4    4    1    4    4     30/36
BUILDING      3    2    4    2    2    2    4    2    2     23/36
GENRE         3    1    3    4    2    4    1    4    4     26/36
IBE           4    4    4    3    4    3    2    4    3     31/36
INFORM.       4    2    3    4    2    4    2  N/A    1     22/32
MOYS          4    4    4    4    4    4    3    1    3     31/36
PHYSICS       3    4    4    4    4    4    1    4    1     29/36
REFUGEE       4    4    3    4    4    4    3    2    3     31/36
ROOT          3    4    4    4    4    4    2    2    1     28/36
SPORTS        2    4    3    3    4    4    1    2    1     24/36
UNICEF        4    4    4    3    4    3    1  N/A    1     24/32
YOUTH         3    4    4    3    4    4    1    1    1     25/36
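
The totals in Table 2 can be reproduced with a short sketch in which a measure marked N/A is dropped from both the score and the possible maximum, which is why INFORM. and UNICEF are reported out of 32 rather than 36. The three rows below are copied from Table 2; the variable and function names are assumptions made for illustration, and None stands in for N/A.

```python
# Measures of Section 2 of the instrument, each scored from 1 to 4.
MEASURES = ["3a", "3b", "3c", "3d", "3e", "3f", "4a", "4b", "4c"]
MAX_PER_MEASURE = 4

# Three rows copied from Table 2; None stands for "N/A".
rows = {
    "AAT": [3, 4, 4, 3, 4, 4, 1, 4, 4],
    "INFORM.": [4, 2, 3, 4, 2, 4, 2, None, 1],
    "UNICEF": [4, 4, 4, 3, 4, 3, 1, None, 1],
}


def total(scores):
    """Return (points earned, points possible), ignoring measures marked N/A."""
    applicable = [score for score in scores if score is not None]
    return sum(applicable), MAX_PER_MEASURE * len(applicable)


for name, scores in rows.items():
    earned, possible = total(scores)
    print(f"{name}: {earned}/{possible}")
# AAT: 31/36
# INFORM.: 22/32
# UNICEF: 24/32
```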
