A Review of the Status of Twenty Digital Libraries

Meyyappan, N., Chowdhury, G.G., Foo, S. (2000). Journal of Information Science, 26(5), 337-355.

N. Meyyappan, G.G. Chowdhury and Schubert Foo
Division of Information Studies, School of Applied Science
Nanyang Technological University, Singapore

Abstract

Recent proliferation of research in digital libraries has given rise to a number of working digital libraries around the world. These digital libraries have been defined, designed and developed differently, and therefore the experience that one might have with one particular digital library might not be the same as with other digital libraries. The current status of twenty digital libraries around the world has been reviewed: twelve from the US, three from the UK, two from Australia, one from New Zealand, one from Singapore, and one from Canada. Various features of these selected digital libraries were collected from their home pages, journal articles and the information published on the Web. The parameters used to study the chosen digital libraries include: contents, type of library, organization, user interface, access, information retrieval, search features, output format, and links to other Internet resources. While some of the chosen digital libraries cater for a specific subject or document format, others play the role of digital as well as virtual libraries, giving access to the local digital collection as well as remote collections accessible through the Web. While most of these digital libraries have been developed for use in-house or by authorised users, some digital libraries are globally accessible. The chosen digital libraries differ in terms of the information search and output facilities, and very few have the facility to store search histories. Only four digital libraries have books in electronic form: National Library of Canada in the general area, Gutenberg in the subject-specific area, and SETIS and Carnegie Mellon University in the special collection area.

The review confirms that whilst digital libraries to date have been quite useful, there is a need for further improvements in terms of user interfaces and information facilities. Additionally, this study reveals that two different types of digital libraries are likely to emerge in future. The first is subject and document specific digital libraries that will cater for a specific subject and type of information like digital video, maps, photographs and paintings, theses, and so on. The second is hybrid libraries that will link traditional libraries, with their OPAC, CD-ROM and online databases, to the world of digital libraries and virtual libraries or gateways. The provision of personalized information services is an emerging trend in digital libraries to provide the next higher level of functionality to support users’ specific information needs and preferred search and retrieval strategies.



Introduction

Digital library research has drawn much attention not only in the developed countries but also in developing countries. Improvements in information technology and increased funding towards information infrastructure have led to the development of a wide range of digital library collections and services. Some digital library research projects are run in collaboration with academic and international organizations. Digital Library Initiative projects in the US and the eLib projects in the UK have played a key role in digital library development. In addition, many digital library projects are currently underway in Australia, Asia, Europe, Africa and Latin America. While some of them have their own funding, others are funded under DL-specific funding initiatives.

Many definitions of digital libraries are available in the literature. According to Oppenheim [1], a digital library is an organized and managed collection of information in a variety of media (text, still image, video, audio, 3D models or a combination of these), all in digital form. The British Library DL program [2] defines digital library as the widely accepted descriptor for the use of digital technologies to acquire, store, conserve, and provide access to information and materials in whatever form they were originally published. These definitions emphasize that the materials in a digital library should be in digital form. The Stanford digital library working group [3] goes further, defining digital libraries as a co-ordinated collection of services which are based on collections of materials, some of which may not be directly under the control of the organization providing a service in which they play a role. Drabenstott [4] has identified the following common elements from various definitions of digital libraries:

• The digital library is not a single entity;
• The digital library requires technology to link the resources of many;
• The linkages between the many digital libraries and information services are transparent to the end users;
• Universal access to digital libraries and information services is a goal;
• Digital library collections are not limited to document surrogates; they extend to digital artifacts that cannot be represented or distributed in printed forms.

The Digital Library Federation (DLF) [5] defines digital library from a librarian’s point of view, stating that digital libraries are organisations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or a set of communities. Borgman [6] has recently examined the definitions of digital libraries proposed by various research groups, and proposes that a digital library is: (1) a service; (2) an architecture; (3) information resources, databases, text, numbers, graphics, sound, video, etc.; and (4) a set of tools and capabilities to locate, retrieve and utilize the information resources available.


Digital libraries have been categorized differently by researchers. Oppenheim [1] has identified four types of libraries on a continuum running from the traditional to the digital: traditional, automated, hybrid and digital. A hybrid library, according to Oppenheim [1], is a library having a range of different information sources, printed or electronic, local or remote to various locations of library resources in different parts of the world. Rusbridge [7] suggests that a hybrid library should be designed to bring a range of technologies from different sources together in the context of a working library, and also to begin to explore integrated systems and services in both the electronic and print environments.

Digital library research involves people from different areas such as computer and information science, social science and economics, law, and so on. Building a good digital library involves research in a number of areas. Chowdhury and Chowdhury [8] have recently reviewed digital library research under 15 major headings, highlighting the major research activities in each of the 15 areas and pointing out the future of digital library research.

This paper presents an overview of twenty digital libraries from around the world with a view to:

• providing an understanding of the various working digital libraries in different parts of the world, and
• identifying the various features of the working digital libraries including content, coverage, organisation of information, user interface, access, and search and retrieval facilities.

Quite a few digital library projects are going on in different parts of the world under different digital library programmes. DLI phase 1 had six projects, DLI phase 2 [9] has more than 30 projects and the eLib programme [10] funded around 60 projects. Besides these, many commercial, public and government agencies have digital library projects around the world. Digital library projects are increasing at such a high rate that it is difficult to keep track of the total number of ongoing digital library projects.

Twenty working digital libraries were selected for this study. The following points have been considered for the selection of these digital libraries: representation from different countries, origination from different types of institutions, coverage of various subjects and various types of materials. The chosen digital library projects for this study also include academic, public and special libraries.

Out of the twenty chosen digital libraries, twelve are from the US, three from the UK, two from Australia, and one each from Canada, New Zealand and Singapore. Various features of the selected digital libraries were collected from their home pages, journal articles and the information published on the Web. The parameters used to study the chosen digital libraries include: the type of library, parent organization, the collection, information access, and information storage and retrieval, including the search and output features. Thus, this paper attempts to give an idea of what digital libraries are, what their objectives are, what they cover, and what search features, accessibility options, display formats, and so on they offer. This paper is expected to be particularly useful for students and beginners in the digital library area because it provides a snapshot of the various features of some prominent digital libraries around the world. Parts of this paper may also be useful for digital library researchers, for it provides a comparison of some of these features.
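The country breakdown given above can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are hypothetical, and the counts are simply those stated in the text.

```python
# Country counts as stated in the text (names are illustrative only).
LIBRARIES_BY_COUNTRY = {
    "US": 12,
    "UK": 3,
    "Australia": 2,
    "Canada": 1,
    "New Zealand": 1,
    "Singapore": 1,
}

# Total number of libraries in the sample.
total = sum(LIBRARIES_BY_COUNTRY.values())

# Share of each country as a percentage of all libraries reviewed.
shares = {country: 100 * n / total for country, n in LIBRARIES_BY_COUNTRY.items()}

for country, n in LIBRARIES_BY_COUNTRY.items():
    print(f"{country}: {n} ({shares[country]:.0f}%)")
print("Total:", total)
```

The tally confirms that the six country counts add up to the twenty libraries reviewed, with US projects making up well over half of the sample.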

Changes in digital libraries are taking place all the time, and indeed many changes have taken place even during the short period of this study (from the end of 1999 to early 2000). However, efforts have been made to incorporate as many such changes as possible to date. The list of references may appear short for a typical review paper. However, this list has to be supplemented with the URLs of the various digital libraries that were the primary sources of information used for this study.

Digital Libraries: Basic Information

Table 1 presents some general information about the chosen digital libraries, such as the name of the organisation or institution hosting the digital library, country, classification of the library, year of origin, URL, funding agency, and partnership with other organisations (if any). Table 2 gives information about the specialization, content and type of materials in the collection. Table 3 gives information about the access to and organisation of materials. Table 4 gives information on the output format, sort facility and search history; and Table 5 shows the search facilities available in the chosen digital libraries.

Twelve of the twenty chosen digital library projects were undertaken in the US alone: nine by universities, two by professional organisations, and one by the Library of Congress. Of the eight digital libraries chosen from other parts of the world, six were undertaken at the university level while two were undertaken by national libraries, viz. the National Library of Canada and the British Library. Three digital library projects were undertaken by the University of California: one at the Santa Barbara campus, known as the Alexandria Digital Library (ADL), the second at the Berkeley campus (UCB), and the third at the University of California Office of the President. The British Library has five digital library projects, namely International Dunhuang, Beowulf, Bibliotheca Universalis, Magna Carta and Treasures Digitisation. The Beowulf project has been considered for this study.

The earliest of the digital library projects, though it was not called a digital library then, was Gutenberg, which was set up in 1971. One project from the British Library started in 1993; six projects started in 1994 with funding from the DLI-1 programme; two projects started in 1995, three projects in 1996, two in 1998, and one in 1999. Two projects, viz. BUILDER and HEADLINE, funded by the eLib programme in the UK, started in 1998 and are still under development.


Objectives of the Chosen Digital Libraries

The motivation and specific objectives for each digital library differ significantly. Each of them was basically designed to achieve a specific purpose. For example,

• The main objective of the ACM digital library was to provide full-text access to articles and conference papers published in ACM periodicals and proceedings.
• ADL was designed to provide access to a large range of materials, from maps and images to text and multimedia, using spatially indexed information.
• The objective of AMMEM was to provide a rich primary source of materials relating to the history and culture of the US.
• The objective of the British Library’s Electronic Beowulf project has been to increase access to its collections by use of imaging and network technology.
• Both BUILDER and HEADLINE aimed to create hybrid libraries. BUILDER is supposed to develop the working model of a hybrid library in a teaching and research context, seamlessly integrating access to a wide range of printed and electronic information sources through the WWW interface. HEADLINE, yet to put materials in its digital library, aims to provide the user with a wide range of library resources regardless of the physical form.
• The main objective of CDL is resource sharing among the University of California libraries in other campuses and some local libraries.
• The objective of creating the digital library at Carnegie Mellon University (CMDL) was to create an integrated speech, image and language understanding digital video library in addition to having some e-books, arts, music, e-journals and periodicals.
• GEMS has been built to serve as a vehicle to deliver a wide range of information resources regardless of the media type over a campus-wide network to all faculty, staff and students.
• The main objective of the Gutenberg project was to provide easy access to the humanities literature available in electronic format.
• IDL has been developed to build a collection of full-text journal articles from Physics, Engineering and Computer Science and to make them available over the WWW.
• The main objective of the IEL digital library is to provide electronic access to full-text articles, conference proceedings and IEEE standards in the area of electrical engineering, information technology, applied physics and other technical disciplines.
• The objective of NCSTRL was to develop a distributed technical reports library containing a collection of technical reports related to computer science from the institutions or organisations offering PhD programmes in computer science or engineering in different parts of the world.
• NDLTD was designed to build a digital library of theses and dissertations of masters and doctoral students from various universities in the US and around the globe.
• The electronic collection of NLC was set up to make Canadian online books, journals and catalogues of over 500 Canadian libraries available through the WWW.
• The NZDL’s objective has been to develop the underlying technology for digital libraries, which will help others to create and manage their own collections, and to make it available to the public.
• The Queensland Digital Library project, called DIGILIB, was designed to have a collection covering a wide range of domestic, public, mining and agricultural buildings in Queensland and Brisbane.
• The objective of SETIS was to facilitate access to in-house and remote textual and image databases, instructional programs, and the creation and storage of electronic texts.
• The objective of the UCB digital library project was to develop tools and technologies to support highly improved models of the “scholarly information life cycle” in a distributed, continuous and self-publishing model through object recognition and image retrieval in large image databases.
• The main objective of UMDL is to offer electronic information resources in environmental studies and other interdisciplinary areas, including life, natural and social sciences, over a distributed network environment.

Funding

Most of the digital library research projects in the US were funded by the US funding agencies, viz. NSF, NASA and ARPA, in DLI phase 1. The six libraries received funds to the tune of US$ 25 million under DLI-1. Each one received approximately US$ 4 million or more. The NCSTRL project is sponsored by ARPA with the Corporation for National Research Initiatives (CNRI) and the National Science Foundation. Two libraries, HEADLINE and BUILDER, receive grants from eLib phase 3. The American Memory project is funded by the Library of Congress and private sector participation; the NDLTD project was funded by the US Department of Education and the Southeastern University Research Association (SURA). The New Zealand Digital Library (NZDL) project is funded by the New Zealand Foundation for Research, Science and Technology, and the Lotteries Grant Board. DIGILIB is a collaborative project between the University of Queensland’s Architectural Department and Library; the Tertiary Education Institute and the University of Queensland funded this project. CDL is supported by the University of California, while the National Library of Canada supported the NLC digital library program. The British Library Digital and Network Services Steering Committee has funded the substantial equipment purchases used in London, while the University of Kentucky has funded equipment and system support for use in Lexington. GEMS was supported by the Nanyang Technological University, Singapore [11]. The ACM and IEL digital libraries are managed by professional organisations. The ACM digital library is supported by the Association for Computing Machinery, and IEL is supported by the Institute of Electrical and Electronics Engineers (IEEE) and the Institution of Electrical Engineers (IEE).


Table 1: Basic information on the chosen digital libraries

| Name | Institution and country | Classification* | Year | URL | Fund/Project |
|---|---|---|---|---|---|
| ACM (Association for Computing Machinery) | Association for Computing Machinery, USA | D | NA | www.acm.org/dl | ACM |
| ADL (Alexandria Digital Library) | University of California at Santa Barbara, USA | D | 1994 | http://alexandria.sdc.ucsb.edu | DLI-1 |
| AMMEM (American Memory) | Library of Congress, USA | D | 1996 | http://lcweb2.loc.gov/ammem | Library of Congress, USA |
| British Library (British Library Electronic Beowulf Project) | British Library, UK | D | 1993 | http://www.bl.uk | British Library, UK and University of Kentucky, Lexington |
| BUILDER (Birmingham University Integrated Library Development and Electronic Resource) | University of Birmingham, UK | H | 1998 | http://builder.bham.ac.uk/ | eLib funded and partnership with the University of Oxford, University of Wolverhampton, West Hill College of Higher Educ. & Birmingham Central Library |
| CDL (California Digital Library) | University of California Office of the President, USA | H | 1997 | www.cdlib.org | California University, USA |
| CMDL (Carnegie Mellon University Digital Library) | Carnegie Mellon University, USA | D | 1994 | www.ul.cs.cmu.edu | DLI-1 |
| DIGILIB (Queensland Digital Library (QDL) Project) | University of Queensland, Australia | D | NA | www.architect.uq.edu.au/digilib/index.html | Tertiary Education Institute and University of Queensland |
| GEMS (Gateway Electronic Media Services) | Nanyang Technological University, Singapore | H | 1999 | www.ntu.edu.sg/library/media/gems/gems.htm | Nanyang Technological University, Singapore |
| GUTENBERG (Gutenberg Project) | University of Illinois, USA | D | 1971 | www.gutenberg.net | By donation and volunteer services |
| HEADLINE (Hybrid Electronic Access and Delivery in the Library Networked Environment) | London School of Economics, London School of Business, University of Hertfordshire, UK | H | 1998 | www.headline.ac.uk | eLib |
| IDL (Illinois Digital Library) | University of Illinois at Urbana-Champaign, USA | D | 1994 | http://dli.grainger.uiuc.edu | DLI-1 in partnership with 14 publishers and 5 S/W providers |
| IEL (IEEE / IEE Electronic Library) | IEEE and IEE, USA | D | NA | www.ieee.org/products/online/iel | IEEE and IEL |
| NCSTRL (Networked Computer Science Technical Reference Library) | Cornell University, USA | D | 1995 | www.ncstrl.org | ARPA with CNRI and NSF |
| NDLTD (Networked Digital Library of Theses and Dissertations) | Virginia Tech. University, USA | D | 1995 | www.theses.org | US Department of Education and South Eastern University Research Association |
| NLC (National Library of Canada) | National Library of Canada, Canada | H | NA | www.nlc-bnc.ca | National Library of Canada |
| NZDL (New Zealand Digital Library) | University of Waikato, New Zealand | H | 1996 | www.cs.waikato.ac.nz/~nzdl | New Zealand Foundation for Research Science and Technology and Lotteries Grant Board |
| SETIS (Scholarly Electronic Text and Image Services) | University of Sydney, Australia | D | 1996 | http://setis.library.usyd.edu.au/ | University of Sydney |
| UCB (University of California at Berkeley) | University of California at Berkeley, USA | D | 1994 | http://elib.cs.berkeley.edu | DLI-1 |
| UMDL (University of Michigan Digital Library) | University of Michigan, USA | D | 1994 | www.lib.umich.edu/libhome/dig.html | DLI-1 |

*Legend: D = Digital Library; H = Hybrid Library; NA = Not Available

Subject Coverage of the Chosen Digital Libraries

Most of the libraries chosen for this study have been set up in academic environments providing access to e-journals, OPAC, CD-ROM databases, and online databases. We can classify these libraries into three major categories: general, subject-specific and specialized collections.

General Collection

BUILDER, National Library of Canada, GEMS, NZDL, UMDL and CDL cover e-journals, OPAC, CD-ROM and online databases from a number of disciplines.

Subject-specific Digital Libraries

ACM, ADL, AMMEM, GUTENBERG, IDL, IEL, NCSTRL, SETIS, and UCB are subject-specific digital libraries by virtue of their coverage. The ACM digital library covers literature from ACM publications only. ADL covers spatially referenced map information; IDL covers electronic journals in engineering, physics and computer science; NCSTRL covers computer science technical reports from computer science departments and industrial and government research laboratories in different parts of the world; IEL covers electrical and electronics engineering, information technology, applied physics and other technical disciplines from IEEE and IEE publications; UCB covers environmental subjects; and SETIS, GUTENBERG and AMMEM cover humanities and social sciences. The subject coverage of the HEADLINE project will be economics, finance, business and management, scalable to larger groupings of libraries.

Specialized Digital Libraries

DIGILIB covers images and photographs of historical buildings, and CMDL has concentrated on the development of a digital video library. The British Library’s Electronic Beowulf project provides access to the manuscripts of the Old English poem Beowulf in the form of images. NDLTD covers theses and dissertations in various disciplines from various participating institutions.
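The three-way classification described above can be written down as a small lookup structure. The following Python sketch is illustrative only: the names `CATEGORIES` and `category_of` are hypothetical, "BL" stands for the British Library's Electronic Beowulf project, and HEADLINE is placed under "general" following Table 2, which is an assumption because the prose gives only its planned subject coverage.

```python
# Category membership as described in the text (acronyms as in Table 1).
# HEADLINE is listed under "general" following Table 2; this placement
# is an assumption, since the prose only states its planned coverage.
CATEGORIES = {
    "general": ["BUILDER", "NLC", "GEMS", "NZDL", "UMDL", "CDL", "HEADLINE"],
    "subject-specific": [
        "ACM", "ADL", "AMMEM", "GUTENBERG", "IDL",
        "IEL", "NCSTRL", "SETIS", "UCB",
    ],
    "specialized": ["DIGILIB", "CMDL", "BL", "NDLTD"],
}


def category_of(name):
    """Return the category of a library acronym, or None if it is unknown."""
    key = name.upper()
    for category, members in CATEGORIES.items():
        if key in members:
            return category
    return None


# Number of libraries per category, and the overall total.
counts = {category: len(members) for category, members in CATEGORIES.items()}
total = sum(counts.values())

print(counts)
print("Total:", total)
print("GEMS ->", category_of("GEMS"))
```

Under these assumptions the three groups account for all twenty libraries in the study.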

The Collection

Table 2 provides information on the content and type of information contained in the digital libraries concerned. Some digital libraries contain only abstracts or bibliographic information while others contain full-text information. We can classify the contents of the twenty chosen libraries into five groups by type: bibliographic, full-text, both bibliographic and full-text, images or graphics, and multimedia. All the libraries provide access to bibliographic or full-text databases in some form or the other. ADL concentrates on multimedia databases, and the CMDL’s Informedia project concentrates on multimedia and music collections in the digital video library. The latter also covers e-books and full-text e-journals and periodicals. Each library has a separate user interface for accessing OPAC. Three libraries, viz. UCB, DIGILIB and ADL, have special collections of maps, images, and so on. Four digital libraries have a collection of CD-ROM databases. In using these databases, the user first selects a database from a list of CD-ROM databases available in the library. Subsequently, the user interacts with the database using the search interface provided by the database producer. UMDL has more than 290 abstracting and indexing journals, newspapers and electronic journals in its networked digital library collection.

Table 2: Information about collections, content and type of chosen digital libraries

| Name | Category | Content | Type |
|---|---|---|---|
| ACM | Specific | Articles, proceedings, calendar of events in ACM periodicals and proceedings | Bibliographic, full-text and combined |
| ADL | Specific | Geographically referenced materials: maps, images, texts | Maps, spatial images and texts |
| AMMEM | Specific | History and culture of the USA. Multimedia collections of digitized documents, photographs, sound and moving pictures and text from the library’s Americana collections | Full-text, image, video and audio |
| BL | Special | Image-based edition of the great Old English poem in the British Library and images of Cotton Vitellius A. XV. International Dunhuang, Bibliotheca Universalis, Magna Carta and Treasures Digitisation digital library projects, OPAC | Images |
| BUILDER | General | Printed and electronic information sources, and examination papers | Bibliographic, full-text and combined |
| CDL | General | On-line archive of California, Melvyl Union Catalogue, periodicals database, e-journals, abstracting & indexing databases | Bibliographic, full-text and combined |
| CMDL | Special | Digital video collection plus photographs and full-text | Digital video |
| DIGILIB | Special | Images of Queensland historic buildings, Brisbane architecture | Images and texts |
| GEMS | General | E-journals, OPAC, CD-ROM databases, project reports, AV sources | Bibliographic, full-text and combined |
| GUTENBERG | Specific | Humanities, literature and references | Full-text |
| HEADLINE | General | London School of Economics, London Business School library OPACs, CD-ROMs, e-journals, course material, exam papers, secondary sources, financial and government information | Full-text, bibliographic and combined |
| IDL | Specific | Journal articles from Physics, Engineering, and Computer Science journals | Bibliographic, full-text and combined |
| IEL | Specific | Articles, conference proceedings and technical standards in Electrical and Electronic Engineering, Information Technology and Applied Physics | Full-text |
| NCSTRL | Specific | Collection of Computer Science research reports & papers | Full-text, bibliographic and combined |
| NDLTD | Special | Theses and dissertations, e-journals, VT Spectrum, WDBJ7 script archives | Full-text, bibliographic and combined |
| NLC | General | On-line books, journals and OPAC | Full-text, bibliographic and combined |
| NZDL | General | Developing interface technology | Full-text |
| SETIS | Specific | Humanities, poetry, drama, dictionaries, text and image creation projects, digital versions of post-graduate theses | Full-text |
| UCB | Specific | Text, maps, images, sound, video, hyper-textual multimedia | Full-text, hypertext and multimedia |
| UMDL | General | E-journals, CD-ROM databases, electronic reference shelf and UM-Med Search | Full-text, bibliographic and combined |


ACM has a collection of 39,378 full-text articles from the ACM journals and conference proceedings; tables of contents with over 7,000 citations from articles published in ACM journals and magazines from 1985 onwards; and tables of contents with nearly 35,000 citations from articles published in over 700 volumes of conference proceedings since 1985.

ADL has a collection of geographically referenced materials such as maps, images, texts and datasets in multimedia form in earth and social sciences. The datasets include metadata and basic data in digital elevation models, digital raster graphics, scanned aerial photographs, Landsat, seismic datasets and technical reports, Sierra Nevada ecologic project datasets, Mojave ecologic project datasets, and AVHRR. Metadata are available for Gazetteers, Geodex, Georef, Mojave bibliography and PEGASUS map records.

AMMEM covers more than one million primary source materials relating to the history and culture of the United States of America. The collection also covers documents, film manuscripts, photographs and sound recordings that describe American history.

The British Library’s Electronic Beowulf project has a collection of manuscripts of the great Old English poem surviving in the British Library. In addition, Electronic Beowulf includes images of Cotton Vitellius A. xv, indispensable eighteenth-century transcriptions, copies of the 1815 first edition with early nineteenth-century collations of the manuscript, a comprehensive glossarial index, and a new edition and transcript. Major additions include links with the Toronto Dictionary of Old English project and with the comprehensive Anglo-Saxon bibliographies of the Old English Newsletter.

BUILDER has a collection of printed and electronic information sources, examination papers and electronic versions of the journals Forensic Linguistics and Midland History. BUILDER is also involved in developing a hybrid library search interface.

CDL consists of the On-line Archive of California, the Melvyl Union Catalogue, and the California periodicals database. More than 2000 electronic journals from major scholarly publishers and information providers are licensed and made available in its network. It has a collection of abstracting and indexing databases, reference databases, and automatic weekly search services.

CMDL has a multimedia digital library called Informedia that contains over one thousand hours of digital video, audio, images, and text. Informedia has a collection of more than 100 videos produced by the Bureau of Mines, Bureau of Reclamation, the Federal Emergency Management Agency Presents, the Fermi Lab, NASA core, the National Zoo, National Oceanic and Atmospheric Administration, the Smithsonian Institution Presents, and the United States Geological Survey. This library also provides access to more than 300 e-journals, periodicals, and e-books.

DIGILIB has a collection of images of Queensland historic buildings that include a wide range of domestic, public, mining and agricultural buildings. Many of these buildings were previously unrecorded in any accessible form and several have since been demolished. Over 1030 images are currently stored in the library.

GEMS provides access to networkable CD-ROM databases, Chinese CD-ROM titles, online search services, e-journals, AV sources, OPAC, and the Web. GEMS has a collection of more than 310 e-journals. It has a digital collection of project reports, theses, conference articles and publications submitted by staff and students to the library. It also provides other information such as academic calendar, course information, registration details, timetables, outstanding bills, and so on.

Gutenberg has a full-text collection of the Bible, Shakespearean drama, and other religious documents. Full texts of Roget’s Thesaurus, almanacs, encyclopedias and dictionaries are also available.

HEADLINE includes electronic journals, locally digitized materials, course-related materials, reading lists, examination papers, local consortium catalogues, secondary sources such as BIDS, IBSS, ECONLit, SOSIG, Biz/Ed, financial data sets and government information. This digital library also covers diverse resources available at the partner sites.

IDL has developed a system called DeLIver (Desktop Link to Virtual Engineering Resources) that provides access to full-text articles from Physics, Engineering and Computer Science journals. The collection contains around 40,000 articles from over 54 journals from five publishers.

IEL has a collection of more than 500,000 full-text articles from over 12,000 publications including journals and conference proceedings. The coverage of the IEL includes full-text archives of IEEE and IEE publications from 1988 to the present. IEEE publishes nearly 30% of the world’s literature in electrical, electronics, computer engineering and science and provides access to more than 120 journal titles, more than 600 annual conference proceedings titles and over 875 IEEE technical standards. This electronic library is a subset of the INSPEC bibliographic and abstracts database.

NCSTRL has a collection of over 30,000 documents from more than 156 institutions offering a PhD or engineering degree in Computer Science. The NCSTRL collection is available from servers of the participating institutions from anywhere and to anybody in the world.

NDLTD has a collection of more than 1800 theses and dissertations on the Virginia Tech University campus. In addition, it has electronic journals, VT Spectrum, and WDBJ7 script archives. This digital library provides a facility for federated search across eleven other digital libraries of theses and dissertations. More than 60 institutions use this ETD software to create their own digital libraries of theses and dissertations.

NLC electronic collection incorporates formally published Canadian online books and journals. Catalogue records for Electronic Collection titles, including the Uniform Resource Locators (URLs), are also available. NLC’s electronic collection has eighteen million full bibliographic records, 550,000 authority records, and 30 million holdings of 500 Canadian libraries including the National Library.

NZDL provides access to 13 collections mainly covering Computer Science but also including the HCI bibliography, FAQ archive, Humanity Development library, Indigenous peoples, youth culture oral history, Oxford text archive, project Gutenberg collection, TidBits and Newspapers in Maori. The Computer Science Technical Report collection is the largest, containing over 25,000 research reports from around 300 sites worldwide. There is a large collection of frequently asked questions on many topics, and a full-text index to the US newsmagazine, the Computists Communique.

SETIS provides access to a large number of networked and in-house full-text databases in the humanities. In addition to the literary, philosophical and religious texts, the service is engaged in a number of text and image creation projects. A large number of collections, such as the American Poetry full-text database, Australian literature from the year 1840, English poetry database, English drama databases, Oxford English Dictionary, etc., are available. There is also a distributed database of postgraduate theses in digital form.

The UCB digital library maintains a collection of over 80,000 digital images, about 2 million records of data in tabular form and 2513 full-text documents in an online database. The collection includes documents, maps, articles, and reports on the environment of California, including Environmental Impact Reports (EIRs), educational pamphlets, water usage bulletins, and country plans.

UMDL concentrates on some journal literature and reference resources including the McGraw Hill Encyclopedia of Science and Technology, Encyclopedia Americana, Encyclopaedia Britannica, and 200 core and popular journals. This library also provides access to 1100 Elsevier journals. UM coverage of journals and newspapers in digital form has crossed 3000.

Information Storage and Retrieval

Information storage and retrieval plays an important role in any digital library. Specific information retrieval features of each digital library are discussed below.

ACM organized their digital library collection using their own classification system, called the Computing Classification System (CCS). Collections of this digital library are indexed under journals and magazines, proceedings by subject, by sponsor and by series. Conferences are also listed alphabetically under special interest group. All journals and proceedings literature covered by this library are grouped under eleven categories and also under 16 general terms.

ADL documents are organized by the Library of Congress Subject Headings. In addition, some index terms are assigned by the university considering the special collection of geo-spatially-referenced materials. Documents are organised to search geographic locations, beginning and ending dates, type, ‘available as’ types, originators and identifiers. Documents are also organised so that their contents can be searched using a two-dimensional world map.

AMMEM categorised their collections by subject group, year, place, original format, digital format, library division and user’s format. Under each category collections are displayed alphabetically. All documents under the subject category are grouped into thirteen sub-groups. There is a provision to select all the collections in a group, or any one or a set of collections, to search.

In Beowulf, images of the manuscripts are organised to search the entire edition, specific line(s), or specific folio(s). Users can also search by word, sub-string and alliteration. The Beowulf manuscript is divided between two scribes, and the scribes are searchable.

In BUILDER, documents are organized under department, title, course code and examination paper number. In the Forensic Linguistics and Midland History journals, documents are organized for full-text searching.

CDL has categorised their collections and services in three groups: browse, search and services. CDL has indexed documents under eight selected topics and alphabetically by title. Documents are organized under title, topic, and abstract. One can also search for the exact beginning of the title of a document. Documents are also organized to allow users to search in any of the following four formats: e-journals, databases, reference texts and archival finding aids. CDL provides access to many information resources for locating or gaining direct access to scholarly materials in both print and electronic formats.

CMDL grouped their collections as art, books, collections, journals, multimedia, music, periodicals and projects. Under each group there are subgroups and each subgroup has further subgroups. Items are organised in a hyperbolic tree structure and each group and subgroup is arranged alphabetically. In the on-line book page, documents are organized under author and title.

In DIGILIB, images and photographs are organized by town, type, features, structures, materials and context.

GEMS provides access to a collection of CD-ROM and online databases and full texts of project reports and some selected papers. An alphabetical listing of databases and electronic journal titles is available for browsing. Documents are grouped under 72 subject headings. GEMS has a facility to provide electronic resources from NTU collections searched through the NTU OPAC. Collections are organized to provide cross-media search – OPAC, CD-ROM and online database indexes, digital theses, conference and other publications.

In Gutenberg, the whole list of books is arranged by date of release, by title and by author. Documents can be searched by title, author, subject, language, and Library of Congress Subject Headings. As all the available documents are in plain ASCII format, the downloaded documents can be used in any system.

In IDL, documents are organized such that each part is searchable by selecting or referring to that part. The full text of each article in the collection is tagged by title, abstract, table, analysis, references and conclusion parts, using SGML. This helps users search the full text or parts of the articles.

In IEL all the documents can be searched in the full text, in the body, title, URL, site name, image link, image alt text, description, keywords, and in remote anchor text as a phrase or terms, or in the name or in combination of the above. Documents are also organized by the date of submission. Users can view the table of contents of journals in PDF and HTML format.

NCSTRL collections are indexed under author, year, title, abstract and institution. Author, year and institution can be searched using the browse index facility, or by searching for words under abstract and title. After searching, users can access full-text documents subject to the authors’ terms and conditions. The required documents can be downloaded in HTML format or in a format designed by the authors.

NDLTD has indexed documents under author and department. All the documents can be searched in the full text, in the body, title, URL, site name, image link, image alt text, description, keywords, and in remote anchor text as a phrase or terms, or in the name or in combination of the above. Documents are also organized by the date of submission.

In NLC, documents are indexed alphabetically under title and organized using the DDC system and full-text. Full texts of electronic publications are archived in the following formats: ASCII, HTML, Text, Word, and WordPerfect.

In NZDL, documents are organized in such a way that one can search in the first page of a document or in one particular page of a document. There are 13 different collections. Users have to select a collection, choose the query type – Boolean or ranked – and specify the search terms.

In SETIS, there are many text and image creation projects. Only the collections of six projects are arranged for browsing the full-text collections; documents are arranged under keyword, title of works, author, publication date, place of publication, publisher, male and female authors, author date and literature period.

UCB uses the Cheshire II user interface, which allows three forms of search: simple forms, tile bars search, and browse lists of all documents. The simple search form has the facility to search by document or by page within a document. The tile bar interface allows users to make informed decisions about which documents and which passages of those documents to view based on the distribution behaviour of the query terms in the documents. There are two tile bars: the simple tile bar, which is used to locate information in a collection of documents, and the single-document tile bar, which is a tool to locate information within a given document.

UMDL resources are arranged in three forms: alphabetically by title, by category and by service. There are nine headings, viz., arts & humanities, business and economics, engineering, general references, government information & law, health sciences, news and current events, science, and social sciences. There are fourteen resources by service. Some of them are: Cambridge Science Abstracts, ISI Citation Databases, and Proquest. The UMDL project has also developed two methods of interaction: one on the multi-scale (infinite pan and zoom) platform of PAD++, and another on a distributed multi-person computing environment.

Search Features

Some digital libraries have more than one form of search, such as Simple Search and Advanced Search. The search features discussed here include the ones available in both the simple and advanced search modes. The various search facilities available in the twenty digital libraries are given in Table 5. A browse / index facility for searching is available in twelve libraries for a limited number of fields. Users can go to the alphabetical listing of a field and choose the keyword, author or title field for searching.

Boolean search

Boolean operators – AND, OR and NOT – are used to combine words or phrases in a search expression. Users can also enter a search phrase within quotes for searching in a simple search field. Simple query forms provide a facility to enter a query in a single line. The search query may contain Boolean operators or phrases. The search query is parsed into words or phrases and Boolean operators. These words and/or phrases are connected with the AND operator. Users can also search in multiple fields in some of the chosen digital libraries. The multiple field search facility provides the ability to search in multiple fields using Boolean operators. In some digital libraries only AND and OR are used, and in some, AND is implied for multiple field search.
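The parsing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `parse_simple_query` is hypothetical, phrases are assumed to be delimited by double quotes, and operators are assumed to be the upper-case words AND, OR and NOT. None of the reviewed libraries is claimed to work exactly this way.

```python
import re

BOOLEAN = {"AND", "OR", "NOT"}  # assumed operator keywords

def parse_simple_query(query):
    """Split a one-line query into phrases, words and operators.

    Quoted strings are kept together as phrases. Wherever two operands
    are adjacent with no operator between them, an implied AND is added.
    """
    tokens = [f'"{phrase}"' if phrase else word
              for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query)]
    parsed = []
    for token in tokens:
        if parsed and parsed[-1] not in BOOLEAN and token not in BOOLEAN:
            parsed.append("AND")  # implied operator between two operands
        parsed.append(token)
    return parsed

print(parse_simple_query('"digital library" metadata OR indexing'))
```

A real system would go on to evaluate the token list against an index; the sketch stops at the parsed form because that is the step the text describes.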

Table 3: Organisation and access facility of the chosen digital libraries

| Name | Accessibility | Organisation of Information |
|---|---|---|
| ACM | Public access on subscription | ACM Computing Classification System. Broad groups under 11 categories, and 16 general terms. |
| ADL | For UC domain | Under Subject Heading and index terms assigned by cataloguer |
| AMMEM | Public access except for a few items | Broad groups, format, time, place, original format, digital format and library. |
| BL | Public access | Full manuscripts, line, folio, folioline, fitt, scribes, SGML tags. |
| BUILDER | Staff, student and faculty | Under department, title, course code, examination paper number |
| CDL | Open to all, other campus users and campus users | Subject, format and campus |
| CMU | Public access, some materials need password authentication | Grouped under art, books, journals, multimedia, music, periodicals and projects |
| DIGILIB | Public access | Organised to search by town, type, features, structure, materials and context. |
| GEMS | Staff, student and faculty | OPAC, e-journals, CD-ROM databases, examination papers |
| GUTENBERG | Public access, some materials are copyrighted | Author and title |
| HEADLINE | London School of Economics, London School of Business, and University of Hertfordshire members | The digital library is yet to have materials |
| IDL | Faculty, staff, students and selected other users | Articles are organised under full-text, different sections and figures. |
| IEL | Full text is available to subscribers only | Full text, in the body, title, URL, site name, image link, image alt text, description, keywords, and in remote anchor text as a phrase or terms, or in the name or in combination of the above |
| NCSTRL | Public access | Author, title, year and abstracts. |
| NDLTD | Restricted, Unrestricted, and Mixed | Full text, in the body, title, URL, site name, image link, image alt text, description, keywords, and in remote anchor text as a phrase or terms, or in the name or in combination of the above. Indexed under Department and Author. |
| NLC | Public access, restricted and on payment | Title and subject |
| NZDL | Public access | Organized to search first page, same page and same document. |
| SETIS | Users of University of Sydney Campus; public access for some collections | Alphabetically by collection then by author. |
| UCB | Public access | Photographs, databases, documents and geographical layers. Under resources are arranged in different fields |
| UMDL | Three categories: UM network, authorised UM users, and open to all users | CD-ROM, e-journals under Subject (9 categories), Alphabetical and selected resources by service |

NZDL uses &, | and ! as Boolean operators in their query for AND, OR and NOT respectively. NLC uses &, |, ~ and ; for AND, OR, NOT and NEAR respectively. This library provides a facility to enter queries in the French language, whereby users can use the ‘accum’, ‘equiv’, and ‘minus’ operators for AND, OR and NOT respectively. If no Boolean operator is included in a search expression, the system will take it as a phrase search. NDLTD and IEL use ‘must contain’, ‘should contain’, and ‘must not contain’ as operators in place of the AND, OR, and AND NOT Boolean operators respectively. One can also use ‘+’ and ‘-’ as the addition and rejection operators in a query. Similarly, IDL has ‘must contain’, ‘may contain’, ‘not contain’ and ‘must contain nearby’ in place of ‘AND’, ‘OR’, ‘NOT’ and proximity operators respectively. In Carnegie Mellon University’s e-books search, there are two searchable fields – author and title. These two fields can be combined with AND only. Other libraries use the standard Boolean operators AND, OR and NOT. In the Beowulf project the OR, AND NOT and WITH operators are used for Boolean OR, AND NOT and AND respectively. Table 5 shows the various types of Boolean search facilities available in the chosen digital libraries.
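The differing operator symbols reported for NZDL and NLC can be illustrated with a small translation table. This is a sketch only: the symbol tables are taken from the description above, while the function name `normalise_query` and the assumption that operators are separated from terms by whitespace are hypothetical and do not reflect either library's actual parser.

```python
# Operator symbols as reported in the review; the table layout is an assumption.
OPERATORS = {
    "NZDL": {"&": "AND", "|": "OR", "!": "NOT"},
    "NLC": {"&": "AND", "|": "OR", "~": "NOT", ";": "NEAR"},
}

def normalise_query(library, query):
    """Rewrite library-specific operator symbols as standard keywords.

    Assumes every operator is a separate whitespace-delimited token.
    """
    table = OPERATORS[library]
    return " ".join(table.get(token, token) for token in query.split())

print(normalise_query("NZDL", "digital & library ! virtual"))
print(normalise_query("NLC", "digital ; library | archive"))
```

A lookup table of this kind is one way a cross-library search interface could accept a single query syntax and translate it per target.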

Proximity Search

A proximity operator searches for both words in a field or text with a fixed number of intervening word(s) between them. Ten of the twenty digital libraries under study have proximity search facilities. The proximity search operators used are “ADJ”, “NEAR” and “WITH”. ACM, NLC, BUILDER, HEADLINE, SETIS and GEMS use NEAR as the proximity operator. NDLTD uses “ADJ” as the proximity operator. In BUILDER and HEADLINE, when the NEAR proximity operator is used, the documents in which the search terms occur within 50 words of each other are returned: the closer together the words are, the higher the rank of the page, and so the higher it appears in the list of search results. In SETIS users can search by phrase or a combination of two phrases using the proximity operator ‘NEAR’; the number of characters between words can be limited to 40, 80 or 120. Users can combine author and title fields with keyword or phrase search selected from any one of the above fields. In some libraries users can restrict the number of characters between two words while using the proximity operator. Table 5 shows the proximity search facilities available in the chosen digital libraries.
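A word-distance NEAR test of the kind described for BUILDER and HEADLINE might look like the following sketch. The 50-word window comes from the text; the function name `near`, the punctuation stripping and the use of the smallest distance as a ranking key are assumptions made for illustration.

```python
def near(text, term_a, term_b, max_distance=50):
    """Return the smallest word distance between two terms.

    Returns None when either term is absent or the closest pair of
    occurrences is more than max_distance words apart.
    """
    words = [w.strip('.,;:!?"\'()').lower() for w in text.split()]
    positions_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    distances = [abs(a - b) for a in positions_a for b in positions_b]
    if not distances:
        return None
    best = min(distances)
    return best if best <= max_distance else None

text = "Digital libraries provide access to digital video."
print(near(text, "digital", "video"))
```

Sorting matching documents by the returned distance, smallest first, would reproduce the behaviour in which closer terms rank higher.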

Phrase Search

A search expression may be built with a combination of terms or phrases and logical operators. A query may be entered in quotes to search for an exact match of the phrase. Fifteen of the chosen twenty libraries have a phrase search facility. In BUILDER and HEADLINE, if the user does not specify any Boolean operator, the system will take the search expression as a phrase. Only SETIS has the facility of combining two phrases. In some digital libraries, for example SETIS, if there is no Boolean operator between two words in the simple search form, the system takes it as a phrase; in some cases the phrase has to be entered within quotes. In some libraries, for example in NZDL, the sequence of words is parsed and connected with the ‘AND’ operator. Table 5 shows the phrase search features of the chosen digital libraries.

Truncation

Truncation searches allow users to search for different word variants with a single search expression where the truncation symbol stands for one or more characters in the search term. There are three types of truncation: left truncation, right truncation and middle truncation. Right truncation matches any number of characters at the end of the word, while left truncation matches any number of characters preceding the search word. Middle truncation matches words with the given starting and ending characters and any intervening characters. Sixteen of the chosen twenty digital libraries have only the right truncation facility. Various operators such as ‘*’, ‘#’, ‘?’ are used for truncation. DIGILIB and Beowulf have the facility for single and multiple wild card searching. Table 5 shows the truncation search facilities available in the chosen digital libraries.
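The three kinds of truncation can be illustrated by translating a truncated term into a regular expression. In this sketch the symbol `*` is assumed as the truncation operator and is allowed to match zero or more word characters; the function names and the sample vocabulary are hypothetical.

```python
import re

def truncation_to_regex(pattern, symbol="*"):
    """Compile a truncated search term into an anchored regular expression."""
    parts = [re.escape(part) for part in pattern.split(symbol)]
    return re.compile("^" + r"\w*".join(parts) + "$", re.IGNORECASE)

def truncation_search(pattern, vocabulary):
    """Return the vocabulary words matched by the truncated term."""
    regex = truncation_to_regex(pattern)
    return [word for word in vocabulary if regex.match(word)]

vocabulary = ["library", "libraries", "librarian", "prelibrary", "colour", "color"]
print(truncation_search("librar*", vocabulary))   # right truncation
print(truncation_search("*library", vocabulary))  # left truncation
print(truncation_search("col*r", vocabulary))     # middle truncation
```

A single-character wild card could be handled the same way by mapping a second symbol to `\w` instead of `\w*`.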

Stemming

Stemming searches look for other grammatical forms of the search terms. For example, a stemming search on fly would also find flies. AMMEM, BUILDER, HEADLINE, NZDL, and ACM have this facility.

Fuzzy search

Fuzzy search expands the search by generating words spelled similarly to the specified word or phrase. This type of expansion allows for misspellings. Only the ACM digital library has this facility.
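As an illustration of spelling-based expansion, the sketch below uses `difflib.get_close_matches` from the Python standard library. The review does not say how ACM implements fuzzy search, so the similarity measure, the cutoff of 0.8 and the sample vocabulary are all assumptions.

```python
from difflib import get_close_matches

def fuzzy_expand(term, vocabulary, cutoff=0.8):
    """Return indexed words whose spelling is close to the query term."""
    return get_close_matches(term.lower(), vocabulary, n=5, cutoff=cutoff)

vocabulary = ["retrieval", "retrieve", "library", "librarian", "digital"]
print(fuzzy_expand("retreival", vocabulary))  # misspelled query term
```

The expanded word list would then be searched in place of the single misspelled term.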

Phonic searching

Phonic search looks for a word that sounds like the word being searched for and begins with the same letter. Only the ACM digital library has this search facility.
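The review does not name the algorithm behind ACM's phonic search. The classic Soundex code is used here purely as a stand-in, because it likewise keeps the first letter and groups similar-sounding consonants; the implementation below is a sketch of standard American Soundex, not ACM's method.

```python
SOUNDEX_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        SOUNDEX_CODES[letter] = digit

def soundex(word):
    """Return the four-character Soundex code of a word."""
    word = "".join(ch for ch in word.upper() if ch.isalpha())
    if not word:
        return ""
    code = word[0]
    previous = SOUNDEX_CODES.get(word[0], "")
    for ch in word[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != previous:
            code += digit
        if ch not in "HW":  # H and W do not separate repeated codes
            previous = digit
    return (code + "000")[:4]

print(soundex("Robert"), soundex("Rupert"))
```

Two words are treated as phonic matches when their codes are equal, which also guarantees that they begin with the same letter.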

Case sensitivity

Only four digital libraries provide case sensitivity options. NZDL has a facility to select case sensitive or case insensitive search using ‘c’ or ‘i’ respectively. British Library’s Beowulf has the option to select case sensitive search. In NDLTD and IEL, search terms in lower case will match words in any case; otherwise, an exact case match is used.
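The rule reported for NDLTD and IEL can be written as a short predicate. This is a minimal sketch; the function name `term_matches` is hypothetical and the comparison is done on single words only.

```python
def term_matches(term, word):
    """Apply the lower-case rule: a lower-case term matches any case,
    while a term containing capitals must match exactly."""
    if term.islower():
        return term == word.lower()
    return term == word

print(term_matches("beowulf", "Beowulf"))  # lower-case term, any case matches
print(term_matches("Beowulf", "beowulf"))  # capitalised term, exact match only
```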

Term weighting

In a search expression users can specify that some terms should count more than others. For example, if a user is looking for documents about both ‘Apple’ and ‘Pear’, he/she might want to give preference to the word ‘Apple’ over the word ‘Pear’. Term weighting allows documents containing the more heavily weighted terms to be ranked higher. NLC and NZDL have the facility of term weighting search.
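The ‘Apple’ and ‘Pear’ example can be made concrete with a weighted occurrence count. The weights, the sample documents and the function name `weighted_score` are invented for illustration; the review does not describe the scoring formulas that NLC or NZDL actually use.

```python
def weighted_score(document, weights):
    """Sum the occurrences of each query term multiplied by its weight."""
    words = document.lower().split()
    return sum(weight * words.count(term.lower())
               for term, weight in weights.items())

docs = {
    "d1": "apple pie and apple juice",
    "d2": "pear tart and pear juice and pear jam",
}
weights = {"apple": 3.0, "pear": 1.0}  # the user prefers 'apple'
ranked = sorted(docs, key=lambda name: weighted_score(docs[name], weights),
                reverse=True)
print(ranked)
```

With equal weights the second document would rank first because it contains more matching words; the higher weight on ‘apple’ reverses the order.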

Limiters

Some of the digital library collections are grouped according to format, year, form or type. Limiters are used to select or restrict a particular group of documents, forms or types to search. For example, NCSTRL and ACM provide a facility to limit a search by year. If a user wants to search for documents from a two-year period, he/she can restrict the search period using the ‘greater than’ and ‘less than’ operators. Only a few libraries have the facility of the comparative operators ‘>’ and ‘<’. DIGILIB has this facility.

Search History

Three libraries, namely ADL, DIGILIB and HEADLINE, have the facility to record a search history. This helps administrators to trace who has used the library, and allows users to rerun a previous query for updated results.

Other features

ADL has a facility to cut, paste, drag and drop images and maps from the source materials. After connecting to ADL, users are presented with the map browser. The map browser allows users to interactively 'pan' or 'zoom' a two-dimensional map of the world to locate their area(s) of interest. Once they are satisfied with the scale and location of the map in the window, they can select an area to query. This map is also used to display the spatial extents (footprints) of the items retrieved from the library. Users can also retrieve descriptive information associated with the selected item.

In Beowulf, there are two types of searches, transcript and edition search. The transcript facilitates extensive and varied searches of the manuscript. The results of these searches always identify the folios and folio-lines for ease of reference. The edition search goes to their source in the manuscript by providing citations to folios, folio-lines, and verse-lines. There is a facility to search by line and folio. Line (edition) searches the entire edition or a specific line or lines. The default is the entire poem. Folio (edition) searches the entire edition or a specific folio or folios. Sub-string search of the transcript in the Beowulf project helps to locate words where the scribes have not observed conventional word boundaries.

In BUILDER and HEADLINE, only simple form search is available. There is also a facility to search other digital libraries of the eLib phase 3 project.

The CMDL multimedia collection can be accessed through a hyperbolic tree structure, and e-books collections can be searched by the first name or last name of the author, and by the exact beginning of the title or words in the title.

GEMS has a separate search interface for each of the different media: Internet search, databases and electronic journals, and OPAC search. In the cross-media search, users can search by keyword in the title field, subject field or any part of the record in all available library resources regardless of the media type. Alphabetical listings of databases and e-journals can be browsed, and searches can be conducted by subject headings and keywords in on-line databases. In OPAC, users can search by title, author, LCSH, call number, ISBN, ISSN, type of material, language, publication date and keyword. GEMS supports searching the Chinese language CD-ROM titles and browsing Chinese sites on the Internet.

IDL’s DeLIver allows users to search and view individual articles from different field tags and also from figure captions using keywords. All the articles are searchable by title, heading, author, author affiliation, abstract, table text, figure caption, and so on. The entire text of an article is tagged so that the full article can be searched. Hyperlinks have been provided to reach referenced articles in their database, or users can reach abstracting and indexing services to get abstracts of articles from their library collection.

NDLTD and ACM digital library use the Infoseek search engine to support search operations. Documents can be searched using search terms or phrases or a combination of these in different parts of a document. A search can also be conducted on some parts of a document by specifying that part in the query using ‘:’. Searches can also be restricted to some portions of the web documents by using Infoseek’s field syntax. The interface uses advanced statistical weighting or search technology. Unlike plain Boolean searches, Infoseek automatically weights the query terms based on its advanced statistical search technique to return the results sorted with the best matches listed at the top. It also uses the HTML Meta tag to specify the summary text. There is a provision to limit a search by date.

NLC uses a technique called threshold score. Search results are returned only when the score is greater than or equal to the threshold score specified. The default value is set to 50. For the ‘AND’ and ‘OR’ operators, the score increases with the increase in occurrence of search term(s) in a document. Each occurrence of an exact match scores ten points, but the maximum is 100. Scoring is different for different terms. For the ‘NEAR’ operator the score is based upon the physical proximity of the search terms in the document.
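The threshold rule can be sketched for a single exact-match term. The ten points per occurrence, the cap of 100 and the default threshold of 50 come from the description above; the function names are hypothetical, and the sketch ignores the per-term differences and the proximity-based NEAR scoring, which the review does not specify.

```python
def occurrence_score(document, term, points_per_match=10, maximum=100):
    """Score ten points per exact occurrence of the term, capped at 100."""
    count = document.lower().split().count(term.lower())
    return min(count * points_per_match, maximum)

def passes_threshold(document, term, threshold=50):
    """Return True when the document scores at least the threshold."""
    return occurrence_score(document, term) >= threshold

print(occurrence_score("map " * 3, "map"))
print(passes_threshold("map " * 5, "map"))
```

Under these assumptions a document needs at least five occurrences of the term to be returned at the default threshold.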

In NZDL, there are two different kinds of queries: ranked queries and Boolean queries. A ranked query consists of a list of terms that are likely to appear in the documents the user is looking for. Documents are displayed in order of how closely they match the query. The Boolean search option performs an ‘OR’ search on the specified terms and ranks the documents retrieved by relevance according to the cosine rule.
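Ranking by the cosine rule can be illustrated with raw term-frequency vectors. The review does not give NZDL's exact term weighting, so the use of unweighted counts, the function names and the sample documents are assumptions.

```python
import math
from collections import Counter

def cosine(query, document):
    """Cosine of the angle between two term-frequency vectors."""
    q = Counter(query.lower().split())
    d = Counter(document.lower().split())
    dot = sum(q[term] * d[term] for term in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0

def rank(query, documents):
    """Return document names ordered from best to worst match."""
    return sorted(documents, key=lambda name: cosine(query, documents[name]),
                  reverse=True)

documents = {
    "a": "digital library search",
    "b": "digital video archive of nasa films",
    "c": "cooking recipes",
}
print(rank("digital library", documents))
```

Because the measure divides by the vector lengths, a short document that matches every query term can outrank a longer document that matches only one.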

DIGILIB allows search sets to be stored for future use. These search sets allow the retrieval of predefined collections of images. Users can also modify the previously searched expression in the same result screen.

In SETIS, users can browse the author index of the collections or search by keywords in any one of the fields: keywords/phrases within all texts, title of works, author, male or female authors, publication date, place of publication, publisher, author date, page images and literature period. The literature covered is divided into five time periods: pre 1840, 1840-49, 1850-59, 1860-69 and 1870-79.

Access

Most of the digital libraries that are referred to here have some way to control access to their collections. Three major types of access control have been noted: (1) digital libraries that are accessible only to the staff and students of the parent organisation, (2) digital libraries that make part of the collection accessible to the general public, and (3) digital libraries that are open to the public.

Five digital libraries, viz., ADL, BUILDER, HEADLINE (yet to put materials), GEMS and IDL, provide access only to staff, students, faculty members and some collaborative members. ACM, AMMEM, CDL, IEL, NDLTD, NLC, SETIS, UCB, and UMDL make part of their collections accessible to the general public. The ACM and IEL digital libraries provide access to their bibliographic collections to the public free of charge while giving full-text access only on subscription. The Beowulf, CMDL, Gutenberg, NCSTRL, NZDL, and DIGILIB collections are accessible to the general public.

UMDL provides three types of access: (1) access through UM networks, (2) authorized UM users, and (3) open to all. In the University of Michigan network 250, 28 and 19 databases can be accessed by UM users, by password, and by public access respectively. Some collections of CMDL are freely accessible and some require authentication, specifically for electronic journals. In GEMS, the number of terminals that can provide access to the system is limited. In CDL certain resources are available only for users in the nine University campuses. Some resources, such as the Online Archive of California, the Melvyl Union Catalogue and the California periodicals database, are available for public access without any restrictions.

HEADLINE allows access only to the faculty and students of the three participating institutions: London School of Economics, London School of Business, University of Hertfordshire and four hybrid library projects under the eLib phase 3 program. BUILDER has a facility to search for information across the Agora, Headline, Hylife, and Malibu project web sites simultaneously. National Library of Canada has also restricted access to some documents, and access to these documents is fee-based.

Output Formats

This is a critical area, and the nature, number, format, etc., of the output depend on a number of factors – the nature of the digital library, users and their needs, and so on. All the libraries have a fixed number of records that can be displayed per frame, or the user can set the number of records in multiples of 5 or 10. Only the AMMEM project has the facility of fixing an output of a maximum of 5000 bibliographic records for a search.

The fields displayed vary from library to library. ACM, AMMEM, ADL, CDL, Gutenberg, IDL, IEL, NCSTRL, NDLTD, NLC, and SETIS display minimum details like author, title, journal name, date etc., wherever applicable. The ACM, ADL, IEL, NDLTD, and NLC digital libraries have a facility to display a summary or abstract of the retrieved documents using a hyperlink. Carnegie Mellon University digital library displays the multimedia title for multimedia documents; the book title and author name in the case of the e-book collection; and the journal name in the case of e-journals. DIGILIB and UCB display photographs and/or images. SETIS, BUILDER and UCB display the first few words from the retrieved items in addition to the bibliographic information. In SETIS, the number of records displayed can be fixed in multiples of 100 starting from 1-100. Initially, the display contains the author name and a few words from the title. Going further gives the full-text information.

In ACM, IEL and NDLTD users can download the full text or abstract in PDF format provided they have access rights. First the system displays the title, some introductory text about the theses or dissertations, relevance percentage, date of submission, hyperlink to the document and number of occurrences of the search terms in the document for the search query. Later the user can opt to select a long record or abstract or full-text information. Results can be sorted by relevance, date, title, with summary or without summary of the items retrieved. If the Internet resources option is selected, the search results will be displayed according to the search engine format.

In UMDL, electronic journal and newspaper articles can be downloaded for personal use, subject to authentication. Different CD-ROM databases can be queried and the results will be displayed in a format given by the vendor of that CD-ROM product. Electronic journals and newspapers can be viewed in a format given by the publishers. E-journals and newspapers can be browsed by broad subject, and by alphabetical list.

Table 4: Display facility of the chosen digital libraries

| Name | Display (output format) | Sort facility | Search history |
|---|---|---|---|
| ACM | 24 items per frame. Title, author, publication information, relevance rating, the availability of various components | No | No |
| ADL | Export/print full meta-data as XML tagged text | No | Yes |
| AMMEM | Maximum of 5000 items with author and title fields | Relevance | No |
| BL | Images are sizable up to 300%. Manuscript folio number, British Library number for the manuscript are displayed | No | No |
| BUILDER | 10 items per frame with title and few lines from the summary | No | No |
| CDL | 5/10/20/30/40/50 items per frame with database/journal name, publisher, year from when it is available | No | No |
| CMDL | No restriction on number of items per frame. Each collection is grouped and displayed in alphabetical order | No | No |
| GEMS | 10/25/100/200 per frame | No | No |
| GUTENBERG | No limit for number of items to display. Title, author alphabetical and released date are displayed | Title and author | No |
| HEADLINE | The digital library is not working yet | No | Yes |
| IDL | First 25 items with author, title, journal name, formats. Four different formats can be chosen for additional fields | No | No |
| IEL | 10/25/100/500 items per frame with few lines summary, relevance %, date of submission, title and hyperlink | Relevance / date / title | No |
| NCSTRL | No limit in number of items per frame. Title, author, document ID, and institution name (in case of paper), journal name is displayed | Author / title / date / rank | No |
| NDLTD | 10/25/100/500 items per frame with few lines summary, relevance %, date of submission, title, hyperlink | Relevance / date / title | No |
| NLC | 20 items per frame. Displays number of words, threshold score, title alphabetical, document hyperlink and size | Author / title / date | No |
| NZDL | Bibliographic information and abstracts wherever available. Different collections have different display formats | No | No |
| DIGILIB | 12 images per frame. Three forms of display: contact sheet, summary list, and detail summary. Images are in compressed JPEG format | Features / conditions / materials / interior / use / context / location / objects | Yes |
| SETIS | 100/200/300… items per frame with titles and few lines from the summary | Group by match | No |
| UCB | 25/50/100 items per frame. Outline view gives only title; list view gives title and few lines about the item. In quick access to collections, caption, location, country, collection, color and photographer name are displayed. Each collection has different formats for display | No | No |
| UMI | Number of items per frame and fields depends on individual database | No | No |

Table 5: Search Features

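| Name | Browse/Index | Simple search | Boolean search* | Multiple fields search | Truncation | Proximity search# | Case sensitivity | Natural language | Comparative search | Thesaurus | Phrase | LCSH | Weighted terms | Wildcard | Limiters | Ranked output | Stemming |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACM | Y | Y | 1,2,3 | Y | Y | 1 | N | N | N | N | Y | N | Y | Y | Y | Y | Y |
| ADL | Y | Y | N | Y | N | Y | N | Y | N | N | Y | Y | N | N | Y | N | N |
| AMMEM | N | Y | 4,5 | Y | Y | N | N | N | N | N | Y | Y | N | N | Y | Y | Y |
| BL | Y | Y | 1,2,3 | Y | Y | N | Y | N | N | N | Y | N | N | Y | N | N | N |
| BUILDER | N | Y | 1,2,3 | N | Y | 1 | N | N | N | N | Y | N | N | Y | N | Y | Y |
| CDL | Y | Y | 4 | N | Y | N | N | N | N | N | N | N | N | N | Y | N | N |
| CMDL | Y | Y | 1 | Y | N | N | N | N | N | N | N | N | N | N | N | N | N |
| GEMS | N | Y | 1,2,3 | Y | Y | 1 | N | N | N | N | N | Y | N | Y | N | N | N |
| GUTENBERG | Y | Y | 4 | Y | N | N | N | N | N | N | N | Y | N | N | Y | N | N |
| HEADLINE | N | Y | 1,2,3 | N | Y | 1 | N | N | N | N | Y | N | N | Y | N | Y | Y |
| IDL | N | Y | 1,2,3 | Y | Y | N | N | N | N | N | Y | N | N | N | Y | N | N |
| IEL | N | Y | 1,2,3 | Y | Y | 2 | Y | Y | Y | N | Y | N | Y | N | Y | Y | N |
| NCSTRL | Y | Y | 1,2 | Y | Y | N | N | N | N | N | Y | N | N | N | Y | Y | N |
| NDLTD | Y | Y | 1,2,3 | Y | Y | 2 | Y | Y | Y | N | Y | N | Y | N | Y | Y | N |
| NLC | Y | Y | 1,2,3 | N | Y | 1 | N | Y | Y | N | Y | N | Y | Y | Y | N | N |
| NZDL | Y | Y | 1,2,3 | Y | Y | Y | Y | Y | N | Y | Y | N | Y | Y | N | Y | Y |
| DIGILIB | N | Y | 4 | Y | Y | N | N | N | Y | N | Y | N | N | Y | Y | N | N |
| SETIS | Y | Y | 1,2 | Y | Y | 1 | N | N | N | N | Y | N | N | N | Y | Y | N |
| UCB | Y | Y | 4,5 | Y | Y | N | N | Y | N | N | Y | N | N | N | N | N | N |
| UMDL | Y | Y | N | N | N | N | N | N | N | N | N | N | N | N | N | N | N |

Legend: Y: Facility available; N: Facility unavailable.
* Boolean: 1 = AND, 2 = OR, 3 = NOT, 4 = Implied AND, 5 = Implied OR.
# Proximity operator: 1 = NEAR, 2 = ADJ, 3 = WITH.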
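The Boolean codes in the legend can be illustrated with a small sketch over an inverted index. It is a minimal illustration only: the sample documents and the functions `build_index` and `boolean_search` are hypothetical and do not reproduce the search engine of any library in the table. The sketch assumes a flat query that starts with a term, evaluates it left to right, treats NOT as AND NOT, and applies an implied AND when two terms are adjacent.

```python
def build_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


def boolean_search(index, query):
    """Evaluate a flat Boolean query from left to right.

    Uppercase AND, OR and NOT are operators. Two adjacent terms with no
    operator between them are joined by an implied AND (code 4 in the legend).
    """
    result = None
    op = "AND"
    for token in query.split():
        if token in ("AND", "OR", "NOT"):
            op = token
            continue
        postings = index.get(token.lower(), set())
        if result is None:
            result = set(postings)      # first term seeds the result set
        elif op == "AND":
            result &= postings          # keep documents containing both
        elif op == "OR":
            result |= postings          # keep documents containing either
        else:
            result -= postings          # NOT: drop documents with the term
        op = "AND"                      # reset to implied AND for the next term
    return result or set()


# Hypothetical sample collection used only for illustration.
docs = {
    1: "digital library search interface",
    2: "hybrid library gateway",
    3: "digital video archive",
}
index = build_index(docs)
```

With this sample collection, `digital AND library` and the implied-AND query `digital library` select the same documents, while `digital OR hybrid` widens the result and `digital NOT video` narrows it.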


NZDL provides the facility to download full-text documents from more than 300 sites. When users make a query, the first ten matching documents (or fewer if fewer than ten records match the query) are shown on the screen, with the first few words of each displayed. Subsequent records are displayed in frames of 10 records each. The user can set a condition to get a maximum of 50 records as the output.
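The frame-based display described for NZDL (frames of 10 records, with a user-set cap of 50) can be sketched in a few lines. The function name `paginate` and its parameters are assumptions made for illustration, not part of any NZDL interface.

```python
def paginate(hits, frame_size=10, max_records=50):
    """Split hits into frames of frame_size, keeping at most max_records hits."""
    capped = hits[:max_records]
    return [capped[i:i + frame_size] for i in range(0, len(capped), frame_size)]


# 73 hypothetical hit ids: only the first 50 are kept, in five frames of ten.
frames = paginate(list(range(73)))
```

A query with fewer than ten hits yields a single short frame, which matches the behaviour described above.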

In DIGILIB, images are viewed individually or as a contact sheet, printed on a color printer or downloaded to a disk. The images may be added to a word processing or graphics file, or manipulated through PhotoShop, changing the color and/or composition. Images are compressed using JPEG with an average file size of approximately 50 KB. QuickTime has been incorporated, allowing virtual reality and interactive movies. Results can be displayed in two ways: a contact sheet format and a summary list format. Both formats lead to a more detailed summary format that displays only one record at a time. Users can see the full-size image or movie in this format.

Sorting Search Output

Only seven digital libraries have sorting facilities. AMMEM lists output by relevance. In Gutenberg all retrieved records can be sorted according to title or author; within an author or title, records are arranged according to publication date. In NCSTRL, results can be sorted according to rank, author, title, date and institution. NDLTD and IEL list output records according to relevance, title and date. The NLC electronic library sorts results by author, title and date. Output from DIGILIB can be sorted according to features, conditions, materials, interior, use, context, location and objects. In SETIS, there is no facility to sort the output, but it provides a facility to group by match.
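The two sorting styles mentioned above, a primary key with publication date as a tie-breaker and a descending relevance ranking, can be sketched as follows. The records, field names and function names are hypothetical and serve only to illustrate the idea.

```python
from operator import itemgetter

# Hypothetical result records used only for illustration.
records = [
    {"author": "Brown", "title": "Maps", "date": 1998, "relevance": 0.40},
    {"author": "Adams", "title": "Theses", "date": 1999, "relevance": 0.90},
    {"author": "Adams", "title": "Archives", "date": 1996, "relevance": 0.65},
]


def sort_records(records, key="author"):
    """Sort by author or title, breaking ties by publication date."""
    return sorted(records, key=itemgetter(key, "date"))


def sort_by_relevance(records):
    """Sort with the most relevant record first."""
    return sorted(records, key=itemgetter("relevance"), reverse=True)


by_author = [r["title"] for r in sort_records(records, "author")]
by_title = [r["title"] for r in sort_records(records, "title")]
by_relevance = [r["title"] for r in sort_by_relevance(records)]
```

Because `sorted` returns a new list, the same result set can be offered to the user in several orders without re-running the query.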

Summary and Conclusions

This study has identified a number of important features of the chosen digital libraries. The major findings can be summarized as follows:

• Fifteen out of twenty digital libraries are hosted by universities.

• Gutenberg is the earliest among the chosen digital libraries, and most digital libraries were set up from 1994 onwards. Gutenberg uses plain text format (ASCII) for its collections, which are largely developed by volunteers.

• Although some digital libraries, such as Alexandria, NDLTD and SETIS, were set up to cater for a specific type of information, most digital libraries in the universities provide access to a variety of information sources.

• No digital library seems to support patent literature and software.

• Very few books are available in digital form. Only four libraries have books in electronic form: NLC in the general area, Gutenberg in the subject-specific area, and SETIS and Carnegie Mellon University in the special collection area.

• Ten out of twenty libraries studied here provide access to electronic journals. The University of Michigan digital library has the highest number of abstracting and indexing databases.

• The full-text collections of the ACM and IEL digital libraries are accessible only by subscription.

• The search interfaces vary significantly, though most digital libraries have a simple and an advanced search facility.

• Some digital libraries use commercial search engines; for example, NDLTD and IEL use InfoSeek.

• Boolean, proximity, phrase search and truncation are the common search facilities across the digital libraries, though not every library offers all of them (Table 5) and the search operators vary.

• Only five digital libraries, viz. ACM, CDL, IEL, DIGILIB and NDLTD, provide a facility to narrow down or modify a query. NLC provides the facility for searching its collection in English and French. NCSTRL and NLC have a facility to group search terms using Boolean operators and parentheses.

• Very few digital libraries provide vocabulary control support in searching. Only ADL, Gutenberg and NZDL provide this support.

• Only the NZDL has the facility for using a thesaurus.

• ACM uses its own Computing Classification System. This system is based on unique URLs for articles and other types of material, formed from the journal name, year of acceptance and a unique identifier formed from the author and title. It also has the facility of on-line citations and citation of article components.

• Only two libraries, viz. ADL and HEADLINE, have the facility to store or record search history.

• The IEL digital library has IEEE standards in its collection.

• The ACM digital library has a phonic and fuzzy search facility, and the facility of conference listing by special interest group.

• The ACM and IEL digital libraries have the facility to find related articles.

• The British Library's Beowulf project has the facility of sub-string and alliteration search.

• IDL has the facility of searching within tables and figure captions.

• The CMDL provides a facility to search using the hyperbolic tree structure, which has a unique visual impact.

• NDLTD, BUILDER, HEADLINE and California Digital Library have the facility for federated search across their member sites.

• CDL has introduced a charging mechanism for certain resources. A payment system has been implemented for full text. GEMS aims to have a cashless system, without cash transactions, using cashcard and library transaction for some of the services it offers.

• Very few digital libraries provide ranked output.

• Only seven digital libraries provide facilities for sorting the results.

In conclusion, it can be stated that the digital libraries of today, though quite useful, need some improvements in terms of user interfaces and information retrieval facilities. Two different types of digital libraries seem likely to emerge in the future. Subject and document specific digital libraries will cater for specific subjects and types of information like digital video, maps, photographs and paintings, theses, etc. The other major category will be the hybrid library that will link the traditional library, with its OPAC, CD-ROM and online databases, to the world of digital libraries and virtual libraries or gateways.

The other trend is the provision of personalized information services offered by digital libraries, as are currently available in GEMS and HEADLINE. Such facilities will enable users to design a specification pertaining to their own information needs, which will help them get specialized information search and retrieval services from the digital or hybrid library. Access to other digital libraries and Internet resources, and federated search facilitating search across a number of similar digital libraries, will also become more and more common.

Finally, there are some dimensions to digital library research other than those mentioned above. The current digital library agenda, in the words of Levy [12], has largely been set by the computer science community, and clearly bears the imprint of this community's interests and vision. There are other constituencies whose voices need to be heard too, and we need to consider the purposes of digital libraries within a broader spectrum of the population whose lives will be affected by the work we do. Social scientists, for example, have much to say about the relationship between technological developments and the societal benefits that might accrue from them, though unfortunately, according to Levy [12], to date the social science work within digital library R&D has largely been confined to evaluating prototypes developed by computer scientists. Libraries, digital or otherwise, have a great role to play in our society. Therefore, it is of great importance to see not only how we design and develop our digital libraries, but also how they are of real use to the target users and how they benefit our society at large.

Acknowledgement

The authors gratefully acknowledge the valuable comments and suggestions made by the anonymous referees on an earlier version of this paper.

References:

[1] C. Oppenheim, What is the hybrid library?, Journal of Information Science 25(2) (1999) 97-112.

[2] Digital Library. The British Library. Online. Available: http://www.bl.uk.

[3] R. Vicki and T. Winograd, Working assumptions about the digital library (Stanford Digital Library working paper), 23 February 1995. Available at: www-digilib.stanford.edu/diglib/WP/public/Doc10.html

[4] K.M. Drabenstott, Analytical review of the library of the future (Council of Library Resources, Washington DC, 1994).

[5] Digital Library Federation. Online. Available: http://www.clir.org/diglib/dlfhomepage.htm.


[6] C.L. Borgman. In: E.A. Fox (ed.), Sourcebook on Digital Libraries: Report for the National Science Foundation, TR-93-35 (439) (Blacksburg VA: VPI and SU Computer Science Department). Available at: http://fox.cs.vt.edu/DLSB.html.

[7] C. Rusbridge, Towards the hybrid library, D-Lib Magazine July/August (1998). Available at: http://www.dlib.org/dlib/july98/rusbridge/07rusbridge.html.

[8] G.G. Chowdhury and S. Chowdhury, Digital library research: major issues and trends, Journal of Documentation 55(4) (1999) 409-448.

[9] E.A. Fox, The Digital Libraries Initiative: update and discussion, Bulletin of the American Society for Information Science October/November (1999) 7-11.

[10] eLib: The Electronic Libraries Programme. Online. Available: http://www.ukoln.ac.uk/services/elib/

[11] Special issue on GEMS, NTU Library Bulletin 8(2) (1999).

[12] D.M. Levy, Digital libraries and the problem of purpose, D-Lib Magazine 6(1) (2000). Available at: http://www.dlib.org/dlib/january00/01levy.html#Note2