Open Source Software for Libraries

Open Source Software for Libraries

A Trend Report

Submitted by

Saiful Amin

Guided by

Dr. A R D Prasad

Project 2

A guided Research Project Submitted in Partial Fulfillment of the Course Leading to the Award of Associateship in Documentation and Information Science (ADIS)

2001 - 2003

DOCUMENTATION RESEARCH AND TRAINING CENTRE

INDIAN STATISTICAL INSTITUTE

8th Mile, Mysore Road

Bangalore – 560 059

Acknowledgement

I am deeply indebted to my guide Dr. ARD Prasad, Associate Professor, Documentation Research and Training Centre (DRTC), Indian Statistical Institute, Bangalore. I take this opportunity to thank him from the bottom of my heart.

I also want to thank Prof. IK Ravichandra Rao, Head, DRTC, and Dr. Devika P Madalli, Lecturer, DRTC, for their continuous encouragement.

I must also thank Dr. K Mohan and my colleagues at the Learning Resource Centre at the Indian School of Business, Hyderabad, who have helped create such a pleasant atmosphere to work in.

_______________ (Saiful Amin)

Place: Bangalore Date: August 27, 2003

Declaration

I do hereby declare that the project report entitled “Open Source Software for Libraries: A Trend Report”, which is being submitted in partial fulfillment of the course leading to the award of the Associateship in Documentation and Information Science in DRTC, Indian Statistical Institute, Bangalore, is the result of the work carried out by me under the guidance and supervision of Dr. ARD Prasad, Associate Professor, Documentation Research and Training Centre.

I further declare that neither I nor any other person has previously submitted this project report to any other Institution/University for any degree or diploma.

_____________________

(Saiful Amin)

Place: Bangalore

Date: August 27, 2003

It is certified that this project has been carried out under my guidance and supervision.

______________________

(Dr. ARD Prasad)

Place: Bangalore

Date: August 27, 2003

Table of Contents

Chapter 1 – Introduction

Chapter 2 – Use of Software in Libraries

Chapter 3 – What is OSS?

Chapter 4 – Software Tools for Automation

Chapter 5 – Software Tools for Value Added Services

Chapter 6 – Software Tools for DL Initiatives

Chapter 7 – Miscellaneous Supporting Tools

Chapter 8 – Conclusion

Chapter 9 – Appendix – OSI Certified Licenses

Chapter 10 – Selective Bibliography

Chapter 1

Introduction

“Any sufficiently advanced technology is indistinguishable from magic.” – Arthur C. Clarke

• An Invitation to Library Software

• Objective of the Study

• Scope of the Project

• Distribution of Chapters

1 An Invitation to Library Software

Developments in electronic and communication technology have affected every

profession in the past decades and libraries are no exception. Libraries of all types

are challenged to provide greater information access and improved levels of

service, while coping with the pace of technological change and ever-increasing

budget pressure.

Use of software applications in libraries has become essential due to a number of

factors. The most visible factors among them are:

• Growth of Electronic Resources: Large databases from periodical,

magazine, and journal publishers became increasingly available in digital

format – at first on CD-ROM, later via online services. Library services are

transitioning from local traditional collections to global resources provided

on demand via the most advanced networking technologies. Today, library

collections are used by people on campus as well as by individuals who are

not even located in the library’s physical facilities.

• Anytime Anywhere Access: Access to online digital information from

anywhere is the need of the hour. This is forcing a shift in the role of the library

from a repository to a gateway, with users expecting online libraries that

can provide round the clock service.

“Library users have grown accustomed to using the Internet as a research

tool and do not always appreciate the difference in quality of information

available through a library’s specialized collections, especially when

compared to what can be located through an Internet search engine. Thus,

libraries with substantial collections of information often find those

collections under utilized if the user interface is not designed to make it

easy to locate the required information.” (Pasquinelli, 2003)

• Resource Sharing: Libraries of all types also need to utilize new

application systems to automate resource sharing. Union Catalogs and

Inter-Library Loan modules are needed to allow cooperating institutions to

combine their catalogs and allow patrons of one library to request and

borrow materials from linked institutions. These technologies will foster

the growth of library consortia and the extension of offerings beyond the

organizational boundaries of individual libraries.

However, implementing new technologies and tools into library environments may

be a highly challenging task. Despite the significant benefits, many libraries do not

have the resources and infrastructure to maintain and upgrade available

technologies. In addition, there is a significant demand for standards-based, open

systems to promote interoperability.

Open Source Software (OSS), as will be discussed in the present study, comes to

the rescue of less-privileged libraries to deal with the increasing demands for use of

technology. OSS enables democratization of technology. OSS has definite

advantages over proprietary systems in the total cost of ownership (TCO), since it

is available free for download on the Internet. Thus OSS bears great importance to

the libraries in developing countries like India.

OSS also gives users of the software the freedom to customize it to their needs,

since they have access to its source code.

2 Objective of the Study

The objective of the present study is to look into the technologies and tools

available in the open source world that can be used in improving the services

within the libraries.

3 Scope of the Project

The project is based on the study of available Open Source Software (OSS) useful

to libraries in general. It includes integrated library systems (ILS), cataloguing

tools, resource sharing tools, digital library tools, and other information service

tools useful in day-to-day functioning of the libraries.

4 Distribution of Chapters

Chapter 1 – Introduction

Chapter 2 – Use of Software in Libraries

• Why Automate?

• Software Needs for Automation

• Software Needs for Value Added Services

• Software Needs for DL Initiatives

Chapter 3 – What is OSS?

• What is OSS?

• Criteria for OSS

• The OSS movement

• Why adopt OSS in Libraries?

Chapter 4 – Software Tools for Automation

• Integrated Solutions

• Databases

• Cataloguing/MARC Tools

• Z39.50 Tools

• Barcode Makers

Chapter 5 – Software Tools for Value Added Services

• Library Portal Solutions

• User Services

• Subject Gateways

Chapter 6 – Software Tools for DL Initiatives

• Digital Library Solutions

• DL-like Software

• OAI-PMH Tools

Chapter 7 – Miscellaneous Supporting Tools

• HTML tools

• XML tools

• Information Retrieval Tools

Chapter 8 – Conclusion

• Barriers in Using OSS

• Criteria for Selection of OSS

• Conclusion

Chapter 9 – Appendix – OSI Certified Licenses

Chapter 10 – Selective Bibliography

Chapter 2

Use of Software in Libraries

“Necessity is the mother of invention” – Proverb

• Why Automate?

• Software Needs for Automation

• Software Needs for Value Added Services

• Software Needs for Digital Library Initiatives

1 Why Automate?

Automation has been so thoroughly debated in the last few decades that we

do not see many arguments against it. Still we need to place the topic in the

context of possible improvements in the existing library services.

Benefits for Patrons: Library automation offers many opportunities to improve

services to the library users. Benefits include faster access to resources through

OPACs, remote access, access to online reference tools, etc.

Benefits for Staff: Automation reduces the need to do repetitive jobs manually. It

reduces the manual work involved in circulation, cataloguing, acquisitions, etc.

Automation allows the staff to take advantage of online resources and offline

databases when providing reference services.

Benefits for Institution: Automation not only builds a positive reputation for the

library services but also increases access points for the users.

2 Software Needs for Automation

Before we look into software needs, let us see which activities in a

library can be automated. There are basically two kinds of activities in a

library, viz., visible and background. Activities such as circulation and reference

services, which are visible to the users, are of the first kind. Activities such as

ordering, accessioning, and cataloguing can be referred to as the background

activities in a library.

The libraries also need to interact with other libraries to share resources. So the

third type of activity would be resource sharing with other libraries. Each of these

three kinds of activities is mostly still done manually in the traditional libraries.

2.1 Housekeeping activities

The housekeeping activities are essential for the day-to-day functioning of the

library. These include:

• Acquisitions: tracking the purchase of materials through ordering,

claiming, receiving, invoicing, and processing.

• Cataloging: creating catalogue records.

• Serials: automating ordering, receipt, routing, and renewals of all serial

subscriptions.

• Reminders: for library patrons as well as vendors of books and periodicals.

2.2 Services to users

• Online Public Access Catalog (OPAC): an electronic record of holdings,

bibliographic, and item information.

• Circulation: allowing librarians to check materials in and out, place

renewals or holds, and enter payments.

• Reference Services: to the users and other communities.
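
The circulation functions listed above can be sketched in a few lines of Python. This is only an illustration, not the data model of any particular library system: the class name `CirculationDesk`, its methods, and the 14-day loan period are all assumptions made for the example.

```python
from collections import deque
from datetime import date, timedelta

LOAN_DAYS = 14  # assumed loan period, not taken from any real system


class CirculationDesk:
    """Hypothetical in-memory circulation module: check-out, check-in, holds."""

    def __init__(self):
        self.loans = {}   # item_id -> (patron_id, due_date)
        self.holds = {}   # item_id -> queue of waiting patron_ids

    def check_out(self, item_id, patron_id, today):
        if item_id in self.loans:
            raise ValueError(f"{item_id} is already on loan")
        queue = self.holds.get(item_id)
        if queue and queue[0] != patron_id:
            raise ValueError(f"{item_id} is on hold for another patron")
        if queue:
            queue.popleft()  # the first patron in the queue collects the item
        due = today + timedelta(days=LOAN_DAYS)
        self.loans[item_id] = (patron_id, due)
        return due

    def place_hold(self, item_id, patron_id):
        self.holds.setdefault(item_id, deque()).append(patron_id)

    def check_in(self, item_id):
        """Return the item; report the next patron waiting for it, if any."""
        del self.loans[item_id]
        queue = self.holds.get(item_id)
        return queue[0] if queue else None


desk = CirculationDesk()
due = desk.check_out("B001", "alice", date(2003, 8, 1))
desk.place_hold("B001", "bob")
next_patron = desk.check_in("B001")
print(due, next_patron)
```

A real circulation module would also persist this state in a database and handle renewals and payments; the sketch keeps everything in memory to show only the basic flow.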

2.3 Resource Sharing

• ILL: for sharing resources.

• Cooperative Cataloguing: for sharing the cataloguing work among a group

of libraries.

• Union Catalogue: to enable easy identification of a resource in the

holdings of a group of libraries.
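
The idea of a union catalogue can be illustrated with a short Python sketch that merges the catalogues of cooperating libraries into one record per title, each listing the libraries that hold it. The function name `build_union_catalogue`, the choice of ISBN as the match key, and the sample records are assumptions made for the example; real systems match records with more elaborate rules.

```python
def build_union_catalogue(catalogues):
    """Merge per-library catalogues into one record per ISBN with holdings."""
    union = {}
    for library, records in catalogues.items():
        for record in records:
            entry = union.setdefault(
                record["isbn"], {"title": record["title"], "holdings": []}
            )
            entry["holdings"].append(library)
    return union


# Made-up sample data for two cooperating libraries.
catalogues = {
    "Library A": [
        {"isbn": "0000000001", "title": "Sample Title One"},
        {"isbn": "0000000002", "title": "Sample Title Two"},
    ],
    "Library B": [
        {"isbn": "0000000002", "title": "Sample Title Two"},
    ],
}

union = build_union_catalogue(catalogues)
for isbn, entry in sorted(union.items()):
    print(isbn, entry["title"], entry["holdings"])
```

A patron searching the merged catalogue can then see at a glance which member library to request an item from.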

3 Software Needs for Value Added Services

Value addition is an important need for any service institution. Libraries

always need to improve the quality of service by adding value to each of their

products.

• Library Website: has become very important in modern libraries. It is

more than simply a library OPAC and can include library rules, subject-

based directories, access to online resources, news items, as well as online

reservation.

• Subject Guides: are useful for academic libraries for supporting the

existing curriculum of the parent institution.

• Reading Lists: are the modern version of literature search services on a

specific topic.

• Web Directories: are used to organize Internet resources on the basis of

classification, often biased towards a particular subject.

4 Software Needs for Digital Libraries

The growth of electronic information over the decades and the democratization of

the Internet have paved the way for the emergence of digital libraries. Digital

libraries are more than a mere collection of digital documents. They can be seen as an

extension of the existing libraries with all the three basic functions, viz., collection,

organization, and dissemination of digital information resources.

The importance of digital libraries can be summarized in the following points:

• Digital Documents: As the number of digital and electronic documents will

continue to increase, librarians need to organize them as efficiently

as possible. Simple information retrieval systems are not enough to handle

digital documents. Use of metadata is important in managing digital

content. That is where digital libraries come into the picture.

• Archival Needs: Libraries now have access to electronic documents online as

well as on CD-ROM. These resources need to be archived efficiently.

• Online/Remote Access: Managing online access to resources available

over a network is growing in importance by the day.

• Full-Text Search Capabilities: Full-text search is needed in a number of

situations, e.g., when context-based search does not fetch enough

documents.

• OAI-PMH Needs: Open Archives Initiative Protocol for Metadata

Harvesting (OAI-PMH) is a model for sharing of metadata between digital

libraries by means of metadata harvesting. The model supports building

low-barrier yet high-end federated search services across a number of digital

libraries. The protocol needs to be implemented by the individual digital

libraries as well as the search service providers.
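
Metadata harvesting as described above can be sketched roughly in Python. The verb `ListRecords`, the `oai_dc` metadata prefix, and the XML namespaces come from the OAI-PMH 2.0 specification and Dublin Core; the base URL is a placeholder rather than a real repository, the helper names are invented for the example, and the response is a hand-written sample, so no network access is involved.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "http://repository.example.org/oai"  # placeholder, not a real endpoint

NS = {"dc": "http://purl.org/dc/elements/1.1/"}  # Dublin Core element namespace


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL a harvester would request to list all records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def extract_titles(response_xml):
    """Collect the Dublin Core titles from a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [title.text for title in root.findall(".//dc:title", NS)]


# Hand-written sample of what a data provider might return.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc
            xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
            xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample Record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url(BASE_URL))
print(extract_titles(SAMPLE_RESPONSE))
```

A search service provider would issue such requests to each participating digital library and index the harvested metadata centrally; a complete harvester would also need to handle resumption tokens and error responses.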

Chapter 3

What is OSS?

“Think free speech, not free beer” – Richard Stallman on Free Software Foundation

• What is OSS?

• The Ten Commandments

• The OSS movement

• Why Adopt OSS in Libraries?

1 What is OSS?

Open source is a software development model as well as a software distribution

model. In this model the source code of programs is made freely available with the

software itself so that anyone can see, change, and distribute it provided they abide

by the accompanying license. In this sense, Open Source is similar to peer review,

which is used to strengthen the progress of scholarly communication.

The open source software differs from the closed source or proprietary software

which may only be obtained by some form of payment, either by purchase or by

leasing. The primary difference between the two is the freedom to modify the

software.

An open system is a design philosophy antithetical to solutions designed to be

proprietary. The idea behind it is that institutions, such as libraries, can build a

combination of components and deliver services that include several vendors’

offerings. Thus, for instance, a library might use an integrated library system from

one of the major vendors in combination with an open source product developed by

another library or by itself in order to better meet its internal or users’ requirements.

Definition

According to Open Source Initiative (http://www.opensource.org/):

"Open source promotes software reliability and quality by supporting independent

peer review and rapid evolution of source code. To be certified as open source, the

license of a program must guarantee the right to read, redistribute, modify, and use

it freely."

Open source means several things (Chudnov, 1999):

• Open source software is typically created and maintained by developers

crossing institutional and national boundaries, collaborating by using

internet-based communications and development tools;

• Products are typically a certain kind of "free", often through a license that

specifies that applications and source code (the programming instructions

written to create the applications) are free to use, modify, and redistribute as

long as all uses, modifications, and redistributions are similarly licensed;

• Successful applications tend to be developed more quickly and with better

responsiveness to the needs of users who can readily use and evaluate open

source applications because they are free;

• Quality, not profit, drives open source developers who take personal pride

in seeing their working solutions adopted;

• Intellectual property rights to open source software belong to everyone who

helps build it or simply uses it, not just the vendor or institution who created

or sold the software.

2 The Ten Commandments

The Open Source Initiative (OSI) identified ten criteria for a software product to be

called open source. The OSI certifies a software license as an ‘OSI Certified

License’ on the basis of the following ‘Ten Commandments.’

1. Free Redistribution: The license shall not restrict any party from selling or

giving away the software as a component of an aggregate software

distribution containing programs from several different sources. The license

shall not require a royalty or other fee for such sale.

2. Source Code: The program must include source code, and must allow

distribution in source code as well as compiled form. Where some form of a

product is not distributed with source code, there must be a well-publicized

means of obtaining the source code for no more than a reasonable

reproduction cost–preferably, downloading via the Internet without charge.

3. Derived Works: The license must allow modifications and derived works,

and must allow them to be distributed under the same terms as the license of

the original software.

4. Integrity of the Author’s Source Code: The license may restrict source-

code from being distributed in modified form only if the license allows the

distribution of "patch files" with the source code for the purpose of

modifying the program at build time. The license must explicitly permit

distribution of software built from modified source code.

5. No Discrimination Against Persons or Groups: The license must not

discriminate against any person or group of persons.

6. No Discrimination Against Fields of Endeavor: The license must not

restrict anyone from making use of the program in a specific field of

endeavor.

7. Distribution of License: The rights attached to the program must apply to

all to whom the program is redistributed without the need for execution of

an additional license by those parties.

8. License Must not be Specific to a Product: The rights attached to the

program must not depend on the program's being part of a particular

software distribution.

9. The License Must not Restrict Other Software: The license must not

place restrictions on other software that is distributed along with the

licensed software. For example, the license must not insist that all other

programs distributed on the same medium must be open-source software.

10. The License must be Technology-Neutral: No provision of the license

may be predicated on any individual technology or style of interface.

3 The OSS Movement

The free/open source software movement began in the "hacker" culture of U.S.

computer science laboratories (Stanford, Berkeley, Carnegie Mellon, and MIT) in

the 1960's and 1970's. (Raymond, 2001)

The community of programmers at that time was small, and close-knit. Code

passed back and forth between the members of the community and if someone

made an improvement he/she was expected to submit that code to the community

of developers.

It was in this environment that Richard Stallman began his computer science career

in 1971 at the Massachusetts Institute of Technology

Artificial Intelligence Lab. There, Stallman and his colleagues built

an enormous array of software tools for the PDP-10 (Rasch, 2000). Stallman

launched the GNU Project (http://www.gnu.org/), whose name stands for GNU’s Not Unix, in the

early eighties, and later founded the Free Software Foundation (http://www.fsf.org/) to support it.

The Open Source movement has its roots in this hacker culture of the seventies and

eighties. According to Morgan (2002):

“OSS is both a philosophy and a process. As a philosophy it describes the

intended use of software and methods for its distribution. Depending on

your perspective, the concept of OSS is a relatively new idea being only

four or five years old. On the other hand, the GNU Software Project -- a

project advocating the distribution of "free" software -- has been

operational since the mid '80's. Consequently, the ideas behind OSS have

been around longer than you may think. It begins when a man named

Richard Stallman worked for MIT in an environment where software was

shared.”

OSS is also a process for the creation and maintenance of software. This is not a

formalized process, but rather a process of convention with common characteristics

between software projects. (Morgan, 2002)

4 Why Adopt OSS in Libraries?

The range and quality of software available for libraries is small compared to other

industrial applications. According to Daniel Chudnov (1999), this is not surprising:

“The library community is largely made up of not-for-profit, publicly

funded agencies which hardly command a major voice in today's high tech

information industry. As such, there is not an enormous market niche for

software vendors to fill our small demand for systems. Indeed the 1997

estimated library systems revenue was only $470 million, with the largest

vendor earning $60 million. Because even the most successful vendors are

very small relative to the Microsofts of this world (and because libraries

cannot compete against industry salary levels), there are relatively few

software developers available to build library applications, and therefore a

relatively small community pool of software talent.”

According to Eric Lease Morgan (2002), author of MyLibrary portal software:

“In many ways I believe OSS development, as articulated by Raymond, is

very similar to the principles of librarianship. First and foremost with the

idea of sharing information. Both camps put a premium on open access.

Both camps are gift cultures and gain reputation by the amount of "stuff"

they give away. What people do with the information, whether it be source

code or journal articles, is up to them. Both camps hope the shared

information will be used to improve our place in the world. Just as

Jefferson's informed public is a necessity for democracy, OSS is necessary

for the improvement of computer applications.”

According to Chudnov (1999) there are three factors pushing the use of OSS in

libraries:

1. OSS licenses allow libraries to cut their software budget and redirect the savings to other

issues needing more funds.

2. An OSS product is not locked into a single vendor. Thus even if a library buys

an open source system from one vendor, it might choose to buy technical

support from another company or get it from in-house experts.

3. The entire library community might share the responsibility of solving

information systems accessibility issues.

A Draft Report (2001) prepared for the Digital Library Federation (USA) to

consider Open Source Software for Libraries identifies three virtues of OSS in

libraries. They are:

• OSS is an economical alternative to libraries' reliance upon commercially

supplied software. That is, there are real costs involved in the

development, maintenance, and use of OSS, but these are lower

than those associated with library reliance upon commercial software.

• OSS is essential if libraries are to develop software and systems that meet

their patrons' needs. With OSS the IT infrastructure that is essential to

library operations and services can be:

o open (that is, built according to open standards and as such

potentially interoperable with other essential software and systems);

o ubiquitously available to libraries;

o capable of being tailored to suit the needs and circumstances of

individual libraries;

o documented (and documentation must be available); and

o debugged more effectively, since errors can be identified and corrected ("many

eyeballs make bugs shallow").

• OSS ensures that library systems and online services will be more

functional for libraries and their patrons and as such be good for library

patrons. This hypothesis is posited because, through OSS developments,

libraries:

o are reinserted into the research and development process that results

in systems and software;

o share a stake in software development and as such have greater

influence over (and as a result take a greater interest in specification

of) the functional and performance requirements associated with

particular software tools and systems;

o motivate and empower systems librarians and related technical staff

by encouraging creativity and positioning them to make a

difference; and

o are able more easily to collaborate with other information science

communities involved in common research and development areas.

OSS democratizes the use of software applications in libraries irrespective of the

size and scope of the library.

Chapter 4

Software Tools for Automation

“What one man can invent, another can discover.” – Arthur Conan Doyle

• Integrated Solutions

• Databases

• Cataloguing/MARC Tools

• Z39.50 Tools

• Barcode Makers

1 Integrated Solutions

Integrated Library Systems (ILS) are the current wave in the field of library

automation. An ILS combines several activities of the library into one integrated

system, allowing the library staff to perform all their functions online. These

activities range from simple housekeeping tasks such as acquisition and cataloguing to

user services and inter-library loan.

In the last few years we have seen the development of a number of ILS products in

the open source world. One important trend in these kinds of products is the use of

web-based client/server architecture. Listed below are some of the well-known ILS

products.

1.1 Koha: The First Open Source Integrated Library System

Description: Koha is the first open source fully featured integrated library system

(ILS) used by a considerable number of libraries in USA, New Zealand, and

Europe. The Koha ILS includes catalogue, OPAC, circulation, member

management, and acquisitions package. Koha is used by public libraries, private

collectors, non-profit organizations, churches, schools, and corporations.

Special Features: Some of the key features are:

• Simple clear interface for librarians and members (patrons) to search right

from the front page.

• Customizable search - you choose which fields you want on your search

forms when you set it up

• Reading lists for members - now you can find the name of that great book

you read last year.

• Full acquisitions including budgets and pricing information (including

supplier and currency conversion), being kept so that you can see what

you've ordered and received - so handy at end of year and audit time.

• Simple acquisitions for the smaller library

• Able to catalogue websites as items, or have them as links to existing

records.

History: Koha was developed in 1999 and the first library went live in January of

2000. Koha's code has been in production since then and is continuing to move

towards higher levels of functionality and standards compliance, including

embracing the international records and cataloguing standards MARC and Z39.50.

Project Sponsors/Administrators: Katipo Communications, and funding by

Horowhenua Library Trust and other libraries. Current project leader is Patrick

Eyler.

Dependency: Apache, Perl, MySQL (or any RDBMS)

Supported Platforms: Windows (without Z39.50 support), Linux, and UNIX

License: GNU General Public License

Availability: http://sourceforge.net/projects/koha, http://www.koha.org/download/

Further Information:

1. Project Homepage: http://www.koha.org/

2. Koha Wiki Page:

http://www.saas.nsw.edu.au/wiki/index.php?page=KohaProject

3. Koha Labs: http://www.kohalabs.com/

1.2 PhpMyLibrary

Description: PhpMyLibrary is a web-based library automation application meant

for smaller libraries. The system consists of cataloguing, circulation, and the OPAC

module. The system also has an import/export feature. It strictly follows the

USMARC standard for adding materials.

Special Features: The salient features are:

• Fully compatible with the Postnuke Content Management System enabling

easy integration with the Postnuke-based portal

• Online reservation system for library patrons, each with their own login

• Supports import from ISIS database with an ISIS2MARC program

History: Unknown

Project Sponsors/Administrators: Polerio Babao III, and Paolo Alexis Falcone

Dependency: Apache, PHP, MySQL, Python

Supported Platforms: Platform Independent

License: GNU General Public License

Availability: http://sourceforge.net/projects/phpmylibrary/

Further Information: Project Homepage: http://phpmylibrary.sourceforge.net/

1.3 OpenBiblio: A Library System That’s Free

Description: OpenBiblio is an easy to use, open source, automated library system

written in PHP containing OPAC, circulation, cataloging, and staff administration

functionality. The purpose of this project is to provide a cost effective library

automation solution for private collections, clubs, churches, schools, or public

libraries.

Special Features: The goals of the project have been to achieve the following:

• Intuitive and easy to use

• Well documented

• Easy to install with minimal expertise

• Designed with common library features to work with most library

workflows

It is fully compatible with the Postnuke Content Management System.

History: Unknown

Project Sponsors/Administrators: Dave Stevens

Dependency: Apache, PHP, MySQL

Supported Platforms: Platform Independent

License: GNU General Public License

Availability: http://sourceforge.net/project/showfiles.php?group_id=50071

Further Information: Project Home Page: http://obiblio.sourceforge.net/

1.4 GNU Library Management System (GLIBMS)

Description: GLIBMS is library management software developed using PHP and

PostgreSQL to automate the different activities carried out in a library. The

project is currently inactive at SourceForge. It has been renamed Karuna and is hosted at

sarovar.org.

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: Sharmad Naik, Gaurav Priyolkar

Dependency: Apache, PHP, Perl, PostgreSQL

Supported Platforms: Linux, UNIX

License: GNU General Public License

Availability: http://sourceforge.net/projects/glibs/

Further Information: Project Home Page: http://sourceforge.net/projects/glibs/

1.5 Avanti: An Open Source Library Computing System

Description: Avanti MicroLCS is an open source general purpose library

computing system that is small, simple, and easy to install and use. Written in

Java, it is platform independent and can run on any system that supports a Java

runtime environment. Although it targets small libraries, it has a powerful and very

flexible architecture that allows it to be adapted for use in libraries of any type.

Special Features: Some key objectives of the project are:

• Keep it as small, simple and extendable as possible, using a well-

considered, clean design.

• Implementation neutral: Base the design on a purely abstract model of

library systems. Avoid designing for a literal library. This makes the core

system very portable and adaptable to the needs of libraries of all types.

• Platform independent: 100% pure Java.

• It should be easy to install and use. Unlike most other open source

solutions, it should not require the skills of a system administrator to install

and maintain.

• User interfaces should be modeless, flat and simple.

• Keep the memory and resource footprint very small. Avanti is expected to

be used in a variety of forms, including that of a library automation

server appliance.

• Incorporate standards such as MARC and Z39.50 as modules and

interfaces, but do not allow them to become part of the underlying design.

History: Avanti is an effort, begun in 1998 by Peter Schlumpf, to develop a simple,

flexible, and open source solution to automating small and medium-sized libraries

of various types that requires a minimum of technical expertise to install and use.

Project Sponsors/Administrators: Peter Schlumpf

Dependency: Java Virtual Machine

Supported Platforms: Platform Independent

License: Unknown

Availability: http://home.earthlink.net/~schlumpf/avanti/downloads.html

Further Information: Project Home Page:

http://home.earthlink.net/~schlumpf/avanti/index.html

1.6 PhpMyBibli: A Free Solution for the Media Library

Description: PhpMyBibli is a web-based library automation system for French libraries.

Special Features: Some of the features are:

• Simplified administration that library staff can manage themselves

• Support for the UNIMARC format

• Management of authorities (authors, publishers, series, subjects...)

• Management of loans, reservations, and borrowers

• Support for cataloguing electronic resources

• Management of periodicals

History: Unknown

Project Sponsors/Administrators: Francois Lemarchand

Dependency: Apache, PHP, MySQL

Supported Platforms: Platform Independent

License: GNU General Public License

Availability: http://sourceforge.net/project/showfiles.php?group_id=64869

Further Information: Project Home Page: http://phpmybibli.sourceforge.net/

1.7 OpenBook

Description: OpenBook, a free Web-based integrated library system, offers

flexible, sophisticated automation to small and mid-sized public or school libraries

and was created to increase digital access to information. OpenBook uses open

source code to offer a low-cost, simple-to-use system rich in features generally

found only in high-end systems. The current technical beta version includes

complex searching capabilities, a full bibliographic record with external resource

linking as defined in MARC21, and a cataloging function that is MARC21-

compatible.

Special Features: Some distinctive features include the following:

• A completely Web-based cataloging system—It's simple to use, works with

any existing hardware or software, and supports all popular browsers.

• Combines total capture and retention of all MARC21 fields with custom

configuration of cataloging display fields

• A multilingual interface: can be displayed in any Roman-character

language

• Patron ability to access the system from home

• Enhanced safety features, including backup, restore, and purge

• A home page development template

History: OpenBook was developed as a modification of Koha, the first free open source

library system, created in New Zealand by the Horowhenua Library Trust and

Katipo Communications, Ltd. The Technology Resource Foundation's OpenBook

design team, which comprises experienced librarians and programmers, used Koha

as a basis to develop OpenBook from the ground up.

Project Sponsors/Administrators: Technology Resource Foundation

Dependency: Apache, Perl, MySQL

Supported Platforms: Unknown

License: Unknown, GNU GPL

Availability: Currently not accessible

Further Information:

1. Project Home Page: http://www.trfoundation.org/projects/faq.html

2. Press Release: http://www.infotoday.com/IT/sep01/news16.htm

1.8 Learning Access ILS

Description: The Learning Access ILS is a full-featured open source library

automation system developed for use by small public and school libraries in the

U.S. and the rest of the world. The Institute will make this system available free to

libraries that, because of cost, have been unable to achieve the benefits of

automation.

The Learning Access ILS consists of three modules: the patron or user module

(OPAC), the cataloging module and the circulation module. Future releases

may also include an acquisition module. All modules are Web-based and

support multilingual use, with the initial release supporting English, Spanish

and French.

Special Features: The system supports the full MARC21 format for bibliographic,

holding, authority and community records. It has an intuitive importing program to

add records to its database. The cataloging client includes Z39.50 searching

capabilities to allow for copy cataloging against OCLC or other larger union

databases. Future releases will also support Z39.50 searches against the database.

History: The Learning Access Institute pursues its mission through two distinct yet

interconnected programme areas. The Technology Development Program focuses

its efforts on the development and adaptation of information technology

solutions to meet the information and learning access needs of underserved

communities.

Project Sponsors/Administrators: Learning Access Institute

Dependency: Apache, PHP, Perl, MySQL

Supported Platforms: Linux, Windows NT/2000 (Not tested)

License: GNU General Public License

Availability: Not currently available

Further Information:

Project Home Page: http://www.learningaccess.org/website/techdev/ils.php

1.9 Karuna

Description: This project is a library management system designed to automate a

library. It takes into consideration all aspects of a library, such as search,

issue/retrieval and acquisition.

Special Features: Unknown

History: It is another version of the GNU Library Management System (GNU

LMS). According to the author of Karuna (who was also one of the developers of

GNU LMS), the original GNU LMS is no longer supported.

Project Sponsors/Administrators: Sharmad Naik

Dependency: Apache, PHP, PostgreSQL

Supported Platforms: Linux, UNIX

License: GNU GPL

Availability: http://sarovar.org/project/showfiles.php?group_id=34

Further Information: Project Home Page: http://sarovar.org/projects/karuna/

2 Databases

Databases are used increasingly in library software applications, whether in an

ILS, cataloguing software, an information retrieval tool, a reference service tool, a current

awareness service tool, or simply a library website. A number of

Relational Database Management Systems (RDBMS) are available as open source

(such as MySQL, PostgreSQL, and SAP DB), and they support the Structured Query

Language (SQL) standards.
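
As a minimal sketch of what SQL support means for a library application, the following Python example creates a small catalogue table and queries it by author. It uses SQLite from the Python standard library purely as a stand-in for MySQL or PostgreSQL; the table name, column names and sample rows are assumptions made for illustration and do not come from any of the packages described here.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL/PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE book (
        accession_no INTEGER PRIMARY KEY,
        title        TEXT NOT NULL,
        author       TEXT NOT NULL,
        year         INTEGER
    )
    """
)

# Illustrative sample rows (hypothetical accession numbers).
books = [
    (1001, "Colon Classification", "Ranganathan, S. R.", 1933),
    (1002, "The Five Laws of Library Science", "Ranganathan, S. R.", 1931),
    (1003, "The Cathedral and the Bazaar", "Raymond, Eric S.", 1999),
]
conn.executemany("INSERT INTO book VALUES (?, ?, ?, ?)", books)


def titles_by_author(connection, author_prefix):
    """Return titles whose author heading starts with the prefix, oldest first."""
    rows = connection.execute(
        "SELECT title FROM book WHERE author LIKE ? ORDER BY year",
        (author_prefix + "%",),
    )
    return [title for (title,) in rows]


print(titles_by_author(conn, "Ranganathan"))
```

The query is passed with a placeholder (`?`) rather than by string concatenation, which is the usual way to keep user-supplied search terms from being interpreted as SQL.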

2.1 OpenIsis

Description: OpenIsis is the open source member of the CDS/ISIS software

family. It is well suited for bibliographic databases with variable length fields and

repeatable sub-fields.

Special Features: Some of the special features are

• Highly flexible data structure: potentially unlimited number of data fields in

record

• Highly efficient storage: unused data fields consume no space

• Natural Modeling – ultra fast access: logically related data that would be

artificially separated in a relational DB is stored in a single record

• Highly flexible index structure: index entries associated with a record are

under full application control, can even be derived from associated text

documents of any format.
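
The data model described above can be sketched in a few lines of Python. This is not OpenIsis code and does not use its API; it only illustrates the idea of a record holding any number of repeatable, variable-length fields, each of which may carry sub-fields. The `^x` sub-field delimiter follows the CDS/ISIS convention, while the class name, method names and tag numbers are assumptions chosen for the example.

```python
from collections import defaultdict


def parse_subfields(value):
    """Split an ISIS-style field value into (code, text) pairs.

    Text before the first '^' is returned with the empty code ''.
    """
    head, *parts = value.split("^")
    pairs = [("", head)] if head else []
    pairs += [(part[0], part[1:]) for part in parts if part]
    return pairs


class Record:
    """A record as a mapping from numeric tag to a list of occurrences."""

    def __init__(self):
        self.fields = defaultdict(list)

    def add(self, tag, value):
        # Fields are repeatable: each call adds one more occurrence.
        self.fields[tag].append(value)

    def values(self, tag, code=""):
        """Return every sub-field text with the given code, across all occurrences."""
        return [
            text
            for value in self.fields.get(tag, [])
            for sub_code, text in parse_subfields(value)
            if sub_code == code
        ]


record = Record()
record.add(24, "Open Source Software for Libraries")  # hypothetical title tag
record.add(70, "^aAmin^bSaiful")                      # hypothetical author tag
record.add(70, "^aPrasad^bA R D")                     # second occurrence of the same tag

print(record.values(24))
print(record.values(70, "a"))
print(record.values(69, "a"))  # an absent field simply yields no values
```

Because absent fields are never stored, an unused tag costs nothing, which mirrors the storage claim made in the feature list.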

History: Developed since May 2001

Project Sponsors/Administrators: OpenIsis Verein, Berlin

Dependency: Unknown

Supported Platforms: Linux, UNIX, Windows, MacOS X

License: GNU GPL, LGPL

Availability: http://sourceforge.net/project/showfiles.php?group_id=11257

Further Information: Project Home Page: http://www.openisis.org/

2.2 PostgreSQL

Description: PostgreSQL is claimed to be the most advanced Open Source

database system in the world.

Special Features: Some of the special features are

• Exceptional performance and speed

• World-class security

• Flexibility to be extended as required

• Highly scalable design

• Minimal administration requirements

Full feature set is available at: http://advocacy.postgresql.org/advantages/

History: The PostgreSQL software itself had its beginnings in 1986 inside the

University of California at Berkeley as a research prototype, and in the 16 years

since has moved to its now globally distributed development model, with central

servers based in Canada.

Project Sponsors/Administrators: PostgreSQL Global Development Group

Dependency: Perl, Python, Tcl/Tk, JDK/Ant, Flex & Bison

Supported Platforms: Linux, UNIX, Windows (under cygwin environment)

License: BSD License

Availability: http://www.postgresql.org/mirrors-ftp.html

Further Information: Project Home Page: http://www.postgresql.org/

2.3 MySQL

Description: The MySQL database server is the world’s most popular open source

database. Its architecture makes it extremely fast and easy to customize. Extensive

reuse of code within the software and a minimalist approach to producing

functionality-rich features has resulted in a database management system

unmatched in speed, compactness, stability and ease of deployment. The unique

separation of the core server from the storage engine makes it possible to run with

strict transaction control or with ultra-fast transaction-less disk access, whichever is

most appropriate for the situation.

Special Features: Some of the major features are

• ANSI SQL syntax support

• Cross-platform support

• Independent storage engines

• Full-text indexing and searching

• Query caching

• Flexible security system, including SSL support

• Replication of database servers for robustness and speed

Full feature set is available at: http://www.mysql.com/products/mysql/index.html

History: The project was started in 1995 and has become quite mature in the last

five years. Undoubtedly it is the most popular open source RDBMS primarily

because of its speed.

Project Sponsors/Administrators: MySQL AB

Dependency: Unknown

Supported Platforms: Linux, UNIX, Windows, MacOS X

License: GNU GPL and Commercial non-GNU

Availability: http://www.mysql.com/downloads/index.html

Further Information: Project Home Page: http://www.mysql.com/

2.4 Firebird

Description: Firebird is a relational database offering many ANSI SQL-92 features

that runs on Linux, Windows, and a variety of Unix platforms. Firebird offers

excellent concurrency, high performance, and powerful language support for stored

procedures and triggers. It has been used in production systems, under a variety of

names since 1981.

Special Features: Unknown

History: In August 2000, Borland Software Corp. (formerly known as Inprise)

released the beta version of InterBase 6.0 as open source. The community of

waiting developers and users preferred to establish itself as an independent, self-

regulating team rather than submit to the risks, conditions and restrictions that the

company proposed for community participation in open source development. A

core of developers quickly formed a project and installed its own source tree on

SourceForge.

Project Sponsors/Administrators: Ann W. Harrison, Pavel Cisar, John Bellardo,

Mark Odonohue, David Jencks, Dmitry Yemanov, Sean Leyne

Dependency: glibc-2.2, ncurses4

Supported Platforms: Linux, UNIX, Windows

License: Mozilla Public License, InterBase Public License

Availability: http://sourceforge.net/project/showfiles.php?group_id=9028

Further Information: Project Home Page: http://www.firebirdsql.org/

2.5 SAP DB

Description: SAP DB is an open, SQL-based, relational database management

system for small to very large implementations, supporting object orientation and

unstructured data. SAP DB adheres to open standards including SQL, JDBC, and

ODBC; access from Perl and Python; and HTTP-based services with HTML or

extensible markup language (XML) content.

Special Features: The main features are

• Round-the-clock operation

• Easy administration

• Free of reorganization tasks

• Unlimited number of users

• Unlimited database size

• Supports all SAP solutions

History: Project started in October 2000.

Project Sponsors/Administrators: SAP AG, Germany

Dependency: Unknown

Supported Platforms: Windows NT, Linux

License: GNU GPL, LGPL

Availability: http://www.sapdb.org/7.4/sap_db_software.htm

Further Information: Project Home Page: http://www.sapdb.org/

3 Cataloguing/MARC Tools

Many small libraries cannot afford to implement an ILS, for various

reasons depending upon the clientele and available resources. Automating a small

part of the library's functions, such as cataloguing or circulation, might satisfy them. It

might also convince the library authority to go for full-fledged automation in future.

These tools are also useful for building OPAC services within the library or

through the library website.

There are a number of tools available for automation of the cataloguing function.

The important concern here is compliance with well-accepted standards like

AACR and MARC, for integration with future software.

3.1 Java Book Cataloguing System

Description: The purpose of this software is primarily to create a Book Catalog

using barcode data from the freely available CueCat barcode reader.

Special Features: It uses an RDBMS as its backend database and allows synchronization

between different library branches.

History: Unknown

Project Sponsors/Administrators: Josh Patterson

Dependency: Java, Hypersonic SQL, JDBC

Supported Platforms: Platform Independent

License: GNU Library or Lesser General Public License

Availability: http://sourceforge.net/project/showfiles.php?group_id=10661

Further Information: Project Home Page:

http://sourceforge.net/projects/jbiblioteca/

3.2 MARC/Perl

Description: MARC/Perl is a Perl library for reading, manipulating, outputting and

converting bibliographic records in the MARC format.

Special Features: Some of the important features are:

• Support for reading, editing, creating MARC records in batch mode

• Can be used to validate MARC records

• Can be used with Net::Z3950 to download MARC data in batch mode
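
To show what reading a MARC record involves, here is a minimal Python sketch of the ISO 2709 exchange structure that MARC files use: a 24-character leader, a directory of 12-character entries (tag, field length, start offset), and the fields themselves. It is not MARC/Perl and does not reproduce the MARC::Record interface; the function names are assumptions, and the sketch ignores MARC-8 character encoding by treating field data as UTF-8.

```python
FT = b"\x1e"  # field terminator
RT = b"\x1d"  # record terminator
SF = b"\x1f"  # subfield delimiter


def build_record(fields):
    """Serialise a list of (tag, body_bytes) pairs into one ISO 2709 record."""
    directory, data = b"", b""
    for tag, body in fields:
        body += FT
        # Directory entry: 3-char tag, 4-digit length, 5-digit start offset.
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    base = 24 + len(directory) + 1          # leader + directory + terminator
    total = base + len(data) + 1            # plus the record terminator
    leader = f"{total:05d}nam a22{base:05d} a 4500".encode("ascii")
    return leader + directory + FT + data + RT


def parse_record(raw):
    """Return (leader, fields) for a single ISO 2709 record."""
    leader = raw[:24].decode("ascii")
    base = int(leader[12:17])               # base address of data
    directory = raw[24:base - 1]
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        body = raw[base + start:base + start + length - 1]  # drop terminator
        if tag < "010":
            # Control fields carry plain data without indicators or subfields.
            fields.append((tag, body.decode("utf-8")))
        else:
            indicators = body[:2].decode("ascii")
            subfields = [
                (chunk[:1].decode("ascii"), chunk[1:].decode("utf-8"))
                for chunk in body[2:].split(SF)[1:]
            ]
            fields.append((tag, indicators, subfields))
    return leader, fields


raw = build_record([
    ("001", b"12345"),
    ("245", b"10" + SF + b"aOpen source software for libraries /" + SF + b"cSaiful Amin."),
])
leader, fields = parse_record(raw)
for field in fields:
    print(field)
```

Validation of the kind mentioned above would build on the same structure, for example by checking that the record length in the leader matches the number of bytes actually read.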

History: In 1999 a group of developers began working on MARC.pm to provide a

Perl module for working with MARC data. MARC.pm was quite successful since it

grew to include many new options that were requested by the Perl/library

community.

In mid 2001 Andy Lester released MARC::Record and MARC::Field, which

provided a much simpler and more maintainable package for processing MARC

data with Perl. Instead of forking the two projects the developers agreed to

encourage use of the MARC::Record framework, and to work on enhancing

MARC::Record rather than extending MARC.pm further. Soon afterwards

MARC::Batch was added which allows you to read in a large data file without

having to worry about memory consumption.
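
The memory-saving idea behind batch reading can be sketched as a Python generator. This is not how MARC::Batch is implemented; it is one plausible approach that relies only on the fact that every ISO 2709 record begins with its own length as five ASCII digits, so a reader can pull one record at a time from a file of any size. The helper `fake_record` is a hypothetical test aid that produces length-prefixed blobs, not valid MARC.

```python
import io


def iter_records(stream):
    """Yield raw records one at a time from a binary stream.

    Only one record is held in memory at any moment.
    """
    while True:
        head = stream.read(5)
        if not head:
            return                      # clean end of file
        if len(head) < 5 or not head.isdigit():
            raise ValueError("malformed record length")
        length = int(head)
        if length < 5:
            raise ValueError("record length too small")
        rest = stream.read(length - 5)
        if len(rest) != length - 5:
            raise ValueError("truncated record")
        yield head + rest


def fake_record(payload):
    """Hypothetical helper: wrap a payload with a 5-digit length prefix."""
    body = payload + b"\x1d"
    return f"{len(body) + 5:05d}".encode("ascii") + body


stream = io.BytesIO(fake_record(b"first") + fake_record(b"second record"))
for raw in iter_records(stream):
    print(len(raw), raw[5:-1])
```

In practice the stream would be a file opened in binary mode, and each yielded record would be handed to a parser before the next one is read.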

Project Sponsors/Administrators: Andy Lester, Edward Summers

Dependency: Perl

Supported Platforms: UNIX, Linux, Windows

License: GNU General Public License

Availability: http://sourceforge.net/project/showfiles.php?group_id=1254,

http://www.cpan.org/modules/by-module/MARC/

Further Information:

1. Project Home Page: http://marcpm.sourceforge.net/

2. CPAN Site: http://search.cpan.org/author/PETDANCE/MARC-Record-1.29/

3.3 MARC Template Library

Description: The MARC Template Library is a C++ API (using C++ templates

and STL) for reading, writing and processing MARC records.

Special Features: The project provides a simple Windows-based graphical tool to

convert MARC records into MARCXML.
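
The shape of a MARCXML record can be illustrated with a short Python sketch. It does not use the MARC Template Library (which is C++); it simply builds the standard MARCXML elements (`record`, `leader`, `controlfield`, `datafield`, `subfield`) in the Library of Congress namespace from already-parsed field data. The function name and the input layout are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

MARCXML_NS = "http://www.loc.gov/MARC21/slim"


def to_marcxml(leader, control_fields, data_fields):
    """Serialise parsed MARC data to a MARCXML string.

    control_fields: list of (tag, value)
    data_fields:    list of (tag, ind1, ind2, [(code, text), ...])
    """
    ET.register_namespace("", MARCXML_NS)   # emit it as the default namespace

    def q(name):
        return f"{{{MARCXML_NS}}}{name}"

    record = ET.Element(q("record"))
    ET.SubElement(record, q("leader")).text = leader
    for tag, value in control_fields:
        ET.SubElement(record, q("controlfield"), {"tag": tag}).text = value
    for tag, ind1, ind2, subfields in data_fields:
        datafield = ET.SubElement(
            record, q("datafield"), {"tag": tag, "ind1": ind1, "ind2": ind2}
        )
        for code, text in subfields:
            ET.SubElement(datafield, q("subfield"), {"code": code}).text = text
    return ET.tostring(record, encoding="unicode")


xml_text = to_marcxml(
    "00000nam a2200000 a 4500",
    [("001", "12345")],
    [("245", "1", "0", [("a", "Open source software for libraries")])],
)
print(xml_text)
```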

History: The author developed these tools to improve his knowledge of the C++

Standard Template Library.

Project Sponsors/Administrators: Mark Basedow

Dependency: C++ Compiler

Supported Platforms: Windows, Linux, UNIX

License: BSD License

Availability: http://sourceforge.net/project/showfiles.php?group_id=43694

Further Information: Project Home Page: http://mtl.sourceforge.net

3.4 jake2marc

Description: jake2marc is a utility that creates simple USMARC records for the

full-text journals in any of the databases listed in the jake (Jointly Administered

Knowledge Environment: http://www.jake-db.org/) project.

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: Mark Jordan

Dependency: Perl, libwww-perl & MARC::Record (Perl modules)

Supported Platforms: Linux, Windows

License: GNU GPL

Availability: http://jake.lib.sfu.ca/jake2marc/

Further Information: Project Home Page: http://jake.lib.sfu.ca/jake2marc/

3.5 UseMARCON

Description: The USEMARCON software is designed to provide users with two

specific services.

• The facility to convert MARC records compliant with a specified input

format into MARC records compliant with a specified output format.

• The facility to create and modify rules files, used to achieve MARC

conversions, in order to meet specific local requirements. The present

software is designed to be used by senior cataloguers or others with a

detailed knowledge of the structure of the MARC formats they wish to

convert between.

Special Features: The UseMARCON project aimed to develop a generic toolkit

for ISO2709 compatible MARC formats to enable libraries to create rules based

systems to convert records between national MARC formats. This would give

libraries the ability to obtain records from a far wider range of potential sources

than those currently available to them and stimulate an increase in the international

exchange of bibliographic records.
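
The idea of a rules-based converter can be sketched in Python as a lookup table that maps a source tag and subfield to a target tag and subfield. This is not the USEMARCON rule syntax, which is not shown in this report; the rule table below is a deliberately simplified, illustrative UNIMARC-to-MARC21 mapping, and the function and variable names are assumptions.

```python
# Illustrative rules: (source tag, subfield) -> (target tag, subfield).
# A real conversion needs many more rules and handles indicators and punctuation.
RULES = {
    ("200", "a"): ("245", "a"),  # title proper
    ("200", "f"): ("245", "c"),  # statement of responsibility
    ("700", "a"): ("100", "a"),  # personal name, main entry
    ("210", "c"): ("260", "b"),  # name of publisher
}


def convert(record, rules):
    """Apply the rules to a record given as [(tag, [(code, text), ...]), ...].

    Returns the converted fields sorted by tag and a list of unmapped subfields,
    so that a cataloguer can see which local rules are still missing.
    Simplification: repeated source fields are merged into one target field.
    """
    converted, unmapped = {}, []
    for tag, subfields in record:
        for code, text in subfields:
            target = rules.get((tag, code))
            if target is None:
                unmapped.append((tag, code, text))
                continue
            new_tag, new_code = target
            converted.setdefault(new_tag, []).append((new_code, text))
    return sorted(converted.items()), unmapped


unimarc = [
    ("200", [("a", "Open source software for libraries"), ("f", "Saiful Amin")]),
    ("700", [("a", "Amin")]),
    ("300", [("a", "Trend report")]),
]
fields, leftovers = convert(unimarc, RULES)
print(fields)
print(leftovers)
```

Keeping the rules in data rather than in code is what lets a cataloguer adjust a conversion to local requirements without touching the program itself.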

History: The UseMARCON Project, which was successfully completed in

February 1997, was funded by the consortium partners and the EU's Telematics

Applications Programme (DGXIII-E). The partners of the UseMARCON Project

consortium were drawn from a variety of library and information technology

backgrounds and comprised the following:

Partners:

• Koninklijke Bibliotheek, Holland

• Instituto da Biblioteca Nacional e do Livro, Portugal

• The British Library, UK

Project Sponsors/Administrators: UseMARCON Consortium and Jouve S.I.

Dependency: C++ Compiler, XVT C++ Toolkit

Supported Platforms: Windows, UNIX, Linux

License: Unknown: Unsupported freeware (with source code)

Availability: ftp://ftp.bl.uk/pub/nbs/ec/usemarcon/, ftp://ftp.kb.nl/pub/usemarcon/

Further Information: Project Home Page:

http://www.konbib.nl/kb/resources/frameset_kb.html?/kb/sbo/bibinfra/usema-en.html

3.6 USEMARCON Plus

Description: USEMARCON is a software application that allows users to convert

bibliographic records from one MAchine-Readable Cataloguing (MARC) format to

another.

Special Features: The British Library has since further developed the

USEMARCON application. This work was carried out on behalf of the Library by

Crossnet Systems Limited. The program has been enhanced in the following ways:

• The redevelopment of the application removing all proprietary XVT

components and substituting public domain equivalents.

• The removal of the graphical user interface in order that the program can

function as part of a batch process from the system command line.

• The re-design of the application for 32bit MS Windows and Linux

operating systems.

• The optimization of the program to allow the conversion of large files.

• The integration of new rule functions to enable the creation of more

complex conversions.

History: In 1995, a project funded by the European Union was set up to address

the problem of converting records between MARC formats. The project was

successfully completed in 1997 with the development of

the USEMARCON (User Controlled Generic MARC Converter) software.

Project Sponsors/Administrators: The British Library, Crossnet Systems

Dependency: C++ Compiler, XVT C++ Toolkit

Supported Platforms: Windows, Solaris, UNIX

License: Unknown: Unsupported freeware

Availability: ftp://ftp.bl.uk/pub/nbs/ec/usemarcon

Further Information: Project Home Page:

http://www.bl.uk/services/bibliographic/usemarcon.html

3.7 Marc2Opac

Description: Marc2Opac is a PHP4 script for searching and displaying MARC

files. It supports a good range of searching techniques and is fast (searching more

than 100,000 entries in a second).

Special Features: The features added to this PHP module include

• Advanced search

• Subscriber logon

• Reservations system

History: Bundaberg City Council, Australia, developed Marc2Opac to put their

library catalogue online. Other features were added later.

Project Sponsors/Administrators: IT Services, Bundaberg City Council

(Australia)

Dependency: Apache, PHP, Grep

Supported Platforms: Linux

License: Unknown

Availability: http://www.bundaberg.qld.gov.au/library/catalog/about.php4

Further Information: Project Home Page:

http://www.bundaberg.qld.gov.au/library/catalog/about.php4

3.8 Medlane XMLMARC

Description: Medlane XMLMARC is a computer program that converts MARC

records into XML. It can also update MARC records, based on plain text

processing instructions, and write records to a file in the MARC format.

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: Kevin S. Clarke

Dependency: Java

Supported Platforms: Platform Independent

License: GNU GPL, LGPL

Availability: http://sourceforge.net/project/showfiles.php?group_id=48203

Further Information: Project Home Page: http://medlane.stanford.edu/

3.9 MARCUTL

Description: MARCUTL (the MARC Update and Transformation Language) is a

mapping language that converts MARC into XML or MARC into "updated

MARC" based on the instructions in a MARCUTL file. These files are expressed

in XML and must conform to the MARCUTL schema.

Special Features: MARCUTL provides several built-in methods of updating or

transforming MARC records, but it also provides for the creation of special MARC

processing classes. These classes implement a particular interface, described in the

MARCUTL API (application programming interface), and accept a MedMARC

Record as input. MedMARC is a Java API for handling MARC records that was

developed by the Medlane project.

History: Unknown

Project Sponsors/Administrators: Kevin S. Clarke

Dependency: Java

Supported Platforms: Platform Independent

License: GNU GPL, LGPL

Availability: To be available

Further Information: Project Home Page: http://medlane.stanford.edu/

4 Z39.50 Tools

The Z39.50 standard specifies a client/server-based protocol for searching and

retrieving information from remote databases. The protocol is sponsored by the

American National Standards Institute (ANSI) and the US National Information

Standards Organization (NISO). The first version of the protocol was published in

1988. The second version came out in 1992 and the latest version (version 3) is

dated 1995. A further revision (Z39.50-2001) is still in progress.

A library uses the Z39.50 protocol either to get bibliographic data from other

libraries or to provide bibliographic services to other libraries. The library may

choose to be either a client (searching and downloading records) or a server (allowing

others to search and download local records). There are tools available to implement

both roles.

Z39.50 might prove beneficial in identifying resources through its powerful

broadcast search functions where a user can send a query to a large number of

servers to search bibliographic records. This way the protocol can be seen as an

alternative to union catalogues, though it still does not support displaying holdings

records in the search results. It can also be combined with other activities,

such as inter-library loan (ILL), to speed up the process.
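
The broadcast search pattern described above can be sketched in Python without any Z39.50 library: the same query is sent to several targets in parallel and the answers are collected per target. The function `search_in_memory` and the `TARGETS` table are hypothetical stand-ins for real Z39.50 connections; only the fan-out and collection logic is the point of the example.

```python
from concurrent.futures import ThreadPoolExecutor


def broadcast_search(query, targets, search_one, max_workers=8):
    """Send one query to every target in parallel.

    targets:    dict mapping a target name to whatever handle search_one needs
    search_one: callable(handle, query) -> list of records
    Returns (results, failures), both keyed by target name.
    """
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            name: pool.submit(search_one, handle, query)
            for name, handle in targets.items()
        }
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:  # one failing target must not sink the rest
                failures[name] = str(exc)
    return results, failures


def search_in_memory(catalogue, query):
    """Hypothetical stand-in for a remote search: substring match on titles."""
    if catalogue is None:
        raise ConnectionError("target unreachable")
    return [title for title in catalogue if query.lower() in title.lower()]


TARGETS = {
    "library_a": ["Open source software", "Library automation"],
    "library_b": ["Open standards for libraries"],
    "library_c": None,  # simulates a server that is down
}

hits, errors = broadcast_search("open", TARGETS, search_in_memory)
print(hits)
print(errors)
```

Reporting failures separately matters in a broadcast search, since some of the servers queried will usually be slow or unavailable at any given time.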

4.1 YAZ Toolkit

Description: YAZ (Yet Another Z39.50 Toolkit) is a toolkit for implementing the

Z39.50-1995 standard and protocol. Both the Origin (client) and Target (server)

roles of the protocol are supported. The toolkit is written in C.

Special Features: Its ability to provide an open, well-defined, and structured

framework to information retrieval tasks within any application domain makes it an

obvious candidate for use in many different roles.

History: Unknown

Project Sponsors/Administrators: Index Data

Dependency: None

Supported Platforms: UNIX, Linux, Windows

License: Index Data Copyright (Based on BSD License)

Availability: http://www.indexdata.dk/yaz/

Further Information: Project Home Page: http://www.indexdata.dk/yaz/

4.2 ZContent

Description: ZContent is a Perl script and module that provides a Z39.50 target for

the CONTENTdm server. CONTENTdm (http://contentdm.com/) is a commercial

digital collection management software.

ZContent is based on the open source SimpleServer Perl module which is provided

by Index Data (http://www.indexdata.com/simpleserver/). SimpleServer is based

on the YAZ toolkit, which is also provided by Index Data.

(http://www.indexdata.com/yaz/). USMARC Records are created using the

MARC::Record Perl module.

Special Features: Unknown

History: The University of Utah Marriott Library has developed software that adds

Z39.50 compatibility to any CONTENTdm digital collections server.

Project Sponsors/Administrators: Aaron DeMille, Kenning Arlitsch (University

of Utah)

Dependency: Perl, YAZ Toolkit, SimpleServer

Supported Platforms: Windows

License: GNU General Public License

Availability: http://sourceforge.net/projects/zcontent

Further Information: Project Home Page:

http://www.lib.utah.edu/digital/ZContent.html

4.3 SimpleServer

Description: SimpleServer is a Perl module which is intended to make it as simple

as possible to develop new Z39.50 servers over any type of database imaginable.

All you have to do is implement a function for initializing your database (optional),

searching the database, and returning "database records" on request. The module

takes care of everything else and automatically starts a server for you, listens to

incoming connections, and implements the Z39.50 protocol.
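
The division of labour described above (the application supplies a few callbacks, the framework does the rest) can be sketched in Python. This is not the SimpleServer API, which is a Perl module, and it does no networking or protocol encoding; `MiniServer`, its request dictionaries and the handler signatures are all hypothetical, chosen only to show the callback pattern.

```python
class MiniServer:
    """Dispatches requests to application-supplied handlers."""

    def __init__(self, search, fetch, init=None):
        self.handlers = {"init": init, "search": search, "fetch": fetch}
        self.session = {}  # per-connection state shared between handlers

    def handle(self, request):
        kind = request["type"]
        if kind == "init":
            if self.handlers["init"] is not None:  # the init handler is optional
                self.handlers["init"](self.session)
            return {"ok": True}
        if kind == "search":
            return {"hits": self.handlers["search"](self.session, request["query"])}
        if kind == "present":
            return {"record": self.handlers["fetch"](self.session, request["offset"])}
        raise ValueError(f"unsupported request type: {kind}")


# Application side: a list of titles plays the role of "any type of database".
BOOKS = ["Open source software", "Library automation", "Open standards"]


def my_search(session, query):
    session["result_set"] = [b for b in BOOKS if query.lower() in b.lower()]
    return len(session["result_set"])


def my_fetch(session, offset):
    return session["result_set"][offset - 1]  # offsets are 1-based here


server = MiniServer(search=my_search, fetch=my_fetch)
print(server.handle({"type": "init"}))
print(server.handle({"type": "search", "query": "open"}))
print(server.handle({"type": "present", "offset": 2}))
```

The application code never sees the protocol itself, which is why the same pattern can front a relational database, a file store or any other source of records.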

Special Features: Use SimpleServer together with other Perl modules to provide

gateways to relational databases, local file stores, SOAP/RDF-servers, etc.

SimpleServer currently supports the Init, Search, Present, Scan and Close services.

If you are interested in other functionality, get in touch and we'll help if we can.

History: Unknown

Project Sponsors/Administrators: Index Data

Dependency: YAZ 1.8 or later

Supported Platforms: UNIX, Linux, Windows

License: Index Data Copyright

Availability: http://www.indexdata.dk/simpleserver/

Further Information: Project Home Page: http://www.indexdata.dk/simpleserver/

4.4 VB Zoom

Description: VB ZOOM is a collection of ActiveX COMponents, written in Visual

Basic, which implement the ZOOM 1.2 (Z39.50 Object-Orientation Model)

Abstract API. The current VB ZOOM is a wrapper for the YAZ Toolkit from Index

Data, plus a helper component for doing MARC-8 to Unicode character

conversions.

Special Features: Unknown

History: The original VB ZOOM was developed for a project called ZMARCO as

part of the Open Archives Initiative Metadata Harvesting Project at the University

of Illinois at Urbana-Champaign, funded by the Andrew Mellon Foundation.

Continuing work on this and the ZMARCO project is being funded by a Library

Services and Technology Act grant from the Illinois State Library.

Project Sponsors/Administrators: Index Data, Denmark

Dependency: Yaz.dll V 2.0.1 (YAZ Toolkit)

Supported Platforms: Windows

License: University of Illinois/NCSA Open Source License (http://vb-zoom.sourceforge.net/License.html)

Availability: http://sourceforge.net/project/showfiles.php?group_id=53790

Further Information: Project Home Page: http://vb-zoom.sourceforge.net/

4.5 JZkit

Description: A pure Java toolkit to assist in the development of information

retrieval systems using the Z39.50 standard.

Special Features: The toolkit is presented in three distinct levels:

Encoders/Decoders, Protocol Endpoint and IR-Services. A number of example

origin and target implementations are available.

History: Unknown

Project Sponsors/Administrators: Ian Ibbotson

Dependency: Java VM

Supported Platforms: Platform Independent

License: GNU General Public License

Availability: http://sourceforge.net/project/showfiles.php?group_id=16429

Further Information: Project Home Page: http://www.k-int.com/jzkit

4.6 Zeta Perl

Description: ZETA Perl defines a set of functions, variables and conventions that

provide a consistent interface to the Z39.50 services and protocol for Perl

applications. It was mainly designed and implemented to be usable by web

developers. However, it can also help in writing a Z39.50 client with

very little effort.

Special Features: The current version of the ZETA Perl (0.059) supports the

following APDUs: Init, Search, Present, Close, Delete, Scan and Sort

History: Unknown

Project Sponsors/Administrators: Unknown

Dependency: Perl 5.003 or better

Supported Platforms: Linux, Solaris, AIX

License: Perl Artistic License, GNU GPL

Availability: ftp://zeta.tlcpi.finsiel.it/pub/zeta/

Further Information: Project Home Page:

http://lcweb.loc.gov/z3950/agency/resources/software.html

4.7 ZedKit for Unix

Description: ZedKit is a set of Z39.50 application development libraries for UNIX, developed

for the German library project DBV OSI II and for the ONE project, co-funded by

the European Commission Libraries Programme.

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: Crossnet Systems, UK

(http://www.crxnet.com/)

Dependency: None

Supported Platforms: UNIX, Linux

License: Unknown (ftp://ftp.ddb.de/pub/dbvosi/dbvosiII-2.1.README)

Availability: http://www.crxnet.com/ZedKit_download.php

Further Information: Project Home Page: http://www.crxnet.com/zedkit.php

4.8 IrTcl Toolkit

Description: IrTcl is an extension to the Tcl/Tk (http://www.tcl.tk/) language

environments. IrTcl allows you to rapidly develop platform-independent, graphical

clients to the Z39.50 protocol supporting both the X Window and MS-Windows

environments.

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: Index Data

Dependency: Tcl/Tk, YAZ Toolkit

Supported Platforms: UNIX, Linux, Windows

License: Unknown

Availability: http://www.indexdata.dk/irtcl/

Further Information: Project Home Page: http://www.indexdata.dk/irtcl/

4.9 Net::Z3950

Description: The Net::Z3950 module provides a Perl interface to the Z39.50

information retrieval protocol (ISO 23950), a mature and powerful protocol used in

application domains as diverse as bibliographic information, geo-spatial mapping,

museums and other cultural heritage information, and structured vocabulary

navigation.

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: Mike Taylor

Dependency: Perl, YAZ Toolkit

Supported Platforms: UNIX, Linux, Windows (under Cygwin environment)

License: Perl Artistic License

Availability: http://perl.z3950.org/download/,

http://www.cpan.org/modules/by-module/Net/MIRK/

Further Information: Project Home Page: http://perl.z3950.org/

5 Barcode Makers

A barcode is a representation of an alphanumeric code as a pattern of printed

bars. Each barcode uniquely identifies one such code and can be read by

machines. Barcode technology was invented for the automatic identification of

products in food store chains in the USA, to enable rapid checkout of items. The use of

barcodes for books came much later, after the ISBN came into vogue.

Barcodes were at first used mostly by department stores and bookstores to expedite the

process of checkout. They have since been found useful in automating

the check-in and check-out process in library circulation. Barcode

labels are usually assigned on the basis of the accession number of a

document, which uniquely identifies an item within the library.
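
As an illustration of turning an accession number into barcode data, the sketch below uses Code 39, one of the symbologies supported by the tools in this section. It computes the optional modulo-43 check character and frames the payload with the "*" start/stop character. The accession number format and the function names are assumptions made for the example, and converting to upper case is a simplification, since standard Code 39 has no lower-case letters.

```python
# Code 39 character set in value order (values 0 to 42).
CODE39_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%"


def code39_check_char(data: str) -> str:
    """Return the optional modulo-43 check character for a Code 39 payload."""
    total = 0
    for ch in data:
        value = CODE39_CHARS.find(ch)
        if value < 0:
            raise ValueError(f"{ch!r} cannot be encoded in Code 39")
        total += value
    return CODE39_CHARS[total % 43]


def accession_label(accession_no: str) -> str:
    """Build the text to encode: payload plus check character, framed by '*'."""
    payload = accession_no.upper()  # assumption: lower case is simply folded
    return f"*{payload}{code39_check_char(payload)}*"


# Hypothetical accession number.
print(accession_label("lib12345"))
```

The returned string is what a barcode font or rendering library would turn into bars; the rendering step itself is outside this sketch.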

5.1 GNU Barcode

Description: GNU Barcode is a tool to convert text strings to printed bars. It

supports a variety of standard codes to represent the textual strings and creates

PostScript output. The popular KBarcode software uses GNU Barcode as its

backend.

Output is generated as either PostScript or Encapsulated PostScript (other back-ends

may be added if needed). The package is released as both a library and a

command-line front end, so that barcode generation can be included in one's own application.

Special Features: Main features of GNU Barcode:

• Available as both a library and an executable program

• Supports UPC, EAN, ISBN, CODE39 and other encoding standards

• PostScript and Encapsulated PostScript output

• Accepts sizes and positions in inches, centimeters, or millimeters

• Can create tables of barcodes (to print labels on sticker pages)
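
Since ISBN barcodes on books are printed in EAN-13 form, it may help to see how an ISBN-10 maps onto that form: the old check character is dropped, the prefix 978 is added, and a new EAN-13 check digit is computed. The following Python sketch shows the arithmetic only; the function names are illustrative and are not part of GNU Barcode.

```python
def ean13_check_digit(first12: str) -> int:
    """Check digit for the first 12 digits of an EAN-13 number."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def isbn10_to_ean13(isbn10: str) -> str:
    """Convert an ISBN-10 to its 13-digit EAN form."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10:
        raise ValueError("expected a 10-character ISBN")
    # Drop the ISBN-10 check character, prefix 978, recompute the check digit.
    body = "978" + digits[:9]
    return body + str(ean13_check_digit(body))


print(isbn10_to_ean13("0-306-40615-2"))  # prints 9780306406157
```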

History: Unknown

Project Sponsors/Administrators: GNU

Dependency: Unknown

Supported Platforms: Unknown

License: GNU GPL

Availability: http://www.gnu.org/software/barcode/barcode.html

Further Information: Project Home Page:

http://www.gnu.org/software/barcode/barcode.html

5.2 KBarcode: The Open Source Barcode Solution

Description: KBarcode is a barcode and label printing application for KDE 3. It

can be used to print everything from simple business cards up to complex labels

with several barcodes (e.g. article descriptions).

Special Features: KBarcode comes with an easy-to-use WYSIWYG label

designer, a setup wizard, batch import of labels (directly from the delivery note),

thousands of predefined labels, database management tools and translations in many

languages. Even printing more than 10,000 labels in one go is no problem for

KBarcode.

Additionally, it is a simple xbarcode replacement for the creation of barcodes. All

major types of barcodes like EAN, UPC, CODE39 and ISBN are supported.

History: Unknown

Project Sponsors/Administrators: Project Leader: Stefan Onken.

Core Programmer: Dominik Seichter

Dependency: KDE 3, pdf417_encode (for 2-D barcodes)

Supported Platforms: Linux

License: GNU GPL

Availability: http://sourceforge.net/project/showfiles.php?group_id=51628

Further Information: Project Home Page: http://www.kbarcode.net/

5.3 PHP Barcode

Description: PHP Barcode is a small implementation of a barcode rendering class using

the PHP language and the GD graphics library.

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: Karim Mribti

Dependency: Apache, PHP, GD Graphics Library

Supported Platforms: Unknown

License: GNU LGPL

Availability: http://www.mribti.com/barcode/download.php

Further Information: Project Home Page: http://www.mribti.com/barcode/

5.4 Barcodes-on-the-fly

Description: This utility will generate printable barcodes in the CODABAR (NW-7)

format based on the information you provide. The author hopes that libraries and

others will be able to print cheap disposable barcodes for, among other things,

books on loan from another library.
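
Codabar encodes only the digits and the symbols - $ : / . +, and each symbol is framed by a start and a stop character chosen from A, B, C and D. The Python sketch below checks a payload against that character set and adds the framing characters. It is an illustration only: the function name, the default A/B pair and the sample item number are assumptions, not taken from Barcodes-on-the-fly.

```python
CODABAR_DATA_CHARS = set("0123456789-$:/.+")
CODABAR_START_STOP = set("ABCD")


def frame_codabar(payload: str, start: str = "A", stop: str = "B") -> str:
    """Validate a payload and wrap it in Codabar start/stop characters."""
    if start not in CODABAR_START_STOP or stop not in CODABAR_START_STOP:
        raise ValueError("start and stop must each be one of A, B, C, D")
    if not payload:
        raise ValueError("payload must not be empty")
    bad = sorted(set(payload) - CODABAR_DATA_CHARS)
    if bad:
        raise ValueError(f"characters not encodable in Codabar: {bad}")
    return f"{start}{payload}{stop}"


# Hypothetical 14-digit item number for a temporary loan label.
print(frame_codabar("31234000123456"))
```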

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: Ben Ostrowsky

Dependency: Apache, zlib, libpng, gd, Perl (with CGI), and GD::Barcode

Supported Platforms: Linux

License: GNU General Public License

Availability: http://bernie.tblc.org/~ostrowb/barcodes.html

Further Information: Project Home Page:

http://bernie.tblc.org/~ostrowb/barcodes.html

Chapter 5

Software Tools for Value Added Services

“Any tool should be useful in the expected way, but a truly great tool lends itself to

uses you never expected” – Eric S. Raymond

• Library Portal

• User Services

• Subject Gateways

• Inter Library Loan (ILL)

1 Library Portal

The wide use of the Internet by users has made it imperative for libraries to

have a presence there. According to Morgan (2003), a library website can carry

three types of content:

1. Information about the library: staff directories, departmental descriptions,

maps of the building, hours, etc.

2. Electronic versions of traditional library services: online tutorials, book

renewals, interlibrary loan requests and status reports, requests for purchase,

online chat/reference, virtual tours of the building(s), etc.

3. Access to library content: catalogs, indexes, full-text magazines and

journals, digitized special collections, free and commercial ebooks,

government documents, freely accessible Internet resources, electronic

encyclopedias and dictionaries, licensed content from vendors, etc.

Simple websites are fairly easy to maintain with a little knowledge of HTML editors.

But as the website grows, one needs better searching and

browsing interfaces. One must follow usability guidelines in creating and

maintaining websites so that users do not get lost while navigating the site.

1.1 MyLibrary

Description: MyLibrary is a user-driven, customizable interface to collections of

Internet resources -- a portal. Primarily designed for libraries, the system's purpose

is to reduce information overload by allowing patrons to select as little or as much

information as they so desire for their personal pages.

Special Features: Some of the important features are:

• Web-based administration to add, delete, modify user access

• Web-based report generation

• Current awareness service based on cron job

• Search engine support based on Swish-E

History: Unknown

Project Sponsors/Administrators: Eric Lease Morgan

Dependency: Apache, Perl, MySQL/PostgreSQL

Supported Platforms: UNIX, Linux

License: GNU General Public License

Availability: http://dewey.library.nd.edu/mylibrary/download/

Further Information: Project Home Page: http://dewey.library.nd.edu/mylibrary/

1.2 The Scout Portal Toolkit

Description: The Scout Portal Toolkit (SPT) allows groups or organizations that

have a collection of knowledge or resources they want to share via the World Wide

Web to put that collection online without making a big investment in technical

resources or expertise.

Special Features: The portal interface has a number of useful features including

• Cross-Field Searching

• Resource Annotations by Users

• Intelligent User Agents

• Resource Quality Ratings by Users

• Suggested Resource Referrals (Recommender System)

Go to http://scout.wisc.edu/research/SPT/features.html to get a detailed description

of the above features.

The Scout Portal Toolkit also provides the Intelligent Metadata Tool (IMT). The

IMT is a web-based tool for the entry and editing of resource information.

Although only accessible to portal site administrators and designated users, the

IMT is an integrated part of the portal site, providing ready access to portal

facilities and information collected by the portal for discipline experts while they

are working on resource entries.

History: Unknown

Project Sponsors/Administrators: Internet Scout Project

Dependency: Apache, PHP, MySQL

Supported Platforms: Platform Independent (but the installer works only in a shell

environment)

License: GNU GPL

Availability: http://scout.wisc.edu/research/SPT/download.html

Further Information: Project Home Page: http://scout.wisc.edu/research/SPT/

1.3 Research Guide

Description: Research Guide is a web-based system for managing subject guides in

academic libraries.

Special Features: Some of the features are:

• Support for creating specialist pages with contact information and other

background information on subject specialists in the library

• Web-based interface for creating and editing guides and specialist pages

• Database back-end

History: This application was written for use at the University of Michigan

Graduate Library. It is currently being used to serve research guides there

(http://www.lib.umich.edu/grad/guide/).

Project Sponsors/Administrators: Kelsey Libner

Dependency: Apache, MySQL, PHP

Supported Platforms: UNIX, Linux, Windows

License: MIT License

Availability: http://sourceforge.net/project/showfiles.php?group_id=63006

Further Information: Project Home Page: http://researchguide.sourceforge.net/

1.4 PostNuke Content Management System

Description: PostNuke is a popular open source content management system. It is

easy to install, easy to understand and use, and easy to administer.

Special Features: It is full of features including:

• Complete web-based administration

• Support for additional modules with PostNuke API

• Strong community support

History: Based on PHP-Nuke (http://www.phpnuke.org/).

Project Sponsors/Administrators: PostNuke Development Team

Dependency: Apache, MySQL, PHP, ADOdb

Supported Platforms: Platform Independent

License: GNU GPL

Availability: http://download.postnuke.com/

Further Information: Project Home Page: http://www.postnuke.com/

1.5 Cascade

Description: Cascade is a Perl-driven, web-based content management system. It is

based on a community model of managing a large directory resource. Cascade

allows one to easily maintain a web-based Yahoo-like directory of resources using

web-based forms.

Special Features: Some of the features are:

• Supports Related Categories and Virtual Subcategories (the entries shown in

the Yahoo directory with an @ next to them)

• Designed to integrate with static content on your website

• Supports basic ratings of content

(Go to http://summersault.com/software/cascade/#features for detailed features)

History: Unknown

Project Sponsors/Administrators: Mark Stosberg

Dependency: Apache, Perl, RDBMS (MySQL/PostgreSQL)

Supported Platforms: Unix, Linux

License: GNU General Public License

Availability: http://sourceforge.net/project/showfiles.php?group_id=6582

Further Information: Project Home Page:

http://summersault.com/software/cascade/

2 User Services

User services such as reference, circulation, and document delivery are

crucial since they are the face of the library. Automating these functions not only

reduces the burden on librarians but also improves the image of the library

among its users.

2.1 Prospero

Description: An Open Source Internet Document Delivery (IDD) System.

Special Features: Prospero can be easily integrated with an ILL implementation

package.

History: Prospero was inspired by the Yale Library Electronic Document Delivery

(EDD) service authored by Daniel Chudnov from the Yale Medical Library. The

EDD Project (http://oss4lib.org/projects/edd.php3) is no longer supported.

Project Sponsors/Administrators: Eric Hamrick, Eric Schnell

Dependency: Perl, COMCTL32.DLL (for Windows), SAMBA (for Linux)

Supported Platforms: Staff Module (Windows), Server-side (Windows, Linux)

License: GNU GPL

Availability: http://bones.med.ohio-state.edu/prospero/current.html

Further Information: Project Home Page:

http://bones.med.ohio-state.edu/prospero/

2.2 Ask a Librarian (ASKAL)

Description: Ask a Librarian (ASKAL) is a self-managing email-based reference

service suite for libraries.

Special Features: Includes an administrative interface.

History: Originally developed for use in the University Library, University of

Nebraska (USA).

Project Sponsors/Administrators: Karen K. Hein, Marc W. Davis

Dependency: Apache, Mail Server (e.g., Sendmail), PHP, MySQL

Supported Platforms: Linux, UNIX, Windows

License: GNU GPL

Availability: http://apocalypse.unomaha.edu/ask/

Further Information: Project Home Page: http://apocalypse.unomaha.edu/ask/

2.3 Reference Desk Manager (RDM)

Description: The Reference Desk Manager (RDM) is a PHP-based web

application designed specifically to meet the needs of reference services in

libraries.

Special Features: Current RDM features are:

• Email weblog -- with search feature

• Electronic Card File -- with search feature

• Common Links Area

• Web-based Administration

History: The RDM was initially developed at Oregon State University (USA) for

use by their Reference Services staff.

Project Sponsors/Administrators: Terry Reese, Carrie Ottow, John Matylonek,

and Joe Toth.

Dependency: Apache, Sendmail, PHP, MySQL

Supported Platforms: Linux, UNIX, Windows (not tested)

License: Oregon State University Copyright (with source code). Free for

non-commercial and educational use.

Availability: http://oregonstate.edu/~reeset/RDM/downloads.html

Further Information: Project Home Page: http://oregonstate.edu/~reeset/RDM/

2.4 Morris Messenger

Description: Morris Messenger is a web-based messenger system which can be

used as an effective reference tool by the libraries.

Special Features: Unknown

History: Originally developed for use in the Morris Library, Southern Illinois

University Carbondale (USA).

Project Sponsors/Administrators: Keith VanCleave, Jody Fagan

Dependency: Apache, Perl, MySQL

Supported Platforms: Linux, UNIX

License: GNU GPL

Availability: http://www.lib.siu.edu/chat/#software

Further Information: Project Home Page: http://www.lib.siu.edu/chat/

3 Subject Gateways

Subject gateways, as the name suggests, typically focus on a particular subject area.

They are online services and sites that provide catalogues of the Internet-based

resources available in a specific field of study. Libraries have an important role

in building subject gateways in the areas in which they specialize.

Building such services demanded a high level of technical expertise in the

past, but the availability of good-quality, freely available OSS tools has removed

that barrier. Most of these tools comply with well-accepted metadata standards such as

Dublin Core and MARC.
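
To give a sense of what a simple Dublin Core description of a gateway resource looks like, the following Python sketch serialises a handful of fields as XML in the Dublin Core element namespace. The wrapper element name, the function name and all field values are assumptions made for illustration; none of the tools listed below is claimed to store records in exactly this shape.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: Dict[str, List[str]]) -> str:
    """Serialise simple Dublin Core fields as an XML fragment."""
    root = ET.Element("record")  # wrapper element name is an assumption
    for element, values in fields.items():
        for value in values:  # Dublin Core elements are repeatable
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample entry.
record = dublin_core_record({
    "title": ["Example Gateway Entry"],
    "creator": ["Example Author"],
    "subject": ["Library automation", "Open source software"],
    "identifier": ["http://www.example.org/resource"],
})
print(record)
```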

3.1 ROADS

Description: ROADS (Resource Organization And Discovery in Subject-based

Services) is a set of software tools to enable the set up and maintenance of Web

based subject gateways.

Special Features: ROADS is a software toolkit that lets gateway managers

pick and choose the parts of the software they require while allowing the

integration of other software as required. ROADS includes advanced

features for linking distributed cooperative databases together using the IETF's

WHOIS++ search and retrieval protocol, and their Common Indexing Protocol

(CIP).

History: ROADS was originally developed as part of the UK Electronic Libraries

Programme (eLib) by a consortium including the Institute for Learning and

Research Technology at the University of Bristol, and the UK Office for Library

and Information Networking at the University of Bath, with the bulk of the

development being done by the Department of Computer Science at Loughborough

University. Although this project itself has finished, the software continues to be

developed and used all over the world.

Project Sponsors/Administrators: The ROADS project has three partners:

• The Department of Computer Science at Loughborough University of

Technology

• The ILRT (Institute for Learning and Research Technology) at Bristol

University

• UKOLN (the UK Office for Library and Information Networking) at the

University of Bath

Dependency: Apache, Perl

Supported Platforms: POSIX (UNIX, Linux)

License: Artistic License, GNU GPL

Availability: http://sourceforge.net/project/showfiles.php?group_id=6936

Further Information: Project Home Page: http://roads.sourceforge.net/

3.2 iVia

Description: iVia is an open source Internet subject portal or virtual library system.

As a hybrid expert and machine built collection creation and management system,

it supports a primary, expert-created, first-tier collection that is augmented by a

large, second-tier collection of significant Internet resources that are automatically

gathered and described.

Special Features: Some of the major features of the iVia system include:

• A core system that is fast, robust, reliable and scalable to millions of records

and users.

• An array of Web crawlers capable of fully- to semi-automating the

identification of significant Internet resources.

• Classifiers that enable semi-automated metadata content creation providing

expert/machine interaction throughout the record building process.

• Search/browse interface options that provide users with great flexibility in

finding resources and which support all levels of user search skills.

• Support for single or multiple subject virtual library projects which can

share data and efforts on any of several levels of cooperation.

• Support for the following standards: OAI Protocol for Metadata Harvesting

(OAI-PMH), Dublin Core, MARC (Machine-Readable Cataloging), Library

of Congress Subject Headings (LCSH), and Library of Congress

Classifications (LCC).

History: The iVia system is an INFOMINE creation generously funded by the

National Leadership Grant Program of the U.S. Institute of Museum and Library

Services, the Fund for the Improvement of Post-Secondary Education of the U.S.

Department of Education and the Library of the University of California, Riverside.

Project Sponsors/Administrators: INFOMINE, The Regents of the University of

California

Dependency: Apache, MySQL, Berkeley DB

Supported Platforms: Linux

License: Affero General Public License (http://www.affero.org/oagpl.html)

Availability: http://infomine.ucr.edu/iVia/ivia.php?section=2

Further Information:

1. Project Home Page: http://infomine.ucr.edu/iVia/

2. iVia Open Source Virtual Library System:

http://www.dlib.org/dlib/january03/mitchell/01mitchell.html

3.3 IMesh Toolkit

Description: The IMesh Toolkit is a coherent set of tools and standards being

developed for use by subject gateway software developers and technically savvy

subject gateway implementers. These tools and standards will make use of

established open protocols and interfaces wherever possible to ensure

interoperability. The toolkit will include reference implementations for all

standards.

Special Features: It has many components such as metadata exchange tools, RDF

query tools, OAI normalization tools, Reading Lists, etc.

History: The IMesh Toolkit Project is a joint effort by groups funded by JISC and

the NSF to develop the IMesh Toolkit. The major participants in this effort include

the UK Office for Library and Information Networking (UKOLN) at the

University of Bath in the UK, the Institute for Learning and Research Technology

(ILRT) at the University of Bristol in the UK, and the Internet Scout Project (ISP)

at the University of Wisconsin - Madison in the United States.

Project Sponsors/Administrators: UKOLN, ILRT, Internet Scout Project.

The IMesh Toolkit project was funded under the NSF/JISC International Digital

Libraries Initiative from September 1999 to July 2003.

Dependency: Perl

Supported Platforms: Unknown

License: GNU GPL

Availability: http://clark.cs.wisc.edu/cgi-bin/cvsweb.cgi

Further Information:

1. Project Home Page: http://www.imesh.org/toolkit/work/components/ME/

2. Internet Scout Portal Project: http://scout.wisc.edu/research/imeshtk/

4 Inter Library Loan

Inter Library Loan (ILL) is the most visible form of resource sharing among

libraries. The ILL protocol (ISO 10160:1997), developed by the National Library

of Canada, seeks to automate this process. The edition cited here dates from

1997. Wide implementation of this protocol would considerably reduce the turnaround

time of ILL requests.

4.1 ILL Wizard

Description: ILL Wizard is an ISO-compliant ILL web form for handling ILL requests.

It can run from a desktop or from the library's Web site server directory.

Special Features: Non-programmer technical librarians should be able to

configure and mount this Java Web form without help from computer experts.

History: Originally developed for use in the Benner Library and Resource Center,

Olivet Nazarene University (Illinois, USA).

Project Sponsors/Administrators: Bryan Wilhelm, Craighton Hippenhammer

Dependency: Java, Web Server/Web Browser

Supported Platforms: Linux, UNIX, Windows

License: Unknown

Availability: http://library.olivet.edu/iso-ill.html

Further Information: Project Home Page: http://library.olivet.edu/iso-ill.html

4.2 Biblio::ILL::ISO

Description: Biblio::ILL::ISO is an ISO-protocol-based Interlibrary Loan (ISO

10161) module for the Perl programming language.

Special Features: The module implements the 20 Interlibrary Loan message

classes (ILL-Request, Answer, etc), plus the hundred or so types that make up

those classes. There is a test suite. There are a handful of test/example programs.

History: The author had earlier written Biblio::ILL::GS, which was an Interlibrary

Loan Generic Script.

Project Sponsors/Administrators: David Christensen

Dependency: Perl

Supported Platforms: Linux, UNIX

License: Perl Artistic License

Availability: Currently available at http://maplin.gov.mb.ca/pub/TEST/, Check

http://search.cpan.org/author/DCHRIS/ in future

Further Information: Project Home Page: http://www.lib.siu.edu/chat/

Chapter 6

Software Tools for Digital Library Initiatives

“Future is digital” – Famous Advertisement Campaign

• Digital Library Toolkit

• DL-like Softwares

• OAI-PMH Tools

1 Digital Library Toolkit

The term "Digital Library" has a variety of potential meanings, ranging from a

digitized collection of material that one might find in a traditional library through to

the collection of all digital information. However, it is not merely equivalent to a

digitized collection with information management tools. It is also a series of

activities that brings together collections, services, and people in support of the full

life cycle of creation, dissemination, use, and preservation of data, information, and

knowledge.

The creation and maintenance of digital libraries is imperative with the growing

amount of information available in digital format. Building digital libraries

needs a fair amount of knowledge of information management tools such as

databases, web technology, information retrieval, user interface, etc. The usability

of hosted resources is as important as the quality of information presented.

The digital library toolkits discussed below are fairly integrated sets of solutions for

building digital libraries from born-digital resources. However, converting existing

hard copy documents into digital format requires a few more tools, such as a scanner,

optical character recognition (OCR) software, word processing software, and image

editing tools.

1.1 Greenstone

Description: Greenstone is a suite of software for building and distributing digital

library collections. It provides a new way of organizing information and publishing

it on the Internet or on CD-ROM.

Special Features: Some of the important features are:

• Support for image, video, and text collection

• Support for multilingual collection building

• Z39.50 client available on Linux systems

• Highly portable collection, can easily be distributed even on a CD-ROM

History: Greenstone is produced by the New Zealand Digital Library Project at the

University of Waikato, and developed and distributed in cooperation with

UNESCO and the Human Info NGO.

Project Sponsors/Administrators: University of Waikato, New Zealand

Dependency: Apache, Perl, GDBM

Supported Platforms: UNIX, Windows, Linux, MacOS X

License: GNU GPL

Availability: http://www.greenstone.org/english/download.html

Further Information: Project Home Page: http://www.greenstone.org/

1.2 DSpace

Description: DSpace is a specialized type of digital asset management or content

management system: it manages and distributes digital items, made up of digital

files and allows for the creation, indexing, and searching of associated metadata to

locate and retrieve the items. It is designed to support the long-term preservation of

the digital material stored in the repository.

Special Features: The important features of DSpace are:

• Institutional Repository: DSpace is organized to accommodate the

multidisciplinary and organizational needs of a large institution.

• Document Formats: Support for a variety of digital formats and content

types, including text, images, audio, and video.

• Access Control: DSpace allows contributors to limit access to items in

DSpace - at the collection and the individual item level.

• Digital Preservation: DSpace provides long-term physical storage and

management of digital items in a secure, professionally managed repository

including standard operating procedures such as backup, mirroring,

refreshing media, and disaster recovery.

• Search and Retrieval: The DSpace submission process allows for the

description of each item using a qualified version of the Dublin Core

metadata schema.

(Go to http://dspace.org/technology/features.html for more detailed description on

features.)
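
Qualified Dublin Core refines a base element with a qualifier, and a qualified field can always be read as its unqualified parent (the "dumb-down" principle). The Python sketch below illustrates this with dotted field names of the form dc.element.qualifier. The dotted notation, the helper names and the sample item are assumptions for illustration and are not taken from the DSpace code.

```python
from typing import Dict, List, NamedTuple, Optional


class DCField(NamedTuple):
    element: str
    qualifier: Optional[str]


def parse_field(name: str) -> DCField:
    """Split 'dc.element[.qualifier]' into its parts (assumed notation)."""
    parts = name.split(".")
    if parts[0] != "dc" or len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"not a Dublin Core field name: {name!r}")
    return DCField(parts[1], parts[2] if len(parts) == 3 else None)


def dumb_down(item: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Collapse qualified fields onto their unqualified base elements."""
    simple: Dict[str, List[str]] = {}
    for name, values in item.items():
        simple.setdefault(parse_field(name).element, []).extend(values)
    return simple


# Made-up sample item.
item = {
    "dc.title": ["An Example Report"],
    "dc.contributor.author": ["Author, Example"],
    "dc.contributor.advisor": ["Advisor, Example"],
    "dc.date.issued": ["2003"],
}
print(dumb_down(item))
```

A collapsed record of this kind is what a harvester that understands only simple Dublin Core would be able to use.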

History: DSpace was developed out of collaboration between MIT Libraries and

Hewlett-Packard Company.

Project Sponsors/Administrators: MIT Libraries & Hewlett-Packard Company

Dependency: Apache, Tomcat, PostgreSQL, Java

Supported Platforms: Claimed to be platform independent, but the installation

manual covers only UNIX-like platforms.

License: BSD License

Availability: http://dspace.org/technology/download.html,

http://sourceforge.net/project/showfiles.php?group_id=19984

Further Information: Project Home Page: http://www.dspace.org/

1.3 iVia

Description: iVia is an open source Internet subject portal or virtual library system.

As a hybrid expert and machine built collection creation and management system,

it supports a primary, expert-created, first-tier collection that is augmented by a

large, second-tier collection of significant Internet resources that are automatically

gathered and described.

Special Features: Some of the major features of the iVia system include:

• A core system that is fast, robust, reliable and scalable to millions of records

and users.

• An array of Web crawlers capable of fully- to semi-automating the

identification of significant Internet resources.

• Classifiers that enable semi-automated metadata content creation providing

expert/machine interaction throughout the record building process.

• Search/browse interface options that provide users with great flexibility in

finding resources and which support all levels of user search skills.

• Support for single or multiple subject virtual library projects which can

share data and efforts on any of several levels of cooperation.

• Support for the following standards: OAI Protocol for Metadata Harvesting

(OAI-PMH), Dublin Core, MARC (Machine-Readable Cataloging), Library

of Congress Subject Headings (LCSH), and Library of Congress

Classifications (LCC).

History: The iVia system is an INFOMINE creation generously funded by the

National Leadership Grant Program of the U.S. Institute of Museum and Library

Services, the Fund for the Improvement of Post-Secondary Education of the U.S.

Department of Education and the Library of the University of California, Riverside.

Project Sponsors/Administrators: INFOMINE, The Regents of the University of

California

Dependency: Apache, MySQL, Berkeley DB

Supported Platforms: Linux

License: Affero General Public License (http://www.affero.org/oagpl.html)

Availability: http://infomine.ucr.edu/iVia/ivia.php?section=2

Further Information:

1. Project Home Page: http://infomine.ucr.edu/iVia/

2. iVia Open Source Virtual Library System:

http://www.dlib.org/dlib/january03/mitchell/01mitchell.html

1.4 Dienst

Description: The distributed Dienst software is configured to handle textual

resources (documents) in a variety of formats. However, the Dienst architecture

includes a sophisticated document model that accommodates a wide variety of

digital resources. Using the Dienst software for these other resources will require

some programming.

Special Features: Unknown

History: Dienst is a project of the CDLRG - Cornell Digital Library Research

Group. Work on Dienst was sponsored by the Defense Advanced Research Projects

Agency (DARPA) on behalf of the Digital Libraries Initiative. Additional work on

Dienst is sponsored by the National Science Foundation Digital Libraries Initiative

Phase 2 Project Prism.

Project Sponsors/Administrators: Cornell University, USA

Dependency: Apache, Perl, mod_perl, ImageMagick, PerlMagick, freeWAIS-sf

Supported Platforms: UNIX, Linux, MacOS X, Windows (not tested)

License: Unknown

Availability: Currently not available

Further Information:

Project Home Page:

http://www.cs.cornell.edu/cdlrg/dienst/software/DienstSoftware.htm

1.5 Fedora

Description: Flexible Extensible Digital Object and Repository Architecture

(Fedora) is a toolkit to build a digital object repository management system. The

system, designed to be a foundation upon which interoperable web-based digital

libraries, institutional repositories and other information management systems can

be built, demonstrates how distributed digital library architecture can be deployed

using web-based technologies, including XML and Web services.

The interface to the system consists of three open APIs that are exposed as web

services:

• Management API (API-M) – defines an interface for administering the

repository. It includes operations necessary for clients to create and

maintain digital objects and their components. API-M is implemented as a

SOAP-enabled web service.

• Access API (API-A) – defines an interface for accessing digital objects

stored in the repository. It includes operations necessary for clients to

perform disseminations on objects in the repository and to discover

information about an object using object reflection. API-A is implemented

as a SOAP-enabled web service.

• Access-Lite API (API-A-Lite) – defines a streamlined version of the

Fedora Access Service that is implemented as an HTTP-enabled web

service.

Special Features: The major features are:

• Web Services: The interface to the Fedora repository system consists of

three open APIs that are exposed as web services: Management API known

as API-M, Access API known as API-A, and Access-Lite API known as

API-A-Lite.

• Datastreams: Objects in a repository may consist of content and metadata

(datastreams) that physically reside inside the repository or outside the

repository. The Fedora repository system supports content of any MIME

type.

• XML Submission and Storage: Digital objects are stored as XML-

encoded files that conform to an extension of the Metadata Encoding and

Transmission Standard (METS) schema. The schema for the extended

version of METS used by Fedora can be found at

http://www.fedora.info/definitions/1/0/mets-fedora-ext.xsd.

• OAI Metadata Harvesting Provider: The Fedora metadata is accessible

using the OAI Protocol for Metadata Harvesting, v2.0.

• Parameterized Behaviors: Behaviors defined for an object support user-

supplied options that are handled at dissemination time.

• Versioning: Although not fully implemented in release 1.1, the Fedora

repository system includes the infrastructure to support versioning of digital

objects and their components.

• Access Control and Authentication: Release 1.1 includes a simple form of

access control to provide access restrictions based on IP address. IP range

restriction is supported in both the Management and Access APIs.

(Go to http://www.fedora.info/ for complete feature set)
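
The IP-based access restriction described in the feature list can be pictured with a short sketch. The Python fragment below is purely illustrative and is not Fedora code (Fedora itself depends on Java); the function name `is_allowed` and the address ranges are assumptions invented for this example.

```python
import ipaddress

# Hypothetical allow-list; these ranges are examples, not Fedora defaults.
ALLOWED_RANGES = [
    ipaddress.ip_network("192.168.1.0/24"),
    ipaddress.ip_network("10.0.0.0/8"),
]

def is_allowed(client_ip, allowed=ALLOWED_RANGES):
    """Return True if client_ip falls inside any allowed network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in allowed)

print(is_allowed("192.168.1.42"))  # inside the first range
print(is_allowed("203.0.113.7"))   # outside both ranges
```

A repository would apply such a check to each incoming request before dispatching it to the Management or Access service.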

History: Jointly developed by the University of Virginia and Cornell University,

the Fedora project was funded by the Andrew W. Mellon Foundation.

Project Sponsors/Administrators: University of Virginia and Cornell University.

Technical Coordinator: Ronda A. Grizzle (Virginia)

Dependency: Java SDK, MySQL/Oracle (optional), JDBC

Supported Platforms: Platform Independent (with Java)

License: Mozilla Public License

Availability: http://www.fedora.info/release/

Further Information: Project Home Page: http://www.fedora.info/

1.6 DjVuLibre

Description: DjVu (pronounced "deja vu") is a compression technique, a file

format, and a delivery platform that is specifically designed to enable the creation

of digital libraries of printed material, either scanned from paper or digitally

produced. For scanned documents, DjVu file sizes are typically 3 to 10 times

smaller than TIFF or PDF in black and white, and 5 to 10 times smaller than JPEG

in color.

DjVu documents are displayed within web browsers through a very lightweight

plug-in (available for all major platforms). Server-side full-text search can easily

be provided using free indexing tools and a few Perl scripts.

Special Features: Unknown

History: The DjVu project was started by Yann LeCun at AT&T Labs-Research in

1996. Much of the research and innovation behind DjVu was the work of Léon

Bottou, Yann LeCun, Patrick Haffner, Paul Howard, and Yoshua Bengio, with

some contributions from Pascal Vincent, Patrice Simard, and Steven Pigeon.

DjVuLibre is a GPL implementation of DjVu maintained by the original inventors

of DjVu. See http://djvu.sourceforge.net/credits.html for the historical

details of the project.

Project Sponsors/Administrators: Yann LeCun, Léon Bottou

Dependency: Unknown

Supported Platforms: Linux

License: GNU GPL (version 2)

Availability: http://sourceforge.net/project/showfiles.php?group_id=32953

Further Information: Project Home Page: http://djvu.sourceforge.net/

2 DL-like Software

Archiving of digital documents can be seen as an extension of digital libraries.

Digital archiving software can also be used to build useful services.

2.1 E-prints

Description: The primary purpose of the E-Prints software is to help create open

access to the peer-reviewed research output of all scholarly and scientific research

institutions. The default configuration creates a research papers archive, but could

be used for other purposes.

Special Features: Unknown

History: E-Prints was part of the Open Citation Project, a DLI2 International

Digital Libraries Project funded by the Joint Information Systems Committee

(JISC) of the Higher Education Funding Councils, in collaboration with the

National Science Foundation. E-Prints was previously supported by CogPrints,

funded by JISC as part of its Electronic Libraries (eLib) Programme.

Project Sponsors/Administrators: University of Southampton, UK

Dependency: Apache, Perl, mod_perl, MySQL

Supported Platforms: Linux, UNIX

License: GNU GPL

Availability: http://software.eprints.org/

Further Information: Project Home Page: http://www.eprints.org/

2.2 CDSWare

Description: CERN Document Server Software (CDSware) allows one to run

one's own electronic preprint server, online library catalogue or a document system

on the web. It complies with the Open Archives Initiative metadata harvesting

protocol (OAI-PMH) and uses MARC 21 as its underlying bibliographic standard.

Special Features: Some of the salient features are:

• Configurable portal-like interfaces for hosting various kinds of collections.

• Powerful search engine with Google-like syntax.

• User personalization, including document baskets and email notification

alerts.

• Electronic submission and upload of various types of documents.

• Running an OAI data and service provider enabling the metadata exchange

between heterogeneous repositories.

History: Developed for use at the CERN Library, Europe.

Project Sponsors/Administrators: CERN

Dependency: Apache, MySQL, PHP, Python, WML

Supported Platforms: Linux, UNIX

License: GNU GPL

Availability: http://cdsware.cern.ch/download/

Further Information: Project Home Page: http://cdsware.cern.ch/

2.3 Harvest

Description: Harvest is a system to collect information and make it searchable

using a web interface. Harvest can collect information from the Internet and intranets using

HTTP, FTP, and NNTP, as well as from local files on hard disks, CD-ROMs, and file servers.

The supported formats, in addition to HTML, include TeX, DVI, PS, full

text, mail, man pages, news, troff, WordPerfect, RTF, Microsoft Word/Excel,

SGML, C sources and many more.

Possible uses of Harvest include:

• Web Search Engine

• Specialized Search System

• Building a Distributed Search System

• Testbed for Search related Components

Special Features: Some of the features are:

• Harvest is designed to work as a distributed system.

• Harvest is designed to be modular.

• Harvest allows complete control over the content of data in the search

database.

• The Search interface is written in Perl to make customization easy, if

desired.

History: Unknown

Project Sponsors/Administrators: Kang-Jin Lee, Javier Masa Marin, Harald

Weinreich

Dependency: Apache, Perl, GDBM, Bison, Flex, and GCC (for compiling from

source)

Supported Platforms: Linux, UNIX, Windows (under cygwin)

License: GNU GPL

Availability: http://sourceforge.net/project/showfiles.php?group_id=27808

Further Information: Project Home Page: http://harvest.sourceforge.net/

3 OAI-PMH Tools

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a

means of making machine-readable metadata widely available for use. In other

words, it is a means of sharing metadata between digital archives and repositories.

The Open Archives Initiative was originally proposed to achieve federated

searching of e-print/pre-print archives. Gradually, however, the scope of the

initiative has broadened to cover any kind of digital content including images and

videos.

The protocol provides a simple yet powerful framework for metadata harvesting.

This metadata harvesting method can be used to build high-quality federated search

systems across collections in a very short span of time. The protocol stipulates that

all the metadata should be encoded in XML. The minimum common denominator

is the Unqualified Dublin Core (UDC), so that more digital collections can

implement this protocol without much hassle. This works because virtually any other

metadata schema can be downgraded to conform to UDC.
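
To make the exchange concrete, the following Python sketch builds a ListRecords request and reads the Dublin Core titles out of a response. It is a minimal illustration only: the base URL and the sample response are invented for this example, the helper names `list_records_url` and `extract_titles` are assumptions, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core elements carried inside oai_dc records.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query

def extract_titles(response_xml):
    """Return every dc:title found in an OAI-PMH response document."""
    root = ET.fromstring(response_xml)
    return [title.text for title in root.iterfind(".//dc:title", NS)]

# Hand-written sample response, trimmed to the parts used here.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Open Source Software for Libraries</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://archive.example.org/oai"))
print(extract_titles(SAMPLE))
```

A real harvester would fetch the URL over HTTP and repeat the request for as long as the repository returns a resumption token; both steps are omitted here to keep the sketch self-contained.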

3.1 OAICat

Description: OAICat is a Java Servlet web application providing an OAI-PMH

v2.0 repository framework. The framework can be customized to work with

arbitrary data repositories by implementing some Java interfaces.

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: Jeffrey Young, OCLC

Dependency: Java Servlet Engine, RDBMS (tested with MySQL)

Supported Platforms: Platform Independent

License: OCLC Research Public License

(http://purl.oclc.org/oclc/research/ORPL/)

Availability: http://pubserv.oclc.org/oaicat/jars/dist/dist.html

Further Information:

Project Home Page: http://www.oclc.org/research/software/oai/cat.shtm

3.2 PHP OAI Data Provider

Description: As the name suggests, it is an implementation of an OAI-PMH

(version 2) Data Provider.

Special Features: This implementation currently supports

• Full OAI-PMH version 2.0 compliance

• Compressed XML support, which greatly reduces bandwidth use

• Can connect to many existing databases by using the PEAR abstraction layer

• Quite easy to configure
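
The bandwidth saving from compressed XML can be checked with a small experiment. The Python sketch below assumes gzip as the compression scheme and uses an invented, highly repetitive payload; it does not reproduce how the PHP implementation itself performs compression.

```python
import gzip

# Illustrative payload: many records share the same element names,
# which is why XML responses tend to compress well.
record = "<record><title>Sample title</title><creator>Sample author</creator></record>"
payload = ("<records>" + record * 200 + "</records>").encode("utf-8")

compressed = gzip.compress(payload)

print(len(payload), "bytes uncompressed")
print(len(compressed), "bytes gzip-compressed")

# The round trip is lossless: decompressing gives back the original bytes.
assert gzip.decompress(compressed) == payload
```

Real metadata is less repetitive than this sample, so the saving in practice will be smaller than the one printed here.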

History: Developed at the University of Oldenburg, Germany.

Project Sponsors/Administrators: Heinrich Stamerjohanns

Dependency: Apache, PHP, RDBMS (Oracle8/MySQL)

Supported Platforms: Linux, UNIX, Windows

License: Unknown

Availability: http://physnet.uni-oldenburg.de/oai/

Further Information: Project Home Page: http://physnet.uni-oldenburg.de/oai/

3.3 VTOAI OAI-PMH2 Perl Implementation

Description: This toolkit implements the skeleton of the OAI-PMH v2.0 in an

object-oriented fashion, thus hiding the details of the protocol from code that is

derived from the predefined class.

Special Features: Some of the features are:

• Strict compliance with OAI-PMH v2.0

• One installation can easily be used for multiple archives

• All extensions, configurations, and containers are specified using XML

Schema

• Minimal changes are required to create a working implementation

History: Developed at the Digital Library Research Laboratory (DLRL) of

Virginia Tech University, USA.

Project Sponsors/Administrators: Hussein Suleman, Virginia Tech

Dependency: Apache, Perl

Supported Platforms: Linux, UNIX, Windows

License: Perl Artistic License

Availability: http://www.dlib.vt.edu/projects/OAI/software/vtoai/vtoai.html

Further Information:

Project Home Page: http://www.dlib.vt.edu/projects/OAI/software/vtoai/vtoai.html

3.4 ARC

Description: Arc is the first federated search service based on OAI-PMH.

It includes a harvester that can harvest OAI-PMH 1.x and OAI-PMH

2.0 compliant repositories, a basic database-backed search engine, and an

OAI-PMH layer over the harvested metadata. It was developed at Old Dominion University, USA.

Special Features: It includes a harvester, a search engine together with a simple

search interface, and an OAI-PMH layer over harvested metadata. Arc can be easily

configured for a specific community.

History: Developed at the Digital Library Research Group of Old Dominion

University (USA).

Project Sponsors/Administrators: Digital Library Research Group, ODU

Dependency: Java Servlet Engine, Tomcat, RDBMS (Oracle/MySQL)

Supported Platforms: Platform Independent

License: University of Illinois/NCSA Open Source License

Availability: http://sourceforge.net/project/showfiles.php?group_id=61532

Further Information:

1. Project Home Page: http://oaiarc.sourceforge.net/

2. ARC Demo Search: http://arc.cs.odu.edu/

3.5 OAIHarvester

Description: Developed by OCLC, the OAIHarvester Open Source project is a

Java application providing an OAI-PMH v2.0 harvester framework. This

framework can be customized to perform arbitrary operations on harvested data by

implementing some Java interfaces.

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: Jeffrey Young, OCLC

Dependency: Java, Apache Ant

Supported Platforms: Platform Independent

License: OCLC Research Public License

Availability: http://www.oclc.org/research/software/oai/harvester.shtm

Further Information:

Project Home Page: http://www.oclc.org/research/software/oai/harvester.shtm

3.6 OAI/ODL Harvester

Description: Harvests data from one or more archives. This is a template that does

nothing useful besides printing the records to STDOUT (screen). It is intended that

the Harvester class will be subclassed to perform more useful functions.
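
The subclassing pattern described here can be sketched as follows. The toolkit itself is written in Perl; this Python version only illustrates the idea, and the class and method names (`Harvester`, `process_record`, `TitleCollector`) are assumptions for the example, not the toolkit's actual interface.

```python
class Harvester:
    """Template harvester: walks records and hands each one to a hook."""

    def __init__(self, records):
        # In a real harvester the records would come from OAI-PMH responses;
        # here they are passed in directly to keep the sketch self-contained.
        self.records = records

    def process_record(self, record):
        # Default behaviour mirrors the template: just print the record.
        print(record)

    def harvest(self):
        for record in self.records:
            self.process_record(record)


class TitleCollector(Harvester):
    """Subclass that does something more useful: it collects titles."""

    def __init__(self, records):
        super().__init__(records)
        self.titles = []

    def process_record(self, record):
        self.titles.append(record["title"])


collector = TitleCollector([{"title": "First"}, {"title": "Second"}])
collector.harvest()
print(collector.titles)
```

Only the hook is overridden; the harvesting loop stays in the base class, which is what makes a single installation reusable for different purposes.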

Special Features: Some of the important features are:

• Works with any OAI (PMH v1.0/1.1/2.0) or ODL (XOAIPMH v1.0)

archive

• Code layout for separate components or libraries of components

• One installation can easily be used for harvesting from multiple sites for

different purposes

• All extensions, configurations, and containers are specified using XML

Schema

History: Unknown

Project Sponsors/Administrators: Digital Library Research Lab, Virginia Tech

(USA).

Dependency: Apache, Perl

Supported Platforms: Linux, UNIX, Windows

License: Perl Artistic License

Availability: http://oai.dlib.vt.edu/odl/software/harvest/

Further Information: Project Home Page:

http://oai.dlib.vt.edu/odl/software/harvest/

3.7 Net::OAI::Harvester

Description: Net::OAI::Harvester is a Perl extension for easily querying OAI-PMH

repositories. OAI-PMH allows data repositories to share metadata about

their digital assets. Net::OAI::Harvester is an OAI-PMH client, so it does for

OAI-PMH what LWP::UserAgent does for HTTP. At the moment this module supports

only Dublin Core (oai_dc) schema handling through XML::Handler.

Special Features: Some of the features are:

• It is able to handle memory-intensive requests such as listRecords and

listIdentifiers

• XML::SAX filters are used which will allow interested developers to write

their own metadata parsing packages, and drop them into place

• It has built in support for unqualified Dublin Core, and has a framework for

dropping in one’s own parser for other kinds of metadata
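
Event-based (SAX) parsing is what keeps memory use flat for large responses, because records are handled as they stream past instead of being loaded as one tree. The Python sketch below shows the same technique with the standard `xml.sax` module; it is an analogy to the Perl module's approach, not its code, and the handler name `TitleHandler` and the sample data are assumptions.

```python
import io
import xml.sax
from xml.sax.handler import feature_namespaces

DC_NS = "http://purl.org/dc/elements/1.1/"

class TitleHandler(xml.sax.ContentHandler):
    """Collects dc:title text as SAX events stream past."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # None means "not inside a dc:title element"

    def startElementNS(self, name, qname, attrs):
        if name == (DC_NS, "title"):
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElementNS(self, name, qname):
        if name == (DC_NS, "title") and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None

def stream_titles(xml_bytes):
    """Parse xml_bytes with a namespace-aware SAX parser and return titles."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = TitleHandler()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.titles

SAMPLE = b"""<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record><dc:title>First record</dc:title></record>
  <record><dc:title>Second record</dc:title></record>
</records>"""

print(stream_titles(SAMPLE))
```

Supporting another metadata format would mean writing a second handler class and passing it to the parser in place of this one.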

History: Unknown

Project Sponsors/Administrators: Edward Summers

Dependency: Perl, XML::SAX, LWP::UserAgent

Supported Platforms: Linux, UNIX, Windows

License: Perl Artistic License

Availability: http://search.cpan.org/author/ESUMMERS/OAI-Harvester-0.5/

Further Information:

Project Home Page: http://search.cpan.org/author/ESUMMERS/OAI-Harvester-0.5/

3.8 Rapid Visual OAI Tool (RVOT)

Description: Rapid Visual OAI Tool (RVOT) can be used to graphically construct

an OAI-PMH repository from a collection of files. The records in the original

collection can be in any one of the acceptable formats. RVOT helps to define the

mapping visually from a native format to oai_dc format, and once this is done the

tool can respond to OAI-PMH requests.
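
The mapping step can be thought of as a table from native field names to Dublin Core element names. The Python sketch below shows that idea in code; the field names and the function `to_dublin_core` are invented for illustration and say nothing about how RVOT stores its mappings internally.

```python
# Hypothetical mapping from a native record's field names to Dublin Core
# element names; a tool such as RVOT lets the user define this visually.
FIELD_MAP = {
    "book_title": "title",
    "author": "creator",
    "year": "date",
    "keywords": "subject",
}

def to_dublin_core(native_record, field_map=FIELD_MAP):
    """Map a native record (a dict) to unqualified Dublin Core elements.

    Fields with no entry in the mapping are dropped.
    """
    dc_record = {}
    for field, value in native_record.items():
        element = field_map.get(field)
        if element is not None:
            dc_record.setdefault(element, []).append(value)
    return dc_record

native = {"book_title": "Digital Libraries", "author": "A. Author",
          "year": "2003", "shelf_mark": "025.04"}
print(to_dublin_core(native))
```

Each Dublin Core element holds a list because the elements are repeatable; a record with several authors would produce several creator values.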

Special Features: The design of RVOT is such that it can be easily extended to

support other metadata formats.

History: Developed at the Digital Library Research Group of Old Dominion

University (USA).

Project Sponsors/Administrators: Sathish Kumar Kothamasa, M. Zubair

Dependency: Java SDK

Supported Platforms: Platform Independent (Linux, UNIX, Windows

2000/NT/XP)

License: University of Illinois/NCSA Open Source License

Availability: http://sourceforge.net/project/showfiles.php?group_id=66652

Further Information: Project Home Page: http://rvot.sourceforge.net/

3.9 OAI Repository Explorer

Description: This site presents an interface to interactively test archives for

compliance with the OAI Protocol for Metadata Harvesting.

Special Features: Some of the features are:

• Simple web-based interface

• Tests compliance with OAI-PMH version 1.0/1.1/2.0, with schema validation

• Useful for testing the OAI compliance of a Data Provider before making it public

History: Developed at the Digital Library Research Laboratory, Virginia Tech

University (USA).

Project Sponsors/Administrators: Hussein Suleman, Edward Fox

Dependency: Web browser with JavaScript support

Supported Platforms: Platform Independent

License: Unknown

Availability: http://www.purl.org/NET/oai_explorer

Further Information: Project Home Page: http://www.purl.org/NET/oai_explorer

Chapter 7

Miscellaneous Supporting Software Tools

“The good news: Computers allow us to work 100% faster. The bad news: They

generate 300% more work” – Unknown

• HTML Tools

• XML Tools

• Information Retrieval Tools

1 HTML Tools

HTML, developed by the World Wide Web Consortium (W3C: http://www.w3c.org/), is

the lingua franca for publishing documents on the World Wide Web. It is

a non-proprietary format based upon Standard Generalized Markup Language

(SGML), and can be created and processed by a wide range of tools, from simple

plain text editors to sophisticated WYSIWYG (What You See Is What You Get)

authoring tools.

Described below are some of the open source tools available for editing HTML in

a WYSIWYG way.

1.1 Amaya

Description: Amaya is a Web editor, i.e. a tool used to create and update

documents directly on the Web. Browsing features are seamlessly integrated with

the editing and remote access features in a uniform environment.

Special Features: Amaya started as an HTML + CSS style sheets editor. Since

then it has been extended to support XML and an increasing number of XML

applications such as the XHTML family, MathML, and SVG. It allows all those

vocabularies to be edited simultaneously in compound documents.

Amaya includes a collaborative annotation application based on Resource

Description Framework (RDF), XLink, and XPointer. The current release, Amaya

8.1a, supports HTML 4.01, XHTML 1.0, XHTML Basic, XHTML 1.1, HTTP 1.1,

MathML 2.0, many CSS 2 features, and SVG (transformation, transparency,

and SMIL animation on OpenGL platforms).

History: Work on Amaya started at W3C in 1996 to showcase Web technologies in

a fully featured Web client. The main motivation for developing Amaya was to

provide a framework that can integrate as many W3C technologies as possible. It is

used to demonstrate these technologies in action while taking advantage of their

combination in a single, consistent environment.

Project Sponsors/Administrators: World Wide Web Consortium (W3C)

Dependency: None

Supported Platforms: Linux, Windows, UNIX, Solaris

License: W3C Software License (GNU GPL Compatible)

Availability: http://www.w3.org/Amaya/User/BinDist.html

Further Information: Project Home Page: http://www.w3.org/Amaya/

1.2 Mozilla

Description: Mozilla has a decent web page editor built into the browser.

Though a low-end product, it is useful for developing small websites and for editing

a page in a hurry.

Special Features: Mozilla is a browser that includes a web page editor, an address

book, an IRC chat client and a powerful mail client with intelligent spam filtering.

History: Developed by Netscape Communications.

Project Sponsors/Administrators: Mozilla Foundation

(http://www.mozillafoundation.org/)

Dependency: glibc2.2.4 or better (for Linux)

Supported Platforms: Linux, Windows, MacOS X, UNIX

License: Mozilla Public License

Availability: http://www.mozilla.org/

Further Information: Project Home Page: http://www.mozilla.org/

1.3 Bluefish Editor

Description: Bluefish is a powerful editor for experienced web designers and

programmers. Bluefish supports many programming and markup languages, but it

focuses on editing dynamic and interactive websites.

Special Features: A What You See Is What You Need interface. Multiple

document interface, easily opens 500+ documents (tested 3500 documents

simultaneously). Customizable syntax highlighting based on Perl Compatible

regular expressions, with sub pattern support and default patterns for PHP, HTML,

C, Java, XML, Python, ColdFusion, Pascal, and R.

Complete feature set is available at: http://bluefish.openoffice.nl/features.html

History: Unknown

Project Sponsors/Administrators: Olivier Sessink

Dependency: gtk2, libpcre, libaspell (optional, for spell checking)

Supported Platforms: Linux, FreeBSD, MacOS-X, OpenBSD, Solaris and Tru64

License: GNU GPL

Availability: http://bluefish.openoffice.nl/download.html

Further Information: Project Home Page: http://bluefish.openoffice.nl/

1.4 Quanta Plus

Description: Quanta+ is a web development environment for HTML and associated

languages for the K Desktop Environment on Linux. Quanta is designed for quick

web development and is rapidly becoming a mature editor with a number of great

features.

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: Andras Mantia, Robert Nickel, Eric Laffoon

Dependency: KDE, Perl

Supported Platforms: Linux

License: GNU GPL

Availability: http://sourceforge.net/project/showfiles.php?group_id=4113

Further Information: Project Home Page: http://sourceforge.net/projects/quanta/

2 XML Editors

World Wide Web Consortium (W3C) says: “Extensible Markup Language (XML)

is a simple, very flexible text format derived from SGML (ISO 8879). Originally

designed to meet the challenges of large-scale electronic publishing, XML is also

playing an increasingly important role in the exchange of a wide variety of data on

the Web and elsewhere.”

XML has found wide use in the library community in describing metadata. A

number of XML schemas have been developed for various metadata standards like

Dublin Core, MARC, TEI, etc. Various digital library software packages, including

Greenstone, expect metadata only in XML format.

Creating well-formed or valid XML documents requires the help of XML editors.

We will look at a few of the WYSIWYG XML editors available as open source.
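
The difference between a well-formed document and a broken one is easy to demonstrate. The Python sketch below checks well-formedness only, using the standard library parser; the function name `is_well_formed` is an assumption for this example. Checking validity against a DTD or an XML Schema requires a validating parser, which is outside the scope of this sketch.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<record><title>Koha</title></record>"))  # properly nested
print(is_well_formed("<record><title>Koha</record></title>"))  # tags overlap
```

An XML editor performs this kind of check continuously while the user types, which is why hand-editing metadata in a plain text editor is more error-prone.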

2.1 Open eXeed

Description: Open eXeed is an open source development environment for XML. It

is used to edit, create, and validate XML and other related documents, such as

XHTML, XSLT.

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: [email protected]

Dependency: MSXML (version 4)

Supported Platforms: Windows

License: GNU GPL

Availability: http://sourceforge.jp/frs/index.php?group_id=58

Further Information: Project Home Page: http://openexeed.sourceforge.jp/

2.2 Xerlin

Description: The Xerlin Project is a Java™ based XML modeling application

written to make creating and editing XML files easier. It runs on any Java 2 virtual

machine (JDK1.2.2 or higher). The application is extensible via custom editor

interfaces that can be added for individual DTDs.

Special Features: It is extensible via a plugin interface and can also be launched as

an XML editor widget to be included in other Java applications. It also supports

XML libraries such that XML components can be shared between different files. It

has standard editor features such as undo, cut, copy and paste.

History: Unknown

Project Sponsors/Administrators: SpeedLegal (http://www.speedlegal.com/)

Dependency: Java 2 Platform Standard Edition

Supported Platforms: Platform Independent

License: Unknown, claimed to be Apache-style

(http://www.xerlin.org/LICENSE.txt)

Availability: http://www.xerlin.org/downloads.shtml

Further Information: Project Home Page: http://www.xerlin.org/

2.3 Bitflux Editor

Description: Bitflux Editor (acronym: BXE) is a browser-based (currently Mozilla

only) WYSIWYG XML editor which is written in JavaScript and uses XML,

XSLT, and CSS for rendering. It is usable with any XML document and features

tables, lists, images, special characters, clipboard, undo/redo, and easy customization.

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: Bitflux, Switzerland

Dependency: Netscape/Mozilla

Supported Platforms: Platform Independent

License: Apache License

Availability: http://bitfluxeditor.org/download/

Further Information: Project Home Page: http://bitfluxeditor.org/

3 Information Retrieval Tools

There is a wide range of open source search engines or information retrieval tools

available on the web from Sourceforge (http://www.sf.net/). These systems can be

categorized into two main groups, viz., those that use inverted files and those that

use database systems. We will look into some of the most popular search engines.
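
For readers unfamiliar with the first group, an inverted file maps each word to the documents that contain it, so a query only touches the entries for its own words. The Python sketch below builds such an index in memory and runs a Boolean AND query over it. It is a teaching example with invented documents and function names, not the storage format of any engine described in this chapter.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search_all(index, *words):
    """Boolean AND: ids of documents containing every query word."""
    postings = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*postings) if postings else set()

docs = {
    1: "open source library software",
    2: "digital library architecture",
    3: "open archives initiative",
}
index = build_inverted_index(docs)
print(sorted(search_all(index, "open")))
print(sorted(search_all(index, "library", "software")))
```

Production engines add stemming, stop words, and on-disk storage, but the lookup principle is the same.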

3.1 Ht://Dig

Description: The ht://Dig system is a complete world wide web indexing and

searching system for a domain or intranet. It is not meant to replace Internet-wide

search engines; instead, it is meant to cover the search needs of a single company,

campus, or even a particular subsection of a web site.

Special Features: Some of the special features are

• Intranet searching

• Robot exclusion is supported

• Boolean expression searching

• Configurable search results

• Email notification of expired documents

• Searches on subsections of the database

(Go to http://www.htdig.org/require.html for full feature set)

History: ht://Dig was developed at San Diego State University as a way to search

the various web servers on the campus network.

Project Sponsors/Administrators: San Diego State University

Dependency: C++ Compiler, libstdc++ (for building from source)

Supported Platforms: Linux, UNIX, BSD, Solaris, HP/UX

License: GNU GPL

Availability: http://www.htdig.org/mirrors.html, http://www.htdig.org/where.html

Further Information: Project Home Page: http://www.htdig.org/

3.2 Swish-E

Description: Simple Web Indexing System for Humans - Enhanced (SWISH-E) is

a fast, powerful, flexible, free, and easy to use system for indexing collections of

Web pages or other files.

Special Features: Please refer to

http://swish-e.org/current/docs/README.html#Key_features for the full feature set.

Some of the major features are:

• Quickly index a large number of documents in different formats including

text, HTML, and XML

• Use “filters” to index other types of files such as PDF, gzip, or Postscript.

• Includes a web spider for indexing remote documents over HTTP. Follows

Robots Exclusion Rules (including META tags).

• Can use an external program to supply documents to Swish-e, such as an

advanced spider for your web server or a program to read and format

records from a relational database.

• Document “properties” (some subset of the source document, usually

defined as META or XML elements) may be stored in the index and

returned with search results
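
The Robots Exclusion Rules mentioned in the list above tell a spider which parts of a site it may fetch. The Python sketch below applies such rules with the standard `urllib.robotparser` module. It is an independent illustration, not Swish-e's spider; the robots.txt content, the agent name, and the helper `may_index` are assumptions for the example.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content; a real spider would fetch it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def may_index(url, agent="example-spider"):
    """Return True if the robots.txt rules allow this agent to fetch url."""
    return parser.can_fetch(agent, url)

print(may_index("http://www.example.org/catalogue.html"))
print(may_index("http://www.example.org/private/staff.html"))
```

A well-behaved spider runs this check before every fetch and skips any URL for which it returns False.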

History: Developed by people at the University of California (Berkeley and San

Francisco) and other places.

Project Sponsors/Administrators: Roy Tennant (UC, Berkeley)

Dependency: (To build from source) GCC (C++ Compiler), and some other

optional packages. Please refer to http://swish-e.org/dev/docs/INSTALL.html for

the latest requirements.

Supported Platforms: Sun/Solaris, UNIX, BSD, Linux, OS X, Windows

License: GNU GPL, or LGPL

Availability: http://swish-e.org/Download/

Further Information:

1. Project Homepage: http://swish-e.org/

2. How to Index Anything: http://www.linuxjournal.com/article.php?sid=6652

3.3 ASPseek

Description: ASPseek is Internet search engine software developed by SWsoft. It

consists of an indexing robot, a search daemon, and a CGI search frontend. It can

index as many as a few million URLs and search for words and phrases, use

wildcards, and do a Boolean search. Search results can be limited to a given time

period, site, or Web space (set of sites) and sorted by relevance (PageRank is used)

or date.
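
Since PageRank is named as the relevance measure, a brief sketch of the idea may help. The Python code below is the textbook power-iteration form of PageRank on an invented three-page link graph. How ASPseek actually computes its ranking is not described here, so the damping factor, the iteration count, and the function name `pagerank` are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to; every linked
    page must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank

web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(web)
print(sorted(ranks, key=ranks.get, reverse=True))
```

Pages that receive links from many pages, or from highly ranked pages, end up with higher scores, which is what allows results to be ordered by relevance and not only by date.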

Special Features: ASPseek is optimized for multiple sites (threaded index, async

DNS lookups, grouping results by site, Web spaces), but can be used for searching

one site as well. ASPseek can work with multiple languages/encodings at once

(including multibyte encodings such as Chinese) due to Unicode storage mode.

Other features include stopwords and ispell support, a charset and language

guesser, HTML templates for search results, excerpts, and query words

highlighting.

History: Developed and maintained by SWsoft.

Project Sponsors/Administrators: SWsoft (http://www.sw-soft.com/)

Dependency: C++ STL, RDBMS

Supported Platforms: Linux

License: GNU GPL

Availability: Binary packages: http://www.aspseek.org/packages.php, Source

packages: http://www.aspseek.org/download.php

Further Information: Project Home Page: http://www.aspseek.org/

3.4 Harvest: A Distributed Search System

Description: Harvest is a system to collect information and make it searchable

using a web interface. Harvest can collect information from the Internet and intranets using

HTTP, FTP, and NNTP, as well as from local files on hard disks, CD-ROMs, and file servers.

Special Features: The supported formats, in addition to HTML, include

TeX, DVI, PS, full text, mail, man pages, news, troff, WordPerfect, RTF, Microsoft

Word/Excel, SGML, C sources and many more. Stubs for PDF support are included

in Harvest and will use Xpdf or Acroread to process PDF files. Adding support for

a new format is easy due to Harvest's modular design.

History: Unknown

Project Sponsors/Administrators: Developers: Kang-Jin Lee, Javier Masa Marin,

Harald Weinreich

Dependency: Apache, Perl, GCC (C Compiler), Bison, Flex

Supported Platforms: UNIX, Linux

License: GNU GPL

Availability: http://sourceforge.net/project/showfiles.php?group_id=27808,

http://harvest.sourceforge.net/harvest/doc/download.html

Further Information: Project Home Page: http://harvest.sourceforge.net/

3.5 Zebra Server

Description: Zebra is a high-performance, general-purpose structured text indexing

and retrieval engine. It reads structured records in a variety of input formats (e.g.

email, XML, MARC) and allows access to them through exact Boolean search

expressions and relevance-ranked free-text queries.

Special Features: Zebra supports large databases (more than ten gigabytes of data,

tens of millions of records). It supports incremental, safe database updates on live

systems. You can access data stored in Zebra using a variety of Index Data tools

(e.g. YAZ and PHP/YAZ) as well as commercial and freeware Z39.50 clients and

toolkits.

History: Unknown

Project Sponsors/Administrators: Index Data (http://indexdata.dk/)

Dependency: YAZ Toolkit, [To build from source: C++ Compiler (GCC or

VC++)]

Supported Platforms: UNIX, Linux, Windows

License: GNU GPL

Availability: Source and binary: http://indexdata.dk/zebra/

Further Information: Project Home Page: http://indexdata.dk/zebra/

3.6 SiteSearch

Description: The OCLC SiteSearch software provides a comprehensive solution

for managing distributed library information resources in a World Wide Web

environment. It offers tools that integrate electronic resources under one web

interface, provide flexible access to resources, and build text and image databases

locally.

Special Features: Unknown

History: Unknown

Project Sponsors/Administrators: OCLC, Inc

Dependency: Java

Supported Platforms: Platform Independent

License: SiteSearch Open Source License Terms

Availability:

http://www.sitesearch.oclc.org/project/showfiles.php?group_id=16381

Further Information: Project Home Page: http://www.sitesearch.oclc.org/

Chapter 8

Conclusion

“The computer should be doing the hard work. That's what it's paid to do, after

all” – Larry Wall, author of Perl programming language

• Barriers in Using OSS

• Criteria for Selection of OSS

• Conclusion

1 Barriers in Using OSS

The benefits of open source software notwithstanding, there are a number of

barriers to the use of OSS in libraries. Library administrators are often reluctant to

adopt OSS due to a number of factors.

According to the Draft Report (2001) of Digital Library Federation (USA):

• OSS can lack formal support making it difficult for libraries without

significant capacity in their systems department to participate in OSS

development or to use OSS.

• OSS needs to develop a participatory organizational model that allows

many to contribute perhaps in different ways to OSS development.

• OSS is not always easy to use. It is therefore largely inaccessible to the

many libraries and library system departments that require plug-and-play

software that is well documented and supported and can be easily installed

(and uninstalled).

• OSS initiatives do not always do enough to get non-systems librarians and

library patrons involved in design and testing of OSS. As such, they are

seen as being something that exclusively offers benefits to and holds

interest for library systems staff and not for the wider library community.

Another factor that often comes up is the usability of open source software. The

basic problem is that most open source systems are written by programmers who do

not understand end-user needs, and the resulting software is often complex and

difficult to use. Thus, people argue that open source software projects need to

adapt in order to produce systems that can be used by a typical, non-technical

user.

A related issue is the documentation of open source software.

A piece of software cannot be used easily without proper documentation.

While proprietary software vendors can afford to employ technical writers to

do the job, the open source world largely lacks the resources to do so. Programmers

work on open source projects because they love programming, and they do it as a

hobby or pastime. Documenting the product, however, may not be as challenging to them

as writing the software. This sometimes greatly reduces the usefulness of a software

product.

2 Criteria for Selection of OSS

Frank Cervone (2003) has given the following guidelines for evaluating open

source software, all of which follow a single

principle: thoroughly investigate the software before implementing.

Some questions to be asked include:

• What are the programming language requirements?

o Do you have people on staff who can program in the language in

which the software is written?

o If not, do you have ready access to people who can?

o If not, are there alternative packages you can support?

• What is the operating environment?

o Is this software supported on your hardware?

o Does it run on the operating systems you support?

o Is there a large, active user base?

• How is maintenance handled?

o Who is currently supporting it?

o Is there an electronic discussion list, newsgroup, or blog [weblog]

that can be used for support?

o Is there a commercial entity that could provide support?

o Is there a community of peers providing input on enhancements and

modifications?

o What sort of functional and integrated testing is performed by the

user community?

• Does the software have the necessary functionality?

o Is the product mature? Is it in a greater than 1.0 release?

o Will it require modification? If so, do you have the expertise?

o How will local modifications be folded back into the base product so

that the same modifications need not be repeated for each new

release?
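The questions above lend themselves to a simple checklist. The following is only a minimal sketch, in Python, of how a library might record yes/no answers and list the points that still need investigation. The names `Criterion`, `is_mature` and `review`, the sample answers, and the version number are all hypothetical and are not part of Cervone's guidelines. The maturity test takes "greater than 1.0" literally and assumes a plain dotted numeric version string.

```python
from dataclasses import dataclass, field


@dataclass
class Criterion:
    """One group of yes/no questions (hypothetical structure)."""
    name: str
    answers: dict = field(default_factory=dict)  # question -> True/False

    def unresolved(self):
        """Return the questions that were answered 'no'."""
        return [q for q, ok in self.answers.items() if not ok]


def is_mature(version):
    """True if a dotted numeric version such as '1.2' is greater than 1.0."""
    parts = tuple(int(p) for p in version.split("."))
    return parts > (1, 0)


def review(criteria):
    """Map each criterion name to its unresolved questions, if any."""
    report = {}
    for c in criteria:
        open_questions = c.unresolved()
        if open_questions:
            report[c.name] = open_questions
    return report


# Hypothetical answers for an imaginary candidate package.
candidate = [
    Criterion("Programming language", {
        "Staff can program in the language": False,
        "Ready access to people who can": True,
    }),
    Criterion("Operating environment", {
        "Supported on our hardware": True,
        "Runs on our operating systems": True,
    }),
    Criterion("Functionality", {
        "Release is greater than 1.0": is_mature("0.9.2"),
    }),
]

print(review(candidate))
```

A report of this kind does not replace the investigation itself; it only makes visible which of the questions remain open before the software is implemented.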

Another concern is how much customization is needed to make the product work.

Often librarians lack the skills themselves or are unable to find suitable support to

customize a software application to their needs. A case in point is the

PostNuke content management system, which requires a considerable amount of

customization before it can be used as a library’s portal.

3 Conclusion

Open source essentially empowers less privileged communities, though it does not

follow that it is meant only for them. There is no denying that OSS helps

bridge the digital divide in more ways than one. Libraries in developing

countries are able to support electronic access, digital libraries, and resource

sharing because they are able to use OSS. Even libraries in developed

countries are becoming more inclined towards OSS to improve their services.

Chapter 9

Appendix

“Making Linux freely available is the single best decision I've ever made. There are

lots of good technical stuff I'm proud of too in the kernel, but they all pale by

comparison.” – Linus Torvalds

• OSI Certified Licenses

OSI Certified Licenses

The Open Source Initiative (OSI) certifies open source licenses on the basis of the ten

criteria described in Chapter 3 of this document. To date, 45 OSI-certified

licenses are listed on its home page (http://www.opensource.org/).

1. Academic Free License: http://www.opensource.org/licenses/academic.php

2. Apache Software License:

http://www.opensource.org/licenses/apachepl.php

3. Apple Public Source License: http://www.opensource.org/licenses/apsl.php

4. Artistic license: http://www.opensource.org/licenses/artistic-license.php

5. Attribution Assurance Licenses:

http://www.opensource.org/licenses/attribution.php

6. BSD license: http://www.opensource.org/licenses/bsd-license.php

7. Common Public License: http://www.opensource.org/licenses/cpl.php

8. Eiffel Forum License: http://www.opensource.org/licenses/eiffel.php

9. Eiffel Forum License V2.0:

http://www.opensource.org/licenses/ver2_eiffel.php

10. Entessa Public License: http://www.opensource.org/licenses/entessa.php

11. GNU General Public License (GPL):

http://www.opensource.org/licenses/gpl-license.php

12. GNU Library or "Lesser" General Public License (LGPL):

http://www.opensource.org/licenses/lgpl-license.php

13. Lucent Public License (Plan9):

http://www.opensource.org/licenses/plan9.php

14. IBM Public License: http://www.opensource.org/licenses/ibmpl.php

15. Intel Open Source License:

http://www.opensource.org/licenses/intel-open-source-license.php

16. Historical Permission Notice and Disclaimer:

http://www.opensource.org/licenses/historical.php

17. Jabber Open Source License:

http://www.opensource.org/licenses/jabberpl.php

18. MIT license: http://www.opensource.org/licenses/mit-license.php

19. MITRE Collaborative Virtual Workspace License (CVW License):

http://www.opensource.org/licenses/mitrepl.php

20. Motosoto License: http://www.opensource.org/licenses/motosoto.php

21. Mozilla Public License 1.0 (MPL):

http://www.opensource.org/licenses/mozilla1.0.php

22. Mozilla Public License 1.1 (MPL):

http://www.opensource.org/licenses/mozilla1.1.php

23. Naumen Public License: http://www.opensource.org/licenses/naumen.php

24. Nethack General Public License:

http://www.opensource.org/licenses/nethack.php

25. Nokia Open Source License: http://www.opensource.org/licenses/nokia.php

26. OCLC Research Public License 2.0:

http://www.opensource.org/licenses/oclc2.php

27. Open Group Test Suite License:

http://www.opensource.org/licenses/opengroup.php

28. Open Software License: http://www.opensource.org/licenses/osl.php

29. Python license (CNRI Python License):

http://www.opensource.org/licenses/pythonpl.php

30. Python Software Foundation License:

http://www.opensource.org/licenses/PythonSoftFoundation.php

31. Qt Public License (QPL): http://www.opensource.org/licenses/qtpl.php

32. RealNetworks Public Source License V1.0:

http://www.opensource.org/licenses/real.php

33. Reciprocal Public License: http://www.opensource.org/licenses/rpl.php

34. Ricoh Source Code Public License:

http://www.opensource.org/licenses/ricohpl.php

35. Sleepycat License: http://www.opensource.org/licenses/sleepycat.php

36. Sun Industry Standards Source License (SISSL):

http://www.opensource.org/licenses/sisslpl.php

37. Sun Public License: http://www.opensource.org/licenses/sunpublic.php

38. Sybase Open Watcom Public License 1.0:

http://www.opensource.org/licenses/sybase.php

39. University of Illinois/NCSA Open Source License:

http://www.opensource.org/licenses/UoI-NCSA.php

40. Vovida Software License v. 1.0:

http://www.opensource.org/licenses/vovidapl.php

41. W3C License: http://www.opensource.org/licenses/W3C.php

42. wxWindows Library License:

http://www.opensource.org/licenses/wxwindows.php

43. X.Net License: http://www.opensource.org/licenses/xnet.php

44. Zope Public License: http://www.opensource.org/licenses/zpl.php

45. zlib/libpng license: http://www.opensource.org/licenses/zlib-license.php

Chapter 10

Selective Bibliography

1. Cervone, Frank (2003). The Open Source Option [online] Available from:

http://libraryjournal.reviewsnews.com/index.asp?layout=articlePrint&articleID=CA304084&publication=libraryjournal

(accessed on August 27, 2003)

2. Chawner, Brenda (2003). Open Source Software and Libraries Bibliographies

(Version 0.5) [online] Available from:

http://www.vuw.ac.nz/staff/brenda_chawner/biblio.html (accessed on July 23,

2003)

3. Chudnov, Daniel (1999). Open Source Library Systems: Getting Started

[online] Available from: http://www.oss4lib.org/readings/oss4lib-getting-started.php

(accessed on July 23, 2003)

4. Ghosh, R.A. (1998). FM Interview with Linus Torvalds: What motivates free

software developers? First Monday [online] (2 March 1998) Vol. 3 (3)

Available from:

http://www.firstmonday.dk/issues/issue3_3/torvalds/index.html (accessed on

July 20, 2003)

5. Greenstein, D. (2001). DLF Architectures: Evaluation of Open Source

Software for Libraries [online] Available from:

http://www.diglib.org/architectures/ossrep.htm (accessed on August 26,

2003)

6. MacFarlane, Andrew (2003). On Open Source IR. In Aslib Proceedings, 55

(4), pp. 217-222.

7. Moody, Glynn (2001). Rebel Code: Linux and the Open Source Revolution.

Allen Lane, London.

8. Morgan, Eric Lease (2002). Open Source Software in Libraries [online]

Available from: http://dewey.library.nd.edu/morgan/musings/ossnlibraries/

(accessed on July 20, 2003)

9. Morgan, Eric Lease (2003). Building Your Library’s Portal [online] Available

from: http://dewey.library.nd.edu/morgan/musings/portals/ (accessed on

August 27, 2003)

10. OSS4Lib. Oss4lib – Projects [online] Available from:

http://www.oss4lib.org/projects/ (accessed on July 23, 2003).

11. OSS- Open Source Software [online] Available from:

http://www.eifl.net/opensoft/soft.html (accessed on July 23, 2003)

12. Pasquinelli, Art (2003). Information Technology Advances in Libraries

[online] Available from:

http://www.sun.com/products-n-solutions/edu/whitepapers/pdf/it_advances.pdf

(accessed on August 16, 2003)

13. Rasch, Chris (2000). A Brief History of Free/Open Source Software

Movement [online] Available from:

http://www.openknowledge.org/writing/open-source/scb/brief-open-source-history.html

(accessed on July 20, 2003)

14. Raymond, Eric S. (2001). The Cathedral and the Bazaar: Musings on Linux

and Open Source by an Accidental Revolutionary. Revised edition.

Sebastopol, CA: O’Reilly and Associates.