VTLS presentation

Vinod Chachra, President & CEO VTLS Inc. Special Presentation at UiTM 18 February 2009


About VTLS Inc.

First spin-off corporation from Virginia Tech (VT), Virginia’s largest university and home of System X, the 3rd fastest supercomputer in the world when built, assembled from 1100 PCs purchased off the web. Total cost under $5M.

Vinod Chachra served as VP for Information Tech at VT.

VTLS has offices in 7 countries; does business in 40.

VTLS has three major product lines

Virtua – Alexandria Egypt; many National Libraries

VITAL – Fedora-based Institutional Repository, developed in partnership with the Australian ARROW project.

VTRAX – RFID based tracking & security systems for libraries

VTLS is a Worldwide Company

Partner or Office* Locations

Australia*

Brazil*

Brunei

Egypt

France

Greece

India*

Kuwait

Malaysia*

Philippines

Russia

Slovakia

Spain – European HQ*

Switzerland*

Tunisia

Taiwan

Thailand

UAE

USA*

Select Customers: National Libraries

Europe: National Library of Switzerland

Europe: National Library of Wales

Europe: National Library of Ireland

Europe: Royal Library of Belgium

Europe: National Library of Slovakia

Europe: National Union catalog of Poland

Africa: Library of Alexandria (Egypt)

Africa: National Library of Morocco

Asia: National Library of India

Asia: National Library of Singapore

Asia: National Library of Malaysia

Union Catalogs (regional) of Catalonia and Switzerland

VTLS HQ in Blacksburg, VA, USA

Presentation Outline

1. Discussion on Model of Library Cooperatives

2. Institutional Repositories

3. New Discovery Tools using Facet Based Searching

4. New Directions in R & D

Presentation Part 1 (of 4)

Discussion on Model of Library Cooperatives

Libraries have to justify their value

Libraries have to redefine their relevance

Libraries have to do more for less

In a knowledge-based economy libraries are essential … we know that, but do funding agencies and taxpayers know that?

In bad economic times (like now) usage of libraries increases.

Why Union Catalogs and Consortia?

A union catalog is a catalog that contains the records of all the participating libraries.

A consortium is a group of libraries working together for the benefit of all its members.

The primary purposes of Union Catalogs and Consortia are to support resource sharing, to reduce operational costs and to increase services to users.

Resource sharing is a noble goal as it allows people to do more with less.

What Qualifies for Resource Sharing?

1. Descriptive materials (intellectual work): bib records; authority records; serials holdings patterns

2. Identification and authentication materials: patron records used in ubiquitous library

3. Real materials; ILL & joint collection development: books, periodicals, films, digital documents, etc.

4. Human resources: shared cataloging; joint system management functions

5. Computer resources: hardware, software, know-how

6. Virtual collections: indexes to digital materials; access to digital materials

Steps in Resource Sharing - User's Perspective

1. Can I find what I am looking for? - searching tools, finding aids, union catalogs

2. Is it available? - status checking; navigation tools; should not have to repeat search

3. Is it available to me? - identification; authentication; access control

4. If yes, can I request it here and now? - local request service, ILL, DD, download

5. How will I get it? - pick up, mail, fax, e-mail, download

Functions

1. Copy Cataloging and Joint Cataloging

2. Duplicate control

3. Reference searching

4. Support of Distributed Cataloging

5. Maintaining Holdings symbols and if needed navigating them for discovery and ILL

6. Quality Control

7. Electronic distribution of records (EDIS)

8. Creating and managing linked records

9. Maintaining Serials holdings records

10. Support for Extracting records

11. Union Catalogs for sharing hardware, etc.
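
Functions 2 and 5 (duplicate control and holdings symbols) can be illustrated with a minimal Python sketch. It is not how Virtua or any particular union catalog does it: the match key (ISBN if present, otherwise normalized title plus year), the record fields, the holding symbols and the placeholder ISBNs are all assumptions made for illustration.

```python
from collections import defaultdict

def match_key(record):
    """Duplicate-control key: ISBN when present, else normalized title + year (an assumed rule)."""
    isbn = record.get("isbn", "").replace("-", "").strip()
    if isbn:
        return ("isbn", isbn)
    title = " ".join(record["title"].lower().split())
    return ("title", title, record.get("year"))

def build_union_catalog(contributions):
    """contributions: (holding_symbol, record) pairs sent in by member libraries."""
    catalog = {}                    # one bibliographic record per match key
    holdings = defaultdict(set)     # match key -> holding symbols of owning libraries
    for symbol, record in contributions:
        key = match_key(record)
        catalog.setdefault(key, record)   # the first contributed record is kept
        holdings[key].add(symbol)
    return [{**catalog[key], "holdings": sorted(holdings[key])} for key in catalog]

# Hypothetical contributions; the ISBNs are placeholders, not real numbers.
contributions = [
    ("LIB-A", {"title": "Alice in Wonderland", "isbn": "0-000-00000-1"}),
    ("LIB-B", {"title": "Alice in Wonderland", "isbn": "0000000001"}),
    ("LIB-C", {"title": "Library  Cooperatives", "year": 2009}),
    ("LIB-A", {"title": "library cooperatives", "year": 2009}),
]

union = build_union_catalog(contributions)
for entry in union:
    print(entry["title"], entry["holdings"])
```

Four contributed records collapse into two union records, each carrying the symbols of the libraries that hold it, which is what discovery and ILL need to navigate.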

Four Models

1. Union Catalog with Local Systems

2. No union catalog – only broadcast searching + Local Systems

3. Union Catalog + Consortium Database for Regions

4. Single Consortium Database with all local functions

1. Union Catalog with Local Systems

[Diagram: a Union Catalog and six individual libraries (1 to 6), each with a local system]

2. No Union Catalog

[Diagram: six individual libraries (1 to 6) with local systems, reached by broadcast (federated) searching]

3. Union Catalog + Consortium Systems

[Diagram: a Union Catalog, consortium systems, and six individual libraries (1 to 6)]

4. Single Consortium Database

[Diagram: one consortium system serving the individual libraries]
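
The difference between models 1 and 2 can be sketched in Python. This is only an illustration under stated assumptions: the three libraries and their titles are invented, `search_local` stands in for a real query to a local system, and matching is a plain substring test.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical holdings of three local systems (titles only, for illustration).
LOCAL_SYSTEMS = {
    "Library 1": ["Digital Libraries", "Cataloging Basics"],
    "Library 2": ["Digital Libraries", "Archives in Practice"],
    "Library 3": ["Digital Preservation"],
}

def search_local(library, term):
    """Stand-in for a query sent to one library's local system."""
    return [(library, title) for title in LOCAL_SYSTEMS[library]
            if term.lower() in title.lower()]

def broadcast_search(term):
    """Model 2: query every local system at search time and merge the hits."""
    with ThreadPoolExecutor() as pool:
        per_library = list(pool.map(lambda lib: search_local(lib, term), LOCAL_SYSTEMS))
    return sorted(hit for hits in per_library for hit in hits)

def build_union_index():
    """Model 1: copy every library's records into one central index ahead of time."""
    index = {}
    for library, titles in LOCAL_SYSTEMS.items():
        for title in titles:
            index.setdefault(title, []).append(library)
    return index

def union_search(index, term):
    """Model 1: answer the query from the central index alone."""
    return sorted((library, title) for title, libraries in index.items()
                  if term.lower() in title.lower() for library in libraries)

union_index = build_union_index()
print(broadcast_search("digital"))
print(union_search(union_index, "digital"))
```

Both calls return the same hits. The trade-off is where the work happens: broadcast searching depends on every local system answering at query time, while the union catalog pays the cost of loading and maintaining the central index up front.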

Consortium Database (1 of 2)

Standard Virtua Database Structure

[Diagram: Organization; Accounts (Clas 01, Clas 02, Clas 99 Test); Locations (Libraries) under each account; Sub-Locations (Floors/Dept) under each location]

In a standard implementation of Virtua, general parameters that define circulation rules, acquisitions and cataloging options, etc. in the Virtua Profiler apply to all locations that are a part of the database account.

Virtua Consortium Database (2 of 2)

Consortium Database Structure

[Diagram: Organization; Accounts (Clas 01, Clas 02, Clas 99); Institutions under each account; Locations (Libraries) under each institution; Sub-Locations (Floors/Dept) under each location]

Virtua Consortium Database Structure

In the consortium implementation of Virtua, general parameters that define circulation rules, acquisitions and cataloging options, etc. in the Virtua Profiler apply to one institution and all locations associated with that institution that are a part of the database account.

This lets each institution set and manage its own parameters in the Profiler to reflect its internal policies and its policies towards other member institutions in the consortium.

When parameters are defined for a given institution, those parameters apply to all patrons, locations and sub-locations that are linked to that institution.
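
A minimal Python sketch of institution-scoped parameters follows. It is not the Virtua Profiler's data model: the class names, the two loan-period parameters and the institution names are hypothetical, chosen only to show that a location inherits the rules of the institution it is linked to, and that an institution can treat patrons of other members differently.

```python
from dataclasses import dataclass, field

@dataclass
class Institution:
    name: str
    parameters: dict = field(default_factory=dict)   # set once per institution

@dataclass
class Location:
    name: str
    institution: Institution                          # every location links to one institution

@dataclass
class Patron:
    name: str
    institution: Institution

def loan_days(location, patron):
    """Look up the loan period from the parameters of the location's institution."""
    params = location.institution.parameters
    if patron.institution is location.institution:
        return params["loan_days_own_patrons"]        # hypothetical parameter name
    return params["loan_days_other_members"]          # hypothetical parameter name

inst_a = Institution("Institution A", {"loan_days_own_patrons": 28, "loan_days_other_members": 14})
inst_b = Institution("Institution B", {"loan_days_own_patrons": 21, "loan_days_other_members": 7})

main_a = Location("Main Library", inst_a)
branch_b = Location("Branch Library", inst_b)
patron_a = Patron("Patron of A", inst_a)

print(loan_days(main_a, patron_a))     # A's rule for its own patrons
print(loan_days(branch_b, patron_a))   # B's rule for patrons of other members
```

In the standard (non-consortium) structure the same lookup would go to a single parameter set shared by every location in the database account.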

Presentation Part 2 (of 4)

Institutional Repositories

What is an institutional repository?

-- place to keep locally developed content

-- place to provide access to this content

-- institutional commitment to support this activity so there is some assurance that this material will be available into the future.

Institutional Repositories?

Clifford A. Lynch, Executive Director, Coalition for Networked Information, defines an institutional repository as a set of services that an institution offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members.

Lynch, Clifford A., “Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age”, ARL Bimonthly Report 226, February 2003.

It is more than software… it is a combination of hardware, software, policies and processes AND an institutional commitment to support it.

Two Problems : Viewers and Media

Viewers

Digital content requires “viewers”

Viewers depend on hardware / software (HW/SW)

Obsolescence rate of HW/SW is very high

Preservation depends on “migrations”

Media life is unknown

may require “migrations”

Institutional Repositories will help as VITAL is “Preservation Friendly”.

Digital Preservation Issues

OAI-PMH – Open Archives Initiative Protocol for Metadata Harvesting

SRU & SRW – Search/Retrieve via URL / Search/Retrieve Web Service

DOI – Digital Object Identifier

RDF – Resource Description Framework

METS – Metadata Encoding and Transmission Standard

FOXML – Fedora Object XML

LDAP - Lightweight Directory Access Protocol

XML – eXtensible Markup Language

TEI – Text Encoding Initiative

EAD – Encoded Archival Description

Acronyms in Repository Services

Examples: Open Source Software

Some systems call themselves “digital library software”, like

Greenstone (New Zealand)

EPrints (UK)

Other newer systems go by the name of Institutional Repositories, like

DSpace

Fedora

What is VITAL?

Proprietary Repository Management Software

Owned and produced by VTLS Inc

Developed in close partnership with ARROW

Repository software that allows you to:

Ingest

Manage

Search - Access

Expose

Preserve

Digital objects stored in an Open source Fedora™ repository

VITAL is based on Fedora.

What is Fedora™?

Flexible Extensible Digital Object Repository Architecture

http://www.fedora-commons.org/

Go to Fedora Commons

VITAL : Institutional Repository Solution

Provides Management Services (API-M)

Ingest – XML-encoded object submission

Create – interactive object creation via API request

Maintain – interactive object modification via API requests

Validate – application of integrity rules to objects

Identify – generate unique object identifiers

Secure – authentication and access control

Preserve – automatic content versioning and audit trail

Export – XML-encoded object formats

Provides tools to simplify the workflows

What does VITAL do?

VITAL Fedora Relationship

[Diagram: Vital Manager, Valet, Access Portal, Indexes, Web services and Batch Loading Tool around FEDORA. Reproduced with permission from ARROW]

Ingesting using VALET

[Diagram: submitter and editor; the VALET web form captures objects one by one. Reproduced with permission from ARROW]

The VITAL Architecture

[Diagram: Ingest Layer (Self Submission, Batch Ingest); Management; Search & Discovery; Web Services; Administrative Functions]

Who is using VITAL?

30 Institutions worldwide; more coming

In Australia: Australian Research Repositories Online to the World (ARROW); 16 institutions in Australia including Monash University

In USA: Duke, Columbia University, VCOM, Mary Washington University, Virginia Tech

In Europe:

UK: National Library of Wales

Greece: National Theatre; Athens Archaeological Society

Belgium: UCL

Slovakia: National Library

In ME and Asia: National Library of Singapore; KISR

Presentation Part 3 (of 4)

Discovery Tool Based on Facet Searching

The Data Overload Problem

Fantastic Growth of Data

UC Berkeley estimates that in 2002 the world created 5 exabytes of data (tera: 12; peta: 15; exa: 18 zeroes)

Eric Schmidt, CEO of Google, says “absorbing 5 exabytes of data on TV would require 40,700 years”

Another way to look at it: with a life expectancy of 80+ years it will take you 500 lifetimes of ceaseless TV watching from birth to death to see 5 exabytes of data

The problem in 2009 is much worse.
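
The figures on this slide can be checked with a few lines of Python. The viewing rate computed at the end is not stated in the talk; it is only what the two quoted numbers (5 exabytes, 40,700 years) imply when divided, assuming 365.25-day years.

```python
EXABYTE = 10 ** 18                     # "exa" = 18 zeroes

data_2002 = 5 * EXABYTE                # bytes created in 2002, per the UC Berkeley estimate
years_to_watch = 40_700                # figure attributed to Eric Schmidt
life_expectancy = 80                   # years, taken from the slide's "80+"

lifetimes = years_to_watch / life_expectancy
seconds = years_to_watch * 365.25 * 24 * 3600
implied_rate = data_2002 / seconds     # bytes per second implied by the quoted figures

print(f"lifetimes of viewing: {lifetimes:.2f}")
print(f"implied viewing rate: {implied_rate / 10**6:.1f} MB per second")
```

The division gives just under 509 lifetimes, which the slide rounds to 500.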

The Problem - much worse today

Massive digitization Projects are going on everywhere

In governments, corporations, libraries, archives, and in special projects like

The Million Book Project

Google Print Project

Open Content Alliance (32 Universities, Yahoo, Microsoft)

Steven Spielberg Digital Library

Born-digital data is increasing; everybody is a publisher now: blogs, wikis, chats

Sites like Google, MySpace, YouTube, Microsoft Live produce even more data

People -- Strategic Issues

84% of people start their information searches on the internet (not in the library catalog).

62% start at Google,

1% start at a library website.

However,

Only 10% found the information they needed.

40% of that 10% found it at a library web site!

So there is an opportunity here!

Discovery Tools -- What is Discovery?

Discovery is finding something you need without knowing exactly what you are looking for!

Requires the following capabilities:

System “exposes” its content

System is iterative – good navigation

System has no “dead-ends”

System aggregates information – drill down

System shows contents in “graphical format”

System is fast (because of iterative use)

System requires no training – discovery!!

Discovery Tool Examples

Networked Digital Library of Theses and Dissertations

http://rogers.vtls.com:6080/visualizer

A loose consortium of independent libraries

http://rogers.vtls.com:7080/visualizer

Note: These sites may not be available at all times or from all locations.

Visualizer -- Expanding the Architecture

The problem is massive.

We can admit defeat or build systems to cope with it.

These systems must take advantage of the capabilities of the computer and combine it with the knowledge, expertise and inference ability of humans.

Two Questions

How do you organize the world’s information?

How do we visualize the nature and depth of content?

VTLS Visualizer OPAC -- Facet Based Searching

[Diagram: data from ILS 1, 2, 3, Repository 1, 2 or any system (MARC 21, D.C., XML, or direct) passes through MAPPING ROUTINES to STANDARDIZED INPUT, then to the FACET SEARCH ENGINE, STANDARDIZED OUTPUT and DISPLAY MANAGEMENT; a Profiling Interface, a Query Interface and a Knowledge Base connect to the engine]

VTLS Visualizer – Expanding the Architecture

The architecture must be:

Comprehensive: all data (catalog & other); all formats (MARC21, XML, EAD etc.); all languages (those you can read and those you cannot)

Distributed: multiple locations; multiple sources

Scalable: survive unexpected onslaught

Sustainable: organizational involvement, management & support

User centric (knowledge base): takes advantage of the knowledge base of the user community

Branding (1 of 4)

Branding (2 of 4)


Branding and Drill Down (3 of 4)

Branding and Expanded Search (4 of 4)

How Many Facets?

Basic facets and Extended facets

Basic facets -- minimal set for every implementation

Extended facets – additional facets for special use

How many facets?

Too few facets are ineffective

Too many facets are not user friendly

How to identify

One to one facets

One to many facets

Drill down facets

How does it work?

1. Harvest data: OAI-PMH used for harvesting the metadata

2. Create KB: Apply the “Knowledge Base” to the Metadata

3. Profile the system for proper facets

Facets on the raw data

Facets on the derived data from knowledge base

4. Create standardized input for indexing

5. Apply indexing for use by search engine

6. Throw away the harvested metadata but retain index

7. Discover

8. Hyperlink to the source for display of content
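
Steps 2 to 8 above can be sketched in Python as follows. This is an illustration under assumptions, not the Visualizer's implementation: the harvested records are hard-coded rather than fetched over OAI-PMH, the field names, URLs and the tiny knowledge base (language code to label) are hypothetical, and the "index" is a plain in-memory dictionary.

```python
from collections import Counter

KNOWLEDGE_BASE = {"eng": "English", "fre": "French"}   # hypothetical code-to-label mapping

def standardize(record):
    """Steps 2-4: apply the knowledge base and flatten to a standard input shape."""
    return {
        "id": record["id"],
        "url": record["url"],
        "format": record["format"],                                  # one-to-one facet on raw data
        "language": KNOWLEDGE_BASE.get(record["lang"], "Unknown"),   # facet on derived data
        "subject": list(record.get("subjects", [])),                 # one-to-many facet
    }

def build_index(records):
    """Step 5: inverted index from (facet, value) to record ids, plus links to the source."""
    index, links = {}, {}
    for rec in map(standardize, records):
        links[rec["id"]] = rec["url"]
        for facet in ("format", "language", "subject"):
            values = rec[facet] if isinstance(rec[facet], list) else [rec[facet]]
            for value in values:
                index.setdefault((facet, value), set()).add(rec["id"])
    return index, links

def drill_down(index, selections):
    """Step 7: intersect the id sets of every selected (facet, value) pair."""
    sets = [index.get(selection, set()) for selection in selections]
    return set.intersection(*sets) if sets else set()

def facet_counts(index, facet, within=None):
    """Step 7: count the values of one facet, optionally inside a drill-down result."""
    counts = Counter()
    for (name, value), ids in index.items():
        if name == facet:
            hits = ids if within is None else ids & within
            if hits:
                counts[value] = len(hits)
    return counts

harvested = [  # stand-in for metadata harvested in step 1
    {"id": "r1", "url": "http://example.org/r1", "format": "Thesis", "lang": "eng",
     "subjects": ["Libraries", "Metadata"]},
    {"id": "r2", "url": "http://example.org/r2", "format": "Thesis", "lang": "fre",
     "subjects": ["Libraries"]},
    {"id": "r3", "url": "http://example.org/r3", "format": "Book", "lang": "eng",
     "subjects": ["Archives"]},
]

index, links = build_index(harvested)
del harvested                                          # step 6: keep only the index

theses = drill_down(index, [("format", "Thesis")])
print(sorted(theses))
print(facet_counts(index, "subject", within=theses))
print(links["r1"])                                     # step 8: hyperlink back to the source
```

Because every drill-down is a set intersection followed by a recount, the user always sees how many records sit behind each remaining facet value, which is what keeps the interface free of dead ends.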

Presentation Part 4 (of 4)

New Directions in R & D

If you don’t know where you are going, any road will lead you there!

Lewis Carroll in “Alice in Wonderland”

RDA and FRBR

There are two approaches to the implementation of FRBR:

1. Store the internal data in a hierarchic linked record format.

2. FRBRize records upon display, keeping the storage system like a traditional flat catalog.

Since records are cataloged only once and displayed many, many times, it is better to use the first method.

When harvesting FRBR records, do we unFRBRize them and then harvest? Or do we harvest them and then do some post-processing?

Issue remains unresolved.
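
The second approach (FRBRizing upon display) can be sketched in Python to show why it repeats work on every display. The flat record fields used here (work, language, edition, barcode) are hypothetical simplifications; a real catalog would derive the grouping keys from MARC fields.

```python
# Hypothetical flat catalog: one row per item, as in a traditional catalog.
flat_catalog = [
    {"work": "Alice in Wonderland", "language": "English", "edition": "Edition 1", "barcode": "0001"},
    {"work": "Alice in Wonderland", "language": "English", "edition": "Edition 1", "barcode": "0002"},
    {"work": "Alice in Wonderland", "language": "English", "edition": "Edition 2", "barcode": "0003"},
    {"work": "Alice in Wonderland", "language": "French", "edition": "Edition 1", "barcode": "0004"},
]

def frbrize(records):
    """Group flat records into Work -> Expression -> Manifestation -> Items."""
    tree = {}
    for rec in records:
        expressions = tree.setdefault(rec["work"], {})
        manifestations = expressions.setdefault(rec["language"], {})
        manifestations.setdefault(rec["edition"], []).append(rec["barcode"])
    return tree

def display(tree):
    """Render the grouped tree as indented lines."""
    lines = []
    for work, expressions in tree.items():
        lines.append(f"Work: {work}")
        for expression, manifestations in expressions.items():
            lines.append(f"  Expression: {expression}")
            for manifestation, items in manifestations.items():
                lines.append(f"    Manifestation: {manifestation} ({len(items)} item(s))")
    return lines

tree = frbrize(flat_catalog)          # with approach 2 this runs on every display
print("\n".join(display(tree)))
```

With the first approach the tree is built once at cataloging time and stored as linked records, so display only has to follow the links.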

FRBR Link Types: Group 1

[Diagram: Work, Expression, Manifestation, Item. Link types: W-W, W-E, E-W, E-M, M-E, M-I, I-M]

FRBR Link Types: Group 2

Responsibility Relationships

[Diagram: a Work is created by, an Expression is realized by, a Manifestation is produced by, and an Item is owned by a Person / Corporate Body]

Virtua - Archives Management

Archives Management - Background

Archival management functionality in Virtua came about as a functional enhancement to Virtua for the National Library of Wales to allow them to preserve the content and arrangement of their existing archival collection material as well as maintain these collections once on Virtua.

The functionality was implemented in release 48 of Virtua and encompasses changes in both the client and the iPortal.

Archives Management - Background

Archival cataloguing differs from bibliographic cataloguing because archives, unlike most printed material, cannot be described in isolation.

An archive is, therefore, only fully comprehensible through an understanding of both its content and its context. Content may be explained through description, but context can only be reflected through arrangement.

Archival arrangement involves the ordering of material to reflect its context.

Archival arrangement is reflected in the cataloguing of archives through the use of multilevel description.

Archives Management - Background

Rules for multilevel description are laid down in ISAD(G), the General International Standard Archival Description, and describe 7 core levels:

Fonds

Sub-fonds

Sub-sub-fonds

Series

Sub-series

Files

Items
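
Multilevel description can be sketched in Python as a tree in which each unit carries one of the seven levels. The rule enforced here, that a child must sit at a lower level than its parent while intermediate levels may be skipped, is an assumption made for illustration and not a quotation from ISAD(G); the sample titles are invented.

```python
from dataclasses import dataclass, field

LEVELS = ["Fonds", "Sub-fonds", "Sub-sub-fonds", "Series", "Sub-series", "Files", "Items"]

@dataclass
class Unit:
    title: str
    level: str
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a child unit; it must be at a lower level than this unit (assumed rule)."""
        if LEVELS.index(child.level) <= LEVELS.index(self.level):
            raise ValueError(f"{child.level} cannot be placed under {self.level}")
        self.children.append(child)
        return child

def outline(unit, depth=0):
    """Render the arrangement as indented lines, so that context is visible."""
    lines = [f"{'  ' * depth}{unit.level}: {unit.title}"]
    for child in unit.children:
        lines.extend(outline(child, depth + 1))
    return lines

# Hypothetical archive, for illustration only.
fonds = Unit("Estate papers", "Fonds")
series = fonds.add(Unit("Correspondence", "Series"))
file_ = series.add(Unit("Letters 1900-1910", "Files"))
file_.add(Unit("Letter of 3 May 1904", "Items"))

print("\n".join(outline(fonds)))
```

An item is only reachable through its file, series and fonds, which is the sense in which arrangement records context that a flat bibliographic record would lose.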

Top Five Topics

Format independence: support for MARC21, XML, MODS, METS, etc. in same DB

Mobile access: support for iPhones and other devices; everything that touches users should be mobile

Linked records: as in FRBR, archival systems; as in multilingual subjects

On demand open delivery: authentication, Shibboleth; single sign on and access control

Deep linking: discovery in one system and delivery from another

Libraries & Removing Barriers

Spatial Barriers

Libraries located all over the globe

Temporal barriers

The mere act of publishing a book removes the “time dependent barrier” between the creator of knowledge and the user of knowledge. Question: Would the written language be so pervasive if “voice recording devices” were invented BEFORE written language?

Financial Barriers

New role for libraries – removing intellectual barriers

Removing Intellectual Barriers

Discipline Independence in a multi-disciplinary world

No matter how learned we are, there will always be some discipline that we know nothing about

Language independence in a global world

Knowledge not limited to any single language

It should not be necessary to read Chinese or Arabic to access photos, sound or screen objects.

Should it be necessary to read to be able to learn?

Literacy independence in a multimedia world

Is the ability to read necessary in a multimedia world?

A Closing Thought

Poor technology fosters competition.

Great technology promotes partnerships.

Thanks … Our success lies in making you successful

Many thanks to Dr. Adnan and our hosts at UiTM, and to all of you for coming

VTLS