A Review of Institutional Repository Projects and Technologies Michael L. Nelson Old Dominion University mln@cs.odu.edu http://www.cs.odu.edu/~mln/ Texas A&M University May 6, 2004

A Review of Institutional Repository Projects and Technologies

Michael L. Nelson
Old Dominion University
mln@cs.odu.edu
http://www.cs.odu.edu/~mln/

Texas A&M University
May 6, 2004

Acknowledgements

• ODU: K. Maly, M. Zubair, J. Bollen
• LANL: R. Luce, X. Liu
• NASA: G. Roncaglia, J. Rocker, C. Mackey
• Cornell: C. Lagoze, S. Warner
• MAGiC (UK): Paul Needham
• and, of course, Herbert Van de Sompel (LANL)
  – the OpenURL slides are nicked from his presentations

Outline

• A bit of history
• Core technologies & Issues
  – OAI-PMH
    • deep web
  – OpenURL
  – Handles / DOIs
  – Object Models
• Example implementations
• Download and go…

covered only briefly


OAI-PMH

Background

• I met Herbert Van de Sompel in April 1999...
  – we spoke of a demonstration project he had in mind, for which he had received sponsorship from Paul Ginsparg and Rick Luce
  – We wanted to demonstrate a multi-disciplinary DL that leveraged the large number of high quality, yet often isolated, tech report servers, e-print servers, etc.
    • most digital libraries (DLs) had grown up along single disciplines or institutions
      – little to no interoperability; isolated DL “gardens”

Universal Preprint Service

• A cross-archive DL that provides services on a collection of metadata harvested from multiple archives
  – Nelson: NCSTRL+; a modified version of Dienst
    • support for “clustering”
    • support for “buckets”
  – Krichel: ReDIF metadata format
  – Van de Sompel: SFX Linking
• Demonstrated at Santa Fe NM, October 21-22, 1999
  – http://web.archive.org/web/*/http://ups.cs.odu.edu/
  – D-Lib Magazine, 6(2) 2000 (2 articles)
    • http://www.dlib.org/dlib/february00/02contents.html
  – UPS was soon renamed the Open Archives Initiative (OAI)
    http://www.openarchives.org/

Data and Service Providers

• Self-describing archives
  – Much of the learning about the constituent UPS archives occurred out of band…
• Data Providers
  – publishing into an archive
  – providing methods for metadata “harvesting”
    • provide non-technical context for sharing information also
• Service Providers
  – harvest metadata from providers
  – implement user interface to data

Even if these are done by the same DL, these are distinct roles

Metadata Harvesting

• Move away from distributed searching
  – the return of union catalogs
• Extract metadata from various sources
• Build services on local copies of metadata
  – data remains at remote repositories

[Figure: a user searches for “cfd applications” against a local copy of metadata, which is harvested offline from several nodes. Each node is independently maintained. All searching, browsing, etc. is performed on the local metadata; individual nodes can still support direct user interaction.]
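The harvesting idea above can be illustrated with a minimal sketch, assuming the metadata has already been harvested into a local dictionary. The identifiers, the titles, and the `search` helper are all made up for illustration; a real service provider would use a proper index rather than substring matching.

```python
from typing import Dict, List


def search(local_metadata: Dict[str, str], query: str) -> List[str]:
    """Return identifiers whose locally stored metadata contains every query term."""
    terms = query.lower().split()
    return sorted(
        identifier
        for identifier, text in local_metadata.items()
        if all(term in text.lower() for term in terms)
    )


# Hypothetical local copy of harvested metadata (identifier -> title).
LOCAL_COPY = {
    "oai:example.org:1": "CFD applications for high-lift wing design",
    "oai:example.org:2": "Wind tunnel testing of airfoils",
    "oai:example.org:3": "Applications of CFD to rotorcraft",
}

hits = search(LOCAL_COPY, "cfd applications")
```

No remote repository is contacted at query time: the search touches only the local copy, while the described resources stay where they are.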

Result… OAI

• The OAI was the result of the demonstration and discussion during the Santa Fe meeting
  – OAI = a bunch of people, a religion, a cult, etc.
  – OAI Protocol For Metadata Harvesting (OAI-PMH) = the protocol created and maintained by the OAI
• Initial focus was on federating collections of scholarly e-print materials…
• …however, interest grew and the scope and application of OAI-PMH expanded to become a generic bulk metadata transport protocol
• Note:
  – OAI-PMH is only about metadata -- not full text!
    • but what is metadata vs. full-text?
  – OAI is neutral with respect to the nature of the metadata or the resources the metadata describes
    • read: commercial publishers have an interest in OAI-PMH too...

Open Archives Initiative

• Open: the protocol is openly documented, and metadata is “exposed” to at least some peer group (note: rights management still applies!)
• Archives: archive defined as a “collection of stuff” -- not the archivist’s definition of “archive”. “Repository” used in most OAI documents.
• Initiative: TLA; needed another vowel...

OAI-PMH Mechanics

Request is encoded in HTTP

Response is encoded in XML

XML Schemas for the responses are defined in the OAI-PMH document
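As a rough illustration of these mechanics, the sketch below encodes a request as an HTTP GET URL and reads one field out of an XML response. The base URL and the response text are hand-written stand-ins, not captured from a real repository, and `build_request` and `repository_name` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_request(base_url: str, verb: str, **args: str) -> str:
    """Encode an OAI-PMH request as an HTTP GET URL."""
    query = {"verb": verb}
    query.update(args)
    return base_url + "?" + urlencode(query)


# Hand-written stand-in for an Identify response (assumption, not real output).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2004-05-06T12:00:00Z</responseDate>
  <request verb="Identify">http://example.org/oai2.0/</request>
  <Identify>
    <repositoryName>Example Report Server</repositoryName>
    <baseURL>http://example.org/oai2.0/</baseURL>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>
"""


def repository_name(xml_text: str) -> str:
    """Pull the repository name out of an Identify response."""
    root = ET.fromstring(xml_text.strip())
    return root.findtext("oai:Identify/oai:repositoryName", namespaces=OAI_NS)


url = build_request("http://example.org/oai2.0/", "ListRecords", metadataPrefix="oai_dc")
name = repository_name(SAMPLE_RESPONSE)
```

In a live harvester the URL would be fetched over HTTP and the returned document validated against the published schema; both steps are left out here to keep the sketch offline.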

Overview of OAI-PMH Verbs

Verb                  Function                                 Group
Identify              description of archive                   archival metadata
ListMetadataFormats   metadata formats supported by archive    archival metadata
ListSets              sets defined by archive                  archival metadata
ListIdentifiers       OAI unique ids contained in archive      harvesting verbs
ListRecords           listing of N records                     harvesting verbs
GetRecord             listing of a single record               harvesting verbs

most verbs take arguments: dates, sets, ids, metadata formats and resumption token (for flow control)
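The role of the resumption token in flow control can be sketched as a loop that keeps requesting pages until the token comes back empty or absent. This is a minimal sketch, not a complete harvester: `fetch` is a hypothetical callable that takes request arguments and returns response XML, and the two pages below are hand-written so the example runs without a network.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator

OAI = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 XML namespace


def list_identifiers(fetch: Callable[[Dict[str, str]], str],
                     metadata_prefix: str = "oai_dc") -> Iterator[str]:
    """Yield every identifier, following resumption tokens until none remain."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iter(OAI + "header"):
            yield header.findtext(OAI + "identifier")
        token = root.findtext(f"{OAI}ListIdentifiers/{OAI}resumptionToken")
        if not token:  # absent or empty token: the list is complete
            break
        # A follow-up request carries only the verb and the token.
        params = {"verb": "ListIdentifiers", "resumptionToken": token}


# Two hand-written response pages standing in for a real repository.
_PAGE = ("<OAI-PMH xmlns='http://www.openarchives.org/OAI/2.0/'><ListIdentifiers>"
         "<header><identifier>{ident}</identifier>"
         "<datestamp>2004-05-06</datestamp></header>"
         "<resumptionToken>{token}</resumptionToken>"
         "</ListIdentifiers></OAI-PMH>")
PAGES = {
    None: _PAGE.format(ident="oai:example.org:1", token="page2"),
    "page2": _PAGE.format(ident="oai:example.org:2", token=""),
}


def fake_fetch(params: Dict[str, str]) -> str:
    """Serve the canned page that matches the request's resumption token."""
    return PAGES[params.get("resumptionToken")]


identifiers = list(list_identifiers(fake_fetch))
```

Passing the fetcher in as an argument keeps the paging logic separate from the HTTP transport, which also makes it easy to test.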

OAI-PMH Data Model

[Figure: a resource (David); an item, holding all available metadata about David; records in Dublin Core, MARC, and SPECTRUM metadata formats]

item = identifier

record = identifier + metadata format + datestamp

set-membership is item-level property
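One way to read the data model is as two small types: an item keyed by identifier that carries set membership, and one record per metadata format. The class and method names below are illustrative assumptions, not part of the protocol.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Record:
    """record = identifier + metadata format + datestamp (plus the metadata itself)."""
    identifier: str
    metadata_prefix: str
    datestamp: str
    metadata: str = ""


@dataclass
class Item:
    """item = identifier; set membership is a property of the item, not the record."""
    identifier: str
    sets: FrozenSet[str] = frozenset()
    records: Dict[str, Record] = field(default_factory=dict)

    def add_record(self, metadata_prefix: str, datestamp: str, metadata: str) -> Record:
        """Attach one record per metadata format to this item."""
        record = Record(self.identifier, metadata_prefix, datestamp, metadata)
        self.records[metadata_prefix] = record
        return record


# Hypothetical item with two metadata formats.
item = Item("oai:foo.edu:1234", frozenset({"sculpture"}))
item.add_record("oai_dc", "2004-05-06", "<dc>...</dc>")
item.add_record("marc", "2004-05-06", "<marc>...</marc>")
```

Every record of an item shares the item's identifier; only the metadata format and datestamp distinguish them.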

Data Providers / Service Providers

data providers (repositories)

service providers (harvesters)

Aggregators

[Figure: data providers (repositories), an aggregator, and service providers (harvesters)]

aggregators allow for:
• scalability for OAI-PMH
• load balancing
• community building
• discovery

Aggregators

• Frequently interchangeable terms:
  – aggregators: likely to be community / institutionally focused
  – caches: stores a copy, less likely to be community-oriented
  – proxies: less likely to store a copy, may gateway between OAI-PMH and other protocols
    • Dienst / OAI Gateway; Harrison, Nelson, Zubair, JCDL 03
• To learn more about aggregators, caches & proxies:
  – http://www.openarchives.org/OAI/2.0/guidelines-aggregator.htm
  – http://www.cs.odu.edu/~mln/jcdl03/
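A very small sketch of what an aggregator or cache does with the headers it harvests: merge several harvests and keep the newest datestamp seen for each identifier. It assumes datestamps are ISO 8601 strings of the same granularity, so that plain string comparison orders them; the `aggregate` function and the sample data are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Header = Tuple[str, str]  # (identifier, datestamp)


def aggregate(*harvests: Iterable[Header]) -> Dict[str, str]:
    """Merge harvested headers, keeping the newest datestamp per identifier."""
    merged: Dict[str, str] = {}
    for harvest in harvests:
        for identifier, datestamp in harvest:
            if identifier not in merged or datestamp > merged[identifier]:
                merged[identifier] = datestamp
    return merged


# Hypothetical harvests from two data providers.
provider_a = [("oai:a.example.org:1", "2004-01-01")]
provider_b = [("oai:a.example.org:1", "2004-02-01"), ("oai:b.example.org:7", "2003-12-31")]
merged = aggregate(provider_a, provider_b)
```

The merged view can then be re-exposed through OAI-PMH, so downstream service providers harvest from one place instead of from every data provider.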

Example Aggregators

• Arc - http://arc.cs.odu.edu/
  – first described “hierarchical harvesting” in D-Lib Magazine, 7(4) 2001
    • http://www.dlib.org/dlib/april01/liu/04liu.html
• Celestial - http://celestial.eprints.org/
  – among other services, it provides a history of harvests (successful vs. errors)
    • http://celestial.eprints.org/cgi-bin/status

OAI-PMH 2.0 Registration

Data Providers: http://www.openarchives.org/Register/BrowseSites.pl
Service Providers: http://www.openarchives.org/service/listproviders.html

75 repositories registered

??? unregistered repositories

unregistered because:
• testing / development
• not for public harvesting
• public, but “low-profile”
• never got around to it…
• ???

DP:SP ~= 5:1

Registration is Nice… …But Not Required

• OAI-PMH is (becoming) the “http” for digital libraries
  – there is no central registry of http servers
    • remember the NCSA “What’s New” page? (ca. 1994)
• There will never be “registration support” in OAI-PMH
  – registries are a type of service provider, built on top of OAI-PMH
  – registration will be an integral part of community building
  – friends…

NASA <friends> example

<friends>…</friends>

[Figure labels: harvester, Identify; repositories shown:]
• http://techreports.larc.nasa.gov/ltrs/oai2.0/
• http://naca.larc.nasa.gov/oai2.0/
• http://ntrs.nasa.gov/oai2.0/
• http://ston.jsc.nasa.gov/collections/TRS/oai/
• http://horus.riacs.edu/perl/oai/
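A harvester can discover further repositories by reading the `<friends>` container out of an Identify response. The sketch below assumes the container uses the namespace `http://www.openarchives.org/OAI/2.0/friends/` with one `<baseURL>` element per friend, as I understand the OAI implementation guidelines to describe. The response is hand-written and reuses two baseURLs from the slide; it is not real server output.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace assumed for the optional friends container.
FRIENDS = "{http://www.openarchives.org/OAI/2.0/friends/}"

# Hand-written Identify response (assumption, not captured from NTRS).
IDENTIFY_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Hypothetical report server</repositoryName>
    <baseURL>http://ntrs.nasa.gov/oai2.0/</baseURL>
    <description>
      <friends xmlns="http://www.openarchives.org/OAI/2.0/friends/">
        <baseURL>http://naca.larc.nasa.gov/oai2.0/</baseURL>
        <baseURL>http://techreports.larc.nasa.gov/ltrs/oai2.0/</baseURL>
      </friends>
    </description>
  </Identify>
</OAI-PMH>
"""


def friend_base_urls(identify_xml: str) -> List[str]:
    """Return the baseURLs listed in the friends container, in document order."""
    root = ET.fromstring(identify_xml.strip())
    return [el.text.strip() for el in root.iter(FRIENDS + "baseURL") if el.text]


friends = friend_base_urls(IDENTIFY_RESPONSE)
```

Because the lookup is namespace-qualified, the repository's own `<baseURL>` (in the OAI-PMH namespace) is not mistaken for a friend.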

Scientific Communication

• With only some exceptions, which interface is used for discovery is not as important as the fact that discovery occurred in the first place…
  – “control” of the discovered objects is not “lost” by data providers
    • however, higher level mirroring services can be built on top of OAI (cf. NACA & ARC mirroring between NASA LaRC and MAGiC)
• The real power of OAI-PMH derives as much from what it does not do as what it actually does

What Does OAI-PMH Mean for Authors?

• On the surface, absolutely nothing!
  – the ideal OAI deployment should be absolutely invisible to normal DL operations
  – uninterested users should not even notice or care
• Indirectly, they should enjoy the benefits of the critical mass of current and developing DL tools & systems
  – personal, institutional data providers
  – proliferation of targeted, value-added service providers

What Does OAI-PMH Mean For Editors?

• Absolutely everything…
• The decoupling of SPs and DPs will have significant and profound implications for scientific and technical information exchange
  – OAI-PMH is actually just one component in a larger engineering effort for scholarly communication (e.g. OpenURL)
• Service and resource integration will be the focus of journals, professional societies, universities, etc.
  – OAI-PMH will be as basic and core a technology for scientific publishing as http & XML

Field of Dreams

• It should be easy to be a data provider, even if it makes more work for the service provider.
  – if enough data providers exist, the service providers will come (DPs >> SPs)
• Open-source / freely available tools
  – “drop-in” data providers
    • at the end of this presentation
  – tools to make your existing DL a data provider:
    • http://www.openarchives.org/tools/tools.htm
    • also: OAI-implementers mailing list / mail archive!
  – service providers:
    • http://oaiarc.sourceforge.net/

OAI-PMH Meeting History

Topic                                     OAI Open Day,        2nd OAI Workshop,
                                          Washington DC,       CERN,
                                          1/2001               10/2002
Protocol definition, development tools    4                    1
DPs, retrofitting existing DLs            5                    4
SPs, new services                         1                    11
Socio-Economic-Political Issues           0                    6


Shift of Topics

• From the protocol itself, supporting & debugging tools and how to retrofit (existing) DLs…

• …to building (new) services that use the OAI-PMH as a core technology and reporting on their impact to the institution/community

• http://arc.cs.odu.edu/
• harvests all known archives
• first end-user service provider
• source available through SourceForge
• hierarchical harvesting

• http://www.ncstrl.org/
• metadata harvesting replacement for Dienst-based NCSTRL
• based on Arc
• computer science metadata

• http://archon.cs.odu.edu/
• physics metadata
• based on Arc
• features:
  – citation indexing
  – equation-based searching

• http://torii.sissa.it/
• physics metadata
• features
  – personalization
  – recommendations
  – WAP access

• http://icite.sissa.it/
• physics metadata
• features
  – citation based access to arXiv metadata

• http://citebase.eprints.org/
• arXiv metadata
• citation based indexing, reporting

• http://www.myoai.com/
• covers all registered metadata
• features
  – result sets
  – personalization
  – many other advanced features

• http://www.ercim.org/cyclades
• scientific metadata
• features
  – personalization
  – recommendations
  – collaboration
• status?

• http://oaister.umdl.umich.edu/
• harvests all known archives
• Mellon Foundation funded project
• Content-sharing agreement with Yahoo!
  – http://www.openarchives.org/pipermail/oai-general/2004-March/000371.html

Others…

• Commercial publishers
  – American Physical Society (APS)
  – Institute of Physics
  – Elsevier / Scirus (www.scirus.com)
  – BioMed Central
• US Govt
  – OSTI
  – LANL
  – PubMed Central
• Institutional servers
  – DARE (All Dutch universities)
  – California Digital Library

NACA Technical Report Server

• publicly available
  – began in 1996
  – details in NASA TM-1999-209127
• scanned reports from 1917-1958
  – NACA = predecessor to NASA
• contents mirrored with the MAGiC project
  – a UK-based grey-literature preservation project
  – OAI-PMH used to mirror contents

http://naca.larc.nasa.gov/

http://naca.larc.nasa.gov/oai2.0/

NACA Report 1345

as seen through its native DL
http://naca.larc.nasa.gov/

NACA Report 1345

as seen through MAGiC
http://www.magic.ac.uk/

NACA Report 1345

as seen through Scirus (Elsevier)
http://www.scirus.com/

NACA Report 1345

as seen through my.OAI (FS Consulting)
http://www.myoai.com/

NASA Technical Report Server

• replacement for the previous distributed searching version of NTRS
  – MySQL
  – Va Tech harvester
  – modified “bucket”
  – details in Nelson, Rocker, Harrison, Library Hi-Tech, 21(2) (March 2003)
• a service provider & aggregator
  – same OAI baseURL as used for interactive searching

http://ntrs.nasa.gov/

NASA Technical Report Server

• advanced, fielded search
• explicit query routing
  – 12 NASA repositories
  – 4 non-NASA repositories
    • turned “off” by default
• >600k abstracts; >300k full-text

NASA DLs in the Larger STI Realm

[Figure: NTRS linked to LTRS, ATRS, CASITRS, … and to DOE, DOD, Universities, Publishers, …, International]

• NTRS could also be a data provider from the point of view of other DLs, allowing the harvesting of NASA report metadata.
• NTRS could also harvest metadata from other DLs, and provide access to non-NASA content.
• We hope to influence the direction of the science.gov effort to use OAI-PMH
• this could be a fully connected graph

Service Providers

• It is clear that SPs are proliferating, despite (because of?) the inherent bias toward DPs in the protocol
  – easy to be a DP -> many DPs -> SPs eventually emerge
  – hard to be a DP -> SPs starve
  – currently 5x more DPs than SPs
• SPs are beginning to offer increasingly sophisticated services
  – competitive market originally envisioned for SPs is emerging


OAI-PMH & The Deep Web

Exposing Repository Contents

• DP9: Webcrawler access to OAI-PMH repositories
  – http://dlib.cs.odu.edu/dp9/
  – JCDL 02: http://www.cs.odu.edu/~liu_x/dp9/dp9.pdf
• An Apache module for OAI-PMH
  – http://www.modoai.org/
• Extensible Repository Resource Locators (ERRoLs) for OAI Identifiers
  – http://www.oclc.org/research/projects/oairesolver/default.htm

Race for This New Market…

• Yahoo! & University of Michigan
  – http://www.umich.edu/news/index.html?Releases/2004/Mar04/r031004
• Google & CrossRef
  – http://www.nature.com/nature/focus/accessdebate/17.html


OpenURL

slides from Herbert Van de Sompel, LANL

Origins & Motivation

The Context: Library Automation Environment anno 1998

• distributed information environment
• local & remote A&I databases
• rapidly growing e-journal collection
• need to interlink the available information

The Problem:

• links are delivered by info providers
• links are not sensitive to user’s context
  • appropriate copy problem
• links dependent on business agreements between information vendors
• links don’t cover the complete collection

Origins & Motivation

The Context: Library Automation Environment anno 1998

• distributed information environment
• local & remote A&I databases
• rapidly growing e-journal collection
• need to interlink the available information

The REAL Problem:

• libraries have no say in linking
• libraries are losing core part of the “organizing information” task
• expensive collection is not used optimally
• users are not well served

Origins & Motivation

The Solution:

In information services:

• DO NOT provide a link which is an actual service related to a referenced item (e.g. a link from a record in an A&I database to the corresponding full-text)
• BUT rather provide
  • a link that transports metadata about the referenced item (OpenURL)
  to
  • others that are better placed to provide service links (linking server operated by library)
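To make "a link that transports metadata" concrete, here is a minimal sketch that packs citation metadata into the query string of a link aimed at a library's linking server. The resolver address and the citation values are made up; the key names (`genre`, `issn`, `volume`, `spage`) follow my understanding of OpenURL 0.1 conventions and should be checked against the specification before use.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def make_openurl(resolver_base: str, **metadata: str) -> str:
    """Build a link that carries metadata about the referenced item, not its location."""
    # Keys are sorted only to make the output deterministic.
    return resolver_base + "?" + urlencode(sorted(metadata.items()))


# Hypothetical resolver and citation values.
link = make_openurl(
    "http://resolver.example.edu/menu",
    genre="article",
    issn="1234-5678",
    volume="6",
    spage="1",
)

# The linking server can recover the metadata and decide which services to offer.
received = parse_qs(urlsplit(link).query)
```

The information service only has to know the address of the user's linking server; which full-text copy or other service to present is decided there, in the user's own context.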

non-OpenURL linking

[Figure: at the link source, a reference's metadata is resolved into a link; the link points to the referenced work (resource) at the link destination.]

OpenURL linking

[Figure: the link source provides an OpenURL for a reference. The OpenURL transports metadata & identifiers to an OpenURL linking server, which performs user-specific, context-sensitive resolution of the metadata & identifiers into services, shown as links to several link destinations.]

Evolution ~ 1998

• Nature of solution determined
• Experiment with local databases at Ghent University
• Demonstrated October 1998 at Belgian Library meeting
• Problem statement & Experiment described in 2 D-Lib Magazine papers, April 1999

Evolution ~ 1999

• Feasibility of solution tested in 2 complex environments
• Experiments:
  • SFX@Ghent & SFX@LANL: LANL, Ghent, APS, Wiley, SilverPlatter, Ex Libris
  • UPS Prototype: arXiv, SLAC/SPIRES, LANL, Ghent, …
• Demonstrated:
  • June 1999 at ALA LiTA session, New Orleans
  • October 1999 at OAI meeting, Santa Fe
• Experiments described in 2 D-Lib Magazine papers, October 1999 and February 2000

Evolution ~ 2000

• OpenURL 0.1 released
• Quick adoption of OpenURL 0.1 in information community
• SFX linking server goes beta

Evolution ~ 2001

• Integration of OpenURL Framework and DOI/CrossRef framework
• Experiment involving CNRI, LANL, OhioLink, Academic Press, Ex Libris, …
• DOI/OpenURL integration described in 2 D-Lib Magazine papers, March 2001 and September 2001
• First non-SFX linking servers appear

Evolution ~ 2001

• Proposal to standardize OpenURL
• Generalization of OpenURL Framework concepts beyond scholarly information community
• Described in: Van de Sompel, Herbert and Beit-Arie, Oren. Generalizing the OpenURL Framework beyond References to Scholarly Works: the Bison-Futé model. July/August 2001. D-Lib Magazine.
• NISO AX Committee starts standardization of the OpenURL Framework using the Bison-Futé model as the basis of its work.

NISO OpenURL Standardization Charge

• Use existing “OpenURL Framework” as starting point
  • notion of context-sensitive services
  • notion of transporting “contextual” metadata packages to obtain context-sensitive services
• Define syntax and transport-method for “contextual” metadata packages
• Ensure extensibility:
  • must support future applications
  • must support other information communities

=> Generalize and Standardize


NISO OpenURL Standardization Charge

Therefore, to be addressed were:

• OpenURL Framework beyond scholarly resources

• “contextual” metadata packages

• Syntax for “contextual” metadata packages

• Transport of “contextual” metadata packages

[Figure: a metadata plane in which resource1, resource2, and resource3 are connected by default links. Credit: herbert van de sompel]

default links:
• restricted in nature
• action-radius restricted by business agreements
• not context-sensitive

[Figure: a metadata plane (resource1, resource2, resource3, connected by default links) and an extended services plane (service component1, service component2). OpenURL connects the two planes, and the extended services plane supplies appropriate links. Credit: herbert van de sompel]


Naming: Handles & DOIs

Naming

• Fundamental to other technologies (OAI-PMH, OpenURL, etc.)
• Options
  – URNs
  – Persistent URLs (PURLs)
    • http://purl.org/
  – Handles
    • http://www.handle.net/
  – Digital Object Identifiers
    • http://www.doi.org/
  – ARK
    • http://www.cdlib.org/inside/diglib/ark/

“Inverted Archives”

• Unit of discourse is no longer an archive or service, but a DOI which has services linked from it
  – cf.:
    • UPS demonstration prototype
    • “Smart Objects, Dumb Archives” (SODA) model


Object Models

Popular Object Models

• METS
  – used in DSpace, Fedora
  – http://www.loc.gov/standards/mets/
• MPEG-21 DIDL
  – http://xml.coverpages.org/mpeg21-didl.html
  – used in LANL DLs
    • http://www.dlib.org/dlib/november03/bekaert/11bekaert.html
    • http://www.dlib.org/dlib/february04/bekaert/02bekaert.html
    • http://lib-www.lanl.gov/~herbertv/papers/jcdl2004-submitted-draft.pdf

Object Models & OAI-PMH

[Figure: a resource; its item (oai:foo.edu:1234); and the item's records, including a METS record]

Move from simple metadata files “pointing” to resources…

…to records as “modeled representations” of resources


Download and Go!

Where Do You Want to Build?

[Figure: a user, a service provider, several data providers, and local context-sensitive services; product labels shown: EPrints.org, CDSware]

Fedora

• joint project between Cornell & UVa
  – funded by the Mellon Foundation
• a repository management system
  – focuses on complex digital objects and their behaviors
• more info:
  – http://www.fedora.info/
  – D-Lib Magazine, 9(4)
    • http://www.dlib.org/dlib/april03/staples/04staples.html

DSpace

• MIT + HP Labs
• constructed to capture all the output of MIT’s faculty
• now generalized to the DSpace Federation
  – 8 top universities in the US & Canada
• More info:
  – http://www.dspace.org/
  – http://sourceforge.net/projects/dspace/
  – D-Lib Magazine 9(1)
    • http://www.dlib.org/dlib/january03/smith/01smith.html

EPrints.org

• developed at Southampton University
  – part of larger suite of institutional/author self-archiving tools and services
    • e.g.: citebase; paracite
• widely adopted -- 100+ sites
  – http://software.eprints.org/#ep2
• more info
  – http://www.eprints.org/
  – http://www.arl.org/sparc/core/index.asp?page=g20#6

CDSware

• developed at CERN
• data provider & service provider
• large-scale use @ CERN (> 600k records)
  – in use at a few non-CERN sites
• free & paid support models
• more info
  – http://cdsware.cern.ch/

Kepler

• P2P publishing for academia
  – community servers for coordination, management
  – archivelets for individual laptops, PCs
• more info:
  – http://kepler.cs.odu.edu/
  – D-Lib Magazine 7(4)
    • http://www.dlib.org/dlib/april01/maly/04maly.html

OpenResolver

• developed by UKOLN
  – open source
• OpenURL 0.1 format resolver
  – NISO 1.0 format???
• more info:
  – Ariadne, 28
    • http://www.ariadne.ac.uk/issue28/resolver/
    • ftp://ftp.ukoln.ac.uk/metadata/tools/openresolver/
    • http://www.ukoln.ac.uk/distributed-systems/openurl/


Conclusions

Why The OAI-PMH is NOT Important

• Users don’t care
• OAI-PMH is middleware
  – if done right, the uninterested user should never have to know

[Graphic: “OAI Inside”]

• Using OAI-PMH does not ensure a good SP
• OAI-PMH is (or is becoming) HTTP for DLs
  – few people get excited about http now
• http & OAI-PMH are core technologies whose presence is now assumed

Other Uses For the OAI-PMH

• Assumptions:
  – Traditional DLs / SPs will continue on their present path of increasing sophistication
    • citation indexing, search results viz, personalization, recommendations, subject-based filtering, etc.
  – growth rates remain the same (5x DPs as SPs)
• Premise: OAI-PMH is applicable to any scenario that needs to update / synchronize distributed state
  – Future opportunities are possible by creatively interpreting the OAI-PMH data model
    • See Van de Sompel, Young & Hickey, D-Lib Magazine July 2003, http://www.dlib.org/dlib/july03/young/07young.html
    • Nelson, 2nd OAI Workshop, http://agenda.cern.ch/askArchive.php?base=agenda&categ=a02333&id=a02333s5t8/transparencies
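The "synchronize distributed state" premise can be sketched as an incremental update: apply the changes reported since the last harvest, honour deletions, and remember the newest datestamp to use as the `from` argument of the next request. This is an illustrative sketch only; `apply_changes`, the change tuples, and the sample values are assumptions, and datestamps are taken to be ISO 8601 strings of equal granularity so that string comparison orders them.

```python
from typing import Dict, Iterable, Optional, Tuple

Change = Tuple[str, str, bool]  # (identifier, datestamp, deleted)


def apply_changes(state: Dict[str, str], changes: Iterable[Change],
                  last_sync: Optional[str] = None) -> Optional[str]:
    """Update local state in place; return the datestamp to harvest from next time."""
    newest = last_sync
    for identifier, datestamp, deleted in changes:
        if deleted:
            state.pop(identifier, None)    # the remote side reported a deletion
        else:
            state[identifier] = datestamp  # new or modified on the remote side
        if newest is None or datestamp > newest:
            newest = datestamp
    return newest


# Hypothetical local state and one batch of changes harvested since 2004-01-01.
local_state = {"oai:example.org:1": "2004-01-01"}
next_from = apply_changes(
    local_state,
    [("oai:example.org:2", "2004-03-01", False),
     ("oai:example.org:1", "2004-04-15", True)],
    last_sync="2004-01-01",
)
```

Nothing in this loop is specific to bibliographic metadata, which is the point of the premise: any state that can be described as identified, datestamped records could be kept in sync the same way.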

OpenURL Framework evolution

A spec based on HTTP GET to transport metadata about
• a scholarly referent &
• the context in which the referent is referenced

Draft Van de Sompel, Beit-Arie, Hochstenbach - 05/2001

A framework Standard that enables different Communities to:
• describe a referent
• describe the context in which the referent is referenced
• transport these descriptions

NISO Draft Standard - 04/2003

The Future: Community Building

• Ultimately, protocols and metadata formats are not what makes a difference
• Rather, the critical mass afforded by a common set of utilities (cf. http, Dublin Core, XML)
• The best current example: The Open Language Archives Community
  – http://www.language-archives.org/
• OAI-PMH provides the basis for communication between strangers, but allows even richer communication between friends

Further Reading

• Gerry McKiernan, Library Hi-Tech News
  – http://www.public.iastate.edu/~gerrymck/OAI-SP-I.pdf
  – http://www.public.iastate.edu/~gerrymck/OAI-SP-II.pdf
  – http://www.public.iastate.edu/~gerrymck/OAI-SP-III.pdf
• Open Archives Forum OAI-PMH Tutorial
  – http://www.oaforum.org/tutorial/
• “A Survey of Digital Library Aggregation Services”
  – http://www.diglib.org/pubs/brogan/
• Open Access News
  – http://www.earlham.edu/~peters/fos/fosblog.html
• Guide To Institutional Repository Software
  – http://www.soros.org/openaccess/software/

Great Stuff I Did Not Cover…

• OAI-PMH
  – Static Repositories
    • http://www.openarchives.org/OAI/2.0/guidelines-static-repository.htm
  – OAI-Rights
    • http://www.openarchives.org/documents/OAIRightsWhitePaper.html
    • http://www.openarchives.org/news/oairightspress030929.html
• Digital Preservation
  – http://www.digitalpreservation.gov/