Preservation Rumination Priscilla Caplan, FCLA OCLC DSS February 16, 2005

Page 1

Preservation Rumination

Priscilla Caplan, FCLA

OCLC DSS, February 16, 2005

Page 2

Preservation Basics

Page 3

THE NEED FOR DIGITAL PRESERVATION

Number of academic/scholarly journals published online: 15,757

Percent of U.S. federal government publications produced only online in 2003: 65 percent

Estimated percent of U.S. federal government publications available only online by 2008: 90 percent

From: California Digital Library, http://www.cdlib.org/inside/projects/preservation

Page 4

The problem of abundance

[Figure: bar chart of items (millions), scale 0 to 4,500, comparing LoC and the Web]

Page 5

The problem of ephemerality

• Percent of web-based references in scientific articles from 3 major journals inaccessible within 2 years of publication: 21%

• Proportion of websites in 1998 gone in 1999: 44%

• Life of an average website: 44 days

Page 6

The problems of media life expectancy and obsolescence

Page 7

The problem of format obsolescence

Page 8

[Figure: preservation methods plotted on two axes. OBJECTIVE runs from Preserve Technology to Preserve Objects; APPLICABILITY runs from Specific to General.]

Methods shown:

• Maintain original technology
• Programmable Chips
• Emulation
• Viewer
• Re-engineer Software
• Virtual Machine
• Universal Virtual Computer
• Version Migration
• Format Standardization
• Rosetta Stone Translation
• Typed Object Conversion
• Persistent Archives
• Object Interchange Format

Source: Thibodeau, 2002.

Page 9

The problem of rights

Page 10

The Preservation Pyramid

[Figure: pyramid diagram with the labels below]

Integrity

Viability

Renderability

Description

Secure storage

Media management

Preservation strategies

Availability

Identity

Capture

Selection

Page 11

Authenticity

Page 12

Traditionally, preserving things meant keeping them unchanged; however … if we hold on to digital information without modifications, accessing the information will become increasingly more difficult, if not impossible.

From: The Paradox of Preservation, Su-Shing Chen

Page 13

Understandability

“Preservation metadata ... is the information necessary to maintain the viability, renderability, and understandability of digital resources over the long-term.”

OCLC/RLG Preservation Metadata Framework Working Group

Page 14

Revised Preservation Pyramid

[Figure: pyramid diagram with the labels below]

Integrity

Viability

Renderability

Description

Secure storage

Media management

Availability

Identity

Capture

Selection

Understandability

Authenticity

Preservation strategies

Page 15

Who is doing preservation?

Research Libraries

Government Archives

Historical Societies

Individual Collectors

Page 16

Who is doing digital preservation?

Research Libraries

Government Archives

Historical Societies

Individual Collectors

National Libraries

Research Centers

Public broadcasting

Page 17

Integrity

Viability

Renderability

Description

Secure storage

Media management

Availability

Identity

Capture

Selection

Understandability

Authenticity

Preservation strategies

DSPACE

Page 18

Integrity

Viability

Renderability

Description

Secure storage

Media management

Availability

Identity

Capture

Selection

Understandability

Authenticity

Preservation strategies

LOCKSS

Page 19

Integrity

Viability

Renderability

Description

Secure storage

Media management

Availability

Identity

Capture

Selection

Understandability

Authenticity

Preservation strategies

OCLC Digital Archive

Page 20

Integrity

Viability

Renderability

Description

Secure storage

Media management

Availability

Identity

Capture

Selection

Understandability

Authenticity

Preservation strategies

LC Minerva

Page 21

Integrity

Viability

Renderability

Description

Secure storage

Media management

Availability

Identity

Capture

Selection

Understandability

Authenticity

Preservation strategies

FCLA Digital Archive

Page 22

Preservation in Action

Page 23

Page 24

State Universities

FCLA

Page 25

• Designed as a “dark archive”
• Preservation repository functions only
• Based on OAIS functional architecture
• “Bit-level” and “Full” preservation
• Format migration and normalization

Page 26

OAIS Functional Architecture

[Figure: OAIS functional entities diagram]

External entities: PRODUCER, CONSUMER, MANAGEMENT

Functional entities: Ingest, Data Management, Archival Storage, Access, Administration, Preservation Planning

Flows shown: SIP, AIP, DIP, Descriptive Info, queries, result sets, orders
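The package flow in the OAIS diagram can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `aip-N` identifier scheme and the contents of the descriptive info are hypothetical, and Administration and Preservation Planning are left out entirely. A Producer submits a SIP, Ingest turns it into an AIP in Archival Storage and records Descriptive Info in Data Management, and a Consumer queries and then orders a DIP.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SIP:
    """Submission Information Package, as handed over by a Producer."""
    producer: str
    files: dict[str, bytes]


@dataclass
class AIP:
    """Archival Information Package, as held in Archival Storage."""
    aip_id: str
    files: dict[str, bytes]
    descriptive_info: dict[str, str]


@dataclass
class DIP:
    """Dissemination Information Package, as delivered to a Consumer."""
    aip_id: str
    files: dict[str, bytes]


@dataclass
class Archive:
    # Hypothetical in-memory stand-ins for Archival Storage and Data Management.
    storage: dict[str, AIP] = field(default_factory=dict)
    catalog: dict[str, dict[str, str]] = field(default_factory=dict)

    def ingest(self, sip: SIP) -> str:
        """Ingest: build an AIP from a SIP and record its descriptive info."""
        aip_id = f"aip-{len(self.storage) + 1}"  # assumed identifier scheme
        info = {"producer": sip.producer, "file_count": str(len(sip.files))}
        self.storage[aip_id] = AIP(aip_id, dict(sip.files), info)
        self.catalog[aip_id] = info
        return aip_id

    def query(self, producer: str) -> list[str]:
        """Access: answer a Consumer query with a result set of AIP ids."""
        return [i for i, info in self.catalog.items() if info["producer"] == producer]

    def order(self, aip_id: str) -> DIP:
        """Access: fulfil a Consumer order by deriving a DIP from the stored AIP."""
        aip = self.storage[aip_id]
        return DIP(aip.aip_id, dict(aip.files))
```

A Producer would call `ingest`, and a Consumer would call `query` followed by `order`; a real archive would of course persist both stores rather than hold them in memory.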

Page 27

DAITSS Functional Architecture

[Figure: DAITSS functional architecture diagram, with LIBRARY shown on both sides]

Components: Ingest, Storage management, Access, Reporting, Mgmt DB

Packages: SIP, AIP, DIP

Page 28

DAITSS Data Model

[Figure: data model diagram]

Information Package

Intellectual entity (1)

Data File (1..n)

Bitstream (0..n)
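One way to read the cardinalities on this slide is sketched below as Python dataclasses. The reading itself is an assumption: an Information Package stands for exactly one intellectual entity, holds one or more Data Files, and each Data File contains zero or more Bitstreams. The field names are hypothetical and are not taken from DAITSS.

```python
from dataclasses import dataclass, field


@dataclass
class Bitstream:
    kind: str  # hypothetical label, e.g. "image" or "audio"


@dataclass
class DataFile:
    name: str
    # 0..n: a file may contain no separately described bitstreams.
    bitstreams: list = field(default_factory=list)


@dataclass
class InformationPackage:
    intellectual_entity: str  # exactly one per package
    data_files: list          # 1..n

    def __post_init__(self):
        # Enforce the lower bound of the 1..n cardinality.
        if not self.data_files:
            raise ValueError("an Information Package needs at least one Data File")
```

For example, `InformationPackage("thesis-001", [DataFile("thesis.pdf")])` is valid, while an empty file list raises `ValueError`.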

Page 29

DAITSS Data File Object

[Figure: class diagram with the labels Data File, Text File, PDF File, TIFF File, Markup File, DTD, XML, SGML]

DAITSS Bitstream Object

[Figure: class diagram with the labels Bitstream, Audio, Image, Text, Video, JPEG Image, TIFF Image]

Page 30

Risk Management

• Storing multiple master copies of files
• Calculating two message digests
• Storing metadata as XML and in RDBMS
• Normalizing when possible
• Always retaining original
• Action plans and background papers
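The "two message digests" item can be illustrated with Python's standard `hashlib`. The slide does not say which algorithms are used, so MD5 and SHA-1 here are assumptions, as are the function names. The point of keeping two digests is that a file is only treated as intact when both independently computed values match the recorded ones.

```python
import hashlib


def two_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, sha1_hex) for a file, reading it once in chunks."""
    md5 = hashlib.md5()    # assumed algorithm, not named on the slide
    sha1 = hashlib.sha1()  # assumed algorithm, not named on the slide
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def is_intact(path, recorded):
    """True only if both freshly computed digests match the recorded pair."""
    return two_digests(path) == tuple(recorded)
```

In practice the recorded pair would be stored with the file's metadata at ingest and rechecked against each master copy on a schedule.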

Page 31

Ingest Functions

• METS validation and metadata extraction
• Virus check and checksum verification
• File format identification
• Creation of Data File and Bitstream objects
• Harvesting of external files
• Normalization and Forward Migration
• Technical, relationship and event metadata
• AIP creation
• Storage update
• Data table update
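As a rough illustration of the "file format identification" step, the sketch below matches the leading bytes of a file against a few well-known signatures. It is not how DAITSS identifies formats; the table and the function name are hypothetical, and a production tool would rely on a much larger signature registry. Note that the XML check only recognises files that begin with an XML declaration and no byte-order mark.

```python
# Signatures checked against the start of the file.
MAGIC = [
    (b"%PDF-", "PDF"),
    (b"II*\x00", "TIFF"),  # little-endian TIFF
    (b"MM\x00*", "TIFF"),  # big-endian TIFF
    (b"<?xml", "XML"),     # only when an XML declaration comes first
]


def identify_format(header: bytes) -> str:
    """Guess a format name from the first bytes of a file."""
    for magic, name in MAGIC:
        if header.startswith(magic):
            return name
    # AVI is a RIFF container whose form type "AVI " sits at offset 8.
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "AVI"
    return "unknown"
```

Reading the first 12 bytes of each submitted file and passing them to `identify_format` is enough for these signatures.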

Page 32

Ingest Example: A simple SIP

[Figure: a SIP containing XML, PDF and AVI files]

Page 33

[Figure: the same SIP (XML, PDF, AVI) alongside the resulting AIP, whose labels are XML (six times), TIFF (three times) and Database]

Page 34

Future Plans

Find partners to install at other places

Finish DAITSS

Release under open source license

Build a community of developers for different formats

Page 35

References

Priscilla Caplan: www.fcla.edu/~pcaplan, [email protected]

FCLA Digital Archive: www.fcla.edu/digitalArchive

Terry Kuny, “A Digital Dark Ages?” www.ifla.org/IV/ifla63/63kuny1.pdf

PREMIS Implementation Survey: www.oclc.org/research/projects/pmwg/surveyreport.pdf

Roy Rosenzweig, “Scarcity or Abundance?” www.historycooperative.org/journals/ahr/108.3/rosenzweig.html

O’Neill et al., “Trends in the Evolution of the Public Web” www.dlib.org/dlib/april03/lavoie/04lavoie.html

Clifford Lynch, “Authenticity and Integrity in the Digital Environment” www.clir.org/pubs/reports/pub92/lynch.html