Digital Collections: Storage and Access Jon Dunn Assistant Director for Technology IU Digital Library Program [email protected]

Digital Collections: Storage and Access

Jon Dunn
Assistant Director for Technology
IU Digital Library Program
[email protected]


October 2, 2003 ALI Digital Library Workshop

Storage

Why is storage an issue?
  Space requirements
  Persistence
  Accessibility
Needs depend on purpose of storage
  Capture/encoding
  Access/delivery
  Preservation

Storage: Working Space

Space for storage of digital files during capture/encoding/quality control process
Possibilities
  PC hard drive
  File server / LAN
Issues
  Capacity, backup, speed, accessibility

Storage: Access/Delivery

Storage of derivative files for web delivery
  Image, audio, video, text files, etc.
Possibilities
  Local web server
  Commercially-hosted web site
  Consortial service provider
Issues: capacity, backup, performance, software integration, maintenance/migration

Storage: Preservation

Much harder problem
Longer term
  Issues of longevity of media, hardware, file format
  “Where did we put the files?”
Larger files
  Hard disk storage, traditional backup methods not cost-effective
Infrequency of access
  Problems do not become immediately evident

Long-Term Storage Options

Removable media stored offline
  Optical
    CD-R (CD-Recordable)
    DVD-R (DVD-Recordable), DVD+R, DVD+RW, DVD-RW, …
  Tape
    DLT, 8mm, DAT, …
  Pros: cheap, easy, produces tangible item
  Cons: low capacity, physical space requirements, unknown longevity, migration, potential format obsolescence
Online/nearline storage systems
  HSM: Hierarchical Storage Management
    Combine disk and automated tape storage with software to keep track of where files are located
  Locally managed or remote provider
  Pros: high capacity, migration can be handled by software
  Cons: expensive, complex, network bandwidth issues, must trust service provider, potential single point of failure
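The HSM idea above (disk plus tape, with software tracking where each file lives) can be illustrated with a toy model. This is only a sketch under stated assumptions: the class name `ToyHSM`, its methods, and the least-recently-used eviction rule are invented for illustration, and real HSM software is far more elaborate.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHSM:
    """Toy catalog: a small disk cache in front of a tape archive (hypothetical)."""
    disk_capacity: int                        # max number of files held on disk
    disk: list = field(default_factory=list)  # least recently used file first
    tape: set = field(default_factory=set)    # every stored file has a tape copy

    def _touch(self, name: str) -> None:
        # Move the file to the "most recently used" end of the disk cache,
        # evicting the least recently used files if the disk is over capacity.
        if name in self.disk:
            self.disk.remove(name)
        self.disk.append(name)
        while len(self.disk) > self.disk_capacity:
            self.disk.pop(0)  # the tape copy remains, so nothing is lost

    def store(self, name: str) -> None:
        self.tape.add(name)
        self._touch(name)

    def locate(self, name: str) -> str:
        # The catalog answers "where is the file right now?"
        if name in self.disk:
            return "disk"
        if name in self.tape:
            return "tape"
        raise KeyError(name)

    def read(self, name: str) -> str:
        # Reading a tape-only file stages it back onto disk.
        where = self.locate(name)
        self._touch(name)
        return where


hsm = ToyHSM(disk_capacity=2)
for filename in ["a.tif", "b.tif", "c.tif"]:
    hsm.store(filename)
```

With a two-file disk, storing a third file pushes the oldest one to tape only; reading it again brings it back to disk, which is the kind of bookkeeping the slide says the software handles.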

HSM Example: IU’s Massive Data Storage Service (MDSS)

HPSS (High Performance Storage System) software
  Developed as collaboration of IBM and US national labs
Four tape robots
  2 in Bloomington, 2 in Indianapolis
  Data can be mirrored
540 terabytes (TB) total storage
  ~75 TB used as of April 2001


A digital object is more than just a file!

Metadata

Delivery page image files (JPEG)

Hi-res page image files (TIFF)

Text file (TEI/XML)

A digital object is more than just a file!

EAD Finding Aid

DL Objects

Digital library “objects” have many parts
  Metadata
  Preservation/archival files
  Delivery files
How do we keep them connected?
  Now: Good practice in file naming, directory organization, project documentation - not scalable!
  Future: Digital object repository
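As an illustration of the "good practice in file naming" approach, the sketch below groups the files of each object by parsing their names. The naming convention (`<object-id>_<4-digit page>.<ext>`, plus `<object-id>.xml` for the text file), the mapping of extensions to roles, and the function name `group_by_object` are all assumptions made for this example, not a convention stated in the slides.

```python
import re
from collections import defaultdict

# Hypothetical convention: abc001_0001.tif, abc001_0001.jpg, abc001.xml
NAME_RE = re.compile(
    r"^(?P<obj>[a-z0-9]+)(?:_(?P<page>\d{4}))?\.(?P<ext>tif|jpg|xml)$"
)

# Assumed roles: TIFF = preservation master, JPEG = delivery copy, XML = text
ROLE = {"tif": "preservation", "jpg": "delivery", "xml": "text"}


def group_by_object(filenames):
    """Return ({object_id: {role: [filenames]}}, [names that break the convention])."""
    objects = defaultdict(lambda: defaultdict(list))
    unmatched = []
    for name in filenames:
        match = NAME_RE.match(name)
        if match is None:
            unmatched.append(name)  # cannot be tied to any object
            continue
        objects[match["obj"]][ROLE[match["ext"]]].append(name)
    return objects, unmatched


files = [
    "abc001_0001.tif",
    "abc001_0001.jpg",
    "abc001_0002.tif",
    "abc001.xml",
    "scan final.tif",  # breaks the convention
]
objects, unmatched = group_by_object(files)
```

The one misnamed file ends up in `unmatched`, disconnected from its object. That fragility is one reason the slide calls this approach not scalable.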

Data Persistence

Key is migration
Keeping the bits alive
  Physical media
  Logical media format
Keeping the bits understandable
  File format
  Metadata
Small “pockets” of digital content pose a problem for migration
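One common way to check that the bits survive a migration between media is to compare checksums before and after the copy. The slides do not prescribe a method; the sketch below assumes SHA-256 and uses the hypothetical helper names `sha256_of` and `migrate`.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large master files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(src: Path, dst_dir: Path) -> Path:
    """Copy src into dst_dir and confirm the copy is bit-for-bit identical."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / src.name
    before = sha256_of(src)
    shutil.copy2(src, dst)  # copies contents and basic file metadata
    if sha256_of(dst) != before:
        raise IOError(f"checksum mismatch after copying {src} to {dst}")
    return dst
```

This only addresses keeping the bits alive. Keeping them understandable (file format, metadata) needs separate attention, as the slide notes.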

DL Object Repository

[Diagram labels: Preservation version in HSM; Delivery version(s) on web server; Metadata records; Repository System; Users and applications]

Web Delivery Functions

Searching
  Metadata
  Full text
Browsing
  By subject, date, author, …
Navigation
  Page turning, image panning/zooming, …
Streaming
  For audio/video
Reuse
  Downloading, format conversion
  Linking, persistent naming
Access control
  If necessary

Digital Collection Delivery Software

Very complex systems
  Need to integrate data from databases, full-text search engines, file systems, and other sources
  Cross-collection searching
Commercial
  ContentDM, Luna Insight, various library management system add-ons
Open source
  UMich DLXS, Greenstone, Eprints, MIT DSpace, …
Homegrown

Demonstration

Hoagy Carmichael Collection, IU Digital Library Program
http://www.dlib.indiana.edu/collections/hoagy/

Exposing Digital Resources Broadly

Pay services
  RLG Cultural Materials, Archival Resources
Free services
  University of Michigan OAIster
    www.oaister.org
  UIUC Digital Gateway to Cultural Heritage Materials
    oai.grainger.uiuc.edu
OAI-PMH
  Open Archives Initiative Protocol for Metadata Harvesting
  www.openarchives.org
Google

OAI Metadata Harvesting

Extract metadata from various sources
Build services on local copies of metadata

[Diagram: a user sends a search for “Indiana” to a service provider, which holds a local copy of metadata harvested offline from several data providers; all searching, browsing, etc. is performed on the metadata at the service provider.]
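The harvesting model can be sketched from the service provider's side: build an OAI-PMH `ListRecords` request, parse the response into a local copy of the metadata, and search that copy. The base URL, the sample response, and the function names are made up for illustration. The sketch parses an embedded sample rather than contacting a real data provider, and it ignores details such as resumption tokens and error responses.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response in the shape of an OAI-PMH ListRecords reply (oai_dc format)
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Indiana county atlas</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sheet music collection</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build the harvest request a service provider would send to a data provider."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_records(xml_text: str) -> list:
    """Turn a ListRecords response into a simple local copy of the metadata."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        titles = [t.text or "" for t in rec.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    return records


def search(records: list, term: str) -> list:
    """Search runs on the local copy only; data providers are not contacted."""
    term = term.lower()
    return [r["identifier"] for r in records
            if any(term in title.lower() for title in r["titles"])]


local_copy = extract_records(SAMPLE_RESPONSE)
hits = search(local_copy, "Indiana")
```

In practice the response would be fetched from each data provider's base URL on a schedule, and the search would run against an index built from all harvested records.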

More Information

Bibliography to be made available at:
http://www.dlib.indiana.edu/workshops/alioct03/