Taming the Monster: Digital Preservation Planning and Implementation Tools

Oct 21, 2014

Given at Council of UW Libraries "One System, One Library" conference, June 2011.
Transcript
Page 1: Taming the Monster: Digital Preservation Planning and Implementation Tools

Taming the Monster: Digital Preservation Planning and Implementation Tools

Dorothea Salo
One System, One Library

2 June 2011

Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0

Page 2: Taming the Monster: Digital Preservation Planning and Implementation Tools

Why is this so scary?

Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0

Page 3: Taming the Monster: Digital Preservation Planning and Implementation Tools

Isn’t this just as scary?

Photo: “News Paper Origami Dragon Monster” http://www.flickr.com/photos/epsos/3777343342/ epSos.de / CC-BY 2.0

Page 4: Taming the Monster: Digital Preservation Planning and Implementation Tools

Yet we persevere.

Photo: “News Paper Origami Dragon Monster” http://www.flickr.com/photos/epsos/3777343342/ epSos.de / CC-BY 2.0

Page 5: Taming the Monster: Digital Preservation Planning and Implementation Tools

DIGITAL IS NO DIFFERENT.

Photo: “559 - The Matrix - Seamless Texture” http://www.flickr.com/photos/zooboing/4335531915/ Patrick Hoesly / CC-BY 2.0

Page 6: Taming the Monster: Digital Preservation Planning and Implementation Tools

Many of the same ideas apply...

• Planning and policy
• Risk assessment
• Risk management
  • (knowing that we can’t save everything)
• Materials quality matters!
• Problem discovery and remediation
• Crisis management
• Chief problems: staff, $$$, organizational commitment

Photo: “Where I Teach” http://www.flickr.com/photos/eklektikos/2541408630/ Todd Ehlers / CC-BY 2.0

Page 7: Taming the Monster: Digital Preservation Planning and Implementation Tools

Planning and assessment tools

Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0

Page 8: Taming the Monster: Digital Preservation Planning and Implementation Tools

Scene-setting

•Rosenthal, David. “Requirements for Digital Preservation: a Bottom-Up Approach.”

•http://www.dlib.org/dlib/november05/rosenthal/11rosenthal.html

• If you’re new to this, or trying to find your feet, this is the best short introduction I know.

•The list of threats is outstanding.

Photo: “Bottoms Up! - Duck; San Anton Gardens, Malta” http://www.flickr.com/photos/foxypar4/3123113762/ John Haslam / CC-BY 2.0

Page 9: Taming the Monster: Digital Preservation Planning and Implementation Tools

TRAC

• “Trustworthy Repositories Audit & Certification: Criteria and Checklist”
• Despite the name, covers a LOT more than the technology!
  • Budget
  • Staffing
  • “designated communities”
• CRL will audit you, if you like
  • (don’t, unless you’re really serious!)
• http://catalog.crl.edu/record=b2212602~S1

Page 10: Taming the Monster: Digital Preservation Planning and Implementation Tools

DRAMBORA

• Digital Repository Audit Method Based on Risk Assessment
• A “self-test,” if you will.
  • DRAMBORA is equally good as a pre- or post-test.
• Personally, I prefer DRAMBORA to TRAC, especially for those just starting out.
• http://www.repositoryaudit.eu/
  • (registration required for toolkit access)

Page 11: Taming the Monster: Digital Preservation Planning and Implementation Tools

Coping with file formats

Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0

Page 12: Taming the Monster: Digital Preservation Planning and Implementation Tools

The one acronym you need to know: FITS

• “File Information Tool Set”
  • (you need to know this; otherwise it’s hard to Google)
• Wrapper for several file-format detector software packages
• Intended to be baked into other software
• It’s early days yet!
  • (This means you can’t always trust what the tools tell you, especially when they’re telling you about errors.)

Page 13: Taming the Monster: Digital Preservation Planning and Implementation Tools

What’s this file?

•wotsit.org “The Programmer’s File and Data Resource”

• Directory of file extensions
• When in doubt: open in a browser or text editor and see what you get.
  • N.b.: Microsoft Word is NOT a text editor!
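Opening a mystery file in a text editor mostly means looking at its first few bytes. The Python sketch below does that programmatically. The signature table is a small, hand-picked sample of well-known "magic numbers," not a registry, and the names `guess_format` and `sniff` are invented for illustration. Treat the answer as a hint, not a verdict.

```python
# A few well-known file signatures ("magic numbers"). Far from exhaustive.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"PK\x03\x04", "ZIP container (also .docx, .odt, .epub)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 container (legacy .doc, .xls, .ppt)"),
]


def guess_format(data: bytes) -> str:
    """Guess a format from the leading bytes of a file."""
    if not data:
        return "empty file"
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    # Crude heuristic: NUL bytes almost never appear in plain text.
    if b"\x00" in data:
        return "unknown binary"
    return "probably text"


def sniff(path: str, nbytes: int = 512) -> str:
    """Read the first nbytes of the file at path and guess its format."""
    with open(path, "rb") as fh:
        return guess_format(fh.read(nbytes))
```

Note that a ZIP signature covers many formats at once, which is one reason real detectors (the kind FITS wraps) go much further than this.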

Page 14: Taming the Monster: Digital Preservation Planning and Implementation Tools

Solving the geographic distribution problem

Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0

Page 15: Taming the Monster: Digital Preservation Planning and Implementation Tools

What problem, now?

• The “all your eggs in one basket” problem.
  • If all your bits are on one server, and the server room is flooded, or your town is nuked: oops.
• Not the same as backups!
  • Don’t get me wrong, backups are important!
  • Backups are SHORT-TERM, and usually LOCAL. Geographic distribution (plus associated auditing) is intended for the long term.
• Don’t forget auditing!

Photo: “Nido” http://www.flickr.com/photos/italintheheart/3679974298/ Jorge Elías / CC-BY 2.0

Page 16: Taming the Monster: Digital Preservation Planning and Implementation Tools

LOCKSS

• Lots of Copies Keep Stuff Safe!
  • (There is also Portico, but Portico only works with e‑journal content.)
• Open-source software that handles replication and (some) auditing.
• “Private LOCKSS network”
  • A group of institutions agrees to build a LOCKSS network just for the stuff they’re interested in.
  • ASERL does this for ETDs. Many institutions (including UW-Madison) participate in a PLN for govdocs.
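To make "replication plus auditing" concrete, here is a toy Python sketch: each site holds a copy, the copies are compared by checksum, and the majority decides which copies look damaged. This is an illustration of the general idea only, not LOCKSS's actual polling protocol, and the function names and site names are made up.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Optional, Tuple


def digest(data: bytes) -> str:
    """SHA-256 checksum of one copy, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def audit_replicas(replicas: Dict[str, bytes]) -> Tuple[Optional[str], List[str]]:
    """Compare copies held at several sites (hypothetical site names as keys).

    Returns (consensus digest, sites whose copy disagrees with it).
    If no digest has a strict majority, returns (None, all sites),
    because then every copy is suspect.
    """
    if not replicas:
        raise ValueError("need at least one replica to audit")
    digests = {site: digest(content) for site, content in replicas.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(digests) / 2:
        return None, sorted(digests)
    return winner, sorted(s for s, d in digests.items() if d != winner)


# Three sites, one of which has suffered a flipped character.
good = b"An electronic thesis"
consensus, damaged = audit_replicas(
    {"site-a": good, "site-b": good, "site-c": b"An electronic thesiz"}
)
print(damaged)
```

With only two copies a disagreement cannot be settled by vote, which is one argument for "lots" of copies and not merely a pair.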

Page 17: Taming the Monster: Digital Preservation Planning and Implementation Tools

“The cloud”

• Typical cloud-based storage services make NO promises they won’t lose your stuff.
  • And for large quantities of data, bandwidth can become an issue.
  • And can they look at your stuff? Should they be able to?
• Some early movers in this market are fading
  • Iron Mountain had to kill their service.
• DuraCloud
  • trying to finesse this issue by negotiating tougher SLAs with cloud-storage providers

Photo: “Sky View From Humboldt Park” http://www.flickr.com/photos/purpleslog/2589612577/ Purple Slog / CC-BY 2.0
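The bandwidth point is easy to check with back-of-the-envelope arithmetic. The Python sketch below estimates how long an upload takes. The collection size, link speed, and the 0.8 efficiency factor (a guess at protocol overhead and sharing the link) are all hypothetical numbers, so plug in your own.

```python
def transfer_days(size_terabytes: float,
                  link_megabits_per_s: float,
                  efficiency: float = 0.8) -> float:
    """Estimate days needed to move a collection over a network link.

    efficiency is an assumed fraction of the nominal link speed that
    is actually usable.
    """
    bits = size_terabytes * 1e12 * 8                      # decimal terabytes to bits
    usable_bits_per_s = link_megabits_per_s * 1e6 * efficiency
    return bits / usable_bits_per_s / 86_400              # 86,400 seconds per day


# Hypothetical: 10 TB over a 100 Mbit/s link at 80% efficiency.
print(round(transfer_days(10, 100), 1))  # 11.6
```

Getting the data back out after a disaster takes just as long, which is worth knowing before the disaster.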

Page 18: Taming the Monster: Digital Preservation Planning and Implementation Tools

Repository and digital-library platforms

Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0

Page 19: Taming the Monster: Digital Preservation Planning and Implementation Tools

Friendly word of advice:

PICK SOFTWARE LAST.

Photo: “Briana Calderon; future educator of america.” http://www.flickr.com/photos/46132085@N03/4703617843/ Arielle Calderon / CC-BY 2.0

Page 20: Taming the Monster: Digital Preservation Planning and Implementation Tools

Another friendly word of advice:

DON’T CHASE THE SHINY.

Photo: “Sparkle Texture” http://www.flickr.com/photos/abbylanes/3214921616/ Abby Lane / CC-BY 2.0

Page 21: Taming the Monster: Digital Preservation Planning and Implementation Tools

Digital-library software

• Is almost always VERY BAD at digital preservation!
  • (most packages don’t even try!)
  • So if a file gets corrupted on the server, or whatever... no warnings, no restore, nothing. Also, provenance? Who needs provenance? Event tracking? What’s that?

• I’m not saying don’t use it. I’m saying that it doesn’t solve this problem.

•In fact, if you’re using this software, you need to solve this problem FOR IT.
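Solving the problem "for it" can start small. The Python sketch below records a checksum manifest for everything under a storage directory, re-checks it later, and appends one JSON line per problem to an event log, which covers the "warnings" and "event tracking" gaps in miniature. The function names and log fields are invented for illustration and do not follow any particular standard (PREMIS or otherwise).

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file in 1 MiB chunks so large files don't fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def make_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def check_fixity(root: Path, manifest: dict, log_path: Path) -> list:
    """Re-check files against the manifest.

    Returns a list of (relative path, outcome) for every missing or
    corrupted file, and appends one JSON event per problem to log_path.
    Keep log_path outside root so the log is not itself audited.
    """
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            problems.append((rel, "missing"))
        elif sha256_of(target) != expected:
            problems.append((rel, "corrupted"))
    with log_path.open("a", encoding="utf-8") as log:
        for rel, outcome in problems:
            event = {
                "when": datetime.now(timezone.utc).isoformat(),
                "file": rel,
                "event": "fixity check",
                "outcome": outcome,
            }
            log.write(json.dumps(event) + "\n")
    return problems
```

Run `make_manifest` once at ingest, store the result somewhere safe, and schedule `check_fixity` to run regularly. Detection is only half the job: restoring a damaged file still needs a good copy elsewhere.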

Photo: “National DIGITAL Library” http://www.flickr.com/photos/schex/193912573/ Jesse Schexnayder / CC-BY 2.0

Page 22: Taming the Monster: Digital Preservation Planning and Implementation Tools

Examples

• ContentDM: http://contentdm.com/
• Omeka: http://omeka.org/
• Greenstone: http://greenstone.org/

Page 23: Taming the Monster: Digital Preservation Planning and Implementation Tools

Institutional-repository software

• Is SHOCKINGLY bad at digital preservation!
  • (Though sometimes better than most DL software.)
• Examples
  • Hosted/commercial: Digital Commons (BePress), ContentDM, DigiTool
    • If you go hosted, you’d better ask about their digital-preservation practices!
  • Open-source: EPrints, DSpace, Fedora

Photo: “IMG_0668” http://www.flickr.com/photos/12967790@N00/66531124 Robert / CC-BY 2.0

Page 24: Taming the Monster: Digital Preservation Planning and Implementation Tools

A new approach: curation microservices

Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0

Page 25: Taming the Monster: Digital Preservation Planning and Implementation Tools

Do we really need THE BLOB?

Photo: “giant crystal blob” http://www.flickr.com/photos/a_of_doom/527905701/ A of DooM / CC-BY 2.0

Page 26: Taming the Monster: Digital Preservation Planning and Implementation Tools

How about a jigsaw puzzle instead?

•Break the digital-preservation problem down into parts.

•Code up each part, making sure that it plays nicely with other parts.

  • lots of nice APIs!
  • which means other software can adopt/adapt microservices as well!

•Put parts together as you need them.
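The jigsaw idea can be sketched in a few lines of Python: each "microservice" here is just a small function that takes a record and returns an enriched copy, so the pieces can be combined in whatever order and quantity a given collection needs. This is a toy under invented names, not the California Digital Library's code or any real service's API.

```python
import hashlib
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Service = Callable[[Record], Record]


def add_checksum(record: Record) -> Record:
    """Fixity piece: attach a SHA-256 checksum of the content."""
    return {**record, "sha256": hashlib.sha256(record["content"]).hexdigest()}


def add_size(record: Record) -> Record:
    """Inventory piece: attach the content length in bytes."""
    return {**record, "size_bytes": len(record["content"])}


def add_format_hint(record: Record) -> Record:
    """Identification piece: a deliberately naive PDF check."""
    is_pdf = record["content"].startswith(b"%PDF-")
    return {**record, "format_hint": "PDF" if is_pdf else "unknown"}


def run_pipeline(record: Record, services: List[Service]) -> Record:
    """Put the parts together: apply each service in order."""
    for service in services:
        record = service(record)
    return record


item = {"name": "thesis.pdf", "content": b"%PDF-1.4 ..."}
full = run_pipeline(item, [add_checksum, add_size, add_format_hint])
light = run_pipeline(item, [add_size])  # same parts, smaller puzzle
```

Because every piece shares one tiny interface (record in, record out), swapping the naive format check for a real detector changes one function and nothing else.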

Photo: “Lapsana Apogonoides Puzzle” http://www.flickr.com/photos/gdesigneralex/2313092112/ gdesigneralex / CC-BY 2.0

Page 27: Taming the Monster: Digital Preservation Planning and Implementation Tools

California Digital Library

• Pioneering this approach
• Has open-sourced code for microservices
• Has added microservices together to build its “Merritt” storage/repository service

Page 28: Taming the Monster: Digital Preservation Planning and Implementation Tools

Escaping the silos: Fedora Commons

Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0

Page 29: Taming the Monster: Digital Preservation Planning and Implementation Tools

What is Fedora Commons?

•Blueprints and foundation, not the whole house (analogy credit to Peter Gorman)

• You build the house you want!
• Or you build condominiums on the same foundation.
  • Need different user interfaces for different materials?
  • Need different structures and behaviors?
  • No problem! Fedora can handle that.

• (have I run this analogy into the ground yet?)

Page 30: Taming the Monster: Digital Preservation Planning and Implementation Tools

We had this...

Diagram courtesy of Peter Gorman.

Page 31: Taming the Monster: Digital Preservation Planning and Implementation Tools

We are building this.

Diagram courtesy of Peter Gorman.

Page 32: Taming the Monster: Digital Preservation Planning and Implementation Tools

E-records management

Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0

Page 33: Taming the Monster: Digital Preservation Planning and Implementation Tools

Axioms

• Records management is about policy and procedures.

•If your policy doesn’t fit with their procedures, guess what wins? Choose battles wisely.

•There is never enough storage space.

•Nobody cares until there’s a crisis.

•Software will not save you... but it might help!

Photo: “The Never Ending Math Problem” http://www.flickr.com/photos/acidwashphotography/2967752733/ d3 Dan / CC-BY 2.0

Page 34: Taming the Monster: Digital Preservation Planning and Implementation Tools

Duke Data Accessioner

• Accessioning tool for digital data
  • use case: J. Important Scholar dumps her hard drive on your desk, expects you to cope
• File migrator, metadata manager, GUI, plugins (e.g. for file-format detection)
• Bit rough, but in production use.
• http://library.duke.edu/uarchives/about/tools/data-accessioner.html

Page 35: Taming the Monster: Digital Preservation Planning and Implementation Tools

Archivematica

•Soup-to-nuts records management and digital preservation tool.

•Evaluation and accessioning all the way through preservation actions. (Oddly, they seem to be missing disposal... but they’re in alpha, so...)

• Open source
• Runs on a Linux server; RMs and archivists log in to GUI application remotely.

•Normally I hate and fear silos, but this one is smartly built on microservices.

Page 36: Taming the Monster: Digital Preservation Planning and Implementation Tools

Practical E-Records

• Weblog by Chris Prom and protégés
• Tool evaluations, conference-session writeups, essays on praxis
• Best reading out there for the do-it-yourselfer
  • If you’re not reading it, why not?
• http://e-records.chrisprom.com/

Page 37: Taming the Monster: Digital Preservation Planning and Implementation Tools

Last thoughts

Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0

Page 38: Taming the Monster: Digital Preservation Planning and Implementation Tools

If you can’t do everything...

that’s okay. Who can?

Image: “Confused” http://www.flickr.com/photos/kristiand/3223044657/ Kristian D. / CC-BY 2.0

Page 39: Taming the Monster: Digital Preservation Planning and Implementation Tools

DO SOMETHING.

Photo: “Came hame háááá!” http://www.flickr.com/photos/kristiand/3223044657/ Guirí R. Reyes / CC-BY 2.0

Page 40: Taming the Monster: Digital Preservation Planning and Implementation Tools

The worst threat?

INACTION.

Photo: “Fatty’s role model” http://www.flickr.com/photos/cloudzilla/4910616774/ cloudzilla / CC-BY 2.0

Page 41: Taming the Monster: Digital Preservation Planning and Implementation Tools

Thank you!

Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0

This presentation is available under a Creative Commons 3.0 United States license.