Approaches to preserving digitized taxonomic data: Prints, manuscripts & specimens Chris Freeland Director, Center for Biodiversity Informatics Technical Director, Biodiversity Heritage Library 28 October 2011 @chrisfreeland

Approaches to preserving digitized taxonomic data

Nov 18, 2014


Chris Freeland

Sherborn Symposium. Natural History Museum, London. 28 October 2011.
Transcript
Page 1: Approaches to preserving digitized taxonomic data

Approaches to preserving digitized taxonomic data:

Prints, manuscripts & specimens

Chris Freeland
Director, Center for Biodiversity Informatics

Technical Director, Biodiversity Heritage Library
28 October 2011

@chrisfreeland

Page 2: Approaches to preserving digitized taxonomic data

Prints / Manuscripts / Specimens
Different objects, similar management

http://www.flickr.com/photos/biodivlibrary/6257859557 http://www.flickr.com/photos/chrisfreeland/6018724034 http://www.biodiversitylibrary.org/page/34045915

Page 3: Approaches to preserving digitized taxonomic data

Overview of Talk

• Why worry about digital preservation?

• Considerations for preservation
– Collaboration
– File formats
– Metadata standards

• Views to the future

Preservation Panic!

Page 4: Approaches to preserving digitized taxonomic data

WHY WORRY?
http://www.flickr.com/photos/biodivlibrary/6008902662

Page 5: Approaches to preserving digitized taxonomic data

Do it once, do it right

Costs more to get object to scanner than to scan

Page 6: Approaches to preserving digitized taxonomic data

• Conversion / Compost / Corruption
• Longevity of digital objects
• File changes
• Media obsolescence

Cautionary Tales

Page 7: Approaches to preserving digitized taxonomic data

CONSIDERATION: COLLABORATION

Page 8: Approaches to preserving digitized taxonomic data

LOCKSS

Lots Of Copies Keeps Stuff Safe

• LOCKSS is both a software platform & a concept
– Software: http://www.lockss.org
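
The "lots of copies" concept can be illustrated with a simple fixity comparison: hash every copy of an object and flag any copy that disagrees with the majority. This is a minimal sketch of the concept only, not the protocol used by the LOCKSS software; the function names and the simple majority rule are assumptions made for illustration.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_damaged_copies(copies: list[Path]) -> list[Path]:
    """Return the copies whose checksum disagrees with the majority checksum.

    Hypothetical helper: assumes every path is a copy of the same object.
    """
    if not copies:
        raise ValueError("at least one copy is required")
    digests = {path: sha256_of(path) for path in copies}
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(copies) / 2:
        # No checksum is held by more than half of the copies.
        raise ValueError("no majority agreement; copies need manual review")
    return [path for path, value in digests.items() if value != majority_digest]
```

With three or more copies, a single corrupted file shows up as the odd one out and can be replaced from a copy that agrees with the majority.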

Page 9: Approaches to preserving digitized taxonomic data

Rule of 3

Museum X, Library Y, Archive Z

1. Geographic Locations
2. Administrations
3. Technology Platforms
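
One way to read the Rule of 3 is as a diversity check over the copies an institution holds. The sketch below assumes the rule means at least three distinct geographic locations, administrations, and technology platforms; the `Copy` record and `rule_of_three_gaps` function are hypothetical names, not part of any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    holder: str          # e.g. "Museum X"
    location: str        # geographic location of the copy
    administration: str  # organization responsible for the copy
    platform: str        # technology platform storing the copy


def rule_of_three_gaps(copies: list[Copy], minimum: int = 3) -> list[str]:
    """Return the dimensions with fewer than `minimum` distinct values."""
    gaps = []
    for dimension in ("location", "administration", "platform"):
        distinct = {getattr(copy, dimension) for copy in copies}
        if len(distinct) < minimum:
            gaps.append(dimension)
    return gaps
```

An empty result means the copies are spread across enough locations, administrations, and platforms; any name returned points to where another partner or system is needed.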

Page 10: Approaches to preserving digitized taxonomic data

CONSIDERATION: FILE FORMATS

Page 11: Approaches to preserving digitized taxonomic data

JPEG2000

• Wavelet compression, lossless encoding
• 12 Parts
• Of particular interest to documents & specimens:
– Part 1: Core Coding System, ISO/IEC 15444-1
– Part 6: Compound image file format
– Part 10: JP3D, Volumetric images

http://www.jpeg.org/jpeg2000/
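
When auditing a collection it can help to confirm that a file really is JPEG 2000 rather than trusting its extension. The sketch below checks the leading bytes: JP2-family files open with a fixed 12-byte signature box followed by a file type box that carries a brand, while a bare codestream opens with the SOC and SIZ markers. The function name and return values are assumptions for illustration, and this only identifies the format; it does not validate the file.

```python
from typing import Optional

# 12-byte signature box that opens JP2-family files (ISO/IEC 15444-1).
JP2_SIGNATURE = bytes.fromhex("0000000c6a5020200d0a870a")
# A raw JPEG 2000 codestream starts with the SOC marker followed by SIZ.
J2K_CODESTREAM_START = bytes.fromhex("ff4fff51")


def sniff_jpeg2000(header: bytes) -> Optional[str]:
    """Classify the first bytes of a file (hypothetical helper).

    Returns the file type brand (e.g. "jp2", "jpx", "jpm") when present,
    "jp2-family" if only the signature box is available, "j2k" for a raw
    codestream, or None if the bytes are not JPEG 2000.
    """
    if header.startswith(JP2_SIGNATURE):
        # The file type box follows the signature box:
        # 4-byte length, b"ftyp", then a 4-byte brand.
        if header[16:20] == b"ftyp":
            return header[20:24].decode("ascii", errors="replace").strip()
        return "jp2-family"
    if header.startswith(J2K_CODESTREAM_START):
        return "j2k"
    return None
```

Reading the first 24 bytes of each file is enough for this check, so it is cheap to run across a large image store.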

Page 12: Approaches to preserving digitized taxonomic data

http://www.tropicos.org/ImageFullView.aspx?imageid=62182

Page 13: Approaches to preserving digitized taxonomic data

JPEG2000 (Hurrahs & Hisses)

• Advantages
– Store a single file for access & preservation
– Standards-based
– Saves drive space (important at museum scale)

• Disadvantages
– Lacks native support in many apps
– Requires an intermediary app to decode & serve
  • But, there’s an open source option: djatoka http://djatoka.sourceforge.net
– Reports of data loss

Page 14: Approaches to preserving digitized taxonomic data

PDF/A

• ISO-standardized version of PDF suitable for long-term preservation

• Identifies a "profile" for electronic documents that ensures the documents can be reproduced exactly the same way in years to come.*

• Makes the file self-contained (and therefore larger)
– Embeds fonts
– Graphics

* http://en.wikipedia.org/wiki/PDF/A
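
A PDF/A file declares its conformance in its XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties. The sketch below reads that declaration from the raw bytes. It is a heuristic under stated assumptions: it assumes the XMP packet is stored uncompressed, the function name is hypothetical, and a declaration is only a claim, so a real workflow would still run a dedicated PDF/A validator.

```python
import re
from typing import Optional

# XMP may serialize a property as an attribute (pdfaid:part="1") or as an
# element (<pdfaid:part>1</pdfaid:part>); \x22 and \x27 are the two quote marks.
_PART = re.compile(rb"pdfaid:part\s*(?:=\s*[\x22\x27]|>)\s*(\d)")
_CONFORMANCE = re.compile(rb"pdfaid:conformance\s*(?:=\s*[\x22\x27]|>)\s*([A-Za-z])")


def declared_pdfa_level(pdf_bytes: bytes) -> Optional[str]:
    """Return the declared level, such as "PDF/A-1b", or None if undeclared."""
    part = _PART.search(pdf_bytes)
    if part is None:
        return None
    conformance = _CONFORMANCE.search(pdf_bytes)
    level = conformance.group(1).decode("ascii").lower() if conformance else ""
    return "PDF/A-" + part.group(1).decode("ascii") + level
```

Files that return None either are not PDF/A or keep their metadata in a form this simple scan cannot see.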

Page 15: Approaches to preserving digitized taxonomic data

CONSIDERATION: METADATA

Page 16: Approaches to preserving digitized taxonomic data

The Great Thing About STANDARDS

Is That There Are SO MANY

To Choose From

Page 17: Approaches to preserving digitized taxonomic data

Filesystem

Metadata Preservation

• Descriptive information (metadata) provides content & context for indexing, reuse

• Can bundle metadata within files
– EXIF: images, common in digital cameras
– Adobe XMP: docs, images

• Should commit metadata to file system
– Should not manage it only in a DB or other management system

<DwC> XML

JP2
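
Committing metadata to the file system can be as simple as writing an XML sidecar next to each image. The sketch below writes a Simple Darwin Core record beside a JP2 file using only the standard library. The function name, file naming scheme, and example values are hypothetical, and the namespace URIs are given as understood from the TDWG Darwin Core XML guide and should be checked against it.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace URIs as understood from the TDWG Darwin Core XML guide (verify).
DWC_NS = "http://rs.tdwg.org/dwc/terms/"
DWR_NS = "http://rs.tdwg.org/dwc/xsd/simpledarwincore/"


def write_dwc_sidecar(image_path: Path, terms: dict[str, str]) -> Path:
    """Write Darwin Core terms to an .xml file next to the image."""
    ET.register_namespace("dwc", DWC_NS)
    ET.register_namespace("dwr", DWR_NS)
    record_set = ET.Element(f"{{{DWR_NS}}}SimpleDarwinRecordSet")
    record = ET.SubElement(record_set, f"{{{DWR_NS}}}SimpleDarwinRecord")
    for name, value in terms.items():
        ET.SubElement(record, f"{{{DWC_NS}}}{name}").text = value
    # Same base name as the image, so the pair travels together.
    sidecar_path = image_path.with_suffix(".xml")
    ET.ElementTree(record_set).write(
        str(sidecar_path), encoding="utf-8", xml_declaration=True
    )
    return sidecar_path
```

For example, `write_dwc_sidecar(Path("specimen_0001.jp2"), {"scientificName": "Quercus alba L."})` would leave `specimen_0001.xml` beside the image, so the description survives even if the database that produced it does not.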

Page 18: Approaches to preserving digitized taxonomic data

THE FUTURE

Page 19: Approaches to preserving digitized taxonomic data

Electronic Publications

• Happening now, has been for years
• Should take same care in ensuring heterogeneity & diversity in digital management systems as with printed, bound books
– Monolithic libraries have failed over time
– Monolithic electronic archives will, too

Page 20: Approaches to preserving digitized taxonomic data

http://www.biodiversitylibrary.org/page/22681143

Need a meadow…

Page 21: Approaches to preserving digitized taxonomic data

…not a monoculture.

Page 22: Approaches to preserving digitized taxonomic data

There is no silver bullet

• Make best decision today

• Keep up with technology changes & best practices
– <insert library & archive professionals here>

• Evaluate, experiment, document, lead

• Move to stable new technologies when necessary

Page 23: Approaches to preserving digitized taxonomic data

Questions?

Chris Freeland

Director, Center for Biodiversity Informatics
Technical Director, Biodiversity Heritage Library

28 October 2011

Email: [email protected]

Twitter: @chrisfreeland