Avoiding the 927 Problem: Standards, Digital Preservation, and Communities of Practice Dan Gillean PASIG NYC 2016 October 26, 2016
What is a standard?
• A model or basis of comparison
• An agreed-upon set of characteristics, definitions, and/or practices
• A minimum acceptable benchmark allowing for quantitative or qualitative judgement
http://www.cas.edu/
De Jure vs De Facto
De Jure
• “According to law,” “By right”
• Declared to be standards by an authority
• Top-down distribution
• Can be formalized from de facto standards; can become de facto as well via adoption
• Generally open
De Facto
• “In reality,” “As a matter of fact”
• Grow to be standards via adoption
• Dependent on market or community uptake
• Can become de jure standard
• Can be open or closed
Open vs Proprietary
• Open can sometimes just refer to availability – royalty free
• Open source: community-driven, open exchange of ideas
• Open proprietary: privately developed or owned but freely available for implementation
• Closed proprietary: privately developed/owned, must pay licensing fee to implement
Standards allow us to communicate across space and time
https://pixabay.com/p-624054
…but not to just anyone
Communities of practice
• Shared craft, domain, or profession
• Shared common interest in improvement
• Established via mutual engagement, joint enterprise, and shared repertoire
Crowd, by James Cridland. https://www.flickr.com/photos/jamescridland/613445810
Designated community:
• An identified group of potential Consumers who should be able to understand the preserved information
“Since a key purpose of an OAIS is to preserve information for a Designated Community, the OAIS must understand the Knowledge Base of its Designated Community to understand the minimum Representation Information that must be maintained.” (p. 2-4)
Standards are only useful if we use them
http://www.salon.com/2016/06/16/black_holes_are_colliding_scientists_confirm_ripples_in_spacetime_partner/
https://commons.wikimedia.org/wiki/File:Snowflake_01.svg
Special! Special! Special!
The 927 problem:
https://xkcd.com/927/
https://en.wikipedia.org/wiki/Archive#/media/File:WikiXDC_National_Archives_Tour_Hall_-_Stierch.jpg
Our standards should be:
• Open
• Non-proprietary
• Widely adopted
• Evaluated by experts
• Endorsed by our community of practice
• Agnostic and interoperable
ISO 14721:2002 ISO 16363:2012
ISO 14721 and 16363
ISO 14721
A reference model – not a systems architecture!
https://wiki.archivematica.org/Overview
ISO 16363
Reminds us that much of digital preservation readiness is not technical – it’s organizational
• Governance
• Organizational structure
• Staffing
• Procedural accountability
• Preservation policy framework
• Documentation
• Financial sustainability
• Security
ISO 16363
??????
Meet Archivematica
https://www.archivematica.org
What is Archivematica?
Archivematica is a web- and standards-based, open-source application which allows your institution to preserve long-term access to trustworthy, authentic and reliable digital content.
Standards based
Open source
Customizable
Integrated with third-party systems
Active community
Archivematica AIP structure
• Packaged according to BagIt specifications
• PREMIS in METS XML
• Virus scan, normalization report, extraction log, etc.
• For browsing in Archivematica
• Original + normalized objects, submission docs, original metadata included at SIP creation
BagIt
BagIt is a hierarchical file packaging format designed to support disk-based or network-based storage and transfer of arbitrary digital content.
• Originally developed for exchange between California Digital Library and Library of Congress; specification published as an IETF Internet-Draft in 2008
• System agnostic, interoperable format for storage and exchange
• “Bag and tag” approach: mandatory tag file contains a manifest listing every file in the payload together with its corresponding checksum
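The manifest idea can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not Archivematica's own code and not a complete BagIt implementation: a real bag also needs a bagit.txt declaration and may carry further tag files, which are omitted here. The function names write_manifest and verify_manifest are hypothetical, and the sketch assumes the payload sits in a data/ directory with the manifest stored as manifest-<algorithm>.txt beside it.

```python
import hashlib
from pathlib import Path


def write_manifest(bag_dir, algorithm="sha256"):
    """Write one 'checksum  path' line per payload file under data/."""
    bag = Path(bag_dir)
    payload = bag / "data"
    lines = []
    for path in sorted(p for p in payload.rglob("*") if p.is_file()):
        digest = hashlib.new(algorithm, path.read_bytes()).hexdigest()
        # Paths are recorded relative to the bag's base directory.
        lines.append(f"{digest}  {path.relative_to(bag).as_posix()}")
    manifest = bag / f"manifest-{algorithm}.txt"
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def verify_manifest(bag_dir, algorithm="sha256"):
    """Return the payload paths that are missing or whose checksum changed."""
    bag = Path(bag_dir)
    manifest = bag / f"manifest-{algorithm}.txt"
    problems = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(None, 1)
        target = bag / rel_path
        if not target.is_file():
            problems.append(rel_path)
            continue
        actual = hashlib.new(algorithm, target.read_bytes()).hexdigest()
        if actual != expected:
            problems.append(rel_path)
    return problems
```

An empty list from verify_manifest means every listed file is present and unchanged, which is the fixity check that makes a bag safe to move between systems.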
METS
METS, or Metadata Encoding and Transmission Standard, was designed to support inter-repository data exchange.
• It provides a wrapper for other metadata, such as PREMIS and Dublin Core.
• It defines relationships between digital objects and other digital objects, and between digital objects and their metadata.
• It can be used to provide technical metadata about digital objects (although Archivematica doesn’t implement it that way: we wrap PREMIS in it instead)
PREMIS
PREMIS, or Preservation Metadata: Implementation Strategies, is the recognized standard for metadata about objects in a digital preservation system.
• It captures technical information about an object in order to support the implementation of preservation strategies such as normalization, migration or emulation (PREMIS Object)
• It describes relationships between digital objects (PREMIS Object)
• It provides an audit trail of actions taken by the digital preservation repository to preserve the object (PREMIS Event)
• It names the individuals, organizations and software tools responsible for taking actions to preserve digital objects (PREMIS Agent)
• It specifies the actions a repository is allowed to take to preserve digital objects (PREMIS Rights)
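One way to picture how the four PREMIS entities relate is as plain records. The sketch below is a loose illustration only, not the PREMIS data dictionary: every class name, field name and sample value is a simplified assumption chosen for readability.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    """Who or what acted: a person, an organization or a software tool."""
    name: str
    agent_type: str


@dataclass
class Event:
    """One action taken on an object, with its outcome and responsible agents."""
    event_type: str
    outcome: str
    agents: List[Agent] = field(default_factory=list)


@dataclass
class RightsStatement:
    """Which actions the repository is allowed to take."""
    basis: str
    permitted_acts: List[str] = field(default_factory=list)


@dataclass
class PreservedObject:
    """The digital object plus the events and rights attached to it."""
    identifier: str
    format_name: str
    events: List[Event] = field(default_factory=list)
    rights: List[RightsStatement] = field(default_factory=list)

    def audit_trail(self) -> List[str]:
        # One human-readable line per recorded event, in the order recorded.
        lines = []
        for event in self.events:
            who = ", ".join(agent.name for agent in event.agents)
            lines.append(f"{event.event_type}: {event.outcome} (by {who})")
        return lines

    def may(self, act: str) -> bool:
        # True if any rights statement permits the requested act.
        return any(act in r.permitted_acts for r in self.rights)
```

Keeping events and agents as separate records is what lets the audit trail say both what was done and who or what did it.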
PREMIS in METS
<mets:amdSec>
<mets:techMD> PREMIS: OBJECT
<mets:rightsMD> PREMIS: RIGHTS
<mets:digiprovMD> PREMIS: EVENT
<mets:digiprovMD> PREMIS: AGENT
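The mapping above can be sketched with Python's standard xml.etree.ElementTree module. This is an illustrative skeleton only, not Archivematica's code: it produces empty xmlData wrappers rather than schema-valid METS, and the helper names and the ID numbering scheme are assumptions.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)

# (METS amdSec subsection, MDTYPE of the PREMIS entity wrapped inside it)
PREMIS_SLOTS = [
    ("techMD", "PREMIS:OBJECT"),
    ("rightsMD", "PREMIS:RIGHTS"),
    ("digiprovMD", "PREMIS:EVENT"),
    ("digiprovMD", "PREMIS:AGENT"),
]


def mets(tag):
    """Return the namespace-qualified name ElementTree expects."""
    return f"{{{METS_NS}}}{tag}"


def build_amdsec(amdsec_id="amdSec_1"):
    """Build an amdSec holding one empty wrapper per PREMIS entity."""
    amdsec = ET.Element(mets("amdSec"), {"ID": amdsec_id})
    counters = {}
    for section, mdtype in PREMIS_SLOTS:
        # Number each subsection type separately (hypothetical ID scheme).
        counters[section] = counters.get(section, 0) + 1
        sec = ET.SubElement(
            amdsec, mets(section), {"ID": f"{section}_{counters[section]}"}
        )
        wrap = ET.SubElement(sec, mets("mdWrap"), {"MDTYPE": mdtype})
        ET.SubElement(wrap, mets("xmlData"))  # the PREMIS XML would go here
    return amdsec


if __name__ == "__main__":
    print(ET.tostring(build_amdsec(), encoding="unicode"))
```

Events and agents both land in digiprovMD because METS has no dedicated section for either; the MDTYPE attribute is what tells them apart.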
METS SECTIONS
<metsHdr> METS header
<dmdSec> Descriptive metadata
<amdSec> Administrative metadata
<fileSec> File section
<structMap> Structural Map
PREMIS in METS
<mets:amdSec ID="amdSec_1">
  <mets:techMD ID="techMD_1">
    <mets:mdWrap MDTYPE="PREMIS:OBJECT">
      <mets:xmlData>
        <premis:object xmlns:premis="info:lc/xmlns/premis-v2" xsi:type="premis:file"
            xsi:schemaLocation="info:lc/xmlns/premis-v2
            http://www.loc.gov/standards/premis/v2/premis-v2-2.xsd" version="2.2">
          <premis:objectIdentifier>
            <premis:objectIdentifierType>UUID</premis:objectIdentifierType>
            <premis:objectIdentifierValue>bb52e3a0-2c5...</premis:objectIdentifierValue>
…etc
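Because the structure is standardized, any XML-aware tool can read it back. The following sketch uses Python's standard xml.etree.ElementTree module to pull PREMIS object identifiers out of a METS document. The embedded sample is a hypothetical, completed stand-in for the truncated excerpt above, with an all-zero placeholder UUID; the function name premis_object_ids is likewise an assumption.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "premis": "info:lc/xmlns/premis-v2",
}

# Hypothetical sample document; the UUID is a placeholder, not real data.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:premis="info:lc/xmlns/premis-v2"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <mets:amdSec ID="amdSec_1">
    <mets:techMD ID="techMD_1">
      <mets:mdWrap MDTYPE="PREMIS:OBJECT">
        <mets:xmlData>
          <premis:object xsi:type="premis:file" version="2.2">
            <premis:objectIdentifier>
              <premis:objectIdentifierType>UUID</premis:objectIdentifierType>
              <premis:objectIdentifierValue>00000000-0000-0000-0000-000000000000</premis:objectIdentifierValue>
            </premis:objectIdentifier>
          </premis:object>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:techMD>
  </mets:amdSec>
</mets:mets>"""


def premis_object_ids(mets_xml):
    """Return (techMD ID, identifier type, identifier value) tuples."""
    root = ET.fromstring(mets_xml)
    results = []
    for techmd in root.iterfind(".//mets:techMD", NS):
        for ident in techmd.iterfind(".//premis:objectIdentifier", NS):
            id_type = ident.findtext("premis:objectIdentifierType", namespaces=NS)
            id_value = ident.findtext("premis:objectIdentifierValue", namespaces=NS)
            results.append((techmd.get("ID"), id_type, id_value))
    return results


if __name__ == "__main__":
    for row in premis_object_ids(SAMPLE):
        print(row)
```

The namespace map carries most of the weight here: the parser matches elements by namespace URI, not by the mets: or premis: prefix a particular file happens to use.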
http://www.totallylocalvc.com/art-of-teamwork/
Standards allow for…
• Interoperability
• Consistency
• Intelligibility
• Collaboration
• Exchange
Between agents of a designated community across space and time