Dec 20, 2015
MetaArchive of Southern Digital Culture
Partners in the dispersed redundant dark archive
University Libraries at Emory
Auburn
Florida State
Georgia Tech
Louisville
Virginia Tech
LOCKSS: Lots of Copies Keep Stuff Safe
Library of Congress NDIIPP
The MetaArchive of Southern Digital Culture
will create and develop a digital preservation network for critical and at-risk content relating to Southern culture and history. The partners in this initiative will select and preserve institutional digital archives, including institutionally relevant materials such as electronic theses and dissertations, as well as ephemeral works such as online exhibitions and cultural history displays. This digital content will include subjects that complement Library of Congress collections covering the Civil War, the civil rights movement, slave narratives, Southern music, handicrafts, and church history. These collections will encompass born-digital works containing existing works as well as newly created digital works produced for web and other archival purposes.
MetaArchive Project Goals
1. Create a conspectus of digital content within the subject domain held by the partners
2. Harvest a body of the most critical content to be preserved (3 TB per institution)
3. Develop a model cooperative agreement for ongoing collaboration and sustainability
4. Build a distributed preservation network infrastructure based on the LOCKSS software
MetaArchive Deliverables
• Define the Scope of the Content
– What is Southern digital culture?
– What is “at risk?”
• Developing a conspectus: content selection
– What collections will be preserved?
– Metadata
• Adaptations showing any unique or qualified tags
• Rights issues: harvesting for preservation vs. user access
Key Features of the MetaArchive of Southern Digital Culture
1. Distributed preservation strategy
2. Flexible organizational model
3. Formal content selection process
4. Capability for migrating archives
5. Dark archiving strategy
6. Low cost of deployment
7. Self-sustaining incentives
8. Simple preservation exchange mechanisms with the Library of Congress
MetaArchive Preservation Strategy
• Redundant Access
– Bit Preservation
• Automated
– Content Ingestion
– MD5 Checksums
– LOCKSS Polling Algorithm (Trust Relationships)
– Distributed Copies
• 200,000+ square mile area
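The idea behind checksums plus polling can be illustrated with a small sketch. This is not the LOCKSS polling algorithm itself, only a simplified majority vote over MD5 digests; the site names and the `find_damaged_copies` helper are hypothetical.

```python
import hashlib
from collections import Counter


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the content as a hex string."""
    return hashlib.md5(data).hexdigest()


def find_damaged_copies(copies: dict[str, bytes]) -> list[str]:
    """Return the sites whose copy disagrees with the majority digest.

    Simplified stand-in for a preservation poll: each site "votes" with
    the MD5 digest of its copy, and any copy that differs from the
    majority is flagged for repair.
    """
    digests = {site: md5_hex(content) for site, content in copies.items()}
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(copies) / 2:
        raise ValueError("no majority digest; cannot decide which copies are intact")
    return sorted(site for site, d in digests.items() if d != majority_digest)


# Hypothetical example: four sites hold the same file, one copy has a flipped byte.
copies = {
    "site_a": b"thesis contents",
    "site_b": b"thesis contents",
    "site_c": b"thesis c0ntents",
    "site_d": b"thesis contents",
}
print(find_damaged_copies(copies))
```

In a real network the flagged site would then fetch a good copy from a peer; that repair step is omitted here.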
MetaArchive NDIIPP Network via Internet2
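[Network diagram: partner sites (Auburn University, Emory University, Ga Tech, Va Tech, University of Louisville, Florida State University) linked through nodes labeled DC, NYC, CH, IN, and ATL over FL Lambda Rail, the Abilene Network, the SOX Network, and the MAX Network, with a MAX connection to Va Tech.]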
MetaArchive Hardware
• Off-the-Shelf Strategy
– Dell/Intel Based Hardware
• Could easily be HP or SUN Intel Based Hardware etc.
• Could be old desktops w/large hard drives.
– New Low Cost SATA SAN
• EMC AX100
– $4.00 per GB (already dropping in price)
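As a rough worked example of what the quoted price implies, the sketch below combines the $4.00 per GB figure with the 3 TB per institution goal. The decimal conversion (1 TB = 1000 GB) and the scenario in which one node holds the content of all six partners are assumptions for illustration, not figures from the project.

```python
COST_PER_GB_USD = 4.00    # SATA SAN price quoted on the slide
TB_PER_INSTITUTION = 3    # harvest target from the project goals
GB_PER_TB = 1000          # assumption: decimal terabytes
PARTNER_COUNT = 6         # partner institutions listed on the title slide


def storage_cost_usd(terabytes: float) -> float:
    """Raw disk cost for the given capacity at the quoted per-GB price."""
    return terabytes * GB_PER_TB * COST_PER_GB_USD


# Cost to hold one institution's 3 TB contribution.
print(storage_cost_usd(TB_PER_INSTITUTION))

# Assumed scenario: a single node replicating every partner's content.
print(storage_cost_usd(TB_PER_INSTITUTION * PARTNER_COUNT))
```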
MetaArchive Software
• Operating System
– RedHat Linux Enterprise AS v. 3/4
• Ease of update management and experience w/OS
– Could easily work on other versions of Linux
• JAVA SDK
• LOCKSS Content Ingestion/Replication
– LOCKSS Daemon 1.8.3
– 6-8 week updates w/RPM
• Conspectus Database
– MySQL/PHP Interface
– Standalone System
• MetaArchive Collection Description Metadata Schema
MetaArchive Collection Description Metadata Schema
• Based on UKOLN RSLP Collection Description
– Includes
• LOCKSS Manifest Page
• Risk Ranking – MA Ranking
• OAI-PMH Data Provider URL
Collection-Level Conspectus Metadata Specification
Access Rights
Accrual Periodicity
Accrual Policy
Accumulation Date Range
Alternative Title
Associated Collection
Associated Publication
Bytes
Cataloged Status
Catalogue or description
Collection Size
Contents Date Range
Creator
Custodial History
Description
Format Characteristics
Institution Collection Identifier
Is Available Via
Language
LOCKSS Manifest Page
Manifestation
MetaArchive Collection Identifier
OAI Provider
Publisher
Recommended Harvest Procedure
Rights
Risk Factors
Risk Rank
Spatial Coverage
SubCollection
Subject
SuperCollection
Temporal Coverage
Title
Type
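A collection-level record built from a few of these fields might look like the sketch below. The field names follow the specification above, but the Python class, the choice of which fields to require, the integer risk scale, and all sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields


@dataclass
class CollectionRecord:
    """Hypothetical subset of the conspectus fields for one collection."""
    title: str
    metaarchive_collection_identifier: str
    lockss_manifest_page: str
    oai_provider: str
    risk_rank: int                      # assumed integer scale; the source gives no scale
    risk_factors: list[str] = field(default_factory=list)
    size_bytes: int = 0                 # the "Bytes" field


def missing_fields(record: CollectionRecord) -> list[str]:
    """Names of string fields left empty, in declaration order."""
    return [
        f.name
        for f in fields(record)
        if isinstance(getattr(record, f.name), str) and not getattr(record, f.name)
    ]


# Placeholder values only; none of these come from the actual conspectus.
record = CollectionRecord(
    title="Example ETD Collection",
    metaarchive_collection_identifier="example-001",
    lockss_manifest_page="",
    oai_provider="https://etd.example.edu/oai",
    risk_rank=2,
    risk_factors=["single storage location"],
)
print(missing_fields(record))
```

A check like this could flag records that cannot yet be harvested because no manifest page has been recorded.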
MetaArchive follows Standards
• OAIS Reference Model
– LOCKSS Compliance
• OAI-PMH 2.0
– Using as alternative to current LOCKSS AU strategy w/ETDs
– VaTech, GaTech, FSU
• UKOLN RSLP Collection Description
– Basis for MetaArchive Conspectus
• http://www.metaarchive.org/pdfs/conspectus_md_2005.html
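For readers unfamiliar with OAI-PMH 2.0 harvesting, the sketch below builds `ListRecords` request URLs. The `verb`, `metadataPrefix`, `set`, `from`, and `resumptionToken` arguments are defined by the protocol; the base URL and helper names are hypothetical, and this is not the project's harvesting code.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
    from_date: Optional[str] = None,
) -> str:
    """Build the first ListRecords request for an OAI-PMH data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


def resume_url(base_url: str, resumption_token: str) -> str:
    """Build a follow-up request; the token replaces all other arguments."""
    params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    return f"{base_url}?{urlencode(params)}"


# Hypothetical ETD repository endpoint.
BASE_URL = "https://etd.example.edu/oai"
print(list_records_url(BASE_URL))
print(list_records_url(BASE_URL, from_date="2005-01-01"))
```

Using `from` lets a harvester request only records added or changed since its last run, which suits collections such as ETDs that grow incrementally.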
MetaArchive Collaboration
• Kickstart Installations for Linux Servers
– Easy to set up all hardware together exactly the same.
• Efficiency of Replication
– Kickstart can be used with production system as well as with any Intel based machine.
• Communication
– Telephone conference, video conference I2, iVocalize Chat/VOIP Room, Wiki, PhpCollab
• Study issues
– Dynamic content
– Format migration
MetaArchive Rights Issues
Any use of protected works generally will need to:
– fit within an exception to the exclusive rights of owners, such as the “fair-use” doctrine or other provisions relating specifically to library copying and other activities
– undergo an investigation to determine whether the work still enjoys protection or has lapsed into the public domain due to notice or renewal defects
– occur as a result of valid permission from the copyright owner(s)
– constitute an acceptable risk for the institution in potential absence of “clear” resolution
MetaArchive Approach to Rights Issues
• Who will make decisions?
• How will we investigate current status of works that pre-date 1976 Act?
• Define “acceptable” legal risk individually or across institutions.
• Does “dark archive” lessen potential risks?
• How to identify owners and seek their permission?
MetaArchive: Beyond Rights Issues
• Rights of publicity
– Names, images, likeness
• Rights of privacy
– Potential for damage
• Common-law or state statutory protections
– Apply to most restrictive jurisdiction?
– Apply to widespread norm across jurisdictions?
MetaArchive: Beyond Rights Issues
• What is the practical meaning of “infringement” in the context of a “dark archive?”
• What kinds of limits currently exist in Sect. 108 that prevent or lessen its application to preservation efforts such as MetaArchive?
• How can the law address works that have no “owner” in a practical sense from which to acquire permission to preserve?
Contact Information MetaArchive of Southern Digital Culture
– Dwayne Buttler – [email protected]
– Martin Halbert – [email protected]
– Robert H. McDonald – [email protected]
– Gail McMillan – [email protected]
http://www.metaarchive.org