COSUGI 2011, Phoenix, Arizona

Digital Preservation Best Practices: Lessons Learned From Across the Pond

Slavko Manojlovich, Associate University Librarian (IT) / Manager, Digital Archives Initiative, Memorial University, St. John's, Canada
and
Benoit Pauwels, Head, Library Automation Team, Université Libre de Bruxelles, Belgium
[with input from Michael J. Bennett, Digital Projects Librarian and Institutional Repository Coordinator, University of Connecticut]
Outline
– What is digital preservation?
– Best practices information resources
– Open Archival Information System (OAIS)
– Preservation Planning
– Digital Preservation in Action (Archivematica)
– Digital preservation @ ULB
– Our issues
What is digital preservation?
Digital preservation is NOT digitization!
Digital preservation is the series of actions and interventions required to ensure continued and reliable access to authentic digital objects for as long as they are deemed to be of value. This encompasses not just technical activities, but also all of the strategic and organisational considerations that relate to the survival and management of digital material.
What is digital preservation?
Recent example from Memorial University
– Preserve a faculty member’s research outputs from 1977 to the present, stored in a variety of formats:
  Access Databases
  Paper Files (14 filing cabinets)
  Excel Spreadsheets
  Progeny Files
  Cyrillic Files
  Photographic Slides
  JPEG Files of Testing Images
  PowerPoint Presentations
  Web Sites
  Researcher’s memory
“All of the above represents a vast resource which cannot be lost from the University”.
Best practices may not always be the best option for your organization:
– British Library Microsoft Live Book Data Project
The DPT [Digital Preservation Team] have taken the view that since the budget for hard drive storage for this project has already been allocated, it would be impractical to recommend a change in the specifics as far as file format is concerned for this project ... JPEG 2000 files compressed to 70 dB PSNR for the preservation copy.
– The National Gallery (UK) Preservation of Digital Photographs of the Collection
The National Gallery has photographed its entire collection using a high-end digital MARC camera capable of capturing and rendering colour accuracy which is at least 5 times better than traditional photography. They have selected the proprietary raw camera output format for long-term preservation because it supports an advanced level of colour management. The company supporting the camera and associated software is very small and is not a market leader.
Source: Site Visit to National Gallery Photography Department, April 2010.
– Digital Preservation – The Planets Way. London, UK / February 9, 2010. Source
– Digital Futures London 2010: From digitization to delivery. King’s Digital Consultancy Services (KDCS), King’s College, London, UK / April 19 – 23, 2010. Source
– Digital Preservation Management: Implementing Short-term Solutions for Long-term Problems. Cambridge, MA, USA / June 13-18, 2010. Note: Albany, New York / June 5-10, 2011. Source
Short digital preservation workshops are typically offered in conjunction with most digital preservation conferences.
“GPO’s world-class preservation repository [Fdsys] went live in March 2009. The repository was built upon the Open Archival Information System (OAIS) model and provides sufficient control to ensure long-term preservation and access.” Source
“The use of this reference model as the basis of any archive implementation is recommended as it allows practitioners to use common language and potentially common tools to address common problems.”
Tessella Technology & Consulting White Paper. Source
Monitor Technology
Cross-Platform Access Video Format
– 2005: wmv (Windows Media Video) format using Windows Media Player (or other players) for Windows and the Flip4Mac QuickTime extension for Macintosh.
– 2005 – 2009: swf (Adobe Flash) format with Adobe Flash plug-ins available for Windows and Macintosh browsers becomes the flavour of the day for web delivery of video content.
– Fast forward to April, 2010: mp4 (H.264) format with players/support for Windows, Macintosh and iPad.
– The iPad does not support wmv or swf video formats.
– Video conversion history: wmv → swf → mp4 from original DVD vobs.
– DVD vob files are being preserved with a goal of converting them to MXF Motion JPEG 2000 for long-term preservation.
– The British Library
– The National Library of the Netherlands
– Austrian National Library
– The Royal Library of Denmark
– State and University Library, Denmark
– The National Archives of the Netherlands
– The National Archives of England, Wales and the UK
– Swiss Federal Archives
– University of Cologne
– University of Freiburg
– HATII at the University of Glasgow
– Vienna University of Technology
– The Austrian Institute of Technology
– IBM Netherlands
– Microsoft Research Limited
– Tessella Plc
A preservation plan defines a series of preservation actions to be taken by a responsible institution due to an identified risk for a given set of digital objects or records (called collection).
The preservation plan takes into account the preservation policies, legal obligations, organisational and technical constraints, user requirements and preservation goals and describes the preservation context, the evaluated preservation strategies and the resulting decision for one strategy, including the reasoning for the decision.
It also specifies a series of steps or actions (called preservation action plan) along with responsibilities and rules and conditions for execution on the collection.
Provided that the actions and their deployment as well as the technical environment allow it, this action plan is an executable workflow definition.
British Library: 2 million newspaper pages stored as high-quality uncompressed TIFF-5. File size is 40 MB/page.
PLATO experiment compares the image quality and size of TIFF-5 images converted to JPEG 2000 lossless.
Experiment results: JPEG 2000 lossless image quality is as good as TIFF-5 uncompressed, and image file size is reduced by 25-30 percent. JPEG derivatives from TIFF-5 are as good as JPEG derivatives from JPEG 2000 lossless.
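As a rough worked example of what the reported reduction could mean at collection scale, the sketch below multiplies out the figures quoted on this slide (2 million pages, 40 MB per page, 25-30 percent reduction). Decimal units (1 TB = 1,000,000 MB) and the function name are assumptions made for illustration; the totals are derived arithmetic, not numbers reported by the experiment.

```python
def collection_size_tb(pages: int, mb_per_page: float, reduction: float = 0.0) -> float:
    """Return total storage in terabytes (decimal: 1 TB = 1,000,000 MB)."""
    return pages * mb_per_page * (1.0 - reduction) / 1_000_000


PAGES = 2_000_000   # newspaper pages, as quoted on the slide
MB_PER_PAGE = 40    # uncompressed TIFF size per page, as quoted on the slide

tiff_tb = collection_size_tb(PAGES, MB_PER_PAGE)
jp2_low_tb = collection_size_tb(PAGES, MB_PER_PAGE, reduction=0.30)
jp2_high_tb = collection_size_tb(PAGES, MB_PER_PAGE, reduction=0.25)

print(f"TIFF uncompressed:  {tiff_tb:.0f} TB")
print(f"JPEG 2000 lossless: {jp2_low_tb:.0f}-{jp2_high_tb:.0f} TB")
```

Under these assumptions the uncompressed masters come to about 80 TB and the lossless JPEG 2000 copies to roughly 56-60 TB.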
E-Prints: Integration of Bit-Level and Logical Preservation (New)
Source
– Upload Plato preservation plan to E-Prints
– Prescribed preservation plan action applied to each set of identified “at risk” classified files
– E-Prints creates provenance metadata for all preservation actions (i.e. File was migrated from “file format A” to “file format B” on this date according to preservation plan NNN).
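The provenance statement quoted above can be pictured as a small structured record. The following is a minimal sketch only: the field names, the file name and the date are hypothetical and do not reflect the schema E-Prints actually uses.

```python
from datetime import date


def migration_event(file_name, source_format, target_format, plan_id, when):
    """Build a provenance record for one migration (hypothetical field names)."""
    return {
        "file": file_name,
        "event_type": "migration",
        "source_format": source_format,
        "target_format": target_format,
        "preservation_plan": plan_id,
        "date": when.isoformat(),
        "note": (
            f"File was migrated from {source_format} to {target_format} "
            f"on {when.isoformat()} according to preservation plan {plan_id}."
        ),
    }


# Illustrative values only; "NNN" is the placeholder plan number from the slide.
event = migration_event("page-0001.tif", "TIFF", "JPEG 2000", "NNN", date(2011, 4, 1))
print(event["note"])
```

Keeping the machine-readable fields alongside the human-readable note lets later tools query by format or plan without parsing free text.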
6. Maintains/ensures the integrity, authenticity and usability of digital objects it holds over time.
7. Creates and maintains requisite metadata about actions taken on digital objects during preservation as well as about the relevant production, access support, and usage process contexts before preservation.
Archivematica http://archivematica.org is an open source software toolkit that takes the OAIS model and turns its various conceptual entities into actionable functionalities.
Take SIPs and turn them into AIPs and DIPs.
In v. 0.7 alpha this is accomplished through a Unix pipeline design which makes use of various open-source utilities to perform designated actions.
Open source software developed by Artefactual Systems (Vancouver, Canada)
Development partners include:
– UNESCO Memory of the World Programme
– International Monetary Fund
– Vancouver City Archives
– University of British Columbia
– University of Virginia (Rubymatica)
– Many alpha installations
Digital Curation Software Tools
PRONOM File Format Registry
PRONOM is a resource for anyone requiring impartial and definitive information about the 320+ file formats, software products and other technical components required to support long-term access to electronic records and other digital objects of cultural, historical or business value. It is maintained by The National Archives (UK). Source
Digital Curation Software Tools
FITS (Developed by Harvard University)
– The File Information Tool Set (FITS) identifies, validates, and extracts technical metadata for various file formats. It wraps several third-party open source tools, normalizes and consolidates their output, and reports any errors.
– Current tools are: Jhove, Exiftool, National Library of New Zealand Metadata Extractor, DROID, FFIdent, File Utility, Fileinfo and XMLMetadata.
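To illustrate the "normalize and consolidate" idea described above, here is a minimal sketch that merges the format names reported by several tools and flags disagreement. It is not FITS code: the input dictionary, the normalization rule and the status labels are all assumptions made for the example.

```python
from collections import Counter


def normalize(name):
    """Normalize a reported format name so trivial variants compare equal."""
    return name.strip().upper() if name else None


def consolidate(tool_reports):
    """Merge per-tool identifications into one result.

    tool_reports maps a tool name to the format it reported, or None when
    the tool could not identify the file.
    """
    normalized = {tool: normalize(fmt) for tool, fmt in tool_reports.items()}
    votes = Counter(fmt for fmt in normalized.values() if fmt)
    if not votes:
        return {"format": None, "status": "unknown", "tools": []}
    fmt, _ = votes.most_common(1)[0]
    agreeing = sorted(tool for tool, f in normalized.items() if f == fmt)
    status = "agreed" if len(votes) == 1 else "conflict"
    return {"format": fmt, "status": status, "tools": agreeing}


# Hypothetical tool outputs for a single file.
print(consolidate({"Jhove": "TIFF", "Exiftool": "tiff ", "DROID": "TIFF", "File Utility": None}))
print(consolidate({"Jhove": "JPEG", "Exiftool": "JPEG", "DROID": "JPEG 2000"}))
```

A "conflict" status is the interesting case for an archivist: it signals a file that deserves a manual look before it is accepted for preservation.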
Digital Curation Software Tools
FITS (Developed by Harvard University)
– Not every tool supports every digital file format, as illustrated in the latest FITS release notes:
  Improved support for audio formats
  Better identification of JP2 and JPx images
  Improved identification of EXIF and JFIF
Digital Curation Software Tools
FITS (DROID Tool – file identification)
– DROID (Digital Record Object Identification) uses internal and external signatures, maintained in the PRONOM technical registry, to identify and report the specific file format versions of digital files.
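The idea of an internal signature can be sketched with a handful of well-known leading byte sequences ("magic numbers"). This is an illustration of the principle only: the table below is a tiny hand-written sample, not the PRONOM registry, and unlike DROID it does not report format versions or use external signatures such as file extensions.

```python
# (leading bytes, format name) pairs; a small illustrative sample only.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
]


def identify(data: bytes) -> str:
    """Return the first format whose signature matches the start of data."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


print(identify(b"%PDF-1.4\n..."))
print(identify(b"II*\x00\x08\x00\x00\x00"))
print(identify(b"plain text"))
```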
Digital Curation Software Tools
FITS (JHOVE Tool – file identification, validation and characterization)
– File identification as per DROID
– File validation
A file is well-formed if it meets the purely syntactic requirements for a format. For example, a TIFF object is well-formed if it starts with an 8 byte header followed by a sequence of Image File Directories (IFDs), each composed of a 2 byte entry count and a series of 12 byte tagged entries.
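The TIFF example can be turned into a small structural check. The sketch below is not JHOVE's implementation; it only walks the header and the IFD chain, assuming the classic TIFF 6.0 layout (byte-order mark, magic number 42, 4 byte offset to the first IFD, 12 byte IFD entries, and a 4 byte offset to the next IFD after each directory). It does not inspect the tags themselves.

```python
import struct


def tiff_is_well_formed(data: bytes) -> bool:
    """Check the TIFF header and IFD chain for purely structural consistency."""
    if len(data) < 8:
        return False
    if data[:2] == b"II":
        endian = "<"   # little-endian
    elif data[:2] == b"MM":
        endian = ">"   # big-endian
    else:
        return False
    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42 or offset == 0:
        return False
    seen = set()
    while offset != 0:
        # Reject loops in the IFD chain and offsets past the end of the file.
        if offset in seen or offset + 2 > len(data):
            return False
        seen.add(offset)
        (count,) = struct.unpack(endian + "H", data[offset:offset + 2])
        end_of_entries = offset + 2 + count * 12
        if count == 0 or end_of_entries + 4 > len(data):
            return False
        (offset,) = struct.unpack(endian + "I", data[end_of_entries:end_of_entries + 4])
    return True


# A minimal synthetic file: header, one IFD with one (zeroed) entry, no next IFD.
sample = b"II" + struct.pack("<HI", 42, 8) + struct.pack("<H", 1) + bytes(12) + struct.pack("<I", 0)
print(tiff_is_well_formed(sample))        # structurally complete
print(tiff_is_well_formed(sample[:20]))   # truncated inside the IFD
```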
Digital Curation Software Tools
FITS (JHOVE Tool – file identification, validation and characterization)
– File validation (continued)
A well-formed file is also valid if it meets additional semantic level requirements. For example, an RGB file must have at least three sample values per pixel.
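The RGB rule is a semantic check on tag values rather than on byte layout. A minimal sketch, assuming the tags have already been parsed into a dictionary keyed by TIFF tag number (PhotometricInterpretation is tag 262, SamplesPerPixel is tag 277, and SamplesPerPixel defaults to 1 when absent); the function name is illustrative and this is not how JHOVE structures its checks.

```python
PHOTOMETRIC_INTERPRETATION = 262  # TIFF tag number
SAMPLES_PER_PIXEL = 277           # TIFF tag number
RGB = 2                           # PhotometricInterpretation value meaning RGB


def rgb_rule_holds(tags: dict) -> bool:
    """Return False only when an RGB image declares fewer than 3 samples per pixel."""
    if tags.get(PHOTOMETRIC_INTERPRETATION) != RGB:
        return True  # the rule does not apply to non-RGB images
    return tags.get(SAMPLES_PER_PIXEL, 1) >= 3


print(rgb_rule_holds({262: 2, 277: 3}))  # valid RGB
print(rgb_rule_holds({262: 2, 277: 1}))  # well-formed perhaps, but not valid
```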
A specification for the packaging of digital content for transfer. Content is packaged (the bag) along with a small amount of machine-readable text (the tag) to help automate the content's receipt, storage and retrieval. There is no software to install. A bag consists of a base directory containing the tag and a subdirectory that holds the content files. The tag is a simple text-file manifest, like a packing slip, that consists of two elements:
– An inventory of the content files in the bag
– A checksum for each file
BagIt: bag-info.txt
Source-organization: Simon Fraser University Library
Organization-URL: http://www.lib.sfu.ca
Bagging-Date: 2009-06-26
External-Description: TIFF master files and associated metadata for item 6-1999-06-07 in the SFU Editorial Cartoons Collection.
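The "packing slip" idea (an inventory of content files plus a checksum for each) can be sketched in a few lines. This is a simplified illustration, not a complete or validated BagIt implementation: the choice of SHA-256, the `data` subdirectory name, the manifest file name and both function names are assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def write_manifest(bag_dir: Path, content_dir: str = "data") -> Path:
    """Write one 'checksum  relative/path' line per content file."""
    lines = []
    for path in sorted((bag_dir / content_dir).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}")
    manifest = bag_dir / "manifest-sha256.txt"
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def verify_manifest(bag_dir: Path, manifest: Path) -> bool:
    """Recompute each checksum and compare it with the manifest entry."""
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line:
            continue
        digest, rel_path = line.split("  ", 1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    bag = Path(tmp) / "example_bag"
    (bag / "data").mkdir(parents=True)
    (bag / "data" / "master.tif").write_bytes(b"sample bytes, not a real TIFF")
    manifest = write_manifest(bag)
    print(manifest.read_text(encoding="utf-8"), end="")
    print("intact:", verify_manifest(bag, manifest))
```

On receipt, the same verification step tells the receiving institution whether every file arrived unchanged.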
DSpace 1.7 (New Features)
AIP Backup and Restore
– Outputs metadata and bitstreams into zipped self-contained Archival Information Packages which can be loaded into another instance of DSpace or another institutional repository platform (Fedora, CONTENTdm, etc.)
– DSpace AIPs can function as SIPs or DIPs.
– Possible to load Archivematica AIPs into
DSpace 1.7 (New Features)
Curation System
– Infrastructure to support the implementation of digital curation micro-services for the long-term preservation of your DSpace content.
– Initial services include:
  Bitstream format profiler: examines all the bitstreams and generates a count and support level for each type of bitstream format. Useful tool for format migration. Note: this does not identify or validate bitstreams.
  Required metadata: checks to see if required metadata is present in all records.
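What a format profiler reports can be illustrated with a simple count. The sketch below is not the DSpace curation task: the input list, the support-level table and the function name are hypothetical, and the formats are taken as already recorded rather than identified from the files.

```python
from collections import Counter


def profile_formats(bitstreams, support_levels):
    """Count bitstreams per recorded format and attach a support level.

    bitstreams is a list of (file name, recorded format) pairs.
    """
    counts = Counter(fmt for _, fmt in bitstreams)
    return {
        fmt: {"count": n, "support": support_levels.get(fmt, "unknown")}
        for fmt, n in sorted(counts.items())
    }


# Hypothetical repository contents and support levels.
bitstreams = [
    ("thesis.pdf", "Adobe PDF"),
    ("scan-01.tif", "TIFF"),
    ("scan-02.tif", "TIFF"),
    ("notes.wpd", "WordPerfect"),
]
support_levels = {"Adobe PDF": "supported", "TIFF": "supported", "WordPerfect": "known"}

for fmt, info in profile_formats(bitstreams, support_levels).items():
    print(f"{fmt}: {info['count']} ({info['support']})")
```

A profile like this points to the formats worth prioritizing when planning a migration.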
Boot your PC with the bootable Archivematica DVD.
– Login as: demo
– Password: demo
You see the File Manager
– Shortcuts
– Directories used through the archiving process
Imagine you’re an archivist and you have a set of object files sitting in demo/testFiles
– structured into a number of directories
– each directory corresponds to a logical unit of resources, be it a distinctive item or a complete fonds
– each directory in testFiles = one SIP
You could also drag/drop or copy/paste from a USB stick
Archivematica 0.7 Alpha Demo
Launch the dashboard and resize it so that it can be viewed as you navigate through the Archivematica processes.
– Firefox: uncheck File/Work Offline
Web-based administration for the archivist
– Tracks various stages of the archival process
In this demo setup of Archivematica, manual approval is required from the archivist at various stages in the process:
– we’ll have a look at the contents of the SIP, AIP and DIP at each of these stages
Archivematica 0.7 Alpha Demo
Archivematica-SIP
Folder structure, containing metadata, checksums, object files
– logs
– logs/fileMeta
– metadata: checksum and descriptive metadata
– objects: digital objects to be preserved
Content changes as the SIP is moved through the different stages of the archiving process
Demo SIP = ImagesSIP directory
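The folder layout listed above can be reproduced with a few lines of code when preparing SIPs outside the demo environment. A minimal sketch: the directory names come from the slide, while the function name and the use of a temporary directory are illustrative assumptions.

```python
import tempfile
from pathlib import Path

# Sub-directories of a SIP, as listed on the slide.
SIP_DIRS = ["logs", "logs/fileMeta", "metadata", "objects"]


def make_sip_skeleton(root: Path, name: str) -> Path:
    """Create an empty SIP directory structure under root and return its path."""
    sip = root / name
    for sub in SIP_DIRS:
        (sip / sub).mkdir(parents=True, exist_ok=True)
    return sip


with tempfile.TemporaryDirectory() as tmp:
    sip = make_sip_skeleton(Path(tmp), "ImagesSIP")
    for path in sorted(sip.rglob("*")):
        print(path.relative_to(sip).as_posix())
```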
Archivematica 0.7 Alpha Demo
Start the archival process
– Drag and drop the ImagesSIP directory into the receiveSIP watched directory
– Rename the SIP
The SIP appears in the Dashboard
Archivematica 0.7 Alpha Demo
First approval: appraise SIP for submission
click on Micro-Services to look at actions performed by Archivematica so far
– SIP backup, SIP compliant, assign UUIDs (package and object files), check delivered checksums (if any delivered)
click on Browse to see contents of SIP at this stage
– logs/fileUUIDs.log
– logs/fileMeta/*.xml
  for each object file: PREMIS-formatted metadata
  file name, uuid, sha256 hash
  events that occurred on the object file
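The per-file information named above (file name, uuid, sha256 hash) is straightforward to compute. The sketch below shows one way to do it with the Python standard library; it is not Archivematica's code, and the dictionary keys and function name are assumptions rather than PREMIS element names.

```python
import hashlib
import tempfile
import uuid
from pathlib import Path


def describe_object(path: Path) -> dict:
    """Assign a random UUID to a file and compute its SHA-256 checksum."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large master files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {
        "file_name": path.name,
        "uuid": str(uuid.uuid4()),
        "sha256": digest.hexdigest(),
    }


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "image-001.tif"
    sample.write_bytes(b"sample bytes, not a real TIFF")
    print(describe_object(sample))
```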
Archivematica 0.7 Alpha Demo
First approval: appraise SIP for submission
submitted SIP should be in accordance with institution’s submission agreements
delete any unwanted files or directories
– File Manager/appraiseSIPForSubmission
add descriptive metadata about the SIP in metadata/dublincore.xml
click on Approve
Archivematica 0.7 Alpha Demo
SIP quarantined
SIP is placed in quarantine for virus checking
Why quarantine?
– Give ClamAV a chance to pick up the latest version of its virus database
How long?
– demo: preset to one minute
– National Archives of Australia: 1 month
– archivist can manually remove SIP from quarantine
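The quarantine rule amounts to a simple time comparison with a manual override. A minimal sketch, assuming the quarantine start time is recorded per SIP; the function names are illustrative, and "1 month" is approximated here as 30 days.

```python
from datetime import datetime, timedelta


def release_time(quarantined_at: datetime, period: timedelta) -> datetime:
    """Earliest moment at which the SIP leaves quarantine automatically."""
    return quarantined_at + period


def may_leave_quarantine(quarantined_at, period, now, manual_release=False):
    """True once the period has elapsed, or when the archivist releases the SIP."""
    return manual_release or now >= release_time(quarantined_at, period)


DEMO_PERIOD = timedelta(minutes=1)   # demo preset from the slide
LONG_PERIOD = timedelta(days=30)     # "1 month", approximated as 30 days

start = datetime(2011, 4, 1, 12, 0, 0)  # illustrative timestamp
print(may_leave_quarantine(start, DEMO_PERIOD, start + timedelta(seconds=30)))
print(may_leave_quarantine(start, DEMO_PERIOD, start + timedelta(minutes=2)))
print(may_leave_quarantine(start, LONG_PERIOD, start + timedelta(days=1), manual_release=True))
```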
Archivematica 0.7 Alpha Demo
Second approval: appraise SIP for preservation
zipped/tarred/… files are extracted
check directory and file names
scan for viruses
using FITS:
– identify and validate format of object files
– extract technical metadata – PREMIS
Archivematica 0.7 Alpha Demo
Second approval: appraise SIP for preservation
logs/clamAVScan.txt: report on virus checking
logs/extraction.log: report on extracted zip
logs/fileMeta/*.xml: augmented PREMIS-formatted metadata
– format designation (PRONOM PUID identifier)
– events
– technical metadata
Archivematica 0.7 Alpha Demo
Second approval: appraise SIP for preservation
technical metadata: object characteristics
<fits_output> XML formatted metadata
– <fits/identification>
– <fits/fileinfo>
– <fits/filestatus>: well-formed / valid
– <fits/metadata>: technical metadata of object
– <fits/toolOutput>: output results of used tools
<structMap>: structure of the AIP
<fileSec>: list of files included in the AIP
<dmdSec>: descriptive metadata for the AIP (the dublincore.xml)
<amdSec>: administrative metadata
– <digiprovMD>: PREMIS-formatted digital provenance metadata
  most of it is grabbed from the logs/fileMeta files
  object identification and characteristics
  events
  agents
  relation between original and preservation copies
Archivematica 0.7 Alpha Demo
Third approval: push AIP to archival storage
If wanted, check the contents of the AIP: you are not able to make any changes to an AIP, though
click on Approve
AIP is pushed into archival storage
– our demo setup: the AIPsStore directory
– real life: cloud storage, Amazon S3, your own network storage device, CLOCKSS, …
Archivematica 0.7 Alpha Demo
Fourth approval: upload DIP to public access system
directory created for this DIP under uploadDIP
– objects: normalized access copies of the object files
– objectsBackup: idem
– METS.xml: identical to the one in the AIP
If wanted, check and change contents of the DIP
– File Manager / uploadDIP
click on Approve
– removed from SIPbackups
– copied to DIPbackups
– our demo setup: DIP is pushed towards an ICA-AtoM public access system
Archivematica 0.7 Alpha Demo
ICA-AtoM public access system
Fully web-based archival description application based on International Council on Archives standards
AtoM = Access to Memory
Point Firefox to http://localhost/ica-atom
Uploaded DIPs are by default in draft. Change status to ‘published’ for these to become visible in public access
– Log in: [email protected] / demo
– Choose from archival descriptions
– Edit: change publication status to ‘published’
– Log out
Selected archive is now publicly visible
Digital preservation @ ULB
Context: multiple digital archives
– DI-pot
  All academic output (except PhD theses)
  Most digital born / some digitized by library staff
  Self-submission by academic staff
  Extensively modified DSpace 1.4.2
  – Metadata granularity
  – Semi-automated metadata ingest from PubMed, Scopus, Web of Science, BibTeX and RIS files
  – Integrated with central administration databases
Component/Resource -- representation by value (XML)
Item[0..∞] (of type objectFile)
Component/Resource -- representation by ref. (URL)
Descriptor/modified
Descriptor/Identifier (persistent identifier)
Descriptor/modified
Descriptor/type (« objectFile »)
Descriptor/Identifier (persistent identifier)
Descriptor/modified
Item[0..1] (of type humanStartPage)
Component/Resource -- representation by ref. (URL)
Descriptor/type (« humanStartPage »)
Digital preservation @ ULB
One dissemination platform
– SAMBURU: harvest and index
  DIDL records are harvested from the digital archives
  DIDL record is stored as-is in MySQL database
  DIDL record is transformed into SOLR document and stored in Lucene indexes
– DI-fusion: web portal
  Based on VuFind
  Search/retrieve records through SOLR
  Use XSLT to transform DIDL into HTML
  Additional 2.0 functionality with AJAX technology
Digital preservation @ ULB
Samburu
[Architecture diagram. Labelled components: the digital archives DI-pot, BicTel, Icono, Digi and UMons; OAI-PMH links; Harvester; MySQL Metadata Store; Indexer; Lucene indexes; SOLR; DI-fusion web portal; Metadata Enrichment.]
Digital preservation @ ULB
Enrichment process
– Fetch DIDL records from SAMBURU md store + fetch object files (depending on enrichment type)
– Calculate enrichment and create DIDL-formatted enrichment record
– Make enrichment record available over OAI-PMH
– SAMBURU harvests and merges original DIDL record with enrichment DIDL record, before re-indexing into Lucene
– End user sees enrichment through DI-fusion
Digital preservation @ ULB
Enrichment: 3 prototype setups
1. Enrichment service at Erasmus University in Rotterdam fetches publications in economics from md store, and determines JEL classification codes based on text analysis
2. Enrichment service @ ULB extracts texts from PDFs and indexes on all words. DI-fusion permits end user to do a full-text search
3. Enrichment service @ ULB enriches with JCR impact factors (based on ISSN and publication year)
Digital preservation @ ULB
Back to digital preservation
– SUBMISSION: metadata and object files (through 4 submission interfaces)
– DISSEMINATION: through DI-fusion
– ARCHIVAL: we need a PAS: “Perpetual Archiving System” based on the idea of enrichment
Digital preservation @ ULB
Samburu
[Architecture diagram. Labelled components: the digital archives DI-pot, BicTel, Icono, Digi and UMons; OAI-PMH links; Harvester; MySQL Metadata Store; Indexer; Lucene indexes; SOLR; DI-fusion web portal; PAS with SIPs, AIPs, DIPs and Admin; LOCKSS.]
Digital preservation @ ULB
PAS-SIP
– Retrieve DIDL records over OAI-PMH from SAMBURU metadata store
– Fetch object files, based on references included in the DIDL record
– Make and store Archivematica-SIP
– Alternative to OAI-PMH + web grabbing:
  Prepare Archivematica-SIPs on a network-attached filesystem
  More practical for bulk ingest into AM: less network traffic
  We would probably try a combined approach: bulk + incremental
– Specific package information registered in PAS-Admin
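The first step, retrieving records over OAI-PMH, comes down to issuing a `ListRecords` request; adding a `from` date is what makes the incremental part of a combined bulk + incremental approach possible. The sketch below only builds the request URL. The base URL and the `didl` metadata prefix are assumptions for illustration; the real endpoint and prefix would come from the SAMBURU configuration.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    from_date (YYYY-MM-DD) restricts the harvest to records changed since
    that day; set_spec restricts it to one set.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


BASE_URL = "http://samburu.example.org/oai"  # hypothetical endpoint

print(list_records_url(BASE_URL, "didl"))                          # full (bulk) harvest
print(list_records_url(BASE_URL, "didl", from_date="2011-01-01"))  # incremental harvest
```

A complete harvester would also follow the `resumptionToken` returned with large result lists, which this sketch leaves out.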
Digital preservation @ ULB
PAS-AIP
– Use Archivematica micro-services to create and store Archivematica-AIP, according to media type preservation plan
– Fully automated, at least for certain media types (PDF, JPEG, TIFF)
– Update package information in PAS-Admin
Digital preservation @ ULB
PAS-DIP
– Use Archivematica micro-services to create and store Archivematica-DIP, according to media type preservation plan
– DIPped object files made available through web service
– Update package information in PAS-Admin
Digital preservation @ ULB
PAS-Admin
– Digital preservation status information for packages is accessible over a web service:
  Original digital archive wants to find out archival status of its items, based on gupi of item or object file
– End user accesses DIPped object files through web service: not publicly available since dependent on accessibility restrictions set by IPR owner in original digital archive
– AIPs are pushed into outer preservation space, e.g. LOCKSS + registered as such in PAS-Admin
Digital preservation @ ULB
PAS-Admin
– Throughout SIP/AIP/DIP processing, relevant information about the packages should be registered in a db
– For each SIP, AIP, DIP:
  (I) gupi of item and all object files
  uuid of package
  (I) identifier of original digital archive
  (I) date of creation/modification
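One way to picture the PAS-Admin registry is as two small tables: one row per package, and one row per gupi that a package covers. The sketch below uses SQLite purely for illustration. The table and column names are hypothetical, the sample identifiers are invented, and reading "(I)" as "indexed field" is an assumption.

```python
import sqlite3

SCHEMA = """
CREATE TABLE package (
    uuid       TEXT PRIMARY KEY,
    kind       TEXT NOT NULL CHECK (kind IN ('SIP', 'AIP', 'DIP')),
    archive_id TEXT NOT NULL,   -- identifier of original digital archive
    modified   TEXT NOT NULL    -- date of creation/modification
);
CREATE TABLE package_gupi (
    package_uuid TEXT NOT NULL REFERENCES package(uuid),
    gupi         TEXT NOT NULL  -- gupi of the item or of one object file
);
CREATE INDEX idx_gupi ON package_gupi(gupi);
CREATE INDEX idx_archive ON package(archive_id);
CREATE INDEX idx_modified ON package(modified);
"""


def packages_for_gupi(conn, gupi):
    """Return (kind, uuid) for every package that contains the given gupi."""
    rows = conn.execute(
        "SELECT p.kind, p.uuid FROM package p "
        "JOIN package_gupi g ON g.package_uuid = p.uuid "
        "WHERE g.gupi = ? ORDER BY p.kind",
        (gupi,),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO package VALUES (?, ?, ?, ?)", ("uuid-1", "SIP", "DI-pot", "2011-04-01"))
conn.execute("INSERT INTO package VALUES (?, ?, ?, ?)", ("uuid-2", "AIP", "DI-pot", "2011-04-02"))
conn.executemany(
    "INSERT INTO package_gupi VALUES (?, ?)",
    [("uuid-1", "gupi:example:1"), ("uuid-2", "gupi:example:1")],
)
print(packages_for_gupi(conn, "gupi:example:1"))
```

The lookup by gupi corresponds to the question an original digital archive would ask: what is the archival status of this item?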
Digital preservation @ ULB
PAS-Admin
– relevant metadata of DIPs are made available as DIDL-structured (enrichment) records over OAI-PMH for SAMBURU to pick up
  Parse/extract from METS.xml:
  – Essentially mime type and location
– sum of original metadata and PAS-created metadata is available to DI-fusion
– DI-fusion could for example decide to only show DIP version of an object file, and inform end user of the existence of the original object file format
Open Discussion
Alternative options for integrating Archivematica or a subset of digital curation micro-services into your digitization workflow.
Issues
Institutional repositories are also used to maintain an institution’s bibliography, with frequent updates of descriptive metadata and object files.
When should digital objects from an IR be preserved?
Issues
Dappert, A. & Enders, M. Using METS, PREMIS and MODS for archiving eJournals. D-Lib Magazine, Volume 14, Number 9/10. http://www.dlib.org/dlib/september08/dappert/09dappert.html
“AIP per generation”
– generation: change in md and/or object file
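Deciding when a new generation exists requires detecting a change in metadata and/or object files. One possible approach, sketched below, is to fingerprint both and compare with the fingerprint stored for the last AIP. This is an illustrative assumption, not the method described by Dappert and Enders, and the function names are hypothetical.

```python
import hashlib


def generation_fingerprint(metadata: bytes, object_files: dict) -> str:
    """Hash the metadata plus every object file (name and content)."""
    digest = hashlib.sha256()
    digest.update(hashlib.sha256(metadata).digest())
    for name in sorted(object_files):
        digest.update(name.encode("utf-8"))
        digest.update(hashlib.sha256(object_files[name]).digest())
    return digest.hexdigest()


def needs_new_aip(previous_fingerprint, metadata, object_files):
    """True when the metadata or any object file differs from the archived state."""
    return generation_fingerprint(metadata, object_files) != previous_fingerprint


archived = generation_fingerprint(b"<title>Draft</title>", {"paper.pdf": b"v1"})
print(needs_new_aip(archived, b"<title>Draft</title>", {"paper.pdf": b"v1"}))  # unchanged
print(needs_new_aip(archived, b"<title>Final</title>", {"paper.pdf": b"v1"}))  # metadata changed
```

For a frequently updated institutional repository, a check like this could also be batched so that many small edits produce one new generation rather than one each.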
Both Archivematica and LOCKSS are looking into solutions for the normalization of objects and packaging. At first sight the two systems seem redundant.
How does Archivematica interact with LOCKSS?
Issues
Archivematica-AIPs, DSpace-AIPs, exchange of packages between digital archives, nationwide preservation solution.
Need for interoperability standards?
– TIPR: Towards Interoperable Preservation Repositories
– RXP: Repository eXchange Package
AIP Repository Interoperability
“For reasons of redundancy, succession planning and software migration, repositories must be able to exchange copies of archival information packages with each other. Every different repository application, however, describes and structures its archival packages differently. Therefore each system produces dissemination packages that are rarely understandable or usable as submission packages by other repositories.”
AIP Repository Interoperability
One possible solution: RXP (Repository eXchange Package), developed by the Towards Interoperable Preservation Repositories (TIPR) project, which has defined a standards-based package of metadata files that can act as an intermediary information package, the RXP, a lingua franca all repositories can read and write.
Another option: create AIPs following the HathiTrust specification for digital objects.
and therefore only contain objects that comply to an open documented format. Any human being within 50 years should be able to re-read the contents of the object files, given a textual documentation.
So, why migrate AIPs into a new(er) format?
Issues
Archivematica normalizes moving pictures into MPEG2 = loss of quality
Lossless conversion would be Motion JPEG2000
However: no open-source CLI-based tool for conversion into Motion JPEG2000 format available
Issues
The more copies of a digital object are stored all over the place, the harder it becomes to control copyright.
Is geo-independent perpetual archiving in contradiction with IPR issues?
Issues
Packages are self-contained: if you find an AIP, you know what it is about, and you can read, view or listen to it. But how do you find the AIP in a sea of billions of AIPs?
Don’t forget to preserve finding aids! How?
Slavko Manojlovich
Associate University Librarian (IT)
Manager, Digital Archives Initiative
Memorial University of Newfoundland, St. John’s