Courtney C. Mumma, MAS/MLIS
UBC – SLAIS Digital Records Forensics ARST 556B
May 15, 2013, Vancouver, BC
Introduction to
What is Archivematica?
● digital preservation/curation system
● designed to maintain standards-based, long-term access to collections of digital objects
● free and open-source (AGPLv3)
● supported by Artefactual Systems Inc.
digital preservation consulting, open-source software for archives and libraries
Who am I?
The Digital Preservation Problem:
1. Rapid technological change drives constant system upgrades,
migrations and retirement of legacy technologies.
2. Incompatible, obsolete, obscure or proprietary systems and file
formats.
3. Loss or damage to bitstreams due to the fragility of digital storage
media, system error, or human error.
4. The overwhelming volume of digital information objects created
daily, each with many possible copies and versions.
5. The lack or loss of adequate metadata describing digital
information objects.
6. Accidental or malicious content alteration.
7. Doubts about the reliability and integrity of electronic records
and the inability to vouch for their authenticity.
8. The complexity of digital information objects, which requires
preservation of their content, structure, context, presentation and
behaviour as intellectual entities as well as bitstreams.
9. The lack of formally recognized organizational responsibility,
resources and enterprise architecture components that facilitate
digital curation, preservation and long-term access.
[Diagram: carrying a digital object from now into the future. Labels by group:
– bitstream: stored, copied, protected
– storage media, packaging, storage device, storage driver, file system, error correction, operating system, application software, user interface, input / output devices
– compression, decryption, file format, character encoding, fonts, codec
– metadata: find, relate / bind, authenticate, contextualize
– content, context, structure, presentation, behaviour]
Accessible? Usable? Authentic?
Responsible? Architecture? Resources?
TRAC: Trustworthy Digital Repositories
ISO 16363:2012
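[Diagram: OAIS functional entities: Ingest, Data Management, Archival Storage, Access, Preservation Planning and Administration. The PRODUCER submits a SIP to Ingest, an AIP is kept in Archival Storage, and Access delivers a DIP to the CONSUMER. MANAGEMENT sits outside the archive.]
Open Archival Information System (OAIS) reference model (ISO-STD 14721)
https://www.archivematica.org/wiki/OAIS_Use_Cases
https://www.archivematica.org/wiki/OAIS_Activity_Diagrams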
What is Archivematica?
● allows users to process digital objects from ingest to access in conformance with the ISO-OAIS functional model
● creates high-quality, standards-compliant Archival Information Packages (AIP)
● provides an architecture for implementing preservation strategies
● provides a framework for evaluating and implementing format policies
[Diagram: Archivematica architecture. Labels:
– web-based dashboard (monitor and control), web server
– MCP server, micro-service processing clients
– fileshare with watched directories, each step ending in success or error
– digital curation micro-services: Python scripts, FOSS tools
– input: SIP or transfer of digital objects & metadata
– output: AIP, DIP]
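The diagram labels (micro-services, watched directory, success, error) suggest a chain in which each step either passes a package on or diverts it. The sketch below only illustrates that pattern under stated assumptions: the function names `assign_identifier`, `check_has_files` and `run_chain` are hypothetical and are not taken from Archivematica's code base.

```python
from typing import Callable, Dict, List, Tuple

# A micro-service here is any function that inspects or changes a package
# (a plain dict) and reports success (True) or error (False).
MicroService = Callable[[Dict], bool]


def assign_identifier(package: Dict) -> bool:
    """Hypothetical step: tag the package with a placeholder identifier."""
    package["uuid"] = "example-uuid"
    return True


def check_has_files(package: Dict) -> bool:
    """Hypothetical step: fail when the package contains no files."""
    return len(package.get("files", [])) > 0


def run_chain(package: Dict, services: List[MicroService]) -> List[Tuple[str, str]]:
    """Run each micro-service in order and stop at the first error."""
    log = []
    for service in services:
        try:
            ok = bool(service(package))
        except Exception:
            # An unexpected exception is treated the same as a reported error.
            ok = False
        log.append((service.__name__, "success" if ok else "error"))
        if not ok:
            break
    return log


if __name__ == "__main__":
    for step, outcome in run_chain({"files": ["letter.txt"]},
                                   [assign_identifier, check_has_files]):
        print(step, outcome)
```

Stopping at the first error keeps a failed package from reaching later steps, which is one plausible reading of the separate success and error paths in the diagram.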
The METS file
<dmdSec> (descriptive metadata)
    Dublin Core XML
    EAD XML
    MODS XML
    [whatever] XML
<amdSec> (administrative metadata)
    <techMD> PREMIS: object
    <digiprovMD> PREMIS: events, PREMIS: agents
    <rightsMD> PREMIS: rights
<fileSec> (a list of the files and their roles and relationships)
<structMap> (a representation of the physical structure of the AIP)
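To make the section layout concrete, here is a minimal sketch that builds an empty METS skeleton with Python's standard `xml.etree.ElementTree`. It assumes the standard METS namespace and uses made-up `ID` values; the helper names `q`, `build_mets_skeleton` and `section_names` are illustrative, and the output is an outline, not a complete or validated Archivematica METS file.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag: str) -> str:
    """Return the namespace-qualified name ElementTree expects."""
    return f"{{{METS_NS}}}{tag}"


def build_mets_skeleton() -> ET.Element:
    """Build the four top-level sections listed on the slide, left empty."""
    mets = ET.Element(q("mets"))
    # Descriptive metadata (Dublin Core, EAD, MODS, ...) would be wrapped here.
    ET.SubElement(mets, q("dmdSec"), {"ID": "dmdSec_1"})
    # Administrative metadata; the METS schema orders these children as
    # techMD, rightsMD, digiprovMD.
    amd = ET.SubElement(mets, q("amdSec"), {"ID": "amdSec_1"})
    ET.SubElement(amd, q("techMD"), {"ID": "techMD_1"})          # PREMIS object
    ET.SubElement(amd, q("rightsMD"), {"ID": "rightsMD_1"})      # PREMIS rights
    ET.SubElement(amd, q("digiprovMD"), {"ID": "digiprovMD_1"})  # PREMIS events, agents
    # File inventory and the physical structure of the AIP.
    ET.SubElement(mets, q("fileSec"))
    ET.SubElement(mets, q("structMap"), {"TYPE": "physical"})
    return mets


def section_names(mets: ET.Element) -> list:
    """List the local names of the top-level sections, in document order."""
    return [child.tag.split("}")[1] for child in mets]


if __name__ == "__main__":
    print(ET.tostring(build_mets_skeleton(), encoding="unicode"))
```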
Preservation planning
● A two-pronged approach:
– Normalization on ingest
– Preservation of the original file to support future strategies such as migration and emulation
● Normalization relies on format policies based on an analysis of the significant characteristics of file formats
● A format policy indicates the actions, tools and settings to apply to a file of a particular file format (e.g. normalization to preservation and/or access format)
https://www.archivematica.org/preservation
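The two-pronged approach can be sketched as a lookup table: the original is always kept, and a format policy adds normalization actions for the formats it covers. Everything specific below is an assumption made for illustration, including the `PolicyRule` fields, the tool names and the settings; the entries are not Archivematica's actual default policies.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PolicyRule:
    purpose: str        # "preservation" or "access"
    target_format: str  # format to normalize to
    tool: str           # placeholder tool name, not a real command
    settings: str       # placeholder settings string


# Illustrative entries only; not Archivematica's actual defaults.
FORMAT_POLICIES: Dict[str, List[PolicyRule]] = {
    "JPEG": [
        PolicyRule("preservation", "TIFF", "example-image-converter", "uncompressed"),
        PolicyRule("access", "JPEG", "example-image-converter", "quality=85"),
    ],
    "WAV": [
        PolicyRule("access", "MP3", "example-audio-converter", "bitrate=192k"),
    ],
}


def plan_actions(filename: str, file_format: str) -> List[str]:
    """Return the planned actions for one file under the policies above."""
    # Prong 1: the original file is always preserved.
    actions = [f"keep original: {filename}"]
    # Prong 2: normalize according to the format policy, if one exists.
    for rule in FORMAT_POLICIES.get(file_format, []):
        actions.append(
            f"normalize for {rule.purpose}: {filename} -> {rule.target_format} "
            f"using {rule.tool} ({rule.settings})"
        )
    return actions


if __name__ == "__main__":
    for line in plan_actions("photo.jpg", "JPEG"):
        print(line)
```

A format with no policy entry still yields the "keep original" action, so unknown formats remain available for later migration or emulation.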
Archivematica format policies
● Criteria for selecting default formats:
– Non-proprietary
– Freely available specifications
– Widely used/endorsed by major repositories
– No compression/lossless compression
– Tools available to write and render the format
● Format policies will change as community standards, practices and tools evolve.
[Diagram: Format Policy Registry (FPR), shown with PRONOM, UDFR and a third, unnamed registry ("?"), and offering API and GUI interfaces.]
Systems Integration
● Application Programming Interfaces (API)
– Storage
– Ingest
– Access
● DSpace
● CONTENTdm
● ICA-AtoM
● Archivists' Toolkit
● LOCKSS
● Islandora
● Fedora
● Dataverse
Archivematica Clients / Partners
● 30 – 50 users worldwide
● Active discussion list, Twitter feed, website
● Courses, community participation
● Current Artefactual clients:
– UBC Library
– UofA Library
– SFU Library
– SFU Archives
– City of Vancouver Archives
– Rockefeller Archive Center
– International Monetary Fund Archives
– Columbia University Library
– Museum of Modern Art (MoMA)
– Yale University Library
[Diagram: roles in an open-source software community]
Foundation or Steering Committee
– Governance
– Coordination
– Funding
– Promotion
Users
– Lead institutions: Funding, Development
– All users: Bug reports, Enhancement requests, Code patches, Documentation, Promotion
Open Source Software
– Code
– Knowledge
– Community
Service Providers
– Development
– Technical Support
– Hosting
– Training
– Promotion
[Arrows between the groups are labelled with what is exchanged: Code, Time, Money, Knowledge.]
Free Beer!
“They’ll never take our freedom”
© 1995 Paramount Pictures & 20th Century Fox
See fair use rationale: http://en.wikipedia.org/wiki/File:Brave_mel.jpg
http://archivematica.org
Participate
➔ Collaborate with the Digital Preservation Community
➔ Library of Congress Tools Showcase: try out tools and provide feedback in open forums
➔ Download and provide feedback to BitCurator, digital forensics tools for archivists
➔ Leverage social media: follow digital archivists and leading institutions and initiatives on Twitter, read archives blogs
➔ Follow @Snarkivist @Archivematica