Designing a long term very large digital library
Stephen Green
Head of Digital Library Infrastructure
[email protected]
The design of a very large digital library presents many challenges
[Diagram: map of design challenges, centred on "Long term very large digital library". Markers 1 to 7 on the map number the topics taken up in the slides that follow. Nodes:]
• Low cost of ownership
• Large object-store
• Tape is not scalable
• Geographically diverse
• Self monitoring
• Assurance of authenticity
• Use robust digital signing
• Long term retention
• Availability of ingest / access
• Persistent identifier
• Usage and random access
• Geographically resilient
• Changing storage market
• Rolling procurement & replacement
• Heterogeneous store design
• Significant failure rate
• Local self healing
• Remote self healing
• Separate metadata store
• Re-signing over time
• Change vendors over time
• Trust separated from vendor
• Service priorities
• Provision of storage
• File format migration
Service priorities - we need to determine the service priorities so we can focus on the critical topics
Events ranked from most to least severe, on a scale running from disaster down to incident:
Loss of all objects in the store
Loss of some objects
Long term interruption to access
Long term interruption to ingest
Short term interruption to access
Short term interruption to ingest
• Retention is more important than high availability
• The store must be resilient
• An interruption to access is externally visible
• A long term interruption to access is not tenable
• This must be mitigated in the design of the system
• An interruption to ingest is less visible
• Hence a longer interruption can be tolerated
Geographically resilient - to tolerate the loss of a single computer room / facility, a multi site repository is needed to deliver service without interruption
One cannot obtain commercial DR for systems of multiple hundreds of TB
DR must thus be built into the system design
A single site is exposed to a common-mode disaster, cannot sustain availability, and so is not acceptable
Hence a multi-site solution is needed
Full service can be delivered from a remote site, albeit more slowly than from the local one
Remote delivery implies that each site does not need “maximum resilience” within that site, which is expensive to provide
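As a minimal sketch of the remote-delivery idea above, the following tries the local site first and falls back to a remote site when the local one is lost. The names `read_object`, `Site` and `lost_site` are illustrative assumptions, not part of any described system, and each site is modelled as a plain lookup function.

```python
from typing import Callable, Optional, Sequence

# Assumption: a site is modelled as a function from object id to bytes,
# returning None when that site holds no copy.
Site = Callable[[str], Optional[bytes]]


def read_object(object_id: str, sites: Sequence[Site]) -> bytes:
    """Try each site in order (local first) and return the first copy found."""
    for site in sites:
        try:
            data = site(object_id)
        except OSError:
            data = None  # a failed site is treated like a missing copy
        if data is not None:
            return data
    raise LookupError(f"{object_id} is unavailable at every site")


def lost_site(object_id: str) -> Optional[bytes]:
    """Hypothetical site whose computer room has been lost."""
    raise OSError("computer room unavailable")
```

Because any site can serve the request, no single site has to be built for maximum resilience; the cost is the slower remote path.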
[Diagram: challenge map highlighting Long term very large digital library, Large object-store, Geographically diverse, Long term retention, Availability of ingest / access, Geographically resilient and Service priorities (markers 1 and 2).]
DR: disaster recovery
Anticipated usage patterns lead to the conclusion that tape is not sufficiently scalable
Usage patterns: expect peaks with a “very long tail”
Usage metrics are likely to be in terms of:
Ingest in objects/sec & capacity/sec, growing over time
Access in relation to the size of the store, hence measured in objects/TB/sec and volume delivered/TB/sec
Access will need to be scalable & it will be random access
Tape storage is efficient when “restoring large file systems” but is not efficient for restoring individual items
Typically one tape robot can retrieve an item in ~40 secs and the maximum number of robots is typically ~10
The throughput of a tape library is thus limited by the number of robots, so tape does not scale for random access to a large library
Future costs of tape vs online storage are unclear, and tape also needs routine self checks
Exclusively using online storage is a “clean”, keep-it-simple approach and there appears little to be lost by adopting it
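The robot limit can be turned into a rough ceiling on random access throughput. This is a back-of-envelope sketch using the slide's figures (~40 secs per item, ~10 robots); it assumes every robot works in parallel on one item at a time with no queueing or mount contention, so real throughput would be lower. The function name is illustrative.

```python
def tape_random_access_rate(robots: int = 10, seconds_per_item: float = 40.0) -> float:
    """Upper bound on items retrieved per second, one item per robot at a time."""
    return robots / seconds_per_item


rate = tape_random_access_rate()
# Prints: 0.25 items/sec, 900 items/hour, 21600 items/day
print(f"{rate:.2f} items/sec, {rate * 3600:.0f} items/hour, {rate * 86400:.0f} items/day")
```

The ceiling is fixed by the number of robots and does not grow with the size of the store, whereas access demand is expected to grow with it (objects/TB/sec).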
[Diagram: challenge map highlighting Large object-store, Tape is not scalable, and Usage and random access (marker 3).]
HSM: hierarchical storage management (mixed on/offline storage)
Large scale storage is not intrinsically reliable and so monitoring and self healing are required
HP reviewed the long term reliability of content held by the Internet Archive
~1 in 1000 files changed during a three year period
Extrapolating to a collection of ~150 million digital items, ~150,000 items would change over 36 months, so ~4000 items would suffer corruption per month
While the reliability of storage is likely to improve, some “bit rot” is inevitable, and so we need to plan for it
We do not worry that Internet bearers are unreliable since we add detection / recovery (e.g. TCP over IP)
We need a similar approach for large scale storage: self monitoring / detection and recovery have to be automatic
Ideally this should be both at a local level (e.g. RAID 5/6) and also across storage sites
See http://arxiv.org/abs/cs.DL/0508130
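A minimal sketch of automatic detection and cross-site recovery, under loud assumptions: each site is modelled as a Python dict, the digest recorded at ingest is SHA-256, and the names `digest` and `scrub` are hypothetical. It is not a description of any particular product.

```python
import hashlib


def digest(data: bytes) -> str:
    """Digest recorded at ingest (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).hexdigest()


def scrub(local: dict, remote: dict, expected: dict) -> list:
    """Check each local copy against its recorded digest; repair from remote.

    Returns the ids that were repaired. Raises if no intact copy exists.
    """
    repaired = []
    for object_id, want in expected.items():
        have = local.get(object_id)
        if have is not None and digest(have) == want:
            continue  # local copy is intact
        candidate = remote.get(object_id)
        if candidate is None or digest(candidate) != want:
            raise RuntimeError(f"no intact copy of {object_id}")
        local[object_id] = candidate  # heal the local site from the remote copy
        repaired.append(object_id)
    return repaired
```

Run periodically over the whole store, a check of this kind plays the role that TCP plays over IP: the underlying layer is allowed to be unreliable because corruption is detected and corrected above it.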
[Diagram: challenge map highlighting Large object-store, Self monitoring, Significant failure rate, Local self healing and Remote self healing (marker 4).]
The changing landscape in the storage system market implies accommodating rolling procurement and heterogeneous storage systems
The market for storage systems is changing rapidly
Cost of storage is reducing by 30-40% per year
New innovative players emerge and some fade
These imply that supplier “lock-in” is not sensible
It is not realistic to assume there will be one supplier for the lifetime of the library, hence the need for flexibility to change supplier over time
Falling costs favour rolling procurement just ahead of demand
It is cost effective to replace on a rolling basis on expiry of warranty
Rolling replacement/procurement programmes imply the need to support heterogeneous storage systems
The design of the logical architecture thus needs to support storage sourced from multiple storage vendors
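The case for buying just ahead of demand can be illustrated with simple arithmetic. The sketch below compares buying all capacity up front with buying each year's growth at that year's price. The 35% annual price drop is the midpoint of the 30-40% range above; the growth figures and the starting price are hypothetical, and warranty replacement, power and staffing are ignored.

```python
def upfront_cost(growth_tb, price_now):
    """Buy every year's capacity today at today's price per TB."""
    return sum(growth_tb) * price_now


def rolling_cost(growth_tb, price_now, annual_drop=0.35):
    """Buy each year's growth in that year, at a price that falls annually."""
    return sum(
        tb * price_now * (1 - annual_drop) ** year
        for year, tb in enumerate(growth_tb)
    )


growth = [100, 100, 100, 100]  # hypothetical TB added in each of four years
print(upfront_cost(growth, 1000.0), round(rolling_cost(growth, 1000.0), 2))
```

With these illustrative numbers the rolling purchases cost a little under 60% of the up-front purchase, which is why procurement ends up spread over time and, in practice, over several vendors.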
[Diagram: challenge map highlighting Low cost of ownership, Changing storage market, Rolling procurement & replacement, Heterogeneous store design, Change vendors over time and Provision of storage (marker 5).]
To have a meaningful archive continuous assurance of authenticity is required from the time of ingest (part 1)
Assurance of authenticity:
With a physical item this can be based partly on examination of the item as well as its content
With a digital item, there is no physical item to examine
A long term library will also migrate through storage products and vendors over time
Stored digests and/or protection within a single system do not address handover between systems and so are not sufficient, for example:
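[Diagram: timeline running from ingest, through now, onwards in time, showing three successive storage systems and the handovers between them]
1st storage system
• May have measures to “protect” (no unauthorised changes)
• May store digests
Handover from 1st to 2nd system: not protected
2nd storage system
• May have measures to “protect”
• May store digests
Handover from 2nd to 3rd system: not protected
3rd storage system
• May have measures to “protect”
• May store digests
[Diagram: challenge map highlighting Assurance of authenticity, Use robust digital signing, Long term retention and Trust separated from vendor (marker 6).]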
To have a meaningful archive continuous assurance of authenticity is required from the time of ingest (part 2)
Only a chain of digital signatures of increasing strength will do:
• an object’s signature is re-evidenced periodically over time
• the signature chain is “transferred” when systems are refreshed
• “perpetuity” is provided by the signature chain, not by a system
The assurance (trust) is thus separate from the capabilities of any one vendor or storage system
The assurance relies on keeping the private signing key private
It is not sufficient to rely on “software” signing
The only trusted way to keep a private key private is to use an HSM*
This leads to the conclusion that if you do not use an HSM then you “do not have a meaningful long term archive”
But no current storage products use them
Signatures are ideally time stamped (RFC 3161)
Standards: FIPS 186-2 (Digital Signature Standard); FIPS 140-2 (Security Requirements for Cryptographic Modules)
* HSM - hardware security module, not hierarchical storage management
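The shape of a signature chain can be sketched as follows. This is an illustration under stated assumptions only: the `_sign` function uses HMAC from the Python standard library as a stand-in for a signing call into an HSM (HMAC is not a digital signature), one key is reused for every link although real keys would also be renewed, and no RFC 3161 time stamps are included. `Link`, `add_link` and `verify` are hypothetical names.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    algorithm: str      # hash algorithm used for this link
    object_digest: str  # digest of the object under that algorithm
    previous: str       # signature of the previous link ("" for the first)
    signature: str      # stand-in signature over the three fields above


def _payload(algorithm: str, object_digest: str, previous: str) -> bytes:
    return "|".join((algorithm, object_digest, previous)).encode()


def _sign(key: bytes, algorithm: str, message: bytes) -> str:
    # Stand-in for an HSM signing call; HMAC is NOT a digital signature.
    return hmac.new(key, message, algorithm).hexdigest()


def add_link(chain: list, data: bytes, key: bytes, algorithm: str) -> list:
    """Re-evidence the object with a (stronger) algorithm, covering the old link."""
    object_digest = hashlib.new(algorithm, data).hexdigest()
    previous = chain[-1].signature if chain else ""
    signature = _sign(key, algorithm, _payload(algorithm, object_digest, previous))
    return chain + [Link(algorithm, object_digest, previous, signature)]


def verify(chain: list, data: bytes, key: bytes) -> bool:
    """Walk the chain from ingest to now, checking every link against the data."""
    previous = ""
    for link in chain:
        object_digest = hashlib.new(link.algorithm, data).hexdigest()
        expected = _sign(key, link.algorithm,
                         _payload(link.algorithm, object_digest, previous))
        if (link.object_digest != object_digest or link.previous != previous
                or not hmac.compare_digest(link.signature, expected)):
            return False
        previous = link.signature
    return bool(chain)
```

The chain is plain data, so it can be exported and carried from one storage system to the next at handover; the trust sits in the chain and the protected key, not in whichever system currently holds the object.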
The use of robust digital signing requires a separate metadata store
A storage system can thus provide long term assurance of authenticity over a succession of vendors
It is able to “prove” a bit stream is identical to that ingested
This is based on holding objects in an “invariant” store
But one store cannot support both “no changes” & also “changes”
However, metadata can & may need to change over time to support versioning and successor objects
Live metadata that changes thus cannot be held in the same store
[Diagram: challenge map highlighting Use robust digital signing and Separate metadata store (marker 7).]
[Diagram: metadata management layered above the invariant store]
Metadata management
• Upper layer: versions and collections
• Lower layer: METS for each stored object
• Of conventional size (few TB)
• Hence conventional backup regimes can be applied
• Also export new and updated metadata for additional resilience
Invariant resilient storage system providing assurance of authenticity
Diagram labels: O1, V, O2, external persistent identifier, replacement by successor version
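One way to read the split between the two stores is sketched below: a write-once object store that never changes content, and a small mutable metadata store that maps an external persistent identifier to its versions. The class names, the use of SHA-256 digests as storage keys, and the identifier format are all assumptions made for illustration.

```python
import hashlib


class InvariantStore:
    """Write-once object store: content is keyed by digest and never changed."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self._objects.setdefault(key, data)  # an existing object is never overwritten
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


class MetadataStore:
    """Mutable store: maps a persistent identifier to its ordered versions."""

    def __init__(self):
        self._versions = {}

    def add_version(self, persistent_id: str, object_key: str) -> None:
        self._versions.setdefault(persistent_id, []).append(object_key)

    def current(self, persistent_id: str) -> str:
        return self._versions[persistent_id][-1]

    def history(self, persistent_id: str) -> list:
        return list(self._versions[persistent_id])
```

Replacing an object by a successor version only appends to the metadata store; the earlier object stays untouched in the invariant store, so its authenticity evidence remains valid.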
Summary - the design of a very large digital library presents many challenges
[Diagram: the map of design challenges from the start of the presentation, repeated as a summary.]
Annex - The needs for large scale archival are not well met by the storage market
High end market profile
Immense scale with corresponding price
Maximised performance with price premium
Single frame maximised resilience / availability with price premium
Proprietary software management
Low end market profile
Low cost without scalability
Large scale archival needs
Immense scale
Low total cost of ownership
Long term resilience – in practice requires distribution
Self healing
Strategy for future migration
Assurance of authenticity
Does not need
Maximised performance
Maximised resilience within single frame
100% availability
We received briefings from ~35 storage vendors
There were, and still are, two main clusters of storage systems:
Scalable but not affordable, or affordable but not scalable
Needs for large scale archival are not well met