DRS 2 Metadata Migration June 25, 2013
DRS 2 Metadata Migration June 25, 2013. Agenda Introduction Preliminary results - content analysis Metadata options Next steps Questions.

Dec 15, 2015

Cayden Bryars
DRS 2 Metadata Migration

June 25, 2013

Agenda

• Introduction
• Preliminary results - content analysis
• Metadata options
• Next steps
• Questions

INTRODUCTION

Reason for metadata migration

• Different data model
  – File -> Object (a coherent set of content that is considered a single intellectual unit for purposes of description, use and/or management: for example, a particular book, web harvest, serial or photograph)
• Different metadata schemas
  – Many locally-defined -> community-standard
• Different packaging of metadata
  – Use of METS in some cases -> consistent use of METS

Path to metadata migration

Analysis
• Metadata
• Content
• Users

Prototype
• Proof-of-concept
• Time estimates

Migration plan
• Sequence
• Schedule

Develop tools
• Dashboard
• Object builders

Metadata migration

We are here

Key feedback points

Path: Analysis -> Prototype -> Migration plan -> Develop tools -> Metadata migration

• Technical options
• Process options

Timing

Path: Analysis -> Prototype -> Migration plan -> Develop tools -> Metadata migration

Next 3 months

What does it involve?

• Aggregate DRS1 files into objects
  – Different object types = content models
• Generate an object descriptor per object

Document example

New object (content model = DOCUMENT)
• PDF file
• Descriptor file

Still image example

New object (content model = STILL IMAGE)
• Archival master image file
• Production master image file
• Deliverable image file
• Descriptor file

Aggregate DRS1 files into objects

• One content file per object
  – Color profile
  – Document
  – Google document container 1
  – Google document container 2
  – Google document container 3
  – Opaque container
  – Text

Aggregate DRS1 files into objects

• Multiple content files per object
  – Audio
  – Web harvest
  – Biomedical image
  – PDS document
  – Target image
  – MOA2
  – Still image
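
The aggregation step can be pictured with a minimal sketch. It assumes each DRS1 file record already carries a content model and some key that links related files; the `DrsFile`, `DrsObject`, and `group_key` names are hypothetical and are not taken from the DRS1 database.

```python
from dataclasses import dataclass, field


@dataclass
class DrsFile:
    file_id: str
    content_model: str  # e.g. "DOCUMENT" or "STILL IMAGE"
    group_key: str      # hypothetical: whatever links related DRS1 files


@dataclass
class DrsObject:
    content_model: str
    files: list = field(default_factory=list)


def aggregate(files):
    """Group files that share (content_model, group_key) into one object."""
    objects = {}
    for f in files:
        key = (f.content_model, f.group_key)
        obj = objects.setdefault(key, DrsObject(f.content_model))
        obj.files.append(f.file_id)
    return list(objects.values())


files = [
    DrsFile("pdf-1", "DOCUMENT", "pdf-1"),
    DrsFile("img-1-archival", "STILL IMAGE", "img-1"),
    DrsFile("img-1-production", "STILL IMAGE", "img-1"),
    DrsFile("img-1-deliverable", "STILL IMAGE", "img-1"),
]
objects = aggregate(files)
```

Here the single PDF becomes a one-file DOCUMENT object, while the three related image files become one STILL IMAGE object. How related files are actually identified in DRS1 is the hard part and is not shown.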

Generate object descriptors

• METS format
  – Embedded schemas (PREMIS, MODS, MIX, etc.)
• Metadata sources
  – DRS1 database
  – DRS1 METS files where they exist
  – Examining the content files
  – Catalog records?
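
As a rough illustration of a METS descriptor with embedded MODS, the sketch below builds a bare skeleton with Python's standard `xml.etree.ElementTree`. It is not the DRS2 descriptor profile: the `build_descriptor` function, the section IDs, and the sample values are assumptions, and the PREMIS and MIX sections are left out.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("mods", MODS_NS)


def mets(tag):
    return f"{{{METS_NS}}}{tag}"


def mods(tag):
    return f"{{{MODS_NS}}}{tag}"


def build_descriptor(label, title, file_ids):
    """Build a minimal METS skeleton (hypothetical layout, not schema-validated)."""
    root = ET.Element(mets("mets"), {"LABEL": label})

    # Descriptive metadata: MODS wrapped inside a dmdSec.
    dmd = ET.SubElement(root, mets("dmdSec"), {"ID": "dmd1"})
    wrap = ET.SubElement(dmd, mets("mdWrap"), {"MDTYPE": "MODS"})
    xml_data = ET.SubElement(wrap, mets("xmlData"))
    mods_root = ET.SubElement(xml_data, mods("mods"))
    title_info = ET.SubElement(mods_root, mods("titleInfo"))
    ET.SubElement(title_info, mods("title")).text = title

    # File inventory: one entry per content file in the object.
    file_sec = ET.SubElement(root, mets("fileSec"))
    file_grp = ET.SubElement(file_sec, mets("fileGrp"))
    for fid in file_ids:
        ET.SubElement(file_grp, mets("file"), {"ID": fid})

    # Structure: a single div pointing at every file.
    struct_map = ET.SubElement(root, mets("structMap"))
    div = ET.SubElement(struct_map, mets("div"), {"LABEL": label})
    for fid in file_ids:
        ET.SubElement(div, mets("fptr"), {"FILEID": fid})
    return root


descriptor = build_descriptor("Sample object", "Sample title", ["f1", "f2"])
xml_text = ET.tostring(descriptor, encoding="unicode")
```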

PRELIMINARY RESULTS: CONTENT ANALYSIS

Preliminary content analysis

• Conceptually “built” objects for 13/14 content models (~36 million / 44 million files)
  – All but still image
  – Order helps!

[Figure labels: Still Image, MOA2, Biomedical Image, PDS Document]

Preliminary content analysis

• 1,091,670 objects from 36,190,120 files
  – ~33 files per object
• Relatively few surprises, but content analysis is not complete
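
The files-per-object figure follows directly from the two counts on this slide. A quick check, using only those numbers (the variable names are ours):

```python
object_count = 1_091_670
file_count = 36_190_120

# Average number of DRS1 files aggregated into each conceptual object.
files_per_object = file_count / object_count
print(f"{files_per_object:.2f} files per object")
```

This is an average across 13 content models, so individual objects can be far from it: a DOCUMENT object holds one content file, while a PDS document can hold many.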

Content cleanup

• MOA2 files (8,024)
• Index maps (2,686)
• Entity files (1)
• Merged PDS descriptors (22,203)

Content cleanup

• Orphaned target image (5), target description files (4)

• Orphaned audio files (71)

METADATA OPTIONS

DRS1

FILE INFO
• e.g., billingCode, ownerCode, accessFlag, tech metadata, owner-suppliedName, role, purpose, quality, usageClass

DRS2

OBJECT INFO
• e.g., billingCode, ownerCode, owner-suppliedName

FILE INFO
• e.g., accessFlag, tech metadata, owner-suppliedName, role, processing, quality, usageClass

DESCRIPTOR

DRS1

FILE INFO
• e.g., billingCode, ownerCode, accessFlag, tech metadata, owner-suppliedName, role, purpose, quality, usageClass

METS
• Object Label
• MODS
• PDS info, etc.

DRS2

OBJECT INFO
• billingCode, ownerCode, owner-suppliedName, caption unit name, view text

FILE INFO
• accessFlag, tech metadata, owner-suppliedName, role, processing, quality, usageClass

DESCRIPTOR
• Object Label
• Object-level MODS

Objects

• Owner supplied name is required
• Need to generate during migration
• Four cases
  – A METS file exists
  – New object will be built from a single content file
  – New object will be built from multiple content files
  – No OSN (potential case)
• Proposal for most cases:
  – Add prefix or suffix to METS or content file owner supplied name
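
One way to read the prefix/suffix proposal is sketched below. The function name, the `_OBJECT` suffix, and the rule for choosing among several file names are all assumptions made for illustration; the slides do not specify them, and the no-OSN case is left unresolved here as it is in the proposal.

```python
def object_osn(mets_osn=None, file_osns=(), suffix="_OBJECT"):
    """Derive an object-level owner supplied name (OSN) from existing names.

    The suffix value and the tie-breaking rule are hypothetical.
    """
    if mets_osn:
        # Case 1: a METS file exists, so reuse its OSN.
        base = mets_osn
    elif len(file_osns) == 1:
        # Case 2: the object is built from a single content file.
        base = file_osns[0]
    elif file_osns:
        # Case 3: multiple content files. Picking the first name in sorted
        # order is an arbitrary placeholder, not a rule from the slides.
        base = sorted(file_osns)[0]
    else:
        # Case 4: no OSN available; no proposal is given for this case.
        return None
    return base + suffix


from_mets = object_osn(mets_osn="ms_am_1234")
from_single = object_osn(file_osns=["report_2012.pdf"])
from_many = object_osn(file_osns=["img_b.tif", "img_a.tif"])
missing = object_osn()
```

A suffix keeps the object name distinct from the name of the file it was derived from, which matters if owner supplied names must stay unique within an owner code.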

Objects

• Other required object elements
  – insertionDate
    • date of earliest file?
  – captionBehavior
    • for existing objects, set based on billing code
    • prospectively, set by depositor
  – viewText
    • available for all objects, not just PDS
    • default to off
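
A minimal sketch of how these three elements might be filled in for a migrated object, assuming the "earliest file" option is chosen for insertionDate. The billing code, the captionBehavior value, and the function name are placeholders; real codes and allowed values are not given in the slides.

```python
from datetime import date

# Hypothetical lookup: real billing codes and captionBehavior values
# are not listed in the slides.
CAPTION_BY_BILLING_CODE = {"EXAMPLE.BILLING.CODE": "example-behavior"}


def required_object_elements(file_insertion_dates, billing_code):
    """Fill in the required object-level elements for a migrated object."""
    return {
        # Open question in the slides; this takes the earliest file date.
        "insertionDate": min(file_insertion_dates),
        # For existing objects, set based on billing code.
        "captionBehavior": CAPTION_BY_BILLING_CODE.get(billing_code),
        # Available for all objects, defaulting to off.
        "viewText": False,
    }


elements = required_object_elements(
    [date(2009, 3, 2), date(2007, 11, 15)], "EXAMPLE.BILLING.CODE"
)
```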

Objects

• Descriptive metadata
  – Take MODS from existing METS as is or import new
    • From Aleph
    • From Finding Aid
  – If re-imported, update METS label or not?
  – Import from OLIVIA based on owner supplied name for the file?

Objects from existing METS

• Identifiers for Harvard metadata
  – Identify finding aid identifiers
  – Convert “Old HOLLIS” numbers
  – Aleph IDs: include check digit or not?
  – Convert to URIs or actionable URNs from plain IDs
    • Could DRS format such URIs for new DRS2 input?

Objects from existing METS

• PDS elements
  – PDF owner text becomes caption unit name
  – viewOcr function becomes viewText
  – goto function will be automatically determined by presence of structMap/div attributes
• Caption behavior
  – for existing objects, set by billing code

Files

• Run automated processes to identify, validate and characterize file technical characteristics

• Extract technical metadata

Files

• isFirstGenerationinDrs
  – Values: yes, no, unspecified
  – Should we supply “yes” for archival masters and/or top of derivation chain?

Image Files

• Converting from local scheme to MIX
• Local field questions
  – Methodology
  – History
  – Source
  – Enhancements

Text files

• Converting from local scheme to textMD
• Descriptor_type will be absorbed into different places in DRS2
• Extracted metadata can supply
  – markup_basis
  – markup_language for specific schemas
  – possibly other elements

Audio files

• Moving from local schema to AES57-2011: Audio object structures for preservation and restoration

Versioned metadata

• History will be tracked for key administrative elements:
  – Access flag
  – Admin flag (new)
  – Billing code
  – Owner code

• What values to assign for required creation date and agent for migrated content?
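
To make the creation date and agent question concrete, here is a small sketch of a versioned element in which every change records who made it and when. The class, the flag values `"R"` and `"P"`, and the `"migration"` agent are illustrative assumptions, not the DRS2 design.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class VersionedElement:
    """An administrative element that keeps every past value."""
    name: str
    history: list = field(default_factory=list)

    def set(self, value, agent, created):
        # Each version needs a creation date and an agent, which is why
        # migrated content needs some value for both.
        self.history.append({"value": value, "agent": agent, "created": created})

    @property
    def current(self):
        return self.history[-1]["value"] if self.history else None


access_flag = VersionedElement("accessFlag")
# Hypothetical choice: stamp migrated values with a "migration" agent.
access_flag.set("R", agent="migration", created=datetime(2013, 6, 25))
access_flag.set("P", agent="curator", created=datetime(2014, 1, 10))
```

Whether the first version should carry the migration date or the original DRS1 deposit date is exactly the open question on this slide.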

NEXT STEPS

Next steps

• Continue analysis and development of technical requirements

• Build prototype
• September check-in on progress
• Create metadata migration plan
• Open meeting to review plan

OPEN FOR QUESTIONS