ALL TEH METADATAS Re-revisited 2013 code{4}lib Meeting February 13, 2013 Esmé Cowles Matthew Critchlow Bradley Westbrook
Code4Lib 2013 - All THE Metadatas Re-Revisited

Jun 29, 2015

Last year Declan Fleming presented ALL TEH METADATAS and reviewed our UC San Diego Library Digital Asset Management system and RDF data model. You may be shocked to hear that all that metadata wasn't quite enough to handle increasingly complex digital library and research data in an elegant way. Our ad hoc, 8-year-old data model has also been extended inconsistently, and our librarians and developers have not always shared the same understanding of how it has evolved over time.

In this presentation we'll review our process of locking a team of librarians and developers in a room to figure out a new data model, from domain definition through building and testing an OWL ontology. We'll also cover the challenges we ran into, including the review of existing controlled vocabularies and ontologies, or lack thereof, and the decisions made to cover the gaps. Finally, we'll discuss how we engaged the digital library community for feedback and what we have to do next. We all know that Things Fall Apart; this is our attempt at Doing Better This Time.
Transcript
Page 1: Code4Lib 2013 - All THE Metadatas Re-Revisited

ALL TEH METADATAS

Re-revisited

2013 code{4}lib Meeting

February 13, 2013

Esmé Cowles

Matthew Critchlow

Bradley Westbrook

Page 2: Code4Lib 2013 - All THE Metadatas Re-Revisited

Overview

• Needs assessment and proposed solution

• Data modeling

• Tool implementation

Overview

• Needs Assessment

• Data Model Process

• Implementation

Page 3: Code4Lib 2013 - All THE Metadatas Re-Revisited

Needs Assessment

Brad Westbrook

Page 4: Code4Lib 2013 - All THE Metadatas Re-Revisited

Need One: More consistent data

Page 5: Code4Lib 2013 - All THE Metadatas Re-Revisited

Need Two: Maintain syntax of hierarchical subjects

Page 6: Code4Lib 2013 - All THE Metadatas Re-Revisited

Need Three: Improve support for complex objects

Page 7: Code4Lib 2013 - All THE Metadatas Re-Revisited

Improve support for complex objects-2

Page 8: Code4Lib 2013 - All THE Metadatas Re-Revisited

Need Four: Align more strongly with DL community

• Make sure UCSD RDF is public facing

– Use vocabularies in the public

– Make UCSD vocabularies public

• Develop technology stack

– Utilize contributions from non-UCSD sources

– Contribute to non-UCSD endeavors

Page 9: Code4Lib 2013 - All THE Metadatas Re-Revisited

Data Model Process

Matt Critchlow

Page 10: Code4Lib 2013 - All THE Metadatas Re-Revisited

Project Overview

Research Data Curation Pilot Deadline: June, 2013

Timeline: July 16, 2012 – Oct 29, 2012

Deliverables

• Abstract Data Model

• OWL/RDF Ontology

• Data Model Extension Guidelines

Team

Metadata Analyst: Arwen Hutt, Bradley Westbrook

IT: Esmé Cowles, Matt Critchlow, Longshou Situ

Page 11: Code4Lib 2013 - All THE Metadatas Re-Revisited

User Stories

As an administrative unit manager, I want to indicate any external versions or descriptions of an object that may be of probable importance to a user

As a user, I want to know what collection(s) an object belongs to

As a DAMS manager, I want to know what administrative unit an object belongs to

Page 12: Code4Lib 2013 - All THE Metadatas Re-Revisited

Abstract Model – High Level

Page 13: Code4Lib 2013 - All THE Metadatas Re-Revisited

Abstract Model

Collection

Object

Component

Relationship

Name

Role
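
The six entities on this slide can be pictured as plain record types. What follows is only a rough sketch: the slide names the entities but not how they connect, so every field below (a Relationship pairing a Name with a Role, Components nesting inside Components, an object listing its Collections) is an assumption made for illustration, and `DamsObject` and `count_components` are hypothetical names rather than anything from the UCSD ontology.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Name:
    label: str


@dataclass
class Role:
    label: str


@dataclass
class Relationship:
    # Assumption: a relationship pairs a name with the role it plays.
    name: Name
    role: Role


@dataclass
class Component:
    title: str
    # Assumption: components may nest, to describe complex objects.
    components: List["Component"] = field(default_factory=list)


@dataclass
class Collection:
    title: str


@dataclass
class DamsObject:
    # Hypothetical name for the "Object" entity (avoids shadowing Python's object).
    title: str
    collections: List[Collection] = field(default_factory=list)
    components: List[Component] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)


def count_components(parts: List[Component]) -> int:
    """Count components at every nesting level."""
    return sum(1 + count_components(part.components) for part in parts)
```

Nesting components inside components is one way to address the "complex objects" need from the needs assessment; a flat list with explicit ordering would be an equally plausible reading of the slide.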

Page 14: Code4Lib 2013 - All THE Metadatas Re-Revisited

Data Dictionary

Title (title 1-m)

Administrative Unit (unit 1)

Language (language 1-m)

Copyright (copyright 1)

Relationship (relationship 0-m)
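
The cardinalities in this data dictionary lend themselves to a mechanical check. Below is a minimal sketch, assuming that "1" means exactly one, "1-m" means one or more, and "0-m" means zero or more; the function name `check_cardinality` and the dictionary-of-lists record shape are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Cardinalities copied from the data dictionary slide as (minimum, maximum).
# None as the maximum stands for "m" (unbounded).
CARDINALITY: Dict[str, Tuple[int, Optional[int]]] = {
    "title": (1, None),
    "unit": (1, 1),
    "language": (1, None),
    "copyright": (1, 1),
    "relationship": (0, None),
}


def check_cardinality(record: Dict[str, list]) -> List[str]:
    """Return a list of cardinality violations; an empty list means valid."""
    problems = []
    for name, (low, high) in CARDINALITY.items():
        count = len(record.get(name, []))
        if count < low:
            problems.append(f"{name}: expected at least {low}, found {count}")
        elif high is not None and count > high:
            problems.append(f"{name}: expected at most {high}, found {count}")
    return problems


# A record with no title and two administrative units fails twice.
record = {"title": [], "unit": ["Unit A", "Unit B"],
          "language": ["en"], "copyright": ["Public domain"]}
for problem in check_cardinality(record):
    print(problem)
```

In an OWL ontology the same constraints would more likely be expressed as cardinality restrictions on properties; the Python form is just a compact way to read the table.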

Page 15: Code4Lib 2013 - All THE Metadatas Re-Revisited

Ontology

Page 16: Code4Lib 2013 - All THE Metadatas Re-Revisited

Thing 1, Thing 2

Page 17: Code4Lib 2013 - All THE Metadatas Re-Revisited

Thing 1, Thing 2

Page 18: Code4Lib 2013 - All THE Metadatas Re-Revisited

ImplementationEsmé Cowles

Page 19: Code4Lib 2013 - All THE Metadatas Re-Revisited
Page 20: Code4Lib 2013 - All THE Metadatas Re-Revisited

DAMS Repository

• New version of our lightweight repository

– Metadata in triplestore

– Files on disk or cloud storage

• Explicit structural metadata

• Native REST API

• Fedora REST API (partial)
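
To make "metadata in triplestore" concrete, here is a toy in-memory store that answers the user story "what collection(s) does an object belong to". It is a sketch only: `TinyTripleStore`, the `ex:` identifiers, and the `ex:inCollection` and `ex:inUnit` predicates are all hypothetical and do not reflect the DAMS Repository's actual storage, vocabulary, or REST API.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TinyTripleStore:
    """Toy in-memory triple store; a stand-in, not the DAMS Repository."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(self, subject: Optional[str] = None,
              predicate: Optional[str] = None,
              obj: Optional[str] = None) -> List[Triple]:
        """Return sorted triples matching the pattern; None is a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple for triple in self._triples
            if all(want is None or want == got
                   for want, got in zip(pattern, triple))
        )


store = TinyTripleStore()
# Hypothetical identifiers and predicates, for illustration only.
store.add("ex:object1", "ex:inCollection", "ex:collectionA")
store.add("ex:object1", "ex:inCollection", "ex:collectionB")
store.add("ex:object1", "ex:inUnit", "ex:unitX")
store.add("ex:object2", "ex:inCollection", "ex:collectionA")

# "As a user, I want to know what collection(s) an object belongs to."
collections = [o for _, _, o in store.match("ex:object1", "ex:inCollection")]
print(collections)  # ['ex:collectionA', 'ex:collectionB']
```

The same `match` call with only the object position filled in lists every member of a collection, which is the appeal of keeping structural metadata as triples: one pattern query serves both directions.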

Page 21: Code4Lib 2013 - All THE Metadatas Re-Revisited

DAMS Manager

• Separate Java webapp

• Ingest, batch operations

• Uses DAMS Repository REST API

• Functionality moved into the repository

– Characterization (JHove)

– Fixity checking

– Derivatives (ImageMagick)

Page 22: Code4Lib 2013 - All THE Metadatas Re-Revisited

DAMS Public Access System

• Old frontend is unsustainable

• New frontend in Hydra

– Backed by DAMS Repo, not Fedora

• Hydra platform and community

Page 23: Code4Lib 2013 - All THE Metadatas Re-Revisited

Timeline

• Started 2 months ago

• Code sprint in January with cbeer and jcoyne

• March: Beta release with research data

• Spring: Migrating existing content

• Summer: Production release

Page 24: Code4Lib 2013 - All THE Metadatas Re-Revisited

One More Thing

• We’ve talked about DAMS for years...

• Now we have code to share

http://github.com/ucsdlib/

@escowles @[email protected]