November 5, 2014
Hot Topics: The DuraSpace Community Webinar Series
Series Nine: “Early Advantage: Introducing New Fedora 4.0 Repositories”
Curated by David Wilcox, Fedora Product Manager, DuraSpace

11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Jun 19, 2015

Hot Topics: The DuraSpace Community Webinar Series
Series 9: Early Advantage: Introducing New Fedora 4.0 Repositories
Curated by David Wilcox, Fedora Product Manager, DuraSpace
“Fedora 4.0 in Action at Penn State and Stanford”
Wednesday, November 5, 1:00-2:00pm ET
Presented by:
David Wilcox, Fedora Product Manager, DuraSpace
Adam Wead, Developer, Pennsylvania State University and Tom Cramer, Chief Technology Strategist and Associate Director of Digital Library Systems and Services, Stanford University
Transcript
Page 1: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

November 5, 2014 Hot Topics: DuraSpace Community Webinar Series

Hot Topics: The DuraSpace Community Webinar Series

Series Nine:

“Early Advantage: Introducing New Fedora 4.0 Repositories”

Curated by David Wilcox, Fedora Product Manager, DuraSpace

Page 2: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Webinar 2: Fedora 4.0 in Action at Penn State and Stanford

Presented by:

Adam Wead, Developer, Pennsylvania State University

Tom Cramer, Chief Technology Strategist and Associate Director of Digital Library Systems and Services, Stanford University

Page 3: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Fedora 4.0 Status

• Wrapping up development this week

• Focus on testing and bug fixing

• Production release by end of year

• Next: F3 to F4 migrations

Page 4: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Beta Pilot Goals

• Test 4.0 features in a production-like environment

• Gather feedback for 4.0 release

• Demonstrate diverse use cases

• Encourage early adoption of Fedora 4

Page 5: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Beta Pilot Outcomes

• Outcomes reported on the wiki

• Feedback rolled into 4.0 release

• Panel at CNI Fall Meeting

• Next round of Beta Pilots: F3 to F4 migrations

Page 6: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Fedora 4 Beta Pilot

Adam Wead, Analyst and Programmer, Penn State University

[email protected] / @amsterdamos

Page 7: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Why do a beta pilot?

• currently use Fedora3 via Hydra

• Fedora is central to our mission to provide repository services at Penn State

• further community development of Hydra and related Fedora applications

• work with DuraSpace while development is still active

Page 8: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Fedora at Penn State

ScholarSphere

• the institutional repository at Penn State

• version 2 released in September

• 3 years in production

• 4775 objects / 37GB data

• comprises academic publications and research data from Penn State's faculty and students

Page 9: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Fedora at Penn State

ArchiveSphere

• archival collection management

• 72262 objects / 186GB data

• supports the efforts of the University’s archivists

Page 10: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Fedora at Penn State

ETDFlow

• electronic theses and dissertations

• supports submission, approval, and publication workflows

• assets are deposited into ScholarSphere upon publication

• forthcoming application still under development

Page 11: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Fedora at Penn State

Sufia

• core "engine" for all of Penn State's Hydra applications

• began as the original ScholarSphere and was extracted into a separate gem

• developed by the Hydra community

• enables intra-institutional use, development, and support

Page 12: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Why Fedora 4?

• proven track record

• vested interest: Sufia, ScholarSphere, et al.

• continued community development with Hydra

• new features!

Page 13: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Looking Forward to…

• native RDF support

• better support for large files

• clustering capabilities

• more flexible modeling of content and metadata

• fixity checking
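Note on fixity checking: the slide only names the feature, so the following is a minimal sketch of the general idea, assuming a digest recorded at ingest is later compared against a freshly computed one. The function names (`compute_digest`, `check_fixity`) and the choice of SHA-1 are illustrative assumptions, not Fedora 4's implementation or API.

```python
import hashlib
import io


def compute_digest(stream, algorithm="sha1", chunk_size=65536):
    """Hash a binary stream in chunks so large files need not fit in memory."""
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def check_fixity(stream, recorded_digest, algorithm="sha1"):
    """Return True when the content still matches the digest recorded earlier."""
    return compute_digest(stream, algorithm) == recorded_digest


# Hypothetical content standing in for a stored bitstream.
content = b"example bitstream"
recorded = compute_digest(io.BytesIO(content))

intact = check_fixity(io.BytesIO(content), recorded)
altered = check_fixity(io.BytesIO(content + b"!"), recorded)
print(intact, altered)
```

Reading in chunks is a design choice that fits the "better support for large files" item on the same slide: the check never loads the whole file at once.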

Page 14: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Pilot Goals

• content "remodeling"

• compatibility with ActiveFedora

• migration

Page 15: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Sufia Models: Fedora3

• RDF triples for all descriptive metadata

• must be stored as text file in a datastream

• Hydra handles the CRUD operations

• Fedora only sees a related datastream


Page 16: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Sufia Models: Fedora4

• native RDF for any object or resource

• no attached file of triples

• persisted in the Fedora object as RDF

• binary content and related files are child resources

• child resources can have RDF too, just like their parent objects
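Note on the two models: the contrast between the Fedora3 and Fedora4 slides can be sketched with plain data structures. This is an illustration under stated assumptions, not Sufia, ActiveFedora, or Fedora code: the class names (`DatastreamObject`, `Resource`), the identifiers, and the `descMetadata` datastream name are hypothetical, and Dublin Core terms are used only as sample predicates.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple

DC_TITLE = "http://purl.org/dc/terms/title"
DC_FORMAT = "http://purl.org/dc/terms/format"


@dataclass
class DatastreamObject:
    """Fedora3-style sketch: triples are serialized text inside a datastream."""
    pid: str
    datastreams: Dict[str, bytes] = field(default_factory=dict)


@dataclass
class Resource:
    """Fedora4-style sketch: properties sit on the resource itself,
    and binary content lives in child resources that carry their own RDF."""
    path: str
    properties: List[Tuple[str, str]] = field(default_factory=list)
    content: Optional[bytes] = None
    children: List["Resource"] = field(default_factory=list)

    def triples(self) -> Iterator[Tuple[str, str, str]]:
        """Yield this resource's triples, then those of its children."""
        for predicate, value in self.properties:
            yield (self.path, predicate, value)
        for child in self.children:
            yield from child.triples()


# Old shape: the repository only sees an opaque datastream of text.
serialized = '<info:fedora/demo:1> <%s> "Thesis" .\n' % DC_TITLE
old = DatastreamObject(pid="demo:1", datastreams={"descMetadata": serialized.encode("utf-8")})

# New shape: metadata is addressable on the parent and on the child file.
new = Resource(
    path="/demo/1",
    properties=[(DC_TITLE, "Thesis")],
    children=[
        Resource(
            path="/demo/1/content",
            properties=[(DC_FORMAT, "application/pdf")],
            content=b"%PDF-",
        )
    ],
)

for triple in new.triples():
    print(triple)
```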

Page 17: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

ActiveFedora

• integration point between Hydra and Fedora4

• code sprint underway to finish outstanding issues

• alpha release targeted for this month

• Sufia+Fedora4 work following concurrently

Page 18: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Migration

• currently have a working proof-of-concept

• uses Hydra stack component to move content

• waiting on ActiveFedora and Fedora4 release

• migration testing in early December

• deploy migrated content to production in January

Page 19: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Current Status

• working with DuraSpace for 4.0 release

• code sprinting

• communicating progress to the community

• always seeking feedback

• willing to share

Page 20: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Thank You

Adam Wead, Penn State University

[email protected] / @amsterdamos

Page 22: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Exercising Fedora as a Linked Data Repository

Introducing Triannon and Stanford’s Fedora 4 Beta Pilot

November 2014

Tom Cramer, Chief Technology Strategist, Stanford University Libraries

@tcramer

Page 23: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Use Case 1: Digital Manuscript Annotations

Parker on the Web

Page 24: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Use Case 1: Digital Manuscript Annotations

Parker on the Web

Image annotation & transcription tools

Page 25: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Use Case 1: Digital Manuscript Annotations

Image annotation & transcription tools

Parker on the Web

Open Annotation RDF (AKA Linked Data)

Page 26: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Use Case 1: Digital Manuscript Annotations

Image annotation & transcription tools

Parker on the Web

Open Annotation RDF (AKA Linked Data)

We have tens of thousands of scholarly annotations expressed as RDF triples, enriching digital resources in our repositories.

• Where can we store it?

• How can we manage it?

• How can we retrieve it for visualization in new environments?
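Note on what such an annotation might look like: the slides do not show one, so this is a hedged sketch that serializes a single comment as Turtle using terms from the Open Annotation vocabulary (`oa:Annotation`, `oa:hasBody`, `oa:hasTarget`, `oa:motivatedBy`) with an inline text body. The helper name `annotation_turtle` and the target URI are hypothetical, and the actual Parker on the Web annotations may be shaped differently.

```python
OA = "http://www.w3.org/ns/oa#"
CNT = "http://www.w3.org/2011/content#"


def annotation_turtle(target_uri: str, comment: str) -> str:
    """Serialize one commenting annotation with an inline text body as Turtle."""
    # Escape characters that would otherwise break a short Turtle string literal.
    escaped = (
        comment.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    lines = [
        f"@prefix oa: <{OA}> .",
        f"@prefix cnt: <{CNT}> .",
        "",
        "<> a oa:Annotation ;",
        "    oa:motivatedBy oa:commenting ;",
        f"    oa:hasTarget <{target_uri}> ;",
        "    oa:hasBody [",
        "        a cnt:ContentAsText ;",
        f'        cnt:chars "{escaped}"',
        "    ] .",
    ]
    return "\n".join(lines)


# Hypothetical target: one page image of a digitized manuscript.
turtle = annotation_turtle(
    "http://example.org/manuscripts/ms-1/page-1",
    'A "note" on this leaf',
)
print(turtle)
```

The empty subject `<>` leaves the annotation's own URI to whatever store receives it, which is one reason a repository that mints URIs is a natural fit for the storage question above.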

Page 27: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Use Case 2: Linked Data for Libraries (LD4L)

Bibliographic Data

• MARC

• MODS

• EAD

Person Data

• VIVO

• ORCID

• ISNI

• VIAF

Usage Data

• Circulation

• Citation

• Curation

• Exhibits

• Research Guides

• Syllabi

• Tags

LD4L

Page 28: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Use Case 2: Linked Data for Libraries (LD4L)

https://wiki.duraspace.org/display/ld4l/LD4L+Use+Cases

Use Case 1.1: Build a virtual collection

As a faculty member or librarian, I want to create a virtual collection or exhibit from multiple collections, so that I can share a focused collection with a <class, set of researchers, set of students in a disciplinary area>.

Use Case 1.2: Tag scholarly information resources to support reuse

As a librarian, I would like to be able to tag scholarly information resources into curated lists, so that I can feed these lists into subject guides, course reserves, or reference collections.

Page 29: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”
Page 30: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

• Circulation data

• Citation data

• Curation data

virtual collections, exhibits, reading lists, tags, etc.

• Require a store for RDF annos and body of annotation (any arbitrary bitstream)

• Need to persist, manage, index

• NOT the ILS nor core repository

• All RDF / linked data

Page 31: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

What should we use for these two use cases?

1.) Digital Manuscript annotations

2.) Linked data for libraries

Page 32: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

What should we use for these two use cases?

1.) Digital Manuscript annotations

2.) Linked data for libraries

Page 33: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

What should we use for these two use cases?

1.) Digital Manuscript annotations

2.) Linked data for libraries

• Native RDF store

• Manage assets (bitstreams)

• Built in service framework

• Versioning, indexing, APIs

• Easy to deploy

• Looking for real world use cases!

Page 34: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

A note about LDP: Linked Data Platform

• W3C draft specification

• Enables read-write operations of linked data via HTTP

• Developed at same time as Fedora 4

• Fedora 4 one of a handful of current LDP implementations

• See http://www.w3.org/TR/ldp/
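Note on "read-write operations of linked data via HTTP": a minimal sketch of the kind of requests an LDP client might build, using only Python's standard library. The requests are constructed but never sent. The base URL, the `annotations` slug, and the function names are hypothetical; the `Slug` header and the `Link: <...>; rel="type"` header follow the LDP specification as commonly described, and nothing here is confirmed to be what Triannon or any Stanford client actually sends.

```python
from urllib.request import Request

LDP_BASIC_CONTAINER = "http://www.w3.org/ns/ldp#BasicContainer"


def create_container_request(parent_url: str, slug: str) -> Request:
    """Build a POST asking the server to create a child container under parent_url."""
    return Request(
        parent_url,
        data=b"",
        method="POST",
        headers={
            "Content-Type": "text/turtle",
            "Slug": slug,  # a naming hint; the server may choose another name
            "Link": f'<{LDP_BASIC_CONTAINER}>; rel="type"',
        },
    )


def create_rdf_request(container_url: str, turtle: str) -> Request:
    """Build a POST that adds an RDF resource, sent as Turtle, to a container."""
    return Request(
        container_url,
        data=turtle.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "text/turtle"},
    )


def delete_request(resource_url: str) -> Request:
    """Build a DELETE for an existing resource."""
    return Request(resource_url, method="DELETE")


# Hypothetical endpoint; substitute the base URL of a real LDP server.
BASE = "http://localhost:8080/rest"

container_req = create_container_request(BASE, "annotations")
annotation_req = create_rdf_request(
    BASE + "/annotations",
    '<> <http://purl.org/dc/terms/title> "Sample annotation" .',
)
remove_req = delete_request(BASE + "/annotations/1")

for req in (container_req, annotation_req, remove_req):
    print(req.get_method(), req.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a server.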

Page 35: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Stanford Fedora 4 Beta Pilot

• Install, configure & deploy Fedora 4

• Exercise LDP API for storing annotations

  • and associated text/binary objects

• Develop support for RDF references to external objects

• Test scale with millions of small objects

• Integrate with read/write apps and operations

  • Annotation tools for write: e.g., Annotator

  • Indexing & Visualization for read: Solr & Blacklight, Mirador

• See https://wiki.duraspace.org/display/FF/Beta+Pilot+-+Stanford

Page 36: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Architecture and Data Flow

Page 37: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Architecture and Data Flow: Future

Page 38: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

What We’ve Learned To Date

• Fedora 4 approaching 100% LDP 1.0 Compliant

• Triannon at alpha stage

  • Can write, read & delete Open Annotations to/from Fedora 4

• Still to come

  • Updates to annotations

  • Storage of binary blobs (Annotation bodies) in Fedora 4

  • Implement authn/z

  • Deploy against real annotation clients

  • Populate with data at scale

• Work will continue throughout 2015

• Triannon: https://github.com/sul-dlss/triannon

• Mirador: https://github.com/IIIF/m2

Page 39: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Futures

• A wealth of tools…

• Enriching digital objects and records through…

  • Annotating

  • Tagging

  • Curating

• Stored as linked data natively

• Using Fedora 4 as a management platform

[Diagram labels: Mirador/Annotator; Image anno. tools; Blacklight-based apps; Any OA-compatible annotations; Open Annotation tools]

Page 40: 11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”

Webinar 2: Fedora 4.0 in Action at Penn State and Stanford

Questions