Lorcan Dempsey (with contributions from colleagues) VP Research and Chief Strategist Library of Congress, 15 June 2004 OCLC: some development and research directions in the areas of metadata management and knowledge organization. Presented to Library of Congress cataloging managers retreat.
Lorcan Dempsey (with contributions from colleagues)
VP Research and Chief Strategist
Library of Congress, 15 June 2004
OCLC: some development and research directions in the areas of metadata management and knowledge organization.
Presented to Library of Congress cataloging managers retreat.
Topics
• Framework for WorldCat directions
• Metadata management and knowledge organization
• Working with web services
• Making data work harder
• Some research, some production
• Open WorldCat
Framework for WorldCat directions
• Schematization and web services
– Make data available in forms that allow machine services to be flexibly built on top of them
– Everything is a service
Open WorldCat
• Facilitate the rendezvous of users and library services on the web
• Surface the library where the users are
• Help release the value of library services in the working and learning lives of their users.
Open WorldCat Architecture
[Diagram labels: Aggregators; Schemas and Vocabularies; Profiles and Relationships; Content Owner; Portals; Metadata; Distribution, Search, Display; Access; Google, Yahoo and Book Vendors; Organization and Presentation]
OCLC organizes WorldCat content in a model suitable for harvesting, anticipating unique aspects of various portals
OCLC uses a host of authentication and authorization tools to progressively match content to rights
OCLC developed geo-locator services to match users to extensive FirstSearch WorldCat institution and user profiles
WorldCat: additional collections can be added to the Worldcatlibraries domain
OCLC will use tools such as xISBN and FRBR models to organize WorldCat public views suitable for low-precision access
Current partners
• Book vendors and bibliographies: ABE Books, ABAA, Alibris, HCBIB, BookPage
• Search engines (pilot with 2M records exposed as web pages for harvesting): Google, Yahoo!
Click in presentation mode to go through to examples
Try a search for: A history of caricature and grotesque in literature and art
• Work-based view incorporated into WorldCat in FirstSearch in late 2004
• FictionFinder
– 2.6+ million fiction records from WorldCat, clustered by OCLC’s FRBR algorithm
– Make greater use of data (genres, settings, imaginary characters, etc.)
• Participate in ongoing FRBR refinement
Click in presentation mode to go through to FictionFinder
FAST
Vocabulary mappings
Services
• Web services
– Computer-to-computer applications over the web
• Unplug and play
– Unbundling monolithic applications and making functionality available in more modular ways
• Reuse and sharing
– Of services!
• Release the value in a web environment of the historical library investment in vocabularies and structures
xISBN
• An experimental web service
– Leverages FRBRization work
– Give it an ISBN, it returns all related ISBNs
– Based on WorldCat
– Designed for machine-to-machine data exchange
• Examples:
– Check user ILL requests against all editions/versions in OPAC
– Find library’s editions when user finds any edition/version of item on Amazon
– Check OPAC for all editions during selection/acquisitions/gift book processing
– …
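The first example use, checking a request against every edition the library holds, might look roughly like the sketch below. It is an illustration only, not the xISBN service: the work sets, the placeholder ISBN strings, and the function names are all invented here, and a real client would obtain the related ISBNs from the web service rather than from a local table.

```python
# Hypothetical work sets: each set holds ISBNs treated as editions of one work.
# The values are placeholders, not real WorldCat data.
WORK_SETS = [
    {"0000000001", "0000000002", "0000000003"},
    {"0000000010", "0000000011"},
]

# Index every ISBN to its work set so a lookup is a single dictionary access.
ISBN_INDEX = {isbn: work for work in WORK_SETS for isbn in work}


def related_isbns(isbn):
    """Return all ISBNs in the same work set, including the one supplied.

    An ISBN that is not in any work set is returned on its own.
    """
    return sorted(ISBN_INDEX.get(isbn, {isbn}))


def held_editions(requested_isbn, opac_isbns):
    """Return the editions of the requested work that the library already holds."""
    return sorted(set(related_isbns(requested_isbn)) & set(opac_isbns))


if __name__ == "__main__":
    opac = {"0000000002", "0000000011"}
    # The user asked for edition ...01; the library holds edition ...02 of the same work.
    print(held_editions("0000000001", opac))
```

The same lookup would serve the other examples: the only thing that changes is where the starting ISBN comes from (an ILL request, a bookseller page, or a selection list).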
xISBN
Click cover to search amazon.co.uk
Click cover to search Seattle Public Library
Install FRBR Bookmarklets in your browser to see xISBN working. See the Bookmarklets page at www.oclc.org/research/researchworks/
• The XSLT “short path”
– Supports lightweight XML processing
– Designed for public access
– Deliverables: OAI repository of METS-captured xwalks [NEW]
• The “long path” option
– Designed for high-fidelity translations
– May be public or proprietary
– Deliverables: Toolkit; expertise in non-MARC formats
[Diagram: translation pipeline. File of records in format X → STRUCTURAL TRANSFORM (transform to intermediate form) → SEMANTIC TRANSLATION (translate input semantics to CORE) → CORE → SEMANTIC TRANSLATION (translate CORE to output semantics) → STRUCTURAL TRANSFORM (transform to output format Y) → File of records in format Y]
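The flow in the diagram, a structural transform and a semantic translation on each side of a shared CORE, could be sketched as below. Everything specific is an assumption made for illustration: the two toy record formats, the field names, the mapping tables, and the CORE element names are invented and do not come from the presentation.

```python
# Hypothetical mapping tables. Field and element names are illustrative only.
X_TO_CORE = {"ti": "title", "au": "creator"}
CORE_TO_Y = {"title": "Title", "creator": "Author"}


def parse_x(line):
    """Structural transform: turn 'ti=...|au=...' text into an intermediate dict."""
    return dict(pair.split("=", 1) for pair in line.split("|") if "=" in pair)


def to_core(record, mapping):
    """Semantic translation: rename input fields to CORE elements; drop unmapped ones."""
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def from_core(core, mapping):
    """Semantic translation: rename CORE elements to the output vocabulary."""
    return {mapping[k]: v for k, v in core.items() if k in mapping}


def serialize_y(record):
    """Structural transform: write format Y as one 'Name: value' pair per line."""
    return "\n".join(f"{k}: {v}" for k, v in record.items())


def crosswalk(line):
    """Run one record through the whole X -> CORE -> Y path."""
    return serialize_y(from_core(to_core(parse_x(line), X_TO_CORE), CORE_TO_Y))


if __name__ == "__main__":
    print(crosswalk("ti=An Example Title|au=Doe, Jane|zz=ignored"))
```

One consequence of routing through a shared CORE, at least in a sketch like this, is that supporting another format means writing one mapping to CORE and one from it, rather than a separate mapping for every pair of formats.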
A crosswalk as a METS record
• Describe the crosswalk object in the METS header.
• Assemble and identify six objects in the METS structural map:
– The source metadata schema
– The target metadata schema
– The crosswalk
– Human-readable and executable versions of each
• Associate metadata for each file in the METS Descriptive Metadata Section.
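As a rough illustration of the steps above, the sketch below assembles a skeletal METS document with six file entries (three components, each in a human-readable and an executable version) and points to them from the structural map. The component labels, file IDs, and example.org locations are placeholders invented here. The sketch also leaves out the Descriptive Metadata Section and has not been validated against the METS schema, so it shows the shape of the record rather than a conformant one.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)

# Three components, each in two versions, give the six objects.
COMPONENTS = ["source-schema", "target-schema", "crosswalk"]
VERSIONS = ["human-readable", "executable"]


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS}}}{tag}"


def build_crosswalk_mets(label):
    """Build a skeletal METS element tree for one crosswalk (placeholder locations)."""
    root = ET.Element(q("mets"), {"LABEL": label})
    ET.SubElement(root, q("metsHdr"))  # header, where the crosswalk object is described
    file_grp = ET.SubElement(ET.SubElement(root, q("fileSec")), q("fileGrp"))
    struct = ET.SubElement(root, q("structMap"))
    top = ET.SubElement(struct, q("div"), {"LABEL": label})
    for comp in COMPONENTS:
        div = ET.SubElement(top, q("div"), {"LABEL": comp})
        for ver in VERSIONS:
            file_id = f"{comp}-{ver}"
            f = ET.SubElement(file_grp, q("file"), {"ID": file_id})
            ET.SubElement(f, q("FLocat"), {
                "LOCTYPE": "URL",
                f"{{{XLINK}}}href": f"http://example.org/{file_id}",  # hypothetical URL
            })
            # The structural map refers to each file by its ID.
            ET.SubElement(div, q("fptr"), {"FILEID": file_id})
    return root


if __name__ == "__main__":
    print(ET.tostring(build_crosswalk_mets("Format X to format Y"), encoding="unicode"))
```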
Crosswalk METS record in OAI repository
What the METS encoding solves
• The semantic and syntactic information required for interpreting and executing a crosswalk is collected into a single object.
• The repository is searchable by humans and automated processes.
• Services can be built on top of it.
• It encourages the development and standardization of crosswalks.
These outcomes are possible because every component in the system is a standard.
Terminology Services
• Terminology services are web services for knowledge organization schemes (KOS)
– e.g., authority files, subject heading systems, thesauri, taxonomies, and classification schemes
• A web service that provides mappings from a term in one vocabulary to one or more terms in another vocabulary is an example of a terminology service
• An example: authority control service invoked from within DSpace
Click in presentation mode.
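The term-mapping kind of terminology service described above can be pictured as a lookup from a term in one vocabulary to zero or more terms in another. The sketch below is a minimal in-memory stand-in, not a real service: the vocabulary names, the terms, and the function name are invented for illustration, and a deployed service would answer the same question over the web.

```python
# Hypothetical mapping table between two made-up vocabularies.
# The terms are illustrative and not taken from any published mapping.
MAPPINGS = {
    ("vocab-a", "vocab-b"): {
        "Term A1": ["Term B1"],
        "Term A2": ["Term B2", "Term B3"],
    }
}


def map_term(term, source, target):
    """Return the target-vocabulary terms mapped from `term`, or an empty list."""
    table = MAPPINGS.get((source, target), {})
    return list(table.get(term, []))


if __name__ == "__main__":
    # One source term may map to several target terms.
    print(map_term("Term A2", "vocab-a", "vocab-b"))
```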
Working with web services
Making data work harder
Data mining
• Research
• Production
– Collection analysis service in development phase
– Leverages WorldCat data in interactive mode
Compare my collection to my peers
Compare my collection to my neighbors
Profile my collection by subject, by age, … etc
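A comparison such as "compare my collection to my peers" reduces, at its simplest, to set operations over title identifiers. The sketch below assumes each collection is just a set of identifiers; the identifiers, peer names, and function name are made up, and it is not a description of how the collection analysis service works.

```python
def compare_collections(mine, peers):
    """Compare one set of title identifiers against several peer collections.

    `mine` is a set of identifiers; `peers` maps a peer name to its set.
    """
    peer_union = set().union(*peers.values())
    return {
        # Titles held here and by none of the peers.
        "unique_to_me": sorted(mine - peer_union),
        # Titles held by at least one peer but not here.
        "gaps": sorted(peer_union - mine),
        # Number of shared titles, peer by peer.
        "overlap_by_peer": {name: len(mine & held) for name, held in peers.items()},
    }


if __name__ == "__main__":
    mine = {"t1", "t2", "t3"}
    peers = {"Peer A": {"t2", "t4"}, "Peer B": {"t3"}}
    print(compare_collections(mine, peers))
```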
Collection
• Change creates demand for better data.
• Growing interest in knowing more about:
– Characteristics
– Gaps and overlaps
– Use
• Tuning collections based on data.
• Focus collection spending where it creates most value.
Some projects
• Characteristics of collections
– WorldCat
– CIC
• Compare ILL, circulation and holdings data.
• Last copy: what is irreplaceable?
• ARL Global Resources
– Exploring coverage of overseas titles in ARL libraries.
• Depends on consistency, coverage, currency
Comparing CIC Collection Profiles
Audience level
[Chart: Profiles of ‘Letters’ & ‘Forge’ Example. Percent of holdings (0% to 80%) by library type (ARL, Academic, Public, School) for “Letters of …” and “Forge of Liberty”. Legend: Forge, Letters. Values shown: 0.81, 0.65.]
Topics
• Framework for WorldCat directions
• Metadata management and knowledge organization
• Working with web services
• Making data work harder
• Some research, some production
• Open WorldCat
Thoughts
• Machines will do more work
– Consistency becomes more important
• Variety
• Low precision
– Make data work
The pattern is new …
The knowledge imposes a pattern and falsifies
For the pattern is new in every moment
Further information
Thanks to colleagues in OCLC Research for contributions to this presentation. Further information about OCLC Research projects can be found at http://www.oclc.org/research/
Thanks to colleagues in OCLC Collection Management Services for contributions to this presentation. Further information about Open WorldCat at http://www.oclc.org/worldcat/pilot/