Smithsonian Institution Libraries “Metadata Mixing & Matching For Discovery” LSC 888 The Special Library/ Information Center Suzanne C. Pilsk ~ Smithsonian Institution Libraries ~ 2010
Page 1: Cua2010

Smithsonian Institution Libraries

“Metadata Mixing & Matching For Discovery”

LSC 888

The Special Library/ Information Center

Suzanne C. Pilsk ~ Smithsonian Institution Libraries ~ 2010

Page 2: Cua2010

Facts and Figures: Smithsonian Institution Libraries

Washington, D.C.

• Anacostia Museum & Center for African American History and Culture Library

• Anthropology Library

• Botany and Horticulture Library

• The Dibner Library of the History of Science and Technology

• Freer Gallery of Art and Arthur M. Sackler Gallery Library

• Hirshhorn Museum and Sculpture Garden Library

• Joseph F. Cullman 3rd Library of Natural History

Page 3: Cua2010

Facts and Figures: Smithsonian Institution Libraries

Washington, D.C. (continued)

• Museum Studies & Reference Library

• National Air and Space Museum Library

• National Museum of American History Library

• National Museum of Natural History Library

• National Postal Museum Library

• National Zoological Park Library

• Smithsonian American Art Museum/National Portrait Gallery Library

• Warren M. Robbins Library, National Museum of African Art

Page 4: Cua2010

Facts and Figures: Smithsonian Institution Libraries

Elsewhere

Suitland, Md.

• Museum Support Center Library

• National Museum of the American Indian Library

Edgewater, Md.

• Smithsonian Environmental Research Center Library

New York City

• Cooper-Hewitt, National Design Museum Library

Republic of Panama

• Smithsonian Tropical Research Institute Library

Page 5: Cua2010

Facts and Figures: Smithsonian Institution Libraries

African Art

African American History and Culture

Anthropology

American Art

American History

Asian and Middle Eastern Art

Aviation History and Space Flight

Design and Decorative Arts

Environmental Management and Ecology

History of Science and Technology

Latino History and Culture

Materials Research

Modern and Contemporary Art

Museology

Native American History and Culture

Natural History

Postal History

Tropical Biology

Trade Literature

World’s Fair Ephemera

Page 6: Cua2010

What’s So Special?

Public Museum

Smithsonian Institution is the largest museum complex in the world …

“The Nation’s Attic”

Page 7: Cua2010

“Increase and Diffusion of Knowledge”

Unlock the Mysteries of the Universe

Understanding and Sustaining a Biodiverse Planet

Valuing World Cultures

Understanding the American Experience

Page 8: Cua2010

SIL Mission (Smithsonian Directive 500)

As the largest and most diverse museum library in the world, SIL leads the Smithsonian in taking advantage of the opportunities of the digital society. SIL provides authoritative information and creates innovative services and programs for Smithsonian Institution researchers, scholars and curators, as well as the general public, to further their quest for knowledge. Through paper preservation and digital technologies, SIL ensures broad and enduring access to the Libraries’ collections for all users.

Page 9: Cua2010

SIL’s Strategic Plan “Focus on Service”

• GOAL 1: COLLABORATING ACROSS BOUNDARIES
– SIL creates a compelling environment for connecting, collaborating and exploring across disciplines and information boundaries

• GOAL 2: DISCOVERING INFORMATION
– SIL enhances and eases the discovery of information in our collections for SI scholars, researchers, scientists, and the larger world of learners

• GOAL 3: CONNECTING WITH USERS
– SIL understands and meets user needs, serving users where they live and work

• GOAL 4: BUILDING EXPERTISE
– SIL builds expertise on information discovery, navigation and management

• GOAL 5: ENABLING OUR MISSION
– SIL ensures its success through increased financial strength, effective administrative support, and organizational excellence

Page 10: Cua2010

Facts and Figures: Smithsonian Institution Libraries

Total volumes

> 1.7 million

50,000 are rare books

10,000 manuscripts

Trade Catalogs

> 500,000 items

> 30,000 companies

dating from the 1800s

Page 11: Cua2010

Facts and Figures

• 102 Smithsonian Libraries Staff

• 17 Souls in Cataloging Services (with contractors)

Page 12: Cua2010

• Traditional Library

• Traditional Services

Page 13: Cua2010

Integrated Library System

Smithsonian Institution Research Information System (SIRIS)

– MARC

– AACR2r

– ISBD

– LC Classification

– LC Subject Headings

Page 14: Cua2010

Traditional Cataloging

• Monographs

• Serials

• Videos

• Microfilm/fiche

• Sound Recordings

• CD/DVDs

• Electronic Resources

Page 15: Cua2010

Traditional Cataloging

• OCLC

• Program for Cooperative Cataloging

– NACO

– SACO

– BIBCO

Page 16: Cua2010

SI Libraries Serve

• Curators

• Researchers

• Post-Docs

• Museum Administrators

• Public

Page 17: Cua2010
Page 18: Cua2010

IFLA’s Functional Requirements for Bibliographic Records (FRBR)

To Find

To Identify

To Select

To Obtain

To USE

Page 19: Cua2010

Determining Level of Metadata

• What do you have?

• What staff do you have?

• Who are your users?

• Where will it go?

• Will it stay there or travel on and on and on and on and on and on and on and on

Page 20: Cua2010

Metadata

Page 21: Cua2010

Metadata – failure to serve

Page 22: Cua2010
Page 23: Cua2010

Metadata: MARC

MARC

110 Oscar Mayer & Co.

650 Frankfurters

Page 24: Cua2010

Metadata

Dublin Core

Creator:

Oscar Mayer & Co.

Subject:

Frankfurters
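The two slides above show the same two facts in MARC and in Dublin Core. A minimal sketch of that crosswalk follows, assuming only the two mappings shown on the slides (110 to Creator, 650 to Subject). The names `MARC_TO_DC` and `marc_to_dublin_core` are illustrative, and a real crosswalk covers many more tags and subfields.

```python
# Toy crosswalk limited to the two fields shown on the slides (an assumption,
# not a complete MARC-to-Dublin-Core mapping).
MARC_TO_DC = {
    "110": "Creator",  # corporate name main entry
    "650": "Subject",  # topical subject heading
}


def marc_to_dublin_core(fields):
    """Map (tag, value) pairs to a dict of Dublin Core element -> list of values."""
    record = {}
    for tag, value in fields:
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # tags outside this toy crosswalk are skipped
        record.setdefault(element, []).append(value)
    return record


marc_fields = [("110", "Oscar Mayer & Co."), ("650", "Frankfurters")]
dc = marc_to_dublin_core(marc_fields)
for element, values in dc.items():
    for value in values:
        print(f"{element}: {value}")
```

Keeping the mapping in a plain dictionary makes it easy to see what is lost or merged when a richer schema is flattened into a simpler one.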

Page 25: Cua2010

02761nam 2200469 4500001000700000005001700007008004100024010002300065019001300088035001400101035002300115040006100138049002700199050001500226100004200241245019300283260008300476300001700559504033500576505015400911590010901065590009601174650002601270945002101296945007301317945003101390945004801421945004801469945004701517945007901564945004401643945004601687945004801733945007601781945004401857945005101901945005101952945007102003945009002074945009602164945003102260-459797-20050131154400.0-731129m19021933enk b 000 0 lat c- ­aagr03000069 //r582- ­a14018362- ­aABY6485LB- ­a(OCoLC)ocm00751549- ­aU.S. Dept. of Agr. Libr.­cRIU­dOCL­dCHS­dSER­dSMI­dWaOLN- ­aSMI$­aSMIM­aSMIE­aSMIB-00­aQL354­b.S5-1 ­aOscar Mayer & Co.-10­aPronto pup:­bhot dogs hamburgers/­ca Oscar Mayer and Company.- ­aNew Orleans, La. :­bBourbon Street Foods,­c2000.

Metadata: Real MARC – Still failure to serve
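The long digit run at the start of the raw record is the MARC directory: after the leader, each 12-character entry gives a 3-character tag, a 4-digit field length, and a 5-digit starting position. A small sketch of reading it, using the first four entries copied from the slide; the function and class names are illustrative only.

```python
from typing import List, NamedTuple


class DirectoryEntry(NamedTuple):
    tag: str
    length: int
    start: int


def parse_directory(directory: str) -> List[DirectoryEntry]:
    """Split a MARC 21 directory into 12-character entries:
    3-character tag, 4-digit field length, 5-digit starting position."""
    if len(directory) % 12 != 0:
        raise ValueError("directory length must be a multiple of 12")
    entries = []
    for i in range(0, len(directory), 12):
        chunk = directory[i:i + 12]
        entries.append(DirectoryEntry(chunk[:3], int(chunk[3:7]), int(chunk[7:12])))
    return entries


# First four directory entries copied from the record on the slide.
directory = "001000700000005001700007008004100024010002300065"
entries = parse_directory(directory)
for entry in entries:
    print(entry.tag, entry.length, entry.start)
```

Each entry's start plus its length gives the next entry's start, which is how software finds field 010 without scanning the whole record.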

Page 26: Cua2010

Metadata: MARCXML

<?xml version="1.0" encoding="UTF-8" ?>
<collection xmlns="http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
<record>
<leader>02761nam a2200469 4500</leader>
<controlfield tag="001">459797</controlfield>
<controlfield tag="005">20050131154400.0</controlfield>
<controlfield tag="008">731129m19021933enk b 000 0 lat c</controlfield>
<datafield tag="010" ind1=" " ind2=" "><subfield code="a">agr03000069 //r582</subfield></datafield>
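Once the record is in MARCXML, any XML parser can read it. The sketch below uses Python's standard library on a shortened copy of the fragment above. The slide's fragment is cut off, so the closing tags are added here as an assumption to make the sample parse, and `read_record` is an illustrative name.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Shortened copy of the slide's fragment; closing tags added so it parses.
MARCXML = """<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <leader>02761nam a2200469 4500</leader>
    <controlfield tag="001">459797</controlfield>
    <controlfield tag="005">20050131154400.0</controlfield>
    <datafield tag="010" ind1=" " ind2=" ">
      <subfield code="a">agr03000069 //r582</subfield>
    </datafield>
  </record>
</collection>"""


def read_record(record):
    """Return (control fields by tag, list of (tag, subfield code, value))."""
    control = {cf.get("tag"): cf.text for cf in record.findall("marc:controlfield", NS)}
    data = [
        (df.get("tag"), sf.get("code"), sf.text)
        for df in record.findall("marc:datafield", NS)
        for sf in df.findall("marc:subfield", NS)
    ]
    return control, data


root = ET.fromstring(MARCXML.encode("utf-8"))
record = root.find("marc:record", NS)
control, data = read_record(record)
print(control)
print(data)
```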

Page 27: Cua2010

How to make THIS into 0’s and 1’s

Page 28: Cua2010

Virtual Library defined in the Online Dictionary for Library and Information Science

A "library without walls" in which the collections do not exist … [in] tangible form at a physical location but are electronically accessible in digital format via computer networks. The term digital library is more appropriate because virtual (borrowed from "virtual reality") suggests that the experience of using such a library is not the same as the "real" thing when in fact the experience of reading or viewing a document on a computer screen may be qualitatively different from reading the same publication in print, but the information content is the same regardless of format.

~ http://lu.com/odlis/odlis_v.cfm

Page 29: Cua2010

Digital Library defined in the Online Dictionary for Library and Information Science

A library in which a significant proportion of the resources are available in machine-readable format … . The digital content may be locally held or accessed remotely via computer networks. … In libraries, the process of digitization began with the catalog, moved to periodical indexes and abstracting services, then to periodicals and large reference works, and finally to book publishing.

~ http://lu.com/odlis/odlis_v.cfm

Page 30: Cua2010

Traditional Digital Library

• Electronic Journals & Databases

• Digital Editions

• Online Exhibitions

• Online Catalog

• Digital Reference

Page 31: Cua2010
Page 32: Cua2010

Will they find it?

If you digitize it …

Page 33: Cua2010

Search Gone BAD!

Page 34: Cua2010
Page 35: Cua2010

- Specimen

- Plate or other visual image

- Taxonomic description

Page 36: Cua2010
Page 37: Cua2010

Beyond the Traditional

Taxonomic Literature Needs/Requests

• Beyond the Scan

• Beyond the Re-Keyed

• Marking up the data in metadata schemas

Page 38: Cua2010
Page 39: Cua2010
Page 40: Cua2010

MARC

LCSH/LCCS

ISBD

Feed the cat

Pick up dry cleaning

Milk, eggs, lactaid

Make dentist appt.

AACR

Page 41: Cua2010

MARC

AACR

LCSH/LCCS

ISBD

Feed the cat

Pick up dry cleaning

MODS

XML

Dublin Core

ONIX

METS

TEI

FRBR

Access

Hierarchical

Faceted

relatedItem

Milk, eggs, lactaid

Add hotdogs to grocery list

Dewey

XMP

RDA

Page 42: Cua2010

Discoverable

Interoperability

Open Access

Feed the cat

Pick up dry cleaning

Milk, eggs, lactaid

Make dentist appt.

Collaboration

Page 43: Cua2010
Page 44: Cua2010

Biodiversity Heritage Library (BHL)

Page 45: Cua2010
Page 46: Cua2010

[Flowchart: selecting and preparing titles for scanning. Box and decision labels follow; the connecting arrows did not survive extraction.]

Sources of requests:

• EOL species need

• Curator request

• “Gap-fill” for other BHL library

• Goin’ down the rows (The Stacks)

Evaluate title: need is…

Serial?

Other library “bid”?

“Bid” on title, select in picklist

Pull from stacks

Circ in ILS

Preliminary metadata check and physical check

Metadata check (pass / fail)

Preservation review (pass / fail)

Circ to cataloging for MARC editing

Update picklist if item record has been changed during cataloging touch-up

Circ to scanner

Select title in picklist, upload to monograph de-duper

Duplicate? (yes / no)

Bibliographic data from SIRIS

Picklist database: stores select / reject / ship state and supplies item metadata to IA

Put on shipping cart, generate “packing list” invoice

Carts delivered to scanner

no: Reject in picklist, circ in Horizon, return to stacks

no: Reject in picklist, return to stacks

Page 47: Cua2010

[Flowchart: scanning, quality assurance, and return of data. Box and decision labels follow; the connecting arrows did not survive extraction.]

IA scanning process

• Unique IA id is assigned

• Metadata is gathered from SIRIS and the picklist db and associated with the scan

• JP2000s generated & transformed

• Served on archive.org

• QA is done by IA on 10%

Books are returned, cart contents are verified against invoice

SIL does 20% QA, checking for metadata matching with item, scan quality, etc.

Pass QA? (yes / no)

Updated in picklist as scanned

Circ in Horizon

Place BHL sticker near barcode

Return to stacks

BHL Portal periodically harvests Marc.xml (bib) and item records, along with JP2000 from archive.org, to index and display in the portal

Update picklist to indicate rescan

Put on shipping cart, generate “packing list” invoice, alert scanning center

Carts delivered to scanner

Download .csv from portal with SIL barcodes, Portal URLs

Send URLs to SIRIS Office for batch updates
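The last two boxes (download a .csv of SIL barcodes and Portal URLs, then send the URLs for batch updates) can be sketched as below. The column names `barcode` and `portal_url` and the sample row are placeholders, since the slides do not show the actual export layout.

```python
import csv
import io

# Hypothetical export: column names and the sample row are placeholders,
# not the portal's real layout.
SAMPLE_CSV = """barcode,portal_url
39088000000001,https://example.org/item/1
"""


def build_url_updates(csv_text):
    """Return {barcode: url} for rows that carry both values."""
    updates = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        barcode = (row.get("barcode") or "").strip()
        url = (row.get("portal_url") or "").strip()
        if barcode and url:
            updates[barcode] = url
    return updates


updates = build_url_updates(SAMPLE_CSV)
for barcode, url in updates.items():
    print(barcode, url)
```

Keying the updates on the item barcode is what lets a scanned copy be matched back to its catalog record in a batch job.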

Page 48: Cua2010

Ernest Ingersoll, Hand-book to the National Museum … Smithsonian Institution, 1886

Mass Scanning Workflow

BHL

• Bid Lists

• Serials Management

• Pick Lists

• Packing Lists

• Monographic Management

• Local data flow

• WonderFetch™

• Return of data

• Return of material

• Billing

Page 49: Cua2010

1. Select Book ~ Pull from Shelf

2. Review physical condition and metadata

3. Establish viability and create WonderFetch™

4. Send to IA scanning center

5. Book is scanned and QA’d

6. Page images loaded

7. Derivatives created

8. Book returned to library

9. Files harvested from IA portal to BHL

10. Taxonomic Intelligence Added

11. Available through BHL
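The eleven steps above form a strictly ordered pipeline, which can be modeled as an ordered enumeration. This is only an illustrative sketch: the stage names paraphrase the slide, and `next_stage` is a hypothetical helper, not part of any BHL system.

```python
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """Stages numbered as on the slide."""
    SELECT_AND_PULL = 1
    REVIEW = 2
    ESTABLISH_VIABILITY = 3
    SEND_TO_SCANNING = 4
    SCAN_AND_QA = 5
    LOAD_PAGE_IMAGES = 6
    CREATE_DERIVATIVES = 7
    RETURN_TO_LIBRARY = 8
    HARVEST_TO_BHL = 9
    ADD_TAXONOMIC_INTELLIGENCE = 10
    AVAILABLE_IN_BHL = 11


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the following stage, or None once the book is available in BHL."""
    if stage is Stage.AVAILABLE_IN_BHL:
        return None
    return Stage(stage + 1)


stage: Optional[Stage] = Stage.SELECT_AND_PULL
while stage is not None:
    print(stage.value, stage.name)
    stage = next_stage(stage)
```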

BHL

Page 50: Cua2010

Monographic DeDuper
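The slides do not say how the de-duper matches titles. One common approach, shown here purely as an assumption, is to compare normalized OCLC numbers such as the `(OCoLC)ocm00751549` value in the raw MARC record earlier. All function names are illustrative.

```python
import re


def normalize_oclc(raw):
    """Reduce forms such as '(OCoLC)ocm00751549' to the bare number '751549'."""
    digits = re.sub(r"\D", "", raw)
    return digits.lstrip("0")


def find_duplicates(candidates, already_scanned):
    """Return the candidates whose OCLC number matches an already scanned title."""
    seen = {normalize_oclc(number) for number in already_scanned}
    return [c for c in candidates if normalize_oclc(c) in seen]


candidates = ["(OCoLC)ocm00751549", "ocm00000123"]
already_scanned = ["751549"]
duplicates = find_duplicates(candidates, already_scanned)
print(duplicates)
```

Matching on an identifier is cheap but misses records that lack one, which is why real de-duplication usually adds title and date comparison as well.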

Page 51: Cua2010
Page 52: Cua2010
Page 53: Cua2010
Page 54: Cua2010
Page 55: Cua2010
Page 56: Cua2010
Page 57: Cua2010
Page 58: Cua2010

The BHL Portal is not a library catalog

Page 59: Cua2010
Page 60: Cua2010
Page 61: Cua2010
Page 62: Cua2010
Page 63: Cua2010
Page 64: Cua2010
Page 65: Cua2010
Page 66: Cua2010
Page 67: Cua2010
Page 68: Cua2010
Page 69: Cua2010
Page 70: Cua2010
Page 71: Cua2010
Page 72: Cua2010
Page 73: Cua2010
Page 74: Cua2010
Page 75: Cua2010
Page 76: Cua2010
Page 77: Cua2010
Page 78: Cua2010
Page 79: Cua2010

Collections.SI.edu ~ SI Libraries

842,000 Records in ILS

27,805 Trade literature

74,613 Art and Artists files

4,000 SI Digital Repository (SI Research Online)

Page 80: Cua2010

Not in Collections.Si.Edu

Page 81: Cua2010

Collections.SI.edu ~ Freer + Sackler

53% of the ENTIRE

collection at www.asia.si.edu

& collections.si.edu

12,269 objects online

NOT: F/S G’s Study Collection – 10,872 objects only for study not for exhibit – will never go online

Page 82: Cua2010

Collections.SI.edu ~ NPM

12,000 Records

Collections.si.edu

16,000 Records in the ARAGO

214,000 Records in the database

6 Million objects

= 0.2% in Collections.si.edu

Page 83: Cua2010

Collections.SI.edu ~ NMNH

NMNH estimates 126 Million Specimens

Page 84: Cua2010
Page 85: Cua2010

Collections.SI.edu ~ NMNH

NMNH estimates 126 Million Specimens

5,400,000 Catalog Records in collection management system –

5,218,793 available on collections.nmnh.si.edu (181,207 records not available)

Page 86: Cua2010

Collections.SI.edu ~ NMNH

Coming soon:

IZ 992,000 (68,000 with media)

Bot 788,000 (1,300 with media)

Page 87: Cua2010

Collections.SI.edu ~ NMNH

NMNH estimates 126 Million Specimens

5,400,000 Catalog Records in collection management system – 5,218,793 available on

collections.nmnh.si.edu (181,207 records not available)

6 out of 10 units supplying data to collections.si.edu = 2,527,557 records

(153,418 have images)

Page 88: Cua2010

Collections.SI.edu

More than 50% of the records are from 1 source

(NMNH, and still growing: 2,527,557 records with 153,418 images)

4,600,000 Records

445,000 Images

40 Data sources

Page 89: Cua2010

SI Wide Estimations

• 136.9 MILLION objects

• 13 MILLION digital records

• 821,000 digital images
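The coverage figures on the preceding slides reduce to simple ratios. The sketch below recomputes them from the numbers as given on the slides; `share` is an illustrative helper name.

```python
def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


figures = {
    "NPM records in collections.si.edu vs. objects": share(12_000, 6_000_000),
    "NMNH catalog records available online": share(5_218_793, 5_400_000),
    "NMNH share of collections.si.edu records": share(2_527_557, 4_600_000),
    "SI-wide digital records vs. objects": share(13_000_000, 136_900_000),
    "SI-wide digital images vs. objects": share(821_000, 136_900_000),
}

for label, percent in figures.items():
    print(f"{label}: {percent:.1f}%")
```

Rounded to one decimal place these come to 0.2%, 96.6%, 54.9%, 9.5%, and 0.6%.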

Page 90: Cua2010

“The worth and importance of the Institution is not to be estimated by what it accumulates within the walls of its building, but by what it sends forth to the world.”

~ Joseph Henry, the Smithsonian Institution’s First Secretary, 1852

Page 91: Cua2010
Page 92: Cua2010

Credits

Thanks to staff at

NMAI, NMNH, NPM, Freer/Sackler

SIL, MBL/WHOI Library, MoBot, NYBG

BHL

Page 93: Cua2010

Smithsonian Institution Libraries

“Metadata Mixing & Matching For Discovery”

Suzanne C. Pilsk Smithsonian Institution Libraries

[email protected]