Mar 30, 2015


Makayla Reason
Transcript
Page 1:

Taming Metadata in the Wild West

Liz Bishoff, Colorado Digitization Program
Cheryl Walters, Utah State University
Chuck Thomas, Florida State University
Elizabeth S. Meagher, University of Denver

©2002 Colorado Digitization Program http://www.cdpheritage.org
©2002 Colorado Digitization Project http://coloradodigital.coalliance.org

Page 2:

Who, what, when, where: Western Trails Digital Standards

• Western Trails 2001 IMLS funded grant
– Multi-state initiative to create a collection of digital objects on the topic of Western Trails
– 23 participating institutions, creating 20,000 digital objects
– Each institution would host their own digital objects, create their own metadata, and use their own metadata standards and their own database or local system
– Each state would create a statewide database

Page 3:

Who, what, when, where (cont'd)

• Interoperability of the 4 state databases was through Z39.50, with a SiteSearch Web-Z interface
• Based on the CDP experience
– Crosswalks from various databases
– Reviewed the CDP Best Practices
– Agreed to utilize Dublin Core as the common format
• Involved more than just the 4 states
– Utah Academic Library Consortia, New Mexico, Arizona, Minnesota, Kansas, Nebraska, Colorado, Wyoming
– 18 representatives from archives, museums, and historical societies in these states met over a 9-month period to develop the document

Page 4:

Why Western States Best Practices?

• Improve user results/satisfaction
• Improve consistency across different cultural heritage institutions
• Enhance potential for creating union catalogs from multiple databases/ILS
• Provide guidance for cultural heritage institutions on use of Dublin Core for digital resources
• Support interoperability
• Support emerging standards (OAI)

Page 5:

Taming Metadata in the Wild West

Part 2: Writing the metadata guidelines

Page 6:

Western Trails Metadata Task Forces

- Descriptive Elements: Title, Contributor, Creator, Publisher, Subject, Language, Description, Source, Date.Original, Coverage, Relation
- Technical Elements: Date.Digital, Format.Creation, Type, Identifier, Format.Use, Rights Management, Holding.Institution

Page 7:

Getting It Done

• Set up two electronic discussion lists
• Used Colorado Digitization Project’s metadata guidelines as a base document
• Created initial working draft
– base document
– decisions made during WSDSG’s first meeting
– input from task force members
• Distributed draft to task force for revision (additions, deletions, rewrites, etc.)
• Changes discussed via email & incorporated
• Result taken back to entire Western States Digital Standards Group for review

Page 8:

Some concepts

• Define all terms used
• Avoid being library-centric
• Do not assume cataloging or metadata experience
• Provide lots of examples
• Provide links to related thesauri, standards, etc.

Page 9:

Problems, points of contention

• Figuring out exactly what data each Dublin Core element should contain. Not as easy as it sounds!
• Figuring out how to make guidelines flexible & comprehensive enough to fit a variety of situations, collaborative ventures & partners, for now and in the future.

Page 10:

The Coverage element

• Official DC definition: “The extent or scope of the content of the resource”
• What does that mean exactly?
• How does it differ from the Subject element?
• Does not mean date/place of publication
• Our description of this element: “Describes the spatial or temporal characteristics of the intellectual content of the resource”
– For art objects and artifacts, this could be the place where the object originated and the date or time period during which it was made.
– Currently recommended only for maps, etc. or when place or time period cannot be adequately described by Subject element.

Page 11:

Source versus Relation element

• Looking at California State Library’s Metadata Standards helped
– Source maps to MARC 534
  • Note about original version
– Relation maps to MARC 787
  • Note about a related title
• The lights came on for catalogers who wanted to provide similar MARC tag equivalents; voted down as too library-centric. There are other standards besides MARC!
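
The two mappings named on this slide can be pictured as a very small crosswalk table. What follows is a minimal sketch, not the California State Library's or the task force's actual tooling: the dict-based record layout, the function name `crosswalk`, and the sample note texts are all assumptions made for illustration.

```python
# Illustrative crosswalk for the two MARC tags named on the slide.
# The record layout (tag -> list of note strings) is an assumption.
MARC_TO_DC = {
    "534": "Source",    # note about the original version
    "787": "Relation",  # note about a related title
}

def crosswalk(marc_fields):
    """Map MARC tag/value pairs to Dublin Core element/value pairs.

    Tags without a mapping are returned separately so nothing is dropped silently.
    """
    dc_record = {}
    unmapped = {}
    for tag, values in marc_fields.items():
        element = MARC_TO_DC.get(tag)
        target = dc_record if element else unmapped
        target.setdefault(element or tag, []).extend(values)
    return dc_record, unmapped

# Hypothetical sample data, invented for this sketch.
sample = {
    "534": ["Original version: printed annual report."],
    "787": ["Related title: The Sanatorium."],
    "999": ["Local note."],
}
dc, leftover = crosswalk(sample)
print(dc)
print(leftover)
```

Returning the unmapped tags alongside the result is a design choice of this sketch: it makes visible what a crosswalk leaves behind, which matters when partners use standards other than MARC.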

Page 12:

Relation refinements explained

• Relation element has 12 possible refinements
• Meaning of each not always obvious
• Some of the differences not clear
– Relation.IsFormatOf versus Relation.HasFormat
• We provided DC’s explanation of the relationship between the resource and the object described in relation field.
• Also gave concrete example of each
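
One way to keep the 12 refinements straight is to treat them as six reciprocal pairs. The sketch below is an illustration only; the refinement names follow DCMI's qualified Dublin Core (where the counterpart of IsFormatOf is HasFormat), and the helper name `reciprocal` is an assumption.

```python
# The six reciprocal pairs that make up the 12 Relation refinements
# (names as in DCMI qualified Dublin Core).
INVERSE = {
    "IsVersionOf": "HasVersion",
    "IsReplacedBy": "Replaces",
    "IsRequiredBy": "Requires",
    "IsPartOf": "HasPart",
    "IsReferencedBy": "References",
    "IsFormatOf": "HasFormat",
}
# Add the reverse direction so lookups work from either side of a pair.
INVERSE.update({value: key for key, value in list(INVERSE.items())})

def reciprocal(refinement):
    """Return the refinement the related record would use to point back."""
    return INVERSE[refinement]

# A digital scan's record says IsFormatOf (pointing at the original);
# the original's record would say HasFormat (pointing at the scan).
print(reciprocal("IsFormatOf"))
```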

Page 13:

Publisher element

• Not straightforward when object is digitized version of previously published item
– Is it the digitizing institution?
– Is it the publisher of the original version?
• Our guideline explains
– “The Publisher element contains information about the digital publisher. Publisher information from earlier stages in an object’s publishing history may be listed in ... Source and Contributor.”

Page 14:

Date element

• What could possibly be confusing about “Date”?
– Date originally issued, published, made, or created?
– Date digitized?
– Date of an associated event?
• We created two new refinements to distinguish most important dates:
– Date.Original “Creation or modification dates for the original resource from which the digital object was derived or created.”
– Date.Digital “Date of creation or availability of the digital resource.”
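
To make the distinction concrete, here is a minimal sketch of a record that carries both refinements, with a format check for the date-only forms of W3C-DTF (YYYY, YYYY-MM, YYYY-MM-DD). The dict layout, the function name, and the sample values are assumptions for illustration; the guidelines themselves are not quoted here as requiring this check.

```python
import re

# W3C-DTF date-only forms: YYYY, YYYY-MM, YYYY-MM-DD.
# Time-of-day forms are omitted, and days are not checked against the month.
W3C_DTF_DATE = re.compile(r"^\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?$")

def is_w3c_dtf_date(value):
    """Return True when value looks like a W3C-DTF date without a time part."""
    return bool(W3C_DTF_DATE.match(value))

# Hypothetical record: one date for the original item, one for the digital file.
record = {
    "Date.Original": "1905",
    "Date.Digital": "2002-01-04",
}

for field, value in record.items():
    print(field, value, is_w3c_dtf_date(value))
```

A range such as "1905-1906" does not pass this check, which is one reason a project may want to decide in advance how date ranges are to be recorded.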

Page 15:

Enter initial articles in titles?

• Sounds innocuous but ... affects sorting for display or reports
– Do you want the title “The toupee worn by....” to sort by “Toupee” or “The”?
– MARC controls via indicators; other formats don’t have them
– If left out, creates a possible problem when migrating data into or out of a MARC format database.
– One person dryly commented: “there will probably be some sort of trouble no matter what we decide.”
• Our guidelines recommend omitting initial articles.
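
The sorting problem is easy to reproduce. The sketch below shows what a system without MARC-style indicators has to do if initial articles are kept in the data: strip them when building a sort key. The English-only article list, the function name `sort_key`, and the two extra titles are assumptions for illustration.

```python
# Hypothetical English-only list; a real collection would need per-language lists.
INITIAL_ARTICLES = ("the ", "a ", "an ")

def sort_key(title):
    """Lowercase the title and drop one leading article, if present."""
    lowered = title.lower()
    for article in INITIAL_ARTICLES:
        if lowered.startswith(article):
            return lowered[len(article):]
    return lowered

titles = ["The toupee worn by....", "Annual report", "Timber survey"]

print(sorted(titles))                # files the toupee title under "The"
print(sorted(titles, key=sort_key))  # files it under "toupee"
```

Omitting the article at data entry, as the guidelines recommend, avoids needing this logic in every system that displays or reports on the records.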

Page 16:

Making guidelines “one size fits all”

• Tried to encourage users to think about ramifications of their metadata decisions
• Reminded them to think about how data may be migrated and shared in future
• Listed lots of different thesauri & schemes to give users some choices
• Listed important info that metadata creators should include in record.
– Example: the Format.Creation field

Page 17:

Improving quality & detail of data

• Format.Creation field guidelines describe important technical data that users might want to include:
– File size, quality (bit depth, resolution), extent (playtime, etc.), compression, checksum value, operating system, creation hardware & software, etc.
• Format.Creation also gives links to resources with more info about terms and standards.
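
Two of the items listed above, file size and checksum value, can be gathered automatically. This is a minimal sketch under stated assumptions: the slide does not name a checksum algorithm, so SHA-256 is chosen here purely for illustration, and the function name and dict keys are hypothetical.

```python
import hashlib
import os
import tempfile

def technical_metadata(path):
    """Collect the file size and a SHA-256 checksum for one file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large image files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {
        "file_size_bytes": os.path.getsize(path),
        "checksum_sha256": digest.hexdigest(),
    }

# Demonstrate on a throwaway file standing in for a scanned image.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name
info = technical_metadata(demo_path)
os.remove(demo_path)
print(info)
```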

Page 18:

A rose by any other name...

• One user community’s “autograph album” might be another’s “libri amicorum”
• How can we accommodate many potential controlled vocabularies?
• Does the public need or want to know what vocabularies are used?

Page 19:

Flexible subject guidelines

– Allow and provide links to many different thesauri
– Separate out different subject/genre schemes
  • Example: Put all the Library of Congress subject headings in one field; put all the genre terms from Thesaurus for Graphic Materials in another.
– Identify thesauri used via scheme qualifier in field label, not mixed in with data in field itself, which is searchable.
  • Example: Label is Subject.MeSH so that “mesh” does not become a searchable term.
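
The effect of keeping the scheme name in the label can be shown with a toy index. This is a sketch only: the dict layout, the function names, and the `Subject.LCSH` label are assumptions (the slide names only `Subject.MeSH`), and real search systems tokenize far more carefully.

```python
def split_label(label):
    """Split a label such as 'Subject.MeSH' into element and scheme qualifier."""
    element, _, scheme = label.partition(".")
    return element, scheme or None

def searchable_terms(record):
    """Index only field values, never the labels that carry the scheme name."""
    terms = set()
    for values in record.values():
        for value in values:
            terms.update(word.lower() for word in value.replace("--", " ").split())
    return terms

# Hypothetical record with one field per vocabulary.
record = {
    "Subject.LCSH": ["Tuberculosis -- Patients -- Colorado"],
    "Subject.MeSH": ["Tuberculosis"],
}

print(split_label("Subject.MeSH"))
print(sorted(searchable_terms(record)))
```

Because the scheme lives in the label, a search for "mesh" matches nothing in this record, while the vocabulary used for each value can still be recovered from the label when needed.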

Page 20:

Taming Metadata in the Wild West

Part 3: Applications

Page 21:

Metadata Application Depends On:

• Information available about the artifact

• Expertise of the researcher

• Complexity of records

• Expertise of the cataloger

• Data entry system and display

Page 22:

MARC to Dublin Core – DCBuilder

Page 23:

Original Museum Record

Page 24:

Museum Record after CDP

Title: Acer florissanti
Creator:
Contributor:
Link: http://planning.nps.gov/flfo/tax3_Detail.cfm?ID=13484004 [Access] [URI]
Publisher: 1. Florissant Fossil Beds National Monument 2. National Park Service
Description: Plant (Angiosperm, Dicotyledon) Family: Aceraceae
Date Digital: 2000
Subject(s):
  Aceraceae -- Colorado
  Angiosperms, Fossil -- Colorado
  Dicotyledons, Fossil -- Colorado
  Florissant (Colo.)
Type: 1. image [DCMI Type Vocabulary] 2. text [DCMI Type Vocabulary]
Source: National Museum of Natural History, Smithsonian Institution USNM-333761
Languages: eng [ISO 639-2]
Relation: MacGinitie, D.D., Fossil Plants of the Florissant Beds, Colorado, Carnegie
Format Use: 1. image/jpeg [IMT] [medium] 2. text/html [IMT] [medium]
Rights: National Museum of Natural History, Smithsonian Institution
Project: Florissant Fossil Beds National Monument

Page 25:

Metadata Record in ContentDM

Page 26:

Metadata Record in ContentDM (Continued)

Page 27:

Metadata Elements - Public Display

Page 28:

Page 29:

Historical Society Metadata Record

Page 30:

Direct Input

Title: Annual report of the Jewish Consumptives' Relief Society at Denver, Colo.
Creator: Jewish Consumptives' Relief Society (U.S.)
Contributor:
Link: http://library.du.edu/About/collections/SpecialCollections/jcrs/annualreports.cfm
Publisher: University of Denver. Penrose Library
Description: The <3rd- > reports published <1907- > as regular numbered issues of: The Sanatorium, v. <1- > The 11th and 12th reports (covering 1914-15) issued in combined form as: The Sanatorium ; v. 10, nos. 3/4 (July-Sept./Oct.-Dec. 1916) Reports cover the year ending Dec. 31. Chiefly in English, with some Hebrew.
Date Original: 1905-1906. [Issued] [W3C-DTF]
Date Digital: 2002-01-04 [Created]
Subject(s):
  Jewish Consumptives' Relief Society (U.S.) -- Periodicals.
  Tuberculosis -- Patients -- Colorado.
  Sanatoriums -- Colorado -- Denver.
Type: image [DCMI Type vocabulary]
Source: 23-26 cm.
Languages: eng [ISO 639-2]; heb [ISO 639-2]
Relation: Beck Archives/Rocky Mountain Jewish History Society. Jewish Consumptives' Relief Society Collection. Special Collections Dept., Penrose Library, University of Denver, Denver, Colo.
Format Create: jpg; 300 dpi; 145 files; Epson Expression 836 XL Scanner; Adobe Photoshop version 5.5.
Format Use: image/jpg [Medium] [IMT]
Rights: http://www.penlib.du.edu/specoll/copyri.html

Page 31:

Metadata Record in ContentDM

Page 32:

Taming Metadata in the Wild West

Part 4: Accommodation of levels of expertise

Page 33:

Local Metadata Routes

[Diagram labels: Legacy metadata; Mappings & migrations; New content; Latest metadata standard; Local MetaBase; Services; Constituents]

Page 34:

Z39.50 Access

[Diagram labels: Heritage Colorado metadata; Colorado Western Trails metadata; Z39.50 connections; Local MetaBase; Conversion scripts; Content w/out local database]

Page 35:

OAI Access

[Diagram labels: Colorado Heritage; FSU Digital; Western Trails; Mountain West DL; Services provider; Content provider (three instances); OAI-WT; OAI-DC; OAI-METS]

Page 36:

Content Provider Challenges

Implementing OAI - Intermediate Brokers May Be Necessary

Choosing Brokers & Harvesters

Maintaining Current OAI Provider Support

Awareness of Current Metadata Standards

Mapping Local Metadata to Supported Schema

Maintaining Current Transformation Procedure - Examples

Knowing Who Has Your Metadata

Page 37:

Service Provider Challenges

Maintaining Current OAI Harvester Support
- Continuing support for older versions

Awareness of Communities & Metadata Schema
- What to collect?
- Multiple views / repurposing
- Added value of relationships between objects/collections
- Link in a greater series of brokers?

Maintaining Multiple Data About Same Objects?
- Examples

Active Role as Harvester/Service Provider
- Contrast with more passive current OAI role
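
For readers new to OAI harvesting, the request a service provider sends is an ordinary URL. The sketch below builds a `ListRecords` request as defined by OAI-PMH, using `oai_dc`, the metadata prefix every OAI-PMH repository must support. The base URL and the function name are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix
        # when asking for the next batch of a partial response.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)

# Hypothetical repository address, used only to show the URL shape.
print(list_records_url("https://example.org/oai"))
print(list_records_url("https://example.org/oai", resumption_token="abc123"))
```

A harvester that supports several schemas would call this once per prefix the content provider advertises, which is one source of the maintenance burden listed above.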

Page 38:

Thank You!

Liz Bishoff, Colorado Digitization Program
Cheryl Walters, Utah State University
Chuck Thomas, Florida State University Libraries
Elizabeth “Betty” Meagher, University of Denver