Prepared for Open Access and Scholarly Books, Berkman Center/Knowledge Unlatched, June 2013
Metadata and Metrics to Support Open Access Monographs
Dr. Micah Altman <[email protected]>
Director of Research, MIT Libraries

Metadata and Metrics to Support Open Access

May 13, 2015


Micah Altman

This presentation, invited for a workshop on Open Access and Scholarly Books (sponsored by the Berkman Center and Knowledge Unlatched), provides a very brief overview of metadata design principles, approaches to evaluation metrics, and some relevant standards and exemplars in scholarly publishing. It is intended to provoke discussion on approaches to evaluation of the use, characteristics, and value of OA publications.
Transcript
Page 1: Metadata and Metrics to Support Open Access

Prepared for

Open Access and Scholarly Books

Berkman Center/Knowledge Unlatched, June 2013

Metadata and Metrics to Support Open Access Monographs

Dr. Micah Altman <[email protected]>

Director of Research, MIT Libraries

Page 2: Metadata and Metrics to Support Open Access


DISCLAIMER: These opinions are my own; they are not the opinions of MIT, Brookings, any of the project funders, nor (with the exception of co-authored previously published work) of my collaborators.

Secondary disclaimer:

“It’s tough to make predictions, especially about the future!”

-- Attributed to Woody Allen, Yogi Berra, Niels Bohr, Vint Cerf, Winston Churchill, Confucius, Disreali [sic], Freeman Dyson, Cecil B. DeMille, Albert Einstein, Enrico Fermi, Edgar R. Fiedler, Bob Fourer, Sam Goldwyn, Allan Lamport, Groucho Marx, Dan Quayle, George Bernard Shaw, Casey Stengel, Will Rogers, M. Taub, Mark Twain, Kerr L. White, etc.

Page 3: Metadata and Metrics to Support Open Access


Related Work

• Altman (2012), "Mitigating Threats to Data Quality Throughout the Curation Lifecycle," 1-119. In Curating for Quality.

• CODATA-ICSTI Task Group on Data Citation Standards and Practices (forthcoming 2013), Citation of Data: The Current State of Practice, Policy, and Technology. CODATA.

• National Digital Stewardship Alliance (forthcoming 2013), National Agenda for Digital Stewardship.

• Uhlir (ed.) (2012), Developing Data Attribution and Citation Practices and Standards: Report from an International Workshop. National Academies Press.

Most reprints available from: informatics.mit.edu

Page 4: Metadata and Metrics to Support Open Access


The Next 10 Minutes

• Level setting
• Start discussion questions

Page 5: Metadata and Metrics to Support Open Access


Preview: Some Discussion Questions

• Successful examples/exemplars:
  – Existing metadata and effective uses of it with books?
  – Graceful degradation, increasing returns, etc.?
• Emerging requirements:
  – Explicit metadata (or identifier, integration, etc.) requirements from stakeholders?
  – In what ways do these explicitly support use, evaluation, and integration?
  – Clear implicit requirements? Licensing (CC-BY, CC0)? Identifier schemes (ISBN, DOI)? Indexing integration requirements?
  – What evidence could you envision showing your stakeholders to demonstrate success?
• Opportunities:
  – 'Easy pickings': metadata already produced in production, dissemination, and use, but not retained?
  – 'Looks-easy pickings': opportunities for automated extraction; crowd-sourced entry and refinement?
  – Leverage points: e.g., where can effort applied to prime the pump, coordinate practice, or build infrastructure yield network effects, lower barriers to entry, create norms/nudges, or coordinating equilibria that generate incentives to continue production?

Page 6: Metadata and Metrics to Support Open Access


What is metadata anyway?

(a) "data about data"
(b) something the NSA wants a lot of
(c) magic pixie dust
(d) digital breadcrumbs
(e) all of the above

Source: http://www.guardian.co.uk/technology/interactive/2013/jun/12/what-is-metadata-nsa-surveillance#meta=0000000

Page 7: Metadata and Metrics to Support Open Access


What good is it?

• Support decisions & workflow for production
• Add value to the product:
  – Support discovery: descriptive information
  – Support use: re-presentation, navigation
  – Support reuse/integration: descriptive, structural, provenance
• Grow the evidence base regarding OA books:
  – Characteristics of production, products, and use
  – E.g., costs, content features, authors, quality
• Support evaluation

Page 8: Metadata and Metrics to Support Open Access


Selected Characteristics

• Purpose
  – Descriptive
  – Structural
  – Administrative
    • Identification
    • Rights
    • Provenance
    • Fixity
    • Preservation
  – Linkages/relationships
  – Annotation
• Granularity
• Association model
  – Embedded
  – Associated
  – Third party
• Schema
  – Mandatory elements
  – Structure
• Ontology
  – Semantics
  – Relationships among elements and concepts
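The categories above can be made concrete with a small sketch. The record below is hypothetical, and the field names are illustrative (loosely Dublin Core-flavored) rather than drawn from any specific schema:

```python
# Hypothetical monograph metadata record, grouped by the purposes above.
# Field names and values are illustrative only, not a specific standard.
record = {
    "descriptive": {                      # supports discovery
        "title": "An Example Open Access Monograph",
        "creator": "A. Author",
        "subject": ["open access", "scholarly publishing"],
    },
    "structural": {                       # supports use and navigation
        "chapters": ["Introduction", "Methods", "Findings"],
    },
    "administrative": {                   # supports management and reuse
        "identifier": "doi:10.9999/example",   # hypothetical DOI
        "rights": "CC-BY",
        "provenance": "deposited by publisher, June 2013",
        "fixity": "sha256:...",                # checksum placeholder
    },
}

# Each top-level key corresponds to one "Purpose" category.
print(sorted(record))
```

The grouping is the point: discovery, navigation, and management each draw on a different slice of the record, so a schema can mandate some slices and leave others optional.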

Page 9: Metadata and Metrics to Support Open Access


Design Heuristics

• Dublin Core design principles [Duval et al. 2002]:
  – Modularity
  – Extensibility
  – Capacity for refinement
  – Multilingualism
• Early capture
• Automated extraction
• Approaching richness:
  – Progressive enhancement
  – Graceful degradation
  – Increasing returns to investment
  – Requirement -> barrier
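The "graceful degradation" heuristic can be illustrated with a minimal sketch (the function and field names here are hypothetical): a renderer that produces richer output as more metadata becomes available, but never fails outright when fields are missing.

```python
def describe(record):
    """Render a short description that degrades gracefully:
    richer records yield richer output, but a bare title is still usable."""
    parts = [record.get("title", "Untitled")]
    if "creator" in record:
        parts.append("by " + record["creator"])
    if "year" in record:
        parts.append("(%d)" % record["year"])
    return " ".join(parts)

print(describe({"title": "An OA Monograph", "creator": "A. Author", "year": 2013}))
# Progressively less metadata, still a usable description:
print(describe({"title": "An OA Monograph"}))
print(describe({}))
```

The complementary heuristic, progressive enhancement, is the same design read in the other direction: each added field improves the output without requiring changes to consumers that ignore it.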

Page 10: Metadata and Metrics to Support Open Access


Evaluation

• Measurement characteristics:
  – Scope: local measures vs. ego-centric vs. global
  – Duration: point in time vs. period vs. trend
  – Measurement scale: absolute vs. proportion vs. rank vs. pairwise comparisons vs. purely descriptive (e.g., usage stories)
• Inputs:
  – Content
  – Associated meta-information
  – External behaviors, actions (awards), reputation
• Use characteristics:
  – Understandability (cognitive burden) of metrics
  – Dissemination and adoption strategy
  – Incentives to be strategic to affect measures
• Some emerging approaches:
  – Proxies for interest (citation counts)
  – Proxies for use (downloads, reading patterns, annotation patterns, data citations)
  – Proxies for (predictive) value (journal impact metrics, h(g,i)-indices, PageRank, Google rank, models of network evolution)

[See Borner et al. 2004; Kurtz & Bollen 2010; Bollen et al. 2009; Uhlir 2012; CODATA-ICSTI Task Group on Data Citation Standards and Practices 2013]
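As one concrete example of a "proxy for value" named above, the h-index is simple to compute: the largest h such that h works have at least h citations each. A minimal sketch:

```python
def h_index(citation_counts):
    """Largest h such that h works have at least h citations each."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:   # the work at this rank still "supports" h = rank
            h = rank
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))  # 4: four works have >= 4 citations each
print(h_index([0, 0, 0]))         # 0
```

The g- and i-indices mentioned alongside it follow the same pattern with different thresholds, which is part of why such indices are attractive as metrics: they are cheap to compute once citation counts are collected.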

Page 11: Metadata and Metrics to Support Open Access


Ecosystem Integration

• Usage:
  – SUSHI / COUNTER (http://www.niso.org/workrooms/sushi/)
    • NISO Standardized Usage Statistics Harvesting Initiative
    • Protocol for transmission of usage statistics; practices & schema for formatting and collecting usage statistics
• Digital work identifiers/locators:
  – Exemplars: DOIs / OpenURL
  – Use of an identifier internal to the monograph adds value for later use and evaluation
  – Use of an identifier/standard locator to refer to the work provides a potential leverage point for collecting usage metrics
• Other identifiers:
  – FundRef: funding identifiers
  – ORCID/ISNI: contributor identifiers
  – Data citations: citations to data and other non-traditional scholarly publications
  – Embedding these in the monograph adds value to the evidence base
  – Useful for evaluations, especially those likely to align incentives among funders & contributors
• De facto discovery, use & evaluation:
  – e.g., Google, Amazon
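The leverage point noted above, standard identifiers enabling usage-metrics collection, can be sketched as a toy aggregation. The event records below are hypothetical and only gesture at the idea behind COUNTER-style reports; they do not follow the actual SUSHI/COUNTER schema:

```python
from collections import Counter

# Hypothetical usage events, each keyed by a work-level identifier (DOI).
events = [
    {"doi": "10.9999/book-a", "action": "download"},
    {"doi": "10.9999/book-a", "action": "download"},
    {"doi": "10.9999/book-b", "action": "download"},
    {"doi": "10.9999/book-a", "action": "view"},
]

# Because every event carries the same standard identifier, tallying
# per-work usage across platforms reduces to a simple count.
downloads = Counter(e["doi"] for e in events if e["action"] == "download")
print(downloads.most_common())  # [('10.9999/book-a', 2), ('10.9999/book-b', 1)]
```

Without a shared identifier, the same aggregation requires fuzzy matching on titles and author names across every platform, which is exactly the friction DOIs and OpenURL remove.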

Page 12: Metadata and Metrics to Support Open Access


Examples: Current State of the Practice

• Institutional repository metrics:
  – Harvard DASH user stories: https://osc.hul.harvard.edu/dash/stories
  – MIT Global Impact: http://dspace.mit.edu/handle/1721.1/49433
  – SSRN author and paper metrics: http://hq.ssrn.com/rankings/Ranking_display.cfm?TRN_gID=10&requesttimeout=900
• Aggregators:
  – Project MUSE: http://muse.jhu.edu/about/stats.html
  – HighWire: http://sushi.highwire.org/
  – HathiTrust Research Center: http://www.hathitrust.org/htrc

Page 13: Metadata and Metrics to Support Open Access


Some Discussion Questions

• Successful examples/exemplars:
  – Existing metadata and effective uses of it with books?
  – Graceful degradation, increasing returns, etc.?
• Emerging requirements:
  – Explicit metadata (or identifier, integration, etc.) requirements from stakeholders?
  – In what ways do these explicitly support use, evaluation, and integration?
  – Clear implicit requirements? Licensing (CC-BY, CC0)? Identifier schemes (ISBN, DOI)? Indexing integration requirements?
  – What evidence could you envision showing your stakeholders to demonstrate success?
• Opportunities:
  – 'Easy pickings': metadata already produced in production, dissemination, and use, but not retained?
  – 'Looks-easy pickings': opportunities for automated extraction; crowd-sourced entry and refinement?
  – Leverage points: e.g., where can effort applied to prime the pump, coordinate practice, or build infrastructure yield network effects, lower barriers to entry, create norms/nudges, or coordinating equilibria that generate incentives to continue production?

Page 14: Metadata and Metrics to Support Open Access

Questions?

E-mail: [email protected]
Web: micahaltman.com
Twitter: @drmaltman
