Prepared for Open Access and Scholarly Books Berkman Center/Knowledge Unlatched June 2013 Metadata and Metrics to Support Open Access Monographs Dr. Micah Altman <[email protected]> Director of Research, MIT Libraries
May 13, 2015
DISCLAIMER
These opinions are my own; they are not the opinions of MIT, Brookings, any of the project funders, nor (with the exception of co-authored previously published work) my collaborators.
Secondary disclaimer:
“It’s tough to make predictions, especially about the future!”
-- Attributed to Woody Allen, Yogi Berra, Niels Bohr, Vint Cerf, Winston Churchill, Confucius, Disreali [sic], Freeman Dyson, Cecil B. Demille, Albert Einstein, Enrico Fermi, Edgar R. Fiedler, Bob Fourer, Sam Goldwyn, Allan Lamport, Groucho Marx, Dan Quayle, George Bernard Shaw, Casey Stengel, Will Rogers, M. Taub, Mark Twain, Kerr L. White, etc.
Related Work
• Altman (2012), “Mitigating Threats to Data Quality Throughout the Curation Lifecycle,” 1-119, in Curating For Quality.
• CODATA-ICSTI Task Group on Data Citation Standards and Practices (Forthcoming 2013), Citation of Data: The Current State of Practice, Policy, and Technology, CODATA.
• National Digital Stewardship Alliance (Forthcoming 2013), National Agenda for Digital Stewardship.
• Uhlir (ed.) (2012), Developing Data Attribution and Citation Practices and Standards: Report from an International Workshop, National Academies Press.
Most reprints available from: informatics.mit.edu
The Next 10 Minutes
• Level setting
• Start discussion questions
Preview: Some Discussion Questions
• Successful examples/exemplars:
  – existing metadata and effective uses of it with books?
  – graceful degradation, increasing returns, etc.?
• Emerging requirements:
  – Explicit metadata (or identifier, integration, etc.) requirements from stakeholders?
  – In what ways do these explicit requirements support use, evaluation, and integration?
  – Clear implicit requirements? … Licensing (CC-BY, CC0)? Identifier schemes (ISBN, DOI)? Indexing integration requirements?
  – What evidence could you envision showing your stakeholders to demonstrate success?
• Opportunities
  – ‘Easy pickings’ – metadata already produced in production, dissemination, use, but not retained?
  – “‘Looks-easy’ pickings” – opportunities for automated extraction; crowd-sourced entry and refinement?
  – Leverage points – e.g. where can effort applied to prime the pump, coordinate practice, or build infrastructure yield network effects, lower barriers to entry, create norms/nudges or coordinating equilibria that generate incentives to continue production?
What is metadata anyway?
(a) “data about data”
(b) something the NSA wants a lot of
(c) magic pixie dust
(d) digital breadcrumbs
(e) all of the above
Source: http://www.guardian.co.uk/technology/interactive/2013/jun/12/what-is-metadata-nsa-surveillance#meta=0000000
What good is it?
• Support decision & workflow for production
• Add value to product
  – Support discovery – descriptive information
  – Support use – re-presentation, navigation
  – Support reuse/integration – descriptive, structural, provenance
• Grow the evidence base regarding OA books
  – characteristics of production, products, and use
  – E.g., costs, content features, authors, quality
• Support evaluation
Selected Characteristics
• Purpose
  – Descriptive
  – Structural
  – Administrative
    • Identification
    • Rights
    • Provenance
    • Fixity
    • Preservation
  – Linkages/relationships
  – Annotation
• Granularity
• Association Model
  – Embedded
  – Associated
  – Third party
• Schema
  – Mandatory elements
  – Structure
• Ontology
  – Semantics
  – Relationships among elements and concepts
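A minimal sketch of how the purposes above might be kept apart in a single record, with a fixity value computed over the content. The field names, the sample title, and the choice of SHA-256 are illustrative assumptions, not part of any schema named in this talk.

```python
import hashlib
import json


def fixity(content: bytes) -> str:
    """Return a SHA-256 hex digest that can serve as a fixity value."""
    return hashlib.sha256(content).hexdigest()


def build_record(title, authors, chapters, content, license_id):
    """Group metadata by purpose: descriptive, structural, administrative.

    The layout is hypothetical; a real project would follow its chosen schema.
    """
    return {
        "descriptive": {"title": title, "authors": authors},
        "structural": {"chapters": chapters},
        "administrative": {
            "rights": license_id,
            "fixity": {"algorithm": "sha256", "value": fixity(content)},
        },
    }


record = build_record(
    title="An Example Monograph",        # hypothetical title
    authors=["A. Author"],               # hypothetical author
    chapters=["Introduction", "Conclusion"],
    content=b"full text of the book",    # stand-in for the real file bytes
    license_id="CC-BY",
)
print(json.dumps(record, indent=2))
```

Recomputing the digest later and comparing it with the stored value is one simple way to detect that the content has changed.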
Design Heuristics
• Dublin Core Design Principles [Duval, et al. 2002]
  – Modularity
  – Extensibility
  – Capacity for refinement
  – Multilingual
• Early capture
• Automated extraction
• Approaching richness
  – Progressive enhancement
  – Graceful degradation
  – Increasing returns to investment
  – requirement -> barrier
Evaluation
• Measurement characteristics
  – Scope: Local measures vs. ego-centric vs. global
  – Duration: Point in time vs. period vs. trend
  – Measurement scale: Absolute vs. proportion vs. rank vs. pairwise comparisons vs. purely descriptive (e.g. usage stories)
• Inputs
  – Content
  – Associated meta-information
  – External behaviors, actions (awards), reputation
• Use characteristics:
  – … understandability (cognitive burden) of metrics
  – … dissemination and adoption strategy
  – … incentives to be strategic to affect measures
• Some emerging approaches:
  – Proxies for interest (citation counts)
  – Proxies for use (downloads, reading patterns, annotation patterns, data citations)
  – Proxies for (predictive) value (journal impact metrics, h(g,i)-indices, PageRank, Google rank, models of network evolution)
[See Borner, et al. 2004; Kurtz & Bollen 2010; Bollen et al. 2009; Uhlir 2012; CODATA-ICSTI Task Group on Data Citation Standards and Practices… 2013]
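To make the index-style proxies concrete, here is a small sketch of the h-index and g-index computed from a list of per-work citation counts. The counts are invented for illustration, and this g-index variant is capped at the number of works, which is one common convention rather than the only one.

```python
def h_index(citations):
    """Largest h such that h works each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g works together have at least g*g citations.

    Capped at the number of works (an assumption of this sketch).
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


counts = [10, 8, 5, 4, 3]  # hypothetical citation counts for five works
print(h_index(counts), g_index(counts))
```

Both are absolute, ego-centric, point-in-time measures in the terms used above; turning them into ranks or proportions requires a comparison population.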
Ecosystem Integration
• Usage
  – SUSHI / COUNTER
    http://www.niso.org/workrooms/sushi/
    • (NISO Standardized Usage Statistics Harvesting Initiative)
    • Protocol for transmission of usage statistics / practices & schema for formatting and collecting usage statistics
• Digital work identifiers / locators
  – Exemplars: DOIs / OpenURL
  – Use of identifier internal to monograph adds value for later use and evaluation
  – Use of identifier / standard locator to refer to work provides potential leverage point for usage metrics collection
• Other identifiers
  – FUNDREF – funding identifiers
  – ORCID/ISNI – contributor identifiers
  – Data Citations – citations to data and other non-traditional scholarly publication
  – Embedding in monograph adds value to evidence base
  – Useful for evaluations – esp. those that are likely to align incentives among funders & contributors
• De facto discovery, use & evaluation
  – e.g. Google, Amazon
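Embedded identifiers are only useful if they are well-formed, so a light validation step at capture time can help. The sketch below checks an ORCID iD's final character using the ISO 7064 MOD 11-2 check digit and builds a resolver link for a DOI. The function names are hypothetical, the DOI is made up, and the sample iD is one commonly shown in ORCID examples.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the shape and check character of an ORCID iD such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15].upper()


def doi_url(doi: str) -> str:
    """Turn a bare DOI into a link via the doi.org resolver."""
    return f"https://doi.org/{doi}"


print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1825-0098"))
print(doi_url("10.1234/example"))  # hypothetical DOI
```

A check like this only confirms the identifier is syntactically plausible; it does not confirm that the iD or DOI is registered or refers to the intended person or work.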
Examples: Current State of the Practice
• Institutional Repository Metrics
  – Harvard DASH User Stories
    https://osc.hul.harvard.edu/dash/stories
  – MIT Global Impact
    http://dspace.mit.edu/handle/1721.1/49433
  – SSRN Author and Paper Metrics:
    http://hq.ssrn.com/rankings/Ranking_display.cfm?TRN_gID=10&requesttimeout=900
• Aggregators
  – Project Muse
    http://muse.jhu.edu/about/stats.html
  – Highwire
    http://sushi.highwire.org/
  – HathiTrust Research Center
    http://www.hathitrust.org/htrc
Some Discussion Questions
• Successful examples/exemplars:
  – existing metadata and effective uses of it with books?
  – graceful degradation, increasing returns, etc.?
• Emerging requirements:
  – Explicit metadata (or identifier, integration, etc.) requirements from stakeholders?
  – In what ways do these explicit requirements support use, evaluation, and integration?
  – Clear implicit requirements? … Licensing (CC-BY, CC0)? Identifier schemes (ISBN, DOI)? Indexing integration requirements?
  – What evidence could you envision showing your stakeholders to demonstrate success?
• Opportunities
  – ‘Easy pickings’ – metadata already produced in production, dissemination, use, but not retained?
  – “‘Looks-easy’ pickings” – opportunities for automated extraction; crowd-sourced entry and refinement?
  – Leverage points – e.g. where can effort applied to prime the pump, coordinate practice, or build infrastructure yield network effects, lower barriers to entry, create norms/nudges or coordinating equilibria that generate incentives to continue production?
Questions?
E-mail: [email protected]
Web: micahaltman.com
Twitter: @drmaltman