Dissemination Information Packages (DIPS) for Information Reuse (DIPIR)
MIT Libraries Brown Bag
DIPIR Principal Investigators: Ixchel M. Faniel, Ph.D.; Elizabeth Yakel, Ph.D.
Overview of DIPIR: Nancy Y. McGovern, Ph.D.
Nov 11, 2014
Research-based Practice
research instruction
practice
• IMLS-funded project led by Drs. Ixchel Faniel (PI) & Elizabeth Yakel (co-PI)
• 3-year project October 2010 – September 2013
• Studying the intersection between data reuse and digital preservation in three academic disciplines to identify how contextual information about the data that supports reuse can best be created and preserved.
• Focuses on research data produced and used by quantitative social scientists, archaeologists, and zoologists.
• The intended audiences of this project are researchers who use secondary data and the digital curators, digital repository managers, data center staff, and others who collect, manage, and store digital information.
Motivation for the DIPIR Project
Two Major Goals
1. Bridge the gap between data reuse and digital curation research
2. Determine whether reuse and curation practices can be generalized across disciplines
Data reuse research
Digital curation research
Disciplines curating and reusing data
Our interest is in this overlap.
DIPIR Project
The Research Team
• Nancy McGovern, ICPSR/MIT
• Ixchel Faniel, OCLC Research (PI)
• Eric Kansa, Open Context
• William Fink, UM Museum of Zoology
• Elizabeth Yakel, University of Michigan (Co-PI)
Resources at dipir.org:
• Project Details
• People
• Sites
• Publications
• Bibliography
• Project Reports
• News
For more information, please visit http://www.dipir.org
Next Steps
• Interviews: social scientists, archaeologists, zoologists
• Survey: ICPSR data reusers
• Observations: UMMZ data reusers
• Web analytics: OpenContext.org transaction log analysis
• Map significant properties of data as representation information
Faniel & Yakel 2011
Methods Overview
Sites: ICPSR, Open Context, UMMZ
Phase 1: Project start-up
• Interviews with staff: ICPSR 10 (Winter 2011); Open Context 4 (Winter 2011); UMMZ 10 (Spring 2011)
Phase 2: Collecting and analyzing user data
• Interviews with data consumers: ICPSR 43 (Winter 2012); Open Context 22 (Winter 2012); UMMZ 27 (Fall 2012)
• Survey of data consumers: ICPSR 2000 (Summer 2012)
• Web analytics of data consumers: Open Context server logs (ongoing)
• Observations of data consumers: UMMZ 10 (ongoing)
Phase 3: Mapping significant properties as representation information
A Survey of ICPSR Data Reusers: Measuring Data Repository Success
What data quality indicators contribute to quantitative social scientists’ data reuse satisfaction?
Measuring Repository Success Survey of ICPSR Data Reusers - Part 1
• Completeness – sufficiency, breadth, depth, and scope
• Relevancy – applicability and helpfulness of data for the task
• Accessibility – ease and speed with which data were retrieved
• Ease of Operation – ease with which data were managed and manipulated
• Credibility – correctness, reliability, impartiality of data
Additional Indicators:
• Data Producer Reputation – regard for a data producer’s work
• Documentation Quality – sufficiency and ability to facilitate use
Data Quality Indicators ICPSR Survey of Data Reusers – Part 1
(Wang and Strong, 1996; Lee et al., 2002)
Survey Methodology
Data Collection
• 1,632 first authors of published journal articles 2008-2012 surveyed
The Survey
• Part 1: inquire about data reuse experience
• Part 2: inquire about experience using ICPSR repository and intention to continue use
Preliminary Findings
• Tested measures of repository success
• Extended ideas about data quality beyond credibility and relevance of data
– Data reuse satisfaction requires data that are complete, accessible, and easy to operate
• Data producer reputation was not significant
• Documentation quality played a role in data reuse satisfaction
The Study
Research Question
How do novice social science researchers make sense of social science data?
Data Collection
22 interviews
Data Analysis
Code set developed and expanded from interview protocol
http://www.english.sxu.edu
Making sense of matching and merging capabilities across multiple datasets
• Combining longitudinal data
• “If they're not asking the same question over years,… [it’s] particularly difficult because if they’ve changed the question wording, are then people answering differently and so there were several discussions that I had with my dissertation advisor…” (CBU18).
• Merging data from different sources
• “…authors will create a variable, they’ll average across a four or five year period, and I’m trying to match that with a variable that was coded for a single year period. So making an argument…that these two things should be put together …, is something I always have to be wary of …So when dealing with that,…I’ll see if it’s been done by others” (CBU04).
Preliminary Findings
• Novices engaged in careful articulation of the data producer’s research process.
• Novices relied on human scaffolding in the form of faculty advisors and instructors.
• Human scaffolding also came from the community as represented in the literature.
Social Science Resource
Faniel, I.M., Kriesberg, A. & Yakel, E. (2012). Data Reuse and Sensemaking among Novice Social Scientists. Proceedings of the American Society for Information Science and Technology, 49. (Slides)
Full list: http://dipir.org/publications/
• Social and economic forces pushing toward digital archaeological data publication
• No robust set of standards exists for field archaeology
• Data reuse studies can inform standards development, but there are few outside of science and engineering disciplines
Motivation
The Challenges of Digging Data: A Study of Context in Archaeological Data Reuse
http://opencontext.org/
Archaeology resource
Faniel, I.M., Kansa, E., Kansa, S.W., Barrera-Gomez, J. & Yakel, E. (2013). The Challenges of Digging Data: A Study of Context in Archaeological Data Reuse. Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries. (Preprint, Abstract, view slides via SlideShare)
Full list: http://dipir.org/publications/
Archaeology Study
Research Questions
1. How does contextual information serve to preserve the meaning of and trust in archaeological field research over time?
2. How can existing cultural heritage standards be extended to incorporate these contextual elements?
Data Collection
22 interviews with archaeologists
Data Analysis
Code set developed and expanded from interview protocol
http://www.english.sxu.edu
• The lack of context was a persistent problem.
• Data collection procedures were highly sought during data reuse.
• Additional context also played a role during data reuse.
• Researchers have an interest in the entire data life-cycle (data collection preparation through repository)
• Need more studies involving data integration and reuse to help guide standards development (CIDOC-CRM not sufficient)
Preliminary Findings
A Snapshot of the 27 Data Reusers
[Figure: share of the 27 data reusers by research focus (are systematists; study ecological trends) and by data source (reuse data from colleagues; reuse data from museums and archives; reuse data from other repositories and websites; reuse data from journal articles). Percentages shown: 63%, 96%, 93%, 26%, 37%, 26%.]
Data Selection Criteria
• Data coverage
• Geographic precision
• Matches another dataset
• Availability of voucher specimen
• Time period specimen collected
• Condition of specimen
• Sequence has been published
• Results of pre-analysis
• Identification or location errors
• Relevant taxonomically
Trust in Repositories Resource
Yakel, E., Faniel, I., Kriesberg, A., & Yoon, A. (2013). Trust in Digital Repositories. International Journal of Digital Curation, 8(1), 143–156. doi:10.2218/ijdc.v8i1.251. (Awarded Best Conference Paper at the 8th International Digital Curation Conference (IDCC), Amsterdam, Netherlands.) (Article)
Full list: http://dipir.org/publications/
DIPIR is examining trust factors for re-use:
• Benevolence
– The organization demonstrates goodwill toward the customer
• Integrity
– The organization is honest and treats stakeholders with respect
• Identification
– Understanding and internalization of stakeholder interests by the organization
– ISO TRAC understanding the designated community (pp. 25-26)
• Transparency
– Sharing trust-relevant information with stakeholders
– ISO TRAC sharing audit results (p. 19)
(Pirson & Malhotra, 2011)
Stakeholder Trust
Theoretical Framework
DeLone and McLean Information Systems (IS) Success Model
Information Quality
System Quality
Service Quality
Intention to Use / Use
User Satisfaction
Net Benefits
(DeLone & McLean, 2003)
DIPIR and TRAC
• DIPIR used TRAC requirements as a starting point for informing a survey of social scientists
• That process raised questions about what users of digital repositories might notice and/or rely upon
• Worthwhile to take a step back and consider how users might perceive our TRAC-related efforts
Perceptions of TRAC
Examples from TRAC requirements:
• 3.1.1. Mission Statement reflects “commitment to the preservation of, long term retention of, management of, and access to digital information”
• 3.2. “sustained operation of the repository”
• 3.3.4. “commit to transparency and accountability in all actions”
How might users of repositories become aware of and respond to our efforts to be compliant?
Should we strive to encourage them to be aware? How?
How can/would we know if their interest in our practices increases or changes?
Who is our audience for demonstrating good practice?
Repository Trust Concepts
Continuance Intention
Integrity
Benevolence
Transparency
Identification-based trust
Social Factors
Structural Assurances
Performance Expectancy
Trust
Concepts                                   Archaeologists (22)   Quantitative Social Scientists (44)   All (66)
Stakeholder Trust in the Organization
  Benevolence                              0                     1                                     1
  Identification                           1                     1                                     2
  Integrity                                1                     1                                     2
  Transparency                             5                     5                                     10
Social Factors
  Colleagues                               1                     7                                     8
Structural Assurance
  Guarantees: Preservation/Sustainability  9                     1                                     10
  Institutional reputation                 4                     23                                    27
  Third Party Endorsement                  0                     1                                     1
How often interviewees mentioned Trust Factors
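The mention counts above can be cross-checked with a few lines of Python. This is only an illustrative sketch: the dictionary layout and variable names are assumptions made for the example, not part of the DIPIR analysis, and the figures are transcribed from the slide as shown. It confirms that each "All" figure is the sum of the two discipline columns and ranks the factors by total mentions.

```python
# Mention counts transcribed from the slide: (archaeologists, social scientists, all).
# The dictionary layout and names are illustrative assumptions, not DIPIR code.
mentions = {
    "Benevolence": (0, 1, 1),
    "Identification": (1, 1, 2),
    "Integrity": (1, 1, 2),
    "Transparency": (5, 5, 10),
    "Colleagues": (1, 7, 8),
    "Preservation/Sustainability": (9, 1, 10),
    "Institutional reputation": (4, 23, 27),
    "Third Party Endorsement": (0, 1, 1),
}

# Collect any row whose "All" figure differs from the sum of the two discipline columns.
mismatches = [
    name for name, (arch, soc, total) in mentions.items() if arch + soc != total
]

# Rank factors by total mentions, most frequent first.
ranked = sorted(mentions, key=lambda name: mentions[name][2], reverse=True)

print("rows that do not add up:", mismatches)
print("most mentioned factor:", ranked[0])
```

Keeping the per-discipline counts next to the totals makes the contrast easy to see: institutional reputation was raised mostly by social scientists, while preservation/sustainability guarantees were raised mostly by archaeologists.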
Coming Up …
DIPIR Research Assistant Adam Kriesberg will present a paper on Nov. 4 at the 2013 Meeting of the Association for Information Science and Technology (ASIS&T). The paper is entitled “The Role of Data Reuse in the Apprenticeship Process” and features Rebecca Frank, Ixchel Faniel, and Elizabeth Yakel as co-authors.
http://dipir.org/news/
Acknowledgements
• Institute of Museum and Library Services (LG-06-10-0140-10)
• Our co-authors: Sarah Whitcher Kansa, Ph.D., Julianna Barrera-Gomez, M.S.I., Elizabeth Yakel, Ph.D.
• Partners: Nancy McGovern, Ph.D. (MIT), Eric Kansa, Ph.D. (Open Context), William Fink, Ph.D. (University of Michigan Museum of Zoology)
• Students: Morgan Daniels, Rebecca Frank, Adam Kriesberg, Jessica Schaengold, Gavin Strassel, Michele DeLia, Kathleen Fear, Mallory Hood, Molly Haig, Annelise Doll, Monique Lowe