DataCite and Campus Data Services 24 September 2012 Paul J. Bracke Associate Dean for Digital Programs and Information Access Purdue University Libraries
Jan 26, 2015
Overview
• Data Services and Libraries
• Campus Data Services at Purdue
• DataCite
• Data Citation and Campus Data Services
Data and Libraries
• Role of libraries in data management has been a focus of discussion
• Academic libraries collect, preserve, and disseminate human knowledge, within the context of a particular institution (Research and Teaching)
Drivers
• Growing interest in computational, collaborative science has increased interest in data management and sharing
• Funder mandates have increased interest at campus level
Data Services and Libraries
• Different views of library roles
– Curatorial Roles (Data Collection, Appraisal, Selection, Description, Preservation, etc.)
– Service Roles (Data Management Planning, Preservation Planning, Data Needs Assessment, Data Information Literacy, Intellectual Property and Governance)
Campus Data Services at Purdue
Data Services at Purdue
• Assessment of Data Needs
• Development of Data Services
• Development of Data Repository
Looking Upstream
Figure (labels only): secondary/tertiary resources; published research (traditional); “published” research (non-traditional); unpublished research (traditional/non); “published” data/datasets; analyzed data/datasets; processed data/datasets; “raw” data/datasets
Modified from: Brandt, D.S. “Scholarly Communication” (in To Stand the Test of Time: Long-Term Stewardship of Digital Data Sets in Science and Engineering: Final Report of Workshop New Collaborative Relationships: Academic Libraries in the Digital Data Universe. ARL, Washington, DC, September 2006.)
Analyzed data might need to be reviewed prior to publication, or in case of questions after publication
Quite often data must be scrubbed/anonymized, or processed to format prior to analysis; some disciplines share this data widely within their communities (e.g., astronomy, physics, etc.)
Some raw data are shared readily (e.g., genetics), but also quite often are discarded, depending on discipline
Data Needs Assessment
• Needed to understand campus needs before investing in solutions
• What are faculty needs, practices, attitudes, etc.?
• What is the appropriate infrastructure at a campus level?
• Where should we develop partnerships?
Data Curation Profiles
• An interview instrument that provides a guide for discussing data with researchers
• Analysis of profiles:
– Gives insight into faculty needs and attitudes related to data sharing
– Helps assess information needs related to data collections
– Gives insight into differences between data in various disciplines
– Helps identify possible data services
– Creates a starting point for curating a data set for archiving and preservation
http://www.datacurationprofiles.org
Early Campus Collaborations
• 2006: D2C2
• 2007: Grant Proposals
• 2007: Data Curation Profiles
• 2008-9: e-Data Task Force
• 2010: Faculty Data Committee
Current Service Offerings
Current Service Model
Specific Data Services
• Data reference
• Data mgmt planning
• Data consultation (may lead to collaborations/grants)
• Using PURR
• Promoting data DOIs
• Data mgmt education and information literacy
• Finding and using data
• Developing tools (DCP 2.0, DataBib, DMP-SAQ)
• Data visualization/GIS
• Developing data resources (LibGuides, tutorials)
• Linking data to articles and dissertations
• Promoting open access (authors' rights, IR deposit)*
• Leveraging publishing opportunities*
• Developing local collections*
• Collection mgmt of “e” (journals, data, archives)*
• Integrating systems (i.e., finding data in Primo)*
* As relates to data
Campus Data Services at Purdue
Data Services is one of many services in the Libraries
Diagram labels: Purdue Data Services; Liaison Librarians; Data Services Specialists; Other Libraries Specialists; Other Campus Specialists; DS
Collaborative Model within the Libraries
The current service model combines interaction among the researcher, the subject liaison, and the data services librarian. When a researcher approaches a subject librarian with a data-related question, the librarian can:
1. Refer the question to the data services team, who will engage the researcher and keep the librarian in the loop regarding resolution (Referral)
2. Ask a data services team member to accompany them in meeting with the researcher to determine the question or problem (“Buddy System”)
3. Meet with the researcher to understand and address the problem, using the data services team as a resource to consult with as needed (Consultation)
4. Work directly with the researcher (Solo)
Purdue University Research Repository (PURR)
http://research.hub.purdue.edu
PURR
• Based on HUBzero
• Collaboration between Libraries, IT, OVPR
• Subsidized by campus
• Grant-supported projects get 100 GB working space, 10 GB for published data
• Additional space can be purchased if needed
• Includes project space, “publishing” workflow including DOIs
• Preservation layers under investigation
Data Services & PURR
PURR
• OVPR: Policy & Sponsored Programs & Awards
• ITaP: Infrastructure (HUBzero™)
• Libraries: Data Services (Reference & Consulting) & Preservation
• Researchers: Research Collaboration, Data Discovery, Curating, Publishing & Archiving
PURR Workflow Diagram: 5 Opportunities
1. Craft Data Management Plans
2. Consult on new projects
3. Collaborate and contribute to projects
4. Review datasets submitted for publication
5. Select / De-select published datasets from the collection
DataCite
What is DataCite?
An international organization dedicated to:
• Establishing easier access to scientific research data
• Increasing acceptance of research data as legitimate, citable contributions to the scientific record
• Supporting data archiving that will permit results to be verified and re-purposed for future study
http://www.datacite.org
DataCite
• DOI Allocation
• 3 Full Members in US:
– Purdue University Libraries
– California Digital Library
– Office of Scientific and Technical Information (DOE)
• How to get involved?
– Work with a full member to assign DOIs to your data
– Attend DataCite workshops and conferences
http://datacite.org/DataCiteUS
Identifier Services and Campus Data Services
Why Identifier Services?
Data Citation Services on Campus
There is a lack of resources, tools and standards to help researchers manage, share, or preserve research data
“In an ideal situation we would somehow have some sort of standard under which we named things and stored things and kept track of things and we would, you know, have a way to get this information to our students.” (U1E2J1)
Data Citation Services on Campus
• Researchers state a general willingness to share their data with others, but not without certain restrictions, and not without benefits for themselves.
– Embargo
– Attribution (Citation)
– “Trust”
• “I need the people who use my dataset to cite it so that I get credit for producing it.”
• Implication: focus on citation and identifier standards
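To make the attribution point concrete, here is a minimal sketch of how a dataset citation with a DOI could be assembled. It assumes the simple "Creator (Year): Title. Publisher. Identifier" pattern that DataCite has recommended; the `Dataset` class, the helper names, and the example record (including the DOI) are hypothetical and only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dataset:
    """Hypothetical minimal record holding the fields a data citation needs."""
    creators: List[str]
    year: int
    title: str
    publisher: str
    doi: str  # bare DOI without a resolver prefix


def doi_url(doi: str) -> str:
    """Turn a bare DOI into a link through the public DOI resolver."""
    return "https://doi.org/" + doi.strip()


def format_citation(ds: Dataset) -> str:
    """Assumed pattern: Creator (Year): Title. Publisher. Identifier"""
    creators = "; ".join(ds.creators)
    return f"{creators} ({ds.year}): {ds.title}. {ds.publisher}. {doi_url(ds.doi)}"


# Made-up example record; the DOI below is not a real identifier.
example = Dataset(
    creators=["Doe, J.", "Roe, R."],
    year=2012,
    title="Example Soil Moisture Measurements",
    publisher="Purdue University Research Repository",
    doi="10.5072/example-1234",
)
print(format_citation(example))
```

Because the identifier is part of the citation itself, anyone reusing the dataset can credit the producer and link back to the data in one step.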
Availability of Identifier Services
• We use EZID as our platform
• DOIs are included in PURR, which is broadly available on campus
• Pricing models for other projects, both for DOIs and ARKs
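As a rough illustration of what registering an identifier involves, the sketch below builds the kind of plain-text metadata record that EZID accepts. It assumes EZID's ANVL-style "name: value" format with percent-encoding of special characters and DataCite-profile element names such as `datacite.creator`; these details should be checked against the current EZID API documentation. The helper names, target URL, and metadata values are hypothetical, and no network request is made.

```python
import re


def anvl_escape(text: str, is_name: bool = False) -> str:
    """Percent-encode characters that would break a 'name: value' line.

    Names additionally escape ':' because it separates name from value.
    """
    pattern = r"[%:\r\n]" if is_name else r"[%\r\n]"
    return re.sub(pattern, lambda m: "%%%02X" % ord(m.group(0)), text)


def build_anvl(metadata: dict) -> str:
    """Serialize a metadata dict as one 'name: value' pair per line."""
    return "\n".join(
        f"{anvl_escape(name, is_name=True)}: {anvl_escape(value)}"
        for name, value in metadata.items()
    )


# Hypothetical record; element names are assumed from EZID's DataCite profile.
record = {
    "_target": "https://example.org/datasets/1234",
    "datacite.creator": "Doe, J.",
    "datacite.title": "Example Dataset: 100% Synthetic",
    "datacite.publisher": "Example University",
    "datacite.publicationyear": "2012",
}
body = build_anvl(record)
print(body)
```

A record like this would be sent to the identifier service together with account credentials; that step is omitted here because it depends on the local EZID account and prefix.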
Data Citation Services and Library Publishing Services
• Provides a connection between Data Citation Services and Library Publishing Services at Purdue
• Provides a selling point for both services. DOIs provide credibility
• Exploring emerging publishing models
– Open Access
– Connecting Textual and non-Textual Resources
– Publishing Data (Data Papers, etc.)