L&P Humphrey Stewart-Shearer-Joint Session Project ARC & Federated DMP Pilot
Post on 16-Jul-2015
THE FEDERATED PILOT and PROJECT ARC
Walter Stewart – RDC Co-ordinator
Kathleen Shearer – RDC and CARL
Chuck Humphrey – U of Alberta Library
THE TREND TOWARDS SHARED SERVICES
• Other countries are developing shared services and infrastructure to support research data management services.
• Why? To reduce cost redundancies, pool knowledge, and break down silos across disciplines.
• Common shared services are: discovery, data registries, support and expertise, training, shared repositories and preservation.
• General trend from domain-based services towards generic RDM infrastructure. There are common infrastructure and service requirements across domains!
PROJECT ARC
• Initiated and supported by CARL
• December 2013 initial stakeholders meeting
• March 2014 working group launched, for 1 year
• Working group members represent
– CARL
– all four regional academic library associations: CAUL, COPPUL, OCUL and Quebec
– CRKN
• Includes some of Canada’s top research data management experts
PROJECT ARC
Builds on previous efforts by CARL to improve capacity at Canadian universities in the area of research data management:
• 2009 Research Data Management Toolkit: Unseen Opportunities
• 2010 Library Roles in Management Research Data
• 2011-2012 Proposal: “Canadian National Collaborative Data Infrastructure Project”
• 2013 RDM Course: Introduction to Research Data Management Services
PROJECT ARC AIM AND VISION
• A future in which Canada capitalizes on the trend towards data-intensive research and is a world leader in research and innovation
• This future is achievable with comprehensive support for research data management at a national scale.
• The aim of Project ARC is to improve our national capacity for the management, preservation, and re-use of research data.
PROJECT ARC – SCOPE
• Bring together existing library-based initiatives to better coordinate activities and build capacity across the country
• Lay the foundation for a library-based research data management network
• Work closely with other stakeholders (e.g. CANARIE, Compute Canada, Research Data Canada) to ensure integration with and support for other infrastructures and initiatives in Canada
PROJECT ARC – PRINCIPLES
• Data are a public good
• Intelligent access: openness, with respect for privacy
• Collaborative approaches: cost savings and sharing expertise
• Inclusiveness: aim to serve all researchers and create a more level playing field
• Commitment to standards and interoperability
• International relationships: liaise internationally and ensure our work is in keeping with international practices
• Respect for differences: flexibility to meet the needs of different regions, institutions, and disciplines
• Open source: Tools will be contributed back to the community
• Stewardship: a sense of responsibility for managing research data over the long term
OBJECTIVES OF PROJECT ARC
Liaising closely with all relevant stakeholders in this arena:
1. Provide support for institutions to deliver data management plans (DMPs)
2. Develop a plan for the implementation of a centre of expertise for the curation of research data in Canada
3. Undertake a pilot that will act as an exemplar for a national preservation service for research data
4. Develop an organizational framework and operational plan for a library-based research data management network in Canada
PORTAGE
At Project ARC mid-point (September 2014), a network name was proposed and concepts were refined…
The Portage network will have two major components:
• A distributed centre of expertise for research data management, and
• A national preservation system for research data that will evolve and expand over time
NETWORK CENTRE OF EXPERTISE
1) Comprehensive set of resources to support data management planning
– How-to guides, case studies, training materials
– Cooperation with UK Digital Curation Centre (DCC)
– Currently being collected on Project ARC website
2) National DMP automated tool to assist Canadian researchers in developing management plans
– DMPonline (originally developed by DCC) selected
3) Consulting services
– Draw on expertise of librarians and others from across the country
– Support data curation, training, DMPs, discovery, preservation, privacy-security-ethics
– Build human capacity across the country
NATIONAL PRESERVATION SYSTEM
(more from Chuck and Walter)
Advice and support for researchers depend on viable technical solutions!
• Continue a pilot in close collaboration with Compute Canada and RDC, including some of the domain data centres
• Domain data centres currently involved are Canadian Astronomy Data Centre and C-Brain, whose creation was supported by the CANARIE research software program
• Goal is to enable all interested academic libraries to participate, whether or not they have their own local infrastructure
• Complements high-performance computing infrastructure and domain repositories, and contributes an integration layer
A COLLABORATIVE EFFORT AMONG RDC, CARL Project ARC, Compute Canada, CANARIE, Scholar’s Portal, SFU Libraries, CANFAR, C-Brain, CPDN
Born at the DI Summit 2014
THE CONTEXT:
• Data are both a product and a resource for 21st century discovery.
• The TC3+ are preparing to require Data Management Plans as part of the funding application process.
• The federal government has extended its commitment to open government and open data to cover federally funded research:
THE CONTEXT:
The Government of Canada will maximize access to federally funded scientific research to encourage greater collaboration and engagement with the scientific community, the private sector, and the public.
Among the commitments for 2014 to 2016:
Launch of open access to publications and data resulting from federally funded scientific activities
Canada's Action Plan on Open Government 2014-16
http://open.canada.ca/
THE PROBLEM:
• Data that cannot be discovered cannot be open!
• Data that are only on someone’s hard drive or memory stick cannot be open!
• Data that are not curated cannot be open for long!
THE PROBLEM:
• Currently in Canada, most researchers lack access to the services and the infrastructure that would permit them to be good stewards of their research data and to make it accessible.
• Outside of some data-intensive disciplines, little is in place to provide for the long-term curation and preservation of data.
THE OPPORTUNITY:
• Many of the elements for a national system of data stewardship are in place: the required networks with CANARIE and the ORANs; storage systems at Compute Canada; data expertise in research libraries and the ARC Project of CARL; significant experience in developing repositories with CANFAR, C-Brain, and CPDN, among others.
THE CHALLENGE:
• Can we integrate those elements at a pilot level?
• Can we work with a small set of researchers to ingest their data into a storage and curation environment easily and seamlessly, in a manner that provides for easy retrieval?
• Can we create this opportunity first at a local level and then demonstrate integration among a few local sites into a prototypical national network that provides appropriate replication and the basis for long-term preservation?
THE ANTICIPATED RESULT:
• Anticipating meeting the challenge successfully, we hope to be able to arrive at a set of conclusions that will allow us to make recommendations on what would be required to grow such a prototypical system into a truly national network that would serve those parts of the research community currently unserved and would provide further support and backup for existing repositories, some of which have concerns about their long-term viability.
PROGRESS TO DATE:
• We have identified a model, about which Chuck Humphrey will speak in a moment.
• Building on a local-scale project at SFU, we have researcher data being moved into Compute Canada storage resources by library staff
• We have a plan for a similar activity at the University of Toronto to get underway in 2015
NEXT STEPS:
• We will shortly start having researchers do their own ingest directly into the repository and archive environment at SFU
• We will be looking at establishing a duplicative installation at another university
• We will be looking to test replication services
• We will look to have researchers use the system at a distance
• We will minutely detail the processes
• We will begin to discuss what it would take to scale
The Pilot
Working with existing digital technology and expertise, the Pilot is to assemble a research data management infrastructure demonstrating interoperability among data repositories and the archiving of research data.
LEVELS OF DATA STEWARDSHIP
• Research data management infrastructure supports data stewardship that occurs at different levels across the research lifecycle.
– The researcher at the project level
– The data repository level
– The interoperability level at the regional and national level
EXCHANGES AMONG LEVELS
• The exchange of research data and metadata among these three levels can encounter barriers or gaps.
• A Data Management Plan is helpful in identifying a pathway for data and metadata across levels, bridging gaps and overcoming barriers.
KEY OBJECTIVES
Because no national RDM infrastructure exists today, the pilot is building pathways across levels and assembling an operational system to demonstrate a community response to providing RDM infrastructure.
[Diagram labels: Archivematica Instance(s) managed by CARL/RDC; Writes Data Package to Compute Canada HPC Facilities; Custom search app or domain-specific search apps; Compute Canada Storage Systems; Interoperability Layer]