African Open Science Platform Geoffrey Boulton CODATA Pretoria December 2016
The digital revolution
storage – analysis – communication
[Figure: Global information storage capacity, in optimally compressed bytes, comparing analogue storage and digital storage for 1986, 1993, 2000, 2007 and 2014 ("Explosion of the digital revolution"). Value labels on the chart: 19 Exabytes; 280 Exabytes; 2014: 4000 Exabytes.]
Based on: http://www.martinhilbert.net/WorldOnfoCapacity.html (1 Exabyte = 10^18 bytes)
The technological bases for open science, if we choose to use them!
Key Questions
Why Open Data/Open Science?
The international context
What is a Platform, how should it be structured, what is its value?
How is it governed?
Role of CODATA & ICSU
International Accord on Open Data
First statement on Open Data by the International Scientific Community
• Identifies the opportunities and challenges of the data revolution as the dominant issue of policy for science
• Sets out 12 guiding principles for the practice of open data
• Outlines the responsibilities of all stakeholders in supporting such practice
• Addresses the boundaries of openness, concluding that open data should be the default position for publicly funded science
EMBL-EBI services: a collaborative enterprise
Labs around the world send us their data and we…
• Archive it
• Classify it
• Share it with other data providers
• Analyse, add value and integrate it
• Provide tools to help researchers use it
Discipline-driven Government-driven
International Systemic Platforms/Commons
European science cloud
CODATA – ICSU COMMISSION ON ONTOLOGIES & METADATA FOR SCIENCE & TECHNOLOGY
International Union of Crystallography
DECADE OF DATA?
The Open Data Iceberg
The Technical Challenge
The Consent Challenge
The Ecosystem Challenge
The Funding Challenge
The Support Challenge
The Skills Challenge
The Incentives Challenge
The Mindset Challenge
Processes & Organisation
People: motivation and ethos
National/Regional Infrastructure
Technology
African Open Science Platform
Purpose:
• To provide a federated virtual space for scientists to find, deposit,
manage, share and reuse data, software and metadata
Functions:
• Establishing common principles, policies and practices for data
acquisition and use, and providing the facilitating tools in ways that
are adapted to varying national, disciplinary and application
priorities and approaches.
• Recognising the roles and developing responsibilities of different
actors at all levels in national scientific ecosystems.
• Developing the technical capacities of researchers and data
professionals.
• Creating meaning from data: awareness and access to developing
A Roadmap for Implementation
A current priority is the creation of a Technical Advisory Board that will produce the roadmap to
determine how the above functions will be prioritised and how they will be implemented in ways
that adapt to current national priorities and research initiatives.
a) Principles – Policies – Practices – Tools
• Shared open data principles (Science International Accord on Open Data)
• A computational environment for access, utilisation and storage
• A common digital data compliance model that describes the properties of data
that enable them to be Findable, Accessible, Interoperable and Reusable
(FAIR)
• Publicly available datasets that adhere to accepted principles and practices
• Software services and tools to facilitate access to data and their responsible
use
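One way to picture the compliance model in item a) is a checklist that maps each FAIR property to the metadata fields a dataset record must carry. The sketch below is only an illustration under stated assumptions: the `DatasetRecord` class, its field names and the `FAIR_CHECKS` mapping are hypothetical and are not taken from the Platform or from any published specification. It reads the final letter of FAIR as Reusable, as in the FAIR guiding principles.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; the field names are illustrative only."""
    identifier: str = ""    # persistent identifier, e.g. a DOI
    title: str = ""         # human-readable name used for discovery
    access_url: str = ""    # where the data can be retrieved
    data_format: str = ""   # open, documented format name
    licence: str = ""       # terms under which the data may be reused
    provenance: str = ""    # how the data were produced


# Assumed mapping from each FAIR property to the fields that support it.
FAIR_CHECKS = {
    "findable": ("identifier", "title"),
    "accessible": ("access_url",),
    "interoperable": ("data_format",),
    "reusable": ("licence", "provenance"),
}


def fair_report(record: DatasetRecord) -> dict[str, bool]:
    """Return, per FAIR property, whether all supporting fields are filled in."""
    return {
        prop: all(getattr(record, name).strip() for name in names)
        for prop, names in FAIR_CHECKS.items()
    }


# Made-up example record: no licence or provenance has been supplied.
record = DatasetRecord(
    identifier="doi:10.0000/example",
    title="Example rainfall series",
    access_url="https://example.org/data/rainfall.csv",
    data_format="CSV",
)
print(fair_report(record))
# {'findable': True, 'accessible': True, 'interoperable': True, 'reusable': False}
```

A real compliance model would test far more than the presence of fields (for instance whether the identifier resolves, or whether the licence is machine-readable), but even this minimal form shows how funders or publishers could screen deposits automatically.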
b) National Science Ecosystems
• Governments: enunciate policy, create incentives
• Funders: treat the costs of open data as part of the costs of doing science; require FAIR data
deposition from the projects they fund; collaborate in Platform evolution
• Universities and Institutes: research data management; capacity building;
research support; incentives for researchers
• Publishers: require concurrent FAIR data deposition
• Researchers: changing the mindset – data custodians not owners
c) Capacity building amongst researchers and data professionals
• Coordination of technical capacity building exercises, including scaled-up versions
of existing CODATA training workshops and the CODATA/RDA School of Research Data
Science in Africa
• Collaboration with disciplinary bodies in offering discipline-specific workshops
• Discussions with universities about their longer-term adoption of data science
curricula
d) Creating Meaning from Data
• Ensuring access to cutting edge analytic tools
• Matching analytic tools for big data to project purpose
• Using machine learning
• Applying semantic methods to data integration
• Developing/using relevant ontologies and vocabularies for discovery and
integration
• Linking with international efforts in data science and application areas
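The role of shared vocabularies in data integration, listed under "Creating Meaning from Data" above, can be sketched in a few lines. This is a deliberately simplified stand-in for ontology-based integration: the `SHARED_VOCABULARY` mapping, the term names and the station records are all made up for illustration and do not come from any real vocabulary or dataset.

```python
# Hypothetical shared vocabulary: local field name -> agreed common term.
SHARED_VOCABULARY = {
    "temp_c": "air_temperature",
    "temperature": "air_temperature",
    "rain_mm": "precipitation_amount",
    "rainfall": "precipitation_amount",
}


def harmonise(record: dict) -> dict:
    """Rename known keys to the shared terms; leave unknown keys unchanged."""
    return {SHARED_VOCABULARY.get(key, key): value for key, value in record.items()}


def integrate(*sources: list[dict]) -> list[dict]:
    """Combine records from several sources after harmonising their field names."""
    return [harmonise(rec) for source in sources for rec in source]


# Two made-up sources that describe the same quantities with different labels.
station_a = [{"site": "A", "temp_c": 21.5, "rain_mm": 3.0}]
station_b = [{"site": "B", "temperature": 19.0, "rainfall": 0.0}]

merged = integrate(station_a, station_b)
print(merged)
# [{'site': 'A', 'air_temperature': 21.5, 'precipitation_amount': 3.0},
#  {'site': 'B', 'air_temperature': 19.0, 'precipitation_amount': 0.0}]
```

Once both sources use the same terms, their records can be searched and analysed together, which is the practical benefit that agreed ontologies and vocabularies are meant to deliver at scale.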
Pilot Phase
Funded by DST/NRF - Managed by ASSAf - Directed by CODATA
Interim Governance
Advisory Council
Technical Advisory Board
Pilot Phase Priorities
Developing the Partnership
Developing Governance
Creating a Roadmap
Stimulating engagement