The agINFRA Germplasm Working Group

Post on 27-Jan-2015


DESCRIPTION

Presentation about the agINFRA Germplasm Working Group (http://wiki.aginfra.eu/index.php/Germplasm_Working_Group). Presented during Session 1 of the 1st International e-Conference on Germplasm Data Interoperability (https://sites.google.com/site/germplasminteroperability/)

Transcript

The Germplasm Working Group

Dr. Vassilis Protonotarios
Agricultural Biotechnologist, PhD
Agro-Know Technologies, Greece

e-Conference on Germplasm Data Interoperability
Session 1: “The vision of Linked Germplasm Data”

Structure of the presentation

1. Background
  – About the agINFRA project
  – Issues related to data sharing

2. The Germplasm Working Group
  – Objectives
  – Wiki
  – Link with RDA

3. The next steps

Background

The agINFRA project

• A project funded under the FP7 program of the EC
• Consortium with expertise on
  – Technology / infrastructures
  – Data / data management

Combined to facilitate agricultural data sharing

More info at: www.aginfra.eu

The agINFRA project

• Aims to enhance the interoperability between agricultural data sources
  – Data sharing by
    • Metadata aggregation & linking data
    • Design and deploy the linked ag-data framework
  – Methodology for linking data
  – Provide the infrastructure needed
    • Both cloud- and grid-based services
    • Tools, APIs etc.

agINFRA major data types

• Bibliographic
• Agri Statistics & Economics
• Educational
• Germplasm
• Soil data
• Profiles
• Raw data
• Other?

agINFRA major data sources

Data type | Data provider(s)
Bibliographic | FAO AGRIS; CASDD (CAAS)
Educational | Organic.Edunet; Green Learning Network; LAFLOR
Germplasm | Chinese Crop Germplasm Information System (CAAS); Italian National Germplasm Database (CRA)
Soil data | Italian National Center for Soil Mapping
Statistical | FAOSTAT; CountrySTAT
Researchers’ profiles, organizations & events | AGRIVIVO

Focusing on germplasm

[Figure: data flow across three tiers. Local databases (Italian university, Italian research center, Chinese research center); national databases (Italian, Chinese); aggregators (GENESYS, EURISCO, GBIF).]

Focusing on germplasm

[Figure: local databases (Italian university, Italian research center, Chinese research center); national databases (Italian, Chinese); aggregators (GENESYS, EURISCO).]

The issue?

• Heterogeneity!
  – Data types
  – Data formats
  – Data management workflows
  – Standards used
  – Metadata exposure options
  – …

• Lack of connectivity with other data sources
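To make the heterogeneity issue concrete, here is a minimal sketch of a field-name crosswalk that renames records from two providers into one shared layout. The provider names, the field names on both sides, and the sample values are all hypothetical and chosen for illustration only; they are not taken from the agINFRA data sources.

```python
# Hypothetical crosswalks: provider-specific field name -> shared field name.
# Real provider exports will use different names.
CROSSWALKS = {
    "provider_a": {
        "acc_no": "accession_id",
        "species_name": "taxon",
        "country": "origin_country",
    },
    "provider_b": {
        "AccessionNumber": "accession_id",
        "Taxon": "taxon",
        "CountryOfOrigin": "origin_country",
    },
}


def harmonize(record, provider):
    """Rename a provider-specific record's fields to the shared layout."""
    crosswalk = CROSSWALKS[provider]
    harmonized = {"source": provider}
    for local_name, value in record.items():
        common_name = crosswalk.get(local_name)
        if common_name is None:
            continue  # no agreed mapping for this field, so it is dropped
        if isinstance(value, str):
            value = value.strip()  # tidy stray whitespace from exports
        harmonized[common_name] = value
    return harmonized


records = [
    harmonize(
        {"acc_no": "A-001", "species_name": "Triticum aestivum ", "country": "ITA"},
        "provider_a",
    ),
    harmonize(
        {"AccessionNumber": "B-042", "Taxon": "Oryza sativa", "Curator": "n/a"},
        "provider_b",
    ),
]

for rec in records:
    print(rec)
```

Fields without an agreed mapping (such as the made-up "Curator" field) are silently dropped here; a real workflow would more likely log them for review.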

The Germplasm Working Group

The Germplasm Working Group

• Created in the context of the agINFRA project
• Initially included agINFRA stakeholders
  – now expanded to host all stakeholders

• The group is NOT a group of experts on germplasm data!

The scope of the Germplasm WG

• Aims to enable/enhance interoperability between germplasm databases
  – By developing the services for
    • exchanging their data and
    • delivering their data to other partners

• Focusing on three actions:
  1. IDENTIFY
  2. ORGANIZE
  3. PROPOSE

Germplasm WG objectives

• IDENTIFY: collect all information related to germplasm data
    • People/groups
    • Namespaces (metadata, KOS)
    • Standards
    • Workflows
    • Events

• ORGANIZE: engage all stakeholders & available resources, analyze existing standards, facilitate collaboration

• PROPOSE: linked data framework to connect data sources
    • facilitate data sharing between germplasm data sources

Germplasm related information

• Data management workflows
• Metadata schemas
• Working groups in germplasm
• Events (for connecting stakeholders)
• KOS (ontologies, thesauri, vocabularies etc.)
• Data exposure capabilities


Proposed methodology

1. Analyze metadata schemas & KOSs used to describe germplasm resources

2. Define attributes & vocabularies that can be used to expose germplasm resources in linked data format.

3. Provide a set of recommendations for the exposure of germplasm resources as linked data

4. Embed the recommendations in the data infrastructure of agINFRA, to allow the exposure of germplasm resources as Linked Open Data (LOD).
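As an illustration of what exposing a germplasm record as linked data could look like, the sketch below serializes one record as N-Triples. It is not the working group's recommendation: the `http://example.org/accession/` namespace, the record fields, and the choice of Darwin Core properties are assumptions made for this example only.

```python
from urllib.parse import quote

# Placeholder namespace for accession URIs (an assumption for this example).
BASE = "http://example.org/accession/"
# Darwin Core terms namespace; using these properties here is an
# illustrative choice, not a mapping endorsed by the source.
DWC = "http://rs.tdwg.org/dwc/terms/"

# Hypothetical record field -> RDF property.
PROPERTY_MAP = {
    "accession_id": DWC + "catalogNumber",
    "taxon": DWC + "scientificName",
    "institute": DWC + "institutionCode",
    "origin_country": DWC + "countryCode",
}


def escape_literal(value):
    """Escape the characters that N-Triples string literals require."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(record):
    """Return one N-Triples line per mapped field present in the record."""
    subject = "<" + BASE + quote(record["accession_id"], safe="") + ">"
    lines = []
    for field, prop in PROPERTY_MAP.items():
        if field in record:
            literal = escape_literal(str(record[field]))
            lines.append(f'{subject} <{prop}> "{literal}" .')
    return lines


example = {
    "accession_id": "A-001",
    "taxon": "Triticum aestivum",
    "origin_country": "IT",
}
triples = to_ntriples(example)
print("\n".join(triples))
```

Plain string formatting keeps the sketch dependency-free; a production pipeline would more plausibly use an RDF library and typed or language-tagged literals.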

The Germplasm WG wiki

• Central point of reference

• Freely accessible (no login required)

http://wiki.aginfra.eu/index.php/Germplasm_Working_Group

Information available so far

• Vision
• Activities
• Outcomes
• Participants
• Next steps
• Useful resources
  – Data sources
  – Standards
  – Services
  – Stakeholders

• Events

Key outcomes of the group

• Dossier on Germplasm Information:
  – Major programs
  – Major information systems and services
  – agINFRA germplasm data sources (CGRIS & CRA)
  – Core standards for germplasm information
  – Plant nomenclature, taxonomies and ontologies
  – Plant genomic resources
  – Related references and links

• Freely available from the Germplasm Group wiki

Existing participants

Our wish list (tentative list)

Reusing experiences from …and working closely with

Connection with RDA

• RDA: Research Data Alliance (https://rd-alliance.org)

• Aims to “accelerate and facilitate research data sharing and exchange”

• Structure:
  – Interest Groups: Cover wider topics
  – Working Groups: Working on focused topics

Connection with RDA

• Representation of agINFRA Germplasm WG in
  – 1st RDA Plenary Meeting (March 2013, Gothenburg, Sweden)
  – 2nd RDA Plenary Meeting (September 2013, Washington D.C., USA)

• Suggestion for a Germplasm WG in RDA

Link between WG and RDA Groups

RDA IG/WG
• Collection of large-scale data
• Collection of requirements
• Interaction with other IGs/WGs (e.g. metadata, LD)
• Application in more cases
• Wider exposure of outcomes
• Development of Best Practices

agINFRA WG
• Interactions with data providers
• Two (2) case studies
• Analysis of existing standards
• Collection of requirements
• Definition of data management workflows
• Development & adaptation of tools and services
• Development of Best Practices

The next steps

Towards the linking of germplasm data sources

1. Definition and application of the linked data approach for the agINFRA germplasm data sources
2. Recording and documentation of the process
3. Identification of issues
4. Suggestion for solutions to these issues
5. Fine-tuning of workflow
6. Development of Best Practices

…and more next steps

• Update the existing analysis with new data
• Collect new user requirements
• (Re)define the mappings between metadata schemas and KOSs
• Fine-tune the linked data approach
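One way to picture a mapping between locally used values and a shared KOS is a simple term lookup, sketched below. The concept URIs under `http://example.org/kos/` and the small term table are hypothetical; a real mapping would point at an agreed thesaurus or ontology.

```python
# Hypothetical shared concepts (placeholder URIs, not a real thesaurus).
CONCEPTS = {
    "wheat": "http://example.org/kos/crop/wheat",
    "rice": "http://example.org/kos/crop/rice",
}

# Hypothetical local terms, as they might appear in provider records.
LOCAL_TERMS = {
    "frumento": "wheat",  # Italian
    "bread wheat": "wheat",
    "riso": "rice",  # Italian
    "paddy": "rice",
}


def resolve_concept(term):
    """Return the shared concept URI for a local term, or None if unmapped."""
    key = " ".join(term.lower().split())  # normalize case and whitespace
    concept = LOCAL_TERMS.get(key, key)
    return CONCEPTS.get(concept)


def unmapped_terms(terms):
    """List the distinct terms that still need a mapping decision."""
    return sorted({t for t in terms if resolve_concept(t) is None})


print(resolve_concept("  Frumento "))
print(unmapped_terms(["riso", "orzo", "orzo"]))
```

Collecting the unmapped terms gives a concrete worklist each time new data arrives, which fits the idea of revisiting the mappings as the analysis is updated.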

Source: http://verastic.com/social/why-do-people-not-say-thank-you.html

Contact me: vprot@agroknow.gr
