Organizing Scientific Competitions on the Semantic Web Sayoko Shimoyama, Robert Sidney Cox III, David Gifford and Tetsuro Toyoda Integrated Database Unit, Advanced Center of Computing and Communication (ACCC), RIKEN, Japan DEXA2013, August 27
Jul 24, 2015

Transcript
Page 1: Organizing Scientific Competitions on the Semantic Web

Organizing Scientific Competitions on the Semantic Web

Sayoko Shimoyama, Robert Sidney Cox III, David Gifford and Tetsuro Toyoda Integrated Database Unit, Advanced Center of Computing and Communication (ACCC),

RIKEN, Japan

DEXA2013, August 27

Page 2: Organizing Scientific Competitions on the Semantic Web

» Data repositories and directories for open data, such as the web-based CKAN (Comprehensive Knowledge Archive Network) system for storing and distributing data, help users register their data resources and locate related data.

» However, separating data from applications on the web makes the collaboration between data and applications invisible, so contributions to opening and maintaining data are not valued as appropriately as contributions to Apps.

This situation does not motivate people to contribute by donating their own datasets.

August 27, 2013

Page 3: Organizing Scientific Competitions on the Semantic Web

» To overcome this situation, we developed LinkData as a data publishing platform and LinkDataApp as an application publishing platform.

» We combined them by automatically recording dependency graphs that relate Data to the Apps that use it.

This cycle enhances a wide range of synergistic collaborations.

Figure labels: "Create App for Data"; "Create Data to use with App"

Page 4: Organizing Scientific Competitions on the Semantic Web

1. Support functions for creating table data to upload:

» Users can create a template by entering metadata in LinkData’s GUI and then download it.

» The schema of any published Data can be reused for publishing new datasets.

» Users enter their data into this template to create their own table data for uploading.

Figure labels: create new; reuse; template

Page 5: Organizing Scientific Competitions on the Semantic Web

Library | phone number | number of books
Central library | 045-111-1111 | 154,265
West library | 045-222-2222 | 65,489
South library | 045-333-3333 | 98,548

Figure labels (repeated three times): Central library; phone number 045-111-1111; number of books 154,265

2. Conversion to RDF and publishing:

» Template data tables can be uploaded, converted to RDF, and published online at LinkData.org.
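As a rough illustration of the table-to-RDF step, the sketch below turns the library table into N-Triples, one statement per cell. The base URI and the way property names are built are assumptions made for this example; the slides do not show how LinkData.org actually mints identifiers.

```python
from urllib.parse import quote

# Hypothetical base URI, chosen only for this example.
BASE = "http://example.org/library/"


def row_to_ntriples(subject_name, properties):
    """Turn one table row into N-Triples lines, one per cell."""
    subject = f"<{BASE}{quote(subject_name)}>"
    lines = []
    for column, value in properties.items():
        predicate = f"<{BASE}property/{quote(column)}>"
        # Escape backslashes and double quotes so the literal stays valid.
        literal = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{subject} {predicate} "{literal}" .')
    return lines


# The rows of the example table on this slide.
table = {
    "Central library": {"phone number": "045-111-1111", "number of books": "154,265"},
    "West library": {"phone number": "045-222-2222", "number of books": "65,489"},
    "South library": {"phone number": "045-333-3333", "number of books": "98,548"},
}

triples = [line for name, row in table.items() for line in row_to_ntriples(name, row)]
print("\n".join(triples))
```

Each row name becomes a subject, each column header a property, and each cell a plain string literal.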

Page 6: Organizing Scientific Competitions on the Semantic Web

3. Application development support function:

» Application developers can access Data content through APIs provided by the LinkData platform.

» 8 formats are provided:

• TSV

• RDF/Turtle

• RDF/JSON

• RDF/XML

• RSS

• KML

• R (for statistical analysis)

• Simple Data Format

Due to these functions, LinkData supports not only publishing data but also using data.
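To give a feel for consuming one of these formats, here is a minimal sketch that reads TSV text into records. The TSV content is made up from the library example; the actual column layout and download URL of a LinkData TSV export are not given in the slides, so none is assumed here.

```python
import csv
import io

# Hypothetical TSV content; a real LinkData export may use different columns.
tsv_text = (
    "name\tphone number\tnumber of books\n"
    "Central library\t045-111-1111\t154,265\n"
    "West library\t045-222-2222\t65,489\n"
    "South library\t045-333-3333\t98,548\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


records = read_tsv(tsv_text)

# Strip the thousands separators before summing the book counts.
total_books = sum(int(r["number of books"].replace(",", "")) for r in records)
print(len(records), "libraries,", total_books, "books")
```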

Page 7: Organizing Scientific Competitions on the Semantic Web

1. Create a new App by editing a sample program:

» Users choose Data as an input and edit sample JavaScript programs on a web browser to develop their own original App.

2. Fork an App to publish as a new one:

» Published Apps on LinkDataApp can be forked.

» Users can fork and modify the program to publish it as a new App.

3. Change input Data to create a new App:

» Even a non-programmer can add new functionality to an App by changing the Input Data.

Figure labels: Choose input Data and create; Change input Data; Fork; JavaScript Editor

Page 8: Organizing Scientific Competitions on the Semantic Web

Entity | Definition
Data | A single dataset that has been published by a User in LinkData
Application (App) | A single application that has been published by a User in LinkData
User | A user who has registered for a LinkData account

Graph | Term | Label | Definition
Data(new) → Data(old) | Reuse | Ldd | Create new Data by reusing existing Data
Data → User | Contributed | Ldu | The relationship between existing Data and the user who created the Data
App(new) → App(old) | Fork | Laa | Create a new App by reusing an existing App’s program code
App → Data | Load | Lad | Create an App by specifying some files as input from some particular Data
App → User | Contributed | Lau | The relationship between an existing App and the user who created the App
User(A) → User(B) | Follow | Luu | User A follows user B to receive updates and information on Data and Apps evaluated by user B
User → Data | Vote | Lud | A user rates the Data in question as Useful or Un-useful
User → App | Vote | Lua | A user rates the App in question as Useful or Un-useful
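One way to picture these definitions is as a typed graph in which each edge label fixes the kinds of its two endpoints. The sketch below is an assumption-laden illustration: the class names, the identifiers, and the validation rule are invented here and do not describe LinkData's internal storage.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    DATA = "Data"
    APP = "App"
    USER = "User"


# label -> (term, source kind, target kind), following the table above.
EDGE_TYPES = {
    "Ldd": ("Reuse", Kind.DATA, Kind.DATA),
    "Ldu": ("Contributed", Kind.DATA, Kind.USER),
    "Laa": ("Fork", Kind.APP, Kind.APP),
    "Lad": ("Load", Kind.APP, Kind.DATA),
    "Lau": ("Contributed", Kind.APP, Kind.USER),
    "Luu": ("Follow", Kind.USER, Kind.USER),
    "Lud": ("Vote", Kind.USER, Kind.DATA),
    "Lua": ("Vote", Kind.USER, Kind.APP),
}


@dataclass(frozen=True)
class Node:
    kind: Kind
    ident: str  # hypothetical identifier, e.g. "app-example"


@dataclass(frozen=True)
class Edge:
    label: str
    source: Node
    target: Node

    def __post_init__(self):
        # Reject edges whose endpoints do not match the label's definition.
        term, src_kind, dst_kind = EDGE_TYPES[self.label]
        if (self.source.kind, self.target.kind) != (src_kind, dst_kind):
            raise ValueError(
                f"{self.label} ({term}) must go {src_kind.value} -> {dst_kind.value}"
            )


app = Node(Kind.APP, "app-example")
data = Node(Kind.DATA, "data-example")
load = Edge("Lad", app, data)  # an App loading Data is accepted
print(EDGE_TYPES[load.label][0], load.source.ident, "->", load.target.ident)
```

Reversing the endpoints, as in `Edge("Lad", data, app)`, raises a `ValueError`, which keeps the recorded graph consistent with the definitions.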

Page 9: Organizing Scientific Competitions on the Semantic Web

» Count of hosted Data and Apps in LinkData (as of August 2013)

Figure values: 655, 316

» Count of relationships among Data, Apps and Users in LinkData

Kind of relationship | Count
Load (App to Data) | 1508
Fork (App to App) | 153
Reuse (Data to Data) | 41
Follow (User to User) | 54
Vote (User to Data) | 279
Vote (User to App) | 108

There is a stronger synergy cycle between data resources and applications than “in data” (between data and data) or “in app” (between app and app).
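The comparison can be made concrete with a little arithmetic on the counts from this slide. The sketch below only restates the published numbers as shares and ratios; it is not part of the original analysis.

```python
# Relationship counts as listed on the slide.
counts = {
    "Load (App to Data)": 1508,
    "Fork (App to App)": 153,
    "Reuse (Data to Data)": 41,
    "Follow (User to User)": 54,
    "Vote (User to Data)": 279,
    "Vote (User to App)": 108,
}

total = sum(counts.values())
for kind, n in counts.items():
    # Share of each relationship kind among all recorded relationships.
    print(f"{kind}: {n} ({n / total:.1%})")

# How many times more often Apps load Data than Apps fork Apps or Data reuse Data.
load_per_fork = counts["Load (App to Data)"] / counts["Fork (App to App)"]
load_per_reuse = counts["Load (App to Data)"] / counts["Reuse (Data to Data)"]
print(f"Load/Fork = {load_per_fork:.1f}, Load/Reuse = {load_per_reuse:.1f}")
```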

Page 10: Organizing Scientific Competitions on the Semantic Web

Example dependency graph among Data, Apps and Users.

» Dark Green edges indicate Data to Data reuse

» Red edges indicate Data to App loading

» Blue edges indicate App to App forking

» Bright Green edges indicate User ownership “contribution”

» Grey edges indicate votes to rate applications by users, and following of other users

The dependency graph allows users to dynamically contribute to and benefit from an automated rating of both data and applications.

Interactive Gene Association Matrix application created on LinkDataApp http://app.linkdata.org/app/app1s64i

Page 11: Organizing Scientific Competitions on the Semantic Web

Organizing Scientific Competitions on the LinkData platform

» For the synthetic biology competition GenoCon2 (http://genocon.org), we challenged participants to design novel regulatory DNA for controlling gene expression in the thale cress plant Arabidopsis thaliana.

» In addition to DNA sequences, we offered programs for DNA design.

Page 12: Organizing Scientific Competitions on the Semantic Web

PromoterCAD: Data-Driven Design of Plant Regulatory DNA

» To give non-experts an opportunity to design DNA, we built a computer-aided design tool on the LinkData platform, called PromoterCAD.

» Using PromoterCAD function modules, genes with the desired properties can be found and mined for regulatory motifs. These are introduced into the synthetic promoter by user choice of regulatory position. Repeating this process can create complex regulation at the promoter.

» Finally, the DNA design is exported for error and safety checking, DNA synthesis, and experimental characterization.
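The step of placing a motif at a user-chosen position, and repeating it, can be sketched as follows. This is a toy illustration, not PromoterCAD's code: the function name, the choice to overwrite bases rather than lengthen the sequence, the placeholder backbone, and the two motif strings are all assumptions made for the example.

```python
def insert_motif(promoter, motif, position):
    """Overwrite bases of promoter with motif, starting at a 0-based position."""
    valid = set("ACGT")
    promoter, motif = promoter.upper(), motif.upper()
    if not set(promoter) <= valid or not set(motif) <= valid:
        raise ValueError("sequences must contain only A, C, G, T")
    if position < 0 or position + len(motif) > len(promoter):
        raise ValueError("motif does not fit at that position")
    # Keep the promoter length fixed by replacing the covered bases.
    return promoter[:position] + motif + promoter[position + len(motif):]


# Placeholder 20-base backbone and illustrative motifs (not from the slides).
design = "A" * 20
design = insert_motif(design, "CACGTG", 5)
design = insert_motif(design, "TATAAA", 14)
print(design)
```

Repeating the call with different motifs and positions mirrors the iterative process described above; the resulting string would then go on to checking and synthesis.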

http://app.linkdata.org/app/app1s335i

Page 13: Organizing Scientific Competitions on the Semantic Web

PromoterCAD LinkData system architecture for DNA design incorporates database information with user knowledge

» PromoterCAD uses several data sources for Tissue / Time specific promoter design.

Users can add their own data suited to promoter design.

Users can also create a new App or fork a pre-existing App for design.

Figure labels: fork; add; create new

Page 14: Organizing Scientific Competitions on the Semantic Web

Here we show how this cycle enhances collaborative synergy in the web-based scientific competition for synthetic biology promoter design.

This graph shows interaction between Data (Green box), Apps (Blue box), and Users (Grey box).

» Dark Green edges indicate Data to Data reuse

» Red edges indicate Data to App loading

» Blue edges indicate App to App forking

» Bright Green edges indicate User ownership “contribution”

» Grey edges indicate votes to rate applications by users, and following of other users

The App “GenoCon PromoterCAD” at http://app.linkdata.org/app/app1s94i is shown in the graph. The Dataset http://linkdata.org/work/rdf1s339i “Speedup Lists of Developmental Coexpression” is a source for this graph.

Page 15: Organizing Scientific Competitions on the Semantic Web

• For example, the highly voted application ID:137 “A Promoter Design to Maintain the Fertility of Transgenic Plant by new Plugin MotifRanking” is a fork of ID:94 PromoterCAD.

• This example graph shows that ID:94 was forked by 6 apps and voted for by 1 user.

• It shows that ID:137 was forked 0 times and voted for by 5 users, for a score of 5.

• In this fashion, each app can be compared in turn for total activity and usefulness.

Figure labels: ID:94: 6 forks, 1 vote; ID:137: 5 votes, 0 forks
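The per-app comparison amounts to counting incoming Fork and Vote edges. Below is a minimal sketch under stated assumptions: only the four counts come from the slide; the edge list, the node names, and the rule that each vote adds one point are invented here, and Un-useful votes are not modeled.

```python
from collections import Counter

# (term, source, target) edges built to match the counts on the slide;
# the individual fork and user names are placeholders.
edges = (
    [("Fork", f"fork-{i}", "app94") for i in range(5)]
    + [("Fork", "app137", "app94")]
    + [("Vote", "user-a", "app94")]
    + [("Vote", f"user-{i}", "app137") for i in range(5)]
)


def activity(edge_list):
    """Count incoming Fork and Vote edges per target app."""
    forks, votes = Counter(), Counter()
    for term, _source, target in edge_list:
        if term == "Fork":
            forks[target] += 1
        elif term == "Vote":
            votes[target] += 1
    return forks, votes


forks, votes = activity(edges)
for app in ("app94", "app137"):
    print(app, "forks:", forks[app], "votes:", votes[app])
```

Under these assumptions the heavily forked app and the highly voted app stand out on different measures, which is the contrast the slide draws between total activity and usefulness.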

Page 16: Organizing Scientific Competitions on the Semantic Web

LinkData Application app1s137i showing usability ranking and user voting buttons on top right. http://app.linkdata.org/app/app1s137i

GenoCon2 Contest Activity:

» There are over 40 international submissions, including submissions from the USA, Egypt and Japan.

» Users cooperated to create original designs that were modified and possibly improved by other users.

» Team collaboration was aided by the open nature of the design platform; 13 promoter designs are being considered for final construction in transgenic plants.

The semantic dependency-graph-based system with evaluation by experiment will foster a rapid biological knowledge cycle where programmers, researchers, and amateurs can all contribute.

Page 17: Organizing Scientific Competitions on the Semantic Web

» A scientific competition was successfully organized on the LinkData platform that records dependency graphs among datasets and applications.

» It was found that participants in the competition generated many dependency graphs by forking pre-existing applications or reusing schema of pre-existing datasets.

» These creative activities cannot be observed explicitly unless they are recorded, for example as dependency graphs among datasets and applications on the platform.

» Hence, we suggest that a worldwide system be established to record and harvest such dependency graphs from distributed data platforms and application-development platforms around the world, so that intellectual and creative activities that use open datasets for application development are properly recorded.

Page 18: Organizing Scientific Competitions on the Semantic Web

Dr. Takaho Endo for creating a biological visualization tool on LinkDataApp.

Ms. Yuko Yoshida for development of the converter and valuable discussion.

Dr. Shuji Kawaguchi for giving advice on the score calculation.

Dr. Koro Nishikata for testing LinkData functions.

Dr. Masahiro Mochizuki for testing and adding the MotifRanking tool.

Mr. Chanaka Perera, Mr. Uditha Punchihewa, Mr. Gayan Hewathanthri, Mr. Hiroaki Osada, Mr. Kazuro Fukuhara and Mr. Kiyoshi Mizumoto (Axiohelix Co., Ltd.) for web application and LinkData development.

The committee of Linked Open Data Challenge Japan for continuing interest and encouragement.

This work was supported by: The National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency (JST).

