BioMed Central Roadshow Auckland, 26 February, 2015 Dr Nicole Nogoy, Commissioning Editor, GigaScience [email protected]
Page 1: Nicole Nogoy at the Auckland BMC RoadShow

BioMed Central Roadshow, Auckland, 26 February, 2015

Dr Nicole Nogoy, Commissioning Editor, [email protected]

Page 2: Nicole Nogoy at the Auckland BMC RoadShow

www.gigasciencejournal.com

Journal, data-platform and database

for large-scale data

Editor-in-Chief: Laurie Goodman

Executive Editor: Scott Edmunds

Commissioning Editor: Nicole Nogoy

Lead Curator: Chris Hunter

Data Platform: Peter Li

Data Scientist: Rob Davidson

Database Developer: "Jesse" Xiao Si Zhe

in conjunction with

Page 3: Nicole Nogoy at the Auckland BMC RoadShow

Challenges/Opportunities in the Data-Driven Era

Quick response to climate change, food security & disease outbreaks

Using networking power of the internet to tackle problems

Can ask new questions & find hidden patterns & connections

Build on each other's efforts more quickly & efficiently

More collaborations across more disciplines

Harness wisdom of the crowds: crowdsourcing, citizen science, crowdfunding

Enables:

Enabled by: removing silos, standards/formats, open-access/data

Challenges:

Page 4: Nicole Nogoy at the Auckland BMC RoadShow

Not enabled by: paywalls, silos, dead trees

1812, 1665, 1869

• Scholarly articles are merely advertisement of scholarship. The actual scholarly artefacts, i.e. the data and computational methods, which support the scholarship, remain largely inaccessible (Jon B. Buckheit and David L. Donoho, WaveLab and Reproducible Research, 1995)

• Lack of transparency, lack of credit for anything other than “regular” dead tree publication

• If there is interest in data, only to monetise & repackage

Page 5: Nicole Nogoy at the Auckland BMC RoadShow

Problem: growing replication gap

1. Ioannidis et al. (2009). Repeatability of published microarray gene expression analyses. Nature Genetics 41: 14

2. Ioannidis JPA (2005). Why Most Published Research Findings Are False. PLoS Med 2(8)

Out of 18 microarray papers, results from 10 could not be reproduced

Page 6: Nicole Nogoy at the Auckland BMC RoadShow

Growing Issue: increasing number of retractions

>15X increase in last decade

Strong correlation of “retraction index” with higher impact factor

1. Science publishing: The trouble with retractions http://www.nature.com/news/2011/111005/full/478026a.html

2. Retracted Science and the Retraction Index http://iai.asm.org/content/79/10/3855.abstract?

At the current rate of increase, by 2045 as many papers will be retracted as published!

Page 7: Nicole Nogoy at the Auckland BMC RoadShow

How

Page 8: Nicole Nogoy at the Auckland BMC RoadShow

GigaSolution: Deconstructing the paper

www.gigadb.org

www.gigasciencejournal.com

Utilizes big-data infrastructure and expertise from:

Combines and integrates:

Open-access journal

Data Publishing Platform

Data Analysis Platform

Page 9: Nicole Nogoy at the Auckland BMC RoadShow

• Data
• Software
• Review
• Re-use…

= Credit

Credit where credit is overdue:

“One option would be to provide researchers who release data to public repositories with a means of accreditation.”

“An ability to search the literature for all online papers that used a particular data set would enable appropriate attribution for those who share.”

Nature Biotechnology 27, 579 (2009)

New incentives/credit

Page 10: Nicole Nogoy at the Auckland BMC RoadShow

Anatomy of a Publication

Data

Idea

Study

Analysis

Answer

Metadata

Page 11: Nicole Nogoy at the Auckland BMC RoadShow

Anatomy of a Data Publication

Data

Idea

Study

Analysis

Answer

Metadata

Page 12: Nicole Nogoy at the Auckland BMC RoadShow

Examples

Page 13: Nicole Nogoy at the Auckland BMC RoadShow

To maximize its utility to the research community and aid those fighting the current epidemic, genomic data is released here into the public domain under a CC0 license. Until the publication of research papers on the assembly and whole-genome analysis of this isolate we would ask you to cite this dataset as:

Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang, J; Zhang, Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J; Peng, Y; Pu, F; Sun, Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X; Chen, F; Yin, X; Song, Y; Rohde, H; Li, Y; Wang, J; Wang, J and the Escherichia coli O104:H4 TY-2482 isolate genome sequencing consortium (2011). Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI Shenzhen. doi:10.5524/100001 http://dx.doi.org/10.5524/100001

Our first DOI:

To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring rights to Genomic Data from the 2011 E. coli outbreak. This work is published from: China.
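The dataset citation on this slide gives the same identifier twice, once as doi:10.5524/100001 and once as a resolver link. A minimal Python sketch of that mapping follows. The function name doi_to_url and the validation rule are illustrative assumptions, not part of GigaDB or any DOI library; the resolver prefix is the one printed on the slide.

```python
from urllib.parse import quote

# Resolver prefix exactly as it appears in the slide's citation.
DOI_RESOLVER = "http://dx.doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn 'doi:10.5524/100001' or '10.5524/100001' into a resolver URL.

    Hypothetical helper for illustration only.
    """
    cleaned = doi.strip()
    # Drop an optional, case-insensitive 'doi:' label.
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    # Rough sanity check (an assumption): DOIs start with '10.' and
    # contain a '/' separating prefix and suffix.
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"not a DOI: {doi!r}")
    # Keep '/' as-is; percent-encode characters that are unsafe in a URL path.
    return DOI_RESOLVER + quote(cleaned, safe="/")


print(doi_to_url("doi:10.5524/100001"))
```

The same helper applies to the other DOIs quoted in this talk, such as 10.1186/2047-217X-1-18.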

Page 14: Nicole Nogoy at the Auckland BMC RoadShow
Page 15: Nicole Nogoy at the Auckland BMC RoadShow

IRRI GALAXY

Beneficiaries of the genomics revolution?

Rice 3K project: 3,000 rice genomes, 13.4TB public data

Page 16: Nicole Nogoy at the Auckland BMC RoadShow

Avian Phylogenomics Project

Reference: Piotr Zurek - Kea (Nestor notabilis) (2007). Creative Commons Attribution-Share Alike 2.0 Generic license (via Wikimedia Commons)

Tītitipounamu, Rifleman, female at left and male at right. Endemic to New Zealand. Creative Commons Public Domain (via Wikimedia Commons)

Page 17: Nicole Nogoy at the Auckland BMC RoadShow

Disseminating new types of data

Page 18: Nicole Nogoy at the Auckland BMC RoadShow

NO

Collaborations with Pensoft & PLOS

Cyber-centipedes & virtual worms

Page 19: Nicole Nogoy at the Auckland BMC RoadShow
Page 20: Nicole Nogoy at the Auckland BMC RoadShow

How are we supporting data reproducibility?

Data sets

Analyses

Open-Paper

Open-Review

DOI:10.1186/2047-217X-1-18

~21,000 accesses

Open-Code

8 reviewers tested data on FTP server & named reports published

DOI:10.5524/100044

Open-Pipelines

Open-Workflows

DOI:10.5524/100038

Open-Data

78GB CC0 data

Code in SourceForge under GPLv3: http://soapdenovo2.sourceforge.net/

~21,000 downloads

Enabled code to be picked apart by bloggers in a wiki: http://homolog.us/wiki/index.php?title=SOAPdenovo2

Page 21: Nicole Nogoy at the Auckland BMC RoadShow

New & more transparent peer-review:

The GigaScience way:

8 referees downloaded & tested data, then signed reports

Page 22: Nicole Nogoy at the Auckland BMC RoadShow

New & more transparent peer-review:

The GigaScience way:

Real-time open-review = paper in arXiv + blogged reviews

Page 23: Nicole Nogoy at the Auckland BMC RoadShow

Rewarding and aiding reproducibility

OMERO: providing access to imaging data…

Page 24: Nicole Nogoy at the Auckland BMC RoadShow

Changing the way we publish:

Page 25: Nicole Nogoy at the Auckland BMC RoadShow

“Deconstructed” Journal

“Regular” Journal

“Conscientious” Online Journal

Page 26: Nicole Nogoy at the Auckland BMC RoadShow

Ruibang Luo (BGI/HKU)
Shaoguang Liang (BGI-SZ)
Tin-Lap Lee (CUHK)
Qiong Luo (HKUST)
Senghong Wang (HKUST)
Yan Zhou (HKUST)

Thanks to:

@gigascience

facebook.com/GigaScience

blogs.biomedcentral.com/gigablog/

Peter Li
Chris Hunter
Jesse Si Zhe
Scott Edmunds
Laurie Goodman
Rob Davidson
Amye Kenall (BMC)

Marco Roos (LUMC)
Mark Thompson (LUMC)
Jun Zhao (Lancaster)
Susanna Sansone (Oxford)
Philippe Rocca-Serra (Oxford)
Alejandra Gonzalez-Beltran (Oxford)

www.gigadb.org

galaxy.cbiit.cuhk.edu.hk

www.gigasciencejournal.com

CBIIT

Funding from:

Our collaborators:

team:

Case study: