New data and tools at TAIR (The Arabidopsis Information Resource)
Dec 28, 2015

Robyn Blair
Transcript
Page 1: New data and tools at TAIR (The Arabidopsis Information Resource)

New data and tools at TAIR

(The Arabidopsis Information Resource)

Page 2: New data and tools at TAIR (The Arabidopsis Information Resource)

Overview of TAIR

• Genome release: RNA-seq, proteomic data, corrections

• Gene function: published papers, journal collaborations, direct submission

• Other data: markers, ecotypes, gene symbols, new genomes

• New tools

• Researchers: directly (TAIR pages) AND via other databases

Page 3: New data and tools at TAIR (The Arabidopsis Information Resource)

TAIR10 Genome Release

• No assembly updates

• Will incorporate:
  – 200M Ecker and Mockler RNA-seq reads
  – Additional proteomics data
  – Individual gene structure corrections sent to us

Page 4: New data and tools at TAIR (The Arabidopsis Information Resource)

Mapping and Assembly

1. Mapping
• RNA-seq sequences (TopHat (C. Trapnell), Supersplat (T.C. Mockler))
• Peptides (6-frame translation, spliced exon graph)

2. Assembly approaches
• Augustus (M. Stanke)
  o Uses spliced RNA-seq reads, peptides
  o Aim: Identify additional splice-variants, update existing genes
• TAU (T.C. Mockler)
  o Uses spliced RNA-seq reads
  o Aim: Identify additional splice-variants
• Cufflinks (C. Trapnell)
  o Uses spliced and unspliced RNA-seq data
  o Aim: Identify novel genes
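
The 6-frame translation mentioned for peptide mapping can be sketched as follows. This is a minimal illustration, not TAIR's pipeline: it assumes the standard genetic code, plain string matching of a peptide against each frame, and hypothetical function names (`six_frame_translation`, `find_peptide`). It does not handle spliced peptides, which is what the spliced exon graph is for.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate both strands at offsets 0, 1 and 2 (frames +1..+3 and -1..-3)."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


def find_peptide(seq, peptide):
    """List the frames whose translation contains the peptide (hypothetical helper)."""
    return [
        frame
        for frame, protein in six_frame_translation(seq).items()
        if peptide in protein
    ]
```

For example, `find_peptide("ATGGCCTAA", "MA")` reports frame `+1`, because only the forward strand read from the first base translates to a protein containing that peptide.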

Page 5: New data and tools at TAIR (The Arabidopsis Information Resource)

Preliminary Results

Augustus/TAU/Cufflinks predicted models are classified into categories:

Category          Models
Novel genes       21
Updated genes     812
Splice-variants   2134
B-list            1586
Rejects           2318
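
To put the category counts in proportion, the short sketch below sums them and computes each category's share. It assumes the five categories are mutually exclusive, which the slide does not state. The names `CATEGORY_COUNTS` and `category_shares` are illustrative only.

```python
# Counts copied from the slide; treating them as non-overlapping is an assumption.
CATEGORY_COUNTS = {
    "Novel genes": 21,
    "Updated genes": 812,
    "Splice-variants": 2134,
    "B-list": 1586,
    "Rejects": 2318,
}


def category_shares(counts):
    """Return each category's percentage of all classified models, rounded to 0.1."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


TOTAL_MODELS = sum(CATEGORY_COUNTS.values())
SHARES = category_shares(CATEGORY_COUNTS)
```

Under that assumption the predictions total 6871 models, of which novel genes are a very small fraction.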

Page 6: New data and tools at TAIR (The Arabidopsis Information Resource)

TAIR10 Genome Release

• No assembly updates

• Will incorporate:
  – 200M Ecker and Mockler RNA-seq reads
  – Additional proteomics data
  – Individual gene structure corrections sent to us

• Release expected in August 2010

Page 7: New data and tools at TAIR (The Arabidopsis Information Resource)

Experimentally Verified Gene Function

Where does it come from?

• From research articles read by TAIR curators

• From TAIR’s collaboration with journals

• From direct submissions by researchers to TAIR

Page 8: New data and tools at TAIR (The Arabidopsis Information Resource)

Literature Curation (gene function from published papers)

• How?
  – Papers are prioritized according to novelty of gene function results
  – Highest priority papers are read and gene function is extracted

• Why?
  – A lot of high quality experimental gene function information is only available in the form of articles

• How many?
  – About 1/3 of all new articles containing gene function data are curated at TAIR each year

Page 9: New data and tools at TAIR (The Arabidopsis Information Resource)

Journal Collaboration (gene function from journal collaborations)

• How?
  – Author instructions, Excel sheet or online form

• Why?
  – To capture a larger fraction of gene function data
  – Because publication is the right time to get the data into TAIR

• What journals?

Page 10: New data and tools at TAIR (The Arabidopsis Information Resource)

Journal Collaboration

Page 11: New data and tools at TAIR (The Arabidopsis Information Resource)

Journal Collaboration (gene function from journal collaborations)

• How?
  – Author instructions, Excel sheet or online form

• Why?
  – To capture a larger fraction of gene function data
  – Because publication is the right time to get the data into TAIR

• What journals?
  – Plant Physiology (2008)
  – The Plant Journal (2009)
  – 2010: Journal of Integrative Plant Biology; Journal of Experimental Botany; Plant Science; Environmental Botany; Plant Physiology and Biochemistry; Plant, Cell and Environment

Page 12: New data and tools at TAIR (The Arabidopsis Information Resource)

Direct Submission of Gene Function

• How?
  – Excel sheet or online form

• Why?
  – To capture more data with a small curation team
  – Because researchers are the experts on the genes they study

Page 13: New data and tools at TAIR (The Arabidopsis Information Resource)

New online submission form

17986450

Page 14: New data and tools at TAIR (The Arabidopsis Information Resource)
Page 15: New data and tools at TAIR (The Arabidopsis Information Resource)

Why Gene Ontology?

• Standardization allows comparison across experiments and species

• Hierarchical structure allows high-level categorization

• Well-structured ontology framework facilitates computational analysis

• Attached to data source (peer-reviewed published research)

• Experimental evidence can be distinguished from predictions
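
The point about hierarchical structure can be illustrated with a small roll-up: a specific term is mapped to a broad category by walking up its parent links. The sketch below uses a toy parent table that only loosely mirrors the real GO graph. The `PARENTS` mapping and the function names are assumptions made for illustration.

```python
# Toy parent table (child -> list of parents); NOT the real GO graph.
PARENTS = {
    "phototropism": ["tropism"],
    "tropism": ["response to external stimulus"],
    "response to external stimulus": ["response to stimulus"],
    "response to stimulus": ["biological process"],
    "biological process": [],
}


def ancestors(term, parents):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def roll_up(gene_terms, parents, high_level_terms):
    """Map each gene to the high-level categories its annotations fall under."""
    result = {}
    for gene, terms in gene_terms.items():
        categories = set()
        for term in terms:
            lineage = ancestors(term, parents) | {term}
            categories |= lineage & high_level_terms
        result[gene] = categories
    return result
```

With this table, a gene annotated to "phototropism" is counted under "response to stimulus" without anyone having annotated it to the broad term directly.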

Page 16: New data and tools at TAIR (The Arabidopsis Information Resource)

Example Gene Ontology annotations

Gene    GO term                              Evidence           Reference             GO flavor
Phot1   Phototropism                         Mutant phenotype   Huala et al 1997      Biological process
Phot1   Cytoplasm                            Direct assay       Sakamoto et al 2002   Cellular component
Phot1   Serine / threonine kinase activity   Direct assay       Christie et al 1998   Molecular function

3 GO flavors: biological process, cellular component, molecular function
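
One way to hold annotations like these in code is as small records, with the evidence type deciding whether an annotation counts as experimental. The sketch below is an illustration rather than TAIR's data model. It maps the slide's "Mutant phenotype" and "Direct assay" to the GO evidence codes IMP and IDA, and the class, field and function names are assumptions.

```python
from dataclasses import dataclass

# GO evidence codes that denote experimental support.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


@dataclass(frozen=True)
class Annotation:
    gene: str
    go_term: str
    aspect: str          # one of the three GO flavors
    evidence_code: str   # e.g. IMP = mutant phenotype, IDA = direct assay
    reference: str

    @property
    def is_experimental(self):
        return self.evidence_code in EXPERIMENTAL_CODES


# The three example rows from the slide.
ANNOTATIONS = [
    Annotation("Phot1", "Phototropism", "biological process", "IMP",
               "Huala et al 1997"),
    Annotation("Phot1", "Cytoplasm", "cellular component", "IDA",
               "Sakamoto et al 2002"),
    Annotation("Phot1", "Serine / threonine kinase activity",
               "molecular function", "IDA", "Christie et al 1998"),
]


def experimental_aspects(annotations, gene):
    """Return the sorted GO flavors for which a gene has experimental evidence."""
    return sorted(
        {a.aspect for a in annotations if a.gene == gene and a.is_experimental}
    )
```

A computationally predicted annotation would carry a code such as IEA (inferred from electronic annotation) and would be excluded by `is_experimental`. That separation is what lets experimental evidence be told apart from predictions.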

Page 17: New data and tools at TAIR (The Arabidopsis Information Resource)
Page 18: New data and tools at TAIR (The Arabidopsis Information Resource)

New online submission form

Autocomplete (just start typing to get a list of matching terms)
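
Autocomplete of this kind can be sketched as a case-insensitive prefix lookup over a sorted term list. How the TAIR form actually matches terms is not described in the slides, so the prefix-only behavior, the `TermIndex` class and its `suggest` method are all assumptions made for illustration.

```python
from bisect import bisect_left


class TermIndex:
    """Sorted, case-insensitive index of vocabulary terms for prefix lookup."""

    def __init__(self, terms):
        # Keep the lowercased key next to the original spelling.
        self._entries = sorted((term.lower(), term) for term in terms)
        self._keys = [key for key, _ in self._entries]

    def suggest(self, typed, limit=10):
        """Return up to `limit` terms that start with what the user has typed."""
        prefix = typed.strip().lower()
        if not prefix:
            return []
        # Binary search for the first key >= prefix, then scan while it matches.
        start = bisect_left(self._keys, prefix)
        matches = []
        for key, term in self._entries[start:]:
            if not key.startswith(prefix):
                break
            matches.append(term)
            if len(matches) == limit:
                break
        return matches
```

Because the keys are sorted, all terms sharing a prefix sit next to each other, so each keystroke costs one binary search plus a short scan.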

Page 19: New data and tools at TAIR (The Arabidopsis Information Resource)

New online submission form

Page 20: New data and tools at TAIR (The Arabidopsis Information Resource)

New online submission form

Page 21: New data and tools at TAIR (The Arabidopsis Information Resource)

What is the result of TAIR’s effort to capture gene function?

• How many genes have experimental gene function in TAIR?

Page 22: New data and tools at TAIR (The Arabidopsis Information Resource)

[Figure: Genes in TAIR with experimental evidence for biological process, molecular function or cellular component. Y-axis: number of genes. 9342 genes (May 31 2010).]

Page 23: New data and tools at TAIR (The Arabidopsis Information Resource)

Arabidopsis Gene Function in TAIR

[Figure: genes by year. Series: protein coding genes, predicted function, experimental function.]

Page 24: New data and tools at TAIR (The Arabidopsis Information Resource)

Experimental GO Annotations

[Figure: number of gene products (axis 0 to 8000) by organism (Arabidopsis, yeast, worm, fly, zebrafish, mouse, rat). Series: Biological Process, Cellular Component, Molecular Function.]

Page 25: New data and tools at TAIR (The Arabidopsis Information Resource)

Overview of TAIR

Page 26: New data and tools at TAIR (The Arabidopsis Information Resource)

GBrowse_syn

Tool by Sheldon McKay, CSHL
Alignment data from Pedro Pattyn, Van de Peer lab, U. of Ghent

Page 27: New data and tools at TAIR (The Arabidopsis Information Resource)

GBrowse_syn

A. lyrata

A. thaliana

poplar

Page 28: New data and tools at TAIR (The Arabidopsis Information Resource)

NBrowse

Tool by H.-L. Kao, F. Piano, M. Schuman, M. Gibson, Kris Gunsalus, NYU
Interaction datasets curated by TAIR, BioGRID and IntAct

Page 29: New data and tools at TAIR (The Arabidopsis Information Resource)

NBrowse

Page 30: New data and tools at TAIR (The Arabidopsis Information Resource)

NBrowse

Page 31: New data and tools at TAIR (The Arabidopsis Information Resource)
Page 32: New data and tools at TAIR (The Arabidopsis Information Resource)
Page 33: New data and tools at TAIR (The Arabidopsis Information Resource)

Arabidopsis lyrata

Genes have been loaded

Working on adding some gene function information and improving searching

Page 34: New data and tools at TAIR (The Arabidopsis Information Resource)

Overview of TAIR

Page 35: New data and tools at TAIR (The Arabidopsis Information Resource)

Central registry for Gene Symbols

Page 36: New data and tools at TAIR (The Arabidopsis Information Resource)

Central registry for Gene Symbols

Page 37: New data and tools at TAIR (The Arabidopsis Information Resource)

Central registry for Gene Symbols

Page 38: New data and tools at TAIR (The Arabidopsis Information Resource)

Central registry for Gene Symbols

Page 39: New data and tools at TAIR (The Arabidopsis Information Resource)

Helpdesk

Page 40: New data and tools at TAIR (The Arabidopsis Information Resource)

Helpdesk

Page 41: New data and tools at TAIR (The Arabidopsis Information Resource)

Helpdesk

Page 42: New data and tools at TAIR (The Arabidopsis Information Resource)

RSS news feed

Page 43: New data and tools at TAIR (The Arabidopsis Information Resource)

RSS news feed

Page 44: New data and tools at TAIR (The Arabidopsis Information Resource)

TAIR Facebook Page

Page 45: New data and tools at TAIR (The Arabidopsis Information Resource)

TAIR Twitter Feed

Page 46: New data and tools at TAIR (The Arabidopsis Information Resource)

TAIR Staff

Gene Function/GO: Tanya Berardini, Donghui Li

Genome Annotation: David Swarbreck, Philippe Lamesch, Rajkumar Sasidharan

Tech Team: Bob Muller, Larry Ploetz, Chris Wilks (50%), ?, Cynthia Lee, Shanker Singh

Page 47: New data and tools at TAIR (The Arabidopsis Information Resource)

TAIR Sponsors:

Funding Agencies:

Host Institution: Partner: