New data and tools at TAIR (The Arabidopsis Information Resource)

Post on 28-Dec-2015


Overview of TAIR

• Genome release: RNA-seq, proteomic data, corrections

• Gene function: published papers, journal collaborations, direct submission

• Other data: markers, ecotypes, gene symbols, new genomes

• New tools

• Researchers: directly (TAIR pages) and via other databases

TAIR10 Genome Release

• No assembly updates

• Will incorporate:
  – 200M Ecker and Mockler RNA-seq reads
  – Additional proteomics data
  – Individual gene structure corrections sent to us

Mapping and Assembly

1. Mapping
• RNA-seq sequences (Tophat (C. Trapnell), Supersplat (T.C. Mockler))
• Peptides (6-frame translation, spliced exon graph)

2. Assembly approaches
• Augustus (M. Stanke)
  o Uses spliced RNA-seq reads, peptides
  o Aim: Identify additional splice-variants, update existing genes
• TAU (T.C. Mockler)
  o Uses spliced RNA-seq reads
  o Aim: Identify additional splice-variants
• Cufflinks (C. Trapnell)
  o Uses spliced and unspliced RNA-seq data
  o Aim: Identify novel genes
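
The peptide mapping step mentions 6-frame translation. As a rough illustration of that idea only (not TAIR's actual pipeline), the sketch below translates a DNA string in all three forward and three reverse-complement reading frames using the standard genetic code. The function names and the frame labels ("+1" to "-3") are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(dna):
    """Map hypothetical frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for label, protein in frames.items():
    print(label, protein)
```

A peptide identified by mass spectrometry could then be searched for as a substring of each translated frame; matching peptides that span introns needs the spliced exon graph, which this sketch does not attempt.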

Preliminary Results

Augustus/TAU/Cufflinks predicted models are classified into categories:

Novel genes 21

Updated genes 812

Splice-variants 2134

B-list 1586

Rejects 2318
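
To put the category counts above in proportion, the short sketch below computes each category's share of all classified models. The counts are copied from the slide; the variable and function names are illustrative only.

```python
# Counts of Augustus/TAU/Cufflinks predicted models per category (from the slide).
CATEGORY_COUNTS = {
    "Novel genes": 21,
    "Updated genes": 812,
    "Splice-variants": 2134,
    "B-list": 1586,
    "Rejects": 2318,
}


def category_shares(counts):
    """Return each category's fraction of the total number of models."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


total_models = sum(CATEGORY_COUNTS.values())
print(f"Total classified models: {total_models}")
for name, share in category_shares(CATEGORY_COUNTS).items():
    print(f"{name:<16}{CATEGORY_COUNTS[name]:>6}{share:>8.1%}")
```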

TAIR10 Genome Release


• Release expected in August 2010

Experimentally Verified Gene Function

Where does it come from?

• From research articles read by TAIR curators

• From TAIR’s collaboration with journals

• From direct submissions by researchers to TAIR

Literature Curation

• How?
  – Papers are prioritized according to novelty of gene function results
  – Highest priority papers are read and gene function is extracted

• Why?
  – A lot of high-quality experimental gene function information is only available in the form of articles

• How many?
  – About 1/3 of all new articles containing gene function data are curated at TAIR each year


Journal Collaboration

• How?
  – Author instructions, Excel sheet or online form

• Why?
  – To capture a larger fraction of gene function data
  – Because publication is the right time to get the data into TAIR

• What journals?
  – Plant Physiology (2008)
  – The Plant Journal (2009)
  – 2010: Journal of Integrative Plant Biology; Journal of Experimental Botany; Plant Science; Environmental Botany; Plant Physiology and Biochemistry; Plant, Cell and Environment

Direct Submission of Gene Function

• How?
  – Excel sheet or online form

• Why?
  – To capture more data with a small curation team
  – Because researchers are the experts on the genes they study

New online submission form


Why Gene Ontology?

• Standardization allows comparison across experiments and species

• Hierarchical structure allows high-level categorization

• Well-structured ontology framework facilitates computational analysis

• Annotations are attached to their data source (peer-reviewed published research)

• Experimental evidence can be distinguished from predictions

Example Gene Ontology annotations

Gene   GO term                             Evidence          Reference
Phot1  Phototropism                        Mutant phenotype  Huala et al 1997
Phot1  Cytoplasm                           Direct assay      Sakamoto et al 2002
Phot1  Serine / threonine kinase activity  Direct assay      Christie et al 1998

3 GO flavors, one per row above: biological process, cellular component, molecular function
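
The example annotations can also be held as simple records, which makes it easy to separate experimental evidence from predictions. The sketch below is illustrative only: the record layout is hypothetical, and mapping the slide's "Mutant phenotype" and "Direct assay" to the GO evidence codes IMP and IDA is an assumption based on the standard GO code definitions, as is the aspect assigned to each term.

```python
from collections import defaultdict
from dataclasses import dataclass

# GO evidence codes that denote experimental support.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


@dataclass(frozen=True)
class Annotation:
    gene: str
    term: str
    aspect: str      # one of the three GO flavors
    evidence: str    # GO evidence code (assumed mapping from the slide text)
    reference: str


# The three Phot1 rows from the slide; codes and aspects are assumptions.
ANNOTATIONS = [
    Annotation("Phot1", "Phototropism", "biological process",
               "IMP", "Huala et al 1997"),
    Annotation("Phot1", "Cytoplasm", "cellular component",
               "IDA", "Sakamoto et al 2002"),
    Annotation("Phot1", "Serine/threonine kinase activity",
               "molecular function", "IDA", "Christie et al 1998"),
]


def experimental_genes_by_aspect(annotations):
    """Group genes with experimentally supported annotations by GO aspect."""
    genes = defaultdict(set)
    for ann in annotations:
        if ann.evidence in EXPERIMENTAL_CODES:
            genes[ann.aspect].add(ann.gene)
    return dict(genes)


for aspect, genes in experimental_genes_by_aspect(ANNOTATIONS).items():
    print(aspect, sorted(genes))
```

Counting the distinct genes across the aspects of such a grouping is one way a figure like "genes with experimental evidence" could be derived, though how TAIR computes its own counts is not stated in the slides.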

New online submission form

Autocomplete (just start typing to get a list of matching terms)


What is the result of TAIR’s effort to capture gene function?

• How many genes have experimental gene function in TAIR?

[Figure: Number of genes in TAIR with experimental evidence for biological process, molecular function or cellular component. 9342 genes (May 31 2010)]

Arabidopsis Gene Function in TAIR

[Figure: Genes by year. Series: protein coding genes, predicted function, experimental function]

Experimental GO Annotations

[Figure: Number of gene products (axis 0 to 8000) by organism (Arabidopsis, yeast, worm, fly, zebrafish, mouse, rat). Series: Biological Process, Cellular Component, Molecular Function]


GBrowse_syn

Tool by Sheldon McKay, CSHL
Alignment data from Pedro Pattyn, Van de Peer lab, U. of Ghent

Species shown: A. lyrata, A. thaliana, poplar

NBrowse

Tool by H.-L. Kao, F. Piano, M. Schuman, M. Gibson, Kris Gunsalus, NYU
Interaction datasets curated by TAIR, BioGRID and IntAct

Arabidopsis lyrata

• Genes have been loaded

• Working on adding some gene function information and improving searching


Central registry for Gene Symbols


Helpdesk


RSS news feed


TAIR Facebook Page

TAIR Twitter Feed

TAIR Staff

Gene Function/GO: Tanya Berardini, Donghui Li

Genome Annotation: David Swarbreck, Philippe Lamesch, Rajkumar Sasidharan

Tech Team: Bob Muller, Larry Ploetz, Chris Wilks (50%), Cynthia Lee, Shanker Singh
