Modeling Functional Genomics Datasets CVM8890-101 Lesson 3 13 June 2007 Fiona McCarthy

Dec 16, 2015


Richard Rose

Modeling Functional Genomics Datasets

CVM8890-101

Lesson 3

13 June 2007 Fiona McCarthy


Lesson 3: Tools for functional annotation. Accessing functional data; computational strategies to obtain more complete functional annotation; the AgBase GO annotation pipeline.


Lesson 3 Outline

1. Review: Functional Annotation

2. Tools for functional annotation
– Accessing functional data
– Computational strategies to obtain more functional data

3. Example: The AgBase GO annotation pipeline

4. Other GO annotation tools


Review: Functional Annotation

• Biologists refer to both the annotation of the genome and the functional annotation of gene products: “structural” AND “functional” annotation

• Functional annotation is required to make biological sense of high-throughput datasets, e.g. genomics, arrays, proteomics

• COGs, KOGs, GO


Tools for Functional Annotation
• Need to be able to access functional annotation for your dataset
– Breadth and depth
– Date updated
– No annotation vs function unknown
• Need to be able to add more annotation
• Need to be able to use the annotations to model your data
– Depth or detail
– Compatibility with other programs (e.g. pathway analysis)
– Comparative data?
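The distinction between "no annotation" and "function unknown" can be made concrete with a small sketch. It assumes the GO convention that a gene product which has been reviewed without finding any functional information is annotated directly to one of the three ontology root terms (usually with the ND evidence code), whereas an unannotated gene product simply has no record. The function names and the gene-to-GO-IDs mapping are hypothetical, chosen only for illustration.

```python
# Root terms of the three GO ontologies (molecular_function,
# biological_process, cellular_component). An annotation to a root term
# alone is treated here as "function unknown".
GO_ROOTS = {"GO:0003674", "GO:0008150", "GO:0005575"}


def classify(gene_ids, annotations):
    """Split a dataset into annotated / function-unknown / no-annotation.

    `annotations` is a hypothetical mapping of gene ID -> set of GO IDs.
    """
    result = {"annotated": [], "function_unknown": [], "no_annotation": []}
    for gene in gene_ids:
        terms = annotations.get(gene, set())
        if not terms:
            result["no_annotation"].append(gene)
        elif terms <= GO_ROOTS:
            # Every annotation points at a root term: reviewed, nothing known.
            result["function_unknown"].append(gene)
        else:
            result["annotated"].append(gene)
    return result


def breadth(gene_ids, annotations):
    """Fraction of the dataset with at least one annotation of any kind."""
    genes = list(gene_ids)
    if not genes:
        return 0.0
    covered = sum(1 for g in genes if annotations.get(g))
    return covered / len(genes)
```

Note that `breadth` here counts root-only annotations as coverage; whether that is appropriate depends on the question being asked of the dataset.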


Tools for Functional Annotation

• Clusters of Orthologous Groups (COGs)

• euKaryotic Orthologous Groups (KOGs)

• UniProt Knowledgebase (UniProtKB)

• Bioinformatic Harvester

• FANTOM

• PUMA

• Gene Ontology (GO)


COGs & KOGs
• Accessible at http://www.ncbi.nlm.nih.gov/COG/
• ftp download
• Available for many prokaryotes and 7 eukaryotes
• Add more annotation using the KOGinator?
• Modeling:
– Has breadth but not always depth
– Good for prokaryote comparative analysis?

COGs & KOGs
http://www.ncbi.nlm.nih.gov/COG/

Automated tools for large numbers of comparisons?


UniProtKB
• Accessible at http://www.pir.uniprot.org/
• ftp download & sophisticated search & download capabilities
• Available for > 132,000 species
• Annotation across both literature (for selected species) and biological databases
• Modeling:
– Has breadth but not always depth; many proteins not represented in UniProtKB
– Those that are represented have a detailed summary of function from a range of sources
– Rapid help and feedback from the database help

UniProtKB
http://www.pir.uniprot.org/


Bioinformatic Harvester
• Accessible at http://harvester.fzk.de/harvester/
• no download
• Available for 6 model species
• Integrates data from multiple sources
• Modeling:
– Has breadth and depth; not useful for large datasets
– Updates?

Bioinformatic Harvester
http://harvester.fzk.de/harvester/

FANTOM
http://www.gsc.riken.go.jp/e/FANTOM/

Mouse only

PUMA
http://compbio.mcs.anl.gov/puma2/


Gene Ontology
• Accessible at http://www.geneontology.org/
• updated downloads for 34 species + downloads for UniProtKB species (>130,000)
• UniProtKB species annotation: some depth, less breadth
• GO data mapped from other databases
• Modeling:
– Many tools available for modeling using the GO
– Can use computational or manual curation to add annotations

Gene Ontology
http://www.geneontology.org/


Accessing GO Data

EBI-GOA Project
http://www.ebi.ac.uk/GOA/
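GO annotation downloads are distributed as tab-delimited gene association files, so a first step in accessing the data is often a small reader. The sketch below is a minimal one, assuming the standard layout in which comment lines start with "!" and the first 15 columns carry the fields listed; the column labels and the `read_gaf` name are my own, and the sample record is made up purely to show the call.

```python
import io

# Labels for the first 15 tab-delimited columns of a gene association file.
# The labels themselves are illustrative; only the column order matters.
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "db_reference", "evidence_code", "with_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type",
    "taxon", "date", "assigned_by",
]


def read_gaf(handle):
    """Yield one dict per annotation line, skipping '!' comment lines."""
    for line in handle:
        if line.startswith("!") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < len(GAF_COLUMNS):
            continue  # malformed line; a stricter reader might raise here
        # zip() ignores any columns beyond the first 15.
        yield dict(zip(GAF_COLUMNS, fields))


# Made-up record for illustration only (not real annotation data).
example = "!comment line\n" + "\t".join([
    "ExampleDB", "X00001", "geneX", "", "GO:0008150", "REF:1", "ND", "",
    "P", "example gene product", "", "protein", "taxon:9031", "20070613",
    "ExampleDB",
]) + "\n"
records = list(read_gaf(io.StringIO(example)))
```

Reading from a file handle rather than a path keeps the function usable with downloaded files, decompressed streams, or in-memory test data alike.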

The AgBase GO Annotation Pipeline
• Accessible at http://www.agbase.msstate.edu/

• Access available annotations for agriculturally important species

• Provide your own GO annotations

• Model GO for your dataset


Coming soon: GOModeler, quantitative hypothesis-driven modeling using GO


Other GO Annotation Tools
http://www.geneontology.org/GO.tools.shtml

Other GO Annotation Tools
Evaluate:
• Can I run it from my computer?
• Does it include my species of interest?
• When was it last updated?
• Does it display evidence codes?
• Does it display IEA annotations?
• What are the inputs it accepts?
• Does it do batch searches?
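The evidence-code questions in the checklist above matter because IEA (inferred from electronic annotation) annotations are not individually reviewed by a curator, so it helps to know what share of a dataset's annotations they make up. The following is a minimal sketch, assuming annotations are already available as (gene ID, GO ID, evidence code) tuples; that tuple layout and both function names are hypothetical.

```python
from collections import Counter


def evidence_summary(annotations):
    """Count annotations per evidence code and report the IEA share.

    `annotations` is a list of (gene_id, go_id, evidence_code) tuples,
    a hypothetical in-memory form of whatever a tool exports.
    """
    counts = Counter(code for _, _, code in annotations)
    total = sum(counts.values())
    iea_fraction = counts["IEA"] / total if total else 0.0
    return counts, iea_fraction


def drop_iea(annotations):
    """Keep only annotations whose evidence code is not IEA."""
    return [a for a in annotations if a[2] != "IEA"]
```

Running an analysis once with and once without IEA annotations is one simple way to see how much a result depends on electronically inferred data.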

Using GO to Analyze Array Data
Evaluate:
• Does it include my species of interest?
• When were the annotations last updated?
• Can I add my own annotations?
• Does it tell me how many of my genes are used for the analysis?
• Does it account for “not” annotations?
• Does it display IEA annotations?
• What are the input IDs it accepts?
• Does it analyze both over- & under-represented terms?
• What statistics does it use for the analysis?
• Does it do a graphical representation?

ANY tool will only be as good as the annotations.
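
The questions about over- and under-represented terms and about which statistics a tool uses can be illustrated with one common choice, the hypergeometric test (equivalent to a one-sided Fisher's exact test). The slides do not name a statistic, so this is a sketch of one option rather than what any particular tool does; the function and parameter names are my own. It also shows why the number of genes actually used matters: only genes with annotation enter N and n.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(exactly k term-annotated genes in a list of n), drawing without
    replacement from N background genes of which K carry the term."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_p_values(k, N, K, n):
    """Return (p_over, p_under) for observing k term-annotated genes.

    N: annotated genes on the array (the background)
    K: background genes annotated to the term
    n: annotated genes in the list of interest
    k: genes in that list annotated to the term
    """
    lo = max(0, n - (N - K))  # smallest possible overlap
    hi = min(n, K)            # largest possible overlap
    # Over-representation: probability of k or more annotated genes.
    p_over = sum(hypergeom_pmf(i, N, K, n) for i in range(k, hi + 1))
    # Under-representation: probability of k or fewer annotated genes.
    p_under = sum(hypergeom_pmf(i, N, K, n) for i in range(lo, k + 1))
    return p_over, p_under
```

When many GO terms are tested at once, the resulting p-values would normally also need a multiple-testing correction, which this sketch leaves out.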