Protein Modularity and Evolution: An examination of organism complexity via protein domain structure Presented by Jennelle Heyer and Jonathan Ebbers December 7, 2004

Jan 19, 2016

Page 1

Protein Modularity and Evolution:

An examination of organism complexity via protein domain structure

Presented by Jennelle Heyer and Jonathan Ebbers

December 7, 2004

Page 2

Presentation Outline

• Background Material - Protein Evolution, Theory of Domains, Gene Number

• Hypothesis - Using a model protein family

• Procedure/Methods - DPIP Program, Phylogenetic Analysis

• Results

• Discussion/Conclusions

Page 3

Theories of Protein Evolution

A long time ago, in the primordial soup of life, small polypeptides began to form…

HDLC or TCP or….

HDLC + TCP = HCLCTCP

HCI*CTCP + TCP…

Functional proteins

HDLC or TCP or….

HDLC + TCP = HCLCTCP

HCI*CTCP + QZX…

Functional proteins

Page 4

Concept of Modularity

• Proteins consist of one or more domains that were pieced together over time

• Domains are the building blocks of proteins

– Defined as “spatially distinct structures that could conceivably fold and function in isolation” (Ponting and Russell, 2002)

– Dictate the function of the protein

– Evolutionary pressure to conserve (sequence and/or structure)

Page 5

Organismal Complexity

• The nematode, C. elegans, has 19,500 genes in its genome

• Humans have between 20,000 and 25,000 genes in their genome

• HOW CAN THAT BE?

• Alternative splicing, multi-functional/network proteins

Page 6

Hypothesis

• Gene products (proteins) can become multi-functional through the introduction of domains

• “…evolution does not produce innovation from scratch. It works on what already exists, either transforming a system to give it a new function or combining several systems to produce a more complex one” (Jacob, 1977)

• More complex or phylogenetically derived organisms produce proteins with greater domain complexity

Page 7

Hypothesis Part II

• Create a protein domain “tool”

– Position

– Partner domain

– General organization

– Protein evolution

– Using a variety of sequenced genomes

• Allow investigators to learn about a domain of interest and apply it to their research

Page 8

Kinesins: A model protein family

• Motor proteins found in eukaryotic organisms

• Contain a conserved motor domain

• Bind and walk along microtubules

• Can carry a variety of “cargo”

• May contain multiple domains

http://www.mb.tn.tudelft.nl/projects/

Page 9

Kinesins: A model protein family

• Arabidopsis thaliana, a model plant species, contains 61 kinesins

• S. pombe – 10, C. elegans – 22, Drosophila – 25, human and mouse ~ 45

From Reddy and Day, 2001

Page 10

Programming Approach

• Two programs used, BLAST and InterProScan, held together with Perl scripts

• Give a domain sequence to PSI-BLAST, which identifies proteins that have that domain

• One by one, give those protein sequences to InterProScan (IPR), which identifies the domains in each protein

• Create a listing of proteins and map the data onto a phylogeny

• Create a tree based on the phylogeny and domains
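
The steps above can be sketched as a small pipeline. This is only an illustration in Python, not the presenters' Perl code: `search_proteins` and `annotate_domains` are hypothetical stand-ins for the PSI-BLAST and InterProScan calls and simply read from toy in-memory dictionaries. All protein IDs, organisms, and domain assignments are invented placeholders.

```python
from collections import defaultdict

# Toy stand-in for a PSI-BLAST search: query domain -> matching protein IDs.
# All IDs and organisms below are made-up placeholders, not real data.
TOY_BLAST_HITS = {
    "motor_domain": ["protA", "protB", "protC"],
}

# Toy stand-in for InterProScan output: protein ID -> (organism, domains).
TOY_ANNOTATIONS = {
    "protA": ("organism_1", ["motor"]),
    "protB": ("organism_1", ["motor", "partner_x"]),
    "protC": ("organism_2", ["partner_y", "motor"]),
}


def search_proteins(domain_name):
    """Hypothetical wrapper: return proteins that contain the query domain."""
    return TOY_BLAST_HITS.get(domain_name, [])


def annotate_domains(protein_id):
    """Hypothetical wrapper: return (organism, ordered domain list)."""
    return TOY_ANNOTATIONS[protein_id]


def build_listing(domain_name):
    """Search for the domain, annotate each hit, and group by organism."""
    listing = defaultdict(dict)
    for protein_id in search_proteins(domain_name):
        organism, domains = annotate_domains(protein_id)
        listing[organism][protein_id] = domains
    return dict(listing)


if __name__ == "__main__":
    for organism, proteins in build_listing("motor_domain").items():
        for protein_id, domains in proteins.items():
            print(organism, protein_id, "-".join(domains))
```

Grouping by organism at this stage keeps the later step, mapping domain data onto a tree of organisms, a simple lookup.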

Page 11

Program Flowchart

Domain sequence → [BLAST] → List of proteins with similar domains → [InterProScan] → List of domains in every protein → [Maketree] → Tree (includes domains)

Page 12

Program Details

• Database selection:

– BLAST: RefSeq over nr

– InterProScan: SMART database only

• Threshold values:

– BLAST: Option to change, improve resolution

– InterProScan: E-value at 0.99, up from 0.01

• Used Arabidopsis sequences as a control

• Name: DPIP (Domain Placement in Proteins)
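
To see what raising the E-value cutoff does, here is a minimal sketch, assuming each domain hit is a (domain name, E-value) pair. The hits and the helper `filter_hits` are hypothetical and do not reproduce InterProScan's actual output format or options.

```python
def filter_hits(hits, max_evalue):
    """Keep domain hits whose E-value is at or below the cutoff."""
    return [name for name, evalue in hits if evalue <= max_evalue]


# Made-up hits for one protein: (domain name, E-value).
hits = [("motor", 1e-80), ("partner_x", 0.005), ("partner_y", 0.4)]

strict = filter_hits(hits, 0.01)  # the earlier, stricter cutoff
loose = filter_hits(hits, 0.99)   # the looser cutoff named on the slide

print(strict)
print(loose)
```

A looser cutoff also reports weaker matches, trading more false positives for fewer missed domains.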

Page 13

Results

• A Quick Look at the Data

• Phylogenetic Approach – Hypothesis I

• Qualitative Approach – Hypothesis II

Page 14

                               

               

A Quick Look

Page 15

Phylogenetic Approach

• “More complex or phylogenetically derived organisms produce proteins with greater domain complexity”

• Trace domain characteristics on a preset tree

– Use MacClade tree drawing software

– Uses input data to create most parsimonious trace

• Characteristics:

– Maximum # domains

– Unique domains
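
The two characteristics can be computed from a per-organism domain listing. The sketch below is a hedged illustration: the listing is made-up data, the function names are hypothetical, and it assumes repeated copies of a domain within one protein each count toward the maximum, which the slides do not specify.

```python
def max_domains_per_protein(proteins):
    """Largest number of domains found in any single protein."""
    return max((len(domains) for domains in proteins.values()), default=0)


def unique_domains(proteins):
    """Set of distinct domain names across all proteins of one organism."""
    return {d for domains in proteins.values() for d in domains}


# Made-up listing: organism -> protein ID -> ordered domain list.
listing = {
    "organism_1": {"protA": ["motor"], "protB": ["motor", "partner_x"]},
    "organism_2": {"protC": ["partner_y", "motor", "partner_x"]},
}

for organism, proteins in listing.items():
    print(organism,
          max_domains_per_protein(proteins),
          len(unique_domains(proteins)))
```

Each organism then carries two integer character states, which is the form of input a parsimony trace over a fixed tree needs.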

Page 16

Maximum # of Domains per Protein

Green = 1, Black = 3

Page 17: Presented by Jennelle Heyer and Jonathan Ebbers December 7, 2004

Number of Unique Domains per Organism

Blue = 1, Pink = 2, Dk. Blue = 3, Yellow = 5, Black = 6, Dash = ???

Page 18

Phylogenetic Conclusions

• Inconclusive, or null hypothesis supported

• Possible explanations:

– Kinesins may have limited domain complexity due to function or folding

– Inherent bias in DPIP (RefSeq database)

• Future Work:

– Testing other domains through same process

– Updating database

– Include measure for position (N/I/C)
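
One way the proposed position measure might look, assuming N/I/C stands for N-terminal, internal, and C-terminal and that a domain's position is judged by where its midpoint falls along the protein. The thirds-based boundaries and the function name `classify_position` are assumptions made for illustration; they are not part of DPIP.

```python
def classify_position(start, end, protein_length):
    """Label a domain N, I, or C by where its midpoint falls in the protein.

    Coordinates are 1-based residue positions. Splitting the protein into
    thirds is an arbitrary choice made for this sketch.
    """
    if not (1 <= start <= end <= protein_length):
        raise ValueError("domain coordinates must lie within the protein")
    midpoint = (start + end) / 2
    fraction = midpoint / protein_length
    if fraction <= 1 / 3:
        return "N"
    if fraction <= 2 / 3:
        return "I"
    return "C"


# Made-up coordinates for a 900-residue protein.
print(classify_position(10, 330, 900))   # midpoint 170 falls in the first third
print(classify_position(400, 500, 900))  # midpoint 450 falls in the middle third
print(classify_position(700, 890, 900))  # midpoint 795 falls in the last third
```

A single letter per protein could then be traced on the tree in the same way as the domain counts.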

Page 19

Qualitative Approach

• Create a protein domain “tool”

– Position

– Partner domain

– General organization

– Protein evolution

– Using a variety of sequenced genomes

• Compile data into a more informative table

Page 20

- Can I trace domain or protein evolution??

Page 21

Presence of FHA/PH domain in kinesins

Yellow – Absent, Blue – Present

Page 22

Conclusions

• DPIP program was created to answer two questions:

– Does organismal complexity correspond with protein complexity?

– Can we create a tool for researchers to better understand domains in protein families?

• For kinesin motor domains: No and Yes

• For other domains: ????

Thanks to Webb Miller, Richard Cyr, Claude dePamphilis, Alexander Richter, and the Plant Physiology, Biology, and Bioinformatics Depts.