Presented by Jennelle Heyer and Jonathan Ebbers December 7, 2004
Post on 19-Jan-2016
Protein Modularity and Evolution:
An examination of organism complexity via protein domain structure
Presented by Jennelle Heyer and Jonathan Ebbers
December 7, 2004
Presentation Outline
• Background Material - Protein Evolution, Theory of Domains, Gene Number
• Hypothesis - Using a model protein family
• Procedure/Methods - DPIP Program, Phylogenetic Analysis
• Results
• Discussion/Conclusions
Theories of Protein Evolution
A long time ago, in the primordial soup of life, small polypeptides began to form…
HDLC or TCP or….
HDLC + TCP = HCLCTCP
HCI*CTCP + TCP…
Functional proteins
HDLC or TCP or….
HDLC + TCP = HCLCTCP
HCI*CTCP + QZX…
Functional proteins
Concept of Modularity
• Proteins consist of one or more domains that were pieced together over time
• Domains are the building blocks of proteins
  – Defined as “spatially distinct structures that could conceivably fold and function in isolation” (Ponting and Russell, 2002)
  – Dictate the function of the protein
  – Evolutionary pressure to conserve (sequence and/or structure)
Organismal Complexity
• The nematode, C. elegans, has 19,500 genes in its genome
• Humans have between 20,000 and 25,000 genes in their genome
• HOW CAN THAT BE?
• Alternative splicing, multi-functional/network proteins
Hypothesis
• Gene products (proteins) can become multi-functional through the introduction of domains
• “…evolution does not produce innovation from scratch. It works on what already exists, either transforming a system to give it a new function or combining several systems to produce a more complex one” (Jacob, 1946)
• More complex or phylogenetically derived organisms produce proteins with greater domain complexity
Hypothesis Part II
• Create a protein domain “tool”
  – Position
  – Partner domain
  – General organization
  – Protein evolution
  – Using a variety of sequenced genomes
• Allow investigators to learn about a domain of interest and apply it to their research
Kinesins: A model protein family
• Motor proteins found in eukaryotic organisms
• Contain a conserved motor domain
• Bind and walk along microtubules
• Can carry a variety of “cargo”
• May contain multiple domains
http://www.mb.tn.tudelft.nl/projects/
Kinesins: A model protein family
• Arabidopsis thaliana, a model plant species, contains 61 kinesins
• S. pombe – 10, C. elegans – 22, Drosophila – 25, human and mouse ~ 45
From Reddy and Day, 2001
Programming Approach
• Two programs used, BLAST and InterProScan, held together with Perl scripts
• Give a domain sequence to PSI-BLAST, which identifies proteins that contain that domain.
• One by one, give those protein sequences to InterProScan, which identifies the domains in each protein.
• Create a listing of proteins and map the data into a phylogeny.
• Create a tree based on the phylogeny and domains
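The steps above can be sketched as a small driver. This is only an illustration, not the DPIP source (which the slides describe as Perl scripts): the search and annotation steps are passed in as functions, and `toy_search`, `toy_annotate`, the protein IDs, and the domain assignments are all made-up placeholders standing in for PSI-BLAST and InterProScan. For brevity the annotation step is keyed by protein ID rather than by sequence.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical record: (protein_id, organism) for each search hit.
Hit = Tuple[str, str]


def run_pipeline(
    domain_sequence: str,
    find_proteins: Callable[[str], Iterable[Hit]],
    find_domains: Callable[[str], List[str]],
) -> Dict[str, Dict[str, List[str]]]:
    """Return {organism: {protein_id: [domains, N- to C-terminal]}}."""
    table: Dict[str, Dict[str, List[str]]] = defaultdict(dict)
    # Step 1: the search step returns proteins that contain the query domain.
    for protein_id, organism in find_proteins(domain_sequence):
        # Step 2: the annotation step lists every domain in that protein.
        table[organism][protein_id] = find_domains(protein_id)
    # Step 3: the per-organism listing is what would be mapped onto a tree.
    return dict(table)


# Toy stand-ins for the two external programs (assumed data, not real hits).
def toy_search(domain_sequence: str) -> List[Hit]:
    return [("kinA", "organism_1"), ("kinB", "organism_1"), ("kinC", "organism_2")]


TOY_DOMAINS = {
    "kinA": ["motor"],
    "kinB": ["motor", "FHA"],
    "kinC": ["motor", "PH"],
}


def toy_annotate(protein_id: str) -> List[str]:
    return TOY_DOMAINS[protein_id]


table = run_pipeline("QUERYDOMAINSEQ", toy_search, toy_annotate)
for organism, proteins in sorted(table.items()):
    for protein_id, domains in sorted(proteins.items()):
        print(organism, protein_id, "-".join(domains))
```

Passing the two steps in as callables keeps the sketch runnable without either program installed; a real version would replace them with wrappers around the actual tools.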
Program Flowchart
Domain sequence → [BLAST] → List of proteins with similar domains → [InterProScan] → List of domains in every protein → [Maketree] → Tree (includes domains)
Program Details
• Database selection:
  – BLAST: RefSeq over nr
  – InterProScan: SMART database, only
• Threshold values:
  – BLAST: Option to change, improve resolution
  – InterProScan: E-value at 0.99, up from 0.01
• Used Arabidopsis sequences as a control
• Name: DPIP (Domain Placement in Proteins)
Results
• A Quick Look at the Data
• Phylogenetic Approach– Hypothesis I
• Qualitative Approach– Hypothesis II
A Quick Look
Phylogenetic Approach
• “More complex or phylogenetically derived organisms produce proteins with greater domain complexity”
• Trace domain characteristics on a preset tree
  – Use MacClade tree drawing software
  – Uses input data to create most parsimonious trace
• Characteristics:
  – Maximum # domains
  – Unique domains
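The two characteristics can be computed per organism from a protein-to-domain listing. A minimal sketch follows, assuming "maximum # domains" counts every domain occurrence in a single protein and "unique domains" counts distinct domain names across all of an organism's proteins; the slides do not spell out either definition, and the data below are invented.

```python
from typing import Dict, List


def max_domains_per_protein(proteins: Dict[str, List[str]]) -> int:
    """Largest number of domain occurrences found in any one protein."""
    return max((len(domains) for domains in proteins.values()), default=0)


def unique_domain_count(proteins: Dict[str, List[str]]) -> int:
    """Number of distinct domain names across all proteins of one organism."""
    return len({name for domains in proteins.values() for name in domains})


# Invented example: three proteins from one hypothetical organism.
example = {
    "kinA": ["motor"],
    "kinB": ["motor", "FHA", "PH"],
    "kinC": ["motor", "PH"],
}
print(max_domains_per_protein(example))  # 3
print(unique_domain_count(example))      # 3
```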
Maximum # of Domains per Protein
Green = 1, Black = 3
Number of Unique Domains per Organism
Blue = 1, Pink = 2, Dk. Blue = 3, Yellow = 5, Black = 6, Dash - ???
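A most-parsimonious trace of a character like these can be illustrated with Fitch's algorithm for unordered characters. This is a stand-in for illustration only: the slides say MacClade produced the traces but not which character type or options were used. The tree shape, tip names, and states below are hypothetical.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a tip name or a pair of subtrees (strictly bifurcating).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch(tree: Tree, tip_states: Dict[str, int]) -> Tuple[Set[int], int]:
    """Bottom-up Fitch pass: return (candidate states at this node, changes)."""
    if isinstance(tree, str):
        # A tip has exactly its observed state and needs no changes.
        return {tip_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, tip_states)
    right_set, right_cost = fitch(right, tip_states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical four-tip tree with "maximum # domains" as the character.
tree = (("tipA", "tipB"), ("tipC", "tipD"))
states = {"tipA": 1, "tipB": 1, "tipC": 3, "tipD": 1}
root_states, changes = fitch(tree, states)
print(root_states, changes)  # {1} 1
```

Here a single change on the branch leading to `tipC` explains the data, so the root is reconstructed as state 1.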
Phylogenetic Conclusions
• Inconclusive or null hypothesis supported
• Possible explanations:
  – Kinesins may have limited domain complexity due to function or folding
  – Inherent bias in DPIP (RefSeq database)
• Future Work:
  – Testing other domains through same process
  – Updating database
  – Include measure for position (N/I/C)
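One possible form of the position measure is sketched below. It assumes N/I/C means N-terminal, internal, and C-terminal, and it assigns the label by which third of the protein the domain midpoint falls in. Both the rule and the cutoffs are assumptions for illustration; the slides do not define how position would be measured.

```python
def classify_position(start: int, end: int, protein_length: int) -> str:
    """Label a domain "N", "I", or "C" by where its midpoint falls.

    Coordinates are 1-based residue positions, inclusive. The thirds
    used as cutoffs are an arbitrary choice made for this sketch.
    """
    if not (1 <= start <= end <= protein_length):
        raise ValueError("domain coordinates must lie within the protein")
    fraction = ((start + end) / 2) / protein_length
    if fraction < 1 / 3:
        return "N"
    if fraction > 2 / 3:
        return "C"
    return "I"


# Invented coordinates in a hypothetical 1000-residue protein.
print(classify_position(10, 340, 1000))   # N
print(classify_position(400, 700, 1000))  # I
print(classify_position(700, 990, 1000))  # C
```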
Qualitative Approach
• Create a protein domain “tool”
  – Position
  – Partner domain
  – General organization
  – Protein evolution
  – Using a variety of sequenced genomes
• Compile data into a more informative table
  – Can I trace domain or protein evolution??
Presence of FHA/PH domain in kinesins
Yellow = Absent, Blue = Present
Conclusions
• DPIP program was created to answer two questions:
  – Does organismal complexity correspond with protein complexity?
  – Can we create a tool for researchers to better understand domains in protein families?
• For kinesin motor domains: No and Yes
• For other domains: ????
Thanks to Webb Miller, Richard Cyr, Claude DePamphillis, Alexander Richter, Plant Physiology, Biology, and Bioinformatics Depts.