1
Chapter 8 Proteomics
National Chi Nan University, Department of Computer Science and Information Engineering, 黃光璿, 2004/06/07
2
proteome: the sum total of an organism’s proteins
genome: the sum total of an organism’s genetic material
3
8.1 From Genomes to Proteomes
We want to know what proteins are present in cells, what those proteins do, and how they function.
However, it’s not easy.
4
Why?
1. The lifetimes of an mRNA and of the protein it codes for are very different.
2. Many proteins are extensively modified after translation.
3. Many proteins are not functionally relevant until they are assembled into larger complexes or delivered to an appropriate location.
5
4. Proteins require more careful handling than DNA.
  Function may change.
  Protein identification requires mass spectrometric analysis or specific antibodies.
  Obtaining large numbers of protein molecules requires chemical isolation from living cells.
6
8.2 Protein Classification
Based on protein function: six categories
Based on evolutionary history & structural similarity: about 1000 homologous families
7
8.2.1 Enzyme Nomenclature
Started in the 1950s
International Union of Biochemistry and Molecular Biology
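The IUBMB scheme labels each enzyme with a four-part EC number whose first field names the top-level class. The following is a minimal sketch of reading that first field, assuming the six classes in use when these slides were written (a seventh class, translocases, was added later and is omitted). The function name `ec_class` is hypothetical.

```python
# Top-level EC classes (the six classes of the original scheme).
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}


def ec_class(ec_number: str) -> str:
    """Return the top-level class name for an EC number such as 'EC 3.4.21.4'."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    first_field = text.split(".")[0]
    return EC_CLASSES[int(first_field)]


# Trypsin is EC 3.4.21.4, a hydrolase.
print(ec_class("EC 3.4.21.4"))
```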
8
8.2.2 Family and Superfamily
Modern-day proteins may be derived from ~ 1000 original proteins.
Levels: folds, superfamilies, families
Databases: SCOP, CATH, DALI
9
fold: the same major secondary structure & topological connections
superfamily: probable evolutionary relationships
family: clear evolutionary relationships
12
8.3 Experimental Techniques
2D Electrophoresis
Mass Spectrometry
13
2D Electrophoresis
http://tw.expasy.org/cgi-bin/map1
Maps shown: liver, kidney
16
Problems
  A cell may contain tens of thousands of proteins vs. the thousands of spots a gel can resolve.
  Underrepresentation of membrane-bound proteins.
  It is difficult to determine exactly which protein each spot represents.
17
8.3.2 Mass Spectrometry
2D electrophoresis followed by mass spectrometry, for identification
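One common way mass spectrometry supports identification is by comparing a measured peptide mass with masses computed from candidate sequences. The sketch below is an illustration under that assumption, not the method of the source text. It uses standard average residue masses, and the function names `peptide_mass` and `match_mass` are hypothetical.

```python
# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added back for the free N- and C-termini


def peptide_mass(sequence: str) -> float:
    """Average mass of an unmodified linear peptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def match_mass(observed: float, candidates: list, tolerance: float = 0.5) -> list:
    """Return candidate peptides whose computed mass lies within the tolerance."""
    return [p for p in candidates if abs(peptide_mass(p) - observed) <= tolerance]


print(round(peptide_mass("GG"), 3))
print(match_mass(132.1, ["GG", "AK", "GFK"]))
```

Post-translational modifications shift these masses, which is one reason real identification pipelines are more involved than this lookup.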
18
8.3.3 Protein Microarrays
Use antibodies as probes.
Problems
  Single proteins will interact with multiple probes.
  The binding kinetics of each probe are different.
  Proteins are sensitive to their environment.
19
8.4 Inhibitors and Drug Design
Development & testing of a new drug: ~15 years, US$ 700 million
Discovery
  target identification
  lead discovery & optimization
  toxicology
  pharmacokinetics
Testing
20
HIV protease has an active site; cuts a single, large polypeptide chain into
many proteins.
21
8.5 Ligand Screening
22
8.5.1 Ligand Docking
Determine how two molecules of known structure will interact.
Three issues:
  Evaluate the energy of a particular molecular conformation.
  Search for the conformation that minimizes the free energy.
23
Third issue: how to deal with flexibility in both the protein and the putative ligand.
  Lock-and-key approaches: rigid protein structure, flexible ligand structure
  Induced-fit docking: flexible in both protein and ligand
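The first two issues (scoring a pose, then searching for the lowest-scoring one) can be illustrated with a toy example. The sketch below treats both molecules as rigid point sets, scores a pose with a 12-6 Lennard-Jones potential as a stand-in for a real free-energy estimate, and searches translations on a grid. It ignores rotation and flexibility entirely. All names and parameters are hypothetical, and this is not how the docking programs in this chapter work internally.

```python
import itertools
import math


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones pair energy at distance r (arbitrary units)."""
    ratio = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio * ratio - ratio)


def pose_energy(protein, ligand, shift):
    """Sum pair energies between protein atoms and the translated ligand atoms."""
    total = 0.0
    for p in protein:
        for l in ligand:
            moved = (l[0] + shift[0], l[1] + shift[1], l[2] + shift[2])
            r = max(math.dist(p, moved), 1e-6)  # avoid division by zero
            total += lennard_jones(r)
    return total


def rigid_grid_search(protein, ligand, span=2.0, step=0.25):
    """Exhaustively try translations on a cubic grid; return (best_shift, energy)."""
    n = int(round(span / step))
    offsets = [i * step for i in range(-n, n + 1)]
    best_shift, best_energy = None, math.inf
    for shift in itertools.product(offsets, repeat=3):
        energy = pose_energy(protein, ligand, shift)
        if energy < best_energy:
            best_shift, best_energy = shift, energy
    return best_shift, best_energy


# One-atom "protein" and one-atom "ligand": the optimum sits near r = 2**(1/6).
shift, energy = rigid_grid_search([(0.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)])
print(shift, round(energy, 3))
```

An exhaustive grid is only feasible for toy cases. Adding rotations and ligand torsions enlarges the search space quickly, which is why practical tools rely on heuristic search.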
24
Software: AutoDock, FTDock, DOCK, Hammerhead, GOLD, FlexX
25
8.5.2 Database Screening
Primary consideration: a complete and accurate search with reasonable computational complexity
SLIDE (Fig. 8.4)
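One generic way to keep the cost of a database search reasonable is to discard most candidates with a cheap test and spend the expensive scoring only on the survivors. The sketch below illustrates that filter-then-score idea. It is not a description of how SLIDE works, and the field names, threshold, and scores are made up for illustration.

```python
def screen(library, passes_filter, score, top_n=3):
    """Apply a cheap filter, then rank the survivors by an expensive score.

    Lower scores are treated as better (as with an energy).
    Returns (top hits, number of candidates that were actually scored).
    """
    survivors = [candidate for candidate in library if passes_filter(candidate)]
    ranked = sorted(survivors, key=score)
    return ranked[:top_n], len(survivors)


# Hypothetical library: names, sizes, and scores are illustrative only.
library = [
    {"name": "lig_a", "heavy_atoms": 18, "toy_score": -6.1},
    {"name": "lig_b", "heavy_atoms": 45, "toy_score": -9.0},
    {"name": "lig_c", "heavy_atoms": 22, "toy_score": -7.4},
    {"name": "lig_d", "heavy_atoms": 27, "toy_score": -5.0},
]

hits, n_scored = screen(
    library,
    passes_filter=lambda c: c["heavy_atoms"] <= 30,  # cheap size check
    score=lambda c: c["toy_score"],                  # stand-in for a docking score
    top_n=2,
)
print([h["name"] for h in hits], n_scored)
```

The trade-off is visible in the example: `lig_b` has the best score but is never evaluated, so a filter that is too aggressive costs completeness.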
27
8.6 X-Ray Crystal Structures
W. C. Roentgen (1895) discovered X-rays.
M. von Laue (1912) discovered that crystals diffract X-rays.
D. Hodgkin et al. (1950s) crystallized complex organic molecules and determined their structures.
28
grow a crystal of the protein
31
File formats
  PDB formatted text
  mmCIF (MacroMolecular Crystallographic Information File)
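PDB formatted text is a fixed-column format, so atom records can be read by slicing each line at known column positions. The sketch below reads a few fields from ATOM and HETATM records using the standard column layout. The `Atom` type and `parse_atoms` function are hypothetical names, and the two sample records use made-up coordinates.

```python
from typing import List, NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atoms(pdb_text: str) -> List[Atom]:
    """Extract atoms from ATOM/HETATM records of PDB formatted text."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            atoms.append(Atom(
                serial=int(line[6:11]),        # columns 7-11
                name=line[12:16].strip(),      # columns 13-16
                res_name=line[17:20].strip(),  # columns 18-20
                chain_id=line[21],             # column 22
                res_seq=int(line[22:26]),      # columns 23-26
                x=float(line[30:38]),          # columns 31-38
                y=float(line[38:46]),          # columns 39-46
                z=float(line[46:54]),          # columns 47-54
            ))
    return atoms


SAMPLE = (
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N\n"
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C\n"
    "END\n"
)

for atom in parse_atoms(SAMPLE):
    print(atom)
```

For real work an established parser is the safer choice, since actual files contain alternate locations, multiple models, and other records this sketch ignores.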
32
Databases & resources: PDB, PIR, ExPASy
33
Visualization tools (Fig. 8.8): RasMol, Swiss-PdbViewer, VMD (Visual Molecular Dynamics), Spock, Protein Explorer, DINO
34
8.7 NMR Structures
Applicable to proteins of ~200 amino acids or fewer
The structures determined are not unique
35
8.8 Empirical Methods and Prediction Techniques
Example: Fig. 8.9
  extracting features
  learning, training
  testing
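The feature extraction, training, and testing steps can be sketched with a deliberately small example. The code below assumes amino acid composition as the feature vector and a one-nearest-neighbor rule as the learner. Both are illustrative choices, not necessarily those of Fig. 8.9, and the sequences and labels are invented.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Feature extraction: fraction of each of the 20 amino acids."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def nearest_neighbor(training_set, query):
    """Predict the label of the training sequence closest in feature space."""
    q = composition(query)
    _, label = min(
        (math.dist(composition(seq), q), label) for seq, label in training_set
    )
    return label


def accuracy(training_set, test_set):
    """Testing: fraction of held-out sequences labeled correctly."""
    correct = sum(
        nearest_neighbor(training_set, seq) == label for seq, label in test_set
    )
    return correct / len(test_set)


# Invented toy data: hydrophobic-rich vs. charge-rich sequences.
train = [("LLVVAAIILLFF", "membrane-like"), ("DDEEKKRRDEKR", "soluble-like")]
test = [("AVLLIIFFVVLA", "membrane-like"), ("KKDDEERRKDER", "soluble-like")]
print(accuracy(train, test))
```

Keeping the test sequences out of the training set is the point of the final step: accuracy measured on the training data itself would overstate how well the method generalizes.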
37
8.9 Post-Translational Modification Prediction
Remove segments of a protein.
Covalently attach sugars, phosphates, or sulfate groups to surface residues.
Cross-link residues within a protein (disulfide bonds).
38
8.9.1 Protein Sorting
39
associated with membranes
not associated with membranes
Table 8.3 (Case 2)
40
PSORT: nearest neighbor classifier; prediction of protein subcellular localization
SignalP: artificial neural networks; prediction of signal peptide cleavage sites
41
8.9.2 Proteolytic Cleavage
chymotrypsin: cleaves polypeptides on the C-terminal side of bulky and aromatic residues
trypsin: cleaves on the carboxyl side of basic residues (Lys, Arg)
elastase: cleaves on the C-terminal side of small residues
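The cleavage rules above can be turned into a simple in silico digest. The sketch below cuts after every residue in a given set. The residue sets are simplified assumptions (Lys/Arg for trypsin, Phe/Trp/Tyr for chymotrypsin, Ala/Gly/Ser/Val for elastase), and real enzymes have exceptions and weaker secondary specificities that are ignored here. The function name `digest` is hypothetical.

```python
# Simplified specificities: cleave on the C-terminal side of these residues.
PROTEASES = {
    "trypsin": "KR",
    "chymotrypsin": "FWY",
    "elastase": "AGSV",
}


def digest(sequence, cut_after):
    """Split a sequence after each residue in cut_after (no missed cleavages)."""
    fragments = []
    start = 0
    last = len(sequence) - 1
    for i, residue in enumerate(sequence):
        # Cutting after the final residue would only produce an empty fragment.
        if residue in cut_after and i < last:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


print(digest("AKRGFK", PROTEASES["trypsin"]))
print(digest("AKRGFK", PROTEASES["chymotrypsin"]))
```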
42
Prediction: proteasomes, > 98%, by neural network
43
8.9.3 Glycosylation
The process of covalently linking an oligosaccharide to the side chain of a protein surface residue (source: 科學人, the Chinese edition of Scientific American)
Prediction by neural network
  N-linked, 75%
  O-linked, 85%
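A much simpler baseline than a neural network is to scan for the well-known N-linked consensus sequon, Asn-X-Ser/Thr with X any residue except Pro. The sketch below does only that motif scan. It finds candidate sites, not confirmed ones, and says nothing about O-linked sites. The function name `n_glycosylation_candidates` is hypothetical.

```python
import re

# N, then any residue except P, then S or T. The lookahead keeps
# overlapping sequons from hiding one another.
SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glycosylation_candidates(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# N at position 2 (NGS) and position 9 (NAT) qualify; NPT at position 6 does not.
print(n_glycosylation_candidates("MNGSANPTNAT"))
```

Many sequons are never glycosylated in practice, which is part of why trained predictors are used instead of the bare motif.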
44
8.9.4 Phosphorylation
kinases: add phosphate groups
phosphatases: remove phosphate groups
Role: signaling
NetPhos: > 70%, neural network
45
References and figure sources
1. Dan E. Krane and Michael L. Raymer, Fundamental Concepts of Bioinformatics, Benjamin/Cummings, 2003.
2. Merriam-Webster Dictionary