
RESULTS FROM THE HUPO PLASMA PROTEOME PROJECT: CORE DATASET OF 3020 PROTEINS

US HUPO Symposium, 14 March 2005

Gilbert S. Omenn, David States, Dan Chan, Richard Simpson, Henning Hermjakob, and Sam Hanash, on behalf of the HUPO PPP Investigators

[Figure with labels: Protein, DNA]


LONG-TERM SCIENTIFIC GOALS OF HUPO PPP

1. Comprehensive analysis of plasma and serum protein constituents in people

2. Identification of biological sources of variation within individuals over time, with validation of biomarkers

   Physiological: age, sex/menstrual cycle, exercise
   Pathological: selected diseases/special cohorts
   Pharmacological: common medications

3. Determination of the extent of variation across populations and within populations


Scheme Showing Aims and Linkages of the HUPO Plasma Proteome Project

[Diagram: HUPO HUMAN PLASMA PROTEOME PROJECT (PPP) at the center, linked to:]

• HUPO PPP Participating Labs
• Development & Validation of Biomarkers
• Reference Specimens
• Technology Platforms--Separation and Identification
• Serum vs Plasma
• Technology Vendors
• Liver and Brain Proteome Projects

Omenn GS. The Human Proteome Organization Plasma Proteome Project Pilot Phase: Reference Specimens, Technology Platform Comparisons, and Standardized Data Submissions and Analyses. Proteomics 2004;4:1235-1240.


PPP TECHNICAL COMMITTEE STRUCTURE

• Reference Specimens and Specimen Handling Issues (Dan Chan, chair)

• Technology Platforms & Protocols (Richard Simpson)

• Database Development and Links with EBI (HUPO/PSI) (David States, Henning Hermjakob)

• Population Cohorts/Specimen Banks (Gerard Siest)

• Education & Training Committee (Peipei Ping)

• Executive Committee (including Partnerships) (Omenn)


SERUM AND PLASMA REFERENCE SPECIMENS

1. BD: specially prepared male/female pooled samples, divided into EDTA-, Heparin-, and Citrate-anti-coagulated Plasma and Serum (250 µl x 4 of each). BD clot activator. No protease inhibitors. Three separate ethnic pools prepared. Shipped frozen.

2. Chinese Academy of Medical Sciences: Sets of three plasmas + serum, similar to BD protocol.

3. National Institute for Biological Standards & Control, UK: citrate-anti-coagulated, freeze-dried plasma, from 25 donors, prepared for Intl Soc Thrombosis & Hemostasis, 1 ml aliquots/ampoules.


UPDATED SUMMARY OF PPP LABS

31 Total Participating Labs (18 US, 13 International):
9 US Academic, 3 US Federal, 6 US Corporate
4 Europe, 1 Israel, 6 Asia, 2 Australia

LC-MS/MS datasets from 18; MALDI-MS from 5; SELDI-MS from 8; antibody arrays/immunoassays from 4

Number that analyzed various reference specimens:
9 UK NIBSC
26 BD b1 (Caucasian-American)
9 BD b2/b3 (African- and Asian-American)
5 CAMS


Arie Admon, Technion, Haifa, Israel
Ruedi Aebersold, Institute for Systems Biology, Seattle
William Hancock, Barnett Institute, Northeastern Univ
Stan Hefta, Bristol-Myers Squibb, NJ
Helmut Meyer, Ruhr University Bochum
Gil Omenn/Sam Hanash/Phil Andrews/Mike Pisano, MI
Young Ki Paik, Yonsei Research Center, Korea
John Peltier, Myriad Proteomics Inc.
Peipei Ping, UCLA
Joel Pounds, Pacific Northwest Natl Lab
Xiaohong Qian, Beijing Institute of Radiation Medicine
Richard Simpson, Ludwig Institute for Cancer Research
David Speicher, Wistar Institute
Rong Wang, Mass Spec Proteomics Lab, Mount Sinai
Valerie Wasinger, Univ of New South Wales
Chi Yue Wu, Institute of Biol Chem, Acad Sinica, Taiwan
Xiaohang Zhao, Natl Lab of Molecular Oncology, CAMS
Robert Gerszten, Harvard/Erik Forsberg, Amersham-GE


Immunoassay Labs
Brian Haab, Van Andel Research Institute
Frank Vitzthum, Dade Behring, Marburg GMBH, Germany
Mark Driscoll, Molecular Staging Inc
Bernhard Geierstanger, Genomics Inst of Novartis Research Fdn

MALDI-MS Labs
Alexander Archakov, Institute of Biomedical Chemistry, Moscow
Erik Forsberg, Amersham Biosciences, Uppsala, Sweden
Young-ki Paik, Yonsei Proteome Research Center
Akira Tsugita, Proteomics Research Lab, Tsukuba, Japan

SELDI Labs
Bao-Ling Adam, Univ of Georgia
Alexander Archakov, Institute of Biomedical Chemistry, Moscow
Dan Chan/Alex Rai, Johns Hopkins
Kenneth Greis, Procter & Gamble
Eastwood Leung, Genome Institute of Singapore
Sandra McCutcheon-Maloney/Brett Chromy, Lawrence Livermore Lab
William Morgan, Univ of Missouri-KC
Jean-Charles Sanchez, Geneva Proteomics Research Center
Paul Stemmer, Wayne State University


SPECIFICATIONS FOR DATA SUBMISSION

Each lab was instructed (July 2003) to provide:
a) a detailed experimental protocol, to “push the limits” to detect low-abundance proteins
b) peptide sequences, rated as “high” or “lower” confidence, based on MS/MS criteria
c) protein IDs from IPI 2.21 (July 2003) and the search engine used to align peptide sequences with proteins in the human database

Later, we requested m/z peak lists and raw spectra (by CD or DVD), and search parameters.


CRITERIA FOR HIGH CONFIDENCE IDENTIFICATION OF PEPTIDES, ILLUSTRATED WITH SEQUEST

Xcorr: singly-charged ion, >= 1.9
       doubly-charged ion, >= 2.2
       triply-charged ion, >= 3.75

Delta Cn >= 0.1; Rsp <= 4
Fully tryptic
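The thresholds on this slide can be read as a simple per-peptide filter. The following is a minimal sketch in Python, not the project's actual code: the `PeptideHit` record and its field names are hypothetical, and rejecting charge states above 3 is an assumption, since the slide lists thresholds only for charges 1 to 3.

```python
from dataclasses import dataclass

# Minimum Xcorr per precursor charge state, as listed on the slide.
XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.75}
DELTA_CN_MIN = 0.1
RSP_MAX = 4


@dataclass
class PeptideHit:
    """Hypothetical record for one SEQUEST peptide match."""
    sequence: str
    charge: int
    xcorr: float
    delta_cn: float
    rsp: int
    fully_tryptic: bool


def is_high_confidence(hit: PeptideHit) -> bool:
    """Return True if the hit passes every criterion on the slide."""
    threshold = XCORR_MIN.get(hit.charge)
    if threshold is None:
        # Assumption: charge states not listed on the slide are not accepted.
        return False
    return (
        hit.xcorr >= threshold
        and hit.delta_cn >= DELTA_CN_MIN
        and hit.rsp <= RSP_MAX
        and hit.fully_tryptic
    )
```

A hit that fails any single criterion would fall into the "lower" confidence group rather than being discarded, which matches the two-level rating requested in the data submission specifications.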


Database Design and Implementation

RDBMS
– Stable, proven technology
– Data validation

Commercial package
– Microsoft SQL Server
– Stable and supported
– Full RDBMS functionality
  • Transactions
  • Referential integrity checks
– Effective development tools
  • GUI
  • Cross-tab extension

[Schema diagram: Identifications table (Laboratory, Sample, Method, Database ID, Peptides) linked to Laboratories, Samples, and Methods tables]
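To make the referential-integrity point concrete, here is a small sketch of the four tables named in the schema diagram. It uses Python's built-in `sqlite3` module in place of Microsoft SQL Server, and every column name and type is an assumption chosen for illustration; the slide gives only the table and field labels.

```python
import sqlite3

# Hypothetical column names; only the table/field labels come from the slide.
SCHEMA = """
CREATE TABLE laboratories (lab_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE samples (sample_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE methods (method_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE identifications (
    ident_id    INTEGER PRIMARY KEY,
    lab_id      INTEGER NOT NULL REFERENCES laboratories(lab_id),
    sample_id   INTEGER NOT NULL REFERENCES samples(sample_id),
    method_id   INTEGER NOT NULL REFERENCES methods(method_id),
    database_id TEXT NOT NULL,   -- protein accession submitted by the lab
    peptides    TEXT NOT NULL    -- peptide sequence supporting the ID
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with foreign-key enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

With this layout, an identification that refers to an unknown laboratory, sample, or method is rejected at insert time with `sqlite3.IntegrityError`, which is the kind of data validation the slide attributes to the RDBMS.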


Bioinformatics Acknowledgements

University of Michigan
David States
Marcin Adamski
Thomas Blackwell
Rajasree Menon
Yin Xu

EBI – England
Rolf Apweiler
Henning Hermjakob
Chris Taylor
Nicky Mulder
Sandra Orchard

Ludwig Institute - Australia
Richard Simpson
Eugene Kapp
James Eddes

Institute for Systems Biology
Jimmy Eng
Alexey Nesvizhskii

Technion/IBM
Ilan Beer


Integration Algorithm (Adamski et al)

Objectives:
o Integrate results from disparate instruments, search engines, and specimens
o Evaluate concordance between results from different laboratories
o Reduce ambiguity and redundancy of the identifications
o Select the accession number of the most representative protein among those matching equally.

We designed a workflow that uses sequences of identified peptides, rather than submitted protein accession numbers.
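A peptide-centred workflow of this kind can be sketched as follows. This is an illustrative toy under stated assumptions, not the algorithm of Adamski et al.: peptides are matched to database sequences by exact substring search, proteins hit by an identical peptide set are treated as one cluster, and the representative is simply the alphabetically first accession (a placeholder rule; the slide does not say how the representative is chosen). The function names and the 6-residue minimum length (taken from the ">= 6 aa" note on the protein-count slide) are the only other inputs.

```python
from collections import defaultdict


def integrate(peptides_by_lab, protein_sequences, min_len=6):
    """Map pooled peptide sequences onto database proteins.

    peptides_by_lab:   {lab name: iterable of peptide sequences}
    protein_sequences: {accession: protein sequence}
    Returns {representative accession: set of matching peptides}.
    """
    # Pool distinct peptides across labs, ignoring submitted accessions.
    pooled = {
        pep
        for peptides in peptides_by_lab.values()
        for pep in peptides
        if len(pep) >= min_len
    }

    # Exact substring matching stands in for a real sequence search.
    matches = defaultdict(set)
    for accession, sequence in protein_sequences.items():
        for pep in pooled:
            if pep in sequence:
                matches[accession].add(pep)

    # Proteins supported by the same peptide set cannot be told apart.
    clusters = defaultdict(list)
    for accession, peps in matches.items():
        clusters[frozenset(peps)].append(accession)

    # Placeholder rule: first accession in sorted order represents the cluster.
    return {min(accessions): set(peps) for peps, accessions in clusters.items()}


def core_dataset(integrated, min_peptides=2):
    """Keep only proteins supported by at least `min_peptides` peptides."""
    return {acc: peps for acc, peps in integrated.items() if len(peps) >= min_peptides}
```

Working from peptide sequences means two labs that reported different accessions for the same underlying peptides end up in the same cluster, which is how such a scheme reduces redundancy.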


Numbers of Proteins Identified (LC-MS/MS or FTICR-MS)

From 15,519 reported distinct protein IDs in IPI 2.21, chose representative protein for clusters:

(a) all protein IDs (high and lower conf)
    9504 = 1 or more peptide matches (>= 6 aa)
    3020 = 2 or more peptide matches [1274 = 3+]
    [2580 in plasma x3; 2353 in serum; 1913 in both]

(b) all protein IDs (high conf peptides, only)
    2852 = 1 or more peptide matches

http://www.bioinformatics.med.umich.edu/app1/MsSqlAccess [UM] and www.ebi.ac.uk/pride [EBI]


Distribution of protein identifications as a function of peptides detected per protein

[Figure: bar chart. x-axis: number of peptides per protein detected across experiments and laboratories (≥ 1, ≥ 2, ≥ 3, ≥ 4, ≥ 5, ... ≥ 10); y-axis: number of identified proteins (0 to 10,000). Series: all identifications (left axis); confirmed identifications. Percentage labels on the plot: 25%, 75%, 86%, 91%, 92%, 97%.]


ESTIMATION OF ERROR RATE
Poisson Model (States/Blackwell)

Ndb: total non-redundant protein entries in IPI v2.21 (49,924)

Lambda: proportion of matches that are false positives
Upper bound: all 9504 FP, = 0.211
Lower bound: accept 1920 high confidence single-peptide-based protein IDs, reject 4864 lower confidence, = 0.146

Pr (true positives):
4 peptides, 0.99
3 peptides, 0.95-0.98
2 peptides, 0.70-0.85

Use 2+ peptides to obtain more representative dataset.
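The slide does not show how the two Lambda values were computed. One reading that reproduces both numbers is sketched below: if random matches land on database entries as a Poisson process with rate lambda per entry, the fraction of entries hit at least once is 1 - exp(-lambda), so lambda = -ln(1 - n_hit / Ndb). Using all 9504 IDs gives the upper bound, and using the 1920 + 4864 = 6784 single-peptide IDs gives the lower-bound figure. This is a reconstruction offered as an assumption, not the authors' stated derivation, and the function name is hypothetical.

```python
import math

N_DB = 49_924  # non-redundant protein entries in IPI v2.21 (from the slide)


def poisson_rate(n_hit: int, n_db: int = N_DB) -> float:
    """Rate lambda such that a fraction n_hit / n_db of entries is hit at least once."""
    return -math.log(1.0 - n_hit / n_db)


upper = poisson_rate(9504)          # treat every identified protein as a random hit
lower = poisson_rate(1920 + 4864)   # assumption: single-peptide IDs only

print(round(upper, 3), round(lower, 3))  # prints: 0.211 0.146
```

Under the same model, the chance that one entry receives two or more independent random hits falls off quickly with lambda, which is consistent with the slide's recommendation to require 2+ peptides.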


Virtual 2D gel

[Figure panels: Proteins detected with at least 2 peptides; All detected proteins; All proteins in IPI 2.21]


INDEPENDENT ANALYSES FROM RAW SPECTRA (#IDs with 2+ peptides)

Core Dataset (18 datasets, 3020)
• Mascot/Digger (Kapp, Australia, 18 datasets, 3178)
• PepMiner (Beer, Israel, 8 large datasets, 2902)
• PeptideProphet/ProteinProphet (Eng, USA, 5 datasets, 508)

Plus alternative integration scheme with Sequest (Eddes, Australia, 18 datasets, 2344)


GREATEST RESOLUTION AND SENSITIVITY

The most extensive high-confidence yield was from combined use of immunoaffinity (“top-6”) depletion, 2-D or 3-D high-resolution fractionation, and then ESI-MS/MS with an ion-trap LTQ instrument.

LTQ gave several-fold more IDs than did LCQ in the same hands (B1-serum vs B1-heparin).


SPECIFIC OBSERVATIONS: DEPLETION

• Many investigators depleted albumin and/or immunoglobulins

• Several obtained the Agilent immunoaffinity column to remove Top-6 proteins.

• Much higher numbers of identifications after depletion.

• Inadvertent removal of other proteins remains an open issue: LC-MS/MS vs gels; “sponge” effect of albumin.

• Feasible to assay both flow-through & bound fractions.


Example of Depletion Analysis
Echan/Speicher

Immunoaffinity/Top-6 polyclonal (Agilent)
o Column for HPLC
o Spin column
Two-antibody spin column (Proteoprep); IgY
Cibacron Blue (for albumin)
Protein A or G (for immunoglobulins)

Top-6 best: 85% of protein removed; least non-target removal (lots of fragments of top 6); few “new” proteins on 2D gel despite 10-20X loading

Suggest depleting 12-20 proteins OR using multi-dimensional (microSol IEF) fractionation


Glycoprotein-Enriched Subproteomes
[Hancock presentation this afternoon]

Methods          Lab 2                 Lab 11
Enrichment       hydrazide chem        lectin chrom’y
Peptide Fxn      SCX + RP              RP
Mass Spec        qtof                  deca-xp
Search engine    Seq/ProteinProphet    Sequest
Protein IDs      222                   83

Protein IDs are in B1-serum [51 in common].
Of total 254, 164 found among data from 11 other labs without glycoprotein enrichment.


[Figure, panels A and B] First dimension fraction numbers (relative pI) and estimated MW of identified proteins. (A): 39 locations with complement component 3 precursor (C3); (B): 14 locations with clusterin (CLU).


INFLUENCE OF ABUNDANCE

Using quantitative immunoassays and microarrays (generally unknown epitopes), we have found very high rates of detection of the more abundant proteins, less in the mid-range, and occasional detection of very low abundance proteins, as expected.

High correlation (r=0.9) between # peptides and measured concentrations.


[Figure: number of peptide identifications (N) vs. concentration (pg/ml), log-log axes. x-axis ticks: 100, 10,000, 1e6, 1e8, 1e10; y-axis ticks: 1, 10, 100, 1,000.]

Fitted line: log10(N) = 0.365 * log10(conc) - 0.711; r2 = 0.92
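The fitted line can be evaluated directly to see what it implies at a given concentration. The sketch below only restates the regression from the slide; the function names are hypothetical, and the fit is an average trend across proteins, so individual proteins can deviate substantially.

```python
import math

SLOPE = 0.365       # from the fitted line on the slide
INTERCEPT = -0.711  # from the fitted line on the slide


def expected_peptide_ids(conc_pg_per_ml: float) -> float:
    """Expected number of peptide identifications at a concentration in pg/ml."""
    return 10 ** (SLOPE * math.log10(conc_pg_per_ml) + INTERCEPT)


def concentration_for(n_peptides: float) -> float:
    """Invert the fit: concentration (pg/ml) at which n_peptides is expected."""
    return 10 ** ((math.log10(n_peptides) - INTERCEPT) / SLOPE)
```

For example, `concentration_for(2)` gives the concentration at which the trend line predicts two peptide identifications, the threshold used for the core dataset.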


Least Abundant Proteins Identified with Two Distinct Peptides

(concentrations in pg/ml; range 200 pg/ml to 20 ng/ml)

Alpha fetoprotein                      2.9E+02
TNF-R-8                                3.3E+02
TNF-ligand-6                           1.5E+03
PDGF-R alpha                           4.6E+03
Leukemia inhibitory factor receptor    5.0E+03
MMP-2/gelatinase                       8.8E+03
EGFR                                   1.1E+04
TIMP-1                                 1.4E+04
IGFBP-2                                1.5E+04
Activated leukocyte adhesion mol       1.6E+04
Selectin L [five labs; 10 peptides]    1.7E+04


SPECIMEN VARIABLES

What evidence have we developed for choice of specimens for analysis?

Plasma preferred over serum
Citrate or EDTA preferred over Heparin for plasma
Protease inhibitors desirable, but complicated
Clot activator unnecessary (serum only)
Minimize freeze/thaw cycles (archives)
Avoid 4 °C step


SPECIMENS

The sets of four specimens from a given donor pool yielded rather similar numbers of proteins when analyzed identically. More fragmentation in serum. Little evidence of in vitro platelet contamination.

Quantitative immunoassays show generally 15-20% lower values for citrate-plasma, due to dilution and osmosis; no interference with or loss of identifiable proteins.


PROTEASES

Should anti-protease cocktails be used in specimen collection, or in a later step?

Advantages: reduce proteolytic degradation ex vivo; reduce complexity of peptides after tryptic digestion.

Disadvantages: adds peptides, as well as small molecules, to the mix to be analyzed; may covalently modify the proteins (AEBSF does so).


BIOLOGICAL INSIGHTS

The proteins identified can be annotated by many methods. We have searched multiple databases, including Gene Ontology, Novartis Atlas, Online Mendelian Inheritance in Man (OMIM), incomplete or unidentified sequences in the human genome, microbial genomes, and protein domains.

Some examples follow.


GO term usage in the PPP 3020 vs. Human Genome

[Figure: log-log scatter plot. x-axis: occurrences in genome (1 to 1000); y-axis: occurrences in PPP (1 to 200).]

Shown in the figure are the rates of occurrence of Gene Ontology terms in the HUPO PPP 3020 set relative to the frequency of occurrence of the same terms in the human genome. The solid line shows a linear regression estimate for the frequency that would be expected if the 3020 uniformly sampled the genome. The parallel dotted lines show 2-fold over- and under-representation relative to uniform sampling. The curved dashed lines show over- and under-representation by 3 standard deviations.
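The two kinds of reference lines in the figure (2-fold and 3 standard deviations) can be illustrated with a simple sampling model. The sketch below assumes binomial sampling of annotated genes, which is a simplification of the regression-based expectation described in the caption; the function name and the genome totals in the usage note are hypothetical.

```python
import math


def representation(observed: int, genome_count: int,
                   sample_size: int, genome_size: int):
    """Compare an observed term count with uniform sampling of the genome.

    Returns (expected count, fold change, z-score) under a binomial model.
    """
    p = genome_count / genome_size          # term frequency in the genome
    expected = sample_size * p              # count expected in the sample
    sd = math.sqrt(sample_size * p * (1.0 - p))
    fold = observed / expected
    z = (observed - expected) / sd
    return expected, fold, z
```

As a usage example with made-up totals chosen so that 25 occurrences are expected, `representation(1, 250, 3020, 30200)` yields a fold change far below 0.5 and a z-score below -3, the pattern reported for "perception of smell" (1 observed vs 25 expected).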


Over- and Under-Represented GO Terms

Over: extracellular, immune response, blood coagulation, lipid transport, complement activation, regulation of blood pressure; also, cytoskeletal proteins, receptors and transporters

Under: perception of smell (1 vs 25 expected); cation transporters, ribosomal proteins, G-protein coupled receptors, and nucleic acid binding proteins


InterPro domain usage in the PPP 3020 vs. Human Genome

[Figure: log-log scatter plot. x-axis: occurrences in genome (1 to 1000); y-axis: occurrences in PPP (1 to 50).]


OVER- AND UNDER-REPRESENTED DOMAINS IN INTERPRO FOR PPP vs FULL IPI DATASET

Over: EGF, intermediate filament protein, sushi, thrombospondin, complement C1q, and cysteine protease inhibitor

Under: Zinc finger (C2H2, B-box, RING), tyrosine protein phosphatase, tyrosine and serine/threonine protein kinases, helix-turn-helix motif, and IQ calmodulin binding region


GENE ONTOLOGY SPECIFIC TERMS

Over-represented in PPP 3020 (vs whole genome): “extracellular”, “immune response”, “blood coagulation”, “lipid transport”, “complement activation”, “regulation of blood pressure”, as expected; also: cytoskeletal proteins, receptors and transporters.

Proteins from most cellular locations and molecular processes are recognized.

Under-represented: “perception of smell” (1 vs 25 expected); cation transporters, ribosomal proteins, G-protein coupled receptors, and nucleic acid binding proteins.


InterPro Protein Domain Analysis

Compared with the whole human genome, the 3020 PPP proteins are:

Over-represented for EGF, intermediate filament protein, sushi, thrombospondin, complement C1q, and cysteine protease inhibitor, and

Under-represented for zinc finger (C2H2, B-box, RING), tyrosine protein phosphatase, tyrosine and serine/threonine protein kinases, helix-turn-helix motif, and IQ calmodulin binding region domains.


TRANSMEMBRANE AND SECRETED PROTEIN FEATURES

1297 of 3020:        SwissProt Annotated    ProFun    Both
Transmembrane        230                    151       104
Secretion signal     373                    420       358

1723 of 3020:        ProFun Predicted
TM domain(s)         137
Secretion signal     255


Cardiovascular-Related Proteins Biomarker Candidates in the PPP Database

(Vondriska-Ping: presentation today)

Proteins characterized in eight groups:
Inflammation
Vascular
Signaling
Growth and differentiation
Cytoskeletal
Transcription factors
Channels
Receptors


PROTEINS FROM INHERITED CANCER DISORDERS
Linking IPI IDs and Mendelian Inheritance in Man (OMIM)

IPI            Cancer Types                      Protein                                    Labs  No of Peptides
IPI00012391.1  Colorectal                        APC                                        2     2
IPI00017303.1  Colorectal, HNPCC; Ovarian        DNA mismatch repair protein Msh2           2     2
IPI00020732.2  Medullary or papillary thyroid    Tyrosine kinase ret receptor precursor     2     3
IPI00025087.1  Colorectal                        Cellular tumor antigen p53                 1     3
IPI00031036.1  Colon                             Chloride anion exchanger                   2     4
IPI00164713.1  Breast, Endometrial, Gastric, Ov  Epithelial-cadherin precursor              2     4
IPI00181932.1  Prostate                          Zinc phosphodiesterase                     2     5
IPI00185027.1  Pancreatic                        Arg-Glu dipeptide (RE) repeats             2     2
IPI00218982.1  Breast, Ovarian                   BRCA1                                      3     5
IPI00257731.1  Prostate                          N33 protein                                2     2
IPI00289819.1  Hepatocellular                    Cation-ind mannose-6-P receptor precursor  2     3
IPI00293471.1  Breast 2, Pancreatic              BRCA2                                      4     8
IPI00294982.1  Breast                            Estrogen receptor                          2     2
IPI00329643.1  Endometrial                       DNA mismatch repair protein Msh3           3     3


IDENTIFICATION OF 94 NOVEL PEPTIDES USING WHOLE GENOME ORF SEARCH

States has enhanced the annotation of the Human Genome by identifying novel and cryptic genes not previously known to have protein products. Mass spectra peaklists from a subset of PPP labs were searched against all ORFs in NCBI Build 33 in all three reading frames and both strands, using X!Tandem.

A bonus of the PPP: protein to DNA mapping of the human genome!
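Searching "all three reading frames and both strands" amounts to a six-frame translation of the genomic sequence. The sketch below shows that step only, using the standard genetic code and treating an ORF as any stop-to-stop stretch of at least `min_len` residues (an assumption; the slide does not define the ORF criteria). The spectrum matching done with X!Tandem is not reproduced here, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna: str) -> str:
    """Translate one reading frame; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_orfs(dna: str, min_len: int = 6) -> set:
    """Return stop-to-stop translated stretches from all six frames."""
    dna = dna.upper()
    reverse_complement = dna.translate(COMPLEMENT)[::-1]
    orfs = set()
    for strand in (dna, reverse_complement):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                if len(segment) >= min_len:
                    orfs.add(segment)
    return orfs
```

The resulting set of translated stretches would then serve as the protein database handed to the search engine, so that peptides from unannotated genes can still be matched.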


COMPARISON WITH LITERATURE

Report      #IDs      #IPI    in 3020    in 9504
Anderson    1175      990     316        471
Shen        [1682]    1842    213        526
Chan        1444      1019    257        402
Zhou        210       107     51         68
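The overlap with earlier reports is easier to compare as fractions. The sketch below assumes that "#IPI" is the number of each report's IDs that could be mapped to IPI and that the last two columns count how many of those fall in the 3020 and 9504 sets; that reading of the column headers is an assumption, and the dictionary keys are hypothetical names.

```python
# Counts copied from the table above.
REPORTS = {
    "Anderson": {"n_ipi": 990,  "in_3020": 316, "in_9504": 471},
    "Shen":     {"n_ipi": 1842, "in_3020": 213, "in_9504": 526},
    "Chan":     {"n_ipi": 1019, "in_3020": 257, "in_9504": 402},
    "Zhou":     {"n_ipi": 107,  "in_3020": 51,  "in_9504": 68},
}


def overlap_fractions(reports):
    """Fraction of each report's IPI-mapped IDs found in the 3020 and 9504 sets."""
    return {
        name: (row["in_3020"] / row["n_ipi"], row["in_9504"] / row["n_ipi"])
        for name, row in reports.items()
    }


for name, (frac_core, frac_all) in overlap_fractions(REPORTS).items():
    print(f"{name}: {frac_core:.1%} in 3020, {frac_all:.1%} in 9504")
```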


NEXT STEPS

1. We are in the homestretch on manuscripts from the Pilot Phase of the PPP for the Special Issue of PROTEOMICS (August 2005) and for Nature Biotech.

2. Plan potential future phases of PPP
   a) Identify and perform critical experiments to support development of standardized procedures for specimens, fractionation, analysis.
   b) Provide high-quality bioinformatics and database for plasma proteome datasets from all sources, assuring linkage with organ-proteomes.
   c) Organize strategies, labs, and bioinformatics for large-scale studies, or play facilitation role.


DEFINE HIGH THROUGHPUT OPTIONS FOR LARGE-SCALE PROTEOMICS STUDIES (1)

Admon/Dongre: LC-MS with highly accurate mass and elution time parameters for peptide IDs

Combine with depletion; rely on very slow flow (2 hr) LC and accurate mass and elution characteristics for mass fingerprints, after building a high-quality mass x elution database.


DEFINE HIGH THROUGHPUT OPTIONS FOR LARGE-SCALE PROTEOMICS STUDIES (2)

Mann, Beijing Congress (2004)

Use MS (3) with FTICR for much greater precision of mass determination and for detection and localization of post-translational modifications.

Probably convert to microarrays for high throughput of clinical and epidemiological specimens.


Genome-Wide Studies of Proteome (3)
Humphery-Smith (Proteomics 2004;4:2519-21)

Design and produce affinity ligands against conserved regions in each ORF for signal enrichment: antibodies, receptins, aptamers; sequence strings unencumbered by PTMs, uncleaved, near 5’ end, exposed at surface

Use ECL, rolling circles, isotopic labeling, and/or light scattering as readout technologies.


Large-Scale Proteomics Studies (4)
Aebersold (Nature 2003;422:115-116)

Go from discovery using MS to “browsing” using unique chemically-synthesized peptides tagged with heavy isotope for each gene and even each protein isoform.

Combine this standard peptide mixture with specimen fractions on sample plate for MS, examine double peaks (with the precise differential mass) in the ordered peptide array.

Try the method first on yeast.


HUPO PPP SUPPORT FROM NIH

Trans-NIH Consortium
Natl Cancer Inst: Div Cancer Prevention; Div Cancer Treatment
Natl Institute on Aging
Natl Inst on Alcoholism & Alcohol Abuse
Natl Inst on Diabetes, Digestive, & Kidney Diseases
Natl Inst for Environmental Health Science
Natl Inst for Neurologic Diseases & Stroke


CORPORATE SPONSORS

Johnson & Johnson
Abbott Labs
Pfizer
Novartis
Invitrogen
Amersham
Procter & Gamble
BD Biosciences
BioVision
Ciphergen
Molecular Staging
Bristol Myers Squibb
Sigma-Aldrich
Agilent
Dade Behring


OUR GENETIC FUTURE

“Mapping the human genetic terrain may rank with the great expeditions of Lewis and Clark, Sir Edmund Hillary, and the Apollo Program.”

--Francis Collins, Director, National Human Genome Research Institute, 1999

Next: Functional Genomics/Systems Biology
Understand the dynamic proteomic compartments.
