Welcome to the world of mass spectrometry-based proteomics!
Xiaolu Zhao, Ph.D.
Room 6016, CLS, Wuhan University
From "gene" to "protein"
Gene counts
• Nematode: 19,000
• Fruit fly: 13,500
• Arabidopsis: 27,000
• Human: ~20,500
Mass spectrometry-based proteomics: a global study of
• Protein expression levels
• Protein post-translational modifications
• Protein interactions
DNA → RNA → PROTEIN
• Protein and proteomics
• Mass spectrometry basics
• Mass spectrometry for proteomics
• Quantitative proteomics
• Human Proteome and more
Protein and proteomics
Amino acid structure
[Figure: amino acid structure, with the amino group, the carboxyl group and the side chain (R) labelled]
Protein Backbone
[Figure: amino acids with side chains R, R’ and R’’ join with loss of H2O to form amide bonds; each unit within the resulting chain is an amino acid residue]
Amino acid residue structure
What is a proteome?
The proteome is the entire set of proteins expressed by a genome, cell, tissue or organism at a certain time. More specifically, it is the set of expressed proteins in a given type of cell or organism, at a given time, under defined conditions.
What is Proteomics?
Systematic (large-scale) study of protein properties to obtain a global, integrated view of disease processes, cellular processes and networks at the protein level.
“Proteomics includes not only the identification and quantification of proteins, but also the determination of their localization, modifications, interactions, activities, and, ultimately, their function.”
- Stan Fields in Science, 2001.
Every set of proteins carrying the same modification can be called a proteome:
• Phosphorylation → Phosphorylome
• Acetylation → Acetylome
• Ubiquitination → Ubiquitinome
• Methylation → Methylome
• GlcNAcylation → GlcNAcylome
• ...
The tasks in proteomics
• Detection: what and where
• Quantification: compare dynamics
Technologies for Proteomics
Adapted from Mallick and Kuster, Nature Biotechnology, 28:695-709, 2010
Conceptual Organization of Proteomic Experiments
Adapted from Mallick and Kuster, Nature Biotechnology, 28:695-709, 2010
Challenges in proteomics
• DNA: 4 bases
• Protein: 20 amino acids + PTM
• DNA: similar biochemical properties
• Protein: average size 450 aa, wide range of solubility, PTM, degradation
The Human Plasma Proteome
Overwhelming dynamic range of at least 10 orders of magnitude
Proteomics - Challenges
• Sample
• Sampling and handling
• Complexity reduction
• Chromatography
• Mass spectrometry
• Bioinformatics
Aebersold (2009) Nature Methods, 6, 411
Why is proteomics a complement or alternative to mRNA-based measurements?
• It works in areas in which microarray measurement is not feasible. Example: blood.
• The proteomic measurement already delivers the desired end point, namely the protein expression level of a gene of interest.
• It is not limited to expression profiling of the whole cell. Cellular compartments and organelles and their time-resolved dynamics are readily accessible to this technology.
• It allows the large-scale measurement of protein modifications and their quantitative changes upon perturbations to the cell.
Figure: Quantitative Proteomics versus Transcriptomics
(A) Overlap between the proteins identified in the example given in Figure 1 (blue) and the messages with “present call” (that is, significantly different from zero signal) in a quadruple microarray experiment of the normal HeLa cell proteome (green).
(B) “Label-free” quantitation of about 4000 proteins identified in brain tissue samples from two separate individuals. Mass spectrometric intensity counts (added peptide signals) from the two separate runs are plotted against each other.
Mass spectrometry basics
Mass spectrometry is to proteomics what PCR is to genomics
MS Principles
• Different elements can be uniquely identified by their mass
• Different compounds can be uniquely identified by their mass
[Figure: structures of three example compounds]
• Butorphanol: MW = 327.1
• L-dopa: MW = 197.2
• Ethanol (CH3CH2OH): MW = 46.1
What is Mass Spectrometry?
Mass spectrometry is a powerful technique for chemical analysis that is used to identify unknown compounds, to quantify known compounds, and to elucidate molecular structure.
What does a mass spectrometer do?
1. It measures mass better than any other technique.
2. It can give information about chemical structures.
3. It generates a spectrum by separating gas-phase ions of different mass-to-charge ratio (m/z), where m = molecular or atomic mass and z = electrostatic charge unit.
Typical Mass Spectrum
• Characterized by sharp, narrow peaks
• X-axis position: the m/z ratio of a given ion
• Height of peak: relative abundance of a given ion
• Peak intensity indicates the ion’s ability to “fly” (some fly better than others)
[Figure: example spectrum (Voyager Spec #1, base peak at m/z 2185.1); x-axis: Mass (m/z), 499 to 5000; y-axis: relative intensity (%)]
Multiple Charging
Calculating mass-to-charge ratio (m/z)
Consider a peptide with MW of 10000
With ESI-MS, charging occurs by H+ addition: M + nH+ → [M + nH]n+
Resultant ions formed are:
When z = 1, m/z = (10000+1)/1 = 10001
When z = 2, m/z = (10000+2)/2 = 5001
When z = 3, m/z = (10000+3)/3 = 3334.3
When z = 4, m/z = (10000+4)/4 = 2501
When z = 5, m/z = (10000+5)/5 = 2001
Figure from The Expanding Role of MS in Biotechnology, G. Siuzdak
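The charge-state arithmetic on this slide can be sketched in a few lines of Python. The function name `mz` and the `proton_mass` parameter are illustrative choices, not from the slides; the slide rounds the mass of H+ to 1, and the default below does the same (about 1.00728 would be the more precise value).

```python
def mz(neutral_mass, z, proton_mass=1.0):
    """m/z of an ion formed by adding z protons to a molecule of mass neutral_mass.

    proton_mass defaults to 1.0 to mirror the rounding used on the slide.
    """
    if z < 1:
        raise ValueError("z must be a positive integer")
    return (neutral_mass + z * proton_mass) / z


# Charge states 1 to 5 for the 10000 Da peptide from the slide.
for z in range(1, 6):
    print(f"z = {z}: m/z = {mz(10000, z):.1f}")
```

Because each added proton contributes both mass and charge, higher charge states bring a large molecule into the m/z range that the analyzer can measure.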
Stable isotopes of most abundant elements of peptides
Element   Mass      Abundance (%)
H         1.0078    99.985
          2.0141    0.015
C         12.0000   98.89
          13.0034   1.11
N         14.0031   99.64
          15.0001   0.36
O         15.9949   99.76
          16.9991   0.04
          17.9992   0.20
Mass spectrum of a peptide with 94 C-atoms (19 amino acid residues)
• 1981.84: no 13C atoms (all 12C), the “monoisotopic mass”
• 1982.84: one 13C atom
• 1983.84: two 13C atoms
The monoisotopic mass corresponds to the lowest-mass peak.
When the isotopes are clearly resolved, the monoisotopic mass is used as it is the most accurate measurement.
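To see why a 94-carbon peptide shows several isotope peaks, the number of 13C atoms per molecule can be modelled as a binomial draw. This is only a sketch under a simplifying assumption: it considers carbon alone (1.11% 13C, from the isotope table) and ignores the heavy isotopes of H, N and O. The function name is illustrative.

```python
from math import comb


def carbon_isotope_pattern(n_carbons, p13c=0.0111, max_heavy=4):
    """Probability of finding k atoms of 13C among n_carbons, for k = 0..max_heavy.

    Simplified binomial model: carbon only, other elements ignored.
    """
    return [
        comb(n_carbons, k) * p13c**k * (1 - p13c) ** (n_carbons - k)
        for k in range(max_heavy + 1)
    ]


pattern = carbon_isotope_pattern(94)
for k, probability in enumerate(pattern):
    print(f"{k} x 13C: {probability:.3f}")
```

Under this model the all-12C (monoisotopic) peak is not the only large peak: for 94 carbons the one-13C peak is of comparable height, which is why the cluster must be resolved before the monoisotopic mass can be read off.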
Average mass
The average mass corresponds to the centroid of the unresolved peak cluster.
When the isotopes are not resolved, the centroid of the envelope corresponds to the weighted average of all the isotope peaks in the cluster, which is the same as the average or chemical mass.
Mass Calculation (Glycine)
NH2-CH2-COOH: amino acid
R1-NH-CH2-CO-R3: residue
Monoisotopic masses: 1H = 1.007825, 12C = 12.00000, 14N = 14.00307, 16O = 15.99491
Glycine amino acid mass: 5xH + 2xC + 2xO + 1xN = 75.032015 amu
Glycine residue mass: 3xH + 2xC + 1xO + 1xN = 57.021455 amu
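The glycine calculation can be reproduced directly from the element masses on the slide. The helper name `formula_mass` is an illustrative choice; the residue mass is obtained here by subtracting one water (H2O) from the free amino acid, which gives the same 3xH + 2xC + 1xO + 1xN composition.

```python
# Monoisotopic element masses as listed on the slide.
MONOISOTOPIC = {"H": 1.007825, "C": 12.00000, "N": 14.00307, "O": 15.99491}


def formula_mass(atom_counts, masses=MONOISOTOPIC):
    """Sum element masses weighted by the number of atoms of each element."""
    return sum(masses[element] * count for element, count in atom_counts.items())


glycine = {"H": 5, "C": 2, "O": 2, "N": 1}
water = {"H": 2, "O": 1}

glycine_mass = formula_mass(glycine)
residue_mass = glycine_mass - formula_mass(water)  # loss of H2O on peptide bond formation

print(f"Glycine amino acid: {glycine_mass:.6f} amu")
print(f"Glycine residue:    {residue_mass:.6f} amu")
```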
Amino Acid Residue Masses: Monoisotopic Mass
Glycine         57.02147
Alanine         71.03712
Serine          87.03203
Proline         97.05277
Valine          99.06842
Threonine       101.04768
Cysteine        103.00919
Isoleucine      113.08407
Leucine         113.08407
Asparagine      114.04293
Aspartic acid   115.02695
Glutamine       128.05858
Lysine          128.09497
Glutamic acid   129.0426
Methionine      131.04049
Histidine       137.05891
Phenylalanine   147.06842
Arginine        156.10112
Tyrosine        163.06333
Tryptophan      186.07932
Amino Acid Residue Masses: Average Mass
Glycine         57.0520
Alanine         71.0788
Serine          87.0782
Proline         97.1167
Valine          99.1326
Threonine       101.1051
Cysteine        103.1448
Isoleucine      113.1595
Leucine         113.1595
Asparagine      114.1039
Aspartic acid   115.0886
Glutamine       128.1308
Lysine          128.1742
Glutamic acid   129.1155
Methionine      131.1986
Histidine       137.1412
Phenylalanine   147.1766
Arginine        156.1876
Tyrosine        163.1760
Tryptophan      186.2133
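With a residue mass table, the mass of any peptide follows by summing its residues and adding one water for the free termini. The sketch below uses the monoisotopic values from the table; the one-letter keys, the names `RESIDUE_MONO` and `peptide_mass`, and the choice of example peptide (the sequence whose MS/MS spectrum appears later in these slides) are illustrative.

```python
# Monoisotopic residue masses (Da) from the table, keyed by one-letter code.
RESIDUE_MONO = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.0426, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER = 2 * 1.007825 + 15.99491  # H2O, from the element masses used for glycine


def peptide_mass(sequence):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MONO[aa] for aa in sequence.upper()) + WATER


print(f"{peptide_mass('YADSGEGDFLAEGGGVR'):.4f}")
```

Swapping in the average-mass table would give the average (chemical) mass instead; the water term would then need its average value as well.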
Important performance factors
• Mass accuracy: How accurate is the mass measurement?
• Resolution: How well separated are the peaks from each other?
• Sensitivity: How small an amount can be analyzed?
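Resolution
R = m / Δm
Resolution: the ability to separate two adjacent mass peaks (m1 and m2). High resolution separates the target compound from a complex matrix, excludes interferences, and ensures mass accuracy.
• Peaks not separated: poor resolution
• Peaks partially separated: medium resolution
• Peaks fully separated: high resolution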
What if the resolution is not so good?
[Figure: the same peak cluster at better and at poorer resolution; x-axis: Mass, 6130 to 6170]
At lower resolution, the mass measured is the average mass.
Glossary: mass accuracy
• Mass accuracy: the absolute deviation between the measured mass and the true mass of an ion, typically expressed as an error value in ppm (parts per million).
Example: 1000 ± 0.1 = 0.01% = 100 ppm
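The ppm definition translates into two one-line helpers: one to express a measured deviation in ppm, and one to convert a ppm tolerance back into a mass window. Both function names are illustrative.

```python
def ppm_error(measured, true_mass):
    """Signed mass error in parts per million."""
    return (measured - true_mass) / true_mass * 1e6


def ppm_to_da(mass, ppm):
    """Absolute mass window (Da) that corresponds to a ppm tolerance at a given mass."""
    return mass * ppm / 1e6


# The slide's example: a deviation of 0.1 at mass 1000.
print(f"{ppm_error(1000.1, 1000.0):.1f} ppm")
# A 15 ppm error at m/z 2850 expressed in Da.
print(f"{ppm_to_da(2850, 15):.4f} Da")
```

Because ppm is relative, the same ppm value corresponds to a wider absolute window at higher mass.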
Mass measurement accuracy depends on resolution
High resolution means better mass accuracy
• Resolution = 18100: 15 ppm error
• Resolution = 14200: 24 ppm error
• Resolution = 4500: 55 ppm error
[Figure: the same peak recorded at the three resolutions; x-axis: Mass (m/z), 2840 to 2855; y-axis: Counts]
What do all MS instruments have in common?
• Sample introduction
• Ionization
• Minimize collisions, interferences
• Separate masses
• Count ions
• Collect results
Sample Introduction
Inlet → Ion Source → Mass Analyzer → Detector → Data System (with a high vacuum system)
Inlet options: HPLC, GC, flow injection, sample plate
Ion Source
Ion source options: MALDI, ESI, FAB, LSIMS, EI, CI
Mass Spec Principles
Sample → Ionizer → Mass Analyzer → Detector
• Ionizer: find a way to “charge” an atom or molecule
• Mass analyzer: place the charged molecule in a magnetic field or subject it to an electric field and measure its speed or radius of curvature relative to its mass-to-charge ratio
• Detector: detect ions using a microchannel plate or photomultiplier tube
Nobel Prize in Chemistry 2002
"for the development of methods for identification and structure analyses of biological macromolecules"
"for their development of soft desorption ionisation methods for mass spectrometric analyses of biological macromolecules"
• John B. Fenn (b. 1917): ESI
• Koichi Tanaka (b. 1959): MALDI
The prize was shared with work on NMR.
Matrix-assisted laser desorption/ionization (MALDI)
[Figure: a pulsed laser strikes the sample plate; the ions produced pass through an extraction grid]
• Absorption of UV radiation by chromophoric matrix and ionization of matrix
• Dissociation of matrix, phase change to super-compressed gas, charge transfer to analyte molecule
• Expansion of matrix at supersonic velocity, analyte trapped in expanding matrix plume (explosion/“popping”)
Adapted from Nature, 422, 200 (2003)
MALDI
[Figure: MALDI sample plate with 384 spots]
Matrix for MALDI ToF
• 2,5-dihydroxybenzoic acid (DHB)
• α-cyano-4-hydroxycinnamic acid
• 3,5-dimethoxy-4-hydroxycinnamic acid (sinapinic acid)
• Specific compounds for glycoproteins etc.
Example of a MALDI-TOF Mass Spectrum
[Figure: Voyager Spec #1, base peak at m/z 2185.1; x-axis: Mass (m/z), 499 to 5000; y-axis: % Intensity]
Electrospray Ionization (ESI)
[Figure: liquid chromatography → spray needle → ions → nozzle → sampling cone, passing from atmosphere into vacuum]
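The core of mass spectrometry: the mass analyzer
Components: sample introduction system (LC or direct infusion), ionization source (atmospheric pressure), mass analyzer, detector, vacuum system, data processing system
Mass analyzers by resolution:
• Low resolution: quadrupole (Q), ion trap / linear ion trap (IT/LTQ)
• Medium to high resolution: time of flight (TOF)
• Ultra-high resolution: Fourier transform ion cyclotron resonance (FTICR), electrostatic orbital trap (Orbitrap)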
Analyzer Types
• What is the analyzer? The analyzer is the section of the instrument that separates ions of different m/z.
• Many different technologies: magnetic sector, quadrupole, ion trap, ToF
The quadrupole consists of four parallel metal rods
Only ions of a certain m/z will reach the detector for a given ratio of voltages: other ions have unstable trajectories and will collide with the rods.
This allows selection of a particular ion, or scanning by varying the voltages.
Obtains a mass spectrum by sweeping across the entire mass range
MALDI-TOF Mass Spectrometry
Bruker Autoflex
• Now available as ToF/ToF
• Easy to use: walk-up use after training
• Highly automated
• Now can be used for imaging of tissue samples
Q-TOF
[Figure: Q-TOF schematic with nanospray tip, ion source, skimmer, hexapole, quadrupole, hexapole collision cell, hexapole, pusher, TOF with reflectron, and MCP detector]
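Ultra-high-resolution mass spectrometry: Fourier transform (FT)
• Orbitrap (electrostatic orbital trap): ω = √(k / (m/z))
• FT-ICR (Fourier transform ion cyclotron resonance): ω = B / (m/z)
Fourier transform mass spectrometry: after ions enter the mass analyzer, they move in orbits at a characteristic frequency that is related to their mass-to-charge ratio. The ions are then excited to rotate at their characteristic orbital radius, and the detection electrodes record the image current as it varies with time. The time-domain signal is Fourier transformed into a frequency-domain signal, which is then converted into a mass spectrum.
The unique Orbitrap technology
[Figure: Orbitrap analyzer with a central electrode and an outer spindle-shaped electrode]
The only mass analyzer in 30 years to be based on an entirely new principle. Its vacuum is more than 3 orders of magnitude better than that of conventional mass analyzers, and it is extremely stable.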
Orbitrap: gradually becoming the gold standard of mass spectrometric analysis
Orbitrap performance has grown rapidly over 9 years
[Figure: mass resolution (FWHM) versus time progression (year), 1955 to 2015, for Tof / QTof instruments (Bendix Tof, first Q-Tof) and Orbitrap instruments (LTQ Orbitrap, QE, Orbitrap Elite, QE HF, Orbitrap Fusion)]
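The high-performance Q-Exactive series of mass spectrometers
• Q-Exactive (2011)
• Q-Exactive Plus (2013): advanced quadrupole
• Q-Exactive HF (2014): advanced quadrupole, ultra-high-field Orbitrap
Quadrupole plus Orbitrap: high sensitivity, high selectivity, high dynamic range, high resolution, high mass accuracy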
How do mass spectrometers get their names?
Types of ion sources:
• Electrospray (ESI)
• Matrix Assisted Laser Desorption Ionization (MALDI)
Types of mass analyzers:
• Quadrupole (Quad, Q)
• Ion Trap
• Time-of-Flight (TOF)
Either source type can work with either analyzer type: “MALDI-TOF,” “ESI-Quad.” Analyzers can be combined to create “hybrid” instruments: ESI-QQQ, MALDI QQ TOF, Q Trap.
Summary: acquiring a mass spectrum
Inlet (solid, liquid or vapor) → Ion Source → Mass Analyzer → Detector → Mass Spectrum
• Ionization: form ions (charged molecules)
• Mass sorting (filtering): sort ions by mass (m/z)
• Detection: detect ions
Mass spectrometry for proteomics
Breaking Protein into Peptides
Name            Cleave    Don't cleave   N or C term
Trypsin         KR        P              CTERM
Arg-C           R         P              CTERM
Asp-N           BD                       NTERM
Asp-N_ambic     DE                       NTERM
Chymotrypsin    FYWL      P              CTERM
CNBr            M                        CTERM
Formic_acid     D                        CTERM
Lys-C           K         P              CTERM
Lys-C/P         K                        CTERM
PepsinA         FL                       CTERM
Tryp-CNBr       KRM       P              CTERM
TrypChymo       FYWLKR    P              CTERM
Trypsin/P       KR                       CTERM
V8-DE           BDEZ      P              CTERM
V8-E            EZ        P              CTERM
CNBr+Trypsin    M                        CTERM
                KR        P              CTERM
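The rules in this table can be turned into a small in-silico digest. The sketch below is a minimal interpretation, assuming that "Cleave" lists the residues at which the enzyme cuts, "Don't cleave" lists residues that block a cut when they follow the cleavage site, and CTERM/NTERM says whether the cut falls after or before the residue. The function name, its parameters and the test sequence are hypothetical, and missed cleavages are not modelled.

```python
def digest(sequence, cleave, blocked_by="", term="CTERM"):
    """Split a protein sequence into peptides following one row of the enzyme table.

    cleave:     residues at which the enzyme cuts (e.g. "KR" for trypsin)
    blocked_by: residues that prevent a C-terminal cut when they come next (e.g. "P")
    term:       "CTERM" cuts after the residue, "NTERM" cuts before it
    """
    seq = sequence.upper()
    cut_positions = []
    for i, residue in enumerate(seq):
        if residue not in cleave:
            continue
        if term == "CTERM":
            following = seq[i + 1] if i + 1 < len(seq) else ""
            if following and following not in blocked_by:
                cut_positions.append(i + 1)
        elif i > 0:  # NTERM: cut in front of the residue (blocked_by is not applied here)
            cut_positions.append(i)

    peptides, start = [], 0
    for position in cut_positions:
        peptides.append(seq[start:position])
        start = position
    peptides.append(seq[start:])
    return peptides


example = "GASKPLRMDEKFW"  # hypothetical sequence
print("Trypsin:  ", digest(example, "KR", blocked_by="P"))
print("Trypsin/P:", digest(example, "KR"))
print("Asp-N:    ", digest(example, "BD", term="NTERM"))
```

The Trypsin and Trypsin/P rows differ only in the proline rule, so the two digests of the example differ exactly at the K-P bond.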
Protein ID: Two Main Approaches
• Peptide Mass Fingerprinting (PMF)
• Tandem Mass Spectrometry (MS/MS)
Peptide Mass Fingerprinting (PMF)
Generalized Protein Identification by PMF
• Experiment: spot removed from gel → trypsin digest → MS1 of tryptic peptides
• Library: database of sequences (e.g. SwissProt) → artificially trypsinated → artificial spectra built
• MATCH: the measured spectrum is compared with the artificial spectra in the library
Principles of Fingerprinting
>Protein 1
Sequence: acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe
Mass (M+H): 4842.05
Tryptic fragments: acedfhsak, dfqeasdfpk, ivtmeeewendadnfek, qwfe
>Protein 2
Sequence: acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe
Mass (M+H): 4842.05
Tryptic fragments: acek, dfhsadfqeasdfpk, ivtmeeewenk, dadnfeqwfe
>Protein 3
Sequence: acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe
Mass (M+H): 4842.05
Tryptic fragments: acedfhsadfqek, asdfpk, ivtmeeewendak, dnfeqwfe
[Figure: mass spectrum of the tryptic fragments of each protein]
The three proteins have the same mass, but each gives a different set of tryptic fragment masses.
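The fingerprinting idea can be checked numerically: digest each of the three sequences with the trypsin rule (cut after K or R, not before P), compute the singly protonated mass of every fragment, and compare. This is a sketch, not a search engine. It uses the monoisotopic residue masses from the earlier table, takes water as 18.01056 and the proton as 1.00728 (assumed constants), and uses the sequences as given in the Sequence column; all function names are illustrative.

```python
RESIDUE_MONO = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.0426, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER, PROTON = 18.01056, 1.00728  # assumed constants (Da)


def tryptic_peptides(seq):
    """Cut after K or R unless the next residue is P; no missed cleavages."""
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        is_last = i == len(seq) - 1
        if residue in "KR" and not is_last and seq[i + 1] != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    peptides.append(seq[start:])
    return peptides


def mh(seq):
    """Singly protonated monoisotopic mass (M+H)."""
    return sum(RESIDUE_MONO[aa] for aa in seq) + WATER + PROTON


PROTEINS = {
    "Protein 1": "ACEDFHSAKDFQEASDFPKIVTMEEEWENDADNFEKQWFE",
    "Protein 2": "ACEKDFHSADFQEASDFPKIVTMEEEWENKDADNFEQWFE",
    "Protein 3": "ACEDFHSADFQEKASDFPKIVTMEEEWENDAKDNFEQWFE",
}

fingerprints = {}
for name, seq in PROTEINS.items():
    fingerprints[name] = sorted(round(mh(p), 2) for p in tryptic_peptides(seq))
    print(name, f"intact (M+H) = {mh(seq):.2f}", fingerprints[name])
```

The intact masses agree because the three sequences are rearrangements of the same residues, while the fragment mass lists differ, which is what makes the peptide mass fingerprint discriminating where the intact mass is not.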
Can We Do Better?
Single Stage MS Tandem MS
Protein Identification Strategy
Protein mixture → peptides → 1D, 2D, 3D peptide separation → tandem MS (Q1 → Q2 collision cell → Q3) → tandem mass spectrum → correlative sequence database searching (theoretical vs. acquired spectra) → protein identification
Cleavages Observed in MS/MS of Peptides
[Figure: peptide backbone -HN-CH(R)-CO-NH-CH(R’)-CO-NH- showing the cleavage sites that give the a, b and c ions and the x, y and z ions; part of the figure is labelled “low energy”]
Fragmenting a Peptide
http://www.proteomecenter.org/course/2005.jan.eng.pdf
MS/MS Peptide Fragmentation
Ala-Gly-His-Leu-….Phe-Glu-Cys-Tyr
b1 y1 b2 y2 b3 y3 b4 y4 b5 y5
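The b and y ion series can be computed from residue masses: a singly charged b ion is the sum of the N-terminal residues plus a proton, and a singly charged y ion is the sum of the C-terminal residues plus water plus a proton. The sketch below assumes those standard definitions and the constants 18.01056 (water) and 1.00728 (proton). The middle of the slide's peptide is elided, so the example uses only its first four residues (Ala-Gly-His-Leu) as if they were a complete peptide, purely for illustration.

```python
RESIDUE_MONO = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.0426, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER, PROTON = 18.01056, 1.00728  # assumed constants (Da)


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[i-1] contains the first i residues; y_ions[i-1] contains the last i residues.
    """
    masses = [RESIDUE_MONO[aa] for aa in peptide.upper()]
    n = len(masses)
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions


b_ions, y_ions = fragment_ions("AGHL")  # illustrative short peptide
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i} = {b:.4f}   y{i} = {y:.4f}")
```

Successive ions in one series differ by a residue mass, which is how a sequence is read from an MS/MS spectrum.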
Example of an MS/MS spectrum
[Figure: MS/MS spectrum of the peptide YADSGEGDFLAEGGGVR; x-axis: m/z, 400 to 1600; y-axis: relative abundance]
Mascot MS/MS Form
Mascot MS/MS Output
MS/MS vs PMF
Advantages
• Provides precise sequence-specific data
• More informative than PMF methods (>90%)
• Can be used for de-novo sequencing (not entirely dependent on databases)
• Can be used to ID post-translational modifications
Disadvantages
• Requires more handling, refinement and sample manipulation
• Requires more expensive and complicated equipment
• Requires high level expertise
Quantitative proteomics
• The goal of a quantitative proteomics analysis is to determine the changes in protein expression in a given cell from a given organism when subjected to a stimulus.
Label-free Quantitation
Two examples of label-free quantitation:
• XIC: extracted ion chromatogram
• SC: spectral counting
These avoid isotopes, but the instrumentation needs to be very reproducible.
Label-free Quantitation - XIC
• If 660.96 is identified by MS/MS as a peptide coming from protein X, extract the XIC of 660.96
• Only one peak, because none of the other peptides have a molecular mass of 660.96
• Find the peak area
• Repeat the same process for all the other peaks B, C and D
• Sum all the peak areas in the XICs
• Concentration of protein X is proportional to total area (need an internal standard)
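The XIC steps can be sketched as: pick out the intensity near the target m/z in every scan, then integrate that trace over retention time. The scan layout (a list of `(retention_time, [(mz, intensity), ...])` tuples), the function names, the tolerance and the toy numbers are all hypothetical; real data would come from the instrument's file format.

```python
def extract_xic(scans, target_mz, tol=0.01):
    """Return (retention_time, intensity) pairs for ions within target_mz ± tol."""
    trace = []
    for retention_time, peaks in scans:
        intensity = sum(i for mz, i in peaks if abs(mz - target_mz) <= tol)
        trace.append((retention_time, intensity))
    return trace


def peak_area(trace):
    """Area under an XIC trace by the trapezoidal rule."""
    area = 0.0
    for (t0, i0), (t1, i1) in zip(trace, trace[1:]):
        area += (t1 - t0) * (i0 + i1) / 2
    return area


# Toy MS1 scans: (retention time in min, list of (m/z, intensity)).
scans = [
    (10.0, [(660.96, 0.0), (500.25, 40.0)]),
    (10.1, [(660.96, 100.0), (500.25, 35.0)]),
    (10.2, [(660.96, 300.0)]),
    (10.3, [(660.96, 100.0), (812.40, 20.0)]),
    (10.4, [(660.96, 0.0)]),
]

trace = extract_xic(scans, 660.96)
area = peak_area(trace)
print(trace)
print(f"peak area = {area:.1f}")
```

Repeating this for each peptide assigned to protein X and summing the areas gives the total that is compared, via an internal standard, between samples.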
Label-free Quantitation – SC
• MS of each peak, e.g. MS of peak B
• The peak at 2116.8 is the 3rd highest, so it is selected for MS/MS
• Use e.g. Protein Prospector to determine that this peptide sequence is derived from protein X
• Count all the MS/MS spectra which came from a peptide that can be identified as coming from protein X (spectral count)
• Spectral count is proportional to protein concentration (need an internal standard)
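Spectral counting reduces to tallying identified MS/MS spectra per protein and relating the tallies to an internal standard. In this sketch the identification records, peptide strings, protein names and function names are all hypothetical; in practice the records would come from a search engine's results.

```python
from collections import Counter


def spectral_counts(identifications):
    """Count identified MS/MS spectra per protein.

    identifications: iterable of (spectrum_id, peptide, protein) tuples.
    """
    return Counter(protein for _spectrum_id, _peptide, protein in identifications)


def relative_to_standard(counts, standard):
    """Divide every protein's spectral count by that of the internal standard."""
    return {protein: count / counts[standard] for protein, count in counts.items()}


# Toy identification list (hypothetical values).
identifications = [
    (1, "AAGK", "protein X"),
    (2, "LLDER", "protein X"),
    (3, "AAGK", "protein X"),
    (4, "GGSVK", "standard"),
    (5, "TTPLR", "standard"),
    (6, "MMEDK", "protein Y"),
]

counts = spectral_counts(identifications)
print(counts)
print(relative_to_standard(counts, "standard"))
```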
Stable Isotope Labeling Strategies
Metabolic stable isotope labeling
[Figure: workflow stages: protein labeling, digest, data collection (mass spectrometry), data analysis (intensity vs. m/z)]
Aebersold and Mann, Nature, 422, 198-207, 2003
SILAC – stable isotope labelling of amino acids in cell culture
Grow cells in media containing isotopically labelled amino acids
Typically e.g.
• Lys4: alkyl D x 4 substitution, +4 units
• Arg6: 13C x 6 substitution, +6 units
• Lys8: 13C x 6 + 15N x 2 substitution, +8 units
• Arg10: 13C x 6 + 15N x 4 substitution, +10 units
[Figure: structures of lysine and arginine]
Labelling arginine and lysine ensures that all tryptic peptides are labelled (trypsin cuts at K or R)
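The spacing between the light and heavy forms of a SILAC peptide follows from how many K and R residues it contains. The sketch below uses the nominal shifts listed on the slide (+4, +6, +8, +10 units); exact shifts differ slightly in the decimals. The dictionary, function names and example peptides are illustrative.

```python
# Nominal mass shifts per labelled residue, as listed on the slide.
LABEL_SHIFT = {
    "Lys4": ("K", 4),
    "Arg6": ("R", 6),
    "Lys8": ("K", 8),
    "Arg10": ("R", 10),
}


def silac_shift(peptide, labels):
    """Nominal heavy-minus-light mass difference for a peptide grown with the given labels."""
    sequence = peptide.upper()
    shift = 0
    for label in labels:
        residue, delta = LABEL_SHIFT[label]
        shift += sequence.count(residue) * delta
    return shift


def mz_spacing(peptide, labels, z):
    """Distance between the light and heavy peaks on the m/z axis at charge z."""
    return silac_shift(peptide, labels) / z


heavy = ["Lys8", "Arg10"]
for peptide in ["IVTMEEEWENDADNFEK", "AKGR"]:
    print(peptide, silac_shift(peptide, heavy), mz_spacing(peptide, heavy, 2))
```

A fully tryptic peptide normally ends in a single K or R, so the heavy and light forms appear as a pair with a predictable spacing; a missed cleavage increases the shift.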
SILAC – stable isotope labelling of amino acids in cell culture
Lyse, extract protein, separate, trypsin digest, MS
SILAC mouse?
Protein Chemical labeling
Stable Isotope Labeling Strategies
Isotope tagging by chemical reaction
[Figure: workflow stages: protein labeling, with label and digest steps, data collection (mass spectrometry), data analysis (intensity vs. m/z)]
Aebersold and Mann, Nature, 422, 198-207, 2003
iTRAQ: Isobaric Tag for Relative and Absolute Quantification
• Reacts with NH2 groups
• Adds a tag of mass 145 to terminal NH2 groups and lysines
• MS/MS fragmentation splits the tag into a reporter ion and the rest of the molecule
iTRAQ reagents
[Figure: reagent structures, which differ in their 13C, 15N and 18O content]
• A group of Mw = 28 is lost: produces an ion of Mw = 117 after fragmentation
• A group of Mw = 29 is lost: produces an ion of Mw = 116 after fragmentation
• A group of Mw = 30 is lost: produces an ion of Mw = 115 after fragmentation
• etc.
iTRAQ MS/MS spectrum
[Figure: annotated MS/MS spectrum of a labelled peptide]
• One part of the spectrum gives the sequence information
• The low molecular weight region 114-117 contains reporter ions
• The ratio tells us something about the relative abundance of this protein in the 4 samples
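Reading out iTRAQ ratios amounts to summing the intensity near each reporter m/z and normalising to one channel. The sketch assumes nominal reporter positions 114 to 117 with a ±0.2 window, and the peak list, function names and reference channel are hypothetical.

```python
REPORTERS = (114, 115, 116, 117)  # nominal reporter ion m/z values


def reporter_intensities(peaks, tol=0.2):
    """Sum the intensity of peaks lying within tol of each nominal reporter m/z."""
    return {
        reporter: sum(i for mz, i in peaks if abs(mz - reporter) <= tol)
        for reporter in REPORTERS
    }


def relative_abundance(intensities, reference=114):
    """Express every channel relative to the chosen reference channel."""
    reference_intensity = intensities[reference]
    return {reporter: value / reference_intensity for reporter, value in intensities.items()}


# Toy MS/MS peak list: (m/z, intensity). Higher m/z peaks stand in for sequence ions.
peaks = [
    (114.11, 2000.0), (115.11, 4000.0), (116.11, 1000.0), (117.11, 2000.0),
    (175.12, 800.0), (402.20, 1500.0),
]

intensities = reporter_intensities(peaks)
ratios = relative_abundance(intensities)
print(intensities)
print(ratios)
```

Because the tags are isobaric, the labelled peptide from all four samples is selected as one precursor; only the reporter region separates the samples.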
iTRAQ reagent-8plex Protein Quantitation
Protein Dimethylation
• + 28 Da, ‘light’
• + 32 Da, ‘intermediate’
• + 36 Da, ‘heavy’
Formaldehyde is used to globally label the N-terminus of peptides and the ε-amino group of K through reductive amination.
SRM and Western Blots
Quality of the assay
• WB: based on a single antibody
• SRM: depends on isotopically labeled reference peptides
Quality of the results
• WB: depends on the intensity of a band on the blot
• SRM: quantification uses multiple signals that are integrated into a composite score indicating the protein quantity
Performance characteristics
• Limit of detection, linear dynamic range, ability to multiplex, and reproducibility. For most of these characteristics MS-based methods now outperform Western blotting.
An example of SRM application
A set of candidate biomarkers is verified by applying SRM assays for the candidate proteins to fractionated sera from patient and control groups, for example. Mutants of the target proteins can be monitored in the sampled population.
Human Proteome
2014
The adult/fetal tissues and haematopoietic cell types
Work flow
• ~25 million high-resolution tandem mass spectra
• >2,000 LC-MS/MS runs
• Using MASCOT, SEQUEST and Percolator
• 293,000 non-redundant peptides (q value < 0.01, with a median mass measurement error of ~260 parts per billion)
• The median numbers of peptides and corresponding tandem mass spectra identified per gene are 10 and 37, respectively
• The median protein sequence coverage was ~28%
Tissue-supervised hierarchical clustering
Novel protein-coding regions in the human genome
16 million MS/MS spectra that did not match currently annotated proteins
Conclusion
Proteomics is extremely valuable for understanding biological processes and advancing the field of systems biology.
“The ultimate goal of systems biology is the integration of data from these observations into models that might, eventually, represent and simulate the physiology of the cell.”