Interpreting MS/MS Proteomics Results
Brian C. Searle, Proteome Software Inc., Portland, Oregon USA, [email protected]
NPC Progress Meeting (February 2nd, 2006)
The first thing I should say is that none of the material presented is original research done at Proteome Software, but we do strive to make the tools presented here available in our software product, Scaffold. With that caveat aside…
Illustrated by Toni Boudreault
We advocate for Andrew Keller and Alexey Nesvizhskii's Peptide Prophet approach because it calculates a true probability, not just a p-value.
Figure: 10 Protein Control Sample (Q-ToF), X! Tandem approach. Axes: Mascot: Ion-Identity Score and # of Matches. Labels: "Other Incorrect IDs for Spectrum", "Possibly Correct?".
So if you remember, X! Tandem considers the best peptide match for a spectrum against a distribution of incorrect matches.
Figure: 10 Protein Control Sample (Q-ToF), Peptide Prophet approach. Axes: Mascot: Ion-Identity Score and # of Matches. Labels: "ALL Other 'Best' Matches", "Possibly Correct?".
Keller, A. et al. Anal. Chem. 74, 5383-5392 (2002)
Well, Peptide Prophet looks across the entire sample, and not at just one spectrum at a time. It compares the best match against all of the other best matches in the sample, which is clearly bimodal.
The low mode represents matches that are most likely wrong while the high mode represents matches that are probably right.
Figure: 10 Protein Control Sample (Q-ToF), Peptide Prophet approach. Axes: Mascot: Ion-Identity Score and # of Matches. Labels: "Incorrect", "Correct", "Possibly Correct?".
Peptide Prophet fits two distribution curves to the modes, following the assumption that the lower scoring distribution is "incorrect" and that the higher scoring distribution is "correct".
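The curve fitting described above can be sketched as a two-component mixture fit. This is only an illustration, not the Peptide Prophet implementation: it assumes both modes are Gaussian (a simplification made here), uses a plain expectation-maximization loop, and runs on synthetic scores. The function name `fit_two_modes` and every number in the demo are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Height of a Gaussian curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_modes(scores, iterations=200):
    """Fit a two-Gaussian mixture to the scores with EM (assumed model).

    Returns (fraction_correct, (mean, sd) of low mode, (mean, sd) of high mode).
    """
    ordered = sorted(scores)
    n = len(ordered)
    # Crude starting guess: one mode near each quartile, both as wide as the data.
    mean_lo, mean_hi = ordered[n // 4], ordered[(3 * n) // 4]
    centre = sum(ordered) / n
    spread = math.sqrt(sum((s - centre) ** 2 for s in ordered) / n)
    sd_lo = sd_hi = max(spread, 1e-6)
    frac_hi = 0.5

    for _ in range(iterations):
        # E-step: how strongly each score belongs to the high ("correct") mode.
        resp = []
        for s in ordered:
            hi = frac_hi * normal_pdf(s, mean_hi, sd_hi)
            lo = (1.0 - frac_hi) * normal_pdf(s, mean_lo, sd_lo)
            resp.append(hi / (hi + lo) if hi + lo > 0.0 else 0.5)

        # M-step: re-estimate each mode from its weighted share of the scores.
        w_hi = max(sum(resp), 1e-9)
        w_lo = max(n - w_hi, 1e-9)
        mean_hi = sum(r * s for r, s in zip(resp, ordered)) / w_hi
        mean_lo = sum((1.0 - r) * s for r, s in zip(resp, ordered)) / w_lo
        var_hi = sum(r * (s - mean_hi) ** 2 for r, s in zip(resp, ordered)) / w_hi
        var_lo = sum((1.0 - r) * (s - mean_lo) ** 2 for r, s in zip(resp, ordered)) / w_lo
        sd_hi = max(math.sqrt(var_hi), 1e-6)
        sd_lo = max(math.sqrt(var_lo), 1e-6)
        frac_hi = w_hi / n

    return frac_hi, (mean_lo, sd_lo), (mean_hi, sd_hi)


# Synthetic bimodal "best match" scores: hypothetical values, not real data.
rng = random.Random(0)
scores = [rng.gauss(-20.0, 5.0) for _ in range(700)]
scores += [rng.gauss(15.0, 8.0) for _ in range(300)]
frac, low_mode, high_mode = fit_two_modes(scores)
print(f"fraction in high mode: {frac:.2f}")
print(f"low mode (mean, sd): {low_mode}")
print(f"high mode (mean, sd): {high_mode}")
```

The fit returns the share of matches in the high mode along with the shape of each mode, which is everything the Bayesian step needs.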
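Figure: 10 Protein Control Sample (Q-ToF). Axes: Mascot: Ion-Identity Score and # of Matches. Labels: "Incorrect", "Correct", "Possibly Correct?".
$$p(+ \mid D) = \frac{p(D \mid +)\,p(+)}{p(D \mid +)\,p(+) + p(D \mid -)\,p(-)}$$
where $D$ is the observed score, $+$ denotes a correct match, and $-$ an incorrect one.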
These two distributions can be analyzed using Bayesian statistics with this formula. Now that formula looks pretty complex, but…
It just calculates the height of the correct distribution at a particular score, divided by the combined height of both distributions at that score.
Figure: 10 Protein Control Sample (Q-ToF), "Correct" and "Incorrect" distributions over Mascot: Ion-Identity Score, with the formula annotated. Numerator: prob of having score and being correct. Denominator: prob of having score.
This is essentially the probability of having that score and being correct, divided by the probability of just having that score.
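As a rough sketch of that ratio, the function below evaluates the weighted height of the "correct" curve at a score and divides it by the weighted heights of both curves. Gaussian curves, the name `posterior_correct`, and the parameter values are assumptions made for illustration only.

```python
import math


def normal_pdf(x, mean, sd):
    """Height of a Gaussian curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_correct(score, frac_correct, correct, incorrect):
    """Probability that a match with this score is correct.

    frac_correct is the overall share of correct matches; correct and
    incorrect are (mean, sd) pairs describing the two assumed Gaussian curves.
    """
    # Probability of having this score and being correct.
    height_correct = frac_correct * normal_pdf(score, *correct)
    # Probability of having this score and being incorrect.
    height_incorrect = (1.0 - frac_correct) * normal_pdf(score, *incorrect)
    # Far out in the tails both heights can underflow to zero; this sketch
    # does not guard against that case.
    return height_correct / (height_correct + height_incorrect)


# Hypothetical curves: equal shares, modes at -10 and +10, same width.
for score in (-10.0, 0.0, 10.0):
    p = posterior_correct(score, 0.5, (10.0, 5.0), (-10.0, 5.0))
    print(f"score {score:+.0f}: probability correct = {p:.4f}")
```

With symmetric curves and equal shares, a score halfway between the two modes is equally likely to be right or wrong, and the probability climbs as the score moves toward the high mode.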
Figure: "Correct" and "Incorrect" distributions with a "Possibly Correct?" match. Axes: Mascot: Ion-Identity Score and # of Matches.
This is a neat method because it actually considers the likelihood of being correct, unlike X! Tandem and Mascot, which only calculate the probability of being incorrect. It's because of this that Peptide Prophet can produce a true probability, which is important when the sample characteristics change.
Figure: Q-ToF. "Correct" and "Incorrect" distributions with a "Possibly Correct?" match. Axes: Mascot: Ion-Identity Score and # of Matches.
For example, the control sample we've been looking at was derived from Q-ToF data, which produces pretty high quality results.
Figure: two panels, Q-ToF and Ion Trap, each showing "Correct" and "Incorrect" distributions with a "Possibly Correct?" match. Axes: Mascot: Ion-Identity Score and # of Matches.
If you compare that to the same sample run on an Ion Trap, the probability of being correct is greatly diminished. If you'll note, the Incorrect distribution doesn't change very much between the two analyses; however, the likelihood that the identification is right changes dramatically!
Figure: Ion Trap. "Correct" and "Incorrect" distributions with a "Possibly Correct?" match. Axes: Mascot: Ion-Identity Score and # of Matches.
Because Peptide Prophet considers the correct distribution, it is immune to fluctuations between samples. P-values and E-values don't consider this information, so they can't be compared across multiple samples, or across different examinations of the same sample. This is why we need to use Peptide Prophet for comparing two different search engines.
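To make the contrast concrete, here is a small sketch that scores the same match in two hypothetical samples. Both share one "incorrect" curve, so a p-value computed from that curve alone is identical in each, while the probability of being correct drops when fewer correct matches are present. The Gaussian curves, the function names, and all numbers (including the instrument labels attached to them) are illustrative assumptions, not measured values.

```python
import math


def normal_pdf(x, mean, sd):
    """Height of a Gaussian curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def p_value(score, incorrect):
    """Chance that an incorrect match scores at least this high.

    Uses only the "incorrect" curve, given as a (mean, sd) pair.
    """
    mean, sd = incorrect
    return 0.5 * math.erfc((score - mean) / (sd * math.sqrt(2.0)))


def posterior_correct(score, frac_correct, correct, incorrect):
    """Probability that a match with this score is correct (uses both curves)."""
    height_correct = frac_correct * normal_pdf(score, *correct)
    height_incorrect = (1.0 - frac_correct) * normal_pdf(score, *incorrect)
    return height_correct / (height_correct + height_incorrect)


INCORRECT = (-20.0, 5.0)  # hypothetical curve, shared by both samples
CORRECT = (15.0, 8.0)     # hypothetical curve for correct matches
# Hypothetical share of correct matches in each run of the same sample.
FRACTION_CORRECT = {"Q-ToF": 0.30, "Ion Trap": 0.03}

score = -5.0
results = {}
for instrument, frac in FRACTION_CORRECT.items():
    results[instrument] = (
        p_value(score, INCORRECT),
        posterior_correct(score, frac, CORRECT, INCORRECT),
    )
    p, prob = results[instrument]
    print(f"{instrument}: p-value = {p:.5f}, probability correct = {prob:.3f}")
```

The p-value column does not move between the two rows because nothing in its calculation looks at the correct matches; only the second column reflects the change in the sample.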
Figure: Consider Multiple Algorithms? Scatter plot with axes Mascot: Ion-Identity Score and X! Tandem: -log(E-Value).
So going back to the scatter plot between X! Tandem and Mascot, we can use Peptide Prophet to compute the score threshold that represents a 95% cut-off…
Figure: Consider Multiple Algorithms? Scatter plot with axes Mascot: Ion-Identity Score and X! Tandem: -log(E-Value), with thresholds marked: X! Tandem: 2.6 = 95%, Mascot: -2.5 = 95%.
Like so.
This allows you to fairly consider the answers from both search engines simultaneously. The important thing to note is that if you looked at a different sample, these thresholds should change depending on the height of the correct distributions.
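One way to picture the 95% cut-off is as the score at which the probability of being correct first reaches 0.95. The sketch below finds that score by bisection for two hypothetical samples that differ only in how many correct matches they contain. Gaussian curves, the function name `score_threshold`, and all numbers are assumptions for illustration; the same search would be run separately on each search engine's own score scale.

```python
import math


def normal_pdf(x, mean, sd):
    """Height of a Gaussian curve at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_correct(score, frac_correct, correct, incorrect):
    """Probability that a match with this score is correct."""
    height_correct = frac_correct * normal_pdf(score, *correct)
    height_incorrect = (1.0 - frac_correct) * normal_pdf(score, *incorrect)
    return height_correct / (height_correct + height_incorrect)


def score_threshold(target, frac_correct, correct, incorrect, tol=1e-6):
    """Score at which the probability of being correct reaches target.

    Bisects between the two mode centres, assuming the probability rises
    steadily over that interval (true for the hypothetical curves below).
    """
    lo, hi = incorrect[0], correct[0]
    if not (posterior_correct(lo, frac_correct, correct, incorrect) < target
            < posterior_correct(hi, frac_correct, correct, incorrect)):
        raise ValueError("target probability is not bracketed by the two modes")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if posterior_correct(mid, frac_correct, correct, incorrect) < target:
            lo = mid
        else:
            hi = mid
    return hi


INCORRECT = (-20.0, 5.0)  # hypothetical "incorrect" curve (mean, sd)
CORRECT = (15.0, 8.0)     # hypothetical "correct" curve (mean, sd)

# Same curves, different share of correct matches in the sample.
rich_sample = score_threshold(0.95, 0.30, CORRECT, INCORRECT)
poor_sample = score_threshold(0.95, 0.03, CORRECT, INCORRECT)
print(f"95% threshold, many correct matches: {rich_sample:.2f}")
print(f"95% threshold, few correct matches:  {poor_sample:.2f}")
```

The sample with fewer correct matches needs a higher score to reach the same 95% confidence, which is why a threshold computed on one sample should not be reused on another.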
Conclusion
• All search engines use different criteria, producing different scores
• Using multiple search engines simultaneously yields better results
• Peptide Prophet can normalize search engine results
So in conclusion, all of the search engines look at different criteria.
And we can leverage this to identify more peptides.