1
Information-Theoretic Mass Spectral Library Search
Arvind Visvanathan
CSCE 990 Seminar in Multi-Dimensional Chromatography Systems, Informatics, and Applications
Information-Theoretic Mass Spectral Library Search CSCE 990 – GCxGC Seminar
Outline: Introduction, Related Work, Method, Results and Discussion
2
Outline
• Introduction
  – Mass spectrum search types
• Related Work
  – Other techniques
    • NIST, PBM, DotMap
• Method
  – Probability and Information
  – Normalized distribution function
• Results
• Conclusion
3
Introduction – Mass Spectrum
[Figure: mass spectrum of decane; x-axis: m/z, y-axis: intensity]
4
Introduction – Mass Spectrum Search
[Diagram: MS Library + Unknown Spectrum → Search Algorithm → Potential Matches]
5
Introduction – Search Types
• Identity search
  – Unknown mass spectrum present in library
  – Looking for exact spectrum
• Similarity search
  – Unknown mass spectrum not present in library
  – Looking for similar spectrum
6
Introduction – MS Search Applications
• Steroid detection in athletes
• Monitor patient breath during surgery
• Composition of molecular species found in space
• Honey adulterated with corn syrup
• Locate oil deposits
• Monitor fermentation process in the biotechnology industry
• Detect dioxins in contaminated fish
7
Related Work – NIST MS-Search [Stein ‘94]
• Pre-search the library for the unknown spectrum
  – Reduce search domain (160K → 4K compounds)
• Compute match factor for each compound in the pre-search result
• Match Factor (MF)
  – Range 0-999
  – Higher is better
• Pre-search result sorted based on MF value
• Pick the topmost compounds as possible matches
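The steps on this slide (pre-search, score, sort, pick the top hits) can be sketched in a few lines of Python. This is a minimal illustration and not the NIST implementation: the pre-search rule (`n_peaks`, `min_shared`), the function names, and the placeholder score (a plain cosine similarity scaled to 0-999) are all assumptions made for the example.

```python
from math import sqrt

# A spectrum is a dict mapping m/z (int) to intensity (float).

def top_peaks(spectrum, n):
    """Return the m/z values of the n most intense peaks."""
    ranked = sorted(spectrum, key=spectrum.get, reverse=True)
    return set(ranked[:n])

def presearch(unknown, library, n_peaks=3, min_shared=2):
    """Shrink the search domain: keep entries sharing enough major peaks.

    The sharing rule is a hypothetical stand-in for the real pre-search.
    """
    wanted = top_peaks(unknown, n_peaks)
    return {
        name: spec
        for name, spec in library.items()
        if len(wanted & top_peaks(spec, n_peaks)) >= min_shared
    }

def placeholder_match_factor(unknown, candidate):
    """Cosine similarity scaled to 0-999 (stand-in for the real MF)."""
    shared = unknown.keys() & candidate.keys()
    dot = sum(unknown[mz] * candidate[mz] for mz in shared)
    norm = sqrt(sum(v * v for v in unknown.values())) * sqrt(
        sum(v * v for v in candidate.values())
    )
    return min(999, round(999 * dot / norm)) if norm else 0

def search(unknown, library, top_k=5):
    """Pre-search, score every survivor, sort by score, return the top hits."""
    candidates = presearch(unknown, library)
    scored = [
        (placeholder_match_factor(unknown, spec), name)
        for name, spec in candidates.items()
    ]
    scored.sort(reverse=True)  # higher MF is better
    return scored[:top_k]
```

Calling `search(unknown, library)` returns `(score, name)` pairs with the best match first. Only the spectra that survive the pre-search are scored, which is the point of reducing the search domain before computing match factors.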
8
Related Work – NIST MS-Search [Stein ‘94]
• Match Factor Computation [Stein ‘94]
  – Term 1 (F1): Mass-weighted normalized dot product
  – Term 2 (F2): Relative intensities of adjacent peaks in both spectra
  – MF: Combination of F1 & F2
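The formula for Term 1 did not survive in the transcript. As a rough illustration only, one common form of a mass-weighted normalized dot product weights each peak by a power of its m/z and a power of its intensity, then takes the squared cosine between the two weighted spectra. The function name and the exponents `mass_power` and `intensity_power` below are assumptions for the sketch, not values taken from the slide.

```python
def weighted_dot_product(unknown, library, mass_power=1.0, intensity_power=0.5):
    """Mass-weighted normalized dot product in [0, 1].

    Spectra are dicts mapping m/z to intensity. Each peak gets the weight
    (m/z ** mass_power) * (intensity ** intensity_power); both exponents
    are assumed defaults chosen for illustration.
    """
    def weights(spectrum):
        return {
            mz: (mz ** mass_power) * (inten ** intensity_power)
            for mz, inten in spectrum.items()
        }

    wu, wl = weights(unknown), weights(library)
    # Only m/z values present in both spectra contribute to the cross term.
    cross = sum(wu[mz] * wl[mz] for mz in wu.keys() & wl.keys())
    norm = sum(w * w for w in wu.values()) * sum(w * w for w in wl.values())
    return (cross * cross) / norm if norm else 0.0
```

Multiplying the result by 999 puts it on the 0-999 scale used for match factors. Identical spectra score 1, and spectra with no m/z in common score 0.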
9
Related Work – NIST MS-Search [Stein ‘94]
Example spectra:

m/z    C-1 Intensity    C-2 Intensity
35     100              100
36     1                1
37     1                2
45     999              999
55     200              200

Comparison results:

       Compare C-1 & C-1    Compare C-1 & C-2
F1     999                  999
F2     999                  824
MF     999                  925
10
Related Work – Probability Based Matching [McLafferty et al. ‘75]
• Confidence Value (K) instead of MF
• Four components for each m/z
  – Term 1 : U : Based on the uniqueness of an m/z value
  – Term 2 : A : Intensity contribution to the confidence
  – Term 3 : W : Window factor (measure of agreement)
  – Term 4 : D : Dilution factor (measure of purity)
  – K = ∑ (U + A + W – D), summed over each m/z
11
Related Work – DotMap [Synovec et al. ‘04]
[Figure: DotMap; labels: Fumaric acid, Adipic acid, Lactic acid]
12
Related Work – DotMap [Synovec et al. ‘04]
• Inverse problem: given the spectrum of a compound of interest, find where it occurs in the data
• DotMap computed across the image
• Higher valued areas indicate presence of compound of interest
• Multiple compounds of interest
  – Compute DotMap overlay
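The idea of computing a DotMap across the image can be sketched as follows. This is a minimal illustration under stated assumptions: the image is a 2-D grid whose pixels each hold a spectrum on a shared m/z grid, the per-pixel score is a normalized dot product (the slide does not say whether the original method normalizes), and the overlay takes the per-pixel maximum over targets. The names `dot_map` and `overlay` are hypothetical.

```python
from math import sqrt

def cosine(a, b):
    """Normalized dot product of two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def dot_map(image, target):
    """Score every pixel spectrum against the target spectrum.

    image[i][j] is the intensity vector at pixel (i, j); high values in the
    result suggest the compound of interest is present at that pixel.
    """
    return [[cosine(pixel, target) for pixel in row] for row in image]

def overlay(maps):
    """Combine DotMaps for several targets (assumed rule: per-pixel maximum)."""
    return [[max(values) for values in zip(*rows)] for rows in zip(*maps)]
```

For several compounds of interest, compute one map per target spectrum and pass the list of maps to `overlay`. Thresholding the result would then mark candidate regions, with the threshold itself being a tuning choice.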
13
Related Work – DotMap [Synovec et al. ‘04]
14
Related Work – DotMap [Synovec et al. ‘04]
15
Method – Motivation
• NIST MS-Search [Stein ‘94]
  – No domain information utilized
• PBM [McLafferty et al. ‘75]
  – Old technique (‘75)
  – Ad hoc domain information utilization
• DotMap
  – No domain information utilized