Beyond Database Search: PTMs, Mutations & Full Sequence Coverage. Bin Ma, Professor, University of Waterloo. From Database Search to Full Protein Sequence Analysis (从搜库到蛋白全序列分析).

Dec 25, 2015


Bethany Shaw
Transcript
Page 1:

Beyond Database Search: PTMs, Mutations & Full Sequence Coverage

Bin Ma, Professor, University of Waterloo

From Database Search to Full Protein Sequence Analysis

Page 2:

Protein identification by database search

1. Enzymatic digestion.
2. LC-MS/MS.
3. Identify a peptide for each MS/MS spectrum by database search.
4. Report the proteins that contain multiple unique peptides.

“Multiple”?
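Step 4 of the workflow above can be sketched as follows. This is a minimal illustration, not the method of any particular search engine: the function name `report_proteins`, the `min_unique` parameter, and the toy sequences are all assumptions made for the example.

```python
from collections import defaultdict

def report_proteins(psms, proteins, min_unique=2):
    """Report proteins supported by at least `min_unique` unique peptides.

    psms: iterable of identified peptide strings (one per spectrum).
    proteins: dict mapping protein name -> sequence.
    A peptide counts as "unique" here if it occurs in exactly one protein.
    """
    support = defaultdict(set)
    for pep in set(psms):
        hits = [name for name, seq in proteins.items() if pep in seq]
        if len(hits) == 1:  # peptide maps to a single protein
            support[hits[0]].add(pep)
    return sorted(name for name, peps in support.items()
                  if len(peps) >= min_unique)

# Toy data (hypothetical sequences, for illustration only).
toy_db = {
    "protA": "MKLVNELTEFAKTCVADESHAGCEK",
    "protB": "MKAAFKSLCAFKDDPHACYSTVFDK",
}
toy_psms = ["LVNELTEFAK", "TCVADESHAGCEK", "SLCAFK", "LVNELTEFAK"]
reported = report_proteins(toy_psms, toy_db)
```

With the default threshold only `protA` is reported, because `protB` is supported by a single unique peptide. Lowering `min_unique` to 1 reports both, which shows how much the result depends on what "multiple" means.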

Page 3:

Is that really many?

Page 4:
Page 5:

Outline
• Two extremes: de novo sequencing & database search
• Combining the two
• Modifications (PTM)
• Mutations
• Case study

Page 6:

One goal, two approaches

Page 7:

The same principle

Define a scoring function, then find the peptide with the best score.

“Best”?
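The principle on this slide can be sketched with a toy scoring function. The shared-peak count used here is an assumption chosen for brevity, not the scoring function of any tool named in the talk. Database search restricts `candidates` to peptides from a database, while de novo sequencing searches all sequences.

```python
# Monoisotopic residue masses (Da) for the amino acids used below.
RESIDUE_MASS = {
    "A": 71.03711, "S": 87.03203, "C": 103.00919,
    "L": 113.08406, "K": 128.09496, "F": 147.06841,
}
PROTON, WATER = 1.00728, 18.01056

def fragment_mzs(peptide):
    """Singly charged b- and y-ion m/z values of an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    mzs = []
    for i in range(1, len(masses)):
        mzs.append(sum(masses[:i]) + PROTON)          # b ion with i residues
        mzs.append(sum(masses[i:]) + WATER + PROTON)  # complementary y ion
    return mzs

def score(peptide, peaks, tol=0.02):
    """Toy score: number of theoretical fragments matched by some peak."""
    return sum(any(abs(mz - p) <= tol for p in peaks)
               for mz in fragment_mzs(peptide))

def best_peptide(candidates, peaks):
    """Return the candidate with the highest score."""
    return max(candidates, key=lambda pep: score(pep, peaks))

# Simulated noise-free spectrum of the "true" peptide SLCAFK.
peaks = fragment_mzs("SLCAFK")
candidates = ["LSCFAK", "SLAAFK", "SLCAFK"]
best = best_peptide(candidates, peaks)
```

On this clean simulated spectrum the best-scoring candidate is the true peptide. With real, noisy spectra that is not guaranteed, which is the point of the question the slide ends on.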

Page 8:

The gap between biology and computation

• The optimal solution ≠ the true solution.

Page 9:

Whom should we trust?

I found the optimal solution!

But is it the true one?

Page 10:

Report only the high-confidence solutions

[Figure: distributions of false and true hits along the score axis]

$\mathrm{FDR} = \dfrac{\#\,\text{reported false hits}}{\#\,\text{reported hits}}$
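The FDR definition above can be turned into a score cutoff as in the sketch below. It assumes each hit already carries a true/false label. In practice the false hits are unknown and have to be estimated, commonly with a decoy database; that estimation step is an assumption outside the slide and is not shown here. The function names are hypothetical.

```python
def fdr_at_threshold(hits, threshold):
    """FDR of the hits reported at `threshold`.

    hits: list of (score, is_false) pairs; a hit is reported if score >= threshold.
    """
    reported = [is_false for s, is_false in hits if s >= threshold]
    if not reported:
        return 0.0
    return sum(reported) / len(reported)

def lowest_threshold(hits, max_fdr=0.01):
    """Lowest score cutoff whose reported set has FDR <= max_fdr, or None."""
    best = None
    for cutoff in sorted({s for s, _ in hits}, reverse=True):
        if fdr_at_threshold(hits, cutoff) <= max_fdr:
            best = cutoff
    return best

# Toy hits: (score, is_false). Labels are invented for illustration.
hits = [(9.0, False), (8.0, False), (7.0, False),
        (6.0, True), (5.0, False), (4.0, True)]
```

Every cutoff is checked rather than stopping at the first failure, because the FDR does not have to change monotonically as the cutoff is lowered.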

Page 11:

An Idea to Improve the Score Function

What would you conclude if these two results agreed?

Page 12:

[Figure: distributions of false and true hits along the score axis, before and after]

Factor the similarity between the de novo result and the database search result into the database search scoring function.
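One way to picture this idea is a score that adds an agreement bonus to the raw database search score. The sketch below is purely illustrative: the position-wise similarity measure, the additive form, and the `weight` value are assumptions, and this is not the PEAKS DB formula.

```python
def similarity(db_peptide, denovo_peptide):
    """Fraction of positions where the two sequences agree.

    I and L are treated as the same residue because they have equal mass.
    """
    a = db_peptide.replace("I", "L")
    b = denovo_peptide.replace("I", "L")
    same = sum(x == y for x, y in zip(a, b))
    return same / max(len(a), len(b))

def combined_score(db_score, db_peptide, denovo_peptide, weight=5.0):
    """Hypothetical combination: raw search score plus a weighted agreement bonus."""
    return db_score + weight * similarity(db_peptide, denovo_peptide)
```

Two database hits with the same raw score are then separated by how well each agrees with the independent de novo result, which is the effect the before/after figure describes.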

Page 13:

“… far better than what I could ever squeeze out of my data” (Stefano Gotta, Siena Biotech)

[Figure: FDR (0.0% to 2.5%) versus # of PSM (0 to 4000) for Sequest, Mascot, and PEAKS DB]

Zhang et al., PEAKS DB: De Novo Assisted Database Search. MCP 2012.

Page 14:

Outline
• Two extremes: de novo sequencing & database search
• Combining the two
• Modifications (PTM)
• Mutations
• Case study

Page 15:

Usual PTM Search

• All possible modification forms of a database peptide are tried to match the spectra.

• Can’t blindly search with all 600+ PTMs in Unimod.

PEPTIDEPTM

Ox-M, Phos-T

PEPTIDEPTM
PEPT(+80)IDEPTM
PEPTIDEPT(+80)M
PEPT(+80)IDEPT(+80)M
PEPTIDEPTM(+16)
PEPT(+80)IDEPTM(+16)
PEPTIDEPT(+80)M(+16)
PEPT(+80)IDEPT(+80)M(+16)
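The enumeration on this slide can be reproduced with a short sketch. The function name `modification_forms` and the dictionary format for the variable modifications are assumptions made for the example.

```python
from itertools import product

def modification_forms(peptide, variable_mods):
    """List every modified form of `peptide`.

    variable_mods maps a residue letter to its mass-shift label, e.g. {"T": "+80"}.
    Each modifiable residue is either left as is or written with its label.
    """
    options = []
    for aa in peptide:
        if aa in variable_mods:
            options.append([aa, f"{aa}({variable_mods[aa]})"])
        else:
            options.append([aa])
    return ["".join(choice) for choice in product(*options)]

# Phos-T (+80) and Ox-M (+16) on the slide's example peptide.
forms = modification_forms("PEPTIDEPTM", {"T": "+80", "M": "+16"})
```

The peptide has three modifiable sites (two T, one M), so there are 2^3 = 8 forms. The count doubles with every additional modifiable site, which is why searching blindly with hundreds of PTMs is impractical.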

Page 16:

De Novo Assisted PTM “Blind Search”

• Search for PTM when there is a tag match.

X. Han et al. PeaksPTM. JPR 2011, 10(7): 2930-2936.

DB:      …VK.LVNELTEFAK…
De novo: LVNGELTEFAK
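The tag match in this example can be sketched by locating the shared prefix and suffix of the two sequences and computing the mass left unexplained between them. This is an illustration of the idea only, not the PeaksPTM algorithm; the function names are hypothetical.

```python
# Monoisotopic residue masses (Da) for the amino acids used below.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "V": 99.06841, "T": 101.04768,
    "L": 113.08406, "N": 114.04293, "K": 128.09496, "E": 129.04259,
    "F": 147.06841,
}

def residue_mass_sum(seq):
    return sum(RESIDUE_MASS[aa] for aa in seq)

def locate_difference(db, denovo):
    """Lengths of the common prefix and common suffix (the matching tags)."""
    limit = min(len(db), len(denovo))
    p = 0
    while p < limit and db[p] == denovo[p]:
        p += 1
    s = 0
    while s < limit - p and db[-1 - s] == denovo[-1 - s]:
        s += 1
    return p, s

def mass_gap(db, denovo):
    """Mass in the de novo sequence that the database peptide does not explain."""
    return residue_mass_sum(denovo) - residue_mass_sum(db)

prefix_len, suffix_len = locate_difference("LVNELTEFAK", "LVNGELTEFAK")
delta = mass_gap("LVNELTEFAK", "LVNGELTEFAK")
```

Here the tags LVN and ELTEFAK match and the unexplained mass is about 57.02 Da, located after N. A glycine residue has the same elemental composition (C2H3NO) as a carbamidomethyl group, so one possible reading is a +57.02 modification at that position; the slide does not say which modification was intended.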

Page 17:

De Novo Enabled Mutation Discovery

Problem: de novo sequencing makes errors, and the database sequence differs from the real one by mutations.

Ma and Johnson. De Novo Sequencing and Homology Searching. MCP 2012 11: O111.014902

(de novo) X: [LS]C[FA]K     de novo error (X vs. Y)
(real)    Y: [SL]C[AF]K
              ||   || |
(homolog) Z: [SL]A[AF]K     mutation (Y vs. Z)

(de novo) X: LSCFAK
                  |
(homolog) Z: SLAAFK

Answer: explain the difference between the two with the fewest sequencing errors and mutations.

Y. Han, B. Ma, and K. Zhang. SPIDER. JBCB 3(3):697-716. 2005.
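The "fewest errors and mutations" criterion can be illustrated with a small dynamic program. This is a heavily simplified sketch and not the SPIDER algorithm: it assumes that a mutation is a single-residue substitution, that a de novo error replaces a block of up to three residues by a different block of equal mass, and that both events cost 1. All names and costs are assumptions.

```python
import math

# Monoisotopic residue masses (Da) for the amino acids used below.
RESIDUE_MASS = {
    "A": 71.03711, "S": 87.03203, "C": 103.00919,
    "L": 113.08406, "K": 128.09496, "F": 147.06841,
}

def mass(seq):
    return sum(RESIDUE_MASS[aa] for aa in seq)

def min_explanation(denovo, homolog, max_block=3, tol=0.01):
    """Fewest events (de novo errors + mutations) relating the two sequences.

    cost[i][j] is the minimum number of events explaining the difference
    between denovo[:i] and homolog[:j]; math.inf means "no explanation".
    """
    n, m = len(denovo), len(homolog)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i and j:
                # Identical residues cost 0; a substitution is one mutation.
                step = 0 if denovo[i - 1] == homolog[j - 1] else 1
                cost[i][j] = min(cost[i][j], cost[i - 1][j - 1] + step)
            # Equal-mass blocks: one de novo error.
            for a in range(1, min(max_block, i) + 1):
                for b in range(1, min(max_block, j) + 1):
                    if (a, b) == (1, 1):
                        continue
                    gap = abs(mass(denovo[i - a:i]) - mass(homolog[j - b:j]))
                    if gap <= tol:
                        cost[i][j] = min(cost[i][j], cost[i - a][j - b] + 1)
    return cost[n][m]

events = min_explanation("LSCFAK", "SLAAFK")
```

For the slide's example the minimum is three events: the equal-mass swap LS/SL, the mutation C to A, and the equal-mass swap FA/AF, which matches the bracketed alignment shown above.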

Page 18:
Page 19:

BSA Experiment

• “Pure” BSA protein ordered from Sigma.
• Three digests with Trypsin, LysC, GluC.
• Orbitrap (orbi-orbi) and typical LC-MS/MS protocol.

Page 20:

1. Contaminants: Bacteria, Keratin, Other Bovine Protein

Page 21:

2. Protein N-term

• The N-terminal region of bovine serum albumin (Asp-Thr-His-Lys) provides a specific binding site for Cu(II) ions. (T. Peters Jr., F.A. Blumenstock. J. Biol. Chem., 242 (1967), p. 1574)

Page 22:

3. Frequent PTMs

Page 23:

4. A Mutation

• 214th amino acid: A → T

Page 24:

4. A Mutation

• 214th amino acid: A → T

Page 25:

5. Unexplained De Novo Tags

• After filtering with the DB, PTM, and SPIDER results, there were still “de novo only” tags.

KK.QTALVELLK.HK
     |||||||
   DPALVELLKK

Page 26:

Conclusions

1. Make use of the database, but do not be constrained by it.
2. Full protein sequence analysis (including modifications and mutations) is necessary.
3. By combining multiple enzyme digests with multiple algorithms, full sequence analysis is feasible.
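The third conclusion can be made concrete by measuring how sequence coverage grows when peptides from several digests are pooled. The sketch below uses a hypothetical protein and hypothetical peptide lists; it is not data from the BSA experiment.

```python
def sequence_coverage(protein, peptides):
    """Fraction of protein residues covered by at least one peptide occurrence."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:
            for k in range(start, start + len(pep)):
                covered[k] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)

# Toy example (invented sequence and peptides, for illustration only).
toy_protein = "MKLVNELTEFAKTCVADESHAGCEK"
digest_one = ["LVNELTEFAK"]   # e.g. peptides identified from one digest
digest_two = ["FAKTCVADE"]    # e.g. peptides identified from another digest

single = sequence_coverage(toy_protein, digest_one)
pooled = sequence_coverage(toy_protein, digest_one + digest_two)
```

In this toy case one digest covers 40% of the sequence and pooling two covers 64%, because peptides from different enzymes overlap different regions.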

Page 27:

To those who ignore mutations and PTMs in their protein study: It takes less than 1% amino acid mutations to change between most chimpanzee and human proteins.