Page 1: Facts and Fallacies about de Novo Sequencing & Database Search.

Facts and Fallacies about de Novo Sequencing & Database Search

Page 2: Facts and Fallacies about de Novo Sequencing & Database Search.

1. There are a large number of high quality spectra left unassigned after DB search.

True / False

Leftover

Page 3: Facts and Fallacies about de Novo Sequencing & Database Search.

Unassigned Spectra in ABRF/iPRG 2011 Study

Page 4: Facts and Fallacies about de Novo Sequencing & Database Search.

Unassigned Spectra

• Nonspecific trypsin cleavages
• Novel peptide/incomplete database
• PTM
• Mutations

PEAKS PTM

SPIDER

PEAKS DB

De novo sequencing

Page 8: Facts and Fallacies about de Novo Sequencing & Database Search.

Speed

• PEAKS 6 de novo sequences 15 spectra/second.
– Intel i7 Quad Core, 8GB RAM.
– Trypsin
– Orbitrap CID MS/MS, mostly charge +2/+3

• PEAKS 7 (coming soon):
– Improve speed on high charge states and longer peptides.
– Add 8 core support in standard (desktop) license.

Page 9: Facts and Fallacies about de Novo Sequencing & Database Search.

4. De novo should be done after DB search.

True / False

[Diagram: DB search → DB peptides; unassigned spectra → de novo seq. → de novo peptides]

Page 10: Facts and Fallacies about de Novo Sequencing & Database Search.

Order of de Novo and DB

• Better to conduct de novo on all spectra.
– De novo is not slow, and computing is cheap.
– De novo provides independent validation for the DB result.

[Figure: score and # consensus AA (de novo vs. DB search) for true and false hits; panels: without de novo, with de novo]

Page 11: Facts and Fallacies about de Novo Sequencing & Database Search.

5. My protein sequence is confirmed with two unique peptide hits.

True / False

Page 12: Facts and Fallacies about de Novo Sequencing & Database Search.
Page 13: Facts and Fallacies about de Novo Sequencing & Database Search.

Routine Full Protein Coverage

• For regular proteins, full sequence coverage can be routinely achieved with
– 3 or more enzyme digests, and
– multiple algorithms in PEAKS 6.

• For highly variable proteins (such as antibodies), BSI offers data analysis service for antibody sequencing.
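
As a rough illustration of why several enzyme digests help, sequence coverage can be computed as the fraction of protein residues that fall inside at least one identified peptide. The sketch below is a minimal example under stated assumptions: the protein, the peptide lists, and the function name `sequence_coverage` are hypothetical toy values, not data from the slides or part of PEAKS.

```python
def sequence_coverage(protein, peptides):
    """Return the fraction of residues covered by at least one peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        # Mark every occurrence of the peptide in the protein.
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)


# Hypothetical toy protein and peptide identifications (not real data).
protein = "MKTAYIAKQRQISFVKSHFSRQ"
trypsin_peptides = ["TAYIAK", "QISFVK"]
second_digest_peptides = ["MKTAY", "SHFSRQ"]

one_digest = sequence_coverage(protein, trypsin_peptides)
two_digests = sequence_coverage(protein, trypsin_peptides + second_digest_peptides)
print(f"one digest:  {one_digest:.2f}")
print(f"two digests: {two_digests:.2f}")
```

Peptides from a second digest cut the protein at different sites, so they tend to fill gaps left by the first; the same function applies unchanged to three or more digests.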

Page 14: Facts and Fallacies about de Novo Sequencing & Database Search.

6. If a peptide is identified with 1% FDR, then its sequence is 99% correct.

True / False

Page 15: Facts and Fallacies about de Novo Sequencing & Database Search.

Peptide Validation vs. Amino Acid Validation

You are confident about the peptide sequence only if
• you can de novo sequence it, and
• the de novo sequence matches the database peptide.
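
One simple way to picture the comparison between a de novo sequence and a database peptide is to count the residues on which they agree. The sketch below is a simplified illustration, not the PEAKS algorithm: it compares the two sequences position by position (real tools typically align by fragment mass), it treats I and L as equal because the two residues have the same mass, and the names `consensus_aa` and `sequences_agree` as well as the example peptides are hypothetical.

```python
def normalize(seq):
    # Leucine and isoleucine have identical mass, so they are treated as equal.
    return seq.upper().replace("I", "L")


def consensus_aa(de_novo, db_peptide):
    """Count positions where the de novo and database sequences agree."""
    a, b = normalize(de_novo), normalize(db_peptide)
    return sum(x == y for x, y in zip(a, b))


def sequences_agree(de_novo, db_peptide, min_fraction=1.0):
    """True if the agreeing fraction of residues reaches min_fraction."""
    longest = max(len(de_novo), len(db_peptide))
    if longest == 0:
        return False
    return consensus_aa(de_novo, db_peptide) / longest >= min_fraction


# Hypothetical example peptides (not taken from the slides).
db_peptide = "LSSPATLNSR"
full_match = consensus_aa("ISSPATLNSR", db_peptide)   # I/L difference only
swapped_pair = consensus_aa("LSSPTALNSR", db_peptide)  # two residues swapped
print(full_match, swapped_pair)
```

A database hit whose de novo counterpart disagrees at a few positions can still pass a peptide-level FDR filter, which is why a per-residue comparison gives additional information.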

Page 18: Facts and Fallacies about de Novo Sequencing & Database Search.

[Figure labels: weak hits; confident protein; weak protein]

Target-Decoy Incompatible with Certain Highly Optimized Search Engines

• Adding a “protein bonus” to peptide hits increases accuracy.
• But it creates bias between target and decoy.
– In the extreme case, the bonus is so large that only peptides from target proteins are selected.
– This gives the wrong impression that FDR=0, while there are still false peptides in the result.
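
The bias can be reproduced with a toy calculation. The sketch below assumes the common target-decoy estimate FDR ≈ #decoy / #target above a score threshold, and it models the protein bonus as a fixed amount added to any hit on a confident protein. All scores, the bonus size, and the function name `estimated_fdr` are hypothetical and chosen only to show the effect.

```python
def estimated_fdr(hits, threshold, bonus=0.0):
    """Target-decoy FDR estimate: decoys / targets at or above the threshold.

    hits: list of (score, is_decoy, on_confident_protein) tuples.
    bonus: amount added to hits that fall on a confident protein.
    """
    targets = decoys = 0
    for score, is_decoy, on_confident in hits:
        adjusted = score + (bonus if on_confident else 0.0)
        if adjusted >= threshold:
            if is_decoy:
                decoys += 1
            else:
                targets += 1
    return decoys / targets if targets else 0.0


# Hypothetical toy hits: (raw score, is_decoy, on_confident_protein).
hits = [
    (9.0, False, True),
    (8.0, False, True),
    (3.0, False, True),   # weak hit that happens to sit on a confident protein
    (3.2, True, False),   # decoy with a comparable raw score
    (2.9, False, False),
]

fdr_without_bonus = estimated_fdr(hits, threshold=3.0)
fdr_with_bonus = estimated_fdr(hits, threshold=100.0, bonus=100.0)
print(fdr_without_bonus, fdr_with_bonus)
```

With the large bonus, the weak target hit is still accepted while the decoy of similar raw score can never pass, so the estimate drops to zero even though nothing about the weak hit has improved.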

Page 19: Facts and Fallacies about de Novo Sequencing & Database Search.

[Figure labels: weak hits; confident protein; weak protein]

Decoy Fusion Is A More Powerful Validation Method

• Decoy fusion appends a decoy sequence to each protein.
• This recreates the balance.
• It has been the built-in validation method since PEAKS 5.3.
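
A minimal sketch of the idea follows. The slides state only that a decoy sequence is appended to each protein; using the reversed target sequence as the decoy is an assumption made here for illustration, and PEAKS may generate its decoys differently. The function names and the toy database are hypothetical.

```python
def fuse_decoy(protein):
    # Assumption: the decoy is the reversed target sequence.
    return protein + protein[::-1]


def fuse_database(database):
    """Append a decoy to every protein in a {name: sequence} mapping."""
    return {name: fuse_decoy(seq) for name, seq in database.items()}


def is_decoy_hit(target_length, peptide_start):
    """A peptide starting in the appended half is counted as a decoy hit."""
    return peptide_start >= target_length


# Hypothetical toy database (not real sequences).
database = {"protA": "MKTAYK", "protB": "GASPVR"}
fused = fuse_database(database)
print(fused["protA"])
```

Because target and decoy halves belong to the same protein entry, any protein-level bonus is applied to both, which is how the balance between target and decoy hits is restored.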

Page 21: Facts and Fallacies about de Novo Sequencing & Database Search.

Error Accumulation

• In PEAKS, the inChorus algorithm automatically selects a common FDR below 1% for each engine, so that the combined FDR is approximately 1%.

Target (decoy) FDR%, per engine:

PEAKS DB    3870 (38)    1%
Mascot      2369 (23)    1%

Target (decoy) FDR%, by overlap:

PEAKS DB only          1696 (37)    2.4%
PEAKS DB and Mascot    2174 (1)     0.1%
Mascot only            195 (22)     13%

• Correct < sum of the two
• Error ≈ sum of the two

Combined FDR = 1.5%
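
The combined figure can be checked by hand. The sketch below assumes that the three overlap counts are PEAKS DB only, shared by both engines, and Mascot only (a reading that is consistent with the per-engine totals of 3870 (38) and 2369 (23)), and that FDR is estimated as decoy / target.

```python
# (target, decoy) counts as read from the slide; region assignment is an assumption.
peaks_only = (1696, 37)
both = (2174, 1)
mascot_only = (195, 22)


def add(*regions):
    """Sum (target, decoy) pairs component-wise."""
    return (sum(r[0] for r in regions), sum(r[1] for r in regions))


def fdr_percent(region):
    target, decoy = region
    return 100.0 * decoy / target


peaks_total = add(peaks_only, both)          # all PEAKS DB identifications
mascot_total = add(both, mascot_only)        # all Mascot identifications
union = add(peaks_only, both, mascot_only)   # identifications from either engine

print(peaks_total, round(fdr_percent(peaks_total), 1))
print(mascot_total, round(fdr_percent(mascot_total), 1))
print(union, round(fdr_percent(union), 1))
```

Correct identifications largely overlap between the engines, so the union gains fewer targets than the sum of the two; decoy hits barely overlap, so errors add up almost in full. That is why two results filtered at 1% each combine to about 1.5%.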

Page 22: Facts and Fallacies about de Novo Sequencing & Database Search.

10. There is no automated way to validate de novo sequencing results.

True / False

