Protein Sequencing

Permits comparisons between normal and mutant proteins.

Permits comparisons between comparable proteins in different species.

Vital piece of information for determining the 3-D structure of a protein.

Why are peptides, and not proteins, sequenced?

Primary structure of proteins is determined.

Solubility under the same conditions

Sensitivity of MS much higher for peptides

MS efficiency

To find out the order of amino acids in a peptide, sequential removal and identification of residues from one or the other free terminal of the polypeptide chain is carried out.

This poses a problem: with long polypeptides, the amino acids removed become increasingly contaminated, because no removal step goes to completion.

What is the solution: Polypeptide chain is broken into short peptides, each is sequenced, and the short sequences are reassembled to obtain the overall sequence.
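The contamination problem can be made concrete with a little arithmetic. The sketch below assumes that every removal cycle succeeds independently with the same efficiency; the 98% figure is an illustrative assumption, not a value taken from these slides, and the function name is hypothetical.

```python
def in_phase_fraction(efficiency: float, cycles: int) -> float:
    """Fraction of chains for which every one of `cycles` removal steps succeeded.

    Assumes each cycle succeeds independently with the same efficiency.
    """
    return efficiency ** cycles


ASSUMED_EFFICIENCY = 0.98  # hypothetical per-cycle efficiency

for n in (10, 30, 50, 100):
    fraction = in_phase_fraction(ASSUMED_EFFICIENCY, n)
    print(f"after {n:3d} cycles: {fraction:.1%} of chains still in step")
```

Under this assumption only about a third of the chains are still in step after 50 cycles, so the residue released at that point is mixed with residues from lagging chains. Short peptides keep the number of cycles small.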

Sequencing of Peptides

Main steps:

1. Purification of a protein
2. Cleavage of all disulfide bonds
3. Determination of the terminal amino acid residues
4. Specific cleavage of the polypeptide chain into small fragments
5. Independent separation and sequence determination of peptides
6. Reassembly of the individual peptides with appropriate overlaps
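The last step, reassembly with overlaps, can be sketched as a greedy merge. This is a toy illustration, not a method given in the slides: the function names are hypothetical, and the fragments are what cleaving the made-up sequence GAKMFRLDKSW after Lys/Arg and after Phe/Tyr/Trp/Leu would give. Real work needs longer overlaps than the single residues accepted here.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments):
    """Greedily merge the pair of fragments with the largest overlap until one remains."""
    unique = sorted(set(fragments))
    # Drop fragments that lie entirely inside another fragment.
    frags = [f for f in unique if not any(f != g and f in g for g in unique)]
    while len(frags) > 1:
        k, a, b = max(
            ((overlap(x, y), x, y) for x in frags for y in frags if x != y),
            key=lambda item: item[0],
        )
        if k == 0:
            raise ValueError("remaining fragments do not overlap")
        frags.remove(a)
        frags.remove(b)
        frags.append(a + b[k:])
    return frags[0]


digest_1 = ["GAK", "MFR", "LDK", "SW"]   # cleaved after Lys/Arg
digest_2 = ["GAKMF", "RL", "DKSW"]       # cleaved after Phe/Tyr/Trp/Leu
print(assemble(digest_1 + digest_2))
```

Two digests with different specificities are needed because the peptides of one digest span the cut sites of the other, which is what fixes their order.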

Step I--Purification of protein

Differential precipitation with (NH4)2SO4

Column chromatography

Gel electrophoresis

Differential centrifugation

Step II--Cleavage of disulfide bonds

Step III--Determination of polypeptide-chain end groups

Amino terminal: FDNB (1-fluoro-2,4-dinitrobenzene) method

Dansyl chloride method

Carboxy terminal: treatment with anhydrous hydrazine at 100 ºC converts all residues to amino acid hydrazides except the carboxy-terminal one, which is released as the free amino acid and can be identified by chromatography.

Alternatively, treat the polypeptide with carboxypeptidase. This releases the carboxy-terminal amino acid as the major free amino acid, which is identified chromatographically.

Step IV--Fragmentation of peptide chain into short sequences

A. Use of endopeptidases

Choice of Enzyme

Cleaving agent/Protease     Specificity

A. HIGHLY SPECIFIC

Trypsin                     Arg-X, Lys-X
Endoproteinase Glu-C        Glu-X
Endoproteinase Lys-C        Lys-X
Endoproteinase Arg-C        Arg-X
Endoproteinase Asp-N        X-Asp

B. NONSPECIFIC

Chymotrypsin                Phe-X, Tyr-X, Trp-X, Leu-X
Thermolysin                 X-Phe, X-Leu, X-Ile, X-Met, X-Val, X-Ala

B. Use of Cyanogen bromide

Specific cleavage on the carboxyl side of Met residues. The Met left at the new carboxyl terminus is converted to homoserine lactone.
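The cleavage rules above can be tried out on paper before going to the bench. The sketch below is a simplified illustration: it handles only agents that cut on the carboxyl side of a residue (the "Arg-X" style rules), it ignores refinements such as reduced cleavage before Pro, and the sequence and function name are hypothetical.

```python
def cleave_after(sequence: str, residues: str) -> list:
    """Split a one-letter sequence on the carboxyl side of any residue in `residues`."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # trailing peptide without a cut site
    return peptides


TRYPSIN = "KR"         # Lys-X, Arg-X
CHYMOTRYPSIN = "FYWL"  # Phe-X, Tyr-X, Trp-X, Leu-X
CNBR = "M"             # cyanogen bromide, Met-X

protein = "GAKMFRLDKSW"  # hypothetical sequence
for name, rule in (("trypsin", TRYPSIN), ("chymotrypsin", CHYMOTRYPSIN), ("CNBr", CNBR)):
    print(f"{name:12s} {cleave_after(protein, rule)}")
```

Agents that cut on the amino side (Asp-N, thermolysin) would need the split placed before the residue instead of after it.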

Step V--Peptide Analysis

Edman Degradation

MS (Mass Spectrometry)

More sensitive

Can fragment peptides faster

Does not require proteins or peptides to be purified to homogeneity

Has no problem identifying blocked or modified proteins

Edman degradation

[Figure: Edman degradation scheme. Labeling: phenyl isothiocyanate reacts with the free amino group of the N-terminal alanine of the peptide Ala-Asp-Phe-Phe-Arg. Release: PTH-alanine is split off, leaving the peptide Asp-Phe-Phe-Arg, shortened by one residue.]

Treatment of an N-unprotected peptide with PITC (C6H5N=C=S) followed by acid. It allows for the successive “stripping away” of the N-terminal amino acid in the form of a phenylthiohydantoin. In the process, it produces a new shortened peptide with a new, exposed N-terminus.

PTH amino acid is identified by TLC

Edman Degradation vs. MS/MS

Steps Involved in Protein Identification and Annotation

1. Protein samples
2. Resolve protein mixtures by 2-D gel electrophoresis or chromatography
3. Individual proteins
4. Digestion by enzymes
5. Mass spectrometry: MS/MS fragment ion analysis or MALDI-TOF peptide mass fingerprinting
6. Database search

Tandem Mass Spectrometry

MS is used for accurately determining molecular masses by measuring the mass:charge ratio of ions in a vacuum

Combines an ion source, a mass analyser to separate ions by mass:charge ratio, and an ion detector

MS/MS plays an important role in protein identification (fast and sensitive).

Derivation of peptide sequence is an important task in proteomics.

Derivation without help from a protein database (“de novo sequencing”) is especially important in identification of unknown proteins.

Basic lab experimental steps

1. Proteins digested with an enzyme to produce peptides
2. Peptides charged (ionized) by MS and separated according to their different m/z ratios
3. Each peptide fragmented into ions and m/z values of fragment ions are measured

Steps 2 and 3 performed within a tandem mass spectrometer.
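Step 2 separates peptides by m/z rather than by mass, so the same peptide appears at different positions depending on how many charges it carries. A minimal sketch, assuming positive-mode ionization by added protons; the peptide mass is a hypothetical value and the function name is illustrative.

```python
PROTON_MASS = 1.00728  # Da


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of a peptide of the given neutral mass carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


peptide_mass = 1500.0  # hypothetical neutral mass in Da
for z in (1, 2, 3):
    print(f"{z}+ ion: m/z = {mz(peptide_mass, z):.3f}")
```

Going the other way, the neutral mass is recovered as z * (m/z) - z * 1.00728 once the charge state is known.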

Mass spectrum

Proteins consist of 20 different types of amino acids with different masses (except for one pair, Leu and Ile)

Different peptides produce different spectra

Use the spectrum of a peptide to determine its sequence
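How a spectrum gives a sequence can be sketched with an idealized fragment ladder: the gap between neighbouring peaks equals the mass of one residue. The example below is a toy under stated assumptions: standard monoisotopic residue masses rounded to five decimals, a complete and noise-free ladder of singly charged N-terminal (b-type) fragments, and hypothetical helper names. Real spectra have missing peaks and mixed ion series.

```python
PROTON_MASS = 1.00728  # Da

# Monoisotopic residue masses in Da (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def b_ladder(peptide: str) -> list:
    """m/z of the singly charged N-terminal fragments b1, b2, ... of a peptide."""
    total, ladder = PROTON_MASS, []
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        ladder.append(total)
    return ladder


def read_ladder(peaks, tol: float = 0.01) -> list:
    """Translate the gaps between consecutive ladder peaks into residues."""
    peaks = sorted(peaks)
    residues = []
    for low, high in zip(peaks, peaks[1:]):
        gap = high - low
        hits = [aa for aa, mass in RESIDUE_MASS.items() if abs(mass - gap) <= tol]
        residues.append("/".join(hits) if hits else "?")
    return residues


ladder = b_ladder("GASLK")  # hypothetical peptide
print(read_ladder(ladder))
```

The position that holds Leu is reported as L/I because the two residues have the same mass, which is the exception noted above. The first residue is not read because the ladder starts at the first fragment, not at zero.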

MS Peptide Experiment

[Figure: peptide ions labeled with charge states +, ++, and +++.]

MALDI (matrix-assisted laser desorption ionization)

[Figure: MALDI source. A 3 ns laser pulse strikes the solid sample on a target held at high voltage under high vacuum; the ions pass into a TOF analyzer.]

MALDI is a solid-state technique that gives ions in pulses, best suited to time-of-flight MS.

1. Soft ionization method for peptide mass.
2. Tryptic fragments mixed with light-absorbing matrix compound (DHBA) in an organic solvent.
3. Solvent evaporated to form crystals, which are transferred to vacuum and irradiated with a laser beam.
4. Laser energy absorbed by the matrix and emitted (desorbed) as heat.
5. Expansion of matrix and analyte into gas phase.
6. High voltage applied across the sample to ionize it, and ions are accelerated towards the detector.

ESI (Electrospray Ionization)

[Figure: ESI source. Liquid flow is sprayed at atmospheric pressure and the ions pass through low-vacuum and high-vacuum regions into a quadrupole (Q) or ion trap analyzer.]

ESI is a solution technique that gives a continuous stream of ions, best for quadrupoles, ion traps, etc.

1. Soft ionization method used for ion searching.
2. Analyte is dissolved in a solvent and pushed through a narrow capillary.
3. Potential difference applied across the capillary such that charged droplets emerge and form a fine spray.
4. Stream of heated inert gas applied and each droplet evaporates.
5. Analyte enters the mass analyzer and ions are accelerated towards the detector.

MALDI or Electrospray?

MALDI is limited to solid state, ESI to liquid

ESI is better for the analysis of complex mixtures, as it can be directly interfaced to a separation technique (e.g. HPLC or CE)

MALDI is more “flexible” (MW from 200 to 400,000 Da)

Protein Identification Strategy

[Figure: protein identification workflow. A protein mixture is converted to peptides, which undergo 1D, 2D, or 3D peptide separation (chromatogram axis: time in min). Ions pass through Q1, the Q2 collision cell, and Q3 to give a tandem mass spectrum (axis: m/z). Acquired spectra are compared with theoretical spectra by correlative sequence database searching, leading to protein identification.]

[Figure: TOF MS/MS spectrum (ES+) of the ion at m/z 785.60, sample CAL050310A, acquired 10-Mar-2005 14:28:10. x-axis: m/z, 100 to 1600; y-axis: relative intensity, 0 to 100%. Labeled peaks range from m/z 169.06 to 1296.10.]

Large-scale Analysis of in Vivo Phosphorylated Membrane Proteins by Immobilized Metal Ion Affinity Chromatography and Mass Spectrometry, Molecular & Cellular Proteomics, 2003, 2.11, 1234, Thomas S. Nuhse, Allan Stensballe, Ole N. Jensen, and Scott C. Peck

What you need for peptide mass mapping

Peptide mass spectrum

Protein database: GenBank, Swiss-Prot, dbEST, etc.

Search engines: Mascot, Prospector, Sequest, etc.

Protein Identification by MS

Sample: spot removed from gel, fragmented using trypsin, spectrum of fragments generated.

Library: database of sequences (e.g. SwissProt), artificially trypsinated, artificial spectra built.

MATCH: the spectrum of fragments is compared against the library of artificial spectra.
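The matching idea above can be sketched in a few lines. Everything here is a toy under stated assumptions: the three database entries are made-up sequences, the "observed" masses are simulated from one of them with a small offset, the score is a plain count of matching masses, and the function names are hypothetical. Search engines such as Mascot or Sequest use far more elaborate scoring.

```python
WATER = 18.01056  # Da, added once per peptide for the free termini

# Monoisotopic residue masses in Da (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def tryptic_peptides(sequence: str) -> list:
    """'Artificially trypsinate': cut on the carboxyl side of Lys (K) and Arg (R)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def artificial_spectrum(sequence: str) -> list:
    """Theoretical peptide masses for one database entry."""
    return [peptide_mass(p) for p in tryptic_peptides(sequence)]


def match_count(observed, theoretical, tol: float = 0.2) -> int:
    """Number of observed masses that fall within `tol` Da of a theoretical mass."""
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def identify(observed, database, tol: float = 0.2):
    """Return the best-scoring entry name and the score of every entry."""
    scores = {name: match_count(observed, artificial_spectrum(seq), tol)
              for name, seq in database.items()}
    return max(scores, key=scores.get), scores


DATABASE = {  # hypothetical sequences
    "protein_A": "GAKMFRLDKSW",
    "protein_B": "MKTAYIAKQRQISFVK",
    "protein_C": "SLEDKWHGR",
}
# Simulated measurement: masses of protein_B's peptides with a 0.05 Da offset.
observed = [m + 0.05 for m in artificial_spectrum(DATABASE["protein_B"])]
print(identify(observed, DATABASE))
```

A protein that is missing from the database cannot score well in this scheme, which is where de novo sequencing comes in.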

Conclusions

MS of peptides enables high throughput identification and characterization of proteins in biological systems

“de novo sequencing” can be used to identify unknown proteins not found in protein databases
