Experimentally solving protein structures, protein-protein interactions and simulating protein dynamics
Lecture 15
Introduction to Bioinformatics 2007
Centre for Integrative Bioinformatics VU
Today’s lecture
1. Experimental techniques for determining protein tertiary structure
2. Protein interaction and docking
i. ZDOCK method
3. Molecular motion simulated by molecular mechanics
Experimentally solving protein structures
Two basic techniques:
1. X-ray crystallography
2. Nuclear Magnetic Resonance (NMR) techniques
1. X-ray crystallography
Purified protein
→ Crystal (crystallization)
→ X-ray diffraction
→ Electron density (phase problem)
→ 3D structure
→ Biological interpretation
Protein crystals
• Regular arrays of protein molecules
• ‘Wet’: 20-80% solvent
• Few crystal contacts
• Protein crystals contain active protein
– Enzyme turnover
– Ligand binding
Example of crystal packing
Examples of crystal packing
β2 Glycoprotein I: ~90% solvent (extremely high!)
Acetylcholinesterase: ~68% solvent
Problematic proteins (no crystallisation)
• Multiple domains
• Similarly, floppy ends may hamper crystallization: change construct
metals, ions
– Hydrogen bonds satisfied
– Chemistry in order
• Final B-factor (temperature) values (colour coded in the structure on the right)
2. Nuclear Magnetic Resonance (NMR)
800 MHz NMR spectrometer
2. NMR
Purified protein
→ Measure NOEs, etc.
→ Distance constraints (interpret map)
→ Ensemble of 3D structures (distance geometry: resolve constraints)
→ Biological interpretation
[Figure: 2D NMR spectrum, axes D1 (ppm) and D2 (ppm)]
Nuclear Magnetic Resonance (NMR)
• Fourier-transform NMR (FT-NMR) was pioneered by Richard R. Ernst, who won the Nobel Prize in Chemistry in 1991.
• FT-NMR works by irradiating the sample, held in a static external magnetic field, with a short square pulse of radio-frequency energy containing all the frequencies in a given range of interest.
• The polarized magnets of the nuclei begin to spin together, creating a radio-frequency (RF) signal that is observable. This signal decays over time, and its time-dependent pattern can be converted into a frequency-dependent pattern of nuclear resonances using a mathematical function known as a Fourier transformation, revealing the nuclear magnetic resonance spectrum.
• The use of pulses of different shapes, frequencies and durations in specifically-designed patterns or pulse sequences allows the spectroscopist to extract many different types of information about the molecule.
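The conversion from a decaying time signal to a frequency spectrum can be illustrated with a minimal Python sketch. It assumes a synthetic signal made of two decaying complex sinusoids; the resonance frequencies, decay time and sampling interval are hypothetical values chosen for illustration, and the naive discrete Fourier transform stands in for the fast Fourier transform used in practice.

```python
import cmath
import math

def simulate_fid(freqs_hz, t2_s, dt_s, n_points):
    """Synthetic decaying signal: one complex sinusoid per resonance, damped by exp(-t/T2)."""
    fid = []
    for k in range(n_points):
        t = k * dt_s
        decay = math.exp(-t / t2_s)
        fid.append(decay * sum(cmath.exp(2j * math.pi * f * t) for f in freqs_hz))
    return fid

def dft(signal):
    """Naive O(N^2) discrete Fourier transform (adequate for a few hundred points)."""
    n = len(signal)
    return [
        sum(signal[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]

def peak_frequencies(spectrum, dt_s, n_peaks):
    """Frequencies (Hz) of the n_peaks strongest local maxima of the magnitude spectrum."""
    n = len(spectrum)
    mag = [abs(x) for x in spectrum]
    maxima = [m for m in range(n) if mag[m] > mag[m - 1] and mag[m] > mag[(m + 1) % n]]
    strongest = sorted(maxima, key=lambda m: mag[m], reverse=True)[:n_peaks]
    return sorted(m / (n * dt_s) for m in strongest)

# Hypothetical example: two resonances at 50 Hz and 120 Hz, sampled every 2 ms.
fid = simulate_fid(freqs_hz=[50.0, 120.0], t2_s=0.2, dt_s=0.002, n_points=250)
spectrum = dft(fid)
peaks = peak_frequencies(spectrum, dt_s=0.002, n_peaks=2)
print(peaks)
```

The time-domain list `fid` is hard to read by eye, whereas `peaks` recovers the two input frequencies directly, which is the point of the transformation.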
Nuclear Magnetic Resonance (NMR)
• Time intervals between pulses allow—among other things—magnetization transfer between nuclei and, therefore, the detection of the kinds of nuclear-nuclear interactions that allowed for the magnetization transfer.
• Interactions that can be detected are usually classified into two kinds: through-bond interactions and through-space interactions. The latter are a consequence of the so-called nuclear Overhauser effect (NOE). Measured NOEs lead to a set of distances between atoms.
• These distances are subjected to a technique called Distance Geometry which normally results in an ensemble of possible structures that are all relatively consistent with the observed distance restraints (NOEs).
• Richard Ernst and Kurt Wüthrich, in addition to many others, developed 2-dimensional and multidimensional FT-NMR into a powerful technique for the determination of the structure of biopolymers such as proteins or even small nucleic acids.
• This technique is the basis of protein nuclear magnetic resonance spectroscopy. Wüthrich shared the 2002 Nobel Prize in Chemistry for this work.
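What it means for a structure to be consistent with the distance restraints can be sketched in Python. This is not Distance Geometry itself, only a check of candidate models against upper-bound restraints; the restraint values, atom indices and coordinates are hypothetical.

```python
import math

# Hypothetical NOE-derived restraints: (atom index i, atom index j, upper bound in angstroms).
restraints = [(0, 1, 3.0), (1, 2, 4.0), (0, 2, 5.0)]

def violations(coords, restraints, tolerance=0.0):
    """Return (i, j, excess) for every restraint whose distance exceeds its upper bound."""
    out = []
    for i, j, upper in restraints:
        d = math.dist(coords[i], coords[j])
        if d > upper + tolerance:
            out.append((i, j, d - upper))
    return out

# Two hypothetical three-atom models (x, y, z in angstroms).
model_a = [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0), (2.5, 3.5, 0.0)]
model_b = [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0), (7.0, 0.0, 0.0)]

# Keep only the models that satisfy every restraint.
ensemble = [m for m in (model_a, model_b) if not violations(m, restraints)]
print(violations(model_a, restraints))
print(violations(model_b, restraints))
```

Here `model_a` satisfies all three restraints and stays in the ensemble, while `model_b` violates two of them and is dropped.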
currently the best)
• KORDO
• MolFit
• MPI Protein Docking
• Nussinov-Wolfson Structural Bioinformatics Group
• …
Docking Programs
Issues:
• Rigid structures or made flexible?
– Side-chains
– Main-chains
• Full atomic detail or simplified models?
• Docking energy functions (purpose-built force fields)
Summary protein(-protein) interactions
• Different binding modes (transient, obligate, also depending on (co)localisation, etc.)
• Hydrophobic patch/hydrophilic rim conferring binding specificity
• Interfaces are physico-chemically positioned in between surface and protein core (amino acid composition, etc.)
• Many approaches exist to computationally predict binding sites and therefore PPI
Protein motion
1. For protein function, architecture and dynamics are both essential
2. Proteins are very mobile and flexible objects
3. Energy measurements upon protein folding show that most proteins are marginally stable
Molecular motions
Proteins are very dynamic systems
• Protein folding
• Protein structure
• Protein function (e.g. opening and closing of oxygen binding site in hemoglobin)
Protein motion
• Principles
• Simulation
– MD
– MC
The Ramachandran plot
Allowed phi-psi angles
Red areas are preferred, yellow areas are allowed, and white is avoided
Molecular mechanics techniques
Two basic techniques:
• Molecular Dynamics (MD) simulations
• Monte Carlo (MC) techniques
Molecular Dynamics (MD) simulation
• MD simulation can be used to study protein motions. It is often used to refine experimentally determined protein structures.
• It is generally not used to predict structure from sequence or to model the protein folding pathway. MD simulation can fold extended sequences to 'global' potential energy minima for very small systems (peptides of length ten, or so, in vacuum), but it is most commonly used to simulate the dynamics of known structures.
• Principle: an initial velocity is assigned to each atom, and Newton's laws are applied at the atomic level to propagate the system's motion through time
• MD simulation incorporates a notion of time
q = coordinates
p = momentum
K = kinetic energy
V = potential energy
Molecular Dynamics
Knowledge of the atomic forces and masses can be used to solve the position of each atom along a series of extremely small time steps (on the order of femtoseconds = $10^{-15}$ seconds). The resulting series of snapshots of structural changes over time is called a trajectory. The use of this method to compute trajectories can be more easily seen when Newton's equation is expressed in the following form:
$-\frac{dV}{dr_i} = m_i \frac{d^2 r_i}{dt^2}$
The "leapfrog" method is a common numerical approach to calculating trajectories based on Newton's equation. This method gets its name from the way in which positions (r) and velocities (v) are calculated in an alternating sequence, 'leaping' past each other in time. The steps can be summarized as follows:
1. Solve for $a_i$ at time $t$ using $-dV/dr_i = m_i a_i(t)$
2. Update $v_i$ at $t + \Delta t/2$ using $v_i(t + \Delta t/2) = v_i(t - \Delta t/2) + a_i(t)\,\Delta t$
3. Update $r_i$ at $t + \Delta t$ using $r_i(t + \Delta t) = r_i(t) + v_i(t + \Delta t/2)\,\Delta t$
where $m_i$ is the mass of atom $i$, $v_i = dr_i/dt$ and $a_i = d^2 r_i/dt^2$.
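A minimal Python sketch of leapfrog integration follows, assuming a single particle in one dimension on a harmonic spring rather than a protein force field. The spring constant, mass, time step and units are hypothetical and chosen only so that the result is easy to check.

```python
def leapfrog(force, mass, r0, v0, dt, n_steps):
    """Leapfrog integration: velocities live at half steps, positions at full steps."""
    r = r0
    # Start-up: advance the initial velocity by half a step, v(dt/2) = v(0) + a(0) dt/2.
    v_half = v0 + 0.5 * dt * force(r) / mass
    trajectory = [r]
    for _ in range(n_steps):
        r = r + dt * v_half                       # r(t+dt) = r(t) + v(t+dt/2) dt
        v_half = v_half + dt * force(r) / mass    # v(t+3dt/2) = v(t+dt/2) + a(t+dt) dt
        trajectory.append(r)
    return trajectory

# Hypothetical test system: harmonic spring F = -k r with k = 1, m = 1 (arbitrary units).
def spring_force(r):
    return -1.0 * r

# 628 steps of 0.01 cover about one full period (2*pi) of the oscillator.
trajectory = leapfrog(spring_force, mass=1.0, r0=1.0, v0=0.0, dt=0.01, n_steps=628)
print(trajectory[-1])
```

After roughly one period the particle is back near its starting position, and the amplitude does not drift, which is one reason leapfrog-type schemes are popular for long MD trajectories.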
Force field
The potential energy of a system can be expressed as a sum of valence (or bond), crossterm, and nonbond interactions:
Etotal = Evalence + Ecrossterm + Enonbond
The energy of valence interactions comprises bond stretching (Ebond), valence angle bending (Eangle), dihedral angle torsion (Etorsion), and inversion (also called out-of-plane interactions) (Einversion or Eoop) terms, which are part of nearly all force fields for covalent systems. A Urey-Bradley term (EUB) may be used to account for interactions between atom pairs involved in 1-3 configurations (i.e., atoms bound to a common atom):
Evalence = Ebond + Eangle + Etorsion + Eoop + EUB
Modern (second-generation) forcefields include cross terms to account for such factors as bond or angle distortions caused by nearby atoms. Crossterms can include the following terms: stretch-stretch, stretch-bend-stretch, bend-bend, torsion-stretch, torsion-bend-bend, bend-torsion-bend, stretch-torsion-stretch.
The energy of interactions between nonbonded atoms is accounted for by van der Waals (EvdW), electrostatic (ECoulomb), and (in some older forcefields) hydrogen bond (Ehbond) terms:
Enonbond = EvdW + ECoulomb + Ehbond
Force field
Van der Waals forces: $f = a/r^{12} - b/r^{6}$
[Figure: energy plotted against distance]
The Lennard-Jones potential is mildly attractive as two uncharged molecules or atoms approach one another from a distance, but strongly repulsive when they approach too closely. The resulting potential is shown in the figure (in pink). At equilibrium, the pair of atoms or molecules tends toward the separation corresponding to the minimum of the Lennard-Jones potential (0.38 nanometers for the case shown in the figure).
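The position of the minimum follows from setting the derivative of $a/r^{12} - b/r^{6}$ to zero, which gives $r_{min} = (2a/b)^{1/6}$ with depth $-b^2/(4a)$. The Python sketch below checks this numerically; the coefficients are hypothetical, chosen so that the minimum sits at the 0.38 nm quoted above with a well depth of 1 in arbitrary energy units.

```python
def lennard_jones(r, a, b):
    """Lennard-Jones energy a/r^12 - b/r^6 (repulsive term minus attractive term)."""
    return a / r**12 - b / r**6

def coefficients(r_min, well_depth):
    """Coefficients a, b that place the minimum at r_min with energy -well_depth."""
    a = well_depth * r_min**12
    b = 2.0 * well_depth * r_min**6
    return a, b

# 0.38 nm is taken from the text; the well depth of 1.0 is a hypothetical value.
a, b = coefficients(r_min=0.38, well_depth=1.0)
r_eq = (2.0 * a / b) ** (1.0 / 6.0)   # analytic position of the minimum

print(r_eq, lennard_jones(r_eq, a, b))
print(lennard_jones(0.30, a, b))      # too close: strongly repulsive (positive)
print(lennard_jones(0.60, a, b))      # further apart: mildly attractive (small negative)
```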
Thermal bath
Figure: Snapshots of ubiquitin pulling with constant velocity at three different time steps.
Docking example: antibody HyHEL-63 (cyan) complexed with Hen Egg White Lysozyme (yellow)
The X-ray structures of the antibody HyHEL-63 (cyan), uncomplexed and complexed with Hen Egg White Lysozyme (HEL, yellow), show that there are small but significant local conformational changes in the antibody paratope on binding. The structure also reveals that most of the charged epitope residues face the antibody. Details are in Li YL, Li HM, Smith-Gill SJ and Mariuzza RA (2000) Three-dimensional structures of the free and antigen-bound Fab from monoclonal antilysozyme antibody HyHEL-63. Biochemistry 39: 6296-6309.
Salt links and electrostatic interactions provide much of the free energy of binding. Most of the charged residues face the interface in the X-ray structure. The importance of the salt link between Lys97 of HEL and Asp27 of the antibody heavy chain is revealed by molecular dynamics simulations. After 1 ns of MD simulation at 100°C the overall conformation of the complex has changed, but the salt link persists. Details are described in Sinha N and Smith-Gill SJ (2002) Electrostatics in protein binding and function. Current Protein & Peptide Science 3: 601-614.
Important for binding is a salt bridge (i.e. a charge-complementary interaction) between Lys97 of HEL and Asp27 of the antibody heavy chain, as demonstrated by Molecular Dynamics (MD) simulation.
Monte Carlo (MC) simulation
• "Monte Carlo Simulation" is a term for a general class of optimization methods that use randomization.
• The general idea is, given the current configuration and some figure of merit, e.g. the energy of the folded configuration, to generate a new configuration at random (or semi-random):
– If the energy of the new configuration is lower than that of the old configuration, always accept it as the next configuration;
– if it is higher, accept or reject it with a probability that depends on how much larger the new energy is than the old energy.
∆E = E(new) - E(old)
if ∆E < 0 then accept
else if random[0, 1] < e^(-∆E/kT) then accept
else reject
Boltzmann -- probability of conformation c: $P(c) \propto e^{-E(c)/kT}$
[Figure: probability P plotted against energy E]
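The acceptance rule above can be written as a short Python function. This is a minimal sketch: the function name, the seed and the test energy difference are hypothetical, and energies are expressed in the same units as kT.

```python
import math
import random

def metropolis_accept(delta_e, kT, rng=random):
    """Accept downhill moves always; accept uphill moves with probability exp(-delta_e/kT)."""
    if delta_e < 0:
        return True
    return rng.random() < math.exp(-delta_e / kT)

# An uphill move of kT*ln(2) should be accepted about half the time.
rng = random.Random(2007)   # fixed seed so the run is repeatable
kT = 1.0
delta_e = kT * math.log(2.0)
n_trials = 20000
accepted = sum(metropolis_accept(delta_e, kT, rng) for _ in range(n_trials))
fraction = accepted / n_trials
print(fraction)
```

Raising kT pushes the acceptance probability for uphill moves toward 1, while lowering it makes the search behave more like a pure downhill minimisation.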
Monte Carlo (MC) simulation
• The idea is that by always accepting a better configuration, on average the system will tend to move toward a (local) energy minimum, while conversely, by sometimes accepting worse configurations, the system will be able to "climb" out of a sub-optimal local minimum, and perhaps fall into the basin of attraction of the global minimum.
• The specific algorithms for probabilistically generating and accepting new configurations define the type of "Monte Carlo" algorithm; some common methods are "Metropolis," "Gibbs Sampler," "Heat Bath," "Simulated Annealing," "Great Deluge," etc.
• MC techniques are normally computationally cheaper than MD, so more conformations can be generated
• MC simulations do not incorporate a notion of time!
In many conformational search methods based on Monte Carlo (MC), after an MC move, the system is energy minimised, i.e. put in the lowest local energy conformation, for example by gradient descent (steepest descent).
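The combination of an MC move followed by energy minimisation can be sketched in Python on a toy problem. The one-dimensional double-well energy function, the step size, kT and the seed are all hypothetical; a real conformational search would act on torsion angles or coordinates with a force-field energy.

```python
import math
import random

def energy(x):
    """Toy double-well energy: local minimum near x = +0.96, global minimum near x = -1.04."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def gradient(x):
    return 4.0 * x * (x * x - 1.0) + 0.3

def minimise(x, rate=0.01, n_iter=2000):
    """Steepest descent: repeatedly step against the gradient."""
    for _ in range(n_iter):
        x -= rate * gradient(x)
    return x

def mc_with_minimisation(x0, kT, n_moves, max_step, seed):
    """Random move, then minimise, then apply the Metropolis test to the minimised energies."""
    rng = random.Random(seed)
    x = minimise(x0)
    e = energy(x)
    best_x, best_e = x, e
    for _ in range(n_moves):
        trial = minimise(x + rng.uniform(-max_step, max_step))
        e_trial = energy(trial)
        delta = e_trial - e
        if delta < 0 or rng.random() < math.exp(-delta / kT):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

# Start in the higher (local) minimum and let the search find the lower one.
best_x, best_e = mc_with_minimisation(x0=1.0, kT=0.3, n_moves=50, max_step=1.5, seed=15)
print(best_x, best_e)
```

Because every trial is minimised before it is tested, the search effectively hops between minima rather than wandering over the barriers between them.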
What can be done with MD and MC
Dynamics of proteins
• Protein folding – very difficult
• Protein unfolding – done with MD
• Structure refinement – most frequent application
– After experimental structure elucidation
– After some model building operation
• PPI – Interaction dynamics, Docking
• Hydrophobic patch dynamics
Take home messages
• Experimentally determining protein structures
– X-ray diffraction
• From crystallised protein sample to electron density map
– Structure descriptors: resolution, R-factor
– Nuclear magnetic resonance (NMR)
• Based on atomic nuclear spin
• Produces set of distances between atoms (distance restraints)
• Distances are used to build protein model using Distance Geometry
• Protein dynamics simulation
– Molecular dynamics
• Follows Newton’s equations of motion
• Simulates molecular movements through time
• Very small time steps (typically 2 femtoseconds = $2 \times 10^{-15}$ seconds)
• Protein conformational search
– Monte Carlo
• Conformations are randomly changed
• Uses Metropolis criterion to decide between conformation i and i+1 based on conformational internal energy and the Boltzmann equation
• Has no notion of time, is a conformational search protocol
– Normally faster than MD so more conformations can be generated