2. Introduction to Rosetta and structural modeling (From Ora Schueler-Furman)
• Approaches for structural modeling of proteins
• The Rosetta framework and its prediction modes
• Cartesian and polar coordinates
• Sampling (finding the structure) and scoring (selecting the structure)
Structural Modeling of Proteins - Approaches
Prediction of Structure from Sequence
Flowchart:
1. Compare the query sequence to the nr database.
2. Similar to a sequence of known structure?
– Yes: Homology Modeling (Comparative Modeling)
– No: go to step 3
3. Fits a known fold?
– Yes: Fold Recognition (Threading)
– No: Ab initio prediction
The Rosetta framework and its prediction modes
A short history of Rosetta
• In the beginning: ab initio modeling of protein structure starting from sequence
• Short fragments of known proteins are assembled by a Monte Carlo strategy to yield native-like protein conformations
• Reliable fold identification for short proteins; recently improved to high-resolution models (within 2 Å RMSD)
Example input sequence: ATCSFFGRKLL…..
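The fragment-assembly idea can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the two 3-residue fragments, the made-up target conformation, the `toy_score` function, and the Metropolis acceptance rule (the slide only says "Monte Carlo strategy"). Real fragment libraries are position-specific and the real scoring function is far richer.

```python
import math
import random

def toy_score(torsions, target):
    """Hypothetical stand-in for a scoring function: sum of squared
    angular deviations (degrees) from a made-up target conformation."""
    total = 0.0
    for (phi, psi), (t_phi, t_psi) in zip(torsions, target):
        for a, b in ((phi, t_phi), (psi, t_psi)):
            d = (a - b + 180.0) % 360.0 - 180.0  # wrap difference to [-180, 180)
            total += d * d
    return total

def fragment_assembly(n_res, fragments, score, n_steps=2000, kT=50.0, seed=0):
    """Replace random windows of the chain with library fragments and keep
    or reject each move with a Metropolis test (an assumed acceptance rule)."""
    rng = random.Random(seed)
    frag_len = len(fragments[0])
    current = [(-150.0, 150.0)] * n_res          # start from an extended chain
    current_score = score(current)
    best, best_score = list(current), current_score
    for _ in range(n_steps):
        start = rng.randrange(n_res - frag_len + 1)
        trial = list(current)
        trial[start:start + frag_len] = rng.choice(fragments)
        trial_score = score(trial)
        delta = trial_score - current_score
        # downhill moves are always accepted, uphill moves only sometimes
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = list(current), current_score
    return best, best_score

# Hypothetical fragment pool: one helical and one strand-like 3-mer (phi, psi).
helix = [(-60.0, -45.0)] * 3
strand = [(-120.0, 130.0)] * 3
target = [(-60.0, -45.0)] * 6 + [(-120.0, 130.0)] * 6
start_score = toy_score([(-150.0, 150.0)] * 12, target)
best, best_score = fragment_assembly(12, [helix, strand],
                                     lambda t: toy_score(t, target))
print(start_score, best_score)
```

The sampler works entirely on backbone dihedral angles; no Cartesian coordinates are needed until a structure has to be scored with distance-based terms.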
A short history of Rosetta
The success of the ab initio protocol led to its extension to:
• Protein design
• Design of a new fold: TOP7
• Protein loop modeling; homology modeling
• Protein-protein docking; protein interface design
• Protein-ligand docking
• Protein-DNA interactions; RNA modeling
• Many more, e.g. solving the phase problem in X-ray crystallography
The Rosetta Strategy
• Observation: local sequence preferences bias, but do not uniquely define, the local structure of a protein
• Goal: mimic interplay of local and global interactions that determine protein structure
• Local interactions: fragments derived from known structures (sampled for similar sequences/secondary structure propensity)
• Global (non-local) interactions: buried hydrophobic residues, paired β strands, specific side chain interactions, etc.
The Rosetta Strategy
• Local interactions: fragments
– Fragment library representing accessible local structures for all short sequences in a protein chain, derived from known structures
• Global (non-local) interactions: scoring function
– Derived from conformational statistics of known structures
Scoring and Sampling
The basic assumption in structure prediction
The native structure is located at the global minimum (free) energy conformation (GMEC)
➜ A good energy function can select the correct model among decoys
➜ A good sampling technique can find the GMEC in the rugged landscape
[Figure: energy landscape, energy (E) versus conformation space, with the GMEC marked]
Two-Step Procedure
1. Low-resolution step locates potential minima (fast)
2. Cluster analysis identifies broadest basins in landscape
3. High-resolution step can identify lowest energy minimum in the basins (slow)
[Figure: energy landscape, energy (E) versus conformation space, with the GMEC marked]
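The cluster-analysis step can be sketched as follows. The slide does not say which clustering algorithm is used, so the greedy neighbor-count scheme here is an assumption, and the decoys are hypothetical points in an abstract 2D "conformation space" (a real protocol would compare structures, for example by RMSD). The largest cluster stands in for the broadest basin.

```python
import math

def cluster_decoys(decoys, radius):
    """Greedy clustering (assumed scheme): repeatedly take the decoy with
    the most neighbors within `radius` as a cluster center, then remove
    that cluster and continue with what is left."""
    remaining = list(range(len(decoys)))
    clusters = []
    while remaining:
        best_center, best_members = None, []
        for i in remaining:
            members = [j for j in remaining
                       if math.dist(decoys[i], decoys[j]) <= radius]
            if len(members) > len(best_members):
                best_center, best_members = i, members
        clusters.append((best_center, best_members))
        remaining = [j for j in remaining if j not in best_members]
    return clusters

# Hypothetical low-resolution decoys: a broad basin, a narrow one, an outlier.
decoys = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
clusters = cluster_decoys(decoys, radius=0.5)
for center, members in clusters:
    print("center", center, "members", members)
```

The members of the largest cluster would then be passed on to the slow high-resolution step.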
Structure Representation:
• Equilibrium bonds and angles (Engh & Huber 1991)
• Centroid: average location of the center of mass of the side chain (Centroid | aa, φ, ψ)
PDB format (x, y, z):
ATOM 490 N  GLN A 31 52.013 -87.359  -8.797 1.00  7.06 N
ATOM 491 CA GLN A 31 52.134 -87.762 -10.201 1.00  8.67 C
ATOM 492 C  GLN A 31 51.726 -89.222 -10.343 1.00 10.90 C
ATOM 493 O  GLN A 31 51.015 -89.601 -11.275 1.00  9.63 O
…..
2 ways to represent the protein structure
Cartesian coordinates (x, y, z; PDB format)
+ Intuitive: look at molecules in space
+ Easy calculation of energy score (based on atom-atom distances)
– Difficult to change conformation of structure (while keeping bond length and bond angle unchanged)
Polar coordinates (φ, ψ, ω; equilibrium angles and bond lengths)
+ Compact (3 values/residue)
+ Easy changes of protein structure (turn around one or more dihedral angles)
– Non-intuitive
– Difficult to evaluate energy score (calculation of neighboring matrix complicated)
• Cartesian representation: easy to look at, difficult to move
– Moves do not preserve bond length (and angles in 3D)
• Internal coordinates: easy to move, difficult to see
– Calculation of distances between points is not trivial
Proteins: bond lengths and angles fixed. Only dihedral angles are varied
Solution: toggle
CALCULATE ENERGY - Cartesian coordinates:
Derive distance matrix (neighbor list) for energy score calculation
Transform: build positions in space according to dihedral angles
PDB format (x, y, z):
ATOM 490 N  GLN A 31 52.013 -87.359  -8.797 1.00  7.06 N
ATOM 491 CA GLN A 31 52.134 -87.762 -10.201 1.00  8.67 C
ATOM 492 C  GLN A 31 51.726 -89.222 -10.343 1.00 10.90 C
ATOM 493 O  GLN A 31 51.015 -89.601 -11.275 1.00  9.63 O
…..
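To make the Cartesian side of the toggle concrete, here is a minimal sketch that derives a distance matrix and a neighbor list from the four GLN 31 backbone atoms listed in this section. The function names and the 1.6 Å cutoff are assumptions chosen for illustration; the cutoff is small enough that only covalently bonded pairs count as neighbors.

```python
import math

# x, y, z (in Å) of the N, CA, C and O atoms of GLN 31 from the ATOM records.
atoms = [
    ("N",  (52.013, -87.359,  -8.797)),
    ("CA", (52.134, -87.762, -10.201)),
    ("C",  (51.726, -89.222, -10.343)),
    ("O",  (51.015, -89.601, -11.275)),
]

def distance_matrix(coords):
    """All pairwise Euclidean distances between the given points."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)]
            for i in range(n)]

def neighbor_list(dmat, cutoff):
    """Index pairs (i < j) whose distance is at most `cutoff`."""
    n = len(dmat)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if dmat[i][j] <= cutoff]

dmat = distance_matrix([xyz for _, xyz in atoms])
neighbors = neighbor_list(dmat, cutoff=1.6)  # assumed cutoff
for i, j in neighbors:
    print(atoms[i][0], atoms[j][0], round(dmat[i][j], 2))
```

A distance-based energy term would loop over such a neighbor list rather than over all atom pairs.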
MOVE STRUCTURE - Polar coordinates:
Introduce changes in structure by rotating around dihedral angle(s) (change φ, ψ values)
PDB format (x, y, z):
…
ATOM 490 C  GLN A 31 52.013 -87.359  -8.797 1.00  7.06 N
ATOM 491 N  GLY A 32 52.134 -87.762 -10.201 1.00  8.67 C
ATOM 492 CA GLY A 32 51.726 -89.222 -10.343 1.00 10.90 C
ATOM 493 O  GLY A 32 51.015 -89.601 -11.275 1.00  9.63 O
…..
How to calculate polar from Cartesian coordinates: example φ: C’-N-Cα-C
– define the plane perpendicular to the N-Cα (b2) vector
– calculate the projections of Cα-C (b3) and C’-N (b1) onto the plane
– calculate the angle between the projections
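The three steps above can be sketched in a few lines. The helper names are hypothetical. One detail is a convention choice rather than something stated on the slide: the first bond is taken pointing from the second atom back to the first (N to C’), so that a cis arrangement gives 0° and the sign follows the usual right-handed convention.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle p1-p2-p3-p4 in degrees."""
    b1 = sub(p1, p2)                 # points from p2 back to p1
    b2 = sub(p3, p2)                 # central bond, the rotation axis
    b3 = sub(p4, p3)
    norm = math.sqrt(dot(b2, b2))
    u = tuple(x / norm for x in b2)  # unit vector along the axis
    # Project b1 and b3 onto the plane perpendicular to the axis.
    v = sub(b1, tuple(dot(b1, u) * x for x in u))
    w = sub(b3, tuple(dot(b3, u) * x for x in u))
    # Signed angle between the two projections.
    return math.degrees(math.atan2(dot(cross(u, v), w), dot(v, w)))

# Toy coordinates (not from a real structure) with known answers.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # cis arrangement
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))   # quarter turn
```

For φ, the four points would be the C’ of the previous residue followed by N, Cα and C of the current one.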
Find x,y,z coordinates of C, based on atom positions of C’, N and Cα, and a given φ value (φ: C’-N-Cα-C)
• create Cα-C vector:
– size Cα-C = 1.51 Å (equilibrium bond length)
– angle N-Cα-C = 111° (equilibrium value for N-Cα-C angle)
• rotate vector around N-Cα axis to obtain projections of Cα-C and N-C’ with the desired φ
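A minimal sketch of this placement step, under stated assumptions: the function `place_atom` and its local-frame construction are one common way to do it, not necessarily how Rosetta implements it. The bond length (1.51 Å) and angle (111°) are the values quoted above; the three input positions are borrowed from the ATOM records in this section purely as example numbers, and φ = -60° is an arbitrary choice. A dihedral function is included only to check the round trip.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)

def place_atom(a, b, c, bond, angle_deg, torsion_deg):
    """Position of atom d such that |c-d| = bond, angle b-c-d = angle_deg
    and dihedral a-b-c-d = torsion_deg."""
    theta, tau = math.radians(angle_deg), math.radians(torsion_deg)
    bc = unit(sub(c, b))               # rotation axis
    n = unit(cross(sub(b, a), bc))     # normal of the a-b-c plane
    m = cross(n, bc)                   # in-plane direction, on the side of a
    along = -bond * math.cos(theta)
    in_plane = bond * math.sin(theta) * math.cos(tau)
    out_of_plane = bond * math.sin(theta) * math.sin(tau)
    return tuple(c[i] + along * bc[i] + in_plane * m[i] + out_of_plane * n[i]
                 for i in range(3))

def dihedral(p1, p2, p3, p4):
    """Signed dihedral p1-p2-p3-p4 in degrees, used here as a check."""
    b1, b3 = sub(p1, p2), sub(p4, p3)
    u = unit(sub(p3, p2))
    v = sub(b1, tuple(dot(b1, u) * x for x in u))
    w = sub(b3, tuple(dot(b3, u) * x for x in u))
    return math.degrees(math.atan2(dot(cross(u, v), w), dot(v, w)))

# Example positions for C', N and CA (numbers taken from the ATOM records).
c_prev = (52.013, -87.359, -8.797)
n_atom = (52.134, -87.762, -10.201)
ca = (51.726, -89.222, -10.343)

c_new = place_atom(c_prev, n_atom, ca, bond=1.51, angle_deg=111.0,
                   torsion_deg=-60.0)
print(c_new, math.dist(ca, c_new), dihedral(c_prev, n_atom, ca, c_new))
```

Recomputing the dihedral from the new coordinates returns the requested φ, which is a quick sanity check that the two conversions agree on sign convention.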
Points (0,0), (1,1), (1,2), (2,2), (3,3); angles 45°, 90°, 0°, 45°
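One reading of this 2D example, offered as an assumption, is that the angles are the directions of the segments joining consecutive points, measured from the x axis. Under that reading, the sketch below converts the points to internal coordinates (segment lengths and angles) and rebuilds the points from them; the function names are hypothetical.

```python
import math

points = [(0, 0), (1, 1), (1, 2), (2, 2), (3, 3)]

def to_internal(points):
    """Segment lengths and direction angles (degrees from the x axis)."""
    lengths, angles = [], []
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        lengths.append(math.hypot(x2 - x1, y2 - y1))
        angles.append(math.degrees(math.atan2(y2 - y1, x2 - x1)))
    return lengths, angles

def to_cartesian(start, lengths, angles):
    """Rebuild the points one segment at a time from the first point."""
    coords = [start]
    for length, angle in zip(lengths, angles):
        x, y = coords[-1]
        a = math.radians(angle)
        coords.append((x + length * math.cos(a), y + length * math.sin(a)))
    return coords

lengths, angles = to_internal(points)
rebuilt = to_cartesian(points[0], lengths, angles)
print(angles)
print(rebuilt)
```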
Representation of protein structure
Rosetta folding
3 backbone dihedral angles per residue
Sampling and minimization in TORSIONAL space: change angle and rebuild, starting from changed angle
Build coordinates of structure starting from first atom, according to dihedral angles (and equilibrium bond length and angle)
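The "change angle and rebuild" idea can be sketched as below. This is a simplified illustration, not Rosetta code: `place_atom` and `build_backbone` are hypothetical helpers, only N, Cα and C atoms are built, and apart from the Cα-C length (1.51 Å) and N-Cα-C angle (111°) quoted earlier, the bond lengths and angles are approximate textbook values assumed here.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)

def place_atom(a, b, c, bond, angle_deg, torsion_deg):
    """Atom d with |c-d| = bond, angle b-c-d and dihedral a-b-c-d as given."""
    theta, tau = math.radians(angle_deg), math.radians(torsion_deg)
    bc = unit(sub(c, b))
    n = unit(cross(sub(b, a), bc))
    m = cross(n, bc)
    k = (-bond * math.cos(theta),
         bond * math.sin(theta) * math.cos(tau),
         bond * math.sin(theta) * math.sin(tau))
    return tuple(c[i] + k[0] * bc[i] + k[1] * m[i] + k[2] * n[i]
                 for i in range(3))

# Assumed equilibrium geometry (Å, degrees); approximate values.
BOND = {"N-CA": 1.46, "CA-C": 1.51, "C-N": 1.33}
ANGLE = {"N-CA-C": 111.0, "CA-C-N": 116.0, "C-N-CA": 121.0}

def build_backbone(torsions):
    """Build N, CA, C atoms from (phi, psi, omega) per residue, starting at
    the first atom. The first phi and the last psi and omega are unused."""
    t = math.radians(ANGLE["N-CA-C"])
    n = (0.0, 0.0, 0.0)
    ca = (BOND["N-CA"], 0.0, 0.0)
    c = (ca[0] - BOND["CA-C"] * math.cos(t), BOND["CA-C"] * math.sin(t), 0.0)
    atoms = [n, ca, c]
    for i in range(1, len(torsions)):
        _, psi_prev, omega_prev = torsions[i - 1]
        phi = torsions[i][0]
        n_prev, ca_prev, c_prev = atoms[-3], atoms[-2], atoms[-1]
        n = place_atom(n_prev, ca_prev, c_prev, BOND["C-N"], ANGLE["CA-C-N"], psi_prev)
        ca = place_atom(ca_prev, c_prev, n, BOND["N-CA"], ANGLE["C-N-CA"], omega_prev)
        c = place_atom(c_prev, n, ca, BOND["CA-C"], ANGLE["N-CA-C"], phi)
        atoms += [n, ca, c]
    return atoms

def bond_lengths(atoms):
    return [math.dist(p, q) for p, q in zip(atoms, atoms[1:])]

helix = [(-60.0, -45.0, 180.0)] * 4
moved = list(helix)
moved[1] = (-60.0, 130.0, 180.0)   # change psi of the second residue only
before = build_backbone(helix)
after = build_backbone(moved)
print(bond_lengths(before)[:3], bond_lengths(after)[:3])
```

Atoms upstream of the changed angle keep their positions, everything downstream moves as a rigid body, and every bond length is preserved. This is the property that a direct move in Cartesian coordinates does not give for free.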