DATA ANALYSIS: Fitting RDC Data to Structure
Feb 02, 2016
Software you should have:
Open Babel, to convert between structure file formats:
(http://openbabel.org/wiki/Install)
NOTEPAD++, an excellent text editor for Windows:
http://notepad-plus-plus.org/download
Software for performing SVD fitting of structures to RDC data:
MSpin by Armando Navarro-Vazquez, commercialized by Mestrelab Research, Santiago de Compostela, SPAIN.
http://mestrelab.com/software/mspin/
PALES by Markus Zweckstetter, Max Planck Institute for Biophysical Chemistry, Göttingen, GERMANY.
http://www.mpibpc.mpg.de/groups/zweckstetter/_links/software_pales.htm
To fit the data to a judiciously chosen set of structures (covering the whole configurational space of your molecule), you need more than five independent RDCs, that is, RDCs from non-parallel internuclear vectors.
At least three of them have to be out of the plane.
You need a minimum of five RDCs to calculate the alignment tensor (A), because the tensor has five independent elements, but you need more than five to perform the fitting by Singular Value Decomposition (SVD) with either MSpin or PALES.
Each fitting gives you a quality factor Q (Cornilescu Q factor, J. Am. Chem. Soc. 1998, 120, 6836-6837). The lower the Q factor, the better the fit.
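The fitting step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used by MSpin or PALES. It assumes that all couplings are of the same type (for example one-bond C-H), so that the dipolar constant can be absorbed into the fitted tensor elements, and it takes Q as rms(Dobs - Dcalc) / rms(Dobs). The function names are hypothetical.

```python
import numpy as np

def design_matrix(vectors):
    """One row per internuclear vector, one column per independent tensor element."""
    v = np.asarray(vectors, dtype=float)
    u = v / np.linalg.norm(v, axis=1, keepdims=True)  # unit vectors
    x, y, z = u[:, 0], u[:, 1], u[:, 2]
    # Traceless symmetric tensor: Azz = -(Axx + Ayy), leaving five unknowns
    # ordered as (Axx, Ayy, Axy, Axz, Ayz).
    return np.column_stack([x * x - z * z, y * y - z * z,
                            2 * x * y, 2 * x * z, 2 * y * z])

def fit_tensor(vectors, d_obs):
    """Least-squares tensor elements (scaled, in Hz) and back-calculated RDCs."""
    m = design_matrix(vectors)
    # np.linalg.pinv builds the pseudo-inverse from the SVD of m.
    params = np.linalg.pinv(m) @ np.asarray(d_obs, dtype=float)
    return params, m @ params

def q_factor(d_obs, d_calc):
    """rms(Dobs - Dcalc) / rms(Dobs)."""
    d_obs = np.asarray(d_obs, dtype=float)
    d_calc = np.asarray(d_calc, dtype=float)
    return np.sqrt(np.sum((d_obs - d_calc) ** 2) / np.sum(d_obs ** 2))
```

With exactly five RDCs the system is solved exactly and Q is zero by construction, which is why more than five couplings are needed before Q says anything about the structure.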
Fitting structures in MSpin is very straightforward. You can use the tutorial from the Mestrelab website. MSpin reads any type of PDB file and also the XYZ format.
PALES, on the other hand, requires its own variant of the PDB file format.
PALES file format
The structure file:
For small molecules, PALES reads a structure file that is an adaptation of the original PDB file format used for proteins. Instead of having N residues (e.g. amino acids), the file represents a single residue (N=1).
PRACTICAL EXAMPLES
LUDARTIN
- gastric cytoprotective effect
- inhibits the aromatase enzyme
- first isolated in 1972 from Artemisia carruthii by Geissman and Griffin as a mixture with its 11,13-dihydro derivative
- the stereochemistry displayed in 1 was assigned on the basis of the chemical shift and coupling constants of H-6 and the chemical shift of H-15
[Figure: chemical structures 1 and 2, with carbon numbering 1-15]
- Giordano, O. S.; Guerreiro, E.; Pestchanker, M. J.; Guzman, J.; Pastor, D.; Guardia, T. J. Nat. Prod. 1990, 53, 803-9.
- Blanco, J. G.; Gil, R. R.; Alvarez, C. I.; Patrito, L. C.; Genti-Raimondi, S.; Flury, A. FEBS Lett. 1997, 409, 396-400.
- Geissman, T. A.; Griffin, T. S. Phytochemistry 1972, 11, 833-5.
[Figure: scheme showing chemical structures 1 to 4, with carbon numbering 1-15]
Sosa, V. E.; Oberti, J. C.; Gil, R. R.; Ruveda, E. A.; Goedken, V. L.; Gutierrez, A. B.; Herz, W. Phytochemistry 1989, 28, 1925-9.
Determination of the Stereochemistry of Ludartin (1) Using Chemical Transformations
RMS fit and overlay of the 3D structures of ludartin (1) and 3,4-β-epoxy-ludartin (2) using only the heavy atoms belonging to the five-membered ring, the seven-membered ring and the lactone ring.
RMS error: 0.039 Å
H-3 and CH3-15 are 27° above and below the plane of the five-membered ring.
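An RMS fit and overlay of two structures over a chosen subset of atoms can be sketched with the Kabsch algorithm. This is a generic illustration, not necessarily the procedure used to obtain the 0.039 Å value above. The function name is hypothetical, and the two coordinate arrays are assumed to list the same atoms in the same order.

```python
import numpy as np

def kabsch_rmsd(p, q):
    """Superimpose coordinates p onto q (both N x 3) and return (rmsd, rotation)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p0 = p - p.mean(axis=0)          # remove translation
    q0 = q - q.mean(axis=0)
    h = p0.T @ q0                    # 3 x 3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    # Guard against an improper rotation (mirror image).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    p_rot = p0 @ rot.T               # rotated copy of p
    rmsd = np.sqrt(np.mean(np.sum((p_rot - q0) ** 2, axis=1)))
    return rmsd, rot
```

To restrict the fit to the ring heavy atoms, pass only those rows of each coordinate array.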
PALES Presentation
HETATM    1  C       1      -1.413  -1.626   0.000
HETATM    2  C       2      -1.342  -0.119   0.000
HETATM    3  N       3      -2.536   0.577  -0.000
HETATM    4  C       4      -2.573   2.004   0.000
HETATM    5  O       5      -0.252   0.487  -0.000
HETATM    6  H       6      -0.373  -2.033  -0.000
HETATM    7  H       7      -1.947  -1.993   0.910
HETATM    8  H       8      -1.947  -1.993  -0.910
HETATM    9  H       9      -3.390   0.079   0.000
HETATM   10  H      10      -3.637   2.354  -0.000
HETATM   11  H      11      -2.048   2.408   0.908
HETATM   12  H      12      -2.048   2.408  -0.908
CONECT    1    2    6    7    8
CONECT    2    1    3    5
CONECT    3    2    4    9
CONECT    4    3   10   11   12
CONECT    5    2
CONECT    6    1
CONECT    7    1
CONECT    8    1
CONECT    9    3
CONECT   10    4
CONECT   11    4
CONECT   12    4
END
This is the methylacetamide PDB file created by HyperChem. PALES cannot read this file: it does not recognize HETATM.
Save the structure generated by HyperChem in hin format. Convert the hin file to a PDB file using Open Babel.
COMPND    C:\Users\rgil\SMASH 2010 RDCs Workshop\Methyl Acetamide\methylacetamide.hin
AUTHOR    GENERATED BY OPEN BABEL 2.2.3
HETATM    1  C   LIG     1      -1.413  -1.626   0.000  1.00  0.00           C
HETATM    2  C   LIG     1      -1.342  -0.119   0.000  1.00  0.00           C
HETATM    3  N   LIG     1      -2.536   0.577  -0.000  1.00  0.00           N
HETATM    4  C   LIG     1      -2.573   2.004   0.000  1.00  0.00           C
HETATM    5  O   LIG     1      -0.252   0.487  -0.000  1.00  0.00           O
HETATM    6  H   LIG     1      -0.373  -2.033  -0.000  1.00  0.00           H
HETATM    7  H   LIG     1      -1.947  -1.993   0.910  1.00  0.00           H
HETATM    8  H   LIG     1      -1.947  -1.993  -0.910  1.00  0.00           H
HETATM    9  H   LIG     1      -3.390   0.079   0.000  1.00  0.00           H
HETATM   10  H   LIG     1      -3.637   2.354  -0.000  1.00  0.00           H
HETATM   11  H   LIG     1      -2.048   2.408   0.908  1.00  0.00           H
HETATM   12  H   LIG     1      -2.048   2.408  -0.908  1.00  0.00           H
CONECT    1    2    6    7    8
CONECT    2    1    3    5    5
CONECT    3    2    4    9
CONECT    4    3   10   11   12
CONECT    5    2    2
CONECT    6    1
CONECT    7    1
CONECT    8    1
CONECT    9    3
CONECT   10    4
CONECT   11    4
CONECT   12    4
MASTER        0    0    0    0    0    0    0    0   12    0   12    0
END
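PDB file created by Open Babel from the hin file.
The information inside the blue boxes (the COMPND and AUTHOR header lines, the element column at the end of each HETATM line, and the CONECT and MASTER records) is not used by PALES. ERASE IT.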
HETATM    1  C   LIG     1      -1.413  -1.626   0.000  1.00  0.00
HETATM    2  C   LIG     1      -1.342  -0.119   0.000  1.00  0.00
HETATM    3  N   LIG     1      -2.536   0.577  -0.000  1.00  0.00
HETATM    4  C   LIG     1      -2.573   2.004   0.000  1.00  0.00
HETATM    5  O   LIG     1      -0.252   0.487  -0.000  1.00  0.00
HETATM    6  H   LIG     1      -0.373  -2.033  -0.000  1.00  0.00
HETATM    7  H   LIG     1      -1.947  -1.993   0.910  1.00  0.00
HETATM    8  H   LIG     1      -1.947  -1.993  -0.910  1.00  0.00
HETATM    9  H   LIG     1      -3.390   0.079   0.000  1.00  0.00
HETATM   10  H   LIG     1      -3.637   2.354  -0.000  1.00  0.00
HETATM   11  H   LIG     1      -2.048   2.408   0.908  1.00  0.00
HETATM   12  H   LIG     1      -2.048   2.408  -0.908  1.00  0.00
END
Edit the file with a good text editor accordingly. I recommend NOTEPAD++
PALES is written in C and is generally very forgiving in terms of format, i.e. it does not care whether you use spaces, tabs, etc. However, Windows editors such as NOTEPAD or WORDPAD may introduce characters that make the file unreadable by PALES. This is not always the case, but it may happen. The only requirement is that the naming convention in the PDB file and in the RDC table is identical. In addition, PALES only takes into account lines in the PDB file starting with "ATOM". The MAC and LINUX versions of the GUI will also include an automatic reformatting step for the PDB file, so at least in the GUI these problems should not show up.
Editing:
1) Replace all HETATM with ATOM
2) Insert the word TER before END
3) Append the atom number to each atom label (C, N, O, H, etc.), so that C becomes C1, N becomes N3, and so on
4) See edited file in next slide
ATOM      1  C1  LIG     1      -1.413  -1.626   0.000  1.00  0.00
ATOM      2  C2  LIG     1      -1.342  -0.119   0.000  1.00  0.00
ATOM      3  N3  LIG     1      -2.536   0.577  -0.000  1.00  0.00
ATOM      4  C4  LIG     1      -2.573   2.004   0.000  1.00  0.00
ATOM      5  O5  LIG     1      -0.252   0.487  -0.000  1.00  0.00
ATOM      6  H6  LIG     1      -0.373  -2.033  -0.000  1.00  0.00
ATOM      7  H7  LIG     1      -1.947  -1.993   0.910  1.00  0.00
ATOM      8  H8  LIG     1      -1.947  -1.993  -0.910  1.00  0.00
ATOM      9  H9  LIG     1      -3.390   0.079   0.000  1.00  0.00
ATOM     10  H10 LIG     1      -3.637   2.354  -0.000  1.00  0.00
ATOM     11  H11 LIG     1      -2.048   2.408   0.908  1.00  0.00
ATOM     12  H12 LIG     1      -2.048   2.408  -0.908  1.00  0.00
TER
END
NOTE: The changes are printed in red
The three labelled columns are ATOMNAME, RESNAME and RESID:
ATOMNAME is the numbered atom label in the structure file (C1, H6, etc.).
RESNAME is a fake name to simulate a residue (e.g. Lys, Ala, etc., in a peptide), LIG in this case, but you can use any name as long as you use the same name in the RDC input table.
RESID is the number of the residue in the peptide sequence; for a small molecule it is just 1.
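The manual editing steps (HETATM to ATOM, numbered atom labels, TER before END, unused records dropped) can also be scripted. The sketch below is a hypothetical helper, not part of PALES or Open Babel. It assumes whitespace-separated HETATM lines laid out like the Open Babel output shown earlier (record, serial, element label, residue name, residue number, x, y, z, occupancy, B-factor), and it relies on the statement above that PALES is forgiving about spacing.

```python
def to_pales_pdb(pdb_text):
    """Return a PALES-style structure file built from Open Babel PDB text."""
    atom_lines = []
    for line in pdb_text.splitlines():
        fields = line.split()
        # Keep only coordinate records; COMPND, AUTHOR, CONECT, MASTER
        # and END lines are skipped.
        if len(fields) < 10 or fields[0] not in ("HETATM", "ATOM"):
            continue
        serial, label, resname, resid = fields[1:5]
        x, y, z, occ, bfac = fields[5:10]   # a trailing element column is ignored
        name = f"{label}{serial}"           # C + 1 -> C1, H + 10 -> H10
        atom_lines.append(
            f"ATOM  {serial:>5} {name:<4} {resname:>3} {resid:>5}    "
            f"{x:>8}{y:>8}{z:>8}{occ:>6}{bfac:>6}"
        )
    return "\n".join(atom_lines + ["TER", "END"]) + "\n"
```

Writing the result with Python's plain text mode avoids the stray characters that some Windows editors may introduce. Do not run the helper twice on the same file, since it would append the serial number to labels that are already numbered.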
These are the codes used by the PALES RDC input file. See next slide.
The RDCs table file in PALES
VARS   RESID_I RESNAME_I ATOMNAME_I RESID_J RESNAME_J ATOMNAME_J D DD W
FORMAT %5d %6s %6s %5d %6s %6s %9.3f %9.3f %.2f
    1    LIG     C1     1    LIG     H6    -6.800     1.000 1.00
    1    LIG     C1     1    LIG     H5     0.800     1.000 1.00
    1    LIG     C1     1    LIG     H8   -25.230     1.000 1.00
    1    LIG     N3     1    LIG     H9   -15.6       1.000 1.00
    1    LIG     C4     1    LIG     H8   -75.5       1.000 1.00
    1    LIG     C4     1    LIG    H10     7.510     1.000 1.00
    1    LIG     C4     1    LIG    H11    -4.530     1.000 1.00
    1    LIG     C4     1    LIG    H12   123.0       1.000 1.00
Column D: RDCs (Hz). Column DD: experimental error (Hz).
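A table in this layout can be generated from a list of couplings rather than typed by hand, which keeps the atom names consistent with the structure file. The sketch below reuses the VARS and FORMAT lines from the table above. The function name, its default error and weight values, and the blank line after the header are assumptions made for illustration.

```python
HEADER = (
    "VARS   RESID_I RESNAME_I ATOMNAME_I RESID_J RESNAME_J ATOMNAME_J D DD W\n"
    "FORMAT %5d %6s %6s %5d %6s %6s %9.3f %9.3f %.2f\n"
)
# Same conversion specifiers as the FORMAT line of the table.
ROW_FORMAT = "%5d %6s %6s %5d %6s %6s %9.3f %9.3f %.2f"

def rdc_table(couplings, resname="LIG", resid=1, error=1.0, weight=1.0):
    """Build the table text from (atom_i, atom_j, rdc_in_hz) tuples."""
    rows = [
        ROW_FORMAT % (resid, resname, atom_i, resid, resname, atom_j, rdc, error, weight)
        for atom_i, atom_j, rdc in couplings
    ]
    return HEADER + "\n" + "\n".join(rows) + "\n"

if __name__ == "__main__":
    # Two of the couplings listed in the table above.
    print(rdc_table([("C1", "H6", -6.8), ("N3", "H9", -15.6)]))
```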
Fitting data to structure using command line PALES in Windows:
To perform an SVD fitting in PALES you have to execute the following command:
pales -bestFit -pdb name.pdb -inD rdc_file.tab -outD output.SVD.file
name.pdb is the name of the PDB file.
rdc_file.tab is the name of the RDC data file.
The output is very well explained in the PALES documentation and in the Nature Protocols paper (see below). Right now we are only interested in the Q factor. The lower the Q factor, the better the SVD fitting.
See also:
Nature Protocols 2008 3(4) 679-690
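When many candidate structures have to be fitted, the command can be assembled and launched from a script. The sketch below uses only the flags shown in the command above (-bestFit, -pdb, -inD, -outD). The function names are hypothetical, and it assumes that a pales executable is available on the PATH.

```python
import subprocess

def pales_bestfit_command(pdb_file, rdc_file, out_file, executable="pales"):
    """Argument list for one SVD fit of a structure to an RDC table."""
    return [executable, "-bestFit", "-pdb", pdb_file,
            "-inD", rdc_file, "-outD", out_file]

def run_pales_bestfit(pdb_file, rdc_file, out_file, executable="pales"):
    """Run the fit; raises CalledProcessError if PALES exits with an error."""
    cmd = pales_bestfit_command(pdb_file, rdc_file, out_file, executable)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.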
PERSPECTIVES
Future Directions on RDCs in Small Molecules
- Alignment Media: more variety (PEO), more deuteration, chiral gels for all solvents, …
- Low Temperature Alignment Media to resolve conformational averaging.
- Commercialization: gels, stretching apparatus, software, etc.
- Adapted Measurement Techniques: novel pulse sequences in conjunction with scaling of RDCs
- Software: individual error treatment, incorporation of RCSA, automation of structure generation, ab initio and MD methods for treating flexible molecules, prediction of alignment (absolute configuration)
- Technical Improvements: gradient shimming
Revisit Underutilized Experiments
Selective 1D NOESY
NOE buildup curves of H10 while selectively exciting H4 axial of cortisol (1) in DMSO-d6: (a) the ‘‘raw’’ NOE buildup curve; (b) the NOE buildup curve obtained with PANIC (peak amplitude normalization for improved cross-relaxation)
[Figure: series of 1D NOESY spectra of ludartin with H-3 selectively excited; mixing times from 100 ms to 600 ms in steps of 100 ms; chemical shift axis from 1.1 to 3.9 ppm; labelled NOE peaks: H-2a, H-2b and H-15. NOE ∝ 1/r^6]
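Because the NOE scales as 1/r^6, the initial slopes of two buildup curves give a distance ratio, and a known reference distance turns that ratio into a distance. The sketch below assumes buildup data in the linear (initial-rate) regime, for example after PANIC normalization. The function names are hypothetical, and the reference distance must be supplied by the user.

```python
import numpy as np

def initial_rate(mixing_times, intensities):
    """Slope of a straight line through the origin fitted to a buildup curve."""
    t = np.asarray(mixing_times, dtype=float)
    i = np.asarray(intensities, dtype=float)
    return float(np.sum(t * i) / np.sum(t * t))

def distance_from_buildup(t, noe, t_ref, noe_ref, r_ref):
    """Distance from r = r_ref * (rate_ref / rate)**(1/6)."""
    rate = initial_rate(t, noe)
    rate_ref = initial_rate(t_ref, noe_ref)
    return r_ref * (rate_ref / rate) ** (1.0 / 6.0)
```

The sixth root makes the result fairly insensitive to intensity errors: a 10% error in the rate ratio changes the distance by less than 2%.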