
MSI February 2012 Practical examples

Contents

1 Introduction
2 System preparation
3 Practical Session I: Molecular dynamics
    3.1 MD with NAMD
        3.1.1 Creating a PSF file for PDB 1i45
        3.1.2 Solvating the structure
        3.1.3 Running the simulations
    3.2 MD with MOLARIS
        3.2.1 Preparing the PDB
        3.2.2 Running an interactive MOLARIS session
    3.3 MD with ADUN
        3.3.1 Preparing the PDB
    3.4 Running simulations with ADUN
        3.4.1 Analyzing the results
4 Practical Session II: Solvation, pKa, FEP
    4.1 Running solvation and pKa simulations using PDLD/S-LRA
    4.2 LIE runs with ADUN
5 Practical Session III: enzymatic reactivity with EVB
    5.1 EVB for enzymatic reactivity analysis
A Running in the luke cluster
B Extra material for NAMD
    B.1 Set up
    B.2 VMD: wat_sphere.tcl
    B.3 VMD: sod2pot.tcl
    B.4 VMD: 1i45_ws_eq.conf
    B.5 VMD: 1i45_wb_eq.conf
C Extra material for ADUN
    C.1 Set up
D Additional tools

©2008-2012 Jordi Villà-Freixa, 2010 Nils Drechsel

Molecular Simulations: a Practical Approach

1 Introduction

In this session we will practice Molecular Dynamics simulations using three different programs, with the help of VMD, Chimera and R to visualize and analyze the data obtained. We will demonstrate the methods on triosephosphate isomerase (TIM), an enzyme catalyzing the reversible interconversion of the triose phosphate isomers dihydroxyacetone phosphate (DHAP) and D-glyceraldehyde 3-phosphate (GAP). Apart from its obvious interest, the protein has some characteristics that make it a good example for this class (it does not contain disulfide bonds, it is an enzyme with multiple free energy barriers, it is a dimer...).

The programs we will learn to use are

• NAMD [Phillips et al., 2005], a popular high-performance MD program that is tightly linked to the VMD [Humphrey et al., 1996] visualization program. Useful for running standard MD runs with periodic boundary conditions and popular force fields like AMBER or CHARMM.¹

• MOLARIS [Lee et al., 1993], a program containing advanced algorithms for spherical boundary conditions and the PDLD/S-LRA model for semimacroscopic solvation calculations.²

• ADUN [Johnston et al., 2005], a high-performance, framework-based program for MD simulations, including a plugin system for adding complex algorithms. We will use it in Section 4 for LIE calculations.³

We will run these programs on two different platforms. On the one hand, the use of NAMD will be demonstrated assuming a Mac OS X based computer, although closely analogous commands can be used on a Unix machine. On the other hand, ADUN and MOLARIS will be run remotely on a Linux cluster running Fedora 8. Additionally, interested participants can download an experimental live CD for ADUN from http://susegallery.com/a/hvXWpn/adun-user (free registration is required).

Throughout this document, we will use THIS COLOR when showing bash scripting code and THIS COLOR when showing Tcl scripting for VMD.

2 System preparation

First, we need to obtain the PDB files we will be using. TIM is known to explore two conformations that influence its ability to bind the substrate. We will be using the following PDB codes: 1I45 for the open [Rozovsky et al., 2001] and 1NEY for the closed [Jogl et al., 2003] conformation. We will first check that these two files correspond precisely to the same protein sequence. From the PDB we can get the sequences of the A chains (the B chain is identical in this dimeric protein) by running:

¹ http://www.ks.uiuc.edu/Research/namd/
² http://futura.usc.edu/programs/index.html#molaris
³ adun.imim.es. Subscribing to the adun-users mailing list (https://mail.gna.org/listinfo/adun-users) is highly recommended, to be aware of new improvements and to report problems.



    $ wget --output-document=1NEY.fasta \
        http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=fastachain\&structureId=1NEY\&chainId=A

    $ wget --output-document=1I45.fasta \
        http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=fastachain\&structureId=1I45\&chainId=A

    $ wget --output-document=1YPI.fasta \
        http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=fastachain\&structureId=1YPI\&chainId=A

where $ refers to the system prompt. From now on, we will omit the $ sign to simplify the writing. Notice we have also downloaded the sequence for the PDB code 1YPI [Lolis et al., 1990], which corresponds to the wild type protein, for comparison. We can then put the three sequences in the same file

    cat 1NEY.fasta 1I45.fasta 1YPI.fasta > seqs.fasta

and run clustalw to obtain

    CLUSTAL W (1.83) multiple sequence alignment

    1NEY_A|PDBID|CHAIN|SEQUENCE -ARTFFVGGNFKLNGSKQSIKEIVERLNTASIPENVEVVICPPATYLDYS
    1I45_A|PDBID|CHAIN|SEQUENCE MARTFFVGGNFKLNGSKQSIKEIVERLNTASIPENVEVVICPPATYLDYS
    1YPI_A|PDBID|CHAIN|SEQUENCE -ARTFFVGGNFKLNGSKQSIKEIVERLNTASIPENVEVVICPPATYLDYS
                                 *************************************************

    1NEY_A|PDBID|CHAIN|SEQUENCE VSLVKKPQVTVGAQNAYLKASGAFTGENSVDQIKDVGAKYVILGHSERRS
    1I45_A|PDBID|CHAIN|SEQUENCE VSLVKKPQVTVGAQNAYLKASGAFTGENSVDQIKDVGAKYVILGHSERRS
    1YPI_A|PDBID|CHAIN|SEQUENCE VSLVKKPQVTVGAQNAYLKASGAFTGENSVDQIKDVGAKWVILGHSERRS
                                ***************************************:**********

    1NEY_A|PDBID|CHAIN|SEQUENCE YFHEDDKFIADKTKFALGQGVGVILCIGETLEEKKAGKTLDVVERQLNAV
    1I45_A|PDBID|CHAIN|SEQUENCE YFHEDDKFIADKTKFALGQGVGVILCIGETLEEKKAGKTLDVVERQLNAV
    1YPI_A|PDBID|CHAIN|SEQUENCE YFHEDDKFIADKTKFALGQGVGVILCIGETLEEKKAGKTLDVVERQLNAV
                                **************************************************

    1NEY_A|PDBID|CHAIN|SEQUENCE LEEVKDFTNVVVAYEPVWAIGTGLAATPEDAQDIHASIRKFLASKLGDKA
    1I45_A|PDBID|CHAIN|SEQUENCE LEEVKDFTNVVVAYEPVWAIGTGLAATPEDAQDIHASIRKFLASKLGDKA
    1YPI_A|PDBID|CHAIN|SEQUENCE LEEVKDWTNVVVAYEPVWAIGTGLAATPEDAQDIHASIRKFLASKLGDKA
                                ******:*******************************************

Notice here that the two PDBs we will work with are indeed mutated structures relative to the wild type 1YPI. This is acceptable, as they have been shown to have the same activity as the wild type protein [Rozovsky et al., 2001, Jogl et al., 2003].
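The substituted positions can also be located with a few lines of shell. The sketch below is only illustrative: seq_diff is a hypothetical helper (it is not part of clustalw or of any program used in this course), and it assumes two already aligned sequences of equal length, passed as plain strings.

```shell
# seq_diff A B
# Print "column residueA residueB" for every 1-based column where two
# aligned sequences of equal length differ. Hypothetical helper, for
# illustration only.
seq_diff() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (length(a) != length(b)) {
            print "sequences must have the same length" > "/dev/stderr"
            exit 1
        }
        for (i = 1; i <= length(a); i++) {
            x = substr(a, i, 1)
            y = substr(b, i, 1)
            if (x != y) print i, x, y
        }
    }'
}
```

For instance, seq_diff DVGAKYVIL DVGAKWVIL prints "6 Y W", the Tyr/Trp difference seen in the second alignment block. The reported column is relative to the strings given, so an offset must be added to recover the residue number.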

We can obtain the PDB files themselves in several ways. We can inspect the contents of the files by accessing the PDB⁴ and obtain the files from those pages or, more conveniently, by typing:

    wget ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb1ney.ent.gz
    wget ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb1i45.ent.gz
    wget ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb1ypi.ent.gz

The downloaded files are gzip-compressed, so they must be uncompressed before use:

    gunzip pdb1ney.ent.gz pdb1i45.ent.gz pdb1ypi.ent.gz

Upon inspection of the two PDB files, we see that 1NEY, which is in the closed form, contains the substrate DHAP, while 1I45 contains no substrate. In addition, in both cases the actual mutations include a fluorinated variant of Trp 168: Trp90Tyr and Trp157Phe, with 5’-fluorotryptophan at Trp168 (see Figure 1). We could keep working with the fluorinated version, but it is better to replace it with the original Trp, as the fluorinated residue was only used for experimental monitoring of loop 6 and we do not need it here.

⁴ Try accessing http://www.rcsb.org/pdb/explore/explore.do?structureId=1I45 and http://www.rcsb.org/pdb/explore/explore.do?structureId=1NEY



Figure 1: 5’-fluorotryptophan

3 Practical Session I: Molecular dynamics

3.1 MD with NAMD

The first task is to run regular MD simulations with NAMD with periodic boundary conditions using the CHARMM force field with CMAP, an energy correction recently added to the CHARMM force field. The NAMD team has produced a series of excellent tutorials that can be found at http://www.ks.uiuc.edu/Training/Tutorials/. Here we will adapt the general NAMD tutorial to the simulation of our two proteins, 1I45 and 1NEY, to analyze their behavior. To run NAMD we need:

• a PDB file

• a protein structure file (PSF), which stores the information about the topology of the protein structure

• a force field parameter file (for example the file toppar_c35b2_c36a2.tgz obtained from http://mackerell.umaryland.edu/CHARMM_ff_params.html)

• a configuration or input file, specifying what we want the program to do

Figure 2 shows the way we will proceed. More details are given below and in the original NAMD tutorial.

3.1.1 Creating a PSF file for PDB 1i45

The first task is to split the PDB files into their two subunits, as this is needed by the psfgen program.

    grep ' A ' pdb1i45.ent | grep -v 'HOH' > 1i45A.pdb
    grep ' B ' pdb1i45.ent | grep -v 'HOH' > 1i45B.pdb

We need to do this so that the two monomers become separate segments of the PSF file we will generate. In principle, psfgen should do the rest for us. Thus, we simply run



Figure 2: Flowchart of the process of running a NAMD run, specifying the different tools to be used to generate the output. Extracted from the NAMD tutorial at http://www.ks.uiuc.edu/Training/Tutorials/.

    alias vmdrun=/Applications/VMD\ 1.8.7beta3.app/Contents/MacOS/startup.command
    vmdrun -dispdev text -eofexit < 1i45_pgn.tcl

or its equivalent in Windows or Unix, where the 1i45_pgn.tcl file contains:

    package require psfgen
    resetpsf
    # loading the topology
    topology top_all27_prot_lipid.rtf
    # creating the segment for the first monomer
    if {1} {
        # aliasing some names
        pdbalias residue HIS HSE
        pdbalias residue FTR TRP
        pdbalias atom ILE CD1 CD
        segment A {pdb 1i45A.pdb}
        coordpdb 1i45A.pdb A
    }
    # creating the segment for the second monomer
    if {1} {
        # aliasing some names
        pdbalias residue HIS HSE
        pdbalias residue FTR TRP
        pdbalias atom ILE CD1 CD
        segment B {pdb 1i45B.pdb}
        coordpdb 1i45B.pdb B
    }
    guesscoord
    writepdb 1i45.pdb
    writepsf 1i45.psf



Notice the generation of the two segments. Equivalently, we can source the 1i45_pgn.tcl file from within VMD, by opening Extensions → Tk Console and typing there:

    cd <where the pgn file is>
    source 1i45_pgn.tcl

Inspection of the generated 1i45.pdb and 1i45.psf files shows that the psfgen program did a good job in assigning the patches capping the two chains:

    PSF CMAP

           9 !NTITLE
     REMARKS original generated structure x-plor psf file
     REMARKS 4 patches were applied to the molecule.
     REMARKS topology ./toppar/top_all27_prot_lipid.rtf
     REMARKS segment A { first NTER; last CTER; auto angles dihedrals }
     REMARKS segment B { first NTER; last CTER; auto angles dihedrals }
     REMARKS defaultpatch NTER A:2
     REMARKS defaultpatch CTER A:248
     REMARKS defaultpatch NTER B:2
     REMARKS defaultpatch CTER B:248

        7542 !NATOM
           1 A 2 ALA N   NH3 -0.300000 14.0070 0
           2 A 2 ALA HT1 HC   0.330000  1.0080 0
           3 A 2 ALA HT2 HC   0.330000  1.0080 0

    (...)

Exercise 1. Prepare the PSF files for the 1NEY and 1YPI structures.
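When splitting other structures, note that grep ' A ' matches the pattern anywhere in a line, including remarks. A column-based filter is a possible alternative. The sketch below is an assumption-laden illustration, not part of the NAMD tutorial: split_chain is a hypothetical helper that relies on the fixed-column PDB layout (record name in columns 1-6, residue name in columns 18-20, chain identifier in column 22).

```shell
# split_chain CHAIN < in.pdb > out.pdb
# Keep only the ATOM/HETATM records of one chain and skip waters (HOH).
# Hypothetical helper; relies on the fixed-column PDB format.
split_chain() {
    awk -v chain="$1" '
        /^(ATOM  |HETATM)/ &&
        substr($0, 22, 1) == chain &&
        substr($0, 18, 3) != "HOH"
    '
}
```

A possible use is split_chain A < pdb1ney.ent > 1neyA.pdb. Unlike the grep version, ligands on the selected chain are kept, so they still have to be handled separately if they are not wanted.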

3.1.2 Solvating the structure

NAMD offers two alternatives for solvating the structure prior to the MD runs. One can solvate the protein in a sphere and treat the solvent using spherical boundary conditions (SBC), or one can use periodic boundary conditions (PBC) with, e.g., a cube or a rectangular prism. We will demonstrate SBC later with the SCAAS method [Warshel and King, 1985] in MOLARIS, but for the sake of completeness we will show here how to build both a sphere and a rectangular prism of waters around our system, and run MD simulations in both.

We start by creating a sphere of waters around the system to run SBC. We will use the wat_sphere.tcl file in Appendix B.2:

    vmdrun -dispdev text -eofexit < wat_sphere.tcl

This generates the files 1i45_ws.pdb and 1i45_ws.psf, which can be displayed with VMD (see Figure 3a).

Afterwards, we use the file wat_box.tcl below in an analogous manner and obtain the solvated system in Figure 3b.

    package require solvate
    solvate 1i45.psf 1i45.pdb -t 5 -o 1i45_wb



Figure 3: Solvated systems for NAMD run with spherical and rectangular prism representations

Some proteins may be sensitive to the ionic strength of the surrounding solvent. Even when that is not the case, in molecular dynamics (MD) simulations with periodic boundary conditions, the energy of the electrostatic interactions is often computed using the particle-mesh Ewald (PME) summation, which requires the system to be electrically neutral. The VMD autoionize plugin provides a quick way to make the net charge of the system zero by adding sodium and chloride ions to the solvent at random positions (subject to some minimum distances between ions). In our case, for the PBC-based simulations with 0.05 M NaCl, we can run VMD in text mode again with this Tcl script:

    package require autoionize
    autoionize -psf 1i45_wb.psf -pdb 1i45_wb.pdb -is 0.05 -o 1i45_wb_NaCl
    source sod2pot.tcl

where the sod2pot.tcl script, used to replace the Na+ ions with K+ ions, is given in Appendix B.3.

3.1.3 Running the simulations

Once the files needed have been built, we simply need to run the simulation by typing

    namd2 1i45_ws_eq.conf > 1i45_ws_eq.log &

where an example of configuration file 1i45_ws_eq.conf is given in Appendix B.4.

In a similar manner, an example of configuration file for PBC is given in Appendix B.5.

More details on running NAMD simulations can be found in the official NAMD tutorial at http://www.ks.uiuc.edu/Training/Tutorials/. See also Appendix D for links to extra examples and useful resources.

Exercise 2. Run SBC and PBC relaxations and heating for 1NEY and 1YPI. Produce plots for RMSD and total energy in each case.
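For the energy plots, the total energy can be pulled out of the NAMD log before plotting it in R or another tool. The following is a minimal sketch, assuming the usual log layout in which an ETITLE: line names the columns and each ENERGY: line carries the values; namd_total_energy is a hypothetical helper name, and the column is looked up from the header instead of being hard-coded.

```shell
# namd_total_energy < run.log
# Print "timestep total_energy" pairs from a NAMD log.
# The TOTAL column is located from the first ETITLE: header line.
namd_total_energy() {
    awk '
        $1 == "ETITLE:" && !col {
            for (i = 2; i <= NF; i++) if ($i == "TOTAL") col = i
        }
        $1 == "ENERGY:" && col { print $2, $col }
    '
}
```

For example, namd_total_energy < 1i45_ws_eq.log > 1i45_ws_eq_etot.dat writes a two-column file that can be read directly with read.table in R.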



3.2 MD with MOLARIS

In this quick start guide we will show how to run molecular simulations using MOLARIS [Chu et al., 2003]. In particular, we will be setting up a system and running molecular dynamics in a given region with an explicit representation of the solvent.

3.2.1 Preparing the PDB

Using our preferred editor, we edit the *.ent files and replace the FTR entries with TRP entries. For example, we change the pdb1i45.ent file into a file we will call 1i45mod.pdb by doing the following substitution:

    HETATM 1282  N   FTR A 168      38.662  51.541  42.102  1.00 15.78           N
    HETATM 1283  CA  FTR A 168      37.687  51.997  43.087  1.00 16.01           C
    HETATM 1284  CB  FTR A 168      37.016  53.291  42.612  1.00 16.50           C
    HETATM 1285  CG  FTR A 168      36.457  53.215  41.211  1.00 18.36           C
    HETATM 1286  CD2 FTR A 168      35.103  52.917  40.831  1.00 18.75           C
    HETATM 1287  CE2 FTR A 168      35.046  52.962  39.419  1.00 18.66           C
    HETATM 1288  CE3 FTR A 168      33.932  52.616  41.545  1.00 20.74           C
    HETATM 1289  CD1 FTR A 168      37.142  53.419  40.045  1.00 18.66           C
    HETATM 1290  NE1 FTR A 168      36.302  53.270  38.967  1.00 17.24           N
    HETATM 1291  CZ2 FTR A 168      33.864  52.718  38.705  1.00 19.86           C
    HETATM 1292  CZ3 FTR A 168      32.754  52.372  40.827  1.00 21.32           C
    HETATM 1293  F   FTR A 168      31.644  52.083  41.514  0.57 24.59           F
    HETATM 1294  CH2 FTR A 168      32.735  52.427  39.425  1.00 20.56           C
    HETATM 1295  C   FTR A 168      36.600  50.963  43.385  1.00 16.95           C
    HETATM 1296  O   FTR A 168      35.850  51.115  44.348  1.00 17.00           O

into (note that the fluorine atom, serial 1293, is removed)

    ATOM   1282  N   TRP A 168      38.662  51.541  42.102  1.00 15.78           N
    ATOM   1283  CA  TRP A 168      37.687  51.997  43.087  1.00 16.01           C
    ATOM   1284  CB  TRP A 168      37.016  53.291  42.612  1.00 16.50           C
    ATOM   1285  CG  TRP A 168      36.457  53.215  41.211  1.00 18.36           C
    ATOM   1286  CD2 TRP A 168      35.103  52.917  40.831  1.00 18.75           C
    ATOM   1287  CE2 TRP A 168      35.046  52.962  39.419  1.00 18.66           C
    ATOM   1288  CE3 TRP A 168      33.932  52.616  41.545  1.00 20.74           C
    ATOM   1289  CD1 TRP A 168      37.142  53.419  40.045  1.00 18.66           C
    ATOM   1290  NE1 TRP A 168      36.302  53.270  38.967  1.00 17.24           N
    ATOM   1291  CZ2 TRP A 168      33.864  52.718  38.705  1.00 19.86           C
    ATOM   1292  CZ3 TRP A 168      32.754  52.372  40.827  1.00 21.32           C
    ATOM   1294  CH2 TRP A 168      32.735  52.427  39.425  1.00 20.56           C
    ATOM   1295  C   TRP A 168      36.600  50.963  43.385  1.00 16.95           C
    ATOM   1296  O   TRP A 168      35.850  51.115  44.348  1.00 17.00           O

as well as the same change for chain B, of course. Alternatively, we can do something like

    sed 's/FTR/TRP/' pdb1i45.ent > temp.pdb
    mv temp.pdb 1i45mod.pdb
    sed 's/FTR/TRP/' pdb1ney.ent > temp.pdb
    mv temp.pdb 1neymod.pdb



3.2.2 Running an interactive MOLARIS session

At the prompt, we type

    molaris

After this, the program will prompt

    Sourced /cbbl/soft/molaris/bin/.molaris_rc
    Sourced /home/jvilla/.molaris_rc
    Usage:
    For interactive run, please press the Enter key.
    For using input file on command line, please press the Enter key,
    type quit, then type on the command line:
    molaris < input_file_name
    or:
    molaris input_file_name

    you may use command line options to read in alternative libraries:
    molaris [-a amino_lib_name] [-p parm_lib_name] [-e evb_lib_name]
            [-s solvent_opt_name] [-o output_directory_name]

This message informs the user about the different ways of running MOLARIS. The user can run the program interactively, as we will do now, or prepare an input file with all the appropriate commands for running the calculation in the background. We press <Enter> and, after some information, we are prompted for a PDB name. At this point we type the name of the coordinates file of the system we are interested in. MOLARIS accepts both PDB and Mol2 formats, or a combination of them. In this case we type 1i45mod.pdb.

Initially the program checks the coordinates file and looks for possible errors in it. If the file is OK, the program proceeds by comparing the residues of the coordinates file with the residues in the topology library, which is provided with the program and called amino98.lib in the current version. If the file contains a residue that is not in the library, a new entry is automatically added to this library.

After checking and writing the topology to a special file called $OUT_DIR/1i45mod.top, the user is asked which task to perform. In this case we will choose ENZYMIX and the program displays the following table:

    Table of the Keywords for the Enzymix Level
    ...........................................

    keyword      modifier   example
    -------      --------   -------
    pre_enz      no         pre_enz
    relax        no         relax
    ac           no         ac
    evb          no         evb
    evb2         no         evb2
    evb_ab       no         evb_ab
    adiab_pot    no         adiab_pot
    adiab_tem    no         adiab_tem
    end          no         end
    help         yes        help <keyword1> <keyword2> ...
    help         yes        help all
    help         no         help
    exit/quit    no         exit
    ------------------------------------------------------------------------

Here you start to see that the MOLARIS package works as nested tasks, where keywords follow a hierarchy of execution. Thus, every time we finish a particular task we must write an end statement if we want to save the changes made, or exit if we made a mistake and want to quit without saving. In this particular case we want to perform a relaxation of the protein, so we select relax. The following table appears:

    Table of the Keywords RELAX Level
    .................................

    keyword      modifier   example
    -------      --------   -------
    md_parm      no         md_parm
    rest_in      yes        rest_in rest.in
    rest_out     yes        rest_out rest.1
    energy_out   yes        energy_out gap.out
    end          no         end
    help         yes        help <keyword1> <keyword2> ...
    help         yes        help all
    help         no         help
    exit/quit    no         exit

Here we have several choices to make. If we just leave the level with end, the program will perform a relaxation using the default parameters for the MD calculation. Let us change those parameters before leaving the relax level. Typing md_parm takes us to the next hierarchy level, where all the possible choices are listed in the following table:

    Table of the Keywords MD_PARM Level
    ...................................

    keyword           modifier   example
    -------           --------   -------
    nsteps            yes        nsteps 500
    temperature(K)    yes        temperature 300.0
    tolerance_temp    yes        tolerance_temp 3000.0
    stepsize (ps)     yes        stepsize 0.002
    nbupdate          yes        nbupdate 30
    gas_phase         yes        gas_phase 0
    region2a_r        yes        region2a_r 18.0
    water_r           yes        water_r 18.0
    langevin_r        yes        langevin_r 20.0
    ex_w_center       yes        ex_w_center 3.0 4.5 2.34
    solvent           yes        solvent water
    induce            yes        induce 0
    indforce          yes        indforce 0
    constraint_1      yes        constraint_1 0.03
    constraint_2      yes        constraint_2 0.03
    constraint_w      yes        constraint_w 30.0
    constraint_pair   yes        constraint_pair 5 9 10.0 1.3
    constraint_post   yes        constraint_post 10 10. 10. 10. 3.4 -4.6 4.7
    constraint_r      yes        constraint_r 5 10.0 50.0 2.0 4.6 7.3
    constraint_ang    yes        constraint_ang 10 34 35 10.0 120.
    constraint_tor    yes        constraint_tor 5 10 34 35 10.0 120.0 1.0 1
    h_constraint      yes        h_constraint 0
    movie_co          yes        movie_co rg1
    movie_fq          yes        movie_fq 10
    pmf               no         pmf
    ub_sampling       no         ub_sampling
    fix_region        yes        fix_region 1
    fix_atom          yes        fix_atom 8
    dist_atoms        yes        dist_atoms 2 5
    dist_write_fq     yes        dist_write_fq 10
    log_write_fq      yes        log_write_fq 10
    opt_his           yes        opt_his 1
    steep_mini        yes        steep_mini 1
    df_mini           yes        df_mini 1 0.0001
    log_detail        yes        log_detail 1
    help              yes        help <keyword1> <keyword2> ...
    help              yes        help all
    help              no         help
    end               no         end
    exit/quit         no         exit

All the parameters have default values, but let’s say that we want a shorter run and we want to change the temperature and the stepsize. To do so we type:

    md_parm> nsteps 300
    md_parm> temperature 200.
    md_parm> stepsize 0.0002
    md_parm> end

Closing the relax level with an additional end keyword will start the run. At the beginning the relevant information is printed (different radii, coordinates for the center, number of solvent molecules generated...) and then the actual MD calculation starts, giving the values of the energies at intervals of 10 steps:

    (...)
    In dynamics: Istep= 49 Temp= 205.16 Target= 200.00
    In dynamics: Istep= 50 Temp= 205.22 Target= 200.00

    rms of all protein heavy atoms for (x_average-x0) = 0.02
    rms of all protein heavy atoms for (x_current-x0) = 0.05

    Energies for the system at step 50:
    ------------------------------------------------------------------------
    protein - ebond    :   496.38  ethet    :  1699.07
              ephi     :  3569.11  eitor    :    41.70
              evdw     :  3130.24  emumu    : -1934.36
              ehb_pp   :  -568.27
    water   - ebond    :    57.04  ethet    :    19.09
              evdw     :    -4.90  emumu    :   -81.70
              ehb_ww   :     0.00
    pro-wat - evdw     :    28.55  emumu    :   -83.32
              ehb_pw   :     0.00
    long    - elong    :   -78.27
    ac      - evd_acp  :     0.00  emumuacp :     0.00
              evd_acw  :     0.00  emumuacw :     0.00
              ehb_acp  :     0.00
              ehb_acw  :     0.00
    evb     - ebond    :     0.00  ethet    :     0.00  ephi     :     0.00
              evdw     :     0.00  emumu    :     0.00  eoff     :     0.00
              egpshift :     0.00  eindq    :     0.00  ebulk    :     0.00
    induce  - eindp    :     0.00  eindw    :     0.00
    const.  - ewatc    :   209.95  eproc    :    35.30  edist    :     0.00
    langevin- elgvn    :     0.00  evdw_lgv :     0.00  eborn    :   -22.38
    system  - epot     :  6513.23  ekin     :  1574.62  etot     :  8087.85
    ________________________________________________________________________

    Constraint energy on region I: 0.00

    In dynamics: Istep= 51 Temp= 205.25 Target= 200.00
    (...)
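To follow the total energy along the run, the etot values can be collected from a saved copy of this output. The sketch below assumes the log keeps the layout shown above, that is, a line "Energies for the system at step N:" followed later by a line containing "etot : value"; molaris_etot is a hypothetical helper name, not a MOLARIS tool.

```shell
# molaris_etot < run.out
# Print "step etot" pairs from a MOLARIS log, assuming each energy block
# starts with "Energies for the system at step N:" and reports "etot : X".
molaris_etot() {
    awk '
        /Energies for the system at step/ {
            step = $NF
            sub(/:$/, "", step)
        }
        match($0, /etot *: *-?[0-9.]+/) {
            value = substr($0, RSTART, RLENGTH)
            sub(/^etot *: */, "", value)
            print step, value
        }
    '
}
```

Applied to the block above, it prints "50 8087.85"; the resulting two-column output can then be plotted in R.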

If the decrease in temperature and stepsize is not enough to obtain a stable run, we can use a simple steepest descent minimization by choosing steep_mini 1 in the md_parm table. Once finished, we end the enzymix level and enter the analyze level in order to get the coordinates in PDB format after the relaxation.

    Table of the Keywords for the Analysis Level
    ............................................

    keyword        modifier   example
    -------        --------   -------
    rest_in        yes        rest_in rest.in
    rest_to_pdb    yes        rest_to_pdb rest.pdb
    allres         no         allres
    restype        yes        restype ASP
    resatom        yes        resatom 1
    resbond        yes        resbond 1
    resang         yes        resang 1
    restor         yes        restor 1
    resitor        yes        resitor 1
    distatoms      yes        distatoms 2 5
    distatompnt    yes        distatompnt 2 1.0 2.0 2.3
    chkbond        yes        chkbond 50.0
    chkdisulfide   yes        chkdisulfide
    electro        yes        electro 1 18.0 4
    center_s       no         center_s
    center_r       yes        center_r 5 12
    center_c       yes        center_c 1
    sphereion      yes        sphereion 12.50 3.64 -6.28 10.63
    sphereion_r    yes        sphereion_r 12.50 4
    sphereres      yes        sphereres 12.50 3.64 -6.28 10.63
    sphereres_r    yes        sphereres_r 12.50 4
    sphereatm      yes        sphereatm 12.50 3.64 -6.28 10.63
    addbond        yes        addbond 2 5 9 10 18
    mutate_res     yes        mutate_res 2 SER
    rotate_h       yes        rotate_h 5 12
    rotate_axis    yes        rotate_axis 5 12
                              rotate_axis 2.0 3.5 6.7 12.0 23.1 -2.3
    prot_prot      no         prot_prot
    viewmovie      no         viewmovie
    viewpot        no         viewpot
    vdwsurf        no         vdwsurf
    makepdb        no         makepdb
    makelib1       no         makelib1
    dock           no         dock
    add_memgrid    yes        add_memgrid 1.0 3.2 0.5 Y 10.0 20.0 10.0 1 1.0
    end            no         end
    help           yes        help <keyword1> <keyword2> ...
    help           yes        help all
    help           no         help
    exit/quit      no         exit

We choose makepdb:

    Table of the Keywords makepdb Level
    ...................................

    keyword      modifier   example
    -------      --------   -------
    residue      yes        residue 2
                            residue 2 to 10
                            residue all
                            residue all+w
    file_nm      yes        file_nm file.pdb
    end          no         end
    help         yes        help <keyword1> <keyword2> ...
    help         no         help
    exit/quit    no         exit

Then we select the right options and quit the makepdb level:



    makepdb> file_nm 1i45mod_wat.pdb
    makepdb> residue all+w

The program executes the requested commands and can then be exited by typing end twice.

At this point it is important to note the non-interactive way of running the program, which allows one to redirect the output. Try, for example,

    molaris 1i45mod_relax.inp 1i45mod_relax.out

which puts the output in a file 1i45mod_relax.out in $OUT_DIR/1i45mod_relax.

Obviously, several runs can be concatenated in the input file when, for example, one needs to heat the system in several stages. For example, one can create the configuration file:

    1i45mod.pdb
    enzymix
      relax
        md_parm
          steep_mini 1
          stepsize 0.001
          water_r 30
          nsteps 30
        end
        rest_out 1i45mod_rx.rest
      end
    end
    analyze
      makepdb
        file_nm 1i45mod_rx.pdb
        residue all+wat
      end
    end
    end

whose execution can be followed by

    1i45mod_rx.pdb
    enzymix
      relax
        md_parm
          temperature 100
          stepsize 0.002
          nsteps 100
        end
        rest_out 1i45mod_md100.rest
      end
      relax
        md_parm
          temperature 300
          stepsize 0.002
          nsteps 1000
        end
        rest_in $OUT_DIR/1i45mod_md100.rest
      end
    end
    analyze
      makepdb
        file_nm 1i45mod_md300.pdb
        residue all
      end
    end
    end
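When more heating stages are needed, writing such files by hand becomes tedious. The sketch below generates the enzymix part of a staged heating input from a list of temperatures. It is only an illustration under several assumptions: molaris_heating_input is a hypothetical helper, it reuses only the keywords shown above, it assumes that a relax block may contain both rest_in and rest_out, and it assumes that restart files are found under $OUT_DIR as in the example above. The generated file should be checked against the MOLARIS documentation before use.

```shell
# molaris_heating_input PDB STEM NSTEPS T1 [T2 ...]
# Write a MOLARIS input with one relax stage per temperature; each stage
# restarts from the restart file written by the previous one.
# Hypothetical helper: keyword nesting copied from the examples above.
molaris_heating_input() (
    pdb="$1"; stem="$2"; nsteps="$3"
    shift 3
    prev=""
    printf '%s\nenzymix\n' "$pdb"
    for temp in "$@"; do
        printf '  relax\n'
        printf '    md_parm\n'
        printf '      temperature %s\n' "$temp"
        printf '      stepsize 0.002\n'
        printf '      nsteps %s\n' "$nsteps"
        printf '    end\n'
        if [ -n "$prev" ]; then
            # $OUT_DIR is written literally; MOLARIS is assumed to expand it
            printf '    rest_in $OUT_DIR/%s_md%s.rest\n' "$stem" "$prev"
        fi
        printf '    rest_out %s_md%s.rest\n' "$stem" "$temp"
        printf '  end\n'
        prev="$temp"
    done
    printf 'end\nend\n'
)
```

For example, molaris_heating_input 1i45mod_rx.pdb 1i45mod 100 100 200 300 > heat.inp produces three relax stages at 100, 200 and 300 K with 100 steps each.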



Exercise 3. Check the PDB created by the above scripts. What do you need to do to run a relaxation calculation including all the residues in the TIM dimer interface? Run such a calculation for the three systems 1I45, 1NEY and 1YPI. Plot the behavior of the total energy and the RMSD.

3.3 MD with ADUN

ADUN is a program based on the Cocoa/NextStep frameworks. These provide excellent tools for a graphical user interface, and you can download the latest version of the program from the ADUN GNA site: https://gna.org/projects/adun/. In this session we will use the command-line version of ADUN, as some of the calculations to be done are still experimental (in particular the LIE implementation in Section 4.2).

3.3.1 Preparing the PDB

Again, we need to clean the PDB file before using it with ADUN. Analogously to what was done before:

    # download pdbs
    wget http://www.pdb.org/pdb/files/1I45.pdb

    # transform FTR to TRP and delete what is not needed
    sed 's/FTR/TRP/' 1I45.pdb | grep -v HOH > temp.pdb
    # "ATOM" is padded with two spaces so the PDB columns stay aligned
    sed 's/HETATM/ATOM  /' temp.pdb > 1i45mod.pdb

File 1i45mod.pdb has multiple models. Delete all of them except the one you want to use. Strip water as well. Clean the PDBs, renumber them, and add hydrogen atoms with reduce [Word et al., 1999]. Then clean again, fixing hydrogens, capping the protein and taking care of histidine naming.

    /cbbl/soft/adun/sharedapps/repairPDB.py 1i45mod.pdb numb clean
    reduce -BUILD 1i45mod_fixed.pdb > 1i45mod_reduced.pdb
    /cbbl/soft/adun/sharedapps/repairPDB.py 1i45mod_reduced.pdb clean hyd cap his
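Selecting a single model can be scripted instead of done by hand. The following sketch assumes the standard PDB convention in which each model is enclosed between a MODEL record and an ENDMDL record; keep_model is a hypothetical helper and is not part of the ADUN tools.

```shell
# keep_model N < in.pdb > out.pdb
# Keep the records of model N and everything outside MODEL/ENDMDL blocks;
# the MODEL and ENDMDL lines themselves are dropped.
keep_model() {
    awk -v want="$1" '
        $1 == "MODEL"  { inside = 1; keep = ($2 == want); next }
        $1 == "ENDMDL" { inside = 0; next }
        !inside || keep
    '
}
```

For example, keep_model 1 < 1i45mod.pdb > 1i45mod_model1.pdb keeps only the first model; the output name is arbitrary and must be adapted to the file names used in the following commands.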

Now we can build the ADUN datasources for each of the systems⁵. The build may complain about two atoms (the two fluorines); this can be safely ignored. There is a known issue with the builder script: in around 10% of cases it mysteriously crashes with a segmentation fault. Just rerun it and it will work.

    /cbbl/soft/adun/chile/script \
        Builder.st \
        1i45mod_reduced_fixed.pdb \
        Amber96
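Since the crash is intermittent, the rerun can be automated with a small wrapper. The sketch below is a generic retry loop; retry is a hypothetical helper, and it reruns the command after any failure, not only after a segmentation fault, so a genuine input error will simply fail the given number of times.

```shell
# retry MAX command [args...]
# Run the command until it succeeds, at most MAX times.
# Returns 0 on the first success and 1 if every attempt fails.
retry() {
    retry_max="$1"
    shift
    retry_n=1
    until "$@"; do
        if [ "$retry_n" -ge "$retry_max" ]; then
            echo "giving up after $retry_n attempts: $*" >&2
            return 1
        fi
        retry_n=$((retry_n + 1))
    done
}
```

For example: retry 5 /cbbl/soft/adun/chile/script Builder.st 1i45mod_reduced_fixed.pdb Amber96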

3.4 Running simulations with ADUN

Make a separate directory for the simulation and put the PDB file there:

    mkdir 1i45
    cp 1i45mod_dimer.datasource 1i45/

⁵ Datasources or systems are the main objects in ADUN. Check http://lavandula.imim.es/adun-new/?page_id=294 for more details.



Prepare a template file by editing copies of /cbbl/soft/adun/resources/template.temp that you will place in each directory. The original file template.temp already has sensible settings. However, <DATASOURCE> must be replaced by 1i45mod_reduced_fixed.datasource. Also, <NUMBER_OF_STEPS> needs to be set to a sensible value. The unit is femtoseconds.

In order to run the simulations in the CBBL cluster we have provided a script that does most of the job for you:

    # prepare a cluster file
    cp /cbbl/soft/adun/resources/cluster.ini 1i45/

    # go into all directories and edit the cluster.ini
    # put in a nice name and a queue (e.g. cbbl)

    # start simulations
    /cbbl/soft/adun/chile/cluster /cbbl/users/scratch/chile/1i45

3.4.1 Analyzing the results

One of the powerful characteristics of ADUN is its ability to extract results from the simulations to be analyzed using diverse algorithms.

RMSD analysis. We can run the RMSD plugin using the alpha carbons only:

    /cbbl/soft/adun/chile/script \
        RMSD.st \
        /cbbl/users/scratch/chile/1i45/simulation1/CHILE.simulation \
        1i45mod_reduced_fixed \
        @CA

Extract energies and trajectories. As ADUN is oriented toward free energy calculations, analyzing the energetics of the system along the MD trajectory is critical. The following command extracts energies, starting at frame 0 and continuing until frame 1000 is reached, taking every second frame:

    /cbbl/soft/adun/chile/resultsConverter \
        -Mode Energy \
        -Simulation /cbbl/users/scratch/chile/1i45/simulation1/CHILE.simulation \
        -Start 0 \
        -Length 1000 \
        -StepSize 2

Using the resultsConverter tool as well, one can extract a series of PDBs, so that the trajectory can be viewed in, e.g., VMD:

    /cbbl/soft/adun/chile/resultsConverter \
        -Mode Configuration \
        -Simulation /cbbl/users/scratch/chile/1i45/simulation1/CHILE.simulation \
        -Start 0 \
        -Length 1000 \
        -StepSize 2

Essential dynamics. To obtain the essential modes of the system for the alpha carbons only, we use the corresponding ED.st script:

    /cbbl/soft/adun/chile/script \
        ED.st \
        /cbbl/users/scratch/chile/1i45/simulation1/CHILE.simulation \
        1i45mod_dimer \
        @CA



Exercise 4. Follow the same procedure to run an MD simulation for 1NEY. 1NEY has a 13P ligand; strip it (we will come back later to how to incorporate the ligand). Create RMSD plots and a movie of the MD run for 1NEY.

4 Practical Session II: Solvation, pKa, FEP

4.1 Running solvation and pKa simulations using PDLD/S-LRA

The next task consists of calculating the pKa shift for the residues in the TIM interface. To do so, we will choose the polaris task, and the program will prompt us with a table of the options for polaris:

Table of the Keywords for the Polaris Level
...........................................

keyword        modifier  example
-------        --------  -------
pre_pol        no        pre_pol
solv_pdld      no        solv_pdld
solv_pdld_evb  no        solv_pdld_evb
solv_fep       no        solv_fep
ai_pdld        no        ai_pdld
bind_pdld      no        bind_pdld
bind_pdld_evb  no        bind_pdld_evb
bind_fep       no        bind_fep
pka_pdld       no        pka_pdld
pka_fep        no        pka_fep
redox_pdld     no        redox_pdld
redox_fep      no        redox_fep
logp           no        logp
titra_ph_0     no        titra_ph_0
titra_ph       no        titra_ph
pka_multi      no        pka_multi
evb_pdld       no        evb_pdld
prot_prot      no        prot_prot
end            no        end
help           yes       help <keyword1> <keyword2> ...
help           yes       help all
help           no        help
exit/quit      no        exit

We will choose pka_pdld, and the program will prompt:

Table of the Keywords for the pKa_pdld Level
............................................

keyword      modifier  example
-------      --------  -------
reg1_res     yes       reg1_res 2
pka_w        yes       pka_w 3.0
pdld_fn      yes       pdld_fn asp.pdld
reg1_atm     yes       reg1_atm 10 to 20
ab_crg       yes       ab_crg 10 0.50 0.0
regII_r      yes       regII_r 16.0
config       yes       config 0 5
use_restart  no        use_restart
md_parm_r    no        md_parm_r
md_parm_w    no        md_parm_w
md_parm_p    no        md_parm_p
help         yes       help <keyword1> <keyword2> ...
help         no        help
end          no        end
exit/quit    no        exit

Next we will choose residue 137 as our region I. We will also set the number of configurations to run and the characteristics of the dynamics. Thus, we will tell the program to run the calculation on the initial protein structure and on 2 more conformations, which will be generated automatically by MD runs.

    pka_pdld> reg1_res 137

    atoms added to region I:
    atom#  charg_a  charg_b
    -----  -------  -------
    2108    0.000    0.000
    2109    0.000    0.000
    2110    0.000    0.000
    2111   -0.080    0.000
    2112    0.360    0.000
    2113    0.360    0.000
    2114    0.360    0.000

    pka_pdld> config 1 2

To run the program we just end the level and the calculation will proceed. The final result of the pKa calculation for this very simple test (which is of course unreliable because of the short run) is:

PDLD SEMI-MACROSCOPIC ESTIMATE FOR pKa
......................................

effective dielectric
epsilon_p(e_p)            2      4      6      8     20     40     80

pKa_intr for str. 1    9.93  10.17  10.25  10.28  10.29  10.35  10.37
pKa_intr for str. 2    9.91  10.16  10.24  10.28  10.29  10.35  10.37
---------------------------------------------------------------------
aver pKa_int           9.92  10.16  10.24  10.28  10.29  10.35  10.37

estimated apparent pKa 13.01

where pKa_int is the intrinsic pKa, the one due to the self energy of the system, while the estimated apparent pKa includes the charge-charge contribution (see the course slides for details).
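To relate a computed pKa to quantities that are easier to interpret, the following Python sketch applies two textbook relations: the Henderson-Hasselbalch expression for the protonated fraction at a given pH, and the conversion of a pKa shift into a free energy via 2.303 RT per pKa unit. The function names are hypothetical, the temperature of 300 K is an assumption, and any reference pKa in water must be supplied by the user; none of this is part of the MOLARIS output.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)

def protonated_fraction(pka, ph):
    """Fraction of the group in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def pka_shift_to_free_energy(pka_protein, pka_water, temperature=300.0):
    """Free energy (kcal/mol) associated with a pKa shift.

    Uses ddG = ln(10) * R * T * (pKa_protein - pKa_water); a positive
    value means deprotonation is harder in the protein than in water.
    """
    return math.log(10.0) * R_KCAL * temperature * (pka_protein - pka_water)

if __name__ == "__main__":
    # 13.01 is the apparent pKa printed above; pH 7 is an illustrative choice.
    print(protonated_fraction(13.01, 7.0))
    # A shift of one pKa unit at 300 K, purely for illustration.
    print(pka_shift_to_free_energy(11.0, 10.0))
```

One pKa unit at 300 K corresponds to roughly 1.4 kcal/mol, which is a useful scale to keep in mind when comparing residues across the interface.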

Exercise 5
Find the pKa shifts for all residues in the interface of the TIM dimer. Use the prot_prot keyword at the analyze level as a guide.

Exercise 6
Recall that the solv_pdld keyword at the polaris level is in fact a simplified version of the thermodynamic cycle for the pKa shift calculations. Based on this fact, check the stability of the loop 6 residues in the three structures 1I45, 1NEY and 1YPI and discuss the results. See, e.g., [Bonet et al., 2006, Scheper et al., 2009].



4.2 LIE runs with ADUN

Next we will evaluate the absolute binding free energy of a ligand to the two structures by means of the linear interaction energy (LIE) method of Åqvist and coworkers [Hansson et al., 1998]. We are going to run a LIE calculation using PGH as a ligand for the TIM structures.[6]

First, we will build PDB files from XXXXmod_reduced_fixed containing the protein plus the ligand (after docking with AutoDock, for example). We will call these files XXXXmod_complex.pdb.

Then, we will build datasources for TIM+PGH and for PGH alone, as in Section 3.3.1. Make sure that in all PDB files the PGH moiety bears the same chain label (C, here).

    /cbbl/soft/adun/chile/script Builder.st 1neymod_dimere.pdb Amber96
    /cbbl/soft/adun/chile/script Builder.st 1i45mod_dimere.pdb Amber96
    /cbbl/soft/adun/chile/script Builder.st PGH.pdb Amber96

Finally, the LIE runs are done with:

    /cbbl/soft/adun/chile/script LIE.st \
        /cbbl/users/scratch/chile/1ney/simulation1/CHILE.simulation \
        /cbbl/users/scratch/chile/pgh/simulation1/CHILE.simulation \
        C

    /cbbl/soft/adun/chile/script LIE.st \
        /cbbl/users/scratch/chile/1i45/simulation1/CHILE.simulation \
        /cbbl/users/scratch/chile/pgh/simulation1/CHILE.simulation \
        C
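For orientation, the LIE estimate combines the differences in the average ligand-surroundings van der Waals and electrostatic energies between the bound and free simulations. The Python sketch below shows that arithmetic only. It is not the LIE.st script: the function names are hypothetical, and the default coefficients `alpha`, `beta` and `gamma` are placeholder assumptions of the magnitude often quoted for LIE, which should be replaced by the values recommended in the cited reference for the ligand at hand.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of per-frame energies."""
    values = list(values)
    if not values:
        raise ValueError("need at least one energy value")
    return sum(values) / len(values)

def lie_binding_free_energy(vdw_bound, vdw_free, el_bound, el_free,
                            alpha=0.18, beta=0.5, gamma=0.0):
    """LIE estimate: alpha * d<V_vdw> + beta * d<V_el> + gamma.

    All four inputs are average ligand-surroundings interaction energies:
    *_bound from the protein+ligand run, *_free from the ligand-in-water run.
    alpha, beta and gamma are empirical and assumed here for illustration.
    """
    d_vdw = vdw_bound - vdw_free
    d_el = el_bound - el_free
    return alpha * d_vdw + beta * d_el + gamma

if __name__ == "__main__":
    # Made-up per-frame energies (kcal/mol), for illustration only.
    vdw_b, vdw_f = mean([-31.0, -29.0]), mean([-20.5, -19.5])
    el_b, el_f = mean([-51.0, -49.0]), mean([-40.0, -40.0])
    print(lie_binding_free_energy(vdw_b, vdw_f, el_b, el_f))
```

The two simulations passed to the script above play the roles of the bound (TIM+PGH) and free (PGH alone) states in this expression.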

Exercise 7
Calculate the absolute binding free energy for DHAP in both proteins (the 13P HETATM structure in 1NEY).

[6] This is an experimental implementation in ADUN, so we have tried to minimize the sources of error, although some may still exist.

5 Practical Session III: enzymatic reactivity with EVB

5.1 EVB for enzymatic reactivity analysis

In order to study an enzymatic reaction we should compare the reaction mechanism in the protein with the corresponding reaction in water. Thus, most of the time we are interested in comparing the free energy profiles in both environments. We will see here how to run a simulation in the protein. To run it in water you should modify the PDB file and the input files below.

First, we need to define the resonance states we are going to explore, following Åqvist [Åqvist and Fothergill, 1996]:

    /cbbl/users/jparetas/molaris/md1wyi.pdb
    enzymix
      evb
        evb_state 5 1.00 0.00 0.0 0.0 0.0 1

        evb_atm 3749 -0.68859 O- -0.68295 O- -0.68661 O- -0.71403 O- -0.70662 O-
        evb_atm 3750  0.19013 C0  0.19017 C0  0.19699 C0  0.20108 C0  0.21502 C0
        evb_atm 3751  0.58918 C+  0.58214 C+  0.33797 C+  0.27155 C+  0.33581 C+
        evb_atm 3752 -0.81382 O- -0.80576 O- -0.47856 O- -0.51389 O- -0.65842 O-
        evb_atm 3753  0.18825 C0  0.15741 C0  0.03782 C0  0.07854 C0  0.47497 C0
        evb_atm 3754 -0.63437 O- -0.61647 O- -0.61681 O- -0.62313 O- -0.72027 O-
        evb_atm 3755  0.07703 H0  0.10567 H0  0.04920 H0  0.04987 H0  0.03755 H0
        evb_atm 3756  0.06757 H0  0.07493 H0  0.05844 H0  0.05610 H0  0.06462 H0
        evb_atm 3757  0.07803 H0  0.05535 H0  0.03800 H0  0.02770 H0  0.01087 H0
        evb_atm 3758  0.08548 H0  0.03909 H0  0.22519 H0  0.26049 H0  0.12992 H0
        evb_atm 3759  0.24922 H0  0.25966 H0  0.21598 H0  0.12190 H0  0.14398 H0
        evb_atm 1416  0.17993 C0  0.17114 C0  0.16329 C0  0.17634 C0  0.17737 C0
        evb_atm 1417 -0.53201 N- -0.46055 N- -0.47382 N- -0.51056 N- -0.50804 N-
        evb_atm 1418  0.15798 H0  0.12208 H0  0.23880 H0  0.23902 H0  0.23974 H0
        evb_atm 1419  0.20793 C0  0.18771 C0  0.19239 C0  0.21360 C0  0.20854 C0
        evb_atm 1420  0.00456 H0  0.02048 H0  0.00419 H0  0.00979 H0  0.00686 H0
        evb_atm 1421  0.05774 N0  0.12747 N0 -0.36805 N0  0.12963 N0  0.08870 N0
        evb_atm 1422 -0.12399 C0 -0.11718 C0 -0.14108 C0 -0.11732 C0 -0.12039 C0
        evb_atm 1423  0.01450 H0  0.01485 H0  0.00967 H0  0.01515 H0  0.01468 H0
        evb_atm 2505 -0.03807 C0 -0.02721 C0 -0.00095 C0 -0.02629 C0 -0.03614 C0
        evb_atm 2506  0.07814 H0  0.06715 H0  0.05948 H0  0.06925 H0  0.07107 H0
        evb_atm 2507  0.06048 H0  0.08105 H0  0.09790 H0  0.08797 H0  0.07871 H0
        evb_atm 2508  0.72937 C+  0.73513 C+  0.74744 C+  0.73159 C+  0.73425 C+
        evb_atm 2509 -0.80168 O- -0.75704 O- -0.78554 O- -0.88197 O- -0.88894 O-
        evb_atm 2510 -0.84230 O0 -0.88402 O0 -0.46001 O0 -0.75316 O- -0.80857 O-

        evb_bnd 0 3749 3750
        evb_bnd 0 3750 3755
        evb_bnd 0 3750 3756
        evb_bnd 0 3750 3751
        evb_bnd 0 3751 3752
        evb_bnd 0 3751 3753
        evb_bnd 0 3753 3757
        evb_bnd 1 3753 3758
        evb_bnd 2 3758 2510
        evb_bnd 3 3758 2510
        evb_bnd 4 3758 2510
        evb_bnd 0 3753 3754
        evb_bnd 1 3754 3759
        evb_bnd 2 3754 3759
        evb_bnd 3 3754 3759
        evb_bnd 0 2505 2506
        evb_bnd 0 2505 2507
        evb_bnd 0 2505 2508
        evb_bnd 0 2508 2509
        evb_bnd 0 2508 2510
        evb_bnd 0 1416 1422
        evb_bnd 0 1422 1423
        evb_bnd 0 1422 1421
        evb_bnd 1 1421 1418
        evb_bnd 2 1421 1418
        evb_bnd 3 3752 1418
        evb_bnd 4 3752 1418
        evb_bnd 5 3752 1418
        evb_bnd 0 1421 1419
        evb_bnd 4 1421 3759
        evb_bnd 5 1421 3759
        evb_bnd 5 3758 3751
        evb_bnd 0 1419 1420
        evb_bnd 0 1419 1417
        evb_bnd 0 1417 1416

and then we can continue with a regular MD run:

        gas_dg 1 0.0
        gas_dg 2 115.0
        gas_dg 3 50.0
        evb_parm
          iflag_r4 0
        end
        rest_out evb_tim12.res
        md_parm
          temperature 300.0
          nsteps 50000
          ss 0.001
          region2a_r 18
          water_r 18
          langevin_r 18.5
          movie_co 1 2 3
          movie_fq 1000
          constraint_1 0.30
        end
      end
    end
    # now we check the numbering of the atoms for defining region I
    analyze
      resatom 1
      resatom 2
      resatom 3
    end

After the relaxation we can proceed with the actual free energy perturbation calculations:

    enzymix
      evb
        evb_state 5 1.00 0.00 0.0 0.0 0.0 1
        map_pf 51 1 2
        evb_atm 3749 -0.68859 O- -0.68295 O- -0.68661 O- -0.71403 O- -0.70662 O-
        (...)
        # repeat here the description of the EVB region
        (...)
        evb_bnd 0 1417 1416

        # gas_dg 1 0.0
        # gas_dg 2 115.0
        # gas_dg 3 50.0
        hij 1 2 3753 2510 10. 2.5
        evb_parm
          iflag_r4 0
        end
        rest_out evb_tim12.res
        md_parm
          temperature 300.0
          nsteps 50000
          ss 0.001
          region2a_r 18
          water_r 18
          langevin_r 18.5
          movie_co 1 2 3
          movie_fq 1000
          constraint_1 0.30
        end
      end
    end
    # now we check the numbering of the atoms for defining region I
    analyze
      resatom 1
      resatom 2
      resatom 3
    end

After the end command, the program will start the FEP protocol according to the settings above. The final output of the program is a set of *.map files, one for each frame.
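The ideas behind the mapping run can be sketched in a few lines of Python. This is a textbook two-state illustration, not what MOLARIS does internally: the function names are hypothetical, a uniform spacing of the mapping parameter is assumed, and reading the value 51 in the input above as the number of mapping frames between states 1 and 2 is also an assumption.

```python
import math

def mapping_lambdas(n_frames):
    """Evenly spaced mapping parameters from 0 to 1 (uniform spacing assumed)."""
    if n_frames < 2:
        raise ValueError("need at least two frames")
    return [i / (n_frames - 1) for i in range(n_frames)]

def mapping_potential(e1, e2, lam):
    """FEP mapping potential that drives the system from state 1 to state 2."""
    return (1.0 - lam) * e1 + lam * e2

def evb_ground_state(e1, e2, h12):
    """Lowest eigenvalue of the 2x2 EVB Hamiltonian [[e1, h12], [h12, e2]]."""
    return 0.5 * (e1 + e2) - 0.5 * math.sqrt((e1 - e2) ** 2 + 4.0 * h12 ** 2)

if __name__ == "__main__":
    lambdas = mapping_lambdas(51)
    # Illustrative diabatic energies and coupling (kcal/mol), not from a real run.
    e1, e2, h12 = 0.0, 5.0, 10.0
    for lam in (lambdas[0], lambdas[25], lambdas[-1]):
        print(lam, mapping_potential(e1, e2, lam), evb_ground_state(e1, e2, h12))
```

Sampling is done on the mapping potential at each value of the mapping parameter, while the free energy profile of interest is the one on the ground-state surface, which is why the off-diagonal coupling matters for the final barrier.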



Exercise 8
Design the EVB run for the water system and execute both in the CBBL cluster.

A Running in the luke cluster

The luke cluster is going to be used for all expensive runs with NAMD as well as for all runs in MOLARIS and ADUN. To connect, use

    ssh <username>@arbutus.imim.es

This will bring you to your /home/<username> directory. To submit calculations, first change to the /homes/users/<username> directory, which is the one shared by all the nodes in the cluster. Otherwise your calculations will not succeed.

The luke cluster uses the Sun Grid Engine (SGE) queuing system. In order to use it, first add this line to your $HOME/.bashrc file:

    source /cbbl/soft/sge6.2u2_1/default/common/settings.sh

Some important commands are qsub, qstat and qdel. Use the unix man command to obtain help on each SGE command. You can also find information on several web sites (see, e.g., http://www.ats.ucla.edu/clusters/common/computing/batch/sge.htm).

Running ADUN in the luke cluster is done by using special scripts, as described in the corresponding sections.

B Extra material for NAMD

B.1 Set up

In this hands-on class you are supposed to use a local installation of NAMD. If this is not possible, or you need to run in the cluster, you can add the following line to your $HOME/.bashrc file:

    export PATH=/cbbl/soft/NAMD_2.6_Linux-amd64/:$PATH

B.2 VMD: wat_sphere.tcl

### Script to immerse TIM in a sphere of water just large enough
### to cover it. $max is the radius of the protein
### Adapted from the NAMD tutorial

set molname 1i45

mol new ${molname}.psf
mol addfile ${molname}.pdb

### Determine the center of mass of the molecule and store the coordinates
set cen [measure center [atomselect top all] weight mass]
set x1 [lindex $cen 0]
set y1 [lindex $cen 1]
set z1 [lindex $cen 2]
set max 0

### Determine the distance of the farthest atom from the center of mass
foreach atom [[atomselect top all] get index] {
    set pos [lindex [[atomselect top "index $atom"] get {x y z}] 0]
    set x2 [lindex $pos 0]
    set y2 [lindex $pos 1]
    set z2 [lindex $pos 2]
    set dist [expr pow(($x2-$x1)*($x2-$x1) + ($y2-$y1)*($y2-$y1) + ($z2-$z1)*($z2-$z1), 0.5)]
    if {$dist > $max} {set max $dist}
}

mol delete top

### Solvate the molecule in a water box with enough padding (15 A).
### One could alternatively align the molecule such that the vector
### from the center of mass to the farthest atom is aligned with an axis,
### and then use no padding
package require solvate
solvate ${molname}.psf ${molname}.pdb -t 15 -o del_water

resetpsf
package require psfgen
mol new del_water.psf
mol addfile del_water.pdb
readpsf del_water.psf
coordpdb del_water.pdb

### Determine which water molecules need to be deleted and use a for loop
### to delete them
set wat [atomselect top "same residue as {water and ((x-$x1)*(x-$x1) + (y-$y1)*(y-$y1) + (z-$z1)*(z-$z1))<($max*$max)}"]
set del [atomselect top "water and not same residue as {water and ((x-$x1)*(x-$x1) + (y-$y1)*(y-$y1) + (z-$z1)*(z-$z1))<($max*$max)}"]
set seg [$del get segid]
set res [$del get resid]
set name [$del get name]
for {set i 0} {$i < [llength $seg]} {incr i} {
    delatom [lindex $seg $i] [lindex $res $i] [lindex $name $i]
}

writepsf ${molname}_ws.psf
writepdb ${molname}_ws.pdb

mol delete top

mol new ${molname}_ws.psf
mol addfile ${molname}_ws.pdb
puts "CENTER OF MASS OF SPHERE IS: [measure center [atomselect top all] weight mass]"
puts "RADIUS OF SPHERE IS: $max"
mol delete top

B.3 VMD: sod2pot.tcl

#!/usr/local/bin/vmd -dispdev text
# replacing Na+ with K+ (or anything else with anything else)
# adapted from the original file from
# Ilya Balabin (ilya@ks.uiuc.edu), 2002-2003

# define input files here
set psffile "1i45_wb_NaCl.psf"
set pdbfile "1i45_wb_NaCl.pdb"
set prefix "1i45_wb_KCl"

# define what ions to replace with what ions
set ionfrom "SOD"
set ionto "POT"

# do not change anything below this line
package require psfgen
topology top_all27_prot_lipid.rtf

puts "\nSod2pot) Reading ${psffile}/${pdbfile}..."
resetpsf
readpsf $psffile
coordpdb $pdbfile
mol load psf $psffile pdb $pdbfile

set sel [atomselect top "name $ionfrom"]
set poslist [$sel get {x y z}]
set seglist [$sel get segid]
set reslist [$sel get resid]
set num [llength $reslist]
puts "Sod2pot) Found ${num} ${ionfrom} ions to replace..."
set num 0
foreach segid $seglist resid $reslist {
    delatom $segid $resid
    incr num
}
puts "Sod2pot) Deleted ${num} ${ionfrom} ions"
segment $ionto {
    first NONE
    last NONE
    foreach res $reslist {
        residue $res $ionto
    }
}
set num [llength $reslist]
puts "Sod2pot) Created ${num} topology entries for ${ionto} ions"
set num 0
foreach xyz $poslist res $reslist {
    coord $ionto $res $ionto $xyz
    incr num
}
puts "Sod2pot) Set coordinates for ${num} ${ionto} ions"
writepsf "${prefix}.psf"
writepdb "${prefix}.pdb"
puts "Sod2pot) Wrote ${prefix}.psf/${prefix}.pdb"
puts "Sod2pot) All done."
quit

B.4 VMD: 1i45_ws_eq.conf

# Minimization and Equilibration of TIM in a water sphere

#############################################################
## ADJUSTABLE PARAMETERS                                   ##
#############################################################

structure          1i45_ws.psf
coordinates        1i45_ws.pdb
set temperature    310
set outputname     1i45_ws_eq
firsttimestep      0

#############################################################
## SIMULATION PARAMETERS                                   ##
#############################################################

# Input
paraTypeCharmm     on
parameters         par_all27_prot_lipid.prm
temperature        $temperature

# Force-Field Parameters
exclude            scaled1-4
1-4scaling         1.0
cutoff             12.0
switching          on
switchdist         10.0
pairlistdist       13.5

# Integrator Parameters
timestep           2.0   ;# 2fs/step
rigidBonds         all   ;# needed for 2fs steps
nonbondedFreq      1
fullElectFrequency 2
stepspercycle      10

# Constant Temperature Control
langevin           on    ;# do langevin dynamics
langevinDamping    5     ;# damping coefficient (gamma) of 5/ps
langevinTemp       $temperature
langevinHydrogen   off   ;# don't couple langevin bath to hydrogens

# Output
outputName         $outputname
restartfreq        500   ;# 500steps = every 1ps
dcdfreq            250
outputEnergies     100
outputPressure     100

#############################################################
## EXTRA PARAMETERS                                        ##
#############################################################

# Spherical boundary conditions
sphericalBC        on
sphericalBCcenter  30.3081743413, 28.8049907121, 15.353994423
sphericalBCr1      26.0
sphericalBCk1      10
sphericalBCexp1    2

#############################################################
## EXECUTION SCRIPT                                        ##
#############################################################

minimize           1000
reinitvels         $temperature
run                2500  ;# 5ps

B.5 VMD: 1i45_wb_eq.conf

# Minimization and Equilibration of
# TIM in a Water Box

#############################################################
## ADJUSTABLE PARAMETERS                                   ##
#############################################################

structure          1i45_wb.psf
coordinates        1i45_wb.pdb
set temperature    310
set outputname     1i45_wb_eq
firsttimestep      0

#############################################################
## SIMULATION PARAMETERS                                   ##
#############################################################

# Input
paraTypeCharmm     on
parameters         par_all27_prot_lipid.prm
temperature        $temperature

# Force-Field Parameters
exclude            scaled1-4
1-4scaling         1.0
cutoff             12.0
switching          on
switchdist         10.0
pairlistdist       13.5

# Integrator Parameters
timestep           2.0   ;# 2fs/step
rigidBonds         all   ;# needed for 2fs steps
nonbondedFreq      1
fullElectFrequency 2
stepspercycle      10

# Constant Temperature Control
langevin           on    ;# do langevin dynamics
langevinDamping    5     ;# damping coefficient (gamma) of 5/ps
langevinTemp       $temperature
langevinHydrogen   off   ;# don't couple langevin bath to hydrogens

# Periodic Boundary Conditions
cellBasisVector1   42.0  0.   0.
cellBasisVector2   0.    44.0 0.
cellBasisVector3   0.    0    47.0
cellOrigin         31.0  29.0 17.5
wrapAll            on

# PME (for full-system periodic electrostatics)
PME                yes
PMEGridSpacing     1.0

#manual grid definition
#PMEGridSizeX      45
#PMEGridSizeY      45
#PMEGridSizeZ      48

# Constant Pressure Control (variable volume)
useGroupPressure      yes       ;# needed for rigidBonds
useFlexibleCell       no
useConstantArea       no
langevinPiston        on
langevinPistonTarget  1.01325   ;# in bar -> 1 atm
langevinPistonPeriod  100.0
langevinPistonDecay   50.0
langevinPistonTemp    $temperature

# Output
outputName         $outputname
restartfreq        500   ;# 500steps = every 1ps
dcdfreq            250
xstFreq            250
outputEnergies     100
outputPressure     100

#############################################################
## EXECUTION SCRIPT                                        ##
#############################################################

# Minimization
minimize           100
reinitvels         $temperature
run                2500  ;# 5ps
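The commented-out manual PME grid sizes in the configuration above (45, 45, 48) can be reproduced with a simple rule of thumb: take the smallest integer that is at least the cell length divided by the grid spacing and that has only 2, 3 and 5 as prime factors, which suits FFT-based PME. The Python sketch below implements that rule; the function names are hypothetical and NAMD's own automatic choice under PMEGridSpacing may differ.

```python
import math

def has_only_small_factors(n, primes=(2, 3, 5)):
    """True if n factors completely into the given small primes."""
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1

def pme_grid_size(cell_length, spacing=1.0):
    """Smallest FFT-friendly grid size covering cell_length at the given spacing."""
    n = max(1, math.ceil(cell_length / spacing))
    while not has_only_small_factors(n):
        n += 1
    return n

if __name__ == "__main__":
    # Cell basis vector lengths from the periodic boundary section above.
    for length in (42.0, 44.0, 47.0):
        print(length, pme_grid_size(length))
```

For the 42.0 x 44.0 x 47.0 cell with a spacing of 1.0 this gives 45, 45 and 48, in agreement with the commented values.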

C Extra material for ADUN

C.1 Set up

Add these lines to your $HOME/.bashrc file:

    export LD_LIBRARY_PATH=/cbbl/soft/adun/chile/GNUstep/Library/Libraries:\
    /soft/lib:/cbbl/soft/GNUstep/System/Library/Libraries:\
    /cbbl/soft/adun/lib/lib:/cbbl/soft/OMPI/lib
    export HOMEPATH=$HOME
    export PATH=/cbbl/soft/reduce-3.13/:$PATH

D Additional tools

NAMD
• A nice interface to NAMD runs can be found at http://mmtsb.org/workshops/mmtsb-ctbp_workshop_2009/Tutorials/MMTSB_NAMDSimulation/MMTSB_NAMDSimulation.html
• An extensive example of a standard protocol for minimizing, heating and producing simulations with NAMD is provided here: http://faculty.uml.edu/vbarsegov/teaching/bioinformatics/lectures/MDSimulationsModified.pdf
• Running replica exchange simulations with NAMD: http://www.ks.uiuc.edu/Research/namd/2.6/ug/node40.html
• NAMD case studies: http://www.ks.uiuc.edu/Training/CaseStudies/

MOLARIS
• The complete MOLARIS tutorials: http://cbbl.imim.es/?page_id=143#molaris

ADUN
• The Adun site: http://adun.imim.es
• The Adun install guide can be found here: http://lavandula.imim.es/adun-new/?page_id=103
• What to check if something goes wrong with Adun: http://lavandula.imim.es/adun-new/?page_id=308
• Experimental Live CD including an ADUN distribution: http://susegallery.com/a/hvXWpn/adun-user

References

[Åqvist and Fothergill, 1996] Åqvist, J. and Fothergill, M. (1996). Computer Simulation of the Triosephosphate Isomerase Catalyzed Reaction. 271(17):10010–10016.

[Bonet et al., 2006] Bonet, J., Caltabiano, G., Khan, A., Johnstons, M., Corbí, C., Gómez, A., Rovira, X., Teyra, J., and Villà-Freixa, J. (2006). The Role of Residue Stability in Transient Protein-Protein Interactions Involved in Enzymatic Phosphate Hydrolysis. A Computational Study. 63:65–77.

[Chu et al., 2003] Chu, Z. T., Villà-Freixa, J., Štrajbl, M., Schutz, C. N., Shurki, A., and Warshel, A. (2003). MOLARIS version alpha9.06.01.

[Hansson et al., 1998] Hansson, T., Marelius, J., and Åqvist, J. (1998). Ligand-binding affinity prediction by linear interaction energy methods. J. of Comput-Aided Mol. Design, 12(1):27–35.

[Humphrey et al., 1996] Humphrey, W., Dalke, A., and Schulten, K. (1996). VMD: visual molecular dynamics. J Mol Graph, 14(1):33–8, 27–8.

[Jogl et al., 2003] Jogl, G., Rozovsky, S., McDermott, A. E., and Tong, L. (2003). Optimal alignment for enzymatic proton transfer: Structure of the Michaelis complex of triosephosphate isomerase at 1.2-Å resolution. Proceedings of the National Academy of Sciences of the United States of America, 100(1):50–55.

[Johnston et al., 2005] Johnston, M. A., Galvan, I. F., and Villà-Freixa, J. (2005). Framework-based design of a new all-purpose molecular simulation application: the Adun simulator. J Comput Chem, 26(15):1647–1659.

[Lee et al., 1993] Lee, F., Chu, Z., and Warshel, A. (1993). Microscopic and semimicroscopic calculations of electrostatic energies in proteins by the POLARIS and ENZYMIX programs. Journal of Computational Chemistry, 14(2):161–185.

[Lolis et al., 1990] Lolis, E., Alber, T., Davenport, R. C., Rose, D., Hartman, F. C., and Petsko, G. A. (1990). Structure of yeast triosephosphate isomerase at 1.9-Å resolution. Biochemistry, 29(28):6609–6618.

[Phillips et al., 2005] Phillips, J. C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa, E., Chipot, C., Skeel, R. D., Kalé, L., and Schulten, K. (2005). Scalable molecular dynamics with NAMD. J Comput Chem, 26(16):1781–1802.

[Rozovsky et al., 2001] Rozovsky, S., Jogl, G., Tong, L., and McDermott, A. E. (2001). Solution-state NMR investigations of triosephosphate isomerase active site loop motion: ligand release in relation to active site loop dynamics. Journal of Molecular Biology, 310(1):271–280.

[Scheper et al., 2009] Scheper, J., Oliva, B., Villà-Freixa, J., and Thomson, T. M. (2009). Analysis of electrostatic contributions to the selectivity of interactions between ring-finger domains and ubiquitin-conjugating enzymes. Proteins, 74(1):92–103.

[Warshel and King, 1985] Warshel, A. and King, G. (1985). Polarization Constraints in Molecular Dynamics Simulation of Aqueous Solutions: The Surface Constraint All Atom Solvent (SCAAS) Model. 121:124–9.

[Word et al., 1999] Word, J., Lovell, S., Richardson, J., and Richardson, D. (1999). Asparagine and glutamine: using hydrogen atom contacts in the choice of side-chain amide orientation. Journal of Molecular Biology, 285(4):1735–1747.

