
Molecular Dynamics Simulation of Nucleic Acids: Successes, Limitations, and Promise*

Thomas E. Cheatham, III1

Matthew A. Young2

1 Department of Medicinal Chemistry, University of Utah, 30 South, 2000 East, Skaggs Hall 201, Salt Lake City, UT 84112-5820

2 The Rockefeller University, Laboratories of Molecular Biophysics, 1230 York Avenue, Box 3, New York, NY 10021

Received 4 September 2001; accepted 9 September 2001

Published online 12 December 2001

Abstract: In the last five years we have witnessed a significant increase in the number of publications describing accurate and reliable all-atom molecular dynamics simulations of nucleic acids. This increase has been facilitated by the development of fast and efficient methods for treating the long-range electrostatic interactions, the availability of faster parallel computers, and the development of well-validated empirical molecular mechanical force fields. With these technologies, it has been demonstrated that simulation is not only capable of consistently reproducing experimental observations of sequence-specific fine structure of DNA, but can also give detailed insight into prevalent problems in nucleic acid structure, ion association and specific hydration of nucleic acids, polyadenine tract bending, and the subtle environmental dependence of the A-DNA–B-DNA duplex equilibrium. Despite the advances, there are still issues with the methods that need to be resolved through rigorous controlled testing. In general, these relate to deficiencies of the underlying molecular mechanical potentials or applied methods (such as the imposition of true periodicity in Ewald simulations and the need for energy conservation), and significant limits in effective conformational sampling. In this perspective, we provide an overview of our experiences, provide some cautionary notes, and provide recommendations for further study in molecular dynamics simulation of nucleic acids. © 2001 John Wiley & Sons, Inc. Biopoly (Nucleic Acid Sci) 56: 232–256, 2001

Keywords: all-atom molecular dynamics; nucleic acids; long-range electrostatic interactions; empirical molecular mechanical force fields; ion occupancies; A-tract bending; true periodicity

* We would like to dedicate this article to Peter Kollman, a friend, collaborator, mentor, and all around good guy. Peter was very enthusiastic about DNA simulation; he will be sorely missed.

Correspondence to: Thomas E. Cheatham, III; email: [email protected]

Contract grant sponsor: National Computational Science Alliance

Contract grant number: MCB000004N

Biopolymers (Nucleic Acid Sciences), Vol. 56, 232–256 (2001) © 2001 John Wiley & Sons, Inc.


SUCCESSES OF MOLECULAR DYNAMICS APPLIED TO NUCLEIC ACIDS

An essential component of many biological processes involving nucleic acids relates to their sequence-specific structure and motion. This motion, or sequence-dependent flexibility and dynamics, adds an additional level of complexity beyond structure that is crucial for elucidating the function of nucleic acids. While diverse experimental biophysical techniques are available for the study of nucleic acid structural properties, no one experimental technique is capable of generating a complete description of the dynamical structure of DNA in its native solution environment. Molecular dynamics (MD) simulation, with empirically derived force fields, is one technique that can in principle provide a complete theoretical description of the structure and motion. This is a useful tool not only for developing models of DNA structure and motion, but also for interpreting experimental data. However, while MD is a well-defined theoretical methodology, a number of ongoing methodological questions have limited the widespread acceptance and applicability of the results. In spite of this, in the past five years or so we have witnessed a tremendous increase in the reliability of molecular dynamics simulation applied to nucleic acids. Given this, one of our goals is to convince the wider community of the applicability and promise of these methods, while at the same time clearly stating the limitations in the methods.

Much of the credit for the success in the past five or so years stems from improvements in the empirically derived molecular mechanical force fields, increases in computer power, parallelization of the available simulation codes, and improved realization of the importance of properly treating the long-range electrostatic interactions. Demonstrations of the reliability have come from long time scale stability of the simulations, the ability to reproduce experimentally observed properties, as well as reproducibility of the simulation results. Examples from our experience include the investigation of topics ranging from polyadenine tract (A-tract) bending in DNA duplexes1–3 to conformational transitions among A-DNA and B-DNA related to subtle changes in the solution environment.4–7 We were not the only groups to capitalize on these methodological and technological advances, as the field has seemed to collectively jump forward. To date, a wide repertoire of successes has been seen in simulations of DNA single strands,8 duplexes,9,10 triplexes,11–13 quadruplexes,14,15 and unusual DNA structures such as zipper DNA,16 modified backbones17–19 and damaged DNA,20–22 and also of RNA duplexes and higher order structures.23 These simulations have shown that current molecular dynamics simulation methodologies can not only reproduce experiment, but give insight beyond experiment, and even suggest novel experiments to address scientific problems. Moreover, the reliability of the simulation protocols has pushed the NMR community to begin to include these more reliable protocols (including explicit solvent) in the refinement process.24–26 For a more general review of molecular dynamics simulations applied to nucleic acids, see recent reviews.9,10 Below we summarize some specific areas where we have been directly involved.

Successes: Ion Interaction in the Grooves of DNA

The DNA duplex exists in solution as a highly charged polyanion that has been described as invariably enshrouded in a thermodynamically stable but structurally dynamic counterion cloud that largely attenuates the extremely large charge of the polymer when measured at considerable distance.27 DNA oligomers are stable in a range of ionic conditions, but it is quite well known that specific ionic and solvation conditions can influence both the conformational properties of DNA and the thermodynamics of its interactions with ligands.28 The range of possible complex interactions between counterions and a highly charged polyanionic macromolecule such as a DNA oligomer makes it desirable to include explicit counterions in simulations aimed at investigating conformational properties of DNA. MD simulation is capable of generating a detailed dynamic picture of counterion–DNA interactions that can be used to augment experimental information and suggest models for the molecular basis behind well-known nucleic acid structural sensitivities to ionic conditions. This section will focus on highlighting some of the recent conclusions about DNA–ion interactions that have emerged from MD simulations, and will attempt to outline some of the outstanding questions in the use of simulation to study DNA–ion interactions.

A number of experimental and computational investigations have been published recently that describe not only the interaction of counterions with DNA duplexes, but also the converse, how a counterion influences the structure of the DNA (for a review of recent experimental and theoretical work on counterion–DNA interactions, see Ref. 29). Because the ion–DNA interactions are dynamic, it has been difficult to obtain detailed atomic resolution information about these interactions using experimental techniques. This has led to considerable controversy in the field, with some groups strongly implying direct interaction of monovalent ions in the minor groove30–32 and others suggesting their absence.33 Observations from MD simulation have been utilized by a number of groups to reinterpret experimental data, as well as to propose ideas that have been tested with additional experiments.

The simulation of nucleic acids with explicit water and explicit counterions has now been carried out for over 16 years. An MD simulation on a system containing DNA, water, and ions was first published in 1985 by Kollman and co-workers.34 The system included 830 water molecules and was carried out for a length of 106 ps. Advances in available computer time have had a huge impact on both the size and the time scale of simulations that are possible. The most recent report in the literature at the time of this writing is a 10 ns simulation including roughly 4000 water molecules by Hamelberg et al.,35 and represents a typical current simulation length. The ability to carry out longer simulations on larger systems has specifically addressed two crucial issues in simulation protocol: system sampling and convergence, and periodicity artifacts from long-range electrostatic forces. The large conformational space accessible to mobile counterions as well as the high charge density of DNA–counterion systems has made it essential to address both of these issues.

The simulation sampling required to adequately describe the thermodynamic distribution of noncovalently linked counterions surrounding the DNA in relatively low density within the solvent is difficult to determine. Interaction lifetimes between monovalent ions and specific sites on a DNA double helix can be hundreds of picoseconds in MD simulations (see Figure 1b).2,36 Analysis of the average coordination numbers of Na⁺ counterion distributions about equivalent atoms in the two chemically identical DNA strands of the Dickerson–Drew dodecamer (DDD) sequence d[CGCGAATTCGCG]2 shows that the distributions approach, but do not achieve, perfect equivalence over the course of a 14.0 ns trajectory (as shown in Figures 1a and 1b and also later in Figure 4 for phased A-tract sequences). Given these observations, it is essential that relevant statistics be calculated over simulation times on at least the nanosecond time scale.

FIGURE 1 (a) Sodium coordination to DNA in d[CGCGAATTCGCG]2. The plot shows the average coordination numbers (right axis) for Na⁺ ions interacting with each DNA atom along the duplex, 5′ to 3′ for each strand (x axis). The data are based on a 14 ns simulation of the Dickerson–Drew dodecamer. Values for atoms within strand 1 are plotted in the positive y direction, and for strand 2 are plotted in the negative y direction. (b) Average lifetimes of direct sodium interaction with d[CGCGAATTCGCG]2. Average lifetimes (ps) for direct ion–DNA coordination events for atoms in both strands from a 14 ns Dickerson–Drew dodecamer simulation are shown.
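As an illustration of how contact lifetimes and fractional occupancies of the kind plotted in Figure 1 might be tallied, the following minimal Python sketch assumes that a per-frame true/false record of direct ion coordination at a single DNA site has already been extracted from a trajectory. The function names, the 100 ps frame spacing, and the ten-frame series are hypothetical and are not taken from the simulations described here.

```python
from itertools import groupby

def contact_lifetimes(contacts, dt_ps):
    """Duration (ps) of each uninterrupted run of frames with an ion in contact."""
    return [sum(1 for _ in run) * dt_ps
            for in_contact, run in groupby(contacts) if in_contact]

def occupancy(contacts):
    """Fraction of frames in which the site is directly coordinated by an ion."""
    return sum(contacts) / len(contacts)

# Hypothetical contact record for one site, one entry per saved frame
# (illustrative values only, assuming frames saved every 100 ps).
series = [False, True, True, True, False, False, True, True, False, False]

lifetimes = contact_lifetimes(series, dt_ps=100.0)
mean_lifetime = sum(lifetimes) / len(lifetimes)
print(lifetimes, mean_lifetime, occupancy(series))
```

Comparing such per-site averages between chemically equivalent atoms on the two strands gives one simple check on whether the ion distribution has converged.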

Early simulations of DNA duplexes encountered stability problems arising from sometimes subtle sensitivities of the system to the large magnitude coulombic forces arising from the presence of explicit counterions in the system. For a detailed review of issues in the treatment of counterions in computer simulations of DNA, see Ref. 37. As with other nucleic acid simulation issues, counterion stability issues were greatly improved through the use of simulation protocols that eliminate artifacts in the simulation potential energy functions, such as particle mesh Ewald (PME), and the introduction of further refined parameters for the ions.

Some of the first nanosecond time scale simulations on the DDD sequence using PME and explicit Na⁺ counterions found that, in addition to the Na⁺ ions forming a form of “cloud” around the DNA double helix, individual Na⁺ ions became partially desolvated and took part in specific interactions with sites on the DNA.2,38 A radial distribution function plot of Na⁺ counterions relative to a DNA double helix is shown in Figure 2, with ions directly coordinating with DNA atoms found in the first sharp peak in the distribution. This simulation result has been repeatedly observed in independent simulations on diverse sequences.35,36 The results of these simulations suggested that the electron densities observed in many crystal structures of DNA double helices could actually be at least partly derived from ordered counterions, in addition to ordered water molecules, in support of recent interpretations of crystal structures.30 Estimates of partial Na⁺ ion occupancies are roughly 5–15% in solution, depending on the coordination site, as indicated in Figure 1a. This result could have important implications both for the thermodynamics of ligand interaction and for the potential influence that a densely charged ion might impart upon the structure of the DNA. Recently, NMR using ²³Na has demonstrated that Na⁺ ions are indeed localized preferentially in the minor groove of AT-rich B-DNA with relatively long residence lifetimes.39
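The raw tallies behind a proximity-based distribution of this kind can be sketched as below. This is a minimal sketch under stated assumptions: each ion is binned by its distance to the nearest DNA atom, and a running count is accumulated. It does not perform the volume and bulk-density normalization needed for a true g(R), and the function names and coordinates are hypothetical.

```python
import math

def min_distance(ion, dna_atoms):
    """Distance (Å) from an ion to the closest DNA atom."""
    return min(math.dist(ion, atom) for atom in dna_atoms)

def proximity_histogram(ions, dna_atoms, bin_width, r_max):
    """Count ions per distance bin, using the nearest-DNA-atom distance."""
    n_bins = int(round(r_max / bin_width))
    counts = [0] * n_bins
    for ion in ions:
        r = min_distance(ion, dna_atoms)
        if r < r_max:
            counts[min(int(r / bin_width), n_bins - 1)] += 1
    return counts

def running_count(counts):
    """Cumulative number of ions found within each bin's outer edge."""
    total, cumulative = 0, []
    for c in counts:
        total += c
        cumulative.append(total)
    return cumulative

# Hypothetical single-frame coordinates in Å (illustrative values only).
dna_atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
ions = [(0.0, 2.4, 0.0), (3.0, 0.0, 5.5), (10.0, 0.0, 0.0)]

counts = proximity_histogram(ions, dna_atoms, bin_width=1.0, r_max=8.0)
print(counts, running_count(counts))
```

In practice the counts would be accumulated over all trajectory frames before normalization; dividing the running count by the total DNA charge gives a curve comparable in spirit to the right-hand axis of Figure 2.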

Understanding the structural effect that localized ions impart upon the DNA has been undertaken by us and a number of other groups.29,35,40,41 Complicating this undertaking is the enormous amount of data that must be analyzed to collect adequate statistics, as well as the different techniques used for analyzing the data. Hamelberg et al. have reported that there is a strong correlation between the sequence-determined localization of ions in the minor groove of DNA and the geometry of the minor groove, consistent with measurements from NMR.31,32 McConnell et al., using a slightly more restrictive definition of the groove width (than reported by Hamelberg et al.), report generally lower fractional occupancies in their analysis of long time scale MD trajectories.

FIGURE 2 Proximity-based radial distribution function of Na⁺ ion density around DNA. A proximity-based g(R) plot of Na⁺ ion probability as a function of the distance from the atoms of the central 10 base pairs in the DNA duplex d(CGCGAATTCGCGAATTCGCGAATTCGCG) is shown. The left axis is the relative probability. Dashed straight lines indicate the extent of “ordered” Na⁺ distribution (roughly 8.0 Å). Triangles indicate the running number of Na⁺ atoms relative to the total charge on the DNA (right axis).

Experimental evidence demonstrates that in addition to ion concentration, different ionic species induce different structural properties in DNA double helices. For example, gel electrophoresis measurements carried out in the presence of K⁺ result in different relative migration rates of DNA oligonucleotides than the same measurements carried out in Na⁺-containing buffer.42 The addition of divalent cations such as Mg²⁺ or Ca²⁺ imparts an even greater effect. It appears as though current MD simulation protocols are capable of reproducing some of these ion-specific features,3 and because they are an important feature of DNA fine structure, realistic salt species should be included. Simulations with Mg²⁺ were carried out with fully hydrated Mg²⁺ ions that never lose any of their primary coordinated waters over the 5 ns time span of the simulation. A challenge for the future is to generate improved parameters for dealing with highly charged divalent ions in a more realistic fashion.

More and more evidence is being accumulated to support the idea that counterions interact with DNA double helices in complex sequence- and structure-specific manners. Current MD protocols have demonstrated that this technique is capable of describing many of the features of these complex interactions and generating data that can be utilized to interpret experimental observations. Increased MD simulation, coupled with additional experimental data, will hopefully continue to improve our picture of environmentally and sequence-dependent DNA properties.

Successes: DNA Bending

One of the more important yet controversial problems in the field of DNA structural biology is that of sequence-dependent DNA curvature. Marini et al. first observed naturally occurring intrinsically curved DNA comprising DNA duplex minicircles found in kinetoplast DNA of certain trypanosome organisms.43 Subsequent experimental investigations of the polyadenine stretch (“A-tract”) DNA sequences found in the kinetoplast DNA minicircles confirmed the intrinsic curvature of the repetitive A-tract DNA sequences. A number of models have been proposed over the years both to predict the sequence-dependent curvature of specific DNA sequences and to explain its molecular origins based on the chemistry of the sequences.

As with DNA–ion interactions, MD simulations can potentially offer a dynamic molecular resolution description of the nature of this biophysical problem. This approach is best suited toward building an understanding of the molecular origins of DNA curvature, as opposed to a more empirical “knowledge-based” or statistically derived prediction algorithm. Simulations from diverse laboratories are consistently concluding that A-tract DNA is relatively straight and rigid, and that the sequence-induced bending occurs at the junctions or in the random sequence DNA.1,3,44 Figure 3 shows snapshots from a 5.0 ns simulation of a sequence containing two A-tract DNA sequences located in-phase with the 10.5 base-pair helical repeat of B-DNA. Simulations of A-tract DNA are able to reproduce a number of experimentally measured properties of the DNA, including a helical bend of ~17° per A-tract, a progressive narrowing of the width of the minor groove in the 5′ to 3′ direction along the A-tract, and an environmental sensitivity to counterion species. After it was demonstrated that the simulations are able to reproduce experimentally observable parameters, details of the molecular structures have been analyzed to construct a dynamic picture for the bending at molecular resolution.
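To make the notion of a helical bend angle concrete, the sketch below computes the angle between two helical-axis direction vectors, one for each segment flanking a bend. It assumes the axis vectors have already been obtained from a separate helical analysis; the function name and vectors are hypothetical, and this is not necessarily how the bend values cited here were derived.

```python
import math

def bend_angle_deg(axis_a, axis_b):
    """Angle in degrees between two helical-axis direction vectors."""
    dot = sum(a * b for a, b in zip(axis_a, axis_b))
    norm = math.hypot(*axis_a) * math.hypot(*axis_b)
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

# Hypothetical axis vectors: the second is tilted 17 degrees from the first.
tilt = math.radians(17.0)
axis_before = (0.0, 0.0, 1.0)
axis_after = (math.sin(tilt), 0.0, math.cos(tilt))

print(round(bend_angle_deg(axis_before, axis_after), 3))
```

Averaging such an angle over trajectory frames, rather than reading it from a single snapshot, is what allows comparison with solution measurements.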

A challenge for the future remains to determine whether a general model for DNA bending can be distilled from the increasingly large collection of data that has been reported. A general DNA bending model may also be able to correlate intrinsic bending with physiologically relevant bendability that influences protein–DNA interactions. In the absence of a general model, Schlick et al. have shown that differential bending observed in molecular simulation of a range of oligonucleotides can be correlated with measured affinities for the TBP protein.45

Successes: Environmental Dependence of Nucleic Acid Structure

In the 1994–1995 time frame, upon release of the Cornell et al. force field,46 parallelization of AMBER for shared memory and message-passing parallel machines,47 and incorporation of the particle mesh Ewald method into AMBER,48,49 we were very excited by the promise of stable nucleic acid simulations on a nanosecond time scale at reasonable computational cost.50 A concern at this point, given the previously observed unstable simulation of nucleic acids upon the application of the standard and disastrous charge-group based truncated cutoff in AMBER, was that the stability we observed was too good to be true. In other words, we became concerned that the application of true periodicity and Ewald methods could be overstabilizing the structures on a nanosecond time scale. To show that this was not the case, considerable effort was expended to demonstrate that the dynamics were not inhibited. Throughout this investigation, it was noticed that considerable structural flexibility and dynamics were evident in the molecular dynamics simulations. After initial equilibration away from the starting structure, the sampled configurations from the molecular dynamics trajectory typically remain between 1.5 and 2.5 Å (all-atom root mean square deviation or RMSD) from the average structure for duplexes of ~10–25 base pairs (depending on sequence). In addition to the dynamics evident from fluctuations in the RMSD, we were able to observe sugar repuckering throughout the DNA backbone in DNA duplexes, correlated crankshaft transitions in various backbone angles, and reasonable atomic positional fluctuations.4,50,51 Despite the considerable motion, our worries of overstability led to the investigation of canonical A-DNA duplex models. Our expectation was that transitions from A-DNA to B-DNA would be inhibited on a nanosecond time scale in MD simulation. Part of this was based on worries about periodicity and Ewald artifacts, since it is very easy to construct conceptual arguments about how the imposition of periodicity may alter the dynamics or distort structure. Commonly cited examples include inhibition of the free rotation of dipoles and the lack of net force between two oppositely charged particles at half box separation.52,53 This potential for artifacts is related to interactions between the periodic images, and these interactions may inhibit conformational fluctuations or transitions. Moreover, a later retracted paper had shown inhibition of motion with Ewald in MD simulation.54 In addition to the (perhaps retrospectively unfounded) fear regarding damped motion in Ewald simulations, it is known from experiment on DNA films and fibers that complete shifting of the A–B DNA equilibrium can take hours or days.55–57 This is certainly longer than the nanosecond time-scale molecular dynamics simulations we could bring to bear on the problem. On the other hand, and in spite of the slow transition observed in fibers and films, all hope for a transition in molecular dynamics simulation was not lost, since other experiments suggest the A–B transition is rapid and highly reversible.58 This is likely especially true for small 5–25 base pair duplexes in solution. To investigate the question of A-DNA stability in MD simulation, our two groups and others investigated various duplex models of DNA.

FIGURE 3 Snapshots of A-tract bending from MD simulation. Snapshots from a five nanosecond MD simulation of a DNA sequence containing two phased A-tracts, including explicit water and physiological salt conditions, offer atomic level detail of the nature of the intrinsic DNA curvature. Structures are plotted at 1 ns intervals. Ions: K⁺ purple, Cl⁻ green, Mg²⁺ red.
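The RMSD-from-average measure used above as an indicator of motion can be sketched as follows. This is a minimal sketch that assumes every frame has already been least-squares superimposed on a common reference, a step omitted here; the function names and the two-atom frames are hypothetical.

```python
import math

def average_structure(frames):
    """Mean position of each atom over all (pre-aligned) frames."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    return [tuple(sum(frame[i][k] for frame in frames) / n_frames
                  for k in range(3))
            for i in range(n_atoms)]

def rmsd(coords_a, coords_b):
    """All-atom root mean square deviation (Å) between two coordinate sets."""
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Hypothetical two-atom, two-frame trajectory in Å (illustrative values only).
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
]

mean_coords = average_structure(frames)
print([rmsd(frame, mean_coords) for frame in frames])
```

A trace of this quantity against time that fluctuates within a band, rather than settling to a fixed value, is the kind of evidence that the dynamics are not frozen.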

To our knowledge, all simulations of A-DNA duplexes (with varying sequences) with the Cornell et al. force field46 to date show spontaneous A-DNA to B-DNA transitions (or transitions from a pure canonical A-DNA to a mixed A/B-DNA model) on a 500 ps to multinanosecond time scale.4 This observation is exciting since it suggests that we can see considerable motion of a DNA duplex in nanosecond length simulations, that we can see spontaneous conformational transitions among various DNA helical forms, and further that the force field prefers the canonical B-DNA duplex form under physiological conditions, consistent with experiment. Of course, the fact that spontaneous A-to-B DNA transitions are observed does not fully verify the applied force field, since we may have simply been observing a bias of the Cornell et al. force field toward B-DNA structures. An example of this type of force field bias, but in the opposite direction, was observed by Pettitt’s group.59 They observed spontaneous B-DNA to A-DNA transitions on a nanosecond time scale with the all22 force field of MacKerell.60 This turned out to be a known bias of this force field61 that has been overcome in newer parameterizations.62,63

In order to verify the generality of the force field, it is important to demonstrate not only that spontaneous A-DNA to B-DNA transitions can be observed under physiological conditions, but also that B-DNA to A-DNA transitions can be observed under conditions expected to stabilize A-DNA. Examples of conditions known to stabilize A-DNA are mixed ethanol/water solution (in the presence of sodium), high salt (NaCl), or the binding of certain polycationic ligands.28 To test the generality of the Cornell et al. force field, we began a series of studies to see if we could stabilize canonical A-DNA duplexes under the conditions expected to stabilize A-DNA, and moreover, to see if we could observe spontaneous B-DNA to A-DNA transitions under these conditions. These conditions, investigated by our groups and others, included ~85% ethanol/water (by volume) in the presence of Na⁺ ions, high salt (1–4 M NaCl solution), and binding of the polycationic ion hexammine cobalt(III). Various levels of success were seen. In ethanolic solution, A-DNA was stable on a 5–10 ns time scale5,7 with the Cornell et al. force field; however, spontaneous B-DNA to A-DNA transitions were not observed. In high salt (1–4 M NaCl solution), A-DNA was not stabilized (Cheatham, unpublished data). Instead, a structure that can be characterized by A-DNA base pair geometries and a B-DNA backbone is seen.64 The most reliable representation is observed with the binding of four hexammine cobalt(III) ions, Co(NH₃)₆³⁺, where spontaneous B-DNA to A-DNA transitions are observed if two of the ions bind into GpG sequence pockets in the major grooves with the other two ions hanging nearby to screen the close approach of the phosphates in the bend across the major groove (evident in the A-DNA structure).6 These short simulations give a glimpse into the nature of the A-DNA–B-DNA equilibrium; however, even now more detailed study of the processes is necessary. Specifically, this is necessary to better understand the interplay of screening by salt and the subtle interplay of specific solvent and ion interactions in the A–B conformational transition process. In terms of performance with respect to modeling the A-DNA–B-DNA equilibria, the BMS force field for nucleic acids developed by Langley is currently the best. However, part of the reason for this is that this force field was explicitly tuned to model this process, i.e., top-down rather than bottom-up parameter development.64

A cautionary note regarding the relative performance of the current force fields for nucleic acids is that not all modifications to a given force field necessarily improve all properties. In other words, the proper balance of the parameters is subtle and improvement in one area may lead to worse performance in another; a classic example is the difference between the TIP3P65 and SPC/E66 water models where (simplistically) the former sacrifices diffusion for reasonable energetics and the latter sacrifices energetics for proper diffusion. An example in nucleic acid force fields, from our experience, is the parm98/parm99 improvements to the Cornell et al. force field.67 Although a careful tweaking of the dihedral parameters led to improved sugar pucker phases, endocyclic torsion chi values, and overall helical twist, this force field does not perform as well as its predecessor in terms of A-DNA stabilization. With the parm98/parm99 modifications, spontaneous A-DNA to B-DNA transitions (opposite to the expected direction) are seen in mixed water/ethanol solution, and hexammine cobalt(III) binding does not seem to induce B-DNA to A-DNA transitions, at least in preliminary simulations (Ref. 67 and unpublished data). Another cautionary note relates to the time scale for the transitions among A-DNA and B-DNA. Given the lack of knowledge from experiment, we do not know exactly what the time scale for the conformational transitions should be. It could be that the simulations are showing B-DNA to A-DNA transitions too fast, or alternatively, that we have not sampled long enough in simulation to see the spontaneous B-DNA to A-DNA transitions that are expected. It could simply be that the barriers are larger than can be sampled in nanosecond length simulation. Supporting evidence comes from estimates of the relative free energy of A-RNA–B-RNA and A-DNA–B-DNA that favor the appropriate duplex under appropriate conditions.68

The issue of energetic balance and conformational sampling is a subtle one that in this case likely also relates to the time scale of interaction of ions and solvent and also the collective motions of the DNA duplex. In this regard, the issue of time scale is not even resolved at the level of sugar repuckering since, to our knowledge, experiments do not precisely map out the time scale for sugar repuckering for DNA duplexes in solution.

Despite the cautionary tone, it should be noted that considerable success of MD simulation is evident. No longer are special tricks applied to see stable canonical DNA duplex structure. Instead, we get detailed insight into the sequence-dependent fine structure of nucleic acids, including some representation of the A–B-DNA equilibria and A-tract bending.

Successes: Varied DNA Structures

In addition to standard and canonical nucleic acid structures, such as duplexes of DNA, tRNA, or various protein/nucleic acid complexes, simulation has been applied with reasonable success to investigate alternative DNA structures, such as guanine-quadruplex DNA14,15,25,69 and even unusual structures involving intercalated bases such as i-DNA70 or zipper DNA.16 The zipper DNA structure is particularly interesting since it contains a central zipper of four non-base-paired and intercalated adenines along with a flanking sheared GA mismatch. This structure, consistent with experiment, is relatively rigid in nanosecond-length solution phase MD simulations. The stability of this structure suggests that observation of this structure is not due to crystal packing artifacts. The stability is also a testament to the generality of the force field in that not only can canonical duplex structures be well represented, but also less usual structures. A follow-up to this study, in collaboration with the authors, involves the investigation of a series of DNA sequences, such as d[GCGAAGC], that can in principle adopt a zipper structure, a mismatched duplex, or a single-stranded hairpin structure. Accurate MD simulation (with PME, in explicit solvent with conditions similar to our previous work) suggests that duplex, hairpin, and zipper models of d[GCGAAGC] are all stable in multinanosecond length simulation. We are following up these studies with free energy analyses to probe the relative stability of each (Cheatham, Sponer, and co-workers, unpublished data).

In addition to probing the stability and energetics of standard base mismatch pairs, a number of studies have investigated modified bases such as difluorotoluene,71,72 1,N6-ethenoadenine,73 and oxanosine.74 In each case, stable MD simulation is observed. However, as we will discuss in greater detail later, stability alone is not sufficient to judge the reliability of a given simulation. Also necessary are proper control simulations and ideally some means to estimate the relative importance of the sampled structures from the MD simulation. However, this is not always necessary in practice to provide useful insight, since interesting features may be readily apparent in nanosecond length simulation. For example, in the studies of the non-natural isostere of thymine, difluorotoluene, although the duplex structure was not disrupted (consistent with experiment), significantly more dynamics were evident, including base-pair opening events at the modified base.71 To some readers, this may not seem like a surprise since in MD simulations of duplex DNA prior to ~1995, base-pair fraying was a commonplace event requiring the addition of artificial restraints to prevent it. However, given a proper treatment of the electrostatic interactions (either through the application of Ewald methods or atom-based force shifted cutoffs), base-pair opening is not a very common event in MD simulation. Nor should it be, since, even in regular Watson–Crick stabilized DNA, the most rapid base-pair opening events (in GC base-pair tracts) occur on a millisecond time scale.75 In all of our combined sets of MD simulation to date (using proper methods to treat the electrostatic interactions), we have only observed a few DNA base-pair opening events in 1–25 nanosecond length simulations. In each of these cases where opening was observed, it tended to occur at termini17 and was often followed by closure of the base pair.

Simulations of DNA quadruplex structure, despite being rather rigid on a nanosecond time scale in molecular dynamics simulation, have given insight into structural changes with different ions (Na⁺ vs K⁺) and modification of the bases.14,15,25,69 Additionally, recent simulations have shown exchange or binding of ions into the guanine quartets from within or into the quadruplex from outside.69,76 Despite some success in modeling the exchange and binding of ions, longer simulations or the investigation of many possible substrates is clearly necessary. Even so, a promising new area of application is in the prediction of the loop structure in G-DNA quadruplexes. Given recent advances in methods to enhance sampling in loops, such as the application of the locally enhanced sampling methodology,77 it is expected that in the near future it will be possible to predict the structure of G-DNA quadruplexes including any loops between strands. In related work, MD simulation has recently been used to aid in understanding the interaction of cationic porphyrins with G-quadruplexes.78 The goal, towards the eventual design of potential anticancer agents, is to be able to distinguish among various G-DNA quadruplexes (such as antiparallel vs parallel quadruplexes). In this study, MD simulations (with the BMS force field) were applied as an adjunct to experiment to investigate different possible binding modes of the porphyrins (either stacked on the ends or intercalated). The simulation results (based on a significantly more favorable free energy of solvation) favor the stacked binding.

A key aspect of all the success observed to date is the application of state-of-the-art methods and force fields, including an Ewald or atom-based force shifted cutoff treatment of the electrostatics. Most of the simulations have involved running 1–10 nanosecond length simulation on a single model and then judging the utility by monitoring stability and making inferences about the structure and dynamics. More promising results will come from detailed 1–100 nanosecond length simulation on multiple models along with better methods to probe the structural changes and interactions of water and ions. A significant advance in this regard is the postprocessing of the MD simulation data to judge the relative importance, or free energy difference, between sampled configurations; this is discussed in greater detail below.

ISSUES IN THE SIMULATION

Despite many successes in accurately representing the sequence specific fine structure of nucleic acids, there are still a number of issues that plague the simulation and are cause for some concern. Fundamental are the basic limitations of all biomolecular simulations: sampling and energy. In addition to these, there is great concern about the use of Ewald methods and the application of true periodicity, and about the reliability of the current “best” nucleic acid force fields for representing both the DNA and the ions. Moreover, it should be noted that these accurate nanosecond length simulations with explicit solvent and Ewald methods are rather expensive, even with the use of the fast and efficient particle mesh Ewald method. Although computers are continuing to get faster and faster, the state-of-the-art in nucleic acid simulation rarely reaches beyond 25 ns of simulation at the present time for single simulations of small (10–100 bases) nucleic acid systems in solution. Therefore, there is a large drive to develop and apply faster methods, such as those with minimal solvent or an implicit solvent model, to allow longer and larger simulation. In this issue, Tsui and Case discuss the use of the generalized Born implicit solvent model as one such approach to allow longer simulations of larger nucleic acid systems. The generalized Born method allows efficient parallel simulation of large nucleic acid systems at a cost of only 4–5 times that of in vacuo simulation and has proven reasonably successful in nucleic acid simulation assuming the use of balanced parameters.79 Alternative models apply minimal explicit solvent.80 In addition to longer and better simulation, there is a critical need to estimate the relative importance of a given model in order to judge its reliability and reality. Each of the concerns and issues noted above is discussed in some detail below.

Conformational Sampling

Even with a simple pairwise empirical force field, fully sampling all thermally accessible conformations of a small nucleic acid duplex is not possible in current state-of-the-art molecular dynamics simulations in the 1–50 ns time scale. The fact is that the potential energy surface is rough with many accessible local minima. Transitions among these minima take time, and simulations at least an order of magnitude longer than the correlation time of a particular observable are necessary for adequate sampling.81
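As a rough illustration of the rule of thumb above, the following sketch estimates the integrated autocorrelation time of an observable (for example, a time series of helical twist or groove width) and checks whether the trajectory spans at least ten such times. The function names, the truncation of the sum at the first non-positive autocorrelation value, and the use of a factor of ten as a hard threshold are assumptions made for illustration; this is not a procedure taken from the studies cited here.

```python
def autocorrelation_time(series):
    """Integrated autocorrelation time of a time series, in units of frames.

    The sum over lags is truncated at the first non-positive value of the
    normalized autocorrelation function (a common but crude heuristic).
    """
    n = len(series)
    mean = sum(series) / n
    dx = [value - mean for value in series]
    variance = sum(d * d for d in dx) / n
    if variance == 0.0:
        return 0.0  # a constant series carries no correlation information
    tau = 0.5
    for lag in range(1, n // 2):
        covariance = sum(dx[i] * dx[i + lag] for i in range(n - lag)) / (n - lag)
        c = covariance / variance
        if c <= 0.0:
            break
        tau += c
    return tau


def long_enough(series, factor=10.0):
    """True if the series spans at least `factor` autocorrelation times."""
    return len(series) >= factor * autocorrelation_time(series)
```

A slowly drifting observable yields a correlation time comparable to the trajectory length and fails the check, which is the situation described for properties that have not converged. A passing check is only a necessary condition, since a trajectory trapped in one metastable state can look well converged.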

The implication of this is that molecular dynamics simulation can get trapped in metastable conformational states that may not be representative of reality. An excellent example of this, from our experience, is in simulation of canonical B-form RNA duplexes that are trapped in a “B-RNA” structure even when the simulations are extended beyond 10 ns with the Cornell et al. force field (Ref. 82 and unpublished observations). This is inconsistent with experiment since isolated “B-RNA” duplexes have never been observed. Similarly, simulations of RNA tetraloops become trapped in both correct and incorrect metastable conformations in nanosecond length simulation unless artificial means are applied to force conformational transitions.77,83,84 Even fairly accessible conformational fluctuations, such as that between A-DNA and B-DNA conformations, likely take at least four nanoseconds for small DNA duplexes in solution with the Cornell et al.46 or MacKerell all22 (Ref. 60) force fields.85 A further example of limited sampling is seen with the MacKerell all22 force field where even with 10 ns of sampling, no repuckering of some of the purine nucleotides is observed.85 Crystal packing can also inhibit sampling, as was observed in 25 ns length simulations of two (periodically repeated) unit cells of the crystal of d[CCAACGTTGG]2 containing four separate duplexes.86 Although solution phase simulations seemed to converge by four nanoseconds, even after 25 ns in the crystal simulation, the average structures of each of the equivalent duplexes did not fully match, differing by ~1.0–1.5 Å. The cautionary note is that care should be taken to understand if the properties of interest have been sufficiently sampled in a molecular dynamics simulation and if the sampled model is representative of reality. Although 1–5 ns of simulation may be sufficient with the Cornell et al. force field, longer simulations are clearly necessary for convergence with the newer variant of the Cornell et al. force field67 or Langley’s BMS force field64 due to slower sugar repuckering and less effective motion; for example, simulations by Langley suggest metastability of various B-DNA or A-DNA states depending on the ionic environment. For simulations in solution, our experience and the collective experience of the field suggest that reasonable equilibration of DNA duplex structure occurs in a 5–25 ns time frame. However, not all properties may converge on this time scale. A case in point is ion distributions and specifically groove associated monovalent ions. Although the monovalent ion cloud should equilibrate more rapidly than higher valent ions (which have larger solvation free energies, larger interaction energies, and generally slower diffusion rates), current simulations suggest that significantly longer simulations are needed even for monovalent ions.

Recently, we have shown significant sampling limitations in simulations of phased A-tract DNA duplex structures. In an attempt to reproduce previous simulation,1 a series of 15+ ns length simulations of d[CGA4T4CGA4T4CG]2 (A4T4) and d[CGT4A4CGT4A4CG]2 (T4A4) with truncated octahedral unit cells, explicit TIP3P water and ~200 mM excess Na+/Cl− or K+/Cl− ions were performed (as discussed in more detail in the methods section). So far, the simulations fail to show convergence of ion distributions within the minor groove (and fail to show the differential bending patterns of A4T4 vs T4A4, with A4T4 more bent, as expected and as was observed previously in MD simulation). Shown in Figure 4 is a schematic representation of each of the duplexes showing ion occupancies (as percentages occupied via an interaction to a base donor or acceptor atom of less than 3.5 Å calculated over ~15 ns of simulation) for Na+ ions interacting with bases in both the minor (red) and major (black) grooves. Although one may quibble about the definition of what “ion interaction in the minor groove” means, as both duplexes are symmetric we expect the distributions to be equivalent in both halves of the molecules regardless of the definition of what a bound ion is. In addition to equivalent populations across the symmetric duplex, we expect the ions to bind preferentially to specific locations in the minor groove. Based on hydroxyl radical footprinting,87 NMR,31,32 crystal,30 and simulation1–3,35 studies, it is known that there is a progressive minor groove narrowing in the 5′ to 3′ direction in A-tracts and that ions tend to localize in the narrow regions. This is shown schematically in Figure 4 with the groove narrowing in blue and the high probability minor groove ion locations in gray. This pattern of ion binding and minor groove narrowing has been seen in earlier MD simulations, as previously discussed. However, in the current set of simulations and considering the A4T4 sequence (left in Figure 4), it is seen that the top A4T4 region has significant (~77%) ion association throughout the simulation in the expected regions whereas the bottom A4T4 region only has a little ion association (8–12%) in a location offset somewhat into wider regions of the A-tract. Similarly with the T4A4 duplex, although the ions tend to localize in the narrower regions in the minor groove, as expected, the populations are low and not very well balanced (with ~23% in the central part of the upper half of the duplex compared to ~10% in the central part of the lower half of the duplex). Despite the lower ion occupancy, as much bending is seen with this sequence (if not more than A4T4; data not shown). At this point, it is difficult to determine the direct cause for the lack of complete sampling. An obvious reason is insufficient sampling in that the simulations are too short. However, other causes may relate to a misbalance of the ion parameters. For example, the ion parameters may underestimate the interaction with DNA. This is a distinct possibility since the ion parameters used are AMBER adapted (to additive combining rules) Aqvist parameters88 that are known to underestimate the free energy of solvation.89 The differences compared to earlier simulations may also relate to differences in the applied periodicity. The current simulations were performed in truncated octahedral unit cells rather than long and thin rectangular boxes to avoid issues related to the rotation of the DNA. It is possible that this change in shape alters the distribution of ions to place more away from the DNA. Another difference is that the current simulation only included Na+ ions compared to some K+ ions in the earlier simulations. However, preliminary simulations using K+ instead of Na+ show similar trends. Finally, an alternate possibility is a protocol error in the current simulations. To address these questions in greater detail, a series of longer simulations is underway. The point of discussing the conformational sampling limitations here is that they have serious implications for MD simulation.

FIGURE 4 Ion occupancies in the minor (red) and major (black) grooves from 15 ns of MD simulation. The expected locations of minor groove ions, based on experiment, are shown in gray along with expected minor groove narrowing in blue. The occupancies are calculated as percentages of interaction (defined as a distance between heavy atoms of less than 3.5 Å) with minor or major groove donors/acceptors over 15 ns portions of the respective trajectories. The A4T4 sequence is shown on the right and the T4A4 on the left.
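The occupancy measure used for Figure 4 (the percentage of frames in which an ion lies within 3.5 Å of a base donor or acceptor atom) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and input layout are hypothetical, coordinates are assumed to be already imaged into a common periodic cell, and only a single site atom is tracked.

```python
import math


def occupancy_percent(site_frames, ion_frames, cutoff=3.5):
    """Percentage of frames in which any ion is closer than `cutoff` (in Å) to the site.

    site_frames: one (x, y, z) tuple per frame for a base donor/acceptor atom.
    ion_frames:  one list of (x, y, z) ion positions per frame.
    Periodic imaging is NOT handled here (assumed done beforehand).
    """
    if not site_frames or len(site_frames) != len(ion_frames):
        raise ValueError("need equally long, non-empty site and ion trajectories")
    occupied = 0
    for site, ions in zip(site_frames, ion_frames):
        # Strict "less than" matches the definition quoted in the text.
        if any(math.dist(site, ion) < cutoff for ion in ions):
            occupied += 1
    return 100.0 * occupied / len(site_frames)
```

For a symmetric duplex, comparing the values returned for symmetry-related sites in the two halves gives a simple convergence check, which is the comparison made in the text.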

Energy

Although the future holds promise for fast ab initio quantum, density functional, or semiempirical quantum methods90–93 for the accurate representation of the energy of nucleic acid systems, it will be a number of years before routine use of these types of potentials will be possible in nanosecond length molecular dynamics simulation of nucleic acids. To keep the simulations tractable, it is generally necessary to apply simple empirically derived molecular mechanical potential functions or force fields. The good news is that the newer generations of force fields do a reasonable job in representing nucleic acid structure and dynamics. This includes both DNA and RNA systems and a wide variety of structures. However, in spite of the successes, there are still limitations in the applied potentials. Each force field has its strengths and weaknesses. These will be described in the next section. What may be somewhat surprising to many is that the simple empirical pairwise force fields work at all and, moreover, do a rather good job of representing the sequence specific structures and dynamics for a variety of nucleic acids. The weaknesses represent subtle issues with balance of the parameters. It is expected that with further tuning of the dihedral potentials even better agreement will be seen. For example, it is well known from a variety of molecular dynamics simulations that the Cornell et al.46 force field underestimates helical twist, the chi angle of the bases to the backbone, and the average sugar pucker. To improve this force field, we expended a tremendous amount of computer time and 100+ ns of simulation to benchmark and improve the dihedral angle part of the Cornell et al. force field. The goal was to “fix” these deficiencies. Although this parameterization did lead to better helical twist, chi, and sugar pucker distributions,67 the force field no longer samples conformational space as fast. Moreover, some of the desirable properties related to (perhaps exaggerated) DNA bending and ion association are apparently lost. This shows the complexity of tweaking force fields due to the interrelated and correlated effects of all the separate parameters.

Despite the minor deficiencies of the current force fields, they appear to do an excellent job on a problem of immense complexity. This is shown by not only reproducing expected structure and dynamics, but reproducing expected energetic differences, such as preferred stabilization of B-DNA and A-RNA under physiological conditions,68 correct ranking of polyA–polyT to polyG–polyC stability,94 correct ranking of tetraloop structures,95 correct modeling of benzo[a]pyrene modified DNA,22 and a variety of other free energetic interactions ranging from minor-groove binders to protein/RNA interaction.96

The next major step towards improvement may come with the availability of force fields that include nonadditive effects, polarization, and lone pairs on the DNA bases. However, this process will likely take some time and significant effort, since to verify the new force fields the previous results (such as A-tract bending, ion association, and environmental dependence of nucleic acid structure) will have to be reproduced. The Kollman group has been working on this for some time and a polarizable force field may be available in the near term (P. Cieplak, personal communication); however, it remains to be fully benchmarked.

Benchmarking Available Force Fields for Nucleic Acids

In order to better understand the strengths and limitations of currently available and reliable force fields for nucleic acids, we have begun a systematic and consistent study of a variety of 10-mer DNA duplex structures, including sequences with all possible dimer-step repeats, homopolymers, and various experimentally relevant structures (d[CGCGAATTCGCG]2, d[CCAACGTTGG]2, and others). The goal, in collaboration with the respective force field developers and the Brooks lab at the NIH, was initially to perform ~2.5 ns of simulation, in a consistent manner (applying the same simulation program and consistent runtime parameters), on each of the sequences starting from both A-DNA and B-DNA geometries. The force fields initially chosen for study were the Cornell et al.46 and its newer variant,67 the newer MacKerell (all27) force field,62,63 and the BMS nucleic acid force field developed by Langley.64 These were chosen since they are relatively popular force fields and have been shown to perform reasonably well in a variety of simulations of nucleic acids. Moreover, these force fields do not suffer the need for special tricks, such as the tethering of ions, seen with other force fields97 in order to maintain reasonable DNA geometries. A further benefit of the current study relates to the conformance of the values. If similar structural properties and dynamics are observed with multiple force fields, each derived independently and with a different philosophy, this further supports the reality of the MD simulation results.

After the initial 2.5 ns of simulation completed (with details described in the methods section), it appeared that the simulations were too short for a fully meaningful comparison. They are currently being continued up to the 10+ ns time scale. The discussion here centers on preliminary analysis of these 2.5–10 ns length trajectories and represents part of a presentation by Cheatham to the ACS National Meeting in Washington, DC, in August 2000. Full details will be presented elsewhere. Spontaneous A-DNA to B-DNA transitions, under low salt conditions in solution, are observed with each of the force fields for a wide variety of DNA duplexes. The Cornell et al. force field samples DNA duplex conformations rather rapidly (with spontaneous A-DNA to B-DNA transitions on a 500 ps or less time scale). The MacKerell (all27) and BMS force fields require longer simulations (1+ ns) to complete the transition from A-DNA to B-DNA. As mentioned, the Cornell et al. force field underestimates helical twist, chi, and pucker, but does a reasonable job of reproducing expected bending and twisting patterns,98,99 except that the sequence-specific bending is overemphasized and twisting is underestimated. The rapid fluctuation in conformational states with the Cornell et al. force field suggests that shorter simulations can be used to sample model structures and that the force field tends to avoid getting trapped in (energetically accessible) metastable conformational states in nanosecond length simulation. The corollary of this is that structures rapidly interconvert between B-DNA and mixed A/B-DNA like structures.85,100 Overall, the Cornell et al. force field is more “solution” like, meaning that the structures more closely resemble those observed in solution.98 This contrasts with both the MacKerell all27 force field and the BMS nucleic acid force field, which tend to have average helicoidal values closer to crystal averages.99 Along with the tendency to be closer to crystal structure averages, less effective sampling is observed, which means that the structures can and do get trapped in metastable conformational states in nanosecond length simulation. This limitation in effective sampling (in short simulations) is most acute with the BMS nucleic acid force field. For example, in simulations of three nanoseconds or less, an inversion in the sequence specific pattern of high and low roll values (in 10-mer sequences of repeated dimer steps) was observed in d[ATATATATAT]2 (incorrect) compared to d[TATATATATA]2 with the BMS force field. This is shown in Figure 5, where a comparison of the roll and twist values for each dimer step based on 2.5 ns length simulations is presented for each of the force fields. The expected values of twist, based on analysis of crystal99 and solution98 structures, for the TpA steps are 40.0° and 37.3°, respectively, compared to 33.4°/34.6° at ApT steps. The expected roll values for crystal vs solution are 2.6°/4.0° for TpA steps and −0.6°/−2.1° for ApT steps. Neglecting the inverted preference for rolling at the ApT step seen with the BMS force field (likely an artifact of insufficient sampling), the force fields all do a reasonable job of reproducing the sequence specific structural features seen experimentally. Another force field specific tendency that is exposed in Figure 5 is the apparent preference for rolling over twisting with the Cornell et al. force field.

FIGURE 5 Roll and twist comparison for the various force fields in simulations of d[ATATATATAT]2 vs d[TATATATATA]2. The base-pair roll (degrees) and base-pair step twist values (degrees) computed for each dimer step in d[ATATATATAT]2 vs d[TATATATATA]2 are plotted. These values were obtained from straight coordinate averaged structures obtained in 2.5 ns length simulations with the Cornell et al. (black), MacKerell all27 (red), and Langley BMS (green) nucleic acid force fields.

FIGURE 6 Comparison of the average structures of d[CGCGAATTCGCG]2 from 3 ns length simulations. Shown are structures from left to right for the Cornell et al., MacKerell (all27), and BMS nucleic acid force fields compared to the recent high resolution NMR structure.101 Overall, the structural similarities are evident, although the MacKerell (all27) shows less minor groove narrowing in the central AATT region and the Cornell et al. structure is significantly undertwisted.

FIGURE 7 Comparison of average helicoidal values from 3 ns length simulation of d[CGCGAATTCGCG]2 compared to the NMR structure. Shown are the helicoidal values (calculated using the Dials and Windows37 interface to Curves134) for each base pair and base-pair step from simulations with the Cornell et al. (black), MacKerell all27 (red), and BMS (green) force fields compared to the NMR (blue) solution structure.101

In order to characterize the sampling and relative convergence, the self-shifted root-mean-squared deviations (RMSD) for these two 10-mer duplexes can be calculated. This involves matching the central part of each duplex to the other duplex, shifted by one base pair (effectively by matching residues 2–8 of one of the duplexes to residues 3–9 of the other). For the Cornell et al. simulations, this all-atom RMSD was 0.53 Å, showing quite reasonable sampling in 2.5 ns of simulation. This compares to 1.08 Å for the MacKerell (all27) and 2.0+ Å for the BMS force field. Although the Cornell et al. simulations showed faster convergence, the RMSD to canonical B-DNA was larger (3.7 Å) than seen with the other force fields (all27 = 2.67 Å and BMS = 2.07 Å). In general, the BMS force field tends to stay closest to canonical B-DNA geometry, even for polyG–polyC homopolymer duplexes that arguably should have more A-DNA character. The only significant limitation of this force field, beyond sampling and tendency towards very B-DNA like geometries (under physiological conditions), is an underestimation of the base pair opening.
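The self-shifted RMSD described above can be sketched as a best-fit (Kabsch) superposition of positions 2–8 of one duplex onto positions 3–9 of the other. The following is an illustrative sketch, not the analysis code used for the reported values: the function names and the per-position input layout are assumptions, each list entry is assumed to hold all atoms grouped at one position of the duplex, and matched positions are assumed to contain the same atoms in the same order.

```python
import numpy as np


def superposed_rmsd(p, q):
    """RMSD after optimal rigid-body superposition of p onto q (Kabsch algorithm)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("coordinate sets must have identical shapes")
    p0 = p - p.mean(axis=0)  # remove translation
    q0 = q - q.mean(axis=0)
    u, _, vt = np.linalg.svd(p0.T @ q0)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0.0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = p0 @ rotation.T - q0
    return float(np.sqrt((diff ** 2).sum() / len(p)))


def self_shifted_rmsd(residues_a, residues_b, first=2, last=8, shift=1):
    """Fit positions first..last of A onto positions first+shift..last+shift of B.

    residues_x: list of (n_atoms, 3) arrays, one per position; index 0 is position 1.
    """
    sel_a = np.vstack(residues_a[first - 1:last])
    sel_b = np.vstack(residues_b[first - 1 + shift:last + shift])
    return superposed_rmsd(sel_a, sel_b)
```

Applied to the average structures of two sequences related by a one-step shift, a small value indicates that both simulations have sampled equivalent structures for the shared central region.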

In Figure 6 are shown average structures from 3 ns length simulations of d[CGCGAATTCGCG]2 compared to the NMR-derived solution structure (in blue).101 From this picture, it is apparent that each of the force fields does a reasonable job of reproducing the expected sequence-specific structure. This is further demonstrated for the helicoidal values (shown in Figure 7) and the minor groove widths (shown in Figure 8), which all show similar trends in comparison with experiment. Despite the general agreement with the experimental structure, the underestimate of the helical twist is clearly evident in Figure 6 for the Cornell et al. simulations, as is the less-than-expected minor groove narrowing seen with the MacKerell (all27) force field. Despite less minor groove narrowing than expected, the MacKerell all27 force field does an excellent job of reproducing sequence trends in sugar puckers, although long lived C3′-endo states are apparent in some cases (such as simulation of polyA–polyT 10-mer duplexes).

Limitations of the current study, beyond sufficient sampling in the molecular dynamics, are difficulties with comparison to experimental values. The average “solution” structure values tend to be biased somewhat through the inclusion of poorly refined NMR structures (without residual dipolar coupling data) whereas the crystal structure averages may be tainted by crystal packing artifacts. A more worrisome issue, however, relates to detailed comparison of sugar repuckering and backbone dihedral transitions. At this point, it is not clear exactly what distribution of C2′-endo vs C3′-endo should be observed (be it based on NMR data on isolated nucleosides102 or inferences from newer NMR data), nor what the time scales for repuckering or correlated changes in the backbone angles are. This information is necessary to allow fuller calibration of the force fields.

Estimating the Relative Free Energies of States Sampled in MD Simulation

In the absence of experimental or free energetic data that suggest a given sampled state is reliable, there is no way to prove that a given model is reliable simply through the observation of stable MD simulations. In other words, stable simulation, with flat and converged RMSDs plotted vs time, is not sufficient to determine the reliability of a given simulated structure. This limitation alone does not invalidate all MD simulation. Numerous simulations of DNA duplexes suggest that it is possible to accurately simulate an experimentally derived structure. Moreover, we have seen with DNA duplexes spontaneous conformational transitions between A-DNA and B-DNA and A-tract bending, groove narrowing, and ion association that are consistent with experiment. The cautionary note, stating that stable simulation is not sufficient, suggests that multiple simulations of a given model structure are likely necessary. Moreover, simulation of sufficient length to fully equilibrate the properties of interest is necessary. As was discussed in the context of minor groove ion association, even 15 ns of simulation was insufficient to fully “equilibrate” the ion distributions. A means around this problem is via an estimation of the relative importance of a given sampled structure in MD.

FIGURE 8 Comparison of minor groove widths for d[CGCGAATTCGCG]2 compared to experiment. Minor groove widths for the simulated structures (3 ns averages) of d[CGCGAATTCGCG]2 with the Cornell et al. force field (black), the MacKerell all27 force field (red), and the BMS force field (green) are compared to the NMR solution structure (blue).101 The groove widths were calculated with Curves.135 Apparent is the underestimate of groove width in the AATT region in the simulations with the MacKerell all27 force field.

FIGURE 9 Potential periodicity artifacts. The rotational correlation times for small nucleic acid duplexes (~5–25 base pairs) in solution are on the nanosecond time scale. This can lead (as shown in the top panel) to interaction of the DNA duplex with its periodic images. This can cause distortion of the structure as has been seen in simulations of RNA duplexes (Cheatham, unpublished observations). In addition to artifacts from rotation, some programs allow independent scaling of the three box lengths in constant pressure periodic boundary simulations. The bottom panel shows schematically how the box may shrink to allow interaction of the ends of the duplex in a periodic fashion. This has been observed in CHARMM131 simulations with independent (anisotropic) pressure scaling (Cheatham, unpublished observations).

A significant advance in the past few years has been the application of methods that allow a crude estimate of the relative free energy difference between conformational states sampled in molecular dynamics simulation.68,94,103 The basic idea is to run a series of molecular dynamics simulations, each started at a different state (for example “B-RNA” and canonical A-RNA), and then via averaging molecular mechanical energies (with solvation components obtained from continuum/implicit solvent models) come up with an estimate of the free energy difference between the sampled states. As discussed in a recent review,96 this MM_PBSA approach allows us to test the performance of the force fields (for example showing that “B-RNA” is less stable than A-RNA68 or B-DNA is preferred in water solution over A-DNA7), estimate binding free energies, and also probe the effect of sequence mutations on protein structure. An obvious limitation of the MM_PBSA methods relates to the inclusion of explicit solvent or bound counterions. A simplistic solution, and one that seems to have promise, is via the inclusion of small amounts of explicit “bound” water and counterions. Although this was done in the initial method of Jayaram,103 large fluctuations in the (free) energy computed were evident. Recently, we have seemingly overcome this problem by only including “nearest” bound counterions and water. For example, in the analysis of G-DNA quadruplexes with either two or three bound ions, the standard MM_PBSA energetic analysis, without the inclusion of ions, led to inconsistent results (R. Stefl, Cheatham, N. Spackova, and J. Sponer, work in progress). Specifically, it is not directly possible to compare an MD trajectory of four quartets in a G-DNA quadruplex with three bound ions to an MD trajectory run with two bound ions. To overcome this, we compared the three ion simulation to the two ion simulation where an additional ion, the closest one to the binding region, was also included in the free energy analysis, leading to surprisingly consistent free energies. Extending this idea to include the closest set of “bound” waters (for example those bound in the minor groove) also leads to better consistency in the calculated free energies.
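The averaging step behind this kind of estimate can be sketched as follows. This is a deliberately simplified illustration, not the MM_PBSA implementation itself: the per-snapshot molecular mechanical and continuum solvation energies are assumed to have been computed elsewhere, the solute entropy term is omitted, the function names are hypothetical, and the standard error assumes statistically independent snapshots (which correlated MD frames generally are not).

```python
import math
import statistics


def mean_and_stderr(values):
    """Mean and standard error of the mean for a list of per-snapshot energies."""
    mean = statistics.fmean(values)
    if len(values) < 2:
        return mean, 0.0
    return mean, statistics.stdev(values) / math.sqrt(len(values))


def relative_free_energy(state_a, state_b):
    """Estimate G(A) - G(B) from snapshots of two separately simulated states.

    Each state is a list of (E_MM, G_solv) pairs in kcal/mol, one pair per
    snapshot. Returns (delta_G, combined standard error).
    """
    g_a, err_a = mean_and_stderr([e_mm + g_solv for e_mm, g_solv in state_a])
    g_b, err_b = mean_and_stderr([e_mm + g_solv for e_mm, g_solv in state_b])
    return g_a - g_b, math.hypot(err_a, err_b)
```

Because the result is a small difference between large averages, the comparison is only meaningful when both states contain the same set of atoms, which is why the number of explicitly included ions and waters must match between the two trajectories.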

Periodicity Artifacts?

The application of Ewald methods is equivalent to fully treating a periodically replicated unit cell. Effectively this means that no cutoff is applied to the pair interactions and that atoms interact directly with their periodic images. This can lead to artifacts. Of course, even without the use of Ewald methods, such as when applying a finite cutoff, periodicity could still lead to artifacts since even though a given particle may not interact directly with its periodic image, it may interact indirectly with atoms that are influenced by the periodic image. To discuss the possible artifacts from periodicity on a primitive level, the inhibition of free rotation of a dipole in the periodic lattice and the complete screening of the attractive force between oppositely charged particles at half a box length separation, due to the interactions with the periodic images, are commonly presented. Fortunately, simulation suggests that these artifacts are small given sufficient solvation and a reasonably high dielectric medium.104–107 For example, free rotation of a dipole is not seriously inhibited105 and the potential of mean force for two like charged ions does not become attractive.108 On the other hand, if rather small periodic unit cells are applied (leading to insufficient screening by solvent between periodic images) these artifacts can become severe. For example, simulations of an α-helical peptide tended to overstabilize the helical structure in small simulation boxes.109

Previously, we investigated the effect of periodic box size (and other small changes to the methods including reduced pressure, reduced base stacking, various salt concentrations, and other subtle perturbations to the methods) in 1-ns length simulation110 and showed very little effect on the structure and dynamics. A similar study, showing similar results, was published by Norberto de Souza and Ornstein.111 A drawback of both of these studies is that the test simulations were rather short.

Note that problems due to periodicity are not related solely to indirect interaction with periodic images, but also to possible direct interaction of a molecule in the primary unit cell with its periodic image in a neighboring cell. This is shown schematically in Figure 9 for two cases. The first involves the common practice of placing a long DNA duplex along a longer length of an orthorhombic (rectangular) or hexagonal unit cell to minimize the necessary solvent. Unfortunately, since the rotational correlation times of small DNA duplexes (5–25 base pairs) are on the nanosecond time scale, it is expected that the DNA may rotate in the unit cell in nanosecond length simulation. This can lead to direct interaction of the periodic images, as we have seen previously in simulation of A-RNA. In this 5+ ns length MD simulation of RNA, the duplex rotated to interact with its periodic image, leading to distortion of the terminal base pairs (Cheatham, unpublished observations). Similarly, certain pressure coupling algorithms may scale box lengths independently, and this can lead to shrinkage of the box in one particular direction and hence to direct interaction with periodic images. These are important artifacts to avoid in simulation. We have overcome this issue in a number of ways. The most sensible is the use of periodic unit cells that have a shape that is closer to spherical. This includes both the truncated octahedral (14-sided, x = y = z, α = β = γ = 109.4712206344907°) and the rhombic dodecahedral (12-sided, x = y = z, α = 60°, β = 90°, γ = 60°) space filling unit cells. These allow free rotation of the DNA duplex and minimize potential interactions with periodic images. However, for very long DNA duplexes, this may be impractical. In that case, it is necessary to inhibit rotation, either through the judicious application of weak restraints (which may inhibit bending) or via periodic removal of the center of mass rotation of the DNA duplex (which may add a small and uncompensated net torque to the overall system). We have applied the latter method in a variety of simulations without apparent artifacts on the 5–25 ns time scale.1–3
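To make the solvent savings of the near-spherical cells concrete, the sketch below evaluates the standard triclinic cell volume formula for the two sets of cell angles quoted above. The comparison against a cube with the same edge length (and therefore the same nearest-image distance) is our own illustration, and the function names are hypothetical.

```python
import math

def triclinic_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a triclinic cell from edge lengths and cell angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)

# The truncated octahedral angle quoted in the text is arccos(-1/3).
TRUNC_OCT_ANGLE = math.degrees(math.acos(-1.0 / 3.0))

def relative_volumes(length=1.0):
    """Return (truncated octahedron, rhombic dodecahedron) volumes relative to a cube
    with the same edge length."""
    cube = triclinic_volume(length, length, length, 90.0, 90.0, 90.0)
    octa = triclinic_volume(length, length, length,
                            TRUNC_OCT_ANGLE, TRUNC_OCT_ANGLE, TRUNC_OCT_ANGLE)
    dodeca = triclinic_volume(length, length, length, 60.0, 90.0, 60.0)
    return octa / cube, dodeca / cube

if __name__ == "__main__":
    octa, dodeca = relative_volumes()
    print(f"truncated octahedron / cube: {octa:.4f}")
    print(f"rhombic dodecahedron / cube: {dodeca:.4f}")
```

Under these assumptions both cells enclose roughly three quarters of the cubic volume or less for the same image separation, which is the sense in which they minimize solvent while still allowing free rotation.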

Given the recent results that show possible artifacts related to artificial stabilization of helices in simulations with minimal solvent, we have since gone back and revisited this issue in DNA simulation. Specifically, this involved AMBER47 simulations of varying box sizes with the d[CGCGAATTCGCG]2 sequence in truncated octahedral unit cells with both the Cornell et al. force field and its newer (parm99) variant.67 The water coverage ranges from less than 5 Å of water surrounding the DNA initially up to ~20 Å of surrounding water. The simulations are denoted tiny (1733 waters, box length ~42.0 Å), small (2761 waters, box length ~48.8 Å), medium (5157 waters, box length ~59.5 Å), and large (11927 waters, box length ~78.2 Å). For reference, with the Cornell et al. force field, the length of the duplex is ~36 Å. Standard simulations applying the particle mesh Ewald method, with truncated octahedral boundary conditions, were run locally on a cluster of PCs at the University of Utah for approximately 9–12 months of real time. In general, the smaller simulations were performed on fewer, slower processors. The ratio of relative cost on a common platform was tiny:small:medium:large = 1.0:1.48:2.87:7.89. Rather long simulations were performed, with a total of 32+ ns on the tiniest system, 25+ ns on the small system, ~15 ns on the medium system, and ~15 ns on the largest system. Simulation details are discussed in the methods, and full analysis of the simulation results will be provided in a subsequent publication.
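As a rough consistency check, the relative timing ratio quoted above can be compared with the ratio of water counts. The sketch below uses only the numbers given in the text; the dictionary and function names are hypothetical, and treating water count as a proxy for system size ignores the DNA and ions.

```python
# Water counts and relative timings as quoted in the text.
WATERS = {"tiny": 1733, "small": 2761, "medium": 5157, "large": 11927}
RELATIVE_COST = {"tiny": 1.0, "small": 1.48, "medium": 2.87, "large": 7.89}

def cost_per_water(name):
    """Relative cost divided by the relative water count (both normalized to 'tiny')."""
    water_ratio = WATERS[name] / WATERS["tiny"]
    return RELATIVE_COST[name] / water_ratio

if __name__ == "__main__":
    for name in WATERS:
        print(f"{name:>6}: water ratio {WATERS[name] / WATERS['tiny']:.2f}, "
              f"cost per water {cost_per_water(name):.2f}")
```

Values near 1 indicate that cost grows approximately in proportion to the number of waters over this range, with somewhat worse than proportional scaling for the largest box.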

Similar to what was seen previously, few artifacts are observed in the simulations. A plot of the root-mean-squared deviations for each of the simulations is shown in Figure 10. A-DNA to B-DNA transitions were seen in each simulation on a similar time scale (see the inset to Figure 10), and the plots of the all-atom RMSD of the nucleic acid atoms to the NMR-derived experimental structure101 are all similar. In each case, considerable fluctuations are evident. From this analysis alone, it may be tempting to conclude that adding water beyond the tiny simulation is unnecessary, especially since this increases the cost by ~3–8 fold. For these properties, this may be true in general; however, it should be noted that in the tiny simulation, one of the terminal GC base pairs was significantly disrupted during the simulation. This is likely due to a periodicity artifact. The disruption was characterized by ion association at the terminal GC base pairs at both ends of the duplex (~20 and ~15%) and breaking of one of the terminal GC pairs, which maintained a Watson–Crick interaction for only 28% of the 30+ ns simulation. Interaction with the periodic images is possible since the terminal base pairs get within ~7 Å of periodic images of the opposite terminal bases.

Based on our study, and without providing the details in full, the take-home message is that it is recommended to use more than 7 Å of water surrounding a nucleic acid model initially (in each direction) in molecular dynamics simulation. This is solvation at the level of the “small” simulation, which is only slightly more computationally demanding than the tiny simulation.
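The recommendation above can be related to the quoted box lengths with a crude worst-case estimate: if the duplex is treated as a rigid rod of ~36 Å and happens to align along a lattice vector, the end-to-end gap to its own image is the box length minus the duplex length. This rigid-rod picture, the assumption that the quoted box length equals the nearest-image distance, and all names below are ours and are not part of the original analysis.

```python
# Values quoted in the text (Å).
DUPLEX_LENGTH = 36.0
BOX_LENGTHS = {"tiny": 42.0, "small": 48.8, "medium": 59.5, "large": 78.2}

def worst_case_gap(box_length, solute_length=DUPLEX_LENGTH):
    """Gap between a rod-like solute and its nearest image when the rod lies
    along the lattice vector (a deliberately pessimistic, rigid-body estimate)."""
    return box_length - solute_length

def at_risk(box_length, threshold=7.0):
    """Flag boxes whose worst-case gap is not larger than the ~7 Å contact
    distance mentioned in the text."""
    return worst_case_gap(box_length) <= threshold

if __name__ == "__main__":
    for name, length in BOX_LENGTHS.items():
        print(f"{name:>6}: gap {worst_case_gap(length):5.1f} Å, at risk: {at_risk(length)}")
```

With these assumptions only the tiny box falls at or below the ~7 Å separation at which terminal base pairs were reported to approach their images.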

Minimal Solvation Models

A significant problem with current MD simulation technologies applied to study the structure and dynamics of nucleic acids is that they are extremely computationally demanding. Even with computational speedups through the use of multiple time step integration algorithms or linear scaling methods for the treatment of the electrostatic interactions,112–114 the calculations are still rather expensive and limited to the 1 ns to 1 μs (at a stretch) time scale. Moreover, the cost goes up considerably with system size, given the significant regions of water in periodic boundary simulations. There are two ways around the computational limitations for treating large systems over long simulation times. The first is the use of implicit solvent models. Recently, there has been a resurgence in the use of generalized Born methodologies for treating the implicit solvation electrostatic interactions in nucleic acid simulation.79,115,116 This is discussed in greater detail in a review in this issue. An alternative and accurate implicit solvent model is the use of Poisson–Boltzmann methods in MD simulation117; however, due to the significant computational cost, these methods have seen little use in MD simulation. Instead, usage has typically been limited to investigating a series of model structures, or snapshots from an MD simulation.

FIGURE 10 Comparison of all (nucleic acid) atom RMSD (Å) vs time (ps) for simulations of d[CGCGAATTCGCG]2 starting from a canonical A-DNA geometry in truncated octahedral periodic unit cells of various sizes. The simulations are large (black; 11927 waters or box length of ~78.2 Å), medium (red; 5157 waters or box length of ~59.5 Å), small (blue; 2761 waters or box length of ~48.8 Å), and tiny (yellow; 1733 waters or box length of ~42.0 Å). Inset is the short time scale RMSD values to give an indication of the time scale for the A-DNA to B-DNA transition.

A potential issue with implicit solvent treatments is the absence of “structural” water. Considering the spine of hydration seen in the minor groove and the significant interaction of water with nucleic acids, some explicit water may be necessary. Toward this end, a variety of studies have investigated using only limited amounts of explicit water. Various methods have been applied, such as stochastic boundary conditions that avoid boundary artifacts by partially randomizing surface waters118 or more advanced methods that treat the region outside the explicit sphere with an implicit (continuum) solvent model. To date, the latter approach has seen only limited application in MD simulation of nucleic acids. One problem with these methods is that it is difficult to avoid water ordering at the surface of the droplet, and it is also unclear how to handle migration of explicit water into the continuum and continuum water into the explicit water. The surface ordering is troublesome due to the potentially high pressures at the center of the droplet; the estimated pressure for a droplet of radius R is ΔP ≈ 15,000 Å-atm/R.119 This could drastically inhibit conformational fluctuations.
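To give a feel for the magnitude of the droplet pressure estimate quoted above, the following sketch simply evaluates the 15,000 Å-atm/R relation for a few radii. The radii chosen are arbitrary examples, not values from any simulation discussed here, and the function name is hypothetical.

```python
def droplet_excess_pressure_atm(radius_angstrom, coefficient=15000.0):
    """Excess pressure (atm) at the center of a droplet of the given radius (Å),
    using the coefficient quoted in the text (Å-atm)."""
    if radius_angstrom <= 0.0:
        raise ValueError("radius must be positive")
    return coefficient / radius_angstrom

if __name__ == "__main__":
    for radius in (20.0, 30.0, 50.0):  # example radii only
        print(f"R = {radius:4.0f} Å -> excess pressure ~ "
              f"{droplet_excess_pressure_atm(radius):.0f} atm")
```

Even for a 50 Å droplet the estimate is in the hundreds of atmospheres, which is why the surface ordering is described as troublesome.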

A promising model for minimally including explicit solvent in MD simulation has been published in a series of papers by Mazur.80,120–122 In this model, ~5 Å of explicit water is added around the DNA, representing ~15–20 waters per base pair. The level of hydration is significantly lower than the expected hydration of DNA under normal hydrating conditions (~20 waters per nucleotide) and is more representative of very dehydrating conditions (~5–6 waters per nucleotide).123 In addition to minimal solvation, the phosphate charges are reduced by half and a distance-dependent dielectric function is applied to attenuate charges. Fast MD simulation was made possible through an internal coordinate treatment with rigid bases but flexible sugars, allowing long MD integration time steps (in the 5–8 fs range). In addition to rigidification of the bases, to use longer integration steps the method modifies some of the inertias by adding additional moments of inertia to hydrogen-only rigid bodies (of 4–9 amu-Å²). Application of this minimal solvation model leads to surprisingly stable MD simulation. Calculated RMSDs of the structures compared to experiment (of less than 2.0 Å) are significantly lower than those observed in simulations including more explicit water, periodic boundary conditions, and a particle mesh Ewald treatment (which tend to have RMSD values of ~2.5 Å or greater). In addition to B-DNA structures closer to experiment, significantly lower fluctuations in the helicoidal parameters (less than 10% of those seen in PME simulations) were observed. An issue with these simulations is that it is difficult to determine from the published reports why more stable and “reliable” simulation was observed. Is this due to a better representation of the underlying potential? Is it due to the application of the internal coordinate treatment with rigid bases and heavy hydrogens? Is it due to the minimal solvation model with few waters, reduced phosphate charges, and a distance-dependent dielectric? A worry is that the reduced fluctuation and good stability could be an artifact of the minimal solvation model. Previous studies of proteins with a sphere of explicit water and a distance-dependent dielectric treatment have shown that the conformational fluctuations are significantly damped in MD simulation.124 This leads to less effective sampling and a trapping of the structure in metastable states.

FIGURE 11 Distorted structure evident from simulations with minimal solvent and approximate electrostatic treatments. Shown is the straight coordinate averaged structure from a 2–6 ns MD simulation of d[CGCGAATTCGCG]2 starting from an A-DNA geometry in explicit solvent with 118 waters, the Cornell et al. force field, and no periodicity. The electrostatics were handled with a distance-dependent dielectric model with phosphate charges reduced by half. To prevent complete damping of the DNA motion, separate temperature coupling (300 K) was applied to the solvent and solute. Water was weakly restrained to remain within 50 Å with a 0.5 kcal/mol-Å² restraint.
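The combined effect of halving the phosphate charges and applying a distance-dependent dielectric can be illustrated with a single pair interaction. The sketch below assumes the simplest functional form, ε(r) = r with r in Å; the exact form used in the cited work is not stated here, so this is an assumption, as are the 7 Å example separation, the approximate Coulomb constant, and the function name.

```python
# Approximate Coulomb constant in kcal·Å/(mol·e²).
COULOMB_CONSTANT = 332.06

def coulomb_energy(q1, q2, r, distance_dependent=False):
    """Pair electrostatic energy (kcal/mol) for charges in units of e at separation r (Å).
    With distance_dependent=True, an assumed dielectric eps(r) = r is applied."""
    if r <= 0.0:
        raise ValueError("separation must be positive")
    eps = r if distance_dependent else 1.0
    return COULOMB_CONSTANT * q1 * q2 / (eps * r)

if __name__ == "__main__":
    r = 7.0  # example separation between two phosphate groups (Å)
    full = coulomb_energy(-1.0, -1.0, r)
    reduced = coulomb_energy(-0.5, -0.5, r, distance_dependent=True)
    print(f"full charges, eps = 1:   {full:6.2f} kcal/mol")
    print(f"half charges, eps = r:   {reduced:6.2f} kcal/mol")
```

Under these assumptions the phosphate repulsion drops by more than an order of magnitude, which is consistent with the concern that such a model may over-stabilize compact structures.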

To address this question, we have performed a series of simulations with the minimal hydration model (and the Cornell et al. force field) in a manner that is more directly comparable to our earlier simulation results that include a full solvent treatment. Specifically, we use the standard MD integrator (with 2 fs time steps) and an all-atom treatment, no cutoff on the pair interactions, and minimal explicit water with phosphate charges reduced by half and application of a distance-dependent dielectric. The sequence chosen for study was the standard d[CGCGAATTCGCG]2 benchmark sequence. Moreover, a canonical A-DNA geometry was used as the starting structure to see if spontaneous A-DNA to B-DNA transitions could be seen in nanosecond length simulations, consistent with our earlier work. By performing the simulations in this manner, we separate the rigid body integrator from the solvation model in order to better understand the stability of the Mazur minimal solvation model. Various levels of explicit hydration were tried, ranging from 118 or 134 waters (similar to the levels used in the Mazur simulations) to 200 and 500 explicit TIP3P waters. Each of the simulations was run for 5+ ns, where possible. To avoid problems with evaporating waters, a spherical restraint of 0.5 kcal/mol-Å² was applied at 50 Å (significantly outside the solvated DNA region). All of the simulations were rather stable, with very little conformational fluctuation seen. Spontaneous A-DNA to B-DNA transformations were not observed. In constant temperature simulations (at 300 K), it was noticed that most of the kinetic energy shifted into the water. This cold DNA, hot solvent problem further damped the effective motion. To avoid this, separate temperature coupling on the DNA and solvent was applied. This led to increased motion; however, the sampling was still considerably damped compared to simulations in full solvent.

FIGURE 12 Comparison of the RMSD values (Å) vs time (ps) for various simulations of d[CGCGAATTCGCG]2 starting from a canonical A-DNA geometry in explicit solvent. In explicit solvent, it is expected that DNA will undergo an A-DNA to B-DNA transition on a nanosecond time scale with the Cornell et al. force field. This is seen for a standard simulation (medium) in explicit solvent with PME and periodic boundary conditions (dashed black line). However, with a simplified minimal solvation model (with the net phosphate charge reduced by half and the application of a distance-dependent dielectric constant), this transition is not observed with varying amounts of water surrounding the DNA (in nanosecond length simulations).

FIGURE 13 Comparison of the atomic positional fluctuations from simulations of d[CGCGAATTCG]2 in explicit solvent. Shown are fluctuations from the last 5 ns of a standard simulation of the DNA in explicit solvent with periodic boundary conditions (black; simulation named medium) compared to the damped fluctuations seen in 6 nanosecond length simulations using no periodicity, a distance-dependent dielectric constant, and a phosphate charge reduced by half with 118 (red) or 500 (yellow) waters. In the minimal solvation calculations, a cap at 50 Å (0.5 kcal/mol-Å² force constant) was applied to prevent waters from drifting too far away, and separate temperature coupling on the solute and solvent was applied to prevent even further damped motion of the solute.
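The idea behind separate temperature coupling can be sketched with the weak-coupling (Berendsen) velocity scaling factor, λ = sqrt(1 + (Δt/τ)(T0/T − 1)), evaluated independently for each group so that a cold solute is heated and a hot solvent is cooled. This is a minimal illustration and not the AMBER implementation: the function names are hypothetical, constraints and center of mass degrees of freedom are ignored, masses are in amu, and velocities are in Å/ps.

```python
import math

KB = 0.0019872041                   # Boltzmann constant, kcal/(mol K)
AMU_A2_PS2_TO_KCAL = 1.0 / 418.4    # 1 amu Å²/ps² expressed in kcal/mol

def temperature(masses, velocities):
    """Instantaneous kinetic temperature (K), assuming 3N degrees of freedom."""
    kinetic = 0.5 * sum(m * (vx * vx + vy * vy + vz * vz)
                        for m, (vx, vy, vz) in zip(masses, velocities))
    return 2.0 * kinetic * AMU_A2_PS2_TO_KCAL / (3 * len(masses) * KB)

def berendsen_scale(masses, velocities, t0, dt, tau):
    """Scale one group's velocities toward t0 with the weak-coupling factor."""
    t_now = temperature(masses, velocities)
    lam = math.sqrt(1.0 + (dt / tau) * (t0 / t_now - 1.0))
    return [(lam * vx, lam * vy, lam * vz) for vx, vy, vz in velocities]

def couple_separately(groups, t0, dt, tau):
    """Apply the thermostat to each named group (e.g. solute, solvent) on its own."""
    return {name: berendsen_scale(m, v, t0, dt, tau) for name, (m, v) in groups.items()}
```

With a single thermostat on the whole system only the average temperature is controlled, so an imbalance between solute and solvent can persist; coupling each group to its own bath removes that freedom at the cost of a less physical ensemble.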

Shown in Figure 11 is the average structure from 2 to 6 ns for the simulation of A-DNA with 118 waters, dual temperature coupling, and a 50 Å cap. Although the base pairing is well maintained, the major groove has collapsed, leading to very close approach of the two strands in the bend across the major groove. This is characteristic of all of the minimal solvent simulations, with less collapse seen as more water is included. In Figure 12, the all-atom RMSDs to canonical A-DNA vs time for a series of minimally solvated simulations (with 118, 134, or 200 waters) are shown and compared to the “medium” PME fully solvated simulation discussed in the previous section. From this plot, it can be seen that not only is the RMSD lower (remaining close to A-DNA), but the fluctuations in the RMSD values seem considerably damped. To get a better handle on the damping of motion, Figure 13 compares atomic positional fluctuations of the “medium” simulation to minimally solvated simulations with 118 or 500 waters. The results above suggest that the minimally solvated model, applied as described above, significantly damps conformational fluctuations and traps the MD simulation in metastable states. Care should therefore be taken when applying the minimal solvation model in the manner used in this study.
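For readers unfamiliar with the quantity, atomic positional fluctuations of the kind discussed above are the root-mean-square displacement of each atom about its average position over a trajectory. The sketch below is a generic illustration and not the ptraj implementation; it assumes the frames have already been best-fit to a common reference, and the function name and data layout are hypothetical.

```python
import math

def positional_fluctuations(frames):
    """Per-atom RMS fluctuation about the mean position.

    frames: list of frames, each a list of (x, y, z) tuples, one per atom,
    assumed to be pre-aligned to a common reference structure.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for i in range(n_atoms):
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        mean_sq_dev = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3)) for frame in frames
        ) / n_frames
        fluctuations.append(math.sqrt(mean_sq_dev))
    return fluctuations
```

Because the result depends on the fitting procedure and on the length of the averaging window, fluctuations from runs of different length or alignment should be compared with caution.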

Conclusions and Perspective

We have seen tremendous progress in MD simulation applied to nucleic acids. However, before we capitalize on the advances and jump to conclusions regarding the reliability of the methods and protocols, we should be aware of significant limits in the effective sampling and in the reliability of the empirical force fields for nucleic acids. As shown above, ion interaction with DNA is a particular area where care should be exercised since, even in state-of-the-art simulation, conformational sampling limits are still apparent. Other issues that the community should be aware of include the potential for inhibited sampling in minimal solvent models, the need for proper energy conservation and removal of net translational motion in periodic systems,125,126 avoiding periodicity-induced artifacts in Ewald simulations, and biases of the available nucleic acid force fields. Further theoretical study of nucleic acids is necessary to better understand and overcome the limitations. This includes more detailed study of force fields, A–B-DNA conformational equilibria, the effect of specific ion and water association on DNA structure, DNA bending, and sequence-specific flexibility. In spite of the problems and limitations of the methods, the future holds considerable promise as computers get faster, methods and force fields get better, and we can more accurately represent the sequence-specific fine structure and dynamics of nucleic acids. Specific areas where application of MD simulations will show considerable promise in the near future include (1) detailed characterization of protein/nucleic acid interactions, including the role of interfacial water; (2) the characterization and design of selective DNA groove binders; (3) reliable estimations of the free energy cost of mismatches in DNA sequences and estimation of melting temperatures for duplex DNA; (4) detailed insight into RNA structure and dynamics, such as characterization of the structure, flexibility, and binding of tRNA and its codon/anticodon interactions; (5) further characterization of unusual structures and nucleic acid folding; and (6) the effect of specific water and ion association on the structure and dynamics of nucleic acids.

METHODS

A-Tract Bending and Ion Association

The simulations described in Figures 1–3 were described previously.1,2,38 The simulations of d[CGA4T4CGA4T4CG]2 (A4T4) and d[CGT4A4CGT4A4CG]2 (T4A4) referenced in Figure 4 were performed with AMBER 6.047 and prerelease versions of AMBER 7.0 with the Cornell et al. force field.46 The initial configurations were canonical B-DNA geometries based on the Arnott B-DNA fiber models.127 The models were built into a truncated octahedral unit cell with enough water added to leave at least a 5 Å distance from the edge of the box to the DNA. Net-neutralizing Na+ ions (with AMBER-adapted Aqvist parameters88) were then added by replacing the waters with the most favorable electrostatic interaction. After this, an extra 35 Na+ and 35 Cl− ions (with parameters from Ref. 128) were added. This leads to an excess Na+ concentration of ~175 mM. The A4T4 simulation contained 33885 atoms and the T4A4 simulation contained 33888 atoms. Equilibration was performed by first holding the DNA position fixed and minimizing the water and salt in 1000 steps (500 steps steepest descent and 500 steps conjugate gradient) followed by 100 ps of MD. Constant temperature (300 K) and pressure were applied with the Berendsen coupling129 with a coupling time of 1.0 and 0.5 ps⁻¹, respectively. SHAKE130 was applied to all hydrogen atoms (tolerance = 0.00001) and a 2 fs time step was used. The particle mesh Ewald method49 was used with a fast Fourier transform (FFT) grid of 80 by 80 by 80, a cutoff of 8.0 Å, cubic B-spline interpolation, and an Ewald coefficient of ~0.349. During the equilibration, the pair list was updated every 25 steps. During the production runs, the pair list (built with a 1.0 Å buffer) was updated heuristically and the center of mass translational motion was removed at every restart and every 10 ps during the dynamics. The simulations were run on the HP N-4000 complex at the University of Kentucky and the Origin 2000 and Compaq Sierra cluster computers at the University of Utah. All production simulations were performed without restraints.
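The Ewald coefficient and the direct-space cutoff are linked through the direct sum tolerance. A common convention, which we assume here rather than take from the text, is to choose the coefficient β so that erfc(β·r_c)/r_c equals a chosen tolerance; with an assumed tolerance of 1e-5 and the 8.0 Å cutoff quoted above, this gives a value close to the ~0.349 reported. The function name is hypothetical and the bisection is a generic root search, not a copy of any program's code.

```python
import math

def ewald_coefficient(cutoff, tolerance=1.0e-5):
    """Find beta (1/Å) such that erfc(beta * cutoff) / cutoff ~ tolerance, by bisection.

    The tolerance value is an assumption (a typical default), not a quantity
    stated in the text.
    """
    lo, hi = 0.0, 10.0  # erfc(0)/cutoff is too large; erfc(10*cutoff)/cutoff is ~0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if math.erfc(mid * cutoff) / cutoff > tolerance:
            lo = mid  # direct-space term not yet small enough: increase beta
        else:
            hi = mid
    return hi

if __name__ == "__main__":
    print(f"beta for 8.0 Å cutoff: {ewald_coefficient(8.0):.4f} 1/Å")
```

A larger cutoff with the same tolerance yields a smaller coefficient, shifting work from the reciprocal-space grid to the direct-space sum.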

Force Field Comparisons

For all the force field comparisons (Figures 5–8), at least 2.5 ns length simulations were performed with CHARMM (version c26n1 and c27n1)131 in a consistent manner. This involved constant temperature (300 K, mass = 1000)132 and pressure (1 atm, piston mass = 500 amu, relaxation time = 20 ps⁻¹),133 2 fs time steps with the application of SHAKE130 on hydrogen atoms, accurate use of the particle mesh Ewald method49 (~1.0 Å grid size with 6th order B-spline interpolation and an Ewald coefficient of 0.34) in rhombic dodecahedral unit cells (x = y = z, α = 60°, β = 90°, γ = 60°), and a heuristically updated atom-based pair list built to 12 Å and cut off at 10.0 Å with a smooth shift of the van der Waals energies. The Cornell et al. force field was converted by Cheatham (see http://www.chpc.utah.edu/~cheatham) and shown to give equivalent energies and forces (to 10⁻⁶) in comparison between AMBER and CHARMM. The MacKerell (all27) force field62 was kindly supplied by Alex MacKerell and is currently distributed with CHARMM. The BMS force field was kindly provided by David Langley.64 Canonical A-DNA and B-DNA models, based on the Arnott geometries,127 were used as the initial coordinates. All of the results discussed here were for B-DNA starting geometries and the following sequences: d[ATATATATAT]2, d[TATATATATA]2, and d[CGCGAATTCGCG]2. These were solvated with enough pre-equilibrated TIPSP131 water to add 12.0 Å to the maximal distance extent of the DNA. Net-neutralizing Na+ ions88 were placed off the phosphate oxygen bisector and then minimized (with larger, 5.0 Å, van der Waals radii) in vacuo prior to solvating the system. Equilibration involved the application of harmonic positional restraints (25.0 kcal/mol-Å²) and 250 steps of ABNR minimization, followed by 25 ps of MD where the temperature was ramped up from 50 to 300 K in 1 ps intervals. The initial equilibration was performed with the Cornell et al. force field. Subsequent equilibration, for the other force fields, involved 250 steps of ABNR minimization followed by 5 ps of MD with position restraints. All production simulations were performed without any restraints. The simulations were mostly completed on the NIH LoBoS Pentium cluster in 1997–1998. Currently they are being continued on a cluster of PCs (Icebox) at the University of Utah.

Periodicity Artifacts

Simulations of potential periodicity artifacts were performed for A-DNA models of d[CGCGAATTCGCG]2 with AMBER versions 6.0 and prerelease 7.0.47 Four sets of simulations were performed: “tiny” had 1733 waters with a box length of ~42.0 Å, “small” had 2761 waters and a box length of ~48.8 Å, “medium” had 5157 waters and a box length of ~59.5 Å, and the “large” simulation had 11927 waters and a box length of ~78.2 Å. All simulations were with the Cornell et al. force field starting from Arnott A-DNA models. Truncated octahedral unit cells were used with particle mesh Ewald49 to treat the electrostatic interactions. Run-time parameters similar to the above AMBER simulations were employed, except that a 2.0 Å buffer on the pair list was maintained and default PME parameters were applied (which leads to FFT grid sizes that are products of powers of 2, 3, and 5 close to the box size, cubic interpolation, and Ewald coefficients of ~0.34). Center of mass translational motion was removed at every restart or every 5 ps, and the SHAKE tolerance was 0.000001. Production simulations were run without restraints for 35+ ns for the tiny system, 25+ ns for the small system, and 15 ns each for the medium and large systems. All calculations were performed on the Icebox PC cluster at the University of Utah.
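The remark that default FFT grid sizes are products of powers of 2, 3, and 5 close to the box size can be illustrated as follows. The rounding rule used here (the smallest such integer not less than the box length in Å, giving a grid spacing of about 1 Å or finer) is our assumption, and the function names are hypothetical; actual programs may select the grid differently.

```python
import math

def is_235_smooth(n):
    """True if n has no prime factors other than 2, 3, and 5."""
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1

def fft_grid_size(box_length):
    """Smallest integer >= box_length (Å) whose only prime factors are 2, 3, and 5."""
    n = max(1, math.ceil(box_length))
    while not is_235_smooth(n):
        n += 1
    return n

if __name__ == "__main__":
    for name, length in (("tiny", 42.0), ("small", 48.8), ("medium", 59.5), ("large", 78.2)):
        print(f"{name:>6}: box {length:5.1f} Å -> grid {fft_grid_size(length)}")
```

Restricting grid dimensions to such sizes keeps the FFT efficient, at the price of a grid that is slightly finer than strictly requested.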

Minimal Solvent

The minimal solvation models were run with the Cornell et al. force field,46 but with the phosphate charges reduced by half and a distance-dependent dielectric constant. An equilibrated and solvated A-DNA starting structure was used to build the initial structures. The closest 118, 134, 200, or 500 waters to the DNA phosphates were saved and used as initial coordinates. The only equilibration was an initial 1000-step minimization. MD was performed with 2 fs time steps, SHAKE on the hydrogens, and constant (300 K) temperature with Berendsen coupling (time = 1 ps⁻¹) either on the whole system or separately on the solute and solvent. A small restraining force at 50 Å from the center of the DNA, with a force constant of 0.5 kcal/mol-Å², was applied to prevent waters from evaporating. No cutoff was applied to the pair interactions. All calculations were performed on the Icebox PC cluster at the University of Utah.
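The hydration levels listed above can be expressed per nucleotide and per base pair for comparison with the hydration ranges discussed in the main text. The sketch assumes only that d[CGCGAATTCGCG]2 contains 24 nucleotides (12 base pairs); the function name is hypothetical.

```python
# d[CGCGAATTCGCG]2: two 12-nucleotide strands.
N_NUCLEOTIDES = 24
N_BASE_PAIRS = N_NUCLEOTIDES // 2

def hydration_level(n_waters):
    """Return (waters per nucleotide, waters per base pair)."""
    return n_waters / N_NUCLEOTIDES, n_waters / N_BASE_PAIRS

if __name__ == "__main__":
    for n_waters in (118, 134, 200, 500):
        per_nt, per_bp = hydration_level(n_waters)
        print(f"{n_waters:4d} waters: {per_nt:5.1f} per nucleotide, {per_bp:5.1f} per base pair")
```

On this count the 118 and 134 water models sit near the very dehydrating regime, while only the 500 water model approaches the level described as normal hydration.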

Analysis

Much of the analysis was performed with a developmental version of ptraj developed by TEC and available from http://www.chpc.utah.edu/~cheatham/software.html. This includes analysis of ion and hydrogen-bond occupancies and lifetimes, imaging of solvent, calculation of RMSD values, fitting and straight coordinate averaging, and calculation of atomic positional fluctuations, among other tools. Helicoidal values were calculated with Curves and the Dials and Windows interface to Curves.37,134

Early access to the various force fields is greatly appreciated, for which TEC would like to sincerely thank David Langley (BMS) and Alex MacKerell, Jr. (all27). This work was partially supported by the National Computational Science Alliance under MCB000004N and utilized the University of Kentucky Exemplar (2000-2001). Access to computational resources at the NCI Advanced Biomedical Computing Center is also appreciated. In addition, TEC is grateful for an allocation of computer time from the Center for High Performance Computing (CHPC) at the University of Utah. This includes use of the Icebox PC cluster, CHPC’s SP system (which was funded in part by NSF grant no. CDA9601580 and IBM’s SUR grant to the University of Utah), and CHPC’s SGI Origin 2000 system (funded in part by the SGI Supercomputing Visualization Center Grant). TEC would also like to thank Bernie Brooks and Eric Billings for access to the LoBoS computational cluster in the Laboratory of Computational Biophysics, National Heart, Lung, and Blood Institute, National Institutes of Health. Molecular graphics images were produced using the MidasPlus program from the UCSF Computer Graphics Laboratory, UCSF (supported by NIH P41-RR01081).

MAY is supported by Fellowship DRG-1553 of the Cancer Research Fund of the Damon Runyon–Walter Winchell Foundation. We would like to thank D. L. Beveridge for useful discussions and advice.

REFERENCES

1. Sprous, D.; Young, M. A.; Beveridge, D. L. J Mol Biol 1999, 285, 1623–1632.

2. Young, M. A.; Jayaram, B.; Beveridge, D. L. J Am Chem Soc 1997, 119, 59–69.

3. Young, M. A.; Beveridge, D. L. J Mol Biol 1998, 281, 675–687.

4. Cheatham, T. E., III; Kollman, P. A. J Mol Biol 1996, 259(3), 434–444.

5. Cheatham, T. E., III; Crowley, M. F.; Fox, T.; Kollman, P. A. Proc Natl Acad Sci 1997, 94, 9626–9630.

6. Cheatham, T. E., III; Kollman, P. A. Structure 1997, 5, 1297–1311.

7. Sprous, D.; Young, M. A.; Beveridge, D. L. J Phys Chem B 1998, 102, 4658–4667.

8. Sen, S.; Nilsson, L. J Am Chem Soc 2001, 123, 7414–7422.

9. Cheatham, T. E., III; Kollman, P. A. Ann Rev Phys Chem 2000, 51, 435–471.

10. Beveridge, D. L.; McConnell, K. J. Curr Opin Struct Biol 2000, 10, 182–196.

11. Weerasinghe, S.; Smith, P. E.; Mohan, V.; Cheng, Y. K.; Pettitt, B. M. J Am Chem Soc 1995, 117(8), 2147–2158.

12. Luo, J.; Bruice, T. C. J Am Chem Soc 1998, 120, 1115–1123.

13. Shields, G. C.; Laughton, C. A.; Orozco, M. J Am Chem Soc 1997, 119, 7463–7469.

14. Spackova, N.; Berger, I.; Sponer, J. J Am Chem Soc 1999, 121, 5519–5534.

15. Spackova, N.; Berger, I.; Sponer, J. J Am Chem Soc 2001, 123, 3295–3307.

16. Spackova, N.; Berger, I.; Sponer, J. J Am Chem Soc 2000, 122, 7564–7572.

17. Cieplak, P.; Cheatham, T. E., III; Kollman, P. A. J Am Chem Soc 1997, 119, 6722–6730.

18. Soliva, R.; Sherer, E.; Luque, F. J.; Laughton, C. A.; Orozco, M. J Am Chem Soc 2000, 122, 5997–6008.

19. Sen, S.; Nilsson, L. J Am Chem Soc 1998, 120, 619–631.

20. Spector, T. I.; Cheatham, T. E., III; Kollman, P. A. J Am Chem Soc 1997, 119, 7095–7104.

21. Miaskiewicz, K.; Miller, J.; Cooney, M.; Osman, R. J Am Chem Soc 1996, 118, 9156–9163.

22. Yan, S.; Shapiro, R.; Geacintov, N. E.; Broyde, S. J Am Chem Soc 2001, 123, 7054–7066.

23. Auffinger, P.; Louise-May, S.; Westhof, E. Biophys J 1999, 76, 50–64.

24. Konerding, D. E.; Cheatham, T. E., III; Kollman, P. A.; James, T. L. J Biomol NMR 1999, 13, 119–131.

25. Strahan, G. D.; Keniry, M. A.; Shafer, R. H. Biophys J 1998, 75, 968–981.

26. Roll, C.; Ketterle, C.; Faibis, V.; Fazakerley, G. V.; Boulard, Y. Biochemistry 1998, 37, 4059–4070.

27. Manning, G. S. Quart Rev Biophys 1978, 11, 2.

28. Saenger, W. In Principles of Nucleic Acid Structure; Springer Advanced Texts in Chemistry; Cantor, C. E., Ed.; New York: Springer-Verlag, 1984; p 556.

29. Hud, N. V.; Polak, M. Curr Opin Struct Biol 2001, 11(3), 293–301.

30. Shui, X.; McFail-Isom, L.; Hu, G. G.; Dean Williams, L. Biochemistry 1998, 37, 8341–8355.

31. Hud, N. V.; Feigon, J. J Am Chem Soc 1997, 119, 5756–5757.

32. Hud, N. V.; Sklenar, V.; Feigon, J. J Mol Biol 1999, 286, 651–660.

33. Chiu, T. K.; Kaczor-Grzeskowiak, M.; Dickerson, R. E. J Mol Biol 1999, 292, 589–608.

34. Seibel, G. L.; Singh, U. C.; Kollman, P. A. Proc Natl Acad Sci 1985, 82(19), 6537–6540.

35. Hamelberg, D.; McFail-Isom, L.; Williams, L. D.; Wilson, W. D. J Am Chem Soc 2001, 122, 10513–10520.

36. Feig, M.; Pettitt, B. M. Biophys J 1999, 77(4), 1769–1781.

37. Ravishanker, G.; Auffinger, P.; Langley, D. R.; Jayaram, B.; Young, M. A.; Beveridge, D. L. Treatment of Counterions in Computer Simulations of DNA. In Reviews in Computational Chemistry; Boyd, D. B., Ed.; 1997; p 317–372; Jayaram, B.; Beveridge, D. L. Aqueous Solution: Theoretical and Computer Simulation Studies on the Ion Atmosphere of DNA. Ann Rev Biophys Biomol Struct 1996, 25, 367–394.

38. Young, M. A.; Ravishanker, G.; Beveridge, D. L. Biophys J 1997, 73, 2313–2336.

39. Denisov, V. P.; Halle, B. Proc Natl Acad Sci 2000, 97(2), 629–633.

40. McConnell, K. J.; Beveridge, D. L. J Mol Biol 2000, 304(5), 803–820.


41. Hamelberg, D.; Williams, L. D.; Wilson, W. D. J Am Chem Soc 2001, 123(32), 7745–7755.

42. Diekmann, S.; Wang, J. C. J Mol Biol 1985, 186(1), 1–11.

43. Marini, J. C.; Levene, S. D.; Crothers, D. M.; Englund, P. T. Proc Natl Acad Sci 1982, 79, 7664–7668.

44. Strahs, D.; Schlick, T. J Mol Biol 2000, 301(3), 643–663.

45. Qian, X.; Strahs, D.; Schlick, T. J Mol Biol 2001, 308(4), 681–703.

46. Cornell, W. D., et al. J Am Chem Soc 1995, 117(19), 5179–5197.

47. Pearlman, D. A., et al. Comp Phys Comm 1995, 91(1–3), 1–41.

48. Darden, T. A.; York, D. M.; Pedersen, L. G. J Chem Phys 1993, 98(12), 10089–10092.

49. Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. J Chem Phys 1995, 103(19), 8577–8593.

50. Cheatham, T. E., III; Miller, J. L.; Fox, T.; Darden, T. A.; Kollman, P. A. J Am Chem Soc 1995, 117(14), 4193–4194.

51. Young, M. A.; Srinivasan, J.; Goljer, I.; Kumar, S.; Beveridge, D. L.; Bolton, P. H. Methods Enzymol 1995, 261, 121–144.

52. Valleau, J. P.; Whittington, S. G. In Statistical Mechanics, A. A Modern Theoretical Chemistry; Berne, B. J., Ed.; Plenum Press: New York, 1977.

53. Cheatham, T. E., III; Brooks, B. R.; Kollman, P. A. In Current Protocols in Nucleic Acid Chemistry; Beaucage, S. L., et al., Eds.; Wiley: New York, 2001; p 7.9.1–7.9.21.

54. Teleman, O.; Wallqvist, A. Int J Quant Chem 1990, 24, 245–249.

55. Ivanov, V. I.; Minchenkova, L. E.; Schyolkina, A. K.; Poletayev, A. I. Biopolymers 1973, 12, 89–110.

56. Lindsay, S. M., et al. Biopolymers 1988, 27, 1015–1043.

57. Szabo, A.; Shi, B.; Lee, S. A.; Rupprecht, A. J Biomol Struct Dynam 1996, 13, 1029–1033.

58. Piskur, J.; Rupprecht, A. FEBS Lett 1995, 375, 174–178.

59. Yang, L. Q.; Pettitt, B. M. J Phys Chem 1996, 100(7), 2564–2566.

60. Mackerell, A. D.; Wiorkiewicz-Kuczera, J.; Karplus, M. J Am Chem Soc 1995, 117(48), 11946–11975.

61. Norberg, J.; Nilsson, L. J Chem Phys 1996, 104(15), 6052–6057.

62. Foloppe, N.; MacKerell, A. D. J. J Comp Chem 2000, 21, 86–104.

63. MacKerell, A. D. J.; Banavali, N. J Comp Chem 2000, 21, 105–120.

64. Langley, D. R. J Biomol Struct Dynam 1998, 16, 487–509.

65. Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. J Chem Phys 1983, 79, 926–935.

66. Berendsen, H. J. C.; Grigera, J. R.; Straatsma, T. P. J Phys Chem 1987, 91, 6269–6274.

67. Cheatham, T. E., III; Cieplak, P.; Kollman, P. A. J Biomol Struct Dynam 1999, 16, 845–862.

68. Srinivasan, J.; Cheatham, T. E., III; Cieplak, P.; Kollman, P. A.; Case, D. A. J Am Chem Soc 1998, 120, 9401–9409.

69. Stefl, R.; Spackova, N.; Berger, I.; Koca, J.; Sponer, J. Biophys J 2001, 80, 455–468.

70. Spackova, N.; Berger, I.; Egli, M.; Sponer, J. J Am Chem Soc 1998, 120, 6147–6151.

71. Cubero, E.; Sherer, E. C.; Luque, F. J.; Orozco, M.; Laughton, C. A. J Am Chem Soc 1999, 121, 8653–8654.

72. Cubero, E.; Laughton, C. A.; Luque, F. J.; Orozco, M. J Am Chem Soc 2000, 122, 6891–6899.

73. Guliaev, A. B.; Sagi, J.; Singer, B. Carcinogenesis 2000, 21, 1727–1736.

74. Hernandez, B.; Soliva, R.; Luque, F. J.; Orozco, M. Nucleic Acids Res 2000, 28, 4873–4883.

75. Dornberger, U.; Leijon, M.; Fritzsche, H. J Biol Chem 1999, 274, 6957–6962.

76. Chowdhury, S.; Bansal, M. J Phys Chem B 2001, 105, 7572–7578.

77. Simmerling, C.; Miller, J. L.; Kollman, P. A. J Am Chem Soc 1998, 120, 7149–7155.

78. Han, H.; Langley, D. R.; Rangan, A.; Hurley, L. H. J Am Chem Soc 2001, in press.

79. Tsui, V.; Case, D. A. J Am Chem Soc 2000, submitted.

80. Mazur, A. K. J Am Chem Soc 1998, 120, 10928–10937.

81. van Gunsteren, W. F.; Mark, A. E. J Chem Phys 1998, 108, 6109–6116.

82. Cheatham, T. E., III; Kollman, P. A. J Am Chem Soc 1997, 119, 4805–4825.

83. Miller, J. L.; Kollman, P. A. J Mol Biol 1997, 270(3), 436–450.

84. Miller, J. L.; Kollman, P. A. Biophys J 1997, 73, 2702–2710.

85. Feig, M.; Pettitt, B. M. Biophys J 1998, 75, 134–149.

86. Bevan, D. R.; Li, L.; Pedersen, L. G.; Darden, T. A. Biophys J 2000, 78, 668–682.

87. Tullius, T. D.; Burkhoff, A. M. In Structure and Expression; Olson, W. K., et al., Eds.; Adenine Press: Albany, NY, 1988; pp 77–85.

88. Aqvist, J. J Phys Chem 1990, 94, 8021–8024.

89. Darden, T.; Pearlman, D.; Pedersen, L. G. J Chem Phys 1998, 109, 10921–10935.

90. Dixon, S. L.; Merz, K. M., Jr. J Chem Phys 1997, 107(3), 879–893.

91. van der Vaart, A.; Suarez, D.; Merz, K. M. J Chem Phys 2001, 113, 10512–10523.

92. York, D. M.; Lee, T. S.; Yang, W. T. Phys Rev Lett 1998, 80, 5011–5014.

93. Khandogin, J.; Hu, A. G.; York, D. M. J Comp Chem 2000, 21, 1562–1571.

94. Cheatham, T. E., III; Srinivasan, J.; Case, D. A.; Kollman, P. A. J Biomol Struct Dynam 1998, 16, 265–280.

95. Srinivasan, J.; Miller, J. L.; Kollman, P. A.; Case, D. A. J Biomol Struct Dynam 1998, 16, 671–682.

96. Kollman, P. A., et al. Acc Chem Res 2000, 33, 889–897.

97. Tapia, O.; Velazquez, I. J Am Chem Soc 1997, 119, 5934–5938.


98. Ulyanov, N. B.; James, T. L. Methods Enzymol 1995, 261, 90–120.

99. Gorin, A. A.; Zhurkin, V. B.; Olson, W. K. J Mol Biol 1995, 247(1), 34–48.

100. Real, A. N.; Greenall, R. J. J Mol Model 2000, 6, 654–658.

101. Tjandra, N.; Tate, S.; Ono, A.; Kainosho, M.; Bax, A. J Am Chem Soc 2000, 122, 6190–6200.

102. Davies, D. B. Prog Nuclear Magn Res Spect 1978, 12, 135–225.

103. Jayaram, B.; Sprous, D.; Young, M. A.; Beveridge, D. L. J Am Chem Soc 1998, 120, 10629–10633.

104. Smith, P. E.; Pettitt, B. M. J Chem Phys 1996, 105(10), 4289–4293.

105. Smith, P. E.; Blatt, H. D.; Pettitt, B. M. J Phys Chem 1997, 101B(19), 3886–3890.

106. Hunenberger, P. H.; McCammon, J. A. Biophys Chem 1999, 78, 69–88.

107. Hunenberger, P. H.; McCammon, J. A. J Chem Phys 1999, 110, 1856–1872.

108. Bader, J. S.; Chandler, D. J Phys Chem 1992, 96, 6423–6427.

109. Weber, W.; Hunenberger, P. H.; McCammon, J. A. J Phys Chem B 2000, 104, 3668–3675.

110. Cheatham, T. E., III; Kollman, P. A. In Structure, Motion, Interactions and Expression of Biological Macromolecules; Sarma, M.; Sarma, R., Eds.; Adenine Press: Schenectady, NY, 1998; pp 99–116.

111. Norberto de Souza, O.; Ornstein, R. L. Biophys J 1997, 72, 2395–2397.

112. Greengard, L.; Rokhlin, V. Chem Scripta 1989, 29A, 139–144.

113. Amisaki, T. J Comp Chem 2000, 21, 1075–1087.

114. Sagui, C.; Darden, T. A. J Chem Phys 2001, 114, 6578–6591.

115. Williams, D. J.; Hall, K. B. Biophys J 1999, 76, 3192–3205.

116. Zacharias, M. Biophys J 2001, 80, 2350–2363.

117. Gilson, M. K.; Davis, M. E.; Luty, B. A.; McCammon, J. A. J Phys Chem 1993, 97, 3591–3600.

118. Norberg, J.; Nilsson, L. Proc Natl Acad Sci 1996, 93(19), 10173–10176.

119. Cheatham, T. E., III; Brooks, B. R. Theor Chem Acc 1998, 99, 279–288.

120. Mazur, A. K. J Am Chem Soc 2000, 122, 12778–12785.

121. Mazur, A. K. J Biomol Struct Dynam 2001, 18, 832–843.

122. Mazur, A. K. J Comp Chem 2001, 22, 457–467.

123. Falk, M.; Hartman, K. A.; Lord, R. C. J Am Chem Soc 1963, 85, 387–391.

124. Garemyr, R.; Elofsson, A. Proteins 1999, 37, 417–428.

125. Harvey, S. C.; Tan, R. K.-Z.; Cheatham, T. E., III. J Comp Chem 1998, 19, 726–740.

126. Chiu, S.-W.; Clark, M.; Subramaniam, S.; Jakobsson, E. J Comp Chem 2000, 21, 121–131.

127. Arnott, S.; Hukins, D. W. Biochem Biophys Res Commun 1972, 47(6), 1504–1509.

128. Smith, D. E.; Dang, L. X. J Chem Phys 1994, 100(5), 3757–3766.

129. Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; DiNola, A.; Haak, J. R. J Chem Phys 1984, 81, 3684–3690.

130. Ryckaert, J. P.; Ciccotti, G.; Berendsen, H. J. C. J Comp Phys 1977, 23, 327–341.

131. Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. J Comp Chem 1983, 4, 187–217.

132. Hoover, W. G. Phys Rev A 1985, 31, 1695–1697.

133. Feller, S. E.; Zhang, Y.; Pastor, R. W.; Brooks, B. R. J Chem Phys 1995, 103(11), 4613–4621.

134. Lavery, R.; Sklenar, H. J Biomol Struct Dynam 1988, 6(1), 63–91.

135. Stofer, E.; Lavery, R. Biopolymers 1994, 34, 337–346.

