Role of hydration in determining the structure and vibrational spectra of L-alanine and N-acetyl L-alanine N’-methylamide in aqueous solution: a combined theoretical and experimental approach
K. J. Jalkanen,∗,1 I. M. Degtyarenko,2 R. M. Nieminen,2
X. Cao,3 L. A. Nafie,3 F. Zhu,4 and L. D. Barron4
1 Nanochemistry Research Institute, Department of Applied Chemistry, Curtin University of Technology, GPO Box
U1987, Perth WA 6845, Australia, [email protected]; 2 Laboratory of Physics, Helsinki University of Technology, P.O.
Box 1100, FIN-02015 HUT, Finland; 3 Department of Chemistry, Syracuse University, Syracuse, New York, USA;
and 4 Department of Chemistry, Glasgow University, Glasgow G12 8QQ
Abstract
In this work we have utilized recent density functional theory Born-Oppenheimer molecular dynamics simulations to determine the first principles locations of the water molecules in the first solvation shell which are responsible for stabilizing the zwitterionic structure. Previous works have used chemical intuition or classical molecular dynamics simulations to position the water molecules. In addition, a complete shell of water molecules was not previously used, only the water molecules which were thought to be strongly interacting (H-bonded) with the zwitterionic species. In a previous work by Tajkhorshid et al. [1] the L-alanine zwitterion was stabilized by 4 water molecules, and in a subsequent work by Frimand et al. [2] the number was increased to 9 water molecules. Here we found that 20 water molecules are necessary to fully encapsulate the zwitterionic species when the molecule is embedded within a droplet of water, while 11 water molecules are necessary to encapsulate the polar region with the methyl group exposed to the surface, where it migrates during the MD simulation. Here we present our vibrational absorption, vibrational circular dichroism, Raman and Raman optical activity simulations, which we compare to the previous simulations and experimental results. In addition, we report new VA, VCD, Raman and ROA measurements for L-alanine in aqueous solution with the latest commercially available FTIR VA/VCD instrument (Biotools, Jupiter, FL, USA) and Raman/ROA instrument (Biotools). The signal-to-noise ratio of the spectra of L-alanine measured with these new instruments is significantly better than that of the previously reported spectra. Finally we reinvestigate the causes for the stability of the Pπ structure of the alanine dipeptide, also called N-acetyl-L-alanine N’-methylamide, in aqueous solution. Previously we utilized the B3LYP/6-31G* + Onsager continuum level of theory to investigate the stability of the NALANMA4WC (Han et al.) [3]. Here we use the B3PW91 and B3LYP hybrid exchange correlation functionals, the aug-cc-pVDZ basis set and the PCM and CPCM (COSMO) continuum solvent models, in addition to the Onsager and no continuum solvent model. By comparing the VA, VCD, Raman and ROA spectra we can confirm the stability of the NALANMA4WC due to the strong hydrogen bonding between the four water molecules and the peptide polar groups. Hence we advocate the use of explicit water molecules and continuum solvent treatment for all future spectral simulations of amino acids, peptides and proteins in aqueous solution, as even the structure (conformer) present cannot be found without this level of theory.
1 Introduction
Methods which can be used to determine the structure of biomolecules in native and non-native conditions are necessary to understand not only the interactions responsible for structural stability, but also to determine and understand their function. X-ray and neutron diffraction methods are the methods of choice for biomolecules which one can crystallize [4, 5]. The assumption is that the conditions one uses to get the molecules to crystallize do not affect the structure, that is, the structure of the molecule in the crystal is the same as that in its functionally active state. This may indeed be the case for large proteins, but it is not obviously the case for small flexible molecules [6].
Additionally, nuclear magnetic resonance (NMR) methods have recently been used to supplement the diffraction methods, though in general the method only works for biomolecules which are in the folded state. This is because the NMR methods are based on the nuclear Overhauser effect (NOE): NOE signals occur only when two protons are close in distance, since the signal strength depends on the distance to the negative 6th power. For extended or unfolded proteins, one normally does not have enough NOE distance constraints to uniquely determine the structure. Hence alternative methods are required.
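The steepness of this distance dependence can be illustrated with a short calculation. The sketch below assumes an idealized isolated proton pair whose NOE intensity scales exactly as the inverse sixth power of the distance; the function name and the 2.5 Å reference distance are illustrative choices and are not taken from this work.

```python
def relative_noe(r, r_ref=2.5):
    """Relative NOE intensity of a proton pair at distance r (angstrom).

    Assumes the idealized r**-6 scaling and normalizes to a reference
    pair at r_ref (an illustrative value, not taken from the paper).
    """
    if r <= 0 or r_ref <= 0:
        raise ValueError("distances must be positive")
    return (r_ref / r) ** 6


if __name__ == "__main__":
    for r in (2.5, 3.5, 5.0, 7.0):
        print(f"r = {r:3.1f} A   relative NOE = {relative_noe(r):.4f}")
```

Doubling the distance reduces the idealized signal by a factor of 64, which is why extended or unfolded chains provide few usable distance constraints.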
Additionally one would like to be able to follow the conformational changes in the structures as a function of solvent polarity, addition of denaturing agents, changes in pH and addition of ligands [7].
Normal and chiral infrared (vibrational), Raman and electronic spectroscopies are very structure sensitive probes, and many changes have been observed in these spectra as a function of the solvent polarity, pH, phase and temperature [8–19]. The problem to date has been the interpretation of the changes in the spectra. Finally some workers have attempted to understand the biological properties of L-alanine (LA) and N-acetyl L-alanine N’-methylamide (NALANMA) from gas phase or isolated state properties, but the species and conformational analysis has been shown to be fundamentally different [3, 20–23].
In addition to using spectroscopic measurements to monitor changes due to solvents, Baker and coworkers have also investigated the redox potentials [24]. The redox potentials are shown to be a function of the solvent polarity and hydrogen bonding ability. For biochemical catalysis and function, it is important to understand not only the structural and vibrational properties, but also the electronic properties: oxidation and reduction potentials and how they can be tuned by neighboring residues and the protein, solvent and/or membrane environment.
Vibrational spectroscopic measurements have additionally been shown to aid fold class assignment. Specifically the combination of vibrational absorption (VA), vibrational circular dichroism (VCD), Raman and Raman optical activity (ROA) with molecular dynamics simulations and density functional theory (DFT) calculations has been shown to be able to determine the backbone (secondary structure) of L-alanine, N-acetyl L-alanine N’-methylamide, L-alanyl L-alanine and Leu-enkephalin [1–3, 22, 23, 25–30] and the side chain conformation of L-histidine [31]. A preliminary study documenting the use of neural networks to predict the structure of peptides based on a combination of experimental and DFT simulated VA, VCD and Raman spectra has appeared [32, 33]. A very feasible extension of this work is to use the characteristic VA, VCD, Raman and ROA spectra of proteins of known fold class and combine them with the aforementioned work on sequences with known structures, to predict the fold class of an unknown protein not only from the sequence, but also from the measured VA, VCD, Raman and ROA spectra [34, 35].
To date the VA, VCD, Raman and ROA spectra are not understood to the extent that the combination can be used to determine the complete secondary structure of all residues and how these secondary structural elements fold up to form the tertiary structure. But many believe that the information is present in these spectra, and that it is just a matter of developing a transparent method which is capable of ’processing’ the spectra into backbone angles, then side chain angles and finally the packing/interface interactions which are responsible for tertiary structure formation and stability. Our preliminary work above shows that this work is well worth pursuing, and it is being pursued by us and other groups (the Keiderling group in Chicago, for example). An accompanying article in this special issue by Ramnarayan, Bohr and Jalkanen discusses this in more detail [36].
Experimental approaches to assigning the characteristic vibrational bands in the IR/VA spectra due to amino acid residues [37] and amide groups in peptides [38] have also been developed. Here one measures the spectra of amino acids and peptides and fits the spectra to determine the vibrational frequencies and dipole strengths (molar extinction coefficients). Subsequently these characteristic bands have been used to estimate the secondary structure content in proteins [39]. In addition to the zwitterionic species, the measurements have been made at various pH values, which allows one to measure the spectra of the cationic and anionic species also. These spectra are important if one is to subtract out the spectra of the side chains of proteins to determine the spectra due to the backbone, which determines the secondary structure of the protein.
The amide I region of the spectra has been used to determine the percentage of various secondary structural elements in the protein [40]. Since the amide I mode is largely composed of the C=O stretch, whose frequency is very sensitive to hydrogen bonding, this has been very useful to date. But doing so requires one to deconvolute the amide I region into its characteristic parts. Here one has measured the spectra of proteins with known secondary structure and then determined the principal component (characteristic) spectra. In addition to these experimental techniques, computational attempts have also been made to determine the molecular species which contribute to the infrared spectra of acids in aqueous solution [41].
A review article on hydrogen bonding has recently appeared which documents the breadth of the different experimental and theoretical techniques used to try to understand hydrogen bonding, in addition to the techniques which we advocate in this work [42]. Finally the work by Gorelsky and Solomon in this special issue gives an example of the changes in the electronic charge distribution due to oxidation [43, 44]. In the case of L-alanine, the zwitterionic species is stabilized by the neighboring solvent molecules, without which the negative electronic charge on the COO− group is not stable with respect to proton transfer from the positively charged NH3+ group. Hence it would be nice to see an extended charge decomposition analysis of the two species of L-alanine within the various models used here, that is, explicit solvent models, continuum solvent models and finally the hybrid where one combines the two models.
In addition to the VA, VCD, Raman and ROA spectra we foresee the electronic absorption (EA) and electronic circular dichroism (ECD) spectra being of use [40, 45, 46]. Recently the feasibility of the calculation of all of the aforementioned spectra has been shown, mostly in the gas phase, but also with continuum solvent models, with explicit water molecules and finally with combinations of these approaches [1, 3, 29]. If one can fully understand the changes which occur in the vibrational, electronic and NMR spectra of biomolecules under a variety of experimental conditions by modeling studies, then one can extract information from the spectra of proteins, small flexible ligands (drug molecules) and the changes which occur in both when combined [47–49]. To be able to model these changes one is required to treat not only the few strongly H-bonded water molecules, but also the complete solvent shell around the molecules, and how this shell is changed as the molecule moves from the polar to the nonpolar medium. Here we have initially treated the zwitterion in a droplet model, so the nonpolar medium is the vacuum around the droplet. In a future work we will look at a two phase system, for example, a hydrocarbon and water, but here we look at the simplest two phase model. The molecular dynamics simulations were used to generate the snapshots which were used for further analysis.
This work contrasts with the work done on using soft x-ray spectroscopy to understand the properties of water by studying ice [50]. In addition, to be able to compare the calculated electrical properties with the experimental values one must take into account the vibrational corrections [51, 52]. Hence vibrational analysis is important not only for structure, conformation and functional studies, but also for evaluating the theoretical methods being developed to determine the electrical and magnetic properties of materials, which have been shown to be dependent on the vibrational levels and populations.
The interpretation and use of the wealth of accumulated VA, VCD, Raman and ROA spectra for biomolecules has to date really been limited by the ability to perform high level ab initio and DFT spectral simulations where the effects of the environment have been properly and adequately taken into account [53–56]. In this work we document a methodology which should be used and which hopefully will open the doors for the use of these combined spectroscopies in many applications in molecular biology, molecular biophysics and physical biochemistry. The limitations which previously hindered the interpretation of the effects of the strongly interacting environment have to a large extent been overcome.
2 Methodology
2.1 Treatment of solvent
The treatment of the aqueous environment surrounding and responsible for stabilizing the zwitterionic species of L-alanine (LAZ) has been very interesting. Initially one can treat the environment by using only a continuum model, but this has been shown not to be adequate for strongly interacting H-bonding solvents like water [1, 3]. This has complicated the problem because one then has to choose the number of water molecules to include and their positions. Previous workers have used either chemical intuition or classical MD simulations to position the water molecules [1–3, 27, 57]. Here we have extracted them from density functional Born-Oppenheimer molecular dynamics simulations (DFT BOMD). In addition, there have been specifically parametrized molecular mechanics force fields for water, e.g. TIP5P [58], and also hybrid exchange correlation functionals specifically parametrized to reproduce the electronic properties of water [59].
The positions of the water molecules were extracted from the lowest energy structures from the DFT BOMD simulation for the L-alanine zwitterion (Degtyarenko et al. 2007, [60]). The structure from the molecular dynamics simulations was then used to prepare the initial structure of the L-alanine zwitterion with only a first hydration shell, by removing water molecules which are farther than 3.5 Å from any atom of the alanine molecule. All in all, only 20 water molecules were required to fully encapsulate (surround) the amino acid; this number is sufficient to complete the first hydration shell surrounding the L-alanine zwitterion. The network of 20 water molecules has explicit hydrogen bonding interactions with the ammonium and carboxylate sites of the L-alanine.
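The distance-based selection of the first hydration shell described above can be sketched as follows. This is a minimal illustration and not the script used in this work: the function name and data layout are hypothetical, distances are assumed to be in Å, a water molecule is kept if any of its atoms lies within the cutoff of any solute atom (one possible reading of the criterion), and periodic boundary conditions are ignored.

```python
import math


def first_shell_waters(solute_atoms, waters, cutoff=3.5):
    """Return the indices of water molecules in the first hydration shell.

    solute_atoms: list of (x, y, z) positions of the solute atoms.
    waters: list of water molecules, each a list of (x, y, z) positions.
    cutoff: distance threshold in the same units as the coordinates.

    Assumption: a water is kept when at least one of its atoms is within
    `cutoff` of at least one solute atom; no periodic images are considered.
    """
    kept = []
    for index, water in enumerate(waters):
        closest = min(math.dist(atom, s) for atom in water for s in solute_atoms)
        if closest <= cutoff:
            kept.append(index)
    return kept


if __name__ == "__main__":
    # Toy coordinates, not taken from any simulation.
    solute = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    near = [(0.0, 2.8, 0.0), (0.6, 3.4, 0.0), (-0.6, 3.4, 0.0)]
    far = [(8.0, 0.0, 0.0), (8.6, 0.6, 0.0), (8.6, -0.6, 0.0)]
    print(first_shell_waters(solute, [near, far]))
```

Applied to a snapshot of the solvated zwitterion, the returned indices identify the water molecules to retain in the cluster; all others would be discarded.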
This is the way to treat the first solvation shell of solvent molecules explicitly, rather than to use one of the many continuum solvent model treatments. Additionally we have embedded the L-alanine + 20 water molecule clusters within various continuum models: the Onsager model [61], the polarized continuum model (PCM) [62] and the conductor-like screening model (COSMO) [63–67]. These models can be used to model the effects due to the so-called bulk water molecules. Recent X-ray and neutron diffraction studies have shown that there are two kinds of solvent molecules, those near the surface of the biomolecules which have reduced mobility and density, and those in the bulk solvent [4, 68]. A few recent reviews on implicit solvent models have appeared, which give more detail than is possible in our work on vibrational spectroscopy [69–71]. In addition, one can utilize the so-called QM/MM methods [72]. Here one can treat the solute molecule, here LA or NALANMA, by quantum mechanical methods and the aqueous environment by classical molecular mechanics methods. Finally one can try to determine the positions and properties of the aqueous environment surrounding amino acids by combined empirical potential structure refinement (EPSR) fits to diffraction data [73]. This method is in many ways complementary to the methods which we have used.
2.2 Hessian and vibrational frequencies
To determine whether a structure is a local minimum, one normally calculates the second derivatives of the energy with respect to nuclear displacements, the so-called force constant matrix or Hessian. By diagonalizing the mass weighted Hessian, one gets the vibrational frequencies within the mechanical harmonic approximation. For a local minimum all of the vibrational eigenvalues should be positive, so that all frequencies are real. For a transition state exactly one eigenvalue should be negative, giving one imaginary frequency (often reported as a negative frequency).
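The diagonalization step and the classification of the stationary point can be sketched with NumPy. This is a minimal illustration under stated assumptions: the function names are hypothetical, the Hessian and masses are given in consistent (here arbitrary) units, and the toy system is a one-dimensional diatomic rather than any molecule treated in this work.

```python
import numpy as np


def harmonic_analysis(hessian, masses):
    """Diagonalize the mass weighted Hessian.

    hessian: (n, n) Cartesian force constant matrix.
    masses: length-n sequence, one mass per Cartesian coordinate.

    Returns (eigenvalues, S): the eigenvalues are squared angular
    frequencies in ascending order, and S = M^(-1/2) C maps normal
    coordinates to Cartesian displacements.
    """
    hessian = np.asarray(hessian, dtype=float)
    inv_sqrt_m = 1.0 / np.sqrt(np.asarray(masses, dtype=float))
    mass_weighted = hessian * np.outer(inv_sqrt_m, inv_sqrt_m)
    eigenvalues, c = np.linalg.eigh(mass_weighted)
    s = inv_sqrt_m[:, None] * c
    return eigenvalues, s


def classify(eigenvalues, tol=1e-8):
    """Classify a stationary point by the number of negative eigenvalues.

    Eigenvalues within `tol` of zero (translations, rotations, numerical
    noise) are not counted as negative.
    """
    n_negative = int(np.sum(np.asarray(eigenvalues) < -tol))
    if n_negative == 0:
        return "minimum"
    if n_negative == 1:
        return "transition state"
    return "higher-order saddle point"


if __name__ == "__main__":
    k = 1.0  # toy force constant
    eig, s = harmonic_analysis([[k, -k], [-k, k]], [1.0, 3.0])
    print(eig, classify(eig))
```

For the toy diatomic the nonzero eigenvalue equals k divided by the reduced mass, and the zero eigenvalue corresponds to the free translation of the pair.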
Initially the Hessians within DFT were calculated by finite difference methods [74]; subsequently analytical Hessians were implemented, which has increased the efficiency [75]. In addition, analytical gradients of the energy of an excited state within DFT have also been reported, allowing for the optimization of the geometry of the electronic excited state [76]. Initially one used Hartree-Fock (HF) Hessians to interpret the vibrational spectra of simple organic molecules, but subsequent work showed that one could improve the agreement between the calculated and experimental spectra by scaling the HF Hessians [77–80]. Here one used symmetric molecules for which the experimental spectra were assigned so that one could develop scale factors which could then be transferred to other molecules. This methodology has also been used in determining the so-called molecular mechanics force fields [81]. At the MP2 level of theory, and with DFT using hybrid and meta hybrid XC functionals, the accuracy has been sufficient that many groups no longer feel the need to scale the MP2 or DFT force fields. This is the approach we have taken in this work. The agreement could of course be improved by scaling the DFT force fields, but then one would lose the ability to evaluate the errors remaining in the difference between the calculated harmonic frequencies and intensities and the anharmonic frequencies and intensities one gets from the experimental spectra.
2.3 Vibrational absorption
To simulate the vibrational absorption (VA) spectra of a molecule one first requires an optimized geometry. Hence the LA + 20 water molecule cluster was optimized at the B3LYP/6-31G* level of theory. Additionally the LA + 20 water molecule + (Onsager or PCM or COSMO continuum model) clusters were also optimized at the B3LYP/6-31G* level of theory. At the optimized geometries, the Hessians and atomic polar tensors (APTs) were calculated. By diagonalizing the mass weighted Hessian, one obtains the harmonic frequencies and atomic displacements, the so-called mechanical harmonic approximation. If one also calculates the APTs, the derivatives of the electric dipole moment with respect to nuclear coordinate displacements, one can calculate the VA intensities within the so-called electric harmonic approximation. Here the VA spectrum is related to the dipole strengths of the harmonic vibrational transitions by [82, 83]:
\[ \epsilon(\nu) = \frac{8\pi^{3} N_A}{3000\,hc\,(2.303)} \sum_i \nu D_i f_i(\nu_i, \nu) \qquad (1) \]
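where $\epsilon$ is the molar extinction coefficient, $D_i$ is the dipole strength of the ith transition of wavenumber $\nu_i$ in cm$^{-1}$, $f_i(\nu_i, \nu)$ is a normalized line-shape function and $N_A$ is Avogadro’s number.
For a fundamental (0→1) transition involving the ith normal mode within the harmonic approximation
\[ D_i = \left(\frac{\hbar}{2\omega_i}\right) \sum_{\beta} \left[ \sum_{\lambda\alpha} S_{\lambda\alpha,i}\, \mu^{\lambda\alpha}_{\beta} \right] \left[ \sum_{\lambda'\alpha'} S_{\lambda'\alpha',i}\, \mu^{\lambda'\alpha'}_{\beta} \right] \qquad (2) \]
where $\hbar\omega_i$ is the energy of the ith normal mode, and the $S_{\lambda\alpha,i}$ matrix interrelates normal coordinates $Q_i$ to Cartesian displacement coordinates $X_{\lambda\alpha}$, where $\lambda$ specifies a nucleus and $\alpha = x, y$ or $z$:
\[ X_{\lambda\alpha} = \sum_i S_{\lambda\alpha,i}\, Q_i \qquad (3) \]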
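$\mu^{\lambda\alpha}_{\beta}$ ($\alpha, \beta = x, y, z$) are the APT of nucleus $\lambda$. The $\mu^{\lambda\alpha}_{\beta}$ are defined by
\[ \mu^{\lambda\alpha}_{\beta} = \left[ \frac{\partial}{\partial X_{\lambda\alpha}} \left\langle \psi_G(\vec{R}) \right| (\mu_{el})_{\beta} \left| \psi_G(\vec{R}) \right\rangle \right]_{\vec{R}_o} \qquad (4) \]
\[ \mu^{\lambda\alpha}_{\beta} = 2 \left\langle \left( \frac{\partial \psi_G(\vec{R})}{\partial X_{\lambda\alpha}} \right)_{\vec{R}_o} \right| (\mu^{e}_{el})_{\beta} \left| \psi_G(\vec{R}_o) \right\rangle + Z_{\lambda} e\, \delta_{\alpha\beta} \qquad (5) \]
where $\psi_G(\vec{R})$ is the electronic wavefunction of the ground state G, $\vec{R}$ specifies nuclear coordinates, $\vec{R}_o$ specifies the equilibrium geometry, $\vec{\mu}_{el}$ is the electric dipole moment operator, $\vec{\mu}^{e}_{el} = -e \sum_i \vec{r}_i$ is the electronic contribution to $\vec{\mu}_{el}$ and $Z_{\lambda} e$ is the charge on nucleus $\lambda$.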
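Equations (1) and (2) can be illustrated numerically once the S matrix and the APT are available. The following is a minimal sketch, not the implementation used in this work: the function names are hypothetical, the pair (λ, α) is flattened into a single Cartesian index k, ħ is set to 1, the constant prefactor of Eq. (1) is omitted, and a Lorentzian with an arbitrary 10 cm⁻¹ half width is assumed for the line shape.

```python
import math


def dipole_strength(s_column, apt, omega, hbar=1.0):
    """Eq. (2): D_i = (hbar / 2 omega_i) * sum_beta (sum_k S[k] * P[k][beta])**2.

    s_column: elements S_{k,i} of mode i, one per Cartesian coordinate k.
    apt: rows P[k] = [d mu_x / dX_k, d mu_y / dX_k, d mu_z / dX_k].
    omega: angular frequency of mode i (units chosen so that hbar = 1).
    """
    transition = [
        sum(s * row[beta] for s, row in zip(s_column, apt)) for beta in range(3)
    ]
    return hbar / (2.0 * omega) * sum(t * t for t in transition)


def lorentzian(nu, nu_i, hwhm):
    """Area-normalized Lorentzian line shape (an assumed choice of f)."""
    return (hwhm / math.pi) / ((nu - nu_i) ** 2 + hwhm ** 2)


def va_band_shape(nu, wavenumbers, strengths, hwhm=10.0):
    """Sum over modes of nu * D_i * f(nu_i, nu), as in Eq. (1).

    The constant prefactor of Eq. (1) is omitted, so the result is in
    arbitrary units.
    """
    return sum(
        nu * d * lorentzian(nu, nu_i, hwhm)
        for nu_i, d in zip(wavenumbers, strengths)
    )


if __name__ == "__main__":
    # Toy two-coordinate system, not data from this work.
    d = dipole_strength([0.5, -0.5], [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]], omega=2.0)
    print(d, va_band_shape(1000.0, [1000.0], [d]))
```

Evaluating `va_band_shape` on a grid of wavenumbers gives a simulated band envelope; absolute molar extinction coefficients would additionally require the prefactor of Eq. (1) and consistent units for the dipole strengths.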
The problem with the expression for the APT presented is that it involves the wavefunction and wavefunction derivatives (usually calculated with Coupled Perturbed Hartree-Fock theory) [84–88]. These equations extend the Roothaan-Hall equations to perturbations [89–91]. These expressions have been implemented in CADPAC to calculate the APT at the SCF and MP2 levels. The advantages of DFT are numerous, but the main one for us is the extension of rigorous methodology to the calculation of properties of large biological molecules. DFT seems to provide a way to do that. But the expressions must be reformulated as expressions amenable to DFT, that is, as energy derivatives. The APT have also been implemented at the DFT level in Gaussian 03, CADPAC and Dalton [92] using Gaussian orbitals and in the Amsterdam Density Functional (ADF) code using Slater orbitals [93–95].
Since in DFT one must do many numerical integrations, the advantage of using Gaussian orbitals is lost to some extent. Investigations of the performance of local, gradient-corrected and hybrid density functional models in predicting infrared/VA intensities have shown that the hybrid functionals are very accurate when the solvent effects are not important [96, 97]. To date there has not been a systematic investigation of the prediction of infrared/VA intensities in strongly hydrogen bonded solvents. The problem here has been that the solvent absorbs significantly and also that there are strong interactions between the solute and solvent. Hence the conclusions based on previous studies where the solvent effects were negligible are not necessarily applicable to the strongly hydrogen bonding systems. Here the use of explicit solvent molecules, continuum solvent models, hybrid models which combine the two aforementioned approaches or a molecular dynamics simulation approach are all methods which have been used in a limited number of cases, but no systematic study has been undertaken. Here we would recommend that the calculation of the vibrational frequencies and VA intensities be mandatory, as the amide I region of the spectra of proteins has been used to date without a complete understanding of the various contributions to the linewidths and intensities due to the specific secondary structural elements. This work documents part of that complexity.
For periodic systems the concept of atomic polar tensors of the molecular system embedded within a solid is more complex. Here one introduces the Born effective charge, Z*, that describes the polarization created by atomic displacements [98]:
\[ Z^{*}_{\kappa,\alpha\beta} = \Omega_o \left. \frac{\partial P_{\beta}}{\partial \tau_{\kappa\alpha}} \right|_{\varepsilon=0} \qquad (6) \]
2.4 Vibrational circular dichroism
If one wishes to simulate the vibrational circular dichroism (VCD) spectra in addition to the VA spectra, one is required to calculate the atomic axial tensors (AATs) [99–105]. The AAT are the derivatives of the magnetic dipole moment with respect to the nuclear velocities [106, 107]. Within the Born-Oppenheimer approximation [108], however, the electronic contribution to the AAT is zero [109]. Stephens and Buckingham have shown that the electronic contributions can be calculated by a variety of different formulations. Additionally, the problem of the gauge dependence of the tensors has been treated in a variety of ways: the so-called common origin gauge with very large basis sets, and the distributed origin gauge [101, 110], with which one can use the traditional basis sets. Another way to treat the gauge problem is to add magnetic field dependent orbitals, the so-called London atomic orbitals. Finally one can use molecular orbitals which depend on the magnetic field, the so-called LORG method of Bouman and Hansen [111]. All have been shown to be feasible and accurate if one uses large enough basis sets. If one is limited to relatively small basis sets, the use of gauge invariant atomic orbitals appears to be the method of choice [112–114]. The vibrational circular dichroism spectrum is related to the molecular rotational strengths via
\[ \Delta\epsilon(\nu) = \frac{32\pi^{3} N_A}{3000\,hc\,(2.303)} \sum_i \nu R_i f_i(\nu_i, \nu) \qquad (7) \]
where $\Delta\epsilon = \epsilon_L - \epsilon_R$ is the differential extinction coefficient, $R_i$ is the rotational strength of the ith transition of wavenumber $\nu_i$ in cm$^{-1}$, $f_i(\nu_i, \nu)$ is a normalized line-shape function and $N_A$ is Avogadro’s number.
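\[ R_i = \hbar^{2}\, \mathrm{Im} \sum_{\beta} \left[ \sum_{\lambda\alpha} S_{\lambda\alpha,i}\, \mu^{\lambda\alpha}_{\beta} \right] \left[ \sum_{\lambda'\alpha'} S_{\lambda'\alpha',i}\, m^{\lambda'\alpha'}_{\beta} \right] \qquad (8) \]
where $\hbar\omega_i$ is the energy of the ith normal mode, and the $S_{\lambda\alpha,i}$ matrix interrelates normal coordinates $Q_i$ to Cartesian displacement coordinates $X_{\lambda\alpha}$, where $\lambda$ specifies a nucleus and $\alpha = x, y$ or $z$; $X_{\lambda\alpha}$ is as previously defined. $\mu^{\lambda\alpha}_{\beta}$ and $m^{\lambda\alpha}_{\beta}$ ($\alpha, \beta = x, y, z$) are the APT and AAT of nucleus $\lambda$. $\mu^{\lambda\alpha}_{\beta}$ is as previously defined and $m^{\lambda\alpha}_{\beta}$ is given by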
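\[ m^{\lambda\alpha}_{\beta} = I^{\lambda\alpha}_{\beta} + \frac{i}{4\hbar c} \sum_{\gamma} \varepsilon_{\alpha\beta\gamma}\, R^{o}_{\lambda\gamma}\, (Z_{\lambda} e) \qquad (9) \]
\[ I^{\lambda\alpha}_{\beta} = \left\langle \left( \frac{\partial \psi_G(\vec{R})}{\partial X_{\lambda\alpha}} \right)_{\vec{R}_o} \middle| \left( \frac{\partial \psi_G(\vec{R}_o, B_{\beta})}{\partial B_{\beta}} \right)_{B_{\beta}=0} \right\rangle \qquad (10) \]
where $\psi_G(\vec{R}_o, B_{\beta})$ is the ground state electronic wavefunction at the equilibrium structure $\vec{R}_o$ in the presence of the perturbation $-(\mu^{e}_{mag})_{\beta} B_{\beta}$, where $\vec{\mu}^{e}_{mag}$ is the electronic contribution to the magnetic dipole moment operator. $m^{\lambda\alpha}_{\beta}$ is origin dependent. Its origin dependence is given by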
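\[ (m^{\lambda\alpha}_{\beta})^{0} = (m^{\lambda\alpha}_{\beta})^{0'} + \frac{i}{4\hbar c} \sum_{\gamma\delta} \varepsilon_{\beta\gamma\delta}\, Y^{\lambda}_{\gamma}\, \mu^{\lambda\delta}_{\alpha} \qquad (11) \]
where $\vec{Y}^{\lambda}$ is the vector from O to O′ for the tensor of nucleus $\lambda$. Equation (11) permits alternative gauges in the calculation of the set of $(m^{\lambda\alpha}_{\beta})^{0}$ tensors. If $\vec{Y}^{\lambda} = 0$, and hence O = O′, for all $\lambda$ the gauge is termed the Common Origin (CO) gauge. If $\vec{Y}^{\lambda} = \vec{R}^{o}_{\lambda}$, so that in the calculation of $(m^{\lambda\alpha}_{\beta})^{0}$ O′ is placed at the equilibrium position of nucleus $\lambda$, the gauge is termed the distributed origin (DO) gauge (Stephens 1987 [101]; Amos et al. 1988 [102]).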
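The contraction in Eq. (8) can be sketched in a few lines. This is a hedged illustration rather than the code used in this work: the function name is hypothetical, the pair (λ, α) is flattened into one Cartesian index k, ħ is set to 1, and the toy AAT elements are taken to be purely imaginary, which is the usual situation when the unperturbed wavefunction is real.

```python
def rotational_strength(s_column, apt, aat, hbar=1.0):
    """Eq. (8): R_i = hbar**2 * Im[ sum_beta E_beta * M_beta ], where
    E_beta = sum_k S[k] * apt[k][beta] and M_beta = sum_k S[k] * aat[k][beta].

    s_column: elements S_{k,i} of mode i, one per Cartesian coordinate k.
    apt: rows of three real numbers (atomic polar tensor elements).
    aat: rows of three complex numbers (atomic axial tensor elements).
    """
    total = 0j
    for beta in range(3):
        electric = sum(s * p[beta] for s, p in zip(s_column, apt))
        magnetic = sum(s * m[beta] for s, m in zip(s_column, aat))
        total += electric * magnetic
    return hbar ** 2 * total.imag


if __name__ == "__main__":
    # Toy single-coordinate tensors, not data from this work.
    apt = [[1.0, 0.0, 0.0]]
    aat = [[0.5j, 0.0, 0.0]]
    mirror_aat = [[-0.5j, 0.0, 0.0]]
    print(rotational_strength([1.0], apt, aat))
    print(rotational_strength([1.0], apt, mirror_aat))
```

Reversing the sign of the toy AAT reverses the sign of the rotational strength while leaving the dipole strength unchanged, which mirrors the fact that enantiomers give VCD bands of opposite sign and identical VA spectra.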
Density functional theory atomic axial tensors have also been implemented in Gaussian 94 (Cheeseman et al. 1996 [115]). We have utilized Gaussian 03 to calculate the B3LYP/6-31G* AAT in this work. Note that the AAT have been implemented within both the PCM and COSMO continuum models within Gaussian 03, but not yet with the simplest Onsager continuum model. The VCD simulations for the Onsager calculations utilize the AAT calculated without a continuum solvent treatment. The AAT have also recently been implemented in the ADF code, which utilizes Slater type orbitals rather than the Gaussian type orbitals used in Gaussian, CADPAC and Dalton [95].
2.5 Raman scattering
If one wishes to simulate the Raman scattering intensities, one is required to calculate the electric dipole-electric dipole polarizability derivatives. Here one can use either the static values, as has been done in most works to date, or one can take into account the frequency response. In most conventional Raman scattering experiments one uses either a 488 nm or 1064 nm source, the latter for molecules for which fluorescence is a problem.
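The Raman intensity is proportional to the Raman scattering activity, which is given for the jth normal mode $Q_j$ by:

$$ I^{\mathrm{Ram}}_j = g_j (45\alpha_j^2 + 7\beta_j^2), \tag{12} $$

where $g_j$ is the degeneracy of the jth transition, and $\alpha_j$ (the mean polarizability derivative) and $\beta_j^2$ (the anisotropy of the polarizability tensor derivative) are given by:

$$ \alpha_j^2 = \frac{1}{9} \left( S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xx} + S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{yy} + S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{zz} \right)^2 \tag{13} $$

and

$$ \begin{aligned} \beta_j^2 = \frac{1}{2} \Big[ & (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xx} - S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{yy})^2 + (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xx} - S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{zz})^2 + (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{yy} - S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{zz})^2 \\ & + 6 \big[ (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xy})^2 + (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{yz})^2 + (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xz})^2 \big] \Big], \end{aligned} \tag{14} $$

respectively. The $S_{\lambda\alpha,j}$ matrix relates the normal coordinates $Q_j$ to the Cartesian displacement coordinates $X_{\lambda\alpha}$, where λ specifies a nucleus and α = x, y or z (Diem 1993 [116]), as previously defined.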
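As a minimal sketch of how Eqs. (12)-(14) can be evaluated, the Python fragment below first contracts a set of Cartesian polarizability derivatives with one column of S and then forms the Raman scattering activity. The function names and the nested-list data layout are illustrative assumptions and do not correspond to any of the programs cited in this work. The anisotropy is written with the conventional prefactor of 1/2, the form that agrees with the invariant β(α)² of Eq. (31).

```python
def project_onto_mode(s_column, cartesian_derivs):
    """Contract S_{la,j} with alpha^{la}_{bg} for one normal mode j.

    s_column: list of 3N numbers (one column of the S matrix).
    cartesian_derivs: list of 3N 3x3 nested lists, one polarizability
    derivative tensor per Cartesian nuclear displacement.
    """
    mode_tensor = [[0.0] * 3 for _ in range(3)]
    for s, tensor in zip(s_column, cartesian_derivs):
        for b in range(3):
            for g in range(3):
                mode_tensor[b][g] += s * tensor[b][g]
    return mode_tensor


def raman_activity(alpha, degeneracy=1):
    """Raman scattering activity g * (45 * alpha^2 + 7 * beta^2), Eq. (12).

    alpha: symmetric 3x3 nested list, already projected onto the mode.
    """
    xx, yy, zz = alpha[0][0], alpha[1][1], alpha[2][2]
    xy, yz, xz = alpha[0][1], alpha[1][2], alpha[0][2]
    mean_sq = (xx + yy + zz) ** 2 / 9.0                      # Eq. (13)
    aniso_sq = 0.5 * ((xx - yy) ** 2 + (xx - zz) ** 2 + (yy - zz) ** 2
                      + 6.0 * (xy ** 2 + yz ** 2 + xz ** 2))  # Eq. (14)
    return degeneracy * (45.0 * mean_sq + 7.0 * aniso_sq)
```

A purely isotropic derivative tensor contributes only through the 45α² term, while a purely off-diagonal one contributes only through the 7β² term, which is a convenient sanity check for any implementation.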
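The normal mode frequencies are obtained from the eigenvalues Λ of the Hessian matrix H (the second derivative of the energy with respect to nuclear displacements, evaluated at the equilibrium geometry):

$$ H_{\lambda\alpha,\lambda'\alpha'} = \left( \frac{\partial^2 E(\vec{R})}{\partial X_{\lambda\alpha} \partial X_{\lambda'\alpha'}} \right)_{\vec{R} = \vec{R}_o} \tag{15} $$

and

$$ C^{-1} H C = \Lambda. \tag{16} $$

C is the eigenvector matrix, which defines the transformation matrix S, given by:

$$ S = M^{-1/2} C, \tag{17} $$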
where M is the mass matrix.

Therefore, through the matrix S (obtained by diagonalizing the Hessian matrix) and the polarizability derivatives (EDEDPD), $\alpha^{\lambda\alpha}_{\beta\gamma}$, one can obtain the Raman scattering activity $I^{\mathrm{Ram}}_j$ of any normal mode $Q_j$. The first efficient method for determining the EDEDPD was implemented utilizing finite field perturbation theory by Komornicki, 1979 [117]. Subsequently the analytical EDEDPD were derived and implemented in CADPAC [118, 119] and Gaussian [120] at the Hartree-Fock level of theory, and subsequently at the DFT level. The EDEDPD have also been implemented at the DFT level and used to simulate the Raman spectra without any solvent treatment [97, 121–125] and subsequently by treating the solvent with explicit solvent molecules, with continuum solvent models, and finally using hybrid models which combine the two [31, 46, 126]. In contrast to infrared/VA spectroscopy, where solvent subtraction is common, this is not always done with Raman measurements. Investigations of the performance of local, gradient-corrected and hybrid density functional models in predicting Raman intensities have shown that the hybrid functionals are very accurate when the solvent effects are not important [96, 97].

Hence the question of whether and how to perform a solvent subtraction for Raman measurements is one which we will address in this work.
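To make the role of Eqs. (15)-(17) concrete, the following sketch works through the smallest possible case, a one-dimensional diatomic with a harmonic bond. It is an illustration under stated assumptions: the function name is hypothetical, the 2x2 eigenproblem is solved in closed form rather than with a library routine, and the Hessian is mass-weighted before diagonalization (the usual practical route to S = M^{-1/2} C, which Eq. (16) leaves implicit).

```python
import math


def diatomic_modes(k, m1, m2):
    """Eigenvalues and S matrix for a 1D diatomic with V = (k/2)(x2 - x1)^2.

    Assumes k > 0 and positive masses. Returns (eigenvalues, S), where
    S[i][j] is the displacement of atom i in mode j.
    """
    hessian = [[k, -k], [-k, k]]                      # Eq. (15) for this model
    masses = [m1, m2]
    # Mass-weighted Hessian M^{-1/2} H M^{-1/2}
    hmw = [[hessian[i][j] / math.sqrt(masses[i] * masses[j])
            for j in range(2)] for i in range(2)]
    a, b, d = hmw[0][0], hmw[0][1], hmw[1][1]
    # Closed-form eigenvalues of a symmetric 2x2 matrix, in ascending order
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    eigenvalues = [mean - radius, mean + radius]
    # Eigenvector for eigenvalue lam is proportional to (b, lam - a)
    c_vectors = []
    for lam in eigenvalues:
        vx, vy = b, lam - a
        norm = math.hypot(vx, vy)
        c_vectors.append([vx / norm, vy / norm])
    # S = M^{-1/2} C, Eq. (17)
    s_matrix = [[c_vectors[j][i] / math.sqrt(masses[i]) for j in range(2)]
                for i in range(2)]
    return eigenvalues, s_matrix
```

The zero eigenvalue corresponds to translation (both atoms move in the same direction) and the non-zero eigenvalue, k(1/m1 + 1/m2), to the bond stretch (atoms move in opposite directions).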
2.6 Raman optical activity
Finally, if one wishes to simulate the Raman optical activity (ROA) intensities, one is required to calculate the electric dipole-magnetic dipole polarizability derivatives (EDMDPD) and the electric dipole-electric quadrupole polarizability derivatives (EDEQPD) [107, 127–130]. The calculation of the EDMDP, G′, has been implemented in CADPAC (Amos 1982 [131]). The EDMDP can be evaluated as the second derivative of the energy with respect to a static electric field and a time varying magnetic field. Hence the EDMDPD is a third derivative, the third derivative being with respect to the nuclear displacement.
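$$ G'^{\lambda\alpha}_{\beta\gamma} = \left( \frac{\partial^3 W_G(\vec{R}, F_\beta, B_\gamma)}{\partial X_{\lambda\alpha} \partial F_\beta \partial B_\gamma} \right)_{\vec{R} = \vec{R}_o,\, F_\beta = 0,\, B_\gamma = 0} = \left( \frac{\partial G'_{\beta\gamma}(\vec{R})}{\partial X_{\lambda\alpha}} \right)_{\vec{R} = \vec{R}_o} \tag{18} $$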
The electric dipole-electric dipole, electric dipole-magnetic dipole and electric dipole-electric quadrupole polarizability tensors can also be calculated at the frequency of the incident light using SCF linear response theory. London atomic orbitals have been employed, hence the results are gauge invariant (Helgaker et al. 1994 [132]). In this work, the electric dipole-magnetic dipole polarizabilities are gauge independent when using a finite basis set due to the use of gauge invariant atomic orbitals as implemented in Gaussian 03. Previously we used conventional basis sets which were not gauge invariant, similar to the work of Polavarapu and coworkers [23, 133–135] and our previous work on NALANMA [3], where that level of theory had been shown to give predicted ROA spectra in good agreement with the experimentally observed spectra. We are thus confident that the new level of theory can be used to answer the structural questions posed here, realizing that the agreement in absolute intensities could be improved by using much larger basis sets, as we shall do for the NALANMA4WC, but not the LA20WC.
The EDEQP, A, can be evaluated as the second derivative of the energy with respect to a static electric field and a static electric field gradient. Hence the EDEQPD is a third derivative, the third derivative being with respect to the nuclear displacement.
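$$ A^{\lambda\alpha}_{\beta,\gamma\delta} = \left( \frac{\partial^3 W_G(\vec{R}, F_\beta, F'_{\gamma\delta})}{\partial X_{\lambda\alpha} \partial F_\beta \partial F'_{\gamma\delta}} \right)_{\vec{R} = \vec{R}_o,\, F_\beta = 0,\, F'_{\gamma\delta} = 0} = \left( \frac{\partial A_{\beta,\gamma\delta}(\vec{R})}{\partial X_{\lambda\alpha}} \right)_{\vec{R} = \vec{R}_o} \tag{19} $$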
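The outermost derivative in Eqs. (18) and (19) is taken with respect to a nuclear displacement. One generic way to approximate such a derivative, shown here purely as an illustration and not necessarily the procedure used by the programs cited in this work, is a central finite difference of the tensor evaluated at two displaced geometries. In the sketch below `tensor_at` is a hypothetical callable standing in for an electronic structure calculation, and the step size is an arbitrary assumption.

```python
def tensor_derivative(tensor_at, step=1.0e-3):
    """Central-difference derivative of a 3x3 tensor property.

    tensor_at(dx) must return the property (a 3x3 nested list) with one
    Cartesian nuclear coordinate displaced by dx from equilibrium.
    """
    plus = tensor_at(+step)
    minus = tensor_at(-step)
    return [[(plus[b][g] - minus[b][g]) / (2.0 * step) for g in range(3)]
            for b in range(3)]


def model_tensor(dx):
    """Toy property: constant + linear + quadratic dependence on dx."""
    base = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
    slope = [[0.5, 0.1, 0.0], [0.1, -0.2, 0.0], [0.0, 0.0, 0.3]]
    return [[base[b][g] + dx * slope[b][g] + dx * dx * 4.0
             for g in range(3)] for b in range(3)]


derivative = tensor_derivative(model_tensor)
```

Because the central difference cancels even powers of the displacement, the quadratic term of the toy model does not contaminate the result and `derivative` reproduces the linear coefficients to rounding error.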
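The Cartesian polarizability derivatives are required to calculate the ROA spectra. The quantities required have been derived by Barron and Buckingham (Barron and Buckingham 1971 [129]) and are $\alpha_j G'_j$, $\gamma_j^2$ and $\delta_j^2$ (Barron 1982 [136]; Buckingham 1967 [127]; Barron, Bogaard, and Buckingham 1973 [137] and Hecht and Barron 1990 [138]). The quantities are calculated by combining the various polarizability derivatives with the normal mode vectors in the following equations. $\alpha_j G'_j$ is given by:

$$ \alpha_j G'_j = \frac{1}{9} \left( S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xx} + S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{yy} + S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{zz} \right) \left( S_{\lambda\alpha,j}G'^{\lambda\alpha}_{xx} + S_{\lambda\alpha,j}G'^{\lambda\alpha}_{yy} + S_{\lambda\alpha,j}G'^{\lambda\alpha}_{zz} \right). \tag{20} $$

$\gamma_j^2$ is given by:

$$ \begin{aligned} \gamma_j^2 = \frac{1}{2} \Big[ & (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xx} - S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{yy})(S_{\lambda\alpha,j}G'^{\lambda\alpha}_{xx} - S_{\lambda\alpha,j}G'^{\lambda\alpha}_{yy}) \\ & + (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xx} - S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{zz})(S_{\lambda\alpha,j}G'^{\lambda\alpha}_{xx} - S_{\lambda\alpha,j}G'^{\lambda\alpha}_{zz}) \\ & + (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{yy} - S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{zz})(S_{\lambda\alpha,j}G'^{\lambda\alpha}_{yy} - S_{\lambda\alpha,j}G'^{\lambda\alpha}_{zz}) \\ & + 3 \big[ (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xy})(S_{\lambda\alpha,j}G'^{\lambda\alpha}_{xy} + S_{\lambda\alpha,j}G'^{\lambda\alpha}_{yx}) \\ & \quad + (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{yz})(S_{\lambda\alpha,j}G'^{\lambda\alpha}_{yz} + S_{\lambda\alpha,j}G'^{\lambda\alpha}_{zy}) \\ & \quad + (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xz})(S_{\lambda\alpha,j}G'^{\lambda\alpha}_{xz} + S_{\lambda\alpha,j}G'^{\lambda\alpha}_{zx}) \big] \Big], \end{aligned} \tag{21} $$

and $\delta_j^2$ is given by:

$$ \begin{aligned} \delta_j^2 = \frac{\omega}{2} \Big[ & (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{yy} - S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xx}) S_{\lambda\alpha,j}A^{\lambda\alpha}_{z,xy} \\ & + (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xx} - S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{zz}) S_{\lambda\alpha,j}A^{\lambda\alpha}_{y,zx} \\ & + (S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{zz} - S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{yy}) S_{\lambda\alpha,j}A^{\lambda\alpha}_{x,yz} \\ & + S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xy} (S_{\lambda\alpha,j}A^{\lambda\alpha}_{y,yz} - S_{\lambda\alpha,j}A^{\lambda\alpha}_{z,yy} + S_{\lambda\alpha,j}A^{\lambda\alpha}_{z,xx} - S_{\lambda\alpha,j}A^{\lambda\alpha}_{x,xz}) \\ & + S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{xz} (S_{\lambda\alpha,j}A^{\lambda\alpha}_{y,zz} - S_{\lambda\alpha,j}A^{\lambda\alpha}_{z,zy} + S_{\lambda\alpha,j}A^{\lambda\alpha}_{x,xy} - S_{\lambda\alpha,j}A^{\lambda\alpha}_{y,xx}) \\ & + S_{\lambda\alpha,j}\alpha^{\lambda\alpha}_{yz} (S_{\lambda\alpha,j}A^{\lambda\alpha}_{z,zx} - S_{\lambda\alpha,j}A^{\lambda\alpha}_{x,zz} + S_{\lambda\alpha,j}A^{\lambda\alpha}_{x,yy} - S_{\lambda\alpha,j}A^{\lambda\alpha}_{y,yx}) \Big]. \end{aligned} \tag{22} $$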
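The component-by-component expressions of Eqs. (20)-(22) can be written more compactly as index contractions. The sketch below is an illustrative reformulation rather than code from any of the cited programs. It assumes that the tensors have already been projected onto the normal mode, that α is symmetric, and that A is symmetric in its last two indices; under those assumptions the contractions reduce term by term to the expanded forms given above. All function and argument names are hypothetical.

```python
def levi_civita(i, j, k):
    """Antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def roa_mode_invariants(alpha, g_prime, a_tensor, omega):
    """Return (alpha_j G'_j, gamma_j^2, delta_j^2) for one normal mode.

    alpha, g_prime: 3x3 nested lists; a_tensor: 3x3x3 nested list indexed
    as a_tensor[c][d][b] for A_{c,db}; omega: angular frequency factor.
    """
    idx = range(3)
    trace_alpha = sum(alpha[i][i] for i in idx)
    trace_g = sum(g_prime[i][i] for i in idx)
    alpha_g = trace_alpha * trace_g / 9.0                       # Eq. (20)
    contraction = sum(alpha[a][b] * g_prime[a][b] for a in idx for b in idx)
    gamma_sq = 0.5 * (3.0 * contraction - trace_alpha * trace_g)  # Eq. (21)
    delta_sq = 0.5 * omega * sum(
        alpha[a][b] * levi_civita(a, c, d) * a_tensor[c][d][b]
        for a in idx for b in idx for c in idx for d in idx)      # Eq. (22)
    return alpha_g, gamma_sq, delta_sq
```

Writing the invariants this way makes it straightforward to cross-check an implementation of the long expanded forms against a short independent one.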
The equations relating these tensor derivatives to the ROA spectra are given by Barron 1982 [136], 2004 [139], Hecht et al. 1989 [140] and Barron et al. 1994 [141]. These references provide a good introduction to the theory and application of ROA spectroscopy. Here we shall give the relevant equations for completeness. We follow very closely the notation of Barron from his Faraday Discussions review article (Barron et al. 1994 [141]).
The commonly reported quantity from ROA measurements is the dimensionless circular intensity differential (CID), defined as

$$ \Delta_\alpha = \frac{I^R_\alpha - I^L_\alpha}{I^R_\alpha + I^L_\alpha}, \tag{23} $$
where $I^R_\alpha$ and $I^L_\alpha$ are the scattered intensities with linear α-polarization in right- and left-circularly polarized incident light. In terms of the quantities EDEDPD, EDMDPD and EDEQPD, the CIDs for forward, backward, and polarized and depolarized right-angle scattering from an isotropic sample for incident laser light are (Barron et al. 1994 [141]):

$$ \Delta(\mathrm{forward}) = \frac{8[45\alpha G' + \beta(G')^2 - \beta(A)^2]}{2c[45\alpha^2 + 7\beta(\alpha)^2]} \tag{24} $$

$$ \Delta(\mathrm{backward})_{\mathrm{ICP}} = \frac{48[\beta(G')^2 + (1/3)\beta(A)^2]}{2c[45\alpha^2 + 7\beta(\alpha)^2]} \tag{25} $$

$$ \Delta(90^\circ, \mathrm{polarized}) = \frac{2[45\alpha G' + 7\beta(G')^2 + \beta(A)^2]}{c[45\alpha^2 + 7\beta(\alpha)^2]} \tag{26} $$

$$ \Delta(90^\circ, \mathrm{depolarized}) = \frac{12[\beta(G')^2 - (1/3)\beta(A)^2]}{6c\beta(\alpha)^2} \tag{27} $$

$$ \Delta(\mathrm{backward})_{\mathrm{DCP_I}} = \frac{48[\beta(G')^2 + (1/3)\beta(A)^2]}{2c[6\beta(\alpha)^2]}, \tag{28} $$
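where c is the speed of light and the isotropic invariants are defined as

$$ \alpha = (1/3)\alpha_{\alpha\alpha} \tag{29} $$

and

$$ G' = (1/3)G'_{\alpha\alpha}, \tag{30} $$

and the anisotropic invariants as

$$ \beta(\alpha)^2 = (1/2)(3\alpha_{\alpha\beta}\alpha_{\alpha\beta} - \alpha_{\alpha\alpha}\alpha_{\beta\beta}) \tag{31} $$

$$ \beta(G')^2 = (1/2)(3\alpha_{\alpha\beta}G'_{\alpha\beta} - \alpha_{\alpha\alpha}G'_{\beta\beta}) \tag{32} $$

$$ \beta(A)^2 = (1/2)\omega\alpha_{\alpha\beta}\varepsilon_{\alpha\gamma\delta}A_{\gamma,\delta\beta}. \tag{34} $$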
Using these expressions we can calculate the ROA spectra within the harmonic approximation for the vibrational frequencies and within the static field limit for the EDEDPD, EDMDPD and EDEQPD.
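As a hedged illustration of how the CID expressions of Eqs. (24)-(28) combine the invariants, the short function below evaluates all five scattering geometries from precomputed values of α², β(α)², αG′, β(G′)² and β(A)². The function name, the dictionary keys and the default value of c (the speed of light in atomic units) are assumptions made for this sketch; any consistent unit system could be substituted.

```python
SPEED_OF_LIGHT_AU = 137.035999084  # assumed unit choice: atomic units


def roa_cids(alpha_sq, beta_alpha_sq, alpha_g, beta_g_sq, beta_a_sq,
             c=SPEED_OF_LIGHT_AU):
    """Circular intensity differentials for five scattering geometries."""
    raman_total = 45.0 * alpha_sq + 7.0 * beta_alpha_sq
    return {
        # Eq. (24)
        "forward": 8.0 * (45.0 * alpha_g + beta_g_sq - beta_a_sq)
                   / (2.0 * c * raman_total),
        # Eq. (25)
        "backward_icp": 48.0 * (beta_g_sq + beta_a_sq / 3.0)
                        / (2.0 * c * raman_total),
        # Eq. (26)
        "right_angle_polarized": 2.0 * (45.0 * alpha_g + 7.0 * beta_g_sq
                                        + beta_a_sq) / (c * raman_total),
        # Eq. (27)
        "right_angle_depolarized": 12.0 * (beta_g_sq - beta_a_sq / 3.0)
                                   / (6.0 * c * beta_alpha_sq),
        # Eq. (28)
        "backward_dcpi": 48.0 * (beta_g_sq + beta_a_sq / 3.0)
                         / (2.0 * c * 6.0 * beta_alpha_sq),
    }
```

Note that the isotropic term αG′ enters only the forward and polarized right-angle CIDs, so backward scattering probes the anisotropic invariants alone.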
3 Results
3.1 Structures of the L-alanine zwitterion
In Figure 1 we show the optimized structure of the L-alanine + 20 water molecule cluster (LA20WC) as determined at the OPBE0/TZ2P + COSMO level of theory, calculated with the Amsterdam Density Functional (ADF) code [93, 95]. Additionally, the LA20WC and the LA20WC embedded within the Onsager, PCM and COSMO continuum models have been calculated at the B3LYP/6-31G* and B3PW91 levels of theory. In Table 1 the geometrical parameters for L-alanine are reported for the 8 new models for the B3LYP/6-31G* and B3PW91 hybrid exchange correlation (XC) functionals, as well as some of the previously reported values and the values reported from an experimental crystal structure determination. A comparison of the L-alanine zwitterion stabilized by interactions in the crystal with the LAZ stabilized by water molecules in molecular complexes with varying numbers of waters has also been discussed [6].
The initial model by Barron et al. [134] was at the Hartree-Fock level without any treatment of the aqueous environment. The second model, by Yu et al. [142], was again at the Hartree-Fock level but included the Onsager continuum model. The models of Tajkhorshid et al. [1] and Frimand et al. [2] both included explicit water molecules in addition to the Onsager continuum model, but the water molecules included were only those which were directly H-bonded with the ammonium and carboxylate groups. Here the water molecule layer completely encapsulates the L-alanine. Only the geometry of the L-alanine zwitterion is given in Table 1. The complete set of coordinates is available as supplementary material.

Figure 1. L-alanine + 20 water molecules at OPBE0/TZ2P + COSMO level of theory
3.2 VA spectra for L-alanine
In Figure 2 we present the simulated VA spectra at the B3LYP/6-31G* level of theory for the LA20WC, LA20WC plus Onsager continuum solvent model, LA20WC plus PCM continuum solvent model and finally the LA20WC plus the COSMO continuum solvent model. As one can see, the effects of including the various continuum solvent models for the treatment of the bulk waters are not all the same. The way to determine the optimum continuum solvent model is to compare to the experimental VA spectra. When one compares the four simulated spectra to the experimental one, there appears to be a systematic shift in the spectra. This can be accounted for by two effects: i) the harmonic approximation, which is known to overestimate the observed (anharmonic) frequencies, and ii) the use of a split valence basis set with polarization functions only on carbon, nitrogen and oxygen (6-31G*). We have previously shown that one can do much better by using the aug-cc-pVDZ and aug-cc-pVTZ basis sets for VA, VCD, Raman and ROA spectral simulations for phenyloxirane [97].
In addition, in this special issue the Keiderling group has utilized the 6-311++G** basis set with the B3LYP and B3PW91 hybrid exchange correlation functionals to simulate the VA spectra for some idealized turn structures [143]. Here they have not attempted to do a systematic search to determine the lowest energy conformer of the molecules studied, but rather have used the molecule in its ideal turn structures to try to identify the characteristic features in the VA (and VCD) spectrum which can be used in their works on proteins. The question of which interactions stabilize these structures in real proteins and make them the global minimum was not addressed and was thought not to be important for the VA and VCD spectra. But the effects of some of the environment were taken into account by adding some explicit water molecules. In many cases one compares to only the gas phase experiments and one does not then have to deal with the effects of the environment, since there is none [144].

Figure 2. Comparison of VA spectra for B3LYP/6-31G* LA20WC, LA20WC + Onsager, LA20WC + PCM and LA20WC + COSMO and experiment
Clearly these methods need to be extended to deal with the effects due to strongly interacting environments, for example, hydrogen bonding solvents like water. But then two problems appear: how to treat the hydrogen bonded molecules in the structure determination, and how to treat them in the spectral simulations. In many cases one measures a solvent spectrum and then presents the solvent-subtracted spectrum. For the simulations to be able to reproduce, interpret and understand these spectra, and the conformational and solvent fluctuations and changes which contribute to them, the simulations must be run long enough to deal with the sampling problems. In addition, there are inherent assumptions made by the experimentalists which can be checked and either confirmed or refuted. Hence there is clearly more work to do beyond the approach which we have taken in this work, and we look forward to presenting our contributions to this new approach in a future publication.
3.3 VCD spectra for L-alanine
In Figure 3 we present the simulated VCD spectra at the B3LYP/6-31G* level of theory for the LA20WC, LA20WC plus Onsager continuum solvent model, LA20WC plus PCM continuum solvent model and finally the LA20WC plus the COSMO continuum solvent model. Here we also see that the VCD spectra are affected by the various ways to treat the bulk water molecules. We shall once again use the comparison to the experimental VCD spectra as our method to determine which continuum solvent model is optimum for this purpose. As one can see in Figure 3, the agreement between the four theoretical VCD spectra and the experimental spectra is quite good, taking into account the aforementioned differences due to the harmonic approximation and the use of the 6-31G* basis set. This is to be expected since the number of water molecules has been increased to now fully encapsulate the LAZ. The role of the continuum solvent treatments is now to account for the effects due to the second solvation shell water molecules and the bulk. The initial question of fully encapsulating the LAZ has hence been addressed in this work.

Figure 3. Comparison of VCD spectra for B3LYP/6-31G* LA20WC, LA20WC + Onsager, LA20WC + PCM and LA20WC + COSMO and experiment
Previous experimental work on the VCD of the LAZ has also appeared both with dispersive VCD instrumentation [145–147] and with FTIR instrumentation [148, 149]. Depending on the region of interest, the advantages of each type of instrumentation are obvious, though all the commercial VCD instruments are based on FTIR technology. In this special issue Cao, Dukor and Nafie present a new method to reduce the effects of linear birefringence associated with sample cells, allowing for more accurate and precise VCD measurements [150].
3.4 Raman scattering spectra for L-alanine
In Figure 4 we present the simulated Raman scattering spectra at the B3LYP/6-31G* level of theory for the LA20WC, LA20WC plus Onsager continuum solvent model, LA20WC plus PCM continuum solvent model, the LA20WC plus the COSMO continuum solvent model and the experimental spectra. Here we also see that the Raman spectra are affected by the various ways to treat the bulk water molecules. We shall once again use the comparison to the experimental Raman spectra as our method to determine which continuum solvent model is optimum for this purpose. As one can see by comparing the four spectra to the experimental spectra, the simulated spectra all overestimate the band just above 1500 cm−1. The Onsager model appears to do the worst in the relative intensities between 1600 and 1800 cm−1. Below 1500 cm−1 all four spectra agree quite well with the experimental spectra. To really answer the question of which model is best, one should look at the ROA spectra simulations. The tensors for the ROA spectral simulations are the most time consuming and hence we do not yet have them for the highest level of theory, that is, using the aug-cc-pVDZ basis set for the LA20WC.

Figure 4. Comparison of Raman scattering spectra for B3LYP/6-31G* LA20WC, LA20WC + Onsager, LA20WC + PCM and LA20WC + COSMO and experiment. [Panel title: L-alanine Raman spectra, comparison of experimental and simulated spectra.]
3.5 Raman optical activity spectra for L-alanine
In Figure 5 we present the newly measured experimental Raman optical activity spectra. In this work we do not present the ROA spectral simulations with the 20 explicit waters and the three continuum models. Here we present the Raman and ROA spectra by using only the PCM continuum model, at the B3LYP/aug-cc-pVDZ level of theory. Surprisingly, when using the aug-cc-pVDZ basis set and the PCM continuum model, the Raman and ROA spectra are in quite good agreement with the experimental spectrum. Using the smaller 6-31G* basis set, the results were not as good, which motivated us to include the explicit solvent molecules and also embed the LAZ4WC and LAZ20WC within the continuum solvent models. The good agreement obtained with PCM alone may be fortuitous, and hence needs to be documented further. We leave a complete documentation of the PCM and COSMO continuum models for VA, VCD, Raman and ROA for both L-alanine and the alanine dipeptide for a future work, as well as the inclusion of explicit water molecules and the hybrid model.

Figure 5. Comparison of theoretical and experimental Raman and Raman optical activity spectra for LA. B3LYP/aug-cc-pVDZ + PCM level of theory. [Panel title: L-alanine zwitterion Raman and ROA spectra, comparison of theory (B3LYP/aug-cc-pVDZ + PCM) and experiment.]
To simulate the ROA spectra for the LA20WC requires 73 × 6 = 438 coupled perturbed Kohn-Sham solutions for the G′ (EDMDP) and A (EDEQP) tensors. By including the 20 water molecules, one has to perform 6 × 60 = 360 calculations of the EDMDP and EDEQP, which is quite time consuming. This is the additional cost beyond the 6 × 13 = 78 calculations of the EDMDP and EDEQP required for the L-alanine. In addition, one would like to see the effects of both the explicit water molecules and the combined model, that is, the 20 water molecules and the continuum model. Clearly, to investigate all of these effects systematically with respect to various basis sets and exchange-correlation functionals is beyond the scope of this work.
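The bookkeeping behind these counts is simple enough to check in a few lines. The sketch below only reproduces the arithmetic quoted in the text, taking the factor of six per atom as given; the constant and function names are illustrative.

```python
ATOMS_L_ALANINE = 13      # L-alanine zwitterion, C3H7NO2
ATOMS_PER_WATER = 3
N_WATERS = 20
FACTOR_PER_ATOM = 6       # factor of six per atom quoted in the text


def roa_tensor_calculations(n_atoms, factor_per_atom=FACTOR_PER_ATOM):
    """Number of EDMDP/EDEQP calculations for a system of n_atoms."""
    return n_atoms * factor_per_atom


solute = roa_tensor_calculations(ATOMS_L_ALANINE)
waters = roa_tensor_calculations(N_WATERS * ATOMS_PER_WATER)
cluster = roa_tensor_calculations(ATOMS_L_ALANINE + N_WATERS * ATOMS_PER_WATER)
print(solute, waters, cluster)
```

The explicit water shell therefore accounts for more than four fifths of the tensor calculations for the full cluster, which is why the cost grows so quickly with the number of waters.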
Hence we have simulated the ROA spectra using onlythe LA plus 4 water molecule model at this time.
3.6 Structure of N-acetyl L-alanine N’-methylamide
The reduced cost of the hybrid models is a big advantage, and for NALANMA, the conformer found to be present in aqueous solution is not even stable on the isolated state potential energy surface, nor at the Onsager continuum level. This is what motivated us to include the explicit water molecules. The optimized NALANMA4WC structure at the B3LYP/PCM level of theory is given in Figure 6.

Figure 6. N-acetyl L-alanine N’-methylamide plus 4 water molecules optimized structure at B3LYP/aug-cc-pVDZ + PCM continuum solvent model
3.7 Vibrational absorption and vibrational circular dichroism for NALANMA
The simulated VA and VCD spectra for the NALANMA plus four water molecule complex with and without the explicit water molecules included in the vibrational analysis are presented in Figure 7 at the B3LYP/aug-cc-pVDZ plus CPCM (COSMO) continuum model and in Figure 8 at the B3PW91/aug-cc-pVDZ plus CPCM continuum model. Here one can see which regions (modes) of the VA and VCD spectra are affected by the explicit water molecules. In addition, one can see the regions where there is overlap between water modes and those of NALANMA, as well as the regions in which the modes due to the hydrogen bonding interactions appear. For each water molecule interacting with NALANMA one introduces 9 internal coordinates, 3 due to the normal modes of the water molecule and 6 due to the relative motion of the NALANMA relative to the water molecule. If there were no interactions between NALANMA and the water molecule, these latter 6 modes would have zero frequency. Hence they give information on the interactions. The information is embedded in the frequencies and VA and VCD intensities.
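The count of nine additional coordinates per water follows directly from the 3N − 6 rule for a non-linear system, as the small check below illustrates. The helper names are hypothetical; the atom count of 22 for NALANMA (C6H12N2O2) is used only as an example input.

```python
def vibrational_modes(n_atoms):
    """Number of vibrational modes, 3N - 6, for a non-linear system."""
    return 3 * n_atoms - 6


def extra_modes_from_waters(n_solute_atoms, n_waters):
    """Additional modes when n_waters explicit waters join the solute."""
    cluster_atoms = n_solute_atoms + 3 * n_waters
    return vibrational_modes(cluster_atoms) - vibrational_modes(n_solute_atoms)


N_ATOMS_NALANMA = 22  # C6H12N2O2
extra = extra_modes_from_waters(N_ATOMS_NALANMA, 4)
internal_per_water = vibrational_modes(3)       # 3 internal water modes
intermolecular_per_water = 9 - internal_per_water
```

For the four-water complex this gives 36 additional modes, of which 12 are internal water vibrations and 24 are intermolecular modes that carry the information on the hydrogen bonding.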
To date this information has been underutilized both in force field parametrization and in testing various exchange correlation functionals. Here one can use this information to test how well various exchange correlation functionals agree with each other, but also with experimentally measured vibrational frequencies, VA and VCD intensities. In this work, we present our VA and VCD spectral simulations using the B3LYP and B3PW91 functionals, which have been seen to show very similar results for nonpolar molecules in nonpolar solvents. There has been to date very little systematic comparison for polar molecules in polar solvents.

Figure 7. Comparison of theoretical VA and VCD spectra for NALANMA. B3LYP/aug-cc-pVDZ plus CPCM level of theory with and without explicit water subtraction. [Panel title: N-acetyl L-alanine N’-methylamide plus 4 water molecules, B3LYP/aug-cc-pVDZ + CPCM continuum model; vertical axis: ∆ε × 10−4; legend: NALANMA4WC with water, NALANMA4WC without water.]

Figure 8. Comparison of theoretical VA and VCD spectra for NALANMA. B3PW91/aug-cc-pVDZ plus CPCM level of theory with and without explicit water subtraction. [Panel title: N-acetyl L-alanine N’-methylamide plus 4 water molecules, B3PW91/aug-cc-pVDZ + CPCM continuum model; vertical axis: ∆ε × 10−4; legend: NALANMA4WC with water, NALANMA4WC without water.]
3.8 Raman and Raman optical activity spectra for NALANMA
The Raman and ROA spectra along with the experimental spectra for NALANMA are presented in Figure 9. Clearly the inclusion of explicit water molecules is necessary for NALANMA. Here we have simulated the spectra using the hybrid model, using the explicit solvent model to treat the 4 water molecules which have been previously shown to be important to stabilize the Pπ conformer.
4 Discussion and Conclusions
As one can see from our new VA, VCD, and Raman spectral simulations, the choice of implicit continuum model, be it Onsager, PCM or COSMO, still has a relatively large effect on the VA, VCD and Raman spectral simulations, especially in the NH stretch region. Hence the best way to evaluate which model is best is to compare the spectral simulations to our newly reported experimental VA, VCD and Raman scattering spectra. This is also seen in Table 2, where one can see that the assignments of the bands in the NH and CH stretch region are not the same. This can also be seen in the NH bond lengths reported in Table 1.
In addition to being used for conformational analysis, VCD can be used to determine the absolute configuration of small chiral molecules [151–156], many of which are ligands and have been found to bind to proteins. Hence the changes in the VCD spectra can also be used to monitor changes in both conformation and absolute configuration. Some drug molecules have been shown to racemize in the body, and hence it is important to know if the drug racemizes before it reaches its target. A classic example is thalidomide, which was given to pregnant women in Europe, Canada and Australia and caused birth defects [157], but was not approved by the FDA in the USA due to concerns raised by Frances Kelsey at the FDA. The birth defects were thought to be due to the drug not being given in an enantiomerically pure form. Subsequently it has been hypothesized that even if it had been given in the enantiomerically pure form (only one enantiomer), it would still have caused problems due to it being racemized in the body.
In addition to using vibrational spectroscopy to monitor changes in conformation, hydrogen bonding and function, other techniques are also complementary, for example, NMR [158] and vibronic spectroscopy [159]. In addition to the analysis of the vibrational frequencies and VA, VCD, Raman and ROA intensities, one can analyze the so called potential energy distributions (PEDs), as we have done here, but one can also use a decomposition method developed by Hug and Fedorovsky which is presented in this special issue [160]. Here one is able to assign the experimental observables (frequencies and VA, VCD, Raman and ROA intensities) to specific functional groups in the molecule. This is very important for molecules with multiple chiral centers, for example, aframodial, which has four chiral centers [156]. The method of Hug and Fedorovsky complements the method of PEDs used by Jalkanen et al. in this special issue, where they document the use of various sets of internal coordinates not only for the assignment of the normal modes, but also in terms of scaling force fields and transferability [161]. The use of symmetric molecules to determine scale factors has been documented for both ethylene oxide [77] and 2-oxetanone [162].

Figure 9. Comparison of theoretical and experimental Raman and Raman optical activity spectra for NALANMA. B3LYP/aug-cc-pVDZ + PCM level of theory.

The question of how many solvent molecules are in the first hydration shell has been investigated. Here we have increased the number to 20 from our previous studies, where we used 4 and 9 respectively. Our definition is the number of water molecules which are necessary to completely encapsulate the LA zwitterion. An experimental determination of the hydration number of glycine has also recently been reported [163]. In that work calorimetric methods were utilized and the hydration number of glycine was determined to be 7 for the zwitterionic form. In addition to this calorimetric method, low temperature matrix isolation Fourier transform infrared techniques have also been used [164], in addition to ATR studies [165]. NMR studies of the hydration of biological molecules have also been undertaken [166].
The effects of hydration, other hydrogen bonding interactions and hydrophobic packing interactions have been observed to affect and distort the helix structures of proteins from their ideal $3.6_{13}$ and $3_{10}$ structures [167] and have also been shown to be important in protein folding, unfolding and refolding (after removal of denaturant, lowering of the temperature or removal of another perturbation which caused the unfolding or denaturing) [168]. As many molecular diseases are caused by misfolding of proteins, it is fundamental that we understand not only the effects which can cause misfolding or denaturing, if we wish to prevent these occurrences, but also the possible ways to refold, in the event that the process is reversible. The difference between irreversible and reversible misfolding and denaturing is not obvious and clear, and the class of proteins which can be reversibly refolded does not appear to be large. In many cases other proteins, the so called chaperone proteins, may be necessary to help the protein refold (and in some cases even to fold initially). In the case of damaged (denatured) DNA, the photolyase proteins have been shown to be able to initiate repair, but require blue light, in a so called photoinitiated repair process [169–173].
For the case of NALANMA the early vibrational analyses used HF/4-21G force fields which needed to be scaled [174, 175]. We extended this work by using the RHF/6-31G* [23] and subsequently B3LYP/6-31G* force fields. In our works we did not scale the force fields, as by doing so one masks the systematic errors and one then loses sight of both the errors and the approximations made. Subsequently the treatment of the molecule was also undertaken, initially at the scaled HF/4-21G level [176] and then subsequently by us at the B3LYP/6-31G* level of theory. The initial studies were important in documenting the structure and vibrational spectra of NALANMA in the isolated state, in nonpolar solvents, in inert gas matrices (so called matrix isolation studies) and finally in the crystal. When one measured the spectra of NALANMA in aqueous solution one noticed large changes in the spectra. The early works which did not include the explicit treatment of solvent [23] were not able to fully interpret the spectra in aqueous solution, but surprisingly the spectra of a transition state species gave the best agreement with the experimentally observed spectra. This motivated us to use explicit solvent models, implicit solvent models and finally a combination of the two, where we used an explicit solvent treatment to treat the water molecules which had specific and directionally oriented interactions which were responsible for stabilizing species not stable on the isolated state or gas phase potential energy surface (PES) [3, 22, 177] and an implicit water model to treat the effects due to the bulk water.
This NALANMA4WC was subsequently sent to NMR spectroscopists who were then able to interpret the NMR spectra of this molecule, which could not previously be interpreted using the conformers which are only stable on the isolated state PES [178, 179]. In this work, using the B3LYP/aug-cc-pVDZ and B3PW91/aug-cc-pVDZ levels of theory, we have further confirmed and established that NALANMA with four explicit water molecules (NALANMA4WC) is a stable complex which is the global minimum and that the VA, VCD, Raman and ROA spectra are in good agreement with those experimentally measured. In addition, we have embedded the NALANMA4WC within three continuum models, the Onsager, PCM and CPCM models, which treat the effects due to bulk water molecules. Hence there should no longer be any doubt that the combination of explicit and implicit solvent models using hybrid XC functionals in DFT with relatively large basis sets (aug-cc-pVDZ or larger) is needed to accurately treat the VA, VCD, Raman and ROA spectra.
By combining these latest theoretical and experimental methods one now has the opportunity to fully interpret and utilize the VA, VCD, Raman and ROA spectra of biological molecules in aqueous solution. The theory of VCD developed by Stephens [99, 101] and Buckingham [106] and the theory of ROA developed by Barron and Buckingham [129] and Nafie [180] for isolated molecules have now been extended to the condensed phase, namely aqueous solution, in addition to the previous isolated state, inert gas matrices and nonpolar solvents. With the commercial availability of Chiral FTIR and Chiral Raman instrumentation [151], as well as both commercial and academic codes which can simulate the measured spectra, the chiral vibrational spectroscopies have now become tools whose potential has to date been largely untapped. We hope this mini review and our updated original work on LA and NALANMA, which have been used to understand fully the structural changes in these molecules as one changes the phase/environment, will serve as a basis for other groups to use these techniques in their works in structural biology, physical biochemistry, molecular biophysics and quantum nanobiology. Like its complementary tools, x-ray and neutron diffraction and NMR, vibrational spectroscopy is once again taking its place as a very powerful tool in the nano-, quantum- and molecular-based sciences.
Acknowledgments
KJJ would like to acknowledge the Western Australia government’s Premier Fellowship program for providing financial support and the iVEC Supercomputer Centre of Western Australia and the APAC National Supercomputer in Canberra for providing computational facilities. KJJ would also like to thank Alexandra Wassileva for her work on the literature review of the experimental spectra on both L-alanine and NALANMA at the German Cancer Research Centre in Heidelberg, Germany.
| No. | Definition | Description* | No. | Definition | Description* |
|---|---|---|---|---|---|
| R19 | (1/√6)(2Δθ_{9,7,11} − Δθ_{9,7,10} − Δθ_{11,7,10}) | C7H9,10,11 asd | R55 | τ_{4,6} | C4-O6 tor |
| R20 | (1/√2)(Δθ_{9,7,10} − Δθ_{11,7,10}) | C7H9,10,11 asd | R56 | τ_{6,17} | O6-H17 tor |
| R21 | (1/√6)(2Δθ_{9,7,3} − Δθ_{11,7,3} − Δθ_{10,7,3}) | C7H9,10,11C7 r | R57 | τ_{17,18} | H17-O18 b |
| R22 | (1/√2)(Δθ_{11,7,3} − Δθ_{10,7,3}) | C7H10,11C3 r | R58 | Δr_{5,21} | O5...H21 s |
| R23 | (1/√6)(2Δθ_{8,3,2} − Δθ_{8,3,4} − Δθ_{8,3,7}) | H8-C3-N2 b | R59 | Δθ_{4,5,21} | C4-O5...H21 b |
| R24 | (1/√2)(Δθ_{8,3,4} − Δθ_{8,3,7}) | H8-C3-C4 b | R60 | Δθ_{5,21,20} | O5...H21-O20 b |
| R25 | (1/√18)(4Δθ_{4,3,2} + Δθ_{4,3,7} + Δθ_{2,3,7}) | O5-C4-C3 b | R61 | τ_{4,5} | C4-O5 tor |
| R26 | (1/√18)(4Δθ_{4,3,7} + Δθ_{4,3,2} + Δθ_{2,3,7}) | O6-C4-C3 b | R62 | τ_{5,21} | O5-H21 tor |
| R27 | (1/√18)(4Δθ_{2,3,7} + Δθ_{4,3,2} + Δθ_{4,3,7}) | C7-C3-N2 b | R63 | τ_{21,20} | H21-O20 tor |
| R28 | (1/√6)(2Δθ_{5,4,6} − Δθ_{3,4,5} − Δθ_{3,4,6}) | C7-C3-C4 b | R64 | Δr_{13,23} | H13...O23 s |
| R29 | (1/√2)(Δθ_{3,4,5} − Δθ_{3,4,6}) | N2-C3-C4 b | R65 | Δθ_{2,13,23} | N2-H13...O23 b |
| R30 | τ_{4,5} | C3-C4-O5-O6 opw | R66 | Δθ_{13,23,24} | H13...O23-H24 b |
| R31 | τ_{2,3} | H1-N2-C3-C4 tor | R67 | Δθ_{13,23,25} | H13...O23-H25 b |
| R32 | τ_{3,4} | N2-C3-C4-O6 tor | R68 | τ_{13,23} | H13-O23 tor |
| R33 | τ_{7,3} | H9-C7-C3-N2 tor | R69 | τ_{2,13} | H13-O23 tor |
| R34 | Δr_{14,15} | O14-H15 s | | | |

Atom numbering corresponds to those in Fig. 1 of reference [2].
* Classification of coordinates: tor = torsion, s = stretch, (a)sd = (a)symmetric deformation, r = rocking, b = bend, opw = out of plane wagging.

Table 2. Internal coordinates for the LA + 4H2O molecule for the F’ structure
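The prefactors 1/√2, 1/√6 and 1/√18 in Table 2 are the normalization constants of the linear combinations of angle displacements. As a quick numerical check, the following minimal Python sketch verifies this for the two C7H9,10,11 asymmetric-deformation combinations and for R25 to R27. The helper names (`scaled`, `dot`) and the ordering of the three angle displacements inside each coefficient vector are illustrative assumptions made here, not part of the original work.

```python
import math

def scaled(coeffs, norm_sq):
    """Divide integer coefficients by sqrt(norm_sq), mimicking the (1/sqrt(n)) prefactors."""
    return [c / math.sqrt(norm_sq) for c in coeffs]

def dot(u, v):
    """Plain Euclidean dot product of two coefficient vectors."""
    return sum(a * b for a, b in zip(u, v))

# Methyl asymmetric deformations; assumed basis order:
# (dtheta_9,7,11, dtheta_9,7,10, dtheta_11,7,10)
asd_a = scaled([2, -1, -1], 6)   # (1/sqrt6)(2a - b - c)
asd_b = scaled([0, 1, -1], 2)    # (1/sqrt2)(b - c)

# R25, R26, R27; assumed basis order:
# (dtheta_4,3,2, dtheta_4,3,7, dtheta_2,3,7)
r25 = scaled([4, 1, 1], 18)
r26 = scaled([1, 4, 1], 18)
r27 = scaled([1, 1, 4], 18)

for name, vec in [("asd_a", asd_a), ("asd_b", asd_b),
                  ("R25", r25), ("R26", r26), ("R27", r27)]:
    print(f"{name}: squared norm = {dot(vec, vec):.3f}")

print(f"asd_a . asd_b = {dot(asd_a, asd_b):.3f}")
print(f"R25 . R26     = {dot(r25, r26):.3f}")
```

Under these assumptions every combination has unit norm and the two deformation coordinates are orthogonal to each other, whereas the three 4:1:1 bending combinations overlap pairwise (dot product 0.5), so they are normalized but not mutually orthogonal.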
| F’ ν (cm⁻¹) | F’ PED | F”’ ν (cm⁻¹) | F”’ PED | Barron ν (cm⁻¹) | Barron PED | Yu ν (cm⁻¹) | Yu PED | I ν (cm⁻¹) | I PED | II ν (cm⁻¹) | II PED | Exptl ν (cm⁻¹) | Exptl Assgn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3164 | R5,6 | 3161 | R5,6 | 3809 | R3,1 | 3712 | R1 | 3266. | R2 | 3176. | R5,6 | 3080 | νa NH3+ |
| 3142 | R2,1 | 3138 | R6,4 | 3727 | R1,3 | 3689 | R3,2 | 3205. | R6,5 | 3151. | R6,4 | 3060 | νa NH3+ |
| 3137 | R1,3 | 3103 | R3 | 3334 | R5 | 3599 | R2,3,1 | 3176. | R5,4,6 | 3089. | R7 | 3020 | νs NH3+ |
| 3135 | R6,4 | 3086 | R7 | 3295 | R7 | 3326 | R5,6 | 3124. | R7 | 3080. | R4,6,5 | 3003 | νa CH3 |
| 3113 | R3,2,1,7 | 3066 | R4,6,5 | 3258 | R4,6 | 3292 | R6,4,7 | 3096. | R4,5,6 | 2988. | R2 | 2993 | νa CH3 |
| 3093 | R7 | 3026 | R2 | 3204 | R6,4 | 3272 | R7,6 | 2952. | R1 | 2956. | R3,1 | 2962 | ν C*H |
| 3064 | R4,6,5 | 2990 | R1 | 2813 | R2 | 3214 | R4,6,5 | 2831. | R3 | 2944. | R1,3 | 2949 | νs CH3 |
| 1759 | R14,15 | 1778 | R14 | 1996 | R8,9 | 1877 | R9,8 | 1758. | R15,17 | 1756. | R14,31 | 1645 | δa NH3+ |
| 1750 | R15,14 | 1766 | R15 | 1822 | R15 | 1836 | R14,15 | 1740. | R14,31 | 1753. | R15,17 | 1625 | δa NH3+ |
| 1695 | R8,9 | 1678 | R13,9,8 | 1790 | R14,13 | 1802 | R14,15 | 1646. | R9,8 | 1674. | R9,13,8 | 1607 | νa CO2− |
| 1635 | R13 | 1653 | R13,9,8 | 1641 | R19 | 1652 | R13 | 1617. | R13 | 1647. | R13,9 | 1498 | δs NH3+ |
| 1536 | R19,20 | 1536 | R19,20 | 1639 | R20 | 1642 | R19 | 1530. | R20,19 | 1539. | R19,20 | 1459 | δa CH3 |
| 1529 | R20,19 | 1529 | R20,19 | 1568 | R18 | 1636 | R20 | 1527. | R19,33,21 | 1519. | R19,33,21 | 1459 | δa CH3 |
| 1441 | R18 | 1456 | R18,9,11,8 | 1523 | R13,9,8,23 | 1578 | R18 | 1474. | R19,18 | 1466. | R19,23,18 | 1410 | νs CO2− |
| 1416 | R23,18,17 | 1422 | R18,23 | 1501 | R23,9 | 1542 | R18,8,9,11 | 1438. | R18,23 | 1438. | R18,23 | 1375 | δs CH3 |
| 1390 | R9,11,16 | 1394 | R23,8 | 1447 | R24,13,11 | 1502 | R23 | 1421. | R18,23,8 | 1415. | R8,17,24,23 | 1351 | δ C*H |
| 1350 | R24,9 | 1351 | R24,16 | 1395 | R13,9,14 | 1452 | R24 | 1375. | R17,24,19,8 | 1365. | R8,17,24 | 1301 | δ C*H |
| 1290 | R16,22,23,24 | 1274 | R16,22 | 1314 | R22,17,12,27 | 1317 | R22,16,27,12 | 1394. | R17,22,24,16 | 1315. | R22,17,16,24 | 1220 | ζ NH3+ |
| 1213 | R17,24,23 | 1207 | R17,24 | 1205 | R21,24,16 | 1222 | R21,24,17 | 1240. | R16,17,22,24 | 1251. | R16,17,24 | 1145 | ζ NH3+ |
| 1133 | R21,10,12 | 1143 | R21,10,12 | 1164 | R16,25,10,12 | 1186 | R10,12,17,21,16 | 1151. | R10,21,19 | 1156. | R10,21,19 | 1110 | νa CCN |
| 1062 | R22,16,12 | 1057 | R22,16,12 | 1080 | R16,12,21 | 1061 | R17,22 | 1067. | R22,12,16 | 1075. | R22,12 | 1001 | ν CC(O2) |
| 1028 | R12,21,10 | 1030 | R12,21,10 | 1060 | R17,22 | 1048 | R16,22 | 1042. | R21,12,33,10 | 1048. | R21,12,10 | 995 | ζ CH3 |
| 918 | R10,11,22 | 938 | R10,11,22 | 942 | R11,28,10 | 970 | R11,10,28 | 949. | R21,22,10,11 | 944. | R21,22,10,11 | 922 | ζ CH3 |
| 840 | R28,10,11 | 854 | R28,10,11,12 | 881 | R10,28 | 894 | R10,30,28 | 866. | R22,28,33 | 860. | R22,28,33,11 | 850 | νs CCN |
| 775 | R30,28,26 | 771 | R30,26 | 838 | R30,28 | 844 | R30,28 | 786. | R30 | 781. | R30,33 | 775 | γ CO2− |
| 635 | R31 | 704 | R31 | 681 | R25,11,29,30 | 640 | R11,28,25,30 | 694. | R33,31,21 | 712. | R31,22 | 640 | δ CO2− |
| 625 | R25,28,11,30 | 632 | R25,28,11,30,29 | 562 | R29,11,10 | 536 | R29,10,11 | 683. | R31,22,33 | 660. | R33,25,28,22 | 527 | ζ CO2− |
| 528 | R29,11,25,27,10 | 533 | R29,27,11,25 | 413 | R27 | 426 | R27,26 | 545. | R29,27,21,22 | 540. | R29,27,22,21 | 477 | τ NH3+ |
| 432 | R27 | 436 | R27 | 352 | R25,2,29 | 330 | R25,26 | 428. | R27,21 | 431. | R27,21,22,32 | 399 | δ skel |
| 355 | R25,29,26 | 363 | R25,29,26 | 303 | R31,15 | 275 | R26,29,25 | 353. | R33,22 | 346. | R33,22 | 296 | τ CH3 |
| 290 | R26,29 | 291 | R26,29,25 | 275 | R26,29,33 | 254 | R33,26 | 280. | R26,21,33,30 | 284. | R26,22,21,32 | 283 | δ skel |
| 264 | R33,26 | 261 | R33,26 | 255 | R33,26 | 138 | R31,16 | 206. | R33,22 | 248. | R33,22 | 219 | δ skel |
| 170 | R32 | 152 | R32 | 54 | R31,32 | 84 | R32,31 | 184. | R33,22 | 200. | R33,22 | 184 | τ CO2− |

F’: B3LYP/6-31G* + 4H2O + Onsager from [1]. F’ (PED): B3LYP/6-31G* + 4H2O + Onsager, this work.
F”’: B3LYP/6-31G* + 9H2O + Onsager from [2]. F”’ (PED): B3LYP/6-31G* + 9H2O + Onsager, this work.
Barron: RHF/6-31G* from [134]. Barron (PED): RHF/6-31G*, this work.
Yu’: RHF/6-31G* + Onsager from [142]. Yu’ (PED): RHF/6-31G* + Onsager, this work.
I: B3LYP/6-31G* + 20H2O + PCM. I (PED): B3LYP/6-31G* + 20H2O + PCM, this work.
II: B3LYP/6-31G* + 20H2O + COSMO. II (PED): B3LYP/6-31G* + 20H2O + COSMO, this work.
Exptl: LA in H2O/solid from [116,134,142,182,183]. Exptl Assignment based on group frequencies and isotopic data [182].

Table 3. Assignments based on Potential Energy Distributions for LA in H2O.
References
[1] Tajkhorshid, E, Jalkanen, KJ, Suhai, S (1998) J Phys Chem B 102:5899
[8] Roberts, Gull-Maj Lilian (1990) Ph.D. thesis, The City University of New York (Hunter College), New York, New York
[9] Roberts, G-M L, Diem, M (1987) In Schmid, E D, Schneider, F W, Siebert F, eds., Spectroscopy of Biological Molecules New Advances, 77–79. European Conference of the Spectroscopy of Biological Molecules, John Wiley and Sons Ltd., New York, New York
[10] Madison, V, Kopple, KD (1980) J Am Chem Soc 102:4855
[11] Maxfield, FR, Leach, SJ, Stimson, ER, Powers, SP, Scheraga, HA (1979) Biopolymers 18:2507
[12] Grenie, Y, Avignon, M, Garrigou-Lagrange, C (1975) J Mol Struct 24:293
[13] Mattice, WL (1974) Biopolymers 13:169
[14] Avignon, M, Garrigou-Lagrange, C, Bothorel, P (1973) Biopolymers 12:1651
[15] Cann, JR (1972) Biochemistry 11:2654
[16] Koyama, Y, Shimanouchi, T, Sato, M, Tatsuno, T (1971) Biopolymers 10:1059
[17] Avignon, M, Huong, PV (1970) Biopolymers 9:427
[18] Avignon, M, Huong, PV, Lascombe, J, Marraud, M,
T, Kurosaki, K, Mataga, N, Souda, R (1952) J Am Chem Soc 74:4639
[20] Blanco, S, Lesarri, A, Lopez, JC, Alonso, JL (2004) J Am Chem Soc 126:11675
[21] Jalkanen, K J, Bohr, H G, Suhai, S (1997) In Suhai S, ed., Proceedings of the International Symposium on Theoretical and Computational Genome Research, 255–277. Plenum Press, New York, Spring Street, New York.