USING CIRCULAR DICHROISM TO DETERMINE HOW FLUORINATION AFFECTS PEPTIDE FOLDING
Benjamin J. Levin
April 23rd, 2013
This thesis has been read and approved by Professor Neil Marsh.
Signed: ____________________________ Date: ___/___/___
Faculty advisor e-mail: [email protected] Phone: (734) 763-6096
human growth hormone (191 res.) are just a few well studied examples (1). More recently, there
has been substantial interest in antimicrobial peptides (13, 14). These are evolutionarily ancient
peptides that target infective agents without harming animal or plant cells in physiologically
relevant concentrations (13). Although their structural diversity is impressive, these peptides all
seem to differentiate between bacterial cells and eukaryotic cells via charge interactions on the
cell membranes (15). The outer surface of bacterial membranes is composed of lipids with
negatively charged phospholipid headgroups, while in eukaryotes the outer surface is typically
more neutral with the negative groups on the internal side of the membrane. Thus it seems that
the peptides interact specifically with the negatively charged membranes and then kill the
microbes through a variety of different pathways. How each peptide kills pathogens has been the
subject of much debate (15). An understanding of these mechanisms could lead to a new class of
antibiotics, especially since bacteria do not appear to develop resistance to these
peptides in a physiologically relevant setting or timescale (16).
In order to better study the structure of peptides, the element fluorine has been utilized
extensively (17). Fluorine is a fascinating element with a number of unusual properties,
particularly in organic compounds. For instance, it exhibits the aptly named “fluorous effect”:
perfluorohexane is miscible with neither hexane nor water, and a mixture of the three separates
into three phases, indicating that the compound is neither truly hydrophobic nor hydrophilic (18,
19). Even more strangely, 1,2-difluoroethane is the canonical example of the gauche effect (20).
Steric considerations can explain why the anti conformation of butane is about 0.9 kcal/mol more
stable than the gauche conformation, but the opposite holds for 1,2-difluoroethane: the gauche form is
more stable by about 0.6 kcal/mol (20). This is due to hyperconjugation (21). That is, the σC-H
bonding orbital interacts favorably with the σ*C-F orbital that is anti to it; equivalently, the HOMO of the
carbon-hydrogen bond interacts with the low-lying LUMO of the anti carbon-fluorine bond.
Although fluorine does have some unusual features, it has several that could make it a powerful
tool in the study of proteins. The van der Waals radius of fluorine, about 1.4 Å, is similar to that
of hydrogen (~1.2 Å), and so the two are relatively isosteric and can be substituted for each other
while only minimally perturbing the structure (22). The fluorine-19 nucleus has a spin of ½, is
83% as sensitive as 1H, and is 100% abundant (23). This has made 19F-NMR a powerful
tool in vitro and, because fluorine is almost absent from biological compounds (see ref. 24 for
exceptions), in vivo 19F-NMR has become quite popular as well (18).
The incorporation of fluorine into peptides can be done in a number of ways. Solid phase
synthesis with fluorine-containing amino acid analogues is the most flexible tool for short
peptides of fewer than 50 residues (18). Another method requires the use of an aminoacyl-tRNA
synthetase that can recognize fluorinated substrates. Although the latter method is less developed
currently, it has the potential to allow the site-specific incorporation of fluorinated amino acids
into large proteins (25). After incorporation, the analysis of fluorine’s local chemical
environment can be done with NMR, as there is no background signal. Just one example is the
determination of how fluorinated analogs of antimicrobial peptides interact with bacterial cell
membranes (26). By examining how the chemical shift and width of the peaks change over time,
insight into the peptide’s mechanism can be gained.
In addition to their usefulness as probes, fluorinated analogs of proteins can have a more
stable folded form when packed appropriately. When coiled-coil proteins were analyzed and the
interior hydrophobic leucine residues were replaced with trifluoroleucine, the melting
temperature and the midpoint concentration for urea increased, indicating increased resistance to
thermal and chemical denaturation (27). In addition, proteins fluorinated in specific locations are
known to become more resistant to proteolysis (28). Increased resistance to proteolysis is
medically relevant because peptide drugs are typically metabolized too rapidly to be useful, so it
could lead to peptide pharmaceuticals with a longer half-life. The basis for this is
straightforward. The hydrogen-to-fluorine substitution alters the structure just enough that
proteolytic enzymes cannot conform to fit around that segment of the protein, but not enough to
dramatically decrease the activity of the peptide (28).
The increased resistance to thermal and chemical denaturation has been harder to explain
(29). Methods to measure this increase in stability most commonly use the difference in melting
temperature or free energy of unfolding, but these cannot give a clear structural basis for the
cause. Complicating matters further is enthalpy-entropy compensation, which makes the
thermodynamic data even tougher to interpret without structural support (30). Several different
theories have been proposed. Some rely simply on the increased size and hydrophobicity of
fluorine (28). Because fluorocarbon chains are even more hydrophobic than hydrocarbons, they
will be relatively more stable in a folded protein where they can be put into a water-excluded
cavity. However, fluorine’s unusual properties suggest the cause might be more complex. The
fluorous effect and the unusually low lying σ*C-F make alternative explanations plausible.
Perhaps fluorine atoms in the hydrophobic cavity are interacting favorably with one another,
providing an enthalpic contribution to folding. To fully utilize fluorine in peptides, an
understanding of why it increases stability is essential.
A model system to explore the contribution of fluorine to peptide stability was developed
by the Marsh Lab in 2004 (31). See Figure 1 for its structure. In their design, the folded tetramer
has hydrophilic residues placed externally, while the hydrophobic fluorous residues are internal
and able to interact with each other while avoiding the aqueous solution. Conversely, in the
unfolded state the protein is in a random coil conformation. Verification of the α-helical content
was done via circular dichroism, while analytical ultracentrifugation and gel filtration confirmed
the monomer-tetramer equilibrium. The fluorinated analog was more stable to chemical
denaturation than the non-fluorinated version, demonstrating that fluorine is somehow playing a
stabilizing role (31).
Additional analogs were synthesized, each containing differing amounts of fluorine in the
interior cavity (32). These peptides were analyzed using the above techniques and similar results
were observed. Interestingly, the free energy of folding seemed to increase linearly with the number of
fluorine atoms in the peptide. Subsequently, proteolysis experiments revealed that the fluorinated
analogs have a substantially increased resistance to degradation (33). Thus this sequence of
peptides demonstrates the desired properties, but it is still unclear how fluorine confers these
effects.
Figure 1: Reprinted with permission from Ref. 31. Copyright 2004 American Chemical Society.
This was the first antiparallel α4-helix bundle designed by the Marsh Lab to determine how
fluorine affects the stability of folded proteins. Note how the hexafluoroleucine residues are
packed into the hydrophobic interior of the tetramer.
Further variants of α4H made it clear that it is not only the amount of fluorine, but also
where the fluorine is located that plays a role (34). By selectively placing hexafluoroleucine at
specific ‘a’ and ‘d’ positions in the chain (see Figure 2), the packing structure was shown to
influence the free energy change. However, crystal structures of the tetramers were not
immediately available and so it was challenging to determine the details of the effect. Notably,
enthalpy-entropy compensation could have been taking place, making anything other than free
energy calculations impossible to interpret.
Figure 2: Reprinted with permission from Ref. 34. Copyright 2009 American Chemical Society.
These are several variants of the synthesized α4H peptide, where the hexafluoroleucine is
indicated with darker, larger spheres. The packing differences are intimately related to the
stability of the tetramer.
To conclusively determine the source of the stability increase, a detailed thermodynamic
analysis was undertaken (29). Guanidinium hydrochloride and heat were used to denature the
peptides and the folding was monitored using circular dichroism. Circular dichroism is a
powerful tool to monitor protein secondary structure, showing characteristic peaks for α-helices,
β-sheets, and random coils (35). In particular, the ellipticity at 222 nm is strongly correlated with
α-helical content and since all of the α4 analogs above have an α-helical folded state and a
random coil unfolded state, this provides a convenient and accurate way to determine the
proportion folded under given conditions.
Materials and Methods:
Before peptide synthesis could be performed, the fluorinated amino acids had to be
obtained. 4,4,4-Trifluoroethylglycine was purchased from SynQuest Laboratory and
enzymatically resolved using porcine kidney acylase I as previously described (36). Next, L-
5,5,5,5’,5’,5’-hexafluoroleucine was synthesized using established methods (37). Boc and Fmoc
protected β-tert-butyl-L-alanine were bought from AnaSpec Inc. Peptide synthesis was
performed using standard protocols (31, 38). The peptide sequences are given in Figure 3: α4H
and α4tbA6 were synthesized via the Fmoc protocol, while the others were done with Boc
procedures.
Figure 3: Adapted with permission from Ref. 29. Copyright 2012 American Chemical Society.
This shows the sequences of all of the synthesized peptides, as well as the structures of the
folded proteins. The structures of the amino acids are also given.
The circular dichroism experiments were done in an Aviv 62DS spectropolarimeter at 222
nm with a 1 mm path length cuvette. Stock solutions were prepared that contained 40 μM peptide
(monomer concentration), 10 mM potassium phosphate buffer at pH 7.0, and 9 to 12 different
concentrations of guanidinium hydrochloride. Each of those solutions had ellipticity
measurements averaged over 10 seconds and the temperature for each was varied from 4 °C to
90 °C in 2 °C increments. Between 430 and 512 data points were obtained for each peptide.
The relationship between the CD signal and the thermodynamic parameters was
established as follows. It was assumed that the equilibrium was between the monomeric random
coil peptide and the folded 4-α-helix bundle. Suppose [P] is the total amount of peptide and let
[U] and [F] be the concentrations of monomeric peptide and folded tetramer respectively, so
[P] = [U] + 4[F]. For all of the experiments presented here [P] = 40 μM. Then at any given set
of conditions, the ellipticity θ can be written as the sum of the contributions from the unfolded
and folded components, as in Eqn. 1.
This relates the measured ellipticity to the amount of unfolded peptide, and so now a relationship
between [U] and the thermodynamic parameters must be found.
The equilibrium constant for F ⇄ 4U is given by the following equation, where [F] = ([P] − [U])/4.
$$K = \frac{[U]^4}{[F]} = \frac{4[U]^4}{[P] - [U]} \qquad (2)$$
For monomer and dimer folding, the above equation can be solved for [U] because the exponent
is only equal to 1 or 2. In general though, the above equation cannot be solved for [U] in a simple
or convenient way (although an analytic solution does exist) (39). It is far simpler to use a
numeric solution. With modern computers, the error from this is completely negligible. In Eqn. 2
we are given a function from [U] to K, and so an inverse must be shown to exist and to be unique
in the domain where [U] lies in the interval (0, [P]). This will be done using real analysis (40). K
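is clearly defined for all [U] in the desired interval and as a rational function of [U], it is also
differentiable where defined.
$$\frac{dK}{d[U]} = \frac{d}{d[U]}\left(\frac{4[U]^4}{[P] - [U]}\right) = \frac{4[U]^3\,(4[P] - 3[U])}{([P] - [U])^2} \qquad (3)$$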
This derivative is positive for all [U] in the domain of interest, so K is strictly increasing in [U]
and therefore one-to-one. Because K is also continuous, tends to 0 as [U] tends to 0, and tends to ∞
as [U] tends to [P], every K > 0 corresponds to exactly one [U] in (0, [P]). Thus, Eqn. 2 has a unique
inverse function. Informally, Eqn. 2 finds the K value for each [U], and the previous steps show
that Eqn. 2 can also be used to find the [U] that corresponds to each K. Although this cannot be
done conveniently in closed form, the computation can be done numerically by finding the (unique) root of
$$[U]^4 + \frac{K}{4}[U] - \frac{K}{4}[P] = 0 \qquad (4)$$
between 0 and [P].¹
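The numerical inversion described above can be sketched in MATLAB as follows. This is a minimal illustration rather than the exact script used for the fits: the helper names Kfun and solveU and the test value uTrue are hypothetical, and the polynomial is simply K = 4[U]^4/([P] − [U]) rearranged so that its root on [0, [P]] is the free monomer concentration.

```matlab
% Sketch: invert K([U]) = 4[U]^4 / ([P] - [U]) numerically on the interval [0, P].
P = 40e-6;                                   % total peptide (monomer units), M

Kfun   = @(u) 4*u.^4 ./ (P - u);             % forward map:  [U] -> K
solveU = @(k) fzero(@(x) x^4 + k*x/4 - k*P/4, [0 P]);   % inverse map: K -> [U]

uTrue = 10e-6;                               % hypothetical free monomer concentration, M
k     = Kfun(uTrue);                         % equilibrium constant at that [U]
uBack = solveU(k);                           % recover [U] from K by root finding

relErr = abs(uBack - uTrue) / uTrue;         % round-trip error
fprintf('K = %.3e, recovered [U] = %.3e M, relative error = %.1e\n', k, uBack, relErr);
```

The bracketing interval [0 P] always contains a sign change, because the polynomial equals −K[P]/4 at 0 and [P]^4 at [P], which is why a bracketing root finder is a natural choice here.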
Returning to Eqn. 1, this can be rewritten as
where now the monomer concentration is a function of the equilibrium constant. The equilibrium
constant can be related to the Gibbs free energy, and this can then be substituted into the Gibbs-
Helmholtz equation (Eqn. 6 below) as follows (41).
1 Although the above argument is formal, inverse functions are familiar objects. For example, on the
non-negative reals the inverse of the squaring function is the square root function. Although the
fourth order polynomial in Eqn. 4 is more complicated than the squaring function, the mathematics
involved is entirely analogous.
In Eqn. 7, ΔH°, ΔS°, and ΔCp° represent the change in enthalpy, entropy, and heat capacity upon
unfolding at a standard temperature T0 (typically 298 K). Over this temperature range it cannot
be assumed that the enthalpy and entropy changes are constant, but these are accounted for with
the ΔCp° term. The m term relates the free energy of unfolding to the concentration of
denaturant, namely that the free energy of unfolding decreases linearly with [GuHCl]. In the
above it is safe to assume that ΔCp° and m are constant (41, 42). This gives K as a function of
temperature, [GuHCl], and the desired thermodynamic parameters.
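For reference, the expression evaluated by the MATLAB function ‘K’ in the Results section can be written out explicitly. It is transcribed here from that code rather than from the numbered equations, so the grouping of terms is an assumption about how the original equations were laid out:

```latex
\Delta G(T,[\mathrm{GuHCl}]) = \Delta H^\circ - T\,\Delta S^\circ
  + \Delta C_p^\circ\left[\,T - T_0 + T\ln\frac{T_0}{T}\,\right] - m\,[\mathrm{GuHCl}],
\qquad
K = \exp\!\left(-\frac{\Delta G}{RT}\right)
```

At T = T0 the heat capacity term vanishes, so ΔH° and ΔS° are the enthalpy and entropy changes at the reference temperature, consistent with the definitions given above.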
Thus, using Eqn. 5, we have the ellipticity in terms of [U], and Eqn. 4 gives [U] as a
function of K. From Eqn. 7, K is dependent on the temperature T, [GuHCl], and the constants
ΔH°, ΔS°, ΔCp°, and m. Combining all of this together gives the ellipticity as a function of the
experimental conditions and the thermodynamic parameters. The only step remaining is to
determine the functional form for θu and θf. As done in Ref. 41, these were given a linear
dependence on temperature and [GuHCl], so that each baseline is a plane.
This is an empirical rather than a theoretical assumption, but it has been shown to be
accurate in most cases and, perhaps more importantly, setting these to constants rather than planes
gave similar values for the peptides studied.
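The chain from conditions to signal (conditions give K, K gives [U], [U] gives θ) can be sketched in MATLAB as below. Several pieces are assumptions made only for illustration: the signal is taken to be a population-weighted average of the unfolded and folded baselines, the baselines are held constant rather than planar, and every numerical value in the struct p is a hypothetical placeholder rather than a fitted result.

```matlab
% Sketch of a forward model theta(T, [GuHCl]); signal form and baselines are ASSUMED.
R  = 8.3145;       % gas constant, J/(mol K)
P  = 40e-6;        % total peptide concentration (monomer units), M
T0 = 298.15;       % reference temperature, K

% Hypothetical parameters (illustrative values only, not results)
p.dH  = 150e3;     % enthalpy of unfolding at T0, J/mol
p.dS  = 170;       % entropy of unfolding at T0, J/(mol K)
p.dCp = 4e3;       % heat capacity change, J/(mol K)
p.m   = 15e3;      % denaturant dependence, J/(mol M)
p.thetaU = -2;     % unfolded baseline (assumed constant)
p.thetaF = -30;    % folded baseline (assumed constant)

% Free energy of unfolding, equilibrium constant, and free monomer concentration
dG    = @(T,D) p.dH - T.*p.dS + p.dCp.*(T - T0 + T.*log(T0./T)) - p.m.*D;
Keq   = @(T,D) exp(-dG(T,D) ./ (R.*T));
Ufree = @(k) fzero(@(x) x^4 + k*x/4 - k*P/4, [0 P]);

% Fraction of peptide that is unfolded, and the assumed population-weighted signal
fracU = @(T,D) Ufree(Keq(T,D)) / P;
theta = @(T,D) p.thetaU*fracU(T,D) + p.thetaF*(1 - fracU(T,D));

thetaNative    = theta(298.15, 0);   % no denaturant: mostly folded
thetaDenatured = theta(298.15, 6);   % 6 M denaturant: mostly unfolded
fprintf('theta at 0 M: %.2f, theta at 6 M: %.2f\n', thetaNative, thetaDenatured);
```

These handles accept scalar conditions only; evaluating a whole data set requires wrapping the root finder in arrayfun, as the script in the Results section does.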
The above theory was used to create a curve fitting algorithm in MATLAB (MathWorks,
Inc.). By using the function given by θ vs. T vs. [GuHCl], a nonlinear least squares program
could provide estimates and confidence intervals for the thermodynamic parameters.
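As an illustration of how a nonlinear least squares fit with confidence intervals can be set up in MATLAB, the sketch below fits a deliberately simplified model: a monomeric two-state thermal melt with fixed baselines, not the tetramer model described above. It assumes the Statistics and Machine Learning Toolbox is available for nlinfit and nlparci; the data are synthetic and all numerical values are hypothetical.

```matlab
% Sketch: nonlinear least squares with confidence intervals on a toy two-state melt.
% Requires the Statistics and Machine Learning Toolbox (nlinfit, nlparci).
rng(0);                                    % reproducible synthetic noise
R = 8.3145;                                % gas constant, J/(mol K)
T = (277.15:2:363.15).';                   % 4 to 90 C in 2 C steps, in K

% Toy model (hypothetical): b(1) = dH in kJ/mol, b(2) = Tm in K, baselines fixed
fu    = @(b,T) 1 ./ (1 + exp(b(1)*1000*(1 - T./b(2)) ./ (R*T)));   % fraction unfolded
model = @(b,T) -30 + 25*fu(b,T);           % folded = -30, unfolded = -5 (arbitrary units)

bTrue = [200; 330];                        % parameters used to generate the data
y     = model(bTrue, T) + 0.2*randn(size(T));   % synthetic "observed" ellipticity

b0 = [150; 325];                           % rough starting guess
[bHat, resid, J] = nlinfit(T, y, model, b0);
ci = nlparci(bHat, resid, 'jacobian', J);  % 95% confidence intervals, one row per parameter

disp([bHat(:) ci]);                        % columns: estimate, lower bound, upper bound
```

Expressing the enthalpy in kJ/mol keeps the two parameters on comparable scales, which tends to help the optimizer; this is a design choice of the sketch and not a statement about the original program.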
Results:
The CD spectra were obtained as expected. It had been determined previously that the
peptides existed as monomers and tetramers with no significant intermediate structures (34). It
was confirmed here that the folding was reversible and that no precipitate formed, indicating that
the assumptions used in the modeling held for these peptides (29). Before the results of the fits
are given, the algorithms developed are given below.
R = 8.3145; %Ideal Gas Constant in SI units
P = 40e-6; %Total Peptide Concentration (monomer units) in M
T0 = 298.15; %Reference Temperature in K

%1 kcal = 4184 J exactly
%Make sure data is saved in three columns:
%Temp (Celsius), Theta, [Den] (Molar)

data = uiimport; %Import data; it will be saved as 'data'
data = data.data; %MatLab Glitch; this is the workaround
Temp = data(:,1) + 273.15; %Converts to Kelvin
Theta_Obsd = data(:,2);
Den = data(:,3);
Cond = [Temp Den];
%Cond is the n x 2 matrix of Temp and Den, and Theta_Obsd is output

%Define additional functions K, U, and f
K = @(b,Cond)exp(-(b(3)-Cond(:,1)*b(4)+b(5)*(Cond(:,1)-T0 ...
    +Cond(:,1).*log(T0./Cond(:,1)))-Cond(:,2)*b(6))./(R*Cond(:,1)));
U = @(b,Cond) arrayfun(@(k) fzero(@(x) x^4+k*x/4-k*P/4, [0 P]), K(b,Cond));
f = @(b,Cond)
Because the above algorithm can be generalized to systems beyond tetramer folding monitored
by circular dichroism, it will be analyzed in depth. Up to the definition of the matrix ‘Cond’, the
script defines constants and puts the experimental data into a format
that MATLAB can work with. The equilibrium constant ‘K’ is defined using Eqn. 7 to be a
function of the constant thermodynamic parameters and m and the experimental conditions T and
[GuHCl]. Note that the constants are not defined yet, even though ‘K’ is a function of them.
Next, the free monomer concentration ‘U’ is defined as a function of those same parameters by
using ‘K’ and Eqn. 4. Finally, the ellipticity ‘f’ is presented as a function of all of the variables
through Eqn. 5. In the above, all of the parameters are given in the vector b = [a; e; ΔH°; ΔS°;
ΔCp°; m; b; c; f; g].
After the required definitions were programmed, parameter estimates were computed.
When doing nonlinear least squares it is necessary to input a functional form, experimental data,
and then initial values for the constants to be determined. Instead of requiring a user to estimate
parameters each time, a module was created to do this automatically. To determine estimates, it
was first assumed that the base planes were constant (i.e. b = c = f = g = 0) and that the
contribution of the folded peptide to ellipticity was just above the maximum signal at 222 nm
and contribution from the unfolded peptide was just below the minimum signal at 222 nm (35).
This allows [U] to be estimated for each experimental condition using Eqn. 1. From that value of
[U], Eqn. 2 gives a value for K and then the relation ΔG = −RT ln K
allows ΔG to be estimated at each experimental condition. From Eqn. 6, using the estimated free
energy of unfolding as well as the temperature and [GuHCl], which are the known experimental
conditions, we have a linear system of equations where the unknowns are the desired parameters.
This system can be solved easily and the resulting ‘ParaEst’ is the collection of those estimates.
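The linear solve for the starting values can be sketched as follows. This is an illustration under stated assumptions rather than the original ‘Initial Values Module’: the free energies are generated synthetically from hypothetical parameters instead of being estimated from ellipticity, and the names A, trueP, and est are introduced only for this example.

```matlab
% Sketch: recover [dH; dS; dCp; m] from estimated free energies by linear least squares.
T0 = 298.15;                               % reference temperature, K

% Hypothetical grid of conditions: temperature (K) and denaturant concentration (M)
[Tg, Dg] = meshgrid(280:10:350, 0:1:5);
T = Tg(:);
D = Dg(:);

% dG = dH - T*dS + dCp*(T - T0 + T*ln(T0/T)) - m*D is linear in the four unknowns,
% so each condition contributes one row of the design matrix A.
A = [ones(size(T)), -T, T - T0 + T.*log(T0./T), -D];

trueP = [150e3; 170; 4e3; 15e3];           % illustrative parameters, J/mol based units
dG    = A * trueP;                         % synthetic, noise-free free energies

est = A \ dG;                              % least squares solution of the linear system
relErr = max(abs(est - trueP) ./ abs(trueP));

disp([trueP est]);                         % columns: generating values, recovered values
```

With real data each ΔG would come from the estimated K at that condition, so the recovered values would only be approximate, which is all that a starting point for the nonlinear fit requires.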
It is reasonable to ask if these computed parameters could be used as the results. Just
doing that would be inaccurate. In particular, it is rarely appropriate to transform a nonlinear
model into a linear one and then perform straightforward linear regression and expect the results
to be precise. This is a common problem when analyzing Lineweaver-Burk plots, for example
(43). More specifically, the procedure gives additional weight to certain points, which causes the
errors to be non-normally distributed. The same problem occurs in the ‘Initial Values Module’
above. Fortunately, these values only need to be approximate enough to be used in the nonlinear
regression component, and the results indicate that this method gives good enough values for the
nonlinear least squares algorithm to converge.
The last two lines of code perform the nonlinear curve fitting algorithm using the
experimental conditions, observed ellipticity, the assumed model, and the estimated parameters.
The ‘ci’ portion computes the confidence intervals from the regression output. This outputs the
desired parameters and the confidence intervals in which they lie. To determine the accuracy of
these calculations, the experimental data was plotted with the theoretical curve in MATLAB
using the algorithm below.
X = linspace(min(Temp), max(Temp), 100).'; %100 evenly spaced temperatures
Y = linspace(min(Den), max(Den), 100).'; %100 evenly spaced denaturant concentrations
[X,Y] = meshgrid(X,Y);
Z = zeros(size(X));
for i = 1:100
    for j = 1:100
        Z(i,j) = f(beta, [X(i,j) Y(i,j)]);
    end
end
%plottools is the function to graph stuff; plot the surface X vs. Y vs. Z
%with a scatterplot of the data
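One way to draw the surface, overlay the data, and inspect the residuals without the interactive plotting tools is sketched below. The surface fToy is a hypothetical stand-in for the fitted model, and the observations are synthetic, so the picture only illustrates the plotting calls and not any result.

```matlab
% Sketch: fitted surface with data overlay, plus a residual plot (toy model, synthetic data).
rng(1);                                         % reproducible synthetic noise

% Hypothetical smooth surface standing in for the fitted model f(beta, [T D])
fToy = @(T,D) -30 + 25 ./ (1 + exp(-(T - 330 + 8*D)/5));

% Synthetic observations on an 8 x 6 grid of temperatures (K) and denaturant (M)
Tobs = repmat((280:10:350).', 6, 1);
Dobs = kron((0:5).', ones(8,1));
thetaObs = fToy(Tobs, Dobs) + 0.3*randn(size(Tobs));

% Evaluate the model on a 100 x 100 grid spanning the data
[X, Y] = meshgrid(linspace(min(Tobs), max(Tobs), 100), ...
                  linspace(min(Dobs), max(Dobs), 100));
Z = fToy(X, Y);

% Surface with the observations overlaid
figure;
surf(X, Y, Z, 'EdgeColor', 'none', 'FaceAlpha', 0.6);
hold on;
scatter3(Tobs, Dobs, thetaObs, 20, 'k', 'filled');
xlabel('T (K)'); ylabel('[GuHCl] (M)'); zlabel('\theta_{222}');

% Residuals against fitted values; structure here would suggest a poor fit
resid = thetaObs - fToy(Tobs, Dobs);
figure;
plot(fToy(Tobs, Dobs), resid, 'o');
xlabel('fitted \theta'); ylabel('residual');
```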
By plotting the data, it could be made clear if the data was appropriately fitted. That provided a
good first test, but the residuals were also analyzed to confirm the fit was valid. The plots are
given below.
For five of the peptides (α4H, α4F3af3d, α4F2(6,24), α4F2(10,20), and α4F2(13,17)), the
above algorithm did not converge. Because these peptides unfolded so easily, the lower base plane
could not be accurately estimated. A slight variation of the above algorithm was used that fixed the
lower base plane. Running the data for the more stable peptides through this program revealed
that the parameters changed by only about 10%, so this variant was acceptable for comparative purposes.
The portions with changes are given here for completeness, but there are really only minor
computational differences.
%Define additional functions K, U, and f
K = @(b,Cond)exp(-(b(3)-Cond(:,1)*b(4)+b(5)*(Cond(:,1)-T0 ...
    +Cond(:,1).*log(T0./Cond(:,1)))-Cond(:,2)*b(6))./(R*Cond(:,1)));
U = @(b,Cond) arrayfun(@(k) fzero(@(x) x^4+k*x/4-k*P/4, [0 P]), K(b,Cond));
f = @(b,Cond)