Investigating Alkaline Phosphatase and Ketosteroid Isomerase
by Rational Design
A thesis presented
by
Nicholas A. DeLateur
to
The Department of Chemistry and Chemical Biology
in partial fulfillment of the requirements for the degree of
Master of Science in the field of Chemistry
Northeastern University
Boston, Massachusetts
August 8, 2013
© Copyright 2013
Nicholas A. DeLateur
All rights reserved
Investigating Alkaline Phosphatase and Ketosteroid Isomerase
by Rational Design
by
Nicholas A. DeLateur
ABSTRACT OF THESIS
Submitted in partial fulfillment of the requirements for the
degree
of Master of Science in Chemistry and Chemical Biology
in the College of Science of Northeastern University,
August 8, 2013
Abstract
Enzymes catalyze chemical reactions many orders of magnitude
faster than the
uncatalyzed reaction and are capable of doing so at
physiological pH and temperature. As
enzymes consist of hundreds of amino acids, the ability to
identify which residues contribute to
catalysis with high recall and low false positive rates is of
critical importance to characterizing
and engineering enzymes. Theoretical Microscopic Anomalous
Titration Curve Shapes
(THEMATICS) and Partial Order Optimum Likelihood (POOL) are
programs developed at
Northeastern University that can identify the residues
contributing to catalysis. THEMATICS
finds anomalous titration behavior, which correlates with
catalytic activity. POOL combines the
THEMATICS input with geometric and evolutionary predictions to
rank each residue by the
likelihood of its importance for catalysis.
Alkaline phosphatase (AP) is an enzyme found in all domains of
life that cleaves
phosphate groups from a broad range of substrates. Ketosteroid
isomerase (KSI) performs an important
biological function in the metabolism of many bacteria by
degrading steroids. THEMATICS and
POOL predict that alkaline phosphatase and ketosteroid isomerase
contain most of their catalytic
power in the residues directly surrounding the reacting
substrate molecule; there is very little
contribution from distal or remote residues
of the protein. This is in
stark contrast to phosphoglucose isomerase (PGI) and nitrile
hydratase (NH), where
THEMATICS and POOL predict a multi-layer active site, with
residues in the second and third
shells contributing to activity. The predictions for KSI, PGI,
and NH have been experimentally
validated.
Pseudomonas putida KSI (PpKSI) is strikingly efficient and
selective. Three putative
KSIs identified from Structural Genomics were analyzed by
THEMATICS and POOL and then
characterized in vitro to determine the presence of, or lack of,
KSI activity. A putative KSI from
Mycobacterium tuberculosis (MtKSI) was predicted to have
isomerase activity and biochemical
experiments reveal that the putative M. tuberculosis KSI does
indeed possess KSI activity,
although with reduced efficiency compared to PpKSI.
To investigate this lower efficiency in the correctly annotated
KSI, we engineered the
MtKSI active site to resemble more closely that of PpKSI under
the hypothesis that these
mutations would increase the activity of MtKSI. However, we
found that most of these mutations
alone or in tandem significantly lowered rather than increased
activity. Variants S16Y, F111D,
S16Y/F64Y, S16Y/F111D, F64Y/F111D, and S16Y/F64Y/F111D lost
catalytic power and were
essentially inactive. Variant F64Y retained catalytic power
similar to the wild-type enzyme.
Although the active sites of MtKSI and PpKSI are similar, our
attempts to increase the catalytic
efficiency by creating a more PpKSI-like active site of MtKSI
were not successful.
Protein engineering relies on the ability to accurately predict
sites of function. The best
predictor for active-site residues is POOL using THEMATICS,
INTREPID, and ConCavity
inputs. We have shown that POOL not only correctly predicts the
residues required for
catalysis, but that these predictions can also be used to assign
function to proteins whose function is
unknown or putatively assigned. Even when the residues required
for catalysis are known,
engineering improved or novel function remains difficult and
may require multiple approaches.
Acknowledgments
I am blessed with not one, but two advisors of extraordinary
talent and patience. I am
forever grateful to Professor Penny Beuning for allowing me to
begin work in her lab as a young
freshman with no experience in chemistry or biology. She has
been an unending source of
mentoring and teaching. Professor Mary Jo Ondrechen has trusted
me with project after project,
encouraging me to investigate and grow as a scientist, for which
I will always be grateful.
Dr. Srinivas Somarowthu performed the herculean task of teaching
me both the
computational and experimental aspects of THEMATICS/POOL,
alkaline phosphatase, and
ketosteroid isomerase. I owe most of my practical knowledge in
these areas to Sri, and am
thankful for the pleasure of meeting and working with him over
these past years.
I want to thank the numerous past and present DNA and ORG lab
members, with
emphasis towards Judith Hollander and Ramya Parasarum for
graciously sharing bench space
and wisdom. Mark Naniong and Colleen Shea experimented on MtKSI
as undergraduate
researchers and their impressive work contributed to the data
contained in this thesis.
Neither this work—nor even my graduation—would be possible
without Richard
Pumphrey, Cara Shockley, Andrew Bean, Jordan Keefe, and Katie
Cameron assisting me
through the NU shuffle and my own shortcomings. Jeff Peterson,
Professor Graham Jones,
Professor Carla Mattos, and Professor O’Doherty have provided me
with immensely valuable
discussion and direction. I believe John Bottomy has forgiven me
more than anyone on Earth; I
cherish his friendship and kindness.
I owe my inspiration and aptitude to my ever-supportive family,
especially my parents
Sandra and Joe. They have been a never-ending source of love.
Thank you so much Mom, Dad,
and Matt, along with Cole and Tiffany.
Funding that allowed these projects and my research to happen
was provided by the
Office of the Provost at Northeastern University, the Matz Co-op
Scholarship, and grants NSF:
MCB-0843603, CAREER MCB-0845033, and REU MCB-0843603.
Table of Contents
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
List of Abbreviations
Chapter 1. Protein Engineering
1.1. Proteins as catalysts
1.2. Design vs. Redesign; Directed Evolution vs. Rational Design
1.3. Functional Site Prediction with THEMATICS and POOL
1.4. Catalysis by remote residues
Chapter 2. Alkaline Phosphatase
2.1. Introduction
2.2. Computational Predictions
2.3. Materials and Methods
2.4. Results
2.5. Conclusions
Chapter 3. Ketosteroid Isomerase
3.1. Introduction
3.2. Computational Predictions
3.3. Materials and Methods
3.4. Results
3.5. Conclusions
Chapter 4. Future Work
4.1. POOL-rank cut-offs
Appendix A. Propagation of error in calculating catalytic efficiency
References
List of Figures
Figure 1.1. Alanine, aspartate, glutamate, and asparagine at pH 7.
Figure 1.2. Phenylalanine, tyrosine, and serine at pH 7.
Figure 1.3. A titration curve of mean net charge as a function of pH for select lysine residues in E. coli β-lactamase.
Figure 1.4. Diagram of a multi-layered active site.
Figure 2.1. The active site of alkaline phosphatase based on PDB ID: 1ALK.
Figure 2.2. Diagram of Evolutionary Trace and THEMATICS predictions for AP.
Figure 2.3. A POOL plot of POOL score vs. POOL rank for alkaline phosphatase.
Figure 2.4. The 2nd and 3rd shell residues predicted by THEMATICS.
Figure 2.5. Primers for site-directed mutagenesis of E. coli alkaline phosphatase.
Figure 2.6. Standard curve for 4-nitrophenol phosphate.
Figure 2.7. Michaelis-Menten plots for AP in 1 M Tris-HCl pH 8.0 buffer.
Figure 2.8. Catalytic efficiencies of wild-type and variant alkaline phosphatases.
Figure 2.9. AP residues investigated in this work.
Figure 2.10. A plot of Table 3 and Table 4 showing effects on catalytic efficiency based on POOL rank for AP.
Figure 3.1. Mechanism of KSI based on PpKSI numbering.
Figure 3.2. Primers for site-directed mutagenesis of MtKSI in plasmid pGST-Rv0760c.
Figure 3.3. Standard curve for 4-androstene-3,17-dione (4AND).
Figure 3.4. Michaelis-Menten plots for MtKSI WT and variants.
Figure 3.5. WT and F64Y individual Michaelis-Menten plots.
Figure 3.6. Single run of Michaelis-Menten plot for MtKSI F111D.
Figure 3.7. “Top-down” view of PpKSI.
Figure 3.8. Three residues of interest in PpKSI without surrounding secondary structure.
Figure 4.1. POOL plots for AP, KSI, PGI, NH, DnaE and DinB.
List of Tables
Table 2.1. POOL predictions for alkaline phosphatase.
Table 2.2. Kinetic assays for alkaline phosphatase.
Table 2.3. WT and variant AP kinetic parameters.
Table 2.4. Summary calculations for WT alkaline phosphatase and variants.
Table 2.5. 1st shell variants of AP and their catalytic efficiency under comparable conditions to our experiments.
Table 2.6. 2nd and 3rd shell variants of AP and their catalytic efficiency under comparable conditions to our experiments.
Table 3.1. SALSA alignment of POOL predicted residues for known KSI proteins and proteins annotated as putative KSIs.
Table 3.2. Kinetic assays for MtKSI.
Table 3.3. Vmax and KMapp for MtKSI WT and variants.
Table 3.4. Comparison between the WT MtKSI and F111D variant at 90 μM 5AND.
Table 3.5. Catalytic efficiency for MtKSI WT and variants.
Table 3.6. MtKSI WT and variants based on initial velocities at 30 μM 5AND.
Table A.1. Concentrations of enzymes used to gather kinetic data for alkaline phosphatase.
List of Abbreviations
% Percent
°C Degrees Celsius
4AND 4-androstene-3,17-dione
5AND 5-androstene-3,17-dione
Å Ångströms
AP Alkaline phosphatase
BSA Bovine Serum Albumin
cm Centimeter
Da Dalton
DinB DNA Polymerase IV
DNA Deoxyribonucleic acid
DTT Dithiothreitol
E. coli Escherichia coli
ET Evolutionary Trace
FPLC Fast protein liquid chromatography
GST Glutathione S-transferase
h Hours
HEPES 4-(2-Hydroxyethyl)-1-Piperazineethanesulfonic Acid
kcat First order rate constant
kDa Kilodalton
KM Michaelis constant
KSI Ketosteroid isomerase
L Liter
M Molar
min Minutes
mL Milliliters
Ml Mesorhizobium loti
mM Millimolar
mmol Millimoles
Mt Mycobacterium tuberculosis
NH Nitrile hydratase
nM Nanomolar
nm Nanometers
NTF2 Nuclear Transport Factor 2
OD Optical density
Pa Pectobacterium atrosepticum
PDB Protein Data Bank
PGI Phosphoglucose isomerase
PhoA Alkaline phosphatase
PNP para-nitrophenol
PNPP para-nitrophenyl phosphate
POOL Partial Order Optimum Likelihood
PSI Protein Structure Initiative
R2 Regression co-efficient
rcf Relative centrifugal force
RNA Ribonucleic acid
SALSA Structurally Aligned Local Sites of Activity
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
SG Structural Genomics
SVM Support vector machine
TEV Tobacco etch virus
THEMATICS Theoretical Microscopic Anomalous Titration Curve Shapes
TM Melting temperature
Tris-HCl 2-amino-2-hydroxymethyl-propane-1,3-diol hydrochloride
μ3 3rd central moment
μ4 4th central moment
v/v Volume by volume
V0 Initial velocity
Vmax Maximum velocity
WT Wild-type
YT Yeast extract and Bacto Tryptone
μL Microliter
μM Micromolar
σ Error
A Ala Alanine
C Cys Cysteine
D Asp Aspartic Acid
E Glu Glutamic Acid
F Phe Phenylalanine
G Gly Glycine
H His Histidine
I Ile Isoleucine
K Lys Lysine
L Leu Leucine
M Met Methionine
N Asn Asparagine
P Pro Proline
Q Gln Glutamine
R Arg Arginine
S Ser Serine
T Thr Threonine
V Val Valine
W Trp Tryptophan
Y Tyr Tyrosine
Chapter 1. Protein Engineering
1.1. Proteins as catalysts
All known life forms create polymers of various combinations of
20 different amino
acids. These polymers are known as proteins and frequently act
as catalysts, in which case they
are then referred to as enzymes. The linear chain of amino acids
(primary structure) folds to form
local order such as α-helices and β-strands (secondary
structure). These helices, strands, loops,
and other local structures fold into a single overall
arrangement (tertiary structure); multiple
chains can associate with each other (quaternary structure).
Enzymes catalyze reactions under
physiological conditions, such as neutral pH and room
temperature, with extreme specificity and
high efficiency. With few exceptions, enzymes are responsible
for catalyzing every important
chemical reaction in biology, giving rise to statements such as
Orgel’s First Law:
Whenever a spontaneous process is too slow or too
inefficient
a protein will evolve to speed it up or make it more
efficient.
Most proteins are on the order of 100 to 1000 amino acids in length. With
20 canonical amino
acids with which to build, the number of possible protein
sequences quickly becomes
unfathomable. For a protein on the smaller end, the number of
possible sequences is 20^100.
This number, however, includes sequences such as 100
prolines in a row, which
would generally be considered non-functional.
Estimates put the fraction of
“functional” folds at 1 in 10^77 [1].
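As a rough illustration of these magnitudes, the short Python sketch below counts the possible sequences for a chain of a given length and applies the 1 in 10^77 estimate quoted above. The function names are illustrative assumptions, and applying a single functional fraction to the whole sequence space is a simplification made only for the sake of the arithmetic.

```python
import math

N_AMINO_ACIDS = 20          # canonical amino acids
FUNCTIONAL_LOG10 = -77      # log10 of the quoted "1 in 10^77" estimate


def sequence_space(length: int) -> int:
    """Exact number of distinct sequences of the given length."""
    return N_AMINO_ACIDS ** length


def log10_sequence_space(length: int) -> float:
    """log10 of the sequence count, convenient for very long chains."""
    return length * math.log10(N_AMINO_ACIDS)


def log10_functional_estimate(length: int) -> float:
    """log10 of the sequence count scaled by the quoted functional fraction."""
    return log10_sequence_space(length) + FUNCTIONAL_LOG10


length = 100
print(f"20^{length} has {len(str(sequence_space(length)))} decimal digits")
print(f"log10(20^{length}) = {log10_sequence_space(length):.1f}")
print(f"log10(estimated functional sequences) = {log10_functional_estimate(length):.1f}")
```

Working in log10 avoids printing very large integers and makes the comparison between the total space and the estimated functional fraction a simple subtraction of exponents.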
How enzymes are capable of achieving the remarkable feats of
chemistry required of
them represents a central area of research in biochemistry.
While a protein may be composed of
hundreds of amino acids, generally only a small handful of those
amino acids are directly
involved in the performance of catalysis. These residues compose
the “active site” of the
enzyme. In 1946 Linus Pauling postulated that the catalytic
power of enzymes lies in their ability
to lower the energy of the transition state between substrate
and product[2], a theory that
essentially still holds today[3]. To explain the ability of
enzymes to perform reactions only on
their specific cognate substrates, the “lock and key”[4] theory
was proposed, eventually giving
way to a more nuanced “induced fit”[5] theory taking into
account the transition state geometry
and the realistic expectation of a dynamic system. The lock-and-key and
induced fit models generally
assume a globular fold with a solvent-accessible active site. In
many cases, however, the active site is
buried within the protein or a protein cavity[6]. Recently, Jiri
Damborsky has pursued a “keyhole-lock-key”
model to address this complication[6, 7].
The Central Dogma[8] of biology, that DNA is transcribed into
RNA which is then
translated into protein, provides a natural scheme in which to
probe hypotheses about protein
sequence-structure-function relationships. By manipulation of an
organism’s DNA, a variant
protein product is produced, which can then either be examined
at an in vitro functional level
after isolation, or kept in the organism and the phenotype of
the organism observed under
varying conditions to elucidate in vivo function. The
sequence-structure relationship is a folding
problem, and while interesting in its own right, will not be
addressed here in favor of the
structure-function relationship. One reason is that many
sequences result in the same overall
structure. Another reason is that the active site is a
structural feature and is the focus of protein
engineering endeavors.
Two things were required for protein engineering to emerge as a field:
- A method to change the protein sequence, and thus structure,
with exquisite control
- A falsifiable hypothesis of how a change in protein structure
will change protein function
Both requirements were finally met in 1982, as exemplified by a
foundational study on tyrosyl-transfer RNA synthetase[9], after the advent of site-directed
mutagenesis, which allowed specific,
controlled changes at the DNA level to be specified by the
researcher. Protein engineering as a
field today produces marvelous work that ranges from designing
enzymes that catalyze Diels-Alder[10] and Kemp[11] reactions, to building a fully functional
enzyme from a 9-amino acid
alphabet[12], to creating a completely new fold never seen in
nature[13].
1.2. Design vs. Redesign; Directed Evolution vs. Rational
Design
To be strict, protein engineering, or protein design, would
refer to the process of creating
a functional protein de novo (also referred to as “artificial”
enzymes). Most protein engineering
however utilizes already functional enzymes, such as those
extracted from organisms of research
interest, and manipulates them in a way to make them more
functional, different in function, or
to make them lose their functionality. These are examples of
protein reengineering and can often
be seen designated as such in the literature (for example, see
recent review by Hilvert[14]).
Presently, the protein engineering paradigm of creating
mutations and examining
resulting changes in function is well established.
Implementation on the other hand is constrained
by the unfathomably high permutation level proteins occupy; it
is impossible for an experimental
lab to investigate every residue in a protein especially if
multiple mutations at the same residue
are desired. There are two main approaches to deal with this
dilemma: directed evolution and
rational design.
Directed evolution draws upon Darwinian evolution concepts to
discover mutations of
interest by iterating rounds of mutagenesis and selection. At
its simplest, a gene encoding a
protein undergoes non-specific mutagenesis to introduce a large
array of mutations, the
resulting library is expressed, and variants with a desired phenotype are selected.
The survivors of the first round of
selection return to the mutagenesis step to repeat the process
until a satisfactory level of function is
attained. Since its inception, directed evolution has proven to
be a powerful protein engineering technique[15, 16]
for developing new or improved function.
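The iterate-mutate-select cycle described above can be sketched as a toy simulation. This is only an illustration under strong assumptions: the fitness function below scores similarity to a hypothetical optimal sequence that is known in advance, whereas in a real experiment fitness is measured by an assay or selection and the optimum is unknown. All names, the mutation rate, and the library sizes are invented for the example.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fitness(seq: str, target: str) -> int:
    """Toy fitness: number of positions matching a hypothetical optimum."""
    return sum(a == b for a, b in zip(seq, target))


def mutate(seq: str, rng: random.Random, rate: float = 0.05) -> str:
    """Non-specific mutagenesis: each position is randomized with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq
    )


def directed_evolution(parent, target, rounds=20, library_size=200,
                       survivors=10, seed=0):
    """Iterate mutagenesis and selection; return the best variant and fitness history."""
    rng = random.Random(seed)
    population = [parent]
    history = []
    for _ in range(rounds):
        # Mutagenesis: build a library of variants from the current survivors.
        library = [mutate(rng.choice(population), rng) for _ in range(library_size)]
        # Keep the previous survivors in the pool so the best fitness never drops.
        pool = population + library
        # Selection: retain only the fittest variants for the next round.
        pool.sort(key=lambda s: fitness(s, target), reverse=True)
        population = pool[:survivors]
        history.append(fitness(population[0], target))
    return population[0], history


target = "MKTAYIAKQR"                 # hypothetical optimal sequence
parent = "A" * len(target)            # starting sequence
best, history = directed_evolution(parent, target)
print(best, history)
```

Because the survivors of each round are carried forward alongside the new library, the best fitness recorded in `history` can only stay the same or improve from round to round.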
Rational design represents the oldest method of protein
engineering. Using hypotheses
about the roles of particular residues, specific mutations to
specific amino acids are chosen and
created, and the resulting change (or lack thereof) is
examined. The residues of interest can be
chosen based on crystal structures, previous experiments,
sequence comparison, structural
comparison, etc. To determine which residues in a protein
contribute to catalysis, for example,
one would first identify the residues suspected of contributing
to catalysis and then create one or
more mutations that test this hypothesis.
Figure 1.1. Alanine, aspartate, glutamate, and asparagine at pH
7.
Alanine has only a methyl group as its side chain, whereas
aspartic acid carries a short
carboxylic acid side chain. Glutamic acid is also acidic, with a side chain longer
than that of aspartic acid by a single
methylene, and asparagine contains an amide group instead of the
carboxyl group. A residue is often
changed to alanine because of alanine's simplicity:
its side chain is a single methyl group.
This substitution approximates a loss of both
functional group and bulk. Charge and
size are two of the most important characteristics to
investigate for an amino acid's contribution.
Figure 1.1 shows how an aspartic acid can be
changed to asparagine to alter
charge, or to glutamic acid to alter size.
Many times a residue will contain multiple functionalities. For
example, tyrosine contains
both a hydroxyl functional group and an aromatic functional
group. To investigate the
contributions of these moieties as separately as possible, a
series of mutations such as visualized
in Figure 1.2 could be made. A mutation from tyrosine to serine,
while a drastic change in size,
would remove the aromatic functionality. A mutation from
tyrosine to phenylalanine would
remove just the hydroxyl group, leaving the 6-membered aromatic
ring intact.
Figure 1.2. Phenylalanine, tyrosine, and serine at pH 7.
Tyrosine provides both an aromatic ring and a hydroxyl group to
the active site of an
enzyme. Phenylalanine provides only the aromatic moiety whereas
serine adds only a hydroxyl
group without aromaticity. These mutations allow us to test
hypotheses pertaining to an
enzyme’s stability, mechanism, selectivity, or efficiency by
rational change of the protein. This
method of investigation underpins protein engineering as a
powerful tool to investigate the active
sites of proteins.
1.3. Functional Site Prediction with THEMATICS and POOL
The active site of the protein is commonly termed “where the
chemistry happens”. For
our purposes we sometimes use a stricter definition:
“residues within 5 Å of the site of
reaction”. These residues interact with the substrate directly,
whether by hydrophobic
interactions, π-π interactions, (de)protonation, hydrogen
bonds, Coulomb forces, dipole-dipole
interactions, or covalent bonding. It is of great interest to
predict accurately and quickly the
active site of a given protein structure. To that end, the
active site prediction method Theoretical
Microscopic Anomalous Titration Curve Shapes (THEMATICS) was
published in 2001[17].
THEMATICS uses computational methods to calculate a theoretical
titration curve for every
ionizable residue (K, R, D, E, H, Y, C) in a protein structure.
A small minority of these titration
curves will show behavior that significantly differs from the
ideal Henderson-Hasselbalch
behavior (Figure 1.3). While a single outlier may be a fluke, a
“cluster”, defined as two or more
residues with deviant behavior within 6 Å of each other, is
considered a positive hit for
identifying the active site.
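The cluster criterion above can be sketched in a few lines of Python. This is a minimal illustration, not the THEMATICS implementation: it assumes each flagged residue is represented by a single coordinate, that residues are linked whenever that pairwise distance is at most 6 Å, and that linked residues are merged transitively (single linkage). The residue labels and coordinates are hypothetical.

```python
import math
from itertools import combinations


def find_clusters(flagged, cutoff=6.0):
    """Group flagged residues linked by distances <= cutoff (in Å).

    `flagged` maps a residue label to one representative (x, y, z) coordinate.
    Only groups with two or more members are returned, since a single
    outlier is not treated as a positive hit.
    """
    names = list(flagged)
    neighbors = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if math.dist(flagged[a], flagged[b]) <= cutoff:
            neighbors[a].add(b)
            neighbors[b].add(a)

    clusters, seen = [], set()
    for start in names:
        if start in seen:
            continue
        # Depth-first walk over the neighbor graph collects one connected group.
        stack, members = [start], set()
        while stack:
            current = stack.pop()
            if current in members:
                continue
            members.add(current)
            stack.extend(neighbors[current] - members)
        seen |= members
        if len(members) >= 2:
            clusters.append(sorted(members))
    return clusters


# Hypothetical coordinates (Å) for four residues with anomalous curves.
flagged = {
    "res1": (0.0, 0.0, 0.0),
    "res2": (4.0, 0.0, 0.0),
    "res3": (9.0, 0.0, 0.0),
    "res4": (30.0, 0.0, 0.0),
}
clusters = find_clusters(flagged)
print(clusters)
```

In this example res1 and res3 are 9 Å apart but both lie within 6 Å of res2, so the transitive merge places all three in one cluster, while the isolated res4 is discarded.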
THEMATICS exploits a defining property of a catalyst to help
find active sites: a catalyst
must return to its original state at the end of a
chemical reaction. Many enzymes donate a proton to, or abstract a proton from,
the substrate, and this poses a fundamental problem. A residue acidic
enough to donate a proton would be
too weak a base to take the proton back, and a residue basic
enough to abstract a proton from the substrate would be
too weak an acid to give the proton
back[18].
If a residue could be both an acceptor and donor of a proton
simultaneously, or near
simultaneously, the paradox would be resolved. The residue would
have to be ionizable over a
wide range of pH values and not follow Henderson-Hasselbalch
behavior: the type of behavior
THEMATICS calculates for known residues of catalytic
importance.
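To make the contrast concrete, the sketch below compares an ideal Henderson-Hasselbalch curve with a toy model of two identical ionizable sites whose protonated forms repel each other. The two-site partition function is a textbook-style illustration chosen here as an assumption; it is not the calculation THEMATICS performs, and the pKa of 10.5 and the coupling of 2 pK units are illustrative values only.

```python
def hh_protonated_fraction(pH: float, pKa: float) -> float:
    """Protonated fraction of an isolated site (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def coupled_protonated_fraction(pH: float, pKa: float, coupling: float) -> float:
    """Protonated fraction of one of two identical sites.

    `coupling` is the penalty (in pK units) for having both sites
    protonated at once; coupling = 0 recovers the isolated-site curve.
    """
    x = 10.0 ** (pKa - pH)          # statistical weight of one bound proton
    w = 10.0 ** (-coupling)         # interaction factor for the doubly protonated state
    return (x + w * x * x) / (1.0 + 2.0 * x + w * x * x)


def transition_width(curve, lo=0.0, hi=14.0, step=0.01) -> float:
    """pH span over which the protonated fraction falls from 0.9 to 0.1."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    start = next(p for p in grid if curve(p) <= 0.9)
    end = next(p for p in grid if curve(p) <= 0.1)
    return end - start


PKA = 10.5        # illustrative lysine-like pKa
COUPLING = 2.0    # illustrative interaction strength, in pK units

width_hh = transition_width(lambda p: hh_protonated_fraction(p, PKA))
width_coupled = transition_width(
    lambda p: coupled_protonated_fraction(p, PKA, COUPLING)
)
print(f"isolated site: {width_hh:.2f} pH units")
print(f"coupled site:  {width_coupled:.2f} pH units")
```

For a lysine-like site the mean net charge equals the protonated fraction, so the isolated site switches charge state over roughly two pH units while the coupled site remains partially protonated over a considerably wider range.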
Figure 1.3. A titration curve of mean net charge as a function
of pH for select lysine residues in E. coli β-lactamase.
In Figure 1.3[19] the two filled symbols show the titration
curves of two lysines (K146
and K215) that do not contribute to catalysis. The two unfilled
symbols in Figure 1.3 show the
titration curves of active site lysines K73 and K234. Note the
classic, sharp transition of charge
states as modeled by the Henderson-Hasselbalch equation for the
non-catalytic lysines contrasted
to the perturbed, anomalous behavior of the curves for catalytic
lysines.
THEMATICS offers additional advantages beyond predicting
active sites. Because the
criteria for prediction are based purely on chemical
properties computed from the three-dimensional
coordinates of the query protein and are not
dependent on homology,
THEMATICS is immune to false positives arising from homology or
database misannotation. A
structure of an enzyme could be the only structure of its kind in existence,
such as a novel or artificial fold,
and THEMATICS will still perform just as well. It has been
shown that THEMATICS
works well using a homology model as input rather than
an empirical structure[20], and finds
both catalysis and recognition sites of enzymes[21].
The deviation from
Henderson-Hasselbalch behavior was quantified by examining the
3rd and 4th central moments
of the curves, which correspond to asymmetry and kurtosis,
respectively[22]. Residues scoring
more than one standard deviation above the mean for residues of the same type were considered positive
hits (Z > 1 for μ3 or μ4)[22]. The Z-score cut-off was later
refined to Z > 0.99 for μ3 or μ4 after this
was found to improve performance on the reference data
set[23].
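A hedged numerical sketch of this scoring is given below. It assumes that the moments are taken of the negative derivative of the protonation curve treated as a probability density over pH, and that Z-scores are computed against a population of residues of the same type; both the toy coupled-site curve and the five-member population are invented for illustration and do not reproduce the published reference set.

```python
import statistics


def hh_fraction(pH, pKa):
    """Protonated fraction of an isolated site (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def coupled_fraction(pH, pKa, coupling):
    """Toy model: one of two identical sites whose protonated forms repel."""
    x = 10.0 ** (pKa - pH)
    w = 10.0 ** (-coupling)
    return (x + w * x * x) / (1.0 + 2.0 * x + w * x * x)


def central_moments(curve, lo=-5.0, hi=20.0, step=0.01):
    """Return (mu3, mu4) of the density -d(fraction)/d(pH), estimated on a grid."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    values = [curve(p) for p in grid]
    mids = [(grid[i] + grid[i + 1]) / 2.0 for i in range(n)]
    # Drop in protonated fraction across each bin = probability mass of the bin.
    mass = [values[i] - values[i + 1] for i in range(n)]
    total = sum(mass)
    mean = sum(m * w for m, w in zip(mids, mass)) / total
    mu3 = sum((m - mean) ** 3 * w for m, w in zip(mids, mass)) / total
    mu4 = sum((m - mean) ** 4 * w for m, w in zip(mids, mass)) / total
    return mu3, mu4


def z_scores(values):
    """Z-score of each value relative to the population mean and deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical population: four isolated sites and one strongly coupled site.
curves = {
    "site_A": lambda p: hh_fraction(p, 4.0),
    "site_B": lambda p: hh_fraction(p, 6.5),
    "site_C": lambda p: hh_fraction(p, 10.5),
    "site_D": lambda p: hh_fraction(p, 12.0),
    "coupled": lambda p: coupled_fraction(p, 10.0, 2.0),
}
mu4 = {name: central_moments(curve)[1] for name, curve in curves.items()}
scores = dict(zip(mu4, z_scores(list(mu4.values()))))
flagged = [name for name, z in scores.items() if z > 0.99]
print(flagged)
```

In this toy population the isolated sites share essentially the same 4th central moment regardless of pKa, so only the coupled site, whose curve is stretched over a wider pH range, exceeds the Z > 0.99 cut-off.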
Originally, THEMATICS titration curves were inspected manually
for non-Henderson-Hasselbalch
behavior, which raised both resource-commitment and
scientist-bias issues.
Automation[24] alleviated both of these concerns and paved
the way for adding Support Vector
Machines (SVM) as a potential way of raising THEMATICS recall
and precision even
higher[25].
Partial Order Optimum Likelihood (POOL) combines THEMATICS with
other
predictors to create the best functional site predictor to
date[26]. Originally using CASTp for
geometric features and ConSurf for sequence-based features[26],
POOL has since[27]
incorporated ConCavity[28] for geometric features and
INTREPID[29] for sequence-based
phylogenetic features. POOL provides several advantages over
THEMATICS: the ability to
predict non-ionizable residues, the inclusion of sequence and geometric
information, and improved
performance. POOL allows non-ionizable residues to be predicted
by assigning all residues an
environmental μ3 and μ4 based on the behavior of nearby
residues. In addition to the 3rd and 4th
central moments, the buffer range[27] (BR) was added as a
feature to quantitate the wide
buffering range that is typical of active site
residues. POOL is publicly available
on the web at http://www.pool.neu.edu/wPOOL/[30].
1.4. Catalysis by remote residues
Earlier we defined the active site as “residues within 5 Å of
the site of reaction”. Even
during the seminal work on tyrosyl-transfer RNA synthetase the
concept of residues remote from
the site of chemical transformation contributing to catalysis
seemed evident and was validated by
showing that T40 and H45 contributed to catalysis by binding of
the tail phosphate groups of the
ATP moiety[9]. However, here a stricter definition of remote
residues is adopted, and we
redefine the active site as “residues within 5 Å of the
substrate”, regardless of whether that
particular residue is directly involved in chemical reactions.
With this definition, residues such as
T40 and H45 in tyrosyl-transfer RNA synthetase would not be
considered remote, but rather it
could be said that the active site of tyrosyl-transfer RNA
synthetase is particularly large to
accommodate a particularly large substrate.
As soon as THEMATICS was created, it was noted that certain
predictions by
THEMATICS included residues that were not in direct contact with
the substrate[17]. These
residues were not only far away from the site of the reaction,
but did not have any interaction
with the substrate. Whether these predictions were false
positives, or correct predictions yet to be
tested remained an open question[17, 18]. One could imagine an
active site to be composed of
layers, or shells: the first shell consists of those residues that are in
contact with the substrate, the second
shell consists of the residues in contact with, but
behind, the first shell, and the third shell
consists of the residues in contact with, but behind,
the second shell.
Figure 1.4 abstractly shows a multi-layered active site
consisting of a 1st shell that interacts with the substrate, a
2nd shell of residues interacting with the 1st shell, and a 3rd
shell of residues interacting with the 2nd shell. Each shell is
approximately 5 Å in depth.
Figure 1.4. Diagram of a multi-layered active site.
These predicted residues were in the second, or even third,
shell of the active site. Nitrile Hydratase (NH), Phosphoglucose
Isomerase (PGI), and DNA Polymerase IV (DinB) were predicted to
contain 2nd shell residues contributing to catalysis. In contrast,
there are some enzymes, such as Ketosteroid Isomerase (KSI), where
no second-shell residues are predicted to be important for
catalysis. It was found that NH[31], PGI[32], and DinB[33] indeed
all contain remote residues contributing to catalysis, whereas
KSI[32] possesses a mostly single-layered active site.
These results show that many, but not all, enzymes contain
active sites that are extended,
utilizing remote residues to contribute towards catalysis.
THEMATICS and POOL accurately
predict the contributions of remote residues to catalysis by a
wide range of enzymes. Thus, the
extent of an enzyme’s active site can be predicted using POOL
and THEMATICS.
Chapter 2. Alkaline Phosphatase
2.1. Introduction
Alkaline phosphatase (AP) appears across all domains of life,
releasing phosphate groups from a wide range of substrates. AP is
of great interest for use in diagnostic assays, but the bacterial
enzyme is considered too slow compared to the mammalian enzyme,
although the thermal stability of the mammalian enzyme is much
lower than that of the bacterial enzyme (Tm of 65 °C and 95 °C,
respectively)[34]. Alkaline phosphatase has been a staple of
enzymology studies for decades[35], although its mechanism remains
under constant revision and further investigation[35-38]. Its
thermostability, ubiquity in both nature and the chemical
literature, and ease of kinetic assay present an excellent
learning opportunity. As such, the senior-level Chemical Biology
course at Northeastern University utilizes the site-directed
mutagenesis and Michaelis-Menten parameterization of alkaline
phosphatase as a long-term lab experiment. Some mutations in this
work were designed by undergraduates taking this course.
E. coli AP is encoded by the phoA gene, which encodes a
471-amino-acid precursor protein; the first 21 amino acids contain
a periplasmic signal sequence that is then removed from the
protein, resulting in a 450-amino-acid enzyme that naturally
dimerizes in solution. Each monomer contains its own active site
with three metal ions: two zinc and one magnesium[39]. These metal
ions are held in place by various residues and, with no substrate
present, are coordinated with three water molecules[40]. The
magnesium ion is held in place by D51, D153, T155, and E322; the
zinc1 ion is held in place by R166, D327, H331, and H412; the
zinc2 ion is held in place by D51, R166, D369, and H370[35,
39-49]. K328 interacts with the phosphate moiety through a water
molecule[49], and S102 performs the nucleophilic attack on the
substrate[40, 50, 51].
Figure 2.1. The active site of alkaline phosphatase based on PDB
ID: 1ALK. Zinc: purple; magnesium: yellow;
phosphate: green/red.
2.2. Computational Predictions
Alkaline phosphatase as analyzed by THEMATICS was predicted to
have mostly 1st shell residues, with two predicted 2nd shell
residues and one 3rd shell residue. In addition to analysis by
THEMATICS, analysis by Evolutionary Trace (ET)[52, 53] predicted
a much larger population of residues that fully included those
predicted by THEMATICS.
Figure 2.2. Diagram of Evolutionary Trace and THEMATICS
predictions for AP.
Each group in Figure 2.2 represents a shell of the active site
for AP. Residues predicted by ET but not predicted by THEMATICS
could be non-ionizable (as in the case of S102) and/or simply not
predicted by THEMATICS.
POOL Rank   Residue   Raw POOL score   Normalized POOL score
1 ASP 51 2.06E-02 1.00E+00
2 ASP 369 1.42E-02 6.87E-01
3 HIS 370 1.06E-02 5.15E-01
4 ASP 327 1.06E-02 5.15E-01
5 GLU 322 7.89E-03 3.83E-01
6 HIS 412 3.57E-03 1.73E-01
7 HIS 331 5.18E-04 2.52E-02
8 ASP 101 3.57E-04 1.73E-02
9 HIS 372 3.36E-04 1.63E-02
10 ASP 153 1.54E-04 7.50E-03
11 ARG 166 1.10E-04 5.36E-03
12 GLU 341 8.08E-05 3.92E-03
13 LYS 328 7.37E-05 3.58E-03
14 HIS 86 6.85E-05 3.33E-03
15 PRO 156 4.25E-05 2.06E-03
16 GLY 52 3.80E-05 1.85E-03
17 GLU 57 2.20E-05 1.07E-03
18 THR 155 2.11E-05 1.03E-03
19 SER 102 2.11E-05 1.03E-03
20 HIS 162 1.77E-05 8.59E-04
21 MET 53 1.69E-05 8.18E-04
22 ASP 330 9.87E-06 4.79E-04
23 ASN 44 8.80E-06 4.27E-04
24 PHE 317 8.80E-06 4.27E-04
25 GLY 207 8.80E-06 4.27E-04
Table 2.1. POOL predictions for alkaline phosphatase. THEMATICS
predictions include those colored. Blue: 1st shell; yellow: 2nd
shell.
Shown here are only the 25 most highly ranked residues of 450.
Residues in blue are known residues contributing to catalysis via
ligand or metals; residues in yellow are predicted by THEMATICS
to be 2nd or 3rd shell residues of interest. E341 helps form the
dimer interface.
Figure 2.3. A POOL plot of POOL score vs. POOL rank for alkaline
phosphatase.
The POOL plot in Figure 2.3 extends out towards a rank of 449,
asymptotically approaching a POOL score of 0. There are quite a
few interesting predictions by POOL for alkaline phosphatase
(Table 2.1). POOL performs well in predicting the first shell of
residues, as well as the dimer-interface-forming residue. Residues
predicted by THEMATICS all reside in the top 22 (top 5%) of the
POOL ranking, including the 2nd and 3rd shell residues predicted
by THEMATICS. Threonine 155 and serine 102 are both essential for
catalysis but rank only 18th and 19th, respectively; because
neither serine nor threonine is considered ionizable, THEMATICS
would not predict these residues directly. The computational
predictions shown above suggest that alkaline phosphatase may
have a few 2nd and 3rd shell residues important for catalysis,
namely E57, D330, and H372 (Figure 2.4).
[Figure 2.3 plot: POOL score (0 to 1) versus POOL rank (0 to 50).]
POOL discards the THEMATICS Boolean approach of assigning
discrete yes/no values to predictions of functional importance
for residues in exchange for a ranking system (Figure 2.3), which
has its own advantages and disadvantages (see discussion[26]).
Traditionally a percentage-based cut-off, such as the top 8%,
10%, or as low as the top 5%, is used to determine which residues
the user should investigate as important for catalysis. However,
the exact cut-off is still an area of investigation (see Further
Work) and can depend on the size and type of the protein of
interest.
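The percentage cut-off described above can be sketched in a few
lines of Python. This is only an illustration: the function names
are hypothetical and are not part of POOL, and truncating a
fractional residue count is an assumption chosen because it
reproduces the "top 22 (top 5%)" figure quoted for the 450-residue
AP monomer. Dividing by the top raw score is likewise an assumption
that is consistent with Table 2.1, not a documented POOL step.

```python
def top_ranked(ranked_residues, total_residues, percent):
    """Return the residues falling inside a top-percent cut-off.

    ranked_residues: residue names ordered by POOL rank (best first).
    total_residues: number of residues in the protein chain.
    percent: cut-off such as 5, 8, or 10.
    """
    # Assumption: a fractional count is truncated (450 * 5% = 22.5 -> 22).
    n_keep = int(total_residues * percent / 100)
    return ranked_residues[:n_keep]


def normalize_scores(raw_scores):
    """Scale raw scores so that the top-ranked residue has a score of 1."""
    top = max(raw_scores)
    return [score / top for score in raw_scores]


# First five rows of Table 2.1.
residues = ["ASP 51", "ASP 369", "HIS 370", "ASP 327", "GLU 322"]
raw = [2.06e-2, 1.42e-2, 1.06e-2, 1.06e-2, 7.89e-3]

# Size of the top-5% list for a 450-residue chain (placeholder rank list).
cutoff_size = len(top_ranked(list(range(450)), 450, 5))
normalized = normalize_scores(raw)
print(cutoff_size)
print([round(x, 3) for x in normalized])
```

Small differences from the normalized column of Table 2.1 are
expected, because the raw scores in the table are already rounded
to three significant figures.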
Figure 2.4. The 2nd and 3rd shell residues predicted by
THEMATICS: (top) H372, (middle) D330, and (bottom) E57.
Zinc: purple; magnesium: yellow; phosphate: green/red.
2.3. Materials and Methods
To investigate these predictions by THEMATICS and POOL
pertaining to the possible
outer-shell residues in alkaline phosphatase, we employed
site-directed mutagenesis to construct
mutants, expressed and purified them, and assayed their
activities in reference to the wild-type
protein.
2.3.1. Materials
QuikChange® site-directed mutagenesis kits (Agilent, CA) were
used to make mutations in pEK29[43], provided by E. Kantrowitz
(Boston College), using the primers listed below (Figure 2.5).
Mutations were confirmed by DNA sequencing (Massachusetts General
Hospital DNA Core, Cambridge, MA).
M75T
5'-GGCGATGGGACGGGGGACTCGG-3'
5'-CCGAGTCCCCCGTCCCATCGCC-3'
H394D
5'-CTGATCACGCCGACGCCAGCCAG-3'
5'-CTGGCTGGCGTCGGCGTGATCAG-3'
H108L
5'-GGGCAATACACTCTCTATGCGCTG-3'
5'-CAGCGCATAGAGAGTGTATTGCCC-3'
E79Q
5'-GGACTCGCAAATTACTGCCGCACG-3'
5'-CGTGCGGCAGTAATTTGCGAGTCC-3'
D352N
5'-CGATAAACAGAATCATGCTGCCAATCC-3'
5'-GGATTCGCAGCATGATTCTGTTTATCG-3'
H394L
5'-CGCTGATCACGCCCTCGCCAGCCAG-3'
5'-CTGGCTGGCGAGGGCGTGATCAGCG-3'
M75A
5'-CTGATTGGCGATGGGGCAGGGGACTCG-3'
5'-CGAGTCCCCTGCCCCATCGCCAATCAG-3'
E172Q
5'-GTTTCTACCGCACAGTTGCAGGATG-3'
5'-CATCCTGCAACTGTGCGGTAGAAAC-3'
S127A
5'-GACTCGGCTGCAGCAGCAACCGCC-3'
5'-GGCGGTTGCTGCTGCAGCCGAGTC-3'
Q457E
5'-GGACTGACCGACGAGACCGATCTC-3'
5'-GAGATCGGTCTCGTCGGTCAGTCC-3'
Figure 2.5. Primers for site-directed mutagenesis of E. coli
alkaline phosphatase. Codons manipulated are
underlined.
SM547 cells, lacking a chromosomal phoA gene, were provided by
E. Kantrowitz, made competent by chemical treatment with CaCl2,
and stored at -80 °C in aliquots. Primers were hydrated to 100 μM
concentration with sterile water, and a 5 μM stock was created by
diluting 20-fold into sterile water.
2.3.2. Methods
For protein purification, plasmids to express either WT or
variant AP were transformed into SM547 competent cells and
selected on LB agar containing 100 μg mL⁻¹ ampicillin. An
overnight culture of 50 mL YT medium containing 100 μg mL⁻¹
ampicillin grown at 37 °C was sub-cultured into 1 L YT
supplemented with 100 μg mL⁻¹ ampicillin, and growth was
continued for 12 hours at 37 °C. The cells were harvested, washed,
and osmotically shocked as previously described by Brockman &
Heppel[54], and the protein was then precipitated, suspended,
dialyzed, and purified on a HiTrap FastFlow Q column (GE
Healthcare) by FPLC as described by Chaidaroglou et al.[43].
Purity of each fraction was determined by 10% SDS-PAGE and pure
fractions were stored at -20 °C. Concentration of protein was
determined by Bradford assay (Bio-Rad) against a bovine serum
albumin standard.
Formation of para-nitrophenol from the cleavage of
para-nitrophenyl phosphate (PNPP) was measured at 410 nm at room
temperature in High Tris buffer (1.0 M Tris-HCl pH 8.0) to
calculate initial velocities with an extinction coefficient of
1.42 × 10⁴ M⁻¹ cm⁻¹ (Figure 2.6). Non-linear regression to
calculate KM and kcat was performed using GraphPad Prism 5 version
5.02. At least three independent trials were performed for each
protein. Reactions were initiated by addition of enzyme, and data
were collected every 0.5 seconds, starting at the 3rd second of
the reaction and continuing for 2 min, to construct the initial
velocities. PNPP was kept in the dark as much as possible and
stored in light-resistant microcentrifuge tubes when aliquoted.
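The non-linear regression step can be illustrated with a small,
dependency-free Python sketch. It is not the algorithm used by
GraphPad Prism; it is one simple way to obtain a least-squares
Michaelis-Menten fit, assuming unweighted residuals. For any trial
KM the best Vmax has a closed form, so only KM is searched
numerically. All function names are hypothetical, and the data
below are synthetic values generated from the WT parameters in
Table 2.3 purely to check the routine.

```python
import math


def mm_rate(s, vmax, km):
    """Michaelis-Menten initial velocity at substrate concentration s."""
    return vmax * s / (km + s)


def best_vmax(substrate, rates, km):
    """For a fixed KM the model is linear in Vmax, so solve it in closed form."""
    f = [s / (km + s) for s in substrate]
    return sum(v * fi for v, fi in zip(rates, f)) / sum(fi * fi for fi in f)


def sse(substrate, rates, km):
    """Sum of squared residuals at a given KM, using the best Vmax for that KM."""
    vmax = best_vmax(substrate, rates, km)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(substrate, rates))


def fit_michaelis_menten(substrate, rates, km_lo=0.1, km_hi=1e4, grid=200):
    """Least-squares fit of (Vmax, KM); units follow the input data."""
    # Coarse search over KM on a logarithmic grid.
    lo, hi = math.log(km_lo), math.log(km_hi)
    logs = [lo + i * (hi - lo) / (grid - 1) for i in range(grid)]
    best = min(range(grid), key=lambda i: sse(substrate, rates, math.exp(logs[i])))
    a, b = logs[max(best - 1, 0)], logs[min(best + 1, grid - 1)]

    # Golden-section refinement of log(KM) inside the bracketing interval.
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(80):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if sse(substrate, rates, math.exp(c)) < sse(substrate, rates, math.exp(d)):
            b = d
        else:
            a = c
    km = math.exp((a + b) / 2)
    return best_vmax(substrate, rates, km), km


# Synthetic, noise-free check at the PNPP concentrations of the assay (uM).
pnpp = [1, 2, 5, 10, 20, 50, 100, 200]
v0 = [mm_rate(s, 4.8, 26.7) for s in pnpp]
vmax_fit, km_fit = fit_michaelis_menten(pnpp, v0)
print(f"Vmax = {vmax_fit:.2f} uM/min, KM = {km_fit:.1f} uM")
```

With real, noisy data the fitted values would also need standard
errors, which this sketch does not estimate.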
PNPP (μM)   Buffer (2X)   Water   PNPP (2 mM)   Enzyme (variable nM)   Total
1 500 483 2 15 1000
2 500 481 4 15 1000
5 500 475 10 15 1000
10 500 465 20 15 1000
20 500 445 40 15 1000
50 500 385 100 15 1000
100 500 285 200 15 1000
200 500 85 400 15 1000
Table 2.2. Kinetic assays for alkaline phosphatase. Bolded
columns denote final concentrations; all other numbers refer to
μL added to the cuvette.
2.4. Results
To determine initial velocities by monitoring production of the
product 4-nitrophenol (4-PNP), a standard curve was constructed
from dilutions of 4-PNP, giving a molar extinction coefficient of
1.42 × 10⁴ M⁻¹ cm⁻¹, similar to the reported
1.62 × 10⁴ M⁻¹ cm⁻¹ [43].
Figure 2.6. Standard curve for 4-nitrophenol.
Each alkaline phosphatase variant was tested concurrently with
wild-type alkaline phosphatase on the same day. The initial
velocity, V0, for each substrate concentration (1-200 μM PNPP)
was calculated by taking the slope of the product formation (in
a.u. min⁻¹) and dividing by the 4-PNP molar extinction
coefficient to give μM PNP min⁻¹.
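As a worked illustration of this conversion, the following Python
sketch fits a straight line to an absorbance trace and converts
the slope to μM min⁻¹ with the Beer-Lambert law. The 1 cm path
length and the example trace are assumptions made for the sketch
(the path length is not stated in the text), and the function
names are hypothetical.

```python
def least_squares_slope(times_min, absorbances):
    """Slope of absorbance versus time (a.u. per minute) by linear least squares."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbances))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    return sxy / sxx


def initial_velocity_uM_per_min(slope_au_per_min, epsilon_per_M_cm=1.42e4, path_cm=1.0):
    """Convert an absorbance slope to a rate in uM per minute (Beer-Lambert law)."""
    # Assumption: 1 cm cuvette path length unless stated otherwise.
    molar_per_min = slope_au_per_min / (epsilon_per_M_cm * path_cm)
    return molar_per_min * 1e6


# Hypothetical trace: readings every 0.5 s from 3 s to 120 s,
# rising by 0.0284 a.u. per minute from a 0.05 a.u. baseline.
times = [(3 + 0.5 * i) / 60 for i in range(235)]
trace = [0.05 + 0.0284 * t for t in times]
v0 = initial_velocity_uM_per_min(least_squares_slope(times, trace))
print(f"V0 = {v0:.2f} uM/min")
```

An extinction coefficient of 1.42 × 10⁴ M⁻¹ cm⁻¹ corresponds to
0.0142 a.u. per μM in a 1 cm cuvette, which matches the slope of
the standard curve in Figure 2.6.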
[Figure 2.6 plot: absorbance (410 nm) versus 4-PNP concentration
(0 to 160 μM); linear fit y = 0.0142x + 0.0304, R² = 0.9993.]
Figure 2.7. Michaelis-Menten plots for AP in 1 M Tris-HCl pH 8.0
buffer. Error bars represent standard error of at
least three independent trials.
AP Variant   Vmax (μM min⁻¹)   KM (μM)   R²
WT 4.8 (0.1) 26.7 (2.6) 0.94
M53A 8.9 (0.4) 26.2 (4.1) 0.96
M53T 1.7 (0.1) 22.7 (2.1) 0.98
E57Q 7.5 (0.3) 25.2 (3.5) 0.97
H86L 3.2 (0.1) 13.5 (0.9) 0.99
S105A 3.1 (0.2) 14.1 (3.1) 0.91
E150Q 5.2 (0.5) 33.2 (8.7) 0.90
D330N 2.0 (0.1) 22.0 (5.4) 0.87
H372D 3.0 (0.3) 53.1 (14.4) 0.91
H372L 4.1 (0.2) 10.6 (1.5) 0.95
Q435E 8.4 (0.1) 20.9 (1.1) 0.99
Table 2.3. WT and variant AP kinetic parameters. Standard errors
are in parentheses and consist of at least three
independent trials.
Vmax values are not directly comparable between variants because
the enzymes were assayed at different concentrations (Appendix
A), so Vmax is not proportional to kcat across enzymes. None of
the variants showed a dramatic decrease in activity. While there
are some small differences in individual kcat or KM values, the
catalytic efficiencies are all similar (Table 2.4).
PhoA Variant   POOL Rank   Å to PO4   Shell   kcat (s⁻¹)   KM (μM)   Catalytic Efficiency (10⁶ M⁻¹ s⁻¹)   Fold Decrease
WT -- -- -- 40 (7.3) 28 (9) 1.43 (0.53) --
H372D 9 6.7 2nd 27 (10) 63 (44) 0.43 (0.34) 3.33 (2.91)
H372L 9 6.7 2nd 6.3 (0.1) 11 (2.9) 0.57 (0.15) 2.49 (1.13)
H86L 14 11.2 2nd 6.3 (0.1) 14 (0.7) 0.45 (0.02) 3.17 (1.19)
S105A 16 7.2 2nd 9.7 (0.7) 14 (3.2) 0.69 (0.17) 2.06 (0.91)
E57Q 17 12.3 3rd 21 (0.8) 26 (4) 0.81 (0.13) 1.77 (0.71)
M53A 21 14.6 3rd 25 (4.4) 27 (16) 0.93 (0.17) 1.54 (0.64)
M53T 21 14.6 3rd 14 (0.8) 23 (3.4) 0.61 (0.10) 2.36 (0.94)
D330N 22 11.0 2nd 17 (4.2) 25 (2.5) 0.68 (0.18) 2.1 (0.96)
Q435E 44 11.1 2nd 17.4 (0.4) 21 (3) 0.83 (0.12) 1.72 (0.68)
E150Q 136 10.2 2nd 13 (8.8) 34 (6.4) 0.38 (0.27) 3.74 (2.97)
Table 2.4. Summary calculations for WT alkaline phosphatase and
variants. Standard errors are in parentheses and
consist of at least three independent trials.
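The derived columns of Table 2.4 can be reproduced approximately
with the short Python sketch below. The text does not state how
the standard errors were propagated; combining relative errors in
quadrature is an assumption made here, chosen because it gives
values close to those reported. The helper name is hypothetical.

```python
import math


def ratio_with_error(num, num_err, den, den_err):
    """Return num/den and its standard error, combining relative errors in quadrature."""
    value = num / den
    rel = math.sqrt((num_err / num) ** 2 + (den_err / den) ** 2)
    return value, value * rel


# WT values from Table 2.4: kcat in s^-1, KM in uM.
kcat, kcat_err = 40.0, 7.3
km_uM, km_err_uM = 28.0, 9.0

# Catalytic efficiency in M^-1 s^-1 (KM converted from uM to M).
eff_wt, eff_wt_err = ratio_with_error(kcat, kcat_err, km_uM * 1e-6, km_err_uM * 1e-6)

# Fold decrease of H372D relative to WT, using the efficiencies reported
# in Table 2.4 (the 10^6 M^-1 s^-1 unit cancels in the ratio).
fold, fold_err = ratio_with_error(1.43, 0.53, 0.43, 0.34)

print(f"WT kcat/KM = {eff_wt:.3g} +/- {eff_wt_err:.2g} M^-1 s^-1")
print(f"H372D fold decrease = {fold:.2f} +/- {fold_err:.2f}")
```

Because the table entries are rounded, the last digit of a
recomputed value may differ slightly from the tabulated one.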
There is no correlation between catalytic efficiency and either
POOL rank or distance (Å) to the phosphate substrate for the
residues tested, nor is there a significant difference in results
between the 2nd shell and the 3rd shell residues as groups.
Distances from the PO4 (Å) are based on PDB:1ALK[39] and measured
from the tip of the residue side chain to the phosphorus atom.
Figure 2.8. Catalytic efficiencies of wild-type and variant
alkaline phosphatases. Error bars represent standard error
over at least three independent trials.
[Figure 2.8 plot: catalytic efficiency (10⁶ M⁻¹ s⁻¹, 0 to 2.50)
for WT and each AP variant.]
2.5. Conclusions
WT alkaline phosphatase is catalytically efficient, with a
kcat/KM of 1.5 × 10⁶ M⁻¹ s⁻¹, which is similar to the literature
values reported across various experiments[34, 43, 55, 56].
Mutations disrupting interactions of the active site at the first
shell commonly decrease the catalytic efficiency of alkaline
phosphatase by many orders of magnitude. In contrast, throughout
this work and in the compiled literature on single-mutation
variants, mutations in the second or third shell have little to
no effect on catalysis. As measured by single-point mutants,
alkaline phosphatase seems to have a compact active site in which
only first-shell residues contribute significantly to catalysis.
Figure 2.9. AP residues investigated in this work. Zinc: purple;
magnesium: yellow; phosphate: green/red.
It has been shown that the turnover rate of AP can be increased
substantially by multiple mutations, including 2nd shell
mutations. A D153G/D330N double mutant was reported to have over
50-fold higher kcat than the WT AP; however, the KM was also
raised by about 30-fold, leaving the enzyme with less than a
2-fold higher overall catalytic efficiency[34]. Similarly, D101A
gives 2-fold increases in both kcat and KM, which negate each
other[57]. D153A by itself, while resulting in almost no change
in catalytic efficiency, resulted in a 7-fold increase in both
kcat and KM[42]. D101, D153, and D330 are all predicted by POOL
and rank 8th, 10th, and 22nd, respectively. Multiple mutations in
close proximity, including V99A, T100V, T100I, and D101S, achieved
modest 2- to 6-fold increases in kcat/KM[58].
Figure 2.10. A plot of Table 2.5 and Table 2.6 showing effects on
catalytic efficiency based on POOL rank for AP.
[Figure 2.10 plot: fold decrease relative to the respective WT AP
(log scale, 0.1 to 1,000,000) versus POOL rank for AP (0 to 160).]
The largest losses of activity are seen for variants of D327 and
S102, with no mutations of residues outside the top 20 predicted
by POOL causing a large (more than one order of magnitude)
decrease in catalytic efficiency. It is important to note that
this compilation only examines single mutations where both
subunits are affected. Alkaline phosphatase has been known to
show intragenic complementation, where a heterodimer of variants
A and B, AB, will have higher activity than AA or BB[56]. While
the two active sites per dimer are more than 30 Å apart, there
seems to be molecular communication between them.
Variant   Shell   POOL Rank   POOL percentile   (kcat/KM) wild-type / (kcat/KM) mutant   Reference
D51E 1 1 99 231 [44]
D369N 1 2 99 95 [56]
D327N 1 4 99 4350 [45]
D327N 1 4 99 100 [46]
D327A 1 4 99 >600,000 [45]
D327A 1 4 99 >1,000,000 [46]
E322K 1 5 99 1520 [56]
H412Y 1 6 99 >12,000 [56]
H412E 1 6 99 2237 [44]
H331E 1 7 98 972 [44]
D101S 1 8 98 0.2 [59]
D101A 1 8 98 1 [57]
D153G 1 10 98 0.2 [59]
D153H 1 10 98 1.1 [34]
D153H 1 10 98 3.5 [47]
D153E 1 10 98 1.3 [44]
D153A 1 10 98 1.1 [42]
D153N 1 10 98 1.1 [42]
R166A 1 11 98 313 [43]
R166S 1 11 98 125 [43]
R166Q 1 11 98 166 [48]
R166K 1 11 98 4 [48]
K328R 1 13 97 0.9 [58]
K328C 1 13 97 10 [51]
K328H 1 13 97 0.5 [34]
K328H 1 13 97 3.2 [49]
K328A 1 13 97 3.8 [49]
T155M 1 18 96 678 [56]
S102G 1 19 96 >300,000 [60]
S102A 1 19 96 >60,000 [60]
S102C 1 19 96 >19,000 [60]
Table 2.5. 1st shell variants of AP and their catalytic
efficiency under comparable conditions to our experiments.
Variant   Shell   POOL Rank   POOL percentile   (kcat/KM) wild-type / (kcat/KM) mutant   Reference
H372A 2 9 98 2.9 [61]
H372D 2 9 98 3.3 This Work
H372L 2 9 98 2.5 This Work
H86L 2 14 97 3.1 This Work
E57Q 2 17 96 1.8 This Work
M53A 2 21 95 1.5 This Work
M53T 2 21 95 2.4 This Work
D330N 2 22 95 0.2 [34]
D330N 2 22 95 2.1 This Work
Q435E 2 44 90 1.7 This Work
A103C 2 50 89 0.9 [58]
A103D 2 50 89 2.2 [58]
T100V 2 51 89 0.3 [58]
T100I 2 51 89 0.3 [58]
V99A 2 100 78 0.2 [58]
S105L 2 136 70 6.3 [56]
S105A 2 136 70 2.1 This Work
E150Q 3 106 76 3.7 This Work
E341K * 12 97 1407 [56]
T59A * 169 37 1.5 [62]
T59R * 169 37 >600,000 [62]
Table 2.6. 2nd and 3rd shell variants of AP and their catalytic
efficiency under comparable conditions to our experiments.
Chapter 3. Ketosteroid Isomerase
3.1. Introduction
Ketosteroid isomerase (KSI) moves a double bond to convert
Δ⁵-3-ketosteroids to Δ⁴-3-ketosteroids by cleavage of the C-H
bond at C4 and reattachment of the proton at C6. This reaction is
characteristic of many biological processes involving
intramolecular proton abstraction and reprotonation (Figure 3.1).
Considering that some known KSI enzymes reach diffusion-limited
rates of reaction[63, 64], KSI is an attractive model for
studying enzyme kinetics and active site engineering[32, 63, 65].
There are two well-studied sources of Δ⁵-3-ketosteroid isomerase:
Pseudomonas putida (PpKSI) and Comamonas testosteroni (CtKSI).
These two enzymes have practically identical active sites and
catalytic residue placement, while sharing only 34% amino acid
sequence identity[63]. This fold is not entirely uncommon in
nature[66], being superimposable on Nuclear Transport Factor
2[67] despite a lack of sequence homology or functional
similarity[66, 68].
[Figure 3.1 scheme: three-panel reaction mechanism, each panel
labeled with residues D40, D103, and Y16.]
Figure 3.1. Mechanism of KSI based on PpKSI numbering.
The active site of KSI is particularly hydrophobic, which is
reasonable for an enzyme that binds steroid ligands[66]. The
mechanism for KSI involves abstraction of a proton at the C4
position by D40 (PpKSI numbering) followed by stabilization of
the intermediate by D103 and Y16[32, 63, 65, 66, 69].
Regeneration of the catalyst is achieved by the C6 carbon
abstracting the hydrogen from D40. The ability of an aspartic
acid to act as a base is of particular interest, especially with
a nearby aspartic acid requiring protonation to stabilize the
resulting enolate ion.
With the advent of the Protein Structure Initiative, many crystal
structures are uploaded to the Protein Data Bank with putative,
predicted, or unknown function. These structures often have
function assignments based purely on sequence or structural
similarity. With misannotation in databases becoming an
increasing problem[70], we have recently developed a method,
called SALSA (Structurally Aligned Local Sites of Activity)[71],
to help assign function to structures that lack biochemical data.
Because THEMATICS and POOL allow the active site of any protein
to be predicted regardless of existing homology and based solely
upon the tertiary structure of the enzyme, they are well suited
to predicting the function of proteins that may be incorrectly
annotated.
There are three putative KSI proteins from structural genomics
centers, from three organisms: Mycobacterium tuberculosis
(MtKSI), Pectobacterium atrosepticum (PaKSI), and Mesorhizobium
loti (MlKSI). Previous work in our group by Dr. Srinivas
Somarowthu has shown that of these three, only MtKSI possesses
KSI activity. However, the catalytic efficiency of MtKSI was
found to be on the order of 10⁵ M⁻¹ s⁻¹, a thousand times lower
than PpKSI’s efficiency of 10⁸ M⁻¹ s⁻¹. This raises the question:
what are the key differences that lead to this loss of activity
between MtKSI and PpKSI? Can the activity of MtKSI be brought to
PpKSI levels by making the MtKSI active site more PpKSI-like?
3.2. Computational Predictions
For each known KSI and putative KSI, POOL ranked each residue’s
importance for catalysis, and the top 10% for each was used as a
cut-off. The structures were aligned based on their active sites,
and a structural alignment table was created (see Table 3.1).
Nuclear Transport Factor 2 (NTF2) has a very similar overall fold
without sharing any function with KSI and thus was used as a
negative control.
Protein  PDB   Structurally aligned POOL predicted residues
PpKSI    1oh0  Y32 Y57 Y16 D40 W120 F56 G49 P41 D103 D35 G43 E39 M116
CtKSI    8cho  F30 Y55 Y14 D38 F116 F54 G47 P39 D99  D33 G41 E37 M112
MtKSI    2z76  M32 F64 S16 D40 W128 F63 G56 P41 F111 D35 G43 E39 M124
MlKSI    3hx8  Y52 W76 F36 P60 S146 L75 G68 P61 Y125 D55 -   F59 D142
PaKSI    3d9r  Y35 Y59 Y19 G43 K131 V58 G51 P44 E110 D38 -   M42 Y127
NTF2     1oun  Y33 L56 Y18 W41 A122 K55 G48 E42 Q101 A36 -   T40 D117
Table 3.1. SALSA alignment of POOL predicted residues for known
KSI proteins and proteins annotated as putative
KSIs. Bold: POOL-predicted; underlined: literature
annotated.
POOL predictions are based on the top 10% of rankings. The
proteins in Table 3.1 are, in order from top to bottom: two known
KSIs, three structural genomics (SG) putative KSIs, and a nuclear
transport factor of similar structure, shown for comparison. Of
the three putative KSIs, only MtKSI’s active site is both
predicted and similar to the known KSI active sites; neither
MlKSI nor PaKSI has a similar active site, nor are their residues
in the same spatial positions as the KSI active site predicted to
be important for activity. The match between MtKSI and PpKSI /
CtKSI is not 100%. While a tyrosine to phenylalanine mismatch is
somewhat conservative, it is of note that F64 in MtKSI is not
predicted by POOL to be important for catalysis. The same can be
said for S16, where PpKSI and CtKSI have a tyrosine as well. The
essential aspartic acid at D40 is conserved, but curiously the
other aspartic acid, at PpKSI-D103 / CtKSI-D99, which is thought
to be essential, is replaced by a non-POOL-predicted F111 in
MtKSI.
3.3. Materials and Methods
Wild-type MtKSI DNA was obtained in the form of the plasmid
pGST-Rv0760c (Craig Garen and Prof. Michael James, Department of
Biochemistry, University of Alberta), encoding MtKSI with a
GST-tag as well as an ampicillin resistance marker gene. Steroids
were purchased from Steraloids Inc, RI, USA. Primers were
hydrated to 100 μM concentration with sterile water, and a 5 μM
stock was created by diluting 20-fold into sterile water. Codons
manipulated are underlined.
3.3.1. Methods
QuikChange (Agilent Technologies) site-directed mutagenesis was
used to mutate the
wild-type KSI gene with the following mutations: S16Y, F64Y,
F111D, S16Y/F64Y,
S16Y/F111D, F64Y/F111D, and S16Y/F64Y/F111D. Since the amino
acids of interest are coded
by codons far enough apart, multiple mutations can be introduced
using single-mutation primers
in succession.
MtKSI.F111D-F GGCGTGGACACCTACCGGGTG
MtKSI.F111D-R CACCCGGTAGGTGTCCACGCC
MtKSI.F64Y-F GGCGCCTTCTACGACACACAC
MtKSI.F64Y-R GTGTGTGTCGTAGAAGGCGCC
MtKSI.S16Y-F CGCAGTCGTACTGGCGGTGCG
MtKSI.S16Y-R CGCACCGCCAGTACGACTGCG
Figure 3.2. Primers for site-directed mutagenesis of MtKSI in
plasmid pGST-Rv0760c.
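QuikChange-style mutagenesis uses a pair of primers that are
complementary to each other, so each reverse primer in Figure 3.2
should be the reverse complement of its forward partner. The
following Python sketch checks this for the three primer pairs
listed above; the helper name and dictionary layout are
illustrative choices, not part of any kit software.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer pairs from Figure 3.2 (forward, reverse), all written 5' to 3'.
primers = {
    "F111D": ("GGCGTGGACACCTACCGGGTG", "CACCCGGTAGGTGTCCACGCC"),
    "F64Y": ("GGCGCCTTCTACGACACACAC", "GTGTGTGTCGTAGAAGGCGCC"),
    "S16Y": ("CGCAGTCGTACTGGCGGTGCG", "CGCACCGCCAGTACGACTGCG"),
}

# True for a pair when the reverse primer is the exact reverse complement.
pairs_match = {
    name: reverse_complement(fwd) == rev for name, (fwd, rev) in primers.items()
}
print(pairs_match)
```

The same check can be run on any primer pair before ordering to
catch transcription errors.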
BL21 DE3 pLysS competent cells were transformed with
pGST-Rv0760c containing either the WT or a mutated gene, and
after streaking a transformed colony, a single colony was used to
inoculate 50 mL of LB liquid culture, which was grown overnight
with 100 μg mL⁻¹ ampicillin. The next day, the 50 mL culture was
transferred to 500 mL of LB liquid culture with 100 μg mL⁻¹
ampicillin and grown with shaking for 2 h at 37 °C. Once an OD of
0.5-0.8 at 600 nm was obtained, the culture was brought to
0.5 mmol L⁻¹ IPTG to induce expression and agitated at room
temperature overnight. After overnight growth, the culture was
harvested by centrifugation at 6000 RPM for 10 minutes, suspended
in 1X Phosphate Buffered Saline (PBS) pH 7.3 with 1 mM DTT and ½
a tablet of Roche Protease Inhibitor cocktail (Buffer A), and
stored at -80 °C.
Frozen pellets from the -80 °C freezer were thawed overnight on
ice. The suspended,
thawed cells were subjected to sonication for 2 min (multiple
rounds of 10 sec on followed by 10
sec off) and then clarified by centrifugation at 14,000 rcf for
60 min. The supernatant was
collected and loaded onto a disposable 4B Sepharose GST column
resin (GE Healthcare). The
column was washed with Buffer A extensively, and then the
GST-tagged MtKSI gradually eluted
with 1 to 10 mM reduced glutathione. Fractions containing MtKSI
determined by SDS-PAGE
were collected and combined with histidine-tagged TEV protease
overnight and then dialyzed
against Buffer A to remove any reduced glutathione. The solution
was then run through a 4B
Sepharose GST column, except this time the KSI was collected in
the initial flow through, and
then filtered onto a Nickel FPLC column to remove the
histidine-tagged TEV protease, and the
MtKSI was collected in the flow through. Fractions containing
MtKSI were determined by SDS-PAGE and then concentrated using
Viva-spin tubes with a 5000 Da Molecular Weight Cut Off
(Sartorius Stedim Biotech) while being exchanged into KSI storage
buffer (50 mM NaCl, 10 mM Tris-HCl, 1 mM DTT, pH 8.0). Purity was
determined by SDS-PAGE, and protein concentration was determined
by Bradford Assay against a BSA standard.
Activity of MtKSI was determined by formation of
4-androstene-3,17-dione (4AND) by isomerization of
5-androstene-3,17-dione (5AND), measured at 248 nm by a UV/Vis
instrument for a fixed enzyme concentration and varying substrate
concentration between 30 and 300 μM 5AND while keeping the final
methanol concentration at 3.3% v/v (Table 3.2; Figure 3.3).
Enzyme concentration was fixed at a final concentration of 10 nM
from a 1.2 μM stock that was made by diluting purified KSI with a
dilution buffer (34 mM KCl, 2.5 mM EDTA, 1% BSA, pH 7.0).
Reactions were blanked with all reagents except the substrate,
5AND. 5AND was added and mixed quickly and thoroughly, and then
the absorbance at 248 nm was recorded every 0.5 seconds for 60
seconds, starting after 3 seconds.
[5AND] (μM final)   [KSI] (nM)   2X Buffer   Water   Methanol   KSI (1200 nM)   5AND (3 mM)   5AND (10 mM)   Total
10 10 1500 1375 90 25 10 - 3000
20 10 1500 1375 80 25 20 - 3000
30 10 1500 1375 70 25 30 - 3000
60 10 1500 1375 40 25 60 - 3000
90 10 1500 1375 10 25 90 - 3000
120 10 1500 1375 64 25 - 36 3000
180 10 1500 1375 46 25 - 54 3000
300 10 1500 1375 10 25 - 90 3000
Table 3.2. Kinetic assays for MtKSI. Bolded columns denote final
concentrations, where all other numbers refer to
μL added to the cuvette.
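The volumes in Table 3.2 follow from simple dilution arithmetic,
which can be checked with the Python sketch below. The function
name is hypothetical, and the methanol calculation assumes that
the 5AND stocks are dissolved in methanol, so that the stock
volume counts toward the final methanol percentage; the text does
not state the solvent explicitly.

```python
def final_concentration(stock_conc, stock_volume_uL, total_volume_uL):
    """Concentration after dilution, in the same units as stock_conc."""
    return stock_conc * stock_volume_uL / total_volume_uL


TOTAL_UL = 3000

# (methanol uL, 5AND stock uM, 5AND stock volume uL) for three rows of Table 3.2.
rows = [(90, 3000, 10), (64, 10000, 36), (10, 10000, 90)]

for methanol_uL, stock_uM, stock_uL in rows:
    substrate_uM = final_concentration(stock_uM, stock_uL, TOTAL_UL)
    # Assumption: the 5AND stocks are in methanol, so both volumes count.
    methanol_percent = 100 * (methanol_uL + stock_uL) / TOTAL_UL
    print(f"{substrate_uM:.0f} uM 5AND, {methanol_percent:.1f}% methanol")

# 25 uL of the 1200 nM KSI stock in a 3000 uL reaction.
enzyme_nM = final_concentration(1200, 25, TOTAL_UL)
print(f"{enzyme_nM:.0f} nM KSI")
```

Under this assumption every row holds methanol plus 5AND stock at
100 μL, which corresponds to the 3.3% v/v stated in the text.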
3.4. Results
Figure 3.3. Standard curve for 4-androstene-3,17-dione (4AND).
[Figure 3.3 plot: absorbance (248 nm) versus 4AND concentration
(0 to 70 μM); linear fits y = 0.0142x - 0.0088 (R² = 0.9994) and
y = 0.0147x + 0.0076 (R² = 0.9971).]
A molar extinction coefficient of 1.4 × 10⁴ M⁻¹ cm⁻¹ was used
for kinetic analysis. Both of the two trials are shown in Figure
3.3.
Figure 3.4. Michaelis-Menten plots for WT MtKSI and variants.
Error bars represent standard error of at least three
independent trials.
WT and F64Y KSI show increasing V0 with increasing substrate
concentration, although V0 does not approach Vmax because of the
poor solubility of the substrate. Therefore, all KM values for
KSI are reported as KMapp. Vmax can still be extrapolated by
non-linear regression, but with less accuracy, as reflected in
lower regression coefficients and higher standard errors.
Figure 3.5. WT KSI and F64Y individual Michaelis-Menten plots.
Panels: MtKSI WT and MtKSI F64Y.
Any variant containing S16Y and/or F111D, however, does not show
classic Michaelis-Menten behavior and has significantly
diminished activity. Purification of MtKSI-F111D was problematic,
with low yields and loss of protein after concentration. Only a
small amount of data could be obtained for F111D, but there seems
to be no deviation from the behavior shown by the other non-F64Y
mutants.
MtKSI Variant   Vmax (μM min⁻¹)   KMapp (μM)   R²
WT 63.67 (29.08) 453.6 (297.3) 0.7611
S16Y 10.04 (25.41) 1607 (4635) 0.5909
F64Y 57.83 (3.993) 231.1 (27.89) 0.98
S16Y/F64Y 18.65 (29.19) 2485 (4247) 0.8972
S16Y/F111D 4.094 (1.148) 273.4 (127) 0.8289
F64Y/F111D 0.3873 (0.104) 3.235 (20.3) 0.2783
S16Y/F64Y/F111D 3.821 (11.98) 1674 (5973) 0.5009
Table 3.3. Vmax and KMapp for WT MtKSI and variants. Standard
errors are in parentheses and consist of at least three
independent trials.
Trial   WT V0 (μM min⁻¹)   F111D V0 (μM min⁻¹)
Trial 1 11.8 0.68
Trial 2 4.9 0.70
Trial 3 11.8 0.88
Trial 4 13.8 0.85
Table 3.4. Comparison between the WT MtKSI and F111D variant at
90 μM 5AND.
Figure 3.6. Single experiment of Michaelis-Menten plot for MtKSI
F111D.
Variant   kcat (s⁻¹)   KMapp (μM)   Catalytic Efficiency (10³ M⁻¹ s⁻¹)   Fold decrease to WT
WT 106 (48) 454 (297) 234 (187) --
S16Y 17 (42) 1607 (4635) 10 (40) 23 (88)
F64Y 96 (6.7) 231 (28) 417 (58) 0.6 (0.45)
F111D 1.5 (--) 70 (--) 36 (--) 6.4 (5.15)
S16Y/F64Y 31 (49) 2485 (4247) 13 (29) 18.7 (46)
S16Y/F111D 6.8 (1.9) 273 (127) 25 (14) 9.4 (9.1)
F64Y/F111D 0.6 (0.2) 3.2 (20) 200 (1253) 1.2 (7.4)
S16Y/F64Y/F111D 6.4 (20) 1674 (5973) 4 (18) 62 (296)
Table 3.5. Catalytic efficiency for WT MtKSI and variants. Where
available, standard errors are in parentheses and consist of at
least three independent trials.
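The kcat and catalytic efficiency columns of Table 3.5 follow
from the fitted Vmax and KMapp together with the fixed 10 nM
enzyme concentration. The Python sketch below illustrates the
unit conversion for the WT enzyme. It takes Vmax in μM min⁻¹,
which is an inference from the agreement with the tabulated kcat
values, and the function names are hypothetical.

```python
def kcat_per_second(vmax_uM_per_min, enzyme_nM):
    """Turnover number in s^-1 from Vmax (uM/min) and enzyme concentration (nM)."""
    enzyme_uM = enzyme_nM / 1000
    return vmax_uM_per_min / 60 / enzyme_uM


def efficiency_per_M_per_s(kcat, km_uM):
    """Catalytic efficiency kcat/KM in M^-1 s^-1, with KM given in uM."""
    return kcat / (km_uM * 1e-6)


# WT MtKSI: Vmax and KMapp from Table 3.3, enzyme fixed at 10 nM.
kcat_wt = kcat_per_second(63.67, 10)
eff_wt = efficiency_per_M_per_s(kcat_wt, 453.6)
print(f"kcat = {kcat_wt:.0f} s^-1, kcat/KM = {eff_wt / 1e3:.0f} x 10^3 M^-1 s^-1")
```

The same conversion applies to the AP data only if the enzyme
concentration of each assay is known, since those assays were run
at different enzyme concentrations.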
For comparison with Table 3.5, PpKSI’s catalytic efficiency is
100,000 × 10³ M⁻¹ s⁻¹. F64Y retained the same kcat while having
a lower KMapp, giving it a higher catalytic efficiency than the
WT. All other mutants lack the signal-to-noise ratio required for
an accurate analysis of their Michaelis-Menten parameters or
catalytic efficiency.
For any variant tested besides F64Y, the Michaelis-Menten
parameters KM and kcat could not be calculated, as evidenced by
standard errors larger than the measurements themselves for most
of these variants. Enzyme efficiencies may still be compared
without separating the KM and kcat variables. If the
concentration of substrate is negligible compared to the KM, the
additive term of substrate concentration in the denominator of
the Michaelis-Menten equation can be dropped.
Assuming $[S] \ll K_M$:
$$V_0 = \frac{k_{cat}[E][S]}{K_M + [S]} \approx \frac{k_{cat}}{K_M}[E][S]$$
MtKSI Variant   V0   Fold Decrease to WT
WT 3.87 (1.4) --
S16Y 0.27 (0.24) 14.3 (13.7)
F64Y 6.80 (3.3) 0.6 (0.35)
F111D 0.41 (--) 9.4 (--)
S16Y/F64Y 0.26 (0.02) 14.8 (5.7)
S16Y/F111D 0.47 (0.28) 8.2 (5.8)
F64Y/F111D 0.41 (0.23) 9.4 (6.4)
S16Y/F64Y/F111D 0.42 (--) 9.2 (--)
Table 3.6. WT MtKSI and variants compared solely based on initial
velocities at 30 μM 5AND. Where available, standard deviations
are in parentheses and represent at least three independent
trials.
These results only report ratios of catalytic efficiency without
examining kcat or KMapp
individually. F64Y results are similar between this method and
full Michaelis-Menten kinetic
analysis.
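As a rough check on the low-substrate approximation, the Python
sketch below compares the ratio of initial velocities from the
full Michaelis-Menten equation with the ratio of kcat/KM values,
using the WT and F64Y kcat and KMapp values from Table 3.5 at
30 μM 5AND. The function names are illustrative only, the enzyme
concentration of 0.010 μM is the value stated in Appendix A, and
the standard errors are deliberately ignored.

```python
def initial_velocity(kcat, km, substrate, enzyme):
    """Michaelis-Menten initial velocity in uM s^-1 (kcat in s^-1, concentrations in uM)."""
    return kcat * enzyme * substrate / (km + substrate)


def fold_decrease_from_v0(wt, variant, substrate, enzyme):
    """Fold decrease relative to WT from the ratio of full Michaelis-Menten velocities."""
    return initial_velocity(*wt, substrate, enzyme) / initial_velocity(*variant, substrate, enzyme)


def fold_decrease_from_efficiency(wt, variant):
    """Fold decrease relative to WT from the ratio of kcat/KM values."""
    return (wt[0] / wt[1]) / (variant[0] / variant[1])


# (kcat in s^-1, KMapp in uM) taken from Table 3.5
WT = (106.0, 454.0)
F64Y = (96.0, 231.0)

by_v0 = fold_decrease_from_v0(WT, F64Y, substrate=30.0, enzyme=0.010)
by_efficiency = fold_decrease_from_efficiency(WT, F64Y)
print(f"V0 ratio: {by_v0:.2f}, kcat/KM ratio: {by_efficiency:.2f}")
```

With these inputs both ratios round to about 0.6, in line with the
F64Y fold decrease listed in Tables 3.5 and 3.6.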
3.5. Conclusions
Every MtKSI variant tested, whether a single mutation or a
combination, had little to no KSI activity on 5AND, with the
exception of F64Y, and proper Michaelis-Menten curves could not
be constructed for the other variants. Why did a more
“PpKSI-like” active site not raise the catalytic efficiency
toward that of PpKSI and CtKSI?
Figure 3.7. “Top-down” view of PpKSI (PDB ID: 1OHO; Red), CtKSI
(PDB ID: 8CHO; Orange), and
MtKSI (PDB ID: 2Z76; Yellow).
The steroid-binding pocket and active site are at the front of
Figure 3.7. In each group below, residues are listed in the order
PpKSI, CtKSI, MtKSI. The left group is Y57, Y55, and F64; the top
group is Y16, Y14, and S16; and the right group is D103, D99, and
F111. MtKSI-F64, although spatially aligned with PpKSI-Y57 in
many structural alignments, is actually swiveled almost 180° away
from where the phenol group points in either PpKSI or CtKSI
(Figure 3.7). There are few replacements for PpKSI-Y57, and
MtKSI-Y113 is too far away to take over its job[68]. This seems
to be a limitation of structural alignments more than of SALSA,
but it calls attention to the importance of human verification.
In this respect, it makes sense for the F64Y variant to have
essentially unchanged catalytic activity.
Figure 3.8. Three residues of interest in PpKSI (PDB ID: 1OHO;
Red), CtKSI (PDB ID: 8CHO; Orange),
and MtKSI (PDB ID: 2Z76; Yellow) without surrounding secondary
structure.
F111 and S16 from MtKSI overlap well with their SALSA partners
in PpKSI and CtKSI (Figure 3.8). However, mutations that make the
side chains similar resulted in loss of activity. The natural
substrate for MtKSI is unknown. Because the Y16 (PpKSI) / Y14
(CtKSI) position is used in recognition of the steroid
ligand[68], MtKSI could very well act on a different steroid. If
so, this would explain both the reduced catalytic efficiency on
5AND and the sensitivity to changes in the binding recognition
pocket.
The identification of F111 in MtKSI as spatially equivalent to
PpKSI-D103 does not seem to be an alignment error; no residue in
the MtKSI structure seems capable of replicating the essential
catalytic role of D99/D103. Indeed, the authors of the crystal
structure report used this to argue against Rv0760c having KSI
activity and reported no activity on 5AND[68].
The peculiarity of the POOL and SALSA predictions stands out
after these results. What does it mean for an enzyme not only to
have a strikingly different residue at a catalytic position, but
also for that residue not to be predicted as active by POOL?
Clearly this does not rule out a particular functional activity,
such as ketosteroid isomerization of 5AND, but it may correspond
to different substrate recognition, or even a different
mechanism.
How many differences are required to declare two enzymes to have
different functions,
and how many similarities must there be before they are declared
similar? This is a current area
of investigation[71].
Chapter 4. Future Work
4.1. POOL-rank cut-offs
THEMATICS is a Boolean predictor giving either a yes or no for
each residue in a
protein structure. In contrast, POOL assigns a ranking to every
residue in a given protein
structure, and it is up to the user to determine what cut-off to
implement for best results.
Traditionally, the top 5, 8, or even 10% of POOL predictions are
considered to be positive
predictions[26, 27, 30, 32, 33, 71]. There remain two open
questions:
1) Should POOL prediction be based on percentage or POOL
score?
2) Can we use POOL to predict single-layer vs. multi-layer
enzyme active sites?
Recent work has shown convincingly that a normalized POOL score
cut-off is superior to flat percentage cut-offs. By itself, a
percentage cut-off makes the odd assumption that the number of
residues participating in an active site is directly
proportional to the total number of amino acids in the protein.
Assigning an absolute cut-off on the normalized POOL score (such
as 0.01) seems more rational, and is currently being investigated
as the next-generation cut-off for POOL.
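The difference between the two kinds of cut-off can be sketched
in a few lines of Python. The scores below are made up for
illustration and are not POOL output for any enzyme; the function
names, the 8% value, and the two protein lengths are likewise
assumptions, while the 0.01 threshold is the example value
mentioned above.

```python
import math


def top_percent_cutoff(scores, percent):
    """Return the residues ranked in the top `percent` of the protein."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_keep = math.ceil(len(ranked) * percent / 100.0)
    return ranked[:n_keep]


def score_cutoff(scores, threshold):
    """Return the residues whose normalized score is at or above `threshold`."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [res for res in ranked if scores[res] >= threshold]


def make_protein(n_residues, top_scores):
    """Build hypothetical scores: a few high-scoring residues, the rest near zero."""
    scores = {f"R{i + 1}": 0.001 for i in range(n_residues)}
    for i, value in enumerate(top_scores):
        scores[f"R{i + 1}"] = value
    return scores


# Two made-up proteins with the same six high-scoring residues but different lengths.
top = [1.0, 0.6, 0.3, 0.1, 0.05, 0.02]
small = make_protein(100, top)
large = make_protein(400, top)

for name, protein in (("small", small), ("large", large)):
    by_percent = top_percent_cutoff(protein, 8)
    by_score = score_cutoff(protein, 0.01)
    print(name, len(by_percent), len(by_score))
```

In this toy case the percentage cut-off selects four times as
many residues in the longer protein even though the high-scoring
residues are identical, whereas the score cut-off selects the
same six residues in both.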
The second question is much more difficult and is central to our
work on active site catalysis, engineering, and understanding.
Some enzymes have been shown to have multi-layer active
sites[31-33] and others single-layer active sites[32][this work].
How to differentiate between the two easily, let alone without
examining each predicted residue, is still under discussion. It
has been proposed that the shape of the POOL plot itself may
provide predictive power regarding the extent of an enzyme’s
active site. This hypothesis comes from an empirical observation,
across the few proteins studied so far, that the POOL plots seem
to drop much more sharply for enzymes with single-layer active
sites than for those with multi-layer active sites (Figure 4.1).
The single-layer active site proteins alkaline phosphatase and
ketosteroid isomerase show sharp decreases immediately,
flat-lining by the 10th residue for AP and by the 5th residue for
KSI. The multi-layer active site proteins phosphoglucose
isomerase (PGI), cobalt-type nitrile hydratase (NH), the α
subunit of pol III (DnaE), and pol IV (DinB) have extended tails
on their POOL plots and start flat-lining farther out than AP and
KSI.
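One hypothetical way to put a number on how quickly a POOL plot
flat-lines is sketched below in Python. This is not a method from
this work: the metric (the first rank from which every further
step down is smaller than a tolerance), the tolerance of 0.01, and
the exponential curves standing in for POOL plots are all
assumptions made purely for illustration.

```python
import math


def flatline_rank(scores, tolerance=0.01):
    """Return the first 1-based rank from which every step down is below `tolerance`.

    `scores` are normalized scores sorted in descending order (rank 1 first).
    """
    drops = [scores[i] - scores[i + 1] for i in range(len(scores) - 1)]
    for i in range(len(drops)):
        if all(drop < tolerance for drop in drops[i:]):
            return i + 1
    return len(scores)


def synthetic_plot(decay, n_ranks=20):
    """Made-up POOL-like curve: exponential decay from 1.0 at rank 1."""
    return [math.exp(-rank / decay) for rank in range(n_ranks)]


sharp = synthetic_plot(decay=1.0)     # stands in for a sharply dropping plot
extended = synthetic_plot(decay=5.0)  # stands in for a plot with an extended tail

print("sharp curve flat-lines at rank", flatline_rank(sharp))
print("extended curve flat-lines at rank", flatline_rank(extended))
```

A metric of this kind would have to be calibrated against enzymes
whose active site extent is already known before it could be used
to classify a new POOL plot.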
Figure 4.1. POOL plots for AP, KSI, PGI, NH, DnaE and DinB.
[Six panels (AP, KSI, PGI, NH, DnaE, and DinB POOL plots), each
plotting normalized POOL score (0 to 1) against POOL rank (0 to
20).]
Appendix A. Propagation of error in calculating catalytic
efficiency
Both the Vmax value and the KM value are calculated, with their
respective standard errors, by the GraphPad Prism program from
its inputs. Vmax values are converted to kcat values by the
following transformation, where kcat is in s⁻¹, Vmax in
μM min⁻¹, and [enzyme] in μM:

$$k_{cat} = \frac{V_{max}}{60 \times [\mathrm{enzyme}]}$$

AP Variant    [e] (μM)
WT            0.002
H372D         0.011
H372L         0.011
H86L          0.0083
S105A         0.0018
E57Q          0.006
M53A          0.006
M53T          0.002
D330N         0.002
Q453E         0.008
E150Q         0.0023

Table A.1. Concentrations of enzymes used to gather kinetic data
for alkaline phosphatase.
All MtKSI kinetic experiments were done with 0.010 μM enzyme.
Enzymes were diluted
from stock concentrations measured by Bradford assays using a
BSA standard. Catalytic
efficiency is defined as the kcat divided by KM. To propagate
the error in each measurement, I
used (where σx is the standard error of variable x):
$$\frac{\sigma_Z}{Z} = \sqrt{\left(\frac{\sigma_X}{X}\right)^2 + \left(\frac{\sigma_Y}{Y}\right)^2}$$

where, in our case, Z is catalytic efficiency, X is kcat, and Y
is KM.
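The calculation described in this appendix can be sketched in
Python as follows. The function names are illustrative, and the
Vmax-to-kcat conversion is assumed to be the plain unit conversion
implied by the stated units (division by 60 s min⁻¹ and by the
enzyme concentration). The error term is the standard
relative-error propagation for a quotient. The example inputs are
the WT MtKSI kcat and KMapp values from Table 3.5.

```python
import math


def kcat_from_vmax(vmax_um_per_min, enzyme_um):
    """Convert Vmax (uM min^-1) to kcat (s^-1) for an enzyme concentration in uM."""
    return vmax_um_per_min / (60.0 * enzyme_um)


def catalytic_efficiency(kcat, kcat_se, km, km_se):
    """Return kcat/KM and its propagated standard error.

    With kcat in s^-1 and KM in uM, the result is in uM^-1 s^-1.
    """
    efficiency = kcat / km
    relative_error = math.sqrt((kcat_se / kcat) ** 2 + (km_se / km) ** 2)
    return efficiency, efficiency * relative_error


# WT MtKSI values from Table 3.5: kcat = 106 (48) s^-1, KMapp = 454 (297) uM.
eff, eff_se = catalytic_efficiency(106.0, 48.0, 454.0, 297.0)
# 1 uM^-1 s^-1 = 10^6 M^-1 s^-1 = 1000 x 10^3 M^-1 s^-1
print(f"{eff * 1000:.0f} ({eff_se * 1000:.0f}) x 10^3 M^-1 s^-1")
```

For the WT inputs this gives roughly 233 (186) × 10³ M⁻¹ s⁻¹,
close to the 234 (187) listed in Table 3.5; the small difference
presumably comes from rounding of the tabulated kcat and KMapp
values.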
References
1. Axe, D.D., Estimating the prevalence of protein sequences
adopting functional enzyme
folds. J Mol Biol, 2004. 341(5): p. 1295-315.
2. Pauling, L., Molecular architecture and biological reactions.
Chem. Eng. News, 1946.
24(10): p. 1375-1377.
3. Garcia-Viloca, M., et al., How enzymes work: analysis by
modern rate theory and
computer simulations. Science, 2004. 303(5655): p. 186-95.
4. Fischer, E., Einfluss der Configuration auf die Wirkung der
Enzyme. Berichte der
deutschen chemischen Gesellschaft, 1894. 27(3): p.
2985-2993.
5. Koshland, D.E., Application of a Theory of Enzyme Specificity
to Protein Synthesis.
Proceedings of the National Academy of Sciences, 1958. 44(2): p.
98-104.
6. Damborsky, J. and J. Brezovsky, Computational tools for
designing and engineering
biocatalysts. Curr Opin Chem Biol, 2009. 13(1): p. 26-34.
7. Gora, A., J. Brezovsky, and J. Damborsky, Gates of Enzymes.
Chemical Reviews, 2013.
8. Crick, F., Central Dogma of Molecular Biology. Nature, 1970.
227(5258): p. 561-563.
9. Brannigan, J.A. and A.J. Wilkinson, Protein engineering 20
years on. Nat Rev Mol Cell
Biol, 2002. 3(12): p. 964-70.
10. Siegel, J.B., et al., Computational design of an enzyme
catalyst for a stereoselective
bimolecular Diels-Alder reaction. Science, 2010. 329(5989): p.
309-13.
11. Rothlisberger, D., et al., Kemp elimination catalysts by
computational enzyme design.
Nature, 2008. 453(7192): p. 190-5.
12. Walter, K.U., K. Vamvaca, and D. Hilvert, An active enzyme
constructed from a 9-amino
acid alphabet. J Biol Chem, 2005. 280(45): p. 37742-6.
13. Kuhlman, B., et al., Design of a novel globular protein fold
with atomic-level accuracy.
Science, 2003. 302(5649): p. 1364-8.
14. Hilvert, D., Design of protein catalysts. Annu Rev Biochem,
2013. 82: p. 447-70.
15. Turner, N.J., Directed evolution drives the next generation
of biocatalysts. Nat Chem
Biol, 2009. 5(8): p. 567-73.
16. Jackel, C. and D. Hilvert, Biocatalysts by evolution. Curr
Opin Biotechnol, 2010. 21(6):
p. 753-9.
17. Ondrechen, M.J., J.G. Clifton, and D. Ringe, THEMATICS: a
simple computational
predictor of enzyme function from structure. Proc Natl Acad Sci
U S A, 2001. 98(22): p.
12473-8.
18. Shehadi, I.A., H. Yang, and M.J. Ondrechen, Future
directions in protein function
prediction. Mol Biol Rep, 2002. 29(4): p. 329-35.
19. Shehadi, I.A., et al., Active site prediction for
comparative model structures with
thematics. J Bioinform Comput Biol, 2005. 3(1): p. 127-43.
20. Shehadi, I.A., et al., THEMATICS is effective for active
site prediction in comparative
model structures, in Proceedings of the second conference on
Asia-Pacific bioinformatics
- Volume 29. 2004, Australian Computer Society, Inc.: Dunedin,
New Zealand. p. 209-215.
21. Ringe, D., et al., Protein structure to function: insights
from computation. Cell Mol Life
Sci, 2004. 61(4): p. 387-92.
22. Ko, J., et al., Prediction of active sites for protein
structures from computed chemical
properties. Bioinformatics, 2005. 21 Suppl 1: p. i258-65.
23. Wei, Y., et al., Selective prediction of interaction sites
in protein structures with
THEMATICS. BMC Bioinformatics, 2007. 8(1): p. 119.
24. Ko, J., et al., Statistical criteria for the identification
of protein active sites using
theoretical microscopic titration curves. Proteins: Structure,
Function, and
Bioinformatics, 2005. 59(2): p. 183-195.
25. Tong, W., et al., Enhanced performance in prediction of
protein active sites with
THEMATICS and support vector machines. Protein Sci, 2008. 17(2):
p. 333-41.
26. Tong, W., et al., Partial Order Optimum Likelihood (POOL):
Maximum Likelihood
Prediction of Protein Active Site Residues Using 3D Structure
and Sequence Properties.
PLoS Comput Biol, 2009. 5(1): p. e1000266.
27. Somarowthu, S., et al., High-performance prediction of
functional residues in proteins
with machine learning and computed input features. Biopolymers,
2011. 95(6): p. 390-
400.
28. Capra, J.A., et al., Predicting Protein Ligand Binding Sites
by Combining Evolutionary
Sequence Conservation and 3D Structure. PLoS Comput Biol, 2009.
5(12): p. e1000585.
29. Sankararaman, S. and K. Sjölander,
INTREPID—INformation-theoretic TREe traversal
for Protein functional site IDentification. Bioinformatics,
2008. 24(21): p. 2445-2452.
30. Somarowthu, S. and M.J. Ondrechen, POOL server: machine
learning application for
functional site prediction in proteins. Bioinformatics, 2012.
28(15): p. 2078-2079.
31. Brodkin, H.R., et al., Evidence of the participation of
remote residues in the catalytic
activity of Co-type nitrile hydratase from Pseudomonas putida.
Biochemistry, 2011.
50(22): p. 4923-35.
32. Somarowthu, S., et al., A tale of two isomerases: compact
versus extended active sites in
ketosteroid isomerase and phosphoglucose isomerase.
Biochemistry, 2011. 50(43): p.
9283-95.
33. Walsh, J.M., et al., Effects of non-catalytic, distal amino
acid residues on activity of E.
coli DinB (DNA polymerase IV). Environ Mol Mutagen, 2012. 53(9):
p. 766-76.
34. Muller, B.H., et al., Improving Escherichia coli alkaline
phosphatase efficacy by
additional mutations inside and outside the catalytic pocket.
Chembiochem, 2001. 2(7-8):
p. 517-23.
35. Coleman, J.E., Structure and mechanism of alkaline
phosphatase. Annu Rev Biophys
Biomol Struct, 1992. 21: p. 441-83.
36. Lassila, J.K., J.G. Zalatan, and D. Herschlag, Biological
Phosphoryl-Transfer Reactions:
Understanding Mechanism and Catalysis. Annu Rev Biochem, 2011.
80(1): p. 669-702.
37. Andrews, L.D., T.D. Fenn, and D. Herschlag, Ground State
Destabilization by Anionic
Nucleophiles Contributes to the Activity of Phosphoryl Transfer
Enzymes. PLoS Biol,
2013. 11(7): p. e1001599.
38. Andrews, L.D., H. Deng, and D. Herschlag, Isotope-edited
FTIR of alkaline phosphatase
resolves paradoxical ligand binding properties and suggests a
role for ground-state
destabilization. J Am Chem Soc, 2011. 133(30): p. 11621-31.
39. Kim, E.E. and H.W. Wyckoff, Reaction mechanism of alkaline
phosphatase based on
crystal structures. Two-metal ion catalysis. J Mol Biol, 1991.
218(2): p. 449-64.
40. Stec, B., K.M. Holtz, and E.R. Kantrowitz, A revised
mechanism for the alkaline
phosphatase reaction involving three metal ions. J Mol Biol,
2000. 299(5): p. 1303-11.
41. Murphy, J.E., X. Xu, and E.R. Kantrowitz, Conversion of a
magnesium binding site into
a zinc binding site by a single amino acid substitution in
Escherichia coli alkaline