Understanding the role of intrinsic disorder in subunits of hemoglobin and the disease process of sickle cell anemia
Reis Fitzsimmons and Narmin Amin
Abstract
Sickle cell anemia is one of the most notorious and common genetic disorders. It is recessive:
two copies of the mutant allele must be inherited for the morphology of red blood cells to be
altered and the cells destroyed. This usually leads to reduced oxygen binding and curved, sickle-shaped
erythrocytes. The mutation responsible for this disease occurs in the 6th codon of the βA-globin, a
protein responsible for binding to the oxygen in the blood. It changes a charged glutamic
acid to a hydrophobic valine residue, which disrupts the tertiary structure and stability of the
hemoglobin molecule. Intrinsic disorder in protein structure, by contrast, generally results from
low mean hydrophobicity and high net charge, leading to unstructured protein morphology.
It is therefore an open question whether intrinsic disorder has a role in the disease process of sickle cell disease.
GlobProt2 and FoldIndex were used to predict intrinsically disordered regions in the main subunits of
hemoglobin: alpha, beta, delta, epsilon, zeta, and gamma (two isoforms). The protein sequences
for each subunit were retrieved from the UniprotKB database. Structural analysis was then
carried out with the SWISS-MODEL Repository to check the accuracy of the disorder
predictors. Finally, STRING was used to determine each subunit’s biochemical
interactome and protein partners, and the posttranslational modifications of each subunit were analyzed.
These properties were used to relate the sickle cell mutation to intrinsic disorder and
to determine any differences between the six different types of hemoglobin subunits.
Additional considerations included the biochemical properties and
molecular mechanism of sickle cell, threading energy, comparisons between the hemoglobin
subunits, and how sickle cell anemia affects embryonic development of hemoglobin.
Introduction
Sickle cell anemia is an autosomal recessive genetic disease that is caused by the
“substitution of one amino acid in the hemoglobin molecule” (Roseff). The substitution
causes erythrocytes to take on a sickle shape, after which they can no longer properly bind
oxygen. Low oxygen levels can cause “occlusion of blood vessels, increased viscosity, and
inflammation” (Roseff). Sickle cell was the first genetic disorder to be “identified at the
molecular level” in 1957 (Pawliuk et al), when it was traced to the substitution of
valine for glutamic acid in the sixth codon of human βA-globin. Homozygotes for sickle cell have
abnormal hemoglobin which “polymerizes in long fibers” when red blood cells lose their oxygen
supply (Pawliuk et al). This polymerization is a major factor in how the RBCs transform into
sickle-shaped, deformed, floppy discs. Although a single substitution might sound insignificant at first, the
mutation creates radical changes in the structure and function of the RBCs. When the glutamic
acid residue is replaced by valine, a charged residue is replaced with a nonpolar
residue, which could “cause some disruption of the tertiary structure” (Arends et al). Arends
reported that oxygen levels measured in a heterozygote tended to be
normal, whereas a recessive homozygote showed
decreased affinity for oxygen, attributed to the disruption in the tertiary structure. When the oxygen
affinity is lowered, the red blood cells have been reshaped into a dysfunctional
morphology that impairs their ability to carry oxygen. The glutamic acid residue might be
influential because it is charged and reinforces the secondary structure of the hemoglobin.
When it is replaced by valine, the protein becomes more nonpolar at that position. Valine is normally
an order-promoting residue, yet here it may promote intrinsic disorder. The objective
is to better understand the role of intrinsic disorder in hemoglobin and how it affects the disease
process of sickle cell anemia. Other considered factors were posttranslational modifications and
biochemical interactions with other proteins.
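The physicochemical shift caused by the glutamic acid-to-valine substitution can be made concrete with a small calculation. The sketch below is an illustration, not part of the analysis reported here. It takes the first 15 residues of the mature beta chain, numbered without the initiator methionine so that glutamic acid is residue 6 (the fragment and its numbering are assumptions made for this example), applies the substitution, and compares net charge, mean Kyte-Doolittle hydropathy, and a FoldIndex-style score. The score, 2.785⟨H⟩ − |⟨R⟩| − 1.151, is the charge-hydropathy relation that FoldIndex is understood to use; the actual server applies it over sliding windows of a full sequence, so values for a short fragment are only indicative.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Simplified side-chain charges; histidine is treated as neutral (assumption).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(seq):
    """Sum of simplified side-chain charges."""
    return sum(CHARGE.get(aa, 0) for aa in seq)


def mean_hydropathy(seq):
    """Mean Kyte-Doolittle hydropathy on the original -4.5..4.5 scale."""
    return sum(KD[aa] for aa in seq) / len(seq)


def foldindex_like(seq):
    """Charge-hydropathy score: positive suggests folded, negative unfolded."""
    h = sum((KD[aa] + 4.5) / 9.0 for aa in seq) / len(seq)  # rescaled to 0..1
    r = abs(net_charge(seq)) / len(seq)                      # mean net charge
    return 2.785 * h - r - 1.151


def substitute(seq, position, new_aa):
    """Replace the residue at a 1-based position."""
    i = position - 1
    return seq[:i] + new_aa + seq[i + 1:]


# Assumed N-terminal fragment of the mature beta chain (Glu at position 6).
WILD_TYPE = "VHLTPEEKSAVTALW"
SICKLE = substitute(WILD_TYPE, 6, "V")

for label, seq in (("wild type", WILD_TYPE), ("Glu6Val", SICKLE)):
    print(label, seq, net_charge(seq),
          round(mean_hydropathy(seq), 3), round(foldindex_like(seq), 3))
```

Under these assumptions the substitution removes one negative charge and raises the mean hydropathy of the fragment, so the score moves toward the folded (positive) side, in line with valine's usual order-promoting character.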
Intrinsic disorder in each subunit of hemoglobin
Hemoglobin in Homo sapiens is assembled from several different subunits whose expression changes during
human development. In an adult, the hemoglobin protein is made of two
alpha subunits and two beta subunits. As mentioned before, the mutation for sickle cell occurs near the beginning of the
beta subunit. Before the alpha subunits are expressed, there are
“combinations of ζ- with ɛ- or γ- subunits to form embryonic hemoglobins” (Manning et
al). The order of expression follows the relative positions of the genes in their clusters, “i.e., ζ → α (2
copies) on chromosome 16 and ɛ → γ (2 copies) → δ → β on chromosome 11” (Manning et al).
During normal development, the embryo is normally ζ₂γ₂, ζ₂ɛ₂, or α₂ɛ₂, the fetus is typically α₂γ₂, and
finally the adult stage consists of either α₂β₂ or α₂δ₂. Since the protein is a tetramer formed from
combinations of these six types of subunits, intrinsic disorder was assessed
in all of them. Protein sequences of each subunit were retrieved from
UniprotKB and analyzed with FoldIndex and GlobProt2. The gamma subunit has two
different versions, so both were considered. Although hemoglobin has more subunits such as mu
and theta, they were ignored because the majority of development of hemoglobin relies on
combinations of the six main types: alpha, beta, gamma, delta, epsilon, and zeta. Clustal Omega
was used to run a multiple sequence alignment of all subunits, including both isoforms of the
gamma subunit (Fig. 1). The multiple sequence alignment revealed that all sequences share 37
identical positions and 56 similar positions; every sequence is between 142 and
147 amino acids long. The percent identity was 24.8%, which shows that the subunits of
hemoglobin have a low level of sequence conservation with one another. The phylogenetic tree showed that
the first divergence separated the alpha and zeta subunits from the others (Fig. 2). This
makes sense because these two subunits pair with all of the others over the course of human
development. A second divergence separated the epsilon and gamma subunits from
the beta and the delta subunits. This probably reflects the fact that beta and delta subunits
bind to the alpha subunits in adulthood, while the gamma and epsilon subunits bind to alpha or
zeta subunits during embryonic development. Finally, the
epsilon and gamma subunits diverged from each other, followed by the gamma subunit’s two
isoforms.
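The percent identity of a multiple sequence alignment is the number of fully conserved columns divided by the alignment length. As a minimal sketch of that calculation, identical columns can be counted directly from the aligned rows; the three short aligned sequences below are made up for illustration and are not globin sequences. The last lines check the arithmetic behind the reported value: 37 identical positions give 24.8% only if the alignment is 149 columns long, which is an inference from the two reported numbers and not a figure stated in the text.

```python
def identical_columns(aligned):
    """Count columns in which every sequence carries the same residue (gap columns excluded)."""
    length = len(aligned[0])
    if any(len(row) != length for row in aligned):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for column in zip(*aligned)
        if column[0] != "-" and all(res == column[0] for res in column)
    )


def percent_identity(aligned):
    """Identical columns as a percentage of the alignment length."""
    return 100.0 * identical_columns(aligned) / len(aligned[0])


# Hypothetical toy alignment, for illustration only.
toy = ["MVLSPADK-T", "MVHLTPEEKS", "MVHFTAEEKA"]
print(identical_columns(toy), percent_identity(toy))

# Arithmetic check on the reported numbers (alignment length is inferred).
for columns in (147, 148, 149, 150):
    print(columns, round(100.0 * 37 / columns, 1))
```

Dividing by the longest sequence length (147) instead would give 25.2%, so the choice of denominator matters when comparing identity values between studies.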
Fig 1. Multiple sequence alignment between all subunits of hemoglobin
Fig 2. Phylogenetic tree of all subunits of hemoglobin
The beta subunit is most directly involved with the sickle cell mutation. After analyzing
the protein sequence using FoldIndex and GlobProt2, FoldIndex indicated that there are no
disordered regions in the sequence (right image of Fig. 3). However, GlobProt2 indicated that
there is one disordered region roughly in the middle of the protein (left image of Fig. 3). Because
the two predictors disagree, they do not reliably indicate an overall presence of disorder in the beta subunit.
Fig 3. Intrinsic disorder prediction of the beta subunit
The alpha subunit is a reasonable subunit in which to look for intrinsic
disorder because two copies of it form the tetramer in adult Homo sapiens together with two beta
subunits. In this case, GlobProt2 predicted one disordered region in the same
position as in the beta subunit (left image of Fig. 4). FoldIndex once again showed no
disordered regions (right image of Fig. 4).
Fig 4. Intrinsic disorder prediction of the alpha subunit
The delta subunit could contain intrinsic disorder like its alpha and beta
counterparts, especially since it is still present in the adult human. GlobProt2 predicted one
disordered region similar to the one found in the previous subunits (left image of Fig. 5).
FoldIndex showed no disordered regions, consistent with the previous sequences
(right image of Fig. 5). According to GlobProt2, the delta subunit appears to have the same region of
intrinsic disorder found in the alpha and beta subunits.
Fig 5. Intrinsic disorder prediction of the delta subunit
The gamma subunit has two isoforms, so there were two results for each predictor.
GlobProt2 showed similar results for both isoforms (left images of Fig. 6 and Fig. 7), while
FoldIndex showed no disordered regions for either isoform, as in all previous results (right images of Fig. 6 and
Fig. 7). According to GlobProt2, the gamma subunit has two IDRs, one at the beginning and one
towards the end of the protein sequence. Interestingly, this is different from the alpha, beta, and
delta subunits. The gamma subunit is also present only during the fetal and embryonic stages of
hemoglobin development. Perhaps the disease process of sickle cell anemia is more sensitive during
these stages because its single codon mutation occurs in the sixth codon, towards the beginning of
the protein sequence in HBB.
Fig 6. Intrinsic disorder prediction of the first gamma subunit isoform
Fig 7. Intrinsic disorder prediction of the second gamma subunit isoform
The epsilon subunit is one of the subunits found in embryonic hemoglobin. According to
GlobProt2, it has one disordered region similar to that of the alpha, beta, and delta subunits (left image
of Fig. 8). Strangely, no globular domain structure was detected in the residues positioned before
the IDR. This might be a sign that the intrinsically disordered region influences
the region that precedes it in the sequence, which raises the question of whether it could affect the
position of the sickle cell mutation. FoldIndex showed no disordered regions, as for all
previous subunits (right image of Fig. 8).
Fig 8. Intrinsic disorder prediction of the epsilon subunit
Finally, the zeta subunit was analyzed; it is the subunit to which the epsilon and gamma
subunits bind during embryonic development of the hemoglobin protein. GlobProt2 showed that
the zeta subunit has three disordered regions spread across the protein sequence (left image of
Fig. 9). Together these cover the disordered regions of every subunit mentioned before, including
the disordered region where the sickle cell mutation normally occurs, which was also found in both
gamma isoforms. The zeta subunit might therefore have a central role in how the sickle cell mutation is
manifested during embryonic development. FoldIndex again predicted
no disordered regions, which shows that the predictor is at least consistent across subunits (right
image of Fig. 9).
Fig 9. Intrinsic disorder prediction of the zeta subunit
Structural analysis of intrinsic disorder
Although sequence-based prediction is considered the most reliable method of identifying intrinsically
disordered regions within a protein, predicted secondary and tertiary structure is also a useful
check on the sequence predictions. Models from the SWISS-MODEL Repository were
used to predict the protein structure of each subunit of hemoglobin and
to assess the reliability of the disorder predictors. In these structures, a
dark blue region indicates that the threading energy is low and that the residues are properly set
in their positions, while red indicates that the threading energy is very high and that the region of
residues is considered entropic or unsettled in its environment. Structure prediction might not
be the most reliable method, but it can still provide a useful 3-dimensional image of the
protein and represent the distribution of threading energy throughout the entire molecule. The
3-dimensional conformation of the protein structure could possibly predict intrinsically disordered
regions because the structure’s “binding-folding thermodynamics and kinetics,” which are
important for the “efficiency of realizing biomolecular function,” can be deduced from its
“global energy landscape topology” (Chu et al). Therefore, the intrinsically disordered regions of
the hemoglobin subunits could be further analyzed by the levels of threading energy shown in
the SWISS-MODEL Repository protein structures, since intrinsic disorder is characterized by
high thermodynamic energy and lack of defined structure.
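Reading "red regions" off a colored model amounts to thresholding a per-residue score and merging adjacent residues into segments. The sketch below shows that step on a made-up score profile. The scores, the threshold of 0.6, and the minimum segment length are hypothetical values chosen for illustration; they do not come from SWISS-MODEL output, and the regions reported in this section were read from the figures by eye.

```python
def high_score_regions(scores, threshold, min_length=3):
    """Return 1-based inclusive (start, end) segments whose scores exceed the threshold."""
    regions = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = position          # a new segment opens here
        elif start is not None:
            if position - start >= min_length:
                regions.append((start, position - 1))
            start = None                  # segment closed
    # Close a segment that runs to the end of the sequence.
    if start is not None and len(scores) - start + 1 >= min_length:
        regions.append((start, len(scores)))
    return regions


# Hypothetical per-residue profile (higher = less settled), for illustration only.
profile = [0.1, 0.2, 0.7, 0.8, 0.9, 0.3, 0.7, 0.2, 0.65, 0.7, 0.75, 0.8]
print(high_score_regions(profile, threshold=0.6))
```

The minimum length discards isolated residues, which mirrors the way short or blended red patches are set aside when only the most prominent regions are reported.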
The beta subunit showed one IDR around residues 48-60 (left image of Fig. 3). Although
SWISS-MODEL might not be an accurate predictor of intrinsic disorder, it still provides a
measure of the distribution of threading energy, which is essential to the biological
function and defined structure of the hemoglobin. The structure and sequence in Fig. 10
show that the most prominent red regions are around residues 37-46, 63-73, 87-100, and 142-147.
None of these overlaps the predicted IDR, so this model of the beta subunit indicates either that the
disorder prediction was not accurate or that the correlation between intrinsic disorder and entropic residues is imperfect.
Another interesting observation is that the IDR is flanked by two red areas, suggesting that
intrinsic disorder might cause lack of defined structure in surrounding regions.
Fig 10. SWISS-MODEL protein structure and threading energy of protein sequence of the beta subunit
The alpha subunit has one IDR around residue positions 48-60, similar to the beta subunit
(left image of Fig. 4). Fig. 11 has its most prominent red regions around residue positions 41-48,
58-67, 83-102, and 136-142. Here the disorder prediction agrees with the structure somewhat better
than for the beta subunit, since two red regions touch the edges of the IDR. However, the red regions
still mostly surround the IDR rather than coincide with it, as in the beta subunit (Fig. 10). This continues to support the idea that an IDR
might cause lack of defined structure or higher threading energy in surrounding regions within
the protein sequence.
Fig 11. SWISS-MODEL protein structure and threading energy of protein sequence of the alpha subunit
The delta subunit showed one IDR consisting of residues 47-60 (left image of Fig. 5). It is
roughly the same region as in the alpha and beta subunits. SWISS-MODEL showed that the most
prominent red regions of the delta subunit’s sequence are around residue positions 37-46, 88-100,
and 143-147 (Fig. 12). The delta subunit has roughly the same red regions as the alpha and
beta subunits, except that the red regions are less prominent and that the region around 58-72 is
either violet or blue. This time there is only one prominently red region adjacent to the
IDR. The delta subunit therefore appears to be less disordered and to have more defined tertiary structure than the
alpha and beta subunits. It might not even be that involved with the sickle cell mutation.
Fig 12. SWISS-MODEL protein structure and threading energy of protein sequence of the delta subunit
Both isoforms of the gamma subunit have intrinsic disorder around residues 1-7 and
140-144 (left images of Fig. 6 and Fig. 7). The SWISS-MODEL protein structure for the first
isoform has a few prominent red regions, but most of them are short or blended with blue
regions. The most prominent regions are around residues 38-47, 64-72, 88-107, and 142-147
(Fig. 13). The SWISS-MODEL protein structure for the second isoform has many violet and
weak red regions, but its most prominent red regions are roughly 38-43, 93-98, and
145-147 (Fig. 14). Neither isoform of the gamma subunit has many prominent red regions,
and both seem to have long stretches of defined tertiary structure. Their highest energy levels are in
similar locations, although the first gamma isoform has much more pronounced red coloring and
much more energy in the region between 60 and 75. Interestingly, the intrinsically disordered
region at the start of the sequence was not reflected in the structure of either isoform, but the IDR at
the very end of the sequence corresponded closely to a red region in both. The sickle cell
mutation located at the start of the sequence of hemoglobin might be influential in causing high
threading energy at the end of the sequence. This could show how IDRs can influence other
IDRs even if they are at opposite ends of the protein. Compared to the alpha and beta subunits,
the first isoform of the gamma subunit is quite similar in which regions are most
prominently red. The second gamma isoform might not share the exact residue positions for
prominent red areas, but both isoforms showed roughly similar colored regions, and the gamma 2
subunit has much less threading energy than the alpha, beta, and first gamma subunits. This
evidence also suggests that the gamma subunit must be versatile when binding to different types
of other subunits in embryonic development of hemoglobin.
Fig 13. SWISS-MODEL protein structure and threading energy of protein sequence of the first gamma subunit
Fig 14. SWISS-MODEL protein structure and threading energy of protein sequence of the second gamma subunit
The epsilon subunit has one IDR located around residues 44-60 (left image of Fig. 8). It
has similar intrinsic disorder to the alpha, beta, and delta subunits, but it lacks globular domain
structure in the first 40 to 45 residues of the sequence. In Fig. 15, the epsilon subunit has its most
prominent red regions around residues 38-46, 64-72, 90-107, and 142-146. Once again, an IDR
appears to affect the threading energy of its surrounding regions, as
noted for the alpha and beta subunits. Its most prominent red regions
resemble those of the alpha and beta subunits and of the first isoform of the gamma
subunit. Surprisingly, the first 40 to 45 residues of this sequence showed fairly stable structure, which
might indicate that GlobProt2 is not an accurate disorder predictor. However, lacking
globular domain structure does not necessarily mean that that part of the protein structure is
completely unordered.
Fig 15. SWISS-MODEL protein structure and threading energy of protein sequence of the epsilon subunit
Lastly, the zeta subunit has disordered regions around residues 1-8, 42-52,
and 133-137 (left image of Fig. 9). Its disordered regions include the site of the mutation for sickle
cell anemia, as in the gamma subunits, and reflect the disordered regions of all
other subunits (left images of Fig. 3 to 8). The strongest red regions in the zeta subunit are
roughly 40-47, 59-66, 84-102, and 133-142. Strangely, residues 1-8 were not highlighted by the
structure even though they contain the site of the sickle cell mutation, similar to the conclusion
about both isoforms of the gamma subunit. The other IDRs matched red regions, suggesting
that perhaps threading energy and lack of defined structure indicate not the site of the mutation,
but rather the most affected areas. Since the zeta subunit was suggested above to have a central role in
the embryonic development of the hemoglobin protein, the evidence points
towards the idea that the sickle cell mutation has more influence
during the development of the embryo than during other life stages. Finally, the zeta subunit seems to
resemble the alpha, beta, epsilon, and first gamma isoform subunits based on the areas of its
highest threading energy.
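The comparisons in this section can be summarized by counting, for each predicted IDR, how many of its residues fall inside a prominent red region. The sketch below does this with the residue ranges quoted in the text. The ranges are approximate readings from the figures, so the counts are indicative only, and the function and variable names are introduced here for illustration.

```python
# GlobProt2 IDRs as quoted in this section (1-based, inclusive).
IDRS = {
    "beta": [(48, 60)],
    "alpha": [(48, 60)],
    "delta": [(47, 60)],
    "gamma 1": [(1, 7), (140, 144)],
    "gamma 2": [(1, 7), (140, 144)],
    "epsilon": [(44, 60)],
    "zeta": [(1, 8), (42, 52), (133, 137)],
}
# Most prominent red (high threading energy) regions as quoted in this section.
RED = {
    "beta": [(37, 46), (63, 73), (87, 100), (142, 147)],
    "alpha": [(41, 48), (58, 67), (83, 102), (136, 142)],
    "delta": [(37, 46), (88, 100), (143, 147)],
    "gamma 1": [(38, 47), (64, 72), (88, 107), (142, 147)],
    "gamma 2": [(38, 43), (93, 98), (145, 147)],
    "epsilon": [(38, 46), (64, 72), (90, 107), (142, 146)],
    "zeta": [(40, 47), (59, 66), (84, 102), (133, 142)],
}


def residues(intervals):
    """Expand inclusive intervals into a set of residue positions."""
    return {r for start, end in intervals for r in range(start, end + 1)}


def overlap_report(idrs, red):
    """For each IDR, return (start, end, number of residues inside a red region)."""
    red_set = residues(red)
    return [
        (start, end, len(residues([(start, end)]) & red_set))
        for start, end in idrs
    ]


for subunit in IDRS:
    print(subunit, overlap_report(IDRS[subunit], RED[subunit]))
```

With these ranges the beta and delta IDRs share no residues with a red region, the alpha and epsilon IDRs overlap red regions only at their edges, and the second and third zeta IDRs overlap red regions substantially. The C-terminal IDR of the second gamma isoform (140-144) lies next to, not inside, the quoted red region 145-147.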
Fig 16. SWISS-MODEL protein structure and threading energy of the protein sequence of the zeta subunit
Posttranslational modifications of hemoglobin
Posttranslational modifications are covalent, usually enzymatic, modifications of proteins after
translation that serve several functions, such as providing the protein with a
specific function or targeting it for proteolytic cleavage. These include phosphorylation,
glycosylation, nitrosylation, ubiquitination, and others. Hemoglobin has posttranslational
modifications in every subunit examined, and a large number in several of them. The PTMs were taken
from the sequence annotations in UniprotKB.
Fig 17. Posttranslational modifications of hemoglobin subunit beta
As shown in Fig. 17, the beta subunit of hemoglobin has posttranslational
modifications at positions 2, 9, 10, 13, 18, 45, 51, 60, 67, 83, 88, 94, 121, and 145. The
valine at position 2 is an N-acetylated, glycosylated, and pyruvic acid iminylated residue.
The beta subunit is also glycosylated at positions 9, 18, 67, 121, and 145. Several
amino acids in this subunit are phosphorylated: the serine residues at positions 10
and 45 and the threonine residues at positions 13, 51, and 88. The lysine residues at positions 60,
83, and 145 are N6-acetylated. Finally, the cysteine residue at position 94 is S-nitrosylated.
According to the GlobProt2 graph in Fig. 3, the beta subunit is disordered in the middle of the
amino acid sequence between residues 50 and 60. This might indicate a correlation between the
posttranslational modifications at residues 51 and 60 and the disorder within the
corresponding region of the amino acid sequence. Since the beta subunit is the main hemoglobin subunit involved in
sickle cell anemia, the intrinsic disorder within the subunit might play a role in sickle cell
disease. The sickle cell mutation, which occurs at the 6th codon of this subunit, might be
correlated with the surrounding modified and glycosylated residues because the beginning of the
protein is the most modified region. It would not be surprising if the valine at
position 2 plays a crucial role, given its three modifications. The mutation itself is a
glutamic acid-to-valine substitution, which destabilizes the protein structure by removing
a charge. The region at the beginning of the beta subunit might be
prone to the molecular mechanism of the disease process because many of its residues are modified
and because the region is usually low in threading energy, indicating an otherwise stable
structure (Fig. 10).
Fig 18. Posttranslational modifications of hemoglobin subunit alpha
As shown in Fig. 18, the alpha subunit is phosphorylated at numerous sites, including serine
residues at positions 4, 36, 50, 103, 125, 132, and 139; threonine residues at positions 9, 109,
135, and 138; and a tyrosine residue at position 25. The second most frequent posttranslational
modification in this amino acid sequence is glycosylation, which is found at positions 8, 17, 41,
and 62. The lysine residues at positions 8, 12, 17, and 41 are also N6-succinylated. Finally, the
lysine residue at position 17 is N6-acetylated. According to the GlobProt2 graph in Fig. 4, the
alpha subunit is disordered in the middle of the amino acid sequence between residues 50 and 60.
Therefore, the posttranslational modification at position 50 could be correlated with the disorder
in this subunit. However, the IDR contains only one posttranslational modification, so no
correlation can be reliably deduced.
Fig 19. Posttranslational modifications of hemoglobin subunit delta
The delta subunit has only one posttranslational modification, which is a phosphorylated
serine residue at position 51 (Fig. 19). Another posttranslational modification occurs
in the Niigata variant of this subunit, which is an N-acetylated alanine residue at position 2.
According to the GlobProt2 graph in Fig. 5, the delta subunit is disordered in the middle of the
amino acid sequence between residues 50 and 60. Therefore, the posttranslational modification at
position 51 could be correlated with the disorder in this subunit, but a single site is not strong
evidence of a correlation.
Fig 20. Posttranslational modifications of hemoglobin subunit gamma 1
The only posttranslational modification found in hemoglobin subunit gamma 1 is the
N-acetylation of glycine at position 2 (Fig. 20). According to the GlobProt2 graph in Fig. 6, the
gamma 1 subunit is disordered in the beginning of the amino acid sequence, between the N-terminus
and residue 5. Therefore, the posttranslational modification at position 2 could be correlated with the
disorder in this subunit. Once again, a single site cannot establish a correlation, although acetylation
has been shown to have an effect on protein stability.
Fig 21. Posttranslational modifications of hemoglobin subunit gamma 2
The only posttranslational modification in the gamma 2 subunit is N-acetylation of
glycine at position 2, as in the gamma 1 subunit (Fig. 21). According to the GlobProt2 graph
in Fig. 7, the gamma 2 subunit is disordered in the beginning of the amino acid sequence,
between the N-terminus and residue 5. Therefore, the N-acetylation of glycine at position 2 could be
correlated with the disorder in this subunit, since acetylation does play a role in protein structure
stability.
Fig 22. Posttranslational modifications of hemoglobin subunit zeta
The hemoglobin zeta subunit has only one posttranslational modification site, which is
the N-acetylated serine residue at position 2 (Fig. 22). Similar to the gamma subunits, zeta has a
disordered region located in the beginning of the amino acid sequence, specifically
between residues 1 and 8 (left image of Fig. 9). Therefore, the N-acetylation of serine at position
2 could be correlated with the intrinsic disorder of the zeta subunit.
Fig 23. Posttranslational modifications of hemoglobin subunit epsilon
The hemoglobin epsilon subunit has three phosphorylated amino acid residues:
two serine residues at positions 45 and 51 and a threonine at position 124. It also has two
N6-succinylated lysine residues at positions 18 and 60 (Fig. 23). Finally, the valine residue at
position 2 is N-acetylated. According to the GlobProt2 graph in Fig. 8, the epsilon subunit is
disordered in the middle of the amino acid sequence between residues 45 and 60. Therefore, the
posttranslational modifications at positions 45, 51, and 60 could be correlated with the disorder
in this subunit. Phosphorylation of the serine residues and N6-succinylation of the lysine
most likely result in changes in the protein structure and function, possibly contributing to the
intrinsic disorder.
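The site-by-site comparisons in this section reduce to asking which modified positions fall inside the GlobProt2 disordered range of each subunit. The sketch below collects the positions and ranges quoted above and lists the modified sites inside each range. Where the text gives a range starting at residue 0, the sketch starts at residue 1; that adjustment, and the names used, are assumptions made for the example.

```python
# Modified positions and GlobProt2 disordered ranges as quoted in this section.
SUBUNITS = {
    "beta": {"idr": (50, 60),
             "ptm_sites": [2, 9, 10, 13, 18, 45, 51, 60, 67, 83, 88, 94, 121, 145]},
    "alpha": {"idr": (50, 60),
              "ptm_sites": [4, 8, 9, 12, 17, 25, 36, 41, 50, 62,
                            103, 109, 125, 132, 135, 138, 139]},
    "delta": {"idr": (50, 60), "ptm_sites": [51]},
    "gamma 1": {"idr": (1, 5), "ptm_sites": [2]},
    "gamma 2": {"idr": (1, 5), "ptm_sites": [2]},
    "zeta": {"idr": (1, 8), "ptm_sites": [2]},
    "epsilon": {"idr": (45, 60), "ptm_sites": [2, 18, 45, 51, 60, 124]},
}


def sites_in_idr(idr, ptm_sites):
    """Return the modified positions that lie inside the inclusive IDR range."""
    start, end = idr
    return [site for site in ptm_sites if start <= site <= end]


for name, data in SUBUNITS.items():
    inside = sites_in_idr(data["idr"], data["ptm_sites"])
    print(f"{name}: {len(inside)} of {len(data['ptm_sites'])} modified sites in IDR {inside}")
```

Counting sites this way makes the weakness of the single-site cases explicit: in the alpha subunit, for example, only 1 of 17 modified positions lies in the disordered range, whereas in the epsilon subunit 3 of 6 do.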
Biochemical interactions with protein partners
Hemoglobin’s subunits have multiple interactions with a wide variety of other proteins.
These protein interactomes were obtained from STRING, a database which
builds functional protein association networks to help determine the function of the selected
protein. The interactomes are important because they could show the role of each subunit of
hemoglobin and how each one is associated with other proteins. Functional genomics could
provide a better answer as to how the disease process of sickle cell anemia can alter
hemoglobin’s function and how the disease might correlate with intrinsic disorder.
Fig 24. Biochemical interaction network of hemoglobin subunit beta
As shown in Fig 24, the beta subunit of hemoglobin interacts with approximately 17
different proteins, six of which are hemoglobin subunits HBA1 (hemoglobin alpha 1), HBA2