Relaxed selection in erythropoietic gene hemogen among high-latitude Antarctic notothenioids
by Carmen M. Elenberger
B.A. in Anthropology, University of Florida
A thesis submitted to
The Faculty of
the College of Science of
Northeastern University
in partial fulfillment of the requirements
for the degree of Master of Science
December 12, 2018
Thesis directed by
H. William Detrich
Professor of Biochemistry and Marine Biology
Copyright 2018
Carmen Elenberger
Acknowledgements
First and foremost, I would like to thank my advisor, Dr. H. William Detrich, for his guidance
and his support over the past four years. He challenged me to broaden my horizons and gave me
the opportunity to travel to the ends of the earth in order to do so. I would also like to thank Dr.
Thomas Desvignes, as well as Laura Goetz and Sierra Smith, for their assistance in conducting
field work for this project. I would like to extend further thanks to Dr. Jacob Daane for
permitting me to use his unpublished data to expand my analyses. Many thanks to Biology Open
for allowing me to reproduce their figure with permission [1].
I would like to thank my committee members, Dr. A. Randall Hughes and Dr. Steve
Vollmer, for their interest in my research and their advice in analyzing and framing the results of
my research. I would also like to thank my labmate, Dr. Michael Peters, and our lab manager,
Sandra Parker, for their advice, assistance, and encouragement over the years. Additionally, I
would like to thank the faculty and staff of the Marine Science Center, as well as the funding
sources for this research. Special thanks to the staff of Palmer Station and the crew of the
Laurence M. Gould for a productive and memorable field season. Finally, I would like to thank
my friends and family for their unwavering support and encouragement, now and always.
Abstract of Thesis
Antarctic icefish (Channichthyidae) are the only vertebrate taxon with an erythrocyte-null
phenotype, and present an interesting model for studying the evolution and regulation of
erythropoiesis. The gene hemogen encodes a protein that plays a role in
regulating erythropoietic processes in vertebrates. hemogen may have been impacted
by the loss of globin expression. I investigated possible relaxed selection at the hemogen locus
by looking for evolutionary change to the regulatory elements or segments encoding the
Hemogen protein, and assessed the evolutionary processes that drove hemogen variation among
Antarctic notothenioids. While regulatory mechanisms remain intact, icefish show a significant
90bp indel in exon 3 of hemogen that would disrupt conserved modules in the Hemogen protein
that are critical for erythropoiesis. Despite this, hemogen remains expressed at low levels in
adult icefish and possesses a novel splice variant that encodes a truncated protein possibly
serving as a dominant negative for wild-type Hemogen. I conclude that while hemogen has
undergone relaxed selection and accumulated mutations that would impact erythropoietic
function in non-Antarctic fish, the observed mutations may be tolerated due to erythrocyte and
hematocrit modifications in notothenioid blood phenotypes. hemogen may have a decreased—
but still important—role to play in icefish, possibly functioning as a dominant negative for
hemogen’s role in erythropoiesis.
Table of Contents
Acknowledgements 3
Abstract of Thesis 4
Table of Contents 5
List of Tables 6
List of Figures 7
List of Abbreviations 9
Introduction 11
Methods 15
Results 21
Discussion 29
Tables and Figures 40
References 67
List of Tables
1 Primers used in PCR and qRT-PCR reactions to amplify hemogen gDNA and cDNA in
Antarctic notothenioids (pg 40)
2 Species sequenced and included in study of Antarctic notothenioid hemogen (pg 41)
3 Codon usage bias for hemogen (total coding sequence) among Antarctic notothenioids
(pg 42)
4 Mean pairwise dN/dS for within-family comparisons of Antarctic notothenioid families
(pg 43)
5 Mean pairwise dN/dS for between-family comparisons of Antarctic notothenioid families
(pg 44)
6 Results of codon-based site tests conducted in CodeML on the Antarctic radiation (pg 45)
List of Figures
1 Zebrafish Si:dkey-25o16.2 and human Hemogen are orthologous and encode related
proteins that differ in size (pg 46)
2 Icefish transcript variants for hemogen and their putative effects on translation illustrated
in representative species Champsocephalus gunnari (pg 48)
3 Maximum likelihood tree used to test for positive selection on the branch leading to the
Antarctic notothenioid clade (pg 50)
4 Maximum likelihood tree used in site-tests for positive/pervasive selection among
Antarctic notothenioids (pg 51)
5 RELAX tree shows relaxed selection on the branches containing Bathydraconidae and
Channichthyidae, demonstrating a trend of relaxed selection in hemogen on the way to
the erythrocyte-null phenotype (pg 53)
6 Gene structure and size remain conserved among red-blooded and white-blooded
notothenioids, including regulatory regions conserved among teleost fish (pg 54)
7 Conservation of conserved non-coding elements CNE1 and CNE2 in Antarctic
notothenioids relative to Gasterosteus aculeatus and Danio rerio (pg 56)
8 hemogen exon 3 deletions in representative species from Channichthyidae relative to a
red-blooded notothenioid, and their predicted effects on transcription and translation (pg
57)
9 Variant forms of hemogen “exon 3” deletion mapped onto the Channichthyidae species
tree (pg 59)
10 hemogen indels in Antarctic notothenioids mapped onto a maximum parsimony tree (pg
60)
11 Pairwise dN/dS comparisons plotting total dN/dS of whole Hemogen-encoding sequence
with the dN/dS values for the N-terminus and C-terminus of notothenioid Hemogen,
within families Nototheniidae (A & B) and Channichthyidae (C & D). (pg 62)
12 Pairwise dN/dS trends between families Nototheniidae and Channichthyidae, plotting
whole-Hemogen dN/dS vs the N-terminus (A) or C-terminus (B). (pg 63)
13 qPCR quantification of hemogen transcript variants in representative icefish species C.
aceratus and C. gunnari, comparing adult head kidney hemogen expression with N.
coriiceps adult head kidney for both hemgn-L and hemgn-s splice variants (pg 64)
14 Changes to the bipartite nuclear localization signal in icefish (Champsocephalus gunnari)
relative to red-blooded notothens (Notothenia coriiceps). (pg 66)
List of Abbreviations
aa amino acid
bp base pair
CAI Codon Adaptation Index
cDNA complementary DNA
CNE conserved non-coding element
dN nonsynonymous mutation rate
DNA deoxyribonucleic acid
dN/dS ratio of nonsynonymous to synonymous mutation rates
dS synonymous mutation rate
EDAG erythroid differentiation-associated gene
GATA1 GATA-binding protein 1
gDNA genomic deoxyribonucleic acid
HoxB4 homeobox B4
KLF4 Krueppel-like Factor 4
-lnL negative log likelihood
MMCT Middle Miocene Climate Transition
MRCA most recent common ancestor
Mya million years ago
Myb MYB Proto-Oncogene, Transcription Factor
NLS nuclear localization signal
p300 histone acetyltransferase p300
PCR polymerase chain reaction
qPCR quantitative polymerase chain reaction
RNA ribonucleic acid
Sox9 transcription factor SOX-9
UTR untranslated region
INTRODUCTION
Cold-driven evolution of the Antarctic notothenioid lineage began roughly 46 Mya [2]
concurrent with the emergence of the Drake Passage (55-41 Ma) [3] and the initial formation of
the Antarctic Circumpolar Current [4]. The development of antifreeze glycoproteins [5, 6]
permitted colonization and persistence in the Southern Ocean [7] and set the stage for further
diversification during successive cooling periods and accompanying geological events. The
radiation of the high latitude Antarctic notothenioids (Cryonotothenioidea) occurred during a
period of diversification driven by intensified cooling of the Southern Ocean during the Middle
Miocene Climate Transition (MMCT) [7, 8], with species diversification beginning ~14 Mya and
accelerating ~11 Mya during the Late Miocene [7, 9-11]. Cooling during the MMCT led to
contemporary Antarctic conditions (-2℃ to +2℃) and resulted in the scouring of continental
shelves by ice [12, 13]. This opened ecological niches for colonization by removing
more temperate-adapted competitors [14] and led to rapid morphological and ecological
diversification [15]. Present-day Antarctic notothenioids comprise 77% of Antarctic teleost
diversity and constitute a marine species flock [16] derived via adaptive radiation [17-19]. High
levels of morphological diversity and intense speciation make Antarctic notothenioids a useful
evolutionary model for studying cold adaptation.
Antarctic notothenioids possess a number of remarkable changes to erythropoiesis and
the oxygen-transport system at large that resulted in the evolution of the only known vertebrate
clade devoid of erythrocytes—the family Channichthyidae, characterized by a “white-blooded”
phenotype [20]. It has been hypothesized that the high oxygen concentration in polar seawater
could lead to potential relaxed selection on erythrocytes and other oxygen-binding pigments, as
hypoxic stress becomes less of a relevant factor with oxygen in such high abundance [21].
Evidence for such relaxed selection can be seen in changes to blood content: a study of “red-
blooded” Antarctic species from McMurdo Sound showed decreased numbers of erythrocytes,
lowered hematocrit, and lowered hemoglobin concentrations when compared with temperate fish
[22]. General trends throughout the radiation show that the more derived the family, the fewer
erythrocytes present in circulating blood and the lower the hemoglobin content [21]. Both red-
blooded and white-blooded notothenioid fish show reduced hematocrit, which is potentially an
adaptive feature to contend with the increased viscosity of blood under low temperatures [23,
24]. Hemoglobin multiplicity is reduced among notothenioids relative to temperate fish [25-27]
and cold anemia responses became genetically assimilated [28-30]. At some point notothenioid
dependence on hemoglobin for respiration became so reduced that even red-blooded fish could
continue to absorb and utilize oxygen effectively in the presence of carbon monoxide [31],
suggesting that the stage had been well set for hemoglobin loss before it disappeared.
Channichthyidae are characterized by loss of the vertebrate oxygen-transport molecule,
the α2β2 hemoglobin tetramer carried within erythrocytes. This occurred in the most recent
common ancestor (MRCA) of all icefish via large genomic lesions within the respective loci [25,
32-36]. Furthermore, there have been multiple, independent losses of myoglobin during
diversification [36]. Icefish possess few erythroblasts, and their blood contains mostly leukocytes
and plasma [35]. The evolution of the white-blooded phenotype is unique among vertebrates and
has far-reaching consequences for the cardiovascular system and key globin partners. As a result
of hemoglobin loss, we would anticipate changes to the genetic machinery involved in red blood
cell production and maintenance, as selective constraints on this may relax in the absence of key
globin partners. It is also possible that this began somewhere within the red-blooded families, as
oxygen transport molecules became less necessary for survival. Relaxed selection in the
regulatory regions of globin has been detected among dragonfish, prior to the emergence of a
white-blooded phenotype [37].
The gene hemogen has been identified as an interesting candidate for further study in
notothenioid fish, given evidence based on subtraction libraries that expression may be impaired
or entirely absent in icefish. The hemogen gene encodes the transcription factor Hemogen (Fig
1), which acts as a regulator in hematopoietic development by stimulating the differentiation of
hematopoietic cells into both the erythroid and megakaryocytic lineages [38-43]. In teleost fish,
Hemogen is encoded by four exons and contains domains similar to those predicted in the human
ortholog: a coiled-coil domain, a bipartite nuclear localization signal, a series of tandem repeats, and an
acidic domain (Fig 1) [1, 38]. It is promoted via two conserved non-coding elements, one
proximal and one distal, both critical for promoting primitive erythropoiesis (Fig 1) [1].
Hemogen also plays a role in cell apoptosis [39] and has been implicated in the regulation of
tumor cells in acute myeloid leukemia [44]. Other possible roles include spermatogenesis [45],
sex-determination [46], and osteoblast recruitment and bone calcification [47-49]. Research
shows that Hemogen’s role in hematopoiesis takes place via interactions with a number of key
proteins involved in erythropoiesis and development, including GATA1 and p300. GATA1 is
critical for erythroid differentiation [50-52] and functions in both primitive and definitive
hematopoiesis [53]. Nonsense mutations in GATA1 lead to a “bloodless” phenotype [54].
GATA1 recruitment is crucial for hemogen function and downregulation of hemogen expression
inhibits GATA1 activity [40, 43], while GATA1 recruits hemogen to the beta-globin locus [55].
p300 is crucial for cell differentiation [56, 57] and inhibition of p300 binding to Hemogen causes
decreased production of erythroid cells. Hemogen facilitates the interaction between GATA1 and
p300, making it a critical part of the erythroid differentiation process [55].
Decreased hemogen expression in white-blooded fish may indicate functional loss. Given
the decreasing importance of red blood cells to the notothenioid lineage, selective constraints on
known regulators of erythrocyte production may have relaxed prior to complete globin loss.
Hemogen interacts with beta-globin and regulates erythroid production, raising the possibility
that erythropoietic features may be aberrant in icefish. However, hemogen demonstrates
pleiotropy, as described in the previous paragraph, and lists of potential partners implicate it in a
number of important cellular processes beyond erythropoiesis. Therefore, at least some features
must remain conserved in order to carry out non-erythropoietic roles.
In this thesis, I characterize hemogen genes in both red-blooded and white-blooded
Antarctic notothenioids and compare them with sub-Antarctic perciform outgroups to establish
hemogen’s history within this clade. I hypothesize that the hemogen locus is undergoing relaxed
selection among the icefish, and that relaxation of selective constraints began prior to the
emergence of Channichthyidae. I investigated partial conservation of the hemogen gene,
hypothesizing that pleiotropy would protect against total pseudogenization of hemogen. Features
under relaxed selection would be implicated in erythropoietic function and could be considered
targets for further study of hemogen in erythropoiesis. I hypothesize some level of differential
expression between white-blooded and red-blooded fish; if not complete loss of expression, then
loss in certain tissues or of certain key isoforms in Channichthyidae.
My results show a strong trend towards relaxed selection in high-latitude Antarctic
notothenioids relative to Sub-Antarctic relatives, with icefish showing intensified relaxation.
Confirmation of relaxed selection among red-blooded fish supports the theory that the decreased
dependence on erythrocytes in notothenioid fish also correlates with larger-scale changes in the
erythropoietic paradigm on the genomic level. Three out of four key functional domains show
some form of degradation, either via nonsynonymous mutation or through the
transcriptional/translational impacts of indels on icefish hemogen. Three key evolutionary events
took place in the MRCA of all extant icefish: the degradation of the bipartite NLS, a 30aa loss in
a proline-rich region of tandem repeats, and the development of a novel splice form, hemgn-s,
which excludes all functional domains encoded by exons 3 and 4 and theoretically results in a
frameshifted and truncated Hemogen protein. However, key promoter regions remain conserved
in icefish, and while expression is down-regulated in adult tissues relative to red-blooded
species, hemogen is still expressed in adult tissues of some icefish. This suggests that while the
decreased importance of erythropoietic functions may have significantly relaxed pressure on
hemogen and resulted in mutations impacting domains critical for erythropoietic function, it is
not necessarily non-functional and may still be playing a decreased but critical role in other
processes.
METHODS
Sample collection & sequencing of notothenioid hemogen gDNA
The primary source of genomic material came from tissues obtained by the Detrich Lab
during the 2012, 2014 and 2016 winter fishing cruises conducted by the Research Vessel
Laurence M. Gould near Palmer Station, Antarctica. Tissues were flash-frozen in liquid nitrogen
and then stored at -80℃. I generated sequences from 1-5 individual fish per species.
Molecular methods for gDNA extraction from tissues are as specified in the Quick-gDNA
miniprep kit (Zymo Research, D3024). Full notothenioid hemogen—from start codon to the 3’
UTR—was amplified by PCR from gDNA samples using 1 µM primers (Table 1) designed from
previously obtained Notothenia coriiceps sequences. The amplification protocol was as
follows—35 cycles of 98°C for 10 s, 59°C for 10 s, 72°C for 1 min. PCR products were cloned
into the pGEM T-easy vector (Promega, A1360), and recombinant clones were transformed into
DH5α competent cells (New England Biolabs, C2987H). Recombinant plasmids were identified
using blue/white screening, purified via the Wizard Plus Miniprep DNA Purification System
(Promega, A7500), and sequenced by GeneWiz. I obtained full genomic sequences for 18
notothenioid species (Table 2, Figure S1).
Cloning and sequencing of notothenioid cDNAs
I isolated total RNA from flash-frozen tissues of adult N. coriiceps and C. aceratus using
the RNeasy Mini Kit (Qiagen, 74104). Several potential hemogen transcripts had been
previously identified by other Detrich Lab members (Figure 2). To expand upon these results,
RNA samples were prepared from ten tissues: liver, spleen, head kidney, trunk kidney, white
muscle, pectoral red muscle, testes, brain, heart ventricle and gill. Total cDNA was produced
from the mRNA using M-MuLV reverse transcriptase and an oligo(dT)23 primer according to the
protocol outlined in the ProtoScript II First Strand cDNA Synthesis kit (NEB, E6560S). cDNA
was amplified via PCR using the same primers as gDNA PCR (Table 1) according to the
following protocol: 35 cycles of 98°C for 10 s, 59°C for 10 s, and 72°C for 45 s. cDNA was then
cloned into pGEM T-easy vector and subsequently transformed and purified as outlined for
gDNA sequences.
Construction of genomic, coding and protein alignments for gene characterization,
phylogenies and evolutionary analysis
Nucleic acid alignments were constructed using MUSCLE [58] as implemented in
MEGA7 [59], with a gap opening penalty of 15 and gap extension penalty of 6.66. Alignments
were subsequently inspected and adjusted by eye in BioEdit [60]. Construction of gene trees and
evolutionary analysis relied primarily on three alignments: a gDNA alignment with all exons +
introns; a coding alignment based on cDNA sequences, transcriptome data and concatenated
exome data; and a protein alignment, translated from the coding sequences in MEGA7 [59].
The cDNA sequences that I generated were supplemented with hemogen cDNAs from
transcriptomic analyses of Pseudochaenichthys georgianus [unpublished data from Detrich lab]
and Parachaenichthys charcoti [unpublished data from Detrich lab], and aligned with my
genomic sequences to generate coding sequences for other notothenioids. Additionally, cDNA
and transcriptome sequences also served as a basis for alignment and quality control for
sequences obtained via an exome-capture analysis [unpublished] conducted by Dr. Jacob Daane
of the Detrich lab. A full list of all species included and the sequence sources can be found in
Table 2. In total 43 species representing Antarctic notothenioids from all high-latitude families
(Artedidraconidae, Bathydraconidae, Channichthyidae, Harpagiferidae, Nototheniidae) as well
as 7 Sub-Antarctic outgroups were included in evolutionary analyses.
Analysis of positive, pervasive and relaxed selection on Antarctic notothenioid hemogen
All trees were constructed in RAxML [61, 62] using nucleotide substitution model
GTRGAMMA to conduct an initial tree search of 20 trees and select the best tree from this pool.
No outgroups were specified. Branch tests were conducted using the CodeML module included
in PAML 4.0 [63, 64].
Two tests for positive selection were run. The first was conducted using branch-site
models [65, 66] on a subset of coding sequences (SFigure 2) to search for possible positive
adaptation in the Antarctic clade relative to several notothenioid outgroups. The branch leading
to the representative Antarctic notothenioids was specified a priori as the foreground branch
(Figure 3). The null model set NSsites = 2, fix_omega = 1, and omega = 1. This assumes two
categories of sites (purifying and neutral selection) and looks for a difference in proportions of
sites undergoing neutral selection on the foreground branch relative to the background. The
positive/alternative model set NSsites = 2, fix_omega = 0, and omega = 1, which allows for three
categories of sites (purifying, neutral, and positive selection) and looks to identify sites
undergoing positive selection on the foreground relative to the background branch. If the
alternative model is accepted over the null, this indicates a site has undergone episodic positive
selection (changed once, then retained in the clade).
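The null and alternative runs described above differ only in how omega is treated for the foreground branch. A minimal sketch of how the two sets of CodeML control settings might be generated is given below. The file names are hypothetical, and `seqtype = 1` and `model = 2` are assumptions (settings commonly used for codon data and branch-specific models) that are not stated in the text; only NSsites, fix_omega and omega are taken from the description above.

```python
def branch_site_ctl(fix_omega, outfile):
    """Render a minimal CodeML control file for one branch-site run."""
    settings = {
        "seqfile": "hemogen_coding.phy",   # hypothetical file name
        "treefile": "hemogen_marked.tre",  # hypothetical; tree with the foreground branch marked
        "outfile": outfile,                # hypothetical file name
        "seqtype": 1,                      # assumption: codon sequences
        "model": 2,                        # assumption: branch-specific omega classes
        "NSsites": 2,                      # from the text
        "fix_omega": fix_omega,            # 1 = null model, 0 = alternative model
        "omega": 1,                        # from the text
    }
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


null_ctl = branch_site_ctl(fix_omega=1, outfile="null.out")
alt_ctl = branch_site_ctl(fix_omega=0, outfile="alt.out")
print(null_ctl)
print()
print(alt_ctl)
```

Keeping every other setting identical between the two files is what makes the two likelihoods comparable in a likelihood ratio test.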
The second test relied on codon-substitution site models [67, 68] to detect pervasive
positive selection among Antarctic notothenioids using the coding sequences (SFigure 2). This
would identify any possible sites which changed repeatedly throughout diversification of the
clade, possibly as a result of differing adaptive challenges related to the modification of the
hematic system. Models M0, M1a, M2a, M3, M7, and M8 were run by setting NSsites = 0 1 2 3
7 8 (respectively), fix_omega = 0, and omega = 1. Model M8a set NSsites = 8 but set fix_omega
= 1 and omega = 1. The submitted gene tree for the site tests can be found in Figure 4.
A test for relaxed selection in the branch leading to Channichthyidae was conducted using
RELAX [69] as part of the HyPhy suite of hypothesis testing software [70]. RELAX conducts a
comparative test of whether an a priori specified branch or subset of branches has undergone
relaxed or intensified selection relative to the rest of the tree. This makes it useful for
identifying trends and/or shifts in the stringency of natural selection on a given gene, provided
one has an idea of where that should occur. The branches ending in Channichthyidae and
Bathydraconidae were selected as the test branches, with all others used as reference branches.
The reference tree used was the putative species tree of Daane [unpublished]. The test was run
on the Datamonkey server [71, 72].
Bioinformatic comparison of notothenioid hemogen promoters and coding domains
Regulatory regions from Eleginops maclovinus, N. coriiceps and Chaenocephalus
aceratus were sequenced based on the annotations for the N. coriiceps genome (NCBI
Accession: PRJNA66471, ID: 66471) [73]. gDNA sequences were aligned to N. coriiceps and C.
aceratus scaffolds via BLAST in Geneious (v. 10.0.5) [74] to determine whether notothenioid
species possess conserved synteny around the hemogen locus as observed in other vertebrate
species [1]. Scaffold sequences were confirmed by sequencing from the upstream (anp32b) and
downstream (TRMO) genes towards hemogen. Promoter alignments for hemogen were obtained
using the whole genome alignments for D. rerio and Gasterosteus aculeatus (ENSEMBL v94)
[75]. Transcription factor binding sites were predicted with ConTra v2 using a similarity matrix
of 0.75 [76]. Protein domains were identified based on annotations from human [38] and
zebrafish hemogen [1].
Parsimony gene tree and deletion mapping
A hemogen gene tree was built using coding sequences and the maximum parsimony method
[77] in MEGA7. Gaps were treated as partial deletions with site coverage set to 90%. This
allowed for the inclusion of sites where a majority of species possessed sequence data but one
species (or genus) possessed a phylogenetically informative indel. The tree included 1st, 2nd, and
3rd codon positions and was computed using the Subtree-Pruning-Regrafting method, beginning
with 10 trees and retaining 100 trees. Following 1000 bootstrap iterations the best tree was
selected based on comparison with known species phylogenies. The phylogeny was edited to
include indel information using ggtree in R [78] and the Interactive Tree of Life (iTOL) v3 [79].
I ran the tree topology in CodeML [63, 64] using the M0 model (model = 0, NSsites = 0) [67] to
obtain the number of nucleotide substitutions per codon (dN+dS) as well as dN, dS, and dN/dS
for the whole tree.
An icefish species tree was constructed based on the species tree built from the exome
data of Daane [unpublished] with modifications derived from available Channichthyidae
phylogenies [80, 81].
Pairwise dN/dS comparisons
Pairwise dN/dS values were generated using a subset of the coding alignment (SFigure 2)
and were run in PAML4 using yn00 [64]. yn00 calculates rates based on the method outlined in
Nielsen & Yang 2000 [82] and allows for codon usage bias as well as transition-transversion rate
differences. To assess codon usage bias in notothenioid hemogen, I used DnaSP v5 [83, 84] to
measure codon usage bias via the codon adaptation index (CAI/CBI) [85, 86]. Values for CAI
are shown in Table 3; the values fall within a range of 0.3-0.4 for all species, which represents
moderate codon usage bias (low bias < 0.3 and high > 0.5).
All forty-three high-latitude notothenioid species were included in this analysis (Table 2).
I examined two kinds of evolutionary relationships: within-family comparisons (e.g., icefish vs
icefish) and between-family pairwise comparisons (e.g., Channichthyidae vs Nototheniidae). For
each kind of comparison I ran the data with three partitions: the total protein coding sequence,
the coding sequence for the N-terminus only (1-79 aa, which represents the end of the bipartite
NLS), and the coding sequence for the C-terminus only (80 aa—end). This allowed for a more
nuanced analysis of the selective forces at work on different parts of the gene as well as within
different clades and is derived from work done parsing geographic effects on cichlids and
positive selection in notothenioids [87, 88].
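The N-terminus/C-terminus partitioning described above can be sketched as a simple split of an in-frame coding sequence at codon 79. This is only an illustration under stated assumptions: the function name and the toy sequence are hypothetical, and in a gapped alignment the cut would have to be placed at the alignment column corresponding to residue 79 rather than at a fixed nucleotide offset.

```python
NLS_END_CODON = 79  # last codon of the N-terminal partition, as given in the text


def split_coding_sequence(cds, boundary_codon=NLS_END_CODON):
    """Split an ungapped, in-frame coding sequence into N- and C-terminal parts."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    cut = boundary_codon * 3  # nucleotide index where the C-terminal partition starts
    return cds[:cut], cds[cut:]


toy_cds = "ATG" + "GCA" * 99  # hypothetical 100-codon sequence, not real hemogen data
n_term, c_term = split_coding_sequence(toy_cds)
print(len(n_term) // 3, len(c_term) // 3)  # prints: 79 21
```

Each partition can then be written to its own alignment file and analyzed separately, alongside the full coding sequence.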
qPCR
Previous qualitative PCR I conducted on C. aceratus cDNA established general
presence/absence of hemogen expression in several adult tissues—liver, head kidney, trunk
kidney, spleen, and brain—and isolated the predominant isoforms of hemogen expression in
Channichthyidae (Figure 2). qPCR experiments utilized cDNA samples (outlined in the preceding
section) taken from tissues of adult icefish (Champsocephalus gunnari and C. aceratus). The
experiment was designed to verify and quantify hemogen expression in multiple icefish species
and compare isoform expression in adult tissues. Target transcripts were amplified from cDNA
using 1 µM primers (Table 1). Targets were amplified in triplicate. Expression was normalized
to beta-actin as the endogenous control for the ΔΔCt method [89]. Standard curves were generated
to assess the primer efficiencies. qPCR was performed on a QuantStudio 3 thermocycler with
QuantStudio Design and Analysis Software.
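For readers unfamiliar with the ΔΔCt calculation, a minimal sketch follows. All Ct values are invented for illustration and are not data from this study, the function and variable names are hypothetical, and the formula assumes amplification efficiencies close to 100% for both primer pairs (the reason standard curves are checked).

```python
from statistics import mean


def fold_change(target_sample, reference_sample, target_calibrator, reference_calibrator):
    """Relative expression by the ddCt method, from lists of replicate Ct values."""
    delta_sample = mean(target_sample) - mean(reference_sample)
    delta_calibrator = mean(target_calibrator) - mean(reference_calibrator)
    delta_delta_ct = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical triplicate Ct values (illustration only).
icefish_hemogen = [28.0, 28.1, 27.9]
icefish_actin = [18.0, 18.0, 18.0]
red_blooded_hemogen = [25.0, 25.1, 24.9]
red_blooded_actin = [18.0, 18.0, 18.0]

ratio = fold_change(icefish_hemogen, icefish_actin, red_blooded_hemogen, red_blooded_actin)
print(f"relative expression: {ratio:.3f}")
```

In this toy case the sample is three cycles later than the calibrator after normalization, which corresponds to roughly one-eighth of the calibrator's expression.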
RESULTS
Branch test does not detect positive selection in Antarctic notothenioids
The test attempted to determine whether adaptive positive selection has occurred in
Antarctic notothenioid Hemogen in comparison to Sub-Antarctic relatives/perciform outgroups
(Figure 3) using the branch-site method. The alternative model testing for positive selection
returned –lnL = -3135.34323 with np = 20. The null model (no positive selection) gave
–lnL = -3135.34325 with np = 19. The likelihood ratio test yielded a value of 0.00039 with df = 1, with p
= 0.9842. The test for positive selection was not significant, and the evolutionary change
observed in Antarctic notothenioids relative to other teleosts is not likely to be adaptive change
driven by positive selection.
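The p-value above follows from comparing the likelihood ratio statistic with a chi-square distribution with one degree of freedom. A minimal sketch using only the Python standard library is shown below; the function names are hypothetical, the log-likelihoods passed to `lrt_statistic` are made-up numbers, and the statistic passed to `chi2_sf_df1` is the value reported in the text.

```python
import math


def lrt_statistic(lnl_alternative, lnl_null):
    """Likelihood ratio statistic: twice the difference in log-likelihoods."""
    return 2.0 * (lnl_alternative - lnl_null)


def chi2_sf_df1(statistic):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


example_statistic = lrt_statistic(-100.0, -102.5)  # hypothetical log-likelihoods
p_value = chi2_sf_df1(0.00039)                      # statistic as reported in the text
print(example_statistic, round(p_value, 4))
```

The closed form with `math.erfc` is valid only for df = 1; comparisons with more degrees of freedom need a general chi-square survival function.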
Site test results do not detect pervasive adaptive change within the high-latitude Antarctic
notothenioid radiation
Model M0 gives fundamental statistics about the base composition of the tree and
measures mutational rates over all sequences. The tree used for site tests was generated from
hemogen coding sequences via the maximum likelihood method (Figure 4). dN summed over the
entire tree = 0.3485, while dS = 0.2974, giving omega dN/dS = 1.17184. While these mutational
rates are low, this ratio would be consistent with relaxed purifying selection on hemogen,
although it cannot definitively differentiate between relaxed and positive selection.
Site tests yielded three tests to detect positive selection (M1a-M2a, M7-M8, M8a-M8)
with p < 0.05 (Table 6). All of these models measure positive selection, with M8a-M8 being the
most robust and reliable. That all returned significant p-values would be indicative of pervasive
positive selection at specific sites throughout the diversification of Antarctic notothenioids.
However, the data violate a critical assumption of the site test of positive selection: dS summed
over all branches is < 0.5, which indicates insufficient sequence divergence among the species
tested. As a result of this low sequence divergence, the codon-based test is insufficiently robust
and cannot reliably measure changes to selective pressure at different sites. No pervasive positive
selection can be inferred from these results.
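The two checks used in this section, the tree-wide dN/dS ratio and the divergence threshold, amount to simple arithmetic. The sketch below uses the dN and dS values reported above and the 0.5 threshold cited in the text; variable names are hypothetical. The ratio computed from the printed values differs from the reported omega in the fifth decimal place, presumably because the printed dN and dS are rounded.

```python
tree_dn = 0.3485   # dN summed over the tree, from the text
tree_ds = 0.2974   # dS summed over the tree, from the text
MIN_TREE_DS = 0.5  # divergence threshold cited in the text

omega = tree_dn / tree_ds
enough_divergence = tree_ds >= MIN_TREE_DS

print(f"omega = {omega:.4f}")
print(f"sufficient divergence for site tests: {enough_divergence}")
```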
Relaxed selection in Channichthyidae relative to other Antarctic notothenioid families
RELAX confirmed a trend towards relaxed selection in the branches leading to
Channichthyidae and Bathydraconidae relative to Artedidraconidae, Nototheniidae, and E.
maclovinus (Figure 5). The test for selection relaxation (K = 0.25) was significant (p = 0.002, LR
= 9.77).
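To illustrate what K = 0.25 means: in the RELAX framework, omega values on the test branches are modelled as the reference-branch omega values raised to the power K, so K < 1 pulls every omega class toward 1 (neutrality). The sketch below uses hypothetical reference omega values chosen only for illustration, together with the K and LR values reported above, and assumes the likelihood ratio is compared with a chi-square distribution with one degree of freedom.

```python
import math

K = 0.25   # relaxation parameter reported in the text
LR = 9.77  # likelihood ratio reported in the text


def relaxed_omega(reference_omega, k=K):
    """Omega on test branches under the RELAX parameterization (omega ** k)."""
    return reference_omega ** k


reference_omegas = [0.1, 1.0, 5.0]  # hypothetical omega classes, illustration only
foreground_omegas = [relaxed_omega(w) for w in reference_omegas]

p_value = math.erfc(math.sqrt(LR / 2.0))  # chi-square upper tail, 1 degree of freedom

for ref, fg in zip(reference_omegas, foreground_omegas):
    print(f"reference omega {ref:.2f} -> test-branch omega {fg:.3f}")
print(f"p = {p_value:.4f}")
```

Both purifying (omega < 1) and diversifying (omega > 1) classes move toward 1 under K < 1, which is why relaxation is distinguishable from a simple increase in positive selection.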
Key promoters remain conserved in Antarctic notothenioids
Based on previous work on both human and teleost hemogen, the two conserved non-coding
regulatory elements described by Peters et al. 2018 [1], CNE1 and CNE2, were identified in E.
maclovinus, N. coriiceps and C. aceratus (Figure 6). While the intergenic region between the CNEs
is reduced in C. aceratus relative to D. rerio (Figures 1/6), both elements remain intact in all
three species examined. Preliminary comparisons of key transcription factor binding sites show
no significant departures or losses in C. aceratus compared with N. coriiceps or E. maclovinus.
This includes putative binding sites for important co-factors like p300, GATA1, Sox9 and
HoxB4 (Figure 7).
Gene size and structure remain largely conserved in icefish relative to red-blooded fish
but show a large genetic lesion in exon 3
I used two species as representatives for assessing differences in the gene based on
erythroid presence/absence: the red-blooded notothen N. coriiceps and the white-blooded icefish
C. aceratus. Notothenioid hemogen is composed of 4 exons and 3 introns, similar to the
previously described D. rerio [1]. hemogen is approximately the same size in both species:
1762bp in N. coriiceps, and 1701bp in C. aceratus as measured from the start codon to the stop
codon. I observed no size change in exons 1, 2 & 4, and only small indels in each intron (1-12bp)
(Figure 6). However, C. aceratus showed a significant deletion in exon 3—the loss of 89bp,
which occurs within the tandem-repeat region (Figure 6/SFigure 3). This prompted further
exploration of this region among icefish to 1) determine the prevalence of the deletion
throughout the clade, and 2) assess its potential consequences for transcription and translation.
Indels in exon 3 serve as the primary source of sequence divergence and evolutionary
change among Antarctic notothenioids
I surveyed 14 of the 16 extant icefish species to determine if the deletion observed in C.
aceratus is a species-specific feature or evidence of an evolutionary event in the
Channichthyidae MRCA. All species examined showed evidence of an evolutionary loss at this
locus, but in three different variants: an 89 bp deletion, a 90 bp deletion, and a 99 bp deletion
(Figure 8). The deletions are not distributed evenly throughout the radiation and do not neatly
correspond with the known phylogeny for icefish speciation (Figure 9). By far the most prevalent
deletion was the 90 bp loss, which occurs in a majority of species from the earliest-diverging
genus (Champsocephalus) to the most recently derived (Chionodraco) (Figure 9). The 99 bp/33 aa
loss emerged more recently and is contained within the clade consisting of the genera
Chionobathyscus, Cryodraco, Chaenodraco, and Chionodraco (Figure 9). The 89 bp deletion
was only present in species that did not form a monophyletic group. In addition, some species
possess multiple deletion alleles—within the derived clade where the 99 bp deletion first
emerged, 4 of 6 species were found to carry alleles for both the 90 bp and 99 bp deletion.
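Whether each deletion variant preserves the reading frame follows directly from its length modulo 3. The sketch below is only that arithmetic, using the three deletion lengths given in the text; the function name is illustrative.

```python
def classify_deletion(length_bp: int) -> str:
    """Return 'in-frame' if the deletion removes whole codons, else 'frameshift'."""
    return "in-frame" if length_bp % 3 == 0 else "frameshift"

for length_bp in (89, 90, 99):
    codons, remainder = divmod(length_bp, 3)
    print(f"{length_bp} bp: {classify_deletion(length_bp)} "
          f"({codons} codons + {remainder} bp)")
```

The 90 bp and 99 bp variants remove 30 and 33 whole codons, whereas the 89 bp variant leaves 2 bp over and shifts the downstream reading frame.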
Analysis of the coding sequence (all exons) showed high sequence conservation
regardless of the breadth of speciation among Antarctic notothenioids. The majority of
evolutionary change centers on repeated insertion or deletion events, which are notable both for
their frequency and for their tendency to recur independently in different species or clades
within the same, often overlapping, region of the gene. CodeML model M0 showed low
substitution rates even when E. maclovinus was included in the analysis: a tree length of
0.3156 for dN and 0.2284 for dS.
The majority of indels are concentrated in two specific regions within the hemogen
protein: at the beginning of the region encoding the C-terminus and within the segment encoding
the proline-rich region of the C-terminus—both of which are contained within exon 3. Within the
species surveyed, I identified 24 unique indels in the coding regions of high-latitude Antarctic
notothenioids: 5 insertions, 19 deletions (Figure 10). Of these 24 events, 23 occur within the
segment encoding the C-terminus, and only one occurs in the segment encoding the N-terminus,
in a single species (SFigure 3). In addition to disproportionate prevalence of deletions over
insertions, there’s considerable variability among deletions and their occurrence within the tree
when compared with insertions. Insertions are exclusively 1-2 amino acids long and are almost
solely contained within the Nototheniidae (with one exception among icefish). Deletions range
from 1 to 33 amino acids. These indels are also the most phylogenetically distinct features of
notothenioid hemogen; 22 of 24 indels are parsimony informative, while only 2 are homoplasic.
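Indel tallies of this kind are typically read off an alignment. As a minimal sketch (not the procedure used in this study), the function below scans two aligned sequences and reports each gap run as an insertion or a deletion relative to the reference. The toy sequences are hypothetical and are not hemogen sequence.

```python
from typing import List, Tuple

def find_indels(ref: str, query: str) -> List[Tuple[str, int, int]]:
    """Scan two aligned sequences of equal length and return
    (kind, start_column, length) for each gap run in the query
    ('deletion') or in the reference ('insertion')."""
    if len(ref) != len(query):
        raise ValueError("aligned sequences must have equal length")
    events = []
    current_kind, start = None, 0
    for col, (r, q) in enumerate(zip(ref, query)):
        if q == "-" and r != "-":
            kind = "deletion"
        elif r == "-" and q != "-":
            kind = "insertion"
        else:
            kind = None
        if kind != current_kind:
            # Close the run that just ended, if there was one.
            if current_kind is not None:
                events.append((current_kind, start, col - start))
            current_kind, start = kind, col
    if current_kind is not None:
        events.append((current_kind, start, len(ref) - start))
    return events

# Hypothetical toy amino-acid alignment.
ref_aln   = "MEETK--PQPQPQPEEEDL"
query_aln = "MEETKSAPQ----PEEEDL"
print(find_indels(ref_aln, query_aln))
```

Because lengths here are in alignment columns, an amino-acid alignment yields indel sizes in residues, matching the 1-33 amino acid range reported above.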
dN/dS shows disparate selection pressure in hemogen segments encoding the N-terminus vs
the C-terminus
Pairwise comparisons show similar patterns: neutral mutation rate (as measured by dS) is
low/non-existent in all comparisons. A significant number of pairwise comparisons were
excluded from measurements of dN/dS because of a lack of synonymous mutations between the
two sequences, resulting in N/A values (Tables 4/5). Both within-family and between-family
comparisons show a preference for accumulating nonsynonymous mutations, primarily within
the segments encoding the C-terminus (Figures 11/12). The segments encoding the N-terminus
are under purifying selection within families, and most between-family comparisons also display
this trend (Table 4). Between-family comparisons with Channichthyidae (Table 5) show a trend
away from purifying selection, possibly relaxed to the point of drift, with all comparisons
showing dN/dS > 1. The N-terminus remains largely conserved in Channichthyidae (Figure 11),
but because neutral change is so low, any nonsynonymous change to the coding sequence
results in a high dN/dS.
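The N/A values and inflated ratios described above follow from how the ratio is formed. A minimal sketch, using hypothetical dN and dS values rather than values from Tables 4/5:

```python
from typing import Optional

def dn_ds(dn: float, ds: float) -> Optional[float]:
    """Return dN/dS, or None (reported as N/A) when there are no
    synonymous differences and the ratio is undefined."""
    if ds == 0:
        return None
    return dn / ds

# Hypothetical pairwise estimates (dN, dS); not taken from Tables 4/5.
pairs = {
    "pair A": (0.010, 0.000),   # no synonymous change: ratio undefined
    "pair B": (0.010, 0.005),   # very low dS inflates the ratio
    "pair C": (0.010, 0.050),   # same dN, more neutral change
}
for name, (dn, ds) in pairs.items():
    ratio = dn_ds(dn, ds)
    print(name, "N/A" if ratio is None else f"{ratio:.2f}")
```

Pairs B and C share the same dN, yet only B exceeds 1, which is why a ratio above 1 on a near-zero dS is weak evidence on its own.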
Discovery of novel splice variation in Channichthyidae that excludes key domains from
translation
Two transcripts have been detected in surveyed icefish. The first transcript is hemgn-L,
which includes all exons (Figure 2). The tissue survey for C. aceratus uncovered a novel splice
variant not previously detected in zebrafish, N. coriiceps, or P. charcoti [unpublished
transcriptome]. This transcript (hemgn-s) (Figure 2) splices from the end of exon 2 to a point
beyond the “icefish deletion” in exon 3. Curiously, hemgn-s splices into another frameshift,
resulting in the same truncation observed in hemgn-L from C. aceratus: a missense mutation and premature
stop preventing translation of the acidic domain and exon 4 (Figure 2). This transcript has been
detected in other icefish which do not show a frameshift in hemgn-L; in each, it yields a similarly
truncated protein (Figure 2). As it does not include the beginning of exon 3, this transcript would
also exclude the bipartite nuclear localization signal from any translated protein. If translated, it
would produce a protein of approximately 78AA with only one functional domain intact: the
coiled-coil domain. The evolution of the hemgn-s splice form corresponds with an amino acid
change at the end of exon 2 that results in the gain of a basic residue (SFigure 3, position 52 in
alignment). Surveying the coding and protein sequences (SFigure 2/3) shows this change is
unique to the emergence of Channichthyidae.
qPCR shows low levels of hemogen expression in adult tissues of icefish
Primers were designed to capture two types of hemogen expression in icefish. One primer set
targets the hemgn-s variant, and the other targets all “whole” hemgn. In icefish, the latter
captures only the hemgn-L variant, while in N. coriiceps it captures all transcripts (excluding a
potential hemgn-s transcript). Given the high standard deviations for icefish hemogen expression,
I chose to analyze my results as dCT rather than fold change. Some hemogen expression is
detectable in adult icefish,
primarily in a non-frameshifted species, C. gunnari, with little to no expression in C. aceratus.
Expression of “whole” hemogen is significantly reduced in icefish relative to red-blooded fish
(Figure 13). The hemgn-s transcript is not expressed in N. coriiceps, but does appear to be
expressed at relatively equivalent levels to hemgn-L in C. gunnari (Figure 13). Sequencing the
qPCR product confirms that hemgn-s was detected by our primers in C. aceratus, albeit at a
nearly undetectable and unreliably quantifiable level. Sequencing the N. coriiceps product for
hemgn-s qPCR showed an off-target fragment was amplified using the hemgn-s primers. N.
coriiceps does not legitimately express the hemgn-s splice variant.
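For reference, dCT is the target Ct minus the reference-gene Ct for the same sample, so a larger dCT means lower expression. The sketch below assumes that the RTBact primers in Table 1 amplify the reference gene (presumably β-actin) and uses hypothetical Ct values; it is not the analysis script used for Figure 13.

```python
from statistics import mean, stdev
from typing import List, Tuple

def delta_ct(target_ct: List[float], reference_ct: List[float]) -> List[float]:
    """Per-replicate dCT = Ct(target) - Ct(reference gene)."""
    if len(target_ct) != len(reference_ct):
        raise ValueError("need one reference Ct per target Ct")
    return [t - r for t, r in zip(target_ct, reference_ct)]

def summarize(dct: List[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of dCT values."""
    return mean(dct), stdev(dct)

# Hypothetical Ct values for three replicates of one tissue.
hemogen_ct = [31.0, 33.5, 30.0]
reference_ct = [18.0, 18.5, 18.0]

dct = delta_ct(hemogen_ct, reference_ct)
print(dct)             # per-replicate dCT
print(summarize(dct))  # (mean, standard deviation)
```

Reporting dCT avoids exponentiating noisy values: fold change is computed as a power of 2, so a spread of three cycles in dCT becomes an eightfold spread in fold change.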
Conservation of functional domains & degradation of nuclear localization domain in
Channichthyidae
Yang et al [38] identified several structural features of the mouse ortholog: a coiled-coil
domain and bipartite nuclear localization signal in the N-terminus, and an acidic domain in the
C-terminus. I find these features conserved in red-blooded notothenioids, but partially lost or
degraded in Channichthyidae. The coiled-coil domain is located at 25-39aa; the bipartite NLS at
57-74aa; and the acidic domain at 240-261aa (SFigure 3). Exons 1, 2, and part of 3 encode the
N-terminus. Exon 3 encodes the bulk of the C-terminus, and contains the bipartite NLS as well
as a series of tandem repeats [1]. The icefish deletion appears to eliminate at least one tandem
repeat from icefish relative to red-blooded notothens (Figure 6, 8) in those species with an in-
frame deletion (Figure 8). The out-of-frame deletion causes a missense mutation impacting the
tandem repeat structure further (Figure 8).
The bipartite nuclear localization signal has undergone degradation in all observed icefish
species (Figure 14), indicating that this degradation likely arose in the MRCA of Channichthyidae.
It has been previously established that excising the NLS from hemogen prevents nuclear
localization [90], and simply mutating the initial amino acid of a bipartite NLS has been shown
to be sufficient to inhibit nuclear localization in some genes [91]. Furthermore, the bipartite NLS
is excised from the hemgn-s splice form. The degradation of the NLS coupled with the evolution
of an isoform encoding a protein which excludes the NLS outright suggests this feature may be a
victim of relaxed selection in icefish.
DISCUSSION
hemogen is under relaxed selection in Channichthyidae and potentially notothenioid fish at
large
Relaxed selection plays an important role in evolution by permitting novel traits and
functions to evolve (evolutionary innovation) and sometimes permitting expanded phenotypic
plasticity [92, 93]. Alternatively, relaxed selection can lead to loss-of-function and possibly
pseudogenization [94]. RELAX shows that the changes observed in Channichthyidae relative to
other Antarctic notothenioids are indicative of a trend towards relaxed selection on these
branches. This method is useful for detecting subtle shifts in selective pressure, but it has a
serious drawback: it cannot identify specific sites under relaxed selection or substantiate
more than a trend between a foreground and a background set of branches, which requires some
prior insight into where relaxed selection is most likely to occur [69]. Codon-based site tests
are therefore needed to confirm or refute whether changes at the level of individual amino
acids can be attributed to relaxed selection rather than positive selection.
I conducted the branch-site test, which examined N. coriiceps and C. aceratus vs other
teleosts, to specifically identify changes shared among Antarctic notothenioids which were not
present in any of the teleost outgroups. Such changes might be indicative of adaptive changes
associated with the hematic challenges of polar living, and would be good candidates for positive
selection. Distinguishing between relaxed and positive selection can be difficult, as both can
show a similar signal of increased ω when a gene should be under purifying selection [95]. The
ideal scenario would be 1) to confirm relaxed selection, or 2) exclude the possibility that
observed amino-acid changes might be due to positive selection. The branch-site test did not
yield any sites under possible positive selection, let alone any that were considered significant
under the most-robust Bayes Empirical Bayes analysis implemented in CodeML [96]. This
would seem to support the conclusion that evolutionary change among Antarctic notothenioids is
due to relaxed selection rather than positive selection. However, this test cannot assess changes
that are observed within Antarctic notothenioids but not shared among all of them; that is,
evolutionary change that emerged during speciation but is not shared across the whole
radiation.
When trying to assess change within the Antarctic clade, several factors make it difficult
to directly test hypotheses of positive selection vs genetic drift on hemogen. Evolutionary
hypothesis testing using dN/dS (ω) relies on codon-based models, in which the numbers of
silent and replacement changes are used to “decide” whether there has been an excess of
nonsynonymous change relative to synonymous change, and thus to make inferences about
selective pressure. Large dN/dS estimates for extremely short branch lengths from model M0
indicate that any estimate of selective pressure should be considered unreliable as proof
of positive selection, and that fundamental assumptions of the test may be violated. The current
dataset fails to meet the sequence diversity requirement (dS over branches > 0.5) for any site test
of selection to accurately test hypotheses about positive selection [97-101].
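The divergence criterion can be checked against the M0 tree lengths reported in the Results. This sketch assumes the dS > 0.5 threshold is applied to the synonymous tree length; the function name is illustrative.

```python
DS_THRESHOLD = 0.5  # minimum synonymous divergence quoted in the text

def divergence_sufficient(tree_length_ds: float,
                          threshold: float = DS_THRESHOLD) -> bool:
    """True if the synonymous tree length exceeds the threshold for site tests."""
    return tree_length_ds > threshold

tree_length_dn = 0.3156  # CodeML M0 tree length for dN, from the Results
tree_length_ds = 0.2284  # CodeML M0 tree length for dS, from the Results

print(divergence_sufficient(tree_length_ds))  # False

# Crude summary only: a ratio of tree lengths, not a CodeML omega estimate.
print(round(tree_length_dn / tree_length_ds, 2))  # 1.38
```

The synonymous tree length is less than half the threshold, so a ratio above 1 here reflects scarce synonymous change as much as any excess of nonsynonymous change.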
I had intended to use similar methods to explicitly test for relaxed selection, which rely
on similar models and assumptions in order to assess selective pressure [102]. Further
exploration of these models and their practical applications showed they would be no more
robust than the tests for positive selection had been, owing either to low levels of sequence
divergence in my dataset or to the lack of a comparison against which to assess potentially
accelerated mutation rates and thus establish relaxed selection [103-106].
This illustrates an interesting problem in evolutionary biology: if the selective changes
are subtle, or in recently diverged species, it can be difficult to pinpoint the selective cause
behind the mutational effects. Among adaptive radiations, if the gene of interest is 1) recently
under positive or relaxed selection, or 2) only under slight positive or relaxed selection, or 3) not
a gene responsible for a strong phenotypic effect of adaptive change, potentially driving
speciation—then it is unlikely that many current methods will be able to detect that selective
pressure. Given the low mutational rate of change, there’s no evidence to suggest that
notothenioid preference for non-synonymous change in this region is associated with positive
selection or novel interactions at this time.
Potentially the best evidence for relaxed selection is post hoc—with repeated mutations
that impair proper functioning of key domains, one could argue that this could only be due to
relaxed selection. Frameshift mutations tend to have strongly adverse effects and have been
linked to many cancers and diseases, but have occurred independently at least twice within the
icefish clade. Frequent indels occurring within exon 3 are consistent with the hypothesis that
genes associated with erythropoiesis underwent a change in the intensity of purifying selection as
a result of Southern Ocean colonization [21]. To substantiate this, it is necessary to outline what
the functional impacts of these indels might be and how they would tie into larger trends of
evolutionary change among notothenioids, as I will do in the following sections.
The most conclusive evidence for relaxed selection would be establishing that icefish
hemogen has undergone pseudogenization, complete loss-of-function, or relaxation of selective
constraints leading to deterioration of some functional domains. This could be accomplished
either through observation (only a fragment remains, e.g., icefish alpha- & beta-globin) or
through prediction of pseudogenization using comparative genomics [107]. While it is undeniable that
hemogen has been impacted in icefish, I argue that it does not reach the level of complete
pseudogenization, and that the functional picture is more complicated than simply: is hemogen
nonfunctional in white-blooded fish?
Observed mutations in hemogen are not due to gene duplication or chromosomal
rearrangement
If hemogen is under relaxed selection in notothenioids, it becomes important to determine
whether or not the observed changes in the hemogen sequence represent changes to the true
hemogen orthologue of zebrafish and other vertebrates, and rule out some alternative possibilities
for the evolutionary processes behind them. Two evolutionary processes could be in play: gene
duplication in the icefish MRCA, which would result in a hemogen paralog that could be
deteriorating; or chromosomal rearrangement/localized chromosomal breakage, resulting in the
loss of part of the hemogen gene. Either could result in a signal of relaxed selection, but
the evolutionary conclusions surrounding hemogen and its current role would differ depending
on which—if either—process is in effect.
Gene duplications play a critical role in the evolutionary process, either as a contributor to
phenotypic plasticity via evolution of new roles through neofunctionalization and
subfunctionalization [108, 109], or by development of pseudogenes [94]. I ruled out ancestral
gene duplication related to the whole-genome duplication event in teleosts [110-112], as analysis
conducted with zebrafish ortholog Si:dkey-25o16.2 showed that most teleost genomes retain only
a single hemogen ortholog [1]. However, a duplication event somewhere within the Antarctic
radiation, or prior to the diversification of Channichthyidae, could be possible [113, 114]. If the
hemogen first identified in C. aceratus were a paralog, or the product of a
duplication event in the icefish MRCA, then subfunctionalization of non-erythropoietic
processes could explain the C-terminus deterioration in said duplicate [115, 116] as well as some
of the reduced expression of icefish hemogen [117]. Thus, it was critical to make sure I
was analyzing the true hemogen ortholog in the notothenioids, and not a duplicate which has either
subfunctionalized, neofunctionalized, or deteriorated into fragmentation and pseudogenization.
Alternatively, the large indels in Channichthyidae could be attributed to chromosomal
breakage or rearrangement [118]. Studies with human ortholog EDAG identify it as a putative
oncogene located at a chromosomal region (9q22) linked to leukemia-associated chromosomal
breakpoints [119, 120]. If the hemogen locus has been disrupted by chromosomal breakage, the
gene seen in icefish might be a fragment rather than a whole (but mutated) gene. Given the high
sequence conservation of the hemogen gene, significant disruption around the locus seemed
unlikely.
My experimental evidence already suggested a low possibility that either gene
duplication or chromosomal rearrangement had taken place, given the high sequence
conservation between red-blooded and white-blooded notothenioids (> 90%), and the lack of
double bands following gel electrophoresis of PCR products. To be entirely sure, I ruled out both
possibilities by assessing the conservation of synteny surrounding hemogen in Antarctic
notothenioids and by using available genomes to survey for similar sequences which might
represent hemogen paralogs. I was able to check my C. aceratus sequence against the available
C. aceratus [unpublished results] and N. coriiceps genomes (NCBI RefSeq NC_015653.1) [121]
to confirm its chromosomal location and assess possible sequencing errors or duplicate genes.
The sequence I obtained, with the observed 89 bp deletion, was present on the genome scaffold,
and BLAST did not return multiple hits for the whole-gene (exons + introns), coding-only
(all exons), or partial coding (individual exons) sequences.
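The > 90% conservation figure cited above is a pairwise identity. A minimal sketch of how such a value can be computed from two aligned sequences, counting only columns where neither sequence has a gap (one of several possible conventions) and using hypothetical toy sequences rather than hemogen data:

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Hypothetical toy alignment.
red_blooded   = "ATGGAGGAGACATTTCAACA"
white_blooded = "ATGGAGGA-ACATTCCAACA"
print(f"{percent_identity(red_blooded, white_blooded):.1f}%")
```

Excluding gapped columns means large indels such as the exon 3 deletion do not lower the identity score, so indels have to be reported separately.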
Thanks to my work sequencing the promoter region of hemogen, I was also able to assess
the conservation of synteny among three representative species: C. aceratus (white-blooded), N.
coriiceps (red-blooded), and E. maclovinus (Sub-Antarctic outgroup). Previous work by the
Detrich lab established the conservation of synteny between zebrafish and other vertebrates [1]
with anp32b upstream of hemogen, and TRMO downstream. I sequenced towards hemogen from
both of those genes, and sequenced upstream and downstream from hemogen, in order to rule out
a small, localized chromosomal rearrangement that might impact hemogen. The Eleginops
sequence confirmed this localized synteny existed prior to the diversification of Antarctic
notothenioids, and expanding my search to other teleosts, lobe-finned, and cartilaginous fish
substantiated conservation of this ~15 kb region despite > 400 Ma of evolutionary distance
[122-125].
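Conservation of local synteny amounts to finding the same flanking genes in the same relative order. The sketch below checks for the anp32b / hemogen / TRMO arrangement described above in an ordered list of gene names, accepting either strand orientation. The function name and the placeholder genes in the example scaffolds are hypothetical.

```python
from typing import Sequence

EXPECTED_BLOCK = ("anp32b", "hemogen", "TRMO")  # gene order described in the text

def has_syntenic_block(gene_order: Sequence[str],
                       block: Sequence[str] = EXPECTED_BLOCK) -> bool:
    """True if `block` occurs as consecutive genes in `gene_order`,
    in either forward or reverse orientation (case-insensitive)."""
    genes = [g.lower() for g in gene_order]
    forward = [g.lower() for g in block]
    reverse = forward[::-1]
    n = len(forward)
    for i in range(len(genes) - n + 1):
        window = genes[i:i + n]
        if window == forward or window == reverse:
            return True
    return False

# Hypothetical scaffolds; "geneX" and "geneY" are placeholders.
scaffold_a = ["geneX", "anp32b", "hemogen", "TRMO", "geneY"]
scaffold_b = ["geneY", "trmo", "hemogen", "anp32b"]   # reverse orientation
scaffold_c = ["anp32b", "geneX", "hemogen", "TRMO"]   # interrupted block
for name, scaffold in [("a", scaffold_a), ("b", scaffold_b), ("c", scaffold_c)]:
    print(name, has_syntenic_block(scaffold))
```

A rearrangement that inserted or removed a gene between the flanking markers would break the block, as in the third scaffold.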
I therefore conclude that the sequence I obtained is not a hemogen paralog in C. aceratus,
and that it is unlikely that any other icefish sequences represent paralogs rather than true
orthologs. I also conclude that the indels observed in both red-blooded and white-blooded fish
are likely further evidence of relaxed selection acting upon hemogen, as opposed to larger-scale
chromosomal factors that could cause significant deletion or genomic rearrangement.
Mutation is preferentially accumulated in areas known to be conserved and vital for
erythropoiesis
The majority of evolutionary change across the Antarctic radiation occurs within the C-
terminal region encoded in exon 3, and in particular within the proline-rich region composed of
tandem repeats. This includes nonsynonymous amino acid changes, as illustrated in the pairwise
dN/dS comparisons, but primarily occurs as indels. The C-terminus is critical for binding of
p300, and recruitment of and binding to p300 are critical for Hemogen to promote erythroid
differentiation in the human ortholog [55].
These indels also overlap or occur adjacent to a conserved C-terminal area identified as
essential for erythropoiesis in zebrafish [1]. Previous studies used CRISPR/Cas9 to induce
indels in zebrafish hemogen: a frameshift mutant deleting 5 aa, and an in-frame mutant
deleting 12 aa and part of a conserved acidic motif (EEED). Analysis of
hemoglobin concentration and numbers of circulating blood cells shows that indels in this region
lead to reduced erythrocyte levels and hypochromatic blood in embryos of in-frame mutants, and
that the proportion of anemic individuals increases even in the heterozygous condition in both
mutant strains [1]. Frameshifted mutants did not appear to be translationally successful, whereas
the non-frameshifted mutants did produce a slightly smaller Hemogen protein [1]. Indels in this
region also impacted development via notochord and trunk defects in both frameshifted and
in-frame indel mutant zebrafish, and increased cellular apoptosis was identified throughout the
embryo in frameshifted mutants. Adult fish differed significantly in size from wild-type,
especially within homozygous in-frame mutant strains [1].
The implications of this research for interpreting the prevalence of indels suggest several
possibilities about relaxed selective pressures on erythropoiesis and hematopoietic traits.
Zebrafish mutations targeting the C-terminus of Hemogen reduced erythrocyte levels in adults
and decreased expression of Embryonic beta-globin regardless of in-frame or frameshift
condition. Therefore, deletions around or within this area of the C-terminus should impair
erythropoiesis in notothenioids even in the absence of frameshifts. The permissibility of
hemogen indels in red-blooded fish supports the conclusion that Hemogen cannot be absolutely
essential for erythropoiesis. Importantly, given that even heterozygous mutants show impaired
erythropoiesis, “one good copy” of the gene would not be sufficient to prevent such an indel
from having an effect on erythrocyte production and concentration.
Given that red-blooded notothenioids demonstrate reduced hematocrit and lowered
erythrocyte levels, the observed in-frame indels in red-blooded fish may represent a response to
the necessary adaptation to polar conditions. At this time it is not possible to distinguish between
the chicken-and-egg of this situation: do the indels in hemogen represent a part of the adaptive
changes needed to lower hematocrit/decrease blood viscosity, or did they occur from relaxed
selection on erythrocyte-regulators as the importance of hemoglobin decreased throughout the
radiation? Given the lack of strong phylogenetic pattern in the distribution of indels among
red-blooded species, I cannot distinguish between these two possibilities at this time. It is
also unclear specifically how the mutations in red-blooded notothens impact binding with p300,
or if they impair or inhibit access to a TAD. Similarly, it is unknown whether or not the nonsynonymous
mutations have a strong functional impact (or any functional impact) on binding or protein
structure in red-blooded notothenioids.
The adaptive value of frameshifts among icefish is another matter. The large size of the
deletion (89-99 bp) and the fixation of a frameshift variant are most likely the result of relaxed
selection, given that they should have significant adverse effects on erythropoiesis. Furthermore,
during diversification of Channichthyidae two independent frameshift mutations occurred in this
C-terminal region, resulting in truncated proteins that eliminate the acidic domain and the
segment encoded by exon 4. If erythropoiesis is no longer required, then there is no longer any
reason to selectively maintain the domains responsible for that function, and frameshift
mutations may be tolerated to the point of fixation. Thus, the large indels may be indicative of
ongoing subfunctionalization in the hemogen ortholog, paring down the gene to only those
functional domains most essential to non-erythropoietic roles for hemogen.
Evolution of a novel splice variant missing the majority of functional domains required for
erythropoiesis might be a natural dominant negative
Regardless of which isoform an icefish may be expressing at any given time, at least two
functional domains show evidence of relaxed selection. All icefish display a degraded bipartite
NLS and some loss in the proline-rich/tandem repeat region implicated in the GATA1/p300/EDAG
complex [1, 55]. All icefish can theoretically express the hemgn-s isoform that excludes
everything but the coiled-coil domain from its truncated protein product. This elimination of
functional domains from a key isoform could be a dominant negative mutation that would
interfere with the functions of the Hemogen protein encoded by hemgn-L. This could be
accomplished either by binding to key partners (such as GATA1 or p300) or potentially through
oligomerization of Hemogen itself. Dominant negatives have an important role to play in
inhibiting wild-type expression, and have particularly been implicated in disease and promotion
of cancer formation. Additionally, dominant negatives in erythropoietic genes have been known
to lead to anemic conditions [126, 127] and development of a dominant negative in hemogen
could promote the erythrocyte-null condition. Dominant negatives have also been shown to
increase favorable outcomes in acute myeloid leukemia at other loci [128], suggesting that
dominant negative mutations in other putative oncogenes associated with leukemia (such as
hemogen) might also be beneficial in decreasing proliferation of this particular type of tumor.
Key transcription factor binding domains for hematopoiesis- and/or erythropoiesis-
promoting genes appear to be conserved in icefish despite the erythrocyte-null phenotype,
including GATA1 [129], KLF4 [130], and Myb [57, 131]. With the promoter region and
necessary CNEs for primitive erythropoiesis still potentially intact, hemogen could still be
recruited to these processes in icefish. If hemgn-s is a dominant negative, it might interact with
some complexes in such a way as to prevent them from promoting erythropoiesis, while still
permitting non-erythropoietic function to occur.
Several caveats to this theory need to be stated. At this time, it is unclear how expression
of hemgn-s is regulated and what promoters might be behind it, or if it is controlled by a
promoter region further upstream that has yet to be discovered. It is unclear how it is
differentially expressed relative to wild-type hemogen, or if promoting expression of wild-type
hemogen means that hemgn-s is automatically expressed as well. However, if hemgn-s could be
conclusively proven to encode a dominant negative (or, alternatively, if other forms of icefish
hemogen could function as a dominant negative in some processes), the implications could
be significant for non-erythropoietic research as well, given that the C-terminus is also implicated
in cell apoptosis and developmental defects and delays in zebrafish [1].
Large-scale deletions and degradation of functional domains occur concurrent with loss of
alpha- & beta-globin expression
The majority of functional change in Channichthyidae appears to be traceable to events
that occurred within the MRCA of icefish, prior to the diversification of the clade. The major
deletion in exon 3, the degradation of the bipartite NLS, and the evolution of a novel splice
variant are all shared among extant icefish, and the most parsimonious explanation is that these
characters developed prior to diversification and speciation.
Additionally, the icefish-specific deletions provide insight into the evolutionary history of
the icefish clade that implicates globin loss in relaxed selection on hemogen. Based on analysis of
hemogen allelic variation, the most likely evolutionary scenario for deletions and deteriorations
is that the 90bp/30aa deletion occurred in the MRCA of icefish, possibly as a consequence of
relaxed constraints following the loss of hemoglobin expression. From there two independent
mutations occurred: 1) a 1 bp insertion leading to the “frameshift” allele observed in C. aceratus
and Neopagetopsis ionah, which became fixed in some species, and 2) a secondary 9 bp deletion,
which has fixed in some but not all of the most recently speciated/most derived lineages.
Distribution of these deletion variants is consistent with incomplete lineage sorting, which is
common in adaptive radiations due to the rapid speciation [132], but also consistent with
possible introgression. Past introgression events have been detected in some icefish species
[133]. The evolution of these traits suggests that relaxed selection permitted further
deterioration following the removal of a functional constraint, i.e., the absence of erythrocytes.
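The proposed scenario is arithmetically consistent with the three observed variants: a 90 bp deletion plus a 1 bp insertion gives a net 89 bp loss, and a 90 bp deletion plus a further 9 bp deletion gives 99 bp. A short sketch of that bookkeeping, representing each allele as a list of signed length changes (a representation chosen here purely for illustration):

```python
from typing import Dict, List

# Negative values are deletions, positive values are insertions (bp).
ANCESTRAL_DELETION = -90

alleles: Dict[str, List[int]] = {
    "90 bp allele": [ANCESTRAL_DELETION],
    "frameshift allele": [ANCESTRAL_DELETION, +1],  # later 1 bp insertion
    "99 bp allele": [ANCESTRAL_DELETION, -9],       # secondary 9 bp deletion
}

def net_loss(events: List[int]) -> int:
    """Net number of base pairs lost relative to the undeleted sequence."""
    return -sum(events)

for name, events in alleles.items():
    print(f"{name}: net loss of {net_loss(events)} bp")
```

Under this scheme both derived alleles need only one additional event on the ancestral 90 bp background, which is the parsimony argument made above.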
Table 1. Primers used in PCR and qRT-PCR reactions to amplify hemogen gDNA and cDNA in
Antarctic notothenioids
PRIMER NAME SEQUENCE PURPOSE SPECIES
Ncor130for 5'-TGGAGGAGACATTTCAACA-3' gDNA, cDNA Antarctic notothenioids
NcHemRev2 5'-ACTAACAGGATGCACACTAACC-3' gDNA, cDNA Antarctic notothenioids
QP_CA500SpliceF2 5'-GACTAACCAGTGGGTTTAAGCC-3' qPCR C. aceratus, C. gunnari
NcHemRev1 5'-TTGTGGAGGAGGTGTCGAG-3' qPCR Antarctic notothenioids
hemAllqPCRFor 5'-AGAATGGAGGAGACATTTCAACA-3' qPCR C. aceratus, C. gunnari, N. coriiceps
hemAllqPCRRev1 5'-TTCCTCAGAAGATCCCTGTC-3' qPCR C. aceratus, C. gunnari
hemAllqPCRRev2B 5'-CTTGTCTTCTGCTTCAGCTT-3' qPCR N. coriiceps
RTBactF 5'-CAGATCATGTTCGAGACCTTCAAC-3' qPCR C. aceratus, C. gunnari, N. coriiceps
RTBactR 5'-TCACCRGARTCCATGACGATA-3' qPCR C. aceratus, C. gunnari, N. coriiceps
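RTBactR contains the IUPAC degenerate base R (A or G) at two positions, so the oligo represents a mixture of four sequences. The sketch below expands a degenerate primer into its concrete variants; the lookup table covers only a subset of IUPAC codes and the function name is illustrative.

```python
from itertools import product
from typing import List

# Subset of IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}

def expand_degenerate(primer: str) -> List[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

variants = expand_degenerate("TCACCRGARTCCATGACGATA")  # RTBactR from Table 1
for variant in variants:
    print(variant)
```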
Table 2. Species sequenced and included in study of Antarctic notothenioid hemogen
ORGANISM FAMILY LOCALE ORGANISM FAMILY LOCALE
Champsocephalus esox* Channichthyidae Sub-Antarctic Harpagifer antarcticus Harpagiferidae Antarctic
Champsocephalus gunnari* Channichthyidae Antarctic Notothenia rossii Nototheniidae Antarctic
Neopagetopsis ionah* Channichthyidae Antarctic Notothenia coriiceps* Nototheniidae Antarctic
Pagetopsis macropterus* Channichthyidae Antarctic Notothenia angustata Nototheniidae Sub-Antarctic
Pseudochaenichthys georgianus*† Channichthyidae Antarctic Gobionotothen gibberifrons Nototheniidae Antarctic
Dacodraco hunteri Channichthyidae Antarctic Pleuragramma antarctica Nototheniidae Antarctic
Channichthys rhinoceratus* Channichthyidae Antarctic Trematomus hansoni* Nototheniidae Antarctic
Chaenocephalus aceratus*† Channichthyidae Antarctic Trematomus bernacchii* Nototheniidae Antarctic
Chionobathyscus dewitti* Channichthyidae Antarctic Trematomus eulepidotus Nototheniidae Antarctic
Cryodraco antarcticus* Channichthyidae Antarctic Trematomus borchgrevinki Nototheniidae Antarctic
Chaenodraco wilsoni* Channichthyidae Antarctic Trematomus newnesi* Nototheniidae Antarctic
Chionodraco myersi* Channichthyidae Antarctic Trematomus scotti Nototheniidae Antarctic
Chionodraco hamatus* Channichthyidae Antarctic Patagonotothen guntheri Nototheniidae Antarctic
Chionodraco rastrospinosus* Channichthyidae Antarctic Patagonotothen cornucola Nototheniidae Antarctic
Parachaenichthys charcoti† Bathydraconidae Antarctic Lepidonotothen nudifrons Nototheniidae Antarctic
Gerlachea australis Bathydraconidae Antarctic Dissostichus mawsoni Nototheniidae Antarctic
Bathydraco marri Bathydraconidae Antarctic Dissostichus eleginoides Nototheniidae Antarctic
Akarotaxis nudiceps Bathydraconidae Antarctic Aethotaxis mitopteryx Nototheniidae Antarctic
Vomeridens infuscipinnis Bathydraconidae Antarctic Eleginops maclovinus* Eleginopsidae Sub-Antarctic
Racovitzia glacialis Bathydraconidae Antarctic Pseudaphritis urvilli Pseudaphritidae Eastern Australia
Pogonophryne barsukovi Artedidraconidae Antarctic Cottoperca gobio Bovichtidae Sub-Antarctic
Pogonophryne scotti Artedidraconidae Antarctic Bovichtus diacanthus Bovichtidae Sub-Antarctic
Dolloidraco longedorsalis Artedidraconidae Antarctic Percophis brasiliensis Percophidae South America
Histiodraco velifer Artedidraconidae Antarctic Etheostoma nigrum Percidae North America
Artedidraco skottsbergi Artedidraconidae Antarctic Gasterosteus aculeatus Gasterosteidae Northern Hemisphere
* sequenced gDNA; † transcriptome available
Table 3. Codon usage bias for hemogen (total coding sequence) among Antarctic notothenioids
SPECIES CAI SPECIES CAI
Eleginops maclovinus 0.34 Pogonophryne scotti 0.329
Aethotaxis mitopteryx 0.352 Gerlachea australis 0.334
Dissostichus mawsoni 0.37 Parachaenichthys charcoti 0.332
Dissostichus eleginoides 0.345 Racovitzia glacialis 0.346
Lepidonotothen nudifrons 0.357 Vomeridens infuscipinnis 0.343
Patagonotothen cornucola 0.378 Akarotaxis nudiceps 0.349
Patagonotothen guntheri 0.377 Bathydraco marri 0.337
Trematomus bernacchii 0.356 Chaenocephalus aceratus 0.309
Trematomus borchgrevinki 0.363 Dacodraco hunteri 0.342
Trematomus eulepidotus 0.36 Neopagetopsis ionah 0.338
Trematomus hansoni 0.376 Cryodraco antarcticus 0.316
Trematomus newnesi 0.35 Chionodraco hamatus 0.332
Trematomus scotti 0.368 Chionodraco rastrospinosus 0.329
Gobionotothen gibberifrons 0.324 Champsocephalus esox 0.343
Notothenia angustata 0.359 Chionodraco myersi 0.304
Notothenia rossii 0.367 Pagetopsis macropterus 0.344
Notothenia coriiceps 0.358 Champsocephalus gunnari 0.337
Harpagifer antarcticus 0.339 Chaenodraco wilsoni 0.309
Artedidraco skottsbergi 0.332 Channichthys rhinoceratus 0.337
Histiodraco velifer 0.333 Chionobathyscus dewitti 0.313
Dolloidraco longedorsalis 0.327 Pseudochaenichthys georgianus 0.327
Pogonophryne barsukovi 0.329
Table 4. Mean pairwise dN/dS for within-family comparisons of Antarctic notothenioid families
total N-terminus C-terminus
Artedidraconidae 1.381 N/A N/A
Bathydraconidae 0.807 0.880 0.866
Nototheniidae 1.320 0.716 1.191
Channichthyidae 1.077 0.435 1.190
Table 5. Mean pairwise dN/dS for between-family comparisons of Antarctic notothenioid
families
total N-terminus C-terminus
Harpagiferidae-Nototheniidae 5.889 0.656 5.793
Harpagiferidae-Artedidraconidae 5.889 0.394 N/A
Harpagiferidae-Bathydraconidae 1.980 1.172 2.225
Harpagiferidae-Channichthyidae 4.737 2.171 4.723
Bathydraconidae-Nototheniidae 3.010 0.839 2.991
Bathydraconidae-Artedidraconidae 1.754 0.557 1.184
Bathydraconidae-Channichthyidae 2.225 1.844 1.653
Artedidraconidae-Nototheniidae 3.158 0.516 3.708
Artedidraconidae-Channichthyidae 2.601 1.772 1.963
Channichthyidae-Nototheniidae 2.865 1.895 2.597
Table 6. Results of codon-based site tests conducted in CodeML on the Antarctic radiation
TEST ΔLRT DF P-VALUE
M0-M3 9.08037 4 p = 0.0591
M1a-M2a 8.3575 2 p = 0.0153
M7-M8 10.1192 2 p = 0.0063
M8-M8a 8.07556 1 p = 0.0045
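The p-values in Table 6 follow from comparing each likelihood-ratio statistic against a chi-square distribution with the listed degrees of freedom. The sketch below is an illustration rather than the original analysis code: it assumes the ΔLRT column is the usual statistic (twice the log-likelihood difference) and uses closed-form chi-square tail probabilities from the Python standard library. The function name chi2_sf is an assumption of this sketch.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (closed form for df = 1 or even df)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        half = x / 2.0
        terms = (half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * sum(terms)
    raise ValueError("closed form implemented only for df = 1 or even df")

# (test, LRT statistic, degrees of freedom) as reported in Table 6
tests = [
    ("M0-M3", 9.08037, 4),
    ("M1a-M2a", 8.3575, 2),
    ("M7-M8", 10.1192, 2),
    ("M8-M8a", 8.07556, 1),
]

p_values = {name: chi2_sf(stat, df) for name, stat, df in tests}
for name, p in p_values.items():
    print(f"{name}: p = {p:.4f}")
```

Rounded to four decimals, the computed values match those tabulated above.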
Figure 1. Zebrafish Si:dkey-25o16.2 and human Hemogen are orthologous and encode
related proteins that differ in size. (A) Structure of the zebrafish Hemogen-like gene, Si:dkey-
25o16.2. Two conserved noncoding elements (C1 and C2, black boxes) were identified in a 2 kb
segment proximal to the start codon (see Results, Figs 4-6). Coding exons, white boxes;
noncoding exons, gray boxes. Numbers indicate length in bp. (B) Synteny of loci for zebrafish
Si:dkey-25o16.2 on chromosome 1 and Hemogen on human chromosome 9 (region q22).
Transcriptional orientations indicated by arrows. (C) Alternative splicing of zebrafish Hemogen-
like transcripts showing sequenced regions. Introns are shown as chevrons. Transcripts 1 and 2
differ by retention of 12 bp of intron (red). (D) Modular structures of zebrafish and human
Hemogen proteins each encoded by four exons (numbered boxes). Locations of truncating
mutations found in some human cancers (Forbes et al., 2017) are indicated by asterisks.
Predicted regions and motifs: green, coiled coil; blue, nuclear localization signal; red, four
residues introduced by alternative splicing; yellow, tandem peptide repeats; brown, acidic repeat
with transactivation domain (TAD) motif; gray, no prediction. (E) Three-dimensional ab initio
models of Hemogens. The ribbon diagram of the zebrafish protein, color-coded as in panel D, is
superimposed on the gray, space-filling model for the human protein. (Reproduced with
permission from Biology Open)
Figure 2. Icefish transcript variants for hemogen and their putative effects on translation,
illustrated in the representative species Champsocephalus gunnari. Although several potential
transcripts were possible, only two have been confirmed for hemogen in icefish. The
first transcript, hemgn-L, consists of the complete transcription of all exons. The novel transcript
hemgn-s splices from the end of exon 2 to a site downstream of the 90bp deletion region; it also
splices into a frameshift similar to that observed in C. aceratus, so the end of exon 3 and
exon 4 would not be translated. The hemgn-s transcript would exclude the bipartite NLS, the
proline-rich tandem repeat domain, and the acidic region. It has been detected in all icefish
surveyed, regardless of their exon 3 deletion allele. Additionally, all transcripts detected in
surveyed icefish (C. aceratus, C. gunnari, P. georgianus, and C. rastrospinosus) carry only the
+4AA form (blue) of the potential splice variant at the end of exon 2 (+/- 4AA) that is present in
teleost fish. (C) Illustration of the functional domains that would be included
in the Hemogen protein encoded by hemgn-s. If translated, only the coiled-coil domain (green)
would be present in the Hemogen protein; the splice form would exclude the functional domains
encoded within exon 3: the bipartite nuclear localization signal, the tandem repeats, and the
acidic region. The light-grey regions on the protein illustration represent areas with no predicted
functional domain.
Figure 3. Maximum likelihood tree used to test for positive selection on the branch leading
to the Antarctic notothenioid clade. Tree was constructed based on the coding sequences
derived from each species. The background branch includes all non-Antarctic outgroups ranging
from as far as the three-spine stickleback (G. aculeatus) to close Sub-Antarctic relative E.
maclovinus. The foreground branch is marked in red and includes two representative species: one
red-blooded (N. coriiceps) and one white-blooded (C. gunnari).
Figure 4. Maximum likelihood tree used in site-tests for positive/pervasive selection among
Antarctic notothenioids. Tree was constructed in RAxML (see Methods) and used for
calculating changes to site dN/dS for all models, in order to detect pervasive selection among the
high-latitude Antarctic notothenioids. No such selection was detected.
Figure 5. RELAX tree shows relaxed selection on the branches containing Bathydraconidae
and Channichthyidae, demonstrating a trend of relaxed selection in hemogen on the way to
the erythrocyte-null phenotype. The test for relaxation of selection (K = 0.25) was significant (p =
0.002, LR = 9.77). The test branch is indicated in light teal, with its K value (K = 0.25)
given relative to the background (K = 1, not colored).
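In the RELAX framework (Wertheim et al., reference 69), each dN/dS class on the test branches is modeled as the corresponding reference-branch value raised to the power K, so K < 1 moves both purifying (below 1) and diversifying (above 1) classes toward neutrality. The sketch below applies the reported K = 0.25 to a set of reference values that are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
K = 0.25  # relaxation parameter reported for the test branches in Figure 5

# Hypothetical reference-branch omega (dN/dS) classes, for illustration only
reference_omegas = [0.05, 0.5, 1.0, 4.0]

# RELAX relationship: omega on test branches = (omega on reference branches) ** K
test_omegas = [omega ** K for omega in reference_omegas]

for ref, test in zip(reference_omegas, test_omegas):
    print(f"reference omega = {ref:<4} -> test omega = {test:.3f}")
```

Every transformed value lies at least as close to 1 as its reference value, which is the signature of relaxed selection that the test evaluates.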
Figure 6. Gene structure and size remain conserved among red-blooded and white-blooded
notothenioids, including regulatory regions conserved among teleost fish. The structure
of the coding region in both Notothenia coriiceps (A) and Chaenocephalus aceratus (B) matches
that observed in other teleosts, as well as in other vertebrates, and shows strong size conservation
for exons 1, 2, and 4, as well as all introns. The two conserved non-coding elements (CNEs)
described in Peters et al. 2018 for D. rerio are also present in both species. The hemogen
regulatory structure shown (C) is derived from C. aceratus intergenic sequencing, but its general
spacing approximates that of the regulatory regions of both N. coriiceps and the Sub-Antarctic
relative Eleginops maclovinus. Although the 5’ intergenic region of notothenioid hemogen is
smaller than that of D. rerio (Peters et al. 2018), both CNEs are intact and show no significant
genetic lesions relative to other vertebrate species. (D) shows the partitioning of functional
domains among exons in a representative red-blooded species, Trematomus scotti. There are four
primary domains: the coiled-coil domain (exon 2), the bipartite nuclear localization signal
(exon 3), a proline-rich region composed of a variable number of tandem repeats (exon 3), and an
acidic domain (exon 3). (E) shows these functional domains in a white-blooded fish,
Champsocephalus gunnari. Preliminary analysis of tandem repeats in icefish shows that the large
deletion in exon 3 may have resulted in the loss of at least one repeat.
Figure 7. Conservation of conserved non-coding elements CNE1 and CNE2 in Antarctic notothenioids relative to Gasterosteus
aculeatus and Danio rerio. Transcription factor binding sites were predicted using ConTra v2 (Broos et al. 2011) based on previously
identified key binding factors in both human hemogen and D. rerio. Relative to other Antarctic notothenioids and representative teleost
species, C. aceratus does not show significant deterioration of either CNE1 (A) or CNE2 (B) and still possesses many putative binding
sites for key co-factors such as p300, GATA1, and Sox9. Binding sites are colored as follows: Foxl2 (orange), GFI1 (light blue), KLF4
(bright green), HNF1 (pink), HOXB4 (light brown), MYB (cyan/lavender), P300 (grey), Sox9 (red), GATA1 (dark pink/dark orange).
Figure 8. hemogen exon 3 deletions in representative species from Channichthyidae relative
to a red-blooded notothenioid, and their predicted effects on transcription and translation.
The genetic lesion representing the most significant mutation to icefish hemogen takes three key
forms: a 90bp deletion, a 99bp deletion, and an 89bp deletion. (A) shows a representative
red-blooded notothenioid, Trematomus scotti, which does not possess any lesions in exon 3. (B)
shows the representative C. gunnari, which possesses the 90bp form of the deletion. Putative
translation shows that this would result in a 30AA deletion but would not produce a frameshift, so
the rest of the gene should be translated normally. Similarly, (C) shows that, with the 99bp
deletion of C. rastrospinosus, the protein would be somewhat shortened but still translated
normally in frame. However, the 89bp deletion first observed in C. aceratus (D) would result in a
frameshift leading to a premature stop codon and, consequently, a truncated protein. The
transcript possessing all exons is illustrated here in red; grey regions indicate sequence that
would be excluded from translation as a result of the premature stop codon. The Hemogen protein
structure is illustrated underneath each transcript to show which features would be only
partially translated, or not translated at all, because of the exon 3 deletions in icefish.
Functional domains are colored as follows: coiled-coil, green; nuclear localization signal, blue;
tandem repeats, yellow; acidic domain, red; light grey, no predicted domain.
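Whether a coding-sequence deletion preserves the reading frame depends on whether its length is a multiple of three. The short sketch below, offered as an illustration under that simple assumption, classifies the three deletion lengths named in this caption; the function name classify_deletion is hypothetical, and the check considers only length, not the exact position of the deletion relative to codon boundaries.

```python
def classify_deletion(length_bp):
    """Classify a coding-sequence deletion by its effect on the reading frame."""
    if length_bp % 3 == 0:
        return f"in-frame, removes {length_bp // 3} codons"
    return "frameshift"

# Deletion lengths (bp) and representative species as given in the Figure 8 caption
deletions = {"C. gunnari": 90, "C. rastrospinosus": 99, "C. aceratus": 89}

effects = {species: classify_deletion(bp) for species, bp in deletions.items()}
for species, effect in effects.items():
    print(f"{species}: {deletions[species]} bp -> {effect}")
```

The 90bp and 99bp forms remove 30 and 33 codons respectively and stay in frame, whereas the 89bp form shifts the frame.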
Figure 9. Variant forms of hemogen “exon 3” deletion mapped onto the Channichthyidae
species tree. The deletion forms are not distributed evenly throughout the tree and follow a loose
evolutionary pattern at best. The most common form of the deletion is the 90bp loss (green), which
may be the ancestral form of this lesion, as it is present throughout the entire tree. The second
most common is the 99bp deletion (blue), which evolved within the more recently speciated lineages
(< 4 Mya) and does not appear to be fully fixed in most species, as it frequently appears as a
second allele alongside the 90bp deletion. The 89bp deletion, which results in a frameshift,
follows no clear evolutionary pattern. This mutation appears to be fixed in species such as
C. aceratus, and surveying multiple individuals failed to detect any non-frameshifted hemogen
variants in these species. Tree topology was recreated using the mitochondrial tree of Near et al.
2006 and the putative species tree built by Dr. Jacob Daane [unpublished results].
Figure 10. hemogen indels in Antarctic notothenioids mapped onto a maximum parsimony
tree. Maximum parsimony tree was constructed in MEGA7 based on putative protein sequences
(see Methods). All indels marked appear in the coding sequence and would have an impact on
the protein composition. Purple triangles indicate a deletion event, while red triangles indicate an
insertion. Indels marked with a colored star are homoplastic and appear to have occurred
independently several times throughout Antarctic notothenioid diversification. However, the
indel marked with a green star is an artifact introduced by the parsimony construction process;
due to the high sequence similarity among icefish, the parsimony method cannot adequately
model correct species relationships in this clade.
Figure 11. Pairwise dN/dS comparisons plotting total dN/dS of whole Hemogen-encoding
sequence with the dN/dS values for the N-terminus and C-terminus of notothenioid
Hemogen, within families Nototheniidae (A & B) and Channichthyidae (C & D). This
demonstrates the relative changes in selective pressure of the N-terminus vs the C-terminus,
relative to the overall selective pressure on the whole Hemogen protein. Values were obtained by
conducting pairwise comparisons in the yn00 module of PAML 4 (see Methods). Each dot
represents a unique species pair and the calculated dN/dS values for the N-terminus, C-terminus,
and total coding sequence for that specific species comparison.
Figure 12. Pairwise dN/dS trends between families Nototheniidae and Channichthyidae,
plotting whole-Hemogen dN/dS vs the N-terminus (A) or C-terminus (B). This demonstrates
the relative changes in selective pressure of the N-terminus vs the C-terminus, relative to the
overall selective pressure on the whole Hemogen protein. Values were obtained by conducting
pairwise comparisons in the yn00 module of PAML 4 (see Methods). Each dot represents a
unique species pair and the calculated dN/dS values for the N-terminus, C-terminus, and total
coding sequence for that specific species comparison.
Figure 13. qPCR quantification of hemogen transcript variants in representative icefish
species C. aceratus and C. gunnari, comparing adult head kidney hemogen expression with
N. coriiceps adult head kidney for both hemgn-L and hemgn-s splice variants. Primers were
designed to capture two types of hemogen expression in icefish. One is the hemgn-s variant and
the other is referred to as “whole” hemogen. In icefish, this captures only the hemgn-L variant,
while in N. coriiceps this captures all transcripts (excluding a potential hemgn-s transcript).
When normalized to beta-actin expression, we see some hemogen expression in adult icefish,
particularly in the non-frameshifted species C. gunnari, but no amplification of hemgn-L in the
species with the frameshift indel, C. aceratus. In C. gunnari, hemgn-s is expressed at relatively
similar levels to hemgn-L, whereas in C. aceratus the variant hemgn-s is expressed at such low
levels that it is nearly undetectable. While it appears that N. coriiceps may also express hemgn-s
at low levels, sequencing of qPCR product shows this is off-target binding and not legitimate
amplification of the targeted splice variant. N. coriiceps does not express hemgn-s.
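Normalization to beta-actin under the comparative CT method (reference 89) amounts to expressing the target as 2 raised to the negative difference between the target and reference threshold cycles. The sketch below illustrates that calculation only; the Ct values and sample names are hypothetical placeholders, not data from this study, and the formula assumes roughly 100% amplification efficiency for both primer pairs.

```python
def relative_expression(ct_target, ct_reference):
    """Target expression relative to the reference gene: 2 ** -(Ct_target - Ct_reference)."""
    return 2.0 ** -(ct_target - ct_reference)

# Hypothetical Ct values for illustration only (not measurements from this study)
samples = {
    "sample_A": {"hemgn": 24.0, "bactin": 18.0},
    "sample_B": {"hemgn": 30.0, "bactin": 18.0},
}

expression = {
    name: relative_expression(ct["hemgn"], ct["bactin"])
    for name, ct in samples.items()
}
for name, value in expression.items():
    print(f"{name}: {value:.6f}")
```

A higher Ct for the target at the same reference Ct gives a lower relative expression; each additional cycle halves the value.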
Figure 14. Changes to the bipartite nuclear localization signal in icefish (Champsocephalus
gunnari) relative to red-blooded notothens (Notothenia coriiceps). The bipartite nuclear
localization signal consists of two clusters of positively charged amino acids (typically lysine
and arginine) separated by a spacer sequence. In C. gunnari and other icefish, the first portion of
the bipartite signal has been degraded at the first (R -> M) and fourth (R -> S) positions. This
would result in a decreased positive charge and negatively impact the nuclear localization
process.
REFERENCES
1. Peters, M.J., et al., Divergent Hemogen genes of teleosts and mammals share conserved
roles in erythropoiesis: analysis using transgenic and mutant zebrafish. Biol Open, 2018.
7(8).
2. Near, T.J., et al., Identification of the notothenioid sister lineage illuminates the
biogeographic history of an Antarctic adaptive radiation. BMC Evol Biol, 2015.
15(109): p. 1-14.
3. Scher, H.D. and E.E. Martin, Timing and Climatic Consequences of the Opening of
Drake Passage. Science, 2006. 312: p. 428-430.
4. Barker, P.F., et al., Onset and role of the Antarctic Circumpolar Current. Deep Sea
Research Part II: Topical Studies in Oceanography, 2007. 54(21-22): p. 2388-2398.
5. Chen, L., A. Devries, and C.H. Cheng, Evolution of antifreeze glycoprotein gene from a
trypsinogen gene in Antarctic notothenioid fish. Proc Natl Acad Sci U S A, 1997. 94: p.
3811-3816.
6. Cheng, C.H. and L. Chen, Evolution of an antifreeze glycoprotein: a blood protein that
keeps Antarctic fish from freezing arose from a digestive enzyme. Nature, 1999. 401: p.
443-444.
7. Near, T.J., et al., Ancient climate change, antifreeze, and the evolutionary diversification
of Antarctic fishes. Proc Natl Acad Sci U S A, 2012. 109(9): p. 3434-3439.
8. Shevenell, A.E., J.P. Kennett, and D.W. Lea, Middle Miocene Southern Ocean Cooling
and Antarctic Cryosphere Expansion. Science, 2004. 305: p. 1766-1770.
9. Clarke, A., D.K. Barnes, and D.A. Hodgson, How isolated is Antarctica? Trends Ecol
Evol, 2005. 20(1): p. 1-3.
10. Clarke, A. and I.A. Johnston, Evolution and adaptive radiation of Antarctic fishes.
Trends Ecol Evol, 1996. 11(5): p. 212-218.
11. Dornburg, A., et al., Cradles and museums of Antarctic teleost biodiversity. Nat Ecol
Evol, 2017. 1(9): p. 1379-1384.
12. Tripati, A.K., C.D. Roberts, and R.A. Eagle, Coupling of CO2 and ice sheet stability over
major climate transitions of the last 20 million years. Science, 2009. 326(5958): p. 1394-
7.
13. Pollard, D. and R.M. DeConto, Modelling West Antarctic ice sheet growth and collapse
through the past five million years. Nature, 2009. 458(7236): p. 329-32.
14. Thatje, S., et al., Life hung by a thread: endurance of Antarctic fauna in glacial periods.
Ecology, 2008. 89(3): p. 682-692.
15. Cheng, C.H. and H.W. Detrich, 3rd, Molecular ecophysiology of Antarctic notothenioid
fishes. Philos Trans R Soc Lond B Biol Sci, 2007. 362(1488): p. 2215-32.
16. Johns, G.C. and J.C. Avise, Tests for ancient species flocks based on molecular
phylogenetic appraisals of Sebastes rockfishes and other marine fishes. Evolution, 1998.
52(4): p. 1135-1146.
17. Eastman, J.T., The nature of the diversity of Antarctic fishes. Polar Biology, 2005. 28(2):
p. 93-107.
18. Schluter, D., The Ecology of Adaptive Radiation. 2000, Oxford: OUP.
19. Rutschmann, S., et al., Parallel ecological diversification in Antarctic notothenioid fishes
as evidence for adaptive radiation. Mol Ecol, 2011. 20(22): p. 4707-21.
20. Ruud, J.T., Vertebrates without erythrocytes and blood pigment. Nature, 1954. 173: p.
848-850.
21. Eastman, J.T., Antarctic fish biology: evolution in a unique environment. 1993: Academic
Press.
22. Wells, R.M.G., et al., Comparative study of the erythrocytes and haemoglobins in
nototheniid fishes from Antarctica. Journal of Fish Biology, 1980. 17(5): p. 517-527.
23. Wells, R.M.G., J.A. Macdonald, and G. di Prisco, Thin-blooded Antarctic fishes- a
rheological comparison of the haemoglobin-free icefishes Chionodraco kathleenae and
Cryodraco antarcticus with a red-blooded nototheniid, Pagothenia bernacchii. Journal of
Fish Biology, 1990. 36(4): p. 595-609.
24. Macdonald, J.A. and R.M.G. Wells, Viscosity of Body Fluids From Antarctic
Notothenioid Fish, in Biology of Antarctic Fish, G. Di Prisco, B. Maresca, and B. Tota,
Editors. 1991, Springer-Verlag: Berlin. p. 163-178.
25. Cocca, E., et al., Genomic remnants of alpha-globin genes in the hemoglobinless
antarctic icefishes. Proc Natl Acad Sci U S A, 1995. 92: p. 1817-1821.
26. di Prisco, G., Molecular Adaptations of Antarctic Fish Hemoglobins. 1998: p. 339-353.
27. di Prisco, G., et al., Biogeography and adaptation of Notothenioid fish: hemoglobin
function and globin-gene evolution. Gene, 2007. 398(1-2): p. 143-55.
28. Xu, Q., et al., Adaptive evolution of hepcidin genes in antarctic notothenioid fishes. Mol
Biol Evol, 2008. 25(6): p. 1099-112.
29. Beers, J.M. and N. Jayasundara, Antarctic notothenioid fish: what are the future
consequences of 'losses' and 'gains' acquired during long-term evolution at cold and
stable temperatures? J Exp Biol, 2015. 218(Pt 12): p. 1834-1845.
30. Beers, J.M., K.A. Borley, and B.D. Sidell, Relationship among circulating hemoglobin,
nitric oxide synthase activities and angiogenic poise in red- and white-blooded Antarctic
notothenioid fishes. Comp Biochem Physiol A Mol Integr Physiol, 2010. 156(4): p. 422-
9.
31. di Prisco, G., J.A. MacDonald, and M. Brunori, Antarctic fishes survive exposure to
carbon monoxide. Experientia, 1992. 48(5): p. 473-475.
32. Cocca, E., et al., Do the hemoglobinless icefishes have globin genes? Comp Biochem
Physiol, 1997. 118A(4): p. 1027-1030.
33. Zhao, Y., et al., The Major Adult α-Globin Gene of Antarctic Teleosts and Its Remnants in
the Hemoglobinless Icefishes: Calibration of the Mutational Clock for
Nuclear Genes. Journal of Biological Chemistry, 1998. 273(24): p. 14745-14752.
34. Near, T.J., S.K. Parker, and H.W. Detrich, 3rd, A genomic fossil reveals key steps in
hemoglobin loss by the antarctic icefishes. Mol Biol Evol, 2006. 23(11): p. 2008-16.
35. Barber, D.L., The blood cells of the Antarctic icefish Chaenocephalus aceratus
Lönnberg: light and electron microscopic observations. Journal of Fish Biology, 1981.
19(1): p. 11-28.
36. Sidell, B.D. and K.M. O'Brien, When bad things happen to good fish: the loss of
hemoglobin and myoglobin expression in Antarctic icefishes. J Exp Biol, 2006. 209(Pt
10): p. 1791-802.
37. Lau, Y.T., et al., Evolution and function of the globin intergenic regulatory regions of the
antarctic dragonfishes (Notothenioidei: Bathydraconidae). Mol Biol Evol, 2012. 29(3):
p. 1071-80.
38. Yang, L.V., et al., Hemogen is a novel nuclear factor specifically expressed in mouse
hematopoietic development and its human homologue EDAG maps to chromosome 9q22,
a region containing breakpoints of hematological neoplasms. Mechanisms of
Development, 2001. 104: p. 105-111.
39. Li, C.Y., et al., EDAG regulates the proliferation and differentiation of hematopoietic
cells and resists cell apoptosis through the activation of nuclear factor-κB. Cell Death
and Differentiation, 2004. 11: p. 1299-1308.
40. Li, C.-Y., et al., Suppression of EDAG gene expression by phorbol 12-myristate 13-
acetate is mediated through down-regulation of GATA-1. Biochimica et Biophysica
Acta, 2008. 1779: p. 606-615.
41. Li, C.-Y., et al., Overexpression of a hematopoietic transcriptional regulator EDAG
induces myelopoiesis and suppresses lymphopoiesis in transgenic mice. Leukemia, 2007.
21: p. 2277-2286.
42. Jiang, J., et al., Hemgn is a direct transcriptional target of HOXB4 and induces
expansion of murine myeloid progenitor cells. Blood, 2010. 116(5): p. 711-719.
43. Ding, Y.L., et al., Over-expression of EDAG in the myeloid cell line 32D: induction of
GATA-1 expression and erythroid/megakaryocytic phenotype. J Cell Biochem, 2010.
110(4): p. 866-74.
44. An, L.-L., et al., High expression of EDAG and its significance in AML. Leukemia, 2005.
19: p. 1499-1502.
45. Yang, L.V., et al., Alternative promoters and polyadenylation regulate tissue-specific
expression of Hemogen isoforms during hematopoiesis and spermatogenesis. Dev Dyn,
2003. 228(4): p. 606-16.
46. Nakata, T., et al., Chicken hemogen homolog is involved in the chicken-specific sex-
determining mechanism. PNAS, 2013. 110(9): p. 3417-3422.
47. Kruger, A., et al., RP59, a marker for osteoblast recruitment, is also detected in primitive
mesenchymal cells, erythroid cells, and megakaryocytes. Dev Dyn, 2002. 223(3): p. 414-
8.
48. Wurtz, T., et al., A new protein expressed in bone marrow cells and osteoblasts with
implication in osteoblast recruitment. Exp Cell Res, 2001. 263(2): p. 236-42.
49. Shao, J., et al., Sequencing and bioinformatics analysis of the differentially expressed
genes in herniated discs with or without calcification. Int J Mol Med, 2017. 39(1): p. 81-
90.
50. Iwasaki, H., et al., GATA-1 Converts Lymphoid and Myelomonocytic Progenitors into the
Megakaryocyte/Erythrocyte Lineages. Immunity, 2003. 19: p. 451-462.
51. Pevny, L., et al., Development of hematopoietic cells lacking transcription factor GATA-
1. Development, 1995. 121: p. 163-172.
52. Galloway, J.L., et al., Loss of gata1 but not gata2 converts erythropoiesis to myelopoiesis
in zebrafish embryos. Dev Cell, 2005. 8(1): p. 109-16.
53. Belele, C.L., et al., Differential requirement for Gata1 DNA binding and transactivation
between primitive and definitive stages of hematopoiesis in zebrafish. Blood, 2009.
114(25): p. 5162-72.
54. Lyons, S.E., et al., A nonsense mutation in zebrafish gata1 causes the bloodless
phenotype in vlad tepes. Proc Natl Acad Sci U S A, 2002. 99(8): p. 5454-9.
55. Zheng, W.W., et al., EDAG positively regulates erythroid differentiation and modifies
GATA1 acetylation through recruiting p300. Stem Cells, 2014. 32(8): p. 2278-89.
56. Blobel, G.A., CREB-binding protein and p300: molecular integrators of hematopoietic
transcription. Blood, 2000. 95(3): p. 745-755.
57. Sandberg, M.L., et al., c-Myb and p300 regulate hematopoietic stem cell proliferation
and differentiation. Dev Cell, 2005. 8(2): p. 153-66.
58. Edgar, R., MUSCLE: multiple sequence alignment with high accuracy and high
throughput. Nucleic Acids Res, 2004. 32(5): p. 1792-1797.
59. Kumar, S., G. Stecher, and K. Tamura, MEGA7: Molecular Evolutionary Genetics
Analysis Version 7.0 for Bigger Datasets. Mol Biol Evol, 2016. 33(7): p. 1870-4.
60. Hall, T.A., BioEdit: a user-friendly biological sequence alignment editor and analysis
program for Windows 95/98/NT. Nucleic Acids Symposium Series, 1999. 41: p. 95-98.
61. Stamatakis, A., RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with
thousands of taxa and mixed models. Bioinformatics, 2006. 22(21): p. 2688-90.
62. Stamatakis, A., RAxML Version 8: A tool for Phylogenetic Analysis and Post-Analysis of
Large Phylogenies. Bioinformatics, 2014.
63. Yang, Z., PAML: a program package for phylogenetic analysis by maximum likelihood.
Comput Appl Biosci, 1997. 13(5): p. 555-556.
64. Yang, Z., PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol, 2007.
24(8): p. 1586-91.
65. Zhang, J., R. Nielsen, and Z. Yang, Evaluation of an improved branch-site likelihood
method for detecting positive selection at the molecular level. Mol Biol Evol, 2005.
22(12): p. 2472-9.
66. Burri, R., et al., Adaptive divergence of ancient gene duplicates in the avian MHC class II
beta. Mol Biol Evol, 2010. 27(10): p. 2360-74.
67. Yang, Z. and R. Nielsen, Codon-Substitution Models for Detecting Molecular Adaptation
at Individual Sites Along Specific Lineages. Mol Biol Evol, 2002. 19(6): p. 908-917.
68. Yang, Z. and W.J. Swanson, Codon-Substitution Models to Detect Adaptive Evolution that
Account for Heterogeneous Selective Pressures Among Site Classes. Mol Biol Evol, 2002.
19(1): p. 49-57.
69. Wertheim, J.O., et al., RELAX: detecting relaxed selection in a phylogenetic framework.
Mol Biol Evol, 2015. 32(3): p. 820-32.
70. Pond, S.L.K. and S.V. Muse, HyPhy: Hypothesis Testing Using Phylogenies, in
Statistical methods in molecular evolution. 2005, Springer: New York, NY. p. 125-181.
71. Delport, W., et al., Datamonkey 2010: a suite of phylogenetic analysis tools for
evolutionary biology. Bioinformatics, 2010. 26(19): p. 2455-7.
72. Weaver, S., et al., Datamonkey 2.0: a modern web application for characterizing
selective and other evolutionary processes. Mol Biol Evol, 2018.
73. Shin, S.C., et al., The genome sequence of the Antarctic bullhead notothen reveals
evolutionary adaptations to a cold environment. Genome Biol, 2014. 15(468).
74. Kearse, M., et al., Geneious Basic: an integrated and extendable desktop software
platform for the organization and analysis of sequence data. Bioinformatics, 2012.
28(12): p. 1647-9.
75. Zerbino, D.R., et al., Ensembl 2018. Nucleic Acids Res, 2018. 46(D1): p. D754-D761.
76. Broos, S., et al., ConTra v2: a tool to identify transcription factor binding sites across
species, update 2011. Nucleic Acids Res, 2011. 39(Web Server issue): p. W74-8.
77. Nei, M. and S. Kumar, Molecular Evolution and Phylogenetics. 2000, Oxford: Oxford
University Press.
78. Yu, G., et al., ggtree: an R package for visualization and annotation of phylogenetic trees
with their covariates and other associated data. Methods in Ecology and Evolution,
2017. 8(1): p. 28-36.
79. Letunic, I. and P. Bork, Interactive tree of life (iTOL) v3: an online tool for the display
and annotation of phylogenetic and other trees. Nucleic Acids Res, 2016. 44(W1): p.
W242-5.
80. Near, T.J. and C.H. Cheng, Phylogenetics of notothenioid fishes (Teleostei:
Acanthomorpha): inferences from mitochondrial and nuclear gene sequences. Mol
Phylogenet Evol, 2008. 47(2): p. 832-40.
81. Near, T.J., J.J. Pesavento, and C.-H.C. Cheng, Mitochondrial DNA, morphology, and the
phylogenetic relationships of Antarctic icefishes (Notothenioidei: Channichthyidae).
Molecular Phylogenetics and Evolution, 2003. 28(1): p. 87-98.
82. Yang, Z. and R. Nielsen, Estimating Synonymous and Nonsynonymous Substitution Rates
Under Realistic Evolutionary Models. Mol Biol Evol, 2000. 17(1): p. 32-43.
83. Librado, P. and J. Rozas, DnaSP v5: a software for comprehensive analysis of DNA
polymorphism data. Bioinformatics, 2009. 25(11): p. 1451-2.
84. Rozas, J., et al., DnaSP, DNA polymorphism analyses by the coalescent and other
methods. Bioinformatics, 2003. 19(18): p. 2496-2497.
85. Sharp, P.M. and W.-H. Li, The codon adaptation index- a measure of directional
synonymous codon usage bias, and its potential applications. Nucleic Acids Res, 1987.
15(3): p. 1281-1295.
86. Morton, B.R., Codon Use and the Rate of Divergence of Land Plant Chloroplast Genes.
Mol Biol Evol, 1994. 11(2): p. 231-238.
87. Sugawara, T., Y. Terai, and N. Okada, Natural Selection of the Rhodopsin Gene During
the Adaptive Radiation of East African Great Lakes Cichlid Fishes. Mol Biol Evol, 2002.
19(10): p. 1807-1811.
88. Ota, T., et al., Positive Darwinian Selection Operating on the Immunoglobulin Heavy
Chain of Antarctic Fishes. Journal of Experimental Zoology (Mol Dev Evol), 2003.
295B: p. 45-58.
89. Schmittgen, T.D. and K.J. Livak, Analyzing real-time PCR data by the comparative CT
method. Nature Protocols, 2008. 3(6): p. 1101-1108.
90. Gao, P., Functional Study of Hemogen Knockout Mouse Model. Theses and Dissertations
(ETD), 2013. Paper 92.
91. Boulikas, T., Putative Nuclear Localization Signals (NLS) in Protein Transcription
Factors. J Cell Biochem, 1994. 55: p. 32-58.
92. Lahti, D.C., et al., Relaxed selection in the wild. Trends Ecol Evol, 2009. 24(9): p. 487-
96.
93. Hunt, B.G., et al., Relaxed selection is a precursor to the evolution of phenotypic
plasticity. PNAS, 2011. 108(38): p. 15936-15941.
94. Go, Y., et al., Lineage-specific loss of function of bitter taste receptor genes in humans
and nonhuman primates. Genetics, 2005. 170(1): p. 313-26.
95. Murrell, B., et al., Detecting individual sites subject to episodic diversifying selection.
PLoS Genet, 2012. 8(7): p. e1002764.
96. Yang, Z., W.S. Wong, and R. Nielsen, Bayes empirical Bayes inference of amino acid
sites under positive selection. Mol Biol Evol, 2005. 22(4): p. 1107-18.
97. Anisimova, M., J.P. Bielawski, and Z. Yang, Accuracy and power of the likelihood ratio
test in detecting adaptive molecular evolution. Mol Biol Evol, 2001. 18(8): p. 1585-1592.
98. Yang, Z., Adaptive molecular evolution, in Handbook of statistical genetics, D.J.
Balding, M. Bishop, and C. Cannings, Editors. 2001, Wiley: New York. p. 327-350.
99. Anisimova, M., J.P. Bielawski, and Z. Yang, Accuracy and power of Bayes prediction of
amino acid sites under positive selection. Mol Biol Evol, 2002. 19(6): p. 950-958.
100. Yang, Z. and J.P. Bielawski, Statistical methods for detecting molecular adaptation.
Trends Ecol Evol, 2000. 15(12): p. 496-503.
101. Yang, Z., Inference of selection from multiple species alignments. Current Opinion in
Genetics and Development, 2002. 12: p. 688-694.
102. Bielawski, J.P. and Z. Yang, Maximum likelihood methods for detecting adaptive
evolution after gene duplication, in Genome Evolution, A. Meyer and Y. Van de Peer,
Editors. 2003, Kluwer Academic Publishers: Netherlands. p. 201-212.
103. Zhao, H., et al., Rhodopsin molecular evolution in mammals inhabiting low light
environments. PLoS One, 2009. 4(12): p. e8326.
104. Veilleux, C.C., E.E. Louis, Jr., and D.A. Bolnick, Nocturnal light environments influence
color vision and signatures of selection on the OPN1SW opsin gene in nocturnal lemurs.
Mol Biol Evol, 2013. 30(6): p. 1420-37.
105. Markova, S., J.B. Searle, and P. Kotlik, Relaxed functional constraints on triplicate
alpha-globin gene in the bank vole suggest a different evolutionary history from other
rodents. Heredity (Edinb), 2014. 113(1): p. 64-73.
106. Feng, P., et al., Massive losses of taste receptor genes in toothed and baleen whales.
Genome Biol Evol, 2014. 6(6): p. 1254-65.
107. Dainat, J., et al., GLADX: an automated approach to analyze the lineage-specific loss
and pseudogenization of genes. PLoS One, 2012. 7(6): p. e38792.
108. Ohno, S., Evolution by gene duplication. 1970, New York: Allen & Unwin.
109. Lynch, M. and J.S. Conery, The Evolutionary Fate and Consequences of Duplicate
Genes. Science, 2000. 290(5494): p. 1151-1155.
110. Christoffels, A., et al., Fugu genome analysis provides evidence for a whole-genome
duplication early during the evolution of ray-finned fishes. Mol Biol Evol, 2004. 21(6): p.
1146-51.
111. Hoegg, S., et al., Phylogenetic timing of the fish-specific genome duplication correlates
with the diversification of teleost fish. J Mol Evol, 2004. 59(2): p. 190-203.
112. Postlethwait, J.H., et al., Zebrafish comparative genomics and the origins of vertebrate
chromosomes. Genome Res, 2000. 10(12): p. 1890-1902.
113. Brunet, F.G., et al., Gene loss and evolutionary rates following whole-genome
duplication in teleost fishes. Mol Biol Evol, 2006. 23(9): p. 1808-16.
114. Glasauer, S.M.K. and S.C.F. Neuhauss, Whole-genome duplication in teleost fishes and
its evolutionary consequences. Mol Genet Genomics, 2014. 289: p. 1045-1060.
115. Rastogi, S. and D.A. Liberles, Subfunctionalization of duplicated genes as a transition
state to neofunctionalization. BMC Evol Biol, 2005. 5: p. 28.
116. Amoutzias, G.D., et al., Posttranslational regulation impacts the fate of duplicated genes.
Proc Natl Acad Sci U S A, 2010. 107(7): p. 2967-71.
117. Qian, W., et al., Maintenance of duplicate genes and their functional redundancy by
reduced expression. Trends in Genetics, 2010. 26(10).
118. Amores, A., et al., Cold Fusion: Massive Karyotype Evolution in the Antarctic Bullhead
Notothen Notothenia coriiceps. G3 (Bethesda), 2017. 7(7): p. 2195-2207.
119. Chen, D.L., et al., EDAG-1 promotes proliferation and invasion of human thyroid cancer
cells by activating MAPK/Erk and AKT signal pathways. Cancer Biol Ther, 2016. 17(4):
p. 414-21.
120. Lü, J., et al., Overexpression of EDAG-1 in NIH3T3 cells leads to malignant
transformation. Sheng wu hua xue yu sheng wu wu li xue bao (Acta Biochimica et
Biophysica Sinica), 2002. 34(1): p. 95-98.
121. O'Leary, N.A., et al., Reference sequence (RefSeq) database at NCBI: current status,
taxonomic expansion, and functional annotation. Nucleic Acids Res, 2016. 44(D1): p.
D733-45.
122. Jones, F.C., et al., The genomic basis of adaptive evolution in threespine sticklebacks.
Nature, 2012. 484(7392): p. 55-61.
123. Betancur, R.R., et al., The tree of life and a new classification of bony fishes. PLoS Curr,
2013. 5.
124. Read, T.D., et al., Draft sequencing and assembly of the genome of the world's largest
fish, the whale shark: Rhincodon typus Smith 1828. BMC Genomics, 2017. 18(1): p. 532.
125. Venkatesh, B., et al., Elephant shark genome provides unique insights into gnathostome
evolution. Nature, 2014. 505(7482): p. 174-9.
126. Arnaud, L., et al., A dominant mutation in the gene encoding the erythroid transcription
factor KLF1 causes a congenital dyserythropoietic anemia. Am J Hum Genet, 2010.
87(5): p. 721-7.
127. Devlin, E.E., et al., A transgenic mouse model demonstrates a dominant negative effect of
a point mutation in the RPS19 gene associated with Diamond-Blackfan anemia. Blood,
2010. 116(15): p. 2826-35.
128. Paz-Priel, I. and A.D. Friedman, C/EBPα Dysregulation in AML and ALL. Crit Rev
Oncog, 2011. 16(1-2): p. 93-102.
129. Yang, L.V., et al., The GATA site-dependent hemogen promoter is transcriptionally
regulated by GATA1 in hematopoietic and leukemia cells. Leukemia, 2006. 20(3): p. 417-
25.
130. Gardiner, M.R., et al., A global role for zebrafish klf4 in embryonic erythropoiesis. Mech
Dev, 2007. 124(9-10): p. 762-74.
131. Soza-Ried, C., et al., Essential role of c-myb in definitive hematopoiesis is evolutionarily
conserved. PNAS, 2010. 107(40): p. 17304-17308.
132. Takahashi, K., et al., Phylogenetic relationships and ancient incomplete lineage sorting
among cichlid fishes in Lake Tanganyika as revealed by analysis of the insertion of
retroposons. Mol Biol Evol, 2001. 18(11): p. 2056-2066.
133. Marino, I.A., et al., Evidence for past and present hybridization in three Antarctic icefish
species provides new perspectives on an evolutionary radiation. Mol Ecol, 2013. 22(20):
p. 5148-61.