Temperature-Dependent Patterns of Gene Expression in Caenorhabditis briggsae
by
Stephanie Mark
A thesis submitted in conformity with the requirements for the degree of Master of Science
Department of Ecology and Evolutionary Biology University of Toronto
List of Tables
Table 1. Number of raw and cleaned reads in fastq files from Genome Quebec.
Table 2. Number and percentage of reads that mapped to unique locations (i.e. one location in the genome) with STAR.
Table 3. Values for each soft threshold power that was tested.
Table 4. Table of G-test p-values to test whether the proportion of differentially expressed genes in a module differed significantly from genome-wide proportions
Table 5. G-test p-values from a test to determine whether the proportion of genes located in chromosome arms or centres differed from expected proportions for each differential expression group
Table 6. G-test p-values from a test to determine whether the proportion of genes located in autosome arms or centres differed from expected proportions for each of the 22 modules identified by co-expression clustering
Table 7. Table of G-test p-values for tests of whether the proportions of genes on autosomes and the X-chromosome were significantly different from expectations for each differential expression group
Table 8. Table of G-test p-values for tests of whether the proportions of genes on autosomes and the X-chromosome were significantly different from expectations for each co-expression module
List of Figures
Figure 1. Number of reads from fastq files for each data file
Figure 2. Distribution of intron lengths in C. briggsae reference genome
Figure 3. Ratio of average number of uniquely mapped reads in Tropical and Temperate Genotypes
Figure 4. Percentage of uniquely mapped reads with STAR per biological replicate
Figure 5. Reads counted by htseq-count per replicate
Figure 6. Multi-dimensional scaling plot (MDS) of filtered, normalized, and log transformed count data
Figure 7. Distributions of p-values from preliminary tests
Figure 8. Analysis for differential expression with edgeR versus limma
Figure 9. Quantile-quantile plot for normalized, voom-transformed count data
Figure 10. Dendrogram and heatmap of normalized count data
Figure 11. Fit of scale-free topology generated by soft-thresholding powers
Figure 12. Mean connectivity of soft-thresholding powers
Figure 13. Dendrogram of initial 124 modules from co-expression clustering
Figure 14. Similarity heatmap of initial 124 modules from co-expression clustering
Figure 15. Dendrogram of merged co-expression clusters
Figure 16. Heatmap of merged co-expression clusters
Figure 17. Numbers of differentially expressed genes
Figure 18. Proportions of genes expressed under chronic cold versus heat stress
Figure 19. Proportions of genes that increase or decrease expression under chronic cold versus heat stress
Figure 20. Magnitude of change in expression under chronic cold versus heat stress
Figure 21. Distribution of differential expression groups within co-expression modules
Figure 22. Proportions of differentially expressed genes within co-expression modules
Figure 23. Module eigengene expression plots for Genotype modules
Figure 24. Module eigengene expression plots for Temperature modules
Figure 25. Module eigengene expression plots for GxT modules
Figure 26. Module eigengene expression plots for genes with no differential expression
Figure 27. Proportion of differentially expressed genes in arm versus centre domains of autosomes
Figure 28. Proportion of differentially expressed genes in arm versus centre domains of the X-chromosome
Figure 29. Proportion of module genes in arm versus centre domains of autosomes
Figure 30. Proportion of module genes in arm versus centre domains of X-chromosome
Figure 31. Proportion of genes on autosomes and the X-chromosome for each differential expression group
Figure 32. Proportion of genes on autosomes and the X-chromosome for each co-expression module
temperature results in complex instabilities, it could explain my observations of relatively
consistent responses to cold in both Temperate and Tropical phenotypes versus the varied
responses to heat.
In spite of Temperate and Tropical strains showing similar expression patterns under cold
stress and distinct patterns under heat stress, I observed phenotypic differences at both
temperature extremes, suggesting that expression is not simply the result of temperature-
dependent enzyme activity. At 14°C, the Temperate genotype has higher fecundity than
the Tropical genotype whereas at 30°C, the Tropical genotype is more fecund than the
Temperate (Prasad et al. 2011). This suggests that the relationship between gene
expression and phenotype at the organism level is not straightforward and that certain
genes influence fitness differently depending on the genotype in which they are expressed. A
possible explanation for this complex relationship is a scenario in which different alleles
fix in different populations because they are each beneficial in their local environments
but have either no effect or negative effects in the other, known as conditional neutrality
and antagonistic pleiotropy, respectively (Anderson et al. 2013). A meta-analysis of QTL
studies in A. thaliana found that antagonistic pleiotropy underlies at least 60% of
instances of GxE (Des Marais et al. 2013), suggesting that antagonistic pleiotropy is very
common, and could be responsible for the pattern observed in our Temperate and
Tropical genotypes.
Chromosomal domains and differentially expressed genes
Given the well-described structure of C. briggsae chromosomes, I wanted to determine
whether there was a relationship between gene expression pattern and physical location
on chromosomes. Like C. elegans, chromosomes in C. briggsae lack centromeres and are
organized into clear domains that are defined by distinct rates of recombination
(Rockman and Kruglyak 2009, Ross et al. 2011). Centre domains have relatively low
rates of recombination while arm domains have much higher rates of recombination
(Cutter and Choi 2010). Arm domains also are the most genetically variable regions, in
terms of both functional and silent site diversity, whereas centre domains have the highest
gene density and lower polymorphism (Thomas et al. 2015). These patterns suggest that
if sequence differences drive differential expression, then the expression of genes that are
primarily located in chromosome centres is likely regulated by trans-acting factors.
Conversely, genes that are differentially expressed that are located in arm regions could
potentially be regulated in cis, given the higher polymorphism in these domains.
Furthermore, genes whose expression is more consistent across environments, or whose
expression changes in a similar manner across environments tend to be cis-regulated
(Smith and Kruglyak 2008) whereas genes whose expression is more variable across
different environments tend to be trans-regulated in C. elegans and in yeast (Li et al.
2006, Smith and Kruglyak 2008). Taken together, these observations suggest the
hypothesis that genes identified as having a significant effect of genotype are more likely
to be cis-regulated and located in arm domains. Additionally, I would expect to see most
Temperature genes in centre regions because they have similar responses in both
genotypes and therefore should have fewer cis-acting regulatory polymorphisms. Finally,
given that variable responses to the environment, especially those that constitute a change
in direction of expression between genotypes (i.e. increase in one and decrease in the
other), are regulated by trans-acting factors (Li et al. 2006, Smith and Kruglyak 2008), I
would expect that most GxT genes would be located in the centres of chromosomes.
Consistent with this hypothesis, my analysis revealed that, on the whole, autosomal genes
that were differentially expressed due to genotype were overrepresented in arm domains
by almost 20%. Contrary to my expectations, GxT genes were also enriched in autosome
arm domains, albeit to a lesser extent (1.04-fold enrichment). Interestingly, G&T genes
were enriched in autosome centres.
The unexpected enrichment of GxT genes in autosome arms may be the result of
considering all GxT genes together, regardless of their expression patterns. For example,
whereas differential expression in Genotype genes can have only two patterns (i.e. higher
expression in Tropical than Temperate or vice versa), numerous distinct expression
profiles can each produce a significant interaction between genotype and temperature.
Indeed, when arm versus centre enrichment for GxT genes was examined separately for each
co-expression module, the results were easier to explain. They agree with the observation
that the expression of cis-regulated genes tends to change less across environments, whereas
the expression of trans-regulated genes tends to change more across environments and to show
the crossing reaction norms characteristic of GxE in local adaptation (Smith and Kruglyak
2008, Kawecki and Ebert 2004). For instance, Modules 1, 9, and 16 were
enriched in autosome arms while Modules 3 and 14 were enriched in the centre. Module
eigengenes for the modules enriched in the arms could be interpreted as having regions in
which expression stays relatively consistent, a feature of cis-regulated genes (Smith and
Kruglyak 2008). For example, gene expression in Modules 1 and 16 does not change
drastically between 14°C and 20°C and expression in Module 9 is relatively unchanged
between 20°C and 30°C. Expression patterns between these temperatures are similar to
the expression profiles of Genotype genes, which were also enriched in arm domains. In
contrast, module eigengenes for modules enriched in the centres show
more drastic changes in expression, consistent with genes regulated in trans (Smith and
Kruglyak 2008). For instance, in Module 14, expression is opposite in the genotypes,
showing the most drastically different expression pattern between Temperate and
Tropical.
The enrichment of G&T genes in autosome centres can also be interpreted as conforming
to expectations. Although these genes have significant effects of Genotype and
Temperature on their expression, the effects of these factors are independent. When G&T
genes are considered as genes that are significantly differentially expressed due to
temperature, it is expected that they be primarily trans-regulated and consequently that
they be enriched in chromosome centres, where polymorphism is low. The difference in
expression between the genotypes that is maintained across temperatures could be caused
by a polymorphism in the promoter of a trans-acting factor that regulates expression in
the same way in both the Temperate and Tropical genotypes.
Less enrichment of gene groups on the X chromosome overall is unsurprising under the
assumption that differences in recombination rates drive differences in gene expression
and function. For example, genes that are specific to germline function are absent on the
X chromosome (Reinke et al. 2000). Gene density and recombination rates are both more
uniform across domains on the X chromosome (Hillier et al. 2007, Andersen et al. 2012),
suggesting that differential expression groups would not be preferentially located in
either the arms or the centre. At the module level, three modules (1, 7, 12) were enriched
in X chromosome domains. However, Modules 1 and 12 show the same pattern of
enrichment in the autosomes, indicating that these two modules are overrepresented in
the arms in all chromosomes. Only Module 7, a Genotype module, was unique in being
enriched in the centre on the X. While it would be unexpected that a Genotype module be
enriched in the centre on an autosome, it is less surprising on the X chromosome where
the domains are less distinct.
My results are therefore consistent with the idea that higher nucleotide polymorphism
creates more functional variation in the form of genetic variation for gene expression.
More enrichment in distinct chromosomal domains was observed in the autosomes,
where regions of polymorphism are most pronounced. Given that these patterns of
enrichment are also consistent with expectations for genes that are regulated by cis- and
trans-acting factors, it would be interesting to further explore the question of gene
regulation with this dataset.
It would be possible to gain insight into the potential cis- and trans-regulation of these
differentially expressed genes by integrating data from my analysis with SNP
information. Using SNP variant data for the Temperate genotype (the C. briggsae
reference genome is based on the Tropical genotype), the SNP density could be
quantified for upstream regions of differentially expressed genes. If the expression of
genotype genes is primarily regulated in cis, then the density of SNPs in promoter regions
ought to be higher than expected. This information would be particularly interesting
given that the Temperate and Tropical genotypes are considered by some researchers to
represent the early stages of speciation (Baird and Stonesifer 2012, Abbott et al. 2013,
Chang et al. 2016). Interspecific differences in gene expression are caused primarily by
differences in cis-regulatory regions whereas intraspecies differences tend to be driven by
trans-regulatory regions (Wittkopp et al. 2008, Tirosh et al. 2009). If genotype genes
have higher SNP density in promoter regions, they could represent genes that are
contributing to expression differences between incipient species.
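The SNP density calculation proposed above could be sketched roughly as follows. This is a minimal Python illustration and not part of the analysis in this thesis: the function names, the 1000 bp upstream window, and the SNP positions are all hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from bisect import bisect_left, bisect_right


def upstream_window(start, end, strand, length=1000):
    """Return (lo, hi), the inclusive window upstream of a gene.

    Upstream means before `start` on the '+' strand and after `end` on the
    '-' strand. The 1000 bp default is an arbitrary, illustrative choice.
    """
    if strand == "+":
        return max(1, start - length), start - 1
    return end + 1, end + length


def snp_density(snp_positions, start, end, strand, length=1000):
    """SNPs per bp in the upstream window; `snp_positions` must be sorted."""
    lo, hi = upstream_window(start, end, strand, length)
    if hi < lo:  # gene begins at position 1, so there is no upstream sequence
        return 0.0
    n_snps = bisect_right(snp_positions, hi) - bisect_left(snp_positions, lo)
    return n_snps / (hi - lo + 1)


# Hypothetical SNP positions on one chromosome (not real data).
snps = [120, 480, 950, 1010, 2500, 2990]
plus_gene = snp_density(snps, start=1001, end=2000, strand="+")
minus_gene = snp_density(snps, start=1001, end=2000, strand="-")
print(plus_gene, minus_gene)
```

Densities computed this way for each differential expression group could then be compared with the density for all genes tested, which is the comparison the paragraph above describes.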
Small RNAs and temperature-sensitive regulation of gene expression
Small RNA sequencing data were also collected at the same time as the mRNA data used
in this study, with the aim of shedding light on the mechanisms of gene expression
regulation in response to chronic temperature stress. Small RNAs are encoded within the
genome and bind to mRNA targets after being transcribed themselves (Claycomb 2012).
Typically, small RNAs silence their target genes by cleaving or binding to transcripts to
prevent subsequent translation into proteins. Certain small RNAs are also sensitive to
temperature, particularly those involved in the maintenance of fertility, such as Piwi-
interacting small RNAs (piRNAs) (Conine et al. 2009, Batista et al. 2008). piRNAs target
foreign genetic sequences such as transgenes and transposons in the germline during
development and are more active at increased temperatures in C. elegans (Batista et al.
2008, Lee et al. 2012). Similar piRNAs are found in many species within Caenorhabditis,
including C. briggsae (Shi et al. 2013, Tu et al. 2015). Given that most genotype-specific
responses are observed under heat stress, it is possible that post-transcriptional regulation
by piRNAs provides another pathway through which different expression patterns are
produced between the Temperate and Tropical genotypes. Future analyses that relate
small RNA expression to the patterns of mRNA expression revealed through this analysis
could help explain genotype-specific gene regulation, particularly under chronic heat
stress.
Conclusion
My analysis of temperature-dependent patterns of gene expression in Temperate and
Tropical populations of C. briggsae revealed several surprising results. For example, both
my differential expression and my co-expression clustering analyses showed that the
response to rearing under cold stress was qualitatively different from the response to
rearing under heat stress. A small proportion of genes responded to both cold stress and
heat stress whereas most Temperature genes responded to cold stress only and most GxT
genes responded to heat stress only. Genotype-specific responses to temperature were
also relatively common throughout the genome, occurring in 30% of all genes tested.
Visualization of module eigengenes of co-expression clusters corroborated my
differential expression results and revealed that the majority of genes that have GxT
responses showed genotype-specific expression only in response to heat stress.
Expression changed in the same direction for both Temperate and Tropical genotypes but
under heat stress, the expression change in the Temperate genotype was more drastic
when compared to that of the Tropical genotype. Finally, the enrichment of Genotype
genes and GxT genes in the more polymorphic arm domains of autosomal chromosomes
suggests that these genes tend to be regulated by cis-acting factors.
Results from this study point to the potential for future investigations. For example, given
that the Temperate and Tropical genotypes are considered by some to be undergoing
speciation and that most expression differences observed between species are cis-
regulated, it would be interesting to quantify the actual number of SNPs in regions
upstream of the differential expression groups to validate the results from analyses done
at the chromosome scale. In particular, genes with an effect of genotype that have a high
number of SNPs in upstream regions could be identified as likely being regulated in cis
and potentially contributing to expression differences between the incipient species. Such
genes may be considered candidate genes for investigations into the genome architecture
of speciation. Finally, the integration of the mRNA expression data with small RNA
expression data from the same experiment could shed light on patterns of post-
transcriptional regulation.
Table 1. Number of raw and cleaned reads in fastq files from Genome Quebec.
sample raw clean
AF14-1.1 17830362 17168256
AF14-1.2 17782202 17120594
AF14-2.1 26025046 25044050
AF14-2.2 26054157 25056610
AF14-3.1 29963016 29129264
AF14-3.2 29612363 28780806
AF20-1.1 31872995 30736051
AF20-1.2 31763308 30628482
AF20-2.1 34508074 33251105
AF20-2.2 34522693 33241950
AF20-3.1 24025150 23389721
AF20-3.2 23746489 23115267
AF30-1.1 36732913 35418582
AF30-1.2 36617640 35307045
AF30-2.1 19787947 19103142
AF30-2.2 19811944 19115062
AF30-3.1 18967224 18475211
AF30-3.2 18735493 18246983
HK14-1.1 28556436 27483161
HK14-1.2 28482204 27409768
HK14-2.1 24166460 23286124
HK14-2.2 24185031 23286128
HK14-3.1 26809455 26054377
HK14-3.2 26491330 25742077
HK20-1.1 17304708 16698014
HK20-1.2 17251430 16646977
HK20-2.1 27441709 26350515
HK20-2.2 27480811 26366445
HK20-3.1 22694619 22052191
HK20-3.2 22434965 21794800
HK30-1.1 29222885 28223148
HK30-1.2 29132626 28137121
HK30-2.1 22879967 22100957
HK30-2.2 22876949 22086416
HK30-3.1 24915269 24343611
HK30-3.2 24623445 24056254
Figure 1. Number of reads from fastq files for each data file (each biological replicate was sampled across 2 lanes) before and after cleaning with Trimmomatic. Replicate names that begin with “AF” denote the Tropical genotype and those that begin with “HK” denote the Temperate genotype.
[Figure 1 image not reproduced. Bar chart; x-axis: data file (AF14-1.1 to HK30-3.2); y-axis: reads (×10^6), 0 to 40; legend: raw, clean.]
Figure 2. Distribution of intron lengths in C. briggsae reference genome (WS253). Counts left of the red line represent 99% of all introns.
[Figure 2 image not reproduced. Histogram; x-axis: log2 intron length (bp), 0 to 15; y-axis: count (millions), 0 to 5.]
Figure 3. Ratio of average number of uniquely mapped reads in Tropical and Temperate Genotypes. To minimize the bias towards reads mapped from the reference genotype, up to 10 mismatches were allowed per read.
[Figure 3 image not reproduced. x-axis: number of mismatches, 0 to 10; y-axis: ratio of Tropical to Temperate avg. unique reads mapped, 1.06 to 1.18.]
Table 2. Number and percentage of reads that mapped to unique locations (i.e. one location in the genome) with STAR.
Sample  Uniquely Mapped Reads  % Uniquely Mapped Reads
AF14-1 32007586 93.35%
AF14-2 47012080 93.84%
AF14-3 54523984 94.15%
AF20-1 57787850 94.17%
AF20-2 62122726 93.43%
AF20-3 43596994 93.75%
AF30-1 66295835 93.74%
AF30-2 35618842 93.20%
AF30-3 33087508 90.10%
HK14-1 51264144 93.39%
HK14-2 43616826 93.65%
HK14-3 48035023 92.74%
HK20-1 30262908 90.76%
HK20-2 49528582 93.95%
HK20-3 41332043 94.26%
HK30-1 51537189 91.44%
HK30-2 40748065 92.22%
HK30-3 35632338 73.62%
Figure 4. Percentage of uniquely mapped reads with STAR per biological replicate. Maximum mismatch rate was set at 10. “AF” denotes Tropical genotypes and “HK” denotes Temperate genotypes. Although sample HK30-3 had a lower proportion of uniquely mapped reads, the absolute number of uniquely mapped reads was comparable and so the sample was retained for downstream analysis.
[Figure 4 image not reproduced. Bar chart; x-axis: biological replicate; y-axis: % uniquely mapped reads, 0% to 100%.]
Figure 5. a) Number of reads counted with htseq-count by biological replicate. “AF” = Tropical genotype, “HK” = Temperate genotype. b) Percentage of reads counted. Although the percentage of reads counted in HK30-3 was low, the sample was kept for downstream analysis because the number of reads counted was comparable.
[Figure 5 image not reproduced. Panel a): stacked bar chart; x-axis: biological replicates; y-axis: reads (×10^6), 0 to 80; legend: Not Counted, Counted. Panel b): stacked bar chart; x-axis: biological replicates; y-axis: % reads, 0% to 100%; legend: Not Counted, Counted.]
Figure 6. Multi-dimensional scaling plot (MDS) of filtered, normalized, and log transformed count data. Different colours represent experimental groups (e.g. Temperate at 20°C). “AF” = Tropical genotype and “HK” = Temperate genotype. The x-axis represents the principal component with the largest proportion of variation and the y-axis represents the principal component with the second largest proportion of variation. Biological replicates that cluster together in the plot are more similar to each other, so replicates that lie close together indicate consistency within an experimental group.
[Figure 6 image not reproduced. Scatter plot; x-axis: leading logFC dim 1; y-axis: leading logFC dim 2; points labelled by biological replicate (AF14-1 to HK30-3).]
Figure 7. a) Distribution of p-values for t-tests for effect of strain, b) distribution of p-values for F-tests for effect of temperature. Heavily right-skewed distributions for both indicate that there is a good possibility of identifying significant effects for strain and temperature in many genes.
[Figure 7 image not reproduced. Panels a) and b): histograms; x-axis: p-values, 0.0 to 1.0; y-axis: no. of genes (1000s), 0 to 7.]
Figure 8. Analysis for differential expression with edgeR (negative binomial distribution) versus limma (log-transformed counts distribution). edgeR = blue, limma = red. Limma is more conservative in all tests.
[Figure 8 image not reproduced. Bar chart; x-axis: differential expression group (GxT, G only, T only, G&T, no DE); y-axis: no. of genes (1000s), 0 to 8.]
Figure 9. Quantile-quantile plot for normalized, voom-transformed count data shows that data approximate a normal distribution (represented by the red line) and are suitable for analysis with the limma package.
Figure 10. After filtering out genes with very low to no counts, TMM normalization for different library sizes, and log-transforming count data, a dendrogram reveals similarity of samples within strain and within temperature but not replicate. This suggests that the data are heterogeneous (i.e. from different sources) but for reasons of experimental design and not batch effects.
Figure 11. Analysis of soft-thresholding powers revealed 30 to be the power at which the scale-free fit is maximized (R² = 0.75) and at which the network most closely approximates a scale-free network.
Figure 12. Analysis of a range of soft-thresholding powers revealed 30 to be the power at which the scale-free fit is maximized while mean connectivity remains at least 100 (k = 115).
Table 3. Values for each soft threshold power that was tested. Ideal powers have an
R-squared value close to 1, a slope close to -1, and a mean k (connectivity) over 100.
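The three quantities named in this caption (R-squared, slope, and mean connectivity k) could be computed along the following lines. This is a rough Python sketch, assuming an unsigned network in which the connectivity of a gene is the sum of its absolute correlations raised to the soft-thresholding power; the function names and the example values are hypothetical, and the implementation in WGCNA may differ in its binning and in how it reports the fit.

```python
import math


def connectivity(corr, power):
    """Connectivity k_i = sum over j != i of |cor_ij| ** power."""
    n = len(corr)
    return [sum(abs(corr[i][j]) ** power for j in range(n) if j != i)
            for i in range(n)]


def linear_fit(xs, ys):
    """Least-squares slope and R-squared for y regressed on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, r2


def scale_free_fit(k, n_bins=10):
    """Regress log10 p(k) on log10 k over equal-width bins of k."""
    lo, hi = min(k), max(k)
    width = (hi - lo) / n_bins or 1.0
    counts, sums = [0] * n_bins, [0.0] * n_bins
    for value in k:
        b = min(int((value - lo) / width), n_bins - 1)
        counts[b] += 1
        sums[b] += value
    xs, ys = [], []
    for c, s in zip(counts, sums):
        if c > 0 and s > 0:  # skip empty bins
            xs.append(math.log10(s / c))       # mean k in the bin
            ys.append(math.log10(c / len(k)))  # fraction of genes in the bin
    if len(xs) < 2:
        raise ValueError("need at least two non-empty bins")
    return linear_fit(xs, ys)


# Hypothetical connectivities: many weakly and few strongly connected genes.
example_k = [1.0] * 8 + [2.0] * 4 + [3.0] * 2 + [4.0]
slope, r2 = scale_free_fit(example_k, n_bins=4)
mean_k = sum(example_k) / len(example_k)
print(slope, r2, mean_k)
```

Raising the power shrinks every off-diagonal term, so mean connectivity falls as the power increases. That trade-off is why the caption asks for a good scale-free fit and a mean connectivity above 100 at the same time.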
Figure 13. Clustering with WGCNA produced 124 modules based on expression similarity of 16 199 genes. One module (Module 0) contained 37 genes whose expression patterns were not sufficiently similar to be placed in any module. The soft thresholding power that determines similarity between genes was set at 30.
Figure 14. Clustering of 16 199 genes with WGCNA was also visualized as a heatmap in which red represents maximum similarity and blue no similarity. Large blocks of red indicate there is also similarity between clusters. Each colour on the x- or y-axis represents a co-expression module.
Figure 15. After merging modules with a <0.25 distance, 23 modules remained, including Module 0. The dendrogram and heatmap echo module similarity patterns across the initial 124 modules.
Figure 16. The heatmap of clustering of 16 199 genes with WGCNA after merging modules with a distance between them of less than 0.25. The clustering pattern of similar modules that was seen in the initial heatmap of 124 modules is retained after merging. Each colour on the x- or y-axis represents a co-expression module.
Figure 17. Differential expression analysis with limma showed that over half (54%) of all genes were significantly differentially expressed. Of these genes, the majority had significant interaction effects (GxT) (FDR= 0.05). “G&T” genes showed significant effects of genotype and temperature independently whereas “Genotype” genes and “Temperature” genes were significantly differentially expressed for those variables alone.
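The grouping described in this caption can be read as a simple decision rule. The Python sketch below is one possible reading, not the code used for the analysis: it assumes three FDR-adjusted p-values per gene (genotype, temperature, interaction) and assumes that a significant interaction takes precedence over the main effects. The function name and the example values are hypothetical.

```python
def de_group(p_genotype, p_temperature, p_interaction, alpha=0.05):
    """Assign a gene to one differential expression group.

    Inputs are FDR-adjusted p-values. Assumption: a significant
    genotype-by-temperature interaction overrides the main effects.
    """
    if p_interaction < alpha:
        return "GxT"
    genotype_sig = p_genotype < alpha
    temperature_sig = p_temperature < alpha
    if genotype_sig and temperature_sig:
        return "G&T"
    if genotype_sig:
        return "G only"
    if temperature_sig:
        return "T only"
    return "no DE"


# Hypothetical adjusted p-values for three genes (not real data).
examples = {
    "gene_a": de_group(0.001, 0.002, 0.80),
    "gene_b": de_group(0.300, 0.004, 0.60),
    "gene_c": de_group(0.010, 0.020, 0.03),
}
print(examples)
```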
Figure 18. Looking at the proportion of differentially expressed genes that are expressed under cold stress versus heat stress reveals that the majority of genes with an effect of temperature respond to cold stress whereas most of the genes with an interaction respond to heat stress. G&T genes are included in this figure as Temperature genes because they have a significant effect of temperature that is independent of the effect of genotype.
Figure 19. Looking at the proportion of genes that increase or decrease expression in response to either cold stress (light blue line) or heat stress (red line) shows that roughly equal proportions increase and decrease expression, especially for Temperature genes. Again, more Temperature genes (G&T and Temperature genes) respond to cold stress whereas more GxT genes respond to heat stress.
Figure 20. Contrasting the magnitude of expression change for genes with a significant independent effect of temperature (Temperature genes, G&T genes) under chronic cold stress (light blue line) and chronic heat stress (red line) shows that for genes with a significant independent effect of temperature, the magnitude of change in expression is similar under both types of extreme temperature stress. However, more variation is seen in the magnitude of expression change between cold stress and heat stress for genes with a significant interaction. For example, for GxT genes, the magnitude of expression increase is greater under heat stress than cold stress whereas for genes with a significant effect of temperature, the response is comparable at both temperatures.
Figure 21. Co-expression clustering of 16 199 genes by expression similarity with WGCNA resulted in 22 modules ordered by size. Module 0 is the module that contains genes whose expression patterns were not sufficiently similar to other genes to be placed in a module. Cross-referencing module membership with the results of my differential expression analysis revealed that differentially expressed genes are not distributed equally among the modules.
Figure 22. The 23 modules that resulted from co-expression clustering of 16 199 genes have different proportions of differentially expressed genes. Modules with > 1000 genes and/or with > 50% differentially expressed genes were retained for further analysis.
Figure 23. Module eigengene plots of normalized, log2-transformed expression across temperature treatments for Modules 7 and 10, Representative Modules with a large proportion of genes that were differentially expressed due to Genotype, show that genes in Module 7 are expressed more in the Temperate genotype (blue) whereas genes in Module 10 are expressed more in the Tropical genotype (red).
Figure 24. Module eigengene plots of normalized, log2-transformed expression across temperature treatments for Modules 4, 5, 6, 12, and 15, Representative Modules with a large proportion of genes that were differentially expressed due to Temperature. Genes in Modules 6, 12, and 15 increase expression in response to rearing under heat stress. Genes in Modules 4 and 5, modules with a large proportion of G&T genes, increase expression in response to rearing under cold stress. Genes in Module 4 are expressed more in the Temperate genotype (blue) whereas genes in Module 5 are expressed more in the Tropical genotype (red).
Figure 25. Module eigengene plots of normalized, log2-transformed expression across temperature treatments for Modules 1, 2, 3, 9, 14, 16, and 22, Representative Modules with a large proportion of genes that showed significant interactions between genotype and temperature. Genes in Modules 1, 2, 3, and 16 show similar patterns of increase and decrease in expression at each temperature, but the Temperate (blue) genotype changes expression more drastically when reared under chronic heat stress. A minority of genes from Modules 14 and 22 show opposite patterns of expression between Temperate and Tropical (red) genotypes.
Figure 26. Module eigengene plots of normalized, log2-transformed expression across temperature treatments for Modules with fewer than 1000 genes and less than 50% differentially expressed genes. Expression patterns in these non-Representative Modules are less distinct between Temperate (blue) and Tropical (red) genotypes and across temperatures. Crossing over of expression patterns between genotypes where the difference in slopes is not pronounced indicates low power to detect differential expression for genes in these modules.
Table 4. Table of G-test p-values to test whether the proportion of differentially expressed genes in a module differed significantly from genome-wide proportions (p = 0.05, Bonferroni adjusted). All modules are significantly different except for Module 0 (membership in Module 0 is not based on expression similarity).
Module  G-test p-value  adj. p-value
0 0.006774373 0.155810568
1 2.69E-36 6.19E-35
2 3.67E-143 8.45E-142
3 2.06E-48 4.73E-47
4 1.66E-72 3.81E-71
5 1.34E-138 3.08E-137
6 8.79E-132 2.02E-130
7 1.39E-124 3.19E-123
8 9.32E-92 2.14E-90
9 3.47E-11 7.99E-10
10 3.52E-108 8.10E-107
11 5.35E-27 1.23E-25
12 1.24E-60 2.86E-59
13 8.84E-58 2.03E-56
14 1.24E-05 0.000284691
15 1.23E-46 2.83E-45
16 6.15E-08 1.41E-06
17 1.60E-08 3.68E-07
18 7.40E-17 1.70E-15
19 1.01E-21 2.33E-20
20 2.53E-08 5.81E-07
21 1.75E-14 4.01E-13
22 2.34E-08 5.39E-07
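The adjusted values in Table 4 are consistent with a plain Bonferroni correction, in which each raw p-value is multiplied by the number of tests (23, for Modules 0 through 22) and capped at 1. The following is a minimal sketch of that arithmetic, not the analysis code used in the thesis; the function name `bonferroni` and the choice of example modules are illustrative assumptions, and the raw values are the rounded figures printed in the table.

```python
# Raw G-test p-values for a few modules, copied (as printed, rounded) from Table 4.
raw_p = {0: 0.006774373, 1: 2.69e-36, 14: 1.24e-05, 22: 2.34e-08}

N_TESTS = 23  # assumption: one test per module, Modules 0 through 22
ALPHA = 0.05


def bonferroni(p, n_tests):
    """Multiply a p-value by the number of tests, capping the result at 1."""
    return min(1.0, p * n_tests)


adjusted = {module: bonferroni(p, N_TESTS) for module, p in raw_p.items()}

for module, p_adj in adjusted.items():
    # Module 0 is the only one of these whose adjusted p-value exceeds ALPHA.
    print(module, p_adj, p_adj < ALPHA)
```

Because the raw values are rounded in the table, products computed this way can differ from the printed adjusted values in the last digits.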
Table 5. G-test p-values from a test to determine whether the proportion of genes located in chromosome arms or centres differed from expected proportions for each differential expression group (p = 0.05, Bonferroni adjusted).
DE group G-test p-value adj. p-value
T only 0.144735537 0.723677687
G only 1.13E-07 5.66E-07
G&T 2.35E-05 0.00011734
GxT 0.005203834 0.026019169
noDE 0.132138705 0.660693526
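For readers unfamiliar with the G-test, the sketch below shows how a goodness-of-fit G statistic and its p-value can be computed for a two-category comparison such as arms versus centres. It is an illustration only: the counts and expected proportions are hypothetical, the function name `g_test_two_categories` is an assumption, and the thesis does not state here which software or degrees of freedom were used. With two categories there is one degree of freedom, so the chi-square tail probability can be written with the complementary error function.

```python
import math


def g_test_two_categories(observed, expected_props):
    """G-test of goodness of fit for two categories (1 degree of freedom).

    observed: counts in each category, e.g. [arm_genes, centre_genes]
    expected_props: expected proportions that sum to 1
    """
    total = sum(observed)
    expected = [prop * total for prop in expected_props]
    # G = 2 * sum(O * ln(O / E)); categories with zero counts contribute 0.
    g = 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    # For 1 degree of freedom, P(chi-square > g) = erfc(sqrt(g / 2)).
    p_value = math.erfc(math.sqrt(max(g, 0.0) / 2.0))
    return g, p_value


# Hypothetical example: 300 genes in arms and 200 in centres,
# tested against an expected 50/50 split.
g, p = g_test_two_categories([300, 200], [0.5, 0.5])
print(g, p)
```

A raw p-value obtained this way would then be adjusted for the number of groups tested, as in the adj. p-value column.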
Table 6. G-test p-values from a test to determine whether the proportion of genes located in autosome arms or centres differed from expected proportions for each of the 22 modules identified by co-expression clustering (FDR = 0.05, Benjamini-Hochberg correction).
Module G-test p-value adj. p-value
1 0.012177946 0.022326235
2 0.544743083 0.630755149
3 0.009617524 0.021158554
4 0.152170221 0.209234054
5 0.768639623 0.805946524
6 6.28E-18 1.38E-16
7 0.995516473 0.995516473
8 1.37E-05 6.03E-05
9 0.02862504 0.044982206
10 1.86E-11 1.37E-10
11 0.116978603 0.171568618
12 8.24E-05 0.000258911
13 8.16E-14 8.98E-13
14 0.011301327 0.022326235
15 0.002961166 0.008143206
16 1.87E-05 6.85E-05
17 0.00545346 0.01333068
18 0.377099659 0.460899583
19 0.769312591 0.805946524
20 0.014375624 0.024327979
21 2.54E-09 1.40E-08
22 0.265232617 0.34324221
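The adjusted values in Table 6 can be reproduced from the raw p-values with the standard Benjamini-Hochberg step-up procedure. The sketch below is a plain-Python illustration under that assumption, not the code used in the thesis; the function name `benjamini_hochberg` is hypothetical, and the inputs are the raw p-values as printed (rounded) in the table.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices, ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value never exceeds the one for the next-larger raw p-value.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Raw G-test p-values for Modules 1 to 22, in table order (Table 6).
table6_raw = [
    0.012177946, 0.544743083, 0.009617524, 0.152170221, 0.768639623,
    6.28e-18, 0.995516473, 1.37e-05, 0.02862504, 1.86e-11, 0.116978603,
    8.24e-05, 8.16e-14, 0.011301327, 0.002961166, 1.87e-05, 0.00545346,
    0.377099659, 0.769312591, 0.014375624, 2.54e-09, 0.265232617,
]

adj = benjamini_hochberg(table6_raw)
# Modules whose adjusted p-value falls below the FDR threshold of 0.05.
significant = [module for module, q in enumerate(adj, start=1) if q < 0.05]
print(significant)
```

Unlike the Bonferroni correction used for the differential expression groups, this procedure controls the false discovery rate, so the adjustment depends on each p-value's rank among all 22 tests.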
Figure 27. The proportion of genes in each differential expression category that are located in either arm or centre chromosomal domains on autosomes (chromosomes I – V). * indicates significant enrichment (p = 0.05, Bonferroni adjustment).
Figure 28. The proportion of genes in each differential expression category that are located in either arm or centre chromosomal domains on the X chromosome. No groups were significantly enriched in arms or the centre (p = 0.05, Bonferroni adjustment).
Figure 29. The proportion of genes in each of the Representative Modules that are located in either arm or centre chromosomal domains on the autosomes. Blue = Temperature Modules, red = Genotype Modules, purple = G&T modules, orange = GxT modules. * indicates significant enrichment (FDR = 0.05).
Figure 30. The proportion of genes in each of the Representative Modules that are located in either arm or centre chromosomal domains on the X-chromosome. Blue = Temperature Modules, red = Genotype Modules, purple = G&T modules, orange = GxT modules. * indicates significant enrichment (FDR = 0.05).
Table 7. G-test p-values for tests of whether the proportions of genes on autosomes and the X-chromosome were significantly different from expectations for each differential expression group (p = 0.05, Bonferroni adjusted).
DE group G-test p-value adj. p-value
T only 0.766174012 1
G only 0.002842788 0.014213941
G&T 0.001210832 0.006054162
GxT 0.000236965 0.001184823
no DE 0.002134383 0.010671915
Table 8. G-test p-values for tests of whether the proportions of genes on autosomes and the X-chromosome were significantly different from expectations for each co-expression module (FDR = 0.05, Benjamini-Hochberg correction).
Module G-test p-value adj. p-value
1 1.11E-36 8.17E-36
2 0.058403179 0.071381663
3 4.32E-17 1.58E-16
4 1.85E-38 2.04E-37
5 2.31E-30 1.27E-29
6 8.17E-39 1.80E-37
7 0.010162859 0.013973931
8 0.039370389 0.050949915
9 4.70E-28 2.07E-27
10 8.20E-05 0.000180384
11 0.002065337 0.003029161
12 1.05E-13 3.29E-13
13 0.000638395 0.001170391
14 0.001346515 0.002115951
15 0.285415181 0.313956699
16 0.121207929 0.140346023
17 1.96E-05 4.79E-05
18 0.000107935 0.000215869
19 0.415392993 0.435173612
20 0.000718097 0.001215242
21 1.09E-06 3.00E-06
22 0.758338434 0.758338434
Figure 31. The proportion of genes on autosomes and the X-chromosome for each differential expression group. The dotted red line indicates the expected proportion. * indicates significant enrichment on either the autosomes or the X-chromosome (p = 0.05, Bonferroni adjusted).
Figure 32. The proportion of genes on autosomes and the X-chromosome for each co-expression module. The dotted red line indicates the expected proportion. * indicates significant enrichment on either the autosomes or the X-chromosome (p = 0.05, Bonferroni adjusted). A dash (–) under the module name indicates a Representative Module.
Supplementary Figure 1. Treemap generated with REVIGO from significantly overrepresented molecular function GO terms for Module 7 (Genotype module).
Supplementary Figure 2. Treemap generated with REVIGO from significantly overrepresented biological process GO terms for Module 10 (Genotype module).
[Treemap m10_bp; panel labels: metabolic process; homeostatic process; regulation of liquid surface tension; cell communication; metabolism]
Supplementary Figure 3. Treemap generated with REVIGO from significantly overrepresented biological process GO terms for Module 6 (Temperature module).