Human-mediated introgression of haplotypes in a modern dairy cattle breed

Qianqian Zhang1,2,#,*, Mario Calus2, Mirte Bosse2, Goutam Sahana1, Mogens Sandø Lund1, and Bernt Guldbrandtsen1

1 Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Denmark
2 Animal Breeding and Genomics, Wageningen University & Research, the Netherlands
# Present address: Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark
* Corresponding author

Email addresses:
QZ: [email protected]
MC: [email protected]
MB: [email protected]
GS: [email protected]
MSL: [email protected]
BG: [email protected]
Genetics: Early Online, published on May 30, 2018 as 10.1534/genetics.118.301143
Copyright 2018.
Abstract

Domestic animals can serve as model systems of adaptive introgression and its genomic signatures. Their usefulness as model systems is due in part to their well-known histories. Different breeding strategies such as introgression and artificial selection have generated numerous desirable phenotypes and superior performance in domestic animals. The Modern Danish Red Dairy Cattle is studied as an example of an introgressed population. It originates from crossing the traditional Danish Red Dairy Cattle with the Holstein and Brown Swiss breeds, both known for high milk production. This crossing was carried out, among other reasons, in response to changes in the production system, to raise milk production and overall performance. The genomes of Modern Danish Red Dairy Cattle are heavily influenced by regions introgressed from the Holstein and Brown Swiss breeds and by subsequent selection in the admixed population. The introgressed proportion was found to be highly variable across the genome. Haplotypes introgressed from Holstein and Brown Swiss contained or overlapped known genes affecting milk production, as well as protein and fat content (CD14, ZNF215, BCL2L12 and THRSP for Holstein origin; ITPR2, BCAT1, LAP3 and MED28 for Brown Swiss origin). Genomic regions with high introgression signals also contained genes and enriched QTLs associated with calving traits, body conformation, feed efficiency, carcass, and fertility traits. Introgression signals with relative identity by descent scores larger than the median, indicating Holstein or Brown Swiss introgression, were in most cases significantly correlated with the corresponding test statistics from signatures of selection analyses in Modern Danish Red Dairy Cattle. In addition, the putative significant introgression signals showed a significant dependency on the putative significant signals from signatures of selection analyses. Artificial selection has played an important role in shaping the genomic footprints of introgression in the genome of Modern Danish Red Dairy Cattle. Our study on a modern cattle breed contributes to an understanding of the genomic consequences of selective introgression by demonstrating the extent to which adaptive effects contribute to shaping the specific genomic consequences of introgression.

Keywords: selective introgression, signature of selection, high yielding cattle breeds, modern dairy cattle breed

Introduction

Processes of adaptive introgression are complex, and their genomic signatures in humans and other species have been studied extensively (Hasenkamp, Solomon et al. 2015, Deschamps, Laval et al. 2016, Figueiro, Li et al. 2017, Jagoda, Lawson et al. 2017). Genome analysis has enabled an in-depth assessment of the genomic consequences of different demographic processes, including introgression, selection, and their interplay, in modern species (Deschamps, Laval et al. 2016, Figueiro, Li et al. 2017). Domestic animals can serve as model organisms for these processes. They have several advantages for understanding the impact of introgression and selection on genomes: first, introgression and selection are known to occur between breeds, and the processes are often well documented by breeders; second, massive amounts of data such as parentage, genotypes and phenotypes are routinely collected before and after introgression and selection; third, under appropriately controlled environmental conditions, a large part of the genomic consequences is caused by human-mediated directional introgression and selection.

Artificial selection and different breeding strategies have generated numerous desirable phenotypes in domestic animals such as cattle (Hartwig, Wellmann et al. 2015, Buzanskas, Ventura et al. 2017, Davis, Spelman et al. 2017), pigs (Bosse, Megens et al. 2014, Bosse, Megens et al. 2014, Ai, Fang et al. 2015) and dogs (Galov, Fabbri et al. 2015, vonHoldt, Kays et al. 2016). Strategies including crossbreeding and introgression have been very successful in improving productivity and performance in domestic animals. For example, Chinese pig breeds were imported to Europe in the late eighteenth and early nineteenth centuries to improve the productivity of European pigs (White 2011). Fertility-related traits have been greatly improved by crossbreeding with, and introgression from, Asian pigs (Merks, Mathur et al. 2012, Bosse, Megens et al. 2014, Bosse, Lopes et al. 2015). Similarly, in dairy cattle, crossbreeding of local breeds with other breeds, followed by introgression, has been applied to achieve better productivity and performance (Davis, Spelman et al. 2017). The genetic architecture of modern domestic animals, including dairy cattle, is shaped by the interplay between different forces, including the intentional introduction of favorable alleles from other breeds, subsequent selection for favorable introgressed alleles, and demographic processes.

Since the introduction of scientific breeding, the main breeding goal in cattle has been to improve milk yield and fat and protein content, even for dual-purpose breeds in intensive farming systems, especially in Europe (Hartwig, Wellmann et al. 2015). By crossing high-yielding breeds with local breeds, the productivity of local breeds could be increased rapidly, at the expense of the genetic distinctiveness of the local breeds. The high productivity of these admixed breeds was further improved by intense selection, resulting in increased frequencies or even fixation of favorable alleles. Many of the variants thus spreading in the population will have been of introgressed origin. Given the great success of cattle breeding, including crossbreeding, and the availability of large-scale genomic data sets, admixed local cattle breeds represent an appealing model for identifying the genomic consequences of admixture intended to improve traits of interest in local breeds. We expect that the introgressed genomic regions play an important role in improving productivity and performance in the crossbreeds. We hypothesize that: 1) the introgressed regions are non-randomly distributed across the genome; 2) the majority of highly introgressed regions affect production and performance traits; and 3) these introduced haplotypes are or have been under selection in the admixed breed.

To test these hypotheses and identify specific genomic regions and genes involved in important traits from high-yielding breeds such as Holstein (HOL) and Brown Swiss (BSW), we use the hybrid Modern Danish Red Dairy Cattle (mRDC) breed as an example. Our analyses illustrate the patterns of introgressed and selected haplotypes in an admixed local breed. The hybrid mRDC originates from traditional Danish Red Dairy Cattle (tRDC). In recent decades, Holstein and Brown Swiss have been used extensively to improve the milk yield of mRDC (Kantanen, Olsaker et al. 2000). Years of crossbreeding and selection have led to differentiation between mRDC and tRDC. The introgression of, and selection for, haplotypes from HOL and BSW has probably made a significant contribution to the increased milk production level of mRDC. The objective of this study was to examine genomic patterns of introgression from two high-yielding breeds (HOL and BSW) in a modern dairy breed (i.e. the hybrid mRDC), and to unravel the consequences of introgression and selection at the genome level using whole genome sequence data.

Methods

1. SNP genotyping, sequencing, variant calling, and quality control

Whole genome sequence data were available for 97 animals from 4 breeds (32 Holstein: HOL; 20 Brown Swiss: BSW; 15 traditional Danish Red Dairy Cattle: tRDC; 30 modern Danish Red Dairy Cattle: mRDC). All genomes were sequenced to a depth of ~10× or more using Illumina paired-end sequencing. Reads were aligned to the cattle genome assembly UMD3.1 (Zimin, Delcher et al. 2009) using bwa (Li and Durbin 2009). Aligned sequences were converted to raw BAM files using samtools (Li, Handsaker et al. 2009). Duplicate reads were marked using the samtools rmdup option (Li, Handsaker et al. 2009). The Genome Analysis Toolkit (McKenna, Hanna et al. 2010) was used for local realignment around insertion/deletion (indel) regions and for recalibration, following the 1000 Bull Genomes Project guidelines (Daetwyler, Capitan et al. 2014) and incorporating information from dbSNP (Sherry, Ward et al. 2001). Finally, variants were called using the Genome Analysis Toolkit’s UnifiedGenotyper (McKenna, Hanna et al. 2010), which simultaneously calls short indels and SNPs. Indels and variants on sex chromosomes were excluded from further analyses.

2. Population structure and admixture

Using PLINK (Chang, Chow et al. 2015), the sequence variants were pruned to remove markers with pairwise linkage disequilibrium (LD) greater than 0.1 with any other SNP within a 50-SNP sliding window (advancing by 10 SNPs at a time). The SNPs on all autosomes were used to study the population structure. To get an overview of the population structure of the animals from the different breeds, the whole genome sequence data were used for principal component estimation using GCTA (Yang, Lee et al. 2011). Admixture analyses were done using the program Admixture (Alexander, Novembre et al. 2009) with values of K between 2 and 10. The K value with a low cross-validation error was chosen.

To investigate the statistical significance of admixture among these cattle populations, TreeMix software was used to perform the three-population (f3) test (Pickrell and Pritchard 2012). In an f3 test of the form f3(C; A, B), an extreme negative f3 statistic indicates that there has been significant gene flow into population C from populations A and B. Each pair drawn from HOL, BSW and tRDC was used as the source populations (A and B), and mRDC was used as the admixed population (C), which results in 3 different combinations.

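As an illustration of the statistic behind the three-population test, the sketch below computes a plain f3 value from per-SNP allele frequencies. It is a minimal sketch and not the threepop implementation: the function name `f3_statistic` and the toy frequencies are hypothetical, and it leaves out the finite-sample bias correction and the block-jackknife standard error that a Z score requires.

```python
def f3_statistic(freq_c, freq_a, freq_b):
    """Plain f3(C; A, B): mean over SNPs of (c - a) * (c - b).

    Inputs are per-SNP allele frequencies in the candidate admixed
    population C and the two candidate sources A and B.
    """
    if not freq_c or not (len(freq_c) == len(freq_a) == len(freq_b)):
        raise ValueError("need three non-empty frequency lists of equal length")
    total = sum((c - a) * (c - b) for c, a, b in zip(freq_c, freq_a, freq_b))
    return total / len(freq_c)


# Hypothetical toy frequencies: C is an exact 50:50 mixture of A and B.
source_a = [0.9, 0.1, 0.8, 0.2]
source_b = [0.1, 0.9, 0.2, 0.8]
admixed = [(a + b) / 2 for a, b in zip(source_a, source_b)]

f3_value = f3_statistic(admixed, source_a, source_b)
```

In this toy case the admixed frequencies lie between the two sources at every SNP, so each product is negative and so is the mean, which is the pattern the f3 test looks for.
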
We also calculated the breed proportions for the sequenced mRDC individuals using pedigree information. The full pedigree of the sequenced mRDC animals was extracted and used to infer the breed proportions, by coding the breed at the point where ancestors enter the pedigree as the known parents of the traced animal.

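One way to picture pedigree-based breed proportions is to assign each founder a breed and let every descendant inherit half of each parent's proportions. The sketch below does this under stated assumptions: the animal names, the dictionary layout and the "unknown" label for founders without a recorded breed are all hypothetical, and the authors' actual procedure may differ in detail.

```python
def breed_proportions(animal, pedigree, founder_breed, _cache=None):
    """Expected breed fractions of an animal from its pedigree.

    pedigree maps an animal to its (sire, dam) pair; animals absent from
    pedigree are treated as founders whose breed is read from founder_breed.
    """
    if _cache is None:
        _cache = {}
    if animal in _cache:
        return _cache[animal]
    if animal not in pedigree:
        # Founder: all ancestry is assigned to its recorded breed.
        result = {founder_breed.get(animal, "unknown"): 1.0}
    else:
        result = {}
        for parent in pedigree[animal]:
            parent_props = breed_proportions(parent, pedigree, founder_breed, _cache)
            for breed, share in parent_props.items():
                # Each parent contributes half of its own proportions.
                result[breed] = result.get(breed, 0.0) + 0.5 * share
    _cache[animal] = result
    return result


# Hypothetical three-generation pedigree.
pedigree = {
    "calf": ("sire", "dam"),
    "sire": ("hol_bull", "rdc_cow"),
    "dam": ("rdc_bull", "bsw_cow"),
}
founder_breed = {
    "hol_bull": "HOL",
    "rdc_cow": "tRDC",
    "rdc_bull": "tRDC",
    "bsw_cow": "BSW",
}

calf_props = breed_proportions("calf", pedigree, founder_breed)
```

Founders with no recorded breed end up in the "unknown" class, so the proportions of the named breeds need not sum to one.
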
3. Introgression mapping

3.1 Calculation of ancestry dosages in mRDC

The two-layer hidden Markov model developed by Guan (2014) was used to infer the structure of local haplotypes introgressed from HOL, BSW and tRDC in mRDC. The software package ELAI (Guan 2014) was used to infer the ancestry dosages of the haplotypes from the three source populations in mRDC. SNPs with a minor allele frequency (MAF) lower than 0.01 or a missing proportion higher than 0.05 were removed from subsequent analyses. The three-way admixture option with 10 admixture generations was chosen, which approximates the history of admixture in the mRDC population. Thirty steps of the expectation-maximization (EM) algorithm were run to infer the ancestry dosages of HOL, tRDC and BSW in mRDC.

3.2 Relative Identity by Descent Scores (rIBD)

HOL, BSW and tRDC have made large genetic contributions to mRDC. Therefore, the sequenced mRDC, tRDC, BSW and HOL animals were selected for introgression mapping analysis. The identity-by-descent (IBD) regions shared between mRDC and tRDC were used as a reference to map the regions introgressed from HOL and BSW, using pairwise comparisons between these breeds. Following Bosse et al. (Bosse, Megens et al. 2014), sequences for the 29 autosomes were first phased using Beagle fastIBD (V. 3.3.2) (Browning and Browning 2007). Pairwise comparisons for detecting IBD were performed between mRDC and tRDC, mRDC and HOL, and mRDC and BSW. As recommended in the Beagle documentation (Browning and Browning 2007), 10 independent runs of phasing and pairwise IBD detection were performed. The IBD segments identified in the 10 runs were combined, and the threshold parameter for identifying true shared IBD, a compromise between power and false-discovery rate, was $10^{-10}$, as suggested by Browning and Browning (Browning and Browning 2007). We defined the IBD score as the number of recorded true IBD haplotype segments divided by the total number of pairwise comparisons, using windows of 10 kb. The IBD score was calculated for each pairwise comparison using a custom perl script. To quantify the relative proportion of the genome introgressed from HOL or BSW, we calculated the relative IBD score (rIBD) as follows: IBD score (mRDC & tRDC) - IBD score (mRDC & HOL) or IBD score (mRDC & BSW). Thus, rIBD takes values in the range of -1 to 1. rIBD=1 signifies that all haplotypes in the target breed originate from the first source breed, while rIBD=-1 signifies 100% from the second source breed. The variance of the rIBD score was calculated using a robust method (Williams 2000). The p values of the rIBD scores were derived under the neutral hypothesis, assuming that rIBD scores are normally distributed with a standard deviation equal to the square root of the robust estimate of the rIBD variance (Bosse, Megens et al. 2014). The significant introgressed haplotypes from HOL and BSW in mRDC were defined as the haplotypes with a corrected p value, using the Benjamini-Hochberg procedure (Benjamini and Yekutieli 2001), lower than 0.02 and 0.04, respectively. Pearson’s correlations between the rIBD scores from HOL or BSW introgression that were larger than the median rIBD score and the corresponding test statistics from signatures of selection analyses were calculated. Moreover, the putative significant introgression signals from HOL and BSW were extracted and compared with the putative significant signals from signatures of selection analyses using a chi-squared test, with p values calculated by Monte Carlo simulation (Hope 1968).

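The scoring and testing steps described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' perl script: all function names and toy counts are hypothetical, the standard deviation is passed in rather than estimated (the robust estimator is not reproduced here), and the two-sided test around a mean of zero is an assumption about how the normal null was applied. Which comparison is passed first fixes the sign convention of rIBD.

```python
import math


def ibd_score(n_ibd_segments, n_pairwise_comparisons):
    """IBD score of one window: IBD segments per pairwise comparison."""
    return n_ibd_segments / n_pairwise_comparisons


def relative_ibd(ibd_first, ibd_second):
    """rIBD per window: score of the first comparison minus the second."""
    return [a - b for a, b in zip(ibd_first, ibd_second)]


def normal_p_values(scores, sd):
    """Two-sided p values under N(0, sd^2); the zero mean is an assumption."""
    return [math.erfc(abs(s) / (sd * math.sqrt(2.0))) for s in scores]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts for three 10 kb windows, 10 pairwise comparisons each.
ibd_first = [ibd_score(3, 10), ibd_score(9, 10), ibd_score(5, 10)]
ibd_second = [ibd_score(3, 10), ibd_score(1, 10), ibd_score(5, 10)]

ribd = relative_ibd(ibd_first, ibd_second)
p_raw = normal_p_values(ribd, sd=0.2)  # sd=0.2 is a made-up robust estimate
p_adj = benjamini_hochberg(p_raw)
```

Only the middle window, where the two IBD scores differ strongly, ends up with a small adjusted p value; the two windows with equal scores have rIBD of zero and p values of one.
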
4. Detection of signature of selection

4.1 Fst analysis
The genetic differentiation between individuals from tRDC and mRDC was measured by pairwise Fst analysis following Weir and Cockerham (Weir and Cockerham 1984). Pairwise Fst was computed with Genepop 4.2 in bins of 10 kb over the full length of the genome (Weir and Cockerham 1984). The correlations between Fst and the rIBD scores for HOL and BSW introgression were calculated for the same 10 kb bins. The R package FlexMix (Grun and Leisch 2008) was used to fit a series of finite normal mixture models to explore the underlying structure of the distribution of Fst, presumably caused by different evolutionary or demographic processes acting in the populations, such as balancing selection, directional selection and neutrality. This mixture model postulated that $\overline{F_{st}} \sim \sum_{k=1}^{K} \pi_k N(\overline{F_{st_l}} \mid \mu_k, \sigma_k^2)$, where $K$ is the number of components of the mixture, $l$ is the $l$th locus, $\pi_k$ is the probability that $\overline{F_{st}}$ belongs to cluster $k$, and $\mu_k$ and $\sigma_k^2$ are the expectation and variance of the $k$th normal distribution, respectively. The number of components of the mixture ($K$) was determined by the smallest Akaike’s information criterion (AIC) among models with different numbers of components. Model parameters were estimated by maximum likelihood via the EM algorithm in FlexMix. Finally, the p value of Fst, combining the $K$ components, was calculated as: $p = 1 - \sum_{k=1}^{K} \pi_k N(\overline{F_{st_l}} \mid \mu_k, \sigma_k^2)$. Regions with corrected p values, using the Benjamini-Hochberg procedure (Benjamini and Yekutieli 2001), of less than 0.05 were used for testing the dependency of putative significant signals between different test statistics.

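The p value formula above can be illustrated with a small calculation once mixture parameters are available. The sketch below reads the normal term in the p value formula as a cumulative distribution function, which is an interpretation and not something the text states explicitly. The function names and the two-component parameters are hypothetical, and the FlexMix fitting step is not reproduced.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def mixture_upper_tail_p(fst, weights, means, sds):
    """p = 1 - sum_k pi_k * CDF_k(fst) for an already fitted normal mixture."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    cdf = sum(w * normal_cdf(fst, m, s) for w, m, s in zip(weights, means, sds))
    # Clamp at zero to guard against tiny negative values from rounding.
    return max(0.0, 1.0 - cdf)


# Hypothetical fitted parameters for a two-component mixture.
weights = [0.7, 0.3]
means = [0.05, 0.20]
sds = [0.02, 0.05]

p_typical = mixture_upper_tail_p(0.05, weights, means, sds)
p_extreme = mixture_upper_tail_p(0.60, weights, means, sds)
```

A window with Fst near the bulk of the distribution gets a large p value, while a window far in the upper tail of every component gets a p value close to zero.
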
4.2 Extended haplotype homozygosity tests

The extended haplotype homozygosity tests were applied between the breeds for the sequenced individuals as a second line of evidence for signatures of selection. The genome-wide scan for the integrated haplotype score (iHS) in mRDC was performed using the R package rehh (Sabeti, Reich et al. 2002, Gautier and Vitalis 2012). An |iHS| threshold of 2.5 for SNPs was used as the cutoff to define signals of selection, following (Voight, Kudaravalli et al. 2006, Gautier and Vitalis 2012). To compare iHS with other test statistics such as rIBD scores on the same scale, we calculated the proportion of SNPs with |iHS|>2 in windows of 10 kb and identified the windows in the highest 1% of the empirical distribution of this proportion, following (Voight, Kudaravalli et al. 2006). These putative significant windows of the proportion of SNPs with |iHS|>2 were used for testing the dependency of putative significant signals between different test statistics.

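The windowing step can be sketched as follows, assuming per-SNP iHS values have already been computed (for example with rehh). The function names and toy values are hypothetical, positions are taken to lie on a single chromosome, and ties at the cutoff rank are broken arbitrarily in this simplified version.

```python
import math
from collections import defaultdict


def extreme_ihs_proportion(positions, ihs_values, window_bp=10_000, cutoff=2.0):
    """Per window, the fraction of SNPs with |iHS| above the cutoff."""
    counts = defaultdict(lambda: [0, 0])  # window index -> [extreme, total]
    for pos, ihs in zip(positions, ihs_values):
        window = pos // window_bp
        counts[window][1] += 1
        if abs(ihs) > cutoff:
            counts[window][0] += 1
    return {w: extreme / total for w, (extreme, total) in counts.items()}


def top_fraction_windows(proportions, fraction=0.01):
    """Windows in the highest `fraction` of the empirical distribution."""
    n_keep = max(1, math.ceil(len(proportions) * fraction))
    ranked = sorted(proportions, key=proportions.get, reverse=True)
    return set(ranked[:n_keep])


# Hypothetical SNP positions (bp) and iHS values on one chromosome.
positions = [1_000, 5_000, 12_000, 18_000, 25_000]
ihs_values = [2.6, -2.2, 0.3, -2.4, 1.1]

proportions = extreme_ihs_proportion(positions, ihs_values)
candidates = top_fraction_windows(proportions)
```

Expressing iHS as a per-window proportion puts it on the same 10 kb grid as the rIBD and Fst values, which is what makes the window-level comparisons between statistics possible.
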
4.3 Sharing of runs of homozygosity (ROH)

Runs of homozygosity (ROH) were computed for the sequenced animals to detect ROH regions (minimum 10 kb considered) shared among individuals. For a description of the procedures for calculating nucleotide diversity and detecting ROH, see (Zhang, Calus et al. 2015, Zhang, Guldbrandtsen et al. 2015). The sharing of ROH regions was calculated as the number of individuals sharing the same ROH region in a particular segment, using 10 kb windows across the whole genome in mRDC. Regions enriched for shared ROH were defined as regions exceeding the 95th percentile of the empirical distribution of the number of individuals sharing the same ROH regions in any particular segment.

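A minimal sketch of the ROH sharing count is given below. Several details are assumptions made for illustration: the function names and toy segments are hypothetical, an ROH is counted for every window it overlaps even partially, the genome is a single linear coordinate system, and the percentile uses the nearest-rank definition.

```python
def roh_sharing_counts(roh_by_individual, genome_length, window_bp=10_000):
    """Per window, count individuals with an ROH overlapping that window."""
    n_windows = -(-genome_length // window_bp)  # ceiling division
    counts = [0] * n_windows
    for segments in roh_by_individual.values():
        covered = set()
        for start, end in segments:  # half-open interval [start, end) in bp
            first = start // window_bp
            last = min((end - 1) // window_bp, n_windows - 1)
            covered.update(range(first, last + 1))
        for window in covered:  # each individual counts once per window
            counts[window] += 1
    return counts


def enriched_windows(counts, percentile=95):
    """Windows whose count exceeds the nearest-rank percentile of all counts."""
    ranked = sorted(counts)
    rank = max(1, -(-len(ranked) * percentile // 100))  # ceil(n * p / 100)
    threshold = ranked[rank - 1]
    return [w for w, c in enumerate(counts) if c > threshold]


# Hypothetical ROH segments for three animals on a 200 kb toy genome.
roh = {
    "cow_a": [(0, 30_000)],
    "cow_b": [(10_000, 40_000)],
    "cow_c": [(15_000, 20_000)],
}

counts = roh_sharing_counts(roh, genome_length=200_000)
enriched = enriched_windows(counts)
```

With many tied counts, few or no windows may strictly exceed the percentile, so the tie-handling rule is a choice worth stating when applying this to real data.
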
5. Gene annotations and enrichment of QTLs

Genes in genomic regions showing significant introgression of haplotypes from HOL or BSW in mRDC were annotated. Cattle QTLs were extracted from the Animal QTL Database (Hu, Park et al. 2016). QTLs on the X chromosome, or without locations and references in the list from the Animal QTL Database (Hu, Park et al. 2016), were excluded. The remaining QTLs were classified into 6 groups according to the associated traits: milk, reproduction, production, health, meat and carcass, and exterior. QTLs overlapping genomic regions with high rIBD scores were identified. When two QTLs had exactly the same genomic interval and the same associated trait group, they were counted as one QTL. Gene annotations in these regions were retrieved from the Ensembl Genes 89 database using BioMart (Kinsella, Kahari et al. 2011).

To test for enrichment of QTLs in the candidate introgressed regions from HOL or BSW, we applied a permutation test. The candidate introgressed regions were randomly distributed across the whole genome. This permutation did not change the relative proportion and length of the introgressed regions, in order to preserve their correlation structure. We then computed the total number of QTLs, and the number of QTLs in each of the 6 trait groups, that overlapped the permuted introgressed regions. In total, 10,000 permutations were performed. The distribution of the numbers of QTLs observed in the permuted regions was treated as the null distribution, from which we computed the significance levels of the numbers of QTLs observed in the real data.

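The permutation scheme can be sketched as below. This is a simplified illustration, not the authors' implementation: the function names and toy intervals are hypothetical, the genome is treated as one linear coordinate system instead of 29 autosomes, permuted regions are allowed to overlap one another, and the add-one form of the empirical p value is a common convention assumed here.

```python
import random


def count_overlapping_qtls(regions, qtls):
    """Number of QTL intervals overlapping at least one region."""
    return sum(
        any(q_start < r_end and r_start < q_end for r_start, r_end in regions)
        for q_start, q_end in qtls
    )


def permutation_p_value(regions, qtls, genome_length, n_permutations=10_000, seed=1):
    """Empirical p value for the observed number of overlapping QTLs."""
    rng = random.Random(seed)
    observed = count_overlapping_qtls(regions, qtls)
    lengths = [end - start for start, end in regions]
    n_as_extreme = 0
    for _ in range(n_permutations):
        # Re-place every region at a random position, keeping its length.
        permuted = []
        for length in lengths:
            start = rng.randrange(0, genome_length - length + 1)
            permuted.append((start, start + length))
        if count_overlapping_qtls(permuted, qtls) >= observed:
            n_as_extreme += 1
    return observed, (n_as_extreme + 1) / (n_permutations + 1)


# Hypothetical half-open intervals (bp) on a 1 Mb toy genome.
qtls = [(100_000, 110_000), (105_000, 120_000), (500_000, 505_000)]
regions = [(95_000, 125_000), (495_000, 510_000)]

observed, p_value = permutation_p_value(
    regions, qtls, genome_length=1_000_000, n_permutations=2_000
)
```

Keeping the number and lengths of the regions fixed means the null distribution reflects only their placement, so a small p value indicates that the real regions sit on QTLs more often than randomly placed regions of the same size would.
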
Results and discussion

1. Population structure and evidence of introgression

The population structure of the sampled cattle breeds, analyzed using Admixture and PCA, is shown in Figure 1. mRDC had contributions from tRDC, HOL and BSW, as observed in the Admixture analysis (Figure 1). Figure 1 clearly demonstrates the hybrid nature of mRDC cattle, which is consistent with the recorded pedigree information on the introgression history of mRDC (Andersen, Jensen et al. 2003). A large contribution to mRDC was observed from tRDC, the recipient population. Notably, BSW and HOL, two mainstream breeds, each contributed heavily to the genomes of extant mRDC individuals. The proportion of introgression differs considerably among individuals. In the PCA (Figure 1), Principal Component 1 (PC1, 6.23% of variance) separated HOL, BSW and tRDC. mRDC, however, was dispersed between the other breeds, demonstrating admixture from HOL, BSW and tRDC. Similarly, PC2 separated HOL, BSW and tRDC, and the mRDC individuals had a wide range of PC2 values, falling among the individuals from the other breeds.

Moreover, the breed proportions of mRDC individuals were derived using the recorded full pedigree; on average, mRDC had 0.27 HOL ancestry, 0.17 BSW ancestry and 0.29 tRDC ancestry. The statistical significance of admixture in mRDC was assessed for each combination of two populations chosen from HOL, BSW and tRDC with the f3 test, using the program threepop from TreeMix (Pickrell and Pritchard 2012). We observed extreme Z scores from the f3 test for mRDC with every combination of HOL, BSW and tRDC, i.e. -14.97, -11.72 and -19.24. These results are consistent with the Admixture and PCA results and provide statistical support for the admixture of mRDC from HOL, BSW and tRDC. They support that mRDC is indeed a composite breed and can be studied further for introgression from HOL and BSW.

2. Introgression mapping

Introgression mapping was performed to identify regions in mRDC that contained an excess of introgressed haplotypes. We first examined the local structure of haplotypes in mRDC introgressed from HOL, BSW and tRDC using a three-way admixture approach (Figure S1). We observed average ancestry dosages of 0.78, 0.63 and 0.59, with standard deviations of 0.24, 0.22 and 0.18, from HOL, tRDC and BSW, respectively, in mRDC, averaged across the whole genome and all mRDC individuals. Three genomic regions with an ancestry dosage of 2 from HOL were observed in mRDC (Figure S1). This indicates complete replacement of tRDC genomic material by HOL genomic material. However, there is only one annotated gene (SNORD116) in these regions. Interestingly, the genomic region with full HOL ancestry dosage on chromosome 21 (1,348,427-2,107,179 bp) is associated with gestation length and calving ease (Frischknecht, Bapst et al. 2017). Other QTLs overlapping regions with full HOL ancestry in mRDC were associated with bovine respiratory disease susceptibility, body weight and udder swelling score (Saatchi, Schnabel et al. 2014, Keele, Kuehn et al. 2015, Michenet, Saintilan et al. 2016).

To address which admixed haplotypes have most influenced the existing mRDC breed, we inferred whether a genomic region was introgressed from the high-yielding breeds in multiple individuals by examining the contributions proportional to the admixture fraction. The frequencies of all mRDC haplotypes that were of HOL, BSW or tRDC origin were calculated across the whole genome. The relative fractions of HOL or BSW haplotypes versus tRDC haplotypes in the mRDC group were calculated as ‘relative IBD scores’ (rIBD). Shared haplotypes (i.e. haplotypes with shared ancestry) were observed between mRDC on the one hand, and HOL, BSW and tRDC on the other. These findings agree with the population structure and admixture analyses, which showed contributions from HOL, BSW and tRDC to mRDC. Relative to the tRDC haplotype frequency, the HOL and BSW haplotype frequencies in the mRDC population at a given locus (i.e. the rIBD scores) ranged from 0.73 to -0.74 and from 0.81 to -0.92, respectively, where 1 indicates that all haplotypes were of HOL or BSW origin and none were of tRDC origin, and -1 indicates that all haplotypes were tRDC-like (Figures 2 and 3). The rIBD scores averaged across the genome were negative (-0.06 for HOL introgression and -0.09 for BSW introgression) (Figures 2a and 3a), showing that the majority of the genome was more similar to tRDC than to either HOL or BSW. However, every chromosome contained genomic regions where the signal for HOL or BSW haplotypes was stronger than that for tRDC.

The distribution of rIBD from the comparison between mRDC and HOL or BSW for IBD haplotypes resembled a normal distribution (Figures 2b and 3b). By applying a cut-off to the rIBD values, we were able to identify the regions likely to be of HOL or BSW origin. Across the whole genome, many known genes and QTLs were located within regions with high rIBD for HOL origin (Table S1). We observed that QTLs were significantly enriched in the HOL-like haplotypes in mRDC (p=0.025). The genes and QTLs were associated with economic traits including milk-related traits such as milk yield, and protein and fat yield and percentage (CD14, ZNF215, BCL2L12 and THRSP) (Ashwell, Heyen et al. 2004, Beecher, Daly et al. 2010, Magee, Sikora et al. 2010, Cole, Wiggans et al. 2011, Cochran, Cole et al. 2013, Fontanesi, Calo et al. 2014, Capomaccio, Milanesi et al. 2015), calving traits (MYH14, KCNC3, SYT3 and CTU1) (Kolbehdari, Wang et al. 2008, Cole, Wiggans et al. 2011, Gaddis, Null et al. 2016, Mao, Kadri et al. 2016, Abo-Ismail, Brito et al. 2017), feed efficiency-related traits (LRRIQ3, ATP6V1B2 and CCKBR) (Abo-Ismail, Kelly et al. 2013, Abo-Ismail, Vander Voort et al. 2014), and carcass traits (ZNF215, INTS4 and RPTOR) (Magee, Sikora et al. 2010, Sasago, Abe et al. 2017). QTLs associated with production and reproduction traits were significantly enriched in introgressed regions from HOL showing high rIBD scores (p=0.0004 for production and p=0.034 for reproduction traits). The longest continuous introgressed region (defined as a region with rIBD>0) from HOL to mRDC was on chromosome 18 (56,320,000-61,350,000 bp) (Figure 2a). This region was previously found to be associated with calving traits and young stock survival in Nordic Holstein cattle (Cole, Wiggans et al. 2011, Mao, Kadri et al. 2016, Wu, Guldbrandtsen et al. 2017). The recombination rate in this region of chromosome 18 is low (Weng, Saatchi et al. 2014). Long genomic regions showing a signal of introgression probably tend to occur in regions with a low recombination rate. Moreover, introgressed haplotypes included numerous annotated genes due to genetic hitchhiking and the short time since introgression. The region with the highest rIBD score from HOL was located on chromosome 4 (120,540,000-120,810,000 bp, average rIBD=0.449), downstream of the gene VIPR2. The VIPR2 gene has been proposed as a candidate gene affecting fat percentage and playing an important role in milk synthesis (Capomaccio, Milanesi et al. 2015).

Similarly, many known genes and QTLs occurred in regions where mRDC shared haplotypes with BSW (Table S2). We observed an enrichment of QTLs in the BSW-like haplotypes in mRDC that was close to significance (p=0.056). These genes and QTLs mainly affected milk composition, including fat and protein percentage and yield (ITPR2, BCAT1, LAP3 and MED28) (Cohen-Zinder, Seroussi et al. 2005, Pimentel, Bauersachs et al. 2011, Zheng, Ju et al. 2011, Fang, Fu et al. 2014), growth and body conformation traits such as stature (NCAPG, LCORL, PPP2R1A, IGFBP6 and CREBBP) (Kolbehdari, Wang et al. 2008, Baeza, Corva et al. 2011, Lindholm-Perry, Sexten et al. 2011, Cole, Waurich et al. 2014, Sahana, Hoglund et al. 2015), calving and fertility traits (EIF4G3, TGFA and LAP3) (Bongiorni, Mancini et al. 2012, Hering, Olenski et al. 2014, Hoglund, Buitenhuis et al. 2015), and feed efficiency-related traits (CLMP and PPP2R1A) (Prakash, Bhattacharya et al. 2011, Cole, Waurich et al. 2014). We observed a significant enrichment of QTLs associated with milk, production, and meat and carcass traits in the introgressed regions from BSW in mRDC (p=0.045 for milk, p=0.019 for production, and p=0.017 for meat and carcass traits). While the longest region introgressed from HOL in mRDC was ~4 Mb, there was no introgressed region longer than 1.5 Mb from BSW in mRDC. The highest rIBD signal was observed on chromosome 17 (35,630,000-35,780,000 bp, average rIBD=0.563). This region is located downstream of the IL2 gene. There is also one unannotated gene in this region. The IL2 gene has previously been shown to be associated with mastitis, milk yield and lactation persistency (Alluwaimi, Leutenegger et al. 2003, Prakash, Bhattacharya et al. 2011).

3. Regions of introgression and evidence for selection

We have shown that HOL or BSW haplotypes that were introgressed into mRDC often originated from genomic regions harboring genes associated with milk production, calving traits, feed efficiency, fertility, or body conformation and carcass traits. The continued survival in mRDC of a haplotype introgressed from HOL or BSW, which has not been removed by negative selection or genetic drift, might be a result of positive or balancing selection. Selection will change the structure of an introgressed haplotype, for example its linkage disequilibrium (LD) and its distribution of allele frequencies. The length of an introgressed haplotype is affected by a combination of the local recombination rate and the strength of selection. To further infer which introgressed regions are under selection, we used three independent methods, iHS, Fst, and sharing of ROH among individuals, to identify the regions that were under selection in the mRDC population. iHS identified regions with extended homozygosity in mRDC due to selection. Locally high Fst values reflected genomic regions showing strong differentiation between mRDC and tRDC. The sharing of ROH among individuals could distinguish genomic regions that are fixed or close to fixation in mRDC. We observed a significant dependency (p<0.01) between the putative significant rIBD signals and the significant signals from iHS, Fst and sharing of ROH among individuals, except between the significant rIBD signals from HOL introgression and the significant iHS signals. This supports the hypothesis that many introgressed regions from HOL or BSW are probably a result of selection.

Page 14
We first observed significant positive correlations between Fst and rIBD scores larger than the
median for both HOL and BSW (p<0.001), supporting the view that at least some of the regions in mRDC
that are differentiated from tRDC have been introgressed from HOL or BSW (Figures S2 and S3).
Genes associated with milk yield and with protein and fat yield and percentage, such as CD14 and
ZNF215 (Magee, Sikora et al. 2010, Cochran, Cole et al. 2013), overlapped with haplotypes
introgressed from HOL and showed high Fst (Figure 4e). The longest region introgressed from HOL into
mRDC (chromosome 18: 56,320,000-61,350,000 bp) also showed high Fst values (Figure 4d). This region
contained a number of genes, such as MYH14 and ZNF613, associated with calving ease and young stock
survival (Abo-Ismail, Brito et al. 2017, Wu, Guldbrandtsen et al. 2017) (Figure 4d). The region on
chromosome 19 (52,370,000-52,380,000 bp), with a very high Fst of 0.748, showed strong
differentiation between tRDC and mRDC and overlapped with a haplotype introgressed from HOL into
mRDC. The RPTOR gene, associated with carcass traits in cattle (Sasago, Abe et al. 2017), is located
here (Figure 4a). RPTOR has also been found to play an important role in regulating cell growth,
energy homeostasis, apoptosis, and immune response during adaptation (Sun, Southard et al. 2010).
Similarly, two BSW introgressed haplotypes that overlapped with genes (ITPR2 and BCAT1) associated
with milk yield and with fat and protein yield and percentage (Pimentel, Bauersachs et al. 2011,
Fang, Fu et al. 2014) showed high Fst (Figure 4c). Moreover, the region on chromosome 6
(38,730,000-38,780,000 bp), with an average Fst of 0.251, overlapped with a BSW haplotype
introgressed into mRDC. The NCAPG and LCORL genes, associated with body conformation traits such as
stature and with feed efficiency (Eberlein, Takasuga et al. 2009, Lindholm-Perry, Sexten et al.
2011, Setoguchi, Watanabe et al. 2011, Xia, Fan et al. 2017), are located in this region.

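To make the comparison above concrete, the following Python sketch shows one simple way to obtain a
window-level Fst from allele frequencies in two populations and to correlate it with rIBD scores
above the median. It is a simplified illustration under stated assumptions. It uses a basic
heterozygosity-based Fst with equal population weights, not the Weir and Cockerham (1984) estimator
listed in the references, and it uses a Pearson correlation, which may differ from the statistic
used in this study. The function names and inputs are hypothetical.

```python
import numpy as np


def window_fst(p1, p2):
    """Heterozygosity-based Fst for one window of biallelic SNPs.

    p1, p2: allele frequencies of the same SNPs in two populations
    (for example mRDC and tRDC). This is a simplified estimator with equal
    population weights, combined over SNPs as a ratio of sums.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)                       # pooled heterozygosity
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    denom = h_total.sum()
    if denom == 0.0:
        return float("nan")                                    # no variable SNP in window
    return float((h_total - h_within).sum() / denom)


def correlation_above_median(ribd, fst):
    """Pearson correlation between rIBD and Fst over windows with rIBD above the median."""
    ribd = np.asarray(ribd, dtype=float)
    fst = np.asarray(fst, dtype=float)
    keep = ribd > np.median(ribd)
    return float(np.corrcoef(ribd[keep], fst[keep])[0, 1])
```

With this definition, a window in which the two populations are fixed for different alleles gives
an Fst of 1, and a window with identical allele frequencies gives an Fst of 0.
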
Regions introgressed from HOL or BSW into mRDC and the significant regions from the iHS test are
compared in Figures S4 and S5. There was a significant positive correlation between rIBD scores
larger than the median for regions introgressed from BSW into mRDC and the iHS test (p<0.001).
However, there was no significant correlation between rIBD scores for regions introgressed from
HOL into mRDC and the iHS test. As an example of the BSW signals, a genomic region on chromosome 6
putatively under selection according to the iHS test (38,510,000-38,540,000 bp; 104 out of 108 SNPs
with |iHS|>2; highest |iHS|=3.8) overlapped with an introgression signal from BSW into mRDC. This
region is located upstream of the LAP3 gene, which is associated with milk composition, including
fat and protein percentage, and with calving traits (Cohen-Zinder, Seroussi et al. 2005, Zheng, Ju
et al. 2011). On chromosome 5, significant SNPs with a highest |iHS| of 5.3 overlapped with a strong
introgression signal from BSW into mRDC. These significant SNPs overlap with the BCAT1 gene,
associated with milk yield and with protein and fat percentage in milk (Pimentel, Bauersachs et al.
2011) (Figure 4c). On chromosome 15, significant SNPs with a highest |iHS| of 4.23 overlapped with a
region introgressed from BSW into mRDC. This region lies within the ZNF215 gene, which affects body
conformation and milk composition (Magee, Sikora et al. 2010) (Figure 4e). Genomic regions showing
signals from both the iHS test and introgression mapping, e.g. the region containing BCAT1, were
introgressed from BSW and are probably under selection, but are not yet fixed in the population
because of low selection pressure or recent introduction.

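The window-level iHS summary referred to above (for example, 104 out of 108 SNPs with |iHS|>2) can
be sketched as the proportion of SNPs in each window whose absolute iHS exceeds a threshold. The
Python example below is a minimal illustration, assuming per-SNP iHS values for one chromosome are
already available. The function name is hypothetical, and the 10 kb window and the threshold of 2
follow the values mentioned in the text and in the caption of Figure S5.

```python
import numpy as np


def ihs_window_proportion(positions, ihs, window=10_000, threshold=2.0):
    """Proportion of SNPs with |iHS| above a threshold in fixed-size windows.

    positions: SNP positions (bp) on one chromosome.
    ihs: standardized iHS value for each SNP, in the same order.
    Returns a dict mapping window start (bp) to the proportion of extreme SNPs.
    Windows without any SNP are omitted.
    """
    positions = np.asarray(positions, dtype=np.int64)
    extreme = np.abs(np.asarray(ihs, dtype=float)) > threshold
    starts = (positions // window) * window        # window start for each SNP

    proportions = {}
    for start in np.unique(starts):
        in_window = starts == start
        proportions[int(start)] = float(extreme[in_window].mean())
    return proportions
```

For the chromosome 6 example in the text, 104 extreme SNPs out of 108 corresponds to a proportion of
about 0.96 over the reported region.
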
There was a significant positive correlation (p<0.001) across genomic regions between the number
of individuals with overlapping ROH and the rIBD scores larger than the median for both HOL and
BSW (Figures S6 and S7). Short ROHs shared between individuals indicated selection events that have
reached or are close to fixation. Interestingly, many small regions highly enriched for ROH hotspots
overlapped with the longest region introgressed from HOL into mRDC, such as the region where the
ZNF613 gene, associated with young stock survival (Wu, Guldbrandtsen et al. 2017), is located
(Figure 4d). Similarly, the region introgressed from HOL into mRDC where the THRSP and INTS4 genes
are located showed high enrichment of ROH shared among individuals (Figure 4b). THRSP is associated
with milk composition and is involved in the regulation of mammary synthesis of milk fat (Fontanesi,
Calo et al. 2014), and INTS4 is associated with myristic acid content, a carcass trait (Sasago, Abe
et al. 2017). One region introgressed from BSW into mRDC, containing the TGFA gene associated with
sperm motility (Hering, Olenski et al. 2014), also showed high levels of ROH sharing between
individuals.

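The ROH-sharing statistic discussed above can be thought of as the number of individuals whose ROH
cover each genomic window. The Python sketch below illustrates that count for one chromosome. It
assumes ROH segments have already been called per individual and are given as half-open intervals
in base pairs. The function name, the input layout and the 10 kb window size are assumptions made
for this example, not details taken from the study.

```python
def roh_sharing(roh_by_individual, chrom_length, window=10_000):
    """Count, for each window, how many individuals have an ROH overlapping it.

    roh_by_individual: dict mapping an individual ID to a list of
        (start, end) ROH segments in bp, treated as half-open [start, end).
    chrom_length: chromosome length in bp.
    Returns a list with one count per window along the chromosome.
    """
    n_windows = -(-chrom_length // window)          # ceiling division
    counts = [0] * n_windows

    for segments in roh_by_individual.values():
        covered = set()                             # windows hit by this individual
        for start, end in segments:
            if end <= start:
                continue                            # skip empty or malformed segments
            first = start // window
            last = (end - 1) // window
            covered.update(range(first, last + 1))
        # Each individual contributes at most once per window.
        for w in covered:
            if 0 <= w < n_windows:
                counts[w] += 1
    return counts
```

Windows in which the count approaches the number of sampled individuals would correspond to the ROH
hotspots described above.
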
Conclusion

Together, the observed results demonstrate how crossbreeding followed by selection shapes the
genome of a modern breed on a genome-wide scale, using dairy cattle as an example. The
well-documented breeding practices provide a robust model system for studying the consequences of
adaptive introgression. A key observation was the highly uneven distribution across the genome of
the proportion of genomic regions introgressed from the donor breeds. Highly introgressed regions
contained genes and QTLs known to affect traits of interest that are subject to active selection by
breeders. These traits include milk production, feed efficiency, calving traits, body conformation,
carcass, and fertility traits. Artificial selection plays an important role in shaping the genomic
footprints of introgression in the genome of a modern dairy cattle breed. These findings contribute
to the understanding of the genomic consequences of selective introgression in the genomes of
modern species.

Data availability
The whole-genome sequence data used in this study originated from the 1000 Bull Genomes Project.
Part of the whole-genome sequence data of individual bulls of the 1000 Bull Genomes Project
(Bouwman et al. 2018 Nature Genetics and Daetwyler et al. 2014 Nature Genetics) are available at
https://doi.org/10.1038/s41588-018-0056-5 and at NCBI under SRA no. SRP039339, SRR1188706,
SRR1205973, SRR1205973, SRR1205992, SRR1262533, SRR1262536, SRR1262538,
SRR1262539, SRR1262614, SRR1262659, SRR1262660, SRR1262788, SRR1262789,
SRR1262846, SRR1293227. The test statistics and scripts used in this study are available at http://XXX.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
QZ developed and planned the design of the study, coordinated the study, performed data analyses
and drafted the manuscript. BG participated in the design of the study, performed data analyses and
participated in drafting the manuscript. MB, GS and MC participated in the design of the study and
in drafting the manuscript. MSL participated in drafting the manuscript. All authors read and
approved the final manuscript.
Acknowledgements
We are grateful to Viking Genetics (Randers, Denmark) for providing samples for sequencing.
Qianqian Zhang benefited from a joint grant from the European Commission within the framework
of the Erasmus-Mundus joint doctorate "EGS-ABG". This research was supported by the Center for
Genomic Selection in Animals and Plants (GenSAP) funded by Innovation Fund Denmark (grant
0603-00519B). Mario Calus acknowledges financial support from the Dutch Ministry of Economic
Affairs, Agriculture, and Innovation (Public-private partnership “Breed4Food” code
BO-22.04-011-001-ASG-LR).

Figures
Figure 1. Population structure for four cattle breeds.
a. Admixture analysis of different cattle breeds with k=3. b. Principal component analysis (PCA)
plot of the different cattle breeds (principal component 2 vs. principal component 3).
BSW – Brown Swiss, HOL – Holstein, tRDC – traditional Red Dairy Cattle, mRDC – modern Red Dairy
Cattle.

Figure 2. Genome-wide pattern of relative identity-by-descent (rIBD) scores showing haplotypes
introgressed from Holstein (HOL) into modern Danish Red dairy cattle (mRDC). a. rIBD scores for
all 29 autosomes: positive scores mark regions where mRDC is more HOL-like, whereas negative
scores mark regions where it is more traditional Danish Red cattle (tRDC)-like. b. Distribution of
rIBD scores: positive scores mark regions where mRDC is more HOL-like, whereas negative scores mark
regions where it is more tRDC-like. Chromosomes 1-29 are colored alternately in red and green.

Figure 3. Genome-wide pattern of relative identity-by-descent (rIBD) scores showing haplotypes
introgressed from Brown Swiss (BSW) into modern Danish Red dairy cattle (mRDC). a. rIBD scores for
all 29 autosomes: positive scores mark regions where mRDC is more BSW-like, whereas negative
scores mark regions where it is more traditional Danish Red cattle (tRDC)-like. b. Distribution of
rIBD scores: positive scores mark regions where mRDC is more BSW-like, whereas negative scores mark
regions where it is more tRDC-like. Chromosomes 1-29 are colored alternately in red and green.

Figure 4. Examples of genomic regions with relative identity-by-descent (rIBD) scores, Fst,
integrated haplotype score (iHS) and sharing of ROH. a. Fst and rIBD for the genomic region
containing gene RPTOR; b. rIBD and sharing of ROH for the genomic region containing gene THRSP;
c. rIBD and iHS for the genomic region containing gene BCAT1; d. rIBD, Fst and sharing of ROH for
the genomic region containing genes MYH14 and ZNF613; e. rIBD, Fst and iHS for the genomic region
containing gene ZNF215.

Supplementary materials

Table S1. Introgressed haplotypes in modern Red Dairy cattle (mRDC) from Holstein (HOL)

Table S2. Introgressed haplotypes in modern Red Dairy cattle (mRDC) from Brown Swiss (BSW)

Figure S1. The local ancestry dosage in modern Red Dairy cattle (mRDC) from Holstein (HOL),
Brown Swiss (BSW) and traditional Red Dairy cattle (tRDC) for a) chromosome 14, b)
chromosome 21 and c) chromosome 29.

Figure S2. Comparison and correlation between Fst between modern and traditional Red Dairy cattle
(mRDC and tRDC) and the relative identity-by-descent (rIBD) score for introgression from Holstein
(HOL) cattle into mRDC (the correlation between rIBD scores larger than the median and the
corresponding Fst was 0.08, p<0.001). a. rIBD score showing introgression from HOL cattle in mRDC.
b. Fst between mRDC and tRDC.

Figure S3. Comparison and correlation between Fst between modern and traditional Red Dairy cattle
(mRDC and tRDC) and the relative identity-by-descent (rIBD) score for introgression from Brown Swiss
(BSW) cattle into mRDC (the correlation between rIBD scores larger than the median and the
corresponding Fst was 0.09, p<0.001). a. rIBD score showing introgression from BSW cattle in mRDC.
b. Fst between mRDC and tRDC.

Figure S4. Comparison between the iHS score detected within modern Danish Red dairy cattle (mRDC)
and the relative identity-by-descent (rIBD) score showing introgression from Holstein (HOL) cattle
in mRDC (the correlation between rIBD scores larger than the median and the corresponding proportion
of SNPs with |iHS|>2 was 0.02, p<0.05). a. Genome-wide pattern of rIBD scores showing introgression
from HOL in mRDC. b. iHS score detected within mRDC.

Figure S5. Comparison between the iHS score detected within modern Danish Red dairy cattle (mRDC)
and the relative identity-by-descent (rIBD) score showing introgression from Brown Swiss cattle
(BSW) in mRDC (the correlation between rIBD scores larger than the median and the corresponding
proportion of SNPs with |iHS|>2 in a 10 kb window was 0.17, p<0.001). a. Genome-wide pattern of
rIBD scores showing introgression from BSW in mRDC. b. iHS score detected within mRDC.

Figure S6. Comparison between overlapping runs of homozygosity (ROH) among modern Red
dairy cattle (mRDC) and the relative identity-by-descent (rIBD) score showing introgression from
Holstein cattle (HOL) in mRDC (the correlation between rIBD scores larger than the median and the
corresponding number of individuals sharing an ROH was 0.05, p<0.001). a. Genome-wide pattern
of rIBD scores showing introgression from HOL in mRDC. b. Sharing of ROH among mRDC.

Figure S7. Comparison between overlapping runs of homozygosity (ROH) among modern Red
dairy cattle (mRDC) and the relative identity-by-descent (rIBD) score showing introgression from
Brown Swiss cattle (BSW) in mRDC (the correlation between rIBD scores larger than the median and the
corresponding number of individuals sharing an ROH was 0.01, p<0.001). a. Genome-wide pattern
of rIBD scores showing introgression from BSW in mRDC. b. Sharing of ROH among mRDC.

References
Abo-Ismail, M. K., L. F. Brito, S. P. Miller, M. Sargolzaei, D. A. Grossi, S. S. Moore, G. Plastow, P. Stothard, S.
Nayeri and F. S. Schenkel, 2017. Genome-wide association studies and genomic prediction of breeding
values for calving performance and body conformation traits in Holstein cattle. Genetics Selection Evolution
49(1): 82.
Abo-Ismail, M. K., M. J. Kelly, E. J. Squires, K. C. Swanson, S. Bauck and S. P. Miller, 2013. Identification of single
nucleotide polymorphisms in genes involved in digestive and metabolic processes associated with feed
efficiency and performance traits in beef cattle. Journal of Animal Science 91(6): 2512-2529.
Abo-Ismail, M. K., G. Vander Voort, J. J. Squires, K. C. Swanson, I. B. Mandell, X. P. Liao, P. Stothard, S. Moore, G.
Plastow and S. P. Miller, 2014. Single nucleotide polymorphisms for feed efficiency and performance in
crossbred beef cattle. BMC Genetics 15(1): 14.
Ai, H. S., X. D. Fang, B. Yang, Z. Y. Huang, H. Chen, L. K. Mao, F. Zhang, L. Zhang, L. L. Cui, W. M. He, J. Yang, X. M.
Yao, L. S. Zhou, L. J. Han, J. Li, S. L. Sun, X. H. Xie, B. X. Lai, Y. Su, Y. Lu, H. Yang, T. Huang, W. J. Deng, R.
Nielsen, J. Ren and L. S. Huang, 2015. Adaptation and possible ancient interspecies introgression in pigs
identified by whole-genome sequencing. Nature Genetics 47(3): 217.
Alexander, D. H., J. Novembre and K. Lange, 2009. Fast model-based estimation of ancestry in unrelated
individuals. Genome Research 19(9): 1655-1664.
Alluwaimi, A. M., C. M. Leutenegger, T. B. Farver, P. V. Rossitto, W. L. Smith and J. S. Cullor, 2003. The cytokine
markers in Staphylococcus aureus mastitis of bovine mammary gland. Journal of Veterinary Medicine Series
B-Infectious Diseases and Veterinary Public Health 50(3): 105-111.
Andersen, B., B. Jensen, A. Nielsen, L. G. Christensen and T. Liboriussen, 2003. Rød Dansk Malkerace -
avlsmæssigt og kulturhistorisk belyst. Danmarks JordbrugsForskning.
Ashwell, M. S., D. W. Heyen, T. S. Sonstegard, C. P. Van Tassell, Y. Da, P. M. VanRaden, M. Ron, J. I. Weller and H.
A. Lewin, 2004. Detection of quantitative trait loci affecting milk production, health, and reproductive traits
in Holstein cattle. Journal of Dairy Science 87(2): 468-475.
Baeza, M. C., P. M. Corva, L. A. Soria, G. Rincon, J. F. Medrano, E. Pavan, E. L. Villarreal, A. Schor, L. Melucci, C.
Mezzadra and M. C. Miquel, 2011. Genetic markers of body composition and carcass quality in grazing
Brangus steers. Genetics and Molecular Research 10(4): 3146-3156.
Beecher, C., M. Daly, S. Childs, D. P. Berry, D. A. Magee, T. V. McCarthy and L. Giblin, 2010. Polymorphisms in
bovine immune genes and their associations with somatic cell count and milk production in dairy cattle.
BMC Genetics 11(1): 99.
Benjamini, Y. and D. Yekutieli, 2001. The control of the false discovery rate in multiple testing under
dependency. Annals of Statistics 29(4): 1165-1188.
Bongiorni, S., G. Mancini, G. Chillemi, L. Pariset and A. Valentini, 2012. Identification of a Short Region on
Chromosome 6 Affecting Direct Calving Ease in Piedmontese Cattle Breed. PLoS One 7(12): e50137.
Bosse, M., M. S. Lopes, O. Madsen, H. J. Megens, R. P. M. A. Crooijmans, L. A. F. Frantz, B. Harlizius, J. W. M.
Bastiaansen and M. A. M. Groenen, 2015. Artificial selection on introduced Asian haplotypes shaped the
genetic architecture in European commercial pigs. Proceedings of the Royal Society B-Biological Sciences
282(1821): 20152019.
Bosse, M., H. J. Megens, L. A. F. Frantz, O. Madsen, G. Larson, Y. Paudel, N. Duijvesteijn, B. Harlizius, Y.
Hagemeijer, R. P. M. A. Crooijmans and M. A. M. Groenen, 2014. Genomic analysis reveals selection for
Asian genes in European pigs following human-mediated introgression. Nature Communications 5: 4392.
Bosse, M., H. J. Megens, O. Madsen, L. A. F. Frantz, Y. Paudel, R. P. M. A. Crooijmans and M. A. M. Groenen,
2014. Untangling the hybrid nature of modern pig genomes: a mosaic derived from biogeographically
distinct and highly divergent Sus scrofa populations. Molecular Ecology 23(16): 4089-4102.
Browning, S. R. and B. L. Browning, 2007. Rapid and accurate haplotype phasing and missing-data inference for
whole-genome association studies by use of localized haplotype clustering. American Journal of Human
Genetics 81(5): 1084-1097.
Buzanskas, M. E., R. V. Ventura, T. C. S. Chud, P. A. Bernardes, D. J. D. Santos, L. C. D. Regitano, M. M. de
Alencar, M. D. Mudadu, R. Zanella, M. V. G. B. da Silva, C. X. Li, F. S. Schenkel and D. P. Munari, 2017. Study
on the introgression of beef breeds in Canchim cattle using single nucleotide polymorphism markers. PLoS
One 12(2): e0171660.
Capomaccio, S., M. Milanesi, L. Bomba, K. Cappelli, E. L. Nicolazzi, J. L. Williams, P. Ajmone-Marsan and B.
Stefanon, 2015. Searching new signals for production traits through gene-based association analysis in
three Italian cattle breeds. Animal Genetics 46(4): 361-370.
Chang, C. C., C. C. Chow, L. C. A. M. Tellier, S. Vattikuti, S. M. Purcell and J. J. Lee, 2015. Second-generation
PLINK: rising to the challenge of larger and richer datasets. Gigascience 4(1): 7.
Cochran, S. D., J. B. Cole, D. J. Null and P. J. Hansen, 2013. Discovery of single nucleotide polymorphisms in
candidate genes associated with fertility and production traits in Holstein cattle. BMC Genetics 14(1): 49.
Cohen-Zinder, M., E. Seroussi, D. M. Larkin, J. J. Loor, A. Everts-van der Wind, J. H. Lee, J. K. Drackley, M. R.
Band, A. G. Hernandez, M. Shani, H. A. Lewin, J. I. Weller and M. Ron, 2005. Identification of a missense
mutation in the bovine ABCG2 gene with a major effect on the QTL on chromosome 6 affecting milk yield
and composition in Holstein cattle. Genome Research 15(7): 936-944.
Cole, J. B., B. Waurich, M. Wensch-Dorendorf, D. M. Bickhart and H. H. Swalve, 2014. A genome-wide
association study of calf birth weight in Holstein cattle using single nucleotide polymorphisms and
phenotypes predicted from auxiliary traits. Journal of Dairy Science 97(5): 3156-3172.
Cole, J. B., G. R. Wiggans, L. Ma, T. S. Sonstegard, T. J. Lawlor, Jr., B. A. Crooker, C. P. Van Tassell, J. Yang, S.
Wang, L. K. Matukumalli and Y. Da, 2011. Genome-wide association analysis of thirty one production,
health, reproduction and body conformation traits in contemporary U.S. Holstein cows. BMC Genomics
12(1): 408.
Daetwyler, H. D., A. Capitan, H. Pausch, P. Stothard, R. Van Binsbergen, R. F. Brondum, X. P. Liao, A. Djari, S. C.
Rodriguez, C. Grohs, D. Esquerre, O. Bouchez, M. N. Rossignol, C. Klopp, D. Rocha, S. Fritz, A. Eggen, P. J.
Bowman, D. Coote, A. J. Chamberlain, C. Anderson, C. P. VanTassell, I. Hulsegge, M. E. Goddard, B.
Guldbrandtsen, M. S. Lund, R. F. Veerkamp, D. A. Boichard, R. Fries and B. J. Hayes, 2014. Whole-genome
sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nature Genetics
46(8): 858-865.
Davis, S. R., R. J. Spelman and M. D. Littlejohn, 2017. BREEDING AND GENETICS SYMPOSIUM: Breeding heat
tolerant dairy cattle: the case for introgression of the "slick" prolactin receptor variant into dairy breeds.
Journal of Animal Science 95(4): 1788-1800.
Deschamps, M., G. Laval, M. Fagny, Y. Itan, L. Abel, J. L. Casanova, E. Patin and L. Quintana-Murci, 2016.
Genomic Signatures of Selective Pressures and Introgression from Archaic Hominins at Human Innate
Immunity Genes. American Journal of Human Genetics 98(1): 5-21.
Eberlein, A., A. Takasuga, K. Setoguchi, R. Pfuhl, K. Flisikowski, R. Fries, N. Klopp, R. Furbass, R. Weikard and C.
Kuhn, 2009. Dissection of Genetic Factors Modulating Fetal Growth in Cattle Indicates a Substantial Role of
the Non-SMC Condensin I Complex, Subunit G (NCAPG) Gene. Genetics 183(3): 951-964.
Fang, M., W. X. Fu, D. Jiang, Q. Zhang, D. X. Sun, X. D. Ding and J. F. Liu, 2014. A Multiple-SNP Approach for
Genome-Wide Association Study of Milk Production Traits in Chinese Holstein Cattle. PLoS One 9(8): e99544.
Figueiro, H. V., G. Li, F. J. Trindade, J. Assis, F. Pais, G. Fernandes, S. H. D. Santos, G. M. Hughes, A. Komissarov,
A. Antunes, C. S. Trinca, M. R. Rodrigues, T. Linderoth, K. Bi, L. Silveira, F. C. C. Azevedo, D. Kantek, E.
Ramalho, R. A. Brassaloti, P. M. S. Villela, A. L. V. Nunes, R. H. F. Teixeira, R. G. Morato, D. Loska, P.
Saragueta, T. Gabaldon, E. C. Teeling, S. J. O'Brien, R. Nielsen, L. L. Coutinho, G. Oliveira, W. J. Murphy and E.
Eizirik, 2017. Genome-wide signatures of complex introgression and adaptive evolution in the big cats.
Science Advances 3(7): e1700299.
Fontanesi, L., D. G. Calo, G. Galimberti, R. Negrini, R. Marino, A. Nardone, P. Ajmone-Marsan and V. Russo,
2014. A candidate gene association study for nine economically important traits in Italian Holstein cattle.
Animal Genetics 45(4): 576-580.
Frischknecht, M., B. Bapst, F. R. Seefried, H. Signer-Hasler, D. Garrick, C. Stricker, R. Fries, I. Russ, J. Solkner, A.
Bieber, M. G. Strillacci, B. Gredler-Grandl, C. Flury and I. Consortium, 2017. Genome-wide association
studies of fertility and calving traits in Brown Swiss cattle using imputed whole-genome sequences. BMC
Genomics 18(1): 910.
Gaddis, K. L. P., D. J. Null and J. B. Cole, 2016. Explorations in genome-wide association studies and network
analyses with dairy cattle fertility traits. Journal of Dairy Science 99(8): 6420-6435.
Galov, A., E. Fabbri, R. Caniglia, H. Arbanasic, S. Lapalombella, T. Florijancic, I. Boskovic, M. Galaverni and E.
Randi, 2015. First evidence of hybridization between golden jackal (Canis aureus) and domestic dog (Canis
familiaris) as revealed by genetic markers. Royal Society Open Science 2(12): 150450.
Gautier, M. and R. Vitalis, 2012. rehh: an R package to detect footprints of selection in genome-wide SNP data
from haplotype structure. Bioinformatics 28(8): 1176-1177.
Grun, B. and F. Leisch, 2008. FlexMix Version 2: Finite Mixtures with Concomitant Variables and Varying and
Constant Parameters. Journal of Statistical Software 28(4): 1-35.
Guan, Y. T., 2014. Detecting Structure of Haplotypes and Local Ancestry. Genetics 196(3): 625-642.
Hartwig, S., R. Wellmann, R. Emmerling, H. Hamann and J. Bennewitz, 2015. Short communication: Importance
of introgression for milk traits in the German Vorderwald and Hinterwald cattle. Journal of Dairy Science
98(3): 2033-2038.
Hasenkamp, N., T. Solomon and D. Tautz, 2015. Selective sweeps versus introgression - population genetic
dynamics of the murine leukemia virus receptor Xpr1 in wild populations of the house mouse (Mus
musculus). BMC Evolutionary Biology 15(1): 248.
Hering, D. M., K. Olenski and S. Kaminski, 2014. Genome-wide association study for poor sperm motility in
Holstein-Friesian bulls. Animal Reproduction Science 146(3-4): 89-97.
Hoglund, J. K., B. Buitenhuis, B. Guldbrandtsen, M. S. Lund and G. Sahana, 2015. Genome-wide association
study for female fertility in Nordic Red cattle. BMC Genetics 16(1): 110.
Hope, A. C. A., 1968. A Simplified Monte Carlo Significance Test Procedure. Journal of the Royal Statistical
Society Series B-Statistical Methodology 30(3): 582-598.
Hu, Z. L., C. A. Park and J. M. Reecy, 2016. Developmental progress and current status of the Animal QTLdb.
Nucleic Acids Research 44(D1): D827-D833.
Jagoda, E., D. J. Lawson, J. D. Wall, D. Lambert, C. Muller, M. Westaway, M. Leavesley, T. D. Capellini, M.
Mirazon Lahr, P. Gerbault, M. G. Thomas, A. B. Migliano, E. Willerslev, M. Metspalu and L. Pagani, 2017.
Disentangling Immediate Adaptive Introgression from Selection on Standing Introgressed Variation in
Humans. Molecular Biology and Evolution 35(3): 623-630.
Kantanen, J., I. Olsaker, L. E. Holm, S. Lien, J. Vilkki, K. Brusgaard, E. Eythorsdottir, B. Danell and S. Adalsteinsson,
2000. Genetic diversity and population structure of 20 North European cattle breeds. Journal of Heredity
91(6): 446-457.
Keele, J. W., L. A. Kuehn, T. G. McDaneld, R. G. Tait, S. A. Jones, T. P. L. Smith, S. D. Shackelford, D. A. King, T. L.
Wheeler, A. K. Lindholm-Perry and A. K. McNeel, 2015. Genomewide association study of lung lesions in
cattle using sample pooling. Journal of Animal Science 93(3): 956-964.
Kinsella, R. J., A. Kahari, S. Haider, J. Zamora, G. Proctor, G. Spudich, J. Almeida-King, D. Staines, P. Derwent, A.
Kerhornou, P. Kersey and P. Flicek, 2011. Ensembl BioMarts: a hub for data retrieval across taxonomic
space. Database-the Journal of Biological Databases and Curation.
Kolbehdari, D., Z. Wang, J. R. Grant, B. Murdoch, A. Prasad, Z. Xiu, E. Marques, P. Stothard and S. S. Moore,
2008. A whole-genome scan to map quantitative trait loci for conformation and functional traits in
Canadian Holstein Bulls. Journal of Dairy Science 91(7): 2844-2856.
Li, H. and R. Durbin, 2009. Fast and accurate short read alignment with Burrows-Wheeler transform.
Bioinformatics 25(14): 1754-1760.
Li, H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin and G. P. D.
Proc, 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16): 2078-2079.
Lindholm-Perry, A. K., A. K. Sexten, L. A. Kuehn, T. P. L. Smith, D. A. King, S. D. Shackelford, T. L. Wheeler, C. L.
Ferrell, T. G. Jenkins, W. M. Snelling and H. C. Freetly, 2011. Association, effects and validation of
polymorphisms within the NCAPG-LCORL locus located on BTA6 with feed intake, gain, meat and carcass
traits in beef cattle. BMC Genetics 12(1): 103.
Magee, D. A., K. M. Sikora, E. W. Berkowicz, D. P. Berry, D. J. Howard, M. P. Mullen, R. D. Evans, C. Spillane and
D. E. MacHugh, 2010. DNA sequence polymorphisms in a panel of eight candidate bovine imprinted genes
and their association with performance traits in Irish Holstein-Friesian cattle. BMC Genetics 11(1): 93.
Mao, X., N. K. Kadri, J. R. Thomasen, D. J. De Koning, G. Sahana and B. Guldbrandtsen, 2016. Fine mapping of a
calving QTL on Bos taurus autosome 18 in Holstein cattle. Journal of Animal Breeding and Genetics 133(3):
207-218.
McKenna, A., M. Hanna, E. Banks, A. Sivachenko, K. Cibulskis, A. Kernytsky, K. Garimella, D. Altshuler, S. Gabriel,
M. Daly and M. A. DePristo, 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing
next-generation DNA sequencing data. Genome Research 20: 1297-1303.
Merks, J. W. M., P. K. Mathur and E. F. Knol, 2012. New phenotypes for new breeding goals in pigs. Animal 6(4):
535-543.
Michenet, A., R. Saintilan, E. Venot and F. Phocas, 2016. Insights into the genetic variation of maternal behavior
and suckling performance of continental beef cows. Genetics Selection Evolution 48(1): 45.
Pickrell, J. K. and J. K. Pritchard, 2012. Inference of Population Splits and Mixtures from Genome-Wide Allele
Frequency Data. PLoS Genetics 8(11): e1002967.
Pimentel, E. C. G., S. Bauersachs, M. Tietze, H. Simianer, J. Tetens, G. Thaller, F. Reinhardt, E. Wolf and S. Konig,
2011. Exploration of relationships between production and fertility traits in dairy cattle via association
studies of SNPs within candidate genes derived by expression profiling. Animal Genetics 42(3): 251-262.
Prakash, V., T. K. Bhattacharya, B. Jyotsana and O. P. Pandey, 2011. Molecular Cloning, Characterization,
Polymorphism, and Association Study of the Interleukin-2 Gene in Indian Crossbred Cattle. Biochemical
Genetics 49(9-10): 638-644.
Sabeti, P. C., D. E. Reich, J. M. Higgins, H. Z. P. Levine, D. J. Richter, S. F. Schaffner, S. B. Gabriel, J. V. Platko, N. J.
Patterson, G. J. McDonald, H. C. Ackerman, S. J. Campbell, D. Altshuler, R. Cooper, D. Kwiatkowski, R. Ward
and E. S. Lander, 2002. Detecting recent positive selection in the human genome from haplotype structure.
Nature 419(6909): 832-837.
Sahana, G., J. K. Hoglund, B. Guldbrandtsen and M. S. Lund, 2015. Loci associated with adult stature also affect
calf birth survival in cattle. BMC Genetics 16(1): 47.
Sasago, N., T. Abe, H. Sakuma, T. Kojima and Y. Uemoto, 2017. Genome-wide association study for carcass
traits, fatty acid composition, chemical composition, sugar, and the effects of related candidate genes in
Japanese Black cattle. Animal Science Journal 88(1): 33-44.
Setoguchi, K., T. Watanabe, R. Weikard, E. Albrecht, C. Kuhn, A. Kinoshita, Y. Sugimoto and A. Takasuga, 2011.
The SNP c.1326T > G in the non-SMC condensin I complex, subunit G (NCAPG) gene encoding a p.Ile442Met
variant is associated with an increase in body frame size at puberty in cattle. Animal Genetics 42(6):
650-655.
Sherry, S. T., M. H. Ward, M. Kholodov, J. Baker, L. Phan, E. M. Smigielski and K. Sirotkin, 2001. dbSNP: the NCBI
database of genetic variation. Nucleic Acids Research 29(1): 308-311.
Sun, C., C. Southard, D. B. Witonsky, R. Kittler and A. Di Rienzo, 2010. Allele-Specific Down-Regulation of RPTOR
Expression Induced by Retinoids Contributes to Climate Adaptations. PLoS Genetics 6(10): e1001178.
Saatchi, M., R. D. Schnabel, J. F. Taylor and D. J. Garrick, 2014. Large-effect pleiotropic or closely linked QTL
segregate within and across ten US cattle breeds. BMC Genomics 15(1): 442.
Voight, B. F., S. Kudaravalli, X. Q. Wen and J. K. Pritchard, 2006. A map of recent positive selection in the human
genome. PLoS Biology 4(3): 446-458.
vonHoldt, B. M., R. Kays, J. P. Pollinger and R. K. Wayne, 2016. Admixture mapping identifies introgressed
genomic regions in North American canids. Molecular Ecology 25(11): 2443-2453.
Weir, B. S. and C. C. Cockerham, 1984. Estimating F-Statistics for the Analysis of Population-Structure. Evolution
38(6): 1358-1370.
Weng, Z. Q., M. Saatchi, R. D. Schnabel, J. F. Taylor and D. J. Garrick, 2014. Recombination locations and rates in
beef cattle assessed from parent-offspring pairs. Genetics Selection Evolution 46(1): 34.
White, S., 2011. From Globalized Pig Breeds to Capitalist Pigs: A Study in Animal Cultures and
Evolutionary History. Environmental History 16(1): 94-120.
Williams, R. L., 2000. A note on robust variance estimation for cluster-correlated data. Biometrics 56(2):
645-646.
Wu, X. P., B. Guldbrandtsen, U. S. Nielsen, M. S. Lund and G. Sahana, 2017. Association analysis for young stock
survival index with imputed whole-genome sequence variants in Nordic Holstein cattle. Journal of Dairy
Science 100(8): 6356-6370.
Xia, J. W., H. Z. Fan, T. P. Chang, L. Y. Xu, W. G. Zhang, Y. X. Song, B. Zhu, L. P. Zhang, X. Gao, Y. Chen, J. Y. Li and
H. J. Gao, 2017. Searching for new loci and candidate genes for economically important traits through
gene-based association analysis of Simmental cattle. Scientific Reports 7: 42048.
Yang, J. A., S. H. Lee, M. E. Goddard and P. M. Visscher, 2011. GCTA: A Tool for Genome-wide Complex Trait
Analysis. American Journal of Human Genetics 88(1): 76-82.
Zhang, Q. Q., M. P. L. Calus, B. Guldbrandtsen, M. S. Lund and G. Sahana, 2015. Estimation of inbreeding using
pedigree, 50k SNP chip genotypes and full sequence data in three cattle breeds. BMC Genetics 16(1): 88.
Zhang, Q. Q., B. Guldbrandtsen, M. Bosse, M. S. Lund and G. Sahana, 2015. Runs of homozygosity and
distribution of functional variants in the cattle genome. BMC Genomics 16(1): 542.
Zheng, X., Z. H. Ju, J. Wang, Q. L. Li, J. M. Huang, A. W. Zhang, J. F. Zhong and C. F. Wang, 2011. Single
nucleotide polymorphisms, haplotypes and combined genotypes of LAP3 gene in bovine and their
association with milk production traits. Molecular Biology Reports 38(6): 4053-4061.
Zimin, A. V., A. L. Delcher, L. Florea, D. R. Kelley, M. C. Schatz, D. Puiu, F. Hanrahan, G. Pertea, C. P. Van Tassell,
T. S. Sonstegard, G. Marcais, M. Roberts, P. Subramanian, J. A. Yorke and S. L. Salzberg, 2009. A
whole-genome assembly of the domestic cow, Bos taurus. Genome Biology 10(4): R42.