
Human-mediated introgression of haplotypes in a modern dairy cattle breed

Qianqian Zhang1,2,#,*, Mario Calus2, Mirte Bosse2, Goutam Sahana1, Mogens Sandø Lund1, and Bernt Guldbrandtsen1

1 Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Denmark
2 Animal Breeding and Genomics, Wageningen University & Research, the Netherlands
# Present address: Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark
* Corresponding author

Email addresses:
QZ: [email protected]
MC: [email protected]
MB: [email protected]
GS: [email protected]
MSL: [email protected]
BG: [email protected]

Genetics: Early Online, published on May 30, 2018 as 10.1534/genetics.118.301143

Copyright 2018.


Abstract

Domestic animals can serve as model systems of adaptive introgression and its genomic signatures. In part, their usefulness as model systems is due to their well-known histories. Different breeding strategies such as introgression and artificial selection have generated numerous desirable phenotypes and superior performance in domestic animals. The Modern Danish Red Dairy Cattle is studied as an example of an introgressed population. It originates from crossing the traditional Danish Red Dairy Cattle with the Holstein and Brown Swiss breeds, both known for high milk production. This crossing happened, among other things, due to changes in production system, to raise milk production and overall performance. The genomes of Modern Danish Red Dairy Cattle are heavily influenced by regions introgressed from the Holstein and Brown Swiss breeds and under subsequent selection in the admixed population. The introgressed proportion of the genome was found to be highly variable across the genome. Haplotypes introgressed from Holstein and Brown Swiss contained or overlapped known genes affecting milk production, as well as protein and fat content (CD14, ZNF215, BCL2L12 and THRSP for Holstein origin; ITPR2, BCAT1, LAP3 and MED28 for Brown Swiss origin). Genomic regions with high introgression signals also contained genes and enriched QTLs associated with calving traits, body conformation, feed efficiency, carcass, and fertility traits. Introgression signals with relative identity by descent scores larger than the median, showing Holstein or Brown Swiss introgression, are mostly significantly correlated with the corresponding test statistics from signatures of selection analyses in Modern Danish Red Dairy Cattle. Moreover, the putative significant introgression signals show a significant dependency with the putative significant signals from signatures of selection analyses. Artificial selection has played an important role in the genomic footprints of introgression in the genome of Modern Danish Red Dairy Cattle. Our study on a modern cattle breed contributes to an understanding of the genomic consequences of selective introgression by demonstrating the extent to which adaptive effects contribute to shaping the specific genomic consequences of introgression.

Keywords: selective introgression, signature of selection, high yielding cattle breeds, modern dairy cattle breed


Introduction

Processes of adaptive introgression are complex, and their genomic signatures in humans and other species have been studied extensively (Hasenkamp, Solomon et al. 2015, Deschamps, Laval et al. 2016, Figueiro, Li et al. 2017, Jagoda, Lawson et al. 2017). Genome analysis has enabled an in-depth assessment of the genomic consequences of different demographic processes including introgression, selection, and their interplay in modern species (Deschamps, Laval et al. 2016, Figueiro, Li et al. 2017). Domestic animals can serve as model organisms for these processes. They have several advantages in understanding the impact of introgression and selection on genomes: first, introgression and selection are known to occur between breeds, and the processes are often well documented by breeders; second, massive data such as parentage, genotypes and phenotypes are routinely collected before and after introgression and selection; third, under controlled, appropriate environmental conditions a large part of the genomic consequences is caused by human-mediated directional introgression and selection.

Artificial selection and different breeding strategies have generated numerous desirable phenotypes in domestic animals such as cattle (Hartwig, Wellmann et al. 2015, Buzanskas, Ventura et al. 2017, Davis, Spelman et al. 2017), pigs (Bosse, Megens et al. 2014, Bosse, Megens et al. 2014, Ai, Fang et al. 2015) and dogs (Galov, Fabbri et al. 2015, vonHoldt, Kays et al. 2016). Strategies including crossbreeding and introgression have been very successful in improving productivity and performance in domestic animals. For example, Chinese pig breeds were imported to Europe to improve the productivity of European pigs in the late eighteenth and early nineteenth centuries (White 2011). Fertility-related traits have been largely improved by crossbreeding and introgression from Asian pigs (Merks, Mathur et al. 2012, Bosse, Megens et al. 2014, Bosse, Lopes et al. 2015). Similarly, in dairy cattle, crossbreeding of local breeds with other breeds and the subsequent introgression have been applied in order to achieve better productivity and performance (Davis, Spelman et al. 2017). The genetic architecture of modern domestic animals, including dairy cattle, is shaped by the interplay between different forces, including the intentional introduction of favorable alleles from other breeds, subsequent selection for favorable introgressed alleles, and demographic processes.

Since the introduction of scientific breeding, the main breeding goal in cattle has been to improve milk yield, fat, and protein content, even for dual-purpose breeds in intensive farming systems, especially in Europe (Hartwig, Wellmann et al. 2015). By crossing high-yielding breeds with local breeds, the productivity of local breeds could rapidly be increased at the expense of the genetic distinctiveness of the local breeds. The high productivity of these admixed breeds was further improved by intense selection, resulting in increased frequencies or even fixation of favorable alleles. Many of the variants thus spreading in the population will have been of introgressed origin. With the great success of cattle breeding including crossbreeding, and because of the availability of large-scale genomic data sets, analysis of admixed local cattle breeds represents an appealing model to identify the genomic consequences of admixture carried out with the intention of improving traits of interest in local breeds. We expect that the introgressed genomic regions play an important role in improving productivity and performance in the crossbreeds. We hypothesize that: 1) the genome-wide introgressed regions are non-randomly distributed across the genome with respect to their genomic locations; 2) the majority of highly introgressed regions affect production and performance traits; and 3) these introduced haplotypes are or have been under selection in the admixed breed.

To test these hypotheses and identify specific genomic regions and genes of interest involved in the important traits from high-yielding breeds such as Holstein (HOL) and Brown Swiss (BSW), we use the hybrid Modern Danish Red Dairy Cattle (mRDC) breed as an example. Our analyses illustrate the patterns of introgressed and selected haplotypes in an admixed local breed. The hybrid mRDC originates from traditional Danish Red Dairy Cattle (tRDC). In recent decades Holstein and Brown Swiss have been used extensively to improve the milk yield of mRDC (Kantanen, Olsaker et al. 2000). Years of crossbreeding and selection have led to the differentiation between mRDC and tRDC. The introgression of and selection for haplotypes from HOL and BSW have probably made a significant contribution to the increased milk production level of mRDC. The objective of this study was to examine genomic patterns of introgression from two high-yielding breeds (HOL and BSW) in a modern dairy breed (i.e. the hybrid mRDC), and to unravel the consequences of introgression and selection at the genome level using whole genome sequence data.

Methods

1. SNP genotyping, sequencing, variant calling, and quality control

Whole genome sequence data were available for 97 animals from 4 breeds (32 Holstein: HOL; 20 Brown Swiss: BSW; 15 traditional Danish Red Dairy Cattle: tRDC; 30 modern Danish Red Dairy Cattle: mRDC). All individuals' genomes were sequenced to a depth of ~10× or deeper using Illumina paired-end sequencing. Reads were aligned to the cattle genome assembly UMD3.1 (Zimin, Delcher et al. 2009) using bwa (Li and Durbin 2009). Aligned sequences were converted to raw BAM files using samtools (Li, Handsaker et al. 2009). Duplicate reads were marked using the samtools rmdup option (Li, Handsaker et al. 2009). The Genome Analysis Toolkit (McKenna, Hanna et al. 2010) was used for local realignment around insertion/deletion (indel) regions and for recalibration following the 1000 Bull Genomes Project guidelines (Daetwyler, Capitan et al. 2014), incorporating information from dbSNP (Sherry, Ward et al. 2001). Finally, variants were called using the Genome Analysis Toolkit's UnifiedGenotyper (McKenna, Hanna et al. 2010), which simultaneously calls short indels and SNPs. Indels and variants on sex chromosomes were excluded from further analyses.

2. Population structure and admixture

Using PLINK (Chang, Chow et al. 2015), the sequence variants were pruned to remove markers with pairwise linkage disequilibrium (LD) greater than 0.1 with any other SNP within a 50-SNP sliding window (advancing by 10 SNPs at a time). The SNPs on all the autosomes were used to study the population structure. To get an overview of the population structure of the genotyped animals from different breeds, the whole genome sequence data were used for principal component estimation using GCTA (Yang, Lee et al. 2011). Admixture analyses were done using the program Admixture (Alexander, Novembre et al. 2009) with values of K between 2 and 10. The K value with a low cross-validation error was chosen.

To investigate the statistical significance of admixture among these cattle populations, the TreeMix software was used to perform the three-population (f3) test (Pickrell and Pritchard 2012). In an f3 test of the form f3(A; B, C), an extremely negative f3 statistic indicates that there has been significant gene flow to A from populations B and C. Each pair drawn from HOL, BSW and tRDC was used as the source populations (B and C), with mRDC as the admixed population (A), which results in 3 different combinations.
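As a rough illustration of the quantity behind the f3 test, the sketch below computes the plain f3 statistic from per-SNP allele frequencies. It is not the threepop implementation, which additionally corrects for finite sample size and uses a block jackknife to obtain Z scores; the function name and the toy frequencies are assumptions made for illustration only.

```python
def f3_statistic(freq_a, freq_b, freq_c):
    """Naive f3(A; B, C): mean over SNPs of (a - b) * (a - c).

    freq_a, freq_b and freq_c are equal-length lists of allele
    frequencies for the target population A and the candidate source
    populations B and C. No finite-sample bias correction is applied
    (a simplifying assumption of this sketch).
    """
    if not freq_a or not (len(freq_a) == len(freq_b) == len(freq_c)):
        raise ValueError("need three non-empty lists of equal length")
    total = 0.0
    for a, b, c in zip(freq_a, freq_b, freq_c):
        total += (a - b) * (a - c)
    return total / len(freq_a)


# Toy example: A is an exact 50/50 mixture of B and C at every SNP,
# so (a - b) and (a - c) always have opposite signs and f3 is negative.
source_b = [0.1, 0.8, 0.3, 0.9]
source_c = [0.7, 0.2, 0.9, 0.1]
target_a = [(b + c) / 2 for b, c in zip(source_b, source_c)]
f3_value = f3_statistic(target_a, source_b, source_c)
```

A negative value for the toy mixture mirrors the interpretation given above: allele frequencies in the admixed population lie between those of its two sources.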

We also calculated the breed proportions for the sequenced mRDC individuals using the pedigree. The full pedigree of the sequenced mRDC animals was extracted and used to infer the breed proportions by coding the breed where the ancestors enter as the known parents of the traced animal.
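The pedigree tracing can be pictured as a simple recursion in which each animal inherits half of its breed composition from each parent. The sketch below is one possible reading of that procedure, not the authors' code; the data structures, the animal IDs and the "unknown" label for missing parents are all hypothetical.

```python
def breed_proportions(animal, pedigree, founder_breed, cache=None):
    """Return {breed: proportion} for `animal` by tracing its pedigree.

    pedigree maps an animal ID to a (sire, dam) tuple; either parent may
    be None when unknown. founder_breed maps animals without recorded
    parents to the breed under which they enter the pedigree. An unknown
    parent contributes its half to the label "unknown".
    """
    if cache is None:
        cache = {}
    if animal in cache:
        return cache[animal]
    sire, dam = pedigree.get(animal, (None, None))
    if sire is None and dam is None:
        # Founder: the whole genome is assigned to the recorded breed.
        result = {founder_breed.get(animal, "unknown"): 1.0}
    else:
        result = {}
        for parent in (sire, dam):
            if parent is None:
                contribution = {"unknown": 1.0}
            else:
                contribution = breed_proportions(
                    parent, pedigree, founder_breed, cache
                )
            for breed, share in contribution.items():
                result[breed] = result.get(breed, 0.0) + 0.5 * share
    cache[animal] = result
    return result


# Hypothetical three-generation pedigree (all IDs are made up).
pedigree = {
    "calf": ("sire1", "dam1"),
    "sire1": ("hol_bull", "trdc_cow"),
    "dam1": ("bsw_bull", "trdc_cow2"),
}
founder_breed = {
    "hol_bull": "HOL",
    "bsw_bull": "BSW",
    "trdc_cow": "tRDC",
    "trdc_cow2": "tRDC",
}
calf_breeds = breed_proportions("calf", pedigree, founder_breed)
```

In this toy pedigree the calf has one HOL and one BSW grandsire and two tRDC granddams, so half of its expected ancestry is tRDC and a quarter each is HOL and BSW.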

3. Introgression mapping

3.1 Calculation of ancestry dosages in mRDC

The two-layer hidden Markov model developed by Guan (2014) was used to infer the structure of local haplotypes introgressed from HOL, BSW and tRDC in mRDC. The software package ELAI (Guan 2014) was used to infer the ancestry dosages of the haplotypes from the three source populations in mRDC. SNPs from the sequence variants with a minor allele frequency (MAF) lower than 0.01 or a missing proportion higher than 0.05 were removed from subsequent analyses. The three-way admixture option with 10 admixture generations was chosen, which approximates the history of admixture in the mRDC population. Thirty steps of the expectation-maximization (EM) algorithm were run to infer the ancestry dosages of HOL, tRDC and BSW in mRDC.
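The SNP filter described above amounts to two thresholds per marker. A minimal sketch follows, assuming genotypes are coded as 0, 1 or 2 copies of the alternative allele with None for a missing call, and that MAF is computed over all supplied animals; the function name and the example vectors are hypothetical and this is not the filtering code used in the study.

```python
def passes_qc(genotypes, min_maf=0.01, max_missing=0.05):
    """Decide whether one SNP is kept.

    genotypes holds one entry per animal: 0, 1 or 2 copies of the
    alternative allele, or None when the call is missing. The SNP is
    dropped if its missing proportion exceeds max_missing or its minor
    allele frequency is below min_maf.
    """
    if not genotypes:
        raise ValueError("no genotypes supplied")
    called = [g for g in genotypes if g is not None]
    missing_rate = (len(genotypes) - len(called)) / len(genotypes)
    if not called or missing_rate > max_missing:
        return False
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1.0 - alt_freq)
    return maf >= min_maf


# Hypothetical genotype vectors for three SNPs across 100 animals.
common_snp = [0, 1, 2, 1] * 25          # alternative allele frequency 0.5
rare_snp = [0] * 99 + [1]               # alternative allele frequency 0.005
sparse_snp = [1] * 90 + [None] * 10     # 10% of calls missing
kept = [passes_qc(s) for s in (common_snp, rare_snp, sparse_snp)]
```

Only the first SNP survives: the second fails the MAF threshold and the third fails the missingness threshold.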

3.2 Relative Identity by Descent Scores (rIBD)

HOL, BSW and tRDC have made large genetic contributions to the mRDC. Therefore, the sequenced mRDC, tRDC, BSW and HOL animals were selected for introgression mapping analysis. The identity-by-descent (IBD) regions shared between mRDC and tRDC were used as a reference to map the regions introgressed from HOL and BSW using pairwise comparisons between these breeds. Following Bosse et al. (Bosse, Megens et al. 2014), sequences for the 29 autosomes were first phased using Beagle fastIBD (V. 3.3.2) (Browning and Browning 2007). Pairwise comparisons for detecting IBD were performed between mRDC and tRDC, mRDC and HOL, and mRDC and BSW. As recommended in the Beagle documentation (Browning and Browning 2007), 10 independent runs of phasing and pairwise IBD detection were performed. The identified IBD segments were combined from the 10 runs, and the threshold parameter, a compromise between power and false-discovery rate, was set to 10⁻¹⁰ for identifying true shared IBD, as suggested by Browning and Browning (Browning and Browning 2007). We defined the IBD score as the number of recorded true IBD haplotype segments divided by the total number of pairwise comparisons, using windows of 10 kb. The IBD score was calculated for each pairwise comparison using a custom Perl script. To quantify the relative proportion of the genome introgressed from HOL or BSW, we calculated the relative IBD score (rIBD) as the difference between the IBD score of mRDC with the introgressing breed and the IBD score of mRDC with tRDC: rIBD = IBD score (mRDC & HOL) - IBD score (mRDC & tRDC) for HOL introgression, and rIBD = IBD score (mRDC & BSW) - IBD score (mRDC & tRDC) for BSW introgression. Thus, rIBD takes values in the range of -1 to 1. rIBD=1 signifies that all haplotypes in mRDC originate from the introgressing breed (HOL or BSW), while rIBD=-1 signifies that 100% originate from tRDC. The variance of the rIBD score was calculated using a robust method (Williams 2000). The p values of rIBD scores were derived under the neutral hypothesis, assuming that rIBD is normally distributed with a standard deviation equal to the square root of the robust estimate of the rIBD variance (Bosse, Megens et al. 2014). The significant introgressed haplotypes from HOL and BSW in mRDC were defined as the haplotypes with a p value, corrected using the Benjamini-Hochberg procedure (Benjamini and Yekutieli 2001), lower than 0.02 and 0.04, respectively. Pearson's correlations between rIBD scores from HOL or BSW introgression larger than the median of rIBD scores and the corresponding test statistics from signatures of selection analyses were calculated. Moreover, the putative significant introgression signals from HOL and BSW were extracted and compared with the putative significant signals from signatures of selection analyses using a chi-squared test, with p values calculated by Monte Carlo simulation (Hope 1968).
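To make the window-based scores concrete, the sketch below computes an IBD score per 10 kb window and the resulting rIBD. It follows the sign convention used in the Results, where positive values indicate HOL or BSW origin and negative values indicate tRDC origin. This is not the authors' Perl script; the function names, the half-open interval convention and the toy segments are assumptions.

```python
WINDOW = 10_000  # window size in bp, as in the text


def ibd_scores(segments, n_pairs, chrom_length, window=WINDOW):
    """Per-window IBD score: IBD segments overlapping the window / n_pairs.

    segments is a list of (start, end) positions in bp (half-open), one
    tuple per IBD segment detected in any pairwise comparison. The sketch
    assumes at most one segment per pair overlaps a given window.
    """
    n_windows = (chrom_length + window - 1) // window
    counts = [0] * n_windows
    for start, end in segments:
        first = start // window
        last = min((end - 1) // window, n_windows - 1)
        for w in range(first, last + 1):
            counts[w] += 1
    return [c / n_pairs for c in counts]


def ribd(scores_donor, scores_recipient):
    """rIBD per window: donor score (HOL or BSW) minus tRDC score."""
    return [d - r for d, r in zip(scores_donor, scores_recipient)]


# Toy chromosome of 30 kb with 4 pairwise comparisons per breed pair.
hol_segments = [(0, 20_000), (5_000, 10_000), (10_000, 30_000)]
trdc_segments = [(20_000, 30_000), (0, 10_000)]
hol_scores = ibd_scores(hol_segments, n_pairs=4, chrom_length=30_000)
trdc_scores = ibd_scores(trdc_segments, n_pairs=4, chrom_length=30_000)
ribd_hol = ribd(hol_scores, trdc_scores)
```

In the toy data the middle window has the strongest HOL signal, because two of the four mRDC-HOL comparisons are IBD there and none of the mRDC-tRDC comparisons are.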

4. Detection of signature of selection

4.1 Fst analysis

The genetic differentiation between individuals from tRDC and mRDC was measured by pairwise Fst analysis following Weir and Cockerham (Weir and Cockerham 1984). Pairwise Fst was computed with Genepop 4.2 in bins of 10 kb over the full length of the genome (Weir and Cockerham 1984). The correlations between Fst and the rIBD scores for HOL and BSW introgression were calculated for the same 10 kb bins. The R package FlexMix (Grun and Leisch 2008) was used to fit a series of finite normal mixture models to explore the underlying structure of the distribution of Fst, presumably caused by different evolutionary or demographic processes ongoing in the populations, such as balancing selection, directional selection and neutrality. This mixture model postulated that $\overline{F_{st}} \sim \sum_{k=1}^{K} \pi_k N(\overline{F_{st}}_{l} \mid \mu_k, \sigma_k^2)$, where $K$ is the number of components of the mixture, $l$ denotes the $l$th locus, $\pi_k$ is the probability that $\overline{F_{st}}$ belongs to cluster $k$, and $\mu_k$ and $\sigma_k^2$ are the expectation and variance of the normal distribution, respectively. The number of components of the mixture ($K$) was determined by the smallest Akaike's information criterion (AIC) among models with different numbers of components. Model parameters were estimated by maximum likelihood via the EM algorithm in FlexMix. Finally, the p value of Fst, combining the $K$ components, was calculated as: $p = 1 - \sum_{k=1}^{K} \pi_k N(\overline{F_{st}}_{l} \mid \mu_k, \sigma_k^2)$. The regions with corrected p values, using the Benjamini-Hochberg procedure (Benjamini and Yekutieli 2001), of less than 0.05 were used for testing dependency of putative significant signals between different test statistics.
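The multiple-testing correction applied to the p values can be sketched with the textbook Benjamini-Hochberg step-up adjustment. The function below is a generic illustration under that assumption, not the code used in the study, and the example p values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values are
    # monotone: each one is the minimum of p * m / rank over its own
    # rank and all higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p values for four windows.
raw_p = [0.01, 0.04, 0.03, 0.5]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

With these made-up values only the first window stays below 0.05 after adjustment, which illustrates why the corrected cutoff is stricter than the raw one.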

4.2 Extended haplotype homozygosity tests

The extended haplotype homozygosity tests were applied between the breeds for the sequenced individuals as a second line of evidence for signatures of selection. The genome-wide scan for the integrated haplotype score (iHS) in mRDC was performed using the R package rehh (Sabeti, Reich et al. 2002, Gautier and Vitalis 2012). A threshold of |iHS| of 2.5 for SNPs was used as the cutoff to define signals of selection, following previous studies (Voight, Kudaravalli et al. 2006, Gautier and Vitalis 2012). To compare iHS with other test statistics such as rIBD scores on the same scale, we calculated the proportion of SNPs with |iHS|>2 in windows of 10 kb and identified the windows in the highest 1% of the empirical distribution of the proportion of SNPs with |iHS|>2, following Voight et al. (Voight, Kudaravalli et al. 2006). These putative significant windows were used for testing dependency of putative significant signals between different test statistics.
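The window summary of iHS can be sketched as follows, assuming per-SNP iHS values are already available (for example from rehh). The function names, the tuple layout and the toy values are hypothetical, and ties at the cutoff are broken arbitrarily in this sketch.

```python
def ihs_window_proportions(snps, window=10_000, threshold=2.0):
    """Map window index -> proportion of SNPs with |iHS| > threshold.

    snps is a list of (position_bp, ihs) tuples for one chromosome.
    Windows without any SNP are omitted from the result.
    """
    totals, extreme = {}, {}
    for pos, ihs in snps:
        w = pos // window
        totals[w] = totals.get(w, 0) + 1
        if abs(ihs) > threshold:
            extreme[w] = extreme.get(w, 0) + 1
    return {w: extreme.get(w, 0) / n for w, n in totals.items()}


def top_fraction(proportions, fraction=0.01):
    """Windows in the top `fraction` of the empirical distribution."""
    ranked = sorted(proportions, key=proportions.get, reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    return set(ranked[:n_top])


# Toy data: five SNPs spread over three 10 kb windows.
toy_snps = [(1_000, 0.5), (4_000, -2.6), (12_000, 2.1),
            (15_000, 2.4), (25_000, 0.1)]
props = ihs_window_proportions(toy_snps)
top_windows = top_fraction(props, fraction=0.34)
```

Converting per-SNP scores to a per-window proportion puts iHS on the same 10 kb grid as the rIBD and Fst statistics, which is what allows the window-level comparisons described above.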

4.3 Sharing of runs of homozygosity (ROH)

Runs of homozygosity (ROH) were computed for the sequenced animals to detect ROH regions (minimum 10 kb considered) shared among individuals. For a description of the procedures for calculating nucleotide diversity and detecting ROH, see (Zhang, Calus et al. 2015, Zhang, Guldbrandtsen et al. 2015). The sharing of ROH regions was calculated as the number of individuals sharing the same ROH region on a particular segment, using windows of 10 kb across the whole genome in mRDC. Regions enriched for shared ROH were defined as regions exceeding the 95th percentile of the empirical distribution of the number of individuals sharing the same ROH region in any particular segment.
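A minimal sketch of the ROH-sharing count is given below. It assumes ROH intervals have already been called per animal and that a window counts for an animal only when a single ROH covers it completely; both the data layout and that coverage rule are assumptions of this sketch, as is the nearest-rank definition of the percentile.

```python
def roh_sharing(roh_by_animal, chrom_length, window=10_000):
    """Count, per window, how many animals carry an ROH covering it.

    roh_by_animal maps an animal ID to a list of (start, end) ROH
    intervals in bp (half-open). Only complete windows are counted.
    """
    n_windows = chrom_length // window
    counts = [0] * n_windows
    for intervals in roh_by_animal.values():
        covered = set()
        for start, end in intervals:
            first = -(-start // window)           # ceil(start / window)
            last = min(end // window, n_windows)  # one past last full window
            covered.update(range(first, last))
        for w in covered:
            counts[w] += 1
    return counts


def percentile_threshold(values, pct=95):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ranked = sorted(values)
    rank = max(1, -(-len(ranked) * pct // 100))   # ceil(n * pct / 100)
    return ranked[rank - 1]


# Toy example: three animals on a 50 kb chromosome (IDs are made up).
toy_roh = {
    "animal_a": [(0, 30_000)],
    "animal_b": [(10_000, 40_000)],
    "animal_c": [(15_000, 35_000)],
}
sharing = roh_sharing(toy_roh, chrom_length=50_000)
cutoff = percentile_threshold(sharing)
```

In the toy data the third window is the only one covered by an ROH in all three animals, so it has the highest sharing count.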

5. Gene annotations and enrichment of QTLs

Genes in genomic regions showing significant introgression of haplotypes from HOL or BSW in mRDC were annotated. The cattle QTLs were extracted from the Animal QTL Database (Hu, Park et al. 2016). QTLs on the X chromosome, or without locations and references in the list from the Animal QTL Database (Hu, Park et al. 2016), were excluded. The remaining QTLs were classified into 6 groups according to the associated traits: milk, reproduction, production, health, meat and carcass, and exterior. QTLs overlapping genomic regions with high rIBD scores were identified. When two QTLs had exactly the same genomic interval and the same associated trait group, they were counted as one QTL. Gene annotations in these regions were retrieved from the Ensembl Genes 89 Database using BioMart (Kinsella, Kahari et al. 2011).

To test for enrichment of QTLs in the candidate introgressed regions from HOL or BSW, we applied a permutation test. The candidate introgressed regions were randomly distributed across the whole genome. This permutation did not change the relative proportion and length of the introgressed regions, in order to preserve their correlation structure. We then computed the total number of QTLs, and the number of QTLs in each of the 6 trait groups, that overlapped with the permuted introgressed regions. In total, 10,000 permutations were performed. The distribution of the numbers of QTLs observed in the permuted regions was treated as the null distribution from which we computed the significance levels of the number of QTLs observed in the real data.
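The permutation scheme can be sketched as follows. This is a simplified illustration rather than the authors' procedure: the genome is treated as a single linear coordinate axis, permuted regions are allowed to overlap one another, and an add-one correction is applied to the p value. All function names and the toy intervals are hypothetical.

```python
import random


def count_overlaps(regions, qtls):
    """Number of QTL intervals overlapping at least one region."""
    return sum(
        any(r_start < q_end and q_start < r_end for r_start, r_end in regions)
        for q_start, q_end in qtls
    )


def permutation_p_value(regions, qtls, genome_length, n_perm=10_000, seed=1):
    """One-sided p value for enrichment of QTLs in `regions`.

    Each permutation re-places every region at a uniformly random start
    while keeping its length, then counts overlapping QTLs.
    """
    rng = random.Random(seed)
    observed = count_overlaps(regions, qtls)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        permuted = []
        for start, end in regions:
            length = end - start
            new_start = rng.randint(0, genome_length - length)
            permuted.append((new_start, new_start + length))
        if count_overlaps(permuted, qtls) >= observed:
            at_least_as_extreme += 1
    # Add-one correction (an assumption of this sketch) so that the
    # estimated p value is never exactly zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)


# Toy genome of length 1000 with two candidate regions and three QTLs.
toy_regions = [(100, 200), (500, 600)]
toy_qtls = [(150, 160), (550, 560), (900, 910)]
observed_overlaps = count_overlaps(toy_regions, toy_qtls)
p_enrichment = permutation_p_value(toy_regions, toy_qtls, 1000, n_perm=200)
```

Keeping the region lengths fixed while moving their positions is what lets the null distribution reflect chance overlap alone.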

Results and discussion

1. Population structure and evidence of introgression

The population structure of the sampled cattle breeds, analyzed using Admixture and PCA, is shown in Figure 1. mRDC had contributions from tRDC, HOL and BSW, as observed in the Admixture analysis (Figure 1). Figure 1 clearly demonstrates the hybrid nature of mRDC cattle, which is consistent with the recorded pedigree information on the introgression history of mRDC (Andersen, Jensen et al. 2003). A large contribution to mRDC was observed from tRDC, the recipient population. Notably, BSW and HOL, two mainstream breeds, each contributed heavily to the genomes of extant mRDC individuals. The proportion of introgression differs quite extensively among individuals. In the PCA (Figure 1), Principal Component 1 (PC1, 6.23% of variance) separated HOL, BSW and tRDC. mRDC, however, was dispersed between the other breeds, demonstrating the admixture of HOL, BSW and tRDC, which all contributed to mRDC. Similarly, PC2 separated HOL, BSW and tRDC, and mRDC individuals showed a wide range of PC2 values, lying among the individuals from the other breeds.

Moreover, the breed proportions of mRDC individuals were derived using the recoded full pedigree: on average, mRDC animals had 0.27 HOL, 0.17 BSW and 0.29 tRDC ancestry. The statistical significance of admixture in mRDC was measured for each combination of two populations chosen from HOL, BSW and tRDC by the f3 test, using the program threepop from TreeMix (Pickrell and Pritchard 2012). We observed extreme Z scores from the f3 test for mRDC using any combination of HOL, BSW and tRDC, i.e. -14.97, -11.72 and -19.24. These results are consistent with what we observed in the Admixture and PCA analyses and provide statistical support for the admixture of mRDC from HOL, BSW and tRDC. These results support that mRDC is indeed a composite breed and can be studied further for introgression from HOL and BSW.

2. Introgression mapping

Introgression mapping was performed to identify regions in mRDC that contained an excess of introgressed haplotypes. We first examined the local structure of haplotypes in mRDC introgressed from HOL, BSW and tRDC using a three-way admixture approach (Figure S1). We observed average ancestry dosages of 0.78, 0.63 and 0.59, with standard deviations of 0.24, 0.22 and 0.18, from HOL, tRDC and BSW to mRDC, respectively, averaged across the whole genome and all mRDC individuals. Three genomic regions with an ancestry dosage of 2 were observed from HOL to mRDC (Figure S1). This indicates complete replacement of tRDC genomic material by HOL genomic material. However, there is only one annotated gene (SNORD116) in these regions. Interestingly, the genomic region with the full HOL ancestry dosage on chromosome 21 (1,348,427-2,107,179 bp) is associated with gestation length and calving ease (Frischknecht, Bapst et al. 2017). Other QTLs overlapping regions with full HOL ancestry in mRDC were associated with bovine respiratory disease susceptibility, body weight and udder swelling score (Saatchi, Schnabel et al. 2014, Keele, Kuehn et al. 2015, Michenet, Saintilan et al. 2016).

To address which admixed haplotypes have most influenced the existing mRDC breed, we inferred whether a genomic region was introgressed from high-yielding breeds in multiple individuals by examining the contributions proportional to the admixture fraction. The frequencies of all mRDC haplotypes that were of HOL, BSW or tRDC origin were calculated across the whole genome. The relative fractions of HOL or BSW haplotypes versus tRDC haplotypes in the mRDC group were calculated as 'relative IBD scores' (rIBD). Shared haplotypes (i.e. haplotypes with shared ancestry) were observed between mRDC on the one hand, and HOL, BSW and tRDC on the other. These findings are in agreement with the results of the population structure and admixture analyses, which showed contributions from HOL, BSW and tRDC to mRDC. Relative to the tRDC haplotype frequency, the HOL and BSW haplotype frequencies in the mRDC population at a given locus (i.e. the rIBD score) ranged from 0.73 to -0.74 and from 0.81 to -0.92, respectively, where 1 indicates that all haplotypes were of HOL or BSW origin, while none were of tRDC origin. A value of -1 indicates that all haplotypes were tRDC-like (Figures 2 and 3). The rIBD scores averaged across the genome were negative (-0.06 for HOL introgression and -0.09 for BSW introgression) (Figures 2a and 3a), showing that the majority of the genome displayed more similarity with tRDC than with either HOL or BSW. However, every chromosome contained genomic regions where the signal for HOL or BSW haplotypes was stronger than that for tRDC.

The distribution of rIBD from the comparison between mRDC and HOL or BSW for IBD 331

haplotypes resembled a normal distribution (Figures 2b and 3b). By taking a cut-off of rIBD values, 332

we were able to identify the regions which were likely to be of HOL origin or BSW origin. Across 333

the whole genome, many known genes and QTLs were located within regions with high rIBD for 334

HOL origin (Table S1). We observed that the QTLs are significantly enriched in the HOL-like 335

Page 12: Human-Mediated Introgression of Haplotypes in a Modern ......May 30, 2018  · 68 Artificial selection and different breeding strategies have enabled generating numerous desirable

haplotypes in mRDC (p=0.025). The genes and QTLs were associated with economic traits, including milk-related traits such as milk yield, protein and fat yield and percentage (CD14, ZNF215, BCL2L12 and THRSP) (Ashwell, Heyen et al. 2004, Beecher, Daly et al. 2010, Magee, Sikora et al. 2010, Cole, Wiggans et al. 2011, Cochran, Cole et al. 2013, Fontanesi, Calo et al. 2014, Capomaccio, Milanesi et al. 2015), calving traits (MYH14, KCNC3, SYT3 and CTU1) (Kolbehdari, Wang et al. 2008, Cole, Wiggans et al. 2011, Gaddis, Null et al. 2016, Mao, Kadri et al. 2016, Abo-Ismail, Brito et al. 2017), feed efficiency-related traits (LRRIQ3, ATP6V1B2 and CCKBR) (Abo-Ismail, Kelly et al. 2013, Abo-Ismail, Vander Voort et al. 2014), and carcass traits (ZNF215, INTS4 and RPTOR) (Magee, Sikora et al. 2010, Sasago, Abe et al. 2017). QTLs associated with production and reproduction traits were significantly enriched in regions introgressed from HOL that showed high rIBD scores (p=0.0004 for production and p=0.034 for reproduction traits). The longest continuous introgressed region (defined as a region with rIBD>0) from HOL in mRDC was on chromosome 18 (56,320,000-61,350,000 bp) (Figure 2a). This region was previously found to be associated with calving traits and young stock survival in Nordic Holstein cattle (Cole, Wiggans et al. 2011, Mao, Kadri et al. 2016, Wu, Guldbrandtsen et al. 2017). The recombination rate in this region of chromosome 18 is low (Weng, Saatchi et al. 2014), and long genomic regions showing a signal of introgression probably tend to occur where recombination is low. Moreover, the introgressed haplotypes included numerous annotated genes because of genetic hitchhiking and the short time since introgression. The region with the highest rIBD score from HOL was located on chromosome 4 (120,540,000-120,810,000 bp, average rIBD=0.449), downstream of the gene VIPR2. VIPR2 has been proposed as a candidate gene that affects fat percentage and plays an important role in milk synthesis (Capomaccio, Milanesi et al. 2015).
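The definition of the longest continuous introgressed region used above (consecutive positions with rIBD>0) can be sketched in a few lines. This is a minimal illustration, not the authors' script: the function name `longest_positive_run`, the input layout (sorted window start positions with one rIBD score each) and the 10 kb window size are assumptions made for the example.

```python
from typing import List, Optional, Tuple

WINDOW_BP = 10_000  # assumed window size; adjust to the binning actually used


def longest_positive_run(
    starts: List[int], ribd: List[float], window_bp: int = WINDOW_BP
) -> Optional[Tuple[int, int]]:
    """Return (start_bp, end_bp) of the longest run of adjacent windows with rIBD > 0.

    `starts` holds sorted window start positions on a single chromosome.
    A missing window (a gap in `starts`) breaks the run.
    Returns None when no window has a positive score.
    """
    best: Optional[Tuple[int, int]] = None
    run_start: Optional[int] = None
    prev: Optional[int] = None
    for pos, score in zip(starts, ribd):
        contiguous = prev is not None and pos == prev + window_bp
        if score > 0:
            # Open a new run at the first positive window or after a gap.
            if run_start is None or not contiguous:
                run_start = pos
            end = pos + window_bp
            if best is None or end - run_start > best[1] - best[0]:
                best = (run_start, end)
        else:
            run_start = None
        prev = pos
    return best
```

Applied per chromosome, the longest of the returned intervals would correspond to a region such as the one reported on chromosome 18. Ties are resolved in favor of the first run encountered.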

Similarly, many known genes and QTLs occurred in regions where mRDC shared haplotypes with BSW (Table S2). The enrichment of QTLs in the BSW-like haplotypes in mRDC approached significance (p=0.056). These genes and QTLs mainly affected milk composition, including fat and protein percentage and yield (ITPR2, BCAT1, LAP3 and MED28) (Cohen-Zinder, Seroussi et al. 2005, Pimentel, Bauersachs et al. 2011, Zheng, Ju et al. 2011, Fang, Fu et al. 2014), growth and body conformation traits such as stature (NCAPG, LCORL, PPP2R1A, IGFBP6 and CREBBP) (Kolbehdari, Wang et al. 2008, Baeza, Corva et al. 2011, Lindholm-Perry, Sexten et al. 2011, Cole, Waurich et al. 2014, Sahana, Hoglund et al. 2015), calving and fertility traits (EIF4G3, TGFA and LAP3) (Bongiorni, Mancini et al. 2012, Hering, Olenski et al. 2014, Hoglund, Buitenhuis et al. 2015), and feed efficiency-related traits (CLMP and PPP2R1A) (Prakash, Bhattacharya et al. 2011, Cole, Waurich et al. 2014). We observed a significant enrichment of QTLs associated with milk, production, and meat and carcass traits in the regions introgressed from BSW in mRDC (p=0.045 for milk, p=0.019 for production, and p=0.017 for meat and carcass traits). While the longest region introgressed from HOL in mRDC was ~4 Mb, no region introgressed from BSW in mRDC was longer than 1.5 Mb. The highest rIBD signal was observed on chromosome 17 (35,630,000-35,780,000 bp, average rIBD=0.563). This region is located downstream of the IL2 gene and also contains one unannotated gene. IL2 has previously been shown to be associated with mastitis, milk yield and lactation persistency (Alluwaimi, Leutenegger et al. 2003, Prakash, Bhattacharya et al. 2011).

3. Regions of introgression and evidence for selection

We have shown that HOL or BSW haplotypes introgressed into mRDC often originated from genomic regions harboring genes associated with milk production, calving traits, feed efficiency, fertility, or body conformation and carcass traits. A haplotype introduced from HOL or BSW that persists in mRDC, that is, one that has not been removed by negative selection or genetic drift, may have been maintained by positive or balancing selection. Selection changes the structure of an introgressed haplotype, for example its linkage disequilibrium (LD) and the distribution of allele frequencies. The length of an introgressed haplotype is affected by a combination of the local recombination rate and the strength of selection. To infer which introgressed regions are under selection in the mRDC population, we used three independent methods: iHS, Fst, and sharing of ROH among individuals. iHS identified regions with extended haplotype homozygosity in mRDC due to selection. Locally high Fst values marked genomic regions that were strongly differentiated between mRDC and tRDC. Sharing of ROH among individuals marked genomic regions that are fixed or close to fixation in mRDC. We observed a significant dependency (p<0.01) between the putative significant rIBD signals and the significant signals from iHS, Fst and sharing of ROH among individuals, except between the significant rIBD signals from HOL introgression and the significant iHS signals. This supports the hypothesis that many regions introgressed from HOL or BSW are probably a result of selection.
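One way to examine a dependency of the kind reported above is to flag each genomic window as significant or not for two signals and test the resulting 2x2 table. The sketch below uses a chi-square statistic with a Monte Carlo (permutation) p-value. It is an assumption-laden illustration: the text above does not say which test was used, and the function names `chi_square_2x2` and `monte_carlo_p` are hypothetical.

```python
import random
from typing import List, Sequence


def chi_square_2x2(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Pearson chi-square statistic for the 2x2 table of two per-window flags."""
    n = len(a)
    counts = [[0, 0], [0, 0]]
    for x, y in zip(a, b):
        counts[int(x)][int(y)] += 1
    row = [sum(counts[0]), sum(counts[1])]
    col = [counts[0][0] + counts[1][0], counts[0][1] + counts[1][1]]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected > 0:  # skip empty margins instead of dividing by zero
                stat += (counts[i][j] - expected) ** 2 / expected
    return stat


def monte_carlo_p(
    a: Sequence[bool], b: Sequence[bool], n_perm: int = 2000, seed: int = 1
) -> float:
    """Permutation p-value: shuffle one flag vector and recompute the statistic."""
    rng = random.Random(seed)
    observed = chi_square_2x2(a, b)
    shuffled: List[bool] = list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if chi_square_2x2(a, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Here `a` could mark windows with a significant rIBD signal and `b` windows with a significant iHS, Fst or ROH-sharing signal. Shuffling windows independently ignores the correlation between neighboring windows, so a p-value from this sketch is likely to be anti-conservative for real genomic data.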


We first observed significant positive correlations between Fst and rIBD scores larger than the median for both HOL and BSW (p<0.001), supporting the view that at least some of the regions in mRDC that are differentiated from tRDC have been introgressed from HOL or BSW (Figures S2 and S3). Haplotypes introgressed from HOL that overlapped genes associated with milk yield, protein and fat yield and percentage, such as CD14 and ZNF215 (Magee, Sikora et al. 2010, Cochran, Cole et al. 2013), showed high Fst (Figure 4e). The longest region introgressed from HOL in mRDC (chromosome 18: 56,320,000-61,350,000 bp) also showed high Fst values (Figure 4d). This region contained a number of genes that combined introgression from HOL with high Fst, such as MYH14 and ZNF613, which are associated with calving ease and young stock survival (Abo-Ismail, Brito et al. 2017, Wu, Guldbrandtsen et al. 2017) (Figure 4d). A region on chromosome 19 (52,370,000-52,380,000 bp) with a very high Fst of 0.748, indicating strong differentiation between tRDC and mRDC, overlapped with a haplotype introgressed from HOL into mRDC. The RPTOR gene, which is associated with carcass traits in cattle (Sasago, Abe et al. 2017), is located here (Figure 4a). RPTOR has also been found to play an important role in regulating cell growth, energy homeostasis, apoptosis, and the immune response during adaptation (Sun, Southard et al. 2010). Similarly, two haplotypes introgressed from BSW that overlapped genes (ITPR2 and BCAT1) associated with milk yield, fat and protein yield and percentage (Pimentel, Bauersachs et al. 2011, Fang, Fu et al. 2014) showed high Fst (Figure 4c). Moreover, a region on chromosome 6 (38,730,000-38,780,000 bp) with an average Fst of 0.251 overlapped with a BSW haplotype introgressed into mRDC. The NCAPG and LCORL genes, which are associated with body conformation traits such as stature and with feed efficiency (Eberlein, Takasuga et al. 2009, Lindholm-Perry, Sexten et al. 2011, Setoguchi, Watanabe et al. 2011, Xia, Fan et al. 2017), are located in this region.
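The correlation described above is restricted to windows whose rIBD score exceeds the genome-wide median. A minimal sketch of that filtering step follows, assuming per-window vectors of equal length and a Pearson correlation. The coefficient type is an assumption (the text does not name it), and `correlation_above_median` is a hypothetical helper.

```python
import math
import statistics
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def correlation_above_median(
    ribd: Sequence[float], other: Sequence[float]
) -> float:
    """Correlate `other` (e.g. Fst per window) with rIBD, using only the
    windows whose rIBD score is strictly larger than the median rIBD."""
    cutoff = statistics.median(ribd)
    pairs = [(r, o) for r, o in zip(ribd, other) if r > cutoff]
    if len(pairs) < 2:
        raise ValueError("need at least two windows above the median")
    xs, ys = zip(*pairs)
    return pearson(xs, ys)
```

The same helper could be reused with the proportion of SNPs with |iHS|>2 or the number of individuals sharing an ROH as the second vector.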

Regions introgressed from HOL or BSW into mRDC are compared with the significant regions from the iHS test in Figures S4 and S5. For BSW, rIBD scores larger than the median were significantly and positively correlated with the iHS signal (p<0.001). For HOL, however, there was no significant correlation between the rIBD scores of introgressed regions and the iHS signal. As an example for BSW, a region on chromosome 6 putatively under selection according to the iHS test (38,510,000-38,540,000 bp; 104 out of 108 SNPs with |iHS|>2, highest |iHS|=3.8) overlapped with an introgression signal from BSW into mRDC. This region lies upstream of the LAP3 gene, which is associated with milk composition, including fat and protein percentage, and with calving traits (Cohen-Zinder, Seroussi et al. 2005, Zheng, Ju et al. 2011). On chromosome 5, significant SNPs with a highest |iHS| of 5.3 overlapped with a strong introgression signal from BSW into mRDC. These SNPs overlap the BCAT1 gene, which is associated with milk yield and with protein and fat percentage in milk (Pimentel, Bauersachs et al. 2011) (Figure 4c). On chromosome 15, significant SNPs with a highest |iHS| of 4.23 overlapped with a region introgressed from BSW into mRDC. This region lies within the ZNF215 gene, which affects body conformation and milk composition (Magee, Sikora et al. 2010) (Figure 4e). Genomic regions showing signals from both the iHS test and introgression mapping, such as the region containing BCAT1, were introgressed from BSW and are probably under selection but not yet fixed in the population, owing to weak selection pressure or recent introduction.
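Summaries such as "104 out of 108 SNPs with |iHS|>2" amount to counting outlier SNPs per genomic window. The sketch below shows one way to compute that proportion, assuming standardized iHS values are already available per SNP. The function name `ihs_outlier_fraction` and the fixed 10 kb binning are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def ihs_outlier_fraction(
    snps: Iterable[Tuple[int, float]],
    window_bp: int = 10_000,
    threshold: float = 2.0,
) -> Dict[int, float]:
    """Map each window start (bp) to the fraction of its SNPs with |iHS| > threshold.

    `snps` yields (position_bp, standardized_iHS) pairs for one chromosome.
    Windows without any SNP are simply absent from the result.
    """
    totals: Dict[int, int] = defaultdict(int)
    outliers: Dict[int, int] = defaultdict(int)
    for pos, ihs in snps:
        window = (pos // window_bp) * window_bp  # start of the containing window
        totals[window] += 1
        if abs(ihs) > threshold:
            outliers[window] += 1
    return {w: outliers[w] / totals[w] for w in sorted(totals)}
```

Windows with few SNPs give noisy fractions, so a minimum SNP count per window may be worth adding before such values are compared with rIBD scores.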

Across genomic regions, there was a significant positive correlation (p<0.001) between the number of individuals with overlapping ROH and rIBD scores larger than the median for both HOL and BSW (Figures S6 and S7). Short ROH shared between individuals indicate selection events that have reached or are close to fixation. Interestingly, many small regions highly enriched for ROH hotspots overlapped with the longest region introgressed from HOL into mRDC, such as the region containing the ZNF613 gene associated with young stock survival (Wu, Guldbrandtsen et al. 2017) (Figure 4d). Similarly, the region introgressed from HOL in mRDC where the THRSP and INTS4 genes are located showed high enrichment of ROH shared among individuals (Figure 4b). THRSP is associated with milk composition and is involved in the regulation of mammary synthesis of milk fat (Fontanesi, Calo et al. 2014), and INTS4 is associated with myristic acid content, a carcass trait (Sasago, Abe et al. 2017). One region introgressed from BSW into mRDC, which contained the TGFA gene associated with sperm motility (Hering, Olenski et al. 2014), also showed high levels of ROH sharing between individuals.
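The ROH-sharing measure discussed above is, per genomic window, the number of individuals that carry an ROH overlapping that window. A minimal sketch, assuming ROH segments are given as half-open intervals per individual and that any overlap with a window counts as sharing. Both are assumptions, as are the name `roh_sharing` and the 10 kb window size.

```python
from typing import Dict, Iterable, List, Tuple


def roh_sharing(
    roh_by_individual: Dict[str, List[Tuple[int, int]]],
    window_starts: Iterable[int],
    window_bp: int = 10_000,
) -> Dict[int, int]:
    """Count, for each window, the individuals with at least one overlapping ROH.

    ROH segments and windows are half-open intervals [start, end) in bp
    on a single chromosome.
    """
    counts: Dict[int, int] = {}
    for w_start in window_starts:
        w_end = w_start + window_bp
        counts[w_start] = sum(
            any(seg_start < w_end and seg_end > w_start
                for seg_start, seg_end in segments)
            for segments in roh_by_individual.values()
        )
    return counts
```

Requiring the ROH to cover the whole window rather than merely overlap it would be a stricter alternative. The quadratic scan is fine for a sketch, but an interval index would be preferable for whole-genome data.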

Conclusion

Together, these results demonstrate how crossbreeding followed by selection shapes the genome of a modern breed on a genome-wide scale, using dairy cattle as an example. The well-documented breeding practice provides a robust model system for studying the consequences of adaptive introgression. A key observation was that the proportion of the genome introgressed from the donor breeds was highly unevenly distributed across the genome. Highly introgressed regions contained genes and QTLs known to affect traits of interest that are subject to active selection by breeders. These traits include milk production, feed efficiency, calving traits, body conformation, carcass, and fertility traits. Artificial selection plays an important role in shaping the genomic footprints of introgression in the genome of a modern dairy cattle breed. These findings contribute to the understanding of the genomic consequences of selective introgression in the genomes of modern species.

Data availability

The whole-genome sequence data used in this study originated from the 1000 Bull Genomes Project. Part of the whole-genome sequence data of individual bulls of the 1000 Bull Genomes Project (Bouwman et al. 2018 Nature Genetics and Daetwyler et al. 2014 Nature Genetics) are available at https://doi.org/10.1038/s41588-018-0056-5 and at NCBI under SRA no. SRP039339, SRR1188706, SRR1205973, SRR1205992, SRR1262533, SRR1262536, SRR1262538, SRR1262539, SRR1262614, SRR1262659, SRR1262660, SRR1262788, SRR1262789, SRR1262846, SRR1293227. The test statistics and scripts used in this study are available at http://XXX.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

QZ developed and planned the design of the study, coordinated the study, performed data analyses and drafted the manuscript. BG participated in the design of the study, performed data analyses and helped draft the manuscript. MB, GS and MC participated in the design of the study and in drafting the manuscript. MSL participated in drafting the manuscript. All authors read and approved the final manuscript.

Acknowledgements

We are grateful to Viking Genetics (Randers, Denmark) for providing samples for sequencing. Qianqian Zhang benefited from a joint grant from the European Commission within the framework of the Erasmus-Mundus joint doctorate "EGS-ABG". This research was supported by the Center for Genomic Selection in Animals and Plants (GenSAP) funded by Innovation Fund Denmark (grant 0603-00519B). Mario Calus acknowledges financial support from the Dutch Ministry of Economic Affairs, Agriculture, and Innovation (Public-private partnership “Breed4Food” code BO-22.04-011-001-ASG-LR).

Figures

Figure 1. Population structure for four cattle breeds. a. Admixture analysis of different cattle breeds with k=3. b. Principal component analysis (PCA) plot of the different cattle breeds (principal component 2 vs. principal component 3). BSW: Brown Swiss; HOL: Holstein; tRDC: traditional Red Dairy Cattle; mRDC: modern Red Dairy Cattle.

Figure 2. Genome-wide pattern of relative identity-by-descent (rIBD) scores showing introgressed haplotypes from Holstein (HOL) in modern Danish Red dairy cattle (mRDC). a. rIBD scores for all 29 autosomes. Positive scores mark regions where mRDC is more HOL-like, whereas negative scores mark regions where it is more like traditional Danish Red cattle (tRDC). b. Distribution of rIBD scores, with the same sign convention. Chromosomes 1-29 are colored alternately in red and green.

Figure 3. Genome-wide pattern of relative identity-by-descent (rIBD) scores showing introgressed haplotypes from Brown Swiss (BSW) in modern Danish Red dairy cattle (mRDC). a. rIBD scores for all 29 autosomes. Positive scores mark regions where mRDC is more BSW-like, whereas negative scores mark regions where it is more like traditional Danish Red cattle (tRDC). b. Distribution of rIBD scores. Positive scores mark regions where mRDC is more BSW-like, whereas negative scores mark regions where it is more tRDC-like. Chromosomes 1-29 are colored alternately in red and green.

Figure 4. Examples of genomic regions with relative identity-by-descent (rIBD) scores, Fst, integrated haplotype score (iHS) and sharing of ROH. a. Fst and rIBD for the genomic region containing gene RPTOR; b. rIBD and sharing of ROH for the genomic region containing gene THRSP; c. rIBD and iHS for the genomic region containing gene BCAT1; d. rIBD, Fst and sharing of ROH for the genomic region containing genes MYH14 and ZNF613; e. rIBD, Fst and iHS for the genomic region containing gene ZNF215.

Supplementary materials

Table S1. Introgressed haplotypes in modern Red Dairy cattle (mRDC) from Holstein (HOL)

Table S2. Introgressed haplotypes in modern Red Dairy cattle (mRDC) from Brown Swiss (BSW)

Figure S1. The local ancestry dosage in modern Red Dairy cattle (mRDC) from Holstein (HOL), Brown Swiss (BSW) and traditional Red Dairy cattle (tRDC) for a) chromosome 14, b) chromosome 21 and c) chromosome 29.

Figure S2. Comparison and correlation between Fst between modern and traditional Red Dairy cattle (mRDC and tRDC) and the relative identity-by-descent (rIBD) score showing introgression from Holstein (HOL) cattle in mRDC (the correlation between rIBD scores larger than the median and the corresponding Fst was 0.08, p<0.001). a. rIBD score showing introgression from HOL cattle in mRDC. b. Fst between mRDC and tRDC.

Figure S3. Comparison and correlation between Fst between modern and traditional Red Dairy cattle (mRDC and tRDC) and the relative identity-by-descent (rIBD) score showing introgression from Brown Swiss (BSW) cattle in mRDC (the correlation between rIBD scores larger than the median and the corresponding Fst was 0.09, p<0.001). a. rIBD score showing introgression from BSW cattle in mRDC. b. Fst between mRDC and tRDC.

Figure S4. Comparison between the iHS score detected within modern Danish Red dairy cattle (mRDC) and the relative identity-by-descent (rIBD) score showing introgression from Holstein (HOL) cattle in mRDC (the correlation between rIBD scores larger than the median and the corresponding proportion of SNPs with |iHS|>2 was 0.02, p<0.05). a. Genome-wide pattern of rIBD scores showing introgression from HOL in mRDC. b. iHS score detected within mRDC.

Figure S5. Comparison between the iHS score detected within modern Danish Red dairy cattle (mRDC) and the relative identity-by-descent (rIBD) score showing introgression from Brown Swiss cattle (BSW) in mRDC (the correlation between rIBD scores larger than the median and the corresponding proportion of SNPs with |iHS|>2 in a 10 kb window was 0.17, p<0.001). a. Genome-wide pattern of rIBD scores showing introgression from BSW in mRDC. b. iHS score detected within mRDC.

Figure S6. Comparison between overlapping runs of homozygosity (ROH) among modern Red Dairy cattle (mRDC) and the relative identity-by-descent (rIBD) score showing introgression from Holstein cattle (HOL) in mRDC (the correlation between rIBD scores larger than the median and the corresponding number of individuals sharing an ROH was 0.05, p<0.001). a. Genome-wide pattern of rIBD scores showing introgression from HOL in mRDC. b. Sharing of ROH among mRDC.

Figure S7. Comparison between overlapping runs of homozygosity (ROH) among modern Red Dairy cattle (mRDC) and the relative identity-by-descent (rIBD) score showing introgression from Brown Swiss cattle (BSW) in mRDC (the correlation between rIBD scores larger than the median and the corresponding number of individuals sharing an ROH was 0.01, p<0.001). a. Genome-wide pattern of rIBD scores showing introgression from BSW in mRDC. b. Sharing of ROH among mRDC.

References

Abo-Ismail, M. K., L. F. Brito, S. P. Miller, M. Sargolzaei, D. A. Grossi, S. S. Moore, G. Plastow, P. Stothard, S. Nayeri and F. S. Schenkel, 2017. Genome-wide association studies and genomic prediction of breeding values for calving performance and body conformation traits in Holstein cattle. Genetics Selection Evolution 49(1): 82.

Abo-Ismail, M. K., M. J. Kelly, E. J. Squires, K. C. Swanson, S. Bauck and S. P. Miller, 2013. Identification of single nucleotide polymorphisms in genes involved in digestive and metabolic processes associated with feed efficiency and performance traits in beef cattle. Journal of Animal Science 91(6): 2512-2529.

Abo-Ismail, M. K., G. Vander Voort, J. J. Squires, K. C. Swanson, I. B. Mandell, X. P. Liao, P. Stothard, S. Moore, G. Plastow and S. P. Miller, 2014. Single nucleotide polymorphisms for feed efficiency and performance in crossbred beef cattle. BMC Genetics 15(1): 14.

Ai, H. S., X. D. Fang, B. Yang, Z. Y. Huang, H. Chen, L. K. Mao, F. Zhang, L. Zhang, L. L. Cui, W. M. He, J. Yang, X. M. Yao, L. S. Zhou, L. J. Han, J. Li, S. L. Sun, X. H. Xie, B. X. Lai, Y. Su, Y. Lu, H. Yang, T. Huang, W. J. Deng, R. Nielsen, J. Ren and L. S. Huang, 2015. Adaptation and possible ancient interspecies introgression in pigs identified by whole-genome sequencing. Nature Genetics 47(3): 217.

Alexander, D. H., J. Novembre and K. Lange, 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Research 19(9): 1655-1664.

Alluwaimi, A. M., C. M. Leutenegger, T. B. Farver, P. V. Rossitto, W. L. Smith and J. S. Cullor, 2003. The cytokine markers in Staphylococcus aureus mastitis of bovine mammary gland. Journal of Veterinary Medicine Series B-Infectious Diseases and Veterinary Public Health 50(3): 105-111.

Andersen, B., B. Jensen, A. Nielsen, L. G. Christensen and T. Liboriussen, 2003. Rød Dansk Malkerace - avlsmæssigt og kulturhistorisk belyst. Danmarks JordbrugsForskning.

Ashwell, M. S., D. W. Heyen, T. S. Sonstegard, C. P. Van Tassell, Y. Da, P. M. VanRaden, M. Ron, J. I. Weller and H. A. Lewin, 2004. Detection of quantitative trait loci affecting milk production, health, and reproductive traits in Holstein cattle. Journal of Dairy Science 87(2): 468-475.

Baeza, M. C., P. M. Corva, L. A. Soria, G. Rincon, J. F. Medrano, E. Pavan, E. L. Villarreal, A. Schor, L. Melucci, C. Mezzadra and M. C. Miquel, 2011. Genetic markers of body composition and carcass quality in grazing Brangus steers. Genetics and Molecular Research 10(4): 3146-3156.

Beecher, C., M. Daly, S. Childs, D. P. Berry, D. A. Magee, T. V. McCarthy and L. Giblin, 2010. Polymorphisms in bovine immune genes and their associations with somatic cell count and milk production in dairy cattle. BMC Genetics 11(1): 99.

Benjamini, Y. and D. Yekutieli, 2001. The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29(4): 1165-1188.

Bongiorni, S., G. Mancini, G. Chillemi, L. Pariset and A. Valentini, 2012. Identification of a Short Region on Chromosome 6 Affecting Direct Calving Ease in Piedmontese Cattle Breed. PLoS One 7(12): e50137.

Bosse, M., M. S. Lopes, O. Madsen, H. J. Megens, R. P. M. A. Crooijmans, L. A. F. Frantz, B. Harlizius, J. W. M. Bastiaansen and M. A. M. Groenen, 2015. Artificial selection on introduced Asian haplotypes shaped the genetic architecture in European commercial pigs. Proceedings of the Royal Society B-Biological Sciences 282(1821): 20152019.

Bosse, M., H. J. Megens, L. A. F. Frantz, O. Madsen, G. Larson, Y. Paudel, N. Duijvesteijn, B. Harlizius, Y. Hagemeijer, R. P. M. A. Crooijmans and M. A. M. Groenen, 2014. Genomic analysis reveals selection for Asian genes in European pigs following human-mediated introgression. Nature Communications 5: 4392.

Bosse, M., H. J. Megens, O. Madsen, L. A. F. Frantz, Y. Paudel, R. P. M. A. Crooijmans and M. A. M. Groenen, 2014. Untangling the hybrid nature of modern pig genomes: a mosaic derived from biogeographically distinct and highly divergent Sus scrofa populations. Molecular Ecology 23(16): 4089-4102.

Browning, S. R. and B. L. Browning, 2007. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. American Journal of Human Genetics 81(5): 1084-1097.

Buzanskas, M. E., R. V. Ventura, T. C. S. Chud, P. A. Bernardes, D. J. D. Santos, L. C. D. Regitano, M. M. de Alencar, M. D. Mudadu, R. Zanella, M. V. G. B. da Silva, C. X. Li, F. S. Schenkel and D. P. Munari, 2017. Study on the introgression of beef breeds in Canchim cattle using single nucleotide polymorphism markers. PLoS One 12(2): e0171660.

Capomaccio, S., M. Milanesi, L. Bomba, K. Cappelli, E. L. Nicolazzi, J. L. Williams, P. Ajmone-Marsan and B. Stefanon, 2015. Searching new signals for production traits through gene-based association analysis in three Italian cattle breeds. Animal Genetics 46(4): 361-370.

Chang, C. C., C. C. Chow, L. C. A. M. Tellier, S. Vattikuti, S. M. Purcell and J. J. Lee, 2015. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4(1): 7.

Cochran, S. D., J. B. Cole, D. J. Null and P. J. Hansen, 2013. Discovery of single nucleotide polymorphisms in candidate genes associated with fertility and production traits in Holstein cattle. BMC Genetics 14(1): 49.

Cohen-Zinder, M., E. Seroussi, D. M. Larkin, J. J. Loor, A. Everts-van der Wind, J. H. Lee, J. K. Drackley, M. R. Band, A. G. Hernandez, M. Shani, H. A. Lewin, J. I. Weller and M. Ron, 2005. Identification of a missense mutation in the bovine ABCG2 gene with a major effect on the QTL on chromosome 6 affecting milk yield and composition in Holstein cattle. Genome Research 15(7): 936-944.

Cole, J. B., B. Waurich, M. Wensch-Dorendorf, D. M. Bickhart and H. H. Swalve, 2014. A genome-wide association study of calf birth weight in Holstein cattle using single nucleotide polymorphisms and phenotypes predicted from auxiliary traits. Journal of Dairy Science 97(5): 3156-3172.

Cole, J. B., G. R. Wiggans, L. Ma, T. S. Sonstegard, T. J. Lawlor, Jr., B. A. Crooker, C. P. Van Tassell, J. Yang, S. Wang, L. K. Matukumalli and Y. Da, 2011. Genome-wide association analysis of thirty one production, health, reproduction and body conformation traits in contemporary U.S. Holstein cows. BMC Genomics 12(1): 408.

Daetwyler, H. D., A. Capitan, H. Pausch, P. Stothard, R. Van Binsbergen, R. F. Brondum, X. P. Liao, A. Djari, S. C. Rodriguez, C. Grohs, D. Esquerre, O. Bouchez, M. N. Rossignol, C. Klopp, D. Rocha, S. Fritz, A. Eggen, P. J. Bowman, D. Coote, A. J. Chamberlain, C. Anderson, C. P. VanTassell, I. Hulsegge, M. E. Goddard, B. Guldbrandtsen, M. S. Lund, R. F. Veerkamp, D. A. Boichard, R. Fries and B. J. Hayes, 2014. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nature Genetics 46(8): 858-865.

Davis, S. R., R. J. Spelman and M. D. Littlejohn, 2017. BREEDING AND GENETICS SYMPOSIUM: Breeding heat tolerant dairy cattle: the case for introgression of the "slick" prolactin receptor variant into dairy breeds. Journal of Animal Science 95(4): 1788-1800.

Deschamps, M., G. Laval, M. Fagny, Y. Itan, L. Abel, J. L. Casanova, E. Patin and L. Quintana-Murci, 2016. Genomic Signatures of Selective Pressures and Introgression from Archaic Hominins at Human Innate Immunity Genes. American Journal of Human Genetics 98(1): 5-21.

Eberlein, A., A. Takasuga, K. Setoguchi, R. Pfuhl, K. Flisikowski, R. Fries, N. Klopp, R. Furbass, R. Weikard and C. Kuhn, 2009. Dissection of Genetic Factors Modulating Fetal Growth in Cattle Indicates a Substantial Role of the Non-SMC Condensin I Complex, Subunit G (NCAPG) Gene. Genetics 183(3): 951-964.

Fang, M., W. X. Fu, D. Jiang, Q. Zhang, D. X. Sun, X. D. Ding and J. F. Liu, 2014. A Multiple-SNP Approach for Genome-Wide Association Study of Milk Production Traits in Chinese Holstein Cattle. PLoS One 9(8): e99544.

Figueiro, H. V., G. Li, F. J. Trindade, J. Assis, F. Pais, G. Fernandes, S. H. D. Santos, G. M. Hughes, A. Komissarov, A. Antunes, C. S. Trinca, M. R. Rodrigues, T. Linderoth, K. Bi, L. Silveira, F. C. C. Azevedo, D. Kantek, E. Ramalho, R. A. Brassaloti, P. M. S. Villela, A. L. V. Nunes, R. H. F. Teixeira, R. G. Morato, D. Loska, P. Saragueta, T. Gabaldon, E. C. Teeling, S. J. O'Brien, R. Nielsen, L. L. Coutinho, G. Oliveira, W. J. Murphy and E. Eizirik, 2017. Genome-wide signatures of complex introgression and adaptive evolution in the big cats. Science Advances 3(7): e1700299.

Fontanesi, L., D. G. Calo, G. Galimberti, R. Negrini, R. Marino, A. Nardone, P. Ajmone-Marsan and V. Russo, 2014. A candidate gene association study for nine economically important traits in Italian Holstein cattle. Animal Genetics 45(4): 576-580.

Frischknecht, M., B. Bapst, F. R. Seefried, H. Signer-Hasler, D. Garrick, C. Stricker, R. Fries, I. Russ, J. Solkner, A. Bieber, M. G. Strillacci, B. Gredler-Grandl, C. Flury and I. Consortium, 2017. Genome-wide association studies of fertility and calving traits in Brown Swiss cattle using imputed whole-genome sequences. BMC Genomics 18(1): 910.

Gaddis, K. L. P., D. J. Null and J. B. Cole, 2016. Explorations in genome-wide association studies and network 674

analyses with dairy cattle fertility traits. Journal of Dairy Science 99(8): 6420-6435. 675

Galov, A., E. Fabbri, R. Caniglia, H. Arbanasic, S. Lapalombella, T. Florijancic, I. Boskovic, M. Galaverni and E. 676

Randi, 2015. First evidence of hybridization between golden jackal (Canis aureus) and domestic dog (Canis 677

familiaris) as revealed by genetic markers. Royal Society Open Science 2(12): 150450. 678

Gautier, M. and R. Vitalis, 2012. rehh: an R package to detect footprints of selection in genome-wide SNP data 679

from haplotype structure. Bioinformatics 28(8): 1176-1177. 680

Grun, B. and F. Leisch, 2008. FlexMix Version 2: Finite Mixtures with Concomitant Variables and Varying and 681

Constant Parameters. Journal of Statistical Software 28(4): 1-35. 682

Guan, Y. T., 2014. Detecting Structure of Haplotypes and Local Ancestry. Genetics 196(3): 625-642. 683

Hartwig, S., R. Wellmann, R. Emmerling, H. Hamann and J. Bennewitz, 2015. Short communication: Importance 684

of introgression for milk traits in the German Vorderwald and Hinterwald cattle. Journal of Dairy Science 685

98(3): 2033-2038. 686

Hasenkamp, N., T. Solomon and D. Tautz, 2015. Selective sweeps versus introgression - population genetic 687

dynamics of the murine leukemia virus receptor Xpr1 in wild populations of the house mouse (Mus 688

musculus). BMC Evolutionary Biology 15(1):248. 689

Hering, D. M., K. Olenski and S. Kaminski, 2014. Genome-wide association study for poor sperm motility in 690

Holstein-Friesian bulls. Animal Reproduction Science 146(3-4): 89-97. 691

Hoglund, J. K., B. Buitenhuis, B. Guldbrandtsen, M. S. Lund and G. Sahana, 2015. Genome-wide association 692

study for female fertility in Nordic Red cattle. BMC Genetics 16(1):110. 693

Hope, A. C. A., 1968. A Simplified Monte Carlo Significance Test Procedure. Journal of the Royal Statistical 694

Society Series B-Statistical Methodology 30(3): 582-598. 695

Hu, Z. L., C. A. Park and J. M. Reecy, 2016. Developmental progress and current status of the Animal QTLdb. 696

Nucleic Acids Research 44(D1): D827-D833. 697

Jagoda, E., D. J. Lawson, J. D. Wall, D. Lambert, C. Muller, M. Westaway, M. Leavesley, T. D. Capellini, M. 698

Mirazon Lahr, P. Gerbault, M. G. Thomas, A. B. Migliano, E. Willerslev, M. Metspalu and L. Pagani, 2017. 699

Disentangling Immediate Adaptive Introgression from Selection on Standing Introgressed Variation in 700

Humans. Molecular Biology Evolution 35(3):623-630. 701

Kantanen, J., I. Olsaker, L. E. Holm, S. Lien, J. Vilkki, K. Brusgaard, E. Eythorsdottir, B. Danell and S. Adalsteinsson, 702

2000. Genetic diversity and population structure of 20 North European cattle breeds. Journal of Heredity 703

91(6): 446-457. 704

Keele, J. W., L. A. Kuehn, T. G. McDaneld, R. G. Tait, S. A. Jones, T. P. L. Smith, S. D. Shackelford, D. A. King, T. L. Wheeler, A. K. Lindholm-Perry and A. K. McNeel, 2015. Genomewide association study of lung lesions in cattle using sample pooling. Journal of Animal Science 93(3): 956-964.

Kinsella, R. J., A. Kahari, S. Haider, J. Zamora, G. Proctor, G. Spudich, J. Almeida-King, D. Staines, P. Derwent, A. Kerhornou, P. Kersey and P. Flicek, 2011. Ensembl BioMarts: a hub for data retrieval across taxonomic space. Database-the Journal of Biological Databases and Curation.

Kolbehdari, D., Z. Wang, J. R. Grant, B. Murdoch, A. Prasad, Z. Xiu, E. Marques, P. Stothard and S. S. Moore, 2008. A whole-genome scan to map quantitative trait loci for conformation and functional traits in Canadian Holstein Bulls. Journal of Dairy Science 91(7): 2844-2856.

Li, H. and R. Durbin, 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14): 1754-1760.


Li, H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin and 1000 Genome Project Data Processing Subgroup, 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16): 2078-2079.

Lindholm-Perry, A. K., A. K. Sexten, L. A. Kuehn, T. P. L. Smith, D. A. King, S. D. Shackelford, T. L. Wheeler, C. L. Ferrell, T. G. Jenkins, W. M. Snelling and H. C. Freetly, 2011. Association, effects and validation of polymorphisms within the NCAPG-LCORL locus located on BTA6 with feed intake, gain, meat and carcass traits in beef cattle. BMC Genetics 12(1):103.

Magee, D. A., K. M. Sikora, E. W. Berkowicz, D. P. Berry, D. J. Howard, M. P. Mullen, R. D. Evans, C. Spillane and D. E. MacHugh, 2010. DNA sequence polymorphisms in a panel of eight candidate bovine imprinted genes and their association with performance traits in Irish Holstein-Friesian cattle. BMC Genetics 11(1):93.

Mao, X., N. K. Kadri, J. R. Thomasen, D. J. De Koning, G. Sahana and B. Guldbrandtsen, 2016. Fine mapping of a calving QTL on Bos taurus autosome 18 in Holstein cattle. Journal of Animal Breeding and Genetics 133(3): 207-218.

McKenna, A., M. Hanna, E. Banks, A. Sivachenko, K. Cibulskis, A. Kernytsky, K. Garimella, D. Altshuler, S. Gabriel, M. Daly and M. A. DePristo, 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research 20: 1297-1303.

Merks, J. W. M., P. K. Mathur and E. F. Knol, 2012. New phenotypes for new breeding goals in pigs. Animal 6(4): 535-543.

Michenet, A., R. Saintilan, E. Venot and F. Phocas, 2016. Insights into the genetic variation of maternal behavior and suckling performance of continental beef cows. Genetics Selection Evolution 48(1):45.

Pickrell, J. K. and J. K. Pritchard, 2012. Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data. PLoS Genetics 8(11): e1002967.

Pimentel, E. C. G., S. Bauersachs, M. Tietze, H. Simianer, J. Tetens, G. Thaller, F. Reinhardt, E. Wolf and S. Konig, 2011. Exploration of relationships between production and fertility traits in dairy cattle via association studies of SNPs within candidate genes derived by expression profiling. Animal Genetics 42(3): 251-262.

Prakash, V., T. K. Bhattacharya, B. Jyotsana and O. P. Pandey, 2011. Molecular Cloning, Characterization, Polymorphism, and Association Study of the Interleukin-2 Gene in Indian Crossbred Cattle. Biochemical Genetics 49(9-10): 638-644.

Sabeti, P. C., D. E. Reich, J. M. Higgins, H. Z. P. Levine, D. J. Richter, S. F. Schaffner, S. B. Gabriel, J. V. Platko, N. J. Patterson, G. J. McDonald, H. C. Ackerman, S. J. Campbell, D. Altshuler, R. Cooper, D. Kwiatkowski, R. Ward and E. S. Lander, 2002. Detecting recent positive selection in the human genome from haplotype structure. Nature 419(6909): 832-837.

Sahana, G., J. K. Hoglund, B. Guldbrandtsen and M. S. Lund, 2015. Loci associated with adult stature also affect calf birth survival in cattle. BMC Genetics 16(1):47.

Sasago, N., T. Abe, H. Sakuma, T. Kojima and Y. Uemoto, 2017. Genome-wide association study for carcass traits, fatty acid composition, chemical composition, sugar, and the effects of related candidate genes in Japanese Black cattle. Animal Science Journal 88(1): 33-44.

Setoguchi, K., T. Watanabe, R. Weikard, E. Albrecht, C. Kuhn, A. Kinoshita, Y. Sugimoto and A. Takasuga, 2011. The SNP c.1326T > G in the non-SMC condensin I complex, subunit G (NCAPG) gene encoding a p.Ile442Met variant is associated with an increase in body frame size at puberty in cattle. Animal Genetics 42(6): 650-655.

Sherry, S. T., M. H. Ward, M. Kholodov, J. Baker, L. Phan, E. M. Smigielski and K. Sirotkin, 2001. dbSNP: the NCBI database of genetic variation. Nucleic Acids Research 29(1): 308-311.

Sun, C., C. Southard, D. B. Witonsky, R. Kittler and A. Di Rienzo, 2010. Allele-Specific Down-Regulation of RPTOR Expression Induced by Retinoids Contributes to Climate Adaptations. PLoS Genetics 6(10):e1001178.


Saatchi, M., R. D. Schnabel, J. F. Taylor and D. J. Garrick, 2014. Large-effect pleiotropic or closely linked QTL segregate within and across ten US cattle breeds. BMC Genomics 15(1):442.

Voight, B. F., S. Kudaravalli, X. Q. Wen and J. K. Pritchard, 2006. A map of recent positive selection in the human genome. PLoS Biology 4(3): 446-458.

vonHoldt, B. M., R. Kays, J. P. Pollinger and R. K. Wayne, 2016. Admixture mapping identifies introgressed genomic regions in North American canids. Molecular Ecology 25(11): 2443-2453.

Weir, B. S. and C. C. Cockerham, 1984. Estimating F-Statistics for the Analysis of Population-Structure. Evolution 38(6): 1358-1370.

Weng, Z. Q., M. Saatchi, R. D. Schnabel, J. F. Taylor and D. J. Garrick, 2014. Recombination locations and rates in beef cattle assessed from parent-offspring pairs. Genetics Selection Evolution 46(1):34.

White, S., 2011. From Globalized Pig Breeds to Capitalist Pigs: A Study in Animal Cultures and Evolutionary History. Environmental History 16(1): 94-120.

Williams, R. L., 2000. A note on robust variance estimation for cluster-correlated data. Biometrics 56(2): 645-646.

Wu, X. P., B. Guldbrandtsen, U. S. Nielsen, M. S. Lund and G. Sahana, 2017. Association analysis for young stock survival index with imputed whole-genome sequence variants in Nordic Holstein cattle. Journal of Dairy Science 100(8): 6356-6370.

Xia, J. W., H. Z. Fan, T. P. Chang, L. Y. Xu, W. G. Zhang, Y. X. Song, B. Zhu, L. P. Zhang, X. Gao, Y. Chen, J. Y. Li and H. J. Gao, 2017. Searching for new loci and candidate genes for economically important traits through gene-based association analysis of Simmental cattle. Scientific Reports 7:42048.

Yang, J. A., S. H. Lee, M. E. Goddard and P. M. Visscher, 2011. GCTA: A Tool for Genome-wide Complex Trait Analysis. American Journal of Human Genetics 88(1): 76-82.

Zhang, Q. Q., M. P. L. Calus, B. Guldbrandtsen, M. S. Lund and G. Sahana, 2015. Estimation of inbreeding using pedigree, 50k SNP chip genotypes and full sequence data in three cattle breeds. BMC Genetics 16(1):88.

Zhang, Q. Q., B. Guldbrandtsen, M. Bosse, M. S. Lund and G. Sahana, 2015. Runs of homozygosity and distribution of functional variants in the cattle genome. BMC Genomics 16(1):542.

Zheng, X., Z. H. Ju, J. Wang, Q. L. Li, J. M. Huang, A. W. Zhang, J. F. Zhong and C. F. Wang, 2011. Single nucleotide polymorphisms, haplotypes and combined genotypes of LAP3 gene in bovine and their association with milk production traits. Molecular Biology Reports 38(6): 4053-4061.

Zimin, A. V., A. L. Delcher, L. Florea, D. R. Kelley, M. C. Schatz, D. Puiu, F. Hanrahan, G. Pertea, C. P. Van Tassell, T. S. Sonstegard, G. Marcais, M. Roberts, P. Subramanian, J. A. Yorke and S. L. Salzberg, 2009. A whole-genome assembly of the domestic cow, Bos taurus. Genome Biology 10(4):R42.
