Article

Human Genetic Data Reveal Contrasting Demographic Patterns between Sedentary and Nomadic Populations That Predate the Emergence of Farming

Carla Aimé,*,1 Guillaume Laval,2 Etienne Patin,2 Paul Verdu,1 Laure Ségurel,3 Raphaëlle Chaix,1 Tatyana Hegay,4 Lluis Quintana-Murci,2 Evelyne Heyer,1 and Frédéric Austerlitz1

1 Laboratoire Eco-Anthropologie et Ethnobiologie, UMR 7206, Muséum National d'Histoire Naturelle, Centre National de la Recherche Scientifique, Université Paris 7 Diderot, Paris, France
2 Unit of Human Evolutionary Genetics, Institut Pasteur, CNRS URA3012, Paris, France
3 Department of Human Genetics, University of Chicago
4 Academy of Sciences, Institute of Immunology, Tashkent, Uzbekistan

*Corresponding author: E-mail: aime@mnhn.fr.

Associate editor: John Novembre

© The Author 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

Mol. Biol. Evol. doi:10.1093/molbev/mst156. Advance Access publication September 24, 2013. MBE Advance Access published October 11, 2013. Downloaded from http://mbe.oxfordjournals.org/ at Museum National d'Histoire Naturelle - Bibliothèque Centrale on October 14, 2013.

Abstract

Demographic changes are known to leave footprints on genetic polymorphism. Together with the increased availability of large polymorphism data sets, coalescent-based methods allow inferring the past demography of populations from their present-day patterns of genetic diversity. Here, we analyzed both nuclear (20 noncoding regions) and mitochondrial (HVS-I) resequencing data to infer the demographic history of 66 African and Eurasian human populations presenting contrasting lifestyles (nomadic hunter-gatherers, nomadic herders, and sedentary farmers). This allowed us to investigate the relationship between lifestyle and demography and to address the long-standing debate about the chronology of demographic expansions and the Neolithic transition. In Africa, we inferred expansion events for farmers, but constant population sizes or contraction events for hunter-gatherers. In Eurasia, we inferred higher expansion rates for farmers than herders with HVS-I data, except in Central Asia and Korea. Although isolation and admixture processes could have impacted our demographic inferences, these processes alone seem unlikely to explain the contrasted demographic histories inferred in populations with different lifestyles. The small expansion rates or constant population sizes inferred for herders and hunter-gatherers may thus result from constraints linked to nomadism. However, autosomal data revealed contraction events for two sedentary populations in Eurasia, which may be caused by founder effects. Finally, the inferred expansions likely predated the emergence of agriculture and herding. This suggests that human populations could have started to expand in Paleolithic times, and that strong Paleolithic expansions in some populations may have ultimately favored their shift toward agriculture during the Neolithic.

Key words: population genetics, inferences, coalescent, neolithic transition, expansions.

Introduction

Studying the current distribution of genetic diversity in human populations has important implications for our understanding of the evolution and history of our species. Indeed, within- and among-population genetic diversity has been shaped both by demographic forces, such as gene flow and genetic drift, and by selective processes (e.g., Balaresque et al. 2007). Cultural factors like social organization and technological innovation have also had a considerable indirect impact on patterns of genetic diversity, as they can influence both the demographic and adaptive history (e.g., Ambrose 2001; Oota et al. 2001; Kumar et al. 2006; Heyer et al. 2012).

The Neolithic revolution is thought to be one of the most important cultural and technological transitions in human history. During this period, different human populations domesticated plants and animals in several parts of the world, including Central Africa, the Middle Eastern Fertile Crescent, Eastern Asia, and Central America (Bocquet-Appel and Bar-Yosef 2008). The emergence of farming occurred concomitantly with the sedentarization of most nomadic hunter-gatherer populations. Other populations remained nomadic, but some of them also developed new means of subsistence like nomadic herding. According to some archeologists and paleoanthropologists, the major human expansions would have started as a result of the Neolithic transition: sedentarized populations could have experienced strong demographic expansions (e.g., Bocquet-Appel 2011), whereas nomadic populations may have remained constant because of inherent constraints of their lifestyle (e.g., a longer inter-birth interval; Short 1982). However, a number of population genetic studies have reported evidence for more ancient expansion processes in many African and Eurasian populations, starting during the Paleolithic period (e.g., Chaix et al. 2008; Atkinson et al. 2009; Laval et al. 2010; Batini et al. 2011). These findings seem consistent with the "demographic theory" proposed by Sauer (1952), according to which human populations could have started to increase


before the Neolithic, and these Paleolithic expansions in some populations may have ultimately favored their shift toward farming.

Recent developments in sequencing technologies and bioinformatics tools have allowed the exploration of large multilocus polymorphism data sets. In combination with archeological and paleoanthropological records, such data can substantially improve our ability to infer past demographic events (Beaumont 2004). Stemming from Kingman's (1982) coalescent theory, numerical coalescent-based methods have thus been developed, allowing the inference of demographic parameters from molecular data. Most of these methods assume a specific demographic model. Moreover, nonparametric approaches such as Extended Bayesian Skyline Plots (EBSPs; Heled and Drummond 2010) allow inference of the demographic history of populations without assuming a specific model, by using the time intervals between serial coalescent events (see Excoffier and Heckel 2006 and Ho and Shapiro 2011 for reviews).
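As a rough illustration of the coalescent intervals that such methods exploit, the sketch below computes the expected waiting times between successive coalescent events under Kingman's standard coalescent. It assumes a constant-size diploid population of effective size n_e, which is a simplification introduced here and not one of the models fitted in this study; the function names and example values are hypothetical.

```python
def expected_interval_times(n_samples, n_e):
    """Expected duration (in generations) of each coalescent interval.

    Assumes Kingman's coalescent in a constant-size diploid population:
    while k lineages remain, the waiting time to the next coalescence is
    exponential with mean 4 * n_e / (k * (k - 1)) generations.
    """
    return {k: 4.0 * n_e / (k * (k - 1)) for k in range(n_samples, 1, -1)}


def expected_tmrca(n_samples, n_e):
    """Expected time to the most recent common ancestor, in generations."""
    return sum(expected_interval_times(n_samples, n_e).values())


if __name__ == "__main__":
    # Hypothetical values: 10 sampled gene copies, effective size 1,000.
    for k, t in expected_interval_times(10, 1000).items():
        print(f"{k} lineages: {t:.1f} generations")
    print(f"Expected TMRCA: {expected_tmrca(10, 1000):.1f} generations")
```

Intervals are short while many lineages remain and long near the root. A population expansion compresses the older intervals relative to this constant-size expectation, which is the kind of departure that skyline approaches use to reconstruct changes in population size.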

Here, we used these methods to investigate 1) the relationship between lifestyle (i.e., sedentary farming, nomadic herding, or nomadic hunting-gathering) and demographic patterns in a large set of African and Eurasian populations, and 2) the chronology of demographic expansions and the emergence of farming, by comparing inferred expansion onset times with the dating of the most ancient archeological traces of farming and herding (potteries, irrigation structures, and animal bones) reported in Bocquet-Appel and Bar-Yosef (2008) for each region. In addition, by computing FST values and immigration rates, we investigated the extent to which the inferred demographic patterns could be explained by spatial expansion processes. Indeed, modeling studies (Ray et al. 2003; Excoffier 2004) have shown that such processes can produce signals on within-population diversity patterns similar to those obtained with pure demographic expansions. In particular, these studies argue that ancient spatial expansion signals could be attenuated or suppressed in isolated populations. Different expansion signals among populations, as inferred from genetic data, may thus in part reflect variation in immigration rates and in the extent of population isolation.

We used 20 a priori neutral autosomal regions and the hypervariable control region (HVS-I) of the mitochondrial DNA (mtDNA), sequenced in 404 individuals from 16 populations and 2,429 individuals from 61 populations, respectively (supplementary table S1, Supplementary Material online). Given their distinct properties and modes of transmission, we compared the inferences obtained with these two types of markers in order to gain complementary insights into the past demography of the studied populations. By studying many populations from different geographic areas worldwide, we were able to determine which patterns were observed across all populations and which were specific to a given geographical region. First, we focused on Central Africa, where nomadic hunter-gatherer populations, commonly called Pygmies, coexist with sedentary farmer populations. These two groups are genetically differentiated and seem to have diverged about 60,000 years ago (Patin et al. 2009; Verdu et al. 2009), thus long before the Neolithic sedentarization of farmer populations in this area (5,000–4,000 years before present [YBP]; Bocquet-Appel and Bar-Yosef 2008). Second, we analyzed a sample of populations from several distant geographical regions of Eurasia, where sedentary farmers coexist with nomadic herders. This was of particular interest as, to our knowledge, the differences in demographic processes between herders and farmers have not been studied yet. Third, we performed a more detailed study in Central Asia, another area of interest, as it is thought to have been a major corridor during the successive Eurasian migration waves (Nei and Roychoudhury 1993).

Results

Neutrality Tests

Focusing first on Africa, all farmer populations showed at least one significantly negative value for one of the four neutrality tests (Tajima's [1989] D, Fu and Li's [1993] D and F, and Fu's [1997] Fs; table 1), which can be interpreted as a signal of expansion. Conversely, hunter-gatherer populations showed no such expansion signals. Aka and Mbuti hunter-gatherers presented at least one significantly positive test, indicating a possible contraction event. Similarly, for HVS-I sequences, we found significantly negative Fu's Fs values for all farmer populations except the Ewondo, but no expansion signal for hunter-gatherers (supplementary table S2, Supplementary Material online). Kola hunter-gatherers showed a significantly positive Tajima's D, indicating a possible contraction event.
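To make the sign convention of these tests concrete, the following minimal sketch computes Tajima's D from a set of aligned sequences using the standard formula from Tajima (1989). It is an illustration only and not the software used for the analyses; the function name is hypothetical, and the sketch assumes equal-length sequences without gaps or missing data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's (1989) D for a list of aligned, equal-length sequences."""
    n = len(seqs)
    if n < 4:
        # The variance term of D vanishes for fewer than 4 sequences.
        raise ValueError("need at least 4 sequences")
    # S: number of segregating (polymorphic) sites.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("D is undefined without segregating sites")
    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Toy data: only singleton variants (excess of rare alleles).
    rare = ["TAAAA", "ATAAA", "AATAA", "AAATA", "AAAAT", "AAAAA"]
    # Toy data: two divergent haplotypes at intermediate frequency.
    balanced = ["TTTTT"] * 3 + ["AAAAA"] * 3
    print(tajimas_d(rare), tajimas_d(balanced))
```

In the toy data, the sample dominated by singletons yields a negative D (the pattern read as an expansion signal), whereas the sample with two intermediate-frequency haplotypes yields a positive D (the pattern read as a possible contraction).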

Similar analyses on autosomal sequences in Europe and East Asia revealed no significant expansion signals in either sedentary or nomadic populations (table 1). We even observed contraction signals in two sedentary populations, one East-Asian and one European. Indeed, we found significantly positive values for two neutrality tests for the Japanese and three neutrality tests for the Danes. Conversely, for HVS-I sequences from Eurasia (supplementary table S2, Supplementary Material online), we obtained significant signals of expansion for at least one test (Fu's Fs) for all populations (including Japanese and Danes). All sedentary populations except Koreans also showed significant signals of expansion for the three other tests, whereas the Koreans and all nomadic populations showed a significant expansion signal only for Fu's Fs.

Focusing on Central Asia, no neutrality test was significant for the autosomes in either Tajik sedentary farmers (TAB) or Kyrgyz nomadic herders (KIB) (table 1). Conversely, for HVS-I sequences (supplementary table S2, Supplementary Material online), all farmers and herders presented a significant expansion signal for at least one test, except one farmer population (TDS).
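The significance levels of the neutrality tests in table 1 are reported after FDR correction for multiple testing (Benjamini and Hochberg 1995). A minimal sketch of that step-up procedure is given below; the function name and the example P values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    Step-up procedure: sort the m P values, find the largest rank k such
    that p_(k) <= (k / m) * alpha, and reject the k smallest P values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical P values for four tests in one population.
    print(benjamini_hochberg([0.001, 0.04, 0.03, 0.2]))
```

Because the procedure is step-up, a P value that fails its own threshold can still be rejected when a larger P value passes a later threshold.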

Coalescent-Based Inferences of Demographic History

Africa: Pre-Neolithic Demographic Expansions in Sedentary Farmer Populations

Considering first the autosomal data, models consistent with an increase in population size best fitted the data for all African farmer populations (supplementary table S3, Supplementary Material online). The "expansion model" best fitted the data for the two East-African farmer populations (namely, Chagga and Mozambicans), whereas the "exponential model" best fitted the data for all West-African farmer populations (Akele, Ngumba, and Yoruba), with positive growth rates in all cases (supplementary table S4, Supplementary Material online). Conversely, no signals of expansion were found for hunter-gatherer populations, as the "constant model" always best fitted the data (supplementary tables S3 and S4, Supplementary Material online). Consistently, EBSPs showed signals of expansion for farmer populations (fig. 1A): the 95% highest probability density (HPD) intervals for the estimated number of demographic changes did not include 0, indicating at least one significant change in population size (supplementary table S5, Supplementary Material online). Conversely, we found no evidence of population size changes for hunter-gatherers (fig. 1B and supplementary table S5, Supplementary Material online). We further dated the onset of farmer expansions from at least 62,275 YBP (assuming μ = 2.5 × 10⁻⁸/generation/site) or 124,550 YBP (assuming μ = 1.2 × 10⁻⁸/generation/site) for Mozambicans to 7,975 or 15,950 YBP for Yoruba. Visual examination of the 95% HPD intervals showed that the expansion event inferred for the Mozambican population was significantly older than those inferred for the other populations (supplementary table S6, Supplementary Material online).

We found similar results for the HVS-I sequences from Central Africa (supplementary tables S3 and S7, Supplementary Material online). Indeed, the exponential model with positive growth rates best fitted the data for all farmer populations, indicating expansion events. Conversely, the exponential model with negative modal values for growth rate (i.e., contraction event) provided the best fit for all hunter-gatherer populations. However, as the 95% HPD intervals for growth rates included 0, we could not conclude any significant contraction events for these populations. Similarly, EBSPs indicated a significant expansion event for all farmer populations (fig. 2A and supplementary table S5, Supplementary Material online), whereas we found no evidence of population size changes for hunter-gatherers (fig. 2B and supplementary table S5, Supplementary Material online). We dated farmer population expansions from 31,350 or 62,700 YBP (assuming μ = 10⁻⁵ or 5 × 10⁻⁶/generation/site, respectively) to 45,319 or 90,638 YBP (supplementary table S6, Supplementary Material online).

Finally, both with autosomes and HVS-I, all hunter-gatherer populations had lower current effective population size (N0) values than farmer populations (supplementary tables S4 and S7, Supplementary Material online). Furthermore, the inferred expansion onsets for all farmer populations largely predated the emergence of farming in Central Africa (5,000–4,000 YBP; Bocquet-Appel and Bar-Yosef 2008) (figs. 3 and 4).
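Each expansion onset above is reported under two assumed mutation rates. The short sketch below shows how such paired dates relate to each other, under the assumption (made here for illustration) that an inferred date scales inversely with the assumed mutation rate and that generations are converted to years with the 25-year generation time used in this study. The function names are hypothetical.

```python
def generations_to_years(generations, generation_time=25.0):
    """Convert a time in generations to years (25 years per generation)."""
    return generations * generation_time


def rescale_date(date_ybp, assumed_rate, new_rate):
    """Rescale a date inferred under one mutation rate to another rate.

    Assumes the date is inversely proportional to the mutation rate,
    so halving the rate doubles the inferred age.
    """
    return date_ybp * assumed_rate / new_rate


if __name__ == "__main__":
    # HVS-I example from the text: 31,350 YBP under 1e-5 per generation
    # per site, rescaled to the slower rate of 5e-6.
    print(round(rescale_date(31350, 1e-5, 5e-6)))
    # A date of 1,254 generations expressed in years.
    print(generations_to_years(1254))
```

With the two HVS-I rates, which differ by a factor of two, the rescaled date is about 62,700 YBP, matching the second value reported for that expansion. Dates obtained with the faster rate are therefore the more recent bound of each pair.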

Eurasia: Contrasting Demographic Patterns for Farmer Populations with Autosomes and Stronger Pre-Neolithic Expansions for Farmers Than Herders with HVS-I

The coalescent-based analyses of autosomes in East-Asian and European populations showed contrasting demographic patterns across sedentary populations (supplementary tables S3 and S4, Supplementary Material online). Using the parametric BEAST analysis, the expansion model best fitted the data for Han Chinese, indicating an expansion event. Conversely, we inferred that Japanese and Danes either underwent a contraction event or remained at constant size. Indeed, the exponential model with negative growth rates best fitted the data for these two populations, but the

Table 1. Summary Statistics and Neutrality Tests Computed from the Whole Autosomal Sequences.

Population     Area     Lifestyle          S (a)   K (b)   Tajima's D (c)   Fu & Li's D (c)   Fu & Li's F (c)   Fu's Fs (c)
Akele          Africa   Sedentary farmers  6.95    6.45    0.35             0.55              0.57              1.12
Chagga         Africa   Sedentary farmers  8.65    7.95    0.48             0.70              0.74              1.27
Mozambicans    Africa   Sedentary farmers  8.80    9.55    0.62             1.15              1.15              3.33
Ngumba         Africa   Sedentary farmers  7.05    6.20    0.20             0.41              0.41              0.68
Yoruba         Africa   Sedentary farmers  7.50    7.15    0.14             0.03              0.03              0.73
Aka            Africa   Nomadic HG (d)     6.95    6.60    0.12             0.34              0.32              0.30
G. Baka        Africa   Nomadic HG         6.30    6.00    0.008            0.17              0.14              0.33
S. Baka        Africa   Nomadic HG         6.10    5.6     0.17             0.05              0.10              0.03
Kola           Africa   Nomadic HG         6.55    6.25    0.14             0.03              0.08              0.75
Mbuti          Africa   Nomadic HG         6.60    6.10    0.25             0.35              0.37              0.16
Danes          Eurasia  Sedentary farmers  5.50    4.85    0.30             0.16              0.24              0.73
Han            Eurasia  Sedentary farmers  5.20    4.70    0.03             0.01              0.02              0.21
Japanese       Eurasia  Sedentary farmers  4.20    3.85    0.45             0.22              0.34              1.06
Chuvash        Eurasia  Nomadic herders    5.70    5.05    0.09             0.11              0.12              0.34
Tajiks (TAB)   C. Asia  Sedentary farmers  9.00    9.00    0.19             0.03              0.10              0.24
Kyrgyz (KIB)   C. Asia  Nomadic herders    10.4    10.40   0.11             0.08              0.11              0.23

NOTE. Values significantly higher than expected for a constant population size model are italicized, whereas significantly lower values are underlined.
(a) Number of polymorphisms.
(b) Number of haplotypes.
(c) We report the means over the 20 regions.
(d) HG = hunter-gatherers.
Significance levels: P < 0.05 and P < 0.01, after FDR correction for multiple testing (Benjamini and Hochberg 1995).


95% HPD intervals also included g = 0. The constant model best fitted the data for the Chuvash, a traditionally nomadic population. EBSPs showed a significant expansion event for the Han population: the value of 0 was not included in the 95% HPD interval of the number of demographic changes (supplementary table S5, Supplementary Material online). This expansion started at least 36,025 or 72,050 YBP (fig. 1C and supplementary table S6, Supplementary Material online), clearly predating the emergence of farming in East Asia about 9,000 YBP (Bocquet-Appel and Bar-Yosef 2008) (fig. 3). Japanese showed a significant contraction event (i.e., the value of 0 was not included in the 95% HPD interval of the number of demographic changes; supplementary table S5, Supplementary Material online), starting at least 21,350 or 42,700 YBP. Danes also showed a significant contraction event, starting at least 26,440 or 52,880 YBP (fig. 1C and supplementary tables S5 and S6, Supplementary Material online). EBSP analyses showed no significant demographic changes for the Chuvash (fig. 1D and supplementary table S5, Supplementary Material online).
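The criterion used throughout this section, namely whether a 95% HPD interval includes 0, can be illustrated with a small sketch that computes an HPD interval from posterior samples as the narrowest window containing the requested share of the samples. This is a generic illustration, not the BEAST implementation; the function names and the example samples are hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    # Number of consecutive sorted samples the interval must cover.
    k = min(n, max(1, math.ceil(mass * n)))
    # Slide a window of k samples and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def excludes_zero(samples, mass=0.95):
    """True if the HPD interval lies entirely above or below 0."""
    low, high = hpd_interval(samples, mass)
    return low > 0 or high < 0


if __name__ == "__main__":
    # Hypothetical posterior samples of a growth rate g.
    clearly_positive = [0.01 * i for i in range(1, 101)]
    centered_on_zero = [0.01 * i for i in range(-50, 50)]
    print(excludes_zero(clearly_positive), excludes_zero(centered_on_zero))
```

Under this criterion, a growth rate whose interval straddles 0 is compatible with a constant population size, which is why a best-fitting model with a negative modal growth rate is not by itself taken as evidence of contraction.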

FIG. 1. EBSPs inferred from autosomal sequences in African sedentary farmers (A), African nomadic hunter-gatherers (B), Eurasian sedentary farmers (C), and Eurasian nomadic herders (D). The values indicated in bold on the axes are obtained assuming a mutation rate of μ = 1.2 × 10⁻⁸/generation/site (measured from parents–children trios by Conrad et al. 2011), and the other values correspond to μ = 2.5 × 10⁻⁸/generation/site (derived from the human–chimpanzee sequence divergence by Pluzhnikov et al. 2002). Although time was expressed in generations for the analyses, we represent time in years here, assuming a generation time of 25 years. Time is represented backward on the x axis, from the present on the left to the most distant past on the right. The lower and upper bounds of the 95% HPD are represented by dashed lines. Populations for which the 95% HPD of the estimated number of demographic changes includes 0 (i.e., no significant signal of expansion or decline) are represented in light gray.


FIG. 2. EBSPs inferred from HVS-I sequences in African sedentary farmers (A), African nomadic hunter-gatherers (B), Eurasian sedentary farmers (C), Eurasian nomadic herders (D), Central Asian sedentary farmers (E), and Central Asian nomadic herders (F). The values indicated in bold are obtained assuming a mutation rate of μ = 5 × 10⁻⁶/generation/site (transitional changes rate; Forster et al. 1996), and the others correspond to μ = 10⁻⁵/generation/site (pedigree-based; Howell et al. 1996; Heyer et al. 2001). Time is represented in years, assuming a generation time of 25 years. It is represented backward on the x axis, from the present on the left to the most distant past on the right. The lower and upper bounds of the 95% HPD are represented by dashed lines. Populations for which the 95% HPD of the estimated number of demographic changes includes 0 (i.e., no significant signal of expansion or decline) are represented in light gray, and the others in black.


For the HVS-I sequences from Eurasia (supplementary tables S7 and S8, Supplementary Material online), the parametric BEAST analyses showed that models consistent with an increase in population size (expansion model or exponential model with positive growth rates) best fitted the data for all sedentary populations except Koreans. Conversely, the constant model best fitted the data for all nomadic populations, as well as Koreans. EBSPs showed, however, significant expansion events for both farmers and herders, but not for Koreans (fig. 2C and D and supplementary table S5, Supplementary Material online). Nevertheless, there was a tendency toward stronger expansion rates and higher Ne values in sedentary than in nomadic populations (fig. 2C and D), although the 95% HPD intervals for Ne were quite large for sedentary populations. The estimated expansion onset times inferred from the EBSPs (supplementary table S6, Supplementary Material online) followed an east-to-west gradient: they appeared more ancient in Eastern populations, in both sedentary and nomadic populations (supplementary fig. S1, Supplementary Material online). They also clearly predated the Neolithic transition in all geographic areas (fig. 4).
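To illustrate what the sign of the growth rate means in the exponential model, the sketch below evaluates population size backward in time. It assumes the common coalescent parameterization N(t) = N0 · exp(−g · t), with t counted in generations before the present and N0 the current effective size; this parameterization, the function name, and the example values are assumptions of the sketch rather than details given in the text.

```python
from math import exp


def population_size(t, n0, g):
    """Effective population size t generations before the present.

    Assumes N(t) = n0 * exp(-g * t): g > 0 means the population was
    smaller in the past (expansion), g < 0 means it was larger in the
    past (contraction), and g = 0 gives a constant size.
    """
    return n0 * exp(-g * t)


if __name__ == "__main__":
    # Hypothetical values: current size 10,000, 1,000 generations ago.
    for g in (0.001, 0.0, -0.001):
        print(g, round(population_size(1000, 10000, g)))
```

A model fit whose credible interval for g includes 0 cannot be distinguished from the constant model under this parameterization.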

The Central Asian Exception: Similar Demographic Patterns in Farmers and Herders

For autosomes, the constant model best fitted the data for both sedentary farmers (TAB) and traditionally nomadic herders (KIB) (supplementary tables S3 and S4, Supplementary Material online). EBSPs also showed no significant demographic changes for these populations (figs. 1C and D and supplementary table S5, Supplementary Material online).

For HVS-I (supplementary tables S3 and S7, Supplementary Material online), the exponential model best fitted the data for six of the 12 sedentary farmer populations (including TAB), whereas the constant model was preferred for the other farmers. Unlike the rest of Eurasia, a model indicating expansion (the exponential model with positive growth rates) was also selected for all nomadic herders. Moreover, EBSPs showed significant expansion signals for both herder and farmer populations, except TJY (Yagnobs from Dushanbe), since at least 13,860 YBP (or 27,720 YBP) for farmers and 16,546 YBP (or 33,092 YBP) for herders on average (fig. 2E and F and supplementary tables S5 and S6, Supplementary Material online). Again, these inferred expansion onsets predated the emergence of farming in the area about 8,000 YBP (Bocquet-Appel and Bar-Yosef 2008) (fig. 4). Inferred expansions for Central Asian sedentary farmers seemed overall weaker (i.e., lower growth rate and lower Ne) than those observed for other sedentary populations in Eurasia, although we observed important variations in growth rates and Ne among populations and large 95% HPD intervals for some of them (fig. 2C and E).

Degrees of Isolation and Migration Patterns

African farmer populations appeared less isolated and received more migrants than hunter-gatherer populations. Indeed, the population-specific FST values (supplementary table S9, Supplementary Material online) were on average significantly lower for farmers than for hunter-gatherers (mean[farmers] = 0.058, mean[HG] = 0.192, Wilcoxon two-sided test, P value = 0.0002). Moreover, the estimated number of immigrants was significantly higher for sedentary farmers than for nomadic hunter-gatherers (mean[farmers] = 314, mean[HG] = 221, P value = 0.0001) (supplementary table S10, Supplementary Material online). For hunter-gatherers, the FST values were negatively correlated with the negative growth rates that we inferred from the parametric method (ρ = −0.893, P value = 0.012) (fig. 5B), meaning that less isolated populations showed weaker contraction events (i.e., less negative growth rates). Conversely, there was no significant correlation between FST values and inferred growth rates for sedentary farmers (ρ = 0.433, P value = 0.249) (fig. 5A). However, we found a significant positive correlation between the number of immigrants and the inferred growth rates (supplementary fig. S2, Supplementary Material online) among sedentary farmer populations (ρ = 0.867, P value = 0.004), but not among nomadic hunter-gatherers (ρ = 0.536, P value = 0.235).
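The correlations between population-specific FST values and inferred growth rates can be illustrated with a rank correlation. The sketch below assumes that the reported coefficient is Spearman's rank correlation; the function names and the toy values are hypothetical and are not data from this study.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy values only (hypothetical): higher FST paired with more negative growth.
toy_fst = [0.05, 0.10, 0.15, 0.20, 0.25]
toy_growth = [-0.1, -0.3, -0.2, -0.5, -0.6]
print(spearman_rho(toy_fst, toy_growth))
```

A negative coefficient on such data corresponds to the pattern described for hunter-gatherers: the more isolated a population (higher FST), the more negative its inferred growth rate.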

For Eurasia, we found no significant difference in FST values between farmers and herders (mean[farmers] = 0.039, mean[herders] = 0.043, P value = 0.77; supplementary table S9, Supplementary Material online), except in Central Asia, for which we found significantly lower FST values for nomadic herders than for sedentary farmers (mean[farmers] = 0.018, mean[herders] = 0.008, P value = 0.017). We report a significant

FIG. 3. Comparison of estimated times for expansion onsets using autosomes and dating of the first archeological traces of farming in Africa and China. Time is represented backward (in YBP). Only populations for which the EBSP analysis showed a significant expansion event are represented. We reported the time values estimated with the highest mutation rate that we used for the autosomes (μ = 2.5 × 10⁻⁸/generation/site). Thus, these time values can be considered as a lower bound for the expansion onsets. The dates for the emergence of farming come from the review by Bocquet-Appel and Bar-Yosef (2008). They are based on archeological remains.


Aimé et al. . doi:10.1093/molbev/mst156 MBE

FIG. 4. Comparison of estimated times for expansion onsets using HVS-I and dating of the first archeological traces of farming or herding in Central Africa (A), Eurasia (B), and Central Asia (C). Time is represented backward (in YBP). Only populations for which the EBSP analysis showed a significant expansion event are represented. We reported the time values estimated with the highest mutation rate that we used for the HVS-I sequences (μ = 10⁻⁵/generation/site). Thus, these time values can be considered as a lower bound for the expansion onsets. The dates for the emergence of farming come from the review by Bocquet-Appel and Bar-Yosef (2008). They are based on archeological remains.


Demographic Patterns between Sedentary and Nomadic Populations . doi:10.1093/molbev/mst156 MBE

FIG. 5. Correlations between population-specific FST values and inferred growth rates in African farmer (A) and hunter-gatherer (B) populations, Eurasian farmer populations (C), and Central Asian farmer (D) and herder populations (E). Population-specific FST values were computed with ARLEQUIN v3.11 (Excoffier et al. 2005). The growth rates were inferred under the best-fitting model from the parametric method using BEAST (Drummond and Rambaut 2007). When the best-fitting model was the constant model, we assumed a growth rate of 0. Note that we did not represent Eurasian herder populations, as the constant model best fitted the data for all of them. Plots and correlation tests were performed using R v2.14.1 (R Development Core Team 2011).


negative correlation between FST values and inferred growth rates for sedentary farmers in Eurasia (ρ = −0.673, P value = 0.028) (fig. 5C) and Central Asia (ρ = −0.773, P value = 0.003) (fig. 5D), meaning that less isolated populations showed higher inferred growth rates. There was no significant correlation for Central Asian herders (ρ = −0.092, P value = 0.736) (fig. 5E). Note that this analysis could not be performed for the other Eurasian herder populations, as the constant model best fitted the data with the parametric method.

The estimation of the proportion of immigrants did not converge for 11 Eurasian populations (Han Chinese, Liaoning, Qingdao, Palestinians, Pathans, Mongols, as well as three Central Asian farmer populations and two Central Asian herder populations). Regarding the other populations, we found no significant difference in the proportion of immigrants between farmers and herders, both in Central Asia (mean[farmers] = 82868, mean[herders] = 260265, P value = 0.12) and in the rest of Eurasia (mean[farmers] = 576757, mean[herders] = 201311, P value = 0.51) (supplementary table S10, Supplementary Material online). We also found no significant correlation between this proportion and the inferred growth rates for Eurasian farmers (ρ = −0.238, P value = 0.48), Central Asian farmers (ρ = 0.386, P value = 0.30), and Central Asian herders (ρ = −0.42, P value = 0.139) (supplementary fig. S2, Supplementary Material online).

Discussion

In this study, using a large set of populations from distant geographic areas, we report contrasted demographic histories that correlate with lifestyle. Moreover, the inferred expansion signals in both African and Eurasian farmer and herder populations predated the Neolithic transition and the sedentarization of these populations.

Contrasted Demographic Histories in Sedentary and Nomadic Populations

For Africa, both mtDNA and autosomal data revealed expansion patterns in most sedentary farmer populations, as indicated by neutrality tests and the parametric and nonparametric BEAST methods. Conversely, we found constant effective population sizes (or possibly contraction events) for all hunter-gatherer populations. Among the farmers, results were least clear for the Yoruba and the Ewondo populations, as no neutrality test was significant for these populations, whereas they showed evidence of expansion events when analyzed with BEAST. This indicates that these populations may have undergone weaker expansion dynamics (i.e., lower growth rates and Ne) than the others. These results are of particular importance for the Yoruba, as it is a reference population in many databases (HapMap, 1000 Genomes). This also demonstrates the higher sensitivity of MCMC methods such as BEAST for detecting expansions in comparison with neutrality tests.

The contrasted patterns inferred between sedentary and nomadic populations in Africa suggest strong differences between the demographic histories of these two groups of populations. The question is whether this pattern results mostly from differences in local expansion dynamics or whether spatial expansion processes at a larger scale were also involved. As shown by Ray et al. (2003), negative values for the neutrality tests will be observed in a spatial expansion process if the rate of migrants (Nm) is high enough (at least 20), but not otherwise. As in previous studies (e.g., Verdu et al. 2013), we report a higher degree of isolation (higher population-specific FST values) in hunter-gatherer populations than in farmer populations. Using the spatial expansion model of Excoffier (2004) also leads to higher estimates of the number of immigrants into farmer populations. Thus, both farmers and hunter-gatherers may have been subject to a spatial expansion process, but the limited number of migrants among hunter-gatherers may have resulted in an absence of expansion signals for them. This would be consistent with the positive correlation that we observe between the growth rates estimated with BEAST and the inferred number of immigrants in the sedentary farmer populations. However, this spatial expansion process seems unlikely to completely explain the strong association that we observed between lifestyle and expansion patterns, as some farmer populations (Teke, Gabonese Fang) displayed FST values similar to those of hunter-gatherers but a clear signal of expansion with relatively high growth rates. This suggests that even rather isolated farmer populations show substantial levels of expansion. Moreover, FST values and inferred growth rates in farmer populations were not significantly correlated. Therefore, our results suggest that the expansion patterns observed in sedentary populations do not result only from a spatial expansion process. In addition, local dynamics connected with the higher capacity of food production by farmers also explain their much stronger expansion signatures relative to their neighboring hunter-gatherer populations.

For Eurasia, when considering the mtDNA data, all three methods (neutrality tests, parametric BEAST analyses, and EBSPs) yielded expansion signals for all sedentary farmer populations except Koreans. Conversely, only EBSPs and Fu's Fs test showed expansion signals for nomadic herders, but not the parametric BEAST method or the other neutrality tests. This result points toward weaker expansion dynamics in herders than in farmers, as also supported by the tendency for lower growth rates and Ne in herder populations than in farmer populations on the EBSP graphs (fig. 2C and D). It thus seems that the flexibility and nonparametric nature of EBSP analyses allow one to detect weaker expansion events than the parametric method. Moreover, Fu's Fs is known to be more sensitive than the other neutrality tests for detecting expansions (Ramos-Onsins and Rozas 2002). Again, these inferred expansions may result at least in part from spatial expansion processes. The population-specific FST values are indeed rather low in Eurasia. Moreover, we found a significant negative correlation between FST values and inferred growth rates for the sedentary farmers, indicating that less isolated populations showed stronger expansion signals. However, although we inferred much stronger expansion patterns for the farmers than for the herders, we did not observe any differences in Eurasia between the farmers and the herders in the population-specific FST values or in the estimated number of immigrants, suggesting that spatial processes alone cannot explain the strong difference that we observed between the expansion patterns of these two groups of populations. This indicates that the intrinsic demographic growth patterns are different between these two kinds of populations, the farmers showing much higher growth rates than the herders.

To our knowledge, although other studies have found different patterns between hunter-gatherers and farmers (e.g., Verdu et al. 2009), our study is the first to show differences between farmers and herders, the two major post-Neolithic human groups. A plausible explanation could be that nomadic herders and hunter-gatherers share several of the constraints of a nomadic way of life. For instance, birth intervals are generally longer (at least 4 years) in nomadic populations than in sedentary populations (e.g., Short 1982). According to Bocquet-Appel (2011), these longer birth intervals may be mainly determined by diet differences. Indeed, Valeggia and Ellison (2009) demonstrated that birth interval is mainly determined by the rapidity of postpartum energy recovery, which may be increased by the consumption of high-carbohydrate food (like cereals). Moreover, the nomadic herder way of life may offer less food security than sedentary farming, the latter facilitating efficient long-term food storage.

However, unlike in Africa, we did not find systematically consistent patterns between the autosomal and mtDNA data in Eurasia. The possible contraction events that our results suggest for two sedentary populations (Japanese and Danes) with autosomes appeared concomitant with historical events that could have led to bottleneck processes. For the Japanese population, this contraction signal could indeed result from a founder effect due to the Paleolithic colonization of Japan by a subset of the Northern Asiatic people (especially from Korea; Nei 1995). Similarly, a bottleneck process may also have occurred in the Danish population, linked with the last glacial maximum occurring between 26500 and 19500 YBP (Clark et al. 2009). Reasons why these processes impacted the autosomes but not the mtDNA data remain to be determined, for instance through simulation studies. In any case, our study clearly emphasizes the utility of combining mtDNA and autosomal sequences, as they allow access to different aspects of human history. A recent study on harbor porpoises has similarly shown that nuclear markers were sensitive to a recent contraction event, whereas mtDNA allowed inferring a more ancient expansion (Fontaine et al. 2012).

Interestingly, Central Asia displayed a distinct pattern from the rest of Eurasia. Indeed, we did not infer higher expansion rates for sedentary farmers than for nomadic herders in that area. This could result from harsh local environmental conditions due to the arid continental climate in this area. Indeed, using pollen records, Dirksen and van Geel (2004) showed that the paleoclimate in Central Asia was very arid from at least 12000 to 3000 YBP, which could have limited the amount of suitable areas for farming and impacted human demography. Spatial expansion processes may also have played a role in this difference, as population-specific FST values were higher for the farmers than for the herders. This may indicate that more migrants were involved in the spatial expansion process for the herders than for the farmers, yielding a weaker expansion signal (i.e., lower inferred growth rate) for the latter (Ray et al. 2003). This is supported by the negative correlation between the FST values and the inferred growth rates in the farmer populations. The Korean population also stood out as an exception in Eurasia. Even though it is a population of sedentary farmers, it showed no significant expansion signal with either the parametric or the nonparametric method with HVS-I. This could be explained by a later sedentarization of this population. The Korean Neolithic is notably defined by the introduction of Jeulmun ware ceramics about 8000 YBP, but the people of the Jeulmun period were still predominantly semi-nomadic fishers and hunter-gatherers until about 3000 YBP, when Koreans started an intensive crop production implying a sedentary lifestyle (Nelson 1993).

Inferred Expansion Signals Predate the Emergence of Farming

EBSP analyses revealed that the inferred expansion events in farmer and herder populations were more ancient than the emergence of farming and herding. Therefore, the differences in demographic patterns between farmers and herders seem to predate their divergence in lifestyle, which raises the question of the chronology of demographic expansions and the Neolithic transition. These findings appear to be quite robust to the choice of the scaling parameters. We used here both the lower and the higher mutation rate estimates in humans for autosomes (Pluzhnikov et al. 2002; Conrad et al. 2011) and for the HVS-I sequence (Forster et al. 1996; Howell et al. 1996). Despite this uncertainty in mutation rates, which leads to a 2-fold uncertainty in our time estimates, the inferred expansion signals predated the emergence of agriculture in both cases for all populations. Similarly, using a generation time of 29 years (Tremblay and Vézina 2000) instead of 25 years leads to slightly more ancient estimates and thus does not change our conclusions (data not shown). However, note that for HVS-I, using the higher bound of the credibility interval for the highest estimated mutation rate (2.75 × 10⁻⁵/generation/site; Heyer et al. 2001) instead of the mean value (i.e., 10⁻⁵/generation/site) leads to expansion time estimates consistent with the Neolithic transition in Eurasian populations (supplementary table S11, Supplementary Material online). Nevertheless, these estimates still clearly predated the Neolithic for the African populations. However, 10⁻⁵/generation/site is by far the highest estimate of the mutation rate in the literature (Howell et al. 1996). To infer Neolithic expansions in most Eurasian populations, one needs to assume a mutation rate of at least 2 × 10⁻⁵/generation/site, which is much higher than other estimates from the literature and thus probably unrealistic. Moreover, our method for determining the expansion onset time using the EBSP graph is very conservative and also tends to favor the lower bound of expansion onset times. Finally, for autosomes, similarly using 4.74 × 10⁻⁸/generation/site instead of 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002) leads to an inferred expansion onset time that is not compatible with the Neolithic transition for all Eurasian and African populations, except for one African population, the Yoruba (supplementary table S12, Supplementary Material online). Consequently, it seems very likely that the expansions inferred in this study correspond to Paleolithic rather than Neolithic demographic events, in agreement also with most previous studies, as detailed later.
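The dependence of the time estimates on the scaling parameters can be made concrete with a small calculation. The sketch below assumes that an onset time expressed in years scales inversely with the assumed mutation rate and proportionally with the generation time, which is the usual scaling when the underlying estimate is obtained in mutational units; the function name `rescale_onset` is hypothetical.

```python
def rescale_onset(t_years, mu_used, mu_new, gen_used=25.0, gen_new=25.0):
    """Rescale an onset time (in years) to another mutation rate and/or
    generation time, holding the estimate in mutational units fixed.

    Assumption: time in years = (time in mutational units / mu) * generation time.
    """
    return t_years * (mu_used / mu_new) * (gen_new / gen_used)


# Average onset reported for Central Asian farmers with mu = 1e-5 (13860 YBP),
# rescaled to the lower HVS-I rate of 5e-6: the estimate doubles.
print(rescale_onset(13860, 1e-5, 5e-6))

# Same estimate with a 29-year instead of a 25-year generation time.
print(rescale_onset(13860, 1e-5, 1e-5, gen_used=25.0, gen_new=29.0))

# Same estimate with the upper credibility bound of 2.75e-5.
print(rescale_onset(13860, 1e-5, 2.75e-5))
```

Under these assumptions, halving the mutation rate doubles the onset time, which corresponds to the paired values (e.g., 13860 and 27720 YBP) given in the Results, whereas a longer generation time shifts the estimates only moderately toward the past.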

In Africa, the emergence of agriculture has been dated between 5000 and 4000 YBP in the Western part of Central Africa, and it subsequently rapidly expanded to the rest of sub-Saharan Africa (Phillipson 1993). However, using HVS-I, we showed expansion events in farmer populations since about 30000 or 60000 YBP, thus largely predating the emergence of agriculture in the area. Similarly, using autosomes, especially in Eastern African populations, we inferred expansion signals that clearly predated the Neolithic. Notably, we inferred an expansion signal for Mozambicans since at least 80000 YBP. Several genetic studies have already highlighted that expansion events occurred in African farmers before the Neolithic transition (e.g., Atkinson et al. 2009; Laval et al. 2010; Batini et al. 2011). This finding is also consistent with paleoanthropological data (i.e., radiocarbon dating) suggesting an expansion event in Africa 60000–80000 YBP (Mellars 2006a). This Paleolithic demographic expansion could be linked to a rapid environmental change toward a drier climate (Partridge et al. 1997) and/or to the emergence of new hunting technologies (Mellars 2006a).

According to Mellars (2006a), this period corresponds to a major increase in the complexity of the technological, economic, social, and cognitive behavior of certain African groups. It corresponds in particular to the emergence of projectile technologies (Shea 2009), which was probably part of a broader pattern of ecological diversification of early Homo sapiens populations. These changes could have been decisive for the human spread "Out of Africa" during the same period and could have ultimately also led to the sedentarization of the remaining populations. This inference is consistent with Sauer's (1952) demographic theory, which stated that late Paleolithic demographic expansions could have favored the sedentarization and the emergence of agriculture in some human populations. In the case of Central Africa, the period of 60000 YBP corresponds to the separation between the ancestors of hunter-gatherers and farmers (Patin et al. 2009; Verdu et al. 2009). Thus, these two groups may have presented contrasting demographic patterns since their divergence. Much later, higher expansion rates and larger population sizes among farmers' ancestors may have induced the emergence of agriculture and sedentarization.

With respect to Eurasia, the expansion profiles inferred with HVS-I for all populations and with autosomes for the Han Chinese population also seem to have begun during the Paleolithic, thus before the Neolithic transition. Some genetic studies already reported pre-Neolithic expansions in Asia and Europe (e.g., Chaix et al. 2008). Notably, using mismatch and intermatch distributions, Chaix et al. (2008) showed an east-to-west Paleolithic expansion wave in Eurasia. We found a similar pattern here, as the inferred expansions of East-Asian populations were earlier than those of Central Asian populations, themselves earlier than those of European populations.

Moreover, we found this pattern in both sedentary farmer and nomadic herder populations. Thus, the ancestors of currently nomadic herder populations also experienced these Paleolithic expansions. However, Paleolithic expansion signals in nomadic populations seem weaker than in sedentary populations. This is again compatible with the demographic theory of the Neolithic sedentarization (Sauer 1952): some populations may have experienced more intense Paleolithic expansions, which may have led ultimately to their sedentarization.

The inferred Paleolithic expansion signals might result partly from spatial expansions out of some refuge areas after the Last Glacial Maximum (LGM, 26500–19500 YBP; Clark et al. 2009), as this time interval matches our inferred dating for expansion onsets in East Asia with HVS-I using the pedigree-based mutation rate, and in Europe and the Middle East using the transitional mutation rate. Some of the earlier date estimates might also be consistent with the out-of-Africa expansion of H. sapiens. However, the radiocarbon-based time estimates of the spread of H. sapiens in Eurasia are generally more ancient than our inferred expansion onset timings. For instance, Mellars (2006b) dated the colonization of the Middle East by H. sapiens at 47000–49000 YBP and of Europe at 41000–42000 YBP. Pavlov et al. (2001) report traces of modern human occupation nearly 40000 years old in Siberia. Finally, Liu et al. (2010) described modern human fossils from South China dated to at least 60000 YBP. Moreover, out-of-Africa or post-LGM expansions would not explain our finding of an east-to-west gradient of expansion onset timing, which rather supports the hypothesis of a demographic expansion diffused from east to west in Eurasia in a demic (i.e., migrations of individuals) or cultural (favored by the diffusion of new technologies) manner.

Possible Confounding Factors

Our approach makes the assumption that populations are isolated and panmictic, which is questionable for human populations. However, we analyzed a large set of populations sampled in very distant geographical regions (i.e., Central Africa, East Africa, Europe, Middle East, Central Asia, Pamir, Siberia, and East Asia). The main conclusions of this study rely on consistent patterns between most of these areas, and it seems unlikely that processes such as admixture could have biased the estimates similarly everywhere. Moreover, in Central Africa, several studies have shown that hunter-gatherer populations show signals of admixture, whereas this is not the case for farmer populations (Patin et al. 2009; Verdu et al. 2009, 2013). If this introgression had been strong enough, it may have yielded a spurious expansion signal in the hunter-gatherer populations, which is not what we observed here. In Europe, spatial expansion processes during the Neolithic may have led to admixture with Paleolithic populations. As pointed out by a simulation study (Arenas et al. 2013), this may lead to a predominance of the Paleolithic gene pool. This may be one of the factors explaining why we observed mostly Paleolithic expansions here.


Similarly, potential selection occurring on the whole mitochondrial genome (e.g., Pakendorf and Stoneking 2005) seems unlikely to have impacted in the same way all the studied populations within each group (e.g., stronger positive selection on sedentary than on nomadic populations), as we analyzed different nomadic and sedentary populations living near each other in several geographically distant areas.

Regarding the potential effects of recombination on the inferences from autosomal data, we found that neutrality tests gave similar results on the whole sequences, when using a simulation procedure that took the known recombination rate of each sequence into account (table 1), and on the largest non-recombining blocks as inferred with IMgc, without taking recombination into account in the simulation process (supplementary table S12, Supplementary Material online). It thus appears unlikely that the BEAST analyses, which can only handle the largest inferred non-recombining blocks, are biased by this restriction.

Finally, note that the effective population sizes inferred using BEAST correspond to the Ne of the populations during their recent history rather than to a value of Ne averaged over the history of the population. This explains the finding that, for most populations, we inferred Ne estimates much higher than generally assumed for humans by population geneticists (about 10000).

Material and Methods

Genetic Markers

Autosomal Sequences

We used data from 20 noncoding, a priori neutral, and unlinked autosomal regions selected by Patin et al. (2009) to be at least 200 kb away from any known or predicted gene, to not be in linkage disequilibrium (LD) either with each other or with any known or predicted gene, and to have a region of homology with the chimpanzee genome. These regions are on average 1253 bp long. Using the four-gamete test (Hudson and Kaplan 1985) as implemented in IMgc online (Woerner et al. 2007), we identified recombination events for 6 of these 20 regions. As some methods used in this study cannot handle recombination, we retained for these six sequences the largest non-recombining block inferred by IMgc. Because of this reduction, the 20 regions used were on average 1228 bp long. To identify potential bias related to this method (e.g., some recombination events may not be detected using the four-gamete test; larger blocks of non-recombining sequence may select for gene trees that are shorter than expected), we computed the summary statistics and performed neutrality tests (discussed later) both on the whole sequences (table 1) and on the largest non-recombining blocks (supplementary table S12, Supplementary Material online).
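The principle of the four-gamete test can be sketched as follows. This is only an illustration of the pairwise criterion under an assumption of biallelic sites without recurrent mutation; it is not the IMgc implementation, which additionally searches for the largest non-recombining block. The function name and the toy haplotypes are hypothetical.

```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return the pairs of site indices at which all four gametes occur.

    `haplotypes` is a list of equal-length strings over two alleles
    (e.g., '0' and '1'). Under an infinite-sites model without
    recombination, at most three of the four possible two-site
    combinations can be observed, so a pair showing all four is
    evidence for at least one recombination event between the sites.
    """
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


# Toy sample (hypothetical): sites 0 and 1 show the gametes 00, 01, 10, and 11.
toy_sample = ["000", "010", "100", "110"]
print(four_gamete_violations(toy_sample))  # [(0, 1)]
```

As noted in the text, such a test is conservative: recombination events that do not generate all four gametes remain undetected.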

Mitochondrial Sequences

We used the first hypervariable segment of the mitochondrial control region (HVS-I), sequenced between positions 16067 and 16383, excluding the hypervariable poly-C region (sites 16179–16195). The total length of the sequence was thus 300 bp.

Population Panel

For Africa, we used the autosomal sequences data set of Patin et al. (2009), which consists of five farmer populations (N = 118 individuals) and five Pygmy hunter-gatherer populations (N = 95). In addition, we used the HVS-I data set from Quintana-Murci et al. (2008), which consists of nine Central African farmer populations (N = 486) and seven Central African hunter-gatherer populations (N = 318) (supplementary table S1, Supplementary Material online).

For Eurasia, we used the autosomal sequences data set of Laval et al. (2010), consisting of 48 individuals from two East-Asian populations (Han Chinese and Japanese) and 47 individuals from two European populations (Chuvash and Danes). We also used the data from 48 individuals from one sedentary Central Asian population (Tajik farmers) and 48 individuals from one nomadic Central Asian population (Kyrgyz herders) of Ségurel et al. (2013). For HVS-I, we analyzed data from 17 Eurasian populations (N = 494 in total) located from Eastern to Western Eurasia, belonging to several published data sets (Derenko et al. 2000; Richards et al. 2000; Bermisheva et al. 2002; Imaizumi et al. 2002; Yao et al. 2002; Kong et al. 2003; Quintana-Murci et al. 2004; supplementary table S1, Supplementary Material online).

For our detailed study of Central Asia, we used HVS-I sequences from 12 farmer populations (N = 408 in total) and 16 herder populations (N = 567 in total). These data come from the studies by Chaix et al. (2007) and Heyer et al. (2009) for 25 populations (supplementary table S1, Supplementary Material online). The other populations (KIB, TAB, and TKY) were sequenced for this study. As in Chaix et al. (2007) and Heyer et al. (2009), DNA was extracted from blood samples using standard protocols, and the sequence quality was ensured as follows: each base pair was determined once with a forward and once with a reverse primer; any ambiguous base call was checked by additional and independent PCR and sequencing reactions; all sequences were examined by two independent investigators. All sampled individuals were healthy donors from whom informed consent was obtained. The study was approved by appropriate Ethic Committees and scientific organizations in all countries where samples have been collected.

Demographic Inferences from Sequences Analysis

Summary Statistics and Neutrality Tests

We computed classical summary statistics (number of polymorphic sites S, number of haplotypes K) and four neutrality tests (Tajima's [1989] D, Fu and Li's [1993] D and F, and Fu's [1997] Fs) on both mitochondrial and autosomal sequences. Although neutrality tests were originally designed to detect selective events, they also give information about demographic processes, especially when applied to neutral markers. Indeed, expansion events lead to more negative values than expected in the absence of selective and demographic processes. Conversely, contraction events lead to more positive values of the neutrality tests. For HVS-I sequences, we computed all summary statistics and neutrality tests and tested their departure from neutrality using the coalescent-based tests provided in DnaSP (Librado and Rozas 2009).
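As an illustration of how a neutrality test responds to demography, the sketch below computes Tajima's D from aligned sequences using the standard formulas of Tajima (1989). It is a minimal sketch, not the DnaSP implementation: it assumes equal-length sequences with no missing data, counts every polymorphic column as one segregating site, and the toy alignment is hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(sequences)
    length = len(sequences[0])
    # S: number of segregating (polymorphic) sites.
    s = sum(1 for pos in range(length) if len({seq[pos] for seq in sequences}) > 1)
    if s == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")
    # pi: mean number of pairwise differences.
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(1 for x, y in zip(a, b) if x != y)
        for a, b in combinations(sequences, 2)
    ) / n_pairs
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy alignment (hypothetical) dominated by singletons, as expected after
# an expansion: D is negative.
star_like = ["00000", "10000", "01000", "00100", "00010"]
print(tajimas_d(star_like))

# Toy alignment with two deeply diverged groups: D is positive.
structured = ["11", "11", "00", "00"]
print(tajimas_d(structured))
```

An excess of rare variants lowers the mean pairwise difference relative to the number of segregating sites, which is why expansions produce negative values.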

For the autosomal sequences, we used the procedure developed in Laval et al. (2010), which combines all autosomal sequences into a single test. This procedure consists in computing the mean value of each summary statistic across the 20 loci and in testing whether this mean value departs significantly from its expectation under neutrality in a constant-size population model, using a simulation procedure. For a given population with sample size n, we produced 10⁵ simulated samples of the same size n under a constant population size model, using the generation-per-generation coalescent-based algorithm implemented in SIMCOAL v2 (Laval and Excoffier 2004). Each simulated individual consisted of 20 independent sequences of 1253 bp (the average sequence length for the real data). Then, we used ARLEQUIN v3 (Excoffier et al. 2005), as modified by Laval et al. (2010), to compute the summary statistics on these simulated samples. We assessed whether the observed statistics differed significantly from the constant population model under neutrality by comparing these statistics with their null distribution obtained from the simulated data. We used gamma-distributed mutation rates with a mean value of 2.5 × 10⁻⁸/generation/site (95% confidence interval: 1.476 × 10⁻⁸–4.036 × 10⁻⁸), in agreement with previous studies (Pluzhnikov et al. 2002; Voight et al. 2005). This procedure yielded a P value for the significance of the departure from a constant-size model. We performed this procedure both on the whole sequences and on the largest non-recombining blocks. For the whole sequences (i.e., including recombination), we performed the data simulation under a coalescent model with recombination, using for each locus the recombination rate provided by the HapMap build GRCh37 genetic map (International HapMap Consortium 2003) (supplementary table S8, Supplementary Material online), whereas for the largest non-recombining blocks, we used a coalescent model without recombination.
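The comparison of an observed statistic with its simulated null distribution can be sketched as an empirical P value. The text does not state whether the test was one- or two-sided; the sketch below assumes a two-sided test, and the function name and the toy null distribution are hypothetical stand-ins for the means obtained from the coalescent simulations.

```python
def empirical_p_value(observed, null_values):
    """Two-sided empirical P value of `observed` against simulated values.

    Assumption: the P value is twice the smaller tail proportion,
    capped at 1. `null_values` stands for the statistic (e.g., the
    mean Tajima's D across loci) computed on each simulated sample.
    """
    n = len(null_values)
    lower_tail = sum(v <= observed for v in null_values) / n
    upper_tail = sum(v >= observed for v in null_values) / n
    return min(1.0, 2 * min(lower_tail, upper_tail))


# Toy null distribution (hypothetical): 101 evenly spaced values.
toy_null = list(range(-50, 51))
print(empirical_p_value(-50, toy_null))  # extreme observation, small P value
print(empirical_p_value(0, toy_null))    # central observation, P value of 1.0
```

With 10⁵ simulated samples, the smallest P value that such a comparison can resolve is of the order of 10⁻⁵.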

Both for autosomes and for HVS-I sequences, we adjusted the obtained P values for each neutrality test using a false discovery rate (FDR) correction (Benjamini and Hochberg 1995) in R v2.14.1 (R Development Core Team 2011), in order to take into account the increased error probability in the case of multiple testing.
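The Benjamini and Hochberg (1995) adjustment can be written in a few lines. The sketch below is an illustrative reimplementation in Python, not the R code used for the analyses; the function name and the toy P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order.

    Each P value is multiplied by m / rank (rank in ascending order),
    and monotonicity is enforced by taking the running minimum from
    the largest P value downward.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy P values (hypothetical), one per population for a given test.
toy_p = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(toy_p))  # approximately [0.02, 0.04, 0.04, 0.02]
```

A test is then declared significant when its adjusted P value falls below the chosen FDR threshold.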

MCMC Estimations of Demographic Parameters

We used the MCMC algorithm implemented in BEAST v1.6 (Drummond and Rambaut 2007). We tested the four demographic models implemented in this software: constant effective population size (N0) (constant model), population expansion with an increasing growth rate (g) (exponential model), population expansion with a decreasing growth rate (g) (logistic model), and the expansion model, in which N0 is the present-day population size, N1 the population size that the model asymptotes to going into the distant past, and g the exponential growth rate that determines how fast the transition is from near the N1 population size to the N0 population size. In fact, BEAST estimates composite parameters for each model, namely N0μ and g/μ, where N0 is the current effective population size, g the growth rate, and μ the mutation rate. In addition, for the expansion model, the ratio between the current (N0) and ancestral (N1) effective population sizes is also estimated. To infer N0 and g from these composite parameters, we needed to assume a value for the mutation rate. However, there is no consensus for mutation rates in humans in the literature, as different methods lead to different estimates. For autosomes, the most commonly used value is the phylogenetic rate of μ = 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002). However, recent studies based on the 1000 Genomes Project (1000 Genomes Project Consortium 2012) have found a 2-fold lower rate (μ = 1.2 × 10⁻⁸/generation/site) by directly comparing genome-wide sequences from children and their parents (Conrad et al. 2011; Scally and Durbin 2012). We used here both mutation rates. Similarly, for HVS-I, estimated mutation rates are highly dependent on methodologies and modes of calibration (Endicott et al. 2009). We used both the lower and the higher estimated mutation rates: the transitional changes mutation rate of μ = 5 × 10⁻⁶/generation/site (Forster et al. 1996) and the pedigree-based rate of μ = 10⁻⁵/generation/site (Howell et al. 1996; Heyer et al. 2001). We used a general time-reversible substitution model (Rodriguez et al. 1990). We assumed a generation time of 25 years, permitting the comparison with previous human population genetics studies (e.g., Chaix et al. 2008; Patin et al. 2009; Laval et al. 2010). As BEAST cannot handle recombination events, we used the largest non-recombining block within each sequence (discussed earlier).

We performed three runs of 10⁷ steps per population and per demographic model for the HVS-I sequence and three runs of 2 × 10⁸ steps (which corresponded to three runs of 10⁷ steps per locus) for the autosomal sequences. We recorded one tree every 1,000 steps, which thus implied a total of 10⁵ trees per locus and per run. We then removed the first 10% of the steps of each run (burn-in period) and combined the runs to obtain acceptable effective sample sizes (ESSs of 100 or above). The convergence of these runs was assessed using two methods: visual inspection of traces using Tracer v1.5 (Rambaut and Drummond 2007) to check for concordance between runs, and computation of Gelman and Rubin's (1992) convergence diagnostic using R v2.14.1 (R Development Core Team 2011) with the function gelman.diag available in the package coda (Plummer et al. 2006).
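The burn-in removal and the Gelman and Rubin diagnostic can be illustrated with a short sketch. It computes the basic potential scale reduction factor for one parameter. The gelman.diag function in coda applies additional corrections, so its output may differ slightly from this simplified version. Function names and toy chains are assumptions.

```python
from statistics import mean, variance


def drop_burn_in(chain, fraction=0.10):
    """Discard the first `fraction` of the sampled states of one run."""
    start = int(len(chain) * fraction)
    return chain[start:]


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])  # mean within-chain variance
    between = n * variance(chain_means)           # between-chain variance
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return (pooled / within) ** 0.5


# Toy chains standing in for the sampled growth rate of three runs.
runs = [
    [0.0, 1.0, 0.5, 0.8, 0.2, 0.9, 0.4, 0.6, 0.3, 0.7],
    [0.1, 0.9, 0.6, 0.7, 0.3, 0.8, 0.5, 0.5, 0.2, 0.8],
    [0.2, 0.8, 0.4, 0.9, 0.1, 1.0, 0.3, 0.7, 0.4, 0.6],
]
trimmed = [drop_burn_in(r) for r in runs]
psrf = gelman_rubin(trimmed)
```

Values close to 1 indicate that the independent runs sample the same distribution; values well above 1 suggest that the runs have not converged.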

To facilitate a large exploration of the parameter space, for the autosomal sequences, we chose uniform priors between 0 and 0.05 for 2N0μ and between −10⁹ and 10⁹ for g/μ. For HVS-I sequences, we chose uniform priors for N0μ between 0 and 10 and for g/μ between −2.5 × 10⁶ and 2.5 × 10⁶, resulting in the same priors on N0 and g as for autosomal sequences if we assumed μ = 10⁻⁵/generation/site (i.e., N0 constrained between 0 and 10⁶ and g constrained between −1 and 1 per year). Conversely, if we assumed μ = 5 × 10⁻⁶/generation/site, it meant that N0 was constrained between 0 and 2 × 10⁶ and g was constrained between −0.5 and 0.5 per year.
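The link between the composite parameters and the stated bounds on N0 and g can be checked with a few lines of arithmetic. The sketch assumes that g/μ multiplied by μ gives a per-generation rate, which is then divided by the 25-year generation time. This reading reproduces the bounds quoted above, but the function names and the ploidy argument are illustrative assumptions.

```python
GENERATION_TIME_YEARS = 25  # generation time assumed in the study


def n0_from_composite(n0_mu, mu, ploidy_factor=1):
    """Recover N0 from N0*mu (ploidy_factor=1) or 2*N0*mu (ploidy_factor=2)."""
    return n0_mu / (ploidy_factor * mu)


def growth_per_year(g_over_mu, mu, generation_time=GENERATION_TIME_YEARS):
    """Recover the yearly growth rate from the composite parameter g/mu."""
    return g_over_mu * mu / generation_time


# HVS-I upper prior bounds with the pedigree-based rate (1e-5).
hvs_n0_max = n0_from_composite(10, 1e-5)          # about 1e6
hvs_g_max = growth_per_year(2.5e6, 1e-5)          # about 1 per year

# HVS-I upper prior bounds with the transitional-changes rate (5e-6).
hvs_n0_max_slow = n0_from_composite(10, 5e-6)     # about 2e6
hvs_g_max_slow = growth_per_year(2.5e6, 5e-6)     # about 0.5 per year

# Autosomal upper prior bounds with the phylogenetic rate (2.5e-8).
auto_n0_max = n0_from_composite(0.05, 2.5e-8, ploidy_factor=2)  # about 1e6
auto_g_max = growth_per_year(1e9, 2.5e-8)                       # about 1 per year
```

Halving the assumed mutation rate doubles the upper bound on N0 and halves the upper bound on g, which is why the two HVS-I rates give different constraints from the same composite priors.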

For each population and model, we obtained the mode and the 95% HPD of N0 and g, inferred from their posterior distributions (supplementary tables S4 and S7, Supplementary Material online), using the add-on package Locfit (Loader 1999) in R v2.14.1. We selected the best-fitting model among the four tested demographic models by estimating marginal likelihoods using two methods: path sampling and stepping-stone sampling (Baele et al. 2012). The model with the greater marginal likelihood (supplementary table S3, Supplementary Material online) was considered as the best-fitting model.

Extended Bayesian Skyline Plots

EBSPs (Heled and Drummond 2010), also implemented in BEAST, estimate demographic changes occurring continuously through time in a population, using the time intervals between successive coalescent events. This method allows a visualization of the evolution of Ne through time. As above, we combined three runs of 10⁷ steps for mitochondrial sequences and three runs of 2 × 10⁸ steps for autosomal sequences to obtain acceptable ESS values. We assumed the same mutation rates as above and a generation time of 25 years. Outputs were analyzed with Tracer v1.5 to visually check for convergence and ESS, and also to obtain the 95% HPD interval for the number of demographic changes that occurred in the population (supplementary table S5, Supplementary Material online). A constant population size could be rejected when the 95% HPD of the number of change points excluded 0 but included 1 (Heled and Drummond 2010). Then, we used R v2.14.1 to compute Gelman and Rubin's (1992) convergence diagnostic as above, as well as to compute skyline plots. Finally, we used the population growth curves generated from BEAST to assess the time at which populations began to expand. Each skyline plot consisted of smoothed data points at ~10–20 generation intervals. We considered that the population increased (or decreased) when both the median and the 95% HPD values for Ne increased (or decreased) between more than two successive data points. Although this method did not provide a 95% HPD interval for the inferred expansion timings, this conservative approach ensured that we considered only relevant expansion signals.
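The rule used to date the onset of an expansion from a skyline plot can be sketched as follows. The sketch reads "more than two successive data points" as a run of at least three points over which the median and both HPD bounds all rise. That reading, the function name, and the toy values are assumptions, not the authors' script.

```python
def expansion_onset(times_ybp, median, lower, upper, min_points=3):
    """Return the oldest time (YBP) that starts a sustained increase, or None.

    All four lists are aligned point by point; times are years before present.
    """
    # Order the points from the oldest to the most recent so that an
    # "increase" is read forward in time.
    order = sorted(range(len(times_ybp)), key=lambda i: -times_ybp[i])
    run_start = 0
    for k in range(1, len(order)):
        prev, cur = order[k - 1], order[k]
        rising = (
            median[cur] > median[prev]
            and lower[cur] > lower[prev]
            and upper[cur] > upper[prev]
        )
        if not rising:
            run_start = k  # the run is broken; restart from this point
        elif k - run_start + 1 >= min_points:
            return times_ybp[order[run_start]]
    return None


# Toy skyline: Ne is flat between 1,000 and 750 YBP, then grows to present.
times = [0, 250, 500, 750, 1000]
ne_median = [900, 800, 500, 300, 300]
ne_lower = [600, 500, 300, 200, 200]
ne_upper = [1200, 1100, 800, 500, 500]
onset = expansion_onset(times, ne_median, ne_lower, ne_upper)
```

Requiring the HPD bounds to move together with the median is what makes the rule conservative: a rising median with a flat lower bound is not counted as an expansion.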

Correlation Tests of Inferred Growth Rates and Isolation/Immigration Patterns

To test how differences in degree of isolation and in migration patterns could have impacted our demographic inferences, we used ARLEQUIN v3.11 (Excoffier et al. 2005) to compute population-specific FST values (Weir and Hill 2002) from HVS-I data for Central Africa, Eurasia, and Central Asia (supplementary table S9, Supplementary Material online) and to estimate immigration rates from mismatch distributions under a spatially explicit model (Excoffier 2004) (supplementary table S10, Supplementary Material online). We then performed Spearman tests using R v2.14.1 to investigate, for each region, how the inferred parametric growth rates were correlated with those FST values and immigration rates. We used a value of 0 for the growth rate when the constant model best fitted the data.
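Because populations best fitted by the constant model all receive a growth rate of 0, the rank correlation has to handle ties. The following sketch computes Spearman's coefficient with average ranks in pure Python. The tests in the study were run in R, and the example values here are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)  # undefined if either variable is constant


# Hypothetical growth rates (0 where the constant model fitted best)
# and hypothetical population-specific FST values.
growth = [0.0, 0.0, 0.002, 0.005, 0.010]
fst = [0.09, 0.07, 0.05, 0.03, 0.01]
rho = spearman_rho(growth, fst)
```

A negative coefficient in such a test would mean that more isolated populations (higher FST) tend to show weaker inferred growth.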

Supplementary Material

Supplementary figures S1 and S2 and tables S1–S12 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors would like to warmly thank all volunteer participants. They also thank Michael Fontaine for his help on some data analyses, Phillip Endicott and Julio Bendezu-Sarmiento for insightful discussions, Alexei Drummond for helpful discussions, and Friso Palstra for his help on English usage. They thank Laurent Excoffier, two anonymous reviewers, and the editor for helpful comments and suggestions. All computationally intensive analyses were run on the Linux cluster of the Muséum National d'Histoire Naturelle (administrated by Julio Pedraza) and on the web-based portal “Bioportal” (Kumar et al. 2009). This work was supported by Actions Transversales du Muséum - Muséum National d'Histoire Naturelle (grant “Les relations Sociétés-Natures dans le long terme”) and Agence Nationale de la Recherche (grants “Altérité culturelle,” ANR-10-ESVS-0010, and “Demochips,” ANR-12-BSV7-0012). C.A. was financed by a PhD grant from the Centre National de la Recherche Scientifique.

References

1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.
Ambrose SH. 2001. Paleolithic technology and human evolution. Science 291:1748–1753.
Arenas M, François O, Currat M, Ray N, Excoffier L. 2013. Influence of admixture and Paleolithic range contractions on current European diversity gradients. Mol Biol Evol. 30:57–61.
Atkinson QD, Gray RD, Drummond AJ. 2009. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa. Proc R Soc Lond B Biol Sci. 276:367–373.
Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol. 29:2157–2167.
Balaresque PL, Ballereau SJ, Jobling MA. 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum Mol Genet. 16:R134–R139.
Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. 2011. Insights into the demographic history of African pygmies from complete mitochondrial genomes. Mol Biol Evol. 28:1099–1110.
Beaumont MA. 2004. Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92:365–379.
Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 57:289–300.
Bermisheva M, Salimova A, Korshunova T, Svyatova G, Berezina G, Villems R, Khusnutdinova E. 2002. Mitochondrial DNA diversity in the populations of Middle Asia and Northern Caucasus. Eur J Hum Genet. 10:179–180.
Bocquet-Appel JP. 2011. When the world's population took off: the springboard of the Neolithic Demographic Transition. Science 333:560–561.
Bocquet-Appel J-P, Bar-Yosef O. 2008. The Neolithic demographic transition and its consequences. Dordrecht (The Netherlands): Springer.


Chaix R, Austerlitz F, Hegay T, Quintana-Murci L, Heyer E. 2008. Genetic traces of east-to-west human expansion waves in Eurasia. Am J Phys Anthropol. 136:309–317.
Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, Heyer E. 2007. From social to genetic structures in central Asia. Curr Biol. 17:43–48.
Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. 2009. The Last Glacial Maximum. Science 325:710–714.
Conrad DF, Keebler JEM, DePristo MA, et al. (17 co-authors). 2011. Variation in genome-wide mutation rates within and between human families. Nat Genet. 43:712–714.
Derenko MV, Malyarchuk BA, Dambueva IK, Shaikhaev GO, Dorzhu CM, Nimaev DD, Zakharov IA. 2000. Mitochondrial DNA variation in two South Siberian aboriginal populations: implications for the genetic history of North Asia. Hum Biol. 72:945–973.
Dirksen VG, van Geel B. 2004. Mid to late Holocene climate change and its influence on cultural development in South Central Siberia. Nato Sci S Ss IV Ear. 42:291–307.
Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 7:214.
Endicott P, Ho SYW, Metspalu M, Stringer C. 2009. Evaluating the mitochondrial timescale of human evolution. Trends Ecol Evol. 24:515–521.
Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol. 13:853–864.
Excoffier L, Heckel G. 2006. Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet. 7:745–758.
Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online. 1:47–50.
Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A. 109:E2569–E2576.
Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet. 59:935–945.
Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.
Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.
Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci. 7:457–511.
Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol. 27:570–580.
Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Ségurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet. 10:49.
Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol. 21:597–612.
Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet. 63:1113–1126.
Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res. 11:423–434.
Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet. 59:501–509.
Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.
Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med. 116:68–73.
International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.
Kingman JFC. 1982. The coalescent. Stoch Proc Appl. 13:235–248.
Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet. 73:671–676.
Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet. 2:e53.
Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.
Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.
Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.
Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.
Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A. 107:19201–19206.
Loader C. 1999. Local regression and likelihood. New York: Springer.
Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A. 103:9381–9386.
Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.
Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.
Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human populations on a global scale. Mol Biol Evol. 10:927–943.
Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.
Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet. 29:20–21.
Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet. 6:165–183.
Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev. 16:1125–1133.
Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet. 5:e1000448.
Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.
Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.
Plummer M, Best N, Cowles K, Vines K. 2006. CODA: convergence diagnosis and output analysis for MCMC. R News 6:7–11.
Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.
Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet. 74:827–845.
Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A. 105:1596–1601.


R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.
Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer/
Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol. 19:2092–2100.
Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol. 20:76–86.
Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet. 67:1251–1276.
Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol. 142:485–501.
Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.
Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet. 13:745–53.
Ségurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet. 21:1146–1151.
Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.
Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet. 25:207–217.
Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.
Tremblay M, Vézina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet. 66:651–658.
Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol. 21:559–566.
Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol. 19:312–318.
Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture, and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol. 30:918–937.
Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A. 102:18508–18513.
Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet. 36:721–750.
Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.
Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet. 70:635–651.


before the Neolithic, and these Paleolithic expansions in some populations may have ultimately favored their shift toward farming.

The recent developments in sequencing technologies and bioinformatics tools have allowed the exploration of large multilocus polymorphism data sets. In combination with archeological and paleoanthropological records, these data can substantially improve our ability to infer past demographic events (Beaumont 2004). Stemming from Kingman's (1982) coalescent theory, numerical coalescent-based methods have thus been developed, allowing the inference of demographic parameters from molecular data. Most of these methods assume a specific demographic model. Moreover, nonparametric approaches such as Extended Bayesian Skyline Plots (EBSPs; Heled and Drummond 2010) allow inference of the demographic history of populations without assuming a specific model, by using the time intervals between serial coalescent events (see Excoffier and Heckel 2006 and Ho and Shapiro 2011 for reviews).

Here, we used these methods to investigate 1) the relationship between lifestyle (i.e., sedentary farming, nomadic herding, or nomadic hunting-gathering) and demographic patterns in a large set of African and Eurasian populations and 2) the chronology of demographic expansions and the emergence of farming, by comparing inferred expansion onset times with the dating of the most ancient archeological traces of farming and herding (pottery, irrigation structures, and animal bones) reported in Bocquet-Appel and Bar-Yosef (2008) for each region. In addition, by computing FST values and immigration rates, we investigated the extent to which the inferred demographic patterns could be explained by spatial expansion processes. Indeed, modeling studies (Ray et al. 2003; Excoffier 2004) have shown that such processes can produce signals on within-population diversity patterns similar to those obtained with pure demographic expansions. In particular, these studies argue that ancient spatial expansion signals could be attenuated or suppressed in isolated populations. Different expansion signals among populations, as inferred from genetic data, may thus in part reflect variation in immigration rates and extent of population isolation.

We used 20 a priori neutral autosomal regions and the hypervariable control region (HVS-I) of the mitochondrial DNA (mtDNA), sequenced in 404 individuals from 16 populations and 2,429 individuals from 61 populations, respectively (supplementary table S1, Supplementary Material online). Given their distinct properties and modes of transmission, we compared the inferences obtained with these two types of markers in order to gain complementary insights into the past demography of the studied populations. By studying many populations from different geographic areas worldwide, we were able to determine which patterns were observed across all populations and which were specific to a given geographical region. First, we focused on Central Africa, where nomadic hunter-gatherer populations, commonly called Pygmies, coexist with sedentary farmer populations. These two groups are genetically differentiated and seem to have diverged about 60,000 years ago (Patin et al. 2009; Verdu et al. 2009), thus long before the Neolithic sedentarization of farmer populations in this area (5,000–4,000 years before present [YBP]; Bocquet-Appel and Bar-Yosef 2008). Second, we analyzed a sample of populations from several distant geographical regions of Eurasia, where sedentary farmers coexist with nomadic herders. This was of particular interest as, to our knowledge, the differences in demographic processes between herders and farmers have not been studied yet. Third, we performed a more detailed study in Central Asia, another area of interest, as it is thought to have been a major corridor during the successive Eurasian migration waves (Nei and Roychoudhury 1993).

Results

Neutrality Tests

Focusing first on Africa, all farmer populations showed at least one significantly negative value for one of the four neutrality tests (Tajima's [1989] D, Fu and Li's [1993] D* and F*, and Fu's [1997] Fs; table 1), which can be interpreted as a signal of expansion. Conversely, hunter-gatherer populations showed no such expansion signals. Aka and Mbuti hunter-gatherers presented at least one significantly positive test, indicating a possible contraction event. Similarly, for HVS-I sequences, we found significantly negative Fu's Fs values for all farmer populations except the Ewondo, but no expansion signal for hunter-gatherers (supplementary table S2, Supplementary Material online). Kola hunter-gatherers showed a significantly positive Tajima's D, indicating a possible contraction event.

Similar analyses on autosomal sequences in Europe and East Asia revealed no significant expansion signals in either sedentary or nomadic populations (table 1). We even observed contraction signals in two sedentary populations, one East-Asian and one European. Indeed, we found significantly positive values for two neutrality tests for the Japanese and three neutrality tests for the Danes. Conversely, for HVS-I sequences from Eurasia (supplementary table S2, Supplementary Material online), we obtained significant signals of expansion for at least one test (Fu's Fs) for all populations (including Japanese and Danes). All sedentary populations except Koreans also showed significant signals of expansion for the three other tests, whereas the Koreans and all nomadic populations showed a significant expansion signal only for Fu's Fs.

Focusing on Central Asia, no neutrality test was significant for the autosomes in either Tajik sedentary farmers (TAB) or Kyrgyz nomadic herders (KIB) (table 1). Conversely, for HVS-I sequences (supplementary table S2, Supplementary Material online), all farmers and herders presented a significant expansion signal for at least one test, except one farmer population (TDS).

Coalescent-Based Inferences of Demographic History

Africa: Pre-Neolithic Demographic Expansions in Sedentary Farmer Populations

Considering first the autosomal data, models consistent with an increase in population size best fitted the data for all African farmer populations (supplementary table S3, Supplementary Material online). The “expansion model” best fitted the data for the two East-African farmer populations (namely Chagga and Mozambicans), whereas the “exponential model” best fitted the data for all West-African farmer populations (Akele, Ngumba, and Yoruba), with positive growth rates in all cases (supplementary table S4, Supplementary Material online). Conversely, no signals of expansion were found for hunter-gatherer populations, as the “constant model” always best fitted the data (supplementary tables S3 and S4, Supplementary Material online). Consistently, EBSPs showed signals of expansions for farmer populations (fig. 1A): 95% highest probability density (HPD) intervals for the estimated number of demographic changes did not include 0, indicating at least one significant change in population size (supplementary table S5, Supplementary Material online). Conversely, we found no evidence of population size changes for hunter-gatherers (fig. 1B and supplementary table S5, Supplementary Material online). We further dated the onset of farmer expansions from at least 62,275 YBP (assuming μ = 2.5 × 10⁻⁸/generation/site) or 124,550 YBP (assuming μ = 1.2 × 10⁻⁸/generation/site) for Mozambicans to 7,975 or 15,950 YBP for Yoruba. Visual examination of the 95% HPD intervals showed that the expansion event inferred for the Mozambican population was significantly older than those inferred for the other populations (supplementary table S6, Supplementary Material online).

We found similar results for the HVS-I sequences from Central Africa (supplementary tables S3 and S7, Supplementary Material online). Indeed, the exponential model with positive growth rates best fitted the data for all farmer populations, indicating expansion events. Conversely, the exponential model with negative modal values for growth rate (i.e., contraction event) provided the best fit for all hunter-gatherer populations. However, as the 95% HPD intervals for growth rates included 0, we could not conclude that any significant contraction events occurred in these populations. Similarly, EBSPs indicated a significant expansion event for all farmer populations (fig. 2A and supplementary table S5, Supplementary Material online), whereas we found no evidence of population size changes for hunter-gatherers (fig. 2B and supplementary table S5, Supplementary Material online). We dated the expansions of farmer populations from 31,350 or 62,700 YBP (assuming μ = 10⁻⁵ or 5 × 10⁻⁶/generation/site, respectively) to 45,319 or 90,638 YBP (supplementary table S6, Supplementary Material online).

Finally, both with autosomes and HVS-I, all hunter-gatherer populations had lower current effective population size (N0) values than farmer populations (supplementary tables S4 and S7, Supplementary Material online). Furthermore, the inferred expansion onsets for all farmer populations largely predated the emergence of farming in Central Africa (5,000–4,000 YBP; Bocquet-Appel and Bar-Yosef 2008) (figs. 3 and 4).

Eurasia: Contrasting Demographic Patterns for Farmer Populations with Autosomes and Stronger Pre-Neolithic Expansions for Farmers Than Herders with HVS-I

The coalescent-based analyses of autosomes in East-Asian and European populations showed contrasting demographic patterns across sedentary populations (supplementary tables S3 and S4, Supplementary Material online). Using the parametric BEAST analysis, the expansion model best fitted the data for Han Chinese, indicating an expansion event. Conversely, we inferred that Japanese and Danes either underwent a contraction event or remained at constant size. Indeed, the exponential model with negative growth rates best fitted the data for these two populations, but the 95% HPD intervals also included g = 0. The constant model best fitted the data for the Chuvash, a traditionally nomadic population. EBSPs showed a significant expansion event for the Han population: the value of 0 was not included in the 95% HPD interval of the number of demographic changes (supplementary table S5, Supplementary Material online). This expansion event started at least 36,025 or 72,050 YBP (fig. 1C and supplementary table S6, Supplementary Material online), clearly predating the emergence of farming in East Asia about 9,000 YBP (Bocquet-Appel and Bar-Yosef 2008) (fig. 3). Japanese showed a significant contraction event (i.e., the value of 0 was not included in the 95% HPD interval of the number of demographic changes; supplementary table S5, Supplementary Material online), starting at least 21,350 or 42,700 YBP. Danes also showed a significant contraction event, starting at least 26,440 or 52,880 YBP (fig. 1C and supplementary tables S5 and S6, Supplementary Material online). EBSP analyses showed no significant demographic changes for the Chuvash (fig. 1D and supplementary table S5, Supplementary Material online).

Table 1. Summary Statistics and Neutrality Tests Computed from the Whole Autosomal Sequences.

Population     Area     Lifestyle          Sᵃ     Kᵇ     Tajima's Dᶜ  Fu & Li's D*ᶜ  Fu & Li's F*ᶜ  Fu's Fsᶜ
Akele          Africa   Sedentary farmers  6.95   6.45   0.35         0.55           0.57           1.12
Chagga         Africa   Sedentary farmers  8.65   7.95   0.48         0.70           0.74           1.27
Mozambicans    Africa   Sedentary farmers  8.80   9.55   0.62         1.15           1.15           3.33
Ngumba         Africa   Sedentary farmers  7.05   6.20   0.20         0.41           0.41           0.68
Yoruba         Africa   Sedentary farmers  7.50   7.15   0.14         0.03           0.03           0.73
Aka            Africa   Nomadic HGᵈ        6.95   6.60   0.12         0.34           0.32           0.30
G. Baka        Africa   Nomadic HG         6.30   6.00   0.008        0.17           0.14           0.33
S. Baka        Africa   Nomadic HG         6.10   5.6    0.17         0.05           0.10           0.03
Kola           Africa   Nomadic HG         6.55   6.25   0.14         0.03           0.08           0.75
Mbuti          Africa   Nomadic HG         6.60   6.10   0.25         0.35           0.37           0.16
Danes          Eurasia  Sedentary farmers  5.50   4.85   0.30         0.16           0.24           0.73
Han            Eurasia  Sedentary farmers  5.20   4.70   0.03         0.01           0.02           0.21
Japanese       Eurasia  Sedentary farmers  4.20   3.85   0.45         0.22           0.34           1.06
Chuvash        Eurasia  Nomadic herders    5.70   5.05   0.09         0.11           0.12           0.34
Tajiks (TAB)   C. Asia  Sedentary farmers  9.00   9.00   0.19         0.03           0.10           0.24
Kyrgyz (KIB)   C. Asia  Nomadic herders    10.4   10.40  0.11         0.08           0.11           0.23

NOTE: Values significantly higher than expected for a constant population size model are italicized, whereas significantly lower values are underlined.
ᵃNumber of polymorphisms.
ᵇNumber of haplotypes.
ᶜWe report the means over the 20 regions.
ᵈHG = Hunter-gatherers; significance levels: P < 0.05, P < 0.01 after FDR correction for multiple testing (Benjamini and Hochberg 1995).

FIG. 1. EBSPs inferred from autosomal sequences in African sedentary farmers (A), African nomadic hunter-gatherers (B), Eurasian sedentary farmers (C), and Eurasian nomadic herders (D). The values indicated in bold on the axes are obtained assuming a mutation rate of μ = 1.2 × 10⁻⁸/generation/site (measured from parents–children trios by Conrad et al. 2011), and the other values correspond to μ = 2.5 × 10⁻⁸/generation/site (derived from the human–chimpanzee sequence divergence by Pluzhnikov et al. 2002). Although time was expressed in generations for the analyses, we represented time in years here, assuming a generation time of 25 years. Time is represented backward on the x axis, from the present on the left to the most distant past on the right. The 95% lower and upper HPD limits are represented by dashed lines. Populations for which the estimated number of demographic changes includes 0 (i.e., no significant signal of expansion or decline) are represented in light gray.


FIG. 2. EBSPs inferred from HVS-I sequences in African sedentary farmers (A), African nomadic hunter-gatherers (B), Eurasian sedentary farmers (C), Eurasian nomadic herders (D), Central Asian sedentary farmers (E), and Central Asian nomadic herders (F). The values indicated in bold are obtained assuming a mutation rate of μ = 5 × 10⁻⁶/generation/site (transitional changes rate; Forster et al. 1996), and the others correspond to μ = 10⁻⁵/generation/site (pedigree-based; Howell et al. 1996; Heyer et al. 2001). Time is represented in years, assuming a generation time of 25 years. It is represented backward on the x axis, from the present on the left to the most distant past on the right. The 95% lower and upper HPD limits are represented by dashed lines. Populations for which the estimated number of demographic changes includes 0 (i.e., no significant signal of expansion or decline) are represented in light gray, and the others in black.


For the HVS-I sequences from Eurasia (supplementary tables S7 and S8, Supplementary Material online), the parametric BEAST analyses showed that models consistent with an increase in population size (expansion model or exponential model with positive growth rates) best fitted the data for all sedentary populations except Koreans. Conversely, the constant model best fitted the data for all nomadic populations as well as Koreans. EBSPs showed, however, significant expansion events for both farmers and herders, but not for Koreans (fig. 2C and D and supplementary table S5, Supplementary Material online). Nevertheless, there was a tendency toward stronger expansion rates and higher Ne values in sedentary than in nomadic populations (fig. 2C and D), although the 95% HPD intervals for Ne were quite large for sedentary populations. The estimated expansion onset times inferred from the EBSPs (supplementary table S6, Supplementary Material online) followed an east-to-west gradient: they appeared more ancient in Eastern populations, in both sedentary and nomadic populations (supplementary fig. S1, Supplementary Material online). They also clearly predated the Neolithic transition in all geographic areas (fig. 4).

The Central Asian Exception: Similar Demographic Patterns in Farmers and Herders

For autosomes, the constant model best fitted the data for both sedentary farmers (TAB) and traditionally nomadic herders (KIB) (supplementary tables S3 and S4, Supplementary Material online). EBSPs also showed no significant demographic changes for these populations (figs. 1C and D and supplementary table S5, Supplementary Material online).

For HVS-I (supplementary tables S3 and S7, Supplementary Material online), the exponential model best fitted the data for six of the 12 sedentary farmer populations (including TAB), whereas the constant model was preferred for the other farmers. Unlike the rest of Eurasia, a model indicating expansion (the exponential model with positive growth rates) was also selected for all nomadic herders. Moreover, EBSPs showed significant expansion signals for both herder and farmer populations, except TJY (Yagnobs from Dushanbe), since at least 13,860 YBP (or 27,720 YBP) for farmers and 16,546 YBP (or 33,092 YBP) for herders, on average (fig. 2E and F and supplementary tables S5 and S6, Supplementary Material online). Again, these inferred expansion onsets predated the emergence of farming in the area about 8,000 YBP (Bocquet-Appel and Bar-Yosef 2008) (fig. 4). Inferred expansions for Central Asian sedentary farmers seemed overall weaker (i.e., lower growth rate and lower Ne) than those observed for other sedentary populations in Eurasia, although we observed important variations in growth rates and Ne among populations and large 95% HPD intervals for some of them (fig. 2C and E).

Degrees of Isolation and Migration Patterns

African farmer populations appeared less isolated and received more migrants than hunter-gatherer populations. Indeed, the population-specific FST values (supplementary table S9, Supplementary Material online) were on average significantly lower for farmers than for hunter-gatherers (mean[farmers] = 0.058, mean[HG] = 0.192; Wilcoxon two-sided test, P value = 0.0002). Moreover, the estimated number of immigrants was significantly higher for sedentary farmers than for nomadic hunter-gatherers (mean[farmers] = 314, mean[HG] = 221; P value = 0.0001) (supplementary table S10, Supplementary Material online). For hunter-gatherers, the FST values were negatively correlated with the negative growth rates that we inferred from the parametric method (ρ = −0.893, P value = 0.012) (fig. 5B), meaning that less isolated populations showed weaker contraction events (i.e., less negative growth rates). Conversely, there was no significant correlation between FST values and inferred growth rates for sedentary farmers (ρ = 0.433, P value = 0.249) (fig. 5A). However, we found a significant positive correlation between the number of immigrants and the inferred growth rates (supplementary fig. S2, Supplementary Material online) among sedentary farmer populations (ρ = 0.867, P value = 0.004), but not among nomadic hunter-gatherers (ρ = 0.536, P value = 0.235).
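The correlation analyses reported above can be sketched as follows, under the assumption that the coefficients are Spearman rank correlations (the text does not name the correlation test at this point). The helper names and the FST and growth-rate values are hypothetical and serve only to show the computation.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical population-specific FST values and growth rates (not study data)
fst = [0.02, 0.05, 0.08, 0.15, 0.20]
growth = [0.004, 0.005, 0.002, 0.003, 0.001]
rho = spearman_rho(fst, growth)
```

A negative coefficient, as obtained with these made-up values, would correspond to the pattern in which less isolated populations (lower FST) show higher growth rates. Assessing significance would additionally require a P value, which this sketch does not compute.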

For Eurasia, we found no significant difference in FST values between farmers and herders (mean[farmers] = 0.039, mean[herders] = 0.043; P value = 0.77; supplementary table S9, Supplementary Material online), except in Central Asia, for which we found significantly lower FST values for nomadic herders than for sedentary farmers (mean[farmers] = 0.018, mean[herders] = 0.008; P value = 0.017). We report a significant

FIG. 3. Comparison of estimated times for expansion onsets using autosomes and dating of the first archeological traces of farming in Africa and China. Time is represented backward (in YBP). Only populations for which the EBSP analysis showed a significant expansion event are represented. We reported the time values estimated with the highest mutation rate that we used for the autosomes (μ = 2.5 × 10⁻⁸/generation/site). Thus, these time values can be considered as a lower bound for the expansion onsets. The dates for the emergence of farming come from the review by Bocquet-Appel and Bar-Yosef (2008). They are based on archeological remains.


FIG. 4. Comparison of estimated times for expansion onsets using HVS-I and dating of the first archeological traces of farming or herding in Central Africa (A), Eurasia (B), and Central Asia (C). Time is represented backward (in YBP). Only populations for which the EBSP analysis showed a significant expansion event are represented. We reported the time values estimated with the highest mutation rate that we used for the HVS-I sequences (μ = 10⁻⁵/generation/site). Thus, these time values can be considered as a lower bound for the expansion onsets. The dates for the emergence of farming come from the review by Bocquet-Appel and Bar-Yosef (2008). They are based on archeological remains.


FIG. 5. Correlations between population-specific FST values and inferred growth rates in African farmer (A) and hunter-gatherer (B) populations, Eurasian farmer populations (C), and Central Asian farmer (D) and herder populations (E). Population-specific FST values were computed with ARLEQUIN v3.11 (Excoffier et al. 2005). The growth rates were inferred under the best-fitting model from the parametric method using BEAST (Drummond and Rambaut 2007). When the best-fitting model was the constant model, we assumed a growth rate of 0. Note that we did not represent Eurasian herder populations, as the constant model best fitted the data for all of them. Plots and correlation tests were performed using R v2.14.1 (R Development Core Team 2011).


negative correlation between FST values and inferred growth rates for sedentary farmers in Eurasia (ρ = −0.673, P value = 0.028) (fig. 5C) and Central Asia (ρ = −0.773, P value = 0.003) (fig. 5D), meaning that less isolated populations showed higher inferred growth rates. There was no significant correlation for Central Asian herders (ρ = −0.092, P value = 0.736) (fig. 5E). Note that this analysis could not be performed for the other Eurasian herder populations, as the constant model best fitted the data with the parametric method.

The estimation of the proportion of immigrants did not converge for 11 Eurasian populations (Han Chinese, Liaoning, Qingdao, Palestinians, Pathans, Mongols, as well as three Central Asian farmer populations and two Central Asian herder populations). Regarding the other populations, we found no significant difference in the proportion of immigrants between farmers and herders, both in Central Asia (mean[farmers] = 82868, mean[herders] = 260265; P value = 0.12) and in the rest of Eurasia (mean[farmers] = 576757, mean[herders] = 201311; P value = 0.51) (supplementary table S10, Supplementary Material online). We also found no significant correlation between this proportion and the inferred growth rates for Eurasian farmers (ρ = −0.238, P value = 0.48), Central Asian farmers (ρ = 0.386, P value = 0.30), and Central Asian herders (ρ = −0.42, P value = 0.139) (supplementary fig. S2, Supplementary Material online).

Discussion

In this study, using a large set of populations from distant geographic areas, we report contrasted demographic histories that correlate with lifestyle. Moreover, the inferred expansion signals in both African and Eurasian farmer and herder populations predated the Neolithic transition and the sedentarization of these populations.

Contrasted Demographic Histories in Sedentary and Nomadic Populations

For Africa, both mtDNA and autosomal data revealed expansion patterns in most sedentary farmer populations, as indicated by neutrality tests and the parametric and nonparametric BEAST methods. Conversely, we found constant effective population sizes (or possibly contraction events) for all hunter-gatherer populations. Among the farmers, results were least clear for the Yoruba and the Ewondo populations, as no neutrality test was significant for these populations, whereas they showed evidence of expansion events when analyzed with BEAST. This indicates that these populations may have undergone weaker expansion dynamics (i.e., lower growth rates and Ne) than the others. These remarkable results are of particular importance for the Yoruba, as it is a reference population in many databases (HapMap, 1000 Genomes). This also demonstrates the higher sensitivity of MCMC methods such as BEAST to detect expansions in comparison to neutrality tests.

The contrasted patterns inferred between sedentary and nomadic populations in Africa suggest strong differences between the demographic histories of these two groups of populations. The question is whether this pattern results mostly from differences in local expansion dynamics or whether spatial expansion processes at a larger scale were also involved. As shown by Ray et al. (2003), negative values for the neutrality tests will be observed in a spatial expansion process if the rate of migrants (Nm) is high enough (at least 20), but not otherwise. As in previous studies (e.g., Verdu et al. 2013), we report a higher degree of isolation (higher population-specific FST values) in hunter-gatherer populations than in farmer populations. Using the spatial expansion model of Excoffier (2004) also leads to higher estimates of the number of immigrants into farmer populations. Thus, both farmers and hunter-gatherers may have been subject to a spatial expansion process, but the limited number of migrants among hunter-gatherers may have resulted in an absence of expansion signals for them. This would be consistent with the positive correlation that we observe between the growth rates estimated with BEAST and the inferred number of immigrants in the sedentary farmer populations. However, this spatial expansion process seems unlikely to completely explain the strong association that we observed between lifestyle and expansion patterns, as some farmer populations (Teke, Gabonese Fang) displayed FST values similar to those of hunter-gatherers but a clear signal of expansion with relatively high growth rates. This suggests that even rather isolated farmer populations show substantial levels of expansion. Moreover, FST values and inferred growth rates in farmer populations were not significantly correlated. Therefore, our results suggest that the expansion patterns observed in sedentary populations do not result only from a spatial expansion pattern. In addition, local dynamics connected with the higher capacity of food production by farmers also explain their much stronger expansion signatures relative to their neighboring hunter-gatherer populations.

For Eurasia, when considering the mtDNA data, all three methods (neutrality tests, parametric BEAST analyses, and EBSPs) yielded expansion signals for all sedentary farmer populations except Koreans. Conversely, only EBSPs and Fu's Fs test showed expansion signals for nomadic herders; neither the parametric BEAST method nor the other neutrality tests did. This result points toward weaker expansion dynamics in herders than in farmers, as supported also by the tendency for lower growth rates and Ne in herder populations than in farmer populations on the EBSP graphs (fig. 2C and D). It thus seems that the flexibility and nonparametric nature of EBSP analyses allow one to detect weaker expansion events than the parametric method. Moreover, Fu's Fs is known to be more sensitive than the other neutrality tests for detecting expansions (Ramos-Onsins and Rozas 2002). Again, these inferred expansions may result, at least in part, from spatial expansion processes. The population-specific FST values are indeed rather low in Eurasia. Moreover, we found a significant negative correlation between FST values and inferred growth rates for the sedentary farmers, indicating that less isolated populations showed stronger expansion signals. However, although we inferred much stronger expansion patterns for the farmers than for the herders, we did not observe any differences in Eurasia between the farmers and the herders in the population-specific FST values or in the estimated number of immigrants, suggesting that spatial processes alone cannot explain the strong difference that we observed between the expansion patterns of these two groups of populations. This indicates that the intrinsic demographic growth patterns are different between these two kinds of populations, the farmers showing much higher growth rates than the herders.

To our knowledge, although other studies have found different patterns between hunter-gatherers and farmers (e.g., Verdu et al. 2009), our study is the first to show differences between farmers and herders, the two major post-Neolithic human groups. A plausible explanation could be that nomadic herders and hunter-gatherers share several of the constraints of a nomadic way of life. For instance, birth intervals are generally longer (at least 4 years) in nomadic populations than in sedentary populations (e.g., Short 1982). According to Bocquet-Appel (2011), these longer birth intervals may be mainly determined by diet differences. Indeed, Valeggia and Ellison (2009) demonstrated that birth interval is mainly determined by the rapidity of postpartum energy recovery, which may be increased by the consumption of high-carbohydrate food (like cereals). Moreover, the nomadic herder way of life may offer less food security than sedentary farming, the latter facilitating efficient long-term food storage.

However, unlike in Africa, we did not find systematically consistent patterns between the autosomal and mtDNA data in Eurasia. The possible contraction events that our results suggest for two sedentary populations (Japanese and Danes) with autosomes appeared concomitant with historical events that could have led to bottleneck processes. For the Japanese population, this contraction signal could indeed result from a founder effect due to the Paleolithic colonization of Japan by a subset of the Northern Asiatic people (especially from Korea; Nei 1995). Similarly, a bottleneck process may also have occurred in the Danish population, linked with the last glacial maximum occurring between 26,500 and 19,500 YBP (Clark et al. 2009). Reasons why these processes impacted the autosomes but not the mtDNA data remain to be determined, for instance through simulation studies. In any case, our study clearly emphasizes the utility of combining mtDNA and autosomal sequences, as they allow access to different aspects of human history. A recent study on harbor porpoises has similarly shown that nuclear markers were sensitive to a recent contraction event, whereas mtDNA allowed inferring a more ancient expansion (Fontaine et al. 2012).

Interestingly, Central Asia displayed a distinct pattern from the rest of Eurasia. Indeed, we did not infer higher expansion rates for sedentary farmers than for nomadic herders in that area. This could result from harsh local environmental conditions due to the arid continental climate in this area. Indeed, using pollen records, Dirksen and van Geel (2004) showed that the paleoclimate in Central Asia was very arid from at least 12,000 to 3,000 YBP, which could have limited the amount of suitable areas for farming and impacted human demography. Spatial expansion processes may also have played a role in this difference, as population-specific FST values were higher for the farmers than for the herders. This may indicate that more migrants were involved in the spatial expansion process for the herders than for the farmers, yielding a weaker expansion signal (i.e., lower inferred growth rate) for the latter (Ray et al. 2003). This is supported by the negative correlation between the FST values and the inferred growth rates in the farmer populations. The Korean population also stood out as an exception in Eurasia. Even though it is a population of sedentary farmers, it showed no significant expansion signal with either the parametric or the nonparametric method with HVS-I. This could be explained by a later sedentarization of this population. The Korean Neolithic is notably defined by the introduction of Jeulmun ware ceramics about 8,000 YBP, but the people of the Jeulmun period were still predominantly semi-nomadic fishers and hunter-gatherers until about 3,000 YBP, when Koreans started an intensive crop production implying a sedentary lifestyle (Nelson 1993).

Inferred Expansion Signals Predate the Emergence of Farming

EBSP analyses revealed that the inferred expansion events in farmer and herder populations were more ancient than the emergence of farming and herding. Therefore, the differences in demographic patterns between farmers and herders seem to predate their divergence in lifestyle, which raises the question of the chronology of demographic expansions and the Neolithic transition. These findings appear to be quite robust to the choice of the scaling parameters. We used here both the lower and the higher mutation rate estimates in humans for autosomes (Pluzhnikov et al. 2002; Conrad et al. 2011) and for the HVS-I sequence (Forster et al. 1996; Howell et al. 1996). Despite this uncertainty in mutation rates, which leads to a 2-fold uncertainty in our time estimates, the inferred expansion signals predated the emergence of agriculture in both cases for all populations. Similarly, using a generation time of 29 years (Tremblay and Vezina 2000) instead of 25 years leads to slightly more ancient estimates and thus does not change our conclusions (data not shown). However, note that for HVS-I, using the higher bound of the credibility interval for the highest estimated mutation rate (2.75 × 10⁻⁵/generation/site; Heyer et al. 2001) instead of the mean value (i.e., 10⁻⁵/generation/site) leads to expansion time estimates consistent with the Neolithic transition in Eurasian populations (supplementary table S11, Supplementary Material online). Nevertheless, these estimates still clearly predated the Neolithic for the African populations. However, 10⁻⁵/generation/site is by far the highest estimation of mutation rate in the literature (Howell et al. 1996). To infer Neolithic expansions in most Eurasian populations, one needs to assume a mutation rate of at least 2 × 10⁻⁵/generation/site, which is much higher than other estimates from the literature and thus probably unrealistic. Moreover, our method for determining the expansion onset time using EBSP graphs is very conservative and also tends to favor the lower bound of expansion onset times. Finally, for autosomes, similarly using 4.74 × 10⁻⁸/generation/site instead of 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002) leads to an inferred expansion onset time that is not compatible with the Neolithic transition for all Eurasian and African populations, except for one African population, the Yoruba (supplementary table S12, Supplementary Material online). Consequently, it seems very likely that the expansions inferred in this study correspond to Paleolithic rather than Neolithic demographic events, in agreement also with most previous studies, as detailed later.
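The dependence of the time estimates on the scaling parameters can be illustrated with a minimal sketch. It assumes that an onset time is first available in units of expected mutations per site and is then divided by the mutation rate and multiplied by the generation time; the function name and the scaled time are hypothetical, the latter being back-calculated so that the pedigree-based HVS-I rate reproduces the 13,860 YBP average reported for Central Asian farmers.

```python
def years_before_present(scaled_time, mutation_rate, generation_time=25.0):
    """Convert a time in mutations/site into years (illustrative helper)."""
    generations = scaled_time / mutation_rate
    return generations * generation_time


# Hypothetical scaled onset time (mutations per site), back-calculated
# for illustration; it is not a value reported in the study.
scaled_onset = 0.005544

# HVS-I mutation rates used in the study (per generation per site)
mu_pedigree = 1e-5      # pedigree-based rate (higher bound)
mu_transitional = 5e-6  # transitional changes rate (lower bound)

onset_high_rate = years_before_present(scaled_onset, mu_pedigree)
onset_low_rate = years_before_present(scaled_onset, mu_transitional)
onset_gen29 = years_before_present(scaled_onset, mu_pedigree, generation_time=29.0)
```

Under these assumptions, halving the mutation rate doubles the estimated onset time (about 13,860 versus 27,720 YBP), and replacing a 25-year generation time with 29 years increases the estimate by a factor of 29/25.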

In Africa, the emergence of agriculture has been dated between 5,000 and 4,000 YBP in the Western part of Central Africa, from where it subsequently expanded rapidly to the rest of sub-Saharan Africa (Phillipson 1993). However, using HVS-I, we showed expansion events in farmer populations since about 30,000 or 60,000 YBP, thus largely predating the emergence of agriculture in the area. Similarly, using autosomes, especially in Eastern African populations, we inferred expansion signals that clearly predated the Neolithic. Notably, we inferred an expansion signal for Mozambicans since at least 80,000 YBP. Several genetic studies have already highlighted that expansion events occurred in African farmers before the Neolithic transition (e.g., Atkinson et al. 2009; Laval et al. 2010; Batini et al. 2011). This finding is also consistent with paleoanthropological data (i.e., radiocarbon dating) suggesting an expansion event in Africa 60,000–80,000 YBP (Mellars 2006a). This Paleolithic demographic expansion could be linked to a rapid environmental change toward a dryer climate (Partridge et al. 1997) and/or to the emergence of new hunting technologies (Mellars 2006a). According to Mellars (2006a), this period corresponds to a major increase in the complexity of the technological, economic, social, and cognitive behavior of certain African groups. It corresponds in particular to the emergence of projectile technologies (Shea 2009), which was probably part of a broader pattern of ecological diversification of early Homo sapiens populations. These changes could have been decisive for the human spread "Out of Africa" during the same period and could have ultimately also led to the sedentarization of the remaining populations. This inference is consistent with Sauer's (1952) demographic theory, which stated that late Paleolithic demographic expansions could have favored the sedentarization and the emergence of agriculture in some human populations. In the case of Central Africa, the period of 60,000 YBP corresponds to the separation between the ancestors of hunter-gatherers and farmers (Patin et al. 2009; Verdu et al. 2009). Thus, these two groups may have presented contrasting demographic patterns since their divergence. Much later, higher expansion rates and larger population sizes among farmers' ancestors may have induced the emergence of agriculture and sedentarization.

With respect to Eurasia, the expansion profiles inferred with HVS-I for all populations and with autosomes for the Han Chinese population also seem to have begun during the Paleolithic, thus before the Neolithic transition. Some genetic studies already reported pre-Neolithic expansions in Asia and Europe (e.g., Chaix et al. 2008). Notably, using mismatch and intermatch distributions, Chaix et al. (2008) showed an east-to-west Paleolithic expansion wave in Eurasia. We found a similar pattern here, as the inferred expansions of East-Asian populations were earlier than those of Central Asian populations, themselves earlier than those of European populations. Moreover, we found this pattern in both sedentary farmer and nomadic herder populations. Thus, the ancestors of currently nomadic herder populations also experienced these Paleolithic expansions. However, Paleolithic expansion signals in nomadic populations seem weaker than in sedentary populations. This is again compatible with the demographic theory of the Neolithic sedentarization (Sauer 1952): some populations may have experienced more intense Paleolithic expansions, which may have led ultimately to their sedentarization.

The inferred Paleolithic expansion signals might result partly from spatial expansions out of some refuge areas after the Last Glacial Maximum (LGM; 26,500–19,500 YBP; Clark et al. 2009), as this time interval matches our inferred dating for expansion onsets in East Asia with HVS-I using the pedigree-based mutation rate, and in Europe and the Middle East using the transitional mutation rate. Some of the earlier date estimates might also be consistent with the out-of-Africa expansion of H. sapiens. However, the radiocarbon-based time estimates of the spread of H. sapiens in Eurasia are generally more ancient than our inferred expansion onset timings. For instance, Mellars (2006b) dated the colonization of the Middle East by H. sapiens at 47,000–49,000 YBP and of Europe at 41,000–42,000 YBP. Pavlov et al. (2001) report traces of modern human occupation nearly 40,000 years old in Siberia. Finally, Liu et al. (2010) described modern human fossils from South China dated to at least 60,000 YBP. Moreover, out-of-Africa or post-LGM expansions would not explain our finding of an east-to-west gradient of expansion onset timing, which rather supports the hypothesis of a demographic expansion that diffused from east to west in Eurasia in a demic (i.e., migrations of individuals) or cultural (favored by the diffusion of new technologies) manner.

Possible Confounding Factors

Our approach makes the assumption that populations are isolated and panmictic, which is questionable for human populations. However, we analyzed a large set of populations sampled in very distant geographical regions (i.e., Central Africa, East Africa, Europe, Middle East, Central Asia, Pamir, Siberia, and East Asia). The main conclusions of this study rely on consistent patterns between most of these areas, and it seems unlikely that processes such as admixture could have biased the estimates similarly everywhere. Moreover, in Central Africa, several studies have shown that hunter-gatherer populations show signals of admixture, whereas this is not the case for farmer populations (Patin et al. 2009; Verdu et al. 2009, 2013). If this introgression had been strong enough, it may have yielded a spurious expansion signal in the hunter-gatherer populations, which is not what we observed here. In Europe, spatial expansion processes during the Neolithic may have led to admixture with Paleolithic populations. As pointed out by a simulation study (Arenas et al. 2013), this may lead to a predominance of the Paleolithic gene pool. This may be one of the factors explaining why we observed mostly Paleolithic expansions here.


Similarly, potential selection occurring on the whole mitochondrial genome (e.g., Pakendorf and Stoneking 2005) seems unlikely to have impacted in the same way all the studied populations within each group (e.g., stronger positive selection on sedentary than on nomadic populations), as we analyzed different nomadic and sedentary populations living near each other in several geographically distant areas.

Regarding the potential effects of recombination on the inferences from autosomal data, we found that neutrality tests gave similar results on the whole sequences, when using a simulation procedure that took the known recombination rate of each sequence into account (table 1), and on the largest non-recombining blocks as inferred with IMgc, without taking recombination into account in the simulation process (supplementary table S12, Supplementary Material online). It thus appears unlikely that the BEAST analyses, which can only handle the largest inferred non-recombining blocks, are biased because of this.

Finally, note that the effective population sizes inferred using BEAST correspond to the Ne of the populations during their recent history, rather than a value of Ne averaged over the history of the population. This explains the finding that, for most populations, we inferred Ne estimates much higher than generally assumed for humans by population geneticists (about 10,000).

Material and Methods

Genetic Markers

Autosomal Sequences

We used data from 20 noncoding, a priori neutral, and unlinked autosomal regions selected by Patin et al. (2009) to be at least 200 kb away from any known or predicted gene, to not be in linkage disequilibrium (LD) either with each other or with any known or predicted gene, and to have a region of homology with the chimpanzee genome. These regions are on average 1,253 bp long. Using the four-gamete test (Hudson and Kaplan 1985) as implemented in IMgc online (Woerner et al. 2007), we identified recombination events for 6 of these 20 regions. As some methods used in this study cannot handle recombination, we retained for these six sequences the largest non-recombining block inferred by IMgc. Because of this reduction, the 20 regions used were on average 1,228 bp long. To identify potential bias related to this method (e.g., some recombination events may not be detected using the four-gamete test; larger blocks of non-recombining sequence may select for gene trees that are shorter than expected), we computed the summary statistics and performed neutrality tests (discussed later) both on the whole sequences (table 1) and on the largest non-recombining blocks (supplementary table S12, Supplementary Material online).
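The principle of the four-gamete test can be sketched as follows. This is only the pairwise check, assuming biallelic sites coded as '0' and '1'; it is not the IMgc procedure, which additionally searches for the largest non-recombining block. The function name and the toy haplotypes are hypothetical.

```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return pairs of site indices at which all four gametes are observed.

    Assumes biallelic sites: under an infinite-sites model without
    recombination, at most three of the four two-site combinations
    (00, 01, 10, 11) can occur in a sample.
    """
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


# Toy haplotypes (illustration only): sites 0 and 1 show all four gametes
sample_with_recombination = ["000", "010", "100", "110"]
sample_without_recombination = ["000", "010", "100"]

flagged = four_gamete_violations(sample_with_recombination)
clean = four_gamete_violations(sample_without_recombination)
```

As noted in the text, the absence of a flagged pair does not prove the absence of recombination, since some recombination events leave no four-gamete signature.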

Mitochondrial Sequences

We used the first hypervariable segment of the mitochondrial control region (HVS-I), sequenced between positions 16067 and 16383, excluding the hypervariable poly-C region (sites 16179–16195). The total length of the sequence was thus 300 bp.
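The stated sequence length can be checked with a short calculation, assuming that both the sequenced segment and the excluded poly-C region are given with inclusive endpoints (an assumption, as the text does not say so explicitly). Variable names are illustrative.

```python
# Coordinates as given in the text (assumed to be inclusive endpoints)
segment_start, segment_end = 16067, 16383
polyc_start, polyc_end = 16179, 16195

segment_length = segment_end - segment_start + 1  # sites sequenced
polyc_length = polyc_end - polyc_start + 1        # sites excluded
retained_length = segment_length - polyc_length   # sites analyzed
```

Under this assumption, 317 sequenced sites minus 17 excluded sites give the 300 bp reported.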

Population Panel

For Africa, we used the autosomal sequences data set of Patin et al. (2009), which consists of five farmer populations (N = 118 individuals) and five Pygmy hunter-gatherer populations (N = 95). In addition, we used the HVS-I data set from Quintana-Murci et al. (2008), which consists of nine Central African farmer populations (N = 486) and seven Central African hunter-gatherer populations (N = 318) (supplementary table S1, Supplementary Material online).

For Eurasia, we used the autosomal sequences data set of Laval et al. (2010), consisting of 48 individuals from two East-Asian populations (Han Chinese and Japanese) and 47 individuals from two European populations (Chuvash and Danes). We also used the data from 48 individuals from one sedentary Central Asian population (Tajik farmers) and 48 individuals from one nomadic Central Asian population (Kyrgyz herders) of Segurel et al. (2013). For HVS-I, we analyzed data from 17 Eurasian populations (N = 494 in total), located from Eastern to Western Eurasia, belonging to several published data sets (Derenko et al. 2000; Richards et al. 2000; Bermisheva et al. 2002; Imaizumi et al. 2002; Yao et al. 2002; Kong et al. 2003; Quintana-Murci et al. 2004; supplementary table S1, Supplementary Material online).

For our detailed study of Central Asia we used HVS-I se-quences from 12 farmer populations (N = 408 in total) and 16herder populations (N = 567 in total) These data come fromthe studies by Chaix et al (2007) and Heyer et al (2009) for 25populations (supplementary table S1 SupplementaryMaterial online) The other populations (KIB TAB andTKY) were sequenced for this study As in Chaix et al(2007) and Heyer et al (2009) DNA was extracted fromblood samples using standard protocols and the sequencequality was ensured as follows each base pair was determinedonce with a forward and once with a reverse primer anyambiguous base call was checked by additional and indepen-dent PCR and sequencing reactions all sequences were ex-amined by two independent investigators All sampledindividuals were healthy donors from whom informed con-sent was obtained The study was approved by appropriateEthic Committees and scientific organizations in all countrieswhere samples have been collected

Demographic Inferences from Sequences Analysis

Summary Statistics and Neutrality Tests

We computed classical summary statistics (number of polymorphic sites S, number of haplotypes K) and four neutrality tests (Tajima's [1989] D, Fu and Li's [1993] D and F, and Fu's [1997] Fs) on both mitochondrial and autosomal sequences. Although neutrality tests were originally designed to detect selective events, they also give information about demographic processes, especially when applied to neutral markers. Indeed, expansion events lead to more negative values than expected in the absence of selective and demographic processes. Conversely, contraction events lead to more positive values of the neutrality tests. For HVS-I sequences, we computed all summary statistics and neutrality tests and tested their departure from neutrality using the coalescent-based tests provided in DnaSP (Librado and Rozas 2009).
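To make the link between expansion and negative test values concrete, the following sketch computes Tajima's D directly from an alignment using the standard formulas of Tajima (1989). It is a minimal illustration, not the DnaSP or ARLEQUIN implementation; the function name and the toy alignments are assumptions introduced here, and at least four sequences are assumed so that the variance term is positive.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences (n >= 4 assumed).

    Returns None when no site is polymorphic (D is then undefined).
    """
    n = len(seqs)
    length = len(seqs[0])
    # S: number of segregating (polymorphic) sites
    s = sum(1 for k in range(length) if len({seq[k] for seq in seqs}) > 1)
    if s == 0:
        return None
    # pi: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Excess of singletons, as expected after an expansion: D is negative.
singletons = ["10000", "01000", "00100", "00010", "00001", "00000"]
# Variants at intermediate frequency, as expected after a contraction: D is positive.
balanced = ["11", "11", "11", "00", "00", "00"]
print(tajimas_d(singletons), tajimas_d(balanced))
```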

For the autosomal sequences, we used the procedure developed in Laval et al. (2010), which combines all autosomal sequences into a single test. This procedure consists in computing the mean value of each summary statistic across the 20 loci and in testing whether this mean value departs significantly from its expectation under neutrality in a constant-size population model, using a simulation procedure. For a given population with sample size n, we produced 10⁵ simulated samples of the same size n under a constant population size model, using the generation-per-generation coalescent-based algorithm implemented in SIMCOAL v2 (Laval and Excoffier 2004). Each simulated individual was constituted by 20 independent sequences of 1,253 bp (the average sequence length for the real data). Then, we used ARLEQUIN v3 (Excoffier et al. 2005), as modified by Laval et al. (2010), to compute the summary statistics on these simulated samples. We assessed whether the observed statistics differed significantly from the constant population model under neutrality by comparing these statistics with their null distribution obtained from the simulated data. We used gamma-distributed mutation rates with a mean value of 2.5 × 10⁻⁸/generation/site (95% confidence interval: 1.476 × 10⁻⁸ to 4.036 × 10⁻⁸), in agreement with previous studies (Pluzhnikov et al. 2002; Voight et al. 2005). This procedure yielded a P value for the significance of the departure from a constant size model. We performed this procedure both on the whole sequences and on the largest non-recombining blocks. For the whole sequences (i.e., including recombination), we performed the data simulation under a coalescent model with recombination, using for each locus the recombination rate provided by the HapMap build GRCh37 genetic map (International HapMap Consortium 2003) (supplementary table S8, Supplementary Material online), whereas for the largest non-recombining blocks we used a coalescent model without recombination.
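The comparison of an observed multilocus mean with its simulated null distribution can be sketched as follows. This is an assumption-laden illustration rather than the modified ARLEQUIN procedure: the function names are hypothetical, the null values would in practice come from the coalescent simulations, and the text does not state whether the test is one- or two-sided, so both tail proportions are returned.

```python
from statistics import mean

def multilocus_mean(per_locus_values):
    """Mean of a summary statistic across loci (here, the 20 regions)."""
    return mean(per_locus_values)

def empirical_tail_p_values(observed, null_values):
    """Proportion of simulated values at least as extreme as `observed`
    in the lower and in the upper tail of the null distribution."""
    n = len(null_values)
    lower = sum(1 for v in null_values if v <= observed) / n
    upper = sum(1 for v in null_values if v >= observed) / n
    return lower, upper

# Toy numbers only: a strongly negative observed mean falls in the lower tail.
null = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(empirical_tail_p_values(multilocus_mean([-2.5, -1.5]), null))
```

A small lower-tail proportion would point toward an expansion signal, and a small upper-tail proportion toward a contraction signal.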

Both for autosomes and for HVS-I sequences, we adjusted the obtained P values for each neutrality test using a false discovery rate (FDR) correction (Benjamini and Hochberg 1995) in R v2.14.1 (R Development Core Team 2011), in order to take into account the increased error probability in the case of multiple testing.
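The Benjamini and Hochberg (1995) adjustment can be written in a few lines. The sketch below is a pure-Python illustration of the step-up adjustment (each P value is multiplied by m/rank, then made monotone from the largest P value downward); the function name is an assumption introduced here, and it is not the R code used in the study.

```python
def benjamini_hochberg(p_values):
    """Return FDR-adjusted P values, in the same order as the input."""
    m = len(p_values)
    # Visit P values from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # ascending rank: 1 = smallest P value
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```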

MCMC Estimations of Demographic Parameters

We used the MCMC algorithm implemented in BEAST v1.6 (Drummond and Rambaut 2007). We tested the four demographic models implemented in this software: constant effective population size (N0) (constant model), population expansion with an increasing growth rate (g) (exponential model), population expansion with a decreasing growth rate (g) (logistic model), and the expansion model, in which N0 is the present-day population size, N1 the population size that the model asymptotes to going into the distant past, and g the exponential growth rate that determines how fast the transition is from near the N1 population size to the N0 population size. In fact, BEAST estimates composite parameters for each model, namely N0μ and g/μ, where N0 is the current effective population size, g the growth rate, and μ the mutation rate. In addition, for the expansion model, the ratio between the current (N0) and ancestral (N1) effective population size is also estimated. To infer N0 and g from these composite parameters, we needed to assume a value for the mutation rate. However, there is no consensus for mutation rates in humans in the literature, as different methods lead to different estimations. For autosomes, the most commonly used value is the phylogenetic rate of μ = 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002). However, recent studies based on the 1000 Genomes Project (1000 Genomes Project Consortium 2012) have found a 2-fold lower rate (μ = 1.2 × 10⁻⁸/generation/site) by directly comparing genome-wide sequences from children and their parents (Conrad et al. 2011; Scally and Durbin 2012). We used both mutation rates here. Similarly, for HVS-I, estimated mutation rates are highly dependent on methodologies and modes of calibration (Endicott et al. 2009). We used both the lower and the higher estimated mutation rates: the transitional changes mutation rate of μ = 5 × 10⁻⁶/generation/site (Forster et al. 1996) and the pedigree-based rate of μ = 10⁻⁵/generation/site (Howell et al. 1996; Heyer et al. 2001). We used a general time-reversible substitution model (Rodriguez et al. 1990). We assumed a generation time of 25 years, permitting the comparison with previous human population genetics studies (e.g., Chaix et al. 2008; Patin et al. 2009; Laval et al. 2010). As BEAST cannot handle recombination events, we used the largest nonrecombining block within each sequence (discussed earlier).
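The size trajectories of three of these models can be sketched as functions of time t measured backward from the present. This is an illustration under assumptions, not BEAST's code: the function names are hypothetical, and the expansion form N1 + (N0 − N1)·exp(−g·t) is one parameterization consistent with the verbal description above. The logistic model is left out because its shape parameter is not described in the text.

```python
from math import exp

def constant_size(t, n0):
    """Constant model: the size is N0 at every time t."""
    return n0

def exponential_size(t, n0, g):
    """Exponential model: with g > 0 the population grows toward the
    present, so its size shrinks as t goes back in time."""
    return n0 * exp(-g * t)

def expansion_size(t, n0, n1, g):
    """Expansion model: the size moves from N0 (present) toward the
    ancestral size N1 as t goes to the distant past."""
    return n1 + (n0 - n1) * exp(-g * t)

# Hypothetical values: N0 = 10,000, N1 = 1,000, g = 0.01 per generation.
for t in (0, 100, 1000):
    print(t, exponential_size(t, 10000, 0.01), expansion_size(t, 10000, 1000, 0.01))
```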

We performed three runs of 10⁷ steps per population and per demographic model for the HVS-I sequence, and three runs of 2 × 10⁸ steps (which corresponded to three runs of 10⁷ steps per locus) for the autosomal sequences. We recorded one tree every 1,000 steps, which thus implied a total of 10⁵ trees per locus and per run. We then removed the first 10% of the steps of each run (burn-in period) and combined the runs to obtain acceptable effective sample sizes (ESSs of 100 or above). The convergence of these runs was assessed using two methods: visual inspection of traces using Tracer v1.5 (Rambaut and Drummond 2007) to check for concordance between runs, and computation of Gelman and Rubin's (1992) convergence diagnostic using R v2.14.1 (R Development Core Team 2011) with the function gelman.diag available in the package coda (Plummer et al. 2006).
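The burn-in removal and the basic Gelman and Rubin (1992) statistic can be sketched as below. This is a simplified illustration with hypothetical function names: it computes the uncorrected potential scale reduction factor from equal-length chains, whereas coda's gelman.diag applies further corrections, so the values need not match exactly.

```python
from math import sqrt
from statistics import mean, variance

def discard_burn_in(chain, fraction=0.10):
    """Drop the first `fraction` of the sampled states of one run."""
    return chain[int(len(chain) * fraction):]

def gelman_rubin(chains):
    """Basic potential scale reduction factor for m chains of length n.

    Values close to 1 suggest that the runs sample the same distribution.
    """
    m = len(chains)
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    grand_mean = mean(chain_means)
    # B: between-chain variance; W: mean within-chain variance
    b = n / (m - 1) * sum((cm - grand_mean) ** 2 for cm in chain_means)
    w = mean([variance(c) for c in chains])
    pooled = (n - 1) / n * w + b / n
    return sqrt(pooled / w)

# Two toy runs stuck in different regions give a value far above 1.
print(gelman_rubin([[0, 1, 0, 1], [10, 11, 10, 11]]))
```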

To facilitate a large exploration of the parameter space, for the autosomal sequences, we chose uniform priors between 0 and 0.05 for 2N0μ and between −10⁹ and 10⁹ for g/μ. For HVS-I sequences, we chose uniform priors for N0μ between 0 and 10 and for g/μ between −2.5 × 10⁶ and 2.5 × 10⁶, resulting in the same priors on N0 and g as for autosomal sequences if we assumed μ = 10⁻⁵/generation/site (i.e., N0 constrained between 0 and 10⁶ and g constrained between −1 and 1 per year). Conversely, if we assumed μ = 5 × 10⁻⁶/generation/site, it meant that N0 was constrained between 0 and 2 × 10⁶ and g was constrained between −0.5 and 0.5 per year.
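The arithmetic behind the HVS-I prior bounds can be checked with a short conversion from composite to natural parameters. The sketch assumes the composite parameters are N0μ and g/μ with g expressed per generation, and uses the 25-year generation time stated above; the function name is hypothetical.

```python
GENERATION_TIME_YEARS = 25

def natural_bounds(n0_mu_max, g_over_mu_max, mu):
    """Convert upper prior bounds on N0*mu and g/mu into bounds on N0
    and on g per year, for a given mutation rate per generation per site."""
    n0_max = n0_mu_max / mu
    g_max_per_generation = g_over_mu_max * mu
    return n0_max, g_max_per_generation / GENERATION_TIME_YEARS

# HVS-I priors: N0*mu up to 10 and g/mu up to 2.5e6.
print(natural_bounds(10, 2.5e6, 1e-5))   # pedigree-based rate
print(natural_bounds(10, 2.5e6, 5e-6))   # transitional changes rate
```

Up to floating-point rounding, the first call gives 10⁶ and 1 per year, and the second 2 × 10⁶ and 0.5 per year, matching the bounds quoted in the text.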

For each population and model, we obtained the mode and the 95% HPD of N0 and g, inferred from their posterior distributions (supplementary tables S4 and S7, Supplementary Material online), using the add-on package Locfit (Loader 1999) in R v2.14.1. We selected the best-fitting model among the four tested demographic models by estimating marginal likelihoods using two methods: path sampling and stepping-stone sampling (Baele et al. 2012). The model with the greater marginal likelihood (supplementary table S3, Supplementary Material online) was considered as the best-fitting model.

Extended Bayesian Skyline Plots

EBSPs (Heled and Drummond 2010), also implemented in BEAST, estimate demographic changes occurring continuously through time in a population, using the time intervals between successive coalescent events. This method allows a visualization of the evolution of Ne through time. As above, we combined three runs of 10⁷ steps for mitochondrial sequences and three runs of 2 × 10⁸ steps for autosomal sequences to obtain acceptable ESS values. We assumed the same mutation rates as above and a generation time of 25 years. Outputs were analyzed with Tracer v1.5 to visually check for convergence and ESS, and also to obtain the 95% HPD interval for the number of demographic changes that occurred in the population (supplementary table S5, Supplementary Material online). A constant population size could be rejected when the 95% HPD of the number of change points excluded 0 but included 1 (Heled and Drummond 2010). Then, we used R v2.14.1 to compute Gelman and Rubin's (1992) convergence diagnostic as above, as well as to compute skyline plots. Finally, we used the population growth curves generated from BEAST to assess the time at which populations began to expand. Each skyline plot consisted of smoothed data points at intervals of approximately 10–20 generations. We considered that the population increased (or decreased) when both the median and 95% HPD values for Ne increased (or decreased) between more than two successive data points. Although this method did not provide a 95% HPD interval for the inferred expansion timings, this conservative approach ensured that we considered only relevant expansion signals.
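One possible reading of the onset rule above is sketched here: scanning the skyline from the oldest to the most recent point, the onset is the first point of a run in which the median and both HPD bounds all increase across at least three successive points. The function name, the `min_points` threshold, and the toy numbers are assumptions made for illustration; this is not the authors' script.

```python
def expansion_onset(times_ybp, median, lower, upper, min_points=3):
    """Return the time (years before present) at which a sustained
    increase begins, or None if no such run exists.

    All lists are ordered from the oldest to the most recent point.
    """
    run_start = 0
    for k in range(1, len(times_ybp)):
        rising = (median[k] > median[k - 1]
                  and lower[k] > lower[k - 1]
                  and upper[k] > upper[k - 1])
        if not rising:
            run_start = k  # the run is broken; restart from this point
        elif k - run_start + 1 >= min_points:
            return times_ybp[run_start]
    return None

times = [50000, 40000, 30000, 20000, 10000, 0]
med = [100, 100, 120, 150, 200, 260]
low = [50, 50, 60, 80, 100, 130]
upp = [200, 200, 240, 300, 400, 500]
print(expansion_onset(times, med, low, upp))
```

With these toy values the curve is flat until 40,000 YBP and rises afterward, so that time is returned as the onset.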

Correlation Tests of Inferred Growth Rates and Isolation/Immigration Patterns

To test how differences in isolation degrees and migration patterns could have impacted our demographic inferences, we used ARLEQUIN v3.11 (Excoffier et al. 2005) to compute population-specific FST values (Weir and Hill 2002) from HVS-I data for Central Africa, Eurasia, and Central Asia (supplementary table S9, Supplementary Material online) and to estimate immigration rates from mismatch distributions under a spatially explicit model (Excoffier 2004) (supplementary table S10, Supplementary Material online). We then performed Spearman tests using R v2.14.1 to investigate, for each region, how the inferred parametric growth rates were correlated with those FST values and immigration rates. We used a value of 0 for the growth rate when the constant model best fitted the data.
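Because growth rates set to 0 for constant-model populations create ties, a rank correlation has to handle tied values. The sketch below computes Spearman's rho as the Pearson correlation of average ranks. It is a pure-Python illustration with hypothetical function names and toy inputs, it does not compute a P value, and it assumes that neither variable is constant.

```python
from math import sqrt
from statistics import mean

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation between two equal-length sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Toy values: growth rates (two set to 0) against population-specific FST.
print(spearman_rho([0.0, 0.0, 0.002, 0.004], [0.09, 0.07, 0.03, 0.01]))
```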

Supplementary Material

Supplementary figures S1 and S2 and tables S1–S12 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors would like to warmly thank all volunteer participants. They also thank Michael Fontaine for his help on some data analyses, Phillip Endicott and Julio Bendezu-Sarmiento for insightful discussions, Alexei Drummond for helpful discussions, and Friso Palstra for his help on English usage. They thank Laurent Excoffier, two anonymous reviewers, and the editor for helpful comments and suggestions. All computationally intensive analyses were run on the Linux cluster of the Muséum National d'Histoire Naturelle (administrated by Julio Pedraza) and on the web-based portal "Bioportal" (Kumar et al. 2009). This work was supported by Actions Transversales du Muséum - Muséum National d'Histoire Naturelle (grant "Les relations Sociétés-Natures dans le long terme") and Agence Nationale de la Recherche (grants "Altérité culturelle" ANR-10-ESVS-0010 and "Demochips" ANR-12-BSV7-0012). C.A. was financed by a PhD grant from the Centre National de la Recherche Scientifique.

References

1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.

Ambrose SH. 2001. Paleolithic technology and human evolution. Science 291:1748–1753.

Arenas M, Francois O, Currat M, Ray N, Excoffier L. 2013. Influence of admixture and Paleolithic range contractions on current European diversity gradients. Mol Biol Evol. 30:57–61.

Atkinson QD, Gray RD, Drummond AJ. 2009. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa. Proc R Soc Lond B Biol Sci. 276:367–373.

Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol. 29:2157–2167.

Balaresque PL, Ballereau SJ, Jobling MA. 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum Mol Genet. 16:R134–R139.

Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. 2011. Insights into the demographic history of African pygmies from complete mitochondrial genomes. Mol Biol Evol. 28:1099–1110.

Beaumont MA. 2004. Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92:365–379.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 57:289–300.

Bermisheva M, Salimova A, Korshunova T, Svyatova G, Berezina G, Villems R, Khusnutdinova E. 2002. Mitochondrial DNA diversity in the populations of Middle Asia and Northern Caucasus. Eur J Hum Genet. 10:179–180.

Bocquet-Appel JP. 2011. When the world's population took off: the springboard of the Neolithic Demographic Transition. Science 333:560–561.

Bocquet-Appel J-P, Bar-Yosef O. 2008. The Neolithic demographic transition and its consequences. Dordrecht (The Netherlands): Springer.


Chaix R, Austerlitz F, Hegay T, Quintana-Murci L, Heyer E. 2008. Genetic traces of east-to-west human expansion waves in Eurasia. Am J Phys Anthropol. 136:309–317.

Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, Heyer E. 2007. From social to genetic structures in central Asia. Curr Biol. 17:43–48.

Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. 2009. The Last Glacial Maximum. Science 325:710–714.

Conrad DF, Keebler JEM, DePristo MA, et al. (17 co-authors). 2011. Variation in genome-wide mutation rates within and between human families. Nat Genet. 43:712–714.

Derenko MV, Malyarchuk BA, Dambueva IK, Shaikhaev GO, Dorzhu CM, Nimaev DD, Zakharov IA. 2000. Mitochondrial DNA variation in two South Siberian aboriginal populations: implications for the genetic history of North Asia. Hum Biol. 72:945–973.

Dirksen VG, van Geel B. 2004. Mid to late Holocene climate change and its influence on cultural development in South Central Siberia. Nato Sci S Ss IV Ear. 42:291–307.

Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 7:214.

Endicott P, Ho SYW, Metspalu M, Stringer C. 2009. Evaluating the mitochondrial timescale of human evolution. Trends Ecol Evol. 24:515–521.

Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol. 13:853–864.

Excoffier L, Heckel G. 2006. Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet. 7:745–758.

Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online. 1:47–50.

Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A. 109:E2569–E2576.

Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet. 59:935–945.

Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.

Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.

Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci. 7:457–511.

Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol. 27:570–580.

Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Segurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet. 10:49.

Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol. 21:597–612.

Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet. 63:1113–1126.

Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res. 11:423–434.

Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet. 59:501–509.

Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.

Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med. 116:68–73.

International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.

Kingman JFC. 1982. The coalescent. Stoch Proc Appl. 13:235–248.

Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet. 73:671–676.

Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet. 2:e53.

Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.

Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.

Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.

Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.

Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A. 107:19201–19206.

Loader C. 1999. Local regression and likelihood. New York: Springer.

Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A. 103:9381–9386.

Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.

Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.

Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human-populations on a global scale. Mol Biol Evol. 10:927–943.

Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.

Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet. 29:20–21.

Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet. 6:165–183.

Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev. 16:1125–1133.

Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet. 5:e1000448.

Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.

Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.

Plummer M, Best N, Cowles K, Vines K. 2006. CODA: Convergence diagnosis and output analysis for MCMC. R News 6:7–11.

Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.

Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet. 74:827–845.

Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A. 105:1596–1601.


R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.

Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer/

Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol. 19:2092–2100.

Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol. 20:76–86.

Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet. 67:1251–1276.

Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol. 142:485–501.

Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.

Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet. 13:745–53.

Segurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet. 21:1146–1151.

Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.

Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet. 25:207–217.

Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

Tremblay M, Vezina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet. 66:651–658.

Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol. 21:559–566.

Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol. 19:312–318.

Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol. 30:918–937.

Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A. 102:18508–18513.

Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet. 36:721–750.

Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.

Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet. 70:635–651.


best fitted the data for the two East-African farmer populations (namely Chagga and Mozambicans), whereas the "exponential model" best fitted the data for all West-African farmer populations (Akele, Ngumba, and Yoruba), with positive growth rates in all cases (supplementary table S4, Supplementary Material online). Conversely, no signals of expansion were found for hunter-gatherer populations, as the "constant model" always best fitted the data (supplementary tables S3 and S4, Supplementary Material online). Consistently, EBSPs showed signals of expansions for farmer populations (fig. 1A): 95% highest probability density (HPD) intervals for the estimated number of demographic changes did not include 0, indicating at least one significant change in population size (supplementary table S5, Supplementary Material online). Conversely, we found no evidence of population size changes for hunter-gatherers (fig. 1B and supplementary table S5, Supplementary Material online). We further dated the onset of farmer expansions from at least 62,275 YBP (assuming μ = 2.5 × 10⁻⁸/generation/site) or 124,550 YBP (assuming μ = 1.2 × 10⁻⁸/generation/site) for Mozambicans to 7,975 or 15,950 YBP for Yoruba. Visual examination of the 95% HPD intervals showed that the expansion event inferred for the Mozambican population was significantly older than those inferred for the other populations (supplementary table S6, Supplementary Material online).

We found similar results for the HVS-I sequences from Central Africa (supplementary tables S3 and S7, Supplementary Material online). Indeed, the exponential model with positive growth rates best fitted the data for all farmer populations, indicating expansion events. Conversely, the exponential model with negative modal values for growth rate (i.e., contraction event) provided the best fit for all hunter-gatherer populations. However, as the 95% HPD intervals for growth rates included 0, we could not conclude that any significant contraction events occurred in these populations. Similarly, EBSPs indicated a significant expansion event for all farmer populations (fig. 2A and supplementary table S5, Supplementary Material online), whereas we found no evidence of population size changes for hunter-gatherers (fig. 2B and supplementary table S5, Supplementary Material online). We dated the expansions of farmer populations from 31,350 or 62,700 YBP (assuming μ = 10⁻⁵ or 5 × 10⁻⁶/generation/site, respectively) to 45,319 or 90,638 YBP (supplementary table S6, Supplementary Material online).

Finally, both with autosomes and HVS-I, all hunter-gatherer populations had lower current effective population size (N0) values than farmer populations (supplementary tables S4 and S7, Supplementary Material online). Furthermore, the inferred expansion onsets for all farmer populations largely predated the emergence of farming in Central Africa (5,000–4,000 YBP; Bocquet-Appel and Bar-Yosef 2008) (figs. 3 and 4).

Eurasia: Contrasting Demographic Patterns for Farmer Populations with Autosomes and Stronger Pre-Neolithic Expansions for Farmers Than Herders with HVS-I

The coalescent-based analyses of autosomes in East-Asian and European populations showed contrasting demographic patterns across sedentary populations (supplementary tables S3 and S4, Supplementary Material online). Using the parametric BEAST analysis, the expansion model best fitted the data for Han Chinese, indicating an expansion event. Conversely, we inferred that Japanese and Danes either underwent a contraction event or remained at constant size. Indeed, the exponential model with negative growth rates best fitted the data for these two populations, but the 95% HPD intervals also included g = 0. The constant model best fitted the data for the Chuvash, a traditionally nomadic population. EBSPs showed a significant expansion event for the Han population: the value of 0 was not included in the 95% HPD interval of the number of demographic changes (supplementary table S5, Supplementary Material online). This expansion started at least 36,025 or 72,050 YBP (fig. 1C and supplementary table S6, Supplementary Material online), clearly predating the emergence of farming in East Asia about 9,000 YBP (Bocquet-Appel and Bar-Yosef 2008) (fig. 3). Japanese showed a significant contraction event (i.e., the value of 0 was not included in the 95% HPD interval of the number of demographic changes; supplementary table S5, Supplementary Material online) starting at least 21,350 or 42,700 YBP. Danes also showed a significant contraction event, starting at least 26,440 or 52,880 YBP (fig. 1C and supplementary tables S5 and S6, Supplementary Material online). EBSP analyses showed no significant demographic changes for the Chuvash (fig. 1D and supplementary table S5, Supplementary Material online).

Table 1. Summary Statistics and Neutrality Tests Computed from the Whole Autosomal Sequences.

Population     Area      Lifestyle          S(a)    K(b)    Tajima's D(c)   Fu & Li's D(c)   Fu & Li's F(c)   Fu's Fs(c)
Akele          Africa    Sedentary farmers  6.95    6.45    0.35            0.55             0.57             1.12
Chagga         Africa    Sedentary farmers  8.65    7.95    0.48            0.70             0.74             1.27
Mozambicans    Africa    Sedentary farmers  8.80    9.55    0.62            1.15             1.15             3.33
Ngumba         Africa    Sedentary farmers  7.05    6.20    0.20            0.41             0.41             0.68
Yoruba         Africa    Sedentary farmers  7.50    7.15    0.14            0.03             0.03             0.73
Aka            Africa    Nomadic HG(d)      6.95    6.60    0.12            0.34             0.32             0.30
G. Baka        Africa    Nomadic HG         6.30    6.00    0.008           0.17             0.14             0.33
S. Baka        Africa    Nomadic HG         6.10    5.6     0.17            0.05             0.10             0.03
Kola           Africa    Nomadic HG         6.55    6.25    0.14            0.03             0.08             0.75
Mbuti          Africa    Nomadic HG         6.60    6.10    0.25            0.35             0.37             0.16
Danes          Eurasia   Sedentary farmers  5.50    4.85    0.30            0.16             0.24             0.73
Han            Eurasia   Sedentary farmers  5.20    4.70    0.03            0.01             0.02             0.21
Japanese       Eurasia   Sedentary farmers  4.20    3.85    0.45            0.22             0.34             1.06
Chuvash        Eurasia   Nomadic herders    5.70    5.05    0.09            0.11             0.12             0.34
Tajiks (TAB)   C. Asia   Sedentary farmers  9.00    9.00    0.19            0.03             0.10             0.24
Kyrgyz (KIB)   C. Asia   Nomadic herders    10.4    10.40   0.11            0.08             0.11             0.23

NOTE: Values significantly higher than expected for a constant population size model are italicized, whereas significantly lower values are underlined.
(a) Number of polymorphisms.
(b) Number of haplotypes.
(c) We report the means over the 20 regions.
(d) HG = Hunter-gatherers.
Significance levels: *P < 0.05; **P < 0.01, after FDR correction for multiple testing (Benjamini and Hochberg 1995).

FIG. 1. EBSPs inferred from autosomal sequences in African sedentary farmers (A), African nomadic hunter-gatherers (B), Eurasian sedentary farmers (C), and Eurasian nomadic herders (D). The values indicated in bold on the axes are obtained assuming a mutation rate of μ = 1.2 × 10⁻⁸/generation/site (measured from parents–children trios by Conrad et al. 2011), and the other values correspond to μ = 2.5 × 10⁻⁸/generation/site (derived from the human–chimpanzee sequence divergence by Pluzhnikov et al. 2002). Although time was expressed in generations for the analyses, we represented time in years here, assuming a generation time of 25 years. Time is represented backward on the x axis, from the present on the left to the most distant past on the right. The 95% lower and upper HPD are represented by dashed lines. Populations for which the estimated number of demographic changes includes 0 (i.e., no significant signal of expansion or decline) are represented in light gray.


FIG. 2. EBSPs inferred from HVS-I sequences in African sedentary farmers (A), African nomadic hunter-gatherers (B), Eurasian sedentary farmers (C), Eurasian nomadic herders (D), Central Asian sedentary farmers (E), and Central Asian nomadic herders (F). The values indicated in bold are obtained assuming a mutation rate of μ = 5 × 10⁻⁶/generation/site (transitional changes rate; Forster et al. 1996), and the others correspond to μ = 10⁻⁵/generation/site (pedigree-based; Howell et al. 1996; Heyer et al. 2001). Time is represented in years, assuming a generation time of 25 years. It is represented backward on the x axis, from the present on the left to the most distant past on the right. The 95% lower and upper HPD are represented by dashed lines. Populations for which the estimated number of demographic changes includes 0 (i.e., no significant signal of expansion or decline) are represented in light gray and the others in black.


For the HVS-I sequences from Eurasia (supplementary tables S7 and S8, Supplementary Material online), the parametric BEAST analyses showed that models consistent with an increase in population size (expansion model or exponential model with positive growth rates) best fitted the data for all sedentary populations, except Koreans. Conversely, the constant model best fitted the data for all nomadic populations, as well as Koreans. EBSPs showed, however, significant expansion events for both farmers and herders, but not for Koreans (fig. 2C and D and supplementary table S5, Supplementary Material online). Nevertheless, there was a tendency toward stronger expansion rates and higher Ne values in sedentary than in nomadic populations (fig. 2C and D), although the 95% HPD intervals for Ne were quite large for sedentary populations. The estimated expansion onset times inferred from the EBSPs (supplementary table S6, Supplementary Material online) followed an east-to-west gradient: they appeared more ancient in Eastern populations, in both sedentary and nomadic populations (supplementary fig. S1, Supplementary Material online). They also clearly predated the Neolithic transition in all geographic areas (fig. 4).

The Central Asian Exception: Similar Demographic Patterns in Farmers and Herders

For autosomes, the constant model best fitted the data for both sedentary farmers (TAB) and traditionally nomadic herders (KIB) (supplementary tables S3 and S4, Supplementary Material online). EBSPs also showed no significant demographic changes for these populations (figs. 1C and D and supplementary table S5, Supplementary Material online).

For HVS-I (supplementary tables S3 and S7, Supplementary Material online), the exponential model best fitted the data for six of the 12 sedentary farmer populations (including TAB), whereas the constant model was preferred for the other farmers. Unlike in the rest of Eurasia, a model indicating expansion (the exponential model with positive growth rates) was also selected for all nomadic herders. Moreover, EBSPs showed significant expansion signals for both herder and farmer populations, except TJY (Yagnobs from Dushanbe), since at least 13,860 YBP (or 27,720 YBP) for farmers and 16,546 YBP (or 33,092 YBP) for herders on average (fig. 2E and F and supplementary tables S5 and S6, Supplementary Material online). Again, these inferred expansion onsets predated the emergence of farming in the area about 8,000 YBP (Bocquet-Appel and Bar-Yosef 2008) (fig. 4). Inferred expansions for Central Asian sedentary farmers seemed overall weaker (i.e., lower growth rate and lower Ne) than those observed for other sedentary populations in Eurasia, although we observed important variations in growth rates and Ne among populations, and large 95% HPD intervals for some of them (fig. 2C and E).

Degrees of Isolation and Migration Patterns

African farmer populations appeared less isolated and received more migrants than hunter-gatherer populations. Indeed, the population-specific FST values (supplementary table S9, Supplementary Material online) were on average significantly lower for farmers than for hunter-gatherers (mean[farmers] = 0.058, mean[HG] = 0.192, Wilcoxon two-sided test, P value = 0.0002). Moreover, the estimated number of immigrants was significantly higher for sedentary farmers than for nomadic hunter-gatherers (mean[farmers] = 314, mean[HG] = 221, P value = 0.0001) (supplementary table S10, Supplementary Material online). For hunter-gatherers, the FST values were negatively correlated with the negative growth rates that we inferred from the parametric method (ρ = -0.893, P value = 0.012) (fig. 5B), meaning that less isolated populations showed weaker contraction events (i.e., less negative growth rates). Conversely, there was no significant correlation between FST values and inferred growth rates for sedentary farmers (ρ = 0.433, P value = 0.249) (fig. 5A). However, we found a significant positive correlation between the number of immigrants and the inferred growth rates (supplementary fig. S2, Supplementary Material online) among sedentary farmer populations (ρ = 0.867, P value = 0.004), but not among nomadic hunter-gatherers (ρ = 0.536, P value = 0.235).

For Eurasia, we found no significant difference in FST values between farmers and herders (mean[farmers] = 0.039, mean[herders] = 0.043, P value = 0.77; supplementary table S9, Supplementary Material online), except in Central Asia, for which we found significantly lower FST values for nomadic herders than for sedentary farmers (mean[farmers] = 0.018, mean[herders] = 0.008, P value = 0.017). We report a significant

FIG. 3. Comparison of estimated times for expansion onsets using autosomes and dating of the first archeological traces of farming in Africa and China. Time is represented backward (in YBP). Only populations for which the EBSP analysis showed a significant expansion event are represented. We reported the time values estimated with the highest mutation rate that we used for the autosomes (μ = 2.5 × 10⁻⁸/generation/site). Thus, these time values can be considered as a lower bound for the expansion onsets. The dates for the emergence of farming come from the review by Bocquet-Appel and Bar-Yosef (2008). They are based on archeological remains.


FIG. 4. Comparison of estimated times for expansion onsets using HVS-I and dating of the first archeological traces of farming or herding in Central Africa (A), Eurasia (B), and Central Asia (C). Time is represented backward (in YBP). Only populations for which the EBSP analysis showed a significant expansion event are represented. We reported the time values estimated with the highest mutation rate that we used for the HVS-I sequences (μ = 10⁻⁵/generation/site). Thus, these time values can be considered as a lower bound for the expansion onsets. The dates for the emergence of farming come from the review by Bocquet-Appel and Bar-Yosef (2008). They are based on archeological remains.


FIG. 5. Correlations between population-specific FST values and inferred growth rates in African farmer (A) and hunter-gatherer (B) populations, Eurasian farmer populations (C), and Central Asian farmer (D) and herder populations (E). Population-specific FST values were computed with ARLEQUIN v3.11 (Excoffier et al. 2005). The growth rates were inferred under the best-fitting model from the parametric method using BEAST (Drummond and Rambaut 2007). When the best-fitting model was the constant model, we assumed a growth rate of 0. Note that we did not represent Eurasian herder populations, as the constant model best fitted the data for all of them. Plots and correlation tests were performed using R v2.14.1 (R Development Core Team 2011).


negative correlation between FST values and inferred growth rates for sedentary farmers in Eurasia (ρ = -0.673, P value = 0.028) (fig. 5C) and Central Asia (ρ = -0.773, P value = 0.003) (fig. 5D), meaning that less isolated populations showed higher inferred growth rates. There was no significant correlation for Central Asian herders (ρ = -0.092, P value = 0.736) (fig. 5E). Note that this analysis could not be performed for the other Eurasian herder populations, as the constant model best fitted the data with the parametric method.

The estimation of the proportion of immigrants did not converge for 11 Eurasian populations (Han Chinese, Liaoning, Qingdao, Palestinians, Pathans, Mongols, as well as three Central Asian farmer populations and two Central Asian herder populations). For the other populations, we found no significant difference in the proportion of immigrants between farmers and herders, both in Central Asia (mean[farmers] = 82868, mean[herders] = 260265, P value = 0.12) and in the rest of Eurasia (mean[farmers] = 576757, mean[herders] = 201311, P value = 0.51) (supplementary table S10, Supplementary Material online). We also found no significant correlation between this proportion and the inferred growth rates for Eurasian farmers (ρ = -0.238, P value = 0.48), Central Asian farmers (ρ = 0.386, P value = 0.30), and Central Asian herders (ρ = -0.42, P value = 0.139) (supplementary fig. S2, Supplementary Material online).

Discussion

In this study, using a large set of populations from distant geographic areas, we report contrasted demographic histories that correlate with lifestyle. Moreover, the inferred expansion signals in both African and Eurasian farmer and herder populations predated the Neolithic transition and the sedentarization of these populations.

Contrasted Demographic Histories in Sedentary and Nomadic Populations

For Africa, both mtDNA and autosomal data revealed expansion patterns in most sedentary farmer populations, as indicated by neutrality tests and by the parametric and nonparametric BEAST methods. Conversely, we found constant effective population sizes (or possibly contraction events) for all hunter-gatherer populations. Among the farmers, results were least clear for the Yoruba and the Ewondo populations, as no neutrality test was significant for these populations, whereas they showed evidence of expansion events when analyzed with BEAST. This indicates that these populations may have undergone weaker expansion dynamics (i.e., lower growth rates and Ne) than the others. These remarkable results are of particular importance for the Yoruba, as it is a reference population in many databases (HapMap, 1000 Genomes). This also demonstrates the higher sensitivity of MCMC methods such as BEAST to detect expansions in comparison to neutrality tests.

The contrasted patterns inferred between sedentary and nomadic populations in Africa suggest strong differences between the demographic histories of these two groups of populations. The question is whether this pattern results mostly from differences in local expansion dynamics or whether spatial expansion processes at a larger scale were also involved. As shown by Ray et al. (2003), negative values for the neutrality tests will be observed in a spatial expansion process if the rate of migrants (Nm) is high enough (at least 20), but not otherwise. As in previous studies (e.g., Verdu et al. 2013), we report a higher degree of isolation (higher population-specific FST values) in hunter-gatherer populations than in farmer populations. Using the spatial expansion model of Excoffier (2004) also leads to higher estimates of the number of immigrants into farmer populations. Thus, both farmers and hunter-gatherers may have been subject to a spatial expansion process, but the limited number of migrants among hunter-gatherers may have resulted in an absence of expansion signals for them. This would be consistent with the positive correlation that we observe between the growth rates estimated with BEAST and the inferred number of immigrants in the sedentary farmer populations. However, this spatial expansion process seems unlikely to completely explain the strong association that we observed between lifestyle and expansion patterns, as some farmer populations (Teke, Gabonese Fang) displayed FST values similar to those of hunter-gatherers but a clear signal of expansion with relatively high growth rates. This suggests that even rather isolated farmer populations show substantial levels of expansion. Moreover, FST values and inferred growth rates in farmer populations were not significantly correlated. Therefore, our results suggest that the expansion patterns observed in sedentary populations do not result only from a spatial expansion pattern. In addition, local dynamics connected with the higher capacity of food production by farmers also explain their much stronger expansion signatures relative to their neighboring hunter-gatherer populations.

For Eurasia, when considering the mtDNA data, all three methods (neutrality tests, parametric BEAST analyses, and EBSPs) yielded expansion signals for all sedentary farmer populations, except Koreans. Conversely, only EBSPs and Fu's Fs test showed expansion signals for nomadic herders, but neither the parametric BEAST method nor the other neutrality tests did. This result points toward weaker expansion dynamics in herders than in farmers, as also supported by the tendency for lower growth rates and Ne in herder populations than in farmer populations on the EBSP graphs (fig. 2C and D). It thus seems that the flexibility and nonparametric nature of EBSP analyses allow one to detect weaker expansion events than the parametric method. Moreover, Fu's Fs is known to be more sensitive than the other neutrality tests for detecting expansions (Ramos-Onsins and Rozas 2002). Again, these inferred expansions may result, at least in part, from spatial expansion processes. The population-specific FST values are indeed rather low in Eurasia. Moreover, we found a significant negative correlation between FST values and inferred growth rates for the sedentary farmers, indicating that less isolated populations showed stronger expansion signals. However, although we inferred much stronger expansion patterns for the farmers than for the herders, we did not observe any differences in Eurasia between the farmers and the herders in the population-specific FST values or in the estimated number of immigrants, suggesting that spatial processes alone cannot explain the strong difference that we observed between the expansion patterns of these two groups of populations. This indicates that the intrinsic demographic growth patterns differ between these two kinds of populations, the farmers showing much higher growth rates than the herders.

To our knowledge, although other studies have found different patterns between hunter-gatherers and farmers (e.g., Verdu et al. 2009), our study is the first to show differences between farmers and herders, the two major post-Neolithic human groups. A plausible explanation could be that nomadic herders and hunter-gatherers share several of the constraints of a nomadic way of life. For instance, birth intervals are generally longer (at least 4 years) in nomadic populations than in sedentary populations (e.g., Short 1982). According to Bocquet-Appel (2011), these longer birth intervals may be mainly determined by diet differences. Indeed, Valeggia and Ellison (2009) demonstrated that birth interval is mainly determined by the rapidity of postpartum energy recovery, which may be increased by the consumption of high-carbohydrate food (like cereals). Moreover, the nomadic herder way of life may offer less food security than sedentary farming, the latter facilitating efficient long-term food storage.

However, unlike in Africa, we did not find systematically consistent patterns between the autosomal and mtDNA data in Eurasia. The possible contraction events that our results suggest for two sedentary populations (Japanese and Danes) with autosomes appeared concomitant with historical events that could have led to bottleneck processes. For the Japanese population, this contraction signal could indeed result from a founder effect due to the Paleolithic colonization of Japan by a subset of the Northern Asiatic people (especially from Korea; Nei 1995). Similarly, a bottleneck process may also have occurred in the Danish population, linked with the last glacial maximum occurring between 26,500 and 19,500 YBP (Clark et al. 2009). Reasons why these processes impacted the autosomes but not the mtDNA data remain to be determined, for instance through simulation studies. In any case, our study clearly emphasizes the utility of combining mtDNA and autosomal sequences, as they allow access to different aspects of human history. A recent study on harbor porpoises has similarly shown that nuclear markers were sensitive to a recent contraction event, whereas mtDNA allowed inferring a more ancient expansion (Fontaine et al. 2012).

Interestingly, Central Asia displayed a distinct pattern from the rest of Eurasia. Indeed, we did not infer higher expansion rates for sedentary farmers than for nomadic herders in that area. This could result from harsh local environmental conditions due to the arid continental climate in this area. Indeed, using pollen records, Dirksen and van Geel (2004) showed that the paleoclimate in Central Asia was very arid from at least 12,000 to 3,000 YBP, which could have limited the amount of suitable areas for farming and impacted human demography. Spatial expansion processes may also have played a role in this difference, as population-specific FST values were higher for the farmers than for the herders. This may indicate that more migrants were involved in the spatial expansion process for the herders than for the farmers, yielding a weaker expansion signal (i.e., lower inferred growth rate) for the latter (Ray et al. 2003). This is supported by the negative correlation between the FST values and the inferred growth rates in the farmer populations. The Korean population also stood out as an exception in Eurasia. Even though it is a population of sedentary farmers, it showed no significant expansion signal with either the parametric or the nonparametric method with HVS-I. This could be explained by a later sedentarization of this population. The Korean Neolithic is notably defined by the introduction of Jeulmun ware ceramics about 8,000 YBP, but the people of the Jeulmun period were still predominantly semi-nomadic fishers and hunter-gatherers until about 3,000 YBP, when Koreans started an intensive crop production implying a sedentary lifestyle (Nelson 1993).

Inferred Expansion Signals Predate the Emergence of Farming

EBSP analyses revealed that the inferred expansion events in farmer and herder populations were more ancient than the emergence of farming and herding. Therefore, the differences in demographic patterns between farmers and herders seem to predate their divergence in lifestyle, which raises the question of the chronology of demographic expansions and the Neolithic transition. These findings appear to be quite robust to the choice of the scaling parameters. We used here both the lower and the higher mutation rate estimates in humans for autosomes (Pluzhnikov et al. 2002; Conrad et al. 2011) and for the HVS-I sequence (Forster et al. 1996; Howell et al. 1996). Despite this uncertainty in mutation rates, which leads to a 2-fold uncertainty in our time estimates, the inferred expansion signals predated the emergence of agriculture in both cases for all populations. Similarly, using a generation time of 29 years (Tremblay and Vézina 2000) instead of 25 years leads to slightly more ancient estimates and thus does not change our conclusions (data not shown). However, note that for HVS-I, using the higher bound of the credibility interval for the highest estimated mutation rate (2.75 × 10⁻⁵/generation/site; Heyer et al. 2001) instead of the mean value (i.e., 10⁻⁵/generation/site) leads to expansion time estimates consistent with the Neolithic transition in Eurasian populations (supplementary table S11, Supplementary Material online). Nevertheless, these estimates still clearly predated the Neolithic for the African populations. However, 10⁻⁵/generation/site is by far the highest estimate of the mutation rate in the literature (Howell et al. 1996). To infer Neolithic expansions in most Eurasian populations, one needs to assume a mutation rate of at least 2 × 10⁻⁵/generation/site, which is much higher than other estimates from the literature and thus probably unrealistic. Moreover, our method for determining the expansion onset time using the EBSP graph is very conservative and also tends to favor the lower bound of expansion onset times. Finally, for autosomes, similarly using 4.74 × 10⁻⁸/generation/site instead of 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002) leads to an inferred expansion onset time that is not compatible with the Neolithic transition for all Eurasian and African populations, except for one African population, the Yoruba (supplementary table S12, Supplementary Material online). Consequently, it seems very likely that the expansions inferred in this study correspond to Paleolithic rather than Neolithic demographic events, in agreement also with most previous studies, as detailed later.
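The sensitivity of the dating to the scaling parameters can be illustrated with a minimal sketch. It assumes that a time estimate in generations is inversely proportional to the per-generation mutation rate and that years are obtained by multiplying generations by a fixed generation time; the function name `rescale_time` is hypothetical and is not part of BEAST or of the analyses reported here.

```python
def rescale_time(years, old_rate, new_rate, old_gen_time=25.0, new_gen_time=25.0):
    """Rescale a dated event to another mutation rate and generation time.

    Assumption (illustrative only): time in generations scales as 1 / mutation
    rate, and years = generations * generation time.
    """
    generations = years / old_gen_time
    return generations * (old_rate / new_rate) * new_gen_time


# Average onset for Central Asian farmers with HVS-I, as quoted in the text:
# 13,860 YBP under the pedigree-based rate of 1e-5 /generation/site.
onset = 13860.0
transitional = rescale_time(onset, 1e-5, 5e-6)        # halving the rate doubles the time
upper_bound = rescale_time(onset, 1e-5, 2.75e-5)      # upper bound of the pedigree rate
longer_generation = rescale_time(onset, 1e-5, 1e-5, new_gen_time=29.0)
print(round(transitional), round(upper_bound), round(longer_generation, 1))
```

Under these assumptions, halving the mutation rate doubles the time estimate, which matches the pairs of dates reported for each population, whereas a longer generation time only stretches the estimates proportionally.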

In Africa, the emergence of agriculture has been dated between 5,000 and 4,000 YBP in the Western part of Central Africa, and it subsequently rapidly expanded to the rest of sub-Saharan Africa (Phillipson 1993). However, using HVS-I, we showed expansion events in farmer populations since about 30,000 or 60,000 YBP, thus largely predating the emergence of agriculture in the area. Similarly, using autosomes, especially in Eastern African populations, we inferred expansion signals that clearly predated the Neolithic. Notably, we inferred an expansion signal for Mozambicans since at least 80,000 YBP. Several genetic studies have already highlighted that expansion events occurred in African farmers before the Neolithic transition (e.g., Atkinson et al. 2009; Laval et al. 2010; Batini et al. 2011). This finding is also consistent with paleoanthropological data (i.e., radiocarbon dating) suggesting an expansion event in Africa 60,000–80,000 YBP (Mellars 2006a). This Paleolithic demographic expansion could be linked to a rapid environmental change toward a drier climate (Partridge et al. 1997) and/or to the emergence of new hunting technologies (Mellars 2006a). According to Mellars (2006a), this period corresponds to a major increase in the complexity of the technological, economic, social, and cognitive behavior of certain African groups. It corresponds in particular to the emergence of projectile technologies (Shea 2009), which was probably part of a broader pattern of ecological diversification of early Homo sapiens populations. These changes could have been decisive for the human spread "Out of Africa" during the same period and could have ultimately also led to the sedentarization of the remaining populations. This inference is consistent with Sauer's (1952) demographic theory, which stated that late Paleolithic demographic expansions could have favored the sedentarization and the emergence of agriculture in some human populations. In the case of Central Africa, the period of 60,000 YBP corresponds to the separation between the ancestors of hunter-gatherers and farmers (Patin et al. 2009; Verdu et al. 2009). Thus, these two groups may have presented contrasting demographic patterns since their divergence. Much later, higher expansion rates and larger population sizes among farmers' ancestors may have induced the emergence of agriculture and sedentarization.

With respect to Eurasia, the expansion profiles inferred with HVS-I for all populations and with autosomes for the Han Chinese population also seem to have begun during the Paleolithic, thus before the Neolithic transition. Some genetic studies have already reported pre-Neolithic expansions in Asia and Europe (e.g., Chaix et al. 2008). Notably, using mismatch and intermatch distributions, Chaix et al. (2008) showed an east-to-west Paleolithic expansion wave in Eurasia. We found a similar pattern here, as the inferred expansions of East-Asian populations were earlier than those of Central Asian populations, themselves earlier than those of European populations. Moreover, we found this pattern in both sedentary farmer and nomadic herder populations. Thus, the ancestors of currently nomadic herder populations also experienced these Paleolithic expansions. However, Paleolithic expansion signals in nomadic populations seem weaker than in sedentary populations. This is again compatible with the demographic theory of the Neolithic sedentarization (Sauer 1952): some populations may have experienced more intense Paleolithic expansions, which may have led ultimately to their sedentarization.

The inferred Paleolithic expansion signals might result partly from spatial expansions out of some refuge areas after the Last Glacial Maximum (LGM, 26,500–19,500 YBP; Clark et al. 2009), as this time interval matches our inferred dating for expansion onsets in East Asia with HVS-I using the pedigree-based mutation rate, and in Europe and the Middle East using the transitional mutation rate. Some of the earlier date estimates might also be consistent with the out-of-Africa expansion of H. sapiens. However, the radiocarbon-based time estimates of the spread of H. sapiens in Eurasia are generally more ancient than our inferred expansion onset timings. For instance, Mellars (2006b) dated the colonization of the Middle East by H. sapiens at 47,000–49,000 YBP and of Europe at 41,000–42,000 YBP. Pavlov et al. (2001) report traces of modern human occupation nearly 40,000 years old in Siberia. Finally, Liu et al. (2010) described modern human fossils from South China dated to at least 60,000 YBP. Moreover, out-of-Africa or post-LGM expansions would not explain our finding of an east-to-west gradient of expansion onset timing, which rather supports the hypothesis of a demographic expansion diffused from east to west in Eurasia in a demic (i.e., migrations of individuals) or cultural (favored by the diffusion of new technologies) manner.

Possible Confounding Factors

Our approach makes the assumption that populations are isolated and panmictic, which is questionable for human populations. However, we analyzed a large set of populations sampled in very distant geographical regions (i.e., Central Africa, East Africa, Europe, Middle East, Central Asia, Pamir, Siberia, and East Asia). The main conclusions of this study rely on consistent patterns between most of these areas, and it seems unlikely that processes such as admixture could have biased the estimates similarly everywhere. Moreover, in Central Africa, several studies have shown that hunter-gatherer populations show signals of admixture, whereas this is not the case for farmer populations (Patin et al. 2009; Verdu et al. 2009, 2013). If this introgression had been strong enough, it may have yielded a spurious expansion signal in the hunter-gatherer populations, which is not what we observed here. In Europe, spatial expansion processes during the Neolithic may have led to admixture with Paleolithic populations. As pointed out by a simulation study (Arenas et al. 2013), this may lead to a predominance of the Paleolithic gene pool. This may be one of the factors explaining why we observed mostly Paleolithic expansions here.


Similarly, potential selection occurring on the whole mitochondrial genome (e.g., Pakendorf and Stoneking 2005) seems unlikely to have impacted in the same way all the studied populations within each group (e.g., stronger positive selection on sedentary than on nomadic populations), as we analyzed different nomadic and sedentary populations living near each other in several geographically distant areas.

Regarding the potential effects of recombination on the inferences from autosomal data, we found that neutrality tests gave similar results on the whole sequences, when using a simulation procedure that took the known recombination rate of each sequence into account (table 1), and on the largest non-recombining blocks as inferred with IMgc, without taking recombination into account in the simulation process (supplementary table S12, Supplementary Material online). It thus appears unlikely that the BEAST analyses, which can only handle the largest inferred non-recombining blocks, are biased because of this.

Finally, note that the effective population sizes inferred using BEAST correspond to the Ne of the populations during their recent history, rather than a value of Ne averaged over the history of the population. This explains why, for most populations, we inferred Ne estimates much higher than those generally assumed for humans by population geneticists (about 10,000).

Material and Methods

Genetic Markers

Autosomal Sequences

We used data from 20 noncoding, a priori neutral, and unlinked autosomal regions selected by Patin et al. (2009) to be at least 200 kb away from any known or predicted gene, to be in linkage disequilibrium (LD) neither with each other nor with any known or predicted gene, and to have a region of homology with the chimpanzee genome. These regions are on average 1,253 bp long. Using the four-gamete test (Hudson and Kaplan 1985) as implemented in IMgc online (Woerner et al. 2007), we identified recombination events for 6 of these 20 regions. As some methods used in this study cannot handle recombination, we retained for these six sequences the largest non-recombining block inferred by IMgc. Because of this reduction, the 20 regions used were on average 1,228 bp long. To identify potential bias related to this method (e.g., some recombination events may not be detected using the four-gamete test; larger blocks of non-recombining sequence may select for gene trees that are shorter than expected), we computed the summary statistics and performed neutrality tests (discussed later) both on the whole sequences (table 1) and on the largest non-recombining blocks (supplementary table S12, Supplementary Material online).
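The principle of the four-gamete test can be sketched as follows. This is a minimal illustration, not the IMgc algorithm: it assumes biallelic sites coded as "0" and "1", haplotypes of equal length, and the infinite-sites model, under which observing all four allele combinations at a pair of sites implies at least one recombination event between them. The function name `four_gamete_pairs` is hypothetical.

```python
from itertools import combinations


def four_gamete_pairs(haplotypes):
    """Return the pairs of site indices (i, j) at which all four gametes occur.

    Assumes biallelic sites and equal-length haplotype strings.
    """
    n_sites = len(haplotypes[0])
    incompatible = []
    for i, j in combinations(range(n_sites), 2):
        # Distinct two-site allele combinations seen in the sample
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            incompatible.append((i, j))
    return incompatible


# Toy sample: sites 0 and 1 carry all four combinations (00, 01, 10, 11),
# whereas site 2 is monomorphic and compatible with every other site.
sample = ["000", "010", "100", "110"]
print(four_gamete_pairs(sample))
```

A block of contiguous sites containing no such incompatible pair is a candidate non-recombining block; choosing the largest one, as done here with IMgc, requires an additional search step that this sketch does not attempt.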

Mitochondrial Sequences

We used the first hypervariable segment of the mitochondrial control region (HVS-I), sequenced between positions 16067 and 16383, excluding the hypervariable poly-C region (sites 16179–16195). The total length of the sequence was thus 300 bp.

Population Panel

For Africa, we used the autosomal sequences data set of Patin et al. (2009), which consists of five farmer populations (N = 118 individuals) and five Pygmy hunter-gatherer populations (N = 95). In addition, we used the HVS-I data set from Quintana-Murci et al. (2008), which consists of nine Central African farmer populations (N = 486) and seven Central African hunter-gatherer populations (N = 318) (supplementary table S1, Supplementary Material online).

For Eurasia, we used the autosomal sequences data set of Laval et al. (2010), consisting of 48 individuals from two East-Asian populations (Han Chinese and Japanese) and 47 individuals from two European populations (Chuvash and Danes). We also used the data from 48 individuals from one sedentary Central Asian population (Tajik farmers) and 48 individuals from one nomadic Central Asian population (Kyrgyz herders) from Ségurel et al. (2013). For HVS-I, we analyzed data from 17 Eurasian populations (N = 494 in total) located from Eastern to Western Eurasia, belonging to several published data sets (Derenko et al. 2000; Richards et al. 2000; Bermisheva et al. 2002; Imaizumi et al. 2002; Yao et al. 2002; Kong et al. 2003; Quintana-Murci et al. 2004; supplementary table S1, Supplementary Material online).

For our detailed study of Central Asia, we used HVS-I sequences from 12 farmer populations (N = 408 in total) and 16 herder populations (N = 567 in total). These data come from the studies by Chaix et al. (2007) and Heyer et al. (2009) for 25 populations (supplementary table S1, Supplementary Material online). The other populations (KIB, TAB, and TKY) were sequenced for this study. As in Chaix et al. (2007) and Heyer et al. (2009), DNA was extracted from blood samples using standard protocols, and the sequence quality was ensured as follows: each base pair was determined once with a forward and once with a reverse primer; any ambiguous base call was checked by additional and independent PCR and sequencing reactions; all sequences were examined by two independent investigators. All sampled individuals were healthy donors from whom informed consent was obtained. The study was approved by the appropriate ethics committees and scientific organizations in all countries where samples were collected.

Demographic Inferences from Sequences AnalysisSummary Statistics and Neutrality TestsWe computed classical summary statistics (number of poly-morphic sites S number of haplotypes K) and four neutralitytests (Tajimarsquos [1989] D Fu and Lirsquos [1993] D and F and Fursquos[1997] Fs) on both mitochondrial and autosomal sequencesAlthough neutrality tests were originally designed to detectselective events they also give information about demo-graphic processes especially when applied to neutral markersIndeed expansion events lead to more negative values thanexpected in the absence of selective and demographic pro-cesses Conversely contraction events lead to more positivevalues of the neutrality tests For HVS-I sequences we com-puted all summary statistics and neutrality tests and tested


Aime et al doi101093molbevmst156 MBE

their departure from neutrality using the coalescent-based tests provided in DnaSP (Librado and Rozas 2009).
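To illustrate why expansions push neutrality statistics toward negative values, the following is a minimal sketch of Tajima's (1989) D in plain Python. It is not the DnaSP implementation: the function name `tajimas_d` and the toy alignments are hypothetical, and the sketch assumes gap-free aligned sequences of equal length.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, gap-free sequences of equal length (illustrative sketch)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    # S: number of segregating (polymorphic) sites
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        return 0.0  # no polymorphism, D is undefined; 0.0 is returned by convention here
    # pi: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data: an excess of singletons (expansion-like) versus two balanced haplotypes
singletons = ["TAAAAA", "ATAAAA", "AATAAA", "AAATAA", "AAAATA", "AAAAAT"]
balanced = ["AAAA"] * 3 + ["TTTT"] * 3
print(tajimas_d(singletons), tajimas_d(balanced))
```

An excess of rare variants, as expected after an expansion, lowers the mean pairwise difference relative to S and gives D < 0, whereas intermediate-frequency variants give D > 0.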

For the autosomal sequences, we used the procedure developed in Laval et al. (2010), which combines all autosomal sequences into a single test. This procedure consists of computing the mean value of each summary statistic across the 20 loci and testing whether this mean value departs significantly from its expectation under neutrality in a constant-size population model, using a simulation procedure. For a given population with sample size n, we produced 10⁵ simulated samples of the same size n under a constant population size model, using the generation-per-generation coalescent-based algorithm implemented in SIMCOAL v2 (Laval and Excoffier 2004). Each simulated individual consisted of 20 independent sequences of 1,253 bp (the average sequence length for the real data). Then, we used ARLEQUIN v3 (Excoffier et al. 2005), as modified by Laval et al. (2010), to compute the summary statistics on these simulated samples. We assessed whether the observed statistics differed significantly from the constant population model under neutrality by comparing these statistics with their null distribution obtained from the simulated data. We used gamma-distributed mutation rates with a mean value of 2.5 × 10⁻⁸/generation/site (95% confidence interval: 1.476 × 10⁻⁸ to 4.036 × 10⁻⁸), in agreement with previous studies (Pluzhnikov et al. 2002; Voight et al. 2005). This procedure yielded a P value for the significance of the departure from a constant-size model. We performed this procedure both on the whole sequences and on the largest nonrecombining blocks. For the whole sequences (i.e., including recombination), we performed the data simulation under a coalescent model with recombination, using for each locus the recombination rate provided by the HapMap build GRCh37 genetic map (International HapMap Consortium 2003) (supplementary table S8, Supplementary Material online), whereas for the largest nonrecombining blocks, we used a coalescent model without recombination.
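The comparison of an observed multilocus mean with its simulated null distribution can be sketched as follows. This is a hedged illustration only: the function name `empirical_p_value` is hypothetical, the simulated values would in practice come from the coalescent simulations described above, and the two-sided convention (twice the smaller tail) is an assumption, since the text does not state which tail convention was used.

```python
from statistics import mean


def empirical_p_value(observed_per_locus, simulated_per_locus):
    """Two-sided empirical P value for the across-locus mean of a summary statistic.

    observed_per_locus: one value per locus for the real sample.
    simulated_per_locus: list of replicates, each a list with one value per locus.
    """
    observed_mean = mean(observed_per_locus)
    null_means = [mean(replicate) for replicate in simulated_per_locus]
    lower_tail = sum(x <= observed_mean for x in null_means) / len(null_means)
    upper_tail = sum(x >= observed_mean for x in null_means) / len(null_means)
    # Assumption: two-sided P value taken as twice the smaller tail, capped at 1
    return min(1.0, 2 * min(lower_tail, upper_tail))


# Toy null distribution with four replicates of two loci each
toy_null = [[-1.0, 1.0], [0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
print(empirical_p_value([-3.0, -1.0], toy_null))
```

With 10⁵ replicates, the smallest nonzero P value that such a procedure can resolve is of the order of 10⁻⁵.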

Both for autosomes and for HVS-I sequences, we adjusted the obtained P values for each neutrality test using a false discovery rate (FDR) correction (Benjamini and Hochberg 1995) in R v2.14.1 (R Development Core Team 2011), in order to take into account the increased error probability in the case of multiple testing.
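For readers unfamiliar with the Benjamini and Hochberg (1995) adjustment, a minimal Python sketch is given below. The analyses themselves were run in R; this sketch, with the hypothetical function name `benjamini_hochberg` and toy P values, only illustrates the step-up computation of adjusted P values.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    # Visit P values from the largest to the smallest
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


toy_p = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(toy_p))
```

A test is then declared significant at a given FDR level when its adjusted P value falls below that level.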

MCMC Estimations of Demographic Parameters
We used the MCMC algorithm implemented in BEAST v1.6 (Drummond and Rambaut 2007). We tested the four demographic models implemented in this software: constant effective population size (N0) (constant model); population expansion with an increasing growth rate (g) (exponential model); population expansion with a decreasing growth rate (g) (logistic model); and the expansion model, in which N0 is the present-day population size, N1 the population size that the model asymptotes to going into the distant past, and g the exponential growth rate that determines how fast the transition is from near the N1 population size to the N0 population size. In fact, BEAST estimates composite parameters for each model, namely μN0 and g/μ, where N0 is the current effective population size, g the growth rate, and μ the mutation rate. In addition, for the expansion model, the ratio between the current (N0) and ancestral (N1) effective population sizes is also estimated. To infer N0 and g from these composite parameters, we needed to assume a value for the mutation rate. However, there is no consensus on mutation rates in humans in the literature, as different methods lead to different estimates. For autosomes, the most commonly used value is the phylogenetic rate of μ = 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002). However, recent studies based on the 1000 Genomes Project (1000 Genomes Project Consortium 2012) have found a 2-fold lower rate (μ = 1.2 × 10⁻⁸/generation/site) by directly comparing genome-wide sequences from children and their parents (Conrad et al. 2011; Scally and Durbin 2012). We used both mutation rates here. Similarly, for HVS-I, estimated mutation rates are highly dependent on methodologies and modes of calibration (Endicott et al. 2009). We used both the lower and the higher estimated mutation rates: the transitional changes mutation rate of μ = 5 × 10⁻⁶/generation/site (Forster et al. 1996) and the pedigree-based rate of μ = 10⁻⁵/generation/site (Howell et al. 1996; Heyer et al. 2001). We used a general time-reversible substitution model (Rodriguez et al. 1990). We assumed a generation time of 25 years, permitting comparison with previous human population genetics studies (e.g., Chaix et al. 2008; Patin et al. 2009; Laval et al. 2010). As BEAST cannot handle recombination events, we used the largest nonrecombining block within each sequence (discussed earlier).

We performed three runs of 10⁷ steps per population and per demographic model for the HVS-I sequences, and three runs of 2 × 10⁸ steps (which corresponded to three runs of 10⁷ steps per locus) for the autosomal sequences. We recorded one tree every 1,000 steps, which thus implied a total of 10⁵ trees per locus and per run. We then removed the first 10% of the steps of each run (burn-in period) and combined the runs to obtain acceptable effective sample sizes (ESSs of 100 or above). The convergence of these runs was assessed using two methods: visual inspection of traces using Tracer v1.5 (Rambaut and Drummond 2007) to check for concordance between runs, and computation of Gelman and Rubin's (1992) convergence diagnostic using R v2.14.1 (R Development Core Team 2011) with the function gelman.diag available in the package coda (Plummer et al. 2006).
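The burn-in removal and the between-run convergence check can be sketched as below. This is an assumption-laden illustration, not the coda implementation: it computes the basic potential scale reduction factor of Gelman and Rubin (1992) without the degrees-of-freedom correction that `gelman.diag` applies, and the helper names `discard_burn_in` and `gelman_rubin` are hypothetical.

```python
from statistics import mean, variance


def discard_burn_in(chain, fraction=0.10):
    """Drop the first `fraction` of the sampled states of one run."""
    return chain[int(len(chain) * fraction):]


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains of one parameter."""
    n = len(chains[0])
    if any(len(chain) != n for chain in chains):
        raise ValueError("all chains must have the same length")
    chain_means = [mean(chain) for chain in chains]
    within = mean([variance(chain) for chain in chains])  # W: mean within-chain variance
    between = n * variance(chain_means)                   # B: between-chain variance
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Toy runs: three runs sampling the same values, then three runs stuck at different levels
base = [float(i % 10) for i in range(100)]
agreeing = [discard_burn_in(list(base)) for _ in range(3)]
disagreeing = [discard_burn_in([x + 50.0 * k for x in base]) for k in range(3)]
print(gelman_rubin(agreeing), gelman_rubin(disagreeing))
```

Values close to 1 indicate that independent runs sample the same distribution, whereas values well above 1 indicate a lack of concordance between runs.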

To facilitate a broad exploration of the parameter space, for the autosomal sequences we chose uniform priors between 0 and 0.05 for 2μN0 and between −10⁹ and 10⁹ for g/μ. For HVS-I sequences, we chose uniform priors between 0 and 10 for μN0 and between −2.5 × 10⁶ and 2.5 × 10⁶ for g/μ, resulting in the same priors on N0 and g as for autosomal sequences if we assumed μ = 10⁻⁵/generation/site (i.e., N0 constrained between 0 and 10⁶ and g constrained between −1 and 1 per year). Conversely, if we assumed μ = 5 × 10⁻⁶/generation/site, N0 was constrained between 0 and 2 × 10⁶ and g was constrained between −0.5 and 0.5 per year.
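The bounds quoted for HVS-I follow from dividing the composite prior limits by the assumed mutation rate and converting generations to years. The short sketch below reproduces that arithmetic; the function name `implied_bounds` is hypothetical, and the composite parameters are taken to be μN0 and g/μ, with a generation time of 25 years as stated in the text.

```python
GENERATION_TIME_YEARS = 25


def implied_bounds(mu, max_mu_n0, max_g_over_mu):
    """Upper bounds on N0 and on g (per year) implied by priors on mu*N0 and g/mu."""
    n0_max = max_mu_n0 / mu                      # N0 = (mu * N0) / mu
    g_max_per_generation = max_g_over_mu * mu    # g = (g / mu) * mu
    return n0_max, g_max_per_generation / GENERATION_TIME_YEARS


# HVS-I priors: mu*N0 in [0, 10], g/mu in [-2.5e6, 2.5e6]
for mu in (1e-5, 5e-6):
    n0_max, g_max = implied_bounds(mu, 10, 2.5e6)
    print(f"mu = {mu:g}: N0 <= {n0_max:.3g}, |g| <= {g_max:.3g} per year")
```

Halving the assumed mutation rate therefore doubles the upper bound on N0 and halves the bound on g.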

For each population and model, we obtained the mode and the 95% HPD of N0 and g, inferred from their


Demographic Patterns between Sedentary and Nomadic Populations doi101093molbevmst156 MBE

posterior distributions (supplementary tables S4 and S7, Supplementary Material online), using the add-on package Locfit (Loader 1999) in R v2.14.1. We selected the best-fitting model among the four tested demographic models by estimating marginal likelihoods using two methods: path sampling and stepping-stone sampling (Baele et al. 2012). The model with the greatest marginal likelihood (supplementary table S3, Supplementary Material online) was considered the best-fitting model.
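As an illustration of what a 95% HPD interval is, the sketch below finds the narrowest interval containing 95% of a set of posterior samples. It is a generic sample-based approach with the hypothetical function name `hpd_interval`, not the Locfit-based density estimation used for the mode in this study.

```python
from math import ceil


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = ceil(mass * n)  # number of samples the interval must contain
    # Slide a window of k consecutive sorted samples and keep the narrowest one
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


# Toy posterior sample with one extreme value that the HPD interval excludes
toy_samples = list(range(100)) + [1000]
print(hpd_interval(toy_samples))
```

Unlike an equal-tailed interval, the HPD interval is not forced to be symmetric in probability around the median, which suits skewed posteriors such as those of N0.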

Extended Bayesian Skyline Plots
EBSPs (Heled and Drummond 2010), also implemented in BEAST, estimate demographic changes occurring continuously through time in a population, using the time intervals between successive coalescent events. This method allows a visualization of the evolution of Ne through time. As above, we combined three runs of 10⁷ steps for mitochondrial sequences and three runs of 2 × 10⁸ steps for autosomal sequences to obtain acceptable ESS values. We assumed the same mutation rates as above and a generation time of 25 years. Outputs were analyzed with Tracer v1.5 to visually check for convergence and ESS, and to obtain the 95% HPD interval for the number of demographic changes that occurred in the population (supplementary table S5, Supplementary Material online). A constant population size could be rejected when the 95% HPD of the number of change points excluded 0 but included 1 (Heled and Drummond 2010). Then, we used R v2.14.1 to compute Gelman and Rubin's (1992) convergence diagnostic as above, as well as to compute skyline plots. Finally, we used the population growth curves generated from BEAST to assess the time at which populations began to expand. Each skyline plot consisted of smoothed data points at approximately 10–20-generation intervals. We considered that the population increased (or decreased) when both the median and 95% HPD values for Ne increased (or decreased) between more than two successive data points. Although this method did not provide a 95% HPD interval for the inferred expansion timings, this conservative approach ensured that we considered only relevant expansion signals.
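The onset rule described above can be sketched as follows. This is one possible reading, stated here as an assumption: "more than two successive data points" is interpreted as a run of at least three successive skyline points over which the median and both HPD bounds all increase toward the present. The function name `expansion_onset` and the toy skyline are hypothetical.

```python
def expansion_onset(times_ybp, median, lower, upper, min_points=3):
    """Oldest time (YBP) starting a run of min_points points with increasing Ne.

    All three curves (median, lower and upper 95% HPD) must increase toward the present.
    Returns None when no such run exists.
    """
    # Order the skyline points from the most ancient to the most recent
    order = sorted(range(len(times_ybp)), key=lambda i: times_ybp[i], reverse=True)

    def increases(a, b):
        return median[b] > median[a] and lower[b] > lower[a] and upper[b] > upper[a]

    run_start = 0
    for position in range(1, len(order)):
        if increases(order[position - 1], order[position]):
            if position - run_start + 1 >= min_points:
                return times_ybp[order[run_start]]
        else:
            run_start = position  # the run is broken, restart from this point
    return None


# Toy skyline: flat before 750 YBP, increasing afterward
toy_times = [0, 250, 500, 750, 1000]
toy_median = [500, 400, 300, 100, 100]
toy_lower = [300, 200, 150, 50, 50]
toy_upper = [800, 700, 500, 200, 200]
print(expansion_onset(toy_times, toy_median, toy_lower, toy_upper))
```

Requiring the HPD bounds to move together with the median is what makes the rule conservative: a rise in the median alone is not counted as an expansion signal.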

Correlation Tests of Inferred Growth Rates and Isolation/Immigration Patterns

To test how differences in degrees of isolation and migration patterns could have impacted our demographic inferences, we used ARLEQUIN v3.11 (Excoffier et al. 2005) to compute population-specific FST values (Weir and Hill 2002) from HVS-I data for Central Africa, Eurasia, and Central Asia (supplementary table S9, Supplementary Material online), and to estimate immigration rates from mismatch distributions under a spatially explicit model (Excoffier 2004) (supplementary table S10, Supplementary Material online). We then performed Spearman tests using R v2.14.1 to investigate, for each region, how the inferred parametric growth rates were correlated with those FST values and immigration rates. We used a value of 0 for the growth rate when the constant model best fitted the data.
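A minimal sketch of the Spearman rank correlation is given below. The tests themselves were run in R; this Python version, with the hypothetical helper names `average_ranks` and `spearman_rho` and invented toy values, only shows how ties are handled, which matters here because every population fitted by the constant model receives the same growth rate of 0.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread_x = sqrt(sum((a - mx) ** 2 for a in rx))
    spread_y = sqrt(sum((b - my) ** 2 for b in ry))
    return covariance / (spread_x * spread_y)


# Invented toy values: FST per population and inferred growth rate (0 = constant model)
toy_fst = [0.01, 0.02, 0.05, 0.08, 0.10]
toy_growth = [0.004, 0.003, 0.001, 0.0, 0.0]
print(spearman_rho(toy_fst, toy_growth))
```

In this toy case, more isolated populations (higher FST) have lower growth rates, so the coefficient is strongly negative.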

Supplementary Material
Supplementary figures S1 and S2 and tables S1–S12 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors would like to warmly thank all volunteer participants. They also thank Michael Fontaine for his help on some data analyses, Phillip Endicott and Julio Bendezu-Sarmiento for insightful discussions, Alexei Drummond for helpful discussions, and Friso Palstra for his help on English usage. They thank Laurent Excoffier, two anonymous reviewers, and the editor for helpful comments and suggestions. All computationally intensive analyses were run on the Linux cluster of the Muséum National d'Histoire Naturelle (administrated by Julio Pedraza) and on the web-based portal "Bioportal" (Kumar et al. 2009). This work was supported by Actions Transversales du Muséum - Muséum National d'Histoire Naturelle (grant "Les relations Sociétés-Natures dans le long terme") and Agence Nationale de la Recherche (grants "Altérité culturelle" ANR-10-ESVS-0010 and "Demochips" ANR-12-BSV7-0012). C.A. was financed by a PhD grant from the Centre National de la Recherche Scientifique.

References
1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.
Ambrose SH. 2001. Paleolithic technology and human evolution. Science 291:1748–1753.
Arenas M, Francois O, Currat M, Ray N, Excoffier L. 2013. Influence of admixture and Paleolithic range contractions on current European diversity gradients. Mol Biol Evol 30:57–61.
Atkinson QD, Gray RD, Drummond AJ. 2009. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa. Proc R Soc Lond B Biol Sci 276:367–373.
Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol 29:2157–2167.
Balaresque PL, Ballereau SJ, Jobling MA. 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum Mol Genet 16:R134–R139.
Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. 2011. Insights into the demographic history of African pygmies from complete mitochondrial genomes. Mol Biol Evol 28:1099–1110.
Beaumont MA. 2004. Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92:365–379.
Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 57:289–300.
Bermisheva M, Salimova A, Korshunova T, Svyatova G, Berezina G, Villems R, Khusnutdinova E. 2002. Mitochondrial DNA diversity in the populations of Middle Asia and Northern Caucasus. Eur J Hum Genet 10:179–180.
Bocquet-Appel JP. 2011. When the world's population took off: the springboard of the Neolithic Demographic Transition. Science 333:560–561.
Bocquet-Appel J-P, Bar-Yosef O. 2008. The Neolithic demographic transition and its consequences. Dordrecht (The Netherlands): Springer.


Chaix R, Austerlitz F, Hegay T, Quintana-Murci L, Heyer E. 2008. Genetic traces of east-to-west human expansion waves in Eurasia. Am J Phys Anthropol 136:309–317.
Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, Heyer E. 2007. From social to genetic structures in central Asia. Curr Biol 17:43–48.
Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. 2009. The Last Glacial Maximum. Science 325:710–714.
Conrad DF, Keebler JEM, DePristo MA, et al. (17 co-authors). 2011. Variation in genome-wide mutation rates within and between human families. Nat Genet 43:712–714.
Derenko MV, Malyarchuk BA, Dambueva IK, Shaikhaev GO, Dorzhu CM, Nimaev DD, Zakharov IA. 2000. Mitochondrial DNA variation in two South Siberian aboriginal populations: implications for the genetic history of North Asia. Hum Biol 72:945–973.
Dirksen VG, van Geel B. 2004. Mid to late Holocene climate change and its influence on cultural development in South Central Siberia. Nato Sci S Ss IV Ear 42:291–307.
Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214.
Endicott P, Ho SYW, Metspalu M, Stringer C. 2009. Evaluating the mitochondrial timescale of human evolution. Trends Ecol Evol 24:515–521.
Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol 13:853–864.
Excoffier L, Heckel G. 2006. Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet 7:745–758.
Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online 1:47–50.
Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A 109:E2569–E2576.
Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet 59:935–945.
Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.
Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.
Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci 7:457–511.
Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol 27:570–580.
Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Segurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet 10:49.
Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol 21:597–612.
Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet 63:1113–1126.
Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res 11:423–434.
Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet 59:501–509.
Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.
Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med 116:68–73.
International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.
Kingman JFC. 1982. The coalescent. Stoch Proc Appl 13:235–248.
Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet 73:671–676.
Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet 2:e53.
Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.
Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.
Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.
Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.
Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A 107:19201–19206.
Loader C. 1999. Local regression and likelihood. New York: Springer.
Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A 103:9381–9386.
Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.
Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.
Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human populations on a global scale. Mol Biol Evol 10:927–943.
Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.
Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet 29:20–21.
Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet 6:165–183.
Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev 16:1125–1133.
Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet 5:e1000448.
Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.
Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.
Plummer M, Best N, Cowles K, Vines K. 2006. CODA: Convergence diagnosis and output analysis for MCMC. R News 6:7–11.
Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.
Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet 74:827–845.
Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A 105:1596–1601.


R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.
Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer/
Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol 19:2092–2100.
Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol 20:76–86.
Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet 67:1251–1276.
Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol 142:485–501.
Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.
Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet 13:745–753.
Segurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet 21:1146–1151.
Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.
Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet 25:207–217.
Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.
Tremblay M, Vezina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet 66:651–658.
Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol 21:559–566.
Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol 19:312–318.
Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture, and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol 30:918–937.
Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A 102:18508–18513.
Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet 36:721–750.
Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.
Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet 70:635–651.


Page 4: Human Genetic Data Reveal Contrasting Demographic Patterns between Sedentary and Nomadic Populations That Predate the Emergence of Farming

95% HPD intervals also included g = 0. The constant model best fitted the data for the Chuvash, a traditionally nomadic population. EBSPs showed a significant expansion event for the Han population (i.e., the value of 0 was not included in the 95% HPD interval of the number of demographic changes; supplementary table S5, Supplementary Material online). This expansion started at least 36,025 or 72,050 YBP, depending on the mutation rate assumed (fig. 1C and supplementary table S6, Supplementary Material online), clearly predating the emergence of farming in East Asia about 9,000 YBP (Bocquet-Appel and Bar-Yosef 2008) (fig. 3). Japanese showed a significant contraction event (i.e., the value of 0 was not included in the 95% HPD interval of the number of demographic changes; supplementary table S5, Supplementary Material online) starting at least 21,350 or 42,700 YBP. Danes also showed a significant contraction event, starting at least 26,440 or 52,880 YBP (fig. 1C and supplementary tables S5 and S6, Supplementary Material online). EBSP analyses showed no significant demographic changes for the Chuvash (fig. 1D and supplementary table S5, Supplementary Material online).

FIG. 1. EBSPs inferred from autosomal sequences in African sedentary farmers (A), African nomadic hunter-gatherers (B), Eurasian sedentary farmers (C), and Eurasian nomadic herders (D). The values indicated in bold on the axes are obtained assuming a mutation rate of μ = 1.2 × 10⁻⁸/generation/site (measured from parent–child trios by Conrad et al. 2011), and the other values correspond to μ = 2.5 × 10⁻⁸/generation/site (derived from human–chimpanzee sequence divergence by Pluzhnikov et al. 2002). Although time was expressed in generations for the analyses, we represented time in years here, assuming a generation time of 25 years. Time is represented backward on the x axis, from the present on the left to the most distant past on the right. The lower and upper 95% HPD bounds are represented by dashed lines. Populations for which the 95% HPD of the estimated number of demographic changes includes 0 (i.e., no significant signal of expansion or decline) are represented in light gray.


FIG. 2. EBSPs inferred from HVS-I sequences in African sedentary farmers (A), African nomadic hunter-gatherers (B), Eurasian sedentary farmers (C), Eurasian nomadic herders (D), Central Asian sedentary farmers (E), and Central Asian nomadic herders (F). The values indicated in bold are obtained assuming a mutation rate of μ = 5 × 10⁻⁶/generation/site (transitional changes rate; Forster et al. 1996), and the others correspond to μ = 10⁻⁵/generation/site (pedigree-based rate; Howell et al. 1996; Heyer et al. 2001). Time is represented in years, assuming a generation time of 25 years. It is represented backward on the x axis, from the present on the left to the most distant past on the right. The lower and upper 95% HPD bounds are represented by dashed lines. Populations for which the 95% HPD of the estimated number of demographic changes includes 0 (i.e., no significant signal of expansion or decline) are represented in light gray, and the others in black.


For the HVS-I sequences from Eurasia (supplementary tables S7 and S8, Supplementary Material online), the parametric BEAST analyses showed that models consistent with an increase in population size (expansion model or exponential model with positive growth rates) best fitted the data for all sedentary populations except Koreans. Conversely, the constant model best fitted the data for all nomadic populations, as well as Koreans. EBSPs showed, however, significant expansion events for both farmers and herders, but not for Koreans (fig. 2C and D and supplementary table S5, Supplementary Material online). Nevertheless, there was a tendency toward stronger expansion rates and higher Ne values in sedentary than in nomadic populations (fig. 2C and D), although the 95% HPD intervals for Ne were quite large for sedentary populations. The estimated expansion onset times inferred from the EBSPs (supplementary table S6, Supplementary Material online) followed an east-to-west gradient: they appeared more ancient in Eastern populations, for both sedentary and nomadic populations (supplementary fig. S1, Supplementary Material online). They also clearly predated the Neolithic transition in all geographic areas (fig. 4).

The Central Asian Exception: Similar Demographic Patterns in Farmers and Herders
For autosomes, the constant model best fitted the data for both sedentary farmers (TAB) and traditionally nomadic herders (KIB) (supplementary tables S3 and S4, Supplementary Material online). EBSPs also showed no significant demographic changes for these populations (figs. 1C and D and supplementary table S5, Supplementary Material online).

For HVS-I (supplementary tables S3 and S7, Supplementary Material online), the exponential model best fitted the data for six of the 12 sedentary farmer populations (including TAB), whereas the constant model was preferred for the other farmers. Unlike the rest of Eurasia, a model indicating expansion (the exponential model with positive growth rates) was also selected for all nomadic herders. Moreover, EBSPs showed significant expansion signals for both herder and farmer populations, except TJY (Yagnobs from Dushanbe), since at least 13,860 YBP (or 27,720 YBP) for farmers and 16,546 YBP (or 33,092 YBP) for herders on average (fig. 2E and F and supplementary tables S5 and S6, Supplementary Material online). Again, these inferred expansion onsets predated the emergence of farming in the area about 8,000 YBP (Bocquet-Appel and Bar-Yosef 2008) (fig. 4). Inferred expansions for Central Asian sedentary farmers seemed overall weaker (i.e., lower growth rate and lower Ne) than those observed for other sedentary populations in Eurasia, although we observed important variations in growth rates and Ne among populations and large 95% HPD intervals for some of them (fig. 2C and E).

Degrees of Isolation and Migration Patterns

African farmer populations appeared less isolated and received more migrants than hunter-gatherer populations. Indeed, the population-specific FST values (supplementary table S9, Supplementary Material online) were on average significantly lower for farmers than for hunter-gatherers (mean[farmers] = 0.058, mean[HG] = 0.192, Wilcoxon two-sided test, P value = 0.0002). Moreover, the estimated number of immigrants was significantly higher for sedentary farmers than for nomadic hunter-gatherers (mean[farmers] = 314, mean[HG] = 221, P value = 0.0001) (supplementary table S10, Supplementary Material online). For hunter-gatherers, the FST values were negatively correlated with the negative growth rates that we inferred from the parametric method (ρ = −0.893, P value = 0.012) (fig. 5B), meaning that less isolated populations showed weaker contraction events (i.e., less negative growth rates). Conversely, there was no significant correlation between FST values and inferred growth rates for sedentary farmers (ρ = 0.433, P value = 0.249) (fig. 5A). However, we found a significant positive correlation between the number of immigrants and the inferred growth rates (supplementary fig. S2, Supplementary Material online) among sedentary farmer populations (ρ = 0.867, P value = 0.004), but not among nomadic hunter-gatherers (ρ = 0.536, P value = 0.235).

For Eurasia, we found no significant difference in FST values between farmers and herders (mean[farmers] = 0.039, mean[herders] = 0.043, P value = 0.77; supplementary table S9, Supplementary Material online), except in Central Asia, for which we found significantly lower FST values for nomadic herders than for sedentary farmers (mean[farmers] = 0.018, mean[herders] = 0.008, P value = 0.017). We report a significant

FIG. 3. Comparison of estimated times for expansion onsets using autosomes and dating of the first archeological traces of farming in Africa and China. Time is represented backward (in YBP). Only populations for which the EBSP analysis showed a significant expansion event are represented. We reported the time values estimated with the highest mutation rate that we used for the autosomes (μ = 2.5 × 10⁻⁸/generation/site). Thus, these time values can be considered as a lower bound for the expansion onsets. The dates for the emergence of farming come from the review by Bocquet-Appel and Bar-Yosef (2008). They are based on archeological remains.


FIG. 4. Comparison of estimated times for expansion onsets using HVS-I and dating of the first archeological traces of farming or herding in Central Africa (A), Eurasia (B), and Central Asia (C). Time is represented backward (in YBP). Only populations for which the EBSP analysis showed a significant expansion event are represented. We reported the time values estimated with the highest mutation rate that we used for the HVS-I sequences (μ = 10⁻⁵/generation/site). Thus, these time values can be considered as a lower bound for the expansion onsets. The dates for the emergence of farming come from the review by Bocquet-Appel and Bar-Yosef (2008). They are based on archeological remains.


FIG. 5. Correlations between population-specific FST values and inferred growth rates in African farmer (A) and hunter-gatherer (B) populations, Eurasian farmer populations (C), and Central Asian farmer (D) and herder populations (E). Population-specific FST values were computed with ARLEQUIN v3.11 (Excoffier et al. 2005). The growth rates were inferred under the best-fitting model from the parametric method using BEAST (Drummond and Rambaut 2007). When the best-fitting model was the constant model, we assumed a growth rate of 0. Note that we did not represent Eurasian herder populations, as the constant model best fitted the data for all of them. Plots and correlation tests were performed using R v2.14.1 (R Development Core Team 2011).


negative correlation between FST values and inferred growth rates for sedentary farmers in Eurasia (correlation coefficient = −0.673, P value = 0.028) (fig. 5C) and Central Asia (correlation coefficient = −0.773, P value = 0.003) (fig. 5D), thus meaning that less isolated populations showed higher inferred growth rates. There was no significant correlation for Central Asian herders (correlation coefficient = −0.092, P value = 0.736) (fig. 5E). Note that this analysis could not be performed for the other Eurasian herder populations, as the constant model best fitted the data with the parametric method.

The estimation of the proportion of immigrants did not converge for 11 Eurasian populations (Han Chinese, Liaoning, Qingdao, Palestinians, Pathans, Mongols, as well as three Central Asian farmer populations and two Central Asian herder populations). Regarding the other populations, we showed no significant difference in the proportion of immigrants between farmers and herders, both in Central Asia (mean[farmers] = 82868, mean[herders] = 260265, P value = 0.12) and in the rest of Eurasia (mean[farmers] = 576757, mean[herders] = 201311, P value = 0.51) (supplementary table S10, Supplementary Material online). We also found no significant correlation between this proportion and the inferred growth rates for Eurasian farmers (correlation coefficient = −0.238, P value = 0.48), Central Asian farmers (correlation coefficient = 0.386, P value = 0.30), and Central Asian herders (correlation coefficient = −0.42, P value = 0.139) (supplementary fig. S2, Supplementary Material online).

Discussion

In this study, using a large set of populations from distant geographic areas, we report contrasted demographic histories that correlate with lifestyle. Moreover, the inferred expansion signals in both African and Eurasian farmer and herder populations predated the Neolithic transition and the sedentarization of these populations.

Contrasted Demographic Histories in Sedentary and Nomadic Populations

For Africa, both mtDNA and autosomal data revealed expansion patterns in most sedentary farmer populations, as indicated by neutrality tests and the parametric and nonparametric BEAST methods. Conversely, we found constant effective population sizes (or possibly contraction events) for all hunter-gatherer populations. Among the farmers, results were least clear for the Yoruba and the Ewondo populations, as no neutrality test was significant for these populations, whereas they showed evidence of expansion events when analyzed with BEAST. This indicates that these populations may have undergone weaker expansion dynamics (i.e., lower growth rates and Ne) than the others. These remarkable results are of particular importance for the Yoruba, as it is a reference population in many databases (HapMap, 1000 Genomes). This also demonstrates the higher sensitivity of MCMC methods such as BEAST to detect expansions in comparison to neutrality tests.

The contrasted patterns inferred between sedentary and nomadic populations in Africa suggest strong differences between the demographic histories of these two groups of populations. The question is whether this pattern results mostly from differences in local expansion dynamics or whether spatial expansion processes at a larger scale were also involved. As shown by Ray et al. (2003), negative values for the neutrality tests will be observed in a spatial expansion process if the rate of migrants (Nm) is high enough (at least 20), but not otherwise. As in previous studies (e.g., Verdu et al. 2013), we report a higher degree of isolation (higher population-specific FST values) in hunter-gatherer populations than in farmer populations. Using the spatial expansion model of Excoffier (2004) also leads to higher estimates of the number of immigrants into farmer populations. Thus, both farmers and hunter-gatherers may have been subject to a spatial expansion process, but the limited number of migrants among hunter-gatherers may have resulted in an absence of expansion signals for them. This would be consistent with the positive correlation that we observe between the growth rates estimated with BEAST and the inferred number of immigrants in the sedentary farmer populations. However, this spatial expansion process seems unlikely to completely explain the strong association that we observed between lifestyle and expansion patterns, as some farmer populations (Teke, Gabonese Fang) displayed FST values similar to those of hunter-gatherers but a clear signal of expansion with relatively high growth rates. This suggests that even rather isolated farmer populations show substantial levels of expansion. Moreover, FST values and inferred growth rates in farmer populations were not significantly correlated. Therefore, our results suggest that the expansion patterns observed in sedentary populations do not result only from a spatial expansion pattern. In addition, local dynamics connected with the higher capacity of food production by farmers also explain their much stronger expansion signatures relative to their neighboring hunter-gatherer populations.

For Eurasia, when considering the mtDNA data, all three methods (neutrality tests, parametric BEAST analyses, and EBSPs) yielded expansion signals for all sedentary farmer populations except Koreans. Conversely, only EBSPs and Fu's Fs test showed expansion signals for nomadic herders, but not the parametric BEAST method nor other neutrality tests. This result points toward weaker expansion dynamics in herders than in farmers, as supported also by the tendency for lower growth rates and Ne in herder populations than in farmer populations on the EBSP graphs (fig. 2C and D). It thus seems that the flexibility and nonparametric nature of EBSP analyses allow one to detect weaker expansion events than the parametric method. Moreover, Fu's Fs is known to be more sensitive than the other neutrality tests to detect expansions (Ramos-Onsins and Rozas 2002). Again, these inferred expansions may result, at least in part, from spatial expansion processes. The population-specific FST values are indeed rather low in Eurasia. Moreover, we found a significant negative correlation between FST values and inferred growth rates for the sedentary farmers, indicating that less isolated populations showed stronger expansion signals. However, although we inferred much stronger expansion patterns for the farmers than for the herders, we did not observe any differences in Eurasia between the farmers and the herders in the population-specific FST values or in the estimated number of


immigrants, suggesting that spatial processes alone cannot explain the strong difference that we observed between the expansion patterns of these two groups of populations. This indicates that the intrinsic demographic growth patterns are different between these two kinds of populations, the farmers showing much higher growth rates than the herders.

To our knowledge, although other studies have found different patterns between hunter-gatherers and farmers (e.g., Verdu et al. 2009), our study is the first to show differences between farmers and herders, the two major post-Neolithic human groups. A plausible explanation could be that nomadic herders and hunter-gatherers share several of the constraints of a nomadic way of life. For instance, birth intervals are generally longer (at least 4 years) in nomadic populations than in sedentary populations (e.g., Short 1982). According to Bocquet-Appel (2011), these longer birth intervals may be mainly determined by diet differences. Indeed, Valeggia and Ellison (2009) demonstrated that birth interval is mainly determined by the rapidity of postpartum energy recovery, which may be increased by the consumption of high-carbohydrate food (like cereals). Moreover, the nomadic herder way of life may offer less food security than sedentary farming, the latter facilitating efficient long-term food storage.

However, unlike in Africa, we did not find systematically consistent patterns between the autosomal and mtDNA data in Eurasia. The possible contraction events that our results suggest for two sedentary populations (Japanese and Danes) with autosomes appeared concomitant with historical events that could have led to bottleneck processes. For the Japanese population, this contraction signal could indeed result from a founder effect due to the Paleolithic colonization of Japan by a subset of the Northern Asiatic people (especially from Korea; Nei 1995). Similarly, a bottleneck process may also have occurred in the Danish population, linked with the last glacial maximum occurring between 26,500 and 19,500 YBP (Clark et al. 2009). Reasons why these processes impacted the autosomes but not the mtDNA data remain to be determined, for instance through simulation studies. In any case, our study clearly emphasizes the utility of combining mtDNA and autosomal sequences, as they allow access to different aspects of human history. A recent study on harbor porpoises has similarly shown that nuclear markers were sensitive to a recent contraction event, whereas mtDNA allowed inferring a more ancient expansion (Fontaine et al. 2012).

Interestingly, Central Asia displayed a distinct pattern from the rest of Eurasia. Indeed, we did not infer higher expansion rates for sedentary farmers than for nomadic herders in that area. This could result from harsh local environmental conditions due to the arid continental climate in this area. Indeed, using pollen records, Dirksen and van Geel (2004) showed that the paleoclimate in Central Asia was very arid from at least 12,000 to 3,000 YBP, which could have limited the amount of suitable areas for farming and impacted human demography. Spatial expansion processes may also have played a role in this difference, as population-specific FST values were higher for the farmers than for the herders. This may indicate that more migrants were involved in the spatial expansion process for the herders than for the farmers, yielding a weaker expansion signal (i.e., lower inferred growth rate) for the latter (Ray et al. 2003). This is supported by the negative correlation between the FST values and the inferred growth rates in the farmer populations. The Korean population also stood out as an exception in Eurasia. Even though it is a population of sedentary farmers, it showed no significant expansion signal with either the parametric or the nonparametric method with HVS-I. This could be explained by a later sedentarization of this population. The Korean Neolithic is notably defined by the introduction of Jeulmun ware ceramics about 8,000 YBP, but the people of the Jeulmun period were still predominantly semi-nomadic fishers and hunter-gatherers until about 3,000 YBP, when Koreans started an intensive crop production implying a sedentary lifestyle (Nelson 1993).

Inferred Expansion Signals Predate the Emergence of Farming

EBSP analyses revealed that the inferred expansion events in farmer and herder populations were more ancient than the emergence of farming and herding. Therefore, the differences in demographic patterns between farmers and herders seem to predate their divergence in lifestyle, which raises the question of the chronology of demographic expansions and the Neolithic transition. These findings appear to be quite robust to the choice of the scaling parameters. We used here both the lower and the higher mutation rate estimates in humans for autosomes (Pluzhnikov et al. 2002; Conrad et al. 2011) and for the HVS-I sequence (Forster et al. 1996; Howell et al. 1996). Despite this uncertainty in mutation rates, which leads to a 2-fold uncertainty in our time estimates, the inferred expansion signals predated the emergence of agriculture in both cases for all populations. Similarly, using a generation time of 29 years (Tremblay and Vezina 2000) instead of 25 years leads to slightly more ancient estimates and thus does not change our conclusions (data not shown). However, note that for HVS-I, using the higher bound of the credibility interval for the highest estimated mutation rate (2.75 × 10⁻⁵/generation/site; Heyer et al. 2001) instead of the mean value (i.e., 10⁻⁵/generation/site) leads to expansion time estimates consistent with the Neolithic transition in Eurasian populations (supplementary table S11, Supplementary Material online). Nevertheless, these estimates still clearly predated the Neolithic for the African populations. However, 10⁻⁵/generation/site is by far the highest estimation of mutation rate in the literature (Howell et al. 1996). To infer Neolithic expansions in most Eurasian populations, one needs to assume a mutation rate of at least 2 × 10⁻⁵/generation/site, which is much higher than other estimates from the literature and is thus probably unrealistic. Moreover, our method for determining the expansion onset time using EBSP graphs is very conservative and also tends to favor the lower bound of expansion onset times. Finally, for autosomes, similarly using 4.74 × 10⁻⁸/generation/site instead of 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002) leads to an inferred expansion onset time that is not compatible with the Neolithic transition for all Eurasian and African populations except for one


African population, the Yoruba (supplementary table S12, Supplementary Material online). Consequently, it seems very likely that the expansions inferred in this study correspond to Paleolithic rather than Neolithic demographic events, in agreement also with most previous studies, as detailed later.
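The dependence of the time estimates on the scaling parameters can be illustrated with a short sketch. It assumes that a time expressed in expected substitutions per site is converted to years by dividing by the mutation rate and multiplying by the generation time. The function name and the scaled time value are hypothetical and serve only as an illustration; the mutation rates and generation times are those quoted in the text.

```python
import math


def years_before_present(scaled_time, mu, generation_years):
    """Convert a time in substitutions per site into years (hypothetical helper)."""
    generations = scaled_time / mu          # substitutions/site -> generations
    return generations * generation_years   # generations -> years


# Hypothetical scaled time, used only to show how the estimates rescale.
scaled_time = 1e-5

# Autosomes: phylogenetic rate versus the roughly 2-fold lower pedigree-based rate.
fast_rate_years = years_before_present(scaled_time, mu=2.5e-8, generation_years=25)
slow_rate_years = years_before_present(scaled_time, mu=1.2e-8, generation_years=25)
rate_ratio = slow_rate_years / fast_rate_years   # about 2.08: lower rate, older dates

# Generation time of 29 instead of 25 years: estimates become 16% more ancient.
long_generation_years = years_before_present(scaled_time, mu=2.5e-8, generation_years=29)
generation_ratio = long_generation_years / fast_rate_years

print(round(rate_ratio, 2), round(generation_ratio, 2))
```

Because the conversion is a simple ratio, the choice of mutation rate shifts all dates by the same factor, which is why the highest rate yields a lower bound for the expansion onsets.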

In Africa, the emergence of agriculture has been dated between 5,000 and 4,000 YBP in the Western part of Central Africa, and agriculture subsequently expanded rapidly to the rest of sub-Saharan Africa (Phillipson 1993). However, using HVS-I, we showed expansion events in farmer populations since about 30,000 or 60,000 YBP, thus largely predating the emergence of agriculture in the area. Similarly, using autosomes, especially in Eastern African populations, we inferred expansion signals that clearly predated the Neolithic. Notably, we inferred an expansion signal for Mozambicans since at least 80,000 YBP. Several genetic studies have already highlighted that expansion events occurred in African farmers before the Neolithic transition (e.g., Atkinson et al. 2009; Laval et al. 2010; Batini et al. 2011). This finding is also consistent with paleoanthropological data (i.e., radiocarbon dating) suggesting an expansion event in Africa 60,000–80,000 YBP (Mellars 2006a). This Paleolithic demographic expansion could be linked to a rapid environmental change toward a dryer climate (Partridge et al. 1997) and/or to the emergence of new hunting technologies (Mellars 2006a).

According to Mellars (2006a), this period corresponds to a major increase in the complexity of the technological, economic, social, and cognitive behavior of certain African groups. It corresponds in particular to the emergence of projectile technologies (Shea 2009), which was probably part of a broader pattern of ecological diversification of early Homo sapiens populations. These changes could have been decisive for the human spread "Out of Africa" during the same period and could have ultimately also led to the sedentarization of the remaining populations. This inference is consistent with Sauer's (1952) demographic theory, which stated that late Paleolithic demographic expansions could have favored the sedentarization and the emergence of agriculture in some human populations. In the case of Central Africa, the period of 60,000 YBP corresponds to the separation between the ancestors of hunter-gatherers and farmers (Patin et al. 2009; Verdu et al. 2009). Thus, these two groups may have presented contrasting demographic patterns since their divergence. Much later, higher expansion rates and larger population sizes among farmers' ancestors may have induced the emergence of agriculture and sedentarization.

With respect to Eurasia, the expansion profiles inferred with HVS-I for all populations and with autosomes for the Han Chinese population also seem to have begun during the Paleolithic, thus before the Neolithic transition. Some genetic studies already reported pre-Neolithic expansions in Asia and Europe (e.g., Chaix et al. 2008). Notably, using mismatch and intermatch distributions, Chaix et al. (2008) showed an east-to-west Paleolithic expansion wave in Eurasia. We found a similar pattern here, as the inferred expansions of East-Asian populations were earlier than those of Central Asian populations, themselves earlier than those of European populations. Moreover, we found this pattern in both sedentary farmer and nomadic herder populations. Thus, the ancestors of currently nomadic herder populations also experienced these Paleolithic expansions. However, Paleolithic expansion signals in nomadic populations seem lower than in sedentary populations. This is again compatible with the demographic theory of the Neolithic sedentarization (Sauer 1952): some populations may have experienced more intense Paleolithic expansions, which may have led ultimately to their sedentarization.

The inferred Paleolithic expansion signals might result partly from spatial expansions out of some refuge areas after the Last Glacial Maximum (LGM, 26,500–19,500 YBP; Clark et al. 2009), as this time interval matches with our inferred dating for expansion onsets in East Asia with HVS-I using the pedigree-based mutation rate and in Europe and the Middle East using the transitional mutation rate. Some of the earlier date estimates might also be consistent with the out-of-Africa expansion of H. sapiens. However, the radiocarbon-based time estimates of the spread of H. sapiens in Eurasia are generally more ancient than our inferred expansion onset timings. For instance, Mellars (2006b) dated the colonization of the Middle East by H. sapiens at 47,000–49,000 YBP and of Europe at 41,000–42,000 YBP. Pavlov et al. (2001) report traces of modern human occupation nearly 40,000 years old in Siberia. Finally, Liu et al. (2010) described modern human fossils from South China dated to at least 60,000 YBP. Moreover, out-of-Africa or post-LGM expansions would not explain our finding of an east-to-west gradient of expansion onset timing, which rather supports the hypothesis of a demographic expansion diffused from east to west in Eurasia in a demic (i.e., migrations of individuals) or cultural (favored by the diffusion of new technologies) manner.

Possible Confounding Factors

Our approach makes the assumption that populations are isolated and panmictic, which is questionable for human populations. However, we analyzed a large set of populations sampled in very distant geographical regions (i.e., Central Africa, East Africa, Europe, Middle East, Central Asia, Pamir, Siberia, and East Asia). The main conclusions of this study rely on consistent patterns between most of these areas, and it seems unlikely that processes such as admixture could have biased the estimates similarly everywhere. Moreover, in Central Africa, several studies have shown that hunter-gatherer populations show signals of admixture, whereas it is not the case for farmer populations (Patin et al. 2009; Verdu et al. 2009, 2013). If this introgression had been strong enough, this may have yielded a spurious expansion signal in the hunter-gatherer populations, which is not what we observed here. In Europe, spatial expansion processes during the Neolithic may have led to admixture with Paleolithic populations. As pointed out by a simulation study (Arenas et al. 2013), this may lead to a predominance of the Paleolithic gene pool. This may be one of the factors explaining why we observed mostly Paleolithic expansions here.


Similarly, potential selection occurring on the whole mitochondrial genome (e.g., Pakendorf and Stoneking 2005) seems unlikely to have impacted in the same way all the studied populations within each group (e.g., stronger positive selection on sedentary than on nomadic populations), as we analyzed different nomadic and sedentary populations living near each other in several geographically distant areas.

Regarding the potential effects of recombination on the inferences from autosomal data, we found that neutrality tests gave similar results on the whole sequences, when using a simulation procedure that took the known recombination rate of each sequence into account (table 1), and on the largest non-recombining blocks as inferred with IMgc, without taking recombination into account in the simulation process (supplementary table S12, Supplementary Material online). It thus appears unlikely that the BEAST analyses, which can only handle the largest inferred non-recombining blocks, are biased because of this.

Finally, note that the effective population sizes inferred using BEAST correspond to the Ne of the populations during their recent history rather than a value of Ne averaged over the history of the population. This explains the finding that, for most populations, we inferred Ne estimates much higher than generally assumed for humans by population geneticists (about 10,000).

Material and Methods

Genetic Markers

Autosomal Sequences

We used data from 20 noncoding, a priori neutral, and unlinked autosomal regions selected by Patin et al. (2009) to be at least 200 kb away from any known or predicted gene, to not be in linkage disequilibrium (LD) either with each other or with any known or predicted gene, and to have a region of homology with the chimpanzee genome. These regions are on average 1,253 bp long. Using the four-gamete test (Hudson and Kaplan 1985) as implemented in IMgc online (Woerner et al. 2007), we identified recombination events for 6 of these 20 regions. As some methods used in this study cannot handle recombination, we retained for these six sequences the largest non-recombining block inferred by IMgc. Because of this reduction, the 20 regions used were on average 1,228 bp long. To identify potential bias related to this method (e.g., some recombination events may not be detected using the four-gamete test; larger blocks of non-recombining sequence may select for gene trees that are shorter than expected), we computed the summary statistics and performed neutrality tests (discussed later) both on the whole sequences (table 1) and on the largest non-recombining blocks (supplementary table S12, Supplementary Material online).

Mitochondrial Sequences

We used the first hypervariable segment of the mitochondrial control region (HVS-I), sequenced between positions 16067 and 16383, excluding the hypervariable poly-C region (sites 16179–16195). The total length of the sequence was thus 300 bp.
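The stated length of 300 bp can be checked with a few lines of arithmetic, assuming that both coordinate ranges are inclusive (this is an assumption, but it is the only reading that reproduces the 300 bp figure). The helper name is hypothetical.

```python
def inclusive_length(first_site, last_site):
    """Number of sites in a range when both end positions are counted."""
    return last_site - first_site + 1


hvs1_span = inclusive_length(16067, 16383)    # full sequenced segment
poly_c_span = inclusive_length(16179, 16195)  # excluded poly-C region
analyzed_length = hvs1_span - poly_c_span

print(hvs1_span, poly_c_span, analyzed_length)
```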

Population Panel

For Africa, we used the autosomal sequences data set of Patin et al. (2009), which consists of five farmer populations (N = 118 individuals) and five Pygmy hunter-gatherer populations (N = 95). In addition, we used the HVS-I data set from Quintana-Murci et al. (2008), which consists of nine Central African farmer populations (N = 486) and seven Central African hunter-gatherer populations (N = 318) (supplementary table S1, Supplementary Material online).

For Eurasia, we used the autosomal sequences data set of Laval et al. (2010), consisting of 48 individuals from two East-Asian populations (Han Chinese and Japanese) and 47 individuals from two European populations (Chuvash and Danes). We also used the data from 48 individuals from one sedentary Central Asian population (Tajik farmers) and 48 individuals from one nomadic Central Asian population (Kyrgyz herders) of Segurel et al. (2013). For HVS-I, we analyzed data from 17 Eurasian populations (N = 494 in total) located from Eastern to Western Eurasia, belonging to several published data sets (Derenko et al. 2000; Richards et al. 2000; Bermisheva et al. 2002; Imaizumi et al. 2002; Yao et al. 2002; Kong et al. 2003; Quintana-Murci et al. 2004; supplementary table S1, Supplementary Material online).

For our detailed study of Central Asia, we used HVS-I sequences from 12 farmer populations (N = 408 in total) and 16 herder populations (N = 567 in total). These data come from the studies by Chaix et al. (2007) and Heyer et al. (2009) for 25 populations (supplementary table S1, Supplementary Material online). The other populations (KIB, TAB, and TKY) were sequenced for this study. As in Chaix et al. (2007) and Heyer et al. (2009), DNA was extracted from blood samples using standard protocols, and the sequence quality was ensured as follows: each base pair was determined once with a forward and once with a reverse primer; any ambiguous base call was checked by additional and independent PCR and sequencing reactions; all sequences were examined by two independent investigators. All sampled individuals were healthy donors from whom informed consent was obtained. The study was approved by appropriate Ethic Committees and scientific organizations in all countries where samples have been collected.

Demographic Inferences from Sequences Analysis

Summary Statistics and Neutrality Tests

We computed classical summary statistics (number of polymorphic sites S, number of haplotypes K) and four neutrality tests (Tajima's [1989] D, Fu and Li's [1993] D and F, and Fu's [1997] Fs) on both mitochondrial and autosomal sequences. Although neutrality tests were originally designed to detect selective events, they also give information about demographic processes, especially when applied to neutral markers. Indeed, expansion events lead to more negative values than expected in the absence of selective and demographic processes. Conversely, contraction events lead to more positive values of the neutrality tests. For HVS-I sequences, we computed all summary statistics and neutrality tests and tested


their departure from neutrality using the coalescent-based tests provided in DnaSP (Librado and Rozas 2009).
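To make the sign convention of the neutrality tests concrete, the following minimal sketch computes Tajima's D from the sample size, the number of polymorphic sites S, and the mean number of pairwise differences, using the standard formula of Tajima (1989). It is not the DnaSP implementation, and the numerical inputs are illustrative values that do not come from this study.

```python
import math


def tajimas_d(sample_size, num_polymorphic_sites, mean_pairwise_diff):
    """Tajima's D from n, S and the mean number of pairwise differences."""
    n, s = sample_size, num_polymorphic_sites
    if n < 4 or s < 1:
        raise ValueError("need at least 4 sequences and 1 polymorphic site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    difference = mean_pairwise_diff - s / a1
    return difference / math.sqrt(e1 * s + e2 * s * (s - 1))


# Illustrative inputs: an excess of rare variants gives a negative D,
# the pattern expected after an expansion.
d_rare_excess = tajimas_d(10, 16, 3.888889)
# An excess of intermediate-frequency variants gives a positive D,
# the pattern expected after a contraction.
d_intermediate_excess = tajimas_d(10, 16, 8.0)

print(round(d_rare_excess, 3), d_intermediate_excess > 0)
```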

For the autosomal sequences, we used the procedure developed in Laval et al. (2010), which combines all autosomal sequences into a single test. This procedure consists in computing the mean value of each summary statistic across the 20 loci and in testing whether this mean value departs significantly from its expectation under neutrality in a constant-size population model, using a simulation procedure. For a given population with sample size n, we produced 10⁵ simulated samples of the same size n under a constant population size model, using the generation-per-generation coalescent-based algorithm implemented in SIMCOAL v2 (Laval and Excoffier 2004). Each simulated individual was constituted by 20 independent sequences of 1,253 bp (the average sequence length for the real data). Then, we used ARLEQUIN v3 (Excoffier et al. 2005), as modified by Laval et al. (2010), to compute the summary statistics on these simulated samples. We assessed whether the observed statistics differed significantly from the constant population model under neutrality by comparing these statistics with their null distribution obtained from the simulated data. We used gamma-distributed mutation rates with a mean value of 2.5 × 10⁻⁸/generation/site (95% confidence interval: 1.476 × 10⁻⁸ to 4.036 × 10⁻⁸), in agreement with previous studies (Pluzhnikov et al. 2002; Voight et al. 2005). This procedure yielded a P value for the significance of the departure from a constant size model. We performed this procedure both on the whole sequences and on the largest non-recombining blocks. For the whole sequences (i.e., including recombination), we performed the data simulation under a coalescent model with recombination, using for each locus the recombination rate provided by the HapMap build GRCh37 genetic map (International HapMap Consortium 2003) (supplementary table S8, Supplementary Material online), whereas for the largest non-recombining blocks, we used a coalescent model without recombination.
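The comparison of an observed statistic with its simulated null distribution can be sketched as follows. This is only an illustration of the general principle: the one-sided counting convention, the function name, and the toy null values are assumptions, and the exact convention used with SIMCOAL and ARLEQUIN is not stated in the text.

```python
def empirical_p_value(observed_mean, simulated_means, tail="lower"):
    """Fraction of simulated values at least as extreme as the observed one."""
    if not simulated_means:
        raise ValueError("the null distribution is empty")
    if tail == "lower":
        # Relevant for expansions, which make neutrality statistics more negative.
        extreme = sum(1 for value in simulated_means if value <= observed_mean)
    elif tail == "upper":
        # Relevant for contractions, which make them more positive.
        extreme = sum(1 for value in simulated_means if value >= observed_mean)
    else:
        raise ValueError("tail must be 'lower' or 'upper'")
    return extreme / len(simulated_means)


# Toy null distribution standing in for the simulated mean statistics.
toy_null = [-2.0, -1.0, 0.0, 1.0, 2.0]
p_lower = empirical_p_value(-1.5, toy_null, tail="lower")
p_upper = empirical_p_value(-1.5, toy_null, tail="upper")

print(p_lower, p_upper)
```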

Both for autosomes and for HVS-I sequences, we adjusted the obtained P values for each neutrality test using a false discovery rate (FDR) correction (Benjamini and Hochberg 1995) in R v2.14.1 (R Development Core Team 2011), in order to take into account the increased error probability in the case of multiple testing.
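For readers unfamiliar with the Benjamini and Hochberg (1995) correction, the adjustment can be sketched in a few lines. This is a plain reimplementation for illustration, not the R code used in the study, and the P values in the example are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return FDR-adjusted P values, in the same order as the input."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical P values for one neutrality test across four populations.
raw_p_values = [0.01, 0.04, 0.03, 0.005]
adjusted_p_values = benjamini_hochberg(raw_p_values)

print([round(p, 3) for p in adjusted_p_values])
```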

MCMC Estimations of Demographic Parameters

We used the MCMC algorithm implemented in BEAST v1.6 (Drummond and Rambaut 2007). We tested the four demographic models implemented in this software: constant effective population size (N0) (constant model); population expansion with an increasing growth rate (g) (exponential model); population expansion with a decreasing growth rate (g) (logistic model); and the expansion model, in which N0 is the present-day population size, N1 the population size that the model asymptotes to going into the distant past, and g the exponential growth rate that determines how fast the transition is from near the N1 population size to the N0 population size. In fact, BEAST estimates composite parameters for each model, namely N0μ and g/μ, where N0 is the current effective population size, g the growth rate, and μ the mutation rate. In addition, for the expansion model, the ratio between the current (N0) and ancestral (N1) effective population size is also estimated. To infer N0 and g from these composite parameters, we needed to assume a value for the mutation rate. However, there is no consensus for mutation rates in humans in the literature, as different methods lead to different estimations. For autosomes, the most commonly used value is the phylogenetic rate of μ = 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002). However, recent studies based on the 1000 Genomes Project (1000 Genomes Project Consortium 2012) have found a 2-fold lower rate (μ = 1.2 × 10⁻⁸/generation/site) by directly comparing genome-wide sequences from children and their parents (Conrad et al. 2011; Scally and Durbin 2012). We used here both mutation rates. Similarly, for HVS-I, estimated mutation rates are highly dependent on methodologies and modes of calibration (Endicott et al. 2009). We used both the lower and the higher estimated mutation rates: the transitional changes mutation rate of μ = 5 × 10⁻⁶/generation/site (Forster et al. 1996) and the pedigree-based rate of μ = 10⁻⁵/generation/site (Howell et al. 1996; Heyer et al. 2001). We used a general time-reversible substitution model (Rodriguez et al. 1990). We assumed a generation time of 25 years, permitting the comparison with previous human population genetics studies (e.g., Chaix et al. 2008; Patin et al. 2009; Laval et al. 2010). As BEAST cannot handle recombination events, we used the largest nonrecombining block within each sequence (discussed earlier).

We performed three runs of 10⁷ steps per population and per demographic model for the HVS-I sequence and three runs of 2 × 10⁸ steps (which corresponded to three runs of 10⁷ steps per locus) for the autosomal sequences. We recorded one tree every 1,000 steps, which thus implied a total of 10⁵ trees per locus and per run. We then removed the first 10% of steps of each run (burn-in period) and combined the runs to obtain acceptable effective sample sizes (ESSs of 100 or above). The convergence of these runs was assessed using two methods: visual inspection of traces using Tracer v1.5 (Rambaut and Drummond 2007) to check for concordance between runs, and computation of Gelman and Rubin's (1992) convergence diagnostic using R v2.14.1 (R Development Core Team 2011) with the function gelman.diag available in the package coda (Plummer et al. 2006).
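The idea behind Gelman and Rubin's (1992) diagnostic can be sketched as follows. This is a simplified version of the potential scale reduction factor: it omits the degrees-of-freedom correction that gelman.diag in coda applies, so its values may differ slightly from those of the R function. The chains in the example are toy values.

```python
from statistics import mean, variance


def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for several chains of equal length."""
    num_chains = len(chains)
    chain_length = len(chains[0])
    if num_chains < 2 or chain_length < 2:
        raise ValueError("need at least two chains of at least two samples")
    if any(len(chain) != chain_length for chain in chains):
        raise ValueError("all chains must have the same length")
    chain_means = [mean(chain) for chain in chains]
    within = mean([variance(chain) for chain in chains])   # W
    between = chain_length * variance(chain_means)         # B
    pooled = (chain_length - 1) / chain_length * within + between / chain_length
    return (pooled / within) ** 0.5


# Toy chains: two runs sampling the same region versus two runs that disagree.
agreeing_runs = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
disagreeing_runs = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]

r_agree = potential_scale_reduction(agreeing_runs)
r_disagree = potential_scale_reduction(disagreeing_runs)

print(round(r_agree, 3), round(r_disagree, 3))
```

Values close to 1 indicate concordant runs, whereas values well above 1 indicate that the runs have not converged to the same distribution.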

To facilitate a large exploration of the parameter space, for the autosomal sequences we chose uniform priors between 0 and 0.05 for 2N0μ and between −10⁹ and 10⁹ for g/μ. For HVS-I sequences, we chose uniform priors for N0μ between 0 and 10 and for g/μ between −2.5 × 10⁶ and 2.5 × 10⁶, resulting in the same priors on N0 and g as for autosomal sequences if we assumed μ = 10⁻⁵/generation/site (i.e., N0 constrained between 0 and 10⁶ and g constrained between −1 and 1 per year). Conversely, if we assumed μ = 5 × 10⁻⁶/generation/site, it meant that N0 was constrained between 0 and 2 × 10⁶ and g was constrained between −0.5 and 0.5 per year.
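The correspondence between the priors on the composite parameters and the implied bounds on N0 and g can be verified with a short sketch. It assumes that the composite parameters are N0μ (2N0μ for autosomes) and g/μ, that the bounds on g/μ are symmetric around 0, that g is expressed per generation before conversion, and that the generation time is 25 years. The function and variable names are hypothetical.

```python
import math

GENERATION_YEARS = 25  # generation time assumed in the text


def implied_bounds(theta_max, growth_over_mu_max, mu, copies=1):
    """Upper bounds on N0 and on g (per year) implied by composite priors.

    copies is 1 when the prior is on N0*mu and 2 when it is on 2*N0*mu.
    """
    n0_max = theta_max / (copies * mu)
    g_per_generation = growth_over_mu_max * mu
    return n0_max, g_per_generation / GENERATION_YEARS


# HVS-I with the pedigree-based rate and with the transitional rate.
hvs1_pedigree = implied_bounds(10, 2.5e6, mu=1e-5)
hvs1_transitional = implied_bounds(10, 2.5e6, mu=5e-6)
# Autosomes with the phylogenetic rate; the prior is on 2*N0*mu.
autosomes = implied_bounds(0.05, 1e9, mu=2.5e-8, copies=2)

print(hvs1_pedigree, hvs1_transitional, autosomes)
```

Under these assumptions, the pedigree-based HVS-I rate and the phylogenetic autosomal rate give the same bounds (N0 up to 10⁶ and g within 1 per year in absolute value), whereas the transitional HVS-I rate doubles the bound on N0 and halves the bound on g.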

For each population and model, we obtained the mode and the 95% HPD of N0 and g, inferred from their posterior distributions (supplementary tables S4 and S7, Supplementary Material online), using the add-on package Locfit (Loader 1999) in R v2.14.1. We selected the best-fitting model among the four tested demographic models by estimating marginal likelihoods using two methods: path sampling and stepping-stone sampling (Baele et al. 2012). The model with the greater marginal likelihood (supplementary table S3, Supplementary Material online) was considered as the best-fitting model.
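To make the two summaries above concrete, here is a minimal sketch of an empirical 95% HPD interval (the shortest interval containing 95% of the posterior samples) and of model choice by greatest marginal likelihood. It is only an illustration: the authors used Locfit in R for the posterior summaries, the model labels and log marginal likelihoods below are made up, and the function names are hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the posterior samples."""
    ordered = sorted(samples)
    n = len(ordered)
    n_in = max(1, math.ceil(mass * n))        # number of samples to cover
    best = (ordered[0], ordered[n_in - 1])
    for start in range(1, n - n_in + 1):
        candidate = (ordered[start], ordered[start + n_in - 1])
        if best[1] - best[0] > candidate[1] - candidate[0]:
            best = candidate
    return best


def best_fitting_model(log_marginal_likelihoods):
    """Name of the model with the greatest (log) marginal likelihood."""
    return max(log_marginal_likelihoods, key=log_marginal_likelihoods.get)


# Hypothetical posterior sample with one extreme value in the upper tail.
posterior_sample = list(range(20)) + [1000]
print("95% HPD:", hpd_interval(posterior_sample))

# Made-up log marginal likelihoods; "fourth_model" is a placeholder label.
log_ml = {"constant": -1520.4, "exponential": -1512.9,
          "expansion": -1514.0, "fourth_model": -1516.2}
print("best-fitting model:", best_fitting_model(log_ml))
```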

Extended Bayesian Skyline Plots

EBSPs (Heled and Drummond 2010), also implemented in BEAST, estimate demographic changes occurring continuously through time in a population, using the time intervals between successive coalescent events. This method allows a visualization of the evolution of Ne through time. As above, we combined three runs of 10⁷ steps for mitochondrial sequences and three runs of 2 × 10⁸ steps for autosomal sequences to obtain acceptable ESS values. We assumed the same mutation rates as above and a generation time of 25 years. Outputs were analyzed with Tracer v1.5 to visually check for convergence and ESS, and also to obtain the 95% HPD interval for the number of demographic changes that occurred in the population (supplementary table S5, Supplementary Material online). A constant population size could be rejected when the 95% HPD of the number of change points excluded 0 but included 1 (Heled and Drummond 2010). Then, we used R v2.14.1 to compute Gelman and Rubin's (1992) convergence diagnostic as above, as well as to compute skyline plots. Finally, we used the population growth curves generated from BEAST to assess the time at which populations began to expand. Each skyline plot consisted of smoothed data points at approximately 10–20 generation intervals. We considered that the population increased (or decreased) when both the median and the 95% HPD values for Ne increased (or decreased) between more than two successive data points. Although this method did not provide a 95% HPD interval for the inferred expansion timings, this conservative approach ensured that we considered only relevant expansion signals.
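The rule used to date expansion onsets can be sketched as follows. This is an interpretation, not the authors' code: "more than two successive data points" is read here as a run of at least three points over which the median and both 95% HPD bounds all rise, the series are assumed to be ordered from the most distant past to the present, and all names and values are hypothetical.

```python
def first_sustained_increase(median, lower, upper, min_points=3):
    """Index of the first point of a run of min_points or more points over which
    the median and both HPD bounds strictly increase; None if there is no such run."""
    run_start = 0
    for i in range(1, len(median)):
        rising = (median[i] > median[i - 1]
                  and lower[i] > lower[i - 1]
                  and upper[i] > upper[i - 1])
        if not rising:
            run_start = i                      # the run is broken, restart here
        elif i - run_start + 1 >= min_points:
            return run_start
    return None


# Hypothetical skyline, ordered from past to present (times in YBP).
times_ybp = [40000, 30000, 20000, 10000, 0]
ne_median = [100, 100, 120, 150, 200]
ne_lower = [50, 50, 60, 80, 100]
ne_upper = [200, 200, 240, 300, 400]

onset = first_sustained_increase(ne_median, ne_lower, ne_upper)
if onset is not None:
    print("expansion onset at about", times_ybp[onset], "YBP")
```

Requiring all three curves to rise together is what makes the rule conservative: an increase in the median alone, with a flat lower bound, is not counted.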

Correlation Tests of Inferred Growth Rates and Isolation/Immigration Patterns

To test how differences in degrees of isolation and migration patterns could have impacted our demographic inferences, we used ARLEQUIN v3.11 (Excoffier et al. 2005) to compute population-specific FST values (Weir and Hill 2002) from HVS-I data for Central Africa, Eurasia, and Central Asia (supplementary table S9, Supplementary Material online), and to estimate immigration rates from mismatch distributions under a spatially explicit model (Excoffier 2004) (supplementary table S10, Supplementary Material online). We then performed Spearman tests using R v2.14.1 to investigate, for each region, how the inferred parametric growth rates were correlated with those FST values and immigration rates. We used a value of 0 for the growth rate when the constant model best fitted the data.
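The correlation step can be illustrated with a small, self-contained sketch of Spearman's rank correlation (Pearson's correlation computed on ranks, with tied values sharing their mean rank). It is not the authors' R code, it returns only the coefficient and no P value, and the FST values and growth rates below are invented for illustration; growth rates of 0 stand for populations best fitted by the constant model.

```python
import math
from statistics import mean


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        first = ordered.index(v)               # 0-based position of the first tie
        count = ordered.count(v)
        rank_of[v] = first + (count + 1) / 2
    return [rank_of[v] for v in values]


def pearson(x, y):
    """Pearson correlation coefficient (undefined if either input is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rank correlation coefficient."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical population-specific FST values and inferred growth rates.
fst = [0.01, 0.02, 0.03, 0.04, 0.05]
growth = [0.004, 0.003, 0.0, 0.0, 0.001]   # 0 where the constant model fitted best
print("rho =", round(spearman_rho(fst, growth), 3))
```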

Supplementary Material

Supplementary figures S1 and S2 and tables S1–S12 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors would like to warmly thank all volunteer participants. They also thank Michael Fontaine for his help on some data analyses, Phillip Endicott and Julio Bendezu-Sarmiento for insightful discussions, Alexei Drummond for helpful discussions, and Friso Palstra for his help on English usage. They thank Laurent Excoffier, two anonymous reviewers, and the editor for helpful comments and suggestions. All computationally intensive analyses were run on the Linux cluster of the Muséum National d'Histoire Naturelle (administered by Julio Pedraza) and on the web-based portal "Bioportal" (Kumar et al. 2009). This work was supported by Actions Transversales du Muséum - Muséum National d'Histoire Naturelle (grant "Les relations Sociétés-Natures dans le long terme") and Agence Nationale de la Recherche (grants "Altérité culturelle," ANR-10-ESVS-0010, and "Demochips," ANR-12-BSV7-0012). C.A. was financed by a PhD grant from the Centre National de la Recherche Scientifique.

References

1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.
Ambrose SH. 2001. Paleolithic technology and human evolution. Science 291:1748–1753.
Arenas M, Francois O, Currat M, Ray N, Excoffier L. 2013. Influence of admixture and Paleolithic range contractions on current European diversity gradients. Mol Biol Evol. 30:57–61.
Atkinson QD, Gray RD, Drummond AJ. 2009. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa. Proc R Soc Lond B Biol Sci. 276:367–373.
Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol. 29:2157–2167.
Balaresque PL, Ballereau SJ, Jobling MA. 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum Mol Genet. 16:R134–R139.
Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. 2011. Insights into the demographic history of African pygmies from complete mitochondrial genomes. Mol Biol Evol. 28:1099–1110.
Beaumont MA. 2004. Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92:365–379.
Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 57:289–300.
Bermisheva M, Salimova A, Korshunova T, Svyatova G, Berezina G, Villems R, Khusnutdinova E. 2002. Mitochondrial DNA diversity in the populations of Middle Asia and Northern Caucasus. Eur J Hum Genet. 10:179–180.
Bocquet-Appel JP. 2011. When the world's population took off: the springboard of the Neolithic Demographic Transition. Science 333:560–561.
Bocquet-Appel J-P, Bar-Yosef O. 2008. The Neolithic demographic transition and its consequences. Dordrecht (The Netherlands): Springer.


Chaix R, Austerlitz F, Hegay T, Quintana-Murci L, Heyer E. 2008. Genetic traces of east-to-west human expansion waves in Eurasia. Am J Phys Anthropol. 136:309–317.
Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, Heyer E. 2007. From social to genetic structures in central Asia. Curr Biol. 17:43–48.
Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. 2009. The Last Glacial Maximum. Science 325:710–714.
Conrad DF, Keebler JEM, DePristo MA, et al. (17 co-authors). 2011. Variation in genome-wide mutation rates within and between human families. Nat Genet. 43:712–714.
Derenko MV, Malyarchuk BA, Dambueva IK, Shaikhaev GO, Dorzhu CM, Nimaev DD, Zakharov IA. 2000. Mitochondrial DNA variation in two South Siberian aboriginal populations: implications for the genetic history of North Asia. Hum Biol. 72:945–973.
Dirksen VG, van Geel B. 2004. Mid to late Holocene climate change and its influence on cultural development in South Central Siberia. Nato Sci S Ss IV Ear. 42:291–307.
Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 7:214.
Endicott P, Ho SYW, Metspalu M, Stringer C. 2009. Evaluating the mitochondrial timescale of human evolution. Trends Ecol Evol. 24:515–521.
Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol. 13:853–864.
Excoffier L, Heckel G. 2006. Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet. 7:745–758.
Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online. 1:47–50.
Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A. 109:E2569–E2576.
Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet. 59:935–945.
Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.
Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.
Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci. 7:457–511.
Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol. 27:570–580.
Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Segurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet. 10:49.
Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol. 21:597–612.
Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet. 63:1113–1126.
Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res. 11:423–434.
Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet. 59:501–509.
Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.
Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med. 116:68–73.
International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.

Kingman JFC. 1982. The coalescent. Stoch Proc Appl. 13:235–248.
Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet. 73:671–676.
Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet. 2:e53.
Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.
Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.
Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.
Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.
Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A. 107:19201–19206.
Loader C. 1999. Local regression and likelihood. New York: Springer.
Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A. 103:9381–9386.
Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.
Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.
Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human populations on a global scale. Mol Biol Evol. 10:927–943.
Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.
Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet. 29:20–21.
Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet. 6:165–183.
Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev. 16:1125–1133.
Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet. 5:e1000448.
Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.
Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.
Plummer M, Best N, Cowles K, Vines K. 2006. CODA: Convergence diagnosis and output analysis for MCMC. R News 6:7–11.
Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.
Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet. 74:827–845.
Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A. 105:1596–1601.


R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.
Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer
Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol. 19:2092–2100.
Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol. 20:76–86.
Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet. 67:1251–1276.
Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol. 142:485–501.
Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.
Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet. 13:745–53.
Segurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet. 21:1146–1151.
Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.
Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet. 25:207–217.
Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.
Tremblay M, Vezina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet. 66:651–658.
Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol. 21:559–566.
Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol. 19:312–318.
Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture, and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol. 30:918–937.
Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A. 102:18508–18513.
Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet. 36:721–750.
Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.
Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet. 70:635–651.



FIG. 2. EBSPs inferred from HVS-I sequences in African sedentary farmers (A), African nomadic hunter-gatherers (B), Eurasian sedentary farmers (C), Eurasian nomadic herders (D), Central Asian sedentary farmers (E), and Central Asian nomadic herders (F). The values indicated in bold are obtained assuming a mutation rate of μ = 5 × 10⁻⁶/generation/site (transitional changes rate; Forster et al. 1996), and the others correspond to μ = 10⁻⁵/generation/site (pedigree-based rate; Howell et al. 1996; Heyer et al. 2001). Time is represented in years, assuming a generation time of 25 years, and runs backward along the x axis, from the present on the left to the most distant past on the right. The lower and upper bounds of the 95% HPD are represented by dashed lines. Populations for which the estimated number of demographic changes includes 0 (i.e., no significant signal of expansion or decline) are represented in light gray, and the others in black.


For the HVS-I sequences from Eurasia (supplementary tables S7 and S8, Supplementary Material online), the parametric BEAST analyses showed that models consistent with an increase in population size (expansion model or exponential model with positive growth rates) best fitted the data for all sedentary populations except Koreans. Conversely, the constant model best fitted the data for all nomadic populations, as well as for Koreans. EBSPs showed, however, significant expansion events for both farmers and herders, but not for Koreans (fig. 2C and D and supplementary table S5, Supplementary Material online). Nevertheless, there was a tendency toward stronger expansion rates and higher Ne values in sedentary than in nomadic populations (fig. 2C and D), although the 95% HPD intervals for Ne were quite large for sedentary populations. The estimated expansion onset times inferred from the EBSPs (supplementary table S6, Supplementary Material online) followed an east-to-west gradient: they appeared more ancient in Eastern populations, in both sedentary and nomadic populations (supplementary fig. S1, Supplementary Material online). They also clearly predated the Neolithic transition in all geographic areas (fig. 4).

The Central Asian Exception: Similar Demographic Patterns in Farmers and Herders

For autosomes, the constant model best fitted the data for both sedentary farmers (TAB) and traditionally nomadic herders (KIB) (supplementary tables S3 and S4, Supplementary Material online). EBSPs also showed no significant demographic changes for these populations (figs. 1C and D and supplementary table S5, Supplementary Material online).

For HVS-I (supplementary tables S3 and S7, Supplementary Material online), the exponential model best fitted the data for six of the 12 sedentary farmer populations (including TAB), whereas the constant model was preferred for the other farmers. Unlike the rest of Eurasia, a model indicating expansion (the exponential model with positive growth rates) was also selected for all nomadic herders. Moreover, EBSPs showed significant expansion signals for both herder and farmer populations, except TJY (Yagnobs from Dushanbe), since at least 13,860 YBP (or 27,720 YBP) for farmers and 16,546 YBP (or 33,092 YBP) for herders on average (fig. 2E and F and supplementary tables S5 and S6, Supplementary Material online). Again, these inferred expansion onsets predated the emergence of farming in the area, about 8,000 YBP (Bocquet-Appel and Bar-Yosef 2008) (fig. 4). Inferred expansions for Central Asian sedentary farmers seemed overall weaker (i.e., lower growth rate and lower Ne) than those observed for other sedentary populations in Eurasia, although we observed important variations in growth rates and Ne among populations and large 95% HPD intervals for some of them (fig. 2C and E).
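The paired onset times quoted above differ by exactly a factor of two, which matches the ratio of the two HVS-I mutation rates used in the study (10⁻⁵ and 5 × 10⁻⁶/generation/site). The sketch below illustrates that relationship under the assumption that the inferred time scales inversely with the assumed mutation rate, with the smaller value of each pair presumably corresponding to the higher rate. The function name is hypothetical and this is not the authors' code.

```python
def rescale_time(time_years, mu_assumed, mu_alternative):
    """Rescale a time inferred under mu_assumed to the time implied by
    mu_alternative, assuming time scales as 1 / mutation rate."""
    return time_years * mu_assumed / mu_alternative


MU_PEDIGREE = 1e-5       # higher HVS-I rate (per generation per site)
MU_TRANSITIONS = 5e-6    # lower HVS-I rate (per generation per site)

for label, onset in [("farmers", 13860), ("herders", 16546)]:
    older = rescale_time(onset, MU_PEDIGREE, MU_TRANSITIONS)
    print(f"{label}: {onset} YBP under the higher rate, "
          f"about {round(older)} YBP under the lower rate")
```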

Degrees of Isolation and Migration Patterns

African farmer populations appeared less isolated and received more migrants than hunter-gatherer populations. Indeed, the population-specific FST values (supplementary table S9, Supplementary Material online) were on average significantly lower for farmers than for hunter-gatherers (mean[farmers] = 0.058, mean[HG] = 0.192, Wilcoxon two-sided test, P value = 0.0002). Moreover, the estimated number of immigrants was significantly higher for sedentary farmers than for nomadic hunter-gatherers (mean[farmers] = 314, mean[HG] = 221, P value = 0.0001) (supplementary table S10, Supplementary Material online). For hunter-gatherers, the FST values were negatively correlated with the negative growth rates that we inferred from the parametric method (ρ = −0.893, P value = 0.012) (fig. 5B), meaning that less isolated populations showed weaker contraction events (i.e., less negative growth rates). Conversely, there was no significant correlation between FST values and inferred growth rates for sedentary farmers (ρ = 0.433, P value = 0.249) (fig. 5A). However, we found a significant positive correlation between the number of immigrants and the inferred growth rates (supplementary fig. S2, Supplementary Material online) among sedentary farmer populations (ρ = 0.867, P value = 0.004), but not among nomadic hunter-gatherers (ρ = 0.536, P value = 0.235).

For Eurasia, we found no significant difference in FST values between farmers and herders (mean[farmers] = 0.039, mean[herders] = 0.043, P value = 0.77; supplementary table S9, Supplementary Material online), except in Central Asia, for which we found significantly lower FST values for nomadic herders than for sedentary farmers (mean[farmers] = 0.018, mean[herders] = 0.008, P value = 0.017). We report a significant negative correlation between FST values and inferred growth rates for sedentary farmers in Eurasia (ρ = −0.673, P value = 0.028) (fig. 5C) and Central Asia (ρ = −0.773, P value = 0.003) (fig. 5D), thus meaning that less isolated populations showed higher inferred growth rates. There was no significant correlation for Central Asian herders (ρ = −0.092, P value = 0.736) (fig. 5E). Note that this analysis could not be performed for the other Eurasian herder populations, as the constant model best fitted the data with the parametric method.

FIG. 3. Comparison of estimated times for expansion onsets using autosomes and dating of the first archeological traces of farming in Africa and China. Time is represented backward (in YBP). Only populations for which the EBSP analysis showed a significant expansion event are represented. We reported the time values estimated with the highest mutation rate that we used for the autosomes (μ = 2.5 × 10⁻⁸/generation/site). Thus, these time values can be considered as a lower bound for the expansion onsets. The dates for the emergence of farming come from the review by Bocquet-Appel and Bar-Yosef (2008). They are based on archeological remains.

FIG. 4. Comparison of estimated times for expansion onsets using HVS-I and dating of the first archeological traces of farming or herding in Central Africa (A), Eurasia (B), and Central Asia (C). Time is represented backward (in YBP). Only populations for which the EBSP analysis showed a significant expansion event are represented. We reported the time values estimated with the highest mutation rate that we used for the HVS-I sequences (μ = 10⁻⁵/generation/site). Thus, these time values can be considered as a lower bound for the expansion onsets. The dates for the emergence of farming come from the review by Bocquet-Appel and Bar-Yosef (2008). They are based on archeological remains.

FIG. 5. Correlations between population-specific FST values and inferred growth rates in African farmer (A) and hunter-gatherer (B) populations, Eurasian farmer populations (C), and Central Asian farmer (D) and herder populations (E). Population-specific FST values were computed with ARLEQUIN v3.11 (Excoffier et al. 2005). The growth rates were inferred under the best-fitting model from the parametric method using BEAST (Drummond and Rambaut 2007). When the best-fitting model was the constant model, we assumed a growth rate of 0. Note that we did not represent Eurasian herder populations, as the constant model best fitted the data for all of them. Plots and correlation tests were performed using R v2.14.1 (R Development Core Team 2011).

The estimation of the proportion of immigrants did not converge for 11 Eurasian populations (Han Chinese, Liaoning, Qingdao, Palestinians, Pathans, Mongols, as well as three Central Asian farmer populations and two Central Asian herder populations). Regarding the other populations, we found no significant difference in the proportion of immigrants between farmers and herders, both in Central Asia (mean[farmers] = 82868, mean[herders] = 260265, P value = 0.12) and in the rest of Eurasia (mean[farmers] = 576757, mean[herders] = 201311, P value = 0.51) (supplementary table S10, Supplementary Material online). We also found no significant correlation between this proportion and the inferred growth rates for Eurasian farmers (ρ = −0.238, P value = 0.48), Central Asian farmers (ρ = 0.386, P value = 0.30), and Central Asian herders (ρ = −0.42, P value = 0.139) (supplementary fig. S2, Supplementary Material online).

Discussion

In this study, using a large set of populations from distant geographic areas, we report contrasted demographic histories that correlate with lifestyle. Moreover, the inferred expansion signals in both African and Eurasian farmer and herder populations predated the Neolithic transition and the sedentarization of these populations.

Contrasted Demographic Histories in Sedentary and Nomadic Populations

For Africa, both mtDNA and autosomal data revealed expansion patterns in most sedentary farmer populations, as indicated by neutrality tests and the parametric and nonparametric BEAST methods. Conversely, we found constant effective population sizes (or possibly contraction events) for all hunter-gatherer populations. Among the farmers, results were least clear for the Yoruba and the Ewondo populations, as no neutrality test was significant for these populations, whereas they showed evidence of expansion events when analyzed with BEAST. This indicates that these populations may have undergone weaker expansion dynamics (i.e., lower growth rates and Ne) than the others. These results are of particular importance for the Yoruba, as it is a reference population in many databases (HapMap, 1000 Genomes). They also point to the higher sensitivity of MCMC methods such as BEAST for detecting expansions, in comparison with neutrality tests.

The contrasted patterns inferred between sedentary and nomadic populations in Africa suggest strong differences between the demographic histories of these two groups of populations. The question is whether this pattern results mostly from differences in local expansion dynamics or whether spatial expansion processes at a larger scale were also involved. As shown by Ray et al. (2003), negative values for the neutrality tests will be observed in a spatial expansion process if the rate of migrants (Nm) is high enough (at least 20), but not otherwise. As in previous studies (e.g., Verdu et al. 2013), we report a higher degree of isolation (higher population-specific FST values) in hunter-gatherer populations than in farmer populations. Using the spatial expansion model of Excoffier (2004) also leads to higher estimates of the number of immigrants into farmer populations. Thus, both farmers and hunter-gatherers may have been subject to a spatial expansion process, but the limited number of migrants among hunter-gatherers may have resulted in an absence of expansion signals for them. This would be consistent with the positive correlation that we observe between the growth rates estimated with BEAST and the inferred number of immigrants in the sedentary farmer populations. However, this spatial expansion process seems unlikely to completely explain the strong association that we observed between lifestyle and expansion patterns, as some farmer populations (Teke, Gabonese Fang) displayed FST values similar to those of hunter-gatherers but a clear signal of expansion with relatively high growth rates. This suggests that even rather isolated farmer populations show substantial levels of expansion. Moreover, FST values and inferred growth rates in farmer populations were not significantly correlated. Therefore, our results suggest that the expansion patterns observed in sedentary populations do not result only from a spatial expansion process. In addition, local dynamics connected with the higher capacity of food production by farmers also explain their much stronger expansion signatures relative to their neighboring hunter-gatherer populations.

For Eurasia, when considering the mtDNA data, all three methods (neutrality tests, parametric BEAST analyses, and EBSPs) yielded expansion signals for all sedentary farmer populations except Koreans. Conversely, only EBSPs and Fu's Fs test showed expansion signals for nomadic herders, but neither the parametric BEAST method nor the other neutrality tests did. This result points toward weaker expansion dynamics in herders than in farmers, as also supported by the tendency for lower growth rates and Ne in herder populations than in farmer populations on the EBSP graphs (fig. 2C and D). It thus seems that the flexibility and nonparametric nature of EBSP analyses allow one to detect weaker expansion events than the parametric method. Moreover, Fu's Fs is known to be more sensitive than the other neutrality tests for detecting expansions (Ramos-Onsins and Rozas 2002). Again, these inferred expansions may result, at least in part, from spatial expansion processes. The population-specific FST values are indeed rather low in Eurasia. Moreover, we found a significant negative correlation between FST values and inferred growth rates for the sedentary farmers, indicating that less isolated populations showed stronger expansion signals. However, although we inferred much stronger expansion patterns for the farmers than for the herders, we did not observe any differences in Eurasia between the farmers and the herders in the population-specific FST values or in the estimated number of immigrants, suggesting that spatial processes alone cannot explain the strong difference that we observed between the expansion patterns of these two groups of populations. This indicates that the intrinsic demographic growth patterns are different between these two kinds of populations, the farmers showing much higher growth rates than the herders.

To our knowledge, although other studies have found different patterns between hunter-gatherers and farmers (e.g., Verdu et al. 2009), our study is the first to show differences between farmers and herders, the two major post-Neolithic human groups. A plausible explanation could be that nomadic herders and hunter-gatherers share several of the constraints of a nomadic way of life. For instance, birth intervals are generally longer (at least 4 years) in nomadic populations than in sedentary populations (e.g., Short 1982). According to Bocquet-Appel (2011), these longer birth intervals may be mainly determined by diet differences. Indeed, Valeggia and Ellison (2009) demonstrated that birth interval is mainly determined by the rapidity of postpartum energy recovery, which may be increased by the consumption of high-carbohydrate food (like cereals). Moreover, the nomadic herder way of life may offer less food security than sedentary farming, the latter facilitating efficient long-term food storage.

However, unlike in Africa, we did not find systematically consistent patterns between the autosomal and mtDNA data in Eurasia. The possible contraction events that our results suggest for two sedentary populations (Japanese and Danes) with autosomes appeared concomitant with historical events that could have led to bottleneck processes. For the Japanese population, this contraction signal could indeed result from a founder effect due to the Paleolithic colonization of Japan by a subset of the Northern Asiatic people (especially from Korea; Nei 1995). Similarly, a bottleneck process may also have occurred in the Danish population, linked with the last glacial maximum occurring between 26,500 and 19,500 YBP (Clark et al. 2009). Reasons why these processes impacted the autosomes but not the mtDNA data remain to be determined, for instance through simulation studies. In any case, our study clearly emphasizes the utility of combining mtDNA and autosomal sequences, as they allow access to different aspects of human history. A recent study on harbor porpoises has similarly shown that nuclear markers were sensitive to a recent contraction event, whereas mtDNA allowed inferring a more ancient expansion (Fontaine et al. 2012).

Interestingly, Central Asia displayed a distinct pattern from the rest of Eurasia. Indeed, we did not infer higher expansion rates for sedentary farmers than for nomadic herders in that area. This could result from harsh local environmental conditions due to the arid continental climate in this area. Indeed, using pollen records, Dirksen and van Geel (2004) showed that the paleoclimate in Central Asia was very arid from at least 12,000 to 3,000 YBP, which could have limited the amount of suitable areas for farming and impacted human demography. Spatial expansion processes may also have played a role in this difference, as population-specific FST values were higher for the farmers than for the herders. This may indicate that more migrants were involved in the spatial expansion process for the herders than for the farmers, yielding a weaker expansion signal (i.e., lower inferred growth rate) for the latter (Ray et al. 2003). This is supported by the negative correlation between the FST values and the inferred growth rates in the farmer populations. The Korean population also stood out as an exception in Eurasia. Even though it is a population of sedentary farmers, it showed no significant expansion signal with either the parametric or the nonparametric method with HVS-I. This could be explained by a later sedentarization of this population. The Korean Neolithic is notably defined by the introduction of Jeulmun ware ceramics about 8,000 YBP, but the people of the Jeulmun period were still predominantly semi-nomadic fishers and hunter-gatherers until about 3,000 YBP, when Koreans started an intensive crop production, implying a sedentary lifestyle (Nelson 1993).

Inferred Expansion Signals Predate the Emergence of Farming

EBSP analyses revealed that the inferred expansion events in farmer and herder populations were more ancient than the emergence of farming and herding. Therefore, the differences in demographic patterns between farmers and herders seem to predate their divergence in lifestyle, which raises the question of the chronology of demographic expansions and the Neolithic transition. These findings appear to be quite robust to the choice of the scaling parameters. We used here both the lower and the higher mutation rate estimates in humans for autosomes (Pluzhnikov et al. 2002; Conrad et al. 2011) and for the HVS-I sequence (Forster et al. 1996; Howell et al. 1996). Despite this uncertainty in mutation rates, which leads to a 2-fold uncertainty in our time estimates, the inferred expansion signals predated the emergence of agriculture in both cases for all populations. Similarly, using a generation time of 29 years (Tremblay and Vezina 2000) instead of 25 years leads to slightly more ancient estimates and thus does not change our conclusions (data not shown). However, note that for HVS-I, using the higher bound of the credibility interval for the highest estimated mutation rate (2.75 × 10⁻⁵/generation/site; Heyer et al. 2001) instead of the mean value (i.e., 10⁻⁵/generation/site) leads to expansion time estimates consistent with the Neolithic transition in Eurasian populations (supplementary table S11, Supplementary Material online). Nevertheless, these estimates still clearly predated the Neolithic for the African populations. However, 10⁻⁵/generation/site is by far the highest estimate of the mutation rate in the literature (Howell et al. 1996). To infer Neolithic expansions in most Eurasian populations, one needs to assume a mutation rate of at least 2 × 10⁻⁵/generation/site, which is much higher than other estimates from the literature and is thus probably unrealistic. Moreover, our method for determining the expansion onset time using EBSP graphs is very conservative and also tends to favor the lower bound of expansion onset times. Finally, for autosomes, similarly using 4.74 × 10⁻⁸/generation/site instead of 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002) leads to an inferred expansion onset time that is not compatible with the Neolithic transition for all Eurasian and African populations, except for one African population, the Yoruba (supplementary table S12, Supplementary Material online). Consequently, it seems very likely that the expansions inferred in this study correspond to Paleolithic rather than Neolithic demographic events, in agreement also with most previous studies, as detailed later.
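
The dependence of the dating on the scaling parameters can be illustrated with a minimal Python sketch. It assumes that an expansion onset is first obtained in units of expected substitutions per site and is then divided by the mutation rate and multiplied by the generation time; the function name and the example onset value are hypothetical and are not taken from the analyses reported here.

```python
def years_before_present(time_in_substitutions, mu, generation_years=25.0):
    """Convert a time in substitutions/site into years before present.

    mu is the mutation rate per generation per site (assumed known).
    """
    generations = time_in_substitutions / mu
    return generations * generation_years

onset = 0.01  # hypothetical onset time, in substitutions per site
fast = years_before_present(onset, mu=1e-5)   # pedigree-based HVS-I rate
slow = years_before_present(onset, mu=5e-6)   # transitional-changes HVS-I rate
longer_generation = years_before_present(onset, mu=1e-5, generation_years=29.0)
print(fast, slow, longer_generation)
```

Halving the assumed mutation rate doubles the inferred age, which is the 2-fold uncertainty mentioned above, whereas replacing 25 by 29 years per generation scales every date by 29/25.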

In Africa, the emergence of agriculture has been dated between 5,000 and 4,000 YBP in the Western part of Central Africa, and agriculture subsequently rapidly expanded to the rest of sub-Saharan Africa (Phillipson 1993). However, using HVS-I, we showed expansion events in farmer populations since about 30,000 or 60,000 YBP, thus largely predating the emergence of agriculture in the area. Similarly, using autosomes, especially in Eastern African populations, we inferred expansion signals that clearly predated the Neolithic. Notably, we inferred an expansion signal for Mozambicans since at least 80,000 YBP. Several genetic studies have already highlighted that expansion events occurred in African farmers before the Neolithic transition (e.g., Atkinson et al. 2009; Laval et al. 2010; Batini et al. 2011). This finding is also consistent with paleoanthropological data (i.e., radiocarbon dating) suggesting an expansion event in Africa 60,000–80,000 YBP (Mellars 2006a). This Paleolithic demographic expansion could be linked to a rapid environmental change toward a dryer climate (Partridge et al. 1997) and/or to the emergence of new hunting technologies (Mellars 2006a).

According to Mellars (2006a), this period corresponds to a major increase in the complexity of the technological, economic, social, and cognitive behavior of certain African groups. It corresponds in particular to the emergence of projectile technologies (Shea 2009), which was probably part of a broader pattern of ecological diversification of early Homo sapiens populations. These changes could have been decisive for the human spread "Out of Africa" during the same period and could have ultimately also led to the sedentarization of the remaining populations. This inference is consistent with Sauer's (1952) demographic theory, which stated that late Paleolithic demographic expansions could have favored the sedentarization and the emergence of agriculture in some human populations. In the case of Central Africa, the period of 60,000 YBP corresponds to the separation between hunter-gatherers' and farmers' ancestors (Patin et al. 2009; Verdu et al. 2009). Thus, these two groups may have presented contrasting demographic patterns since their divergence. Much later, higher expansion rates and larger population sizes among farmers' ancestors may have induced the emergence of agriculture and sedentarization.

With respect to Eurasia, the expansion profiles inferred with HVS-I for all populations and with autosomes for the Han Chinese population also seem to have begun during the Paleolithic, thus before the Neolithic transition. Some genetic studies already reported pre-Neolithic expansions in Asia and Europe (e.g., Chaix et al. 2008). Notably, using mismatch and intermatch distributions, Chaix et al. (2008) showed an east-to-west Paleolithic expansion wave in Eurasia. We found a similar pattern here, as the inferred expansions of East-Asian populations were earlier than those of Central Asian populations, themselves earlier than those of European populations. Moreover, we found this pattern in both sedentary farmer and nomadic herder populations. Thus, the ancestors of currently nomadic herder populations also experienced these Paleolithic expansions. However, Paleolithic expansion signals in nomadic populations seem weaker than in sedentary populations. This is again compatible with the demographic theory of the Neolithic sedentarization (Sauer 1952): some populations may have experienced more intense Paleolithic expansions, which may have led ultimately to their sedentarization.

The inferred Paleolithic expansion signals might result partly from spatial expansions out of some refuge areas after the Last Glacial Maximum (LGM; 26,500–19,500 YBP; Clark et al. 2009), as this time interval matches our inferred dating for expansion onsets in East Asia with HVS-I using the pedigree-based mutation rate and in Europe and the Middle East using the transitional mutation rate. Some of the earlier date estimates might also be consistent with the out-of-Africa expansion of H. sapiens. However, the radiocarbon-based estimates of the timing of the spread of H. sapiens in Eurasia are generally more ancient than our inferred expansion onset timings. For instance, Mellars (2006b) dated the colonization of the Middle East by H. sapiens at 47,000–49,000 YBP and of Europe at 41,000–42,000 YBP. Pavlov et al. (2001) reported traces of modern human occupation nearly 40,000 years old in Siberia. Finally, Liu et al. (2010) described modern human fossils from South China dated to at least 60,000 YBP. Moreover, out-of-Africa or post-LGM expansions would not explain our finding of an east-to-west gradient of expansion onset timing, which rather supports the hypothesis of a demographic expansion diffused from east to west in Eurasia in a demic (i.e., migrations of individuals) or cultural (favored by the diffusion of new technologies) manner.

Possible Confounding Factors

Our approach makes the assumption that populations are isolated and panmictic, which is questionable for human populations. However, we analyzed a large set of populations sampled in very distant geographical regions (i.e., Central Africa, East Africa, Europe, Middle East, Central Asia, Pamir, Siberia, and East Asia). The main conclusions of this study rely on consistent patterns between most of these areas, and it seems unlikely that processes such as admixture could have biased the estimates similarly everywhere. Moreover, in Central Africa, several studies have shown that hunter-gatherer populations show signals of admixture, whereas it is not the case for farmer populations (Patin et al. 2009; Verdu et al. 2009, 2013). If this introgression had been strong enough, this may have yielded a spurious expansion signal in the hunter-gatherer populations, which is not what we observed here. In Europe, spatial expansion processes during the Neolithic may have led to admixture with Paleolithic populations. As pointed out by a simulation study (Arenas et al. 2013), this may lead to a predominance of the Paleolithic gene pool. This may be one of the factors explaining why we observed mostly Paleolithic expansions here.


Similarly, potential selection occurring on the whole mitochondrial genome (e.g., Pakendorf and Stoneking 2005) seems unlikely to have impacted in the same way all the studied populations within each group (e.g., stronger positive selection on sedentary than on nomadic populations), as we analyzed different nomadic and sedentary populations living near each other in several geographically distant areas.

Regarding the potential effects of recombination on the inferences from autosomal data, we found that neutrality tests gave similar results on the whole sequences, when using a simulation procedure that took the known recombination rate of each sequence into account (table 1), and on the largest non-recombining blocks as inferred with IMgc, without taking recombination into account in the simulation process (supplementary table S12, Supplementary Material online). It thus appears unlikely that the BEAST analyses, which can only handle the largest inferred non-recombining blocks, are biased because of this.

Finally, note that the effective population sizes inferred using BEAST correspond to the Ne of the populations during their recent history, rather than a value of Ne averaged over the history of the population. This explains the finding that, for most populations, we inferred Ne estimates much higher than generally assumed for humans by population geneticists (about 10,000).

Material and Methods

Genetic Markers

Autosomal Sequences

We used data from 20 noncoding, a priori neutral, and unlinked autosomal regions selected by Patin et al. (2009) to be at least 200 kb away from any known or predicted gene, to not be in linkage disequilibrium (LD) either with each other or with any known or predicted gene, and to have a region of homology with the chimpanzee genome. These regions are on average 1,253 bp long. Using the four-gamete test (Hudson and Kaplan 1985) as implemented in IMgc online (Woerner et al. 2007), we identified recombination events for 6 of these 20 regions. As some methods used in this study cannot handle recombination, we retained for these six sequences the largest non-recombining block inferred by IMgc. Because of this reduction, the 20 regions used were on average 1,228 bp long. To identify potential bias related to this method (e.g., some recombination events may not be detected using the four-gamete test; larger blocks of non-recombining sequence may select for gene trees that are shorter than expected), we computed the summary statistics and performed neutrality tests (discussed later) both on the whole sequences (table 1) and on the largest non-recombining blocks (supplementary table S12, Supplementary Material online).
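
The logic of the four-gamete test can be sketched as follows. This is only an illustration of the criterion of Hudson and Kaplan (1985), not a reimplementation of IMgc: it assumes biallelic sites coded as characters, and the function name and toy haplotypes are hypothetical.

```python
from itertools import combinations

def four_gamete_violations(haplotypes):
    """Return the pairs of site indices at which all four gametes are seen.

    Assumes every site is biallelic, so a pair of sites can show at most
    four distinct two-site haplotypes; observing all four implies at least
    one recombination event (or a recurrent mutation) between the sites.
    """
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations

# Toy data: five haplotypes typed at three biallelic sites.
haps = ["000", "010", "100", "110", "111"]
print(four_gamete_violations(haps))
```

A non-recombining block in this sense is a run of sites containing no such pair; IMgc additionally decides which sites or individuals to discard, which this sketch does not attempt.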

Mitochondrial Sequences

We used the first hypervariable segment of the mitochondrial control region (HVS-I), sequenced between positions 16067 and 16383, excluding the hypervariable poly-C region (sites 16179–16195). The total length of the sequence was thus 300 bp.

Population Panel

For Africa, we used the autosomal sequences data set of Patin et al. (2009), which consists of five farmer populations (N = 118 individuals) and five Pygmy hunter-gatherer populations (N = 95). In addition, we used the HVS-I data set from Quintana-Murci et al. (2008), which consists of nine Central African farmer populations (N = 486) and seven Central African hunter-gatherer populations (N = 318) (supplementary table S1, Supplementary Material online).

For Eurasia, we used the autosomal sequences data set of Laval et al. (2010), consisting of 48 individuals from two East-Asian populations (Han Chinese and Japanese) and 47 individuals from two European populations (Chuvash and Danes). We also used the data from 48 individuals from one sedentary Central Asian population (Tajik farmers) and 48 individuals from one nomadic Central Asian population (Kyrgyz herders) of Segurel et al. (2013). For HVS-I, we analyzed data from 17 Eurasian populations (N = 494 in total), located from Eastern to Western Eurasia, belonging to several published data sets (Derenko et al. 2000; Richards et al. 2000; Bermisheva et al. 2002; Imaizumi et al. 2002; Yao et al. 2002; Kong et al. 2003; Quintana-Murci et al. 2004; supplementary table S1, Supplementary Material online).

For our detailed study of Central Asia, we used HVS-I sequences from 12 farmer populations (N = 408 in total) and 16 herder populations (N = 567 in total). These data come from the studies by Chaix et al. (2007) and Heyer et al. (2009) for 25 populations (supplementary table S1, Supplementary Material online). The other populations (KIB, TAB, and TKY) were sequenced for this study. As in Chaix et al. (2007) and Heyer et al. (2009), DNA was extracted from blood samples using standard protocols, and the sequence quality was ensured as follows: each base pair was determined once with a forward and once with a reverse primer; any ambiguous base call was checked by additional and independent PCR and sequencing reactions; all sequences were examined by two independent investigators. All sampled individuals were healthy donors from whom informed consent was obtained. The study was approved by appropriate Ethics Committees and scientific organizations in all countries where samples have been collected.

Demographic Inferences from Sequences Analysis

Summary Statistics and Neutrality Tests

We computed classical summary statistics (number of polymorphic sites S, number of haplotypes K) and four neutrality tests (Tajima's [1989] D, Fu and Li's [1993] D and F, and Fu's [1997] Fs) on both mitochondrial and autosomal sequences. Although neutrality tests were originally designed to detect selective events, they also give information about demographic processes, especially when applied to neutral markers. Indeed, expansion events lead to more negative values than expected in the absence of selective and demographic processes. Conversely, contraction events lead to more positive values of the neutrality tests. For HVS-I sequences, we computed all summary statistics and neutrality tests and tested their departure from neutrality using the coalescent-based tests provided in DnaSP (Librado and Rozas 2009).
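
To make the sign convention of these statistics concrete, the following Python sketch computes Tajima's (1989) D from a small alignment using the standard formula. It is given for illustration only and is independent of the DnaSP and ARLEQUIN implementations used in this study; the toy alignments are hypothetical.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's (1989) D for aligned sequences of equal length."""
    n = len(seqs)
    # pi: mean number of pairwise differences between sequences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # S: number of polymorphic (segregating) sites
    s = sum(len(set(column)) > 1 for column in zip(*seqs))
    if s == 0:
        return None  # D is undefined without polymorphism
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Excess of singletons, as expected after an expansion: D is negative.
star_like = ["0000", "1000", "0100", "0010", "0001"]
# Intermediate-frequency variants, as expected after a contraction: D is positive.
balanced = ["00", "00", "11", "11"]
print(tajimas_d(star_like), tajimas_d(balanced))
```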

For the autosomal sequences, we used the procedure developed in Laval et al. (2010), which combines all autosomal sequences into a single test. This procedure consists in computing the mean value of each summary statistic across the 20 loci and in testing whether this mean value departs significantly from its expectation under neutrality in a constant-size population model, using a simulation procedure. For a given population with sample size n, we produced 10⁵ simulated samples of the same size n under a constant population size model, using the generation-per-generation coalescent-based algorithm implemented in SIMCOAL v2 (Laval and Excoffier 2004). Each simulated individual was constituted by 20 independent sequences of 1,253 bp (the average sequence length for the real data). Then, we used ARLEQUIN v3 (Excoffier et al. 2005), as modified by Laval et al. (2010), to compute the summary statistics on these simulated samples. We assessed whether the observed statistics differed significantly from the constant population model under neutrality by comparing these statistics with their null distribution obtained from the simulated data. We used gamma distributed mutation rates with a mean value of 2.5 × 10⁻⁸/generation/site (95% confidence interval: 1.476 × 10⁻⁸ to 4.036 × 10⁻⁸), in agreement with previous studies (Pluzhnikov et al. 2002; Voight et al. 2005). This procedure yielded a P value for the significance of the departure from a constant size model. We performed this procedure both on the whole sequences and on the largest non-recombining blocks. For the whole sequences (i.e., including recombination), we performed the data simulation under a coalescent model with recombination, using for each locus the recombination rate provided by the HapMap build GRCh37 genetic map (International HapMap Consortium 2003) (supplementary table S8, Supplementary Material online), whereas for the largest non-recombining blocks we used a coalescent model without recombination.
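
The comparison of an observed mean statistic with its simulated null distribution can be sketched as below. The sketch assumes that the simulated mean values have already been produced by a coalescent simulator; it does not reproduce SIMCOAL or ARLEQUIN. The function name, the toy values, and the add-one correction in the P value are assumptions made for illustration.

```python
from statistics import mean

def empirical_p_value(observed_mean, simulated_means, tail="lower"):
    """One-sided empirical P value of an observed mean statistic.

    tail="lower" tests for values more negative than expected (expansion),
    tail="upper" for values more positive than expected (contraction).
    The add-one correction avoids P values of exactly zero.
    """
    if tail == "lower":
        n_extreme = sum(v <= observed_mean for v in simulated_means)
    elif tail == "upper":
        n_extreme = sum(v >= observed_mean for v in simulated_means)
    else:
        raise ValueError("tail must be 'lower' or 'upper'")
    return (n_extreme + 1) / (len(simulated_means) + 1)

# Hypothetical per-locus Tajima's D values, averaged across loci.
observed = mean([-1.8, -2.1, -1.6])
# Hypothetical means obtained from simulations under a constant-size model.
null_means = [-0.5, 0.1, 0.4, -0.2]
p = empirical_p_value(observed, null_means)
print(p)
```

With 10⁵ simulated samples, as used here, the smallest attainable P value under this convention would be about 10⁻⁵.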

Both for autosomes and for HVS-I sequences, we adjusted the obtained P values for each neutrality test using a false discovery rate (FDR) correction (Benjamini and Hochberg 1995) in R v2.14.1 (R Development Core Team 2011), in order to take into account the increased error probability in the case of multiple testing.
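
For readers unfamiliar with this correction, a minimal Python sketch of the Benjamini and Hochberg (1995) step-up adjustment is given below. It is written in Python only for consistency with the other illustrations in this text and is not the R code used in this study; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.20]  # hypothetical P values, one per population
print(benjamini_hochberg(raw))
```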

MCMC Estimations of Demographic Parameters

We used the MCMC algorithm implemented in BEAST v1.6 (Drummond and Rambaut 2007). We tested the four demographic models implemented in this software: constant effective population size (N0) (constant model); population expansion with an increasing growth rate (g) (exponential model); population expansion with a decreasing growth rate (g) (logistic model); and the expansion model, in which N0 is the present day population size, N1 the population size that the model asymptotes to going into the distant past, and g the exponential growth rate that determines how fast the transition is from near the N1 population size to the N0 population size. In fact, BEAST estimates composite parameters for each model, namely N0μ and g/μ, where N0 is the current effective population size, g the growth rate, and μ the mutation rate. In addition, for the expansion model, the ratio between the current (N0) and ancestral (N1) effective population size is also estimated. To infer N0 and g from these composite parameters, we needed to assume a value for the mutation rate. However, there is no consensus for mutation rates in humans in the literature, as different methods lead to different estimations. For autosomes, the most commonly used value is the phylogenetic rate of μ = 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002). However, recent studies based on the 1000 Genomes Project (1000 Genomes Project Consortium 2012) have found a 2-fold lower rate (μ = 1.2 × 10⁻⁸/generation/site) by directly comparing genome-wide sequences from children and their parents (Conrad et al. 2011; Scally and Durbin 2012). We used here both mutation rates. Similarly, for HVS-I, estimated mutation rates are highly dependent on methodologies and modes of calibration (Endicott et al. 2009). We used both the lower and the higher estimated mutation rates: the transitional changes mutation rate of μ = 5 × 10⁻⁶/generation/site (Forster et al. 1996) and the pedigree-based rate of μ = 10⁻⁵/generation/site (Howell et al. 1996; Heyer et al. 2001). We used a general time-reversible substitution model (Rodriguez et al. 1990). We assumed a generation time of 25 years, permitting the comparison with previous human population genetics studies (e.g., Chaix et al. 2008; Patin et al. 2009; Laval et al. 2010). As BEAST cannot handle recombination events, we used the largest non-recombining block within each sequence (discussed earlier).

We performed three runs of 10⁷ steps per population and per demographic model for the HVS-I sequence and three runs of 2 × 10⁸ steps (which corresponded to three runs of 10⁷ steps per locus) for the autosomal sequences. We recorded one tree every 1,000 steps, which thus implied a total of 10⁵ trees per locus and per run. We then removed the first 10% of steps of each run (burn-in period) and combined the runs to obtain acceptable effective sample sizes (ESSs of 100 or above). The convergence of these runs was assessed using two methods: visual inspection of traces using Tracer v1.5 (Rambaut and Drummond 2007) to check for concordance between runs, and computation of Gelman and Rubin's (1992) convergence diagnostic using R v2.14.1 (R Development Core Team 2011) with the function gelman.diag available in the package coda (Plummer et al. 2006).
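
The principle of this diagnostic can be sketched in a few lines of Python. The version below is the basic potential scale reduction factor comparing between-run and within-run variances after burn-in removal; it omits the degrees-of-freedom correction that coda's gelman.diag applies, so it illustrates the idea rather than reproducing that function. Function names and toy traces are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def drop_burn_in(chain, fraction=0.10):
    """Discard the first `fraction` of the sampled values of a run."""
    return chain[int(len(chain) * fraction):]

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    within = mean([variance(c) for c in chains])        # W
    between_over_n = variance([mean(c) for c in chains])  # B / n
    pooled = (n - 1) / n * within + between_over_n
    return sqrt(pooled / within)

# Toy traces of one parameter from independent runs.
agreeing = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
disagreeing = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]
print(gelman_rubin(agreeing), gelman_rubin(disagreeing))
```

Values close to 1 indicate that the runs sample the same distribution, whereas values well above 1 indicate a lack of convergence.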

To facilitate a large exploration of the parameter space, for the autosomal sequences we chose uniform priors between 0 and 0.05 for 2N0μ and between −10⁹ and 10⁹ for g/μ. For HVS-I sequences, we chose uniform priors for N0μ between 0 and 10 and for g/μ between −2.5 × 10⁶ and 2.5 × 10⁶, resulting in the same priors on N0 and g as for autosomal sequences if we assumed μ = 10⁻⁵/generation/site (i.e., N0 constrained between 0 and 10⁶ and g constrained between −1 and 1 per year). Conversely, if we assumed μ = 5 × 10⁻⁶/generation/site, it meant that N0 was constrained between 0 and 2 × 10⁶ and g was constrained between −0.5 and 0.5 per year.
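
The correspondence between the prior bounds on the composite parameters and those on N0 and g can be checked with a short Python sketch. It assumes that the composite parameters are the product of N0 and the mutation rate (with an extra factor of 2 for diploid autosomes) and the ratio of g to the mutation rate, which is our reading of the bounds given above; the function and argument names are hypothetical.

```python
def rescale(n0_composite, g_over_mu, mu, generation_years=25.0, copies=1):
    """Convert composite parameters into N0 and a per-year growth rate.

    n0_composite is copies * N0 * mu, with copies=1 for haploid mtDNA and
    copies=2 for diploid autosomes (an assumption of this sketch);
    g_over_mu is the growth rate divided by mu; mu is per generation per site.
    """
    n0 = n0_composite / (copies * mu)
    g_per_generation = g_over_mu * mu
    return n0, g_per_generation / generation_years

# Upper prior bounds for HVS-I under the two mutation rates.
hvs_high = rescale(10, 2.5e6, mu=1e-5)
hvs_low = rescale(10, 2.5e6, mu=5e-6)
# Upper prior bounds for autosomes under the phylogenetic rate.
autosomal = rescale(0.05, 1e9, mu=2.5e-8, copies=2)
print(hvs_high, hvs_low, autosomal)
```

Under these assumptions, the upper bounds translate to N0 of about 10⁶ and g of 1 per year for both marker types when using the higher HVS-I rate, and to N0 of about 2 × 10⁶ and g of 0.5 per year with the lower HVS-I rate.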

For each population and model, we obtained the mode and the 95% HPD of N0 and g, inferred from their posterior distributions (supplementary tables S4 and S7, Supplementary Material online), using the add-on package Locfit (Loader 1999) in R v2.14.1. We selected the best-fitting model among the four tested demographic models by estimating marginal likelihoods using two methods: path sampling and stepping-stone sampling (Baele et al. 2012). The model with the greater marginal likelihood (supplementary table S3, Supplementary Material online) was considered as the best-fitting model.

Extended Bayesian Skyline Plots

EBSPs (Heled and Drummond 2010), also implemented in BEAST, estimate demographic changes occurring continuously through time in a population, using the time intervals between successive coalescent events. This method allows a visualization of the evolution of Ne through time. As above, we combined three runs of 10⁷ steps for mitochondrial sequences and three runs of 2 × 10⁸ steps for autosomal sequences to obtain acceptable ESS values. We assumed the same mutation rates as above and a generation time of 25 years. Outputs were analyzed with Tracer v1.5 to visually check for convergence and ESS, and also to obtain the 95% HPD interval for the number of demographic changes that occurred in the population (supplementary table S5, Supplementary Material online). A constant population size could be rejected when the 95% HPD of the number of change points excluded 0 but included 1 (Heled and Drummond 2010). Then, we used R v2.14.1 to compute Gelman and Rubin's (1992) convergence diagnostic as above, as well as to compute skyline plots. Finally, we used the population growth curves generated from BEAST to assess the time at which populations began to expand. Each skyline plot consisted of smoothed data points at approximately 10–20 generation intervals. We considered that the population increased (or decreased) when both the median and 95% HPD values for Ne increased (or decreased) between more than two successive data points. Although this method did not provide a 95% HPD interval for the inferred expansion timings, this conservative approach ensured that we considered only relevant expansion signals.
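
The rule used to date the onset of an expansion can be sketched as follows. The sketch assumes that the skyline is available as series ordered from the oldest to the most recent point, and it interprets "more than two successive data points" as a run of at least three points along which the median and both HPD bounds all increase; this interpretation, the function name, and the toy values are assumptions made for illustration.

```python
def expansion_onset(times_ybp, median, lower, upper, min_points=3):
    """Return the oldest time at which a sustained increase in Ne starts.

    All series are ordered from the oldest to the most recent point.
    Returns None when no run of `min_points` rising points is found.
    """
    def rising(i):
        return (median[i + 1] > median[i]
                and lower[i + 1] > lower[i]
                and upper[i + 1] > upper[i])

    run_start = None
    for i in range(len(times_ybp) - 1):
        if rising(i):
            if run_start is None:
                run_start = i
            # Points run_start .. i+1 all belong to the rising run.
            if (i + 1) - run_start + 1 >= min_points:
                return times_ybp[run_start]
        else:
            run_start = None
    return None

# Hypothetical skyline: flat until 40,000 YBP, then steadily increasing.
times = [50000, 40000, 30000, 20000, 10000, 0]
med = [10, 10, 11, 12, 13, 14]
low = [5, 5, 6, 7, 8, 9]
high = [20, 20, 21, 22, 23, 24]
print(expansion_onset(times, med, low, high))
```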

Correlation Tests of Inferred Growth Rates and Isolation/Immigration Patterns

To test how differences in degrees of isolation and in migration patterns could have impacted our demographic inferences, we used ARLEQUIN v3.11 (Excoffier et al. 2005) to compute population-specific FST values (Weir and Hill 2002) from HVS-I data for Central Africa, Eurasia, and Central Asia (supplementary table S9, Supplementary Material online) and to estimate immigration rates from mismatch distributions under a spatially explicit model (Excoffier 2004) (supplementary table S10, Supplementary Material online). We then performed Spearman tests using R v2.14.1 to investigate, for each region, how the inferred parametric growth rates were correlated with those FST values and immigration rates. We used a value of 0 for the growth rate when the constant model best fitted the data.
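
As an illustration of this last step, the following Python sketch computes Spearman's rank correlation coefficient between FST values and growth rates, using average ranks for ties. It returns the coefficient only and does not compute the associated P value obtained with R in this study; the function names and the four example populations are hypothetical.

```python
from math import sqrt
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical populations; a growth rate of 0 stands for the constant model.
fst = [0.01, 0.02, 0.05, 0.08]
growth = [0.004, 0.003, 0.001, 0.0]
print(spearman_rho(fst, growth))
```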

Supplementary Material

Supplementary figures S1 and S2 and tables S1–S12 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors would like to warmly thank all volunteer participants. They also thank Michael Fontaine for his help on some data analyses, Phillip Endicott and Julio Bendezu-Sarmiento for insightful discussions, Alexei Drummond for helpful discussions, and Friso Palstra for his help on English usage. They thank Laurent Excoffier, two anonymous reviewers, and the editor for helpful comments and suggestions. All computationally intensive analyses were run on the Linux cluster of the Museum National d'Histoire Naturelle (administrated by Julio Pedraza) and on the web-based portal "Bioportal" (Kumar et al. 2009). This work was supported by Actions Transversales du Museum - Museum National d'Histoire Naturelle (grant "Les relations Societes-Natures dans le long terme") and Agence Nationale de la Recherche (grants "Alterite culturelle" ANR-10-ESVS-0010 and "Demochips" ANR-12-BSV7-0012). C.A. was financed by a PhD grant from the Centre National de la Recherche Scientifique.

References

1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.

Ambrose SH. 2001. Paleolithic technology and human evolution. Science 291:1748–1753.

Arenas M, Francois O, Currat M, Ray N, Excoffier L. 2013. Influence of admixture and Paleolithic range contractions on current European diversity gradients. Mol Biol Evol. 30:57–61.

Atkinson QD, Gray RD, Drummond AJ. 2009. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa. Proc R Soc Lond B Biol Sci. 276:367–373.

Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol. 29:2157–2167.

Balaresque PL, Ballereau SJ, Jobling MA. 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum Mol Genet. 16:R134–R139.

Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. 2011. Insights into the demographic history of African pygmies from complete mitochondrial genomes. Mol Biol Evol. 28:1099–1110.

Beaumont MA. 2004. Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92:365–379.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 57:289–300.

Bermisheva M, Salimova A, Korshunova T, Svyatova G, Berezina G, Villems R, Khusnutdinova E. 2002. Mitochondrial DNA diversity in the populations of Middle Asia and Northern Caucasus. Eur J Hum Genet. 10:179–180.

Bocquet-Appel JP. 2011. When the world's population took off: the springboard of the Neolithic Demographic Transition. Science 333:560–561.

Bocquet-Appel J-P, Bar-Yosef O. 2008. The Neolithic demographic transition and its consequences. Dordrecht (The Netherlands): Springer.


Chaix R Austerlitz F Hegay T Quintana-Murci L Heyer E 2008 Genetictraces of east-to-west human expansion waves in Eurasia Am J PhysAnthropol 136309ndash317

Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, Heyer E. 2007. From social to genetic structures in central Asia. Curr Biol. 17:43–48.

Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. 2009. The Last Glacial Maximum. Science 325:710–714.

Conrad DF, Keebler JEM, DePristo MA, et al. (17 co-authors). 2011. Variation in genome-wide mutation rates within and between human families. Nat Genet. 43:712–714.

Derenko MV, Malyarchuk BA, Dambueva IK, Shaikhaev GO, Dorzhu CM, Nimaev DD, Zakharov IA. 2000. Mitochondrial DNA variation in two South Siberian aboriginal populations: implications for the genetic history of North Asia. Hum Biol. 72:945–973.

Dirksen VG, van Geel B. 2004. Mid to late Holocene climate change and its influence on cultural development in South Central Siberia. Nato Sci S Ss IV Ear. 42:291–307.

Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 7:214.

Endicott P, Ho SYW, Metspalu M, Stringer C. 2009. Evaluating the mitochondrial timescale of human evolution. Trends Ecol Evol. 24:515–521.

Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol. 13:853–864.

Excoffier L, Heckel G. 2006. Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet. 7:745–758.

Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online. 1:47–50.

Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A. 109:E2569–E2576.

Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet. 59:935–945.

Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.

Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.

Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci. 7:457–511.

Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol. 27:570–580.

Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Segurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet. 10:49.

Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol. 21:597–612.

Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet. 63:1113–1126.

Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res. 11:423–434.

Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet. 59:501–509.

Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.

Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med. 116:68–73.

International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.

Kingman JFC. 1982. The coalescent. Stoch Proc Appl. 13:235–248.

Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet. 73:671–676.

Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet. 2:e53.

Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.

Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.

Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.

Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.

Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A. 107:19201–19206.

Loader C. 1999. Local regression and likelihood. New York: Springer.

Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A. 103:9381–9386.

Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.

Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.

Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human populations on a global scale. Mol Biol Evol. 10:927–943.

Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.

Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet. 29:20–21.

Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet. 6:165–183.

Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev. 16:1125–1133.

Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet. 5:e1000448.

Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.

Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.

Plummer M, Best N, Cowles K, Vines K. 2006. CODA: Convergence diagnosis and output analysis for MCMC. R News 6:7–11.

Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.

Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet. 74:827–845.

Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A. 105:1596–1601.

Demographic Patterns between Sedentary and Nomadic Populations. doi:10.1093/molbev/mst156. MBE

R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.

Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer

Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol. 19:2092–2100.

Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol. 20:76–86.

Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet. 67:1251–1276.

Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol. 142:485–501.

Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.

Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet. 13:745–53.

Segurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet. 21:1146–1151.

Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.

Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet. 25:207–217.

Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

Tremblay M, Vezina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet. 66:651–658.

Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol. 21:559–566.

Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol. 19:312–318.

Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture, and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol. 30:918–937.

Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A. 102:18508–18513.

Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet. 36:721–750.

Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.

Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet. 70:635–651.

For the HVS-I sequences from Eurasia (supplementary tables S7 and S8, Supplementary Material online), the parametric BEAST analyses showed that models consistent with an increase in population size (expansion model or exponential model with positive growth rates) best fitted the data for all sedentary populations except Koreans. Conversely, the constant model best fitted the data for all nomadic populations, as well as Koreans. EBSPs showed, however, significant expansion events for both farmers and herders, but not for Koreans (fig. 2C and D and supplementary table S5, Supplementary Material online). Nevertheless, there was a tendency toward stronger expansion rates and higher Ne values in sedentary than in nomadic populations (fig. 2C and D), although the 95% HPD intervals for Ne were quite large for sedentary populations. The estimated expansion onset times inferred from the EBSPs (supplementary table S6, Supplementary Material online) followed an east-to-west gradient: they appeared more ancient in Eastern populations, in both sedentary and nomadic populations (supplementary fig. S1, Supplementary Material online). They also clearly predated the Neolithic transition in all geographic areas (fig. 4).

The Central Asian Exception: Similar Demographic Patterns in Farmers and Herders

For autosomes, the constant model best fitted the data for both sedentary farmers (TAB) and traditionally nomadic herders (KIB) (supplementary tables S3 and S4, Supplementary Material online). EBSPs also showed no significant demographic changes for these populations (fig. 1C and D and supplementary table S5, Supplementary Material online).

For HVS-I (supplementary tables S3 and S7, Supplementary Material online), the exponential model best fitted the data for six of the 12 sedentary farmer populations (including TAB), whereas the constant model was preferred for the other farmers. Unlike the rest of Eurasia, a model indicating expansion (the exponential model with positive growth rates) was also selected for all nomadic herders. Moreover, EBSPs showed significant expansion signals for both herder and farmer populations, except TJY (Yagnobs from Dushanbe), since at least 13,860 YBP (or 27,720 YBP) for farmers and 16,546 YBP (or 33,092 YBP) for herders, on average (fig. 2E and F and supplementary tables S5 and S6, Supplementary Material online). Again, these inferred expansion onsets predated the emergence of farming in the area, about 8,000 YBP (Bocquet-Appel and Bar-Yosef 2008) (fig. 4). Inferred expansions for Central Asian sedentary farmers seemed overall weaker (i.e., lower growth rate and lower Ne) than those observed for other sedentary populations in Eurasia, although we observed important variations in growth rates and Ne among populations and large 95% HPD intervals for some of them (fig. 2C and E).

Degrees of Isolation and Migration Patterns

African farmer populations appeared less isolated and received more migrants than hunter-gatherer populations. Indeed, the population-specific FST values (supplementary table S9, Supplementary Material online) were on average significantly lower for farmers than for hunter-gatherers (mean[farmers] = 0.058, mean[HG] = 0.192, Wilcoxon two-sided test, P value = 0.0002). Moreover, the estimated number of immigrants was significantly higher for sedentary farmers than for nomadic hunter-gatherers (mean[farmers] = 314, mean[HG] = 221, P value = 0.0001) (supplementary table S10, Supplementary Material online). For hunter-gatherers, the FST values were negatively correlated with the negative growth rates that we inferred from the parametric method (correlation coefficient = -0.893, P value = 0.012) (fig. 5B), meaning that less isolated populations showed weaker contraction events (i.e., less negative growth rates). Conversely, there was no significant correlation between FST values and inferred growth rates for sedentary farmers (correlation coefficient = 0.433, P value = 0.249) (fig. 5A). However, we found a significant positive correlation between the number of immigrants and the inferred growth rates (supplementary fig. S2, Supplementary Material online) among sedentary farmer populations (correlation coefficient = 0.867, P value = 0.004) but not among nomadic hunter-gatherers (correlation coefficient = 0.536, P value = 0.235).
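The group comparison reported above relies on a two-sided Wilcoxon test. As a rough illustration of how such a test works, the following minimal sketch computes an exact rank-sum P value by enumerating all possible rank assignments. It assumes small samples without tied values; the function name and the FST values are hypothetical and are not taken from supplementary table S9.

```python
from itertools import combinations


def rank_sum_exact(x, y):
    """Exact two-sided Wilcoxon rank-sum test (assumes no tied values).

    Returns the rank sum of sample x and the exact two-sided P value.
    """
    pooled = sorted(x + y)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # only valid without ties
    n1, n = len(x), len(pooled)
    w_obs = sum(rank[v] for v in x)
    mean_w = n1 * (n + 1) / 2            # expected rank sum under the null
    dev_obs = abs(w_obs - mean_w)
    extreme = total = 0
    # Under the null, every subset of n1 ranks is equally likely for sample x.
    for combo in combinations(range(1, n + 1), n1):
        total += 1
        if abs(sum(combo) - mean_w) >= dev_obs - 1e-12:
            extreme += 1
    return w_obs, extreme / total


# Hypothetical population-specific FST values, for illustration only.
farmers_fst = [0.03, 0.05, 0.06, 0.07, 0.08]
hg_fst = [0.12, 0.15, 0.19, 0.22, 0.28]
w, p = rank_sum_exact(farmers_fst, hg_fst)
print(w, p)
```

With complete separation between two groups of five, only 2 of the 252 possible rank assignments are as extreme as the observed one, so the smallest attainable two-sided P value is 2/252. Full enumeration is only practical for small samples; larger data sets would normally be handled by a statistics package.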

For Eurasia, we found no significant difference in FST values between farmers and herders (mean[farmers] = 0.039, mean[herders] = 0.043, P value = 0.77; supplementary table S9, Supplementary Material online), except in Central Asia, for which we found significantly lower FST values for nomadic herders than for sedentary farmers (mean[farmers] = 0.018, mean[herders] = 0.008, P value = 0.017). We report a significant

FIG. 3. Comparison of estimated times for expansion onsets using autosomes and dating of the first archeological traces of farming in Africa and China. Time is represented backward (in YBP). Only populations for which the EBSP analysis showed a significant expansion event are represented. We reported the time values estimated with the highest mutation rate that we used for the autosomes (μ = 2.5 × 10^-8/generation/site). Thus, these time values can be considered as a lower bound for the expansion onsets. The dates for the emergence of farming come from the review by Bocquet-Appel and Bar-Yosef (2008). They are based on archeological remains.

FIG. 4. Comparison of estimated times for expansion onsets using HVS-I and dating of the first archeological traces of farming or herding in Central Africa (A), Eurasia (B), and Central Asia (C). Time is represented backward (in YBP). Only populations for which the EBSP analysis showed a significant expansion event are represented. We reported the time values estimated with the highest mutation rate that we used for the HVS-I sequences (μ = 10^-5/generation/site). Thus, these time values can be considered as a lower bound for the expansion onsets. The dates for the emergence of farming come from the review by Bocquet-Appel and Bar-Yosef (2008). They are based on archeological remains.

FIG. 5. Correlations between population-specific FST values and inferred growth rates in African farmer (A) and hunter-gatherer (B) populations, Eurasian farmer populations (C), and Central Asian farmer (D) and herder populations (E). Population-specific FST values were computed with ARLEQUIN v3.11 (Excoffier et al. 2005). The growth rates were inferred under the best-fitting model from the parametric method using BEAST (Drummond and Rambaut 2007). When the best-fitting model was the constant model, we assumed a growth rate of 0. Note that we did not represent Eurasian herder populations, as the constant model best fitted the data for all of them. Plots and correlation tests were performed using R v2.14.1 (R Development Core Team 2011).

negative correlation between FST values and inferred growth rates for sedentary farmers in Eurasia (correlation coefficient = -0.673, P value = 0.028) (fig. 5C) and Central Asia (correlation coefficient = -0.773, P value = 0.003) (fig. 5D), meaning that less isolated populations showed higher inferred growth rates. There was no significant correlation for Central Asian herders (correlation coefficient = -0.092, P value = 0.736) (fig. 5E). Note that this analysis could not be performed for the other Eurasian herder populations, as the constant model best fitted the data with the parametric method.

The estimation of the proportion of immigrants did not converge for 11 Eurasian populations (Han Chinese, Liaoning, Qingdao, Palestinians, Pathans, Mongols, as well as three Central Asian farmer populations and two Central Asian herder populations). Regarding the other populations, we found no significant difference in the proportion of immigrants between farmers and herders, both in Central Asia (mean[farmers] = 82868, mean[herders] = 260265, P value = 0.12) and in the rest of Eurasia (mean[farmers] = 576757, mean[herders] = 201311, P value = 0.51) (supplementary table S10, Supplementary Material online). We also found no significant correlation between this proportion and the inferred growth rates for Eurasian farmers (correlation coefficient = -0.238, P value = 0.48), Central Asian farmers (correlation coefficient = 0.386, P value = 0.30), and Central Asian herders (correlation coefficient = -0.42, P value = 0.139) (supplementary fig. S2, Supplementary Material online).
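The correlation tests between isolation and growth rate can be illustrated with a short sketch. The text here does not state which coefficient was used, so the example assumes a rank (Spearman) correlation, which handles the ties that arise when several populations are assigned a growth rate of 0 under the constant model. All function names and data values below are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values, for illustration only (not data from the study).
fst = [0.01, 0.02, 0.03, 0.05, 0.08]
growth = [0.9, 0.0, 0.4, 0.0, 0.1]  # 0.0 stands for a constant-model population
rho = spearman(fst, growth)
print(round(rho, 3))
```

A negative coefficient in such an analysis would indicate that more isolated populations (higher FST) tend to show lower inferred growth rates. Assessing significance would require an additional step, such as a permutation test, which is not shown here.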

Discussion

In this study, using a large set of populations from distant geographic areas, we report contrasted demographic histories that correlate with lifestyle. Moreover, the inferred expansion signals in both African and Eurasian farmer and herder populations predated the Neolithic transition and the sedentarization of these populations.

Contrasted Demographic Histories in Sedentary and Nomadic Populations

For Africa, both mtDNA and autosomal data revealed expansion patterns in most sedentary farmer populations, as indicated by neutrality tests and the parametric and nonparametric BEAST methods. Conversely, we found constant effective population sizes (or possibly contraction events) for all hunter-gatherer populations. Among the farmers, results were least clear for the Yoruba and the Ewondo populations, as no neutrality test was significant for these populations, whereas they showed evidence of expansion events when analyzed with BEAST. This indicates that these populations may have undergone weaker expansion dynamics (i.e., lower growth rates and Ne) than the others. These results are of particular importance for the Yoruba, as it is a reference population in many databases (HapMap, 1000 Genomes). This also demonstrates the higher sensitivity of MCMC methods such as BEAST, compared with neutrality tests, for detecting expansions.

The contrasted patterns inferred between sedentary and nomadic populations in Africa suggest strong differences between the demographic histories of these two groups of populations. The question is whether this pattern results mostly from differences in local expansion dynamics or whether spatial expansion processes at a larger scale were also involved. As shown by Ray et al. (2003), negative values for the neutrality tests will be observed in a spatial expansion process if the rate of migrants (Nm) is high enough (at least 20), but not otherwise. As in previous studies (e.g., Verdu et al. 2013), we report a higher degree of isolation (higher population-specific FST values) in hunter-gatherer populations than in farmer populations. Using the spatial expansion model of Excoffier (2004) also leads to higher estimates of the number of immigrants into farmer populations. Thus, both farmers and hunter-gatherers may have been subject to a spatial expansion process, but the limited number of migrants among hunter-gatherers may have resulted in an absence of expansion signals for them. This would be consistent with the positive correlation that we observe between the growth rates estimated with BEAST and the inferred number of immigrants in the sedentary farmer populations. However, this spatial expansion process seems unlikely to completely explain the strong association that we observed between lifestyle and expansion patterns, as some farmer populations (Teke, Gabonese Fang) displayed FST values similar to those of hunter-gatherers but a clear signal of expansion with relatively high growth rates. This suggests that even rather isolated farmer populations show substantial levels of expansion. Moreover, FST values and inferred growth rates in farmer populations were not significantly correlated. Therefore, our results suggest that the expansion patterns observed in sedentary populations do not result only from a spatial expansion pattern. In addition, local dynamics connected with the higher capacity of food production by farmers also explain their much stronger expansion signatures relative to their neighboring hunter-gatherer populations.

For Eurasia, when considering the mtDNA data, all three methods (neutrality tests, parametric BEAST analyses, and EBSPs) yielded expansion signals for all sedentary farmer populations except Koreans. Conversely, only EBSPs and Fu's Fs test showed expansion signals for nomadic herders, whereas neither the parametric BEAST method nor the other neutrality tests did. This result points toward weaker expansion dynamics in herders than in farmers, as supported also by the tendency for lower growth rates and Ne in herder populations than in farmer populations on the EBSP graphs (fig. 2C and D). It thus seems that the flexibility and nonparametric nature of EBSP analyses allow one to detect weaker expansion events than the parametric method. Moreover, Fu's Fs is known to be more sensitive than the other neutrality tests to detect expansions (Ramos-Onsins and Rozas 2002). Again, these inferred expansions may result at least in part from spatial expansion processes. The population-specific FST values are indeed rather low in Eurasia. Moreover, we found a significant negative correlation between FST values and inferred growth rates for the sedentary farmers, indicating that less isolated populations showed stronger expansion signals. However, although we inferred much stronger expansion patterns for the farmers than for the herders, we did not observe any differences in Eurasia between the farmers and the herders in the population-specific FST values or in the estimated number of immigrants, suggesting that spatial processes alone cannot explain the strong difference that we observed between the expansion patterns of these two groups of populations. This indicates that the intrinsic demographic growth patterns are different between these two kinds of populations, the farmers showing much higher growth rates than the herders.

To our knowledge, although other studies have found different patterns between hunter-gatherers and farmers (e.g., Verdu et al. 2009), our study is the first to show differences between farmers and herders, the two major post-Neolithic human groups. A plausible explanation could be that nomadic herders and hunter-gatherers share several of the constraints of a nomadic way of life. For instance, birth intervals are generally longer (at least 4 years) in nomadic populations than in sedentary populations (e.g., Short 1982). According to Bocquet-Appel (2011), these longer birth intervals may be mainly determined by diet differences. Indeed, Valeggia and Ellison (2009) demonstrated that birth interval is mainly determined by the rapidity of postpartum energy recovery, which may be increased by the consumption of high-carbohydrate food (like cereals). Moreover, the nomadic herder way of life may offer less food security than sedentary farming, the latter facilitating efficient long-term food storage.

However, unlike in Africa, we did not find systematically consistent patterns between the autosomal and mtDNA data in Eurasia. The possible contraction events that our results suggest for two sedentary populations (Japanese and Danes) with autosomes appeared concomitant with historical events that could have led to bottleneck processes. For the Japanese population, this contraction signal could indeed result from a founder effect due to the Paleolithic colonization of Japan by a subset of the Northern Asiatic people (especially from Korea; Nei 1995). Similarly, a bottleneck process may also have occurred in the Danish population, linked with the Last Glacial Maximum, occurring between 26,500 and 19,500 YBP (Clark et al. 2009). Reasons why these processes impacted the autosomes but not the mtDNA data remain to be determined, for instance, through simulation studies. In any case, our study clearly emphasizes the utility of combining mtDNA and autosomal sequences, as they allow access to different aspects of human history. A recent study on harbor porpoises has similarly shown that nuclear markers were sensitive to a recent contraction event, whereas mtDNA allowed inferring a more ancient expansion (Fontaine et al. 2012).

Interestingly, Central Asia displayed a distinct pattern from the rest of Eurasia. Indeed, we did not infer higher expansion rates for sedentary farmers than for nomadic herders in that area. This could result from harsh local environmental conditions due to the arid continental climate in this area. Indeed, using pollen records, Dirksen and van Geel (2004) showed that the paleoclimate in Central Asia was very arid from at least 12,000 to 3,000 YBP, which could have limited the amount of suitable areas for farming and impacted human demography. Spatial expansion processes may also have played a role in this difference, as population-specific FST values were higher for the farmers than for the herders. This may indicate that more migrants were involved in the spatial expansion process for the herders than for the farmers, yielding a weaker expansion signal (i.e., lower inferred growth rate) for the latter (Ray et al. 2003). This is supported by the negative correlation between the FST values and the inferred growth rates in the farmer populations. The Korean population also stood out as an exception in Eurasia. Even though it is a population of sedentary farmers, it showed no significant expansion signal with either the parametric or the nonparametric method with HVS-I. This could be explained by a later sedentarization of this population. The Korean Neolithic is notably defined by the introduction of Jeulmun ware ceramics about 8,000 YBP, but the people of the Jeulmun period were still predominantly semi-nomadic fishers and hunter-gatherers until about 3,000 YBP, when Koreans started an intensive crop production implying a sedentary lifestyle (Nelson 1993).

Inferred Expansion Signals Predate the Emergence of Farming

EBSP analyses revealed that the inferred expansion events in farmer and herder populations were more ancient than the emergence of farming and herding. Therefore, the differences in demographic patterns between farmers and herders seem to predate their divergence in lifestyle, which raises the question of the chronology of demographic expansions and the Neolithic transition. These findings appear to be quite robust to the choice of the scaling parameters. We used here both the lower and the higher mutation rate estimates in humans for autosomes (Pluzhnikov et al. 2002; Conrad et al. 2011) and for the HVS-I sequence (Forster et al. 1996; Howell et al. 1996). Despite this uncertainty in mutation rates, which leads to a 2-fold uncertainty in our time estimates, the inferred expansion signals predated the emergence of agriculture in both cases for all populations. Similarly, using a generation time of 29 years (Tremblay and Vezina 2000) instead of 25 years leads to slightly more ancient estimates and thus does not change our conclusions (data not shown). However, note that for HVS-I, using the higher bound of the credibility interval for the highest estimated mutation rate (2.75 × 10^-5/generation/site; Heyer et al. 2001) instead of the mean value (i.e., 10^-5/generation/site) leads to expansion time estimates consistent with the Neolithic transition in Eurasian populations (supplementary table S11, Supplementary Material online). Nevertheless, these estimates still clearly predated the Neolithic for the African populations. However, 10^-5/generation/site is by far the highest estimation of mutation rate in the literature (Howell et al. 1996). To infer Neolithic expansions in most Eurasian populations, one needs to assume a mutation rate of at least 2 × 10^-5/generation/site, which is much higher than other estimates from the literature and is thus probably unrealistic. Moreover, our method for determining the expansion onset time using EBSP graphs is very conservative and also tends to favor the lower bound of expansion onset times. Finally, for autosomes, similarly using 4.74 × 10^-8/generation/site instead of 2.5 × 10^-8/generation/site (Pluzhnikov et al. 2002) leads to an inferred expansion onset time that is not compatible with the Neolithic transition for all Eurasian and African populations, except for one African population, the Yoruba (supplementary table S12, Supplementary Material online). Consequently, it seems very likely that the expansions inferred in this study correspond to Paleolithic rather than Neolithic demographic events, in agreement also with most previous studies, as detailed later.
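The dependence of the dated onsets on the scaling parameters can be made concrete with a small worked example. The sketch below assumes that an onset time estimated in substitutions per site is converted to years by dividing by the per-generation mutation rate and multiplying by the generation time, so that dates scale inversely with the mutation rate and linearly with the generation time. The function names are hypothetical, and the lower HVS-I rate of 5 × 10^-6 is an assumption inferred from the 2-fold uncertainty mentioned in the text.

```python
def onset_in_years(onset_subs_per_site, mu_per_gen, gen_time_years=25.0):
    """Convert an onset time in substitutions/site into years before present."""
    generations = onset_subs_per_site / mu_per_gen
    return generations * gen_time_years


def rescale_onset(onset_years, mu_used, mu_new, gen_used=25.0, gen_new=25.0):
    """Re-express a dated onset under another mutation rate or generation time."""
    return onset_years * (mu_used / mu_new) * (gen_new / gen_used)


def rate_needed(onset_years, mu_used, target_years):
    """Mutation rate that would shift a dated onset to target_years."""
    return mu_used * onset_years / target_years


# Central Asian farmers with HVS-I: 13,860 YBP at the highest rate used (1e-5).
farmers = 13860.0
print(rescale_onset(farmers, 1e-5, 5e-6))                # halved rate: about 27,720 YBP
print(rescale_onset(farmers, 1e-5, 1e-5, gen_new=29.0))  # 29-year generations: about 16,078 YBP
print(rate_needed(farmers, 1e-5, 8000.0))                # about 1.7e-5 to reach 8,000 YBP
```

Under these assumptions, halving the mutation rate doubles the date, which reproduces the paired values reported for Central Asian farmers, and moving that onset to about 8,000 YBP would require a rate well above 10^-5/generation/site.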

In Africa, the emergence of agriculture has been dated between 5,000 and 4,000 YBP in the Western part of Central Africa, and it subsequently rapidly expanded to the rest of sub-Saharan Africa (Phillipson 1993). However, using HVS-I, we showed expansion events in farmer populations since about 30,000 or 60,000 YBP, thus largely predating the emergence of agriculture in the area. Similarly, using autosomes, especially in Eastern African populations, we inferred expansion signals that clearly predated the Neolithic. Notably, we inferred an expansion signal for Mozambicans since at least 80,000 YBP. Several genetic studies have already highlighted that expansion events occurred in African farmers before the Neolithic transition (e.g., Atkinson et al. 2009; Laval et al. 2010; Batini et al. 2011). This finding is also consistent with paleoanthropological data (i.e., radiocarbon dating) suggesting an expansion event in Africa 60,000–80,000 YBP (Mellars 2006a). This Paleolithic demographic expansion could be linked to a rapid environmental change toward a dryer climate (Partridge et al. 1997) and/or to the emergence of new hunting technologies (Mellars 2006a).

According to Mellars (2006a), this period corresponds to a major increase in the complexity of the technological, economic, social, and cognitive behavior of certain African groups. It corresponds in particular to the emergence of projectile technologies (Shea 2009), which was probably part of a broader pattern of ecological diversification of early Homo sapiens populations. These changes could have been decisive for the human spread "Out of Africa" during the same period and could have ultimately also led to the sedentarization of the remaining populations. This inference is consistent with Sauer's (1952) demographic theory, which stated that late Paleolithic demographic expansions could have favored the sedentarization and the emergence of agriculture in some human populations. In the case of Central Africa, the period of 60,000 YBP corresponds to the separation between the ancestors of hunter-gatherers and farmers (Patin et al. 2009; Verdu et al. 2009). Thus, these two groups may have presented contrasting demographic patterns since their divergence. Much later, higher expansion rates and larger population sizes among farmers' ancestors may have induced the emergence of agriculture and sedentarization.

With respect to Eurasia, the expansion profiles inferred with HVS-I for all populations and with autosomes for the Han Chinese population also seem to have begun during the Paleolithic, thus before the Neolithic transition. Some genetic studies already reported pre-Neolithic expansions in Asia and Europe (e.g., Chaix et al. 2008). Notably, using mismatch and intermatch distributions, Chaix et al. (2008) showed an east-to-west Paleolithic expansion wave in Eurasia. We found a similar pattern here, as the inferred expansions of East Asian populations were earlier than those of Central Asian populations, themselves earlier than those of European populations. Moreover, we found this pattern in both sedentary farmer and nomadic herder populations. Thus, the ancestors of currently nomadic herder populations also experienced these Paleolithic expansions. However, Paleolithic expansion signals in nomadic populations seem weaker than in sedentary populations. This is again compatible with the demographic theory of the Neolithic sedentarization (Sauer 1952): some populations may have experienced more intense Paleolithic expansions, which may have led ultimately to their sedentarization.

The inferred Paleolithic expansion signals might result partly from spatial expansions out of some refuge areas after the Last Glacial Maximum (LGM; 26,500–19,500 YBP; Clark et al. 2009), as this time interval matches our inferred dating for expansion onsets in East Asia with HVS-I using the pedigree-based mutation rate, and in Europe and the Middle East using the transitional mutation rate. Some of the earlier date estimates might also be consistent with the out-of-Africa expansion of H. sapiens. However, the radiocarbon-based time estimates of the spread of H. sapiens in Eurasia are generally more ancient than our inferred expansion onset timings. For instance, Mellars (2006b) dated the colonization of the Middle East by H. sapiens at 47,000–49,000 YBP and of Europe at 41,000–42,000 YBP. Pavlov et al. (2001) report traces of modern human occupation nearly 40,000 years old in Siberia. Finally, Liu et al. (2010) described modern human fossils from South China dated to at least 60,000 YBP. Moreover, out-of-Africa or post-LGM expansions would not explain our finding of an east-to-west gradient of expansion onset timing, which rather supports the hypothesis of a demographic expansion diffused from east to west in Eurasia in a demic (i.e., migrations of individuals) or cultural (favored by the diffusion of new technologies) way.

Possible Confounding Factors

Our approach makes the assumption that populations are isolated and panmictic, which is questionable for human populations. However, we analyzed a large set of populations sampled in very distant geographical regions (i.e., Central Africa, East Africa, Europe, Middle East, Central Asia, Pamir, Siberia, and East Asia). The main conclusions of this study rely on consistent patterns between most of these areas, and it seems unlikely that processes such as admixture could have biased the estimates similarly everywhere. Moreover, in Central Africa, several studies have shown that hunter-gatherer populations show signals of admixture, whereas it is not the case for farmer populations (Patin et al. 2009; Verdu et al. 2009, 2013). If this introgression had been strong enough, this may have yielded a spurious expansion signal in the hunter-gatherer populations, which is not what we observed here. In Europe, spatial expansion processes during the Neolithic may have led to admixture with Paleolithic populations. As pointed out by a simulation study (Arenas et al. 2013), this may lead to a predominance of the Paleolithic gene pool. This may be one of the factors explaining why we observed mostly Paleolithic expansions here.

Demographic Patterns between Sedentary and Nomadic Populations. doi:10.1093/molbev/mst156. MBE

Similarly, potential selection occurring on the whole mitochondrial genome (e.g., Pakendorf and Stoneking 2005) seems unlikely to have impacted all the studied populations within each group in the same way (e.g., stronger positive selection on sedentary than on nomadic populations), as we analyzed different nomadic and sedentary populations living near each other in several geographically distant areas.

Regarding the potential effects of recombination on the inferences from autosomal data, we found that neutrality tests gave similar results on the whole sequences, when using a simulation procedure that took the known recombination rate of each sequence into account (table 1), and on the largest non-recombining blocks as inferred with IMgc, without taking recombination into account in the simulation process (supplementary table S12, Supplementary Material online). It thus appears unlikely that the BEAST analyses, which can only handle the largest inferred non-recombining blocks, are biased because of this.

Finally, note that the effective population sizes inferred using BEAST correspond to the Ne of the populations during their recent history, rather than a value of Ne averaged over the history of the population. This explains why, for most populations, we inferred Ne estimates much higher than generally assumed for humans by population geneticists (about 10,000).

Material and Methods

Genetic Markers

Autosomal Sequences

We used data from 20 noncoding, a priori neutral, and unlinked autosomal regions, selected by Patin et al. (2009) to be at least 200 kb away from any known or predicted gene, to not be in linkage disequilibrium (LD) either with each other or with any known or predicted gene, and to have a region of homology with the chimpanzee genome. These regions are on average 1,253 bp long. Using the four-gamete test (Hudson and Kaplan 1985) as implemented in IMgc online (Woerner et al. 2007), we identified recombination events for 6 of these 20 regions. As some methods used in this study cannot handle recombination, we retained for these six sequences the largest non-recombining block inferred by IMgc. Because of this reduction, the 20 regions used were on average 1,228 bp long. To identify potential bias related to this method (e.g., some recombination events may not be detected using the four-gamete test; larger blocks of non-recombining sequence may select for gene trees that are shorter than expected), we computed the summary statistics and performed neutrality tests (discussed later) both on the whole sequences (table 1) and on the largest non-recombining blocks (supplementary table S12, Supplementary Material online).
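As an illustration of the four-gamete test mentioned above, the following minimal sketch flags pairs of biallelic sites at which all four two-site haplotypes are observed, which under an infinite-sites assumption implies at least one recombination event between them. The function names and the toy haplotypes are hypothetical, and this is not the IMgc implementation, which additionally selects the block to retain.

```python
from itertools import combinations

def four_gamete_violation(haplotypes, i, j):
    """True if sites i and j carry all four two-site gametes (biallelic sites assumed)."""
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) == 4

def incompatible_pairs(haplotypes):
    """List every pair of sites that fails the four-gamete test."""
    n_sites = len(haplotypes[0])
    return [(i, j) for i, j in combinations(range(n_sites), 2)
            if four_gamete_violation(haplotypes, i, j)]

# Toy data (hypothetical): 0 = ancestral allele, 1 = derived allele
toy_haplotypes = ["000", "010", "100", "110", "111"]
print(incompatible_pairs(toy_haplotypes))
```

A block of consecutive sites containing no incompatible pair would be a candidate non-recombining block under this criterion.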

Mitochondrial Sequences

We used the first hypervariable segment of the mitochondrial control region (HVS-I), sequenced between positions 16067 and 16383, excluding the hypervariable poly-C region (sites 16179–16195). The total length of the sequence was thus 300 bp.

Population Panel

For Africa, we used the autosomal sequences data set of Patin et al. (2009), which consists of five farmer populations (N = 118 individuals) and five Pygmy hunter-gatherer populations (N = 95). In addition, we used the HVS-I data set from Quintana-Murci et al. (2008), which consists of nine Central African farmer populations (N = 486) and seven Central African hunter-gatherer populations (N = 318) (supplementary table S1, Supplementary Material online).

For Eurasia, we used the autosomal sequences data set of Laval et al. (2010), consisting of 48 individuals from two East-Asian populations (Han Chinese and Japanese) and 47 individuals from two European populations (Chuvash and Danes). We also used the data from 48 individuals from one sedentary Central Asian population (Tajik farmers) and 48 individuals from one nomadic Central Asian population (Kyrgyz herders) of Ségurel et al. (2013). For HVS-I, we analyzed data from 17 Eurasian populations (N = 494 in total) located from Eastern to Western Eurasia, belonging to several published data sets (Derenko et al. 2000; Richards et al. 2000; Bermisheva et al. 2002; Imaizumi et al. 2002; Yao et al. 2002; Kong et al. 2003; Quintana-Murci et al. 2004; supplementary table S1, Supplementary Material online).

For our detailed study of Central Asia, we used HVS-I sequences from 12 farmer populations (N = 408 in total) and 16 herder populations (N = 567 in total). These data come from the studies by Chaix et al. (2007) and Heyer et al. (2009) for 25 populations (supplementary table S1, Supplementary Material online). The other populations (KIB, TAB, and TKY) were sequenced for this study. As in Chaix et al. (2007) and Heyer et al. (2009), DNA was extracted from blood samples using standard protocols, and the sequence quality was ensured as follows: each base pair was determined once with a forward and once with a reverse primer; any ambiguous base call was checked by additional and independent PCR and sequencing reactions; all sequences were examined by two independent investigators. All sampled individuals were healthy donors from whom informed consent was obtained. The study was approved by appropriate Ethic Committees and scientific organizations in all countries where samples have been collected.

Demographic Inferences from Sequences Analysis

Summary Statistics and Neutrality Tests

We computed classical summary statistics (number of polymorphic sites S, number of haplotypes K) and four neutrality tests (Tajima's [1989] D, Fu and Li's [1993] D and F, and Fu's [1997] Fs) on both mitochondrial and autosomal sequences. Although neutrality tests were originally designed to detect selective events, they also give information about demographic processes, especially when applied to neutral markers. Indeed, expansion events lead to more negative values than expected in the absence of selective and demographic processes. Conversely, contraction events lead to more positive values of the neutrality tests. For HVS-I sequences, we computed all summary statistics and neutrality tests and tested their departure from neutrality using the coalescent-based tests provided in DnaSP (Librado and Rozas 2009).
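To make the sign convention of these tests concrete, the sketch below computes Tajima's (1989) D from a set of aligned sequences using the standard published formula. It is only an illustration: the function name and the toy alignments are hypothetical, and the analyses reported here relied on DnaSP and ARLEQUIN rather than on this code.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(sequences):
    """Tajima's D for equal-length aligned sequences (at least 4 sequences)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("Tajima's D needs at least 4 sequences")
    length = len(sequences[0])
    # S: number of segregating (polymorphic) sites
    seg_sites = sum(1 for pos in range(length)
                    if len({s[pos] for s in sequences}) > 1)
    if seg_sites == 0:
        return None  # D is undefined without polymorphism
    # pi: mean number of pairwise differences
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)

# Hypothetical toy alignments: an excess of singletons, as expected after an
# expansion, versus variants at intermediate frequencies
singleton_rich = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
intermediate = ["AAAA", "AAAT", "AATT", "ATTT"]
print(tajimas_d(singleton_rich), tajimas_d(intermediate))
```

With these toy data, the singleton-rich sample gives a negative D and the intermediate-frequency sample a positive D, in line with the interpretation given above.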

For the autosomal sequences, we used the procedure developed in Laval et al. (2010), which combines all autosomal sequences into a single test. This procedure consists in computing the mean value of each summary statistic across the 20 loci and in testing whether this mean value departs significantly from its expectation under neutrality in a constant-size population model, using a simulation procedure. For a given population with sample size n, we produced 10⁵ simulated samples of the same size n under a constant population size model, using the generation-per-generation coalescent-based algorithm implemented in SIMCOAL v2 (Laval and Excoffier 2004). Each simulated individual was constituted by 20 independent sequences of 1,253 bp (the average sequence length for the real data). Then, we used ARLEQUIN v3 (Excoffier et al. 2005), as modified by Laval et al. (2010), to compute the summary statistics on these simulated samples. We assessed whether the observed statistics differed significantly from the constant population model under neutrality by comparing these statistics with their null distribution obtained from the simulated data. We used gamma-distributed mutation rates with a mean value of 2.5 × 10⁻⁸/generation/site (95% confidence interval: 1.476 × 10⁻⁸ to 4.036 × 10⁻⁸), in agreement with previous studies (Pluzhnikov et al. 2002; Voight et al. 2005). This procedure yielded a P value for the significance of the departure from a constant size model. We performed this procedure both on the whole sequences and on the largest non-recombining blocks. For the whole sequences (i.e., including recombination), we performed the data simulation under a coalescent model with recombination, using for each locus the recombination rate provided by the HapMap build GRCh37 genetic map (International HapMap Consortium 2003) (supplementary table S8, Supplementary Material online), whereas for the largest non-recombining blocks, we used a coalescent model without recombination.
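The logic of this simulation-based test can be sketched as follows: the observed statistic is averaged over loci and compared with the distribution of the same average in samples simulated under the null model. The sketch assumes that the null values have already been produced by a coalescent simulator; the function names are hypothetical, and the "+1" correction in the P value is a common convention that we assume here, not a detail taken from the procedure described above.

```python
def mean_statistic(per_locus_values):
    """Average a summary statistic across loci."""
    return sum(per_locus_values) / len(per_locus_values)

def empirical_p_value(observed, null_values, tail="lower"):
    """One-sided empirical P value of `observed` against simulated null values.

    tail="lower" tests for values more negative than expected (expansion-like);
    tail="upper" tests for values more positive than expected (contraction-like).
    The (k + 1) / (n + 1) form is an assumed convention that avoids P = 0.
    """
    if tail == "lower":
        extreme = sum(1 for v in null_values if v <= observed)
    elif tail == "upper":
        extreme = sum(1 for v in null_values if v >= observed)
    else:
        raise ValueError("tail must be 'lower' or 'upper'")
    return (extreme + 1) / (len(null_values) + 1)

# Hypothetical numbers: mean Tajima's D over three loci, and a tiny null sample
observed_mean = mean_statistic([-1.2, -1.8, -1.5])
null_means = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(empirical_p_value(observed_mean, null_means, tail="lower"))
```

In practice the null sample would contain one mean value per simulated data set, that is, 10⁵ values for the design described above.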

Both for autosomes and for HVS-I sequences, we adjusted the obtained P values for each neutrality test using a false discovery rate (FDR) correction (Benjamini and Hochberg 1995) in R v2.14.1 (R Development Core Team 2011), in order to take into account the increased error probability in the case of multiple testing.
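For readers unfamiliar with this correction, the Benjamini and Hochberg (1995) step-up adjustment can be sketched in a few lines. The function below is a hypothetical stand-alone version written for illustration; the P values in the example are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by increasing P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Invented P values for four populations tested with the same neutrality test
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A test is then declared significant at FDR level α when its adjusted P value is at most α.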

MCMC Estimations of Demographic Parameters

We used the MCMC algorithm implemented in BEAST v1.6 (Drummond and Rambaut 2007). We tested the four demographic models implemented in this software: constant effective population size (N0) (constant model); population expansion with an increasing growth rate (g) (exponential model); population expansion with a decreasing growth rate (g) (logistic model); and the expansion model, in which N0 is the present-day population size, N1 the population size that the model asymptotes to going into the distant past, and g the exponential growth rate that determines how fast the transition is from near the N1 population size to the N0 population size. In fact, BEAST estimates composite parameters for each model, namely N0μ and g/μ, where N0 is the current effective population size, g the growth rate, and μ the mutation rate. In addition, for the expansion model, the ratio between the current (N0) and ancestral (N1) effective population size is also estimated. To infer N0 and g from these composite parameters, we needed to assume a value for the mutation rate. However, there is no consensus for mutation rates in humans in the literature, as different methods lead to different estimations. For autosomes, the most commonly used value is the phylogenetic rate of μ = 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002). However, recent studies based on the 1000 Genomes Project (1000 Genomes Project Consortium 2012) have found a 2-fold lower rate (μ = 1.2 × 10⁻⁸/generation/site) by directly comparing genome-wide sequences from children and their parents (Conrad et al. 2011; Scally and Durbin 2012). We used here both mutation rates. Similarly, for HVS-I, estimated mutation rates are highly dependent on methodologies and modes of calibration (Endicott et al. 2009). We used both the lower and the higher estimated mutation rates: the transitional changes mutation rate of μ = 5 × 10⁻⁶/generation/site (Forster et al. 1996) and the pedigree-based rate of μ = 10⁻⁵/generation/site (Howell et al. 1996; Heyer et al. 2001). We used a general time-reversible substitution model (Rodriguez et al. 1990). We assumed a generation time of 25 years, permitting the comparison with previous human population genetics studies (e.g., Chaix et al. 2008; Patin et al. 2009; Laval et al. 2010). As BEAST cannot handle recombination events, we used the largest nonrecombining block within each sequence (discussed earlier).

We performed three runs of 10⁷ steps per population and per demographic model for the HVS-I sequence, and three runs of 2 × 10⁸ steps (which corresponded to three runs of 10⁷ steps per locus) for the autosomal sequences. We recorded one tree every 1,000 steps, which thus implied a total of 10⁵ trees per locus and per run. We then removed the first 10% of the steps of each run (burn-in period) and combined the runs to obtain acceptable effective sample sizes (ESSs of 100 or above). The convergence of these runs was assessed using two methods: visual inspection of traces using Tracer v1.5 (Rambaut and Drummond 2007) to check for concordance between runs, and computation of Gelman and Rubin's (1992) convergence diagnostic using R v2.14.1 (R Development Core Team 2011) with the function gelman.diag available in the package coda (Plummer et al. 2006).
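The two post-processing steps described above, burn-in removal and the Gelman and Rubin (1992) diagnostic, can be sketched as follows. This is a simplified version of the potential scale reduction factor: it omits the degrees-of-freedom correction that the coda function applies, so its values may differ slightly from those returned by that package. The function names and the toy chains are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def discard_burn_in(chain, fraction=0.1):
    """Drop the first `fraction` of the samples of one MCMC run."""
    return chain[int(len(chain) * fraction):]

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: mean within-chain variance
    between = n * variance(chain_means)            # B: between-chain variance
    pooled = (n - 1) / n * within + between / n    # pooled variance estimate
    return sqrt(pooled / within)

# Hypothetical traces of one parameter from two runs that disagree
run_a = discard_burn_in([5.0, 3.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
run_b = discard_burn_in([9.0, 10.0, 11.0, 10.0, 11.0, 10.0, 11.0, 10.0, 11.0, 10.0])
print(gelman_rubin([run_a, run_b]))
```

Values close to 1 indicate that the runs sample the same distribution, whereas values well above 1, as for the toy chains above, indicate a lack of convergence.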

To facilitate a large exploration of the parameter space for the autosomal sequences, we chose uniform priors between 0 and 0.05 for 2N0μ and between −10⁹ and 10⁹ for g/μ. For HVS-I sequences, we chose uniform priors for N0μ between 0 and 10 and for g/μ between −2.5 × 10⁶ and 2.5 × 10⁶, resulting in the same priors on N0 and g as for autosomal sequences if we assumed μ = 10⁻⁵/generation/site (i.e., N0 constrained between 0 and 10⁶ and g constrained between −1 and 1 per year). Conversely, if we assumed μ = 5 × 10⁻⁶/generation/site, it meant that N0 was constrained between 0 and 2 × 10⁶ and g was constrained between −0.5 and 0.5 per year.
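The correspondence between the prior bounds on the composite parameters and the bounds on N0 and g can be checked with simple arithmetic. The sketch below assumes that the composite parameters are the product N0μ (2N0μ for autosomes) and the ratio g/μ, with g/μ expressed per generation; this reading is our assumption, chosen because it reproduces the bounds quoted in the text with a generation time of 25 years. The function names are hypothetical.

```python
GENERATION_TIME_YEARS = 25  # generation time assumed throughout the study

def n0_from_composite(n0_mu, mu, ploidy_factor=1):
    """Recover N0 from N0*mu (ploidy_factor=1) or 2*N0*mu (ploidy_factor=2)."""
    return n0_mu / (ploidy_factor * mu)

def growth_per_year(g_over_mu, mu, generation_time=GENERATION_TIME_YEARS):
    """Recover the per-year growth rate from g/mu expressed per generation."""
    return g_over_mu * mu / generation_time

# HVS-I, pedigree-based rate: upper prior bounds on N0 and g
print(n0_from_composite(10, 1e-5), growth_per_year(2.5e6, 1e-5))
# HVS-I, transitional rate
print(n0_from_composite(10, 5e-6), growth_per_year(2.5e6, 5e-6))
# Autosomes, phylogenetic rate, prior on 2*N0*mu
print(n0_from_composite(0.05, 2.5e-8, ploidy_factor=2), growth_per_year(1e9, 2.5e-8))
```

Under these assumptions, the three lines give upper bounds of about 10⁶ and 1 per year, 2 × 10⁶ and 0.5 per year, and 10⁶ and 1 per year, respectively.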

For each population and model, we obtained the mode and the 95% HPD of N0 and g, inferred from their posterior distributions (supplementary tables S4 and S7, Supplementary Material online), using the add-on package Locfit (Loader 1999) in R v2.14.1. We selected the best-fitting model among the four tested demographic models by estimating marginal likelihoods using two methods: path sampling and stepping-stone sampling (Baele et al. 2012). The model with the greater marginal likelihood (supplementary table S3, Supplementary Material online) was considered as the best-fitting model.

Extended Bayesian Skyline Plots

EBSPs (Heled and Drummond 2010), also implemented in BEAST, estimate demographic changes occurring continuously through time in a population, using the time intervals between successive coalescent events. This method allows a visualization of the evolution of Ne through time. As above, we combined three runs of 10⁷ steps for mitochondrial sequences and three runs of 2 × 10⁸ steps for autosomal sequences to obtain acceptable ESS values. We assumed the same mutation rates as above and a generation time of 25 years. Outputs were analyzed with Tracer v1.5 to visually check for convergence and ESS, and also to obtain the 95% HPD interval for the number of demographic changes that occurred in the population (supplementary table S5, Supplementary Material online). A constant population size could be rejected when the 95% HPD of the number of change points excluded 0 but included 1 (Heled and Drummond 2010). Then, we used R v2.14.1 to compute Gelman and Rubin's (1992) convergence diagnostic as above, as well as to compute skyline plots. Finally, we used the population growth curves generated from BEAST to assess the time at which populations began to expand. Each skyline plot consisted of smoothed data points at approximately 10–20 generation intervals. We consider that the population increased (or decreased) when both the median and 95% HPD values for Ne increased (or decreased) between more than two successive data points. Although this method did not provide a 95% HPD interval for the inferred expansion timings, this conservative approach ensured that we considered only relevant expansion signals.
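The criterion used to date expansion onsets can be written as a short scan over the skyline data points. The sketch below is one possible reading of that rule, under the assumptions that points are ordered from the oldest to the most recent and that "more than two successive data points" means a run of at least three points over which the median and both HPD bounds all increase. The function name and the toy values are hypothetical.

```python
def expansion_onset(times, median, lower, upper, min_points=3):
    """Return the time of the first point of the earliest run of at least
    `min_points` successive points over which the median and both 95% HPD
    bounds of Ne all increase; return None if no such run exists.
    All lists are ordered from the oldest to the most recent point."""
    run_start = None
    for k in range(len(times) - 1):
        rising = (median[k + 1] > median[k]
                  and lower[k + 1] > lower[k]
                  and upper[k + 1] > upper[k])
        if not rising:
            run_start = None
            continue
        if run_start is None:
            run_start = k
        # Points run_start .. k+1 are all part of the increasing run
        if (k + 1) - run_start + 1 >= min_points:
            return times[run_start]
    return None

# Hypothetical skyline: Ne is flat until 30,000 YBP and then increases
times_ybp = [50000, 40000, 30000, 20000, 10000, 0]
median_ne = [100, 100, 100, 150, 220, 300]
lower_ne = [50, 50, 50, 80, 120, 200]
upper_ne = [200, 200, 200, 260, 350, 500]
print(expansion_onset(times_ybp, median_ne, lower_ne, upper_ne))
```

A contraction could be detected in the same way by reversing the three inequalities.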

Correlation Tests of Inferred Growth Rates and Isolation/Immigration Patterns

To test how differences in isolation degrees and migration patterns could have impacted our demographic inferences, we used ARLEQUIN v3.11 (Excoffier et al. 2005) to compute population-specific FST values (Weir and Hill 2002) from HVS-I data for Central Africa, Eurasia, and Central Asia (supplementary table S9, Supplementary Material online), and to estimate immigration rates from mismatch distributions under a spatially explicit model (Excoffier 2004) (supplementary table S10, Supplementary Material online). We then performed Spearman tests using R v2.14.1 to investigate, for each region, how the inferred parametric growth rates were correlated with those FST values and immigration rates. We used a value of 0 for the growth rate when the constant model best fitted the data.
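As an illustration of this correlation analysis, Spearman's coefficient can be computed as the Pearson correlation of ranks, using average ranks for ties. Ties matter here because every population best fitted by the constant model receives the same growth rate of 0. The sketch below is a hypothetical stand-alone implementation with invented values; it returns the coefficient only, not the associated P value.

```python
from math import sqrt

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation coefficient between x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Invented values: population-specific FST and inferred growth rates, with a
# growth rate of 0 for the two populations best fitted by the constant model
fst_values = [0.01, 0.02, 0.05, 0.08]
growth_rates = [0.004, 0.003, 0.0, 0.0]
print(spearman_rho(fst_values, growth_rates))
```

With these invented values the coefficient is negative, that is, the more isolated populations (higher FST) show the lower growth rates.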

Supplementary Material

Supplementary figures S1 and S2 and tables S1–S12 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors would like to warmly thank all volunteer participants. They also thank Michael Fontaine for his help on some data analyses, Phillip Endicott and Julio Bendezu-Sarmiento for insightful discussions, Alexei Drummond for helpful discussions, and Friso Palstra for his help on English usage. They thank Laurent Excoffier, two anonymous reviewers, and the editor for helpful comments and suggestions. All computationally intensive analyses were run on the Linux cluster of the Muséum National d'Histoire Naturelle (administrated by Julio Pedraza) and on the web-based portal "Bioportal" (Kumar et al. 2009). This work was supported by Actions Transversales du Muséum - Muséum National d'Histoire Naturelle (grant "Les relations Sociétés-Natures dans le long terme") and Agence Nationale de la Recherche (grants "Altérité culturelle," ANR-10-ESVS-0010, and "Demochips," ANR-12-BSV7-0012). C.A. was financed by a PhD grant from the Centre National de la Recherche Scientifique.

References

1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.
Ambrose SH. 2001. Paleolithic technology and human evolution. Science 291:1748–1753.
Arenas M, Francois O, Currat M, Ray N, Excoffier L. 2013. Influence of admixture and Paleolithic range contractions on current European diversity gradients. Mol Biol Evol. 30:57–61.
Atkinson QD, Gray RD, Drummond AJ. 2009. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa. Proc R Soc Lond B Biol Sci. 276:367–373.
Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol. 29:2157–2167.
Balaresque PL, Ballereau SJ, Jobling MA. 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum Mol Genet. 16:R134–R139.
Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. 2011. Insights into the demographic history of African pygmies from complete mitochondrial genomes. Mol Biol Evol. 28:1099–1110.
Beaumont MA. 2004. Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92:365–379.
Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 57:289–300.
Bermisheva M, Salimova A, Korshunova T, Svyatova G, Berezina G, Villems R, Khusnutdinova E. 2002. Mitochondrial DNA diversity in the populations of Middle Asia and Northern Caucasus. Eur J Hum Genet. 10:179–180.
Bocquet-Appel JP. 2011. When the world's population took off: the springboard of the Neolithic Demographic Transition. Science 333:560–561.
Bocquet-Appel J-P, Bar-Yosef O. 2008. The Neolithic demographic transition and its consequences. Dordrecht (The Netherlands): Springer.

Chaix R, Austerlitz F, Hegay T, Quintana-Murci L, Heyer E. 2008. Genetic traces of east-to-west human expansion waves in Eurasia. Am J Phys Anthropol. 136:309–317.
Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, Heyer E. 2007. From social to genetic structures in central Asia. Curr Biol. 17:43–48.
Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. 2009. The Last Glacial Maximum. Science 325:710–714.
Conrad DF, Keebler JEM, DePristo MA, et al. (17 co-authors). 2011. Variation in genome-wide mutation rates within and between human families. Nat Genet. 43:712–714.
Derenko MV, Malyarchuk BA, Dambueva IK, Shaikhaev GO, Dorzhu CM, Nimaev DD, Zakharov IA. 2000. Mitochondrial DNA variation in two South Siberian aboriginal populations: implications for the genetic history of North Asia. Hum Biol. 72:945–973.
Dirksen VG, van Geel B. 2004. Mid to late Holocene climate change and its influence on cultural development in South Central Siberia. Nato Sci S Ss IV Ear. 42:291–307.
Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 7:214.
Endicott P, Ho SYW, Metspalu M, Stringer C. 2009. Evaluating the mitochondrial timescale of human evolution. Trends Ecol Evol. 24:515–521.
Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol. 13:853–864.
Excoffier L, Heckel G. 2006. Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet. 7:745–758.
Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online. 1:47–50.
Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A. 109:E2569–E2576.
Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet. 59:935–945.
Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.
Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.
Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci. 7:457–511.
Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol. 27:570–580.
Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Ségurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet. 10:49.
Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol. 21:597–612.
Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet. 63:1113–1126.
Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res. 11:423–434.
Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet. 59:501–509.
Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.
Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med. 116:68–73.
International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.
Kingman JFC. 1982. The coalescent. Stoch Proc Appl. 13:235–248.
Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet. 73:671–676.
Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet. 2:e53.
Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.
Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.
Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.
Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.
Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A. 107:19201–19206.
Loader C. 1999. Local regression and likelihood. New York: Springer.
Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A. 103:9381–9386.
Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.
Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.
Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human populations on a global scale. Mol Biol Evol. 10:927–943.
Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.
Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet. 29:20–21.
Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet. 6:165–183.
Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev. 16:1125–1133.
Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet. 5:e1000448.
Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.
Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.
Plummer M, Best N, Cowles K, Vines K. 2006. CODA: Convergence diagnosis and output analysis for MCMC. R News 6:7–11.
Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.
Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet. 74:827–845.
Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A. 105:1596–1601.

R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.
Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer/
Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol. 19:2092–2100.
Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol. 20:76–86.
Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet. 67:1251–1276.
Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol. 142:485–501.
Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.
Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet. 13:745–753.
Ségurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet. 21:1146–1151.
Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.
Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet. 25:207–217.
Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.
Tremblay M, Vezina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet. 66:651–658.
Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol. 21:559–566.
Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol. 19:312–318.
Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture, and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol. 30:918–937.
Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A. 102:18508–18513.
Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet. 36:721–750.
Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.
Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet. 70:635–651.

FIG. 4. Comparison of estimated times for expansion onsets using HVS-I and dating of the first archeological traces of farming or herding in Central Africa (A), Eurasia (B), and Central Asia (C). Time is represented backward (in YBP). Only populations for which the EBSP analysis showed a significant expansion event are presented. We reported the time values estimated with the highest mutation rate that we used for the HVS-I sequences (μ = 10⁻⁵/generation/site). Thus, these time values can be considered as a lower bound for the expansion onsets. The dates for the emergence of farming come from the review by Bocquet-Appel and Bar-Yosef (2008). They are based on archeological remains.

FIG. 5. Correlations between population-specific FST values and inferred growth rates in African farmer (A) and hunter-gatherer (B) populations, Eurasian farmer populations (C), and Central Asian farmer (D) and herder populations (E). Population-specific FST values were computed with ARLEQUIN v3.11 (Excoffier et al. 2005). The growth rates were inferred under the best-fitting model from the parametric method using BEAST (Drummond and Rambaut 2007). When the best-fitting model was the constant model, we assumed a growth rate of 0. Note that we did not represent Eurasian herder populations, as the constant model best fitted the data for all of them. Plots and correlation tests were performed using R v2.14.1 (R Development Core Team 2011).

negative correlation between FST values and inferred growth rates for sedentary farmers in Eurasia (ρ = −0.673, P value = 0.028) (fig. 5C) and Central Asia (ρ = −0.773, P value = 0.003) (fig. 5D), thus meaning that less isolated populations showed higher inferred growth rates. There was no significant correlation for Central Asian herders (ρ = 0.092, P value = 0.736) (fig. 5E). Note that this analysis could not be performed for the other Eurasian herder populations, as the constant model best fitted the data with the parametric method.

The estimation of the proportion of immigrants did not converge for 11 Eurasian populations (Han Chinese, Liaoning, Qingdao, Palestinians, Pathans, Mongols, as well as three Central Asian farmer populations and two Central Asian herder populations). Regarding the other populations, we showed no significant difference in the proportion of immigrants between farmers and herders, both in Central Asia (mean[farmers] = 82868, mean[herders] = 260265, P value = 0.12) and in the rest of Eurasia (mean[farmers] = 576757, mean[herders] = 201311, P value = 0.51) (supplementary table S10, Supplementary Material online). We also found no significant correlation between this proportion and the inferred growth rates for Eurasian farmers (ρ = 0.238, P value = 0.48), Central Asian farmers (ρ = 0.386, P value = 0.30), and Central Asian herders (ρ = 0.42, P value = 0.139) (supplementary fig. S2, Supplementary Material online).

Discussion

In this study, using a large set of populations from distant geographic areas, we report contrasted demographic histories that correlate with lifestyle. Moreover, the inferred expansion signals in both African and Eurasian farmer and herder populations predated the Neolithic transition and the sedentarization of these populations.

Contrasted Demographic Histories in Sedentary and Nomadic Populations

For Africa, both mtDNA and autosomal data revealed expansion patterns in most sedentary farmer populations, as indicated by neutrality tests and the parametric and nonparametric BEAST methods. Conversely, we found constant effective population sizes (or possibly contraction events) for all hunter-gatherer populations. Among the farmers, results were least clear for the Yoruba and the Ewondo populations, as no neutrality test was significant for these populations, whereas they showed evidence of expansion events when analyzed with BEAST. This indicates that these populations may have undergone weaker expansion dynamics (i.e., lower growth rates and Ne) than the others. These remarkable results are of particular importance for the Yoruba, as it is a reference population in many databases (HapMap, 1000 Genomes). This also demonstrates the higher sensitivity of MCMC methods such as BEAST to detect expansions in comparison to neutrality tests.

The contrasted patterns inferred between sedentary and nomadic populations in Africa suggest strong differences between the demographic histories of these two groups of populations. The question is whether this pattern results mostly from differences in local expansion dynamics or whether spatial expansion processes at a larger scale were also involved. As shown by Ray et al. (2003), negative values for the neutrality tests will be observed in a spatial expansion process if the rate of migrants (Nm) is high enough (at least 20), but not otherwise. As in previous studies (e.g., Verdu et al. 2013), we report a higher degree of isolation (higher population-specific FST values) in hunter-gatherer populations than in farmer populations. Using the spatial expansion model of Excoffier (2004) also leads to higher estimates of the number of immigrants into farmer populations. Thus, both farmers and hunter-gatherers may have been subject to a spatial expansion process, but the limited number of migrants among hunter-gatherers may have resulted in an absence of expansion signals for them. This would be consistent with the positive correlation that we observe between the growth rates estimated with BEAST and the inferred number of immigrants in the sedentary farmer populations. However, this spatial expansion process seems unlikely to completely explain the strong association that we observed between lifestyle and expansion patterns, as some farmer populations (Teke, Gabonese Fang) displayed FST values similar to those of hunter-gatherers but a clear signal of expansion with relatively high growth rates. This suggests that even rather isolated farmer populations show substantial levels of expansion. Moreover, FST values and inferred growth rates in farmer populations were not significantly correlated. Therefore, our results suggest that the expansion patterns observed in sedentary populations result not only from a spatial expansion pattern. In addition, local dynamics connected with the higher capacity of food production by farmers also explain their much stronger expansion signatures relative to their neighboring hunter-gatherer populations.

For Eurasia, when considering the mtDNA data, all three methods (neutrality tests, parametric BEAST analyses, and EBSPs) yielded expansion signals for all sedentary farmer populations except Koreans. Conversely, only EBSPs and Fu's Fs test showed expansion signals for nomadic herders, but not the parametric BEAST method nor other neutrality tests. This result points toward weaker expansion dynamics in herders than in farmers, as supported also by the tendency for lower growth rates and Ne in herder populations than in farmer populations on the EBSP graphs (fig. 2C and D). It thus seems that the flexibility and nonparametric nature of EBSP analyses allows one to detect weaker expansion events than the parametric method. Moreover, Fu's Fs is known to be more sensitive than the other neutrality tests to detect expansions (Ramos-Onsins and Rozas 2002). Again, these inferred expansions may result at least in part from spatial expansion processes. The population-specific FST values are indeed rather low in Eurasia. Moreover, we found a significant negative correlation between FST values and inferred growth rates for the sedentary farmers, indicating that less isolated populations showed stronger expansion signals. However, although we inferred much stronger expansion patterns for the farmers than for the herders, we did not observe any differences in Eurasia between the farmers and the herders in the population-specific FST values or in the estimated number of immigrants, suggesting that spatial processes alone cannot explain the strong difference that we observed between the expansion patterns of these two groups of populations. This indicates that the intrinsic demographic growth patterns are different between these two kinds of populations, the farmers showing much higher growth rates than the herders.

To our knowledge, although other studies have found different patterns between hunter-gatherers and farmers (e.g., Verdu et al. 2009), our study is the first to show differences between farmers and herders, the two major post-Neolithic human groups. A plausible explanation could be that nomadic herders and hunter-gatherers share several of the constraints of a nomadic way of life. For instance, birth intervals are generally longer (at least 4 years) in nomadic populations than in sedentary populations (e.g., Short 1982). According to Bocquet-Appel (2011), these longer birth intervals may be mainly determined by diet differences. Indeed, Valeggia and Ellison (2009) demonstrated that birth interval is mainly determined by the rapidity of postpartum energy recovery, which may be increased by high carbohydrate food (like cereals) consumption. Moreover, the nomadic herder way of life may offer less food security than sedentary farming, the latter facilitating efficient long-term food storage.

However, unlike in Africa, we did not find systematically consistent patterns between the autosomal and mtDNA data in Eurasia. The possible contraction events that our results suggest for two sedentary populations (Japanese and Danes) with autosomes appeared concomitant with historical events that could have led to bottleneck processes. For the Japanese population, this contraction signal could indeed result from a founder effect due to the Paleolithic colonization of Japan by a subset of the Northern Asiatic people (especially from Korea; Nei 1995). Similarly, a bottleneck process may also have occurred in the Danish population, linked with the last glacial maximum occurring between 26,500 and 19,500 YBP (Clark et al. 2009). Reasons why these processes impacted the autosomes but not the mtDNA data remain to be determined, for instance through simulation studies. In any case, our study clearly emphasizes the utility of combining mtDNA and autosomal sequences, as they allow access to different aspects of human history. A recent study on harbor porpoises has similarly shown that nuclear markers were sensitive to a recent contraction event, whereas mtDNA allowed inferring a more ancient expansion (Fontaine et al. 2012).

Interestingly, Central Asia displayed a distinct pattern from the rest of Eurasia. Indeed, we did not infer higher expansion rates for sedentary farmers than for nomadic herders in that area. This could result from harsh local environmental conditions due to the arid continental climate in this area. Indeed, using pollen records, Dirksen and van Geel (2004) showed that the paleoclimate in Central Asia was very arid from at least 12,000 to 3,000 YBP, which could have limited the amount of suitable areas for farming and impacted human demography. Spatial expansion processes may also have played a role in this difference, as population-specific FST values were higher for the farmers than for the herders. This may indicate that more migrants were involved in the spatial expansion process for the herders than for the farmers, yielding a weaker expansion signal (i.e., lower inferred growth rate) for the latter (Ray et al. 2003). This is supported by the negative correlation between the FST values and the inferred growth rates in the farmer populations. The Korean population also stood out as an exception in Eurasia. Even though it is a population of sedentary farmers, it showed no significant expansion signal with either the parametric or the nonparametric method with HVS-I. This could be explained by a later sedentarization of this population. The Korean Neolithic is notably defined by the introduction of Jeulmun ware ceramics about 8,000 YBP, but the people of the Jeulmun period were still predominantly semi-nomadic fishers and hunter-gatherers until about 3,000 YBP, when Koreans started an intensive crop production implying a sedentary lifestyle (Nelson 1993).

Inferred Expansion Signals Predate the Emergence of Farming

EBSP analyses revealed that the inferred expansion events in farmer and herder populations were more ancient than the emergence of farming and herding. Therefore, the differences in demographic patterns between farmers and herders seem to predate their divergence in lifestyle, which raises the question of the chronology of demographic expansions and the Neolithic transition. These findings appear to be quite robust to the choice of the scaling parameters. We used here both the lower and the higher mutation rate estimates in humans for autosomes (Pluzhnikov et al. 2002; Conrad et al. 2011) and for the HVS-I sequence (Forster et al. 1996; Howell et al. 1996). Despite this uncertainty in mutation rates, which leads to a 2-fold uncertainty in our time estimates, the inferred expansion signals predated the emergence of agriculture in both cases for all populations. Similarly, using a generation time of 29 years (Tremblay and Vezina 2000) instead of 25 years leads to slightly more ancient estimates and thus does not change our conclusions (data not shown). However, note that for HVS-I, using the higher bound of the credibility interval for the highest estimated mutation rate (2.75 × 10⁻⁵/generation/site; Heyer et al. 2001) instead of the mean value (i.e., 10⁻⁵/generation/site) leads to expansion time estimates consistent with the Neolithic transition in Eurasian populations (supplementary table S11, Supplementary Material online). Nevertheless, these estimates still clearly predated the Neolithic for the African populations. However, 10⁻⁵/generation/site is by far the highest estimation of mutation rate in the literature (Howell et al. 1996). To infer Neolithic expansions in most Eurasian populations, one needs to assume a mutation rate of at least 2 × 10⁻⁵/generation/site, which is much higher than other estimates from the literature and thus probably unrealistic. Moreover, our method for determining the expansion onset time using EBSP graphs is very conservative and also tends to favor the lower bound of expansion onset times. Finally, for autosomes, similarly using 4.74 × 10⁻⁸/generation/site instead of 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002) leads to inferred expansion onset times that are not compatible with the Neolithic transition for all Eurasian and African populations except one African population, the Yoruba (supplementary table S12, Supplementary Material online). Consequently, it seems very likely that the expansions inferred in this study correspond to Paleolithic rather than Neolithic demographic events, in agreement also with most previous studies, as detailed later.
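The robustness argument above rests on the fact that coalescent time estimates are obtained in units scaled by the mutation rate, so a date in years is inversely proportional to the assumed mutation rate and proportional to the assumed generation time. The following Python sketch illustrates this rescaling. The function name and the 20,000 YBP onset are hypothetical illustration values, not results from this study, and the sketch assumes a simple proportional rescaling rather than a rerun of the analyses.

```python
def rescale_onset(time_years, mu_used, mu_alt, gen_used=25.0, gen_alt=25.0):
    """Rescale an expansion onset (in years) to another mutation rate and/or
    generation time, assuming time in years = scaled_time / mu * generation_time."""
    return time_years * (mu_used / mu_alt) * (gen_alt / gen_used)


# Hypothetical onset of 20,000 YBP obtained with mu = 1e-5 /generation/site.
onset = 20_000.0
print(rescale_onset(onset, 1e-5, 5e-6))     # halving mu doubles the date
print(rescale_onset(onset, 1e-5, 2.75e-5))  # a higher mu gives a more recent date
print(rescale_onset(onset, 1e-5, 1e-5, gen_used=25, gen_alt=29))  # longer generations
```

Under this proportionality, the 2-fold difference between the lower and higher mutation rates translates directly into a 2-fold difference in dates, and replacing 25 by 29 years per generation shifts dates back by a factor of 29/25.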

In Africa, the emergence of agriculture has been dated between 5,000 and 4,000 YBP in the Western part of Central Africa, and it subsequently rapidly expanded to the rest of sub-Saharan Africa (Phillipson 1993). However, using HVS-I, we showed expansion events in farmer populations since about 30,000 or 60,000 YBP, thus largely predating the emergence of agriculture in the area. Similarly, using autosomes, especially in Eastern African populations, we inferred expansion signals that clearly predated the Neolithic. Notably, we inferred an expansion signal for Mozambicans since at least 80,000 YBP. Several genetic studies have already highlighted that expansion events occurred in African farmers before the Neolithic transition (e.g., Atkinson et al. 2009; Laval et al. 2010; Batini et al. 2011). This finding is also consistent with paleoanthropological data (i.e., radiocarbon dating) suggesting an expansion event in Africa 60,000–80,000 YBP (Mellars 2006a). This Paleolithic demographic expansion could be linked to a rapid environmental change toward a dryer climate (Partridge et al. 1997) and/or to the emergence of new hunting technologies (Mellars 2006a).

According to Mellars (2006a), this period corresponds to a major increase in the complexity of the technological, economic, social, and cognitive behavior of certain African groups. It corresponds in particular to the emergence of projectile technologies (Shea 2009), which was probably part of a broader pattern of ecological diversification of early Homo sapiens populations. These changes could have been decisive for the human spread “Out of Africa” during the same period and could have ultimately also led to the sedentarization of the remaining populations. This inference is consistent with Sauer's (1952) demographic theory, which stated that late Paleolithic demographic expansions could have favored the sedentarization and the emergence of agriculture in some human populations. In the case of Central Africa, the period of 60,000 YBP corresponds to the separation between hunter-gatherers' and farmers' ancestors (Patin et al. 2009; Verdu et al. 2009). Thus, these two groups may have presented contrasting demographic patterns since their divergence. Much later, higher expansion rates and larger population sizes among farmers' ancestors may have induced the emergence of agriculture and sedentarization.

With respect to Eurasia, the expansion profiles inferred with HVS-I for all populations and with autosomes for the Han Chinese population also seem to have begun during the Paleolithic, thus before the Neolithic transition. Some genetic studies already reported pre-Neolithic expansions in Asia and Europe (e.g., Chaix et al. 2008). Notably, using mismatch and intermatch distributions, Chaix et al. (2008) showed an east-to-west Paleolithic expansion wave in Eurasia. We found a similar pattern here, as the inferred expansions of East-Asian populations were earlier than those of Central Asian populations, themselves earlier than those of European populations. Moreover, we found this pattern in both sedentary farmer and nomadic herder populations. Thus, the ancestors of currently nomadic herder populations also experienced these Paleolithic expansions. However, Paleolithic expansion signals in nomadic populations seem lower than in sedentary populations. This is again compatible with the demographic theory of the Neolithic sedentarization (Sauer 1952): some populations may have experienced more intense Paleolithic expansions, which may have led ultimately to their sedentarization.

The inferred Paleolithic expansion signals might result partly from spatial expansions out of some refuge areas after the Last Glacial Maximum (LGM, 26,500–19,500 YBP; Clark et al. 2009), as this time interval matches our inferred dating for expansion onsets in East Asia with HVS-I using the pedigree-based mutation rate, and in Europe and the Middle East using the transitional mutation rate. Some of the earlier date estimates might also be consistent with the out-of-Africa expansion of H. sapiens. However, the radiocarbon-based time estimates of the spread of H. sapiens in Eurasia are generally more ancient than our inferred expansion onset timings. For instance, Mellars (2006b) dated the colonization of the Middle East by H. sapiens at 47,000–49,000 YBP and of Europe at 41,000–42,000 YBP. Pavlov et al. (2001) report traces of modern human occupation nearly 40,000 years old in Siberia. Finally, Liu et al. (2010) described modern human fossils from South China dated to at least 60,000 YBP. Moreover, out-of-Africa or post-LGM expansions would not explain our finding of an east-to-west gradient of expansion onset timing, which rather supports the hypothesis of a demographic expansion diffused from east to west in Eurasia in a demic (i.e., migrations of individuals) or cultural (favored by the diffusion of new technologies) manner.

Possible Confounding Factors

Our approach makes the assumption that populations are isolated and panmictic, which is questionable for human populations. However, we analyzed a large set of populations sampled in very distant geographical regions (i.e., Central Africa, East Africa, Europe, Middle East, Central Asia, Pamir, Siberia, and East Asia). The main conclusions of this study rely on consistent patterns between most of these areas, and it seems unlikely that processes such as admixture could have biased the estimates similarly everywhere. Moreover, in Central Africa, several studies have shown that hunter-gatherer populations show signals of admixture, whereas it is not the case for farmer populations (Patin et al. 2009; Verdu et al. 2009, 2013). If this introgression had been strong enough, this may have yielded a spurious expansion signal in the hunter-gatherer populations, which is not what we observed here. In Europe, spatial expansion processes during the Neolithic may have led to admixture with Paleolithic populations. As pointed out by a simulation study (Arenas et al. 2013), this may lead to a predominance of the Paleolithic gene pool. This may be one of the factors explaining why we observed mostly Paleolithic expansions here.

Similarly, potential selection occurring on the whole mitochondrial genome (e.g., Pakendorf and Stoneking 2005) seems unlikely to have impacted in the same way all the studied populations within each group (e.g., stronger positive selection on sedentary than on nomadic populations), as we analyzed different nomadic and sedentary populations living near each other in several geographically distant areas.

Regarding the potential effects of recombination on the inferences from autosomal data, we found that neutrality tests gave similar results on the whole sequences, when using a simulation procedure that was taking the known recombination rate of each sequence into account (table 1), and on the largest non-recombining blocks as inferred with IMgc, without taking recombination into account in the simulation process (supplementary table S12, Supplementary Material online). It thus appears unlikely that the BEAST analyses, which can only handle the largest inferred non-recombining blocks, are biased because of this.

Finally, note that the effective population sizes inferred using BEAST correspond to the Ne of the populations during their recent history, rather than a value of Ne averaged over the history of the population. This explains the finding that, for most populations, we inferred Ne estimates much higher than generally assumed for humans by population geneticists (about 10,000).

Material and Methods

Genetic Markers

Autosomal Sequences

We used data from 20 noncoding, a priori neutral, and unlinked autosomal regions selected by Patin et al. (2009) to be at least 200 kb away from any known or predicted gene, to be in linkage disequilibrium (LD) neither with each other nor with any known or predicted gene, and to have a region of homology with the chimpanzee genome. These regions are on average 1,253 bp long. Using the four-gamete test (Hudson and Kaplan 1985) as implemented in IMgc online (Woerner et al. 2007), we identified recombination events for 6 of these 20 regions. As some methods used in this study cannot handle recombination, we retained for these six sequences the largest non-recombining block inferred by IMgc. Because of this reduction, the 20 regions used were on average 1,228 bp long. To identify potential bias related to this method (e.g., some recombination events may not be detected using the four-gamete test; larger blocks of non-recombining sequence may select for gene trees that are shorter than expected), we computed the summary statistics and performed neutrality tests (discussed later) both on the whole sequences (table 1) and on the largest non-recombining blocks (supplementary table S12, Supplementary Material online).
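For readers unfamiliar with the four-gamete test, the following Python sketch shows the underlying idea: under an infinite-sites model without recombination, two biallelic sites can display at most three of the four possible two-site haplotypes, so observing all four signals a recombination event between them. This is a simplified illustration only. It is not the IMgc algorithm (which also considers removing sequences), and the function names and the 0/1 haplotype encoding are assumptions made for the example.

```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return pairs of site indices (i, j) at which all four gametes occur.

    `haplotypes` is a list of equal-length strings over {'0', '1'}.
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


def largest_compatible_block(haplotypes):
    """Longest run of contiguous sites (start, end inclusive) with no violation."""
    n_sites = len(haplotypes[0]) if haplotypes else 0
    bad = set(four_gamete_violations(haplotypes))
    best = (0, -1)  # empty block by default
    for start in range(n_sites):
        end = start
        # Extend the block while the next site is compatible with all sites in it.
        while end + 1 < n_sites and all(
            (i, end + 1) not in bad for i in range(start, end + 1)
        ):
            end += 1
        if end - start > best[1] - best[0]:
            best = (start, end)
    return best
```

For example, the haplotypes `["00", "01", "11"]` carry only three gametes and pass the test, whereas adding `"10"` produces a violation between the two sites.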

Mitochondrial Sequences

We used the first hypervariable segment of the mitochondrial control region (HVS-I), sequenced between positions 16067 and 16383, excluding the hypervariable poly-C region (sites 16179–16195). The total length of the sequence was thus 300 bp.
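The 300 bp figure follows from the stated coordinates when both intervals are counted inclusively, as this short Python check illustrates (the variable names are introduced here for illustration only):

```python
# Inclusive coordinate ranges taken from the text.
hvs1_first, hvs1_last = 16067, 16383
polyc_first, polyc_last = 16179, 16195

sequenced = hvs1_last - hvs1_first + 1    # 317 positions sequenced
excluded = polyc_last - polyc_first + 1   # 17 poly-C positions removed
retained = sequenced - excluded           # 300 positions analyzed

print(sequenced, excluded, retained)
```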

Population Panel

For Africa, we used the autosomal sequences data set of Patin et al. (2009), which consists of five farmer populations (N = 118 individuals) and five Pygmy hunter-gatherer populations (N = 95). In addition, we used the HVS-I data set from Quintana-Murci et al. (2008), which consists of nine Central African farmer populations (N = 486) and seven Central African hunter-gatherer populations (N = 318) (supplementary table S1, Supplementary Material online).

For Eurasia, we used the autosomal sequences data set of Laval et al. (2010), consisting of 48 individuals from two East-Asian populations (Han Chinese and Japanese) and 47 individuals from two European populations (Chuvash and Danes). We also used the data from 48 individuals from one sedentary Central Asian population (Tajik farmers) and 48 individuals from one nomadic Central Asian population (Kyrgyz herders) of Ségurel et al. (2013). For HVS-I, we analyzed data from 17 Eurasian populations (N = 494 in total), located from Eastern to Western Eurasia, belonging to several published data sets (Derenko et al. 2000; Richards et al. 2000; Bermisheva et al. 2002; Imaizumi et al. 2002; Yao et al. 2002; Kong et al. 2003; Quintana-Murci et al. 2004; supplementary table S1, Supplementary Material online).

For our detailed study of Central Asia, we used HVS-I sequences from 12 farmer populations (N = 408 in total) and 16 herder populations (N = 567 in total). These data come from the studies by Chaix et al. (2007) and Heyer et al. (2009) for 25 populations (supplementary table S1, Supplementary Material online). The other populations (KIB, TAB, and TKY) were sequenced for this study. As in Chaix et al. (2007) and Heyer et al. (2009), DNA was extracted from blood samples using standard protocols, and the sequence quality was ensured as follows: each base pair was determined once with a forward and once with a reverse primer; any ambiguous base call was checked by additional and independent PCR and sequencing reactions; all sequences were examined by two independent investigators. All sampled individuals were healthy donors from whom informed consent was obtained. The study was approved by appropriate Ethic Committees and scientific organizations in all countries where samples have been collected.

Demographic Inferences from Sequences Analysis

Summary Statistics and Neutrality Tests

We computed classical summary statistics (number of polymorphic sites S, number of haplotypes K) and four neutrality tests (Tajima's [1989] D, Fu and Li's [1993] D and F, and Fu's [1997] Fs) on both mitochondrial and autosomal sequences. Although neutrality tests were originally designed to detect selective events, they also give information about demographic processes, especially when applied to neutral markers. Indeed, expansion events lead to more negative values than expected in the absence of selective and demographic processes. Conversely, contraction events lead to more positive values of the neutrality tests. For HVS-I sequences, we computed all summary statistics and neutrality tests and tested their departure from neutrality using the coalescent-based tests provided in DnaSP (Librado and Rozas 2009).
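As an illustration of why expansions yield negative values, the Python sketch below computes Tajima's D from a set of aligned sequences using the standard formulas of Tajima (1989). It is a didactic sketch, not the DnaSP or ARLEQUIN implementation; the function name is hypothetical, and the toy alignments in the usage note are made up for the example.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences (strings).

    Returns NaN when the statistic is undefined (fewer than four sequences
    or no segregating site).
    """
    n = len(sequences)
    length = len(sequences[0]) if sequences else 0
    seg_sites = sum(
        1 for pos in range(length) if len({s[pos] for s in sequences}) > 1
    )
    if n < 4 or seg_sites == 0:
        return float("nan")

    # Mean number of pairwise differences (pi).
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)
```

A star-like sample dominated by singletons, such as `["TAAA", "ATAA", "AATA", "AAAT", "AAAA"]`, gives a negative D (the pattern expected after an expansion), whereas a sample with two deeply divergent haplotypes at intermediate frequency, such as `["AA", "AA", "TT", "TT"]`, gives a positive D.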

For the autosomal sequences, we used the procedure developed in Laval et al. (2010), which combines all autosomal sequences into a single test. This procedure consists in computing the mean value of each summary statistic across the 20 loci and in testing whether this mean value departs significantly from its expectation under neutrality in a constant-size population model, using a simulation procedure. For a given population with sample size n, we produced 10⁵ simulated samples of the same size n under a constant population size model, using the generation-per-generation coalescent-based algorithm implemented in SIMCOAL v2 (Laval and Excoffier 2004). Each simulated individual was constituted by 20 independent sequences of 1,253 bp (the average sequence length for the real data). Then, we used ARLEQUIN v3 (Excoffier et al. 2005), as modified by Laval et al. (2010), to compute the summary statistics on these simulated samples. We assessed whether the observed statistics differed significantly from the constant population model under neutrality by comparing these statistics with their null distribution obtained from the simulated data. We used gamma distributed mutation rates with a mean value of 2.5 × 10⁻⁸/generation/site (95% confidence interval: 1.476 × 10⁻⁸–4.036 × 10⁻⁸), in agreement with previous studies (Pluzhnikov et al. 2002; Voight et al. 2005). This procedure yielded a P value for the significance of the departure from a constant size model. We performed this procedure both on the whole sequences and on the largest non-recombining blocks. For the whole sequences (i.e., including recombination), we performed the data simulation under a coalescent model with recombination, using for each locus the recombination rate provided by the HapMap build GRCh37 genetic map (International HapMap Consortium 2003) (supplementary table S8, Supplementary Material online), whereas for the largest non-recombining blocks, we used a coalescent model without recombination.

Both for autosomes and for HVS-I sequences, we adjusted the obtained P values for each neutrality test using a false discovery rate (FDR) correction (Benjamini and Hochberg 1995) in R v2.14.1 (R Development Core Team 2011), in order to take into account the increased error probability in the case of multiple testing.
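The Benjamini and Hochberg (1995) adjustment can be sketched in a few lines of Python. The version below is a minimal step-up implementation written for illustration, and the P values in the usage note are made up. It is not the code used in the study, which relied on R.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up P values for four populations tested with the same neutrality test.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each raw P value is multiplied by m/rank (m tests, rank in ascending order) and the results are then made monotone, so a test is declared significant at FDR level α when its adjusted value is at most α.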

MCMC Estimations of Demographic Parameters

We used the MCMC algorithm implemented in BEAST v1.6 (Drummond and Rambaut 2007). We tested the four demographic models implemented in this software: constant effective population size (N0) (constant model), population expansion with an increasing growth rate (g) (exponential model), population expansion with a decreasing growth rate (g) (logistic model), and the expansion model, in which N0 is the present-day population size, N1 the population size that the model asymptotes to going into the distant past, and g the exponential growth rate that determines how fast the transition is from near the N1 population size to the N0 population size. In fact, BEAST estimates composite parameters for each model, namely N0μ and g/μ, where N0 is the current effective population size, g the growth rate, and μ the mutation rate. In addition, for the expansion model, the ratio between the current (N0) and ancestral (N1) effective population size is also estimated. To infer N0 and g from these composite parameters, we needed to assume a value for the mutation rate. However, there is no consensus for mutation rates in humans in the literature, as different methods lead to different estimations. For autosomes, the most commonly used value is the phylogenetic rate of μ = 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002). However, recent studies based on the 1000 Genomes Project (1000 Genomes Project Consortium 2012) have found a 2-fold lower rate (μ = 1.2 × 10⁻⁸/generation/site) by directly comparing genome-wide sequences from children and their parents (Conrad et al. 2011; Scally and Durbin 2012). We used here both mutation rates. Similarly, for HVS-I, estimated mutation rates are highly dependent on methodologies and modes of calibration (Endicott et al. 2009). We used both the lower and the higher estimated mutation rates: the transitional changes mutation rate of μ = 5 × 10⁻⁶/generation/site (Forster et al. 1996) and the pedigree-based rate of μ = 10⁻⁵/generation/site (Howell et al. 1996; Heyer et al. 2001). We used a general time-reversible substitution model (Rodriguez et al. 1990). We assumed a generation time of 25 years, permitting the comparison with previous human population genetics studies (e.g., Chaix et al. 2008; Patin et al. 2009; Laval et al. 2010). As BEAST cannot handle recombination events, we used the largest nonrecombining block within each sequence (discussed earlier).

We performed three runs of 10⁷ steps per population and per demographic model for the HVS-I sequence and three runs of 2 × 10⁸ steps (which corresponded to three runs of 10⁷ steps per locus) for the autosomal sequences. We recorded one tree every 1,000 steps, which thus implied a total of 10⁵ trees per locus and per run. We then removed the first 10% of steps of each run (burn-in period) and combined the runs to obtain acceptable effective sample sizes (ESSs of 100 or above). The convergence of these runs was assessed using two methods: visual inspection of traces using Tracer v1.5 (Rambaut and Drummond 2007) to check for concordance between runs, and computation of Gelman and Rubin's (1992) convergence diagnostic using R v2.14.1 (R Development Core Team 2011) with the function gelman.diag available in the package coda (Plummer et al. 2006).
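To make the convergence check concrete, the Python sketch below discards a burn-in fraction from each run and computes the basic Gelman and Rubin (1992) potential scale reduction factor for one parameter. It is a simplified illustration: it omits the degrees-of-freedom correction that the coda function applies, and the function names and the toy chains are assumptions made for the example.

```python
from statistics import mean, variance


def discard_burn_in(samples, fraction=0.1):
    """Drop the first `fraction` of the recorded samples of one run."""
    return samples[int(len(samples) * fraction):]


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains.

    Values close to 1 suggest that the runs sample the same distribution.
    """
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: mean within-chain variance
    between = n * variance(chain_means)            # B: between-chain variance
    pooled = (n - 1) / n * within + between / n    # pooled variance estimate
    return (pooled / within) ** 0.5


# Toy traces for one parameter from three hypothetical runs.
runs = [[0.9, 1.1, 1.0, 1.2, 0.8], [1.0, 0.9, 1.1, 1.0, 1.2], [1.1, 1.0, 0.9, 1.2, 1.0]]
print(gelman_rubin(runs))
```

A common rule of thumb treats values below roughly 1.1 as acceptable, although the threshold is a convention rather than a property of the diagnostic.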

To facilitate a large exploration of the parameter space, for the autosomal sequences, we chose uniform priors between 0 and 0.05 for 2N0μ and between −10⁹ and 10⁹ for g/μ. For HVS-I sequences, we chose uniform priors for N0μ between 0 and 10 and for g/μ between −2.5 × 10⁶ and 2.5 × 10⁶, resulting in the same priors on N0 and g as for the autosomal sequences if we assumed μ = 10⁻⁵/generation/site (i.e., N0 constrained between 0 and 10⁶ and g constrained between −1 and 1 per year). Conversely, if we assumed μ = 5 × 10⁻⁶/generation/site, it meant that N0 was constrained between 0 and 2 × 10⁶ and g was constrained between −0.5 and 0.5 per year.
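The correspondence between the prior bounds on the composite parameters and the bounds on N0 and g can be verified with a short calculation. The Python sketch below assumes that the composite parameters are N0μ (or 2N0μ for autosomes) and g/μ, with g expressed per generation before conversion to a per-year rate using the 25-year generation time; the function name is hypothetical.

```python
GENERATION_TIME_YEARS = 25.0


def unscale(n0_mu, g_over_mu, mu, ploidy_factor=1.0):
    """Convert composite parameters to N0 and to g per year.

    ploidy_factor = 2 when the size parameter is expressed as 2 * N0 * mu.
    """
    n0 = n0_mu / (ploidy_factor * mu)
    g_per_generation = g_over_mu * mu
    return n0, g_per_generation / GENERATION_TIME_YEARS


# Upper prior bounds for HVS-I under the two mutation rates.
print(unscale(10, 2.5e6, 1e-5))   # about (1e6, 1.0)
print(unscale(10, 2.5e6, 5e-6))   # about (2e6, 0.5)
# Upper prior bounds for autosomes with mu = 2.5e-8 and a 2*N0*mu scaling.
print(unscale(0.05, 1e9, 2.5e-8, ploidy_factor=2.0))  # about (1e6, 1.0)
```

The lower bounds follow by symmetry for g and are 0 for N0, which reproduces the ranges quoted in the text.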

For each population and model, we obtained the mode and the 95% HPD of N0 and g, inferred from their posterior distributions (supplementary tables S4 and S7, Supplementary Material online), using the add-on package Locfit (Loader 1999) in R v2.14.1. We selected the best-fitting model among the four tested demographic models by estimating marginal likelihoods using two methods, path sampling and stepping-stone sampling (Baele et al. 2012). The model with the greater marginal likelihood (supplementary table S3, Supplementary Material online) was considered as the best-fitting model.

Extended Bayesian Skyline Plots

EBSPs (Heled and Drummond 2010), also implemented in BEAST, estimate demographic changes occurring continuously through time in a population, using the time intervals between successive coalescent events. This method allows a visualization of the evolution of Ne through time. As above, we combined three runs of 10⁷ steps for mitochondrial sequences and three runs of 2 × 10⁸ steps for autosomal sequences to obtain acceptable ESS values. We assumed the same mutation rates as above and a generation time of 25 years. Outputs were analyzed with Tracer v1.5 to visually check for convergence and ESS, and also to obtain the 95% HPD interval for the number of demographic changes that occurred in the population (supplementary table S5, Supplementary Material online). A constant population size could be rejected when the 95% HPD of the number of change points excluded 0 but included 1 (Heled and Drummond 2010). Then, we used R v2.14.1 to compute Gelman and Rubin's (1992) convergence diagnostic as above, as well as to compute skyline plots. Finally, we used the population growth curves generated from BEAST to assess the time at which populations began to expand. Each skyline plot consisted of smoothed data points at approximately 10–20 generation intervals. We considered that the population increased (or decreased) when both the median and 95% HPD values for Ne increased (or decreased) between more than two successive data points. Although this method did not provide a 95% HPD interval for the inferred expansion timings, this conservative approach ensured that we considered only relevant expansion signals.
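The onset rule described above can be read in more than one way. The Python sketch below implements one possible reading, given as an assumption rather than as the procedure actually used: skyline points are scanned from the oldest to the most recent, and the onset is the time of the first point of the earliest run of at least three successive points along which the median and both 95% HPD bounds all strictly increase. The function name, the tuple layout, and the toy skyline are hypothetical.

```python
def expansion_onset(points, min_points=3):
    """Return the onset time of the earliest sustained increase, or None.

    `points` is a list of (time_bp, median_ne, hpd_lower, hpd_upper) tuples
    ordered from the oldest to the most recent time point.
    """
    run_start = 0
    for k in range(1, len(points)):
        previous, current = points[k - 1], points[k]
        # Median and both HPD bounds must all increase toward the present.
        increasing = all(current[c] > previous[c] for c in (1, 2, 3))
        if not increasing:
            run_start = k
        elif k - run_start + 1 >= min_points:
            return points[run_start][0]
    return None


# Toy skyline: flat until 40,000 YBP, then growing toward the present.
skyline = [
    (50_000, 100, 50, 200),
    (40_000, 100, 50, 200),
    (30_000, 120, 60, 240),
    (20_000, 150, 80, 300),
    (10_000, 200, 100, 400),
]
print(expansion_onset(skyline))
```

Requiring the HPD bounds to move together with the median is what makes the rule conservative: an increase in the median alone, with a wide and stable HPD, is not counted as an expansion.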

Correlation Tests of Inferred Growth Rates and Isolation/Immigration Patterns

To test how differences in isolation degrees and migration patterns could have impacted our demographic inferences, we used ARLEQUIN v3.11 (Excoffier et al. 2005) to compute population-specific FST values (Weir and Hill 2002) from HVS-I data for Central Africa, Eurasia, and Central Asia (supplementary table S9, Supplementary Material online) and to estimate immigration rates from mismatch distributions under a spatially explicit model (Excoffier 2004) (supplementary table S10, Supplementary Material online). We then performed Spearman tests using R v2.14.1 to investigate, for each region, how the inferred parametric growth rates were correlated with those FST values and immigration rates. We used a value of 0 for the growth rate when the constant model best fitted the data.
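As an illustration of this step, the sketch below computes a Spearman rank correlation in plain Python after setting the growth rate to 0 wherever the constant model fitted best. The analyses in this study were run in R; the function names and the FST and growth-rate values here are hypothetical, and the sketch returns only the coefficient, not a P value.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical populations: (FST, best-fitting model, inferred growth rate).
populations = [
    (0.01, "exponential", 0.004),
    (0.02, "exponential", 0.003),
    (0.05, "exponential", 0.001),
    (0.08, "constant", None),
    (0.12, "constant", None),
]
fst = [p[0] for p in populations]
# Growth rate is set to 0 when the constant model fitted best.
growth = [0.0 if p[1] == "constant" else p[2] for p in populations]
rho = spearman_rho(fst, growth)
print(round(rho, 4))
```

Setting several growth rates to 0 creates ties, which is why the ranks are averaged rather than assigned arbitrarily.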

Supplementary Material

Supplementary figures S1 and S2 and tables S1–S12 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors would like to warmly thank all volunteer participants. They also thank Michael Fontaine for his help on some data analyses, Phillip Endicott and Julio Bendezu-Sarmiento for insightful discussions, Alexei Drummond for helpful discussions, and Friso Palstra for his help on English usage. They thank Laurent Excoffier, two anonymous reviewers, and the editor for helpful comments and suggestions. All computationally intensive analyses were run on the Linux cluster of the Muséum National d'Histoire Naturelle (administrated by Julio Pedraza) and on the web-based portal "Bioportal" (Kumar et al. 2009). This work was supported by Actions Transversales du Muséum - Muséum National d'Histoire Naturelle (grant "Les relations Sociétés-Natures dans le long terme") and Agence Nationale de la Recherche (grants "Altérité culturelle" ANR-10-ESVS-0010 and "Demochips" ANR-12-BSV7-0012). C.A. was financed by a PhD grant from the Centre National de la Recherche Scientifique.

References

1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.

Ambrose SH. 2001. Paleolithic technology and human evolution. Science 291:1748–1753.

Arenas M, Francois O, Currat M, Ray N, Excoffier L. 2013. Influence of admixture and Paleolithic range contractions on current European diversity gradients. Mol Biol Evol 30:57–61.

Atkinson QD, Gray RD, Drummond AJ. 2009. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa. Proc R Soc Lond B Biol Sci 276:367–373.

Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol 29:2157–2167.

Balaresque PL, Ballereau SJ, Jobling MA. 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum Mol Genet 16:R134–R139.

Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. 2011. Insights into the demographic history of African pygmies from complete mitochondrial genomes. Mol Biol Evol 28:1099–1110.

Beaumont MA. 2004. Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92:365–379.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 57:289–300.

Bermisheva M, Salimova A, Korshunova T, Svyatova G, Berezina G, Villems R, Khusnutdinova E. 2002. Mitochondrial DNA diversity in the populations of Middle Asia and Northern Caucasus. Eur J Hum Genet 10:179–180.

Bocquet-Appel JP. 2011. When the world's population took off: the springboard of the Neolithic Demographic Transition. Science 333:560–561.

Bocquet-Appel J-P, Bar-Yosef O. 2008. The Neolithic demographic transition and its consequences. Dordrecht (The Netherlands): Springer.


Aimé et al. doi:10.1093/molbev/mst156 MBE

Chaix R, Austerlitz F, Hegay T, Quintana-Murci L, Heyer E. 2008. Genetic traces of east-to-west human expansion waves in Eurasia. Am J Phys Anthropol 136:309–317.

Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, Heyer E. 2007. From social to genetic structures in central Asia. Curr Biol 17:43–48.

Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. 2009. The Last Glacial Maximum. Science 325:710–714.

Conrad DF, Keebler JEM, DePristo MA, et al. (17 co-authors). 2011. Variation in genome-wide mutation rates within and between human families. Nat Genet 43:712–714.

Derenko MV, Malyarchuk BA, Dambueva IK, Shaikhaev GO, Dorzhu CM, Nimaev DD, Zakharov IA. 2000. Mitochondrial DNA variation in two South Siberian aboriginal populations: implications for the genetic history of North Asia. Hum Biol 72:945–973.

Dirksen VG, van Geel B. 2004. Mid to late Holocene climate change and its influence on cultural development in South Central Siberia. Nato Sci S Ss IV Ear 42:291–307.

Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214.

Endicott P, Ho SYW, Metspalu M, Stringer C. 2009. Evaluating the mitochondrial timescale of human evolution. Trends Ecol Evol 24:515–521.

Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol 13:853–864.

Excoffier L, Heckel G. 2006. Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet 7:745–758.

Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online 1:47–50.

Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A 109:E2569–E2576.

Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet 59:935–945.

Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.

Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.

Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci 7:457–511.

Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol 27:570–580.

Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Segurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet 10:49.

Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol 21:597–612.

Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet 63:1113–1126.

Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res 11:423–434.

Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet 59:501–509.

Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.

Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med 116:68–73.

International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.

Kingman JFC. 1982. The coalescent. Stoch Proc Appl 13:235–248.

Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet 73:671–676.

Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet 2:e53.

Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.

Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.

Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.

Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.

Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A 107:19201–19206.

Loader C. 1999. Local regression and likelihood. New York: Springer.

Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A 103:9381–9386.

Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.

Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.

Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human populations on a global scale. Mol Biol Evol 10:927–943.

Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.

Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet 29:20–21.

Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet 6:165–183.

Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev 16:1125–1133.

Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet 5:e1000448.

Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.

Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.

Plummer M, Best N, Cowles K, Vines K. 2006. CODA: Convergence diagnosis and output analysis for MCMC. R News 6:7–11.

Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.

Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet 74:827–845.

Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A 105:1596–1601.


R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.

Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer/

Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol 19:2092–2100.

Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol 20:76–86.

Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet 67:1251–1276.

Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol 142:485–501.

Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.

Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet 13:745–53.

Segurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet 21:1146–1151.

Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.

Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet 25:207–217.

Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

Tremblay M, Vezina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet 66:651–658.

Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol 21:559–566.

Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol 19:312–318.

Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture, and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol 30:918–937.

Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A 102:18508–18513.

Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet 36:721–750.

Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.

Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet 70:635–651.



FIG. 5. Correlations between population-specific FST values and inferred growth rates in African farmer (A) and hunter-gatherer (B) populations, Eurasian farmer populations (C), and Central Asian farmer (D) and herder populations (E). Population-specific FST values were computed with ARLEQUIN v3.11 (Excoffier et al. 2005). The growth rates were inferred under the best-fitting model from the parametric method using BEAST (Drummond and Rambaut 2007). When the best-fitting model was the constant model, we assumed a growth rate of 0. Note that we did not represent Eurasian herder populations, as the constant model best fitted the data for all of them. Plots and correlation tests were performed using R v2.14.1 (R Development Core Team 2011).


negative correlation between FST values and inferred growth rates for sedentary farmers in Eurasia (ρ = −0.673, P value = 0.028) (fig. 5C) and Central Asia (ρ = −0.773, P value = 0.003) (fig. 5D), thus meaning that less isolated populations showed higher inferred growth rates. There was no significant correlation for Central Asian herders (|ρ| = 0.092, P value = 0.736) (fig. 5E). Note that this analysis could not be performed for the other Eurasian herder populations, as the constant model best fitted the data with the parametric method.

The estimation of the proportion of immigrants did not converge for 11 Eurasian populations (Han Chinese, Liaoning, Qingdao, Palestinians, Pathans, Mongols, as well as three Central Asian farmer populations and two Central Asian herder populations). Regarding the other populations, we showed no significant difference in the proportion of immigrants between farmers and herders, both in Central Asia (mean[farmers] = 82868, mean[herders] = 260265, P value = 0.12) and in the rest of Eurasia (mean[farmers] = 576757, mean[herders] = 201311, P value = 0.51) (supplementary table S10, Supplementary Material online). We also found no significant correlation between this proportion and the inferred growth rates for Eurasian farmers (|ρ| = 0.238, P value = 0.48), Central Asian farmers (|ρ| = 0.386, P value = 0.30), and Central Asian herders (|ρ| = 0.42, P value = 0.139) (supplementary fig. S2, Supplementary Material online).

Discussion

In this study, using a large set of populations from distant geographic areas, we report contrasted demographic histories that correlate with lifestyle. Moreover, the inferred expansion signals in both African and Eurasian farmer and herder populations predated the Neolithic transition and the sedentarization of these populations.

Contrasted Demographic Histories in Sedentary and Nomadic Populations

For Africa, both mtDNA and autosomal data revealed expansion patterns in most sedentary farmer populations, as indicated by neutrality tests and the parametric and nonparametric BEAST methods. Conversely, we found constant effective population sizes (or possibly contraction events) for all hunter-gatherer populations. Among the farmers, results were least clear for the Yoruba and the Ewondo populations, as no neutrality test was significant for these populations, whereas they showed evidence of expansion events when analyzed with BEAST. This indicates that these populations may have undergone weaker expansion dynamics (i.e., lower growth rates and Ne) than the others. These remarkable results are of particular importance for the Yoruba, as it is a reference population in many databases (HapMap, 1000 Genomes). This also demonstrates the higher sensitivity of MCMC methods such as BEAST for detecting expansions in comparison to neutrality tests.

The contrasted patterns inferred between sedentary and nomadic populations in Africa suggest strong differences between the demographic histories of these two groups of populations. The question is whether this pattern results mostly from differences in local expansion dynamics or whether spatial expansion processes at a larger scale were also involved. As shown by Ray et al. (2003), negative values for the neutrality tests will be observed in a spatial expansion process if the rate of migrants (Nm) is high enough (at least 20), but not otherwise. As in previous studies (e.g., Verdu et al. 2013), we report a higher degree of isolation (higher population-specific FST values) in hunter-gatherer populations than in farmer populations. Using the spatial expansion model of Excoffier (2004) also leads to higher estimates of the number of immigrants into farmer populations. Thus, both farmers and hunter-gatherers may have been subject to a spatial expansion process, but the limited number of migrants among hunter-gatherers may have resulted in an absence of expansion signals for them. This would be consistent with the positive correlation that we observe between the growth rates estimated with BEAST and the inferred number of immigrants in the sedentary farmer populations. However, this spatial expansion process seems unlikely to completely explain the strong association that we observed between lifestyle and expansion patterns, as some farmer populations (Teke, Gabonese Fang) displayed FST values similar to those of hunter-gatherers but a clear signal of expansion with relatively high growth rates. This suggests that even rather isolated farmer populations show substantial levels of expansion. Moreover, FST values and inferred growth rates in farmer populations were not significantly correlated. Therefore, our results suggest that the expansion patterns observed in sedentary populations do not result only from a spatial expansion pattern. In addition, local dynamics connected with the higher capacity of food production by farmers also explain their much stronger expansion signatures relative to their neighboring hunter-gatherer populations.

For Eurasia, when considering the mtDNA data, all three methods (neutrality tests, parametric BEAST analyses, and EBSPs) yielded expansion signals for all sedentary farmer populations except Koreans. Conversely, only EBSPs and Fu's Fs test showed expansion signals for nomadic herders, whereas neither the parametric BEAST method nor the other neutrality tests did. This result points toward weaker expansion dynamics in herders than in farmers, as supported also by the tendency for lower growth rates and Ne in herder populations than in farmer populations on the EBSP graphs (fig. 2C and D). It thus seems that the flexibility and nonparametric nature of EBSP analyses allows one to detect weaker expansion events than the parametric method. Moreover, Fu's Fs is known to be more sensitive than the other neutrality tests for detecting expansions (Ramos-Onsins and Rozas 2002). Again, these inferred expansions may result, at least in part, from spatial expansion processes. The population-specific FST values are indeed rather low in Eurasia. Moreover, we found a significant negative correlation between FST values and inferred growth rates for the sedentary farmers, indicating that less isolated populations showed stronger expansion signals. However, although we inferred much stronger expansion patterns for the farmers than for the herders, we did not observe any differences in Eurasia between the farmers and the herders in the population-specific FST values or in the estimated number of immigrants, suggesting that spatial processes alone cannot explain the strong difference that we observed between the expansion patterns of these two groups of populations. This indicates that the intrinsic demographic growth patterns are different between these two kinds of populations, the farmers showing much higher growth rates than the herders.

To our knowledge, although other studies have found different patterns between hunter-gatherers and farmers (e.g., Verdu et al. 2009), our study is the first to show differences between farmers and herders, the two major post-Neolithic human groups. A plausible explanation could be that nomadic herders and hunter-gatherers share several of the constraints of a nomadic way of life. For instance, birth intervals are generally longer (at least 4 years) in nomadic populations than in sedentary populations (e.g., Short 1982). According to Bocquet-Appel (2011), these longer birth intervals may be mainly determined by diet differences. Indeed, Valeggia and Ellison (2009) demonstrated that birth interval is mainly determined by the rapidity of postpartum energy recovery, which may be increased by the consumption of high-carbohydrate food (like cereals). Moreover, the nomadic herder way of life may offer less food security than sedentary farming, the latter facilitating efficient long-term food storage.

However, unlike in Africa, we did not find systematically consistent patterns between the autosomal and mtDNA data in Eurasia. The possible contraction events that our results suggest for two sedentary populations (Japanese and Danes) with autosomes appeared concomitant with historical events that could have led to bottleneck processes. For the Japanese population, this contraction signal could indeed result from a founder effect due to the Paleolithic colonization of Japan by a subset of the Northern Asiatic people (especially from Korea; Nei 1995). Similarly, a bottleneck process may also have occurred in the Danish population, linked with the Last Glacial Maximum, which occurred between 26,500 and 19,500 YBP (Clark et al. 2009). Reasons why these processes impacted the autosomes but not the mtDNA data remain to be determined, for instance through simulation studies. In any case, our study clearly emphasizes the utility of combining mtDNA and autosomal sequences, as they allow access to different aspects of human history. A recent study on harbor porpoises has similarly shown that nuclear markers were sensitive to a recent contraction event, whereas mtDNA allowed inferring a more ancient expansion (Fontaine et al. 2012).

Interestingly, Central Asia displayed a distinct pattern from the rest of Eurasia. Indeed, we did not infer higher expansion rates for sedentary farmers than for nomadic herders in that area. This could result from harsh local environmental conditions due to the arid continental climate in this area. Indeed, using pollen records, Dirksen and van Geel (2004) showed that the paleoclimate in Central Asia was very arid from at least 12,000 to 3,000 YBP, which could have limited the amount of suitable areas for farming and impacted human demography. Spatial expansion processes may also have played a role in this difference, as population-specific FST values were higher for the farmers than for the herders. This may indicate that more migrants were involved in the spatial expansion process for the herders than for the farmers, yielding a weaker expansion signal (i.e., lower inferred growth rate) for the latter (Ray et al. 2003). This is supported by the negative correlation between the FST values and the inferred growth rates in the farmer populations. The Korean population also stood out as an exception in Eurasia. Even though it is a population of sedentary farmers, it showed no significant expansion signal with either the parametric or the nonparametric method with HVS-I. This could be explained by a later sedentarization of this population. The Korean Neolithic is notably defined by the introduction of Jeulmun ware ceramics about 8,000 YBP, but the people of the Jeulmun period were still predominantly semi-nomadic fishers and hunter-gatherers until about 3,000 YBP, when Koreans started an intensive crop production implying a sedentary lifestyle (Nelson 1993).

Inferred Expansion Signals Predate the Emergence of Farming

EBSP analyses revealed that the inferred expansion events in farmer and herder populations were more ancient than the emergence of farming and herding. Therefore, the differences in demographic patterns between farmers and herders seem to predate their divergence in lifestyle, which raises the question of the chronology of demographic expansions and the Neolithic transition. These findings appear to be quite robust to the choice of the scaling parameters. We used here both the lower and the higher mutation rate estimates in humans for autosomes (Pluzhnikov et al. 2002; Conrad et al. 2011) and for the HVS-I sequence (Forster et al. 1996; Howell et al. 1996). Despite this uncertainty in mutation rates, which leads to a 2-fold uncertainty in our time estimates, the inferred expansion signals predated the emergence of agriculture in both cases for all populations. Similarly, using a generation time of 29 years (Tremblay and Vezina 2000) instead of 25 years leads to slightly more ancient estimates and thus does not change our conclusions (data not shown). However, note that for HVS-I, using the higher bound of the credibility interval for the highest estimated mutation rate (2.75 × 10^-5/generation/site; Heyer et al. 2001) instead of the mean value (i.e., 10^-5/generation/site) leads to expansion time estimates consistent with the Neolithic transition in Eurasian populations (supplementary table S11, Supplementary Material online). Nevertheless, these estimates still clearly predated the Neolithic for the African populations. However, 10^-5/generation/site is by far the highest estimation of mutation rate in the literature (Howell et al. 1996). To infer Neolithic expansions in most Eurasian populations, one needs to assume a mutation rate of at least 2 × 10^-5/generation/site, which is much higher than other estimates from the literature and is thus probably unrealistic. Moreover, our method for determining the expansion onset time using EBSP graphs is very conservative and also tends to favor the lower bound of expansion onset times. Finally, for autosomes, similarly using 4.74 × 10^-8/generation/site instead of 2.5 × 10^-8/generation/site (Pluzhnikov et al. 2002) leads to an inferred expansion onset time that is not compatible with the Neolithic transition for all Eurasian and African populations except for one African population, the Yoruba (supplementary table S12, Supplementary Material online). Consequently, it seems very likely that the expansions inferred in this study correspond to Paleolithic rather than Neolithic demographic events, in agreement also with most previous studies, as detailed later.
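The sensitivity of these dates to the scaling parameters can be illustrated with a rough sketch. It assumes that an inferred date simply scales with generation time divided by the per-generation mutation rate, which approximates but does not replace rerunning the analyses; the function name and the 20,000-year starting value are hypothetical.

```python
def rescale_time(years, mu_old, mu_new, gen_old=25.0, gen_new=25.0):
    """Rescale a date inferred under one per-generation mutation rate and
    generation time to another pair of values.

    Assumes dates are proportional to generation_time / mutation_rate."""
    return years * (mu_old / mu_new) * (gen_new / gen_old)


# Hypothetical onset of 20,000 years inferred with 1e-5/generation/site.
baseline = 20000.0
# A rate 2.75 times higher gives a proportionally more recent date.
faster_rate = rescale_time(baseline, 1e-5, 2.75e-5)
# A generation time of 29 instead of 25 years gives a slightly older date.
longer_generation = rescale_time(baseline, 1e-5, 1e-5, gen_old=25.0, gen_new=29.0)
print(round(faster_rate), round(longer_generation))
```

Under this approximation, only a markedly higher mutation rate moves a Paleolithic date toward the Neolithic, whereas the change in generation time shifts it by 16% in the opposite direction.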

In Africa, the emergence of agriculture has been dated between 5,000 and 4,000 YBP in the Western part of Central Africa, and agriculture subsequently rapidly expanded to the rest of sub-Saharan Africa (Phillipson 1993). However, using HVS-I, we showed expansion events in farmer populations since about 30,000 or 60,000 YBP, thus largely predating the emergence of agriculture in the area. Similarly, using autosomes, especially in Eastern African populations, we inferred expansion signals that clearly predated the Neolithic. Notably, we inferred an expansion signal for Mozambicans since at least 80,000 YBP. Several genetic studies have already highlighted that expansion events occurred in African farmers before the Neolithic transition (e.g., Atkinson et al. 2009; Laval et al. 2010; Batini et al. 2011). This finding is also consistent with paleoanthropological data (i.e., radiocarbon dating) suggesting an expansion event in Africa 60,000–80,000 YBP (Mellars 2006a). This Paleolithic demographic expansion could be linked to a rapid environmental change toward a dryer climate (Partridge et al. 1997) and/or to the emergence of new hunting technologies (Mellars 2006a).

According to Mellars (2006a), this period corresponds to a major increase in the complexity of the technological, economic, social, and cognitive behavior of certain African groups. It corresponds in particular to the emergence of projectile technologies (Shea 2009), which was probably part of a broader pattern of ecological diversification of early Homo sapiens populations. These changes could have been decisive for the human spread "Out of Africa" during the same period and could have ultimately also led to the sedentarization of the remaining populations. This inference is consistent with Sauer's (1952) demographic theory, which stated that late Paleolithic demographic expansions could have favored the sedentarization and the emergence of agriculture in some human populations. In the case of Central Africa, the period of 60,000 YBP corresponds to the separation between the ancestors of hunter-gatherers and farmers (Patin et al. 2009; Verdu et al. 2009). Thus, these two groups may have presented contrasting demographic patterns since their divergence. Much later, higher expansion rates and larger population sizes among farmers' ancestors may have induced the emergence of agriculture and sedentarization.

With respect to Eurasia, the expansion profiles inferred with HVS-I for all populations and with autosomes for the Han Chinese population also seem to have begun during the Paleolithic, thus before the Neolithic transition. Some genetic studies already reported pre-Neolithic expansions in Asia and Europe (e.g., Chaix et al. 2008). Notably, using mismatch and intermatch distributions, Chaix et al. (2008) showed an east-to-west Paleolithic expansion wave in Eurasia. We found a similar pattern here, as the inferred expansions of East Asian populations were earlier than those of Central Asian populations, themselves earlier than those of European populations.

Moreover, we found this pattern in both sedentary farmer and nomadic herder populations. Thus, the ancestors of currently nomadic herder populations also experienced these Paleolithic expansions. However, Paleolithic expansion signals in nomadic populations seem weaker than in sedentary populations. This is again compatible with the demographic theory of the Neolithic sedentarization (Sauer 1952): some populations may have experienced more intense Paleolithic expansions, which may have led ultimately to their sedentarization.

The inferred Paleolithic expansion signals might result partly from spatial expansions out of some refuge areas after the Last Glacial Maximum (LGM, 26,500–19,500 YBP; Clark et al. 2009), as this time interval matches our inferred dating for expansion onsets in East Asia with HVS-I using the pedigree-based mutation rate, and in Europe and the Middle East using the transitional mutation rate. Some of the earlier date estimates might also be consistent with the out-of-Africa expansion of H. sapiens. However, radiocarbon-based estimates of the spread of H. sapiens in Eurasia are generally more ancient than our inferred expansion onset timings. For instance, Mellars (2006b) dated the colonization of the Middle East by H. sapiens at 47,000–49,000 YBP and of Europe at 41,000–42,000 YBP. Pavlov et al. (2001) report traces of modern human occupation nearly 40,000 years old in Siberia. Finally, Liu et al. (2010) described modern human fossils from South China dated to at least 60,000 YBP. Moreover, out-of-Africa or post-LGM expansions would not explain our finding of an east-to-west gradient of expansion onset timing, which rather supports the hypothesis of a demographic expansion diffused from east to west in Eurasia in a demic (i.e., migrations of individuals) or cultural (favored by the diffusion of new technologies) manner.

Possible Confounding Factors

Our approach makes the assumption that populations are isolated and panmictic, which is questionable for human populations. However, we analyzed a large set of populations sampled in very distant geographical regions (i.e., Central Africa, East Africa, Europe, Middle East, Central Asia, Pamir, Siberia, and East Asia). The main conclusions of this study rely on consistent patterns between most of these areas, and it seems unlikely that processes such as admixture could have biased the estimates similarly everywhere. Moreover, in Central Africa, several studies have shown that hunter-gatherer populations show signals of admixture, whereas it is not the case for farmer populations (Patin et al. 2009; Verdu et al. 2009, 2013). If this introgression had been strong enough, this may have yielded a spurious expansion signal in the hunter-gatherer populations, which is not what we observed here. In Europe, spatial expansion processes during the Neolithic may have led to admixture with Paleolithic populations. As pointed out by a simulation study (Arenas et al. 2013), this may lead to a predominance of the Paleolithic gene pool. This may be one of the factors explaining why we observed mostly Paleolithic expansions here.


Demographic Patterns between Sedentary and Nomadic Populations. doi:10.1093/molbev/mst156. MBE

Similarly, potential selection occurring on the whole mitochondrial genome (e.g., Pakendorf and Stoneking 2005) seems unlikely to have impacted in the same way all the studied populations within each group (e.g., stronger positive selection on sedentary than on nomadic populations), as we analyzed different nomadic and sedentary populations living near each other in several geographically distant areas.

Regarding the potential effects of recombination on the inferences from autosomal data, we found that neutrality tests gave similar results on the whole sequences, when using a simulation procedure that took the known recombination rate of each sequence into account (table 1), and on the largest non-recombining blocks as inferred with IMgc, without taking recombination into account in the simulation process (supplementary table S12, Supplementary Material online). It thus appears unlikely that the BEAST analyses, which can only handle the largest inferred non-recombining blocks, are biased because of this.

Finally, note that the effective population sizes inferred using BEAST correspond to the Ne of the populations during their recent history, rather than a value of Ne averaged over the history of the population. This explains the finding that, for most populations, we inferred Ne estimates much higher than generally assumed for humans by population geneticists (about 10,000).

Material and Methods

Genetic Markers

Autosomal Sequences

We used data from 20 noncoding, a priori neutral, and unlinked autosomal regions, selected by Patin et al. (2009) to be at least 200 kb away from any known or predicted gene, to not be in linkage disequilibrium (LD) either with each other or with any known or predicted gene, and to have a region of homology with the chimpanzee genome. These regions are on average 1,253 bp long. Using the four-gamete test (Hudson and Kaplan 1985) as implemented in IMgc online (Woerner et al. 2007), we identified recombination events for 6 of these 20 regions. As some methods used in this study cannot handle recombination, we retained for these six sequences the largest non-recombining block inferred by IMgc. Because of this reduction, the 20 regions used were on average 1,228 bp long. To identify potential bias related to this method (e.g., some recombination events may not be detected using the four-gamete test; larger blocks of non-recombining sequence may select for gene trees that are shorter than expected), we computed the summary statistics and performed neutrality tests (discussed later) both on the whole sequences (table 1) and on the largest non-recombining blocks (supplementary table S12, Supplementary Material online).
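The four-gamete criterion mentioned above can be illustrated with a minimal sketch. It assumes biallelic sites coded 0/1 and made-up haplotypes; it only flags pairs of sites at which all four two-site gametes occur, and it is not the block-selection algorithm of IMgc.

```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return pairs of site indices (i, j) at which all four gametes occur.

    Under the infinite-sites model, observing 00, 01, 10 and 11 at two
    biallelic sites implies at least one recombination event between them.
    """
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


# Hypothetical haplotypes (one string per chromosome), for illustration only.
sample = ["000", "010", "100", "110", "111"]
violating_pairs = four_gamete_violations(sample)
print(violating_pairs)
```

In this toy sample only the first two sites carry all four gametes, so any non-recombining block would have to exclude one of them.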

Mitochondrial Sequences

We used the first hypervariable segment of the mitochondrial control region (HVS-I), sequenced between positions 16067 and 16383, excluding the hypervariable poly-C region (sites 16179–16195). The total length of the sequence was thus 300 bp.
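As a quick check of the stated length, the arithmetic can be written out as follows, assuming that both the sequenced range and the excluded poly-C range are inclusive of their end positions.

```python
# Inclusive coordinates of the sequenced HVS-I segment and the excluded poly-C tract.
start, end = 16067, 16383
polyc_start, polyc_end = 16179, 16195

full_length = end - start + 1                 # sites sequenced
polyc_length = polyc_end - polyc_start + 1    # sites excluded
analysed_length = full_length - polyc_length  # sites retained for analysis

print(full_length, polyc_length, analysed_length)
```

With inclusive bounds, 317 sequenced sites minus 17 excluded sites gives the 300 bp reported in the text.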

Population Panel

For Africa, we used the autosomal sequences data set of Patin et al. (2009), which consists of five farmer populations (N = 118 individuals) and five Pygmy hunter-gatherer populations (N = 95). In addition, we used the HVS-I data set from Quintana-Murci et al. (2008), which consists of nine Central African farmer populations (N = 486) and seven Central African hunter-gatherer populations (N = 318) (supplementary table S1, Supplementary Material online).

For Eurasia, we used the autosomal sequences data set of Laval et al. (2010), consisting of 48 individuals from two East-Asian populations (Han Chinese and Japanese) and 47 individuals from two European populations (Chuvash and Danes). We also used the data from 48 individuals from one sedentary Central Asian population (Tajik farmers) and 48 individuals from one nomadic Central Asian population (Kyrgyz herders) of Ségurel et al. (2013). For HVS-I, we analyzed data from 17 Eurasian populations (N = 494 in total), located from Eastern to Western Eurasia, belonging to several published data sets (Derenko et al. 2000; Richards et al. 2000; Bermisheva et al. 2002; Imaizumi et al. 2002; Yao et al. 2002; Kong et al. 2003; Quintana-Murci et al. 2004; supplementary table S1, Supplementary Material online).

For our detailed study of Central Asia, we used HVS-I sequences from 12 farmer populations (N = 408 in total) and 16 herder populations (N = 567 in total). These data come from the studies by Chaix et al. (2007) and Heyer et al. (2009) for 25 populations (supplementary table S1, Supplementary Material online). The other populations (KIB, TAB, and TKY) were sequenced for this study. As in Chaix et al. (2007) and Heyer et al. (2009), DNA was extracted from blood samples using standard protocols, and the sequence quality was ensured as follows: each base pair was determined once with a forward and once with a reverse primer; any ambiguous base call was checked by additional and independent PCR and sequencing reactions; all sequences were examined by two independent investigators. All sampled individuals were healthy donors from whom informed consent was obtained. The study was approved by appropriate Ethics Committees and scientific organizations in all countries where samples were collected.

Demographic Inferences from Sequences Analysis

Summary Statistics and Neutrality Tests

We computed classical summary statistics (number of polymorphic sites S, number of haplotypes K) and four neutrality tests (Tajima's [1989] D, Fu and Li's [1993] D and F, and Fu's [1997] Fs) on both mitochondrial and autosomal sequences. Although neutrality tests were originally designed to detect selective events, they also give information about demographic processes, especially when applied to neutral markers. Indeed, expansion events lead to more negative values than expected in the absence of selective and demographic processes. Conversely, contraction events lead to more positive values of the neutrality tests. For HVS-I sequences, we computed all summary statistics and neutrality tests and tested their departure from neutrality using the coalescent-based tests provided in DnaSP (Librado and Rozas 2009).
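To make the sign convention of the neutrality tests concrete, here is a minimal sketch of Tajima's (1989) D computed directly from aligned haplotypes. It is an illustration only, not the DnaSP or ARLEQUIN implementation; the haplotypes are hypothetical, sites are assumed biallelic, and significance testing is left out.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D from equal-length haplotype strings (needs at least 4 sequences)."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("Tajima's D needs at least 4 sequences")
    n_sites = len(haplotypes[0])
    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for s in range(n_sites)
                    if len({h[s] for h in haplotypes}) > 1)
    if seg_sites == 0:
        return None  # D is undefined without polymorphism
    # pi: mean number of pairwise differences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical samples: only singletons (expansion-like) versus two deep lineages.
print(tajimas_d(["1000", "0100", "0010", "0001"]))
print(tajimas_d(["11", "11", "00", "00"]))
```

The first sample, made only of singletons, yields a negative D, as expected after an expansion; the second, with two intermediate-frequency lineages, yields a positive D.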

For the autosomal sequences, we used the procedure developed in Laval et al. (2010), which combines all autosomal sequences into a single test. This procedure consists in computing the mean value of each summary statistic across the 20 loci and in testing whether this mean value departs significantly from its expectation under neutrality in a constant-size population model, using a simulation procedure. For a given population with sample size n, we produced 10⁵ simulated samples of the same size n under a constant population size model, using the generation-per-generation coalescent-based algorithm implemented in SIMCOAL v2 (Laval and Excoffier 2004). Each simulated individual was constituted by 20 independent sequences of 1,253 bp (the average sequence length for the real data). Then, we used ARLEQUIN v3 (Excoffier et al. 2005), as modified by Laval et al. (2010), to compute the summary statistics on these simulated samples. We assessed whether the observed statistics differed significantly from the constant population model under neutrality by comparing these statistics with their null distribution obtained from the simulated data. We used gamma distributed mutation rates with a mean value of 2.5 × 10⁻⁸/generation/site (95% confidence interval: 1.476 × 10⁻⁸ to 4.036 × 10⁻⁸), in agreement with previous studies (Pluzhnikov et al. 2002; Voight et al. 2005). This procedure yielded a P value for the significance of the departure from a constant size model. We performed this procedure both on the whole sequences and on the largest non-recombining blocks. For the whole sequences (i.e., including recombination), we performed the data simulation under a coalescent model with recombination, using for each locus the recombination rate provided by the HapMap build GRCh37 genetic map (International HapMap Consortium 2003) (supplementary table S8, Supplementary Material online), whereas for the largest non-recombining blocks we used a coalescent model without recombination.

Both for autosomes and for HVS-I sequences, we adjusted the obtained P values for each neutrality test using a false discovery rate (FDR) correction (Benjamini and Hochberg 1995) in R v2.14.1 (R Development Core Team 2011), in order to take into account the increased error probability in the case of multiple testing.
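The Benjamini and Hochberg (1995) adjustment can be sketched in a few lines. This is a plain re-implementation of the step-up procedure for illustration, with hypothetical P values; it is not the R code used in the study.

```python
def benjamini_hochberg(p_values):
    """Return FDR-adjusted P values, in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for one neutrality test across four populations.
raw = [0.01, 0.04, 0.03, 0.20]
print(benjamini_hochberg(raw))
```

Each raw P value is scaled by m/rank, and the running minimum keeps the adjusted values monotone, so a test can never end up "more significant" than one with a smaller raw P value.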

MCMC Estimations of Demographic Parameters

We used the MCMC algorithm implemented in BEAST v1.6 (Drummond and Rambaut 2007). We tested the four demographic models implemented in this software: constant effective population size (N0) (constant model); population expansion with an increasing growth rate (g) (exponential model); population expansion with a decreasing growth rate (g) (logistic model); and the expansion model, in which N0 is the present-day population size, N1 the population size that the model asymptotes to going into the distant past, and g the exponential growth rate that determines how fast the transition is from near the N1 population size to the N0 population size. In fact, BEAST estimates composite parameters for each model, namely N0μ and g/μ, where N0 is the current effective population size, g the growth rate, and μ the mutation rate. In addition, for the expansion model, the ratio between the current (N0) and ancestral (N1) effective population size is also estimated. To infer N0 and g from these composite parameters, we needed to assume a value for the mutation rate. However, there is no consensus for mutation rates in humans in the literature, as different methods lead to different estimations. For autosomes, the most commonly used value is the phylogenetic rate of μ = 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002). However, recent studies based on the 1000 Genomes Project (1000 Genomes Project Consortium 2012) have found a 2-fold lower rate (μ = 1.2 × 10⁻⁸/generation/site) by directly comparing genome-wide sequences from children and their parents (Conrad et al. 2011; Scally and Durbin 2012). We used here both mutation rates. Similarly, for HVS-I, estimated mutation rates are highly dependent on methodologies and modes of calibration (Endicott et al. 2009). We used both the lower and the higher estimated mutation rates: the transitional changes mutation rate of μ = 5 × 10⁻⁶/generation/site (Forster et al. 1996) and the pedigree-based rate of μ = 10⁻⁵/generation/site (Howell et al. 1996; Heyer et al. 2001). We used a general time-reversible substitution model (Rodriguez et al. 1990). We assumed a generation time of 25 years, permitting the comparison with previous human population genetics studies (e.g., Chaix et al. 2008; Patin et al. 2009; Laval et al. 2010). As BEAST cannot handle recombination events, we used the largest non-recombining block within each sequence (discussed earlier).

We performed three runs of 10⁷ steps per population and per demographic model for the HVS-I sequence and three runs of 2 × 10⁸ steps (which corresponded to three runs of 10⁷ steps per locus) for the autosomal sequences. We recorded one tree every 1,000 steps, which thus implied a total of 10⁵ trees per locus and per run. We then removed the first 10% of the steps of each run (burn-in period) and combined the runs to obtain acceptable effective sample sizes (ESSs of 100 or above). The convergence of these runs was assessed using two methods: visual inspection of traces using Tracer v1.5 (Rambaut and Drummond 2007) to check for concordance between runs, and computation of Gelman and Rubin's (1992) convergence diagnostic using R v2.14.1 (R Development Core Team 2011) with the function gelman.diag available in the package coda (Plummer et al. 2006).
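The burn-in removal and the idea behind the Gelman and Rubin (1992) diagnostic can be sketched as follows. This is a simplified version of the potential scale reduction factor, without the degrees-of-freedom correction that coda's gelman.diag applies, and the chains shown are hypothetical and assumed to be of equal length.

```python
from math import sqrt
from statistics import mean, variance


def drop_burn_in(chain, fraction=0.10):
    """Discard the first `fraction` of the sampled states of one run."""
    return chain[int(len(chain) * fraction):]


def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for several equal-length chains."""
    n = len(chains[0])
    within = mean(variance(c) for c in chains)            # W
    between_over_n = variance([mean(c) for c in chains])  # B / n
    pooled = (n - 1) / n * within + between_over_n
    return sqrt(pooled / within)


# Hypothetical traces of one parameter from three runs.
run = [0.0, 1.0, 2.0, 3.0] * 5
agreeing = [drop_burn_in(run) for _ in range(3)]
disagreeing = [drop_burn_in(run), drop_burn_in([x + 10.0 for x in run])]

print(potential_scale_reduction(agreeing))
print(potential_scale_reduction(disagreeing))
```

Values close to 1 suggest that the runs sample the same distribution, whereas values well above 1, as for the shifted chain here, indicate a lack of convergence.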

To facilitate a large exploration of the parameter space for the autosomal sequences, we chose uniform priors between 0 and 0.05 for 2N0μ and between −10⁹ and 10⁹ for g/μ. For HVS-I sequences, we chose uniform priors for N0μ between 0 and 10 and for g/μ between −2.5 × 10⁶ and 2.5 × 10⁶, resulting in the same priors on N0 and g as for autosomal sequences if we assumed μ = 10⁻⁵/generation/site (i.e., N0 constrained between 0 and 10⁶ and g constrained between −1 and 1 per year). Conversely, if we assumed μ = 5 × 10⁻⁶/generation/site, it meant that N0 was constrained between 0 and 2 × 10⁶ and g was constrained between −0.5 and 0.5 per year.
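The correspondence between the prior bounds and the natural parameters can be checked numerically. The sketch below assumes that the composite parameters are the product N0μ (2N0μ for the diploid autosomal data) and the ratio g/μ, with μ per generation and a generation time of 25 years; the function name and its ploidy_factor argument are hypothetical.

```python
GENERATION_TIME_YEARS = 25


def natural_bounds(n0_mu_max, g_over_mu_max, mu, ploidy_factor=1):
    """Convert upper prior bounds on composite parameters to N0 and g per year.

    n0_mu_max     : upper bound on ploidy_factor * N0 * mu
    g_over_mu_max : upper bound on g / mu (g per generation)
    mu            : assumed mutation rate per generation per site
    """
    n0_max = n0_mu_max / (ploidy_factor * mu)
    g_max_per_year = g_over_mu_max * mu / GENERATION_TIME_YEARS
    return n0_max, g_max_per_year


# HVS-I with the pedigree-based rate, then with the transitional rate.
print(natural_bounds(10, 2.5e6, 1e-5))
print(natural_bounds(10, 2.5e6, 5e-6))
# Autosomes with the phylogenetic rate (prior set on 2 * N0 * mu).
print(natural_bounds(0.05, 1e9, 2.5e-8, ploidy_factor=2))
```

Up to floating-point rounding, the first and third calls both give an upper bound of about 10⁶ for N0 and about 1 per year for g, which is the equivalence between the HVS-I and autosomal priors described in the text; the second call gives about 2 × 10⁶ and 0.5 per year.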

For each population and model, we obtained the mode and the 95% HPD of N0 and g, inferred from their posterior distributions (supplementary tables S4 and S7, Supplementary Material online), using the add-on package Locfit (Loader 1999) in R v2.14.1. We selected the best-fitting model among the four tested demographic models by estimating marginal likelihoods using two methods: path sampling and stepping-stone sampling (Baele et al. 2012). The model with the greater marginal likelihood (supplementary table S3, Supplementary Material online) was considered as the best-fitting model.

Extended Bayesian Skyline Plots

EBSPs (Heled and Drummond 2010), also implemented in BEAST, estimate demographic changes occurring continuously through time in a population, using the time intervals between successive coalescent events. This method allows a visualization of the evolution of Ne through time. As above, we combined three runs of 10⁷ steps for mitochondrial sequences and three runs of 2 × 10⁸ steps for autosomal sequences to obtain acceptable ESS values. We assumed the same mutation rates as above and a generation time of 25 years. Outputs were analyzed with Tracer v1.5 to visually check for convergence and ESS, and also to obtain the 95% HPD interval for the number of demographic changes that occurred in the population (supplementary table S5, Supplementary Material online). A constant population size could be rejected when the 95% HPD of the number of change points excluded 0 but included 1 (Heled and Drummond 2010). Then, we used R v2.14.1 to compute Gelman and Rubin's (1992) convergence diagnostic as above, as well as to compute skyline plots. Finally, we used the population growth curves generated from BEAST to assess the time at which populations began to expand. Each skyline plot consisted of smoothed data points at approximately 10–20 generation intervals. We considered that the population increased (or decreased) when both the median and 95% HPD values for Ne increased (or decreased) between more than two successive data points. Although this method did not provide a 95% HPD interval for the inferred expansion timings, this conservative approach ensured that we considered only relevant expansion signals.
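The rule used to date expansion onsets can be sketched as below. The reading adopted here is an assumption: an expansion is called when the median and both HPD bounds of Ne all rise across at least three successive data points, and the onset is taken as the time of the first point of that run. The function name and the skyline values are hypothetical.

```python
def expansion_onset(points, min_points=3):
    """Return the time (YBP) at which a sustained rise in Ne starts, or None.

    points: list of (time_ybp, median, hpd_low, hpd_high) tuples,
            ordered from the oldest to the most recent data point.
    """
    run_start = 0
    for k in range(1, len(points)):
        prev, cur = points[k - 1], points[k]
        # The median and both HPD bounds must all increase.
        rising = all(cur[c] > prev[c] for c in (1, 2, 3))
        if not rising:
            run_start = k
        elif k - run_start + 1 >= min_points:
            return points[run_start][0]
    return None


# Hypothetical smoothed skyline: flat, then a sustained increase.
skyline = [
    (50000, 1000, 500, 2000),
    (45000, 1000, 500, 2000),
    (40000, 1200, 600, 2500),
    (35000, 1500, 700, 3000),
    (30000, 2000, 900, 4000),
]
print(expansion_onset(skyline))
```

Requiring all three curves to rise together is what makes the criterion conservative: an increase in the median alone, with a flat or widening HPD interval, is not counted as an expansion.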

Correlation Tests of Inferred Growth Rates and Isolation/Immigration Patterns

To test how differences in isolation degrees and migration patterns could have impacted our demographic inferences, we used ARLEQUIN v3.11 (Excoffier et al. 2005) to compute population-specific FST values (Weir and Hill 2002) from HVS-I data for Central Africa, Eurasia, and Central Asia (supplementary table S9, Supplementary Material online) and to estimate immigration rates from mismatch distributions under a spatially explicit model (Excoffier 2004) (supplementary table S10, Supplementary Material online). We then performed Spearman tests using R v2.14.1 to investigate, for each region, how the inferred parametric growth rates were correlated with those FST values and immigration rates. We used a value of 0 for the growth rate when the constant model best fitted the data.
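A Spearman correlation of this kind can be sketched without external libraries. The values below are hypothetical, with a growth rate of 0 assigned to populations best fitted by the constant model, and the sketch computes only the coefficient, not its P value.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical population-specific FST values and inferred growth rates.
fst = [0.01, 0.05, 0.10, 0.20]
growth = [0.004, 0.003, 0.0, 0.0]  # 0 where the constant model fitted best
print(spearman_rho(fst, growth))
```

Setting the growth rate to 0 for constant-model populations creates ties, which is why average ranks are used rather than the shortcut formula based on squared rank differences.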

Supplementary Material

Supplementary figures S1 and S2 and tables S1–S12 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors would like to warmly thank all volunteer participants. They also thank Michael Fontaine for his help on some data analyses, Phillip Endicott and Julio Bendezu-Sarmiento for insightful discussions, Alexei Drummond for helpful discussions, and Friso Palstra for his help on English usage. They thank Laurent Excoffier, two anonymous reviewers, and the editor for helpful comments and suggestions. All computationally intensive analyses were run on the Linux cluster of the Muséum National d'Histoire Naturelle (administrated by Julio Pedraza) and on the web-based portal "Bioportal" (Kumar et al. 2009). This work was supported by Actions Transversales du Muséum, Muséum National d'Histoire Naturelle (grant "Les relations Sociétés-Natures dans le long terme") and Agence Nationale de la Recherche (grants "Altérité culturelle" ANR-10-ESVS-0010 and "Demochips" ANR-12-BSV7-0012). C.A. was financed by a PhD grant from the Centre National de la Recherche Scientifique.

References

1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.

Ambrose SH. 2001. Paleolithic technology and human evolution. Science 291:1748–1753.

Arenas M, François O, Currat M, Ray N, Excoffier L. 2013. Influence of admixture and Paleolithic range contractions on current European diversity gradients. Mol Biol Evol 30:57–61.

Atkinson QD, Gray RD, Drummond AJ. 2009. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa. Proc R Soc Lond B Biol Sci 276:367–373.

Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol 29:2157–2167.

Balaresque PL, Ballereau SJ, Jobling MA. 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum Mol Genet 16:R134–R139.

Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. 2011. Insights into the demographic history of African pygmies from complete mitochondrial genomes. Mol Biol Evol 28:1099–1110.

Beaumont MA. 2004. Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92:365–379.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 57:289–300.

Bermisheva M, Salimova A, Korshunova T, Svyatova G, Berezina G, Villems R, Khusnutdinova E. 2002. Mitochondrial DNA diversity in the populations of Middle Asia and Northern Caucasus. Eur J Hum Genet 10:179–180.

Bocquet-Appel JP. 2011. When the world's population took off: the springboard of the Neolithic Demographic Transition. Science 333:560–561.

Bocquet-Appel J-P, Bar-Yosef O. 2008. The Neolithic demographic transition and its consequences. Dordrecht (The Netherlands): Springer.


Chaix R, Austerlitz F, Hegay T, Quintana-Murci L, Heyer E. 2008. Genetic traces of east-to-west human expansion waves in Eurasia. Am J Phys Anthropol 136:309–317.

Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, Heyer E. 2007. From social to genetic structures in central Asia. Curr Biol 17:43–48.

Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. 2009. The Last Glacial Maximum. Science 325:710–714.

Conrad DF, Keebler JEM, DePristo MA, et al. (17 co-authors). 2011. Variation in genome-wide mutation rates within and between human families. Nat Genet 43:712–714.

Derenko MV, Malyarchuk BA, Dambueva IK, Shaikhaev GO, Dorzhu CM, Nimaev DD, Zakharov IA. 2000. Mitochondrial DNA variation in two South Siberian aboriginal populations: implications for the genetic history of North Asia. Hum Biol 72:945–973.

Dirksen VG, van Geel B. 2004. Mid to late Holocene climate change and its influence on cultural development in South Central Siberia. Nato Sci S Ss IV Ear 42:291–307.

Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214.

Endicott P, Ho SYW, Metspalu M, Stringer C. 2009. Evaluating the mitochondrial timescale of human evolution. Trends Ecol Evol 24:515–521.

Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol 13:853–864.

Excoffier L, Heckel G. 2006. Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet 7:745–758.

Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online 1:47–50.

Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A 109:E2569–E2576.

Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet 59:935–945.

Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.

Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.

Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci 7:457–511.

Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol 27:570–580.

Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Ségurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet 10:49.

Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol 21:597–612.

Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet 63:1113–1126.

Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res 11:423–434.

Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet 59:501–509.

Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.

Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med 116:68–73.

International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.

Kingman JFC. 1982. The coalescent. Stoch Proc Appl 13:235–248.

Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet 73:671–676.

Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet 2:e53.

Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.

Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.

Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.

Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.

Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A 107:19201–19206.

Loader C. 1999. Local regression and likelihood. New York: Springer.

Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A 103:9381–9386.

Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.

Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.

Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human populations on a global scale. Mol Biol Evol 10:927–943.

Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.

Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet 29:20–21.

Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet 6:165–183.

Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev 16:1125–1133.

Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet 5:e1000448.

Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.

Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.

Plummer M, Best N, Cowles K, Vines K. 2006. CODA: convergence diagnosis and output analysis for MCMC. R News 6:7–11.

Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.

Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet 74:827–845.

Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A 105:1596–1601.


R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.

Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer

Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol 19:2092–2100.

Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol 20:76–86.

Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet 67:1251–1276.

Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol 142:485–501.

Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.

Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet 13:745–53.

Ségurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet 21:1146–1151.

Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.

Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet 25:207–217.

Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

Tremblay M, Vézina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet 66:651–658.

Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol 21:559–566.

Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol 19:312–318.

Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture, and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol 30:918–937.

Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A 102:18508–18513.

Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet 36:721–750.

Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.

Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet 70:635–651.


Page 9: Human Genetic Data Reveal Contrasting Demographic Patterns between Sedentary and Nomadic Populations That Predate the Emergence of Farming

negative correlation between FST values and inferred growth rates for sedentary farmers in Eurasia (ρ = −0.673, P value = 0.028) (fig. 5C) and Central Asia (ρ = −0.773, P value = 0.003) (fig. 5D), thus meaning that less isolated populations showed higher inferred growth rates. There was no significant correlation for Central Asian herders (ρ = −0.092, P value = 0.736) (fig. 5E). Note that this analysis could not be performed for the other Eurasian herder populations, as the constant model best fitted the data with the parametric method.

The estimation of the proportion of immigrants did not converge for 11 Eurasian populations (Han Chinese, Liaoning, Qingdao, Palestinians, Pathans, Mongols, as well as three Central Asian farmer populations and two Central Asian herder populations). Regarding the other populations, we found no significant difference in the proportion of immigrants between farmers and herders, both in Central Asia (mean[farmers] = 82868, mean[herders] = 260265, P value = 0.12) and in the rest of Eurasia (mean[farmers] = 576757, mean[herders] = 201311, P value = 0.51) (supplementary table S10, Supplementary Material online). We also found no significant correlation between this proportion and the inferred growth rates for Eurasian farmers (|ρ| = 0.238, P value = 0.48), Central Asian farmers (|ρ| = 0.386, P value = 0.30), and Central Asian herders (|ρ| = 0.42, P value = 0.139) (supplementary fig. S2, Supplementary Material online).

Discussion

In this study, using a large set of populations from distant geographic areas, we report contrasted demographic histories that correlate with lifestyle. Moreover, the inferred expansion signals in both African and Eurasian farmer and herder populations predated the Neolithic transition and the sedentarization of these populations.

Contrasted Demographic Histories in Sedentary and Nomadic Populations

For Africa, both mtDNA and autosomal data revealed expansion patterns in most sedentary farmer populations, as indicated by neutrality tests and by the parametric and nonparametric BEAST methods. Conversely, we found constant effective population sizes (or possibly contraction events) for all hunter-gatherer populations. Among the farmers, results were least clear for the Yoruba and the Ewondo populations: no neutrality test was significant for these populations, whereas they showed evidence of expansion events when analyzed with BEAST. This indicates that these populations may have undergone weaker expansion dynamics (i.e., lower growth rates and Ne) than the others. These remarkable results are of particular importance for the Yoruba, as it is a reference population in many databases (HapMap, 1000 Genomes). They also demonstrate the higher sensitivity of MCMC methods such as BEAST for detecting expansions in comparison to neutrality tests.

The contrasted patterns inferred between sedentary and nomadic populations in Africa suggest strong differences between the demographic histories of these two groups of populations. The question is whether this pattern results mostly from differences in local expansion dynamics or whether spatial expansion processes at a larger scale were also involved. As shown by Ray et al. (2003), negative values for the neutrality tests will be observed in a spatial expansion process if the rate of migrants (Nm) is high enough (at least 20), but not otherwise. As in previous studies (e.g., Verdu et al. 2013), we report a higher degree of isolation (higher population-specific FST values) in hunter-gatherer populations than in farmer populations. Using the spatial expansion model of Excoffier (2004) also leads to higher estimates of the number of immigrants into farmer populations. Thus, both farmers and hunter-gatherers may have been subject to a spatial expansion process, but the limited number of migrants among hunter-gatherers may have resulted in an absence of expansion signals for them. This would be consistent with the positive correlation that we observe between the growth rates estimated with BEAST and the inferred number of immigrants in the sedentary farmer populations. However, this spatial expansion process seems unlikely to completely explain the strong association that we observed between lifestyle and expansion patterns, as some farmer populations (Teke, Gabonese Fang) displayed FST values similar to those of hunter-gatherers but a clear signal of expansion with relatively high growth rates. This suggests that even rather isolated farmer populations show a substantial level of expansion. Moreover, FST values and inferred growth rates in farmer populations were not significantly correlated. Therefore, our results suggest that the expansion patterns observed in sedentary populations do not result only from a spatial expansion pattern. In addition, local dynamics connected with the higher capacity of food production by farmers also explain their much stronger expansion signatures relative to their neighboring hunter-gatherer populations.

For Eurasia, when considering the mtDNA data, all three methods (neutrality tests, parametric BEAST analyses, and EBSPs) yielded expansion signals for all sedentary farmer populations except Koreans. Conversely, for nomadic herders, only EBSPs and Fu's Fs test showed expansion signals; the parametric BEAST method and the other neutrality tests did not. This result points toward weaker expansion dynamics in herders than in farmers, as supported also by the tendency for lower growth rates and Ne in herder populations than in farmer populations on the EBSP graphs (fig. 2C and D). It thus seems that the flexibility and nonparametric nature of EBSP analyses allow one to detect weaker expansion events than the parametric method. Moreover, Fu's Fs is known to be more sensitive than the other neutrality tests for detecting expansions (Ramos-Onsins and Rozas 2002). Again, these inferred expansions may result, at least in part, from spatial expansion processes. The population-specific FST values are indeed rather low in Eurasia. Moreover, we found a significant negative correlation between FST values and inferred growth rates for the sedentary farmers, indicating that less isolated populations showed stronger expansion signals. However, although we inferred much stronger expansion patterns for the farmers than for the herders, we did not observe any differences in Eurasia between the farmers and the herders in the population-specific FST values or in the estimated number of


Demographic Patterns between Sedentary and Nomadic Populations doi101093molbevmst156 MBE

immigrants, suggesting that spatial processes alone cannot explain the strong difference that we observed between the expansion patterns of these two groups of populations. This indicates that the intrinsic demographic growth patterns differ between these two kinds of populations, the farmers showing much higher growth rates than the herders.

To our knowledge, although other studies have found different patterns between hunter-gatherers and farmers (e.g., Verdu et al. 2009), our study is the first to show differences between farmers and herders, the two major post-Neolithic human groups. A plausible explanation could be that nomadic herders and hunter-gatherers share several of the constraints of a nomadic way of life. For instance, birth intervals are generally longer (at least 4 years) in nomadic populations than in sedentary populations (e.g., Short 1982). According to Bocquet-Appel (2011), these longer birth intervals may be mainly determined by diet differences. Indeed, Valeggia and Ellison (2009) demonstrated that birth interval is mainly determined by the rapidity of postpartum energy recovery, which may be increased by the consumption of high-carbohydrate food (like cereals). Moreover, the nomadic herder way of life may offer less food security than sedentary farming, the latter facilitating efficient long-term food storage.

However, unlike in Africa, we did not find systematically consistent patterns between the autosomal and mtDNA data in Eurasia. The possible contraction events that our results suggest for two sedentary populations (Japanese and Danes) with autosomes appeared concomitant with historical events that could have led to bottleneck processes. For the Japanese population, this contraction signal could indeed result from a founder effect due to the Paleolithic colonization of Japan by a subset of the Northern Asiatic people (especially from Korea; Nei 1995). Similarly, a bottleneck process may also have occurred in the Danish population, linked with the last glacial maximum occurring between 26,500 and 19,500 YBP (Clark et al. 2009). Reasons why these processes impacted the autosomes but not the mtDNA data remain to be determined, for instance through simulation studies. In any case, our study clearly emphasizes the utility of combining mtDNA and autosomal sequences, as they allow access to different aspects of human history. A recent study on harbor porpoises has similarly shown that nuclear markers were sensitive to a recent contraction event, whereas mtDNA allowed inferring a more ancient expansion (Fontaine et al. 2012).

Interestingly, Central Asia displayed a distinct pattern from the rest of Eurasia. Indeed, we did not infer higher expansion rates for sedentary farmers than for nomadic herders in that area. This could result from harsh local environmental conditions due to the arid continental climate in this area. Indeed, using pollen records, Dirksen and van Geel (2004) showed that the paleoclimate in Central Asia was very arid from at least 12,000 to 3,000 YBP, which could have limited the amount of suitable areas for farming and impacted human demography. Spatial expansion processes may also have played a role in this difference, as population-specific FST values were higher for the farmers than for the herders. This may indicate that more migrants were involved in the spatial expansion process for the herders than for the farmers, yielding a weaker expansion signal (i.e., lower inferred growth rate) for the latter (Ray et al. 2003). This is supported by the negative correlation between the FST values and the inferred growth rates in the farmer populations. The Korean population also stood out as an exception in Eurasia. Even though it is a population of sedentary farmers, it showed no significant expansion signal with either the parametric or the nonparametric method with HVS-I. This could be explained by a later sedentarization of this population. The Korean Neolithic is notably defined by the introduction of Jeulmun ware ceramics about 8,000 YBP, but the people of the Jeulmun period were still predominantly semi-nomadic fishers and hunter-gatherers until about 3,000 YBP, when Koreans started an intensive crop production implying a sedentary lifestyle (Nelson 1993).

Inferred Expansion Signals Predate the Emergence of Farming

EBSP analyses revealed that the inferred expansion events in farmer and herder populations were more ancient than the emergence of farming and herding. Therefore, the differences in demographic patterns between farmers and herders seem to predate their divergence in lifestyle, which raises the question of the chronology of demographic expansions and the Neolithic transition. These findings appear to be quite robust to the choice of the scaling parameters. We used here both the lower and the higher mutation rate estimates in humans for autosomes (Pluzhnikov et al. 2002; Conrad et al. 2011) and for the HVS-I sequence (Forster et al. 1996; Howell et al. 1996). Despite this uncertainty in mutation rates, which leads to a 2-fold uncertainty in our time estimates, the inferred expansion signals predated the emergence of agriculture in both cases for all populations. Similarly, using a generation time of 29 years (Tremblay and Vézina 2000) instead of 25 years leads to slightly more ancient estimates and thus does not change our conclusions (data not shown). However, note that for HVS-I, using the higher bound of the credibility interval for the highest estimated mutation rate (2.75 × 10⁻⁵/generation/site; Heyer et al. 2001) instead of the mean value (i.e., 10⁻⁵/generation/site) leads to expansion time estimates consistent with the Neolithic transition in Eurasian populations (supplementary table S11, Supplementary Material online). Nevertheless, these estimates still clearly predated the Neolithic for the African populations. However, 10⁻⁵/generation/site is by far the highest estimation of mutation rate in the literature (Howell et al. 1996). To infer Neolithic expansions in most Eurasian populations, one needs to assume a mutation rate of at least 2 × 10⁻⁵/generation/site, which is much higher than other estimates from the literature and is thus probably unrealistic. Moreover, our method for determining the expansion onset time using EBSP graphs is very conservative and also tends to favor the lower bound of expansion onset times. Finally, for autosomes, similarly using 4.74 × 10⁻⁸/generation/site instead of 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002) leads to an inferred expansion onset time that is not compatible with the Neolithic transition for all Eurasian and African populations except for one African population, the Yoruba (supplementary table S12, Supplementary Material online). Consequently, it seems very likely that the expansions inferred in this study correspond to Paleolithic rather than Neolithic demographic events, in agreement also with most previous studies, as detailed later.
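The dependence of the dating on the scaling parameters can be illustrated with a short sketch. It assumes, as a simplification, that a date was first obtained in years under one per-generation mutation rate and one generation time, and that only these two scaling parameters change; the function name `rescale_time` and the 30,000-year input are illustrative choices, not values taken from the supplementary tables.

```python
def rescale_time(t_years, mu_old, mu_new, gen_old=25.0, gen_new=25.0):
    """Rescale a date (in years) from one (mutation rate, generation time)
    pair to another. Mutation rates are per generation per site."""
    # Convert the date back to generations under the original generation time.
    t_generations = t_years / gen_old
    # Time in mutational units is fixed by the data, so the number of
    # generations scales inversely with the assumed mutation rate.
    t_generations_new = t_generations * mu_old / mu_new
    # Convert to years with the new generation time.
    return t_generations_new * gen_new


# Halving the HVS-I rate (1e-5 -> 5e-6) doubles the age of an event.
older = rescale_time(30_000, mu_old=1e-5, mu_new=5e-6)

# A 29-year generation time instead of 25 years makes dates slightly older.
longer_generation = rescale_time(30_000, mu_old=1e-5, mu_new=1e-5, gen_new=29.0)

print(round(older), round(longer_generation))
```

Under these assumptions, a 2-fold difference in mutation rate translates directly into a 2-fold difference in dates, which is the uncertainty range discussed above.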

In Africa, the emergence of agriculture has been dated between 5,000 and 4,000 YBP in the Western part of Central Africa, and agriculture subsequently expanded rapidly to the rest of sub-Saharan Africa (Phillipson 1993). However, using HVS-I, we showed expansion events in farmer populations since about 30,000 or 60,000 YBP, thus largely predating the emergence of agriculture in the area. Similarly, using autosomes, especially in Eastern African populations, we inferred expansion signals that clearly predated the Neolithic. Notably, we inferred an expansion signal for Mozambicans since at least 80,000 YBP. Several genetic studies have already highlighted that expansion events occurred in African farmers before the Neolithic transition (e.g., Atkinson et al. 2009; Laval et al. 2010; Batini et al. 2011). This finding is also consistent with paleoanthropological data (i.e., radiocarbon dating) suggesting an expansion event in Africa 60,000–80,000 YBP (Mellars 2006a). This Paleolithic demographic expansion could be linked to a rapid environmental change toward a dryer climate (Partridge et al. 1997) and/or to the emergence of new hunting technologies (Mellars 2006a).

According to Mellars (2006a), this period corresponds to a major increase in the complexity of the technological, economic, social, and cognitive behavior of certain African groups. It corresponds in particular to the emergence of projectile technologies (Shea 2009), which was probably part of a broader pattern of ecological diversification of early Homo sapiens populations. These changes could have been decisive for the human spread "Out of Africa" during the same period and could have ultimately also led to the sedentarization of the remaining populations. This inference is consistent with Sauer's (1952) demographic theory, which stated that late Paleolithic demographic expansions could have favored the sedentarization and the emergence of agriculture in some human populations. In the case of Central Africa, the period of 60,000 YBP corresponds to the separation between the ancestors of hunter-gatherers and farmers (Patin et al. 2009; Verdu et al. 2009). Thus, these two groups may have presented contrasting demographic patterns since their divergence. Much later, higher expansion rates and larger population sizes among farmers' ancestors may have induced the emergence of agriculture and sedentarization.

With respect to Eurasia, the expansion profiles inferred with HVS-I for all populations, and with autosomes for the Han Chinese population, also seem to have begun during the Paleolithic, thus before the Neolithic transition. Some genetic studies already reported pre-Neolithic expansions in Asia and Europe (e.g., Chaix et al. 2008). Notably, using mismatch and intermatch distributions, Chaix et al. (2008) showed an east-to-west Paleolithic expansion wave in Eurasia. We found a similar pattern here, as the inferred expansions of East-Asian populations were earlier than those of Central Asian populations, themselves earlier than those of European populations.

Moreover, we found this pattern in both sedentary farmer and nomadic herder populations. Thus, the ancestors of currently nomadic herder populations also experienced these Paleolithic expansions. However, Paleolithic expansion signals seem weaker in nomadic populations than in sedentary populations. This is again compatible with the demographic theory of the Neolithic sedentarization (Sauer 1952): some populations may have experienced more intense Paleolithic expansions, which may have led ultimately to their sedentarization.

The inferred Paleolithic expansion signals might result partly from spatial expansions out of some refuge areas after the Last Glacial Maximum (LGM; 26,500–19,500 YBP; Clark et al. 2009), as this time interval matches our inferred dating for expansion onsets in East Asia with HVS-I using the pedigree-based mutation rate, and in Europe and the Middle East using the transitional mutation rate. Some of the earlier date estimates might also be consistent with the out-of-Africa expansion of H. sapiens. However, the radiocarbon-based time estimates of the spread of H. sapiens in Eurasia are generally more ancient than our inferred expansion onset timings. For instance, Mellars (2006b) dated the colonization of the Middle East by H. sapiens at 47,000–49,000 YBP and of Europe at 41,000–42,000 YBP. Pavlov et al. (2001) report traces of modern human occupation nearly 40,000 years old in Siberia. Finally, Liu et al. (2010) described modern human fossils from South China dated to at least 60,000 YBP. Moreover, out-of-Africa or post-LGM expansions would not explain our finding of an east-to-west gradient of expansion onset timing, which rather supports the hypothesis of a demographic expansion diffused from east to west in Eurasia in a demic (i.e., migrations of individuals) or cultural (favored by the diffusion of new technologies) manner.

Possible Confounding Factors

Our approach makes the assumption that populations are isolated and panmictic, which is questionable for human populations. However, we analyzed a large set of populations sampled in very distant geographical regions (i.e., Central Africa, East Africa, Europe, Middle East, Central Asia, Pamir, Siberia, and East Asia). The main conclusions of this study rely on consistent patterns between most of these areas, and it seems unlikely that processes such as admixture could have biased the estimates similarly everywhere. Moreover, in Central Africa, several studies have shown that hunter-gatherer populations show signals of admixture, whereas this is not the case for farmer populations (Patin et al. 2009; Verdu et al. 2009, 2013). If this introgression had been strong enough, it may have yielded a spurious expansion signal in the hunter-gatherer populations, which is not what we observed here. In Europe, spatial expansion processes during the Neolithic may have led to admixture with Paleolithic populations. As pointed out by a simulation study (Arenas et al. 2013), this may lead to a predominance of the Paleolithic gene pool. This may be one of the factors explaining why we observed mostly Paleolithic expansions here.


Similarly, potential selection occurring on the whole mitochondrial genome (e.g., Pakendorf and Stoneking 2005) seems unlikely to have impacted all the studied populations within each group in the same way (e.g., stronger positive selection on sedentary than on nomadic populations), as we analyzed different nomadic and sedentary populations living near each other in several geographically distant areas.

Regarding the potential effects of recombination on the inferences from autosomal data, we found that neutrality tests gave similar results on the whole sequences, when using a simulation procedure that took the known recombination rate of each sequence into account (table 1), and on the largest non-recombining blocks, as inferred with IMgc, without taking recombination into account in the simulation process (supplementary table S12, Supplementary Material online). It thus appears unlikely that the BEAST analyses, which can only handle the largest inferred non-recombining blocks, are biased because of this.

Finally, note that the effective population sizes inferred using BEAST correspond to the Ne of the populations during their recent history, rather than a value of Ne averaged over the history of the population. This explains the finding that, for most populations, we inferred Ne estimates much higher than generally assumed for humans by population geneticists (about 10,000).

Material and Methods

Genetic Markers

Autosomal Sequences

We used data from 20 noncoding, a priori neutral, and unlinked autosomal regions selected by Patin et al. (2009) to be at least 200 kb away from any known or predicted gene, to not be in linkage disequilibrium (LD) either with each other or with any known or predicted gene, and to have a region of homology with the chimpanzee genome. These regions are on average 1,253 bp long. Using the four-gamete test (Hudson and Kaplan 1985) as implemented in IMgc online (Woerner et al. 2007), we identified recombination events for 6 of these 20 regions. As some methods used in this study cannot handle recombination, we retained for these six sequences the largest non-recombining block inferred by IMgc. Because of this reduction, the 20 regions used were on average 1,228 bp long. To identify potential bias related to this method (e.g., some recombination events may not be detected using the four-gamete test; larger blocks of non-recombining sequence may select for gene trees that are shorter than expected), we computed the summary statistics and performed neutrality tests (discussed later) both on the whole sequences (table 1) and on the largest non-recombining blocks (supplementary table S12, Supplementary Material online).

Mitochondrial Sequences

We used the first hypervariable segment of the mitochondrial control region (HVS-I), sequenced between positions 16067 and 16383, excluding the hypervariable poly-C region (sites 16179–16195). The total length of the sequence was thus 300 bp.

Population Panel

For Africa, we used the autosomal sequence data set of Patin et al. (2009), which consists of five farmer populations (N = 118 individuals) and five Pygmy hunter-gatherer populations (N = 95). In addition, we used the HVS-I data set from Quintana-Murci et al. (2008), which consists of nine Central African farmer populations (N = 486) and seven Central African hunter-gatherer populations (N = 318) (supplementary table S1, Supplementary Material online).

For Eurasia, we used the autosomal sequence data set of Laval et al. (2010), consisting of 48 individuals from two East-Asian populations (Han Chinese and Japanese) and 47 individuals from two European populations (Chuvash and Danes). We also used the data of Ségurel et al. (2013) from 48 individuals from one sedentary Central Asian population (Tajik farmers) and 48 individuals from one nomadic Central Asian population (Kyrgyz herders). For HVS-I, we analyzed data from 17 Eurasian populations (N = 494 in total), located from Eastern to Western Eurasia and belonging to several published data sets (Derenko et al. 2000; Richards et al. 2000; Bermisheva et al. 2002; Imaizumi et al. 2002; Yao et al. 2002; Kong et al. 2003; Quintana-Murci et al. 2004; supplementary table S1, Supplementary Material online).

For our detailed study of Central Asia, we used HVS-I sequences from 12 farmer populations (N = 408 in total) and 16 herder populations (N = 567 in total). These data come from the studies by Chaix et al. (2007) and Heyer et al. (2009) for 25 populations (supplementary table S1, Supplementary Material online). The other populations (KIB, TAB, and TKY) were sequenced for this study. As in Chaix et al. (2007) and Heyer et al. (2009), DNA was extracted from blood samples using standard protocols, and the sequence quality was ensured as follows: each base pair was determined once with a forward and once with a reverse primer; any ambiguous base call was checked by additional and independent PCR and sequencing reactions; all sequences were examined by two independent investigators. All sampled individuals were healthy donors from whom informed consent was obtained. The study was approved by appropriate Ethics Committees and scientific organizations in all countries where samples have been collected.

Demographic Inferences from Sequences Analysis

Summary Statistics and Neutrality Tests

We computed classical summary statistics (number of polymorphic sites S, number of haplotypes K) and four neutrality tests (Tajima's [1989] D, Fu and Li's [1993] D and F, and Fu's [1997] Fs) on both mitochondrial and autosomal sequences. Although neutrality tests were originally designed to detect selective events, they also give information about demographic processes, especially when applied to neutral markers. Indeed, expansion events lead to more negative values than expected in the absence of selective and demographic processes. Conversely, contraction events lead to more positive values of the neutrality tests. For HVS-I sequences, we computed all summary statistics and neutrality tests and tested their departure from neutrality using the coalescent-based tests provided in DnaSP (Librado and Rozas 2009).
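As an illustration of why expansions push neutrality statistics toward negative values, the following minimal sketch computes Tajima's (1989) D from a set of aligned sequences. It is not the DnaSP implementation; the function name and the toy alignments are hypothetical, and sites with gaps or missing data are not handled.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (no gaps assumed)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    # S: number of segregating (polymorphic) sites.
    s = sum(1 for i in range(len(seqs[0])) if len({q[i] for q in seqs}) > 1)
    if s == 0:
        return float("nan")  # D is undefined without polymorphism
    # pi: mean number of pairwise differences.
    n_pairs = n * (n - 1) / 2
    pi = sum(sum(a != b for a, b in zip(x, y))
             for x, y in combinations(seqs, 2)) / n_pairs
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Excess of singletons, as expected after an expansion: D is negative.
star_like = ["AAAAA", "TAAAA", "ATAAA", "AATAA", "AAATA", "AAAAT"]
# Two deeply divergent haplotypes at equal frequency: D is positive.
split = ["AAAA", "AAAA", "AAAA", "TTTT", "TTTT", "TTTT"]

print(tajimas_d(star_like) < 0, tajimas_d(split) > 0)
```

The sign of D depends on whether the mean pairwise diversity is smaller or larger than the diversity expected from the number of segregating sites, which is what makes it informative about expansions and contractions.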

For the autosomal sequences, we used the procedure developed in Laval et al. (2010), which combines all autosomal sequences into a single test. This procedure consists in computing the mean value of each summary statistic across the 20 loci and in testing whether this mean value departs significantly from its expectation under neutrality in a constant-size population model, using a simulation procedure. For a given population with sample size n, we produced 10⁵ simulated samples of the same size n under a constant population size model, using the generation-per-generation coalescent-based algorithm implemented in SIMCOAL v2 (Laval and Excoffier 2004). Each simulated individual was constituted by 20 independent sequences of 1,253 bp (the average sequence length for the real data). Then, we used ARLEQUIN v3 (Excoffier et al. 2005), as modified by Laval et al. (2010), to compute the summary statistics on these simulated samples. We assessed whether the observed statistics differed significantly from the constant population model under neutrality by comparing these statistics with their null distribution obtained from the simulated data. We used gamma-distributed mutation rates with a mean value of 2.5 × 10⁻⁸/generation/site (95% confidence interval: 1.476 × 10⁻⁸–4.036 × 10⁻⁸), in agreement with previous studies (Pluzhnikov et al. 2002; Voight et al. 2005). This procedure yielded a P value for the significance of the departure from a constant-size model. We performed this procedure both on the whole sequences and on the largest non-recombining blocks. For the whole sequences (i.e., including recombination), we performed the data simulation under a coalescent model with recombination, using for each locus the recombination rate provided by the HapMap build GRCh37 genetic map (International HapMap Consortium 2003) (supplementary table S8, Supplementary Material online), whereas for the largest non-recombining blocks, we used a coalescent model without recombination.
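The comparison of an observed multilocus mean with its simulated null distribution can be sketched as follows. This is only an illustration of the general idea: the text does not state whether one-sided or two-sided P values were used, so the `tail` argument is an assumption, as is the "+1" correction (a common convention that avoids P values of exactly zero). The numerical values are invented.

```python
from statistics import mean


def empirical_p_value(observed_mean, simulated_means, tail="lower"):
    """One-sided empirical P value of an observed statistic against a
    null distribution obtained by simulation."""
    if tail == "lower":      # e.g., testing for an expansion (negative values)
        extreme = sum(1 for v in simulated_means if v <= observed_mean)
    elif tail == "upper":    # e.g., testing for a contraction (positive values)
        extreme = sum(1 for v in simulated_means if v >= observed_mean)
    else:
        raise ValueError("tail must be 'lower' or 'upper'")
    return (extreme + 1) / (len(simulated_means) + 1)


# Hypothetical per-locus Tajima's D values, averaged across loci.
per_locus_d = [-1.2, -0.8, -1.5, -0.3]
observed = mean(per_locus_d)

# Hypothetical means computed on samples simulated under a constant-size model.
null_means = [-0.4, 0.1, -0.2, 0.3, 0.0, -0.1, 0.2, -0.3, 0.4]

print(empirical_p_value(observed, null_means, tail="lower"))
```

In the actual procedure, the null values would be the means computed across 20 simulated loci for each of the 10⁵ simulated samples.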

For both autosomes and HVS-I sequences, we adjusted the obtained P values for each neutrality test using a false discovery rate (FDR) correction (Benjamini and Hochberg 1995) in R v2.14.1 (R Development Core Team 2011), in order to take into account the increased error probability in the case of multiple testing.
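For readers unfamiliar with the Benjamini and Hochberg (1995) correction, a minimal Python sketch of the adjustment is given below. The analyses described here were run in R; this stand-alone function is only an illustration, and the input P values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P values from one neutrality test across four populations.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Each raw P value is multiplied by the number of tests divided by its rank, and the result is capped so that adjusted values never decrease as raw values increase.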

MCMC Estimations of Demographic Parameters

We used the MCMC algorithm implemented in BEAST v1.6 (Drummond and Rambaut 2007). We tested the four demographic models implemented in this software: constant effective population size (N0) (constant model); population expansion with an increasing growth rate (g) (exponential model); population expansion with a decreasing growth rate (g) (logistic model); and the expansion model, in which N0 is the present-day population size, N1 the population size that the model asymptotes to going into the distant past, and g the exponential growth rate that determines how fast the transition is from near the N1 population size to the N0 population size. In fact, BEAST estimates composite parameters for each model, namely N0μ and g/μ, where N0 is the current effective population size, g the growth rate, and μ the mutation rate. In addition, for the expansion model, the ratio between the current (N0) and ancestral (N1) effective population size is also estimated. To infer N0 and g from these composite parameters, we needed to assume a value for the mutation rate. However, there is no consensus on mutation rates in humans in the literature, as different methods lead to different estimations. For autosomes, the most commonly used value is the phylogenetic rate of μ = 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002). However, recent studies based on the 1000 Genomes Project (1000 Genomes Project Consortium 2012) have found a 2-fold lower rate (μ = 1.2 × 10⁻⁸/generation/site) by directly comparing genome-wide sequences from children and their parents (Conrad et al. 2011; Scally and Durbin 2012). We used here both mutation rates. Similarly, for HVS-I, estimated mutation rates are highly dependent on methodologies and modes of calibration (Endicott et al. 2009). We used both the lower and the higher estimated mutation rates: the transitional changes mutation rate of μ = 5 × 10⁻⁶/generation/site (Forster et al. 1996) and the pedigree-based rate of μ = 10⁻⁵/generation/site (Howell et al. 1996; Heyer et al. 2001). We used a general time-reversible substitution model (Rodriguez et al. 1990). We assumed a generation time of 25 years, permitting the comparison with previous human population genetics studies (e.g., Chaix et al. 2008; Patin et al. 2009; Laval et al. 2010). As BEAST cannot handle recombination events, we used the largest nonrecombining block within each sequence (discussed earlier).

We performed three runs of 10⁷ steps per population and per demographic model for the HVS-I sequence and three runs of 2 × 10⁸ steps (which corresponded to three runs of 10⁷ steps per locus) for the autosomal sequences. We recorded one tree every 1,000 steps, which thus implied a total of 10⁵ trees per locus and per run. We then removed the first 10% of the steps of each run (burn-in period) and combined the runs to obtain acceptable effective sample sizes (ESSs of 100 or above). The convergence of these runs was assessed using two methods: visual inspection of traces using Tracer v1.5 (Rambaut and Drummond 2007) to check for concordance between runs, and computation of Gelman and Rubin's (1992) convergence diagnostic using R v2.14.1 (R Development Core Team 2011) with the function gelman.diag available in the package coda (Plummer et al. 2006).
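The logic of the burn-in removal and of the Gelman and Rubin (1992) diagnostic can be sketched as below. This is a simplified version of the potential scale reduction factor; the `gelman.diag` function in coda applies an additional correction for sampling variability that is not reproduced here. Function names and the toy chains are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def discard_burn_in(chain, fraction=0.10):
    """Drop the first `fraction` of the sampled states of one run."""
    return chain[int(len(chain) * fraction):]


def gelman_rubin(chains):
    """Simplified potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    # W: mean within-chain variance; B: between-chain variance (scaled by n).
    w = mean([variance(c) for c in chains])
    b = n * variance(chain_means)
    pooled = (n - 1) / n * w + b / n
    return sqrt(pooled / w)


# Three toy runs sampling the same region give a value close to 1 ...
agreeing = [[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0], [3.0, 4.0, 1.0, 2.0]]
# ... whereas runs stuck in different regions give a value far above 1.
diverging = [[1.0, 2.0, 3.0, 4.0], [101.0, 102.0, 103.0, 104.0]]

print(gelman_rubin(agreeing), gelman_rubin(diverging))
```

Values close to 1 indicate that independent runs sample the same posterior distribution, which is the concordance between runs that the trace inspection also checks visually.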

To facilitate a large exploration of the parameter space, for the autosomal sequences, we chose uniform priors between 0 and 0.05 for 2N0μ and between −10⁹ and 10⁹ for g/μ. For HVS-I sequences, we chose uniform priors for N0μ between 0 and 10 and for g/μ between −2.5 × 10⁶ and 2.5 × 10⁶, resulting in the same priors on N0 and g as for autosomal sequences if we assumed μ = 10⁻⁵/generation/site (i.e., N0 constrained between 0 and 10⁶ and g constrained between −1 and 1 per year). Conversely, if we assumed μ = 5 × 10⁻⁶/generation/site, N0 was constrained between 0 and 2 × 10⁶ and g was constrained between −0.5 and 0.5 per year.
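The correspondence between the prior bounds on the composite parameters and the bounds on N0 and g can be checked with a few lines of arithmetic. The sketch below assumes that the composite parameters are N0μ (2N0μ for autosomes) and g/μ, with μ per generation and a generation time of 25 years; the function name and the `copies` argument are illustrative.

```python
GENERATION_TIME_YEARS = 25.0


def natural_upper_bounds(n0_mu_max, g_over_mu_max, mu, copies=1):
    """Convert upper prior bounds on (copies * N0 * mu, g / mu) into upper
    bounds on N0 and on g per year. `copies` is 2 when the composite
    parameter is 2 * N0 * mu (autosomes) and 1 otherwise."""
    n0_max = n0_mu_max / (copies * mu)
    g_max_per_generation = g_over_mu_max * mu
    return n0_max, g_max_per_generation / GENERATION_TIME_YEARS


# Autosomes, mu = 2.5e-8 per generation per site.
print(natural_upper_bounds(0.05, 1e9, 2.5e-8, copies=2))
# HVS-I, pedigree-based rate mu = 1e-5.
print(natural_upper_bounds(10, 2.5e6, 1e-5))
# HVS-I, transitional rate mu = 5e-6.
print(natural_upper_bounds(10, 2.5e6, 5e-6))
```

With these assumptions, the autosomal and pedigree-rate HVS-I priors both correspond to N0 up to about 10⁶ and g up to about 1 per year, whereas the transitional rate gives N0 up to about 2 × 10⁶ and g up to about 0.5 per year.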

For each population and model, we obtained the mode and the 95% HPD of N0 and g inferred from their posterior distributions (supplementary tables S4 and S7, Supplementary Material online) using the add-on package Locfit (Loader 1999) in R v2.14.1. We selected the best-fitting model among the four tested demographic models by estimating marginal likelihoods using two methods: path sampling and stepping-stone sampling (Baele et al. 2012). The model with the greater marginal likelihood (supplementary table S3, Supplementary Material online) was considered as the best-fitting model.
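The model choice rule described above amounts to picking the model with the largest estimated (log) marginal likelihood. A minimal sketch follows; the numerical values are invented, and the log Bayes factor helper is an additional illustration rather than a quantity reported in the text.

```python
def best_model(log_marginal_likelihoods):
    """Return the name of the model with the greatest log marginal likelihood."""
    return max(log_marginal_likelihoods, key=log_marginal_likelihoods.get)


def log_bayes_factor(log_marginal_likelihoods, model_a, model_b):
    """Log Bayes factor of model_a over model_b (difference of log values)."""
    return log_marginal_likelihoods[model_a] - log_marginal_likelihoods[model_b]


# Hypothetical path-sampling estimates for one population.
log_ml = {
    "constant": -520.4,
    "exponential": -512.9,
    "logistic": -513.6,
    "expansion": -513.1,
}

print(best_model(log_ml), log_bayes_factor(log_ml, "exponential", "constant"))
```

In practice, the same comparison would be repeated with the stepping-stone estimates to check that both methods favor the same model.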

Extended Bayesian Skyline Plots

EBSPs (Heled and Drummond 2010), also implemented in BEAST, estimate demographic changes occurring continuously through time in a population, using the time intervals between successive coalescent events. This method allows a visualization of the evolution of Ne through time. As above, we combined three runs of 10⁷ steps for mitochondrial sequences and three runs of 2 × 10⁸ steps for autosomal sequences to obtain acceptable ESS values. We assumed the same mutation rates as above and a generation time of 25 years. Outputs were analyzed with Tracer v1.5 to visually check for convergence and ESS, and also to obtain the 95% HPD interval for the number of demographic changes that occurred in the population (supplementary table S5, Supplementary Material online). A constant population size could be rejected when the 95% HPD of the number of change points excluded 0 but included 1 (Heled and Drummond 2010). Then, we used R v2.14.1 to compute Gelman and Rubin's (1992) convergence diagnostic as above, as well as to compute skyline plots. Finally, we used the population growth curves generated from BEAST to assess the time at which populations began to expand. Each skyline plot consisted of smoothed data points at 10–20 generation intervals. We considered that the population increased (or decreased) when both the median and 95% HPD values for Ne increased (or decreased) between more than two successive data points. Although this method did not provide a 95% HPD interval for the inferred expansion timings, this conservative approach ensured that we considered only relevant expansion signals.
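The onset rule described above can be sketched as a simple scan over the skyline points. The sketch interprets "more than two successive data points" as a run of at least three consecutive points over which the median and both HPD bounds all increase toward the present; this reading, the function name, and the toy data are assumptions made for illustration.

```python
def expansion_onset(points, min_points=3):
    """Return the oldest time (years before present) from which the median
    and both 95% HPD bounds of Ne rise over at least `min_points`
    consecutive skyline points, or None if no such run exists.
    `points` holds (time_bp, median, hpd_low, hpd_high) tuples."""
    ordered = sorted(points, key=lambda p: p[0], reverse=True)  # oldest first
    run_start = 0
    for i in range(1, len(ordered)):
        previous, current = ordered[i - 1], ordered[i]
        rising = all(current[k] > previous[k] for k in (1, 2, 3))
        if not rising:
            run_start = i  # the run of increases is broken; restart here
        elif i - run_start + 1 >= min_points:
            return ordered[run_start][0]
    return None


# Hypothetical skyline: flat until 3,000 YBP, then a sustained increase.
skyline = [
    (5000, 100, 50, 200),
    (4000, 100, 50, 200),
    (3000, 100, 50, 200),
    (2000, 150, 80, 260),
    (1000, 220, 120, 400),
    (0, 300, 200, 500),
]

print(expansion_onset(skyline))
```

Requiring the HPD bounds to rise together with the median is what makes the rule conservative: an increase in the median alone is not counted as an expansion.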

Correlation Tests of Inferred Growth Rates and Isolation/Immigration Patterns

To test how differences in isolation degrees and migration patterns could have impacted our demographic inferences, we used ARLEQUIN v3.11 (Excoffier et al. 2005) to compute population-specific FST values (Weir and Hill 2002) from HVS-I data for Central Africa, Eurasia, and Central Asia (supplementary table S9, Supplementary Material online) and to estimate immigration rates from mismatch distributions under a spatially explicit model (Excoffier 2004) (supplementary table S10, Supplementary Material online). We then performed Spearman tests using R v2.14.1 to investigate, for each region, how the inferred parametric growth rates were correlated with those FST values and immigration rates. We used a value of 0 for the growth rate when the constant model best fitted the data.
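To illustrate the correlation step, here is a small self-contained sketch of Spearman's rank correlation with average ranks for ties. Ties matter here because every population best fitted by the constant model receives a growth rate of 0. The data are invented for illustration, and the sketch returns only the coefficient, not the P value that a full test (e.g., in R) would report.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical population-specific FST values and inferred growth rates;
# the two zeros stand for populations best fitted by the constant model.
fst = [0.01, 0.03, 0.05, 0.08]
growth = [0.004, 0.003, 0.0, 0.0]
print(round(spearman_rho(fst, growth), 4))
```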

Supplementary Material

Supplementary figures S1 and S2 and tables S1–S12 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors would like to warmly thank all volunteer participants. They also thank Michael Fontaine for his help on some data analyses, Phillip Endicott and Julio Bendezu-Sarmiento for insightful discussions, Alexei Drummond for helpful discussions, and Friso Palstra for his help on English usage. They thank Laurent Excoffier, two anonymous reviewers, and the editor for helpful comments and suggestions. All computationally intensive analyses were run on the Linux cluster of the Muséum National d'Histoire Naturelle (administrated by Julio Pedraza) and on the web-based portal "Bioportal" (Kumar et al. 2009). This work was supported by Actions Transversales du Muséum - Muséum National d'Histoire Naturelle (grant "Les relations Sociétés-Natures dans le long terme") and Agence Nationale de la Recherche (grants "Altérité culturelle" ANR-10-ESVS-0010 and "Demochips" ANR-12-BSV7-0012). C.A. was financed by a PhD grant from the Centre National de la Recherche Scientifique.

References

1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.

Ambrose SH. 2001. Paleolithic technology and human evolution. Science 291:1748–1753.

Arenas M, Francois O, Currat M, Ray N, Excoffier L. 2013. Influence of admixture and Paleolithic range contractions on current European diversity gradients. Mol Biol Evol 30:57–61.

Atkinson QD, Gray RD, Drummond AJ. 2009. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa. Proc R Soc Lond B Biol Sci 276:367–373.

Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol 29:2157–2167.

Balaresque PL, Ballereau SJ, Jobling MA. 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum Mol Genet 16:R134–R139.

Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. 2011. Insights into the demographic history of African pygmies from complete mitochondrial genomes. Mol Biol Evol 28:1099–1110.

Beaumont MA. 2004. Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92:365–379.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 57:289–300.

Bermisheva M, Salimova A, Korshunova T, Svyatova G, Berezina G, Villems R, Khusnutdinova E. 2002. Mitochondrial DNA diversity in the populations of Middle Asia and Northern Caucasus. Eur J Hum Genet 10:179–180.

Bocquet-Appel JP. 2011. When the world's population took off: the springboard of the Neolithic Demographic Transition. Science 333:560–561.

Bocquet-Appel J-P, Bar-Yosef O. 2008. The Neolithic demographic transition and its consequences. Dordrecht (The Netherlands): Springer.

Aimé et al. doi:10.1093/molbev/mst156 MBE

Chaix R, Austerlitz F, Hegay T, Quintana-Murci L, Heyer E. 2008. Genetic traces of east-to-west human expansion waves in Eurasia. Am J Phys Anthropol 136:309–317.

Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, Heyer E. 2007. From social to genetic structures in central Asia. Curr Biol 17:43–48.

Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. 2009. The Last Glacial Maximum. Science 325:710–714.

Conrad DF, Keebler JEM, DePristo MA, et al. (17 co-authors). 2011. Variation in genome-wide mutation rates within and between human families. Nat Genet 43:712–714.

Derenko MV, Malyarchuk BA, Dambueva IK, Shaikhaev GO, Dorzhu CM, Nimaev DD, Zakharov IA. 2000. Mitochondrial DNA variation in two South Siberian aboriginal populations: implications for the genetic history of North Asia. Hum Biol 72:945–973.

Dirksen VG, van Geel B. 2004. Mid to late Holocene climate change and its influence on cultural development in South Central Siberia. Nato Sci S Ss IV Ear 42:291–307.

Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214.

Endicott P, Ho SYW, Metspalu M, Stringer C. 2009. Evaluating the mitochondrial timescale of human evolution. Trends Ecol Evol 24:515–521.

Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol 13:853–864.

Excoffier L, Heckel G. 2006. Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet 7:745–758.

Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online 1:47–50.

Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A 109:E2569–E2576.

Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet 59:935–945.

Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.

Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.

Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci 7:457–511.

Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol 27:570–580.

Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Segurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet 10:49.

Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol 21:597–612.

Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet 63:1113–1126.

Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res 11:423–434.

Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet 59:501–509.

Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.

Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med 116:68–73.

International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.

Kingman JFC. 1982. The coalescent. Stoch Proc Appl 13:235–248.

Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet 73:671–676.

Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet 2:e53.

Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.

Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.

Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.

Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.

Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A 107:19201–19206.

Loader C. 1999. Local regression and likelihood. New York: Springer.

Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A 103:9381–9386.

Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.

Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.

Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human populations on a global scale. Mol Biol Evol 10:927–943.

Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.

Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet 29:20–21.

Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet 6:165–183.

Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev 16:1125–1133.

Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet 5:e1000448.

Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.

Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.

Plummer M, Best N, Cowles K, Vines K. 2006. CODA: convergence diagnosis and output analysis for MCMC. R News 6:7–11.

Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.

Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet 74:827–845.

Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A 105:1596–1601.

R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.

Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer

Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol 19:2092–2100.

Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol 20:76–86.

Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet 67:1251–1276.

Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol 142:485–501.

Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.

Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet 13:745–53.

Segurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet 21:1146–1151.

Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.

Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet 25:207–217.

Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

Tremblay M, Vezina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet 66:651–658.

Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol 21:559–566.

Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol 19:312–318.

Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture, and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol 30:918–937.

Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A 102:18508–18513.

Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet 36:721–750.

Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.

Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet 70:635–651.

immigrants, suggesting that spatial processes alone cannot explain the strong difference that we observed between the expansion patterns of these two groups of populations. This indicates that the intrinsic demographic growth patterns differ between these two kinds of populations, with farmers showing much higher growth rates than herders.

To our knowledge, although other studies have found different patterns between hunter-gatherers and farmers (e.g., Verdu et al. 2009), our study is the first to show differences between farmers and herders, the two major post-Neolithic human groups. A plausible explanation could be that nomadic herders and hunter-gatherers share several of the constraints of a nomadic way of life. For instance, birth intervals are generally longer (at least 4 years) in nomadic populations than in sedentary populations (e.g., Short 1982). According to Bocquet-Appel (2011), these longer birth intervals may be mainly determined by diet differences. Indeed, Valeggia and Ellison (2009) demonstrated that birth interval is mainly determined by the rapidity of postpartum energy recovery, which may be increased by the consumption of high-carbohydrate foods (such as cereals). Moreover, the nomadic herder way of life may offer less food security than sedentary farming, the latter facilitating efficient long-term food storage.

However, unlike in Africa, we did not find systematically consistent patterns between the autosomal and mtDNA data in Eurasia. The possible contraction events that our results suggest for two sedentary populations (Japanese and Danes) with autosomes appeared concomitant with historical events that could have led to bottleneck processes. For the Japanese population, this contraction signal could indeed result from a founder effect due to the Paleolithic colonization of Japan by a subset of the Northern Asiatic people (especially from Korea; Nei 1995). Similarly, a bottleneck process may also have occurred in the Danish population, linked with the last glacial maximum occurring between 26,500 and 19,500 YBP (Clark et al. 2009). Reasons why these processes impacted the autosomes but not the mtDNA data remain to be determined, for instance through simulation studies. In any case, our study clearly emphasizes the utility of combining mtDNA and autosomal sequences, as they allow access to different aspects of human history. A recent study on harbor porpoises has similarly shown that nuclear markers were sensitive to a recent contraction event, whereas mtDNA allowed inferring a more ancient expansion (Fontaine et al. 2012).

Interestingly, Central Asia displayed a distinct pattern from the rest of Eurasia. Indeed, we did not infer higher expansion rates for sedentary farmers than for nomadic herders in that area. This could result from harsh local environmental conditions due to the arid continental climate in this area. Indeed, using pollen records, Dirksen and van Geel (2004) showed that the paleoclimate in Central Asia was very arid from at least 12,000 to 3,000 YBP, which could have limited the amount of suitable areas for farming and impacted human demography. Spatial expansion processes may also have played a role in this difference, as population-specific FST values were higher for the farmers than for the herders. This may indicate that more migrants were involved in the spatial expansion process for the herders than for the farmers, yielding a weaker expansion signal (i.e., lower inferred growth rate) for the latter (Ray et al. 2003). This is supported by the negative correlation between the FST values and the inferred growth rates in the farmer populations. The Korean population also stood out as an exception in Eurasia. Even though it is a population of sedentary farmers, it showed no significant expansion signal with either the parametric or the nonparametric methods with HVS-I. This could be explained by a later sedentarization of this population. The Korean Neolithic is notably defined by the introduction of Jeulmun ware ceramics about 8,000 YBP, but the people of the Jeulmun period were still predominantly semi-nomadic fishers and hunter-gatherers until about 3,000 YBP, when Koreans started an intensive crop production implying a sedentary lifestyle (Nelson 1993).

Inferred Expansion Signals Predate the Emergence of Farming

EBSP analyses revealed that the inferred expansion events in farmer and herder populations were more ancient than the emergence of farming and herding. Therefore, the differences in demographic patterns between farmers and herders seem to predate their divergence in lifestyle, which raises the question of the chronology of demographic expansions and the Neolithic transition. These findings appear to be quite robust to the choice of the scaling parameters. We used here both the lower and the higher mutation rate estimates in humans for autosomes (Pluzhnikov et al. 2002; Conrad et al. 2011) and for the HVS-I sequence (Forster et al. 1996; Howell et al. 1996). Despite this uncertainty in mutation rates, which leads to a 2-fold uncertainty in our time estimates, the inferred expansion signals predated the emergence of agriculture in both cases for all populations. Similarly, using a generation time of 29 years (Tremblay and Vezina 2000) instead of 25 years leads to slightly more ancient estimates and thus does not change our conclusions (data not shown). However, note that for HVS-I, using the higher bound of the credibility interval for the highest estimated mutation rate (2.75 × 10^-5/generation/site; Heyer et al. 2001) instead of the mean value (i.e., 10^-5/generation/site) leads to expansion time estimates consistent with the Neolithic transition in Eurasian populations (supplementary table S11, Supplementary Material online). Nevertheless, these estimates still clearly predated the Neolithic for the African populations. Moreover, 10^-5/generation/site is by far the highest estimate of the mutation rate in the literature (Howell et al. 1996). To infer Neolithic expansions in most Eurasian populations, one needs to assume a mutation rate of at least 2 × 10^-5/generation/site, which is much higher than other estimates from the literature and is thus probably unrealistic. In addition, our method for determining the expansion onset time using EBSP graphs is very conservative and also tends to favor the lower bound of expansion onset times. Finally, for autosomes, similarly using 4.74 × 10^-8/generation/site instead of 2.5 × 10^-8/generation/site (Pluzhnikov et al. 2002) leads to an inferred expansion onset time that is not compatible with the Neolithic transition for all Eurasian and African populations except for one African population, the Yoruba (supplementary table S12, Supplementary Material online). Consequently, it seems very likely that the expansions inferred in this study correspond to Paleolithic rather than Neolithic demographic events, in agreement also with most previous studies, as detailed later.
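The sensitivity of the dates to these scaling parameters can be sketched with a first-order rescaling, assuming that a date expressed in generations is inversely proportional to the per-generation mutation rate and that years are generations multiplied by the generation time. This is only an approximation of what a full re-analysis would give; the function name and the 30,000-year starting date are hypothetical.

```python
def rescale_time(time_years, old_rate, new_rate,
                 old_generation=25.0, new_generation=25.0):
    """Rescale a date (in years) obtained under `old_rate` (per generation
    per site) and `old_generation` (years) to another rate and generation
    time, holding the underlying genetic distance fixed."""
    generations = time_years / old_generation
    # For a fixed genetic distance, a higher per-generation rate means
    # fewer generations are needed to accumulate it.
    generations *= old_rate / new_rate
    return generations * new_generation


# Hypothetical onset of 30,000 years inferred with 2.5e-8/generation/site.
print(rescale_time(30000, 2.5e-8, 4.74e-8))          # higher rate: younger date
print(rescale_time(30000, 2.5e-8, 2.5e-8, 25, 29))   # longer generations: older date
```

The second call reproduces the direction of the effect reported above: moving from 25 to 29 years per generation multiplies the date by 29/25, giving slightly more ancient estimates.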

In Africa, the emergence of agriculture has been dated between 5,000 and 4,000 YBP in the Western part of Central Africa, from where it subsequently rapidly expanded to the rest of sub-Saharan Africa (Phillipson 1993). However, using HVS-I, we showed expansion events in farmer populations since about 30,000 or 60,000 YBP, thus largely predating the emergence of agriculture in the area. Similarly, using autosomes, especially in Eastern African populations, we inferred expansion signals that clearly predated the Neolithic. Notably, we inferred an expansion signal for Mozambicans since at least 80,000 YBP. Several genetic studies have already highlighted that expansion events occurred in African farmers before the Neolithic transition (e.g., Atkinson et al. 2009; Laval et al. 2010; Batini et al. 2011). This finding is also consistent with paleoanthropological data (i.e., radiocarbon dating) suggesting an expansion event in Africa 60,000–80,000 YBP (Mellars 2006a). This Paleolithic demographic expansion could be linked to a rapid environmental change toward a drier climate (Partridge et al. 1997) and/or to the emergence of new hunting technologies (Mellars 2006a).

According to Mellars (2006a), this period corresponds to a major increase in the complexity of the technological, economic, social, and cognitive behavior of certain African groups. It corresponds in particular to the emergence of projectile technologies (Shea 2009), which was probably part of a broader pattern of ecological diversification of early Homo sapiens populations. These changes could have been decisive for the human spread "Out of Africa" during the same period and could have ultimately also led to the sedentarization of the remaining populations. This inference is consistent with Sauer's (1952) demographic theory, which stated that late Paleolithic demographic expansions could have favored the sedentarization and the emergence of agriculture in some human populations. In the case of Central Africa, the period of 60,000 YBP corresponds to the separation between the ancestors of hunter-gatherers and farmers (Patin et al. 2009; Verdu et al. 2009). Thus, these two groups may have presented contrasting demographic patterns since their divergence. Much later, higher expansion rates and larger population sizes among farmers' ancestors may have induced the emergence of agriculture and sedentarization.

With respect to Eurasia, the expansion profiles inferred with HVS-I for all populations and with autosomes for the Han Chinese population also seem to have begun during the Paleolithic, thus before the Neolithic transition. Some genetic studies already reported pre-Neolithic expansions in Asia and Europe (e.g., Chaix et al. 2008). Notably, using mismatch and intermatch distributions, Chaix et al. (2008) showed an east-to-west Paleolithic expansion wave in Eurasia. We found a similar pattern here, as the inferred expansions of East-Asian populations were earlier than those of Central Asian populations, themselves earlier than those of European populations. Moreover, we found this pattern in both sedentary farmer and nomadic herder populations. Thus, the ancestors of currently nomadic herder populations also experienced these Paleolithic expansions. However, Paleolithic expansion signals in nomadic populations seem lower than in sedentary populations. This is again compatible with the demographic theory of the Neolithic sedentarization (Sauer 1952): some populations may have experienced more intense Paleolithic expansions, which may have led ultimately to their sedentarization.

The inferred Paleolithic expansion signals might result partly from spatial expansions out of some refuge areas after the Last Glacial Maximum (LGM, 26,500–19,500 YBP; Clark et al. 2009), as this time interval matches our inferred dating for expansion onsets in East Asia with HVS-I using the pedigree-based mutation rate, and in Europe and the Middle East using the transitional mutation rate. Some of the earlier date estimates might also be consistent with the out-of-Africa expansion of H. sapiens. However, the radiocarbon-based time estimates of the spread of H. sapiens in Eurasia are generally more ancient than our inferred expansion onset timings. For instance, Mellars (2006b) dated the colonization of the Middle East by H. sapiens at 47,000–49,000 YBP and of Europe at 41,000–42,000 YBP. Pavlov et al. (2001) reported traces of modern human occupation nearly 40,000 years old in Siberia. Finally, Liu et al. (2010) described modern human fossils from South China dated to at least 60,000 YBP. Moreover, out-of-Africa or post-LGM expansions would not explain our finding of an east-to-west gradient of expansion onset timing, which rather supports the hypothesis of a demographic expansion diffused from east to west in Eurasia in a demic (i.e., migrations of individuals) or cultural (i.e., favored by the diffusion of new technologies) manner.

Possible Confounding Factors

Our approach makes the assumption that populations are isolated and panmictic, which is questionable for human populations. However, we analyzed a large set of populations sampled in very distant geographical regions (i.e., Central Africa, East Africa, Europe, Middle East, Central Asia, Pamir, Siberia, and East Asia). The main conclusions of this study rely on consistent patterns between most of these areas, and it seems unlikely that processes such as admixture could have biased the estimates similarly everywhere. Moreover, in Central Africa, several studies have shown that hunter-gatherer populations show signals of admixture, whereas this is not the case for farmer populations (Patin et al. 2009; Verdu et al. 2009, 2013). If this introgression had been strong enough, it may have yielded a spurious expansion signal in the hunter-gatherer populations, which is not what we observed here. In Europe, spatial expansion processes during the Neolithic may have led to admixture with Paleolithic populations. As pointed out by a simulation study (Arenas et al. 2013), this may lead to a predominance of the Paleolithic gene pool. This may be one of the factors explaining why we observed mostly Paleolithic expansions here.

Similarly, potential selection occurring on the whole mitochondrial genome (e.g., Pakendorf and Stoneking 2005) seems unlikely to have impacted in the same way all the studied populations within each group (e.g., stronger positive selection on sedentary than on nomadic populations), as we analyzed different nomadic and sedentary populations living near each other in several geographically distant areas.

Regarding the potential effects of recombination on the inferences from autosomal data, we found that neutrality tests gave similar results on the whole sequences, when using a simulation procedure that took the known recombination rate of each sequence into account (table 1), and on the largest non-recombining blocks as inferred with IMgc, without taking recombination into account in the simulation process (supplementary table S12, Supplementary Material online). It thus appears unlikely that the BEAST analyses, which can only handle the largest inferred non-recombining blocks, are biased because of this.

Finally, note that the effective population sizes inferred using BEAST correspond to the Ne of the populations during their recent history, rather than a value of Ne averaged over the history of the population. This explains the finding that, for most populations, we inferred Ne estimates much higher than generally assumed for humans by population geneticists (about 10,000).

Material and Methods

Genetic Markers

Autosomal Sequences

We used data from 20 noncoding, a priori neutral, and unlinked autosomal regions selected by Patin et al. (2009) to be at least 200 kb away from any known or predicted gene, to not be in linkage disequilibrium (LD) either with each other or with any known or predicted gene, and to have a region of homology with the chimpanzee genome. These regions are on average 1,253 bp long. Using the four-gamete test (Hudson and Kaplan 1985) as implemented in IMgc online (Woerner et al. 2007), we identified recombination events for 6 of these 20 regions. As some methods used in this study cannot handle recombination, we retained for these six sequences the largest non-recombining block inferred by IMgc. Because of this reduction, the 20 regions used were on average 1,228 bp long. To identify potential bias related to this method (e.g., some recombination events may not be detected using the four-gamete test; larger blocks of non-recombining sequence may select for gene trees that are shorter than expected), we computed the summary statistics and performed neutrality tests (discussed later) both on the whole sequences (table 1) and on the largest non-recombining blocks (supplementary table S12, Supplementary Material online).
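For readers unfamiliar with the four-gamete test, the core check can be sketched as follows: under an infinite-sites model, observing all four allele combinations at a pair of biallelic sites implies at least one recombination event between them. This sketch only flags incompatible site pairs; it does not reproduce the block-selection step performed by IMgc, and the 0/1 haplotype encoding is an assumption made for illustration.

```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return the pairs of site indices (i, j) at which all four gametes
    (00, 01, 10, 11) are observed among the haplotypes.
    `haplotypes` is a list of equal-length strings of '0' and '1'."""
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


# Illustrative haplotypes: sites 0 and 1 show all four gametes,
# whereas site 2 is monomorphic and compatible with every other site.
haps = ["000", "010", "100", "110"]
print(four_gamete_violations(haps))  # [(0, 1)]
```

As noted in the text, the test is conservative: recombination events that do not create a fourth gamete go undetected.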

Mitochondrial Sequences

We used the first hypervariable segment of the mitochondrial control region (HVS-I), sequenced between positions 16067 and 16383, excluding the hypervariable poly-C region (sites 16179–16195). The total length of the sequence was thus 300 bp.

Population Panel

For Africa, we used the autosomal sequences data set of Patin et al. (2009), which consists of five farmer populations (N = 118 individuals) and five Pygmy hunter-gatherer populations (N = 95). In addition, we used the HVS-I data set from Quintana-Murci et al. (2008), which consists of nine Central African farmer populations (N = 486) and seven Central African hunter-gatherer populations (N = 318) (supplementary table S1, Supplementary Material online).

For Eurasia, we used the autosomal sequences data set of Laval et al. (2010), consisting of 48 individuals from two East-Asian populations (Han Chinese and Japanese) and 47 individuals from two European populations (Chuvash and Danes). We also used the data from 48 individuals from one sedentary Central Asian population (Tajik farmers) and 48 individuals from one nomadic Central Asian population (Kyrgyz herders) of Segurel et al. (2013). For HVS-I, we analyzed data from 17 Eurasian populations (N = 494 in total) located from Eastern to Western Eurasia, belonging to several published data sets (Derenko et al. 2000; Richards et al. 2000; Bermisheva et al. 2002; Imaizumi et al. 2002; Yao et al. 2002; Kong et al. 2003; Quintana-Murci et al. 2004; supplementary table S1, Supplementary Material online).

For our detailed study of Central Asia, we used HVS-I sequences from 12 farmer populations (N = 408 in total) and 16 herder populations (N = 567 in total). These data come from the studies by Chaix et al. (2007) and Heyer et al. (2009) for 25 populations (supplementary table S1, Supplementary Material online). The other populations (KIB, TAB, and TKY) were sequenced for this study. As in Chaix et al. (2007) and Heyer et al. (2009), DNA was extracted from blood samples using standard protocols, and the sequence quality was ensured as follows: each base pair was determined once with a forward and once with a reverse primer; any ambiguous base call was checked by additional and independent PCR and sequencing reactions; all sequences were examined by two independent investigators. All sampled individuals were healthy donors from whom informed consent was obtained. The study was approved by appropriate Ethic Committees and scientific organizations in all countries where samples have been collected.

Aimé et al. doi:10.1093/molbev/mst156 MBE

Demographic Inferences from Sequences Analysis

Summary Statistics and Neutrality Tests

We computed classical summary statistics (number of polymorphic sites S, number of haplotypes K) and four neutrality tests (Tajima's [1989] D, Fu and Li's [1993] D and F, and Fu's [1997] Fs) on both mitochondrial and autosomal sequences. Although neutrality tests were originally designed to detect selective events, they also give information about demographic processes, especially when applied to neutral markers. Indeed, expansion events lead to more negative values than expected in the absence of selective and demographic processes. Conversely, contraction events lead to more positive values of the neutrality tests. For HVS-I sequences, we computed all summary statistics and neutrality tests, and tested their departure from neutrality using the coalescent-based tests provided in DnaSP (Librado and Rozas 2009).
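As an illustration of how such a statistic responds to demography, the following minimal sketch computes Tajima's (1989) D from a set of aligned sequences. It is not the DnaSP or ARLEQUIN implementation used in this study; the function name and the toy alignments are hypothetical, and the sequences are assumed to be of equal length without gaps or missing data.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (no gaps assumed)."""
    n = len(seqs)
    length = len(seqs[0])
    # S: number of segregating (polymorphic) sites
    S = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    if S == 0:
        return 0.0  # convention chosen here; D is undefined without variation
    # pi: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))

# Toy data: an excess of singletons (expansion-like) versus
# intermediate-frequency variants (contraction- or structure-like).
print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"]))  # negative
print(tajimas_d(["AAA", "AAA", "TTT", "TTT"]))      # positive
```

An excess of rare variants lowers the mean pairwise difference relative to S and pushes D below zero, which is the signature expected after an expansion.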

For the autosomal sequences, we used the procedure developed in Laval et al. (2010), which combines all autosomal sequences into a single test. This procedure consists in computing the mean value of each summary statistic across the 20 loci, and in testing whether this mean value departs significantly from its expectation under neutrality in a constant-size population model, using a simulation procedure. For a given population with sample size n, we produced 10⁵ simulated samples of the same size n under a constant population size model, using the generation-per-generation coalescent-based algorithm implemented in SIMCOAL v2 (Laval and Excoffier 2004). Each simulated individual was constituted by 20 independent sequences of 1253 bp (the average sequence length for the real data). Then, we used ARLEQUIN v3 (Excoffier et al. 2005), as modified by Laval et al. (2010), to compute the summary statistics on these simulated samples. We assessed whether the observed statistics differed significantly from the constant population model under neutrality by comparing these statistics with their null distribution obtained from the simulated data. We used gamma-distributed mutation rates with a mean value of 2.5 × 10⁻⁸/generation/site (95% confidence interval: 1.476 × 10⁻⁸ to 4.036 × 10⁻⁸), in agreement with previous studies (Pluzhnikov et al. 2002; Voight et al. 2005). This procedure yielded a P value for the significance of the departure from a constant size model. We performed this procedure both on the whole sequences and on the largest non-recombining blocks. For the whole sequences (i.e., including recombination), we performed the data simulation under a coalescent model with recombination, using for each locus the recombination rate provided by the HapMap build GRCh37 genetic map (International HapMap Consortium 2003) (supplementary table S8, Supplementary Material online), whereas for the largest non-recombining blocks, we used a coalescent model without recombination.
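The comparison of an observed multilocus mean with its simulated null distribution can be sketched as follows. This is only an illustration: the text does not specify how the tails were combined, so the two-sided rule below is an assumption, the function names are hypothetical, and the null values would in practice come from the coalescent simulations rather than from a hand-written list.

```python
def mean_across_loci(per_locus_values):
    """Mean of one summary statistic over all loci of a population."""
    return sum(per_locus_values) / len(per_locus_values)

def empirical_p_value(observed, null_values):
    """Two-sided empirical P value: twice the smaller tail proportion, capped at 1.

    The two-sided rule is an assumption made for this sketch.
    """
    n = len(null_values)
    lower = sum(v <= observed for v in null_values) / n
    upper = sum(v >= observed for v in null_values) / n
    return min(1.0, 2 * min(lower, upper))

# Hypothetical example: mean Tajima's D over loci versus five simulated means.
observed = mean_across_loci([-2.1, -1.9, -2.0])
print(empirical_p_value(observed, [-2, -1, 0, 1, 2]))
```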

Both for autosomes and for HVS-I sequences, we adjusted the obtained P values for each neutrality test using a false discovery rate (FDR) correction (Benjamini and Hochberg 1995) in R v2.14.1 (R Development Core Team 2011), in order to take into account the increased error probability in the case of multiple testing.
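For readers unfamiliar with the Benjamini and Hochberg (1995) adjustment, a minimal sketch of the step-up procedure is given below. The study itself performed the correction in R; this stand-alone function and its example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical P values for one neutrality test across four populations.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```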

MCMC Estimations of Demographic Parameters

We used the MCMC algorithm implemented in BEAST v1.6 (Drummond and Rambaut 2007). We tested the four demographic models implemented in this software: constant effective population size (N0) (constant model); population expansion with an increasing growth rate (g) (exponential model); population expansion with a decreasing growth rate (g) (logistic model); and the expansion model, in which N0 is the present-day population size, N1 the population size that the model asymptotes to going into the distant past, and g the exponential growth rate that determines how fast the transition is from near the N1 population size to the N0 population size. In fact, BEAST estimates composite parameters for each model, namely N0μ and g/μ, where N0 is the current effective population size, g the growth rate, and μ the mutation rate. In addition, for the expansion model, the ratio between the current (N0) and ancestral (N1) effective population size is also estimated. To infer N0 and g from these composite parameters, we needed to assume a value for the mutation rate. However, there is no consensus for mutation rates in humans in the literature, as different methods lead to different estimations. For autosomes, the most commonly used value is the phylogenetic rate of μ = 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002). However, recent studies based on the 1000 Genomes Project (1000 Genomes Project Consortium 2012) have found a 2-fold lower rate (μ = 1.2 × 10⁻⁸/generation/site) by directly comparing genome-wide sequences from children and their parents (Conrad et al. 2011; Scally and Durbin 2012). We used here both mutation rates. Similarly, for HVS-I, estimated mutation rates are highly dependent on methodologies and modes of calibration (Endicott et al. 2009). We used both the lower and the higher estimated mutation rates: the transitional changes mutation rate of μ = 5 × 10⁻⁶/generation/site (Forster et al. 1996) and the pedigree-based rate of μ = 10⁻⁵/generation/site (Howell et al. 1996; Heyer et al. 2001). We used a general time-reversible substitution model (Rodriguez et al. 1990). We assumed a generation time of 25 years, permitting the comparison with previous human population genetics studies (e.g., Chaix et al. 2008; Patin et al. 2009; Laval et al. 2010). As BEAST cannot handle recombination events, we used the largest non-recombining block within each sequence (discussed earlier).
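The rescaling of composite estimates into demographic units can be sketched as below. The sketch assumes that the composite parameters are the product N0μ and the ratio g/μ, with μ expressed per generation and per site, and it uses the 25-year generation time stated above; the function name and the input values are hypothetical.

```python
GENERATION_TIME_YEARS = 25  # generation time assumed in the text

def rescale(n0_mu, g_over_mu, mu):
    """Convert composite estimates (N0*mu, g/mu) into N0 and g per year.

    mu is the assumed mutation rate per generation per site.
    """
    n0 = n0_mu / mu
    g_per_generation = g_over_mu * mu
    return n0, g_per_generation / GENERATION_TIME_YEARS

# The same hypothetical composite estimates under the two HVS-I mutation rates.
for mu in (1e-5, 5e-6):
    print(mu, rescale(0.1, 1e5, mu))
```

Halving the assumed mutation rate doubles the inferred N0 and halves the inferred g, which is why results are reported under both rates.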

We performed three runs of 10⁷ steps per population and per demographic model for the HVS-I sequence, and three runs of 2 × 10⁸ steps (which corresponded to three runs of 10⁷ steps per locus) for the autosomal sequences. We recorded one tree every 1000 steps, which thus implied a total of 10⁵ trees per locus and per run. We then removed the first 10% of steps of each run (burn-in period) and combined the runs to obtain acceptable effective sample sizes (ESSs of 100 or above). The convergence of these runs was assessed using two methods: visual inspection of traces using Tracer v1.5 (Rambaut and Drummond 2007) to check for concordance between runs, and computation of Gelman and Rubin's (1992) convergence diagnostic using R v2.14.1 (R Development Core Team 2011) with the function gelman.diag available in the package coda (Plummer et al. 2006).
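The burn-in removal and the Gelman and Rubin (1992) diagnostic can be illustrated with the simplified sketch below. It computes the basic potential scale reduction factor for one parameter; the gelman.diag function of coda applies an additional degrees-of-freedom correction, so its values may differ slightly. Function names and chains are hypothetical.

```python
from statistics import mean, variance

def drop_burn_in(chain, fraction=0.10):
    """Discard the first `fraction` of sampled states of one run."""
    return chain[int(len(chain) * fraction):]

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains.

    Requires at least two chains with non-zero within-chain variance.
    """
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    W = mean(variance(c) for c in chains)  # within-chain variance
    B = n * variance(chain_means)          # between-chain variance
    var_hat = (n - 1) / n * W + B / n      # pooled variance estimate
    return (var_hat / W) ** 0.5

# Two toy runs sampling the same region give a value close to 1,
# whereas runs stuck in different regions give a much larger value.
print(gelman_rubin([[1, 2, 3, 4], [1, 2, 3, 4]]))
print(gelman_rubin([[0, 1, 0, 1], [10, 11, 10, 11]]))
```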

To facilitate a large exploration of the parameter space, for the autosomal sequences, we chose uniform priors between 0 and 0.05 for 2N0μ and between −10⁹ and 10⁹ for g/μ. For HVS-I sequences, we chose uniform priors for N0μ between 0 and 10 and for g/μ between −2.5 × 10⁶ and 2.5 × 10⁶, resulting in the same priors on N0 and g as for autosomal sequences if we assumed μ = 10⁻⁵/generation/site (i.e., N0 constrained between 0 and 10⁶ and g constrained between −1 and 1 per year). Conversely, if we assumed μ = 5 × 10⁻⁶/generation/site, it meant that N0 was constrained between 0 and 2 × 10⁶ and g was constrained between −0.5 and 0.5 per year.
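The stated bounds can be checked by a short worked calculation. The block below reads the priors as bounds on the composite parameters 2N0μ (autosomes), N0μ (HVS-I), and g/μ, which is an interpretation of the text, and it uses the 25-year generation time to express g per year; the autosomal line assumes the phylogenetic rate μ = 2.5 × 10⁻⁸/generation/site.

```latex
\begin{align*}
\text{Autosomes } (\mu = 2.5\times10^{-8}):\quad
  & N_0^{\max} = \frac{0.05}{2\mu} = 10^{6}, \qquad
    g^{\max} = \frac{10^{9}\,\mu}{25} = 1\ \text{per year} \\
\text{HVS-I } (\mu = 10^{-5}):\quad
  & N_0^{\max} = \frac{10}{\mu} = 10^{6}, \qquad
    g^{\max} = \frac{2.5\times10^{6}\,\mu}{25} = 1\ \text{per year} \\
\text{HVS-I } (\mu = 5\times10^{-6}):\quad
  & N_0^{\max} = \frac{10}{\mu} = 2\times10^{6}, \qquad
    g^{\max} = \frac{2.5\times10^{6}\,\mu}{25} = 0.5\ \text{per year}
\end{align*}
```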

For each population and model, we obtained the mode and the 95% HPD of N0 and g, inferred from their posterior distributions (supplementary tables S4 and S7, Supplementary Material online), using the add-on package Locfit (Loader 1999) in R v2.14.1. We selected the best-fitting model among the four tested demographic models by estimating marginal likelihoods using two methods: path sampling and stepping-stone sampling (Baele et al. 2012). The model with the greatest marginal likelihood (supplementary table S3, Supplementary Material online) was considered as the best-fitting model.
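The selection rule amounts to ranking the models by their estimated log marginal likelihood. A minimal sketch follows; the numerical values are invented for illustration and are not results of this study, and the helper names are hypothetical.

```python
def best_model(log_marginal_likelihoods):
    """Return the model name with the highest log marginal likelihood."""
    return max(log_marginal_likelihoods, key=log_marginal_likelihoods.get)

def log_bayes_factor(log_ml, model_a, model_b):
    """Log Bayes factor of model_a over model_b."""
    return log_ml[model_a] - log_ml[model_b]

# Invented log marginal likelihoods for one population.
log_ml = {
    "constant": -1050.2,
    "exponential": -1043.7,
    "logistic": -1044.9,
    "expansion": -1044.1,
}
print(best_model(log_ml))
print(log_bayes_factor(log_ml, "exponential", "constant"))
```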

Extended Bayesian Skyline Plots

EBSPs (Heled and Drummond 2010), also implemented in BEAST, estimate demographic changes occurring continuously through time in a population, using the time intervals between successive coalescent events. This method allows a visualization of the evolution of Ne through time. As above, we combined three runs of 10⁷ steps for mitochondrial sequences and three runs of 2 × 10⁸ steps for autosomal sequences to obtain acceptable ESS values. We assumed the same mutation rates as above and a generation time of 25 years. Outputs were analyzed with Tracer v1.5 to visually check for convergence and ESS, and also to obtain the 95% HPD interval for the number of demographic changes that occurred in the population (supplementary table S5, Supplementary Material online). A constant population size could be rejected when the 95% HPD of the number of change points excluded 0 but included 1 (Heled and Drummond 2010). Then, we used R v2.14.1 to compute Gelman and Rubin's (1992) convergence diagnostic as above, as well as to compute skyline plots. Finally, we used the population growth curves generated from BEAST to assess the time at which populations began to expand. Each skyline plot consisted of smoothed data points at approximately 10–20 generation intervals. We considered that the population increased (or decreased) when both the median and 95% HPD values for Ne increased (or decreased) between more than two successive data points. Although this method did not provide a 95% HPD interval for the inferred expansion timings, this conservative approach ensured that we considered only relevant expansion signals.
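The criterion used to date expansion onsets can be sketched as follows. The sketch interprets "more than two successive data points" as a run of at least three points over which the median and both HPD bounds all rise; this reading, the function name, and the toy series (ordered from the oldest to the most recent point) are assumptions.

```python
def first_sustained_increase(median, lower, upper, min_points=3):
    """Index of the first point of a run of min_points successive points
    over which the median and both HPD bounds strictly increase.

    Series are ordered from past to present. Returns None if no run exists.
    """
    def rises(i):
        return (median[i + 1] > median[i]
                and lower[i + 1] > lower[i]
                and upper[i + 1] > upper[i])

    run_start = None
    for i in range(len(median) - 1):
        if rises(i):
            if run_start is None:
                run_start = i
            # Points run_start..i+1 inclusive belong to the current run.
            if (i + 1) - run_start + 1 >= min_points:
                return run_start
        else:
            run_start = None
    return None

# Toy skyline: flat at first, then rising in median and both bounds.
print(first_sustained_increase([1, 1, 2, 3, 4],
                               [0.5, 0.5, 1, 2, 3],
                               [2, 2, 3, 4, 5]))
```

A decrease can be detected with the same function by passing the negated series.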

Correlation Tests of Inferred Growth Rates and Isolation/Immigration Patterns

To test how differences in isolation degrees and migration patterns could have impacted our demographic inferences, we used ARLEQUIN v3.11 (Excoffier et al. 2005) to compute population-specific FST values (Weir and Hill 2002) from HVS-I data for Central Africa, Eurasia, and Central Asia (supplementary table S9, Supplementary Material online), and to estimate immigration rates from mismatch distributions under a spatially explicit model (Excoffier 2004) (supplementary table S10, Supplementary Material online). We then performed Spearman tests using R v2.14.1 to investigate, for each region, how the inferred parametric growth rates were correlated with those FST values and immigration rates. We used a value of 0 for the growth rate when the constant model best fitted the data.
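Because growth rates set to 0 produce ties, the rank correlation must handle tied values. The sketch below computes Spearman's rho with average ranks for ties; it is an illustration only (the study used R), and the growth rates and FST values are invented.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks.

    Undefined (division by zero) if all values of x or of y are identical.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented data: two populations best fitted by the constant model (g = 0).
growth = [0.0, 0.0, 0.1, 0.3]
fst = [0.20, 0.15, 0.10, 0.05]
print(spearman_rho(growth, fst))
```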

Supplementary Material

Supplementary figures S1 and S2 and tables S1–S12 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors would like to warmly thank all volunteer participants. They also thank Michael Fontaine for his help on some data analyses, Phillip Endicott and Julio Bendezu-Sarmiento for insightful discussions, Alexei Drummond for helpful discussions, and Friso Palstra for his help on English usage. They thank Laurent Excoffier, two anonymous reviewers, and the editor for helpful comments and suggestions. All computationally intensive analyses were run on the Linux cluster of the Muséum National d'Histoire Naturelle (administrated by Julio Pedraza) and on the web-based portal "Bioportal" (Kumar et al. 2009). This work was supported by Actions Transversales du Muséum - Muséum National d'Histoire Naturelle (grant "Les relations Sociétés-Natures dans le long terme") and Agence Nationale de la Recherche (grants "Altérité culturelle" ANR-10-ESVS-0010 and "Demochips" ANR-12-BSV7-0012). C.A. was financed by a PhD grant from the Centre National de la Recherche Scientifique.

References

1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.

Ambrose SH. 2001. Paleolithic technology and human evolution. Science 291:1748–1753.

Arenas M, Francois O, Currat M, Ray N, Excoffier L. 2013. Influence of admixture and Paleolithic range contractions on current European diversity gradients. Mol Biol Evol 30:57–61.

Atkinson QD, Gray RD, Drummond AJ. 2009. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa. Proc R Soc Lond B Biol Sci 276:367–373.

Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol 29:2157–2167.

Balaresque PL, Ballereau SJ, Jobling MA. 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum Mol Genet 16:R134–R139.

Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. 2011. Insights into the demographic history of African pygmies from complete mitochondrial genomes. Mol Biol Evol 28:1099–1110.

Beaumont MA. 2004. Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92:365–379.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 57:289–300.

Bermisheva M, Salimova A, Korshunova T, Svyatova G, Berezina G, Villems R, Khusnutdinova E. 2002. Mitochondrial DNA diversity in the populations of Middle Asia and Northern Caucasus. Eur J Hum Genet 10:179–180.

Bocquet-Appel JP. 2011. When the world's population took off: the springboard of the Neolithic Demographic Transition. Science 333:560–561.

Bocquet-Appel J-P, Bar-Yosef O. 2008. The Neolithic demographic transition and its consequences. Dordrecht (The Netherlands): Springer.

Chaix R, Austerlitz F, Hegay T, Quintana-Murci L, Heyer E. 2008. Genetic traces of east-to-west human expansion waves in Eurasia. Am J Phys Anthropol 136:309–317.

Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, Heyer E. 2007. From social to genetic structures in central Asia. Curr Biol 17:43–48.

Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. 2009. The Last Glacial Maximum. Science 325:710–714.

Conrad DF, Keebler JEM, DePristo MA, et al. (17 co-authors). 2011. Variation in genome-wide mutation rates within and between human families. Nat Genet 43:712–714.

Derenko MV, Malyarchuk BA, Dambueva IK, Shaikhaev GO, Dorzhu CM, Nimaev DD, Zakharov IA. 2000. Mitochondrial DNA variation in two South Siberian aboriginal populations: implications for the genetic history of North Asia. Hum Biol 72:945–973.

Dirksen VG, van Geel B. 2004. Mid to late Holocene climate change and its influence on cultural development in South Central Siberia. Nato Sci S Ss IV Ear 42:291–307.

Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214.

Endicott P, Ho SYW, Metspalu M, Stringer C. 2009. Evaluating the mitochondrial timescale of human evolution. Trends Ecol Evol 24:515–521.

Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol 13:853–864.

Excoffier L, Heckel G. 2006. Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet 7:745–758.

Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online 1:47–50.

Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A 109:E2569–E2576.

Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet 59:935–945.

Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.

Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.

Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci 7:457–511.

Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol 27:570–580.

Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Segurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet 10:49.

Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol 21:597–612.

Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet 63:1113–1126.

Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res 11:423–434.

Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet 59:501–509.

Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.

Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med 116:68–73.

International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.

Kingman JFC. 1982. The coalescent. Stoch Proc Appl 13:235–248.

Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet 73:671–676.

Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet 2:e53.

Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.

Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.

Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.

Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.

Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A 107:19201–19206.

Loader C. 1999. Local regression and likelihood. New York: Springer.

Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A 103:9381–9386.

Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.

Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.

Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human populations on a global scale. Mol Biol Evol 10:927–943.

Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.

Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet 29:20–21.

Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet 6:165–183.

Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev 16:1125–1133.

Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet 5:e1000448.

Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.

Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.

Plummer M, Best N, Cowles K, Vines K. 2006. CODA: convergence diagnosis and output analysis for MCMC. R News 6:7–11.

Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.

Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet 74:827–845.

Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A 105:1596–1601.

R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.

Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer/

Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol 19:2092–2100.

Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol 20:76–86.

Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet 67:1251–1276.

Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol 142:485–501.

Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.

Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet 13:745–753.

Segurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet 21:1146–1151.

Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.

Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet 25:207–217.

Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

Tremblay M, Vezina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet 66:651–658.

Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol 21:559–566.

Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol 19:312–318.

Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture, and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol 30:918–937.

Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A 102:18508–18513.

Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet 36:721–750.

Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.

Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet 70:635–651.

African population, the Yoruba (supplementary table S12, Supplementary Material online). Consequently, it seems very likely that the expansions inferred in this study correspond to Paleolithic rather than Neolithic demographic events, in agreement also with most previous studies, as detailed later.

In Africa, the emergence of agriculture has been dated between 5,000 and 4,000 YBP in the Western part of Central Africa, and agriculture subsequently expanded rapidly to the rest of sub-Saharan Africa (Phillipson 1993). However, using HVS-I, we showed expansion events in farmer populations since about 30,000 or 60,000 YBP, thus largely predating the emergence of agriculture in the area. Similarly, using autosomes, especially in Eastern African populations, we inferred expansion signals that clearly predated the Neolithic. Notably, we inferred an expansion signal for Mozambicans since at least 80,000 YBP. Several genetic studies have already highlighted that expansion events occurred in African farmers before the Neolithic transition (e.g., Atkinson et al. 2009; Laval et al. 2010; Batini et al. 2011). This finding is also consistent with paleoanthropological data (i.e., radiocarbon dating) suggesting an expansion event in Africa 60,000–80,000 YBP (Mellars 2006a). This Paleolithic demographic expansion could be linked to a rapid environmental change toward a dryer climate (Partridge et al. 1997) and/or to the emergence of new hunting technologies (Mellars 2006a).

According to Mellars (2006a), this period corresponds to a major increase in the complexity of the technological, economic, social, and cognitive behavior of certain African groups. It corresponds in particular to the emergence of projectile technologies (Shea 2009), which was probably part of a broader pattern of ecological diversification of early Homo sapiens populations. These changes could have been decisive for the human spread "Out of Africa" during the same period, and could have ultimately also led to the sedentarization of the remaining populations. This inference is consistent with Sauer's (1952) demographic theory, which stated that late Paleolithic demographic expansions could have favored the sedentarization and the emergence of agriculture in some human populations. In the case of Central Africa, the period of 60,000 YBP corresponds to the separation between hunter-gatherers' and farmers' ancestors (Patin et al. 2009; Verdu et al. 2009). Thus, these two groups may have presented contrasting demographic patterns since their divergence. Much later, higher expansion rates and larger population sizes among farmers' ancestors may have induced the emergence of agriculture and sedentarization.

With respect to Eurasia, the expansion profiles inferred with HVS-I for all populations, and with autosomes for the Han Chinese population, also seem to have begun during the Paleolithic, thus before the Neolithic transition. Some genetic studies already reported pre-Neolithic expansions in Asia and Europe (e.g., Chaix et al. 2008). Notably, using mismatch and intermatch distributions, Chaix et al. (2008) showed an east-to-west Paleolithic expansion wave in Eurasia. We found a similar pattern here, as the inferred expansions of East-Asian populations were earlier than those of Central Asian populations, themselves earlier than those of European populations.

Moreover, we found this pattern in both sedentary farmer and nomadic herder populations. Thus, the ancestors of currently nomadic herder populations also experienced these Paleolithic expansions. However, Paleolithic expansion signals in nomadic populations seem weaker than in sedentary populations. This is again compatible with the demographic theory of the Neolithic sedentarization (Sauer 1952): some populations may have experienced more intense Paleolithic expansions, which may have led ultimately to their sedentarization.

The inferred Paleolithic expansion signals might result partly from spatial expansions out of some refuge areas after the Last Glacial Maximum (LGM, 26,500–19,500 YBP; Clark et al. 2009), as this time interval matches our inferred dating for expansion onsets in East Asia with HVS-I using the pedigree-based mutation rate, and in Europe and the Middle East using the transitional mutation rate. Some of the earlier date estimates might also be consistent with the out-of-Africa expansion of H. sapiens. However, the radiocarbon-based estimates of the timing of the spread of H. sapiens in Eurasia are generally more ancient than our inferred expansion onset timings. For instance, Mellars (2006b) dated the colonization of the Middle East by H. sapiens at 47,000–49,000 YBP and of Europe at 41,000–42,000 YBP. Pavlov et al. (2001) report traces of modern human occupation nearly 40,000 years old in Siberia. Finally, Liu et al. (2010) described modern human fossils from South China dated to at least 60,000 YBP. Moreover, out-of-Africa or post-LGM expansions would not explain our finding of an east-to-west gradient of expansion onset timing, which rather supports the hypothesis of a demographic expansion diffused from east to west in Eurasia in a demic (i.e., migrations of individuals) or cultural (favored by the diffusion of new technologies) manner.

Possible Confounding Factors

Our approach makes the assumption that populations are isolated and panmictic, which is questionable for human populations. However, we analyzed a large set of populations sampled in very distant geographical regions (i.e., Central Africa, East Africa, Europe, Middle East, Central Asia, Pamir, Siberia, and East Asia). The main conclusions of this study rely on consistent patterns between most of these areas, and it seems unlikely that processes such as admixture could have biased the estimates similarly everywhere. Moreover, in Central Africa, several studies have shown that hunter-gatherer populations show signals of admixture, whereas it is not the case for farmer populations (Patin et al. 2009; Verdu et al. 2009, 2013). If this introgression had been strong enough, this may have yielded a spurious expansion signal in the hunter-gatherer populations, which is not what we observed here. In Europe, spatial expansion processes during the Neolithic may have led to admixture with Paleolithic populations. As pointed out by a simulation study (Arenas et al. 2013), this may lead to a predominance of the Paleolithic gene pool. This may be one of the factors explaining why we observed mostly Paleolithic expansions here.

Similarly, potential selection occurring on the whole mitochondrial genome (e.g., Pakendorf and Stoneking 2005) seems unlikely to have impacted in the same way all the studied populations within each group (e.g., stronger positive selection on sedentary than on nomadic populations), as we analyzed different nomadic and sedentary populations living near each other in several geographically distant areas.

Regarding the potential effects of recombination on the inferences from autosomal data, we found that neutrality tests gave similar results on the whole sequences, when using a simulation procedure that took the known recombination rate of each sequence into account (table 1), and on the largest non-recombining blocks, as inferred with IMgc, without taking recombination into account in the simulation process (supplementary table S12, Supplementary Material online). It thus appears unlikely that the BEAST analyses, which can only handle the largest inferred non-recombining blocks, are biased because of this.

Finally, note that the effective population sizes inferred using BEAST correspond to the Ne of the populations during their recent history, rather than a value of Ne averaged over the history of the population. This explains the finding that, for most populations, we inferred Ne estimates much higher than generally assumed for humans by population geneticists (about 10,000).

Material and Methods

Genetic Markers

Autosomal Sequences

We used data from 20 noncoding, a priori neutral, and unlinked autosomal regions selected by Patin et al. (2009) to be at least 200 kb away from any known or predicted gene, to not be in linkage disequilibrium (LD) either with each other or with any known or predicted gene, and to have a region of homology with the chimpanzee genome. These regions are on average 1253 bp long. Using the four-gamete test (Hudson and Kaplan 1985) as implemented in IMgc online (Woerner et al. 2007), we identified recombination events for 6 of these 20 regions. As some methods used in this study cannot handle recombination, we retained for these six sequences the largest non-recombining block inferred by IMgc. Because of this reduction, the 20 regions used were on average 1228 bp long. To identify potential bias related to this method (e.g., some recombination events may not be detected using the four-gamete test; larger blocks of non-recombining sequence may select for gene trees that are shorter than expected), we computed the summary statistics and performed neutrality tests (discussed later) both on the whole sequences (table 1) and on the largest non-recombining blocks (supplementary table S12, Supplementary Material online).
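The principle of the four-gamete test can be sketched as follows: under an infinite-sites model without recurrent mutation, observing all four two-site haplotypes at a pair of biallelic sites implies at least one recombination event between them. This minimal sketch is not the IMgc implementation; the function name and toy haplotypes are hypothetical.

```python
from itertools import combinations

def four_gamete_violations(haplotypes):
    """Return pairs of site indices at which all four two-site gametes occur."""
    length = len(haplotypes[0])
    # Keep only biallelic sites; the test is defined for two alleles per site.
    biallelic = [i for i in range(length)
                 if len({h[i] for h in haplotypes}) == 2]
    violations = []
    for i, j in combinations(biallelic, 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations

# All four gametes present: incompatible with a single tree, so
# recombination is inferred between sites 0 and 1.
print(four_gamete_violations(["AA", "AT", "TA", "TT"]))
# Only three gametes present: compatible with no recombination.
print(four_gamete_violations(["AA", "AT", "TT"]))
```

As noted in the text, the test is conservative: recombination events that do not generate a fourth gamete go undetected.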

Mitochondrial Sequences

We used the first hypervariable segment of the mitochondrial control region (HVS-I), sequenced between positions 16067 and 16383, excluding the hypervariable poly-C region (sites 16179–16195). The total length of the sequence was thus 300 bp.
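The 300 bp figure follows from counting sites, assuming that both position ranges are inclusive (an assumption; the text does not state it explicitly):

```latex
\begin{align*}
L &= (16383 - 16067 + 1) - (16195 - 16179 + 1) \\
  &= 317 - 17 = 300 \ \text{bp}
\end{align*}
```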

Population Panel

For Africa, we used the autosomal sequences data set of Patin et al. (2009), which consists of five farmer populations (N = 118 individuals) and five Pygmy hunter-gatherer populations (N = 95). In addition, we used the HVS-I data set from Quintana-Murci et al. (2008), which consists of nine Central African farmer populations (N = 486) and seven Central African hunter-gatherer populations (N = 318) (supplementary table S1, Supplementary Material online).

For Eurasia, we used the autosomal sequences data set of Laval et al. (2010), consisting of 48 individuals from two East Asian populations (Han Chinese and Japanese) and 47 individuals from two European populations (Chuvash and Danes). We also used the data of Ségurel et al. (2013) from 48 individuals from one sedentary Central Asian population (Tajik farmers) and 48 individuals from one nomadic Central Asian population (Kyrgyz herders). For HVS-I, we analyzed data from 17 Eurasian populations (N = 494 in total), located from Eastern to Western Eurasia and belonging to several published data sets (Derenko et al. 2000; Richards et al. 2000; Bermisheva et al. 2002; Imaizumi et al. 2002; Yao et al. 2002; Kong et al. 2003; Quintana-Murci et al. 2004; supplementary table S1, Supplementary Material online).

For our detailed study of Central Asia, we used HVS-I sequences from 12 farmer populations (N = 408 in total) and 16 herder populations (N = 567 in total). For 25 of these populations, the data come from the studies by Chaix et al. (2007) and Heyer et al. (2009) (supplementary table S1, Supplementary Material online). The other three populations (KIB, TAB, and TKY) were sequenced for this study. As in Chaix et al. (2007) and Heyer et al. (2009), DNA was extracted from blood samples using standard protocols, and sequence quality was ensured as follows: each base pair was determined once with a forward and once with a reverse primer; any ambiguous base call was checked by additional and independent PCR and sequencing reactions; and all sequences were examined by two independent investigators. All sampled individuals were healthy donors from whom informed consent was obtained. The study was approved by the appropriate ethics committees and scientific organizations in all countries where samples were collected.

Demographic Inferences from Sequences Analysis

Summary Statistics and Neutrality Tests

We computed classical summary statistics (number of polymorphic sites, S; number of haplotypes, K) and four neutrality tests (Tajima's [1989] D, Fu and Li's [1993] D and F, and Fu's [1997] Fs) on both mitochondrial and autosomal sequences. Although neutrality tests were originally designed to detect selective events, they also give information about demographic processes, especially when applied to neutral markers. Indeed, expansion events lead to more negative values than expected in the absence of selective and demographic processes. Conversely, contraction events lead to more positive values of the neutrality tests. For HVS-I sequences, we computed all summary statistics and neutrality tests, and tested their departure from neutrality using the coalescent-based tests provided in DnaSP (Librado and Rozas 2009).
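To make the sign convention concrete, the following sketch computes Tajima's D from aligned sequences using the standard formulas of Tajima (1989). It is not the DnaSP or ARLEQUIN code, and the two toy samples are made up: one has only singleton mutations (as expected after an expansion), the other has two equally frequent, divergent haplotypes (as expected after a contraction).

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences (no missing data)."""
    n = len(seqs)
    # S: number of polymorphic (segregating) sites.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("D is undefined when there are no polymorphic sites")
    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data: an excess of singletons versus two balanced haplotypes.
star_like = ["100000", "010000", "001000", "000100", "000010", "000001"]
balanced = ["0000", "0000", "0000", "1111", "1111", "1111"]
print(tajimas_d(star_like), tajimas_d(balanced))
```

The first sample yields a negative D and the second a positive D, matching the interpretation given above.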

For the autosomal sequences, we used the procedure developed in Laval et al. (2010), which combines all autosomal sequences into a single test. This procedure consists of computing the mean value of each summary statistic across the 20 loci and testing whether this mean value departs significantly from its expectation under neutrality in a constant-size population model, using a simulation procedure. For a given population with sample size n, we produced 10⁵ simulated samples of the same size n under a constant population size model, using the generation-per-generation coalescent-based algorithm implemented in SIMCOAL v2 (Laval and Excoffier 2004). Each simulated individual consisted of 20 independent sequences of 1,253 bp (the average sequence length for the real data). Then, we used ARLEQUIN v3 (Excoffier et al. 2005), as modified by Laval et al. (2010), to compute the summary statistics on these simulated samples. We assessed whether the observed statistics differed significantly from the constant population model under neutrality by comparing these statistics with their null distribution obtained from the simulated data. We used gamma-distributed mutation rates with a mean value of 2.5 × 10⁻⁸/generation/site (95% confidence interval: 1.476 × 10⁻⁸ to 4.036 × 10⁻⁸), in agreement with previous studies (Pluzhnikov et al. 2002; Voight et al. 2005). This procedure yielded a P value for the significance of the departure from a constant-size model. We performed this procedure both on the whole sequences and on the largest non-recombining blocks. For the whole sequences (i.e., including recombination), we performed the data simulation under a coalescent model with recombination, using for each locus the recombination rate provided by the HapMap build GRCh37 genetic map (International HapMap Consortium 2003) (supplementary table S8, Supplementary Material online), whereas for the largest non-recombining blocks, we used a coalescent model without recombination.
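The comparison of an observed statistic with its simulated null distribution can be sketched as follows. This is an illustrative assumption rather than the procedure of Laval et al. (2010): the function name is hypothetical, the test is taken to be two-sided around the mean of the simulated values, and the usual +1 correction is applied so that the P value is never exactly zero.

```python
def empirical_p_value(observed, null_values):
    """Two-sided empirical P value of `observed` against simulated `null_values`.

    Counts the simulated values lying at least as far from the null mean as the
    observed value does, with a +1 correction in numerator and denominator.
    """
    n = len(null_values)
    center = sum(null_values) / n
    distance = abs(observed - center)
    extreme = sum(1 for v in null_values if abs(v - center) >= distance)
    return (extreme + 1) / (n + 1)


# Toy null distribution standing in for the 10^5 simulated mean statistics.
null = [-2, -1, 0, 1, 2]
print(empirical_p_value(2, null), empirical_p_value(5, null))
```

With 10⁵ simulated samples, the smallest P value attainable under this convention is about 10⁻⁵.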

Both for autosomes and for HVS-I sequences, we adjusted the obtained P values for each neutrality test using a false discovery rate (FDR) correction (Benjamini and Hochberg 1995) in R v2.14.1 (R Development Core Team 2011), in order to take into account the increased error probability in the case of multiple testing.
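The Benjamini–Hochberg adjustment itself is short enough to sketch. The function below is a minimal re-implementation for illustration (the P values are made up); it computes the same adjusted values as `p.adjust(p, method = "BH")` in R, although the text does not state which R function was used.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P values for one neutrality test across four populations.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```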

MCMC Estimations of Demographic Parameters

We used the MCMC algorithm implemented in BEAST v1.6 (Drummond and Rambaut 2007). We tested the four demographic models implemented in this software: constant effective population size (N0) (constant model); population expansion with an increasing growth rate (g) (exponential model); population expansion with a decreasing growth rate (g) (logistic model); and the expansion model, in which N0 is the present-day population size, N1 the population size that the model asymptotes to going into the distant past, and g the exponential growth rate that determines how fast the transition is from near the N1 population size to the N0 population size. In fact, BEAST estimates composite parameters for each model, namely N0μ and g/μ, where N0 is the current effective population size, g the growth rate, and μ the mutation rate. In addition, for the expansion model, the ratio between the current (N0) and ancestral (N1) effective population sizes is also estimated. To infer N0 and g from these composite parameters, we needed to assume a value for the mutation rate. However, there is no consensus on mutation rates in humans in the literature, as different methods lead to different estimates. For autosomes, the most commonly used value is the phylogenetic rate of μ = 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002). However, recent studies based on the 1000 Genomes Project (1000 Genomes Project Consortium 2012) have found a 2-fold lower rate (μ = 1.2 × 10⁻⁸/generation/site) by directly comparing genome-wide sequences from children and their parents (Conrad et al. 2011; Scally and Durbin 2012). We used both mutation rates here. Similarly, for HVS-I, estimated mutation rates are highly dependent on methodologies and modes of calibration (Endicott et al. 2009). We used both the lower and the higher estimated mutation rates: the transitional-changes mutation rate of μ = 5 × 10⁻⁶/generation/site (Forster et al. 1996) and the pedigree-based rate of μ = 10⁻⁵/generation/site (Howell et al. 1996; Heyer et al. 2001). We used a general time-reversible substitution model (Rodriguez et al. 1990). We assumed a generation time of 25 years, permitting comparison with previous human population genetics studies (e.g., Chaix et al. 2008; Patin et al. 2009; Laval et al. 2010). As BEAST cannot handle recombination events, we used the largest non-recombining block within each sequence (discussed earlier).

We performed three runs of 10⁷ steps per population and per demographic model for the HVS-I sequences, and three runs of 2 × 10⁸ steps (which corresponded to three runs of 10⁷ steps per locus) for the autosomal sequences. We recorded one tree every 1,000 steps, which thus implied a total of 10⁵ trees per locus and per run. We then removed the first 10% of the steps of each run (burn-in period) and combined the runs to obtain acceptable effective sample sizes (ESSs of 100 or above). The convergence of these runs was assessed using two methods: visual inspection of traces using Tracer v1.5 (Rambaut and Drummond 2007) to check for concordance between runs, and computation of Gelman and Rubin's (1992) convergence diagnostic using R v2.14.1 (R Development Core Team 2011) with the function gelman.diag available in the package coda (Plummer et al. 2006).
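The burn-in removal and the Gelman and Rubin (1992) diagnostic can be sketched as below. This is a simplified version of the potential scale reduction factor and assumes equally weighted chains; `gelman.diag` in coda applies an additional correction for sampling variability, so its output may differ slightly. The function name and the toy chains are hypothetical.

```python
from statistics import mean, variance


def gelman_rubin(chains, burn_in_fraction=0.1):
    """Basic potential scale reduction factor for one parameter.

    chains: list of lists of sampled values, one list per independent run.
    The first `burn_in_fraction` of each run is discarded before comparison.
    """
    kept = [c[int(len(c) * burn_in_fraction):] for c in chains]
    n = min(len(c) for c in kept)
    kept = [c[:n] for c in kept]
    within = mean(variance(c) for c in kept)           # W: mean within-chain variance
    between = n * variance([mean(c) for c in kept])    # B: between-chain variance
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Three toy runs that agree, and three that sample around different values.
agreeing = [[0, 1] * 50, [0, 1] * 50, [0, 1] * 50]
disagreeing = [[0, 1] * 50, [10, 11] * 50, [20, 21] * 50]
print(gelman_rubin(agreeing), gelman_rubin(disagreeing))
```

Values close to 1 indicate that the runs are sampling the same distribution; values well above 1 indicate a lack of convergence.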

To facilitate a large exploration of the parameter space, for the autosomal sequences, we chose uniform priors between 0 and 0.05 for 2N0μ and between −10⁹ and 10⁹ for g/μ. For HVS-I sequences, we chose uniform priors between 0 and 10 for N0μ and between −2.5 × 10⁶ and 2.5 × 10⁶ for g/μ, resulting in the same priors on N0 and g as for autosomal sequences if we assumed μ = 10⁻⁵/generation/site (i.e., N0 constrained between 0 and 10⁶ and g constrained between −1 and 1 per year). Conversely, if we assumed μ = 5 × 10⁻⁶/generation/site, it meant that N0 was constrained between 0 and 2 × 10⁶ and g was constrained between −0.5 and 0.5 per year.
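The correspondence between the bounds on the composite parameters and the bounds on N0 and g can be checked with a short conversion. The sketch below assumes that g/μ is expressed per generation and is divided by the 25-year generation time to obtain a yearly rate; the function name and the `size_factor` argument (2 for 2N0μ, 1 for N0μ) are hypothetical.

```python
GENERATION_TIME_YEARS = 25


def to_natural_units(size_composite, g_over_mu, mu, size_factor=1):
    """Convert composite parameters to (N0, g per year).

    size_composite: N0*mu (size_factor=1) or 2*N0*mu (size_factor=2).
    g_over_mu: growth rate divided by the mutation rate.
    mu: assumed mutation rate per generation per site.
    """
    n0 = size_composite / (size_factor * mu)
    g_per_year = g_over_mu * mu / GENERATION_TIME_YEARS
    return n0, g_per_year


# Upper prior bounds: HVS-I with the two rates, then autosomes with 2.5e-8.
print(to_natural_units(10, 2.5e6, 1e-5))
print(to_natural_units(10, 2.5e6, 5e-6))
print(to_natural_units(0.05, 1e9, 2.5e-8, size_factor=2))
```

The three calls return approximately (10⁶, 1), (2 × 10⁶, 0.5), and (10⁶, 1), in agreement with the bounds stated above.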

For each population and model, we obtained the mode and the 95% HPD of N0 and g, inferred from their posterior distributions (supplementary tables S4 and S7, Supplementary Material online), using the add-on package Locfit (Loader 1999) in R v2.14.1. We selected the best-fitting model among the four tested demographic models by estimating marginal likelihoods using two methods: path sampling and stepping-stone sampling (Baele et al. 2012). The model with the greatest marginal likelihood (supplementary table S3, Supplementary Material online) was considered the best-fitting model.

Extended Bayesian Skyline Plots

EBSPs (Heled and Drummond 2010), also implemented in BEAST, estimate demographic changes occurring continuously through time in a population, using the time intervals between successive coalescent events. This method allows a visualization of the evolution of Ne through time. As above, we combined three runs of 10⁷ steps for mitochondrial sequences and three runs of 2 × 10⁸ steps for autosomal sequences to obtain acceptable ESS values. We assumed the same mutation rates as above and a generation time of 25 years. Outputs were analyzed with Tracer v1.5 to visually check for convergence and ESS, and also to obtain the 95% HPD interval for the number of demographic changes that occurred in the population (supplementary table S5, Supplementary Material online). A constant population size could be rejected when the 95% HPD of the number of change points excluded 0 but included 1 (Heled and Drummond 2010). Then, we used R v2.14.1 to compute Gelman and Rubin's (1992) convergence diagnostic, as above, as well as to compute skyline plots. Finally, we used the population growth curves generated from BEAST to assess the time at which populations began to expand. Each skyline plot consisted of smoothed data points at approximately 10–20 generation intervals. We considered that the population increased (or decreased) when both the median and the 95% HPD values for Ne increased (or decreased) between more than two successive data points. Although this method did not provide a 95% HPD interval for the inferred expansion timings, this conservative approach ensured that we considered only relevant expansion signals.
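One possible reading of this rule is sketched below. It assumes that the skyline points are ordered from the oldest to the most recent, and that "more than two successive data points" means a run of at least three points over which the median and both HPD bounds rise together; the function name and the toy values are hypothetical.

```python
def first_sustained_increase(times, median, lower, upper, min_points=3):
    """Return the time at which the first sustained joint increase starts.

    times: time of each skyline point, ordered from oldest to most recent.
    median, lower, upper: median Ne and its 95% HPD bounds at each point.
    Returns None when no run of `min_points` jointly increasing points exists.
    """
    run_start = 0
    for k in range(1, len(times)):
        rising = (
            median[k] > median[k - 1]
            and lower[k] > lower[k - 1]
            and upper[k] > upper[k - 1]
        )
        if not rising:
            run_start = k          # the run is broken; restart from this point
        elif k - run_start + 1 >= min_points:
            return times[run_start]
    return None


# Toy skyline (years before present): growth starts at the 4,000-year point.
times = [5000, 4000, 3000, 2000, 1000, 0]
median = [10, 10, 11, 12, 13, 13]
lower = [5, 5, 6, 7, 8, 8]
upper = [20, 20, 22, 24, 26, 26]
print(first_sustained_increase(times, median, lower, upper))
```

Swapping the comparison signs gives the corresponding rule for contractions.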

Correlation Tests of Inferred Growth Rates and Isolation/Immigration Patterns

To test how differences in degrees of isolation and in migration patterns could have impacted our demographic inferences, we used ARLEQUIN v3.11 (Excoffier et al. 2005) to compute population-specific FST values (Weir and Hill 2002) from HVS-I data for Central Africa, Eurasia, and Central Asia (supplementary table S9, Supplementary Material online), and to estimate immigration rates from mismatch distributions under a spatially explicit model (Excoffier 2004) (supplementary table S10, Supplementary Material online). We then performed Spearman tests using R v2.14.1 to investigate, for each region, how the inferred parametric growth rates were correlated with those FST values and immigration rates. We used a value of 0 for the growth rate when the constant model best fitted the data.
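Because a growth rate of 0 is assigned whenever the constant model fits best, the growth rates can contain ties, which a rank correlation handles through average ranks. The sketch below is a minimal pure-Python version of Spearman's coefficient for illustration only (the analysis itself was run in R); the growth rates and FST values are made up.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical values: two populations best fitted by the constant model (g = 0).
growth = [0, 0, 0.01, 0.02]
fst = [0.10, 0.08, 0.03, 0.01]
print(spearman_rho(growth, fst))
```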

Supplementary Material

Supplementary figures S1 and S2 and tables S1–S12 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors would like to warmly thank all volunteer participants. They also thank Michael Fontaine for his help on some data analyses, Phillip Endicott and Julio Bendezu-Sarmiento for insightful discussions, Alexei Drummond for helpful discussions, and Friso Palstra for his help on English usage. They thank Laurent Excoffier, two anonymous reviewers, and the editor for helpful comments and suggestions. All computationally intensive analyses were run on the Linux cluster of the Muséum National d'Histoire Naturelle (administrated by Julio Pedraza) and on the web-based portal "Bioportal" (Kumar et al. 2009). This work was supported by Actions Transversales du Muséum - Muséum National d'Histoire Naturelle (grant "Les relations Sociétés-Natures dans le long terme") and Agence Nationale de la Recherche (grants "Altérité culturelle" ANR-10-ESVS-0010 and "Demochips" ANR-12-BSV7-0012). C.A. was financed by a PhD grant from the Centre National de la Recherche Scientifique.

References

1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.
Ambrose SH. 2001. Paleolithic technology and human evolution. Science 291:1748–1753.
Arenas M, Francois O, Currat M, Ray N, Excoffier L. 2013. Influence of admixture and Paleolithic range contractions on current European diversity gradients. Mol Biol Evol. 30:57–61.
Atkinson QD, Gray RD, Drummond AJ. 2009. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa. Proc R Soc Lond B Biol Sci. 276:367–373.
Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol. 29:2157–2167.
Balaresque PL, Ballereau SJ, Jobling MA. 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum Mol Genet. 16:R134–R139.
Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. 2011. Insights into the demographic history of African pygmies from complete mitochondrial genomes. Mol Biol Evol. 28:1099–1110.
Beaumont MA. 2004. Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92:365–379.
Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 57:289–300.
Bermisheva M, Salimova A, Korshunova T, Svyatova G, Berezina G, Villems R, Khusnutdinova E. 2002. Mitochondrial DNA diversity in the populations of Middle Asia and Northern Caucasus. Eur J Hum Genet. 10:179–180.
Bocquet-Appel JP. 2011. When the world's population took off: the springboard of the Neolithic Demographic Transition. Science 333:560–561.
Bocquet-Appel J-P, Bar-Yosef O. 2008. The Neolithic demographic transition and its consequences. Dordrecht (The Netherlands): Springer.


Chaix R, Austerlitz F, Hegay T, Quintana-Murci L, Heyer E. 2008. Genetic traces of east-to-west human expansion waves in Eurasia. Am J Phys Anthropol. 136:309–317.
Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, Heyer E. 2007. From social to genetic structures in central Asia. Curr Biol. 17:43–48.
Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. 2009. The Last Glacial Maximum. Science 325:710–714.
Conrad DF, Keebler JEM, DePristo MA, et al. (17 co-authors). 2011. Variation in genome-wide mutation rates within and between human families. Nat Genet. 43:712–714.
Derenko MV, Malyarchuk BA, Dambueva IK, Shaikhaev GO, Dorzhu CM, Nimaev DD, Zakharov IA. 2000. Mitochondrial DNA variation in two South Siberian aboriginal populations: implications for the genetic history of North Asia. Hum Biol. 72:945–973.
Dirksen VG, van Geel B. 2004. Mid to late Holocene climate change and its influence on cultural development in South Central Siberia. Nato Sci S Ss IV Ear. 42:291–307.
Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 7:214.
Endicott P, Ho SYW, Metspalu M, Stringer C. 2009. Evaluating the mitochondrial timescale of human evolution. Trends Ecol Evol. 24:515–521.
Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol. 13:853–864.
Excoffier L, Heckel G. 2006. Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet. 7:745–758.
Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online. 1:47–50.
Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A. 109:E2569–E2576.
Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet. 59:935–945.
Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.
Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.
Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci. 7:457–511.
Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol. 27:570–580.
Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Segurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet. 10:49.
Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol. 21:597–612.
Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet. 63:1113–1126.
Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res. 11:423–434.
Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet. 59:501–509.
Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.
Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med. 116:68–73.
International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.
Kingman JFC. 1982. The coalescent. Stoch Proc Appl. 13:235–248.
Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet. 73:671–676.

Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet. 2:e53.
Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.
Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.
Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.
Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.
Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A. 107:19201–19206.
Loader C. 1999. Local regression and likelihood. New York: Springer.
Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A. 103:9381–9386.
Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.
Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.
Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human populations on a global scale. Mol Biol Evol. 10:927–943.
Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.
Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet. 29:20–21.
Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet. 6:165–183.
Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev. 16:1125–1133.
Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet. 5:e1000448.
Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.
Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.
Plummer M, Best N, Cowles K, Vines K. 2006. CODA: Convergence diagnosis and output analysis for MCMC. R News 6:7–11.
Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.
Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet. 74:827–845.
Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A. 105:1596–1601.


R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.
Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer
Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol. 19:2092–2100.
Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol. 20:76–86.
Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet. 67:1251–1276.
Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol. 142:485–501.
Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.
Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet. 13:745–753.
Segurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet. 21:1146–1151.
Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.
Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet. 25:207–217.
Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.
Tremblay M, Vezina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet. 66:651–658.
Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol. 21:559–566.
Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol. 19:312–318.
Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture, and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol. 30:918–937.
Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A. 102:18508–18513.
Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet. 36:721–750.
Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.
Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet. 70:635–651.


Similarly, potential selection occurring on the whole mitochondrial genome (e.g., Pakendorf and Stoneking 2005) seems unlikely to have impacted all the studied populations within each group in the same way (e.g., stronger positive selection on sedentary than on nomadic populations), as we analyzed different nomadic and sedentary populations living near each other in several geographically distant areas.

Regarding the potential effects of recombination on theinferences from autosomal data we found that neutralitytests gave similar results on the whole sequences whenusing a simulation procedure that was taking the known re-combination rate of each sequence into account (table 1)and on the largest non-recombining blocks as inferred withIMgc without taking recombination into account in the sim-ulation process (supplementary table S12 SupplementaryMaterial online) It thus appears unlikely that the BEAST anal-yses that can only handle the largest inferred non-recombin-ing blocks are biased because of this

Finally note that the effective population sizes inferredusing BEAST correspond to the Ne of the populationsduring their recent history rather than a value of Ne averagedover the history of the population It explains the finding thatfor most populations we inferred Ne estimates much higherthan generally assumed for humans by population geneticists(about 10000)

Material and Methods

Genetic MarkersAutosomal SequencesWe used data from 20 noncoding a priori neutral and un-linked autosomal regions selected by Patin et al (2009) to beat least 200 kb away from any known or predicted gene tonot be in linkage disequilibrium (LD) neither with each othernor with any known or predicted gene and to have a regionof homology with the chimpanzee genome These regions areon average 1253 bp long Using the four-gamete test (Hudsonand Kaplan 1985) as implemented in IMgc online (Woerneret al 2007) we identified recombination events for 6 of these20 regions As some methods used in this study cannothandle recombination we retained for these six sequencesthe largest non-recombining block inferred by IMgc Becauseof this reduction the 20 regions used were on average1228 bp long To identify potential bias related to thismethod (eg some recombination events may not bedetected using the four-gamete test larger blocks of non-recombining sequence may select for gene trees that areshorter than expected) we computed the summary statisticsand performed neutrality tests (discussed later) both on thewhole sequences (table 1) and on the largest non-recombin-ing blocks (supplementary table S12 Supplementary Materialonline)

Mitochondrial SequencesWe used the first hypervariable segment of the mitochondrialcontrol region (HVS-I) sequenced between positions 16067and 16383 excluding the hypervariable poly-C region (sites16179ndash16195) The total length of the sequence was thus of300 bp

Population Panel

For Africa we used the autosomal sequences data set of Patinet al (2009) which consists of five farmer populations(N = 118 individuals) and five Pygmy hunter-gatherer popu-lations (N = 95) In addition we used the HVS-I data set fromQuintana-Murci et al (2008) which consists of nine CentralAfrican farmer populations (N = 486) and seven CentralAfrican hunter-gatherer populations (N = 318) (supplemen-tary table S1 Supplementary Material online)

For Eurasia we used the autosomal sequences data set ofLaval et al (2010) consisting of 48 individuals from two East-Asian populations (Han Chinese and Japanese) and 47 indi-viduals from two European populations (Chuvash andDanes) We also used the data from 48 individuals fromone sedentary Central Asian population (Tajik farmers) and48 individuals from one nomadic Central Asian population(Kyrgyz herders) of Segurel et al (2013) For HVS-I we ana-lyzed data from 17 Eurasian populations (N = 494 in total)located from Eastern to Western Eurasia belonging to severalpublished data sets (Derenko et al 2000 Richards et al 2000Bermisheva et al 2002 Imaizumi et al 2002 Yao et al 2002Kong et al 2003 Quintana-Murci et al 2004 supplementarytable S1 Supplementary Material online)

For our detailed study of Central Asia we used HVS-I se-quences from 12 farmer populations (N = 408 in total) and 16herder populations (N = 567 in total) These data come fromthe studies by Chaix et al (2007) and Heyer et al (2009) for 25populations (supplementary table S1 SupplementaryMaterial online) The other populations (KIB TAB andTKY) were sequenced for this study As in Chaix et al(2007) and Heyer et al (2009) DNA was extracted fromblood samples using standard protocols and the sequencequality was ensured as follows each base pair was determinedonce with a forward and once with a reverse primer anyambiguous base call was checked by additional and indepen-dent PCR and sequencing reactions all sequences were ex-amined by two independent investigators All sampledindividuals were healthy donors from whom informed con-sent was obtained The study was approved by appropriateEthic Committees and scientific organizations in all countrieswhere samples have been collected

Aimé et al. doi:10.1093/molbev/mst156 MBE

Demographic Inferences from Sequences Analysis

Summary Statistics and Neutrality Tests

We computed classical summary statistics (number of polymorphic sites S, number of haplotypes K) and four neutrality tests (Tajima's [1989] D, Fu and Li's [1993] D* and F*, and Fu's [1997] Fs) on both mitochondrial and autosomal sequences. Although neutrality tests were originally designed to detect selective events, they also give information about demographic processes, especially when applied to neutral markers. Indeed, expansion events lead to more negative values than expected in the absence of selective and demographic processes. Conversely, contraction events lead to more positive values of the neutrality tests. For HVS-I sequences, we computed all summary statistics and neutrality tests and tested their departure from neutrality using the coalescent-based tests provided in DnaSP (Librado and Rozas 2009).
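To illustrate why an excess of rare variants (as expected after an expansion) yields negative values, the following minimal Python sketch computes Tajima's (1989) D from aligned sequences. It is not the DnaSP implementation: the function name `tajimas_d` and the toy alignments are hypothetical, and gaps or missing data are not handled.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (no gaps or missing data)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least four sequences")
    # S: number of polymorphic (segregating) sites.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("D is undefined without polymorphic sites")
    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignments: only singletons versus intermediate frequencies.
excess_rare = ["AAAA", "TAAA", "ATAA", "AATA"]
excess_common = ["AA", "AA", "TT", "TT"]
print(tajimas_d(excess_rare) < 0, tajimas_d(excess_common) > 0)
```

In the first toy sample every variant is a singleton, so the mean pairwise difference falls below S/a1 and D is negative; in the second, variants at intermediate frequency push D above zero.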

For the autosomal sequences, we used the procedure developed in Laval et al. (2010), which combines all autosomal sequences into a single test. This procedure consists in computing the mean value of each summary statistic across the 20 loci and in testing, using a simulation procedure, whether this mean value departs significantly from its expectation under neutrality in a constant-size population model. For a given population with sample size n, we produced 10⁵ simulated samples of the same size n under a constant population size model, using the generation-per-generation coalescent-based algorithm implemented in SIMCOAL v2 (Laval and Excoffier 2004). Each simulated individual consisted of 20 independent sequences of 1,253 bp (the average sequence length for the real data). Then, we used ARLEQUIN v3 (Excoffier et al. 2005), as modified by Laval et al. (2010), to compute the summary statistics on these simulated samples. We assessed whether the observed statistics differed significantly from the constant population model under neutrality by comparing these statistics with their null distribution obtained from the simulated data. We used gamma-distributed mutation rates with a mean value of 2.5 × 10⁻⁸/generation/site (95% confidence interval: 1.476 × 10⁻⁸ to 4.036 × 10⁻⁸), in agreement with previous studies (Pluzhnikov et al. 2002; Voight et al. 2005). This procedure yielded a P value for the significance of the departure from a constant-size model. We performed this procedure both on the whole sequences and on the largest nonrecombining blocks. For the whole sequences (i.e., including recombination), we performed the data simulation under a coalescent model with recombination, using for each locus the recombination rate provided by the HapMap build GRCh37 genetic map (International HapMap Consortium 2003) (supplementary table S8, Supplementary Material online), whereas for the largest nonrecombining blocks, we used a coalescent model without recombination.
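The comparison of an observed multilocus mean with its simulated null distribution can be sketched as below. This is an illustration only: the coalescent simulations themselves (SIMCOAL, ARLEQUIN) are not reproduced, the function names are hypothetical, and the two-sided doubling rule is an assumption, since the text does not state how the tails were handled.

```python
from statistics import mean


def mean_statistic(per_locus_values):
    """Average one summary statistic over loci (20 loci in the study)."""
    return mean(per_locus_values)


def empirical_p_value(observed_mean, simulated_means):
    """Two-sided Monte Carlo P value against a simulated null distribution.

    Assumption: the smaller tail proportion is doubled and capped at 1.
    """
    n = len(simulated_means)
    lower = sum(1 for v in simulated_means if v <= observed_mean) / n
    upper = sum(1 for v in simulated_means if v >= observed_mean) / n
    return min(1.0, 2 * min(lower, upper))


# Hypothetical stand-in for the simulated null means of one statistic.
null_means = list(range(-50, 51))
print(empirical_p_value(-50, null_means), empirical_p_value(0, null_means))
```

With 10⁵ simulated samples, as in the procedure described above, the smallest P value this estimator can return is 2 × 10⁻⁵.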

Both for autosomes and for HVS-I sequences, we adjusted the P values obtained for each neutrality test using a false discovery rate (FDR) correction (Benjamini and Hochberg 1995) in R v2.14.1 (R Development Core Team 2011), to account for the increased error probability under multiple testing.
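For readers unfamiliar with the Benjamini and Hochberg (1995) adjustment, a minimal Python sketch follows. The analysis itself was run in R, and the exact call used is not stated in the text; the function name `bh_adjust` and the example P values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest to the smallest P value, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical P values for four tests.
print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

Each P value is scaled by m/rank, and the running minimum ensures that a smaller raw P value never receives a larger adjusted value than a larger one.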

MCMC Estimations of Demographic Parameters

We used the MCMC algorithm implemented in BEAST v1.6 (Drummond and Rambaut 2007). We tested the four demographic models implemented in this software: constant effective population size (N0) (constant model); population expansion with an increasing growth rate (g) (exponential model); population expansion with a decreasing growth rate (g) (logistic model); and the expansion model, in which N0 is the present-day population size, N1 the population size that the model asymptotes to going into the distant past, and g the exponential growth rate that determines how fast the transition is from near the N1 population size to the N0 population size. In fact, BEAST estimates composite parameters for each model, namely N0μ and g/μ, where N0 is the current effective population size, g the growth rate, and μ the mutation rate. In addition, for the expansion model, the ratio between the current (N0) and ancestral (N1) effective population sizes is also estimated. To infer N0 and g from these composite parameters, we needed to assume a value for the mutation rate. However, there is no consensus on human mutation rates in the literature, as different methods lead to different estimates. For autosomes, the most commonly used value is the phylogenetic rate of μ = 2.5 × 10⁻⁸/generation/site (Pluzhnikov et al. 2002). However, recent studies based on the 1000 Genomes Project (1000 Genomes Project Consortium 2012) have found a 2-fold lower rate (μ = 1.2 × 10⁻⁸/generation/site) by directly comparing genome-wide sequences from children and their parents (Conrad et al. 2011; Scally and Durbin 2012). We used both mutation rates here. Similarly, for HVS-I, estimated mutation rates are highly dependent on methodologies and modes of calibration (Endicott et al. 2009). We used both the lower and the higher estimated mutation rates: the transitional changes mutation rate of μ = 5 × 10⁻⁶/generation/site (Forster et al. 1996) and the pedigree-based rate of μ = 10⁻⁵/generation/site (Howell et al. 1996; Heyer et al. 2001). We used a general time-reversible substitution model (Rodriguez et al. 1990). We assumed a generation time of 25 years, permitting comparison with previous human population genetics studies (e.g., Chaix et al. 2008; Patin et al. 2009; Laval et al. 2010). As BEAST cannot handle recombination events, we used the largest nonrecombining block within each sequence (discussed earlier).
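The shapes of three of the demographic models described above can be sketched as functions of time before present. This follows the verbal description only and is not BEAST's code: the parameterization of the expansion model is an assumption consistent with "N1 is the size the model asymptotes to in the distant past", the logistic model is omitted, and all function names and numerical values are hypothetical.

```python
from math import exp


def constant_size(t, n0):
    """Constant model: the size is N0 at every time t before present."""
    return n0


def exponential_size(t, n0, g):
    """Exponential model: size shrinks going back in time when g > 0."""
    return n0 * exp(-g * t)


def expansion_size(t, n0, n1, g):
    """Expansion model (assumed form): N0 at present, tending to N1 in the past."""
    return n1 + (n0 - n1) * exp(-g * t)


# Hypothetical values: t in generations, g per generation.
for t in (0, 100, 1000):
    print(t, exponential_size(t, 10000, 0.005), expansion_size(t, 10000, 1000, 0.005))
```

Time t and the growth rate g must use matching units (for example, generations and per generation).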

We performed three runs of 10⁷ steps per population and per demographic model for the HVS-I sequence and three runs of 2 × 10⁸ steps (which corresponded to three runs of 10⁷ steps per locus) for the autosomal sequences. We recorded one tree every 1,000 steps, which thus implied a total of 10⁵ trees per locus and per run. We then removed the first 10% of steps of each run (burn-in period) and combined the runs to obtain acceptable effective sample sizes (ESSs of 100 or above). The convergence of these runs was assessed using two methods: visual inspection of traces using Tracer v1.5 (Rambaut and Drummond 2007) to check for concordance between runs, and computation of Gelman and Rubin's (1992) convergence diagnostic using R v2.14.1 (R Development Core Team 2011) with the function gelman.diag available in the package coda (Plummer et al. 2006).
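The burn-in removal and the Gelman and Rubin (1992) diagnostic can be illustrated with the short sketch below. It implements only the basic potential scale reduction factor; the coda function used in the study may apply further corrections, so values need not match exactly. The function names and toy chains are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def discard_burn_in(chain, burn_in_percent=10):
    """Drop the first burn_in_percent of the sampled values of one run."""
    return chain[len(chain) * burn_in_percent // 100:]


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    if any(len(c) != n for c in chains):
        raise ValueError("chains must have equal length")
    within = mean(variance(c) for c in chains)          # W
    between = n * variance([mean(c) for c in chains])   # B
    pooled = (n - 1) / n * within + between / n
    return sqrt(pooled / within)


# Hypothetical toy runs: three runs that agree, then three that do not.
base = [0, 1, 2, 3] * 25
agreeing = [discard_burn_in(base) for _ in range(3)]
shifted = [discard_burn_in([v + s for v in base]) for s in (0, 10, 20)]
print(gelman_rubin(agreeing), gelman_rubin(shifted))
```

Values close to 1 indicate that between-run variance is small relative to within-run variance, which is the concordance between runs that the visual trace inspection also checks.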

To facilitate a large exploration of the parameter space, for the autosomal sequences, we chose uniform priors between 0 and 0.05 for 2N0μ and between −10⁹ and 10⁹ for g/μ. For HVS-I sequences, we chose uniform priors for N0μ between 0 and 10 and for g/μ between −2.5 × 10⁶ and 2.5 × 10⁶, resulting in the same priors on N0 and g as for autosomal sequences if we assumed μ = 10⁻⁵/generation/site (i.e., N0 constrained between 0 and 10⁶ and g constrained between −1 and 1 per year). Conversely, if we assumed μ = 5 × 10⁻⁶/generation/site, N0 was constrained between 0 and 2 × 10⁶ and g between −0.5 and 0.5 per year.
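The correspondence between the bounds on the composite parameters and the bounds on N0 and g can be checked with a few lines of arithmetic. The sketch below assumes the 25-year generation time stated in the text; the function name `natural_scale` and its `n0_multiplier` argument are hypothetical conveniences.

```python
GENERATION_TIME_YEARS = 25  # generation time assumed in the text


def natural_scale(n0_composite, g_over_mu, mu, n0_multiplier=1):
    """Convert composite values to N0 and to g per year.

    n0_composite equals n0_multiplier * N0 * mu (multiplier 2 for the
    autosomal prior on 2*N0*mu); mu is per generation per site.
    """
    n0 = n0_composite / (n0_multiplier * mu)
    g_per_year = g_over_mu * mu / GENERATION_TIME_YEARS
    return n0, g_per_year


# Upper prior bounds quoted in the text.
print(natural_scale(0.05, 1e9, 2.5e-8, n0_multiplier=2))  # autosomes
print(natural_scale(10, 2.5e6, 1e-5))                     # HVS-I, pedigree rate
print(natural_scale(10, 2.5e6, 5e-6))                     # HVS-I, lower rate
```

Up to floating-point rounding, the first two calls give N0 = 10⁶ and g = 1 per year, and the third gives N0 = 2 × 10⁶ and g = 0.5 per year, matching the bounds stated above.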

For each population and model, we obtained the mode and the 95% HPD of N0 and g, inferred from their posterior distributions (supplementary tables S4 and S7, Supplementary Material online), using the add-on package Locfit (Loader 1999) in R v2.14.1. We selected the best-fitting model among the four tested demographic models by estimating marginal likelihoods using two methods: path sampling and stepping-stone sampling (Baele et al. 2012). The model with the greater marginal likelihood (supplementary table S3, Supplementary Material online) was considered the best-fitting model.
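As a rough illustration of the two summaries described above, the sketch below computes a sample-based 95% HPD interval (the shortest window containing 95% of the posterior samples) and picks the model with the greatest marginal likelihood. It does not reproduce the Locfit density estimation used for the mode, and the function names and numerical values are hypothetical.

```python
def hpd_interval(samples, mass_percent=95):
    """Shortest interval containing mass_percent of the sorted samples."""
    ordered = sorted(samples)
    n = len(ordered)
    k = -(-mass_percent * n // 100)  # ceiling of mass_percent% of n
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


def best_model(log_marginal_likelihoods):
    """Name of the model with the greatest (log) marginal likelihood."""
    return max(log_marginal_likelihoods, key=log_marginal_likelihoods.get)


# Hypothetical posterior samples with a long right tail.
samples = list(range(95)) + [1000, 2000, 3000, 4000, 5000]
print(hpd_interval(samples))
print(best_model({"constant": -1050.2, "exponential": -1043.7}))
```

Unlike an equal-tailed interval, the HPD interval excludes the sparse right tail here because doing so gives the shortest window.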

Extended Bayesian Skyline Plots

EBSPs (Heled and Drummond 2010), also implemented in BEAST, estimate demographic changes occurring continuously through time in a population, using the time intervals between successive coalescent events. This method allows a visualization of the evolution of Ne through time. As above, we combined three runs of 10⁷ steps for mitochondrial sequences and three runs of 2 × 10⁸ steps for autosomal sequences to obtain acceptable ESS values. We assumed the same mutation rates as above and a generation time of 25 years. Outputs were analyzed with Tracer v1.5 to visually check for convergence and ESS, and also to obtain the 95% HPD interval for the number of demographic changes that occurred in the population (supplementary table S5, Supplementary Material online). A constant population size could be rejected when the 95% HPD of the number of change points excluded 0 but included 1 (Heled and Drummond 2010). Then, we used R v2.14.1 to compute Gelman and Rubin's (1992) convergence diagnostic as above, as well as to compute skyline plots. Finally, we used the population growth curves generated from BEAST to assess the time at which populations began to expand. Each skyline plot consisted of smoothed data points at approximately 10–20 generation intervals. We considered that the population increased (or decreased) when both the median and 95% HPD values for Ne increased (or decreased) between more than two successive data points. Although this method did not provide a 95% HPD interval for the inferred expansion timings, this conservative approach ensured that we considered only relevant expansion signals.
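The rule used to date the start of an expansion from a skyline plot can be sketched as follows. The reading of "more than two successive data points" as a run of at least three points is an assumption, as are the function name `expansion_onset` and the toy series; the study's own R code is not shown in the text.

```python
def expansion_onset(times, median, lower, upper, min_points=3):
    """Time of the first point of the earliest sustained increase, or None.

    All series are ordered from the oldest to the most recent data point.
    A step counts only if the median and both HPD bounds increase.
    """
    def rises(i):
        return (median[i + 1] > median[i]
                and lower[i + 1] > lower[i]
                and upper[i + 1] > upper[i])

    steps_needed = min_points - 1
    run = 0
    for i in range(len(times) - 1):
        run = run + 1 if rises(i) else 0
        if run >= steps_needed:
            return times[i - steps_needed + 1]
    return None


# Hypothetical skyline: times in years before present, oldest first.
times = [5000, 4000, 3000, 2000, 1000, 0]
median = [10, 10, 11, 12, 13, 14]
lower = [5, 5, 5, 6, 7, 8]
upper = [20, 20, 22, 24, 26, 28]
print(expansion_onset(times, median, lower, upper))
```

In this toy series the median starts rising at 4,000 years, but the lower HPD bound only follows from 3,000 years, so the conservative rule dates the onset to 3,000 years before present.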

Correlation Tests of Inferred Growth Rates and Isolation/Immigration Patterns

To test how differences in degrees of isolation and migration patterns could have impacted our demographic inferences, we used ARLEQUIN v3.11 (Excoffier et al. 2005) to compute population-specific FST values (Weir and Hill 2002) from HVS-I data for Central Africa, Eurasia, and Central Asia (supplementary table S9, Supplementary Material online) and to estimate immigration rates from mismatch distributions under a spatially explicit model (Excoffier 2004) (supplementary table S10, Supplementary Material online). We then performed Spearman tests using R v2.14.1 to investigate, for each region, how the inferred parametric growth rates were correlated with those FST values and immigration rates. We used a value of 0 for the growth rate when the constant model best fitted the data.
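A minimal sketch of the Spearman rank correlation follows, with tied values (such as several populations assigned a growth rate of 0) sharing their average rank. It returns only the correlation coefficient, not the significance test performed in R, and the function names and example values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return cov / sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))


# Hypothetical values: growth rate 0 stands for a best-fitting constant model.
growth_rates = [0.0, 0.0, 0.01, 0.02]
fst_values = [0.2, 0.15, 0.05, 0.01]
print(spearman_rho(growth_rates, fst_values))
```

A strongly negative coefficient in such a comparison would indicate that more isolated populations (higher FST) tend to show lower inferred growth rates.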

Supplementary Material

Supplementary figures S1 and S2 and tables S1–S12 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors would like to warmly thank all volunteer participants. They also thank Michael Fontaine for his help on some data analyses, Phillip Endicott and Julio Bendezu-Sarmiento for insightful discussions, Alexei Drummond for helpful discussions, and Friso Palstra for his help on English usage. They thank Laurent Excoffier, two anonymous reviewers, and the editor for helpful comments and suggestions. All computationally intensive analyses were run on the Linux cluster of the Muséum National d'Histoire Naturelle (administrated by Julio Pedraza) and on the web-based portal "Bioportal" (Kumar et al. 2009). This work was supported by Actions Transversales du Muséum - Muséum National d'Histoire Naturelle (grant "Les relations Sociétés-Natures dans le long terme") and Agence Nationale de la Recherche (grants "Altérité culturelle" ANR-10-ESVS-0010 and "Demochips" ANR-12-BSV7-0012). C.A. was financed by a PhD grant from the Centre National de la Recherche Scientifique.

References

1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.

Ambrose SH. 2001. Paleolithic technology and human evolution. Science 291:1748–1753.

Arenas M, Francois O, Currat M, Ray N, Excoffier L. 2013. Influence of admixture and Paleolithic range contractions on current European diversity gradients. Mol Biol Evol. 30:57–61.

Atkinson QD, Gray RD, Drummond AJ. 2009. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa. Proc R Soc Lond B Biol Sci. 276:367–373.

Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol. 29:2157–2167.

Balaresque PL, Ballereau SJ, Jobling MA. 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum Mol Genet. 16:R134–R139.

Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. 2011. Insights into the demographic history of African pygmies from complete mitochondrial genomes. Mol Biol Evol. 28:1099–1110.

Beaumont MA. 2004. Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92:365–379.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 57:289–300.

Bermisheva M, Salimova A, Korshunova T, Svyatova G, Berezina G, Villems R, Khusnutdinova E. 2002. Mitochondrial DNA diversity in the populations of Middle Asia and Northern Caucasus. Eur J Hum Genet. 10:179–180.

Bocquet-Appel JP. 2011. When the world's population took off: the springboard of the Neolithic Demographic Transition. Science 333:560–561.

Bocquet-Appel J-P, Bar-Yosef O. 2008. The Neolithic demographic transition and its consequences. Dordrecht (The Netherlands): Springer.

Chaix R, Austerlitz F, Hegay T, Quintana-Murci L, Heyer E. 2008. Genetic traces of east-to-west human expansion waves in Eurasia. Am J Phys Anthropol. 136:309–317.

Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, Heyer E. 2007. From social to genetic structures in central Asia. Curr Biol. 17:43–48.

Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. 2009. The Last Glacial Maximum. Science 325:710–714.

Conrad DF, Keebler JEM, DePristo MA, et al. (17 co-authors). 2011. Variation in genome-wide mutation rates within and between human families. Nat Genet. 43:712–714.

Derenko MV, Malyarchuk BA, Dambueva IK, Shaikhaev GO, Dorzhu CM, Nimaev DD, Zakharov IA. 2000. Mitochondrial DNA variation in two South Siberian aboriginal populations: implications for the genetic history of North Asia. Hum Biol. 72:945–973.

Dirksen VG, van Geel B. 2004. Mid to late Holocene climate change and its influence on cultural development in South Central Siberia. Nato Sci S Ss IV Ear. 42:291–307.

Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 7:214.

Endicott P, Ho SYW, Metspalu M, Stringer C. 2009. Evaluating the mitochondrial timescale of human evolution. Trends Ecol Evol. 24:515–521.

Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol. 13:853–864.

Excoffier L, Heckel G. 2006. Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet. 7:745–758.

Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online. 1:47–50.

Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A. 109:E2569–E2576.

Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet. 59:935–945.

Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.

Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.

Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci. 7:457–511.

Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol. 27:570–580.

Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Ségurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet. 10:49.

Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol. 21:597–612.

Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet. 63:1113–1126.

Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res. 11:423–434.

Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet. 59:501–509.

Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.

Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med. 116:68–73.

International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.

Kingman JFC. 1982. The coalescent. Stoch Proc Appl. 13:235–248.

Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet. 73:671–676.

Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet. 2:e53.

Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.

Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.

Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.

Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.

Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A. 107:19201–19206.

Loader C. 1999. Local regression and likelihood. New York: Springer.

Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A. 103:9381–9386.

Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.

Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.

Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human populations on a global scale. Mol Biol Evol. 10:927–943.

Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.

Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet. 29:20–21.

Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet. 6:165–183.

Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev. 16:1125–1133.

Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet. 5:e1000448.

Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.

Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.

Plummer M, Best N, Cowles K, Vines K. 2006. CODA: Convergence diagnosis and output analysis for MCMC. R News 6:7–11.

Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.

Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet. 74:827–845.

Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A. 105:1596–1601.

R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.

Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer

Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol. 19:2092–2100.

Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol. 20:76–86.

Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet. 67:1251–1276.

Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol. 142:485–501.

Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.

Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet. 13:745–53.

Ségurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet. 21:1146–1151.

Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.

Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet. 25:207–217.

Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

Tremblay M, Vezina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet. 66:651–658.

Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol. 21:559–566.

Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol. 19:312–318.

Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture, and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol. 30:918–937.

Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A. 102:18508–18513.

Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet. 36:721–750.

Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.

Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet. 70:635–651.

16

Aime et al doi101093molbevmst156 MBE

Page 13: Human Genetic Data Reveal Contrasting Demographic Patterns between Sedentary and Nomadic Populations That Predate the Emergence of Farming

their departure from neutrality using the coalescent-basedtests provided in DnaSP (Librado and Rozas 2009)

For the autosomal sequences we used the procedure de-veloped in Laval et al (2010) which combines all autosomalsequences into a single test This procedure consists in com-puting the mean value of each summary statistics across the20 loci and in testing whether this mean value departs signif-icantly from its expectation under neutrality in a constant-size population model using a simulation procedure For agiven population with sample size n we produced 105 simu-lated samples of the same size n under a constant populationsize model using the generation-per-generation coalescent-based algorithm implemented in SIMCOAL v2 (Laval andExcoffier 2004) Each simulated individual was constitutedby 20 independent sequences of 1253 bp (the average se-quence length for the real data)Then we used ARLEQUIN v3(Excoffier et al 2005) as modified by Laval et al (2010) tocompute the summary statistics on these simulated samplesWe assessed whether the observed statistics differed signifi-cantly from the constant population model under neutralityby comparing these statistics with their null distribution ob-tained from the simulated data We used gamma distributedmutation rates with a mean value of 25 108generationsite (95 confidence interval 1476 108

4036 108)in agreement with previous studies (Pluzhnikov et al 2002Voight et al 2005) This procedure yielded a P value for thesignificance of the departure from a constant size model Weperformed this procedure both on the whole sequences andon the largest non-recombining blocks For the whole se-quences (ie including recombination) we performed thedata simulation under a coalescent model with recombina-tion using for each locus the recombination rate provided bythe HapMap build GRCh37 genetic map (InternationalHapMap Consortium 2003) (supplementary table S8Supplementary Material online) whereas for the largestnon-recombining blocks we used a coalescent model withoutrecombination

Both for autosomes and for HVS-I sequences we adjustedthe obtained P values for each neutrality test using a falsediscovery rate (FDR) correction (Benjamini and Hochberg1995) in R v2141 (R Development Core Team 2011) inorder to take into account the increased error probability inthe case of multiple testing

MCMC Estimations of Demographic ParametersWe used the MCMC algorithm implemented in BEAST v16(Drummond and Rambaut 2007) We tested the four demo-graphic models implemented in this software constant effec-tive population size (N0) (constant model) populationexpansion with an increasing growth rate (g) (exponentialmodel) population expansion with an decreasing growthrate (g) (logistic model) and the expansion model in whichN0 is the present day population size N1 the population sizethat the model asymptotes to going into the distant past andg the exponential growth rate that determines how fast thetransition is from near the N1 population size to N0 popula-tion size In fact BEAST estimates composite parameters foreach model namely N0 and g where N0 is the current

effective population size g the growth rate and the muta-tion rate In addition for the expansion model the ratio be-tween the current (N0) and ancestral (N1) effectivepopulation size is also estimated To infer N0 and g fromthese composite parameters we needed to assume a valuefor the mutation rate However there is no consensus formutation rates in humans in the literature as different meth-ods lead to different estimations For autosomes the mostcommonly used value is the phylogenetic rate of= 25 108generationsite (Pluzhnikov et al 2002)However recent studies based on the 1000 genome project(1000 Genomes Project Consortium 2012) have found a2-fold lower rate (= 12 108generationsite) by directlycomparing genome-wide sequences from children and theirparents (Conrad et al 2011 Scally and Durbin 2012) We usedhere both mutation rates Similarly for HVS-I estimated mu-tation rates are highly dependent on methodologies andmodes of calibration (Endicott et al 2009) We used boththe lower and the higher estimated mutation rates the tran-sitional changes mutation rate of = 5 106generationsite (Forster et al 1996) and the pedigree-based rate of= 105generationsite (Howell et al 1996 Heyer et al2001) We used a general time-reversible substitution model(Rodriguez et al 1990) We assumed a generation time of 25years permitting the comparison with previous human pop-ulation genetics studies (eg Chaix et al 2008 Patin et al 2009Laval et al 2010) As BEAST cannot handle recombinationevents we used the largest nonrecombining block withineach sequence (discussed earlier)

We performed three runs of 107 steps per population andper demographic model for the HVS-I sequence and threeruns of 2 108 steps (which corresponded to three runs of107 steps per locus) for the autosomal sequences We re-corded one tree every 1000 steps which thus implied atotal of 105 trees per locus and per run We then removedthe first 10 steps of each run (burn-in period) and combinedthe runs to obtain acceptable effective sample sizes (ESSs of100 or above) The convergence of these runs was assessedusing two methods visual inspection of traces using Tracerv15 (Rambaut and Drummond 2007) to check for concor-dance between runs and computation of Gelman and Rubinrsquos(1992) convergence diagnostic using R v2141 (RDevelopment Core Team 2011) with the function gelmandiag available in the package coda (Plummer et al 2006)

To facilitate a large exploration of the parameter space forthe autosomal sequences we chose uniform priors between 0and 005 for 2N0 and between 109 and 109 for g ForHVS-I sequences we chose uniform priors for N0 between 0and 10 and for gm between25 106 and 25 106 result-ing for the same priors on N0 and g than for autosomalsequences if we assumed = 105generationsite (ieN0 constrained between 0 and 106 and g constrained between1 and 1 per year) Conversely if we assumed = 5 106generationsite it meant that N0 was constrained between 0and 2 106 and g was constrained between05 and 05 peryear

For each population and model we obtained themode and the 95 HPD of N0 and g inferred from their

13

Demographic Patterns between Sedentary and Nomadic Populations doi101093molbevmst156 MBE

posterior distributions (supplementary tables S4 and S7Supplementary Material online) using the add-on packageLocfit (Loader 1999) in R v2141 We selected the best-fittingmodel among the four tested demographic models by esti-mating marginal likelihoods using two methods path sam-pling and stepping-stone sampling (Baele et al 2012) Themodel with the greater marginal likelihood (supplementarytable S3 Supplementary Material online) was considered asthe best-fitting model

Extended Bayesian Skyline PlotsEBSPs (Heled and Drummond 2010) also implemented inBEAST estimate demographic changes occurring continu-ously through time in a population using the time intervalsbetween successive coalescent events This method allows avisualization of the evolution of Ne through time As abovewe combined three runs of 107 steps for mitochondrial se-quences and three runs of 2 108 steps for autosomal se-quences to obtain acceptable ESS values We assumed thesame mutation rates as above and a generation time of 25years Outputs were analyzed with Tracer v15 to visuallycheck for convergence and ESS also to obtain the 95 HPDinterval for the number of demographic changes that oc-curred in the population (supplementary table S5Supplementary Material online) A constant population sizecould be rejected when the 95 HPD of the number ofchange points excluded 0 but included 1 (Heled andDrummond 2010) Then we used R v2141 to computeGelman and Rubinrsquos (1992) convergence diagnostic asabove as well as to compute skyline plots Finally we usedthe population growth curves generated from BEAST toassess the time at which populations began to expandEach Skyline plot consisted of smoothed data points atamp10ndash20 generation intervals We consider that the popula-tion increased (or decreased) when both the median and 95HPD values for Ne increased (or decreased) between morethan two successive data points Although this method didnot provide a 95 HPD interval for the inferred expansiontimings this conservative approach ensured that we consid-ered only relevant expansion signals

Correlation Tests of Inferred Growth Rates andIsolationImmigration Patterns

To test how isolation degrees and migration patterns differ-ences could have impacted our demographic inferences weused ARLEQUIN v311 (Excoffier et al 2005) to compute pop-ulation-specific FST values (Weir and Hill 2002) from HVS-Idata for Central Africa Eurasia and Central Asia (supplemen-tary table S9 Supplementary Material online) and estimateimmigration rates from mismatch distributions under a spa-tially explicit model (Excoffier 2004) (supplementary tableS10 Supplementary Material online) We performed thenSpearman tests using R v2141 to investigate for eachregion how the inferred parametric growth rates were corre-lated with those FST values and immigration rates We used avalue of 0 for the growth rate when the constant model bestfitted the data

Supplementary Material

Supplementary figures S1 and S2 and tables S1–S12 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors would like to warmly thank all volunteer participants. They also thank Michael Fontaine for his help on some data analyses, Phillip Endicott and Julio Bendezu-Sarmiento for insightful discussions, Alexei Drummond for helpful discussions, and Friso Palstra for his help on English usage. They thank Laurent Excoffier, two anonymous reviewers, and the editor for helpful comments and suggestions. All computationally intensive analyses were run on the Linux cluster of the Muséum National d’Histoire Naturelle (administrated by Julio Pedraza) and on the web-based portal “Bioportal” (Kumar et al. 2009). This work was supported by Actions Transversales du Muséum - Muséum National d’Histoire Naturelle (grant “Les relations Sociétés-Natures dans le long terme”) and Agence Nationale de la Recherche (grants “Altérité culturelle” ANR-10-ESVS-0010 and “Demochips” ANR-12-BSV7-0012). C.A. was financed by a PhD grant from the Centre National de la Recherche Scientifique.

References

1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.

Ambrose SH. 2001. Paleolithic technology and human evolution. Science 291:1748–1753.

Arenas M, Francois O, Currat M, Ray N, Excoffier L. 2013. Influence of admixture and Paleolithic range contractions on current European diversity gradients. Mol Biol Evol. 30:57–61.

Atkinson QD, Gray RD, Drummond AJ. 2009. Bayesian coalescent inference of major human mitochondrial DNA haplogroup expansions in Africa. Proc R Soc Lond B Biol Sci. 276:367–373.

Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol. 29:2157–2167.

Balaresque PL, Ballereau SJ, Jobling MA. 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum Mol Genet. 16:R134–R139.

Batini C, Lopes J, Behar DM, Calafell F, Jorde LB, van der Veen L, Quintana-Murci L, Spedini G, Destro-Bisol G, Comas D. 2011. Insights into the demographic history of African pygmies from complete mitochondrial genomes. Mol Biol Evol. 28:1099–1110.

Beaumont MA. 2004. Recent developments in genetic data analysis: what can they tell us about human demographic history? Heredity 92:365–379.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 57:289–300.

Bermisheva M, Salimova A, Korshunova T, Svyatova G, Berezina G, Villems R, Khusnutdinova E. 2002. Mitochondrial DNA diversity in the populations of Middle Asia and Northern Caucasus. Eur J Hum Genet. 10:179–180.

Bocquet-Appel JP. 2011. When the world’s population took off: the springboard of the Neolithic Demographic Transition. Science 333:560–561.

Bocquet-Appel J-P, Bar-Yosef O. 2008. The Neolithic demographic transition and its consequences. Dordrecht (The Netherlands): Springer.

Chaix R, Austerlitz F, Hegay T, Quintana-Murci L, Heyer E. 2008. Genetic traces of east-to-west human expansion waves in Eurasia. Am J Phys Anthropol. 136:309–317.

Chaix R, Quintana-Murci L, Hegay T, Hammer MF, Mobasher Z, Austerlitz F, Heyer E. 2007. From social to genetic structures in central Asia. Curr Biol. 17:43–48.

Clark PU, Dyke AS, Shakun JD, Carlson AE, Clark J, Wohlfarth B, Mitrovica JX, Hostetler SW, McCabe AM. 2009. The Last Glacial Maximum. Science 325:710–714.

Conrad DF, Keebler JEM, DePristo MA, et al. (17 co-authors). 2011. Variation in genome-wide mutation rates within and between human families. Nat Genet. 43:712–714.

Derenko MV, Malyarchuk BA, Dambueva IK, Shaikhaev GO, Dorzhu CM, Nimaev DD, Zakharov IA. 2000. Mitochondrial DNA variation in two South Siberian aboriginal populations: implications for the genetic history of North Asia. Hum Biol. 72:945–973.

Dirksen VG, van Geel B. 2004. Mid to late Holocene climate change and its influence on cultural development in South Central Siberia. Nato Sci S Ss IV Ear. 42:291–307.

Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 7:214.

Endicott P, Ho SYW, Metspalu M, Stringer C. 2009. Evaluating the mitochondrial timescale of human evolution. Trends Ecol Evol. 24:515–521.

Excoffier L. 2004. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol. 13:853–864.

Excoffier L, Heckel G. 2006. Computer programs for population genetics data analysis: a survival guide. Nat Rev Genet. 7:745–758.

Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online. 1:47–50.

Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A. 109:E2569–E2576.

Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet. 59:935–945.

Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.

Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.

Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci. 7:457–511.

Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol. 27:570–580.

Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Segurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet. 10:49.

Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol. 21:597–612.

Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet. 63:1113–1126.

Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res. 11:423–434.

Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet. 59:501–509.

Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.

Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med. 116:68–73.

International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.

Kingman JFC. 1982. The coalescent. Stoch Proc Appl. 13:235–248.

Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet. 73:671–676.

Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet. 2:e53.

Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.

Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.

Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.

Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.

Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A. 107:19201–19206.

Loader C. 1999. Local regression and likelihood. New York: Springer.

Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A. 103:9381–9386.

Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.

Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.

Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human populations on a global scale. Mol Biol Evol. 10:927–943.

Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.

Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet. 29:20–21.

Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet. 6:165–183.

Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev. 16:1125–1133.

Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet. 5:e1000448.

Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.

Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.

Plummer M, Best N, Cowles K, Vines K. 2006. CODA: Convergence diagnosis and output analysis for MCMC. R News 6:7–11.

Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.

Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet. 74:827–845.

Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A. 105:1596–1601.

R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.

Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer

Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol. 19:2092–2100.

Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol. 20:76–86.

Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet. 67:1251–1276.

Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol. 142:485–501.

Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.

Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet. 13:745–53.

Segurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet. 21:1146–1151.

Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.

Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet. 25:207–217.

Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

Tremblay M, Vezina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet. 66:651–658.

Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol. 21:559–566.

Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol. 19:312–318.

Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture, and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol. 30:918–937.

Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A. 102:18508–18513.

Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet. 36:721–750.

Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.

Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet. 70:635–651.


Page 15: Human Genetic Data Reveal Contrasting Demographic Patterns between Sedentary and Nomadic Populations That Predate the Emergence of Farming

Chaix R Austerlitz F Hegay T Quintana-Murci L Heyer E 2008 Genetictraces of east-to-west human expansion waves in Eurasia Am J PhysAnthropol 136309ndash317

Chaix R Quintana-Murci L Hegay T Hammer MF Mobasher ZAusterlitz F Heyer E 2007 From social to genetic structures incentral Asia Curr Biol 1743ndash48

Clark PU Dyke AS Shakun JD Carlson AE Clark J Wohlfarth BMitrovica JX Hostetler SW McCabe AM 2009 The Last GlacialMaximum Science 325710ndash714

Conrad DF Keebler JEM DePristo MA et al (17 co-authors) 2011Variation in genome-wide mutation rates within and betweenhuman families Nat Genet 43712ndash714

Derenko MV Malyarchuk BA Dambueva IK Shaikhaev GO DorzhuCM Nimaev DD Zakharov IA 2000 Mitochondrial DNA variationin two South Siberian aboriginal populations implications for thegenetic history of North Asia Hum Biol 72945ndash973

Dirksen VG van Geel B 2004 Mid to late Holocene climate change andits influence on cultural development in South Central Siberia NatoSci S Ss IV Ear 42291ndash307

Drummond AJ Rambaut A 2007 BEAST Bayesian evolutionary analysisby sampling trees BMC Evol Biol 7214

Endicott P Ho SYW Metspalu M Stringer C 2009 Evaluating the mi-tochondrial timescale of human evolution Trends Ecol Evol 24515ndash521

Excoffier L 2004 Patterns of DNA sequence diversity and genetic struc-ture after a range expansion lessons from the infinite-island modelMol Ecol 13853ndash864

Excoffier L Heckel G 2006 Computer programs for population geneticsdata analysis a survival guide Nat Rev Genet 7745ndash758

Excoffier L Laval G Schneider S 2005 Arlequin (version 30) an inte-grated software package for population genetics data analysis EvolBioinform Online 147ndash50

Fontaine MC, Snirc A, Frantzis A, Koutrakis E, Ozturk B, Ozturk AA, Austerlitz F. 2012. History of expansion and anthropogenic collapse in a top marine predator of the Black Sea estimated from genetic data. Proc Natl Acad Sci U S A. 109:E2569–E2576.

Forster P, Harding R, Torroni A, Bandelt HJ. 1996. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet. 59:935–945.

Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.

Fu YX, Li WH. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.

Gelman A, Rubin DB. 1992. Inference from iterative simulation using multiple sequences. Stat Sci. 7:457–511.

Heled J, Drummond AJ. 2010. Bayesian inference of species trees from multilocus data. Mol Biol Evol. 27:570–580.

Heyer E, Balaresque P, Jobling MA, Quintana-Murci L, Chaix R, Segurel L, Aldashev A, Hegay T. 2009. Genetic diversity and the emergence of ethnic groups in Central Asia. BMC Genet. 10:49.

Heyer E, Chaix R, Pavard S, Austerlitz F. 2012. Sex-specific demographic behaviours that shape human genomic variation. Mol Ecol. 21:597–612.

Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. 2001. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet. 63:1113–1126.

Ho SYW, Shapiro B. 2011. Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Res. 11:423–434.

Howell N, Kubacka I, Mackey DA. 1996. How rapidly does the human mitochondrial genome evolve? Am J Hum Genet. 59:501–509.

Hudson RR, Kaplan NL. 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.

Imaizumi K, Parsons TJ, Yoshino M, Holland MM. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. Int J Legal Med. 116:68–73.

International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.

Kingman JFC. 1982. The coalescent. Stoch Proc Appl. 13:235–248.

Kong QP, Yao YG, Sun C, Bandelt HJ, Zhu CL, Zhang YP. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. Am J Hum Genet. 73:671–676.

Kumar V, Langstieh BT, Madhavi KV, Naidu VM, Singh HP, Biswas S, Thangaraj K, Singh L, Reddy BM. 2006. Global patterns in human mitochondrial DNA and Y-chromosome variation caused by spatial instability of the local cultural processes. PLoS Genet. 2:e53.

Kumar S, Skjaeveland A, Orr RJS, Enger P, Ruden T, Mevik BH, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics 10:357.

Laval G, Excoffier L. 2004. SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487.

Laval G, Patin E, Barreiro LB, Quintana-Murci L. 2010. Formulating a historical and demographic model of recent human evolution based on resequencing data from noncoding regions. PLoS One 5:e10284.

Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.

Liu W, Jin CZ, Zhang YQ, et al. (13 co-authors). 2010. Human remains from Zhirendong, South China, and modern human emergence in East Asia. Proc Natl Acad Sci U S A. 107:19201–19206.

Loader C. 1999. Local regression and likelihood. New York: Springer.

Mellars P. 2006a. Why did human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci U S A. 103:9381–9386.

Mellars P. 2006b. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931–935.

Nei M. 1995. The origins of human populations: genetic, linguistic, and archeological data. In: Brenner S, Hanihara K, editors. The origin and past of modern humans as viewed from DNA. Singapore: World Scientific. p. 71–91.

Nei M, Roychoudhury AK. 1993. Evolutionary relationships of human populations on a global scale. Mol Biol Evol. 10:927–943.

Nelson SM. 1993. The archaeology of Korea. Cambridge: Cambridge University Press. p. 307.

Oota H, Settheetham-Ishida W, Tiwawech D, Ishida T, Stoneking M. 2001. Human mtDNA and Y-chromosome variation is correlated with matrilocal versus patrilocal residence. Nat Genet. 29:20–21.

Pakendorf B, Stoneking M. 2005. Mitochondrial DNA and human evolution. Annu Rev Genomics Hum Genet. 6:165–183.

Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC. 1997. Orbital forcing of climate over South Africa: a 200,000-year rainfall record from the Pretoria Saltpan. Quat Sci Rev. 16:1125–1133.

Patin E, Laval G, Barreiro LB, et al. (15 co-authors). 2009. Inferring the demographic history of African farmers and pygmy hunter-gatherers using a multilocus resequencing data set. PLoS Genet. 5:e1000448.

Pavlov P, Svendsen J, Indrelid SI. 2001. Human presence in the European Arctic nearly 40,000 years ago. Nature 413:64–67.

Phillipson DW. 1993. African archaeology. Cambridge: Cambridge University Press.

Plummer M, Best N, Cowles K, Vines K. 2006. CODA: Convergence diagnosis and output analysis for MCMC. R News 6:7–11.

Pluzhnikov A, Di Rienzo A, Hudson RR. 2002. Inferences about human demography based on multilocus analyses of noncoding sequences. Genetics 161:1209–1218.

Quintana-Murci L, Chaix R, Wells RS, et al. (17 co-authors). 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet. 74:827–845.

Quintana-Murci L, Quach H, Harmant C, et al. (23 co-authors). 2008. Maternal traces of deep common ancestry and asymmetric gene flow between Pygmy hunter-gatherers and Bantu-speaking farmers. Proc Natl Acad Sci U S A. 105:1596–1601.


R Development Core Team. 2011. R: a language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing.

Rambaut A, Drummond AJ. 2007. Tracer ver. 1.5. [cited 2013 Sep 24]. Available from: http://tree.bio.ed.ac.uk/software/tracer

Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol Biol Evol. 19:2092–2100.

Ray N, Currat M, Excoffier L. 2003. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol. 20:76–86.

Richards M, Macaulay V, Hickey E, et al. (37 co-authors). 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet. 67:1251–1276.

Rodriguez F, Oliver JL, Marin A, Medina JR. 1990. The general stochastic model of nucleotide substitution. J Theor Biol. 142:485–501.

Sauer CO. 1952. Agricultural origins and dispersals. Cambridge (MA): American Geographical Society.

Scally A, Durbin R. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet. 13:745–753.

Segurel L, Austerlitz F, Toupance B, et al. (16 co-authors). 2013. Selection of protective variants for type 2 diabetes from the Neolithic onward: a case study in Central Asia. Eur J Hum Genet. 21:1146–1151.

Shea JJ. 2009. The impact of projectile weaponry on Late Pleistocene hominin evolution. In: Hublin J-J, Richards MP, editors. Evolution of hominin diets. Dordrecht (The Netherlands): Springer. p. 189–199.

Short R. 1982. The biological basis for the contraceptive effects of breast feeding. Int J Gynecol Obstet. 25:207–217.

Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

Tremblay M, Vezina H. 2000. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet. 66:651–658.

Valeggia C, Ellison PT. 2009. Interactions between metabolic and reproductive functions in the resumption of postpartum fecundity. Am J Hum Biol. 21:559–566.

Verdu P, Austerlitz F, Estoup A, et al. (14 co-authors). 2009. Origins and genetic diversity of pygmy hunter-gatherers from Western Central Africa. Curr Biol. 19:312–318.

Verdu P, Becker NS, Froment A, et al. (12 co-authors). 2013. Sociocultural behavior, sex-biased admixture, and effective population sizes in Central African Pygmies and non-Pygmies. Mol Biol Evol. 30:918–937.

Voight BF, Adams AM, Frisse LA, Qian YD, Hudson RR, Di Rienzo A. 2005. Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci U S A. 102:18508–18513.

Weir BS, Hill WG. 2002. Estimating F-statistics. Ann Rev Genet. 36:721–750.

Woerner AE, Cox MP, Hammer MF. 2007. Recombination-filtered genomic datasets by information maximization. Bioinformatics 23:1851–1853.

Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet. 70:635–651.
