A common genetic origin for early farmers from Mediterranean Cardial and Central European LBK...


Article

A Common Genetic Origin for Early Farmers from Mediterranean Cardial and Central European LBK Cultures

Iñigo Olalde,†,1 Hannes Schroeder,†,2,3 Marcela Sandoval-Velasco,2 Lasse Vinner,2 Irene Lobón,1 Oscar Ramirez,1 Sergi Civit,4 Pablo García Borja,5 Domingo C. Salazar-García,5,6,7,8 Sahra Talamo,8 Josep María Fullola,9 Francesc Xavier Oms,9 Mireia Pedro,9 Pablo Martínez,9,10 Montserrat Sanz,11 Joan Daura,11,12 João Zilhão,9,11,13 Tomàs Marquès-Bonet,1,13 M. Thomas P. Gilbert,2 and Carles Lalueza-Fox*,1

1 Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Barcelona, Spain
2 Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
3 Faculty of Archaeology, Leiden University, Leiden, The Netherlands
4 Department of Statistics, Faculty of Biology, University of Barcelona, Barcelona, Spain
5 Departament de Prehistòria i Arqueologia, Universitat de València, València, Spain
6 Department of Archaeology, University of Cape Town, Cape Town, South Africa
7 LAMPEA UMR 7269, Maison Méditerranéenne des Sciences de l’Homme (MMSH), Aix-en-Provence, France
8 Department of Human Evolution, Max-Planck Institute for Evolutionary Anthropology, Leipzig, Germany
9 Seminari Estudis i Recerques Prehistòriques (SERP; SGR2014-00108), Departament de Prehistòria, H. Antiga i Arqueologia, Facultat de Geografia i Història, Universitat de Barcelona, Barcelona, Spain
10 Col·lectiu per a la Investigació de la Prehistòria i l’Arqueologia del Garraf-Ordal, CIPAG, Begues, Spain
11 Centro de Arqueologia, Faculdade de Letras, Universidade de Lisboa (UNIARQ), Alameda da Universidade, Lisboa, Portugal
12 GRQ, Grup de Recerca del Quaternari, Seminari Estudis i Recerques Prehistòriques (SERP; SGR2014-00108), Departament de Prehistòria, H. Antiga i Arqueologia, Facultat de Geografia i Història, Universitat de Barcelona, Barcelona, Spain
13 Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain

†These authors contributed equally to this work.

*Corresponding author: E-mail: [email protected].

Associate editor: Sohini Ramachandran

Abstract

The spread of farming out of the Balkans and into the rest of Europe followed two distinct routes: an initial expansion represented by the Impressa and Cardial traditions, which followed the Northern Mediterranean coastline; and another expansion represented by the LBK (Linearbandkeramik) tradition, which followed the Danube River into Central Europe. Although genomic data now exist from samples representing the second migration, such data have yet to be successfully generated from the initial Mediterranean migration. To address this, we generated the complete genome of a 7,400-year-old Cardial individual (CB13) from Cova Bonica in Vallirana (Barcelona), as well as partial nuclear data from five others excavated from different sites in Spain and Portugal. CB13 clusters with all previously sequenced early European farmers and modern-day Sardinians. Furthermore, our analyses suggest that both Cardial and LBK peoples derived from a common ancient population located in or around the Balkan Peninsula. The Iberian Cardial genome also carries a discernible hunter–gatherer genetic signature that likely was not acquired by admixture with local Iberian foragers. Our results indicate that retrieving ancient genomes from similarly warm Mediterranean environments such as the Near East is technically feasible.

Key words: Neolithic, paleogenomics, Cardial ware, migration.

© The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]

Open Access. Mol. Biol. Evol. doi:10.1093/molbev/msv181. Advance Access publication September 2, 2015. MBE Advance Access published September 24, 2015.

Introduction

The introduction of farming into Europe around 8,000 years ago was a major demographic transition. It involved a substantial replacement of the preexisting hunter–gatherer populations by migrants of ultimate Near Eastern origin as well as new adaptive challenges (Sánchez-Quinto et al. 2012; Skoglund et al. 2012, 2014; Gamba et al. 2014; Lazaridis et al. 2014; Olalde et al. 2014; Allentoft et al. 2015; Haak et al. 2015). Initially these early farmers settled in the Balkan Peninsula, developing what today is referred to as the Starčevo–Kőrös–Criş culture (Whittle 1996) (fig. 1A). Archaeological evidence suggests that these farmers spread subsequently throughout Europe along at least two distinctive routes. Expansion along the first route commenced approximately 5,900 years BCE, and is represented by the distinct Impressa culture that spread along the central and Western Mediterranean basin. A later aspect of this culture, named Cardial for the use of the serrated edge of cockle shells in pottery decoration (fig. 1B), reached the Iberian Peninsula no later than 5,500 years BCE (Before Common Era) (fig. 1A) (Martins et al. 2015); however, the inundation of the ancient Neolithic coastline hampers our understanding of the origins and dynamics of this first expansion. The second expansion occurred in parallel with the Cardial into large areas of Central Europe along the Danube River (fig. 1A), by a culture today referred to as the Linearbandkeramik (or LBK) for the banded decoration patterns found in their pottery. From an archaeological point of view, the spread of the Cardial culture, with a distinctive package of pottery, polished stone axes and domesticates, seems to be a migration process similar to that represented by the LBK expansion (Zilhão 1993). However, the rapid expansion of the Cardial culture along the Iberian coast, as suggested by radiocarbon dates and its restricted littoral distribution, has been interpreted as the result of maritime pioneer colonization (Zilhão 2001).

Palaeogenomics represents a powerful tool for refining our understanding of such past events, and thanks to the advent of next-generation sequencing (NGS) technologies, to date 37 complete (>1×) genome drafts are available from prehistoric Eurasians, spanning from the Upper Paleolithic to the Bronze and Iron Ages (Allentoft et al. 2015; Olalde and Lalueza-Fox 2015). In addition, many more specimens have been genotyped for about 400,000 polymorphisms (Haak et al. 2015). All these ancient data have been used to reconstruct past population movements as well as adaptive challenges and selective events that arose during European prehistory.

Given its geographic location on the far western edge of Europe, the Iberian Peninsula is a critical place for estimating the final impact of the substantial population dispersals that originated in the continent’s Eastern periphery (among which are included the initial Neolithic migration and the Late Neolithic/Bronze Age steppe migration).

Previous studies on ancient Cardial specimens have been restricted to the analysis of uniparental markers, especially mitochondrial DNA (mtDNA), by a traditional polymerase chain reaction (PCR) approach (Lacan, Keyser, Ricaut, Brucato, Tarrús, et al. 2011; Gamba et al. 2012). Although the resolution of these genetic markers is limited, the mtDNA haplogroup composition of the few available Cardial individuals pointed to a Near East connection and suggested a pioneer colonization from that region (Lacan, Keyser, Ricaut, Brucato, Tarrús, et al. 2011; Gamba et al. 2012).

Fig. 1. Early Neolithic Cardial culture. (A) Main cultural horizons associated with the earliest Neolithic of Central and Western Europe approximately 6,000–5,500 cal BCE. 1, Cova Bonica; 2, Cova de la Sarsa; 3, Cova de l’Or; 4, Galeria da Cisterna-Almonda. (B) Cardial ceramics from Cova de la Sarsa. The impressed decoration is characteristically made with the serrated edge of cockle shells.

Cardial specimens have yet to be analyzed using NGS techniques, although later Neolithic samples from Central and Northeastern Iberia have yielded genotype data that cluster them with other early European farmers (Haak et al. 2015). The lack of data from the Cardial Neolithic can be explained by the scarcity of associated human remains, as well as the warm climatic conditions of the Mediterranean, which are largely unfavorable for DNA preservation (García-Garcerà et al. 2011; Hofreiter et al. 2014). Therefore, the genomic affinities of the Western Mediterranean’s first farmers have remained unknown until now.

To clarify the origin and population affinities of this early Mediterranean migration, we analyzed six individuals from some of the oldest securely dated Cardial Iberian sites: two from Cova Bonica (Barcelona) (supplementary fig. S1, Supplementary Material online), one from Cova de l’Or (Alicante), one from Cova de la Sarsa (Valencia), and two from the Galeria da Cisterna locus of the Almonda karst system (Portugal) (table 1). All calibrated radiocarbon dates range between approximately 5,470 and 5,220 years BCE (supplementary table S1, Supplementary Material online). As expected, the endogenous content of all samples was low. Therefore, we combined shotgun sequencing with a recently validated, commercially available human whole-genome capture assay (Ávila-Arcos et al. 2015; Schroeder et al. 2015) to generate the complete genome for one of the Cova Bonica specimens (CB13) and partial genome data for the remaining samples.

Results and Discussion

Whole-genome capture enriched the endogenous DNA in our libraries by an average of 5- to 15-fold, consistent with performance on similarly degraded materials (Carpenter et al. 2013; Ávila-Arcos et al. 2015; Schroeder et al. 2015). However, low endogenous DNA content (below 1% in all samples except CB13) and low complexity only allowed us to retrieve one complete genome (1.1× coverage) from the female Cova Bonica CB13 sample (table 1 and supplementary fig. S2, Supplementary Material online). For the remaining samples, we obtained mtDNA genomes at 0.7–64× coverage and limited nuclear data (between 0.0003× and 0.0129×) (table 1 and supplementary table S2, Supplementary Material online).
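As a rough illustration of what "fold enrichment" means here, the sketch below divides the postcapture human DNA percentage by the shotgun percentage for each sample, using the per-sample values listed in Table 1. The function and variable names are hypothetical, and a per-sample ratio of summary percentages is only an approximation: it need not match the library-level averages quoted in the text.

```python
# Human DNA content (%) in shotgun and postcapture data, copied from Table 1.
# CB14 is left out because no postcapture value is reported for it.
ENDOGENOUS_PCT = {
    "CB13": (5.68, 28.23),
    "F19": (0.58, 9.66),
    "G21": (0.09, 1.27),
    "H3C6": (0.06, 1.01),
    "CS7675": (0.08, 1.14),
}


def fold_enrichment(shotgun_pct: float, postcapture_pct: float) -> float:
    """Ratio of postcapture to shotgun endogenous content (hypothetical helper)."""
    if shotgun_pct <= 0:
        raise ValueError("shotgun percentage must be positive")
    return postcapture_pct / shotgun_pct


folds = {sample: fold_enrichment(*pcts) for sample, pcts in ENDOGENOUS_PCT.items()}

for sample, fold in folds.items():
    print(f"{sample}: {fold:.1f}-fold")
```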

We estimated very low levels (0.11%) of modern contamination at the mtDNA level for CB13 (supplementary tables S3 and S4, Supplementary Material online). The remaining samples also showed low mtDNA contamination levels (<5%) with the exception of one of the samples from Almonda (F19), which had an estimated 29% of contamination of unknown origin (supplementary table S3, Supplementary Material online). Both pre- and postcapture sequences showed the typical ancient DNA deamination pattern at the end of the reads (Brotherton et al. 2007), with deamination percentages over 20%, with the exception of F19 (supplementary fig. S3, Supplementary Material online).

We were able to determine the mtDNA haplogroups from the six Cardial individuals. The presence among our samples of haplogroups K1a, H3, and H4 (supplementary table S5, Supplementary Material online) is consistent with previous results obtained for 20 specimens including 10 Cardials from four Iberian Early Neolithic sites (Lacan, Keyser, Ricaut, Brucato, Tarrús, et al. 2011; Gamba et al. 2012). In these samples the authors describe N*, K (K1a), H (H3), U5, T2b, and X1 lineages, all of which are also present in other early Neolithic samples from the LBK culture of Central Europe (Brandt et al. 2013; Gamba et al. 2014). The haplotype K1a2a found in CB13 has been previously described in an Epicardial individual from Els Trocs (Spain) dated to 5,177–5,069 years cal BCE (Haak et al. 2015). On the other hand, the haplotype X2c from CB14 is quite rare in modern Europeans, but it has been found in a few Neolithic samples from France (Deguilloux et al. 2011; Lacan, Keyser, Ricaut, Brucato, Duranthon, et al. 2011) and Germany (Lee et al. 2012). Unfortunately, information on the Y-chromosome could not be obtained due to the low genomic coverage of the male samples.

At the phenotypic level, CB13 has derived alleles for the SLC24A5 pigmentation gene (supplementary tables S6–S8, Supplementary Material online) and appears heterozygous for the SLC45A2 skin pigmentation gene (supplementary tables S6 and S7, Supplementary Material online), both associated with light skin in Europeans. The same light skin-related genotypic combination is also seen in several Early, Middle, and Late Neolithic individuals from Hungary (Gamba et al. 2014). Despite uncertainties associated with the low coverage, the HIrisPlex pigmentation prediction (Walsh et al. 2013) yields the highest probability of this individual having dark hair (0.679). Due to a combination of missing sites (including the critical rs12913832 single nucleotide polymorphism [SNP] for blue eyes) and heterozygous sites at the OCA2/HERC2 haplotype (supplementary table S9, Supplementary Material online), the color of the iris could not be conclusively determined. At the rs4988235 site (which has a regulatory effect on the LCT gene), CB13 shows the ancestral variant associated with the inability to digest milk during adulthood (Itan et al. 2009), sharing this trait with all Neolithic individuals analyzed to date (Gamba et al. 2014; Lazaridis et al. 2014).

Table 1. Summary Statistics of the Sequenced Cardial Specimens.

| Sample | Site | Genome Coverage | Human DNA Shotgun (%) | Human DNA Postcapture (%) | Radiocarbon Age (cal BCE, 1σ) | Sex | mtDNA Haplogroup | mtDNA Coverage | Contamination (%) |
|---|---|---|---|---|---|---|---|---|---|
| CB13 | Cova Bonica | 1.1085 | 5.68 | 28.23 | 5,470–5,360 | F | K1a2a | 353.29 | 0.11 |
| CB14 | Cova Bonica | 0.0003 | 0.25 | n/a | n/a [a] | F | X2c | 4.1 | 1.35 |
| F19 | Almonda cave | 0.0129 | 0.58 | 9.66 | 5,310–5,220 | F | H4a1a | 33.79 | 29.13 |
| G21 | Almonda cave | 0.0039 | 0.09 | 1.27 | 5,330–5,230 | M | H3 | 63.55 | 1.97 |
| H3C6 | Cova de l’Or | 0.0011 | 0.06 | 1.01 | 5,360–5,310 | M | H4a1a | 4.45 | 3.96 |
| CS7675 | Cova de la Sarsa | 0.0012 | 0.08 | 1.14 | 5,321–5,227 | M [b] | K1a4a1 | 0.69 | n/a |

[a] CB14 is not directly radiocarbon dated, although it comes from the same stratigraphic layer as CB13.
[b] CS7675 is determined to be a male from morphology.

To infer the general ancestry of CB13 we performed Principal Component Analysis (PCA) with a large data set of present-day Europeans and Near Eastern individuals (Lazaridis et al. 2014), as well as on ancient individuals from different studies (Keller et al. 2012; Fu et al. 2014; Gamba et al. 2014; Lazaridis et al. 2014; Olalde et al. 2014; Raghavan et al. 2014; Seguin-Orlando et al. 2014; Skoglund et al. 2014; Haak et al. 2015) (fig. 2A and supplementary fig. S4, Supplementary Material online). Our Cardial individual clusters with other Neolithic samples, including LBK individuals from Germany (Stuttgart) and Hungary (NE1) and early Neolithic samples from Iberia, as well as later Middle Neolithic and Copper Age individuals (Keller et al. 2012; Gamba et al. 2014; Lazaridis et al. 2014; Haak et al. 2015). As previously noticed (Keller et al. 2012; Gamba et al. 2014; Lazaridis et al. 2014; Skoglund et al. 2014), all these prehistoric farmers also plot close to present-day southern Europeans, in particular to Sardinians. Interestingly, these individuals and CB13 plot in between extant Near Easterners and prehistoric European hunter–gatherers, suggesting that they also share some ancestry with the latter. This pattern is consistent with previous observations (Lazaridis et al. 2014; Skoglund et al. 2014) and with our ADMIXTURE (Alexander et al. 2009) analysis (fig. 2B), where part of the CB13 female’s ancestry is assigned to a component characteristic of hunter–gatherer populations.

To examine which modern-day and ancient populations show the greatest shared genetic drift with the Cardial genome, we used outgroup f3-statistics (Reich et al. 2009). Among extant populations, we found the highest scores of shared genetic drift for Sardinians and, to a lesser extent, for Basques (fig. 3A). Among ancient populations/individuals, CB13 shows the highest shared genetic drift with other Neolithic individuals from different parts of Europe (fig. 3B).
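To make the idea of "shared genetic drift" concrete, here is a minimal sketch of an outgroup f3-statistic computed from allele frequencies. It uses the textbook definition f3(O; A, B) = mean over SNPs of (o − a)(o − b), where o, a, and b are the frequencies of the same allele in the outgroup and in the two test populations. This is a simplification under stated assumptions: it omits the finite-sample bias correction and the block-jackknife standard errors that dedicated software applies, and the function name is hypothetical rather than taken from the study.

```python
from typing import Sequence


def outgroup_f3(outgroup: Sequence[float],
                pop_a: Sequence[float],
                pop_b: Sequence[float]) -> float:
    """Naive outgroup f3: mean of (o - a) * (o - b) across SNPs.

    Each argument holds per-SNP frequencies of the same reference allele.
    Larger values indicate more drift shared by A and B since their split
    from the outgroup. No bias correction or standard error is computed.
    """
    if not (len(outgroup) == len(pop_a) == len(pop_b)):
        raise ValueError("all frequency vectors must have the same length")
    if len(outgroup) == 0:
        raise ValueError("at least one SNP is required")
    total = sum((o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b))
    return total / len(outgroup)


# Toy frequencies for four SNPs (illustrative values only, not real data).
toy_f3 = outgroup_f3([0.0, 0.0, 1.0, 0.5],
                     [1.0, 0.5, 1.0, 0.5],
                     [1.0, 0.0, 0.0, 0.5])
print(toy_f3)
```

Ranking candidate populations by this value against a fixed individual (here, CB13) and a fixed outgroup is what produces plots such as fig. 3.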

The Basques have been traditionally considered one of the oldest human groups in Europe, inhabiting a marginal area in the Pyrenean mountain range and exhibiting genetic continuity since pre-Neolithic times (Cavalli-Sforza et al. 1994). The fact that modern Basque peoples speak the sole surviving relict of a pre-Indo-European language in Western Europe (the Euskera or Basque language) could have also contributed to their isolation (Renfrew and Bahn 1991). The existence of some autochthonous mtDNA subhaplogroups (Cardoso et al. 2013) has been used to further support the Basque singularity among Europeans, although this has been recently questioned by genome-wide data (Laayouni et al. 2010).

Our analysis suggests that the geographic isolation of Sardinians and Basques partially preserved the originally widespread early Neolithic population component, more than any other populations in Europe. Considering that Basques speak a pre-Indo-European language, this finding also indicates that the expansion of Indo-European languages is unlikely to have taken place during the early Neolithic. This is in agreement with the recently characterized genetic influx from the steppes in the Late Neolithic/Chalcolithic, which has been associated with the spread of Indo-European languages into Western Europe (Allentoft et al. 2015; Haak et al. 2015).

To ascertain where CB13 (and other early farmers) acquired their hunter–gatherer genetic component, we computed D statistics (Durand et al. 2011) of the form D(Hunter-gatherer1, Hunter-gatherer2; Neolithic farmer, chimpanzee), testing whether a given Neolithic farmer is significantly closer to one of the two hunter–gatherers. We used La Braña 1 (Spain), Loschbour (Luxembourg), KO1 (Hungary), and Motala12 (Sweden) as hunter–gatherer references. We found that CB13 is closer to KO1 than to La Braña 1, which is only 800 km away from the Cova Bonica Cardial site (fig. 4 and supplementary table S10, Supplementary Material online). Although the number of available genomes is still limited and KO1 is a hunter–gatherer in a farming context, these results suggest that the origin of the hunter–gatherer genomic component present in CB13 cannot be traced to the aboriginal hunter–gatherers of the Iberian Peninsula, an observation also supported by TreeMix analysis (supplementary fig. S7B, Supplementary Material online). The fact that CB13 shares more alleles with KO1 than with La Braña 1 indicates that there was some discernible East–West population structure among European hunter–gatherers, as already suggested in a previous study (Haak et al. 2015). In the future, the sequencing of more ancient hunter–gatherer genomes from Greece, Italy, Southern France, and Mediterranean Iberia could potentially disentangle fine population structure patterns among these populations, allowing a further characterization of the hunter–gatherer component in early farmers.
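The logic of this test can be sketched with the frequency-based ABBA–BABA form of the D statistic from Durand et al. (2011). Here p1 and p2 stand for the two hunter–gatherers, p3 for the Neolithic farmer, and p4 for the chimpanzee outgroup, each as per-SNP derived-allele frequencies. The function name is hypothetical, and the sign convention (positive D means the farmer shares more derived alleles with the second hunter–gatherer) is that of this sketch and may differ from the convention used in fig. 4. Significance testing, usually a block jackknife yielding a Z-score, is not shown.

```python
from typing import Sequence


def d_statistic(p1: Sequence[float], p2: Sequence[float],
                p3: Sequence[float], p4: Sequence[float]) -> float:
    """Frequency-based D = sum(ABBA - BABA) / sum(ABBA + BABA).

    Per SNP, with derived-allele frequencies:
      ABBA = (1 - p1) * p2 * p3 * (1 - p4)
      BABA = p1 * (1 - p2) * p3 * (1 - p4)
    D > 0: p3 shares more derived alleles with p2; D < 0: with p1.
    """
    if not (len(p1) == len(p2) == len(p3) == len(p4)):
        raise ValueError("all frequency vectors must have the same length")
    abba = 0.0
    baba = 0.0
    for a, b, c, o in zip(p1, p2, p3, p4):
        abba += (1 - a) * b * c * (1 - o)
        baba += a * (1 - b) * c * (1 - o)
    if abba + baba == 0:
        raise ValueError("no informative sites (ABBA + BABA is zero)")
    return (abba - baba) / (abba + baba)


# Toy example with three haploid-like sites (illustrative values only).
toy_d = d_statistic([0, 1, 0], [1, 0, 1], [1, 1, 1], [0, 0, 0])
print(toy_d)
```

Under the null hypothesis that the two hunter–gatherers are equally related to the farmer, ABBA and BABA patterns are expected in equal numbers and D is close to zero.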

Conclusions

The Mediterranean region is crucial not only for understanding local cultural horizons such as the Cardial but also for unravelling potential trans-Mediterranean maritime routes and island colonization processes. Although the DNA in our samples was poorly preserved because of the warm Mediterranean climate, our results demonstrate that recovery of complete ancient genomes from areas with a similar climate (including in the Near East and North Africa) may also be possible. Cave sites in these regions clearly offer some advantages in terms of preservation.

Our analyses indicate that both the LBK and the Cardial peoples originated from a common ancient meta-population that diverged along two different migration routes, one following the Danube River (LBK) and the other one following the northern Mediterranean coastline (Impressa and Cardial). Furthermore, we detect a discernible hunter–gatherer component in the Cardial genome, which seems to derive from a population more closely related to Eastern European hunter–gatherers than to the neighboring Iberian La Braña 1 sample.

From the current genetic evidence, it seems clear that all early European farmers represent a fairly homogeneous group at both the genetic and phenotypic levels. Subsequent population movements from the Chalcolithic onwards considerably altered this scenario, and contributed to the shaping of present-day European genetic diversity (Gamba et al. 2014; Allentoft et al. 2015; Haak et al. 2015).

Materials and Methods

Sample Selection

Samples, mainly teeth, from Cardial individuals were selected based on external appearance and cave provenance. Only securely dated samples (i.e., associated with Cardial pottery remains in undisturbed deposits and/or directly dated by radiocarbon) were considered for DNA analysis. The six samples analyzed derive from four Cardial sites (supplementary materials and methods, Supplementary Material online): Cova Bonica (CB13 and CB14) in Vallirana (Barcelona), Galeria da Cisterna in Almonda (G21 and F19) (Portugal), Cova de l’Or (H3C6) in Beniarrés (Alicante), and Cova de la Sarsa (CS7675) in Bocairent (Valencia) (table 1 and fig. 1).

DNA Extraction

The samples were extracted using a modified version of the silica-in-solution protocol described elsewhere (Rohland and Hofreiter 2007). Instead of the commonly used dentine, we targeted the cementum-rich root tip, which has been shown to contain higher levels of endogenous DNA (Adler et al. 2011; Damgaard et al. 2015). We also added a “predigestion” step to the extraction protocol that has been shown to significantly increase library efficiency (Allentoft et al. 2015; Damgaard et al. 2015). In addition, we used a recently introduced DNA binding buffer that has been shown to be more efficient at retaining short DNA fragments compared with other buffers (Allentoft et al. 2015).

Fig. 2. Genetic affinities of CB13. (A) Procrustes PCA of hunter–gatherers, Early Neolithic, Middle Neolithic, and Copper Age farmers. The PCA was performed using only transversions (to avoid confounding effects related to postmortem damage). (B) Ancestry proportions assuming 11 ancestral components, as inferred by ADMIXTURE analysis.

Library Preparation

Sequencing libraries were constructed using NEB’s NEBNext DNA Sample Prep Master Mix Set 2 (E6070) and Illumina-specific adapters, following established protocols (Meyer and Kircher 2010). Following the Bst fill-in step, the libraries were amplified and indexed in 50 µl PCRs containing 1× KAPA HiFi HotStart Uracil+ ReadyMix (KAPA Biosystems, Woburn, MA) and 200 nM of each of Illumina’s Multiplexing PCR primer in PE1.0 (5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT) and a custom-designed index primer with a six-nucleotide index (5′-CAAGCAGAAGACGGCATACGAGATNNNNNNGTGACTGGAGTTC). Thermocycling conditions were as follows: 1 min at 94 °C, followed by 8–12 cycles of 15 s at 94 °C, 20 s at 60 °C, and 20 s at 72 °C, and a final extension step of 1 min at 72 °C. The optimal number of cycles was determined by quantitative PCR, as done in Meyer and Kircher (2010). The amplified libraries were then purified using Agencourt AMPure XP beads (Beckman Coulter, Krefeld, Germany) and quantified on an Agilent 2200 TapeStation (Agilent Technologies, Palo Alto, CA).

Initial Screening and Whole-Genome Capture

To establish their overall efficiency, the libraries were sequenced on an Illumina HiSeq 2000 run in 100 SR mode at the Danish National High-Throughput Sequencing Centre in Copenhagen, Denmark. Following this initial screening run, libraries were enriched using the MYbait Human Whole Genome Capture Kit from MYcroarray (Ann Arbor, MI) (Ávila-Arcos et al. 2015; Schroeder et al. 2015). The libraries were captured following the manufacturer’s instructions (http://www.mycroarray.com/pdf/MYbaits-manual-v2.pdf). The captured libraries were amplified for 10–20 cycles using primers IS5 (5′-AATGATACGGCGACCACCGA) and IS6 (5′-CAAGCAGAAGACGGCATACGA) and the same PCR set-up conditions as above. Subsequently, libraries were purified and quantified as above, pooled in equimolar amounts, and sequenced. Base-calling was performed using the Illumina software CASAVA 1.8.2.

Cova Bonica DNA Reextraction

Several libraries were constructed from the most efficient sample, CB13, a tooth root from Cova Bonica. After the exhaustion of the original extract, the undigested tooth pellet was reextracted and a new library was built. The subsequent sequencing results showed that this second-round library had a significantly higher human DNA content and higher complexity than previous libraries. In addition, reads from this library showed a lower mean fragment length and higher deamination levels at the extremes (supplementary fig. S5, Supplementary Material online), suggesting that reextracting the pellet significantly increased endogenous DNA yields. Similar observations have been made previously and it has been proposed that reextraction triggers the release of a DNA fraction located in the deep, crystalline dentine structure and thus better protected from hydrolysis and microbial action (Orlando et al. 2011; Der Sarkissian et al. 2014). All of these observations are also in line with the proposed notion that a predigestion step can significantly improve efficiency in ancient samples (Damgaard et al. 2015).

Fig. 3. Outgroup f3-statistic analysis of CB13 Cardial genome. (A) Shared genetic drift between CB13 and present-day Western Eurasian populations. (B) Top 40 populations/individuals (modern and ancient) showing the highest genetic drift with CB13. Black and gray error bars represent two and three standard errors, respectively.

Sequencing Data Processing

Owing to postmortem degradation, ancient DNA fragments are usually very short, resulting in sequencing of the adapter, which has been ligated during library preparation. Thus, AdapterRemoval (Lindgreen 2012) was used to remove adapter sequences from the 3′-end of the reads, and to remove stretches of consecutive bases with quality scores of 0, 1, or 2. Before mapping, we discarded sequences shorter than 30 bp. Reads were mapped with BWA-0.6.1 (Li and Durbin 2009) to the human reference genome (hg19) and to the rCRS mitochondrial genome (Andrews et al. 1999), disabling the seed and setting parameters “-n 0.01” and “-o 2”. Duplicate sequences were removed using PicardTools (http://picard.sourceforge.net/). To estimate the coverage attained for each sample, the DepthOfCoverage tool implemented in GATK (McKenna et al. 2010) was used. The sequence statistics for each sample are displayed in supplementary table S2, Supplementary Material online. Only reads with mapping quality greater than 30 were kept for further analysis.
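The read-level filters described above can be illustrated with a minimal Python sketch. It is not AdapterRemoval itself: the function names, the Phred+33 quality encoding, and the choice to trim low-quality runs from both read ends are assumptions made for illustration. Only the quality scores (0, 1, 2) and the 30-bp length cutoff come from the text.

```python
MIN_LENGTH = 30            # reads shorter than this are discarded (from the text)
LOW_QUALITIES = {0, 1, 2}  # Phred scores treated as uninformative (from the text)


def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string (Phred+33 encoding is an assumption)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_low_quality_ends(sequence, quality_string):
    """Strip runs of Phred 0-2 bases from both ends of a read (assumed behavior)."""
    scores = phred_scores(quality_string)
    start, end = 0, len(scores)
    while start < end and scores[start] in LOW_QUALITIES:
        start += 1
    while end > start and scores[end - 1] in LOW_QUALITIES:
        end -= 1
    return sequence[start:end], quality_string[start:end]


def keep_read(sequence):
    """Apply the 30-bp minimum length filter used before mapping."""
    return len(sequence) >= MIN_LENGTH
```

A read would first pass through `trim_low_quality_ends` and then be retained for mapping only if `keep_read` returns true for the trimmed sequence.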

Fig. 4. D-statistics to determine whether CB13 and other Neolithic farmers are closer to any hunter–gatherer. Black and gray error bars represent two and three standard errors, respectively.

Sex Determination

The sex of each individual was determined following the approach in Skoglund et al. (2013). This method computes the ratio of chrY reads to chrX + chrY reads. The differences in coverage between sex chromosomes and autosomes provide direct evidence of the individual’s sex (females are expected to have an X-chromosome coverage similar to that of autosomes, whereas males show an X-chromosome with half the coverage for autosomes and also a significant presence of Y-chromosome reads).
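The ratio of chrY reads to chrX + chrY reads can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, the confidence interval uses a simple normal approximation, and the cut-offs 0.016 and 0.075 are the values we understand Skoglund et al. (2013) to propose; they should be checked against that publication before use.

```python
import math


def ry_ratio(n_x, n_y):
    """Return (Ry, lower, upper): Y / (X + Y) read ratio with a 95% interval."""
    total = n_x + n_y
    if total == 0:
        raise ValueError("no reads mapped to chrX or chrY")
    ry = n_y / total
    se = math.sqrt(ry * (1 - ry) / total)  # normal approximation (assumption)
    return ry, ry - 1.96 * se, ry + 1.96 * se


def assign_sex(n_x, n_y, female_max=0.016, male_min=0.075):
    """Classify a sample; thresholds are assumed defaults, not taken from the text."""
    _, lower, upper = ry_ratio(n_x, n_y)
    if upper < female_max:
        return "female"
    if lower > male_min:
        return "male"
    return "undetermined"
```

With few reads the interval becomes wide and the sketch returns "undetermined" instead of forcing a call, which is the conservative choice for low-coverage samples.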

mtDNA Haplogroup Determination

Consensus sequences were called using samtools and bcftools (Li et al. 2009), requiring a support of at least three reads and using a majority rule. Haplogroups were determined using the HaploGrep tool (Kloss-Brandstätter et al. 2011) (supplementary table S5, Supplementary Material online).
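The consensus rule (at least three supporting reads, majority vote) can be illustrated with a short Python sketch. It does not reproduce samtools or bcftools. The function names are hypothetical, "support" is interpreted here as total depth at the site, and ties are reported as N; both of these are assumptions, since the text does not specify them.

```python
from collections import Counter


def consensus_base(bases, min_reads=3):
    """Majority-rule base call for one site; 'N' if depth is too low or tied."""
    counts = Counter(b for b in bases if b in "ACGT")
    depth = sum(counts.values())
    if depth == 0 or depth < min_reads:
        return "N"
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "N"  # tie between the two most frequent bases (assumed handling)
    return ranked[0][0]


def consensus_sequence(pileup, min_reads=3):
    """Build a consensus string from a list of per-site base observations."""
    return "".join(consensus_base(site, min_reads) for site in pileup)
```

For example, a site covered by reads A, A, and G would be called A, whereas a site covered by only two reads would be masked as N.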

Ancient DNA Authenticity

To estimate the rate of modern human DNA contamination in our samples, several procedures were followed:

Misincorporation Patterns at the Ends of the Reads

It has been observed in different studies that the endogenous DNA bears a distinctive pattern at the 5′- and 3′-ends resulting from chemical processes related to depurination, fragmentation, and subsequent deamination of the DNA templates (Briggs et al. 2007). This pattern of miscoding lesions (increased ratio of C to T changes at the 5′-ends and of G to A at the 3′-ends) is typically used as a proxy for authenticity. The damage pattern at the end of the reads was determined with the mapDamage2.0 software (Jónsson et al. 2013) (supplementary fig. S3, Supplementary Material online).

mtDNA Contamination Estimates

The analysis of the reads mapped to the mtDNA allows us to check whether the majority of the reads derives from a single biological source. After calling consensus sequences, we estimated contamination in the mitochondria using contamMix-1.0.10. This software implements a Bayesian approach described in Fu et al. (2013). In the case of CB13, the high coverage allowed us to use only transversions that are not affected by postmortem damage. Contamination estimates are shown in supplementary tables S3 and S4, Supplementary Material online.

Phenotypic Traits

For the low-coverage Cova Bonica genome, it was possible to screen for phenotypic traits of interest that have been the subject of recent natural selection in Europeans, according to several studies (Beleza et al. 2013; Grossman et al. 2013). We examined the reads overlapping the SNPs and haplotypes involved in potential selective sweeps, notably those related to pigmentation and lactase persistence (supplementary tables S6–S9, Supplementary Material online). The hair color was estimated with the HIrisPlex prediction model (Walsh et al. 2013, 2014).

Population Genetic Reference Data

We used two different reference data sets:

• The Human Origins data set released in a previous study (Lazaridis et al. 2014), which contains 1,941 present-day individuals from worldwide populations genotyped at 594,924 sites. We next added variants from CB13 and other Eurasian ancient individuals from different studies (Fu et al. 2014; Gamba et al. 2014; Lazaridis et al. 2014; Olalde et al. 2014; Raghavan et al. 2014; Seguin-Orlando et al. 2014; Skoglund et al. 2014). For each ancient sample, we randomly sampled one read at each position, discarding alleles not present in the reference data set, and discarding reads with mapping quality lower than 30 and base quality lower than 30. Moreover, data from the study of Haak et al. (2015), containing 69 ancient Europeans, were downloaded and merged with the data set, resulting in a total of 93 ancient individuals.

• The 1000 Genomes project (1000 Genomes Project Consortium 2012) phase 3 initial callset downloaded from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/. Only SNPs found to be polymorphic in Yoruba were used for the analysis. Variants from CB13 and other Neolithic farmers and hunter–gatherers from different studies (Gamba et al. 2014; Lazaridis et al. 2014; Olalde et al. 2014; Skoglund et al. 2014) were added following the same strategy as for the Human Origins reference data set.
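The random sampling of one read per position can be sketched as follows. This is a minimal illustration, not the script used in the study: the function name and the input layout (a list of base, mapping quality, base quality tuples per site) are hypothetical, and applying the filters before the random draw is an assumption about the order of operations. The quality thresholds of 30 come from the text.

```python
import random

MIN_MAPQ = 30   # reads with lower mapping quality are discarded (from the text)
MIN_BASEQ = 30  # bases with lower base quality are discarded (from the text)


def sample_pseudo_haploid(observations, ref_allele, alt_allele, rng):
    """Pick one allele at random among usable reads covering a SNP.

    observations: list of (base, mapping_quality, base_quality) tuples.
    Returns None when no read passes the filters (site treated as missing).
    """
    usable = [
        base
        for base, mapq, baseq in observations
        if mapq >= MIN_MAPQ
        and baseq >= MIN_BASEQ
        and base in (ref_allele, alt_allele)  # drop alleles absent from the panel
    ]
    if not usable:
        return None
    return rng.choice(usable)
```

Passing an explicit generator such as `random.Random(1)` as `rng` keeps the sampling reproducible across runs.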

Principal Component Analysis

PCA was performed on a reference set of 777 modern West Eurasians and a set of 58 ancient individuals, including CB13. First, for each ancient sample, individual PCA was computed using EIGENSOFT (Patterson et al. 2006). Then, using Procrustes transformation (Skoglund et al. 2012) implemented in the vegan package (http://vegan.r-forge.r-project.org), the first two principal components were transformed to match the configuration of the reference-only PC1 and PC2. Finally, the average of the transformed coordinates for reference individuals was plotted, together with transformed coordinates for each ancient sample (fig. 2A and supplementary fig. S4, Supplementary Material online). To avoid possible confounding effects caused by postmortem deamination, the PCA was generated with only transversion positions.

Outgroup f3-Statistics

To study the degree of genetic relatedness between the CB13 Cardial sample and different present-day and ancient populations, we computed outgroup f3-statistics (Reich et al. 2009; Raghavan et al. 2014) of the form f3 (Yoruba; CB13, X). This statistic measures the amount of genetic drift shared by CB13 and population X, after their divergence from Yoruba. Standard errors were computed using a weighted block jackknife approach (Busing et al. 1999) over 5-Mb blocks (fig. 3).
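To make the statistic and its standard error concrete, the following Python sketch computes f3 (Outgroup; A, B) as the mean of (o − a)(o − b) over SNPs and a delete-m block jackknife standard error over 5-Mb blocks. It is a simplified illustration and not the software used in the study: the function names and input layout are hypothetical, and the sketch omits the finite-sample corrections and any normalization that dedicated packages may apply.

```python
import math
from collections import defaultdict

BLOCK_SIZE = 5_000_000  # 5-Mb jackknife blocks, as stated in the text


def block_sums(snps):
    """Group per-SNP f3 terms into [sum_of_terms, n_snps] per genomic block.

    snps: iterable of (chrom, pos, freq_outgroup, freq_a, freq_b).
    """
    blocks = defaultdict(lambda: [0.0, 0])
    for chrom, pos, out, a, b in snps:
        entry = blocks[(chrom, pos // BLOCK_SIZE)]
        entry[0] += (out - a) * (out - b)  # per-SNP f3 term
        entry[1] += 1
    return list(blocks.values())


def outgroup_f3(snps):
    """Return (f3 estimate, weighted block jackknife standard error)."""
    blocks = block_sums(snps)
    g = len(blocks)
    if g < 2:
        raise ValueError("need at least two blocks for a jackknife")
    total = sum(s for s, _ in blocks)
    n = sum(m for _, m in blocks)
    estimate = total / n

    # Leave-one-block-out estimates and block weights h_j = n / m_j.
    loo = [(total - s) / (n - m) for s, m in blocks]
    h = [n / m for _, m in blocks]

    jack_mean = g * estimate - sum((1 - 1 / hj) * t for hj, t in zip(h, loo))
    variance = sum(
        (hj * estimate - (hj - 1) * t - jack_mean) ** 2 / (hj - 1)
        for hj, t in zip(h, loo)
    ) / g
    return estimate, math.sqrt(variance)
```

Weighting each block by its SNP count matters here because 5-Mb windows contain very different numbers of genotyped sites, so an unweighted jackknife would give sparse blocks too much influence.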

D Statistics

Using the 1000 Genomes reference data set, we computed D statistics (Durand et al. 2011) of the form D (Hunter-gatherer1, Hunter-gatherer2; Neolithic farmer; Outgroup) to test whether any Neolithic farmer, including the CB13 Cardial sample, is closer to one of the two hunter–gatherers (fig. 4 and supplementary table S10, Supplementary Material online). Standard errors were computed using a weighted block jackknife approach (Busing et al. 1999) over 5-Mb blocks.
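The point estimate of a D statistic can be sketched from per-site allele frequencies as below. This is an illustrative sketch only: the function name and the input layout are hypothetical, and the sign convention (positive values when population 1 shares more alleles with population 3) is an assumption that may differ from the convention used for figure 4. Standard errors would be obtained separately with the block jackknife described in the text.

```python
def d_statistic(sites):
    """Compute D(P1, P2; P3, P4) from allele frequencies at each site.

    sites: iterable of (p1, p2, p3, p4), the frequency of the same allele in
    each population; for pseudo-haploid samples each value is 0 or 1.
    """
    numerator = 0.0
    denominator = 0.0
    for p1, p2, p3, p4 in sites:
        numerator += (p1 - p2) * (p3 - p4)
        denominator += (p1 + p2 - 2 * p1 * p2) * (p3 + p4 - 2 * p3 * p4)
    if denominator == 0:
        raise ValueError("no informative sites")
    return numerator / denominator
```

In the setting of the text, P1 and P2 would be the two hunter–gatherers, P3 the Neolithic farmer, and P4 the outgroup; a value near zero indicates that the farmer is equally related to both hunter–gatherers.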


Admixture

We carried out model-based clustering analysis using ADMIXTURE (Alexander et al. 2009) on the Human Origins reference data set, including 1,941 present-day individuals and 93 ancient individuals. First, we performed LD-pruning on the data set using PLINK (Purcell et al. 2007) with the flag --indep-pairwise 200 25 0.4, resulting in 283,136 SNPs that were used for analysis. ADMIXTURE was run with the cross validation (--cv) flag considering K = 2 to K = 19, with 25 replicates for each value of K. The lowest median CV error was obtained for K = 11. In figure 2B we show the ancestry proportions for K = 11 of 58 ancient individuals, including the CB13 Cardial female. CB13 displays an ancestry makeup similar to other early Neolithic samples, with an orange ancestral component characteristic of hunter–gatherer populations and a blue ancestral component related to the arrival of the Neolithic.
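The selection of K by lowest median cross-validation error across replicates can be sketched as follows. The function name and the input dictionary are hypothetical, and the CV errors would first have to be collected from the ADMIXTURE output of each replicate run, a step that is not shown here.

```python
from statistics import median


def best_k(cv_errors):
    """Return (K with the lowest median CV error, dict of medians per K).

    cv_errors: dict mapping each K to the list of CV errors of its replicates.
    """
    medians = {k: median(errors) for k, errors in cv_errors.items()}
    chosen = min(medians, key=medians.get)
    return chosen, medians
```

Using the median instead of the minimum makes the choice of K less sensitive to a single replicate that happens to converge to an unusually good or poor solution.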

Treemix Analysis

We applied TreeMix (Pickrell and Pritchard 2012) to the Human Origins reference data set to infer maximum-likelihood trees and admixture graphs. We included nine ancient individuals: CB13, MA1 (Raghavan et al. 2014), La Braña 1 (Olalde et al. 2014), Loschbour (Lazaridis et al. 2014), Motala12 (Lazaridis et al. 2014), KO1 (Gamba et al. 2014), NE1 (Gamba et al. 2014), Gok2 (Skoglund et al. 2014) and Stuttgart (Lazaridis et al. 2014), and also three present-day populations: Mbuti, Papuan, and Karitiana. Only 90,604 sites with information in all the ancient individuals were used. We rooted the graphs with Mbuti, disabled sample size correction (-noss), performed a round of global realignments of the graph (-global), and computed standard error of migration weights (-se).

We considered up to four migration edges and kept for each edge the graph with the highest log-likelihood among ten replicate runs (supplementary figs. S6 and S7, Supplementary Material online). All the graphs place our CB13 Cardial sample close to other European early farmers, especially close to NE1 and Stuttgart. Interestingly, the graph with four migration edges (supplementary fig. S7B, Supplementary Material online) places the group formed by the European early farmers (CB13, NE1, Gok2, and Stuttgart) basal to hunter–gatherers, MA1 and Karitiana, consistent with them having basal Eurasian ancestry (Lazaridis et al. 2014). Furthermore, around 46% of the European early farmer ancestry is contributed by a hunter–gatherer population most closely related to KO1.

Supplementary Material

Supplementary materials and methods, figures S1–S7, and tables S1–S10 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors thank the staff at the Danish National High Throughput Sequencing Centre for technical support and the Museu de Prehistòria de la Diputació Provincial de Valencia, Direcció General de Cultura de la Generalitat Valenciana, and Ajuntament de Bocairent for granting permission to analyze the Cova de la Sarsa and Cova de l’Or samples. They thank Professor Jean-Jacques Hublin, Professor Michael Richards and the Max Planck Society for supporting the radiocarbon part, Professor Eske Willerslev for providing critical input, and Dr Philip Johnson for providing the contamMix software. The Centre for GeoGenetics is funded by the Danish National Research Foundation (DNRF94). Cova Bonica work is supported by Servei d’Arqueologia i Paleontologia (2014/100639), Generalitat de Catalunya (2014SGR-108), and Ministerio de Ciencia e Innovación (HAR2011-26193) projects. H.S. was supported by an ERC Synergy Grant (FP7/2007-2013/319209); C.L.-F. by a FEDER and Spanish Government Grant BFU2012-34157; and S.C. by a grant 2014 SGR 464 from Departament d’Economia i Coneixement (Generalitat de Catalunya). D.C.S-G. acknowledges support from the Generalitat Valenciana (VALi+d APOSTD/2014/123), the BBVA Foundation (I Ayudas a Investigadores, Innovadores y Creadores Culturales), and the European Union (FP7/2007-2013, MSCA-COFUND, no. 245743 via a Braudel-IFER-FMSH). I.O. was funded by a predoctoral fellowship from the Basque Government (DEUI), and M.S. and J.D. by postdoctoral grants from Fundação para a Ciência e a Tecnologia (FCT) and Juan de la Cierva Subprogram (JCI-2011-09543), respectively. Alignment data are available through the Sequence Read Archive (SRA) at the accession code SRP057056.

References

1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56–65.

Adler CJ, Haak W, Donlon D, Cooper A. 2011. Survival and recovery of DNA from ancient teeth and bones. J Archaeol Sci. 38:956–964.

Alexander DH, Novembre J, Lange K. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19:1655–1664.

Allentoft ME, Sikora M, Sjögren K-G, Rasmussen S, Rasmussen M, Stenderup J, Damgaard PB, Schroeder H, Ahlström T, Vinner L, et al. 2015. Population genomics of Bronze Age Eurasia. Nature 522:167–172.

Andrews RM, Kubacka I, Chinnery PF, Lightowlers RN, Turnbull DM, Howell N. 1999. Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nat Genet. 23:147.

Ávila-Arcos MC, Sandoval-Velasco M, Schroeder H, Carpenter ML, Malaspinas A-S, Wales N, Peñaloza F, Bustamante CD, Gilbert MTP. 2015. Comparative performance of two whole-genome capture methodologies on ancient DNA Illumina libraries. Methods Ecol Evol. 6:725–734.

Beleza S, Santos AM, McEvoy B, Alves I, Martinho C, Cameron E, Shriver MD, Parra EJ, Rocha J. 2013. The timing of pigmentation lightening in Europeans. Mol Biol Evol. 30:24–35.

Brandt G, Haak W, Adler CJ, Roth C, Szécsényi-Nagy A, Karimnia S, Möller-Rieker S, Meller H, Ganslmeier R, Friederich S, et al. 2013. Ancient DNA reveals key stages in the formation of central European mitochondrial genetic diversity. Science 342:257–261.

Briggs AW, Stenzel U, Johnson PLF, Green RE, Kelso J, Prüfer K, Meyer M, Krause J, Ronan MT, Lachmann M, et al. 2007. Patterns of damage in genomic DNA sequences from a Neandertal. Proc Natl Acad Sci U S A. 104:14616–14621.

Brotherton P, Endicott P, Sanchez JJ, Beaumont M, Barnett R, Austin J, Cooper A. 2007. Novel high-resolution characterization of ancient DNA reveals C > U-type base modification events as the sole cause of post mortem miscoding lesions. Nucleic Acids Res. 35:5717–5728.


Busing FMTA, Meijer E, Van Der Leeden R. 1999. Delete-m Jackknife for Unequal m. Stat Comput. 9:3–8.

Cardoso S, Valverde L, Alfonso-Sánchez MA, Palencia-Madrid L, Elcoroaristizabal X, Algorta J, Catarino S, Arteta D, Herrera RJ, Zarrabeitia MT, et al. 2013. The expanded mtDNA phylogeny of the Franco-Cantabrian region upholds the pre-neolithic genetic substrate of Basques. PLoS One 8:e67835.

Carpenter ML, Buenrostro JD, Valdiosera C, Schroeder H, Allentoft ME, Sikora M, Rasmussen M, Gravel S, Guillén S, Nekhrizov G, et al. 2013. Pulling out the 1%: Whole-genome capture for the targeted enrichment of ancient DNA sequencing libraries. Am J Hum Genet. 93:852–864.

Cavalli-Sforza LL, Menozzi P, Piazza A. 1994. The history and geography of human genes. Princeton (NJ): Princeton University Press.

Damgaard PB, Margaryan A, Schroeder H, Orlando L, Willerslev E, Allentoft ME. 2015. Improving access to endogenous DNA in ancient bones and teeth. Sci Rep. 5:11184.

Deguilloux M-F, Soler L, Pemonge M-H, Scarre C, Joussaume R, Laporte L. 2011. News from the west: ancient DNA from a French megalithic burial chamber. Am J Phys Anthropol. 144:108–118.

Der Sarkissian C, Ermini L, Jónsson H, Alekseev AN, Crubezy E, Shapiro B, Orlando L. 2014. Shotgun microbial profiling of fossil remains. Mol Ecol. 23:1780–1798.

Durand EY, Patterson N, Reich D, Slatkin M. 2011. Testing for ancient admixture between closely related populations. Mol Biol Evol. 28:2239–2252.

Fu Q, Li H, Moorjani P, Jay F, Slepchenko SM, Bondarev AA, Johnson PLF, Aximu-Petri A, Prüfer K, de Filippo C, et al. 2014. Genome sequence of a 45,000-year-old modern human from western Siberia. Nature 514:445–449.

Fu Q, Mittnik A, Johnson PLF, Bos K, Lari M, Bollongino R, Sun C, Giemsch L, Schmitz R, Burger J, et al. 2013. A revised timescale for human evolution based on ancient mitochondrial genomes. Curr Biol. 23:553–559.

Gamba C, Fernández E, Tirado M, Deguilloux MF, Pemonge MH, Utrilla P, Edo M, Molist M, Rasteiro R, Chikhi L, et al. 2012. Ancient DNA from an Early Neolithic Iberian population supports a pioneer colonization by first farmers. Mol Ecol. 21:45–56.

Gamba C, Jones ER, Teasdale MD, McLaughlin RL, Gonzalez-Fortes G, Mattiangeli V, Domboróczki L, Kővári I, Pap I, Anders A, et al. 2014. Genome flux and stasis in a five millennium transect of European prehistory. Nat Commun. 5:5257.

García-Garcerà M, Gigli E, Sanchez-Quinto F, Ramirez O, Calafell F, Civit S, Lalueza-Fox C. 2011. Fragmentation of contaminant and endogenous DNA in ancient samples determined by shotgun sequencing; prospects for human palaeogenomics. PLoS One 6:e24161.

Grossman SR, Andersen KG, Shlyakhter I, Tabrizi S, Winnicki S, Yen A, Park DJ, Griesemer D, Karlsson EK, Wong SH, et al. 2013. Identifying recent adaptations in large-scale genomic data. Cell 152:703–713.

Haak W, Lazaridis I, Patterson N, Rohland N, Mallick S, Llamas B, Brandt G, Nordenfelt S, Harney E, Stewardson K, et al. 2015. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522:207–211.

Hofreiter M, Paijmans JLA, Goodchild H, Speller CF, Barlow A, Fortes GG, Thomas JA, Ludwig A, Collins MJ. 2014. The future of ancient DNA: technical advances and conceptual shifts. Bioessays 37:284–293.

Itan Y, Powell A, Beaumont MA, Burger J, Thomas MG. 2009. The origins of lactase persistence in Europe. PLoS Comput Biol. 5:e1000491.

Jónsson H, Ginolhac A, Schubert M, Johnson PLF, Orlando L. 2013. mapDamage2.0: fast approximate Bayesian estimates of ancient DNA damage parameters. Bioinformatics 29:1682–1684.

Keller A, Graefen A, Ball M, Matzas M, Boisguerin V, Maixner F, Leidinger P, Backes C, Khairat R, Forster M, et al. 2012. New insights into the Tyrolean Iceman’s origin and phenotype as inferred by whole-genome sequencing. Nat Commun. 3:698.

Kloss-Brandstätter A, Pacher D, Schönherr S, Weissensteiner H, Binna R, Specht G, Kronenberg F. 2011. HaploGrep: a fast and reliable algorithm for automatic classification of mitochondrial DNA haplogroups. Hum Mutat. 32:25–32.

Laayouni H, Calafell F, Bertranpetit J. 2010. A genome-wide survey does not show the genetic distinctiveness of Basques. Hum Genet. 127:455–458.

Lacan M, Keyser C, Ricaut F-X, Brucato N, Duranthon F, Guilaine J. 2011. Ancient DNA reveals male diffusion through the Neolithic Mediterranean route. Proc Natl Acad Sci U S A. 108:9788–9791.

Lacan M, Keyser C, Ricaut F-X, Brucato N, Tarrús J, Bosch A, Guilaine J, Crubézy E, Ludes B. 2011. Ancient DNA suggests the leading role played by men in the Neolithic dissemination. Proc Natl Acad Sci U S A. 108:18255–18259.

Lazaridis I, Patterson N, Mittnik A, Renaud G, Mallick S, Kirsanow K, Sudmant PH, Schraiber JG, Castellano S, Lipson M, et al. 2014. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513:409–413.

Lee EJ, Makarewicz C, Renneberg R, Harder M, Krause-Kyora B, Müller S, Ostritz S, Fehren-Schmitz L, Schreiber S, Müller J, et al. 2012. Emerging genetic patterns of the European Neolithic: perspectives from a Late Neolithic Bell Beaker burial site in Germany. Am J Phys Anthropol. 148:571–579.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25:1754–1760.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.

Lindgreen S. 2012. AdapterRemoval: easy cleaning of next-generation sequencing reads. BMC Res Notes. 5:337.

Martins H, Oms FX, Pereira L, Pike AWG, Rowsell K, Zilhão J. 2015. Radiocarbon dating the beginning of the Neolithic in Iberia: new results, new problems. J Mediterr Archaeol. 28:105–131.

McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297–1303.

Meyer M, Kircher M. 2010. Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb Protoc. Available from: http://dx.doi.org/10.1101/prot5448

Olalde I, Allentoft ME, Sánchez-Quinto F, Santpere G, Chiang CWK, DeGiorgio M, Prado-Martinez J, Rodríguez JA, Rasmussen S, Quilez J, et al. 2014. Derived immune and ancestral pigmentation alleles in a 7,000-year-old Mesolithic European. Nature 507:225–228.

Olalde I, Lalueza-Fox C. 2015. Modern humans’ paleogenomics and the new evidences on the European prehistory. Sci Technol Archaeol Res. 1:STAR20151120548.

Orlando L, Ginolhac A, Raghavan M, Vilstrup J, Rasmussen M, Magnussen K, Steinmann KE, Kapranov P, Thompson JF, Zazula G, et al. 2011. True single-molecule DNA sequencing of a Pleistocene horse bone. Genome Res. 21:1705–1719.

Patterson N, Price AL, Reich D. 2006. Population structure and eigenanalysis. PLoS Genet. 2:e190.

Pickrell JK, Pritchard JK. 2012. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 8:e1002967.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 81:559–575.

Raghavan M, Skoglund P, Graf KE, Metspalu M, Albrechtsen A, Moltke I, Rasmussen S, Stafford TW, Orlando L, Metspalu E, et al. 2014. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature 505:87–91.

Reich D, Thangaraj K, Patterson N, Price AL, Singh L. 2009. Reconstructing Indian population history. Nature 461:489–494.

Renfrew C, Bahn P. 1991. Archaeology theories, methods, and practice. New York: Thames and Hudson.

Rohland N, Hofreiter M. 2007. Ancient DNA extraction from bones and teeth. Nat Protoc. 2:1756–1762.

Sánchez-Quinto F, Schroeder H, Ramirez O, Avila-Arcos MC, Pybus M, Olalde I, Velazquez AMV, Marcos MEP, Encinas JMV, Bertranpetit J, et al. 2012. Genomic affinities of two 7,000-year-old Iberian hunter-gatherers. Curr Biol. 22:1494–1499.

Schroeder H, Ávila-Arcos MC, Malaspinas A-S, Poznik GD, Sandoval-Velasco M, Carpenter ML, Moreno-Mayar JV, Sikora M, Johnson PLF, Allentoft ME, et al. 2015. Genome-wide ancestry of 17th-century enslaved Africans from the Caribbean. Proc Natl Acad Sci U S A. 112:3669–3673.

Seguin-Orlando A, Korneliussen TS, Sikora M, Malaspinas A, Manica A, Moltke I, Westaway M, Lambert D, Khartanovich V, Wall JD, et al. 2014. Genomic structure in Europeans dating back at least 36,200 years. Science 346:1113–1118.

Skoglund P, Malmström H, Omrak A, Raghavan M, Valdiosera C, Günther T, Hall P, Tambets K, Parik J, Karl-Göran S, et al. 2014. Genomic diversity and admixture differs for stone-age Scandinavian foragers and farmers. Science 201:786–792.

Skoglund P, Malmström H, Raghavan M, Stora J, Hall P, Willerslev E, Gilbert MTP, Götherström A, Jakobsson M. 2012. Origins and genetic legacy of Neolithic farmers and hunter-gatherers in Europe. Science 336:466–469.

Skoglund P, Stora J, Götherström A, Jakobsson M. 2013. Accurate sex identification of ancient human remains using DNA shotgun sequencing. J Archaeol Sci. 40:4477–4482.

Walsh S, Chaitanya L, Clarisse L, Wirken L, Draus-Barini J, Kovatsi L, Maeda H, Ishikawa T, Sijen T, de Knijff P, et al. 2014. Developmental validation of the HIrisPlex system: DNA-based eye and hair colour prediction for forensic and anthropological usage. Forensic Sci Int Genet. 9:150–161.

Walsh S, Liu F, Wollstein A, Kovatsi L, Ralf A, Kosiniak-Kamysz A, Branicki W, Kayser M. 2013. The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA. Forensic Sci Int Genet. 7:98–115.

Whittle A. 1996. Europe in the Neolithic: the creation of new worlds. Cambridge: Cambridge University Press.

Zilhão J. 1993. The spread of agro-pastoral economies across Mediterranean Europe: a view from the far west. J Mediterr Archaeol. 6:5–63.

Zilhão J. 2001. Radiocarbon evidence for maritime pioneer colonization at the origins of farming in west Mediterranean Europe. Proc Natl Acad Sci U S A. 98:14180–14185.


Supplementary Information

1-Samples and archaeological sites

Cova Bonica (Vallirana, Barcelona)

Cova Bonica is a cave located in the NE of the Iberian Peninsula, ~30 km north of Barcelona (41°22'10.29"N, 1°53'38.64"E) and 402 meters above mean sea level. The original morphology of the cave was modified by mining for sparry calcite and the cave sediments were partially destroyed by earthmoving and subsequent mushroom farming. The site consists of a principal chamber (SP) and two small chambers on the right side of SP, one next to the entrance (SP1) and the second in the internal area (SP2) (Fig. S1A-B).

The stratigraphic sequence of Cova Bonica is represented in the HG34 cross-section (Fig. S1C) and consists of mud sediments supporting varying amounts of speleothem, bedrock limestone clasts and archaeological remains. Layer IV2 overlies the bedrock and corresponds to the first Holocene deposition in this area; it consists of clay and silt sediment supporting limestone and speleothem clasts. Layer IV2 yielded numerous fragments of Early Neolithic pottery, lithic artifacts of rock crystal and jasper, a bone awl made from an ovicaprine metapodial, body ornaments, faunal and human remains, and abundant charcoal (Fig. S1E-I). The upper canine subjected to whole-genome analysis (Fig. S1D), labelled CB13 (CB13-HH34-IV2-2407), was obtained from this layer in August 2013 and yielded a direct radiocarbon age of 6,410±30 years BP (Table S1) from Beta Analytic, Miami (Beta-384724). A second Cova Bonica tooth sample, labelled CB14 (CB14-HH34-IV2-3155), was excavated in 2014 from the same stratigraphic layer and has no direct radiocarbon date.

Layer XIX2 overlies this Early Neolithic unit and consists of dark plastic mud sediments supporting limestone clasts. This layer contains abundant charcoal, scant Late Neolithic pottery and faunal remains, and has been dated on a femur of Ovis aries to 4,215±30 years BP at the ORAU laboratory in Oxford (OxA-30387). The uppermost unit of the stratigraphic sequence (layer 0) is formed of mixed sediments reworked by mining and mushroom farming and contains prehistoric, Iberian-Roman, and Medieval-Modern ceramics as well as scattered Cardial remains.

Galeria da Cisterna (Almonda karst system, Torres Novas, Portugal)

Galeria da Cisterna (39°30'17.32"N, 8°36'55.06"W) is a fossil karst spring of the River Almonda (Zilhão 1997; Zilhão 2009). Its AMD2 locus, excavated in 1988–89, featured an Early Neolithic funerary context contained in a shallow Holocene deposit that also yielded material related to later prehistoric uses of a similar nature. Even though the overwhelming majority of the ceramic assemblage, the stone tools and the body ornaments are Early Neolithic, the lack of unambiguous associations means that the age of the human remains could only be established by direct dating of individual specimens. Of those that returned results in the time range of the Cardial, two were successfully analyzed for ancient DNA: G21-631, a right M2 inserted in a mandible, and F19-385, a first phalanx from the right foot. The mandible was pretreated at the Max Planck Institute for Evolutionary Anthropology Department of Human Evolution, Leipzig (S-EVA 27412). Following the method described elsewhere (Talamo and Richards 2011), the sample was graphitized and dated at the Curt-Engelhorn-Zentrum Archäometrie (Mannheim, Germany) (MAMS-18262) to 6,319±22 years BP (Table S1). The direct radiocarbon dating of the phalanx was done at ORAU (OxA-28855) and yielded a date of 6,280±34 years BP (Table S1).

Cova de l'Or

Cova de l'Or cave is a Cardial site located in the Sierra del Benicadell mountain range (Beniarrés, Alacant), 650 m above sea level (38°50'40.83"N, 0°21'50.83"W). Its entrance opens to the southwest, allowing natural light to illuminate its main chamber, which measures 24x8 metres (Martí Oliver 1977). The cave was first discovered in 1933. In 1955, Vicent Pascual and Julián San Valero directed several seasons of excavation under the auspices of the Servei d'Investigació Prehistòrica of the Diputació Provincial, in València. Decades later, between 1975 and 1985, excavations using modern techniques were carried out under the direction of Bernat Martí (Martí Oliver et al. 1980; Martí Oliver 1983). Several studies show that Cova de l'Or was first inhabited during the Early Neolithic, and was used as a place of habitation until the end of the Early Neolithic (ca. 4,800 years cal BCE), being later used as an animal pen (Badal García et al. 2012). Later on, in post-Neolithic times, it became a necropolis. The Cova de l'Or sample analysed for this study was removed from a human mandible belonging to a male adult individual excavated in 1957. Within the modern stratigraphic layout, it belongs to Level V, which dates to the end of the Cardial (Juan-Cabanilles 2008). The mandible was pretreated at the Max Planck Institute for Evolutionary Anthropology Department of Human Evolution, Leipzig (S-EVA 7649) and directly C14 dated at Mannheim (MAMS-19063), and yielded a date of 6,356±23 years BP (Table S1).

Cova de la Sarsa

Cova de la Sarsa is a Cardial site located on the northern slope of the Serra Mariola mountain range (Bocairent, València), 860 m above sea level (38°45'36.66"N, 0°34'56.70"W), with a cave system of 200 m total length. Although visited early on by Henri Breuil, Fernando Ponsell Cortés conducted the first excavations only in the 1920s. These excavations, funded by the Servei d'Investigació Prehistòrica of Valencia, lasted until 1939. In the 1970s, María Dolores Asquerino Fernández excavated the site again (Asquerino Fernández 1978; Asquerino Fernández et al. 1998).

Despite the well-known importance of the site in the historiography of the western Mediterranean Neolithic, no precise stratigraphic sequence is currently available. Radiocarbon dating and pottery remains point to an intense inhabitation of the site during the Early Neolithic; more than 2,000 fragments of Cardial pottery belonging to that period have been found (García Borja et al. 2012). Two skeletons were discovered buried in a corridor leading from the main chamber to the inferior galleries, far from the living area (Casanova Vañó 1978), and associated with one Cardial vessel, three burins, a bone spoon, two bone rings, three Columbella ornaments, one Cardium shell, three perforated Pectunculus, one fusiform bone object and some flint bladelets (García Borja et al. 2011).

These human remains correspond to female and male adults (García Borja et al. 2011). A vertebra sample from the female was pretreated at the Max Planck Institute for Evolutionary Anthropology Department of Human Evolution, Leipzig (S-EVA 17064) and dated at ORAU (OxA-V-2392-26) to 6,341±30 years BP (Table S1). The sample analysed for this study is a second premolar of the male right maxilla recovered by V. Casanova. The sample was directly C14 dated at ORAU (OxA-31629), which resulted in 6,309±36 years BP (Table S1).

2- TMRCA estimate

TMRCA was inferred for the three available complete Early Neolithic genomes

(Stuttgart, NE1 and CB13) using the program r8s (Sanderson 2003) and the Human Origins

reference dataset. Positions with missing data in any of the three samples were excluded,

resulting in a total of 276,901 genome-wide markers. To accurately calculate the distances

between the samples, one allele at each position was randomly taken for NE1 and Stuttgart and

then the number of changes between them and CB13 was computed 1000 times. Using the

Langley-Fitch (LF) method with POWELL algorithm and fixing the terminal nodes to 7,200,

7,000 and 6,000 years BP for NE1, Stuttgart and CB13 respectively, resulted in an estimated

TMRCA of 25,300 years BP. The obtained rate was 5.2e-06 substitutions per site per year; this

is much higher than the whole-genome rate but expected given that the included positions were

selected for being polymorphic among extant populations.
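
The random allele sampling described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the representation of genotypes as allele pairs, the treatment of CB13 as a single haploid call per site, and the averaging over replicates are all assumptions made for illustration.

```python
import random

def sample_haploid(genotypes, rng):
    """Draw one allele at random from each diploid genotype, e.g. ("A", "G")."""
    return [rng.choice(gt) for gt in genotypes]

def count_differences(seq_a, seq_b):
    """Number of positions at which two haploid sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b))

def mean_differences(diploid, haploid, replicates=1000, seed=0):
    """Average number of differences over repeated random allele draws.

    Assumption: `diploid` stands in for NE1 or Stuttgart and `haploid`
    for CB13; the source does not state how replicates were combined.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(replicates):
        total += count_differences(sample_haploid(diploid, rng), haploid)
    return total / replicates
```

A heterozygous site that matches the haploid sample in one of its two alleles contributes about half a difference on average, which is why repeating the draw many times stabilises the distance.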

Supplementary Figures

Fig. S1. Cova Bonica archaeological site and finds from the Cardial layer. (A) Panoramic view.

(B) Site plan. (C) Stratigraphic column. (D) The CB13 upper canine tooth. (E) Cardial pottery.

(F) Rock crystal bladelets. (G) Jasper bladelet. (H) Bone awl. (I) Columbella rustica ornaments.

Fig. S2. Sex attribution for each Cardial sample based on the ratio of Y to X+Y chromosome

reads, following Skoglund et al. (2013). The sample F19 is significantly contaminated; the

analysis has been repeated including only the reads that show the typical ancient DNA

deamination pattern at the end (PMDS > 4). The fact that it can be attributed to a female with

higher probability suggests that the contaminant individual is a male.
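
The ratio used in Fig. S2 can be sketched in Python as follows. The 95% interval uses a normal approximation, and the two thresholds (0.016 and 0.075) are default values assumed here from memory of the Skoglund et al. (2013) approach; they should be checked against that publication before use. The function name and return format are illustrative only.

```python
import math

def ry_sex(n_x, n_y, female_max=0.016, male_min=0.075):
    """Sex call from reads mapped to X and Y.

    Returns (ratio, (lower, upper), call), where ratio = n_y / (n_x + n_y).
    The thresholds are assumed defaults, not values taken from this paper.
    """
    n = n_x + n_y
    ry = n_y / n
    se = math.sqrt(ry * (1.0 - ry) / n)          # binomial standard error
    lower, upper = ry - 1.96 * se, ry + 1.96 * se
    if upper < female_max:
        call = "female"
    elif lower > male_min:
        call = "male"
    else:
        call = "undetermined"
    return ry, (lower, upper), call
```

With hypothetical counts, `ry_sex(9950, 50)` gives a ratio of 0.005 and a female call, whereas `ry_sex(900, 100)` gives 0.1 and a male call. A female sample contaminated by male DNA would be expected to fall between the two thresholds.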

Fig. S3. Misincorporation pattern at the ends of the reads of the Cardial samples. On the left, C

to T deamination rate at the 5’ end. On the right, G to A deamination rate at the 3’ end. Figures

around or over 20% are compatible with the values obtained for other Neolithic samples.

Deamination rates for F19 are lower due to the presence of significant levels of modern

contamination in this specimen.

Fig. S4. Procrustes PCA using only transversion sites and including ancient genomes (from the

Pleistocene up to the Copper Age) and modern West Eurasians. Complement of Figure 2A with

identification labels.

Fig. S5. Comparison between CB13 DNA libraries constructed from the first extract (red) and

from the re-extraction (blue). (A) Frequency of C to T misincorporations at the 5’ end of the

reads. (B) Fragment length distribution of reads mapping to the nuclear genome. The DNA

library constructed from the re-extraction yielded shorter DNA fragments with higher

deamination rate than the previous libraries.

Fig. S6. TreeMix (Pickrell and Pritchard 2012) analysis considering: (A) no migration edges,

(B) one migration edge and (C) two migration edges.

Fig. S7. TreeMix (Pickrell and Pritchard 2012) analysis considering: (A) three migration edges

and (B) four migration edges.

Supplementary Tables

Table S1. AMS radiocarbon dating for all the sites mentioned in the paper. Calibrated using Oxcal 4.2 (Bronk Ramsey and Lee 2013) with the Intcal13 curve

(Reimer et al. 2013).

Site                  Layer  Sample Code         Species  Anatomical part  MPI Code     AMS Nr         14C Age  1σ Err  Cal BCE 1σ from  Cal BCE 1σ to  Cal BCE 2σ from  Cal BCE 2σ to
Cova Bonica           IV2    CB13-HH34-IV2-2407  Human    Upper canine                  Beta-384724    6410     30      5470             5360           5470             5320
Galeria da Cisterna   -      G21-631             Human    Mandible         S-EVA 27412  MAMS-18262     6319     22      5330             5230           5360             5220
Galeria da Cisterna   -      F19-385             Human    First phalanx                 OxA-28855      6280     34      5310             5220           5330             5200
Cova de l'Or          V      -                   Human    Mandible         S-EVA 7649   MAMS-19063     6356     23      5360             5310           5470             5290
Cova de la Sarsa (a)  -      -                   Human    Second premolar               OxA-31629      6309     36      5321             5227           5360             5217
Cova de la Sarsa      -      -                   Human    Vertebra         S-EVA 17064  OxA-V-2392-26  6341     30      5370             5300           5470             5220

(a) Sample analysed in this study

Table S2. Statistics for the different samples and libraries sequenced in this study (SHOT:

shotgun; CAP: whole genome capture).

Library                       Raw reads      Mapped reads  Human DNA (%)  Unique reads  Mean read length
CB13_46_SHOT                  497,224,346    21,566,750    4.34           6,526,834     73.46
CB13_14_SHOT                  400,658,464    23,004,860    5.74           6,795,635     74.84
CB13_18_SHOT                  209,074,549    9,231,631     4.42           867,163       73.53
CB13_19_SHOT                  202,044,458    8,600,972     4.26           845,573       73.43
CB13_20_SHOT                  163,338,533    7,032,874     4.31           720,476       73.52
CB13_15_SHOT (re-extraction)  1,050,188,254  73,933,019    7.04           28,376,568    61.74
CB13_7_CAP                    441,785,724    124,731,162   28.23          2,626,830     81.11
CB14_33_SHOT                  28,563,379     71,874        0.25           14,688        65.43
F19_SHOT                      15,466,637     90,441        0.58           89,316        78.87
F19_2_CAP                     39,679,389     3,831,422     9.66           391,383       89.27
G21_SHOT                      21,143,726     18,565        0.09           18,128        74.34
G21_1_CAP                     60,298,287     768,465       1.27           141,799       84.55
H3C6_SHOT                     22,620,364     13,513        0.06           12,286        49.97
H3C6_9_CAP                    44,454,074     447,838       1.01           46,031        61.85
CS7675_29_SHOT                199,059,953    149,550       0.08           41,717        55.75
CS7675_3_CAP                  55,715,268     637,629       1.14           20,064        68.47
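
The "Human DNA (%)" column in Table S2 appears consistent with the ratio of mapped reads to raw reads. The short Python sketch below recomputes it for three libraries using the counts listed in the table. The function names and the additional "unique reads as a share of mapped reads" figure are illustrative assumptions, not quantities reported by the authors.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# (raw reads, mapped reads, unique reads), copied from Table S2
libraries = {
    "CB13_46_SHOT": (497_224_346, 21_566_750, 6_526_834),
    "CB13_7_CAP": (441_785_724, 124_731_162, 2_626_830),
    "F19_SHOT": (15_466_637, 90_441, 89_316),
}

def summarise(raw, mapped, unique):
    """Summary percentages for one library, rounded to two decimals."""
    return {
        "human_dna_pct": round(percent(mapped, raw), 2),
        # Assumed extra metric: share of mapped reads kept after duplicate removal
        "unique_pct_of_mapped": round(percent(unique, mapped), 2),
    }

for name, counts in libraries.items():
    print(name, summarise(*counts))
```

Under this reading, whole-genome capture raises the mapped fraction substantially (for example from 4.34% to 28.23% for CB13), while the proportion of unique reads among the mapped reads drops.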

Table S3. Contamination estimates at the mtDNA genome following the Bayesian method of

Fu et al. (2013). Due to the low coverage, it was not possible to estimate contamination for

CS7675 sample.

Sample  MtDNA coverage  MtDNA reads MQ>30  Contamination estimate (%)  Lower CI limit (%)  Upper CI limit (%)
CB13    353.29          63,415             0.11                        0.02                2.34
CB14    4.1             975                1.35                        0.22                15.76
F19     33.79           6,581              29.13                       25.53               33.01
G21     63.55           13,553             1.97                        0.76                3.74
H3C6    4.45            1,303              3.96                        0.96                12.24
CS7675  0.69            224                -                           -                   -

Table S4. CB13 mitochondrial DNA contamination estimates by assessing the nucleotide

substitutions present at the K1a2a haplotype diagnostic positions.

Position rCRS CB13 Haplogroup CB13 / Total Contamination %

11467 A G U 286/305 6.23

12308 A G U 332/346 4.05

12372 G A U 297/297 0

1811 A G U2'3'4'7'8'9 297/314 5.41

9698 T C U8 316/334 5.39

3480 A G U8b'c 301/319 5.64

9055 G A U8b 312/315 0.96

14167 C T U8b 307/307 0

10550 A G K 265/275 3.64

11299 T C K 297/314 5.41

14798 T C K 306/320 4.38

16224 T C K 271/282 3.90

16311 T C K 271/281 3.56

1189 T C K1 300/323 7.12

10398 A G K1 313/330 5.15

497 C T K1a 264/264 0

11025 T C K1a2 278/292 4.79

5773 G A K1a2a 372/376 1.06

Total 5,385/5,594 3.74 (3.23 - 4.23, 95% C.I.)

Total (a) 1,552/1,559 0.45 (0.12 - 0.78, 95% C.I.)

(a) Excluding positions affected by post-mortem DNA damage
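
The percentages in Table S4 correspond to the fraction of reads that do not carry the CB13 allele at each diagnostic position. The Python sketch below reproduces that arithmetic and attaches a normal-approximation (Wald) 95% interval. The choice of interval is an assumption: the source does not state how the confidence intervals were derived, and the function name is illustrative.

```python
import math

def contamination(matching, total, z=1.96):
    """Percentage of non-matching reads with an assumed Wald interval.

    Returns (estimate, lower, upper), all in percent.
    """
    p = (total - matching) / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return 100.0 * p, 100.0 * (p - half_width), 100.0 * (p + half_width)

# Counts copied from Table S4
est, lo, hi = contamination(1552, 1559)   # excluding damaged positions
print(round(est, 2), round(lo, 2), round(hi, 2))
```

For the damage-filtered totals this gives 0.45 (0.12 to 0.78), as reported. For the unfiltered totals (5,385/5,594) it gives 3.74 with an upper limit of 4.23 and a lower limit of 3.24 rather than the reported 3.23, which may reflect rounding or a slightly different interval method.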

Table S5. Mitochondrial DNA haplogroup attribution of the Cardial samples.

Sample ID  Haplogroup  HaploGrep's Quality  Coverage  Polymorphisms
CB13       K1a2a       99.3                 353.29    73G, 263G, 497T, 750G, 1189C, 1438G, 1811G, 2706G, 3480G, 4769G,
                                                      5773A, 7028T, 8860G, 9055A, 9698C, 10398G, 10550G, 11025C, 11299C,
                                                      11467G, 11719A, 12308G, 12372A, 14167T, 14766T, 14798C, 15326G,
                                                      16224C, 16311C, 16519C
CB14       X2c         76.0                 4.1       73G, 153G, 195C, 225A, 227G, 263G, 750G, 1719A, 2706G, 4769G,
                                                      6371T, 7028T, 8860G, 11719A, 12705T, 13966G, 14766T, 15326G,
                                                      16189C, 16223T, 16255A, 16278T
F19        H4a1a       100                  33.79     263G, 750G, 1438G, 3992T, 4024G, 4769G, 5004C, 8269A, 8860G,
                                                      9123A, 14365T, 14582G, 15326G
G21        H3          100                  63.55     263G, 750G, 1438G, 4769G, 6776C, 8860G, 15326G, 16519C
H3C6       H4a1a       86.6                 4.45      263G, 750G, 1438G, 3992T, 4024G, 5004C, 8269A, 8860G, 14365T,
                                                      14582G, 15326G
CS7675     K1a4a1      53.2                 0.69      263G, 497T, 750G, 1811G, 3480G, 7028T, 11719A, 11840T, 12372A,
                                                      13740C, 14766T, 14798C

Positions in red are represented by less than 3 reads

Table S6. Cova Bonica CB13 alleles for the SNPs included in the 8-plex pigmentation

phenotype prediction.

ID Gene Ancestral Derived Genotype

rs12913832 OCA2-HERC2 A G No data

rs1545397 OCA2 A T A (2x)

rs16891982 SLC45A2 C G 2C,1G (3x)

rs885479 MC1R G A G (5x)

rs1426654 SLC24A5 G A A (2x)

rs12896399 SLC24A4 G T G (1x)

rs6119471 ASIP G C C (1x)

rs12203592 IRF4 C T C (1x)

Table S7. Cova Bonica CB13 alleles for the SNPs included in the HIrisPlex pigmentation

phenotype prediction.

ID Gene Ancestral Derived Genotype

N29insA MC1R C A C (2x)

rs11547464 MC1R G A G (5x)

rs885479 MC1R G A G (5x)

rs1805008 MC1R C T C (6x)

rs1805005 MC1R G T G (1x)

rs1805006 MC1R C A No data

rs1805007 MC1R C T C (3x)

rs1805009 MC1R G C G (1x)

Y152OCH MC1R C A C (3x)

rs2228479 MC1R G A G (2x)

rs1110400 MC1R T C T (3x)

rs28777 SLC45A2 C A No data

rs16891982 SLC45A2 C G 2C,1G (3x)

rs12821256 KITLG T C T (5x)

rs4959270 EXOC2 C A C (1x)

rs12203592 IRF4 C T C (1x)

rs1042602 TYR C A A (1x)

rs1800407 OCA2 C T No data

rs2402130 SLC24A4 G A A (1x)

rs12913832 OCA2-HERC2 A G No data

rs2378249 ASIP/PIGU A G No data

rs12896399 SLC24A4 G T G (1x)

rs1393350 TYR G A G (2x)

rs683 TYRP1 A C No data

Table S8. Cova Bonica CB13 alleles at the SLC24A5 region to ascertain the haplotype around

the light skin allele.

ID Gene Ancestral Derived Genotype

rs1834640 SLC24A5 G A No data

rs2675345 SLC24A5 G A A (1x)

rs2469592 SLC24A5 G A A (2x)

rs2470101 SLC24A5 C T T (1x)

rs938505 SLC24A5 C T No data

rs2433354 SLC24A5 T C No data

rs2459391 SLC24A5 G A No data

rs2433356 SLC24A5 A G No data

rs2675347 SLC24A5 G A A (2x)

rs2675348 SLC24A5 G A A (1x)

rs1426654 SLC24A5 G A A (2x)

rs2470102 SLC24A5 G A No data

rs16960631 SLC24A5 A G A (1x)

rs2675349 SLC24A5 G A A (2x)

rs3817315 SLC24A5 T C No data

rs7163587 SLC24A5 T C No data

Table S9. Cova Bonica CB13 alleles at the OCA2/HERC2 region to ascertain the haplotype

around the blue-eyes mutation.

ID Gene Ancestral Derived Genotype

rs4778241 OCA2-HERC2 A C C (1x)

rs1129038 OCA2-HERC2 C T 1C,2T (3x)

rs12593929 OCA2-HERC2 G A A (1x)

rs12913832 OCA2-HERC2 A G No data

rs7183877 OCA2-HERC2 C A C (2x)

rs3935591 OCA2-HERC2 T C C (3x)

rs7170852 OCA2-HERC2 T A A (2x)

rs2238289 OCA2-HERC2 A G A (2x)

rs3940272 OCA2-HERC2 T G G (1x)

rs8028689 OCA2-HERC2 C T No data

rs2240203 OCA2-HERC2 C T T (1x)

rs11631797 OCA2-HERC2 A G No data

rs916977 OCA2-HERC2 T C No data

Table S10. D-stat of the form D (HG1, HG2, Neolithic Farmer, Outgroup). Standard errors (SE) were computed using block jackknife over 5Mb blocks.

HG1 HG2 Neolithic Farmer D-stat SE Z D-stat (only transversions) SE (only transversions) Z (only transversions)

KO1 LaBraña1 CB13 0.0229 0.0053405 4.288 0.0252 0.0055494 4.541

KO1 LaBraña1 Stuttgart 0.0251 0.0054435 4.611 0.0241 0.0059536 4.048

KO1 LaBraña1 NE1 0.0253 0.0057868 4.372 0.029 0.0063806 4.545

KO1 LaBraña1 Gok2 0.0178 0.0052725 3.376 0.0215 0.0059922 3.588

Loschbour LaBraña1 CB13 0.0062 0.0067983 0.912 0.0073 0.0069923 1.044

Loschbour LaBraña1 Stuttgart 0.013 0.0057345 2.267 0.0118 0.0062236 1.896

Loschbour LaBraña1 NE1 0.0047 0.0050756 0.926 0.0039 0.005644 0.691

Loschbour LaBraña1 Gok2 0.0116 0.0050833 2.282 0.0193 0.0058895 3.277

KO1 Loschbour CB13 0.0172 0.0068471 2.512 0.0179 0.0072823 2.458

KO1 Loschbour Stuttgart 0.0111 0.0063757 1.741 0.0097 0.0068022 1.426

KO1 Loschbour NE1 0.0193 0.0049335 3.912 0.021 0.0050909 4.125

KO1 Loschbour Gok2 0.0062 0.0049363 1.256 0.0015 0.0060729 0.247

KO1 Motala12 CB13 0.0226 0.0060074 3.762 0.0222 0.0064516 3.441

KO1 Motala12 Stuttgart 0.0116 0.0055609 2.086 0.0102 0.006387 1.597

KO1 Motala12 NE1 0.0263 0.0053849 4.884 0.0242 0.0060758 3.983

KO1 Motala12 Gok2 0.0222 0.0063738 3.483 0.0158 0.0067928 2.326

LaBraña1 Motala12 CB13 0.0011 0.0052632 0.209 -0.0019 0.0055719 -0.341

LaBraña1 Motala12 Stuttgart -0.0118 0.0056513 -2.088 -0.0103 0.0061677 -1.67

LaBraña1 Motala12 NE1 0.0024 0.0048387 0.496 -0.0003 0.0042857 -0.07

LaBraña1 Motala12 Gok2 0.005 0.0056689 0.882 -0.0033 0.0062857 -0.525

Loschbour Motala12 CB13 0.0073 0.007026 1.039 0.0065 0.0071272 0.912

Loschbour Motala12 Stuttgart 0.0011 0.0061453 0.179 0.0018 0.0064516 0.279

Loschbour Motala12 NE1 0.0075 0.0047111 1.592 0.0033 0.0054098 0.61

Loschbour Motala12 Gok2 0.0175 0.0060366 2.899 0.0176 0.0062925 2.797
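
The Z-scores in Table S10 are the D-statistic divided by its standard error. The Python sketch below shows that relation together with a simplified delete-one block jackknife. It is a sketch under stated assumptions rather than the authors' implementation: blocks are weighted equally (published implementations typically weight blocks by their number of sites), the sign convention D = (BABA - ABBA) / (BABA + ABBA) is assumed, and the per-block counts in the usage note are hypothetical.

```python
import math

def d_statistic(abba, baba):
    """D from per-block ABBA and BABA counts (assumed sign convention)."""
    total_abba, total_baba = sum(abba), sum(baba)
    return (total_baba - total_abba) / (total_baba + total_abba)

def block_jackknife_d(abba, baba):
    """Return (D, SE) using an unweighted delete-one block jackknife."""
    n = len(abba)
    d_full = d_statistic(abba, baba)
    # Recompute D with each block (e.g. each 5 Mb window) left out in turn
    leave_one_out = [
        d_statistic(abba[:i] + abba[i + 1:], baba[:i] + baba[i + 1:])
        for i in range(n)
    ]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean_loo) ** 2 for d in leave_one_out)
    return d_full, math.sqrt(variance)

def z_score(d, se):
    """Z-score as reported alongside each D-statistic."""
    return d / se
```

For the first row of Table S10, `z_score(0.0229, 0.0053405)` rounds to the tabulated 4.288. With hypothetical block counts such as `block_jackknife_d([10, 12, 8], [20, 15, 25])`, the function returns D = 1/3 and a non-zero standard error driven by the variation among blocks.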

References

Asquerino Fernández MD, López P, Molero G, Sevilla P, Aparicio MT, Ramos M. 1998. Cova

de la Sarsa (Bocairent, Valencia). Sector II: Gatera. Recerques del Museu d'Alcoi 7:47–88.

Asquerino Fernández MD. 1978. Cova de la Sarsa (Bocairente, Valencia). Análisis estadístico y

tipológico de materiales sin estratigrafía (1971-1974). Sagvntvm. Papeles del laboratorio

de Arqueología de Valencia 13:99–225.

Badal García E, Martí Oliver B, Pérez-Ripoll M. 2012. From agricultural to pastoral use:

changes in neolithic landscape at Cova de l’Or (Alicante, Spain). In: Wood and charcoal.

Evidence for human and natural history. Sagvntvm Extra-13. p. 75–84.

Bronk Ramsey C, Lee S. 2013. Recent and planned developments of the program oxcal.

Radiocarbon 55:720–730.

Casanova Vañó V. 1978. Enterramiento doble en la Cova de la Sarsa (Bocairent, València).

Archivo de Prehistoria Levantina 15:27–36.

Fu Q, Mittnik A, Johnson PLF, Bos K, Lari M, Bollongino R, Sun C, Giemsch L, Schmitz R,

Burger J, et al. 2013. A revised timescale for human evolution based on ancient

mitochondrial genomes. Curr. Biol. 23:553–559.

García Borja P, Salazar-García D, Martins H, Pérez Jordà G, Sanchis Serra A. 2012. Dataciones

radiocarbónicas de la Cova de la Sarsa (Bocairent, València). Recerques del Museu d'Alcoi

22:19–24.

García Borja P, Salazar-García DC, Pérez Fernández A, Pardo Gordó S, Casanova Vañó V.

2011. El Neolítico antiguo cardial y la Cova de la Sarsa (Bocairent, València). Nuevas

perspectivas a partir de su registro funerario. Munibe. Arqueología Antropología. 62:175–

195.

Juan-Cabanilles J. 2008. El utillaje de piedra tallada en la Prehistoria reciente valenciana.

Aspectos tipológicos, estilísticos y evolutivos. Serie de Trabajos Varios del Servicio de

Investigación Prehistórica de la Diputación Provincial de Valencia

Martí Oliver B, Pascual Pérez V, Gallart Martí D, López García P, Pérez Ripoll M, Acuña

Hernández JD, Robles Cuenca F. 1980. Cova de l’Or (Beniarrés Alicante) Vol. II. Serie de

Trabajos Varios del Servicio de Investigación Prehistórica de la Diputación Provincial de

Valencia

Martí Oliver B. 1977. Cova de l’Or (Beniarrés Alicante) Vol. I. Valencia: Serie de Trabajos

Varios del Servicio de Investigación Prehistórica de la Diputación Provincial de Valencia

Martí Oliver B. 1983. Cova de l’Or (Beniarrés, Alicante). Memorias de las campañas de

excavación 1975-1979. Noticiario Arqueológico Hispánico 16:11–55.

Pickrell JK, Pritchard JK. 2012. Inference of population splits and mixtures from genome-wide

allele frequency data. PLoS Genet. 8:e1002967.

Reimer PJ, Bard E, Bayliss A, Beck JW, Blackwell PG, Bronk Ramsey C, Buck CE, Cheng H,

Edwards RL, et al. 2013. IntCal13 and Marine13 radiocarbon age calibration curves 0–50,000

years cal BP. Radiocarbon 55:1869–1887.

Sanderson MJ. 2003. r8s: inferring absolute rates of molecular evolution and divergence times

in the absence of a molecular clock. Bioinformatics 19:301–302.

Skoglund P, Storå J, Götherström A, Jakobsson M. 2013. Accurate sex identification of ancient

human remains using DNA shotgun sequencing. J. Archaeol. Sci. 40:4477–4482.

Talamo S, Richards M. 2011. A comparison of bone pretreatment methods for AMS dating of

samples >30,000 BP. Radiocarbon 53:443–449.

Zilhão J. 1997. O Paleolítico Superior da Estremadura portuguesa. Lisboa: Colibri

Zilhão J. 2009. The Early Neolithic artifact assemblage from the Galeria da Cisterna (Almonda

karstic system, Torres Novas, Portugal). In: De Méditerranée et d’ailleurs. Mélanges

offerts à Jean Guilaine. Toulouse: Archives d’Écologie Préhistorique. p. 821–835.