Identifying Y-Chromosome Dynastic Haplotypes: The High Kings of Ireland Revisited

Maglio: Dynastic Haplotypes

Identifying Y-Chromosome Dynastic Haplotypes: The High Kings of Ireland Revisited Michael R. Maglio

Abstract

The use of median-joining network analysis to illustrate patrilineal clusters and the

selection of low mutation rate microsatellites to create a tribal haplotype provide a novel

approach for identifying dynastic relationships. Sixty-seven marker short tandem repeat

(STR) genetic analysis of Y-chromosomes reveals multiple unnoted modal haplotypes

showing a significant association with surnames claimed to have descended from the High

Kings of Ireland. This suggests that such phylogenetic prevalence is a biological record

and supports the reliability of early genealogies. This approach demonstrates genetic

genealogy building blocks and repeatable processes for practitioners.

Introduction

The septs of Ireland provide us an

opportunity to develop genetic genealogy

techniques and processes. Irish surnames are

typically patronymic. The surnames

generally take the form of Mac Cárthaigh

(McCarthy), meaning son of Cárthaigh, or Uí

Néill (O’Neill), meaning grandson /

descendant of Néill. Irish septs, while not as

formal as Scottish clans, serve as a collective

of related families with shared ancestry and

patronymic surnames. Multiple septs then

belong to larger dynasties such as the

Eóganachta and the Dál gCais.

If septs are patrilineal, then yDNA

haplotypes should be consistent across sept

surnames. Research on the Uí Néill

haplotype started with a geographical

selection and then a subsequent reduction by

sept surnames (Moore et al 2006). Consistent

seventeen-marker STR haplotypes were found

across the sept. Considering that seventeen

markers have a high probability of haplotype

convergence, this paper will revisit the Uí

Néill data at a higher STR marker level of 67

microsatellites.

Contrary to the Uí Néill paper, a subsequent

paper on Irish patrilineal kinship (McEvoy et

al 2008) found no genetic affinity among

septs. In addition to the Uí Néill dynasty, this

paper will also examine the Eóganachta, Dál

gCais and Connachta dynasties.

Two genetic genealogy techniques will be

demonstrated and their results compared to

determine the validity of the Irish dynastic

haplotypes. The first method will use

median-joining network analysis. This is not

new to genetics. Its use with 67 STR markers

will identify patrilineal clusters to within a

TMRCA of 390 years. Median-joining

network analysis with 17 or fewer markers

has been standard in the field, achieving a

TMRCA of only 2,200 years. It is not

surprising that there have been conflicting

results from previous papers.

The second method takes advantage of

“slow” mutating STR markers (Zhivotovsky

2004). Each microsatellite has its own

mutation rate (Chandler 2006) and they may

have different mutation rates per haplotype.

By selecting the 15 “slowest” markers with an

average mutation rate of 0.00024, a virtual

tribal haplotype is created that would be

stable within the last 2,000 years (90%

probability of 80 generations). This is an

order of magnitude lower than the average

rate of 0.0029 used as a constant in typical

TMRCA calculations.
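
As a rough check on the stability argument above, the following Python sketch computes the probability that a panel of markers shows no mutation at all over a given number of generations. It assumes a deliberately simple model that is not stated in the text: every marker mutates independently at the same per-generation rate, and back mutations are ignored. The function name prob_unchanged is hypothetical. Because the result depends on the model chosen, the sketch is not expected to reproduce the quoted percentage exactly; it only illustrates how strongly the per-marker rate affects haplotype stability.

```python
# Illustrative model only: independent markers, one shared per-generation
# mutation rate, no back mutations. Rates and panel size come from the text.

def prob_unchanged(rate_per_marker: float, n_markers: int, generations: int) -> float:
    """Probability that no marker in the panel mutates over the given generations."""
    return (1.0 - rate_per_marker) ** (n_markers * generations)

slow = prob_unchanged(0.00024, 15, 80)    # "slow" panel average rate
typical = prob_unchanged(0.0029, 15, 80)  # same panel size at the typical average rate

print(f"slow panel unchanged:    {slow:.3f}")
print(f"typical panel unchanged: {typical:.3f}")
```

Under these assumptions the slow panel is far more likely to survive 80 generations unchanged than a panel mutating at the typical average rate.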

The “tribal” markers isolated are DYS426,

DYS388, DYS392, DYS455, DYS454, DYS578,

DYS590, DYS641, DYS472, DYS594, DYS436,

DYS490, DYS450 and DYS640.

Each technique will illustrate the

identification of a dynastic haplotype and the

combined results reinforce the conclusions.

Methods

For each target sept, affiliated surnames are

identified. In the case of Uí Néill, the

following surnames and associated yDNA

STR records were accessed from Family Tree

DNA projects: O’Neill, Gallagher, Doherty

and O’Donnell. The selection includes 600

records and 5 common European haplogroups

(Y Chromosome Consortium 2002).

Each group of records is anonymized to

protect the privacy of the individuals.

Duplicate haplotypes within a surname are

removed to prevent skewing the results. Any

duplicates that remain will be across

surnames and add value to the results.
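
The de-duplication step can be sketched in Python as follows. This is a minimal illustration, assuming each record is a (surname, haplotype) pair with the haplotype held as a tuple of repeat counts; the record layout and the function name dedupe_within_surname are hypothetical and the sample values are made up.

```python
from typing import Iterable

Record = tuple[str, tuple[int, ...]]  # (surname, STR repeat counts)

def dedupe_within_surname(records: Iterable[Record]) -> list[Record]:
    """Drop repeated haplotypes inside a surname; keep repeats across surnames."""
    seen: set[Record] = set()
    kept: list[Record] = []
    for surname, haplotype in records:
        key = (surname, tuple(haplotype))
        if key not in seen:  # first time this haplotype appears for this surname
            seen.add(key)
            kept.append(key)
    return kept

# Made-up three-marker records for illustration only.
sample = [
    ("O'Neill", (12, 12, 14)),
    ("O'Neill", (12, 12, 14)),    # duplicate within one surname: dropped
    ("Doherty", (12, 12, 14)),    # same haplotype, different surname: kept
    ("Gallagher", (12, 12, 13)),
]
deduped = dedupe_within_surname(sample)
print(len(sample), "->", len(deduped))
```

Keying on the surname together with the haplotype is what lets identical haplotypes survive when they occur under different surnames.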

The dataset is processed through Dean

McGee’s Y-Utility: Y-DNA Comparison

tool at mymcgee.com/tools/yutility.html to

produce a ych format file for the Fluxus

Software Network application. A median-

joining network analysis is calculated for the

dataset (Bandelt et al 1999).

The resulting median-joining diagram has

clusters of tightly related individuals. The

more coherent the cluster, the closer the

relationship. For the purpose of this

technique, the optimum cluster will be highly

coherent and contain an example of every

surname in the dataset. Other clusters will

typically only contain a single surname.
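
The surname part of this selection criterion can be sketched in Python. The sketch assumes that cluster membership has already been read off the network diagram and recorded by hand as a mapping from a cluster label to the surnames of its nodes; that mapping, the labels and the function name find_dynastic_cluster are hypothetical, and nothing here is produced by the Network application in this form. Coherence is judged visually in the text and is not modelled.

```python
from typing import Optional

def find_dynastic_cluster(clusters: dict[str, list[str]],
                          surnames: set[str]) -> Optional[str]:
    """Return the label of the largest cluster that contains every surname."""
    complete = [
        (len(members), label)
        for label, members in clusters.items()
        if surnames <= set(members)  # every sept surname is present
    ]
    return max(complete)[1] if complete else None

# Hand-recorded, made-up cluster contents for illustration only.
clusters = {
    "left": ["O'Neill", "Gallagher", "Doherty", "O'Donnell", "Doherty"],
    "right": ["O'Neill", "O'Neill", "O'Neill"],
    "lower": ["O'Neill", "O'Neill"],
}
sept_surnames = {"O'Neill", "Gallagher", "Doherty", "O'Donnell"}
best = find_dynastic_cluster(clusters, sept_surnames)
print(best)
```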

Figure 1a shows the complete median-

joining network analysis for the Uí Néill

surnames including outliers. The outliers are

from haplogroups G, I, J and R1a. The main

cluster is haplogroup R1b. To identify the

core clusters, the application has the ability to

display only the torso of the network. Figure

1b shows four distinct clusters.

Figure 1 Median-joining network of yDNA

from dataset with Uí Néill dynasty surnames (n = 303).

a) The outlier nodes are haplogroup J and R1a. b) Torso

only view, showing the central clusters.

Each cluster is examined for surname

content, number of nodes and potential SNP

identification. Figure 2 is an enhanced view

of the four distinct clusters. The three clusters

on the right have only O’Neill surnames. The

cluster at the far right is predominantly R-

L159 and the cluster at the lower right has R-

P311/R-L151 nodes.

Figure 2 View of the Uí Néill network torso

showing four distinct clusters. The three groups on the

right are O’Neill only clusters.

The cluster at the left contains all of the Uí

Néill surnames, has the majority of nodes and

is SNP R-M222, which is consistent with

earlier studies. Figure 3 gives a detailed view

of the Uí Néill surname cluster.

Figure 3 Main Uí Néill cluster showing all

surnames. The O’Neill records are marked green.

Figure 4 demonstrates that a random

sampling of surnames will not resolve into a

coherent cluster of affiliated surnames.

Figure 4 Median-joining network of yDNA

sampled from three random Irish surnames (n = 340)

Duffy, Kelly and McCormick. The network resolves

into 10 distinct clusters with essentially no

commonality among the surnames.

The second technique involves examining

only the “slow” mutating markers for the Uí

Néill dataset (n=303). To speed up

manipulation of the “tribal” haplotype of 15

microsatellites, the marker values are

concatenated into a string, e.g. 12121411119168108101212811.

The “tribal” haplotypes are summarized per

surname and plotted to illustrate majority and

affinity.
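
The concatenation and per-surname summary can be sketched in Python as below. The marker names are those listed earlier in the text, which names 14 markers while describing a 15-marker panel, so the sketch uses only the names given. The record layout, the function names and the repeat counts are hypothetical placeholders, not values from the dataset.

```python
from collections import Counter, defaultdict

# Marker names exactly as listed in the text (14 names are given there).
TRIBAL_MARKERS = [
    "DYS426", "DYS388", "DYS392", "DYS455", "DYS454", "DYS578", "DYS590",
    "DYS641", "DYS472", "DYS594", "DYS436", "DYS490", "DYS450", "DYS640",
]

def tribal_string(values, markers=TRIBAL_MARKERS):
    """Concatenate the repeat counts of the chosen markers, in order."""
    return "".join(str(values[m]) for m in markers)

def summarize_by_surname(records, markers=TRIBAL_MARKERS):
    """Count how often each concatenated haplotype occurs within each surname."""
    summary = defaultdict(Counter)
    for surname, values in records:
        summary[surname][tribal_string(values, markers)] += 1
    return summary

# Placeholder repeat counts for illustration only; not taken from the dataset.
base = {marker: 10 for marker in TRIBAL_MARKERS}
variant = dict(base, DYS392=11)
records = [("O'Neill", base), ("O'Neill", base),
           ("Doherty", base), ("Doherty", variant)]

summary = summarize_by_surname(records)
for surname, counts in summary.items():
    print(surname, counts.most_common())
```

Bare concatenation mirrors the string form used in the text. If one-digit and two-digit repeat counts could be confused, joining the values with a separator would remove the ambiguity.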

Figure 5 Uí Néill dynastic haplotype. The

haplotypes from the Uí Néill dataset (n=303) converted

into 15 marker “tribal” haplotypes and summarized.

The Uí Néill dataset resolved into 37 unique

“tribal” haplotypes. Figure 5 shows that

haplotype 12121411119168108101212811 is

the most dominant across the Uí Néill

surnames. As with the median-joining

network analysis, this “tribal” haplotype is

consistent with SNP R-M222. As a triple

check, Figure 6 shows a traditional

phylogenetic tree. The same genetic cohorts

cluster together.

Figure 6 Uí Néill dataset phylogenetic tree.

Red bounding box designates the R-M222 cluster.

The full 67 marker Uí Néill haplotype can be

found in the appendix.
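
The text does not state how the modal haplotype was derived. A common convention, assumed here, is to take the most frequent repeat count at each marker across the cluster. The Python sketch below follows that convention; the function name modal_haplotype and the sample values are hypothetical.

```python
from collections import Counter

def modal_haplotype(haplotypes):
    """Per-marker most common repeat count across equally long haplotypes."""
    if not haplotypes:
        raise ValueError("at least one haplotype is required")
    # zip(*...) regroups the data marker by marker (column-wise).
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*haplotypes))

# Made-up two-marker haplotypes for illustration only.
cluster = [(12, 14), (12, 13), (13, 14)]
print(modal_haplotype(cluster))
```

When two values tie at a marker, Counter.most_common returns the one encountered first, so a real analysis may want an explicit tie rule.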

Haplotype data from this analysis is

available in Table 1, Table 2, Table 3 and

Table 4.

Discussion

The work on the Uí Néill dataset confirms

the work by Moore et al and takes it to a

higher level of accuracy. The challenge is to

repeat the process with other Irish septs. For

this exercise the following surnames are used:

Uí Briúin (O’Brien) of the Dál gCais, Mac

Cárthaigh (McCarthy) of the Eóganachta and

Ua Conchobhair (O’Connor) of the

Connachta.

For the Uí Briúin dataset, the following

surnames and associated yDNA STR records

were accessed from Family Tree DNA

projects: O’Brien, Hogan, Kennedy and

McMahon. The selection includes 615

records and 5 common European

haplogroups.

Figure 7 shows the complete median-

joining network analysis for the Uí Briúin

surnames including outliers. The outliers are

from haplogroups E, G, I, J and R1a. The

main cluster is haplogroup R1b.

Figure 7 Median-joining network of yDNA

from dataset with Uí Briúin dynasty surnames (n =

330).

Figure 8 is an enhanced view of two distinct

clusters and multiple indistinct clusters. The

cluster on the middle left has only McMahon

surnames and is R-DF21.

Figure 8 View of the Uí Briúin network torso

showing two distinct clusters and multiple indistinct

clusters.

The lower cluster has all of the Uí Briúin

surnames, has the majority of nodes and is

SNP R-L226. This is consistent with

previous work (Wright 2009). Figure 9 gives

a detailed view of the Uí Briúin surname

cluster.

Figure 9 Main Uí Briúin cluster showing all

tested surnames.

The “tribal” haplotypes are summarized per

surname and plotted to illustrate majority and

affinity.

Figure 10 Uí Briúin dynastic haplotype. The

haplotypes from the Uí Briúin dataset (n=330)

converted into 15 marker “tribal” haplotypes and

summarized.

The Uí Briúin dataset resolved into 46

unique “tribal” haplotypes. Figure 10 shows

that haplotype 12121311119168108101212811

is the most dominant across the Uí Briúin

surnames. As with the median-joining

network analysis, this “tribal” haplotype is

consistent with SNP R-L226. The full 67

marker Uí Briúin haplotype can be found in

the appendix.

The Mac Cárthaigh dataset has the

following surnames and associated yDNA

STR records as accessed from Family Tree

DNA projects: McCarthy, Callaghan,

Donovan and Sullivan. The selection

includes 319 records and 4 common European

haplogroups.

Figure 11 shows the complete median-

joining network analysis for the Mac

Cárthaigh surnames including outliers. The

outliers are from haplogroups E, I and J. The

main cluster is haplogroup R1b.

Figure 11 Median-joining network of yDNA

from dataset with Mac Cárthaigh dynasty surnames (n

= 207).

Figure 12 is an enhanced view of five

distinct clusters.

Figure 12 View of the Mac Cárthaigh network

torso showing five distinct clusters.

The center cluster has all of the Mac

Cárthaigh surnames, the majority of nodes

and is SNP R-CTS4466. This is consistent

with previous work (McCarthy 2013). Figure

13 gives a detailed view of the Mac Cárthaigh

surname cluster.

Figure 13 Main Mac Cárthaigh cluster showing

all tested surnames.

The “tribal” haplotypes are summarized per

surname and plotted to illustrate majority and

affinity.

Figure 14 Mac Cárthaigh dynastic haplotype.

The haplotypes from the Mac Cárthaigh dataset

(n=207) converted into 15 marker “tribal” haplotypes

and summarized.

The Mac Cárthaigh dataset resolved into 38

unique “tribal” haplotypes. Figure 14 shows

that haplotype 12121311119168108101212811

is the most dominant across the Mac

Cárthaigh surnames. As with the median-

joining network analysis, this “tribal”

haplotype is consistent with SNP R-CTS4466.

The full 67 marker Mac Cárthaigh haplotype

can be found in the appendix.

The final dataset, Ua Conchobhair, has the

following surnames and associated yDNA

STR records as accessed from Family Tree

DNA projects: O’Connor, McManus, Reilly

and Rourke. The selection includes 352

records and 5 common European

haplogroups.

Figure 15 shows the complete median-

joining network analysis for the Ua

Conchobhair surnames including outliers.

The outliers are from haplogroups E, I and

R1a. The main cluster is haplogroup R1b.

Figure 15 Median-joining network of yDNA

from dataset with Ua Conchobhair dynasty surnames (n

= 145).

In Figure 16, the cluster on the right is a

McManus only group with a SNP of R-L513.

Figure 16 View of the Ua Conchobhair network

torso showing two distinct clusters.

The left cluster has all of the Ua Conchobhair

surnames, the majority of nodes and is SNP

R-M222. This is consistent with previous

work relating O’Connor individuals to

O’Neill (Moore et al 2006). Figure 17 gives a

detailed view of the Ua Conchobhair surname

cluster.

Figure 17 Main Ua Conchobhair cluster

showing all tested surnames.

The “tribal” haplotypes are summarized per

surname and plotted to illustrate majority and

affinity.

Figure 18 Ua Conchobhair dynastic haplotype.

The haplotypes from the Ua Conchobhair dataset

(n=145) converted into 15 marker “tribal” haplotypes

and summarized.

The Ua Conchobhair dataset resolved into

22 unique “tribal” haplotypes. Figure 18

shows that haplotypes

12121311119168108101212811 and

12121411119168108101212811 are equally

dominant across the Ua Conchobhair

surnames. Tribal haplotype

12121411119168108101212811 is associated

with the central cluster from the network

analysis (consistent with SNP R-M222) and

haplotype 12121311119168108101212811 is

loosely clustered (not closely related). The

full 67 marker Ua Conchobhair haplotype can

be found in the appendix.

Conclusions

While median-joining network analysis is

not new, there are advantages to using the

technique with 67 STR markers. The same

four Uí Néill clusters found in Figure 2 are

absent when the network is analyzed at 25

markers (Fig. 19). At 67 markers the network

clusters coalesce into discernible SNP

families.

Figure 19 Using only 25 STR markers, the Uí

Néill network collapses to a single cluster.

The quantity of “tribal” markers (15) was

selected to obtain stable haplotypes with an

age of 2,000 years. Three out of the four

cases presented obvious results. In the Ua

Conchobhair case, the network analysis was

needed to differentiate the haplotypes. The

Uí Briúin and Mac Cárthaigh data presented

the same “tribal” haplotype, yet they have two

different SNPs. The 15 markers selected

were not unique to the SNP level. The

“tribal” haplotypes did act as a first pass filter

to identify the primary families within the

larger population.

The genealogies of the Kings of Ireland are

semi-mythological. These techniques validate

the most likely descendants of these kings.

Short of opening a crypt and exhuming

remains, genetic analysis to reconstruct the

DNA of historic individuals is necessary. By

identifying the 67 marker modal haplotypes

and SNPs of the Kings, other genealogical

comparisons can be established.

Niall Noígíallach was High King of Ireland

around 378 CE and founder of the Uí Néill

dynasty. Historically, his half-brother Brión

was one of the founders of the Connachta

dynasty and an ancestor of the last High King

of Ireland, Ruaidrí Ua Conchobair. If their

genealogies are correct, the evidence is in

their DNA. The data show that Uí Néill and

Ua Conchobhair share the same SNP, R-M222.

Although the modal haplotype for each is

derived, it serves as an approximation. The

Uí Néill and Ua Conchobhair modals are a

6-step match at 67 markers. There is a 99%

probability of a relationship no further back

than 1,260 years ago. That time period is more

recent than Niall and Brión. TMRCA

calculations are notoriously difficult. A more

recent relationship is preferred over a more

distant relationship. A more distant

relationship does not provide evidence for

Niall and Brión’s relation. A more recent

relationship allows for fewer mutations per

generation and/or back mutations. The results

make a strong case for the validity of the

historic genealogy.
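
The text does not say how the steps between modal haplotypes were counted. One simple convention, assumed here, is stepwise counting: sum the absolute differences in repeat count at each marker. Testing companies may treat multi-copy markers differently, so this Python sketch is an approximation; the function name step_distance and the sample values are hypothetical.

```python
def step_distance(a, b):
    """Sum of absolute repeat-count differences across markers (stepwise counting)."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers in the same order")
    return sum(abs(x - y) for x, y in zip(a, b))

# Placeholder five-marker haplotypes for illustration only.
modal_a = (13, 24, 14, 10, 11)
modal_b = (13, 25, 14, 11, 9)
print(step_distance(modal_a, modal_b))
```

Applied to two 67-marker modal haplotypes listed in the same marker order, the same function would give the step count that is then fed into a TMRCA estimate.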

Brian Boru, High King of Ireland in 1002

CE, belonged to the Dál gCais dynasty and

Tadhg Mac Cárthaigh, the first King of

Desmond, belonged to the Eóganachta

dynasty. Ancient genealogies have the

Eóganachta and Dál gCais dynasties

descended from Ailill Aulom, the son-in-law

of legendary king Conn of the Hundred

Battles. The Mac Cárthaighs and Uí Briúins

do not share the same SNP (R-L226 vs. R-

CTS4466), but by descent they would share a

common R-DF13 ancestor. The Mac

Cárthaigh and Uí Briúin modals are an 11-

step match at 67 markers. There is a 99%

probability of a relationship no further back than

1,920 years ago. This puts a Mac Cárthaigh-

Uí Briúin common ancestor as a

contemporary of the legendary Conn.

While the data can prove that a definite

genetic ancestor exists for the Mac Cárthaigh-

Uí Briúin septs and the Uí Néill-Ua

Conchobhair septs, it is impossible to prove the

identity of that common ancestor. Genetic

genealogy techniques are invaluable for the

reconstruction of distant family trees at the

macro level.

Conflict of Interest

The author declares no conflict of interest.

Acknowledgements

I thank all of the DNA donors who have

made their results publicly accessible for

review. Special thanks to Dean McGee for

making his DNA analysis website available.

Web Resources

Fluxus Engineering, http://www.fluxus-engineering.com/sharenet.htm (for NETWORK)

Y-Utility: Y-DNA Comparison Utility,

http://www.mymcgee.com/tools/yutility.html?mode=ftdna_mode

References

Bandelt HJ, Forster P, Röhl A (1999) Median-

joining networks for inferring intraspecific

phylogenies. Mol Biol Evol 16:37–48

Chandler JF (2006). Estimating per-locus

mutation rates. Journal of Genetic

Genealogy, 2, 27-33.

McCarthy N (2013) DNA Profiling of

McCarthy Septs and Agnomens.

Presented at Back to Our Past, 18-20

October 2013, RDS, Dublin

McEvoy B, Simms K, & Bradley DG (2008).

Genetic investigation of the patrilineal

kinship structure of early medieval

Ireland. American journal of physical

anthropology, 136(4), 415-422.

Moore LT, McEvoy B, Cape E, Simms K,

Bradley DG (2006) A Y-Chromosome

Signature of Hegemony in Gaelic Ireland.

Am J Hum Genet, 78:334–338.

O’Neill EB, & McLaughlin JD (2006).

Insights Into the O'Neills of Ireland from

DNA Testing. Journal of Genetic

Genealogy, 2(2).

Wright DM (2009). A Set of Distinctive

Marker Values Defines a Y-STR

Signature for Gaelic Dalcassian Families.

Journal of Genetic Genealogy, 5, 1.

Y-Chromosome-Consortium (2002) A

nomenclature system for the tree of

human Y-chromosomal binary

haplogroups. Genome Res 12:339–348

Zhivotovsky LA, Underhill PA, Cinnioğlu C,

Kayser M, Morar B, Kivisild T, Scozzari

R, Cruciani F, Destro-Bisol G, Spedini G,

Chambers GK, Herrera RJ, Yong KK,

Gresham D, Tournev I, Feldman MW,

Kalaydjieva L (2004) The effective

mutation rate at Y chromosome short

tandem repeats, with application to human

population divergence time. Am J Hum

Genet 74:50–61

Appendix

Figure A1 Sixty-seven STR Uí Néill Modal Haplotype.

Figure A2 Sixty-seven STR Uí Briúin Modal Haplotype.

Figure A3 Sixty-seven STR Mac Cárthaigh Modal Haplotype.

Figure A4 Sixty-seven STR Ua Conchobhair Modal Haplotype.