An algebraic analysis of the two state Markov model on tripod trees


Steffen Klaere (a,*), Volkmar Liebscher (b)

(a) Department of Mathematics and Statistics, University of Otago, Dunedin, New Zealand
(b) Institut für Mathematik und Informatik, Universität Greifswald, Germany

Abstract

Methods of phylogenetic inference use more and more complex models to generate trees from data. However, even simple models and their implications are not fully understood.

Here, we investigate the two-state Markov model on a tripod tree, inferring conditions under which a given set of observations gives rise to such a model. This approach has been taken before by several scientists from different fields of research.

We fully analyze the model, present conditions under which one can infer a model from the observation or at least get support for the tree-shaped interdependence of the leaves considered.

We also present all conditions under which the results can be extended from tripod trees to quartet trees, a step necessary to reconstruct at least a topology. Apart from finding conditions under which such an extension works we discuss example cases for which such an extension does not work.

We present a complete algebraic analysis of the two-state Markov model of evolution on tripod trees. Consequences for quartet trees are indicated, too.

Keywords: Phylogenetics, Identifiability, Invariant, Two-State-Model

1. Introduction

In phylogeny, one assumes that the relationship of a set of taxonomic units (taxa for short) can be visualised by a (binary) tree. The aim is to derive this tree from the observations at the taxa. From a stochastic modelling point of view, one assigns the taxa to the leaves of a (binary) tree, and assumes that the observations (which are usually considered to be i.i.d. over different sites) are the end results of a Markov process along the tree. The goal is to derive the best combination of tree and Markov model to explain the observations.

This work regards the identifiability problem of this inference. It essentially asks whether infinite data sets are able to uniquely identify both the tree and the transitions on the tree. Note that in the present context, identifiability readily leads to consistency of various methods of estimating the parameters of the model (see Bryant et al., 2005, Section 2.2 for an overview).

* [email protected]

Preprint submitted to arXiv February 6, 2011. arXiv:1012.0062v2 [q-bio.PE], 18 Dec 2010.

Chang (1996) states that a reversible Markov process on a tree can be reconstructed from the restrictions of its distribution to all triples of leaves, and provides a characterisation of the transition matrices yielding a reversible process. However, usually one only has an estimate of the leaf distribution such a process induces. This leads to the question whether one can find (simple) conditions to determine whether a taxon distribution comes from a Markov process. In other words, we ask whether we could validate the model, at least if there are infinitely many data points available.

To approach this problem, we consider a very simple model. We assume that our process can take only one of two states, which means that there are only two states for every site, and that the tree is a tripod tree.

Under these restrictions, we can completely describe the map from the taxon distribution to the parameters of the model, including necessary and sufficient conditions on positivity of the parameters. No conditions on reversibility of the processes on the edges are needed. The analysis of the model on tripod trees has immediate consequences for quartet trees. We derive these conditions to exemplify the shortcomings of an extension from tripods to quartets.

Technically, the generic part of this work is already well-known. Initial work on the two-state model from psychology can be found in Lazarsfeld and Henry (1968). Pearl and Tarsi (1986) used those results in artificial intelligence to algorithmically identify the whole tree behind two-state Markov models. Note that identifiability of Markov models, especially in phylogeny, was studied in Allman et al. (2009); Allman and Rhodes (2008, 2003); Baake (1998); Chang (1996). We add to those results the analysis of the degenerate cases, together with a complete analysis of the quartet tree model.

The typical tool (for models with more states) to identify a subspace of taxon distributions which might come from a Markovian tree model is the set of phylogenetic invariants (Allman and Rhodes, 2008; Sumner et al., 2008; Sturmfels and Sullivant, 2005; Allman and Rhodes, 2003; Lake, 1987; Cavender and Felsenstein, 1987). Those invariants are polynomials in the taxon distribution which are zero for those distributions which are derived from the model of interest. In the two-state tripod case there is only a trivial invariant. However, not all leaf distributions are derived from the Markov model. In fact, we derive polynomials which vanish at distributions which satisfy the trivial invariant but are not identifiable under the Markov model. To accommodate this observation we suggest incorporating these polynomials into the set of invariants, with the addition that these polynomials must not vanish for identifiable distributions. We discuss degenerate distributions to describe this observation.

Although most of the leaf distributions allow for complex solutions of the model equations, additional inequalities must be fulfilled for the solution of the algebraic equations to be the parameters of a Markov model (Zwiernik and Smith, 2010; Matsen, 2009). The approach of Matsen is restricted to the Cavender-Farris-Neyman model (CFN; Cavender, 1978; Farris, 1973; Neyman, 1971) to accommodate the Hadamard approach (Hendy and Penny, 1989; Szekely et al., 1993). Extending our approach we recover the inequalities presented in Pearl and Tarsi (1986).

As a final step we investigate how the results for tripod trees extend to trees of four leaves. The results provide a glimpse at what we can expect from the reconstruction from tripods when we have no knowledge of the identifiability of the given taxon distribution.

The structure of this work is as follows: In Section 2 we describe the general mutation model on a tree, with specialisation to tripod trees coming in Section 3. Section 4 deals with the complete solution of the two-state tripod tree model. Then, in Section 5 we use these results to analyse the general two-state Markov model on quartet trees. For the sake of readability, proofs are presented in Appendix A.

2. The Markov model of mutation along a tree

In this section we introduce the general Markov model and its properties. Pearl and Tarsi (1986) nicely motivate this model in the following way. Assume one is given a set $L$ of taxa and a set of observations from a Markov process $X : L \to \{0, 1\}$. From these observations one deduces a correlation between the taxa. The assumption is that this correlation can be explained by an underlying (binary) tree $T = (V, E)$ and an extension $Y : V \to \{0, 1\}$ of $X$ such that for any pair of taxa there is an interior node such that, given the state at the interior node, the two taxa are independent. See Fig. A.1 for a depiction of this.

[Figure 1 about here.]

Let us look closer at the process $Y$. The independence of pairs of taxa given an interior vertex on the path between them corresponds to the so-called directed local Markov property (e.g., Lauritzen, 1996, Chapter 2). For this property one has to identify a vertex $\zeta \in V$ as the root of the tree and direct all edges away from $\zeta$.

The directed local Markov property states that, conditioned on the state of its parent vertex, the state of a vertex $\alpha \in V$ is independent of the states of its non-descendants, i.e. all those vertices whose path to the root does not pass $\alpha$, excluding its parent. With this property the joint distribution $p^Y$ has the factorization property, i.e. for the joint state $\chi \in \{0, 1\}^{|V|}$ we get

\[
p^Y_\chi = \Pr[Y_\zeta = \chi_\zeta] \prod_{(\alpha,\beta) \in E} \Pr[Y_\beta = \chi_\beta \mid Y_\alpha = \chi_\alpha] = q^\zeta_{\chi_\zeta} \prod_{(\alpha,\beta) \in E} M^{\alpha\beta}_{\chi_\alpha \chi_\beta}. \tag{1}
\]

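We only have partial knowledge of the realizations of the process $Y$ through the process $X$ on the leaves. The joint distribution $p^X$ of $X$ can then be inferred from (1) using the law of total probability. Let $x \in \{0, 1\}^{|L|}$ denote the joint state at the leaves. Then

\[
p^X_x = \sum_{\chi \in \{0,1\}^{|V|} :\, \chi|_L = x} p^Y_\chi = \sum_{\chi \in \{0,1\}^{|V|} :\, \chi|_L = x} q^\zeta_{\chi_\zeta} \prod_{(\alpha,\beta) \in E} M^{\alpha\beta}_{\chi_\alpha \chi_\beta}. \tag{2}
\]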

Note that under the assumption that $X$ comes from a reversible Markov process $Y$, Chang (1996) proved that all process parameters $(M^e)_{e \in E}$ and $q^\zeta$ can be recovered from all the distributions of the restrictions of $X$ to arbitrary triples of taxa.

If we find parameters $(M^e)_{e \in E}$ and $q^\zeta$ for a joint taxon distribution $p$ then we call $p$ decomposable. If the parameters are unique (up to model-specific symmetries) but possibly complex valued, we call $p$ algebraically identifiable, and if further the parameters are marginal and transition probabilities, then $p$ is called tree identifiable.

Looking at (2) we realize that deriving the decomposability of a distribution $p$ is equivalent to solving a polynomial equation system of $2^{|L|} - 1$ independent equations in $4|L| - 5$ variables. We observe that the Markov equations are overdetermined for $|L| > 3$, i.e. the space of decomposable distributions is a proper subset of the space of all distributions. From this we conclude that there are conditions which define a decomposable distribution. These conditions are generally known as invariants, polynomials in $2^{|L|} - 1$ variables whose roots are distributions which are algebraically decomposable. One example of an invariant is

\[
\sum_{x \in \{0,1\}^{|L|}} p_x = 1, \tag{3}
\]

i.e. all probabilities sum to one. This is fittingly called the trivial invariant. Allman and Rhodes (2008) provide a complete set of invariants for trees of arbitrary size under a two-state model, and observe that for complete identification the knowledge of the restrictions to six taxa is necessary.
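The counting argument above can be spelled out in a few lines. The following is only an illustrative sketch; the function name is hypothetical, and the parameter count assumes an unrooted binary tree with $2|L| - 3$ edges, two free entries per edge matrix and one free root probability.

```python
def equation_and_parameter_counts(n_leaves):
    """Return (independent equations, free parameters) for n binary leaves.

    A joint distribution on n binary leaves has 2**n - 1 free entries.
    Assumption: a binary tree with n leaves has 2n - 3 edges, each with a
    2x2 stochastic matrix (2 free entries), plus 1 free root probability.
    """
    equations = 2 ** n_leaves - 1
    parameters = 2 * (2 * n_leaves - 3) + 1  # equals 4n - 5
    return equations, parameters


for n in (3, 4, 5):
    print(n, equation_and_parameter_counts(n))
```

For three leaves both counts coincide, which matches the observation that the system becomes overdetermined only for $|L| > 3$.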

However, as pointed out in multiple publications (e.g., Pearl and Tarsi, 1986; Matsen, 2009) such invariants are not sufficient to guarantee tree identifiability. In particular, additional inequalities are needed.

Here, we are not only interested in recapturing invariants and inequalities but also look at those distributions which are not tree identifiable or not decomposable at all to discuss their impact on invariant-based inference.

3. General properties of a Markov model on a tripod tree

The starting point of our analysis is the tripod tree $T$ with taxa $\alpha, \beta, \gamma$, root $\zeta$ and edges $(\zeta, \alpha)$, $(\zeta, \beta)$, $(\zeta, \gamma)$ (see Fig. A.2). This is the only labeled topology for three taxa. Hence any inference will be process- and not topology-related. Allman and Rhodes (2003) place the root at a taxon for their approach. We will leave the root at an interior node for the symmetry this provides in the tree equations.

[Figure 2 about here.]

As we have seen before, if the joint distribution $p$ of $X_\alpha, X_\beta, X_\gamma$ comes from a Markov process then there are parameters $q^\zeta, M^\alpha, M^\beta, M^\gamma$ such that the Markov equations (2) are satisfied. On a tripod tree these equations are the tripod equations

\[
p_{abc} = q^\zeta_1 M^\alpha_{1a} M^\beta_{1b} M^\gamma_{1c} + (1 - q^\zeta_1) M^\alpha_{0a} M^\beta_{0b} M^\gamma_{0c}, \qquad a, b, c \in \{0, 1\}. \tag{4}
\]

As before, we call $p$ decomposable if such parameters exist, algebraically identifiable if the parameters are unique, and tripod identifiable if the parameters are unique and are proper marginal and transition probabilities.
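To make the tripod equations (4) concrete, the following minimal sketch evaluates them for given parameters. The function names and the numerical parameter values are illustrative assumptions, not taken from the paper; exact rational arithmetic is used so that the result can be compared without rounding.

```python
from fractions import Fraction
from itertools import product


def transition(m01, m11):
    """2x2 row-stochastic matrix built from Pr[1 | root 0] and Pr[1 | root 1]."""
    return ((1 - m01, m01), (1 - m11, m11))


def tripod_distribution(q1, M_alpha, M_beta, M_gamma):
    """Leaf distribution p[(a, b, c)] from the tripod equations (4).

    q1 is Pr[root = 1]; M_x[z][s] is Pr[leaf x = s | root = z].
    """
    q = (1 - q1, q1)
    return {
        (a, b, c): sum(q[z] * M_alpha[z][a] * M_beta[z][b] * M_gamma[z][c]
                       for z in (0, 1))
        for a, b, c in product((0, 1), repeat=3)
    }


# Hypothetical parameter values, chosen only for illustration.
F = Fraction
p = tripod_distribution(F(1, 3),
                        transition(F(1, 10), F(4, 5)),
                        transition(F(1, 5), F(7, 10)),
                        transition(F(1, 4), F(9, 10)))
print(sum(p.values()))
print(p[(1, 1, 1)])
```

Any distribution produced this way is decomposable by construction; the interesting direction, treated in Section 4, is the inverse map from $p$ back to the parameters.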

The works of Lazarsfeld and Henry (1968) and Pearl and Tarsi (1986) were mainly interested in inferring conditions under which a triplet distribution is identifiable. While recovering their results we also investigate decomposability and algebraic identifiability in order to describe their impact on invariant-based inference.

For three taxa the only invariant restriction is the trivial invariant. Thus, one could expect that all triplet distributions are decomposable. As we will see later, this is not the case; in fact one has to add polynomials for which a decomposable distribution cannot be a zero point.

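3.1. Statistics for binary models

Since we are looking at binary random variables we can employ some properties of these models. In particular, computing means is very simple:

\[
\begin{aligned}
\varepsilon_{\alpha\beta\gamma} &:= E X_\alpha X_\beta X_\gamma = \Pr[X_\alpha = 1, X_\beta = 1, X_\gamma = 1] = p_{111}, \\
\varepsilon_{\alpha\beta} &:= E X_\alpha X_\beta = \Pr[X_\alpha = 1, X_\beta = 1] = p_{11\Sigma} = p_{110} + p_{111}, \\
\varepsilon_{\alpha} &:= E X_\alpha = \Pr[X_\alpha = 1] = p_{1\Sigma\Sigma} = p_{100} + p_{101} + p_{110} + p_{111}.
\end{aligned}
\]

Below, we use equivalent definitions for $\varepsilon_{\alpha\gamma}$, $\varepsilon_{\beta\gamma}$, $\varepsilon_\beta$ and $\varepsilon_\gamma$, and $p_{01\Sigma}$ and so on. With this, we see easily that a triplet distribution can be recovered from all the means and thus the tripod equations can be reformulated in terms of the means.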

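In the following, we make use of the simple relationship between means and covariances,

\[
\tau_{\alpha\beta} := \mathrm{Cov}[X_\alpha, X_\beta] = E X_\alpha X_\beta - E X_\alpha E X_\beta = \varepsilon_{\alpha\beta} - \varepsilon_\alpha \varepsilon_\beta = p_{00\Sigma} p_{11\Sigma} - p_{01\Sigma} p_{10\Sigma},
\]

with equivalent definitions for $\tau_{\alpha\gamma}$ and $\tau_{\beta\gamma}$. Of further interest are terms which are related to conditional covariances for binary variables ($c \in \{0, 1\}$),

\[
\tau_{\alpha\beta|c} := p_{11c} p_{\Sigma\Sigma c} - p_{1\Sigma c} p_{\Sigma 1 c} = p_{00c} p_{11c} - p_{01c} p_{10c},
\]

with equivalent definitions for $\tau_{\alpha\gamma|b}$ and $\tau_{\beta\gamma|a}$. Finally, we also introduce the three-way covariance

\[
\begin{aligned}
\tau_{\alpha\beta\gamma} &:= \mathrm{Cov}[X_\alpha, X_\beta, X_\gamma] = E (X_\alpha - E X_\alpha)(X_\beta - E X_\beta)(X_\gamma - E X_\gamma) \\
&= \varepsilon_{\alpha\beta\gamma} - \varepsilon_\alpha \varepsilon_{\beta\gamma} - \varepsilon_\beta \varepsilon_{\alpha\gamma} - \varepsilon_\gamma \varepsilon_{\alpha\beta} + 2 \varepsilon_\alpha \varepsilon_\beta \varepsilon_\gamma.
\end{aligned}
\]

For a review on covariance for more than two random variables see e.g. Rayner and Beh (2009). Using these notations we can immediately propose a useful property.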

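Lemma 1. Let $p$ denote the joint probability for binary random variables $X_\alpha$, $X_\beta$ and $X_\gamma$. If we flip the state in one taxon, then we flip the signs in its pairwise covariances. E.g., if $X_\alpha \mapsto 1 - X_\alpha$, then $\tau_{\alpha\beta} \mapsto -\tau_{\alpha\beta}$, $\tau_{\alpha\gamma} \mapsto -\tau_{\alpha\gamma}$, $\tau_{\beta\gamma} \mapsto \tau_{\beta\gamma}$.

In consequence, the product $\tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma}$ will always have the same sign no matter how often we flip states.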
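The statistics of Section 3.1 and the sign behaviour in Lemma 1 can be checked numerically. The sketch below is an illustration only: the helper names `stats` and `flip`, the bit-mask encoding of the leaves and the weights of the sample distribution are assumptions made for this example.

```python
from fractions import Fraction


def stats(p):
    """Means, pairwise covariances and the three-way covariance of p.

    p lists the eight probabilities in the order p000, p001, ..., p111,
    so entry 4a + 2b + c belongs to (X_alpha, X_beta, X_gamma) = (a, b, c).
    """
    def mean(mask):
        # Probability that all leaves selected by the 3-bit mask equal 1.
        return sum(v for i, v in enumerate(p) if (i & mask) == mask)

    ea, eb, ec = mean(4), mean(2), mean(1)
    tab = mean(6) - ea * eb
    tac = mean(5) - ea * ec
    tbc = mean(3) - eb * ec
    t3 = (mean(7) - ea * mean(3) - eb * mean(5) - ec * mean(6)
          + 2 * ea * eb * ec)
    return (ea, eb, ec), (tab, tac, tbc), t3


def flip(p, mask):
    """State flip X -> 1 - X at the leaves in mask (alpha=4, beta=2, gamma=1)."""
    return [p[i ^ mask] for i in range(8)]


# Arbitrary weights, chosen only for illustration.
p = [Fraction(w, 24) for w in (5, 1, 2, 3, 1, 4, 2, 6)]
_, tau, _ = stats(p)
_, tau_flipped, _ = stats(flip(p, 4))  # flip the state at alpha
print(tau)
print(tau_flipped)
```

Flipping $\alpha$ negates the two covariances involving $\alpha$ and leaves $\tau_{\beta\gamma}$ untouched, so the product of the three covariances is unchanged.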

3.2. Tree properties

In this section we assume that $p$ is decomposable and regard some immediate consequences. We will later see that these conditions are necessary for identifiability but not sufficient. Nevertheless, these conditions provide some immediate insights.

Lemma 2.
1. If a triplet distribution $p$ is decomposable on $T$ with $\tau_{\alpha\beta} = 0$, then also $\tau_{\alpha\beta\gamma} = 0$ and $\tau_{\alpha\gamma} = 0$ or $\tau_{\beta\gamma} = 0$.
2. If a triplet distribution $p$ is tree identifiable then the product $\tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma}$ is non-negative.

The non-negativity of the product has already been verified by Lazarsfeld and Henry (1968). With Lemma 1 it is not complicated to derive that on a star tree (with an arbitrary number of leaves) there always is a state flipping such that all pairs of leaves are positively correlated.

Point 1 occurs exactly if $X_\alpha$ or $X_\beta$ is independent of the remaining random variables. It also implies the following:

Corollary 3. A triplet distribution $p$ with $\tau_{\alpha\beta} = 0$ but $\tau_{\alpha\gamma} \neq 0$ and $\tau_{\beta\gamma} \neq 0$ is not decomposable.

Thus we already see that the trivial invariant does not characterize decomposable distributions in this setting. The following example shows that such cases can be easily constructed.

Example 1. Triplet distributions of type

\[
p = (p_{000}, p_{001}, p_{010}, p_{011}, p_{100}, p_{101}, p_{110}, p_{111}) = (4 - x, x, 2, 2, 2, 2, 2, 2)/16, \qquad x \in [0, 4] \setminus \{2\},
\]

yield $\tau_{\alpha\beta} = 0$ but $\tau_{\alpha\gamma} = \tau_{\beta\gamma} = (2 - x)/32$ and hence are not decomposable. Moreover, there is no graphical structure for which two taxa can be independent of each other but each is correlated with a third taxon.

4. Solving the tripod equations

In this section we are given a triplet distribution $p$ and infer conditions under which it is identifiable. For each case we will present an example.

4.1. The algebraic solution

As has been pointed out multiple times, the only invariant in the tripod case is the trivial invariant. In other words, the “set” of invariants for a tripod tree is satisfied by all triplet distributions. However, as we have seen in Corollary 3 there are triplet distributions which are not decomposable even though they satisfy the trivial invariant. Thus executing the actual decomposition, i.e. finding a solution for the tripod equations, not only provides complete forms for the parameters but is also helpful to identify further cases. The first task is to clarify up to which level of uniqueness the decomposition of a triplet distribution can be attained.

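Lemma 4. If a triplet distribution $p$ is decomposable with parameters $q^\zeta, M^\alpha, M^\beta, M^\gamma$ then it is also decomposable for root-flipped parameters $\tilde q^\zeta, \tilde M^\alpha, \tilde M^\beta, \tilde M^\gamma$ with $\tilde q^\zeta_z = q^\zeta_{1-z}$, $\tilde M^\alpha_{za} = M^\alpha_{(1-z)a}$, $\tilde M^\beta_{zb} = M^\beta_{(1-z)b}$, $\tilde M^\gamma_{zc} = M^\gamma_{(1-z)c}$.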

Hence, there will always be at least two sets of parameters which decompose a triplet distribution $p$. In terms of molecular evolution one can view these solutions as one having only few mutations ($M^\delta_{z(1-z)} < M^\delta_{zz}$, $\delta$ leaf) or many mutations ($M^\delta_{z(1-z)} > M^\delta_{zz}$, $\delta$ leaf). Chang (1996) addressed the problem of symmetric solutions by introducing matrix categories which are reconstructible from rows. One such class is the class of diagonally dominant matrices, i.e. $M^\delta_{zz} > M^\delta_{z(1-z)}$ for all leaves and $z \in \{0, 1\}$. If only these two sets of parameters exist then we will still regard the associated distribution as identifiable.

Next, we present conditions under which $p$ is algebraically identifiable and present the closed form for the parameters.

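Theorem 5. Let $p$ denote a triplet distribution and assume

\[
\tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma} \neq 0, \qquad \tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma} \neq -\left( \frac{\tau_{\alpha\beta\gamma}}{2} \right)^2. \tag{5}
\]

Then $p$ is algebraically identifiable. The associated parameters have the following form:

\[
\begin{aligned}
q^\zeta_1 &= \frac{1}{2} - \frac{\tau_{\alpha\beta\gamma}}{2\sqrt{\chi}}, \\
M^\alpha_{01} &= \varepsilon_\alpha + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\beta\gamma}}, &
M^\beta_{01} &= \varepsilon_\beta + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\alpha\gamma}}, &
M^\gamma_{01} &= \varepsilon_\gamma + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\alpha\beta}}, \\
M^\alpha_{11} &= \varepsilon_\alpha + \frac{\tau_{\alpha\beta\gamma} + \sqrt{\chi}}{2\tau_{\beta\gamma}}, &
M^\beta_{11} &= \varepsilon_\beta + \frac{\tau_{\alpha\beta\gamma} + \sqrt{\chi}}{2\tau_{\alpha\gamma}}, &
M^\gamma_{11} &= \varepsilon_\gamma + \frac{\tau_{\alpha\beta\gamma} + \sqrt{\chi}}{2\tau_{\alpha\beta}},
\end{aligned}
\tag{6}
\]

where $\chi = \tau_{\alpha\beta\gamma}^2 + 4 \tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma}$.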

Note that Pearl and Tarsi (1986) presented a similar solution for the parameters. Looking at the parameters in (6) we see that algebraically the conditions in (5) prevent division by zero. Together with the trivial invariant we can thus claim that the space of algebraically identifiable triplet distributions is given by $S - S_0 - S_1$ with

\[
\begin{aligned}
S &:= \{ p \in \mathbb{R}^8_+ : p_{000} + \cdots + p_{111} = 1 \}, \\
S_0 &:= \{ p \in S : \tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma} = 0 \}, \\
S_1 &:= \{ p \in S : \tau_{\alpha\beta\gamma}^2 + 4 \tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma} = 0 \}.
\end{aligned}
\]

Considering (5) and Lemma 2.2 we see that triplet distributions with $\tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma} < 0$ are only algebraically, but not tripod identifiable. In fact, for $-\tau_{\alpha\beta\gamma}^2 < 4 \tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma} < 0$ we get real-valued parameters, and for $4 \tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma} < -\tau_{\alpha\beta\gamma}^2$ we get a set of complex-valued parameters.
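The closed form (6) can be tried out as follows. This is a hedged sketch rather than a reference implementation: the names `stats`, `tripod` and `recover` and the test parameters are hypothetical, floating-point arithmetic is used (so the zero tests for (5) are only indicative), and `cmath.sqrt` is chosen so that a negative $\chi$ yields complex parameters instead of an error.

```python
import cmath


def stats(p):
    """Means, pairwise covariances, three-way covariance (order p000..p111)."""
    def mean(mask):
        return sum(v for i, v in enumerate(p) if (i & mask) == mask)

    ea, eb, ec = mean(4), mean(2), mean(1)
    tab, tac, tbc = mean(6) - ea * eb, mean(5) - ea * ec, mean(3) - eb * ec
    t3 = (mean(7) - ea * mean(3) - eb * mean(5) - ec * mean(6)
          + 2 * ea * eb * ec)
    return (ea, eb, ec), (tab, tac, tbc), t3


def tripod(q1, m0, m1):
    """Tripod equations (4); m0[k], m1[k] are Pr[leaf k = 1 | root = 0 or 1]."""
    p = []
    for i in range(8):
        w0, w1 = 1 - q1, q1
        for k, shift in enumerate((2, 1, 0)):
            s = (i >> shift) & 1
            w0 *= m0[k] if s else 1 - m0[k]
            w1 *= m1[k] if s else 1 - m1[k]
        p.append(w0 + w1)
    return p


def recover(p):
    """Parameters (q1, [M_01 per leaf], [M_11 per leaf]) following (6)."""
    (ea, eb, ec), (tab, tac, tbc), t3 = stats(p)
    prod3 = tab * tac * tbc
    chi = t3 ** 2 + 4 * prod3
    if prod3 == 0 or chi == 0:
        raise ValueError("conditions (5) are violated")
    root = cmath.sqrt(chi)  # complex square root, valid for negative chi too
    q1 = 0.5 - t3 / (2 * root)
    # The denominator for each leaf is the covariance of the other two leaves.
    m0 = [e + (t3 - root) / (2 * t) for e, t in ((ea, tbc), (eb, tac), (ec, tab))]
    m1 = [e + (t3 + root) / (2 * t) for e, t in ((ea, tbc), (eb, tac), (ec, tab))]
    return q1, m0, m1


# Hypothetical parameters with Pr[1 | root 1] > Pr[1 | root 0] at every leaf.
p = tripod(0.3, [0.1, 0.2, 0.25], [0.8, 0.7, 0.9])
q1, m0, m1 = recover(p)
print(q1, m0, m1)
```

Because of the root-flip symmetry of Lemma 4, the formulas return one of the two equivalent parameter sets; with the orientation chosen above it is the generating one. Results come back as Python complex numbers whose imaginary part vanishes whenever $\chi > 0$.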

The following example presents such distributions.

Example 2. Regard the distributions

\[
p_1 = (6, 7, 2, 1, 1, 1, 4, 5)/27, \qquad p_2 = (6, 7, 1, 2, 1, 1, 4, 5)/27, \qquad p_3 = (6, 6, 2, 2, 1, 1, 4, 5)/27.
\]

All three distributions satisfy the conditions (5), i.e. they are algebraically identifiable. For $p_1$ the covariance $\tau_{\beta\gamma}$ is negative and the other two are positive, while for $p_2$ we have $\tau_{\alpha\gamma}$ negative and the other two positive. The distribution $p_3$ has only positive pairwise covariances.

The parameters for $p_1$ are real-valued, the parameters for $p_2$ are complex-valued and $p_3$ is tripod identifiable.

Though this example is artificial it indicates just how sensitive the model is to misreads in alignments. E.g., the difference between $p_1$ and $p_2$ could be seen as reading a single pattern 011 instead of a pattern 010.

4.2. Tripod identifiable distributions

The next step is to determine conditions under which a distribution satisfying (5) is tripod identifiable. These conditions should correspond to the conditions given by Pearl and Tarsi (1986, Theorem 1).

Example 2 dealt with $\tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma} < 0$. However, as the following example shows, positivity of the product does not necessarily yield tripod identifiability.

Example 3. The tripod distribution

\[
p = (68, 0, 20, 12, 20, 12, 17, 51)/200
\]

yields positive covariances for all three pairs but also $M^\gamma_{01} = -1/20$, i.e. not a probability.

The example contains a pattern of expected zero occurrence. From the tripod equations we conclude that a tripod identifiable distribution is strictly positive, thus this example is somewhat contrived. However, as Example 1 showed, a strictly positive triplet distribution is not necessarily tripod identifiable either.

In order to get necessary and sufficient conditions on a triplet distribution to be tripod identifiable we need to go back to the parameters in (6) and bound them accordingly. This yields:

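Theorem 6. A triplet distribution $p$ is uniquely tripod identifiable if and only if after suitable state flips

\[
\begin{aligned}
\tau_{\alpha\beta} &> 0, & \tau_{\alpha\beta|0} &\geq 0, & \tau_{\alpha\beta|1} &\geq 0, \\
\tau_{\alpha\gamma} &> 0, & \tau_{\alpha\gamma|0} &\geq 0, & \tau_{\alpha\gamma|1} &\geq 0, \\
\tau_{\beta\gamma} &> 0, & \tau_{\beta\gamma|0} &\geq 0, & \tau_{\beta\gamma|1} &\geq 0.
\end{aligned}
\tag{7}
\]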

In other words, the direction of the correlation between a pair of leaves shall not be influenced by the third leaf. With this we can summarise that a triplet distribution is tripod identifiable if it is in $S - S_0 - S_1$ and there is a state flip such that (7) is satisfied.

Example 4. The tripod distribution $p$ from Example 3 has positive pairwise and conditional covariances except for $\tau_{\alpha\beta|1} = -9/2500$. Thus it does not satisfy (7).
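The inequalities (7) lend themselves to a direct check. The sketch below is one possible way to do this, not the authors' procedure: the function names and dictionary keys are hypothetical, and "suitable state flips" is handled by simply trying all eight flip combinations.

```python
from fractions import Fraction


def covariances(p):
    """Pairwise covariances and the conditional terms tau_{xy|s} of Section 3.1.

    p lists p000, p001, ..., p111; entry 4a + 2b + c is Pr[(a, b, c)].
    """
    def P(a, b, c):
        return p[4 * a + 2 * b + c]

    def mean(mask):
        return sum(v for i, v in enumerate(p) if (i & mask) == mask)

    tau = (mean(6) - mean(4) * mean(2),   # alpha, beta
           mean(5) - mean(4) * mean(1),   # alpha, gamma
           mean(3) - mean(2) * mean(1))   # beta, gamma
    cond = {}
    for s in (0, 1):
        cond["ab|%d" % s] = P(0, 0, s) * P(1, 1, s) - P(0, 1, s) * P(1, 0, s)
        cond["ac|%d" % s] = P(0, s, 0) * P(1, s, 1) - P(0, s, 1) * P(1, s, 0)
        cond["bc|%d" % s] = P(s, 0, 0) * P(s, 1, 1) - P(s, 0, 1) * P(s, 1, 0)
    return tau, cond


def satisfies_conditions_7(p):
    """True if some combination of state flips makes all of (7) hold."""
    for mask in range(8):
        flipped = [p[i ^ mask] for i in range(8)]
        tau, cond = covariances(flipped)
        if all(t > 0 for t in tau) and all(c >= 0 for c in cond.values()):
            return True
    return False


example3 = [Fraction(w, 200) for w in (68, 0, 20, 12, 20, 12, 17, 51)]
print(covariances(example3)[1]["ab|1"])
print(satisfies_conditions_7(example3))
```

Exact fractions are used so that boundary cases with a conditional term equal to zero are not misclassified by rounding.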

4.3. Non-identifiable cases

The above considerations dealt with cases where a given triplet distribution $p$ is identifiable. The final step of the tripod analysis is to regard those distributions which violate the conditions (5). Corollary 3 already discussed the case where one pairwise covariance is zero while the other two are not, and we found that such distributions are not decomposable. In the following we look at the remaining cases.

Proposition 7. Assume that a triplet distribution $p$ obeys $\tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma} = -(\tau_{\alpha\beta\gamma}/2)^2$ but $\tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma} \neq 0$. Then $p$ is not algebraically decomposable.

In other words, we found another set of triplet distributions which are not decomposable.

Example 5. The distribution

\[
p = (16, 5, 8, 15, 14, 5, 2, 15)/80
\]

yields $\tau_{\alpha\beta} = -1/80$, $\tau_{\alpha\gamma} = 1/40$ and $\tau_{\beta\gamma} = 1/8$ but $\chi = 0$ and hence has no factorization in the sense of (4).

This case is particularly disturbing because here all taxa appear to be correlated and yet no structure can be found to explain the correlation.
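The case distinctions made so far depend only on the signs of $\tau_{\alpha\beta} \tau_{\alpha\gamma} \tau_{\beta\gamma}$ and $\chi$, so they can be tabulated mechanically. The following sketch does this with exact rational arithmetic for the distributions of Examples 2 and 5; the function names and the returned labels are illustrative choices, and the classification covers conditions (5) only, not the inequalities (7).

```python
from fractions import Fraction


def stats(p):
    """Pairwise covariances and three-way covariance (order p000..p111)."""
    def mean(mask):
        return sum(v for i, v in enumerate(p) if (i & mask) == mask)

    ea, eb, ec = mean(4), mean(2), mean(1)
    tab, tac, tbc = mean(6) - ea * eb, mean(5) - ea * ec, mean(3) - eb * ec
    t3 = (mean(7) - ea * mean(3) - eb * mean(5) - ec * mean(6)
          + 2 * ea * eb * ec)
    return (tab, tac, tbc), t3


def classify(p):
    """Label p according to conditions (5) and the sign of chi."""
    (tab, tac, tbc), t3 = stats(p)
    prod3 = tab * tac * tbc
    chi = t3 ** 2 + 4 * prod3
    if prod3 == 0:
        return "pairwise covariance vanishes"
    if chi == 0:
        return "chi vanishes"
    if chi < 0:
        return "complex parameters"
    if prod3 < 0:
        return "real parameters, negative product"
    return "real parameters, positive product"


examples = {
    "p1": (27, (6, 7, 2, 1, 1, 1, 4, 5)),
    "p2": (27, (6, 7, 1, 2, 1, 1, 4, 5)),
    "p3": (27, (6, 6, 2, 2, 1, 1, 4, 5)),
    "example 5": (80, (16, 5, 8, 15, 14, 5, 2, 15)),
}
labels = {name: classify([Fraction(w, n) for w in weights])
          for name, (n, weights) in examples.items()}
for name, label in labels.items():
    print(name, "->", label)
```

Working with `Fraction` matters here: for the distribution of Example 5 the quantity $\chi$ is exactly zero, which a floating-point computation could easily miss.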

Together with Corollary 3 this covers the non-decomposable distributions. The remaining cases are triplet distributions which are decomposable but not identifiable.

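Proposition 8. Let $p$ be a triplet distribution with $\tau_{\alpha\beta} = 0$ and $\tau_{\alpha\gamma} = 0$. Then $p$ is decomposable with infinitely many parameter sets.

The parameter sets are identified by one of the following compositions:

(i) $\tau_{\beta\gamma} \neq 0$. Then $M^\alpha_{0a} = M^\alpha_{1a} = p_{a\Sigma\Sigma}$, $a \in \{0, 1\}$, and for any $u, b, c \in \{0, 1\}$:

\[
q^\zeta_1 = \frac{p_{\Sigma\Sigma c} - M^\gamma_{0c}}{M^\gamma_{1c} - M^\gamma_{0c}}, \qquad
M^\beta_{ub} = \frac{p_{\Sigma bc} - p_{\Sigma b\Sigma} M^\gamma_{(1-u)c}}{p_{\Sigma\Sigma c} - M^\gamma_{(1-u)c}} \tag{8}
\]

with free parameters $M^\gamma_{0c} \neq M^\gamma_{1c}$.

(ii) $\tau_{\beta\gamma} = 0$. Then for all $a, b, c \in \{0, 1\}$ the free parameters can be distributed as follows:

(a) $M^\alpha_{0a} = M^\alpha_{1a} = p_{a\Sigma\Sigma}$, $M^\beta_{0b} = M^\beta_{1b} = p_{\Sigma b\Sigma}$ and

\[
q^\zeta_1 = \frac{p_{\Sigma\Sigma c} - M^\gamma_{0c}}{M^\gamma_{1c} - M^\gamma_{0c}}, \tag{9}
\]

with free parameters $M^\gamma_{0c} \neq M^\gamma_{1c}$.

(b) $M^\alpha_{0a} = M^\alpha_{1a} = p_{a\Sigma\Sigma}$, $M^\beta_{0b} = M^\beta_{1b} = p_{\Sigma b\Sigma}$, $M^\gamma_{0c} = M^\gamma_{1c} = p_{\Sigma\Sigma c}$ with free parameter $q^\zeta_1$.

(c) $q^\zeta_1 = 0$, $M^\alpha_{0a} = p_{a\Sigma\Sigma}$, $M^\beta_{0b} = p_{\Sigma b\Sigma}$, $M^\gamma_{0c} = p_{\Sigma\Sigma c}$ with free parameters $M^\alpha_{1a}$, $M^\beta_{1b}$, $M^\gamma_{1c}$.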

In other words, if we observe such a non-identifiable case we have no means to recover the true parameters of the tripod decomposition.

Example 6. The triplet distribution

\[
p = (2, 2, 2, 2, 2, 2, 2, 2)/16
\]

yields complete independence of the leaves, $\tau_{\alpha\beta} = \tau_{\alpha\gamma} = \tau_{\beta\gamma} = 0$, i.e. the case (ii) in Proposition 8 is to be regarded here. It is not too surprising that such a distribution yields an infinite number of solutions since the state at the root is completely undetermined.

Looking again at the cases listed above, we see that $X_\alpha$ is not only pairwise independent from $(X_\beta, X_\gamma)$ (induced by $\tau_{\alpha\beta} = \tau_{\alpha\gamma} = 0$), but even completely independent. Then the multiple solutions come from the fact that we can place the root arbitrarily between $\beta$ and $\gamma$.
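The non-uniqueness for the uniform distribution of Example 6 can be seen directly by plugging parameter sets of the forms (ii)(b) and (ii)(c) of Proposition 8 into the tripod equations (4). The sketch below does so; the helper name `tripod` and the particular free values are arbitrary illustrative choices.

```python
from fractions import Fraction


def tripod(q1, m0, m1):
    """Tripod equations (4); m0[k], m1[k] are Pr[leaf k = 1 | root = 0 or 1].

    The result lists p000, p001, ..., p111.
    """
    p = []
    for i in range(8):
        w0, w1 = 1 - q1, q1
        for k, shift in enumerate((2, 1, 0)):
            s = (i >> shift) & 1
            w0 *= m0[k] if s else 1 - m0[k]
            w1 *= m1[k] if s else 1 - m1[k]
        p.append(w0 + w1)
    return p


uniform = [Fraction(1, 8)] * 8
half = [Fraction(1, 2)] * 3

# Case (ii)(b): both rows of every matrix equal the marginals, q1 is free.
case_b = [tripod(q1, half, half)
          for q1 in (Fraction(0), Fraction(1, 3), Fraction(9, 10))]

# Case (ii)(c): q1 = 0, so the rows for root state 1 are free.
free_rows = [Fraction(1, 10), Fraction(3, 4), Fraction(2, 5)]
case_c = tripod(Fraction(0), half, free_rows)

print(all(p == uniform for p in case_b), case_c == uniform)
```

Every parameter set above reproduces the same uniform leaf distribution, so the observations alone cannot single out one of them.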

The good news is that the non-identifiable cases form a small subset among all triplet distributions. In fact:

Proposition 9. Non-identifiable triplet distributions, i.e. distributions violating the conditions (5), form a Lebesgue zero set in the set of all possible triplet distributions.

This concludes our analysis of the tripod case. We identified the subset of triplet distributions which are uniquely algebraically and tripod identifiable, and those which are decomposable but not identifiable, or not decomposable at all.

5. Extension to quartet trees

In this section we explore the implications of extending the results for three taxa to four taxa. We look at the quartet tree $Q = (V, E)$ with

\[
V = \{\zeta, \psi, \alpha, \beta, \gamma, \delta\}, \qquad E = \{(\zeta, \psi), (\zeta, \alpha), (\zeta, \beta), (\psi, \gamma), (\psi, \delta)\}.
\]

Fig. A.3 provides an illustration including the four tripod restrictions $T_{\alpha\beta\gamma}$, $T_{\alpha\beta\delta}$, $T_{\alpha\gamma\delta}$ and $T_{\beta\gamma\delta}$. The two alternative quartet topologies can be described by leaf switches. E.g., the topology which groups $\alpha, \gamma$ against $\beta, \delta$ is retrieved from the above conventions by switching $\beta$ and $\gamma$. Regard the quartet distribution $\pi = (\pi_{abcd})_{a,b,c,d \in \{0,1\}}$ describing the joint distribution for $\alpha$, $\beta$, $\gamma$ and $\delta$. If $\pi$ is identifiable and reversible then it can be reconstructed from the restrictions on these four tripods (Chang, 1996), i.e. computing the parameters for all tripods will immediately return the full process. However, the converse is not necessarily true. As Example 7 below shows, there are cases where each tripod restriction is identifiable but no quartet tree can be reconstructed.

[Figure 3 about here.]

Pearl and Tarsi (1986) presented an algorithm to reconstruct the topology for an arbitrary number of taxa. Their algorithm employs the condition that tripods which share an interior node in the (unknown) tree topology must result in the same marginal distribution at this interior node. Their approach yields an invariant, which for $Q$ amounts to

\[
f_1(\pi) = \tau_{\alpha\delta} \tau_{\beta\gamma} - \tau_{\alpha\gamma} \tau_{\beta\delta}. \tag{10}
\]

This invariant is related to the four-point condition (e.g., Semple and Steel, 2003, p. 146) and thus topologically informative, i.e. it is particular to topology $Q$. If a distribution $\pi$ is from another tree then $f_1(\pi) \neq 0$.
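As an illustration of the invariant (10), the sketch below builds a quartet distribution from the two-state Markov model on $Q$ and evaluates $f_1$ for the original leaf labelling and for the labelling with $\beta$ and $\gamma$ switched. All function names and parameter values are hypothetical and serve only this example; exact fractions are used so that a vanishing invariant shows up as an exact zero.

```python
from fractions import Fraction
from itertools import product


def transition(m01, m11):
    """2x2 row-stochastic matrix built from Pr[1 | parent 0] and Pr[1 | parent 1]."""
    return ((1 - m01, m01), (1 - m11, m11))


def quartet(q1, Ma, Mb, Mpsi, Mc, Md):
    """pi[(a, b, c, d)] on Q: root zeta has children alpha, beta and psi;
    psi has children gamma and delta."""
    q = (1 - q1, q1)
    return {
        (a, b, c, d): sum(q[z] * Ma[z][a] * Mb[z][b] * Mpsi[z][y]
                          * Mc[y][c] * Md[y][d]
                          for z in (0, 1) for y in (0, 1))
        for a, b, c, d in product((0, 1), repeat=4)
    }


def cov(pi, i, j):
    """Covariance of the binary leaves at positions i and j."""
    ei = sum(v for s, v in pi.items() if s[i])
    ej = sum(v for s, v in pi.items() if s[j])
    eij = sum(v for s, v in pi.items() if s[i] and s[j])
    return eij - ei * ej


def f1(pi, order=(0, 1, 2, 3)):
    """Invariant (10); order gives the positions used as alpha, beta, gamma, delta."""
    a, b, c, d = order
    return cov(pi, a, d) * cov(pi, b, c) - cov(pi, a, c) * cov(pi, b, d)


F = Fraction
pi = quartet(F(2, 5), transition(F(1, 10), F(4, 5)), transition(F(1, 5), F(7, 10)),
             transition(F(3, 20), F(9, 10)), transition(F(1, 4), F(4, 5)),
             transition(F(1, 10), F(3, 5)))
print(f1(pi))                # labelling of Q
print(f1(pi, (0, 2, 1, 3)))  # beta and gamma switched
```

For a distribution generated on $Q$ the first value is zero, while the switched labelling gives a non-zero value, in line with the statement that $f_1$ is particular to the topology.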

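To reconstruct the process parameters as well, more invariants are needed. In particular, for $\pi$ to be identifiable on $Q$ the parameters obtained from the tripod restrictions must satisfy the following properties:

1. The parameters for edges $(\zeta, \alpha)$, $(\zeta, \beta)$ and $q^\zeta$ obtained from the triplet distributions on $T_{\alpha\beta\gamma}$ and $T_{\alpha\beta\delta}$, respectively, must be equal.
2. The parameters for edges $(\psi, \gamma)$, $(\psi, \delta)$ and $q^\psi$ obtained from the triplet distributions on $T_{\alpha\gamma\delta}$ and $T_{\beta\gamma\delta}$, respectively, must be equal.
3. The parameters $M^\psi$ for the interior edge $(\zeta, \psi)$ are obtained from the equations

\[
\begin{aligned}
M^{\zeta\gamma}_{01} &= (1 - M^\psi_{01}) M^\gamma_{01} + M^\psi_{01} M^\gamma_{11}, \\
M^{\zeta\gamma}_{11} &= (1 - M^\psi_{11}) M^\gamma_{01} + M^\psi_{11} M^\gamma_{11},
\end{aligned}
\tag{11}
\]

where $M^{\zeta\gamma}$ denotes the transition matrix towards $\gamma$ obtained from the tripod $T_{\alpha\beta\gamma}$. These equations must hold equivalently when $\gamma$ is replaced by $\delta$ and the parameters come from tripod $T_{\alpha\beta\delta}$ instead of $T_{\alpha\beta\gamma}$.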

These conditions imply further restrictions on $\pi$. An indicator for the minimal number of such conditions is the observation that a quartet distribution $\pi$ has 15 degrees of freedom, but there are only 11 model parameters on $Q$, two for each edge and one for the root distribution. Thus we need at least four additional conditions or rather invariants. We will use the above observations to derive an equivalent set of invariants.

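Proposition 10. A quartet distribution $\pi$ is algebraically identifiable on $Q$ if it satisfies conditions (5) and the following invariants vanish in $\pi$:

\[
\begin{aligned}
f_0(\pi) &= \varepsilon_{\alpha\beta\gamma\delta} \tau_{\alpha\gamma} - \varepsilon_{\alpha\beta\gamma} \varepsilon_{\alpha\gamma\delta} + \varepsilon_\gamma \varepsilon_{\alpha\beta} \varepsilon_{\alpha\gamma\delta} + \varepsilon_\alpha \varepsilon_{\gamma\delta} \varepsilon_{\alpha\beta\gamma} - \varepsilon_{\alpha\beta} \varepsilon_{\alpha\gamma} \varepsilon_{\gamma\delta}, \\
f_1(\pi) &= \tau_{\alpha\delta} \tau_{\beta\gamma} - \tau_{\alpha\gamma} \tau_{\beta\delta}, \\
f_2(\pi) &= \tau_{\alpha\gamma} \tau_{\beta\gamma\delta} - \tau_{\beta\gamma} \tau_{\alpha\gamma\delta}, \\
f_3(\pi) &= \tau_{\alpha\gamma} \tau_{\alpha\beta\delta} - \tau_{\alpha\delta} \tau_{\alpha\beta\gamma}.
\end{aligned}
\]

The parameters, unique up to state flip at the interior nodes, are then given by Theorem 5 and

\[
\begin{aligned}
M^\psi_{01} &= \frac{1}{2} + \frac{\tau_{\alpha\delta} \tau_{\alpha\beta\gamma} - \tau_{\alpha\beta} \tau_{\alpha\gamma\delta} - \tau_{\alpha\delta} \sqrt{\chi_{\alpha\beta\gamma}}}{2 \tau_{\alpha\beta} \sqrt{\chi_{\alpha\gamma\delta}}}, \\
M^\psi_{11} &= \frac{1}{2} + \frac{\tau_{\alpha\delta} \tau_{\alpha\beta\gamma} - \tau_{\alpha\beta} \tau_{\alpha\gamma\delta} + \tau_{\alpha\delta} \sqrt{\chi_{\alpha\beta\gamma}}}{2 \tau_{\alpha\beta} \sqrt{\chi_{\alpha\gamma\delta}}}.
\end{aligned}
\tag{12}
\]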

The existence of these invariants means that decomposable quartet distributions form a Lebesgue zero set in the set of all quartet distributions for the same reason that the non-identifiable sets are a Lebesgue zero set in the set of all decomposable distributions.

Invariant f1 comes from the equality of the marginal distributions at the interior340

nodes, as proposed by Pearl and Tarsi (1986). Invariants f2 and f3 come from the341

equality of edge transition matrices. Hence, distributions for which f1, f2 and f3 vanish342

will uniquely identify topology Q. Therefore, f1 − f3 are topologically informative.343

However, only distributions for which f0 vanishes will be subject to the inferred344

parameters. In other words, in the set of zero points for f1 − f3 there is a set of345

distributions which returns the same set of parameters for Q, but only for one of these346

distributions f0 vanishes. It would be interesting to investigate how this distribution347

relates to the set it projects from, e.g. if it is related to the possible maximum348

likelihood optimum.349

Although f1, f2 and f3 are sufficient to infer a topology, f0 is also topologically informative in that it will not vanish for distributions coming from another tree.

In the case of the CFN model, all triplet covariances vanish. Hence, only invariants f0 and f1 are of interest in that case. Therefore, either invariant is sufficient to identify the associated tree topology.

The parameters for the interior edge do not add more non-identifiable cases. However, as in the tripod case, further conditions are needed to guarantee quartet identifiability. Hence we get:

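Proposition 11. A quartet distribution is quartet identifiable if and only if every triplet restriction satisfies both Theorem 6 and the following inequalities

$$
\tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}} - \tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}}
\;\le\; \tau_{\alpha\beta}\tau_{\alpha\gamma\delta} - \tau_{\alpha\delta}\tau_{\alpha\beta\gamma}
\;\le\; \tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}} - \tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}}.
\tag{13}
$$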

All other relations are covered because the quartet distribution p needs to satisfy the invariants f0, f1, f2 and f3. The following example provides a case in which reconstruction is not possible but which offers an interesting challenge.

Example 7. Chor et al. (2000) provided several examples of distributions with multiple maxima of the likelihood function. These examples relate to the CFN model, i.e., $p_{abcd} = p_{(1-a)(1-b)(1-c)(1-d)}$, so that the Hadamard approach can be used. Consider the symmetric distribution

$$
p = (14, 0, 0, 3, 0, 2, 1, 0, 0, 1, 2, 0, 3, 0, 0, 14)/40. \tag{14}
$$

Retrieving the statistics yields:

\begin{align*}
\tau_{\alpha\beta} &= 7/40 = \tau_{\gamma\delta}, & \tau_{\alpha\gamma} &= 3/20 = \tau_{\beta\delta}, & \tau_{\alpha\delta} &= 1/8 = \tau_{\beta\gamma},\\
\tau_{\alpha\beta\gamma} &= \tau_{\alpha\beta\delta} = \tau_{\alpha\gamma\delta} = \tau_{\beta\gamma\delta} = 0.
\end{align*}

The last equality immediately shows that the above distribution trivially satisfies invariants f2 and f3. However, we get f1 = −11/1600 and f0 = −23/375, i.e. our observations do not come from the quartet tree defined by the bipartition αβ|γδ. Looking at the alternative invariants for f1, i.e. at

\begin{align*}
f_1^{\alpha\delta|\beta\gamma} &= \tau_{\alpha\beta}\tau_{\gamma\delta} - \tau_{\alpha\gamma}\tau_{\beta\delta} = 13/1600,\\
f_1^{\alpha\gamma|\beta\delta} &= \tau_{\alpha\beta}\tau_{\gamma\delta} - \tau_{\alpha\delta}\tau_{\beta\gamma} = 3/200,
\end{align*}

we see that this distribution comes from none of the available quartet trees.
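The covariances and the three variants of f1 quoted in this example can be recomputed with exact rational arithmetic. The following is a minimal sketch, not the authors' code. It assumes that the entries of (14) are listed in lexicographic order of the states (a, b, c, d), and that the triplet covariance is the third central moment, which is consistent with the quadratic equation in the proof of Theorem 5. The helper names `eps`, `tau2` and `tau3` are introduced here for illustration only.

```python
from fractions import Fraction
from itertools import product

# Distribution (14); assumed order: states (a, b, c, d) in lexicographic order.
weights = [14, 0, 0, 3, 0, 2, 1, 0, 0, 1, 2, 0, 3, 0, 0, 14]
p = {s: Fraction(w, 40) for s, w in zip(product((0, 1), repeat=4), weights)}

def eps(*leaves):
    """Probability that all listed leaves (0 = alpha, ..., 3 = delta) show state 1."""
    return sum(pr for s, pr in p.items() if all(s[i] == 1 for i in leaves))

def tau2(i, j):
    """Pairwise covariance of the state-1 indicators of leaves i and j."""
    return eps(i, j) - eps(i) * eps(j)

def tau3(i, j, k):
    """Third central moment of the indicators (assumed definition of the triplet covariance)."""
    return (eps(i, j, k) - eps(i) * eps(j, k) - eps(j) * eps(i, k)
            - eps(k) * eps(i, j) + 2 * eps(i) * eps(j) * eps(k))

A, B, C, D = range(4)
f1 = tau2(A, D) * tau2(B, C) - tau2(A, C) * tau2(B, D)        # split alpha beta | gamma delta
f1_ad_bc = tau2(A, B) * tau2(C, D) - tau2(A, C) * tau2(B, D)  # split alpha delta | beta gamma
f1_ac_bd = tau2(A, B) * tau2(C, D) - tau2(A, D) * tau2(B, C)  # split alpha gamma | beta delta
triples = [tau3(A, B, C), tau3(A, B, D), tau3(A, C, D), tau3(B, C, D)]

print(f1, f1_ad_bc, f1_ac_bd, triples)
```

Using `Fraction` avoids rounding, so the printed values can be compared directly with the fractions stated in the example.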

Nevertheless, we shall have a look at the parameters. Note that the symmetry of the distribution p implies $M^{\alpha}_{01} = 1 - M^{\alpha}_{11} =: M^{\alpha}$. Looking at the numerical values of the parameters for every tripod tree, we find surprising similarities:

[Table 1 about here.]

These parameters permit us to infer parameters $M^{\zeta} = 1/14$ and $M^{\psi} = 1/7$ such that, e.g., the parameters for α on the tripod trees αβδ and αγδ can be obtained from the parameter for tripod tree αβγ by

$$
M^{\alpha}_{(\alpha\beta\delta)} = M^{\zeta}\bigl(1 - M^{\alpha}_{(\alpha\beta\gamma)}\bigr) + (1 - M^{\zeta})M^{\alpha}_{(\alpha\beta\gamma)},
\qquad
M^{\alpha}_{(\alpha\gamma\delta)} = M^{\psi}\bigl(1 - M^{\alpha}_{(\alpha\beta\gamma)}\bigr) + (1 - M^{\psi})M^{\alpha}_{(\alpha\beta\gamma)},
$$

where the bracketed subscript names the tripod tree from which the parameter is taken, with analogous assignments for the other leaves. These computations can be visualized by the network in Fig. A.4. The assignment of probabilities to each split makes it possible to account for the observations on each of the four tripod trees. However, the visualization is misleading because the factorization of the system does not follow the edges in the network (e.g., Strimmer et al., 2001; Bryant, 2005).

[Figure 4 about here.]

Acknowledgements. We thank Elizabeth S. Allman and John A. Rhodes for stimulating the finalization of the manuscript as well as for sharing their thoughts on this subject with us. Further, we owe much to the discussions with David Bryant, Mike Steel, and Arndt von Haeseler.

References

Elizabeth S. Allman and John A. Rhodes. Phylogenetic invariants for the general Markov model of sequence mutation. Mathematical Biosciences, 186(2):113–144, December 2003. URL http://dx.doi.org/10.1016/j.mbs.2003.08.004.

Elizabeth S. Allman and John A. Rhodes. Phylogenetic ideals and varieties for the general Markov model. Advances in Applied Mathematics, 40(2):127–148, 2008. URL http://dx.doi.org/10.1016/j.aam.2006.10.002.

Elizabeth S. Allman, Catherine Matias, and John A. Rhodes. Identifiability of parameters in latent structure models with many observed variables. The Annals of Statistics, 37(6A):3099–3132, 2009. URL http://dx.doi.org/10.1214/09-AOS689.

Ellen Baake. What can and what cannot be inferred from pairwise sequence comparisons? Mathematical Biosciences, 154(1):1–21, 1998. URL http://dx.doi.org/10.1016/S0025-5564(98)10044-5.

David Bryant. Extending tree models to splits networks. In Algebraic Statistics for Computational Biology, chapter 17, pages 320–332. Cambridge University Press, 2005. URL http://ebooks.cambridge.org/chapter.jsf?bid=CBO9780511610684&cid=CBO9780511610684A097.

David Bryant, Nicolas Galtier, and Marie-Anne Poursat. Likelihood calculation in molecular phylogenetics. In Olivier Gascuel, editor, Mathematics of Evolution and Phylogeny, chapter 2, pages 33–62. Oxford University Press, 2005.

James A. Cavender. Taxonomy with confidence. Mathematical Biosciences, 40:271–280, 1978. URL http://dx.doi.org/10.1016/0025-5564(78)90089-5.

James A. Cavender and Joseph Felsenstein. Invariants of phylogenies in a simple case with discrete states. Journal of Classification, 4:57–71, 1987. URL http://hdl.handle.net/10.1007/BF01890075.

Joseph T. Chang. Full reconstruction of Markov models on evolutionary trees: Identifiability and consistency. Mathematical Biosciences, 137:51–73, 1996. URL http://dx.doi.org/10.1016/S0025-5564(96)00075-2.

Benny Chor, Michael D. Hendy, Barbara R. Holland, and David Penny. Multiple maxima of likelihood in phylogenetic trees: An analytic approach. Molecular Biology and Evolution, 17(10):1529–1541, 2000. URL http://mbe.oxfordjournals.org/content/17/10/1529.full.

James S. Farris. A probability model for inferring evolutionary trees. Systematic Zoology, 22(3):250–256, 1973. URL http://www.jstor.org/stable/2412305.

Michael D. Hendy and David Penny. A framework for the quantitative study of evolutionary trees. Systematic Zoology, 38(4):297–309, 1989. URL http://dx.doi.org/10.2307/2992396.

Morris W. Hirsch. Differential Topology. Graduate Texts in Mathematics. Springer, New York, Heidelberg, Berlin, 1976. ISBN 0387901485.

James A. Lake. A rate-independent technique for analysis of nucleic acid sequences: evolutionary parsimony. Molecular Biology and Evolution, 4(2):167–191, 1987. URL http://mbe.oxfordjournals.org/content/4/2/167.abstract.

Steffen L. Lauritzen. Graphical Models. Oxford Statistical Science Series. Clarendon Press, Oxford, 1996. ISBN 0-19-852219-3. URL http://ukcatalogue.oup.com/product/9780198522195.do.

Paul F. Lazarsfeld and Neil W. Henry. Latent Structure Analysis. Houghton Mifflin, New York, 1968.

Frederick A. Matsen. Fourier transform inequalities for phylogenetic trees. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 6(1):89–95, 2009. URL http://dx.doi.org/10.1109/TCBB.2008.68.

Jerzy Neyman. Molecular studies of evolution: A source of novel statistical problems. In S. dasGupta and J. Yackel, editors, Statistical Decision Theory and Related Topics, pages 1–27. Academic Press, New York, 1971.

Judea Pearl and Michael Tarsi. Structuring causal trees. Journal of Complexity, 2:60–77, 1986. URL http://dx.doi.org/10.1016/0885-064X(86)90023-3.

J. C. W. Rayner and Eric J. Beh. Towards a better understanding of correlation. Statistica Neerlandica, 63(3):324–333, 2009. URL http://dx.doi.org/10.1111/j.1467-9574.2009.00425.x.

Charles Semple and Mike Steel. Phylogenetics. Oxford Lecture Series in Mathematics and its Applications. Oxford University Press, 2003. ISBN 0-19-850942-1.

Korbinian Strimmer, Carsten Wiuf, and Vincent Moulton. Recombination analysis using directed graphical models. Molecular Biology and Evolution, 18(1):97–99, 2001. URL http://mbe.oxfordjournals.org/content/18/1/97.full.

Bernd Sturmfels and Seth Sullivant. Toric ideals of phylogenetic invariants. Journal of Computational Biology, 12(4):457–481, 2005. URL http://dx.doi.org/10.1089/cmb.2005.12.457.

Jeremy G. Sumner, Michael A. Charleston, Lars S. Jermiin, and Peter D. Jarvis. Markov invariants, plethysms, and phylogenetics. Journal of Theoretical Biology, 253:601–615, 2008. URL http://dx.doi.org/10.1016/j.jtbi.2008.04.001.

Laszlo Szekely, Mike A. Steel, and Peter L. Erdos. Fourier calculus on evolutionary trees. Advances in Applied Mathematics, 14(2):200–210, 1993.

Piotr Zwiernik and Jim Q. Smith. Tree-cumulants and the identifiability of Bayesian tree model. arXiv:1004.4360v1, 2010. URL http://arxiv.org/abs/1004.4360.

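Appendix A. Proofs

Proof of Lemma 1. A state flip at leaf α replaces the distribution p by a "new" distribution $\tilde p$ with $\tilde p_{abc} = p_{(1-a)bc}$, $a, b, c \in \{0, 1\}$. This has the following implications for the covariances:

\begin{align*}
\tau_{\alpha\beta} &= p_{11\Sigma} - p_{1\Sigma\Sigma}\,p_{\Sigma 1\Sigma} = (p_{\Sigma 1\Sigma} - p_{01\Sigma}) - p_{1\Sigma\Sigma}\,p_{\Sigma 1\Sigma}\\
&= -p_{01\Sigma} + p_{\Sigma 1\Sigma}(1 - p_{1\Sigma\Sigma}) = -(p_{01\Sigma} - p_{0\Sigma\Sigma}\,p_{\Sigma 1\Sigma})\\
&= -(\tilde p_{11\Sigma} - \tilde p_{1\Sigma\Sigma}\,\tilde p_{\Sigma 1\Sigma}) = -\tilde\tau_{\alpha\beta},
\end{align*}

and analogously $\tilde\tau_{\alpha\gamma} = -\tau_{\alpha\gamma}$ and $\tilde\tau_{\beta\gamma} = \tau_{\beta\gamma}$. Thus, if $\tau_{\alpha\beta}$ and $\tau_{\alpha\gamma}$ are smaller than zero, then a state flip produces positive covariances and the sign of the overall product remains the same.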

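Proof of Lemma 2. With the tripod equations we find the following dependencies between covariances and model parameters:

\begin{align}
\tau_{\alpha\beta\gamma} &= (M^{\alpha}_{11} - M^{\alpha}_{01})(M^{\beta}_{11} - M^{\beta}_{01})(M^{\gamma}_{11} - M^{\gamma}_{01})\,q^{\zeta}_{1}(1 - q^{\zeta}_{1})(1 - 2q^{\zeta}_{1}), \tag{A.1}\\
\tau_{\alpha\beta} &= (M^{\alpha}_{11} - M^{\alpha}_{01})(M^{\beta}_{11} - M^{\beta}_{01})\,q^{\zeta}_{1}(1 - q^{\zeta}_{1}), \tag{A.2}\\
\tau_{\alpha\beta|c} &= \frac{(M^{\alpha}_{11} - M^{\alpha}_{01})(M^{\beta}_{11} - M^{\beta}_{01})\,M^{\gamma}_{0c}M^{\gamma}_{1c}\,q^{\zeta}_{1}(1 - q^{\zeta}_{1})}{\bigl((1 - q^{\zeta}_{1})M^{\gamma}_{0c} + q^{\zeta}_{1}M^{\gamma}_{1c}\bigr)^{2}}, \quad c \in \{0, 1\}, \tag{A.3}
\end{align}

with equivalent terms for the other covariances. Therefore, if $\tau_{\alpha\beta} = 0$ due to $M^{\alpha}_{01} - M^{\alpha}_{11} = 0$, then also $\tau_{\alpha\gamma} = 0$ and $\tau_{\alpha\beta\gamma} = 0$. If $q^{\zeta}_{1} \in \{0, 1\}$ then all four covariances are zero.

For point 2 simply observe that all parameters need to be probabilities. Thus, if, e.g., $M^{\alpha}_{01} - M^{\alpha}_{11} < 0$ but $M^{\beta}_{01} - M^{\beta}_{11} > 0$ and $M^{\gamma}_{01} - M^{\gamma}_{11} > 0$, then $\tau_{\alpha\beta} < 0$, $\tau_{\alpha\gamma} < 0$, and $\tau_{\beta\gamma} > 0$, and hence $\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} > 0$. All other cases are similar, and thus we are finished.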
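Relations (A.1) and (A.2), and the sign statement of point 2, can be checked numerically on a concrete tripod. The sketch below is illustrative only: the parameter values are arbitrary choices made here, `M[leaf] = (M01, M11)` is a hypothetical encoding with $M_{ij}$ read as the probability of leaf state j given root state i, and the triplet covariance is taken to be the third central moment.

```python
from itertools import product
from math import isclose

def tripod(q1, M):
    """Leaf distribution p[(a, b, c)] of a tripod with root probability q1 = P(root = 1).

    M[leaf] = (M01, M11), where Mij = P(leaf shows j | root state i).
    """
    root = (1 - q1, q1)

    def emit(leaf, i, s):
        m = M[leaf][i]              # P(leaf shows 1 | root state i)
        return m if s == 1 else 1 - m

    return {s: sum(root[i] * emit(0, i, s[0]) * emit(1, i, s[1]) * emit(2, i, s[2])
                   for i in (0, 1))
            for s in product((0, 1), repeat=3)}

def eps(p, *leaves):
    """Probability that all listed leaves show state 1."""
    return sum(pr for s, pr in p.items() if all(s[k] == 1 for k in leaves))

def tau2(p, i, j):
    return eps(p, i, j) - eps(p, i) * eps(p, j)

def tau3(p):
    e = lambda *l: eps(p, *l)
    return (e(0, 1, 2) - e(0) * e(1, 2) - e(1) * e(0, 2) - e(2) * e(0, 1)
            + 2 * e(0) * e(1) * e(2))

q1 = 0.3
M = [(0.1, 0.8), (0.2, 0.7), (0.6, 0.25)]    # third leaf has M11 < M01
p = tripod(q1, M)
d = [m11 - m01 for m01, m11 in M]             # differences M11 - M01 per leaf
scale = q1 * (1 - q1)

check_A2 = isclose(tau2(p, 0, 1), d[0] * d[1] * scale)
check_A1 = isclose(tau3(p), d[0] * d[1] * d[2] * scale * (1 - 2 * q1))
sign_product = tau2(p, 0, 1) * tau2(p, 0, 2) * tau2(p, 1, 2)
print(check_A1, check_A2, sign_product > 0)
```

With one leaf having $M_{11} < M_{01}$, two of the three pairwise covariances are negative, yet their product stays positive, as the proof states.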

Proof of Corollary 3. A triplet distribution p for which only one covariance is zero does not satisfy Lemma 2(1) and hence is not tripod decomposable. Further, by looking at (A.2) we see that there is also no real- or complex-valued parameter set which would yield only one zero covariance. Hence, such a triplet distribution would also not be algebraically decomposable.

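Proof of Lemma 4. We insert the flipped parameters, marked by a tilde, into the tripod equations to get:

\begin{align*}
\tilde p_{abc} &= \tilde q^{\zeta}_{1}\tilde M^{\alpha}_{1a}\tilde M^{\beta}_{1b}\tilde M^{\gamma}_{1c} + (1 - \tilde q^{\zeta}_{1})\tilde M^{\alpha}_{0a}\tilde M^{\beta}_{0b}\tilde M^{\gamma}_{0c}\\
&= q^{\zeta}_{0}M^{\alpha}_{0a}M^{\beta}_{0b}M^{\gamma}_{0c} + (1 - q^{\zeta}_{0})M^{\alpha}_{1a}M^{\beta}_{1b}M^{\gamma}_{1c}\\
&= (1 - q^{\zeta}_{1})M^{\alpha}_{0a}M^{\beta}_{0b}M^{\gamma}_{0c} + q^{\zeta}_{1}M^{\alpha}_{1a}M^{\beta}_{1b}M^{\gamma}_{1c},
\end{align*}

i.e. the tripod equations are recovered with flipped parameters. This completes the proof.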

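Proof of Theorem 5. We derive the parameters from the triplet equations. The first step is to rewrite the equations such that we get

\begin{align}
\varepsilon_{\alpha\beta\gamma} &= q^{\zeta}_{1}M^{\alpha}_{11}M^{\beta}_{11}M^{\gamma}_{11} + (1 - q^{\zeta}_{1})M^{\alpha}_{01}M^{\beta}_{01}M^{\gamma}_{01}, \tag{A.4}\\
\varepsilon_{\alpha\beta} &= q^{\zeta}_{1}M^{\alpha}_{11}M^{\beta}_{11} + (1 - q^{\zeta}_{1})M^{\alpha}_{01}M^{\beta}_{01}, \tag{A.5}\\
\varepsilon_{\alpha\gamma} &= q^{\zeta}_{1}M^{\alpha}_{11}M^{\gamma}_{11} + (1 - q^{\zeta}_{1})M^{\alpha}_{01}M^{\gamma}_{01}, \tag{A.6}\\
\varepsilon_{\beta\gamma} &= q^{\zeta}_{1}M^{\beta}_{11}M^{\gamma}_{11} + (1 - q^{\zeta}_{1})M^{\beta}_{01}M^{\gamma}_{01}, \tag{A.7}\\
\varepsilon_{\alpha} &= q^{\zeta}_{1}M^{\alpha}_{11} + (1 - q^{\zeta}_{1})M^{\alpha}_{01}, \tag{A.8}\\
\varepsilon_{\beta} &= q^{\zeta}_{1}M^{\beta}_{11} + (1 - q^{\zeta}_{1})M^{\beta}_{01}, \tag{A.9}\\
\varepsilon_{\gamma} &= q^{\zeta}_{1}M^{\gamma}_{11} + (1 - q^{\zeta}_{1})M^{\gamma}_{01}. \tag{A.10}
\end{align}

Equations (A.8)-(A.10) yield

$$
(1 - q^{\zeta}_{1})M^{\alpha}_{01} = \varepsilon_{\alpha} - q^{\zeta}_{1}M^{\alpha}_{11}, \quad
(1 - q^{\zeta}_{1})M^{\beta}_{01} = \varepsilon_{\beta} - q^{\zeta}_{1}M^{\beta}_{11}, \quad
(1 - q^{\zeta}_{1})M^{\gamma}_{01} = \varepsilon_{\gamma} - q^{\zeta}_{1}M^{\gamma}_{11}. \tag{A.11}
$$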

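Inserting (A.11) into (A.5) returns

\begin{align*}
(1 - q^{\zeta}_{1})\varepsilon_{\alpha\beta} &= q^{\zeta}_{1}(1 - q^{\zeta}_{1})M^{\alpha}_{11}M^{\beta}_{11} + (\varepsilon_{\alpha} - q^{\zeta}_{1}M^{\alpha}_{11})(\varepsilon_{\beta} - q^{\zeta}_{1}M^{\beta}_{11})\\
&= q^{\zeta}_{1}M^{\alpha}_{11}M^{\beta}_{11} + \varepsilon_{\alpha}\varepsilon_{\beta} - q^{\zeta}_{1}(\varepsilon_{\alpha}M^{\beta}_{11} + \varepsilon_{\beta}M^{\alpha}_{11}),
\end{align*}

and in consequence

\begin{align}
q^{\zeta}_{1}M^{\beta}_{11}(M^{\alpha}_{11} - \varepsilon_{\alpha}) &= \tau_{\alpha\beta} + q^{\zeta}_{1}(\varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta}), \tag{A.12}\\
q^{\zeta}_{1}M^{\gamma}_{11}(M^{\alpha}_{11} - \varepsilon_{\alpha}) &= \tau_{\alpha\gamma} + q^{\zeta}_{1}(\varepsilon_{\gamma}M^{\alpha}_{11} - \varepsilon_{\alpha\gamma}). \tag{A.13}
\end{align}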

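We insert (A.12)-(A.13) back into (A.11):

\begin{align*}
(1 - q^{\zeta}_{1})M^{\beta}_{01}(M^{\alpha}_{11} - \varepsilon_{\alpha}) &= \varepsilon_{\beta}(M^{\alpha}_{11} - \varepsilon_{\alpha}) - \tau_{\alpha\beta} - q^{\zeta}_{1}(\varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta})\\
&= (1 - q^{\zeta}_{1})(\varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta}).
\end{align*}

In the case of $q^{\zeta}_{1} = 1$ we get from (A.8) and (A.5) that $M^{\alpha}_{11} = \varepsilon_{\alpha}$ and $\varepsilon_{\alpha\beta} = \varepsilon_{\alpha}\varepsilon_{\beta}$. Hence, we can remove the factor $1 - q^{\zeta}_{1}$ from the above equation without destroying equality. Thus, we get

\begin{align}
M^{\beta}_{01}(M^{\alpha}_{11} - \varepsilon_{\alpha}) &= \varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta}, \tag{A.14}\\
M^{\gamma}_{01}(M^{\alpha}_{11} - \varepsilon_{\alpha}) &= \varepsilon_{\gamma}M^{\alpha}_{11} - \varepsilon_{\alpha\gamma}. \tag{A.15}
\end{align}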

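We insert (A.11) in (A.4) to get

$$
M^{\alpha}_{11}\varepsilon_{\beta\gamma} - \varepsilon_{\alpha\beta\gamma} = M^{\beta}_{01}M^{\gamma}_{01}(M^{\alpha}_{11} - \varepsilon_{\alpha}).
$$

Applying (A.14) and (A.15) to this gives us

\begin{align*}
0 &= (M^{\alpha}_{11}\varepsilon_{\beta\gamma} - \varepsilon_{\alpha\beta\gamma})(M^{\alpha}_{11} - \varepsilon_{\alpha}) - (\varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta})(\varepsilon_{\gamma}M^{\alpha}_{11} - \varepsilon_{\alpha\gamma})\\
&= (M^{\alpha}_{11})^{2}\tau_{\beta\gamma} - M^{\alpha}_{11}(\tau_{\alpha\beta\gamma} + 2\varepsilon_{\alpha}\tau_{\beta\gamma}) + \varepsilon_{\alpha\beta\gamma}\varepsilon_{\alpha} - \varepsilon_{\alpha\beta}\varepsilon_{\alpha\gamma}.
\end{align*}

We can apply the solution formula for quadratic equations provided $\tau_{\beta\gamma} \neq 0$, i.e. our condition (5) is satisfied. In that case we get

\begin{align}
(M^{\alpha}_{11})_{\pm} &= \frac{\tau_{\alpha\beta\gamma} + 2\varepsilon_{\alpha}\tau_{\beta\gamma}}{2\tau_{\beta\gamma}} \pm \frac{\sqrt{(\tau_{\alpha\beta\gamma} + 2\varepsilon_{\alpha}\tau_{\beta\gamma})^{2} - 4(\varepsilon_{\alpha\beta\gamma}\varepsilon_{\alpha} - \varepsilon_{\alpha\beta}\varepsilon_{\alpha\gamma})\tau_{\beta\gamma}}}{2\tau_{\beta\gamma}} \notag\\
&= \varepsilon_{\alpha} + \frac{\tau_{\alpha\beta\gamma} \pm \sqrt{\tau_{\alpha\beta\gamma}^{2} + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}}}{2\tau_{\beta\gamma}}. \tag{A.16}
\end{align}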

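Thus we have established the term for $M^{\alpha}_{11}$. The next step is to derive $q^{\zeta}_{1}$. We insert (A.12)-(A.15) into (A.7) and get

\begin{align*}
q^{\zeta}_{1}(M^{\alpha}_{11} - \varepsilon_{\alpha})^{2}\varepsilon_{\beta\gamma} &= \bigl(\tau_{\alpha\beta} + q^{\zeta}_{1}(\varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta})\bigr)\bigl(\tau_{\alpha\gamma} + q^{\zeta}_{1}(\varepsilon_{\gamma}M^{\alpha}_{11} - \varepsilon_{\alpha\gamma})\bigr)\\
&\quad + q^{\zeta}_{1}(1 - q^{\zeta}_{1})(\varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta})(\varepsilon_{\gamma}M^{\alpha}_{11} - \varepsilon_{\alpha\gamma})\\
&= (1 - q^{\zeta}_{1})\tau_{\alpha\beta}\tau_{\alpha\gamma} + q^{\zeta}_{1}\varepsilon_{\beta}\varepsilon_{\gamma}(M^{\alpha}_{11} - \varepsilon_{\alpha})^{2}
\end{align*}

and hence we get the quadratic relation

$$
0 = (1 - q^{\zeta}_{1})\tau_{\alpha\beta}\tau_{\alpha\gamma} - q^{\zeta}_{1}\tau_{\beta\gamma}(M^{\alpha}_{11} - \varepsilon_{\alpha})^{2}. \tag{A.17}
$$

We insert (A.16) and get

\begin{align*}
\tau_{\alpha\beta}\tau_{\alpha\gamma} &= q^{\zeta}_{1}\bigl(\tau_{\alpha\beta}\tau_{\alpha\gamma} + \tau_{\beta\gamma}(M^{\alpha}_{11} - \varepsilon_{\alpha})^{2}\bigr),\\
4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} &= q^{\zeta}_{1}\bigl(4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} + (\tau_{\alpha\beta\gamma} + \sqrt{\chi})^{2}\bigr),\\
4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} &= 2q^{\zeta}_{1}\sqrt{\chi}\,(\sqrt{\chi} + \tau_{\alpha\beta\gamma}).
\end{align*}

We use the equality

$$
4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} = \chi - \tau_{\alpha\beta\gamma}^{2} = (\sqrt{\chi} + \tau_{\alpha\beta\gamma})(\sqrt{\chi} - \tau_{\alpha\beta\gamma})
$$

and the observation that $\sqrt{\chi}\,(\sqrt{\chi} - \tau_{\alpha\beta\gamma}) = 0$ if and only if the conditions in (5) are violated to get

$$
q^{\zeta}_{1} = \frac{\sqrt{\chi} - \tau_{\alpha\beta\gamma}}{2\sqrt{\chi}} = \frac{1}{2} - \frac{\tau_{\alpha\beta\gamma}}{2\sqrt{\chi}}, \tag{A.18}
$$

thus inferring the proposed term for $q^{\zeta}_{1}$. Next we infer the term for $M^{\alpha}_{01}$. To this end we insert (A.16) and (A.18) into (A.11):

\begin{align*}
-q^{\zeta}_{1}(M^{\alpha}_{11} - \varepsilon_{\alpha}) &= (1 - q^{\zeta}_{1})(M^{\alpha}_{01} - \varepsilon_{\alpha}),\\
(\tau_{\alpha\beta\gamma} - \sqrt{\chi})(\tau_{\alpha\beta\gamma} + \sqrt{\chi}) &= 2\tau_{\beta\gamma}(\tau_{\alpha\beta\gamma} + \sqrt{\chi})(M^{\alpha}_{01} - \varepsilon_{\alpha}),\\
M^{\alpha}_{01} &= \varepsilon_{\alpha} + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\beta\gamma}},
\end{align*}

thus inferring the proposed term. The remaining terms are inferred analogously. This completes the proof.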
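The closed forms (A.16) and (A.18) can be exercised as a round trip: build a tripod distribution from known parameters, compute its moments, and recover the parameters. This is a sketch under stated assumptions, not the authors' implementation. All three leaves are chosen with $M_{11} > M_{01}$ so that the "+" root of (A.16) corresponds to $M_{11}$; the formulas for leaves β and γ are obtained from the one for α by permuting leaf labels, which is how "analogously" is read here; function names are hypothetical.

```python
from itertools import product
from math import sqrt, isclose

def tripod(q1, M):
    """Leaf distribution of a tripod; M[leaf] = (M01, M11), Mij = P(leaf = j | root = i)."""
    root = (1 - q1, q1)
    prob = lambda leaf, i, s: M[leaf][i] if s == 1 else 1 - M[leaf][i]
    return {s: sum(root[i] * prob(0, i, s[0]) * prob(1, i, s[1]) * prob(2, i, s[2])
                   for i in (0, 1))
            for s in product((0, 1), repeat=3)}

def recover(p):
    """Recover (q1, [(M01, M11) per leaf]) from a triplet distribution via (A.16), (A.18)."""
    def eps(*leaves):
        return sum(pr for s, pr in p.items() if all(s[k] == 1 for k in leaves))

    def tau(i, j):
        return eps(i, j) - eps(i) * eps(j)

    # Triplet covariance, taken as the third central moment (assumption).
    t3 = (eps(0, 1, 2) - eps(0) * eps(1, 2) - eps(1) * eps(0, 2)
          - eps(2) * eps(0, 1) + 2 * eps(0) * eps(1) * eps(2))
    chi = t3 ** 2 + 4 * tau(0, 1) * tau(0, 2) * tau(1, 2)
    root_chi = sqrt(chi)            # requires chi > 0, cf. condition (5)
    q1 = 0.5 - t3 / (2 * root_chi)  # (A.18)

    params = []
    for leaf in range(3):
        j, k = [x for x in range(3) if x != leaf]
        m01 = eps(leaf) + (t3 - root_chi) / (2 * tau(j, k))
        m11 = eps(leaf) + (t3 + root_chi) / (2 * tau(j, k))   # (A.16), "+" root
        params.append((m01, m11))
    return q1, params

true_q1 = 0.3
true_M = [(0.1, 0.8), (0.2, 0.7), (0.25, 0.6)]
q_hat, M_hat = recover(tripod(true_q1, true_M))
print(q_hat, M_hat)
```

Choosing the "−" root instead swaps the roles of $M_{01}$ and $M_{11}$ and replaces $q^{\zeta}_{1}$ by $1 - q^{\zeta}_{1}$, which is the state flip at the interior node discussed in Lemma 4.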

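Proof of Theorem 6. We bound the parameters from (6) between 0 and 1:

\begin{align*}
0 &\le \frac{1}{2} - \frac{\tau_{\alpha\beta\gamma}}{2\sqrt{\chi}} \le 1,\\
-\sqrt{\chi} &\le \tau_{\alpha\beta\gamma} \le \sqrt{\chi},\\
0 &\le \tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}.
\end{align*}

With (5) this yields positivity for the unconditional covariances. Next we look at $M^{\alpha}_{01}$ and $M^{\alpha}_{11}$:

\begin{align*}
0 &\le \varepsilon_{\alpha} + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\beta\gamma}} \le 1,\\
-2\varepsilon_{\alpha}\tau_{\beta\gamma} &\le \tau_{\alpha\beta\gamma} - \sqrt{\chi} \le 2(1 - \varepsilon_{\alpha})\tau_{\beta\gamma},\\
\tau_{\alpha\beta\gamma} - 2(1 - \varepsilon_{\alpha})\tau_{\beta\gamma} &\le \sqrt{\chi} \le \tau_{\alpha\beta\gamma} + 2\varepsilon_{\alpha}\tau_{\beta\gamma}
\end{align*}

and

\begin{align*}
0 &\le \varepsilon_{\alpha} + \frac{\tau_{\alpha\beta\gamma} + \sqrt{\chi}}{2\tau_{\beta\gamma}} \le 1,\\
-2\varepsilon_{\alpha}\tau_{\beta\gamma} &\le \tau_{\alpha\beta\gamma} + \sqrt{\chi} \le 2(1 - \varepsilon_{\alpha})\tau_{\beta\gamma},\\
-(2\varepsilon_{\alpha}\tau_{\beta\gamma} + \tau_{\alpha\beta\gamma}) &\le \sqrt{\chi} \le 2(1 - \varepsilon_{\alpha})\tau_{\beta\gamma} - \tau_{\alpha\beta\gamma}.
\end{align*}

Squaring both inequalities reduces the four inequalities to the following two:

\begin{align}
\tau_{\alpha\beta\gamma}^{2} + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} &\le (2\varepsilon_{\alpha}\tau_{\beta\gamma} + \tau_{\alpha\beta\gamma})^{2}, \tag{A.19}\\
\tau_{\alpha\beta\gamma}^{2} + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} &\le (2(1 - \varepsilon_{\alpha})\tau_{\beta\gamma} - \tau_{\alpha\beta\gamma})^{2}. \tag{A.20}
\end{align}

We look first at inequality (A.19) and get

\begin{align*}
\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} &\le \varepsilon_{\alpha}^{2}\tau_{\beta\gamma}^{2} + \varepsilon_{\alpha}\tau_{\beta\gamma}\tau_{\alpha\beta\gamma},\\
0 &\le \varepsilon_{\alpha}(\varepsilon_{\alpha}\tau_{\beta\gamma} + \tau_{\alpha\beta\gamma}) - \tau_{\alpha\beta}\tau_{\alpha\gamma},\\
0 &\le \varepsilon_{\alpha}\varepsilon_{\alpha\beta\gamma} - \varepsilon_{\alpha\beta}\varepsilon_{\alpha\gamma} = \tau_{\beta\gamma|1}.
\end{align*}

Set $\bar\varepsilon_{\alpha} := (1 - \varepsilon_{\alpha}) = p_{0\Sigma\Sigma}$ and look at (A.20):

\begin{align*}
\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} &\le \bar\varepsilon_{\alpha}^{2}\tau_{\beta\gamma}^{2} - \bar\varepsilon_{\alpha}\tau_{\beta\gamma}\tau_{\alpha\beta\gamma},\\
0 &\le \bar\varepsilon_{\alpha}(\bar\varepsilon_{\alpha}\tau_{\beta\gamma} - \tau_{\alpha\beta\gamma}) - \tau_{\alpha\beta}\tau_{\alpha\gamma},\\
0 &\le p_{000}p_{011} - p_{001}p_{010} = \tau_{\beta\gamma|0}.
\end{align*}

Hence, we have derived the proposed inequalities.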

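Proof of Proposition 7. The tripod equations (4) imply:

$$
\chi = \tau_{\alpha\beta\gamma}^{2} + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} = (M^{\alpha}_{11} - M^{\alpha}_{01})^{2}(M^{\beta}_{11} - M^{\beta}_{01})^{2}(M^{\gamma}_{11} - M^{\gamma}_{01})^{2}(1 - q^{\zeta}_{1})^{2}(q^{\zeta}_{1})^{2}.
$$

Together with (A.1) and (A.2) we see that there is no set of real or complex parameters such that $\chi = 0$ but $\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} \neq 0$.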
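The factorization of χ used in this proof follows from (A.1) and (A.2) in a few lines. The worked step below is a sketch; the shorthands $\Delta_{\alpha} := M^{\alpha}_{11} - M^{\alpha}_{01}$ (likewise for β and γ) and $q := q^{\zeta}_{1}$ are introduced here for brevity and are not notation from the paper.

```latex
\begin{align*}
\tau_{\alpha\beta\gamma}^{2}
  &= \Delta_{\alpha}^{2}\Delta_{\beta}^{2}\Delta_{\gamma}^{2}\, q^{2}(1-q)^{2}(1-2q)^{2}
  && \text{by (A.1)},\\
4\,\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}
  &= 4\,\Delta_{\alpha}^{2}\Delta_{\beta}^{2}\Delta_{\gamma}^{2}\, q^{3}(1-q)^{3}
  && \text{by (A.2) applied to each pair},\\
\chi
  &= \Delta_{\alpha}^{2}\Delta_{\beta}^{2}\Delta_{\gamma}^{2}\, q^{2}(1-q)^{2}
     \bigl[(1-2q)^{2} + 4q(1-q)\bigr]\\
  &= \Delta_{\alpha}^{2}\Delta_{\beta}^{2}\Delta_{\gamma}^{2}\, q^{2}(1-q)^{2},
  && \text{since } (1-2q)^{2} + 4q(1-q) = 1.
\end{align*}
```

In particular, χ vanishes exactly when one of the factors $\Delta_{\alpha}$, $\Delta_{\beta}$, $\Delta_{\gamma}$, $q$ or $1 - q$ vanishes, and each of these also forces the product of the pairwise covariances to vanish.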

Proof of Proposition 8. The cases are easily verified by looking at Equation (A.2) and inserting the selected parameters back into (4).

Proof of Proposition 9. The function $\chi : \mathbb{C}^{8} \to \mathbb{C}$ is a nonconstant polynomial mapping. Thus the set $\{p \in \mathbb{R}^{8} : \chi(p) = 0\}$ is a Lebesgue zero set. The same holds for the set

$$
\{p \in \mathbb{R}^{8} : \tau_{\alpha\beta}(p) = 0 \text{ or } \tau_{\alpha\gamma}(p) = 0 \text{ or } \tau_{\beta\gamma}(p) = 0\}.
$$

This completes the proof.

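Proof of Proposition 10. We recover $M^{\psi}$ by inserting the parameters from (6) into (11). To infer the invariants we first look at the equality conditions. We do this representatively for the parameters of leaf α, which must coincide whether they are computed from tripod αβγ or from tripod αβδ. In particular, both the difference $M^{\alpha}_{11} - M^{\alpha}_{01}$ and the sum $M^{\alpha}_{11} + M^{\alpha}_{01}$ must agree between the two tripods, and thus

\begin{align*}
\frac{\sqrt{\chi_{\alpha\beta\gamma}}}{\tau_{\beta\gamma}} &= \frac{\sqrt{\chi_{\alpha\beta\delta}}}{\tau_{\beta\delta}}, &
\frac{\tau_{\alpha\beta\gamma}}{\tau_{\beta\gamma}} &= \frac{\tau_{\alpha\beta\delta}}{\tau_{\beta\delta}},\\
\frac{\tau_{\alpha\beta\gamma}^{2} + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}}{\tau_{\beta\gamma}^{2}} &= \frac{\tau_{\alpha\beta\delta}^{2} + 4\tau_{\alpha\beta}\tau_{\alpha\delta}\tau_{\beta\delta}}{\tau_{\beta\delta}^{2}}, &
\frac{\tau_{\alpha\beta\gamma}}{\tau_{\beta\gamma}} &= \frac{\tau_{\alpha\beta\delta}}{\tau_{\beta\delta}},\\
\frac{\tau_{\alpha\gamma}}{\tau_{\beta\gamma}} &= \frac{\tau_{\alpha\delta}}{\tau_{\beta\delta}}, &
\frac{\tau_{\alpha\beta\gamma}}{\tau_{\beta\gamma}} &= \frac{\tau_{\alpha\beta\delta}}{\tau_{\beta\delta}},
\end{align*}

$$
\frac{\tau_{\alpha\beta\gamma}}{\tau_{\alpha\beta\delta}} = \frac{\tau_{\beta\gamma}}{\tau_{\beta\delta}} = \frac{\tau_{\alpha\gamma}}{\tau_{\alpha\delta}}.
$$

Looking at the corresponding equality for the parameters of leaf β yields the same equalities. Reproducing the calculations for the parameters of leaf γ yields the invariants f1, f2 and f3.

The equation system for quartets can be rewritten such that we have equations in $\varepsilon_{\alpha\beta\gamma\delta}$, $\varepsilon_{\alpha\beta\gamma}$, $\varepsilon_{\alpha\beta\delta}$, $\varepsilon_{\alpha\gamma\delta}$, $\varepsilon_{\beta\gamma\delta}$, $\varepsilon_{\alpha\beta}$, $\varepsilon_{\alpha\gamma}$, $\varepsilon_{\alpha\delta}$, $\varepsilon_{\beta\gamma}$, $\varepsilon_{\beta\delta}$, $\varepsilon_{\gamma\delta}$, $\varepsilon_{\alpha}$, $\varepsilon_{\beta}$, $\varepsilon_{\gamma}$ and $\varepsilon_{\delta}$. Apart from the first term, the equations for all terms have been handled in the tripod cases. Hence, all that remains is to insert the parameters obtained in (6) and (12) into the equation

\begin{align*}
\varepsilon_{\alpha\beta\gamma\delta} &= (1 - q^{\zeta}_{1})M^{\alpha}_{01}M^{\beta}_{01}\bigl((1 - M^{\psi}_{01})M^{\gamma}_{01}M^{\delta}_{01} + M^{\psi}_{01}M^{\gamma}_{11}M^{\delta}_{11}\bigr)\\
&\quad + q^{\zeta}_{1}M^{\alpha}_{11}M^{\beta}_{11}\bigl((1 - M^{\psi}_{11})M^{\gamma}_{01}M^{\delta}_{01} + M^{\psi}_{11}M^{\gamma}_{11}M^{\delta}_{11}\bigr).
\end{align*}

Reordering and restructuring this equation eventually yields invariant f0. This completes the proof.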

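Proof of Proposition 11. The conditions under which the parameters from (6) are probabilities are covered by Theorem 6.

Thus, for the remaining condition we have to bound (12) between 0 and 1, and use that the covariances are always positive by Lemma 2(1):

\begin{align*}
-1 &\le \frac{\tau_{\alpha\delta}\tau_{\alpha\beta\gamma} - \tau_{\alpha\beta}\tau_{\alpha\gamma\delta} - \tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}}}{\tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}}} \le 1,\\
\tau_{\alpha\beta}(\tau_{\alpha\gamma\delta} - \sqrt{\chi_{\alpha\gamma\delta}}) &\le \tau_{\alpha\delta}(\tau_{\alpha\beta\gamma} - \sqrt{\chi_{\alpha\beta\gamma}}) \le \tau_{\alpha\beta}(\tau_{\alpha\gamma\delta} + \sqrt{\chi_{\alpha\gamma\delta}}),\\
-1 &\le \frac{\tau_{\alpha\delta}\tau_{\alpha\beta\gamma} - \tau_{\alpha\beta}\tau_{\alpha\gamma\delta} + \tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}}}{\tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}}} \le 1,\\
\tau_{\alpha\beta}(\tau_{\alpha\gamma\delta} - \sqrt{\chi_{\alpha\gamma\delta}}) &\le \tau_{\alpha\delta}(\tau_{\alpha\beta\gamma} + \sqrt{\chi_{\alpha\beta\gamma}}) \le \tau_{\alpha\beta}(\tau_{\alpha\gamma\delta} + \sqrt{\chi_{\alpha\gamma\delta}}).
\end{align*}

The first inequality of the first chain and the last inequality of the second chain rearrange to (13); the remaining two follow from these because the covariances are positive.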

List of Figures

A.1 A binary tree with six leaves. Gray lines and vertices describe the hidden part of the process.

A.2 The tripod tree T.

A.3 The quartet tree Q with its tripod restrictions T, T, T and T. Again, gray lines and vertices indicate the hidden or unknown variables of the approach presented here.

A.4 Assignment of mutation probability from the symmetric distribution in (14). The black lines indicate the triplet αβγ. Assigned branch lengths are rounded values.

Figure A.1: A binary tree with six leaves. Gray lines and vertices describe the hidden part of the process.

Figure A.2: The tripod tree T.

Figure A.3: The quartet tree Q with its tripod restrictions T, T, T and T. Again, gray lines and vertices indicate the hidden or unknown variables of the approach presented here.

Figure A.4: Assignment of mutation probability from the symmetric distribution in (14). The black lines indicate the triplet αβγ. Assigned branch lengths are rounded values. (Edge values appearing in the figure: 0.042, 0.071 and 0.143; leaves α, β, γ, δ.)

List of Tables

A.1 The parameters for each triplet.

triplet   M^α         M^β         M^γ         M^δ         q^ζ
αβγ       0.0417424   0.118119    0.172673    0           0.5
αβδ       0.118119    0.0417424   0           0.172673    0.5
αγδ       0.172673    0           0.0417424   0.118119    0.5
βγδ       0           0.172673    0.118119    0.0417424   0.5

Table A.1: The parameters for each triplet.
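The nonzero entries of Table A.1 can be reproduced from the pairwise covariances of Example 7. The sketch below is a simplification valid only for that example: it assumes every marginal equals 1/2 and every triplet covariance vanishes, so that the Theorem 5 formula reduces to $M = 1/2 - \sqrt{\chi}/(2\tau)$ with $\chi = 4\tau_{xy}\tau_{xz}\tau_{yz}$ and $q^{\zeta} = 1/2$. The leaf names `a`, `b`, `c`, `d` stand for α, β, γ, δ, and the function name is hypothetical.

```python
from itertools import combinations
from math import sqrt, isclose

# Pairwise covariances reported in Example 7 (a, b, c, d stand for alpha, ..., delta).
tau = {frozenset("ab"): 7 / 40, frozenset("cd"): 7 / 40,
       frozenset("ac"): 3 / 20, frozenset("bd"): 3 / 20,
       frozenset("ad"): 1 / 8,  frozenset("bc"): 1 / 8}

def triplet_parameters(triplet):
    """Parameter M for each leaf of a triplet, assuming marginals 1/2 and zero triplet covariance."""
    x, y, z = triplet
    chi = 4 * tau[frozenset(x + y)] * tau[frozenset(x + z)] * tau[frozenset(y + z)]
    root_chi = sqrt(chi)
    params = {}
    for leaf in triplet:
        others = "".join(l for l in triplet if l != leaf)   # the two remaining leaves
        params[leaf] = 0.5 - root_chi / (2 * tau[frozenset(others)])
    return params

table = {"".join(t): triplet_parameters(t) for t in combinations("abcd", 3)}
for name, row in table.items():
    print(name, {leaf: round(value, 6) for leaf, value in row.items()})
```

Each row contains the same three values in a different order, which is the similarity between the tripod trees that the example points out.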
