An algebraic analysis of the two state Markov model on tripod trees
Steffen Klaere (a,*), Volkmar Liebscher (b)
(a) Department of Mathematics and Statistics, University of Otago, Dunedin, New Zealand
(b) Institut für Mathematik und Informatik, Universität Greifswald, Germany
Abstract
Methods of phylogenetic inference use more and more complex models to generate trees from data. However, even simple models and their implications are not fully understood.
Here, we investigate the two-state Markov model on a tripod tree, inferring conditions under which a given set of observations gives rise to such a model. This approach has been taken before by several scientists from different fields of research.
We fully analyze the model, present conditions under which one can infer a model from the observation or at least get support for the tree-shaped interdependence of the leaves considered.
We also present all conditions under which the results can be extended from tripod trees to quartet trees, a step necessary to reconstruct at least a topology. Apart from finding conditions under which such an extension works we discuss example cases for which such an extension does not work.
We present a complete algebraic analysis of the two-state Markov model of evolution on tripod trees. Consequences for quartet trees are indicated, too.
Keywords: Phylogenetics, Identifiability, Invariant, Two-State-Model
1. Introduction
In phylogeny, one assumes that the relationship of a set of taxonomic units (taxa for short) can be visualised by a (binary) tree. The aim is to derive this tree from the observations at the taxa. From a stochastic modelling point of view, one assigns the taxa to the leaves of a (binary) tree, and assumes that the observations (which are usually considered to be i.i.d. over different sites) are the end results of a Markov process along the tree. The goal is to derive the best combination of tree and Markov model to explain the observations.
This work regards the identifiability problem of this inference. It essentially asks whether infinite data sets are able to uniquely identify the transitions on the tree and the tree itself. Note that in the present context, identifiability readily leads to consistency of various methods of estimating the parameters of the model (see Bryant et al., 2005, Section 2.2 for an overview).
Preprint submitted to arXiv February 6, 2011. arXiv:1012.0062v2 [q-bio.PE] 18 Dec 2010
Chang (1996) states that a reversible Markov process on a tree can be reconstructed from the restrictions of its distribution to all triples of leaves, and provides a characterisation of the transition matrices yielding a reversible process. However, usually one only has an estimate of the leaf distribution such a process induces. This leads to the question whether one can find (simple) conditions to determine whether a taxon distribution comes from a Markov process. In other words, we ask whether we could validate the model, at least if there are infinitely many data points available.
To approach this problem, we consider a very simple model. We assume our process can take only one of two states, which means that there are only two states for every site, and the tree is a tripod tree.
Under these restrictions, we can completely describe the map from the taxon distribution to the parameters of the model, including necessary and sufficient conditions on positivity of the parameters. Thereby, no conditions on reversibility of the processes on the edges are needed. The analysis of the model on tripod trees has immediate consequences for quartet trees. We derive these conditions to exemplify the shortcomings of an extension from tripods to quartets.
Technically, the generic part of this work is already well-known. Initial work on the two state model from psychology can be found in (Lazarsfeld and Henry, 1968). Pearl and Tarsi (1986) used those results in artificial intelligence to algorithmically identify the whole tree behind two-state Markov models. Note that identifiability of Markov models especially in phylogeny was studied in Allman et al. (2009); Allman and Rhodes (2008, 2003); Baake (1998); Chang (1996). We add to those results the analysis of the degenerate cases, together with a complete analysis of the quartet tree model.
The typical tools (for models with more states) to identify a subspace of taxon distributions which might come from a Markovian tree model are phylogenetic invariants (Allman and Rhodes, 2008; Sumner et al., 2008; Sturmfels and Sullivant, 2005; Allman and Rhodes, 2003; Lake, 1987; Cavender and Felsenstein, 1987). Those invariants are polynomials in the taxon distribution which are zero for those distributions which are derived from the model of interest. In the two-state tripod case there is only a trivial invariant. However, not all leaf distributions are derived from the Markov model. In fact, we derived polynomials which vanish in distributions which satisfy the trivial invariant but are not identifiable under the Markov model. To accommodate this observation we suggest incorporating these polynomials into the set of invariants, with the addition that these polynomials must not vanish for identifiable distributions. We discuss degenerate distributions to describe this observation.
Although most of the leaf distributions allow for complex solutions of the model equations, in order for the solution of the algebraic equation to be parameters of a Markov model additional inequalities must be fulfilled (Zwiernik and Smith, 2010; Matsen, 2009). The approach of Matsen is restricted to the Cavender-Farris-Neyman model (CFN; Cavender, 1978; Farris, 1973; Neyman, 1971) to accommodate the Hadamard approach (Hendy and Penny, 1989; Szekely et al., 1993). Extending our approach we recover the inequalities presented in Pearl and Tarsi (1986).
As a final step we investigate how the results for tripod trees extend to trees of four leaves. The results provide a glimpse at what we can expect from the reconstruction from tripods when we have no knowledge of the identifiability of the given taxon distribution.
The structure of this work is as follows: In Section 2 we describe the general mutation model on a tree, with specialisation to tripod trees coming in Section 3. Section 4 deals with the complete solution of the two-state tripod tree model. Then, in Section 5 we use these results to analyse the general two-state Markov model on quartet trees. For the sake of readability, proofs are presented in Appendix A.
2. The Markov model of mutation along a tree
In this section we introduce the general Markov model and its properties. Pearl and Tarsi (1986) nicely motivate this model in the following way. Assume one is given a set L of taxa and a set of observations from a Markov process X : L → {0, 1}. From these observations one deduces a correlation between the taxa. The assumption is that this correlation can be explained by an underlying (binary) tree T = (V, E) and an extension Y : V → {0, 1} of X such that for any pair of taxa there is an interior node such that given the state at the interior node the two taxa are independent. See Fig. A.1 for a depiction of this.
[Figure 1 about here.]
Let us look closer at the process Y. The independence of pairs of taxa given an interior vertex on the path between them corresponds to the so-called directed local Markov property (e.g., Lauritzen, 1996, Chapter 2). For this property one has to identify a vertex ζ ∈ V as the root of the tree and direct all edges away from ζ.
The directed local Markov property states that conditioned on the state of its parent vertex the state of a vertex α ∈ V is independent of the states of its non-descendants, i.e. all those vertices whose path to the root does not pass α, excluding its parent. With this property the joint distribution $p^Y$ has the factorization property, i.e. for the joint state $\chi \in \{0,1\}^{|V|}$ we get
$$ p^Y_\chi = \Pr[Y_\zeta = \chi_\zeta] \prod_{(\alpha,\beta)\in E} \Pr[Y_\beta = \chi_\beta \mid Y_\alpha = \chi_\alpha] = q^\zeta_{\chi_\zeta} \prod_{(\alpha,\beta)\in E} M^{\alpha\beta}_{\chi_\alpha\chi_\beta}. \tag{1} $$
We only have partial knowledge on the realizations of the process Y through the process X on the leaves. The joint distribution $p^X$ of X can then be inferred from (1) using the law of total probability. Let $x \in \{0,1\}^{|L|}$ denote the joint state at the leaves. Then
$$ p^X_x = \sum_{\chi \in \{0,1\}^{|V|},\ \chi|_L = x} p^Y_\chi = \sum_{\chi \in \{0,1\}^{|V|},\ \chi|_L = x} q^\zeta_{\chi_\zeta} \prod_{(\alpha,\beta)\in E} M^{\alpha\beta}_{\chi_\alpha\chi_\beta}. \tag{2} $$
Note that under the assumption that X comes from a reversible Markov process Y, Chang (1996) proved that all process parameters $(M^e)_{e\in E}$ and $q^\zeta$ can be recovered from the distributions of the restrictions of X to arbitrary triples of taxa.
If we find parameters $(M^e)_{e\in E}$ and $q^\zeta$ for a joint taxon distribution p then we call p decomposable. If the parameters are unique (up to model-specific symmetries) but complex valued, we call p algebraically identifiable, and if further the parameters are marginal and transition probabilities, then p is called tree identifiable.
Looking at (2) we realize that deriving the decomposability of a distribution p is equivalent to solving a polynomial equation system of $2^{|L|} - 1$ independent equations in $4|L| - 5$ variables. We observe that the Markov equations are overdetermined for |L| > 3, i.e. the space of decomposable distributions is a true subspace of the space of all distributions. From this we conclude that there are conditions which define a decomposable distribution. These conditions are generally known as invariants, polynomials in $2^{|L|} - 1$ variables whose roots are distributions which are algebraically decomposable. One example of an invariant is
$$ \sum_{x \in \{0,1\}^{|L|}} p_x = 1, \tag{3} $$
i.e. all probabilities sum to one. This is fittingly called the trivial invariant. Allman and Rhodes (2008) provide a complete set of invariants for trees of arbitrary size under a two-state-model, and observe that for complete identification the knowledge of the restrictions to six taxa is necessary.
However, as pointed out in multiple publications (e.g., Pearl and Tarsi, 1986; Matsen, 2009) such invariants are not sufficient to guarantee tree identifiability. In particular, additional inequalities are needed.
Here, we are not only interested in recapturing invariants and inequalities but also look at those distributions which are not tree identifiable or not decomposable at all to discuss their impact on invariant-based inference.
3. General properties of a Markov model on a tripod tree
The starting point of our analysis is the tripod tree T with taxa α, β, γ, root ζ and edges (ζ, α), (ζ, β), (ζ, γ) (see Fig. A.2). This is the only labeled topology for three taxa. Hence any inference will be process- and not topology-related. Allman and Rhodes (2003) place the root at a taxon for their approach. We will leave the root at an interior node for the symmetry this provides in the tree equations.
[Figure 2 about here.]
As we have seen before, if the joint distribution p of Xα, Xβ, Xγ comes from a Markov process then there are parameters $q^\zeta, M^\alpha, M^\beta, M^\gamma$ such that the Markov equations (2) are satisfied. On a tripod tree these equations are the tripod equations
$$ p_{abc} = q^\zeta_1 M^\alpha_{1a} M^\beta_{1b} M^\gamma_{1c} + (1 - q^\zeta_1) M^\alpha_{0a} M^\beta_{0b} M^\gamma_{0c}, \qquad a, b, c \in \{0,1\}. \tag{4} $$
As before we call p decomposable if such parameters exist, algebraically identifiable when the parameters are unique, and tripod identifiable if the parameters are unique and proper marginal and transition probabilities.
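The forward direction of the tripod equations (4) can be sketched in a few lines of Python. This is only an illustration: the function name `tripod_distribution` and the numerical parameter values are assumptions made here, not part of the original analysis.

```python
from itertools import product

def tripod_distribution(q1, M_alpha, M_beta, M_gamma):
    """Leaf distribution p[(a, b, c)] from the tripod equations (4).

    q1 is the root probability of state 1; each M is a 2x2 row-stochastic
    matrix with M[z][a] = Pr[leaf = a | root = z].
    """
    p = {}
    for a, b, c in product((0, 1), repeat=3):
        p[(a, b, c)] = (q1 * M_alpha[1][a] * M_beta[1][b] * M_gamma[1][c]
                        + (1 - q1) * M_alpha[0][a] * M_beta[0][b] * M_gamma[0][c])
    return p

# Hypothetical parameter values, chosen only for illustration.
M = [[0.9, 0.1], [0.2, 0.8]]
p = tripod_distribution(0.3, M, M, M)
```

Any distribution produced this way from proper probabilities is decomposable by construction; the question studied below is the reverse direction, from p back to the parameters.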
The works of Lazarsfeld and Henry (1968) and Pearl and Tarsi (1986) were mainly interested in inferring conditions under which a triplet distribution is identifiable. While recovering their results we also investigate decomposability and algebraic identifiability in order to describe their impact on invariant-based inference.
For three taxa the only invariant restriction is the trivial invariant. Thus, one could expect that all triplet distributions are decomposable. As we will see later, this is not the case; in fact one has to add polynomials for which a decomposable distribution cannot be a zero point.
3.1. Statistics for binary models
Since we are looking at binary random variables we can employ some properties of these models. In particular, computing means is very simple:
εαβγ := EXαXβXγ = Pr[Xα = 1, Xβ = 1, Xγ = 1] = p111,
εαβ := EXαXβ = Pr[Xα = 1, Xβ = 1] = p11Σ = p110 + p111,
εα := EXα = Pr[Xα = 1] = p1ΣΣ = p100 + p101 + p110 + p111.
Below, we use equivalent definitions for εαγ, εβγ, εβ and εγ, and p01Σ and so on. With this, we see easily that a triplet distribution can be recovered from all the means and thus the tripod equations can be reformulated in terms of the means.
In the following, we make use of the simple relationship between means and covariances:
ταβ := Cov[Xα, Xβ] = EXαXβ − EXαEXβ = εαβ − εαεβ = p00Σp11Σ − p01Σp10Σ,
with equivalent definitions for ταγ and τβγ. Of further interest are terms which are related to conditional covariances for binary variables (c ∈ {0, 1})
ταβ|c := p11cpΣΣc − p1ΣcpΣ1c = p00cp11c − p01cp10c,
with equivalent definitions for ταγ|b and τβγ|a. Finally, we also introduce the three-way covariance
ταβγ := Cov[Xα, Xβ, Xγ] = E(Xα − EXα)(Xβ − EXβ)(Xγ − EXγ)
= εαβγ − εαεβγ − εβεαγ − εγεαβ + 2εαεβεγ.
For a review on covariance for more than two random variables see e.g. Rayner and Beh (2009). Using these notations we can immediately propose a useful property.
Lemma 1. Let p denote the joint probability for binary random variables Xα, Xβ and Xγ. If we flip the state in one taxon, then we flip the signs in its pairwise covariances. E.g., if Xα ↦ 1 − Xα, then ταβ ↦ −ταβ, ταγ ↦ −ταγ and τβγ ↦ τβγ.
In consequence, the product ταβταγτβγ will always have the same sign no matter how often we flip states, because each flip changes the sign of exactly two of its three factors.
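Lemma 1 is easy to check numerically. The following sketch computes the pairwise covariances with exact rational arithmetic and applies the flip Xα ↦ 1 − Xα; the helper names are illustrative assumptions, and the counts are those of the distribution p1 that appears later in Example 2.

```python
from fractions import Fraction
from itertools import product

STATES = list(product((0, 1), repeat=3))  # order: 000, 001, ..., 111

def pairwise_covariances(p):
    """Return (tau_ab, tau_ag, tau_bg) for p given as dict (a, b, c) -> prob."""
    def mean(*idx):
        # Expectation of the product of the selected leaves,
        # e.g. mean(0, 1) is eps_alpha_beta.
        return sum(v for s, v in p.items() if all(s[i] == 1 for i in idx))
    return (mean(0, 1) - mean(0) * mean(1),
            mean(0, 2) - mean(0) * mean(2),
            mean(1, 2) - mean(1) * mean(2))

def flip_alpha(p):
    """Apply X_alpha -> 1 - X_alpha to the joint distribution."""
    return {(1 - a, b, c): v for (a, b, c), v in p.items()}

counts = (6, 7, 2, 1, 1, 1, 4, 5)  # distribution p1 of Example 2, times 27
p = {s: Fraction(n, 27) for s, n in zip(STATES, counts)}
t = pairwise_covariances(p)
t_flipped = pairwise_covariances(flip_alpha(p))
```

The two covariances involving α change sign while the third is untouched, so the product of the three is the same before and after the flip.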
3.2. Tree properties
In this section we assume that p is decomposable and regard some immediate consequences. We will later see that these conditions are necessary for identifiability but not sufficient. Nevertheless, these conditions provide some immediate insights.
Lemma 2. 1. If a triplet distribution p is decomposable on T with ταβ = 0, then also ταβγ = 0 and ταγ = 0 or τβγ = 0.
2. If a triplet distribution p is tree identifiable then the product ταβταγτβγ is non-negative.
The non-negativity of the product has already been verified by Lazarsfeld and Henry (1968). With Lemma 1 it is not complicated to derive that on a star tree (with arbitrary number of leaves) there always is a state flipping such that all pairs of leaves are positively correlated.
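A short calculation makes the non-negativity plausible. It is a sketch under the assumption that p satisfies the tripod equations (4) with proper probabilities; the abbreviation $a_\delta$ is introduced here for convenience and is not notation from the original text.

```latex
% Abbreviation (assumption of this sketch): a_\delta := M^\delta_{11} - M^\delta_{01}
\begin{aligned}
\varepsilon_\alpha &= q^\zeta_1 M^\alpha_{11} + (1 - q^\zeta_1) M^\alpha_{01}, \\
\varepsilon_{\alpha\beta} &= q^\zeta_1 M^\alpha_{11} M^\beta_{11} + (1 - q^\zeta_1) M^\alpha_{01} M^\beta_{01}, \\
\tau_{\alpha\beta} &= \varepsilon_{\alpha\beta} - \varepsilon_\alpha \varepsilon_\beta
  = q^\zeta_1 (1 - q^\zeta_1)\, a_\alpha a_\beta, \\
\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}
  &= \bigl(q^\zeta_1 (1 - q^\zeta_1)\bigr)^3 \,(a_\alpha a_\beta a_\gamma)^2 \;\ge\; 0
  \quad\text{for } q^\zeta_1 \in [0, 1].
\end{aligned}
```

The same expression shows that ταβ = 0 forces $a_\alpha = 0$, $a_\beta = 0$ or a degenerate root distribution, each of which makes a second pairwise covariance vanish as well.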
Point 1 occurs exactly if Xα or Xβ is independent of the remaining random variables. It also implies the following:
Corollary 3. A triplet distribution p with ταβ = 0 but ταγ ≠ 0 and τβγ ≠ 0 is not decomposable.
Thus we already see that the trivial invariant does not characterize decomposable distributions in this setting. The following example shows that such cases can be easily constructed.
Example 1. Triplet distributions of type
p = (p000, p001, p010, p011, p100, p101, p110, p111)
= (4 − x, x, 2, 2, 2, 2, 2, 2)/16, x ∈ [0, 4] \ {2},
yield ταβ = 0 but ταγ = τβγ = (2 − x)/32 and hence are not decomposable. Moreover, there is no graphical structure for which two taxa can be independent of each other but each is correlated with a third taxon.
4. Solving the tripod equations
In this section we are given a triplet distribution p and infer conditions under which it is identifiable. For each case we will present an example.
4.1. The algebraic solution
As has been pointed out multiple times, the only invariant in the tripod case is the trivial invariant. In other words, the “set” of invariants for a tripod tree is satisfied by all triplet distributions. However, as we have seen in Corollary 3 there are triplet distributions which are not decomposable even though they satisfy the trivial invariant. Thus executing the actual decomposition, i.e. finding a solution for the tripod equations, not only provides complete forms for the parameters but is also helpful to identify further cases. The first task is to clarify up to which level of uniqueness the decomposition of a triplet distribution can be attained.
Lemma 4. If a triplet distribution p is decomposable with parameters $q^\zeta, M^\alpha, M^\beta, M^\gamma$ then it is also decomposable for root-flipped parameters $\bar q^\zeta, \bar M^\alpha, \bar M^\beta, \bar M^\gamma$ with $\bar q^\zeta_z = q^\zeta_{1-z}$, $\bar M^\alpha_{za} = M^\alpha_{(1-z)a}$, $\bar M^\beta_{zb} = M^\beta_{(1-z)b}$, $\bar M^\gamma_{zc} = M^\gamma_{(1-z)c}$.
Hence, there will always be at least two sets of parameters which decompose a triplet distribution p. In terms of molecular evolution one can view these solutions as one having only few mutations ($M^\delta_{z(1-z)} < M^\delta_{zz}$, δ leaf) or many mutations ($M^\delta_{z(1-z)} > M^\delta_{zz}$, δ leaf). Chang (1996) addressed the problem of symmetric solutions by introducing matrix categories which are reconstructible from rows. One such class is that of diagonally dominant matrices, i.e. $M^\delta_{zz} > M^\delta_{z(1-z)}$ for all leaves and z ∈ {0, 1}. If only these two sets of parameters exist then we will still regard the associated distribution as identifiable.
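Next, we present conditions under which p is algebraically identifiable and present the closed form for the parameters.
Theorem 5. Let p denote a triplet distribution and assume
$$ \tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} \neq 0, \qquad \tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} \neq -\left(\frac{\tau_{\alpha\beta\gamma}}{2}\right)^2. \tag{5} $$
Then p is algebraically identifiable. The associated parameters have the following form:
$$ \begin{aligned}
q^\zeta_1 &= \frac{1}{2} - \frac{\tau_{\alpha\beta\gamma}}{2\sqrt{\chi}}, \\
M^\alpha_{01} &= \varepsilon_\alpha + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\beta\gamma}}, &
M^\beta_{01} &= \varepsilon_\beta + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\alpha\gamma}}, &
M^\gamma_{01} &= \varepsilon_\gamma + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\alpha\beta}}, \\
M^\alpha_{11} &= \varepsilon_\alpha + \frac{\tau_{\alpha\beta\gamma} + \sqrt{\chi}}{2\tau_{\beta\gamma}}, &
M^\beta_{11} &= \varepsilon_\beta + \frac{\tau_{\alpha\beta\gamma} + \sqrt{\chi}}{2\tau_{\alpha\gamma}}, &
M^\gamma_{11} &= \varepsilon_\gamma + \frac{\tau_{\alpha\beta\gamma} + \sqrt{\chi}}{2\tau_{\alpha\beta}},
\end{aligned} \tag{6} $$
where $\chi = \tau_{\alpha\beta\gamma}^2 + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}$.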
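The closed form (6) translates directly into code. The sketch below is a minimal illustration restricted to the real-valued branch (χ > 0); the function name `solve_tripod` and the dictionary layout are assumptions of this example. It is applied to the distribution that appears later as Example 3.

```python
import math
from itertools import product

STATES = list(product((0, 1), repeat=3))  # order: 000, 001, ..., 111

def solve_tripod(p):
    """Evaluate the closed form (6) for p given as dict (a, b, c) -> prob.

    Assumes conditions (5) hold and chi > 0 (real-valued branch only).
    Returns q1 and, per leaf, the pair (M_01, M_11).
    """
    def mean(*idx):
        return sum(v for s, v in p.items() if all(s[i] == 1 for i in idx))
    e = [mean(0), mean(1), mean(2)]
    t_ab = mean(0, 1) - e[0] * e[1]
    t_ag = mean(0, 2) - e[0] * e[2]
    t_bg = mean(1, 2) - e[1] * e[2]
    t_abg = (mean(0, 1, 2) - e[0] * mean(1, 2) - e[1] * mean(0, 2)
             - e[2] * mean(0, 1) + 2 * e[0] * e[1] * e[2])
    chi = t_abg ** 2 + 4 * t_ab * t_ag * t_bg
    root = math.sqrt(chi)
    q1 = 0.5 - t_abg / (2 * root)
    # Each leaf is paired with the covariance of the *other* two leaves.
    leaves = {}
    for name, eps, tau in (("alpha", e[0], t_bg),
                           ("beta", e[1], t_ag),
                           ("gamma", e[2], t_ab)):
        leaves[name] = (eps + (t_abg - root) / (2 * tau),
                        eps + (t_abg + root) / (2 * tau))
    return q1, leaves

counts = (68, 0, 20, 12, 20, 12, 17, 51)  # Example 3, times 200
p = {s: n / 200 for s, n in zip(STATES, counts)}
q1, leaves = solve_tripod(p)
```

For this input the first entry of `leaves["gamma"]` evaluates to −1/20 up to rounding, which is the improper value of $M^\gamma_{01}$ reported in Example 3. The root-flipped solution of Lemma 4 corresponds to taking the other sign of the square root.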
Note that Pearl and Tarsi (1986) presented a similar solution for the parameters. Looking at the parameters in (6) we see that algebraically the conditions in (5) prevent division by zero. Together with the trivial invariant we can thus claim that the space of algebraically identifiable triplet distributions is given by S − S0 − S1 with
$$ S := \{p \in \mathbb{R}^8_+ : p_{000} + \cdots + p_{111} = 1\}, \quad S_0 := \{p \in S : \tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} = 0\}, \quad S_1 := \{p \in S : \tau_{\alpha\beta\gamma}^2 + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} = 0\}. $$
Considering (5) and Lemma 2.2 we see that triplet distributions with ταβταγτβγ < 0 are only algebraically, but not tripod identifiable. In fact, for $-\tau_{\alpha\beta\gamma}^2 < 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} < 0$ we get real-valued parameters, and for $4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} < -\tau_{\alpha\beta\gamma}^2$ we get a set of complex-valued parameters.
The following example presents such distributions.
Example 2. Regard the distributions
p1 = (6, 7, 2, 1, 1, 1, 4, 5)/27, p2 = (6, 7, 1, 2, 1, 1, 4, 5)/27, p3 = (6, 6, 2, 2, 1, 1, 4, 5)/27.
All three distributions satisfy the conditions (5), i.e. they are algebraically identifiable. For p1 the covariance τβγ is negative and the other two positive, while for p2 we have ταγ negative and the other two positive. The distribution p3 has only positive pairwise covariances.
The parameters for p1 are real-valued, the parameters for p2 are complex-valued and p3 is tripod decomposable.
Though this example is artificial it indicates just how sensitive the model is to misreads in alignments. E.g., the difference between p1 and p2 could be seen as reading a single pattern 011 instead of a pattern 010.
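The three regimes can be told apart from the signs of the product ταβταγτβγ and of χ alone. The following sketch does this with exact fractions; the function name `classify` and the returned labels are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

STATES = list(product((0, 1), repeat=3))  # order: 000, 001, ..., 111

def classify(counts, total):
    """Classify a triplet distribution by the signs of tau-product and chi."""
    p = {s: Fraction(n, total) for s, n in zip(STATES, counts)}
    def mean(*idx):
        return sum(v for s, v in p.items() if all(s[i] == 1 for i in idx))
    e = [mean(0), mean(1), mean(2)]
    t_ab = mean(0, 1) - e[0] * e[1]
    t_ag = mean(0, 2) - e[0] * e[2]
    t_bg = mean(1, 2) - e[1] * e[2]
    t_abg = (mean(0, 1, 2) - e[0] * mean(1, 2) - e[1] * mean(0, 2)
             - e[2] * mean(0, 1) + 2 * e[0] * e[1] * e[2])
    prod3 = t_ab * t_ag * t_bg
    chi = t_abg ** 2 + 4 * prod3
    if prod3 == 0 or chi == 0:
        return "violates (5)"
    if chi < 0:
        return "complex-valued parameters"
    if prod3 < 0:
        return "real-valued parameters, negative product"
    return "real-valued parameters, positive product"

label_p1 = classify((6, 7, 2, 1, 1, 1, 4, 5), 27)
label_p2 = classify((6, 7, 1, 2, 1, 1, 4, 5), 27)
label_p3 = classify((6, 6, 2, 2, 1, 1, 4, 5), 27)
```

A positive product is necessary but not sufficient for tripod identifiability, so the last label does not by itself certify a proper Markov model.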
4.2. Tripod identifiable distributions
The next step is to determine conditions under which a distribution satisfying (5) is tripod identifiable. These conditions should correspond to the conditions given by Pearl and Tarsi (1986, Theorem 1).
Example 2 dealt with ταβταγτβγ < 0. However, as the following example shows, positivity of the product does not necessarily yield tripod identifiability.
Example 3. The tripod distribution
p = (68, 0, 20, 12, 20, 12, 17, 51)/200
yields positive covariances for all three pairs but also $M^\gamma_{01} = -1/20$, i.e. not a probability.
The example contains a pattern of expected zero occurrence. From the tripod equations we conclude that a tripod identifiable distribution is strictly positive, thus this example is slightly crooked. However, as Example 1 showed, a strictly positive triplet distribution is not necessarily tripod identifiable either.
In order to get necessary and sufficient conditions on a triplet distribution to be tripod identifiable we need to go back to the parameters in (6) and bound them accordingly. This yields:
Theorem 6. A triplet distribution p is uniquely tripod identifiable if and only if after suitable state flips
$$ \begin{aligned}
\tau_{\alpha\beta} &> 0, & \tau_{\alpha\beta|0} &\ge 0, & \tau_{\alpha\beta|1} &\ge 0, \\
\tau_{\alpha\gamma} &> 0, & \tau_{\alpha\gamma|0} &\ge 0, & \tau_{\alpha\gamma|1} &\ge 0, \\
\tau_{\beta\gamma} &> 0, & \tau_{\beta\gamma|0} &\ge 0, & \tau_{\beta\gamma|1} &\ge 0.
\end{aligned} \tag{7} $$
In other words, the direction of the correlation between a pair of leaves shall not be influenced by the third leaf. With this we can summarise that a triplet distribution is tripod identifiable if it is in S − S0 − S1 and there is a state flip such that (7) is satisfied.
Example 4. The tripod distribution p from Example 3 has positive pairwise and conditional covariances except for ταβ|1 = −9/2500. Thus it does not satisfy (7).
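The inequalities (7) are straightforward to evaluate for a given labelling of the states. The sketch below does so for the distribution of Example 3; it deliberately does not search over state flips, and the helper names are assumptions of this illustration.

```python
from fractions import Fraction
from itertools import product

STATES = list(product((0, 1), repeat=3))  # order: 000, 001, ..., 111

def condition_7_terms(p):
    """Pairwise and conditional covariances from (7), without state flips."""
    def mean(*idx):
        return sum(v for s, v in p.items() if all(s[i] == 1 for i in idx))
    pair = {"ab": mean(0, 1) - mean(0) * mean(1),
            "ag": mean(0, 2) - mean(0) * mean(2),
            "bg": mean(1, 2) - mean(1) * mean(2)}
    cond = {}
    for z in (0, 1):
        # tau_{ab|c} = p00c p11c - p01c p10c, and analogously for the others
        cond[("ab", z)] = p[(0, 0, z)] * p[(1, 1, z)] - p[(0, 1, z)] * p[(1, 0, z)]
        cond[("ag", z)] = p[(0, z, 0)] * p[(1, z, 1)] - p[(0, z, 1)] * p[(1, z, 0)]
        cond[("bg", z)] = p[(z, 0, 0)] * p[(z, 1, 1)] - p[(z, 0, 1)] * p[(z, 1, 0)]
    return pair, cond

def satisfies_7(p):
    """True if (7) holds for the labelling as given."""
    pair, cond = condition_7_terms(p)
    return all(v > 0 for v in pair.values()) and all(v >= 0 for v in cond.values())

counts = (68, 0, 20, 12, 20, 12, 17, 51)  # Example 3, times 200
p = {s: Fraction(n, 200) for s, n in zip(STATES, counts)}
pair, cond = condition_7_terms(p)
ok = satisfies_7(p)
```

A complete test of Theorem 6 would repeat the check for each of the eight combinations of leaf flips and accept if any of them passes.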
4.3. Non-identifiable cases
The above considerations dealt with cases where a given triplet distribution p is identifiable. The final step of the tripod analysis is to regard those distributions which violate the conditions (5). Corollary 3 already discussed the case where one pairwise covariance is zero while the other two are not, and we found that such distributions are not decomposable. In the following we look at the remaining cases.
Proposition 7. Assume that a triplet distribution p obeys $\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} = -(\tau_{\alpha\beta\gamma}/2)^2$ but $\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} \neq 0$. Then p is not algebraically decomposable.
In other words, we found another set of triplet distributions which are not decomposable.
Example 5. The distribution
p = (16, 5, 8, 15, 14, 5, 2, 15)/80
yields ταβ = −1/80, ταγ = 1/40 and τβγ = 1/8 but χ = 0 and hence has no factorization in the sense of (4).
This case is particularly disturbing because here all taxa appear to be correlated and yet no structure can be found to explain the correlation.
Together with Corollary 3 this covers the non-decomposable distributions. The remaining cases are triplet distributions which are decomposable but not identifiable.
Proposition 8. Let p be a triplet distribution with ταβ = 0 and ταγ = 0. Then p is decomposable with infinitely many parameter sets.
The parameter sets are identified by one of the following compositions:
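(i) τβγ ≠ 0. Then $M^\alpha_{0a} = M^\alpha_{1a} = p_{a\Sigma\Sigma}$, a ∈ {0, 1}, and for any u, b, c ∈ {0, 1}:
$$ q^\zeta_1 = \frac{p_{\Sigma\Sigma c} - M^\gamma_{1c}}{M^\gamma_{0c} - M^\gamma_{1c}}, \qquad M^\beta_{ub} = \frac{p_{\Sigma bc} - p_{\Sigma b\Sigma} M^\gamma_{(1-u)c}}{p_{\Sigma\Sigma c} - M^\gamma_{(1-u)c}} \tag{8} $$
with free parameters $M^\gamma_{0c} \neq M^\gamma_{1c}$.
(ii) τβγ = 0. Then for all a, b, c ∈ {0, 1} the free parameters can be distributed as follows:
(a) $M^\alpha_{0a} = M^\alpha_{1a} = p_{a\Sigma\Sigma}$, $M^\beta_{0b} = M^\beta_{1b} = p_{\Sigma b\Sigma}$ and
$$ q^\zeta_1 = \frac{p_{\Sigma\Sigma c} - M^\gamma_{0c}}{M^\gamma_{1c} - M^\gamma_{0c}}, \tag{9} $$
with free parameters $M^\gamma_{0z} \neq M^\gamma_{1z}$.
(b) $M^\alpha_{0a} = M^\alpha_{1a} = p_{a\Sigma\Sigma}$, $M^\beta_{0b} = M^\beta_{1b} = p_{\Sigma b\Sigma}$, $M^\gamma_{0c} = M^\gamma_{1c} = p_{\Sigma\Sigma c}$ with free parameter $q^\zeta_1$.
(c) $q^\zeta_1 = 0$, $M^\alpha_{0a} = p_{a\Sigma\Sigma}$, $M^\beta_{0b} = p_{\Sigma b\Sigma}$, $M^\gamma_{0c} = p_{\Sigma\Sigma c}$ with free parameters $M^\alpha_{1a}$, $M^\beta_{1b}$, $M^\gamma_{1c}$.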
In other words, if we observe such a non-identifiable case we have no means to recover the true parameters of the tripod decomposition.
Example 6. The triplet distribution
p = (2, 2, 2, 2, 2, 2, 2, 2)/16
yields complete independence of the leaves, ταβ = ταγ = τβγ = 0, i.e. case (ii) in Proposition 8 applies here. It is not too surprising that such a distribution yields an infinite number of solutions since the state at the root is completely undetermined.
Looking again at the cases listed above, we see that Xα is not only pairwise independent from (Xβ, Xγ) (induced by ταβ = ταγ = 0), but even completely independent. Then the multiple solutions come from the fact that we can place the root arbitrarily between β and γ.
The good news is that the non-identifiable cases form a small subset among all triplet distributions. In fact:
Proposition 9. Non-identifiable triplet distributions, i.e. distributions violating the conditions (5), form a Lebesgue zero set in the set of all possible triplet distributions.
This concludes our analysis of the tripod case. We identified the subsets of triplet distributions which are uniquely algebraically and tripod identifiable, and those which are decomposable but not identifiable, or not at all decomposable.
5. Extension to quartet trees
In this section we will explore the implications of extending the results for three taxa to four taxa. For this section we look at the quartet tree Q = (V, E) with
V = {ζ, ψ, α, β, γ, δ}, E = {(ζ, ψ), (ζ, α), (ζ, β), (ψ, γ), (ψ, δ)}.
Fig. A.3 provides an illustration including the four tripod restrictions $T_{\alpha\beta\gamma}$, $T_{\alpha\beta\delta}$, $T_{\alpha\gamma\delta}$ and $T_{\beta\gamma\delta}$. The two alternative quartet topologies can be described by leaf switches. E.g., the topology which groups αγ against βδ is retrieved from the above conventions by switching β and γ. Regard the quartet distribution $\pi = (\pi_{abcd})_{a,b,c,d\in\{0,1\}}$ describing the joint distribution for α, β, γ and δ. If π is identifiable and reversible then it can be reconstructed from the restrictions on these four tripods (Chang, 1996), i.e. computing the parameters for all tripods will immediately return the full process. However, the converse is not necessarily true. As Example 7 below shows, there are cases where each tripod restriction is identifiable but no quartet tree can be reconstructed.
[Figure 3 about here.]
Pearl and Tarsi (1986) presented an algorithm to reconstruct the topology for an arbitrary number of taxa. Their algorithm employs the condition that tripods which share an interior node in the (unknown) tree topology must result in the same marginal distribution at this interior node. Their approach yields an invariant, which for Q amounts to
$$ f_1(\pi) = \tau_{\alpha\delta}\tau_{\beta\gamma} - \tau_{\alpha\gamma}\tau_{\beta\delta}. \tag{10} $$
This invariant is related to the four-point-condition (e.g., Semple and Steel, 2003, p. 146) and thus topologically informative, i.e. it is particular to topology Q. If a distribution π is from another tree then $f_1(\pi) \neq 0$.
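To reconstruct the process parameters as well, more invariants are needed. In particular, for π to be identifiable on Q the parameters obtained from the tripod restrictions must satisfy the following properties:
1. The parameters for edges (ζ, α), (ζ, β) and $q^\zeta$ obtained from the triplet distributions on $T_{\alpha\beta\gamma}$ and $T_{\alpha\beta\delta}$, respectively, must be equal.
2. The parameters for edges (ψ, γ), (ψ, δ) and $q^\psi$ obtained from the triplet distributions on $T_{\alpha\gamma\delta}$ and $T_{\beta\gamma\delta}$, respectively, must be equal.
3. The parameters $M^\psi$ for the interior edge (ζ, ψ) are obtained from the equations
$$ \begin{aligned}
M^\gamma_{01}[T_{\alpha\beta\gamma}] &= (1 - M^\psi_{01})\, M^\gamma_{01}[T_{\alpha\gamma\delta}] + M^\psi_{01}\, M^\gamma_{11}[T_{\alpha\gamma\delta}], \\
M^\gamma_{11}[T_{\alpha\beta\gamma}] &= (1 - M^\psi_{11})\, M^\gamma_{01}[T_{\alpha\gamma\delta}] + M^\psi_{11}\, M^\gamma_{11}[T_{\alpha\gamma\delta}],
\end{aligned} \tag{11} $$
where $M^\gamma[T]$ denotes the parameters for leaf γ obtained from tripod T. These equations must hold equivalently when γ is replaced by δ and the left-hand parameters come from tripod $T_{\alpha\beta\delta}$ instead of $T_{\alpha\beta\gamma}$.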
These conditions imply further restrictions to π. An indicator for the minimal number of such conditions is the observation that a quartet distribution π has 15 degrees of freedom, but there are only 11 model parameters on Q, two for each edge and one for the root distribution. Thus we need at least four additional conditions or rather invariants. We will use the above observations to derive an equivalent set of invariants.
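Proposition 10. A quartet distribution π is algebraically identifiable on Q if it satisfies conditions (5) and the following invariants vanish in π:
$$ \begin{aligned}
f_0(\pi) &= \varepsilon_{\alpha\beta\gamma\delta}\tau_{\alpha\gamma} - \varepsilon_{\alpha\beta\gamma}\varepsilon_{\alpha\gamma\delta} + \varepsilon_\gamma\varepsilon_{\alpha\beta}\varepsilon_{\alpha\gamma\delta} + \varepsilon_\alpha\varepsilon_{\gamma\delta}\varepsilon_{\alpha\beta\gamma} - \varepsilon_{\alpha\beta}\varepsilon_{\alpha\gamma}\varepsilon_{\gamma\delta}, \\
f_1(\pi) &= \tau_{\alpha\delta}\tau_{\beta\gamma} - \tau_{\alpha\gamma}\tau_{\beta\delta}, \\
f_2(\pi) &= \tau_{\alpha\gamma}\tau_{\beta\gamma\delta} - \tau_{\beta\gamma}\tau_{\alpha\gamma\delta}, \\
f_3(\pi) &= \tau_{\alpha\gamma}\tau_{\alpha\beta\delta} - \tau_{\alpha\delta}\tau_{\alpha\beta\gamma}.
\end{aligned} $$
The parameters, unique up to state flip at the interior nodes, are then given by Theorem 5 and
$$ \begin{aligned}
M^\psi_{01} &= \frac{1}{2} + \frac{\tau_{\alpha\delta}\tau_{\alpha\beta\gamma} - \tau_{\alpha\beta}\tau_{\alpha\gamma\delta} - \tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}}}{2\tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}}}, \\
M^\psi_{11} &= \frac{1}{2} + \frac{\tau_{\alpha\delta}\tau_{\alpha\beta\gamma} - \tau_{\alpha\beta}\tau_{\alpha\gamma\delta} + \tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}}}{2\tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}}}.
\end{aligned} \tag{12} $$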
The existence of these invariants means that decomposable quartet distributions form a Lebesgue zero set in the set of all quartet distributions for the same reason that the non-identifiable sets are a Lebesgue zero set in the set of all decomposable distributions.
Invariant f1 comes from the equality of the marginal distributions at the interior nodes, as proposed by Pearl and Tarsi (1986). Invariants f2 and f3 come from the equality of edge transition matrices. Hence, distributions for which f1, f2 and f3 vanish will uniquely identify topology Q. Therefore, f1, f2 and f3 are topologically informative. However, only distributions for which f0 vanishes will be subject to the inferred parameters. In other words, in the set of zero points for f1, f2 and f3 there is a set of distributions which returns the same set of parameters for Q, but only for one of these distributions f0 vanishes. It would be interesting to investigate how this distribution relates to the set it projects from, e.g. if it is related to the possible maximum likelihood optimum.
Despite the fact that f1, f2 and f3 are sufficient to infer a topology, f0 is also topologically informative in that it will not vanish for distributions coming from another tree.
In the case of the CFN model, all triplet covariances vanish. Hence, only invariants f0 and f1 are of interest in that case. Therefore, either invariant is sufficient to identify the associated tree topology.
The parameters for the interior edge do not add more non-identifiable cases. However, as in the tripod case, further conditions are needed to guarantee quartet identifiability. Hence we get:
Proposition 11. A quartet distribution is quartet identifiable if and only if every triplet restriction satisfies both Theorem 6 and the following inequalities
$$ \tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}} - \tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}} \le \tau_{\alpha\beta}\tau_{\alpha\gamma\delta} - \tau_{\alpha\delta}\tau_{\alpha\beta\gamma} \le \tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}} - \tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}}. \tag{13} $$
All other relations are covered due to the fact that the quartet distribution π needs to satisfy the invariants f0, f1, f2 and f3. The following example provides a case in which reconstruction is not possible but which offers an interesting challenge.
Example 7. Chor et al. (2000) provided several examples of distributions with multiple maxima of the likelihood function. These examples relate to the CFN model, i.e., $p_{abcd} = p_{(1-a)(1-b)(1-c)(1-d)}$, so that the Hadamard approach can be used. Regard the symmetric distribution
$$ p = (14, 0, 0, 3, 0, 2, 1, 0, 0, 1, 2, 0, 3, 0, 0, 14)/40. \tag{14} $$
Retrieving the statistics yields:
ταβ = 7/40 = τγδ, ταγ = 3/20 = τβδ, ταδ = 1/8 = τβγ,
ταβγ = ταβδ = ταγδ = τβγδ = 0.
The last equality immediately shows that the above distribution will trivially satisfy invariants f2 and f3. However, we get f1 = −11/1600 and f0 = −23/375, i.e. our observations do not come from the quartet tree defined by the bipartition αβ|γδ. Looking at the alternative invariants for f1, i.e. at
$$ f_1^{\alpha\delta|\beta\gamma} = \tau_{\alpha\beta}\tau_{\gamma\delta} - \tau_{\alpha\gamma}\tau_{\beta\delta} = 13/1600, \qquad f_1^{\alpha\gamma|\beta\delta} = \tau_{\alpha\beta}\tau_{\gamma\delta} - \tau_{\alpha\delta}\tau_{\beta\gamma} = 3/200, $$
we see that this distribution comes from none of the available quartet trees.
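The three variants of f1 can be recomputed from the distribution (14) with exact fractions. This is a minimal sketch; the helper `cov` and the variable names are assumptions of the illustration.

```python
from fractions import Fraction
from itertools import product

STATES4 = list(product((0, 1), repeat=4))  # order: 0000, 0001, ..., 1111
counts = (14, 0, 0, 3, 0, 2, 1, 0, 0, 1, 2, 0, 3, 0, 0, 14)  # (14), times 40
pi = {s: Fraction(n, 40) for s, n in zip(STATES4, counts)}

def cov(i, j):
    """Pairwise covariance of leaves i and j (0=alpha, 1=beta, 2=gamma, 3=delta)."""
    e_i = sum(v for s, v in pi.items() if s[i] == 1)
    e_j = sum(v for s, v in pi.items() if s[j] == 1)
    e_ij = sum(v for s, v in pi.items() if s[i] == 1 and s[j] == 1)
    return e_ij - e_i * e_j

A, B, G, D = range(4)
f1_ab_gd = cov(A, D) * cov(B, G) - cov(A, G) * cov(B, D)  # topology ab|gd
f1_ad_bg = cov(A, B) * cov(G, D) - cov(A, G) * cov(B, D)  # topology ad|bg
f1_ag_bd = cov(A, B) * cov(G, D) - cov(A, D) * cov(B, G)  # topology ag|bd
```

None of the three values is zero, which is the numerical content of the statement that the distribution fits none of the three quartet topologies.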
Nevertheless, we shall have a look at the parameters. Note that the symmetry of the distribution p implies $M^\alpha_{01} = 1 - M^\alpha_{11} =: M_\alpha$. Looking at the numerical values for the parameters for every tripod tree we find surprising similarities:
[Table 1 about here.]
These parameters permit us to infer parameters $M_\zeta = 1/14$ and $M_\psi = 1/7$ such that e.g. the parameters for α on the tripod trees αβδ and αγδ can be obtained from the parameter for tripod tree αβγ by
$$ M_\alpha[T_{\alpha\beta\delta}] = M_\zeta (1 - M_\alpha[T_{\alpha\beta\gamma}]) + (1 - M_\zeta) M_\alpha[T_{\alpha\beta\gamma}], \qquad M_\alpha[T_{\alpha\gamma\delta}] = M_\psi (1 - M_\alpha[T_{\alpha\beta\gamma}]) + (1 - M_\psi) M_\alpha[T_{\alpha\beta\gamma}], $$
with analogous assignments for the other leaves. These computations can be visualized by the network in Fig. A.4. The assignment of probabilities for each split makes it possible to justify the observations for each of the four tripod trees. However, the visualization is misleading because the factorization of the system does not follow the edges in the network (e.g., Strimmer et al., 2001; Bryant, 2005).
[Figure 4 about here.]
Acknowledgements. We thank Elizabeth S. Allman and John A. Rhodes for stimulating
the finalization of the manuscript as well as for sharing their thoughts on this subject
with us. Further, we owe much to the discussions with David Bryant, Mike Steel, and
Arndt von Haeseler.
References

Elizabeth S. Allman and John A. Rhodes. Phylogenetic invariants for the general Markov model of sequence mutation. Mathematical Biosciences, 186(2):113–144, December 2003. URL http://dx.doi.org/10.1016/j.mbs.2003.08.004.

Elizabeth S. Allman and John A. Rhodes. Phylogenetic ideals and varieties for the general Markov model. Advances in Applied Mathematics, 40(2):127–148, 2008. URL http://dx.doi.org/10.1016/j.aam.2006.10.002.

Elizabeth S. Allman, Catherine Matias, and John A. Rhodes. Identifiability of parameters in latent structure models with many observed variables. The Annals of Statistics, 37(6A):3099–3132, 2009. URL http://dx.doi.org/10.1214/09-AOS689.

Ellen Baake. What can and what cannot be inferred from pairwise sequence comparisons? Mathematical Biosciences, 154(1):1–21, 1998. URL http://dx.doi.org/10.1016/S0025-5564(98)10044-5.

David Bryant. Extending tree models to splits networks. In Algebraic Statistics for Computational Biology, chapter 17, pages 320–332. Cambridge University Press, 2005. URL http://ebooks.cambridge.org/chapter.jsf?bid=CBO9780511610684&cid=CBO9780511610684A097.

David Bryant, Nicolas Galtier, and Marie-Anne Poursat. Likelihood calculation in molecular phylogenetics. In Olivier Gascuel, editor, Mathematics of Evolution and Phylogeny, chapter 2, pages 33–62. Oxford University Press, 2005.

James A. Cavender. Taxonomy with confidence. Mathematical Biosciences, 40:271–280, 1978. URL http://dx.doi.org/10.1016/0025-5564(78)90089-5.

James A. Cavender and Joseph Felsenstein. Invariants of phylogenies in a simple case with discrete states. Journal of Classification, 4:57–71, 1987. URL http://hdl.handle.net/10.1007/BF01890075.

Joseph T. Chang. Full reconstruction of Markov models on evolutionary trees: Identifiability and consistency. Mathematical Biosciences, 137:51–73, 1996. URL http://dx.doi.org/10.1016/S0025-5564(96)00075-2.

Benny Chor, Michael D. Hendy, Barbara R. Holland, and David Penny. Multiple maxima of likelihood in phylogenetic trees: An analytic approach. Molecular Biology and Evolution, 17(10):1529–1541, 2000. URL http://mbe.oxfordjournals.org/content/17/10/1529.full.

James S. Farris. A probability model for inferring evolutionary trees. Systematic Zoology, 22(3):250–256, 1973. URL http://www.jstor.org/stable/2412305.

Michael D. Hendy and David Penny. A framework for the quantitative study of evolutionary trees. Systematic Zoology, 38(4):297–309, 1989. URL http://dx.doi.org/10.2307/2992396.

Morris W. Hirsch. Differential Topology. Graduate Texts in Mathematics. Springer, New York, Heidelberg, Berlin, 1976. ISBN 0387901485.

James A. Lake. A rate-independent technique for analysis of nucleic acid sequences: evolutionary parsimony. Molecular Biology and Evolution, 4(2):167–191, 1987. URL http://mbe.oxfordjournals.org/content/4/2/167.abstract.

Steffen L. Lauritzen. Graphical Models. Oxford Statistical Science Series. Clarendon Press, Oxford, 1996. ISBN 0-19-852219-3. URL http://ukcatalogue.oup.com/product/9780198522195.do.

Paul F. Lazarsfeld and Neil W. Henry. Latent Structure Analysis. Houghton, Mifflin, New York, 1968.

Frederick A. Matsen. Fourier transform inequalities for phylogenetic trees. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 6(1):89–95, 2009. URL http://dx.doi.org/10.1109/TCBB.2008.68.

Jerzy Neyman. Molecular studies of evolution: A source of novel statistical problems. In S. dasGupta and J. Yackel, editors, Statistical Decision Theory and Related Topics, pages 1–27. Academic Press, New York, 1971.

Judea Pearl and Michael Tarsi. Structuring causal trees. Journal of Complexity, 2:60–77, 1986. URL http://dx.doi.org/10.1016/0885-064X(86)90023-3.

J. C. W. Rayner and Eric J. Beh. Towards a better understanding of correlation. Statistica Neerlandica, 63(3):324–333, 2009. URL http://dx.doi.org/10.1111/j.1467-9574.2009.00425.x.

Charles Semple and Mike Steel. Phylogenetics. Oxford Lecture Series in Mathematics and its Applications. Oxford University Press, 2003. ISBN 0-19-850942-1.

Korbinian Strimmer, Carsten Wiuf, and Vincent Moulton. Recombination analysis using directed graphical models. Molecular Biology and Evolution, 18(1):97–99, 2001. URL http://mbe.oxfordjournals.org/content/18/1/97.full.

Bernd Sturmfels and Seth Sullivant. Toric ideals of phylogenetic invariants. Journal of Computational Biology, 12(4):457–481, 2005. URL http://dx.doi.org/10.1089/cmb.2005.12.457.

Jeremy G. Sumner, Michael A. Charleston, Lars S. Jermiin, and Peter D. Jarvis. Markov invariants, plethysms, and phylogenetics. Journal of Theoretical Biology, 253:601–615, 2008. URL http://dx.doi.org/10.1016/j.jtbi.2008.04.001.

László Székely, Mike A. Steel, and Peter L. Erdős. Fourier calculus on evolutionary trees. Advances in Applied Mathematics, 14(2):200–210, 1993.

Piotr Zwiernik and Jim Q. Smith. Tree-cumulants and the identifiability of Bayesian tree model. arXiv:1004.4360v1, 2010. URL http://arxiv.org/abs/1004.4360.
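Appendix A. Proofs

Proof of Lemma 1. A state flip at leaf α yields a “new” distribution $\tilde{p}$ with
$\tilde{p}_{abc} = p_{(1-a)bc}$, $a, b, c \in \{0, 1\}$. This has the following implications for
the covariances:
\[
\begin{aligned}
\tilde{\tau}_{\alpha\beta}
&= \tilde{p}_{11\Sigma} - \tilde{p}_{1\Sigma\Sigma}\tilde{p}_{\Sigma 1\Sigma}
 = (\tilde{p}_{\Sigma 1\Sigma} - \tilde{p}_{01\Sigma}) - \tilde{p}_{1\Sigma\Sigma}\tilde{p}_{\Sigma 1\Sigma}\\
&= -\tilde{p}_{01\Sigma} + \tilde{p}_{\Sigma 1\Sigma}(1 - \tilde{p}_{1\Sigma\Sigma})
 = -(\tilde{p}_{01\Sigma} - \tilde{p}_{0\Sigma\Sigma}\tilde{p}_{\Sigma 1\Sigma})\\
&= -(p_{11\Sigma} - p_{1\Sigma\Sigma}p_{\Sigma 1\Sigma}) = -\tau_{\alpha\beta},
\end{aligned}
\]
and analogously $\tilde{\tau}_{\alpha\gamma} = -\tau_{\alpha\gamma}$ and $\tilde{\tau}_{\beta\gamma} = \tau_{\beta\gamma}$. Thus, if
$\tau_{\alpha\beta}$ and $\tau_{\alpha\gamma}$ are smaller than zero, then a state flip produces positive
covariances and the sign of the overall product remains the same.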
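The effect of a state flip on the pairwise covariances can be checked numerically. The sketch below uses a hypothetical triplet distribution whose weights are chosen only for illustration; the function names `covariance` and `flip_first_leaf` are likewise assumptions of this example.

```python
from fractions import Fraction
from itertools import product

def covariance(p, i, j):
    """Covariance of the 0/1 states at leaves i and j under distribution p."""
    e_i = sum(v for k, v in p.items() if k[i])
    e_j = sum(v for k, v in p.items() if k[j])
    e_ij = sum(v for k, v in p.items() if k[i] and k[j])
    return e_ij - e_i * e_j

def flip_first_leaf(p):
    """State flip at leaf alpha (index 0): the new p_abc equals the old p_(1-a)bc."""
    return {(1 - a, b, c): v for (a, b, c), v in p.items()}

# Hypothetical weights for p_000, p_001, ..., p_111 (illustration only).
weights = [5, 1, 2, 3, 1, 4, 2, 6]
p = {abc: Fraction(w, sum(weights))
     for abc, w in zip(product((0, 1), repeat=3), weights)}
p_flipped = flip_first_leaf(p)

for i, j in ((0, 1), (0, 2), (1, 2)):
    print((i, j), covariance(p, i, j), covariance(p_flipped, i, j))
```

The two covariances involving the flipped leaf change sign, while the covariance of the two untouched leaves is unchanged, so the sign of the product of all three is preserved.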
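Proof of Lemma 2. With the tripod equations we find the following dependencies
between covariances and model parameters:
\begin{align}
\tau_{\alpha\beta\gamma} &= (M^{\alpha}_{11} - M^{\alpha}_{01})(M^{\beta}_{11} - M^{\beta}_{01})(M^{\gamma}_{11} - M^{\gamma}_{01})\, q^{\zeta}_1(1 - q^{\zeta}_1)(1 - 2q^{\zeta}_1), \tag{A.1}\\
\tau_{\alpha\beta} &= (M^{\alpha}_{11} - M^{\alpha}_{01})(M^{\beta}_{11} - M^{\beta}_{01})\, q^{\zeta}_1(1 - q^{\zeta}_1), \tag{A.2}\\
\tau_{\alpha\beta|c} &= \frac{(M^{\alpha}_{11} - M^{\alpha}_{01})(M^{\beta}_{11} - M^{\beta}_{01})\, M^{\gamma}_{0c}M^{\gamma}_{1c}\, q^{\zeta}_1(1 - q^{\zeta}_1)}{\bigl((1 - q^{\zeta}_1)M^{\gamma}_{0c} + q^{\zeta}_1 M^{\gamma}_{1c}\bigr)^2}, \quad c \in \{0, 1\}, \tag{A.3}
\end{align}
with equivalent terms for the other covariances. Therefore, if $\tau_{\alpha\beta} = 0$ due to
$M^{\alpha}_{01} - M^{\alpha}_{11} = 0$, then also $\tau_{\alpha\gamma} = 0$ and $\tau_{\alpha\beta\gamma} = 0$. If $q^{\zeta}_1 \in \{0, 1\}$, then all
four covariances are zero.
For point 2 simply observe that all parameters need to be probabilities. Thus, if,
e.g., $M^{\alpha}_{01} - M^{\alpha}_{11} < 0$ but $M^{\beta}_{01} - M^{\beta}_{11} > 0$ and $M^{\gamma}_{01} - M^{\gamma}_{11} > 0$, then
$\tau_{\alpha\beta} < 0$, $\tau_{\alpha\gamma} < 0$, and $\tau_{\beta\gamma} > 0$, and hence $\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} > 0$. All other
cases are similar, and thus we are finished.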
Proof of Corollary 3. A triplet distribution $p$ for which only one covariance is zero does
not satisfy Lemma 2(1) and hence is not tripod decomposable. Further, by looking at
(A.2) we see that there is also no real- or complex-valued parameter set which would
yield only one zero covariance. Hence, such a triplet distribution would also not be
algebraically decomposable.
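Proof of Lemma 4. We insert the refined parameters into the tripod equations to get:
\[
\begin{aligned}
p_{abc} &= q^{\zeta}_1 M^{\alpha}_{1a}M^{\beta}_{1b}M^{\gamma}_{1c} + (1 - q^{\zeta}_1)M^{\alpha}_{0a}M^{\beta}_{0b}M^{\gamma}_{0c}\\
&= q^{\zeta}_0 M^{\alpha}_{0a}M^{\beta}_{0b}M^{\gamma}_{0c} + (1 - q^{\zeta}_0)M^{\alpha}_{1a}M^{\beta}_{1b}M^{\gamma}_{1c}\\
&= (1 - q^{\zeta}_1)M^{\alpha}_{0a}M^{\beta}_{0b}M^{\gamma}_{0c} + q^{\zeta}_1 M^{\alpha}_{1a}M^{\beta}_{1b}M^{\gamma}_{1c},
\end{aligned}
\]
i.e., the tripod equations are recovered with flipped parameters. This completes the
proof.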
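A short numerical illustration of this label symmetry, as a sketch under stated assumptions: the function `tripod` and the parameter values are hypothetical, and $M_{i1}$ is read as the probability of observing state 1 at a leaf given hidden state $i$ at ζ, which matches the form of the tripod equations above.

```python
from fractions import Fraction as F
from itertools import product

def tripod(q1, m0, m1):
    """Triplet distribution p_abc of the two-state tripod model.

    q1    : probability of hidden state 1 at the inner vertex zeta
    m0[i] : P(leaf i shows 1 | zeta = 0), i.e. M_01 for leaf i
    m1[i] : P(leaf i shows 1 | zeta = 1), i.e. M_11 for leaf i
    """
    def leaf(m, x):
        return m if x == 1 else 1 - m

    p = {}
    for abc in product((0, 1), repeat=3):
        term1, term0 = q1, 1 - q1
        for i, x in enumerate(abc):
            term1 *= leaf(m1[i], x)
            term0 *= leaf(m0[i], x)
        p[abc] = term1 + term0
    return p

# Hypothetical parameter values, chosen only for illustration.
q1 = F(3, 10)
m0 = [F(1, 10), F(1, 5), F(3, 20)]
m1 = [F(4, 5), F(7, 10), F(9, 10)]

original = tripod(q1, m0, m1)
relabelled = tripod(1 - q1, m1, m0)  # swap the two hidden states at zeta
print(original == relabelled)
```

Swapping the hidden states exchanges $q^{\zeta}_1$ with $1 - q^{\zeta}_1$ and the rows of each leaf matrix, and leaves the observable distribution unchanged.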
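Proof of Theorem 5. We derive the parameters from the triplet equations. The first
step is to rewrite the equations such that we get
\begin{align}
\varepsilon_{\alpha\beta\gamma} &= q^{\zeta}_1 M^{\alpha}_{11}M^{\beta}_{11}M^{\gamma}_{11} + (1 - q^{\zeta}_1)M^{\alpha}_{01}M^{\beta}_{01}M^{\gamma}_{01}, \tag{A.4}\\
\varepsilon_{\alpha\beta} &= q^{\zeta}_1 M^{\alpha}_{11}M^{\beta}_{11} + (1 - q^{\zeta}_1)M^{\alpha}_{01}M^{\beta}_{01}, \tag{A.5}\\
\varepsilon_{\alpha\gamma} &= q^{\zeta}_1 M^{\alpha}_{11}M^{\gamma}_{11} + (1 - q^{\zeta}_1)M^{\alpha}_{01}M^{\gamma}_{01}, \tag{A.6}\\
\varepsilon_{\beta\gamma} &= q^{\zeta}_1 M^{\beta}_{11}M^{\gamma}_{11} + (1 - q^{\zeta}_1)M^{\beta}_{01}M^{\gamma}_{01}, \tag{A.7}\\
\varepsilon_{\alpha} &= q^{\zeta}_1 M^{\alpha}_{11} + (1 - q^{\zeta}_1)M^{\alpha}_{01}, \tag{A.8}\\
\varepsilon_{\beta} &= q^{\zeta}_1 M^{\beta}_{11} + (1 - q^{\zeta}_1)M^{\beta}_{01}, \tag{A.9}\\
\varepsilon_{\gamma} &= q^{\zeta}_1 M^{\gamma}_{11} + (1 - q^{\zeta}_1)M^{\gamma}_{01}. \tag{A.10}
\end{align}
Equations (A.8)-(A.10) yield
\[
(1 - q^{\zeta}_1)M^{\alpha}_{01} = \varepsilon_{\alpha} - q^{\zeta}_1 M^{\alpha}_{11}, \quad
(1 - q^{\zeta}_1)M^{\beta}_{01} = \varepsilon_{\beta} - q^{\zeta}_1 M^{\beta}_{11}, \quad
(1 - q^{\zeta}_1)M^{\gamma}_{01} = \varepsilon_{\gamma} - q^{\zeta}_1 M^{\gamma}_{11}. \tag{A.11}
\]
Inserting (A.11) into (A.5) returns
\[
\begin{aligned}
(1 - q^{\zeta}_1)\varepsilon_{\alpha\beta}
&= q^{\zeta}_1(1 - q^{\zeta}_1)M^{\alpha}_{11}M^{\beta}_{11} + (\varepsilon_{\alpha} - q^{\zeta}_1 M^{\alpha}_{11})(\varepsilon_{\beta} - q^{\zeta}_1 M^{\beta}_{11})\\
&= q^{\zeta}_1 M^{\alpha}_{11}M^{\beta}_{11} + \varepsilon_{\alpha}\varepsilon_{\beta} - q^{\zeta}_1(\varepsilon_{\alpha}M^{\beta}_{11} + \varepsilon_{\beta}M^{\alpha}_{11}),
\end{aligned}
\]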
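and in consequence
\begin{align}
q^{\zeta}_1 M^{\beta}_{11}(M^{\alpha}_{11} - \varepsilon_{\alpha}) &= \tau_{\alpha\beta} + q^{\zeta}_1(\varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta}), \tag{A.12}\\
q^{\zeta}_1 M^{\gamma}_{11}(M^{\alpha}_{11} - \varepsilon_{\alpha}) &= \tau_{\alpha\gamma} + q^{\zeta}_1(\varepsilon_{\gamma}M^{\alpha}_{11} - \varepsilon_{\alpha\gamma}). \tag{A.13}
\end{align}
We insert (A.12)-(A.13) back into (A.11):
\[
\begin{aligned}
(1 - q^{\zeta}_1)M^{\beta}_{01}(M^{\alpha}_{11} - \varepsilon_{\alpha})
&= \varepsilon_{\beta}(M^{\alpha}_{11} - \varepsilon_{\alpha}) - \tau_{\alpha\beta} - q^{\zeta}_1(\varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta})\\
&= (1 - q^{\zeta}_1)(\varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta}).
\end{aligned}
\]
In the case of $q^{\zeta}_1 = 1$ we get from (A.8) and (A.5) that $M^{\alpha}_{11} = \varepsilon_{\alpha}$ and
$\varepsilon_{\alpha\beta} = \varepsilon_{\alpha}\varepsilon_{\beta}$. Hence, we can cancel the factor $1 - q^{\zeta}_1$ from the above equation
without destroying equality. Thus, we get
\begin{align}
M^{\beta}_{01}(M^{\alpha}_{11} - \varepsilon_{\alpha}) &= \varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta}, \tag{A.14}\\
M^{\gamma}_{01}(M^{\alpha}_{11} - \varepsilon_{\alpha}) &= \varepsilon_{\gamma}M^{\alpha}_{11} - \varepsilon_{\alpha\gamma}. \tag{A.15}
\end{align}
We insert (A.11) in (A.4) to get
\[
M^{\alpha}_{11}\varepsilon_{\beta\gamma} - \varepsilon_{\alpha\beta\gamma} = M^{\beta}_{01}M^{\gamma}_{01}(M^{\alpha}_{11} - \varepsilon_{\alpha}).
\]
Applying (A.14) and (A.15) to this gives us
\[
\begin{aligned}
0 &= (M^{\alpha}_{11}\varepsilon_{\beta\gamma} - \varepsilon_{\alpha\beta\gamma})(M^{\alpha}_{11} - \varepsilon_{\alpha}) - (\varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta})(\varepsilon_{\gamma}M^{\alpha}_{11} - \varepsilon_{\alpha\gamma})\\
&= (M^{\alpha}_{11})^2\tau_{\beta\gamma} - M^{\alpha}_{11}(\tau_{\alpha\beta\gamma} + 2\varepsilon_{\alpha}\tau_{\beta\gamma}) + \varepsilon_{\alpha\beta\gamma}\varepsilon_{\alpha} - \varepsilon_{\alpha\beta}\varepsilon_{\alpha\gamma}.
\end{aligned}
\]
We can apply the solution formula for quadratic equations provided $\tau_{\beta\gamma} \neq 0$, i.e., our
condition (5) is satisfied. In that case we get
\begin{align}
(M^{\alpha}_{11})_{\pm}
&= \frac{\tau_{\alpha\beta\gamma} + 2\varepsilon_{\alpha}\tau_{\beta\gamma}}{2\tau_{\beta\gamma}}
 \pm \frac{\sqrt{(\tau_{\alpha\beta\gamma} + 2\varepsilon_{\alpha}\tau_{\beta\gamma})^2 - 4(\varepsilon_{\alpha\beta\gamma}\varepsilon_{\alpha} - \varepsilon_{\alpha\beta}\varepsilon_{\alpha\gamma})\tau_{\beta\gamma}}}{2\tau_{\beta\gamma}} \notag\\
&= \varepsilon_{\alpha} + \frac{\tau_{\alpha\beta\gamma} \pm \sqrt{\tau_{\alpha\beta\gamma}^2 + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}}}{2\tau_{\beta\gamma}}. \tag{A.16}
\end{align}
Thus we have established the term for $M^{\alpha}_{11}$. The next step is to derive $q^{\zeta}_1$. We insert
(A.12)-(A.15) into (A.7) and get
\[
\begin{aligned}
q^{\zeta}_1(M^{\alpha}_{11} - \varepsilon_{\alpha})^2\varepsilon_{\beta\gamma}
&= \bigl(\tau_{\alpha\beta} + q^{\zeta}_1(\varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta})\bigr)\bigl(\tau_{\alpha\gamma} + q^{\zeta}_1(\varepsilon_{\gamma}M^{\alpha}_{11} - \varepsilon_{\alpha\gamma})\bigr)\\
&\quad + q^{\zeta}_1(1 - q^{\zeta}_1)(\varepsilon_{\beta}M^{\alpha}_{11} - \varepsilon_{\alpha\beta})(\varepsilon_{\gamma}M^{\alpha}_{11} - \varepsilon_{\alpha\gamma})\\
&= (1 - q^{\zeta}_1)\tau_{\alpha\beta}\tau_{\alpha\gamma} + q^{\zeta}_1\varepsilon_{\beta}\varepsilon_{\gamma}(M^{\alpha}_{11} - \varepsilon_{\alpha})^2
\end{aligned}
\]
and hence we get the quadratic relation
\[
0 = (1 - q^{\zeta}_1)\tau_{\alpha\beta}\tau_{\alpha\gamma} - q^{\zeta}_1\tau_{\beta\gamma}(M^{\alpha}_{11} - \varepsilon_{\alpha})^2. \tag{A.17}
\]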
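We insert (A.16) and get
\[
\begin{aligned}
\tau_{\alpha\beta}\tau_{\alpha\gamma} &= q^{\zeta}_1\bigl(\tau_{\alpha\beta}\tau_{\alpha\gamma} + \tau_{\beta\gamma}(M^{\alpha}_{11} - \varepsilon_{\alpha})^2\bigr),\\
4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} &= q^{\zeta}_1\bigl(4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} + (\tau_{\alpha\beta\gamma} + \sqrt{\chi})^2\bigr),\\
4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} &= 2q^{\zeta}_1\sqrt{\chi}\,(\sqrt{\chi} + \tau_{\alpha\beta\gamma}).
\end{aligned}
\]
We use the equality
\[
4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} = \chi - \tau_{\alpha\beta\gamma}^2 = (\sqrt{\chi} + \tau_{\alpha\beta\gamma})(\sqrt{\chi} - \tau_{\alpha\beta\gamma})
\]
and the observation that $\sqrt{\chi}(\sqrt{\chi} - \tau_{\alpha\beta\gamma}) = 0$ if and only if the conditions in (5) are
violated to get
\[
q^{\zeta}_1 = \frac{\sqrt{\chi} - \tau_{\alpha\beta\gamma}}{2\sqrt{\chi}} = \frac{1}{2} - \frac{\tau_{\alpha\beta\gamma}}{2\sqrt{\chi}}, \tag{A.18}
\]
thus inferring the proposed term for $q^{\zeta}_1$. Next we infer the term for $M^{\alpha}_{01}$. To this end we
insert (A.16) and (A.18) into (A.11):
\[
\begin{aligned}
-q^{\zeta}_1(M^{\alpha}_{11} - \varepsilon_{\alpha}) &= (1 - q^{\zeta}_1)(M^{\alpha}_{01} - \varepsilon_{\alpha}),\\
(\tau_{\alpha\beta\gamma} - \sqrt{\chi})(\tau_{\alpha\beta\gamma} + \sqrt{\chi}) &= 2\tau_{\beta\gamma}(\tau_{\alpha\beta\gamma} + \sqrt{\chi})(M^{\alpha}_{01} - \varepsilon_{\alpha}),\\
M^{\alpha}_{01} &= \varepsilon_{\alpha} + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\beta\gamma}},
\end{aligned}
\]
thus inferring the proposed term. The remaining terms are inferred analogously. This
completes the proof.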
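The closed-form expressions (A.16) and (A.18), together with the term for $M^{\alpha}_{01}$, can be tested by a round trip: generate a triplet distribution from known parameters and recover them from the moments. This is a minimal sketch, assuming the "+" root in (A.16) belongs to $M_{11}$ (appropriate when all pairwise covariances are positive) and that $M_{i1}$ is the probability of state 1 at a leaf given hidden state $i$. All function names and parameter values are hypothetical.

```python
from itertools import product
from math import isclose, sqrt

def tripod_distribution(q1, m01, m11):
    """p_abc = q1 * prod_i M_{1 x_i} + (1 - q1) * prod_i M_{0 x_i}."""
    p = {}
    for abc in product((0, 1), repeat=3):
        w1, w0 = q1, 1.0 - q1
        for i, x in enumerate(abc):
            w1 *= m11[i] if x else 1.0 - m11[i]
            w0 *= m01[i] if x else 1.0 - m01[i]
        p[abc] = w1 + w0
    return p

def recover_parameters(p):
    """Recover (q1, m01, m11) from a triplet distribution via (A.16) and (A.18)."""
    eps = [sum(v for k, v in p.items() if k[i]) for i in range(3)]

    def pair(i, j):
        return sum(v for k, v in p.items() if k[i] and k[j])

    pairs = ((0, 1), (0, 2), (1, 2))
    tau = {ij: pair(*ij) - eps[ij[0]] * eps[ij[1]] for ij in pairs}
    # Triple covariance as third central moment.
    tau3 = (p[(1, 1, 1)] - eps[0] * pair(1, 2) - eps[1] * pair(0, 2)
            - eps[2] * pair(0, 1) + 2 * eps[0] * eps[1] * eps[2])
    root = sqrt(tau3 ** 2 + 4 * tau[(0, 1)] * tau[(0, 2)] * tau[(1, 2)])

    q1 = 0.5 - tau3 / (2 * root)                      # (A.18)
    opposite = {0: (1, 2), 1: (0, 2), 2: (0, 1)}      # covariance of the other two leaves
    m11 = [eps[i] + (tau3 + root) / (2 * tau[opposite[i]]) for i in range(3)]  # (A.16), "+" root
    m01 = [eps[i] + (tau3 - root) / (2 * tau[opposite[i]]) for i in range(3)]
    return q1, m01, m11

# Hypothetical parameters with positive pairwise covariances.
true_q1, true_m01, true_m11 = 0.3, [0.1, 0.2, 0.15], [0.8, 0.7, 0.9]
q1, m01, m11 = recover_parameters(tripod_distribution(true_q1, true_m01, true_m11))
print(isclose(q1, true_q1, abs_tol=1e-9))
print(all(isclose(a, b, abs_tol=1e-9) for a, b in zip(m01 + m11, true_m01 + true_m11)))
```

The sketch does not guard against $\chi = 0$ or vanishing covariances, which are exactly the cases excluded by condition (5).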
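Proof of Theorem 6. We bound the parameters from (6) between 0 and 1:
\[
\begin{aligned}
0 &\le \frac{1}{2} - \frac{\tau_{\alpha\beta\gamma}}{2\sqrt{\chi}} \le 1,\\
-\sqrt{\chi} &\le \tau_{\alpha\beta\gamma} \le \sqrt{\chi},\\
0 &\le \tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}.
\end{aligned}
\]
With (5) this yields positivity for the unconditional covariances. Next we look at $M^{\alpha}_{01}$
and $M^{\alpha}_{11}$:
\[
\begin{aligned}
0 &\le \varepsilon_{\alpha} + \frac{\tau_{\alpha\beta\gamma} - \sqrt{\chi}}{2\tau_{\beta\gamma}} \le 1,\\
-2\varepsilon_{\alpha}\tau_{\beta\gamma} &\le \tau_{\alpha\beta\gamma} - \sqrt{\chi} \le 2(1 - \varepsilon_{\alpha})\tau_{\beta\gamma},\\
\tau_{\alpha\beta\gamma} - 2(1 - \varepsilon_{\alpha})\tau_{\beta\gamma} &\le \sqrt{\chi} \le \tau_{\alpha\beta\gamma} + 2\varepsilon_{\alpha}\tau_{\beta\gamma}
\end{aligned}
\]
and
\[
\begin{aligned}
0 &\le \varepsilon_{\alpha} + \frac{\tau_{\alpha\beta\gamma} + \sqrt{\chi}}{2\tau_{\beta\gamma}} \le 1,\\
-2\varepsilon_{\alpha}\tau_{\beta\gamma} &\le \tau_{\alpha\beta\gamma} + \sqrt{\chi} \le 2(1 - \varepsilon_{\alpha})\tau_{\beta\gamma},\\
-(2\varepsilon_{\alpha}\tau_{\beta\gamma} + \tau_{\alpha\beta\gamma}) &\le \sqrt{\chi} \le 2(1 - \varepsilon_{\alpha})\tau_{\beta\gamma} - \tau_{\alpha\beta\gamma}.
\end{aligned}
\]
Squaring both inequalities reduces the four inequalities to the following two:
\begin{align}
\tau_{\alpha\beta\gamma}^2 + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} &\le (2\varepsilon_{\alpha}\tau_{\beta\gamma} + \tau_{\alpha\beta\gamma})^2, \tag{A.19}\\
\tau_{\alpha\beta\gamma}^2 + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} &\le (2(1 - \varepsilon_{\alpha})\tau_{\beta\gamma} - \tau_{\alpha\beta\gamma})^2. \tag{A.20}
\end{align}
We look first at inequality (A.19) and get
\[
\begin{aligned}
\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} &\le \varepsilon_{\alpha}^2\tau_{\beta\gamma}^2 + \varepsilon_{\alpha}\tau_{\beta\gamma}\tau_{\alpha\beta\gamma},\\
0 &\le \varepsilon_{\alpha}(\varepsilon_{\alpha}\tau_{\beta\gamma} + \tau_{\alpha\beta\gamma}) - \tau_{\alpha\beta}\tau_{\alpha\gamma},\\
0 &\le \varepsilon_{\alpha}\varepsilon_{\alpha\beta\gamma} - \varepsilon_{\alpha\beta}\varepsilon_{\alpha\gamma} = \tau_{\beta\gamma|1}.
\end{aligned}
\]
Set $\bar{\varepsilon}_{\alpha} := (1 - \varepsilon_{\alpha}) = p_{0\Sigma\Sigma}$ and look at (A.20):
\[
\begin{aligned}
\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} &\le \bar{\varepsilon}_{\alpha}^2\tau_{\beta\gamma}^2 - \bar{\varepsilon}_{\alpha}\tau_{\beta\gamma}\tau_{\alpha\beta\gamma},\\
0 &\le \bar{\varepsilon}_{\alpha}(\bar{\varepsilon}_{\alpha}\tau_{\beta\gamma} - \tau_{\alpha\beta\gamma}) - \tau_{\alpha\beta}\tau_{\alpha\gamma},\\
0 &\le p_{000}p_{011} - p_{001}p_{010} = \tau_{\beta\gamma|0}.
\end{aligned}
\]
Hence, we have derived the proposed inequalities.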
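Proof of Proposition 7. The tripod equations (4) imply:
\[
\chi = \tau_{\alpha\beta\gamma}^2 + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}
= (M^{\alpha}_{11} - M^{\alpha}_{01})^2(M^{\beta}_{11} - M^{\beta}_{01})^2(M^{\gamma}_{11} - M^{\gamma}_{01})^2(1 - q^{\zeta}_1)^2(q^{\zeta}_1)^2.
\]
Together with (A.1) and (A.2) we see that there is no set of real or complex parameters
such that $\chi = 0$ but $\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} \neq 0$.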
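The factorization of $\chi$ follows from (A.1) and (A.2) in a few lines. The computation can be spelled out as follows, where $\Delta$ and $q$ are shorthands introduced here only for brevity and are not notation from the original text:

```latex
% Shorthand (assumption of this sketch):
%   \Delta := (M^{\alpha}_{11}-M^{\alpha}_{01})(M^{\beta}_{11}-M^{\beta}_{01})(M^{\gamma}_{11}-M^{\gamma}_{01}),
%   q := q^{\zeta}_1.
\begin{align*}
\tau_{\alpha\beta\gamma}^2 &= \Delta^2 q^2 (1-q)^2 (1-2q)^2
  && \text{by (A.1)},\\
\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma} &= \Delta^2 q^3 (1-q)^3
  && \text{by (A.2) and its analogues},\\
\chi &= \Delta^2 q^2 (1-q)^2 \bigl[(1-2q)^2 + 4q(1-q)\bigr]
      = \Delta^2 q^2 (1-q)^2 .
\end{align*}
```

Since $(1-2q)^2 + 4q(1-q) = 1$, $\chi = 0$ forces $\Delta = 0$ or $q \in \{0, 1\}$, and in either case the product $\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}$ vanishes as well.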
Proof of Proposition 8. The cases are easily verified by looking at Equation (A.2) and
inserting the selected parameters back into (4).
Proof of Proposition 9. The function $\chi : \mathbb{C}^8 \to \mathbb{C}$ is a nonconstant polynomial mapping.
Thus the set $\{p \in \mathbb{R}^8 : \chi(p) = 0\}$ is a Lebesgue null set. The same holds for the set
\[
\{p \in \mathbb{R}^8 : \tau_{\alpha\beta}(p) = 0 \text{ or } \tau_{\alpha\gamma}(p) = 0 \text{ or } \tau_{\beta\gamma}(p) = 0\}.
\]
This completes the proof.
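Proof of Proposition 10. We recover $M^{\psi}$ by inserting the parameters from (6) into
(11). To infer the invariants we first look at the equality conditions. We do this
representatively by looking at $M^{\alpha} = \hat{M}^{\alpha}$, where the hat marks a parameter obtained
from a second tripod restriction (here αβδ). In particular we look at
\[
M^{\alpha}_{11} - M^{\alpha}_{01} = \hat{M}^{\alpha}_{11} - \hat{M}^{\alpha}_{01}, \qquad
M^{\alpha}_{11} + M^{\alpha}_{01} = \hat{M}^{\alpha}_{11} + \hat{M}^{\alpha}_{01},
\]
and thus
\[
\begin{aligned}
\frac{\sqrt{\chi_{\alpha\beta\gamma}}}{\tau_{\beta\gamma}} &= \frac{\sqrt{\chi_{\alpha\beta\delta}}}{\tau_{\beta\delta}}, &
\frac{\tau_{\alpha\beta\gamma}}{\tau_{\beta\gamma}} &= \frac{\tau_{\alpha\beta\delta}}{\tau_{\beta\delta}},\\
\frac{\tau_{\alpha\beta\gamma}^2 + 4\tau_{\alpha\beta}\tau_{\alpha\gamma}\tau_{\beta\gamma}}{\tau_{\beta\gamma}^2} &= \frac{\tau_{\alpha\beta\delta}^2 + 4\tau_{\alpha\beta}\tau_{\alpha\delta}\tau_{\beta\delta}}{\tau_{\beta\delta}^2}, &
\frac{\tau_{\alpha\beta\gamma}}{\tau_{\beta\gamma}} &= \frac{\tau_{\alpha\beta\delta}}{\tau_{\beta\delta}},\\
\frac{\tau_{\alpha\gamma}}{\tau_{\beta\gamma}} &= \frac{\tau_{\alpha\delta}}{\tau_{\beta\delta}}, &
\frac{\tau_{\alpha\beta\gamma}}{\tau_{\beta\gamma}} &= \frac{\tau_{\alpha\beta\delta}}{\tau_{\beta\delta}},
\end{aligned}
\]
\[
\frac{\tau_{\alpha\beta\gamma}}{\tau_{\alpha\beta\delta}} = \frac{\tau_{\beta\gamma}}{\tau_{\beta\delta}} = \frac{\tau_{\alpha\gamma}}{\tau_{\alpha\delta}}.
\]
Looking at $M^{\beta} = \hat{M}^{\beta}$ yields the same equalities. Reproducing the calculations for
$M^{\gamma} = \hat{M}^{\gamma}$ yields the invariants $f_1$ to $f_3$.
The equation system for quartets can be rewritten such that we have equations in
$\varepsilon_{\alpha\beta\gamma\delta}$, $\varepsilon_{\alpha\beta\gamma}$, $\varepsilon_{\alpha\beta\delta}$, $\varepsilon_{\alpha\gamma\delta}$, $\varepsilon_{\beta\gamma\delta}$, $\varepsilon_{\alpha\beta}$, $\varepsilon_{\alpha\gamma}$, $\varepsilon_{\alpha\delta}$, $\varepsilon_{\beta\gamma}$, $\varepsilon_{\beta\delta}$, $\varepsilon_{\gamma\delta}$,
$\varepsilon_{\alpha}$, $\varepsilon_{\beta}$, $\varepsilon_{\gamma}$ and $\varepsilon_{\delta}$. Apart from the first term, the equations for all terms
have been handled in the tripod cases. Hence, all that remains is to insert the
parameters obtained in (6) and (12) into the equation
\[
\begin{aligned}
\varepsilon_{\alpha\beta\gamma\delta}
&= (1 - q^{\zeta}_1)M^{\alpha}_{01}M^{\beta}_{01}\bigl((1 - M^{\psi}_{01})M^{\gamma}_{01}M^{\delta}_{01} + M^{\psi}_{01}M^{\gamma}_{11}M^{\delta}_{11}\bigr)\\
&\quad + q^{\zeta}_1 M^{\alpha}_{11}M^{\beta}_{11}\bigl((1 - M^{\psi}_{11})M^{\gamma}_{01}M^{\delta}_{01} + M^{\psi}_{11}M^{\gamma}_{11}M^{\delta}_{11}\bigr).
\end{aligned}
\]
Reordering and restructuring this equation eventually yields invariant $f_0$. This
completes the proof.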
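Proof of Proposition 11. The conditions under which the parameters from (6) are
probabilities are covered by Theorem 6.
Thus, for the remaining condition we have to bound (12) between 0 and 1, and
use that the covariances are always positive by Lemma 2(1):
\[
\begin{aligned}
-1 \le \frac{\tau_{\alpha\delta}\tau_{\alpha\beta\gamma} - \tau_{\alpha\beta}\tau_{\alpha\gamma\delta} - \tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}}}{\tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}}} &\le 1,\\
\tau_{\alpha\beta}(\tau_{\alpha\gamma\delta} - \sqrt{\chi_{\alpha\gamma\delta}}) \le \tau_{\alpha\delta}(\tau_{\alpha\beta\gamma} - \sqrt{\chi_{\alpha\beta\gamma}}) &\le \tau_{\alpha\beta}(\tau_{\alpha\gamma\delta} + \sqrt{\chi_{\alpha\gamma\delta}}),\\
-1 \le \frac{\tau_{\alpha\delta}\tau_{\alpha\beta\gamma} - \tau_{\alpha\beta}\tau_{\alpha\gamma\delta} + \tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}}}{\tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}}} &\le 1,\\
\tau_{\alpha\beta}(\tau_{\alpha\gamma\delta} - \sqrt{\chi_{\alpha\gamma\delta}}) \le \tau_{\alpha\delta}(\tau_{\alpha\beta\gamma} + \sqrt{\chi_{\alpha\beta\gamma}}) &\le \tau_{\alpha\beta}(\tau_{\alpha\gamma\delta} + \sqrt{\chi_{\alpha\gamma\delta}}).
\end{aligned}
\]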
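That the four bounds above collapse to the two inequalities in (13) can be checked on a grid of rational test values. The following is a sketch under stated assumptions: the arguments `s_abg` and `s_agd` stand for $\sqrt{\chi_{\alpha\beta\gamma}}$ and $\sqrt{\chi_{\alpha\gamma\delta}}$ and are treated as free inputs, with $\tau_{\alpha\beta} > 0$, $\tau_{\alpha\delta} \ge 0$, `s_agd > 0` and `s_abg >= 0`. The function names and the test grid are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def ratio_bounds_hold(t_ab, t_ad, t_abg, t_agd, s_abg, s_agd):
    """Both ratios from the proof lie in [-1, 1] (requires t_ab * s_agd > 0)."""
    denom = t_ab * s_agd
    minus = (t_ad * t_abg - t_ab * t_agd - t_ad * s_abg) / denom
    plus = (t_ad * t_abg - t_ab * t_agd + t_ad * s_abg) / denom
    return -1 <= minus <= 1 and -1 <= plus <= 1

def inequality_13_holds(t_ab, t_ad, t_abg, t_agd, s_abg, s_agd):
    """Two-sided inequality (13) in the same variables."""
    middle = t_ab * t_agd - t_ad * t_abg
    bound = t_ab * s_agd - t_ad * s_abg
    return -bound <= middle <= bound

# Illustrative grid of exact rational values (not taken from the paper).
pair_values = (F(1, 8), F(1, 4))
triple_values = (F(-1, 10), F(0), F(1, 10))
root_values = (F(1, 20), F(1, 5))

grid = list(product(pair_values, pair_values, triple_values,
                    triple_values, root_values, root_values))
agree = all(ratio_bounds_hold(*v) == inequality_13_holds(*v) for v in grid)
print(agree, sum(inequality_13_holds(*v) for v in grid), len(grid))
```

With positive $\tau_{\alpha\beta}\sqrt{\chi_{\alpha\gamma\delta}}$ and nonnegative $\tau_{\alpha\delta}\sqrt{\chi_{\alpha\beta\gamma}}$, the upper bound of the first ratio and the lower bound of the second are the binding ones, which is why the two predicates agree on every grid point.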
List of Figures
A.1 A binary tree with six leaves. Gray lines and vertices describe the hidden part of the process.
A.2 The tripod tree T.
A.3 The quartet tree Q with its four tripod restrictions. Again, gray lines and vertices indicate the hidden or unknown variables of the approach presented here.
A.4 Assignment of mutation probability from the symmetric distribution in (14). The black lines indicate the triplet αβγ. Assigned branch lengths are rounded values.
Figure A.1: A binary tree with six leaves. Gray lines and vertices describe the hidden part of the process.
Figure A.3: The quartet tree Q with its four tripod restrictions. Again, gray lines and vertices indicate the hidden or unknown variables of the approach presented here.
Figure A.4: Assignment of mutation probability from the symmetric distribution in (14). The black lines indicate the triplet αβγ. Assigned branch lengths are rounded values. Branch labels in the figure: 0.042 (four times), 0.143, 0.071.
List of Tables
A.1 The parameters for each triplet.