Fixation Maximization in the Positional Moran Process



Joachim Brendborg, 1 Panagiotis Karras, 1 Andreas Pavlogiannis, 1

Asger Ullersted Rasmussen, 1 Josef Tkadlec 2

1 Aarhus University, Aabogade 34, Aarhus, Denmark; 2 Harvard University, 1 Oxford Street, Cambridge, USA

[email protected], [email protected], [email protected]

Abstract

The Moran process is a classic stochastic process that models spread dynamics on graphs. A single “mutant” (e.g., a new opinion, strain, social trait, etc.) invades a population of “residents” scattered over the nodes of a graph. The mutant fitness advantage δ ≥ 0 determines how aggressively mutants propagate to their neighbors. The quantity of interest is the fixation probability, i.e., the probability that the initial mutant eventually takes over the whole population. However, in realistic settings, the invading mutant has an advantage only in certain locations. E.g., the ability to metabolize a certain sugar is an advantageous trait to bacteria only when the sugar is actually present in their surroundings. In this paper, we introduce the positional Moran process, a natural generalization in which the mutant fitness advantage is only realized on specific nodes called active nodes, and study the problem of fixation maximization: given a budget k, choose a set of k active nodes that maximizes the fixation probability of the invading mutant. We show that the problem is NP-hard, while the optimization function is not submodular, indicating strong computational hardness. We then focus on two natural limits: in the limit of δ → ∞ (strong selection), although the problem remains NP-hard, the optimization function becomes submodular and thus admits a constant-factor greedy approximation; in the limit of δ → 0 (weak selection), we show that we can obtain a tight approximation in O(m^ω) time, where m is the number of edges and ω is the matrix-multiplication exponent. An experimental evaluation of the new algorithms along with some proposed heuristics corroborates our results.

1 Introduction

Several real-world phenomena are described in terms of an invasion process that occurs in a network and follows some propagation model. The problems considered in such settings aim to optimize the success of the invasion, quantified by some metric. For example, social media campaigns serving purposes of marketing, political advocacy, and social welfare are modeled in terms of influence spread, whereby the goal is to maximize the expected influence spread of a message as a function of a set of initial adopters (Kempe, Kleinberg, and Tardos 2003; Tang, Xiao, and Shi 2014; Borgs et al. 2014; Tang, Shi, and Xiao 2015; Zhang et al. 2020); see (Li et al. 2018) for a survey. Similarly, in rumor propagation (Demers et al. 1987), the goal is to minimize the number of rounds until a rumor spreads to the whole population (Fountoulakis, Panagiotou, and Sauerwald 2012).

Copyright © 2022, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

A related propagation model is that of the Moran process, introduced as a model of genetic evolution (Moran 1958), and the variants thereof, such as the discrete Voter model (Clifford and Sudbury 1973; Liggett 1985; Antal, Redner, and Sood 2006; Talamali et al. 2021). A single mutant (e.g., a new opinion, strain, social trait, etc.) invades a population of residents scattered across the nodes of the graph. Contrariwise to influence and rumor propagation models, the Moran process accounts for active resistance to the invasion: a node carrying the mutant trait may not only forward it to its neighbors, but also lose it by receiving the resident trait from its neighbors. This model exemplifies settings in which individuals may switch opinions several times, or in which sub-species compete in a gene pool. A key parameter in this process is the mutant fitness advantage, a real number δ ≥ 0 that expresses the intensity with which mutants propagate to their neighbors compared to residents. Large δ favors mutants, while δ = 0 renders mutants and residents indistinguishable.

Success in the Moran process is quantified in terms of the fixation probability, i.e., the probability that a randomly occurring mutation takes over the whole population. As the focal model in evolutionary graph theory (Lieberman, Hauert, and Nowak 2005), the process has spawned extensive work on identifying amplifiers, i.e., graph structures that enhance the fixation probability (Monk, Green, and Paulin 2014; Giakkoupis 2016; Galanis et al. 2017; Pavlogiannis et al. 2018; Tkadlec et al. 2021). Still, in the real world, the probability that a mutation spreads to its neighborhood is often affected by its position: the ability to metabolise a certain sugar is advantageous to bacteria only when that sugar is present in their surroundings; likewise, people are likely to spread a viewpoint more enthusiastically if it is supported by experience from their local environment (country, city, neighborhood). By default, the Moran process does not account for such positional effects. While optimization questions have been studied in the Moran and related models (Even-Dar and Shapira 2007), the natural setting of positional advantage remains unexplored. In this setting, the question arises: to which positions (i.e., nodes) should we confer an advantage so as to maximize the fixation probability?

PRELIMINARY PREPRINT VERSION: DO NOT CITE. The AAAI Digital Library will contain the published version some time after the conference.

Our contributions. We define a positional variant of the Moran process, in which the mutant fitness advantage is only realized on a subset of nodes called active nodes, and an associated optimization problem, fixation maximization (FM): given a graph G, a mutant fitness advantage δ and a budget k, choose a set of k nodes to activate that maximizes the fixation probability fp(GS, δ) of mutants. We also study the problems FM∞ and FM0 that correspond to FM at the natural limits δ → ∞ and δ → 0, respectively. On the negative side, we show that FM and FM∞ are NP-hard (Section 3.1). One common way to circumvent NP-hardness is to prove that the optimization function is submodular; however, we show that fp(GS, δ) is not submodular when δ is finite (Section 3.2). On the positive side, we first show that the fixation probability fp(GS, δ) on undirected graphs admits a FPRAS, that is, it can be approximated to arbitrary precision in polynomial time (Section 4.1). We then show that, in contrast to finite δ, the function fp∞(GS) is submodular on undirected graphs and thus FM∞ admits a polynomial-time constant-factor approximation (Section 4.2). Lastly, regarding the limit δ → 0, we show that FM0 can be solved in polynomial time on any graph (Section 4.3). Overall, while FM is hard in general, we obtain tractability in both natural limits δ → ∞ and δ → 0.

Due to space constraints, some proofs are in Appendix A.

2 Preliminaries

We extend the standard Moran process to its positional variant and define our fixation maximization problem on it.

2.1 The Positional Moran Process

Structured populations. Standard evolutionary graph theory (Nowak 2006) describes a population structure by a directed weighted graph (network) G = (V, E, w) of |V| = n nodes, where w is a weight function over the edge set E that defines a probability distribution on each node u, i.e., for each u ∈ V, ∑_{(u,v)∈E} w(u, v) = 1. We require that G is strongly connected when E is projected to the support of the probability distribution of each node. Nodes represent sites (locations) and each site is occupied by a single agent. Agents are of two types — mutants and residents — and agents of the same type are indistinguishable from each other. For ease of presentation, we may refer to a node and its occupant agent interchangeably. In special cases, we will consider simple undirected graphs, meaning that (i) E is symmetric and (ii) w(u, ·) is uniform for every node u.

Active node sets and fitness. Mutants and residents are differentiated in fitness, which in turn depends on a set S ⊆ V of active nodes. Consider that mutants occupy the nodes in X ⊆ V. The fitness of the agent at node u ∈ V is:

fX,S(u) = 1 + δ if u ∈ X ∩ S, and fX,S(u) = 1 if u ∉ X ∩ S,   (1)

where δ ≥ 0 is the mutant fitness advantage. Intuitively, fitness expresses the capacity of a type to spread in the network. The resident fitness is normalized to 1, while δ measures the relative competitive advantage conferred by the mutation. However, the mutant fitness advantage is not realized uniformly in the graph, but only at active nodes. The total fitness of the population is FS(X) = ∑_u fX,S(u).

The positional Moran process. We introduce the positional Moran process as a discrete-time stochastic process (Xt)t≥0, where each Xt ⊆ V is a random variable denoting the set of nodes of G occupied by mutants. Initially, the whole population consists of residents; at time t = 0, a single mutant appears uniformly at random. That is, X0 is given by P[X0 = {u}] = 1/n for each u ∈ V. In each subsequent step, we obtain Xt+1 from Xt by executing the following two stochastic events in succession.

1. Birth event: A single agent is selected for reproduction with probability proportional to its fitness, i.e., an agent on node u is chosen with probability fXt,S(u)/FS(Xt).

2. Death event: A neighbor v of u is chosen with probability w(u, v); its agent is replaced by a copy of the one at u.

If u is a mutant and v a resident, then mutants spread to v, hence Xt+1 = Xt ∪ {v}; likewise, if u is a resident and v a mutant, then Xt+1 = Xt \ {v}; otherwise u and v are of the same type, and thus Xt+1 = Xt.
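As an illustration, a single step of the process can be sketched in code. The following Python sketch (ours, not part of the paper) assumes a simple undirected graph given as adjacency lists, so that w(u, ·) is uniform over the neighbors of u; the function name is illustrative.

```python
import random

def moran_step(adj, X, S, delta, rng):
    """One Birth-Death step of the positional Moran process.

    adj: adjacency lists of a simple undirected graph,
    X: current set of mutant nodes, S: set of active nodes,
    delta: mutant fitness advantage, rng: random.Random instance.
    """
    n = len(adj)
    # fitness per Eq. (1): 1 + delta only for mutants on active nodes
    fit = [1 + delta if (u in X and u in S) else 1.0 for u in range(n)]
    r = rng.random() * sum(fit)
    u = 0
    while r > fit[u]:          # roulette-wheel choice of the reproducer
        r -= fit[u]
        u += 1
    v = rng.choice(adj[u])     # death event at a uniform neighbor of u
    Y = set(X)
    if u in Y:
        Y.add(v)               # mutant spreads to v
    else:
        Y.discard(v)           # resident replaces a mutant on v
    return Y
```

Iterating this step until Xt ∈ {∅, V} yields one trajectory of the process.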

Related Moran processes. Our positional Moran process generalizes the standard Moran process on graphs (Nowak 2006), which forms the special case of S = V (that is, the mutant fitness advantage holds on the whole graph). Conversely, it can be viewed as a special case of the Moran process with two environments (Kaveh, McAvoy, and Nowak 2019). It differs from models in which an agent's fitness depends on which agents occupy the neighboring nodes (so-called frequency-dependent fitness) (Huang and Traulsen 2010).

Fixation probability. If ∃τ ≥ 0 such that Xτ = V, we say that the mutants fixate in G; otherwise we say they go extinct. As G is strongly connected, for any initial mutant set A ⊆ V, the process reaches a homogeneous state almost surely, i.e., lim_{t→∞} P[Xt ∈ {∅, V}] = 1. The fixation probability (FP) of such a set A is fp(GS, δ, A) = P[∃τ ≥ 0: Xτ = V | X0 = A]. Thus, the FP of the mutants in G with active node set S and fitness advantage δ is:

fp(GS, δ) = (1/n) · ∑_{u∈V} fp(GS, δ, {u})   (2)

For any fixed G and S, the function fp(GS, δ) is continuous in δ and infinitely differentiable: the underlying Markov chain is finite and its transition probabilities are rational functions in δ, so fp(GS, δ) too is a rational function in δ, and since the process is defined for any δ ∈ [0, ∞), fp(GS, δ) has no points of discontinuity for δ ∈ [0, ∞).

Figure 1: Steps in the positional Moran process. Blue (red) nodes are occupied by mutants (residents). Yellow rings indicate active nodes that realize the mutant fitness advantage.

Computing the fixation probability. In the neutral setting where δ = 0, fp(GS, δ) = 1/n for any graph G (Broom et al. 2010). When δ > 0, the complexity of computing the fixation probability is unknown even when S = V; it is open whether the function can be computed in PTime. For undirected G and S = V, the problem has a fully polynomial randomized approximation scheme (FPRAS) (Díaz et al. 2012; Chatterjee, Ibsen-Jensen, and Nowak 2017; Goldberg, Lapinskas, and Richerby 2020), as the expected absorption time of the process is polynomial and thus admits fast Monte-Carlo simulations. We also rely on Monte-Carlo simulations for computing fp(GS, δ), and show that the problem admits a FPRAS for any S ⊆ V when G is undirected.
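Such a simulation-based estimator can be sketched as follows. This is our own Python sketch for simple undirected graphs with uniform weights; the function name and default trial count are illustrative, and the number of trials needed for a given precision follows from standard concentration bounds.

```python
import random

def estimate_fp(adj, S, delta, trials=10000, seed=0):
    """Monte-Carlo estimate of fp(G_S, delta): simulate the positional
    Moran process from a uniformly random initial mutant and report the
    fraction of runs that end in fixation."""
    rng = random.Random(seed)
    n = len(adj)
    fixations = 0
    for _ in range(trials):
        X = {rng.randrange(n)}                   # uniform initial mutant
        while 0 < len(X) < n:
            # birth: agent chosen proportionally to fitness (Eq. (1))
            fit = [1 + delta if (u in X and u in S) else 1.0
                   for u in range(n)]
            u = rng.choices(range(n), weights=fit)[0]
            v = rng.choice(adj[u])               # death at a neighbor of u
            if u in X:
                X.add(v)
            else:
                X.discard(v)
        fixations += (len(X) == n)
    return fixations / trials
```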

2.2 Fixation Maximization via Node Activation

We study the optimization problem of fixation maximization in the positional Moran process. We focus on three variants based on different regimes of the mutant fitness advantage δ.

Fixation maximization. Consider a graph G = (V, E), a mutant fitness advantage δ and a budget k ∈ N. The problem Fixation Maximization for the Moran Process, FM(G, δ, k), calls to determine an active-node set of size ≤ k that maximizes the fixation probability:

S∗ = arg max_{S⊆V, |S|≤k} fp(GS, δ)   (3)

As we will argue, fp(GS, δ) is monotonically increasing in S, hence the condition |S| ≤ k can be replaced by |S| = k. The associated decision problem is: given a budget k and a threshold p ∈ [0, 1], determine whether there exists an active-node set S of size |S| = k such that fp(GS, δ) ≥ p. As an illustration, consider a cycle graph on n = 50 nodes and two strategies: one that activates k “spaced” nodes, and another that activates k “contiguous” nodes. As Fig. 2 shows, depending on δ and k, either strategy could be better.

The regime of strong selection. Next, we consider FM in the limit δ → ∞. The corresponding fixation probability is:

fp∞(GS) = lim_{δ→∞} fp(GS, δ)   (4)

As we show later, fp(GS, δ) is monotonically increasing in δ, and since it is also bounded by 1, the above limit exists. The corresponding optimization problem FM∞(G, k) asks for the active node set S∗ that maximizes fp∞(GS):

S∗ = arg max_{S⊆V, |S|≤k} fp∞(GS)   (5)

Figure 2: FP differences on a cycle graph on n = 50 nodes, under the spaced and the contiguous activating strategies. E.g., for δ = 100 and k = 18 we have fp(GSpaced, δ) ≈ 0.62 and fp(GCont, δ) ≈ 0.43, so the difference is almost 20%.

The regime of weak selection. Further, we consider the problem FM(G, δ, k) in the limit δ → 0. For small positive δ, different sets S generally give different fixation probabilities fp(GS, δ), see Fig. 3. For fixed S ⊆ V, we can view fp(GS, δ) as a function of δ. By the Taylor expansion of fp(GS, δ), and given that fp(GS, 0) = 1/n, we have

fp(GS, δ) = 1/n + δ · fp′(GS, 0) + O(δ²),   (6)

where fp′(GS, 0) = (d/dδ)|_{δ=0} fp(GS, δ). Since the lower-order terms O(δ²) tend to 0 faster than δ as δ → 0, on a sufficiently small neighborhood of δ = 0, maximizing fp(GS, δ) reduces to maximizing the derivative fp′(GS, 0) (up to lower-order terms). The corresponding optimization problem FM0(G, k) asks for the set S∗ that maximizes fp′(GS, 0):

S∗ = arg max_{S⊆V, |S|≤k} fp′(GS, 0)   (7)

As fp(GS, δ) ≥ 1/n for all S, our choice of S maximizes the gain of fixation probability, i.e., the difference ∆fp(GS, δ) = fp(GS, δ) − 1/n. The set S∗ of Eq. (7) guarantees a gain with relative approximation factor tending to 1 as δ → 0, i.e., we have the following (simple) lemma, where SOPT denotes an optimal solution to FM(G, δ, k).

Lemma 1. ∆fp(GSOPT, δ) / ∆fp(GS∗, δ) = 1 + O(δ).
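A short justification of Lemma 1 (our sketch, assuming fp′(GS∗, 0) > 0 and writing SOPT for an optimal solution of FM(G, δ, k)) follows directly from Eq. (6):

```latex
\frac{\Delta\mathrm{fp}(G_{S_{\mathrm{OPT}}},\delta)}{\Delta\mathrm{fp}(G_{S^*},\delta)}
  = \frac{\delta\,\mathrm{fp}'(G_{S_{\mathrm{OPT}}},0) + O(\delta^2)}
         {\delta\,\mathrm{fp}'(G_{S^*},0) + O(\delta^2)}
  = \frac{\mathrm{fp}'(G_{S_{\mathrm{OPT}}},0)}{\mathrm{fp}'(G_{S^*},0)} + O(\delta)
  = 1 + O(\delta).
```

The last step holds because the ratio of the leading coefficients is at most 1 (S∗ maximizes fp′(GS, 0)), while the overall ratio is at least 1 (SOPT is optimal for fp).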

Monotonicity. Enlarging the set S or the advantage δ increases the fixation probability. The proof generalizes the standard Moran process monotonicity (Díaz et al. 2016).

Lemma 2 (Monotonicity). Consider a graph G = (V, E), two subsets S ⊆ S′ ⊆ V and two real numbers 0 ≤ δ ≤ δ′. Then fp(GS, δ) ≤ fp(GS′, δ′).

3 Negative Results

We start by presenting hardness results for FM and FM∞.

3.1 Hardness of FM and FM∞

We first show NP-hardness for FM and FM∞. This hardness persists even given an oracle that computes the fixation probability fp(GS, δ) (or fp∞(GS)) for a graph G and a set of active nodes S. Given a set of mutants A, let

fp∞(GS, A) = lim_{δ→∞} fp(GS, δ, A)   (8)

be the fixation probability of a set of mutants A under strong selection, i.e., in the limit of large fitness advantage. The following lemma states that for undirected graphs, a single active mutant with infinite fitness guarantees fixation.

Figure 3: A fixed graph G on 7 nodes and the fixation probability fp(GS, δ) for all six non-equivalent subsets S of k = 2 nodes (such as S = {1, 2} and S = {3, 5}); fp(GS, δ) is roughly linear in δ when δ ∈ [0, 0.1].

Lemma 3. Let G be an undirected graph, S a set of active nodes, and A a set of mutants. If A ∩ S ≠ ∅ then fp∞(GS, A) = 1.

The intuition behind Lemma 3 is as follows: consider a node u ∈ A ∩ S. Then, as δ → ∞, node u is chosen for reproduction at a much higher rate than any resident node. Hence, in a typical evolutionary trajectory, the neighbors of node u are occupied by mutants for most of the time, and thus the individual at node u is effectively guarded from any threat of becoming a resident. In contrast, the mutants are reasonably likely to spread to any other part of the graph. In combination, this allows us to argue that mutants fixate with high probability as δ → ∞. Notably, this argument relies on G being undirected. As a simple counterexample, consider G being a directed triangle a → b → c → a and S = A = {a}. Then, for the mutants to spread to c, a mutant on b must reproduce. Since b ∉ S, a mutant on b is as likely to reproduce in each round as a resident on c. If the latter happens prior to the former, node a becomes resident, and, with no active mutants left, we have a constant (non-zero) probability of extinction.

Due to Lemma 3, if G is an undirected regular graph with a vertex cover of size k, under strong selection, we achieve the maximum fixation probability by activating a set S of k nodes that forms a vertex cover of G. The key insight is that, if the initial mutant lands on some node u ∉ S, then the mutant fixates if and only if it manages to reproduce once before it is replaced, as all neighbors of u are in S and thus Lemma 3 applies. This is captured in the following lemma.

Lemma 4. For any undirected regular graph G and S ⊆ V, fp∞(GS) ≥ (n + |S|)/(2n) iff S is a vertex cover of G (and if so, the equality holds).

Proof. First, note that due to Lemma 3, the fixation probability equals the probability that eventually a mutant is placed on an active node. Due to the uniform placement of the initial mutant, the probability that it lands on an active node is |S|/n. Let A be the set of nodes in V \ S that have at least one neighbor not in S. Thus we can write:

fp∞(GS) = |S|/n + ((n − |S| − |A|)/n) · p + ∑_{u∈A} (1/n) · fp∞(GS, {u})   (9)

where p is the probability that an initial mutant occupying a non-active node u whose neighbors are all in S spreads to any of its neighbors, i.e., reproduces before any of its neighbors replace it; u reproduces with probability p1 = 1/n, while its neighbors replace it with probability p2 = ∑_{(v,u)∈E} 1/n · 1/d = d · 1/n · 1/d = 1/n, where d is the degree of G, hence p = p1/(p1 + p2) = 1/2. If S is a vertex cover of G, then A = ∅, hence Eq. (9) yields fp∞(GS) = (n + |S|)/(2n). If S is not a vertex cover of G, then |A| ≥ 1; for any node u ∈ A, since at least one of its neighbors is not in S, fp∞(GS, {u}) is strictly smaller than the probability that the mutant on u reproduces before it gets replaced, which, as we argued, is 1/2, hence Eq. (9) yields fp∞(GS) < (n + |S|)/(2n).

The hardness for FM∞ is a direct consequence of Lemma 4 and the fact that vertex cover is NP-hard even on regular graphs (see, e.g., (Feige 2003)). Moreover, as fp(GS, δ) is a continuous function of δ, we also obtain hardness for FM (i.e., under finite δ). We thus have the following theorem.

Theorem 1. FM∞(G, k) and FM(G, δ, k) are NP-hard,even on undirected regular graphs.

Finally, we note that the hardness persists even given anoracle that computes the fixation probability fp(GS , δ) (orfp∞(GS)) given a graph G and a set of active nodes S.

3.2 Non-Submodularity for FM

One standard way to circumvent NP-hardness is to show that the optimization function is submodular, which implies that a greedy algorithm offers constant-factor approximation guarantees (Nemhauser, Wolsey, and Fisher 1978; Krause and Golovin 2014). Although in the next section we prove that fp∞(GS) is indeed submodular, unfortunately the same does not hold for fp(GS, δ), even for very simple graphs.

Theorem 2. The function fp(GS, δ) is not submodular.

Proof. Our proof is by means of a counter-example. Consider G = K4, that is, G is a clique on 4 nodes. For i ∈ {0, 1, 2}, denote by pi the fixation probability on K4 when i of the 4 nodes are active and mutants have fitness advantage δ = 1/3. For fixed i, the value pi can be computed exactly by solving a system of 2^4 linear equations (one for each configuration of mutants and residents). Solving the three systems using a computer algebra system, we obtain p0 = 1/4, p1 = 38413/137740 < 0.2789 and p2 = 28984/94153 > 0.3078. Hence p0 + p2 > 0.5578 = 2 · 0.2789 > 2p1, thus submodularity is violated.

For G = K4, the three systems can even be solved symbolically, in terms of δ > 0. It turns out that the submodularity property is violated for δ < δ⋆, where δ⋆ ≈ 0.432. In contrast, for G = K3 the submodularity property holds for all δ ≥ 0 (the verification is straightforward, though tedious).
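The counter-example can be checked mechanically. The following Python sketch (ours; it uses exact rational arithmetic in place of a computer algebra system) builds the absorbing Markov chain over the mutant configurations of K4 and solves it by Gauss-Jordan elimination:

```python
from fractions import Fraction
from itertools import combinations

def fixation_prob(n_active, delta=Fraction(1, 3), n=4):
    """Exact fixation probability on the clique K_n when the first
    n_active nodes are active, via the chain on mutant configurations."""
    V = frozenset(range(n))
    S = frozenset(range(n_active))
    # transient states: all nonempty proper subsets of V
    states = [frozenset(c) for r in range(1, n)
              for c in combinations(range(n), r)]
    idx = {X: i for i, X in enumerate(states)}
    m = len(states)
    A = [[Fraction(0)] * m for _ in range(m)]
    b = [Fraction(0)] * m
    for X in states:
        i = idx[X]
        A[i][i] += 1
        F = sum(1 + delta if (u in X and u in S) else Fraction(1)
                for u in range(n))
        for u in range(n):
            fu = 1 + delta if (u in X and u in S) else Fraction(1)
            for v in range(n):
                if v == u:
                    continue
                p = fu / F / (n - 1)        # birth at u, death at v
                Y = X | {v} if u in X else X - {v}
                if Y == V:
                    b[i] += p               # absorbed: fixation
                elif Y:
                    A[i][idx[Y]] -= p       # still transient
    # Gauss-Jordan elimination with exact rationals
    for c in range(m):
        piv = next(r for r in range(c, m) if A[r][c] != 0)
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        inv = 1 / A[c][c]
        A[c] = [a * inv for a in A[c]]
        b[c] *= inv
        for r in range(m):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [a - f * ac for a, ac in zip(A[r], A[c])]
                b[r] -= f * b[c]
    # uniform initial mutant: average over singleton starting sets
    return sum(b[idx[frozenset({u})]] for u in range(n)) / n
```

For n_active ∈ {0, 1, 2} this should recover the quantities p0, p1, p2 used in the proof.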

4 Positive Results

We now turn our attention to positive results for fixation maximization in the limits of strong and weak selection.

4.1 Approximating the Fixation Probability

We first focus on computing the fixation probability. Although the complexity of the problem is open even in the standard Moran process, when the underlying graph is undirected, the expected number of steps until the standard (non-positional) process terminates is polynomial in n (Díaz et al. 2012), which yields a FPRAS. A straightforward extension of that approach applies to the expected number T(GS, δ) of steps in the positional Moran process.

Lemma 5. Given an undirected graph G = (V, E) on n nodes, a set S ⊆ V and a real number δ ≥ 1, we have T(GS, δ) ≤ (1 + δ) · n^6.

As a consequence, by simulating the process multiple times and reporting the proportion of runs that terminated with mutant fixation, we obtain a fully polynomial randomized approximation scheme (FPRAS) for the fixation probability.

Corollary 1. Given a connected undirected graph G = (V, E), a set S ⊆ V and a real number δ ≥ 1, the function fp(GS, δ) admits a FPRAS.

4.2 Fixation Maximization under Strong Selection

Here we show that the function fp∞(GS) is submodular. As a consequence, we obtain a constant-factor approximation for FM∞ in polynomial time.

Lemma 6 (Submodularity). For any undirected graph G, the function fp∞(GS) is submodular.

Proof. Consider any set S of active nodes. Then the positional Moran process {Xt} with mutant advantage δ → ∞ is equivalent to the following process:

1. While Xt ∩ S = ∅, perform an update step Xt → Xt+1 as if δ = 0.
2. If Xt ∩ S ≠ ∅, terminate and report fixation.

Indeed, as long as no active node hosts a mutant, all individuals have the same (unit) fitness and the process is indistinguishable from the Moran process with δ = 0. On the other hand, once any one active node receives a mutant, fixation happens with high probability by Lemma 3. All in all, the fixation probability p(S) = fp∞(GS) in the limit δ → ∞ can be computed by simulating the neutral process (δ = 0) until either Xt = ∅ (extinction) or Xt ∩ S ≠ ∅ (fixation).

To prove the submodularity of p(S), it suffices to show that for any two sets S, T ⊆ V we have:

p(S) + p(T) ≥ p(S ∪ T) + p(S ∩ T).   (10)

Consider any fixed trajectory 𝒯 = (X0, X1, . . .). We say that a subset U ⊆ V is good with respect to 𝒯, or simply good, if there exists a time point τ ≥ 0 such that Xτ ∩ U ≠ ∅. It suffices to show that, for any 𝒯, there are at least as many good sets among S, T as there are among S ∪ T and S ∩ T. To that end, we distinguish three cases based on how many of the sets S ∪ T, S ∩ T are good.

1. Both of them: Since S ∩ T is good, both S and T are good and we get 2 ≥ 2.
2. One of them: If S ∩ T is good we conclude as before (we get 2 ≥ 1). Otherwise S ∪ T is good, hence at least one of S, T is good and we get 1 ≥ 1.
3. None of them: We have 0 ≥ 0.
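The equivalent process in the proof suggests a straightforward estimator of fp∞(GS): simulate the neutral process and declare fixation as soon as some active node hosts a mutant. A Python sketch of ours, again assuming a simple undirected graph with uniform weights:

```python
import random

def estimate_fp_inf(adj, S, trials=20000, seed=1):
    """Monte-Carlo estimate of fp_inf(G_S): run the neutral (delta = 0)
    process, stopping at extinction or as soon as a mutant occupies an
    active node (which counts as fixation, by Lemma 3)."""
    rng = random.Random(seed)
    n = len(adj)
    hits = 0
    for _ in range(trials):
        X = {rng.randrange(n)}            # uniform initial mutant
        while X and not (X & S):
            u = rng.randrange(n)          # neutral: uniform reproducer
            v = rng.choice(adj[u])        # death at a uniform neighbor
            if u in X:
                X.add(v)
            else:
                X.discard(v)
        hits += bool(X)
    return hits / trials
```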

The submodularity lemma leads to the following approximation guarantee for the greedy algorithm (Nemhauser, Wolsey, and Fisher 1978; Krause and Golovin 2014).

Theorem 3. Given an undirected graph G and integer k, let S∗ be the solution to FM∞(G, k), and S′ the set returned by a greedy maximization algorithm. Then fp∞(GS′) ≥ (1 − 1/e) · fp∞(GS∗).

Consequently, a greedy algorithm approximates the optimal fixation probability within a factor of 1 − 1/e, provided it is equipped with an oracle that estimates the fixation probability fp∞(GS) given any set S. For the class of undirected graphs, Lemma 5 and Theorem 3 yield a fully polynomial randomized greedy algorithm for FM∞(G, k) with approximation guarantee 1 − 1/e.
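The greedy scheme itself is the standard one. A generic Python skeleton (ours), where `oracle` stands in for any estimator of fp∞(GS), e.g. a Monte-Carlo one:

```python
def greedy_activate(nodes, k, oracle):
    """Greedy maximization: repeatedly add the node with the largest
    marginal gain according to oracle(S). For a monotone submodular
    oracle this achieves the (1 - 1/e) guarantee of Theorem 3, up to
    the oracle's estimation error."""
    S = set()
    for _ in range(k):
        best = max((u for u in nodes if u not in S),
                   key=lambda u: oracle(S | {u}))
        S.add(best)
    return S
```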

4.3 Fixation Maximization under Weak Selection

Here we show that FM0 can be solved in polynomial time. Recall that given an active set S ⊂ V, we can write fp(GS, δ) = fp(GS, 0) + δ · fp′(GS, 0) + O(δ²), while FM0(G, k) calls to compute a set S⋆ which maximizes the coefficient fp′(GS, 0) across all sets S that satisfy |S| ≤ k. We build on the machinery developed in (Allen et al. 2017).

In Lemma 7 below, we show that fp′(GS, 0) = ∑_{ui∈S} α(ui) for a certain function α : V → R. The bottleneck in computing α(·) is solving a system of O(m) linear equations, where m is the number of edges in G. This can be done in matrix-multiplication time O(m^ω) for ω < 2.373 (Alman and Williams 2021). By virtue of the linearity of fp′(GS, 0), the optimal set S⋆ is then given by choosing the top-k nodes ui in terms of α(ui). Thus we establish the following theorem.

Theorem 4. Given a graph G of m edges and some integer k, the solution to FM0(G, k) can be computed in O(m^ω) time, where ω < 2.373 is the matrix multiplication exponent.

In the rest of this section, we state Lemma 7 and give intuition about the function α(·). For the formal proof of Lemma 7, see Appendix.

For brevity, we denote the n nodes of G by u1, . . . , un, and we denote by pij = w(ui, uj) the probability that, if ui is selected for reproduction, then the offspring migrates to uj.

Lemma 7. Let G = (V, E) be a graph with n nodes u1, . . . , un and m edges. Consider a function α : V → R defined by α(ui) = (1/n) · ∑_{j∈[n]} pij · πj · ψij, where {πi | ui ∈ V} is the solution to the linear system

πi = (1 − ∑_{j∈[n]} pji) · πi + ∑_{j∈[n]} pij · πj,   ∀i ∈ [n],   (11)

and {ψij | (ui, uj) ∈ E} is the solution to the linear system

ψij = (1 + ∑_{ℓ∈[n]} (pℓi · ψℓj + pℓj · ψiℓ)) / (∑_{ℓ∈[n]} (pℓi + pℓj))   if i ≠ j and (ui, uj) ∈ E,   and ψij = 0 otherwise.   (12)

Then fp′(GS, 0) = ∑_{ui∈S} α(ui).

The intuition behind the quantities πi, ψij, and α(ui) from Lemma 7 is as follows. Consider the Moran process on G with δ = 0. Then for i ∈ [n], the value πi is in fact the mutant fixation probability starting from the initial configuration X0 = {ui} (Allen et al. 2021). Indeed, at any time step, the agent on ui will eventually fixate if (i) ui is not replaced by its neighbors and eventually fixates from the next step onward, or (ii) ui spreads to a neighbor uj and fixates from there. The first event happens with rate (1 − ∑_{j∈[n]} pji) · πi, while the second event happens with rate ∑_{j∈[n]} pij · πj. (We note that for undirected graphs, the system has the explicit solution πi = (1/deg(ui)) / ∑_{j∈[n]} (1/deg(uj)) (Broom et al. 2010).)
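As a sanity check, the explicit undirected solution can be verified against the linear system directly. The following sketch is our own illustration (the example graph and variable names are ours); it confirms on a small undirected graph that πi = (1/deg(ui)) / ∑_j (1/deg(uj)) satisfies the system, with pij = 1/deg(ui) whenever uj is a neighbor of ui:

```python
# Sanity check (our illustration): the explicit neutral-fixation solution
# pi_i = (1/deg(u_i)) / sum_j (1/deg(u_j)) satisfies the linear system
# pi_i = (1 - sum_j p_ji) * pi_i + sum_j p_ij * pi_j,
# where p_ij = 1/deg(u_i) if u_j is a neighbor of u_i, and 0 otherwise.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}  # a small example graph
deg = {u: len(vs) for u, vs in adj.items()}
Z = sum(1.0 / deg[u] for u in adj)
pi = {u: (1.0 / deg[u]) / Z for u in adj}

def p(i, j):
    return 1.0 / deg[i] if j in adj[i] else 0.0

for i in adj:
    rhs = (1 - sum(p(j, i) for j in adj)) * pi[i] + sum(p(i, j) * pi[j] for j in adj)
    assert abs(pi[i] - rhs) < 1e-12

# Consistency check: neutral single-mutant fixation probabilities sum to 1,
# since exactly one of the n initial lineages eventually fixates.
assert abs(sum(pi.values()) - 1.0) < 1e-12
```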

The values ψij for (ui, uj) ∈ E also have an intuitive meaning: they equal the expected total number of steps during the Moran process (with δ = 0) in which ui is a mutant and uj is not, when starting from a random initial configuration P[X0 = {ui}] = 1/n. To sketch the idea behind this claim, suppose that in a single step a node uℓ spreads to node ui (this happens with rate pℓi). Then the event “ui is a mutant while uj is not” holds if and only if uℓ was a mutant but uj was not, hence the product pℓi · ψℓj. Similar reasoning applies to the product pℓj · ψiℓ. The term 1 in the numerator comes from the 1/n probability that the initial mutant lands on ui (in which case indeed ui is a mutant and uj is not).

Finally, the function α(·) also has a natural interpretation. The contribution of an active node ui to the growth of the fixation probability with δ, at δ = 0, reflects the extent to which a mutant on ui is likely to be chosen for reproduction, spread to a resident neighbor uj, and fixate from there. That is, it is the total time during which ui is a mutant but a neighbor uj is not (ψij), calibrated by the probability that, when chosen for reproduction, ui propagates to that neighbor (pij), who thereafter achieves mutant fixation (πj); this quantity is summed over all neighbors and weighted by the rate 1/n at which ui reproduces when δ = 0.
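The method behind Theorem 4 can be sketched end-to-end. The following is our own illustration, not the paper's implementation: it solves the two linear systems of Lemma 7 by plain Gaussian elimination (so it does not attain the O(m^ω) bound, which requires fast matrix multiplication and a sparser encoding), sets up system (12) over all ordered pairs for simplicity, and pins down the homogeneous system (11) with the normalization ∑_i πi = 1, which holds because under neutral drift exactly one of the n initial lineages eventually fixates. All function names are ours.

```python
# Hedged sketch of the weak-selection method (Lemma 7 / Theorem 4).
def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting; A is a list of rows."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def weak_selection_topk(P, k):
    """P[i][j] = p_ij (migration probabilities); returns (alpha, top-k nodes)."""
    n = len(P)
    # System (11), with one redundant equation replaced by sum_i pi_i = 1.
    A = [[(sum(P[l][i] for l in range(n)) if i == j else 0.0) - P[i][j]
          for j in range(n)] for i in range(n)]
    A[0] = [1.0] * n
    b = [0.0] * n
    b[0] = 1.0
    pi = solve(A, b)
    # System (12), set up over all ordered pairs i != j for simplicity
    # (the paper only needs psi_ij on edges).
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    idx = {pq: t for t, pq in enumerate(pairs)}
    B = [[0.0] * len(pairs) for _ in pairs]
    for (i, j), t in idx.items():
        B[t][t] += sum(P[l][i] + P[l][j] for l in range(n))
        for l in range(n):
            if l != j:
                B[t][idx[(l, j)]] -= P[l][i]  # term p_li * psi_lj
            if l != i:
                B[t][idx[(i, l)]] -= P[l][j]  # term p_lj * psi_il
    psi = solve(B, [1.0] * len(pairs))
    alpha = [sum(P[i][j] * pi[j] * psi[idx[(i, j)]]
                 for j in range(n) if j != i) / n for i in range(n)]
    return alpha, sorted(range(n), key=lambda i: -alpha[i])[:k]
```

For instance, on the path u0–u1–u2–u3 the computed πi agree with the closed form (1/deg(ui))/∑_j (1/deg(uj)), and the α values respect the path's reversal symmetry.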

5 Experiments

Here we report on an experimental evaluation of the proposed algorithms and some other heuristics. Our data set consists of 110 connected subgraphs of community graphs and social networks of the Stanford Network Analysis Project (Leskovec and Krevl 2014). These subgraphs were chosen randomly and vary in size between 20 and 170 nodes. Our evaluation does not aim to be exhaustive, but rather to outline the practical performance of various heuristics, sometimes in relation to their theoretical guarantees.

In particular, we use 7 heuristics, each taking as input a graph, a budget k, and (optionally) the mutant fitness advantage δ. The heuristics are motivated by our theoretical results and by the related fields of influence maximization and evolutionary graph theory.

1. Random: Simply choose k nodes uniformly at random. This heuristic serves as a baseline.
2. High Degree: Choose the k nodes with the largest degree.
3. Centrality: Choose the k nodes with the largest betweenness centrality.
4. Temperature: The temperature of a node u is defined as T(u) = ∑_v w(v, u); this heuristic chooses the k nodes with the largest temperature.
5. Vertex Cover: Motivated by Lemma 4, this heuristic attempts to maximize the number of edges with at least one endpoint in S. In particular, given a set A ⊆ V, let

  c(A) = |{(u, v) ∈ E : u ∈ A or v ∈ A}|.   (13)

The heuristic greedily maximizes c(·), i.e., we start with S = ∅ and perform k update steps

  S ← S ∪ arg max_{u∉S} c(S ∪ {u}).   (14)

6. Weak Selector: Choose the k nodes that maximize fp′(GS, 0) using the (optimal) weak-selection method.
7. Lazy Greedy: A simple greedy algorithm starts with S = ∅, and in each step chooses the node u to add to S that maximizes the objective function. For FM(G, δ, k) and FM∞(G, k), this process requires repeatedly evaluating fp(GS∪{u}, δ) (or fp∞(GS∪{u}), respectively) for every node u. This is done by simulating the process a large number of times (recall Lemma 5), and becomes a computational bottleneck when we require high precision. As a workaround, we use a lazy variant of the greedy algorithm (Minoux 1978), which is faster but requires submodularity of the objective function. In effect, this algorithm is a correct implementation of the greedy heuristic in the limit of strong selection (recall Lemma 6), while it may still perform well (but without guarantees) for finite δ.

In all cases, ties are broken arbitrarily. Note that the heuristics vary in the amount of information they have about the graph and the invasion process. In particular, Random has no information whatsoever, High Degree only considers the direct neighbors of a node, Temperature considers direct and distance-2 neighbors of a node, Centrality and Vertex Cover consider the whole graph, while Weak Selector and Lazy Greedy are the only ones informed about the Moran process.

For each graph G, we have chosen values of k corresponding to 10%, 30%, and 50% of its nodes, and have evaluated the above heuristics on their ability to solve FM∞(G, k) (strong selection) and FM0(G, k) (weak selection). We have not considered other values of δ, as evaluating fp(GS, δ) precisely via simulations requires many repetitions and becomes slow.

Figure 4: Heuristic performance for FM∞.

Figure 5: Heuristic performance for FM0.

Strong selection. We start with the case of FM∞(G, k). Since different graphs G and budgets k generally result in widely varying fixation probabilities, in order to obtain an informative aggregate measure of each heuristic, we divide the fixation probability it obtains by the maximum fixation probability obtained across all heuristics for the same graph and budget. This normalization yields values in the interval [0, 1] and makes comparison straightforward. Fig. 4 shows our results. We see that the Lazy Greedy algorithm performs best for small budgets (10%, 30%), while its performance is matched by Vertex Cover for budget 50%. The high performance of Lazy Greedy is expected given its theoretical guarantee (Theorem 3). We also observe that, apart from Weak Selector, the other heuristics perform quite well on many graphs, though there are several cases on which they fail, appearing as outlier points in the box plots. As expected, our baseline Random heuristic performs quite poorly; this result indicates that the node activation set S has a significant impact on the fixation probability. Finally, recall that Weak Selector is optimal for the regime of weak selection (δ → 0). The fact that Weak Selector underperforms for strong selection (δ → ∞) indicates an intricate relationship between fitness advantage and fixation probability.

Weak selection. We collect the results on weak selection in Fig. 5, using the normalization process above. Since the derivative satisfies fp′(GS, 0) = ∑_{ui∈S} α(ui) (Section 4.3), Lazy Greedy is optimal and coincides with Weak Selector, and is thus omitted from the figure. Naturally, Weak Selector always outperforms the others, while the Random heuristic is weak. The other heuristics have mediocre performance, with a clear advantage of Centrality over the rest, which becomes clearer for larger budget values.

6 Conclusion

We introduced the positional Moran process and studied the associated fixation maximization problem. We have shown that the problem is NP-hard in general, but becomes tractable in the limits of strong and weak selection. Our results only scratch the surface of this new process, and several interesting questions remain open, such as: Does the strong-selection setting admit a better approximation than the one based on submodularity? Can the problem for finite δ be approximated within some constant factor? Are there classes of graphs for which it becomes tractable?

References

Allen, B.; Lippner, G.; Chen, Y.-T.; Fotouhi, B.; Momeni, N.; Yau, S.-T.; and Nowak, M. A. 2017. Evolutionary dynamics on any population structure. Nature, 544(7649): 227–230.
Allen, B.; Sample, C.; Steinhagen, P.; Shapiro, J.; King, M.; Hedspeth, T.; and Goncalves, M. 2021. Fixation probabilities in graph-structured populations under weak selection. PLOS Computational Biology, 17(2): 1–25.
Alman, J.; and Williams, V. V. 2021. A Refined Laser Method and Faster Matrix Multiplication. In Marx, D., ed., Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms, SODA 2021, 522–539. SIAM.
Goldberg, L. A.; Lapinskas, J.; and Richerby, D. 2020. Phase transitions of the Moran process and algorithmic consequences. Random Structures & Algorithms, 56(3): 597–647.
Antal, T.; Redner, S.; and Sood, V. 2006. Evolutionary Dynamics on Degree-Heterogeneous Graphs. Physical Review Letters, 96: 188104.
Borgs, C.; Brautbar, M.; Chayes, J.; and Lucier, B. 2014. Maximizing Social Influence in Nearly Optimal Time. In SODA, 946–957.
Broom, M.; Hadjichrysanthou, C.; Rychtář, J.; and Stadler, B. 2010. Two results on evolutionary processes on general non-directed graphs. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 466(2121): 2795–2798.
Chatterjee, K.; Ibsen-Jensen, R.; and Nowak, M. A. 2017. Faster Monte-Carlo Algorithms for Fixation Probability of the Moran Process on Undirected Graphs. In 42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017), volume 83 of LIPIcs, 61:1–61:13. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
Clifford, P.; and Sudbury, A. 1973. A Model for Spatial Conflict. Biometrika, 60(3): 581–588.
Demers, A.; Greene, D.; Hauser, C.; Irish, W.; Larson, J.; Shenker, S.; Sturgis, H.; Swinehart, D.; and Terry, D. 1987. Epidemic Algorithms for Replicated Database Maintenance. In Proceedings of the Sixth Annual ACM Symposium on Principles of Distributed Computing, PODC '87, 1–12. Association for Computing Machinery.
Díaz, J.; Goldberg, L. A.; Mertzios, G. B.; Richerby, D.; Serna, M.; and Spirakis, P. G. 2012. Approximating Fixation Probabilities in the Generalized Moran Process. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '12, 954–960. SIAM.
Díaz, J.; Goldberg, L. A.; Richerby, D.; and Serna, M. 2016. Absorption time of the Moran process. Random Structures & Algorithms, 49(1): 137–159.
Even-Dar, E.; and Shapira, A. 2007. A Note on Maximizing the Spread of Influence in Social Networks. In Internet and Network Economics, 281–286. Springer.
Feige, U. 2003. Vertex cover is hardest to approximate on regular graphs. Technical Report MCS03-15, The Weizmann Institute of Science.
Fountoulakis, N.; Panagiotou, K.; and Sauerwald, T. 2012. Ultra-Fast Rumor Spreading in Social Networks. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '12, 1642–1660. SIAM.
Galanis, A.; Göbel, A.; Goldberg, L. A.; Lapinskas, J.; and Richerby, D. 2017. Amplifiers for the Moran process. Journal of the ACM, 64(1): 5.
Giakkoupis, G. 2016. Amplifiers and Suppressors of Selection for the Moran Process on Undirected Graphs. arXiv preprint arXiv:1611.01585.
Huang, W.; and Traulsen, A. 2010. Fixation probabilities of random mutants under frequency dependent selection. Journal of Theoretical Biology, 263(2): 262–268.
Kaveh, K.; McAvoy, A.; and Nowak, M. A. 2019. Environmental fitness heterogeneity in the Moran process. Royal Society Open Science, 6(1): 181661.
Kempe, D.; Kleinberg, J.; and Tardos, É. 2003. Maximizing the Spread of Influence through a Social Network. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 137–146.
Kötzing, T.; and Krejca, M. S. 2019. First-hitting times under drift. Theoretical Computer Science, 796: 51–69.
Krause, A.; and Golovin, D. 2014. Submodular function maximization. Tractability, 3: 71–104.
Leskovec, J.; and Krevl, A. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. http://snap.stanford.edu/data.
Li, Y.; Fan, J.; Wang, Y.; and Tan, K. 2018. Influence Maximization on Social Graphs: A Survey. IEEE TKDE, 30(10): 1852–1872.
Lieberman, E.; Hauert, C.; and Nowak, M. A. 2005. Evolutionary dynamics on graphs. Nature, 433(7023): 312–316.
Liggett, T. 1985. Interacting Particle Systems. Classics in Mathematics. Springer.
McAvoy, A.; and Allen, B. 2021. Fixation probabilities in evolutionary dynamics under weak selection. Journal of Mathematical Biology, 82(3): 1–41.
Minoux, M. 1978. Accelerated greedy algorithms for maximizing submodular set functions. In Stoer, J., ed., Optimization Techniques, 234–243. Springer.
Monk, T.; Green, P.; and Paulin, M. 2014. Martingales and fixation probabilities of evolutionary graphs. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 470(2165): 20130730.
Moran, P. A. P. 1958. Random processes in genetics. Mathematical Proceedings of the Cambridge Philosophical Society, 54(1): 60–71.
Nemhauser, G. L.; Wolsey, L. A.; and Fisher, M. L. 1978. An Analysis of Approximations for Maximizing Submodular Set Functions–I. Mathematical Programming, 14(1): 265–294.
Nowak, M. A. 2006. Evolutionary Dynamics: Exploring the Equations of Life. Belknap Press of Harvard University Press.
Pavlogiannis, A.; Tkadlec, J.; Chatterjee, K.; and Nowak, M. A. 2018. Construction of arbitrarily strong amplifiers of natural selection using evolutionary graph theory. Communications Biology, 1(1): 71.
Talamali, M. S.; Saha, A.; Marshall, J. A. R.; and Reina, A. 2021. When less is more: Robot swarms adapt better to changes with constrained communication. Science Robotics, 6(56): eabf1416.
Tang, Y.; Shi, Y.; and Xiao, X. 2015. Influence Maximization in Near-Linear Time: A Martingale Approach. In SIGMOD, 1539–1554.
Tang, Y.; Xiao, X.; and Shi, Y. 2014. Influence Maximization: Near-Optimal Time Complexity Meets Practical Efficiency. In SIGMOD.
Tkadlec, J.; Pavlogiannis, A.; Chatterjee, K.; and Nowak, M. A. 2021. Fast and strong amplifiers of natural selection. Nature Communications, 12(1): 4009.
Zhang, K.; Zhou, J.; Tao, D.; Karras, P.; Li, Q.; and Xiong, H. 2020. Geodemographic Influence Maximization, 2764–2774. Association for Computing Machinery.

A Appendix

Lemma 2 (Monotonicity). Consider a graph G = (V, E), two subsets S ⊆ S′ ⊆ V, and two real numbers 0 ≤ δ ≤ δ′. Then fp(GS, δ) ≤ fp(GS′, δ′).

Proof. Fix the pairs (S, δ) and (S′, δ′). The key property is that, for any two configurations X ⊆ X′ and any node u ∈ V, we have fX,S(u) ≤ fX′,S′(u): indeed, the inequality is strict if and only if u ∈ (X′ \ X) ∩ (S′ \ S). As a consequence, Lemma 5 from (Díaz et al. 2016) applies and yields the desired coupling between the processes with parameters (S, δ) and (S′, δ′).

Lemma 3. Let G be an undirected graph, S a set of active nodes, and A a set of mutants. If A ∩ S ≠ ∅ then fp∞(GS, A) = 1.

Proof. Due to Lemma 2, it suffices to prove the statement when A = {u} is a singleton set. Our proof is by a stochastic domination argument. In particular, we construct a simple Markov chain M and a coupling between M and the Moran process on G started from X = {u}, such that fp∞(GS, A) is at least the probability that a random walk on M started from a particular start state su gets absorbed in a final state sn.

Let ℓ = 2 + |{(u, v) ∈ E}|, i.e., ℓ equals 2 plus the number of neighbors of u. M consists of the following set of states:

  S = {s0, su, su∗} ∪ {si : ℓ ≤ i ≤ n}.

The transition probability function p : S × S → [0, 1] is as follows, for constants c1, c2 that depend on G but not on δ:

  p(su, s0) = c1 / (1 + δ),      p(su, su∗) = 1 − p(su, s0),
  p(su∗, sℓ) = p(si, si+1) = c2   for ℓ ≤ i < n,
  p(su∗, su) = p(si, su) = 1 − c2   for ℓ ≤ i ≤ n,

while for every state s, the remaining probability mass is added as a self-loop probability p(s, s).
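As a numeric illustration (our own, not part of the proof), with assumed constants c1, c2 and a short ladder of states, the probability that a walk on M started at su reaches sn before s0 indeed approaches 1 as δ grows. We treat reaching sn as success and compute hitting probabilities by value iteration:

```python
# Illustration with assumed constants c1 = 0.9, c2 = 0.5 and a 4-state ladder.
# h[s] approximates P[reach s_n before s_0 | start at s]; self-loops do not
# affect hitting probabilities, so they are omitted.
def success_prob(delta, c1=0.9, c2=0.5, ladder=4, iters=5000):
    q = c1 / (1 + delta)              # p(s_u, s_0): chance the mutant dies
    h = [0.0] * (2 + ladder)          # states: s_u, s_u*, s_l, ..., s_{n-1}
    for _ in range(iters):            # value iteration, converges from below
        new = h[:]
        new[0] = (1 - q) * h[1]                      # s_u: to s_u*, else s_0
        new[1] = c2 * h[2] + (1 - c2) * h[0]         # s_u*: to s_l, else s_u
        for i in range(2, 2 + ladder):
            up = h[i + 1] if i + 1 < 2 + ladder else 1.0  # top step hits s_n
            new[i] = c2 * up + (1 - c2) * h[0]
        h = new
    return h[0]

# The success probability increases with delta and tends toward 1.
probs = [success_prob(d) for d in (0, 10, 1000, 100000)]
```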

We now sketch the coupling between M and the positional Moran process on G. The coupling guarantees the following correspondence between the states of M and the configuration X of the Moran process.

1. For the state su, we have u ∈ X.
2. For the state su∗, we have (i) u ∈ X, and (ii) for every (u, v) ∈ E, we have v ∈ X.
3. For each state si with ℓ ≤ i ≤ n, we have (i) u ∈ X, (ii) for every (u, v) ∈ E, we have v ∈ X, and (iii) |X| ≥ i.

We argue that the transition probabilities preserve the above correspondence.

1. While u ∈ X, the probability that a resident neighbor v of u reproduces and places an offspring on u is bounded by ∝ 1/(1 + δ). On the other hand, the probability that u reproduces in successive rounds until it places a mutant offspring on each of its neighbors can be exponentially small in n but is independent of δ: since u ∈ S, the fitness of u is at least a fraction 1/n of the total population fitness, regardless of the remaining nodes. It follows that, from a configuration X with u ∈ X, the probability that u turns resident before every neighbor of u turns mutant is upper-bounded by c/(1 + δ), for some constant c that depends on G but not on δ.

2. Given any configuration X with u ∈ X, the probability that a resident reproduces is at most ∝ 1/(1 + δ). Similarly, the probability that a mutant reproduces and replaces a resident is at least ∝ 1/(1 + δ). Thus, the probability that the latter event happens before the former event is lower-bounded by some constant c2 that depends on G but not on δ.

Thus, we have a coupling under which fp∞(GS, {u}) is at least as large as the probability that a random walk in M starting at su gets absorbed in sn. It is straightforward to verify that the latter probability tends to 1 as δ → ∞. Thus fp∞(GS, {u}) = 1, as desired.

Theorem 1. FM∞(G, k) and FM(G, δ, k) are NP-hard, even on undirected regular graphs.

Proof. Following Lemma 4, the hardness of FM∞ follows directly by reducing the question of whether a graph G has a vertex cover of size k to the problem of whether FM∞ can achieve fixation probability at least (n + k)/(2n). For the hardness of FM, recall that for any S ⊆ V, the function fp(GS, δ) is continuous in δ. A crude but slightly more detailed analysis in the proof of Lemma 4 shows that if we have an edge (u, v) ∈ E with u, v ∉ S, then the fixation probability from u is

  fp∞(GS, {u}) < 1/2 − 1/(2n⁴).

In turn, if S is not a vertex cover of G, then

  fp∞(GS) < c = (n + |S|)/(2n) − 1/(2n⁵).

Due to the continuity of fp(GS, δ), there exists a large enough δ∗ such that, if G has a vertex cover S of size k, then fp(GS, δ∗) ≥ c. On the other hand, due to monotonicity (Lemma 2), if G has no vertex cover of size k, then we have

  max_{S⊆V, |S|=k} fp(GS, δ∗) ≤ max_{S⊆V, |S|=k} fp∞(GS) < c.

Thus, for such a fitness advantage δ∗, we can achieve fixation probability c if and only if G has a vertex cover of size k.

Lemma 5. Given an undirected graph G = (V, E) on n nodes, a set S ⊆ V, and a real number δ ≥ 1, we have T(GS, δ) ≤ (1 + δ)n⁶.

Proof. We follow the proof strategy of Theorem 11 from (Díaz et al. 2012). Given a set X of nodes occupied by mutants, consider the potential function φ(X) = ∑_{u∈X} 1/deg(u), where deg(u) is the number of edges incident to u. Note that 0 ≤ φ(X) ≤ n, since 0 ≤ |X| ≤ n and deg(u) ≥ 1 for each u ∈ V. Moreover, given a configuration Xt at time t, let ∆t = φ(Xt+1) − φ(Xt) be a random variable that measures the change of the potential function φ in a single step. We make two claims:

1. E[∆t | Xt] ≥ 0: Let ∂Xt be the set of edges (u, v) such that u is occupied by a mutant and v by a resident, and let F = ∑_{u∈V} fXt,S(u) be the total fitness of the population. In a single step, the set of nodes occupied by mutants changes either by a mutant replacing a resident neighbor, or by a resident replacing a mutant neighbor. Thus

  E[∆t | Xt] = ∑_{(u,v)∈∂Xt} (fXt,S(u)/F) · (1/deg(u)) · (+1/deg(v)) + ∑_{(u,v)∈∂Xt} (fXt,S(v)/F) · (1/deg(v)) · (−1/deg(u))
             = ∑_{(u,v)∈∂Xt} (fXt,S(u) − fXt,S(v)) / (F · deg(u) · deg(v)) ≥ 0,

where the last inequality holds since the fitness of any mutant is at least as large as that of any neighboring resident.

2. If ∅ ⊊ Xt ⊊ V, then P[∆t ≤ −1/n | Xt] ≥ 1/((1 + δ)n²): If the population is not homogeneous, then there exists at least one edge (u, v) ∈ E with u ∉ Xt and v ∈ Xt. With probability p_u = fXt,S(u)/F ≥ 1/((1 + δ)n), the agent at u is selected for reproduction, and with probability p_{u→v} = 1/deg(u) ≥ 1/n its offspring migrates to v, which changes the potential function by −1/deg(v) ≤ −1/n.

By Item 1, the potential function φ gives rise to a submartingale. Moreover, the function φ is bounded by B = n, and by Item 2 we have V = Var[φ(Xt+1) | Xt] ≥ p_u · p_{u→v} · (1/n)² ≥ 1/((1 + δ)n⁴). Thus, the standard martingale machinery of drift analysis applies (Kötzing and Krejca 2019). Namely, the rescaled function φ · (φ − 2B) + B² satisfies the conditions of the upper additive drift theorem (with initial value at most B² and step-wise drift at least V). The expected time until termination is thus at most

  T(GS, δ) ≤ B²/V ≤ (1 + δ)n⁶.
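Since Lemma 5 bounds the expected absorption time, every simulated run of the positional Moran process terminates, which justifies Monte Carlo estimation of fp(GS, δ) as used by the Lazy Greedy heuristic. A minimal estimator can be sketched as follows (our own encoding and names, not the paper's implementation):

```python
import random

# Minimal Monte Carlo estimator for fp(G_S, delta): simulate the positional
# Moran process from a uniformly random initial mutant until fixation or
# extinction, and average the fixation indicator over many runs.
def simulate_once(adj, active, delta, rng):
    n = len(adj)
    mutants = {rng.randrange(n)}          # uniformly random initial mutant
    while 0 < len(mutants) < n:
        # Positional fitness: 1 + delta only for mutants on active nodes.
        fit = [1 + delta if (u in mutants and u in active) else 1.0
               for u in range(n)]
        u = rng.choices(range(n), weights=fit)[0]  # reproducer ~ fitness
        v = rng.choice(adj[u])                     # offspring replaces neighbor
        if u in mutants:
            mutants.add(v)
        else:
            mutants.discard(v)
    return len(mutants) == n                       # fixation vs. extinction

def estimate_fp(adj, active, delta, runs=2000, seed=0):
    rng = random.Random(seed)
    return sum(simulate_once(adj, active, delta, rng) for _ in range(runs)) / runs
```

As a sanity check, with an empty active set the process is neutral, so on a triangle the estimate should be close to 1/n = 1/3, while activating all nodes with a large δ should push the estimate far above that.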

Lemma 7. Let G = (V, E) be a graph with n nodes u1, . . . , un and m edges. Consider the function α : V → R defined by

  α(ui) = (1/n) · ∑_{j∈[n]} pij · πj · ψij,

where {πi | ui ∈ V} is the solution to the linear system

  πi = (1 − ∑_{j∈[n]} pji) · πi + ∑_{j∈[n]} pij · πj   for all i ∈ [n],   (11)

and {ψij | (ui, uj) ∈ E} is the solution to the linear system

  ψij = (1 + ∑_{ℓ∈[n]} (pℓi · ψℓj + pℓj · ψiℓ)) / (∑_{ℓ∈[n]} (pℓi + pℓj))   if i ≠ j and (ui, uj) ∈ E,
  ψij = 0   otherwise.   (12)

Then fp′(GS, 0) = ∑_{ui∈S} α(ui).

Proof. Given a set S, we write λi = 1 to indicate that ui ∈ S (and λi = 0 otherwise). Given any configuration X ⊂ V, we write xi = 1 to indicate that ui ∈ X (and xi = 0 otherwise). We also define a function φ(X) = ∑_{i∈[n]} πi · xi and we let

  ∆(X, δ) = E[φ(Xt+1) − φ(Xt) | Xt = X]   (15)

be its expected change in a single step of the Moran process with fitness advantage δ at nodes S. Let x̄ = (1/n) ∑_{i∈[n]} λi xi. Then we can write

  ∆(X, δ) = ∑_{i,j∈[n]} ((1 + δ · λi xi) / (n(1 + δ · x̄))) · pij · (xi − xj) · πj.   (16)

Finally, the derivative of ∆(X, δ) at δ = 0 is

  ∆′(X) = d/dδ |_{δ=0} ∆(X, δ)
        = (1/n) ∑_{i,j∈[n]} (λi xi − x̄) · pij · (xi − xj) · πj
        = (1/n) ∑_{i,j∈[n]} λi xi · pij · (xi − xj) · πj − (x̄/n) ∑_{i,j∈[n]} pij · (xi − xj) · πj
        = (1/n) ∑_{i,j∈[n]} λi · pij · πj · xi (1 − xj),   (17)

where in the last equality we used that xi² = xi and that the second sum vanishes, since all the terms that involve any fixed xi sum up to xi · ∑_{j∈[n]} (pij · πj − pji · πi) = 0 (due to Eq. (11)).

Now consider the neutral positional Moran process (δ = 0) starting with P[X0 = {ui}] = 1/n for i ∈ [n]. For any i, j ∈ [n] and t ≥ 0, consider the event Yij(t) defined as “at time t we have ui ∈ Xt and uj ∉ Xt”, and let ψij = ∑_{t=0}^∞ P[Yij(t)] be the total expected time spent with ui being a mutant and uj not. Then by (McAvoy and Allen 2021, Theorem 1) we have

  fp′(GS, 0) = ∑_{t=0}^∞ ∑_{X⊂V} P[Xt = X] · ∆′(X)
             = (1/n) ∑_{i,j∈[n]} λi · pij · πj · ψij = ∑_{ui∈S} α(ui).

It remains to show that the quantities ψij satisfy the linear system of Eq. (12). For i = j we clearly have ψii = 0. For i ≠ j we write

  ψij = P[Yij(0)] + ∑_{t=0}^∞ P[Yij(t + 1)].   (18)

The first term is simply 1/n. Let eij = (1/n) pij be the probability that, in a single step of the standard Moran process (δ = 0), node ui is selected for reproduction and places its offspring on uj. Then we can rewrite each term of the sum in terms of events at time t as follows:

  P[Yij(t + 1)] = ∑_{ℓ∈[n]} eℓi · P[Yℓj(t)] + ∑_{ℓ∈[n]} eℓj · P[Yiℓ(t)] + (1 − ∑_{ℓ∈[n]} (eℓi + eℓj)) · P[Yij(t)].

Summing over t and plugging this into Eq. (18), we get

  ψij = (1/n + ∑_{ℓ∈[n]} eℓi · ψℓj + ∑_{ℓ∈[n]} eℓj · ψiℓ) / (∑_{ℓ∈[n]} eℓi + ∑_{ℓ∈[n]} eℓj),

and, after multiplying both the numerator and the denominator by n, we arrive at the linear system of Eq. (12).

This concludes the proof.

Graph from weak-selection experiments. Fig. 6 illustrates a challenging graph for FM0, along with the activation choices of the different heuristics.

SWeak = {8, 12},  SHigh = STemp = SVert = {7, 3},  SCent = {7, 8}

Figure 6: Node activation for weak selection on a challenging graph of n = 26 nodes, which shows as an outlier in Fig. 5. The budget is k = ⌊10% · n⌋ = 2. The choices of the different heuristics are shown.