Non-Uniform Random Spanning Trees on Weighted Graphs



M. MOSBAH, N. SAHEB
LaBRI†, Université Bordeaux-I
351, cours de la Libération, 33405 Talence, France.
E-mail: {mosbah,[email protected]

Abstract

We study random walks on undirected graphs with weighted edges. Our main result shows that any spanning tree defined by the edges corresponding to a first visit of a vertex appears with a probability proportional to its weight, which is the product of the weights of its edges. This provides an algorithm for generating non-uniform random spanning trees in a weighted graph. The technique used here is based on linear equations over regular expressions and finite automata theory.

Résumé

We study random walks on undirected weighted graphs. We show, in particular, that the probability of generating a spanning tree is proportional to its weight, which is the product of the weights of its edges. This result yields an algorithm for generating non-uniform spanning trees in a weighted graph. The techniques used rely on systems of equations over rational languages and on the theory of finite automata.

This work has been supported by the ESPRIT Basic Research Working Group "Computing by Graph Transformations II".
† Laboratory associated with the CNRS.

1 Introduction

Random walks have been studied extensively and have many applications, such as the generation of random spanning trees [1, 2, 12], token management schemes [8, 11], the effective resistance of electrical networks [3, 10, 5], and on-line algorithms [4].

In this paper, we consider a connected simple undirected graph $G = (V, E)$ together with a positive real-valued map $w$ over $E$, $w(e)$ being called the weight of $e$. A discrete-time random walk (or Markov chain) on $G$ is defined as follows. At each step, a particle located on a vertex $x$ moves to a neighbour vertex $y$ with a probability proportional to the weight of the edge $(x, y)$; that is, the probability of the transition from $x$ to a neighbour $y$ is

$$\frac{w((x, y))}{\sum_{z \in N(x)} w((x, z))},$$

where $N(x)$ is the set of neighbours of $x$. The spanning tree generated by the random walk is the spanning tree consisting of those edges which correspond to the first entrance to each vertex of $G$ other than the starting vertex.

In the case of uniform random walks (where $w(e) = 1$ for every $e \in E$), Aldous and Broder proved that all spanning trees of $G$ have the same probability of being generated, whatever the starting vertex.

For non-uniform random walks, the set of walk behaviours providing a given spanning tree, for a fixed starting vertex, can be modeled by a regular expression, characterised by a linear system of equations. Translating this system into an arithmetic system of equations over the corresponding probabilities, we prove that the probability of generating a tree is proportional to the product of the weights of its edges. We also show how to find the stationary probability of a vertex, i.e. the probability that a random walk is currently at that vertex. This probability is proportional to the weight of the vertex.

We also consider the case of a weighted cycle. A random walk can be viewed as a particle that goes randomly from a vertex to one of its two neighbours. When the particle has visited all vertices, it has gone across all edges of the cycle except one, which has been left out. We prove that the probability of an edge being left out does not depend on the starting vertex; it is proportional to the inverse of the weight of the edge.

The paper is organised as follows. In Section 2, we introduce some notations and definitions related to words and random walks. The probability of a generated spanning tree is discussed in Section 3. In this section, we also investigate the stationary probability of a vertex. Section 4 deals with the particular class of cycle graphs; in particular, we compute the probability for an edge of a cycle to be left out during a random walk. Section 5 is a conclusion presenting some open problems and further extensions of this work.

2 Notations and definitions

We denote by $G = (V, E)$ a simple undirected connected weighted graph. Weights are positive reals as defined above.

A finite walk over $G$ is a finite sequence $w = i_1, i_2, \ldots, i_m$ of vertices of $G$ such that $(i_k, i_{k+1}) \in E$, $k = 1, \ldots, m - 1$. An infinite walk over $G$ is an infinite sequence $i_1, i_2, \ldots$ with $(i_k, i_{k+1}) \in E$ for every $k \geq 1$. A walk may also be viewed as a sequence of directed edges $(i_1, i_2), (i_2, i_3), (i_3, i_4), \ldots$. Thus, if we consider an alphabet $A$ whose letters are the elements of $E$ directed in both senses, a finite (infinite) walk, according to this view, is a word (infinite word) over $A$. The set of infinite walks on $G$ starting with vertex $i$ will be denoted by $W_i$.

The tree $\tau(w)$ generated by the walk $w$ is the set of all edges corresponding to the first entrance of the walk into a vertex of $G$ [1, 2]; it is easy to see that $\tau(w)$ is an undirected tree. For a given vertex $i$ and a tree $T$ containing $i$, the language $L_i(T)$ denotes the set of shortest finite walks (in the prefix order) starting with $i$ and generating $T$, i.e.

$$L_i(T) = \{\, w \mid w \text{ is a finite walk starting with } i \text{ such that } \tau(w) = T \text{ and for no proper left factor } w' \text{ of } w,\ \tau(w') = T \,\}.$$

It is easy to construct a finite automaton which recognizes $L_i(T)$. Thus $L_i(T)$ is a regular language. The elements of $L_i(T)$ can be extended into infinite walks in a natural way; more precisely,

$$\overline{L}_i(T) = \{\, w w' \mid w \in L_i(T),\ w' \in W_j \text{ where } j \text{ is the last vertex of } w \,\}.$$

Intuitively, $\overline{L}_i(T)$ consists of the infinite walks starting with $i$ and generating $T$.

Example 1. Let $G$ be the following graph:

[Figure: the graph G on the vertices 1, 2, 3, 4.]

Let $T$ be the tree $\{(4, 1), (4, 2), (4, 3)\}$. For the starting vertex 1, we consider the following scheme:

[Figure: transition scheme for the starting vertex 1 and the tree T.]

Dashed transitions correspond to first entrances. Edges containing arrows in both directions can be scanned by the walk following these directions. Bold edges belong to the tree $T$. Vertex 2 on the left side is the last visited vertex before generating $T$. Similarly for vertex 3 on the right side.

$L_1(T)$ is the set of all finite paths beginning at the starting arrow and ending with one of the exit arrows. $\overline{L}_1(T)$ is the set of all infinite paths beginning with the starting arrow and going infinitely many times through one of the vertices of the bottom rectangles labeled with 2. It is clear that the language of infinite words $\overline{L}_i(T)$ is recognizable [6, Chap. XIV]. $L_i(T)$ and its recognizing automata can be used to compute the cover time (i.e. the expected time necessary to visit all vertices of $G$); see [9].

We now introduce useful definitions related to probability measures on the walks.

For a vertex $i$, let $w(i)$, called the weight of $i$, be the sum of the weights of the edges incident to $i$. The weight of a tree $T$ is defined as the product of the weights of its edges. The probability of the directed edge $(i, j)$ is $p_{ij} = \frac{w(i, j)}{w(i)}$. It is the probability of a transition from $i$ to $j$, and we obviously have $\sum_{j \in N(i)} p_{ij} = 1$, where $N(i)$ is the set of neighbour vertices of $i$. The probability of a finite walk $w = (i_1, i_2), (i_2, i_3), \ldots, (i_{m-1}, i_m)$ is the product of the probabilities of its directed edges; it will be denoted by $p(w)$. Finally, for a vertex $i$ and a tree $T$, we define

$$p_i(T) = \sum_{w \in L_i(T)} p(w) = p(L_i(T)).$$

It is easy to see that, for a spanning tree $T$, $p_i(T)$ is the probability of generating $T$ whenever the walk starts at $i$. To see this, it suffices to observe that the words of $L_i(T)$ correspond to pairwise disjoint events which generate $T$, and that $T$ cannot be generated by anything other than these events. Since the elements of $\overline{L}_i(T)$ are obtained from those of $L_i(T)$ by adding all possible extensions, $L_i(T)$ and $\overline{L}_i(T)$ have the same probability and therefore $p_i(T) = p(\overline{L}_i(T))$.

3 Random spanning trees

In the sequel, the set of trees which are subgraphs of $G$ is denoted by $\mathcal{T}$ and the set of spanning trees of $G$ by $\mathcal{S}$.

Lemma 3.1 For any vertex $i$, we have

$$\sum_{T \in \mathcal{S}} p_i(T) = 1.$$

Proof. Starting from $i$, all vertices of $G$ will eventually be visited with probability 1. On the other hand, the languages $L_i(T)$, $T \in \mathcal{S}$, are pairwise disjoint sets. The lemma follows.

Let $i$ be a vertex and $T$ a spanning tree in $G$. For a neighbour vertex $j$ of $i$ such that $(i, j) \notin T$, the set $T \cup \{(i, j)\}$ contains a unique cycle. In the following lemma, we denote by $\bar{j}$ the unique vertex on this cycle such that $(i, \bar{j})$ belongs to $T$, and we set $T_j = T \cup \{(i, j)\} \setminus \{(i, \bar{j})\}$.

Lemma 3.2 For $i$ and $T$ as above, we have

$$\overline{L}_i(T) = \sum_{j : (i,j) \in T} (i, j)\, \overline{L}_j(T) + \sum_{j \in N(i) : (i,j) \notin T} (i, \bar{j})\, \overline{L}_{\bar{j}}(T_j) + \sum_{j : (i,j) \in T} (i, j)\, K_j$$

where

$$K_j = \begin{cases} \overline{L}_j(T') & \text{if } T \setminus \{(i, j)\} \text{ is a tree } T' \text{ containing } j, \\ \emptyset & \text{otherwise.} \end{cases}$$

It should be noted here that the last sum $\sum_{j : (i,j) \in T} (i, j) K_j$ contains at most one term, and this is only the case when $i$ is a leaf of $T$ and $j$ is its father.

Proof. It is clear that the left member of the equation is a subset of the right one. A simple case-by-case verification shows that the right member is also a subset of the left one.

Example 2. Consider again the graph given in Example 1. Let $T = \{(1, 2), (2, 4), (4, 3)\}$; then

$$\overline{L}_2(T) = (2, 1)\, \overline{L}_1(T) + (2, 4)\, \overline{L}_4(T) + (2, 4)\, \overline{L}_4(\{(1, 2), (2, 3), (3, 4)\}).$$

We now state the main result.

Proposition 3.1 The probability of generating a given spanning tree is proportional to its weight.

Proof. The previous lemma provides $|V| \times |\mathcal{S}|$ equations over subsets of $A^\infty$ ($A^\infty$ is the set of infinite words over $A$). Since the terms of each sum are pairwise disjoint, the equations on the languages can be transformed into equations on probabilities. Using the fact that, for a non-spanning tree $U$, the probability $p(\overline{L}_i(U)) = 0$, we get, for all $i \in V$ and all $T \in \mathcal{S}$,

$$p_i(T) = \sum_{j : (i,j) \in T} \frac{w(i, j)}{w(i)}\, p_j(T) + \sum_{j \in N(i) : (i,j) \notin T} \frac{w(i, \bar{j})}{w(i)}\, p_{\bar{j}}(T_j). \qquad (1)$$

Now, by Lemma 3.1, we must also have

$$\forall i \in V, \quad \sum_{T \in \mathcal{S}} p_i(T) = 1.$$

Clearly, the vector $p_i(T) = \frac{w(T)}{\sum_{T' \in \mathcal{S}} w(T')}$ is a solution of the above system of linear equations. It remains, therefore, to prove that the solution is unique. Let us consider the finite Markov chain with the set of states $\{(i, T);\ i \in V,\ T \in \mathcal{S}\}$ whose stationary probabilities are a solution of the above system of linear equations. To show the uniqueness of the solution, it suffices to prove that this Markov chain is irreducible [7, Chap. XV]. The first terms of equations (1) show that, for a fixed $T$, a $k$-step transition between two states $(i, T)$ and $(i', T)$ is possible for any pair of vertices $i$ and $i'$. The second terms of equations (1) correspond to a transition of the type $(i, T) \to (\cdot, T_j)$. This shows that a transition $(\cdot, T_1) \to (\cdot, T_2)$ is possible whenever $T_1$ and $T_2$ differ by only one edge. Since any spanning tree may be obtained from any other one by a sequence of insertion-deletions, the introduced chain is irreducible and the proposition follows.

During a random walk, a vertex can be visited many times. The question that we shall now answer is: what is the probability that the currently visited vertex is a given vertex $i$? We shall show that it is proportional to the weight of $i$. Let $\pi$ be the stationary distribution over the set of vertices. That is, $\pi(i)$ is the probability, under the stationary regime, that a random walk is currently at vertex $i$, whatever its starting vertex.

Proposition 3.2 The stationary probability of any vertex is proportional to its weight. Indeed,

$$\pi(i) = \frac{w(i)}{\sum_{j \in V} w(j)}.$$

Proof. The vector $(\pi(i))_{i \in V}$ satisfies the system

$$\pi(i) = \sum_{j \in N(i)} p_{ji}\, \pi(j), \quad \forall i \in V.$$

We also have the equation $\sum_{i \in V} \pi(i) = 1$. Using techniques similar to those above, we can show that the vector $\pi(i) = \frac{w(i)}{\sum_{j \in V} w(j)}$, $i \in V$, is the unique solution of the system. This ends the proof.

Note that, for the particular case where the weight of every edge is 1, the stationary probability of vertex $i$ is $\frac{d(i)}{2|E|}$, where $d(i)$ is the degree of $i$ and $|E|$ the cardinality of the set of edges.
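The two statements of this section can be illustrated with a short Python sketch. It is not taken from the paper: the function names (`adjacency`, `random_spanning_tree`, `stationary_distribution`, `is_stationary`) and the weights in `EXAMPLE_EDGES` are hypothetical, chosen only for illustration, and the graph is assumed to be connected (otherwise the walk would never terminate). The first function simulates the weighted walk and keeps the first-entrance edges; the others compute $\pi(i) = w(i) / \sum_j w(j)$ with exact fractions and check it against the system $\pi(i) = \sum_{j \in N(i)} p_{ji} \pi(j)$.

```python
import random
from fractions import Fraction


def adjacency(weighted_edges):
    """Map each vertex to {neighbour: edge weight} for an undirected graph."""
    adj = {}
    for (u, v), w in weighted_edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def random_spanning_tree(weighted_edges, start, rng=random):
    """Walk from `start` until all vertices are visited; return first-entrance edges.

    Assumes the graph is connected and all weights are positive.
    """
    adj = adjacency(weighted_edges)
    visited = {start}
    tree = set()
    current = start
    while len(visited) < len(adj):
        neighbours = list(adj[current])
        weights = [adj[current][v] for v in neighbours]
        # Move to a neighbour with probability proportional to the edge weight.
        nxt = rng.choices(neighbours, weights=weights, k=1)[0]
        if nxt not in visited:
            # First entrance into `nxt`: the edge joins the generated tree.
            visited.add(nxt)
            tree.add(frozenset((current, nxt)))
        current = nxt
    return tree


def stationary_distribution(weighted_edges):
    """Return pi(i) = w(i) / sum_j w(j), as in Proposition 3.2."""
    adj = adjacency(weighted_edges)
    vertex_weight = {i: sum(Fraction(w) for w in adj[i].values()) for i in adj}
    total = sum(vertex_weight.values())
    return {i: vertex_weight[i] / total for i in adj}


def is_stationary(weighted_edges, pi):
    """Check pi(i) == sum over neighbours j of p_ji * pi(j), exactly."""
    adj = adjacency(weighted_edges)
    for i in adj:
        incoming = Fraction(0)
        for j, w_ji in adj[i].items():
            w_j = sum(Fraction(w) for w in adj[j].values())
            incoming += Fraction(w_ji) / w_j * pi[j]
        if incoming != pi[i]:
            return False
    return True


# Hypothetical weighted graph on four vertices (weights chosen for illustration).
EXAMPLE_EDGES = {(1, 2): 1, (2, 3): 2, (3, 4): 1, (1, 4): 3, (2, 4): 2}

tree = random_spanning_tree(EXAMPLE_EDGES, start=1, rng=random.Random(0))
print(sorted(tuple(sorted(edge)) for edge in tree))
print(stationary_distribution(EXAMPLE_EDGES))
```

Exact fractions are used for the stationary check so that equality can be tested without rounding error; the simulation itself relies on the standard pseudo-random generator and is only as good as that generator.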

4 Cycle graph

Let $G$ be an undirected connected cycle, also called a ring. Consider a particle that moves on $G$. At each step, the particle goes from the current vertex to one of its two neighbours, with a probability proportional to the weight of the corresponding edge. Starting at a vertex, it is clear that the first time the particle has visited all vertices, it has gone through all edges of the cycle except one, called the left-out edge [2]. Surprisingly, even for weighted graphs, the probability of an edge being the left-out one does not depend on the starting vertex; it is proportional to the inverse of its weight.

Proposition 4.1 Let $G = (V, E)$ be a cycle graph, and let $e$ be an edge in $G$. Then the probability $p(e)$ for $e$ to be left out is

$$p(e) = \frac{1}{\sum_{f \in E} \frac{1}{w(f)}} \times \frac{1}{w(e)}.$$

Proof. This is a consequence of Proposition 3.1. In fact, the probability of an edge $e$ being left out is the same as the probability of generating the spanning tree containing all other edges. This tree has weight $\prod_{f \in E} w(f) / w(e)$; writing the probability of generating such a tree and dividing by the sum of these weights over all edges, we obtain the result.

Note that for the particular case where every edge has weight 1, the left-out edge induced by a random walk is uniformly distributed over the edges of the cycle.

5 Conclusion

In this paper we used regular expressions and automata to deal with random walks on weighted graphs. The advantage of using automata is that they memorize the vertices already visited. A transition is created as soon as a new vertex is visited. Therefore, the execution of the automaton generates a random spanning tree. An interesting extension of this work is to use this technique to compute the cover time of a random walk on a weighted graph, i.e. the expected time to visit all vertices at least once, which is also the expected time to generate a random spanning tree. Another useful extension would be to investigate the cover time of random walks on particular classes of weighted graphs. For example, trees, cycles, chains and cliques are classes of graphs for which the cover time seems to be efficiently computable.
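As a numerical companion to Proposition 4.1, the following Python sketch computes the left-out probabilities of a weighted cycle in two ways: directly from the formula $p(e) = \frac{1/w(e)}{\sum_f 1/w(f)}$, and from the weights of the spanning trees obtained by removing one edge, as in the proof. The function names and the sample weights are hypothetical and serve only as an illustration; exact fractions are used so that the two computations can be compared for equality.

```python
from fractions import Fraction
from math import prod


def left_out_probabilities(weights):
    """Formula of Proposition 4.1; weights[k] is the weight of the k-th cycle edge."""
    inverses = [1 / Fraction(w) for w in weights]
    total = sum(inverses)
    return [inv / total for inv in inverses]


def left_out_via_tree_weights(weights):
    """Same probabilities, from the weight of the tree that omits each edge."""
    ws = [Fraction(w) for w in weights]
    # The spanning tree omitting edge k has weight equal to the product of the others.
    tree_weights = [prod(ws[:k] + ws[k + 1:]) for k in range(len(ws))]
    total = sum(tree_weights)
    return [t / total for t in tree_weights]


# Hypothetical cycle with four edges of weights 1, 2, 3 and 6.
sample = [1, 2, 3, 6]
print(left_out_probabilities(sample))
print(left_out_via_tree_weights(sample))
```

For the sample weights, the inverses are 1, 1/2, 1/3 and 1/6, which sum to 2, so both functions give the probabilities 1/2, 1/4, 1/6 and 1/12: the lightest edge is the most likely to be left out.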

References

[1] D.J. Aldous. The random walk construction of uniform spanning trees and uniform labelled trees. SIAM Journal on Discrete Mathematics, 3(4):450–465, 1990.
[2] A.Z. Broder. Generating random spanning trees. In Proc. 30th Ann. IEEE Symp. on Foundations of Computer Science, pages 442–453, October 1989.
[3] A.K. Chandra, P. Raghavan, W.L. Ruzzo, R. Smolensky, and P. Tiwari. The electrical resistance of a graph captures its commute and cover times. In ACM Symposium on Theory of Computing (STOC), pages 574–586, 1989.
[4] D. Coppersmith, P. Doyle, P. Raghavan, and M. Snir. Random walks on weighted graphs and applications to on-line algorithms. Journal of the ACM, 40(3):421–453, July 1993.
[5] P.G. Doyle and J.L. Snell. Random walks and electrical networks. The Mathematical Association of America, 1984.
[6] S. Eilenberg. Automata, Languages, and Machines, volume A. Academic Press, New York, 1974.
[7] W. Feller. An introduction to probability theory and its applications, volume 1. 2nd ed. Wiley, New York, 1957.
[8] A. Israeli and M. Jalfon. Token management schemes and random walks yield self-stabilizing mutual exclusion. In Proc. of the Ninth Annual Symposium on Principles of Distributed Computing, pages 119–131, 1990.
[9] M. Mosbah and N. Saheb. Using automata to compute cover time. Technical report, University of Bordeaux 1, 1996. In preparation.
[10] P. Tetali. Random walks and the effective resistance of networks. Journal of Theoretical Probability, 4:101–109, 1991.
[11] P. Tetali and P. Winkler. On a random walk arising in self-stabilizing token management. In Proceedings of the Tenth Annual ACM Symposium on Principles of Distributed Computing, pages 273–280, 1991.
[12] D.B. Wilson and J.G. Propp. How to get an exact sample from a generic Markov chain and sample a random spanning tree from a directed graph, both within the cover time. In Proceedings of the Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, pages 448–457, Atlanta, Georgia, 28–30 January 1996.
