Asymptotically Optimal Solutions for Small World Graphs

Theory Comput Syst (2008) 42: 632–650
DOI 10.1007/s00224-007-9073-y

Michele Flammini · Luca Moscardelli · Alfredo Navarra · Stéphane Pérennes

Published online: 18 October 2007
© Springer Science+Business Media, LLC 2007

Abstract We consider the problem of determining constructions with an asymptotically optimal oblivious diameter in small world graphs under Kleinberg’s model. In particular, we give the first general lower bound holding for any monotone distance distribution, that is, one induced by a monotone generating function. Namely, we prove that the expected oblivious diameter is $\Omega(\log^2 n)$ even on a path of $n$ nodes. We then focus on deterministic constructions and, after showing that the problem of minimizing the oblivious diameter is generally intractable, we give asymptotically optimal solutions, that is, with a logarithmic oblivious diameter, for paths, trees and Cartesian products of graphs, including $d$-dimensional grids for any fixed value of $d$.

Keywords Small world graph · Social networks · Oblivious diameter

1 Introduction

A generic graph G is said to represent a Small-World when each node can “easily and quickly” reach any other one using just local information or, in other words, if the resulting “oblivious” diameter is polylogarithmic in the number of the involved nodes. In the literature such a property is also known as “six degrees of separation”, after Milgram’s experiments [12, 15], which showed an average distance of six hops between any two USA citizens in delivering mail. Recently this property has been extensively studied and formalized in Kleinberg’s works on the so-called “small-world phenomena” [7–9]. The relevance of the topic results from its numerous applications in social, natural and peer-to-peer networks [1, 13, 14, 16], where the common property is the partial knowledge of the environment and the wish to share information.

The research was partially funded by the European project COST Action 293, “Graphs and Algorithms in Communication Networks” (GRAAL).

M. Flammini · L. Moscardelli (✉) · A. Navarra
Computer Science Department, University of L’Aquila, Via Vetoio, 67100 L’Aquila, Italy
e-mail: [email protected]

M. Flammini
e-mail: [email protected]

A. Navarra
e-mail: [email protected]

S. Pérennes
MASCOTTE project, I3S-CNRS/INRIA/University of Nice, Sophia Antipolis, France
e-mail: [email protected]

In Kleinberg’s model, a two-dimensional square mesh is augmented by the random addition of one directed outgoing arc, or “long link”, per node. Thus, besides its four adjacent nodes in the mesh, a generic node $x$ has a further neighbor $y$ chosen with probability

\[ P_x[y] = \frac{1}{H_x \, \mathrm{dist}^2(x,y)}, \]

where $\mathrm{dist}(x,y)$ is the Manhattan distance between $x$ and $y$ in the mesh and $H_x = \sum_{y \neq x} \frac{1}{\mathrm{dist}^2(x,y)}$ is a normalizing coefficient. As a consequence, the closer $y$ is to $x$, the higher the probability of having the long link $(x,y)$. Kleinberg proposed a greedy algorithm for routing on the augmented graph that at each step always chooses the link (possibly the long one) whose endpoint is closest to the target node according to the original distances without long links. He proved an $O(\log^2 n)$ expected number of hops performed by such an algorithm for connecting every source-destination pair. Starting from such results, considerable effort has been devoted to investigating several variants of the original model in terms of network topology, number and probability distribution of the long links, starting knowledge of each node and so forth (see for instance [3–5, 10, 11]).
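The model and the greedy rule described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the mesh side of 16 and the fixed random seed are assumptions chosen only for illustration. Each node draws one long link with probability proportional to $1/\mathrm{dist}^2$, and routing always moves to the out-neighbor closest to the target in mesh distance.

```python
import random


def manhattan(a, b):
    """Manhattan distance between two mesh nodes given as (row, col)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def sample_long_link(x, side, rng):
    """Draw the long-link endpoint of x with probability proportional to 1/dist^2."""
    nodes = [(i, j) for i in range(side) for j in range(side) if (i, j) != x]
    weights = [1.0 / manhattan(x, y) ** 2 for y in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


def greedy_route(src, dst, side, long_link):
    """Count hops when always moving to the out-neighbor closest to dst."""
    hops, cur = 0, src
    while cur != dst:
        i, j = cur
        candidates = [
            (i + di, j + dj)
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= i + di < side and 0 <= j + dj < side
        ]
        candidates.append(long_link[cur])  # the single long link of cur
        cur = min(candidates, key=lambda y: manhattan(y, dst))
        hops += 1
    return hops


rng = random.Random(0)  # fixed seed, only to make the sketch repeatable
side = 16               # hypothetical mesh side
long_link = {
    (i, j): sample_long_link((i, j), side, rng)
    for i in range(side)
    for j in range(side)
}
hops = greedy_route((0, 0), (side - 1, side - 1), side, long_link)
```

Since some mesh neighbor always decreases the distance to the target by one, the number of hops never exceeds the Manhattan distance between source and destination; long links can only shorten the route.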

1.1 Related Work

The basic Kleinberg’s model [9] was extensively studied in recent years due to its applications and characterizations of several environments. For instance, such a model was used to generate search protocols in peer-to-peer networks [16] and to evaluate the World Wide Web diameter [1].

In [5] and independently in [10] the tightness of Kleinberg’s results is proved for $n \times n$ mesh networks, that is, the greedy routing algorithm performs $\Theta(\log^2 n)$ hops to connect any pair of nodes, while the diameter of the network is $\Theta(\log n)$. In [10], the authors also extend their results to the case in which each node of the grid has the additional knowledge of the long links of the $\log n$ closest ones, obtaining $O(\log^{3/2} n)$ steps for the 2-dimensional case and $O(\log^{1+\frac{1}{d}} n)$ in the general $d$-dimensional model. The $O(\log^{1+\frac{1}{d}} n)$ bound is also achieved in [5] by an oblivious greedy algorithm.

A nice characterization of small-world graphs is given in [11], where a limit separating small-world from “large-world” graphs is defined. Namely, considering a $d$-dimensional grid and a probability distribution of the long links according to the inverse power $r$ of the covered distance, the authors showed a poly-log expected diameter when $d < r < 2d$, but polynomial when $r > 2d$. They also presented a framework to construct classes of small-world graphs with $\Theta(\log n)$ expected diameter.

In [4], treewidth aspects and their relations with social networks are considered. Informally, the treewidth of a graph represents the “distance” of its structure from a tree. Its relevance is given by the fact that many classes of graphs have a bounded treewidth, like for instance trees, outer-planar graphs and series-parallel graphs, and many NP-hard problems can be polynomially solved when restricted to bounded treewidth graphs. In [4] an almost tight upper bound of $O(\log^2 n)$ on the expected diameter is provided for any graph $G$ of $n$ nodes with bounded treewidth. For the lower bound, in fact, an $\Omega\left(\frac{\log^2 n}{\log\log n}\right)$ bound holds on rings of $n$ nodes [2]. Such a lower bound applies also to all independent link distributions in the directed case (instead of just monotone distributions), and allows for $l \geq 1$ extra edges per node at the cost of reducing the lower bound (to $\Omega\left(\frac{\log^2 n}{l \log\log n}\right)$ in the directed case and $\Omega\left(\frac{\log^2 n}{l^2 \log\log n}\right)$ in the undirected case).

1.2 Our Contribution

In this paper we consider the problem of determining constructions with an asymptotically optimal oblivious diameter in small world graphs under Kleinberg’s model.

In particular, we give the first general lower bound holding for any non-increasing distance distribution, that is, one induced by a non-increasing generating function. Namely, we prove that the expected oblivious diameter is $\Omega(\log^2 n)$ even on a path of $n$ nodes. Such a result is particularly relevant because, as shown in [9, 11], only highly decreasing distance distributions are expected to have good performances. However, we extend the lower bound also to non-decreasing distance distributions, thus including all the monotone ones.

Even though our results are related to the ones in [2], there are major differences. The expected oblivious diameter investigated in this paper is a slightly different measure from the expected delivery time of [2], which according to Kleinberg’s model is the average delivery time between all the source-destination pairs. As a consequence, while our lower bound is higher than the one of [2] and is obtained by exploiting a novel and unrelated approach, it does not apply to the expected delivery time.

We then focus on deterministic constructions. We first show that the problem of adding at most $k$ outgoing long links per node so as to minimize the oblivious diameter is generally intractable. We then give asymptotically optimal solutions, that is, with a logarithmic oblivious diameter, for paths, trees and Cartesian products of graphs, including $d$-dimensional grids for any fixed value of $d$. Clearly, our upper bounds hold also for the expected delivery time considered by the previous works.

The paper is organized as follows. In the next section we introduce the basic notation and definitions. Section 3 deals with the general lower bound for distance distributions. In Sect. 4 we prove the intractability of the oblivious diameter minimization problem and in Sect. 5 we give the above mentioned asymptotically optimal deterministic constructions for various topologies. Finally, in Sect. 6, we give some concluding remarks and discuss some open questions.

2 Definitions

We model a network as a symmetric digraph $G = (V, A)$ of $n$ nodes in which $V = \{1, \ldots, n\}$. We denote as short links all the edges in $A$. Small World Graphs (SWG) are built from $G$ by adding directed long arcs, also called long links, that are arbitrary couples in $V \times V$.

Definition 1 (Small World Graph) Given a digraph $G = (V, A)$ and $\bar{A} \subset V \times V$, $SWG(G, \bar{A})$ is the digraph $H = (V, A \cup \bar{A})$ obtained by adding to $G$ the long links in $\bar{A}$.

– For any two nodes $x, y \in V$, $\mathrm{dist}(x, y)$ is the distance between $x$ and $y$ in the network $G$, i.e. the length of a minimum path that uses short links only.

– The outdegree and the indegree of $SWG(G, \bar{A})$ are always considered to be the ones of the digraph with arc set $\bar{A}$.

A greedy or oblivious algorithm for routing on $SWG(G, \bar{A})$ at each step chooses one of the links, including the long ones, whose endpoint is closest to the target node according to the distances in $G$.

Definition 2 (Oblivious Routing) In $SWG(G, \bar{A})$

– A path $P = \langle x_0, x_1, x_2, \ldots, x_k \rangle$ from $x_0$ to $x_k$ is said to be oblivious if, for any $i \in \{0, 1, 2, \ldots, k-1\}$, $(x_i, x_{i+1}) \in A \cup \bar{A}$ and for any $y$ such that $(x_i, y) \in A \cup \bar{A}$, $\mathrm{dist}(x_{i+1}, x_k) \leq \mathrm{dist}(y, x_k)$.

– The oblivious distance from $x$ to $y$ in $SWG(G, \bar{A})$ is the maximum length of an oblivious path from $x$ to $y$.

– The oblivious diameter $OD(SWG(G, \bar{A}))$ of $SWG(G, \bar{A})$ is the maximum oblivious distance between two nodes in $V$.

Note that the above definition of oblivious distance differs from the standard definitions of distance since it measures the maximum and not the minimum length among all the oblivious paths. This is due to the fact that we are interested in evaluating the worst case path chosen by the greedy routing algorithm.
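As an illustration of Definition 2, the following Python sketch checks whether a given node sequence is an oblivious path. The representation (adjacency dictionaries `short_adj` and `long_links`) and the six-node example are assumptions made for this sketch, not notation from the paper.

```python
from collections import deque


def bfs_dist(adj, src):
    """Distances from src using short links only (G is symmetric)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_oblivious(path, short_adj, long_links):
    """Check Definition 2 for path = [x0, ..., xk] with target xk."""
    target = path[-1]
    d = bfs_dist(short_adj, target)  # dist(y, target) for every node y
    for cur, nxt in zip(path, path[1:]):
        out = set(short_adj[cur]) | set(long_links.get(cur, ()))
        if nxt not in out:
            return False  # (cur, nxt) is neither a short nor a long link
        if any(d[y] < d[nxt] for y in out):
            return False  # another out-neighbor is strictly closer to the target
    return True


# Hypothetical example: the path 1-2-...-6 with one long link 1 -> 5.
short_adj = {v: [u for u in (v - 1, v + 1) if 1 <= u <= 6] for v in range(1, 7)}
long_links = {1: [5]}
```

In this example, `[1, 5, 6]` is oblivious, while `[1, 2, 3, 4, 5, 6]` is not, because at node 1 the long link to 5 leads strictly closer to the target 6 than the short link to 2.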

Given a network $G$ and an integer $k > 0$, a relevant design goal is that of determining a small world graph $SWG(G, \bar{A})$ with minimum oblivious diameter among the ones with maximum outdegree bounded by $k$.

Definition 3 (Small World Diameter) Given an integer $k > 0$, the small world diameter $OD_k(G)$ of a network $G$ is the minimum oblivious diameter of a small world graph $SWG(G, \bar{A})$ with maximum outdegree at most $k$.

In the basic Kleinberg’s model [9] long links are chosen randomly according to given probabilistic distributions. As a consequence, the oblivious diameter of the resulting small world graph is a random variable and we are interested in determining distributions reducing its expected value.

Since any deterministic construction of a small world graph is a particular probabilistic one, we are mostly interested in “truly” random solutions in which the long link selection probabilities of distinct nodes are independent and have a low dependence on the target nodes. This prevents deterministic constructions from appearing as special cases.

In this respect, a reasonable random distribution is the one in which the probability of a long link depends only on its length.

Definition 4 A probabilistic distribution is distance generated or is a distance distribution if there exists a function $f : \mathbb{N} \to \mathbb{R}^+$ such that

\[ \Pr((x, y) \in \bar{A}) = \frac{f(\mathrm{dist}(x,y))}{\sum_{z \in V} f(\mathrm{dist}(x,z))}. \]

Note that the probabilistic distributions on meshes based on Kleinberg’s model [9] are then induced by the family of functions $f(d) = d^{-\alpha}$.
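A small Python sketch of Definition 4 follows. It is illustrative only: the helper name is hypothetical, and the normalization is assumed to exclude $z = x$ (as in the coefficient $H_x$ of the Introduction), since $f(d) = d^{-\alpha}$ is not defined at $d = 0$.

```python
def distance_distribution(dist_from_x, f):
    """Map each node y != x to f(dist(x, y)) / sum over z != x of f(dist(x, z))."""
    weights = {y: f(d) for y, d in dist_from_x.items() if d > 0}
    total = sum(weights.values())
    return {y: w / total for y, w in weights.items()}


# Hypothetical example: node x = 1 on a path of 5 nodes, f(d) = d^(-alpha).
alpha = 2
dist_from_1 = {y: abs(y - 1) for y in range(1, 6)}
probs = distance_distribution(dist_from_1, lambda d: d ** -alpha)
```

With a decreasing $f$, closer nodes receive a larger share of the probability, which is the behavior the Kleinberg family is designed to have.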

3 Distance Generated SWG

According to the results shown in [9, 11], distance distributions achieving a low oblivious diameter should have the property that the probability of the long links is a highly decreasing function of their length. However, in this section we show that even under such an assumption and with outdegree $k = 1$ we cannot hope to have an expected oblivious diameter of an order lower than $\log^2 n$, and that the same holds in the reverse case in which probabilities increase with the distance, even in a simple path $P_n$ of $n$ nodes.

In order to prove the claim, we prove a slightly stronger statement for a broader class of distance distributions. Namely, given a fixed positive number $p \leq 1$, once an outgoing long link at a given node $x$ has been drawn according to the distance distribution, it is installed or added to the graph with a given probability $r_x \leq p$, otherwise it is removed (with probability at least equal to $1 - p$), independently of all the other long links. We call the resulting distribution a $p$-distribution. Our results will then follow as a corollary of the ones for $p$-distributions by considering the special case $p = 1$.

In the following we restrict to paths $P_n$ of $n$ nodes with node set $V = \{1, \ldots, n\}$ and arc set $A = \{(x, x+1), (x+1, x) \mid 1 \leq x < n\}$. Let $L_x$ be the random variable “length of the long link of $x$”. Given a function $f : \mathbb{N} \to \mathbb{R}^+$, we will always implicitly assume that the probabilities of $L_x$ are given by the distance distribution generated by $f$, that is $\Pr(L_x = d) = a \cdot f(d) / \sum_{y \in V} f(\mathrm{dist}(x, y))$, where $a$ is the number of nodes at distance $d$ from $x$.

The following lemma will be useful for proving the final result.

Lemma 1 Given a path $P_n$ and a non-increasing generating function $f : \mathbb{N} \to \mathbb{R}^+$, for any two nodes $x, y$ such that $x < y \leq n/2$, $\Pr(L_x \leq j) \geq \Pr(L_y \leq j)$ for any $j < x$.

Proof Since $f$ is non-increasing,

\[
\Pr(L_x \leq j) = \frac{2\sum_{i=1}^{j} f(i)}{2\sum_{i=1}^{x-1} f(i) + \sum_{i=x}^{n-x-1} f(i)} \geq \frac{2\sum_{i=1}^{j} f(i)}{2\sum_{i=1}^{y-1} f(i) + \sum_{i=y}^{n-y-1} f(i)} = \Pr(L_y \leq j). \qquad \square
\]
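Lemma 1 can be checked numerically on small paths. The Python sketch below is only an illustration: it computes $\Pr(L_x \leq j)$ directly from the definition of the distance distribution (summing over all nodes $y \neq x$, which is an assumption of this sketch) rather than from the closed form in the proof, and the function names are hypothetical.

```python
def prob_length_at_most(n, x, j, f):
    """Pr(L_x <= j) on the path 1..n for the distance distribution generated by f."""
    others = [y for y in range(1, n + 1) if y != x]
    total = sum(f(abs(x - y)) for y in others)
    near = sum(f(abs(x - y)) for y in others if abs(x - y) <= j)
    return near / total


def lemma1_holds(n, f, eps=1e-12):
    """Check Pr(L_x <= j) >= Pr(L_y <= j) for all x < y <= n/2 and j < x."""
    for x in range(1, n // 2 + 1):
        for y in range(x + 1, n // 2 + 1):
            for j in range(1, x):
                px = prob_length_at_most(n, x, j, f)
                py = prob_length_at_most(n, y, j, f)
                if px < py - eps:  # eps absorbs floating-point rounding
                    return False
    return True
```

For instance, `lemma1_holds(40, lambda d: 1 / d**2)` exercises the lemma for a Kleinberg-like generating function on a path of 40 nodes.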

Theorem 1 For any non-increasing function $f$ and any $p$-distance distribution generated by $f$, for a path $P_n$, $E(OD(SWG(P_n, \bar{A}))) = \Omega\left(\min\left(n, \frac{\log^2(n)}{\sqrt{p}}\right)\right)$.

Proof Before going through the details, let us informally describe the underlying intuition. Having fixed a distinguished source node and a subset of destination nodes met along one direction on the chain, we want to show that there exists a node in the subset reached in at least a given (high) number of oblivious steps. Such a number clearly corresponds to a lower bound on the oblivious diameter. If the parameter $p$ is very low, then the theorem will trivially hold by observing that the expected number of edge traversals needed to meet the first long link along the chain is already greater than the claimed lower bound. On the contrary, if $p$ is sufficiently high, that is, a polylogarithmic fraction, then the following situation holds. If for some fixed integer $L$ the probability of having long links of length at most $L$ is low, then the $p$-distance distribution on the subchain given by the prefix of the nodes at distance $L$ from the source node induces a $p'$-distance distribution with a low value of $p'$. Thus, applying the induction on this subchain, again the claim holds. Finally, provided that for every integer $L$ the probability of having long links of length at most $L$ is sufficiently high, on average the length of the first met long link is sufficiently low. Thus, counting the number of steps needed to meet such a long link and applying the inductive hypothesis on a smaller chain whose source node is the endpoint of the traversed long link, again it is possible to prove the claim.

Starting from such a rough idea, technicalities have to be carefully worked out to fit calculations, to fix suitable values for the subchain lengths and to ensure that when applying the induction the subset of the new destinations is included in the previous one. In fact, under this constraint the path given by the concatenation of the initial edges before the first long link, plus the long link and then the recursively determined oblivious path in the induced subchain, forms an oblivious path for the initial chain.

More precisely, we show that $E(OD(SWG(P_n, \bar{A}))) \geq c \cdot \min\left(n, \frac{\log^2(n)}{\sqrt{p}}\right)$ with $c = 1/n_0$, where $n_0$ is a suitably large (constant) number specified in the sequel.

In order to prove the claim, we show that starting from node $\lceil \frac{n}{4} \rceil$ (from now on called the S-node), at least one node in the destination segment, that is between $\lceil \frac{n}{4} \rceil$ and $\lfloor \frac{n}{2} \rfloor$, has expected oblivious distance at least $c \cdot \min\left(n, \frac{\log^2(n)}{\sqrt{p}}\right)$.

We prove this by induction on the number of nodes in the path. As base of the induction we consider a path $P_{n_0}$, where $n_0$ is the suitably large integer defining the constant $c$. Clearly the claim for $P_{n_0}$ is true as $c = \frac{1}{n_0}$. Now, given any $n > n_0$, assuming the claim true for any $P_{n'}$ with $n' < n$, we prove that it holds also for $P_n$. We divide the proof in the following three cases.

• $p < \frac{1}{n}$. In this case we consider as destination node $\lfloor \frac{n}{2} \rfloor$. The expected oblivious distance is then at least $\frac{n}{8}$. This is due to the fact that the probability of having at least one long link that shortens the path from the S-node to $\lfloor \frac{n}{2} \rfloor$ is at most $1 - \left(1 - \frac{1}{n}\right)^{\lceil \frac{n}{4} \rceil}$, which for $n > 1$ is always less than $\frac{1}{2}$. Thus the claim holds as $c \leq 1/8$, since the expected number of steps to reach $\lfloor \frac{n}{2} \rfloor$ from the S-node is lower bounded by $\frac{n}{8}$.

• $\frac{1}{n} \leq p \leq \frac{1}{\log^4 n}$. Again we consider as destination node $\lfloor \frac{n}{2} \rfloor$. Since the expected oblivious diameter is at least equal to the expected number of steps needed from the S-node to meet the first maintained long link, that is $1/p$, it is sufficient to observe that $\frac{1}{p} = \frac{\log^2 n}{\sqrt{p}} \cdot \frac{1}{\log^2 n \sqrt{p}} \geq \frac{\log^2 n}{\sqrt{p}}$. Thus again the claim holds as $c \leq 1$.

• $p > \frac{1}{\log^4 n}$. In this case we prove the claim by exploiting the inductive hypothesis as described in the remaining part of the proof.

Fig. 1 The subpaths of $\lceil \frac{n}{2^i} \rceil$ nodes

For any $i = 2, 3, \ldots, \lfloor \log n - 4 \log\log n - \log c \rfloor$, let $P^i$ be the subpath of $\lceil \frac{n}{2^i} \rceil$ nodes whose S-node, that is the one placed at one fourth of its length, coincides with the one of $P_n$ (see Fig. 1). Clearly, by assumption, each $P^i$ has at least $c \log^4 n \geq c \frac{\log^2 n}{\sqrt{p}}$ nodes.

Any oblivious path going from the S-node of $P^i$ to any of the nodes in its destination segment cannot use any long link starting from $P^i$ and ending outside $P^i$, or starting before $\lceil \frac{n}{4} \rceil$, that is at the left-hand side of the S-node. Moreover, if $p_i$ is the probability that node $\lceil \frac{n}{4} \rceil$ has a long link of length at most $\lceil \frac{n}{2^i} \rceil$, then by Lemma 1 $p_i$ is an upper bound on the probability of having long links of length at most $\lceil \frac{n}{2^i} \rceil$ for all the nodes in $P^i$ greater than $\lceil \frac{n}{4} \rceil$, i.e., in the right-hand side of its S-node, which is also an upper bound on the probability for such nodes of having a long link falling inside $P^i$. Considering the long links starting from $P^i$ and ending outside $P^i$, or starting before the S-node, as removed, the routing is then performed according to a $(p \cdot p_i)$-distribution.

By the inductive hypothesis, with this induced distribution in $P^i$, at least one node in the destination segment has expected oblivious distance at least $c \cdot \min\left(\lceil \frac{n}{2^i} \rceil, \frac{\log^2(|P^i|)}{\sqrt{p \cdot p_i}}\right)$ from the S-node.

If the minimum is $\lceil \frac{n}{2^i} \rceil$ then $\lceil \frac{n}{2^i} \rceil \geq \frac{n}{2^i} \geq \frac{\log^2 n}{\sqrt{p}} > c \frac{\log^2 n}{\sqrt{p}}$ and the claim holds by observing that the S-nodes of $P_n$ and $P^i$ coincide and the destination segment of $P^i$ is contained in the one of $P_n$.

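If $\lceil \frac{n}{2^i} \rceil$ is not the minimum, it must be $c \frac{\log^2(\lceil \frac{n}{2^i} \rceil)}{\sqrt{p \cdot p_i}} < c \frac{\log^2(n)}{\sqrt{p}}$, otherwise the claim holds by the same arguments of the previous case. Therefore $\sqrt{p_i} > \frac{\log^2 \lceil \frac{n}{2^i} \rceil}{\log^2 n} > 1 - \frac{2i}{\log n}$ and $p_i > \left(1 - \frac{2i}{\log n}\right)^2 > 1 - \frac{4i}{\log n}$. So the probability $\bar{p}_i$ that from the S-node there exists a long link longer than $\lceil \frac{n}{2^i} \rceil$ is $1 - p_i < \frac{4i}{\log n}$.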

This being true for every $i = 2, 3, \ldots, \lfloor \log n - 4 \log\log n - \log c \rfloor$, let us denote by $g(n)$ the highest expected number of steps needed in $P_n$ to reach a node in its destination segment from its S-node, or analogously the expected length of the longest oblivious path from the S-node to one of the nodes in the destination segment. We can recursively calculate $g(n)$ using the inductive hypothesis on shorter paths in the following way.

We distinguish the cases in which either there exists a long link starting from the S-node (with probability at most $p$) or we are forced to move on a short link (with probability at least $1 - p$).

• If there exists a long link starting from the S-node, denoting by $L$ its length we distinguish four subcases.
  – The long link is toward a node to the left of the S-node: we move through a short link and recurse on the path of length $n - 4$ having the node $\lceil \frac{n}{4} \rceil + 1$ as its S-node (see Fig. 2a).
  – $L \leq \lceil \frac{n}{8} \rceil$: we move along the long link and we recurse on the path of length $n - 4L$ having the node $\lceil \frac{n}{4} \rceil + L$ as its S-node (see Fig. 2b).
  – $\lceil \frac{n}{8} \rceil < L \leq \lfloor \frac{n}{2} \rfloor$: we move along a short link and recurse on the path of length $2L - 4$ having the node $\lceil \frac{n}{4} \rceil + 1$ as its S-node (see Fig. 2c).
  – $L > \lfloor \frac{n}{2} \rfloor$: we move along a short link and recurse on the path of length $n - 4$ having the node $\lceil \frac{n}{4} \rceil + 1$ as its S-node (see Fig. 2a).

• If we are forced to move through a short link, we recurse on the path of length $n - 4$ having the node $\lceil \frac{n}{4} \rceil + 1$ as its S-node (see Fig. 2a).

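Note that in all the cases the destination segment of the shorter path on which we recurse is included in the one of $P_n$. Then,

\[
\begin{aligned}
g(n) \geq{}& 1 + (1-p)\, g(n-4) + p \Biggl( \sum_{l=2}^{\lceil \frac{n}{8} \rceil} \Pr(L = l)\, g(n - 4l) \\
& + \sum_{l=\lceil \frac{n}{8} \rceil + 1}^{\lfloor \frac{n}{2} \rfloor} \Pr(L = l)\, g(2l - 4) + \sum_{l=\lfloor \frac{n}{2} \rfloor + 1}^{n} \Pr(L = l)\, g(n - 4) \Biggr) \\
\geq{}& 1 + (1-p)\, g(n-4) + p \sum_{l=2}^{\lceil \frac{n}{8} \rceil} \Pr(L = l)\, g(n - 4l) + p \sum_{l=\lceil \frac{n}{8} \rceil + 1}^{n} \Pr(L = l)\, g\left(\left\lceil \frac{n}{8} \right\rceil\right) \\
\geq{}& 1 + (1-p)\, g(n-4) + p \sum_{i=4}^{\lceil \log n \rceil - 1} \Pr\left(\left\lceil \frac{n}{2^i} \right\rceil < L \leq \left\lceil \frac{n}{2^{i-1}} \right\rceil\right) \cdot g\left(n - 4\left\lceil \frac{n}{2^{i-1}} \right\rceil\right) \\
& + p \cdot \Pr\left(L > \left\lceil \frac{n}{8} \right\rceil\right) g\left(\left\lceil \frac{n}{8} \right\rceil\right).
\end{aligned}
\]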

The last term of the above inequality is minimized when the probabilities multiplying the terms with the lower arguments of $g$, that is $\Pr(L > \lceil \frac{n}{8} \rceil)$ and $\Pr(\lceil \frac{n}{2^i} \rceil < L \leq \lceil \frac{n}{2^{i-1}} \rceil)$ for the lowest values of $i$, are maximized. Recalling that $\Pr(L > \lceil \frac{n}{2^i} \rceil) \leq \bar{p}_i < \frac{4i}{\log n}$ for every $i = 2, 3, \ldots, \lfloor \log n - 4 \log\log n - \log c \rfloor$, pushing probabilities as much as possible to such terms, it can be easily shown that the last term of the above inequality is minimized when $\Pr(L > \lceil \frac{n}{8} \rceil) = 12/\log n$ and $\Pr(\lceil \frac{n}{2^i} \rceil < L \leq \lceil \frac{n}{2^{i-1}} \rceil) = \min(4, \log n - 4(i-1))/\log n$ for $i$ between 4 and $\lceil \frac{\log n}{4} \rceil$. This actually holds only if we consider intervals of $L$ for which we previously bounded $p_i$, that is if $\lceil \frac{\log n}{4} \rceil < \lfloor \log n - 4 \log\log n \rfloor$, which can be obtained by setting $n_0$ to a suitably large integer such that $\lceil \frac{\log n_0}{4} \rceil < \lfloor \log n_0 - 4 \log\log n_0 \rfloor$ (this happens for $n_0 \geq 2^{2^5}$). Therefore,

\[
\begin{aligned}
g(n) \geq{}& 1 + (1-p)\, g(n-4) + p \sum_{i=4}^{\lceil \frac{\log n}{4} \rceil} \left( \frac{\min(4, \log n - 4(i-1))}{\log n}\, g\left(n - 4\left\lceil \frac{n}{2^{i-1}} \right\rceil\right) \right) \\
& + p\, \frac{12}{\log n}\, g\left(\left\lceil \frac{n}{8} \right\rceil\right).
\end{aligned}
\]

Fig. 2 The first recursive step. (a) We move along a short link and recurse on the path of length $n - 4$ having the node $\lceil \frac{n}{4} \rceil + 1$ as its S-node. (b) We move along the long link of length $L \leq \lceil \frac{n}{8} \rceil$ and we recurse on the path of length $n - 4L$ having the node $\lceil \frac{n}{4} \rceil + L$ as its S-node. (c) We move along a short link and recurse on the path of length $2L - 4$ ($\lceil \frac{n}{8} \rceil < L \leq \lfloor \frac{n}{2} \rfloor$ is the length of the long link) having the node $\lceil \frac{n}{4} \rceil + 1$ as its S-node

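Since in each subchain the probability of maintaining a long link is always at most $p$, we obtain the following inequalities, that hold for a suitably large $n$:

\[
\begin{aligned}
g(n) \geq{}& 1 + (1-p)\, c\, \frac{\log^2(n-4)}{\sqrt{p}} + p \sum_{i=4}^{\lceil \frac{\log n}{4} \rceil} \left( \frac{\min(4, \log n - 4(i-1))}{\log n}\, c\, \frac{\log^2\left(n - \frac{4n}{2^{i-1}}\right)}{\sqrt{p}} \right) + p\, \frac{12}{\log n}\, c\, \frac{\log^2\left(\frac{n}{8}\right)}{\sqrt{p}} \\
\geq{}& 1 + c\, \frac{\log^2 n}{\sqrt{p}} + c\, \frac{\log^2\left(1 - \frac{4}{n}\right)}{\sqrt{p}} + 2c\, \frac{\log n \log\left(1 - \frac{4}{n}\right)}{\sqrt{p}} \\
& - c\sqrt{p}\, \log^2 n - c\sqrt{p}\, \log^2\left(1 - \frac{4}{n}\right) - 2c\sqrt{p}\, \log n \log\left(1 - \frac{4}{n}\right) \\
& + 4c\sqrt{p}\, \log n \sum_{i=4}^{\lceil \frac{\log n}{4} \rceil} \min(4, \log n - 4(i-1)) \\
& + 4c\sqrt{p} \sum_{i=4}^{\lceil \frac{\log n}{4} \rceil} \left( \frac{\log^2\left(1 - \frac{4}{2^{i-1}} - \frac{4}{n}\right)}{\log n} + 2 \log\left(1 - \frac{4}{2^{i-1}} - \frac{4}{n}\right) \right) \\
& + 12c \log n \sqrt{p} - 72c\sqrt{p} + \frac{108c}{\log n}\sqrt{p} \\
\geq{}& 1 + c\, \frac{\log^2 n}{\sqrt{p}} + 2c\, \frac{\log n \log\left(1 - \frac{4}{n}\right)}{\sqrt{p}} - c\sqrt{p}\, \log^2 n - c\sqrt{p}\, \log^2\left(1 - \frac{4}{n}\right) \\
& + c\sqrt{p}\, (\log n - 12) \log n - 16c\sqrt{p} + 12c \log n \sqrt{p} - 72c\sqrt{p} \\
\geq{}& 1 + c\, \frac{\log^2 n}{\sqrt{p}} + 2c\, \frac{\log n \log\left(1 - \frac{4}{n}\right)}{\sqrt{p}} - c\sqrt{p}\, \log^2\left(1 - \frac{4}{n}\right) - 88c\sqrt{p} \\
\geq{}& 1 + c\, \frac{\log^2 n}{\sqrt{p}} + 2c \log^2 n \log\left(1 - \frac{4}{n}\right) - 6c\sqrt{p} - 88c\sqrt{p} \\
\geq{}& 1 + c\, \frac{\log^2 n}{\sqrt{p}} - 16c - 6c - 88c.
\end{aligned}
\]

Thus, since $c \leq \frac{1}{110}$, we finally obtain

\[ g(n) \geq 1 + c\, \frac{\log^2 n}{\sqrt{p}}. \qquad \square \]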

An analogous proof obtained by considering as S-node node $\lfloor \frac{n}{2} \rfloor$ shows the following theorem.

Fig. 3 The reduction graph of Theorem 3

Theorem 2 For any non-decreasing function $f$ and any $p$-distance distribution generated by $f$, for a path $P_n$, $E(OD(SWG(P_n, \bar{A}))) = \Omega\left(\min\left(n, \frac{\log^2(n)}{\sqrt{p}}\right)\right)$.

4 Complexity Results

Starting from the results shown in the previous section, we now focus on the determination of efficient deterministic constructions minimizing the oblivious diameter.

First of all we prove that the resulting optimization problem is intractable in general graphs.

Theorem 3 Given a graph $G$ and an integer $k > 0$, deciding if $OD_k(G) \leq 2$ is an NP-complete problem.

Proof Let us first observe that the decision problem belongs to the class NP. In fact, given a graph $G = (V, A)$ and a $SWG(G, \bar{A})$, it is possible to compute in polynomial time the length of all the oblivious paths connecting a node $u \in V$ to a generic node $z \in V$. For this task, for every $v \in V - \{z\}$ let us define $F_v$ as the set of all the adjacent nodes reachable from $v$ through an oblivious path directed to $z$. Then, for every $v \in V$, let $g_z(v) = \max\{g_z(w) + 1 \mid w \in F_v\}$ be the length of the longest oblivious path from $v$ to $z$. Clearly $g_z(z) = 0$, while $g_z(v)$ for any other node $v \in V$ can be computed in an incremental way such that at the $i$-th step, the oblivious distances of all the nodes at distance at most $i$ in $G$ from $z$ are evaluated. Thus, in at most $|V|$ steps all the $g_z(v)$ can be computed for all the nodes $v \in V$. The oblivious diameter of $SWG(G, \bar{A})$ will then be $OD(SWG) = \max_{u,z \in V} \{g_z(u)\}$.
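The incremental computation of $g_z(v)$ described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' code: it assumes a connected symmetric network given as an adjacency dictionary `short_adj` and long links given as a dictionary `long_links`, both hypothetical names.

```python
from collections import deque


def bfs_dist(adj, src):
    """Distances from src over short links only."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def oblivious_diameter(short_adj, long_links):
    """Maximum over all targets z and sources v of the longest oblivious path g_z(v)."""
    best = 0
    for z in short_adj:
        d = bfs_dist(short_adj, z)
        g = {z: 0}
        # Process nodes by increasing distance from z: every greedy next hop
        # is strictly closer to z, so its g value is already known.
        for v in sorted(short_adj, key=d.get):
            if v == z:
                continue
            out = set(short_adj[v]) | set(long_links.get(v, ()))
            closest = min(d[y] for y in out)
            # F_v: out-neighbors a greedy step may choose (ties included).
            g[v] = 1 + max(g[y] for y in out if d[y] == closest)
        best = max(best, max(g.values()))
    return best


# Hypothetical example: a path of 6 nodes, with and without long links 1 <-> 6.
path6 = {v: [u for u in (v - 1, v + 1) if 1 <= u <= 6] for v in range(1, 7)}
od_plain = oblivious_diameter(path6, {})
od_linked = oblivious_diameter(path6, {1: [6], 6: [1]})
```

Each target requires one breadth-first search and one pass over the nodes, so the whole computation is polynomial in $|V|$, in line with the membership argument.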

In order to prove the NP-completeness of the problem, we provide a polynomial time reduction from the Minimum Set Cover problem (MSC) (known to be NP-complete; see [6]). In this problem we have a universe set $U = \{u_1, \ldots, u_m\}$ of $m$ elements, a family $\{S_1, \ldots, S_f\}$ of $f$ subsets of $U$ and an integer $k \leq f$; we want to decide if there exist $k$ subsets $S_{j_1}, \ldots, S_{j_k}$ that cover $U$, i.e., such that $\bigcup_{i=1}^{k} S_{j_i} = U$.

Starting from an instance $I_{MSC}$ of MSC, we construct a Small World Graph $H = (V, A \cup \bar{A})$ with oblivious diameter at most equal to 2 if and only if $I_{MSC}$ admits a cover of $k$ subsets.

Let $t = k^2 + k + 1$ and $G = (V, A)$, with $V = \{r\} \cup \{s\} \cup V_1 \cup V_2$ and $A = \{(r, s), (s, r)\} \cup A_1 \cup A_2 \cup A_3$ (see Fig. 3), $V_1 = \{q_i \mid i = 1, \ldots, f\}$, $V_2 = \{z_{j,h} \mid j = 1, \ldots, m \wedge h = 1, \ldots, t\}$, and $A_1 = \{(s, q_i), (q_i, s) \mid i = 1, \ldots, f\}$, $A_2 = \{(q_i, q_j) \mid i = 1, \ldots, f \wedge j = 1, \ldots, f \wedge i \neq j\}$, $A_3 = \{(q_i, z_{j,h}), (z_{j,h}, q_i) \mid u_j \in S_i \wedge h = 1, \ldots, t\}$.
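The reduction graph can be built mechanically from a set cover instance. The Python sketch below follows the definition of $V$ and $A$ given above; the node encoding (strings `"r"`, `"s"` and tuples `("q", i)`, `("z", j, h)`) and the function name are assumptions made only for this illustration.

```python
def build_reduction_graph(m, subsets, k):
    """Build (nodes, arcs) of G from universe {1..m}, a list of subsets and k."""
    t = k * k + k + 1
    f = len(subsets)
    q = [("q", i) for i in range(1, f + 1)]

    nodes = {"r", "s", *q}
    nodes |= {("z", j, h) for j in range(1, m + 1) for h in range(1, t + 1)}

    arcs = {("r", "s"), ("s", "r")}
    for qi in q:                                    # A1: s <-> q_i
        arcs |= {("s", qi), (qi, "s")}
    arcs |= {(qi, qj) for qi in q for qj in q if qi != qj}  # A2: clique on V1
    for i, subset in enumerate(subsets, start=1):   # A3: q_i <-> z_{j,h}, u_j in S_i
        for j in subset:
            for h in range(1, t + 1):
                arcs |= {(("q", i), ("z", j, h)), (("z", j, h), ("q", i))}
    return nodes, arcs


# Hypothetical instance: U = {1, 2, 3}, S1 = {1, 2}, S2 = {2, 3}, k = 1.
nodes, arcs = build_reduction_graph(3, [{1, 2}, {2, 3}], 1)
```

For this instance $t = 3$, so each element of the universe is represented by three copies in $V_2$; the construction is clearly polynomial in the size of the instance.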

Informally, in the reduction graph, each subset $S_i$ corresponds to the subgraph induced by node $q_i$ and all the nodes $z_{j,h}$, connected to $q_i$, such that $u_j \in S_i$.

The idea underlying our construction is that, in order to obtain an oblivious diameter equal to 2, the only way is to put $k$ long links in $\bar{A}$ from $r \in V$ to the $k$ nodes $q_i$ associated to the subsets of the cover, and the same for each node in $V_2$.

Assume there are $k$ covering sets $S_{l_1}, \ldots, S_{l_k}$; we show that there exists a $SWG(G, \bar{A})$ having oblivious diameter at most 2. First of all, note that node $s$ is at distance at most 2 from all the other nodes and the same holds for all the nodes in $V_1$, since there is a clique between them (the arcs in $A_2$). Moreover, node $r$ is at distance at most 2 from node $s$ and from all the nodes in $V_1$. Finally, the nodes in $V_2$ are at distance at most 2 from $s$ and all the nodes in $V_1$.

It remains to show that it is possible to choose at most $k$ long links per node in such a way that the oblivious distance from node $r$ to the nodes in $V_2$, from every node in $V_2$ to node $r$ and between any couple of nodes in $V_2$, is at most 2.

Let us consider $\bar{A} = \bar{A}_1 \cup \bar{A}_2 \cup \bar{A}_3$ with $\bar{A}_1 = \{(r, q_{l_i}) \mid 1 \leq i \leq k\}$, $\bar{A}_2 = \{(z_{j,h}, q_{l_i}) \mid 1 \leq j \leq m \wedge 1 \leq h \leq t \wedge 1 \leq i \leq k\}$ and $\bar{A}_3 = \{(q_i, r) \mid 1 \leq i \leq f\}$.

The oblivious path between $r$ and the nodes in $V_2$ is of length 2 since its first arc is a long link in $\bar{A}_1$ whose endpoint is adjacent to the destination. Similarly, the oblivious path between two nodes in $V_2$ is of length 2 since its first arc is a long link in $\bar{A}_2$. The oblivious path between a node in $V_2$ and $r$ is of length 2 since its second arc is a long link in $\bar{A}_3$.

In order to conclude the proof, it remains to show that if there are no $k$ covering sets, then no $SWG(G, \bar{A})$ having oblivious diameter at most 2 exists. Consider the oblivious paths between $r$ and the nodes in $V_2$. Since there are no $k$ covering sets, for any choice of the $k$ long links outgoing from $r$, after the first move there must exist at least $t$ nodes in $V_2$ at distance greater than 1 from $s$ and from each of the endpoints of the long links of $r$. Recalling that $t > (k+1)k$, since from $s$ and from each of such endpoints at most $k$ other nodes at distance greater than 1 can be reached in a further step, there must exist at least one node in $V_2$ that is not reachable in 2 steps from $r$. □

5 Deterministic Results for Specific Topologies

In this section we show that distance generated small world graphs cannot achieve the performance of deterministic ones on basic topologies such as paths, trees and Cartesian products of graphs, including $d$-dimensional grids.

5.1 Paths

We give a slightly more general deterministic construction for node weighted paths that will be useful both for unweighted paths, that is with uniform weights, and for trees.

Let $P^w_n$ be a path of $n > 1$ nodes with weights $w_x \geq 1$, $w_x \in \mathbb{Z}$, associated to nodes $x \in \{1, 2, \ldots, n\}$, and let $W = \sum_{x=1}^{n} w_x$.

Let $\mathcal{W} = \{W_1, W_2, \ldots, W_{\lceil \log_2 W \rceil}\}$ be a partition of the set $\{1, 2, \ldots, W\}$ such that $W_i = \{z \mid \frac{W}{2^i} \leq z < \frac{W}{2^{i-1}}\}$ for all $i = 1, 2, \ldots, \lceil \log_2 W \rceil$.

In the case of maximum outdegree at most equal to 1, we perform the deterministic construction in the following way.

Let $u$ be the first node of the path such that $\sum_{x=1}^{u} w_x \geq \frac{W}{2}$. We assign a long link from node 1 to node $u$ and from node $u + 1$ to node $n$, plus their opposite ones from $u$ to 1 and from $n$ to $u + 1$, respectively. Let those four links be considered of level 1. We now recursively consider the subpaths $[2, \ldots, u - 1]$ and $[u + 2, \ldots, n - 1]$, assigning for each of them four long links of level 2 in the same way, that is, dividing the current subpath in two weight-balanced portions. The long links are assigned till at most the $\lceil \log W \rceil$-th level is reached. Let us say that a node is of level $i$ if it is the endpoint of a long link of level $i$ (by construction it cannot be the endpoint of long links of different levels).
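The recursive construction above can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation: it assumes that each recursive call balances the total weight of the current subpath, and that degenerate links (a node linked to itself, or an endpoint falling outside the subpath) are simply skipped. The function name and the returned dictionary format are hypothetical.

```python
def build_long_links(weights):
    """Return {node: (target, level)} for the path 1..n, weights[x - 1] = w_x."""
    links = {}

    def assign(a, b, level):
        if a >= b:  # empty or single-node subpath: nothing to link
            return
        total = sum(weights[a - 1:b])
        acc = 0
        for u in range(a, b + 1):  # first u whose prefix weight reaches total / 2
            acc += weights[u - 1]
            if 2 * acc >= total:
                break
        for x, y in ((a, u), (u + 1, b)):
            if x < y:  # skip degenerate links (assumption of this sketch)
                links[x] = (y, level)
                links[y] = (x, level)
        assign(a + 1, u - 1, level + 1)
        assign(u + 2, b - 1, level + 1)

    assign(1, len(weights), 1)
    return links


# Hypothetical example: an unweighted path of 16 nodes.
links16 = build_long_links([1] * 16)
```

On the unweighted path of 16 nodes the sketch links 1 with 8 and 9 with 16 at level 1, then 2 with 4, 5 with 7, 10 with 12 and 13 with 15 at level 2. Every node is the source of at most one long link, consistent with the outdegree bound of 1.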

Lemma 2 In the above construction an oblivious path from a node $x$ with weight $w_x \in W_i$ to a node $y$ with weight $w_y \in W_j$ has length at most $4(i + j - 1)$.

Proof First of all, it is easy to check that a node with weight in $W_i$ is of level at most $i$. In fact, after assigning the long links of level $i$, the recursive construction is applied to subpaths consisting of nodes whose total weight is less than $\frac{W}{2^i}$, so that no node with weight in $W_i$ can exist in them. Thus, a long link to each node with weight in $W_i$ must be assigned before applying the recursive construction after level $i$. Clearly such a long link is of level at most $i$.

Let us now consider a generic source-destination pair $(x, y)$ and let $w_x \in W_i$ and $w_y \in W_j$ be the weights of $x$ and $y$, respectively. In order to prove that the length of the oblivious path $OP_{x,y}$ from $x$ to $y$ is bounded by $4(i + j - 1)$, we need to show the following properties:

(1) The difference between the levels of two consecutive nodes in $OP_{x,y}$ is at most 1.

(2) The maximum number of consecutive nodes of the same level in $OP_{x,y}$ is at most 4.

(3) The sequence of the levels met from the source to the destination along $OP_{x,y}$ determines a particular node $z$ for which the subsequence from the source to $z$ is not increasing while the one from $z$ to the destination is not decreasing. Possibly, $z$ can coincide with the source or with the destination, thus yielding a monotonic sequence.

In fact, if the above properties hold, since the level of node $x$ is at most $i$ and the one of node $y$ is at most $j$, the claim holds by observing that $OP_{x,y}$ meets at most all the levels from $i$ to 1 and then (if $j \ge 2$) from 2 to $j$, and for each level at most 4 consecutive nodes, so that $|OP_{x,y}| \le 4(i + j - 1)$.

To prove (1) it is sufficient to note that, by construction, the levels of two adjacent nodes of $P_n$ (without considering the long links) differ by at most 1, while two nodes connected by a long link are of the same level.

By construction, in fact, a node $x$ of level $i$ may only have neighbors of level $i$, $i - 1$ or $i + 1$. The first case occurs either if $x$ is connected by a long link to $y$ or if they both are the endpoints of two different consecutive long links of level $i$. The second case occurs if $x$ is the endpoint of a long link of level $i$ that is included in a long link of level $i - 1$. The third case can occur if $x$ is the endpoint of a long link of level $i$ that includes a long link of level $i + 1$. Property (1) then follows by observing that there are no other possible cases and that the long links are not crossing (see footnote 1).

Fig. 4 The deterministic construction for the path. The dashed line represents the physical path $P_n$. The solid ones are the added long links

Property (2) follows by observing that, by construction, nodes of a given level $i \ge 2$ are included in groups of at most four nodes under long links of level $i - 1$. Since, as already noted, there are no crossing links, in order to move from a quadruple of level $i$ to another one of the same level we have to encounter at least two nodes of level $i - 1$. It remains to observe that the number of nodes of level 1 is 4.

In order to prove (3) it is sufficient to show that if a node $x_i$ of level $i = 2, \ldots, \lceil \log_2 W \rceil$ is reached from a node $x_{i-1}$ of level $i - 1$ in $OP_{x,y}$, then no node of level less than $i$ can appear in the remaining part of $OP_{x,y}$. As already observed, by construction $x_i$ and at most three other nodes of level $i$ are included under a long link $l_{i-1}$ of level $i - 1$, and there is no crossing link. Therefore, assuming by contradiction that a node of level smaller than $i$ is reached in the remaining part of $OP_{x,y}$, it follows that one of the two endpoints of $l_{i-1}$, say $x$, has to be reached first. If $x \equiv x_{i-1}$ the contradiction follows directly by observing that in no oblivious path a node can appear twice. In the other case, it follows since the long link connecting $x_{i-1}$ to $x$ should have been previously chosen due to the obliviousness of the path $OP_{x,y}$. □
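To make the notion of oblivious path used in Lemma 2 concrete, the following sketch routes greedily on a path with long links. It assumes that the oblivious rule is the usual greedy one of Kleinberg's model: each node forwards to the neighbor (through a basic or a long link) that is closest to the destination in the underlying path. The function name `greedy_route` and the dictionary representation of the long links are illustrative assumptions.

```python
def greedy_route(n, long_links, src, dst):
    """Greedy (oblivious) route on the path 1..n.

    long_links maps a node to the target of its single outgoing long link.
    Returns the list of visited nodes, from src to dst.
    """
    route = [src]
    cur = src
    while cur != dst:
        # Basic links of the path, restricted to existing nodes.
        candidates = [v for v in (cur - 1, cur + 1) if 1 <= v <= n]
        if cur in long_links:
            candidates.append(long_links[cur])
        # Move to the neighbor closest to dst in the underlying path metric.
        # One basic neighbor is always strictly closer, so the loop terminates.
        cur = min(candidates, key=lambda v: abs(v - dst))
        route.append(cur)
    return route


# Level-1 long links of the construction on a uniform path of 8 nodes.
links = {1: 4, 4: 1, 5: 8, 8: 5}
print(greedy_route(8, links, 1, 8))  # [1, 4, 5, 8]
```

In this small instance every node has weight 1 and $W = 8$, so all weights lie in $W_3$ and Lemma 2 gives the bound $4(3 + 3 - 1) = 20$, which the three-hop route above satisfies with ample margin.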

Theorem 4 $OD_1(P_n) = O(\log n)$.

Proof The claim is a direct consequence of Lemma 2. In fact, in the special case in which all the weights of the nodes are equal to 1, and thus summing up to $W = n$, the levels are at most $\lceil \log_2 n \rceil$ and the length of any oblivious path is bounded by $4(2\lceil \log_2 n \rceil - 1)$ (see Fig. 4). □
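As a quick numeric instance of this bound (the value $n = 1024$ is only an illustrative choice, not taken from the paper), unit weights place every node in the last class of the partition, and Lemma 2 gives:

```latex
\[
W = n = 1024, \qquad \lceil \log_2 n \rceil = 10, \qquad
\frac{W}{2^{10}} = 1 \le w_x = 1 < 2 = \frac{W}{2^{9}}
\;\Longrightarrow\; w_x \in W_{10},
\]
\[
|OP_{x,y}| \le 4(i + j - 1) = 4(10 + 10 - 1) = 76 = 4\,(2\lceil \log_2 n \rceil - 1).
\]
```

So on a path of 1024 nodes no oblivious path uses more than 76 hops, against up to 1023 hops without long links.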

We now extend these results to the general case in which the maximum outdegree is bounded by a given $k > 0$. The corresponding construction is quite similar to the previous one.

Let $\mathcal{W} = \{W_1, W_2, \ldots, W_{\lceil \log_{2k} W \rceil}\}$ be a partition of the set $\{1, 2, \ldots, W\}$ such that for all $i = 1, 2, \ldots, \lceil \log_{2k} W \rceil$, $W_i = \{z \mid \frac{W}{(2k)^i} \le z < \frac{W}{(2k)^{i-1}}\}$.

Let $u_0 = 0$, and having determined $u_0, \ldots, u_{j-1}$, let $u_j$, $j = 1, \ldots, 2k - 1$, be the first node of the path such that $\sum_{x=u_{j-1}+1}^{u_j} w_x \ge \frac{W}{2k}$. Finally, let $u_{2k} = n$. Note that $u_0$ does not correspond to any node of the path, but it has been set to 0 to simplify the description of the construction. We assign two pairwise opposite long links between $u_j + 1$ and nodes $u_{j+1}, \ldots, u_{\min\{j+k, 2k\}}$ for $j = 0, \ldots, 2k - 1$. Let those links be considered of level 1. We now recursively consider the subpaths $[u_j + 2, \ldots, u_{j+1} - 1]$ for $j = 0, \ldots, 2k - 1$, assigning to each of them long links of level 2 in the same way, that is, dividing the current subpath in $2k$ weight-balanced portions. The long links are assigned till at most the $\lceil \log_{2k} W \rceil$-th level is reached. Again, a node is said to be of level $i$ if it is the end-point of a long link of level $i$.

Footnote 1: Two links $\{u, v\}$ and $\{w, z\}$ on a path $P$ are said to be crossing if $u < w < v < z$.

Arguments similar to the ones for outdegree 1 prove the following claims.

Lemma 3 In the above construction the oblivious path from a node $x$ with weight $w_x \in W_i$ to a node $y$ with weight $w_y \in W_j$ has length at most $4(i + j - 1)$.

Theorem 5 $OD_k(P_n) = O(\log_k n)$.
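The first level of the outdegree-$k$ construction behind Lemma 3 and Theorem 5 can be sketched as follows. The function names `breakpoints` and `level1_links` are hypothetical, nodes are numbered from 1, and each returned pair stands for the two opposite long links between its endpoints; the recursion on the inner subpaths is omitted for brevity.

```python
def breakpoints(weights, k):
    """Return [u_0, u_1, ..., u_2k] with u_0 = 0 and u_2k = n.

    For j = 1..2k-1, u_j is the first node such that the weights of nodes
    u_{j-1}+1 .. u_j sum to at least W / (2k).
    """
    n = len(weights)
    total = sum(weights)
    u = [0]
    for _ in range(1, 2 * k):
        acc, x = 0, u[-1]
        # Integer comparison 2k * acc < W avoids floating point division.
        while x < n and 2 * k * acc < total:
            x += 1
            acc += weights[x - 1]
        u.append(x)
    u.append(n)
    return u


def level1_links(u, k):
    """Pairs (u_j + 1, u_t) for t = j+1 .. min(j+k, 2k), j = 0 .. 2k-1."""
    pairs = []
    for j in range(2 * k):
        src = u[j] + 1
        for t in range(j + 1, min(j + k, 2 * k) + 1):
            if src != u[t]:  # skip degenerate self links
                pairs.append((src, u[t]))
    return pairs


u = breakpoints([1] * 16, 2)
print(u)                   # [0, 4, 8, 12, 16]
print(level1_links(u, 2))  # [(1, 4), (1, 8), (5, 8), (5, 12), (9, 12), (9, 16), (13, 16)]
```

In this uniform example with $k = 2$, no node is the endpoint of more than two pairs, so the outdegree stays within $k$.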

5.2 Trees

We now present a deterministic construction having outdegree 1 for trees based on their standard separation property. Namely, it is possible to determine a heavy path $(u_1, \ldots, u_h)$ starting from the root $r = u_1$ and descending at each step in one of the subtrees having largest cardinality, till reaching a leaf. Let $U_j$, $j = 1, \ldots, h$, be the set of the sons of $u_j$ not equal to $u_{j+1}$. As it can be easily checked, the deletion of the heavy path separates the tree into several subtrees with roots in $U_j$, $j = 1, \ldots, h$. Each of such subtrees has size at most equal to half of that of the original tree.

On the heavy path we apply the outdegree 1 construction for weighted paths shown in the previous subsection, assigning to each node $u_j$ a weight equal to the total number of nodes in the subtrees rooted at nodes in $U_j$ plus 1.

In each subtree we recursively do the same construction.
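The heavy path and the weights fed to the weighted-path construction can be computed as in the following sketch. The representation of the tree as a dictionary `children` mapping each node to the list of its sons, and the names `subtree_sizes` and `heavy_path`, are assumptions made for illustration; ties between equally large subtrees are broken in favor of the first son listed.

```python
def subtree_sizes(children, root):
    """Number of nodes in the subtree of every node (iterative post-order)."""
    size = {}
    stack = [(root, False)]
    while stack:
        v, expanded = stack.pop()
        if expanded:
            size[v] = 1 + sum(size[c] for c in children.get(v, []))
        else:
            stack.append((v, True))
            stack.extend((c, False) for c in children.get(v, []))
    return size


def heavy_path(children, root):
    """Return (path, weights): the heavy path u_1..u_h and the weight of each u_j.

    The weight of u_j is 1 plus the number of nodes in the subtrees rooted
    at the sons of u_j that are not on the heavy path.
    """
    size = subtree_sizes(children, root)
    path, weights = [], []
    v = root
    while True:
        path.append(v)
        sons = children.get(v, [])
        if not sons:
            weights.append(1)  # a leaf only counts itself
            break
        heavy = max(sons, key=lambda c: size[c])
        weights.append(size[v] - size[heavy])
        v = heavy
    return path, weights


tree = {1: [2, 3], 2: [4, 5], 3: [], 4: [], 5: []}
print(heavy_path(tree, 1))  # ([1, 2, 4], [2, 2, 1])
```

The weights always sum to the number of nodes of the tree, which is what allows Lemma 2 to be applied on the heavy path with $W = n$.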

Theorem 6 For a tree $T$ with $n$ nodes $OD_1(T) = O(\log n)$.

Proof Let the eccentricity $e(n)$ of a tree of $n$ nodes be the maximum length of an oblivious path starting from the root of the tree to one internal node and vice versa, according to the above construction. Clearly $e(1) = 0$. We first show that $e(n) = O(\log n)$.

Consider an oblivious path between the root and a generic node $x$ of the tree. By construction, considering the heavy path determined from the root, it is easy to check that if $x$ belongs to a subtree rooted at a node in $U_j$, node $u_j$ of the heavy path must be a node of the oblivious path from the root to $x$. By Lemma 2, the length of the oblivious path from the root to $u_j$ depends on the weight $w_j$ of $u_j$. More precisely, if $i$ is the maximum value such that $w_j < \frac{n}{2^{i-1}}$, such a length is bounded by $4i$.

The same arguments hold for the oblivious path from $x$ to the root, which must step through $u_j$.

Thus, for $i \ge 1$, the following recursive inequality holds: $e(n) \le 4i + 1 + e\left(\frac{n}{2^{\max\{1, i-1\}}}\right)$, where the addition of 1 is due to the oblivious step from $u_j$ to the appropriate subtree with root in $U_j$, while the maximum between 1 and $i - 1$ in the denominator of the last term is due to the fact that such a subtree has size at most $w_j < \frac{n}{2^{i-1}}$ and, according to the standard separation property of the tree, at most equal to $n/2$. We look for an upper bound for $e(n)$, showing that the worst case is obtained when $i$ is always equal to 2. Consider, in fact, a recursive application of the previous formula defining $e(n)$ varying the value of $i$. Clearly each step in which $i = 1$ can be replaced by $i = 2$ maintaining the same number of steps but adding bigger values to the recursion. Now let us consider a step in which the formula of $e(n)$ is applied by considering $i > 2$. Such a step contributes as $4i + 1$ to the recursion while decreasing the number of steps by a factor of $2^{i-1}$. It can then be replaced by $i - 1$ steps with $i$ always equal to 2, hence contributing $9(i - 1) \ge 4i + 1$ for any $i > 1$. It follows $e(n) \le 9 + e\left(\frac{n}{2}\right) = 9 \log n = O(\log n)$.

Fig. 5 The heavy path and the associated weights

If we consider now the oblivious diameter of the deterministic construction, it is at most twice $e(n/2)$ plus the oblivious diameter of a heavy path (generated from the root or recursively in one of the subtrees). In fact, for any source-destination pair $(x_s, x_d)$, there must exist a minimal subtree $T' \subseteq T$ (possibly the whole tree) generated during the construction containing both $x_s$ and $x_d$. Consider the heavy path of $T'$ and let $u_s$ and $u_d$ be the two nodes such that $U_s$ and $U_d$ include the roots of the subtrees containing $x_s$ and $x_d$, respectively. The oblivious path is then obtained by the concatenation of the oblivious ones from $x_s$ to $u_s$, from $u_s$ to $u_d$ and from $u_d$ to $x_d$.

Since the maximum sum of the node weights of the heavy path of $T'$ is at most $n$, by Lemma 2 the oblivious distance between $u_s$ and $u_d$ is $O(\log n)$.

Therefore, $OD_1(T) \le 2 + 2e\left(\frac{n}{2}\right) + O(\log n) = O(\log n)$. □
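The recursive inequality for $e(n)$ used in the proof can be explored numerically. The sketch below is an illustration rather than part of the proof: it treats the inequality as an equality, rounds $n / 2^{\max\{1, i-1\}}$ down to an integer, and takes the worst admissible $i$ at every step, where $i$ is admissible when $2^{i-1} < n$ (which is forced by $w_j \ge 1$ and $w_j < n / 2^{i-1}$). The function name `worst_e` is hypothetical.

```python
from functools import lru_cache
from math import log2


@lru_cache(maxsize=None)
def worst_e(n):
    """Worst case of e(n) = 4i + 1 + e(n / 2^max(1, i-1)) over admissible i."""
    if n <= 1:
        return 0  # e(1) = 0
    best = 0
    i = 1
    while 2 ** (i - 1) < n:  # w_j >= 1 forces n / 2^(i-1) > 1
        shrink = 2 ** max(1, i - 1)
        best = max(best, 4 * i + 1 + worst_e(n // shrink))
        i += 1
    return best


for n in (16, 1024):
    print(n, worst_e(n), 9 * log2(n))
```

Under these assumptions the worst case for $n = 2^m$ with $m \ge 2$ is reached by taking $i = 2$ at every step except the last one, where only $i = 1$ is admissible, and the computed value stays below $9 \log_2 n$, consistently with the bound derived in the proof.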

5.3 Cartesian Products

Given any two digraphs $G_1 = (V_1, A_1)$ and $G_2 = (V_2, A_2)$, the Cartesian product of $G_1$ and $G_2$, denoted as $G_1 \square G_2$, is a digraph having $|V_2|$ rows or horizontal components and $|V_1|$ columns or vertical components. Routing along each row is done according to $G_1$ and in each column according to $G_2$. More precisely, $G_1 \square G_2$ has node set $\{(x, y) \mid x \in V_1, y \in V_2\}$ and arc set $\{((x, y), (x, y')) \mid (y, y') \in A_2\} \cup \{((x, y), (x', y)) \mid (x, x') \in A_1\}$.
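The definition translates directly into code. The following sketch builds the node set and the arc set of the Cartesian product of two digraphs given as node lists and sets of arcs; the function name `cartesian_product` and the input representation are illustrative assumptions.

```python
def cartesian_product(V1, A1, V2, A2):
    """Node set and arc set of the Cartesian product of (V1, A1) and (V2, A2)."""
    nodes = {(x, y) for x in V1 for y in V2}
    # Vertical arcs: the first coordinate is fixed, the second follows A2.
    vertical = {((x, y), (x, y2)) for x in V1 for (y, y2) in A2}
    # Horizontal arcs: the second coordinate is fixed, the first follows A1.
    horizontal = {((x, y), (x2, y)) for y in V2 for (x, x2) in A1}
    return nodes, vertical | horizontal


# A path of 3 nodes seen as a digraph with both orientations of each edge.
V = [1, 2, 3]
A = {(1, 2), (2, 1), (2, 3), (3, 2)}
nodes, arcs = cartesian_product(V, A, V, A)
print(len(nodes), len(arcs))  # 9 24
```

The product of the 3-node path with itself is the $3 \times 3$ grid: it has 9 nodes and 24 arcs, and it contains no diagonal arc such as the one from $(1, 1)$ to $(2, 2)$.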

Lemma 4 For every two digraphs $G_1$ and $G_2$, $OD_{k_1+k_2}(G_1 \square G_2) \le OD_{k_1}(G_1) + OD_{k_2}(G_2)$.

Proof We use along each row (resp. column) the same construction with outdegree $k_1$ (resp. $k_2$) for $G_1$ (resp. $G_2$). Clearly, this yields outdegree at most $k_1 + k_2$. The lemma then follows by observing that the oblivious paths from a node $(x, y)$ to another node $(x', y')$ in $G_1 \square G_2$ are all and only the ones alternating in all the possible ways horizontal and vertical moves according to the oblivious paths from $x$ to $x'$ in $G_1$ and from $y$ to $y'$ in $G_2$, so that $OD_{k_1+k_2}(G_1 \square G_2) \le OD_{k_1}(G_1) + OD_{k_2}(G_2)$. □

Fig. 6 The obtained grid from the Cartesian product of $P_{\lceil n/2 \rceil}$ and $P_{\lceil m/2 \rceil}$ without the long links and the 4 steps needed to perform a basic step on $P_{\lceil m/2 \rceil}$

If we restrict to Cartesian products of paths, it is also possible to obtain deterministic constructions having outdegree equal to 1. Consider in fact the $n \times n$ grid resulting from the product $(P_n)^2 \equiv P_n \square P_n$ of two paths of $n$ nodes. The idea is to alternate the nodes of the grid for vertical and horizontal movements according to the long links of our deterministic construction with outdegree 1 for $P_{\lceil n/2 \rceil}$ in a chessboard like fashion, thus obtaining at each node of the grid either horizontal or vertical long links. Thus, for any odd (resp. even) row the odd (resp. even) nodes represent a horizontal copy of $P_{\lceil n/2 \rceil}$ and for any odd (resp. even) column the even (resp. odd) nodes represent a vertical copy (see for instance Fig. 6).
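The chessboard assignment can be stated compactly: a node carries horizontal long links when its row and column indices have the same parity, and vertical ones otherwise. The sketch below assumes rows and columns numbered from 1; the function names `link_orientation` and `horizontal_copy` are hypothetical.

```python
def link_orientation(row, col):
    """Orientation of the long links placed at grid node (row, col)."""
    # Odd row with odd column, or even row with even column: horizontal copy.
    return "horizontal" if (row + col) % 2 == 0 else "vertical"


def horizontal_copy(row, n):
    """Columns of the given row that host the horizontal copy of the path."""
    return [c for c in range(1, n + 1) if link_orientation(row, c) == "horizontal"]


print(horizontal_copy(1, 7))  # [1, 3, 5, 7]
print(horizontal_copy(2, 7))  # [2, 4, 6]
```

For odd $n$ the odd rows host $\lceil n/2 \rceil$ nodes of the horizontal copy while the even rows host one fewer, which is the reason for the extra row or column mentioned in the proof of Lemma 5.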

Lemma 5 $OD_1(P_n \square P_m) \le 4\left(OD_1(P_{\lceil n/2 \rceil}) + OD_1(P_{\lceil m/2 \rceil})\right) + 2$.

Proof Following the previously described construction, let $\{x_s, y_s\}$ and $\{x_d, y_d\}$ be a generic source-destination pair on the grid $G$ induced by the two paths $P_{\lceil n/2 \rceil}$ and $P_{\lceil m/2 \rceil}$. Let $P = \{\{x_s, y_s\} \equiv \{x_1, y_1\}, \{x_2, y_2\}, \ldots, \{x_k, y_k\} \equiv \{x_d, y_d\}\}$ be the induced oblivious path. By construction each long link appearing in $P$ corresponds to some copy of a long link of $P_{\lceil n/2 \rceil}$ or $P_{\lceil m/2 \rceil}$, so moving on it the distance till the destination is decreased either on the $X$-axis or on the $Y$-axis according to the original path construction. On the other hand, a basic link in $P$ does not correspond directly to a basic link in the original paths. In order to conclude the proof it is then sufficient to note that one step on either $P_{\lceil n/2 \rceil}$ or $P_{\lceil m/2 \rceil}$ is now emulated by at most 4 steps on $G$.

In fact, after 4 basic steps, by construction, the final node cannot be a copy of the first one, see for instance Fig. 6. Finally, the plus 2 of the claim comes just from the fact that $n$ and $m$ can be odd. In such a case, in fact, we could need to add a further row or column in order to obtain $G$, hence needing a further step horizontally or vertically in order to perform a desired path. □

Theorem 7 For a grid $G = (P_n)^2$ with $N = n^2$ nodes, $OD_1(G) = O(\log N)$.


Proof It is sufficient to apply Lemma 5 with $n = m$ and considering the construction of the long links given by Proposition 4. □
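Spelled out, the chain of inequalities behind this proof reads as follows. This is a sketch that plugs in the explicit path bound $4(2\lceil \log_2 n \rceil - 1)$ from the proof of Theorem 4 and assumes $\lceil n/2 \rceil > 1$, so that the path construction applies.

```latex
\begin{align*}
OD_1\big((P_n)^2\big)
  &\le 4\Big(OD_1\big(P_{\lceil n/2 \rceil}\big) + OD_1\big(P_{\lceil n/2 \rceil}\big)\Big) + 2
     && \text{(Lemma 5 with } m = n\text{)} \\
  &\le 8 \cdot 4\big(2\lceil \log_2 \lceil n/2 \rceil \rceil - 1\big) + 2
     && \text{(bound from the proof of Theorem 4)} \\
  &= O(\log n) = O(\log N)
     && \text{(since } \log N = 2 \log n\text{)}.
\end{align*}
```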

By extending such a construction to $d$-dimensional grids $(P_n)^d$, that is to Cartesian products of $d$ paths of $n$ nodes, it is possible to prove the following corollary.

Theorem 8 Given any integer $d \ge 2$, $OD_1((P_n)^d) = O(d^2 \log N)$, where $N = n^d$ is the number of nodes of $(P_n)^d$.

Proof It is sufficient to note that in the $d$-dimensional case, by construction, one step along some dimension can be emulated by at most $d^2$ steps. The same arguments of Theorem 7 then hold. □

Again such a construction is asymptotically optimal, that is with an oblivious diameter logarithmic in the total number of nodes, for the most significant cases of grids having a constant number of dimensions.

6 Conclusion and Future Work

We have given the first general lower bound on the expected oblivious diameter holding for any monotone distance distribution. Moreover, after showing the intractability of the problem in the deterministic case, we have given asymptotically optimal constructions for paths, trees and Cartesian products of graphs, including $d$-dimensional grids for any fixed value of $d$.

Many problems are left open. First of all, even though as stated in the literature only non-increasing distance distributions are expected to yield a low oblivious diameter, can our lower bound be extended to completely arbitrary distance distributions? Curiously, both this paper and [2] require monotone link distributions and one-dimensional routing for the lower bound; this may indicate a fundamental issue with obtaining lower bounds for higher-dimensional spaces.

Moreover, the lower bound concerns only path topologies, that is one-dimensional grids. Does the oblivious diameter get worse in higher dimensional grids? Intuitively this seems to be the case, due to the possibility of having long links changing many dimensions at the same time. For such topologies we do not have a formal proof yet.

Another issue worth investigating is that of determining randomized or deterministic constructions always achieving a polylogarithmic oblivious diameter when at most a polylogarithmic number of outgoing long links per node are allowed.

Is there a construction achieving an $O(\log_k n)$ oblivious diameter for trees in case of at most $k$ outgoing long links per node?

It would also be interesting to give asymptotically optimal deterministic solutions for broader classes of networks, like for instance for grids having a non-constant number of dimensions.

An important final remark is that, differently from the classical problem of determining bounded degree graphs with a minimum diameter, in the considered specific small world graphs deterministic constructions seem to be more effective than probabilistic ones. Is this true in general, that is, when there is no restriction on the topology?


Acknowledgements Comments and suggestions of anonymous referees are gratefully acknowledged.

References

1. Adamic, L.A.: The small world web. In: Proc. of the 3rd European Conference on Research and Advanced Technology for Digital Libraries (ECDL). Lecture Notes in Computer Science, vol. 1696, pp. 443–452. Springer, New York (1999)

2. Aspnes, J., Diamadi, Z., Shah, G.: Fault-tolerant routing in peer-to-peer systems. In: Proc. of the 21st Annual Symposium on Principles of Distributed Computing (PODC), pp. 223–232. ACM Press, New York (2002)

3. Barriere, L., Fraigniaud, P., Kranakis, E., Krizanc, D.: Efficient routing in networks with long range contacts. In: Proc. of the 15th International Conference on Distributed Computing (DISC). Lecture Notes in Computer Science, vol. 2180, pp. 270–284. Springer, Berlin (2001)

4. Fraigniaud, P.: Greedy routing in tree-decomposed graphs. In: Proceedings of the 13th Annual European Symposium on Algorithms (ESA). Lecture Notes in Computer Science, vol. 3669, pp. 791–802. Springer, Berlin (2005)

5. Fraigniaud, P., Gavoille, C., Paul, C.: Eclecticism shrinks even small worlds. In: Proc. of the 23rd Annual ACM Symposium on Principles of Distributed Computing (PODC), pp. 169–178. ACM Press, New York (2004)

6. Garey, M., Johnson, D.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, San Francisco (1979)

7. Kleinberg, J.: Small-world phenomena and the dynamics of information. In: Proc. of the 14th Advances in Neural Information Processing Systems (NIPS) (2001)

8. Kleinberg, J.: The small-world phenomenon and decentralized search. SIAM News 37, 3 (2004)

9. Kleinberg, J.M.: The small-world phenomenon: an algorithm perspective. In: Proc. of the 32nd ACM Symposium on Theory of Computing (STOC), pp. 163–170 (2000)

10. Martel, C., Nguyen, V.: Analyzing Kleinberg's (and other) small-world models. In: Proc. of the 23rd Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing (PODC), pp. 179–188 (2004)

11. Martel, C., Nguyen, V.: Analyzing and characterizing small-world graphs. In: Proc. of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 311–320 (2005)

12. Milgram, S.: The small world problem. Psychol. Today 2, 60–67 (1967)

13. Walsh, T.: Search in a small world. In: Proc. of the 16th International Joint Conference on Artificial Intelligence (IJCAI), pp. 1172–1177 (1999)

14. Wang, X.F., Chen, G.: Complex networks: small-world, scale-free, and beyond. IEEE Circ. Syst. Mag. 3(1), 6–20 (2003)

15. Watts, D.J., Strogatz, S.H.: Networks, dynamics and small-world phenomenon. Am. J. Soc. 105(2), 493–527 (1999)

16. Zhang, H., Goel, A., Govindan, R.: Using the small-world model to improve freenet performance. SIGCOMM Comput. Commun. Rev. 32(1), 79–79 (2002)