An arrival-based framework for human mobility modeling

978-1-4673-1239-4/12$31.00 ©2012 IEEE


Dmytro Karamshuk, IMT Lucca, Lucca, Italy

[email protected]

Chiara Boldrini, Marco Conti, Andrea Passarella, IIT-CNR

Pisa, [email protected]

Abstract: Modeling human mobility is crucial in the performance analysis and simulation of mobile ad hoc networks, where contacts are exploited as opportunities for peer-to-peer message forwarding. The current approach to human mobility modeling has been based on continuously modifying models, trying to embed in them the newest features of mobility properties (e.g., visiting patterns to locations or inter-contact times) as they came up from trace analysis. As a consequence, typically these models are neither flexible (i.e., features of mobility cannot be changed without changing the model) nor controllable (i.e., the exact shape of mobility properties cannot be controlled directly). In order to take into account the above requirements, in this paper we propose a mobility framework whose goal is, starting from the stochastic process describing the arrival patterns of users to locations, to generate pairwise inter-contact times and aggregate inter-contact times featuring a predictable probability distribution. We validate the proposed framework by means of simulations. In addition, assuming that the arrival process of users to locations can be described by a Bernoulli process, we mathematically derive a closed form for the pairwise and aggregate inter-contact times, proving the controllability of the proposed approach in this case.

I. INTRODUCTION

Modeling human mobility is crucial in performance analysis of networking protocols for mobile ad hoc networks. In mobile ad hoc networks messages are routed by the users of the network (which exchange them upon encounter with other users) and eventually delivered to their destinations. The delay experienced by messages is thus a function of the pairwise inter-contact time, which is the time between two consecutive contacts for a pair of nodes. Characterizing the inter-contact time is therefore essential for modeling the performance of networking protocols for mobile ad hoc networks. Inter-contact times are determined by the movement patterns of users: users visiting the same locations will meet more frequently, and their inter-contact time will be shorter.

The first step in modeling human mobility is to understand how users move. Recently, starting from traces of real user movements, there has been a huge research effort to characterize the spatio-temporal (i.e., how users travel across locations [10] [3] [22]) and social (i.e., how long users stay together and how long they have to wait before meeting again [5] [7]) properties of human mobility. There is a general agreement that users tend to travel most of the time along short distances while only occasionally following very long paths. In addition, user movements are generally characterized by a high degree of predictability: users tend to visit the same locations frequently, and to appear at them at about the same time. Less clear is how inter-contact times are characterized. Many hypotheses have been made (about them featuring an exponential distribution [9], a Pareto distribution [5], a Pareto with exponential cut-off distribution [13], a LogNormal distribution [7], etc.), but the problem has yet to be solved.

Building upon the above findings, the current approach to human mobility modeling has been so far based on trying to reflect in the model the newest features of mobility properties as they came up from trace analysis. Typically, each model focuses on just a few properties of human mobility. The class of location-based mobility models aims to realistically represent user mobility patterns in space. They are typically concerned with the regular reappearance at a set of preferred locations [12] or with the length of paths travelled by the users [16]. Similarly, there are models mostly focused on the accurate representation of the time-varying behavior of users, often relying on very detailed schedules of human activities [8] [23]. Finally, the class of social-based mobility models aims to exploit the relation between sociality and movements, and to formalize social interactions as the main driver of human movements [1] [2].

The disadvantage of the current approach to modeling human mobility is that the proposed models are intrinsically bound to the current state of the art on trace analysis, and typically need to be redesigned from scratch any time a new discovery is made. In addition, with current mobility models it is typically difficult, if not impossible, to fine tune the mobility properties (e.g., obtaining inter-contact times featuring a probability distribution with controllable parameters). Overall, flexibility and controllability are currently missing from available models of human mobility. Flexibility implies allowing for different distributions of mobility properties (e.g., return times to locations or inter-contact times) to be used with the model. The importance of flexibility is twofold. First, it gives the opportunity to evaluate networking protocols in different scenarios, and test their robustness to different mobility behaviors. Second, on more practical terms, it allows for changing the model upon new discoveries from trace analysis without the need to start over from a clean slate. On the other hand, controllability relates to the capability of obtaining a predictable output starting from a given input. This can be done only at a coarse grain with the majority of available mobility models. For example, in social-based mobility, where the number of social relationships determines the shape of inter-contact times, an appropriate configuration can lead to heavy-tailed inter-contact times [1]. However, there is no direct way for controlling the parameters characterizing this distribution, and a fine tuning can be attempted only with a trial-and-error approach.

In light of the above discussion, the contribution of this paper is twofold. First, we propose a mobility framework aiming to be flexible and controllable (Section III). Our framework takes as input the social graph representing the social relationships between the users of the network and the stochastic processes characterizing the visiting patterns of users to locations. This information can be extracted from available mobility traces or collected from location-based social networking applications such as Gowalla and Foursquare. Based on the input social graph, communities are identified and are assigned different locations. Thus, people belonging to the same community share a common location where the members of the community meet. Then, users visit these locations over time based on a configurable stochastic process. The proposed framework thus builds a network of users and locations (called arrival network), where a link between a generic user i and a location l characterizes the way user i visits location l. As emerges from the above description, the framework combines the social dimension of user movements together with spatial and temporal preferences, while at the same time allowing for flexibility in visiting patterns and in the resulting inter-contact times (as shown in Section IV).

The second contribution of this work lies in considering a specific instance of the framework, and analytically deriving the relation between arrival patterns of users to locations and the resulting pairwise and aggregate inter-contact time distribution. In our analysis we assume that time is slotted, and we represent the way users arrive at locations as a Bernoulli process. We prove that when the arrival process describing how users visit their assigned locations is Bernoulli, then the contact process is also Bernoulli, and the pairwise inter-contact time features a geometric distribution whose parameter can be derived in closed form (Section V-A). This shows that there can be a direct control on the output of the mobility model using the proposed framework, even if we do not have a general enough analytical model to cover all cases.

As for aggregate inter-contact times, recently Passarella and Conti [21] have investigated how aggregate inter-contact times depend on the pairwise statistics from which they originate. Understanding this dependence is extremely important, because aggregate statistics are much easier to collect in practice than pairwise ones, and, in the past, conclusions were often drawn from the former considering them to be representative of the latter. In this paper we take a step further into the investigation of aggregate and pairwise statistics by studying how individual arrival patterns to locations affect the aggregate inter-contact time. More specifically, we prove (Section V-B) that heavy-tailed aggregate inter-contact times, which have emerged from the analysis of real mobility traces [5], can be obtained from simple heterogeneous Bernoulli arrivals. This confirms the main result in [21], i.e., that heterogeneity in pairwise statistics can lead to aggregate statistics that are very distant in distribution.

II. RELATED WORK

A comprehensive overview of the state of the art in mobility modeling was presented in [14]. The main findings in human mobility research can be classified along the three axes of spatial, temporal, and connectivity properties. Spatial properties pertain to the behavior of users in the physical space (e.g., the distance they travel), temporal properties to the time-varying features of human mobility (e.g., the time users spend at specific locations), connectivity properties to the interactions between users.

Similarly, three dominating techniques have emerged in the approaches to modeling human mobility: maps of preferred locations, personal agendas, and social graphs. The models of the first group account for the properties characterizing regular reappearance of users at the sets of preferred locations. Their general approach is to store the maps (i.e., the sets) of preferred places for each of the users and to explore them while deciding on the next destination for their walks. The main representatives of this group are SLAW [22] and the model proposed by Song et al. [16]. While they are able to satisfy the main spatial properties of human mobility trajectories, these models do not pay enough attention to the other (social and temporal) aspects of the movements.

The second class of models focuses on reproducing realistic temporal patterns of human mobility, explicitly including repeating daily activities in human schedules. The most comprehensive approach of this group is presented in [23]. The model incorporates a detailed geographic topology, personal schedules and a motion generator defined for more than 30 different types of activities. Although the model gives an extremely thorough representation of human movements in very particular scenarios, it does not explain the main driving forces of human mobility and it is too complex for mathematical analysis.

The most recent and most rapidly evolving trend in modeling human mobility is based on incorporating complex network theory and considering human relations as the main driver of individual movements. The main idea is that the destination for the next movement of a user depends on the position of people with whom the user shares social ties. The first models of this class were CMM [17] and HCMM [1], although many others have recently been developed.

Figure 1. Framework overview

Although existing models are able to reproduce realistic mobility trajectories in a wide range of scenarios, there is still the need for a better understanding of the correlations between the main characteristics of human movements (i.e., patterns of user co-appearance at locations) and the emerging connectivity properties (i.e., inter-contact times). To the best of our knowledge, here we present the first approach where the time sequence of user re-appearance in shared meeting places is explicitly modeled. Therefore, with respect to existing models, our approach is easily customizable to any temporal pattern of user co-appearance in locations and gives a natural framework for the mathematical analysis of the contact sequences between users. Additionally, we reflect the social dimension of human mobility in the distribution of the shared meeting places between communities of tightly coupled users.

III. THE PROPOSED MOBILITY FRAMEWORK

In this section we introduce our mobility framework, designed around the three main dimensions (social, spatial and temporal) of human mobility (see Figure 1). The social dimension is explicitly captured in the framework by taking a graph of human social relationships as input parameter. This graph can be any well known graph, such as a random graph [18] or a scale-free graph [18], or it can be extracted from real traces. Then, the framework adds the spatial dimension to the social ties by generating an arrival network, which is a bipartite graph that connects users and meeting places. A link between a user and a meeting place in the arrival network implies that the user visits that place during its movements. We exploit the fact that the structure of communities in the social graph has a significant impact on human mobility, thus we assign users to meeting places such that communities of tightly connected users (cliques, in complex network terminology) share similar meeting places.

In order to add the temporal dimension to the model, we describe the way users visit the meeting places to which they are connected in terms of stochastic point processes [24]. A stochastic point process is a stochastic process that characterizes how events (usually called arrivals) are distributed over time. By sampling from the random variables representing the time between consecutive arrivals, we obtain the time sequences of the visits of a user to a given location. Then, the contact network, i.e., the network describing the contacts between nodes, can be obtained by assuming that two nodes are in contact with each other if they happen to be at the same time in the same meeting place.
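As a minimal sketch of this sampling step, the following Python fragment turns draws of inter-arrival times into the sequence of time slots at which a user visits one meeting place. The function name `arrival_slots`, the slot numbering starting from 1, and the example interval distribution are assumptions made for illustration only.

```python
import random

def arrival_slots(sample_interval, horizon):
    """Return the slots (1..horizon) at which a user arrives at one meeting place.

    sample_interval: zero-argument callable returning a positive integer,
    i.e. one draw of the time between two consecutive arrivals.
    """
    slots = []
    t = sample_interval()          # first arrival
    while t <= horizon:
        slots.append(t)
        t += sample_interval()     # next arrival = previous arrival + sampled interval
    return slots

# Usage: a hypothetical process whose inter-arrival times are 1, 2 or 3 slots.
rng = random.Random(7)
visits = arrival_slots(lambda: rng.randint(1, 3), horizon=20)
```

Any discrete inter-arrival distribution can be plugged in through `sample_interval`, which is the kind of flexibility the framework aims for.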

A. The social dimension of human mobility

Social interactions between users have emerged as one of the key factors defining human mobile behavior, because individuals belong to social communities and their social ties strongly affect their movement decisions [6]. As anticipated, in our analysis we consider proximity-based communities, i.e., communities whose members share a common meeting place (e.g., office, bar, apartment). The fact that all members of the community visit a shared meeting place implies that users are socially connected with all other members of the community, and, therefore, form fully connected components (i.e., cliques) in the social graph.

Such cliques in real social networks exhibit an overlapping and hierarchical structure [19] [20]. Each user belongs to several overlapping cliques, representing different social circles (e.g., friends, relatives, colleagues). On the other hand, each clique is itself composed of a number of nested cliques, which share additional meeting places that are not common to all users of the parent clique. For example, a company shares a set of offices visited by all its employees, while each division has its own working place.

B. Adding the spatial dimension to social graphs

The goal of the proposed framework is to reflect in the spatial behavior of users the structure of their social communities. As anticipated, we represent the relation between the spatial and the social dimension of human mobility by means of a bipartite graph of users and meeting places, which we call arrival network. In the algorithm (summarized in Table I) for generating the arrival network starting from the input social graph we mainly need two components: a clique finding algorithm (that also detects overlapping cliques) and a way for reproducing hierarchical cliques.

Figure 2. A round of assigning social cliques to meeting places; cliques are marked with different line styles

Table I
ALGORITHM FOR BUILDING THE ARRIVAL NETWORK - INPUT: SOCIAL GRAPH G AND REMOVAL PROBABILITY α.

1) Divide the input social graph G into a set of overlapping cliques, such that the sizes of the cliques are maximum and each link is assigned to exactly one clique. To this aim, the Bron-Kerbosch algorithm [4] can be used.

2) To each clique assign a separate meeting place, i.e., create a new meeting place and a set of links between this place and each member of the clique in the arrival network.

3) Randomly remove each link in the social graph with probability α, inducing the emergence of new nested cliques.

4) Repeat the procedure starting from the first step, until there are no links left in the input graph.

The first component corresponds to steps 1 and 2 in Table I. In each round, the social graph is divided into a set (called cover) of overlapping cliques, such that each link of the graph is assigned to exactly one clique. To this purpose, we use the Bron-Kerbosch algorithm [4]. The cover of each round tries to capture the biggest possible cliques. For each of the newly identified cliques, we create a new meeting place and assign all members of the clique to the meeting place. In other words, we create a new meeting place vertex in the arrival network and we add links between this vertex and all members of the community. As an example, we describe in Figure 2 how cliques are reflected into corresponding meeting places.

The second component (step 3 in Table I) of the algorithm for generating the arrival network allows us to generate nested cliques. More specifically, our algorithm tries to identify cliques of lower size nested into those identified in the previous round. To do so, cliques are split according to a very simple random process, in which every link in the social graph is randomly removed with a constant, configurable probability α (removal probability). This leads to the emergence of smaller cliques, which are indeed nested into the original ones. This simple strategy also has the advantage of allowing for a fine control of the number of meeting places shared by users. In fact, each link participates in a geometrically distributed (with parameter α) number of rounds of meeting place assignments. As each link is assigned to exactly one clique per round, the number of cliques that include that link is also geometrically distributed. This implies that the pair of users i, j with which this link is associated will share a number $L_{ij}$ of cliques (and thus of meeting places) that is itself geometrically distributed with parameter α.
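The geometric law mentioned above can be written out explicitly. The following is a sketch for a pair of users joined by a link in the input graph, assuming that the link is removed independently with probability α at the end of every round:

```latex
\[
  P(L_{ij} = k) = \alpha\,(1-\alpha)^{k-1}, \quad k = 1, 2, \dots
  \qquad\Longrightarrow\qquad
  E[L_{ij}] = \sum_{k \ge 1} k\,\alpha\,(1-\alpha)^{k-1} = \frac{1}{\alpha}.
\]
```

For instance, α = 0.5 gives on average two shared meeting places per linked pair, and α = 0.2 gives five.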

The algorithm for generating the arrival network stops (step 4 in Table I) when there are no more links to be removed in the social graph.
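The four steps of Table I can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the reference implementation: the cover is built greedily (the largest clique among the still unassigned links is taken first), Bron-Kerbosch is used without pivoting, and the names `maximal_cliques`, `clique_cover` and `build_arrival_network` are hypothetical.

```python
import random
from itertools import combinations

def maximal_cliques(adj):
    """Bron-Kerbosch (no pivoting) on an adjacency dict {node: set of neighbours}."""
    cliques = []
    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    expand(set(), set(adj), set())
    return cliques

def clique_cover(links):
    """Step 1 (greedy assumption): assign every link to exactly one clique."""
    remaining = set(links)
    cover = []
    while remaining:
        adj = {}
        for u, v in remaining:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        best = max(maximal_cliques(adj), key=len)   # biggest clique still available
        cover.append(best)
        remaining -= {frozenset(e) for e in combinations(best, 2)}
    return cover

def build_arrival_network(edges, alpha, rng=None):
    """Return a list of meeting places, each given as the set of users linked to it."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1] for the procedure to terminate")
    rng = rng or random.Random()
    links = {frozenset(e) for e in edges}
    places = []
    while links:                                    # step 4: repeat until no links are left
        places.extend(clique_cover(links))          # steps 1-2: one meeting place per clique
        links = {e for e in links if rng.random() >= alpha}  # step 3: remove w.p. alpha
    return places

# Usage: a triangle (users 0, 1, 2) plus one extra link (2, 3).
places = build_arrival_network([(0, 1), (1, 2), (0, 2), (2, 3)], alpha=0.5,
                               rng=random.Random(1))
```

Recomputing the maximal cliques after every greedy choice keeps the sketch short but is expensive on large graphs; a production version would update the clique set incrementally.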

1) From meeting places to geographical locations: The analysis of the algorithm in Table I reveals that the number of generated meeting places grows with the number of cliques. Thus, the more cliques in the input social graph, the more meeting places are required. The proliferation of meeting places is not of big concern, as meeting places might correspond to very small geographic areas (e.g., offices). However, in order to improve the realism of the generated scenario, we want to combine these meeting places into a fixed number L of wider physical locations (e.g., combining offices into a business center).

To this aim, we exploit the observation that a general urban environment is characterized by the existence of locations with an extremely large number of meeting places (e.g., business centers), while for the majority of locations this number is significantly smaller (e.g., private houses). This is conceptually similar to a power law distribution of meeting places per location. Therefore, the grouping of the meeting places into a given number of physical locations can be done with the de facto standard for reproducing such a distribution, i.e., the preferential attachment scheme [18]. In this scheme, meeting places are one by one randomly attached to one of the available locations, with the probability of selecting a specific location being proportional to the number of places already attached to it.
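The grouping step can be sketched as follows. One detail is an assumption of this sketch: since every location starts empty, each location is given a base weight of 1 so that it can be selected at all; the function name `group_places` is likewise hypothetical.

```python
import random

def group_places(num_places, num_locations, rng=None):
    """Attach meeting places to locations with preferential attachment.

    Returns (assignment, counts): assignment[k] is the location of place k,
    counts[l] is the number of places attached to location l.
    """
    rng = rng or random.Random()
    counts = [0] * num_locations
    assignment = []
    for _ in range(num_places):
        # Weight proportional to places already attached; the +1 is an assumed
        # smoothing term that lets empty locations be chosen.
        weights = [c + 1 for c in counts]
        loc = rng.choices(range(num_locations), weights=weights)[0]
        counts[loc] += 1
        assignment.append(loc)
    return assignment, counts

# Usage: 200 meeting places grouped into 10 physical locations.
assignment, counts = group_places(200, 10, rng=random.Random(3))
```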

C. The temporal dimension of user visits to meeting places

The arrival network that we have built in the previous section tells us which are the meeting places visited by each user. Here we want to further characterize the way users visit locations by considering the temporal properties of such visits. To this aim, we assign to each link in the arrival network a stochastic point process $A_i^l$ that describes the arrivals of user i to meeting place l over time. In this work, we consider only discrete point processes, leaving the continuous case for future work. In a discrete point process, time is slotted. As an example, in light of the results from trace analysis, a single time slot can be taken to be equal to one day.

Once we have characterized the time at which users visit their assigned meeting places, we can build the contact graph of the network (Figure 1). In fact, a contact between two users happens if the two users appear in the same meeting place at the same time slot. The contact graph can be fully mathematically characterized (we provide an example of this characterization in Section V for the case of arrival processes being heterogeneous Bernoulli processes) or it can be obtained from simulations. From the simulation standpoint, the output of the proposed framework is a contact trace in the form <user1, user2, startTime, endTime>, where the first two elements are the node identifiers and the last two denote the start and end time slots of the contact. This trace can then be fed into a networking simulator like the ONE simulator [15].
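A possible way of deriving such a trace from the visit sequences is sketched below. Two conventions are assumptions of this sketch rather than part of the framework description: consecutive slots in which a pair is in contact are merged into a single contact, and `endTime` is the last slot of the contact (inclusive). The name `contact_trace` is hypothetical.

```python
from itertools import combinations

def contact_trace(visits):
    """Build (user1, user2, startTime, endTime) tuples.

    visits: dict mapping (user, place) to the set of slots in which the user
    is at that meeting place.
    """
    by_place = {}
    for (user, place), slots in visits.items():
        by_place.setdefault(place, {})[user] = set(slots)

    together = {}   # (user1, user2) -> slots in which they share some place
    for users in by_place.values():
        for u1, u2 in combinations(sorted(users), 2):
            common = users[u1] & users[u2]
            if common:
                together.setdefault((u1, u2), set()).update(common)

    trace = []
    for (u1, u2), slots in together.items():
        ordered = sorted(slots)
        start = prev = ordered[0]
        for t in ordered[1:]:
            if t != prev + 1:               # gap: close the current contact
                trace.append((u1, u2, start, prev))
                start = t
            prev = t
        trace.append((u1, u2, start, prev))
    return sorted(trace)

# Usage: users "a" and "b" share meeting places "m1" and "m2".
example = contact_trace({
    ("a", "m1"): {1, 2, 5}, ("b", "m1"): {2, 5, 6},
    ("a", "m2"): {3},       ("b", "m2"): {3},
})
```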

IV. CASE STUDIES

We discussed in Section I that a desired property of a mobility framework is its ability to reproduce different distributions for the main mobility properties. From this standpoint, in this section we analyze, by means of simulations, the behavior of the framework described above, focusing on the distribution of inter-contact times that it generates. Complementary to this, in Appendix B we provide a preliminary validation of the model based on real world mobility traces.

As we already discussed, inter-contact times play a major role in the delay experienced by messages in mobile ad hoc networks. The evaluation provided below, although preliminary, clearly indicates that the proposed framework is able to generate different distributions for the inter-contact time. In some cases, as in the first scenario discussed below, we have a complete understanding of how a given inter-contact time distribution can be obtained from the input parameters of the framework. However, the derivation of a general analytical model relating the arrivals of users and the resulting inter-contact times is ongoing work.

In order to instantiate the proposed framework, we need to define its input parameters: the social graph G, the removal probability α, and the arrival processes $A_i^l$ for each user i visiting a location l (Table I). As input social graph we consider two random graphs $G_{n_1,\chi_1}$ and $G_{n_2,\chi_2}$ of $n_1 = 500$ and $n_2 = 1000$ users, in which each possible edge exists with probability $\chi_1 = 0.2$ and $\chi_2 = 0.1$, respectively. Recall that a random graph $G_{n_i,\chi_i}$ is obtained by starting with a set of $n_i$ vertices and randomly adding edges between each pair of vertices with probability $\chi_i$ [18]. We evaluate both graphs when the removal probability used by the algorithm for generating the arrival network is $\alpha_1 = 0.5$ and $\alpha_2 = 0.2$. These settings correspond to an average number of locations shared by a pair of users (which is geometrically distributed) equal to $1/\alpha_1 = 2$ and $1/\alpha_2 = 5$, respectively. As a result, we obtain four arrival networks with different structural parameters, which we explore in simulations. For each of these arrival networks, we study the resulting inter-contact times obtained by changing the arrival processes $A_i^l$ of users to meeting places. Simulations are run for 10000 time units of simulated time, and results are shown with a confidence level of 99.9%.

In the first experiment we model users' arrivals to places with Bernoulli arrival processes. In a Bernoulli process, the probability of an arrival at a given time slot is constant and equal to $\rho_{A_i^l}$, which also corresponds to the rate of the process. Here we assign rates $\rho_{A_i^l}$ of the Bernoulli arrival processes such that $\rho_{A_i^l} = e^{-\frac{1}{2}Y^2}$, where Y is a standard normal random variable. These settings correspond to the case which we mathematically characterize in Section V. Figure 3 depicts the result of simulations for each arrival network. For instance, Figure 3.a depicts simulation results for the network with parameters n = 500, χ = 0.2 and α = 0.5. As we can see from the figure, the resulting aggregate inter-contact time CCDF for this network decays as a power law with exponent γ = −2, i.e., $F(\tau) \sim \tau^{-2}$. In the other arrival networks we observe similar results: while the structural properties of the network influence the parameter of the aggregate inter-contact time CCDF, the shape of the distribution remains the same for all the networks, and it can be approximated with a power law function of exponent γ = −2.
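The inputs of this first experiment can be sketched in Python as follows. The helper names (`random_graph`, `sample_rate`, `bernoulli_arrivals`) and the slot numbering from 1 are assumptions of this sketch; only the random graph definition and the rate assignment $e^{-\frac{1}{2}Y^2}$ come from the text.

```python
import math
import random
from itertools import combinations

def random_graph(n, chi, rng):
    """Random graph G(n, chi): each possible edge exists with probability chi."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < chi]

def sample_rate(rng):
    """Rate of one Bernoulli arrival process: exp(-Y^2 / 2), Y standard normal."""
    y = rng.gauss(0.0, 1.0)
    return math.exp(-0.5 * y * y)

def bernoulli_arrivals(rate, horizon, rng):
    """Slots (1..horizon) with an arrival; each slot is drawn independently."""
    return [t for t in range(1, horizon + 1) if rng.random() < rate]

# Usage: a small instance (the experiments above use n = 500 or n = 1000).
rng = random.Random(42)
edges = random_graph(50, 0.2, rng)
rate = sample_rate(rng)
arrivals = bernoulli_arrivals(rate, horizon=100, rng=rng)
```

The resulting `edges` could be passed to an arrival network generator, with one independently sampled rate per user and meeting place link.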

Figure 3. The aggregate inter-contact time for different arrival networks. [Four log-log panels of the CCDF versus τ (time units): a) n=500 χ=0.2 α=0.5; b) n=500 χ=0.2 α=0.2; c) n=1000 χ=0.1 α=0.5; d) n=1000 χ=0.1 α=0.2. Each panel shows the simulation curve, a ∼ τ^(-2) reference line and the 99.9% confidence intervals.]

In the second experiment we change the type of arrival processes in the network and study the corresponding change in the distribution of aggregate inter-contact times. More specifically, we consider the case when the arrival processes are point processes with independent uniformly distributed intervals, i.e., the intervals between arrivals of process $A_i^l$ are drawn from a discrete uniform distribution of range $[1, b]$, where $b = 2 \times \lfloor 1/\rho_{A_i^l} \rfloor - 1$, $\lfloor \cdot \rfloor$ is the floor (greatest integer) function and $\rho_{A_i^l}$ is drawn from the same distribution as in the first experiment (note that the average inter-arrival time of a user to a location is approximately the same as in the previous experiment). We model two networks with parameters $\{n_1 = 500, \chi_1 = 0.2, \alpha_1 = 0.5\}$ and $\{n_2 = 500, \chi_2 = 0.2, \alpha_2 = 0.2\}$. Therefore, the only difference with respect to the first experiment is the type of arrival processes. The results presented in Figure 4 clearly show that the aggregate inter-contact time distribution also in this case decays as a power law with exponent γ = −2. In other words, this result reflects the corresponding result for the Bernoulli arrival processes with similar distribution of the arrival rates. Thus, it may indicate that the distribution of the arrival rates (which is approximately the same in both experiments and equal to $\rho_{A_i^l}$) plays a major role in the emergence of the power law aggregate inter-contact time distribution.
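The remark that the average inter-arrival time is approximately preserved can be checked directly from the definition of b. The worked example (ρ = 0.3) is an illustrative value, not one taken from the experiments:

```latex
\[
  E[\text{interval}] = \frac{1 + b}{2}
  = \frac{1 + 2\lfloor 1/\rho_{A_i^l} \rfloor - 1}{2}
  = \left\lfloor \frac{1}{\rho_{A_i^l}} \right\rfloor
  \le \frac{1}{\rho_{A_i^l}} .
\]
% Example: \rho = 0.3 gives \lfloor 1/0.3 \rfloor = 3 and b = 5, so the mean
% interval is 3 slots, against 1/\rho \approx 3.33 slots for the Bernoulli process.
```

The two averages coincide whenever $1/\rho_{A_i^l}$ is an integer and otherwise differ by less than one slot.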

Figure 4. The aggregate inter-contact time for the case of arrival processes with independent uniformly distributed intervals. [Two log-log panels of the CCDF versus τ (time units): a) n=500 χ=0.2 α=0.5; b) n=500 χ=0.2 α=0.2. Each panel shows the simulation curve, a ∼ τ^(-2) reference line and the 99.9% confidence intervals.]

To better understand the influence of the arrival rates on the resulting inter-contact time characteristics, in the third experiment we simulate arrival networks where arrival processes are Bernoulli processes, like in the first experiment, but this time with identical rates. More specifically, we model two networks with the same parameters {n = 500, χ = 0.2, α = 0.5}, in which all the rates of arrival processes are identical and equal to $\rho^{(1)}_{A_i^l} = 1/2$ for the first network, and $\rho^{(2)}_{A_i^l} = 1/3$ for the second one. Recall that the rate of the arrival process is the reciprocal of the average of the inter-arrival times. Therefore, the first case corresponds to the network where the average inter-arrival time for all processes is equal to $1/\rho^{(1)}_{A_i^l} = 2$ time units, and the second case to the network with average inter-arrival time for all processes equal to $1/\rho^{(2)}_{A_i^l} = 3$ time units. From Figure 5 (plotted in lin-log scale) we can see that the aggregate inter-contact time characteristic in such networks differs from the results we obtained before. More specifically, the resulting distribution of the aggregate inter-contact times decays as an exponential function rather than as a power law function. This result additionally supports the hypothesis that the resulting aggregate inter-contact time characteristic significantly depends on the distribution of arrival rates.
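The decay constants fitted in Figure 5 (about 0.29 and 0.12) can be compared with a back-of-the-envelope estimate. The sketch below assumes that a pair sharing a single meeting place meets in a slot with probability ρ², i.e., that the two arrival processes are independent; this assumption and the function name `decay_constant` are ours.

```python
import math

def decay_constant(contact_prob):
    """Rate k such that the geometric CCDF (1 - p)**tau equals exp(-k * tau)."""
    return -math.log(1.0 - contact_prob)

for rho in (1 / 2, 1 / 3):
    p = rho * rho   # both users present in the same slot (independence assumed)
    print(f"rho={rho:.3f}  decay={decay_constant(p):.2f}")
# prints decay=0.29 for rho=0.500 and decay=0.12 for rho=0.333
```

Under this assumption the estimate agrees with the exponents of the reference curves in Figure 5 to two decimal places.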

Figure 5. The aggregate inter-contact time for arrival networks with identical arrival rates. [Two lin-log panels of the CCDF versus τ (time units): a) ρ = 0.5, with reference curve ∼ e^(-0.29τ); b) ρ = 1/3, with reference curve ∼ e^(-0.12τ). Each panel shows the simulation curve and the 99.9% confidence intervals.]

The preliminary results discussed in this section provide an indication that the distribution of the rates of the arrival processes plays a major role in the resulting aggregate inter-contact times between users. Moreover, these results allowed us to show how very different distributions for the aggregate inter-contact times can be obtained starting from simple Bernoulli arrival processes. This finding is very interesting from the standpoint of a mathematical analysis of the proposed framework, as Bernoulli processes possess a number of properties (e.g., single parameter, memoryless property) that significantly simplify the analysis, as we show in the following section.

V. ANALYTICAL MODEL FOR THE CONTACT PROCESS

In this section we show how, in a concrete instance of the framework, we are able to analytically derive the distributions of pairwise and aggregate inter-contact times starting from a given discrete stochastic process describing how users visit locations, thus highlighting the controllability of the framework. For the sake of completeness, the findings from this section are validated in Section VI by comparing analytical results with simulations.

In the following we study the dependence between individual arrival processes of users and the corresponding contact processes between them. A contact process describes how users meet with each other. Assuming that two users $U_i$ and $U_j$ can meet at $L_{ij}$ distinct meeting places, the contact process between users i and j comprises all contacts happening at all $L_{ij}$ shared meeting places. The time between consecutive contacts in the contact process defines the inter-contact times between the pair of nodes. In the following we also characterize the single-place contact process, as the contact process between users $U_i$ and $U_j$ limited to a specific meeting place $M_l$.

As anticipated, in this analysis we model arrival processes as Bernoulli processes. We show that, if the individual arrival processes are Bernoulli processes, then the contact process and the single-place contact process are also Bernoulli processes for any pair of users. As inter-arrival times for a Bernoulli process feature a geometric distribution, we obtain that from geometric inter-arrival times to specific meeting places (corresponding to Bernoulli arrivals) a geometric distribution of pairwise inter-contact times follows. Additionally, we show that the rates of the contact processes depend on the rates of the arrival processes. Starting from this dependence, we are able to derive analytically also the aggregate inter-contact times as a function of the arrival rates of users to meeting places. Specifically, we describe the conditions for which the aggregate inter-contact time has a power law shape.

Before proceeding to the details of our analysis, we first introduce the notation used throughout the section. We consider an arrival network made up of $N$ users and $L$ meeting places. We assume that each user $U_i$ visits place $M_l$ according to a Bernoulli process $A_i^l$ with rate $\rho_{A_i^l}$. For each meeting place $M_l$ and for each pair of users $U_i$ and $U_j$ we characterize the single-place contact process $C_{ij}^l$ (of rate $\rho_{C_{ij}^l}$) and the contact process $C_{ij}$ of rate $\rho_{C_{ij}}$, aggregated over the $L_{ij}$ shared meeting places. The latter defines the distribution of pairwise inter-contact times. We denote the complementary cumulative distribution function (CCDF) of the pairwise inter-contact times of rate $\rho$ with $F_\rho(\tau)$, and that of the aggregate inter-contact times with $F(\tau)$. $F(\tau)$ is obtained as a function of the probability density function (PDF) $f_P(\rho)$ of the rates of individual inter-contact times. The notation is summarized in Table II. Due to space limitations, the complete proofs for the results shown in this section can be found in Appendix A.

Table II
TABLE OF NOTATION

$N$                  number of users in the arrival network
$L$                  number of meeting places in the arrival network
$U_i$                user $i$
$M_l$                meeting place $l$
$L_{ij}$             number of shared meeting places between users $U_i$ and $U_j$
$A_i^l$              arrival process of user $U_i$ to meeting place $M_l$
$C_{ij}^l$           single-place contact process between users $U_i$ and $U_j$ at meeting place $M_l$
$C_{ij}$             contact process between users $U_i$ and $U_j$
$\rho_{A_i^l}$       rate of arrival process $A_i^l$
$\rho_{C_{ij}^l}$    rate of single-place contact process $C_{ij}^l$
$\rho_{C_{ij}}$      rate of contact process $C_{ij}$
$E[P]$               expectation of the rate of pairwise inter-contact times
$F_\rho(\tau)$       CCDF of individual inter-contact times $\tau$ between a pair of nodes whose rate is equal to $\rho$
$f_P(\rho)$          PDF of the rates of individual inter-contact times
$F(\tau)$            CCDF of the aggregated inter-contact times

A. Contact process for a pair of users

In this section, assuming Bernoulli arrivals to locations, we analytically characterize the contact process between a pair of users. To this aim, consider two Bernoulli processes, $A_i^l$ and $A_j^l$, describing arrivals of users $U_i$ and $U_j$ in a shared place $M_l$. For a Bernoulli process, the probability $0 < \rho \le 1$ of an arrival in a time slot $\tau$ is constant (i.e., does not depend on $\tau$), and is called the parameter or the rate of the process. Moreover, time intervals between arrivals are independent geometrically distributed random variables.
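As an illustration of these two properties, the following minimal Python sketch draws one Bernoulli arrival process and compares the empirical mean inter-arrival time with the geometric mean $1/\rho$. The rate value, the number of slots, the seed and the function names are illustrative assumptions and are not taken from the paper.

```python
import random


def bernoulli_arrivals(rho, n_slots, rng):
    """Return the 1-based slot indices in which an arrival occurs."""
    return [t for t in range(1, n_slots + 1) if rng.random() < rho]


def inter_arrival_times(arrival_slots):
    """Gaps, in slots, between consecutive arrivals."""
    return [b - a for a, b in zip(arrival_slots, arrival_slots[1:])]


rng = random.Random(42)          # fixed seed, for reproducibility only
rho = 0.2                        # illustrative rate (assumption)
gaps = inter_arrival_times(bernoulli_arrivals(rho, 200_000, rng))
mean_gap = sum(gaps) / len(gaps)
# A geometric law on {1, 2, ...} with parameter rho has mean 1 / rho.
print(f"empirical mean gap = {mean_gap:.3f}, expected = {1 / rho:.3f}")
```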

We assume that individual arrival processes are independent, and that a contact between two users happens if both of them decide to visit place $M_l$ in the same time slot. Thus, the single-place contact process $C_{ij}^l$ between user pair $U_i$, $U_j$ at meeting place $M_l$ can be obtained from the intersection of the individual Bernoulli arrival processes of users $U_i$ and $U_j$ at meeting place $M_l$. An example of the intersection of individual arrival processes is provided in Figure 6. In the following lemma we prove that the single-place contact process $C_{ij}^l$ is also a Bernoulli point process.

Figure 6. The single-place contact process as an intersection of arrival processes (timelines of $A_i^l$, $A_j^l$ and $C_{ij}^l$ over time slots 1, 2, ..., n)

Lemma 1 (Single-place contact process): The single-place contact process $C_{ij}^l$ resulting from independent Bernoulli arrival processes $A_i^l$ and $A_j^l$, of rates $\rho_{A_i^l}$ and $\rho_{A_j^l}$ respectively, is a Bernoulli process of rate $\rho_{C_{ij}^l} = \rho_{A_i^l} \times \rho_{A_j^l}$.

The intuitive proof for the above lemma is that the probability of a contact at meeting place $M_l$ is equal to the probability that both users are at meeting place $M_l$ in the same time slot. This can be obtained as the product $\rho_{A_i^l} \times \rho_{A_j^l}$, recalling that, for a Bernoulli process, the rate of the process is equal to the probability of an arrival in a time slot. A discrete stochastic process in which arrivals happen with constant probability $\rho_{A_i^l} \times \rho_{A_j^l}$ is again a Bernoulli process.

In the following we focus on the contact process between a pair of users $U_i$, $U_j$, i.e., on their contacts in the $L_{ij}$ shared meeting places. A contact happens between the two users in a given time slot if they meet in at least one of the $L_{ij}$ meeting places that they share. Thus, the contact process between users $U_i$ and $U_j$ can be obtained by merging (as shown in [24]) their single-place contact processes (Figure 7). In the following theorem we show that if single-place contact processes are Bernoulli, then the contact process is also Bernoulli.

Figure 7. The compound contact process as a merging of single-place contact processes ($C_{ij}^1$, $C_{ij}^2$, ..., $C_{ij}^{L_{ij}}$ merged into $C_{ij}$)

Theorem 1 (Contact process): The contact process $C_{ij}$ between users $U_i$ and $U_j$ resulting from a number $L_{ij}$ of single-place contact processes $C_{ij}^l$, which, in turn, emerge from Bernoulli arrival processes $A_i^l$ and $A_j^l$ of rates $\rho_{A_i^l}$ and $\rho_{A_j^l}$, is a Bernoulli point process with rate $\rho_{C_{ij}} = 1 - \prod_{l=1}^{L_{ij}} (1 - \rho_{A_i^l} \times \rho_{A_j^l})$.

The intuitive proof for the above result is that the probability of at least one contact in a time slot can be computed as one minus the probability of no contact in that time slot. The probability of no contact in the time slot is equal to the probability that the two users do not meet in any of their shared meeting places. Then, Theorem 1 follows.
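The argument can be checked numerically. The sketch below is a minimal simulation, assuming (as in the model) that a user's arrivals at different places are independent draws in each slot; the per-place rates are hypothetical values chosen only for illustration. It compares the empirical fraction of slots with at least one contact against the rate stated in Theorem 1.

```python
import math
import random


def contact_rate(rates_i, rates_j):
    """Rate of Theorem 1: 1 - prod_l (1 - rho_i[l] * rho_j[l])."""
    return 1.0 - math.prod(1.0 - a * b for a, b in zip(rates_i, rates_j))


def simulate_contact_rate(rates_i, rates_j, n_slots, rng):
    """Fraction of slots in which both users arrive at >= 1 shared place."""
    contacts = 0
    for _ in range(n_slots):
        met = False
        for a, b in zip(rates_i, rates_j):
            at_i = rng.random() < a      # arrival of user i at this place
            at_j = rng.random() < b      # independent arrival of user j
            if at_i and at_j:
                met = True
        contacts += met
    return contacts / n_slots


rates_i = [0.3, 0.5, 0.2]    # hypothetical arrival rates of user i
rates_j = [0.4, 0.1, 0.6]    # hypothetical arrival rates of user j
analytic = contact_rate(rates_i, rates_j)
empirical = simulate_contact_rate(rates_i, rates_j, 200_000, random.Random(1))
print(f"analytic = {analytic:.5f}, simulated = {empirical:.5f}")
```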

The contact process described in Theorem 1 also defines the time intervals between consecutive contacts of a pair of users. Specifically, for a Bernoulli process the distribution of inter-contact times is geometric. We summarize this result in the following corollary.

Corollary 1 (Pairwise inter-contact times): The inter-contact times distribution between a pair of users $U_i$ and $U_j$, meeting at a number $L_{ij}$ of meeting places, and whose arrivals to these meeting places are described as Bernoulli arrival processes $A_i^l$ and $A_j^l$ of rates $\rho_{A_i^l}$ and $\rho_{A_j^l}$, is geometric with the following rate:

$$\rho = 1 - \prod_{l=1}^{L_{ij}} (1 - \rho_{A_i^l} \times \rho_{A_j^l}). \tag{1}$$
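As a worked example of Equation 1, consider a pair of users sharing two meeting places. The rates below are hypothetical and serve only to illustrate the computation; $T$ denotes the inter-contact time of the pair.

```latex
% Hypothetical rates, chosen only for illustration (L_{ij} = 2):
%   place 1: \rho_{A_i^1} = 0.5, \rho_{A_j^1} = 0.4
%   place 2: \rho_{A_i^2} = 0.2, \rho_{A_j^2} = 0.5
\begin{align*}
\rho &= 1 - (1 - 0.5 \times 0.4)(1 - 0.2 \times 0.5) = 1 - 0.8 \times 0.9 = 0.28, \\
F_\rho(\tau) &= (1 - \rho)^{\tau} = 0.72^{\tau}, \qquad F_\rho(3) = 0.72^{3} \approx 0.373, \\
E[T] &= 1/\rho \approx 3.57 \text{ time slots.}
\end{align*}
```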

B. Aggregate contact process

In this section we describe how to derive the aggregate inter-contact times starting from pairwise inter-contact times featuring a geometric distribution. More specifically, we solve a concrete case by providing the conditions on the Bernoulli arrival processes of users to locations such that the resulting aggregate inter-contact time distribution is heavy-tailed. Heavy-tailed distributions for aggregate inter-contact times are important as they have often emerged from the analysis of real mobility traces [5]. Our derivation shows how such heavy-tailed behavior can result from simple heterogeneous Bernoulli arrival processes, which are very convenient to deal with for mathematical analysis. This result also confirms the main finding of [21]: very different aggregate statistics can emerge from the heterogeneity of simple pairwise statistics.

In order to derive the aggregate inter-contact times, we exploit the result of the work by Passarella and Conti [21], which describes the dependence between the aggregate inter-contact time distribution and the inter-contact time distributions of individual pairs of users. Specifically, they consider a heterogeneous scenario, where pairwise inter-contact times distributions are all of the same type (e.g., exponential), but whose parameters (the rates, in the exponential example) are unknown a priori. The rates of the individual contact sequences are drawn from a given distribution, which, therefore, determines the specific parameters of each pair’s inter-contact times. The model described in [21] shows that both the distribution of the rates and the distributions of pairwise inter-contact times have an impact on the aggregate distribution.

For the convenience of the reader we recall this result in Theorem 2.

Theorem 2: In a network where the rates of pairwise inter-contact times are distributed according to a continuous random variable $P$ with density $f_P(\rho)$, the CCDF of the aggregate inter-contact time is $F(\tau) = \frac{1}{E[P]} \int_0^\infty \rho f_P(\rho) F_\rho(\tau) \, d\rho$, where $F_\rho(\tau)$ denotes the CCDF of the inter-contact times between a pair of nodes whose rate is equal to $\rho$.
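The integral in Theorem 2 is straightforward to evaluate numerically. The sketch below uses a plain midpoint rule and, purely as an assumed test case, exponential pairwise inter-contact times with exponentially distributed rates of parameter `alpha`; for this choice the integral evaluates to $\alpha^2/(\alpha+\tau)^2$, which gives a value to compare against. The function name, the truncation point `upper` and the step count are illustrative choices.

```python
import math


def aggregate_ccdf(tau, rate_pdf, pair_ccdf, upper=60.0, steps=60_000):
    """Midpoint-rule estimate of (1/E[P]) * int_0^upper rho f_P(rho) F_rho(tau) d rho."""
    h = upper / steps
    numerator = 0.0
    mean_rate = 0.0                      # accumulates E[P]
    for k in range(steps):
        rho = (k + 0.5) * h
        weight = rho * rate_pdf(rho) * h
        numerator += weight * pair_ccdf(rho, tau)
        mean_rate += weight
    return numerator / mean_rate


alpha = 2.0                                             # hypothetical parameter
rate_pdf = lambda rho: alpha * math.exp(-alpha * rho)   # assumed density f_P
pair_ccdf = lambda rho, tau: math.exp(-rho * tau)       # assumed exponential F_rho

tau = 3.0
numeric = aggregate_ccdf(tau, rate_pdf, pair_ccdf)
closed_form = alpha**2 / (alpha + tau) ** 2   # analytic value of the same integral
print(numeric, closed_form)
```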

We extend this finding to our network of interest, where pairwise inter-contact times depend on their corresponding arrival processes. We have shown in Corollary 1 that, for the case of independent Bernoulli arrival processes, the distribution of individual inter-contact times is geometric. In other words, the shape of the pairwise inter-contact time distribution $F_\rho(\tau)$ is fixed in our model and, thus, the resulting aggregate inter-contact time distribution is controlled by the distribution $f_P(\rho)$ of the rates of individual inter-contact times. This distribution, in turn, depends on the distribution of the corresponding arrival rates. This dependence may not be trivial in the general case.

In order to apply Theorem 2 to our case of pairwise inter-contact times featuring a geometric distribution, we note that a discrete random variable $X$ featuring a geometric distribution with rate $\rho$ can be expressed in terms of a discrete random variable $Y$ featuring a discrete exponential distribution. More specifically, the CCDF$^1$ of the geometric distribution of the pairwise inter-contact times, i.e., $F_\rho(\tau) = (1 - \rho)^\tau$, $\tau \in \{1, 2, 3, ...\}$, can be rewritten in a discrete exponential form, i.e., $F_\lambda(\tau) = e^{-\lambda\tau}$, $\tau \in \{1, 2, 3, ...\}$, by substituting $\rho = 1 - e^{-\lambda}$, where $\lambda \in (0, \infty)$. Variables $X$ and $Y$ are thus exactly the same, but written in a different form. Using this substitution and the result in Theorem 2, in Lemma 2 we derive under which condition on parameter $\lambda$ the aggregate inter-contact time is heavy-tailed.
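The equivalence of the two forms can be verified directly. The following short Python sketch (function names are illustrative) converts between $\rho$ and $\lambda$ and checks that the geometric and discrete exponential CCDFs coincide for rates strictly between 0 and 1.

```python
import math


def rho_to_lambda(rho):
    """lambda = -ln(1 - rho), defined for 0 < rho < 1."""
    return -math.log(1.0 - rho)


def lambda_to_rho(lam):
    """rho = 1 - exp(-lambda), the substitution used in the text."""
    return 1.0 - math.exp(-lam)


def geometric_ccdf(rho, tau):
    return (1.0 - rho) ** tau


def discrete_exponential_ccdf(lam, tau):
    return math.exp(-lam * tau)


for rho in (0.1, 0.5, 0.9):
    lam = rho_to_lambda(rho)
    for tau in range(1, 6):
        # Both expressions describe the same distribution.
        assert math.isclose(geometric_ccdf(rho, tau),
                            discrete_exponential_ccdf(lam, tau))
print("geometric and discrete exponential CCDFs agree")
```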

Lemma 2: In a network where pairwise inter-contact times have a discrete exponential distribution of the form $F_\lambda(\tau) = e^{-\lambda\tau}$, $\tau \in \{1, 2, 3, ...\}$, and parameters $\lambda$ are drawn from an exponential distribution with rate $\alpha$, the aggregate inter-contact time distribution is as follows:

$$F(\tau) = \frac{\alpha + \alpha^2}{(\tau + \alpha)(\tau + \alpha + 1)}. \tag{2}$$

The complete proof for the above lemma and for all results introduced below can be found in Appendix A.
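As a sanity check of Equation 2, the sketch below compares the closed form with a direct midpoint-rule evaluation of Theorem 2 after the substitution $\rho = 1 - e^{-\lambda}$, with $\lambda$ exponentially distributed with rate `alpha`. The integration bounds, the step count and the function names are illustrative assumptions.

```python
import math


def lemma2_ccdf(tau, alpha):
    """Closed form of Equation 2."""
    return (alpha + alpha**2) / ((tau + alpha) * (tau + alpha + 1.0))


def numeric_ccdf(tau, alpha, upper=200.0, steps=200_000):
    """Midpoint rule for the ratio of integrals over lambda in (0, upper)."""
    h = upper / steps
    num = den = 0.0
    for k in range(steps):
        lam = (k + 0.5) * h
        # rate (1 - e^-lambda) weighted by the exponential density of lambda
        w = (1.0 - math.exp(-lam)) * alpha * math.exp(-alpha * lam) * h
        num += w * math.exp(-lam * tau)   # pairwise CCDF e^(-lambda * tau)
        den += w                          # accumulates E[P]
    return num / den


alpha = 0.5
for tau in (1, 10, 100):
    print(tau, lemma2_ccdf(tau, alpha), numeric_ccdf(tau, alpha))
```

For large $\tau$ the product $\tau^2 F(\tau)$ approaches $\alpha + \alpha^2$, which is the power-law tail with exponent $-2$.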

More generally, the result in Lemma 2 says that the aggregate inter-contact times distribution decays as a power law with exponent $\gamma = -2$, i.e., $F(\tau) \sim 1/\tau^2$, if the distribution of the parameters $\lambda$ of individual inter-contact times is exponential. In the rest of the section we develop this case and show how the exponential distribution of the parameter of individual inter-contact times emerges in the arrival network with independent Bernoulli arrival processes.

$^1$ Please note that the corresponding probability mass function is given by $e^{-\lambda\tau}(1 - e^{-\lambda})$ and adds up to one, thus showing that the discrete exponential is a properly defined distribution.

As we have already shown, the distribution of the parameters of pairwise inter-contact times depends on the distribution of the corresponding arrival rates. This dependence is described by Corollary 1, which, after substitution of $\rho_{C_{ij}}$ with $\lambda$ according to what we discussed above, takes the form $\lambda = \sum_{l=1}^{L_{ij}} -\ln(1 - \rho_{A_i^l} \times \rho_{A_j^l})$. From this dependence, we find a distribution of arrival rates $\rho_{A_i^l}$ such that the conditions of Lemma 2 are satisfied, i.e., the distribution of parameters $\lambda$ of the individual inter-contact times is exponential. To this aim, we prove the following lemma.

Lemma 3: If individual arrival processes are independent Bernoulli point processes, the rates $\rho_{A_i^l}$ of the processes are drawn such that $\rho_{A_i^l} = e^{-\frac{1}{2}Y^2}$, where $Y$ is a standard normal random variable, and the number of shared meeting places $L_{ij}$ between pairs of users is a geometric random variable with parameter $\alpha$, then the resulting pairwise inter-contact times parameters $\lambda$ are exponentially distributed with parameter $\alpha$.
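The statement of Lemma 3 lends itself to a quick Monte Carlo check. The sketch below samples $\lambda$ under the lemma's assumptions and compares the sample mean and one tail probability with those of an exponential distribution with parameter $\alpha$ (mean $1/\alpha$, $P(\lambda > 1/\alpha) = e^{-1}$). The value of `alpha`, the sample size and the seed are illustrative choices.

```python
import math
import random


def sample_lambda(alpha, rng):
    """One draw of lambda = sum_l -ln(1 - rho_i * rho_j) under Lemma 3."""
    # Number of shared places: geometric on {1, 2, ...} with parameter alpha.
    n_places = 1
    while rng.random() >= alpha:
        n_places += 1
    lam = 0.0
    for _ in range(n_places):
        rho_i = math.exp(-0.5 * rng.gauss(0.0, 1.0) ** 2)
        rho_j = math.exp(-0.5 * rng.gauss(0.0, 1.0) ** 2)
        lam += -math.log(1.0 - rho_i * rho_j)
    return lam


alpha = 0.5                                   # illustrative parameter
rng = random.Random(7)
samples = [sample_lambda(alpha, rng) for _ in range(100_000)]
mean_lambda = sum(samples) / len(samples)
tail = sum(1 for s in samples if s > 1.0 / alpha) / len(samples)
print(f"mean = {mean_lambda:.3f} (expected {1 / alpha:.3f})")
print(f"P(lambda > 1/alpha) = {tail:.4f} (expected {math.exp(-1):.4f})")
```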

A condition for Lemma 2 to be applicable is that the number of shared meeting places between pairs of users is geometrically distributed. Recall that this type of distribution is secured by the arrival network generating algorithm described in Section III. Therefore, the result of Lemma 2 can be applied to the networks generated by the mobility framework. Finally, we combine the results of Lemma 2 and Lemma 3 in the following theorem.

Theorem 3: If individual arrival processes are independent Bernoulli point processes, the rates $\rho_{A_i^l}$ of the processes are drawn such that $\rho_{A_i^l} = e^{-\frac{1}{2}Y^2}$, where $Y$ is a standard normal random variable, and the number of shared meeting places $L_{ij}$ between pairs of users is a geometric random variable with parameter $\alpha$, the CCDF of the aggregated inter-contact times is given by Lemma 2.

VI. VALIDATION

In this section we validate the findings from the previous section by comparing the analytical results with simulations. For our validation we explore the same four arrival networks that we have studied in Section IV. We assume that users visit the meeting places to which they are linked in the arrival network according to a Bernoulli process with rate $\rho_{A_i^l} = e^{-\frac{1}{2}Y^2}$, where $Y$ is a standard normal random variable (this follows from Theorem 3).

The results for the aggregate inter-contact times are shown in Figure 8. For instance, Figure 8.a depicts simulation results for the network with parameters $n = 500$, $\chi = 0.2$ and $\alpha = 0.5$. According to Theorem 3, the aggregate inter-contact time distribution in this network is $F(\tau) = \frac{0.75}{(\tau + 0.5)(\tau + 1.5)}$ (gray line in the figure). It is clear that simulation and analytical results are in very good agreement. Similarly, we observe good conformance between analytical and simulation results in the other simulated networks.
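A much reduced version of this comparison can be reproduced without building the arrival network. The sketch below is a simplification and not the simulation used for Figure 8: it draws $\lambda$ directly from an exponential distribution with rate $\alpha$ for each pair (the outcome of Lemma 3), generates geometric inter-contact times over a finite window, pools them over all pairs, and compares the empirical CCDF with Equation 2. The number of pairs, the window length and the seed are arbitrary choices.

```python
import math
import random


def lemma2_ccdf(tau, alpha):
    return (alpha + alpha**2) / ((tau + alpha) * (tau + alpha + 1.0))


def geometric_gap(lam, rng):
    """Gap with P(gap > k) = exp(-lam * k), k = 0, 1, 2, ..."""
    u = 1.0 - rng.random()                # uniform in (0, 1]
    return int(-math.log(u) / lam) + 1


def pooled_gaps(alpha, n_pairs, n_slots, rng):
    """Inter-contact times of all pairs observed within n_slots."""
    gaps = []
    for _ in range(n_pairs):
        lam = rng.expovariate(alpha)
        if lam < 1e-12:                   # such a pair never meets in the window
            continue
        t = geometric_gap(lam, rng)       # slot of the first contact
        while True:
            g = geometric_gap(lam, rng)
            if t + g > n_slots:
                break
            gaps.append(g)
            t += g
    return gaps


alpha = 0.5
gaps = pooled_gaps(alpha, n_pairs=1000, n_slots=1000, rng=random.Random(3))
for tau in (1, 2, 5, 10):
    empirical = sum(1 for g in gaps if g > tau) / len(gaps)
    print(tau, round(empirical, 4), round(lemma2_ccdf(tau, alpha), 4))
```

Pairs with higher contact rates contribute more samples to the pool, which is the rate weighting that appears in Theorem 2. Gaps that would cross the end of the window are discarded, so a small bias against very long gaps is to be expected.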

Figure 8. The aggregate inter-contact times distribution for different arrival networks. Panels: a) n=500, χ=0.2, α=0.5; b) n=500, χ=0.2, α=0.2; c) n=1000, χ=0.1, α=0.5; d) n=1000, χ=0.1, α=0.2. Each panel plots the CCDF against τ (time units) on logarithmic axes, showing simulation results, the theoretical curve, and the 99.9% confidence interval.

VII. CONCLUSIONS

In this paper we have proposed a mobility framework that, starting from an input social graph, characterizes the way users visit locations. The spatial dimension of mobility is added by imposing that people belonging to the same social community are assigned the same location, which is where the people of that community meet. Then, the way users visit their assigned locations over time is described by means of a stochastic process. We have shown that this framework is flexible, i.e., is able to generate inter-contact times featuring different distributions. In addition, assuming that users arrive to locations according to heterogeneous Bernoulli processes, we have analytically derived the pairwise and aggregate inter-contact times, thus showing the predictability of the model in this case. Finally, we have validated the analytical predictions against simulation results and we have shown that the two are in good agreement.

ACKNOWLEDGMENT

This work was partially funded by the European Commission under the SCAMPI (FP7-FIRE 258414), RECOGNITION (FET-AWARENESS 257756), and EINS (FP7-FIRE 288021) projects.

REFERENCES

[1] C. Boldrini and A. Passarella. HCMM: Modelling spatial and temporal properties of human mobility driven by users’ social relationships. Computer Communications, 33(9):1056–1074, 2010.

[2] V. Borrel, F. Legendre, M.D. De Amorim, and S. Fdida. Simps: Using sociology for personal mobility. IEEE/ACM Trans. Netw., 17(3):831–842, 2009.

[3] D. Brockmann, L. Hufnagel, and T. Geisel. The scaling laws of human travel. Nature, 439(7075):462–465, 2006.

[4] C. Bron and J. Kerbosch. Algorithm 457: finding all cliques of an undirected graph. Communications of the ACM, 16(9):575–577, 1973.

[5] A. Chaintreau, P. Hui, J. Crowcroft, C. Diot, R. Gass, and J. Scott. Impact of human mobility on opportunistic forwarding algorithms. IEEE Trans. Mobile Comp., pages 606–620, 2007.

[6] E. Cho, S.A. Myers, and J. Leskovec. Friendship and mobility: user movement in location-based social networks. In SIGKDD’11, pages 1082–1090. ACM, 2011.

[7] V. Conan, J. Leguay, and T. Friedman. Characterizing pairwise inter-contact patterns in delay tolerant networks. In Proceedings of the Autonomics’07, 2007.

[8] F. Ekman, A. Keranen, J. Karvo, and J. Ott. Working day movement model. In ACM/SIGMOBILE workshop on Mobility models, pages 33–40. ACM, 2008.

[9] Wei Gao, Qinghua Li, Bo Zhao, and Guohong Cao. Multicasting in delay tolerant networks: a social network perspective. In MobiHoc ’09, pages 299–308. ACM, 2009.

[10] M.C. Gonzalez, C.A. Hidalgo, and A.L. Barabasi. Understanding individual human mobility patterns. Nature, 453(7196):779–782, 2008.

[11] Theus Hossmann, Thrasyvoulos Spyropoulos, and Franck Legendre. Putting contacts into context: Mobility modeling beyond inter-contact times. In Twelfth ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc 11), Paris, France, May 2011. ACM.

[12] W. Hsu, T. Spyropoulos, K. Psounis, and A. Helmy. Modeling time-variant user mobility in wireless mobile networks. In IEEE INFOCOM 2007, pages 758–766, 2007.

[13] T. Karagiannis, J.Y. Le Boudec, and M. Vojnovic. Power law and exponential decay of intercontact times between mobile devices. IEEE Trans. Mobile Comp., 9(10):1377–1390, 2010.

[14] D. Karamshuk, C. Boldrini, M. Conti, and A. Passarella. Human mobility models for opportunistic networks. IEEE Commun. Mag, (12):157–165, December 2011.

[15] Ari Keranen, Jorg Ott, and Teemu Karkkainen. The ONE Simulator for DTN Protocol Evaluation. In SIMUTools ’09, 2009.

[16] K. Lee, S. Hong, S.J. Kim, I. Rhee, and S. Chong. Slaw: A new mobility model for human walks. In IEEE INFOCOM 2009, pages 855–863, 2009.

[17] M. Musolesi and C. Mascolo. Designing mobility models based on social network theory. ACM SIGMOBILE CCR, 11(3):59–70, 2007.

[18] M.E.J. Newman. The structure and function of complex networks. SIAM review, pages 167–256, 2003.

[19] M.E.J. Newman and M. Girvan. Finding and evaluating community structure in networks. Physical review E, 69(2):026113, 2004.

[20] G. Palla, I. Derenyi, I. Farkas, and T. Vicsek. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043):814–818, 2005.

[21] A. Passarella and M. Conti. Characterising aggregate inter-contact times in heterogeneous opportunistic networks. NETWORKING 2011, pages 301–313, 2011.

[22] C. Song, T. Koren, P. Wang, and A.L. Barabasi. Modelling the scaling properties of human mobility. Nature Physics, 6(10):818–823, 2010.

[23] Q. Zheng, X. Hong, J. Liu, and D. Cordes. Agenda driven mobility modelling. International Journal of Ad Hoc and Ubiquitous Computing, 5(1):22–36, 2010.

[24] M. Zukerman. An introduction to queueing theory and stochastic teletraffic models, 2007.

APPENDIX

A. Proofs of the theorems

Lemma 1 (Single-place contact process): The single-place contact process $C_{ij}^l$ resulting from independent Bernoulli arrival processes $A_i^l$ and $A_j^l$, of rates $\rho_{A_i^l}$ and $\rho_{A_j^l}$ respectively, is a Bernoulli process of rate $\rho_{C_{ij}^l} = \rho_{A_i^l} \times \rho_{A_j^l}$.

Proof: The probability of a contact at meeting place $M_l$ is equal to the probability that both users are at meeting place $M_l$ in the same time slot. This can be obtained as the product $\rho_{A_i^l} \times \rho_{A_j^l}$, recalling that, for a Bernoulli process, the rate of the process is equal to the probability of an arrival in a time slot. A discrete stochastic process in which arrivals happen with constant probability $\rho_{C_{ij}^l} = \rho_{A_i^l} \times \rho_{A_j^l}$ is again a Bernoulli process of rate $\rho_{C_{ij}^l}$.

Theorem 1 (Contact process): The contact process $C_{ij}$ between users $U_i$ and $U_j$ resulting from a number $L_{ij}$ of single-place contact processes $C_{ij}^l$, which, in turn, emerge from Bernoulli arrival processes $A_i^l$ and $A_j^l$ of rates $\rho_{A_i^l}$ and $\rho_{A_j^l}$, is a Bernoulli process of rate $\rho_{C_{ij}} = 1 - \prod_{l=1}^{L_{ij}} (1 - \rho_{A_i^l} \times \rho_{A_j^l})$.

Proof: The probability of at least one contact in a time slot can be computed as one minus the probability of no contact in that time slot. The probability of no contact in the time slot is equal to the probability that the two users do not meet in any of their shared meeting places. As follows from Lemma 1, the probability of a contact in a single shared place is constant and equal to $\rho_{A_i^l} \times \rho_{A_j^l}$. Therefore, the probability of at least one contact in a time slot is also constant and equal to $\rho_{C_{ij}} = 1 - \prod_{l=1}^{L_{ij}} (1 - \rho_{A_i^l} \times \rho_{A_j^l})$. It then follows that the sequence of time slots with at least one contact forms a Bernoulli process of rate $\rho_{C_{ij}}$.

Lemma 2: In a network where pairwise inter-contact times have a discrete exponential distribution of the form $F_\lambda(\tau) = e^{-\lambda\tau}$, $\tau \in \{1, 2, 3, ...\}$, and parameters $\lambda$ are drawn from an exponential distribution with rate $\alpha$, the aggregate inter-contact time distribution is as follows:

$$F(\tau) = \frac{\alpha + \alpha^2}{(\tau + \alpha)(\tau + \alpha + 1)}. \tag{3}$$

Proof: The proof is based on adapting Theorem 2 to the case when the pairwise contact sequences are generated by the corresponding Bernoulli arrival processes. As follows from Corollary 1, in this case the distribution of individual inter-contact times is geometric, i.e., $F_\rho(\tau) = (1 - \rho)^\tau$, $\tau \in \{1, 2, 3, ...\}$. We note that a discrete random variable $X$ featuring a geometric distribution with rate $\rho$ can be expressed in terms of a discrete random variable $Y$ featuring a discrete exponential distribution. More specifically, the CCDF of the geometric distribution of the pairwise inter-contact times can be rewritten in a discrete exponential form, i.e., $F_\lambda(\tau) = e^{-\lambda\tau}$, $\tau \in \{1, 2, 3, ...\}$, by substituting $\rho = 1 - e^{-\lambda}$, where $\lambda \in (0, \infty)$. Variables $X$ and $Y$ are thus exactly the same, but written in a different form. Using this substitution, the density $f_P(\rho)$ of the rates of the pairwise inter-contact times in Theorem 2 can be rewritten in the form:

$$f_P(\rho) = \frac{dF_P(\rho)}{d\rho} = f_\Lambda(\lambda) \frac{d\lambda}{d\rho}, \tag{4}$$

which follows from

$$F_P(\rho) = P(1 - e^{-\Lambda} \le \rho) = F_\Lambda(-\ln(1 - \rho)). \tag{5}$$

The expectation $E[P]$ of the rates of the pairwise inter-contact times can be rewritten as:

$$E[P] = \int_0^\infty \rho f_P(\rho) \, d\rho = \int_0^\infty (1 - e^{-\lambda}) f_\Lambda(\lambda) \, d\lambda \tag{6}$$

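Therefore, after substituting Equations 4 and 6 into Theorem 2, Equation 7 follows.

$$F(\tau) = \frac{\int_0^\infty (1 - e^{-\lambda}) e^{-\lambda\tau} f_\Lambda(\lambda) \, d\lambda}{\int_0^\infty (1 - e^{-\lambda}) f_\Lambda(\lambda) \, d\lambda} \tag{7}$$

In the specific case when the distribution of $\lambda$ is exponential with parameter $\alpha$, i.e., $f_\Lambda(\lambda) = \alpha e^{-\alpha\lambda}$, Equation 7 takes the form of Equation 2. This concludes the proof.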
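For completeness, the last step can be written out explicitly. With $f_\Lambda(\lambda) = \alpha e^{-\alpha\lambda}$, both integrals in Equation 7 reduce to elementary exponential integrals:

```latex
\begin{align*}
\int_0^\infty (1 - e^{-\lambda})\, e^{-\lambda\tau}\, \alpha e^{-\alpha\lambda}\, d\lambda
  &= \alpha \left( \frac{1}{\tau + \alpha} - \frac{1}{\tau + \alpha + 1} \right)
   = \frac{\alpha}{(\tau + \alpha)(\tau + \alpha + 1)}, \\
\int_0^\infty (1 - e^{-\lambda})\, \alpha e^{-\alpha\lambda}\, d\lambda
  &= \alpha \left( \frac{1}{\alpha} - \frac{1}{\alpha + 1} \right)
   = \frac{1}{\alpha + 1}, \\
F(\tau) &= \frac{\alpha\,(\alpha + 1)}{(\tau + \alpha)(\tau + \alpha + 1)}
         = \frac{\alpha + \alpha^2}{(\tau + \alpha)(\tau + \alpha + 1)}.
\end{align*}
```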

Lemma 3: If individual arrival processes are independent Bernoulli point processes, the rates $\rho_{A_i^l}$ of the processes are drawn such that $\rho_{A_i^l} = e^{-\frac{1}{2}Y^2}$, where $Y$ is a standard normal random variable, and the number of shared meeting places $L_{ij}$ between pairs of users is a geometric random variable with parameter $\alpha$, then the resulting pairwise inter-contact times parameters $\lambda$ are exponentially distributed with parameter $\alpha$.

Proof: As follows from Corollary 1, the rate of the pairwise inter-contact times for the case of Bernoulli arrival processes depends on the rates of the arrival processes as described by Equation 1. By substituting rates $\rho$ of the pairwise inter-contact times with $\lambda$ parameters, i.e., $\rho = 1 - e^{-\lambda}$, Equation 1 can be rewritten as:

$$\lambda = \sum_{l=1}^{L_{ij}} -\ln(1 - \rho_{A_i^l} \times \rho_{A_j^l}). \tag{8}$$

In the following we show that the exponential distribution of $\lambda$ follows from Equation 8 if the rates $\rho_{A_i^l}$ of the arrival processes are drawn such that $\rho_{A_i^l} = e^{-\frac{1}{2}Y^2}$, where $Y$ is a standard normal random variable, and the number of shared meeting places $L_{ij}$ between pairs of users is a geometric random variable with parameter $\alpha$. To this purpose we first analyze random variable $X_l$, defined as follows:

$$X_l = -\ln(1 - \rho_{A_i^l} \times \rho_{A_j^l}) = -\ln\left(1 - e^{-\frac{1}{2}\left((Y_i^l)^2 + (Y_j^l)^2\right)}\right) \tag{9}$$

In the above equation, $Y_i^l$ and $Y_j^l$ are i.i.d. random variables with a standard normal distribution. Note also that the equality $\lambda = \sum_{l=1}^{L_{ij}} X_l$ holds.

In the following, we show that $X_l$ is an exponential random variable with parameter $\beta = 1$. More specifically, if random variables $(Y_i^l)^2$ and $(Y_j^l)^2$ have moment generating function $M_{Y^2}(t)$, then random variable $Z = (Y_i^l)^2 + (Y_j^l)^2$ has moment generating function $M_Z(t) = M_{Y^2}^2(t)$. In particular, if $Y_i^l$ and $Y_j^l$ are standard normal random variables, then $M_{Y^2}(t) = (1 - 2t)^{-\frac{1}{2}}$ and, thus, $M_Z(t) = (1 - 2t)^{-1}$. This corresponds to an exponential random variable $Z$ with parameter $\frac{1}{2}$, i.e., $F_Z(z) = 1 - e^{-\frac{1}{2}z}$. Then the CDF of random variable $X_l$ can be obtained as follows:

$$F_{X_l}(x) = P\left(-\ln(1 - e^{-\frac{1}{2}Z}) \le x\right) = P\left(Z \ge -2\ln(1 - e^{-x})\right) = 1 - F_Z\left(-2\ln(1 - e^{-x})\right) = 1 - e^{-x}.$$

Thus, $X_l$ is an exponential random variable with parameter $\beta = 1$.
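This step can also be checked by simulation. The following sketch, with an arbitrary seed and sample size, draws $X_l$ according to Equation 9 and compares its sample mean and the tail probability $P(X_l > 1)$ with the values 1 and $e^{-1}$ expected for an exponential variable with parameter 1.

```python
import math
import random


def sample_x(rng):
    """One draw of X_l from Equation 9 with standard normal Y_i, Y_j."""
    y_i = rng.gauss(0.0, 1.0)
    y_j = rng.gauss(0.0, 1.0)
    z = y_i**2 + y_j**2          # Z, exponential with parameter 1/2
    # -expm1(-z/2) equals 1 - exp(-z/2), computed accurately for small z.
    return -math.log(-math.expm1(-0.5 * z))


rng = random.Random(11)
xs = [sample_x(rng) for _ in range(100_000)]
mean_x = sum(xs) / len(xs)
tail_x = sum(1 for x in xs if x > 1.0) / len(xs)
print(f"mean = {mean_x:.3f} (expected 1.000)")
print(f"P(X > 1) = {tail_x:.4f} (expected {math.exp(-1):.4f})")
```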

To derive the distribution of random variable $\lambda = \sum_{l=1}^{L_{ij}} X_l$, we exploit the fact that a random sum of $L_{ij}$ i.i.d. random variables $X_l$ with moment generating function $M_{X_l}(t)$ has moment generating function $M_\Lambda(t) = G_{L_{ij}}(M_{X_l}(t))$, where $G_{L_{ij}}(z)$ is the probability generating function of the discrete random variable $L_{ij}$. In particular, if $X_l$ are i.i.d. exponential random variables, i.e., $M_{X_l}(t) = (1 - \frac{t}{\beta})^{-1}$, and $L_{ij}$ is a geometric random variable, i.e., $G_{L_{ij}}(z) = \frac{\alpha z}{1 - z(1 - \alpha)}$, then random variable $\lambda$ has moment generating function $M_\Lambda(t) = (1 - \frac{t}{\alpha \times \beta})^{-1}$. This corresponds to an exponential random variable $\lambda$ with parameter $\alpha \times \beta$. In our case $\beta = 1$; therefore, the distribution of $\lambda$ is exponential with parameter $\alpha$.
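The composition of the two generating functions can be carried out explicitly, writing $M_{X_l}(t) = \beta / (\beta - t)$:

```latex
\begin{align*}
M_\Lambda(t) = G_{L_{ij}}\!\left(M_{X_l}(t)\right)
  &= \frac{\alpha \, \dfrac{\beta}{\beta - t}}{1 - (1 - \alpha)\dfrac{\beta}{\beta - t}}
   = \frac{\alpha\beta}{(\beta - t) - (1 - \alpha)\beta} \\
  &= \frac{\alpha\beta}{\alpha\beta - t}
   = \left(1 - \frac{t}{\alpha\beta}\right)^{-1}.
\end{align*}
```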

Theorem 3: If individual arrival processes are independent Bernoulli point processes, the rates $\rho_{A_i^l}$ of the processes are drawn such that $\rho_{A_i^l} = e^{-\frac{1}{2}Y^2}$, where $Y$ is a standard normal random variable, and the number of shared meeting places $L_{ij}$ between pairs of users is a geometric random variable with parameter $\alpha$, the CCDF of the aggregated inter-contact times is given by Equation 2.

Proof: The theorem combines the results of Lemma 2 and Lemma 3. More specifically, from Lemma 3 it follows that when individual arrival processes are independent Bernoulli point processes, the rates $\rho_{A_i^l}$ of the processes are drawn such that $\rho_{A_i^l} = e^{-\frac{1}{2}Y^2}$, where $Y$ is a standard normal random variable, and the number of shared meeting places $L_{ij}$ between pairs of users is a geometric random variable with parameter $\alpha$, the resulting pairwise inter-contact times parameters $\lambda$ are exponentially distributed with parameter $\alpha$. This allows us to apply Lemma 2, which says that the CCDF of the aggregated inter-contact times in this case is given by Equation 2. Theorem 3 then follows.

B. Validating the model over realistic traces

In this section we attempt to validate the framework with the mobility traces collected from the location-based social network (LBSN) Gowalla [11]. Users can use the Gowalla application on their mobile phones to share their current location with their friends by making check-ins at close-by places. The concept of check-ins is very similar to the arrivals considered in the proposed mobility framework, as both represent time records of when users visit particular venues. With the dataset we aim to validate the assumptions made in the analytical model in Section V, namely:

A1 geometric distribution of individual inter-arrival times
A2 geometric distribution of individual inter-contact times
A3 power-law distribution of aggregate inter-contact times (Theorem 3)
A4 distribution of individual inter-arrival times (Theorem 3)

Since our goal is to study regular mobility behaviour of the users, we consider only regular check-ins at the venues. Thus, we explore the dataset of 10K user-place pairs, each containing at least 10 repeating check-in records, collected from 2428 users at 6652 places around the city of San Francisco, CA.

Figure 9. a. The aggregate inter-contact time distribution from the data traces (left) b. The part of the CCDF from one hour to one week (right)

For each pair of user and place we compute the time intervals between consecutive check-ins, which correspond to the inter-arrival times in the model. The CCDF of the aggregate characteristic, as measured for all user-place pairs, is shown in Figure 9. The sharp decays around 24h, 48h, ..., in Figure 9.b confirm the previous finding in [10] that in general people tend to return to places at the same time of day. We further plot the distribution of inter-check-in times for the individual user-place pairs in the lin-log scale. The plots of four representative user-place pairs are shown in Figure 10. Comparing the resulting distributions with the corresponding geometric distributions (straight lines in the figures, obtained by maximum likelihood fitting), we observe the general conformance of the results from the traces with assumption A1 in the model.
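For reference, the maximum likelihood estimate of the parameter of a geometric distribution on $\{1, 2, ...\}$ is the reciprocal of the sample mean. The sketch below illustrates such a fit on a small list of made-up inter-check-in times, already expressed in whole time slots; the data, the slot discretization and the function names are assumptions for illustration and do not come from the Gowalla traces.

```python
def fit_geometric(samples):
    """MLE of the success probability of a geometric law on {1, 2, ...}."""
    if not samples or min(samples) < 1:
        raise ValueError("samples must be non-empty and >= 1")
    return len(samples) / sum(samples)


def empirical_ccdf(samples, tau):
    """Fraction of samples strictly greater than tau."""
    return sum(1 for s in samples if s > tau) / len(samples)


# Hypothetical inter-check-in times for one user-place pair, in slots.
gaps = [1, 3, 2, 1, 5, 2, 1, 4, 1, 2]
rho_hat = fit_geometric(gaps)
for tau in (1, 2, 3):
    fitted = (1.0 - rho_hat) ** tau       # geometric CCDF with the fitted rate
    print(tau, empirical_ccdf(gaps, tau), round(fitted, 4))
```

In the lin-log scale the fitted CCDF $(1 - \hat{\rho})^\tau$ is a straight line with slope $\ln(1 - \hat{\rho})$.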

Figure 10. Individual inter-check-in times distribution from the data traces

To measure the inter-contact times characteristics (A2) we compute the time intervals between consecutive contacts of the users. Since there are no check-out records in the dataset, i.e., no records of the times when users leave places, we adopt the assumption, commonly used in the related literature [11], that a contact between two users happens if they have checked in less than 1 hour apart at the same place. Firstly, we plot the distribution of inter-contact times for the individual user-place pairs. The plots for four representative user-place pairs are shown in Figure 11. Comparing the individual pairs’ inter-contact times distributions with the fitted geometric distributions (the straight lines in the plots, again obtained from MLE fitting), we conclude that the resulting distributions are in general in agreement with assumption A2 in the model.
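The contact rule above can be sketched as follows. The record layout (user, place, time in hours), the choice of the later check-in as the contact time and the sample records are hypothetical; the sketch compares all check-ins at a place pairwise, which is quadratic in the number of check-ins per place and is meant only to make the rule concrete.

```python
from collections import defaultdict
from itertools import combinations


def extract_contacts(checkins, window=1.0):
    """Map each user pair to the times at which a contact is recorded.

    checkins: iterable of (user, place, time_in_hours) tuples.
    Two users are in contact if they check in at the same place less
    than `window` hours apart; the later check-in is the contact time.
    """
    by_place = defaultdict(list)
    for user, place, t in checkins:
        by_place[place].append((t, user))
    contacts = defaultdict(list)
    for events in by_place.values():
        events.sort()
        for (t1, u1), (t2, u2) in combinations(events, 2):
            if u1 != u2 and t2 - t1 < window:
                contacts[tuple(sorted((u1, u2)))].append(t2)
    return contacts


def inter_contact_times(contact_times):
    """Differences between consecutive contact times of one user pair."""
    ts = sorted(contact_times)
    return [b - a for a, b in zip(ts, ts[1:])]


# Hypothetical check-in records, for illustration only.
checkins = [
    ("u1", "cafe", 10.0), ("u2", "cafe", 10.5),
    ("u1", "cafe", 34.0), ("u2", "cafe", 34.2),
    ("u3", "gym", 12.0), ("u1", "gym", 14.0),
]
contacts = extract_contacts(checkins)
ict = inter_contact_times(contacts[("u1", "u2")])
print(dict(contacts), ict)
```

Several check-ins by the same users within one window would be counted as separate contacts here; a real trace analysis would likely need to merge them.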

Figure 11. Individual inter-contact times distribution from the data traces

In Figure 12.a we plot the aggregate distribution of the inter-contact times, as measured for all pairs of contacts. We can observe from the plot that the distribution function has an exponential decay (straight line in the lin-log scale). This result is in line with the diversity of the documented shapes of the aggregate inter-contact distribution in the literature (exponential [9], Pareto [5], Pareto with exponential cut-off [13], LogNormal [7], etc.). However, it differs from the assumption in the analytical model (assumption A3). To describe the networking environment in which the given shape of the aggregate distribution has emerged, we plot the distribution of the check-in rates for the individual user-place pairs, which correspond to the arrival rates in the framework. From Figure 12.b we can see that the distribution of the arrival rates assumed in the analytical model (assumption A4) deviates from the empirical distribution. The difference is better described by the corresponding distributions of mean inter-arrival times, which are the inverse of rates (Figure 13). As shown in the figure, the mean inter-arrival times from the traces decay exponentially in the interval from one hour to one year (i.e., 8760 hours), which corresponds to the straight line in the lin-log scale (13.a), and even more rapidly for the values above. Instead, the analytical model corresponds to the case when the mean inter-arrival time decays as a power law (straight line in the log-log scale, 13.b) and, thus, might occasionally take considerably large values.

Figure 12. a. The aggregate inter-contact times distributions (left) b. The distribution of arrival rates from the traces (blue) and assumed in the analytical model (red) (right)

We can conclude that the geometric assumption on the individual inter-arrival time distribution (assumption A1) holds in the traces, as does the same assumption on the shape of the individual inter-contact time distribution (assumption A2). However, the form of heterogeneity observed in the traces differs from the assumption in the analytical model (assumption A4). This also explains the deviations between the empirical and analytical aggregate inter-contact time distributions (assumption A3). As future work, we plan to extend our analytical model in Section V in order to account for the rate distribution that has emerged from the Gowalla trace analysis, as well as to extend our validation to different kinds of traces.

Figure 13. The distribution of mean inter-arrival times from the traces (blue) and assumed in the analytical model (red) in the a. lin-log scale (left) and b. log-log scale (right)
