
Optimizing a Class of In-network Processing Applications in Networked Sensor Systems

Bo Hong and Viktor K. Prasanna
Department of Electrical Engineering

University of Southern California
Los Angeles, CA 90089-2562

{bohong, prasanna}@usc.edu

Abstract— A key application of networked sensor systems is to detect and classify events of interest in an environment. Such applications require processing of raw data and the fusion of individual decisions. In-network processing of the sensed data has been shown to be more energy efficient than the centralized scheme that gathers all the raw data to a (powerful) base station for further processing. We formulate the problem as a special class of flow optimization problem. We propose a decentralized adaptive algorithm to maximize the throughput of a class of in-network processing applications. This algorithm is further implemented as a decentralized in-network processing protocol that adapts to any changes in link bandwidths and node processing capabilities. Simulations show that the proposed in-network processing protocol achieves up to 95% of the optimal system throughput. We also show that path-based greedy heuristics have very poor performance in the worst case.

I. INTRODUCTION

Many applications envisioned for the networked sensor systems are to detect and monitor events in the environment. These include the monitoring of habitats for specific birds and animals [1], detection of intrusion [2], and target tracking/identification [3], etc. All these applications require processing of the raw data collected by the sensors. The trade-offs between communication and computation energy [4], [5] have shown that in-network processing of the sensed data is more energy efficient than transferring the raw data to a powerful base station for processing. In-network processing leads to prolonged lifetime of the system, which is one of the most critical factors in the design and deployment of networked sensor systems.

Recently, in-network processing has been studied extensively from various perspectives. In [6], a hierarchical architecture is proposed to organize the heterogeneous nodes according to their computation capabilities. This hierarchy facilitates the partitioning of tasks into different sub-tasks so that they can be mapped onto heterogeneous nodes. The study in [7] focuses on the development of security models for in-network processing. The major concern of this study is to set up trust between aggregators and sensors. Some other works consider the systems as distributed databases where the data is

Supported by the National Science Foundation under award No. IIS-0330445 and in part by DARPA under contract F33615-02-2-4005.

stored/collected by the sensors and in-network processing is used to retrieve data from as well as disseminate commands to the sensors. For example, the placement of aggregators and filters (which execute data queries) is studied in [8] to minimize the overall communication cost. The TinyDB [9] and the Cougar [10] projects offer powerful database tools that support efficient in-network query processing.

In this paper, we study the performance of those applications that require in-network processing of raw data blocks. The event is assumed to be detected and sensed by a subset of the nodes, which we call the source nodes. Other nodes can receive and relay the data blocks. At the same time, all the nodes can process the data blocks and derive a decision for each data block. Compared with previous studies that focus on the development of infrastructures for in-network processing, we study this problem from an algorithm perspective and maximize the throughput of the system, i.e. the total number of data blocks that can be processed by the system in one unit of time.

Such an in-network processing problem models a wide range of practical applications. For example, in environment monitoring applications, it is often the case that the sensed data is divided into blocks (e.g. a block may consist of the acoustic data collected by a node within one second, while the overall data collection process may last several minutes) and decisions are made for each individual block based on processing at the nodes. Because the background noise may be time-varying, the sensors may be at different distances from the event of interest, or there may be obstacles between some sensors and the event, the raw data collected by the sensors have different signal to noise ratios along both temporal (as time advances) and spatial (among different sensors) dimensions. Due to these reasons, the accuracy rates of the individual decisions vary. Decision fusion is then applied to combine these individual decisions [11]. Studies have shown that the overall accuracy rate of the final decision increases monotonically with the number of decision-making individuals (sensors in our case) as well as the number of data blocks. If the base station is required to make a decision with a certain accuracy rate and within a given time period, then the sensor network needs to collect enough data blocks, make decisions

0-7803-8815-1/04/$20.00 © 2004 IEEE

for these data blocks, and transfer these decisions to the base station, all within the given time period.

The proposed in-network processing problem is similar to the resource allocation problem in heterogeneous environments [12], in the sense that we need to coordinate the computation and communication resources in both classes of systems. But the former problem is much more challenging due to the following two requirements of networked sensor systems.

First of all, there is no central controller. It is therefore prohibitively expensive, if not impossible, to construct a global view of the system and optimize the performance of the system accordingly [13]. In-network processing needs to be performed in a distributed fashion.

Additionally, energy efficiency is a key consideration in algorithm development. Various techniques have been proposed to explore the trade-offs between processing/communication speed and energy consumption [14], [15], [16]. This results in the continuous variation of the performance of the nodes (e.g. processing capabilities may change as a result of dynamic voltage scaling; data communication rate may change as a result of modulation scaling). It can be envisioned that it is necessary to maintain an application level power management scheme that continuously monitors and adjusts the energy consumption of the sensors. Consequently, in-network processing must also be adaptive to such energy-related performance changes in the system.

By modeling the processing of data blocks as a special type of data flow, we reduce the in-network processing problem to the network flow optimization problem. More importantly, we develop a distributed and adaptive algorithm for the network flow problem. This algorithm is based on the Push-Relabel algorithm [17]. It finds the optimal solution in time polynomial in m, |V|, and |E|, where m is the number of adaptation operations that the system has executed (details of the adaptation operation can be found in Section IV), |V| is the number of nodes, and |E| is the number of links. With the algorithm thus established, we further develop a simple distributed protocol for in-network processing. The performance of this protocol is studied through simulations and system throughput up to 95% of the optimal was observed. Note that the proposed work does not directly model energy consumption. Performance of the sensors is characterized by their processing capabilities and communication rates. These parameters are assumed to be continuously adjusted by some application level power management scheme. Instead of controlling the energy consumption directly, our objective is to develop an algorithm that can adapt to such energy-related changes.

Path-based heuristics can be used as alternative methods for in-network processing. In these heuristics, the nodes determine some paths to transfer the data, based on locally available information. Examples of such heuristics include shortest path, minimum latency path, etc. These heuristics are easy to implement. However, we show that these heuristics can have very poor performance.

The rest of the paper is organized as follows. Section II presents the system model and the formal problem statement. In Section III, we show that the throughput of in-network processing reduces to the network flow in a corresponding graph. This leads to our decentralized adaptive algorithm for in-network processing. This algorithm is presented in Section IV. A simple in-network processing protocol is developed in Section V. Experimental results are shown in Section VI. Section VII studies the performance of path-based greedy heuristics. Section VIII presents some discussion about future work.

II. PROBLEM STATEMENT

The sensor nodes are assumed to be connected via an arbitrary topology and the network is represented by a graph G(V, E). Each node u ∈ V represents a sensor. The weight of u is denoted by w_u. w_u represents the processing power of node u, i.e. u can perform one unit of computation in 1/w_u time. Each edge (u, v) ∈ E in the graph represents a network link. The capacity of (u, v) is denoted by c(u, v). Link (u, v) can therefore transfer one unit of data from u to v in 1/c(u, v) time. The links are uni-directional, so G is directed and in general c(u, v) ≠ c(v, u). In the rest of the paper, 'edge' and 'link' are interchangeably used. We assume that the communications are scheduled by time/frequency division multiplexing or channel assignment techniques such as the one in [18].

The successors of u in G are defined as S(u) = {v ∈ V | (u, v) ∈ E} and the predecessors of u in G are defined as P(u) = {v ∈ V | (v, u) ∈ E}. V_s ⊆ V is the set of source nodes that collect data by sensing the environment. Node u ∈ V_s can collect data from the environment at a rate no more than g_u. g_u is assumed to be larger than w_u, because otherwise u can process all the data it collects and the problem can be solved trivially. t is the base station that eventually receives all the results of processing. t is called the sink node in G.

Without loss of generality, we assume that each data block consists of one unit of data and requires one unit of computation to process. A data block is an atomic logical unit for processing. It may consist of multiple data packets but the complete data block must be received by a node before it can be processed by the node. The processing of various data blocks are independent of each other.

Let f(u, v) denote the number of data blocks transferred from u to v in one unit of time. For notational convenience, if edge (u, v) ∉ E, we define c(u, v) = 0; if the actual data transfer is from u to v, we define f(v, u) = −f(u, v). With these two definitions, if neither (u, v) nor (v, u) belongs to E, then c(u, v) = c(v, u) = 0, which implies that f(u, v) = f(v, u) = 0. In this way, we can define f(u, v) over V × V, rather than being restricted to E. f(v, u) = −f(u, v) also allows us to compute the total number of data blocks transferred to u as Σ_{v ∈ V, f(v, u) > 0} f(v, u), which equals Σ_{v ∈ V} f(v, u) since f(u, v) = f(v, u) = 0 if (u, v) ∉ E and (v, u) ∉ E. The maximization of the system throughput is mathematically formulated as follows:

Throughput Maximization for In-network Processing (TMIP)


Given: Graph G(V, E) with the set of source nodes V_s. Node u ∈ V has processing capability w_u and data collection capability g_u. (u, v) ∈ E has capacity c(u, v). c(u, v) = 0 if (u, v) ∉ E.

Maximize:

    Σ_{u ∈ V_s} w_u + Σ_{u ∈ V_s} Σ_{v ∈ V} f(u, v)

Subject to:

    Σ_{v ∈ V} f(v, u) ≤ w_u          for u ∈ V − V_s      (1)
    Σ_{v ∈ V} f(u, v) ≤ g_u − w_u    for u ∈ V_s          (2)
    f(u, v) ≤ c(u, v)                for u ∈ V, v ∈ V     (3)
    f(u, v) = −f(v, u)               for u ∈ V, v ∈ V     (4)

The TMIP problem maximizes the overall number of data blocks that can be processed in one unit of time by the source nodes (Σ_{u ∈ V_s} w_u) and the other nodes in the system (Σ_{u ∈ V_s} Σ_{v ∈ V} f(u, v)). Since Σ_{u ∈ V_s} w_u is just an additive constant to the optimization objective, it can be omitted without affecting the optimal solution. Constraint (1) requires that no intermediate node should receive more data blocks than it can process; source node u ∈ V_s can collect data at maximum rate g_u and process the data at rate w_u, hence the rate at which data can flow out of u cannot exceed g_u − w_u, as is specified in constraint (2); constraint (3) represents the capacity constraints of the links. A feasible solution to the above problem represents a valid steady-state flow of data blocks from V_s to the other nodes (where the data blocks are processed). Because the processing results of each data block consist of a very small number of bits (e.g. 1 bit in binary classification problems), we ignore the cost of transferring the processing results to sink node t in the problem formulation.
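The constraints above are simple linear inequalities and can be checked mechanically. The following is a minimal sketch that validates a candidate flow f against constraints (1)-(4) and evaluates the TMIP objective; the two-node instance at the bottom is a hypothetical example, not taken from the paper.

```python
# Sketch: validate a candidate flow against TMIP constraints (1)-(4)
# and evaluate the objective. All node data below is hypothetical.

def tmip_objective(V, Vs, w, f):
    # sum of source processing rates plus total flow leaving the sources
    return sum(w[u] for u in Vs) + sum(f.get((u, v), 0.0)
                                       for u in Vs for v in V)

def tmip_feasible(V, Vs, w, g, c, f, eps=1e-9):
    for u in V:
        for v in V:
            fuv = f.get((u, v), 0.0)
            if fuv > c.get((u, v), 0.0) + eps:          # constraint (3)
                return False
            if abs(fuv + f.get((v, u), 0.0)) > eps:     # constraint (4)
                return False
    for u in V:
        inflow = sum(f.get((v, u), 0.0) for v in V)
        if u in Vs:
            # outflow of a source is -inflow; constraint (2)
            if -inflow > g[u] - w[u] + eps:
                return False
        elif inflow > w[u] + eps:                       # constraint (1)
            return False
    return True

V = ['a', 'b']; Vs = {'a'}
w = {'a': 1.0, 'b': 2.0}; g = {'a': 4.0}
c = {('a', 'b'): 2.0}
f = {('a', 'b'): 2.0, ('b', 'a'): -2.0}  # source 'a' ships 2 blocks/unit to 'b'
print(tmip_feasible(V, Vs, w, g, c, f))  # True
print(tmip_objective(V, Vs, w, f))       # 3.0 = w_a + flow out of 'a'
```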

III. REDUCTION TO FLOW OPTIMIZATION PROBLEM

In the TMIP problem, data blocks are initially generated by the source nodes. All the other sensors face the same questions upon receipt of the data blocks: should the data blocks be relayed to other sensors or processed locally? If the data blocks are processed locally, these data blocks will be discarded after the processing as they will be replaced by the processing results (of very small sizes and hence not considered in our problem formulation). Then the question becomes: how many data blocks should be processed locally? If the data blocks are to be relayed to other nodes for processing, then to which nodes should the data be relayed? The objective of the TMIP problem is to answer these questions for each sensor such that the overall system throughput is maximized.

We can see that the system throughput in TMIP is the sum of V_s's processing capabilities and the rate at which data blocks flow out of V_s. After the data blocks flow out of V_s, they will be transferred in the system and finally be consumed (processed) by some nodes. If we model these data consumptions as a special type of data flow to a hypothetical node, then the throughput of the system is solely defined by the rate at which data blocks flow out of V_s.

Given a TMIP problem with G(V, E) as the input graph and V_s ⊆ V as the source nodes, it is transformed to a standard network flow maximization problem in a new graph G' using the following procedure:

Procedure 1:

1) For each node u ∈ V, create a node u' in G'. Add a pseudo source s' and a pseudo sink t' to G'.
2) For each link (u, v) ∈ E, add a link (u', v') to G' with c(u', v') = c(u, v).
3) For each node u ∈ V − V_s, add a link (u', t') to G' with c(u', t') = w_u.
4) For each node u ∈ V_s, add a link (s', u') to G' with c(s', u') = g_u − w_u.
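Procedure 1 is a purely local graph rewrite and is easy to mechanize. The following sketch builds G' from a dictionary-based description of G; the representation and the three-node example are assumptions of this sketch, not the paper's.

```python
# Sketch of Procedure 1: build the flow network G' from G(V, E) and V_s.
# Graphs are dicts mapping edge (u, v) -> capacity.

def build_flow_graph(V, E_cap, Vs, w, g):
    S, T = "s'", "t'"                    # pseudo source and pseudo sink
    cap = {}
    for (u, v), c in E_cap.items():      # step 2: copy every network link
        cap[(u, v)] = c
    for u in V:
        if u in Vs:
            cap[(S, u)] = g[u] - w[u]    # step 4: collection minus local processing
        else:
            cap[(u, T)] = w[u]           # step 3: processing capability as sink edge
    return S, T, cap

V = ['a', 'b', 'c']; Vs = {'a'}
w = {'a': 1, 'b': 2, 'c': 1}; g = {'a': 5}
E_cap = {('a', 'b'): 3, ('b', 'c'): 1}
S, T, cap = build_flow_graph(V, E_cap, Vs, w, g)
print(cap[(S, 'a')], cap[('b', T)], cap[('c', T)])  # 4 2 1
```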

We have the following flow optimization problem based on the above procedure. (To simplify the notation, we have omitted the superscripts of the vertices.)

Problem 1:

Given: Graph G(V, E), source s ∈ V and sink t ∈ V. Edge (u, v) ∈ E has capacity c(u, v).

Maximize:

    Σ_{v ∈ V} f(s, v)

Subject to:

    Σ_{v ∈ V} f(v, u) = 0             for u ∈ V − {s, t}   (1)
    f(u, v) ≤ c(u, v)                 for u ∈ V, v ∈ V     (2)
    f(u, v) = −f(v, u)                for u ∈ V, v ∈ V     (3)

If an instance of the TMIP Problem has G as the input graph and V_s as the source nodes, we denote it as TMIP(G, V_s). If an instance of Problem 1 has G as the input graph, s as the source, and t as the sink, we denote it as Problem 1(G, s, t). We use T(G, V_s) to represent the maximum throughput for TMIP(G, V_s). We use F(G, s, t) to represent the maximum throughput for Problem 1(G, s, t).

The next proposition shows that the TMIP Problem is a special case of Problem 1.

Proposition 1: Suppose TMIP(G, V_s) is converted to Problem 1(G', s', t') using Procedure 1, then

    T(G, V_s) = F(G', s', t')

Proof: We use the notation in Procedure 1 to denote the nodes and edges in G and their corresponding nodes and edges in G'.

Suppose f : V × V → R is a feasible solution for TMIP(G, V_s). We map it to a feasible solution f' : V' × V' → R for Problem 1(G', s', t') as follows:

1) initialize f'(u', v') = 0 for u', v' ∈ V'.
2) if (u, v) ∈ E, then set f'(u', v') = f(u, v).
3) for u ∈ V − V_s, if Σ_{v ∈ V} f(v, u) > 0, then set f'(u', t') = Σ_{v ∈ V} f(v, u).
4) for u ∈ V_s, if Σ_{v ∈ V} f(u, v) > 0, then set f'(s', u') = Σ_{v ∈ V} f(u, v).

It is easy to verify that such an f' is a feasible solution for Problem 1(G', s', t') and that f' leads to the same throughput as f.

Suppose f' : V' × V' → R is a feasible solution for Problem 1(G', s', t'). We map it to a feasible solution f : V × V → R for TMIP(G, V_s) simply as follows: for u ∈ V, v ∈ V, set f(u, v) = f'(u', v').

It is also easy to verify that such an f is a feasible solution for TMIP(G, V_s) and that it has the same throughput as f'. ∎

Figure 1 illustrates a sensor network and the corresponding network flow representation after applying Procedure 1.


(a) A sensor network (b) The corresponding network flow representation

Fig. 1. Reduction of TMIP to a network flow problem. Sensor nodes are denoted by circles. The square in (a) denotes the event of interest. Dotted lines in (a) represent the collection of data from the environment. The upper square in (b) denotes the newly added pseudo source s'. The lower square in (b) denotes the pseudo sink t'. Weights of the nodes and links are omitted in this figure.

Note that w_u and c(u, v) in the TMIP problem represent the processing capability and communication bandwidth of the sensors. The actual value of w_u is determined by various factors such as the clock frequency, the supply voltage, the specific design of the circuitry, and the complexity of the algorithms for processing. The actual value of c(u, v) is also determined by multiple factors such as the radio transmission power, the rate of signal decay, the distance between the sender and the receiver, etc. Since energy efficiency is a key consideration of networked sensor systems, trade-offs between computation/communication speed and energy have been explored extensively. For example, dynamic voltage scaling and frequency scaling techniques save energy by reducing the supply voltage and clock frequency of the sensor nodes, at the cost of slower processing speeds. Modulation scaling reduces the radio transmission power, however, at the cost of a lower data communication rate. Additionally, these scaling techniques can be activated on-the-fly based on the workload and remaining energy of the sensors. The fact that w_u and c(u, v) are under continuous real-time adjustment translates to run-time variations of the link capacities in Problem 1.

Problem 1 itself is the well-studied network flow maximization problem. Several algorithms [17] can be used to solve this problem (e.g. the Edmonds-Karp algorithm of O(|V|·|E|²) complexity, the Push-Relabel algorithm of O(|V|²·|E|) complexity, and the Relabel-to-Front algorithm of O(|V|³) complexity). But in terms of decentralization and adaptivity, these well-known flow maximization algorithms are not suitable for our throughput maximization problem. Both the Edmonds-Karp and the Relabel-to-Front algorithms require a central coordinator. The Push-Relabel algorithm has a decentralized implementation where every node only needs to exchange messages with its immediate neighbors and makes decisions locally. But in order to be adaptive to the changes in the system, this algorithm has to be re-initialized and re-run from scratch each time some parameters (capacity of the links) of the flow maximization problem change. Each time before starting to search for the new optimal solution, the algorithm needs to make sure that every node has finished its local initialization, which requires a global synchronization and compromises the property of decentralization.
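As a concrete point of comparison, the centralized baseline can be computed with a textbook Edmonds-Karp implementation (shortest augmenting paths found by BFS). This sketch is the standard algorithm, not the paper's decentralized one; the small capacity map at the bottom is illustrative.

```python
# Sketch: textbook Edmonds-Karp max-flow, the centralized baseline
# against which a decentralized algorithm can be compared.
from collections import deque

def max_flow(cap, s, t):
    # residual capacities; cap maps (u, v) -> capacity
    res = dict(cap)
    for (u, v) in cap:
        res.setdefault((v, u), 0)
    adj = {}
    for (u, v) in res:
        adj.setdefault(u, []).append(v)
    total = 0
    while True:
        # BFS for a shortest augmenting path s -> t
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj.get(u, []):
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return total
        # bottleneck along the path, then augment
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v)); v = parent[v]
        d = min(res[e] for e in path)
        for (u, v) in path:
            res[(u, v)] -= d
            res[(v, u)] += d
        total += d

cap = {('s', 'a'): 4, ('a', 'b'): 3, ('a', 't'): 1, ('b', 't'): 2}
print(max_flow(cap, 's', 't'))  # 3
```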

IV. DECENTRALIZED THROUGHPUT MAXIMIZATION ALGORITHM

In this section, we develop a decentralized and adaptive algorithm for the network flow maximization problem. This algorithm is an augmentation of the Push-Relabel algorithm and is denoted as the Incremental Push-Relabel algorithm (the 'incremental' nature of the algorithm will be explained later in this section). In the Incremental Push-Relabel algorithm, every node in the graph determines its own behavior based on the knowledge about itself and its neighbors. No central coordinator or global information about the system is needed. More importantly, unlike the Push-Relabel algorithm, no global synchronization is needed when the Incremental Push-Relabel algorithm adapts to the changes in the system.

An integer-valued auxiliary function h(u) is defined for u ∈ V, which will be explained in the algorithm. The algorithm is listed as follows:

1) Initialization: h(u), e(u), and f(u, v) are initialized as follows:

     h(u) = 0                    for u ∈ V − {s}
     h(s) = |V|
     f(u, v) = 0                 for u ∈ V − {s}, v ∈ V − {s}
     f(s, u) = c(s, u)           for u ∈ V
     f(u, s) = −c(s, u)          for u ∈ V
     e(u) = Σ_{v ∈ V} f(v, u)    for u ∈ V

2) Search for the maximum flow: Each node u ∈ V − {s, t} conducts one of the following three operations as long as e(u) ≠ 0:

   a) Push(u, v): applies when e(u) > 0 and ∃ v ∈ V s.t. c(u, v) − f(u, v) > 0 and h(u) = h(v) + 1:

        d = min(e(u), c(u, v) − f(u, v))
        f(u, v) = f(u, v) + d
        f(v, u) = −f(u, v)
        e(u) = e(u) − d
        e(v) = e(v) + d

   b) Relabel(u): applies when e(u) > 0 and h(u) ≤ h(v) for every v s.t. c(u, v) − f(u, v) > 0:

        h(u) = min{h(v) + 1 : v ∈ V, c(u, v) − f(u, v) > 0}

   c) Rebalance(u): applies when e(u) < 0:

        while e(u) < 0:
            pick a node v s.t. f(u, v) > 0
            d = min(−e(u), f(u, v))
            e(u) = e(u) + d
            e(v) = e(v) − d
            f(u, v) = f(u, v) − d
            f(v, u) = −f(u, v)

3) Adaptation to changes in the system: For the flow maximization problem, the only possible change that can occur in the system is the increase or decrease of the capacity of some edges. Suppose the value of c(u, v) changes to c'(u, v); the following four scenarios are considered when performing the Adaptation(u, v) operation:

   a) if c'(u, v) > c(u, v) and f(u, v) < c(u, v), do nothing.
   b) if c'(u, v) > c(u, v) and f(u, v) = c(u, v), then

        h(s) = h(s) + |V|
        f(s, u) = c(s, u)           for u ∈ V
        f(u, s) = −c(s, u)          for u ∈ V
        e(u) = Σ_{v ∈ V} f(v, u)    for u ∈ V

   c) if c'(u, v) < c(u, v) and f(u, v) ≤ c'(u, v), do nothing.
   d) if c'(u, v) < c(u, v) and f(u, v) > c'(u, v), then

        h(s) = h(s) + |V|
        f(s, u) = c(s, u)           for u ∈ V
        f(u, s) = −c(s, u)          for u ∈ V
        e(u) = Σ_{v ∈ V} f(v, u)    for u ∈ V
        e(u) = e(u) + (f(u, v) − c'(u, v))
        e(v) = e(v) − (f(u, v) − c'(u, v))
        f(u, v) = c'(u, v)
        f(v, u) = −c'(u, v)
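To make the basic operations concrete, the sketch below runs the standard (non-incremental) Push-Relabel core with a centralized driver loop; the decentralized execution and the Adaptation/Rebalance machinery described above are deliberately omitted, and the example network is hypothetical.

```python
# Sketch: the standard Push-Relabel core underlying the algorithm above
# (centralized loop for brevity; the paper's adaptation steps are omitted).

def push_relabel(cap, s, t, V):
    f = {}; h = {u: 0 for u in V}; e = {u: 0 for u in V}
    def c(u, v): return cap.get((u, v), 0)
    def F(u, v): return f.get((u, v), 0)
    h[s] = len(V)
    for v in V:                               # saturate edges out of s
        if c(s, v) > 0:
            f[(s, v)] = c(s, v); f[(v, s)] = -c(s, v)
            e[v] += c(s, v); e[s] -= c(s, v)
    active = [u for u in V if u not in (s, t) and e[u] > 0]
    while active:
        u = active.pop()
        while e[u] > 0:
            pushed = False
            for v in V:
                # Push: admissible residual edge (u, v)
                if c(u, v) - F(u, v) > 0 and h[u] == h[v] + 1:
                    d = min(e[u], c(u, v) - F(u, v))
                    f[(u, v)] = F(u, v) + d; f[(v, u)] = -f[(u, v)]
                    e[u] -= d; e[v] += d
                    if v not in (s, t) and v not in active:
                        active.append(v)
                    pushed = True
                    if e[u] == 0:
                        break
            if not pushed:                    # Relabel
                h[u] = 1 + min(h[v] for v in V if c(u, v) - F(u, v) > 0)
    return e[t]                               # value of the maximum flow

V = ['s', 'a', 'b', 't']
cap = {('s', 'a'): 4, ('a', 'b'): 3, ('a', 't'): 1, ('b', 't'): 2}
print(push_relabel(cap, 's', 't', V))  # 3
```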

In the above algorithm, the 'Push', 'Relabel', and 'Rebalance' operations are called the basic operations. We have the following observations:

1) The above algorithm is decentralized. In the algorithm, e(u), h(u), and f(u, v) are the local variables maintained by u. None of the other nodes, except u's immediate neighbors, needs to or will query the value of these local variables. Messages are exchanged between adjacent nodes only when such queries occur. For each node u, the 'firing conditions' for the basic operations consist of only local information, i.e. e(u), h(u), and h(v) where v ∈ S(u). The basic operations themselves also only change the values of the local variables maintained by u. When the 'Adaptation' operation is performed, the algorithm only needs to change the values of the local variables maintained by u and v.

2) Each node u, except s and t, performs one of the basic operations as long as e(u) ≠ 0. It can be shown that the algorithm finds the maximum flow when no basic operations can be performed. When some parameters in the system change, the 'Adaptation' operation does not generate the new maximum flow immediately. Instead, the algorithm still relies on the individual nodes, which perform the basic operations and find the new maximum flow gradually. The 'Adaptation' operation changes the values of some e(u)'s, which allows new basic operations to be performed.

3) Different from the Push-Relabel algorithm, which needs to be re-initialized and re-run when some parameters in the system change, the initialization is performed only once in our algorithm. After initialization, all the adaptations are performed upon the current values of f(u, v) and h(u) for u, v ∈ V. Hence no global synchronization is needed. In other words, our algorithm performs incremental optimization as the parameters of the system change.

An intuitive explanation of the Incremental Push-Relabel algorithm is as follows. h(u) represents, intuitively, the shortest distance from u to t when h(u) < |V|. When h(u) ≥ |V|, h(u) − |V| represents the shortest distance from u to s. Hence the Incremental Push-Relabel algorithm attempts to push more flow from s to t along the shortest path; excess flow of intermediate nodes is pushed back to s along the shortest path. Similar to the Edmonds-Karp algorithm [17], such a choice of paths can lead to an optimal solution. This is formally stated in the next theorem.

Theorem 4.1: The Incremental Push-Relabel algorithm finds the maximum flow within a number of basic operations that is polynomial in m, |V|, and |E|, where m is the number of adaptation operations performed, |V| is the number of nodes in the graph, and |E| is the number of edges in the graph.

The proof of Theorem 4.1 consists of two parts. First, we prove that if the algorithm terminates, it obtains the maximum flow. Next, we prove that the algorithm does terminate by finding upper bounds on the number of Push, Relabel, and Rebalance operations that are performed. Details of the proof are omitted here due to space limitations and can be found in [19]. In [19], the above technique is used for the computation of a large set of independent tasks in a dynamic heterogeneous system. Although the problem studied in [19] is totally different from the in-network processing applications studied in this paper, both problems can be reduced (using different methods) to the network flow optimization problem and hence the Incremental Push-Relabel algorithm can be applied.


V. ON-LINE PROTOCOLS FOR IN-NETWORK PROCESSING

The maximum flow obtained using the Incremental Push-Relabel algorithm does not tell us how to transfer the data through the network. It only contains information like 'u needs to process 0.23 data blocks in one unit of time' or 'u needs to transfer 0.38 data blocks to v in one unit of time'. But the actual system needs to deal with integer numbers of data blocks. Furthermore, before the Incremental Push-Relabel algorithm finds the maximum flow, a node, say u, may have a positive valued e(u), which means u is accumulating data blocks at that time instance. Yet such a node u should not keep accumulating data blocks as e(u) will eventually be driven to zero.

These issues are addressed by maintaining a data buffer ateach node. Initially, all the data buffers are empty. Buffersat the source node � � �� are being filled at rate ��. Let ��denote the length of the used buffer at �. At any time instance,each node � � � operates as follows:

1) Contact the adjacent node(s) and execute the IncrementalPush-Relabel algorithm.

2) If �� � � and � is not processing any data blocks,remove one data block from the data buffer and processit.

3) While �� � � and � is processing a data block, sendmessage ‘request to send’ to �� � �� if �� �� � �. If‘clear to send’ is received from �, then set �� � ����and send a data block to �.

4) Upon receiving ‘request to send’, � acknowledges ‘clearto send’ if �� � � . � acknowledges a denial if �� � � .Here � is a pre-set threshold that limits the maximumnumber of data blocks a buffer can hold.

5) If u ∈ V_s and b_u ≥ B, stop sensing the environment until b_u < B.
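The five steps above can be sketched as a per-node loop. The sketch below is a simplified synchronous rendering in Python; the class and attribute names are illustrative, and step 1 (running the Incremental Push-Relabel algorithm with the neighbors) is abstracted into a precomputed per-neighbor flow table:

```python
from collections import deque

class Node:
    """One node's buffer logic (steps 2-5 above) in a simplified
    synchronous form. Names here are illustrative; step 1 is
    abstracted into the precomputed `flow_to` table."""

    def __init__(self, name, B=10, is_source=False):
        self.name = name
        self.B = B               # buffer threshold (max blocks held)
        self.buf = deque()       # the node's data buffer
        self.is_source = is_source
        self.flow_to = {}        # neighbor Node -> positive flow f(u, v)
        self.processed = 0       # blocks processed so far

    def clear_to_send(self):
        # Step 4: grant 'request to send' only while the buffer is below B.
        return len(self.buf) < self.B

    def tick(self):
        # Step 5: a source senses only while its buffer is below B.
        if self.is_source and len(self.buf) < self.B:
            self.buf.append("block")
        # Step 3: forward one block to each neighbor with positive flow
        # that grants 'clear to send'.
        for nbr in self.flow_to:
            if self.buf and nbr.clear_to_send():
                nbr.buf.append(self.buf.popleft())
        # Step 2: process at most one block per tick.
        if self.buf:
            self.buf.popleft()
            self.processed += 1
```

Because step 4 is checked before every transfer, no buffer ever exceeds B, which is exactly how the protocol prevents nodes from accumulating data blocks.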

Two types of data are transferred in the system: control messages that are used by the Incremental Push-Relabel algorithm, and the sensed data themselves. The control messages are exchanged among the nodes to query and update the algorithm's variables (flows, excesses, etc.). The 'request to send' and 'clear to send' messages are also control messages, though they are more related to data transfer. The control messages and the sensed data are transmitted over the same links, and higher priority is given to the control messages in case of a conflict.

Using the buffers during data transfer prevents any nodes from accumulating data blocks. Another advantage of utilizing the buffers is that the nodes can start transferring data without waiting for the Incremental Push-Relabel algorithm to complete. Actually, since the algorithm is executed in a decentralized fashion, the nodes may not be aware whether the optimization is completed or not, unless a global synchronization is performed.

Because the pseudo source node s is not an actual sensor, the nodes in V_s need to maintain a consistent image for s. This can be implemented by first electing a leader from V_s. Observe that the excess e(s) is the only variable that all nodes in V_s share. The leader then maintains e(s) and broadcasts e(s) to the nodes in V_s whenever e(s) changes. Such regional cooperation will cause some extra cost. However, since leader election can be implemented efficiently and the broadcast occurs only when changes occur in the system, this extra cost is minimal when compared with the push/relabel/rebalance operations executed by the protocol. Consequently, this cost is considered negligible.

VI. EXPERIMENTAL RESULTS

To conduct the experiments, a simulator was developed using PARSEC [20], which is a language for discrete event simulation.

A. Simulation Setup

The simulated sensor network was generated by randomly scattering 20 - 80 sensor nodes in a unit square. The base station was located at the lower-left corner of the square. The event of interest was randomly dropped into the square. Nodes within 0.2 units of distance from the event are assumed to sense the event. Each data block is 32 bytes. The radio transmission range of the nodes was set to 0.2, i.e., nodes within 0.2 units of distance can communicate with each other. Assuming a signal decaying factor of α, the flow capacity between sensor nodes i and j is determined by Shannon's theorem as

    c(i, j) = W log2( 1 + P(i, j) d(i, j)^(−α) / N )

where W is the bandwidth of the link, d(i, j) is the distance between i and j, P(i, j) is the transmission power on link (i, j), and N is the noise in the communication channel. In all the simulations, W and N were set to fixed constants. The processing capabilities w_u are uniformly distributed between 0 and w_max. The sensing rates r_u are distributed between 0 and r_max. The transmission time of a control message is assumed to be 1 ms. Because we consider the scenario where the processing results of each data block consist of a very small number of bits, we ignore the cost of transferring the processing results to the sink node in the simulations.
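The link capacity computation above can be sketched as follows; the numeric parameter values in the usage example are illustrative only, not the paper's settings:

```python
import math

def link_capacity(W, P, d, alpha, N):
    """Shannon capacity of a link: transmitted power P decays with
    distance as d**(-alpha) before competing with channel noise N."""
    snr = P * d ** (-alpha) / N
    return W * math.log2(1.0 + snr)

# Illustrative values only: capacity drops as nodes move farther apart
# and grows with transmission power.
near = link_capacity(W=1e6, P=0.1, d=0.1, alpha=2.0, N=1e-4)
far = link_capacity(W=1e6, P=0.1, d=0.2, alpha=2.0, N=1e-4)
```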

B. Summary of Results

Our first set of simulations examines the convergence of our in-network processing protocol. For this set of simulations, the transmission power of all links was set to a constant value. The in-network processing lasted 30 seconds for each simulation. Let N(t) denote the total number of data blocks processed by the system from time 0 to time t. The raw throughput is calculated as T_r = N(30)/30. The steady-state throughput T_s is calculated analogously over the interval after the initial start-up transient. The instantaneous throughput at time t is approximated as T(t) = (N(t + δ) − N(t))/δ for a small δ. The start-up time is defined as T_su = min{ t : T(t) ≥ θ T_s } for a fixed threshold θ close to 1, which indicates the convergence speed of our in-network processing protocol. w_max was set to 10 and r_max was set to 100.
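These metrics can be computed from a sampled trace of N(t), as in the sketch below. The threshold theta, the sampling step delta, and the use of the raw throughput as the reference for the start-up test are assumptions of this sketch, since the exact constants are not fully specified above:

```python
def throughput_metrics(N, total_T, delta=0.1, theta=0.85):
    """Compute the throughput metrics above from a block-count trace.
    N: function t -> total data blocks processed in [0, t].
    theta and delta are assumed constants, not the paper's."""
    T_r = N(total_T) / total_T                      # raw throughput

    def T_inst(t):                                  # instantaneous throughput
        return (N(t + delta) - N(t)) / delta

    # Start-up time: earliest t where T(t) reaches theta * T_r.
    t_su = 0.0
    while t_su < total_T and T_inst(t_su) < theta * T_r:
        t_su += delta
    # Steady-state throughput: over the interval after start-up.
    T_s = (N(total_T) - N(t_su)) / (total_T - t_su)
    return T_r, T_s, t_su
```

For a system that processes at a constant rate from time 0, the three metrics coincide and the start-up time is zero; a delayed ramp-up yields a positive start-up time and a steady-state throughput above the raw throughput.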

Buffer sizes of B = 1, 5, and 10 were used. The simulation results are listed in Tables I - III, where each data point is an average over 200 experiments. The values of T_r and T_s have been normalized to the optimal system throughput, which was calculated off-line.


TABLE I

NORMALIZED RAW THROUGHPUT

        n=20     n=40     n=60     n=80
B=1     0.9452   0.9421   0.9472   0.9516
B=5     0.9591   0.9366   0.9440   0.9535
B=10    0.9452   0.9372   0.9349   0.9531

TABLE II

NORMALIZED STEADY-STATE THROUGHPUT

        n=20     n=40     n=60     n=80
B=1     0.9493   0.9456   0.9529   0.9634
B=5     0.9639   0.9395   0.9482   0.9628
B=10    0.9489   0.9412   0.9374   0.9640

These simulation results show that our in-network processing protocol achieves around 95% of the optimal throughput. The difference between the raw and steady-state throughput is small because the start-up time is only around 0.3 s, which is negligible when compared with the 30 s in-network processing time. Our protocol is insensitive to the buffer size limit. Actually, reducing the buffer size from 10 to 1 does not cause noticeable degradation in the system throughput. As can be seen from the results, the number of sensor nodes does not have a noticeable impact on the system throughput. However, we do observe an increasing trend in the start-up time as the number of sensor nodes increases. This means that a faster (in terms of convergence speed) algorithm needs to be designed for large-scale systems if response time is important. We plan to explore this direction in the future.

The adaptivity of our protocol has been verified by modifying the sensing/processing/communication capabilities of the sensors while in-network processing is being performed. The simulation settings are the same as before, except that we randomly chose 20% of the communication links and increased their bandwidth each by 50% during the simulations. When such changes occurred in the system, the adaptation procedure was activated and the in-network processing was adapted. We have observed that the system operated close to (about 95% of) the new optimal throughput after the adaptation was completed. The results in Table IV show the adaptation time T_a of our protocol, which is defined as follows: suppose a set of changes occurs at time instance t_c and the steady-state throughput after the adaptation is T_s', then

    T_a = max{ t − t_c : |T(t) − T_s'| ≥ 0.15 T_s' }

Intuitively, T_a is the time for the system to achieve 85% (when the new steady-state throughput is higher than the original throughput) to 115% (when the new steady-state throughput is lower than the original throughput) of the new steady state. The results in Table IV show that the adaptation time is around 0.3 s, roughly the same as the start-up time shown in Table III. However, the adaptation time does not increase as the number of nodes increases. This is possibly caused by the following fact: when the system has a large number of nodes, a subset of the nodes is already capable of processing all the data generated by the source nodes. If the performance of those

TABLE III

START UP TIME

        n=20      n=40      n=60      n=80
B=1     0.2965 s  0.2800 s  0.3127 s  0.3671 s
B=5     0.3067 s  0.2790 s  0.3432 s  0.3797 s
B=10    0.2902 s  0.2793 s  0.3138 s  0.3692 s

TABLE IV

ADAPTATION TIME

        n=20     n=40     n=60     n=80
B=1     0.3078   0.2907   0.2656   0.2186
B=5     0.2678   0.2826   0.2442   0.2102
B=10    0.2679   0.3506   0.2998   0.2508

not-in-use nodes was changed, then our algorithm would not be activated and the system simply continues to operate as if no changes had occurred. This suggests that sensing capabilities may become the performance bottleneck as the number of nodes increases. The results in Table IV also show that our protocol is insensitive to buffer size.
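The adaptation-time metric defined above can be computed from an instantaneous-throughput trace as in the sketch below; the sampling step delta is an assumption of this sketch:

```python
def adaptation_time(T_inst, t_c, T_s_new, horizon, delta=0.1, eps=0.15):
    """Adaptation time T_a as defined above: the last time after the
    change at t_c at which the instantaneous throughput T_inst(t) is
    still more than eps (15%) away from the new steady state T_s_new.
    The sampling step delta is an assumed parameter."""
    t_a = 0.0
    t = t_c
    while t <= t_c + horizon:
        if abs(T_inst(t) - T_s_new) > eps * T_s_new:
            t_a = t - t_c
        t += delta
    return t_a
```

A trace already at the new steady state yields T_a = 0; a trace that lags behind the new steady state for a while yields the length of that lag.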

The next set of simulations studies the impact of w_max. With W, N, and r_u fixed, w_max represents the relative compute power of the nodes and hence the average communication/computation ratio of the data blocks. We simulated systems with 20 and 80 nodes. For each system size, we evaluated the performance of our protocol with w_max ranging from 5 to 50. The results are shown in Figure 2, where each data point is an average over 500 simulations. The results have been normalized to the optimal system throughput. This optimal throughput was calculated off-line. We can see that the number of nodes does not have any noticeable impact on the system throughput. When w_max becomes larger, our protocol achieves a closer to optimal throughput. However, the improvement in throughput is marginal as w_max increases. This suggests that communication bandwidth is a more important factor for improving system throughput than the processing capabilities of the nodes.

Fig. 2. Impact of w_max on system throughput (normalized throughput, plotted between 0.9 and 1, against w_max from 0 to 50, for 20-node and 80-node systems)


VII. PERFORMANCE COMPARISON

An alternative method for the in-network processing problem is to transfer data blocks to neighbors that can process the data. This heuristic attempts to maximize system throughput by pushing data blocks from the source (where the data is sensed) towards the sink (where the data is processed) along some paths. Such path based greedy heuristics are widely used for many data routing problems since they are easy to implement [13]. The actual choice of the path (shortest path, minimum latency path, etc.) is application specific. But a common property of such heuristics is that the path is determined by the nodes based on some locally available information. An example of path based greedy heuristics is directed diffusion [21], which offers a solution to a class of data gathering problems. In [21], the sink node notifies its interest in the data. While the interest is propagated through the system, each node locally determines a gradient that specifies which neighbor the data should be sent to. This gradient is then used to establish a path from the source nodes to the sink.

Generally, path based greedy heuristics consist of the following four steps in the context of the TMIP problem: (1) Transform the TMIP problem to its network flow representation by applying the procedure specified in Section III. (2) Find an arbitrary path p from the source s to the sink t. We define the capacity of the path, c_p, as the minimum capacity of all the edges on this path. (3) Push c_p units of flow along path p and reduce the capacity of all the edges on path p by c_p. (4) Repeat steps 2 and 3 until there does not exist any path from s to t.

Note that the heuristic is applied to the network flow representation of the TMIP problem. Sink t is a pseudo node representing the processing of the data. If a certain amount of flow is pushed along a path to t, it actually means that the data is transferred along this path and then processed by the last node of this path. For the sake of illustration, an arbitrary path is chosen to push the flow in step 2 above. However, this heuristic can be generalized to use other choices such as shortest path, minimum latency path, etc.
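The four steps above can be sketched as follows, using depth-first search as a stand-in for the 'arbitrary' path choice of step 2 (all names are illustrative):

```python
def greedy_path_flow(cap, s, t):
    """Path based greedy heuristic (steps 1-4 above): repeatedly find
    an arbitrary s-t path with positive remaining capacity, push the
    path's bottleneck capacity, and stop when no path remains.
    `cap` is a dict-of-dicts of capacities. No residual (reverse)
    edges are added -- which is exactly why the heuristic can be
    suboptimal, unlike a true max-flow algorithm."""
    cap = {u: dict(nbrs) for u, nbrs in cap.items()}  # work on a copy

    def find_path():
        # Depth-first search for any path with positive capacity.
        stack, seen = [(s, [s])], {s}
        while stack:
            u, path = stack.pop()
            if u == t:
                return path
            for v, c in cap.get(u, {}).items():
                if c > 0 and v not in seen:
                    seen.add(v)
                    stack.append((v, path + [v]))
        return None

    total = 0
    while (path := find_path()) is not None:
        bottleneck = min(cap[u][v] for u, v in zip(path, path[1:]))
        for u, v in zip(path, path[1:]):
            cap[u][v] -= bottleneck
        total += bottleneck
    return total
```

On networks whose s-t paths are edge-disjoint, this heuristic does find the maximum flow; its weakness only shows when an early path blocks later ones, as Figure 3 illustrates.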

The heuristic can be approximated by the following simple distributed protocol: (1) Every node maintains a data buffer, which has a predetermined size limit. (2) The source nodes keep sensing the environment and load their data buffers until the buffer becomes full. (3) At each node, as long as the data buffer is not empty, the node removes a data block from its buffer and processes the data block. (4) At each node, as long as the data buffer is not empty, the node sends a data block to a neighbor if the neighbor has fewer data blocks in its buffer.

To determine if a node u has more data blocks than its neighbor v, u can send the query 'request the number of data blocks' to v and wait for the response. Or, v can broadcast the number of data blocks in its buffer whenever it changes. It is possible that two nodes u1 and u2 query a common neighbor w at the same time. u1 then sends a data block to w, increasing the number of data blocks at w by 1. At

Fig. 3. An example illustrating the poor performance of the path based greedy heuristic (nodes s, n1 - n15, and t; every edge has capacity 10)

this time instance, the knowledge u2 has about w is stale. If u2 makes any decision based on this stale knowledge, then u2 may not be following the above protocol precisely. Such consistency issues can be solved via some low level handshaking mechanism. For example, we can enforce that 'query the number of data blocks' and 'send the data block' be executed together as a single atomic operation. Or, w can delay its response to u2 in the first place, and wait until u1 has finished its operations. Design details for this protocol are beyond the scope of this paper. But it is clear that the spirit of the above protocol is to simply move data from the source to the sensors (where the data is processed, i.e., the pseudo sink) along paths based on local information.

Although easy to implement, the greedy heuristic cannot guarantee the optimality of the solution when applied to the TMIP problem. Actually, the performance of the greedy heuristic can be arbitrarily bad in the worst case. This is illustrated using the example in Figure 3.

For the sake of illustration, the problem reduction procedure is skipped and the TMIP problem is shown in its network flow representation in Figure 3. Edges (n5, t), (n10, t), and (n15, t) represent data processing in the TMIP problem. Capacities of the edges are marked on the edges. Suppose the path that the greedy heuristic first chooses is a path p0 that winds through all three branches of the network before reaching t. 10 units of flow can be pushed via path p0. Then no more flow can be pushed from s to t, since p0 has consumed one edge on each branch and no available path from s to t remains. However, the system can actually achieve 30 units of flow, because 10 units of flow can be pushed along each of the following three paths: p1 = s → n1 → n2 → n3 → n4 → n5 → t, p2 = s → n6 → n7 → n8 → n9 → n10 → t, and p3 = s → n11 → n12 → n13 → n14 → n15 → t. In this example, the greedy heuristic achieves 1/3 of the optimal solution. Note that we can insert multiple copies of the middle branch into the system. This will lead to an arbitrarily bad worst case performance of the greedy heuristic.
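The example can be reproduced numerically. Since the exact topology of Figure 3 is only partially recoverable from the text, the gadget below is an analogous construction (with illustrative node names): three parallel branches of capacity 10, joined by crossing edges, where one unlucky first path blocks all three branches:

```python
def push_path(cap, path):
    """Push the bottleneck capacity of `path` through `cap` (a
    dict-of-dicts of remaining capacities) and return the amount."""
    b = min(cap[u][v] for u, v in zip(path, path[1:]))
    for u, v in zip(path, path[1:]):
        cap[u][v] -= b
    return b

def reachable(cap, s, t):
    """Is t reachable from s via edges with positive remaining capacity?"""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for v, c in cap.get(u, {}).items():
            if c > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return False

def build_gadget():
    """Three parallel branches of capacity 10 from s to t, joined by
    crossing edges a2->b1 and b2->c1 (an analogue of Figure 3)."""
    c = 10
    return {
        's':  {'a1': c, 'b1': c, 'c1': c},
        'a1': {'a2': c}, 'a2': {'t': c, 'b1': c},
        'b1': {'b2': c}, 'b2': {'t': c, 'c1': c},
        'c1': {'c2': c}, 'c2': {'t': c},
    }
```

Pushing the winding path s → a1 → a2 → b1 → b2 → c1 → c2 → t first yields only 10 units and saturates one edge on every branch, so no further path exists; pushing the three straight branches on a fresh copy yields 30. Duplicating the middle branch (with its crossing edges) drives the greedy/optimal ratio arbitrarily low.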

We have shown that choosing paths arbitrarily based on local information can lead to very poor performance. It can be shown that for other path based greedy heuristics, there also exist instances in which the system performance is arbitrarily bad. The above example can be generalized to show the


following:

Theorem 7.1: For systems with 15 or more nodes, there exist instances of the TMIP problem in which the performance of the path based greedy heuristic can be arbitrarily bad.

VIII. DISCUSSION

In this paper, we considered the problem of in-network processing in networked sensor systems. After reducing the problem to its network flow representation, we developed a decentralized adaptive algorithm to maximize the system throughput. This algorithm was further implemented as an on-line protocol. System throughput up to 95% of the optimal was observed in the simulations. Adaptivity of the protocol (with respect to performance changes of the sensors) was illustrated through simulations.

In the TMIP problem, we have modeled the processing capability, communication rate, and sensing rate of the sensors. Power consumption of the nodes was not directly modeled, but assumed to be controlled by some application level power management scheme. This leads to continuous changes in the computation and communication capabilities of the nodes (due to power management). In addition, environmental factors can affect the performance of the nodes. We addressed the issue of adaptation to such performance changes.

To address power consumption directly, we can introduce a fourth characteristic for the sensors: the power budget. It represents the maximum amount of energy that a sensor can consume in one unit of time. The power budget may be determined by various factors. For example, if a node is low on battery, it may impose a low power budget on its activities so as to extend its life. If a node is on the critical path connecting two groups of sensors, it may also impose a low power budget so that it can operate over a long period of time. It is reasonable to assume that the power budget is also controlled by some application level power management scheme. The goal is to maximize the throughput under the power budgets of the sensors.

Power consumption of the sensors can be modeled by associating certain energy costs with data processing, communication, and sensing. Formally, suppose node u ∈ V has power budget E_u and consumes e_p(u) units of energy when processing a data block, e_s(u) units of energy when sensing a data block, e_tx(u) units of energy when sending a data block, and e_rx(u) units of energy when receiving a data block. Then the throughput maximization problem can be formulated as follows. For notational convenience, let S_u denote the successors of u, and P_u denote the predecessors of u. We use f_p(u) to denote the number of data blocks processed by u, and f_s(u) to denote the number of data blocks sensed by u.

Given: Graph G(V, E) with set of source nodes V_s. Node u ∈ V has processing capability w_u, data collection capability r_u, and power budget E_u. For node u, the energy cost of processing one data block is e_p(u), of sensing one data block is e_s(u), of sending a data block is e_tx(u), and of receiving a data block is e_rx(u). Edge (u, v) ∈ E has capacity c(u, v).

Maximize:

    Σ_{u ∈ V} f_p(u)

Subject to:

    f_p(u) ≤ w_u                                  for u ∈ V          (1)
    f_s(u) ≤ r_u                                  for u ∈ V_s        (2)
    f(u, v) ≤ c(u, v)                             for (u, v) ∈ E     (3)
    f_p(u) + Σ_{v ∈ S_u} f(u, v)
        = Σ_{v ∈ P_u} f(v, u)                     for u ∈ V − V_s    (4)
    f_p(u) + Σ_{v ∈ S_u} f(u, v)
        = f_s(u) + Σ_{v ∈ P_u} f(v, u)            for u ∈ V_s        (5)
    f_p(u) e_p(u) + Σ_{v ∈ S_u} f(u, v) e_tx(u)
        + Σ_{v ∈ P_u} f(v, u) e_rx(u) ≤ E_u       for u ∈ V − V_s    (6)
    f_s(u) e_s(u) + f_p(u) e_p(u) + Σ_{v ∈ S_u} f(u, v) e_tx(u)
        + Σ_{v ∈ P_u} f(v, u) e_rx(u) ≤ E_u       for u ∈ V_s        (7)

In the above problem statement, we differentiate the predecessors and successors of a node, rather than defining f(u, v) = −f(v, u). Such a notational change does not alter the problem formulation. We still have constraints on the processing capability (condition 1), on the sensing capability (condition 2), and on the communication rate (condition 3). But in contrast to the TMIP problem, we now have conditions 6 and 7, which represent the power budgets of the nodes. Conditions 1, 2, and 3 can all be transformed into edge capacity constraints if we follow the procedure specified in Section III. Conditions 6 and 7 are the energy constraints imposed on the nodes. In [22], we have shown that for data gathering problems (without in-network processing), such energy constraints can be transformed into edge capacity constraints, leading to a network flow problem formulation. We are exploring the possibility of applying this technique to in-network processing problems. The goal is to have a network flow representation of the in-network processing problems (with power budget constraints). This will enable the use of the Incremental Push-Relabel algorithm and lead to a distributed protocol for in-network processing under power constraints.

The TMIP problem studies the class of in-network processing problems that maximize the throughput. An equally important problem is to maximize the total number of data blocks processed. Because every sensor has a certain energy budget, the number of data blocks that a sensor can sense, send, receive, and process is limited. Therefore we need to determine the routing of data blocks through the system without violating the energy budgets of the sensors. The energy constraints, again, can be represented as constraints on the vertices. An interesting observation is that there are no edge capacity constraints: a communication link can be used to transfer an arbitrary number of data blocks (over an arbitrarily long time period) since we are not maximizing the number of data blocks processed in one unit of time. This effectively reduces the maximization of the number of data blocks to a special case of throughput maximization with edge capacities set to infinity. For this class of in-network processing problems, we are exploring the possibility of developing a distributed algorithm that can be executed by the nodes while the system continues to collect and process data.

In this paper, we have modeled networked sensor systems as a general graph where all the nodes have the same functionality. However, various studies have proposed hierarchical infrastructures for sensor networks (e.g., [23]). In these infrastructures, one node is elected as the cluster head that coordinates the operations of the nodes in a cluster. Additionally, the role of the cluster head is often rotated among the nodes in a cluster [24]. This results in a dynamic tree structured system topology. Because routing is greatly simplified for tree topologies, many data gathering/processing problems can be solved efficiently. For example, if the root of the tree (often the base station) needs to disseminate data to the complete system, then the greedy algorithm in [25], which was originally developed for tree structured distributed computer systems, can be applied. When data processing capabilities and energy constraints are considered, non-trivial algorithms need to be designed to optimize system performance. Associated with the performance optimization problems is the problem of system synthesis: given a system connected via a general graph, what is the optimal tree structured sub-graph that can collect the maximum number of data packets? Or, what is the optimal tree that can operate over the longest time period? Many sensor network applications can be abstracted as the coordination of communication and computation in tree structured systems. Exploration in this direction, which focuses on studies at the infrastructure level, will greatly aid in application design for networked sensor systems.

REFERENCES

[1] H. Wang, D. Estrin, and L. Girod, "Preprocessing in a Tiered Sensor Network for Habitat Monitoring," EURASIP JASP special issue on sensor networks, vol. 2003, no. 4, pp. 392–401, March 2003.

[2] J. C. Chen, Y. Kung, and R. E. Hudson, "Source Localization and Beamforming," IEEE Signal Processing Magazine, vol. 19, no. 2, March 2002.

[3] F. Zhao, J. Shin, and J. Reich, "Information-Driven Dynamic Sensor Collaboration for Tracking Applications," IEEE Signal Processing Magazine, March 2002.

[4] G. J. Pottie and W. J. Kaiser, "Wireless Integrated Network Sensors," Communications of the ACM, vol. 43, no. 5, May 2000.

[5] K. Sohrabi, J. Gao, V. Ailawadhi, and G. Pottie, "Protocols for Self-Organization of a Wireless Sensor Network," IEEE Personal Communications Magazine, vol. 7, no. 5, pp. 16–27, October 2000.

[6] R. Kumar, V. Tsiatsis, and M. B. Srivastava, "Computation Hierarchy for In-network Processing," 2nd ACM International Conference on Wireless Sensor Networks and Applications, pp. 68–77, 2003.

[7] J. Deng, R. Han, and S. Mishra, "Security Support for In-Network Processing in Wireless Sensor Networks," 2003 ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '03), October 2003.

[8] B. J. Bonfils and P. Bonnet, "Adaptive and Decentralized Operator Placement for In-Network Query Processing," Second International Workshop on Information Processing in Sensor Networks (IPSN 2003), pp. 47–62, April 2003.

[9] S. R. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong, "TAG: a Tiny AGgregation Service for Ad-Hoc Sensor Networks," 5th Symposium on Operating Systems Design and Implementation (OSDI '02), December 2002.

[10] P. Bonnet, J. E. Gehrke, and P. Seshadri, "Towards Sensor Database Systems," Second International Conference on Mobile Data Management, January 2001.

[11] H. Wu and J. M. Mendel, "Quantitative Analysis of Spatio-Temporal Decision Fusion Based on the Majority Voting Technique," Proc. of SPIE, vol. 5434, SPIE Defense and Security Symposium 2004, Orlando, FL, USA, 2004.

[12] M. Cannataro, D. Talia, and P. K. Srimani, "Parallel Data Intensive Computing in Scientific and Commercial Applications," Parallel Computing, vol. 28, no. 5, pp. 673–704, May 2002.

[13] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cyirci, "Wireless Sensor Networks: A Survey," Computer Networks, vol. 38, no. 4, pp. 393–422, 2002.

[14] R. Min, T. Furrer, and A. Chandrakasan, "Dynamic Voltage Scaling Techniques for Distributed Microsensor Networks," Workshop on VLSI (WVLSI '00), April 2000.

[15] A. Salhieh, J. Weinmann, M. Kochha, and L. Schwiebert, "Power Efficient Topologies for Wireless Sensor Networks," International Conference on Parallel Processing, pp. 156–163, 2001.

[16] C. Schurgers, O. Aberthorne, and M. Srivastava, "Modulation Scaling for Energy Aware Communication Systems," 2001 International Symposium on Low Power Electronics and Design, pp. 96–99, 2001.

[17] T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to Algorithms. MIT Press, 1992.

[18] A. A. Bertossi, C. M. Pinotti, and R. B. Tan, "Channel Assignment with Separation for Interference Avoidance in Wireless Networks," IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 3, pp. 222–235, March 2003.

[19] B. Hong and V. K. Prasanna, "Distributed Adaptive Task Allocation in Heterogeneous Computing Environments to Maximize Throughput," Technical Report CENG-2003-02, Department of Electrical Engineering, University of Southern California, http://www-scf.usc.edu/~bohong/report oct03.ps, October 2003.

[20] R. Bagrodia, R. Meyer, M. Takai, Y. Chen, X. Zeng, J. Martin, and H. Song, "PARSEC: A Parallel Simulation Environment for Complex Systems," IEEE Computer, vol. 31, no. 10, pp. 77–85, 1998.

[21] C. Intanagonwiwat, R. Govindan, and D. Estrin, "Directed Diffusion: A Scalable and Robust Communication Paradigm for Sensor Networks," Sixth Annual International Conference on Mobile Computing and Networking (MobiCOM '00), August 2000.

[22] B. Hong and V. K. Prasanna, "Constrained Flow Optimization with Applications to Data Gathering in Sensor Networks," First International Workshop on Algorithmic Aspects of Wireless Sensor Networks (ALGOSENSORS 2004), July 2004.

[23] A. Wadaa, S. Olariu, L. Wilson, K. Jones, and Q. Xu, "On Training a Sensor Network," International Parallel and Distributed Processing Symposium (IPDPS '03), April 2003.

[24] J. Wu, B. Wu, and I. Stojmenovic, "Power-Aware Broadcasting and Activity Scheduling in Ad Hoc Wireless Networks Using Connected Dominating Sets," Wireless Communications and Mobile Computing, special issue on Research in Ad Hoc Networking, Smart Sensing, and Pervasive Computing, vol. 3, no. 4, pp. 425–438, June 2003.

[25] O. Beaumont, A. Legrand, Y. Robert, L. Carter, and J. Ferrante, "Bandwidth-Centric Allocation of Independent Tasks on Heterogeneous Platforms," International Parallel and Distributed Processing Symposium (IPDPS), April 2002.
