Reconstruction of Network Coded Sources From Incomplete Datasets


Eirina Bourtsoulatze, Student Member, IEEE, Nikolaos Thomos, Member, IEEE, and Pascal Frossard, Senior Member, IEEE

Abstract

In this paper, we investigate the problem of recovering source information from an incomplete set of network coded data with the help of prior information about the sources. We study the theoretical performance of such systems under maximum a posteriori (MAP) decoding and examine the influence of the data priors and, in particular, of the source correlation on the decoding performance. We also propose a low complexity iterative decoding algorithm based on message passing for decoding the network coded data. The algorithm operates on a graph that captures the network coding constraints, while the knowledge about the source correlation is directly incorporated in the messages exchanged over the graph. We test the proposed method on both synthetic data and correlated image sequences and demonstrate that the prior knowledge about the statistical properties of the sources can be effectively exploited at the decoder in order to provide a good reconstruction of the transmitted data.

Index Terms

Network coding, correlated sources, message passing, factor graph.

I. INTRODUCTION

The emergence of new network architectures and the growth of resource demanding applications have created the need for novel data delivery mechanisms that are able to efficiently exploit the network diversity. This has favored the emergence of the network coding research field [1] that introduces the concept of in-network data processing with the aim of improving the network performance. Extensive theoretical studies in network coding have revealed its great potential as a building block for efficient data delivery. It has been shown for example that one can achieve the maximum network flow in the multicast communication scenario by randomly combining the incoming data in the network nodes [2].

While many research efforts have focused on the design of network codes, only a few works have addressed the problem of recovering the network coded data when the available network coded information is not sufficient for perfect decoding. The assumption made by the majority of network coding algorithms is that the source data is decoded upon collecting a sufficient number of network coded packets, so that the original data can be perfectly reconstructed using exact decoding methods such as Gaussian elimination. This imposes significant limitations on the application of network coding in real systems; due to dynamics in the network, such as bandwidth variations or data losses, there is no guarantee that the required amount of network coded data will reach the end user in time for decoding. Under these conditions, state of the art methods can only provide an all-or-nothing performance, with the source data being either perfectly recovered or not recovered at all. Besides its negative impact on the delivered data quality, this also influences the network performance, since network resources may actually be used to deliver useless network coded data. In such cases, approximate reconstruction of the source data from incomplete network coded data is certainly worthwhile, even if perfect reconstruction cannot be achieved.

In this paper, we investigate the problem of decoding the original source data from an incomplete set of network coded data with the help of source priors. We consider correlated sources that generate discrete symbols, which are subsequently represented by values in a finite algebraic field and transmitted through an overlay network towards the receivers. The intermediate nodes in the overlay network perform randomized linear inter-session network coding.

E. Bourtsoulatze and P. Frossard are with the Signal Processing Laboratory 4 (LTS4), Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland (e-mail: [email protected]; [email protected]). N. Thomos is with the Signal Processing Laboratory 4 (LTS4), Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland, and the Communication and Distributed Systems laboratory (CDS), University of Bern, Bern, Switzerland (e-mail: [email protected]).

This work has been supported by the Swiss National Science Foundation under grants 200021-138083 and PZ00P2-137275.


Then, given a set of network coded symbols, the decoder provides an approximate reconstruction of the original source data. We first study the theoretical performance of an optimal maximum a posteriori decoder. We provide an upper bound on the probability of erroneous decoding as a function of the number of network coded symbols available at the decoder, the finite field size that is used for data representation and network coding, and the joint probability mass function of the source data. This bound provides useful insights into the performance limits of the considered framework. Next, we propose a constructive solution to the approximate decoding problem and design an iterative decoding algorithm based on message passing, which jointly considers the network coding constraints given in the finite field and the correlation constraints that are expressed in the real field. We test our decoding algorithm on two sets of correlated sources. In the first case, synthetic source symbols are generated by sampling a multivariate normal distribution. In the second case, the source data corresponds to signals obtained from correlated images in a video sequence. The results demonstrate that, even by using a simple correlation model expressed as a correlation noise between pairs of sources, the original source data can be efficiently decoded in practice from an incomplete set of network coded symbols.

The rest of the paper is organized as follows. In Section II, we present an overview of the related work. We describe the network coding framework in Section III. In Section IV, we analyze the performance of a maximum a posteriori (MAP) decoder by giving an upper bound on the decoding error probability. Next, in Section V, we propose a practical iterative decoding algorithm based on message passing for decoding network coded correlated data from an incomplete set of network coded symbols. In Section VI, we illustrate the performance of the proposed decoding algorithm through simulations with synthetic data and with sets of correlated images obtained from video sequences.

II. RELATED WORK

The problem of decoding from an incomplete set of network coded data refers to the problem of reconstructing the source data from a set of network coded symbols that is not large enough for uniquely determining the values of the source symbols. When linear network coding in a finite field is considered, this translates into the problem of recovering the source symbols from a rank deficient system of linear equations in a finite field. This problem resembles classical inverse problems [3] in signal processing, which aim at recovering the signal from an incomplete set of observed linear measurements. However, while the latter are successfully solved using regularization techniques [4], the same methods cannot be applied to the source recovery problem in finite domains, due to the cyclic property of the finite fields.

Source recovery from an insufficient number of network coded data is an ill-posed problem. In order to solve it, additional assumptions on the structure of the solution, such as sparsity or smoothness, are required. Recovery of sparse signals from network coded data has been considered by several authors. Draper et al. [5] derive bounds on the error probability of a hypothetical decoder that recovers sparse sources from both noiseless and noisy measurements using the $\ell_0$-norm. The measurements are generated by randomly combining the source values in a finite field. Conditions on the number of measurements necessary for perfect recovery (with high probability) are provided, and connections to the corresponding results in real field compressed sensing are established. Other works have also investigated sparse source recovery under the network coding framework [6]–[8]. However, in these works the network coding operations are performed in the real field [9], which enables decoding using classical compressed sensing techniques [10] such as $\ell_1$-norm minimization. To the best of our knowledge, there is no practical scheme for recovering sparse data from incomplete network coded data in finite fields.

When the network coding operations are performed in a finite field, decoding from an incomplete set of network coded data is challenging and only a few works in the literature have proposed constructive solutions to this problem. The work in [11] presents an approximate decoding technique that exploits data similarity by matching the most correlated data in order to compensate for the missing network coded packets. The tradeoff between the data quantization and the finite field selection is studied, and the optimal field size for coding is established. In [12], robustness to lost network coded packets is achieved by combining network coding with multiple description coding (MDC). Redundancy is introduced through MDC, which results in graceful quality degradation as the number of missing network coded packets increases. The decoding is performed via mixed integer quadratic programming, which limits the choice of the finite field size to be equal to a prime number or to a power of a prime number.

Another class of decoding methods relies on the sum-product algorithm [13]. In [14], the authors first build a statistical model for each network coded packet, which reflects the packet's path in the network. Then, they combine these models with the source statistics in order to obtain the overall statistical model of the system. This model is used for maximum a posteriori (MAP) decoding, which is solved using the sum-product algorithm on the corresponding factor graph. A similar approach is presented in [15]. Note that the authors of [11] impose a deterministic model on the source data, by setting the most similar data to be equal, while the work in [14] assumes global knowledge of the network topology, the paths traversed by the packets and the transition probabilities, which is not realistic in large scale dynamic networks. The solution that we propose does not depend on the topology and requires only the knowledge of the statistical source priors and of the network coding coefficients, which are delivered to the receiver in the header of the network coded packet. In addition, our solution has significantly lower computational complexity compared to [14], [15].

From the theoretical perspective, the work by Ho et al. [16] is among the earliest attempts to characterize the joint source and network coding problem for correlated sources. The communication scenario in [16] consists of two arbitrarily correlated sources that communicate with multiple receivers over a general network where nodes perform randomized network coding. The authors provide upper bounds on the probability of decoding error as a function of network parameters, e.g., the min-cut capacities and the maximum source-receiver path length. The error exponents provided in [16] generalize the Slepian-Wolf error exponents for linear coding [17] and reduce to the latter in the special case where the network consists of separate links from each source to the receiver.

The limitations posed by the high complexity of the joint source-network coding problem have been partially addressed in several works on the practical design of joint distributed source and network coding solutions. For example, a practical solution to the problem of communicating correlated information from two sources to multiple receivers over a network has been proposed in [18]. The source correlation is represented by a binary symmetric channel (BSC). The transmission is based on random linear network coding [2], while the source compression is achieved by syndrome-based coding [19]. Despite the linear operations in the network, the desirable code structure is preserved and enables low-complexity syndrome-based decoding. The high complexity of the joint decoding of source and network codes has also motivated research efforts in the direction of separating the source and the network coding [20], [21]. However, it has been shown that source and network coding cannot be separated in general multicast scenarios.

Note that the goal of this paper is not to design a joint source and network coding scheme. We look at the complementary problem of recovering the source information from an incomplete set of network coded data. The key idea of our approach is to exploit the statistical properties of the sources, and in particular the source correlation, in order to provide approximate source reconstruction when the available network coded data is not sufficient for exact source recovery.

III. FRAMEWORK

We begin by describing the general framework for the transmission of sources with randomized linear network coding. We consider the system setup illustrated in Fig. 1. $S_1, S_2, \ldots, S_N$ form a set of $N$ possibly correlated sources. The sources produce a sequence of discrete symbols $x_1, x_2, \ldots, x_N$ that are transmitted to a set of receivers through an overlay network, where intermediate nodes perform random linear network coding (RLNC). Without loss of generality, we assume that the source symbols belong to a finite alphabet $\mathcal{X}$ that is a subset of the set of integer numbers ($\mathcal{X} \subset \mathbb{Z}$). For continuous sources, this can be achieved by quantizing the output of the sources with some quantizer $Q$. We also assume that the source alphabet $\mathcal{X}$ is common for all the sources. The symbol $x_n$ produced by the $n$-th source can be regarded as a realization of a discrete random variable $X_n$. Thus, we represent the source $S_n$ by the random variable $X_n$ with probability mass function $f_n(x) : \mathcal{X} \to [0, 1]$. We also define the joint probability mass function (pmf) $f(\mathbf{x}) : \mathcal{X}^N \to [0, 1]$, which represents the probability that the vector $\mathbf{x} = (x_1, x_2, \ldots, x_N)^T$ of symbol values is generated by the sources $S_1, S_2, \ldots, S_N$.

Prior to transmission, the source symbol $x_n$ is mapped to its corresponding representation $\hat{x}_n$ in a Galois field of size $q$ through a bijective mapping $F : \mathcal{X} \to F(\mathcal{X}) = \hat{\mathcal{X}} \subseteq \mathbb{F}_q$, such that

$$\hat{x}_n = F[x_n], \quad n = 1, \ldots, N \qquad (1)$$

and

$$x_n = F^{-1}[\hat{x}_n], \quad n = 1, \ldots, N. \qquad (2)$$

The size $q$ of the field is chosen such that $|\mathcal{X}| \leq q$, where $|\mathcal{X}|$ denotes the cardinality of the source alphabet.

Fig. 1. Proposed inter-session network coding framework: the sources $S_1, S_2, \ldots, S_N$ map their symbols $x_1, x_2, \ldots, x_N$ to field symbols via $F$, the network performs RLNC, and the receiver recovers the estimate $\mathbf{x}^*$ from $(\mathbf{y}, A)$ via decoding and the inverse mapping $F^{-1}$.

We denote as $\hat{X}_n$ the random variable that represents the $n$-th source described in the Galois field domain. The marginal probability mass functions $f_n(\hat{x}) : \mathbb{F}_q \to [0, 1]$ and the joint pmf $f(\hat{\mathbf{x}}) : \mathbb{F}_q^N \to [0, 1]$ of the source symbol values represented in a Galois field $\mathbb{F}_q$ can be obtained from $f_n(x)$ and $f(\mathbf{x})$ respectively by setting

$$f_n(\hat{x}) = \begin{cases} f_n(F^{-1}[\hat{x}]), & \text{if } \hat{x} \in \hat{\mathcal{X}} \\ 0, & \text{if } \hat{x} \in \mathbb{F}_q \setminus \hat{\mathcal{X}} \end{cases}$$

and

$$f(\hat{\mathbf{x}}) = \begin{cases} f(F^{-1}[\hat{\mathbf{x}}]), & \text{if } \hat{\mathbf{x}} \in \hat{\mathcal{X}}^N \\ 0, & \text{if } \hat{\mathbf{x}} \in \mathbb{F}_q^N \setminus \hat{\mathcal{X}}^N \end{cases}$$

where $F^{-1}$ is applied componentwise.

The source symbols given in $\mathbb{F}_q$ are transmitted to the receivers through intermediate network nodes. The intermediate network nodes perform random linear network coding and forward on their output links some random linear combinations of the symbols they receive. Thus, the $l$-th network coded symbol $y_l$ that reaches the receiver can be written as

$$y_l = \sum_{n=1}^{N} a_{ln} \hat{x}_n \qquad (3)$$

where $a_{ln} \in \mathbb{F}_q$ and all the arithmetic operations are performed in the Galois field $\mathbb{F}_q$. The coding coefficients at every intermediate node are chosen randomly according to a uniform distribution from the elements of the field $\mathbb{F}_q$ excluding the element 0. The global coding vectors $\mathbf{a}_l = (a_{l1}, a_{l2}, \ldots, a_{lN})$, $l = 1, 2, \ldots, L$, which result from several coding operations in different network nodes, are sent along with the network coded symbols to the receiver. Since the coding coefficients in the intermediate nodes are chosen uniformly at random, the elements of the resulting global encoding vector $\mathbf{a}_l$ are uniformly distributed over the Galois field $\mathbb{F}_q$. The coding coefficients are transmitted along with the encoded symbols to enable decoding at the receiver. In practice, in order to reduce the overhead introduced by the coding coefficients, several source symbols can be concatenated in a single packet and the same coding coefficient can be used for all the symbols within a packet [11].
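As an illustration of Eq. (3), the following sketch simulates the end-to-end encoding with random global coefficients. It is a minimal sketch, not the paper's implementation: it assumes a prime field size $q$, so that $\mathbb{F}_q$ arithmetic reduces to integer arithmetic modulo $q$ (extension fields such as GF($2^p$) would need polynomial arithmetic), and the function name rlnc_encode is ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def rlnc_encode(x_hat, L, q):
    """Generate L random network coded symbols y = A x (mod q), Eq. (3).

    x_hat : length-N array of source symbols in {0, ..., q-1} (prime q).
    Returns (y, A) with the global coding matrix A drawn i.i.d. over F_q.
    """
    N = len(x_hat)
    A = rng.integers(0, q, size=(L, N))   # global coding coefficients
    y = (A @ x_hat) % q                   # network coded symbols in F_q
    return y, A

# Example: N = 5 sources over F_7, only L = 3 coded symbols (rank deficient).
q = 7
x_hat = np.array([1, 4, 2, 6, 3])
y, A = rlnc_encode(x_hat, L=3, q=q)
```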

The receiver collects a set of $L$ network coded symbols represented by a vector $\mathbf{y} = (y_1, y_2, \ldots, y_L)^T$ such that

$$\mathbf{y} = A\hat{\mathbf{x}} \qquad (4)$$

where $A$ is a $L \times N$ matrix of global coding coefficients $\{a_{ln}\}$ and $\hat{\mathbf{x}} = (\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_N)^T$ is the vector of source symbols represented in $\mathbb{F}_q$. Given the vector of network coded symbols $\mathbf{y}$ and the matrix of coding coefficients $A$, the decoder provides an estimation $\hat{\mathbf{x}}^*$ of the vector $\hat{\mathbf{x}}$, which is subsequently mapped to the estimation $\mathbf{x}^*$ of the vector $\mathbf{x} = (x_1, x_2, \ldots, x_N)^T$ of source symbols through the inverse mapping $F^{-1} : \hat{\mathcal{X}} \to \mathcal{X}$, such that

$$x_n^* = F^{-1}[\hat{x}_n^*], \quad n = 1, 2, \ldots, N \qquad (5)$$

where $\hat{x}_n^* \in \hat{\mathcal{X}} \subseteq \mathbb{F}_q$ and $x_n^* \in \mathcal{X}$.

If the rank of the matrix of coding coefficients $A$ is equal to the number of source symbols $N$ (a necessary condition is that the number of encoded symbols $L$ is greater than or equal to the number of sources $N$, i.e., $L \geq N$), then the matrix $A$ is invertible and the source symbols can be perfectly decoded, i.e., $\hat{x}_n^* = \hat{x}_n$, $\forall n$. However, if the number of linearly independent network coded symbols is smaller than $N$, conventional decoding methods, such as Gaussian elimination, can generally not recover any source symbol from Eq. (4). In this paper we focus exactly on this specific case, where only incomplete information is available at the decoder. Our first objective is to provide a theoretical characterization of the performance of the proposed framework in terms of decoding error probability. Then, we design a practical decoding method that permits an approximate reconstruction $\mathbf{x}^*$ of the source symbols $\mathbf{x}$ with the help of some prior knowledge about the sources, which partially compensates for the missing coded symbols.
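The rank condition above can be checked with textbook Gaussian elimination over $\mathbb{F}_q$; a sketch under the same prime-field assumption as the previous snippet (rank_gfq is a hypothetical helper, not from the paper):

```python
import numpy as np

def rank_gfq(A, q):
    """Rank of matrix A over the prime field F_q via Gaussian elimination.

    When the rank equals N, Eq. (4) has a unique solution and exact decoding
    succeeds; otherwise the system is rank deficient and only approximate
    decoding is possible.
    """
    A = np.array(A) % q
    rows, cols = A.shape
    rank = 0
    for c in range(cols):
        # Find a pivot row with a non-zero entry in column c.
        pivot = next((r for r in range(rank, rows) if A[r, c] != 0), None)
        if pivot is None:
            continue
        A[[rank, pivot]] = A[[pivot, rank]]
        inv = pow(int(A[rank, c]), q - 2, q)   # Fermat inverse, q prime
        A[rank] = (A[rank] * inv) % q
        for r in range(rows):
            if r != rank and A[r, c] != 0:
                A[r] = (A[r] - A[r, c] * A[rank]) % q
        rank += 1
    return rank
```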

IV. PERFORMANCE ANALYSIS UNDER MAP DECODING

In this section, we analyze the performance of the network coding system presented in Section III. In particular, we derive an upper bound on the error probability of a MAP (maximum a posteriori) decoder. A MAP decoder selects the sequence of source symbols $\mathbf{x}^*$ that maximizes the a posteriori probability given the observation $\mathbf{y}$ of $L$ network coded symbols and the matrix $A$ of coding coefficients. The maximum a posteriori decoding rule is optimal in the sense that it minimizes the probability of decoding error for a given set of source sequences and a given encoding procedure [22]. Though exact MAP decoding for general linear codes is known to be NP-complete and its implementation complexity is prohibitively high, especially for long sequences [23], it can still give useful insights into the performance limits of our framework. In order to establish the upper bound on the decoding error probability, we follow the development of [24], where the author studies the problem of source coding with side information. Indeed, the framework that we examine bears some resemblance to the source coding problem and can also be viewed as an instance of distributed in-network compression.

Formally, the MAP decoding rule can be written as

$$\mathbf{x}^* = \arg\max_{\mathbf{x} \in \mathcal{X}^N} p(\mathbf{x}|\mathbf{y}, A) = \arg\max_{\mathbf{x} \in \mathcal{X}^N} \frac{p(\mathbf{y}|\mathbf{x}, A)\, p(\mathbf{x}|A)}{p(\mathbf{y}|A)} \qquad (6)$$

The probability $p(\mathbf{y}|\mathbf{x}, A)$ of receiving a vector $\mathbf{y}$ of network coded symbols, conditioned on the event that a vector $\mathbf{x}$ of source symbols has been transmitted and encoded with the coding matrix $A$, is equal to 1 if $A\mathbf{x} = \mathbf{y}$, and equal to 0 otherwise. Given the fact that $p(\mathbf{y}|A)$ does not depend on $\mathbf{x}$ and that the vector $\mathbf{x}$ and the matrix $A$ are statistically independent, i.e., $p(\mathbf{x}|A) = p(\mathbf{x}) = f(\mathbf{x})$, the decoding rule presented in Eq. (6) can be further simplified as

$$\mathbf{x}^* = \arg\max_{\mathbf{x} \in \mathcal{X}^N} \mathbb{1}_{\{A,\mathbf{y}\}}(\mathbf{x}) f(\mathbf{x}) \qquad (7)$$

where $\mathbb{1}_{\{A,\mathbf{y}\}}(\mathbf{x})$ is an indicator function defined as

$$\mathbb{1}_{\{A,\mathbf{y}\}}(\mathbf{x}) = \begin{cases} 1, & \text{if } A\mathbf{x} = \mathbf{y} \\ 0, & \text{otherwise} \end{cases} \qquad (8)$$
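For very small $N$, the rule of Eq. (7) can be evaluated by brute force, which makes the role of the indicator function explicit; a sketch assuming a prime field and a joint pmf stored as an array indexed by symbol tuples (the names are ours, and the enumeration is exponential in $N$):

```python
from itertools import product
import numpy as np

def map_decode(y, A, f, q):
    """Brute-force MAP decoder of Eq. (7): enumerate all x in F_q^N,
    keep those satisfying Ax = y (mod q), return the most probable one.
    f : array of shape (q,)*N with f[x] the prior probability of tuple x.
    """
    N = A.shape[1]
    best, best_prob = None, -1.0
    for x in product(range(q), repeat=N):
        if np.all((A @ np.array(x)) % q == y % q) and f[x] > best_prob:
            best, best_prob = x, f[x]
    return best
```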

Proposition 1. For a coding matrix $A$ with i.i.d. entries that are uniformly distributed over the elements of a Galois field $\mathbb{F}_q$ of size $q$, the probability of error $P_e$ under the MAP decoding rule described in Eq. (7) is upper bounded by

$$P_e \leq \min_{0 \leq \rho \leq 1} q^{-\rho L} \Big[ \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1}{1+\rho}} \Big]^{1+\rho} \qquad (9)$$

where $L$ is the number of network coded symbols available at the decoder, $f(\mathbf{x})$ is the joint probability mass function of the vector of source symbols, $\rho$ is a scalar with $\rho \in [0, 1]$, and all the network coding operations are performed in the Galois field of size $q$.

Proof: When a vector $\mathbf{x}$ of source symbols is transmitted, a vector of network coded symbols $\mathbf{y} = A\mathbf{x}$ is observed at the decoder, where the entries of $A$ are i.i.d. uniform random variables in $\mathbb{F}_q$. An error occurs in the MAP decoder of Eq. (7) when there exists a sequence $\mathbf{w} \in \mathcal{X}^N$, with $\mathbf{w} \neq \mathbf{x}$, such that the probability $f(\mathbf{w}) \geq f(\mathbf{x})$ and $A\mathbf{w} = \mathbf{y}$. Therefore, we can write the conditional probability of error as the probability of the union of events where $A\mathbf{w} = \mathbf{y}$, for all vectors $\mathbf{w} \neq \mathbf{x}$ such that $f(\mathbf{w}) \geq f(\mathbf{x})$

$$\Pr\{\mathrm{error}|\mathbf{x}\} = \Pr\Big\{ \bigcup_{\substack{\mathbf{w} \in \mathcal{X}^N, \mathbf{w} \neq \mathbf{x}: \\ f(\mathbf{w}) \geq f(\mathbf{x})}} A\mathbf{w} = \mathbf{y} \Big\} \qquad (10)$$

Applying the union bound and using the property that for any set of events $\{B_i\}$

$$\Pr\Big\{ \bigcup_i B_i \Big\} \leq \Big[ \sum_i \Pr\{B_i\} \Big]^{\rho} \qquad (11)$$

for any $0 \leq \rho \leq 1$ [22], Eq. (10) yields

$$\Pr\{\mathrm{error}|\mathbf{x}\} \leq \Big[ \sum_{\substack{\mathbf{w} \in \mathcal{X}^N, \mathbf{w} \neq \mathbf{x}: \\ f(\mathbf{w}) \geq f(\mathbf{x})}} \Pr\{A\mathbf{w} = \mathbf{y}\} \Big]^{\rho} \qquad (12)$$

The probability that the vector $\mathbf{w} \in \mathcal{X}^N$ satisfies the network coding constraints, namely the probability that $A\mathbf{w} = \mathbf{y}$, can be written as

$$\Pr\{A\mathbf{w} = \mathbf{y}\} = \Pr\{A\mathbf{w} = A\mathbf{x}\} = \Pr\{A(\mathbf{w} - \mathbf{x}) = \mathbf{0}\} = \Pr\{A\mathbf{z} = \mathbf{0}\} = \Pr\Big\{ \bigcap_{l=1}^{L} \Big( \sum_{n=1}^{N} a_{ln} z_n = 0 \Big) \Big\} \qquad (13)$$

where $\mathbf{z} = \mathbf{w} - \mathbf{x}$, with $\mathbf{z} \in \mathbb{F}_q^N$, and the subtraction is performed in $\mathbb{F}_q$. The vector $\mathbf{z}$ has at least one non-zero element, since $\mathbf{w} \neq \mathbf{x}$. Since the entries of the coding matrix $A$ are i.i.d. uniform random variables in $\mathbb{F}_q$ with $\Pr\{a_{ln} = \alpha\} = q^{-1}$, $\forall \alpha \in \mathbb{F}_q$, it holds that $\Pr\{a_{ln} z_n = \alpha\} = q^{-1}$, $\forall \alpha \in \mathbb{F}_q$ and $\forall z_n \in \mathbb{F}_q \setminus \{0\}$. Thus, Eq. (13) can be further written as

$$\Pr\{A\mathbf{w} = \mathbf{y}\} = \prod_{l=1}^{L} \Pr\Big\{ \sum_{n=1}^{N} a_{ln} z_n = 0 \Big\} = \prod_{l=1}^{L} q^{-1} = q^{-L} \qquad (14)$$

where $L$ is the number of network coded symbols available at the decoder. Substituting the value that we obtain from Eq. (14) into Eq. (12), and using Gallager's bounding technique [22], the conditional probability of error is upper bounded by

$$\Pr\{\mathrm{error}|\mathbf{x}\} \leq \Big[ \sum_{\substack{\mathbf{w} \in \mathcal{X}^N, \mathbf{w} \neq \mathbf{x}: \\ f(\mathbf{w}) \geq f(\mathbf{x})}} q^{-L} \Big]^{\rho} \qquad (15)$$

$$\leq \Big[ \sum_{\substack{\mathbf{w} \in \mathcal{X}^N, \mathbf{w} \neq \mathbf{x}: \\ f(\mathbf{w}) \geq f(\mathbf{x})}} \Big( \frac{f(\mathbf{w})}{f(\mathbf{x})} \Big)^{\frac{1}{1+\rho}} q^{-L} \Big]^{\rho} \qquad (16)$$

$$= q^{-\rho L} f(\mathbf{x})^{-\frac{\rho}{1+\rho}} \Big[ \sum_{\substack{\mathbf{w} \in \mathcal{X}^N, \mathbf{w} \neq \mathbf{x}: \\ f(\mathbf{w}) \geq f(\mathbf{x})}} f(\mathbf{w})^{\frac{1}{1+\rho}} \Big]^{\rho} \qquad (17)$$

$$\leq q^{-\rho L} f(\mathbf{x})^{-\frac{\rho}{1+\rho}} \Big[ \sum_{\mathbf{w} \in \mathcal{X}^N} f(\mathbf{w})^{\frac{1}{1+\rho}} \Big]^{\rho} \qquad (18)$$

where Eq. (16) results from Eq. (15) by multiplying each term of the summation by $\big(f(\mathbf{w})/f(\mathbf{x})\big)^{\frac{1}{1+\rho}} \geq 1$. Eq. (17) is further upper bounded by relaxing the constraint $f(\mathbf{w}) \geq f(\mathbf{x})$ in the summation and by summing over all $\mathbf{w} \in \mathcal{X}^N$. Finally, the average error probability $P_e$ is

$$P_e = \sum_{\mathbf{x} \in \mathcal{X}^N} \Pr\{\mathrm{error}|\mathbf{x}\}\, f(\mathbf{x}) \leq \sum_{\mathbf{x} \in \mathcal{X}^N} q^{-\rho L} f(\mathbf{x})^{-\frac{\rho}{1+\rho}} \Big[ \sum_{\mathbf{w} \in \mathcal{X}^N} f(\mathbf{w})^{\frac{1}{1+\rho}} \Big]^{\rho} f(\mathbf{x})$$
$$= q^{-\rho L} \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1}{1+\rho}} \Big[ \sum_{\mathbf{w} \in \mathcal{X}^N} f(\mathbf{w})^{\frac{1}{1+\rho}} \Big]^{\rho} = q^{-\rho L} \Big[ \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1}{1+\rho}} \Big] \Big[ \sum_{\mathbf{w} \in \mathcal{X}^N} f(\mathbf{w})^{\frac{1}{1+\rho}} \Big]^{\rho} = q^{-\rho L} \Big[ \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1}{1+\rho}} \Big]^{1+\rho} \qquad (19)$$

Since the inequality in Eq. (19) holds for any $0 \leq \rho \leq 1$, we obtain the bound in Eq. (9).

Let us now further investigate the behavior of the upper bound provided in Eq. (9) with respect to the parameters

$L$, $q$ and the joint probability mass function $f(\mathbf{x})$, and determine the value of $\rho$ that minimizes the expression of the upper bound. Let us first denote as $E(\rho)$, $0 \leq \rho \leq 1$, the base-2 logarithm of the upper bound on the error probability given in Eq. (9)

$$E(\rho) = -\rho L \log_2 q + (1+\rho) \log_2 \Big[ \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1}{1+\rho}} \Big] \qquad (20)$$

We now use the following proposition to show the convexity of $E(\rho)$.

Proposition 2. The function $E(\rho)$ defined in Eq. (20) is a convex function of $\rho$ for $\rho \geq 0$, with strict convexity unless the random vector $\mathbf{X} = (X_1, X_2, \ldots, X_N)$ is uniformly distributed, i.e., $f(\mathbf{x}) = \frac{1}{|\mathcal{X}^N|}$, $\forall \mathbf{x} \in \mathcal{X}^N$.

The proof of Proposition 2 is given in the Appendix. The first partial derivative of $E(\rho)$ is equal to

$$\frac{\partial E(\rho)}{\partial \rho} = -L \log_2 q + H_\rho(\mathbf{x}) \qquad (21)$$

where $H_\rho(\mathbf{x})$ denotes the entropy of the tilted distribution $f_\rho(\mathbf{x})$ with

$$f_\rho(\mathbf{x}) = \frac{f(\mathbf{x})^{\frac{1}{1+\rho}}}{\sum_{\mathbf{x}' \in \mathcal{X}^N} f(\mathbf{x}')^{\frac{1}{1+\rho}}} \qquad (22)$$

and

$$H_\rho(\mathbf{x}) = -\sum_{\mathbf{x} \in \mathcal{X}^N} f_\rho(\mathbf{x}) \log_2 f_\rho(\mathbf{x}) \qquad (23)$$

Since $E(\rho)$ is convex, its second partial derivative with respect to $\rho$ is non-negative. Thus, we have

$$\frac{\partial^2 E(\rho)}{\partial \rho^2} = \frac{\partial H_\rho(\mathbf{x})}{\partial \rho} \geq 0 \qquad (24)$$

From Eq. (24) it follows that the entropy $H_\rho(\mathbf{x})$ of the tilted distribution $f_\rho(\mathbf{x})$ is a non-decreasing function of $\rho$, for $0 \leq \rho \leq 1$. We can, therefore, identify the following three cases regarding the upper bound on the probability of error:

• If $L \log_2 q < H_\rho(\mathbf{x})|_{\rho=0} = H(\mathbf{x})$, then $\frac{\partial E(\rho)}{\partial \rho} > 0$ for all $0 \leq \rho \leq 1$. Given that $E(\rho)$ is convex and that its first partial derivative does not change sign and is positive in the interval $0 \leq \rho \leq 1$, the value of $\rho$ that minimizes the upper bound is $\rho = 0$. Thus, we obtain the trivial bound

$$P_e \leq 1$$

• If $L \log_2 q > H_\rho(\mathbf{x})|_{\rho=1}$, then $\frac{\partial E(\rho)}{\partial \rho} < 0$ for all $0 \leq \rho \leq 1$. Given that $E(\rho)$ is convex and that its first partial derivative does not change sign and is negative in the interval $0 \leq \rho \leq 1$, the value of $\rho$ that minimizes the upper bound is $\rho = 1$. Thus, we obtain the bound

$$P_e \leq q^{-L} \Big[ \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1}{2}} \Big]^2$$

• Finally, if $H(\mathbf{x}) \leq L \log_2 q \leq H_\rho(\mathbf{x})|_{\rho=1}$, the value of $\rho$ that minimizes the expression given in Eq. (20) and, therefore, the upper bound on the error probability in Eq. (9), can be obtained analytically by setting the first partial derivative of $E(\rho)$ equal to 0

$$L \log_2 q = H_\rho(\mathbf{x}) \qquad (25)$$

In this case, the upper bound on the probability of error is given by

$$P_e \leq q^{-\rho^* L} \Big[ \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1}{1+\rho^*}} \Big]^{1+\rho^*}$$

where $\rho^*$ is the solution of Eq. (25).

We now illustrate the behavior of the upper bound on the error probability for a particular instance of the network coding framework presented in Section III. We consider a scenario with $N$ correlated sources that generate discrete symbols from the alphabet $\mathcal{X}$. We assume that the joint probability mass function of the sources, $f(\mathbf{x})$, can be factorized as follows

$$f(\mathbf{x}) = \prod_{i=1}^{N} f(x_i|x_{i-1}) \qquad (26)$$

(with $f(x_1|x_0)$ understood as the marginal $f(x_1)$), and we set the conditional probability mass functions to be equal to

$$f(x_i|x_{i-1}) = \frac{1}{K} \frac{1-p}{1+p}\, p^{|x_i - x_{i-1}|}, \quad p \in (0, 1) \qquad (27)$$

where $K$ is a normalization constant such that $\sum_{x_i=0}^{q-1} f(x_i|x_{i-1}) = 1$. For a given value of $x_{i-1}$, the distribution in Eq. (27) represents a shifted and truncated discrete Laplacian distribution [25]. The smaller the value of the parameter $p$ in Eq. (27), the higher the correlation between the variables $X_{i-1}$ and $X_i$. We also assume that $F(\mathcal{X}) \equiv \mathbb{F}_q$ for some value of $q$ and that $X_1$ is uniformly distributed over $\mathcal{X}$. The source symbols are transmitted to the receiver through an overlay network, where intermediate network nodes perform randomized linear network coding. The receiver collects a set of $L$ network coded symbols and reconstructs the source data by applying the MAP decoding rule given in Eq. (7).
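Under the chain factorization of Eq. (26), the sum $\sum_{\mathbf{x}} f(\mathbf{x})^{1/(1+\rho)}$ needed by Eq. (20) factorizes along the chain and can be computed by dynamic programming instead of enumerating $\mathcal{X}^N$. The sketch below is our own illustration, assuming $F(\mathcal{X}) \equiv \mathbb{F}_q$ and $X_1$ uniform; a simple grid search over $\rho$ stands in for the three-case analysis above, which is sufficient here because $E(\rho)$ is convex by Proposition 2.

```python
import numpy as np

def laplacian_kernel(q, p):
    """Conditional pmf matrix T[j, i] = f(x_i = i | x_{i-1} = j), Eq. (27)."""
    d = np.abs(np.arange(q)[:, None] - np.arange(q)[None, :])
    T = (1 - p) / (1 + p) * p ** d
    return T / T.sum(axis=1, keepdims=True)   # row normalization (the 1/K factor)

def upper_bound(L, q, p, N, n_grid=1000):
    """Numerically minimize E(rho) of Eq. (20) over rho in [0, 1], Eq. (9).

    Sum_x f(x)^s factorizes along the chain of Eq. (26) and is computed
    by a forward DP pass: v <- v @ (T ** s), starting from f1 ** s.
    """
    T = laplacian_kernel(q, p)
    f1 = np.full(q, 1.0 / q)                  # X_1 uniform over F_q
    best = 0.0                                # rho = 0 gives the trivial bound
    for rho in np.linspace(0.0, 1.0, n_grid):
        s = 1.0 / (1.0 + rho)
        v = f1 ** s
        for _ in range(N - 1):
            v = v @ (T ** s)                  # elementwise power, then DP step
        E = -rho * L * np.log2(q) + (1 + rho) * np.log2(v.sum())
        best = min(best, E)
    return 2.0 ** best                        # upper bound on P_e

# Example: N = 30 sources, F_32, p = 0.1, L = 20 received coded symbols.
print(upper_bound(L=20, q=32, p=0.1, N=30))
```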

Fig. 2 illustrates the upper bound on the decoding error probability for different values of the parameter $p$ with respect to the number of network coded symbols $L$ available at the decoder. As expected, higher correlation (i.e., a lower value of $p$) leads to a lower probability of error and, thus, to a better performance of the MAP decoder. Note that the error probability decreases as the number of network coded symbols increases. However, even for $L \geq N$ the decoding error probability is non-zero. This is due to the fact that the entries of the coding matrix $A$ are uniformly distributed and the probability that rank$(A) < N$ for $L \geq N$ is non-zero, though this probability becomes negligible as the number of network coded symbols exceeds the number of original source symbols.

Fig. 2. Upper bound on the decoding error probability versus the number of network coded symbols $L$ available at the receiver for $N = 30$ sources, $p = \{0.05, 0.10, 0.15, 0.25, 0.50\}$, and Galois field size (a) $q = 32$ and (b) $q = 64$.

In Fig. 3, we show the evolution of the bound on the decoding error probability with the size of the Galois field that is used to represent the source data and perform the network coding operations. Here we assume that the cardinality of the source alphabet is equal to $q = 16$, while the size of the finite field used for data representation and network coding operations is equal to $q'$ with $q' \geq q$. From Fig. 3 we can see that, for a fixed number of network coded symbols, the bound on the probability of error decreases as the size of the Galois field increases, which indicates that by employing larger finite fields better performance can be achieved at the decoder for the same number of network coded symbols. This is due to the fact that, by using larger Galois fields for the representation of the source data, we introduce some redundancy that assists the decoder in eliminating candidate solutions that are not valid according to the data model. Moreover, the probability of generating linearly dependent network coded symbols is lower in larger fields; thus, every network coded symbol that arrives at the decoder brings novel information with higher probability and limits the solution space. However, this performance improvement comes at the cost of a larger number of bits that need to be transmitted to the receiver.

Fig. 3. Upper bound on the decoding error probability versus the number of network coded symbols for $N = 20$ sources, $p = 0.05$ and $q' = \{16, 32, 64, 128, 256\}$.

The theoretical analysis presented in this section is general and is valid for any set of sources with an arbitrary joint statistical model expressed in terms of the joint pmf $f(\mathbf{x})$. It allows us to understand the influence of the different design parameters on the decoding performance. However, the exact solution of the MAP decoding problem described in Eq. (7) requires the enumeration of all possible configurations of $N$ random variables that satisfy the network coding constraints. This becomes intractable as the number of sources and the size of the Galois field increase. In the next section, we propose a practical decoder that provides suboptimal, yet effective decoding performance by exploiting the knowledge of the pairwise correlation between the sources.

V. ITERATIVE DECODING VIA MESSAGE PASSING

We now propose a low-complexity iterative message passing algorithm for decoding network coded correlated data. The algorithm provides an estimation $\mathbf{x}^*$ of the transmitted sequence $\mathbf{x}$ of source values by jointly considering the network coding constraints and the prior statistical information about the source values. The proposed decoding algorithm is a variant of the standard Belief Propagation (BP) algorithm [26], [27] with source priors that are directly incorporated in the messages exchanged over the factor graph.

Fig. 4. Examples of applications with correlated data sources and the corresponding undirected graphical models that represent the statistical relationships among the sources: (a) a sensor network, where the data sources are spatially correlated within a distance $d$; (b) image sequences, where the data sources are temporally correlated within a correlation window of width $D$.

In the following subsections, we first introduce a simple correlation model that describes the statistical dependencies among the source values. This model permits efficient decoding with low computational complexity. Next, we provide the factor graph representation that captures the constraints imposed on the source values by the network coding operations. Finally, we present in detail the message passing algorithm that integrates the knowledge of the statistical information about the sources and the network coding constraints. The algorithm operates by iteratively exchanging beliefs on the values of the source symbols between the factor graph nodes. As soon as these beliefs converge to a valid solution that satisfies the network coding constraints, the decoder outputs an estimation of the original source symbols.

A. Statistical source priors

We focus on the case where the sources are linearly correlated and we assume, without loss of generality, that the correlation coefficient $\rho_{ij}$ between the variables $X_i$ and $X_j$, $i \neq j$, is non-negative ($\rho_{ij} \geq 0$), i.e., the sources are positively correlated. In this case, the pairwise statistical relationships between the sources can be efficiently represented as a correlation noise. Hence, for every pair of sources $S_i$ and $S_j$, $i, j \in \{1, 2, \ldots, N\}$ with $i \neq j$, we define the discrete random variable $W_m$ such that

$$W_m = X_i - X_j, \quad m = 1, 2, \ldots, M \qquad (28)$$

where $M$ is the number of correlated pairs of sources. We denote as $g_m(w) : \mathcal{W} \to [0, 1]$ the probability mass function of the random variable $W_m$, where $\mathcal{W}$ is the set of all possible values of the difference of random variables $X_i - X_j$. Note that for a pair of sources with a negative correlation coefficient $\rho_{ij}$, a similar approach can be used by defining the correlation noise between the random variables $X_i$ and $-X_j$.
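As a concrete illustration of Eq. (28), the pmf $g_m(w)$ can be tabulated directly from the joint pmf of a source pair; a minimal sketch (the helper name is ours):

```python
def correlation_noise_pmf(joint, q):
    """pmf g(w) of W = X_i - X_j (Eq. (28)) from the joint pmf of (X_i, X_j).

    joint : (q, q) array-like with joint[a][b] = Pr{X_i = a, X_j = b}.
    Returns a dict mapping each difference w in {-(q-1), ..., q-1} to g(w).
    """
    g = {}
    for a in range(q):
        for b in range(q):
            g[a - b] = g.get(a - b, 0.0) + joint[a][b]
    return g
```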

The number of correlated pairs of variables can be as high as $N(N-1)/2$ in the case where every source is correlated with each of the $N-1$ remaining sources. In practice, the source sequence is usually characterized by some local structure, for example localized spatial or temporal correlation. This localization of statistical dependencies limits the number of source pairs with significant correlation coefficients. The correlated source pairs can be identified by observing the local interactions in the corresponding physical process captured by the sources.

It is convenient to represent the correlation structure by an undirected graph $G = (\mathcal{V}, \mathcal{E})$ where each vertex corresponds to a source and two vertices $i$ and $j$ are connected with an undirected edge if the sources $S_i$ and $S_j$ are correlated, i.e., $\rho_{ij} \neq 0$. Two examples of applications with correlated data sources and the corresponding undirected graphs that capture the correlation between source pairs are illustrated in Fig. 4. In Fig. 4(a), we present a network of sensors. In this scenario, it is reasonable to assume that the value of the signal acquired by the sensor $S_i$ is correlated with the values generated by the sensors positioned within the distance $d$, with the correlation coefficient $\rho_{ij}$ decaying as the distance between the sensors increases. Fig. 4(b) shows a sequence of images captured at different time instants, where the pixel values in the $i$-th image are correlated with the pixel values of every other image that falls within the same time window of width $D$.

Fig. 5. Factor graph: the variable nodes $x_1, \ldots, x_n, \ldots, x_N$ (source symbols) are connected to the check nodes $\mathbb{1}_{\{A_1,y_1\}}(\mathbf{x}), \ldots, \mathbb{1}_{\{A_l,y_l\}}(\mathbf{x}), \ldots, \mathbb{1}_{\{A_L,y_L\}}(\mathbf{x})$ (network coding constraints).

B. Factor graph representation

The constraints imposed by network coding operations can be described by a factor graph like the one illustrated in Fig. 5. This bipartite graph consists of $N$ variable nodes and $L$ check nodes that form the basis of the message passing algorithm used for decoding. The variable nodes represent the source symbols, which are the unknowns of our problem. The check nodes represent the network coding constraints imposed on the source symbols. The $l$-th check node corresponds to the $l$-th network coded symbol and is connected to all the variable nodes that participate in that particular network coding combination. The check node is associated with the indicator function $\mathbb{1}_{\{A_l,y_l\}}(\mathbf{x})$, which is defined as

$$\mathbb{1}_{\{A_l,y_l\}}(\mathbf{x}) = \begin{cases} 1, & \text{if } A_l \mathbf{x} = y_l \\ 0, & \text{otherwise} \end{cases} \qquad (29)$$

The indicator function $\mathbb{1}_{\{A_l,y_l\}}(\mathbf{x})$ takes the value 1 when a certain configuration $\mathbf{x}$ satisfies the equation defined by the $l$-th row of the coding matrix $A$ and the value of the $l$-th network coded symbol $y_l$, and the value 0 otherwise. The degree of the $l$-th check node is equal to the number of non-zero network coding coefficients in the $l$-th row of the coding matrix $A$, while the total number of check nodes is equal to the number $L$ of network coded symbols available at the decoder.

One of the parameters that influence the computational complexity and the convergence speed of message passing algorithms is the degree of the check nodes. Recall that, in our setting, the intermediate nodes perform randomized linear network coding. These network operations result in a dense matrix of coding coefficients. In other words, every network coded symbol is a linear combination of nearly all the source symbols. Thus, the degree of the check nodes that correspond to the network coding constraints is nearly $N$. In order to reduce the degree of the check nodes, we preprocess the matrix of coding coefficients $A$ before constructing the factor graph. This preprocessing consists in performing elementary row operations on the original matrix $A$ in order to eliminate some of the coding coefficients, so that the $l$-th row of the new coding matrix $A'$ is of the form $\mathbf{a}'_l = (0 \ldots 0\ a'_{li} \ldots a'_{lj}\ 0 \ldots 0)$ with $a'_{lk} = 0$ for $k < i$ and $k > j$, and $a'_{lk} \in \mathbb{F}_q$ for $i \leq k \leq j$. The new matrix $A'$ has the same null space as the matrix $A$. The vector of network coded symbols $\mathbf{y}$ is subject to the same elementary row operations, yielding $\mathbf{y}'$, such that the systems of equations $\mathbf{y} = A\mathbf{x}$ and $\mathbf{y}' = A'\mathbf{x}$ have the same set of solutions. The number of non-zero coefficients in every row of the new matrix $A'$ depends in general on the rank of the matrix $A$, and increases as the rank decreases. It also depends on the particular values of the coefficients in the original matrix. Hence, the exact number of non-zero coefficients per row cannot be predicted. Finally, this preprocessing of the original coding matrix $A$ helps to identify and eliminate the non-innovative network coded symbols.
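One simple way to emulate this preprocessing is to bring the augmented system $[A\,|\,\mathbf{y}]$ to reduced row echelon form over $\mathbb{F}_q$, which both sparsifies the rows and exposes the non-innovative (all-zero) rows. The sketch below reuses the elimination pattern of rank_gfq above; it is a stand-in for, not necessarily identical to, the authors' elimination procedure, and again assumes a prime field.

```python
import numpy as np

def preprocess(A, y, q):
    """Row-reduce [A | y] over the prime field F_q.

    Returns (A', y') with the same solution set as (A, y), fewer
    non-zero coefficients per row, and all-zero (non-innovative)
    rows removed.
    """
    M = np.concatenate([A % q, (y % q)[:, None]], axis=1)
    rank, rows, cols = 0, *A.shape
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if M[r, c] != 0), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]
        M[rank] = (M[rank] * pow(int(M[rank, c]), q - 2, q)) % q
        for r in range(rows):
            if r != rank and M[r, c] != 0:
                M[r] = (M[r] - M[r, c] * M[rank]) % q
        rank += 1
    return M[:rank, :-1], M[:rank, -1]
```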

Note that, unlike the factor graphs used in the standard sum-product algorithm [13], the factor graph in Fig. 5 does not correspond to a factorization of some probability distribution function. It simply serves as a graphical model to capture the constraints imposed on the source symbols by the network coding operations.

C. Message passing algorithm

We now present our message passing algorithm for decoding of network coded correlated data that operates on the factor graph presented in Section V-B. We begin by introducing the notation used throughout this section. We first define the following sets:

• $\mathcal{L}(n) = \{l : a'_{ln} \neq 0\}$ as the set of all the check nodes that are neighbours of the $n$-th variable node, and
• $\mathcal{N}(l) = \{n : a'_{ln} \neq 0\}$ as the set of all the variable nodes that are neighbours of the $l$-th check node,

where the node $i$ is called a neighbour of the node $j$ if there exists an edge between nodes $i$ and $j$ in the factor graph. The messages that are exchanged over the factor graph are defined and interpreted as follows:

• $q_{nl}(a)$, $a \in \mathbb{F}_q$, denote the messages sent from the $n$-th variable node to the $l$-th check node. These messages represent the belief of the $n$-th variable node that the source symbol $x_n$ has the value $a$, given the information obtained from all the neighbours other than the $l$-th check node,
• $r_{ln}(a)$, $a \in \mathbb{F}_q$, denote the messages sent from the $l$-th check node to the $n$-th variable node. These messages represent the belief of the $l$-th check node that the source symbol $x_n$ has the value $a$, given the messages from all the variable nodes other than the $n$-th variable node.

Equipped with this notation, we now describe the decoding process in more detail. The algorithm takes as input the matrix $A'$, the vector $\mathbf{y}'$, the adjacency matrix $C_G$ of the undirected graph $G$, as well as the probability mass functions of the source symbols $f_n(x)$, $\forall n$, and the probability mass functions of the correlation noise variables $g_m(w)$, $\forall m$. The algorithm starts with the initialization of the messages. All the messages from the check nodes are initialized to one. The messages $q_{nl}(a)$, $a \in \mathbb{F}_q$, from the variable nodes are initialized with the prior probability mass functions of the source symbols

$$q_{nl}(a) = f_n(a) \qquad (30)$$

If the prior distributions on the source symbols are unknown, they are set to be uniform over $\mathcal{X}$.

Once the initialization step is completed, the algorithm proceeds with iterative message passing rounds among the factor graph nodes. In particular, at every iteration of the algorithm, beliefs are formed in the nodes and the variable node messages $q_{nl}(a)$, $a \in \mathbb{F}_q$, are updated in the following way:

$$q_{nl}(a) = \alpha_{nl} \prod_{l' \in \mathcal{L}(n) \setminus l} r_{l'n}(a) \qquad (31)$$

To elaborate, the message $q_{nl}(a)$ sent from the variable node $n$ to the check node $l$ is updated by multiplying the messages $r_{l'n}(a)$ previously received from all the check nodes $l' \in \mathcal{L}(n) \setminus l$ that are neighbours of the variable node $n$, except for the node $l$. The normalization constant $\alpha_{nl}$ is computed by setting $\sum_{a=0}^{q-1} q_{nl}(a) = 1$, so that $q_{nl}(a)$, $a \in \mathbb{F}_q$, is a valid probability distribution over $\mathbb{F}_q$. The message $q_{nl}(a)$ represents the $n$-th variable node's belief that the value of the $n$-th source is $a$, given the information from all the other check nodes except for the check node $l$.

Once the messages from the variable nodes to the check nodes have been computed, we update the messages in the opposite direction. The messages $r_{ln}(a)$, $a \in \mathbb{F}_q$, from the check nodes to the variable nodes are updated as:

$$r_{ln}(a) = \sum_{\{\mathbf{x} : x_n = a,\; y'_l = A'_l \mathbf{x}\}} \;\prod_{n' \in \mathcal{N}(l) \setminus n} \mu_{n'l}(x_{n'}) \qquad (32)$$

where $A'_l$ denotes the $l$-th row of the matrix $A'$ and

$$\mu_{n'l}(x_{n'}) = \begin{cases} g_{n'n}(x_{n'} - a)\, q_{n'l}(x_{n'}), & \text{if } \rho_{nn'} \neq 0 \\ q_{n'l}(x_{n'}), & \text{if } \rho_{nn'} = 0 \end{cases} \qquad (33)$$

The message $r_{ln}(a)$ sent from the check node $l$ to the variable node $n$ represents the belief that the value of the $n$-th source symbol is $a$ given (i) the beliefs $q_{n'l}(x_{n'})$ of all the neighbouring variable nodes $n' \in \mathcal{N}(l) \setminus n$ of the check node $l$ except for the variable node $n$, (ii) the $l$-th network coding constraint and (iii) the pairwise correlation constraints between the variable node $n$ and all the other neighbours $n' \in \mathcal{N}(l) \setminus n$ of the $l$-th check node. The summation in Eq. (32) is performed over all the configurations of vectors $\mathbf{x}$ that satisfy the condition on the $l$-th check node and have the value $a$ at the $n$-th position. The product term in Eq. (32) represents the belief of the $l$-th check node on a specific configuration of variables with the value of the $n$-th variable set to $a$. These beliefs incorporate the prior knowledge on the pairwise correlation between the $n$-th variable and the variables $n' \in \mathcal{N}(l) \setminus n$ through the weights that multiply the messages from the variable nodes $n' \in \mathcal{N}(l) \setminus n$, as can be seen from Eq. (33).

Algorithm 1 Proposed message passing decoding algorithm
1: Input: matrix $A'$; vector $\mathbf{y}'$; adjacency matrix $C_G$ of the undirected graph $G$; probability mass functions of the correlation noise $g_m(w)$, $\forall m$; probability mass functions of the source symbols $f_n(x)$, $\forall n$
2: Initialization: initialize the sets $\mathcal{L}(n) \leftarrow \{l : a'_{ln} \neq 0\}$ and $\mathcal{N}(l) \leftarrow \{n : a'_{ln} \neq 0\}$; initialize the messages $q_{nl}(a) \leftarrow f_n(a)$, $\forall a \in \mathbb{F}_q$, and $r_{ln}(a) \leftarrow 1$, $\forall a \in \mathbb{F}_q$; set $k \leftarrow 0$ and define $k_{\max}$
3: while $k < k_{\max}$ do
4:   Update the messages from variable to check nodes: $q_{nl}(a) \leftarrow \alpha_{nl} \prod_{l' \in \mathcal{L}(n) \setminus l} r_{l'n}(a)$, $\forall n$, $\forall l \in \mathcal{L}(n)$, $\forall a \in \mathbb{F}_q$
5:   Update the messages from check to variable nodes: $r_{ln}(a) \leftarrow \sum_{\{\mathbf{x} : x_n = a,\, y'_l = A'_l\mathbf{x}\}} \prod_{n' \in \mathcal{N}(l) \setminus n} \mu_{n'l}(x_{n'})$, $\forall l$, $\forall n \in \mathcal{N}(l)$, $\forall a \in \mathbb{F}_q$
6:   Estimate the values of the source symbols: $x^*_n \leftarrow \arg\max_{a \in \mathbb{F}_q} \prod_{l \in \mathcal{L}(n)} r_{ln}(a)$, $\forall n$
7:   if $A'\mathbf{x}^* == \mathbf{y}'$ then
8:     return $\mathbf{x}^*$
9:   else
10:    $k \leftarrow k + 1$
11:  end if
12: end while
13: return $\mathbf{x}^* = (\mathbb{E}[X_1], \mathbb{E}[X_2], \ldots, \mathbb{E}[X_N])^T$
14: Output: estimation of the sequence of source symbols $\mathbf{x}^*$

Therefore, higher belief values are assigned to configurations that agree with the pairwise correlation model, while configurations that deviate from the correlation model are assigned lower belief values.

At the end of each message passing round, the variable nodes form their beliefs on the values of the variables $X_n$ and the values of the variables are tentatively set to

$$x^*_n = \arg\max_{a \in \mathbb{F}_q} \prod_{l \in \mathcal{L}(n)} r_{ln}(a) \qquad (34)$$

If these values satisfy the network coding constraints imposed on the reconstruction of the sources, namely $\mathbf{y} = A\mathbf{x}^*$ (or, equivalently, $\mathbf{y}' = A'\mathbf{x}^*$), the solution is considered to be valid. In this case, the decoding stops and the decoder outputs the corresponding solution. If a valid solution is not found after the maximum number of iterations in the decoder has been reached, the decoder declares an error and sets $x^*_n$ to be equal to the expected value of $X_n$, i.e., $x^*_n = \mathbb{E}[X_n]$.

Algorithm 1 summarizes the steps of the proposed iterative decoding scheme. We employ a parallel schedule for the message update procedure, which implies that all the messages at the variable nodes are updated concurrently given the messages received from the check nodes at the previous stage. Similarly, all check nodes update and send their messages simultaneously. Note that other message passing schedules (e.g., serial) can be considered [28]; however, they are not studied in this work. A compact sketch of the overall decoding loop is given below.
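The following sketch pulls Eqs. (30)–(34) together into a toy decoder. It is a minimal illustration under simplifying assumptions, not the authors' implementation: it assumes a prime field and the identity mapping $F$ (so field values coincide with integer symbol values in the correlation terms of Eq. (33)), it performs the check update of Eq. (32) by brute-force enumeration (exponential in the check degree, hence toy sizes only), and it keeps the prior $f_n$ as an explicit factor in the variable update, which Eq. (31) otherwise carries only through the initialization.

```python
from itertools import product
import numpy as np

def mp_decode(Ap, yp, q, f, g, k_max=100):
    """Toy message passing decoder implementing Eqs. (30)-(34).

    Ap, yp : preprocessed coding matrix A' and vector y' over a prime F_q
    f      : list of length-q prior pmfs f_n
    g      : dict {(i, j): pmf dict of W = X_i - X_j} for correlated pairs
    """
    L, N = Ap.shape
    Ln = [np.flatnonzero(Ap[:, n]) for n in range(N)]          # L(n)
    Nl = [np.flatnonzero(Ap[l, :]) for l in range(L)]          # N(l)
    r = {(l, n): np.ones(q) for l in range(L) for n in Nl[l]}  # r_ln <- 1

    def corr_weight(src, dst, diff):
        """g_{n'n}(x_{n'} - a) of Eq. (33); 1.0 for uncorrelated pairs."""
        if (src, dst) in g:
            return g[(src, dst)].get(diff, 0.0)
        if (dst, src) in g:
            return g[(dst, src)].get(-diff, 0.0)
        return 1.0

    for _ in range(k_max):
        # Variable-to-check update, Eq. (31), with the prior kept explicit.
        qmsg = {}
        for n in range(N):
            for l in Ln[n]:
                m = f[n].copy()
                for l2 in Ln[n]:
                    if l2 != l:
                        m = m * r[(l2, n)]
                qmsg[(n, l)] = m / m.sum()
        # Check-to-variable update, Eqs. (32)-(33), brute force.
        for l in range(L):
            for n in Nl[l]:
                others = [n2 for n2 in Nl[l] if n2 != n]
                rn = np.zeros(q)
                for a in range(q):
                    for vals in product(range(q), repeat=len(others)):
                        s = Ap[l, n] * a + sum(Ap[l, n2] * v
                                               for n2, v in zip(others, vals))
                        if s % q != yp[l] % q:
                            continue                 # violates y'_l = A'_l x
                        w = 1.0
                        for n2, v in zip(others, vals):
                            w *= qmsg[(n2, l)][v] * corr_weight(n2, n, v - a)
                        rn[a] += w
                r[(l, n)] = rn
        # Tentative decision, Eq. (34), and network coding consistency test.
        beliefs = [np.prod([r[(l, n)] for l in Ln[n]], axis=0)
                   for n in range(N)]
        x_star = np.array([int(np.argmax(b)) for b in beliefs])
        if np.all((Ap @ x_star) % q == np.asarray(yp) % q):
            return x_star
    return None   # no valid solution found: the decoder declares an error
```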

D. Complexity

We now briefly discuss the computational complexity of the proposed message passing algorithm. The outgoing message $\mathbf{q}_{nl} = (q_{nl}(0), q_{nl}(1), \ldots, q_{nl}(q-1))$ of a degree-$d_v$ variable node consists of $q$ values that are computed by element-wise multiplication of $d_v - 1$ incoming messages $\mathbf{r}_{l'n} = (r_{l'n}(0), r_{l'n}(1), \ldots, r_{l'n}(q-1))$, $l' \in \mathcal{L}(n) \setminus l$. This update step requires $q(d_v - 2)$ multiplications and is followed by a normalization step, which requires $q - 1$ summations and $q$ divisions. Thus, the total number of operations performed in the variable node, per iteration, is $d_v(q(d_v - 2) + q - 1 + q)$, since the node sends messages on each of the $d_v$ outgoing edges. Hence, the computational complexity at every variable node per iteration is dominated by $O(d_v^2 q)$. In order to calculate the outgoing messages of a degree-$d_c$ check node, we first need to update the $d_c - 1$ vectors $\boldsymbol{\mu}_{n'l} = (\mu_{n'l}(0), \mu_{n'l}(1), \ldots, \mu_{n'l}(q-1))$, which requires $q(d_c - 1)$ multiplications at most. The sum-product expression in Eq. (32) is essentially a convolution in a Galois field [29] and can be computed using dynamic programming approaches [27]. It can be performed with $(d_c - 1)q^2$ multiplications and $(d_c - 1)q(q-1)$ summations. Thus, the check sum processing requires $2(d_c - 1)q^2$ operations per outgoing edge, while the total number of operations during the check node message update is $2d_c(d_c - 1)q^2$ and is dominated by $O(d_c^2 q^2)$. For Galois fields of size $q = 2^p$, the computational complexity of the check node message update can be further reduced by computing the convolution in the transform domain instead of the probability domain. This can be done by using the 2-Hadamard transform of order $q$ [29], which can be computed with complexity $O(q \log_2 q)$. Overall, the main factors that influence the complexity of the algorithm are the finite field size $q$, the number of variable nodes $N$ and check nodes $L$ and their degrees (i.e., the density of the coding matrix $A'$), and the number of iterations performed by the decoder.
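For $q = 2^p$, addition in $\mathbb{F}_q$ is bitwise XOR, so the check node convolution becomes an XOR-convolution that diagonalizes under the Walsh-Hadamard transform. The following small sketch is our own illustration of this transform-domain trick cited from [29]:

```python
import numpy as np

def wht(v):
    """Fast Walsh-Hadamard transform; the length of v must be 2^p."""
    v = np.array(v, dtype=float)
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            for j in range(i, i + h):
                v[j], v[j + h] = v[j] + v[j + h], v[j] - v[j + h]
        h *= 2
    return v

def xor_convolve(f, g):
    """(f * g)[c] = sum over a XOR b = c of f[a] g[b], in O(q log q).

    The WHT is self-inverse up to the factor 1/q."""
    return wht(wht(f) * wht(g)) / len(f)
```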

VI. SIMULATION RESULTS

A. Decoding performance with synthetic signals

We first evaluate the performance of our iterative decoding algorithm with synthetic signals. We study the impact of various system parameters on the decoding performance. For this purpose, we adopt the scenario introduced in [30]. We assume that $N$ sensors are distributed over a geographical area and measure some physical process. The sensors generate continuous real-valued signal samples $s_n$, $n = 1, 2, \ldots, N$, which are quantized with a uniform quantizer $Q[\cdot]$. The quantized samples $x_n = Q[s_n]$ are then transmitted using network coding to a receiver following the framework given in Fig. 1.

The vector $\mathbf{s} = (s_1, s_2, \ldots, s_N)^T$ of source values is assumed to be a realization of an $N$-dimensional random vector with multivariate normal distribution $\mathcal{N}(\mathbf{0}, \Sigma)$. The covariance matrix $\Sigma$ is defined as

$$\Sigma = \begin{pmatrix} 1 & \rho_{12} & \rho_{13} & \ldots & \rho_{1N} \\ \rho_{21} & 1 & \rho_{23} & \ldots & \rho_{2N} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho_{N1} & \rho_{N2} & \rho_{N3} & \ldots & 1 \end{pmatrix} \qquad (35)$$

The values of the correlation coefficients $\rho_{ij}$ decay exponentially with the distance $d_{ij}$ between the sensors and are calculated using the formula $\rho_{ij} = e^{-\beta d_{ij}}$, where $\beta > 0$. The constant $\beta$ controls the amount of correlation between sensor measurements for a given realization of the sensor network. The undirected graph $G$ that represents the pairwise correlation between the sources is fully connected, since the correlation coefficient is non-zero for all pairs of sources. The probability mass function $g_m(w)$ of the correlation noise $W_m$ for a pair of sources $S_i$ and $S_j$ is computed analytically by integrating the joint probability distribution $p(s_i, s_j)$ over the quantization intervals in order to obtain the joint pmf of the quantized samples, and then by summing over all configurations that yield the same correlation noise

$$g_m(w) = \sum_{x_i, x_j : x_i - x_j = w} f(x_i, x_j) = \sum_{x_i, x_j : x_i - x_j = w} \int_{I(x_i)} \int_{I(x_j)} p(s_i, s_j)\, ds_i\, ds_j \qquad (36)$$

where $I(x)$ denotes the quantization interval mapped to the symbol $x$.
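The inner integral of Eq. (36) can be sketched as follows for a pair of unit-variance sources with correlation coefficient $\rho$; the bin-edge representation of the quantizer and the helper name are ours, and the rectangle probabilities come from the bivariate normal CDF by inclusion-exclusion:

```python
import numpy as np
from scipy.stats import multivariate_normal

def quantized_joint(rho, edges):
    """Joint pmf of the quantized pair (inner integral of Eq. (36)):
    entry [xi, xj] integrates the bivariate normal over I(xi) x I(xj)."""
    mvn = multivariate_normal(mean=[0, 0], cov=[[1, rho], [rho, 1]])
    F = lambda u, v: mvn.cdf([u, v])
    q = len(edges) - 1
    J = np.empty((q, q))
    for i in range(q):
        for j in range(q):
            J[i, j] = (F(edges[i+1], edges[j+1]) - F(edges[i], edges[j+1])
                       - F(edges[i+1], edges[j]) + F(edges[i], edges[j]))
    return J

# Example: 3-bit uniform quantizer on [-4, 4], beta = 0.05, distance 0.3.
J = quantized_joint(rho=np.exp(-0.05 * 0.3), edges=np.linspace(-4, 4, 9))
# g_m(w) then follows by feeding J to correlation_noise_pmf from Section V-A.
```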

The marginal probability mass functions $f_n(x)$ are obtained in a similar way.

We first study the influence of the correlation on the decoding performance. We distribute $N = 20$ sensors uniformly over a unit square and vary the value of $\beta$ that directly controls the correlation among the sensor measurements. We choose $\beta = 0.01, 0.05, 0.1$ and $0.2$. For these values of $\beta$, the correlation coefficients of all possible pairs of sources belong to the intervals $[0.9898, 0.9997]$, $[0.9502, 0.9986]$, $[0.9027, 0.9972]$ and $[0.8153, 0.9943]$, respectively. The sensor measurements are quantized using either a 3-bit or a 4-bit uniform quantizer. The maximum number of iterations in the decoding algorithm is set to $k_{\max} = 100$ and the number of transmitted source sequences is set to $N_{samp} = 20000$, unless stated otherwise.

Fig. 6. Average decoding error rate versus the number of received network coded symbols $L$ for a sensor network with $N = 20$ sensors and various values of the parameter $\beta$. The sensor measurements are sampled from a multivariate normal distribution and are quantized with (a) a 3-bit ($q = 8$) uniform quantizer and (b) a 4-bit ($q = 16$) uniform quantizer.

Fig. 6 shows the decoding performance in terms of average error rate with respect to the number of received network coded symbols for finite field sizes $q = 8$ and $q = 16$. In this first set of experiments, the cardinality of the source alphabet, as well as the size of the finite field used for data representation and network coding operations, is equal to the number of quantization bins of the uniform quantizer. The average error rate is defined as the ratio between the number of erroneously decoded source sequences and the number of transmitted source sequences $N_{samp}$. We can see that the decoder may tolerate a few missing network coded symbols and still provide perfect reconstruction of the transmitted source sequences. In these cases, the correlation between the sources compensates for the missing network coded data. As the number of network coded symbols available at the decoder reduces, the performance of the decoder gradually deteriorates. The extent of this performance degradation depends on the correlation among the source symbols. We can observe that the stronger the correlation between the sources (i.e., the smaller the value of the parameter $\beta$), the fewer network coded symbols are required at the decoder to achieve a given decoding error rate. We also remark that the quantization has an impact on the correlation between the transmitted source symbols. For source data sampled from the same joint probability distribution $\mathcal{N}(\mathbf{0}, \Sigma)$, finer quantization leads to larger values of the correlation noise, which results in worse decoding performance. This can be observed in Fig. 6 by comparing the decoding performance for the same value of the parameter $\beta$ and for different values of the Galois field size $q$. In particular, for a given value of $\beta$ and for the same number of received network coded symbols $L$, the average error rate is higher when the source data is represented in larger Galois fields (i.e., when finer quantization is applied).

We next investigate the influence of the finite field size $q$ on the decoding performance when the field size is not necessarily equal to the cardinality of the quantized sources. We use the same sensor network topology as previously, with $N = 20$ sensors distributed uniformly over a unit square. The sensor measurements are quantized using a 3-bit uniform quantizer. Thus, the cardinality of the source alphabet is $q' = 8$. We vary the size $q$ of the Galois field that is used for representing the source data and performing the network coding operations in the network. The maximum number of iterations at the decoder is set to $k_{\max} = 100$. The results are obtained by transmitting $N_{samp} = 10000$ source sequences.

In Fig. 7, we present the average error rate versus the number of received network coded symbols L for different values of the Galois field size and two values of the parameter β, which controls the correlation among the sources. We can see that the decoding performance for a given number of received symbols L improves when larger Galois fields are used for the source data representation. The use of larger Galois fields for the data representation introduces more diversity in the coding operations, which results in numerous candidate solutions at the decoder that are not valid according to the source data model. The decoder is thus able to eliminate these solutions during the decoding process, and the decoding performance improves. Recall that the same behavior has been observed in the upper bound on the decoding error probability under the MAP decoding rule (see Fig. 3).


Fig. 7. Average decoding error rate versus the number of received network coded symbols L for a sensor network with N = 20 sensors and for various values of the Galois field size q (q = 8, 16, 32, 64) used for data representation and network coding operations. The sensor measurements are sampled from two different multivariate normal distributions (β = 0.01 and β = 0.05) and are quantized with a 3-bit (q′ = 8) uniform quantizer.

It is worth noting that this performance improvement comes at the cost of a larger number of bits that have to be transmitted in the network.

B. Decoding of correlated images

We now illustrate the behavior of our iterative decoding algorithm on the problem of decoding correlated images from an incomplete set of network coded packets. For our tests we use the video sequences Silent and Foreman in QCIF format. Each image in the sequences corresponds to a 144 × 176 grayscale image with pixel values in the range [0, 255]. We extract the first N = 15 consecutive images from each sequence and assume that the data transmitted by each source in Fig. 1 corresponds to one of these images. In order to obtain source data with different alphabet sizes, we quantize the images by discarding the least significant bits of the binary representation of the pixel values. The pixel values generated by the sources are directly mapped to Galois field values and transmitted with network coding to the receiver. At the receiver, the network coded symbols are decoded with the proposed iterative message passing algorithm described in Section V and mapped back to pixel values. The maximum number of iterations at the decoder is set to kmax = 100.
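A minimal sketch of this quantization step follows: n-bit symbols are obtained by discarding the 8 − n least significant bits of each pixel, so they map directly onto GF(2^n) values; the mid-bin offset used in the reconstruction below is our illustrative choice, not stated in the text.

```python
import numpy as np

# Sketch of the image quantization step: n-bit source symbols are obtained
# by discarding the (8 - n) least significant bits of each 8-bit pixel, so
# the symbols map directly onto GF(2^n) values. Re-centering in the bin on
# reconstruction (the "+ half" below) is an illustrative choice.
def quantize_pixels(img, n_bits):
    """img: uint8 array in [0, 255]; returns symbols in {0, ..., 2^n - 1}."""
    return img.astype(np.uint8) >> (8 - n_bits)

def dequantize_pixels(symbols, n_bits):
    half = 1 << (7 - n_bits)              # mid-point of the discarded range
    return (symbols.astype(np.int32) << (8 - n_bits)) + half

img = np.arange(0, 256, dtype=np.uint8).reshape(16, 16)  # toy "image"
sym = quantize_pixels(img, 5)             # 5-bit symbols -> GF(32)
rec = dequantize_pixels(sym, 5)
print(sym.max(), np.abs(rec - img.astype(np.int32)).max())  # 31, <= 4
```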

The temporal correlation between the images in the video sequences is modeled with a zero-mean discrete Laplacian distribution [25]. This model is widely used in Distributed Video Coding (DVC) [31], [32], where the correlation noise between the original frame and the side information is considered to have a Laplacian distribution. The probability mass function of the correlation noise Wm between two sources (images) Si and Sj is given by

$$g_m(w) = \frac{1 - p_m}{1 + p_m}\, p_m^{|w|}, \qquad p_m \in (0, 1) \qquad (37)$$

Only one parameter per pair of correlated sources is required in order to model the correlation noise. The magnitude of the parameter pm determines the degree of correlation between the sources Si and Sj. For small values of the parameter pm, the probability distribution is concentrated around zero, which means that the sources Si and Sj are highly correlated. On the contrary, a large value of pm indicates that the corresponding sources are not strongly correlated.
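The model of Eq. (37) is straightforward to evaluate; the short sketch below computes the pmf for two values of pm and confirms that it sums to one and that small pm concentrates the mass at w = 0.

```python
import numpy as np

# The discrete Laplacian correlation model of Eq. (37): the pmf of the
# correlation noise W between two images, parameterized by p in (0, 1).
def discrete_laplacian_pmf(w, p):
    return (1.0 - p) / (1.0 + p) * p ** np.abs(w)

w = np.arange(-31, 32)                 # noise support for 5-bit sources
for p in (0.3, 0.8):
    pmf = discrete_laplacian_pmf(w, p)
    print(p, pmf.sum(), pmf[w == 0])   # sums to ~1; P(W=0) shrinks with p
```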

Fig. 8 shows the decoding performance in terms of average error rate and average PSNR versus the number of network coded symbols available at the decoder. The original images are quantized to either n = 4 or n = 5 bits prior to transmission, and the network coding operations are performed in finite fields of size q = 16 and q = 32, respectively. The average error rate is defined as the ratio between the number of erroneously decoded source sequences and the total number of transmitted source sequences, which is equal to the number of pixels in one image.


Fig. 8. Decoding performance in terms of (a) average error rate and (b) average PSNR (in dB) as a function of the number of received network coded symbols L for N = 15 correlated images extracted from the Silent and the Foreman QCIF sequences. The images are quantized to 4 bits and 5 bits prior to transmission and the coding operations are performed in fields of size q = 16 and q = 32, respectively.

Fig. 9. (a) Decoded frames and (b) error frames (Frames 2, 7 and 14 of each sequence) of the Silent and Foreman QCIF video sequences. The images are quantized to 5 bits prior to transmission and the coding operations are performed in a field of size q = 32. The decoding is performed from L = 13 network coded symbols.

The PSNR for each decoded image is calculated as

$$\mathrm{PSNR} = 10 \log_{10} \frac{(2^n - 1)^2}{\mathrm{MSE}} \qquad (38)$$

where MSE is the mean square error between the original quantized image and the decoded image, and n is the number of bits used to represent the pixel values. We can see that the decoding performance is significantly worse for the Foreman sequence as compared to the Silent sequence. This is due to the higher motion present in the Foreman sequence, which leads to lower correlation between pairs of frames. On the contrary, the Silent sequence is more static and motion is present only in limited regions of the images. Thus the correlation between pairs of frames is larger, which leads to better performance both in terms of average error rate and average PSNR.
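For reference, a direct implementation of Eq. (38) might look as follows; the handling of the zero-MSE case (returning infinity) is our convention.

```python
import numpy as np

# PSNR of Eq. (38) for n-bit quantized images: the peak value is 2^n - 1
# and the MSE is taken between the original quantized image and the
# decoded one.
def psnr(original, decoded, n_bits):
    mse = np.mean((original.astype(np.float64) - decoded.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")            # perfect reconstruction
    peak = (2 ** n_bits - 1) ** 2
    return 10.0 * np.log10(peak / mse)

# Toy example with 5-bit symbols and a single wrong pixel.
rng = np.random.default_rng(2)
orig = rng.integers(0, 32, size=(144, 176))
dec = orig.copy()
dec[0, 0] = (dec[0, 0] + 7) % 32
print(psnr(orig, dec, 5))
```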

In Fig. 9(a) we illustrate the visual quality of some of the decoded images for the two sequences. The images are quantized to 5 bits (q = 32) and the decoding is performed from L = 13 network coded symbols. In Fig. 9(b) we also present the error frames, which are equal to the absolute value of the difference between the original and the decoded frames. We can observe that the majority of the errors appear in regions characterized by significant motion, where the simple correlation model loses its accuracy.

VII. CONCLUSIONS

We have studied the problem of decoding network coded data when the number of received network coded symbols is not sufficient for perfect reconstruction of the original source data with conventional decoding methods. We have analyzed the performance of a maximum a posteriori decoder that approximates the source data with the most probable source symbol sequence given the incomplete set of network coded symbols and the matrix of coding coefficients. In particular, we have derived an upper bound on the probability of decoding error under the MAP decoding rule, which provides useful insights into the performance of our framework with respect to the system parameters. We have also proposed a practical algorithm for decoding network coded data from an incomplete set of data when prior information about the data correlation is available at the decoder. Our algorithm is based on iterative message passing and jointly considers the network coding and the correlation constraints, expressed in finite and real fields, respectively. We have demonstrated the performance of our algorithm through simulations on synthetic signals and on sets of correlated images extracted from video sequences. The results show the great potential of such decoding methods for the recovery of source data when exact reconstruction is not feasible due to the lack of network coded information. It is worth noting that conventional decoding methods such as Gaussian elimination fail to provide any reconstruction when some of the network coded symbols are missing at the decoder. Our method, however, is able to partially recover the source information even from incomplete network coded data, with a performance that increases with the amount of prior information about the sources.

APPENDIX

PROOF OF PROPOSITION 2

We first prove that the function
$$F(r) = r \log_2 \Bigg[ \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{1/r} \Bigg] \qquad (39)$$

is a convex function of r, r > 0, with strict convexity if f(x) is not uniform over X^N. To show this, it is sufficient to show that for any s, t > 0 and 0 < λ < 1

$$F(\lambda s + (1 - \lambda) t) \le \lambda F(s) + (1 - \lambda) F(t) \qquad (40)$$

Let us set r = λs + (1 − λ)t. We have
$$
\begin{aligned}
\sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1}{r}}
&= \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{\lambda + (1-\lambda)}{r}}
 = \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{\lambda}{r}}\, f(\mathbf{x})^{\frac{1-\lambda}{r}} \\
&\overset{(a)}{\le}
\Bigg( \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{\lambda}{r} \cdot \frac{r}{\lambda s}} \Bigg)^{\frac{\lambda s}{r}}
\Bigg( \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1-\lambda}{r} \cdot \frac{r}{(1-\lambda) t}} \Bigg)^{\frac{(1-\lambda) t}{r}} \\
&= \Bigg( \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1}{s}} \Bigg)^{\frac{\lambda s}{r}}
\Bigg( \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1}{t}} \Bigg)^{\frac{(1-\lambda) t}{r}}
\end{aligned} \qquad (41)
$$

where the inequality (a) follows from a direct application of Hölder's inequality [33]. Raising the left and right hand sides of Eq. (41) to the power r and taking the base-2 logarithm, we obtain the following result

$$
r \log_2 \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1}{r}}
\le \lambda s \log_2 \Bigg( \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1}{s}} \Bigg)
+ (1 - \lambda) t \log_2 \Bigg( \sum_{\mathbf{x} \in \mathcal{X}^N} f(\mathbf{x})^{\frac{1}{t}} \Bigg)
\qquad (42)
$$

which is equivalent to Eq. (40) and proves that F(r) is convex. The equality in Eq. (41) holds if and only if

$$
\frac{f(\mathbf{x})^{\frac{1}{t}}}{f(\mathbf{x})^{\frac{1}{s}}}
= f(\mathbf{x})^{\frac{1}{t} - \frac{1}{s}} = c, \quad \forall \mathbf{x} \in \mathcal{X}^N
$$

where c is a constant, which is only feasible if f(x) is a uniform distribution. Thus, the function F(r) is strictly convex unless f(x) is a uniform distribution.


By simple argumentation, it is now trivial to show that the function E(ρ) is (strictly) convex. The first summand in Eq. (20) is an affine function of ρ and is therefore convex. The second summand in Eq. (20) is a composition of the (strictly) convex function F(r) with the affine expression 1 + ρ, which yields a (strictly) convex function of ρ [34]. Hence, E(ρ) is a convex function, since it is a sum of convex functions, with strict convexity unless f(x) is a uniform distribution.
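As a numerical sanity check of the convexity statement (not part of the proof), one can evaluate F(r) on a toy distribution; a sketch follows, where the distributions and the choice of s, t, λ are arbitrary.

```python
import numpy as np

# Numerical check of the convexity of F(r) = r * log2(sum_x f(x)^(1/r))
# from Eq. (39) on a toy alphabet. Strict inequality is expected for a
# non-uniform f; equality (F affine in r) for the uniform one.
def F(r, f):
    return r * np.log2(np.sum(f ** (1.0 / r)))

f_nonuniform = np.array([0.7, 0.2, 0.05, 0.05])
f_uniform = np.full(4, 0.25)

s, t, lam = 0.5, 3.0, 0.4
r = lam * s + (1 - lam) * t
for f, name in ((f_nonuniform, "non-uniform"), (f_uniform, "uniform")):
    lhs = F(r, f)
    rhs = lam * F(s, f) + (1 - lam) * F(t, f)
    print(name, lhs, rhs, lhs <= rhs + 1e-12)
```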

REFERENCES

[1] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, "Network Information Flow," IEEE Trans. Information Theory, vol. 46, no. 4, pp. 1204–1216, July 2000.

[2] T. Ho, M. Medard, J. Shi, M. Effros, and D. R. Karger, "On Randomized Network Coding," in Proc. of the 41st Allerton Conf. on Communication, Control and Computing, Monticello, IL, USA, Oct. 2003.

[3] A. Tarantola, Inverse Problem Theory and Methods for Model Parameter Estimation. Society for Industrial and Applied Mathematics, 2005.

[4] A. Neumaier, "Solving Ill-Conditioned and Singular Linear Systems: A Tutorial on Regularization," SIAM Review, vol. 40, pp. 636–666, 1998.

[5] S. Draper and S. Malekpour, "Compressed Sensing over Finite Fields," in Proc. of IEEE Int. Symp. on Information Theory, 2009, pp. 669–673.

[6] M. Nabaee and F. Labeau, "Quantized Network Coding for Sparse Messages," in Proc. of IEEE Statistical Signal Processing Workshop, Aug. 2012.

[7] S. Feizi and M. Medard, "A Power Efficient Sensing/Communication Scheme: Joint Source-Channel-Network Coding by Using Compressive Sensing," in Proc. of 49th Annual Allerton Conf. on Communication, Control and Computing, Monticello, IL, Sept. 2011.

[8] S. Katti, S. Shintre, S. Jaggi, D. Katabi, and M. Medard, "Real Network Codes," in Proc. of the 45th Allerton Conf. on Communication, Control, and Computing, Monticello, IL, Sept. 2007.

[9] B. K. Dey, S. Katti, S. Jaggi, and D. Katabi, "Real and Complex Network Codes: Promises and Challenges," in Proc. of 4th Workshop on Network Coding, Theory and Applications, NetCod, Jan. 2008.

[10] E. J. Candes and M. B. Wakin, "An Introduction to Compressive Sampling," IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 21–30, Mar. 2008.

[11] H. Park, N. Thomos, and P. Frossard, "Approximate Decoding Approaches for Network Coded Correlated Data," Signal Processing, vol. 93, no. 1, pp. 109–123, 2013.

[12] L. Iwaza, M. Kieffer, L. Liberti, and K. A. Agha, "Joint Decoding of Multiple-Description Network-Coded Data," in Int. Symp. on Network Coding, NETCOD'11, Beijing, July 2011, pp. 1–6.

[13] F. Kschischang, B. Frey, and H.-A. Loeliger, "Factor Graphs and the Sum-Product Algorithm," IEEE Trans. Information Theory, vol. 47, no. 2, pp. 498–519, 2001.

[14] G. Maierbacher, J. Barros, and M. Medard, "Practical Source-Network Decoding," in Proc. of 6th Int. Symp. on Wireless Communication Systems, ISWCS'09, Tuscany, Sept. 2009, pp. 283–287.

[15] F. Bassi, C. Liu, L. Iwaza, and M. Kieffer, "Compressive Linear Network Coding for Efficient Data Collection in Wireless Sensor Networks," in Proc. of European Conf. on Signal Processing, EUSIPCO'12, July 2012.

[16] T. Ho, M. Medard, M. Effros, and R. Koetter, "Network Coding for Correlated Sources," in Proc. of Conf. on Information Science and Systems, Princeton, NJ, 2004.

[17] I. Csiszar, "Linear Codes for Sources and Source Networks: Error Exponents, Universal Coding," IEEE Trans. Information Theory, vol. 28, no. 4, pp. 585–592, July 1982.

[18] Y. Wu, V. Stankovic, Z. Xiong, and S. Y. Kung, "On Practical Design for Joint Distributed Source and Network Coding," IEEE Trans. Information Theory, vol. 55, no. 4, pp. 1709–1720, Apr. 2009.

[19] S. Pradhan and K. Ramachandran, "Distributed Source Coding Using Syndromes (DISCUS): Design and Construction," IEEE Trans. Information Theory, vol. 49, no. 3, pp. 626–643, Mar. 2003.

[20] M. Effros, M. Medard, T. Ho, S. Ray, D. Karger, and R. Koetter, "Linear Network Codes: A Unified Framework for Source, Channel and Network Coding," in Proc. of DIMACS Workshop on Network Information Theory, Piscataway, NJ, 2003.

[21] A. Ramamoorthy, K. Jain, P. A. Chou, and M. Effros, "Separating Distributed Source Coding From Network Coding," IEEE Trans. Information Theory, vol. 52, no. 6, pp. 2785–2795, June 2006.

[22] R. G. Gallager, Information Theory and Reliable Communication. John Wiley & Sons, Inc., 1968.

[23] D. J. C. MacKay, Information Theory, Inference & Learning Algorithms. New York, NY, USA: Cambridge University Press, 2002.

[24] R. G. Gallager, "Source Coding with Side Information and Universal Coding," Sept. 1997, available at http://web.mit.edu/gallager/www/papers/paper5.pdf.

[25] S. Inusah and T. J. Kozubowski, "A Discrete Analogue of the Laplace Distribution," Journal of Statistical Planning and Inference, vol. 136, no. 3, pp. 1090–1102, March 2006.

[26] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1988.

[27] M. Davey and D. MacKay, "Low-Density Parity Check Codes over GF(q)," IEEE Communications Letters, vol. 2, no. 6, pp. 165–167, June 1998.

[28] E. Sharon, S. Litsyn, and J. Goldberger, "Efficient Serial Message-Passing Schedules for LDPC Decoding," IEEE Trans. Information Theory, vol. 53, no. 11, pp. 4076–4091, Nov. 2007.

[29] T. Moon and J. Gunther, "Transform-Based Computation of the Distribution of a Linear Combination of Random Variables Over Arbitrary Finite Fields," IEEE Signal Processing Letters, vol. 18, no. 12, pp. 737–740, 2011.

[30] J. Barros and M. Tuchler, "Scalable Decoding on Factor Trees: A Practical Solution for Wireless Sensor Networks," IEEE Trans. Communications, vol. 54, no. 2, pp. 284–294, Feb. 2006.

[31] C. Brites and F. Pereira, "Correlation Noise Modeling for Efficient Pixel and Transform Domain Wyner-Ziv Video Coding," IEEE Trans. Circuits and Systems for Video Technology, vol. 18, no. 9, pp. 1177–1190, Sept. 2008.

[32] B. Girod, A. Aaron, S. Rane, and D. Rebollo-Monedero, "Distributed Video Coding," Proceedings of the IEEE, vol. 93, no. 1, pp. 71–83, Jan. 2005.

[33] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, 5th ed. Academic Press, 1994.

[34] S. Boyd and L. Vandenberghe, Convex Optimization. New York: Cambridge University Press, 2004.