
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 9, SEPTEMBER 2013

Secrecy Capacity per Unit Cost

Mustafa El-Halabi, Tie Liu, and Costas N. Georghiades

Abstract—The concept of channel capacity per unit cost was introduced by Verdú in 1990 to study the limits of cost-efficient wide-band communication. It was shown that orthogonal signaling can achieve the channel capacity per unit cost of memoryless stationary channels with a zero-cost input letter. This paper introduces a concept of secrecy capacity per unit cost to study cost-efficient wide-band secrecy communication. For degraded memoryless stationary wiretap channels, it is shown that an orthogonal coding scheme with randomized pulse position and constant pulse shape achieves the secrecy capacity per unit cost with a zero-cost input letter. For general memoryless stationary wiretap channels, the performance of orthogonal codes is studied, and the benefit of further randomizing the pulse shape is demonstrated via a simple example.

Index Terms—Information-theoretic security, orthogonal signaling, secrecy capacity per unit cost, wide-band communication, wiretap channel

I. INTRODUCTION

IN CLASSICAL Shannon theory [10], communication over a noisy medium is modeled as a communication channel with discrete-time input and output. An important objective is to understand the channel capacity, which is defined as the maximum number of bits per channel use that can be reliably transmitted for a given constraint on the average transmission cost per input symbol. (The channel capacity as a function of the average transmission cost per input symbol is known as the capacity-cost function.) This formulation is suited for studying band-limited communication, where spectrum is the most valuable resource.

A different scenario emerges in the context of deep-space

communication, where there are virtually no limitations on the available bandwidth. Instead, energy becomes the most valuable resource, considering the prohibitively high cost of replacing satellite batteries. This communication scenario was abstracted by Verdú [12] using the concept of channel capacity per unit cost, which is defined as the maximum number of bits per total transmission cost that can be reliably transmitted. Since there are no limitations on the number of channel uses, this formulation is tailored for studying wide-band communication.

Verdú's formulation of channel capacity per unit cost [12]

can be viewed as a relaxed setting of the classical Shannon formulation of channel capacity, in the sense that there are

Manuscript received August 30, 2012; revised March 2, 2013. This research was supported by the National Science Foundation under Grant CCF-08-45848. The material in this paper was presented in part at the 2009 IEEE International Symposium on Information Theory (ISIT), Seoul, Korea, June–July 2009.

The authors are with the Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA (e-mail: {mustafa79,tieliu,c-georghiades}@tamu.edu).

Digital Object Identifier 10.1109/JSAC.2013.1309xx.

no limitations imposed on the number of channel uses for communication. Therefore, it is not surprising that the channel capacity per unit cost of a memoryless stationary channel can be derived from its capacity-cost function. For the special case where there is a zero-cost letter in the input alphabet, however, Verdú [12] provided an alternative characterization of the channel capacity per unit cost which does not depend on the notion of channel capacity. There are two main advantages of this new characterization:

1) Compared with the classical single-letter characterization of channel capacity, Verdú's characterization of channel capacity per unit cost is much easier to compute, as it only involves an optimization over the input letters as opposed to the input distributions for channel capacity [10].

2) Even though structured codes that can achieve the channel capacity are difficult to construct, Verdú's characterization is strongly tied to the fact that with a zero-cost input letter, the channel capacity per unit cost can be achieved by highly structured orthogonal codes.

The above results were later extended by Liu and Viswanath [7] to memoryless stationary state-dependent channels, where the channel states are non-causally known at the transmitter as side information.

The main aim of this paper is to extend Verdú's formulation

of channel capacity per unit cost from regular communication (without any secrecy constraints) to secrecy communication, and to understand to what extent Verdú's results [12] can be extended to memoryless stationary wiretap channels. We note here that in the spirit of the classical Shannon formulation [10], the secrecy capacity of memoryless stationary wiretap channels was characterized by Wyner [13] for the degraded case and by Csiszár and Körner [3] for the general case.

The rest of the paper is organized as follows. The definition

of secrecy capacity per unit cost is formally introduced in Sec. II. In Sec. III, we show that Verdú's results [12] on channel capacity per unit cost extend naturally to degraded wiretap channels: an orthogonal coding scheme achieves the secrecy capacity per unit cost when there is a zero-cost input letter. In Sec. IV, we consider general wiretap channels and study the performance of orthogonal codes for secrecy capacity per unit cost. Finally, in Sec. V we conclude the paper with some remarks.

Notation. Throughout the paper, we follow the convention

0/0 = 0 for divisions between two real numbers.

II. DEFINITIONS

As illustrated in Fig. 1, a memoryless stationary wiretap channel consists of an input alphabet X, two output alphabets Y and Z at the legitimate receiver and the eavesdropper,

0733-8716/13/$31.00 © 2013 IEEE


[Figure: W → Stochastic Encoder → X^n → P_{Y,Z|X} → Y^n (Legitimate Receiver) and Z^n (Eavesdropper)]

Fig. 1. An illustration of the memoryless stationary wiretap channel.

respectively, and a conditional probability distribution P_{Y,Z|X}. An (n, w0, ν, ε, h) secrecy code consists of:

• a message W uniformly drawn from {1, . . . , w0};

• a stochastic encoder which maps the message W to a length-n codeword X^n = (X1, . . . , Xn) ∈ X^n such that

∑_{i=1}^{n} b(Xi) ≤ ν    (1)

where b : X → R+ = [0, +∞) is a function that assigns a nonnegative cost to each letter in the input alphabet X. Also, the encoder must be designed such that the conditional entropy of the message W given the received vector Z^n at the eavesdropper satisfies

H(W|Z^n) > h;    (2)

• a decoder which maps the received vector Y^n ∈ Y^n at the legitimate receiver to an estimated message Ŵ ∈ {1, . . . , w0} such that the average probability of error satisfies

Pr(Ŵ ≠ W) < ε.    (3)

Following the classical Shannon formulation [10], the secrecy capacity of a memoryless stationary wiretap channel can be defined as follows.

Definition 1 (Secrecy capacity [13], [3]): Given 0 < ε < 1 and β > 0, a nonnegative real Rs is an ε-achievable secrecy rate with average cost per symbol not exceeding β if for any sufficiently small δ > 0, there exists a positive integer n0 such that for any integer n ≥ n0 an (n, w0, nβ, ε, h) code can be found for which

Rs ≥ (log w0)/n > Rs − δ    (4)

and

h/n > Rs − δ.    (5)

Furthermore, Rs is said to be achievable if it is ε-achievable for all 0 < ε < 1. The supremum of all achievable secrecy rates with average cost per symbol not exceeding β is the secrecy capacity, denoted by Cs(β). The secrecy capacity Cs(β) as a function of the average cost per symbol β is formally referred to as the secrecy capacity-cost function.

Note that the constraints (4) and (5) are such that the mutual information between the message W and the received vector Z^n at the eavesdropper normalized by the block length n must satisfy

(1/n) I(W; Z^n) = (1/n)(H(W) − H(W|Z^n))    (6)
< (1/n)(log w0 − h)    (7)
< Rs − (Rs − δ)    (8)
= δ.    (9)

Therefore, the secrecy capacity is the largest number of bits per channel use that can be reliably transmitted to the legitimate receiver while kept asymptotically perfectly secret from the eavesdropper.

For the case where the memoryless stationary wiretap channel (X, (Y, Z), P_{Y,Z|X}) is degraded, i.e., X → Y → Z forms a Markov chain in that order, Wyner [13] showed that the secrecy capacity-cost function Cs(β) is given by

Cs(β) = sup_{E[b(X)]≤β} (I(X; Y) − I(X; Z)).    (10)

The secrecy capacity-cost function of a general memoryless stationary wiretap channel (X, (Y, Z), P_{Y,Z|X}) was characterized by Csiszár and Körner [3] and can be written as

Cs(β) = sup_{E[b(X)]≤β} (I(V; Y|U) − I(V; Z|U))    (11)

where U and V are auxiliary random variables satisfying the Markov chain U → V → X → (Y, Z).

We use the following definition for secrecy capacity per unit cost.

Definition 2 (Secrecy capacity per unit cost): Given 0 < ε < 1, a nonnegative real Rs is an ε-achievable secrecy rate per unit cost if for any sufficiently small δ > 0, there exists a positive real ν0 such that for any ν ≥ ν0 an (n, w0, ν, ε, h) code can be found for which

Rs ≥ (ln w0)/ν > Rs − δ    (12)

and

h/ν > Rs − δ.    (13)

Furthermore, Rs is an achievable secrecy rate per unit cost if it is ε-achievable for all 0 < ε < 1, and the secrecy capacity per unit cost Cs is the supremum of all achievable secrecy rates per unit cost.

Similar to (6)–(9), the constraints (12) and (13) are such that the mutual information between the message W and the received vector Z^n at the eavesdropper normalized by the total transmission cost ν must satisfy

(1/ν) I(W; Z^n) = (1/ν)(H(W) − H(W|Z^n))    (14)
< (1/ν)(log w0 − h)    (15)
< Rs − (Rs − δ)    (16)
= δ.    (17)

Therefore, the secrecy capacity per unit cost is the largest number of bits per total transmission cost that can be reliably transmitted to the legitimate receiver while kept asymptotically perfectly secret from the eavesdropper. Unlike for the secrecy capacity, however, the mutual information I(W; Z^n) is normalized by the total transmission cost ν rather than the block length n. This is because the length of the codewords n


does not play a fundamental role in the definition of secrecy capacity per unit cost. Hence, normalizing the mutual information I(W; Z^n) by the block length n, instead of the total transmission cost ν, will only trivialize the secrecy constraint.

Remark 1: By Definition 2, to show that Rs is an ε-achievable secrecy rate per unit cost, one has to consider all real ν ≥ ν0. From the proof viewpoint, however, it is sufficient to consider νk = kβ for some β > 0 and show that for any sufficiently small δ > 0, there exists a positive integer k0 such that for any integer k ≥ k0 an (n, w0, kβ, ε, h) code can be found for which

Rs ≥ (ln w0)/(kβ) > Rs − δ/2    (18)

and

h/(kβ) > Rs − δ/2.    (19)

For completeness, a proof of Remark 1 is provided in Appendix A. Similar to the relationship between the Shannon capacity-cost function and Verdú's channel capacity per unit cost [12, Th. 2], we have the following simple relationship between the secrecy capacity-cost function and the secrecy capacity per unit cost.

Theorem 1: The secrecy capacity per unit cost Cs of the memoryless stationary wiretap channel (X, (Y, Z), P_{Y,Z|X}) is given by

Cs = sup_{β>0} Cs(β)/β    (20)

where Cs(β) is the secrecy capacity-cost function of the channel.

A proof of the theorem is deferred to Appendix B to enhance the flow of the paper. From Theorem 1 we see that, at least in theory, the secrecy capacity per unit cost can be calculated from the secrecy capacity-cost function. However, as shown in (10) and (11), calculating the secrecy capacity-cost function usually involves finding an optimal distribution for the input/auxiliary random variables and hence is highly nontrivial in general. Next, following [12], we shall focus on the case where there is a zero-cost letter in the input alphabet X (labeled as "0" throughout the rest of the paper, i.e., b(0) = 0) and look for a more direct way to calculate the secrecy capacity per unit cost without resorting to the secrecy capacity-cost function.
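Theorem 1 admits a quick numerical sanity check for the Gaussian wiretap channel treated in Example 1 below. The sketch assumes the classical Gaussian secrecy capacity-cost function Cs(β) = (1/2) ln(1 + β/σ1²) − (1/2) ln(1 + β/σ2²) in nats, with illustrative variance values; the ratio Cs(β)/β grows as β shrinks, approaching the closed form given later in (31).

```python
import math

# Hypothetical Gaussian wiretap channel: Y = X + N1, Z = X + N2, with
# noise variances s1 <= s2 and quadratic cost b(x) = x^2.
s1, s2 = 1.0, 4.0

def Cs_beta(beta):
    # Secrecy capacity-cost function (in nats), assuming the classical
    # Gaussian wiretap result.
    return 0.5 * math.log(1 + beta / s1) - 0.5 * math.log(1 + beta / s2)

# Theorem 1: Cs = sup_{beta > 0} Cs(beta)/beta. Here the ratio increases
# as beta decreases, so the sup is the beta -> 0 limit, matching (31).
target = 0.5 * (1 / s1 - 1 / s2)
for beta in (1.0, 0.1, 0.01, 0.001):
    print(beta, Cs_beta(beta) / beta)
print("closed form (31):", target)
```

With s1 = 1 and s2 = 4 the printed ratios climb toward 0.375, the value of (31), illustrating that the supremum in (20) is approached in the low-cost limit.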

III. DEGRADED WIRETAP CHANNEL

A. Main Results and Discussions

When there is a zero-cost letter in the input alphabet X, the secrecy capacity per unit cost of a degraded wiretap channel can be calculated without resorting to the secrecy capacity-cost function. The result is summarized in the following theorem.

Theorem 2: The secrecy capacity per unit cost Cs of the memoryless stationary wiretap channel (X, (Y, Z), P_{Y,Z|X}) under the degradedness assumption X → Y → Z and with a zero-cost input letter "0" is given by

Cs = sup_{x∈X} N(x)/b(x)    (21)

where

N(x) := D(P_{Y|X=x}‖P_{Y|X=0}) − D(P_{Z|X=x}‖P_{Z|X=0})    (22)

and D(P‖Q) denotes the (Kullback–Leibler) divergence between two generic probability distributions P and Q. Furthermore, the secrecy capacity per unit cost of the channel can be achieved by an orthogonal coding scheme.

The fact that the secrecy capacity per unit cost can be

bounded from below as

Cs ≥ sup_{x∈X} N(x)/b(x)    (23)

does not depend on the assumption that the channel is degraded and can be inferred from a stronger lower bound on the secrecy capacity per unit cost of the general wiretap channel provided in Sec. IV. To show that for degraded wiretap channels we also have the reversed inequality

Cs ≤ sup_{x∈X} N(x)/b(x)    (24)

we shall use the following standard relations between mutual information and divergence:

I(X; Y) = ∫_X D(P_{Y|X=x}‖P_{Y|X=0}) dP_X(x) − D(P_Y‖P_{Y|X=0})    (25)

and

I(X; Z) = ∫_X D(P_{Z|X=x}‖P_{Z|X=0}) dP_X(x) − D(P_Z‖P_{Z|X=0})    (26)

and apply the data-processing inequality for divergence [4]. The coding scheme that we consider is a binning scheme [13] over a codebook that consists of mutually orthogonal codewords. The details of the proof are provided in Sec. III-B.

Compared with the expression (10) for the secrecy capacity-cost function, the expression (21) for the secrecy capacity per unit cost involves only an optimization over the input letters rather than the input distributions. For many specific channels, this represents a significant reduction in computational complexity.

Example 1: Consider the Gaussian wiretap channel

Y = X + N1
Z = X + N2    (27)

with a real channel input X and a quadratic cost function b(x) = x² (so the cost is on the energy of the transmission), where N1 and N2 are additive Gaussian noise with zero means and variances σ1² and σ2², respectively. Assume that σ1² ≤ σ2². Just like the secrecy capacity, the secrecy capacity per unit cost of the channel depends on the joint distribution of the additive noise (N1, N2) only through its marginals. Thus, for the purpose of calculating the secrecy capacity per unit cost we can write N2 = N1 + Ñ2, where Ñ2 is Gaussian with zero mean and variance σ̃2² = σ2² − σ1² and is independent of N1. It follows that the Gaussian wiretap channel (27) can be equivalently written as

Y = X + N1
Z = X + N1 + Ñ2    (28)



which satisfies the Markov relation X → Y → Z. The divergence between two Gaussian random variables is given by

D(N(μ1, σ1²)‖N(μ0, σ0²)) = ln(σ0/σ1) + (σ1² − σ0² + (μ1 − μ0)²)/(2σ0²).    (29)

Therefore, for any x ≠ 0 we have

N(x)/x² = (1/2)(1/σ1² − 1/σ2²).    (30)

Thus, by (21) and without any optimization, we may conclude that the secrecy capacity per unit cost of the Gaussian wiretap channel (27) under the quadratic cost function b(x) = x² is given by

Cs = (1/2)(1/σ1² − 1/σ2²).    (31)
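The conclusion of Example 1 can be verified numerically against the definition (22), using the Gaussian divergence formula (29); the variance values below are illustrative.

```python
import math

def kl_gauss(mu1, v1, mu0, v0):
    # D(N(mu1, v1) || N(mu0, v0)) in nats, cf. (29), with variances v = sigma^2.
    return 0.5 * math.log(v0 / v1) + (v1 - v0 + (mu1 - mu0) ** 2) / (2 * v0)

s1, s2 = 1.0, 4.0   # illustrative noise variances, s1 <= s2

def N_of_x(x):
    # N(x) from (22) for the Gaussian wiretap channel (27):
    # shifting the input to x shifts the mean of Y and Z but not the variance.
    return kl_gauss(x, s1, 0.0, s1) - kl_gauss(x, s2, 0.0, s2)

# Per (30), N(x)/x^2 is the same for every x, so (21) requires no search:
for x in (0.5, 1.0, 3.0):
    print(x, N_of_x(x) / x**2)
print("(31):", 0.5 * (1 / s1 - 1 / s2))
```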

Under some mild regularity conditions on the family of distributions {P_{Y|X=x} : x ∈ R}, the following asymptotic result on divergence is known [6, Ch. 2.6]:

lim_{x↓0} D(P_{Y|X=x}‖P_{Y|X=0})/x² = (1/2) J0(P_{Y|X})    (32)

where J0(P_{Y|X}) is the Fisher information over the parameter family {P_{Y|X=x} : x ∈ R} evaluated at x = 0. This result can be used to obtain a simple lower bound on the secrecy capacity per unit cost, which does not involve any optimization at all.

Theorem 3: The secrecy capacity per unit cost Cs of the memoryless stationary wiretap channel (X, (Y, Z), P_{Y,Z|X}) with an input alphabet X = R and a quadratic cost function b(x) = x² can be bounded from below as

Cs ≥ (1/2)(J0(P_{Y|X}) − J0(P_{Z|X})).    (33)

Proof: By Theorem 2, for any x ∈ X the secrecy rate per unit cost N(x)/b(x) (when it is positive) is achievable for any memoryless stationary wiretap channel (not necessarily degraded) with a zero-cost input letter. Under the quadratic cost function b(x) = x², "0" is a zero-cost input letter, i.e., b(0) = 0. Thus, (33) can be proved by letting x ↓ 0 in N(x)/b(x) and applying the asymptotic result (32) to both D(P_{Y|X=x}‖P_{Y|X=0}) and D(P_{Z|X=x}‖P_{Z|X=0}).

For the Gaussian wiretap channel (27) we have

J0(P_{Y|X}) = J(N1) = 1/σ1²    (34)

and

J0(P_{Z|X}) = J(N2) = 1/σ2².    (35)

Hence, the simple lower bound on the right-hand side of (33) is tight for the Gaussian wiretap channel (27). With the help of Theorem 3, we can also prove the following worst-noise property for the Gaussian wiretap channel (27).

Theorem 4: Consider the memoryless stationary wiretap channel (28) with an input alphabet X = R, a quadratic cost function b(x) = x², and independent additive noise N1 and Ñ2. While Ñ2 is assumed to be Gaussian with zero mean and variance σ̃2² = σ2² − σ1², N1 is possibly non-Gaussian. The secrecy capacity per unit cost of the channel satisfies

Cs ≥ (1/2)(1/σ1² − 1/σ2²)    (36)

for any distribution of N1 with zero mean and variance σ1². The equality holds when N1 is also Gaussian.

Proof: By Theorem 3, the secrecy capacity per unit cost

of the channel can be bounded from below as

Cs ≥ (1/2)(J(N1) − J(N1 + Ñ2))    (37)

where J(X) denotes the Fisher information of a generic random variable X relative to a translation parameter. By the Fisher information inequality [11],

J(N1 + Ñ2) ≤ J(N1)J(Ñ2)/(J(N1) + J(Ñ2)) = J(N1)/(σ̃2² J(N1) + 1)    (38)

where the last equality follows from the fact that Ñ2 is N(0, σ̃2²), so J(Ñ2) = 1/σ̃2². Substituting (38) into (37), we

have

Cs ≥ (1/2) σ̃2² (J(N1))²/(σ̃2² J(N1) + 1).    (39)

Note that the right-hand side of (39) is monotonically increasing with J(N1). By the well-known Cramér–Rao inequality,

J(N1) ≥ 1/σ1².    (40)

Substituting (40) into (39) gives

Cs ≥ (1/2) σ̃2²/(σ1²(σ1² + σ̃2²)) = (1/2)(1/σ1² − 1/σ2²).    (41)

This completes the proof of the theorem.
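The chain of bounds (37)–(41) is easy to spot-check numerically. The sketch below uses illustrative variances and takes J(N1) = 1/σ1², i.e., the Gaussian case in which the Cramér–Rao inequality (40) is tight, so the right-hand sides of (39) and (41) must coincide.

```python
s1 = 2.0           # variance of N1 (illustrative)
s2 = 5.0           # variance of N2 (illustrative)
st = s2 - s1       # variance of the extra noise tilde-N2 in (28)

J1 = 1 / s1                                # Fisher information of Gaussian N1
rhs39 = 0.5 * st * J1**2 / (st * J1 + 1)   # right-hand side of (39)
rhs41 = 0.5 * (1 / s1 - 1 / s2)            # right-hand side of (41)
print(rhs39, rhs41)   # both equal 0.15 for these variances
```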

B. Proof of Theorem 2

Let us first prove the reversed inequality (24). By the degradedness assumption X → Y → Z, the probability distributions P_{Z|X=x}, P_{Z|X=0}, and P_Z can be obtained from P_{Y|X=x}, P_{Y|X=0}, and P_Y, respectively, via the same "processing" P_{Z|Y}. By the data-processing inequality for divergence [4], we have

D(P_{Y|X=x}‖P_{Y|X=0}) ≥ D(P_{Z|X=x}‖P_{Z|X=0})    (42)

and

D(P_Y‖P_{Y|X=0}) ≥ D(P_Z‖P_{Z|X=0}).    (43)

By (42), we have N(x) ≥ 0 for any x ∈ X. Let

X′ := {x ∈ X : b(x) > 0}.    (44)

If there exists an x ∈ X \ X′ such that N(x) > 0, then N(x)/b(x) = ∞ and there is nothing to prove from the converse point of view. Therefore, without loss of generality we may assume that N(x) = 0 for all x ∈ X \ X′.


By (25) and (26) we have

I(X; Y) − I(X; Z)
= ∫_X N(x) dP_X(x) − (D(P_Y‖P_{Y|X=0}) − D(P_Z‖P_{Z|X=0}))    (45)
≤ ∫_X N(x) dP_X(x)    (46)
= ∫_X′ N(x) dP_X(x)    (47)
= ∫_X′ (N(x)/b(x)) b(x) dP_X(x)    (48)
≤ (sup_{x∈X′} N(x)/b(x)) ∫_X′ b(x) dP_X(x)    (49)
= (sup_{x∈X′} N(x)/b(x)) ∫_X b(x) dP_X(x)    (50)
= (sup_{x∈X′} N(x)/b(x)) E[b(X)]    (51)
≤ (sup_{x∈X} N(x)/b(x)) E[b(X)]    (52)

where (46) follows from (43), (47) follows from the assumption that N(x) = 0 for all x ∈ X \ X′, and (50) follows from the definition of X′. Substituting (10) and (52) into (20) gives

Cs = sup_{β>0} sup_{E[b(X)]≤β} (I(X; Y) − I(X; Z))/β    (53)
≤ sup_{x∈X} N(x)/b(x) · sup_{β>0} sup_{E[b(X)]≤β} E[b(X)]/β    (54)
= sup_{x∈X} N(x)/b(x).    (55)

To show that for any given x ∈ X, the secrecy rate per unit cost N(x)/b(x) can be achieved by an orthogonal coding scheme, let us first consider x ∈ X′. Fix 0 < δ < 2N(x)/b(x), 0 < ε < 1, and k to be a sufficiently large positive integer. Let m = w0 l0 for some integers w0 and l0 such that

exp(k(N(x) − δb(x)/2)) < w0 < exp(k(N(x) − δb(x)/3))    (56)

and

exp(k(D(P_{Z|X=x}‖P_{Z|X=0}) + δb(x)/12)) < l0 < exp(k(D(P_{Z|X=x}‖P_{Z|X=0}) + δb(x)/6)).    (57)

Codebook. Each codeword is identified by an integer pair (w, l), where w ∈ {1, . . . , w0} and l ∈ {1, . . . , l0}, and corresponds to an m × k matrix

{x_{i,j} : 1 ≤ i ≤ m, 1 ≤ j ≤ k}.

Denote by x_i^k the ith row of the codeword matrix {x_{i,j}}. For codeword (w, l),

x_i^k = (x, . . . , x) if i = (w − 1)l0 + l, and (0, . . . , 0) otherwise.    (58)
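A toy instance of the codebook (58) makes the orthogonality claim concrete; the parameter values below are illustrative and do not follow the exponential growth rates (56)–(57).

```python
# Toy instance of the orthogonal codebook (58); illustrative parameters.
w0, l0, k, x = 3, 2, 4, 1.5
m = w0 * l0

def codeword(w, l):
    # Codeword (w, l), w in {1,...,w0}, l in {1,...,l0}: an m x k matrix
    # whose only nonzero row ("pulse") is row (w - 1)*l0 + l, filled with x.
    rows = [[0.0] * k for _ in range(m)]
    rows[(w - 1) * l0 + (l - 1)] = [x] * k   # 0-based row index
    return rows

def inner(a, b):
    # Inner product of two codewords viewed as length-mk vectors.
    return sum(u * v for ra, rb in zip(a, b) for u, v in zip(ra, rb))

c11, c21 = codeword(1, 1), codeword(2, 1)
print(inner(c11, c21))   # 0.0: distinct codewords occupy distinct rows
print(inner(c11, c11))   # k * x^2 = 9.0: the cost k*b(x) under b(x) = x^2
```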

Fig. 2. An illustration of the codeword (w, l). Each codeword is an m × k matrix, where m = w0 l0, w0 ≈ exp(kN(x)), and l0 ≈ exp(kD(P_{Z|X=x}‖P_{Z|X=0})). Each row of the codeword matrix is termed a pulse. The only nonzero pulse for the codeword (w, l) is the ((w − 1)l0 + l)th row, which consists of a sequence of x's of length k. Hence, the shape of the nonzero pulse is constant.

Thus, the block length of this code is n = mk, and the cost of each codeword is kb(x). In this paper, the nonzero row of a codeword matrix is referred to as a "pulse". Note that the pulses for different codewords are non-overlapping, so the codewords are orthogonal to each other. See Fig. 2 for an illustration.

Encoding. Given message W, randomly and uniformly choose an integer L ∈ {1, . . . , l0}, and send codeword (W, L) through the channel. The randomness used for choosing L is intrinsic to the transmitter and is not shared with either the legitimate receiver or the eavesdropper.

Decoding at the legitimate receiver. Given the matrix of

observations

{Y_{i,j} : 1 ≤ i ≤ m, 1 ≤ j ≤ k}

the decoder performs m independent binary hypothesis tests, one on each row of the transmitted codeword matrix:

H_{i,0} : x_i^k = (0, . . . , 0)
H_{i,1} : x_i^k = (x, . . . , x)    (59)

for i = 1, . . . , m. The conditional error probabilities of these tests are denoted by

α_i^(k) = Pr{H_{i,1}|H_{i,0}}    (60)

and

β_i^(k) = Pr{H_{i,0}|H_{i,1}}    (61)

and the decision rule is set so that β_i^(k) < ε/2. If one and only one H_{i,1} was claimed (denoted by Ĥ_{i,1}), we declare the transmitted codeword to be (w, i − (w − 1)l0) (and the transmitted message to be w), where w is the smallest integer greater than or equal to i/l0. Otherwise, an error is declared.

Obviously, the probability P_{w,l} of erroneously decoding the transmitted codeword conditioned on codeword (w, l) being

6 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 9, SEPTEMBER 2013

sent is independent of the value of (w, l) and can be bounded from above as

P_{1,1} ≤ β_i^(k) + (m − 1) α_i^(k).    (62)
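The decoder's m per-row binary tests can be illustrated with a small Monte Carlo sketch over a Gaussian legitimate channel. The threshold test on the row mean is a stand-in for the optimal test of the Chernoff–Stein analysis, and all parameter values are illustrative rather than the asymptotic choices of this section.

```python
import math
import random

random.seed(1)
m, k, x, s1 = 8, 64, 1.0, 1.0   # rows, pulse length, input letter, noise var
t = x / 2                        # midpoint threshold for the row mean

errors, trials = 0, 2000
for _ in range(trials):
    pulse_row = random.randrange(m)      # the row (w - 1)*l0 + l of (58)
    detected = []
    for i in range(m):
        # Row mean of k noisy samples: mean x on the pulse row, 0 elsewhere.
        mean = x * (i == pulse_row) + random.gauss(0.0, math.sqrt(s1 / k))
        if mean > t:                     # declare H_{i,1} for this row
            detected.append(i)
    if detected != [pulse_row]:          # decode only if exactly one H_{i,1}
        errors += 1
print("empirical error rate:", errors / trials)
```

With these parameters the row mean is four standard deviations from the threshold under either hypothesis, so errors are rare, consistent with the union bound (62).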

Denote by Y_i^k the ith row of the matrix of observations {Y_{i,j}}. Note that Y_i^k is i.i.d. according to P_{Y|X=0} under H_{i,0} and i.i.d. according to P_{Y|X=x} under H_{i,1}. By the Chernoff–Stein lemma [1], since β_i^(k) < ε/2, we can achieve

α_i^(k) < exp(−k(D(P_{Y|X=x}‖P_{Y|X=0}) − δb(x)/12))    (63)

for sufficiently large k. Consequently,

P_{1,1} < ε/2 + m exp(−k(D(P_{Y|X=x}‖P_{Y|X=0}) − δb(x)/12))    (64)
< ε/2 + exp(−kδb(x)/12)    (65)
< ε    (66)

for sufficiently large k, where (65) follows from the fact that

m = w0 l0 < exp(k(D(P_{Y|X=x}‖P_{Y|X=0}) − δb(x)/6)).    (67)

Confidentiality at the eavesdropper. The conditional entropy of the transmitted message W given the matrix of observations

{Z_{i,j} : 1 ≤ i ≤ m, 1 ≤ j ≤ k}

at the eavesdropper is given by

H(W|{Z_{i,j}}) = H(W, L|{Z_{i,j}}) − H(L|W, {Z_{i,j}})    (68)
= H(W, L) − I(W, L; {Z_{i,j}}) − H(L|W, {Z_{i,j}})    (69)
= H(W) + H(L) − I({X_{i,j}}; {Z_{i,j}}) − H(L|W, {Z_{i,j}}).    (70)

Induced by the random selection of (W, L), the transmitted codeword entries X_{i,j} are identically distributed according to

P_X(x) = 1 − P_X(0) = 1/m.    (71)

Note from (56) and (57) that m → ∞ in the limit as k → ∞. By [2, Eq. (2.13)], for any (i, j) ∈ {1, . . . , m} × {1, . . . , k} we have

lim_{k→∞} (m I(X_{i,j}; Z_{i,j})) = D(P_{Z|X=x}‖P_{Z|X=0}).    (72)
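The limit (72) can be checked numerically for a hypothetical discrete eavesdropper channel: under the sparse input distribution (71), m·I(X_{i,j}; Z_{i,j}) approaches D(P_{Z|X=x}‖P_{Z|X=0}) as m grows. The two output distributions below are illustrative.

```python
import math

p = [0.7, 0.2, 0.1]   # P_{Z|X=x} on a 3-letter output alphabet (illustrative)
q = [0.3, 0.3, 0.4]   # P_{Z|X=0} (illustrative)

def kl(a, b):
    # Discrete KL divergence D(a || b) in nats.
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))

def m_times_I(m):
    # With P_X(x) = 1/m and P_X(0) = 1 - 1/m, as in (71), the output law is
    # the mixture pz, and I(X;Z) = (1/m) D(p||pz) + (1 - 1/m) D(q||pz).
    pz = [pi / m + qi * (1 - 1 / m) for pi, qi in zip(p, q)]
    return m * (kl(p, pz) / m + (1 - 1 / m) * kl(q, pz))

for m in (10, 100, 10000):
    print(m, m_times_I(m))
print("D(p||q):", kl(p, q))   # the values above converge to this divergence
```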

Thus, for sufficiently large k,

I({X_{i,j}}; {Z_{i,j}}) ≤ ∑_{i=1}^{m} ∑_{j=1}^{k} I(X_{i,j}; Z_{i,j})    (73)
< k(D(P_{Z|X=x}‖P_{Z|X=0}) + δb(x)/24)    (74)

where (73) follows from the fact that the channel is memoryless, and (74) follows from (72). We also have the following lemma, whose proof is provided in Appendix C.

Lemma 1:

H(L|W, {Z_{i,j}}) < H(L) − k(D(P_{Z|X=x}‖P_{Z|X=0}) − δb(x)/24)    (75)

for sufficiently large k.

Substituting (74) and (75) into (70) gives

H(W|{Z_{i,j}}) > log w0 − kb(x)δ/12    (76)
> kb(x)(N(x)/b(x) − 5δ/12) =: h.    (77)

Combining (66) and (77), we have successfully constructed, for any 0 < ε < 1, any 0 < δ < 2N(x)/b(x), and any sufficiently large k, an (mk, w0, kb(x), ε, h) code for which

N(x)/b(x) > N(x)/b(x) − δ/3 > (log w0)/(kb(x)) > N(x)/b(x) − δ/2    (78)

and

h/(kb(x)) = N(x)/b(x) − 5δ/12 > N(x)/b(x) − δ/2.    (79)

Per Remark 1, this proves that for any x ∈ X′, the secrecy rate per unit cost N(x)/b(x) can be achieved by the proposed orthogonal coding scheme.

For x ∈ X \ X′, we have b(x) = 0. If N(x) = 0, by our convention N(x)/b(x) = 0, and there is nothing to prove from the achievability point of view. If, on the other hand, N(x) > 0, replace b(x) by some positive real b in the previous analysis. Then, the same orthogonal coding scheme can achieve the secrecy rate per unit cost N(x)/b for any b > 0. Letting b → 0 proves that in this case, the proposed orthogonal coding scheme can achieve an infinite secrecy rate per unit cost.

Combining the above two cases proves that for any x ∈ X, the secrecy rate per unit cost N(x)/b(x) can be achieved by the proposed orthogonal coding scheme. This completes the proof of the entire theorem.

C. Randomized Pulse Position Modulation

Note that for the orthogonal coding scheme described above, the length of the pulses (which we labelled k) needs to be large in general. For the Gaussian wiretap channel (27) under the quadratic cost function b(x) = x², as we shall see, it suffices to choose the length of the pulses to be one and pack the entire transmission cost kb(x) into a single transmit position. Now that the codebook is identical to that of the standard pulse position modulation (PPM) scheme [5, Ch. 8.5] but with an additional randomization of the transmit pulse position, we shall refer to this coding scheme as randomized PPM. Compared with the orthogonal coding scheme described in Sec. III-B, randomized PPM consumes only one-kth of the bandwidth, but the signaling scheme becomes increasingly more peaky as k becomes large.

The performance of randomized PPM can be analyzed in a similar way as that of the orthogonal coding scheme described in Sec. III-B. The fact that randomized PPM can achieve the secrecy capacity per unit cost for the Gaussian wiretap channel (27) under the quadratic cost function b(x) = x², however, can be proved via the following interesting thought experiment.



Consider the same orthogonal coding scheme as described in Sec. III-B. Now, instead of observing the output matrices

{Yi,j : 1 ≤ i ≤ m, 1 ≤ j ≤ k} and {Zi,j : 1 ≤ i ≤ m, 1 ≤ j ≤ k}

at the legitimate receiver and the eavesdropper respectively, the legitimate receiver and the eavesdropper can only observe the following "row averages"

{Yi = (1/√k) Σ_{j=1}^{k} Yi,j : 1 ≤ i ≤ m} and {Zi = (1/√k) Σ_{j=1}^{k} Zi,j : 1 ≤ i ≤ m}    (80)

respectively. We claim that this additional "forced processing" of the received signals at the legitimate receiver and the eavesdropper respectively does not decrease the achievable secrecy rate per unit cost, as can be seen as follows:

• At the eavesdropper, we have

I(W; {Zi}) ≤ I(W; {Zi,j})    (81)

due to the Markov relation W → {Zi,j} → {Zi}, so any additional "forced processing" at the eavesdropper can only make the coding scheme even more secure.

• At the legitimate receiver, it is well known [9, Ch. II.D] that

Yi = (1/√k) Σ_{j=1}^{k} Yi,j    (82)

is a sufficient statistic for the binary hypothesis test (59) whenever the channel between the inputs Xi,j and the outputs Yi,j is memoryless stationary and additive Gaussian. Thus, the decoding error probabilities remain the same with and without the additional "forced processing" at the legitimate receiver, assuming that decoding is done by performing m independent binary hypothesis tests (one on each row of the transmit codeword matrix) for both cases.

Finally, note that

Yi = Xi + N1,i and Zi = Xi + N2,i    (83)

where Xi = √k·x when i is the actual transmit position and Xi = 0 otherwise, and

N1,i = (1/√k) Σ_{j=1}^{k} N1,i,j and N2,i = (1/√k) Σ_{j=1}^{k} N2,i,j.    (84)

Further note that

b(√k·x) = kx² = kb(x)    (85)

for the quadratic cost function b(x) = x², and that {N1,i} and {N2,i} are i.i.d. N(0, σ1²) and N(0, σ2²), respectively. Therefore, with the additional "forced processing" (80) at the legitimate receiver and the eavesdropper, the orthogonal coding scheme described in Sec. III-B is effectively reduced to the randomized PPM scheme. Since this additional "forced processing" does not decrease the achievable secrecy rate per unit cost (as argued previously), we conclude that randomized PPM can achieve the secrecy capacity per unit cost for the Gaussian wiretap channel (27) under the quadratic cost function b(x) = x² as well.
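The reduction above is easy to check with a quick simulation: the scaled row average of k noisy observations of a constant pulse of amplitude x is a single Gaussian observation of amplitude √k·x with the original noise variance, and that single pulse costs exactly kb(x) under the quadratic cost. A minimal sketch (the parameter values below are our own, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
k, x, sigma1 = 64, 1.0, 1.0          # pulse length, pulse amplitude, noise std
trials = 200_000

# k channel uses of the active row: Y_{i,j} = x + N_{1,i,j}
Y = x + rng.normal(0.0, sigma1, size=(trials, k))

# "forced processing" (80): scaled row average
Ybar = Y.sum(axis=1) / np.sqrt(k)    # distributed as N(sqrt(k)*x, sigma1^2)

assert abs(Ybar.mean() - np.sqrt(k) * x) < 0.05   # effective amplitude sqrt(k)*x
assert abs(Ybar.std() - sigma1) < 0.05            # noise variance unchanged
# (85): a single PPM pulse of amplitude sqrt(k)*x costs k*b(x) under b(x) = x^2
assert np.isclose((np.sqrt(k) * x) ** 2, k * x ** 2)
```

The same averaging applied to the eavesdropper's rows yields Zi = Xi + N2,i, recovering (83)–(85).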

IV. GENERAL WIRETAP CHANNELS

A. Main Results and Discussions

For general memoryless stationary wiretap channels with a zero-cost input letter, the following secrecy rates per unit cost can be achieved by an orthogonal coding scheme.

Theorem 5: The secrecy capacity per unit cost Cs of the memoryless stationary wiretap channel (X, (Y, Z), PY,Z|X) with a zero-cost input letter "0" can be bounded from below as

Cs ≥ sup_{PX} [D(PY‖PY|X=0) − D(PZ‖PZ|X=0)] / E[b(X)]    (86)

where

PY = ∫_X PY|X=x dPX(x)    (87)

PZ = ∫_X PZ|X=x dPX(x)    (88)

and E[b(X)] = ∫_X b(x) dPX(x).    (89)

Furthermore, for any PX over X such that both D(PY‖PY|X=0) − D(PZ‖PZ|X=0) and E[b(X)] are positive, the secrecy rate per unit cost

Rs = [D(PY‖PY|X=0) − D(PZ‖PZ|X=0)] / E[b(X)]    (90)

can be achieved by an orthogonal coding scheme.

The lower bound on the right-hand side of (86) can be established via Theorem 1 by appropriately choosing a set of input/auxiliary random variables (U, V, X) in the Csiszar and Korner secrecy capacity expression (11). The coding scheme that we consider is similar to the one that we considered for the degraded case but with an additional randomization on the pulse shape. The details of the proof are provided in Sec. IV-B.

Note that if we choose X = x with probability one, the right-hand side of (86) reduces to (21), which was shown to be the secrecy capacity per unit cost when the wiretap channel is degraded. The following example, however, shows that further randomization of the pulse shape can strictly improve the achievable secrecy rate per unit cost when the wiretap channel is not degraded.

Example 2: Consider a binary memoryless stationary wiretap channel with X = Y = {0, 1}. The marginal channel transition probabilities are given by

PY|X=1 = (0.4, 0.6), PY|X=0 = (0.6, 0.4)
PZ|X=1 = (0.3, 0.7), PZ|X=0 = (0.5, 0.5).

Obviously, the channel is not degraded. The cost function is given by b(0) = 0 and b(1) = 1. Simple calculations give

[D(PY|X=1‖PY|X=0) − D(PZ|X=1‖PZ|X=0)] / b(1) ≈ −0.0012.

Therefore, orthogonal codes with constant pulse shape cannot achieve any positive secrecy rate per unit cost. On the other hand, let PX = (0.5, 0.5) and we get

PY = (0.5, 0.5) and PZ = (0.4, 0.6).


By Theorem 5, the following secrecy rate per unit cost is achievable by an orthogonal code with randomized pulse shape:

[D(PY‖PY|X=0) − D(PZ‖PZ|X=0)] / E[b(X)] ≈ 0.0006.

So at least for this simple example, further randomization of the pulse shape can strictly improve the achievable secrecy rate per unit cost.
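The two numerical values in Example 2 can be reproduced in a few lines of code; the sketch below (the helper name `kl` is ours) evaluates both divergence differences in nats:

```python
from math import log

def kl(p, q):
    """KL divergence D(p||q) in nats between distributions on a finite alphabet."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

PY1, PY0 = (0.4, 0.6), (0.6, 0.4)    # P_{Y|X=1}, P_{Y|X=0}
PZ1, PZ0 = (0.3, 0.7), (0.5, 0.5)    # P_{Z|X=1}, P_{Z|X=0}

# constant pulse shape X = 1 (cost b(1) = 1): the difference is negative
r_const = kl(PY1, PY0) - kl(PZ1, PZ0)

# randomized pulse shape with PX = (0.5, 0.5): mix the output distributions
PY = tuple(0.5 * a + 0.5 * b for a, b in zip(PY0, PY1))   # = (0.5, 0.5)
PZ = tuple(0.5 * a + 0.5 * b for a, b in zip(PZ0, PZ1))   # = (0.4, 0.6)
r_rand = (kl(PY, PY0) - kl(PZ, PZ0)) / 0.5                # E[b(X)] = 0.5

print(round(r_const, 4), round(r_rand, 4))   # -0.0012 0.0006
```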

B. Proof of Theorem 5

Let us first prove the achievable secrecy rate per unit cost expression on the right-hand side of (86). For general memoryless stationary wiretap channels, by (11) and (20) we have

Cs ≥ [I(V;Y) − I(V;Z)] / E[b(X)]    (91)

for any joint distribution PY,Z,X,V = PY,Z|X PX|V PV. Consider the following distribution for (V, X): V is a binary random variable such that

PV(1) = 1 − PV(0) = 1/m    (92)

for some positive integer m, and X ∼ PX|V=1 if V = 1 and X = 0 with probability one if V = 0. For this particular choice of distribution for (V, X), we have

PY|V=0 = PY|X=0    (93)
PZ|V=0 = PZ|X=0    (94)
PY|V=1 = ∫_X PY|X=x dPX|V=1(x)    (95)
PZ|V=1 = ∫_X PZ|X=x dPX|V=1(x)    (96)

and E[b(X)] = (1/m)·E[b(X)|V = 1].    (97)

By [2, Eq. (2.13)], we also have

lim_{m→∞} (m·I(V;Y)) = D(PY|V=1‖PY|V=0)    (98)

and lim_{m→∞} (m·I(V;Z)) = D(PZ|V=1‖PZ|V=0).    (99)

Substituting (97)–(99) into (91) gives

Cs ≥ lim_{m→∞} [m·(I(V;Y) − I(V;Z))] / E[b(X)|V = 1]    (100)
= [D(PY|V=1‖PY|V=0) − D(PZ|V=1‖PZ|V=0)] / E[b(X)|V = 1]    (101)
= [D(PY|V=1‖PY|X=0) − D(PZ|V=1‖PZ|X=0)] / E[b(X)|V = 1]    (102)

for any PX|V=1 over X. Renaming PX|V=1, PY|V=1, and PZ|V=1 as PX, PY, and PZ respectively completes the proof of (86).

To prove that the right-hand side of (90) can be achieved by an orthogonal coding scheme, we shall consider the following modification of the orthogonal coding scheme considered for the degraded case.

Let m = w0·l0 for some integers w0 and l0 such that

w0 ≈ exp(k(D(PY‖PY|X=0) − D(PZ‖PZ|X=0)))    (103)
l0 ≈ exp(kD(PZ‖PZ|X=0))    (104)

and t0 be an integer such that

t0 ≈ exp(kI(X;Z)).    (105)

Let C = {ck(1), . . . , ck(t0)} be a collection of t0 length-k vectors from X^k.

Codebook. Each codeword is identified by an integer triple (w, l, t), where w ∈ {1, . . . , w0}, l ∈ {1, . . . , l0}, and t ∈ {1, . . . , t0}, and corresponds to an m × k matrix

{xi,j : 1 ≤ i ≤ m, 1 ≤ j ≤ k}.

Denote by x_i^k the ith row of the codeword matrix {xi,j}. For codeword (w, l, t),

x_i^k = ck(t) if i = (w − 1)l0 + l, and x_i^k = (0, . . . , 0) otherwise.    (106)

Encoding. Given message W, randomly, uniformly, and independently choose an integer L ∈ {1, . . . , l0} and an integer T ∈ {1, . . . , t0}, and send codeword (W, L, T) through the channel. The randomness used for choosing L and T is intrinsic to the transmitter and is not shared with either the legitimate receiver or the eavesdropper. Note that:

1) even though the codewords are not necessarily orthogonal to each other, the codewords representing different messages remain orthogonal to each other; and

2) compared with the orthogonal coding scheme proposed for the degraded case, additional randomization on the "shape" of the pulse is used in the modified scheme.
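The codebook (106) and the randomized encoder can be sketched in a few lines. The toy parameters below are our own (in the proof, w0, l0, and t0 grow exponentially in k per (103)–(105)), and the guard that keeps every pulse nonzero is a convenience for this demo only:

```python
import numpy as np

rng = np.random.default_rng(1)
w0, l0, t0, k = 4, 3, 5, 8                 # toy sizes
m = w0 * l0

# C = {c_k(1), ..., c_k(t0)}: pulse shapes with i.i.d. entries ~ PX = (0.5, 0.5)
C = rng.choice([0.0, 1.0], size=(t0, k))
C[C.sum(axis=1) == 0, 0] = 1.0             # demo guard: avoid an all-zero pulse

def encode(w, rng):
    """Codeword matrix (106): pulse c_k(T) in row (w-1)*l0 + L, zeros elsewhere.

    L and T are drawn uniformly at the transmitter and shared with no one."""
    L = rng.integers(1, l0 + 1)
    T = rng.integers(1, t0 + 1)
    X = np.zeros((m, k))
    X[(w - 1) * l0 + L - 1] = C[T - 1]
    return X

X = encode(w=2, rng=rng)
i = np.flatnonzero(X.any(axis=1))          # indices of nonzero rows (0-based)
assert len(i) == 1                         # distinct messages stay orthogonal
# decoder's message estimate: smallest integer >= (row index + 1) / l0
assert -(-(i[0] + 1) // l0) == 2
```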

Decoding at the legitimate receiver. Given the matrix of observations

{Yi,j : 1 ≤ i ≤ m, 1 ≤ j ≤ k}

the decoder performs m independent binary hypothesis tests, one on each row of the transmitted codeword matrix:

Hi,0 : x_i^k = (0, . . . , 0)
Hi,1 : x_i^k uniformly drawn from C    (107)

for i = 1, . . . , m. If one and only one Hi,1 was claimed (denoted by Hî,1), we declare the transmitted message to be the smallest integer greater than or equal to î/l0. Otherwise, an error is declared.

Performance analysis. Using a random-coding argument for

which the entries of the vectors from C are independently generated according to PX, it can be shown that for sufficiently large k:

1) the cost associated with each vector from C is approximately kE[b(X)]; and

2) x_i^k, when uniformly drawn from C, has approximately i.i.d. entries according to PX.

Then, following the same footsteps as those for the degraded case, it can be shown that the secrecy rate per unit cost (90) can be achieved by the proposed coding scheme. The technical details (which would make the meaning of the approximations mentioned above precise) are omitted from the paper. This completes the proof of the theorem.


V. CONCLUDING REMARKS

This paper introduced a concept of secrecy capacity per unitcost to study cost-efficient wide-band secrecy communication.For degraded memoryless stationary wiretap channels with azero-cost input letter, it was shown that an orthogonal codingscheme with randomized pulse position and constant pulseshape achieves the secrecy capacity per unit cost. For generalmemoryless stationary wiretap channels, the performance oforthogonal codes were studied, and the benefit of furtherrandomizing the pulse shape is demonstrated via a simpleexample.The results of this paper suggest several research directions

which may be worthy of further exploring:

• First, the secrecy capacity per unit cost defined in this paper requires the transmitted message to be asymptotically perfectly secret from the eavesdropper (cf. (17)). More generally, one may consider an imperfect secrecy setting where the constraints (12) and (13) are replaced by

R ≥ (ln w0)/ν > R − δ    (108)

and h/ν > Re − δ    (109)

respectively for some rate-equivocation per unit cost pair (R, Re). This setting includes Verdu's channel capacity per unit cost (Re = 0) and the secrecy capacity per unit cost defined in this paper (Re = R) as two extreme scenarios. For degraded memoryless stationary wiretap channels with a zero-cost input letter, both Verdu's channel capacity per unit cost and the secrecy capacity per unit cost defined in this paper can be achieved by highly structured orthogonal codes. It would be interesting to see whether the other boundary points of the capacity-equivocation per unit cost region can be achieved by orthogonal codes as well.

• Second, a stronger notion of secrecy capacity per unit cost can be defined by replacing the constraint (13) with

h/ν > Rs − δ/ν.    (110)

As such, the (unnormalized) mutual information between the message W and the received vector Z^n at the eavesdropper must satisfy

I(W;Z^n) = H(W) − H(W|Z^n)    (111)
< log w0 − h    (112)
< νRs − (νRs − δ)    (113)
= δ.    (114)

For degraded memoryless stationary wiretap channels with a zero-cost input letter, it would be interesting to see whether orthogonal codes can in fact achieve this stronger notion of secrecy capacity per unit cost without any "privacy amplification" [8].

• Finally, whether orthogonal codes can achieve the secrecy capacity per unit cost for general memoryless stationary wiretap channels with a zero-cost input letter remains an open problem.

APPENDIX A
PROOF OF REMARK 1

Suppose that for any 0 < δ ≤ Rs there exists a positive real β and a positive integer k0 such that for any integer k ≥ k0 an (n, w0, kβ, ε, h) code can be found for which the constraints (18) and (19) are satisfied. Let

ν0 := max(k0, 2Rs/δ)·β.    (115)

When ν = kβ for some k ≥ k0, by (18) and (19) we have

Rs ≥ (log w0)/ν > Rs − δ/2 > Rs − δ    (116)

and h/ν > Rs − δ/2 > Rs − δ    (117)

for the (n, w0, kβ = ν, ε, h) code.

When kβ < ν < (k + 1)β for some k ≥ k0 and such that ν ≥ ν0, we have

k + 1 > ν/β ≥ ν0/β ≥ 2Rs/δ    (118)

and hence

(Rs − δ/2)·(kβ/ν) > (Rs − δ/2)·(k/(k + 1))    (119)
= (Rs − δ/2)·(1 − 1/(k + 1))    (120)
> (Rs − δ/2)·(1 − δ/(2Rs))    (121)
> Rs − δ.    (122)

In this case, the (n, w0, kβ, ε, h) code is also an (n, w0, ν, ε, h) code for which

Rs ≥ (log w0)/(kβ) > (log w0)/ν > (Rs − δ/2)·(kβ/ν) > Rs − δ    (123)

and h/ν > (Rs − δ/2)·(kβ/ν) > Rs − δ.    (124)

Combining the above two cases completes the proof of theremark.
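The inequality chain (118)–(122) is pure algebra; a quick randomized check (the test harness below is our own) confirms that (Rs − δ/2)·k/(k + 1) > Rs − δ whenever k + 1 > 2Rs/δ:

```python
import random

random.seed(0)
for _ in range(10_000):
    Rs = random.uniform(0.01, 10.0)
    delta = random.uniform(1e-4, Rs)                  # 0 < delta <= Rs
    k = int(2 * Rs / delta) + random.randint(1, 100)  # guarantees k + 1 > 2*Rs/delta
    lhs = (Rs - delta / 2) * k / (k + 1)              # steps (119)-(120)
    assert lhs > Rs - delta                           # conclusion (122)
```

The strict gap in the last step is δ²/(4Rs), which is exactly what expanding (Rs − δ/2)(1 − δ/(2Rs)) leaves over Rs − δ.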

APPENDIX B
PROOF OF THEOREM 1

Let us first show that for any given β > 0 and any achievable secrecy rate Rs with average cost per symbol not exceeding β, Rs/β is an ε-achievable secrecy rate per unit cost for any 0 < ε < 1. Fix δ > 0, and let ε′ = min(1, β)ε. Since Rs is an achievable secrecy rate with average cost per symbol not exceeding β, there exists a positive integer n0 such that for any integer n ≥ n0 an (n, w0, nβ, ε′, h) code can be found for which

Rs ≥ (ln w0)/n > Rs − βδ/2    (125)

and h/n > Rs − βδ/2.    (126)

This immediately gives an (n, w0, nβ, ε, h) code for which

Rs/β ≥ (ln w0)/(nβ) > Rs/β − δ/2    (127)

and h/(nβ) > Rs/β − δ/2.    (128)


Per Remark 1, this proves that Rs/β is an achievable secrecy rate per unit cost. Taking the supremum over all achievable secrecy rates Rs with average cost per symbol not exceeding β and subsequently over all β > 0, we conclude that

Cs ≥ sup_{β>0} Cs(β)/β.    (129)

To prove the converse part of the theorem, let Rs be an achievable secrecy rate per unit cost. By definition, for any 0 < ε < 1 and any sufficiently small δ > 0 there exists a positive real ν0 such that for any ν ≥ ν0 an (n, w0, ν, ε, h) code can be found for which the constraints (12) and (13) are satisfied. Recall that the constraints (12) and (13) together imply (17). By Fano's inequality, we have

H(W|Y^n) ≤ εH(W) + ln 2.    (130)

It follows that

(1 − ε)H(W) ≤ I(W;Y^n) + ln 2    (131)

and hence

Rs − δ < (ln w0)/ν = H(W)/ν    (132)
≤ (1/(1 − ε))·[I(W;Y^n)/ν + (ln 2)/ν]    (133)
≤ (1/(1 − ε))·[(I(W;Y^n) − I(W;Z^n))/ν + δ + (ln 2)/ν].    (134)

For memoryless stationary wiretap channels, Csiszar and Korner [3] showed that

I(W;Y^n) − I(W;Z^n) ≤ n·sup_{E[b(X)]≤ν/n} (I(V;Y|U) − I(V;Z|U))    (135)

where U and V are auxiliary random variables satisfying the Markov chain U → V → X → (Y, Z). Substituting (135) into (134) gives

(1 − ε)(Rs − δ) − (δ + (ln 2)/ν)
< (n/ν)·sup_{E[b(X)]≤ν/n} (I(V;Y|U) − I(V;Z|U))    (136)
≤ sup_{β>0} [(1/β)·sup_{E[b(X)]≤β} (I(V;Y|U) − I(V;Z|U))]    (137)
= sup_{β>0} Cs(β)/β.    (138)

Let ν go to infinity and subsequently δ and ε go to zero, and take the supremum over all achievable secrecy rates per unit cost Rs. We conclude that

Cs ≤ sup_{β>0} Cs(β)/β.    (139)

Combining (129) and (139) completes the proof of Theorem 1.

APPENDIX C
PROOF OF LEMMA 1

By the symmetry of the code construction, the value of the conditional entropy H(L|W = w, {Zi,j}) does not depend on the realization w, so we have

H(L|W, {Zi,j}) = H(L|W = 1, {Zi,j}).    (140)

Given W = 1 and the matrix of observations {Zi,j}, consider the following l0 binary hypotheses, each on one of the first l0 rows of the transmitted codeword matrix:

Hl,0 : x_l^k = (0, . . . , 0)
Hl,1 : x_l^k = (x, . . . , x)    (141)

for l = 1, . . . , l0. The conditional error probabilities of these tests are denoted by

α_l^(k) = Pr(Hl,1|Hl,0)    (142)

and β_l^(k) = Pr(Hl,0|Hl,1).    (143)

Note that Z_l^k is i.i.d. according to PZ|X=0 under Hl,0 and i.i.d. according to PZ|X=x under Hl,1. Fix δ′ > 0. By the Chernoff-Stein lemma [1], a decision rule can be found such that β_l^(k) → 0 in the limit as k → ∞ and

α_l^(k) ≤ e^{−k(D(PZ|X=x‖PZ|X=0)−δ′)}    (144)

for sufficiently large k.

For l = 1, . . . , l0, let Nl = 1 if Hl,1 is declared and Nl = 0 if Hl,0 is declared. Then, for all l ≠ L we have

E[Nl] = α_l^(k)    (145)

and Var[Nl] ≤ E[Nl²] = α_l^(k).    (146)

Further let

N := Σ_{l≠L} Nl.    (147)

Since Nl, l ≠ L, are i.i.d., we have

E[N] = Σ_{l≠L} E[Nl] = (l0 − 1)·α_l^(k)    (148)
≤ l0·e^{−k(D(PZ|X=x‖PZ|X=0)−δ′)}    (149)

and Var[N] = Σ_{l≠L} Var[Nl] ≤ (l0 − 1)·α_l^(k)    (150)
≤ l0·e^{−k(D(PZ|X=x‖PZ|X=0)−δ′)}.    (151)

It follows that

Pr{N ≥ 2l0·e^{−k(D(PZ|X=x‖PZ|X=0)−δ′)}}
≤ Pr{N − E[N] ≥ l0·e^{−k(D(PZ|X=x‖PZ|X=0)−δ′)}}    (152)
≤ Pr{|N − E[N]| ≥ l0·e^{−k(D(PZ|X=x‖PZ|X=0)−δ′)}}    (153)
≤ Var[N] / (l0·e^{−k(D(PZ|X=x‖PZ|X=0)−δ′)})²    (154)
≤ 1 / (l0·e^{−k(D(PZ|X=x‖PZ|X=0)−δ′)})    (155)
< 1 / e^{k(δb(x)/12 + δ′)}    (156)
→ 0    (157)


in the limit as k → ∞, where (152) follows from (149), (154) follows from Chebyshev's inequality, (155) follows from (151), and (156) follows from the second inequality in (57).

Let E be a random variable such that E = 1 if

N < 2l0·e^{−k(D(PZ|X=x‖PZ|X=0)−δ′)}

and HL,1 is declared, and E = 0 otherwise. We have

H(L|W = 1, {Zi,j}) ≤ H(L, E|W = 1, {Zi,j})    (158)
= H(E|W = 1, {Zi,j}) + H(L|W = 1, {Zi,j}, E)    (159)
≤ H(E) + Pr{E = 0}·H(L) + H(L|W = 1, {Zi,j}, E = 1).    (160)

Note that

H(E) ≤ ln 2    (161)

and H(L) = ln l0 < k(D(PZ|X=x‖PZ|X=0) + δb(x)/6).    (162)

By the union bound, the probability Pr{E = 0} can be bounded from above by

Pr{HL,0 is declared} + Pr{N ≥ 2l0·e^{−k(D(PZ|X=x‖PZ|X=0)−δ′)}}

which tends to zero in the limit as k → ∞. Furthermore,

H(L|W = 1, {Zi,j}, E = 1) < ln(2l0·e^{−k(D(PZ|X=x‖PZ|X=0)−δ′)} + 1)    (163)
= H(L) − k(D(PZ|X=x‖PZ|X=0) − ε′)    (164)

where

ε′ = (1/k)·ln(2e^{kδ′} + e^{kD(PZ|X=x‖PZ|X=0)}/l0)    (165)
< (1/k)·ln(2e^{kδ′} + e^{−kδb(x)/12})    (166)
→ δ′    (167)

in the limit as k → ∞. We thus have

H(L|W = 1, {Zi,j}) < ln 2 + k·Pr{E = 0}·(D(PZ|X=x‖PZ|X=0) + δb(x)/6) + H(L) − k(D(PZ|X=x‖PZ|X=0) − ε′)    (168)
= H(L) − k(D(PZ|X=x‖PZ|X=0) − ε″)    (169)

where

ε″ := ε′ + (ln 2)/k + Pr{E = 0}·(D(PZ|X=x‖PZ|X=0) + δb(x)/6)    (170)
→ δ′    (171)

in the limit as k → ∞. Letting δ′ → 0 completes the proof of the lemma.
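The concentration step (147)–(157) is a plain Chebyshev bound on the number N of false alarms among the l0 − 1 inactive rows. A small Monte Carlo illustration (the parameter values are ours; in the proof α stands in for the Chernoff-Stein exponent e^{−k(D−δ′)}):

```python
import numpy as np

rng = np.random.default_rng(2)
l0, alpha = 10_000, 1e-3       # number of rows, per-row false-alarm probability
trials = 5_000

# N = sum of (l0 - 1) i.i.d. Bernoulli(alpha) indicators, as in (147)
N = rng.binomial(l0 - 1, alpha, size=trials)

# Chebyshev (152)-(154): Pr{N >= 2*l0*alpha} <= Var[N] / (l0*alpha)^2
var_bound = (l0 - 1) * alpha                 # (150): Var[N] <= (l0 - 1)*alpha
cheb = var_bound / (l0 * alpha) ** 2         # here 0.09999 <= 1/(l0*alpha), cf. (155)
empirical = np.mean(N >= 2 * l0 * alpha)

assert empirical <= cheb
```

The empirical frequency sits far below the Chebyshev bound, consistent with the bound being loose but sufficient for the vanishing-probability argument.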

REFERENCES

[1] H. Chernoff, "A measure of asymptotic efficiency for tests of a hypothesis based on a sum of observations," Ann. Math. Statistics, vol. 23, no. 4, pp. 493–507, Dec. 1952.

[2] I. Csiszar, "I-divergence geometry of probability distributions and minimization problems," Ann. Probability, vol. 3, no. 1, pp. 146–158, Feb. 1975.

[3] I. Csiszar and J. Korner, "Broadcast channels with confidential messages," IEEE Trans. Inform. Theory, vol. IT-24, no. 3, pp. 339–348, May 1978.

[4] I. Csiszar and J. Korner, Information Theory: Coding Theorems for Discrete Memoryless Systems. New York, NY: Academic, 1981.

[5] R. G. Gallager, Principles of Digital Communication. Cambridge, UK: Cambridge University Press, 2008.

[6] S. Kullback, Information Theory and Statistics. New York, NY: Dover, 1968.

[7] T. Liu and P. Viswanath, "Opportunistic orthogonal writing on dirty paper," IEEE Trans. Inform. Theory, vol. 52, no. 5, pp. 1828–1846, May 2006.

[8] U. Maurer and S. Wolf, "Information-theoretic key agreement: From weak to strong secrecy for free," in Proc. EUROCRYPT'00, Springer LNCS, vol. 1807, pp. 351–368, 2000.

[9] H. V. Poor, An Introduction to Signal Detection and Estimation, 2nd Edition. New York, NY: Springer, 1994.

[10] C. E. Shannon, "A mathematical theory of communication," Bell Syst. Tech. Journal, vol. 27, pp. 379–423 and 623–656, July and Oct. 1948.

[11] A. J. Stam, "Some inequalities satisfied by the quantities of information of Fisher and Shannon," Inform. Control, vol. 2, no. 2, pp. 102–112, June 1959.

[12] S. Verdu, "On channel capacity per unit cost," IEEE Trans. Inform. Theory, vol. 36, no. 5, pp. 1019–1030, Sept. 1990.

[13] A. D. Wyner, "The wire-tap channel," Bell Syst. Tech. Journal, vol. 54, no. 8, pp. 1355–1387, Oct. 1975.

Mustafa El-Halabi was born in Beirut, Lebanon. He finished his Bachelor's degree in Electrical Engineering and his Master's degree in Computer and Communication Engineering at the Department of Electrical and Computer Engineering at the American University of Beirut (AUB). He is currently a PhD candidate at the Department of Electrical and Computer Engineering at Texas A&M University, College Station. His research interests are in the areas of wireless communication systems, information theory, and network coding, with emphasis on physical-layer security and cryptography. Recently, he has been investigating finite bit approximations for communication in the presence of side information, and working on new techniques pertaining to cyber security for smart grid.

Tie Liu received his B.S. (1998) and M.S. (2000) degrees, both in Electrical Engineering, from Tsinghua University, Beijing, China and a second M.S. degree in Mathematics (2004) and Ph.D. degree in Electrical and Computer Engineering (2006) from the University of Illinois at Urbana-Champaign. Since August 2006 he has been with Texas A&M University, where he is currently an Associate Professor with the Department of Electrical and Computer Engineering. His primary research interest is in understanding the fundamental performance limits of communication and networked systems via the lens of information theory.

Dr. Liu is a recipient of the M. E. Van Valkenburg Graduate Research Award (2006) from the University of Illinois at Urbana-Champaign and the Faculty Early Career Development (CAREER) Award (2009) from the National Science Foundation.


Costas N. Georghiades received the B.E. degree with distinction from the American University of Beirut in 1980, and the M.S. and D.Sc. degrees from Washington University, St. Louis, MO, in 1983 and 1985, respectively, all in Electrical Engineering. Since 1985 he has been with the Electrical and Computer Engineering Department at Texas A&M University, where he served as Department Head. He is currently Associate Dean for Research in the College of Engineering and holder of the Delbert A. Whitaker Endowed Chair. His general interests are in the application of information, communication and estimation theories to the study of communication systems.

Dr. Georghiades is a Fellow of the IEEE and a registered Professional Engineer in Texas. Over the years he served in various editorial positions, including with the IEEE Transactions on Communications, the IEEE Transactions on Information Theory and as Editor-in-Chief of IEEE Communication Letters. He has been involved in organizing a number of conferences, including as General Co-Chair for the 2004 IEEE Information Theory Workshop, as Technical Program Co-Chair for the 2005 IEEE Communication Theory Workshop and as General Co-Chair of the 2010 IEEE International Symposium on Information Theory. In other service, he served in the IEEE Communications Society's Awards Committee, as Chair of the Communication Theory Technical Committee and as Chair of the Fellows Evaluation Committee of the IEEE Information Theory Society. He currently serves in the IEEE Hamming Medal committee, the Communications Society's Awards committee and as Chair of the Wireless Communication Letters Steering committee. In 2012 he received the IEEE Communications Society's Communication Theory Technical Committee Service Award.
