Optimal Superimposed Training Design for Spatially Correlated Fading MIMO Channels


3206 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 7, NO. 8, AUGUST 2008

Optimal Superimposed Training Design for Spatially Correlated Fading MIMO Channels

Vu Nguyen, Hoang D. Tuan, Member, IEEE, Ha H. Nguyen, Senior Member, IEEE, and Nguyen N. Tran, Student Member, IEEE

Abstract—The problem of channel estimation for spatially correlated fading multiple-input multiple-output (MIMO) systems is considered. Based on the channel's second-order statistics, the minimum mean-square error (MMSE) channel estimator that works with the superimposed training signal is first developed. The problem of designing the optimal superimposed signal is then addressed and solved with an iterative optimization algorithm. Results show that under the constraint of equal training power and bandwidth efficiency, our optimal design of the superimposed training signal leads to a significant reduction in channel estimation error when compared to the conventional design of time-multiplexing training, especially for slowly time-varying channels with a large coherence time. The issue of power allocation between the information-bearing and training signals for detection enhancement is also investigated. Simulation results demonstrate excellent bit-error-rate performance of orthogonal space-time block codes with our proposed channel estimation.

Index Terms—MIMO channel, spatial correlation, channel estimation, MMSE estimation, training signal, training design, time-multiplexing training, superimposed training.

I. INTRODUCTION

THE use of multiple antennas at both the transmitter and the receiver to create the so-called multiple-input multiple-output (MIMO) communication systems has been shown to greatly increase the data rate of the wireless transmission medium [21], [30]. This is especially true when the channel fades among the transmitter-receiver pairs are independently Rayleigh distributed [5], [30], [38]. In particular, it is shown in [30] that the capacity of a MIMO wireless channel increases linearly with the number of antennas.

The assumption of independent fades requires that the antennas be placed sufficiently far apart, both at the transmitter and the receiver. In many practical applications, meeting such requirements might be very expensive and impractical (such as for the antennas in hand-held mobile units). It is therefore more practical and useful to consider spatial correlations among different sub-channels of the MIMO channel matrix [5], [13], [28]. Compared to an independent fading MIMO channel, the results in [3], [10]–[12], [28] show that the capacity of a spatially-correlated fading MIMO channel is substantially reduced.

Manuscript received March 2, 2007; revised August 1, 2007; accepted October 1, 2007. The associate editor coordinating the review of this paper and approving it for publication is D. Dardari. This work is supported by the Australian Research Council under grant ARC Discovery Project 0556174. A part of this work was presented at the IEEE Second International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, St. Thomas, U.S. Virgin Islands, USA, 12-14 December 2007.

Vu Nguyen, Hoang D. Tuan, and Nguyen N. Tran are with the School of Electrical Engineering and Telecommunications, the University of New South Wales, Sydney, NSW 2052, Australia (e-mail: {q.nguyen, nam.nguyen}@student.unsw.edu.au, [email protected]).

Ha H. Nguyen is with the Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Dr., Saskatoon, SK, Canada S7N 5A9 (e-mail: [email protected]).

Digital Object Identifier 10.1109/TWC.2008.070250.

Capacity reduction due to spatially-correlated fading can be partially alleviated by precoding the transmitted signal [20]. This technique, however, requires knowledge of the channel state information at the transmitter, which is not always available. Furthermore, the MIMO channel capacity can be further reduced if inaccurate channel state information is obtained at the receiver [30]. In other words, accurate channel estimation is very important to fully exploit the advantages of MIMO wireless communications. The correlated fading channel is often estimated by a training sequence, which can be either time-multiplexing (TM) training (see e.g., [26], [32] for multiple-input single-output (MISO) channels and [7], [19] for MIMO channels), frequency-multiplexing [14], [18] or superimposed (SP) training (see e.g., [17], [36] for single-input single-output (SISO) channels and [33] for MISO channels). In superimposed training, the training symbols are superimposed on the precoded data for transmission. In fact, superimposed training includes both time-multiplexing and frequency-multiplexing as special cases, which correspond to sending the non-zero training symbols when the data symbols are zero, or sending the non-zero training symbols over subcarriers that are not occupied by the data symbols (i.e., pilot subcarriers). Because superimposed training is a general and powerful framework, it has recently received growing interest in the research community [14], [15], [18], [33].

In SP training, since the received signal is a superposition of the data-bearing signal, training signal and noise, a popular design approach is to decouple channel and symbol estimation [14], [18], [22], [37]. This can be done by designing the precoding and training matrices so that the data-bearing and training signals belong to complementary signal subspaces. Then, the data-bearing signal, which is considered as unwanted noise in channel estimation, can be completely removed and channel estimation is carried out based on the training symbols. An alternative approach is to perform joint channel/symbol estimation at the receiver and design the training signal accordingly. Due to its more complicated processing and marginal advantage [37], joint channel/symbol estimation is less preferred than decoupled channel and symbol estimation. This paper, therefore, is only concerned with the decoupled channel and symbol estimation approach.

1536-1276/08$25.00 © 2008 IEEE

For the estimation of independent Rayleigh fading channels, the works in [14], [18], [37] conventionally treat the received signal along time, i.e., as a concatenation of the columns of the received signal matrix. The subspaces of the data-bearing and training signals thus depend on the unknown channel matrix. Consequently, some special structures on both the precoding and training matrices are imposed. This, together with several complex arrangements, makes these matrices commutative with the channel matrix and hence decouples the two subspaces of the data-bearing and training signals. A novel approach has also been recently developed in [22], [25], [31], where the received signal is viewed along space, i.e., as a concatenation of the rows of the received signal matrix. The most important consequence of this view is that the two subspaces of interest are independent of the channel matrix. This means that, if designed properly, the orthogonality of the precoding and training matrices already guarantees subspace complementarity. Therefore there is much more freedom in the optimal design of these matrices. Indeed, the results in [22], [31] demonstrate that this design approach is superior to the designs in [14], [18], [37] in terms of estimation performance, symbol detectability and computational complexity.

This paper adopts the approach in [22], [31] to design the optimal superimposed training signal for the channel estimation of correlated block-fading MIMO channels. For uncorrelated block-fading MIMO channels, precoding is well known to be useless [4], [6], [16], [18]. For these MIMO channels, the optimal TM training and superimposed training can be easily seen to be the same, with a scaled identity matrix as the optimal training matrix [24], [25]. However, as pointed out in [9], the design of training signals for correlated fading MIMO channels is quite challenging in general. This is due to the large number of channel parameters involved and the complex nature of the correlated channel coefficients. In fact, while the design of the TM training signal is quite straightforward for independent fading channels [8], it is still a difficult problem for correlated fading channels [9]. In particular, sub-optimal TM training signals for spatially-correlated fading channels were proposed in [9] for only the two extreme cases of low and high signal-to-noise ratios (SNRs). It is still not clear what the optimal TM training signal is for a given SNR level. Similarly, while the design of the optimal superimposed training signal is well understood for independent fading channels [14], [22], [31], [37], the design for spatially-correlated fading channels at any SNR level has not been addressed. This paper solves this challenging design problem by developing an efficient iterative optimization algorithm to find the solution. Results show that the proposed design of the superimposed training signal performs much better than the TM training-based estimation considered in [9].

The remainder of this paper is organized as follows. Section II introduces the system model and describes the design problem. The optimal superimposed training design is presented in Section III with an iterative optimization algorithm. The issue of optimal power allocation for the training signal is addressed in Section IV. Section V provides numerical results to illustrate the advantages of the proposed design over the existing ones. Section VI concludes the paper.

Notation: Boldface upper (lower) letters denote matrices (column vectors). The operation $\mathrm{vec}(\cdot)$ means matrix vectorization, which forms a column vector by vertically stacking the columns of a matrix. For a matrix $\mathbf{A}$, its transposition, Hermitian adjoint and trace are denoted by $\mathbf{A}^T$, $\mathbf{A}^H$ and $\mathrm{trace}(\mathbf{A})$, respectively. $\mathbf{I}_N$ is the identity matrix of size $N \times N$, $\mathbf{0}_{N\times M}$ is the $N \times M$ zero matrix and $(\ast)_{N\times M}$ stands for any matrix of size $N \times M$. $\mathcal{U}_N$ and $\mathcal{D}_N$ denote the sets of $N \times N$ unitary matrices and diagonal matrices, respectively. For two Hermitian matrices $\mathbf{X}$ and $\mathbf{Y}$, $\mathbf{X} \le \mathbf{Y}$ means that $\mathbf{Y} - \mathbf{X}$ is positive semi-definite. Similarly, $\mathbf{X} < \mathbf{Y}$ implies that $\mathbf{Y} - \mathbf{X}$ is positive definite. The symbol $\otimes$ is used for the Kronecker matrix product, while $E(\mathbf{A})$ is the expectation of the random matrix $\mathbf{A}$. For any $x$, define $(x)^+ = \max(x, 0)$.

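Furthermore, some properties of Kronecker product transformations and positive definite matrices used in this paper are as follows:

(P1) $(\mathbf{A} \otimes \mathbf{B})(\mathbf{C} \otimes \mathbf{D}) = (\mathbf{A}\mathbf{C}) \otimes (\mathbf{B}\mathbf{D})$.

(P2) $(\mathbf{A} \otimes \mathbf{B})^H = \mathbf{A}^H \otimes \mathbf{B}^H$.

(P3) $(\mathbf{A} \otimes \mathbf{B})^{-1} = \mathbf{A}^{-1} \otimes \mathbf{B}^{-1}$.

(P4) $\mathrm{trace}(\mathbf{A} \otimes \mathbf{B}) = \mathrm{trace}(\mathbf{A})\,\mathrm{trace}(\mathbf{B})$.

(P5) $\mathrm{vec}(\mathbf{A}\mathbf{X}\mathbf{B}) = (\mathbf{B}^T \otimes \mathbf{A})\,\mathrm{vec}(\mathbf{X})$.

(P6) If $\mathbf{U}_N \in \mathcal{U}_N$ and $\mathbf{U}_M \in \mathcal{U}_M$, then $\mathbf{U}_M \otimes \mathbf{U}_N \in \mathcal{U}_{NM}$.

(P7) If $\mathbf{0} < \mathbf{X} < \mathbf{Y}$, then $\mathrm{trace}(\mathbf{X}) < \mathrm{trace}(\mathbf{Y})$ and $\mathrm{trace}(\mathbf{X}^{-1}) > \mathrm{trace}(\mathbf{Y}^{-1})$.

(P8) If $\mathbf{X} \ge \mathbf{0}$, then $\mathbf{X} \otimes \mathbf{I}_N \ge \mathbf{0}$, $\forall N$.

(P9) If $\mathbf{0} < \mathbf{X} \in \mathbb{C}^{N\times N}$, then $\mathrm{trace}(\mathbf{X}^{-1}) \ge \sum_{i=1}^{N} 1/\mathbf{X}(i,i)$.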
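Several of the properties above are easy to confirm numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `crandn` and `vec` and the matrix sizes are illustrative choices, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def crandn(*shape):
    """Complex Gaussian test matrix (illustrative helper, not from the paper)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def vec(A):
    """Stack the columns of A into one vector (column-major order)."""
    return A.reshape(-1, order="F")

A, C = crandn(3, 3), crandn(3, 3)
B, D = crandn(2, 2), crandn(2, 2)
X = crandn(3, 2)

# (P1) mixed-product rule
p1 = np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
# (P3) inverse of a Kronecker product
p3 = np.allclose(np.linalg.inv(np.kron(A, B)),
                 np.kron(np.linalg.inv(A), np.linalg.inv(B)))
# (P4) the trace factorizes
p4 = np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B))
# (P5) vec(A X B) = (B^T kron A) vec(X)
p5 = np.allclose(vec(A @ X @ B), np.kron(B.T, A) @ vec(X))
# (P9) trace(X^{-1}) >= sum_i 1/X(i,i) for a Hermitian positive definite X
G = crandn(4, 4)
Xpd = G @ G.conj().T + np.eye(4)
p9 = np.trace(np.linalg.inv(Xpd)).real >= np.sum(1.0 / np.diag(Xpd).real)
```

Note that `vec` uses column-major ordering, which is what (P5) requires; NumPy's default row-major `reshape` would not satisfy the identity.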

II. SYSTEM MODEL

Consider a narrowband frequency-flat block-fading MIMO channel with $N$ transmit and $M$ receive antennas (see Fig. 1). The information-bearing symbols are grouped into blocks of size $N_s$, namely $\mathbf{s}(k) = [s(kN_s), s(kN_s+1), \ldots, s(kN_s+N_s-1)]^T$, where $k$ denotes the block index. Each block $\mathbf{s}(k)$ is then encoded and/or multiplexed in space and time, which is generally represented by the block labeled space-time coding (STC) in Fig. 1. Thus the system under consideration can accommodate any specific space-time scheme, such as Alamouti's orthogonal space-time block code or the BLAST-type schemes [27]. The output of the space-time encoder consists of $N$ vectors, $\mathbf{x}_i \in \mathbb{C}^{K\times 1}$, $i = 1, \ldots, N$, each having length $K$ ($K \ge N$) symbols. The information-bearing signal can therefore be represented by the following matrix:

$$\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N]^T \in \mathbb{C}^{N\times K}. \tag{1}$$

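Before being directed to the transmit antennas, the signal matrix $\mathbf{X}$ is first precoded by post-multiplying with a precoding matrix $\mathbf{P} = [\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_K]^T \in \mathbb{C}^{K\times(K+L)}$, where $L \ge N$, to produce the following precoded signal matrix:

$$\mathbf{D} := \begin{bmatrix} \mathbf{d}_1^T \\ \vdots \\ \mathbf{d}_N^T \end{bmatrix} = \mathbf{X}\mathbf{P} = \begin{bmatrix} \mathbf{x}_1^T\mathbf{P} \\ \vdots \\ \mathbf{x}_N^T\mathbf{P} \end{bmatrix} \in \mathbb{C}^{N\times(K+L)}. \tag{2}$$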

Here, $L$ represents the number of redundant vectors resulting from precoding the transmitted signal. In general, it is desirable to have $L$ as small as possible in order to improve the transmission efficiency. Next, a training matrix $\mathbf{C} = [\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_N]^T \in \mathbb{C}^{N\times(K+L)}$ is added (i.e., superimposed) to the precoded matrix $\mathbf{D}$. Finally, the $n$th row of $\mathbf{D} + \mathbf{C}$, namely $\mathbf{d}_n^T + \mathbf{c}_n^T$, is serially transmitted over the $n$th transmit antenna, $n = 1, \ldots, N$. It should be pointed out that the above superimposed training is performed after the space-time encoder. Therefore our training design is flexible and can accommodate any space-time code.

Fig. 1. An equivalent discrete-time baseband MIMO system. (a) Transmitter: information symbols enter the space-time coding (STC) block, whose outputs $\mathbf{x}_1, \ldots, \mathbf{x}_N$ are precoded by post-multiplying with matrix $\mathbf{P}$ to give $\mathbf{d}_1, \ldots, \mathbf{d}_N$; the superimposed training signals $\mathbf{c}_1, \ldots, \mathbf{c}_N$ are added, and the results pass through parallel-to-serial (P/S) converters to antennas 1 to $N$. (b) Receiver: the signals from antennas 1 to $M$ pass through serial-to-parallel (S/P) converters to give $\mathbf{y}_1, \ldots, \mathbf{y}_M$, followed by decoupling by post-multiplying with matrix $\mathbf{Q}$, the channel estimator, and space-time decoding, which outputs the decoded symbols.

Assume that the fading channel remains constant during every block of $(K+L)$ symbols, but changes independently from block to block. Typically, this assumption implies that the exact statistical behavior of the correlation in time is neither available nor exploited, but only the coherence time is used to determine the block length. In practical systems, the speed of mobility and the transmission rate determine the coherence time (in units of information symbols). The coherence time in turn dictates the block length [27].$^1$ With this assumption, and since transmission and reception are conducted on a block-by-block basis, the time index is omitted for convenience. Let $\mathbf{H}$ be the $M \times N$ MIMO channel matrix in an arbitrary transmission block. To reflect spatially-correlated fading, the channel matrix is represented as follows [9]:

$$\mathbf{H} = \boldsymbol{\Sigma}_r^{1/2}\,\mathbf{H}_w\,\boldsymbol{\Sigma}_t^{1/2}, \tag{3}$$

where $\boldsymbol{\Sigma}_r$ and $\boldsymbol{\Sigma}_t$ are $M \times M$ and $N \times N$ known covariance matrices that capture the correlations of the receive and transmit antenna arrays, respectively. The matrix $\mathbf{H}_w$ is an $M \times N$ matrix whose entries are independent and identically-distributed (i.i.d.) circularly symmetric complex Gaussian random variables of unit variance, i.e., $\mathcal{CN}(0, 1)$. In particular, $E[\mathrm{vec}(\mathbf{H}_w)\mathrm{vec}^H(\mathbf{H}_w)] = \mathbf{I}_{MN}$. The known matrices $\boldsymbol{\Sigma}_r$ and $\boldsymbol{\Sigma}_t$ have the following forms:

$$\boldsymbol{\Sigma}_r = \begin{bmatrix} 1 & r_{12} & \cdots & r_{1M} \\ r_{12}^* & 1 & \cdots & r_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ r_{1M}^* & r_{2M}^* & \cdots & 1 \end{bmatrix}, \tag{4}$$

$$\boldsymbol{\Sigma}_t = \begin{bmatrix} 1 & t_{12} & \cdots & t_{1N} \\ t_{12}^* & 1 & \cdots & t_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ t_{1N}^* & t_{2N}^* & \cdots & 1 \end{bmatrix}, \tag{5}$$

where $t_{ij}$ ($r_{nm}$, resp.) with $i \neq j$ ($n \neq m$, resp.) reflects the correlated fading between the $i$th and the $j$th ($n$th and $m$th, resp.) elements of the transmit (receive, resp.) antenna array. The elements of $\boldsymbol{\Sigma}_r$ and $\boldsymbol{\Sigma}_t$ can be specified, for example, by using the one-ring model in [28]. The covariance matrix of the overall channel matrix $\mathbf{H}$ can be easily shown to be

$$\mathbf{R} = E[\mathrm{vec}(\mathbf{H})\mathrm{vec}^H(\mathbf{H})] = \boldsymbol{\Sigma}_t \otimes \boldsymbol{\Sigma}_r.$$

$^1$For example, in a typical cellular system operating at a carrier frequency of $f_c = 1.9$ GHz, a mobile speed of $v = 36$ km/h translates to a coherence time of about $T_c = c/(8 f_c v) \approx 1.97$ ms, where $c = 3 \times 10^8$ m/s is the speed of light. If the data rate is 250 kbps and a 16-QAM constellation is used, then the block length would be about 123 QAM symbols.
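The covariance identity can be checked numerically from the model in (3) using property (P5). The sketch below is only illustrative: it assumes NumPy and uses a real exponential correlation model `rho**|i-j|` as a stand-in for the one-ring model, so both correlation matrices are real symmetric and transposition of $\boldsymbol{\Sigma}_t$ plays no role. The helper names `exp_corr` and `sqrtm_psd` are hypothetical.

```python
import numpy as np

def exp_corr(n, rho):
    """Real exponential correlation matrix rho**|i-j| (an assumed toy model)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def sqrtm_psd(S):
    """Hermitian square root of a positive semi-definite matrix."""
    w, U = np.linalg.eigh(S)
    return (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T

M, N = 2, 3
Sr, St = exp_corr(M, 0.7), exp_corr(N, 0.4)
Sr_h, St_h = sqrtm_psd(Sr), sqrtm_psd(St)

# By (P5), vec(H) = (St_h^T kron Sr_h) vec(Hw), and vec(Hw) has identity covariance,
# so the covariance of vec(H) is A A^H with A defined below.
A = np.kron(St_h.T, Sr_h)
R = A @ A.conj().T

# One channel realization following (3), with CN(0, 1) entries in Hw
rng = np.random.default_rng(1)
Hw = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
H = Sr_h @ Hw @ St_h
```

Under these assumptions `R` coincides with `np.kron(St, Sr)` up to rounding.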

At the receiver, the received signal matrix is given as:

Y = H(C + D) + N = HC + HXP + N, (6)

where $\mathbf{N} \in \mathbb{C}^{M\times(K+L)}$ is the matrix of additive white Gaussian noise (AWGN) samples. Furthermore, the following assumptions are made for the input/output channel model in (6):

(A1) The information-bearing symbols are independent, zero-mean and with variance $\sigma_x^2$, i.e., $E(\mathbf{X}\mathbf{X}^H) = K\sigma_x^2\mathbf{I}_N$. Note that this assumption is valid if $\mathbf{X}$ is obtained by simply multiplexing the information symbol block $\mathbf{s}(k)$ in space and time. When $\mathbf{X}$ is obtained by space-time encoding the information block $\mathbf{s}(k)$, the correlation matrix $E(\mathbf{X}\mathbf{X}^H)$ generally admits a different form. Nevertheless, the technique presented in this paper can be easily extended to cover any other form of $E(\mathbf{X}\mathbf{X}^H)$.

(A2) The AWGN samples are also independent, zero-mean and with variance $\sigma_n^2$, i.e., $E(\mathbf{N}\mathbf{N}^H) = (K+L)\sigma_n^2\mathbf{I}_M$.

(A3) The average transmitted power, including the powers of the information-bearing and training signals, is normalized as $\sigma_x^2 + \sigma_c^2 = 1$, where $\sigma_c^2 = \mathrm{trace}(\mathbf{C}\mathbf{C}^H)/(N(K+L))$ is the average power of the training signal.

Moreover, the precoding matrix $\mathbf{P}$ is full rank and satisfies

$$\mathrm{trace}(\mathbf{P}\mathbf{P}^H) = K + L. \tag{7}$$

The above constraint ensures that the average transmitted power of the information-bearing signal is unchanged after precoding. Mathematically, this is verified as

$$\sigma_d^2 = \frac{\mathrm{trace}\big(E(\mathbf{D}\mathbf{D}^H)\big)}{N(K+L)} = \frac{N\sigma_x^2\,\mathrm{trace}(\mathbf{P}\mathbf{P}^H)}{N(K+L)} = \sigma_x^2.$$

In [14], [18], [37], the signal matrix $\mathbf{X}$ is precoded as $\mathbf{P}\mathbf{X}$, or equivalently, each column $\mathbf{x}_i$ of $\mathbf{X}$ is precoded by $\mathbf{P}\mathbf{x}_i$. Then the information-bearing signal $\mathbf{H}\mathbf{P}\mathbf{x}_i$ at the receiver side belongs to a subspace governed by the unknown channel matrix $\mathbf{H}$. Similarly, the training signal $\mathbf{H}\mathbf{c}_i$, where $\mathbf{c}_i$ is the $i$th column of the training matrix $\mathbf{C}$, belongs to a subspace that can only be determined by knowing $\mathbf{H}$. It follows that it is not easy to decouple these two unknown subspaces [14], [18], [37] for convenient and effective channel estimation. Consequently, the optimal training matrix $\mathbf{C}$ cannot be readily derived.

On the contrary, it can be seen that our precoded signals, $\mathbf{x}_i^T\mathbf{P}$, $i = 1, \ldots, N$, belong to the subspace $\Upsilon_P \subset \mathbb{C}^{K+L}$ spanned by the rows $\mathbf{p}_i^T$, $i = 1, \ldots, K$, of the precoding matrix $\mathbf{P}$. Then the rows of the information-bearing part $\mathbf{H}\mathbf{X}\mathbf{P}$ in the received signal matrix $\mathbf{Y}$ also belong to $\Upsilon_P$, which is independent of the unknown channel matrix $\mathbf{H}$. Moreover, the rows of the training part $\mathbf{H}\mathbf{C}$ belong to the subspace $\Upsilon_C \subset \mathbb{C}^{K+L}$ spanned by the rows of the training matrix $\mathbf{C}$, which is also independent of $\mathbf{H}$. Therefore, in order to estimate the channel matrix $\mathbf{H}$ in an effective way, the received signal matrix $\mathbf{Y}$ is post-multiplied with the decoupling matrix $\mathbf{Q} = [\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_{K+L}]^T \in \mathbb{C}^{(K+L)\times N}$, which is chosen such that

$$\mathbf{P}\mathbf{Q} = \mathbf{0}_{K\times N}. \tag{8}$$

The matrix $\mathbf{Q}$ for channel estimation is also full rank and satisfies $\mathbf{Q}^H\mathbf{Q} = \mathbf{I}_N$, which means that the noise is not enhanced by the decoupling operation.

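Thus, the decoupled signal matrix for channel estimation is expressed as

$$\mathbf{Y}\mathbf{Q} = \mathbf{H}\mathbf{C}\mathbf{Q} + \mathbf{H}\mathbf{X}\mathbf{P}\mathbf{Q} + \mathbf{N}\mathbf{Q} = \mathbf{H}\mathbf{C}\mathbf{Q} + \mathbf{N}\mathbf{Q}, \tag{9}$$

which is free of the unwanted component $\mathbf{H}\mathbf{X}$, and hence $\mathbf{H}$ can be efficiently estimated.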

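Although (8) is the most important relationship between the precoding matrix $\mathbf{P}$ and the decoupling matrix $\mathbf{Q}$ for efficient channel estimation, the precoding matrix $\mathbf{P}$ can be further designed to improve the performance of symbol detection as follows [31]. First, choose the decoupling matrix for symbol detection as $\mathbf{Q}_D = \mathbf{P}^H(\mathbf{P}\mathbf{P}^H)^{-1}$, where $\mathbf{P}$ is also chosen such that $\mathbf{C}\mathbf{P}^H = \mathbf{0}$. Then, by post-multiplying the received signal matrix in (6) with $\mathbf{Q}_D$, the decoupled signal matrix for symbol detection can be expressed as

$$\mathbf{Y}\mathbf{Q}_D = \mathbf{H}\mathbf{C}\mathbf{Q}_D + \mathbf{H}\mathbf{X}\mathbf{P}\mathbf{Q}_D + \mathbf{N}\mathbf{Q}_D = \mathbf{H}\mathbf{X} + \mathbf{N}\mathbf{Q}_D. \tag{10}$$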

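Let $\hat{\mathbf{x}}_i$ and $\mathbf{q}_i$ be the $i$th columns of the estimated data matrix $\hat{\mathbf{X}}$ and $\mathbf{Q}_D$, respectively. Under the minimum mean-square error (MMSE) criterion for symbol detection, the $i$th column of $\mathbf{X}$ is recovered based on the channel estimate $\hat{\mathbf{H}}$ and the matrix $\mathbf{Y}\mathbf{Q}_D$ as follows [8]:

$$\hat{\mathbf{x}}_i = \left(\frac{1}{\sigma_x^2}\mathbf{I} + \frac{1}{\sigma_n^2\|\mathbf{q}_i\|^2}\hat{\mathbf{H}}^H\hat{\mathbf{H}}\right)^{-1}\frac{1}{\sigma_n^2\|\mathbf{q}_i\|^2}\hat{\mathbf{H}}^H\mathbf{Y}\mathbf{q}_i. \tag{11}$$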

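The total mean-square error (MSE) of symbol detection can be shown to be

$$\varepsilon_X(\mathbf{Q}_D) = \sum_{i=1}^{K}\mathrm{tr}\left\{\left(\frac{1}{\sigma_x^2}\mathbf{I} + \frac{1}{\sigma_n^2\|\mathbf{q}_i\|^2}\hat{\mathbf{H}}^H\hat{\mathbf{H}}\right)^{-1}\right\}. \tag{12}$$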
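Expressions (11) and (12) translate directly into a few lines of linear algebra. The sketch below assumes NumPy; the function names and the small noise-free test case are illustrative and not taken from the paper. It evaluates (11) for one decoupled column $\mathbf{Y}\mathbf{q}_i$ and one summand of (12).

```python
import numpy as np

def mmse_detect_column(H_hat, y, q_norm2, sx2, sn2):
    """Evaluate (11) for one decoupled column y = Y q_i."""
    N = H_hat.shape[1]
    g = 1.0 / (sn2 * q_norm2)
    A = np.eye(N) / sx2 + g * (H_hat.conj().T @ H_hat)
    return np.linalg.solve(A, g * (H_hat.conj().T @ y))

def detection_mse_term(H_hat, q_norm2, sx2, sn2):
    """One summand of (12): trace of the inverse of the regularized Gram matrix."""
    N = H_hat.shape[1]
    A = np.eye(N) / sx2 + (H_hat.conj().T @ H_hat) / (sn2 * q_norm2)
    return np.trace(np.linalg.inv(A)).real

rng = np.random.default_rng(2)
M, N = 4, 2
H = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
x = np.array([1 + 1j, -1 + 1j]) / np.sqrt(2)

# Noise-free column with perfect channel knowledge: as sn2 -> 0 the MMSE
# detector approaches the least-squares solution and recovers x.
y = H @ x
x_hat = mmse_detect_column(H, y, q_norm2=1.0, sx2=1.0, sn2=1e-9)
mse_term = detection_mse_term(H, q_norm2=1.0, sx2=1.0, sn2=1e-9)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse in (11) explicitly, which is usually the better-conditioned choice.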

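Since $\mathbf{Q}_D^H\mathbf{Q}_D = (\mathbf{P}\mathbf{P}^H)^{-1}$ and due to the power constraint in (7), the problem of precoding matrix design can be stated as

$$\min_{\mathrm{tr}\{[\mathbf{Q}_D^H\mathbf{Q}_D]^{-1}\} = K+L}\ \sum_{i=1}^{K}\mathrm{tr}\left\{\left(\frac{1}{\sigma_x^2}\mathbf{I} + \frac{1}{\sigma_n^2\|\mathbf{q}_i\|^2}\hat{\mathbf{H}}^H\hat{\mathbf{H}}\right)^{-1}\right\}. \tag{13}$$

The closed-form solution to the above optimization problem has been shown in [31] to have the following structure:

$$\mathbf{P}\mathbf{P}^H = (\mathbf{Q}_D^H\mathbf{Q}_D)^{-1} = \frac{K+L}{K}\mathbf{I}_K. \tag{14}$$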

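Similar to [31], based on (8) and (14), the matrices $\mathbf{P}$ and $\mathbf{Q}$ are designed as follows:

$$\begin{aligned} \mathbf{P} &= \sqrt{\tfrac{K+L}{K}}\;\mathbf{O}(1:K, :) \in \mathbb{C}^{K\times(K+L)}, \\ \mathbf{Q} &= \mathbf{O}^H\big((K+1):(K+N), :\big) \in \mathbb{C}^{(K+L)\times N}, \end{aligned} \tag{15}$$

where $\mathbf{O}(1:K, :)$ and $\mathbf{O}\big((K+1):(K+N), :\big)$ keep only rows 1 to $K$ and rows $(K+1)$ to $(K+N)$ of an orthogonal matrix $\mathbf{O} \in \mathbb{C}^{(K+N)\times(K+L)}$. As an example, $\mathbf{O}$ can be formed by keeping the first $(K+N)$ rows of a unitary matrix $\mathbf{U}_{K+L} \in \mathcal{U}_{K+L}$, i.e.,

$$\mathbf{O} = \mathbf{U}_{K+L}(1:K+N, :). \tag{16}$$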
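The construction in (15) and (16) can be sketched as follows. The choice of a normalized DFT matrix for $\mathbf{U}_{K+L}$ is an assumption made only for illustration (any unitary matrix would do), and the function name `design_P_Q` is hypothetical.

```python
import numpy as np

def design_P_Q(K, L, N):
    """Build P and Q as in (15)-(16) from a unitary DFT matrix (assumed choice of U)."""
    T = K + L
    n = np.arange(T)
    U = np.exp(-2j * np.pi * np.outer(n, n) / T) / np.sqrt(T)  # unitary, T x T
    O = U[:K + N, :]                    # first K+N rows: orthonormal rows
    P = np.sqrt(T / K) * O[:K, :]       # K x (K+L) precoder
    Q = O[K:K + N, :].conj().T          # (K+L) x N decoupling matrix
    return P, Q

K, L, N = 6, 3, 2                       # requires L >= N
P, Q = design_P_Q(K, L, N)

PQ = P @ Q                              # should vanish, see (8)
QhQ = Q.conj().T @ Q                    # should equal I_N
PPh = P @ P.conj().T                    # should equal ((K+L)/K) I_K, see (14)
eigs_QQh = np.linalg.eigvalsh(Q @ Q.conj().T)   # N ones, K+L-N zeros
```

Because the rows of `O` are orthonormal, conditions (7), (8) and (14) hold by construction, and $\mathbf{Q}\mathbf{Q}^H$ is an orthogonal projector of rank $N$.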

Now, the key issue is how to design the superimposed training matrix $\mathbf{C}$ that results in the best estimation of the channel matrix $\mathbf{H}$ based on the input/output model in (9). When $\mathbf{H}$ is uncorrelated, this design is quite straightforward and a closed-form solution for the optimal $\mathbf{C}$ can be easily derived [31]. In contrast, due to the spatial correlations among channels of different transmit-receive antenna pairs, the design under consideration is quite complicated. Though a closed-form solution is not yet available, the next section proposes an iterative optimization algorithm to effectively find the optimal solution.

III. OPTIMAL DESIGN OF THE SUPERIMPOSED TRAINING SIGNAL

Let $P_T = N(K+L)\sigma_c^2$. According to (A3), the design of the training matrix $\mathbf{C}$ is subject to the following power constraint:

$$\mathrm{trace}(\mathbf{C}\mathbf{C}^H) \le P_T. \tag{17}$$

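Rewrite (9) as $\mathbf{y} = \boldsymbol{\mathcal{C}}\mathbf{h} + \mathbf{n}$, where $\mathbf{y} = \mathrm{vec}(\mathbf{Y}\mathbf{Q}) \in \mathbb{C}^{MN}$, $\mathbf{n} = \mathrm{vec}(\mathbf{N}\mathbf{Q}) = (\mathbf{Q}^T \otimes \mathbf{I}_M)\mathrm{vec}(\mathbf{N}) \in \mathbb{C}^{MN}$, $\boldsymbol{\mathcal{C}} = (\mathbf{C}\mathbf{Q})^T \otimes \mathbf{I}_M \in \mathbb{C}^{MN\times MN}$ and $\mathbf{h} = \mathrm{vec}(\mathbf{H}) \in \mathbb{C}^{MN}$. Since $\mathbf{Q}^H\mathbf{Q} = \mathbf{I}_N$, it follows that $E[\mathbf{n}\mathbf{n}^H] = (\mathbf{Q}^H\mathbf{Q})^T \otimes \sigma_n^2\mathbf{I}_M = \sigma_n^2\mathbf{I}_{MN}$. Based on the received signal vector $\mathbf{y}$, the linear minimum mean-square error (MMSE) estimate of the channel vector $\mathbf{h}$ is:

$$\hat{\mathbf{h}} = \mathbf{R}\boldsymbol{\mathcal{C}}^H(\boldsymbol{\mathcal{C}}\mathbf{R}\boldsymbol{\mathcal{C}}^H + \sigma_n^2\mathbf{I}_{MN})^{-1}\mathbf{y}, \tag{18}$$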

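where, recall that, $\mathbf{R} = \boldsymbol{\Sigma}_t \otimes \boldsymbol{\Sigma}_r$. Furthermore, the covariance matrix of the estimation error vector is

$$\begin{aligned} \boldsymbol{\mathcal{E}} &:= E[(\mathbf{h} - \hat{\mathbf{h}})(\mathbf{h} - \hat{\mathbf{h}})^H] \\ &= \big(\mathbf{R}^{-1} + \boldsymbol{\mathcal{C}}^H(\sigma_n^2\mathbf{I}_{MN})^{-1}\boldsymbol{\mathcal{C}}\big)^{-1} \\ &= \Big((\boldsymbol{\Sigma}_t \otimes \boldsymbol{\Sigma}_r)^{-1} + \frac{1}{\sigma_n^2}\mathbf{C}^{TH}(\mathbf{Q}\mathbf{Q}^H)^T\mathbf{C}^T \otimes \mathbf{I}_M\Big)^{-1}. \end{aligned} \tag{19}$$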
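The estimator (18) and the error covariance (19) can be sketched in a few lines. The code below assumes NumPy; the function name, the toy training and decoupling matrices, and the numerical values are illustrative assumptions rather than the designs derived in this paper.

```python
import numpy as np

def lmmse_channel_estimate(Y, C, Q, R, sn2):
    """Evaluate (18) and (19) for a received block Y of size M x (K+L)."""
    M = Y.shape[0]
    Cc = np.kron((C @ Q).T, np.eye(M))            # script-C in y = Cc h + n
    y = (Y @ Q).reshape(-1, order="F")            # vec(YQ), column-major
    G = Cc @ R @ Cc.conj().T + sn2 * np.eye(Cc.shape[0])
    h_hat = R @ Cc.conj().T @ np.linalg.solve(G, y)
    E = np.linalg.inv(np.linalg.inv(R) + Cc.conj().T @ Cc / sn2)
    return h_hat.reshape(M, -1, order="F"), E

rng = np.random.default_rng(3)
M, N, T = 2, 2, 5                                 # T plays the role of K+L
Q = np.eye(T)[:, :N]                              # toy decoupling matrix, Q^H Q = I_N
C = np.hstack([2.0 * np.eye(N), np.zeros((N, T - N))])   # toy training matrix
St = np.array([[1.0, 0.5], [0.5, 1.0]])
Sr = np.array([[1.0, 0.3], [0.3, 1.0]])
R = np.kron(St, Sr)
H = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
sn2 = 0.1

# Noise-free block whose data part is assumed to be already orthogonal to Q
Y = H @ C
H_hat, E = lmmse_channel_estimate(Y, C, Q, R, sn2)
```

The trace of `E` is the channel estimation MSE that the training design in the rest of this section seeks to reduce.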

The objective is to design $\mathbf{C}$ to minimize the trace of the error covariance in (19), subject to the power constraint in (17).

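From (15), $\mathbf{Q}\mathbf{Q}^H$ admits the following singular value decomposition (SVD):

$$\mathbf{Q}\mathbf{Q}^H = \mathbf{U}_Q\boldsymbol{\Lambda}\mathbf{U}_Q^H, \quad \mathbf{U}_Q \in \mathcal{U}_{K+L}, \tag{20}$$

where $\boldsymbol{\Lambda} = \begin{bmatrix} \mathbf{I}_N & \mathbf{0}_{N\times(K+L-N)} \\ \mathbf{0}_{(K+L-N)\times N} & \mathbf{0}_{K+L-N} \end{bmatrix}$. When $\mathbf{O}$ is specified as in (16), $\mathbf{U}_Q$ can be obtained by simply permuting the rows of $\mathbf{U}_{K+L}$.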

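Now, perform the following transformation:

$$\tilde{\mathbf{C}} = \mathbf{U}_Q^T\mathbf{C}^T. \tag{21}$$

One has $\mathrm{trace}(\tilde{\mathbf{C}}\tilde{\mathbf{C}}^H) = \mathrm{trace}(\mathbf{U}_Q^T\mathbf{C}^T\mathbf{C}^{TH}\mathbf{U}_Q^{TH}) = \mathrm{trace}((\mathbf{C}^H\mathbf{C})^T) = \mathrm{trace}(\mathbf{C}\mathbf{C}^H)$. Thus, under the power constraint in (17), the optimal design of $\mathbf{C} \in \mathbb{C}^{N\times(K+L)}$ that aims at minimizing the trace of $\boldsymbol{\mathcal{E}}$ can be formulated as the following constrained optimization problem in $\tilde{\mathbf{C}} \in \mathbb{C}^{(K+L)\times N}$:

$$\begin{aligned} \min_{\tilde{\mathbf{C}}}\ & \mathrm{trace}\Big(\Big((\boldsymbol{\Sigma}_t \otimes \boldsymbol{\Sigma}_r)^{-1} + \frac{1}{\sigma_n^2}\tilde{\mathbf{C}}^H\boldsymbol{\Lambda}\tilde{\mathbf{C}} \otimes \mathbf{I}_M\Big)^{-1}\Big) \\ \text{subject to}\ & \mathrm{trace}(\tilde{\mathbf{C}}\tilde{\mathbf{C}}^H) \le P_T. \end{aligned} \tag{22}$$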

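In what follows, the approach of matrix partition for matrix inequalities [23] is employed to derive the solution to the above optimization problem. Suppose that $\tilde{\mathbf{C}}_{\mathrm{opt}}$ is the optimal solution of Problem (22). Partition $\tilde{\mathbf{C}}_{\mathrm{opt}} = \begin{bmatrix} \tilde{\mathbf{C}}_U \\ \tilde{\mathbf{C}}_L \end{bmatrix}$, $\tilde{\mathbf{C}}_U \in \mathbb{C}^{N\times N}$, $\tilde{\mathbf{C}}_L \in \mathbb{C}^{(K+L-N)\times N}$, and define $\tilde{\mathbf{C}}_0 = \begin{bmatrix} \tilde{\mathbf{C}}_U \\ \mathbf{0}_{(K+L-N)\times N} \end{bmatrix}$. It can be verified that $\tilde{\mathbf{C}}_{\mathrm{opt}}^H\boldsymbol{\Lambda}\tilde{\mathbf{C}}_{\mathrm{opt}} = \tilde{\mathbf{C}}_U^H\tilde{\mathbf{C}}_U = \tilde{\mathbf{C}}_0^H\boldsymbol{\Lambda}\tilde{\mathbf{C}}_0$. On the other hand, whenever $\tilde{\mathbf{C}}_L \neq \mathbf{0}$, one has $P_T = \mathrm{trace}(\tilde{\mathbf{C}}_{\mathrm{opt}}\tilde{\mathbf{C}}_{\mathrm{opt}}^H) = \mathrm{trace}(\tilde{\mathbf{C}}_U^H\tilde{\mathbf{C}}_U) + \mathrm{trace}(\tilde{\mathbf{C}}_L^H\tilde{\mathbf{C}}_L) > \mathrm{trace}(\tilde{\mathbf{C}}_U^H\tilde{\mathbf{C}}_U) = \mathrm{trace}(\tilde{\mathbf{C}}_0\tilde{\mathbf{C}}_0^H)$. It then follows that, whenever $\tilde{\mathbf{C}}_L \neq \mathbf{0}$, there exists $\lambda > 1$ such that $P_T = \mathrm{trace}(\lambda^2\tilde{\mathbf{C}}_0\tilde{\mathbf{C}}_0^H)$. Now, applying properties (P7) and (P8) yields

$$\begin{aligned} & \mathrm{trace}\Big(\Big((\boldsymbol{\Sigma}_t \otimes \boldsymbol{\Sigma}_r)^{-1} + \frac{1}{\sigma_n^2}\tilde{\mathbf{C}}_{\mathrm{opt}}^H\boldsymbol{\Lambda}\tilde{\mathbf{C}}_{\mathrm{opt}} \otimes \mathbf{I}_M\Big)^{-1}\Big) \\ &\quad > \mathrm{trace}\Big(\Big((\boldsymbol{\Sigma}_t \otimes \boldsymbol{\Sigma}_r)^{-1} + \frac{1}{\sigma_n^2}(\lambda\tilde{\mathbf{C}}_0)^H\boldsymbol{\Lambda}(\lambda\tilde{\mathbf{C}}_0) \otimes \mathbf{I}_M\Big)^{-1}\Big), \end{aligned} \tag{23}$$

which contradicts the assumption that $\tilde{\mathbf{C}}_{\mathrm{opt}}$ is the optimal solution of Problem (22).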
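The partition argument can be illustrated numerically: any power placed in the lower block is invisible to the objective of (22), so moving it to the upper block by rescaling lowers the MSE. The sketch below assumes NumPy and uses small, arbitrarily chosen correlation matrices; `est_mse` is a hypothetical helper name.

```python
import numpy as np

def est_mse(C_tilde, St, Sr, sn2):
    """Objective of (22); Lambda keeps only the first N rows of C_tilde."""
    T, N = C_tilde.shape
    M = Sr.shape[0]
    Lam = np.zeros((T, T))
    Lam[:N, :N] = np.eye(N)
    info = np.kron(C_tilde.conj().T @ Lam @ C_tilde, np.eye(M)) / sn2
    return np.trace(np.linalg.inv(np.linalg.inv(np.kron(St, Sr)) + info)).real

rng = np.random.default_rng(4)
N, T, PT, sn2 = 2, 5, 4.0, 0.1                    # T plays the role of K+L
St = np.array([[1.0, 0.6], [0.6, 1.0]])
Sr = np.array([[1.0, 0.3], [0.3, 1.0]])

# A full (K+L) x N candidate that spends exactly PT, part of it in the lower block
C_full = rng.standard_normal((T, N)) + 1j * rng.standard_normal((T, N))
C_full *= np.sqrt(PT / np.trace(C_full @ C_full.conj().T).real)

C0 = C_full.copy()
C0[N:, :] = 0.0                                   # drop the lower block C_L
lam = np.sqrt(PT / np.trace(C0 @ C0.conj().T).real)   # lam > 1 restores the budget

mse_full = est_mse(C_full, St, Sr, sn2)
mse_rescaled = est_mse(lam * C0, St, Sr, sn2)     # strictly smaller, as in (23)
```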

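The above result shows that the optimal solution of the optimization problem in (22) must have the following form:

$$\tilde{\mathbf{C}}_{\mathrm{opt}} = \begin{bmatrix} \tilde{\mathbf{C}}_U \\ \mathbf{0}_{(K+L-N)\times N} \end{bmatrix}, \quad \tilde{\mathbf{C}}_U \in \mathbb{C}^{N\times N}, \tag{24}$$

and the optimization problem in (22) is equivalent to the following problem:

$$\begin{aligned} \min_{\tilde{\mathbf{C}}_U \in \mathbb{C}^{N\times N}}\ & \mathrm{trace}\Big(\Big((\boldsymbol{\Sigma}_t \otimes \boldsymbol{\Sigma}_r)^{-1} + \frac{1}{\sigma_n^2}\tilde{\mathbf{C}}_U^H\tilde{\mathbf{C}}_U \otimes \mathbf{I}_M\Big)^{-1}\Big) \\ \text{subject to}\ & \mathrm{trace}(\tilde{\mathbf{C}}_U\tilde{\mathbf{C}}_U^H) \le P_T. \end{aligned} \tag{25}$$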

Remark 1: The above optimization problem implies the following important consequence: as long as $L \ge N$, the performance of the channel estimation in terms of the mean-square error does not depend on the actual value of $L$. Thus, as far as channel estimation is concerned, choosing $L = N$ is optimal to maximize the system's bandwidth efficiency. It should be noted, however, that choosing $L > N$ affects the precoding operation and might lead to a better performance with respect to a different criterion (such as the bit-error-rate (BER) performance, or the effective SNR considered in Section IV).

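To find the solution to Problem (25), make the following SVDs:

$$\boldsymbol{\Sigma}_t = \mathbf{U}_t\boldsymbol{\Lambda}_t\mathbf{U}_t^H, \quad \mathbf{U}_t \in \mathcal{U}_N,\ \boldsymbol{\Lambda}_t \in \mathcal{D}_N,$$

$$\boldsymbol{\Sigma}_r = \mathbf{U}_r\boldsymbol{\Lambda}_r\mathbf{U}_r^H, \quad \mathbf{U}_r \in \mathcal{U}_M,\ \boldsymbol{\Lambda}_r \in \mathcal{D}_M.$$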

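Then, the objective in (25) can be evaluated as follows:

$$\begin{aligned} & \mathrm{trace}\Big(\Big((\mathbf{U}_t\boldsymbol{\Lambda}_t^{-1}\mathbf{U}_t^H) \otimes (\mathbf{U}_r\boldsymbol{\Lambda}_r^{-1}\mathbf{U}_r^H) + \frac{1}{\sigma_n^2}(\tilde{\mathbf{C}}_U^H\tilde{\mathbf{C}}_U) \otimes \mathbf{I}_M\Big)^{-1}\Big) \\ &\quad = \mathrm{trace}\Big(\Big((\boldsymbol{\Lambda}_t \otimes \boldsymbol{\Lambda}_r)^{-1} + \frac{1}{\sigma_n^2}\mathbf{Z} \otimes \mathbf{I}_M\Big)^{-1}\Big), \end{aligned} \tag{26}$$

where

$$\mathbf{Z} = \mathbf{U}_t^H\tilde{\mathbf{C}}_U^H\tilde{\mathbf{C}}_U\mathbf{U}_t \in \mathbb{C}^{N\times N}. \tag{27}$$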

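The power constraint in (25) can also be expressed in terms of the new variable $\mathbf{Z}$ as $\mathrm{trace}(\tilde{\mathbf{C}}_U\tilde{\mathbf{C}}_U^H) = \mathrm{trace}(\mathbf{Z}) \le P_T$. With the expressions of the objective and constraint in $\mathbf{Z}$, the equivalent optimization problem is:

$$\begin{aligned} \min_{\mathbf{Z} \in \mathbb{C}^{N\times N}}\ & \mathrm{trace}\Big(\Big((\boldsymbol{\Lambda}_t \otimes \boldsymbol{\Lambda}_r)^{-1} + \frac{1}{\sigma_n^2}\mathbf{Z} \otimes \mathbf{I}_M\Big)^{-1}\Big) \\ \text{subject to}\ & \mathrm{trace}(\mathbf{Z}) \le P_T. \end{aligned} \tag{28}$$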

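Using property (P9), we can easily see that

$$\mathrm{trace}\Big(\Big((\boldsymbol{\Lambda}_t \otimes \boldsymbol{\Lambda}_r)^{-1} + \frac{1}{\sigma_n^2}\mathbf{Z} \otimes \mathbf{I}_M\Big)^{-1}\Big) \ \ge\ \mathrm{trace}\Big(\Big((\boldsymbol{\Lambda}_t \otimes \boldsymbol{\Lambda}_r)^{-1} + \frac{1}{\sigma_n^2}\mathrm{diag}[\mathbf{Z}(i,i)]_{i=1,\ldots,N} \otimes \mathbf{I}_M\Big)^{-1}\Big)$$

and

$$\mathrm{trace}\big(\mathrm{diag}[\mathbf{Z}(i,i)]_{i=1,\ldots,N}\big) = \mathrm{trace}(\mathbf{Z}).$$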

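Here $\mathrm{diag}[\mathbf{Z}(i,i)]_{i=1,\ldots,N}$ is the diagonal matrix with diagonal entries $\mathbf{Z}(i,i)$. This implies that the optimal solution of Problem (28) must be diagonal. Consequently, the optimization problem in (28) can be reformulated as follows:

$$\begin{aligned} \min_{\mathbf{0} \le \boldsymbol{\Lambda}_C \in \mathcal{D}_N}\ & \mathrm{trace}\Big(\Big((\boldsymbol{\Lambda}_t \otimes \boldsymbol{\Lambda}_r)^{-1} + \frac{1}{\sigma_n^2}\boldsymbol{\Lambda}_C \otimes \mathbf{I}_M\Big)^{-1}\Big) \\ \text{subject to}\ & \mathrm{trace}(\boldsymbol{\Lambda}_C) \le P_T. \end{aligned} \tag{29}$$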
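The step from (28) to (29) rests on the fact that discarding the off-diagonal entries of $\mathbf{Z}$ never increases the objective while leaving the trace, and hence the power, unchanged. A minimal numerical illustration, assuming NumPy and arbitrarily chosen eigenvalues, is given below; `objective_Z` is a hypothetical helper name.

```python
import numpy as np

def objective_Z(Z, lam_t, lam_r, sn2):
    """Objective of (28) for a Hermitian positive semi-definite Z."""
    M = len(lam_r)
    D = np.diag(1.0 / np.kron(lam_t, lam_r))      # (Lambda_t kron Lambda_r)^{-1}
    return np.trace(np.linalg.inv(D + np.kron(Z, np.eye(M)) / sn2)).real

rng = np.random.default_rng(5)
N = 3
lam_t = np.array([1.8, 0.9, 0.3])                 # assumed transmit eigenvalues
lam_r = np.array([1.5, 0.5])                      # assumed receive eigenvalues

G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
Z = G.conj().T @ G                                # a generic full Z
Z_diag = np.diag(np.diag(Z).real)                 # keep only the diagonal

f_full = objective_Z(Z, lam_t, lam_r, sn2=0.1)
f_diag = objective_Z(Z_diag, lam_t, lam_r, sn2=0.1)   # never larger than f_full
```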

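The following theorem summarizes the optimal design problem, based on (21), (24), (27).

Theorem 1: The optimal solution $\boldsymbol{\Lambda}_{C,\mathrm{opt}}$ of Problem (29) provides the following optimal training signal:

$$\mathbf{C}_{\mathrm{opt}} = \begin{bmatrix} \mathbf{U}_t^{HT}\sqrt{\boldsymbol{\Lambda}_{C,\mathrm{opt}}} & \mathbf{0}_{N\times(K+L-N)} \end{bmatrix}\mathbf{U}_Q^H. \tag{30}$$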

Remark 2: It is intuitively satisfying to observe that the precoding matrix $P$ designed as in (15) and the optimal training matrix $C_{\mathrm{opt}}$ derived in (30) are orthogonal. This can be shown as follows:

$$C_{\mathrm{opt}}P^H = \left[\, U_t^{HT}\sqrt{\Lambda_{C,\mathrm{opt}}} \quad 0_{N\times(K+L-N)} \,\right] U_Q^H P^H = \left[\, U_t^{HT}\sqrt{\Lambda_{C,\mathrm{opt}}} \quad 0_{N\times(K+L-N)} \,\right] \begin{bmatrix} 0_{N\times K} \\ * \end{bmatrix} = 0_{N\times K}. \tag{31}$$

The above implies that the components in the received signal corresponding to the training signal (namely $HC_{\mathrm{opt}}$) and the information-bearing signal (namely $HXP$) are guaranteed to be orthogonal, regardless of the unknown channel matrix $H$.

Unfortunately, a closed-form expression for the optimal solution of Problem (29) is not available. Nevertheless, the following subsection provides an effective iterative algorithm to solve Problem (29).

A. Iterative Algorithm to Find the Optimal Solution of Problem (29)

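For convenience, define

$$[\delta_1, \delta_2, \ldots, \delta_{MN}]^T = [(\Lambda_t\otimes\Lambda_r)^{-1}(1,1), \ldots, (\Lambda_t\otimes\Lambda_r)^{-1}(MN,MN)]^T,$$

$$s = [s_1, s_2, \ldots, s_N]^T = [\Lambda_C(1,1), \ldots, \Lambda_C(N,N)]^T. \tag{32}$$

Then Problem (29) can be equivalently re-expressed as

$$\min_{s\in\mathbb{R}^N} \ \sum_{j=1}^{N}\sum_{i=1}^{M} \frac{1}{\delta_{(j-1)N+i} + \frac{1}{\sigma_n^2}s_j} \quad \text{subject to} \quad \sum_{j=1}^{N} s_j \le P_T, \ s_j \ge 0. \tag{33}$$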

When there is only one receive antenna, i.e., $M = 1$, the closed-form optimal solution of the above optimization problem is well known to have the water-filling structure (see e.g. [2]). The situation is quite different when $M > 1$, where a closed-form solution is not expected. Note that both the objective and constraint functions of the above problem are still convex in $s$. In principle, the interior-point algorithms of convex programming (see e.g. [2]) can be applied. However, we shall exploit not only the convex structure of Problem (33), but also its monotonic structure, and provide an efficient and fast computational algorithm to find its optimal solution. Undoubtedly, convexity and monotonicity are the most useful properties in optimization [34], [35].

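Specifically, for a decreasing function $h(t)$ the nonlinear scalar equation

$$h(t) = \gamma, \quad t \in [\underline{t}, \overline{t}] \tag{34}$$

can be solved online by the following iterative bisection procedure (IBP):

• If $h(\underline{t}) < \gamma$ or $h(\overline{t}) > \gamma$ then there is no solution in $[\underline{t}, \overline{t}]$.
• For $t = (\underline{t} + \overline{t})/2$, reset $\underline{t} = t$ if $h(t) > \gamma$ and reset $\overline{t} = t$ if $h(t) < \gamma$. Repeat until $h(t) = \gamma$ (in practice, to within a prescribed tolerance).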
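The two IBP steps above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `bisect_decreasing`, the tolerance `tol` and the iteration cap `max_iter` are assumptions made for the example and are not part of the paper.

```python
def bisect_decreasing(h, gamma, lo, hi, tol=1e-12, max_iter=200):
    """Solve h(t) = gamma on [lo, hi] for a decreasing function h (IBP sketch)."""
    # Step 1: for a decreasing h, a root exists only if h(lo) >= gamma >= h(hi).
    if h(lo) < gamma or h(hi) > gamma:
        return None
    # Step 2: halve the bracket, keeping the root inside it.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        val = h(mid)
        if val > gamma:
            lo = mid      # root lies to the right of mid
        elif val < gamma:
            hi = mid      # root lies to the left of mid
        else:
            return mid
        if hi - lo <= tol:
            break
    return 0.5 * (lo + hi)


# Illustrative use: 1/(1 + t)^2 = 0.25 has the root t = 1 on [0, 10].
root = bisect_decreasing(lambda t: 1.0 / (1.0 + t) ** 2, 0.25, 0.0, 10.0)
no_root = bisect_decreasing(lambda t: 1.0 / (1.0 + t) ** 2, 2.0, 0.0, 10.0)
```

Each pass halves the bracket, so the number of iterations grows only logarithmically with the required accuracy.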

Note that the objective function $f(s)$ in (33) is separable in $s_j$, i.e.,

$$f(s) = \sum_{j=1}^{N} f_j(s_j), \quad \text{where} \quad f_j(s_j) = \sum_{i=1}^{M} \frac{1}{\delta_{(j-1)N+i} + \frac{1}{\sigma_n^2}s_j}.$$

Furthermore, $f_j(s_j)$ is not only convex but also decreasing in $s_j$.

The Lagrangian of (33) is

$$L(s, \mu, \mu_1, \ldots, \mu_N) = f(s) + \mu\left(\sum_{j=1}^{N} s_j - P_T\right) - \sum_{j=1}^{N} \mu_j s_j, \quad \mu \ge 0, \ \mu_j \ge 0.$$

According to the Kuhn-Tucker conditions for optimality in convex programming, the optimal solution of the optimization problem in (33) and the corresponding Lagrange multipliers must satisfy the following necessary and sufficient conditions:

$$\frac{\partial L(s, \mu, \mu_1, \ldots, \mu_N)}{\partial s_j} = \frac{\partial f_j(s_j)}{\partial s_j} + \mu - \mu_j = 0,$$

where $\mu_j s_j = 0$, $j = 1, 2, \ldots, N$. Therefore, the optimal solution of (33) can be expressed as

$$s_{j,\mathrm{opt}} = s_j^{+}(\mu) := \max\{s_j(\mu), 0\}, \quad j = 1, \ldots, N,$$

where:

• $s_j(\mu)$ is the solution of the following nonlinear equation:

$$g_j(s_j) := -\frac{\partial f_j(s_j)}{\partial s_j} = \sum_{i=1}^{M} \frac{1}{\sigma_n^2}\,\frac{1}{\left[\delta_{(j-1)N+i} + \frac{1}{\sigma_n^2}s_j\right]^2} = \mu. \tag{35}$$

Thus for each $\mu$ we can quickly locate $s_j^{+}(\mu)$ by the IBP described before. Obviously $s_j^{+}(\mu)$ is decreasing in $\mu$.

• The scalar Lagrange multiplier $\mu > 0$ is such that

$$g(\mu) := \sum_{j=1}^{N} s_{j,\mathrm{opt}} = \sum_{j=1}^{N} \left(s_j(\mu)\right)^{+} = P_T. \tag{36}$$

The function $g(\mu)$ is also decreasing in $\mu$. So again the IBP can be effectively used to locate the solution of (36).

To summarize, an effective procedure for locating the optimal solution of the optimization problem in (33) is outlined below.

• Compute $\underline{\mu}$ and $\overline{\mu}$ such that the solution of (36) belongs to $[\underline{\mu}, \overline{\mu}]$.

• Apply the IBP to locate the solution of Equation (36). The subroutines include (i) the computation of $\underline{s}_j(\mu)$ and $\overline{s}_j(\mu)$ such that the solution of (35) belongs to $[\underline{s}_j(\mu), \overline{s}_j(\mu)]$ and (ii) the application of the IBP for locating the solutions $s_j(\mu)$ of (35).

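To make the above procedure completely realizable, the expressions of $\underline{s}_j(\mu)$, $\overline{s}_j(\mu)$, $\underline{\mu}$ and $\overline{\mu}$ are given next. Define

$$\delta_{j,\max} = \max_{i=1,2,\ldots,M} \delta_{(j-1)N+i}, \tag{37}$$

$$\delta_{j,\min} = \min_{i=1,2,\ldots,M} \delta_{(j-1)N+i}, \quad j = 1, 2, \ldots, N. \tag{38}$$

It is obvious that

$$\frac{M}{\sigma_n^2\left(\delta_{j,\max} + \frac{1}{\sigma_n^2}s_j\right)^2} \ \le\ g_j(s_j) \ \le\ \frac{M}{\sigma_n^2\left(\delta_{j,\min} + \frac{1}{\sigma_n^2}s_j\right)^2}. \tag{39}$$

Hence, for a fixed $\mu$, the solution $s_j(\mu)$ of (35) belongs to $[\underline{s}_j(\mu), \overline{s}_j(\mu)]$ with

$$\underline{s}_j(\mu) := \sigma_n^2\left(\sqrt{\frac{M}{\mu\sigma_n^2}} - \delta_{j,\max}\right)^{+}, \tag{40}$$

$$\overline{s}_j(\mu) := \sigma_n^2\left(\sqrt{\frac{M}{\mu\sigma_n^2}} - \delta_{j,\min}\right)^{+}. \tag{41}$$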
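To make the bracket (40)-(41) concrete, the sketch below solves (35) for a single index $j$ by bisection inside $[\underline{s}_j(\mu), \overline{s}_j(\mu)]$. It is a hedged illustration, not the authors' implementation: the names `g_j` and `solve_sj`, the tolerance, and the numerical values of the $\delta$'s and $\sigma_n^2$ are assumptions chosen for the example.

```python
import math


def g_j(s, deltas, sigma2):
    """Left-hand side of (35): minus the derivative of f_j at s."""
    return sum(1.0 / (sigma2 * (d + s / sigma2) ** 2) for d in deltas)


def solve_sj(mu, deltas, sigma2, tol=1e-13):
    """Return max{s_j(mu), 0} by bisection inside the bracket (40)-(41)."""
    level = math.sqrt(len(deltas) / (mu * sigma2))
    lo = sigma2 * max(level - max(deltas), 0.0)  # lower end, as in (40)
    hi = sigma2 * max(level - min(deltas), 0.0)  # upper end, as in (41)
    if g_j(lo, deltas, sigma2) <= mu:
        # Either lo = 0 and (35) has no positive root, or lo is already the root.
        return lo
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if g_j(mid, deltas, sigma2) > mu:
            lo = mid  # g_j is decreasing, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


deltas = [0.5, 2.0]  # hypothetical delta values for one index j (M = 2)
sigma2 = 1.0         # hypothetical noise variance
s_active = solve_sj(0.2, deltas, sigma2)    # positive solution of (35)
s_clipped = solve_sj(10.0, deltas, sigma2)  # multiplier too large: clipped to 0
```

When all $\delta$ values of the group coincide, the two ends of the bracket coincide as well and no bisection is needed.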

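Also, it must be true that $\mu \in [\underline{\mu}, \overline{\mu}]$ for the solution of (36), where $\underline{\mu}$ and $\overline{\mu}$ are the solutions of

$$\sum_{j=1}^{N} \left(\underline{s}_j(\mu)\right)^{+} = P_T \tag{42}$$

and

$$\sum_{j=1}^{N} \left(\overline{s}_j(\mu)\right)^{+} = P_T, \tag{43}$$

respectively. In other words, $\{\underline{s}_j(\underline{\mu}), \underline{\mu}\}$ and $\{\overline{s}_j(\overline{\mu}), \overline{\mu}\}$ are the water-filling structured optimal solutions and the corresponding Lagrange multipliers of the following optimization problems

$$\min_{s\in\mathbb{R}^N} \ \sum_{j=1}^{N} \frac{M}{\delta_{j,\max} + \frac{1}{\sigma_n^2}s_j} \ : \ \sum_{j=1}^{N} s_j \le P_T, \ s_j \ge 0, \tag{44}$$

and

$$\min_{s\in\mathbb{R}^N} \ \sum_{j=1}^{N} \frac{M}{\delta_{j,\min} + \frac{1}{\sigma_n^2}s_j} \ : \ \sum_{j=1}^{N} s_j \le P_T, \ s_j \ge 0, \tag{45}$$

respectively. The technique to find these solutions is quite standard and the details are omitted here.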
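The complete procedure of this subsection can be sketched as follows. The code is an illustration under stated assumptions rather than the authors' implementation: all function names are hypothetical, the $\delta$ values are grouped directly as $1/(\lambda_{t,j}\lambda_{r,i})$ from the diagonals of $\Lambda_t$ and $\Lambda_r$ instead of through the flat index used in (32), the water-filling routine is the textbook closed form for (44) and (45), and fixed iteration counts replace a stopping tolerance.

```python
import math


def g(s, deltas, sigma2):
    """g_j(s) in (35) for one transmit mode; `deltas` holds its M values."""
    return sum(1.0 / (sigma2 * (d + s / sigma2) ** 2) for d in deltas)


def s_plus(mu, deltas, sigma2, iters=100):
    """max{s_j(mu), 0}: bracket from (40)-(41), then bisection on (35)."""
    level = math.sqrt(len(deltas) / (mu * sigma2))
    lo = sigma2 * max(level - max(deltas), 0.0)
    hi = sigma2 * max(level - min(deltas), 0.0)
    if g(lo, deltas, sigma2) <= mu:
        return lo
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid, deltas, sigma2) > mu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def waterfill_mu(d, M, sigma2, p_total):
    """Lagrange multiplier of the water-filling problems (44)/(45)."""
    d = sorted(d)
    for k in range(len(d), 0, -1):       # try k active modes, largest k first
        w = (p_total / sigma2 + sum(d[:k])) / k  # candidate water level
        if w > d[k - 1]:
            return M / (sigma2 * w * w)


def solve_problem_33(lam_t, lam_r, sigma2, p_total, iters=100):
    """Outer bisection on mu for (36), started from the bracket of (42)-(43)."""
    groups = [[1.0 / (lt * lr) for lr in lam_r] for lt in lam_t]
    M = len(lam_r)
    mu_lo = waterfill_mu([max(d) for d in groups], M, sigma2, p_total)
    mu_hi = waterfill_mu([min(d) for d in groups], M, sigma2, p_total)
    for _ in range(iters):
        mu = 0.5 * (mu_lo + mu_hi)
        if sum(s_plus(mu, d, sigma2) for d in groups) > p_total:
            mu_lo = mu   # too much power spent: increase the multiplier
        else:
            mu_hi = mu
    mu = 0.5 * (mu_lo + mu_hi)
    return [s_plus(mu, d, sigma2) for d in groups], mu


def mse(s, lam_t, lam_r, sigma2):
    """Objective of (33) for a given power vector s."""
    return sum(1.0 / (1.0 / (lt * lr) + sj / sigma2)
               for lt, sj in zip(lam_t, s) for lr in lam_r)


# Hypothetical 2 x 2 example: eigenvalues of Sigma_t and Sigma_r, noise, power.
lam_t, lam_r, sigma2, p_total = [1.5, 0.5], [1.2, 0.8], 0.1, 1.0
s_opt, mu_opt = solve_problem_33(lam_t, lam_r, sigma2, p_total)
mse_opt = mse(s_opt, lam_t, lam_r, sigma2)
mse_equal = mse([p_total / 2, p_total / 2], lam_t, lam_r, sigma2)
```

In this example the resulting allocation uses the full power budget and gives an objective value no larger than that of the equal-power allocation, as expected from optimality.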

IV. POWER ALLOCATION FOR DETECTION ENHANCEMENT

Until now, we have considered the problem of designing the optimal training signal $C$ under its fixed power constraint as specified in (17). Under the total transmitted power constraint stated in (A3), the following tradeoff arises. The performance of channel estimation can be improved by spending more power on the training signal. This, however, comes at the expense of decreased transmitted power for the information-bearing signal, leading to performance degradation of signal detection. This section considers this tradeoff and proposes a sub-optimal power allocation that maximizes a lower bound on the effective signal-to-noise ratio (SNR). The effective SNR is selected because any increase in it translates to an increase in system capacity and/or quality of signal detection. Maximizing the effective SNR is also very helpful for channel decoding if error control coding is implemented in the system.

With the optimal training matrix $C_{\mathrm{opt}}$ derived in the previous section by (30), Equation (6) is rewritten as

$$Y = HC_{\mathrm{opt}} + HXP + N. \tag{46}$$

Next, rewrite (10) to see the effect of channel estimation error on data detection as follows:

$$YQ_D = HX + NQ_D = \hat{H}X + \tilde{H}X + NQ_D, \tag{47}$$

where $\hat{H}$ is the estimated channel matrix and $\tilde{H} = H - \hat{H}$ is the channel estimation error matrix. Since $\tilde{H}$ in (47) is unknown and random, and $\tilde{H}X$ is uncorrelated with $\hat{H}X$ according to the orthogonality property [8], the term $\tilde{H}X$ is considered as noise. Thus, the effective SNR of the input/output model in (47) is defined as

$$\mathrm{SNR}_{\mathrm{eff}} = \frac{E\left(\|\hat{H}X\|^2\right)}{E\left(\|\tilde{H}X + NQ_D\|^2\right)}. \tag{48}$$

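Here,

$$E\left(\|\hat{H}X\|^2\right) = \mathrm{trace}\left(E(\hat{H}XX^H\hat{H}^H)\right) = K\sigma_x^2\,\mathrm{trace}\left(E(HH^H) - E(\tilde{H}\tilde{H}^H)\right) = K\sigma_x^2(MN - \varepsilon),$$

where $\varepsilon := \mathrm{trace}\left(E(\tilde{H}\tilde{H}^H)\right)$, and

$$\begin{aligned} E\left(\|\tilde{H}X + NQ_D\|^2\right) &= \mathrm{trace}\left(E(\tilde{H}XX^H\tilde{H}^H)\right) + \mathrm{trace}\left(E(NQ_DQ_D^HN^H)\right) \\ &= K\sigma_x^2\varepsilon + \mathrm{trace}\left(E(NP^H(PP^H)^{-1}(PP^H)^{-H}PN^H)\right) \\ &= K\sigma_x^2\varepsilon + M\sigma_n^2\frac{K^2}{K+L}. \end{aligned}$$

It follows that

$$\mathrm{SNR}_{\mathrm{eff}} = \frac{\sigma_x^2(MN - \varepsilon)}{\sigma_x^2\varepsilon + \gamma}, \quad \text{with} \quad \gamma = M\sigma_n^2\frac{K}{K+L}. \tag{49}$$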

Note that $\varepsilon$ is exactly the optimal value of the objective function in Problem (29). As pointed out in Remark 1, $\varepsilon$ does not depend on the actual value of $L$ as long as $L \ge N$. On the other hand, it follows from (49) that the effective SNR increases with $L$, an intuitively satisfying result. For simplicity and to maximize the system's bandwidth efficiency, $L = N$ is chosen in the remainder of this paper.

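Furthermore, the following upper bound on $\varepsilon$ can be easily derived:

$$\varepsilon \le \frac{MN}{K+N}\,\frac{\sigma_n^2}{\sigma_c^2} = \beta\,\frac{\sigma_n^2}{\sigma_c^2}, \quad \text{where} \quad \beta = \frac{MN}{K+N}. \tag{50}$$

This is because $\Lambda_C = \frac{P_T}{N}I_N = (K+N)\sigma_c^2 I_N$ is a feasible solution of (29). Then, by replacing $\sigma_x^2$ with $(1 - \sigma_c^2)$, it is easy to see that $\mathrm{SNR}_{\mathrm{eff}}$ in (49) is lower bounded as

$$\mathrm{SNR}_{\mathrm{eff}}(\sigma_c^2) \ \ge\ \frac{(1 - \sigma_c^2)(MN\sigma_c^2 - \beta\sigma_n^2)}{\beta\sigma_n^2 - \beta\sigma_n^2\sigma_c^2 + \gamma\sigma_c^2}. \tag{51}$$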

Instead of maximizing $\mathrm{SNR}_{\mathrm{eff}}$, we maximize its lower bound as given by the right-hand side of (51). This maximization leads to a sub-optimal power allocation as far as maximizing $\mathrm{SNR}_{\mathrm{eff}}$ is concerned. The sub-optimal solution of power allocation can be shown to be:

$$\sigma_{c,\text{sub-opt}}^2 = \frac{MN\beta\sigma_n^2 - \sqrt{MN\gamma\beta\sigma_n^2\left(-\beta\sigma_n^2 + MN + \gamma\right)}}{MN\left(\beta\sigma_n^2 - \gamma\right)}. \tag{52}$$

Since the total transmitted power is normalized to unity, the above expression essentially gives the fraction of the total power allocated to the training signal. The above sub-optimal power allocation is used to obtain the simulation results presented in the next section.
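As a quick sanity check of the closed form, the sketch below evaluates the right-hand side of (51) and the allocation (52) for $L = N$, and compares the closed form with a brute-force grid search. The function names, the grid resolution, and the chosen operating point ($M = N = 2$, $K = 60$, SNR of 10 dB) are assumptions made only for this illustration.

```python
import math


def lower_bound_snr(x, M, N, K, sigma2_n):
    """Right-hand side of (51) with L = N; x is the training power sigma_c^2."""
    beta = M * N / (K + N)
    gamma = M * sigma2_n * K / (K + N)
    b = beta * sigma2_n
    return (1.0 - x) * (M * N * x - b) / (b - b * x + gamma * x)


def training_power_fraction(M, N, K, sigma2_n):
    """Closed form (52); assumes K != N so that the denominator is nonzero."""
    beta = M * N / (K + N)
    gamma = M * sigma2_n * K / (K + N)
    b = beta * sigma2_n
    mn = M * N
    root = math.sqrt(mn * gamma * b * (-b + mn + gamma))
    return (mn * b - root) / (mn * (b - gamma))


M, N, K = 2, 2, 60
sigma2_n = 10 ** (-10 / 10)  # SNR = -10 log10(sigma_n^2) = 10 dB
x_closed = training_power_fraction(M, N, K, sigma2_n)

# Brute-force maximization of the lower bound over a fine grid in (0, 1).
grid = [i / 10000 for i in range(1, 10000)]
x_grid = max(grid, key=lambda x: lower_bound_snr(x, M, N, K, sigma2_n))
```

For this operating point the closed-form value and the grid maximizer agree to within the grid spacing, which is consistent with (52) being the stationary point of the bound in (51).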

Fig. 2. Comparison of the mean-squared errors in channel estimation of the 2 × 2 MIMO systems using different training signals: the proposed superimposed training (PSPT), the equal-powered superimposed training (ESPT) and the time-multiplexing training (TMT). Axes: mean square error (dB) versus SNR (dB); curves are shown for K = 10 and K = 60.

V. ILLUSTRATIVE RESULTS

This section provides simulation results to illustrate the performance of the proposed optimal training design. In all simulations, the wireless channel model is assumed to be quasi-static block Rayleigh fading and spatially correlated as described in (3). The one-ring model in [28, Eq. (6)] is used to generate the elements of the covariance matrices $\Sigma_r$ and $\Sigma_t$. Specifically,

$$\Sigma_t(n, m) \approx J_0\left(\Delta\,\frac{2\pi}{\lambda}\,d_t\,|m - n|\right), \tag{53}$$

$$\Sigma_r(i, j) \approx J_0\left(\frac{2\pi}{\lambda}\,d_r\,|i - j|\right), \tag{54}$$

where $\Delta$ is the angle spread in the one-ring model; $d_t$ and $d_r$ are the spacings of the transmit and receive antenna arrays, respectively; $\lambda$ is the carrier wavelength and $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind. Note that the angle spread, $\Delta$, and the antenna spacings, $d_t$ and $d_r$, determine how correlated the fading is at the transmit and receive antenna arrays. Unless stated otherwise, the values of $\Delta = 5^\circ$, $d_t = 0.5\lambda$ and $d_r = 0.2\lambda$ are used in the simulation to create highly correlated fading. Since the average transmitted power, including the training and data powers, is normalized to unity as in assumption (A3), the received SNR in dB is defined as $\mathrm{SNR} = -10\log_{10}\sigma_n^2$. The power allocation for data and training signals in the proposed SP training follows (52).
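The covariance matrices of the one-ring model can be generated with a few lines of Python. The sketch assumes the reading $\Sigma_t(n,m) \approx J_0(\Delta\,(2\pi/\lambda)\,d_t|m-n|)$ and $\Sigma_r(i,j) \approx J_0((2\pi/\lambda)\,d_r|i-j|)$ with $\Delta$ converted to radians; the function names are hypothetical, and $J_0$ is evaluated from its integral representation with a midpoint rule so that no external library is needed.

```python
import math


def bessel_j0(x, n=2000):
    """J0(x) = (1/pi) * integral_0^pi cos(x sin(theta)) d(theta), midpoint rule."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(n)) / n


def one_ring_covariances(M, N, delta_deg, dt, dr):
    """Return (Sigma_t, Sigma_r); dt and dr are spacings in wavelengths."""
    delta = math.radians(delta_deg)  # assumption: angle spread given in degrees
    sigma_t = [[bessel_j0(delta * 2.0 * math.pi * dt * abs(m - n))
                for m in range(N)] for n in range(N)]
    sigma_r = [[bessel_j0(2.0 * math.pi * dr * abs(i - j))
                for j in range(M)] for i in range(M)]
    return sigma_t, sigma_r


# Default setting quoted in the text: angle spread 5 degrees, dt = 0.5, dr = 0.2.
sigma_t, sigma_r = one_ring_covariances(2, 2, 5.0, 0.5, 0.2)
```

With these defaults the off-diagonal entry of $\Sigma_t$ is close to one, which reflects the highly correlated transmit-side fading mentioned above.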

A. Estimation Performance

Two different designs of superimposed training and one conventional time-multiplexing training (TMT) design are investigated and compared. The proposed superimposed training (PSPT) signal is obtained with the iterative algorithm described in Section III. The equal-power superimposed training (ESPT) signal is chosen as a scaled identity matrix with power constraint $P_T$, which is the optimal training scheme for the uncorrelated MIMO channel. The TMT design used for comparison in this section is the improved version of the design proposed in [9] (which applies to the case of high or low SNR only [9, Subsection IV-C]). The improved design is obtained by applying the iterative algorithm described in Section III to the optimization problem in [9, Eq. (30)]. As in [9], the linear MMSE estimator is used for identifying the wireless channel. Furthermore, the length of the TMT signal is chosen to be the minimum length required for the channel estimation. This minimum length is shown in [7] to be $N$ symbols for a MIMO channel having $N$ transmit antennas. On the other hand, the use of the precoding matrix $P$ and the decoupling matrix $Q$ in this paper also introduces $L = N$ redundant vectors per block. Thus, the extra bandwidth consumption is the same for all the different training signals, namely the PSPT, ESPT and TMT signals. Of course, the estimation performance of the different training designs is compared based on the same training power and additive Gaussian noise environment.

Fig. 3. Comparison of the mean-squared errors in channel estimation of the 4 × 4 MIMO systems using different training signals: the proposed superimposed training (PSPT), the equal-powered superimposed training (ESPT) and the time-multiplexing training (TMT). Axes: mean square error (dB) versus SNR (dB); curves are shown for K = 10 and K = 60.

The normalized mean-square errors (normalized by $E[\|h\|^2]$), expressed in dB, of the channel estimation provided by the above three training signals are plotted versus SNR in Figs. 2 and 3 for the 2 × 2 and 4 × 4 MIMO channels, respectively. In these two figures, two different lengths of the information vector $X$, namely $K = 10$ and $K = 60$, are considered. It should be noted that the implementation of TMT is independent of $K$ as long as $K \ge N$. First, observe that at almost any SNR level (except for SNR > 10 dB in the 2 × 2 system), the mean-square error is significantly reduced with the use of the PSPT signal instead of the ESPT. A more important observation is that, compared to PSPT, using the TMT signal results in a larger MSE at any SNR level for both cases of MIMO channels and for both block lengths considered. As expected, the advantage of the superimposed training (including PSPT and ESPT) over the TMT becomes more evident for the system with a larger block length. In fact, if the block length is not long enough, TMT can outperform ESPT, as can be seen from Fig. 3 for the 4 × 4 channel and when $K = 10$. This particular case clearly shows the usefulness of our superimposed training design over the simple ESPT.

Fig. 4. Comparison of the mean-squared errors in channel estimation of the 2 × 2 MIMO systems for different angle spreads (Δ = 15° and Δ = 30°) and using different training signals (K = 60). Axes: mean square error (dB) versus SNR (dB); curves are shown for PSPT, ESPT and TMT.

The difference in estimation performance between the PSPT and the TMT depends mainly on their length ratio, which is $(K+N)/N$. In general, the bigger the ratio $(K+N)/N$ is, the larger the performance difference between PSPT and TMT is, because the channel statistics are better incorporated for estimation by superimposed training. This can be clearly seen from the results shown in Figs. 2 and 3 for different values of $K$. In practice, the value of $K + N$ is determined by the coherence time of the channel.

In fact, when the coherence time is large, the estimation performance of TMT can be improved by extending the training length beyond the minimum required length of $N$ symbols [7], [9]. However, a direct consequence of extending the training length of TMT is a lower bandwidth efficiency. Therefore, taking both estimation performance and bandwidth efficiency into account, PSPT is more attractive than TMT, especially for a slowly time-varying wireless channel whose coherence time can be very large.

Figures 4 and 5 illustrate the impact of having larger angle spreads, Δ = 15° and Δ = 30°, on the estimation performance of different training designs in both the 2 × 2 and 4 × 4 systems. Here $K = 60$ is considered. Several observations can be made from these two figures. First, all training designs perform better when the angle spread increases. This is expected since a larger Δ makes the channel less correlated. Second, based on the performance difference between PSPT and ESPT as well as the performance improvement when going from Δ = 15° to Δ = 30°, one concludes that the 2 × 2 MIMO channel can be considered spatially uncorrelated for Δ ≥ 15°, while Δ ≥ 30° makes the 4 × 4 MIMO channel uncorrelated. Lastly, the PSPT scheme is seen to consistently outperform the other two training schemes in these two figures.

Next, the impact of antenna spacings on the estimation performance is illustrated in Figs. 6 and 7, where the mean-squared errors of different training designs are plotted for two sets of antenna spacings, which are $\{d_t = 0.5\lambda, d_r = 0.2\lambda\}$ and $\{d_t = 0.2\lambda, d_r = 0.1\lambda\}$. These figures again confirm that all the training designs perform better in more correlated channels as the consequence of having smaller antenna spacings. Moreover, at any SNR level, the estimation performance of PSPT is always the best for both MIMO channels.

Fig. 5. Comparison of the mean-squared errors in channel estimation of the 4 × 4 MIMO systems for different angle spreads (Δ = 15° and Δ = 30°) and using different training signals (K = 60). Axes: mean square error (dB) versus SNR (dB); curves are shown for PSPT, ESPT and TMT.

Fig. 6. Comparison of the mean-squared errors in channel estimation of the 2 × 2 MIMO systems for different antenna spacings and using different training signals (K = 60). Axes: mean square error (dB) versus SNR (dB); curves are shown for PSPT, ESPT and TMT with {dt = 0.5λ, dr = 0.2λ} and {dt = 0.2λ, dr = 0.1λ}.

B. Impact of Power Allocation

Fig. 8 plots the average training power (as a fraction of the total power), computed as in (52), that maximizes the lower bound of $\mathrm{SNR}_{\mathrm{eff}}$ when $K = 10$ and $K = 60$. It can be seen that a higher training power is needed for channel estimation at a lower SNR level. This is expected since the spatially correlated fading has a stronger effect on the quality of the channel estimation at the lower SNR level. It is also evident from Fig. 8 that a larger portion of the total power is spent for the training signal in the 4 × 4 MIMO system compared to that in the 2 × 2 MIMO system. This is also expected since there are more channel parameters to be estimated in the 4 × 4 MIMO system than in the 2 × 2 MIMO system. Furthermore, observe that the larger $K$ is, the smaller the average training power becomes. This is also reasonable since with a larger $K$ the channel statistics are better incorporated for estimation by superimposed training.

Fig. 7. Comparison of the mean-squared errors in channel estimation of the 4 × 4 MIMO systems for different antenna spacings and using different training signals (K = 60). Axes: mean square error (dB) versus SNR (dB); curves are shown for PSPT, ESPT and TMT with {dt = 0.5λ, dr = 0.2λ} and {dt = 0.2λ, dr = 0.1λ}.

Fig. 8. Average training power that maximizes the lower bound of SNReff at different SNR levels. Axes: average training power versus SNR (dB); curves are shown for the 4 × 4 and 2 × 2 MIMO systems with K = 10 and K = 60.

The actual $\mathrm{SNR}_{\mathrm{eff}}$ and its lower bound attained by the proposed power allocation are plotted as functions of the SNR in Fig. 9 for the case of the 2 × 2 MIMO system having $K = 60$. Observe that the lower bound is very close to the actual $\mathrm{SNR}_{\mathrm{eff}}$, which suggests the tightness of the lower bound. Moreover, shown in Fig. 9 are plots of $\mathrm{SNR}_{\mathrm{eff}}$ achieved with several "ad-hoc" power allocation strategies. It is obvious that failing to allocate the training power as proposed in (52) can significantly reduce $\mathrm{SNR}_{\mathrm{eff}}$.

Fig. 9. Plots of SNReff and its lower bound for the MIMO systems with different power allocations (K = 60). Axes: SNReff (dB) versus SNR (dB); curves are shown for the proposed allocation, the lower bound, and 40%, 50% and 60% training power.

Fig. 10. BER performance of the 2 × 2 MIMO system using full-rate Alamouti OSTBC and QPSK: comparison of PSPT, ESPT, TMT and perfect channel estimation (K = 60). Axes: BER versus SNR (dB).

C. Bit-Error-Rate Performance

The final aspect to be investigated is the bit-error-rate (BER) performance of the MIMO systems that employ the proposed superimposed training design. To this end, orthogonal space-time block codes (OSTBCs) together with maximum likelihood (ML) decoding are incorporated in Fig. 1. For the 2 × 2 system, the full-rate Alamouti code [1] is selected, whereas a half-rate OSTBC [29, Eq. (5)] is applied for the 4 × 4 system. Both systems use QPSK modulation with Gray mapping and the length of the information signal vector is set to $K = 60$.

The simulation results presented in Figs. 10 and 11 were obtained with PSPT, ESPT, TMT and perfect channel estimation. The BER performance with perfect channel estimation is shown to serve as the performance benchmark. Consistent with the relative comparison of estimation performance made before for $K = 60$, the BER performance with PSPT is better than that with ESPT and TMT in both the 2 × 2 and 4 × 4 systems. Specifically, at the BER level of $10^{-4}$, there are about 2.0 dB and 0.5 dB gains in SNR by employing PSPT over TMT for the 2 × 2 and 4 × 4 systems, respectively. Note that, compared to the 2 × 2 system, the reduced SNR gain in the 4 × 4 system is directly attributed to the reduction in the estimation performance. This is because both systems have $K = 60$, but there is a larger number of channel parameters to be estimated in the 4 × 4 system compared to the 2 × 2 system, hence making the PSPT design less effective in the former system than in the latter one. Finally, for both systems, compared to the case of perfect channel estimation, our proposed superimposed training design experiences a performance loss of only about 0.5 dB at the BER level of $10^{-4}$.

Fig. 11. BER performance of the 4 × 4 MIMO system using half-rate OSTBC and QPSK: comparison of PSPT, ESPT, TMT and perfect channel estimation (K = 60). Axes: BER versus SNR (dB).

VI. CONCLUSION

An MMSE channel estimator was developed for spatially correlated fading MIMO channels when superimposed training is used. The main contribution is a novel design of the optimal superimposed signal with an iterative optimization algorithm. Simulation results show that, when the coherence time of the channel is large and under the constraint of equal training power and bandwidth efficiency, the optimal superimposed training signal performs better than the conventional time-multiplexing training signal in terms of the channel estimation error. A power allocation policy is also derived to maximize the lower bound on the effective signal-to-noise ratio at the channel output. The excellent BER performance of MIMO systems that employ orthogonal space-time block codes and our proposed superimposed training design is also demonstrated.

REFERENCES

[1] S. M. Alamouti, “A simple transmit diversity technique for wireless communications,” IEEE J. Select. Areas Commun., vol. 16, no. 8, pp. 1451–1458, Oct. 1998.

[2] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge Univ. Press, 2004.

[3] C. Chuah, D. Tse, J. M. Kahn, and R. Valenzuela, “Capacity scaling in MIMO wireless systems under correlated fading,” IEEE Trans. Inform. Theory, vol. 48, no. 3, pp. 637–650, 2002.

[4] G. D. Forney and M. V. Eyuboglu, “Combined equalization and coding using precoding,” IEEE Commun. Mag., Dec. 1991.

[5] G. J. Foschini and M. J. Gans, “On limits of wireless communication in fading environment when using multiple antennas,” Wireless Personal Commun., vol. 6, pp. 311–335, 1998.

[6] G. B. Giannakis, “Filterbanks for blind channel identification and equalization,” IEEE Signal Processing Lett., vol. 4, no. 6, pp. 184–187, 1997.

[7] B. Hassibi and B. M. Hochwald, “How much training is needed in multiple-antenna wireless links?” IEEE Trans. Inform. Theory, vol. 49, no. 4, Apr. 2003.

[8] S. M. Kay, Fundamentals of Statistical Signal Processing, Volume 1: Estimation Theory. Prentice Hall, 1993.

[9] J. H. Kotecha and A. M. Sayeed, “Transmit signal design for optimal estimation of correlated MIMO channel,” IEEE Trans. Signal Processing, vol. 52, pp. 546–557, 2004.

[10] W. C. Y. Lee, “Effects on correlation between two mobile radio base-station antennas,” IEEE Trans. Commun., vol. COM-21, pp. 1214–1223, Nov. 1973.

[11] S. L. Loyka, “Channel capacity of two-antenna BLAST architecture,” Electron. Lett., vol. 35, pp. 1421–1422, 1999.

[12] S. Loyka, “Channel capacity of MIMO architecture using the exponential correlation matrix,” IEEE Commun. Lett., vol. 5, pp. 369–371, Sept. 2001.

[13] S. L. Loyka and J. R. Mosig, “Channel capacity of N-antenna BLAST architecture,” Electron. Lett., vol. 36, pp. 660–661, 2000.

[14] X. Ma, L. Yang, and G. B. Giannakis, “Optimal training for MIMO frequency selective fading channels,” IEEE Trans. Wireless Commun., vol. 4, pp. 453–466, 2005.

[15] J. Manton, I. V. Mareels, and Y. Hua, “Affine precoders for reliable communications,” in Proc. ICASSP, pp. 2749–2752, 2000.

[16] J. H. Manton, “Design and analysis of linear precoders under a mean square error criterion, part II: MMSE design and conclusions,” Systems and Control Lett., vol. 49, pp. 131–140, 2003.

[17] F. Mazzenga, “Channel estimation and equalization for M-QAM transmission with a hidden pilot sequence,” IEEE Trans. Broadcasting, vol. 46, pp. 170–176, 2000.

[18] S. Ohno and G. B. Giannakis, “Optimal training and redundant precoding for block transmissions with application to wireless OFDM,” IEEE Trans. Commun., vol. 50, pp. 2112–2123, 2002.

[19] A. J. Paulraj and C. B. Papadias, “Space-time processing for wireless communication,” IEEE Signal Processing Mag., pp. 49–83, 1996.

[20] H. Sampath and A. Paulraj, “Linear precoding for space-time coded systems with known fading correlations,” IEEE Commun. Lett., vol. 6, pp. 239–241, June 2002.

[21] A. Paulraj, R. Nabar, and D. Gore, Introduction to Space-Time Wireless Communications. Cambridge Univ. Press, 2005.

[22] D. H. Pham and J. H. Manton, “Orthogonal superimposed training on linear precoding: a new affine precoder design,” in Proc. IEEE 6th Workshop of Sig. Proc. Advance in Wireless Comm., pp. 445–449, June 2005.

[23] D. H. Pham, H. D. Tuan, B. Vo, and T. Q. Nguyen, “Jointly optimal precoding/postcoding for coloured MIMO systems,” in Proc. 2006 IEEE Conf. on Acoustics, Speech and Sig. Process. (ICASSP), Toulouse, France, vol. 4, pp. 745–748, 2006.

[24] C. Pirak, Z. J. Wang, K. J. Liu, and S. Jitapunkul, “Optimum power allocation for maximum-likelihood channel estimation in space-time coded MIMO systems,” in Proc. IEEE Conf. in Acoustics, Speech and Sig. Process. (ICASSP), Toulouse, France, pp. IV 573–IV 576, May 2006.

[25] C. Pirak, Z. J. Wang, K. J. Liu, and S. Jitapunkul, “A data-bearing approach for pilot-embedding frameworks in space-time coded MIMO system,” IEEE Trans. Signal Processing, vol. 54, pp. 3966–3979, Oct. 2006.

[26] J. G. Proakis, Digital Communications. McGraw-Hill, 2001.

[27] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge Univ. Press, 2005.

[28] D. Shiu, G. J. Foschini, M. J. Gans, and J. M. Kahn, “Fading correlation and its effect on the capacity of multielement antenna systems,” IEEE Trans. Commun., vol. 48, pp. 502–513, Mar. 2000.

[29] V. Tarokh, H. Jafarkhani, and A. R. Calderbank, “Space-time block coding for wireless communications: performance results,” IEEE J. Select. Areas Commun., vol. 17, no. 3, Mar. 1999.

[30] I. E. Telatar, “Capacity of multiple antenna Gaussian channels,” European Trans. Telecommun., vol. 10, no. 6, 1999.

[31] N. N. Tran, D. H. Pham, H. D. Tuan, and H. H. Nguyen, “Affine precoding and decoding for channel estimation and source detection in MIMO frequency-selective fading channels,” to appear in IEEE Trans. Signal Processing.

[32] J. K. Tugnait, L. Tong, and Z. Ding, “Single-user channel estimation and equalization,” IEEE Signal Processing Mag., vol. 17, pp. 16–28, 2000.

[33] J. K. Tugnait and W. Luo, “On channel estimation using superimposed training and first-order statistic,” IEEE Commun. Lett., vol. 7, pp. 413–415, Sept. 2003.

[34] H. Tuy, Convex Analysis and Global Optimization. Kluwer Academic Press, 1999.

[35] H. Tuy, “Monotonic optimization: problems and solution approaches,” SIAM J. Optimization, pp. 464–494, 2000.

[36] G. T. Zhou, M. Viberg, and T. McKelvey, “Superimposed periodic pilots for blind channel estimation,” in Proc. 35th Annu. Asilomar Conf. Sig. Systems Computers, Pacific Grove, CA, pp. 653–657, 2001.

[37] A. Vosoughi and A. Scaglione, “Everything you always wanted to know about training: guidelines derived using the affine precoding framework and CRB,” IEEE Trans. Signal Processing, vol. 54, pp. 940–954, 2006.

[38] J. H. Winters, J. Salz, and R. D. Gitlin, “The impact of antenna diversity on the capacity of wireless communication systems,” IEEE Trans. Commun., vol. 42, pp. 1740–1751, 1994.

Vu Nguyen received the Bachelor degree in Electronic Engineering and Telecommunications from the University of Tasmania, Australia in 2003, and the Master degree in Telecommunications from the University of New South Wales, Australia in 2005. Mr. Nguyen currently works for Optus Ltd., NSW, Australia.

Hoang Duong Tuan was born in Hanoi, Vietnam. He received the diploma and the Ph.D. degree, both in applied mathematics, from Odessa State University, Ukraine, in 1987 and 1991, respectively. From 1991 to 1994 he was a Researcher at the Optimization and Systems Division, Vietnam National Center for Science and Technologies. He spent 9 academic years in Japan as an Assistant Professor at the Department of Electronic-Mechanical Engineering, Nagoya University from 1994 to 1999, and then as an Associate Professor at the Department of Electrical and Computer Engineering, Toyota Technological Institute, Nagoya from 1999 to 2003. Presently, he lives in Sydney, Australia, where he is an Associate Professor at the School of Electrical Engineering and Telecommunications, the University of New South Wales. His research interests include several multi-disciplinary areas of control, signal processing, communications and bio-informatics.

Ha H. Nguyen (M’01, SM’05) received the B.Eng. degree from Hanoi University of Technology, Hanoi, Vietnam, in 1995, the M.Eng. degree from the Asian Institute of Technology, Bangkok, Thailand, in 1997, and the Ph.D. degree from the University of Manitoba, Winnipeg, Canada, in 2001, all in electrical engineering. Dr. Nguyen joined the Department of Electrical Engineering, University of Saskatchewan, Canada in 2001 and became a Full Professor in 2007. He holds adjunct appointments at the Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, MB, Canada, and TRLabs, Saskatoon, SK, Canada, and was a Senior Visiting Fellow in the School of Electrical Engineering and Telecommunications, University of New South Wales, Sydney, Australia during October 2007-June 2008. His research interests include digital communications, spread spectrum systems and error-control coding. Dr. Nguyen currently serves as an Associate Editor for the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS and the IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY. He is a Registered Member of the Association of Professional Engineers and Geoscientists of Saskatchewan (APEGS).

Nguyen Nam Tran was born in Quang Nam, Vietnam. He received the B.E. degree in electrical engineering and telecommunications from Ho-Chi-Minh City University of Technology in 2001, and the M.S.E. degree in Physical Electronics from Ho-Chi-Minh City University of Natural Sciences in 2004. Since 2005 he has been pursuing the Ph.D. degree with the School of Electrical Engineering and Telecommunications, University of New South Wales. His research interests are MIMO and MIMO-OFDM wireless communications, including coding and signal processing techniques, and applications of convex optimization for training signal and precoder design under correlated channels and colored noise.