Low-complexity turbo equalization and multiuser decoding for TD-CDMA



454 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 3, NO. 2, MARCH 2004

Low-Complexity Turbo Equalization and Multiuser Decoding for TD-CDMA

Alessandro Nordio, Student Member, IEEE, Marco A. Hernandez, and Giuseppe Caire, Senior Member, IEEE

Abstract—The authors propose a low-complexity multiuser joint parallel interference cancellation (PIC) decoder and turbo decision feedback equalizer for code division multiple access (CDMA). In their scheme, an estimate of the interference signal (both multiple-access interference and intersymbol interference) is formed by weighting the hard decisions produced by conventional (i.e., hard-output) Viterbi decoders. The estimated interference is subtracted from the received signal in order to improve decoding in the next iteration. By using asymptotic performance analysis of random-spreading CDMA, they optimize the feedback weights at each iteration. Then, they consider two (mutually related) performance limitation factors: the bias of residual interference and the ping–pong effect. The authors show that the performance of the proposed algorithm can be improved by compensating for the bias in the weight calculation, and they propose a modification of the basic PIC algorithm, which prevents the ping–pong effect and allows higher channel load and/or faster convergence to the single-user performance. The proposed algorithm is validated through computer simulation in an environment fully compliant with the specifications of the time-division duplex mode of third-generation systems, contemplating a combination of time-division multiple access and CDMA and including frequency-selective fading channels, user asynchronism, and power control. The main conclusion of this work is that, for such application, soft-input soft-output decoders (e.g., implemented by the forward–backward BCJR algorithm) are not needed to attain very high spectral efficiency, and simple conventional Viterbi decoding suffices for most practical settings.

Index Terms—Code division multiple access (CDMA), turbo equalization, turbo multiuser detection.

I. INTRODUCTION AND MOTIVATION

THE recently proposed Universal Mobile Telecommunication Systems (UMTS) standard for the third generation (3G) of mobile communication systems adopts wide-band code division multiple access (WCDMA) for the frequency division duplex (FDD) mode and a combination of time-division multiple access (TDMA) and code-division multiple access (CDMA) for the time-division duplex (TDD) mode [1].

In both FDD and TDD modes, the UMTS basic receiver scheme contemplates the use of conventional single-user matched filtering (SUMF). Since multiple-access interference (MAI) is treated as additional background noise,¹ powerful and high-complexity channel coding such as 256-state convolutional codes and turbo codes [3] are envisaged in order to attain low bit-error rates (BERs) at low decoder input signal-to-interference-plus-noise ratio (SINR). In any case, channel loads larger than one user/chip are practically very difficult if not impossible to attain by the SUMF front-end and single-user decoding [4].

Manuscript received February 8, 2002; revised October 17, 2002; accepted January 15, 2003. The editor coordinating the review of this paper and approving it for publication is W.-Y. Kuo. This work was supported in part by Ascom, Cégétel, France Télécom, Hitachi, Motorola, Swisscom, Texas Instruments, Thales, ST Microelectronics, and Bouygues Telecom.

The authors are with the Mobile Communications Department, Institut Eurécom, 06904 Sophia-Antipolis Cedex, France.

Digital Object Identifier 10.1109/TWC.2003.821170

On the other hand, information theory shows that much larger channel loads can be achieved provided that a nonlinear multiuser joint detector and decoder is employed [4], [5]. This may range from the impractically complex optimal joint decoder to practically appealing successive interference cancellation approaches [6], [7].

In practice, successive interference cancellation must cope with decision errors, which prevent perfect cancellation of already decoded users. Several iterative schemes have therefore been proposed, which limit the deleterious effect of decision errors by feeding back soft estimates of the detected symbols (see for example [8]–[10]). These schemes require soft-input soft-output (SISO) decoders, usually implemented by the forward–backward Bahl–Cocke–Jelinek–Raviv (BCJR) algorithm [11]. However, such SISO decoders represent a nonnegligible factor in the complexity of the whole receiver. In real CDMA applications for either TDD or FDD modes, the maximum achievable channel load is often limited by synchronization and channel estimation issues, rather than by the ultimate capability of the decoder itself [1]. Hence, it makes sense to investigate simpler joint detection and decoding schemes, which outperform the conventional linear SUMF, MMSE, and decorrelator, and nonlinear parallel interference cancellation (PIC) or serial interference cancellation (SIC) receivers [12] and, nevertheless, yield performance similar to the SISO-based schemes at lower decoding complexity.

Driven by this consideration, this paper proposes a low-complexity iterative multiuser receiver, where SISO decoders are replaced by simpler standard (i.e., hard-output) Viterbi decoders. The Viterbi hard decisions are weighted and fed back to the interference cancellation stage. By using large-system analysis of random CDMA [13], we optimize the feedback weights at each iteration such that the SINR at the decoders' input of the next iteration is maximized.

We address the problem of bias of residual interference [14], [15] and of the ping–pong effect [12], [16], and we show that the performance of the proposed algorithm can be improved by compensating for the bias in the weight calculation and by modifying the basic PIC algorithm in order to prevent the ping–pong effect. These modifications allow higher channel load and/or faster convergence to the single-user performance at almost no additional computational cost.

¹ Notice that the only difference between the SUMF and the linear minimum mean-square error (MMSE) filter is that the latter treats MAI as colored noise while the former treats it as white noise [2].

1536-1276/04$20.00 © 2004 IEEE

NORDIO et al.: LOW-COMPLEXITY TURBO EQUALIZATION AND MULTIUSER DECODING 455

We validate the proposed receiver algorithm in UMTS-TDD realistic scenarios, including asynchronous transmission, frequency-selective fading channels, and power control. In this regime, the proposed receiver performs very close to the single-user (i.e., MAI-free) matched filter bound (i.e., intersymbol interference (ISI)-free) performance, even for large channel load.

The remainder of this paper is organized as follows: Section II gives a description of the system model. Section III describes the large-system asymptotic analysis. Based on the latter, the optimal feedback weights are derived in Section IV for synchronous and flat channels, taking into account the bias on the residual interference and the ping–pong effect for large channel load. Section V deals with asynchronous transmission and frequency-selective fading channels. Finally, conclusions are drawn in Section VI.

II. SYSTEM MODEL

We consider the uplink of a DS-CDMA system where $U$ users send encoded information to a common receiver. The base-band transmission chain for the th user is depicted in Fig. 1(a). Source bits are channel-encoded and organized in codewords of length bits. The codewords are then interleaved and modulated with transmitted energy per symbol . For simplicity, we assume that all users make use of convolutional coding and BPSK modulation. Let be the th symbol generated by the th user, in the set , and represent the code word of user after interleaving. The symbols are then spread by the spreading sequence . We assume all the users have the same spreading factor $L$ (number of chips per symbol) and unitary energy sequences, that is, , . If the chip pulse-shaping filter is and is common to all the users, then the th base-band continuous-time transmitted signal is

(1)

where is the chip period and is the symbol period.

In this paper, we focus on TD-CDMA, where users are synchronous at the slot level but asynchronous at the chip level. For the sake of simplicity, we assume that the convolutional codewords span a single slot. The propagation channel impulse responses are random but constant over each slot and are assumed perfectly known at the receiver.²
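The discrete part of the transmit chain described above (BPSK symbols spread by a unit-energy sequence) can be sketched as follows. This is a minimal illustration, not the authors' simulator: pulse shaping, interleaving, and the energy-per-symbol scaling are left out, the QPSK chip alphabet is borrowed from the random-spreading assumption of Section III, and the names `spread_bpsk` and `random_qpsk_sequence` are hypothetical.

```python
import numpy as np

def spread_bpsk(symbols, sequence):
    """Spread BPSK symbols with one unit-energy spreading sequence.

    symbols:  array of +1/-1 values, one per coded bit
    sequence: complex array of L chips with sum(|chip|^2) == 1
    returns:  chip-rate array of length len(symbols) * L
    """
    symbols = np.asarray(symbols, dtype=float)
    sequence = np.asarray(sequence, dtype=complex)
    # Each symbol multiplies the whole sequence; chips are laid out symbol by symbol.
    return np.kron(symbols, sequence)

def random_qpsk_sequence(L, rng):
    """Draw L i.i.d. QPSK chips and scale them so the sequence has unit energy."""
    chips = rng.choice([1, -1], size=L) + 1j * rng.choice([1, -1], size=L)
    return chips / np.sqrt(2 * L)

rng = np.random.default_rng(0)
L = 16                                   # spreading factor (illustrative value)
s = random_qpsk_sequence(L, rng)         # one user's spreading sequence
x = rng.choice([1.0, -1.0], size=8)      # eight BPSK symbols of that user
chips = spread_bpsk(x, s)                # chip-rate transmit sequence
```

Because the sequence has unit energy, correlating each block of `L` chips with the conjugate sequence returns the transmitted symbol exactly in the absence of noise and interference.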

Thus, the th user signal is sent through the channel with impulse response given by

(2)

² In practice, the user channel responses can be estimated from training sequences inserted in the middle of each slot, as currently proposed in the UMTS-TDD system [17].

Fig. 1. Transmitter and channel schemes.

where is the number of resolvable paths, is the th path delay, and is the th path coefficient. The coefficients are assumed Gaussian complex circularly symmetric random variables with distribution where . This model also takes into account users' asynchronous transmission, where the relative delays between users are included into the channel path delays.

The signal at the receiver is given by

(3)

(4)

where is the Gaussian noise and is defined by

(5)

and represents the overall channel impulse response for the th user as depicted in Fig. 1(b). The signal is sampled by the receiver at rate , an integer multiple of the chip-rate, where is a suitable integer chosen in order to satisfy the Nyquist criterion.

The discrete-time version of , sampled at rate , may have an infinite support but most of its energy is concentrated in a finite interval. This interval depends on the channel maximum delay spread, on the pulse-shaping filter , and on the spreading factor . After a suitable truncation we can represent the discrete-time channel impulse response as a vector , where is the maximum discrete-time channel length of all user channels. Subject to these assumptions, the received discrete-time base-band signal can be written as follows:

(6)


where is the vector of the received signal samples; matrix given by where the matrices contain, in each column, a shift of the vector as shown in Fig. 2; is the concatenation of the users' codewords; and contains the noise samples where the s are independent and identically distributed (i.i.d.) circularly symmetric complex Gaussian random variables with distribution ( denotes the identity matrix).

For the sake of simplicity, the receiver front end is constrained to be the conventional SUMF [2] that is given by the matrix .³ The matched filter performs at the same time pulse-shape matched filtering, channel matched filtering, and despreading. The output of the bank of SUMFs is given by [2]

(7)

where and for is the concatenation of the matched filter outputs; is the cross-correlation matrix given by where denotes the Hermitian operator; and the noise term is given by and has distribution . The vector represents the additive colored noise contribution after the th matched filtering.

In particular, the SUMF output for the th symbol of the th user is given by

(8)

The matrix is Hermitian and contains blocks of size each. In the following, indicates the th block given by

(9)

The entry of is given by

(10)

From the structure of , depicted in Fig. 2, and from (10), it is clear that the matrices are banded with nonzero diagonals where

(11)

Notice that since depends only on the difference , is uniquely defined by the vector

where

³ Some works, for example [9] and [10], consider a linear MMSE front end.

Fig. 2. Structure of the matrix $G_m$.

Hence, using (7) and the above considerations, it is possible to rewrite (8) as

(12)

where the first term is the useful symbol, the second term denotes the ISI, the third term is the MAI, and the last term is the colored noise.

The SUMF output feeds the proposed iterative multiuser detector-decoder depicted in Fig. 3. After SUMF, the signal passes through an interference cancellation (IC) stage that uses the estimates of the symbols to remove ISI and MAI (the superscript denotes th iteration). The hard decisions are provided by a bank of Viterbi decoders, whose outputs are weighted by a factor . Thus, the signal at the output of the IC stage in the th iteration is given by

(13)


Fig. 3. Scheme of the proposed turbo multiuser receiver.

where

(14)

is the residual MAI plus ISI at iteration , and .

Equation (13) shows that the IC stage performs at the same time MAI suppression and also noncausal decision feedback equalization (DFE) by removing from the th symbol the contributions due to the past and future symbols of the same user.

At the first iteration, the initial estimated symbols are set to zero, , and so that . In the case of perfect symbol estimates and , (13) reduces to

(15)

where the MAI and ISI are completely removed and the single-user matched filter bound (MFB) performance for user is attained. For later use, we define the normalized received instantaneous channel energy for user as , such that , and the single-user MFB instantaneous SNR for user as
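The two limiting cases of the IC stage discussed above (no feedback at the first iteration, and perfect estimates with unit weights) can be checked numerically in a simplified setting. The sketch below assumes a chip-synchronous, frequency-flat, single-symbol model, so there is no ISI and the cross-correlation matrix reduces to the Gram matrix of the spreading sequences. The function name `cancel` and all numeric values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
U, L = 8, 16                      # users and spreading factor (illustrative values)
sigma = 0.1                       # noise standard deviation per real dimension (assumed)

# Columns of S are unit-energy random QPSK spreading sequences, one per user.
S = (rng.choice([1, -1], (L, U)) + 1j * rng.choice([1, -1], (L, U))) / np.sqrt(2 * L)
x = rng.choice([1.0, -1.0], U)    # BPSK symbols of the U users in one symbol interval
n = sigma * (rng.standard_normal(L) + 1j * rng.standard_normal(L))
r = S @ x + n                     # received chip-rate samples

R = S.conj().T @ S                # cross-correlation matrix, unit diagonal
y = S.conj().T @ r                # bank of single-user matched filters
nu = S.conj().T @ n               # colored noise after matched filtering

def cancel(y, R, x_hat, alpha):
    """Subtract weighted interference estimates: z = y - (R - diag(R)) (alpha * x_hat)."""
    off_diag = R - np.diag(np.diag(R))
    return y - off_diag @ (alpha * x_hat)

z_first = cancel(y, R, np.zeros(U), np.zeros(U))    # first iteration: no feedback
z_ideal = cancel(y, R, x, np.ones(U))               # perfect estimates, unit weights
```

With zero feedback the IC output equals the SUMF output, and with perfect estimates and unit weights it collapses to the useful symbol plus colored noise, which is the single-user matched filter bound situation.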

SNR (16)

Intuitively, the weighting factors should depend on the reliability of the estimated symbols and be equal to one or to zero in the case of completely reliable or completely unreliable symbol estimates, respectively. In Section III, we make this statement precise by studying the performance of an idealized synchronous system in the large-system limit.

III. LARGE SYSTEM ASYMPTOTIC ANALYSIS

Rigorous asymptotic analysis of this iterative scheme in the large-system limit (i.e., for with finite ratio and ) for single-path channels and synchronous users is proposed in [14]. This analysis is based on the general approach of density evolution over graphs, a standard analysis tool for evaluating the asymptotic performance of message-passing iterative decoders [18], and makes use of results from the theory of large random matrices developed in [13] for the analysis of linear receivers with randomly spread CDMA in the large-system limit. In [14], it is also shown that this analysis holds only if the symbol estimates are functions of the decoder extrinsic information [19]. The decoder extrinsic information is defined for a SISO decoder based on the sum-product algorithm [20], such as the BCJR algorithm, but it is not defined for an ML decoder such as the Viterbi algorithm. Hence, we shall optimize the weights for a fictitious receiver where the Viterbi decoders are replaced by BCJR decoders and the hard decisions are obtained by one-bit quantization of the extrinsic likelihood ratios produced by the latter.

We hasten to say that the fictitious receiver is used in this paper as a reference but has no practical relevance, for the obvious reason that if BCJR decoders are used, then much more efficient soft-estimation of the interfering symbols as in [8]–[10] is possible. However, as will be clear from the rest of this section, the weight optimization based on the asymptotic analysis of the fictitious receiver allows us to derive a very simple expression for the optimal weights, independent of the user sequences and their mutual correlations, and a very simple practical algorithm for calculating these weights.

We make the following assumptions:

• complex random spreading sequences formed by i.i.d. chips uniformly distributed over the QPSK constellation;
• users are slot and chip synchronous;
• equal received expected energy per symbol for each user;
• frequency-flat identically distributed propagation channels;
• $U, L \to \infty$ with fixed ratio $U/L$ (the channel load, i.e., the number of users per chip in the system);
• empirical distribution of the user received normalized powers defined by [13]

(17)

converges weakly to a given nonrandom cumulative distribution function , for .

With the above assumptions, for a user randomly selected with uniform probability in the user population, we have SNR where is a random variable distributed (in the limit of large ) according to .


Under these conditions, the SINR at the decoders' input in the th iteration for the fictitious receiver can be written as follows [14]:

SNR (18)

where

(19)

is the degradation factor paid by any user with respect to its single-user MFB performance, and where

(20)

is the average variance of the residual interfering symbols. In (20), denotes a coded symbol of a generic user received at instantaneous power , and denotes the hard decision relative to at the decoder output. The expectation in (20) is with respect to uniform over , distributed as the hard decision about at the decoder output, and . We refer to as the multiuser efficiency (ME) [2] at iteration . Clearly, the single-user MFB performance is achieved for all users if .

We can optimize the weighting factor as a function of the channel energy in order to minimize the expected interference variance at every iteration. Therefore, for each iteration we seek the solution of the simple optimization problem

(21)

By expanding the above conditional expectation, we get

(22)

where is the symbol error rate (SER) of a decoder with an input signal-to-noise ratio . The solution of (21) is easily obtained as

(23)

Assuming that the residual interference plus noise at iteration is Gaussian,⁴ the SER is a known function of the decoder input SINR. We shall refer to as the SER characteristic of the user code (see Appendix A). Thus, for user at iteration the optimal weighting factor is given by

(24)

The resulting minimized average interference variance is given by

(25)

⁴ In the large-system limit and under mild technical conditions, this assumption is valid, as shown rigorously in [21].

where expectation is with respect to .
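The weight optimization of (21)–(25) can be retraced with a short worked derivation. The notation here is an assumption introduced for illustration, since the extracted text no longer carries the original symbols: $x \in \{-1,+1\}$ is the BPSK symbol, $\hat{x}$ its hard decision, $\alpha$ the feedback weight, $\gamma$ the normalized channel energy, and $\epsilon$ the SER at the decoder output, so that $\hat{x} = x$ with probability $1-\epsilon$ and $\hat{x} = -x$ otherwise.

```latex
\begin{align}
\mathbb{E}\bigl[\,|x-\alpha\hat{x}|^{2}\,\big|\,\gamma\bigr]
  &= (1-\alpha)^{2}\,(1-\epsilon) + (1+\alpha)^{2}\,\epsilon
   = 1 - 2\alpha\,(1-2\epsilon) + \alpha^{2}, \\
\frac{\partial}{\partial\alpha}\Bigl[\,1 - 2\alpha(1-2\epsilon) + \alpha^{2}\Bigr] = 0
  &\;\Longrightarrow\; \alpha^{\star} = 1 - 2\epsilon, \\
\min_{\alpha}\,\mathbb{E}\bigl[\,|x-\alpha\hat{x}|^{2}\,\big|\,\gamma\bigr]
  &= 1-(1-2\epsilon)^{2} = 4\,\epsilon\,(1-\epsilon).
\end{align}
```

Under these assumptions the weight equals one for error-free decisions ($\epsilon = 0$) and zero for completely unreliable ones ($\epsilon = 1/2$), which matches the intuition stated at the end of Section II.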

Fig. 4. Evolution of ME given by asymptotic analysis for CC(5,7), channel load $U/L = 2$, and $E_b/N_0 = 5$ dB.

The behavior of the asymptotic system is described by the evolution of the ME and it can be represented by the one-dimensional nonlinear dynamical system with initial condition , and where the mapping function is defined by

(26)

It can be easily checked that, since has range and it is nonincreasing, then has range and it is nondecreasing. Therefore, the dynamical system defined by with initial condition has at least one stable fixed point in the interval , and the sequence is nondecreasing and upper bounded by one. The limit exists and is equal to the left-most stable fixed point of the system in the interval .
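The monotone convergence argument above can be illustrated with a small fixed-point iteration. This is a sketch under several assumptions that are not confirmed by the extracted text: the degradation factor is taken to have the familiar matched-filter form 1 / (1 + load · SNR · v), all users have equal power, the residual variance is v = 4ε(1 − ε) as obtained with the weight 1 − 2ε, and the SER characteristic is replaced by an uncoded-BPSK stand-in rather than the actual CC(5,7) curve of Appendix A. The names `ser_standin`, `me_map`, and `iterate_me` are hypothetical.

```python
import math

def ser_standin(sinr):
    """Stand-in SER characteristic: uncoded BPSK over AWGN (NOT the CC(5,7) curve)."""
    return 0.5 * math.erfc(math.sqrt(max(sinr, 0.0)))

def me_map(eta, load, snr, ser=ser_standin):
    """One step of the assumed ME recursion for equal-power users."""
    eps = ser(snr * eta)                      # SER at decoder input SINR = snr * eta
    residual_var = 4.0 * eps * (1.0 - eps)    # interference variance with weight 1 - 2*eps
    return 1.0 / (1.0 + load * snr * residual_var)

def iterate_me(load, snr, n_iter=30):
    """Iterate eta <- Psi(eta) from eta = 0 and return the whole trajectory."""
    etas = [0.0]
    for _ in range(n_iter):
        etas.append(me_map(etas[-1], load, snr))
    return etas

snr = 10 ** (5 / 10)           # 5 dB, used here directly as the symbol SNR (assumption)
trajectory = iterate_me(load=1.0, snr=snr)
```

Starting from zero, the first step reproduces the plain matched-filter efficiency 1 / (1 + load · SNR), since no interference has been cancelled yet, and every later step can only increase the efficiency because the stand-in SER is nonincreasing in the SINR.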

As an example, Fig. 4 shows the function for a channel load of 2, $E_b/N_0 = 5$ dB, constant instantaneous received power (i.e., for all users), and the four-state convolutional code of rate 1/2 and octal generators (5,7) [denoted in the following by CC(5,7)]. For the sake of comparison, we also show the evolution of the same system when the BCJR decoder provides soft extrinsic estimates as proposed in [8]. The function in this case is derived in [14]. The limit in this case is very close to 1, meaning that all users attain near-single-user performance. Notice that soft feedback assures a faster decoder convergence to the single-user performance.

When the channel load is increased, the curves are modified so that for a certain threshold channel load , the curve corresponding to weighted hard decisions is tangent to the diagonal, as shown in Fig. 5. This means that the system has reached its maximum load and is not able to converge to near-single-user performance. On the contrary, the system using soft decisions still converges to single-user performance. This shows that SISO decoding and soft feedback also provide a higher threshold channel load, i.e., a larger overall maximum achievable spectral efficiency of the system.

Fig. 5. Evolution of ME given by asymptotic analysis for CC(5,7), channel load $U/L = 2.35$, and $E_b/N_0 = 5$ dB.

IV. IMPLEMENTATION OF THE PROPOSED RECEIVER

A. Basic Algorithm

The performance of the proposed receiver depends on the computation of the weighting factors , which in turn depend on reliable SINR estimation. Driven by the asymptotic analysis of Section III, we propose to compute the weighting factor for the th user at iteration as

(27)

where is the (known) SER code characteristic, and is the estimated SINR at the th decoder input of the th iteration. In order to estimate the input SINR, we can use the estimator proposed in [22], given by

(28)

where . In [22], it is shown that is an unbiased estimator of the residual MAI plus ISI plus noise variance at the decoder input if is uncorrelated with the desired variable . Remarkably, in this case, the MSE of this estimator is very close to that of the ML estimator assuming known useful symbols, given by (which is not applicable here, since the symbols are unknown).

When the symbol estimates are provided by a decision statistic containing the current observation interval, such as in the Viterbi decoder, then the residual interference term given by (14) is conditionally biased given , that is,

(29)

where the bias coefficient is nonpositive and depends on the system parameters and on the user and iteration indexes as shown in [14] and [15]. The bias is not negligible, especially for high channel load , even in the absence of ISI, and reduces the energy of the useful signal at iteration by a factor .

On the contrary, if the symbol estimates are provided by estimates not containing the current observation interval, i.e., they are based on the decoder extrinsic information [14], [23], then in the limit for large block length (i.e., ) and random interleaving, the residual interference is conditionally unbiased, i.e., .

We can rewrite the input of the th decoder at iteration given in (13) as

(30)

where is uncorrelated with . Hence, the true SINR in the presence of bias is given by

(31)

where we define . Now, the SINR estimator proposed in (28), in the presence of bias and for large , converges in probability to

Since (i.e., the bias tends to decrease the useful signal term), we conclude that the estimator (28) tends to overestimate the SINR at the decoder input. As a consequence, the weights computed according to (27) are mismatched in the presence of bias.

B. Estimation of the Bias

In order to overcome this problem, a better SINR estimation taking into account the bias of residual interference is required. The term in (31) appears also in the variance of the decoder input signal, given by

(32)

and the bias term is contained in the expression of the correlation between the symbol estimates and the decoder input, which, recalling (30), is given by


(33)

where is a known characteristic function of the convolutional code, proportional to the correlation between the symbol estimates and the interference (see Appendix A).

Fig. 6. True and estimated SINR for “Viterbi (SINR est.)” and “Viterbi (SINR-BIAS est.)” decoders, using CC(5,7), $U = 32$, $L = 16$, and $E_b/N_0 = 5$ dB.

Using (31) and (32), the correlation in (33) can be rewritten as

(34)

In practice, an approximation of and can be computed as follows:

and (34) can be solved numerically for . Let be the estimated bias, solution of (34). Then, the estimated SINR is given by

(35)

Fig. 6 refers to a flat nonfading system (i.e., for all users), with $U = 32$ users and spreading factor $L = 16$ (corresponding to the channel load $U/L = 2$), $E_b/N_0 = 5$ dB, using the convolutional code CC(5,7). Users are chip-asynchronous and their relative delays are uniformly distributed over an interval spanning one symbol. The true and estimated SINR versus the iterations for a given user are shown. The curve labeled “Viterbi (SINR est.)” refers to the proposed iterative receiver that estimates the SINR using (28), while the curve labeled “Viterbi (SINR-BIAS est.)” refers to the system that estimates both the bias and the SINR, using (35). Notice that in the first few iterations the SINR estimator given by (28) overestimates the true SINR by up to several decibels. Instead, the SINR estimation given by (35) is very close to the true SINR.

C. Ping–Pong Effect and its Compensation

As shown in Sections II and III, the proposed receiver allows system loads up to a certain threshold above which the system cannot approach the single-user performance. In such high-load situations, system parameters such as BER, SER, ME, and bias tend to oscillate between two convergence patterns [12]. This phenomenon is called ping–pong and it is related to the bias in the residual interference term. In fact, it does not appear when feedback is obtained from a SISO decoder extrinsic information [14].⁵

A further investigation reported in [16] showed that such a bistable situation is due to a fixed subset of the estimated symbols that flip when passing from one iteration to the next, while the estimated symbols in the complementary subset do not change. A countermeasure to this problem proposed in [16] consists of introducing a perturbation into the bistable situation, by feeding back to the IC stage the average of the estimates obtained from the two previous iterations. Thus, the contribution of the flipping symbols (considered as not reliable) is mitigated. By considering this idea, (20) is modified as

(36)

where the two previous estimates are now weighted with the coefficients and . Now, we can optimize the weighting factors and as functions of the channel energy in order to minimize the expected interference variance at every iteration. In analogy with what was done before, we seek the solution of the optimization problem

(37)

After straightforward algebra (we skip the details because of space limitations), we obtain the solution

(38)

(39)

where

⁵For the extrinsic-based schemes, there obviously exists a threshold load above which single-user performance cannot be achieved, but no oscillatory behavior appears.


Fig. 7. Bias in the statistic of z using CC(5,7), U = 40, L = 16, and E_b/N_0 = 5 dB. Oscillations due to the ping–pong effect.

Fig. 8. True SINR of several multiuser receivers using CC(5,7), U = 40, L = 16, and E_b/N_0 = 5 dB.

In practice, the weights and of user at iteration are computed as follows:

• for , is irrelevant (since ) and

where is computed according to (35);

• for , we let , , and

. Thus, and are given by (38) and (39), respectively.
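The effect of combining the two previous estimates can be illustrated with a minimal sketch. It assumes hard BPSK decisions in {+1, −1} and uses made-up decision vectors; the function name `combine_feedback` and the default weights of 0.5 (plain averaging, as in the countermeasure of [16]) are illustrative assumptions and are not the optimized weights of (38) and (39).

```python
def combine_feedback(prev1, prev2, w1=0.5, w2=0.5):
    """Weighted sum of the symbol estimates from the two previous iterations.

    w1 and w2 are hypothetical placeholder weights; with 0.5 and 0.5 this is
    the plain average of the two estimates.
    """
    return [w1 * a + w2 * b for a, b in zip(prev1, prev2)]


# Made-up hard decisions (+1/-1) from the two previous iterations.
# Positions 1, 3 and 5 flip between iterations; the others are stable.
est_prev1 = [+1, -1, +1, +1, -1, -1]
est_prev2 = [+1, +1, +1, -1, -1, +1]

fed_back = combine_feedback(est_prev1, est_prev2)
print(fed_back)  # [1.0, 0.0, 1.0, 0.0, -1.0, 0.0]
```

With equal weights, symbols that flip between iterations contribute nothing to the cancelled interference, while stable symbols are fed back unchanged, which is the mitigation described above.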

Figs. 7 and 8 illustrate the bias and the true SINR plotted versus the iterations for a system using Viterbi decoding, with an unfaded chip-asynchronous flat channel (i.e., ), , , and the same conditions of Fig. 6. The “Viterbi (SINR est.)” and the “Viterbi (SINR-BIAS est.)” decoders do not converge to near-single-user performance and show the ping–pong effect in the bias and in the SINR. The receiver that weights the two previous iterations, denoted in the figures by “Viterbi (2-feedback),” converges faster to near-single-user performance and does not show oscillations.

Fig. 9. BER of several multiuser receivers using CC(5,7), U = 40, L = 16, and E_b/N_0 = 5 dB.

Fig. 10. Number of iterations required to attain an ME > −0.1 dB for several multiuser decoders using CC(5,7), L = 16, and E_b/N_0 = 5 dB.

This behavior can also be seen in Fig. 9, where the BER provided by the proposed receivers is shown for an unfaded chip-asynchronous flat channel and the same conditions of Fig. 8. The curves are obtained by Monte Carlo simulation, by averaging over a large number of frames and channel realizations. The curve labeled “Viterbi ” denotes the receiver that feeds back hard decisions (i.e., assuming for

and ) and the bold horizontal line represents the single-user performance. The receiver “Viterbi (2-feedback)” converges to single-user performance in six iterations. Under these conditions, it converges even faster than the receiver employing BCJR decoders providing soft extrinsic output, denoted by “BCJR (soft EXT)”.

Finally, Fig. 10 compares the performance of several receivers in terms of speed of convergence to near-single-user performance for dB, , and for . As a test of convergence, we consider the number of iterations required to reach an average ME larger than −0.1 dB. The basic hard feedback Viterbi receiver has maximum channel load . The weighted hard feedback Viterbi with SINR and bias estimation improves this limit to . Finally, the receiver “Viterbi (2-feedback)” outperforms every other receiver considered and has limit load (corresponding to a spectral efficiency of 1.85 bit/s/Hz). The dashed curves refer to the receivers employing BCJR decoders and extrinsic information feedback and are shown as a reference.

Fig. 11. Performance of the “Viterbi (2-feedback)” multiuser decoder in the presence of multipath fading channels using CC(5,7), L = 16, and E_b/N_0 = 5 dB. FPC and SPC denote fast and slow power control conditions, respectively.
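The convergence test can be sketched as follows. This is a minimal illustration, assuming that the average ME is available per iteration as a linear-scale value not exceeding 1 (so that it is nonpositive in dB) and that the threshold is −0.1 dB; the function name and the sample trajectories are hypothetical.

```python
import math


def iterations_to_converge(avg_me_per_iteration, threshold_db=-0.1):
    """Return the first iteration (1-based) whose average ME exceeds the
    threshold in dB, or None if the threshold is never reached."""
    for iteration, me in enumerate(avg_me_per_iteration, start=1):
        if me > 0 and 10.0 * math.log10(me) > threshold_db:
            return iteration
    return None


# Made-up trajectories of the average ME (linear scale).
converging = [0.30, 0.60, 0.90, 0.98, 0.995]
ping_pong = [0.30, 0.60, 0.30, 0.60, 0.30, 0.60]

print(iterations_to_converge(converging))  # 4
print(iterations_to_converge(ping_pong))   # None
```

A load would be counted as supported only when the first call pattern occurs, that is, when the test returns a finite iteration count within the allowed number of iterations.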

V. PERFORMANCE IN FREQUENCY SELECTIVE FADING CHANNELS

In this section, we consider frequency selective channels and asynchronous users. Thus, the ISI term appears in (14). In the presence of ISI, the system converges to near single-user MFB performance [5] for a channel load , generally lower than the limit for flat synchronous channels. Indeed, the presence of ISI slows down the convergence to single-user performance and reduces the achievable channel load. Two different cases are considered:

• constant instantaneous power for all users;

• constant average power for all users.

For simplicity, under frequency selective channel conditions, we used the same formulas developed for flat fading in order to compute the feedback weights. This is motivated by the result of Evans and Tse [24], showing that the large-system limit of a frequency-selective system with random spreading is essentially identical to the frequency flat case if the receiver has perfect channel knowledge.

A. Constant Instantaneous Power

This case is representative of perfect fast power control that fully compensates for the instantaneous (slot-by-slot) effect of fading, so that . Thus

(40)

Under these conditions, we evaluated by simulation the degradation due to the ISI with respect to the results shown in Section IV. Fig. 11 shows the number of iterations needed to reach an ME larger than −0.1 dB, as a function of the channel load, for E_b/N_0 = 5 dB, L = 16, the “Viterbi (2-feedback)” receiver, and for three different channels whose multipath intensity profile (MIP) [25] is shown in Table I. CH1 is specified as a standard UMTS test channel in [17], while CH2 and CH3 were created ad hoc in order to test the system in severe ISI conditions.

CH1 shows a negligible degradation with respect to the single-path case while, for the channels CH2 and CH3, the maximum achievable channel load is decreased to and

, respectively. Notice also that, in the case of strong ISI and low channel loads, more iterations are required for convergence than in the single-path case.

B. Constant Average Power

This case is representative of a slow power control system that cannot compensate for the instantaneous fading, but maintains a fixed average received power for each user. The average received power is assumed to be the same for all users, that is

(41)

Fig. 11 shows the number of iterations needed to reach an ME larger than −0.1 dB as a function of the channel load and for the channels given in Table I. In this case, two contrasting phenomena characterize the convergence to single-user MFB performance. On one hand, faded users have little impact on unfaded users, since they are received at much lower power. On the other hand, even if unfaded users can be reliably estimated and subtracted from the received signal, faded users have an instantaneous SINR that is too low for achieving a small SER; therefore, their contribution cannot be perfectly eliminated even in the absence of MAI. The fact that performance under slow power control is worse than under fast power control (see Fig. 11) indicates that the effect of uncanceled faded users dominates the performance of the receiver. Moreover, Fig. 11 shows another interesting fact. Namely, the degradation of the receiver performance due to uncompensated fading is larger for channels with little multipath diversity (e.g., single-path, CH2 and CH3), while it is smaller for channels with rich multipath diversity (e.g., the UMTS test channel CH1). This is intuitively clear, since when the multipath diversity is large, the random fluctuations of the instantaneous received power are reduced, and the fraction of users that can be reliably estimated and subtracted is larger.
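The link between multipath diversity and power fluctuations can be checked numerically with a small sketch. It assumes independent Rayleigh-faded paths of equal average power, normalized so that the mean channel energy is 1; this equal-power profile is an illustrative assumption and is not the MIP of Table I. Under this assumption the normalized energy is a sum of P exponential variables with mean 1/P each, so its variance is 1/P.

```python
import random


def channel_energy(num_paths, rng):
    """Normalized instantaneous channel energy: sum of num_paths squared
    Rayleigh path gains, each with mean 1/num_paths (assumed equal-power MIP)."""
    return sum(rng.expovariate(num_paths) for _ in range(num_paths))


def energy_stats(num_paths, n_samples=20000, seed=0):
    """Sample mean and variance of the normalized channel energy."""
    rng = random.Random(seed)
    samples = [channel_energy(num_paths, rng) for _ in range(n_samples)]
    mean = sum(samples) / n_samples
    var = sum((g - mean) ** 2 for g in samples) / n_samples
    return mean, var


for p in (1, 2, 4, 8):
    mean, var = energy_stats(p)
    print(f"paths={p}: mean={mean:.3f}, variance={var:.3f}, theory={1 / p:.3f}")
```

The mean stays close to 1 for every profile, while the variance shrinks as the number of paths grows, which is consistent with fewer users being deeply faded at any given time.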

In general, the average BER for a finite-dimensional system can be computed by Monte Carlo simulation, by averaging over a large number of frames and channel realizations. In order to reduce the complexity of BER computation, we propose a semi-analytic approach. Assuming symmetric users, with the same channel statistics and average received power, the average BER (averaged with respect to both the channel statistics and over the user population) can be bounded by

(42)

TABLE I. MIP OF CONSIDERED CHANNELS

Fig. 12. BER provided by the “Viterbi (2-feedback)” multiuser decoder for channel CH3 using CC(5,7), U = 32, and L = 16.

where is the bit error probability versus decoder input SNR function of the employed convolutional code, where (a) follows from Jensen’s inequality applied to the convex function and where we define the average ME as the arithmetic mean of the ME of all users. Notice that, under mild conditions, for randomly spread CDMA in the large-system limit the ME converges to a constant independent of the user index, and the inequality in (42) holds with equality. However, for a finite-dimensional system and given spreading sequences, the above provides only a lower bound.

The expectation in the last line of (42) is with respect to the normalized channel energy (assumed identically distributed for all users). The evaluation of the above lower bound is much less computationally intensive than full Monte Carlo simulation of the whole system. In fact, the average ME can be obtained by running short simulations, since it requires a much smaller statistical sample to converge than the average BER. Then, the expectation with respect to can be obtained by either another Monte Carlo simulation or, if the distribution of is known, by numerical integration.
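The two ways of evaluating the expectation can be sketched as follows. The sketch rests on several assumptions that are not taken from the paper: the decoder input SNR is modeled as the product of the average ME, the normalized channel energy, and the SNR; the channel energy is exponentially distributed with unit mean (single-path Rayleigh fading); and the code characteristic is replaced by the uncoded BPSK error probability as a stand-in, chosen only because the resulting expectation has a known closed form to compare against. In a real evaluation the stand-in would be replaced by the tabulated BER characteristic of the code.

```python
import math
import random


def ber_characteristic(snr):
    """Stand-in for the code's BER-versus-SNR function: uncoded BPSK,
    Q(sqrt(2 * snr)). Assumption for illustration only."""
    return 0.5 * math.erfc(math.sqrt(snr))


def ber_monte_carlo(avg_me, snr, n_samples=50000, seed=1):
    """Average the characteristic over random draws of the channel energy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        gamma = rng.expovariate(1.0)  # assumed unit-mean exponential energy
        total += ber_characteristic(avg_me * gamma * snr)
    return total / n_samples


def ber_numerical(avg_me, snr, g_max=40.0, step=1e-3):
    """Trapezoidal integration of the characteristic against exp(-gamma)."""
    n = int(g_max / step)
    total = 0.0
    for i in range(n + 1):
        g = i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * ber_characteristic(avg_me * g * snr) * math.exp(-g)
    return total * step


snr = 10 ** (5 / 10)   # 5 dB, linear scale
avg_me = 0.9           # hypothetical average ME from a short simulation
s = avg_me * snr
closed_form = 0.5 * (1.0 - math.sqrt(s / (1.0 + s)))  # BPSK over Rayleigh fading
print(ber_monte_carlo(avg_me, snr), ber_numerical(avg_me, snr), closed_form)
```

Both estimates agree with the closed form to within their respective sampling and discretization errors, and either costs far less than simulating the complete multiuser system.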

Fig. 12 shows the comparison between the full Monte Carlo simulation and the semi-analytic approach for a system with U = 32, L = 16, and channel CH3. Solid lines show the BER obtained by Monte Carlo simulation for iterations, respectively, plotted versus . The dashed lines show the results obtained using the semi-analytic approach. The thick solid line corresponds to the single-user MFB. For low SNR the system does not converge to the single-user MFB even for a large number of iterations, while for high SNR convergence is obtained in a few iterations. We notice that the semi-analytic approach yields fairly accurate results even for such a small system.

VI. CONCLUSION

We proposed a low complexity iterative turbo equalizer and multiuser decoder/detector for TD-CDMA systems, characterized by convolutional coding, hard-output Viterbi decoding, and weighted feedback. From a rigorous large-system analysis of synchronous users and flat channels we gained the rationale for the optimization of the feedback weights. In the proposed iterative scheme, however, the presence of Viterbi decoders (not providing extrinsic information) produces biased statistics after the IC stage. Hence, we proposed a method for estimating the bias and improving the performance of the basic iterative decoder. Moreover, in order to cope with the ping–pong effect, we proposed a modified algorithm where a weighted sum of the decisions made in the two previous iterations is used for interference cancellation. The modified algorithm outperforms the basic PIC receiver and even the receiver based on SISO decoding and soft feedback, with lower complexity. In particular, the complexity of the SISO decoder is roughly twice that of the Viterbi decoder with hard output, while the complexity of the interference canceler is the same for both systems. Moreover, both systems require explicit estimation of the SINR at the input of each decoder. The proposed receiver was validated in a realistic scenario, including asynchronous users, frequency selective fading channels, and power control. Simulation results show that the receiver is robust to severe ISI conditions, even though its performance is degraded by uncompensated fading. However, this degradation is not very evident in the presence of sufficient multipath diversity. Finally, a simple and fast semi-analytic approach to compute the BER was proposed and its behavior compared with the results obtained via Monte Carlo simulation.

As a concluding remark, we would like to point out that the proposed receiver structure is naturally suited to a frame-synchronous TD-CDMA system and, with some modifications, it can be implemented in a UMTS-TDD scenario (i.e., by allowing nonorthogonal spreading sequences, a midamble able to accommodate a larger number of users, and simple convolutional codes instead of complex (256 states) convolutional codes or turbo codes [3], [17]).

APPENDIX

A. Definition of , , and

Consider a transmitter that maps the information bit sequence onto a sequence of coded binary BPSK symbols

and sends over the AWGN channel defined by

where is Gaussian noise with distribution . The signal-to-noise ratio is SNR . The receiver decodes the signal providing hard decisions on the transmitted symbols and hard decisions on the information bits

. We define the following.

1) SER versus SNR characteristic is

SNR (43)

2) BER versus SNR characteristic is

SNR (44)

3) Correlation characteristic between decisions and noise is

SNR (45)

The above functions depend uniquely on the code considered (i.e., they are “characteristics” of the code) and, although they are generally unknown in closed form, they can be precomputed by Monte Carlo simulation and stored as look-up tables in the receiver memory. Therefore, the impact of real-time evaluation of these functions on the overall receiver complexity is negligible.
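The precomputation of such look-up tables can be sketched for CC(5,7) with a hard-output Viterbi decoder. The sketch makes several assumptions that are not taken from the definitions above: bit 0 is mapped to +1 and bit 1 to −1, the noise is real Gaussian with variance 1/SNR, the trellis is terminated with two zero tail bits, the symbol decisions are obtained by re-encoding the decoded bits, and the correlation characteristic is taken as the sample mean of the product between each noise sample and the corresponding hard symbol decision. All function names are hypothetical.

```python
import random


def encode(bits):
    """Rate-1/2 convolutional encoder with octal generators (5, 7)."""
    s1 = s2 = 0  # s1: previous input bit, s2: the bit before that
    coded = []
    for b in bits:
        coded += [b ^ s2, b ^ s1 ^ s2]  # 5 -> taps 101, 7 -> taps 111
        s1, s2 = b, s1
    return coded


def viterbi(y, n_steps):
    """Hard-output Viterbi decoder (correlation metric) over n_steps trellis
    sections, starting and ending in the all-zero state."""
    neg_inf = float("-inf")
    metric = [0.0, neg_inf, neg_inf, neg_inf]
    paths = [[], None, None, None]  # state index = 2 * s1 + s2
    for k in range(n_steps):
        new_metric = [neg_inf] * 4
        new_paths = [None] * 4
        for s in range(4):
            if paths[s] is None:
                continue  # state not reachable yet
            s1, s2 = s >> 1, s & 1
            for b in (0, 1):
                c0, c1 = b ^ s2, b ^ s1 ^ s2
                m = metric[s] + y[2 * k] * (1 - 2 * c0) + y[2 * k + 1] * (1 - 2 * c1)
                nxt = 2 * b + s1
                if m > new_metric[nxt]:
                    new_metric[nxt] = m
                    new_paths[nxt] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    return paths[0]


def characteristics(snr, n_info=100, n_frames=30, seed=0):
    """Monte Carlo estimates of (SER, BER, decision-noise correlation)."""
    rng = random.Random(seed)
    sigma = (1.0 / snr) ** 0.5  # assumed normalization: SNR = 1 / sigma^2
    sym_err = bit_err = n_sym = 0
    corr = 0.0
    for _ in range(n_frames):
        info = [rng.randint(0, 1) for _ in range(n_info)]
        bits = info + [0, 0]  # tail bits terminate the trellis
        x = [1 - 2 * c for c in encode(bits)]
        noise = [rng.gauss(0.0, sigma) for _ in x]
        y = [a + v for a, v in zip(x, noise)]
        decoded = viterbi(y, len(bits))
        x_hat = [1 - 2 * c for c in encode(decoded)]  # re-encoded symbol decisions
        sym_err += sum(a != b for a, b in zip(x, x_hat))
        bit_err += sum(a != b for a, b in zip(info, decoded[:n_info]))
        corr += sum(v * a for v, a in zip(noise, x_hat))
        n_sym += len(x)
    return sym_err / n_sym, bit_err / (n_frames * n_info), corr / n_sym


# Look-up table indexed by SNR in dB (grid chosen arbitrarily for illustration).
table = {snr_db: characteristics(10 ** (snr_db / 10)) for snr_db in (-3, 0, 3, 6)}
for snr_db, (ser, ber, corr) in table.items():
    print(f"{snr_db:>3} dB: SER={ser:.4f}  BER={ber:.4f}  corr={corr:.4f}")
```

At run time the receiver would only interpolate in such a table, so the cost of generating it offline does not affect the per-iteration complexity.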

REFERENCES

[1] R. Prasad, W. Mohr, and E. W. Konhäuser, Third Generation Mobile Communication Systems. Boston, MA: Artech, 2000.

[2] S. Verdu, Multiuser Detection. Cambridge, U.K.: Cambridge Univ. Press, 1998.

[3] “ETSI TS 125 222 Multiplexing and Channel Coding TDD 3GPP TS 25.222 Version 4.0.0 Release 4,” 3GPP, 2001.

[4] S. Verdu and S. Shamai, “Spectral efficiency of CDMA with random spreading,” IEEE Trans. Inform. Theory, vol. 45, pp. 622–640, Mar. 1999.

[5] S. Shamai and S. Verdu, “The impact of frequency-flat fading on the spectral efficiency of CDMA,” IEEE Trans. Inform. Theory, vol. 47, pp. 1302–1327, May 2001.

[6] M. Varanasi and T. Guess, “Optimum decision feedback multiuser equalization with successive decoding achieves the total capacity of the Gaussian multiple access channel,” in Proc. Asilomar Conf., Pacific Grove, CA, Nov. 1997.

[7] R. R. Müller and S. Verdu, “Design and analysis of low-complexity interference mitigation on vector channels,” IEEE J. Select. Areas Commun., vol. 19, pp. 1429–1441, Aug. 2001.

[8] P. Alexander, A. Grant, and M. Reed, “Iterative detection in code-division multiple-access with error control coding,” Eur. Trans. Telecomm., vol. 9, pp. 419–425, Sept. 1999.

[9] X. Wang and V. Poor, “Iterative (Turbo) soft interference cancellation and decoding for coded CDMA,” IEEE Trans. Commun., vol. 47, pp. 1047–1061, July 1999.

[10] H. ElGamal and E. Geraniotis, “Iterative multiuser detection for coded CDMA signals in AWGN and fading channels,” IEEE J. Select. Areas Commun., vol. 18, pp. 30–41, Jan. 2000.

[11] L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate,” IEEE Trans. Inform. Theory, vol. 20, pp. 284–287, Mar. 1974.

[12] L. K. Rasmussen, “On ping-pong effects in linear interference cancellation for CDMA,” in Proc. IEEE 6th Int. Symp. Spread-Spectrum Techniques Applications, Sept. 2000.

[13] D. Tse and S. Hanly, “Linear multiuser receivers: Effective interference, effective bandwidth and capacity,” IEEE Trans. Inform. Theory, vol. 45, pp. 641–675, Mar. 1999.

[14] J. Boutros and G. Caire, “Iterative multiuser decoding: Unified framework and asymptotic performance analysis,” IEEE Trans. Inform. Theory, vol. 48, pp. 1772–1793, July 2002.

[15] S. Marinkovic, B. Vucetic, and J. Evans, “Improved iterative parallel interference cancellation,” in Proc. ISIT 2001, Washington, DC, June 2001, pp. 34–34.

[16] E. G. Ström and S. L. Miller, “Iterative demodulation of orthogonal signaling formats in asynchronous DS-CDMA systems,” Proc. ISSSE, pp. 184–187, July 2001.

[17] “TS-25.105v3.1.0 UTRA (BS) TDD radio transmission and reception,” 3GPP-TSG-RAN-WG4, 2000.

[18] T. Richardson and R. Urbanke, “The capacity of low-density parity-check codes under message-passing decoding,” IEEE Trans. Inform. Theory, vol. 47, pp. 599–618, Feb. 2001.

[19] S. Benedetto, D. Divsalar, G. Montorsi, and F. Pollara, “Soft-input soft-output building blocks for the construction of distributed iterative decoding of code networks,” Eur. Trans. Commun., Apr. 1998.

[20] F. Kschischang, B. Frey, and H. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inform. Theory, vol. 47, pp. 498–519, Feb. 2001.

[21] J. Zhang, E. Chong, and D. Tse, “Output MAI distributions of linear MMSE multiuser receivers in DS-CDMA systems,” IEEE Trans. Inform. Theory, vol. 47, pp. 1128–1144, Mar. 2001.

[22] M. Kobayashi, J. Boutros, and G. Caire, “Successive interference cancellation with SISO decoding and EM channel estimation,” IEEE J. Select. Areas Commun., vol. 19, pp. 1450–1460, Aug. 2001.

[23] C. Berrou and A. Glavieux, “Near optimum error-correcting coding and decoding: Turbo codes,” IEEE Trans. Commun., vol. 44, Oct. 1996.

[24] J. Evans and D. N. C. Tse, “Large system performance of linear multiuser receivers in multipath fading channels,” IEEE Trans. Inform. Theory, vol. 46, pp. 2059–2078, Sept. 2000.

[25] J. G. Proakis, Digital Communications, 4th ed. New York: McGraw-Hill, 2000.

Alessandro Nordio (S’00) received the M.Sc. degree in telecommunications engineering from Politecnico di Torino, Torino, Italy, in July 1998 and the Ph.D. degree from École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland.

From August 1998 to December 1998, he worked as a Consultant for Omnitel. From 1999 to 2002, he was with the Mobile Communications Department, Institut Eurecom, Sophia-Antipolis, France. In April 2002, he joined the Department of Electrical Engineering of Politecnico di Torino where he is working as a Postdoctoral student. His research interests include the field of signal processing, multi-user detection, and space–time coding.


Marco A. Hernandez received the M.S. degree in electrical engineering from the Center for Research and Advanced Studies, National Polytechnic Institute of Mexico (CINVESTAV), and the Ph.D. degree in electrical engineering from Delft University of Technology, Delft, The Netherlands.

From 1995 to 1996, he was an International Young Graduate Trainee with the European Space Research and Technology Center (ESTEC), working on land mobile satellite propagation channel modeling. From 1997 to 2001, he was a Research Assistant with the faculty of electrical engineering, TVS Group, Delft University of Technology. In 1999, he had a research internship in the Center for Communications and Signal Processing Research Laboratory, New Jersey Institute of Technology, Hoboken. From 2001 to 2002, he was a Postdoctoral Researcher with the Mobile Communication Group, Institute Eurecom in Sophia-Antipolis, France. He joined the Department of Electrical Engineering at Yokohama University, Yokohama, Japan, in 2003, where he is a Research Associate. He holds a joint appointment as a Visiting Researcher at the Communications Research Laboratory, Yokosuka, Japan. His research interests include wireless communications.

Giuseppe Caire (S’91–M’94–SM’03) was born in Torino, Italy, on May 21, 1965. He received the B.Sc. degree in electrical engineering from Politecnico di Torino, Torino, Italy, in 1990, the M.Sc. degree in electrical engineering from Princeton University, Princeton, NJ, in 1992, and the Ph.D. degree from Politecnico di Torino in 1994.

He has been with the European Space Agency (ESTEC, Noordwijk, The Netherlands). He was an Assistant Professor of Telecommunications, Politecnico di Torino, from 1994 to 1998. He is currently an Associate Professor with the Department of Mobile Communications, Institute Eurecom, Sophia-Antipolis, France. He is coauthor of more than 30 papers in international journals and more than 60 in international conferences, and he is author of three international patents with the European Space Agency. His research interests include digital communications theory, information theory, coding theory, and multiuser detection, with particular focus on wireless terrestrial and satellite applications.

Dr. Caire was a recipient of the AEI G. Someda Scholarship in 1991, the COTRAO Scholarship in 1996, and a CNR Scholarship in 1997. He was Associate Editor for IEEE TRANSACTIONS ON COMMUNICATIONS from 1998 to 2001 and an Associate Editor for Communications Theory of the IEEE TRANSACTIONS ON INFORMATION THEORY.