Transcript of Seminar Bermudez

Adaptive Filtering - Theory and Applications

Jose C. M. Bermudez

Department of Electrical Engineering, Federal University of Santa Catarina
Florianópolis, SC, Brazil

IRIT - INP-ENSEEIHT, Toulouse

May 2011

    1 Introduction

    2 Adaptive Filtering Applications

    3 Adaptive Filtering Principles

    4 Iterative Solutions for the Optimum Filtering Problem

    5 Stochastic Gradient Algorithms

    6 Deterministic Algorithms

    7 Analysis


    Introduction


    Estimation Techniques

Several techniques are available to solve estimation problems.

Classical Estimation
Maximum Likelihood (ML), Least Squares (LS), Moments, etc.

Bayesian Estimation
Minimum MSE (MMSE), Maximum A Posteriori (MAP), etc.

Linear Estimation
Frequently used in practice when computational complexity is limited or real-time operation is required

    Linear Estimators

Simpler to determine: depend only on the first two moments of the data

Statistical Approach: Optimal Linear Filters
Minimum mean-square error
Require second-order statistics of the signals

Deterministic Approach: Least Squares Estimators
Minimum least-squares error
Require handling of a data observation matrix

    Limitations of Optimal Filters and LS Estimators

Statistics of signals may not be available or cannot be accurately estimated

There may be no time available for statistical estimation (real-time)

Signals and systems may be nonstationary

Memory required may be prohibitive

Computational load may be prohibitive

    Iterative Solutions

Search for the optimal solution starting from an initial guess

Iterative algorithms are based on classical optimization algorithms

Require reduced computational effort per iteration

Need several iterations to converge to the optimal solution

These methods form the basis for the development of adaptive algorithms

Still require knowledge of the signal statistics

    Adaptive Filters

    Usually approximate iterative algorithms and:

    Do not require previous knowledge of the signal statistics

    Have a small computational complexity per iteration

    Converge to a neighborhood of the optimal solution

    Adaptive filters are good for:

    Real-time applications, when there is no time for statistical estimation

    Applications with nonstationary signals and/or systems


    Properties of Adaptive Filters

They can operate satisfactorily in unknown and possibly time-varying environments without user intervention

They improve their performance during operation by learning statistical characteristics from current signal observations

They can track variations in the signal operating environment (SOE)

    Adaptive Filtering Applications


    Basic Classes of Adaptive Filtering Applications

    System Identification

    Inverse System Modeling

    Signal Prediction

    Interference Cancelation


    System Identification

[Block diagram: the input x(n) drives both the unknown system and the adaptive filter. The unknown-system output plus other signals (eo(n)) forms d(n); the adaptive filter output y(n) is subtracted from d(n) to produce the error e(n), which drives the adaptive algorithm.]

Applications - System Identification

Channel Estimation
Communications systems
Objective: model the channel to design distortion compensation
x(n): training sequence

Plant Identification
Control systems
Objective: model the plant to design a compensator
x(n): training sequence

    Echo Cancellation

Telephone systems and VoIP

Echo caused by network impedance mismatches or by the acoustic environment

Objective: model the echo path impulse response

x(n): transmitted signal

d(n): echo + noise

[Figure: Network Echo Cancellation - hybrids (H) at each end connect the Tx and Rx paths; an echo canceller (EC) driven by x(n) has its output subtracted from d(n) to form e(n).]

    Inverse System Modeling

Adaptive filter attempts to estimate the unknown system's inverse

Adaptive filter input usually corrupted by noise

Desired response d(n) may not be available

[Block diagram: s(n) passes through the unknown system, producing x(n) (plus noise z(n) and other signals), which feeds the adaptive filter; a delayed version of s(n) serves as d(n), and e(n) = d(n) - y(n) drives the adaptive algorithm.]

Applications - Inverse System Modeling: Channel Equalization

[Block diagram: s(n) passes through the channel, producing x(n) (plus noise z(n)); the adaptive filter output y(n) is compared with d(n), taken from a locally generated training sequence, to form e(n).]

Objective: reduce intersymbol interference

Initially: training sequence in d(n)

After training: d(n) generated from previous decisions

    Signal Prediction

[Block diagram: x(n) is delayed by no samples before entering the adaptive filter; the filter output y(n) is compared with d(n) = x(n) to form e(n), which drives the adaptive algorithm.]

Most widely used case: forward prediction

The signal x(n) is predicted from the samples {x(n - no), x(n - no - 1), ..., x(n - no - L)}

Application - Signal Prediction

DPCM Speech Quantizer - Linear Predictive Coding

Objective: reduce speech transmission bandwidth

Signal transmitted all the time: the quantization error

Predictor coefficients are transmitted at a low rate

[Block diagram: the speech signal d(n) minus the prediction y(n) gives the prediction error e(n), which is quantized to Q[e(n)] (the DPCM signal); the predictor is driven by the reconstructed signal y(n) + Q[e(n)].]

    Interference Cancelation

One or more sensor signals are used to remove interference and noise

Reference signals correlated with the interference should also be available

Applications: array processing for radar and communications; biomedical sensing systems; active noise control systems

Application - Interference Cancelation: Active Noise Control

Ref: D. G. Manolakis, V. K. Ingle and S. M. Kogon, Statistical and Adaptive Signal Processing, 2000.

Cancelation of acoustic noise using destructive interference

A secondary system between the adaptive filter and the cancelation point is unavoidable

Cancelation is performed in the acoustic environment

    Active Noise Control Block Diagram

[Block diagram: the adaptive filter w(n) output y(n) passes through a secondary path S (with a nonlinearity g(ys)) before the cancelation point, where it combines with d(n) and z(n) to form e(n); the adaptive algorithm is driven by e(n) and by a filtered version xf(n) of the input; wo denotes the optimum filter.]

    Adaptive Filtering Principles


    Adaptive Filter Features

Adaptive filters are composed of three basic modules:

Filtering structure
Determines the output of the filter given its input samples
Its weights are periodically adjusted by the adaptive algorithm
Can be linear or nonlinear, depending on the application
Linear filters can be FIR or IIR

Performance criterion
Defined according to the application and to mathematical tractability
Is used to derive the adaptive algorithm
Its value at each iteration affects the adaptive weight updates

Adaptive algorithm
Uses the performance criterion value and the current signals
Modifies the adaptive weights to improve performance
Its form and complexity are functions of the structure and of the performance criterion

    Signal Operating Environment (SOE)

Comprises all information regarding the properties of the signals and systems:
Input signals
Desired signal
Unknown systems

If the SOE is nonstationary:
Acquisition or convergence mode: from start-up until close to the best performance
Tracking mode: readjustment following the SOE's time variations

Adaptation can be:
Supervised: the desired signal is available, so e(n) can be evaluated
Unsupervised: the desired signal is unavailable

    Performance Evaluation

Convergence rate

Misadjustment

Tracking

Robustness (disturbances and numerical)

Computational requirements (operations and memory)

Structure: facility of implementation, performance surface, stability

    Optimum versus Adaptive Filters in Linear Estimation

Conditions for this study:
Stationary SOE
Filter structure is transversal FIR
All signals are real valued
Performance criterion: mean-square error E[e^2(n)]

The Linear Estimation Problem

[Block diagram: x(n) enters a linear FIR filter w; the output y(n) is subtracted from d(n) to form e(n).]

Jms = E[e^2(n)]

    The Linear Estimation Problem

[Block diagram: x(n) enters a linear FIR filter w; the output y(n) is subtracted from d(n) to form e(n).]

x(n) = [x(n), x(n-1), ..., x(n-N+1)]^T

y(n) = x^T(n) w

e(n) = d(n) - y(n) = d(n) - x^T(n) w

Jms = E[e^2(n)] = σ_d^2 - 2 p^T w + w^T Rxx w

where p = E[x(n) d(n)];  Rxx = E[x(n) x^T(n)]

Normal Equations

Rxx wo = p  =>  wo = Rxx^(-1) p  for Rxx > 0

Jms,min = σ_d^2 - p^T Rxx^(-1) p
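To make the normal equations concrete, here is a minimal NumPy sketch (added for this transcript, not from the original slides): it estimates Rxx and p by sample averages over an assumed AR(1) input and solves Rxx wo = p. The filter length, input model and noise level are illustrative assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
N, M = 8, 50_000
wo_true = rng.standard_normal(N)          # "unknown" optimal weights (assumed)

# AR(1) input signal (illustrative)
x = np.zeros(M)
for n in range(1, M):
    x[n] = 0.7 * x[n - 1] + rng.standard_normal()

# Rows are the regressors x(n) = [x(n), x(n-1), ..., x(n-N+1)]^T
X = sliding_window_view(x, N)[:, ::-1]
d = X @ wo_true + 0.01 * rng.standard_normal(len(X))   # desired signal

# Sample estimates of Rxx = E[x(n) x^T(n)] and p = E[x(n) d(n)]
Rxx = X.T @ X / len(X)
p = X.T @ d / len(X)

# Normal equations: Rxx wo = p (solve rather than invert)
wo = np.linalg.solve(Rxx, p)
print(np.max(np.abs(wo - wo_true)))        # close to zero
```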

What if d(n) is nonstationary?

[Same block diagram as above, now with a time-varying weight vector w(n).]

x(n) = [x(n), x(n-1), ..., x(n-N+1)]^T

y(n) = x^T(n) w(n)

e(n) = d(n) - y(n) = d(n) - x^T(n) w(n)

Jms(n) = E[e^2(n)] = σ_d^2(n) - 2 p^T(n) w(n) + w^T(n) Rxx w(n)

where p(n) = E[x(n) d(n)];  Rxx = E[x(n) x^T(n)]

Normal Equations

Rxx wo(n) = p(n)  =>  wo(n) = Rxx^(-1) p(n)  for Rxx > 0

Jms,min(n) = σ_d^2(n) - p^T(n) Rxx^(-1) p(n)

    Optimum Filters versus Adaptive Filters

Optimum Filters
Compute p(n) = E[x(n) d(n)]
Solve Rxx wo(n) = p(n)
Filter with wo(n): y(n) = x^T(n) wo(n)
Nonstationary SOE: the optimum filter must be determined for each value of n

Adaptive Filters
Filtering: y(n) = x^T(n) w(n)
Evaluate the error: e(n) = d(n) - y(n)
Adaptive algorithm: w(n+1) = w(n) + Δw[x(n), e(n)]
Δw(n) is chosen so that w(n) is close to wo(n) for n large

    Characteristics of Adaptive Filters

Search for the optimum solution on the performance surface

Follow principles of optimization techniques

Implement a recursive optimization solution

Convergence speed may depend on initialization

Have stability regions

Steady-state solution fluctuates about the optimum

Can track time-varying SOEs better than optimum filters

Performance depends on the performance surface

Iterative Solutions for the Optimum Filtering Problem

    Performance (Cost) Functions

Mean-square error: E[e^2(n)] (most popular)
Adaptive algorithms: Least-Mean Square (LMS), Normalized LMS (NLMS), Affine Projection (AP), Recursive Least Squares (RLS), etc.

Regularized MSE: Jrms = E[e^2(n)] + γ ||w(n)||^2
Adaptive algorithm: leaky least-mean square (leaky LMS)

ℓ1-norm criterion: Jℓ1 = E[|e(n)|]
Adaptive algorithm: Sign-Error

Performance (Cost) Functions - continued

Least-mean fourth (LMF) criterion: JLMF = E[e^4(n)]
Adaptive algorithm: Least-Mean Fourth (LMF)

Least-mean-mixed-norm (LMMN) criterion: JLMMN = E[δ e^2(n) + (1/2)(1 - δ) e^4(n)]
Adaptive algorithm: Least-Mean-Mixed-Norm (LMMN)

Constant-modulus criterion: JCM = E[(|x^T(n) w(n)|^2 - γ)^2]
Adaptive algorithm: Constant-Modulus (CM)

MSE Performance Surface - Small Input Correlation

[3-D plot of the MSE surface Jms over the weight plane (w1, w2): a nearly circular paraboloid.]

MSE Performance Surface - Large Input Correlation

[3-D plot of the MSE surface Jms over (w1, w2): an elongated, strongly eccentric paraboloid.]

The Steepest Descent Algorithm - Stationary SOE

Cost Function

Jms(n) = E[e^2(n)] = σ_d^2 - 2 p^T w(n) + w^T(n) Rxx w(n)

Weight Update Equation

w(n+1) = w(n) + μ c(n)

μ: step size
c(n): correction term (determines the direction of Δw(n))

Steepest descent adjustment: c(n) = -∇Jms(n), so that Jms(n+1) ≤ Jms(n)

w(n+1) = w(n) + μ [p - Rxx w(n)]
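A minimal sketch of this recursion (an addition to the transcript); the 2-tap Rxx, wo and step size are illustrative assumptions:

```python
import numpy as np

Rxx = np.array([[1.0, 0.6], [0.6, 1.0]])   # assumed input correlation matrix
wo = np.array([2.0, -1.0])                 # assumed optimum weights
p = Rxx @ wo                               # cross-correlation vector, p = Rxx wo

mu = 0.3                                   # satisfies 0 < mu < 2/lambda_max (= 1.25 here)
w = np.zeros(2)
for n in range(100):
    w = w + mu * (p - Rxx @ w)             # w(n+1) = w(n) + mu [p - Rxx w(n)]
print(w)                                   # converges to wo = [2, -1]
```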

    Weight Error Update Equation

w(n+1) = w(n) + μ [p - Rxx w(n)]

Using p = Rxx wo:

w(n+1) = (I - μ Rxx) w(n) + μ Rxx wo

Weight error vector: v(n) = w(n) - wo

v(n+1) = (I - μ Rxx) v(n)

The matrix I - μ Rxx must be stable for convergence (|1 - μ λ_i| < 1)

Assuming convergence, lim_{n→∞} v(n) = 0

Convergence Conditions

v(n+1) = (I - μ Rxx) v(n);  Rxx positive definite

Eigen-decomposition of Rxx: Rxx = Q Λ Q^T

v(n+1) = (I - μ Q Λ Q^T) v(n)

Q^T v(n+1) = Q^T v(n) - μ Λ Q^T v(n)

Defining v'(n+1) = Q^T v(n+1):

v'(n+1) = (I - μ Λ) v'(n)

Convergence Properties

v'(n+1) = (I - μ Λ) v'(n)

v'_k(n+1) = (1 - μ λ_k) v'_k(n),  k = 1, ..., N

v'_k(n) = (1 - μ λ_k)^n v'_k(0)

Convergence modes:
monotonic if 0 < 1 - μ λ_k < 1
oscillatory if -1 < 1 - μ λ_k < 0

Convergence requires |1 - μ λ_k| < 1 for all k; 0 < μ < 1/λ_max gives monotonic convergence

The stability limit is again 0 < μ < 2/λ_max

Jms(n) converges faster than w(n)

The algorithm converges faster as χ = λ_max/λ_min → 1
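The mode-by-mode decay can be checked numerically; the eigenvalues and step sizes in this sketch are arbitrary illustrative values, not from the slides:

```python
import numpy as np

# Decay of the rotated weight-error modes: v'_k(n) = (1 - mu*lambda_k)^n v'_k(0)
lam = np.array([0.2, 1.0, 1.8])        # assumed eigenvalues of Rxx (chi = 9)
for mu in (0.4, 1.0):                  # both inside 0 < mu < 2/lambda_max = 1.11
    modes = 1 - mu * lam
    # monotonic where 0 < 1-mu*lam < 1, oscillatory where -1 < 1-mu*lam < 0
    print(mu, modes, modes**50)
```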

Simulation Results

x(n) = a x(n-1) + v(n)

[Plots: steepest-descent mean-square error (dB) versus iteration. Left: white-noise input (χ = 1). Right: AR(1) input with a = 0.7 (χ = 5.7).]

Linear system identification - FIR with 20 coefficients

Step size μ = 0.3

Noise power σ_v^2 = 10^-6

The Newton Algorithm

Steepest descent: linear approximation of Jms about the operating point

Newton's method: quadratic approximation of Jms

Expanding Jms(w) in a Taylor series about w(n),

Jms(w) ≈ Jms[w(n)] + ∇^T Jms [w - w(n)] + (1/2) [w - w(n)]^T H(n) [w - w(n)]

Differentiating w.r.t. w and equating to zero at w = w(n+1),

∇Jms[w(n+1)] = ∇Jms[w(n)] + H[w(n)] [w(n+1) - w(n)] = 0

w(n+1) = w(n) - H^(-1)[w(n)] ∇Jms[w(n)]

The Newton Algorithm - continued

∇Jms[w(n)] = -2p + 2 Rxx w(n)

H(w(n)) = 2 Rxx

Thus, adding a step-size control,

w(n+1) = w(n) + μ Rxx^(-1) [p - Rxx w(n)]

On a quadratic surface: convergence in one iteration for μ = 1

Requires the determination of Rxx^(-1)

Can be used to derive simpler adaptive algorithms

When H(n) is close to singular => regularization: H(n) = 2 Rxx + 2 ε I
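A one-step check of the Newton recursion (added here) on the same illustrative quadratic surface used in the steepest-descent sketch; with μ = 1 it lands on wo in a single iteration:

```python
import numpy as np

Rxx = np.array([[1.0, 0.6], [0.6, 1.0]])   # assumed values, as before
wo = np.array([2.0, -1.0])
p = Rxx @ wo

mu, eps = 1.0, 0.0                         # eps > 0 would regularize a near-singular Hessian
w = np.zeros(2)
w = w + mu * np.linalg.solve(eps * np.eye(2) + Rxx, p - Rxx @ w)
print(w)                                   # [2, -1] after a single step
```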

    Basic Adaptive Algorithms


    Least Mean Squares (LMS) Algorithm


Can be interpreted in different ways

Each interpretation helps in understanding the algorithm's behavior

Some of these interpretations are related to the steepest descent algorithm

    LMS as a Stochastic Gradient Algorithm


Suppose we use the instantaneous estimate e^2(n) in place of Jms(n) = E[e^2(n)]

The estimated gradient vector becomes

∇Jms(n) ≈ ∂e^2(n)/∂w(n) = 2 e(n) ∂e(n)/∂w(n)

Since e(n) = d(n) - x^T(n) w(n),

∇Jms(n) ≈ -2 e(n) x(n)  (stochastic gradient)

and, using the steepest descent weight update equation,

w(n+1) = w(n) + μ e(n) x(n)  (LMS weight update)
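A minimal LMS system-identification sketch (added for this transcript; the lengths, step size and noise level are illustrative assumptions, not the slides' settings):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 16
wo = rng.standard_normal(N)                  # assumed "unknown" system
w = np.zeros(N)
mu = 0.01

xbuf = np.zeros(N)                           # regressor x(n) = [x(n), ..., x(n-N+1)]^T
for n in range(20_000):
    xbuf = np.roll(xbuf, 1)
    xbuf[0] = rng.standard_normal()          # new input sample
    d = xbuf @ wo + 1e-3 * rng.standard_normal()
    e = d - xbuf @ w                         # a priori error e(n)
    w = w + mu * e * xbuf                    # w(n+1) = w(n) + mu e(n) x(n)

print(np.linalg.norm(w - wo))                # small residual misalignment
```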

LMS as a Stochastic Estimation Algorithm

∇Jms(n) = -2p + 2 Rxx w(n)

Stochastic estimators: p̂ = d(n) x(n);  R̂xx = x(n) x^T(n)

Then, ∇Jms(n) ≈ -2 d(n) x(n) + 2 x(n) x^T(n) w(n)

Using this estimate in the steepest descent weight update,

w(n+1) = w(n) + μ e(n) x(n)

LMS - A Solution to a Local Optimization Problem

Error expressions

e(n) = d(n) - x^T(n) w(n)  (a priori error)
ε(n) = d(n) - x^T(n) w(n+1)  (a posteriori error)

We want to maximize |ε(n) - e(n)| with |ε(n)| < |e(n)|

ε(n) - e(n) = -x^T(n) Δw(n)

Δw(n) = w(n+1) - w(n)

Expressing Δw(n) as Δw(n) = μ c(n) e(n),

ε(n) - e(n) = -μ x^T(n) c(n) e(n)

For maximum |ε(n) - e(n)|  =>  c(n) in the direction of x(n)

Choosing c(n) = x(n):

Δw(n) = μ x(n) e(n)

and

w(n+1) = w(n) + μ e(n) x(n)

As ε(n) - e(n) = -x^T(n) Δw(n) = -μ x^T(n) x(n) e(n),

|ε(n)| < |e(n)| requires |1 - μ x^T(n) x(n)| < 1, or

0 < μ < 2 / ||x(n)||^2  (stability region)

Observations - LMS Algorithm

LMS is a noisy approximation of the steepest descent algorithm

The gradient estimate is unbiased

The errors in the gradient estimate lead to Jms,ex(∞) ≠ 0

The vector w(n) is now random
Steepest descent properties are no longer guaranteed => LMS analysis is required

The instantaneous estimates allow tracking without redesign

Some Research Results

J. C. M. Bermudez and N. J. Bershad, "A nonlinear analytical model for the quantized LMS algorithm - the arbitrary step size case," IEEE Transactions on Signal Processing, vol. 44, no. 5, pp. 1175-1183, May 1996.

J. C. M. Bermudez and N. J. Bershad, "Transient and tracking performance analysis of the quantized LMS algorithm for time-varying system identification," IEEE Transactions on Signal Processing, vol. 44, no. 8, pp. 1990-1997, August 1996.

N. J. Bershad and J. C. M. Bermudez, "A nonlinear analytical model for the quantized LMS algorithm - the power-of-two step size case," IEEE Transactions on Signal Processing, vol. 44, no. 11, pp. 2895-2900, November 1996.

N. J. Bershad and J. C. M. Bermudez, "Sinusoidal interference rejection analysis of an LMS adaptive feedforward controller with a noisy periodic reference," IEEE Transactions on Signal Processing, vol. 46, no. 5, pp. 1298-1313, May 1998.

J. C. M. Bermudez and N. J. Bershad, "Non-Wiener behavior of the Filtered-X LMS algorithm," IEEE Trans. on Circuits and Systems II - Analog and Digital Signal Processing, vol. 46, no. 8, pp. 1110-1114, August 1999.

Some Research Results - continued

O. J. Tobias, J. C. M. Bermudez and N. J. Bershad, "Mean weight behavior of the Filtered-X LMS algorithm," IEEE Transactions on Signal Processing, vol. 48, no. 4, pp. 1061-1075, April 2000.

M. H. Costa, J. C. M. Bermudez and N. J. Bershad, "Stochastic analysis of the LMS algorithm with a saturation nonlinearity following the adaptive filter output," IEEE Transactions on Signal Processing, vol. 49, no. 7, pp. 1370-1387, July 2001.

M. H. Costa, J. C. M. Bermudez and N. J. Bershad, "Stochastic analysis of the Filtered-X LMS algorithm in systems with nonlinear secondary paths," IEEE Transactions on Signal Processing, vol. 50, no. 6, pp. 1327-1342, June 2002.

M. H. Costa, J. C. M. Bermudez and N. J. Bershad, "The performance surface in filtered nonlinear mean-square estimation," IEEE Transactions on Circuits and Systems I, vol. 50, no. 3, pp. 445-447, March 2003.

G. Barrault, J. C. M. Bermudez and A. Lenzi, "New analytical model for the Filtered-x Least Mean Squares algorithm verified through active noise control experiment," Mechanical Systems and Signal Processing, vol. 21, pp. 1839-1852, 2007.

Some Research Results - continued

N. J. Bershad, J. C. M. Bermudez and J.-Y. Tourneret, "Stochastic analysis of the LMS algorithm for system identification with subspace inputs," IEEE Trans. on Signal Process., vol. 56, pp. 1018-1027, 2008.

J. C. M. Bermudez, N. J. Bershad and J.-Y. Tourneret, "An affine combination of two LMS adaptive filters - transient mean-square analysis," IEEE Trans. on Signal Process., vol. 56, pp. 1853-1864, 2008.

M. H. Costa, L. R. Ximenes and J. C. M. Bermudez, "Statistical analysis of the LMS adaptive algorithm subjected to a symmetric dead-zone nonlinearity at the adaptive filter output," Signal Processing, vol. 88, pp. 1485-1495, 2008.

M. H. Costa and J. C. M. Bermudez, "A noise resilient variable step-size LMS algorithm," Signal Processing, vol. 88, no. 3, pp. 733-748, March 2008.

P. Honeine, C. Richard, J. C. M. Bermudez, J. Chen and H. Snoussi, "A decentralized approach for nonlinear prediction of time series data in sensor networks," EURASIP Journal on Wireless Communications and Networking, vol. 2010, pp. 1-13, 2010. (KNLMS)

The Normalized LMS Algorithm - NLMS

The most employed adaptive algorithm in real-time applications

Like LMS, it has different interpretations (even more)

Alleviates a drawback of the LMS algorithm

w(n+1) = w(n) + μ e(n) x(n)

If the amplitude of x(n) is large => gradient noise amplification
Sub-optimal performance when σ_x^2 varies with time (for instance, speech)

NLMS - A Solution to a Local Optimization Problem

Error expressions

e(n) = d(n) - x^T(n) w(n)  (a priori error)
ε(n) = d(n) - x^T(n) w(n+1)  (a posteriori error)

We want to maximize |ε(n) - e(n)| with |ε(n)| < |e(n)|

ε(n) - e(n) = -x^T(n) Δw(n)  (A)

Δw(n) = w(n+1) - w(n)

For |ε(n)| < |e(n)| we impose the restriction

ε(n) = (1 - μ) e(n),  |1 - μ| < 1

ε(n) - e(n) = -μ e(n)  (B)

For maximum |ε(n) - e(n)|  =>  Δw(n) in the direction of x(n):

Δw(n) = k x(n)  (C)

Using (A), (B) and (C),

k = μ e(n) / (x^T(n) x(n))

and

w(n+1) = w(n) + μ e(n) x(n) / (x^T(n) x(n))  (NLMS weight update)

NLMS - Solution to a Constrained Optimization Problem

Error sequence

y(n) = x^T(n) w(n)  (estimate of d(n))
e(n) = d(n) - y(n) = d(n) - x^T(n) w(n)  (estimation error)

Optimization (principle of minimal disturbance)

Minimize ||Δw(n)||^2 = ||w(n+1) - w(n)||^2
subject to: x^T(n) w(n+1) = d(n)

This problem can be solved using the method of Lagrange multipliers

Using Lagrange multipliers, we minimize

f[w(n+1)] = ||w(n+1) - w(n)||^2 + λ [d(n) - x^T(n) w(n+1)]

Differentiating w.r.t. w(n+1) and equating the result to zero,

w(n+1) = w(n) + (1/2) λ x(n)  (*)

Using this result in x^T(n) w(n+1) = d(n) yields

λ = 2 e(n) / (x^T(n) x(n))

Using this result in (*) yields

w(n+1) = w(n) + e(n) x(n) / (x^T(n) x(n))

NLMS as an Orthogonalization Process

Conditions for the analysis:
e(n) = d(n) - x^T(n) w(n)
wo is the optimal solution (Wiener solution)
d(n) = x^T(n) wo  (no noise)
v(n) = w(n) - wo  (weight error vector)

Error signal

e(n) = d(n) - x^T(n) [v(n) + wo]
     = x^T(n) wo - x^T(n) [v(n) + wo]
     = -x^T(n) v(n)

e(n) = -x^T(n) v(n)

Interpretation: to minimize the error, v(n) should be orthogonal to all input vectors
Restriction: we have only one vector, x(n)

Iterative solution:

We can subtract from v(n) its component in the direction of x(n) at each iteration

If there are N adaptive coefficients, v(n) could be reduced to zero after N orthogonal input vectors

Iterative projection extraction => Gram-Schmidt orthogonalization

Recursive orthogonalization

n = 0:  v(1) = v(0) - proj. of v(0) onto x(0)
n = 1:  v(2) = v(1) - proj. of v(1) onto x(1)
...
n + 1:  v(n+1) = v(n) - proj. of v(n) onto x(n)

Projection of v(n) onto x(n):

P_x(n)[v(n)] = x(n) [x^T(n) x(n)]^(-1) x^T(n) v(n)

Weight update equation (noting that [x^T(n) x(n)]^(-1) is a scalar and x^T(n) v(n) = -e(n)):

v(n+1) = v(n) - x(n) [x^T(n) x(n)]^(-1) x^T(n) v(n)
       = v(n) + e(n) x(n) / (x^T(n) x(n))

NLMS has its own problems!

NLMS solves the LMS gradient noise amplification problem, but ...

What happens if ||x(n)||^2 gets too small?

One needs to add some regularization:

w(n+1) = w(n) + μ e(n) x(n) / (x^T(n) x(n) + ε)  (ε-NLMS)
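A sketch of ε-NLMS (added here) under an input whose power switches over time, the situation where normalization pays off; all numeric settings are illustrative assumptions. Compared with the LMS sketch earlier, only the update line changes.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 16
wo = rng.standard_normal(N)                  # assumed "unknown" system
w = np.zeros(N)
mu, eps = 0.5, 1e-6                          # 0 < mu < 2; eps guards against ||x(n)||^2 ~ 0

xbuf = np.zeros(N)
for n in range(10_000):
    xbuf = np.roll(xbuf, 1)
    # input power switches by a factor of 36, mimicking a time-varying sigma_x^2
    xbuf[0] = rng.standard_normal() * (6.0 if (n // 1000) % 2 else 1.0)
    d = xbuf @ wo + 1e-3 * rng.standard_normal()
    e = d - xbuf @ w
    w = w + mu * e * xbuf / (xbuf @ xbuf + eps)   # epsilon-NLMS weight update

print(np.linalg.norm(w - wo))
```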

ε-NLMS - Stochastic Approximation of Regularized Newton

Regularized Newton Algorithm

w(n+1) = w(n) + μ [ε I + Rxx]^(-1) [p - Rxx w(n)]

Instantaneous estimates

p̂ = x(n) d(n);  R̂xx = x(n) x^T(n)

Using these estimates, with e(n) = d(n) - x^T(n) w(n),

w(n+1) = w(n) + μ [ε I + x(n) x^T(n)]^(-1) x(n) e(n)

Inversion of ε I + x(n) x^T(n)?

Matrix Inversion Formula

[A + BCD]^(-1) = A^(-1) - A^(-1) B [C^(-1) + D A^(-1) B]^(-1) D A^(-1)

Thus,

[ε I + x(n) x^T(n)]^(-1) = (1/ε) I - (1/ε^2) x(n) x^T(n) / (1 + (1/ε) x^T(n) x(n))

Post-multiplying both sides by x(n) and rearranging,

[ε I + x(n) x^T(n)]^(-1) x(n) = x(n) / (ε + x^T(n) x(n))

and

w(n+1) = w(n) + μ e(n) x(n) / (x^T(n) x(n) + ε)
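The rank-one inversion identity above is easy to verify numerically; the test values in this added sketch are arbitrary:

```python
import numpy as np

# Check: [eps*I + x x^T]^(-1) x == x / (eps + x^T x)
rng = np.random.default_rng(2)
x, eps = rng.standard_normal(5), 0.1
lhs = np.linalg.solve(eps * np.eye(5) + np.outer(x, x), x)
rhs = x / (eps + x @ x)
print(np.allclose(lhs, rhs))             # True
```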

Some Research Results

M. H. Costa and J. C. M. Bermudez, "An improved model for the normalized LMS algorithm with Gaussian inputs and large number of coefficients," in Proc. 2002 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP-2002), Orlando, Florida, pp. II-1385-1388, May 13-17, 2002.

J. C. M. Bermudez and M. H. Costa, "A statistical analysis of the ε-NLMS and NLMS algorithms for correlated Gaussian signals," Journal of the Brazilian Telecommunications Society, vol. 20, no. 2, pp. 7-13, 2005.

G. Barrault, M. H. Costa, J. C. M. Bermudez and A. Lenzi, "A new analytical model for the NLMS algorithm," in Proc. 2005 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2005), Pennsylvania, USA, vol. IV, pp. 41-44, 2005.

J. C. M. Bermudez, N. J. Bershad and J.-Y. Tourneret, "An affine combination of two NLMS adaptive filters - transient mean-square analysis," in Proc. Forty-Second Asilomar Conference on Signals, Systems & Computers, Pacific Grove, CA, USA, 2008.

P. Honeine, C. Richard, J. C. M. Bermudez and H. Snoussi, "Distributed prediction of time series data with kernels and adaptive filtering techniques in sensor networks," in Proc. Forty-Second Asilomar Conference on Signals, Systems & Computers, Pacific Grove, CA, USA, 2008.

The Affine Projection Algorithm

Consider the NLMS criterion, but now using {x(n), x(n-1), ..., x(n-P)}:

Minimize ||Δw(n)||^2 = ||w(n+1) - w(n)||^2

subject to:

x^T(n) w(n+1) = d(n)
x^T(n-1) w(n+1) = d(n-1)
...
x^T(n-P) w(n+1) = d(n-P)

Observation matrix

X(n) = [x(n), x(n-1), ..., x(n-P)]

Desired signal vector

d(n) = [d(n), d(n-1), ..., d(n-P)]^T

Error vector

e(n) = [e(n), e(n-1), ..., e(n-P)]^T = d(n) - X^T(n) w(n)

Vector of the constraint errors

ec(n) = d(n) - X^T(n) w(n+1)

Using Lagrange multipliers, we minimize

f[w(n+1)] = ||w(n+1) - w(n)||^2 + λ^T [d(n) - X^T(n) w(n+1)]

Differentiating w.r.t. w(n+1) and equating the result to zero,

w(n+1) = w(n) + (1/2) X(n) λ  (*)

Using this result in X^T(n) w(n+1) = d(n) yields

λ = 2 [X^T(n) X(n)]^(-1) e(n)

Using this result in (*) yields

w(n+1) = w(n) + X(n) [X^T(n) X(n)]^(-1) e(n)

Affine Projection - Solution to the Underdetermined Least-Squares Problem

We want to minimize

||ec(n)||^2 = ||d(n) - X^T(n) w(n+1)||^2

where X^T(n) is (P+1) x N with (P+1) < N

Thus, we look for the least-squares solution of the underdetermined system X^T(n) w(n+1) = d(n)

The solution is

w(n+1) = [X^T(n)]^+ d(n) = X(n) [X^T(n) X(n)]^(-1) d(n)

Using d(n) = e(n) + X^T(n) w(n) yields

w(n+1) = w(n) + X(n) [X^T(n) X(n)]^(-1) e(n)

Observations:

Order of the AP algorithm: P + 1 (P = 0 => NLMS)

Convergence speed increases with P (but not linearly)

Computational complexity increases with P (not linearly)

If X^T(n) X(n) is close to singular, use X^T(n) X(n) + ε I

The scalar error of NLMS becomes a vector error in AP (except for μ = 1)
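A minimal affine-projection sketch with regularization (added for this transcript), using an assumed AR(1) input, the kind of correlated signal for which AP converges faster than NLMS; all numeric settings are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
N, P, mu, eps = 16, 2, 0.5, 1e-6
wo = rng.standard_normal(N)                       # assumed "unknown" system

# Correlated AR(1) input (illustrative)
M = 6000
x = np.zeros(M)
for n in range(1, M):
    x[n] = 0.9 * x[n - 1] + rng.standard_normal()
d = np.convolve(x, wo)[:M] + 1e-3 * rng.standard_normal(M)

w = np.zeros(N)
for n in range(N + P, M):
    # X(n) = [x(n), x(n-1), ..., x(n-P)], one regressor per column
    X = np.column_stack([x[n - k - N + 1: n - k + 1][::-1] for k in range(P + 1)])
    dn = d[n - np.arange(P + 1)]                  # [d(n), d(n-1), ..., d(n-P)]
    e = dn - X.T @ w                              # error vector e(n)
    # w(n+1) = w(n) + mu X(n) [X^T(n) X(n) + eps I]^(-1) e(n)
    w = w + mu * X @ np.linalg.solve(X.T @ X + eps * np.eye(P + 1), e)

print(np.linalg.norm(w - wo))
```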

Affine Projection - Stochastic Approximation of Regularized Newton

Regularized Newton Algorithm

w(n+1) = w(n) + μ [ε I + Rxx]^(-1) [p - Rxx w(n)]

Estimates using time-window statistics:

p̂ = (1/(P+1)) Σ_{k=n-P}^{n} x(k) d(k) = (1/(P+1)) X(n) d(n)

R̂xx = (1/(P+1)) Σ_{k=n-P}^{n} x(k) x^T(k) = (1/(P+1)) X(n) X^T(n)

Using these estimates, with the regularization parameter scaled to ε/(P+1) and e(n) = d(n) - X^T(n) w(n),

w(n+1) = w(n) + μ [ε I + X(n) X^T(n)]^(-1) X(n) e(n)

w(n+1) = w(n) + μ [ε I + X(n) X^T(n)]^(-1) X(n) e(n)

Using the Matrix Inversion Formula,

[ε I + X(n) X^T(n)]^(-1) X(n) = X(n) [ε I + X^T(n) X(n)]^(-1)

and

w(n+1) = w(n) + μ X(n) [ε I + X^T(n) X(n)]^(-1) e(n)

Affine Projection Algorithm as a Projection onto an Affine Subspace

Conditions for the analysis:
e(n) = d(n) - X^T(n) w(n)
d(n) = X^T(n) wo defines the optimal solution in the least-squares sense
v(n) = w(n) - wo  (weight error vector)

Error vector

e(n) = d(n) - X^T(n) [v(n) + wo]
     = X^T(n) wo - X^T(n) [v(n) + wo]
     = -X^T(n) v(n)

e(n) = -X^T(n) v(n)

Interpretation: to minimize the error, v(n) should be orthogonal to all input vectors
Restriction: we are going to use only {x(n), x(n-1), ..., x(n-P)}

Iterative solution:

We can subtract from v(n) its projection onto the range of X(n) at each iteration:

v(n+1) = v(n) - P_X(n) v(n)  (projection onto an affine subspace)

Using X^T(n) v(n) = -e(n),

v(n+1) = v(n) - X(n) [X^T(n) X(n)]^(-1) X^T(n) v(n)
       = v(n) + X(n) [X^T(n) X(n)]^(-1) e(n)  (AP algorithm)

Pseudo Affine Projection Algorithm

Major problem with the AP algorithm: for μ ≠ 1, e(n) is a full error vector => large computational complexity

The Pseudo-AP algorithm replaces the input x(n) with its P-th order autoregressive prediction

x̂(n) = Σ_{k=1}^{P} a_k x(n-k)

Using the least-squares solution for a = [a1, a2, ..., aP]^T,

a = [Xp^T(n) Xp(n)]^(-1) Xp^T(n) x(n),  Xp(n) = [x(n-1), ..., x(n-P)]

Now, we subtract from x(n) its projections onto the last P input vectors:

φ(n) = x(n) - Xp(n) a(n)
     = x(n) - Xp(n) [Xp^T(n) Xp(n)]^(-1) Xp^T(n) x(n)
     = (I - P_P) x(n)   (P_P: projection matrix)
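A small numerical sketch (added here, with assumed dimensions) of the pseudo-AP input construction: φ(n) is the part of x(n) orthogonal to the last P regressors, i.e. to the columns of Xp(n).

```python
import numpy as np

rng = np.random.default_rng(4)
N, P = 16, 2
Xp = rng.standard_normal((N, P))         # stands in for Xp(n) = [x(n-1), ..., x(n-P)]
x = rng.standard_normal(N)               # stands in for x(n)

a = np.linalg.solve(Xp.T @ Xp, Xp.T @ x) # least-squares AR coefficient fit
phi = x - Xp @ a                         # phi(n) = x(n) - Xp(n) a(n) = (I - P_P) x(n)
print(np.abs(Xp.T @ phi).max())          # ~0: phi is orthogonal to the columns of Xp
```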

Example - Pseudo-AP versus AP

[Plot: mean-square error (dB) versus iteration for AP and Pseudo-AP. AR(8) input with coefficients [0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2], N = 64, P = 8, SNR = 80 dB, μ = 0.7. Steady-state MSE: approximately -72.82 dB (AP) and -75.82 dB (PAP).]

Using φ(n) as the new input vector for the NLMS algorithm,

w(n+1) = w(n) + μ φ(n) e(n) / (φ^T(n) φ(n))

It can be shown that Pseudo-AP is identical to AP for an AR input and μ = 1

Otherwise, Pseudo-AP is different from AP

Simpler to implement than AP for μ ≠ 1

Can lead to even better steady-state results than AP for AR inputs

For AR inputs: NLMS with input orthogonalization

Some Research Results

S. J. M. de Almeida, J. C. M. Bermudez, N. J. Bershad and M. H. Costa, "A statistical analysis of the affine projection algorithm for unity step size and autoregressive inputs," IEEE Transactions on Circuits and Systems - I, vol. 52, pp. 1394-1405, July 2005.

S. J. M. Almeida, J. C. M. Bermudez and N. J. Bershad, "A stochastic model for a pseudo affine projection algorithm," IEEE Transactions on Signal Processing, vol. 57, pp. 107-118, 2009.

S. J. M. Almeida, M. H. Costa and J. C. M. Bermudez, "A stochastic model for the deficient length pseudo affine projection adaptive algorithm," in Proc. 17th European Signal Processing Conference (EUSIPCO), Aug. 24-28, Glasgow, Scotland, 2009.

S. J. M. Almeida, M. H. Costa and J. C. M. Bermudez, "A stochastic model for the deficient order affine projection algorithm," in Proc. ISSPA 2010, Kuala Lumpur, Malaysia, May 2010.

C. Richard, J. C. M. Bermudez and P. Honeine, "Online prediction of time series data with kernels," IEEE Transactions on Signal Processing, vol. 57, pp. 1058-1067, 2009.

P. Honeine, C. Richard, J. C. M. Bermudez, H. Snoussi, M. Essoloh and F. Vincent, "Functional estimation in Hilbert space for distributed learning in wireless sensor networks," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, pp. 2861-2864, 2009.

    Deterministic Algorithms


    Recursive Least Squares Algorithm (RLS)


Based on a deterministic philosophy

Designed for the present realization of the input signals

Least squares method adapted for real-time processing of time series

Convergence speed is not strongly dependent on the input statistics

RLS - Problem Definition

Define

xo(n) = [x(n), x(n-1), ..., x(0)]^T

x(n) = [x(n), x(n-1), ..., x(n-N+1)]^T  (input to the N-tap filter)

Desired signal: d(n)

Estimator of d(n): y(n) = x^T(n) w(n)

Cost function - squared error:

Jls(n) = Σ_{k=0}^{n} e^2(k) = Σ_{k=0}^{n} [d(k) - x^T(k) w(n)]^2
       = e^T(n) e(n),  e(n) = [e(n), e(n-1), ..., e(0)]^T

In vector form:

d(n) = [d(n), d(n-1), ..., d(0)]^T

y(n) = [y(n), y(n-1), ..., y(0)]^T,  y(k) = x^T(k) w(n)

y(n) = Σ_{k=0}^{N-1} wk(n) xo(n-k) = Φ(n) w(n)

Φ(n) = [ x(n)    x(n-1)   ...  x(n-N+1)
         x(n-1)  x(n-2)   ...  x(n-N)
         ...     ...      ...  ...
         x(0)    x(-1)    ...  x(-N+1) ]
     = [xo(n), xo(n-1), ..., xo(n-N+1)]

e(n) = d(n) - Φ(n) w(n)

and we minimize

Jls(n) = ||e(n)||^2

Normal Equations

Φ^T(n) Φ(n) w(n) = Φ^T(n) d(n)

R(n) w(n) = p(n)

Alternative representation for R and p:

R(n) = Φ^T(n) Φ(n) = Σ_{k=0}^{n} x(k) x^T(k)

p(n) = Φ^T(n) d(n) = Σ_{k=0}^{n} x(k) d(k)

RLS - Observations

w(n) remains fixed for 0 ≤ k ≤ n to determine Jls(n)

The optimum vector w(n) is wo(n) = R^(-1)(n) p(n)

The algorithm has growing memory

In an adaptive implementation, wo(n) = R^(-1)(n) p(n) must be solved for each iteration n

Problems for an adaptive implementation:
Difficulties with nonstationary signals (infinite data window)
Computational complexity

RLS - Forgetting the Past

Modified cost function

Jls(n) = Σ_{k=0}^{n} λ^(n-k) e^2(k) = Σ_{k=0}^{n} λ^(n-k) [d(k) - x^T(k) w(n)]^2
       = e^T(n) Λ e(n),  Λ = diag[1, λ, λ^2, ..., λ^n]

Modified Normal Equations

Φ^T(n) Λ Φ(n) w(n) = Φ^T(n) Λ d(n)

R(n) w(n) = p(n)

where

R(n) = Φ^T(n) Λ Φ(n) = Σ_{k=0}^{n} λ^(n-k) x(k) x^T(k)

p(n) = Φ^T(n) Λ d(n) = Σ_{k=0}^{n} λ^(n-k) x(k) d(k)

RLS - Recursive Updating

Correlation matrix

R(n) = Σ_{k=0}^{n} λ^(n-k) x(k) x^T(k)
     = λ Σ_{k=0}^{n-1} λ^(n-1-k) x(k) x^T(k) + x(n) x^T(n)
     = λ R(n-1) + x(n) x^T(n)

Cross-correlation vector

p(n) = Σ_{k=0}^{n} λ^(n-k) x(k) d(k)
     = λ Σ_{k=0}^{n-1} λ^(n-1-k) x(k) d(k) + x(n) d(n)
     = λ p(n-1) + x(n) d(n)

We have recursive updating expressions for R(n) and p(n)

However, we need a recursive updating for R^(-1)(n), as

wo(n) = R^(-1)(n) p(n)

Applying the matrix inversion lemma to

R(n) = λ R(n-1) + x(n) x^T(n)

and defining P(n) = R^(-1)(n) yields

P(n) = λ^(-1) P(n-1) - [λ^(-2) P(n-1) x(n) x^T(n) P(n-1)] / [1 + λ^(-1) x^T(n) P(n-1) x(n)]

The Gain Vector

Definition:

k(n) = [λ^(-1) P(n-1) x(n)] / [1 + λ^(-1) x^T(n) P(n-1) x(n)]

Using this definition,

P(n) = λ^(-1) P(n-1) - λ^(-1) k(n) x^T(n) P(n-1)

k(n) can be written as

k(n) = [λ^(-1) P(n-1) - λ^(-1) k(n) x^T(n) P(n-1)] x(n)
     = P(n) x(n)
     = R^(-1)(n) x(n)

k(n) is the representation of x(n) on the column space of R(n)

RLS - Recursive Weight Update

w(n+1) = wo(n)

Using

w(n+1) = R^(-1)(n) p(n) = P(n) p(n)

p(n) = λ p(n-1) + x(n) d(n)

P(n) = λ^(-1) P(n-1) - λ^(-1) k(n) x^T(n) P(n-1)

k(n) = P(n) x(n)

e(n) = d(n) - x^T(n) w(n)

yields

w(n+1) = w(n) + k(n) e(n)  (RLS weight update)
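A minimal exponentially weighted RLS sketch (added here) implementing exactly these recursions; λ, δ and the sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
N, lam, delta = 16, 0.995, 1e-2
wo = rng.standard_normal(N)              # assumed "unknown" system

w = np.zeros(N)
Pm = np.eye(N) / delta                   # P(0) = R^(-1)(0) = (delta I)^(-1)
xbuf = np.zeros(N)
for n in range(2000):
    xbuf = np.roll(xbuf, 1)
    xbuf[0] = rng.standard_normal()
    d = xbuf @ xbuf * 0 + xbuf @ wo + 1e-3 * rng.standard_normal()
    Px = Pm @ xbuf                       # P(n-1) x(n)
    k = Px / (lam + xbuf @ Px)           # gain vector k(n)
    e = d - xbuf @ w                     # a priori error e(n)
    w = w + k * e                        # w(n+1) = w(n) + k(n) e(n)
    Pm = (Pm - np.outer(k, Px)) / lam    # P(n) = (1/lam)[P(n-1) - k(n) x^T(n) P(n-1)]

print(np.linalg.norm(w - wo))
```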

RLS - Observations

The RLS convergence speed is not affected by the eigenvalues of R(n)

Initialization: p(0) = 0, R(0) = δI, with δ of the order of (1 - λ) σ_x^2
This changes R(n) to λ^n δI + R(n) => biased estimate of wo(n) for small n
(no problem for λ < 1 and large n)

Numerical problems in finite precision:
Unstable for λ = 1
Loss of symmetry in P(n):
Evaluate only the upper (or lower) triangular part and the diagonal of P(n), or
Replace P(n) with [P(n) + P^T(n)]/2 after updating from P(n-1)
Numerical problems when x(n) ≈ 0 and λ < 1

Some Research Results

C. Ludovico and J. C. M. Bermudez, "A recursive least squares algorithm robust to low-power excitation," in Proc. 2004 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP-2004), Montreal, Canada, vol. II, pp. 673-677, 2004.

C. Ludovico and J. C. M. Bermudez, "An improved recursive least squares algorithm robust to input power variation," in Proc. IEEE Statistical Signal Processing Workshop, Bordeaux, France, 2005.

Performance Comparison

System identification, N = 100

Input AR(1): x(n) = 0.9 x(n-1) + v(n)

AP algorithm with P = 2

Step sizes designed for equal LMS and NLMS performances

[Plot: impulse response to be estimated, 100 samples with values between about -0.1 and 0.4.]

Excess Mean Square Error - LMS, NLMS and AP

[Plot: EMSE (dB) versus iteration for LMS, NLMS and AP(2), decaying from 0 toward about -80 dB over 6000 iterations.]

Excess Mean Square Error - LMS, NLMS, AP and RLS

[Plot: EMSE (dB) versus iteration for LMS, NLMS, AP(2) and RLS over 6000 iterations.]

Performance Comparisons - Computational Complexity

Adaptive filter with N real coefficients and real signals

For the AP algorithm, K = P + 1

Algorithm | x                     | +                       | /
LMS       | 2N + 1                | 2N                      | -
NLMS      | 3N + 1                | 3N                      | 1
AP        | (K^2 + 2K)N + K^3 + K | (K^2 + 2K)N + K^3 + K^2 | -
RLS       | N^2 + 5N + 1          | N^2 + 3N                | 1

For N = 100, P = 2:

Algorithm | x      | +      | / | factor
LMS       | 201    | 200    | - | 1
NLMS      | 301    | 300    | 1 | 1.5
AP        | 1,530  | 1,536  | - | 7.5
RLS       | 10,501 | 10,300 | 1 | 52.5

Typical values for acoustic echo cancellation (N = 1024, P = 2):

Algorithm | x         | +         | / | factor
LMS       | 2,049     | 2,048     | - | 1
NLMS      | 3,073     | 3,072     | 1 | 1.5
AP        | 15,390    | 15,396    | - | 7.5
RLS       | 1,053,697 | 1,051,648 | 1 | 514
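The table entries follow directly from the per-iteration formulas; this small added script recomputes the multiplication and addition columns (using K = P + 1 for AP, as stated above):

```python
def ops(N, P):
    """Per-iteration (multiplications, additions) from the formulas above."""
    K = P + 1
    return {
        "LMS":  (2*N + 1, 2*N),
        "NLMS": (3*N + 1, 3*N),
        "AP":   ((K*K + 2*K)*N + K**3 + K, (K*K + 2*K)*N + K**3 + K*K),
        "RLS":  (N*N + 5*N + 1, N*N + 3*N),
    }

for N in (100, 1024):
    print(N, ops(N, 2))   # matches the tables for N = 100 and N = 1024
```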

How to Deal with Computational Complexity?

Not an easy task!!!

There are fast versions of some algorithms (especially RLS)

What is usually not said is that speed can bring:
Instability
Increased need for memory

Most applications rely on simple solutions

    Understanding the Adaptive Filter Behavior

What are Adaptive Filters?

Adaptive filters are, by design, systems that are:

Time-variant (w(n) is time-variant)

Nonlinear (y(n) is a nonlinear function of x(n))

Stochastic (w(n) is random)

LMS:

w(n+1) = w(n) + μ e(n) x(n)
       = [I - μ x(n) x^T(n)] w(n) + μ d(n) x(n)

y(n) = x^T(n) w(n)

They are difficult to understand
Different applications and signals require different analyses
Simplifying assumptions are necessary

Good design requires good analytical models.

LMS Analysis

Basic equations:

d(n) = x^T(n) wo + z(n)

e(n) = d(n) - x^T(n) w(n)

w(n+1) = w(n) + μ e(n) x(n)

v(n) = w(n) - wo  (study about the optimum weight vector)

Mean Weight Behavior:

Neglecting the dependence between v(n) and x(n) x^T(n),

E[v(n+1)] = (I - μ Rxx) E[v(n)]

Follows the steepest descent trajectory

Convergence condition: 0 < μ < 2/λ_max

lim_{n→∞} E[w(n)] = wo

However: w(n) is random => we need to study its fluctuations about E[w(n)] => MSE

The Mean-Square Estimation Error

e(n) = eo(n) - x^T(n) v(n),  eo(n) = d(n) - x^T(n) wo

Square the error equation

Take the expected value

Neglect the statistical dependence between v(n) and x(n) x^T(n):

Jms(n) = E[eo^2(n)] + Tr{Rxx K(n)},  K(n) = E[v(n) v^T(n)]

This expression is independent of the adaptive algorithm

The effect of the algorithm on Jms(n) is determined by K(n)

Jms,min = E[eo^2(n)]

Jms,ex = Tr{Rxx K(n)}

The Behavior of K(n) for LMS

Using the basic equations,

v(n+1) = [I - μ x(n) x^T(n)] v(n) + μ eo(n) x(n)

Post-multiply by v^T(n+1)

Take the expected value assuming v(n) and x(n) x^T(n) independent

Evaluate the expectations: higher-order moments of x(n) require the input pdf

Assuming Gaussian inputs,

K(n+1) = K(n) - μ [Rxx K(n) + K(n) Rxx]
         + μ^2 {Rxx Tr[Rxx K(n)] + 2 Rxx K(n) Rxx}
         + μ^2 Rxx Jms,min
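The moment recursion can be iterated directly to predict the LMS learning curve Jms(n) = Jms,min + Tr{Rxx K(n)}. In this added sketch the covariance, step size and Jms,min are illustrative assumptions:

```python
import numpy as np

N, mu, Jmsmin = 4, 0.05, 1e-3
Rxx = 0.5 ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))  # toy covariance
K = np.eye(N)                            # K(0) = E[v(0) v^T(0)], assumed identity

for n in range(200):
    Jms = Jmsmin + np.trace(Rxx @ K)     # Jms(n) = Jms,min + Tr{Rxx K(n)}
    K = (K - mu * (Rxx @ K + K @ Rxx)
           + mu**2 * (Rxx * np.trace(Rxx @ K) + 2 * Rxx @ K @ Rxx)
           + mu**2 * Rxx * Jmsmin)       # Gaussian-input recursion from above

print(Jms)                               # approaches Jms,min plus the excess MSE
```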

Design Guidelines Extracted from the Model

Stability

0