ARTICLE IN PRESS
Contents lists available at ScienceDirect
Signal Processing
Signal Processing 90 (2010) 1873–1885
0165-1684/$ - see front matter © 2009 Elsevier B.V. All rights reserved.
doi:10.1016/j.sigpro.2009.12.005
journal homepage: www.elsevier.com/locate/sigpro
* Corresponding author at: Department of Electrical Engineering, Bengal Engineering and Science University, Howrah 711103, West Bengal, India. Tel.: +91 33 26685018; fax: +91 33 26684564.
E-mail addresses: [email protected] (A. Mukherjee), [email protected] (A. Sengupta).
Likelihood function modeling of particle filter in presence of non-stationary non-Gaussian measurement noise
Arpita Mukherjee a,b,*, Aparajita Sengupta a
a Department of Electrical Engineering, Bengal Engineering and Science University, Howrah 711103, West Bengal, India
b Dr. B.C. Roy Engineering College, Durgapur 713206, West Bengal, India
Article info
Article history:
Received 14 May 2009
Received in revised form 4 December 2009
Accepted 8 December 2009
Available online 22 December 2009
Keywords:
Particle filter
Gaussian mixture model
Likelihood function
Abstract
A generalized likelihood function model of a sampling importance resampling (SIR) particle filter (PF) has been derived for state estimation of a nonlinear system in the presence of non-stationary, non-Gaussian white measurement noise. The measurement noise is modeled by a Gaussian mixture probability density function, and the noise parameters are estimated by maximizing the log likelihood function of the noise model. This model is then included in the likelihood function of the SIR particle filter at each time step for online state estimation of the system. The performance of the proposed algorithm has been evaluated by estimating the states of (i) a nonlinear system in the presence of non-stationary Rayleigh distributed noise and (ii) a radar tracking system in the presence of glint noise. The simulation results show that the proposed modified SIR PF offers the best performance among the considered algorithms for these examples.
& 2009 Elsevier B.V. All rights reserved.
1. Introduction

An optimal solution to the state estimation problem of nonlinear systems is still a research problem, especially in the presence of measurement noise which is neither Gaussian in distribution nor stationary. The particle filter (also known as sequential importance (re-)sampling (SIS/SIR) and sequential Monte Carlo (SMC) methods) [1–5] has proven its worth in dynamic state estimation of nonlinear systems fed by non-Gaussian noise. It too yields suboptimal solutions. Moreover, it works on the assumption that the measurement noise is a zero mean white sequence with a known probability density function (PDF) [4]. The present work proposes improvements in some of these issues.
The observation likelihood function, responsible for determining the particles' weights, depends on the measurement noise. The particles' weights in turn determine how they are resampled. The likelihood function in this way has a strong influence on the quality of estimation. The work [6] contains experimental results which show a large influence of the observation likelihood function on the tracking performance of three different tracking tasks for different parameter values of an assumed measurement model. However, not much work is reported in the literature on the modeling of the observation likelihood function, which is necessary to calculate the particles' weights from the observed measurements. Often no details are given as to how the observation likelihood function model of a particle filtering algorithm is determined. In the present work, a generalized observation likelihood function model of the SIR particle filter is proposed for dynamic state estimation of a nonlinear system in the presence of non-stationary non-Gaussian measurement noise.
Non-Gaussian noise densities are closely approximated by Gaussian mixture models [7–11]. In [12], the authors discussed how an a posteriori PDF from the Bayesian recursion relation can be approximated by a weighted sum of Gaussian probability density functions. In [13], the authors proposed a special sequential Monte Carlo method, the mixture Kalman filter, which uses a random mixture of normal distributions to represent target distributions. This technique is sometimes also referred to as Rao–Blackwellization [14] and the marginalized particle filter [15]. These filters were developed for linear or nonlinear models on the assumption of zero mean white Gaussian process and measurement noise. The paper [16] investigated Bayes optimal adaptive estimation for a linear discrete time system with unknown Markovian state-space and measurement noise. In another work [17], the authors developed an algorithm using several Gaussian sum particle filters for a nonlinear state-space model operating with non-Gaussian noise. All these filters, approximating the filtering and predictive distributions by weighted Gaussian mixtures, are basically banks of Gaussian particle filters [18]. None of these papers, however, were developed for state estimation of a nonlinear system under the assumption of non-Gaussian, non-stationary measurement noise. Moreover, in the case of Gaussian mixture filters, e.g. the Gaussian sum particle filters [17], a bank of Gaussian particle filters (GPF) runs in parallel, similar to the case of mixture Kalman filters [13], where a bank of Kalman filters runs in parallel. In these cases the computational load is higher than that of the proposed SIR PF, where a single PF runs to estimate the states in the presence of non-Gaussian noise.
Since the likelihood function depends on the measurement noise distribution, the first task is to model the measurement noise and estimate its parameters. In our previous work [19], an algorithm was proposed to estimate the parameters of a non-stationary non-Gaussian noise (NSNGN) by maximization of the log likelihood function. The present paper uses this algorithm to improve the particle filter algorithm [4] by proposing a generalized observation likelihood model through measurement noise preprocessing. The noise is modeled using Gaussian mixture PDFs, and the noise parameters are evaluated by maximizing the log likelihood function of a data set consisting of a number of independent noise sequences. This noise model and the evaluated parameters are used in the likelihood function of the SIR particle filter for state estimation. This is a generalized noise model which can be used for PDF estimation of a stationary or non-stationary, Gaussian or non-Gaussian, zero mean or non-zero mean noise sequence. The modified SIR particle filter is thus suitable for state estimation of a nonlinear system in the presence of any type of white measurement noise.
This paper is organized as follows. In Section 2, the proposed modeling of the measurement noise and its parameter estimation are described. Section 3 recalls the basic SIR particle filter algorithm. Section 4 describes the proposed modeling of the likelihood function in the SIR filter. The model order growth of the Gaussian mixture models for non-Gaussian measurement noise over iterations is discussed in Section 5. Simulation results are shown in Section 6. Finally, conclusions are drawn in Section 7.
2. Parameter estimation of non-stationary non-Gaussian noise
Consider $v_k$ to be a non-stationary non-Gaussian noise (NSNGN) sequence with non-zero mean. The parameters of the noise $v_k$ are to be estimated from $n$ individual measurement noise histories. The NSNGN is modeled by Gaussian mixture densities, and the noise parameters at every time instant are estimated by maximizing the logarithm of the likelihood function of $n$ independent sets of noise sequences. The noise $v_k$ at time instant $k$ may be characterized by a Gaussian mixture density of size $m$ [19]:
$$f(v_k) = \sum_{i=1}^{m} p_i(k)\, f_i(v_k) \qquad (1)$$
$$= \sum_{i=1}^{m} \frac{p_i(k)}{(2\pi)^{1/2}\sigma_i(k)} \exp\left[-\frac{\{v_k-\mu_i(k)\}^2}{2\sigma_i^2(k)}\right], \qquad (2)$$
where $f_i(v_k)$ is the probability density of $v_k$ belonging to the $i$-th Gaussian component density in the mixture, and $\mu_i(k) = [\mu_1(k), \ldots, \mu_m(k)]$, $\sigma_i(k) = [\sigma_1(k), \ldots, \sigma_m(k)]$, $p_i(k) = [p_1(k), \ldots, p_m(k)]$ are, respectively, the means, standard deviations and weights of the individual Gaussian component densities. Also $p_i \geq 0$, and
$$\sum_{i=1}^{m} p_i(k) = 1. \qquad (3)$$
The $i$-th Gaussian density component of $v_k$, normally distributed with mean $\mu_i(k)$ and variance $\sigma_i^2(k)$, is denoted by $N\{\mu_i(k), \sigma_i^2(k)\}$ at the $k$-th time step and is related to the event $A_i(k)$ according to the following equation:
$$A_i(k) = [v_k \sim N\{\mu_i(k), \sigma_i^2(k)\}], \qquad (4)$$
with $A_i(k)$, $i = 1, \ldots, m$, mutually exclusive and exhaustive, and the probability of the event $A_i(k)$ is
$$P\{A_i(k)\} = p_i(k). \qquad (5)$$
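As a quick illustration of (1)–(2), the mixture density can be evaluated pointwise in a few lines of Python (a minimal sketch; the function name and argument layout are ours, not the paper's):

```python
import math

def mixture_pdf(v, weights, means, sigmas):
    """Evaluate the Gaussian mixture density of Eq. (2) at a scalar point v.
    weights, means, sigmas hold p_i(k), mu_i(k), sigma_i(k) for the m components."""
    return sum(
        p / (math.sqrt(2.0 * math.pi) * s)
        * math.exp(-((v - mu) ** 2) / (2.0 * s ** 2))
        for p, mu, s in zip(weights, means, sigmas)
    )

# A single component reduces to the ordinary Gaussian density:
mixture_pdf(0.0, [1.0], [0.0], [1.0])  # equals 1/sqrt(2*pi)
```

With several components of unequal means, the same function produces the asymmetric, multimodal shapes that the parameter estimation below is designed to fit.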
Using Bayes' rule, the posterior probability of the $i$-th Gaussian component conditioned on the signal $v_k^l$ is
$$P\{f_i(v_k^l) \mid v_k^l\} = \frac{p\{v_k^l \mid f_i(v_k^l)\}\, P\{f_i(v_k^l)\}}{\sum_{j=1}^{m} p\{v_k^l \mid f_j(v_k^l)\}\, P\{f_j(v_k^l)\}},$$
where $v_k^l$ denotes the noise in the $l$-th data set (among a total of $n$ data sets) at the $k$-th time instant. From (4) it follows that the $i$-th Gaussian density component of $v_k^l$ (denoted by $f_i(v_k^l)$) is related to the event $A_i(k)$. Now, replacing $A_i(k)$ with $f_i(v_k^l)$ in (5), we get
$$P\{A_i(k)\} = P\{f_i(v_k^l)\} = p_i(k). \qquad (6)$$
And, since
$$p\{v_k^l \mid f_i(v_k^l)\} = f_i\{v_k^l \mid \mu_i(k), \sigma_i(k)\}, \qquad (7)$$
we have
$$P\{f_i(v_k^l) \mid v_k^l\} = \frac{p_i(k)\, f_i\{v_k^l \mid \mu_i(k), \sigma_i(k)\}}{\sum_{j=1}^{m} p_j(k)\, f_j\{v_k^l \mid \mu_j(k), \sigma_j(k)\}}. \qquad (8)$$
Also,
$$\sum_{i=1}^{m} P\{f_i(v_k^l) \mid v_k^l\} = 1. \qquad (9)$$
Now the parameters $p_i(k)$, $\mu_i(k)$ and $\sigma_i(k)$ will be estimated by maximizing the log likelihood of the complete data set with respect to these variables, using the a priori values of the parameters (the a priori update of the parameters is discussed at the end of this section through (21)–(26)). The log likelihood is used instead of the true likelihood because it leads to simpler mathematical expressions, but still attains its maximum at the same points as the likelihood function. The logarithm of the likelihood function of $n$ independent signals conditioned on the model parameters is given by [19]
$$O(k) = \sum_{l=1}^{n} \log_e \sum_{i=1}^{m} p_i(k)\, f_i\{v_k^l \mid \mu_i(k), \sigma_i(k)\}, \qquad (10)$$
obeying the constraint in (3). Introducing a Lagrange multiplier $\lambda$ for this constraint, (10) yields
$$O(k) = \sum_{l=1}^{n} \log_e \sum_{i=1}^{m} p_i(k)\, f_i\{v_k^l \mid \mu_i(k), \sigma_i(k)\} - \lambda\left(\sum_{i=1}^{m} p_i(k) - 1\right). \qquad (11)$$
The mean $\mu_i(k)$ of the Gaussian components for which $O(k)$ would be maximum can be calculated from
$$\frac{\partial O(k)}{\partial \mu_i(k)} = 0. \qquad (12)$$
Now,
$$\frac{\partial O(k)}{\partial \mu_i(k)} = \sum_{l=1}^{n} \frac{p_i(k)\, \dfrac{\partial}{\partial \mu_i(k)}\left[\dfrac{1}{\sqrt{2\pi\sigma_i^2(k)}} \exp\left\{-\dfrac{(v_k^l-\mu_i(k))^2}{2\sigma_i^2(k)}\right\}\right]}{\sum_{i=1}^{m} p_i(k)\, f_i\{v_k^l \mid \mu_i(k), \sigma_i(k)\}}$$
$$= \sum_{l=1}^{n} \frac{p_i(k)\, f_i\{v_k^l \mid \mu_i(k), \sigma_i(k)\}}{\sum_{i=1}^{m} p_i(k)\, f_i\{v_k^l \mid \mu_i(k), \sigma_i(k)\}} \cdot \frac{\{v_k^l-\mu_i(k)\}}{\sigma_i^2(k)}.$$
Now, using (8) and (12), we get
$$\sum_{l=1}^{n} P\{f_i(v_k^l) \mid v_k^l\}\, \frac{\{v_k^l-\mu_i(k)\}}{\sigma_i^2(k)} = 0. \qquad (13)$$
Therefore,
$$\mu_i(k) = \frac{\sum_{l=1}^{n} P\{f_i(v_k^l) \mid v_k^l\}\, v_k^l}{\sum_{l=1}^{n} P\{f_i(v_k^l) \mid v_k^l\}}. \qquad (14)$$
Similarly, new estimates for the variance $\sigma_i^2(k)$ can be obtained by evaluating
$$\frac{\partial O(k)}{\partial \sigma_i(k)} = 0. \qquad (15)$$
Solving (15), we get
$$\sigma_i^2(k) = \frac{\sum_{l=1}^{n} P\{f_i(v_k^l) \mid v_k^l\}\, \{v_k^l-\mu_i(k)\}^2}{\sum_{l=1}^{n} P\{f_i(v_k^l) \mid v_k^l\}}. \qquad (16)$$
To find the estimates for the weights of the Gaussian component densities, i.e. $p_i(k)$, we set the derivative of $O(k)$ with respect to $p_i(k)$ equal to zero:
$$\frac{\partial O(k)}{\partial p_i(k)} = \sum_{l=1}^{n} \frac{f_i\{v_k^l \mid \mu_i(k), \sigma_i(k)\}}{\sum_{i=1}^{m} p_i(k)\, f_i\{v_k^l \mid \mu_i(k), \sigma_i(k)\}} - \lambda = 0.$$
Now, using (8) and the above equation, we get
$$\sum_{l=1}^{n} \frac{P\{f_i(v_k^l) \mid v_k^l\}}{p_i(k)} - \lambda = 0 \qquad (17)$$
$$\Rightarrow\quad p_i(k) = \frac{1}{\lambda} \sum_{l=1}^{n} P\{f_i(v_k^l) \mid v_k^l\}. \qquad (18)$$
Also, (3) and (18) lead to
$$\sum_{i=1}^{m} p_i = \sum_{i=1}^{m} \frac{1}{\lambda} \sum_{l=1}^{n} P\{f_i(v_k^l) \mid v_k^l\} = 1$$
$$\Rightarrow\quad \frac{1}{\lambda} \sum_{l=1}^{n} \sum_{i=1}^{m} P\{f_i(v_k^l) \mid v_k^l\} = \frac{n}{\lambda} = 1, \qquad (19)$$
where the inner sum equals one by (9).
Therefore, $\lambda = n$. Inserting $\lambda$ into (18), we get
$$p_i(k) = \frac{1}{n} \sum_{l=1}^{n} P\{f_i(v_k^l) \mid v_k^l\}. \qquad (20)$$
Eqs. (14), (16) and (20) give the a posteriori updated values of the means, variances and weights of the Gaussian components, respectively; inserting the corresponding values in (2), the resulting PDF is seen to closely approximate the actual non-Gaussian noise PDF.
In the proposed algorithm, the parameters (means, variances and weights) are evaluated in two stages: an a priori update and an a posteriori update. At the start of each time step, the parameters of the $i$-th Gaussian component are set by the newly proposed a priori update. Next, the a posteriori, i.e., final values of the parameters are evaluated by (14), (16) and (20). In the a priori update, the means of the individual Gaussian models at the $k$-th time step are set so that they are symmetrically distributed about the a posteriori resultant mean of the previous time step, $\mu(k-1)$. By this strategy, an asymmetrically and multimodally distributed non-Gaussian noise can be characterized too.
The a priori variance $\sigma_{0i}^2(k)$, the a priori mean $\mu_{0i}(k)$ and the a priori weight $p_{0i}(k)$ are selected as follows.
The a priori weight:
$$p_{0i}(k) = \frac{1}{m}, \qquad (21)$$
where $m$ is the number of Gaussian components.
The a priori variance:
$$\sigma_{0i}^2(k) = \sigma_i^2(k-1). \qquad (22)$$
The a priori mean $\mu_{0i}(k)$ is set as
$$\mu_{0i}(k) = \mu(k-1)\left(1 + \left(\frac{m+1}{2} - i\right)\frac{1}{m}\right) \qquad (23)$$
when $m$ is odd.
If $m$ is even, for $i = 1$ to $m/2$, the a priori mean is set as
$$\mu_{0i}(k) = \mu(k-1)\left(1 + \left(\frac{m+2}{2} - i\right)\frac{1}{m}\right), \qquad (24)$$
and, for $i = m/2+1$ to $m$,
$$\mu_{0i}(k) = \mu(k-1)\left(1 + \left(\frac{m}{2} - i\right)\frac{1}{m}\right), \qquad (25)$$
where
$$\mu(k-1) = p_1(k-1)\mu_1(k-1) + \cdots + p_m(k-1)\mu_m(k-1) \qquad (26)$$
is the resultant a posteriori mean of the $n$ data sets at the $(k-1)$-th time instant.
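To see how (21)–(26) place the components, the a priori means for a given $\mu(k-1)$ can be computed as below (an illustrative sketch; the function name is ours):

```python
def apriori_means(mu_prev, m):
    """Spread the m a priori component means symmetrically about the previous
    resultant mean mu_prev, following Eqs. (23)-(25)."""
    if m % 2 == 1:  # odd m, Eq. (23)
        return [mu_prev * (1.0 + ((m + 1) / 2.0 - i) / m) for i in range(1, m + 1)]
    means = [mu_prev * (1.0 + ((m + 2) / 2.0 - i) / m) for i in range(1, m // 2 + 1)]  # Eq. (24)
    means += [mu_prev * (1.0 + (m / 2.0 - i) / m) for i in range(m // 2 + 1, m + 1)]   # Eq. (25)
    return means

# With m = 5 and mu(k-1) = 1.0, the means straddle 1.0 symmetrically:
apriori_means(1.0, 5)  # approximately [1.4, 1.2, 1.0, 0.8, 0.6]
```

For odd $m$ the middle component sits exactly at $\mu(k-1)$; for even $m$ the components straddle it, which is what allows asymmetric and multimodal noise PDFs to be captured.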
The proposed algorithm for parameter estimation of NSNGN is summarized below:
2.1. Algorithm I
(1) At the start of the $k$-th time step, the a priori parameters, i.e., weight $p_{0i}(k)$, mean $\mu_{0i}(k)$ and variance $\sigma_{0i}^2(k)$ of the $i$-th Gaussian component, are set as:
- A priori weight of the Gaussian components (by (21)): $p_{0i}(k) = 1/m$.
- A priori variance (by (22)): $\sigma_{0i}^2(k) = \sigma_i^2(k-1)$.
- A priori mean (by (23)–(26)): when $m$ is odd, $\mu_{0i}(k) = \mu(k-1)(1 + ((m+1)/2 - i)/m)$; if $m$ is even, $\mu_{0i}(k) = \mu(k-1)(1 + ((m+2)/2 - i)/m)$ for $i = 1$ to $m/2$, and $\mu_{0i}(k) = \mu(k-1)(1 + (m/2 - i)/m)$ for $i = m/2+1$ to $m$.
(2) The posterior probability of the $i$-th Gaussian component (8), using the a priori values in step 1 above, is given by
$$P\{f_i(v_k^l) \mid v_k^l\} = \frac{p_{0i}(k)\, f_i\{v_k^l \mid \mu_{0i}(k), \sigma_{0i}(k)\}}{\sum_{j=1}^{m} p_{0j}(k)\, f_j\{v_k^l \mid \mu_{0j}(k), \sigma_{0j}(k)\}}. \qquad (27)$$
(3) The a posteriori values of the parameters of the Gaussian components are estimated by (14), (16) and (20) as follows.
A posteriori weight of the Gaussian component:
$$p_i(k) = \frac{1}{n} \sum_{l=1}^{n} P\{f_i(v_k^l) \mid v_k^l\}. \qquad (28)$$
A posteriori mean:
$$\mu_i(k) = \frac{\sum_{l=1}^{n} P\{f_i(v_k^l) \mid v_k^l\}\, v_k^l}{\sum_{l=1}^{n} P\{f_i(v_k^l) \mid v_k^l\}}. \qquad (29)$$
A posteriori variance:
$$\sigma_i^2(k) = \frac{\sum_{l=1}^{n} P\{f_i(v_k^l) \mid v_k^l\}\, \{v_k^l-\mu_i(k)\}^2}{\sum_{l=1}^{n} P\{f_i(v_k^l) \mid v_k^l\}}. \qquad (30)$$
(4) A posteriori resultant mean of the non-stationary non-Gaussian noise: $\mu(k) = p_1(k)\mu_1(k) + \cdots + p_m(k)\mu_m(k)$.
3. SIR particle filter
The SIR filter proposed in [4] is a Monte Carlo method that can be applied to recursive Bayesian filtering problems. The state vector $x_k \in R^n$ is assumed to evolve according to the following system model:
$$x_k = f_{k-1}(x_{k-1}) + w_{k-1}, \qquad (31)$$
where $w_k$ is a zero mean white noise sequence whose PDF is assumed to be known. The measurements $y_k$ are related to the state vector via the measurement equation
$$y_k = h_k(x_k) + v_k. \qquad (32)$$
Here $v_k$ is another zero mean white noise sequence of known PDF, independent of the process noise. The SIR algorithm needs samples (particles) $S_{k-1} = \{x_{k-1}(j) : j = 1, \ldots, n\}$ in order to draw from $p(x_k \mid x_{k-1}(j))$. The SIR filter propagates the random sample set by the following steps.
- Prediction: each sample in $S_{k-1}$ is passed through the state equation (31) to obtain samples from the prior density at time step $k$:
$$x_k^-(j) = f_{k-1}(x_{k-1}(j)) + w_{k-1}(j), \qquad (33)$$
where $w_{k-1}(j)$ is a sample drawn from the PDF of the process noise $p(w_k)$.
- Update: (i) calculate the likelihood function $p(z_k \mid x_k^-(j))$ for each sample in the set $S_k^- = \{x_k^-(j) : j = 1, \ldots, n\}$; (ii) calculate the normalized weight for each sample as
$$q_k(j) = \frac{p(z_k \mid x_k^-(j))}{\sum_{j=1}^{n} p(z_k \mid x_k^-(j))}. \qquad (34)$$
The weights $q_k(j)$ represent the probability mass associated with element $j$ of $S_k^-$.
- Resampling: resample $n$ times from the discrete distribution defined by $S_k^-$ and $q_k$ to generate the new sample set $S_k = \{x_k(j) : j = 1, \ldots, n\}$, so that for any $j'$, $\Pr\{x_k(j') = x_k^-(j)\} = q_k(j)$.
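The resampling step draws each new particle with probability equal to its weight. A simple multinomial-resampling sketch (one of several standard schemes; the function name is ours, not from [1,2]):

```python
import random

def resample(particles, weights):
    """Multinomial resampling: draw n new particles so that
    Pr{x_k(j') = x_k^-(j)} = q_k(j); weights must sum to one."""
    n = len(particles)
    cdf, c = [], 0.0
    for q in weights:
        c += q
        cdf.append(c)
    new = []
    for _ in range(n):
        u = random.random()
        j = n - 1  # fallback guards against rounding shortfall in the cdf
        for i, ci in enumerate(cdf):
            if u <= ci:
                j = i
                break
        new.append(particles[j])
    return new
```

Particles with negligible weight are rarely selected and effectively die out, while high-weight particles are duplicated, which is what keeps the sample set concentrated in high-likelihood regions.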
4. Likelihood function modeling of an SIR particle filter
The likelihood function $p(z_k \mid x_k)$ needs to be available in (34) for pointwise evaluation. In practice, the likelihood function is evaluated as $p(z_k \mid x_k) = p_N(z_k; h_k(x_k), R_k)$, assuming $v_k$ to be always a zero mean white Gaussian noise with covariance $R_k$. When $v_k$ turns out to be non-stationary and non-Gaussian with a non-zero mean, this is expected to introduce gross errors in the estimation process.
In case the noise sequence is non-stationary and non-Gaussian with a non-zero mean, the PDF may be approximated using the newly proposed algorithm in Section 2.1, and subsequently the likelihood function can be evaluated as follows.
Let $x$ and $y$ be two independent $n$-vector valued random variables, and let $z$ be defined as their sum:
$$z = x + y. \qquad (35)$$
It directly follows that the conditional PDF may be found as [20]
$$p_{z|x}(z \mid x) = p_y(z - x). \qquad (36)$$
Thereby, from (32) we can write
$$p(z_k \mid x_k) = p(z_k - h_k(x_k)) = p(v_k). \qquad (37)$$
Following this procedure in the SIR algorithm, the likelihood function in (34) can be evaluated using (2). The modified SIR algorithm therefore takes the following form, including measurement noise preprocessing (using Algorithm I).
4.1. Algorithm II: new SIR particle filter
$[\{x_k(j), q_k(j)\}_{j=1}^{n}] = \mathrm{SIR}[\{x_{k-1}(j), q_{k-1}(j)\}_{j=1}^{n}, z_k]$
- for $j = 1 : n$
  - Draw
$$x_k(j) \sim p(x_k \mid x_{k-1}(j)). \qquad (38)$$
  - Calculate
$$q_k(j) = p(z_k \mid x_k(j)) = p(v_k(j)) = \sum_{i=1}^{m} \frac{p_i(k)}{(2\pi)^{1/2}\sigma_i(k)} \exp\left[-\frac{\{v_k(j)-\mu_i(k)\}^2}{2\sigma_i^2(k)}\right] \qquad (39)$$
(using (1), (2) and (37)).
- end
- Calculate the total weight: $t = \mathrm{SUM}[\{q_k(j)\}_{j=1}^{n}]$.
- for $j = 1 : n$
  - Normalize: $q_k(j) = t^{-1} q_k(j)$.
- end
- Resampling (procedure explained in [1,2]):
  - $[\{x_k(j), q_k(j)\}_{j=1}^{n}] = \mathrm{RESAMPLE}[\{x_k(j), q_k(j)\}_{j=1}^{n}]$.
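One cycle of Algorithm II for a scalar state can be sketched as follows, with the GMM parameters supplied by Algorithm I (an illustrative sketch under our own interface; NumPy's multinomial draw stands in for the resampling procedure of [1,2]):

```python
import numpy as np

def new_sir_step(particles, z, f, h, q_std, w, mu, sig, rng):
    """One cycle of the new SIR PF (Algorithm II) for a scalar state.

    particles  : array of n particles x_{k-1}(j)
    z          : current measurement z_k
    f, h       : state transition and measurement functions
    q_std      : process noise standard deviation
    w, mu, sig : GMM weights, means, std deviations of the measurement noise
    """
    n = len(particles)
    # Time update, Eq. (38): draw x_k(j) ~ p(x_k | x_{k-1}(j))
    x = f(particles) + q_std * rng.standard_normal(n)
    # Likelihood via the noise GMM, Eqs. (37) and (39): v_k(j) = z_k - h(x_k(j))
    v = z - h(x)
    lik = np.zeros(n)
    for p_i, m_i, s_i in zip(w, mu, sig):
        lik += p_i * np.exp(-((v - m_i) ** 2) / (2.0 * s_i ** 2)) / (np.sqrt(2.0 * np.pi) * s_i)
    q = lik / lik.sum()          # normalized weights, Eq. (34)
    idx = rng.choice(n, size=n, p=q)  # multinomial resampling
    return x[idx]
```

The only change from the conventional SIR cycle is the weight computation: the single-Gaussian likelihood is replaced by the estimated mixture (39), so non-zero-mean and heavy-tailed noise is weighted correctly.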
5. Model order growth over iteration for Gaussian mixture models of non-Gaussian measurement noise
In this section, the order of the mixture models over iterations is discussed considering a general Bayesian framework. Consider the system (31)–(32) given in Section 3. The non-Gaussian noise in the measurement equation (32) is approximated by a finite number of Gaussian mixture components using (1).
The measurement noise PDF is therefore given by
$$p(v_k) = \sum_{i=1}^{m} p_i(k)\, N(v_k; \mu_i(k), \sigma_i^2(k)). \qquad (40)$$
Here we assume that the process noise is Gaussian. The prediction density of the state at time $k$, using the Chapman–Kolmogorov equation, is
$$p(x_k \mid y_{1:k-1}) = \int p(x_k \mid x_{k-1})\, p(x_{k-1} \mid y_{1:k-1})\, dx_{k-1}. \qquad (41)$$
At time step $k$, when a measurement $y_k$ becomes available, the update of the prediction PDF via Bayes' rule is given by
$$p(x_k \mid y_{1:k}) = C_k\, p(y_k \mid x_k)\, p(x_k \mid y_{1:k-1}), \qquad (42)$$
where $C_k$ is the normalizing constant:
$$C_k = \left[\int p(y_k \mid x_k)\, p(x_k \mid y_{1:k-1})\, dx_k\right]^{-1}. \qquad (43)$$
Consider $k = 1$. After the arrival of $y_1$, the posterior density can be expressed as
$$p(x_1 \mid y_1) = C_1\, p(y_1 \mid x_1)\, p(x_1). \qquad (44)$$
Now,
$$p(y_1 \mid x_1) \equiv p(v_1) = \sum_{i=1}^{m} p_i(1)\, N(v_1; \mu_i(1), \sigma_i^2(1)),$$
$$p(y_1 \mid x_1) = \sum_{i=1}^{m} p_i(1)\, p_i(y_1 \mid x_1), \qquad (45)$$
where $p_i(y_1 \mid x_1) = N(v_1; \mu_i(1), \sigma_i^2(1))$. Therefore, substituting (45) into (44), we get
$$p(x_1 \mid y_1) = C_1 \sum_{i=1}^{m} p_i(1)\, p_i(y_1 \mid x_1)\, p(x_1)$$
$$\text{or}\quad p(x_1 \mid y_1) = \sum_{i=1}^{m} \alpha_i(1)\, \tilde{p}_i(x_1 \mid y_1), \qquad (46)$$
where $\tilde{p}_i(x_1 \mid y_1) \propto p_i(y_1 \mid x_1)\, p(x_1)$, and $\alpha_i(1)$ is the normalized weight. Again, after the arrival of $y_2$, the posterior density is
$$p(x_2 \mid y_{1:2}) = C_2\, p(y_2 \mid x_2)\, p(x_2 \mid y_1) \qquad (47)$$
$$= C_2 \sum_{i=1}^{m} p_i(2)\, p_i(y_2 \mid x_2)\, p(x_2 \mid y_1). \qquad (48)$$
Now, using (41), we can write
$$p(x_2 \mid y_1) = \int p(x_2 \mid x_1)\, p(x_1 \mid y_1)\, dx_1. \qquad (49)$$
Substituting (46) into (49), we get
$$p(x_2 \mid y_1) = \int p(x_2 \mid x_1) \sum_{i=1}^{m} \alpha_i(1)\, \tilde{p}_i(x_1 \mid y_1)\, dx_1 = \sum_{i=1}^{m} \alpha_i(1) \int \tilde{p}_i(x_1 \mid y_1)\, p(x_2 \mid x_1)\, dx_1$$
$$\text{or}\quad p(x_2 \mid y_1) = \sum_{i=1}^{m} \alpha_i(1)\, \tilde{p}_i(x_2 \mid y_1). \qquad (50)$$
Now, using (50) in (48), we get
$$p(x_2 \mid y_{1:2}) = C_2 \sum_{i=1}^{m} p_i(2)\, p_i(y_2 \mid x_2) \sum_{j=1}^{m} \alpha_j(1)\, \tilde{p}_j(x_2 \mid y_1)$$
$$= C_2 \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_j(1)\, p_i(2)\, p_i(y_2 \mid x_2)\, \tilde{p}_j(x_2 \mid y_1)$$
$$= \sum_{j=1}^{m^2} \alpha_j(2)\, p_j(y_2 \mid x_2)\, \tilde{p}_j(x_2 \mid y_1),$$
$$p(x_2 \mid y_{1:2}) = \sum_{j=1}^{m^2} \alpha_j(2)\, \tilde{p}_j(x_2 \mid y_{1:2}), \qquad (51)$$
where $\tilde{p}_j(x_2 \mid y_{1:2}) = p_j(y_2 \mid x_2)\, \tilde{p}_j(x_2 \mid y_1)$. Similarly, after the arrival of the $k$-th measurement, the posterior PDF is
$$p(x_k \mid y_{1:k}) = \sum_{j=1}^{m^k} \alpha_j(k)\, \tilde{p}_j(x_k \mid y_{1:k}). \qquad (52)$$
Fig. 1. Estimated PDF at the 5th time step.
Fig. 2. Estimated PDF at the 25th time step.
So we can see that the mixture model order increases exponentially with the arrival of new measurements. In practical applications, updating with an exponentially growing mixture model order is extremely difficult to handle. But this situation does not arise in the case of an SIR particle filter. From the above derivation, we can see that the term $p(x_k \mid y_{1:k-1})$ is responsible for the model order growth. In the SIR particle filter, however, the predicted PDF $p(x_k \mid y_{1:k-1})$ is approximated by $p(x_k \mid x_{1:k-1})$ [1–4] (i.e. $p(x_k \mid y_{1:k-1}) \approx p(x_k \mid x_{1:k-1})$), which is independent of the measurement. Therefore the posterior PDF in (42) at the $k$-th time step can be expressed using (1) as
$$p(x_k \mid y_{1:k}) = C_k\, p(y_k \mid x_k)\, p(x_k \mid y_{1:k-1}) \qquad (53)$$
$$\approx C_k \sum_{i=1}^{m} p_i(k)\, p_i(y_k \mid x_k)\, p(x_k \mid x_{1:k-1}). \qquad (54)$$
From the SIR PF algorithm in Section 4.1, we can also see that in the measurement update stage there is no need to pass the importance weights from one time step of the algorithm to the next. From (39), the importance weights are independent of the previous measurements. And in the time update stage, i.e. (38), the importance sampling density of the SIR PF is independent of the measurements. So the mixture model order for non-Gaussian measurement noise does not grow over iterations in the case of an SIR PF.
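The contrast can be stated numerically: exact Bayesian updating multiplies the number of mixture terms by m at every step, per (52), while the SIR PF re-applies the m-term likelihood afresh. A trivial illustration (function names are ours):

```python
def exact_bayes_order(m, k):
    """Mixture terms in the exact posterior after k measurement updates, per Eq. (52)."""
    return m ** k

def sir_pf_order(m, k):
    """With the SIR importance density p(x_k | x_{k-1}), the m-component likelihood
    is applied afresh at each step, so the order never grows."""
    return m

exact_bayes_order(5, 3)  # 125 mixture terms after only three measurements
sir_pf_order(5, 3)       # still 5
```

With $m = 5$ and 50 time samples, as used in the simulations below, the exact posterior would already require $5^{50}$ terms, which is why the constant-order property of the SIR PF matters.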
6. Simulation results
The new SIR PF proposed in this paper is now applied to two different systems. In Example I, a simple nonlinear time series model is chosen which has been used extensively in the literature for benchmarking numerical filtering techniques [4,5]. Here the measurement noise is considered to be a non-stationary Rayleigh distributed noise. In Example II, the bearings-only tracking model [4], which is of interest in defence applications, is presented in the presence of glint noise. The glint noise is a heavy-tailed non-Gaussian distributed noise [21–23]. The performance of the new SIR PF was compared with a conventional SIR PF and a mixture Kalman filter (MKF) [13].
6.1. Example I
Here the modified SIR filter is used to estimate the states of a nonlinear system in the presence of a non-stationary, non-Gaussian, non-zero mean measurement noise sequence, and the results are compared with those of a usual SIR PF, the MKF [13] and the extended Kalman filter (EKF) [24,25]. Consider the nonlinear model [4] given below:
$$x_k = 0.5 x_{k-1} + \frac{25 x_{k-1}}{1 + x_{k-1}^2} + 8\cos(1.2(k-1)) + w_k, \qquad (55)$$
$$y_k = \frac{x_k^2}{20} + v_k, \qquad (56)$$
where $w_k$ is a zero mean Gaussian white noise with variance 10, whereas $v_k$ is assumed to be non-stationary and non-Gaussian: a non-stationary Rayleigh distributed signal. The simulation was carried out for 50 time samples. The sequence of noise was generated randomly from a Rayleigh distribution with its parameter $b$ varying linearly from 1 to 2 during the 50 time samples. The equivalent variance of the noise distribution varies from 0.43 to 1.72. The Gaussian mix maximum likelihood algorithm (Section 2) has been used to estimate the parameters of the non-stationary non-Gaussian noise sequence at each time step. To find the parameters of the noise, a sequence of 1000 individual data sets was taken and processed with the Gaussian mix maximum likelihood algorithm. The number of Gaussian densities in the mixture was fixed at $m = 5$. The initial value of the weighting coefficients $p_i(1)$ was set at $1/m$. In this simulation, the initial value of the resultant mean was taken as $\mu(1) = 1.2$, and the variances were taken as $\sigma_{01}^2 = \cdots = \sigma_{05}^2 = 0.25^2$.
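The quoted variance range follows directly from the Rayleigh distribution: a Rayleigh($b$) variable has variance $(2 - \pi/2)b^2$ and non-zero mean $b\sqrt{\pi/2}$, which is why this noise exercises the non-zero-mean capability of the model. A quick check (a small sketch; function names are ours):

```python
import math

def rayleigh_var(b):
    """Variance of a Rayleigh(b) random variable: (2 - pi/2) * b**2."""
    return (2.0 - math.pi / 2.0) * b ** 2

def rayleigh_mean(b):
    """Mean of a Rayleigh(b) random variable: b * sqrt(pi/2) (non-zero)."""
    return b * math.sqrt(math.pi / 2.0)

# b swept linearly from 1 to 2 over the 50 samples gives the quoted range:
(round(rayleigh_var(1.0), 2), round(rayleigh_var(2.0), 2))  # (0.43, 1.72)
```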
The estimated noise PDFs at the 5th, 25th and 45th time steps are shown in Figs. 1–3, respectively. In the figures, we see that the resulting PDF (red) closely estimates the actual PDF (blue) of the noise present in the measurement. This resulting noise PDF is included in
the likelihood function of the SIR algorithm in (34). The performance of the modified SIR PF has been compared with the conventional SIR PF, the MKF and the EKF. In the conventional SIR filter, the likelihood function has been taken as a Gaussian one, i.e.
$$p(z_k \mid x_k) = \frac{1}{(2\pi R_k)^{1/2}} \exp\left[-\frac{\{y_k - h_k(x_k)\}^2}{2 R_k}\right],$$
where $R_k$ is the time-variant noise covariance, which varies from 0.43 to 1.72 during the 50 time samples. The same
Fig. 3. Estimated PDF at the 45th time step.
Fig. 4. RMSE of 100 Monte Carlo runs with 500 particles (SIR PF, new SIR PF, MKF, EKF).
measurement noise covariance $R_k$ is selected for the EKF. In the case of the MKF [13,26], the measurement noise is considered as $v_k \sim N(0, R_k(\Lambda_k))$, where the indicator vector $\Lambda_k$ is a discrete latent variable which takes an integer value between 1 and $m$ (the number of models). The measurement noise $v_k$ is approximated by $\sum_{i=1}^{m} p_i N(0, \sigma_i^2)$. The indicator random variable is defined as
$$\Lambda_k \triangleq \begin{cases} 1 & \text{if } v_k \sim N(0, \sigma_1^2), \\ 2 & \text{if } v_k \sim N(0, \sigma_2^2), \\ \vdots & \\ m & \text{if } v_k \sim N(0, \sigma_m^2), \end{cases}$$
with $P(\Lambda_k = 1) = p_1, \ldots, P(\Lambda_k = m) = p_m$. The number of models $m = 5$ was selected in the mixture Kalman filter with five different values of the measurement noise covariance $\sigma(i)^2 = 1.2^{i-1}\sigma^2$, where $i = 1, \ldots, m$ and $\sigma^2 = 0.43$. Since the state-space and measurement models are nonlinear, they must be linearized before applying the MKF. For all the filters, the initial values of the mean and the state covariance were selected as 0.1 and 5, respectively. The process noise covariance was set to $Q = 10$ for all the filters. The root mean square error (RMSE) of each filter was subsequently computed. Fig. 4 shows the RMSE of each filter with 500 particles for $L = 100$ Monte Carlo runs. The RMSE of $x_k$ is defined as
$$\mathrm{RMSE}_{x_k} = \sqrt{\frac{1}{L}\sum_{j=1}^{L} (x_{k,j} - \hat{x}_{k,j})^2}. \qquad (57)$$
The comparison of the average MSE (where $\mathrm{MSE} = \mathrm{RMSE}^2$) among the filters over the 50 time samples, in terms of the
number of particles, is shown in Fig. 5. The average MSE was compared for 50, 100 and 500 particles. Even for a small number of particles, the new SIR PF performs significantly better than the EKF. An increase in the number of particles reduces the MSE further for the new SIR PF, the SIR PF and the MKF. The performance of the new SIR PF becomes very close for 500 and 100 particles. The performance of the new SIR PF appears to be better than that of the other filters considered here for comparison.
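For reference, the per-time-step RMSE of (57) over L Monte Carlo runs is computed as below (a small sketch; names are ours):

```python
import math

def rmse_at_k(truth, estimates):
    """RMSE of x_k over L Monte Carlo runs, Eq. (57):
    sqrt((1/L) * sum_j (x_{k,j} - xhat_{k,j})**2)."""
    L = len(truth)
    return math.sqrt(sum((x - xh) ** 2 for x, xh in zip(truth, estimates)) / L)

rmse_at_k([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # sqrt(4/3)
```

Squaring this value and averaging over the 50 time steps gives the average MSE used in the particle-count comparisons.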
6.2. Example II
Here, the case of radar-based target tracking in the presence of glint noise is considered. The performance of the new SIR PF is compared with the SIR PF and the MKF. Radar tracking is an important area of research in defence applications. Practical radar tracking systems are rarely Gaussian due to many factors. One of them is the change in the aspect towards the radar, which can cause irregular electromagnetic wave reflection, resulting in significant variations of radar reflections [23]. This phenomenon gives rise to outliers in angle tracking, and it is referred to as target glint. The glint noise has a long-tailed PDF [23]. The glint noise has been modeled by a Student's t-distribution with two degrees of freedom [27]. For the Student's t-distribution (with 2 degrees of freedom), the estimated PDF using the proposed maximum log likelihood algorithm (Section 2.1, Algorithm I) for noise modeling is shown in Fig. 6 (where $m = 5$, the initial value of the weighting coefficients $p_i(1) = 1/m$, the initial value of the resultant mean $\mu(1) = 0$, and variances $\sigma_{01}^2(1) = 0.02$, $\sigma_{02}^2(1) = 2$, $\sigma_{03}^2(1) = 0.15$, $\sigma_{04}^2(1) = 1.5$, $\sigma_{05}^2(1) = 10$). A mixture model
Fig. 5. Average of MSE over time samples with the number of particles 50, 100 and 500 (for 100 Monte Carlo runs).
of two zero-mean Gaussian PDFs has been proposed for glint noise in [23], based on statistical analysis. This model consists of one Gaussian PDF with a high probability and small variance, and another Gaussian PDF with a small probability of occurrence and a very high variance. Alternatively, in [28], the glint noise was modeled by a mixture of a zero-mean, small-variance Gaussian PDF with a heavy-tailed Laplacian distribution. The last two models are the most commonly used for glint noise modeling [29]. Let us consider that the target moves in an x–y plane (2-D space) according to the following process model [4]:
$$X_k = \begin{bmatrix} 1 & T & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T \\ 0 & 0 & 0 & 1 \end{bmatrix} X_{k-1} + \begin{bmatrix} T^2/2 & 0 \\ T & 0 \\ 0 & T^2/2 \\ 0 & T \end{bmatrix} w_{k-1} \qquad (58)$$
and measurement model:
$$Z_k = \tan^{-1}(y_k / x_k) + v_k, \qquad (59)$$
where $X_k = [x_k\; v_{x_k}\; y_k\; v_{y_k}]$, $x_k$ and $y_k$ denote the Cartesian coordinate position of the target, and $v_{x_k}$ and $v_{y_k}$ denote the velocities in the x and y directions, respectively. $w_{k-1}$ is a zero mean white Gaussian noise with covariance $Q = qI_2$, where $I_2$ is a $2 \times 2$ identity matrix and $\sqrt{q} = 0.001$. The measurement noise $v_k$ was considered to be a glint noise, which is modeled by the following equation [21,22]:
$$p(v_k) = (1-\epsilon)\, N(v_k; 0, \sigma_r^2) + \epsilon\, N(v_k; 0, \kappa\sigma_r^2), \qquad (60)$$
where the glint probability $\epsilon = 0.1$, $\kappa = 1000$, and the measurement noise covariance $R = \sigma_r^2 = 0.005^2$. The target was simulated for 50 time samples with
sampling interval $T = 1$ s, and the initial value of the state was selected as $x_0 = [-0.05,\; 0.001,\; 0.7,\; -0.055]^T$. The performance of the new SIR PF has been compared with the conventional SIR PF and the MKF. The initial prior distribution of the state vector for all the filters was set with a mean $\bar{x}_0 = [0,\; 0,\; 0.4,\; -0.05]^T$ and
Fig. 6. Estimated PDF of Student t distribution (actual Student's t PDF vs. estimated PDF).
Fig. 7. RMSE of velocity along the x-axis with 1000 samples, for 100 Monte Carlo runs.
covariance
$$P_0 = \begin{bmatrix} 0.5^2 & 0 & 0 & 0 \\ 0 & 0.005^2 & 0 & 0 \\ 0 & 0 & 0.3^2 & 0 \\ 0 & 0 & 0 & 0.01^2 \end{bmatrix}.$$
In the conventional SIR filter, the measurement noise covariance is $R = 0.005^2$. In the case of the MKF, here, the measurement noise is considered as $v_k \sim N(0, R_k(\Lambda_k))$,
Fig. 8. RMSE of velocity along the y-axis with 1000 samples, for 100 Monte Carlo runs.
Fig. 9. RMSE of position along the x-axis with 1000 samples, for 100 Monte Carlo runs.
where the indicator vector $\Lambda_k$ is a discrete latent variable which takes an integer value between 1 and $M$ (the number of models). The number of models, $M = 5$, is
selected in the mixture Kalman filter with five different values of the measurement noise covariance $R(i) = 1000^{i-1} R$, where $i = 1, \ldots, M$ and $R = 0.005^2$. Since the measurement model is nonlinear, it must be
Fig. 10. RMSE of position along the y-axis with 1000 samples, for 100 Monte Carlo runs.
Fig. 11. Comparison of average MSE of velocity along x-axis and y-axis for the number of particles: 5000, 1000, 500, 100.
linearized before applying the MKF. The root mean square error (RMSE) of the state vector for each filter was computed. The RMSE of position and velocity along the x-axis and y-axis, using 1000 particles for $L = 100$ Monte Carlo
Fig. 12. Comparison of average MSE of position along the x-axis and y-axis for the number of particles: 5000, 1000, 500, 100 (SIR, MKF, new SIR).
runs, are shown in Figs. 7–10. The comparison of the average MSE over the 50 time samples among the filters, in terms of the number of particles, is shown in Figs. 11 and 12. The average MSE was compared for 5000, 1000, 500 and 100 particles. Even with a small number of particles, the new SIR PF performs significantly better than the other filters analyzed here. Increasing the number of particles reduces the MSE further for the new SIR PF, the SIR PF and the MKF. The performance of the new SIR PF is very similar for 5000, 1000 and 500 particles. The simulation results show that the new SIR PF gives considerably better results even with a small number of particles, since the importance weight of the SIR PF depends on the likelihood function, i.e. the measurement noise PDF, which is estimated very closely by the algorithm in Section 2.1. Among all the filters taken for comparison, the new SIR PF performs best.
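The RMSE curves of Figs. 7–10 and the averaged MSE of Figs. 11 and 12 follow the standard definitions. A minimal sketch of how these metrics are computed over Monte Carlo runs (function names and the synthetic data are illustrative, not from the paper):

```python
import numpy as np

def rmse_over_runs(x_true, x_est):
    """RMSE of one state component at each time step,
    averaged over the L Monte Carlo runs.
    x_true, x_est: arrays of shape (L, T)."""
    return np.sqrt(np.mean((x_est - x_true) ** 2, axis=0))

def avg_mse(x_true, x_est):
    """MSE averaged over both the runs and the T time samples,
    as used for the particle-count comparison."""
    return np.mean((x_est - x_true) ** 2)

# toy data: L = 100 runs, T = 50 time samples, estimation error ~ N(0, 0.1^2)
rng = np.random.default_rng(0)
truth = np.zeros((100, 50))
est = truth + 0.1 * rng.standard_normal(truth.shape)
rmse_curve = rmse_over_runs(truth, est)   # one RMSE value per time step
```

The same two reductions, applied per state component, reproduce the time-indexed curves (Figs. 7–10) and the scalar per-particle-count summaries (Figs. 11 and 12).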
7. Conclusion
In this paper, a generalized likelihood function has been derived for an SIR particle filter. The likelihood function is evaluated from the measurement noise history using Gaussian mixture models. The parameters of the noise are estimated by maximizing the log likelihood function. This model and the estimated parameters are included in the likelihood function of the SIR particle filter. This is a generalized model, as it can represent the PDF of a noise sequence whether it is Gaussian or non-Gaussian, zero mean or non-zero mean, stationary or non-stationary. In this way, the new SIR particle filter is able to estimate the states of a nonlinear system in the presence of any type of white measurement noise sequence. Moreover, since the importance weights of an SIR PF depend entirely on the likelihood function, the new SIR PF performs considerably better with a small number of particles than the conventional SIR PF and the MKF, as long as the trend of the measurement noise parameters does not change drastically from its data history during the online state estimation of the system. Simulation results show that, in the presence of non-stationary Rayleigh distributed noise, the modified SIR filter gives better results than the other filters considered here. The performance of the new SIR PF was also found to be much better than that of the SIR PF and the MKF in a radar tracking system in the presence of glint noise.
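The central step of the new SIR PF, weighting each particle by a Gaussian-mixture likelihood of its measurement residual, can be sketched as follows. This is a minimal illustration assuming a scalar measurement; the measurement function h, the toy mixture parameters and all names are placeholders, whereas in the paper the mixture parameters are estimated from the noise history by maximizing the log likelihood:

```python
import numpy as np

def gmm_pdf(v, weights, means, variances):
    """Gaussian-mixture PDF evaluated at a scalar residual v."""
    comps = np.exp(-0.5 * (v - means) ** 2 / variances) / np.sqrt(2 * np.pi * variances)
    return np.sum(weights * comps)

def sir_update(particles, w, z, h, gmm):
    """One SIR measurement update: weight each particle by the mixture
    likelihood of its residual, normalize, and resample."""
    residuals = z - np.array([h(x) for x in particles])
    w = w * np.array([gmm_pdf(r, *gmm) for r in residuals])
    w /= np.sum(w)
    idx = np.random.choice(len(particles), size=len(particles), p=w)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

# toy example: scalar state observed directly, two-component noise mixture
gmm = (np.array([0.8, 0.2]), np.array([0.0, 0.0]), np.array([0.01, 1.0]))
particles = np.random.normal(0.0, 1.0, 500)
w = np.full(500, 1.0 / 500)
particles, w = sir_update(particles, w, z=0.3, h=lambda x: x, gmm=gmm)
```

Replacing the single-Gaussian likelihood of the conventional SIR PF by such a mixture evaluation is what allows the weights to remain informative under heavy-tailed or multi-modal measurement noise.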
References
[1] B. Ristic, S. Arulampalam, N. Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications, Artech House, Boston, London, 2004.
[2] A. Doucet, S.J. Godsill, C. Andrieu, On sequential simulation-based methods for Bayesian filtering, Statistics and Computing 10 (3) (2000) 197–208.
[3] A. Doucet, N. de Freitas, N. Gordon, Sequential Monte Carlo Methods in Practice, Springer, Berlin, 2001.
[4] N.J. Gordon, D.J. Salmond, A.F.M. Smith, Novel approach to nonlinear/non-Gaussian Bayesian state estimation, in: IEE Proceedings-F, vol. 140, no. 2, April 1993.
[5] M. Sanjeev Arulampalam, S. Maskell, N. Gordon, T. Clapp, A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking, IEEE Transactions on Signal Processing 50 (2) (February 2002) 174–187.
[6] J. Lichtenauer, M. Reinders, E. Hendriks, Influence of the observation likelihood function on particle filtering performance in tracking applications, in: Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 17–19 May 2004, pp. 767–772.
[7] R.O. Duda, P.E. Hart, Pattern Classification and Scene Analysis, Wiley, New York, 1973.
[8] D.M. Titterington, A.F.M. Smith, U.E. Makov, Statistical Analysis of Finite Mixture Distributions, Wiley, Chichester, 1985.
[9] B.S. Everitt, D.J. Hand, Finite Mixture Distributions, Chapman & Hall, London, UK, 1981.
[10] G.J. McLachlan, K.E. Basford, Mixture Models, Marcel Dekker, New York, 1988.
[11] R.A. Redner, H.F. Walker, Mixture densities, maximum likelihood and the EM algorithm, SIAM Review 26 (2) (April 1984).
[12] D.L. Alspach, H.W. Sorenson, Nonlinear Bayesian estimation using Gaussian sum approximations, IEEE Transactions on Automatic Control 17 (4) (August 1972) 439–448.
[13] R. Chen, J.S. Liu, Mixture Kalman filters, Journal of the Royal Statistical Society 62 (Part 3) (2000) 493–508.
[14] A. Doucet, N.J. Gordon, V. Krishnamurthy, Particle filters for state estimation of jump Markov linear systems, IEEE Transactions on Signal Processing 49 (3) (March 2001) 613–624.
[15] T. Schön, F. Gustafsson, Marginalized particle filters for mixed linear/nonlinear state-space models, IEEE Transactions on Signal Processing 53 (7) (July 2005).
[16] J.K. Tugnait, A.H. Haddad, Adaptive estimation in linear systems with unknown Markovian noise statistics, IEEE Transactions on Information Theory 22 (1) (January 1980) 66–78.
[17] J.H. Kotecha, P.M. Djuric, Gaussian sum particle filtering, IEEE Transactions on Signal Processing 51 (10) (October 2003) 2602–2612.
[18] J.H. Kotecha, P.M. Djuric, Gaussian particle filtering, IEEE Transactions on Signal Processing 51 (October 2003) 2593–2602.
[19] A. Mukherjee, A. Sengupta, Parameter estimation of a signal along with non-stationary non-Gaussian noise, in: The 33rd Annual Conference of the IEEE Industrial Electronics Society (IECON), November 5–8, 2007, pp. 2429–2433.
[20] P.S. Maybeck, Stochastic Models, Estimation and Control, vol. 2, Academic, New York, 1982.
[21] X. Wang, R. Chen, Adaptive Bayesian multiuser detection for synchronous CDMA with Gaussian and impulsive noise, IEEE Transactions on Signal Processing 47 (7) (July 2000) 2013–2028.
[22] W.R. Wu, Maximum likelihood identification of glint noise, IEEE Transactions on Aerospace and Electronic Systems 32 (January 1996) 41–51.
[23] G. Hewer, R. Martin, J. Zeh, Robust preprocessing for Kalman filtering of glint noise, IEEE Transactions on Aerospace and Electronic Systems 23 (1987) 120–128.
[24] M.S. Grewal, A.P. Andrews, Kalman Filtering: Theory and Practice, Prentice-Hall, Englewood Cliffs, NJ, 1993.
[25] Y. Bar-Shalom, X. Rong Li, T. Kirubarajan, Estimation with Applications to Tracking and Navigation, Wiley, New York, 2001.
[26] R. Chen, X. Wang, J.S. Liu, Adaptive joint detection and decoding in flat-fading channels via mixture Kalman filtering, IEEE Transactions on Information Theory 46 (6) (September 2000) 2079–2094.
[27] B. Borden, M. Mumford, A statistical glint/radar cross section target model, IEEE Transactions on Aerospace and Electronic Systems 19 (1) (September 1983) 781–785.
[28] W. Wu, P. Cheng, Nonlinear IMM algorithm for maneuvering target tracking, IEEE Transactions on Aerospace and Electronic Systems 30 (3) (July 1994) 875–884.
[29] X. Li, V. Jilkov, Survey of maneuvering target tracking. Part V: multiple-model methods, IEEE Transactions on Aerospace and Electronic Systems 41 (4) (2005).