A multi-hypothesis solution to data association for the two ...
-
Upload
khangminh22 -
Category
Documents
-
view
0 -
download
0
Transcript of A multi-hypothesis solution to data association for the two ...
Article
The International Journal of
Robotics Research
2015, Vol. 34(1) 43–63
� The Author(s) 2014
Reprints and permissions:
sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/0278364914545674
ijr.sagepub.com
A multi-hypothesis solution to dataassociation for the two-frame SLAMproblem
Edmund Brekke1 and Mandar Chitre2
Abstract
We propose a multi-hypothesis solution to the simplified problem of simultaneous localization and mapping (SLAM) that
arises when only two measurement frames are available. The proposed solution is obtained through direct evaluation of
the posterior density as given by finite set statistics. We show that hypothesis probabilities can be evaluated within reason-
able accuracy by means of a closed-form expression. Consistency properties are discussed extensively. We overcome
inconsistency problems of the extended Kalman filter by means of natural gradient optimization, and we demonstrate
through implementations on simulated and real data that the proposed approach has better consistency properties than
alternative approaches when applied to the two-frame SLAM problem.
Keywords
SLAM, data association, MHT, consistency, Laplace approximation, natural gradient, finite set statistics,scan matching
1. Introduction
A key task in the navigation of autonomous vehicles is to
estimate the motion of the vehicle with respect to its sur-
rounding environment. This naturally leads to the concept
of simultaneous localization and mapping (SLAM), in
which the vehicle builds a map of the environment while
simultaneously estimating its own position in this map.
SLAM is typically formulated according to a feature-based
parametrization (FB-SLAM), where estimation is done by
processing sets of point measurements extracted from
images obtained by a radar, sonar or laser scanner.
In this paper we address the special case of FB-SLAM
which arises when only two consecutive data frames, and
corresponding measurement sets, are available. We refer to
this problem as feature-based scan-matching (FBSM). This
problem is obviously of particular importance for initializa-
tion of recursive SLAM methods. Furthermore, we believe
that it is instructive to study this simplified problem in close
detail, before attempting to pursue optimal solutions to the
full SLAM problem which comprises several data frames.
FB-SLAM is in some ways related to the problem of
multi-target tracking (MTT). The difference is, crudely
speaking, that in FB-SLAM the sensor moves and the tar-
gets are stationary, while in MTT the targets move while
the sensor is stationary. Both problems are naturally
decomposed into two subproblems: data association and
state estimation. The former problem concerns establishing
which features in consecutive scans originate from the
same physical objects. This includes distinguishing
landmark-originating measurements from clutter measure-
ments, also known as false alarms. In other words, data
association includes answering the question: how many
landmarks are there? The latter problem, when viewed
from the perspective of SLAM, concerns estimating the
motion of the vehicle and the map of landmarks, condi-
tional on such association hypotheses.
Based on this philosophy, the multi-hypothesis tracker
(MHT) was developed in the late 1970s by Donald Reid
(1979) as a solution to the MTT problem. The idea of the
MHT is simply to evaluate posterior probabilities of all feasi-
ble association hypotheses. Reid’s original MHT has been
extended and refined in numerous papers, most notably by
1Department of Engineering Cybernetics, Norwegian University of
Science and Technology (NTNU), Trondheim, Norway2Acoustic Research Laboratory, Tropical Marine Science Institute,
National University of Singapore, Singapore
Corresponding author:
Edmund Brekke, Department of Engineering Cybernetics, Norwegian
University of Science and Technology (NTNU), 7491 Trondheim, Norway.
Email: [email protected]
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
Kurien (1990) and Mori et al. (1986). Several researchers
have expressed belief that the MHT provides an optimal
solution to MTT in some unspecified sense (Ristic, 1999;
Van Keuk, 2002; Musicki and Evans, 2005). Despite its
inherent intractability due to exponential hypothesis growth,
there exist numerous suboptimal MHT implementations
which have been made tractable through aggressive usage of
hypothesis pruning, hypothesis merging, limited time hori-
zons and optimization techniques such as the auction method
and Lagrangian relaxation (Deb et al., 1997; Blackman,
2004). Other well-established MTT methods such as the joint
probabilistic data association filter (JPDAF) (Bar-Shalom
et al., 2011) or the probabilistic multiple hypothesis tracker
(PMHT) (Streit and Luginbuhl, 1994) have been inspired by
the MHT paradigm to a large extent.
In FB-SLAM, joint compatibility branch and bound
(JCBB) (Neira and Tards, 2001) has been described as a gold
standard for data association (Frese, 2010). JCBB aims to
choose the best out of all possible association hypotheses by
using hypothesis cardinality and joint Mahalanobis distance
as score functions. It is not a multi-hypothesis approach in
the same sense as the MHT, since it does not calculate poster-
ior hypothesis probabilities, and therefore is unable to hedge
on alternative hypotheses in the same manner as the MHT.
Many recent advances for data association in FB-SLAM have
been inspired by previous advances in MTT. In particular, this
can be said about the multi-dimensional assignment approach
of Wijesoma et al. (2006), the PMHT-based approach of
Davey (2007) and FastSLAM (Montemerlo, 2003; Nieto
et al., 2003). Both PMHT and FastSLAM, as well as other
SLAM methods such as the work of Pfingsthorn and Birk
(2013), do provide MHT-like hedging abilities, but do not
include concrete formulae for posterior hypothesis probabil-
ities. Blanco et al. (2012) recently proposed such formulae,
but their formulae did not address issues related to false
alarms and landmark existence uncertainty.
During the last decade, an alternative paradigm in MTT
has emerged with basis in Ronald Mahler’s theory of finite
set statistics (FISST) (Mahler, 2007). FISST allows one to
formulate standard MTT problems in such a way that a
well-defined posterior density of the multi-target state,
which is a set-valued random variable, can be constructed.
A primary rationale behind FISST is to be able to deal with
target existence uncertainty in a rigorous manner. FISST
facilitates several useful approximations of this multi-object
posterior, such as the probability hypothesis density (PHD),
which is defined by the property that its integral over some
region R should be equal to the expected number of targets
in R. Mahler’s PHD filter is able to circumvent explicit
data association, and thus avoids the full complexity of the
MHT. Several SLAM methods have been proposed with
basis in the PHD concept (Kalyan et al., 2010; Mullane
et al., 2011; Lee et al., 2012). In this paper we will give par-
ticular attention to the approach of Lee et al. (2012), known
as a single-cluster PHD filter (SCPHD), since this approach
arguably is the neatest FISST-based SLAM method so far
developed.
The PHD is a drastic approximation of the multi-object
posterior. It is reasonably clear that the full multi-object
posterior for standard MTT problems must be represented
in terms of association hypotheses (Williams, 2011; Vo and
Vo, 2013; Williams, 2012; Brekke et al., 2014). Based on
this, we believe that construction of the full posterior for
FB-SLAM may proceed along the lines of the MHT formu-
lations of Reid (1979), Kurien (1990) and Mori et al.
(1986). Establishing concrete expressions for the full pos-
terior of FB-SLAM is important in order to benchmark the
performance of the aforementioned SLAM methods. It is
also important in order to provide further unification
between SLAM methods and MTT methods, and in order
to unify the FISST paradigm with the MHT paradigm. As
a first step on this road, we develop an MHT-solution to
the FBSM problem in this paper.
This entails four contributions. First, we derive a formal
solution to the FBSM problem based on FISST. Second, we
derive an analytical approximation for posterior hypothesis
probabilities. Third, we use Amari’s natural gradient (NG)
(Amari and Douglas, 1998) to overcome inconsistency
problems of the extended Kalman filter (EKF). Fourth we
devise an efficient strategy for exploring the hypothesis
space. This strategy is based on the Bron–Kerbosch method
for clique-detection in graph theory (Bron and Kerbosch,
1973). We therefore refer to the overall FBSM method as
clique-detection scan-matching (CDSM).
This paper builds on preliminary work reported in
Brekke and Chitre (2013). It complements this work by
developing CDSM in the FISST framework, as opposed to
the more old-fashioned MHT-framework used in Brekke
and Chitre (2013), and by presenting real-world results on
laser data, as opposed to the sonar data discussed in Brekke
and Chitre (2013). It also extends Brekke and Chitre (2013)
by providing a full derivation of the hypothesis probability
formula of CDSM, and by comparing CDSM with SCPHD
in addition to other benchmark methods.
This paper is organized as follows. In Section 2 we pres-
ent the underlying framework for our work, including ter-
minology, assumptions and a brief summary of FISST. In
Section 3 we develop a formal solution to the FBSM prob-
lem by means of FISST. In Section 4 we present our most
important contribution: a closed-form expression for pos-
terior hypothesis probabilities. In Section 5 we deal with
hypothesis-conditional state estimation, while Section 6
deals with the search for good hypotheses. Results on
simulated data are presented in Section 7, while results
on the Victoria Park data (Guivant and Nebot, 2001) are
presented in Section 8. Finally, a conclusion follows in
Section 9. The paper includes three appendices.
Appendix A provides details concerning the evaluation of
derivatives used in Sections 4–5. Appendix B provides
details regarding our FBSM-adaptation of SCPHD. In
Appendix C we show how the posterior density for
FBSM can be expressed in terms of association hypoth-
eses. The acronyms used throughout the paper can be
found in Table 1.
44 The International Journal of Robotics Research 34(1)
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
2. Underlying framework
The purpose of this section is to state all aspects of the
FBSM problem in precise terms, and thus to provide a
foundation for the developments in later sections.
2.1. Basic notation and terminology
We address a two-dimensional scan-matching problem, and
so the two-dimensional rotation matrix given by
R(c)=cosc � sinc
sinc cosc
� �ð1Þ
plays a central role. Following the formalism for
matrix derivatives established by Magnus and Neudecker
(1988), differentiation of the rotation matrix yields
DcR(� c)= ½� sin (c), � cos (c), cos (c), � sin (c)�T
where D is the matrix derivative operator as defined in
Magnus and Neudecker (1988). For future reference we
reshape DcR(� c) into the matrix
S(c)=� sinc cosc
� cosc � sinc
� �ð2Þ
Landmark positions (also known as targets) are gener-
ally denoted by x. Vehicle position is denoted by rk = [xk,
yk]T, while vehicle pose is denoted by pk = [xk, yk, ck]
T (cf.
Figure 1). By N (a ; b,S) we mean the Gaussian distribu-
tion N (b,S) evaluated at a. The notation f(�) is used to sig-
nify both probability density functions (PDFs) and multi-
object densities. Uniform distributions are written as
unifL(x)=1=v if x 2 L0 otherwise
�ð3Þ
Let S denote the field of view (FOV) in sensor (polar)
coordinates, and let Lk denote the corresponding region in
landmark (Cartesian) space at time k. Let the correspond-
ing volumes be denoted by V and v, respectively.
2.2. Assumptions
Here we formulate the FBSM problem according to a clas-
sical Bayesian framework along the lines of Wijesoma et al.
(2006) and Reid (1979).
2.2.1. Kinematic prior. We assume that there exist a set
X = {x1,.,xn} of n stationary landmarks, where xi 2 R2,
and both xi and n are unknown. We assume the landmarks
to be a priori independent and identically distributed (i.i.d.)
according to a uniform distribution over the region L1 :
f (xi)=unifL1(xi) ð4Þ
Furthermore, the landmarks are assumed a priori indepen-
dent of the vehicle pose pk. For the vehicle pose, we use a
standard motion model
p2 = p1 +R(c1) 02 3 1
01 3 2 1
� �q+w, w;N (0,Q) ð5Þ
where the displacement q depends on the velocity of the
vehicle and w is plant noise. We assume that p1 is perfectly
known (= 03 3 1), and that q;N (mq,Pq). The prior knowl-
edge about p2 (i.e. the information that we have about p2
without invoking any sensor measurements) can then be
represented by
f (p2)=N (p2 ; mq,P) where P=Q+Pq ð6Þ
2.2.2. Measurement model. At each time step a measure-
ment set Zk = fz1k , . . . , zmk
k g is registered. The combined
measurement set is denoted Z1:2 = (Z1, Z2). Any measure-
ment zjk may originate from clutter or from one of the
Table 1. List of Acronyms.
Acronym Definition
EKF Extended Kalman filterFB-SLAM Feature-based SLAMFBSM Feature-based scan matchingFISST Finite set statisticsJCBB Joint compatibility branch and boundMAP Maximum a posterioriMHT Multiple Hypothesis TrackerMSS Minimal sample setMTT Multitarget trackingNG Natural gradientPHD Probability hypothesis densitypIC Probabilistic iterative correspondencePDF Probability density functionRANSAC Random sample and consensusSCPHD Single-cluster PHDSLAM Simultaneous localization and mapping
Fig. 1. Illustration of the scan-matching problem. Our aim is to
estimate the vehicle pose p2 = [x2, y2, c2]T from the
measurement sets Z1 and Z2. In this illustrative example a
plausible association hypothesis would consist of the
correspondences (z11, z
12), (z
41, z
22) and (z3
1, z32).
Brekke and Chitre 45
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
landmarks in X. In the former case its spatial PDF is uni-
form over the sensor’s FOV:
f (zjk jNot from landmark)=unifS(z
jk) ð7Þ
In the latter case its spatial PDF is modeled as a
Gaussian
f (zjk jxi, pk)=N (z
jk ; h(xi, pk),R) ð8Þ
where
h(xi, pk)= f(g(xi, pk)) ð9Þ
fx
y
� �� �=
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffix2 + y2
patan2(y, x)
� �ð10Þ
g(xi, pk)=R(� ck)(xi � rk) ð11Þ
Note that f(�) signifies a vector-valued function. The matrix
R is related to the sensor resolution (Brekke et al., 2010).
The measurements in Zk are independent when conditioned
on X and pk.
2.2.3. Cardinality models. In addition to the randomness
of the kinematic quantities discussed above, we must also
specify the random nature of the numbers of landmarks,
landmark-originating measurements and clutter measure-
ments, also known as false alarms. We want to make mini-
mal assumptions regarding these numbers. Therefore, we
model both the number of landmarks n and the number of
false alarms fk as uniformly distributed with upper limits
N and M, respectively:
Pr(n)=1
N + 1if n 2 f0, . . . ,Ng
0 otherwise
�ð12Þ
Pr(fk)=1
M + 1iffk 2 f0, . . . ,Mg
0 otherwise:
�ð13Þ
Further justification for these assumptions is given in
Section 7.5.
Furthermore, we assume that all landmarks are detected
with unity probability. We employ this assumption because
non-repeated landmarks do not provide useful information
concerning the motion of the vehicle, and therefore should
be regarded as clutter for practical purposes. This is, how-
ever, a problematic assumption since it contradicts the kine-
matic landmark prior (3). Clearly, a landmark which lies
outside of the second sensor region L2 cannot be detected
at k = 2. This becomes increasingly problematic as p2
increases, and the overlap between L1 and L2 decreases.
Conditional on p1 and p2 only, it is reasonable to assume
that any repeated landmark has a uniform distribution in
L1 \ L2. For p2 sufficiently small, this will be similar to a
uniform distribution over L1 alone. In order to obtain the
prior distribution of repeated landmarks we must margina-
lize this distribution over p2. This will lead to a smeared
distribution, but for mq and P sufficiently small the smeared
distribution will not differ substantially from the original
distribution. Therefore, we consider (3) as a reasonable
approximation of our actual prior knowledge concerning
landmark locations.
2.3. Association hypotheses
It can be shown that the FISST multi-object posterior for
standard MTT is a mixture over association hypotheses sim-
ilar to those used in Reid (1979) or Mori et al. (1986). In
this paper we make a similar statement for FBSM: That the
marginalized posterior f(p2jZ1:2) in FBSM can be written as
a mixture over association hypotheses.
In our preliminary work (Brekke and Chitre, 2013) we
developed the solution to FBSM using classical MHT con-
cepts. We defined the concept of association hypotheses
through four progressive refinements of the outcome space,
called the number event g, configuration event tk, data-to-
data hypothesis u and landmark-to-data hypothesis v.
Rigorous definitions are given below. The background for
this terminology can be found in the seminal references by
Reid (1979) and Mori et al. (1986). We then obtained the
prior hypothesis probabilities by counting the number of tk
per g etc., and we obtained the posterior hypothesis prob-
abilities by multiplying with a hypothesis-conditional
likelihood.
In the FISST approach to FBSM, we restrict attention to
data-to-data hypotheses u and landmark-to-data hypotheses
v. The latter provide assignments of measurements to land-
marks, while the former associate measurements in one
scan with measurements in another scan. The distinction
between these two types of objects, which first was pointed
out by Mori et al. (1986), is of crucial importance in order
to develop a mathematically precise and parsimonious for-
mulation of FBSM (and, more generally, FB-SLAM).
Definition 1 (Landmark-to-data hypothesis). A landmark-
to-data hypothesis is a mapping v: {1,.,n} !{1,.,m1} 3 {1,.,m2} such that vk(s) = vk(t) )s = t for
any k 2 {1,2}.
Definition 2 (Data-to-data hypothesis). Let v be a
landmark-to-data hypothesis. The data-to-data hypothesis u
corresponding to v is then defined as the equivalence class
consisting of all landmark-to-data hypotheses ~v for which
there exists a permutation s: {1,.,n} ! {1,.,n} such
that vk(s(i))= ~vk(i) for k 2 {1, 2}. We signify the rela-
tion of permutation-equivalence by the symbol ‘‘~’’. That
is, if such a permutation s exists for v and ~v, then v;~v.
Remark 1 (Alternative representations). It follows from
Definition 1 that a landmark-to-data hypothesis also can be
viewed as a 2 3 n matrix, where each column represents
a correspondence. It follows from Definition 2 that a data-
to-data hypothesis also can be viewed as a set of corre-
spondences. We use the notation t 2 u to signify that cor-
respondence t belongs in u, while the notation v 2 Per(u)
is used to signify that the landmark-to-data hypothesis
46 The International Journal of Robotics Research 34(1)
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
v belongs to u. Conversely, we let the notation u[v] repre-
sent the data-to-data hypothesis corresponding to v.
Remark 2 (Measurement and correspondence indices).
The measurement associated to landmark i at time step k
under v is denoted zvk (i)k . By zv(i) we understand the vector
½(zv1(i)1 )T, (z
v2(i)2 )T�T 2 R
4, while zv denotes the vector
½(zv1(1)1 )T, (z
v1(2)1 )T, . . . , (zv2(n)
2 )T�T 2 R4n, i.e. the stacked
vector of all measurements that originate from landmarks
according to v. The notation xv signifies a stacked vector
containing all of the landmarks [(x1)T,.,(xn)T]T. Also let
jv be a joint state vector which contains both the vehicle
displacement p2 as well as the landmarks involved in v:
jv = ½pT2 , (xv)T�T = ½pT2 , (x1)T, . . . , (xn)T�T: ð14Þ
If s is a permutation, then xsv is the permuted stacked
landmark vector [(xs(1))T,.,(xs(n))T]T, and jsv is the corre-
sponding permuted joint state vector ½pT2 , (xsv)T�T. The
notation u(i) signifies correspondence number i in u, while
uk(i) signifies the measurement claimed by this correspon-
dence at time k.
In order to illustrate the entities defined in this sec-
tion, let us refer to the example displayed in Figure 1.
Here, the correct number event is gtrue = (n = 3, f1 = 3,
f2 = 2) where n is the number of landmarks and fk is
the number of false alarms at time step k. The correct
association hypothesis utrue corresponds to the following
set of correspondences:
1
1
� �,
3
3
� �,
4
2
� �� �ð15Þ
There are six landmark-to-data hypotheses which agree
with utrue. Two of these are displayed in Figure 2. Thus, in
the rigorous sense of Definition 2, utrue is the equivalence
class
1 3 4
1 3 2
� �, . . . ,
4 3 1
2 3 1
� �� �ð16Þ
2.4. A brief review of finite set statistics
FISST can be developed with a basis in belief measures.
The belief measure of a random finite set N is the
probability
bN(S)=Pr(N � S) ð17Þ
Here S is a subset of the base space, i.e. if x 2 N and
x 2 Rd , then S � R
d . FISST utilizes the so-called set inte-
gral. Let f(X) be a set function, i.e. a function of the finite
set X. Then the set integral of f(X) over S is defined asZS
f (X) dX=X‘
n= 0
1
n!
ZS 3 ... 3 S
f (fx1, . . . , xng) dx1 . . . dxn
Here n is the cardinality of X, and f({x1,.,xn}) = f(X)
under the constraint that jXj = n. A multiobject density
fN(X) is a function which produces a non-negative real
number from any realization X of N, and which normalizes
to one under the set integral:ZR
d
fN(X)dX= 1
The multiobject density fN(X) is related to the belief mea-
sure according to
bN(S)=
ZS
fN(X)dX
Fig. 2. Illustration of how several landmark-to-data hypotheses correspond to the same data-to-data hypothesis: (a) data-to-data
hypothesis; (b) one landmark-to-data hypothesis; (c) another landmark-to-data hypothesis.
Brekke and Chitre 47
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
For further details on FISST the reader is referred to
Mahler (2007), Goodman et al. (1997) and Vo et al. (2005).
3. Formal solution based on FISST
The multi-hypothesis solution to FBSM that was originally
presented in Brekke and Chitre (2013) can also be derived
using FISST. This is important, both in order to provide a
rigorous foundation for data association in SLAM, and also
in order to explore how FISST relates to the classical MHT
paradigm.
We formulate the FBSM problem in terms of the set Nof landmarks (with realization X), the set Gk of false alarms
(with realization Ck), and the set Sk of all measurements at
time k (with realization Zk). The prior multiobject density
of the landmarks is
fN(X)=n!
N + 1
Yn
i= 1
f (xi)
where n is the cardinality of X and f(xi) is as given in (4).
It is easy to verify that this expression, subject to the set-
integral, yields a uniform distribution of n. In a similar
way, the multiobject density of the clutter set Gk is
fGk(Ck)=
fk !
M + 1
1
V fk
where fk is the cardinality of Ck. The measurement set Sk
is the union of the clutter set Gk and a set consisting of
landmark-originating measurements. Based on the assump-
tions of Section 2.2 (e.g. unity detection probability) and
the results of Chapter 12 in Mahler (2007), we arrive at the
following multiobject likelihood:
fSk(Zk jX, pk)=
Xvk
fk !
M + 1
1
V fk
Yn
i= 1
f (zvk (i)k jxi, pk):
The summation goes over all functions vk:
{1,.,n}!{1.,mk} such that vk(i) =vk(j))i = j.
Solving the Bayesian FBSM problem amounts to evalu-
ating the joint posterior density f(X,p2jZ1:2). Assuming a
priori independence between p2 and X, the joint prior
becomes
f (X, p2)= f (X)f (p2): ð18Þ
Bayes’ rule yields
f (X, p2jZ1:2)=1
cf (Z2jX, p2)f (Z1jX)f (X)f (p2) ð19Þ
It can be shown (see Appendix C) that this density is a mix-
ture over all feasible data-to-data hypotheses. In order to
state this result more precisely, we shall first introduce the
components involved in this mixture.
Let Gm1,m2
n denote the set of all feasible data-to-data
hypotheses given that jXj = n, jZ1j = m1 and jZ2j = m2.
This set is the quotient set under permutation-equivalence of
a larger set T m1,m2
n of landmark-to-data hypotheses on the
form v: {1,.,n}!{1,.,m1} 3 {1,.,m2}. For any data-to-
data hypothesis u 2 Gm1,m2
n there are exactly n! permutation-
equivalent landmark-to-data hypotheses v 2 T m1,m2
n .
For any landmark-to-data hypothesis v we define the
hypothesis-conditional posterior density
f (jvjv,Z1:2) = 1au½v� f (p2)
Qnt = 1
f (z
v1(t)1 jxt, 03 3 1)
f (zv2(t)2 jxt, p2)f (x
t)ð20Þ
where the normalization constant au[v] is given by
au½v� =R
f (p2)Qn
i= 1
Rf (z
v1(i)1 jxi)
f (zv2(i)2 jxi, p2)f (x
i)dxidp2
ð21Þ
For reasons soon to be explained we refer to au as the
hypothesis score. We can then express the joint posterior as
f (X, p2jZ1:2)=1
c
Xu2Gm1 ,m2
n
n!f1!f2!
V f1 +f2au
Xv2Per(u)
f (jvjv,Z1:2)
ð22Þ
Remark 3 (The hypothesis score is well-defined). The
number au can be expressed in terms of any v 2 Per (u),
that is, in terms of any landmark-to-data hypothesis v that
corresponds to the data-to-data hypothesis u. The expres-
sion in (21) attains the same value for all v 2 Per (u), since
the integration over xi removes any dependence on the
landmark index i.
Remark 4 (Redundancy of landmark-to-data hypotheses).
All of the densities f(�jv,Z1:2) conditional on landmark-to-
data hypotheses v belonging to the same data-to-data
hypothesis u are functionally equivalent under permutation
of the landmarks. In mathematical terms, if both v(1) and
v(2) are in Per (u), then there exists a permutation s:
{1,.,n} ! {1,.,n} such that f(jv(1)jv(1),Z1:2) =
f(jsv(1)jv(2),Z1:2). Thus, any of the densities f(�jv,Z1:2) can
be used to represent all the kinematic densities belonging
to u. In Brekke and Chitre (2013) we consequently
expressed the multi-hypothesis solution to FBSM in terms
of densities on the form f(�ju,Z1:2). This can be understood
as follows: the function f(�ju,Z1:2) takes as its argument a
joint state vector ½pT2 , (x1)T, . . . , (xn)T�T where landmark
xi is identified with correspondence number i in u.
We define the posterior hypothesis probability
Pr(ujZ1:2) as the total probability mass contributed by the
48 The International Journal of Robotics Research 34(1)
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
data-to-data hypothesis u. Application of the set integral
yields
1=
ZZf (X, p2jZ1:2) dX dp2
=1
c
XN
n= 0
1
n!
Xu2Gm1 ,m2
n
n!f1!f2!
V f1 +f2au
Xv2Per(u)
Zf (jvjv,Z1:2)djv
=1
c
XN
n= 0
1
n!
Xu2Gm1 ,m2
n
n!f1!f2!
V f1 +f2aun!
=1
c
Xu
n!f1!f2!
V f1 +f2au
ð23Þ
Therefore, the contribution of the data-to-data hypothesis u
towards the total posterior probability mass is
Pr(ujZ1:2)=1
c
n!f1!f2!
V f1 +f2au: ð24Þ
This justifies referring to au as the hypothesis score. Also
note that the overall normalization constant c can be found
by normalizing the hypothesis probabilities. Equations (22)
and (24) are important contributions of this paper. These
expressions show how FISST leads to a solution which is
equivalent to the corresponding MHT-based solution, in the
sense that it arrives at exactly the same equations as were
proposed in Brekke and Chitre (2013).
Our primary objective in this paper is to study the
marginalized posterior of the pose displacement p2. It is
given by
f (p2jZ1:2)=
Zf (X, p2jZ1:2) dX
=X
u
Pr(ujZ1:2)f (p2ju,Z1:2)ð25Þ
where
f (p2ju,Z1:2)=
Zf (jvjv,Z1:2) dx
v ð26Þ
for any v 2 Per (u). Thus, the posterior density of p2 is a
conventional mixture of PDFs.
4. Approximation of hypothesis probabilities
Direct evaluation of the posterior densities f(X,p2jZ1:2) or
f(p2jZ1:2) through (22) or (25) can only be of practical util-
ity if we are able to evaluate the hypothesis probabilities
Pr(ujZ1:2) by means of a closed-form expression. This is
problematic, because the hypothesis score au is given by an
integral which does not have any closed-form solution
in general. Nevertheless, we argue in this section that it
is possible to approximate au reasonably well by a
closed-form expression. We will first present this approxi-
mation in Section 4.1, before we derive and justify it in
Section 4.2. Some details are left for Appendix A.
4.1. Statement of main results
Let us convert the measurements to Cartesian coordinates
along the lines of Bar-Shalom et al. (2011). We define
the converted measurements as yik = f�1(zi
k), that is,
according to
yik =
r cosq
r sinq
� �with r, q given by zi
k =r
q
� �ð27Þ
The covariances corresponding to yik and zi
k are
Yik =
Y11 Y12
Y21 Y22
� �and R=
s2r 0
0 s2q
� �ð28Þ
where (as a first-order approximation)
Y11 = r2s2q sin2 (q)+s2
r cos2 (q)
Y22 = r2s2q cos2 (q)+s2
r sin2 (q)
Y12 = Y21 =(s2r � r2s2
q) sin (q) cos (q)
ð29Þ
We can then approximate the hypothesis score au as
au’
(2p)3=2N (pu
2j2 ; mq,P)
(V jRj)nffiffiffiffiffiffiffiffiffiffiffiffiffijJu(p
u
2j2)jp 3Qn
i= 1
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffijYu1(i)
1 jjYu2(i)2 j
qsu(i)(p
u
2j2) n.0
1 n= 0
8>>>><>>>>: ð30Þ
where
su(i)(p2)=N (yu2(i)2 ; Ay
u1(i)1 + b,
Yu2(i)2 +AY
u1(i)1 AT)
ð31Þ
and where
A=R(� c2), b= �R(� c2)r2 ð32Þ
Note that both A and b depend on p2. The vector pu
2j2 is a
maximum a posteriori probability (MAP) estimate of p2
conditional on u, which is found using EKF and NG tech-
niques as explained in Section 5. Finally, Ju(pu
2j2) is an
information matrix which describes the shape of the poster-
ior density f(p2ju,Z1:2). It is found as a sum of information
matrices for the correspondences involved in the associa-
tion hypothesis u:
Ju(p2)=P�1 +Xi2u
Ju(i)(p2) ð33Þ
By defining
nu(i) = yu2(i)2 � Ay
u1(i)1 � b
Su(i)
=Yu2(i)2 +AY
u1(i)1 AT
ð34Þ
Brekke and Chitre 49
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
we can find the correspondence-conditional information
matrices Ju(i) as
Ju(i)(p2)= Dp2nu(i)
� �T(S
u(i))�1Dp2
nu(i)
+
0 0 0
0 0 0
0 0 12tr (S
u(i))�1Uu(i)(S
u(i))�1Uu(i)
24 35 ð35Þ
where
Dp2nu(i) = A,
� sinc2 cosc2
� cosc2 � sinc2
� �(r2 � y
u1(i)1 )
� �and the matrix
Uu(i) =u11 u12
u21 u22
� �ð36Þ
is given according to
u11 = � 2c11 sinc2 + 2c12 cosc2
u21 = � (c12 + c21) sinc2 +(c22 � c11) cosc2
u12 = � (c12 + c21) sinc2 +(c22 � c11) cosc2
u22 = � 2c22 sinc2 � 2c21 cosc2
ð37Þ
where c11, c12, c21 and c22 are given by
C=c11 c12
c21 c22
� �=AY
u1(i)1 ð38Þ
The form of the information matrix in (35) is based on a
result from Porat and Friedlander (1986).
4.2. Derivation of main results
First of all, let us note that under the given assumptions we
can rewrite (21) as
au =
ZN (p2 ; mq,P)
Yn
i= 1
Iu(i)(p2) dp2 ð39Þ
where the landmark integral
Iu(i)(p2) =RN (z
u1(i)1 ; h(x; 0),R)
N (zu2(i)2 ; h(x; p2),R) unifL1
(x) dxð40Þ
represents the contribution from correspondence number i.
Before delving into the mathematical details of the pro-
posed approximation, let us make some reflections regard-
ing the strategy used.
We came to believe that a closed-form approximation of
au was possible through numerical investigations which
indicated that the densities f(p2ju, Z1:2) are close to
Gaussian. Based on this, the key idea is as follows: if we
are able to approximate each Iu(i)(p2) by a function propor-
tional to a Gaussian in p2, then we also have au in closed
form.
Approximation of Iu(i)(p2) must overcome two obstacles,
posed by the presence of the non-Gaussian prior unifL1(x)
and by the nonlinearities in h(x;p2). These nonlinearities
are due to the fact that measurements are received in polar
coordinates, and (more importantly) due to change in vehi-
cle orientation as represented by c2. By converting the
measurements from polar to Cartesian coordinates, we
make the integrand in (40) approximately proportional to a
Gaussian in x. We then assume that the integrand in (39) as
well is approximately proportional to a Gaussian in p2. We
are able to find the parameters of this Gaussian using EKF-
techniques and evaluation of the Fisher information matrix
(FIM). This yields a closed-form approximation of au.
More concretely, by recalling the converted measure-
ments defined in (27)–(29) and the notation A and b
defined in (32), we first employ the following
approximation
Iu(i)(p2) ’N (0 ; 0,R)2
N (0 ; 0,Yu1 (i)
1)N (0 ; 0,Y
u2 (i)
2)vR
L1N (y
u2(i)2 ; Ax+ b,Y
u2(i)2 )
N (x ; yu1(i)1 ,Y
u1(i)1 ) dx
’
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffijYu1(i)
1jjYu2 (i)
2j
pjRjv su(i)(p2)
ð41Þ
where su(i)(p2) is as defined in (31). The idea underlying
(41) is that the ‘‘converted Gaussians’’ N (x ; yu1(i)1 ,Y
u1(i)1 )
and N (yu2(i)2 ; Ax+ b,Y
u2(i)2 ) are approximately propor-
tional to the original densities N (zu1(i)1 ; h(x; 0),R) and
N (zu2(i)2 ; h(x; p2),R) when interpreted as functions of x.
The proportionality constants are found as
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffijYu1(i)
1 j=jRjq
and
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffijYu2(i)
2 j=jRjq
by comparing the peak values of the
densities.
Having eliminated the integral over x, we hypothesize
that for each correspondence in u there exist a number bu(i),
a vector mu(i) and a matrix Ju(i) such that
su(i)(p2)’bu(i)N (p2 ; mu(i), (Ju(i))�1) ð42Þ
The shape of this approximating Gaussian is given by the
matrix Ju(i). More precisely, the curvature of its logarithm is
given by Ju(i). We can obtain a measure of this curvature in
different ways. In a so-called Laplace approximation we
would match the second-order derivatives of su(i)(p2) and of
the approximating Gaussian for a suitable choice of p2. In
this paper we explore an alternative option based on match-
ing of FIMs, which can be expressed in terms of only first-
order derivatives. The idea is that if su(i)(p2) is proportional
to a PDF (in, say, zu2(i)2 ) which depends on p2 as a parameter,
then the FIM of this PDF with respect to p2 should be equal
to the inverse covariance of the approximating Gaussian.
This scheme, using the general result for FIMs of Gaussian
processes established by Porat and Friedlander (1986), leads
to the correspondence-conditional information matrix for-
mula (35).
Assuming thus that every Iu(i)(p2) can be approximated
by a function proportional to a Gaussian with information
50 The International Journal of Robotics Research 34(1)
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
matrix Ju(i) and some expectation mu(i), we can approxi-
mate the full integrand in (39) by a function proportional to
a Gaussian. Its information matrix is
Ju =P�1 +Xi2u
Ju(i) ð43Þ
In other words, we hypothesize that there exists a vector
pu
2j2 and a scalar bu such that
au’
ZbuN (p2 ; p
u
2j2, (Ju)�1) dp2 =bu ð44Þ
The scalar bu should be equal to the ratio between the true
integrand in (39) and its Gaussian replacement in (44) for
all p2 which contribute to these integrals. Clearly, the ratio
between these two functions will in reality depend on p2,
so we need to choose a particular p2 for which we calculate
the ratio. As indicated by the notation pu
2j2, we choose the
hypothesis-conditional MAP estimate of p2 for this pur-
pose. We discuss in the next section how to find this vector.
Thus, we approximate
au’
N (pu
2j2 ; mq,P)Qn
i= 1
Iu(i)(pu
2j2)
N (pu
2j2 ; pu
2j2, (Ju)�1)
ð45Þ
Combining (35), (41), (43) and (45) leads to the desired
hypothesis score formula (30).
The accuracy of this approximation will depend on
factors such as the actual displacement p2, the sensor reso-
lution and the number of correspondences. Our approxima-
tion tends to underestimate the hypothesis probabilities
when n = 1, but does not deviate more than a few per cent
for other cardinalities, cf. Figure 3.
5. State estimation
The subproblem of state estimation in FBSM amounts to
evaluating hypothesis-conditional PDFs such as f(jvjv,
Z1:2) and f(p2ju, Z1:2). We remind the reader that although
several landmark-to-data hypotheses v exist for each data-
to-data hypothesis u, state estimation needs only be done
once for each u, as explained in Remark 4.
Under the given assumptions, the joint state jv is
approximately Gaussian when conditioned on Z1 only, with
first and second moments
jv2j1 =
mq
yv1(1)1
..
.
yv1(n)1
2666437775, Pv
1j1 =
P
Yv1(1)1
. ..
Yv1(n)1
2666437775
The PDF f(jvjv, Z1:2) can be approximated by a Gaussian
in different ways. The most straightforward approach, used
in this paper, is to use an EKF to obtain its first and second
moments. Unfortunately, the EKF tends to exhibit inconsis-
tent behavior: the actual estimation error of the EKF may
be substantially larger than indicated by the calculated cov-
ariance (Huang et al., 2010). This effect can have a huge
detrimental impact on evaluation of the hypothesis prob-
abilities Pr (ujZ1:2), since these essentially are exponential
in the underlying Mahalanobis distances. Any analytical
solution to this problem must either inflate the covariance
to correspond to the actual estimation error, or it must
improve the estimation accuracy so that it actually reaches
the expected covariance.
In order to resolve this problem we employ a two-stage
estimation procedure. In the first stage we use an EKF to
obtain an estimate jv2j2 with associated covariance Pv
2j2 of
the full state vector jv. In the second stage we obtain an
improved estimate pu
2j2 of the pose vector by means of a
technique know as the NG (Amari and Douglas, 1998).
5.1. EKF-based pose estimation
In this stage, we linearize the hypothesis-conditioned esti-
mation problem around the predicted state vector jv2j1. The
linearization is carried out by means of the Jacobian
Hv =
Hv(1)p Hv(1)
x
..
. . ..
Hv(n)p Hv(n)
x
26643775 j = jv
2j1
ð46Þ
with sub-matrices
Hv(i)p =Dp2
h(yv1(i)1 , p2)=DgfDp2
g
Hv(i)x =D
yv1 (i)
1
h(yv1(i)1 , p2)=DgfDy
v1 (i)
1
gð47Þ
where
Fig. 3. The closed-form approximation (21) does not appear to
deviate more than 10% from numerical evaluation of (39) for the
true hypotheses in the scenarios considered in Section 7. For all
practical purposes this deviation is negligible.
Brekke and Chitre 51
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
Dgf =x=r y=r
�y=r2 x=r2
� �Dp2
g = �R(� c2) S(c2)(yv1(i)1 � r2)
h iD
yv1 (i)
1
g = R(� c2)
ð48Þ
Here r, x and y are given according to r= yv1(i)1 � r2
��� ���2
and ½x, y�T =R(� c2)(yv1(i)1 � r2). Furthermore, we
define the combined measurement mapping
eh(ju)= ½h(x1, p2)T, . . . , h(xn
1, p2)T�T ð49Þ
and the corresponding covariance
Rv = In � R ð50Þ
An EKF with output jv2j2 and Pv
2j2 can then be con-
structed as follows:
Wv = Pv1j1(H
v)T(Rv +HvPv1j1(H
v)T)�1 ð51Þ
nv = zv2 � eh(jv
1j1) ð52Þ
jv2j2 = jv
1j1 +Wvnv ð53Þ
Pv2j2 = (I�WvHv)Pv
1j1 ð54Þ
We can partition the EKF estimate into pose and land-
mark components according to jv2j2 = ½(p
u½v�2j2 )T, (xv
2j2)T�T,
and the corresponding covariance can be partitioned as
Pv2j2 =
Pu½v�p Pv
px
(Pvpx)
T Pvx
" #
5.2. NG optimization
The NG is an optimization technique similar to the more
well-known optimization method of Newton.1
Instead of
the Hessian of the cost function, the NG uses a Riemannian
tensor which is ‘‘naturally defined from the characteristics
of the parameter space’’ (Amari and Douglas, 1998).
Assuming the integrand of (39) to be proportional to a
Gaussian in p2, it follows that its desired expectation pu
2j2can be found by maximizing the integrand. Therefore, we
use a cost function proportional to the negative logarithm
of the integrand
c(p2)= � lnN (p2 ; mq,P)�Xn
i= 1
ln su(i)(p2) ð55Þ
By using the information matrix Ju(pu2j2) as the
Riemannian tensor, we obtain the optimization scheme
pu
2j2 = pu2j2 � (Ju(pu
2j2))�1Dpu
2j2c(pu
2j2) ð56Þ
with Ju(p2) as given in (33), and where
Dp2c(p2)= (p2 � mq)
TP�1
+ 12
Pi2u
d1 + d2 +vec((Su(i))�1)TDp2
Su(i)
d1 = ((Su(i))�1nu(i))TDp2
nu(i)
d2 = (nu(i))T½((nu(i))T � I2)Dp2½(Su(i)
)�1�+(Su(i)
)�1Dp2nu(i)�
Dp2½(Su(i)
)�1�= � ((Su(i))�1 � (Su(i)
)�1)Dp2Su(i)
Dp2Su(i)
= ½04 3 2, vec(Uu(i))�
ð57Þ
All quantities not defined in (57) have been introduced in
Section 4. The landmark estimates and the covariance are
not optimized during this stage, and these quantities there-
fore remain as in (53) and (54). For detailed derivations of
the derivatives in (57) we refer the reader to Appendix A.
In Figure 4 the inconsistency problem of the EKF is illu-
strated by means of an example scenario where the perfor-
mance of EKF is particularly bad. In this particular case,
the EKF asserts that the true MAP estimate is about 1040
times less probable than its own estimate pu2j2. On the other
hand, the Gaussian N (p2 ; pu
2j2, (Ju(pu
2j2))�1) is very similar
to the true posterior f(p2ju, Z1:2). Further illustration of how
the NG step mends the inconsistency problem can also be
found in Figure 3 of Brekke and Chitre (2013).
6. Hypothesis search by clique detection
In practical MHT implementations, the key challenge is to
search for promising hypotheses without brute force enu-
meration. This search is typically carried out by assignment
methods based on integer programming (Wijesoma et al.,
2006) or the auction algorithm (Blackman and Popoli,
1999). These methods exploit a key feature of multiple
hypothesis tracking, namely that the cost of a hypothesis
can be written as a sum of terms contributed by the mea-
surements involved. In multi-hypothesis SLAM this is no
longer possible, since the correspondences become co-
dependent when the uncertainty of p2 is taken into account.
Also, assignment methods such as integer programming or
the auction algorithm only aim to discover a single best
hypothesis, or possibly the N best hypotheses. They do not
attempt to explore the hypothesis space beyond such
limitations.
In this paper we solve the search problem through a
three-stage procedure consisting of validation gating,
clique-detection and hypothesis expansion.
6.1. Validation gating
First of all, validation gating similar to the individual com-
patibility test of Neira and Tards (2001) is used in order to
keep the number of correspondences on a manageable
level. More precisely, zj2 may possibly be associated with
zi1 only if
(zj2 � h(yi
1,mq))T(Si)�1(z
j2 � h(yi
1,mq))\g2 ð58Þ
52 The International Journal of Robotics Research 34(1)
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
where g is the number of standard deviations tolerated and
Si =HipP(H
ip)
T +HixY
i1(H
ix)
T
The Jacobians Hip and Hi
x are found according to (47).
The output of the gating procedure can be represented by a
matrix G 2 f0, 1gm1 3 m2 whose entry number (i,j) is one if
zi1 may be associated with z
j2, and zero otherwise.
6.2. Clique detection
The search problem can be addressed by treating the indi-
vidual correspondences as nodes in a graph, whose edges
represent distance between the correspondences. For the
FBSM problem discussed in this paper, a correspondence
consists of two measurements. This yields one free para-
meter during displacement estimation. The locus of all pos-
sible p2 for a given correspondence can be visualized as a
helix. Following this line of thought, the nodes of the corre-
spondence graph correspond to helices, and the edges cor-
respond to minimal sample sets (MSSs), i.e. data-to-data
hypotheses with two correspondences. The edge weights
are related to the minimal distances between helices. One
would expect that all of the helices involved in a good
hypothesis should intersect a limited volume near the cor-
rect displacement vector ptrue2 .
Working with helices is problematic for two reasons.
First, the construction of helices is fundamentally a non-
probabilistic method. Second, inter-helix distances would
have to be evaluated using a Riemannian metric, but it is
not immediately clear how this one should be defined or
minimized.
Instead, we establish the edges of the correspondence
graph using normalized re-projection error. For any MSS u
which can be generated from G we find its normalized re-
projection error as
e(u)= (nu)T(HuPu2j2(H
u)T +Ru)�1nu ð59Þ
Only those edges whose normalized re-projection error is
below some threshold t are included in the graph.
Furthermore, if the number of edges exceeds some constant
B, then we only retain the B best edges.
In order to find the maximal cliques of this graph we use
the Bron–Kerbosch method (Bron and Kerbosch, 1973) as
implemented by Wildman (2011).
6.3 Hypothesis expansion
The collection of maximal cliques does not necessarily
contain the true hypothesis utrue. One can nevertheless rest
assured that utrue will be a sub-clique of one of the maximal
cliques, insofar as the edge threshold t and maximal edge
number B are large enough. Therefore, for each maximal
clique u, we also include all sub-cliques which contain
n(u) 2 d or more nodes in the hypothesis collection. We
refer to d as the expansion depth.
7. Test design and simulation results
In this section we will first (Sections 7.1–7.4) compare
CDSM with four other methods for scan-matching and data
association in SLAM. We will then discuss the sensitivity
of SCPHD and CDSM to model mismatch in Section 7.5.
7.1. Simulation setup
We investigate the performance of CDSM through Monte
Carlo simulations which are designed to mimic output from
a multi-beam sonar with maximum range 60 m, and total
azimuth coverage of 120�. The vehicle’s surge velocity is
uniformly drawn from the interval [0 knots, 5 knots], while
the time interval between the two scans is uniformly drawn
Fig. 4. Illustration of the EKF’s inconsistency problem, and how the NG optimization step mends it. (a) Logarithm of pose
distribution and EKF approximant. (b) Logarithm of pose distribution and NG approximant.
Brekke and Chitre 53
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
from the interval [1s, 3s], so as to mimic a realistic applica-
tion for autonomous underwater vehicles (AUVs).2
Landmarks are placed at random around the vehicle,
and detected with various detection probabilities in [0,1],
independently between the scans. For any detected land-
mark, the measurement noise is given by R = diag ([0.125,
5.7 3 1024]), corresponding to a sensor resolution of
60 3 31 cells (Brekke et al., 2010). Clutter measurements
are drawn with a spatial distribution being uniform over the
FOV in polar coordinates. The number of false alarms are
drawn according to Poisson distributions corresponding to
various false-alarm rates in [10210, 1021].
Having generated several hundreds of thousands of
such scenarios, we pick the first 200 Monte Carlo
runs which satisfy various constraints given by the number
n of landmark measurements, by the average number�f=(f1 +f2)=2 of clutter measurements, and by the rela-
tive pose angle c2 between the two scans. We investigate a
total of 16 such scenarios (see Tables 2–5).
The pose prior is given by mq = 03 3 1 and P = diag
([(5 m)2, (2 m)2, (30�)2]). Normalized re-projection error
e(u) is thresholded at t = 4 standard deviations, and valida-
tion gating is done with g = 3. Maximally B = 200 corre-
spondences are included in the correspondence graph. We
use d = n as the expansion depth for CDSM.
Recall that CDSM makes minimal assumptions regard-
ing false-alarm rate or detection probability. Thus, the
model assumed by CDSM differs from the simulation
model with regard to these quantities. This is justified by
the fact that it hardly is possible to estimate these quantities
reliably from processing only two frames.
7.2. Benchmark methods
7.2.1. JCBB. We use a Matlab implementation of JCBB
which was downloaded from http://www.robots.ox.ac.uk/
~SSS06. It should be noted that the original implementa-
tion did not give satisfactory performance for large jc2j,
Table 2. Successes for 5 � n \ 10, �f\10.
CDSM JCBB SCPHD RANSAC pIC
jc2j \ 2� 94.5% 93.0% 96.0% 90.0% 26.5%jc2j2 [2�, 8�) 94.0% 94.0% 91.0% 55.5% 31.0%jc2j2 [8�, 32�) 92.5% 95.0% 95.5% 1.0% 19.5%jc2j� 32� 94.5% 94.5% 93.0% 0.0% 7.0%
Table 3. Successes for n \ 5, �f\10.
CDSM JCBB SCPHD RANSAC pIC
jc2j \ 2� 92.0% 90.0% 93.4% 85.5% 27.5%jc2j2 [2�, 8�) 87.0% 89.5% 97.5% 61.0% 27.0%jc2j2 [8�, 32�) 92.0% 91.0% 92.5% 16.0% 20.5%jc2j� 32� 88.0% 89.0% 90.5% 8.0% 12.0%
Table 4. Successes for n \ 5, 10� �f\15.
CDSM JCBB SCPHD RANSAC pIC
jc2j \ 2� 79.5% 78.5% 78.5% 73.5% 11.0%jc2j2 [2�, 8�) 79.5% 73.5% 83.0% 41.5% 9.0%jc2j2 [8�, 32�) 73.0% 74.5% 82.0% 5.5% 6.0%jc2j� 32� 67.5% 72.5% 78.0% 1.0% 1.0%
Table 5. Successes for n \ 5, 15� �f\20.
CDSM JCBB SCPHD RANSAC pIC
jc2j \ 2� 51.5% 41.0% 59.0% 40.5% 6.0%jc2j2 [2�, 8�) 51.0% 50.5% 61.0% 21.0% 4.0%jc2j2 [8�, 32�) 37.0% 40.5% 57.0% 0.5% 3.5%jc2j� 32� 24.0% 25.0% 36.0% 0.0% 0.0%
54 The International Journal of Robotics Research 34(1)
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
due to usage of converted measurements yi2 when comput-
ing joint compatibility and Mahalanobis distances. The pro-
gram is therefore rewritten so that the polar measurements
zi2 are used instead of the converted measurements yi
2 in
these tasks. JCBB’s estimate of p2 is found using the same
technique as CDSM uses.
7.2.2. RANSAC. A generic random sampling consensus
(RANSAC) method works by drawing random MSSs until
k iterations have been executed. During each iteration, all
correspondences are tested for compatibility with the cur-
rent MSS, and added to the current hypothesis if deemed
compatible. A score function is then evaluated to test
whether the new hypothesis is better than the previously
best hypothesis. The number k is dynamically updated dur-
ing each iteration, so as to reflect how many more itera-
tions are deemed necessary to find the true solution with a
given probability p (p = 0.999 is used in our simulations):
k =1� p
1� (card(u)=M)sð60Þ
Here M is the total number of feasible correspondences,
s = 2 is the size of an MSS, and card (u*) is the cardinality
of the best hypothesis u* found so far.
RANSAC requires us to specify an estimator of p2, an
error function for whether a correspondence fits the MSS,
and a hypothesis score function. We have deliberately
aimed to make our RANSAC implementation as straight-
forward as possible. Hence, we use the formulae suggested
by Horn (1987) to estimate p2. As the error function, we
use re-projection error, defined as
error(u)=1
n
Xn
i= 1
R(� c)yu1(i)1 � r � y
u2(i)2
��� ���2
ð61Þ
When testing a correspondence’s fit with an MSS, this
function is evaluated for each of the resulting three pairs of
correspondences. The correspondence is added to the
hypothesis if the average of all three reprojection errors is
smaller than a threshold T = 2 m. As hypothesis score we
use cardinality of hypotheses. Whenever a tie is encoun-
tered, the hypothesis set with the lowest re-projection error
is chosen.
7.2.3. pIC. Standard scan-matching methods work in
terms of a two-step iterative process which is repeated until
convergence. In step 1, a measure of the plausibility of
each tentative correspondence is calculated. In step 2, the
displacement between the two scans is calculated according
to a least-squares criterion which puts most emphasis on
the most plausible correspondences. The most popular
methods of this kind are the iterative closest point (ICP)
method (Besl and McKay, 1992; Nieto et al., 2007) and the
probabilistic iterative correspondence (pIC) method
(Montesano et al., 2005; Hernandez et al., 2009). The latter
is specifically tailored towards working with measurements
received by range-bearing sensors. In our simulations we
use pIC as described in Montesano et al. (2005) with maxi-
mally eight iterations.
7.2.4. SCPHD. This method, originally proposed by Lee
et al. (2012) and further developed in Lee et al. (2013), is
currently the most recent SLAM method derived from the
FISST formalism. Roughly speaking, SCPHD evaluates a
joint intensity function over the vehicle state space and the
landmark space. In the implementation of Lee et al. (2012),
the vehicle posterior was evaluated using a particle filter,
while the landmark intensity conditional on a given vehicle
pose was represented by a Gaussian mixture.
The accuracy of SCPHD depends on the number of
samples and on how the samples are chosen. In order to
ensure that SCPHD is able to achieve the same resolution
as CDSM without odometry or data-dependent sampling,
several hundred thousand samples may be necessary. For
this reason, we do not use particle filtering in our imple-
mentation of SCPHD. Instead, we evaluate the pose displa-
cement posterior of SCPHD over a fixed grid.
In SCPHD we must specify some tuning constants,
notably the detection probability PD, the expected number
of landmarks m and the clutter rate l. In accordance with
the modeling underlying CDSM we assume PD = 1, while
we set both m and l equal to five. Further details concern-
ing our implementation of SCPHD can be found in
Appendix B.
7.3. Performance measures
We analyze the performance of CDSM in terms of the three
normalized error functions
eu =
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi(p
u2j2 � ptrue2 )TP�1
p, true(pu2j2 � ptrue2 )
qeMAP
1 =ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi(pMAP
2j2 � ptrue2 )TP�1p, true(p
MAP2j2 � ptrue2 )
qeMAP
2 =ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi(pMAP
2j2 � ptrue2 )TP�1p, ave(p
MAP2j2 � ptrue2 )
qwhere
pMAP2j2 =argmax
p2
f (p2jZ1:2)
Pp, ave =
Z(p2 � ptrue2 )(p2 � ptrue2 )Tf (p2jZ1:2)dp2
and where u* is the most probable non-empty data-to-data
hypothesis, ptrue2 is the true pose displacement, and Pp,true is
the hypothesis-conditional pose covariance of the true data-
to-data hypothesis. We are particularly concerned about two
criteria for performance: success rates and consistency.
7.3.1. Success rates. For a match to be declared successful,
we require that the top non-empty hypothesis has at least
max(2, ntrue/2) correct correspondences, and that eu\3.
Brekke and Chitre 55
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
This performance measure must in some cases be slightly
modified in order to apply to the benchmark methods. For
both JCBB and RANSAC the same success criterion
applies. For SCPHD we declare a success whenever
eMAP1 \3. For pIC we require that its final estimate must
lie within three standard deviations (as given by Pp,true) of
the true pose displacement.
7.3.2. Consistency. In the stochastic filtering literature, an
estimator is considered consistent if it has the appropriate
degree of self-confidence. More precisely, Bar-Shalom
et al. (2011) state that the estimation error should have mag-
nitude commensurate with the corresponding covariance
that is yielded by the estimator. A related requirement, that
was suggested by Brekke and Chitre (2013), is that the esti-
mation procedure should not consider the true parameter
value as significantly less probable than the estimated
value. In this paper we take a more conventional approach,
and analyze consistency in terms of the normalized error
measure eMAP2 . Without stating any more precise require-
ments, we desire that the distribution of eMAP2 should be
close to that of a three-degree-of-freedom x-distributed ran-
dom variable, i.e. similar to the distribution of the absolute
value of a three-dimensional Gaussian random variable with
zero mean and unit covariance. This is based on the ratio-
nale that insofar as the posterior density f(p2jZ1:2) is close
to a Gaussian, then eMAP2 should indeed have a distribution
similar to the unit-variance Gaussian.
We analyze consistency for CDSM, JCBB and SCPHD,
only. For both CDSM and SCPHD we evaluate eMAP2
through grid-based optimization of the pose posterior. For
JCBB, which only provides a single hypothesis with no
hedging on alternative hypotheses, the concept of MAP
estimation is of questionable validity. In order to get some
idea of JCBB’s consistency properties we simply treat its
final pose estimate as pMAP2j2 , while we treat the correspond-
ing covariance as Pp,ave.
7.4. Simulation results
7.4.1. Success rates. Success rates for increasingly difficult
scenarios are displayed in Tables 2–5. We make the follow-
ing four comments on these results.
The success rates of CDSM, JCBB and SCPHD are very
similar. The are generally quite high, but not perfect. Even
in the simplest case (Table 2) there are several Monte Carlo
runs where both CDSM and JCBB fail to choose the correct
hypothesis. However, in the vast majority of these cases, a
hypothesis which satisfies our success criteria can be found
among the top 10 CDSM hypotheses. Such hedging is not
provided by JCBB since it only returns one hypothesis.
SCPHD is arguably the most robust method since it
achieves substantially higher success rates than the other
methods for the most challenging scenarios (the last two
rows in Table 5). However, neither SCPHD nor any of the
other methods considered can be expected to estimate the
true pose displacement accurately when there is four times
as much or more clutter than landmarks. An interesting,
but highly non-trivial topic for future research may be to
develop rigorous Cramer–Rao lower bounds based on
model parameters such as sensor resolution and clutter rate,
possibly along the lines of Rezaeian and Vo (2010).
The plain-vanilla RANSAC method achieves very good
performance for low rotation uncertainty, but deteriorates
quickly as c2 increases. We believe that this is because it
does not use a probabilistic EKF-based pose estimation
technique. In Brekke and Chitre (2013) an alternative
‘‘probabilistic’’ RANSAC method was also proposed to
overcome this problem, but that method never achieved the
same best-case performance as the plain-vanilla RANSAC.
We believe that RANSAC-based methods have a great
potential to solve data association problems in SLAM, but
many bells and whistles may be necessary to achieve the
desired performance.
In most cases, pIC fails to provide adequate estimates.
This could possibly be because pIC depends on continuous
features such as walls, or because there are too few land-
marks in the simulated scenarios. However, even when pIC
violates the success criterion, it is not necessarily entirely
lost. Often pIC yields near-acceptable estimates, but local
optima prevent the method from converging to sufficiently
accurate estimates.
7.4.2. Consistency. In order to investigate consistency,
Figure 5 displays box-plots of eMAP2 -values for the scenar-
ios corresponding to Table 3 (CDSM 1, JCBB 1 and
SCPHD 1) and Table 4 (CDSM 2, JCBB 2 and SCPHD 2),
as well as a box-plot for samples of a three-degree-of-free-
dom x-distributed random variable (labeled Gauss). For the
easy scenario, both CDSM, JCBB and SCPHD have pretty
much ideal consistency properties insofar as eMAP2 is
concerned.
For the tough scenario, CDSM maintains low eMAP2 val-
ues. In other words, actual estimation error is low com-
pared to overall covariance, implying that CDSM becomes
Fig. 5. Consistency of CDSM, JCBB and SCPHD.
56 The International Journal of Robotics Research 34(1)
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
more conservative. This is due to two reasons. First, the
increased amount of clutter causes it to put more emphasis
on the empty hypothesis as well as on other low-cardinality
hypotheses. Second, the increased amount of available
measurements also causes it to hedge on a larger number
of semi-true hypotheses, typically with one correspondence
wrong or missing, which leads to a larger spread of the
main mode of f(p2jZ1:2).
Both for SCPHD, and especially for JCBB, the large
number of outliers indicates inconsistent behavior in the
tough scenario. For JCBB this is clearly because it chooses
a single hypothesis without accounting for data association
uncertainty. For SCPHD, we will discuss one possible cause
of inconsistency in the next subsection.
7.5. Sensitivity to assumed clutter rate in Poisson
formulation
As mentioned, SCPHD assumes the cardinalities n (number
of landmarks) and fk (number of false alarms) to be
Poisson distributed with rates m and l. We could also have
developed CDSM under these assumptions, instead of the
uniform assumptions used in (12) and (13). This leads to
the alternative hypothesis probability formula
Pr(ujZ1:2)=mn(l=V )f1 +f2 au ð62Þ
The approximation for au as well as the hypothesis-
conditional state estimation remain as before. It is of inter-
est to investigate how robust such a Poisson-CDSM is
compared with the default CDSM.
In Figure 6 we see an example of how the normalized
errors eMAP1 and eMAP
2 may depend on the assumed false
alarm rate l for SCPHD and the Poisson-CDSM. First of
all, we note that eMAP1 and eMAP
2 are virtually indistinguish-
able for SCPHD and the Poisson-CDSM. This can be taken
as evidence that SCPHD indeed provides a very good
approximation of the posterior density for FBSM (condi-
tional on the Poisson assumptions, of course). With regard
to the primary error measure eMAP1 , we can see that assum-
ing too low a clutter rate appears to be less dangerous than
assuming too high a clutter rate. At least for Poisson-
CDSM, this can be understood in terms of the emphasis
put on cardinality information. The assumed absence of
clutter makes Poisson-CDSM shift probability towards
high-cardinality hypothesis. In most cases, unless actual
clutter rates are extremely high, the hypothesis with the
highest cardinality tends to be correct or semi-correct, and
choosing this hypothesis (which, by the way, always is
attempted by JCBB or RANSAC) is therefore likely to
yield good accuracy. On the other hand, the consistency
measure eMAP2 attains its best values for slightly exagger-
ated values of l. However, these very low values of eMAP2
(down to 0.1 standard deviations) are indeed unreasonably
low, and indicate an over-conservative estimation method.
For the default CDSM, both eMAP1 and eMAP
2 attain
reasonable values (2.8 and 0.8 standard deviations, respec-
tively) in this particular example.
In Figure 7 we get an insight into how the hypothesis
probabilities depend on m and l. For the same example, the
ratio Pr (utruejZ1:2)/Pr (u0jZ1:2), where utrue is the true data-
to-data hypothesis and u0 is the empty data-to-data hypoth-
esis, is plotted as a function of m and l. For default CDSM,
this ratio is 386 in this example, indicating a strong, but not
absolute, confidence in utrue. For Poisson-CDSM, on the
other hand, this ratio varies with a factor larger than 109 for
l 2 [1, 10]. Such a large variation is clearly not representa-
tive of our actual knowledge, and should cause concern
regarding the Poisson assumption.
There are three ways of dealing with this problem. The
first way is simply to ignore it, with the justification that
curves such as those displayed in Figure 6 indicate a lim-
ited practical relevance of this problem. The second way is
to acknowledge that our knowledge of l (and m and PD as
well) is uncertain. Following this line of thought we must
specify a (possibly non-informative) distribution of l, and
marginalize over this distribution. Research in this direction
has been reported in Mahler et al. (2011), but no such
developments have so far been proposed for SLAM-related
problems. The third way, pursued in our development of
CDSM, is to give up the Poisson cardinality model, and
instead treat n and fk as distributed according to non-
informative distributions.
7.6. Computational complexity
Only limited attempts have been made to optimize the
computational complexity of CDSM, and a systematic
comparison with the complexities of JCBB and SCPHD
has therefore not yet been conducted. For scenarios such as
those studied in this section, we have found that CDSM
will require between 2 times and 100 times longer runtime
than JCBB, while our implementation of SCPHD will typi-
cally require about 1000 times longer runtime than JCBB.
Fig. 6. Sensitivity to Poisson rate l for SCPHD and the Poisson
version of CDSM.
Brekke and Chitre 57
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
For CDSM, as for any bona fide MHT approach, the
theoretical complexity is exponential. Its runtime depends
on how aggressively low-probability correspondences and
hypotheses are being pruned. For SCPHD, the runtime
depends on the number of samples used. The high runtime
for SCPHD results from our requirement that SCPHD
should yield an accuracy commensurate with the accuracy
of CDSM, while not utilizing any more prior information
than CDSM does (cf. Section 7.2.4).
8. Real data results
The Victoria Park data set (Guivant and Nebot, 2001) con-
sists of 7249 frames of data recorded by a laser scanner
mounted on a small truck, which was driving in the
Victoria Park in Sydney, NSW, Australia. Point features
corresponding to trees in the Victoria Park can be extracted
from the data, for example using the Matlab code available
from www-personal.acfr.usyd.edu.au/nebot/victoria_park
.htm. The Victoria Park data have been used as a test case
in several SLAM publications, among them Davey (2007)
and Montemerlo (2003).
Since this paper only addresses FBSM, and not FB-
SLAM in general, we only analyze the data in pairwise
consecutive frames. That is, for all pairs of frames k 2 1
and k we investigate whether CDSM and our FBSM adap-
tations of CDSM and SCPHD are able to estimate the odo-
metry between these frames reliably. The odometry consists
of velocity measurements recorded at the rear left wheel
and measurements of the average steering angles of the
front wheels. By dead-reckoning using the Ackermann
model we are able to calculate ground truths for the pose
displacements between consecutive scans. For the methods
to be investigated, we use the measurement noise matrix
R = diag ([0.3 m, 0.0076 rad]2).
In the upper plots of Figures 8 and 9 we can see how
hypothesis-conditional velocity estimates from CDSM’s
lead hypotheses match the measured odometry for forward
velocity and vehicle rotation, respectively. In the lower plots
of these figures we display normalized errors
ebestx, (2) = (xbest2j2 � x2)=(Dt
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiPave½1, 1�
p)
ebestc, (2) = (cbest2j2 � c2)=(Dt
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiPave½3, 3�
p)
for the same time steps. It is evident that CDSM in the
majority of cases estimates the odometry very well.
Occasionally it fails to provide good estimates, but in all
such cases the normalized errors remain small, indicating
that CDSM acknowledges its own failure, and hedges
heavily on the empty hypothesis. It should also be noted
Fig. 7. Ratio between the probabilities of the true hypothesis and
the empty hypothesis as function of m and l for the Poisson
version of CDSM.
Fig. 8. Forward velocity estimates of CDSM. The best
hypothesis occasionally misses odometry, but is generally within
62s.
Fig. 9. Turn rate estimates of CDSM. A small bias can be
observed during very strong left turns.
58 The International Journal of Robotics Research 34(1)
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
that during very sharp left turns (turn radius less than 5 m,
estimates of the rotation rate exhibit a noticeable bias. This
may possibly be due to inadequacy of the Ackermann
model under these circumstances.
In Figure 10 we display box-plots for of forward-
velocity errors
ebestx, (1) = (xbest2j2 � x2)=Dt
eMAPx, (1) = (xMAP
2j2 � x2)=Dt
for the collection all consecutive pairs of scans in the data
set.3
All of the methods tend to estimate the correct velocity
within 610 cm/s. The typical accuracy of CDSM’s best
hypothesis and JCBB is better than the typical accuracy of
the MAP estimates. This is so because the MAP estimates
are obtained from optimization over a grid with limited res-
olution. This illustrates the performance loss inevitably suf-
fered by sampling-based methods. Again, we note that
JCBB has more outliers than CDSM or SCPHD.
In Figure 11 we investigate the distributional properties
of the estimation errors more closely by plotting empirical
survival functions for the normalized error measures ebestx, (2)
and
eMAPx, (2) = (xMAP
2j2 � x2)=(DtffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiPave½3, 3�
p)
The survival function is defined as one minus the cumula-
tive distribution function (cdf). The empirical survival
function is constructed by first making a histogram of the
relevant samples, and then calculating one minus the cumu-
lative sum of the histogram divided by the total sum of the
histogram. For comparison, we also include the survival
function for the absolute value of a zero-mean unit-variance
Gaussian random variable in Figure 11. Again, the inconsis-
tency of JCBB is very noticeable, while SCPHD displays a
small degree of inconsistency. The normalized error distri-
bution of CDSM-MAP is very similar to the folded unit-
variance Gaussian, indicating ideal consistency properties,
while CDSM-Best appears to be slightly over-conservative.
9. Conclusion and future research
In this paper we have developed a rigorous solution to what
we call FBSM, which is a special case of the more general
SLAM problem. Our proposed solution, whose complete
machinery is referred to as CDSM, provides a Gaussian
mixture approximation of the posterior density of the pose
displacement. Novel strategies for state estimation and
hypothesis search have been incorporated in CDSM in
order to make this approximation as reliable and efficient
as possible.
More precisely, we have shown that the posterior density
of the pose displacement according to FISST is a mixture
over data-to-data hypotheses, we have derived an exact
expression for the weights in this mixture, and we have
developed a closed-form approximation for these mixture
weights. Thus, this paper provides a bridge between the
MHT paradigm and the FISST paradigm. This is very
important, because the relationship between these two para-
digms has not yet been clarified in the available literature.
For the FBSM problem, CDSM appears to have advan-
tages both over traditional data association methods such as
JCBB, and over FISST-based techniques such as SCPHD.
Analysis of consistency reveals that CDSM is significantly
more robust than JCBB. This can be credited to CDSM’s
ability of hedging on alternative hypotheses. CDSM also
has somewhat better consistency properties than SCPHD.
This is probably so because CDSM is able to utilize unin-
formative models for the number of targets and false
alarms, as opposed to the Poisson models required by
SCPHD. Furthermore, CDSM provides a more direct and
parsimonious representation of the posterior density than
SCPHD does.
CDSM is only a first step towards truly multi-hypothesis
SLAM and target tracking with navigation uncertainty. In
future research, we aim to develop extensions to k time
steps. This is in principle relatively straightforward, but will
entail challenges with regard to the following. First, the
Fig. 10. Boxplots displaying forward velocity error statistics.
Fig. 11. Consistency properties of velocity estimates
investigated through empirical survival functions.
Brekke and Chitre 59
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
assumption of unity detection probability must be
relaxed. Second, the number of available hypotheses
will increase, partly because of non-unity detection
probability. Therefore, efficient hypothesis management
techniques are necessary for this. Third, more refined
motion models may be required for extension to k time
steps. Using a slightly different hypothesis representation
than the one used in this paper, we have recently devel-
oped a formal solution to Bayesian multi-hypothesis
SLAM with k time steps in Brekke et al. (2014), and we
are currently pursuing practical implementations of that
solution.
Funding
This research received no specific grant from any funding agency
in the public, commercial, or not-for-profit sectors.
Notes
1. The benefit of the NG over Newton is that the NG only
requires evaluation of first-order derivatives of the objective
function, while Newton requires evaluation of second-order
derivatives. Nevertheless, both methods are able to exploit
curvature information. Gauss–Newton, as opposed to
Newton, can also manage with only first-order derivatives,
but only for least-squares problems. Minimizing c(p2) is not
simply a least-squares problem, because c(p2) contains
Gaussians whose normalization constants also depend on p2
(cf. (31)).
2. Typical surge speed for a small AUV is about 3 knots (Eng
et al., 2010).
3. For JCBB, there is no difference between ebest:, (�) and eMAP:, (�) . For
CDSM, ebest:, (�) does not exist, so only eMAP:, (�) is relevant.
References
Amari S and Douglas SC (1998) Why natural gradient? In: Pro-
ceedings of 1998 IEEE ICASP. Seattle, WA, USA, pp. 1213–
1216.
Bar-Shalom Y, Willett PK and Tian X (2011) Tracking and Data
Fusion: A Handbook of Algorithms. Storrs, CT: YBS
Publishing.
Besl PJ and McKay HD (1992) A method for registration of 3-D
shapes. IEEE Transactions on Pattern Analysis and Machine
Intelligence 14(2): 239–256.
Blackman S (2004) Multiple hypothesis tracking for multiple tar-
get tracking. IEEE Aerospace and Electronic Systems Maga-
zine 19: 5–18.
Blackman S and Popoli R (1999) Design and Analysis of Modern
Tracking Systems. Norwood, MA: Artech House.
Blanco JL, Gonzalez-Jimenez J and Fernandez-Madrigal J (2012)
An alternative to the mahalanobis distance for determining
optimal correspondences in data association. IEEE Transac-
tions on Robotics 28(4): 980–986.
Brekke E and Chitre M (2013) Bayesian multi-hypothesis scan
matching. In: Proceedings of OCEANS’13, Bergen, Norway.
Brekke E, Hallingstad O and Glattetre J (2010) The signal-to-
noise ratio of human divers. In: Proceedings of OCEANS’10,
Sydney, Australia.
Brekke E, Kalyan B and Chitre M (2014) A novel formulation of
the bayes recursion for single-cluster filtering. In: Proceedings
of IEEE aerospace conference, Big Sky, MT, USA.
Bron C and Kerbosch J (1973) Finding all cliques of an undirected
graph. Communications of the ACM 16(9): 575–7.
Davey SJ (2007) Simultaneous localization and map building
using the probabilistic multi-hypothesis tracker. IEEE Transac-
tions on Robotics 23(2): 271–280.
Deb S, Yeddanapudi M, Pattipati K and Bar-Shalom Y (1997) A
generalized S-D assignment algorithm for multisensor-
multitarget state estimation. IEEE Transactions on Aerospace
and Electronic Systems 33(2): 523–538.
Eng YH, Hong GS and Chitre M (2010) Depth control of an
autonomous underwater vehicle, starfish. In: Proceedings of
OCEANS’10, Sydney, NSW, Australia.
Fackler PL (2005) Notes on Matrix Calculus. North Carolina State
University. Available at: http://www4.ncsu.edu/~pfackler/.
Frese U (2010) Interview: is SLAM solved?Kunstliche Intelligenz
24(3): 255–257.
Goodman IR, Nguyen HT and Mahler R (1997) Mathematics of
Data Fusion. New York: Springer.
Guivant J and Nebot E (2001) Simultaneous localization and map
building: Test case for outdoor applications. Technical report,
Australian Centre for Field Robotics. Available at: http://www-
personal.acfr.usyd.edu.au/nebot/publications/slam/
IJRR_slam.htm.
Hernandez E, Ridao P, Ribas D and Batlle J (2009) MSISpIC: A
probabilistic scan matching algorithm using a mechanical
scanned imaging sonar. Journal of Physical Agents 3(1): 3–11.
Horn BKP (1987) Closed-form solution of absolute orientation
using unit quaternions. Journal of the Optical Society of Amer-
ica A 4(4): 629–642.
Huang GP, Mourikis AI and Roumeliotis SI (2010) Observability-
based rules for designing consistent EKF SLAM estimators.
The international Journal of Robotics Research 29(5): 502–528.
Kalyan B, Lee KW and Wijesoma WS (2010) FISST-SLAM:
Finite set statistical approach to simultaneous localization and
mapping. The International Journal of Robotics Research
29(10): 1251–1262.
Kurien T (1990) Issues in the design of practical multitarget track-
ing algorithms. In: Multitarget-Multisensor Tracking: Applica-
tions and Advances, volume 1, chapter 3. Norwood, MA:
Artech House, pp. 219–245.
Lee CS, Clark DE and Salvi J (2012) SLAM with single cluster
PHD filters. In: Proceedings of ICRA, St. Paul, MN, pp. 2096–
2101.
Lee CS, Clark DE and Salvi J (2013) SLAM with dynamic targets
via single-cluster PHD filtering. IEEE Journal of Selected
Topics in Signal Processing 7(3): 543–552.
Magnus JR and Neudecker H (1988) Matrix Differential Calculus
with Applications in Statistics and Econometrics. London:
John Wiley & Sons.
Mahler R (2007) Statistical Multisource-Multitarget Information
Fusion. Norwood, MA: Artech House.
Mahler R, Vo BT and Vo BN (2011) CPHD filtering with
unknown clutter rate and detection profile. IEEE Transactions
on Signal Processing 59(8): 3497–3513.
Montemerlo M (2003) FastSLAM: A Factored Solution to the
Simultaneous Localization and Mapping Problem With
Unknown Data Association. PhD Thesis, School of Computer
Science, Carnegie Mellon University, Pittsburgh, PA, USA.
60 The International Journal of Robotics Research 34(1)
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
Montesano L, Minguez J and Montano L (2005) Probabilistic scan
matching for motion estimation in unstructured environments.
In: Proceedings of IROS 2005, Edmonton, AB, Canada, pp.
3499–504.
Mori S, Chong CY, Tse E and Wishner R (1986) Tracking and
classifying multiple targets without a priori identification.
IEEE Transactions on Automatic Control 31(5): 401–409.
Mullane J, Vo BN, Adams MD and Vo BT (2011) A random-
finite-set approach to Bayesian SLAM. IEEE Transactions on
Robotics. DOI:10.1109/TRO.2010.2101370.
Musicki D and Evans R (2005) Target existence based MHT. In:
Proceedings of CDC-ECC’05, Seville, Spain, pp. 1228–1233.
Neira J and Tards JD (2001) Data association in stochastic map-
ping using the joint compatibility test. IEEE Transactions on
Robotics and Automation 17(6): 890–897.
Nieto J, Bailey T and Nebot E (2007) Recursive scan-matching
SLAM. Robotics and Autonomous Systems 55(1): 39–49.
Nieto J, Guivant J, Nebot E and Thrun S (2003) Real time data
association for FastSLAM. In: Proceedings of ICRA 2003, Tai-
pei, Taiwan, pp. 412–418.
Pfingsthorn M and Birk A (2013) Simultaneous localization and
mapping with multimodal probability distributions. The Inter-
national Journal of Robotics Research 32(2): 143–171.
Porat B and Friedlander B (1986) Computation of the exact infor-
mation matrix of gaussian time series with stationary random
components. IEEE Transactions on Acoustics, Speech and Sig-
nal Processing 34(1): 118–130.
Reid D (1979) An algorithm for tracking multiple targets. IEEE
Transactions on Automatic Control 24(6): 843–854.
Rezaeian M and Vo BN (2010) Error bounds for joint detection
and estimation of a single object with random finite set obser-
vation. IEEE Transactions on Signal Processing 58(3):
1493–506.
Ristic B (1999) A comparison of MHT and 2D assignment algo-
rithm for tracking with an airborne pulse Doppler radar. In: Pro-
ceedings of ISSPA’99, Brisbane, QLD, Australia, pp. 341–344.
Streit RL and Luginbuhl TE (1994) Maximum likelihood method
for probabilistic multi-hypothesis tracking. In: Signal and Data
Processing of Small Targets, Orlando, FL (Proceedings of
SPIE, vol. 2235). Bellingham, WA: SPIE.
Van Keuk G (2002) MHT extraction and track maintenance of a
target formation. IEEE Transactions on Aerospace and Elec-
tronic Systems 38(1): 288–295.
Vo BN, Singh S and Doucet A (2005) Sequential Monte Carlo
methods for multitarget filtering with random finite sets. IEEE
Transactions on Aerospace and Electronic Systems 41(4):
1224–1245.
Vo BT and Vo BN (2013) Labeled random finite sets and multi-
object conjugate priors. IEEE Transactions on Signal Process-
ing 61(13): 3460–3475.
Wijesoma WS, Perera LDL and Adams MD (2006) Toward multi-
dimensional assignment data association in robot localization
and mapping. IEEE Transactions on Robotics 22(2): 350–365.
Wildman J (2011) Bron-Kerbosch maximal clique finding algo-
rithm. Matlab Central File Exchange. Retrived 27 October 2011.
Williams JL (2011) Graphical model approximations of random
finite set filters. Preprint arXiv:1105.3298v2.
Williams JL (2012) Hybrid Poisson and multi-Bernoulli filters. In:
Proceedings of FUSION 2012, Singapore, pp. 1103–1110.
Appendix A: The gradient of (55)
The NG update step (56) utilizes the gradient of c(p2) as
expressed in (57). Detailed derivations of the formulae in
(57) are provided in this appendix.
First of all, it is useful to generalize the formula for dif-
ferentiation of quadratic forms. It is well known that
Dx½f(x)AfT(x)�=(A+AT)Dx f(x). If f is allowed to be a
matrix or A is allowed to depend on x, a more general for-
mula is needed. For this to be handled, the well-known
product rule (uv)0 = u0v + uv0 should be generalized to
matrix-valued functions. In Fackler (2005), the following
result can be found.
Lemma 1. (Generalized product rule). Let f : Rn !R
m 3 pand g : Rn ! Rp 3 qbe matrix-valued functions. If
f(x) = f(x)g(x), then its matrix derivative is
Dxf(x)= (g(x)T � Im)Dxf (x)+ (Iq � f (x))Dxg(x): ð63Þ
Lemma 2. (Derivative of quadratic form). If
F : Rn ! Rt3 sand c : Rn ! R
t 3 tis given by c(x) = F
(x)TA(x)F(x), then
Dxc(x)= A(x)F(x)½ �T�Is
Ks, tDxF(x)+ (Is � F(x)T)
(F(x)T � It)DxA(x)�
+(Is � A(x))DxF(x)�
where Ks, t 2 Rst 3 st is the commutation matrix (Magnus
and Neudecker, 1988).
Proof. This can easily be seen by applying the result of
Lemma 1 twice. In the first step, let f(x) bF(x)T and let
g(x) bA(x)F(x). In the second step, differentiate A(x)F(x)
by letting f(x) bA(x) and g(x) bF(x).
Lemma 3. We have Dp2Su(i) = ½04 3 2, vec(U
u(i))�.
Proof. It can be seen from (34) that the only variable in p2
that Si depends on is c2. As a special case of Lemma 2 we
obtain
Dc2Su(i)
= Yu(i)1 R(c2)
h iT� �K2, 2Dc2
½R(c2)�
+(I2 �R(c2)T)(I2 � Y
u(i)1 )Dx½R(c2)�
= (½Yu(i)1 R(c2)�
T � I2)K2, 2Dc2R(c2)
+ I2 � ½R(c2)TY
u(i)1 �Dc2
R(c2)
ð64Þ
The second equality in (64) follows from the mixed-
product property of the Kronecker product. The derivative
Dc2R(c2) of the rotation matrix is
Dc2R(c2)= ½� sinc2, cosc2, � cosc2, � sinc2�T
If we now define C=R(c2)Yu(i)1 as in (38), then
Brekke and Chitre 61
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
Dc2Su(i)
= ((C� I2)K2, 2 + I2 � C) Dc2R(c2)
=
c11 c12 0 0
0 0 c11 c12
c21 c22 0 0
0 0 c21 c22
2666437775
0BBB@
+
c11 c12 0 0
c21 c22 0 0
0 0 c11 c12
0 0 c21 c22
26664377751CCCA
� sinc2
cosc2
� cosc2
� sinc2
2666437775
=
�2c11 sinc2 + 2c12 cosc2
�(c12 + c21) sinc2 +(c22 � c11) cosc2
�(c12 + c21) sinc2 +(c22 � c11) cosc2
�2c22 sinc2 � 2c21 cosc2
2666437775
By definition, this is vec(Uu(i)).
Lemma 4. The derivative of the inverse covariance matrix
isDp2(Su(i))�1 = � ((Su(i))�1 � (Su(i))�1)Dp2
Su(i).
Proof. This result is established in Ch. 9 of Magnus and
Neudecker (1988).
Lemma 5. We have Dp2nu(i) = R(� c2), S(c2)½
(r2 � yu(i)1 )�.
Proof. Application of Lemma 1 yields
Dp2nu(i) =Dp2
(yu(i)1 �R(� c2)y
u(i)1 � b)
=Dp2(R(� c2)r2 �R(� c2)y
u(i)1 )
= (r2 � I2)Dp2R(� c2)+R(c2)Dp2
r2
�(yu(i)1 � I2)Dp2
R(� c2)
= 02 3 2, (r2 � yu(i)1 )� I2
� sinc2
cosc2
� cosc2
� sinc2
2666437775
2666437775
� R(� c2), 02 3 1½ �
= R(� c2), S(c2)(r2 � yu(i)1 )
h iLemma 6. If c(p2) is given by (55), then
Dp2c(p2)= (p2 � mq)
TP�1
+1
2
Xi2u
d1 + d2 +vec((Su(i))�1)TDp2
Su(i)
where
d1 =((Su(i))�1nu(i))TDp2
nu(i)
d2 =(nu(i))T ((nu(i))T � I2)Dp2((Su(i)
)�1)h
+(Su(i))�1Dp2
nu(i)i:
Proof. By discarding all terms in c(p2) whose derivatives
are zero, we get the reduced cost function
c(p2)=1
2(p2 � mq)
TP�1(p2 � mq)
+Xi2u
1
2ln jSu(i)j+ 1
2(nu(i))T(Su(i)
)�1nu(i)
� �The derivative of the first term is
Dp2
1
2(p2 � mq)
TP�1(p2 � mq)
� �=(p2 � mq)
TP�1
The derivatives of the logarithmic determinants are
Dp2
12ln jSu(i)j
=Dp2jSu(i)j
2jSu(i)j= 1
2vec((Su(i)
)�1)TDp2Su(i)
where the second equality follows from (1) in Ch. 9 of
Magnus and Neudecker (1988). Finally, from Lemma 2 we
find that
Dp2(nu(i))T(Su(i)
)�1nu(i)
= d1 + d2
where d1 and d2 correspond to the first and second terms in
(63), respectively.
Appendix B: Implementation of SCPHD
SCPHD is based on the concept of a joint pose-and-
landmark intensity, which is factorized as follows:
Dk(pk , x)= sk(pk)Dk(xjpk)
Here sk(pk) represents a parent process (of the pose), while
Dk(xjpk) represents a daughter process (of the landmarks
conditional on the pose). In this particular case, where pk is
a vector, sk(�) is a PDF, while Dk(�jpk) is a PHD. Thus, the
integralR
Dk(xjpk)dx yields the expected number of land-
marks conditional on pk.
Based on the assumptions stated in Section 2.2 we use
the prior intensities
s1j0(p1)= d0(p1) and D1j0(xjp1)=m unifL1(x)
After the first measurement update we get the posterior
intensities
s1(p1)= d0(p1)
D1(xjp1)’Xm1
i= 1
m
m+ l � (v=V )ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffijRj=jYi
1jq N (x; yi
1,Yi1)
This follows from applying the same kind of reasoning that
led to (41) in Section 4.2 to formulae (6)–(11) in Lee et al.
(2012). After the second measurement update we get the
posterior intensities
62 The International Journal of Robotics Research 34(1)
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from
s2(p2)}N (p2; mq,P)Ym2
j= 1
l=V +Xm1
i= 1
wij(p2)
!
D2(xjp2)=Xm2
j= 1
Pm1
i= 1 wij(p2)N (x; mij,Pij)
l=V +Pm1
i= 1 wij(p2)
where
wij = m
m+ l�(v=V )ffiffiffiffiffiffiffiffiffiffiffiffijRj=jYi
1jp N (z
j2; h(yi
1, p2),R)
mij = yi1 +Wij(z
j2 � h(yi
1, p2))Pij =(I�WijHi
x)Yi1
Wij =Yi1(H
ix)
T(HixY
i1(H
ix)
T +R)
These equations are implemented for each p2 in a
discrete grid, with 50, 61 and 200 samples in the intervals
22 � x2 � 8, 27 � y2 � 7 and 265��c2 � 65�,
respectively.
Appendix C: Deriving the joint posterior as a
mixture over data-to-data hypotheses
In this third and last appendix we show that the joint density
f(p2, XjZ1:2) is a mixture over data-to-data hypotheses on the
form (22). Recall that the prior density can be factorized as
f(X, p2) = f(X) f(p2) due to a priori independence between
p2 and X. Thus, we let the prior joint density be given by
f (X, p2)= f (p2)f (X)= f (p2)n!
N + 1
Yn
i= 1
f (xi)
Inserting the measurement model fSkZk jX, pkð Þ into the
two-step Bayes rule (19) yields the posterior joint density
f (X, p2jZ1:2)=1
c
Xv1
f1!
V f1
Yn
i= 1
f (zv1(i)1 jxi, 03 3 1)
!Xv2
f2!
V f2
Yn
i= 1
f (zv2(i)2 jxi, p2)
!f (p2)n!
Yn
i= 1
f (xi)
where the two summations go over all landmark-to-data
hypotheses v1: {1, ., n}! {1, ., m1} and v2: {1, ., n}
! {1, ., m2}, respectively. This product of sums can be
replaced by a single sum over all 2-dimensional landmark-
to-data hypotheses v: {1, ., n}! {1, ., m1} 3 {1, .,
m2}:
f (X, p2jZ1:2)=1
c
Xv
n!f1!f2!
V f1 +f2f (p2)
Yn
i= 1
f (zv1(i)1 jxi, 03 3 1)f (z
v2(i)2 jxi, p2)f (x
i):
=1
c
Xv
n!f1!f2!
V f1 +f2au½v�f (jvjv,Z1:2)
Here u[v] is the data-to-data hypothesis encompassing the
landmark-to-data hypothesis v, and the last line was
obtained by invoking the definitions in (20) and (21).
Finally, it remains to partition the collection of landmark-
to-data hypotheses into data-to-data hypotheses. By recog-
nizing that for any landmark-to-data hypothesis v all its per-
mutations correspond to the same data-to-data hypothesis u,
we immediately obtain the desired formula (22).
Brekke and Chitre 63
at PENNSYLVANIA STATE UNIV on May 11, 2016ijr.sagepub.comDownloaded from