STATISTICAL DECONVOLUTION
ON THE 2D-EUCLIDEAN MOTION GROUP
A Thesis
Presented to
The Faculty of Graduate Studies
of
The University of Guelph
by
MAIA R. LESOSKY
In partial fulfilment of requirements
for the degree of
Doctor of Philosophy
June, 2009
© Maia R. Lesosky, 2009
Library and Archives Canada
Published Heritage Branch
395 Wellington Street, Ottawa ON K1A 0N4, Canada

ISBN: 978-0-494-50128-3

NOTICE: The author has granted a non-exclusive license allowing Library and Archives Canada to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell theses worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats.

The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

In compliance with the Canadian Privacy Act some supporting forms may have been removed from this thesis. While these forms may be included in the document page count, their removal does not represent any loss of content from the thesis.
ABSTRACT
STATISTICAL DECONVOLUTION
ON THE 2D-EUCLIDEAN MOTION GROUP
Maia R. Lesosky
University of Guelph, 2009
Advisors: Dr. P.T. Kim, Dr. D.W. Kribs
The problem discussed in this dissertation is that of deconvolution on a
low-dimensional example of a non-compact, non-commutative group, namely the 2D-
Euclidean motion group. This is the first time that asymptotic rates for the upper
bound of the mean integrated squared error have been determined in a setting such
as this. Multiple regularization methods, including spectral cut-off and Tikhonov regularization, are used to solve this problem. A minor simulation study completes
the dissertation.
Acknowledgments
I acknowledge the advice and support of my advisory committee: Prof. J. Dickey,
Prof. P.T. Kim, Prof. D.W. Kribs. Additionally, I am grateful to Prof. P. McNicholas
for the use of the quadcore to run my simulations and for helpful advice, Prof. A.
Munk for hosting me at the Institute for Mathematical Stochastics, University of Göttingen (Nov.-Dec. 2008) and Dr. N. Hyvönen for hosting me at the Institute of Mathematics, Helsinki University of Technology (Jan.-Apr. 2009).
I received funding from various sources, including NSERC, OGS, WSIB, various University-controlled scholarships and Prof. D.W. Kribs, for which I am grateful.
Table of Contents

List of Tables
List of Figures

1 Introduction
  1.1 Statement of Problem
  1.2 Dissertation Outline
  1.3 Literature Review

2 Technical Background
  2.1 The 2D-Euclidean Motion Group
  2.2 Irreducible Unitary Representations
    2.2.1 Proof of Irreducibility
  2.3 Fourier Analysis on the Euclidean Motion Group
    2.3.1 Proof of Convolution Theorem
    2.3.2 Proof of Parseval's Equality
  2.4 Regularization Techniques
    2.4.1 Spectral Cut-off
    2.4.2 Tikhonov Regularization

3 Theoretical Results
  3.1 Deconvolution Density Estimation
    3.1.1 Conditions and Assumptions
    3.1.2 Statistical Model
  3.2 Asymptotic Error Bounds
    3.2.1 Spectral Cut-off Regularization
    3.2.2 Tikhonov Regularization
  3.3 Further Remarks

4 Simulation Study
  4.1 Direct Density Estimation
    4.1.1 Estimation of Mean Integrated Squared Error
  4.2 Remarks

5 Discussion
  5.1 Remarks
  5.2 Further Work

Bibliography

A Proofs
  A.1 Proof of Theorem 3.2.1
    A.1.1 Bias Calculation
    A.1.2 Variance Calculation
  A.2 Proof of Corollary 3.2.2
  A.3 Proof of Corollary 3.2.3
  A.4 Proof of Theorem 3.2.4
    A.4.1 Bias Calculation
    A.4.2 Variance Calculation
  A.5 Proof of Corollary 3.2.5
  A.6 Proof of Theorem 3.2.6
  A.7 Development of CV(T)
    A.7.1 Empirical Density Estimator
    A.7.2 Leave-one-out Cross Validation

B Simulation Code
  B.1 Computational Functions
  B.2 Plotting Functions
  B.3 Sample Simulation
List of Figures

4.1 Sample perspective and contour plots for simulated density
4.2 Perspective plots comparing true and estimated density
4.3 Contour plots comparing true and estimated density
4.4 Perspective plots comparing true and estimated density
4.5 CV(T) as sample size varies for various parameters
4.6 Cross validation variation due to density parameters
Chapter 1
Introduction
1.1 Statement of Problem
Statistical inverse problems are a fundamental class of problems that encompass broad fields of application and statistical methodology. Deconvolution is a classical problem; however, it is rare that deconvolution over groups, and particularly over non-compact groups, is considered. The primary objective of this thesis is to estimate a deconvolution density, which is an ill-posed inverse problem, and to find upper bounds for the mean integrated squared error of estimation over a specific example of a non-compact, non-commutative group, the 2D-Euclidean motion group, denoted SE(2).
One often-cited definition of an inverse problem is due to Keller [28], who states,

We call two problems inverses of one another if the formulation of each involves all or part of the solution of the other. Often, for historical reasons, one of the two problems has been studied extensively for some time, while the other is newer and not so well understood. In such cases the former problem is called the direct problem, while the latter is called the inverse problem.
Another definition, perhaps more illuminating, is attributed to Alifanov (and quoted by Woodbury [50]), who said,
Solution of an inverse problem entails determining unknown causes based on observation of their effects. This is in contrast to the corresponding direct problem, whose solution involves finding effects based on a complete description of their causes.
The general formulation for an inverse problem will be written

\[ d = Af, \qquad (1.1.1) \]

where the objective is to estimate the unknown f in some manner based on observations d. Hadamard [21], circa 1923, provided the standard definition of an ill-posed problem. He defined a problem as well-posed if it satisfies the following three conditions:
1. there exists a solution,
2. the solution is unique and
3. the solution depends continuously on the data,
and ill-posed if the problem fails any one of those conditions. Conditions 1 and 2 are equivalent to saying that A^{-1}, as in (1.1.1), is well defined and its domain is all of the data space.
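As a concrete illustration of condition 3 failing, consider a small system d = Af built on the Hilbert matrix, a classical ill-conditioned operator. This numerical sketch is not from the thesis; the matrix, the perturbation size and the factor in the assertion are chosen purely for illustration.

```python
import numpy as np

# Condition 3 failing in practice: the Hilbert matrix is invertible, so
# conditions 1 and 2 hold, but the solution does not depend stably on the
# data -- a tiny perturbation of d produces a much larger change in f.
n = 10
A = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
f = np.ones(n)
d = A @ f

d_pert = d + 1e-10 * np.ones(n)      # tiny perturbation of the data
f_pert = np.linalg.solve(A, d_pert)  # recovered cause from perturbed effect

# The change in the recovered f dwarfs the change in the data d.
assert np.linalg.norm(f_pert - f) > 1e2 * np.linalg.norm(d_pert - d)
```

The amplification factor here is governed by the condition number of A (about 1.6e13 for the 10 x 10 Hilbert matrix), which is exactly what regularization, discussed in Section 2.4, is designed to control.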
Convolution of densities is a well known operation, defined here as

\[ h(y) = (k * f)(y) = \int_G k(x)\, f(x^{-1} y)\, dx, \qquad (1.1.2) \]

where x, y are members of the group G and dx is the measure on the group (since the group operation is multiplication). The intent is to 'untangle' the operators k and f: estimate f and the associated error, while assuming the only information available from h is a random sample of data from that density. The restrictions or conditions on the distortion operator k are of major significance in determining the relative difficulty of solving the deconvolution problem.
Specifically, this dissertation addresses the problem of finding an empirical estimator and the asymptotic upper bounds of the mean integrated squared error (MISE) for a deconvolution problem where k is known, h is measured in some sense, f is the unknown density to be estimated, and all of these densities are in the set of square integrable functions, L²(SE(2)), over the Euclidean motion group, SE(2).
1.2 Dissertation Outline
The dissertation is separated into five chapters. The first chapter contains a brief literature review and organizational information. Chapter 2 provides the technical background required for a complete understanding of the results. The results, theorems and sketch proofs can be found in the third chapter. Simulation results are presented in Chapter 4. Finally, a discussion containing remarks on the results, potential applications as well as suggestions for future work is presented in Chapter 5. The full proofs are restricted to Appendix A. Appendix B contains the simulation code.
1.3 Literature Review
The literature on inverse problems, on regularization methods and on applications of inverse problems is vast. A thorough review of the entire field is not possible; instead, a review of the literature directly related to the research at hand will be given. For surveys on inverse problems, refer to O'Sullivan [40], Evans and Stark [17], Nychka and Cox [39] and Kaipio and Somersalo [27]. In particular, although an early review, O'Sullivan [40] provides an overview of inverse problems considered from a statistical perspective. More recently, overviews of certain aspects of inverse problems include Cavalier [6] on non-parametric estimation and Bissantz and Holzmann [4] on statistical inference. Additionally, Bissantz et al. [2] and Tenorio [45] provide an updated look at regularization methods for general inverse problems.
The classical framework for inverse problems is that of a linear inverse problem between two Hilbert spaces, but many systematic accounts of ill-posed problems are restricted to compact intervals in R, mainly to exploit the properties of pure point spectrum and singular value decomposition; see, for example, Carroll and Hall [5], Fan [18], Diggle and Hall [14] and the references therein. Golubev [20] obtained minimax estimators over the entire real line (which is non-compact, although commutative) but did not provide many proofs; Rigollet [43] later provided complete proofs. In the non-compact setting, such as the one described in this thesis, both of these become issues. Recently, interest has developed in deconvolution and related density estimation problems on spaces possessing a Riemannian structure. Hendriks [25], van Rooij and Ruymgaart [48], Healy, Hendriks and Kim [24], Kim [29] and Kim and Koo [31] considered these problems, studying deconvolution problems on spheres, orthogonal groups and certain classes of manifolds. The work of those authors constitutes extensions, to a non-Euclidean setting, of results developed in the Euclidean case. This approach is similar in some sense to a line of research that approaches the indirect inverse problem through square integrable minimax theory (Pinsker [42], Ibragimov and Khasminski [26], Belister and Levit [1], Klemela [32], Efromovich [16] and Cavalier and Tsybakov [7]). Here, the approach is to treat the problem as a constrained optimization problem where the exact asymptotic minimax bounds can be found by solving the boundary conditions.
Regularization techniques have received significant interest for some time, dating at least as far back as the seminal research by Tikhonov [46] and the monograph by Tikhonov and Arsenin [47]. More recent contributions to the general theory of regularization include, for example, Cox [13], Bissantz, Hohage and Munk [2], Bissantz et al. [3] and Lu, Pereverzev and Ramlau [38].
Research specifically regarding deconvolution on SE(2) includes work by Chirikjian and Kyatkin [11, 10], Kyatkin and Chirikjian [33, 34, 35, 36, 37], Chirikjian and Wang [12], Wang and Chirikjian [49] and Chirikjian and Ebert-Uphoff [9], which primarily looks at applications in robotics, including workspace generation, but also includes some more general work on efficient algorithms for Fourier analysis on SE(2). In addition, Yarman and Yazici [51, 52, 54, 53, 55] and Yazici [56] have produced a number of results looking at computational aspects of deconvolution on SE(2), and particularly at the inversion of the Radon transform and exponential Radon transform when expressed as a deconvolution over SE(2). There are some examples of applied research where the data is considered as drawn from SE(2). These include image analysis problems in Duits et al. [15] and diffusion-based models in Park et al. [41].
Chapter 2
Technical Background
This chapter provides the technical background required to support the results. Much of the material referring to Fourier analysis on motion groups and related topics can be found in greater detail in [11, 44]. In particular, [11] is an excellent resource for a straightforward coverage of representations and Fourier analysis on motion groups.
This chapter is divided into three main subsections, the first describing the
2D-Euclidean motion group, the second building the theory of irreducible unitary
representations on the motion group and the third discussing Fourier analysis. In
addition, there are subsections proving some of the important results of the Fourier
transform (the convolution property and Parseval's equality) as well as a subsection
proving the irreducibility of the representations.
2.1 The 2D-Euclidean Motion Group

The Euclidean motion group on 2 dimensions, denoted SE(2), is the simplest example of a non-compact, non-commutative Lie group. It has a well defined representation suitable for Fourier analysis. SE(2) is defined to be the semi-direct product of the rotation group, SO(2), and the additive group, R². An element g ∈ SE(2) will be written as g = (R_θ, r), where R_θ ∈ SO(2) and r ∈ R². In this form, g^{-1} = (R_θ', -R_θ' r), where A' denotes the transpose of a matrix or vector A. The identity element of the group is e = (I₂, 0), I₂ being the identity rotation. The group operation is matrix multiplication and can be written (R_θ, r)(R_φ, x) = (R_θ R_φ, r + R_θ x). Each element g ∈ SE(2) can be parameterized in either rectangular or polar coordinates as

\[
g(\theta, r_1, r_2) = \begin{pmatrix} \cos\theta & -\sin\theta & r_1 \\ \sin\theta & \cos\theta & r_2 \\ 0 & 0 & 1 \end{pmatrix} \qquad (2.1.1)
\]

or

\[
g(\theta, \phi, r) = \begin{pmatrix} \cos\theta & -\sin\theta & r\cos\phi \\ \sin\theta & \cos\theta & r\sin\phi \\ 0 & 0 & 1 \end{pmatrix}, \qquad (2.1.2)
\]

respectively, with r = |r| = |(r₁, r₂)|. Geometrically, SE(2) can be thought of as the set of all translations and rotations, or rigid motions, on the plane. Intuitively, the set of rigid motions describes the position and orientation of a rigid body: with respect to a 'home' position (called an active point of view), any subsequent position can be described by the transformation g that moves a rigid body from 'home' to the new position. It is evident by inspection that the group is non-commutative. The lack of compactness is inherited from R² and is similarly obvious. Finally, note that since SE(2) is a Lie group, and since it is a subgroup of the real 3 × 3 matrices, it is a simple matter to define the Lie algebra via the matrix exponential map. Formally, denote by se(2) the Lie algebra of SE(2), so that se(2) = R² ⊕ so(2), a vector space sum, where so(2) is the Lie algebra of SO(2). Using the matrix exponential map, se(2) consists of all those matrices X such that exp(tX) ∈ SE(2) for all real numbers t. The next section introduces the class of representations needed to define both the Fourier transform and inverse Fourier transform on SE(2).
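The group operations just described are easy to experiment with numerically in the homogeneous-matrix form (2.1.1). The sketch below is not part of the thesis; the function name and test values are illustrative. It checks the inverse formula g^{-1} = (R_θ', -R_θ' r) and the non-commutativity of SE(2):

```python
import numpy as np

def se2(theta, r1, r2):
    # Homogeneous-matrix form (2.1.1) of g = (R_theta, r) in SE(2).
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, r1],
                     [s,  c, r2],
                     [0.0, 0.0, 1.0]])

g = se2(0.5, 1.0, -2.0)
h = se2(-1.2, 0.3, 0.7)

# The group operation is matrix multiplication; the inverse of
# g = (R, r) is (R', -R'r), built here block by block.
R, r = g[:2, :2], g[:2, 2]
g_inv = np.eye(3)
g_inv[:2, :2] = R.T
g_inv[:2, 2] = -R.T @ r

assert np.allclose(g @ g_inv, np.eye(3))   # g g^{-1} = e
assert not np.allclose(g @ h, h @ g)       # SE(2) is non-commutative
```

The translation parts of gh and hg differ even though the rotation parts agree, which is exactly where the non-commutativity of the semi-direct product lives.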
2.2 Irreducible Unitary Representations
The irreducible unitary representations are vital to the definition and properties of the Fourier transform over SE(2). A representation is a continuous mapping that sends each element of the group into a continuous linear operator that acts on some vector space and which preserves the group product. By definition, an irreducible representation is one that leaves no proper subspaces invariant. The final requirement is that the representation is unitary. The condition for an operator A to be unitary is that

\[ A^* A = A A^* = I, \qquad (2.2.1) \]

where I is the identity operator. Because SE(2) is non-compact, these representations are infinite-dimensional. The collection of inequivalent irreducible unitary representations is denoted by {U(·, p)}, characterized by a real number p ∈ [0, ∞). The irreducible unitary representations for SE(2) are defined by

\[ U(g, p)\varphi(\mathbf{x}) = e^{-ip\,\mathbf{r}\cdot\mathbf{x}}\, \varphi(R_\theta^T \mathbf{x}) \qquad (2.2.2) \]

for each g = (R_θ, r) ∈ SE(2), p ∈ [0, ∞) (x is a unit vector) and φ an L²(S¹) function. Let g(A, a), h(R, r) ∈ SE(2) and φ ∈ L²(S¹); then it is simple to see that the group representation observes the group homomorphism property,

\[
(U(g(A,\mathbf{a}),p)\, U(h(R,\mathbf{r}),p)\,\varphi)(\mathbf{x})
= (U(g,p)(U(h,p)\varphi))(\mathbf{x})
= (U(g,p)\,\varphi_h)(\mathbf{x})
= e^{-ip\,\mathbf{a}\cdot\mathbf{x}}\, \varphi_h(A^T\mathbf{x})
= e^{-ip(\mathbf{a}+A\mathbf{r})\cdot\mathbf{x}}\, \varphi((AR)^T\mathbf{x})
= (U(g \circ h, p)\varphi)(\mathbf{x}),
\]

where φ_h = U(h, p)φ. Note that since x is a unit vector, the function φ(x) = φ(cos ψ, sin ψ) = φ(ψ); hence there is no need to distinguish between φ(x) and φ(ψ).
In general, representations can be expressed as unitary operators in a basis for the underlying vector space. In order to represent U(g, p) as a matrix, note that any function in L²(S¹) can be expressed as a Fourier series of orthonormal basis functions. Hence the matrix elements of the operator U(g, p), denoted by u_{ℓm}(g, p), will be represented with respect to the basis underlying the expansion φ(ψ) = Σ_{k∈Z} c_k e^{ikψ}, c_k ∈ C, as

\[
u_{\ell m}(g,p) = \left( e^{i\ell\psi},\, U(g,p)\, e^{im\psi} \right)
= \frac{1}{2\pi}\int_0^{2\pi} e^{-i\ell\psi}\, e^{-i(r_1 p\cos\psi + r_2 p\sin\psi)}\, e^{im(\psi-\theta)}\, d\psi \qquad (2.2.3)
\]

for all ℓ, m ∈ Z, where the inner product is (φ₁, φ₂) = (1/2π) ∫₀^{2π} \overline{φ₁(ψ)} φ₂(ψ) dψ. The matrix elements of this representation can also be expressed in polar coordinates as

\[
u_{\ell m}(g(\theta,\phi,r),p) = i^{m-\ell}\, e^{-i(m\theta + (\ell-m)\phi)}\, J_{\ell-m}(pr), \qquad (2.2.4)
\]

where J_ν(x) is the ν-th order Bessel function. It is then straightforward to write down

\[
\overline{u_{\ell m}(g(\theta,\phi,r),p)} = i^{\ell-m}\, e^{i(m\theta + (\ell-m)\phi)}\, J_{\ell-m}(pr). \qquad (2.2.5)
\]

Henceforth, no distinction will be made between the operator U(g, p) and the corresponding infinite dimensional matrix with elements u_{ℓm}(g, p). The irreducible unitary representations satisfy a number of important symmetry relations. These are briefly mentioned here.
1. This property is used often in the proofs and results from the fact that the representations are unitary operators,

\[ U(g,p) = U(R_\theta, \mathbf{r}, p) = U(I_2, \mathbf{r}, p)\, U(R_\theta, \mathbf{0}, p). \qquad (2.2.6) \]

2. An orthogonality relation,

\[
\int_{SE(2)} u_{\ell_1 m_1}(g, p_1)\, \overline{u_{\ell_2 m_2}(g, p_2)}\, d(g) = \frac{4\pi^2}{p_2}\, \delta_{\ell_1 \ell_2}\, \delta_{m_1 m_2}\, \delta(p_1 - p_2), \qquad (2.2.7)
\]

where δ(p₁ − p₂) is the Dirac delta, given (informally) by δ(x) = +∞ if x = 0 and 0 otherwise, and the Kronecker delta is δ_{ij} = 1 if i = j and 0 otherwise.

3. Additionally, it can be written

\[ \overline{u_{\ell m}(g,p)} = (-1)^{\ell-m}\, u_{-\ell,-m}(g,p), \qquad (2.2.8) \]

\[ u_{\ell m}(g(\theta,\phi,-a),p) = u_{\ell m}(g(\theta,\phi\pm\pi,a),p) = (-1)^{\ell-m}\, u_{\ell m}(g(\theta,\phi,a),p) \qquad (2.2.9) \]

and

\[ (-1)^{\ell-m}\, u_{\ell m}(g(\theta, \theta-\phi, a), p) = u_{m\ell}(g(\theta,\phi,a),p). \qquad (2.2.10) \]

4. Finally, note that the collection

\[ \{\, u_{\ell m}(\cdot, p) \mid \ell, m \in \mathbb{Z},\; p \in \mathbb{R}^+ \,\} \qquad (2.2.11) \]

forms a complete orthonormal basis for L²(SE(2)).
Section 2.3 will demonstrate that Fourier analysis on SE(2) depends on the properties of the representations, U(g, p), and on the fact that the matrix elements form a complete orthonormal basis for L²(SE(2)).
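The closed form of the matrix elements can be checked directly against the defining integral. The sketch below is illustrative and not from the thesis: it evaluates (2.2.3) on a uniform grid, which is spectrally accurate for periodic integrands, and compares the result with a Bessel-function closed form of the type in (2.2.4), rederived here via the Jacobi-Anger expansion.

```python
import numpy as np
from scipy.special import jv  # Bessel function of the first kind J_nu

def u_numeric(l, m, theta, phi, r, p, N=1024):
    # Matrix element u_lm(g,p) from the defining integral (2.2.3),
    # discretized on a uniform grid over [0, 2*pi).
    psi = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
    r1, r2 = r * np.cos(phi), r * np.sin(phi)
    integrand = (np.exp(-1j * l * psi)
                 * np.exp(-1j * p * (r1 * np.cos(psi) + r2 * np.sin(psi)))
                 * np.exp(1j * m * (psi - theta)))
    return integrand.mean()          # mean = (1/2pi) * integral

def u_closed(l, m, theta, phi, r, p):
    # Closed form: i^{m-l} e^{-i(m*theta + (l-m)*phi)} J_{l-m}(p*r),
    # obtained by expanding e^{-i p r cos(psi - phi)} in Bessel functions.
    return (1j ** (m - l)
            * np.exp(-1j * (m * theta + (l - m) * phi))
            * jv(l - m, p * r))

theta, phi, r, p = 0.7, 0.3, 1.2, 2.0
for l, m in [(0, 0), (1, 3), (-2, 1)]:
    assert abs(u_numeric(l, m, theta, phi, r, p)
               - u_closed(l, m, theta, phi, r, p)) < 1e-8
```

In practice the grid sum agrees with the Bessel expression to machine precision, which is a useful sanity check before building a truncated Fourier matrix out of these elements.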
2.2.1 Proof of Irreducibility
The proof of the irreducibility of the representations U(g, p) is instructive and so it is provided here; some aspects of it will be echoed later on when setting up the Sobolev condition. This proof is as in Chirikjian and Kyatkin [11], with some slight modifications to ordering and notation. The fact that these are representations over a non-compact space implies that the representation matrices for SE(2) are infinite dimensional; hence it is easier to show the irreducibility of the operator than of the representation matrices.

Let se(2) be the Lie algebra of SE(2), so that se(2) = R² ⊕ so(2), a vector space sum, where so(2) is the Lie algebra of SO(2). Further, consider the one-parameter subgroups generated by three exponentiated basis elements of the Lie algebra.
Using the basis elements X₁, X₂, X₃,

\[
X_1 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
X_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad
X_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad (2.2.12)
\]

gives

\[
g_1(t) = \exp(tX_1) = \begin{pmatrix} 1 & 0 & t \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad (2.2.13)
\]
\[
g_2(t) = \exp(tX_2) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & t \\ 0 & 0 & 1 \end{pmatrix}, \qquad (2.2.14)
\]
\[
g_3(t) = \exp(tX_3) = \begin{pmatrix} \cos t & -\sin t & 0 \\ \sin t & \cos t & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (2.2.15)
\]
The next step is to define corresponding differential operators; these will be denoted X_j^L and X_j^R, and, writing (a, φ) for the polar coordinates of the translation part, are as follows:

\[
X_1^L = \cos(\theta-\phi)\frac{\partial}{\partial a} + \frac{\sin(\theta-\phi)}{a}\frac{\partial}{\partial \phi}, \qquad (2.2.16)
\]
\[
X_2^L = -\sin(\theta-\phi)\frac{\partial}{\partial a} + \frac{\cos(\theta-\phi)}{a}\frac{\partial}{\partial \phi}, \qquad (2.2.17)
\]
\[
X_3^L = \frac{\partial}{\partial \theta}, \qquad (2.2.18)
\]

and

\[
X_1^R = \cos\phi\,\frac{\partial}{\partial a} - \frac{\sin\phi}{a}\frac{\partial}{\partial \phi}, \qquad (2.2.19)
\]
\[
X_2^R = \sin\phi\,\frac{\partial}{\partial a} + \frac{\cos\phi}{a}\frac{\partial}{\partial \phi}, \qquad (2.2.20)
\]
\[
X_3^R = \frac{\partial}{\partial \theta} + \frac{\partial}{\partial \phi}. \qquad (2.2.21)
\]
The L and R superscripts denote left and right invariance, respectively, under shifts. These operators act on functions on the group, so it follows from the definition of U(g, p) that

\[ U(g_1(t),p)\,\varphi(\psi) = e^{-ipt\cos\psi}\,\varphi(\psi), \qquad (2.2.22) \]
\[ U(g_2(t),p)\,\varphi(\psi) = e^{-ipt\sin\psi}\,\varphi(\psi), \qquad (2.2.23) \]
\[ U(g_3(t),p)\,\varphi(\psi) = \varphi(\psi - t). \qquad (2.2.24) \]
Now, differentiating with respect to t and setting t = 0 defines the operators

\[ \hat{X}_j(p)\,\varphi(\psi) = \left.\frac{d}{dt}\, U(\exp(tX_j),p)\,\varphi(\psi)\right|_{t=0}. \qquad (2.2.25) \]

Explicitly these are written

\[ \hat{X}_1(p)\,\varphi(\psi) = -ip\cos(\psi)\,\varphi(\psi), \qquad (2.2.26) \]
\[ \hat{X}_2(p)\,\varphi(\psi) = -ip\sin(\psi)\,\varphi(\psi), \qquad (2.2.27) \]
\[ \hat{X}_3(p)\,\varphi(\psi) = -\frac{d\varphi}{d\psi}. \qquad (2.2.28) \]

Finally, define the operators

\[ Y_+(p) = \hat{X}_1(p) + i\hat{X}_2(p), \quad Y_-(p) = \hat{X}_1(p) - i\hat{X}_2(p), \quad Y_3(p) = \hat{X}_3(p). \qquad (2.2.29) \]
Since the φ ∈ L²(S¹) basis elements are of the exponential form e^{ikψ}, these basis elements are transformed by the operators Y± and Y₃ as

\[ Y_+(p)\,e^{ik\psi} = -ip\,e^{i(k+1)\psi}, \quad Y_-(p)\,e^{ik\psi} = -ip\,e^{i(k-1)\psi}, \quad Y_3(p)\,e^{ik\psi} = -ik\,e^{ik\psi}. \qquad (2.2.30) \]

Notice that Y₊ and Y₋ always move basis elements to the 'adjacent' subspaces. Since no subspaces are left invariant by Y±, the representation operators, U(g, p), must be irreducible.
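The ladder action (2.2.30) can be checked pointwise on a grid. This is a small illustrative sketch, not part of the thesis; it simply applies the multiplication operators (2.2.26)-(2.2.27) to a basis function and confirms the shift to adjacent Fourier modes.

```python
import numpy as np

p, k = 1.7, 3
psi = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
phi = np.exp(1j * k * psi)               # basis element e^{ik psi}

# Operators (2.2.26)-(2.2.27) acting on e^{ik psi}.
X1 = -1j * p * np.cos(psi) * phi
X2 = -1j * p * np.sin(psi) * phi
Yp = X1 + 1j * X2                        # Y_+ = X1 + i X2
Ym = X1 - 1j * X2                        # Y_- = X1 - i X2

# Ladder action (2.2.30): Y_± shift e^{ik psi} to the adjacent subspaces.
assert np.allclose(Yp, -1j * p * np.exp(1j * (k + 1) * psi))
assert np.allclose(Ym, -1j * p * np.exp(1j * (k - 1) * psi))
```

Since -ip(cos ψ ± i sin ψ) = -ip e^{±iψ}, the check reduces to the identity e^{±iψ} e^{ikψ} = e^{i(k±1)ψ}, which is exactly why no proper subspace spanned by finitely many modes can be invariant.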
2.3 Fourier Analysis on the Euclidean Motion Group
The Fourier transform on SE(2) is analogous to the ordinary Fourier transform on the real line, given a definition for a restricted class of functions. A function f is rapidly decreasing if lim_{r→∞} r^m f = 0 for all m ∈ Z⁺, where Z⁺ is the set of positive integers.
Definition 2.3.1 The Fourier transform of a rapidly decreasing function f(g) ∈ L²(SE(2)), where g ∈ SE(2) and p ∈ [0, ∞), is defined by

\[ \hat f(p) = \int_{SE(2)} f(g)\, U(g^{-1}, p)\, d(g), \qquad (2.3.1) \]

where the superscript hat denotes the Fourier transform. The inverse transform is given as

\[ f(g) = \int_0^\infty \mathrm{tr}\left( \hat f(p)\, U(g, p) \right) p\, dp, \qquad (2.3.2) \]

with tr(·) denoting the matrix trace.

The measure d(g) is the R²-invariant, SO(2)-normalized Haar measure on SE(2). For convenience, f̂(p) will often be represented as an infinite matrix. The matrix elements of f̂ will use the matrix elements of U(g, p) as defined in (2.2.3), giving

\[ \hat f_{\ell m}(p) = \left( e^{i\ell\psi},\, \hat f(p)\, e^{im\psi} \right) = \int_{SE(2)} f(g)\, u_{\ell m}(g^{-1}, p)\, d(g), \qquad (2.3.3) \]

for all ℓ, m ∈ Z. Likewise, the inversion can be written in terms of the matrix elements as

\[ f(g) = \sum_{\ell \in \mathbb{Z}} \sum_{m \in \mathbb{Z}} \int_0^\infty \hat f_{\ell m}(p)\, u_{m\ell}(g, p)\, p\, dp. \qquad (2.3.4) \]

The properties and existence of the inverse transform depend primarily on the unitary and irreducibility properties of U(g, p). Note that equality must be interpreted correctly, as it may not hold pointwise.
Recall the definition of the Fourier transform in terms of the matrix elements (2.3.3) and let f, f₁, f₂ ∈ L²(SE(2)); then the following properties hold:

1. the adjoint property: \widehat{f^*}(p) = \hat f(p)^\dagger, where f^*(g) = \overline{f(g^{-1})} and † denotes the conjugate transpose;

2. the convolution property: (\widehat{f_1 * f_2})_{\ell m}(p) = \sum_j \hat f_{2,\ell j}(p)\, \hat f_{1,jm}(p), also written as \widehat{f_1 * f_2}(p) = \hat f_2(p)\, \hat f_1(p);

3. Plancherel (Parseval): \int_{SE(2)} |f(g)|^2\, d(g) = \int_0^\infty \| \hat f(p) \|_{tr}^2\, p\, dp.

The norm ||A||²_tr is the square of the Hilbert-Schmidt norm, which in this context is tr(AA*). Equality in the Plancherel property must be interpreted correctly: it may reflect an isometry between the two quantities, as opposed to a strict pointwise equality. With respect to Plancherel, note the identities |f̂_{ℓm}|² = \overline{\hat f_{\ell m}}\, \hat f_{\ell m} and ||f̂||²_tr = tr(f̂^† f̂); then

\[
\int_{SE(2)} |f(g)|^2\, d(g) = \int_0^\infty \|\hat f(p)\|_{tr}^2\, p\, dp = \int_0^\infty \sum_{\ell, m \in \mathbb{Z}} \overline{\hat f_{\ell m}(p)}\, \hat f_{\ell m}(p)\, p\, dp.
\]
For completeness, note that the right regular representation, as opposed to the left regular representation, is in use, although it can be written either way. What follows are proofs of the convolution theorem and of the Plancherel equality. These have been slightly re-worked to conform to the notation and methods used in this dissertation, but attribution is to Chirikjian and Kyatkin [11].
2.3.1 Proof of Convolution Theorem
Let f₁, f₂ ∈ L²(SE(2)) and g, y ∈ SE(2). Then, given

\[ (f_1 * f_2)(g) = \int_{SE(2)} f_1(y)\, f_2(y^{-1} \circ g)\, d(y), \qquad (2.3.5) \]

applying the Fourier transform gives

\[
\widehat{f_1 * f_2}(p) = \int_{SE(2)} \left( \int_{SE(2)} f_1(y)\, f_2(y^{-1}\circ g)\, d(y) \right) U(g^{-1},p)\, d(g)
= \int_{SE(2)} \left( \int_{SE(2)} f_2(y^{-1}\circ g)\, U(g^{-1},p)\, d(g) \right) f_1(y)\, d(y) \qquad (2.3.6)
\]

upon switching the order of integration. Using the fact that the Haar measure on SE(2) is both left and right invariant gives

\[ \int_{SE(2)} f(y\circ g)\, d(g) = \int_{SE(2)} f(g \circ y)\, d(g) = \int_{SE(2)} f(g^{-1})\, d(g) = \int_{SE(2)} f(g)\, d(g) \qquad (2.3.7) \]

for any function f ∈ L²(SE(2)). Thus the inner integral of (2.3.6) can be written as

\[ \int_{SE(2)} f_2\big(y^{-1}\circ(y\circ g)\big)\, U\big((y\circ g)^{-1}, p\big)\, d(g) = \int_{SE(2)} f_2(g)\, U(g^{-1}\circ y^{-1}, p)\, d(g). \qquad (2.3.8) \]

Using the fact that U(g, p) is a representation on SE(2) and has the property that

\[ U(g^{-1}\circ y^{-1}, p) = U(g^{-1},p)\, U(y^{-1},p) \qquad (2.3.9) \]

allows

\[
\widehat{f_1 * f_2}(p) = \int_{SE(2)} \left( \int_{SE(2)} f_2(g)\, U(g^{-1},p)\, U(y^{-1},p)\, d(g) \right) f_1(y)\, d(y)
= \left( \int_{SE(2)} f_2(g)\, U(g^{-1},p)\, d(g) \right) \left( \int_{SE(2)} f_1(y)\, U(y^{-1},p)\, d(y) \right)
= \hat f_2(p)\, \hat f_1(p),
\]

as required.
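The same transform-turns-convolution-into-product structure is easy to verify numerically on a simpler group. The sketch below is illustrative and not from the thesis: it uses the finite cyclic group Z_n, where the Fourier transform is the ordinary DFT. On this commutative group the product of transforms is scalar and the order of the factors is immaterial, whereas on SE(2) the matrix factors must be multiplied in the reversed order f̂₂(p) f̂₁(p) derived above.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 64
f1 = rng.standard_normal(n)
f2 = rng.standard_normal(n)

# Group convolution on the cyclic group Z_n (a discretized circle):
# (f1 * f2)[g] = sum_y f1[y] f2[y^{-1} g] = sum_y f1[y] f2[(g - y) mod n].
conv = np.array([sum(f1[y] * f2[(g - y) % n] for y in range(n))
                 for g in range(n)])

# Convolution property: the Fourier transform turns * into a product.
assert np.allclose(np.fft.fft(conv), np.fft.fft(f1) * np.fft.fft(f2))
```

The identity is exact up to floating-point roundoff; the SE(2) analogue replaces the scalar DFT values with the infinite matrices f̂(p) and the pointwise product with matrix multiplication.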
2.3.2 Proof of Parseval's Equality
Let g, y ∈ SE(2) and recall that e ∈ SE(2) is the identity element. Further, let f, h ∈ L²(SE(2)). Recall the adjoint property given earlier and assume that

\[ f^*(g) = \overline{f(g^{-1})} \in L^2(SE(2)) \]

for all f ∈ L²(SE(2)). Defining h through the convolution relation

\[
h(g) = (f * f^*)(g) = \int_{SE(2)} f(y)\, f^*(y^{-1}\circ g)\, d(y) = \int_{SE(2)} f(y)\, \overline{f(g^{-1}\circ y)}\, d(y),
\]

and evaluating this at the identity element g = e, gives

\[ h(e) = \int_{SE(2)} f(y)\, \overline{f(y)}\, d(y) = \int_{SE(2)} |f(y)|^2\, d(y). \qquad (2.3.10) \]

Now, consider using the inversion formula to alternately express the function h(g) as

\[ h(g) = \int_0^\infty \mathrm{tr}\big( \hat h(p)\, U(g,p) \big)\, p\, dp. \qquad (2.3.11) \]

Similarly, evaluating this at the identity element, where U(e, p) is the identity operator, gives

\[ h(e) = \int_0^\infty \mathrm{tr}\big( \hat h(p) \big)\, p\, dp = \int_0^\infty \mathrm{tr}\big( \widehat{f^*}(p)\, \hat f(p) \big)\, p\, dp, \qquad (2.3.12) \]

using the convolution property and the initial definition of h. Using the fact that SE(2) is a unimodular group (i.e. the left and right Haar measures coincide) gives

\[ \widehat{f^*}_{\ell m}(p) = \int_{SE(2)} \overline{f(g^{-1})}\, u_{\ell m}(g^{-1}, p)\, d(g) = \int_{SE(2)} \overline{f(g)}\, u_{\ell m}(g, p)\, d(g), \qquad (2.3.13) \]

since, for any function f ∈ L²(SE(2)),

\[ \int_{SE(2)} f(g^{-1})\, d(g) = \int_{SE(2)} f(g)\, d(g). \qquad (2.3.14) \]

Now write

\[ \widehat{f^*}_{\ell m}(p) = \overline{ \int_{SE(2)} f(g)\, u_{m\ell}(g^{-1},p)\, d(g) } = \overline{\hat f_{m\ell}(p)} = \big( \hat f^\dagger \big)_{\ell m}(p), \qquad (2.3.15) \]

where † is the complex conjugate transpose. This means that

\[ h(e) = \int_0^\infty \mathrm{tr}\big( \hat f^\dagger(p)\, \hat f(p) \big)\, p\, dp = \int_0^\infty \| \hat f(p) \|_2^2\, p\, dp. \qquad (2.3.16) \]

Equating (2.3.10) and (2.3.16), the two expressions for h(e), gives Parseval's equality for SE(2), as required.
2.4 Regularization Techniques
Two methods of regularization are used, namely spectral cut-off and Tikhonov regularization. Spectral cut-off is also known as truncated singular value decomposition, and Tikhonov regularization is more commonly known as ridge regression in the statistical literature. Both methods are briefly outlined.
2.4.1 Spectral Cut-off
For compact operators, spectral cut-off regularization corresponds exactly with singular value truncation. For more general operators this method of regularization relies on the Halmos version of spectral theory [23], which states that every Hermitian operator is unitarily equivalent to a multiplication. To discuss this in full detail involves introducing unnecessary notation, but briefly consider the following. A multiplication operator is defined as M_ρ φ := ρ · φ on some dense subset of L²(SE(2)). Given A such that A is unitarily equivalent to a multiplication operator,

\[ A = U^* M_\rho U, \qquad (2.4.1) \]

with U*U = UU* = I, we have A*A = U* M_ρ^* M_ρ U and

\[ \mathrm{tr}\big( (A^*A)^{-1} \big) = \mathrm{tr}\big( (U^* M_\rho^* M_\rho U)^{-1} \big) \qquad (2.4.2) \]
\[ = \mathrm{tr}\big( (M_\rho^* M_\rho)^{-1} \big) \qquad (2.4.3) \]
\[ = \sum_{i=1}^{\infty} d_i^{-1}, \qquad (2.4.4) \]

where the d_i are the diagonal elements of M_ρ^* M_ρ. This essentially allows for the diagonalization of the distortion operator in subsequent sections.
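In the finite-dimensional, compact-operator case this reduces to truncated SVD. The sketch below is illustrative and not from the thesis; the operator, noise level and cut-off T are invented for the demonstration. It inverts only the singular values above T and compares with the naive inverse on noisy data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 0.7 ** np.arange(n)                  # rapidly decaying spectrum
A = U @ np.diag(s) @ V.T                 # ill-conditioned forward operator

f_true = np.cos(np.linspace(0.0, 2.0 * np.pi, n))
d = A @ f_true + 1e-3 * rng.standard_normal(n)   # noisy data

# Spectral cut-off: keep singular values above the threshold T, zero the
# rest, then invert (truncated singular value decomposition).
Uf, sf, Vt = np.linalg.svd(A)
T = 1e-2
s_inv = np.where(sf > T, 1.0 / sf, 0.0)
f_cut = Vt.T @ (s_inv * (Uf.T @ d))

# The naive inverse amplifies noise through the small singular values.
f_naive = np.linalg.solve(A, d)
assert (np.linalg.norm(f_cut - f_true)
        < np.linalg.norm(f_naive - f_true))
```

The threshold trades a truncation bias (the discarded components of f) against noise amplification of order 1/T; the theoretical chapters study exactly this trade-off for the SE(2) deconvolution estimator.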
2.4.2 Tikhonov Regularization
Given a system of linear equations Af = d, where A is an n × p matrix, f is a p-vector and d is an n-vector, Tikhonov regularization focuses on an estimator of the form

\[ f_\delta = (A^* A + \delta I)^{-1} A^* d. \qquad (2.4.5) \]

The parameter δ > 0 is known as the regularization parameter. For a well conditioned approximation problem Af ≈ d, the residual ||Af − d||² is minimized by the choice of estimator f = (A*A)^{-1} A* d. Note that, as in the case of ill-posed inverse problems, if A is rank deficient or ill-conditioned, then the least squares estimator does not exist or is useless. Since A*A is symmetric and positive semi-definite, the matrix A*A + σ²I has eigenvalues in [σ², σ² + ||A||²]. Hence the condition number has an upper bound of (σ² + ||A||²)/σ², which becomes smaller as σ increases. This gives the formula (2.4.5) with δ = σ². This formula was originally derived in [46] by solving the modified least squares problem

\[ \min_f\; \| Af - d \|^2 + \sigma^2 \| f \|^2. \qquad (2.4.6) \]

The extension to f and d being square matrices, as in this thesis, is straightforward.
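A numerical sketch of (2.4.5) on an invented ill-conditioned system (illustrative, not from the thesis; the operator, noise level and δ are arbitrary choices) shows the stabilizing effect relative to the unregularized least squares solution:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(0.7 ** np.arange(n)) @ V.T   # decaying singular values

f_true = np.sin(np.linspace(0.0, np.pi, n))
d = A @ f_true + 1e-3 * rng.standard_normal(n)

# Tikhonov estimator (2.4.5): f_delta = (A'A + delta I)^{-1} A'd.
delta = 1e-4
f_tik = np.linalg.solve(A.T @ A + delta * np.eye(n), A.T @ d)

# Unregularized least squares solves the same normal equations with
# delta = 0 and is swamped by amplified noise.
f_ls = np.linalg.solve(A.T @ A, A.T @ d)

err_tik = np.linalg.norm(f_tik - f_true)
err_ls = np.linalg.norm(f_ls - f_true)
assert err_tik < err_ls
```

Adding δI floors the eigenvalues of A*A at δ, which caps the noise amplification at roughly 1/(2√δ) at the cost of a bias on the components of f associated with small singular values; choosing δ is the statistical problem addressed in Chapter 3.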
Chapter 3
Theoretical Results
This chapter describes the main results of the research undertaken for this dissertation. The chapter is separated into three sections, of which the second contains the majority of the new results. The first section sets up the statistical problem and describes the restrictive conditions on the problem. Some remarks are collected in the final section.
3.1 Deconvolution Density Estimation
3.1.1 Conditions and Assumptions
The first stage is to make appropriate restrictions on the parameter space, and, as is common in inverse problems, a Sobolev-type condition is used. In a similar way to Section 2.2.1, the Riemannian structure is used to define the Laplacian on SE(2). In particular, using differential operators X_i, the general Laplacian can be written as

\[ \Delta = \sum_{i,j} a_{ij}\, X_i X_j + \sum_j b_j\, X_j, \]

giving the class of functions. The choice made here is to let b_j = 0 for all j and let a_{ij} = δ_{ij}. Let se(2) be the Lie algebra of SE(2) as in Section 2.2.1. The following matrices are chosen as a basis for se(2), noting that X₂ and X₃ satisfy the Hörmander bracket condition:

\[
X_1 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
X_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad
X_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \qquad (3.1.1)
\]
For X ∈ se(2), consider the one-parameter subgroup exp(tX) of SE(2), where exp : se(2) → SE(2) is the exponential map. The left invariant vector field on SE(2) can now be defined by

\[ X^L f(g) = \left.\frac{d}{dt}\, f\big(g \exp(tX)\big)\right|_{t=0}, \]

where f : SE(2) → R. Thus, with respect to the basis (3.1.1),

\[
\exp(tX_1) = \begin{pmatrix} 1 & 0 & t \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
\exp(tX_2) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & t \\ 0 & 0 & 1 \end{pmatrix}, \quad
\exp(tX_3) = \begin{pmatrix} \cos t & -\sin t & 0 \\ \sin t & \cos t & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]
for t ∈ R. In polar coordinates (2.1.2), the left invariant vector fields are

\[
X_1^L = \cos(\theta-\phi)\frac{\partial}{\partial r} + \frac{\sin(\theta-\phi)}{r}\frac{\partial}{\partial \phi}, \quad
X_2^L = -\sin(\theta-\phi)\frac{\partial}{\partial r} + \frac{\cos(\theta-\phi)}{r}\frac{\partial}{\partial \phi}, \quad
X_3^L = \frac{\partial}{\partial \theta},
\]

and in rectangular coordinates (2.1.1),

\[
X_1^L = \cos\theta\,\frac{\partial}{\partial r_1} + \sin\theta\,\frac{\partial}{\partial r_2}, \quad
X_2^L = -\sin\theta\,\frac{\partial}{\partial r_1} + \cos\theta\,\frac{\partial}{\partial r_2}, \quad
X_3^L = \frac{\partial}{\partial \theta}.
\]
The Laplacian on SE(2) is (note the superscript L has been dropped and terms reordered)

\[
\Delta = -\left( X_3^2 + X_1^2 + X_2^2 \right) = -\left( \frac{\partial^2}{\partial\theta^2} + \frac{\partial^2}{\partial r_2^2} + \frac{\partial^2}{\partial r_1^2} \right).
\]

The eigenfunctions of the Laplacian are given by the u_{ℓm}(g, p) in (2.2.11), which have eigenvalues (m² + p²), m ∈ Z, p ∈ [0, ∞). This is easily demonstrated, here in rectangular coordinates.
d2 d2 d2
Au£m(g,p) = --Q^uem(g,p) + —-^uem(g,p) + ~-—^uem(g,p)
1 2TT~
2?r d_ Id
\e~i£^ g - ^ i P c o s ' / ' + ^ p s m v ) eirn(ip-6) t_lrn\\ rfjj
+ 1 ^ d r
2W0 ~In \-Hip — i(rip cos ip+r2p simp) /
e *e -tpcostp)e im{ip—8) dljj
2TT
+ — / \e-me-i{ripcO^+r2pSm^)f_lpsiri^ym{i,-0)l ^
2vr 70 dr2
1 f2n
27T 0
1 2TT
+ — / e - l ^e~ i ( r i p c o s ¥ ' + r 2 p s i n ^(- l ) ( -zpcosV) 2 e i m ( ' / , " 9 ) ^ 2TT7O
1 f27r + — / e-^e- i ( r i p c o s^+ r 2 p s i n^ )(-l)(-2psin^)2e i m ( ' / ' -9 )d'(/;
2TT JO
= ( ( - l ) ( -zm) 2 + (-l)(-zpcost/;)2 + (-l)(-zpsin^)2)u< m(p,p)
= (m2 +p2)uem(g,p).
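The eigenvalue relation can also be checked numerically. The sketch below is a hypothetical implementation, assuming the matrix elements u_{ℓm}(g, p) have the integral form used in the display above; it approximates u_{ℓm} by a periodic trapezoid rule and the Laplacian by central differences.

```python
import numpy as np

def u(theta, r1, r2, p, l, m, N=512):
    """Matrix element u_{lm}(g,p), per (2.2.11), by the trapezoid rule in psi."""
    psi = 2 * np.pi * np.arange(N) / N
    integrand = np.exp(-1j * l * psi
                       - 1j * p * (r1 * np.cos(psi) + r2 * np.sin(psi))
                       + 1j * m * (psi - theta))
    return integrand.mean()

def d2(f, x, h):
    """Second-order central difference approximation to f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

# Check Delta u = (m^2 + p^2) u at a generic point.
l, m, p = 2, -1, 1.3
theta0, r1_0, r2_0 = 0.4, 0.2, -0.7
h = 1e-3
u0 = u(theta0, r1_0, r2_0, p, l, m)
lap = -(d2(lambda t: u(t, r1_0, r2_0, p, l, m), theta0, h)
        + d2(lambda a: u(theta0, a, r2_0, p, l, m), r1_0, h)
        + d2(lambda b: u(theta0, r1_0, b, p, l, m), r2_0, h))
assert abs(lap - (m**2 + p**2) * u0) < 1e-4
```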
Define the adjoint of the Laplace operator, Δ*, by

Δ* u_{ℓm}(g, p) = Δ u_{mℓ}(g, p),                                  (3.1.2)

for g ∈ SE(2), p ∈ [0, ∞) and ℓ, m ∈ Z. Note that, by the eigenvalue computation above,

Δ* u_{ℓm}(g, p) = Δ u_{mℓ}(g, p) = (ℓ² + p²) u_{mℓ}(g, p),          (3.1.3)

so that Δ* acts on the index ℓ with eigenvalue (ℓ² + p²); this provides the necessary result for the Sobolev condition below. Recall that the set of all matrix elements, defined by (2.2.11), is a complete orthonormal basis for L²(SE(2)). Then the Sobolev condition with respect to the operator 1 + Δ* + Δ, specifying a radius Q > 0, is

Θ(s, Q) = { f : ∫_0^∞ Σ_{ℓ,m=−∞}^∞ (1 + ℓ² + m² + 2p²)^s |f̂_{ℓm}(p)|² p dp ≤ Q },      (3.1.4)

where s > 3/2 and f ∈ L²(SE(2)). Note this embeds the implicit assumption that (1 + ℓ² + m² + 2p²)^{−s} ≪ T^{−2s} when |ℓ|, |m| > T or p > T, and note also that Sobolev functions are rapidly decreasing.
Two major considerations for implementation are, first, reducing the infinite-dimensional problem to a finite-dimensional approximation and, second, addressing the ill-posedness arising from non-invertibility of the known distortion operator. Both can be addressed in the following way. For an operator A acting on a countable Hilbert space with matrix A = (a_{ij})_{i,j∈Z} in some basis, the compression, for some T > 0, is denoted by A_T = (a_{ij})_{|i|,|j|≤T}. Further degrees of ill-posedness in the distortion operator will be handled by Tikhonov regularization.
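A minimal sketch of the compression operation, with the doubly infinite matrix modeled by a callable on Z × Z (the names and the example operator are illustrative only, not from the text):

```python
import numpy as np

def compress(a, T):
    """Compression A_T = (a_ij)_{|i|,|j| <= T} of a doubly infinite matrix.

    `a` is a callable a(i, j) giving the (i, j) entry for i, j in Z; the
    result is a (2T+1) x (2T+1) array with rows/columns ordered -T, ..., T.
    """
    idx = np.arange(-T, T + 1)
    return np.array([[a(i, j) for j in idx] for i in idx])

# Example: a diagonal operator with polynomially decaying symbol.
a = lambda i, j: 1.0 / (1 + i * i) if i == j else 0.0
A2 = compress(a, 2)
assert A2.shape == (5, 5)
assert A2[2, 2] == 1.0   # the (0, 0) entry of the infinite matrix
```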
3.1.2 Statistical Model
Consider random SE(2) elements g_X, g_Y and g_Z with densities f, h and k respectively, assuming also that g_Z and g_X are independent. Then, if g_X is observed indirectly via

g_Y = g_Z g_X,                                                    (3.1.5)

the relationship among the densities is given by convolution, h = k * f. The density of interest is the unknown density f, which is related to the observations h by the model

h(g) = (k * f)(g),   g ∈ SE(2),                                    (3.1.6)
with f, h, k ∈ Θ(s, Q) and k a known density. Taking the Fourier transform of (3.1.6) results in operators that can be represented as infinite matrices, indexed by the positive real number p,

ĥ(p) = f̂(p) k̂(p).                                                 (3.1.7)

Note that the matrix elements of the Fourier transform ĥ(p) can be written in terms of the matrix elements of f̂ and k̂,

ĥ_{ℓm}(p) = Σ_{q=−∞}^∞ f̂_{ℓq}(p) k̂_{qm}(p),                        (3.1.8)

or, using the matrix elements of the irreducible unitary representations (2.2.3) and the definition of the Fourier transform (2.3.1), as

ĥ_{ℓm}(p) = ∫_{SE(2)} h(g) u_{ℓm}(g^{−1}, p) d(g).                  (3.1.9)

Subsequently, using the definition of the inverse Fourier transform (2.3.2), the function h(g) is written in terms of matrix elements and its Fourier coefficients,

h(g) = Σ_{ℓ=−∞}^∞ Σ_{m=−∞}^∞ ∫_0^∞ ĥ_{ℓm}(p) u_{mℓ}(g, p) p dp.     (3.1.10)

The assumption is made that a random sample is available, so given observations g_1, …, g_n = {g_j}_{j=1}^n ∈ SE(2), the matrix elements of the empirical Fourier transform, ĥ^n, can be equated with the elements of the empirical characteristic function (see [29, 19] for examples),

ĥ^n_{ℓm}(p) = (1/n) Σ_{j=1}^n u_{ℓm}(g_j^{−1}, p),                  (3.1.11)
for all |ℓ|, |m| ≤ T and p ∈ [0, T]. Applying the inverse Fourier transform, and using the above as the empirical Fourier coefficients, gives the empirical density function h^n(g),

h^n(g) = ∫_0^T Σ_{|ℓ|≤T} Σ_{|m|≤T} ĥ^n_{ℓm}(p) u_{mℓ}(g, p) p dp                    (3.1.12)
       = ∫_0^T Σ_{|ℓ|≤T} Σ_{|m|≤T} (1/n) Σ_{j=1}^n u_{ℓm}(g_j^{−1}, p) u_{mℓ}(g, p) p dp.   (3.1.13)
Formally, the inversion model isolating f̂ would be given by

f̂(p) = ĥ(p) k̂^{−1}(p).                                            (3.1.14)

However, two issues arise here. First, since the problems under consideration are ill-posed, where it is assumed that the full (infinite) operator k̂ is not invertible, the compression operator is invoked, with spectral cut-off parameter T, giving

f̂_T(p) = ĥ_T(p) k̂_T^{−1}(p).                                       (3.1.15)

Additionally, the values of ĥ_T are not known, since the coefficients ĥ_{ℓm}(p) are unknown, so the empirical version ĥ^n must be substituted. Applying the inverse Fourier transform then gives the empirical density estimator f^n(g) for the unknown density of interest,

f^n(g) = ∫_0^T tr{ (ĥ^n(p) k̂_T^{−1}(p)) U(g^{−1}, p) } p dp,         (3.1.16)
assuming invertibility of k̂_T(p). This is the first case. Consider now that the matrix k̂_T(p) is singular or near singular. In this case, a second regularization technique to handle the eigenvalues at or near zero must be implemented. In this situation, the formal model for f̂_T would be given by

f̂_T^ν(p) = ĥ_T(p) k̂_T^ν(p)                                          (3.1.17)
         = ĥ_T(p) k̂*_T(p) ( k̂_T(p) k̂*_T(p) + ν I_{2T+1} )^{−1},      (3.1.18)

where I_{2T+1} is the (2T+1) × (2T+1) identity matrix. Again, in practice ĥ must be replaced by the empirical version. In both cases, one may write the empirical Fourier coefficients f̂^n_{ℓm} in terms of matrix elements,

f̂^n_{ℓm}(p) = Σ_{|q|≤T} ĥ^n_{ℓq}(p) (k̂_T^{−1})_{qm}(p),              (3.1.19)

for |ℓ|, |m| ≤ T, assuming invertibility of the operator k̂_T. The empirical density estimator f^n is given as

f^n(g) = Σ_{|ℓ|≤T} Σ_{|m|≤T} ∫_0^T f̂^n_{ℓm}(p) u_{mℓ}(g, p) p dp,    (3.1.20)

for |ℓ|, |m| ≤ T. Without the assumption of invertibility,

f̂^{nν}_{ℓm}(p) = Σ_{|q|≤T} ĥ^n_{ℓq}(p) (k̂^ν_T)_{qm}(p),              (3.1.21)

where

k̂^ν_T(p) = k̂*_T(p) ( k̂_T(p) k̂*_T(p) + ν I_{2T+1} )^{−1},

and |ℓ|, |m| ≤ T. Hence the empirical density estimator f^{nν} is given as

f^{nν}(g) = Σ_{|ℓ|≤T} Σ_{|m|≤T} ∫_0^T f̂^{nν}_{ℓm}(p) u_{mℓ}(g, p) p dp,   (3.1.22)

with the restrictions that |ℓ|, |m| ≤ T. The case requiring Tikhonov regularization is distinguished by the superscript ν, so in fact there are two empirical density estimators, f^n and f^{nν}.
Expectation, denoted by E, is with respect to the density h. Define the MISE to be E‖f^n − f‖₂², which can be separated into bias and variance parts as

E‖f^n − f‖₂² = ‖E f^n − f‖₂² + E‖f^n − E f^n‖₂².                   (3.1.23)
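The decomposition (3.1.23) is an exact identity. The following Monte Carlo sketch illustrates it with a noisy random vector standing in for the density estimator (all names and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

f = np.array([1.0, -0.5, 0.25])          # "true" object
reps = 20000
# Biased, noisy estimates of f, one per Monte Carlo replication.
ests = f + 0.3 + 0.5 * rng.standard_normal((reps, 3))

mise = np.mean(np.sum((ests - f) ** 2, axis=1))       # E||f^n - f||^2
Ef = ests.mean(axis=0)                                 # empirical E f^n
bias2 = np.sum((Ef - f) ** 2)                          # ||E f^n - f||^2
var = np.mean(np.sum((ests - Ef) ** 2, axis=1))        # E||f^n - E f^n||^2
assert abs(mise - (bias2 + var)) < 1e-8   # identity holds up to roundoff
```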
3.2 Asymptotic Error Bounds
The main results of this dissertation are presented in this section. Presentation of the results is split into two subsections: the first, where only spectral cut-off is required as a regularization method, and the second, where Tikhonov regularization is additionally needed. Results will generally be presented with a partial or sketch proof, as most of the proofs are quite lengthy and technical. The full proofs are given in Appendix A. In this section the performance of the estimators (3.1.20) and (3.1.22) is assessed in terms of the MISE.

The following notation will be used. Let {a_n} and {b_n} denote two real sequences. Write a_n ≪ b_n to mean a_n ≤ C b_n for some C > 0, as n → ∞, the Vinogradov notation. The notation a_n = o(b_n) will mean a_n/b_n → 0 as n → ∞; consequently, the expression o(1) means a sequence converging to 0. Furthermore, a_n ≍ b_n whenever a_n ≪ b_n and b_n ≪ a_n, as n → ∞. Finally, for an operator A on a separable (countably infinite dimensional) Hilbert space, denote its spectrum by Λ(A).
For an arbitrary compression A_T(p), 0 ≤ p ≤ T, T > 0, spectral conditions can be formulated as follows: for p, T > 0 and β ≥ 0, there exist 0 < γ₀ ≤ γ₁ < ∞ such that

Λ(A_T(p)) ⊂ [ γ₀ (T² + p²)^{−β}, γ₁ (T² + p²)^{−β} ]                (3.2.1)

as T → ∞, where β is a smoothness parameter dictating the rate of decay. The compression to which this condition is applied will be specified in each case. The following decay condition on the restriction is also needed: for p, T > 0 and β ≥ 0,

∫_0^T Σ_{|q|≤T, |q′|>T} |k̂_{qq′}(p)|² p dp = o(T^{−2s−2β−1})         (3.2.2)

as T → ∞.
3.2.1 Spectral Cut-off Regularization
Theorem 3.2.1 Suppose for p, T > 0 and β ≥ 0 there exist 0 < γ₀ ≤ γ₁ < ∞ that satisfy

Λ(k̂_T(p) k̂*_T(p)) ⊂ [ γ₀ (T² + p²)^{−β}, γ₁ (T² + p²)^{−β} ]        (3.2.3)

for the compression k̂_T, and that the decay condition (3.2.2) is satisfied as T → ∞. Further assume that k̂_T^{−1} exists (i.e. |(k̂_T^{−1})_{ℓm}| < ∞ for all |ℓ|, |m| ≤ T). Then

E‖f^n − f‖₂² ≪ T^{2β+3}/n + T^{−2s}                                 (3.2.4)

as T, n → ∞, for f ∈ Θ(s, Q) and s > 3/2.
Sketch Proof. The method of proof is to separate the MISE into integrated variance and integrated bias components as

E‖f^n − f‖₂² = E‖f^n − E f^n‖₂² + ‖E f^n − f‖₂².                    (3.2.5)

Each component is then calculated separately.

Condition (3.2.1) involves the spectral structure of k̂_T(p) k̂*_T(p). In particular, for k̂_T(p) k̂*_T(p) a (2T+1) × (2T+1) matrix for each T > 0 and p ∈ [0, ∞), let

k̂_T(p) k̂*_T(p) = V_T(p) D_T(p) V*_T(p)                               (3.2.6)

denote the spectral decomposition, where V_T(p) V*_T(p) = V*_T(p) V_T(p) = I_{2T+1}, D_T(p) is a diagonal matrix with diagonal entries d_j(p), |j| ≤ T and p ∈ [0, ∞), and the superscript * means conjugate transpose. Consequently, (3.2.3) is equivalent to the statement: for p, T > 0 and β ≥ 0, there exist 0 < γ₀ ≤ γ₁ < ∞ such that

γ₀ (T² + p²)^{−β} ≤ d_j(p) ≤ γ₁ (T² + p²)^{−β}                       (3.2.7)

for |j| ≤ T.

The integrated bias term is presented below, with the details of the calculation in Appendix A.1.1. In particular,

‖f − E f^n‖₂² ≪ T^{−2s} (1 + o(1))                                   (3.2.8)

as T → ∞ and f ∈ Θ(s, Q), s > 3/2.

The integrated variance term is presented below, with the details of the calculation in Appendix A.1.2. In particular,

E‖f^n − E f^n‖₂² ≪ (1/n) ∫_0^T Σ_{|j|≤T} d_j^{−1}(p) p dp            (3.2.9)

as T → ∞.

Finally, by combining (3.2.8) and (3.2.9), the MISE for the regularized case can be determined,

E‖f^n − f‖₂² ≪ (1/n) ∫_0^T Σ_{|j|≤T} d_j^{−1}(p) p dp + T^{−2s}      (3.2.10)

as n, T → ∞. This is completed by taking the supremum over j and finishing the integration, giving

(1/n) ∫_0^T Σ_{|j|≤T} d_j^{−1}(p) p dp + T^{−2s} ≪ T^{2β+3}/n + T^{−2s}.
The following best upper bound rate can be obtained for this case.

Corollary 3.2.2 If T ≍ n^{1/(2s+2β+3)}, then

E‖f^n − f‖₂² ≪ n^{−2s/(2s+2β+3)}

as n → ∞ for f ∈ Θ(s, Q), s > 3/2.

Consider the case where g_X is observed directly, i.e., g_Z = (I₂, 0)′ in (3.1.5); this corresponds to β = 0 and would be direct density estimation. For this the following holds.

Corollary 3.2.3 If T ≍ n^{1/(2s+3)}, then

E‖f^n − f‖₂² ≪ n^{−2s/(2s+3)}

as n → ∞ for f ∈ Θ(s, Q), s > 3/2.
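The rate in Corollary 3.2.2 arises from balancing the two terms of (3.2.4). The sketch below (assuming SciPy is available; the choice of s and β is illustrative) minimizes the bound numerically over T and recovers the slope 1/(2s + 2β + 3) of log T* against log n.

```python
import numpy as np
from scipy.optimize import minimize_scalar

s, beta = 2.0, 1.0

def t_star(n):
    """Minimizer of the MISE bound T^(2*beta+3)/n + T^(-2s) over T > 0."""
    bound = lambda logT: (np.exp(logT) ** (2 * beta + 3) / n
                          + np.exp(logT) ** (-2 * s))
    res = minimize_scalar(bound, bounds=(0.0, 20.0), method='bounded')
    return np.exp(res.x)

ns = np.array([1e4, 1e6, 1e8])
slope = np.polyfit(np.log(ns), np.log([t_star(n) for n in ns]), 1)[0]
# The optimal truncation grows like n^(1/(2s+2beta+3)).
assert abs(slope - 1 / (2 * s + 2 * beta + 3)) < 1e-3
```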
These results indicate an upper bound rate of convergence that is compatible with lower bound rates of convergence in the statistical literature over compact manifolds, hence Lie groups; see for example [30].
3.2.2 Tikhonov Regularization
Theorem 3.2.4 Suppose for p, T > 0 and β ≥ 0 there exist 0 < γ₀ ≤ γ₁ < ∞ that satisfy

Λ(k̂_T(p) k̂*_T(p)) ⊂ [ γ₀ (T² + p²)^{−β}, γ₁ (T² + p²)^{−β} ]         (3.2.12)

and that (3.2.2) holds. Further assume that k̂_T is singular or near singular. If ν > 0, then

E‖f^{nν} − f‖₂² ≪ (T³/n) · (2^{−β} γ₁ T^{−2β} + nν²)/(γ₀ T^{−2β} + ν)² + T^{−2s}   (3.2.13)

as T, n → ∞ for f ∈ Θ(s, Q), s > 3/2.
The method of proof is very similar to that of Theorem 3.2.1; however, there are additional terms that must be considered.

Sketch Proof. Once again, separate the MISE into integrated variance and integrated bias components as

E‖f^{nν} − f‖₂² = E‖f^{nν} − E f^{nν}‖₂² + ‖E f^{nν} − f‖₂²,          (3.2.14)

and calculate each component separately.

The spectral structure condition is as per the decomposition in (3.2.6) and will not be repeated here, except for the result that (3.2.12) is equivalent to the statement: for p, T > 0 and β ≥ 0, there exist 0 < γ₀ ≤ γ₁ < ∞ such that

γ₀ (T² + p²)^{−β} ≤ d_j(p) ≤ γ₁ (T² + p²)^{−β}                        (3.2.15)

for |j| ≤ T.

The integrated bias term is presented below, with the details of the calculation in Appendix A.4.1. In particular,

‖f − E f^{nν}‖₂² ≪ ( ∫_0^T Σ_{|j|≤T} ( ν/(d_j(p) + ν) )² p dp + T^{−2s} ) (1 + o(1))   (3.2.16)

as T → ∞ and f ∈ Θ(s, Q), s > 3/2.

The integrated variance term is presented below, with the details of the calculation in Appendix A.4.2. In particular,

E‖f^{nν} − E f^{nν}‖₂² ≪ (1/n) ∫_0^T Σ_{|j|≤T} d_j(p)/(d_j(p) + ν)² p dp   (3.2.17)

as T → ∞.

Finally, by combining (3.2.16) and (3.2.17), the MISE for the regularized case can be determined,

E‖f^{nν} − f‖₂² ≪ (1/n) ∫_0^T Σ_{|j|≤T} (d_j(p) + nν²)/(d_j(p) + ν)² p dp + T^{−2s}   (3.2.18)

as n, T → ∞. Then the MISE can be evaluated as follows:

(1/n) ∫_0^T Σ_{|j|≤T} (d_j(p) + nν²)/(d_j(p) + ν)² p dp + T^{−2s}
≪ ((2T+1)/n) ∫_0^T (γ₁ (T² + p²)^{−β} + nν²)/(γ₀ (T² + p²)^{−β} + ν)² p dp + T^{−2s}
≪ (T³/n) · (2^{−β} γ₁ T^{−2β} + nν²)/(γ₀ T^{−2β} + ν)² + T^{−2s}.
In the special case of invertibility, and hence (3.1.20), the following holds.

Corollary 3.2.5 If ν = 0, then

E‖f^{nν} − f‖₂² ≪ T^{2β+3}/n + T^{−2s}

as T, n → ∞ for f ∈ Θ(s, Q), s > 3/2.

This follows directly. As a result of Corollary 3.2.5, it is of interest to determine how the choice of the regularization parameter ν affects the MISE. The following results provide a sufficient condition that gives the same L²(SE(2))-bound as previously. This sufficient condition implies that singularity in the truncated distortion operator can be handled without additional penalty, so long as the regularization parameter ν is of the correct form.
Theorem 3.2.6 Suppose for p, T > 0 and β ≥ 0 there exist 0 < γ₀ ≤ γ₁ < ∞ that satisfy (3.2.1) and (3.2.2). If T^{2β} = o(n) and ξ ≥ 2^{−β} γ₁/γ₀², then

0 < ν ≤ [ ξγ₀ + ( ξ²γ₀² + (n − ξT^{2β})(ξγ₀² − 2^{−β} γ₁) T^{−2β} )^{1/2} ] / (n − ξT^{2β})

implies

E‖f^{nν} − f‖₂² ≪ T^{2β+3}/n + T^{−2s}

as T, n → ∞ for f ∈ Θ(s, Q), s > 3/2.

Sketch Proof. The approach here is to obtain the inequality

(2^{−β} γ₁ T^{−2β} + nν²)/(γ₀ T^{−2β} + ν)² ≤ ξ T^{2β}

for some ξ > 0, so that the bound of Theorem 3.2.4 reduces to T^{2β+3}/n + T^{−2s}. One notes that, by assuming T^{2β} = o(n), the inequality becomes a quadratic constraint in ν, and solving for ν one obtains the dominant root

ν̄ = [ ξγ₀ + ( ξ²γ₀² + (n − ξT^{2β})(ξγ₀² − 2^{−β} γ₁) T^{−2β} )^{1/2} ] / (n − ξT^{2β}),

thus providing a sufficient condition on ν. □
Consequently, the following is obtained.

Corollary 3.2.7 If T ≍ n^{1/(2s+2β+3)} and 0 < ν ≤ ν̄ (1 + o(1)), with ν̄ as in Theorem 3.2.6, then

E‖f^{nν} − f‖₂² ≪ n^{−2s/(2s+2β+3)}

as n → ∞ for f ∈ Θ(s, Q), s > 3/2.

Again, with respect to direct density estimation as in Corollary 3.2.3, the following holds.

Corollary 3.2.8 If T ≍ n^{1/(2s+3)} and 0 < ν ≤ ν̄ (1 + o(1)), then

E‖f^{nν} − f‖₂² ≪ n^{−2s/(2s+3)}

as n → ∞ for f ∈ Θ(s, Q), s > 3/2.
3.3 Further Remarks
Finally, some remarks need to be made on the conditions in Theorem 3.2.1
and Theorem 3.2.4. In particular, it is important to verify that they are non-vacuous
and relatively easy to check.
Regarding the first condition (3.2.1), with a bit of work one can build a distribution on SE(2) that satisfies the required conditions. To start, adapt the definition of the SO(3)-Laplace distribution as specified in [24], where

k̂_{ℓm}(p) = δ_{ℓm} / (1 + σ²(ℓ² + m² + 2p²)),                        (3.3.1)

σ² > 1/2, δ_{ℓm} denotes the Kronecker delta, and ℓ, m ∈ Z. This corresponds to the situation where γ = 1/(2σ²) and β = 1. By Fourier inversion (2.3.4), the corresponding function would be

k(g) = ∫_0^∞ Σ_{ℓ=−∞}^∞ u_{ℓℓ}(g, p) / (1 + 2σ²(ℓ² + p²)) p dp,       (3.3.2)

for g ∈ SE(2). One can verify that this is a probability density function; it will be referred to as the SE(2)-Laplace distribution, and it serves as one example of a probability density function over SE(2) that satisfies the spectral conditions of the theorem.
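The spectral claim for this example can be checked numerically. The sketch below verifies that, for the diagonal coefficients (3.3.1), the smallest eigenvalue of the compression is of exact order (T² + p²)^{−1}, i.e. β = 1; the constants γ₀ and γ₁ used here are crude admissible choices for illustration, not taken from the text.

```python
import numpy as np

sigma2 = 1.0   # sigma^2 > 1/2 as required in the text
k_hat = lambda l, p: 1.0 / (1.0 + 2.0 * sigma2 * (l * l + p * p))

T = 25
gamma0 = 1.0 / (1.0 + 2.0 * sigma2)   # valid lower constant once T^2+p^2 >= 1
gamma1 = 1.0 / (2.0 * sigma2)         # valid upper constant for the smallest d
for p in np.linspace(0.0, T, 11):
    d = np.array([k_hat(l, p) for l in range(-T, T + 1)])
    # Smallest eigenvalue of the diagonal compression is ~ (T^2+p^2)^(-1).
    assert d.min() >= gamma0 / (T**2 + p**2)
    assert d.min() <= gamma1 / (T**2 + p**2)
```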
With respect to condition (3.2.2), which is the condition on the distortion operator k: in many cases it will be trivially satisfied, or can at least be easily checked. In particular, if it can be assumed that k ∈ L²(SE(2)), then through the Plancherel formula,

∫_{SE(2)} |k(g)|² dg = ∫_0^∞ Σ_{q=−∞}^∞ Σ_{q′=−∞}^∞ |k̂_{qq′}(p)|² p dp   (3.3.3)

is finite; hence the condition

∫_0^T Σ_{|q|≤T, |q′|>T} |k̂_{qq′}(p)|² p dp = o(T^{−2s−2β−1})            (3.3.4)

as T → ∞ is stating the manner in which an infinite part of the summation vanishes.
As examples, if k̂(p) is diagonal, as in the SE(2)-Laplace distribution, or is band-limited for all p ∈ [0, ∞), then the condition is trivial: in the former, k̂_{qq′}(p) = 0 for |q′| > T and |q| ≤ T, while in the latter, k̂(p) = 0 beyond some finite value of p. See [8, 49] for more examples.

More generally, if k̂(p) is a banded matrix involving nonzero sub-diagonal and super-diagonal terms, then the above condition becomes

∫_0^T ( |k̂_{T,T+1}(p)|² + |k̂_{−T,−T−1}(p)|² ) p dp = o(T^{−2s−2β−1})    (3.3.5)

as T → ∞. In such cases, this condition describes a joint decay condition on the sub- and super-diagonal terms.
Chapter 4
Simulation Study
This chapter describes results from a small simulation of density and parameter estimation on SE(2). A cross validation estimator is used to estimate the MISE.
4.1 Direct Density Estimation
The first case to consider is direct density estimation. This is the situation where the smoothing parameter β = 0 and the operator k is taken to be the identity. The 'true' density f is a multiplicative density made up of the von Mises distribution on S¹ (with argument θ) and the bivariate normal distribution on R² (with arguments x = r cos φ, y = r sin φ). The density is given by

f(g(θ, φ, r)) = [ e^{κ cos(θ−μ)} / (2π I₀(κ)) ] · [ exp( −γ / (2(1−ρ²)) ) / (2π σ_x σ_y (1−ρ²)^{1/2}) ],   (4.1.1)

with

γ = (r cos φ − μ_x)²/σ_x² + (r sin φ − μ_y)²/σ_y² − 2ρ(r cos φ − μ_x)(r sin φ − μ_y)/(σ_x σ_y).   (4.1.2)

The random sample of SE(2) elements is generated in two parts: a set {θ_j}_{j=1}^n iid from the von Mises distribution with parameters μ and κ, and a joint set {x_j, y_j}_{j=1}^n
Parameter    Values
n            20, 50, 100, 200, 400, 500, 1000
μ            0, π
κ            1, 3, 8
μ_x          0, 1
μ_y          0, 1
σ_x          0.05, 1, 2
σ_y          0.05, 1, 2
ρ            −0.4, 0.05, 0.5, 0.95

Table 4.1: Possible parameter choices for simulated density.
iid bivariate normal with parameters μ_x, μ_y, σ_x, σ_y and ρ. Parameter choices for the presented results are somewhat arbitrary, since the primary interest is in evaluating the choice of the compression parameter T. Various combinations of these parameter choices, found in Table 4.1, are used.
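The two-part sampling scheme can be sketched as follows (hypothetical helper names; NumPy's von Mises and multivariate normal generators are assumed):

```python
import numpy as np

def sample_se2(n, mu, kappa, mean, cov, rng):
    """Draw n SE(2) elements: theta ~ von Mises(mu, kappa) on S^1, and
    (x, y) ~ bivariate normal(mean, cov) on R^2, independently."""
    theta = rng.vonmises(mu, kappa, size=n)
    xy = rng.multivariate_normal(mean, cov, size=n)
    return theta, xy[:, 0], xy[:, 1]

rng = np.random.default_rng(42)
sx, sy, rho = 1.0, 1.0, 0.05
cov = [[sx**2, rho * sx * sy], [rho * sx * sy, sy**2]]
theta, x, y = sample_se2(200, np.pi, 1.0, [0.0, 0.0], cov, rng)
assert theta.shape == (200,) and np.all(np.abs(theta) <= np.pi)

# Polar coordinates (r, phi) as used in the plots below.
r, phi = np.hypot(x, y), np.arctan2(y, x)
assert r.shape == (200,) and np.all(r >= 0)
```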
Most results for density estimation are presented in terms of perspective plots and contour plots over a grid of two of (θ, φ, r) for a fixed value of the third. The notation used is that of a conditional density, so that f(θ, φ | r) means the plot is of f(g(θ, φ, r)) for θ and φ in the noted ranges and r fixed at the noted value. Similarly, f(θ, r | φ) and f(φ, r | θ) represent f evaluated on a grid of the first two variables with the third variable fixed. One example of this is given in Figure 4.1, showing perspective and contour plots for all three aspects.
Figure 4.1: Sample perspective and contour plots for the simulated density; parameters: μ = π, κ = 1, μ_x = μ_y = 0, σ_x = σ_y = 1, ρ = 0.05; sample size n = 20; cutoff parameter T = 2. (a) Perspective f(θ, φ | r) for r = 1; (b) contours f(θ, φ | r) for r = 1; (c) perspective f(φ, r | θ) for θ = π/2; (d) contours f(φ, r | θ) for θ = π/2; (e) perspective f(θ, r | φ) for φ = −π/2; (f) contours f(θ, r | φ) for φ = −π/2.
The empirical density is built by using the empirical characteristic function (ecf) via

ĥ^n(p) = (1/n) Σ_{j=1}^n U_T(g_j^{−1}, p),                            (4.1.3)

where {g_j}_{j=1}^n = g({θ_j, x_j, y_j}_{j=1}^n). Following this, the inverse Fourier transform is applied to the empirical characteristic function; since the assumption is direct density estimation, k̂_T(p) is equal to the identity. This gives the following as the empirical density,

f^n(g) = Σ_{|ℓ|≤T} Σ_{|m|≤T} ∫_0^T ĥ^n_{ℓm}(p) u_{mℓ}(g, p) p dp.     (4.1.4)

The empirical estimator is necessarily complex valued; however, for practical purposes only the real part was retained. As a sample, Figure 4.2 and Figure 4.3 demonstrate visually the approximation made by the density estimator. For each set of figures, the left hand column is the true density f(θ, φ | r) and the right hand column is the estimated density f^n(θ, φ | r), in this case for three different values r = 0.05, 0.5 and 1. The contour plots are laid out similarly. Determination of T in a data-based manner is considered in the next section and is the primary focus of the simulation.
Figure 4.2: Perspective plots comparing true and estimated density; parameters: μ = π, κ = 1, μ_x = μ_y = 0, σ_x = σ_y = 1, ρ = 0.05; sample size n = 20; cutoff parameter T = 2. (a) True density f(θ, φ | r) for r = 0.05; (b) estimated density f^n(θ, φ | r) for r = 0.05; (c) true density f(θ, φ | r) for r = 0.5; (d) estimated density f^n(θ, φ | r) for r = 0.5; (e) true density f(θ, φ | r) for r = 1; (f) estimated density f^n(θ, φ | r) for r = 1.
Figure 4.3: Contour plots comparing true and estimated density; parameters: μ = π, κ = 1, μ_x = μ_y = 0, σ_x = σ_y = 1, ρ = 0.05; sample size n = 20; cutoff parameter T = 2. (a) True density f(θ, φ | r) for r = 0.05; (b) estimated density f^n(θ, φ | r) for r = 0.05; (c) true density f(θ, φ | r) for r = 0.5; (d) estimated density f^n(θ, φ | r) for r = 0.5; (e) true density f(θ, φ | r) for r = 1; (f) estimated density f^n(θ, φ | r) for r = 1.
Figure 4.4: Perspective plots comparing true and estimated density; parameters: μ = π, κ = 1, μ_x = μ_y = 0, σ_x = σ_y = 1, ρ = 0.05; sample size n = 20; cutoff parameter T = 2. (a) True density f(θ, φ | r) for r = 0.05; (b) estimated density f^n(θ, φ | r) for r = 0.05; (c) true density f(θ, r | φ) for φ = 0.5; (d) estimated density f^n(θ, r | φ) for φ = 0.5; (e) true density f(φ, r | θ) for θ = 1; (f) estimated density f^n(φ, r | θ) for θ = 1.
4.1.1 Estimation of Mean Integrated Squared Error
An estimate of the MISE is needed so that the truncation parameter T can be chosen in a data driven manner. One common choice is the cross validation (CV) estimator. The CV method used in this situation is least squares leave-one-out cross validation. This provides a function of the truncation parameter T that is a nearly unbiased estimator of the MISE. There are some well known problems with this estimator: generally, it suffers from high variability, a tendency to undersmooth, and often shows multiple local minima. However, it seems to be the natural first choice in this setting. The requirement is that a T* can be chosen that minimizes the cross validation score (and hence approximately minimizes the MISE). Formally,

T* = argmin_{T>0} CV(T).                                             (4.1.5)

The computation for the estimator is developed in the following manner (see Section A.7 in Appendix A for details). By definition (as in [22], for example), the CV estimator is

CV(T) = ∫_{SE(2)} |f^n(g)|² d(g) − (2/n) Σ_{k=1}^n f^{n,k}(g_k),      (4.1.6)

where the second term involves the leave-one-out empirical density estimator. The idea in cross validation is to use the available data set multiple times to achieve a better estimate. In the case of leave-one-out, n evaluations of the empirical density function are used, denoted f^{n,k}; for each calculation, the empirical density estimator is determined using n − 1 observations, omitting observation k, and is evaluated at the omitted observation. Returning to the CV estimator, note that the first term is not
estimated directly. Instead, Plancherel is used to re-express

∫_{SE(2)} |f^n(g)|² dg = ∫_0^T Σ_{ℓ,m=−T}^T |f̂^n_{ℓm}(p)|² p dp.       (4.1.7)

Consequently, the computation gives

CV(T) = ∫_0^T Σ_{ℓ,m=−T}^T |f̂^n_{ℓm}(p)|² p dp − (2/n) Σ_{k=1}^n f^{n,k}(g_k).   (4.1.8)

The CV can be further refined into calculable form by expanding both terms in matrix elements of the sample,

CV(T) = (1/n²) Σ_{ℓ,m=−T}^T Σ_{j,j′=1}^n ∫_0^T u_{ℓm}(g_j^{−1}, p) conj( u_{ℓm}(g_{j′}^{−1}, p) ) p dp
        − (2/(n(n−1))) Σ_{ℓ,m=−T}^T Σ_{k=1}^n Σ_{j≠k} ∫_0^T u_{ℓm}(g_j^{−1}, p) u_{mℓ}(g_k, p) p dp,   (4.1.9)

where conj(·) denotes complex conjugation.
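As a simplified stand-in for (4.1.8), the sketch below implements least-squares leave-one-out CV for a Fourier-series density estimator on the circle S¹ rather than on SE(2); the structure (a Plancherel term minus twice the mean of the leave-one-out evaluations) is the same, but this is not the thesis's SE(2) implementation.

```python
import numpy as np

def cv_score(theta, T):
    """Least-squares leave-one-out CV for a Fourier density estimator on S^1."""
    n = len(theta)
    ms = np.arange(-T, T + 1)
    E = np.exp(-1j * np.outer(ms, theta))        # E[m, j] = e^{-i m theta_j}
    c = E.mean(axis=1)                           # empirical coefficients
    l2 = np.sum(np.abs(c) ** 2) / (2 * np.pi)    # Plancherel term
    # Leave-one-out: coefficients without observation k, evaluated at theta_k.
    S = E.sum(axis=1)
    loo = np.real(np.sum((S[:, None] - E) * np.conj(E), axis=0))
    return l2 - np.sum(loo) / (np.pi * n * (n - 1))

rng = np.random.default_rng(3)
theta = rng.vonmises(0.0, 4.0, size=400)
scores = {T: cv_score(theta, T) for T in range(1, 15)}
T_star = min(scores, key=scores.get)             # data-based truncation level
assert 1 <= T_star <= 14
assert all(np.isfinite(s) for s in scores.values())
```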
Figure 4.5 shows the evaluation of the cross validation function at integer values of T for various sample sizes and sets of distribution parameters. Note that although a line has been drawn between points, there is no sensible interpretation at non-integer T. Finally, Figure 4.6 demonstrates the variability of the cross validation measure over multiple sets of parameters, for two arbitrary sample sizes n = 200 and n = 1000. Note that in Figure 4.6(b) the minimum is not attained for the given values of T.
Figure 4.5: Cross validation as a function of T for sample sizes from n = 20 to n = 1000; the legend denotes the line and point type for each sample size, and only the minimum point is plotted. Distribution parameters for each panel: (a) μ = 0, κ = 3, μ_x = μ_y = 0, σ_x² = σ_y² = 1 and ρ = 0.95; (b) μ = 3.45, κ = 1, μ_x = μ_y = 0, σ_x² = σ_y² = 1 and ρ = 0.05; (c) μ = 0, κ = 8, μ_x = μ_y = 0, σ_x² = σ_y² = 1 and ρ = −0.4; (d) μ = 0, κ = 8, μ_x = μ_y = 0, σ_x = 0.05, σ_y = 1 and ρ = 0.5.
Figure 4.6: Variation in cross validation for a single sample size over multiple density parameters. In both subfigures the parameters are: Run A: μ = 0, κ = 8, μ_x = μ_y = 0, σ_x² = 0.0025, σ_y² = 1 and ρ = 0.5; Run B: μ = π, κ = 1, μ_x = μ_y = 0, σ_x² = σ_y² = 1 and ρ = 0.05; Run C: μ = π, κ = 1, μ_x = μ_y = 0, σ_x² = σ_y² = 1 and ρ = 0.05. (a) Sample size 200; (b) sample size 1000.
4.2 Remarks
It is evident that in most direct density estimation cases the cross validation estimator works and obtains a data-based minimum T*. The large variation, as in Figure 4.6, can likely be explained by the well-known shortcomings of least squares cross validation when used for density estimation. With respect to the density estimator itself, the performance is highly dependent on the parameters of the underlying distributions; however, it is often able to pick out the main feature of the data. It would be beneficial for the estimator to be more robust to changes in the underlying parameters. Finally, regarding true deconvolution density estimation: this was not reported in the dissertation because, although the programming aspect is straightforward, implementation provided no additional insight. The estimator appears to be limited to the extent that reasonable distortion operations, for example rotations or shifts, could not be detected. Additionally, distortion in the form of noise, additive or otherwise, is not yet theoretically established, making simulation results unlikely to yield much benefit.
Chapter 5
Discussion
This chapter contains a discussion, in general terms, of some possible extensions of this line of research, as well as some general remarks.
5.1 Remarks
The results presented in Chapter 3 are new results for the statistical deconvolution problem, giving asymptotic upper bounds for the first time over non-compact, non-commutative groups, and as such extend the field of research on statistical deconvolution in a new and relevant direction.
5.2 Further Work
There are a number of avenues of further research that make themselves immediately evident. A straightforward extension is to relax the spectral conditions on k̂, assuming, for example, that there exist at most a countably infinite number of zero eigenvalues in the spectrum. This amounts to using the condition (3.2.1),

Λ(A_T(p)) ⊂ [ γ₀ (T² + p²)^{−β}, γ₁ (T² + p²)^{−β} ],

as T → ∞ and for some β, but allowing γ₀ to be equal to 0. Computationally this involves handling the 'zero' points separately from the points bounded away from zero. The most likely scenario is to build ε-balls around each of the zero points and let ε → 0 at an appropriate rate. Another extension is to extend the asymptotic error bounds to the case of SE(3), and in fact to SE(N). Initial exploration suggests that the mathematics for the SE(3) case will be nearly identical, as the same important properties of Fourier analysis, namely Plancherel and deconvolution, hold. Of course the technical details become much more complex, as the representations require four-indexed matrices, and this in turn likely implies the need for a more subtle approach to the compression parameter. Approaching the compression operator more carefully is one issue that should be explored further. It may be worthwhile considering using two parameters: T₁, say, that acts as a limit on the matrix size (the T in earlier notation), and T₂, that controls the size of the interval over which p is integrated. So, instead of

f^n(g) = ∫_0^T f̂^n(p) U_T(g, p) p dp,

one has

f^n(g) = ∫_0^{T₂} f̂^n(p) U_{T₁}(g, p) p dp,

and this gives a 2-dimensional minimization problem: formally, choose (T₁*, T₂*) to minimize the estimated MISE, for example by cross validation as in (4.1.5). Changing this would not necessarily affect the asymptotic results, as both T₁ and T₂ could be chosen to have the same rate as the previous parameter T, but it may provide interesting insight and may possibly lead to improved results.
A second approach, involving iterative methods that may reduce the need for this sort of two-stage regularization procedure, is also an interesting question for future investigation. The approach is in essence quite simple but presents some theoretical and computational difficulties. Take Landweber iteration, for example, as the notation is quite simple. For inverse problems it is as follows: given the model d = Af as before, find a solution, or estimate of f, denoted f^n_k, where k indexes the iteration number and the superscript n indicates it is an estimate based on the random sample of size n, via

f_k = f_{k−1} + τ A*(d − A f_{k−1}).

Recall the convolution relationship h(g) = (k * f)(g) for g ∈ SE(2) used previously. In terms of Landweber iteration, using the empirical estimator h^n for h, we have

f^n_k(g) = f^n_{k−1}(g) + τ ( k̃ * (h^n − f^n_{k−1} * k) )(g),

where k̃ denotes the adjoint kernel of k. Note this requires a known k, as assumed previously. Assuming the use of the Fourier transform in this setting, at each step the kth iterate of the empirical density estimator can be found by evaluating the inverse Fourier transform of

f̂_k(p) = f̂_{k−1}(p) + τ ( ĥ^n(p) − f̂_{k−1}(p) k̂(p) ) k̂*(p).
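A minimal Fourier-domain Landweber sketch: the transform at a fixed p is modeled by a small invertible diagonal matrix (as for the SE(2)-Laplace distortion of Section 3.3), and with step size τ < 2/‖k̂‖² the iterates converge to the solution of ĥ = f̂ k̂. All names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
k = np.diag([1.0, 0.7, 0.5, 0.3]).astype(complex)   # invertible distortion
f_true = rng.standard_normal((4, 4))
h = f_true @ k                                      # model (3.1.7): h = f k

tau = 0.9 / np.linalg.norm(k, 2) ** 2               # tau < 2 / ||k||^2
f = np.zeros_like(h)
for _ in range(500):
    f = f + tau * (h - f @ k) @ k.conj().T          # Landweber update
assert np.allclose(f, f_true, atol=1e-8)            # converged to the solution
```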
Of course, at this stage the inevitable dimension issue arises, since these Fourier transform matrices are infinite and indexed by the positive real number p. There would still be the need for some sort of pre-conditioning with spectral cutoff to obtain practical results. At the moment it is not clear that there is an obvious way to handle the need to approximate, by finite operators, the infinite operators generated by this setting. One expects that iterated Tikhonov regularization could also be applied to this sort of problem, although, again, the initial need to truncate or use some sort of cutoff parameter does not disappear.
The more interesting class of extensions is one that furthers the conceptual framework and begins to generalize the asymptotic results. In particular, the problem of additive noise is interesting; this would be the case where the model is given by

h(g) = (k * f)(g) + ε ξ(g),                                          (5.2.2)

where ξ is the error, for example a Gaussian noise process on the underlying group, and 0 < ε < 1 is the noise level. This puts the framework of the problem into a more contemporary setting, but increases the complexity significantly. The first step would be similar to the step taken in the current research, which is, under suitable restrictions, to take Fourier transforms, giving the model

ĥ(p) = f̂(p) k̂(p) + ε ξ̂(p).                                          (5.2.3)
From this point a number of theoretical considerations arise, particularly about the form of the noise process. Most often in inverse problems the noise is assumed to be Gaussian white noise. One modification of the problem with noise involves assuming that an independent sample from the noise distribution is available in addition to the sample from the density h. This may be a more fruitful first avenue of research.
Bibliography
[1] E.N. Belitser and B.Y. Levit. On minimax filtering on ellipsoids. Mathematical Methods of Statistics, 4:259-273, 1995.

[2] N. Bissantz, T. Hohage, and A. Munk. Consistency and rates of convergence of nonlinear Tikhonov regularization with random noise. Inverse Problems, 20:1773-1789, 2004.

[3] N. Bissantz, T. Hohage, A. Munk, and F. Ruymgaart. Convergence rates of general regularization methods for statistical inverse problems and applications. SIAM Journal on Numerical Analysis, 45:2610-2636, 2007.

[4] N. Bissantz and H. Holzmann. Statistical inference for inverse problems. Inverse Problems, 24:034009, 2008.

[5] R.J. Carroll and P. Hall. Optimal rates of convergence for deconvolving a density. Journal of the American Statistical Association, 83:1184-1186, 1988.

[6] L. Cavalier. Nonparametric statistical inverse problems. Inverse Problems, 24, 2008.

[7] L. Cavalier and A. Tsybakov. Sharp adaptation for inverse problems with random noise. Probability Theory and Related Fields, 123:323-354, 2002.

[8] G.S. Chirikjian. Fredholm integral equations on the Euclidean motion group. Inverse Problems, 12:579-599, 1996.

[9] G.S. Chirikjian and I. Ebert-Uphoff. Numerical convolution on the Euclidean group with applications to workspace generation. IEEE Transactions on Robotics and Automation, 14:123-136, 1998.

[10] G.S. Chirikjian and A.B. Kyatkin. An operational calculus for the Euclidean motion group with applications in robotics and polymer science. Journal of Fourier Analysis and Applications, 6:583-606, 2000.

[11] G.S. Chirikjian and A.B. Kyatkin. Engineering Applications of Noncommutative Harmonic Analysis: with Emphasis on Rotation and Motion Groups. Boca Raton: CRC Press, 2001.

[12] G.S. Chirikjian and Y. Wang. Engineering applications of the motion-group Fourier transform. Modern Signal Processing, 46:63-77, 2003.

[13] D.D. Cox. Approximation of method of regularization estimators. Annals of Statistics, 16:694-712, 1988.
[14] P.J. Diggle and P. Hall. A Fourier approach to nonparametric deconvolution of a density estimate. Journal of the Royal Statistical Society Series B, 55(2):523-531, 1993.

[15] R. Duits, M. Felsberg, G. Granlund, and B.T.H. Romeny. Image analysis and reconstruction using a wavelet transform constructed from a reducible representation of the Euclidean motion group. International Journal of Computer Vision, 72:79-102, 2007.

[16] S. Efromovich. On sharp adaptive estimation of multivariate curves. Mathematical Methods of Statistics, 9:117-139, 2000.

[17] S.N. Evans and P. Stark. Inverse problems as statistics. Inverse Problems, 18:R55-R97, 2002.

[18] J. Fan. On the optimal rates of convergence for nonparametric deconvolution problems. Annals of Statistics, 19:1257-1272, 1991.

[19] A. Feuerverger and R.A. Mureika. The empirical characteristic function and its applications. Annals of Statistics, 5:88-97, 1977.

[20] G.K. Golubev. Nonparametric estimation of smooth probability densities in L2. Problems of Information Transmission, 28:44-54, 1992.

[21] J. Hadamard. Le probleme de Cauchy et les equations aux derivees partielles hyperboliques. Hermann, Paris, 1932.

[22] P. Hall and J.S. Marron. Extent to which least-squares cross-validation minimises integrated square error in nonparametric density estimation. Probability Theory and Related Fields, 74:567-581, 1987.

[23] P.R. Halmos. What does the spectral theorem say? American Mathematical Monthly, 70:241-247, 1963.

[24] D.M. Healy, H. Hendriks, and P.T. Kim. Spherical deconvolution. Journal of Multivariate Analysis, 67:1-22, 1998.

[25] H. Hendriks. Nonparametric estimation of a probability density on a Riemannian manifold using Fourier expansions. Annals of Statistics, 18:832-849, 1990.

[26] I.A. Ibragimov and R.Z. Khasminskii. On nonparametric estimation of the value of a linear functional in Gaussian white noise. Theory of Probability and its Applications, 29:18-34, 1984.

[27] J.P. Kaipio and E. Somersalo. Statistical and Computational Inverse Problems. Springer, New York, 2004.

[28] J.B. Keller. Inverse problems. American Mathematical Monthly, 83:107-118, 1976.
58
P.T. Kim. Deconvolution density estimation on so(n). The Annals of Statistics, 26:1083-1102, 1998.
P.T. Kim and J. Koo. Statistical inverse problems on manifolds. Journal of Fourier Analysis and Applications, 11:639-653, 2005.
P.T. Kim and J-Y. Koo. Optimal spherical deconvolution. Journal of Multivariate Analysis, 80:21-42, 2002.
J. Klemela. Asymptotic minimax risk for the white noise model on the sphere. Scandanavian Journal of Statistics, 26:465-473, 1999.
A.B. Kyatkin and G.S. Chirikjian. Regularized solutions of a nonlinear convolution equation on the euclidean group. Acta Applicandae Mathematicae, 53:89-123, 1998.
A.B. Kyatkin and G.S. Chirikjian. Computation of robot configuration and workspaces via the fourier transform on the discrete-motion group. The International Journal of Robotics Research, 18:601-615, 1999.
A.B. Kyatkin and G.S. Chirikjian. Pattern matching as a correlation on the discrete motion group. Computer Vision and Image Understanding, 74:22-35, 1999.
A.B. Kyatkin and G.S. Chirikjian. Algorithms for fast convolutions on motion groups. Applied and Computational Harmonic Analysis, 9:220-241, 2000.
A.B. Kyatkin and G.S. Chirikjian. An operational calculus for the euclidean motion group with applications in robotics and polymer science. Journal of Fourier Analysis and Applications, 6:583-606, 2000.
S. Lu, S.V. Pereverzev, and R. Ramlau. An analysis of tikhonov regularization for nonlinear ill-posed problems under general smoothness assumption. Inverse Problems, 23:217-230, 2007.
D.W. Nychka and D. Cox. Convergence rates for regularized solutions of integral equations from discrete noisy data. Annals of Statistics, 17:556-572, 1989.
F. O'Sullivan. A statistical perspective on ill-posed inverse problems. Statistical Science, 1:502-518, 1986.
W. Park, J.S. Kim, Y. Zhou, N.J. Cowan, A.M. Okamura, and G.S Chirikjian. Diffusion-based motion planning for a nonholonomic flexible needle model. Proceedings IEEE, Robotics and Automation, 2005.
M.S. Pinsker. Optimal filtration of square-integrable signals in gaussian noise. Problems of information transmission, 16:120-133, 1980.
P. Rigollet. Adaptive density estimation using stein's blockwise method. Preprint, 2004.
59
[44] M. Sugiura. Unitary representations and harmonic analysis. New York: Wiley, 1975.
[45] L. Tenorio. Statistical regularization of inverse problems. SIAM Review, 43:347-366, 2001.
[46] A. V. Tikhonov. Regularization of incorrectly posed problems. Soviet Mathematics Doklady, 4:1624-1627, 1963.
[47] A.V. Tikhonov and V.Y. Arsenin. Solution of Ill-posed Problems. Winston & sons, 1977.
[48] A.CM. van Rooij and F.H. Ruymgaart. Asymptotic minimax rates for abstract linear estiamtors. Journal of Statistical Planning and Inference, 53:389-402, 1996.
[49] Y. Wang and G.S. Chirikjian. Workspace generation of hyper-redundant manipulators as a diffusion process on se{n). IEEE Transactions on Robotics and Automation, 20:399-408, 2004.
[50] K.A. Woodbury. What are inverse problems?, 1995. http://www.me.ua.edu/inverse/whatis.html.
[51] C.E. Yarman and B. Yazici. Radon transform inversion via wiener filtering over the euclidean motion group. Proceedings IEEE International Conference on Image Processing, pages 811-814, 2003.
[52] C.E. Yarman and B. Yazici. An wiener filtering approach over the euclidean motion group for radon transform inversion. Proc. SPIE, 5032, 2003.
[53] C.E. Yarman and B. Yazici. Exponential radon transform inversion based on harmonic analysis of the euclidean motion group. Proceedings IEEE International Conference on Acoustics Speech and Signal Processing, 2:481-484, 2005.
[54] C.E. Yarman and B. Yazici. Radon transform inversion based on harmonic analysis of the euclidean motion group. IEEE ICASSP 2005, pages 481-484, 2005.
[55] C.E. Yarman and B. Yazici. An inversion method for the exponential radon based on the harmonic analysis of the euclidean motion group. SPIE Proc, 6142, 2006.
[56] B. Yazici. Stochastic deconvolution over groups. IEEE Transactions on Information Theory, 50:494-510, 2004.
60
Appendix A
Proofs
A.1 Proof of Theorem 3.2.1
Recall that the MISE can be separated into the integrated variance and integrated squared bias parts as
\[
E\|\hat f^{n}(g) - f(g)\|_2^2 = E\|\hat f^{n}(g) - E\hat f^{n}(g)\|_2^2 + \|E\hat f^{n}(g) - f(g)\|_2^2 . \tag{A.1.1}
\]
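The decomposition (A.1.1) is the standard bias-variance identity for the integrated squared error. As a quick numerical sanity check, here is a Python sketch with a toy discrete-valued estimator standing in for $\hat f^{n}$ (the values are illustrative only):

```python
import numpy as np

# E||X - a||^2 = E||X - EX||^2 + ||EX - a||^2 for any random vector X
# and fixed target a; here X takes three values with equal probability.
X = np.array([[1.0, 2.0], [0.5, -1.0], [2.5, 0.0]])  # toy "estimates"
a = np.array([1.0, 0.5])                             # toy "truth"

EX = X.mean(axis=0)
mse = np.mean(np.sum((X - a) ** 2, axis=1))
variance = np.mean(np.sum((X - EX) ** 2, axis=1))
bias_sq = np.sum((EX - a) ** 2)
assert np.isclose(mse, variance + bias_sq)
```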
A.1.1 Bias Calculation
First consider evaluating
\[
\begin{aligned}
f(g) - E\hat f^{n}(g) ={}& \int_0^\infty \sum_{|\ell|>T}\sum_{|m|>T} \hat f_{\ell m}(p)\, u_{m\ell}(g,p)\, p\,dp \\
&+ \int_0^\infty \sum_{|\ell|\le T}\sum_{|m|>T} \hat f_{\ell m}(p)\, u_{m\ell}(g,p)\, p\,dp \\
&+ \int_0^\infty \sum_{|\ell|>T}\sum_{|m|\le T} \hat f_{\ell m}(p)\, u_{m\ell}(g,p)\, p\,dp \\
&+ \int_T^\infty \sum_{|\ell|\le T}\sum_{|m|\le T} \hat f_{\ell m}(p)\, u_{m\ell}(g,p)\, p\,dp \\
&+ \int_0^T \sum_{|\ell|\le T}\sum_{|m|\le T} \bigl(\hat f_{\ell m}(p) - E\hat f^{n}_{\ell m}(p)\bigr)\, u_{m\ell}(g,p)\, p\,dp .
\end{aligned}
\]
Note that $\hat f^{n}$ exists only for $|\ell|,|m|\le T$ and $p\in[0,T]$. The first four terms simplify primarily due to the Sobolev condition (3.1.4). The interim step is to multiply by the factor $(1+\ell^2+m^2+2p^2)^{-s}(1+\ell^2+m^2+2p^2)^{s}$; this yields an asymptotic factor of $T^{-2s}$ from the compression and a second term bounded by the Sobolev condition (3.1.4). The calculations for most of the terms are provided for the sake of completeness, although inspection of one calculation should be sufficient for understanding. The first term:
\[
\begin{aligned}
\Bigl\| \int_0^\infty \sum_{|\ell|,|m|>T} \hat f_{\ell m}(p)\, u_{m\ell}(g,p)\, p\,dp \Bigr\|_2^2
&= \int_0^\infty \sum_{|\ell|,|m|>T} \bigl|\hat f_{\ell m}(p)\bigr|^2\, p\,dp && (\text{A.1.2}) \\
&= \int_0^\infty \sum_{|\ell|,|m|>T} (1+\ell^2+m^2+2p^2)^{-s}(1+\ell^2+m^2+2p^2)^{s}\,\bigl|\hat f_{\ell m}(p)\bigr|^2\, p\,dp && (\text{A.1.3}) \\
&\ll T^{-2s} \int_0^\infty \sum_{|\ell|,|m|>T} (1+\ell^2+m^2+2p^2)^{s}\,\bigl|\hat f_{\ell m}(p)\bigr|^2\, p\,dp \\
&\le Q\, T^{-2s},
\end{aligned}
\]
since $f\in\Theta(s,Q)$, $s>3/2$. The second term follows similarly:
\[
\begin{aligned}
\Bigl\| \int_0^\infty \sum_{|\ell|\le T}\sum_{|m|>T} \hat f_{\ell m}(p)\, u_{m\ell}(g,p)\, p\,dp \Bigr\|_2^2
&= \int_0^\infty \sum_{|\ell|\le T}\sum_{|m|>T} \bigl|\hat f_{\ell m}(p)\bigr|^2\, p\,dp \\
&= \int_0^\infty \sum_{|\ell|\le T}\sum_{|m|>T} (1+\ell^2+m^2+2p^2)^{-s}(1+\ell^2+m^2+2p^2)^{s}\,\bigl|\hat f_{\ell m}(p)\bigr|^2\, p\,dp \\
&\ll T^{-2s} \int_0^\infty \sum_{|\ell|\le T}\sum_{|m|>T} (1+\ell^2+m^2+2p^2)^{s}\,\bigl|\hat f_{\ell m}(p)\bigr|^2\, p\,dp \\
&\le Q\, T^{-2s},
\end{aligned}
\]
since $f\in\Theta(s,Q)$, $s>3/2$. The third term is identical to the second term except that $\sum_{|\ell|\le T}\sum_{|m|>T}$ is replaced by $\sum_{|\ell|>T}\sum_{|m|\le T}$; this calculation is omitted. Finally, the fourth term:
\[
\begin{aligned}
\Bigl\| \int_T^\infty \sum_{|\ell|,|m|\le T} \hat f_{\ell m}(p)\, u_{m\ell}(g,p)\, p\,dp \Bigr\|_2^2
&= \int_T^\infty \sum_{|\ell|,|m|\le T} \bigl|\hat f_{\ell m}(p)\bigr|^2\, p\,dp && (\text{A.1.4}) \\
&= \int_T^\infty \sum_{|\ell|,|m|\le T} (1+\ell^2+m^2+2p^2)^{-s}(1+\ell^2+m^2+2p^2)^{s}\,\bigl|\hat f_{\ell m}(p)\bigr|^2\, p\,dp && (\text{A.1.5}) \\
&\ll T^{-2s} \int_T^\infty \sum_{|\ell|,|m|\le T} (1+\ell^2+m^2+2p^2)^{s}\,\bigl|\hat f_{\ell m}(p)\bigr|^2\, p\,dp \\
&\le Q\, T^{-2s}, && (\text{A.1.6})
\end{aligned}
\]
since $f\in\Theta(s,Q)$, $s>3/2$. Now, consider the final term of the bias decomposition:
\[
\Bigl\| \int_0^T \sum_{|\ell|,|m|\le T} \bigl(\hat f_{\ell m}(p) - E\hat f^{n}_{\ell m}(p)\bigr)\, u_{m\ell}(g,p)\, p\,dp \Bigr\|_2^2
= \int_0^T \sum_{|\ell|,|m|\le T} \bigl|\hat f_{\ell m}(p) - E\hat f^{n}_{\ell m}(p)\bigr|^2\, p\,dp .
\]
In particular, consider the term without the integration:
\[
\begin{aligned}
\sum_{|\ell|,|m|\le T} \bigl|\hat f_{\ell m}(p)-E\hat f^{n}_{\ell m}(p)\bigr|^2
&= \sum_{|\ell|,|m|\le T} \Bigl|\hat f_{\ell m}(p)-\sum_{|q|\le T} E\hat h^{n}_{\ell q}(p)\,\hat k^{-1}_{qm}(p)\Bigr|^2 \\
&= \sum_{|\ell|,|m|\le T} \Bigl|\hat f_{\ell m}(p)-\sum_{|q|\le T} \hat h_{\ell q}(p)\,\hat k^{-1}_{qm}(p)\Bigr|^2 && (\text{A.1.7}) \\
&= \sum_{|\ell|,|m|\le T} \Bigl|\hat f_{\ell m}(p)-\sum_{|q|\le T}\sum_{q'=-\infty}^{\infty} \hat f_{\ell q'}(p)\,\hat k_{q'q}(p)\,\hat k^{-1}_{qm}(p)\Bigr|^2 \\
&= \sum_{|\ell|,|m|\le T} \Bigl|\hat f_{\ell m}(p)-\sum_{q'=-\infty}^{\infty} \hat f_{\ell q'}(p)\sum_{|q|\le T} \hat k_{q'q}(p)\,\hat k^{-1}_{qm}(p)\Bigr|^2 \\
&= \sum_{|\ell|,|m|\le T} \Bigl|\sum_{q'=-\infty}^{\infty} \hat f_{\ell q'}(p)\Bigl(\delta_{q'm}-\sum_{|q|\le T} \hat k_{q'q}(p)\,\hat k^{-1}_{qm}(p)\Bigr)\Bigr|^2 && (\text{A.1.8}) \\
&\le \sum_{|\ell|\le T}\sum_{q'=-\infty}^{\infty} \bigl|\hat f_{\ell q'}(p)\bigr|^2 \sum_{|m|\le T}\sum_{q'=-\infty}^{\infty} \Bigl|\delta_{q'm}-\sum_{|q|\le T} \hat k_{q'q}(p)\,\hat k^{-1}_{qm}(p)\Bigr|^2 . && (\text{A.1.9})
\end{aligned}
\]
The previous display uses
\[ \hat f^{n}_{\ell m}(p) = \sum_{q=-T}^{T} \hat h^{n}_{\ell q}(p)\,\hat k^{-1}_{qm}(p) \tag{A.1.11} \]
and
\[ E\hat h^{n}_{\ell q}(p) = \hat h_{\ell q}(p) \tag{A.1.12} \]
for $|\ell|,|q|\le T$. Recall that $\hat h^{n}$ is built as an empirical characteristic function, which is an unbiased estimator of the Fourier coefficients, and that
\[ \hat h_{\ell q}(p) = \sum_{q'=-\infty}^{\infty} \hat f_{\ell q'}(p)\,\hat k_{q'q}(p), \tag{A.1.13} \]
again by definition for the substitution. Note that this can be bounded in two parts: first by
\[ \int_0^T \sum_{|\ell|\le T}\sum_{q'=-\infty}^{\infty} \bigl|\hat f_{\ell q'}(p)\bigr|^2\, p\,dp < \infty, \tag{A.1.14} \]
which can be treated as a constant, and secondly by noting that
\[
\sum_{|m|\le T}\sum_{q'=-\infty}^{\infty} \Bigl|\delta_{q'm}-\sum_{|q|\le T}\hat k_{q'q}(p)\,\hat k^{-1}_{qm}(p)\Bigr|^2
= \sum_{|m|\le T}\sum_{|q'|\le T} \Bigl|\delta_{q'm}-\sum_{|q|\le T}\hat k_{q'q}(p)\,\hat k^{-1}_{qm}(p)\Bigr|^2 \tag{A.1.15}
\]
\[
\qquad + \sum_{|m|\le T}\sum_{|q'|>T} \Bigl|\sum_{|q|\le T}\hat k_{q'q}(p)\,\hat k^{-1}_{qm}(p)\Bigr|^2 . \tag{A.1.16}
\]
The first term (A.1.15) collapses to zero, since the assumptions of this theorem mean that $\hat k^{-1}(p)$ exists for all the given $q$, $m$, so that $\sum_{|q|\le T}\hat k_{q'q}(p)\,\hat k^{-1}_{qm}(p)$ reduces to the identity element itself. The second term (A.1.16), along with the integration, requires a little more care, since the summation over $q'$ is on an infinite domain. Consider
the following:
\[
\begin{aligned}
\int_0^T \sum_{|m|\le T}\sum_{|q'|>T}\Bigl|\sum_{|q|\le T}\hat k_{q'q}(p)\,\hat k^{-1}_{qm}(p)\Bigr|^2\, p\,dp
&\le \int_0^T \sum_{|q'|>T}\sum_{|q|\le T}\bigl|\hat k_{q'q}(p)\bigr|^2 \sum_{|m|\le T}\sum_{|q|\le T}\bigl|\hat k^{-1}_{qm}(p)\bigr|^2\, p\,dp \\
&\le \int_0^T \sum_{|q'|>T}\sum_{|q|\le T}\bigl|\hat k_{q'q}(p)\bigr|^2\,\mathrm{tr}\bigl\{(\hat k_T(p)\hat k_T^{*}(p))^{-1}\bigr\}\, p\,dp \\
&\ll T \sup_{p,\,|j|\le T} \frac{1}{\gamma_0\,(T^2+p^2)^{-\beta}} \int_0^T \sum_{|q'|>T}\sum_{|q|\le T}\bigl|\hat k_{q'q}(p)\bigr|^2\, p\,dp \\
&\ll T^{2\beta+1} \int_0^T \sum_{|q'|>T}\sum_{|q|\le T}\bigl|\hat k_{q'q}(p)\bigr|^2\, p\,dp
\end{aligned}
\]
as $T\to\infty$. Recall the assumption that
\[ \int_0^T \sum_{|q|\le T,\, |q'|>T} \bigl|\hat k_{q'q}(p)\bigr|^2\, p\,dp = o\bigl(T^{-2s-2\beta-1}\bigr), \]
and putting everything together, the bound for the bias is
\[
\|E\hat f^{n}-f\|_2^2 \ll T^{-2s} + \int_0^T \sum_{|\ell|\le T}\sum_{|m|\le T}\bigl|\hat f_{\ell m}(p)-E\hat f^{n}_{\ell m}(p)\bigr|^2\, p\,dp
\ll T^{-2s} + T^{2\beta+1}\, T^{-2s-2\beta-1} \le T^{-2s}\bigl(1+o(1)\bigr) \tag{A.1.17}
\]
as $T\to\infty$.
A.1.2 Variance Calculation
Note that
\[
E\|\hat f^{n}-E\hat f^{n}\|^2
= E\int_0^T \sum_{|\ell|,|m|\le T}\bigl|\hat f^{n}_{\ell m}(p)-E\hat f^{n}_{\ell m}(p)\bigr|^2\, p\,dp
= E\int_0^T \sum_{|\ell|,|m|\le T}\Bigl|\sum_{|q|\le T}\bigl(\hat h^{n}_{\ell q}(p)-E\hat h^{n}_{\ell q}(p)\bigr)\hat k^{-1}_{qm}(p)\Bigr|^2\, p\,dp . \tag{A.1.18}
\]
Let $Y$ be a generic $SE(2)$ element and note that the following identity is needed:
\[
E\bigl[\hat h^{n*}(p)\hat h^{n}(p)\bigr] - E\hat h^{n*}(p)\, E\hat h^{n}(p) = \frac{1}{n} I_{2T+1} - \frac{1}{n} E U_T(Y,p)\, E U_T(Y^{-1},p). \tag{A.1.19}
\]
This can be seen to hold via the following argument; explicit dependence on $p$ is dropped for the calculation:
\[
\begin{aligned}
E\bigl(\hat h^{n*}\hat h^{n}\bigr) - E\hat h^{n*}E\hat h^{n}
&= E\Bigl[\Bigl(\frac{1}{n}\sum_{j=1}^{n} U_T(g_j^{-1})\Bigr)^{\!*}\Bigl(\frac{1}{n}\sum_{j'=1}^{n} U_T(g_{j'}^{-1})\Bigr)\Bigr] - E\Bigl(\frac{1}{n}\sum_{j=1}^{n} U_T(g_j^{-1})\Bigr)^{\!*} E\Bigl(\frac{1}{n}\sum_{j'=1}^{n} U_T(g_{j'}^{-1})\Bigr) \\
&= \frac{1}{n^2}\, E\sum_{j=1}^{n}\sum_{j'=1}^{n} U_T(g_j)\,U_T(g_{j'}^{-1}) - E U_T(Y)\, E U_T(Y^{-1}) \\
&= \frac{1}{n^2}\, E\Bigl(\sum_{j=1}^{n} U_T(g_j)U_T(g_j^{-1})\Bigr) + \frac{1}{n^2}\, E\sum_{j\ne j'} U_T(g_j)\,U_T(g_{j'}^{-1}) - E U_T(Y)\, E U_T(Y^{-1}) \\
&= \frac{1}{n}\, I_{2T+1} + \frac{n(n-1)}{n^2}\, E U_T(Y)\, E U_T(Y^{-1}) - E U_T(Y)\, E U_T(Y^{-1}) \\
&= \frac{1}{n}\, I_{2T+1} - \frac{1}{n}\, E U_T(Y)\, E U_T(Y^{-1}) .
\end{aligned}
\]
Here the fact that $U(g,p)^{*}=U(g^{-1},p)$ is used, as well as the homomorphism property $U(g,p)U(g^{-1},p)=U(gg^{-1},p)=U(e,p)=(\delta_{\ell m})$, where $e=(I_2,0)$ is the unit element of $SE(2)$. Returning to the main calculation, note that
\[
\begin{aligned}
&E\int_0^T \mathrm{tr}\Bigl\{\bigl(\hat h^{n*}(p)\hat k^{-1}(p) - E\hat h^{n*}(p)\hat k^{-1}(p)\bigr)\bigl(\hat h^{n*}(p)\hat k^{-1}(p) - E\hat h^{n*}(p)\hat k^{-1}(p)\bigr)^{*}\Bigr\}\, p\,dp \\
&\quad= \int_0^T \mathrm{tr}\Bigl\{\bigl[E\bigl(\hat h^{n*}(p)\hat h^{n}(p)\bigr) - E\hat h^{n*}(p)\,E\hat h^{n}(p)\bigr]\,\hat k^{-1}(p)\bigl(\hat k^{-1}(p)\bigr)^{*}\Bigr\}\, p\,dp \\
&\quad= \int_0^T \mathrm{tr}\Bigl\{\Bigl[\frac{1}{n} I_{2T+1} - \frac{1}{n} E U_T(Y,p)\,E U_T(Y^{-1},p)\Bigr]\,\hat k^{-1}(p)\bigl(\hat k^{-1}(p)\bigr)^{*}\Bigr\}\, p\,dp \\
&\quad\ll \frac{1}{n}\int_0^T \mathrm{tr}\Bigl\{\hat k^{-1}(p)\bigl(\hat k^{-1}(p)\bigr)^{*}\Bigr\}\, p\,dp . \qquad (\text{A.1.20})
\end{aligned}
\]
Finally, using the spectral conditions (3.2.1), equation (A.1.20) reduces to consideration of the following:
\[
\hat k^{-1}(p)\bigl(\hat k^{-1}(p)\bigr)^{*} = \hat k^{-1}(p)\bigl(\hat k^{*}(p)\bigr)^{-1} = \bigl(\hat k^{*}(p)\hat k(p)\bigr)^{-1} = \bigl(V_T D_T V_T^{*}\bigr)^{-1}, \tag{A.1.21}
\]
where $V_T V_T^{*} = I_{2T+1}$ and $D_T$ is a diagonal matrix with entries $d_j(p)$. Note that
\[
\sum_{|q|,|m|\le T} \bigl|\hat k^{-1}_{qm}(p)\bigr|^2 = \mathrm{tr}\bigl\{\hat k^{-1}(p)\bigl(\hat k^{-1}(p)\bigr)^{*}\bigr\} \tag{A.1.22}
\]
\[
= \sum_{j=-T}^{T} \frac{1}{d_j(p)} . \tag{A.1.23}
\]
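The covariance identity (A.1.19) depends only on the $g_j$ being i.i.d. and on unitarity of the representation. A Python sketch can verify it exactly by enumerating all equally likely samples from a two-point distribution, with arbitrary unitary matrices standing in for $U_T$ (an assumption made purely for illustration):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

def random_unitary(d):
    # QR factorization of a random complex matrix yields a unitary Q
    q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return q

d, n = 3, 3
# two group elements, each with probability 1/2; U(g^{-1}) = U(g)^*
U = [random_unitary(d) for _ in range(2)]

# exact expectations by enumerating all 2^n equally likely samples
E_hh = np.zeros((d, d), dtype=complex)   # E[h^* h]
E_h = np.zeros((d, d), dtype=complex)    # E[h],  h = (1/n) sum_j U(Y_j)^{-1}
for sample in product(range(2), repeat=n):
    h = sum(U[i].conj().T for i in sample) / n
    E_hh += h.conj().T @ h
    E_h += h
E_hh /= 2 ** n
E_h /= 2 ** n

EU = sum(U) / 2                          # E U(Y)
EUinv = sum(u.conj().T for u in U) / 2   # E U(Y^{-1})

lhs = E_hh - E_h.conj().T @ E_h
rhs = np.eye(d) / n - EU @ EUinv / n
assert np.allclose(lhs, rhs)
```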
Then the upper bound for the MISE can be put together as
\[
\begin{aligned}
E\|\hat f^{n}-f\|^2 &\ll \frac{1}{n}\int_0^T \sum_{j=-T}^{T}\frac{1}{d_j(p)}\, p\,dp + T^{-2s} && (\text{A.1.24}) \\
&\le \frac{2T+1}{n}\int_0^T \sup_{|j|\le T}\frac{1}{d_j(p)}\, p\,dp + T^{-2s} && (\text{A.1.25}) \\
&\ll \frac{T}{n}\int_0^T \gamma_0^{-1}\,(T^2+p^2)^{\beta}\, p\,dp + T^{-2s} && (\text{A.1.26}) \\
&\ll \frac{T^{2\beta+3}}{n} + T^{-2s} . && (\text{A.1.27})
\end{aligned}
\]
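The step from (A.1.26) to (A.1.27) rests on $\int_0^T (T^2+p^2)^{\beta}\, p\,dp \asymp T^{2\beta+2}$, which combines with the leading factor $T/n$ to give $T^{2\beta+3}/n$. A Python check of the closed form of this integral, for arbitrary illustrative values of $\beta$ and $T$:

```python
import numpy as np

beta, T = 1.5, 7.0
p = np.linspace(0.0, T, 200_001)
trapezoid = np.trapezoid if hasattr(np, "trapezoid") else np.trapz
numeric = trapezoid((T**2 + p**2) ** beta * p, p)

# closed form: [(T^2+p^2)^{beta+1} / (2(beta+1))]_0^T
#            = T^{2beta+2} (2^{beta+1} - 1) / (2(beta+1))
closed = T ** (2 * beta + 2) * (2 ** (beta + 1) - 1) / (2 * (beta + 1))
assert abs(numeric - closed) / closed < 1e-6
```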
A.2 Proof of Corollary 3.2.2
Given
\[
E\|\hat f^{n}-f\|^2 \ll \frac{T^{2\beta+3}}{n} + T^{-2s} \tag{A.2.1}
\]
as $T,n\to\infty$, with $f\in\Theta(s,Q)$ and $s>3/2$, choosing $T=n^{1/(2s+2\beta+3)}$ gives
\[
E\|\hat f^{n}-f\|^2 \ll n^{\frac{2\beta+3}{2s+2\beta+3}-1} + n^{-\frac{2s}{2s+2\beta+3}} \tag{A.2.2}
\]
\[
= n^{-\frac{2s}{2s+2\beta+3}} + n^{-\frac{2s}{2s+2\beta+3}}, \tag{A.2.3}
\]
giving the required result.
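The choice $T=n^{1/(2s+2\beta+3)}$ is exactly the value that balances the two terms of (A.2.1). A Python sketch confirming the exponent arithmetic, with illustrative values of $s$ and $\beta$:

```python
from math import isclose

# check that T = n^{1/(2s+2b+3)} balances T^{2b+3}/n against T^{-2s}
# at the rate n^{-2s/(2s+2b+3)}
s, b = 2.0, 1.0          # smoothness s > 3/2 and spectral decay beta
n = 10_000.0
T = n ** (1.0 / (2 * s + 2 * b + 3))

variance_term = T ** (2 * b + 3) / n
bias_term = T ** (-2 * s)
rate = n ** (-2 * s / (2 * s + 2 * b + 3))

assert isclose(variance_term, rate, rel_tol=1e-9)
assert isclose(bias_term, rate, rel_tol=1e-9)
```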
A.3 Proof of Corollary 3.2.3
Given
\[
E\|\hat f^{n}-f\|^2 \ll \frac{T^{2\beta+3}}{n} + T^{-2s} \tag{A.3.1}
\]
as $T,n\to\infty$, with $f\in\Theta(s,Q)$ and $s>3/2$, choosing $T=n^{1/(2s+3)}$ gives
\[
E\|\hat f^{n}-f\|^2 \ll n^{\frac{2\beta+3}{2s+3}-1} + n^{-\frac{2s}{2s+3}} \tag{A.3.2}
\]
\[
= n^{-\frac{2s-2\beta}{2s+3}} + n^{-\frac{2s}{2s+3}}, \tag{A.3.3}
\]
giving the required result.
A.4 Proof of Theorem 3.2.4
For convenience, adopt the following notation:
\[
\kappa^{\nu}(p) = \hat k_T^{*}(p)\bigl(\hat k_T(p)\hat k_T^{*}(p) + \nu I_{2T+1}\bigr)^{-1} .
\]
A.4.1 Bias Calculation
As before, first consider evaluating
\[
\begin{aligned}
f(g) - E\hat f^{n}(g) ={}& \int_0^\infty \sum_{|\ell|>T}\sum_{|m|>T} \hat f_{\ell m}(p)\, u_{m\ell}(g,p)\, p\,dp \\
&+ \int_0^\infty \sum_{|\ell|\le T}\sum_{|m|>T} \hat f_{\ell m}(p)\, u_{m\ell}(g,p)\, p\,dp \\
&+ \int_0^\infty \sum_{|\ell|>T}\sum_{|m|\le T} \hat f_{\ell m}(p)\, u_{m\ell}(g,p)\, p\,dp \\
&+ \int_T^\infty \sum_{|\ell|\le T}\sum_{|m|\le T} \hat f_{\ell m}(p)\, u_{m\ell}(g,p)\, p\,dp \\
&+ \int_0^T \sum_{|\ell|\le T}\sum_{|m|\le T} \bigl(\hat f_{\ell m}(p) - E\hat f^{n}_{\ell m}(p)\bigr)\, u_{m\ell}(g,p)\, p\,dp .
\end{aligned}
\]
Again, the first four terms simplify primarily due to the Sobolev condition (3.1.4). The interim step is to multiply by the factor $(1+\ell^2+m^2+2p^2)^{-s}(1+\ell^2+m^2+2p^2)^{s}$. This gives a factor of $T^{-2s}$ from the compression and leaves the first four terms following the above equality (multiplied by $(1+\ell^2+m^2+2p^2)^{s}$) bounded by the Sobolev condition (3.1.4). Particulars for the first four terms are the same as the calculations from (A.1.2) to (A.1.6).

Now, consider the final term of the bias decomposition without the integration. Also note the use of the following notation:
\[
\kappa^{\nu}_{qm}(p) = \Bigl[\hat k_T^{*}(p)\bigl(\hat k_T(p)\hat k_T^{*}(p)+\nu I_{2T+1}\bigr)^{-1}\Bigr]_{qm} .
\]
In particular,
\[
\begin{aligned}
\sum_{|\ell|,|m|\le T} \bigl|\hat f_{\ell m}(p)-E\hat f^{n}_{\ell m}(p)\bigr|^2
&= \sum_{|\ell|,|m|\le T} \Bigl|\hat f_{\ell m}(p)-\sum_{|q|\le T} E\hat h^{n}_{\ell q}(p)\,\kappa^{\nu}_{qm}(p)\Bigr|^2 \\
&= \sum_{|\ell|,|m|\le T} \Bigl|\hat f_{\ell m}(p)-\sum_{|q|\le T} \hat h_{\ell q}(p)\,\kappa^{\nu}_{qm}(p)\Bigr|^2 \\
&= \sum_{|\ell|,|m|\le T} \Bigl|\hat f_{\ell m}(p)-\sum_{|q|\le T}\sum_{q'=-\infty}^{\infty} \hat f_{\ell q'}(p)\,\hat k_{q'q}(p)\,\kappa^{\nu}_{qm}(p)\Bigr|^2 \\
&= \sum_{|\ell|,|m|\le T} \Bigl|\hat f_{\ell m}(p)-\sum_{q'=-\infty}^{\infty} \hat f_{\ell q'}(p)\sum_{|q|\le T} \hat k_{q'q}(p)\,\kappa^{\nu}_{qm}(p)\Bigr|^2 \\
&= \sum_{|\ell|,|m|\le T} \Bigl|\sum_{q'=-\infty}^{\infty} \hat f_{\ell q'}(p)\Bigl(\delta_{q'm}-\sum_{|q|\le T} \hat k_{q'q}(p)\,\kappa^{\nu}_{qm}(p)\Bigr)\Bigr|^2 && (\text{A.4.1}) \\
&\le \sum_{|\ell|\le T}\sum_{q'=-\infty}^{\infty} \bigl|\hat f_{\ell q'}(p)\bigr|^2 \sum_{|m|\le T}\sum_{q'=-\infty}^{\infty} \Bigl|\delta_{q'm}-\sum_{|q|\le T} \hat k_{q'q}(p)\,\kappa^{\nu}_{qm}(p)\Bigr|^2 .
\end{aligned}
\]
Recall the following bound:
\[
\sum_{|\ell|\le T}\sum_{q'=-\infty}^{\infty} \bigl|\hat f_{\ell q'}(p)\bigr|^2 \le \int_{SE(2)} |f(g)|^2\, dg < \infty, \tag{A.4.2}
\]
and furthermore,
\[
\sum_{|m|\le T}\sum_{q'=-\infty}^{\infty} \Bigl|\delta_{q'm}-\sum_{|q|\le T}\hat k_{q'q}(p)\,\kappa^{\nu}_{qm}(p)\Bigr|^2
= \sum_{|m|\le T}\sum_{|q'|\le T} \Bigl|\delta_{q'm}-\sum_{|q|\le T}\hat k_{q'q}(p)\,\kappa^{\nu}_{qm}(p)\Bigr|^2 \tag{A.4.3}
\]
\[
\qquad + \sum_{|m|\le T}\sum_{|q'|>T} \Bigl|\sum_{|q|\le T}\hat k_{q'q}(p)\,\kappa^{\nu}_{qm}(p)\Bigr|^2 . \tag{A.4.4}
\]
The term given by (A.4.3) has a simple calculation, since it is an entirely finite summation:
\[
\begin{aligned}
\sum_{|m|\le T}\sum_{|q'|\le T}\Bigl|\delta_{q'm}-\sum_{|q|\le T}\hat k_{q'q}(p)\,\kappa^{\nu}_{qm}(p)\Bigr|^2
&= \Bigl\| I_{2T+1} - \hat k_T(p)\hat k_T^{*}(p)\bigl(\hat k_T(p)\hat k_T^{*}(p)+\nu I_{2T+1}\bigr)^{-1}\Bigr\|^2_{\mathrm{tr}} \\
&= \Bigl\| I_{2T+1} - V_T(p)\, D_T(p)\bigl(D_T(p)+\nu I_{2T+1}\bigr)^{-1} V_T^{*}(p)\Bigr\|^2_{\mathrm{tr}} \\
&= \Bigl\| I_{2T+1} - D_T(p)\bigl(D_T(p)+\nu I_{2T+1}\bigr)^{-1}\Bigr\|^2_{\mathrm{tr}} \\
&= \sum_{|j|\le T}\Bigl(1-\frac{d_j(p)}{d_j(p)+\nu}\Bigr)^2 = \sum_{|j|\le T}\Bigl(\frac{\nu}{d_j(p)+\nu}\Bigr)^2 .
\end{aligned}
\]
With the integration, this gives
\[
\int_0^T \sum_{j=-T}^{T}\Bigl(\frac{\nu}{d_j(p)+\nu}\Bigr)^2\, p\,dp . \tag{A.4.5}
\]
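The collapse to the diagonal entries $\nu/(d_j(p)+\nu)$, and the invariance of the squared trace norm under conjugation by the unitary $V_T(p)$, can be verified numerically. A Python sketch with an arbitrary positive spectrum standing in for the $d_j(p)$:

```python
import numpy as np

rng = np.random.default_rng(1)
d = rng.uniform(0.1, 2.0, size=7)   # spectrum d_j > 0
nu = 0.3                            # regularization parameter

D = np.diag(d)
I = np.eye(7)
residual = I - D @ np.linalg.inv(D + nu * I)

# diagonal entries collapse to nu / (d_j + nu)
assert np.allclose(np.diag(residual), nu / (d + nu))

# conjugating by an orthogonal V leaves the squared norm unchanged
q, _ = np.linalg.qr(rng.normal(size=(7, 7)))
M = I - q @ (D @ np.linalg.inv(D + nu * I)) @ q.T
assert np.isclose(np.sum(np.diag(residual) ** 2), np.linalg.norm(M, 'fro') ** 2)
```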
Now consider the term (A.4.4) along with the integration; explicit reference to $p$ is dropped in a number of lines. Then
\[
\int_0^T \sum_{|m|\le T}\sum_{|q'|>T}\Bigl|\sum_{|q|\le T}\hat k_{q'q}\,\kappa^{\nu}_{qm}\Bigr|^2\, p\,dp
\le \int_0^T \sum_{|q'|>T}\sum_{|q|\le T}\bigl|\hat k_{q'q}\bigr|^2 \sum_{|m|\le T}\sum_{|q|\le T}\bigl|\kappa^{\nu}_{qm}\bigr|^2\, p\,dp .
\]
Now the second summation is equivalent to
\[
\mathrm{tr}\bigl\{\hat k_T\hat k_T^{*}\bigl(\hat k_T\hat k_T^{*}+\nu I_{2T+1}\bigr)^{-2}\bigr\}
= \mathrm{tr}\bigl\{ D_T\bigl(D_T+\nu I_{2T+1}\bigr)^{-2}\bigr\}
= \sum_{|j|\le T}\frac{d_j(p)}{(d_j(p)+\nu)^2} . \tag{A.4.6}
\]
Return to the full bias term and make the appropriate substitutions, giving
\[
\begin{aligned}
\int_0^T \sum_{|q'|>T}\sum_{|q|\le T}\bigl|\hat k_{q'q}(p)\bigr|^2 \sum_{|j|\le T}\frac{d_j(p)}{(d_j(p)+\nu)^2}\, p\,dp
&\ll T\,\frac{\gamma_1 T^{-2\beta}}{(\gamma_0 T^{-2\beta}+\nu)^2}\int_0^T \sum_{|q'|>T}\sum_{|q|\le T}\bigl|\hat k_{q'q}(p)\bigr|^2\, p\,dp \\
&\ll T^{2\beta+1}\int_0^T \sum_{|q'|>T}\sum_{|q|\le T}\bigl|\hat k_{q'q}(p)\bigr|^2\, p\,dp .
\end{aligned}
\]
At this point, the tail condition (3.2.2) assumed earlier comes into play and gives
\[
\begin{aligned}
\|E\hat f^{n}-f\|_2^2 &\ll T^{-2s} + \int_0^T \sum_{j=-T}^{T}\Bigl(\frac{\nu}{d_j(p)+\nu}\Bigr)^2\, p\,dp + T^{2\beta+1}\, T^{-2s-2\beta-1} \\
&\ll T^{-2s}\bigl(1+o(1)\bigr) + \int_0^T \sum_{j=-T}^{T}\Bigl(\frac{\nu}{d_j(p)+\nu}\Bigr)^2\, p\,dp
\end{aligned}
\]
as $T\to\infty$.
A.4.2 Variance Calculation
Recall the following notation:
\[
\kappa^{\nu}(p) = \hat k_T^{*}(p)\bigl(\hat k_T(p)\hat k_T^{*}(p)+\nu I_{2T+1}\bigr)^{-1}, \tag{A.4.7}
\]
and note that, in the general case,
\[
\begin{aligned}
E\|\hat f^{n}-E\hat f^{n}\|^2
&= E\int_0^T \sum_{|\ell|,|m|\le T}\bigl|\hat f^{n}_{\ell m}(p)-E\hat f^{n}_{\ell m}(p)\bigr|^2\, p\,dp && (\text{A.4.8}) \\
&= E\int_0^T \sum_{|\ell|,|m|\le T}\Bigl|\sum_{|q|\le T}\bigl(\hat h^{n}_{\ell q}(p)-E\hat h^{n}_{\ell q}(p)\bigr)\,\kappa^{\nu}_{qm}(p)\Bigr|^2\, p\,dp . && (\text{A.4.9})
\end{aligned}
\]
Now apply the expectation to just the summand and let $Y$ denote a generic random $SE(2)$ quantity:
\[
\begin{aligned}
E\sum_{|\ell|,|m|\le T}\Bigl|\sum_{|q|\le T}\bigl(\hat h^{n}_{\ell q}(p)-E\hat h^{n}_{\ell q}(p)\bigr)\,\kappa^{\nu}_{qm}(p)\Bigr|^2
&\ll \frac{1}{n}\sum_{|m|\le T}\sum_{|q'|,|q|\le T} E\Bigl\{\sum_{\ell} u_{q\ell}(Y,p)\, u_{\ell q'}(Y^{-1},p)\Bigr\}\,\kappa^{\nu}_{qm}(p)\,\overline{\kappa^{\nu}_{q'm}(p)} \\
&= \frac{1}{n}\sum_{|m|\le T}\sum_{|q'|,|q|\le T} u_{qq'}(e,p)\,\kappa^{\nu}_{qm}(p)\,\overline{\kappa^{\nu}_{q'm}(p)} \\
&= \frac{1}{n}\sum_{|m|\le T}\sum_{|q|\le T}\bigl|\kappa^{\nu}_{qm}(p)\bigr|^2 . && (\text{A.4.10})
\end{aligned}
\]
Here, use the fact that $U(g,p)^{*}=U(g^{-1},p)$, as well as the homomorphism property $U(g,p)U(g^{-1},p)=U(gg^{-1},p)=U(e,p)=(\delta_{\ell m})$, where $e=(I_2,0)$ is the unit element of $SE(2)$ and $\delta_{\ell m}$ is the Kronecker delta. Note that
\[
\sum_{|q|,|m|\le T}\bigl|\kappa^{\nu}_{qm}(p)\bigr|^2
= \mathrm{tr}\Bigl\{\hat k_T(p)\hat k_T^{*}(p)\bigl(\hat k_T(p)\hat k_T^{*}(p)+\nu I_{2T+1}\bigr)^{-2}\Bigr\}
= \sum_{|j|\le T}\frac{d_j(p)}{(d_j(p)+\nu)^2} . \tag{A.4.11}
\]
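Identity (A.4.11) states that the squared entries of the Tikhonov-type inverse sum to a trace functional of the spectrum. A Python sketch with a random complex matrix standing in for $\hat k_T(p)$:

```python
import numpy as np

rng = np.random.default_rng(3)
m, nu = 6, 0.2
K = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))   # stand-in for k_T(p)
KK = K @ K.conj().T
kappa = K.conj().T @ np.linalg.inv(KK + nu * np.eye(m))      # Tikhonov-type inverse

# sum of squared moduli of kappa's entries equals tr{KK*(KK* + nu I)^{-2}}
lhs = np.sum(np.abs(kappa) ** 2)
d = np.linalg.eigvalsh(KK)                                   # spectrum d_j of KK*
rhs = np.sum(d / (d + nu) ** 2)
assert np.isclose(lhs, rhs)
```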
Consequently, by substitution, the formula for the variance is obtained:
\[
\begin{aligned}
E\|\hat f^{n}-E\hat f^{n}\|^2
&\ll \frac{1}{n}\int_0^T \sum_{j=-T}^{T}\frac{d_j(p)}{(d_j(p)+\nu)^2}\, p\,dp \\
&\ll \frac{2T+1}{n}\int_0^T \frac{\gamma_1\,(T^2+p^2)^{-\beta}}{\bigl(\gamma_0\,(T^2+p^2)^{-\beta}+\nu\bigr)^2}\, p\,dp \\
&\ll \frac{T^{3}\, 2^{-\beta}\gamma_1 T^{-2\beta}}{n\,\bigl(\gamma_0 T^{-2\beta}+\nu\bigr)^2}
\end{aligned}
\]
as $n,T\to\infty$. Using this inequality in (3.2.18) gives
\[
E\|\hat f^{n}-f\|^2 \ll \frac{T^{3}\bigl(2^{-\beta}\gamma_1 T^{-2\beta}+n\nu^2\bigr)}{n\,\bigl(\gamma_0 T^{-2\beta}+\nu\bigr)^2} + T^{-2s} \tag{A.4.12}
\]
as $n,T\to\infty$.
A.5 Proof of Corollary 3.2.5
The MISE for the regularized case is
\[
E\|\hat f^{n,\nu}-f\|^2 \ll \frac{1}{n}\int_0^T \sum_{j=-T}^{T}\frac{d_j(p)+n\nu^2}{(d_j(p)+\nu)^2}\, p\,dp + T^{-2s} \tag{A.5.1}
\]
as $n,T\to\infty$. Taking the special case where $\hat k_T^{-1}$ exists, so that $\nu=0$ can be chosen, gives
\[
E\|\hat f^{n}-f\|^2 \ll \frac{1}{n}\int_0^T \sum_{j=-T}^{T}\frac{1}{d_j(p)}\, p\,dp + T^{-2s}
\]
as $n,T\to\infty$, which is clearly equal to the case in Theorem 3.2.1.
A.6 Proof of Theorem 3.2.6
Recall that the spectral conditions reduce to considering the following inequality, for $0<\gamma_0\le\gamma_1<\infty$:
\[
\gamma_0\,(T^2+p^2)^{-\beta} \le d_j(p) \le \gamma_1\,(T^2+p^2)^{-\beta} \tag{A.6.1}
\]
for $|j|\le T$. Also, recall the equation for the MISE,
\[
E\|\hat f^{n}-f\|^2 \ll \frac{1}{n}\int_0^T \sum_{j=-T}^{T}\frac{d_j(p)+n\nu^2}{(d_j(p)+\nu)^2}\, p\,dp + T^{-2s} \tag{A.6.2}
\]
as $n,T\to\infty$. Taking the supremum of the summand over $j$ releases a factor of $(2T+1)$, and an additional $T^2$ is obtained from the integration $\int_0^T p\,dp$, giving a total of $T^3$. The approach is then to obtain the inequality
\[
\frac{\gamma_1 2^{-\beta} T^{-2\beta} + n\nu^2}{(\gamma_0 T^{-2\beta}+\nu)^2} \le \xi\, T^{2\beta} \tag{A.6.3}
\]
for some $\xi>0$, since
\[
\frac{d_j(p)+n\nu^2}{(d_j(p)+\nu)^2} \le \frac{\gamma_1 2^{-\beta} T^{-2\beta}+n\nu^2}{(\gamma_0 T^{-2\beta}+\nu)^2} . \tag{A.6.4}
\]
Assuming $T^{2\beta}=o(n)$, solving for $\nu$ gives the following roots:
\[
\nu = \frac{\xi\gamma_0 \pm \sqrt{\xi^2\gamma_0^2 - (n-\xi T^{2\beta})(\gamma_1 2^{-\beta}-\xi\gamma_0^2)\,T^{-2\beta}}}{n-\xi T^{2\beta}} .
\]
A small amount of algebra is needed to determine the restrictions on $\xi$ as well as the dominant root. Clearly, two conditions must be satisfied for $\nu>0$ and $\nu\in\mathbb{R}$. The first is that $n-\xi T^{2\beta}\ne 0$, implying that $\xi\ne nT^{-2\beta}$; the second is that the term under the square root be non-negative. This condition can be checked as follows:
\[
\begin{aligned}
0 &\le \gamma_0^2\xi^2 - (n-\xi T^{2\beta})\,T^{-2\beta}\,(\gamma_1 2^{-\beta}-\xi\gamma_0^2) \\
&= \gamma_0^2\,\xi\, nT^{-2\beta} + \gamma_1 2^{-\beta}\,\xi - \gamma_1 2^{-\beta}\, nT^{-2\beta} \\
&\Longleftrightarrow\; \xi \ge \frac{\gamma_1 2^{-\beta}\, nT^{-2\beta}}{\gamma_0^2\, nT^{-2\beta} + \gamma_1 2^{-\beta}} .
\end{aligned}
\]
Finally, a choice of root must be made and the conditions checked so that $\nu>0$. Assuming the positive root, a single condition must be satisfied, namely $\xi < nT^{-2\beta}$. Assuming the negative root leads to two alternatives: first, if the numerator is positive, implying $(\gamma_0\xi)^2 > \gamma_0^2\xi^2 - (n-\xi T^{2\beta})\,T^{-2\beta}\,(\gamma_1 2^{-\beta}-\xi\gamma_0^2)$, then it is also required that $\xi < nT^{-2\beta}$ (so that the denominator is positive). Solving for $\xi$ in the numerator reduces to solving
\[
0 > -(n-\xi T^{2\beta})\,T^{-2\beta}\,(\gamma_1 2^{-\beta}-\xi\gamma_0^2), \tag{A.6.5}
\]
which in turn requires either $\xi > nT^{-2\beta}$, contradicting the previous requirement, or $\xi > \gamma_1 2^{-\beta}/\gamma_0^2$. A similar process is followed if the numerator is assumed negative; it leads to the conditions $\xi > nT^{-2\beta}$ and $\xi < nT^{-2\beta}$ (a contradiction) or $\xi < \gamma_1 2^{-\beta}/\gamma_0^2$. To summarize, choosing
\[
\nu \le \frac{\xi\gamma_0 + \sqrt{\xi^2\gamma_0^2 - (n-\xi T^{2\beta})(\gamma_1 2^{-\beta}-\xi\gamma_0^2)\,T^{-2\beta}}}{n-\xi T^{2\beta}} \tag{A.6.6}
\]
requires the following conditions on $\xi$:
\[
\xi < nT^{-2\beta}, \tag{A.6.7}
\]
\[
\xi \ge \frac{\gamma_1 2^{-\beta}\, nT^{-2\beta}}{\gamma_0^2\, nT^{-2\beta} + \gamma_1 2^{-\beta}} . \tag{A.6.8}
\]
Assuming
\[
\nu \le \frac{\xi\gamma_0 - \sqrt{\xi^2\gamma_0^2 - (n-\xi T^{2\beta})(\gamma_1 2^{-\beta}-\xi\gamma_0^2)\,T^{-2\beta}}}{n-\xi T^{2\beta}} \tag{A.6.9}
\]
requires either
\[
\xi < nT^{-2\beta} \tag{A.6.10}
\]
and
\[
\xi > \frac{\gamma_1 2^{-\beta}}{\gamma_0^2}, \tag{A.6.11}
\]
or
\[
\xi > nT^{-2\beta} \tag{A.6.12}
\]
with
\[
\xi \ge \frac{\gamma_1 2^{-\beta}\, nT^{-2\beta}}{\gamma_0^2\, nT^{-2\beta} + \gamma_1 2^{-\beta}} \tag{A.6.13}
\]
and
\[
\xi < \frac{\gamma_1 2^{-\beta}}{\gamma_0^2} . \tag{A.6.14}
\]
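The root formula can be checked against the quadratic it solves; under the reconstruction above, (A.6.3) with equality is the quadratic $(n-\xi T^{2\beta})\nu^2 - 2\xi\gamma_0\nu + (\gamma_1 2^{-\beta}-\xi\gamma_0^2)T^{-2\beta} = 0$. A Python sketch with hypothetical parameter values:

```python
import math

# hypothetical values with gamma_0 <= gamma_1 and T^{2b} = o(n)
g0, g1, b, T, n = 1.0, 2.0, 1.0, 10.0, 10_000.0
xi = 1.5  # satisfies xi < n T^{-2b} and the discriminant condition

a = n - xi * T ** (2 * b)                        # coefficient of nu^2
c = (g1 * 2 ** (-b) - xi * g0 ** 2) * T ** (-2 * b)
disc = xi ** 2 * g0 ** 2 - a * c
assert disc >= 0

nu = (xi * g0 + math.sqrt(disc)) / a             # the positive root
# at the root, g1 2^{-b} T^{-2b} + n nu^2 equals xi T^{2b} (g0 T^{-2b} + nu)^2
lhs = g1 * 2 ** (-b) * T ** (-2 * b) + n * nu ** 2
rhs = xi * T ** (2 * b) * (g0 * T ** (-2 * b) + nu) ** 2
assert math.isclose(lhs, rhs, rel_tol=1e-9)
```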
A.7 Development of CV(T)
A.7.1 Empirical Density Estimator
The empirical characteristic function is used to form the elements of the estimated Fourier transform $\hat h^{n}_{\ell m}(p)$, but since this is the direct density estimation case they are equivalent to $\hat f^{n}_{\ell m}(p)$. The development is as follows:
\[
\hat f^{n}_{\ell m}(p) = \frac{1}{n}\sum_{j=1}^{n} u_{\ell m}(g_j^{-1},p) . \tag{A.7.1}
\]
Then, recall the definition of the inverse Fourier transform, so that
\[
\hat f^{n}(g) = \sum_{|\ell|,|m|\le T}\int_0^T \hat f^{n}_{\ell m}(p)\, u_{m\ell}(g,p)\, p\,dp . \tag{A.7.2}
\]
Substitution of $\hat f^{n}_{\ell m}(p)$ and the formulas for the matrix elements $u_{m\ell}(g,p)$ (products of complex exponentials in $\theta$ and $\phi$ with Bessel functions $J_{\ell-m}$) gives
\[
\begin{aligned}
\hat f^{n}(g) &= \sum_{|\ell|,|m|\le T}\int_0^T \frac{1}{n}\sum_{j=1}^{n} u_{\ell m}(g_j^{-1},p)\, u_{m\ell}(g,p)\, p\,dp \\
&= \frac{1}{n}\sum_{|\ell|,|m|\le T}\sum_{j=1}^{n} e^{i[\ell(\theta_j-\theta)+(m-\ell)(\phi_j-\phi)]}\int_0^T J_{\ell-m}(p r_j)\, J_{\ell-m}(p r)\, p\,dp . && (\text{A.7.3})
\end{aligned}
\]
A.7.2 Leave-one-out Cross Validation
The theoretical form for $CV(T)$ is
\[
CV(T) = \int_{SE(2)} \bigl|\hat f^{n}(g)\bigr|^2\, dg - \frac{2}{n}\sum_{k=1}^{n} \hat f^{n,-k}(g_k), \tag{A.7.4}
\]
but the Plancherel formula is used to re-express the first term as
\[
\int_{SE(2)} \bigl|\hat f^{n}(g)\bigr|^2\, dg = \int_0^T \sum_{\ell,m=-T}^{T} \bigl|\hat f^{n}_{\ell m}(p)\bigr|^2\, p\,dp . \tag{A.7.5}
\]
This gives the following for the cross-validation:
\[
CV(T) = \int_0^T \sum_{\ell,m=-T}^{T} \bigl|\hat f^{n}_{\ell m}(p)\bigr|^2\, p\,dp - \frac{2}{n}\sum_{k=1}^{n} \hat f^{n,-k}(g_k) . \tag{A.7.6}
\]
For computational purposes, some further development is needed. First consider the empirical density estimator $\hat f^{n}(g)$, as given by (A.7.3), for the direct density estimation case:
\[
\hat f^{n}(g(\theta,\phi,r)) = \sum_{\ell=-T}^{T}\sum_{m=-T}^{T}\frac{1}{n}\sum_{j=1}^{n} e^{i[\ell(\theta_j-\theta)+(m-\ell)(\phi_j-\phi)]}\int_0^T J_{\ell-m}(p r_j)\, J_{\ell-m}(p r)\, p\,dp . \tag{A.7.7}
\]
With a simple modification this leads to the leave-one-out estimator $\hat f^{n,-k}(g_k)$:
\[
\hat f^{n,-k}(g_k) = \sum_{\ell=-T}^{T}\sum_{m=-T}^{T}\frac{1}{n-1}\sum_{j=1,\, j\ne k}^{n} e^{i[\ell(\theta_j-\theta_k)+(m-\ell)(\phi_j-\phi_k)]}\int_0^T J_{\ell-m}(p r_j)\, J_{\ell-m}(p r_k)\, p\,dp . \tag{A.7.8}
\]
These can now be substituted into the equation for $CV(T)$. Write
\[
S_{\ell m} = \sum_{j,k=1}^{n} e^{i[\ell(\theta_j-\theta_k)+(m-\ell)(\phi_j-\phi_k)]}\int_0^T J_{\ell-m}(p r_j)\, J_{\ell-m}(p r_k)\, p\,dp
\]
for the full double sum, and note that its diagonal ($j=k$) terms reduce to $\int_0^T J_{\ell-m}(p r_k)^2\, p\,dp$. Then (A.7.5) contributes $n^{-2}\sum_{|\ell|,|m|\le T} S_{\ell m}$, the leave-one-out term contributes the off-diagonal part of $S_{\ell m}$ with weight $2/(n(n-1))$, and collecting the two gives
\[
\begin{aligned}
CV(T) &= \frac{1}{n^2}\sum_{|\ell|,|m|\le T} S_{\ell m} - \frac{2}{n(n-1)}\sum_{|\ell|,|m|\le T}\Bigl(S_{\ell m} - \sum_{k=1}^{n}\int_0^T J_{\ell-m}(p r_k)^2\, p\,dp\Bigr) \\
&= -\frac{n+1}{n^2(n-1)}\sum_{|\ell|,|m|\le T}\sum_{j,k=1}^{n} e^{i[\ell(\theta_j-\theta_k)+(m-\ell)(\phi_j-\phi_k)]}\int_0^T J_{\ell-m}(p r_j)\, J_{\ell-m}(p r_k)\, p\,dp \\
&\quad + \frac{2}{n(n-1)}\sum_{|\ell|,|m|\le T}\sum_{k=1}^{n}\int_0^T J_{\ell-m}(p r_k)^2\, p\,dp .
\end{aligned}
\]
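The coefficients $-(n+1)/(n^2(n-1))$ and $2/(n(n-1))$ that appear in the LSCV2 code of Appendix B come from combining the full double sum with its diagonal. The algebra can be verified with arbitrary numbers in Python:

```python
import numpy as np

# verify the coefficient algebra used to combine the CV(T) terms:
# (1/n^2) S - (2/(n(n-1))) (S - D)  ==  -(n+1)/(n^2 (n-1)) S + 2/(n(n-1)) D
rng = np.random.default_rng(2)
n = 8
a = rng.normal(size=(n, n))        # stand-in for the summands over (j, k)
S = a.sum()                        # full double sum, including j = k
D = np.trace(a)                    # diagonal (j = k) part

lhs = S / n**2 - 2.0 / (n * (n - 1)) * (S - D)
rhs = -(n + 1) / (n**2 * (n - 1)) * S + 2.0 / (n * (n - 1)) * D
assert np.isclose(lhs, rhs)
```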
Appendix B
Simulation Code
All R code used for the simulation is included here for the sake of completeness. R libraries used include MASS, boot and circular.
B.l Computational Functions
###
bess.f<-function(lambda,rad1,order,rad2){
h1<-besselJ(lambda*rad1,order)*besselJ(lambda*rad2,order)*lambda
h1}
###
bess.f2<-function(lambda,rad1,order,rad2){
h1<-besselJ(lambda*rad1,-1*order)*besselJ(lambda*rad2,-1*order)*lambda
h1}
###
find.phi<-function(sampleSize,h){
values<-array(0,c(sampleSize,1))
values[,1]<-atan((h[2,3,])/(h[1,3,]))
values}
###
find.radius<-function(sampleSize,h){
values<-array(0,c(sampleSize,1))
values[,1]<-sqrt((h[1,3,])^2 + (h[2,3,])^2)
values}
###
find.theta<-function(sampleSize,h){
values<-array(0,c(sampleSize,1))
values[,1]<-acos(h[1,1,])
values}
###
bivariate<-function(n, mux=0, muy=0, sigx=1, sigy=1, rho=0){
x <- matrix(0, nrow=n, ncol=2)
x[,1] <- rnorm(n, mux, sigx)
sigcond <- sigy*sqrt(1-rho^2)
x[,2] <- sigcond*rnorm(n)+muy+rho*sigy*(x[,1]-mux)/sigx
return(x)}
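The bivariate function draws $x_2$ from its conditional distribution given $x_1$; this construction reproduces the marginal variance $\sigma_y^2$ and correlation $\rho$. A Python check of the moment algebra, with arbitrary illustrative parameter values:

```python
from math import isclose, sqrt

sigx, sigy, rho = 1.3, 0.7, 0.4
sigcond = sigy * sqrt(1 - rho**2)

# x2 = sigcond*z + muy + rho*sigy*(x1 - mux)/sigx, with z independent N(0,1)
var_x2 = sigcond**2 + (rho * sigy)**2    # Var(x2) from the construction
cov = rho * sigy * sigx                  # Cov(x1, x2) = rho*sigy/sigx * Var(x1)

assert isclose(var_x2, sigy**2)
assert isclose(cov / (sigx * sigy), rho)
```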
###
dens2<-function(kappa,mu,mux,sx,muy,sy,rho,theta,phi,r){
xx<-r*cos(phi)
yy<-r*sin(phi)
hh<-(((xx-mux)^2)/sx^2)+(((yy-muy)^2)/sy^2)-
((2*rho*(xx-mux)*(yy-muy))/(sx*sy))
f1<-(exp(kappa*cos(theta-mu)))/(2*pi*besselI(kappa,0))
f2<-(r*exp(-hh/(2*(1-rho^2))))/(2*pi*sx*sy*sqrt(1-rho^2))
f3<-f1*f2
f3}
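The target density coded in dens2 is a von Mises density in $\theta$ multiplied by a bivariate normal written in polar coordinates $(\phi,r)$, with the factor $r$ as the Jacobian, so it should integrate to one. A Python re-implementation and numerical check (the grid limits are illustrative):

```python
import numpy as np

# parameters matching the simulation defaults
kappa, mu = 1.0, np.pi
mux, muy, sx, sy, rho = 0.0, 0.0, 1.0, 1.0, 0.05

def dens2(theta, phi, r):
    xx, yy = r * np.cos(phi), r * np.sin(phi)
    hh = ((xx - mux) ** 2) / sx**2 + ((yy - muy) ** 2) / sy**2 \
         - 2 * rho * (xx - mux) * (yy - muy) / (sx * sy)
    f1 = np.exp(kappa * np.cos(theta - mu)) / (2 * np.pi * np.i0(kappa))
    f2 = r * np.exp(-hh / (2 * (1 - rho**2))) / (2 * np.pi * sx * sy * np.sqrt(1 - rho**2))
    return f1 * f2

theta = np.linspace(0, 2 * np.pi, 101)
phi = np.linspace(-np.pi, np.pi, 101)
r = np.linspace(0, 8, 201)
TH, PH, R = np.meshgrid(theta, phi, r, indexing="ij")
vals = dens2(TH, PH, R)
trapz = np.trapezoid if hasattr(np, "trapezoid") else np.trapz
total = trapz(trapz(trapz(vals, r), phi), theta)
assert abs(total - 1.0) < 0.02
```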
###
LSCV2<-function(T,sampleSize,radius,phi,theta){
radA<-radius
radB<-radius
phiA<-phi
phiB<-phi
thetaA<-theta
thetaB<-theta
mat1<-array(0,c(2*T+1,2*T+1))
mat2<-array(0,c(2*T+1,2*T+1))
ExpMplus<-array(0,c(sampleSize,sampleSize))
ExpP1<-array(0,c(sampleSize,1))
ExpP2<-array(0,c(sampleSize,1))
tempP<-array(0,c(sampleSize,sampleSize))
val1<-rep(0,2*T+1)
val2<-rep(0,2*T+1)
temp1<-rep(0,sampleSize)
BessM<-array(0,c(sampleSize,sampleSize))
for(p in -T:T){
for(q in -T:T){
for(k in 1:sampleSize){
for(j in 1:sampleSize){
if(p-q<0){
h1c<-integrate(bess.f2,rad1=radA[k],order=p-q,rad2=radB[j],lower=0,upper=T)
BessM[k,j]<-h1c$value}
else{
h1c<-integrate(bess.f,rad1=radA[k],order=p-q,rad2=radB[j],lower=0,upper=T)
BessM[k,j]<-h1c$value}}}
ExpP1[]<-exp(1i*(p*thetaA[]+(q-p)*phiA[]))
ExpP2[]<-exp(1i*(-1*p*thetaB[]-(q-p)*phiB[]))
ExpMplus[,]<-ExpP1[]%o%ExpP2[]
tempP[,]<-ExpMplus[,]*BessM[,]
temp4<-sum(diag(BessM[,]))
for(m in 1:sampleSize){ temp1[m]<-sum(tempP[,m]) }
temp2<-sum(temp1[])
mat1[p+T+1,q+T+1]<-((-1*(sampleSize+1))/(sampleSize*(sampleSize-1)))*Re(temp2)
mat2[p+T+1,q+T+1]<-(2/(sampleSize-1))*Re(temp4) }}
for(q in -T:T){ val1[q+T+1]<-sum(mat1[,q+T+1])
val2[q+T+1]<-sum(mat2[,q+T+1])}
valF<-sum(val1)
Est1<-(1/(sampleSize))*valF
valG<-sum(val2)
Est2<-(1/(sampleSize))*valG
CV<-(Est1+Est2)
CV}
###
runme2<-function(sampleSize,T,theta,phi,r,theta1,A){
h2<-rep(1,sampleSize)
val<-rep(1,2*T+1)
mat<-array(1,c(2*T+1,2*T+1))
mat1<-array(1,c(3,3,sampleSize))
h1b<-array(1,c(sampleSize,1))
for(k in 1:sampleSize){
vec1<-c(cos(theta1[k]),sin(theta1[k]),0,-sin(theta1[k]),
cos(theta1[k]),0,A[k,1],A[k,2],1)
mat1[,,k]<-array(vec1,dim=c(3,3))}
gzR<-mat1
phi1<-find.phi(sampleSize,gzR)
rad1<-find.radius(sampleSize,gzR)
for(p in -T:T){
for(q in -T:T){
for(k in 1:sampleSize){
if(p-q<0){
h1c<-integrate(bess.f2,rad1=rad1[k],order=p-q,rad2=r,lower=0,upper=T)
h1b[k]<-h1c$value}
else{
h1c<-integrate(bess.f,rad1=rad1[k],order=p-q,rad2=r,lower=0,upper=T)
h1b[k]<-h1c$value}
h2[k]<-exp(1i*(p*(theta1[k]-theta)+(q-p)*(phi1[k]-phi)))
h3<-(1/(sampleSize))*Re(sum(h2*h1b)) }
mat[p+T+1,q+T+1]<-h3 }}
for(q in -T:T){
val[q+T+1]<-sum(mat[,q+T+1]) }
valF<-sum(val)
valF }
B.2 Plotting Functions
###
surface1<-function(kappa,mu,mux,sx,muy,sy,rho,r,n,angle1=30,angle2=30){
xx1<-seq(0,2*pi,len=n)
yy1<-seq(-pi,pi,len=n)
zz1<-matrix(1,nrow=n,ncol=n)
for(k in 1:n){ for(m in 1:n){
zz1[k,m]<-dens2(kappa,mu,mux,sx,muy,sy,rho,theta=xx1[k],phi=yy1[m],r)}}
postscript(file="SurfaceXX.ps")
persp(xx1,yy1,zz1,theta=angle1,phi=angle2,xlab="theta",
ylab="phi",cex.axis=0.8,tcl=-0.4,zlab="",cex.lab=0.7,
ticktype="detailed",nticks=5)
dev.off()
zz1}
###
surface2<-function(kappa,mu,mux,sx,muy,sy,rho,the,n,angle1=30,angle2=30){
xx1<-seq(0.1,2,len=n)
yy1<-seq(-pi,pi,len=n)
zz1<-matrix(1,nrow=n,ncol=n)
for(k in 1:n){ for(m in 1:n){
zz1[k,m]<-dens2(kappa,mu,mux,sx,muy,sy,rho,theta=the,phi=yy1[m],r=xx1[k])}}
postscript(file="Surface45.ps")
persp(xx1,yy1,zz1,theta=angle1,phi=angle2,xlab="r",
ylab="phi",cex.axis=0.8,tcl=-0.4,zlab="",cex.lab=0.7,
ticktype="detailed",nticks=5)
dev.off()
zz1}
###
surface3<-function(kappa,mu,mux,sx,muy,sy,rho,ph,n,angle1=30,angle2=30){
xx1<-seq(0,2*pi,len=n)
yy1<-seq(0.1,2,len=n)
zz1<-matrix(1,nrow=n,ncol=n)
for(k in 1:n){ for(m in 1:n){
zz1[k,m]<-dens2(kappa,mu,mux,sx,muy,sy,rho,theta=xx1[k],phi=ph,r=yy1[m])}}
postscript(file="Surface45.ps")
persp(xx1,yy1,zz1,theta=angle1,phi=angle2,xlab="theta",
ylab="r",cex.axis=0.8,tcl=-0.4,zlab="",cex.lab=0.7,
ticktype="detailed",nticks=5)
dev.off()
zz1}
###
contour1<-function(kappa,mu,mux,sx,muy,sy,rho,r,n){
xx1<-seq(0,2*pi,len=n)
yy1<-seq(-pi,pi,len=n)
zz1<-matrix(1,nrow=n,ncol=n)
for(k in 1:n){ for(m in 1:n){
zz1[k,m]<-dens2(kappa,mu,mux,sx,muy,sy,rho,theta=xx1[k],phi=yy1[m],r)}}
postscript(file="Surface45.ps")
contour(xx1,yy1,zz1,xlab="theta",ylab="phi",nlevels=15,drawlabels=FALSE)
dev.off()
zz1}
###
contour2<-function(kappa,mu,mux,sx,muy,sy,rho,the,n){
xx1<-seq(0.1,2,len=n)
yy1<-seq(-pi,pi,len=n)
zz1<-matrix(1,nrow=n,ncol=n)
for(k in 1:n){ for(m in 1:n){
zz1[k,m]<-dens2(kappa,mu,mux,sx,muy,sy,rho,theta=the,phi=yy1[m],r=xx1[k])}}
postscript(file="Surface45.ps")
contour(xx1,yy1,zz1,xlab="r",ylab="phi",nlevels=15,drawlabels=FALSE)
dev.off()
zz1}
###
contour3<-function(kappa,mu,mux,sx,muy,sy,rho,ph,n){
xx1<-seq(0,2*pi,len=n)
yy1<-seq(0.1,2,len=n)
zz1<-matrix(1,nrow=n,ncol=n)
for(k in 1:n){ for(m in 1:n){
zz1[k,m]<-dens2(kappa,mu,mux,sx,muy,sy,rho,theta=xx1[k],phi=ph,r=yy1[m])}}
postscript(file="Surface45.ps")
contour(xx1,yy1,zz1,xlab="theta",ylab="r",nlevels=15,drawlabels=FALSE)
dev.off()
zz1}
###
empSurface1<-function(kappa,mu,mux,sx,muy,sy,rho,r,n,T,
sampleSize,angle1=30,angle2=30){
xx1<-seq(0,2*pi,len=n)
yy1<-seq(-pi,pi,len=n)
zz1<-matrix(1,nrow=n,ncol=n)
theta1<-rvonmises(sampleSize,mu,kappa)
A<-bivariate(sampleSize,mux,muy,sx,sy,rho)
for(k in 1:n){ for(m in 1:n){
zz1[k,m]<-runme2(sampleSize,T,theta=xx1[k],phi=yy1[m],r,theta1,A)}}
postscript(file="GraphA.ps")
persp(xx1,yy1,zz1,theta=angle1,phi=angle2,xlab="theta",
ylab="phi",cex.axis=0.8,tcl=-0.4,zlab="",cex.lab=0.7,
ticktype="detailed",nticks=5)
dev.off()
zz1}
###
empSurface2<-function(sampleSize,T,kappa,mu,mux,sx,muy,sy,rho,
the,n,angle1=30,angle2=30){
xx1<-seq(0.1,2,len=n)
yy1<-seq(-pi,pi,len=n)
zz1<-matrix(1,nrow=n,ncol=n)
theta1<-rvonmises(sampleSize,mu,kappa)
A<-bivariate(sampleSize,mux,muy,sx,sy,rho)
for(k in 1:n){ for(m in 1:n){
zz1[k,m]<-runme2(sampleSize,T,theta=the,phi=yy1[m],r=xx1[k],theta1,A)}}
postscript(file="Surface45.ps")
persp(xx1,yy1,zz1,theta=angle1,phi=angle2,xlab="r",
ylab="phi",cex.axis=0.8,tcl=-0.4,zlab="",cex.lab=0.7,
ticktype="detailed",nticks=5)
dev.off()
zz1}
###
empSurface3<-function(sampleSize,T,kappa,mu,mux,sx,muy,sy,
rho,ph,n,angle1=30,angle2=30){
xx1<-seq(0,2*pi,len=n)
yy1<-seq(0.1,2,len=n)
zz1<-matrix(1,nrow=n,ncol=n)
theta1<-rvonmises(sampleSize,mu,kappa)
A<-bivariate(sampleSize,mux,muy,sx,sy,rho)
for(k in 1:n){ for(m in 1:n){
zz1[k,m]<-runme2(sampleSize,T,theta=xx1[k],phi=ph,r=yy1[m],theta1,A)}}
postscript(file="Surface45.ps")
persp(xx1,yy1,zz1,theta=angle1,phi=angle2,xlab="theta",
ylab="r",cex.axis=0.8,tcl=-0.4,zlab="",cex.lab=0.7,
ticktype="detailed",nticks=5)
dev.off()
zz1}
###
empContour1<-function(sampleSize,T,kappa,mu,mux,sx,muy,sy,rho,r,n){
xx1<-seq(0,2*pi,len=n)
yy1<-seq(-pi,pi,len=n)
zz1<-matrix(1,nrow=n,ncol=n)
theta1<-rvonmises(sampleSize,mu,kappa)
A<-bivariate(sampleSize,mux,muy,sx,sy,rho)
for(k in 1:n){ for(m in 1:n){
zz1[k,m]<-runme2(sampleSize,T,theta=xx1[k],phi=yy1[m],r,theta1,A)}}
postscript(file="Surface45.ps")
contour(xx1,yy1,zz1,xlab="theta",ylab="phi",nlevels=15,drawlabels=FALSE)
dev.off()
zz1}
###
empContour2<-function(sampleSize,T,kappa,mu,mux,sx,muy,sy,rho,the,n){
xx1<-seq(0.1,2,len=n)
yy1<-seq(-pi,pi,len=n)
zz1<-matrix(1,nrow=n,ncol=n)
theta1<-rvonmises(sampleSize,mu,kappa)
A<-bivariate(sampleSize,mux,muy,sx,sy,rho)
for(k in 1:n){ for(m in 1:n){
zz1[k,m]<-runme2(sampleSize,T,theta=the,phi=yy1[m],r=xx1[k],theta1,A)}}
postscript(file="Surface45.ps")
contour(xx1,yy1,zz1,xlab="r",ylab="phi",nlevels=15,drawlabels=FALSE)
dev.off()
zz1}
###
empContour3<-function(sampleSize,T,kappa,mu,mux,sx,muy,sy,rho,ph,n){
xx1<-seq(0,2*pi,len=n)
yy1<-seq(0.1,2,len=n)
zz1<-matrix(1,nrow=n,ncol=n)
theta1<-rvonmises(sampleSize,mu,kappa)
A<-bivariate(sampleSize,mux,muy,sx,sy,rho)
for(k in 1:n){ for(m in 1:n){
zz1[k,m]<-runme2(sampleSize,T,theta=xx1[k],phi=ph,r=yy1[m],theta1,A)}}
postscript(file="Surface45.ps")
contour(xx1,yy1,zz1,xlab="theta",ylab="r",nlevels=15,drawlabels=FALSE)
dev.off()
zz1}
###
B.3 Sample Simulation
set.seed(8141)
sampleSize<-1000
mu<-pi
kappa<-l
mux<-0
muy<-0
sx<-l
sy<-l
rho<-0.05
theta<-rep(0,sampleSize)
phi<-rep(0,sampleSize)
theta<-rvonmises(sampleSize,mu,kappa)
A<-bivariate(sampleSize,mux,muy,sx,sy,rho)
mat1<-array(0,c(3,3,sampleSize))
for(k in 1:sampleSize){
vec1<-c(cos(theta[k]),sin(theta[k]),0,-sin(theta[k]),cos(theta[k]),0,A[k,1],A[k,2],1)
mat1[,,k]<-array(vec1,dim=c(3,3))}
gzR<-mat1
phi<-find.phi(sampleSize,gzR)
radius<-find.radius(sampleSize,gzR)
g1<-LSCV2(T=1,sampleSize,radius,phi,theta)
g2<-LSCV2(T=2,sampleSize,radius,phi,theta)