Int’l Conf. on Computer & Communication Technology │ICCCT’10│
978-1-4244-9032-5/10/$26.00 ©2010 IEEE
Curvelet Transform based Object Tracking
Swati Nigam and Ashish Khare Department of Electronics and Communication
University of Allahabad, Allahabad
swatinigam.au@gmail.com, khare@allduniv.ac.in
Abstract – In this paper, we propose a new method for object tracking in video sequences based on the curvelet transform. The wavelet transform has been widely used for object tracking, but it cannot represent curve discontinuities well. We therefore use the curvelet transform, and track objects using the energy of curvelet coefficients across a sequence of frames. The method is suitable for tracking general objects as well as human objects. It is simple and requires no parameters other than the curvelet coefficients. Experimental results demonstrate the performance of the method.
Keywords – Object Tracking, Video Sequences, Curvelet Transform.
I. INTRODUCTION
Object tracking in video sequences is a widely studied problem in computer vision [1]. It underlies applications in many areas, such as security and surveillance, clinical and biomechanical applications, human-robot interaction, entertainment, education, and training. Early research focused on tracking a single object, whereas recent work addresses tracking of multiple objects [2].
Various tracking algorithms are surveyed in [3]. It is now well established that complex wavelets are among the most promising tools for object tracking, since they are well suited to representing local features, and several wavelet-based tracking methods exist [4,6,7,20]. The Dual-Tree Complex Wavelet Transform is an efficient approach with better directional selectivity [4]. Important work on object tracking was carried out by Khansari et al. [5-7]: they developed a noise-robust algorithm for tracking user-defined shapes in noisy images and video sequences using features generated in the Undecimated Wavelet Packet Transform (UWPT) domain [5], analyzed feature-vector generation and block matching in the UWPT domain for tracking human objects in crowded scenes in the presence of occlusion [6], and introduced a tracking algorithm that can manage partial or short-term full occlusion [7].
However, the wavelet transform does not represent edge discontinuities optimally: a discontinuity across a simple edge affects all the wavelet coefficients along the edge. The ridgelet transform was introduced to overcome this weakness of wavelets in higher dimensions [8]; it provides a good representation of line singularities in 2D. Xiao et al. [9] presented a human-object tracking system based on the ridgelet transform and showed it to be an alternative to wavelet representations of image data.
The ridgelet transform, however, can handle only one-dimensional singularities. The tight frame of curvelets [10] is an effective nonadaptive representation for objects with edges [11], and the continuous curvelet transform [12,13] and the 3D discrete curvelet transform [14] can also handle two-dimensional singularities. Zhang et al. [15] experimentally confirmed that using the curvelet transform to extract face features is an effective approach, Lee and Chen [16] used the digital curvelet transform to capture high-dimensional features at different scales and angles, and Mandal et al. [17] presented an improvement that reduces the number of coefficients.
Extending these works on the curvelet transform, in this paper we employ the 2D discrete curvelet transform and describe a new method for tracking objects in a video scene. The approach uses the energy of the curvelet coefficients for tracking.
The rest of the paper is organized as follows: Section II describes the basic concepts of the curvelet transform, Section III presents the proposed tracking algorithm, and experimental results and conclusions are given in Sections IV and V, respectively.
II. THE CURVELET TRANSFORM
The curvelet transform is a multiscale representation suited to objects with curves. It was developed by Candès and Donoho in 1999. Curvelets are designed to represent curves with only a small number of coefficients, so the curvelet transform handles curve discontinuities well. The transform involves four stages:
(i) Sub-band decomposition
We define a low-pass filter P_0 and band-pass filters Δ_s, s ≥ 0. The image f is decomposed into subbands using the à trous algorithm [18]:

f ↦ (P_0 f, Δ_1 f, Δ_2 f, …)
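As an illustration, the sub-band decomposition above can be sketched with the à trous (starlet) algorithm: at each scale the image is smoothed with an increasingly "holed" B3-spline kernel, and the subband Δ_s f is the difference between successive smoothings. This is a minimal sketch, not the authors' implementation; the function name and scale count are ours.

```python
import numpy as np
from scipy.ndimage import convolve1d

def atrous_decompose(f, nscales=3):
    """Return (P0 f, [Δ1 f, ..., Δn f]) via the à trous algorithm."""
    base = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # B3-spline taps
    c = np.asarray(f, dtype=float)
    subbands = []
    for s in range(nscales):
        step = 2 ** s
        kernel = np.zeros(4 * step + 1)
        kernel[::step] = base          # insert holes between the taps
        smooth = convolve1d(c, kernel, axis=0, mode='reflect')
        smooth = convolve1d(smooth, kernel, axis=1, mode='reflect')
        subbands.append(c - smooth)    # band-pass subband at this scale
        c = smooth                     # coarser low-pass approximation
    return c, subbands                 # low pass P0 f and the subbands
```

By construction, adding the low-pass residual and all subbands reconstructs f exactly, which is the trivial-reconstruction property mentioned below.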
(ii) Smooth partitioning
Each subband is smoothly windowed into "squares" of the appropriate scale:

h_Q = w_Q · Δ_s f

where w_Q is a nonnegative smooth window localized around a dyadic square

Q = [k_1/2^s, (k_1+1)/2^s] × [k_2/2^s, (k_2+1)/2^s] ∈ Q_s
(iii) Renormalization
Each dyadic square is renormalized to the unit square [0,1] × [0,1]:

g_Q = (T_Q)^{-1} h_Q

where, for each Q, the operator T_Q is defined as

(T_Q f)(x_1, x_2) = 2^s f(2^s x_1 − k_1, 2^s x_2 − k_2)
(iv) Ridgelet analysis
Each square is analyzed in the orthonormal ridgelet system, a system of basis elements ρ_λ forming an orthonormal basis for L²(R²):

α_(Q,λ) = ⟨g_Q, ρ_λ⟩

The curvelet transform is useful for object tracking
due to its following properties:
(1) The curvelet coefficients are directly
calculated in the Fourier space. In the
context of the Ridgelet transform, this
allows avoiding the computation of the 1D
inverse Fourier transform along each radial
line.
(2) Each subband is sampled above the Nyquist rate, hence avoiding aliasing, a phenomenon typically encountered by critically sampled wavelet transforms.
(3) The reconstruction is trivial. The curvelet
coefficients simply need to be co-added to
reconstruct the input signal at any given
point. In our application, this implies that
the ridgelet coefficients simply need to be
co-added to reconstruct Fourier coefficients.
(4) In curvelet domain, the most essential
information in the image is compressed into
relatively few large coefficients, which
coincides with the area of major spatial
activity.
III. THE PROPOSED TRACKING ALGORITHM
The tight frame property of curvelets [19] allows us to shift attention to the coefficient domain: the magnitude and energy of the curvelet coefficients remain approximately invariant when the object translates between frames of a video. The proposed tracking algorithm exploits this property.
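This invariance is easy to verify numerically. The sketch below is illustrative only: it uses a simple difference-of-Gaussians band-pass filter as a stand-in for a curvelet subband, with periodic boundaries so that a circular shift is an exact translation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def band(f):
    # difference-of-Gaussians band-pass, periodic boundaries
    return (gaussian_filter(f, 1, mode='wrap')
            - gaussian_filter(f, 2, mode='wrap'))

def energy(f):
    # energy = sum of squared band-pass coefficients
    return np.sum(band(f) ** 2)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
shifted = np.roll(img, (5, 9), axis=(0, 1))  # translate the "object"

# translation leaves the subband energy unchanged
print(np.isclose(energy(img), energy(shifted)))  # True
```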
The tracking algorithm searches for the object in the next frame around a predicted centroid, computed from the previous four frames. Following Khare and Tiwary [20], we compute the centroid of the object and, from the centroid coordinates of the previous four frames, the distances travelled between them. Using the equations of motion, the velocity is obtained from the first three frames and the acceleration from the fourth; the final velocity follows from the initial velocity and the acceleration. The displacement of the centroid in the next frame is then predicted from the velocity and acceleration, and the equations of motion give the predicted centroid, around which the search is performed. In all computations it is assumed that the frame rate is adequate and that the size of the object does not change between adjacent frames; nevertheless, the proposed algorithm can track an object whose size changes within a range across frames. This computation reduces each object to a single point, and the velocity of the moving object is calculated from its position coordinates. The tracking algorithm requires no parameter other than the curvelet coefficients. The complete algorithm is as follows:
Step 1:
Make a square bounding box covering the object, with centroid at (C_1, C_2), and compute the energy E of the curvelet coefficients inside the box:

E = Σ_{(i,j) ∈ bounding box} |curve_coef_{i,j}|²

where curve_coef_{i,j} is the curvelet coefficient at the (i,j)-th point.

Step 2:
for frame_no = 2 to last do
    compute the curvelet coefficients of the frame, say curve_coef_{i,j}
    search_length = 3
    if frame_no > 4
        predict the centroid (C_1, C_2) of the current frame from the
        centroids of the previous four frames and the basic equations
        of straight-line motion
    endif
    for i = -search_length to +search_length do
        for j = -search_length to +search_length do
            C_new1 = C_1 + i;  C_new2 = C_2 + j
            make a bounding box with centroid (C_new1, C_new2)
            compute the difference between the energy of the curvelet
            coefficients of this box and E, say d_{i,j}
        end
    end
    find the minimum of {d_{i,j}} and its index, say (m, n)
    C_1 = C_1 + m;  C_2 = C_2 + n
    mark the object in the current frame with the bounding box with
    centroid (C_1, C_2), and update E to the energy of this box:

    E = Σ_{(i,j) ∈ bounding box} |curve_coef_{i,j}|²
end
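As an illustration, the per-frame search of Step 2, together with the equations-of-motion centroid prediction, can be sketched as follows. This is a minimal sketch under our own assumptions: `coefs` is a 2-D array of coefficients for the current frame, the box has a fixed odd side length, and the helper names (`box_energy`, `predict_centroid`, `track_frame`) are ours, not the paper's.

```python
import numpy as np

def box_energy(coefs, c, half):
    """Energy of coefficients in a square box centred at c = (row, col)."""
    r, q = c
    patch = coefs[r - half:r + half + 1, q - half:q + half + 1]
    return np.sum(np.abs(patch) ** 2)

def predict_centroid(cents):
    """Predict the next centroid from the last four centroids using the
    equations of straight-line motion (unit time step between frames)."""
    c = np.asarray(cents[-4:], dtype=float)
    v = c[1:] - c[:-1]               # per-frame velocities
    a = v[-1] - v[-2]                # acceleration from the last two
    nxt = c[-1] + v[-1] + 0.5 * a    # s = u*t + a*t^2/2 with t = 1
    return tuple(int(x) for x in np.round(nxt))

def track_frame(coefs, centroid, E_ref, half, search_length=3):
    """Search a (2*search_length+1)^2 neighbourhood of the (predicted)
    centroid for the box whose energy is closest to E_ref."""
    best, best_d = centroid, np.inf
    for i in range(-search_length, search_length + 1):
        for j in range(-search_length, search_length + 1):
            c = (centroid[0] + i, centroid[1] + j)
            d = abs(box_energy(coefs, c, half) - E_ref)
            if d < best_d:
                best, best_d = c, d
    return best, box_energy(coefs, best, half)  # new centroid, updated E
```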
IV. EXPERIMENTS AND RESULTS
In this section we show experimental results for the proposed algorithm. We implemented the tracking method described in Section III and tested it on several video clips.

For tracking, the object area is selected by hand, using the mouse, in the first frame. Once the area is determined, the tracking algorithm tracks the object from frame to frame. A square bounding box is placed over the object with centroid (C_1, C_2), and the energy of the curvelet coefficients of the box is computed as described in Section III. In each subsequent frame, a highpass block is formed around the object from its boundaries [top, bottom, left, right] in the previous frame, stretched by 3 pixels in each dimension. Starting at the top-left corner of this stretched box, the energy of the curvelet coefficients is computed for every sub-box whose dimensions equal those of the object's bounding box.
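The boundary-stretched scan just described can be written compactly with a sliding window. The sketch below is our own vectorized rendering (the function name and signature are assumptions): it stretches the previous box by `s` pixels per side and returns the sub-box whose coefficient energy is closest to the reference energy.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def scan_boxes(coefs, top, bottom, left, right, E_ref, s=3):
    """Scan all sub-boxes of the previous box's size inside the box
    stretched by s pixels per side, minimising |energy - E_ref|.
    Returns the new (top, bottom, left, right) boundaries."""
    h = bottom - top + 1
    w = right - left + 1
    region = np.abs(coefs[top - s:bottom + s + 1,
                          left - s:right + s + 1]) ** 2
    wins = sliding_window_view(region, (h, w))      # all candidate boxes
    d = np.abs(wins.sum(axis=(2, 3)) - E_ref)       # energy differences
    m, n = np.unravel_index(np.argmin(d), d.shape)  # best offset
    return top - s + m, bottom - s + m, left - s + n, right - s + n
```

The sliding-window view avoids recomputing overlapping box sums in Python loops; for the small ±3 search range of the paper, the looped version is equally practical.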
Results for the Child, Ball and Soccer videos are shown in Figs. 1, 2 and 3, respectively, and the corresponding position centroids are given in Tables 1, 2 and 3. The proposed method was applied to several video clips and processes frames at an average speed of 25 frames per second.
TABLE 1: Centroids of Child video

Frame No | Top | Bottom | Left | Right | Centroid
---------|-----|--------|------|-------|---------
10       | 51  | 88     | 69   | 83    | (70,76)
20       | 61  | 98     | 66   | 80    | (80,73)
30       | 61  | 98     | 81   | 95    | (80,88)
40       | 54  | 91     | 99   | 113   | (73,106)
50       | 49  | 86     | 98   | 112   | (68,105)
60       | 40  | 77     | 87   | 101   | (59,94)
70       | 44  | 81     | 67   | 81    | (63,74)
80       | 35  | 72     | 45   | 59    | (54,52)
90       | 38  | 75     | 38   | 52    | (57,45)
100      | 46  | 83     | 32   | 46    | (65,39)
TABLE 2: Centroids of Ball video

Frame No | Top | Bottom | Left | Right | Centroid
---------|-----|--------|------|-------|---------
1        | 57  | 80     | 25   | 49    | (69,37)
2        | 59  | 82     | 27   | 51    | (71,39)
3        | 58  | 81     | 25   | 49    | (70,37)
4        | 61  | 84     | 26   | 50    | (73,38)
5        | 69  | 92     | 18   | 42    | (81,30)
6        | 68  | 91     | 18   | 42    | (80,30)
7        | 68  | 91     | 16   | 40    | (80,28)
8        | 69  | 92     | 16   | 40    | (81,28)
9        | 75  | 98     | 10   | 34    | (87,22)
10       | 75  | 98     | 10   | 34    | (87,22)
Figure 1: Tracking of the child in frames 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100

Figure 2: Tracking of the ball in frames 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10

Figure 3: Tracking of the player in frames 230, 240, 250, 260, 270, 280, 290, 300, 310 and 320
TABLE 3: Centroids of Soccer video

Frame No | Top | Bottom | Left | Right | Centroid
---------|-----|--------|------|-------|---------
230      | 61  | 73     | 23   | 31    | (67,27)
240      | 57  | 69     | 26   | 34    | (63,30)
250      | 58  | 70     | 32   | 40    | (64,36)
260      | 57  | 69     | 39   | 47    | (63,43)
270      | 51  | 63     | 39   | 47    | (57,43)
280      | 57  | 69     | 42   | 50    | (63,46)
290      | 50  | 62     | 46   | 54    | (56,50)
300      | 52  | 64     | 48   | 56    | (58,52)
310      | 46  | 58     | 50   | 58    | (52,54)
320      | 42  | 54     | 47   | 55    | (48,51)
From the above experimental results, it can be
observed that the proposed method performs well.
V. CONCLUSIONS
In this paper, we have developed and demonstrated a new video tracking algorithm that exploits the new tight frames of curvelets, which provide a sparse expansion for typical images with smooth contours. We use the curvelet coefficients to track the object through the sequence of frames; the curvelet transform provides near-ideal sparsity of representation for both smooth objects and objects with edges.
The proposed method performs well; however, if the quality of the video frames is poor (e.g., noise or blur), the estimation ability is reduced. The experimental results show that the algorithm can handle long and challenging sports footage in which human objects move fast and take extreme poses, and it allows a user to track an object in an image or video easily and quickly. Unlike many other methods, the proposed algorithm does not rely on properties of the object such as size or shape, and it is also able to handle occlusion. Although the computations assume an adequate frame rate and an object whose size does not change between adjacent frames, the algorithm can track an object whose size changes within a range across frames.
The experimental results demonstrate that the algorithm can track moving objects in video clips. Although we use a simple algorithm, other methods can also be applied once the object areas are determined; one could even develop an algorithm that combines different tracking methods to achieve more accurate results.
REFERENCES
[1] M. Sonka, V. Hlavac, and R. Boyle. Image Processing,
Analysis and Machine Vision. Thomson Asia Pvt. Ltd.,
Singapore, 2001.
[2] A. Utsumi, H. Mori, J. Ohya and M. Yachida. Multiple-human
tracking using multiple cameras. Proceedings of Third IEEE
International Conference on Automatic Face and Gesture
Recognition, 1998, pp. 498-503.
[3] A. Yilmaz, O. Javed and M. Shah. Object Tracking: A survey,
ACM Computing Surveys, vol. 38, no. 4, 2006.
[4] A. Mansouri, F. T. Azar and A. M. Aznaveh. Face Tracking by
3-D Dual-Tree Complex Wavelet Transform Using Support
Vector Machine. 9th International Symposium on Signal
Processing and Its Applications, 2007, pp. 1-4.
[5] M. Khansari, H. R. Rabiee, M. Asadi, and M. Ghanbari. A
Robust Object Shape Prediction Algorithm in the Presence of
White Gaussian Noise. Proceedings of 12th International
Multi-Media Modeling Conference, 2006, pp. 4.
[6] M. Khansari, H. R. Rabiee, M. Asadi, and M. Ghanbari.
Crowded Scene Object Tracking in Presence of Gaussian
White Noise using Undecimated Wavelet Features. 9th
International Symposium on Signal Processing and Its
Applications, 2007, pp. 1-4.
[7] M. Khansari, H. R. Rabiee, M. Asadi, and M. Ghanbari.
Occlusion Handling for Object Tracking in Crowded Video
Scenes Based on the Undecimated Wavelet Features.
IEEE/ACS International Conference on Computer Systems
and Applications, 2007, pp. 692-699.
[8] E. J. Candès and D. L. Donoho. Ridgelets: A Key to Higher
Dimensional Intermittency. Philosophical Transactions of the
Royal Society A: Mathematical, Physical and Engineering
Sciences, vol. 357, no. 1760, 1999, pp. 2495-2509.
[9] L. Xiao, H. Z. Wu, Z. H. Wei, and Y. Bao. Research and
Applications of a New Computational Model of Human
Vision System Based on Ridgelet Transform. Proceedings of
the Fourth International Conference on Machine Learning
and Cybernetics, vol. 8, 2005, pp. 5170-5175.
[10] N.T. Binh and A. Khare, "Multilevel Threshold Based Image
Denoising in Curvelet Domain", Journal of Computer Science
and Technology, vol. 25, no. 3, May 2010, pp. 633-641.
[11] E. J. Candès and D. L. Donoho. Curvelets - a surprisingly
effective nonadaptive representation for objects with edges.
Curves and Surfaces, L. L. Schumaker et al. (eds), Vanderbilt
University Press, Nashville, TN. Available online:
http://www-stat.stanford.edu/~candes/papers/Curvelet-SMStyle.pdf
[12] E. J. Candès and D. L. Donoho. Continuous Curvelet
Transform: I. Resolution of the Wavefront Set. Appl.
Comput. Harmon. Anal., vol. 19, 2005, pp. 162-197.
[13] E. J. Candès and D. L. Donoho. Continuous Curvelet
Transform: II. Discretization and Frames. Appl. Comput.
Harmon. Anal., vol. 19, 2005, pp. 198-222.
[14] L. Ying, L. Demanet, and E. J. Candès. 3D Discrete Curvelet
Transform. Proc. SPIE Wavelets XI, vol. 5914, no.
591413, 2005.
[15] J. Zhang, Z. Zhang, W. Huang, Y. Lu, and Y. Wang. Face
Recognition Based on Curvefaces. Third International
Conference on Natural Computation, 2007, pp. 627-631.
[16] Y. C. Lee and C. H. Chen. Face Recognition Based on Digital
Curvelet Transform. Eighth International Conference on
Intelligent Systems Design and Applications, 2008, pp. 341-
345.
[17] T. Mandal, Q. M. J. Wu, and Y. Yuan. Curvelet based face
recognition via dimension reduction. Signal Processing,
2009, in press, doi:10.1016/j.sigpro.2009.03.007.
[18] J. L. Starck, E.J. Candes, and D. L. Donoho. The Curvelet
Transform for Image Denoising. IEEE Trans. on Image
Processing, vol. 11, no. 6, 2002, pp. 670-684.
[19] E. J. Candès and D. L. Donoho. New Tight Frames of
Curvelets and Optimal Representations of Objects with C²
Singularities. Communications on Pure and Applied
Mathematics, vol. 57, no. 2, 2004, pp. 219-266.
[20] A. Khare and U. S. Tiwary. Daubechies Complex Wavelet
Transform Based Moving Object Tracking. IEEE
Symposium on Computational Intelligence in Image and
Signal Processing, 2007, pp. 36-40.