Penalized Gaussian Mixture Probability Hypothesis Density Tracker with Multi-Feature Fusion

Xiaolong Zhou, Yazhe Tang, Jianyu Yang, Zhen Xie, and Shengyong Chen, Senior Member, IEEE

This work was supported in part by the National Natural Science Foundation of China (61403342, 61273286, 61325019, 61173096 and 11302195) and the International Science & Technology Cooperation Program of China (S2014GAT030). Xiaolong Zhou, Zhen Xie, and Shengyong Chen are with the College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China (e-mail: [email protected], [email protected]). Yazhe Tang is with the Department of Precision Mechanical Engineering, Shanghai University, Shanghai, China (e-mail: [email protected]). Jianyu Yang is with the School of Urban Rail Transportation, Soochow University, Suzhou, China (e-mail: [email protected]).


Abstract— This paper presents a penalized Gaussian mixture probability hypothesis density tracker with multi-feature fusion to track close moving targets in video. A weight matrix that contains all updated weights between the predicted target states and the measurements is first constructed. The ambiguous weights in the constructed weight matrix are then determined according to the total weight and the predicted target states. Multiple features, namely spatial-color appearance, histogram of oriented gradient, and target area, are fused to further penalize the ambiguous weights. Experimental results on both synthetic and real videos validate the effectiveness of the proposed tracker.

I. INTRODUCTION

Recently, the random finite set approach [1] for multi-target tracking has received considerable attention [2-10]. The probability hypothesis density (PHD) filter [2], which propagates the first-order statistical moment of the multi-target posterior density, provides a computationally tractable alternative to the full multi-target Bayes filter. However, the PHD recursion itself generally has no closed-form solution, and its numerical integration suffers from the "curse of dimensionality". The Gaussian mixture PHD (GM-PHD) filter [3] does not suffer from this problem because its posterior intensity function is represented by a sum of weighted Gaussian components that can be propagated analytically in time; it is a closed-form solution to the PHD filter recursion.

Although the GM-PHD filter originates from radar tracking [3], it has recently been successfully applied to visual tracking [4-7]. For simplicity, the GM-PHD filter-based tracker is called the GM-PHD tracker in this paper. In multi-target tracking it is generally assumed that each measurement corresponds to one target and vice versa. This so-called one-to-one assumption means that a target can be associated with at most one measurement. In the GM-PHD tracker, however, this assumption is violated whenever multiple measurements are close to one target; that is, the performance of the GM-PHD tracker may degrade when targets come near each other. To remedy this, Yazdian-Dehkordi et al. propose a competitive GM-PHD (CGM-PHD) tracker [8] and a penalized GM-PHD (PGM-PHD) tracker [9] that refine the weights of close moving targets in the update step of the GM-PHD filter. However, these trackers do not provide continuous trajectories for the targets. Considering this point, Wang et al. [10] propose a collaborative penalized GM-PHD (CPGM-PHD) tracker, which utilizes the track label of each Gaussian component to collaboratively penalize the weights of close moving targets with the same identity. The above-mentioned trackers, however, are only suitable for point target tracking and may fail in video target tracking. Compared with the simple point representations of the target state and the measurement in point target tracking, the representations in video target tracking are more complicated: both the location and the size of a video target are considered when modeling the target state and the measurement. As video targets move close to each other, the above trackers (GM-PHD, CGM-PHD, PGM-PHD, and CPGM-PHD) may track multiple targets with one same identity or with switched identities.

In this paper, a new penalized GM-PHD tracker with multi-feature fusion (the PGM-PHD-MF tracker) is proposed to solve the above problems in close moving target tracking. The contributions of this paper are as follows.

1) A PGM-PHD-MF tracker is proposed to effectively track multiple moving targets in video, and in particular to robustly track targets that move close to each other.

2) A weight matrix of all the updated weights is constructed. The total weight and the predicted target states are used to determine the ambiguous weights in the matrix.

3) Multiple features, namely spatial-color appearance, histogram of oriented gradient, and target area, are incorporated into the tracker to penalize the ambiguous weights. By doing so, the weights of mis-matched targets can be greatly reduced, thus improving the tracking accuracy.

The rest of this paper is organized as follows. Section II briefly reviews the GM-PHD filter and its drawbacks. Section III presents the PGM-PHD-MF tracker in detail. Experimental results on synthetic and real videos are discussed in Section IV, followed by concluding remarks in Section V.


II. PROBLEM FORMULATION

The kinematic state of a target $i$ at time $t$ is denoted as $\mathbf{x}_t^i = \{\mathbf{l}_t^i, \mathbf{v}_t^i, \mathbf{s}_t^i\}$, where $\mathbf{l}_t^i = \{l_{x,t}^i, l_{y,t}^i\}$, $\mathbf{v}_t^i = \{v_{x,t}^i, v_{y,t}^i\}$, and $\mathbf{s}_t^i = \{w_t^i, h_t^i\}$ denote the location, velocity, and bounding box size of the target, respectively; $i = 1, \ldots, N_t$, where $N_t$ is the number of targets at time $t$. Similarly, the measurement originating from a target $j$ at time $t$ is denoted as $\mathbf{z}_t^j = \{\mathbf{l}_{z,t}^j, \mathbf{s}_{z,t}^j\}$, with $j = 1, \ldots, N_{m,t}$, where $N_{m,t}$ is the number of measurements at time $t$. The target state set and the measurement set are denoted as $\mathbf{X}_t = \{\mathbf{x}_t^1, \ldots, \mathbf{x}_t^{N_t}\}$ and $\mathbf{Z}_t = \{\mathbf{z}_t^1, \ldots, \mathbf{z}_t^{N_{m,t}}\}$, respectively.
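To fix ideas, here is a minimal sketch of how this state and measurement parameterization might be laid out in code; the class names and layout are our assumptions, not part of the paper.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class TargetState:
    """x_t^i = {l, v, s}: location, velocity, and bounding-box size."""
    l: np.ndarray  # (l_x, l_y)
    v: np.ndarray  # (v_x, v_y)
    s: np.ndarray  # (w, h)

    def as_vector(self) -> np.ndarray:
        # Stacked 6-d vector consumed by the linear models of Section IV.
        return np.concatenate([self.l, self.v, self.s])

@dataclass
class Measurement:
    """z_t^j = {l_z, s_z}: observed location and bounding-box size."""
    l_z: np.ndarray  # (l_x, l_y)
    s_z: np.ndarray  # (w, h)

    def as_vector(self) -> np.ndarray:
        return np.concatenate([self.l_z, self.s_z])
```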

A. The GM-PHD Filter

According to [7], the GM-PHD filter for multiple video target tracking is implemented as follows.

Prediction: Suppose the prior intensity $D_{t-1}(\mathbf{x}_{t-1})$ has the form $D_{t-1}(\mathbf{x}_{t-1}) = \sum_{i=1}^{J_{t-1}} \omega_{t-1}^{(i)} \mathcal{N}(\mathbf{x}_{t-1}; \mathbf{m}_{t-1}^{(i)}, \mathbf{P}_{t-1}^{(i)})$. Then the predicted intensity $D_{t|t-1}(\mathbf{x}_t)$ is given by

$$D_{t|t-1}(\mathbf{x}_t) = \gamma_t(\mathbf{x}_t) + p_{sv} \sum_{i=1}^{J_{t-1}} \omega_{t-1}^{(i)} \mathcal{N}(\mathbf{x}_t; \mathbf{m}_{sv,t|t-1}^{(i)}, \mathbf{P}_{sv,t|t-1}^{(i)}) \quad (1)$$

where $\mathcal{N}(\cdot\,; \mathbf{m}, \mathbf{P})$ denotes a Gaussian component with mean $\mathbf{m}$ and covariance $\mathbf{P}$; $J_{t-1}$ and $\omega_{t-1}^{(i)}$ are the number and weights of the Gaussian components, respectively; $\mathbf{x}_t$ is an element of $\mathbf{X}_t$; $p_{sv}$ is the survival probability of the surviving targets; and $\gamma_t(\mathbf{x}_t)$ is the birth intensity of newborn targets, which is estimated by Zhou's method [6] in this paper.

Update: $D_{t|t-1}(\mathbf{x}_t)$ can be expressed as a Gaussian mixture $D_{t|t-1}(\mathbf{x}_t) = \sum_{i=1}^{J_{t|t-1}} \omega_{t|t-1}^{(i)} \mathcal{N}(\mathbf{x}_t; \mathbf{m}_{t|t-1}^{(i)}, \mathbf{P}_{t|t-1}^{(i)})$. Then the posterior intensity $D_t(\mathbf{x}_t)$ is given by

$$D_t(\mathbf{x}_t) = (1 - p_d) D_{t|t-1}(\mathbf{x}_t) + \sum_{\mathbf{z}_t \in \mathbf{Z}_t} D_{g,t}(\mathbf{x}_t; \mathbf{z}_t) \quad (2)$$

$$D_{g,t}(\mathbf{x}_t; \mathbf{z}_t) = \sum_{i=1}^{J_{t|t-1}} \omega_{g,t}^{(i)}(\mathbf{z}_t)\, \mathcal{N}(\mathbf{x}_t; \mathbf{m}_{g,t}^{(i)}(\mathbf{z}_t), \mathbf{P}_{g,t}^{(i)}(\mathbf{z}_t)) \quad (3)$$

$$\omega_{g,t}^{(i)}(\mathbf{z}_t) = \frac{p_d\, \omega_{t|t-1}^{(i)} \mathcal{N}(\mathbf{z}_t; \mathbf{m}_{h,t}^{(i)}, \mathbf{P}_{h,t}^{(i)})}{\lambda_t c_t(\mathbf{z}_t) + p_d \sum_{i=1}^{J_{t|t-1}} \omega_{t|t-1}^{(i)} \mathcal{N}(\mathbf{z}_t; \mathbf{m}_{h,t}^{(i)}, \mathbf{P}_{h,t}^{(i)})} \quad (4)$$

where $\mathbf{m}_{sv,t|t-1}^{(i)} = \mathbf{F}_{t-1} \mathbf{m}_{t-1}^{(i)}$, $\mathbf{P}_{sv,t|t-1}^{(i)} = \mathbf{Q}_{t-1} + \mathbf{F}_{t-1} \mathbf{P}_{t-1}^{(i)} \mathbf{F}_{t-1}^T$, $\mathbf{m}_{g,t}^{(i)}(\mathbf{z}_t) = \mathbf{m}_{t|t-1}^{(i)} + K_t (\mathbf{z}_t - \mathbf{H}_t \mathbf{m}_{t|t-1}^{(i)})$, $\mathbf{P}_{g,t}^{(i)}(\mathbf{z}_t) = (\mathbf{I} - K_t \mathbf{H}_t) \mathbf{P}_{t|t-1}^{(i)}$, $K_t = \mathbf{P}_{t|t-1}^{(i)} \mathbf{H}_t^T (\mathbf{H}_t \mathbf{P}_{t|t-1}^{(i)} \mathbf{H}_t^T + \mathbf{R}_t)^{-1}$, $\mathbf{m}_{h,t}^{(i)} = \mathbf{H}_t \mathbf{m}_{t|t-1}^{(i)}$, and $\mathbf{P}_{h,t}^{(i)} = \mathbf{R}_t + \mathbf{H}_t \mathbf{P}_{t|t-1}^{(i)} \mathbf{H}_t^T$. $p_d$ is the detection probability and $\mathbf{z}_t$ is an element of $\mathbf{Z}_t$. $\lambda_t$ is the average rate of the Poisson-distributed clutter, and $c_t(\mathbf{z}_t)$ is the probability density of the spatial distribution of the clutter.
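For concreteness, the following NumPy sketch implements one prediction/update cycle of Eqs. (1)-(4) under stated simplifications: the birth intensity $\gamma_t$ and the pruning/merging steps are omitted, and the function and parameter names are ours rather than the paper's.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gm_phd_step(w, m, P, Z, F, Q, H, R, p_sv=0.95, p_d=0.99, lam=0.01, c_z=1.0):
    """One GM-PHD prediction/update per Eqs. (1)-(4).
    w: (J,) weights, m: (J, d) means, P: (J, d, d) covariances,
    Z: iterable of measurement vectors. The birth term gamma_t of Eq. (1)
    is omitted here; the paper estimates it with Zhou's method [6]."""
    J = len(w)
    # --- Prediction (Eq. 1): survival-scaled propagation of each component.
    w_p = p_sv * w
    m_p = m @ F.T
    P_p = np.stack([Q + F @ P[i] @ F.T for i in range(J)])
    # Measurement-space moments: m_h = H m, P_h = R + H P H^T.
    m_h = m_p @ H.T
    P_h = np.stack([R + H @ P_p[i] @ H.T for i in range(J)])
    # --- Update (Eq. 2): missed-detection term plus one term per measurement.
    out_w, out_m, out_P = [(1 - p_d) * w_p], [m_p], [P_p]
    for z in Z:
        q = np.array([multivariate_normal.pdf(z, m_h[i], P_h[i]) for i in range(J)])
        num = p_d * w_p * q
        out_w.append(num / (lam * c_z + num.sum()))                   # Eq. (4)
        K = [P_p[i] @ H.T @ np.linalg.inv(P_h[i]) for i in range(J)]  # Kalman gains
        out_m.append(np.stack([m_p[i] + K[i] @ (z - m_h[i]) for i in range(J)]))
        out_P.append(np.stack([(np.eye(m.shape[1]) - K[i] @ H) @ P_p[i]
                               for i in range(J)]))
    return np.concatenate(out_w), np.concatenate(out_m), np.concatenate(out_P)
```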

To prune components that are irrelevant to the target intensity and to merge components that share the same intensity peak, the pruning and merging algorithms proposed by Vo and Ma [3] are used. The peaks of the intensity are the points of highest local concentration of the expected number of targets $N_t$. The multi-target state estimate is the set of the means of the $N_t$ Gaussian components with the largest weights.

B. Drawbacks of the GM-PHD Filter

As targets come near each other, multiple measurements may be associated with one target or with incorrect targets, and the GM-PHD filter may fail to track the targets with correct identities. Fig. 1 is a pictorial example of two close moving targets. Normally, each predicted state $\mathbf{x}_{t|t-1}^i$ of target $i$ is associated with only one measurement $\mathbf{z}_t^j$ originating from target $i$, which means that the weight of the $i$th predicted target updated by the $j$th measurement should be far greater than the weights updated by the other measurements. However, in real-world scenarios two cases can violate this one-to-one association.

Case 1: one predicted target ($\mathbf{x}_{t|t-1}^1$ in Fig. 1(a)) may be associated with more than one measurement ($\mathbf{z}_t^1$ and $\mathbf{z}_t^2$ in Fig. 1(a)). In such a case, there exist at least two updated weights for the same target ($\omega_t^{(1,1)}$ and $\omega_t^{(1,2)}$ in Fig. 1(a)) whose values are far greater than the other updated weights. $\omega_t^{(i,j)}$ is the normalized weight of target $i$ updated by measurement $j$. For simplicity, the indices $i$ and $j$ are used to represent the $i$th predicted target state $\mathbf{x}_{t|t-1}^i$ and the $j$th measurement $\mathbf{z}_t^j$, respectively. As a result, the GM-PHD filter tracks the multiple targets with one same identity ($\mathbf{x}_t^1$ in Fig. 1(a)).

Fig. 1. A pictorial example of two close moving targets. (a) Case 1: two targets with one same identity. (b) Case 2: two targets with switched identities.

Case 2: one predicted target may be associated with another measurement that did not originate from this target. As shown in Fig. 1(b), measurement 1 should theoretically be associated with target 1, while measurement 2 should be associated with target 2. However, $\omega_t^{(1,2)}$ is actually greater than $\omega_t^{(1,1)}$, and $\omega_t^{(2,1)}$ is actually greater than $\omega_t^{(2,2)}$. As a result, the GM-PHD filter tracks these two targets with switched identities ($\mathbf{x}_t^1$ and $\mathbf{x}_t^2$ in Fig. 1(b)).

III. THE PGM-PHD-MF TRACKER

To remedy the above drawbacks of the GM-PHD filter, a new PGM-PHD-MF tracker is proposed. First, a weight matrix that consists of all updated weights is constructed. Then, the ambiguous weights are defined and methods for finding them are proposed. Finally, multiple features are incorporated into the tracker to penalize the ambiguous weights.

A. Weight Matrix Construction

Fig. 2 is a symbolic representation of the updated weights. For clarity, the matrix in Fig. 2 that gathers the weights of all targets updated by all measurements is called the weight matrix. In the weight matrix, the $i$th row contains the weights of the $i$th predicted target updated by all measurements, while the $j$th column contains the weights of all predicted targets updated by the $j$th measurement. $W_j^i = \sum_{j=1}^{N_{m,t}} \omega_t^{(i,j)}$ is the total weight of the $i$th row, while $W_i^j = \sum_{i=1}^{J_{t|t-1}} \omega_t^{(i,j)}$ is the total weight of the $j$th column. $N_{m,t}$ and $J_{t|t-1}$ are the number of measurements and the number of predicted target states, respectively.

$$\omega_t^{(i,j)} = \frac{p_d\, \omega_{t|t-1}^{(i)} \mathcal{N}(\mathbf{z}_t^j; \mathbf{m}_{h,t}^{(i)}, \mathbf{P}_{h,t}^{(i)})}{\lambda_t c_t(\mathbf{z}_t^j) + p_d \sum_{i=1}^{J_{t|t-1}} \omega_{t|t-1}^{(i)} \mathcal{N}(\mathbf{z}_t^j; \mathbf{m}_{h,t}^{(i)}, \mathbf{P}_{h,t}^{(i)})} \quad (5)$$
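As a concrete illustration, the following sketch assembles the weight matrix of Eq. (5) together with its row and column totals; `weight_matrix` and its argument names are ours, with `m_h` and `P_h` the measurement-space moments defined in Section II.A.

```python
import numpy as np
from scipy.stats import multivariate_normal

def weight_matrix(w_pred, m_h, P_h, Z, p_d=0.99, lam=0.01, c_z=1.0):
    """Build the J x N_m weight matrix of Eq. (5): entry (i, j) is the
    normalized weight of predicted target i updated by measurement j."""
    J, Nm = len(w_pred), len(Z)
    W = np.zeros((J, Nm))
    for j, z in enumerate(Z):
        lik = np.array([multivariate_normal.pdf(z, m_h[i], P_h[i]) for i in range(J)])
        num = p_d * w_pred * lik
        W[:, j] = num / (lam * c_z + num.sum())  # per-column normalization of Eq. (5)
    row_totals = W.sum(axis=1)  # W_j^i: total weight of each row
    col_totals = W.sum(axis=0)  # W_i^j: total weight of each column
    return W, row_totals, col_totals
```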

B. Ambiguous Weights Determination

In this paper, the weights of close moving targets are defined as the ambiguous weights. Before penalization, the weight matrix is first analyzed to determine the ambiguous weights. In the CGM-PHD tracker [8], the PGM-PHD tracker [9], and the CPGM-PHD tracker [10], the weight of target $i$ is declared ambiguous once the total weight $W_j^i$ of the $i$th row is greater than one. However, this method is not applicable to Case 2 (Fig. 1(b)) because there the total weight $W_j^i$ may be less than one. In this paper, the total weight $W_j^i$ and the predicted target states are utilized to determine the ambiguous weights of Case 1 and Case 2, respectively.

1) Ambiguous weights determination for Case 1. For a given weight matrix, if the total weight $W_j^i$ of the $i$th row satisfies

$$W_j^i > 1 \quad (6)$$

this weight matrix is declared an ambiguous weight matrix.

After the ambiguous weight matrix has been determined, the next step is to determine the ambiguous weights within it. First, the expected number of targets $N_t$ is calculated according to the method described in Section II.A. Then, the $N_t$ largest weights in the ambiguous weight matrix are selected as ambiguous candidates. Finally, if more than one candidate lies in the same row of the matrix, these candidates are determined as ambiguous weights. In other words, if more than one candidate has the same row index, the corresponding weights $\omega_t^{(i,j)}$ and $\omega_t^{(i,j')}$, with $j' \neq j$ and $j' \in \{1, 2, \ldots, N_{m,t}\}$, are determined as ambiguous weights. The related measurements $j$ and $j'$ are determined as ambiguous measurements, which are probably associated with the same target $i$. Consequently, the ambiguous weights $\omega_t^{(i,j)}$ and $\omega_t^{(i,j')}$ should be penalized. For example, the weights $\omega_t^{(1,1)}$ and $\omega_t^{(1,2)}$ in Fig. 1(a) are determined as ambiguous weights according to the proposed method.
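A minimal sketch of this Case 1 search, assuming the row-total test of Eq. (6) has already flagged the matrix as ambiguous; the function name and return convention are ours.

```python
import numpy as np

def ambiguous_case1(W, N_t):
    """Case 1 determination: take the N_t largest entries of the weight
    matrix W as candidates; candidates sharing a row index mark ambiguous
    weights (one target claimed by several measurements)."""
    flat = np.argsort(W, axis=None)[::-1][:N_t]  # indices of the N_t largest weights
    rows, cols = np.unravel_index(flat, W.shape)
    ambiguous = []
    for r in np.unique(rows):
        js = cols[rows == r]
        if len(js) > 1:                           # more than one candidate in row r
            ambiguous.extend((r, j) for j in js)  # (target i, measurement j) pairs
    return ambiguous
```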

2) Ambiguous weights determination for Case 2. To determine the ambiguous weights for Case 2, the close moving targets must be defined first. In this paper, targets $i$ and $i'$ are regarded as two close moving targets when

$$\| \mathbf{l}_{t|t-1}^i - \mathbf{l}_{t|t-1}^{i'} \| < \| \mathbf{s}_{t|t-1}^i + \mathbf{s}_{t|t-1}^{i'} \| \quad (7)$$

where $\mathbf{l}_{t|t-1}^i$ (or $\mathbf{l}_{t|t-1}^{i'}$) and $\mathbf{s}_{t|t-1}^i$ (or $\mathbf{s}_{t|t-1}^{i'}$) are the location and size of the predicted state $\mathbf{x}_{t|t-1}^i$ (or $\mathbf{x}_{t|t-1}^{i'}$) of target $i$ (or target $i'$), respectively, and $\| \cdot \|$ is the Euclidean norm (hereinafter the same).

Then, the ambiguous weights of Case 2 can be determined from the close moving targets. For two close moving targets $i$ and $i'$, if more than one measurement satisfies the condition

$$\| \mathbf{l}_{z,t}^j - \mathbf{l}_{t|t-1}^i \| < \| \mathbf{l}_{t|t-1}^i - \mathbf{l}_{t|t-1}^{i'} \| \quad (8)$$

the corresponding weights $\omega_t^{(i,j)}$ are regarded as ambiguous weights, where $\mathbf{l}_{z,t}^j$ is the location of the $j$th measurement $\mathbf{z}_t^j$.
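The Case 2 test of Eqs. (7)-(8) could be sketched as follows; the function name and array layouts are our assumptions.

```python
import numpy as np

def ambiguous_case2(l_pred, s_pred, l_meas):
    """Case 2 determination per Eqs. (7)-(8): for each close pair of predicted
    targets, flag measurements lying closer to target i than the other target is.
    l_pred: (J, 2) predicted locations, s_pred: (J, 2) predicted sizes,
    l_meas: (N_m, 2) measurement locations."""
    J = len(l_pred)
    ambiguous = []
    for i in range(J):
        for i2 in range(i + 1, J):
            gap = np.linalg.norm(l_pred[i] - l_pred[i2])
            if gap < np.linalg.norm(s_pred[i] + s_pred[i2]):      # Eq. (7): close pair
                near = [j for j, lz in enumerate(l_meas)
                        if np.linalg.norm(lz - l_pred[i]) < gap]  # Eq. (8)
                if len(near) > 1:                                  # several candidates
                    ambiguous.extend((i, j) for j in near)
    return ambiguous
```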

After the ambiguous weights between measurement $j$ and target $i$ have been determined, multiple features, including spatial-color appearance, histogram of oriented gradient, and target area, are fused to penalize the ambiguous weights.

Fig. 2. The weight matrix: a symbolic representation of the updated weights.

C. Weights Penalization with Multi-Feature Fusion

1) Spatial-color appearance. Color histogram based appearance models [11-13] are popular in visual tracking because of their effectiveness and efficiency in capturing the distribution characteristics of visual features inside target regions. In this section, a spatially constrained color histogram appearance model is presented. Similar to [11], the appearance of a target $i$ is modeled as a Gaussian mixture $q^i = (\omega_k^i, \mu_k^i, \Sigma_k^i)$, $k = 1, \ldots, K$, representing the color distribution of the target's pixels, where $K$ is the number of Gaussian components and $(\omega_k^i, \mu_k^i, \Sigma_k^i)$ are the weight, mean, and covariance matrix of the $k$th component of the mixture. The similarity $P_s(i, j)$ between measurement $j$ and target $i$ is defined by

$$P_s(i, j) = \exp\left\{ \frac{1}{N_j} \sum_{\mathbf{l}_j \in \Omega_j} \log \left\{ \sum_{k=1}^{K} \omega_k^i\, \mathcal{N}(\mathbf{c}_{\mathbf{l}_j}; \mu_k^i, \Sigma_k^i) \right\} \right\} \quad (9)$$

$$\mathcal{N}(\mathbf{c}; \mu, \Sigma) = |2\pi\Sigma|^{-1/2} \exp\left\{ -\tfrac{1}{2} (\mathbf{c} - \mu)' \Sigma^{-1} (\mathbf{c} - \mu) \right\} \quad (10)$$

where $\mathbf{c}_{\mathbf{l}_j} = (r_{\mathbf{l}_j}, g_{\mathbf{l}_j}, I_{\mathbf{l}_j})$ is the color of the pixel located at $\mathbf{l}_j$ within the support region $\Omega_j$ of measurement $j$, with $r_{\mathbf{l}_j} = R_{\mathbf{l}_j} / (R_{\mathbf{l}_j} + G_{\mathbf{l}_j} + B_{\mathbf{l}_j})$, $g_{\mathbf{l}_j} = G_{\mathbf{l}_j} / (R_{\mathbf{l}_j} + G_{\mathbf{l}_j} + B_{\mathbf{l}_j})$, and $I_{\mathbf{l}_j} = (R_{\mathbf{l}_j} + G_{\mathbf{l}_j} + B_{\mathbf{l}_j}) / 3$. $N_j$ is the number of foreground pixels in $\Omega_j$.

The above appearance model is robust only when targets have different color distributions; it may fail when targets have similar color distributions. To remedy this, a Gaussian spatial constraint is applied according to [12], and the similarity measure is improved to

$$P_s(i, j) = \exp\left\{ \frac{1}{N_j} \sum_{\mathbf{l}_j \in \Omega_j} \log \left\{ \mathcal{N}(\mathbf{l}_j; \mathbf{l}_t^i, \Sigma_t^i) \sum_{k=1}^{K} \omega_k^i\, \mathcal{N}(\mathbf{c}_{\mathbf{l}_j}; \mu_k^i, \Sigma_k^i) \right\} \right\} \quad (11)$$

where $\Sigma_t^i = [(w_t^i/2)^2, 0; 0, (h_t^i/2)^2]$; $\mathbf{l}_t^i = \{l_{x,t}^i, l_{y,t}^i\}$ and $\{w_t^i, h_t^i\}$ are the location and bounding box size of target $i$ at time $t$, respectively; and $\mathcal{N}(\mathbf{l}_j; \mathbf{l}_t^i, \Sigma_t^i)$ is the Gaussian spatial constraint on the locations of the foreground pixels.
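A sketch of Eq. (11), assuming the target's (r, g, I) color mixture has already been fitted (e.g., by EM); the function name and the small numerical guards are ours.

```python
import numpy as np
from scipy.stats import multivariate_normal

def spatial_color_similarity(pixels_xy, pixels_rgb, gmm, center, size):
    """Spatially constrained color similarity of Eq. (11).
    pixels_xy: (N, 2) foreground pixel locations of the measurement region,
    pixels_rgb: (N, 3) raw RGB values, gmm: list of (weight, mean, cov) tuples
    modeling the target's (r, g, I) color distribution, center/size: the
    target's bounding-box center and (w, h)."""
    rgb = pixels_rgb.astype(float)
    s = rgb.sum(axis=1) + 1e-9
    c = np.stack([rgb[:, 0] / s, rgb[:, 1] / s, s / 3.0], axis=1)  # (r, g, I)
    # Color term: GMM likelihood of every pixel color, Eqs. (9)-(10).
    color = sum(w * multivariate_normal.pdf(c, mu, cov) for w, mu, cov in gmm)
    # Spatial term: Gaussian constraint with Sigma = diag((w/2)^2, (h/2)^2).
    spatial_cov = np.diag((np.asarray(size) / 2.0) ** 2)
    spatial = multivariate_normal.pdf(pixels_xy, center, spatial_cov)
    return np.exp(np.mean(np.log(spatial * color + 1e-12)))       # Eq. (11)
```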

2) Histogram of oriented gradient [14]. The gradient magnitude $G(x, y)$ and orientation $O(x, y)$ of each pixel in the target region are calculated by

$$G(x, y) = \sqrt{[I(x+1, y) - I(x-1, y)]^2 + [I(x, y+1) - I(x, y-1)]^2} \quad (12)$$

$$O(x, y) = \arctan \frac{I(x, y+1) - I(x, y-1)}{I(x+1, y) - I(x-1, y)} \quad (13)$$

where $I(x, y)$ is the intensity of pixel $(x, y)$ in the image. The weighted oriented gradient histogram $q_h^i(u)$ of target $i$ is formed by dividing the orientation into 36 bins (10 degrees per bin):

$$q_h^i(u) = C_i \sum_{r=1}^{n_i} k(\| \mathbf{l}_r^i / h \|^2)\, G(\mathbf{l}_r^i)\, \delta[b(\mathbf{l}_r^i) - u] \quad (14)$$

where $u = 1, 2, \ldots, 36$; $C_i = 1 / \sum_{r=1}^{n_i} k(\| \mathbf{l}_r^i \|^2)$ is a normalization factor; $n_i$ is the number of pixels in target $i$'s region; $k(\cdot)$ is an isotropic kernel profile; $\mathbf{l}_r^i$ is the location of pixel $r$; $h$ is the bandwidth; $\delta$ is the Kronecker delta function; and $b(\mathbf{l}_r^i)$ associates pixel $r$ with a histogram bin. The oriented gradient histogram likelihood between measurement $j$ and target $i$ is defined by

$$P_h(i, j) = \frac{1}{\sqrt{2\pi}\,\sigma_h} \exp\left\{ -\frac{d^2[q_h^i(u), q_h^j(u)]}{2\sigma_h^2} \right\} \quad (15)$$

$$d^2[q_h^i(u), q_h^j(u)] = 1 - \rho[q_h^i(u), q_h^j(u)] \quad (16)$$

$$\rho[q_h^i(u), q_h^j(u)] = \sum_{u=1}^{36} \sqrt{q_h^i(u)\, q_h^j(u)} \quad (17)$$

where $\sigma_h$ is the Gaussian variance, set to 0.3 in the experiments.
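A sketch of Eqs. (14)-(17) under two stated simplifications of ours: the kernel profile $k$ of Eq. (14) is dropped, and orientations are assumed to span the full circle so that 36 bins cover 10 degrees each.

```python
import numpy as np

def orientation_histogram(G, O, bins=36):
    """Gradient-weighted orientation histogram in the spirit of Eq. (14);
    G, O: per-pixel gradient magnitude and orientation (radians) inside
    the target region. Returns a unit-sum histogram."""
    idx = ((O % (2 * np.pi)) / (2 * np.pi) * bins).astype(int) % bins
    q = np.bincount(idx.ravel(), weights=G.ravel(), minlength=bins)
    return q / (q.sum() + 1e-12)

def hog_likelihood(q_i, q_j, sigma_h=0.3):
    """Histogram likelihood of Eqs. (15)-(17) between two unit-sum histograms."""
    rho = np.sum(np.sqrt(q_i * q_j))  # Bhattacharyya coefficient, Eq. (17)
    d2 = 1.0 - rho                    # squared distance, Eq. (16)
    return np.exp(-d2 / (2 * sigma_h ** 2)) / (np.sqrt(2 * np.pi) * sigma_h)  # Eq. (15)
```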

3) Target area. The degree of change between the areas of target $i$ and measurement $j$ is defined by

$$P_a(i, j) = \min\{S_i, S_j\} / \max\{S_i, S_j\} \quad (18)$$

where $S_i$ and $S_j$ are the areas of target $i$ and measurement $j$, respectively. It is reasonable to state that the larger $P_a(i, j)$ is, the more likely measurement $j$ is generated from target $i$, because the size of the same target changes only slightly between two consecutive frames.

4) Weights penalization with multi-feature fusion. In this paper, the above features are fused to robustly penalize the ambiguous weight between measurement $j$ and target $i$:

$$\omega_t^{(i,j)} = \omega_t^{(i,j)} \cdot P_s(i, j) \cdot P_h(i, j) \cdot P_a(i, j) \quad (19)$$

After all the ambiguous weights have been penalized, all the weights in the $j$th column of the weight matrix are further normalized by

$$\omega_t^{(i,j)} = \omega_t^{(i,j)} / W_i^j \quad (20)$$
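Putting Eqs. (19)-(20) together, a sketch of the penalization step; the similarity arrays `P_s`, `P_h`, and `P_a` are assumed precomputed for the flagged (i, j) pairs, and the function name is ours.

```python
import numpy as np

def penalize_and_renormalize(W, ambiguous, P_s, P_h, P_a):
    """Eqs. (19)-(20): scale each ambiguous weight by the fused feature
    similarities, then renormalize every affected column of the weight matrix.
    ambiguous: iterable of (i, j) pairs; P_s, P_h, P_a: (J, N_m) similarity
    arrays (hypothetical precomputed lookups)."""
    W = W.copy()
    touched = set()
    for i, j in ambiguous:
        W[i, j] *= P_s[i, j] * P_h[i, j] * P_a[i, j]  # Eq. (19)
        touched.add(j)
    for j in touched:
        W[:, j] /= W[:, j].sum() + 1e-12              # Eq. (20): column renormalization
    return W
```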

As targets move too close to each other, mutual occlusion occurs. To track targets in mutual occlusion, a game-theoretic occlusion handling method [15] is used. An n-person, non-zero-sum, non-cooperative game is constructed, in which the measurements originating from the targets in occlusion are regarded as players competing for maximum utilities using certain strategies. The location of a measurement is the strategy of the player, and the similarity between the target before occlusion and the measurement after occlusion is the corresponding utility. The Nash equilibrium of the game is selected as the optimal estimate of the players' locations.

IV. EXPERIMENTAL RESULTS

The state transition model of the proposed tracker is a constant velocity model with $\mathbf{F}_t = [\mathbf{I}_2, T\mathbf{I}_2, \mathbf{0}_2; \mathbf{0}_2, \mathbf{I}_2, \mathbf{0}_2; \mathbf{0}_2, \mathbf{0}_2, \mathbf{I}_2]$ and $\mathbf{Q}_t = \sigma_v^2 [T^4/4\, \mathbf{I}_2, T^3/2\, \mathbf{I}_2, \mathbf{0}_2; T^3/2\, \mathbf{I}_2, T^2 \mathbf{I}_2, \mathbf{0}_2; \mathbf{0}_2, \mathbf{0}_2, T^2 \mathbf{I}_2]$, where $\mathbf{0}_n$ and $\mathbf{I}_n$ are the $n \times n$ zero and identity matrices, $T = 1$ frame is the interval between two consecutive time steps, and $\sigma_v = 3$ is the standard deviation of the state noise. The measurements follow the measurement likelihood with $\mathbf{H}_t = [\mathbf{I}_2, \mathbf{0}_2, \mathbf{0}_2; \mathbf{0}_2, \mathbf{0}_2, \mathbf{I}_2]$ and $\mathbf{R}_t = \sigma_w^2 \mathbf{I}_4$, where $\sigma_w = 2$ is the standard deviation of the measurement noise. The parameters of the GM-PHD filter are set as follows: detection probability $p_d = 0.99$, survival probability $p_{sv} = 0.95$, average clutter rate $\lambda_t = 0.01$, and clutter spatial distribution $c_t(\mathbf{z}_t) = (\text{image area})^{-1}$.
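With these stated values, the model matrices can be assembled directly; the sketch below simply mirrors the block structure given above.

```python
import numpy as np

T, sigma_v, sigma_w = 1.0, 3.0, 2.0  # values stated in the paper
I2, Z2 = np.eye(2), np.zeros((2, 2))

# Constant-velocity transition over the state [l; v; s] (location, velocity, size).
F = np.block([[I2, T * I2, Z2],
              [Z2, I2,     Z2],
              [Z2, Z2,     I2]])
Q = sigma_v ** 2 * np.block([[T**4 / 4 * I2, T**3 / 2 * I2, Z2],
                             [T**3 / 2 * I2, T**2 * I2,     Z2],
                             [Z2,            Z2,            T**2 * I2]])
# Measurements observe location and size only.
H = np.block([[I2, Z2, Z2],
              [Z2, Z2, I2]])
R = sigma_w ** 2 * np.eye(4)
```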

The PGM-PHD-MF tracker is tested on both synthetic and real videos. To validate the proposed weight penalization method, the PGM-PHD-MF tracker is compared with the GM-PHD tracker [3] and the CPGM-PHD tracker [10].

A. Qualitative Analysis

A synthetic video is used to validate the effectiveness of the proposed weight penalization method. Fig. 3 and Fig. 4 show the tracking results and the corresponding weight matrices obtained by the trackers, respectively. At t=48, all the trackers successfully track all the targets (Fig. 3(a)). At t=49, targets 1 and 4 approach each other very closely, as do targets 2 and 3. Without any weight penalization method, the conventional GM-PHD tracker tracks target 2 with the wrong identity 3 and switches the identities of targets 1 and 4 (Fig. 3(b)). With the method proposed in the CPGM-PHD tracker, two ambiguous weights for Case 1 are determined and rearranged (Fig. 4(b)), and the corresponding targets are tracked with correct identities (targets 2 and 3 in Fig. 3(c)). However, the CPGM-PHD tracker cannot correctly handle the switched identities of Case 2 (targets 1 and 4 in Fig. 3(c)). On the contrary, the proposed PGM-PHD-MF tracker determines the ambiguous weights for both Case 1 and Case 2 and penalizes them by incorporating the multiple target features. By doing so, four ambiguous weights are determined and rearranged (Fig. 4(c)), and all the targets are tracked with correct identities by the PGM-PHD-MF tracker (Fig. 3(d)).

A real outdoor surveillance video is used to further evaluate the proposed weight penalization method. Fig. 5 shows the tracking results of the GM-PHD tracker, the CPGM-PHD tracker, and the PGM-PHD-MF tracker. Without the weight penalization method, the conventional GM-PHD tracker tracks the close moving targets with the same identity (shown as the two targets labeled 1 in the left of Fig. 5(a)). Both the CPGM-PHD and PGM-PHD-MF trackers can successfully track the close moving targets. However, both the GM-PHD and CPGM-PHD trackers track the merged measurement as one single target when mutual occlusion occurs (shown as target 1 in the right of Fig. 5(a) and 5(b)). On the contrary, the PGM-PHD-MF tracker can correctly track the targets in mutual occlusion (shown as targets 1 and 5 in the right of Fig. 5(c)) by incorporating the mutual occlusion handling method proposed in [15].

Fig. 4. Updated weight matrices at t=49 of the synthetic video. (a) GM-PHD tracker. (b) CPGM-PHD tracker. (c) PGM-PHD-MF tracker.

B. Quantitative Analysis

To evaluate the tracking performance, the CLEAR MOT metrics are used. They return a precision score, MOTP (Multi-Object Tracking Precision), and an accuracy score, MOTA (Multi-Object Tracking Accuracy).

Fig. 3. Tracking results comparison on the synthetic video. (a) Tracked targets at t=48 by all the trackers. (b) Tracked targets at t=49 by the GM-PHD tracker. (c) Tracked targets at t=49 by the CPGM-PHD tracker. (d) Tracked targets at t=49 by the PGM-PHD-MF tracker.

Table 1. Tracking performance for synthetic and real videos

                      Synthetic video        Real video
                      MOTA      MOTP      MOTA      MOTP
GM-PHD tracker        0.8586    0.9266    0.6256    0.8567
CPGM-PHD tracker      0.9863    0.9536    0.7038    0.8724
PGM-PHD-MF tracker    1.0000    0.9675    0.9348    0.9273

Table 2. Tracking performance comparison for PETS2009

                            MOTA      MOTP
GM-PHD tracker              0.4626    0.4983
Tracker reported in [16]    0.8932    0.5643
Tracker reported in [17]    0.7977    0.5634
Tracker reported in [18]    0.7591    0.5382
PGM-PHD-MF tracker          0.8826    0.6055

Table 1 reports the tracking performance comparison according to the CLEAR MOT metrics. The results show that the proposed PGM-PHD-MF tracker greatly improves the MOTA score and achieves a comparable MOTP score on the tested videos. Moreover, the proposed PGM-PHD-MF tracker is also compared with the state-of-the-art trackers reported by Andriyenko et al. [16], Breitenstein et al. [17], and Yang et al. [18] on the PETS2009 video from http://www.cvg.rdg.ac.uk/PETS2009/a.html. The results in Table 2 show that the PGM-PHD-MF tracker achieves a better MOTP score (tracking precision) and a comparable MOTA score (tracking accuracy).

V. CONCLUSIONS

We have developed a penalized GM-PHD tracker with multi-feature fusion to track multiple close moving targets in video. We proposed an effective method to determine the ambiguous weights for both Case 1 and Case 2, and fused multiple target features, including spatial-color appearance, histogram of oriented gradient, and target area, to penalize the ambiguous weights. By doing so, the weights between a target and irrelevant measurements can be greatly penalized, leading to improved tracking accuracy with a low mismatch rate. Experiments conducted on both synthetic and real videos showed the good performance of the proposed tracker.

REFERENCES

[1] I. R. Goodman, R. Mahler, and H. T. Nguyen, Mathematics of Data Fusion. Norwell: Kluwer Academic Press, 1997.
[2] R. Mahler, "Multitarget Bayes filtering via first-order multitarget moments", IEEE Transactions on Aerospace and Electronic Systems, vol. 39, no. 4, pp. 1152-1178, Oct. 2003.
[3] B.-N. Vo and W. K. Ma, "The Gaussian mixture probability hypothesis density filter", IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4091-4104, Nov. 2006.
[4] N. T. Pham, W. M. Huang, and S. H. Ong, "Tracking multiple objects using probability hypothesis density filter and color measurements", in IEEE International Conference on Multimedia and Expo, 2007, pp. 1511-1514.
[5] J. Wu, S. Hu, and Y. Wang, "Adaptive multifeature visual tracking in a probability-hypothesis-density filtering framework", Signal Processing, vol. 93, no. 11, pp. 2915-2926, Nov. 2013.
[6] X. Zhou, Y. F. Li, and B. He, "Entropy distribution and coverage rate-based birth intensity estimation in GM-PHD filter for multi-target visual tracking", Signal Processing, vol. 94, pp. 650-660, Jan. 2014.
[7] X. Zhou, Y. F. Li, B. He, and T. Bai, "GM-PHD-based multi-target visual tracking using entropy distribution and game theory", IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1064-1076, May 2014.
[8] M. Yazdian-Dehkordi, Z. Azimifar, and M. Masnadi-Shirazi, "Competitive Gaussian mixture probability hypothesis density filter for multiple target tracking in the presence of ambiguity and occlusion", IET Radar, Sonar and Navigation, vol. 6, no. 4, pp. 251-262, Apr. 2012.
[9] M. Yazdian-Dehkordi, Z. Azimifar, and M. A. Masnadi-Shirazi, "Penalized Gaussian mixture probability hypothesis density filter for multiple target tracking", Signal Processing, vol. 92, no. 5, pp. 1230-1242, May 2012.
[10] Y. Wang, H. Meng, Y. Liu, and X. Wang, "Collaborative penalized Gaussian mixture PHD tracker for close target tracking", Signal Processing, DOI: http://dx.doi.org/10.1016/j.sigpro.2014.01.034, online first, 2014.
[11] W. Hu, X. Zhou, M. Hu, and S. Maybank, "Occlusion reasoning for tracking multiple people", IEEE Transactions on Circuits and Systems for Video Technology, vol. 19, no. 1, pp. 114-121, Jan. 2009.
[12] X. Zhang, W. Hu, G. Luo, and S. Maybank, "Kernel-Bayesian framework for object tracking", in 8th Asian Conference on Computer Vision, 2007, pp. 821-831.
[13] X. Zhou, Y. F. Li, and B. He, "Multi-target visual tracking with game theory-based mutual occlusion handling", in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013, pp. 4201-4206.
[14] Y. Tang and Y. Li, "Contour coding based rotating adaptive model for human detection and tracking in thermal catadioptric omnidirectional vision", Applied Optics, vol. 51, no. 27, pp. 6641-6652, 2012.
[15] X. Zhou, Y. F. Li, and B. He, "Game-theoretical occlusion handling for multi-target visual tracking", Pattern Recognition, vol. 46, no. 10, pp. 2670-2684, Oct. 2013.
[16] A. Andriyenko, K. Schindler, and S. Roth, "Discrete-continuous optimization for multi-target tracking", in IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 1926-1933.
[17] M. D. Breitenstein, F. Reichlin, B. Leibe, E. Koller-Meier, and L. Van Gool, "Online multiperson tracking-by-detection from a single, uncalibrated camera", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 9, pp. 1820-1833, Sep. 2011.
[18] J. Yang, Z. Shi, P. Vela, and J. Teizer, "Probabilistic multiple people tracking through complex situations", in IEEE Workshop on Performance Evaluation of Tracking and Surveillance, 2009, pp. 79-86.

Fig. 5. Tracking results comparison on the outdoor surveillance video. (a) Tracking by the GM-PHD tracker. (b) Tracking by the CPGM-PHD tracker. (c) Tracking by the PGM-PHD-MF tracker.