\u003ctitle\u003eFuzzy logic recursive change detection for tracking and denoising of video...

12
Fuzzy Logic Recursive Change Detection for tracking and denoising of Video Sequences Vladimir Zlokolica a , Matthias De Geyter c , Stefan Schulte b , Aleksandra Pizurica a , Wilfried Philips a and Etienne Kerre b a Ghent University, Dept. of Telecommunications and Information Processing (TELIN), IPI, Sint-Pietersnieuwstraat 41, 9000 Gent, Belgium b Ghent University, Department of Applied Mathematics and Computer Science, 9000 Gent, Belgium c VRT, Brussels, Belgium ABSTRACT In this paper we propose a fuzzy logic recursive scheme for motion detection and temporal filtering that can deal with the Gaussian noise and unsteady illumination conditions both in temporal and spatial direction. Our focus is on applications concerning tracking and denoising of image sequences. We process an input noisy sequence with fuzzy logic motion detection in order to determine the degree of motion confidence. The proposed motion detector combines the membership degree appropriately using defined fuzzy rules, where the membership degree of motion for each pixel in a 2D-sliding-window is determined by the proposed membership function. Both fuzzy membership function and fuzzy rules are defined in such a way that the performance of the motion detector is optimized in terms of its robustness to noise and unsteady lighting conditions. We perform simultaneously tracking and recursive adaptive temporal filtering, where the amount of filtering is inversely proportional to the confidence with respect to the existence of motion. Finally, temporally filtered frames are further processed by the proposed spatial filter in order to obtain denoised image sequence. The main contribution of this paper is the robust novel fuzzy recursive scheme for motion detection and temporal filtering. We evaluate the proposed motion detection algorithm using two criteria: robustness to noise and changing illumination conditions and motion blur in temporal recursive denoising. Additionally, we make comparisons in terms of noise reduction with other state of the art video denoising techniques. 1. INTRODUCTION In digital video, the general motion detection problem is quite complex, because it is not always easy to distinguish illumination changes from real motion and because of the aperture problem. As such, “close to perfect” solutions tend to be very time-consuming. However, in many applications it is sufficient to detect changes in the scene rather than actual motion, and even to detect only some of the changes. In this paper we consider motion detection in terms of robust change detection against noise and slowly varying illumination conditions. This means that our method will not be able to distinguish completely changes due to motion from other changes due to rapidly varying illumination, scene-cuts, fast camera zoom and panning, . . . Despite these restrictions there are numerous applications for this kind of motion detection: it can be used for surveillance objectives, e.g. to monitor a room in which there is not supposed to be any motion, or the detection results can be a useful input for more advanced, higher level video processing techniques, like the tracking of objects through time (e.g. [1]). Other applications are noise removal and deinterlacing, where one applies temporal filtering depending on the outcome of the motion detector. Given the outlined restrictions, the main problem to be tackled is to distinguish image noise, caused by various sources, from real changes in picture intensity. Additionally, we try to adapt the method to changeable illuminating conditions both in spatial and temporal direction. A trivial and very fast (but not very good) solution for pixel-by-pixel change detection is to simply subtract the grey levels of successive frames, and to Further author information: (Send correspondence to Vladimir Zlokolica) Vladimir Zlokolica: E-mail: [email protected], Telephone: +32 9 264 8909, Fax.: +32 9 264 4295

Transcript of \u003ctitle\u003eFuzzy logic recursive change detection for tracking and denoising of video...

Fuzzy Logic Recursive Change Detection for tracking anddenoising of Video Sequences

Vladimir Zlokolicaa, Matthias De Geyterc, Stefan Schulteb, Aleksandra Pizuricaa, WilfriedPhilipsa and Etienne Kerreb

aGhent University, Dept. of Telecommunications and Information Processing (TELIN), IPI,Sint-Pietersnieuwstraat 41, 9000 Gent, Belgium

bGhent University, Department of Applied Mathematics and Computer Science, 9000 Gent,Belgium

cVRT, Brussels, Belgium

ABSTRACT

In this paper we propose a fuzzy logic recursive scheme for motion detection and temporal filtering that can dealwith the Gaussian noise and unsteady illumination conditions both in temporal and spatial direction. Our focusis on applications concerning tracking and denoising of image sequences. We process an input noisy sequencewith fuzzy logic motion detection in order to determine the degree of motion confidence. The proposed motiondetector combines the membership degree appropriately using defined fuzzy rules, where the membership degreeof motion for each pixel in a 2D-sliding-window is determined by the proposed membership function. Both fuzzymembership function and fuzzy rules are defined in such a way that the performance of the motion detectoris optimized in terms of its robustness to noise and unsteady lighting conditions. We perform simultaneouslytracking and recursive adaptive temporal filtering, where the amount of filtering is inversely proportional to theconfidence with respect to the existence of motion. Finally, temporally filtered frames are further processed bythe proposed spatial filter in order to obtain denoised image sequence.

The main contribution of this paper is the robust novel fuzzy recursive scheme for motion detection andtemporal filtering. We evaluate the proposed motion detection algorithm using two criteria: robustness to noiseand changing illumination conditions and motion blur in temporal recursive denoising. Additionally, we makecomparisons in terms of noise reduction with other state of the art video denoising techniques.

1. INTRODUCTION

In digital video, the general motion detection problem is quite complex, because it is not always easy to distinguishillumination changes from real motion and because of the aperture problem. As such, “close to perfect” solutionstend to be very time-consuming. However, in many applications it is sufficient to detect changes in the scenerather than actual motion, and even to detect only some of the changes. In this paper we consider motiondetection in terms of robust change detection against noise and slowly varying illumination conditions. Thismeans that our method will not be able to distinguish completely changes due to motion from other changes dueto rapidly varying illumination, scene-cuts, fast camera zoom and panning, . . . Despite these restrictions thereare numerous applications for this kind of motion detection: it can be used for surveillance objectives, e.g. tomonitor a room in which there is not supposed to be any motion, or the detection results can be a useful inputfor more advanced, higher level video processing techniques, like the tracking of objects through time (e.g. [1]).Other applications are noise removal and deinterlacing, where one applies temporal filtering depending on theoutcome of the motion detector.

Given the outlined restrictions, the main problem to be tackled is to distinguish image noise, caused byvarious sources, from real changes in picture intensity. Additionally, we try to adapt the method to changeableilluminating conditions both in spatial and temporal direction. A trivial and very fast (but not very good)solution for pixel-by-pixel change detection is to simply subtract the grey levels of successive frames, and to

Further author information: (Send correspondence to Vladimir Zlokolica)Vladimir Zlokolica: E-mail: [email protected], Telephone: +32 9 264 8909, Fax.: +32 9 264 4295

Fuzzy motion detection

Update α, β, σ

Recursive temporalfiltering

Spatial filteringSpatio-temporaldenoised sequence

Noisy inputcurrent frame

Previously temporally filtered frame

Motion confidence thresholding

Motion tracking

motion confidence

T

Figure 1. The proposed algorithm framework

conclude that the pixel has changed when the outcome exceeds a present threshold. Because only one pixel isconsidered at a time, the computational cost is quite low. While this technique works reasonably well at lownoise levels, the performance degrades rapidly with increasing noise level. Also, in practice the threshold willhave to be tuned to the noise level, which must be estimated on a global or local basis. To tackle the noiseproblem, some more sophisticated techniques are needed. One could examine regions with more than one pixel,and define and calculate some characteristic functions (e.g., the spatial average in a window of pixels) as in [2,3],at the expense of an increased complexity.

In this paper, we propose an alternative approach and we restrict ourselves to a recursive algorithm requiringa small number of computations. The idea is to use adaptive thresholding, where the threshold is adapted tothe local grey scale statistics and the spatial context of the pixel. The resulting method is insensitive to noiseand to slow illumination changes. Moreover, it is locally adaptive, i.e., to spatially varying noise levels. In [4],some more complex techniques to obtain a threshold based on the noise or signal properties are proposed, andapplied to simple frame differencing. Our method is different, since it uses data coming from a longer period oftime, and the threshold is adapted to both temporal and spatial information.

It is quite obvious that different applications require different approaches; in the case of motion detection fortracking, it is important that more or less only “real” motion is detected: the goal is to detect moving objects.This means, on the one hand, that if the input sequence is noisy, we do not want the noise to be labeled asmotion, while on the other hand, it is not that important if not every single changed pixel of an object is detected.However, in the case of motion detection for denoising, where the detection result is used for temporal filtering,undetected changes in an object can lead to (unwanted) motion blur, while it is allowed that some (preferably aslittle as possible) of the noise is labeled as motion. By using fuzzy logic we aim at defining a confidence measurewith respect to the existence of motion, to be called hereafter (the degree of) motion confidence. Based on thedefined motion confidence we optimize temporal denoising and motion detection. We show the results of videodenoising performance in terms of PSNR and visual point of view, and motion detection performance in termsof its robustness to changeable noise and illuminating conditions.

In section 2 we present the framework of the algorithm for simultaneous motion detection and video denoising,and show the main blocks of the proposed algorithm. Subsequently, we explain the proposed fuzzy logic recursivemotion detector in section 3. In section 4 adaptive temporal and spatial filtering scheme is introduced. Finally,experimental results are given in section 5 and conclusions are drawn in section 6.

2. ALGORITHM DESCRIPTION

The proposed algorithm is depicted in Fig. 1. We first process an noisy input sequence with fuzzy logic motiondetection using a current and previously processed temporal frame in order to determine the degree of the motion

confidence. The degree of the motion confidence is expressed as a real number between the two extremes: zero(no motion for sure) and one (motion for sure), in a fuzzy logic manner.

The determined motion confidence is further used for motion tracking and for video denoising. For tracking weperform motion confidence thresholding where the threshold used is application dependent. By thresholding wedetermine two states of the motion tracker: motion - no motion, and subsequently use the estimated informationfor motion tracking of movements and certain objects (in that case motion information is combined with imagesegmentation) of the noisy video sequence. The assumed noise in corrupted input video is the white Gaussiannoise.

On the other hand, motion confidence is used to update the parameters (α, β and σ) for temporal recursivefiltering and spatio-temporal adaptation of the algorithm to the change of noise levels and illumination changes,in the input corrupted sequence. Specifically, the parameter α is a weighting factor of the recursive temporalfilter, which controls the amount of filtering based on the amount of noise present (input noise variance) andbased on the motion confidence determined by the fuzzy motion detector.

We estimate the standard deviation of the Gaussian noise σ(x, y, t) from the input video sequence [5] only forthe first frame (t = 0) and subsequently try to adapt the variance of noise to the input video illumination andnoise changes by spatio-temporal adaptation of the standard noise deviation (σ), where parameter β is used tocontrol the amount of updating noise and illumination changes.

Next, we perform temporal recursive filtering using the current input noisy frame and the previously tempo-rally filtered frame. Subsequently, the proposed spatial filter is applied to the temporally filtered frame. Since thetemporally filtered frames will have non-stationary noise introduced, the proposed spatial filter aims at locally(in space) adapting to the estimated signal variance.

As can be seen in Fig.1 the temporal filter and the fuzzy logic motion detector work in a closed loop, that is ina recursive manner where at each step the performance of each module is improved by the one. This is achievedthrough adaptation to a local spatio-temporal noise variance and illumination changes and motion present in theinput noisy video, recursively.

3. FUZZY LOGIC RECURSIVE MOTION DETECTOR

In a previous work [5] we have introduced a binary motion detector, which can only distinguish between motionand no motion. This motion detector was used to control the performance of the spatio-temporal filter, accordingto two states only: temporal information included or not. A disadvantage was that in case of higher noise levels,due to motion ambiguities, the filter was introducing motion blurring and artifacts resulting in a reduced denoisingperformance.

Recently, in [6] a recursive scheme for motion detection was proposed for motion tracking in the presenceof low-noise levels, where only information from the current and from the previously processed (temporallyaccumulated) frame was used to detect motion aiming at reducing memory requirements. However, still inthe same approach the proposed motion detector was binary, and consequently was not well suited for motiondetection in presence of higher noise levels and usage for video denoising.

In this paper we propose a fuzzy logic recursive motion detector for both video denoising and tracking.Instead of determining motion by binary logic, that is motion - no motion, we determine the degree of motionconfidence which is expressed as a real number between zero (no motion for sure) and one (motion for sure). Weuse a membership function to determine the membership degree for the absolute difference between the pixelvalue at the same spatial positions in two consecutive frames. Additionally, we do not only consider the currentpixel position but spatial neighbors in a 2D-window (3× 3) as well. Finally we combine the membership degreeappropriately using the proposed fuzzy rules in order to get the resulting fuzzy logic motion detection output.In order to get a binary output for tracking, we impose simple thresholding on the fuzzy logic motion detectionoutput.

3.1. Fuzzy logic based information processing

Fuzzy logic based information processing helps to model imperfect knowledge and uncertainty about thresholdvalues [7]. Extension of the classical set theory to fuzzy sets presents the essential part of fuzzy logic [8]. A fuzzyset F in an universe U characterized by a membership function ηF which takes a value of the interval [0, 1]:

F : U → [0, 1], u 7−→ ηF (u) (1)

The classical intersection and union of sets do also have their fuzzy counterparts. A fuzzy intersection of twofuzzy sets A and B in U is given by:

A ∩B : U → [0, 1], u 7−→ ηA(u) ∧ ηB(u) (2)

where the ∧-operator represents a triangular norm [9,10]. Additionally, a fuzzy union of A and B is given by:

A ∪B : U → [0, 1], u 7−→ ηA(u) ∨ ηB(u) (3)

where the ∨-operator represents a triangular co-norm [9,10].

In general a conditional rule such as “IF antecedent THEN consequent” formulates the necessary conditionsin the antecedent so that the consequent is activated. The fuzzy control rule “IF x is F THEN y is G” isrepresented by a fuzzy relationship using a fuzzy implication function I:

R : U × V → [0, 1], (u, v) 7−→ IR(u, v) = `(ηF (u), ηG(v)). (4)

Several choices for the implication function are available [7]. Rules are activated by means of an inferencemechanism. In general compound antecedents can be used in a rule set. For example, “x1 is F1 and x2 is F2” isrepresented by a triangular norm:

ηF1∧F2(u1, u2) = ηF1

(u1) ∧ ηF2(u2) (5)

For more detailed description concerning fuzzy logic controllers we refer to [11,12]. In general fuzzy controlleris composed of three important steps: fuzzyfication, an application of a rule set, and the defuzzification. Therule set of a MISO-system (Multiple-Input-Single-Output) can be characterized as follows:

R1 : IF x1 is A1,1 AND . . . AND xn is A1,n THEN z is C1

R2 : IF x1 is A2,1 AND . . . AND xn is A2,n THEN z is C2

...Rm : IF x1 is Am,1 AND . . . AND xn is Am,n THEN z is Cm

(6)

Such a canonical form only contains antecedents which use the conjunction operator only. Every rule evaluatesto a certain truthness value which is called a plausibility in fuzzy logic (as opposed to probability they are notnormalized). If multiple rules infer the same consequent, those results need to be combined by a triangular co-norm. Finally, defuzzification is the inverse process of fuzzification: the plausibilities of the different consequentsneed to be combined to obtain a meaningful result (e.g., a single scalar output) [13].

3.2. Motion detection

The proposed fuzzy logic based motion detector combines several exemplars of the difference signal ∆(x, y, t)in a fuzzy manner in order to determine the motion confidence for a certain spatio-temporal position (x, y, t),horizontal, vertical and temporal coordinate, respectively, where the difference signal ∆(x, y, t) is defined asfollows:

∆(x, y, t) = |In(x, y, t)− Ip(x, y, t− 1)| (7)

Namely, the difference signal ∆(x, y, t) represents the absolute difference of the pixel value from the inputnoisy frame currently being processed In(x, y, t) and pixel value in the same spatial position from the previouslyprocessed sequence Ip(x, y, t − 1). Using the difference signal ∆(x, y, t), we define the membership functionsSMALL and BIG as follows:

γSMALL(∆(x, y, t)) =

1 ,∆(x, y, t) < a

1− ∆(x,y,t)−ab−a , a ≤ ∆(x, y, t) ≤ b

0 ,∆(x, y, t) > b

(8)

γBIG(∆(x, y, t)) =

0 ,∆(x, y, t) < a∆(x,y,t)−a

b−a , a ≤ ∆(x, y, t) ≤ b1 ,∆(x, y, t) > b

(9)

with the following parameters values used: a = k ∗ σmean and b = 4q(σ(x, y, t)), where σ(x, y, t) and σmean(t)correspond to the standard deviation of the noise for the processed video signal at position (x, y, t) and meanvalue of σ(x, y, t) for the whole frame “t”, respectively. The parameter k depends on the resolution of the inputvideo and usually ranges in k ∈ [0, 0.25]. Additionally, we define q(σ(x, y, t)) in the following way:

q(σ(x, y, t)) = 2σmean(t− 1)− (σ(x, y, t− 1)− kbµσmean(t− 1)/µ(x, y, t) + kfµµ(x, y, t)/σmean(t− 1)) (10)

where µ(x, y, t) represents a rough estimation of the noise variance and motion together and is defined as follows:

µ(x, y, t) =

∑1i,j=−1 |∆(x+ i, y + j, t)|

9(11)

and kbµ and kfµ represent adaptable constants for the no motion (kbµ = 1.5) and motion (kfµ = 0.75) undergoingarea, respectively. Specifically, the absolute value of difference signal ∆ from (7) is averaged in a 2D (3 × 3)window, in order to get more spatially smoothed value µ, where the determined µ value is used for the correction ofthe slope of the membership functions γBIG(∆(x, y, t)) and γSMALL(∆(x, y, t)). The bigger the value of µ(x, y, t)is (relatively to the σmean(t− 1)), the bigger the slope of the membership functions will be and consequently themore sensitive the motion detection will be. On the other hand if µ(x, y, t) is relatively small (region withoutmotion - only noise in the background), the slope of the membership functions will be smaller, hence reducingthe sensitiveness of the motion detector. In such way, we aim at improving at same time the sensitiveness of themotion detection to movements and its robustness against noise.

We not only consider the membership degrees γBIG(∆(x, y, t)) and γSMALL(∆(x, y, t)) at the current pixelposition (x, y, t) but also for those of the spatial neighbors in a in a 2D-window (K × K), where K = 3 is inour implementation. These two membership functions (SMALL and BIG) are visualized in Fig. 2. Finally, wecombine the membership degree (8) and (9) of all difference signals (∆) in a 2D sliding window, by using ourfuzzy rule set (Table 1), which represents the core of the fuzzy controller. As a result every rule decides whetherthere is motion (M(x, y, t) = true)) or not (M(x, y, t) = false)).

The first rule states that there is motion (M(x, y, t) = true)) if the signal difference in the window Fig. 3(a)∆(x, y, t) can be considered as big in a fuzzy way and at least three neighboring signals ∆(x+ i, y + j, t) (i, j =−1, 0, 1) can be considered as big too. It introduces Nr1 rules:

(a) SMALL (b) BIG

a∆

0

0.5

1

1.5

mem

bers

hip

degr

ee

b

SMALL

a b∆

0

0.5

1

1.5

mem

bers

hip

degr

ee

BIG

Figure 2. Membership functions

rule antecedent consequent1 ∃K((i1,j1)6=(i2,j2)6=(i3,j3))

i1,j1,i2,j2,i3,j3=−K((i1,2,3,j1,2,3)6=(0,0)){(∆(x, y, t) = BIG) AND

(∆(x+ i1, y + j1, t) = BIG) AND(∆(x+ i2, y + j2, t) = BIG) AND (∆(x+ i3, y + j3, t) = BIG)} M(x,y,t) = true

2 (∆(x, y, t) = SMALL) M(x,y,t) = false3 (∆(x, y, t) = BIG) AND (∀Ki,j 6=0

i,j=−K(∆(x+ i, y + j, t) = SMALL)) M(x,y,t) = false

4 ∃Ki,j 6=0i,j=−K

{(∆(x, y, t) = BIG) AND (∆(x+ i, y + j, t) = BIG) AND

(∀K((i1,j1)6=(i,j))i1,j1=−K

((i1,j1)6=(0,0))(∆(x+ i1, y + j1, t) = SMALL)} M(x,y,t) = false

5 ∃Ki1,j1,i2,j2 6=0i,j=−K

{(∆(x, y, t) = BIG) AND (∆(x+ i1, y + j1, t) = BIG) AND

(∆(x+ i2, y + j2, t) = BIG) AND(∀K((i3,j3)6=(i1,2,j1,2))

i3,j3=−K((i3,j3)6=(0,0))(∆(x+ i3, y + j3, t) = SMALL)} M(x,y,t) = false

Table 1. Fuzzy rule set - theoretical approach (∃ - there exists; ∀ - for all)

Nr1 = C3N−1 =

(N − 1

3

)=

(N − 1)!

(N − 2− 2)!3!=

(N − 1)(N − 2)(N − 3)

3!(12)

where in our case for K = 3, N = 9 and consequently Nr1 = 56.

The second, third, fourth and fifth rule determine the case where we detect no motion, M(x, y, t) = false.The second and the third rule state that there is no motion if the central signal difference, in the window Fig. 3(b)∆(x, y, t) can be considered as small in a fuzzy way or if the central signal difference in the window Fig. 3(c)∆(x, y, t) can be considered as big in a fuzzy way but all neighbors can be considered as small, respectively. Bothrules (the second and the third) introduce one rule. The fourth rule states that there is no motion if the centralsignal ∆(x, y, t) and one neighbor can be considered big but all other neighbors can be considered as small ina fuzzy way, fig. 3(d). It introduces N − 1 rules. Finally, the fifth rule states that there is no motion if thecentral signal difference ∆(x, y, t) and two neighbors can be considered as big but all the other neighbors can beconsidered as small in a fuzzy way, Fig. 3(e), introducing additionally Nr5 = C2

N−1 rules. As a result in total weget Ntotal = Nr1 +Nr5 +N + 1 = rules at each position.

For the triangular norm and co-norm [10] we have used the product and the probabilistic sum, respectively:

(a) 1.rule (b) 2.rule (c) 3.rule

(d) 4.rule (e) 5.rule

Figure 3. 2D Neighborhood sliding window for fuzzy motion detection

rule antecedent consequentR1 ∃K((i1,j1)6=(i2,j2)6=(i3,j3))

i1,j1,i2,j2,i3,j3=−K((i1,2,3,j1,2,3)6=(0,0)){(∆(x, y, t) = BIG) AND

(∆(x+ i1, y + j1, t) = BIG) AND(∆(x+ i2, y + j2, t) = BIG) AND (∆(x+ i3, y + j3, t) = BIG)} M(x,y,t) = true

R2 ∀K((i1,j1)6=(i2,j2)6=(i3,j3))i1,j1,i2,j2,i3,j3=−K

((i1,2,3,j1,2,3)6=(0,0))(SF1 OR SF2 OR SF3 OR SF4) M(x,y,t) = false

subfact factSF1 ∆(x, y, t) = SMALLSF2 ∆(x1, y1, t) = SMALLSF3 ∆(x2, y2, t) = SMALLSF4 ∆(x3, y3, t) = SMALL

Table 2. Fuzzy rule set - practical implementation (∃ - there exists; ∀ - for all)

A AND B: A ∗BA OR B: A+B −A ∗B (13)

where A and B correspond to the membership degree of the membership functions γSMALL and/or γBIG (8,9).

Finally, we define the motion confidence as follows:

θ =Mtrue

Mtrue +Mfalse(14)

where Mtrue corresponds to the “true” motion calculated according to the first rule and Mfalse corresponds tothe “false” motion calculated according to the second, third, forth and the fifth rule (Table 1).

However, for practical implementation in order to reduce the number of calculation, we have tested the per-formance of the proposed motion detector by θ = Mtrue and have noticed negligible decrease of the performancein comparison to (14). Nevertheless, it should be noted that in the latter case the fuzzy rules concerning “false”motion are defined differently, i.e. the fuzzy rules are defined as in Table 2.

Finally, for the motion tracking we perform motion confidence thresholding on the estimated motion confi-dence θ(x, y, t) as follows:

MD(x, y, t) =

{Y ES θ > THRNO otherwise

(15)

where the threshold THR is implementation defendant and video resolution defendant.

4. SIGNAL FILTERING AND ADAPTATION

In this section we introduce the temporal recursive filter and propose a scheme for adapting the temporalfiltering and motion detection algorithm to unsteady lightning conditions and to sudden changes of the noiselevel (section 4.1). For the latter, we estimate the standard deviation of the input Gaussian noise for the firstframe only and then adapt it throughout the sequence.

Additionally in subsection 4.2 we propose simple and efficient spatial filter for removing non-stationary noiseintroduced by the temporal filter, that precedes the spatial one.

4.1. Temporal filtering and adaptation

Using the degree of motion confidence, θ, we perform recursive temporal filtering as follows: when the degree ofmotion confidence is bigger we filter less while in the opposite case we filter more. Namely, we control the amountof filtering through a parameter α(x, y, t), which is directly proportional to the estimated motion confidence, andtakes values α ∈ [0, 1]. The proposed recursive temporal filtering is defined in the following way:

Ip(x, y, t) = αIn(x, y, t) + (1− α)Ip(x, y, t− 1) (16)

where In and Ip correspond to current input noisy and processed (temporally filtered) output frame, respectively,as shown in Fig. 1. For one extreme case α(x, y, t) = 1, there is no averaging, while for the other extreme case,α(x, y, t) = 0, only the processed (filtered) pixel from the previous temporal position t − 1 is substituted. Thesmaller the α value, the stronger filtering will be performed but also more information from the previous framewill be taken into account. Hence, in case when in the previous frame α(x, y, t − 1) alpha was big α ≈ 1, thenoise will be kept and if we allow in the current frame α(x, y, t) big, that is due to the information from motionconfidence from the current frame θ(x, y, t) to be big, noise will propagate through the processed sequence. Inorder to refine our method to be robust against noise propagation, we determine α(x, y, t) not only accordinglyto the motion confidence in the current frame (θ(x, y, t)), but also according to the amount of the temporalfiltering in the previous frame, α(x, y, t− 1), in the following way:

α(x, y, t) = α(x, y, t− 1)α′(x, y, t) + α(x, y, t− 1)

2+ (1− α(x, y, t− 1))α′(x, y, t) (17)

where α′(x, y, t) is an initial estimate based on the motion confidence only, α′(x, y, t) =√θ(x, y, t), and the

updated value (17) uses the information about the amount of filtering from the past. As a result, we obtaina more smoothed transition of the amount of temporal filtering from one frame to the other. Note that thisalso prevents noise propagation in time (as explained above) and avoids in this way a typical problem of timerecursive schemes.

In order to adapt our algorithm to unsteady lighting conditions and sudden changes of the noise level weestimate the standard deviation of the input Gaussian noise using the method of [5] for the first frame only andthen adapt it throughout the sequence for each position (x, y, t), separately. We perform this adaption of thestandard deviation of the noise (σ(x, y, t)) by recursive averaging. The higher the degree of motion confidence isthe closer the current value σ(x, y, t) is to the previous value σ(x, y, t− 1):

σ(x, y, t) = (1− β(x, y, t))µ+ β(x, y, t)σ(x, y, t− 1) (18)

where µ is defined in (11) and the weighting parameter β(x, y, t) which controls the updating of σ(x, y, t) isdefined as:

β(x, y, t) =

{Kβ θ < Kβ

θ otherwise(19)

where Kβ is a constant. We found experimentally that for different video sequences the range Kβ ∈ [0.25, 0.5]yields the best motion detection performance. Further on, in our experiments, we find that more stable andbetter results are obtained by locally averaging the estimated σ value within a 2D-sliding window (contains halfof the values from the previous and half from the current frame) centered at (x, y, t). In the first frame σ(x, y, t)is initialized to a constant for each spatial position and then further adapted.

We use σ(x, y, t) to improve the robustness against noise, noise level changes and illumination changes, forthe motion detector in the following way: the estimated σ(x, y, t) defines the slope of the fuzzy SMALL andBIG membership functions (8,9) by modifying the parameter b from (8,9) in terms of function q(σ(x, y, t),which is defined in (10). The bigger the estimated noise variance is, the smaller the slope of the correspondingmembership function will be and thus noise reduction of the motion confidence will be stronger. additionally, incase of illumination changes and/or change of the noise level σ(x, y, t) should slowly (in one or two frames time)adapt to the present changes in the video.

4.2. Spatial filter

After temporal filtering, we apply spatial weighted averaging filtering as as follows:

Is(x, y, t) =

1∑

i,j=−1

(w(i, j, t)Ip(x+ i, y + j, t) (20)

where the weighting function w(x, y, t) is a Gaussian function of the absolute difference between the central(x, y, t) and the neighboring pixel value in a 2-D window (3× 3). Namely, the latter function is defined as:

w(i, j, t) = e− (I(x+i,y+j,t)−I(x,y,t))2

2σ2(x,y,t) (21)

The width of the Gaussian function used for weighted spatial averaging has been set to the estimated σ(x, y, t),that is the standard deviation estimated for the central pixel position, which was experimentally shown to performoptimal in terms of the resulting PSNR in the processed sequences.

5. EXPERIMENTAL RESULTS

In this section, we demonstrate the performance of the proposed algorithm for video denoising and motiontracking. The tested video sequences were distorted by various levels of the Gaussian noise (σi = 10, 15, 20, 25, 30).Additionally video sequences with unsteady lighting conditions and noise levels varying in time throughout thesequence were considered. The camera used for obtaining the sequences was fixed. Hence, we consider sequenceswith steady background and moving objects, that are used in video surveillance (motion tracking and videoquality improvement).

We have first tested several video sequences corrupted with steady Gaussian noise throughout the inputcorrupted sequence and with steady lighting conditions, such as “Salesman”, “Trevor”, “Miss America”, ”Bicycle”and “Chair”,. . . . We have corrupted them with various noise levels and compared denoising results with severalstate of the art techniques. In Table 3 we show comparison of the proposed filter performance in terms ofaveraged PSNR over all frames in the sequence for the noise corrupted sequences with artificial uniformly spreadGaussian noise, with σi = 10.

Additionally we show the PSNR per frame for the “Salesman” and “Trevor” sequence with additive Gaussiannoise (σi = 15), respectively, on Fig. 4. The proposed Fuzzy motion recursive spatio-temporal filtering technique(FMRSTF) was compared with several state-of-the-art techniques for video denoising: The Rational filter [14](Rational), Motion-Detail Adaptive KNN filter [5] (MDAKNN), The spatio-temporal sequential filter of [15],Multiclass wavelet based spatio-temporal filter [16] (MCWF), 3D wavelet filter [17] (3DWF) and Alpha-Trimmedfilter [18]. Both in terms of the PSNR and visually, our algorithm outperforms single resolution techniques [5,14]of similar or higher complexity. Moreover, the proposed algorithm yields a PSNR and visual performance that iscomparable to much more complex wavelet based video denoising techniques [15–17]. The processed sequences(with added Gaussian noise of σ = 10, 15, 20, 25, can be viewed on http://telin.ugent.be/ vzlokoli/SPIE05/UNSP.

Table 3. PSNR for processed sequences corrupted with Gaussian noise σi = 10

PSNRinput Fuzzy adaptive α rational MCWF SEQWT

sequence MRSTF K-NN trimmedSalesman 34.1 32.5 29.4 30.6 32.96 34.14Deadline 34.8 32.2 24.3 27.3 33.17 XMissAm. 36.6 35.3 35.1 35.2 35.9 37.57Trevor 34.1 34.1 34.1 34.3 35.37 XTennis 31.2 30.5 22.5 26.5 31.33 32.27

(a) Salesman (b) Trevor

0 10 20 30 40 50frame index

29

29.5

30

30.5

31

31.5

32

32.5

33

33.5

34

34.5

PS

NR

SEQWTMCWF3DWFAKNNFSTFRational

0 5 10 15 20frame index

28

28.5

29

29.5

30

30.5

31

31.5

32

32.5

33

33.5

34

PS

NR

MDAKNNRationalFMRSTFMCWF

Figure 4. PSNR per frame for input noise level σ = 15

There video sequences with motion detection (tracking) can be viewed as well, where the detected motion wasdetermined by the motion confidence thresholding as showed in Fig. 1 as a separate block and described by (15),with the threshold THR = 0.75.

In order to show the robustness of the motion detector to noise and to motion blur we show the results ofthe motion detector for different threshold values (15) for the estimated motion confidence (θ). The results ofthe processed sequences for the “Tennis” and “Salesman” sequence for noise levels σ = 5, 10, 15, 20, 25, 30, fordifferent threshold values THR, can be visualized on the website: http://telin.ugent.be/˜vzlokoli/SPIE05/UNSP.Additionally, in Fig. 5 we show the comparison of the proposed motion detector for the “Tennis” sequence(t = 131) and with added noise of σi = 15, with the Noise Robust Change Detection (NRCD) of [6] and with theresult of the proposed method performance on the same sequence without noise added. From the fig. 5, it canbe observed that the proposed FMRD method is always better, that is, more robust to noise and detects moremotion in comparison to NRCD method, with consideration of ground truth motion information from Fig. 5(d).Moreover, by changing the threshold value (THR) we can adapt the algorithm to desired amount of detection,whereas in the other methods [4, 6] the output is fixed. As the noise increases the amount of detected pixelsdecreases slowly for the cost of noise reduction of motion detected pixels. However, even for relatively smallthreshold (THR) and relatively high noise levels (σ ≈ 20) as shown in the Fig. 15 it gives good and reliableresults.

We have also tested our video denoising algorithm on video sequences with additive Gaussian noise thatchanges throughout the sequence in the temporal domain. As an experiment we have added noise, σi = 10,in “Salesman” and “Bicycle” sequence for the first 25 and 15 frames, respectively, and then changed the noiselevel to (a)σi = 15 and (b)σi = 20. Similarly, we have set another experiment with σi = 20 for the first 25and 15 frames, for “Salesman” and “Bicycle” sequence, respectively, and then change it to (c)σi = 15 and

(a) (σi = 20) NRCD (b) (σi = 20) FMRD (THR = 0.5)

(c) (σi = 20) FMRD (THR = 0.75) (d) (noise-free) FMRD (THR = 0.75)

Figure 5. Motion Detection comparison

(a) Salesman (b) Bicycle

0 10 20 30 40 50frame index

8

9

10

11

12

13

14

15

16

17

18

19

20

σ(t)

10−1520−1520−1510−20

0 10 20 30frame index

5

10

15

20

σ(t)

10−2010−1520−1020−15

Figure 6. σ adaption

(d)σi = 20. In Fig. 6 we show how the estimated average σmean(t) changes from frame to frame and howbig the slope is, i.e. the speed of the adaption. The corresponding processed sequences can be seen on theweb: http://telin.ugent.be/˜vzlokoli/SPIE05/NUNSP/CN, where both denoised sequences and motion detectionsequences are shown. The value for motion confidence thresholding (Fig. 1)(15) was THR = 0.75. It can beobserved from the sequences showed how fast the motion detection adapts to the noise level change and correctsits performance in few frames (usually 1-4).

Finally, we have tested our algorithm for motion detection and video denoising for the video sequences withvarying illumination changes both in spatial and temporal domain. For that we have set the experiment with thesequence with the fixed camera and have changed the illumination conditions in the time and space. Additionallywe have added Gaussian noise σi = 15 to the sequences and have processed them. The results demonstrate goodrobustness against slow illumination changes. For sudden changes more time is needed for the algorithm toadapt. The results are shown on the website: http://telin.ugent.be/˜vzlokoli/SPIE05/NUNSP/CI.

6. CONCLUSION

In this paper a new adaptive recursive scheme for fuzzy logic based motion detection that is robust to noise andslowly variating illumination changes, is presented. The proposed scheme works in a close loop with the proposedtemporal recursive filter. In future we aim at investigating the usage of color and spatial image features for moreefficient motion detection that could cope with fast illumination changes and fast zooming and panning.

REFERENCES

1. J. Denzler and D. W. R. Paulus, “Active motion detection and object tracking,” in ICIP, pp. 635–639, 1994.

2. R. Jain and K. Skifstad, “Illumination independent change detection from real world image sequences,”Computer Vision, Graphics, and Image Processing 46, pp. 387–399, 1989.

3. Y. Z. Hsu, H. H. Nagel, and G. Refers, “New likelihood test methods for change detection in image se-quences,” Computer Vision, Graphics, and Image Processing 26, pp. 73–106, 1984.

4. P. L. Rosin, “Thresholding for change detection,” in ICCV, pp. 274–279, (Bombay, India), January 1998.

5. V. Zlokolica and W. Philips, “Motion and detail adaptive denoising of video,” in IS&T/SPIE Symposiumon Electronic Imaging,San Jose, California, USA, Jan. 2004.

6. M. De Geyter and W. Philips, “A noise robust method for change detection,” in ICIP,Barcelona, Sept.2003.

7. D. V. D. Ville, PhD, Ghent University, 2001.

8. E. Kerre, “Introduction to the basic principles of fuzzy set theory and some of its applications,” Communi-cation and Cognition , 1991.

9. E. Kerre, Fuzzy sets and approximate reasoning, Xian Jiaotong University Pres, 1998.

10. E. Klement, M. R., and E. Pap, Triangular Norms, Kluwer Academic Pub., 2000.

11. E. Cox, “Fuzzy fundamentals,” IEEE Spectrum , Oct. 1992.

12. C. Lee, “Fuzzy logic in control systems: Fuzzy logic controller,” IEEE Trans. Syst., Man. and Cybernetics20, pp. 319–323, Mar. 1990.

13. W. Van Leekwijck and E. Kerre, “Defuzzification: criteria and classification,” Fuzzy Sets and Systems 108,pp. 159–178, Dec. 1999.

14. F. Cocchia, S. Carrato, and G. Ramponi, “Design and real-time implementation of a 3-D rational filter foredge preserving smoothing,” IEEE Transactions on Consumer Electronics 43, pp. 1291–1300, Nov. 1997.

15. A. Pizurica, V. Zlokolica, and W. Philips, “Noise reduction in video sequences using wavelet-domain andtemporal filtering,” in SPIE Conference on Wavelet Applications in Industrial Processing, (Providence,Rhode Island, USA), Oct. 2003.

16. V. Zlokolica, A. Pizurica, and W. Philips, “Video denoising using multiple class averaging with multiresolu-tion,” in International Workshop VLBV03, Lecture Notes in Computer Science of Springer Verlag, (Madrid,Spain), Sept. 2003.

17. W. Selesnick and K. Li, “Video denoising using 2d and 3d dual-tree complex wavelet transforms,” in WaveletApplications in Signal and Image Processing (Proc. SPIE 5207), San Diego, Aug. 2003.

18. J. Bednar and T. L. Wat, “Alpha-trimmed means and their relationship to median filters,” IEEE Transac-tions on acoustics, speech, and signal processing 32, pp. 145–153, Feb. 1984.