Frequency-domain blind deconvolution based on mutual information rate


IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 54, NO. 5, MAY 2006 1771

Frequency-Domain Blind Deconvolution Based on Mutual Information Rate

Anthony Larue, Jérôme I. Mars, and Christian Jutten, Senior Member, IEEE

Abstract—In this paper, a new blind single-input single-output (SISO) deconvolution method based on the minimization of the mutual information rate of the deconvolved output is proposed. The method works in the frequency domain and requires estimation of the signal probability density function. Thus, the algorithm uses higher order statistics (except for a Gaussian source) and allows non-minimum-phase filter estimation. In practice, the criterion contains a regularization term for limiting noise amplification as in Wiener filtering. The score function estimation, which represents a key point of the algorithm, is detailed, and the most robust estimate is selected. Finally, experiments point to the relevance of the proposed algorithm: 1) any filter, minimum phase or not, can be estimated and 2) on actual data (underwater explosions, seismo-volcanic phenomena), this deconvolution algorithm provides good results with a better tradeoff between deconvolution quality and noise amplification than existing methods.

Index Terms—Blind deconvolution, frequency domain, mutual information rate, noise regularization, non-minimum-phase systems, seismic data, statistical independence.

I. INTRODUCTION

Blind deconvolution is an important problem in signal processing and telecommunications, in which both single-input single-output (SISO) and multiple-input multiple-output (MIMO) models are considered. In this paper, we address the problem of SISO blind deconvolution. The observation $x(t)$ is the convolution between the unknown source $s(t)$ and the unknown filter $h(t)$ and is corrupted by additive noise $n(t)$. In the discrete time domain, the observation is

$$x(t) = (h * s)(t) + n(t) \tag{1}$$

where $*$ denotes the convolution product. This model can be applied in many application fields. For example, we can use it in order to estimate the response of a system in control engineering or of a channel in telecommunications. Another application field is reflectivity characterization in seismics or seismology, where a short-duration acoustic wave is transmitted into the ground. The reflected energy resulting from the series impedance in the earth is measured. Therefore, each reflection can be expressed as the convolution of a delta function with the emitted wavelet.

Manuscript received July 23, 2004; revised June 10, 2005. The authors thank the BG group, BP, ChevronTexaco, the Department of Trade and Industry and Shell for financial support. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Jonathan H. Manton.

The authors are with the Laboratoire des Images et des Signaux, 38 402 Saint Martin d’Hères Cedex, BP 46 France (e-mail: [email protected]; [email protected]; [email protected]).

Digital Object Identifier 10.1109/TSP.2006.872545

Then, in the above model (1), the source $s(t)$ is the vertical reflectivity sequence of the earth, while $h(t)$ represents the unknown emitted wavelet and the coupling effect between the source and the earth. The same model is also applied for nondestructive testing and ultrasonic medical imaging. Convolution with a sparse process can also be convenient in shock detection in mechanical systems such as a detachable chair-lift station [1]. According to the application, the aim of the deconvolution problem is to estimate the input process $s(t)$ and/or the direct filter $h(t)$.

In the blind deconvolution context, only $x(t)$ is accessible to the algorithm, whereas $s(t)$, $h(t)$, and $n(t)$ are unknown. Therefore, the problem is ill posed, and extra assumptions about the input process or about the direct filter are necessary to estimate the system. The sparsity of the input process can be a way to solve the deconvolution problem as for underdetermined blind source separation [2] or single-input multiple-output (SIMO) channel identification [3]. For instance, we can model the unknown spike train input process by using a Bernoulli–Gaussian (BG) model, i.e., the reflectivity is assumed to be a sum of delayed Dirac functions with random magnitudes: The Bernoulli part models the localization of the spikes and the Gaussian (sometimes Laplacian) process expresses their random magnitudes. In the literature, a few approaches, based on spike detection/estimation and BG models, have been published: using a maximum-likelihood approach [4] or in a Bayesian framework adding some priors [5], [6]. Other algorithms [7]–[9] are based on a stochastic approach dedicated to incomplete data. The algorithms cited above use a too-restrictive model of the input as spike trains. On real data, the impedance changes do not appear as clearly as predicted by the BG model and the high noise level increases the difficulty of the detection step. Therefore, the sparse nature of the input signal is not a sufficiently robust hypothesis for a good system estimation.

Thus, for seismology applications and some others, it is interesting to develop blind deconvolution algorithms that take into account the additive noise. Indeed, if we estimate an inverse filter $g(t)$ for providing the deconvolution output $y(t) = (g * x)(t)$ (see Fig. 1), the most similar to $s(t)$, the noise has two effects. The first is to imply a bias on the inverse filter estimation: the noise prevents estimating the theoretical inverse filter $h^{-1}(t)$. Moreover, even if the direct filter is perfectly estimated, the noise contribution on the output is equal to $(h^{-1} * n)(t)$, i.e., the noise can be amplified by the inverse filter.
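The observation model (1) and the noise amplification described above can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-tap filter `h`, the 10 dB SNR, the sparsity 1/20, and the use of circular convolution through the FFT.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000

# Bernoulli-Gaussian source: a spike occurs with probability 1/20 (assumed value).
s = (rng.random(n) < 1 / 20) * rng.normal(0.0, 1.0, n)

# Hypothetical two-tap direct filter with a weak response near zero frequency.
h = np.array([1.0, -0.9])
H = np.fft.fft(h, n)

# Noiseless observation by circular convolution, then white Gaussian noise at 10 dB SNR.
clean = np.real(np.fft.ifft(H * np.fft.fft(s)))
noise = rng.normal(0.0, np.sqrt(np.mean(clean**2) / 10.0), n)
x = clean + noise

# Exact inverse filter G = 1/H: the source is recovered, but the noise is filtered too.
G = 1.0 / H
y = np.real(np.fft.ifft(G * np.fft.fft(x)))
noise_out = y - s
gain_db = 10 * np.log10(np.mean(noise_out**2) / np.mean(noise**2))
print(f"noise power gain of the exact inverse filter: {gain_db:.1f} dB")
```

With this filter the exact inverse boosts the low frequencies where the direct filter is weak, so the output noise power exceeds the input noise power even though the filter is known perfectly.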

1053-587X/$20.00 © 2006 IEEE

Another way to estimate the model in (1) is to assume that source samples are independent and identically distributed (i.i.d.) and non-Gaussian. Thus, the solution is unique up to delay and magnitude indeterminacies. The method aims at estimating a deconvolution filter $g(t)$ (Fig. 1), which provides an i.i.d. output signal $y(t) = (g * x)(t)$. In fact, we adjust the inverse filter $g(t)$ for maximizing an i.i.d. criterion of the output $y(t)$.

Fig. 1. Blind deconvolution basis.

In the applications cited above, the i.i.d. assumption is realistic. Especially, in the reflectivity applications, $s(t)$ can be assumed i.i.d. since it usually consists of sparse spike trains with random time locations and magnitudes.

Thus, the general deconvolution method, called blind since we only use the i.i.d. hypothesis on $s(t)$, can be a good alternative for all the applications. The simplest methods (Yule–Walker, Durbin, spectral equalization) are based on second-order statistics only: The temporal independence of the samples is partially measured using the correlation function and the power spectral density. Second-order whitening does not imply time independence of samples; thus, the filter phase is undetermined and must be fixed with prior information. However, in many applications, we cannot assume that the direct filter is minimum, maximum, or zero phase. Thus, we need to use higher (than 2) order statistics (HOS). With parametric approaches using moving average (MA), autoregressive (AR), or ARMA models, some authors [10]–[14] extend Donoho’s result [15] using fourth-order statistics (cumulant and kurtosis) for estimating nonminimum-phase filters. Another approach consists of decomposing the nonminimum-phase filter into a cascade of two filters: a causal filter and an anticausal filter. The method can be applied both for single-channel [16] and multichannel [17] blind deconvolution. However, whatever the model, these approaches may involve linear time-invariant deconvolution filters with a large number of taps, i.e., require the estimation of a large number of parameters. In this paper, we develop a frequency-domain approach.

This paper is organized as follows. In Section II, we present the principles of a frequency-domain blind deconvolution algorithm. The deconvolution filter is estimated by minimizing the mutual information rate (MIR) used as an i.i.d. criterion. Practical issues of the algorithm, especially score function estimation, are detailed in Section III. Finally, Section IV presents experiments with both simulated and actual data, before the conclusion.
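The phase ambiguity of second-order methods mentioned above can be checked numerically. The sketch below uses two hypothetical two-tap filters (not from the paper): one minimum phase and its time-reversed, maximum-phase counterpart. Both have the same autocorrelation and the same magnitude response, so second-order statistics of the filtered white signal cannot distinguish them.

```python
import numpy as np

h_min = np.array([1.0, 0.5])   # zero at z = -0.5, inside the unit circle: minimum phase
h_max = h_min[::-1]            # zero at z = -2, outside the unit circle: maximum phase

def autocorr(h):
    """Deterministic autocorrelation of a real impulse response."""
    return np.correlate(h, h, mode="full")

r_min, r_max = autocorr(h_min), autocorr(h_max)
mag_min = np.abs(np.fft.fft(h_min, 64))
mag_max = np.abs(np.fft.fft(h_max, 64))

print("same autocorrelation:", np.allclose(r_min, r_max))
print("same magnitude response:", np.allclose(mag_min, mag_max))
print("same impulse response:", np.allclose(h_min, h_max))
```

Only the phase differs between the two filters, which is why higher order statistics are needed when no prior on the phase is available.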

II. DECONVOLUTION ALGORITHM

A. Theoretical Assessments

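The mutual information $I(\mathbf{Y})$ of a random vector $\mathbf{Y} = (Y_1, \ldots, Y_N)^T$ of dimension $N$ is defined by

$$I(\mathbf{Y}) = \sum_{i=1}^{N} H(Y_i) - H(\mathbf{Y}) \tag{2}$$

where $H(Y_i)$ denotes the Shannon marginal entropy of $Y_i$, i.e., $H(Y_i) = -E[\log p_{Y_i}(Y_i)]$, and $H(\mathbf{Y})$ the Shannon joint entropy $H(\mathbf{Y}) = -E[\log p_{\mathbf{Y}}(\mathbf{Y})]$. The mutual information $I(\mathbf{Y})$ has the nice property of being nonnegative and vanishes if and only if the components of $\mathbf{Y}$ are mutually independent. It is then a good criterion for measuring independence of random variables. Here, we consider stochastic processes, i.e., infinite sequences, for which (2) is not suitable, contrary to the entropy rate. The entropy rate [18] of the stochastic process $Y = \{y(t)\}$ is defined as

$$H(Y) = \lim_{T \to \infty} \frac{1}{2T+1} H\big(y(-T), \ldots, y(T)\big) \tag{3}$$

According to [18], this limit exists for a stationary process. Then, the MIR of a stationary process $Y$ is defined by

$$I(Y) = \lim_{T \to \infty} \frac{1}{2T+1} \sum_{t=-T}^{T} H(y(t)) - H(Y) \tag{4}$$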

where $H(y(t))$ denotes the marginal entropy of the process $y(t)$ and $H(Y)$ denotes the entropy rate of $Y$ defined by (3). More practically, to clarify the notation, $Y$ denotes the process with all samples taken together, and $y(\tau)$ denotes the process corresponding to one sample $t = \tau$. The MIR is always nonnegative and vanishes if and only if $Y$ is an i.i.d. process [18]; thus, MIR is a good criterion for deconvolution. Under the stationarity assumption, the marginal entropy $H(y(t))$ of the process is not time dependent. Then, all the terms of the right-hand-side sum in (4) are equal, so we have

$$\lim_{T \to \infty} \frac{1}{2T+1} \sum_{t=-T}^{T} H(y(t)) = H(y(\tau)) \tag{5}$$

Hence, adding stationarity assumptions, the MIR (4) is simply given by

$$I(Y) = H(y(\tau)) - H(Y) \tag{6}$$

In (6), $\tau$ is arbitrarily chosen. There is no reason to choose a particular value of $\tau$. Moreover, a convenient estimation of the marginal entropy needs several realizations of the process $y(\tau)$. Then, in practice, to estimate the entropy of the process $y(\tau)$, we use all samples $y(t)$ as realizations of the process $y(\tau)$. Then, the entropy of the process $y(\tau)$ is equal to the entropy of one realization (i.e., random variable) $y(t)$ for any $t$. In the blind deconvolution context, we can obtain a simpler cost function [17], [19], [20], noticing that the entropy rate of the deconvolution output is

(7)

With (7), the MIR of the deconvolution output becomes

(8)

Then, since the entropy rate is independent of the inverse filter , instead of (8), one can consider the simplified criterion

(9)

which, like (8), is minimum when the process is i.i.d. The above criterion uses all higher order statistics by means of the entropy rate. For Wiener systems composed of a cascade of a direct filter and an invertible nonlinear distortion, Taleb et al. [19] estimate the inverse filter in the time domain, by minimizing (9) with respect to the impulse response :

(10)

The minimum search of (9) is made by a gradient technique with respect to the coefficients of the impulse response of the inverse filter $g$. They use a finite number of coefficients for the estimation of $g$, so it is equivalent to choosing an MA model for $g$. Thus, the method is well adapted to the inversion of an AR direct filter $h$. A similar method restricted to linear deconvolution has been proposed by minimizing Rényi’s entropy [21]. One can show that the algorithm is equivalent to a maximum-likelihood (ML) method, replacing the source distribution, supposed known in the ML method, by the distribution of the deconvolution output estimated at each iteration. In this paper, this method is called moving average maximum likelihood (MAML). The application field of this method is limited by the parametric model: for example, in seismology, the direct filter impulse response can be a chirp or a Ricker wavelet, which cannot be modeled by an MA filter. Moreover, the method does not take into account the additive noise $n(t)$ of the model (1). With Gaussian additive noise, one can experimentally show that the method achieves the same performance as the second-order method, i.e., the Yule–Walker algorithm.

In this paper, to overcome these limitations, we propose to use a criterion in the frequency domain that avoids parametric approaches like the MA, AR, or ARMA models, whose parameter number can be very large. Moreover, in the frequency domain, it is easy to add a regularization term for limiting noise amplification as it is usually done in Wiener filtering. In seismic applications, the wavelet is often a bandpass filter, and there is a frequency band with poor information about reflectivity called the “no space.” Under independent Gaussian additive noise assumptions, the no space only contains noise. Thus, to provide a good output signal-to-noise ratio (SNR), the inverse filter must not amplify this frequency band. Therefore, the largest values of the frequency response of the inverse filter are often prohibited, except if they give a significant output independence improvement.

Therefore, in Section II-B, we design a deconvolution criterion using the whiteness criterion (9) based on the MIR of the deconvolution output.

B. Frequency Cost Function

In (9), we can rewrite the right-hand-side term in the integral using the complex gain of the impulse response $g$. Thus, we use its discrete frequency response

(11)

Then, approximating the integral with the rectangle method, the i.i.d. measure (9) can be estimated by

(12)
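A discretized i.i.d. measure of this kind can be sketched numerically. The example below assumes that the criterion takes the form $H(y) - \frac{1}{N}\sum_k \log|G(k)|$, i.e., the marginal entropy of the output minus the rectangle-rule approximation of the log-gain integral; the exact normalization used in (12) is not reproduced here. The histogram entropy estimator, the Laplacian source, the two-tap filter, and all function names are illustrative assumptions, and circular convolution is used for simplicity.

```python
import numpy as np

def entropy_hist(y, bins=200):
    """Histogram (plug-in) estimate of the differential entropy, in nats."""
    counts, edges = np.histogram(y, bins=bins)
    width = edges[1] - edges[0]
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log(p)) + np.log(width)

def iid_criterion(G, X):
    """Assumed discrete criterion: H(y) - (1/N) * sum_k log|G(k)|."""
    y = np.real(np.fft.ifft(G * X))
    return entropy_hist(y) - np.mean(np.log(np.abs(G)))

rng = np.random.default_rng(1)
n = 20000
s = rng.laplace(0.0, 1.0, n)                # i.i.d. non-Gaussian source
H = np.fft.fft(np.array([1.0, 0.5]), n)     # hypothetical direct filter
X = H * np.fft.fft(s)                       # noiseless circular convolution

J_identity = iid_criterion(np.ones(n), X)   # no deconvolution
J_inverse = iid_criterion(1.0 / H, X)       # exact inverse filter
print(J_identity, J_inverse)
```

In this noiseless setting the exact inverse filter restores the i.i.d. source, so the criterion is expected to be lower for `1.0 / H` than for the identity filter.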

If we minimize (12) with respect to the inverse filter frequency response , we have to estimate complex parameters (due to the Hermitian symmetry of the real filter) using only temporal samples of the output signal. Thus, without adding a smoothing constraint on the inverse filter frequency response, a trivial solution is found. For example, with a noiseless observation obtained by filtering a BG signal, the optimal deconvolution, i.e., that which provides the most i.i.d. signal, will give a signal with a unique spike whose temporal location coincides with the largest spike location of the input signal. Then, priors (regularization) are necessary on the inverse filter to avoid trivial solutions. Therefore, we add a smoothing constraint on , i.e., which controls the difference . Moreover, for overcoming noise amplification on the output, we must limit the largest value of by a second regularization term. Based on these remarks, we propose the following frequency-domain blind deconvolution (FBD) criterion

(13)

where and denote two hyperparameters. In the third right-hand-side term, the sum is fully defined by using periodicity of , i.e., . The first regularization term, balanced by , constrains the frequency response of the inverse filter to be smooth, enforcing to be Gaussian, i.e., with a maximum density for . Practically, we notice that this term also improves the stability and the performance of the minimization algorithm because it is a strong prior on the frequency response: the smoothness constraint reduces the number of degrees of freedom. The last term penalizes (with the norm) the largest values of the spectrum of . For instance, with , it would enforce to have a Gaussian distribution. Thus, this term is equivalent to the noise factor usual in Wiener filtering: It allows a tradeoff between the deconvolution quality and the noise amplification. We can interpret this criterion in a maximum a posteriori (MAP) framework: indeed, it is equivalent to taking a Gaussian prior distribution of conditional to and a generalized Gaussian prior distribution (parametrized by ) for the marginal probability density function (pdf) of .
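The two regularization terms described above can be sketched as follows. This is a hedged illustration, not the paper's exact expression: it assumes a squared-difference smoothness penalty between neighboring frequency bins (closed by periodicity) and a $p$-th power penalty on the magnitude. The names `lam`, `mu`, and `p` are hypothetical stand-ins for the hyperparameters, and any normalization constants of (13) are omitted.

```python
import numpy as np

def smoothness_penalty(G):
    """Sum over k of |G(k) - G(k-1)|^2, with G(-1) = G(N-1) by periodicity."""
    return np.sum(np.abs(G - np.roll(G, 1)) ** 2)

def amplitude_penalty(G, p=2.0):
    """Sum over k of |G(k)|^p, which penalizes large gains (noise amplification)."""
    return np.sum(np.abs(G) ** p)

def regularization(G, lam, mu, p=2.0):
    """Weighted sum of the two assumed priors on the inverse filter response."""
    return lam * smoothness_penalty(G) + mu * amplitude_penalty(G, p)

G_flat = np.ones(8, dtype=complex)              # perfectly smooth response
G_rough = np.array([1, 1j, -1, -1j])            # strongly varying response
print(smoothness_penalty(G_flat), regularization(G_rough, lam=0.5, mu=0.25))
```

`np.roll(G, 1)` shifts the response by one bin and wraps the last bin to the front, which implements the periodic closure of the sum.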

In the following, we minimize the criterion (13) with respect to the complex-valued vector according to a gradient iterative procedure.

C. Inverse Filter Estimation

Let us compute the gradient of the criterion (13) with respect to the complex vector . For a real function of a complex-valued vector , the gradient is defined [22], [23] as , where denotes the derivative of with respect to . Moreover, and its conjugate are considered as linearly independent variables. Consequently, we compute the derivative of each term of the criterion (13) with respect to , . Since , the second term of (13) can be written with the variables and , as follows:

(14)

Using the independence assumption of and , we have

(15)

Similarly, using periodicity properties and , the derivatives of the regularization terms are

(16)

(17)

To obtain the derivative of the marginal entropy, since , one can write

(18)

After permuting the expectation and the derivative in (18), we rewrite with the chain rule expression

(19)

As is real valued, we have

(20)

(21)

Moreover, since can be expanded as

(22)

(23)

where is the spectrum of the observation . Defining the score function of the process by

(24)

and using (21) and (23), (19) becomes

(25)

Finally, under ergodicity conditions, the expectation operator is estimated by a temporal average, providing the estimate of (18)

(26)

Thus, we must estimate the score function on the sample values . Denoting by the Fourier transform of the score function , (26) becomes

(27)

Thus, with (15), (16), (17), and (27), the gradient of the criterion (13) with respect to ,¹ is estimated by

(28)

After computing the gradient, we can minimize the criterion(13) with a gradient descent procedure. Therefore, the FBD al-gorithm can be summarized as follows:

1) initialization of the inverse filter and of the deconvolution output ;
2) estimation of the score function ;
3) computation of the gradient estimate (28);
4) updating of ;
5) computation of the deconvolution output ;
6) normalization step.

¹We name again the frequency variable by � , although all the previous derivatives are calculated with respect to G(�) to avoid confusion between the variable � used in criterion definition. However, � is the most convenient variable name to represent the frequency.

We iterate the main loop (Steps 2) to 6)) until convergence. The normalization Step 6) is required for taking into account the scale indeterminacy in . Here, we just normalize the inverse filter to obtain a unit-power deconvolution output, but other normalizations can be used. denotes the gradient step size (a real positive constant).

As it appears in the first right-hand-side term of (28), the estimation of the score function is a key point of the algorithm: It contains all the statistical knowledge of the process. If the pdf (or the score function) were known, or if it were perfectly estimated, one could achieve a perfect estimation of the criterion gradient, and hence, derive an optimal algorithm (with respect to the mutual information criterion). Unfortunately, the pdf (and, hence, the score function) being unknown, instead of choosing a prior (i.e., wrong) pdf, the algorithm estimates the pdf of the output process, and is then quasi-optimal since the pdf tends toward the true one when converging. Thus, its performance depends less on the source distribution than methods based on selected higher order statistics. In Section III, we detail two score function estimates, and we study them to select the most robust for the experiments of Section IV.
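Steps 5) and 6) of the loop can be sketched in a few lines. This is a minimal illustration under stated assumptions: the output is computed by multiplying spectra (circular convolution), the normalization rescales the filter so that the output has unit power, and the function name is hypothetical. The gradient update of Step 4) is not shown because it depends on the exact form of (28).

```python
import numpy as np

def deconvolve_and_normalize(G, x):
    """Steps 5) and 6): y = IFFT(G * FFT(x)), then rescale to unit output power."""
    y = np.real(np.fft.ifft(G * np.fft.fft(x)))
    scale = np.sqrt(np.mean(y**2))
    return G / scale, y / scale

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, 1024)            # stand-in observation
G0 = 3.0 * np.ones(1024, dtype=complex)   # arbitrary initial inverse filter response
G1, y1 = deconvolve_and_normalize(G0, x)
print(np.mean(y1**2))
```

Dividing both the filter and the output by the same factor keeps them consistent, so the rescaled output is still the response of the rescaled filter to the observation. For a real output, `G` is expected to have Hermitian symmetry.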

III. PROBABILITY DENSITY AND SCORE FUNCTIONS

The score function is a key point that appears in estimating equations of blind source separation [24], [25] and blind deconvolution [19]. Usually, authors use rough approximations with a nonlinear model such as a hyperbolic tangent, cubic [26], or polynomial model [17], [24]. These approximations are related to a prior on the input pdf. Indeed, given a nonlinear score function model, one can compute the input pdf by simple integration. For example, a third-order polynomial approximation of the score function is equivalent to assuming that the input pdf follows a generalized Gaussian distribution with , i.e., is a sub-Gaussian process. Then, if we use this third-order polynomial approximation with a super-Gaussian input process, we can achieve poor performance. We can also use a parametric distribution model of the process such as the extended generalized lambda or beta distribution [27], or the generalized Gaussian distribution [28], [29]. Finally, a more advanced parametric model can be considered, in the form of an AR spectrum density [30]. In this paper, focusing on blind deconvolution in a general framework, we need a score function estimate suited for any input distribution and, if possible, with a low computational cost. Then, our algorithm can be applied efficiently for a wide range of signals.

Classically, we can estimate the score function by , where is a pdf estimate. Thus, approximating the pdf with an Edgeworth expansion [31], or a Gram–Charlier expansion [32], leads to a polynomial approximation of the score function which is similar to [10]–[13]. Nevertheless, this expansion does not provide the best polynomial function² in the least-mean-square sense [33], [34]. In Section III-A, we detail a method based on a kernel estimate of and a direct³ estimator of the score function based on a polynomial approximation.

²Generally, when stopping the expansion at a given order, the pdf positivity property is not guaranteed.

³That is, without using $\hat{\psi} = -(\hat{p}'/\hat{p})$.

A. Kernel Estimation

With the definition (24), the score function of the random variable can be estimated by differentiating the pdf estimate. Basically, the kernel estimate of , based on the samples , is given by [35]

(29)

where is the kernel and is a bandwidth parameter adjusted according to the sample size and the data. However, this estimate requires a high computation cost: for each value , kernel estimations are necessary. Thus, the complete computation (28) requires kernel estimations. Therefore, Pham [36], [37] designed a fast algorithm to estimate the pdf, score function, mutual information, or entropy for blind source separation applications. Pham’s estimate models the kernel $K$ by a third-order cardinal spline (defined as the threefold convolution of the indicator function), which is a piecewise second-order polynomial function

$$K(u) = \begin{cases} \frac{3}{4} - u^2, & |u| \le \frac{1}{2} \\ \frac{1}{2}\left(\frac{3}{2} - |u|\right)^2, & \frac{1}{2} \le |u| \le \frac{3}{2} \\ 0, & \text{otherwise} \end{cases} \tag{30}$$

Since the kernel width is finite, we compute the pdf on a regular grid of points, spaced by . Practically, the parameter in the model (30) will be selected following [37]. In other words, assuming is a zero-mean process, with the variable change , (29) becomes

(31)

in which the term is nonzero only if or , where denotes the signed integer closest to . Therefore, this method estimates the pdf with samples on a grid with only kernel estimates, instead of for an infinite-width kernel. The score function is then estimated on a regular grid by first-order differentiation of . Finally, we linearly interpolate the score function to get the estimation necessary to (28); this step only adds a low computational cost.
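The grid-based procedure described above can be sketched as follows. This is a simplified illustration in the spirit of the method, not Pham's exact algorithm: the kernel is the quadratic spline obtained by convolving a unit box three times, the grid step `h` is chosen by hand rather than following [37], and the function names are hypothetical. Each sample touches only its nearest grid point and the two neighbors, the score is obtained as $-p'/p$ by finite differences, and values off the grid come from linear interpolation.

```python
import numpy as np

def quad_spline(v):
    """Threefold convolution of the unit box: piecewise quadratic, zero for |v| >= 1.5."""
    a = np.abs(v)
    return np.where(a <= 0.5, 0.75 - a**2, np.where(a <= 1.5, 0.5 * (1.5 - a) ** 2, 0.0))

def grid_pdf(y, h):
    """Spline-kernel pdf estimate on a regular grid of step h (3 kernel values per sample)."""
    v = np.asarray(y, dtype=float) / h
    r = np.rint(v).astype(int)                 # nearest grid index of each sample
    lo = r.min() - 2
    pdf = np.zeros(r.max() + 2 - lo + 1)
    for d in (-1, 0, 1):                       # only three grid points see each sample
        np.add.at(pdf, r + d - lo, quad_spline(r + d - v))
    grid = (lo + np.arange(pdf.size)) * h
    return grid, pdf / (v.size * h)

def score_from_grid(u, grid, pdf):
    """Score -p'(u)/p(u): finite differences on the grid, then linear interpolation."""
    dpdf = np.gradient(pdf, grid[1] - grid[0])
    score = np.zeros_like(pdf)
    np.divide(-dpdf, pdf, out=score, where=pdf > 0)
    return np.interp(u, grid, score)

rng = np.random.default_rng(3)
y = rng.normal(0.0, 1.0, 50000)                # Gaussian test signal: true score is u
grid, pdf = grid_pdf(y, h=0.25)
u = np.linspace(-1.0, 1.0, 9)
print(np.round(score_from_grid(u, grid, pdf), 2))
```

For a unit-variance Gaussian signal the estimated score should stay close to the straight line through the origin with unit slope, up to the smoothing bias of the kernel and sampling noise.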

B. Direct Parametric Estimation

In this section, following [24] and [34], we explain the general issue of direct estimation of according to a parametric model , where denotes the parameter vector. This model is estimated by minimizing a least-mean-square-error criterion

(32)

Using [34, Lemma 3], the least-mean-square estimate of the parameters (i.e., solutions of ) satisfies

(33)

which does not depend on the score function . In fact, the score function disappears in the expectation operator by way of the pdf of . If the model is linear with respect to , one can find an explicit (and optimal in the least-mean-square sense) solution of the parameter vector . In fact, (33) leads to a linear system as

(34)

In Section III-C, we will use this method with an -degree polynomial model , where denotes the parameter vector. In this case, (respectively, ) is a matrix (respectively, vector) defined with the moments of up to order (respectively, ). We can note that the best (in the least-mean-square sense) -degree polynomial approximation of the score function requires moments up to order .
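The direct polynomial estimate can be sketched with sample moments. The sketch assumes the score is defined as $-p'/p$, so that integration by parts gives $E[\psi(y)\,\varphi(y)] = E[\varphi'(y)]$ for a smooth basis $\varphi$; with the monomial basis $\varphi(u) = (1, u, \ldots, u^n)^T$ the normal equations become a linear system built from moments up to order $2n$. The function name and the test signal are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def poly_score_coeffs(y, degree):
    """Least-squares polynomial score: solve E[phi phi^T] w = E[phi'], phi = (1, u, ..., u^n)."""
    y = np.asarray(y, dtype=float)
    m = np.array([np.mean(y**k) for k in range(2 * degree + 1)])  # sample moments E[y^k]
    i = np.arange(degree + 1)
    M = m[i[:, None] + i[None, :]]                      # M[i, j] = E[y^(i+j)]
    b = np.concatenate(([0.0], i[1:] * m[:degree]))     # b[i] = i * E[y^(i-1)]
    return np.linalg.solve(M, b)

rng = np.random.default_rng(4)
y = rng.normal(0.0, 2.0, 200000)      # Gaussian test signal: true score is u / 4
w = poly_score_coeffs(y, degree=3)    # coefficients in increasing order of power
u = np.linspace(-2.0, 2.0, 5)
print(np.round(P.polyval(u, w), 3))
```

No pdf estimate is needed: only sample moments enter the system, which is why the score function itself does not appear in the solution.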

C. Comparison of Estimated Score Functions

Fig. 2 compares 1) Pham’s kernel estimator presented in Section III-A and 2) direct estimation using a seventh-degree polynomial. In the simulation, we choose a Laplacian signal with additive Gaussian noise, with an SNR equal to 3 dB.

We note in Fig. 2 that for the largest sample values, which represent the Laplacian process, the theoretical score function (dashed line) is constant. Pham’s kernel estimate [Fig. 2(a)] and the polynomial estimate [Fig. 2(b)] have similar performance, with a dispersion around the theoretical score function. In order to choose the best estimator in a noisy case, we propose to compute the estimate performance versus the SNR for various distributions. We consider four 4000-sample signals with different i.i.d. distributions: (a) Bernoulli–Laplacian (BL), (b) Bernoulli–Gaussian (BG), (c) Laplacian, or (d) uniform, with additive Gaussian noise. For the BG and BL processes, the sparsity factor⁴ is equal to 1/20. For these processes, the SNR is defined with respect to the standard deviation of the Gaussian or Laplacian process which represents the peak amplitude distribution and not with respect to the standard deviation of the final process. Denoting the useful process and the additive noise, the pdf of is equal to , and the theoretical score function of is derived from the definition (24). Fig. 3 represents, for the four above processes, the theoretical score functions for three SNR values (0, 3, and 10 dB).

The performance is evaluated with the normalized mean-square error (MSE) of the score function estimate defined by

MSE (35)

For each SNR, we compute the MSE over 1000 random configurations. We compare three estimates (Fig. 4): 1) Pham’s kernel estimate and 2), 3) the direct estimation with polynomial functions (see Section III-B) of degrees 7 and 11.

For the four different input signals, the kernel estimate and the seventh-degree polynomial estimate give comparable performance. For the smallest SNR, the score function is close to the Gaussian score function (since the Gaussian noise dominates), i.e., it is linear in its argument with a slope of magnitude $1/\sigma^2$, where $\sigma$ denotes the standard deviation of the Gaussian noise. Consequently, a simple polynomial model (even of first degree) achieves a perfect estimation, and in our experiments, the seventh-degree polynomial, which has fewer parameters than the eleventh-degree polynomial, provides the best estimates.

Footnote 4: The sparsity coefficient is the probability of the Bernoulli process.

Fig. 2. Comparison of theoretical (dashed) and estimated (solid) score function for a Laplacian signal with additive Gaussian noise. (a) Pham’s kernel estimate; (b) direct estimation with a seventh-order polynomial function.

Fig. 3. Theoretical score functions for (a) Bernoulli–Laplacian, (b) Bernoulli–Gaussian, (c) Laplacian, and (d) uniform in $[-\sqrt{3}, \sqrt{3}]$ distributions with additive Gaussian noise. Dotted: SNR = 0 dB; dashed: SNR = 3 dB; solid: SNR = 10 dB.

For the largest SNR, the MSE increases for all distributions and for all the estimates. In fact, with a large SNR, the score functions are close to the theoretical score functions of a noiseless process, and as shown in Fig. 3(a), (b), and (d), for the BL, BG, or uniform distribution, the function has a complex shape. Moreover, due to the sparsity, there are only a few samples for estimating the complex shape, since most of the samples are located near zero: they correspond to the simplest shape (a linear function) related to the Gaussian score function. Consequently, the kernel as well as the polynomial estimators provide poor estimates.

Fig. 4. Mean-square error of score function estimates. Kernel estimator (solid line), direct estimation with seventh-degree polynomial (dashed line) and with eleventh-degree polynomial (dashed–dotted line) versus SNR for (a) Bernoulli–Laplacian, (b) Bernoulli–Gaussian, (c) Laplacian, and (d) uniform in $[-\sqrt{3}, \sqrt{3}]$ distribution with additive Gaussian noise.

For the Laplacian process, the singularity near zero is a Heaviside function [Fig. 3(c)], which is less difficult to estimate.

Finally, one might expect that increasing the polynomial degree improves the accuracy of the score function estimation. In fact, in Fig. 4(b) and (d), for the BG signal and the uniformly distributed signal, the eleventh-degree approximation gives worse results than the seventh-degree estimate, since a polynomial of too high an order leads to an ill-conditioned system (34).
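The loss of conditioning with increasing degree can be illustrated numerically. The sketch below assumes that the linear system involves the matrix of empirical moments of the monomials 1, y, ..., y^d, as in a least-squares polynomial fit; the exact form of system (34) is not reproduced here, and `moment_matrix_condition` is a hypothetical helper.

```python
import numpy as np

def moment_matrix_condition(y, degree):
    """Condition number of the empirical moment matrix E[k(y) k(y)^T],
    with k(y) = [1, y, ..., y^degree] (assumed structure of the system)."""
    K = np.vander(y, degree + 1, increasing=True)   # columns 1, y, ..., y^degree
    G = K.T @ K / len(y)
    return np.linalg.cond(G)

rng = np.random.default_rng(0)
# Uniform signal on [-sqrt(3), sqrt(3)] with additive Gaussian noise
y = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), 4000) + 0.3 * rng.standard_normal(4000)

c7 = moment_matrix_condition(y, 7)
c11 = moment_matrix_condition(y, 11)
print(f"degree 7:  cond = {c7:.3e}")
print(f"degree 11: cond = {c11:.3e}")
```

Because the degree-7 moment matrix is a leading principal submatrix of the degree-11 one, its condition number cannot be larger, so adding monomials can only degrade the conditioning.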

For the BL and Laplacian distributions, we observe a similar behavior, and a poor estimation of the score function can cause instabilities in the gradient descent algorithm. For this reason, the eleventh-degree polynomial estimates have not been drawn in Fig. 4(a) and (c).

In conclusion, we use the kernel estimate in all experiments. This estimate requires the adjustment of only one parameter, the bandwidth. Adjusting it as suggested by [37] provides accurate results for all the distributions studied in this paper.
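To make the role of the bandwidth concrete, here is a minimal sketch of a plain Gaussian-kernel score estimate. It is not Pham's fast algorithm: it is the direct O(N·M) kernel estimator, with Silverman's rule of thumb swapped in as the default bandwidth and the sign convention ψ = −p′/p assumed. The name `kernel_score` is hypothetical.

```python
import numpy as np

def kernel_score(u, y, h=None):
    """Estimate the score -p'(u)/p(u) (assumed sign convention) from samples y
    with a Gaussian kernel density estimate of bandwidth h."""
    u = np.atleast_1d(np.asarray(u, dtype=float))
    y = np.asarray(y, dtype=float)
    if h is None:
        # Silverman's rule of thumb (assumption; the paper follows [37] instead)
        h = 1.06 * np.std(y) * len(y) ** (-0.2)
    z = (u[:, None] - y[None, :]) / h     # shape (len(u), len(y))
    w = np.exp(-0.5 * z ** 2)             # unnormalized Gaussian kernel
    p = w.sum(axis=1)                     # proportional to the density estimate
    dp = (-z * w).sum(axis=1) / h         # proportional to its derivative
    return -dp / p                        # common normalization cancels

rng = np.random.default_rng(0)
y = rng.standard_normal(4000)
psi = kernel_score([-1.0, 0.0, 1.0], y, h=0.5)
print(psi)
```

For unit-variance Gaussian data the kernel smoothing inflates the variance to 1 + h², so the estimate is close to u/(1 + h²). The bandwidth therefore trades bias (large h) against variance (small h).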

IV. EXAMPLES

In this section, we first address the practical issue of the selection of the algorithm parameters. Then, two sets of experiments including 1) simulated signals and 2) processing of actual recorded signals are described.

A. Parameter Selection

Fig. 5. (a) Reflectivity simulated with a noiseless Bernoulli–Gaussian process with 50 reflectors. (b) Observation filtered by a minimum-phase ARMA(10,10) filter. (c) Deconvolved output signal with parameter values 0.05, 5, and 10 (value given by the Appendix) and p = 3. (d) Modulus and (e) phase of the estimated inverse filter (solid line) compared with the theoretical inverse filter (dashed line).

First of all, we come back to the choice of the hyperparameter. In the Appendix, we propose a method for estimating it. The relation (36) is defined only if the spectral density of the observation is nonzero at every frequency. For instance, the estimation can be restricted to the frequency bins where this spectral density does not vanish. Practically, we begin with an initial parameter setting, and the hyperparameter is estimated as in the Appendix. Then, we adjust the two regularization parameters separately, with a few successive tests, by observing the deconvolution output and the inverse filter frequency response:

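• if the inverse filter frequency response is not smooth enough, the smoothness regularization parameter is increased;
• if the output is too noisy, the noise regularization parameter is increased.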

In fact, this strategy is efficient and provides a simple estimation of either parameter. The estimation is robust in the sense that large parameter variations around the empirical optimum do not strongly influence the deconvolution performance.

B. Non-Minimum-Phase Filter Estimate

The goal of this subsection is to show that the FBD algorithm allows the estimation of a non-minimum-phase filter. This property is due to higher (than second) order statistics, which are implicitly introduced in the deconvolution criterion (13) by the MIR (6). The direct filter is a (10,10) ARMA filter. The poles and zeros of the filter are randomly chosen inside the unit circle. Various input signals (BG, BL, Laplacian, and uniform distributions) can be used. In every case, with the above parameter selection, the filter is accurately estimated.
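A simulation setup of this kind can be sketched as follows, assuming that the roots are drawn as complex-conjugate pairs (so that the coefficients are real) with moduli below a bound `r_max`. The bound, the drawing law, and the helper names `random_arma` and `arma_filter` are all assumptions made for illustration.

```python
import numpy as np

def random_roots_inside_unit_circle(n_pairs, rng, r_max=0.9):
    """Draw n_pairs complex-conjugate root pairs with modulus below r_max < 1."""
    r = rng.uniform(0.0, r_max, n_pairs)
    theta = rng.uniform(0.0, np.pi, n_pairs)
    half = r * np.exp(1j * theta)
    return np.concatenate([half, half.conj()])

def random_arma(order, rng):
    """Return (b, a) of an ARMA(order, order) filter with zeros and poles inside the unit circle."""
    b = np.real(np.poly(random_roots_inside_unit_circle(order // 2, rng)))
    a = np.real(np.poly(random_roots_inside_unit_circle(order // 2, rng)))
    return b, a

def arma_filter(b, a, x):
    """Difference equation a[0] y[t] = sum_k b[k] x[t-k] - sum_{k>=1} a[k] y[t-k]."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        acc = sum(b[k] * x[t - k] for k in range(len(b)) if t - k >= 0)
        acc -= sum(a[k] * y[t - k] for k in range(1, len(a)) if t - k >= 0)
        y[t] = acc / a[0]
    return y

rng = np.random.default_rng(0)
b, a = random_arma(10, rng)

# Bernoulli-Gaussian reflectivity: 400 samples with exactly 50 reflectors
x = np.zeros(400)
x[rng.choice(400, size=50, replace=False)] = rng.standard_normal(50)
observation = arma_filter(b, a, x)
```

Since both the zeros and the poles lie inside the unit circle, this filter and its inverse are causal and stable, which corresponds to the minimum-phase case.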

In Fig. 5, as a first example, we show the result obtained for a BG input signal (400 samples with 50 reflectors). The performance appears both in the filter [Fig. 5(d) and (e)] and signal [Fig. 5(c)] estimates. The comparison of the input [Fig. 5(a)] and the estimated output [Fig. 5(c)] is easy, due to the simple BG shape, and points out the deconvolution quality. In (d) [respectively, (e)], the modulus (respectively, the phase) of the estimated inverse filter (solid line) is compared with the theoretical inverse filter (dashed line).

In Fig. 6, as a second example, we use the same BG signal and a (10,10) ARMA filter with the same spectrum as in Fig. 5 but with a zero phase.
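One way to obtain a filter with the same modulus but zero phase is to work directly on the sampled frequency response, as in the sketch below. It assumes FFT-based (circular) filtering, which is only an approximation of linear filtering near the signal edges. The helper names are hypothetical.

```python
import numpy as np

def zero_phase_version(h, n_fft):
    """Frequency response with the same modulus as h and a phase set to zero."""
    return np.abs(np.fft.fft(h, n_fft))

def apply_frequency_response(x, H):
    """Filter x by multiplication in the frequency domain (circular convolution)."""
    X = np.fft.fft(x, len(H))
    return np.real(np.fft.ifft(X * H))

h = np.array([1.0, -0.5])                 # toy impulse response
H0 = zero_phase_version(h, 64)
h0 = np.real(np.fft.ifft(H0))             # zero-phase impulse response (symmetric)

x = np.zeros(64)
x[10] = 1.0
y0 = apply_frequency_response(x, H0)      # symmetric wavelet centered on sample 10
```

A zero-phase response has a symmetric, hence noncausal, impulse response, so the corresponding filter is not minimum phase even though its spectrum is unchanged.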


Fig. 6. (a) Reflectivity simulated with a noiseless Bernoulli–Gaussian process with 50 reflectors. (b) Observation filtered by a zero-phase ARMA(10,10) filter. (c) Deconvolved output signal with parameter values 0.05, 5, and 10 (value given by the Appendix) and p = 3. (d) Modulus and (e) phase of the estimated inverse filter (solid line) compared with the theoretical inverse filter (dashed line).

Fig. 7. Underwater experiment: (S) explosive source and (P) pressure sensor.

We observe that for the two experiments the quality of the restored signal and of the estimated filter is similar. Consequently, the phase of the direct filter seems to have no effect on the algorithm’s performance. We note that the phase estimation is worse at the higher frequencies (0.4 to 0.5) due to lower power in this frequency range [see Fig. 5(d) or Fig. 6(d)].

This experiment points out the ability to estimate any filter, whatever its phase, using higher order statistics. The previous examples, using simulated noiseless signals, do not demonstrate the relevance of the noise regularization in our FBD algorithm. As blind deconvolution methods are very sensitive to additive noise, it is important to test the FBD algorithm on noisy signals. For this purpose, we choose more realistic contexts, with actual signal recordings.

C. Underwater Explosions Data

Fig. 8(a) shows the real data of an underwater explosion recorded in a pool. The source is an explosive source, and the sensor records the pressure. The signal is composed of the direct wave (numbered 1), a reflection from the bottom of the pool (numbered 2), and a reflection from the surface (numbered 3), as depicted in Fig. 7.

Therefore, in this application, the source signature contains information on both the direct wave and the reflections. The direct filter represents the wave generated by the explosion, characterized by the “bubble effect” with an exponential increase in pressure followed by an exponential decrease. We can model this wave as a noncausal MA filter [12]. In Fig. 8, we compare the deconvolution results obtained from (a) the observations according to three methods: (b) spectral equalization, (c) Durbin’s method, and (d) the FBD algorithm.

Fig. 8. (a) Observation of underwater explosion, (b) spectral equalization, (c) Durbin’s method, and (d) FBD algorithm with parameter values 0.01, 0.1, and 0.1 (value given by the Appendix) and p = 3.

TABLE I
COMPARISON OF UNDERWATER EXPLOSION DECONVOLUTIONS

We note that the three methods provide a good estimation of the time location of the three events. Table I measures the performance of the deconvolution output obtained for the different algorithms of Fig. 8 according to three criteria. Since the algorithm is driven by an i.i.d. criterion, we first compute the i.i.d. nature of the estimated output using the i.i.d. criterion (9). Then, we measure the width, i.e., the number of samples between the two samples whose value equals 10% of the peak amplitude. Finally, the last criterion is the noise standard deviation, computed over the first 100 samples. Note that the three deconvolution outputs are normalized to have a first peak amplitude equal to 10.
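The width and noise criteria can be computed along the following lines. This sketch relies on one interpretation of the width (the number of contiguous samples around the peak whose magnitude stays at or above 10% of the peak amplitude) and assumes the index of the first peak is known. The helper names are hypothetical.

```python
import numpy as np

def normalize_first_peak(y, peak_index, target=10.0):
    """Scale the output so that the first peak has amplitude `target`."""
    return y * (target / y[peak_index])

def spike_width(y, peak_index, level=0.1):
    """Count contiguous samples around the peak with |y| >= level * |peak| (assumed definition)."""
    threshold = level * abs(y[peak_index])
    left = peak_index
    while left > 0 and abs(y[left - 1]) >= threshold:
        left -= 1
    right = peak_index
    while right < len(y) - 1 and abs(y[right + 1]) >= threshold:
        right += 1
    return right - left + 1

def noise_std(y, n_first=100):
    """Standard deviation over the first n_first samples, assumed to contain only noise."""
    return float(np.std(y[:n_first]))

# Toy deconvolution output: one spike of 3 significant samples at index 150
y = np.zeros(300)
y[148:153] = [0.2, 2.0, 5.0, 2.0, 0.2]
y = normalize_first_peak(y, peak_index=150)
width = spike_width(y, peak_index=150)
sigma_noise = noise_std(y)
```

Normalizing the first peak before measuring makes the noise standard deviations of different methods directly comparable, since each output is otherwise defined only up to a scale factor.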

Clearly, the FBD algorithm provides estimated sources with the narrowest spikes and the smallest noise standard deviation. Concerning the i.i.d. criterion (9), spectral equalization achieves better performance than FBD. In fact, this criterion measures the i.i.d. nature of the estimated output, not its closeness to the desired input signal. If the input is not perfectly i.i.d. and/or the observation is noisy, the best estimated output (in the i.i.d. sense) is the one that is as i.i.d. as possible. Even with a perfect estimation of the inverse filter, one would obtain the source plus the filtered noise, i.e., a noisy version of the source. The noise regularization done by the FBD algorithm limits the filtered-noise term and provides a better estimation of the source but, at the same time, it increases the temporal dependence of the estimated output. In fact, the criterion used in FBD is not the i.i.d. criterion (9) but the regularized criterion (13), and the minima of the two criteria are not equal. This shows that the i.i.d. criterion taken without regularization leads to good restoration only for noiseless observations. In other words, to lead to an efficient deconvolution algorithm, the criterion must take into account the additive noise. Moreover, in Fig. 8(b) and (c), one observes secondary spurious spikes just after the reflector, which do not appear in the FBD result.

Fig. 9. (a) Seismovolcanic observation from the Purace volcano. Estimation with (b) spectral equalization, (c) Yule–Walker method with twentieth-order filter, (d) MAML algorithm with an MA filter of order 40, (e) FBD algorithm with parameter values 0.01, 10, and 5, and p = 1.5, and (f) frequency modulus of the direct filter estimated by the FBD algorithm.

In conclusion, the FBD algorithm achieves a better tradeoff between deconvolution quality and noise amplification.

D. Seismovolcanic Recording

We now consider a seismogram recorded on the Colombian volcano Purace. The aim is to separate the resonance effects from the less energetic effects such as excitation and propagation. The recording [Fig. 9(a)] is characterized by long-period events [38], with a strong resonance. It was recorded at a station located 1 km from the crater of the Purace volcano on September 15, 1994, with a sampling frequency of 100 Hz. Using the model of Fig. 1, the residual signal obtained by deconvolution contains information about the excitation of the volcano, and the resonant filter gives information about the geometry of the volcano. The excitation phenomenon gives an i.i.d. process because it is due to an explosion associated with a white (wideband) process. Therefore, we can apply the deconvolution algorithm to these data. Moreover, in the time domain, the resonant filter can be modeled by an AR filter. In Fig. 9, we compare spectral equalization [Fig. 9(b)] with two methods dedicated to autoregressive filter inversion: the Yule–Walker method [Fig. 9(c)], which uses second-order statistics, and the temporal method (MAML) [Fig. 9(d)] detailed in [19] (see Section II). Finally, we process the data with our FBD algorithm [Fig. 9(e)]. In Fig. 9(f), we represent the direct filter estimated by our algorithm.
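For reference, the second-order baseline can be sketched with the textbook Yule–Walker equations. This is a generic formulation using the biased autocorrelation estimate, not the authors' implementation, and the helper names `yule_walker` and `prediction_residual` are hypothetical.

```python
import numpy as np

def yule_walker(x, order):
    """Estimate AR coefficients a such that x[t] ~ sum_k a[k-1] * x[t-k], k = 1..order."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    # Biased autocorrelation estimates r[0], ..., r[order]
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def prediction_residual(x, a):
    """Inverse (whitening) filtering: e[t] = x[t] - sum_k a[k-1] * x[t-k]."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    e = x.copy()
    for k, ak in enumerate(a, start=1):
        e[k:] -= ak * x[:-k]
    return e

# Toy resonant AR(2) process driven by white noise
rng = np.random.default_rng(1)
excitation = rng.standard_normal(6000)
x = np.zeros(6000)
for t in range(2, 6000):
    x[t] = 1.2 * x[t - 1] - 0.5 * x[t - 2] + excitation[t]

a_hat = yule_walker(x, 2)
residual = prediction_residual(x, a_hat)
```

Because it relies only on the autocorrelation, this estimate is blind to the phase of the filter and includes no term limiting the amplification of additive noise, which is the point of comparison with the regularized criterion.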

The excitation phenomenon is located between samples 500 and 1000. Looking at the temporal shape, we can argue that the FBD algorithm gives the shortest excitation and the smallest noise level. Table II measures the performance of the deconvolution output obtained for the different algorithms in Fig. 9. The noise standard deviation is evaluated with the last 3500 samples of the deconvolution output.

According to Table II, the FBD algorithm achieves the best tradeoff between the noise amplification and the deconvolution output whiteness. Comparing with the MAML algorithm, which uses the same whiteness measure, we note the relevance of the noise regularization in the frequency domain, which ensures a better signal restoration. In fact, the MAML algorithm yields results similar to those of the Yule–Walker algorithm, due to a large Gaussian additive noise.

TABLE II
COMPARISON OF PURACE VOLCANO RECORDINGS DECONVOLUTIONS

Fig. 10. Resonant direct filter spectrum of the Purace volcano (zoom on reduced frequencies [0, 0.15]) estimated by (a) FBD algorithm, (b) spectral equalization, (c) Yule–Walker method, and (d) MAML algorithm.

Finally, we show in Fig. 9(f) the direct filter estimated by the FBD algorithm, which consists of several resonance frequencies that characterize the volcano geometry. This direct filter is of great interest to seismologists for describing the volcano geometry by means of the resonance frequencies. To see the relevance of the spectrum smoothness regularization, in Fig. 10, we compare the direct filter frequency modulus estimated by the four algorithms of Fig. 9 with a zoom on the reduced frequencies [0, 0.15].

First, we note that the four algorithms give the same preponderant resonances. Although it uses 20 parameters, the Yule–Walker algorithm cannot estimate more than three resonant frequencies. For the MAML algorithm, we only detect four frequencies with 40 parameters. With spectral equalization and the FBD algorithm, we detect at least six resonances. At first glance, we observe resonances near 0.05 and 0.1, so we can argue that these are the fundamental and the first modes. With an in-depth examination, we detect a small resonance at 0.0415 and two clear resonances at 0.082 and 0.126, which we believe correspond to the fundamental and the first two modes. Finally, the FBD algorithm provides a smoother frequency response than the spectral equalization, due to the first regularization term. Thus, we keep a good frequency resolution, and we have fewer variations in the frequency response.
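As a quick arithmetic check of the modal interpretation, the reduced frequencies quoted above can be converted to hertz with the 100-Hz sampling frequency and compared with integer multiples of the lowest one. The variable names are illustrative only.

```python
fs = 100.0                              # sampling frequency in Hz (from the recording)
reduced = [0.0415, 0.082, 0.126]        # reduced frequencies read from the spectrum

hertz = [f * fs for f in reduced]               # about 4.15, 8.2, and 12.6 Hz
ratios = [f / reduced[0] for f in reduced]      # ratio to the lowest resonance

for f_hz, ratio in zip(hertz, ratios):
    print(f"{f_hz:6.2f} Hz  ratio to fundamental = {ratio:.2f}")
```

The ratios are about 1.98 and 3.04, i.e., close to 2 and 3, which is consistent with reading the three resonances as a fundamental and its first two modes.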

V. CONCLUSION

In this paper, we have presented a new blind deconvolution method based on the mutual information rate (MIR) in the frequency domain, called the frequency-domain blind deconvolution (FBD) algorithm. To achieve both good deconvolution (estimation of the inverse filter) and good signal restoration, the estimation is based on the minimization of a criterion which is the sum of the MIR and two regularization terms. The criterion can be related to a maximum a posteriori (MAP) approach in which the regularization terms act as priors, one providing a smooth filter and the other improving the output SNR. Moreover, due to the MIR term, the method uses higher order statistics and is able to estimate minimum-phase as well as non-minimum-phase filters. For quasi-optimal estimation of the gradient of the MIR term, accurate score function estimation is required and, after comparison of a few methods, we propose to use the fast kernel estimate developed by Pham.

Simulations emphasize the ability of the FBD algorithm to estimate non-minimum-phase filters and to provide good signal estimates even with noisy data. Finally, experiments on two sets of actual data point out the performance of the method compared to classical (second-order) approaches like spectral equalization or Yule–Walker/Durbin methods.

Future work includes the study of other i.i.d. criteria: second characteristic functions instead of the MIR, as suggested by Eriksson and Koivunen [39] for blind source separation, and complete MIR estimation (instead of the simplified (9)) based on the score function difference introduced by Babaie-Zadeh et al. [40].

APPENDIX

ESTIMATION OF HYPERPARAMETER

The hyperparameter can be associated with the generalized Gaussian prior distribution, whose definition involves a normalization term expressed with the gamma function $\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt$. We can estimate the hyperparameter with the interference formula linking the data and output spectral densities. The i.i.d. hypothesis on the output implies that its spectral density is constant (we fix the magnitude indeterminacy by a power normalization). Therefore, with the above relation, we can estimate the hyperparameter by

(36)

This estimation provides a good value of the hyperparameter provided that the spectral density does not vanish at any frequency.

ACKNOWLEDGMENT

The authors would like to thank P. Lesage of Laboratoire de Geophysique Interne et Tectonophysique from Grenoble for his analysis of our results on volcano data sets. The authors also thank the “LACROIX” society for the underwater data. Last, but not least, the authors would like to thank S. Sangwine for his careful review of the English usage.

REFERENCES

[1] F. Combet, P. Jaussaud, and N. Martin, “Recovery of a high shock probability process using blind deconvolution,” in Proc. Eur. Signal Processing Conf. (EUSIPCO 02), vol. 2, Toulouse, France, 2002, pp. 91–94.

[2] P. Bofill and M. Zibulevsky, “Underdetermined blind source separation using sparse representations,” Signal Process., vol. 81, no. 11, pp. 2353–2362, Nov. 2001.

[3] D. Luengo, I. Santamaría, J. Ibáñez, L. Vielva, and C. Pantaleón, “A fast SIMO channel identification algorithm for sparse sources,” IEEE Signal Process. Lett., vol. 10, no. 5, pp. 148–151, May 2003.

[4] J. J. Kormylo and J. M. Mendel, “Maximum likelihood detection and estimation of Bernoulli–Gaussian processes,” IEEE Trans. Inf. Theory, vol. 28, no. 3, pp. 482–488, May 1982.

[5] M. Lavielle, “Bayesian deconvolution of Bernoulli–Gaussian processes,” Signal Process., vol. 33, pp. 67–79, 1993.

[6] J. Idier and Y. Goussard, “Multichannel seismic deconvolution,” IEEE Trans. Geosci. Remote Sens., vol. 31, no. 5, pp. 961–980, Oct. 1993.

[7] M. Lavielle, “A stochastic algorithm for parametric and nonparametric estimation in the case of incomplete data,” Signal Process., vol. 42, pp. 3–17, 1995.

[8] A. Doucet and P. Duvaut, “Bayesian estimation of state-space models applied to deconvolution of Bernoulli–Gaussian processes,” Signal Process., vol. 57, pp. 147–161, 1997.

[9] O. Rosec and J.-M. Boucher, “Bayesian estimation of nonminimum phase wavelets applied to marine reflection seismic data,” in Proc. Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), vol. 5, Phoenix, AZ, Mar. 1999, pp. 2797–2800.

[10] J. K. Tugnait, “Estimation of linear parametric models using inverse filter criteria and higher order statistics,” IEEE Trans. Signal Process., vol. 41, no. 11, pp. 3196–3199, Nov. 1993.

[11] J. A. Cadzow, “Blind deconvolution via cumulant extrema,” IEEE Signal Process. Mag., vol. 13, no. 3, pp. 24–42, May 1996.

[12] M. Boumahdi, “Blind identification using the kurtosis with applications to field data,” Signal Process., vol. 48, no. 3, pp. 205–216, 1996.

[13] S.-F. Lei and R. P. Hamernik, “Construction of a joint peak-interval histogram using higher-order cumulant-based inverse filtering,” presented at the Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), Seattle, WA, May 1998.

[14] P. Comon, “MA identification using fourth order cumulants,” Signal Process., vol. 26, no. 3, pp. 381–388, 1992.

[15] D. L. Donoho, “On minimum entropy deconvolution,” in Applied Time Series Analysis, D. Findley, Ed. New York: Academic, 1981, vol. II, pp. 565–609.

[16] M. J. Porsani and B. Ursin, “Mixed-phase deconvolution,” Geophys., vol. 63, no. 2, pp. 637–647, Mar.–Apr. 1998.

[17] L. Zhang and A. Cichocki, “Multichannel blind deconvolution of nonminimum-phase systems using filter decomposition,” IEEE Trans. Signal Process., vol. 52, no. 5, pp. 1430–1441, May 2004.

[18] T. Cover and J. Thomas, Elements of Information Theory, ser. Wiley Series in Telecommunications. New York: Wiley, 1991.

[19] A. Taleb, J. Solé i Casals, and C. Jutten, “Quasinonparametric blind inversion of Wiener systems,” IEEE Trans. Signal Process., vol. 49, no. 5, pp. 917–924, May 2001.

[20] D.-T. Pham, “Contrast functions for blind separation and deconvolution of sources,” in Proc. Int. Conf. Independent Components Analysis (ICA), San Diego, CA, Dec. 2001, pp. 37–42.


[21] D. Erdogmus, K. E. Hild, J. C. Principe, M. Lazaro, and I. Santamaria, “Adaptive blind deconvolution of linear channels using Renyi’s entropy with Parzen window estimation,” IEEE Trans. Signal Process., vol. 52, no. 6, pp. 1489–1498, Jun. 2004.

[22] D. H. Brandwood, “A complex gradient operator and its application in adaptive array theory,” Proc. Inst. Elect. Eng., vol. 130, no. 1, pp. 11–16, Feb. 1983.

[23] A. Van Den Bos, “Complex gradient operator and Hessian,” Proc. Inst. Elect. Eng. Vision, Image Signal Processing, vol. 141, pp. 380–382, 1994.

[24] D.-T. Pham, P. Garat, and C. Jutten, “Separation of a mixture of independent sources through a maximum likelihood approach,” in Proc. Eur. Signal Processing Conf. (EUSIPCO), vol. 2, Brussels, Sep. 1992, pp. 771–774.

[25] J.-F. Cardoso, “Blind signal separation: Statistical principles,” Proc. IEEE, vol. 86, no. 10, pp. 2009–2025, Oct. 1998.

[26] L. Zhang, A. Cichocki, and S. Amari, “Self-adaptive blind source separation based on activation function adaptation,” IEEE Trans. Neural Netw., vol. 15, no. 2, pp. 233–244, Mar. 2004.

[27] J. Eriksson, J. Karvanen, and V. Koivunen, “Source distribution adaptive maximum likelihood estimation of ICA model,” in Proc. Int. Conf. Independent Components Analysis (ICA), Helsinki, Finland, 2000, pp. 227–232.

[28] S. Choi, A. Cichocki, and S. Amari, “Flexible independent component analysis,” J. VLSI Signal Process., vol. 20, pp. 25–38, 2000.

[29] R. Everson and S. Roberts, “Independent component analysis: A flexible nonlinearity and decorrelating manifold approach,” Neural Comput., vol. 11, pp. 1957–1983, 1999.

[30] J.-F. Bercher and C. Vignat, “Estimating the entropy of a signal with applications,” IEEE Trans. Signal Process., vol. 48, no. 6, pp. 1687–1694, Jun. 2000.

[31] P. Comon, “Independent component analysis, a new concept?,” Signal Process., vol. 36, no. 3, pp. 287–314, Apr. 1994.

[32] S. Amari, A. Cichocki, and H. Yang, “A new learning algorithm for blind signal separation,” in Advances in Neural Information Processing. Cambridge, MA: MIT Press, 1995, pp. 757–763.

[33] M. Babaie-Zadeh, “On blind source separation in convolutive and nonlinear mixtures,” Ph.D. dissertation, Institut National Polytechnique de Grenoble, Grenoble, France, 2002.

[34] A. Taleb and C. Jutten, “Source separation in post nonlinear mixtures,” IEEE Trans. Signal Process., vol. 47, no. 10, pp. 2807–2820, Oct. 1999.

[35] B. Silverman, Density Estimation for Statistics and Data Analysis. London, U.K.: Chapman & Hall, 1986.

[36] D.-T. Pham, “Fast algorithm for estimating mutual information, entropies and score functions,” in Proc. Int. Conf. Independent Components Analysis (ICA), Nara, Japan, Apr. 2003, pp. 17–22.

[37] D.-T. Pham, “Fast algorithm for mutual information based independent component analysis,” IEEE Trans. Signal Process., vol. 52, no. 10, pp. 2690–2700, Oct. 2004.

[38] P. Lesage, F. Glangeaud, and J. Mars, “Applications of autoregressive models and time-frequency analysis to the study of a volcanic tremor and long-period events,” J. Volcanology Geotherm. Res., vol. 114, no. 3, pp. 391–417, 2002.

[39] J. Eriksson and V. Koivunen, “Blind separation using characteristic function based criterion,” in Proc. 35th Conf. Information Sciences Systems (CISS), vol. 2, Baltimore, MD, Mar. 2001, pp. 781–785.

[40] M. Babaie-Zadeh, C. Jutten, and K. Nayebi, “Minimization-projection approach for blind source separation in different mixing models,” in Proc. Int. Conf. Independent Components Analysis (ICA), Nara, Japan, 2003, pp. 1083–1088.

Anthony Larue received the Agrégation in electrical engineering from the Ecole Normale Supérieure de Cachan, France, in 2002 and the M.S. degree in automatic control and signal processing from the University of Paris XI, France, in 2003. He is currently working toward the Ph.D. degree at the Laboratoire des Images et Signaux, Grenoble, France.

His research interests are in signal processing, more precisely blind deconvolution for seismic applications.

Jérôme I. Mars received the M.S. degree in geophysics from Joseph Fourier University of Grenoble, France, in 1986 and the Ph.D. degree in signal processing from the Institut National Polytechnique of Grenoble, France, in 1988.

From 1989 to 1992, he was a Postdoctoral Researcher at the Centre des Phénomènes Aléatoires et Geophysiques, Grenoble, France. From 1992 to 1995, he was a visiting Lecturer and Scientist at the Materials Sciences and Mineral Engineering Department, University of California, Berkeley. He is currently an Assistant Professor in Signal Processing for the Laboratoire des Images et des Signaux at the Institut National Polytechnique de Grenoble, France. He is the leader of the geophysical signal processing team. His research interests include seismic and acoustic signal processing, wavefield separation methods, time-frequency time-scale characterization, and applied geophysics.

Dr. Mars is a member of the Society of Exploration Geophysicists (SEG) and the European Association of Geoscientists and Engineers (EAGE).

Christian Jutten (A’92–M’03–SM’06) received the Ph.D. degree and the Docteur ès Sciences degree from the Institut National Polytechnique of Grenoble, France, in 1981 and 1987, respectively.

From 1982 to 1989, he was an Associate Professor in Ecole Nationale Supérieure d’Electronique et de Radioélectricité of Grenoble, France. He was a visiting Professor with the Swiss Federal Polytechnic Institute, Lausanne, Switzerland, in 1989, before becoming a Full Professor with the Université Joseph Fourier of Grenoble, France, more precisely with the Polytech Grenoble Institute. He is currently Associate Director of the Laboratoire des Images et des Signaux (100 people). For 25 years, his research interests have been blind-source separation, independent component analysis, and learning in neural networks, including theoretical aspects (separability, source separation in nonlinear mixtures), applications in signal processing (biomedical, seismic, and speech), and data analysis. He is author or coauthor of more than 40 papers in international journals, 16 invited papers, and 100 communications in international conferences.

Dr. Jutten was an Associate Editor of IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS from 1994 to 1995 and co-organizer with Dr. J.-F. Cardoso and Prof. Ph. Loubaton of the First International Conference on Blind Signal Separation and Independent Component Analysis (Aussois, France, January 1999). He is currently a member of a technical committee of the IEEE Circuits and Systems Society on blind signal processing. He is a reviewer for major international journals (IEEE TRANSACTIONS ON SIGNAL PROCESSING, IEEE SIGNAL PROCESSING LETTERS, IEEE TRANSACTIONS ON NEURAL NETWORKS, Signal Processing, Neural Computation, Neurocomputing, etc.) and conferences in signal processing and neural networks (ICASSP, ISCAS, EUSIPCO, IJCNN, ICA, ESANN, IWANN, etc.).