
122 IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 56, NO. 1, JANUARY 2009

Evaluating the Performance of Kalman-Filter-Based EEG Source Localization

Matthew J. Barton∗, Peter A. Robinson, Suresh Kumar, Andreas Galka, Hugh F. Durrant-Whyte, Fellow, IEEE, José Guivant, Member, IEEE, and Tohru Ozaki

Abstract—Electroencephalographic (EEG) source localization is an important tool for noninvasive study of brain dynamics, due to its ability to probe neural activity more directly, with better temporal resolution than other imaging modalities. One promising technique for solving the EEG inverse problem is Kalman filtering, because it provides a natural framework for incorporating dynamic EEG generation models in source localization. Here, a recently developed inverse solution is introduced, which uses spatiotemporal Kalman filtering tuned through likelihood maximization. Standard diagnostic tests for objectively evaluating Kalman filter performance are then described and applied to inverse solutions for simulated and clinical EEG data. These tests, employed for the first time in Kalman-filter-based source localization, check the statistical properties of the innovation and validate the use of likelihood maximization for filter tuning. However, this analysis also reveals that the filter's existing space- and time-invariant process model, which contains a single fixed-frequency resonance, is unable to completely model the complex spatiotemporal dynamics of EEG data. This finding indicates that the algorithm could be improved by allowing the process model parameters to vary in space.

Index Terms—Diagnostic testing, distributed model, electroencephalographic (EEG), filter tuning, inverse problem, Kalman filtering, source localization.

Manuscript received September 3, 2007; revised June 13, 2008. First published October 7, 2008; current version published February 13, 2009. This work was supported in part by the Australian Research Council and by the Deutsche Forschungsgemeinschaft (DFG) through Sonderforschungsbereich (SFB) 654 “Plasticity and Sleep.” Asterisk indicates corresponding author.

∗M. J. Barton is with the School of Physics, University of Sydney, Sydney, N.S.W. 2006, Australia, and also with the Brain Dynamics Centre, Westmead Millennium Institute, Westmead Hospital and Western Clinical School of the University of Sydney, Westmead, N.S.W. 2145, Australia (e-mail: [email protected]).

P. A. Robinson is with the School of Physics, University of Sydney, Sydney, N.S.W. 2006, Australia, and with the Brain Dynamics Centre, Westmead Millennium Institute, Westmead Hospital and Western Clinical School of the University of Sydney, Westmead, N.S.W. 2145, Australia, and also with the Faculty of Medicine, University of Sydney, Sydney, N.S.W. 2006, Australia (e-mail: [email protected]).

S. Kumar is with the Australian Research Council (ARC) Centre of Excellence for Autonomous Systems, University of Sydney, Sydney, N.S.W. 2006, Australia (e-mail: [email protected]).

A. Galka is with the Department of Neurology, University of Kiel, 24105 Kiel, Germany (e-mail: [email protected]).

H. F. Durrant-Whyte is with the Australian Research Council (ARC) Centre of Excellence for Autonomous Systems, University of Sydney, Sydney, N.S.W. 2006, Australia (e-mail: [email protected]).

J. Guivant is with the School of Mechanical Engineering, University of New South Wales, Sydney, N.S.W. 2052, Australia (e-mail: [email protected]).

T. Ozaki is with the Institute of Statistical Mathematics, Tokyo 106-8569, Japan (e-mail: [email protected]).

Digital Object Identifier 10.1109/TBME.2008.2006022

I. INTRODUCTION

FUNCTIONAL neuroimaging aims to noninvasively characterize the dynamics of the distributed neural networks that mediate brain function in healthy and pathological states. A number of imaging techniques have emerged over the past 20 years, providing insights into brain dynamics on different spatiotemporal scales. Functional MRI (fMRI) and positron emission tomography (PET) use hemodynamic and metabolic fluctuations induced by neural activity to probe brain dynamics with high spatial (in millimeters), but only low temporal (in seconds to minutes) resolution [1]. Electroencephalographic (EEG) source localization is a complementary imaging technique that accesses, through scalp voltages, a more direct, albeit spatially blurred, measure of the brain's electrical (neural) activity. Typically, the images generated by EEG inverse solutions have a lower spatial resolution (centimeters), but possess a much higher temporal resolution (milliseconds), and are thus important for studying brain dynamics as they probe neural processes on cognitive time scales [2].

Solving the EEG inverse problem to estimate the location, magnitude, and time course of the neuronal sources that produce the observed scalp voltages presents a considerable challenge. Unlike the forward problem (prediction of scalp voltages for a given source configuration), which has a unique solution, the inverse problem is nonunique due to the relatively small number of spatial measurements (256) and volume conduction effects. To make the problem tractable, a priori assumptions (mathematical, anatomical, and physiological) are imposed on the sources and head model. The variety of methodologies being employed has seen a proliferation of source localization techniques in recent years. For a comprehensive review of these, see [2] and [3].

Solutions to the EEG inverse problem fall into two main categories [2], [3]. The first type are dipole-fitting approaches (also known as “parametric” or “equivalent current dipole” methods), in which the activity is modeled by a relatively small number of focal sources at locations assumed a priori or estimated from the data. Examples include least-squares source estimation [4], beamformer techniques [5], and multiple-signal-classification methods [6], [7]. However, a drawback is that the equivalent sources can misrepresent actual activity, especially when extended spatially [8]. The second group of techniques, which this paper is concerned with, are “linear distributed” approaches (also known as “imaging” methods), in which the sources are modeled by a 3-D grid of dipoles throughout the head volume.

0018-9294/$25.00 © 2009 IEEE

Distributed source models present a highly ill-posed inverse problem, particularly due to the mismatch between the small number of measurements (≈10²) and the number of states to be estimated (≈10⁴). This necessitates the use of constraints to identify an “optimal” inverse solution. Numerous classes of constraints have been applied to the EEG imaging problem, such as minimizing the norm of the current distribution [9] and variations of weighted minimum norm constraints as implemented in the low-resolution electromagnetic tomography (LORETA) [10] and focal underdetermined system solution (FOCUSS) [11] algorithms. It is important to note that these inverse solutions [9]–[11] are instantaneous, i.e., each source estimate is calculated using only the data available at the current instant of time, independent of all other estimates, except that the regularization parameter required in these solutions is usually computed by optimization over the entire dataset. Since EEG data have temporal structure and are produced by physical processes, this assumption of temporal independence is certainly false, and instantaneous techniques ignore much additional information that could further constrain the inverse solution. Incorporating information from previous times into the estimation process yields dynamical EEG inverse solutions, which is the focus of this paper.
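As a concrete illustration of the instantaneous minimum-norm family discussed above, the following sketch computes a Tikhonov-regularized minimum-norm estimate for a toy problem. The sizes, the random stand-in lead field, and the regularization value are illustrative assumptions, not the configuration used in this paper.

```python
import numpy as np

# Toy sizes: 18 channels, 300 states (the real problem has ~10^2
# measurements and ~10^4 states); K is a random stand-in lead field.
rng = np.random.default_rng(0)
n_c, n_j = 18, 300
K = rng.standard_normal((n_c, n_j))
y = rng.standard_normal(n_c)          # one (average-referenced) EEG sample

# Instantaneous minimum-norm estimate: J = K^T (K K^T + lam I)^{-1} Y.
# lam is a regularization parameter, chosen arbitrarily here; in practice
# it is usually optimized over the entire dataset.
lam = 1e-2
j_hat = K.T @ np.linalg.solve(K @ K.T + lam * np.eye(n_c), y)

assert j_hat.shape == (n_j,)
```

Each time point is inverted independently: the estimate for sample k uses only y(k), which is exactly the temporal independence assumption criticized above.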

Several approaches for solving the dynamical EEG inverse problem have been investigated. One commonly used technique is the introduction of a temporal smoothness term, which has been successfully applied to regularization [12], [13] and Bayesian estimation [14], [15] methods. Another strategy is to use dynamic models for describing source behavior, which can then be used in various estimation schemes. Recent examples include a particle filter using a random walk model for inverting magnetoencephalographic (MEG) data [16], a modified LORETA algorithm that generalizes the temporal smoothness constraint into the form of an autoregressive (AR) model, allowing more complex dynamics to be modeled [17], and an inverse solution for evoked responses that uses a neural mass model within a dynamic causal modeling framework [18].

This paper investigates another model-based approach, the application of Kalman filtering to solving the dynamical EEG inverse problem. The Kalman filter (KF) is a widely used technique for the estimation of unobservable states in dynamical systems [19]–[21]. It has been used to solve dynamic inverse problems in several biomedical imaging areas, including electrical impedance tomography (EIT) [22], single-photon-emission computed tomography (SPECT) [23], and diffuse optical tomography (DOT) [24], [25]. An attractive feature of the KF is that it provides a natural framework for introducing predictive models for EEG generation into source localization techniques. These models can be inferred from signal analysis, as done here, or derived from physiology (e.g., [26] and [27]). Despite these attributes, Kalman filtering has not been widely explored in the EEG inverse mapping field, and its potential remains largely untapped, although a few studies have appeared [28]–[31], and the related particle [16] and local linearization [32] filters have also been used. A major reason for this is that the high dimensionality of the underlying state space makes the application of a standard KF challenging, due to the inability to accurately model the spatiotemporal interactions between all voxels and the high computational costs of running such an algorithm. However, a recently developed KF-based inverse solution [28] avoids these problems and shows considerable promise. It proposes a modified KF algorithm that reduces the high dimensionality of this problem by reformulating it as a coupled set of low-dimensional KFs running in parallel. Using a single telegrapher's equation to model the global source dynamics and likelihood maximization to estimate a small number of model and noise parameters, this technique offers improved source localization over existing instantaneous solutions (e.g., LORETA).

In this paper, the application of Kalman filtering to source localization is examined through a detailed study of the KF-based inverse solution described above [28]. The study aims to characterize, validate, and identify ways to improve the algorithm's performance. To achieve this, several new contributions are made: 1) standard diagnostic tests for objectively evaluating KF performance are introduced to EEG source localization; 2) the application of these tests is demonstrated; 3) results for this particular filter, whose performance has not been previously evaluated formally, are shown and discussed for both simulated and clinical EEG data; and 4) the outcomes are used to direct future work. These tests have not been discussed in the growing literature on KF-based EEG source localization, despite their proven utility and widespread use in other fields where Kalman filtering is employed [19], [33], [34]. All such tests check the statistical properties required of the innovation sequence, which is the only indicator of KF performance available for real data [19]. Numerous tests, for both off- and online applications, have been developed for this purpose [19], [20], [33]–[37]. Using these tests, we can determine objectively whether the filter tuning step results in a well-tuned filter, as defined in Section V. This analysis is repeated for several process models, so the relative contributions of spatial and temporal components to the inverse solution can be ascertained. Resonant behavior in the process model is then examined to provide the basis for discussing potential improvements to the algorithm.
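The innovation diagnostics mentioned above typically include a whiteness check: for a well-tuned filter, the sample autocorrelation of the innovation at nonzero lags should fall within approximate 95% confidence bounds of ±1.96/√N. A minimal sketch of such a test (the function name and lag count are our own choices, not from the paper):

```python
import numpy as np

def innovation_whiteness(dy, max_lag=20):
    """Sample autocorrelation of a scalar innovation sequence.

    For a well-tuned filter, lags >= 1 should lie within +/-1.96/sqrt(N),
    the approximate 95% bounds for a white sequence of length N."""
    dy = np.asarray(dy, dtype=float) - np.mean(dy)
    n = len(dy)
    denom = np.dot(dy, dy)
    rho = np.array([np.dot(dy[:n - lag], dy[lag:]) / denom
                    for lag in range(max_lag + 1)])
    bound = 1.96 / np.sqrt(n)
    frac_inside = np.mean(np.abs(rho[1:]) <= bound)
    return rho, bound, frac_inside

# A white-noise input should pass: nearly all lags inside the bounds.
rng = np.random.default_rng(1)
rho, bound, frac = innovation_whiteness(rng.standard_normal(2000))
assert abs(rho[0] - 1.0) < 1e-9
assert frac >= 0.8
```

A colored (e.g., oscillatory) innovation sequence would push many lags outside the bounds, which is the signature of process-model mismatch discussed later in the paper.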

In Section II, the linear distributed EEG inverse problem is described. Section III outlines the KF-based inverse solution and the likelihood maximization technique used for parameter estimation. In Section IV, inverse solutions for both simulated and clinical EEG data are presented. Section V describes the tests for evaluating KF performance and discusses their results. Process model resonance is explored and discussed in Section VI, which closes by outlining ways to improve both the process model and the filter itself.

II. EEG INVERSE PROBLEM FORMULATION

To set up the EEG inverse problem, we define a continuous current vector field j(r, t), where r and t denote space and time, respectively. The solution space is discretized into Nv grid points (voxels) rv, v = 1, . . . , Nv, restricted to the cortical gray matter of the brain, where the majority of the EEG signal is generated [8]. Time is discretized into Nk points tk, k = 1, . . . , Nk. Discretized points are indicated by v and k here, rather than rv and tk. At each voxel, the state vector is

j(v, k) = [jx(v, k), jy(v, k), jz(v, k)]ᵀ. (1)


The global state vector for the entire system has dimension NJ = 3Nv and is written as

J(k) = [j(1, k)ᵀ, . . . , j(Nv, k)ᵀ]ᵀ. (2)

The currents j(v, k) produce the EEG signal that is recorded on the scalp at Nc electrode sites. If the EEG voltage at a single electrode is denoted by y(c, k), where c is an electrode label, the observation vector containing the scalp voltages at all EEG channels is

Y(k) = [y(1, k), . . . , y(Nc, k)]ᵀ. (3)

Here, voltages refer to average reference (the average voltage is subtracted from each channel).

The observation equation that relates the current vectors to be estimated to the EEG signal is

Y(k) = KJ(k) + ε(k) (4)

where the Nc × NJ matrix K, often referred to as the lead field matrix (LFM) or the observation model, maps the current vectors to voltages at the scalp electrodes. In this paper, the LFM is approximated for the International 10–20 System [38] by solving the vector Laplace equation for a three-shell spherical head model via the boundary element method [39]. The term ε(k) is an Nc-dimensional vector of observational noise, which is assumed to be white, Gaussian, and unbiased, with covariance matrix Cε, and uncorrelated between all pairs of sensors, with equal variance σ²ε at every electrode, so

Cε = σ²ε INc (5)

where INc is the Nc × Nc identity matrix.

Equation (4) cannot be inverted directly due to the large ratio of solution points to measurements. Hence, the inverse problem can only be solved by introducing additional constraints.
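The forward model (4)-(5) can be sketched numerically as follows; the sizes and the random stand-in lead field are illustrative assumptions (the paper's K comes from a boundary-element solution for a three-shell spherical head model).

```python
import numpy as np

rng = np.random.default_rng(2)
n_c, n_v = 18, 100                    # toy sizes: channels, voxels
n_j = 3 * n_v                         # three current components per voxel
K = rng.standard_normal((n_c, n_j))   # stand-in lead field matrix
J = rng.standard_normal(n_j)          # global current state vector J(k)

# Observation equation (4): Y(k) = K J(k) + eps(k), with the noise
# covariance (5): C_eps = sigma_eps^2 * I (white, equal variance).
sigma_eps = 0.1
eps = sigma_eps * rng.standard_normal(n_c)
Y = K @ J + eps

# Average reference: subtract the mean voltage from every channel.
Y_ref = Y - Y.mean()
assert abs(Y_ref.mean()) < 1e-9
```

With n_c much smaller than n_j, the matrix K has a large null space, which is the algebraic reason (4) cannot be inverted without additional constraints.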

III. SPATIOTEMPORAL KALMAN FILTERING

In this section, we summarize a recently developed KF-based source localization technique [28] that provides the motivation and basis for the present study. We begin by introducing a model to describe the source dynamics and a state space transformation that reduces the filter's computational costs. The spatiotemporal KF algorithm is then outlined. Parameter estimation is then discussed and a method to tune the filter by likelihood maximization is proposed.

A. Spatiotemporal Models

A key component of any dynamical inverse solution is a model of the system dynamics (process model), in this case, one that describes the spatiotemporal evolution of the current vectors. For this task, we propose a telegrapher's equation [40] of the form

∂²j(r, t)/∂t² + 2ζωn ∂j(r, t)/∂t + ωn² j(r, t) = b∇²j(r, t) + η(r, t) (6)

where ωn is the natural frequency, ζ the fractional damping coefficient, b the wave velocity squared, and η(r, t) is a dynamical (process) noise term. This equation is selected for several reasons: 1) it is the continuous form of the discrete model used here and in [28]; 2) it contains an explicit temporal resonance, which is a key feature of EEG data; 3) it allows physically meaningful parameters to be determined through the estimation step; and 4) in previous work using mean-field modeling [26], an equivalent equation successfully described the spatiotemporal propagation of neuronal activity. To implement a KF, (6) is discretized with respect to space and time to give

j(v, k) = AL1 j(v, k − 1) + AL2 j(v, k − 2) + BL1 [LJ(k − 1)]v + ηL(v, k) (7)

at each voxel, where L is a discrete 3-D spatial Laplacian operator of dimensions NJ × NJ that arises from the discretization of the second spatial derivative in (6) and is defined

L = (INv − N/6) ⊗ I3 (8)

where ⊗ indicates Kronecker multiplication and N is an Nv × Nv matrix with element N(v, v′) = 1 if v′ is immediately adjacent to v (maximum of six neighbors per voxel in a 3-D rectangular grid) and N(v, v′) = 0 otherwise. The [J]v operator selects the column vector composed of the three elements of J that correspond to grid point v. Restricting attention to classes of process models (e.g., [28]) in which the local current components in each voxel are approximated as behaving independently of each other and only interacting with the corresponding current vectors in neighboring voxels, gives the following local parameter matrices in (7):

AL1 = a1 I3, AL2 = a2 I3, BL1 = b1 I3. (9)

From the discretization of (6), the model parameters in (9), assumed to be space- and time-invariant, are

a1 = [2 − (ωn ∆t)²] / (1 + ζωn ∆t) (10)

a2 = (ζωn ∆t − 1) / (1 + ζωn ∆t) (11)

b1 = −6b(∆t)² / [(∆x)²(1 + ζωn ∆t)] (12)

where ∆t and ∆x are the time step and voxel size (assuming cubic voxels). From (7), we write the global process model as a second-order multivariate AR model

J(k) = AG1 J(k − 1) + AG2 J(k − 2) + ηG(k) (13)

where the NJ × NJ global parameter matrices are

AG1 = a1 INJ + b1 L, AG2 = a2 INJ. (14)
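The coefficient formulas (10)-(12) can be sketched directly in code; the parameter values below (a 10 Hz resonance, light damping, 1 ms sampling, 7 mm voxels) are illustrative assumptions, not the values fitted in the paper.

```python
import numpy as np

def local_coeffs(omega_n, zeta, b, dt, dx):
    """Discretized telegrapher's-equation coefficients, (10)-(12)."""
    d = 1.0 + zeta * omega_n * dt          # common denominator
    a1 = (2.0 - (omega_n * dt) ** 2) / d   # (10)
    a2 = (zeta * omega_n * dt - 1.0) / d   # (11)
    b1 = -6.0 * b * dt ** 2 / (dx ** 2 * d)  # (12)
    return a1, a2, b1

# Illustrative values: 10 Hz natural frequency, zeta = 0.1,
# dt = 1 ms, dx = 7 mm; b is an arbitrary small wave-speed-squared.
a1, a2, b1 = local_coeffs(omega_n=2 * np.pi * 10, zeta=0.1, b=1e-3,
                          dt=1e-3, dx=7e-3)

# For a damped resonance, the AR(2) roots must lie inside the unit
# circle, which requires |a2| < 1 (the product of the roots is -a2).
assert abs(a2) < 1
```

Setting b1 = 0 recovers the "no spatial coupling" model, and (a1, a2, b1) = (1, 0, 0) the random walk model, both used for comparison in Section IV.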

The NJ-dimensional vector ηG(k) is a dynamical noise term that is assumed white, Gaussian, and unbiased, with covariance matrix CηG. To decompose this high-dimensional problem into a set of coupled low-dimensional, voxel-centered, local filtering problems, as described in the next section, requires this dynamical noise covariance matrix to be diagonal. However, for the process noise, assumption of a diagonal covariance matrix is typically not justified due to nonvanishing instantaneous correlations between neighboring voxels. So, to diagonalize this matrix, a switch to a transformed (Laplacianized) state space J̃(k) was proposed [28], where

J̃(k) = LJ(k). (15)

Assuming that the same form of dynamics govern the Laplacian of J, the process model is

J̃(k) = AG1 J̃(k − 1) + AG2 J̃(k − 2) + η̃G(k). (16)

As a result of this transformation, the dynamical noise covariance matrix C̃ηG is closer to diagonal, since applying the Laplacian operator L to the state vector J reduces spatial correlations between neighboring voxels through (second order) spatial differentiation. Assuming the process noise covariance σ²η to be fixed in space and time, the covariance matrix is

C̃ηG = σ²η INJ. (17)

We can rearrange (16) to obtain

J(k) = L⁻¹AG1 LJ(k − 1) + L⁻¹AG2 LJ(k − 2) + L⁻¹η̃G(k). (18)

By equating the process noise term in (18) with the one in (13), we find ηG(k) = L⁻¹η̃G(k), which yields the process noise covariance matrix in the original space

CηG = L⁻¹E(η̃G η̃Gᵀ)(L⁻¹)ᵀ = σ²η (LᵀL)⁻¹. (19)

This state space transformation is called “spatial whitening,” and allows decomposition of the filtering problem, as described in the next section. From now on, we will operate in this Laplacianized state space by replacing J(k) with J̃(k) and CηG with C̃ηG. To obtain actual current densities and covariances, we simply apply the inverse of the spatial whitening transformation; as seen shortly, this step requires one-off inversion of a very large (≈10⁴ × 10⁴) matrix.
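The whitening transformation (15) and its inverse can be sketched on a toy grid. For brevity, the sketch below uses a 1-D chain of voxels with one current component per voxel (so each voxel has at most two neighbors and the ⊗ I3 factor of (8) is dropped); these simplifications are ours, not the paper's.

```python
import numpy as np

# Toy 1-D chain of voxels as a stand-in for the 3-D grid.
n_v = 5
N = np.zeros((n_v, n_v))
for v in range(n_v - 1):
    N[v, v + 1] = N[v + 1, v] = 1.0       # adjacency matrix

# Equation (8) restricted to one current component: L = I - N/6.
L = np.eye(n_v) - N / 6.0

rng = np.random.default_rng(3)
J = rng.standard_normal(n_v)
J_tilde = L @ J                           # spatial whitening, (15)
J_back = np.linalg.solve(L, J_tilde)      # undo whitening, as in (32)
assert np.allclose(J_back, J)

# Process noise covariance mapped back to the original space, (19):
sigma_eta = 0.5
C_eta = sigma_eta ** 2 * np.linalg.inv(L.T @ L)
assert np.allclose(C_eta, C_eta.T)        # symmetric, non-diagonal
```

Note that C_eta in the original space is dense, whereas the transformed covariance (17) is diagonal by construction; this is precisely what allows the global filter to be split into per-voxel filters.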

B. KF Algorithm

At this point, we could apply standard Kalman filtering to this problem in the original or Laplacianized state space. However, given the high dimension NJ of the state space usually seen in EEG inverse problems, the computational time and memory usage for such a filter is large enough to make the numerical estimation of model parameters performed in Section IV impractical. To overcome this problem, a modified KF was introduced in [28] that reduces this NJ-dimensional filtering problem to a set of Nv coupled 6-D KFs, one at each voxel in Laplacianized state space, governed by the local process model (7). This modification requires that C̃ηG be diagonal, which explains the need for spatial whitening.

We now outline the modified KF used here. The reader is referred to [28] for further details regarding its development. Before describing the algorithm, a notational convention is defined. The term x(k1|k2) will indicate an estimate of some quantity x computed at time k1, based on all observations available at time k2, where k1 ≥ k2. Also, due to the application of spatial whitening, we will replace the LFM K with K̃ = KL⁻¹ henceforth. We start by augmenting the local state vector, as the KF requires the local process model (7) to be in the form of a first-order AR model. To achieve this, we define a new 6-D local state vector jKF(v, k) as follows:

jKF(v, k) = [j(v, k)ᵀ, j(v, k − 1)ᵀ]ᵀ (20)

so the new local parameter matrices become

AKF = [AL1 AL2; I3 0], BKF = [BL1 0; 0 0]. (21)

Rewriting (7), we obtain the local state prediction equation

jKF(v, k|k − 1) = AKF jKF(v, k − 1|k − 1) + BKF [[LJ̃(k − 1|k − 1)]v; 0]. (22)

The local predicted state covariance is approximated as

P(v, k|k − 1) = AKF P(v, k − 1|k − 1) AKFᵀ + CηL (23)

where CηL is the 6 × 6 local dynamical noise covariance matrix given by

CηL = [σ²η I3 0; 0 0]. (24)

The contribution of the second (neighborhood) term in (22) to the predicted state covariance (23) is ignored [28], since it is expected to be small, relative to the first (local) term, and therefore will not contribute significantly to the state covariance.

Once the local prediction equations (22) and (23) have been applied at all voxels, we predict observed scalp voltages from the global state vector

Y(k|k − 1) = K̃ J̃(k|k − 1). (25)

The innovation sequence is the difference between observed and predicted EEG measurements

∆Y(k) = Y(k) − Y(k|k − 1). (26)

The associated innovation covariance is approximated by

R(k|k − 1) = Σ_{v=1}^{Nv} Q(v) P(v, k|k − 1) Q(v)ᵀ + Cε (27)

where the Nc × 6 matrix Q(v) is defined as

Q(v) = ([K̃]v 0). (28)

The [K̃]v term denotes the three columns from K̃ that correspond to the vth voxel, and the 0 matrix on the right has dimensions Nc × 3. The 6 × Nc Kalman gain matrix for voxel v is then

G(v, k) = P(v, k|k − 1) Q(v)ᵀ R(k|k − 1)⁻¹. (29)

The filtering cycle is then completed by calculating the local state estimate and the corresponding local state estimate covariance matrix

jKF(v, k|k) = jKF(v, k|k − 1) + G(v, k) ∆Y(k) (30)

P(v, k|k) = [I6 − G(v, k) Q(v)] P(v, k|k − 1) (31)


respectively. Applying (30) and (31) to all voxels generates the inverse solution for time point k. To obtain actual current density estimates, we then undo the spatial whitening transformation (15) via

J(k|k) = L⁻¹ J̃(k|k). (32)

The associated NJ × NJ covariance matrix for the actual current densities at every voxel is given by

P(k|k) = L⁻¹ P̃(k|k) (Lᵀ)⁻¹ (33)

where P̃(k|k) denotes the Laplacianized NJ × NJ covariance matrix for all voxels, the diagonal of which consists of the P(v, k|k) matrices (only the first three columns of the first three rows) given by (31) for each voxel. The remaining elements of P̃(k|k) are filled with zeros as a result of spatial whitening, which removes off-diagonal covariances.
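One filtering cycle of the coupled local filters can be sketched as follows. This is a simplified illustration with toy sizes and arbitrary coefficients; in particular, the neighborhood input in (22) is omitted (its covariance contribution is ignored in the paper as well), and the lead field is a random stand-in for K̃.

```python
import numpy as np

rng = np.random.default_rng(4)
n_c, n_v = 4, 6                       # toy sizes: channels, voxels

# Local transition matrix (21) with illustrative a1 = 0.9, a2 = -0.5.
A_kf = np.block([[0.9 * np.eye(3), -0.5 * np.eye(3)],
                 [np.eye(3), np.zeros((3, 3))]])
K_t = rng.standard_normal((n_c, 3 * n_v))   # stand-in for K~
C_eps = 0.1 * np.eye(n_c)                   # observation noise covariance
C_eta = np.zeros((6, 6))
C_eta[:3, :3] = 0.2 * np.eye(3)             # local process noise, (24)

# One 6-D state and 6x6 covariance per voxel; Y is one EEG sample.
j = [np.zeros(6) for _ in range(n_v)]
P = [np.eye(6) for _ in range(n_v)]
Y = rng.standard_normal(n_c)

# Prediction at every voxel, (22)-(23), neighborhood term omitted here.
j_pred = [A_kf @ jv for jv in j]
P_pred = [A_kf @ Pv @ A_kf.T + C_eta for Pv in P]

# Innovation and its covariance, (25)-(28).
Q = [np.hstack([K_t[:, 3 * v:3 * v + 3], np.zeros((n_c, 3))])
     for v in range(n_v)]
Y_pred = sum(Q[v] @ j_pred[v] for v in range(n_v))
R = sum(Q[v] @ P_pred[v] @ Q[v].T for v in range(n_v)) + C_eps
dY = Y - Y_pred

# Per-voxel gain and update, (29)-(31).
for v in range(n_v):
    G = P_pred[v] @ Q[v].T @ np.linalg.inv(R)
    j[v] = j_pred[v] + G @ dY
    P[v] = (np.eye(6) - G @ Q[v]) @ P_pred[v]

assert all(np.all(np.isfinite(jv)) for jv in j)
```

The coupling between voxels enters only through the shared innovation ∆Y(k) and its covariance R(k|k − 1); each update (30)-(31) then involves only 6 × 6 and 6 × Nc matrices, which is what makes the decomposition cheap.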

C. Parameter Estimation

Since no detailed prior knowledge of parameter values is usually available, a strategy for selecting optimal values for the process model parameters (a1, a2, and b1) and noise covariances (σ²ε and σ²η) is required. It was proposed in [28] that the filter parameters should be estimated directly from the data using the widely applied technique of likelihood maximization [41]–[44]. Following [43], filter parameters are selected by numerically minimizing the Akaike information criterion (AIC). The AIC, closely related to the logarithm of the likelihood, estimates the distance between the process model and the unknown true model. Its calculation allows different model structures and parameter values to be compared objectively, so the best combination can be identified. We begin by defining a parameter vector ϑ = (a1, a2, b1). The log-likelihood for the entire EEG time series is

log L(ϑ, σ²ε, σ²η) = −(1/2) Σ_{k=1}^{Nk} [log |R(k|k − 1)| + ∆Y(k)ᵀ R(k|k − 1)⁻¹ ∆Y(k) + Nc log(2π)] (34)

where | · | denotes the absolute value of the matrix determinant. The AIC is then

AIC(ϑ, σ²ε, σ²η) = −2 log L(ϑ, σ²ε, σ²η) + 2[dim(ϑ) + 2] (35)

where dim(ϑ) indicates the number of parameters in ϑ, which is increased by 2 as we need to fit the noise covariances from the data.
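Equations (34) and (35) translate directly into code: the likelihood is accumulated from the innovation sequence and its predicted covariances, and the AIC adds a penalty for the fitted parameters. The sketch below evaluates both on synthetic innovations (the sizes and covariances are illustrative assumptions).

```python
import numpy as np

def gaussian_loglik(innovations, R_seq):
    """Innovation-form log-likelihood, (34): dY(k) ~ N(0, R(k|k-1))."""
    n_c = innovations[0].shape[0]
    ll = 0.0
    for dY, R in zip(innovations, R_seq):
        # slogdet returns log of the absolute determinant, as in (34).
        _, logdet = np.linalg.slogdet(R)
        ll -= 0.5 * (logdet + dY @ np.linalg.solve(R, dY)
                     + n_c * np.log(2 * np.pi))
    return ll

def aic(loglik, n_params):
    # (35): dim(theta) is increased by 2 for the two noise covariances.
    return -2.0 * loglik + 2.0 * (n_params + 2)

# Synthetic white innovations with identity predicted covariance.
rng = np.random.default_rng(5)
innov = [rng.standard_normal(3) for _ in range(50)]
Rs = [np.eye(3) for _ in range(50)]
ll = gaussian_loglik(innov, Rs)
assert np.isfinite(ll)
assert aic(ll, n_params=3) > -2.0 * ll   # the penalty term is positive
```

Minimizing this AIC over (ϑ, σ²ε, σ²η) is the filter tuning step whose adequacy the diagnostic tests of Section V are designed to check.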

IV. RESULTS OF INVERSE SOLUTION

The spatiotemporal KF and parameter estimation techniques are now applied to both simulated and clinical EEG data. For each dataset, we computed inverse solutions for three process models: 1) full model (discretized telegrapher's equation); 2) no spatial coupling (b1 = 0); and 3) random walk (a1 = 1, a2 = 0, b1 = 0). This allows the relative contribution to filter performance of different parts of the process model to be assessed.

The following applies to all simulated and clinical EEG stud-ies in this paper. Prior to computing an inverse solution, wedefine a discretized solution space consisting of 3564 (Nv )7 × 7 × 7 mm gray matter voxels. These voxels cover the cortexand basal ganglia, and were taken from the Probabilistic MRIAtlas produced by the Montreal Neurological Institute [45]. Ateach voxel, the 3-D local current vector is mapped to the 19electrode sites for the 10–20 system through the LFM intro-duced in Section II. However, due to the choice of a referenceout of the set of electrodes, we exclude one of the electrode sitesfrom the analysis [28], so the number of channels is Nc = 18;in this case, Pz is chosen. After referencing, both datasets werenormalized to unit variance.

The filter requires that initial values jKF(v, 1|1) andP (v, 1|1) be given, although the value for P (v, 1|1) is notcritical [28]. Here, the filter is initialized by setting jKF(v, 1|1)to a 0 column vector and P (v, 1|1) to an identity matrix for allvoxels. If it converges, the filter is sensitive to initialization onlyin the short term, up to 0.5 s (see simulations).
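The preprocessing and initialization just described can be sketched as follows; the data are random stand-ins and the Pz index is hypothetical, so this only illustrates the operations:

```python
import numpy as np

rng = np.random.default_rng(0)
raw = rng.normal(size=(19, 512))                 # 19 electrodes x Nk samples (stand-in data)

# Average reference over all 19 electrodes, then drop the reference channel (Pz).
referenced = raw - raw.mean(axis=0, keepdims=True)
pz = 13                                          # hypothetical index of Pz in this montage
data = np.delete(referenced, pz, axis=0)         # Nc = 18 channels remain

# Normalize the dataset to unit variance.
data = data / data.std()

# Initialization: zero current vector and identity covariance at every voxel.
n_voxels, n_states = 3564, 3
j_kf = np.zeros((n_voxels, n_states))            # j_KF(v, 1|1) = 0
p_kf = np.broadcast_to(np.eye(n_states), (n_voxels, n_states, n_states)).copy()
```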

A. Simulated EEG Recording

A major problem with all inverse solutions is obtaining meaningful evaluations of an algorithm's results and performance, because true sources are not available for comparison when working with real data. One solution is to use simulated EEG data, where the underlying sources are known. Generating a simulated EEG dataset for this purpose requires a model of the brain dynamics that displays complex spatiotemporal behavior. Here, we propose a highly simplified approximation, similar to the one used in [46], based on the observation that oscillations can be widely distributed but are often strongest in a local region, e.g., alpha activity in the visual cortex.

The temporal dynamics are modeled using a linear combination of sine functions whose components are evenly spaced in the alpha band (8–12 Hz). The alpha band was selected because the clinical data used in the following section display prominent alpha activity. The amplitude of the oscillations follows a Gaussian centered at f0 = 10 Hz, so the simulated current density is

    j(k) = \sum_{i=1}^{N_f} A(i) \sin[2\pi f(i) k \Delta t + \psi(i)]    (36)

where Nf is the number of frequency components, f(i) is the frequency of oscillation [8 Hz ≤ f(i) ≤ 12 Hz], ψ(i) is a random phase offset [−π ≤ ψ(i) ≤ π], and A(i) is the Gaussian scaling coefficient with variance σ²f and

    A(i) = \frac{1}{\sigma_f \sqrt{2\pi}} \exp\left( -\frac{[f(i) - f_0]^2}{2\sigma_f^2} \right).    (37)
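A minimal sketch of the temporal source model (36)–(37); the function name and the default σ_f are our choices (an FWHM of 2 Hz implies σ_f ≈ 2/2.355 ≈ 0.85 Hz):

```python
import numpy as np

def simulated_current(nk, dt=1 / 256, f0=10.0, sigma_f=0.85, spacing=0.25, seed=0):
    """Temporal source signal of (36)-(37): a sum of alpha-band sinusoids
    with a Gaussian amplitude envelope centered at f0."""
    rng = np.random.default_rng(seed)
    freqs = np.arange(8.0, 12.0 + spacing, spacing)       # components spanning 8-12 Hz
    amps = np.exp(-(freqs - f0) ** 2 / (2 * sigma_f ** 2)) / (sigma_f * np.sqrt(2 * np.pi))
    phases = rng.uniform(-np.pi, np.pi, size=freqs.size)  # random phase offsets psi(i)
    k = np.arange(nk)
    # j(k) = sum_i A(i) sin(2 pi f(i) k dt + psi(i))
    return (amps[:, None] * np.sin(2 * np.pi * freqs[:, None] * k * dt + phases[:, None])).sum(axis=0)

j = simulated_current(512)   # 2 s at 256 Hz
```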

The spatial distribution of the simulated source is modeled by the following 3-D Gaussian function:

    B(v_a) = \frac{1}{(2\pi)^{3/2} |\Omega|^{1/2}} \exp\left[ -\frac{(V_a - V_c)^T \Omega^{-1} (V_a - V_c)}{2} \right]    (38)

centered at voxel vc, with coordinates Vc = (xc, yc, zc)^T, and evaluated at each voxel va in the activation zone, with coordinates Va = (xa, ya, za)^T. The activation zone comprises the gray matter voxels within a certain radius of vc; elsewhere, B = 0. The spatial Gaussian's covariance matrix is Ω = σ²s I3, where σ²s is the variance. Finally, to produce the simulated current densities, the current density (36) is multiplied by the spatial coefficient mask (38).

Fig. 1. Two seconds of simulated EEG data generated for the 10–20 system (Pz is omitted) sampled at 256 Hz with an activation center in the right occipital lobe. Electrode abbreviations are on the vertical axis. The EEG potential uses the average reference of all 19 electrodes (including Pz).
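The spatial weighting of (38) can be sketched as follows; since Ω = σ_s² I₃ is isotropic, |Ω|^{1/2} = σ_s³ (names and parameter values are illustrative):

```python
import numpy as np

def spatial_mask(voxel_coords, center, sigma_s, radius):
    """Spatial coefficient mask of (38): an isotropic 3-D Gaussian evaluated at
    voxels within `radius` of the activation center v_c, and zero elsewhere."""
    d = voxel_coords - center                          # (Nv, 3) offsets V_a - V_c
    r2 = (d ** 2).sum(axis=1)
    norm = 1.0 / ((2 * np.pi) ** 1.5 * sigma_s ** 3)   # (2 pi)^{3/2} |Omega|^{1/2}
    b = norm * np.exp(-r2 / (2 * sigma_s ** 2))
    b[np.sqrt(r2) > radius] = 0.0                      # restrict to the activation zone
    return b
```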

For our simulated data, we selected an active region centered at the right occipital pole. The full-width half-maximum values for the frequency and spatial Gaussian distributions were 2 Hz (component spacing 0.25 Hz) and 75 mm, respectively (activation zone radius 100 mm). In this simulation, all current vectors were oriented in the z-direction (coronal axis) to maximize the scalp voltages at the occipital electrodes (i.e., O1 and O2). The simulated brain dynamics were then generated for 512 (Nk) time points, assuming a sampling rate of 256 Hz. Two seconds of synthetic EEG data for the 10–20 system were generated from the simulated current densities by multiplication with the LFM, and the average reference was applied. Next, white Gaussian observation noise was added to the data to give an SNR of 20:1 in terms of standard deviations. The resulting EEG data are shown in Fig. 1, and display high-amplitude alpha oscillations in the occipital electrodes, which are largest at O2.
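The forward-generation step can be sketched as follows; the lead field matrix and current densities here are random stand-ins used only to demonstrate the projection, referencing, and noise-scaling operations:

```python
import numpy as np

rng = np.random.default_rng(1)
nk = 512
lfm = rng.normal(size=(19, 3 * 3564))        # stand-in lead field matrix (19 x 3*Nv)
j = rng.normal(size=(3 * 3564, nk))          # stand-in simulated current densities

y = lfm @ j                                  # project sources onto the 19 electrodes
y -= y.mean(axis=0, keepdims=True)           # apply the average reference
noise = rng.normal(size=y.shape) * (y.std() / 20.0)   # SNR 20:1 in standard deviations
y_noisy = y + noise
```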

We begin by applying the full model inverse solution to the simulated data. Using likelihood maximization to estimate the unknown filter parameters yields a1 = 1.85, a2 = −0.91, b1 = −1.88 × 10⁻³, σ²ε = 1.94 × 10⁻³, σ²η = 1.71 × 10⁻⁸, and a minimized AIC = −13 960. The AIC is computed from the 130th time point onward (after ≈500 ms) for all simulations to allow transients to pass.

By looking at the parameters whose values we know from the simulated data, we immediately gain insight into the optimization's performance. The estimated value of the spatial coupling term is very small, in agreement with the simulated data, which contained no spatial interactions (i.e., b1 = 0). The estimated measurement noise covariance σ²ε is also close to the actual value of 2.5 × 10⁻³. These findings provide preliminary support for using AIC minimization to tune the filter.

Fig. 2. Axial slice from the gray matter mask showing the spatial distribution of the coronal component of the local current vectors at a fixed point in time for the simulated data. (a) Original current vectors used in the simulation. (b) Estimated current vectors from the inverse solution.

Fig. 3. Coronal current density component for a voxel in the right occipital pole [(a) and (c)] and the right medial frontal lobe [(b) and (d)] versus time for simulated data. Frames (a) and (b) display the simulated current vectors, while (c) and (d) show the estimated currents from the inverse solution. Solid lines represent the simulated/estimated currents, while dashed lines indicate 95% confidence intervals.

Figs. 2 and 3 illustrate the inverse solution for the simulated data. Fig. 2 shows the spatial distribution of the current's coronal component when the activation center is maximal. We display the coronal component in Figs. 2 and 3 because the simulated current vectors were restricted to this direction. We see that the algorithm correctly locates the region of alpha activity and its approximate spatial extent, but slightly underestimates the current densities.

Fig. 3 shows the time series of the coronal current density component for the simulated data and the inverse solution at two voxels: one in the right occipital pole, at the center of the alpha activity, and the other in the right medial frontal lobe, where no simulated activity was present. At the occipital voxel, the simulated current exhibits a large alpha oscillation. This is accurately reconstructed by the inverse solution but, as observed in the spatial data, the current amplitude is marginally underestimated. The frontal voxel is inactive during the simulation (current density is zero throughout). This lack of activity is also identified by the algorithm: only a very low-amplitude oscillation, which lies inside the error interval, is present in the estimated time series.

Fig. 4. Innovation sequences for the simulated data shown in Fig. 1. The vertical spacing between each channel has been decreased by a factor of nine relative to Fig. 1.

In Fig. 4, the innovation sequence for each electrode is plotted; these should be white in a properly tuned KF (see Section V). Here, we see that once filter transients pass, most innovation sequences are near-white, aside from a small alpha oscillation present in some channels. Even in the occipital electrodes, where the remaining alpha activity is more pronounced, it is significantly reduced in magnitude relative to the data (by a factor of ≈10).

We then computed the inverse solution for the case of no spatial coupling, which gave almost identical parameters (a1 = 1.85, a2 = −0.91, σ²ε = 1.94 × 10⁻³, σ²η = 1.72 × 10⁻⁸), AIC (−13 962), estimated current density, and innovation values. This is expected, as the simulated data assumed no spatial interaction between voxels. These results imply that in both cases, a well-tuned filter and an accurate process model have been selected that satisfactorily describe the simulated alpha resonance.

Finally, we examined the effect of setting the dynamical model to a random walk, which reduces the process model to a temporal smoothness constraint and forces the filter to rely largely on the observations and the estimation step. However, the optimization step was unable to find a minimized AIC with the necessary features: realistic noise covariance values, correspondence to a well-tuned filter, a steady state reached by the end of the dataset, and an accurate inverse solution. Therefore, the only comments we make about this inverse solution are: 1) the random walk process model functions very poorly for this dataset, resulting in a parameter space in which no optimal AIC value exists that corresponds to a well-tuned KF; and 2) this finding implies that the temporal component of the process model is required for the filter to operate soundly. For these reasons, the random walk case is excluded from further analysis, both here and in modeling the clinical data.

Fig. 5. Two seconds of awake, eyes-closed EEG recorded from a child aged 8.5 years, in the same format as Fig. 1.

B. Clinical EEG Recording

We now estimate inverse solutions for a time series chosen from a normal EEG recording collected during routine clinical practice (Neuropediatric Clinic, University of Kiel). The data were recorded from a healthy male child aged 8.5 years, in an awake resting state with eyes closed. Electrodes were placed according to the 10–20 system and the data were collected at a sampling rate of 256 Hz. The resolution of the A/D conversion was 12 bit. A 2-s time series was selected from the recording for analysis and is shown, using the average reference, in Fig. 5; it displays characteristic alpha oscillations that are most prominent in the occipital electrodes and attenuate from posterior to anterior.

The full model inverse solution for this EEG recording was computed using likelihood maximization, selecting the following filter parameter values: a1 = 1.60; a2 = −0.65; b1 = 3.08 × 10⁻²; σ²ε = 2.17 × 10⁻¹¹; σ²η = 7.00 × 10⁻⁷. The minimized AIC = −2922, and is calculated from the 20th time step onward (after ≈75 ms) for all three process models.

Fig. 6 shows the spatial distribution of the inverse solution at a fixed moment in time. It shows an area of activity at the right occipital pole, as expected for prominent occipital alpha activity. Fig. 7 displays the time series of the coronal component of the inverse solution at two voxels: one in the right occipital pole and the other in the right medial frontal lobe. Once again, as expected, we see a large-amplitude alpha oscillation in the occipital voxel's time series and very little activity in the frontal voxel. These observations are consistent with an eyes-closed EEG recording.

Innovation sequences are shown in Fig. 8. While the existing dynamical model explains much of the data structure, some of the dynamics remain uncaptured, especially alpha activity in the occipital electrodes. Relative to the data, these oscillations are approximately five times smaller in magnitude, but twice the size of the corresponding ones in the simulated data.

The inverse solution was then recomputed with spatial interactions removed; the parameters (a1 = 1.61, a2 = −0.66, σ²ε = 4.77 × 10⁻¹¹, σ²η = 6.49 × 10⁻⁷), AIC = −2645, reconstructed currents, and innovations remained essentially unchanged. This indicates that the spatial coupling term makes only a small contribution to the inverse solution.

Fig. 6. Axial slice from the gray matter mask showing the spatial distribution of the coronal component of the local current vectors at a fixed point in time for the clinical recording, as estimated by the inverse solution.

Fig. 7. Estimated coronal current density component for a voxel in (a) the right occipital pole and (b) the right medial frontal lobe versus time for the clinical recording. Solid lines represent estimated current vectors from the inverse solution, while dashed lines indicate 95% confidence intervals.

Fig. 8. Innovation sequences for the clinical data shown in Fig. 5. The vertical spacing between each channel has been decreased by a factor of five relative to Fig. 5.

V. ANALYSIS OF FILTER PERFORMANCE

This section focuses on evaluating the performance of the KF itself. In any real application, validating and optimizing filter performance is difficult because, unlike in a simulation study, the true states are unknown and the only information available is contained in the observations of the state. As a result, analysis of the innovation is the principal means of evaluating KF performance [19]. The AIC, which is a function of the innovation and its covariance, has already been used as a relative measure, but we now apply a series of standard diagnostic tests widely used for objectively evaluating and tuning KF performance [19]. These are applied after the likelihood maximization step, and allow us to determine the overall (rather than relative) quality of the filter, something that is difficult to ascertain from the AIC alone. The tests focus on the properties of the innovation sequence, which should be normally distributed, unbiased (zero mean), uncorrelated (white), and of the correct magnitude (i.e., the actual and filter-predicted innovation covariances should be the same). The testing procedure follows the recommendations of the standard reference [19], which are similar to many diagnostic tests described in the literature, and consists of the following five steps.

1) Using a single-sample Kolmogorov–Smirnov (KS) goodness-of-fit hypothesis test [47], we determine whether the innovation sequence is normally distributed (innovation non-Gaussian if P < 0.05). The tests in steps 2) and 3) assume that the innovation is Gaussian.

2) To determine whether the innovation is unbiased, a one-sample t-test [48] is used (innovation has nonzero mean if P < 0.05). The innovation must be unbiased for steps 3) and 4).

3) We determine whether the actual and filter-predicted innovation covariances are the same. A mismatch indicates that overall filter noise levels (i.e., the process and/or observation noise covariances) have been set incorrectly, which can degrade filter performance, and requires further analysis to ascertain its cause. An inaccurate process model can also result in discordance between the innovation covariances. Assuming 1) and 2) hold, noise levels can be investigated by checking that approximately 95% of innovation values lie within two standard deviations of zero. A more precise means of assessing filter noise levels, again requiring that 1) and 2) hold, is to carry out a χ²-test on the normalized square of the innovation

    \Delta Y_N(c, k) = \frac{[\Delta Y(c, k)]^2}{R_c(k|k-1)}    (39)

where Rc(k|k−1) denotes the innovation covariance for channel c at time k. If 1) and 2) hold, ΔYN will be a χ² random variable (resulting from squaring a Gaussian random variable), with a mean of 1 if the actual and filter-predicted covariances match. To compare these, a test statistic

    \overline{\Delta Y}_N(c) = \frac{1}{N_k} \sum_{k=1}^{N_k} \Delta Y_N(c, k)    (40)

for each channel is used; hence, we obtain 95% confidence intervals from which we can determine whether the noise levels are correct. If the average normalized innovation lies below these bounds, the assumed noise levels are too high, and vice versa.


4) The innovation sequence should be white. Any correlations are due to the presence of unmodeled process dynamics or a ratio of observation to process noise that is too high. We computed the innovation's time-averaged (biased) autocorrelation r for each channel c by calculating the inverse Fourier transform (FT) of the power spectral density (PSD) [20]. The resulting autocorrelation is equivalent to that obtained via the following time-domain equation [20]:

    r(c, \tau) = \frac{1}{N_k - \tau} \sum_{i=1}^{N_k - \tau} \Delta Y(c, i) \, \Delta Y(c, i + \tau)    (41)

where τ indicates a discrete time shift ranging from 0 to Nk − 1. The autocorrelation was normalized, so r(c, 0) = 1. The number of points in the autocorrelation is halved due to the FT and is denoted by Na. The autocorrelation for each channel can be used as a test statistic, which should be approximately normally distributed with a mean of 0 and standard deviation 1/√Na if the innovation is unbiased and white. Therefore, we can approximate the 95% confidence intervals by ±2/√Na. If approximately 95% of the autocorrelation does not lie within these bounds, then the innovation is not white.

Finally, we calculated the innovation PSD S for each channel c using Welch's method [49]. The PSD allows the innovation's frequency content to be examined, and should be flat for an uncorrelated signal [21]. From these PSDs, we computed the spectral entropy HS [50] for each channel c:

    H_S(c) = -\frac{\sum_{f=1}^{N_f} S(c, f) \log S(c, f)}{\log N_f}    (42)

where f is the frequency bin number and Nf is the number of bins in the PSD. Spectral entropy is a compact measure of a power spectrum's "peakedness" (or, conversely, "flatness"), ranging from zero for a monochromatic signal to one for a completely random one. This is useful for comparing the overall "whiteness" of the innovation sequences between channels and different filter configurations.

5) It can be difficult to distinguish the relative contributionsof process and measurement errors to the innovation, so itis important to look at the error between the state estimateand prediction when evaluating KF performance, becauseit relates only to process errors, and should be approxi-mately uncorrelated and bounded by its covariance.
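Steps 1)–4) can be sketched for a single channel as follows, using standard scipy routines. This is an illustrative implementation under our own choices (the autocorrelation is computed directly in the time domain rather than via the FT of the PSD, the normalization differs slightly from (41), and the function name is ours):

```python
import numpy as np
from scipy import stats, signal

def innovation_diagnostics(dy, r_pred, fs=256.0):
    """Diagnostic tests on one channel's innovation sequence (steps 1-4).

    dy: (Nk,) innovations; r_pred: (Nk,) filter-predicted innovation variances.
    Thresholds follow the text (P < 0.05, 95% intervals).
    """
    nk = dy.size
    out = {}
    # 1) Normality: KS test on the standardized innovation.
    z = (dy - dy.mean()) / dy.std(ddof=1)
    out["gaussian"] = stats.kstest(z, "norm").pvalue >= 0.05
    # 2) Bias: one-sample t-test against zero mean.
    out["unbiased"] = stats.ttest_1samp(dy, 0.0).pvalue >= 0.05
    # 3) Noise levels: time-averaged normalized innovation squared, (39)-(40),
    #    checked against a 95% chi-square interval for the mean of Nk chi2(1) draws.
    nis = dy ** 2 / r_pred
    lo, hi = stats.chi2.ppf([0.025, 0.975], df=nk) / nk
    out["covariance_ok"] = lo <= nis.mean() <= hi
    # 4) Whiteness: normalized autocorrelation within +/- 2/sqrt(Na) bounds,
    #    and spectral entropy of the Welch PSD, cf. (41)-(42).
    ac = np.correlate(dy, dy, mode="full")[nk - 1:]
    ac = ac / ac[0]
    na = nk // 2
    frac_in = np.mean(np.abs(ac[1:na]) <= 2.0 / np.sqrt(na))
    out["white"] = frac_in >= 0.95
    f, psd = signal.welch(dy, fs=fs, nperseg=min(128, nk))
    p = psd / psd.sum()
    out["spectral_entropy"] = float(-(p * np.log(p + 1e-300)).sum() / np.log(p.size))
    return out
```

A well-tuned filter should pass all four boolean checks on most channels, with spectral entropy close to 1.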

We stress that the measurement of KF performance is best thought of as a continuum, spanning from filters that are optimal (i.e., pass the earlier tests by wide margins) to ones that cannot be tuned to provide any useful estimates. The innovation tests here provide a very strict measure of KF performance, and filters that pass them are close to optimal. However, as stressed in [19], many KFs in real applications lie between these two extremes, and while they may not satisfy all of these rigorous tests all of the time, they do provide useful results, as we will see with the filter here. We term such filters "well-tuned." In this situation, the aim of filter tuning is to make the KF as close to optimal as possible, and evaluation of the filter's performance will require a degree of engineering judgement.

Fig. 9. Temporal properties of the innovation sequence at channels O2 [(a) and (c)] and P3 [(b) and (d)] for the simulated data. Frames (a) and (b) display the PSD of the data (solid line) and innovation (dashed line), while (c) and (d) show the autocorrelation of the innovation (solid line) and its 95% confidence bounds (dashed line).

We now apply this procedure to the simulated and clinical EEG inverse mappings described in Section IV. Steps 1)–4) are applied to all 18 channels used by the inverse solution, and step 5) is performed by examining the error between the state estimate and prediction for voxels of interest in each analysis. Due to some intrinsic short-term filter behavior and the filter receiving inaccurate initial values, the performance analysis begins, like the AIC calculation, at the 130th and 20th time steps for the simulated and clinical data, respectively, to allow filter transients to pass.

A. Simulated EEG Recording

When the filter is applied to the simulated data using the full process model, the KS test finds that all 18 innovation sequences are normally distributed, while the t-test shows that all 18 innovations are unbiased. Therefore, we apply steps 3) and 4), but a χ²-test found that the actual innovation covariances did not match the filter-predicted ones at any of the 18 channels. The results at 16 electrode sites indicated that overall filter noise levels were set too high (on average by a factor of 3.0), while the remaining channels (O1 and O2) suggested that noise levels were too low. However, the two occipital channels are discounted, since the actual innovation covariance value is inflated by a small amount of residual alpha activity (see Fig. 4). These tests imply that AIC minimization has selected conservative noise values that will not adversely affect the filter's performance.

Turning our attention to the detection of correlations in the innovation sequences, we computed the PSD of both the recorded data and the innovation at each channel, as well as the innovation's autocorrelation. Fig. 9 shows the PSD and autocorrelation for two typical channels, O2 and P3, and demonstrates that the process model selected through optimization describes the alpha activity present in the simulated data quite well, although a small alpha peak remains in all innovations, particularly at O1 and O2, due to the inverse solution underestimating the source magnitudes. This residual alpha activity means that 17 channels have greater than 10% of their autocorrelation lying outside the 95% confidence bounds. From the data and innovation PSDs, the spectral entropy was calculated for each channel and plotted in Fig. 11(a). This figure shows that the innovation sequences are significantly whiter than the simulated data across all channels. The dip in the innovation curve at O1 and O2 is due to the alpha activity still present at the occipital electrodes.

Fig. 10. Temporal properties of the innovation sequence at channels O2 [(a) and (c)] and F4 [(b) and (d)] for the clinical data. Frames (a) and (b) display the PSD of the data (solid line) and innovation (dashed line), while (c) and (d) show the autocorrelation of the innovation (solid line) and its 95% confidence bounds (dashed line).

We completed our filter evaluation by examining the error between the state estimate and prediction for selected voxels. Some alpha activity was found in this error, mainly in the z-component of voxels at or near the center of the simulated activation. However, this appears to result from the inverse solution spatially blurring the reconstructed activation, and thus underestimating the current densities, rather than from a process model deficiency, and is most likely due to the small number of electrodes. The fact that the predicted and estimated current densities are in phase, along with the data and predicted observations, supports this explanation.

The performance analysis was repeated for the inverse solution computed with no spatial interaction between voxels. Very similar results were obtained, as expected, since the simulated data assumed no spatial coupling.

B. Clinical EEG Recording

The filter performance analysis is now repeated for the clinical data, starting with the full process model. The KS test found that all innovation sequences are Gaussian, and the t-test identified 11 out of 18 channels as unbiased. We then applied the χ²-test, which found that the actual innovation covariances were smaller than the filter-predicted covariances for all 18 channels.

Fig. 11. Spectral entropy of the EEG recording (solid line) and innovation sequence (dashed line) at each channel for (a) the simulated and (b) the clinical data.

This indicates that the filter noise levels, selected via AIC minimization, were set too high (on average by a factor of 4.1). However, as the results for the clinical data demonstrate, these somewhat conservative noise values are compatible with correct filter operation.

We then checked whether the innovations were white by computing the PSD and autocorrelation at each channel. Results are shown for two illustrative channels, O2 and F4, in Fig. 10. They show that the filter can handle the low-pass characteristic of the EEG recording, as the flatter PSD indicates, but a considerable alpha resonance remains in the innovation sequences of some channels, most prominently the occipital electrodes. The autocorrelations confirm this, as they are clearly not white, and all channels have greater than 10% of their autocorrelation outside the confidence intervals. Despite the unmodeled alpha activity, the filter's ability to significantly whiten the innovation relative to the signal is clear when the spectral entropy for each channel is plotted in Fig. 11(b). The small decrease in the innovation spectral entropy is from the alpha activity remaining at the occipital electrodes.

Finally, we examined the error between the state estimate and prediction for a number of voxels. As with the simulated data, we found alpha oscillations present in this time series, particularly in voxels around the occipital poles. However, unlike the simulated data, this appears to result from a deficiency in the process model, as the predicted current density lags the estimated one. More precisely, the process model is unable to reproduce the alpha activity accurately.

The performance analysis was repeated for the process model without spatial coupling. As seen with the inverse solutions themselves in Section IV, the performance of this filter is nearly identical to the one using the full model. This again indicates that while the filter inverts the clinical data quite well, the spatial part of the process model does little to enhance its performance.

C. Preliminary Overview of Filter Performance

The inverse solutions and the validation tests have shown that this tuning technique produces well-tuned filters, although some potential improvements have also become apparent. We found for both simulated and clinical data that the innovations were generally Gaussian and unbiased, and that the optimization step selected slightly conservative values for the noise parameters. Similarly, the process models selected for the two datasets modeled the EEG data satisfactorily, as demonstrated by the spectral entropy increasing by ≈0.3 (between the data and the innovations) across nearly all channels in both datasets (see Fig. 11). However, the correlation analysis and the error between the state estimate and prediction revealed residual alpha activity in the innovation sequences of the simulated and clinical data, but for different reasons. The low-amplitude alpha oscillations present in the simulation innovations (most prominent at O1 and O2) appear to result from the filter underestimating the current density's magnitude, caused by spatial blurring due to the small number of electrodes used. While this issue is still present in the clinical study, the major reason for the (larger) alpha waveforms in the innovations is that the process model is unable to fully capture the alpha resonance; this issue is examined in the next section.

We also found that dropping the spatial coupling from the process model had surprisingly little effect on the filter's performance for the clinical data (as expected for the simulation study). This discovery further indicates that, in its current implementation, the filter's major deficiency most likely lies in the temporal part of the process model, which overshadows any influence the spatial term might have. Alternatively, it is possible that the model's spatial component is inaccurate and the optimization step seeks to remove it from the inverse solution by selecting b1 ≈ 0, or that its impact is nullified by spatial whitening. Further analysis is required to resolve this issue.

Finally, a random walk process model was investigated. This model so significantly degraded performance that an optimal, well-tuned filter could not be found for either dataset. These results imply that the temporal component of the process model, which performs better for the simulated data, is necessary for the filter to function properly. However, the findings of the filter performance analysis, especially for the clinical data, indicate that the modeling part of this estimation technique could be substantially improved. Possible modeling improvements, along with resonant behavior in the process model, are the subject of the next section.

VI. RESONANT BEHAVIOR OF PROCESS MODEL

Here, we investigate the process model's resonant behavior, particularly for the clinical data, where model deficiencies have been identified. We focus on the model's temporal aspects, as Section V revealed that the spatial term has minimal impact on the inverse solution. We begin by obtaining expressions for the parameters describing the model's resonant behavior, which provide additional insight into the resonant properties of the inverse solutions generated in Section IV and into why the alpha resonance was modeled better in the simulated data. The inverse solution is then computed for a series of process models, each containing an explicit alpha resonance to capture the posterior alpha activity present in the clinical data. Suggestions for potential future improvements to both the dynamical model and the filter algorithm are then made.

A. Resonant Process Model

The equations for the temporal AR parameters, (10) and (11), can be manipulated to give expressions for ωn and ζ as functions of a1 and a2:

    \omega_n = \sqrt{\frac{2(1 - a_1 - a_2)}{(\Delta t)^2 (1 - a_2)}}    (43)

    \zeta = -\left( \frac{1 + a_2}{1 - a_2} \right) \sqrt{\frac{1 - a_2}{2(1 - a_1 - a_2)}}.    (44)

These expressions convert a1 and a2 into parameters from the original telegrapher's equation that have a clear physical interpretation, and allow us to better characterize the resonant behavior of the process model selected by likelihood maximization.

Fig. 12. PSD of the process model's temporal component as selected by likelihood maximization for the simulated (solid line) and clinical (dashed line) datasets.
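Equations (43) and (44) can be checked numerically with the clinical-data AR estimates from Section IV. This sketch is ours; note that the sign of ζ depends on convention, so we compare magnitudes:

```python
import numpy as np

def ar_to_resonance(a1, a2, dt=1.0 / 256):
    """Map the temporal AR parameters to natural frequency and damping, (43)-(44)."""
    wn = np.sqrt(2 * (1 - a1 - a2) / (dt ** 2 * (1 - a2)))                    # (43)
    zeta = -((1 + a2) / (1 - a2)) * np.sqrt((1 - a2) / (2 * (1 - a1 - a2)))   # (44)
    return wn, zeta

wn, zeta = ar_to_resonance(1.60, -0.65)   # clinical-data AR estimates
f_hz = wn / (2 * np.pi)                   # ~10 Hz; |zeta| ~ 0.86, i.e., heavily damped
```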

From the simulated data model parameters, we find ωn = 62.2 s⁻¹ (9.9 Hz) and ζ = 0.20, while ωn = 63.5 s⁻¹ (10.1 Hz) and ζ = 0.86 for the clinical data. To further illustrate the model's temporal characteristics, the frequency response of the model's temporal component is shown in Fig. 12 for both datasets. The process model selected for the simulation has resonant properties that closely match the data itself, as the estimated natural frequency of 9.9 Hz lies near the center of the Gaussian frequency envelope and ζ is also close to its actual value of zero for undamped sine waves. As expected, the model has a sharp resonance at ≈10 Hz, which can be seen in Fig. 12. However, the process model for the clinical data has ζ ≈ 1, so it displays no discernible resonant behavior, as is clear from the absence of any peak in its frequency response in Fig. 12.

If we recall that the AIC minimization step identifies the global (space- and time-invariant) process model that best explains the data, these results are to be expected. In the case of the simulated data, which were generated by a single source centered in the right occipital lobe, the dynamics can be described sufficiently by a single, globally resonant process model. This is not the case for the clinical data, where the recording's dominant alpha resonance shows considerable spatial dependence (diminishing in amplitude frontally) that cannot be accurately captured by a spatially and temporally uniform process model. Instead, the optimization step identified the one global feature of the clinical EEG: its low-pass filter characteristic. As a result, the alpha activity seen in the innovation of some channels (e.g., O1 and O2) is expected, as the alpha resonance is unmodeled.

B. Inverse Solution With Explicit Resonance

With the selection of appropriate values for ωn and ζ, theprocess model can describe resonant features of the EEG, as


Fig. 13. Innovation spectral entropy as the damping coefficient in the process model is varied (ζ = 0.2, 0.3, 0.4, 0.5) at a fixed natural frequency (9 Hz) for the clinical data. The spectral entropy for the optimized filter innovations and the clinical data itself are also shown, as labeled.

seen for the simulated data. However, because the process models are space- and time-invariant, the parameter estimation step selected a nonresonant model for the clinical data, despite the presence of posterior alpha activity. We now examine the effect on filter performance of applying a process model with an alpha resonance to the clinical data, to provide further insight into how the filter could be improved.

We began by fixing the process model's natural frequency at ωn = 56.5 s−1 (9 Hz) to match the alpha frequency at O2. The strength of the resonance was varied across four filter runs by setting ζ = 0.2, 0.3, 0.4, and 0.5. To find the optimal filter for each run, AIC minimization was used. Because the spatial term had minimal impact on the inverse solutions for the clinical data, it was ignored (i.e., b1 = 0), so only the two noise covariances needed to be estimated. To allow transient filter behavior to pass, the AIC was calculated from the 130th time step onward in each case. For comparison, the full-model filter was reoptimized over this segment of clinical data, giving ωn = 36.4 s−1 (5.8 Hz), ζ = 0.75, b1 = 1.11 × 10−2, σ²ε = 1.27 × 10−9, σ²η = 1.25 × 10−7, and AIC = −3307.

We find that introducing an explicit resonance degrades the filter's overall performance; for instance, AIC values increased from −2091 to 2408 as ζ was decreased from 0.5 to 0.2. The spectral entropies of the innovations, plotted for each run in Fig. 13, tell a similar story: they generally decrease (innovations become less white) as the model is made more resonant, although resonant behavior does, up to a point, marginally whiten the innovation of channel O2, the site with the most significant alpha activity. This is shown in Fig. 13, where the spectral entropy increases slightly in channel O2 for ζ = 0.3 and 0.4. However, this small improvement comes at considerable cost: as ζ decreases, the global resonance distorts the modeling of the low (subalpha) frequencies, which can be seen in the innovations of the other channels, and results in poorer filter performance. This is not surprising, since the alpha resonance is not present at all sites in the clinical data.
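The spectral entropy used in Fig. 13 can be computed by normalizing the power spectrum into a probability distribution and taking its Shannon entropy, scaled so that a flat (white) spectrum scores 1 [50]. The following is a minimal sketch of the idea on synthetic signals (a plain periodogram is used for brevity; the paper's implementation follows Welch's method [49]):

```python
import numpy as np

def spectral_entropy(x):
    """Normalized spectral entropy: near 1 for a white (flat) spectrum,
    much lower when power is concentrated at a single frequency."""
    pxx = np.abs(np.fft.rfft(x - x.mean()))**2
    p = pxx / pxx.sum()          # normalize the PSD to a probability mass
    p = p[p > 0]                 # drop empty bins to avoid log(0)
    return float(-np.sum(p * np.log(p)) / np.log(pxx.size))

rng = np.random.default_rng(0)
fs = 200.0                                        # sampling rate (Hz)
t = np.arange(2000) / fs
white = rng.standard_normal(t.size)               # white innovation sequence
alpha = np.sin(2 * np.pi * 9.0 * t) + 0.1 * rng.standard_normal(t.size)

se_white = spectral_entropy(white)   # close to 1: nothing left unmodeled
se_alpha = spectral_entropy(alpha)   # much lower: dominant 9-Hz component
```

In this picture, unmodeled alpha activity in an innovation sequence shows up as a spectral peak and hence a reduced spectral entropy, which is exactly how the innovations at O1 and O2 behave above.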

From this analysis, we conclude that, overall, the "optimal" nonresonant process model outperforms the model with an explicit uniform alpha resonance, as expected, although neither of these space- and time-invariant models can accurately describe the spatiotemporal complexity of clinical EEG data. Therefore, improved dynamical models are required.

C. Future Directions

Guided by our filter analysis, we now discuss options for improving the inverse solution, focusing particularly on the process model. The first issue is what form the dynamical model should take. Given that resonant behavior is a key feature of EEG data, the existing telegrapher's equation, which contains a single resonance, would be a reasonable choice, although physiology-based models of brain dynamics, such as those presented in [26] and [27], are also attractive, because their estimated parameters are physiologically meaningful, increasing the information provided by the inverse solution. Furthermore, these models could better describe the spatial interactions between voxels, an issue that warrants further investigation.

As noted, a uniform global model of brain dynamics is unrealistic, so regardless of which form of model is selected, its behavior will require spatiotemporal variation, e.g., to model the spatial properties of the alpha rhythm seen in the clinical data. This issue was previously investigated for this filter using generalized autoregressive conditional heteroskedasticity (GARCH) modeling of covariance [29], which was found to enhance performance. In that study, the same homogeneous model was used, but the process noise (which measures our confidence in the dynamical model) was allowed to vary in space and time as a function of how well the process model was performing at a particular voxel. We propose the alternative approach of letting the process model parameters (e.g., ζ and ωn) vary in space and time, which opens the possibility of parametric imaging, but poses a more difficult parameter estimation task than in this paper. However, for systems whose spatial variation is described by a relatively small number of parameters, it may be possible to apply the likelihood maximization technique used here. Another strategy is to estimate the parameters within the KF itself, with the added benefit that the state covariance matrix then provides the quality of each parameter estimate. This produces a nonlinear filtering problem that can be solved using algorithms such as the extended KF (EKF) [19] or the unscented KF (UKF) [51]. In other fields, both the EKF [52], [53] and the UKF [54] have been successfully applied to system identification for spatiotemporal systems modeled by partial differential equations. Another option, which also permits estimation inside a KF, is to extend the state space and model each parameter as a Gaussian random field [55]. However, as discussed in [28], regardless of which technique is used, the estimation problem must be observable [19] (i.e., the states must be estimable from the measurements), which is of increasing concern as the number of quantities estimated from the data rises.
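For a linear model, observability is a standard check: the pair (F, H) of state-transition and observation matrices is observable if and only if the observability matrix has full column rank. The sketch below applies this generic linear-systems test to a toy two-state system (hypothetical matrices, not the paper's filter):

```python
import numpy as np

def is_observable(F, H):
    """True iff the pair (F, H) is observable, i.e., the observability
    matrix [H; HF; ...; HF^(n-1)] has full column rank n."""
    n = F.shape[0]
    O = np.vstack([H @ np.linalg.matrix_power(F, k) for k in range(n)])
    return bool(np.linalg.matrix_rank(O) == n)

# Toy discrete-time oscillator observed through its first state only:
# the coupling (-0.5 term) lets the measurement "see" the second state.
F = np.array([[1.0, 0.01],
              [-0.5, 0.99]])
H = np.array([[1.0, 0.0]])

# A decoupled second mode that never reaches the measurement is unobservable.
F2 = np.array([[1.0, 0.0],
               [0.0, 0.5]])
H2 = np.array([[1.0, 0.0]])
```

Adding estimated parameters to the state vector enlarges F, so such a rank check becomes more demanding as the number of quantities estimated from the data rises.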


A difficulty inherent to the EEG inverse problem, and of particular significance to the KF, is its high dimensionality. Typically, this necessitates simplification of the filter algorithm to reduce memory consumption and achieve practical run times. Here, spatial whitening is used for this purpose, reducing the algorithm to a set of low-dimensional KFs. A filter that operates in the untransformed state space would offer two key advantages: 1) removal of any distortions introduced by the "strong" whitening transformation, allowing its effect on the inverse solution to be properly assessed, and 2) a process model that describes the state of interest, i.e., the current density, rather than its second spatial derivative, which is especially important for physiology-derived models. The gap between the full filter and the single-voxel-centered, spatially whitened version could potentially be bridged by partitioned filters [54]. These filters divide the state space into local filtering neighborhoods, and allow the tradeoff between computation time/memory usage and filter performance to be examined.

Finally, it is worth mentioning that the inverse solution could be further constrained by introducing additional information. High-density EEG recordings could be used to provide extra observations (up to 256 channels), which would improve the spatial resolution of the inverse mappings. When the inverse solution is computed offline, the Kalman smoother [21] becomes available. This algorithm uses all available data, past and future, to compute each estimate, and has recently been applied to the EEG inverse problem [30], [31]. Using the KF to fuse EEG data with other imaging modalities, particularly fMRI, is a natural extension of this study [19], [31], [32], which could improve spatial resolution beyond what is possible with EEG alone.

VII. SUMMARY AND CONCLUSION

We have investigated the application of dynamical inverse solutions to EEG source localization. Dynamical techniques are of particular interest because they provide a natural framework for introducing the growing number of models describing brain dynamics [26], [27] into inverse solutions. The KF is an example of a model-based estimation technique that is well suited to solving inverse problems, but has only lately been applied in this field. Motivated by its potential, we introduced a recently proposed KF-based source localization technique [28]. The key features of this algorithm are as follows: the process model is a space- and time-invariant telegrapher's equation, a spatial whitening transformation is used to reduce its computational burden, and the filter is tuned using likelihood maximization.
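For a linear-Gaussian state-space model, the likelihood maximized during tuning is available directly from the filter's innovations νk and their covariances Sk via the prediction-error decomposition, log L = −(1/2) Σk [m log 2π + log det Sk + νkᵀ Sk⁻¹ νk], with AIC = −2 log L + 2p for p estimated parameters [43]. The scalar sketch below (a toy AR(1) stand-in, not the paper's spatiotemporal filter) illustrates how the AIC discriminates between candidate parameter values:

```python
import numpy as np

def kf_loglik(y, a, q, r):
    """Innovation-form log-likelihood of the scalar state-space model
        x_k = a x_{k-1} + w,  w ~ N(0, q);   y_k = x_k + v,  v ~ N(0, r).
    A toy illustration of the prediction-error decomposition used for tuning."""
    x, P = 0.0, 1.0
    ll = 0.0
    for yk in y:
        x_pred = a * x
        P_pred = a * a * P + q
        S = P_pred + r                # innovation covariance
        nu = yk - x_pred              # innovation
        ll += -0.5 * (np.log(2 * np.pi * S) + nu * nu / S)
        K = P_pred / S                # Kalman gain
        x = x_pred + K * nu
        P = (1.0 - K) * P_pred
    return ll

rng = np.random.default_rng(1)
x, y = 0.0, []                        # simulate data from a = 0.9
for _ in range(500):
    x = 0.9 * x + rng.normal(scale=1.0)
    y.append(x + rng.normal(scale=0.5))
y = np.asarray(y)

# AIC = -2 log L + 2 * (number of estimated parameters); here p = 3.
aic_true = -2 * kf_loglik(y, 0.9, 1.0, 0.25) + 2 * 3
aic_wrong = -2 * kf_loglik(y, 0.1, 1.0, 0.25) + 2 * 3
```

The parameter set that matches the generating dynamics yields the smaller AIC, which is the principle behind the AIC minimization step.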

Inverse solutions for simulated and clinical data, both containing alpha activity in the occipital lobe, were computed and presented for three process models. The optimized filters were then analyzed in detail using standard diagnostic tests for evaluating KF performance. Following this, the resonant properties of the process model were examined and the effect of introducing an explicit alpha resonance into the filter was explored for the clinical data. The major findings are as follows:

1) The AIC minimization step selects appropriate model parameters and noise covariances, which result in a well-tuned filter, as indicated by the reconstructed current densities and diagnostics. This shows that likelihood maximization is effective for filter tuning, but tuning still requires an appropriate process model to be chosen. For instance, the simulated and clinical data could be modeled by either the full model or the model without spatial coupling, although these models performed better for the simulated data, and the spatial term made only a small contribution to the clinical data's inverse solution. In contrast, a random walk model could not be optimized for either dataset.

2) The process model is a telegrapher's equation, which contains a resonance whose properties were examined. It was found that AIC minimization, which finds the space- and time-invariant model that best describes the data, selected a process model containing an alpha resonance for the simulated, but not the clinical, EEG. This makes sense, as the model chosen should capture any spatially and temporally uniform features of the time series, which for the simulated data are the alpha activity (the only salient feature), and for the clinical recording is the low-pass characteristic. Thus, these findings explain why: 1) the innovations for the clinical data (especially the occipital electrodes) contain unmodeled alpha activity of a higher magnitude than in the simulated EEG, and 2) the predicted and estimated current densities are out of phase only in the clinical study.

3) The introduction of an explicit alpha resonance into the process model for the clinical data degraded filter performance. This is due to a mismatch between the globally resonant dynamical model and the data, in which the alpha activity is confined predominantly to the occipital electrodes. However, the introduction of a resonance did improve the modeling of the posterior alpha rhythm.

4) We demonstrated the utility of applying a battery of diagnostic tests to this KF, as they provide numerous insights into filter performance and a means of validating the parameters selected by likelihood maximization. This step is very important because a minimized AIC does not necessarily correspond to a well-tuned filter.
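As one concrete example of such a diagnostic, the whiteness of an innovation sequence can be checked by verifying that its sample autocorrelation at nonzero lags stays within the approximate 95% band ±1.96/√N [19]. A minimal sketch on synthetic sequences (illustrative only; not the exact tests used in this paper):

```python
import numpy as np

def whiteness_fraction(nu, max_lag=20):
    """Fraction of lags 1..max_lag whose normalized sample autocorrelation
    lies inside the ~95% whiteness band +/-1.96/sqrt(N). Values near 1
    indicate white innovations; low values indicate unmodeled dynamics."""
    nu = nu - nu.mean()
    N = len(nu)
    denom = np.sum(nu * nu)
    rho = np.array([np.sum(nu[:N - k] * nu[k:]) / denom
                    for k in range(1, max_lag + 1)])
    bound = 1.96 / np.sqrt(N)
    return float(np.mean(np.abs(rho) <= bound))

rng = np.random.default_rng(2)
white = rng.standard_normal(5000)                  # well-tuned filter's innovation
colored = np.convolve(white, np.ones(10) / 10.0,   # low-pass residue from
                      mode="same")                 # unmodeled dynamics
```

A well-tuned filter's innovations keep most lags inside the band, while correlated innovations, such as those produced by unmodeled alpha activity, fail the check.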

From these results, a number of potentially rewarding future directions were identified, focusing on selecting an appropriate process model, the need for spatiotemporal variation of model parameters, handling the problem's high dimensionality, and introducing additional information to further constrain the inverse solution.

ACKNOWLEDGMENT

The authors would like to thank C. J. Rennie, T. A. Bailey, and P. M. Drysdale for stimulating discussions.

REFERENCES

[1] A. W. Toga and J. C. Mazziotta, Eds., Brain Mapping: The Methods, 2nd ed. London, U.K.: Academic, 2002.

[2] S. Baillet, J. C. Mosher, and R. M. Leahy, "Electromagnetic brain mapping," IEEE Signal Process. Mag., vol. 18, no. 6, pp. 14–30, Nov. 2001.

[3] C. M. Michel, M. M. Murray, G. Lantz, S. Gonzalez, L. Spinelli, and R. G. de Peralta, "EEG source imaging," Clin. Neurophysiol., vol. 115, pp. 2195–2222, 2004.


[4] M. Scherg and D. von Cramon, "Two bilateral sources of the late AEP as identified by a spatio-temporal dipole model," Electroencephalogr. Clin. Neurophysiol., vol. 62, pp. 32–44, 1985.

[5] B. D. Van Veen, W. van Drongelen, M. Yuchtman, and A. Suzuki, "Localization of brain electrical activity via linearly constrained minimum variance spatial filtering," IEEE Trans. Biomed. Eng., vol. 44, no. 9, pp. 867–880, Sep. 1997.

[6] J. C. Mosher, P. S. Lewis, and R. M. Leahy, "Multiple dipole modeling and localization from spatio-temporal MEG data," IEEE Trans. Biomed. Eng., vol. 39, no. 6, pp. 541–557, Jun. 1992.

[7] H. Liu and P. H. Schimpf, "Efficient localization of synchronous EEG source activities using a modified RAP-MUSIC algorithm," IEEE Trans. Biomed. Eng., vol. 53, no. 4, pp. 652–661, Apr. 2006.

[8] P. L. Nunez and R. Srinivasan, Electric Fields of the Brain: The Neurophysics of EEG, 2nd ed. New York: Oxford Univ. Press, 2006.

[9] M. S. Hamalainen and R. J. Ilmoniemi, "Interpreting magnetic fields of the brain—Minimum norm estimates," Med. Biol. Eng. Comput., vol. 32, pp. 35–42, 1994.

[10] R. D. Pascual-Marqui, C. M. Michel, and D. Lehmann, "Low resolution electromagnetic tomography: A new method for localizing electrical activity in the brain," Int. J. Psychophysiol., vol. 18, pp. 49–65, 1994.

[11] I. F. Gorodnitsky, J. S. George, and B. D. Rao, "Neuromagnetic source imaging with FOCUSS: A recursive weighted minimum norm algorithm," Electroencephalogr. Clin. Neurophysiol., vol. 95, pp. 231–251, 1995.

[12] F. Darvas, U. Schmitt, A. K. Louis, M. Fuchs, G. Knoll, and H. Buchner, "Spatio-temporal current density reconstruction (stCDR) from EEG/MEG-data," Brain Topogr., vol. 13, pp. 195–207, 2001.

[13] U. Schmitt, A. K. Louis, C. Wolters, and M. Vauhkonen, "Efficient algorithms for the regularization of dynamic inverse problems: II. Applications," Inverse Probl., vol. 18, pp. 659–676, 2002.

[14] S. Baillet and L. Garnero, "A Bayesian approach to introducing anatomo-functional priors in the EEG/MEG inverse problem," IEEE Trans. Biomed. Eng., vol. 44, no. 5, pp. 374–385, May 1997.

[15] J. Daunizeau, J. Mattout, D. Clonda, B. Goulard, H. Benali, and J.-M. Lina, "Bayesian spatio-temporal approach for EEG source reconstruction: Conciliating ECD and distributed models," IEEE Trans. Biomed. Eng., vol. 53, no. 3, pp. 503–516, Mar. 2006.

[16] E. Somersalo, A. Voutilainen, and J. P. Kaipio, "Nonstationary magnetoencephalography by Bayesian filtering of dipole models," Inverse Probl., vol. 19, pp. 1047–1063, 2003.

[17] O. Yamashita, A. Galka, T. Ozaki, R. Biscay, and P. Valdes-Sosa, "Recursive penalized least squares solution for dynamical inverse problems of EEG generation," Hum. Brain Mapp., vol. 21, pp. 221–235, 2004.

[18] S. J. Kiebel, O. David, and K. J. Friston, "Dynamic causal modelling of evoked responses in EEG/MEG with lead field parameterization," NeuroImage, vol. 30, pp. 1273–1284, 2006.

[19] Y. Bar-Shalom, X. Rong Li, and T. Kirubarajan, Estimation with Applications to Tracking and Navigation: Theory Algorithms and Software. New York: Wiley, 2001.

[20] P. S. Maybeck, Stochastic Models, Estimation, and Control (ser. Mathematics in Science and Engineering, vol. 141). New York: Academic, 1979.

[21] M. S. Grewal and A. P. Andrews, Kalman Filtering: Theory and Practice. Englewood Cliffs, NJ: Prentice-Hall, 1993.

[22] J. P. Kaipio, P. A. Karjalainen, E. Somersalo, and M. Vauhkonen, "State estimation in time-varying electrical impedance tomography," Ann. N. Y. Acad. Sci., vol. 873, pp. 430–439, 1999.

[23] M. Kervinen, M. Vauhkonen, J. P. Kaipio, and P. A. Karjalainen, "Time-varying reconstruction in single photon emission computed tomography," Int. J. Imag. Syst. Technol., vol. 14, pp. 186–197, 2004.

[24] S. Prince, V. Kolehmainen, J. P. Kaipio, M. A. Franceschini, D. Boas, and S. R. Arridge, "Time-series estimation of biological factors in optical diffusion tomography," Phys. Med. Biol., vol. 48, pp. 1491–1504, 2003.

[25] S. G. Diamond, T. J. Huppert, V. Kolehmainen, M. A. Franceschini, J. P. Kaipio, S. R. Arridge, and D. A. Boas, "Dynamic physiological modeling for functional diffuse optical tomography," NeuroImage, vol. 30, pp. 88–101, 2006.

[26] P. A. Robinson, C. J. Rennie, D. L. Rowe, S. C. O'Connor, and E. Gordon, "Multiscale brain modelling," Phil. Trans. R. Soc. B, vol. 360, pp. 1043–1050, 2005.

[27] J. W. Kim and P. A. Robinson, "Compact dynamical model of brain activity," Phys. Rev. E, vol. 75, pp. 031 907.1–031 907.10, 2007.

[28] A. Galka, O. Yamashita, T. Ozaki, R. Biscay, and P. Valdes-Sosa, "A solution to the dynamical inverse problem of EEG generation using spatiotemporal Kalman filtering," NeuroImage, vol. 23, pp. 435–453, 2004.

[29] A. Galka, O. Yamashita, and T. Ozaki, "GARCH modelling of covariance in dynamic estimation of inverse solutions," Phys. Lett. A, vol. 333, pp. 261–268, 2004.

[30] C. J. Long, N. U. Desai, M. Hamalainen, S. Temereanca, P. P. Purdon, and E. N. Brown, "A dynamic solution to the ill-conditioned magnetoencephalography (MEG) source localization problem," in Proc. IEEE ISBI, 2006, pp. 225–228.

[31] T. Deneux and O. Faugeras, "EEG-fMRI fusion of non-triggered data using Kalman filtering," in Proc. IEEE ISBI, 2006, pp. 1068–1071.

[32] J. J. Riera, J. C. Jimenez, X. Wan, R. Kawashima, and T. Ozaki, "Nonlinear local electrovascular coupling. II: From data to neuronal masses," Hum. Brain Mapp., vol. 28, pp. 335–354, 2007.

[33] A. S. Willsky, "A survey of design methods for failure detection in dynamic systems," Automatica, vol. 12, pp. 601–611, 1976.

[34] M. Basseville, "Detecting changes in signals and systems—A survey," Automatica, vol. 24, pp. 309–326, 1988.

[35] R. K. Mehra and J. Peschon, "An innovations approach to fault detection and diagnosis in dynamic systems," Automatica, vol. 7, pp. 637–640, 1971.

[36] C. M. Gadzhiev, "New method for checking the statistical characteristics of the innovation sequence of the Kalman filter," Eng. Simulation, vol. 14, pp. 83–91, 1997.

[37] C. Hajiyev, "Innovation approach based measurement error self-correction in dynamic systems," Measurement, vol. 39, pp. 585–593, 2006.

[38] H. H. Jasper, "The ten twenty electrode system of the International Federation," Electroencephalogr. Clin. Neurophysiol., vol. 10, pp. 371–375, 1958.

[39] J. J. Riera, M. E. Fuentes, P. A. Valdes, and Y. Oharriz, "EEG-distributed inverse solutions for a spherical head model," Inverse Probl., vol. 14, pp. 1009–1019, 1998.

[40] A. D. Polyanin, Handbook of Linear Partial Differential Equations for Engineers and Scientists. Boca Raton, FL: CRC, 2002.

[41] F. C. Schweppe, "Evaluation of likelihood functions for Gaussian signals," IEEE Trans. Inf. Theory, vol. IT-11, no. 1, pp. 61–70, Jan. 1965.

[42] R. K. Mehra, "Identification of stochastic linear dynamic systems using Kalman filter representation," AIAA J., vol. 9, pp. 28–31, 1971.

[43] H. Akaike, "A new look at the statistical model identification," IEEE Trans. Autom. Control, vol. AC-19, no. 6, pp. 716–723, Dec. 1974.

[44] K. J. Astrom, "Maximum likelihood and prediction error methods," Automatica, vol. 16, pp. 551–574, 1980.

[45] J. C. Mazziotta, A. W. Toga, A. Evans, P. Fox, and J. Lancaster, "A probabilistic atlas of the human brain: Theory and rationale for its development," NeuroImage, vol. 2, pp. 89–101, 1995.

[46] N. J. Trujillo-Barreto, E. Aubert-Vazquez, and P. A. Valdes-Sosa, "Bayesian model averaging in EEG/MEG imaging," NeuroImage, vol. 21, pp. 1300–1319, 2004.

[47] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes: The Art of Scientific Computing. New York: Cambridge Univ. Press, 1986.

[48] M. C. Phipps and M. P. Quine, A Primer of Statistics: Data Analysis, Probability, Inference, 3rd ed. Sydney, Australia: Prentice-Hall, 1998.

[49] P. D. Welch, "The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms," IEEE Trans. Audio Electroacoust., vol. AU-15, no. 2, pp. 70–73, Jun. 1967.

[50] T. Inouye, K. Shinosaki, H. Sakamoto, S. Toi, S. Ukai, A. Iyama, Y. Katsuda, and M. Hirano, "Quantification of EEG irregularity by use of the entropy of the power spectrum," Electroencephalogr. Clin. Neurophysiol., vol. 79, pp. 204–210, 1991.

[51] S. J. Julier and J. K. Uhlmann, "Unscented filtering and nonlinear estimation," Proc. IEEE, vol. 92, no. 3, pp. 401–422, Mar. 2004.

[52] L. A. Rossi, B. Krishnamachari, and C.-C. J. Kuo, "Distributed parameter estimation for monitoring diffusion phenomena using physical models," in Proc. IEEE SECON, 2004, pp. 460–469.

[53] J. Yin, V. L. Syrmos, and D. Y. Y. Yun, "System identification using nonlinear filtering methods with applications to medical imaging," in Proc. IEEE Decision Control, 2000, pp. 3313–3318.

[54] A. Sitz, J. Kurths, and H. U. Voss, "Identification of nonlinear spatiotemporal systems via partitioned filtering," Phys. Rev. E, vol. 68, pp. 016 202.1–016 202.9, 2003.

[55] N. A. C. Cressie, Statistics for Spatial Data. New York: Wiley, 1991.


Matthew J. Barton received the B.E. (Hons.) degree in mechatronic engineering, in 2002, from the University of Sydney, Sydney, N.S.W., Australia, where he is currently working toward the Ph.D. degree in the School of Physics.

From 2002 to 2004, he was a Researcher with the Brain Dynamics Centre, University of Sydney. His current research interests include using statistical modeling and state-space estimation techniques to solve the EEG inverse problem.

Peter A. Robinson received the Ph.D. degree in theoretical physics from the University of Sydney, Sydney, N.S.W., Australia, in 1987.

He was a Postdoctoral Researcher at the University of Colorado, Boulder, until 1990. In 1994, he joined the permanent staff of the School of Physics, University of Sydney, obtaining a Chair in 2000, and is currently an Australian Research Council Federation Fellow, working on topics including brain dynamics, space physics, plasma theory, and wave dynamics. He is also with the Faculty of Medicine, University of Sydney.

Suresh Kumar received the B.Tech. degree in engineering from the Indian Institute of Technology, Chennai, India, in 1992, and the M.S. and Ph.D. degrees in computational engineering mechanics from the State University of New York, Buffalo, in 1994 and 1997, respectively.

He is currently with the Australian Research Council (ARC) Centre of Excellence for Autonomous Systems, University of Sydney, Sydney, N.S.W., Australia. His current research interests include finite and boundary element methods for the solution of partial differential equations, inverse problems (acoustics, EEG, ECG), and machine learning methods for data representation and interpretation.

Andreas Galka received the Dipl.-Phys. and Dr. rer. nat. degrees from the University of Kiel, Kiel, Germany, in 1994 and 1999, respectively.

From 1994 to 2002, he was a Researcher at the Institute of Applied Physics, University of Kiel, and from 2002 to 2006, a Researcher at the Institute of Statistical Mathematics, Tokyo, Japan. He is currently a Researcher in the Department of Neurology, University of Kiel. His current research interests include time series analysis of EEG/MEG/fMRI/NIRS data.

Hugh F. Durrant-Whyte (S'84–M'86–SM'02–F'06) received the B.Sc. degree in nuclear engineering from the University of London, London, U.K., in 1983, and the M.S.E. and Ph.D. degrees, both in systems engineering, from the University of Pennsylvania, Philadelphia, in 1985 and 1986, respectively.

From 1987 to 1995, he was a University Lecturer in engineering science at the University of Oxford, Oxford, U.K., and a Fellow of Oriel College, Oxford. Since 1995, he has been a Professor of mechatronic engineering at the University of Sydney, Sydney, N.S.W., Australia, where he currently leads the Australian Research Council (ARC) Centre of Excellence for Autonomous Systems. His current research interests include robotics and sensor networks. He has published more than 350 research papers and has won numerous awards and prizes for his work.

Prof. Durrant-Whyte was the recipient of two ARC Federation Fellowships, in 2002 and 2007.

Jose Guivant (M'07) received the Elect. Eng. degree from the Universidad Nacional del Sur, Bahia Blanca, Argentina, in 1993, and the Ph.D. degree in engineering from the University of Sydney, Sydney, N.S.W., Australia, in 2002.

He is currently a Senior Lecturer in the School of Mechanical Engineering, University of New South Wales, Sydney. His research interests include data fusion, nonlinear control, motion planning, perception, and learning algorithms applied to autonomous systems.

Tohru Ozaki received the B.Sc. degree in mathematics from the University of Tokyo, Tokyo, Japan, in 1969, and the D.Sc. degree from the Tokyo Institute of Technology, Tokyo, in 1981.

He is currently a Professor at the Institute of Statistical Mathematics, The Graduate University for Advanced Studies, Tokyo. His current research interests include linear and nonlinear time series analysis with applications to neuroscience and finance engineering.