Unbalance estimation using linear and nonlinear regression


Automatica 46 (2010) 1752–1761

Contents lists available at ScienceDirect

Automatica

journal homepage: www.elsevier.com/locate/automatica

Unbalance estimation using linear and nonlinear regression✩

Peter Nauclér a, Torsten Söderström b,∗

a Ericsson ABB, Stockholm, Sweden
b Department of Information Technology, Uppsala University, Uppsala, Sweden

Article info

Article history:
Received 27 March 2008
Received in revised form 18 December 2009
Accepted 21 June 2010
Available online 1 August 2010

Keywords:
Unbalance estimation
Balancing
Nonlinear regression
Linear regression
Variable projection algorithms

Abstract

This paper considers the problem of unbalance estimation of rotating machinery. It is formulated as a parameter estimation problem, where the unknowns enter nonlinearly in a regression model. By use of a certain method, the problem can be reformulated as a linear estimation procedure with a closed form solution. This procedure is sometimes known as the influence coefficient method. In its derivation, no special treatment is devoted to disturbing terms and imperfections in the model. Therefore, a novel method is derived which takes disturbances into account, leading to a nonlinear estimator.

The two procedures are compared and analyzed with respect to their statistical accuracy. Using the example of unbalance estimation of a separator, the nonlinear approach is shown to give superior performance.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Estimation of mechanical unbalances is an important topic in many applications. Such problems appear in balancing of high speed machinery, where the purpose is to estimate mass unbalances in complex dynamic systems rotating at high speed. Applications include machining tools, aircraft turbine engines (Zhou & Shi, 2001), steam turbines, electric generators (Darlow, 1989), compressors and separators.

In order to counteract the effects of unbalances, they first need to be determined. There are several ways to do this; see for example the surveys by Foiles, Allaire and Gunter (1998) and Zhou and Shi (2001). In this paper we focus on the influence coefficient matrix approach. The method is based on the assumption of linearity of both the machine itself and the measuring system. No detailed modeling of the modal properties of the machine is required. Modal balancing, on the other hand (see for example Tiwari and Chakravarthy (2006)), is based on a detailed modeling of the dynamic properties of the rotating machine.

In the influence coefficient matrix approach, a number of experiments are carried out, where small weights are added or removed at various axial locations and angular positions. These additional weights contribute to the rotating forces of the system. Due to the assumption of linearity, the rotor's vibration amplitude is proportional to the mass unbalance of the rotor. The proportionality coefficients are called the influence coefficients. They are complex functions of the dynamic characteristics of the rotating machine and depend strongly on frequency. The response at some fixed frequency (generally above the critical speed) is recorded. From the measurements in the experiments, the user then has to determine both the influence coefficients and, in particular, the unknown unbalance in the machine.

✩ The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Giuseppe De Nicolao under the direction of Editor Ian R. Petersen.
∗ Corresponding author. Tel.: +46 18 4713075; fax: +46 18 511925.
E-mail addresses: [email protected] (P. Nauclér), [email protected] (T. Söderström).

0005-1098/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.automatica.2010.06.053

This paper is specifically inspired by the problem of separator balancing, but the techniques and the analysis apply to other applications of unbalance estimation as well. Balancing is an important topic in the field of separator technology. The separator bowl rotates at high speed, typically about 5000 revolutions per minute. The resulting centrifugal force is used to separate different substances, e.g. liquids. The use of large centrifugal forces is the core of separation technology. Since the bowl is very heavy and rotates at such a high speed, small mass unbalances create large radial bearing forces that may even be harmful. The magnitudes of these forces can often be tens of kN, i.e. several tonnes. Therefore, the separator must be balanced after manufacturing. When the unbalance estimation is completed, mass corrections are applied to the bowl to counteract the unbalances.

When balancing rotating machines and determining unbalances, different situations can occur. In many cases, as for turbines and crankshafts, there is a considerable mass distribution, and the deformation resulting from the unbalances tends to be distributed and should be described by a flexible model. This is the situation treated in many references, for example, Alauze, Der Hagopian and Gaudiller (2001), Blanco-Ortega, Beltrán-Carbajal, Favela-Contreras and Silva-Navarro (2008), Kang, Tseng, Wang, Chiang and Wang (2003), Schneider (1991), Sinha, Friswell and Lees (2002), Sinha, Lees and Friswell (2004) and Tiwari and Chakravarthy (2006). For separators, on the other hand, the situation is both simpler and more complicated. It is simpler in the sense that the effect of the unbalances can be modeled as that of a rigid body. Hence it is enough to characterize the unbalance effect as force and torque in one point, and it would be sufficient to carry out unbalance experiments by adding trial masses in two planes only. With more experiments, however, a more accurate result can be expected. The situation is also more complicated for separators in the following sense. When a trial mass is added, the separator has to be dismantled and many parts removed, for constructional and geometrical reasons. After the trial mass is mounted, the parts of the separator are put back in place, and this will inevitably cause some parts to be adjusted slightly differently, so that the dynamics change. For the same reason, the change in dynamics will be different for each new experiment.

Fig. 1. A separator model. The stiffnesses are modeled as complex numbers, which is a way to introduce damping in the system. These stiffnesses are subject to change between experiments.

The balancing problem may be modeled as follows. Let $n$ denote the number of sensors used to measure vibrations caused by the mass unbalance. Further, let there be $M$ experiments, where for each new experiment, the added mass $x_k$ is modified. Given this, the estimation problem can be modeled as

$$y_k = A\,(x_0 + x_k), \quad k = 1, \ldots, M, \tag{1}$$

where $y_k \in \mathbb{C}^{n\times 1}$ is a measured variable, $A \in \mathbb{C}^{n\times n}$ is the influence coefficient matrix (unknown), $x_0 \in \mathbb{C}^{n\times 1}$ is the unbalance to be estimated (an unknown variable), and $x_k \in \mathbb{C}^{n\times 1}$ is the added mass in experiment $k$ (a user chosen variable), and where $\mathbb{C}^{n\times m}$ denotes the set of complex valued $n\times m$ matrices. The number of unknowns is apparently $n^2 + n$ and the number of equations is $nM$. As the number of equations must be at least as large as the number of unknowns to guarantee that a feasible solution may exist, we find that the number of experiments must fulfill

$$M \ge n + 1. \tag{2}$$

We are primarily interested in estimating $x_0$ and, therefore, $A$ can be treated as a nuisance variable. For each experiment, the sought variable $x_0$ is invariant.
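To make the bookkeeping behind (1) and (2) concrete, the following minimal sketch generates noise-free responses for a small case and compares the number of unknowns with the number of equations. It assumes NumPy; the function name `simulate_responses` and all numerical values are hypothetical and not taken from the paper.

```python
import numpy as np

def simulate_responses(A, x0, X):
    """Noise-free model (1): column k of the result is y_k = A (x0 + x_k)."""
    return A @ (x0[:, None] + X)

rng = np.random.default_rng(0)
n = 2          # number of sensors
M = n + 1      # smallest number of experiments allowed by (2)

# Hypothetical complex-valued test quantities (illustration only).
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
x0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
X = rng.standard_normal((n, M)) + 1j * rng.standard_normal((n, M))
X[:, 0] = 0    # no trial mass in the first experiment

Y = simulate_responses(A, x0, X)

n_unknowns = n**2 + n   # entries of A plus entries of x0
n_equations = n * M     # one complex equation per sensor and experiment
print(n_unknowns, n_equations)
```

With $n = 2$ and $M = n + 1 = 3$ the two counts coincide, which is the boundary case of (2).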

The procedure is visualized using the separator model shown in Fig. 1. The trial masses (user chosen variables) are applied in two positions of the bowl. The notation $[x]_i$ here means element $i$ of the vector $x$. The separator is then driven up to its speed of operation. The applied masses together with the unknown mass unbalance $x_0$ give rise to a vibrational response, which is measured at two frame positions. The procedure is then repeated for a new set of trial masses, for a total of $M$ experiments.

The matrix $A = A(i\omega_0)$ can be viewed as the frequency response function from the current unbalance state $(x_0 + x_k)$ to the measured harmonic vibrational response $y_k$ at the angular frequency $\omega_0$. The user chosen variable $x_k$ is used to excite the system so that the problem becomes solvable. The reason for performing experiments with a system that operates in stationary rotation is that the relation between measured output and applied input becomes simple. Irrespective of the order of the system (which can be extremely large), the entries of $A$ become scalar complex numbers when the frequency response is evaluated at the single frequency $\omega = \omega_0$. The influence coefficient matrix is a function of the structural properties of the underlying system.

In the present paper, $A$ is assumed to be square. Thus, the number of inputs is equal to the number of outputs. This is a reasonable assumption for the separator problem, since the rotating bowl is considered to be a rigid body and a square system of equations captures the entire vibrational state. This is in contrast to balancing of flexible rotors, where the underlying model is rather a partial differential equation and a large number of sensors may be needed in order to minimize the response over the entire structure. Thus, generally, $A$ can be rectangular. In such circumstances, one should employ the pseudo-inverse of the tall matrix $A$ instead of $A^{-1}$, wherever it appears. The results presented here can then also be applied to balancing applications where more sensors than inputs are desired. Notice that increasing the number of sensors does not necessarily imply improved statistical accuracy in the estimation. The reason is that the number of unknown parameters in $A$ increases at the same rate as the number of additional sensors.

1.1. Existing methods

Equations of the type (1) frequently appear in the literature on balancing of rotating machinery (Darlow, 1989; Foiles et al., 1998; Goodman, 1964; Lund & Tonnesen, 1972). Even though Eq. (1) is nonlinear in the unknowns $A$ and $x_0$, it can be transformed to a linear estimation problem. This is the basis for the influence coefficient method (Darlow, 1987, 1989; Kang, Chang, Tseng, Tang and Chang, 2000; Zhou & Shi, 2001). It is an experimental method that can be implemented in different ways, but the basis is to use $x_1 = 0$ in the first experiment. If disturbances can be neglected, the first measurement becomes

$$y_1 = A x_0, \tag{3}$$

which can be employed to subtract the effects of $x_0$ in the remaining experiments, where $x_k \neq 0$. Then the matrix $A$ can be estimated. When it is considered to be known, it is straightforward to compute an estimate of $x_0$, for example as

$$\hat{x}_0 = \hat{A}^{-1} y_1. \tag{4}$$

The equations needed to carry out such a procedure can be arranged in different ways, but the basics are as described above.

The estimation problem is often treated as a deterministic problem in the literature, leading to a least squares approach for determining $A$. This has been advocated by e.g. Goodman (1964) and Lund and Tonnesen (1972). Such an approach can easily be analyzed also for the case when sensor noise on the measurements has to be taken into account. In such cases the appropriate model should be

$$y_k = A(x_0 + x_k) + e_k, \quad k = 1, \ldots, M, \tag{5}$$

which has been treated in Larsson (1976). Here, an optimal weighting is introduced and a statistical analysis is carried out.

However, sometimes there may be another type of uncertainty as well. The main source of uncertainty can be that the dynamical properties of the system change between experiments. This observation triggered the work in the present paper (Hillström, 2008). There are several reasons for this kind of uncertainty. First of all, in the case of separators, the bowl often needs to be opened in order to apply the trial masses. When doing this, some of the structural properties will change due to play in the bearings etc. Another main source of uncertainty is that different stiffnesses and damping elements seem to change somewhat between experiments. For example, there are rubber damping elements whose properties depend on temperature and the vibrational amplitude. For such cases we need the extended model

$$y_k = \left(A + \tilde{A}_k\right)(x_0 + x_k), \quad k = 1, \ldots, M, \tag{6}$$

where $\tilde{A}_k \in \mathbb{C}^{n\times n}$ is a disturbance. The way that $\tilde{A}_k$ enters the system makes the estimation problem considerably harder to handle than the case when sensor noise is the main random effect.

The model (6) is also considered by Li, Lin, Untaroiu and Allaire (2003), but there the perturbations $\tilde{A}_k$ are assumed to be bounded and deterministic. Also, an estimate of $A$ is assumed to have been obtained beforehand. The unbalance determination problem is formulated as a certain convex optimization problem which includes upper bounds for the perturbations. In contrast, we will in this paper use the model (6) but treat the perturbations as random variables. In addition, no previous estimate of the influence coefficient matrix is needed.

In the current context it is assumed that sensor noise is negligible compared to the system disturbance $\tilde{A}_k$. In fact, the effect of measurement noise has been checked in previous practical studies on separators (Hillström, 2008) and found to be negligible compared to the effects of the varying dynamics from one experiment to another. To the best of our knowledge there is no statistical analysis associated with estimation of (1) and no algorithms proposed that are devoted to a sound statistical treatment of the disturbing variable.

In the separator system, equations of the type (1) can be set up for several angular frequencies. The measured quantity $y_k$ and the matrix $A$ then become functions of frequency, while $x_0$ and $x_k$ are frequency independent. Still, the number of experiments must fulfill (2). Thus, the core of the problem is to be able to perform unbalance estimation at a single frequency, which is considered in this paper.

The paper is organized as follows. The next section contains some preliminary mathematical notation and basic results. Section 3 contains an analysis of a deterministic approach for determining the unbalance based on least squares. The resulting estimate is evaluated for the case when there are random errors in the influence matrix, as in the model (6). Section 4 develops a more advanced method, where the structure of the model (6) is exploited, leading to a nonlinear estimation procedure. The resulting estimate is shown to have much better (statistical) performance. Both statistical and computational aspects of the estimator are analyzed. A detailed numerical example, based on the separator model above, is treated in Section 5, showing again the superior behavior obtained by treating the disturbance terms $\tilde{A}_k$ as random variables. Most details of the statistical analysis of the treated methods are placed in the appendices.

2. Preliminaries

The purpose of this section is to introduce some notation and mathematical tools that will be utilized in the remainder of this paper.

The vec operator is the operator that stacks the columns of a matrix. If $A = [a_1 \; \ldots \; a_n]$, where $a_k$ is column $k$, we define

$$\alpha \triangleq \operatorname{vec}(A) = \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}. \tag{7}$$

Similarly, the vectorized uncertainty matrix is defined as

$$\tilde{\alpha}_k = \operatorname{vec}(\tilde{A}_k) \tag{8}$$

and furthermore the uncertainty vector from all $M$ experiments becomes

$$\tilde{\alpha} = \begin{bmatrix} \tilde{\alpha}_1 \\ \vdots \\ \tilde{\alpha}_M \end{bmatrix}. \tag{9}$$

Before proceeding, an assumption regarding the uncertainty matrix is needed:

Assumption 1. The uncertainty matrix is zero mean and $\tilde{A}_k$ is uncorrelated with $\tilde{A}_l$ for $k \neq l$. It has an associated covariance matrix

$$\mathrm{E}\,\tilde{\alpha}_k \tilde{\alpha}_l^T = R_\alpha \delta_{k,l}, \tag{10}$$

where $\mathrm{E}$ denotes the expectation operator and $\delta_{k,l}$ is the Kronecker delta function.

Recall that the reason for the error $\tilde{A}_k$ in the influence matrix is that the separator is dismantled and rebuilt between tests, each of which uses a new trial mass. Therefore it is reasonable to assume that this type of error is independent from one experiment to another.

The fact that the uncertainty is independent between experiments implies that

$$R_{\alpha,M} \triangleq \operatorname{cov}(\tilde{\alpha}) = I_M \otimes R_\alpha, \tag{11}$$

where $I_M$ is the identity matrix of dimension $M$ and $\otimes$ is the Kronecker product.

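The vec operator has many useful properties. One that will be extensively employed in this context is

$$\operatorname{vec}(ABC) = \left(C^T \otimes A\right)\operatorname{vec}(B). \tag{12}$$

Application of this result to the system equation (6) yields

$$\operatorname{vec}(y_k) = y_k = \left[(x_0 + x_k)^T \otimes I_n\right]\alpha + \left[(x_0 + x_k)^T \otimes I_n\right]\tilde{\alpha}_k. \tag{13}$$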
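The identity (12) and the special case used in (13) are easy to check numerically. The following is a small sketch assuming NumPy; the helper `vec` and the random test matrices are hypothetical. Note that (12) uses the plain transpose, not the conjugate transpose, also for complex matrices.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one long vector (column-major order)."""
    return M.reshape(-1, order="F")

rng = np.random.default_rng(1)
n = 3

def random_complex(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

A, B, C = random_complex(n, n), random_complex(n, n), random_complex(n, n)

# Identity (12): vec(ABC) = (C^T kron A) vec(B).
lhs = vec(A @ B @ C)
rhs = np.kron(C.T, A) @ vec(B)

# Special case behind (13): y = A x = I_n A x, so y = (x^T kron I_n) vec(A).
x = random_complex(n)
y_direct = A @ x
y_kron = np.kron(x[None, :], np.eye(n)) @ vec(A)
```

Here `lhs` and `rhs` agree to rounding error, as do `y_direct` and `y_kron`.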

Let $B(x)$ and $C(x)$ be matrices whose entries are functions of a real valued vector $x$. Furthermore, let $[x]_k$ be the $k$-th element of the vector $x$. Then we define

$$B^{(k)} = \frac{\partial B(x)}{\partial [x]_k}, \qquad B^{(kl)} = \frac{\partial^2 B(x)}{\partial [x]_k\, \partial [x]_l}. \tag{14}$$

For products of matrices the product rule applies,

$$(BC)^{(k)} = B^{(k)}C + BC^{(k)}, \tag{15}$$

where the $x$-argument is dropped for notational convenience. For differentiation of matrix inverses it holds that

$$B^{-(k)} \triangleq \left(B^{-1}\right)^{(k)} = -B^{-1}B^{(k)}B^{-1}. \tag{16}$$
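As a sanity check of (16), the derivative of a matrix inverse can be compared with a central finite difference. The sketch below assumes NumPy; the matrix function `B` is an arbitrary, hypothetical example chosen only because it is invertible at the test point.

```python
import numpy as np

def B(x):
    """Hypothetical smooth 2x2 matrix function of a real vector x."""
    return np.array([[2.0 + x[0] ** 2, x[0] * x[1]],
                     [np.sin(x[1]),    3.0 + x[1]]])

def dB(x, k):
    """Analytic derivative B^(k) of the example above with respect to [x]_k."""
    if k == 0:
        return np.array([[2.0 * x[0], x[1]], [0.0, 0.0]])
    return np.array([[0.0, x[0]], [np.cos(x[1]), 1.0]])

def dBinv_numeric(x, k, h=1e-6):
    """Central finite difference approximation of (B^{-1})^(k)."""
    e = np.zeros_like(x)
    e[k] = h
    return (np.linalg.inv(B(x + e)) - np.linalg.inv(B(x - e))) / (2 * h)

x = np.array([0.3, -0.7])
Binv = np.linalg.inv(B(x))
# Right-hand side of (16) for k = 0 and k = 1.
analytic = [-Binv @ dB(x, k) @ Binv for k in range(2)]
numeric = [dBinv_numeric(x, k) for k in range(2)]
```

The two lists agree up to the finite difference error.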


3. Linear deterministic estimation

One approach to handle the problem of estimating $x_0$ from measured data is to adopt a deterministic viewpoint. Thus, if the effect caused by $\tilde{A}_k$ is neglected, or considered to be insignificant, the model to apply is the one given by (5). It is the basis for two different approaches to estimate $x_0$. The two identification procedures are labeled A1 and A2, respectively. Both these approaches are employed in the balancing industry (Hillström, 2008).

3.1. Approach A1

From the relation (5), the unknown variable $x_0$ can be found using a simple procedure. The first step is to subtract the effects of $x_0$ from (5). This is performed by choosing $x_1 = 0$, which yields

$$y_1 = A x_0 \tag{17}$$

and for the remaining $M - 1$ equations, we form

$$z_k \triangleq y_k - y_1, \quad k = 2, \ldots, M, \tag{18}$$

which yields

$$z_k = A x_k, \quad k = 2, \ldots, M \tag{19}$$

if the disturbance is neglected. Both $z_k$ and $x_k$ are known and therefore it is straightforward to compute an estimate of the nuisance variable $A$. This can be performed in different ways. One option is to apply the vec operator to (19), which gives

$$z_k = \left(x_k^T \otimes I_n\right)\alpha \tag{20}$$

and, upon stacking the experiments in a tall vector,

$$z = \begin{bmatrix} z_2^T & \ldots & z_M^T \end{bmatrix}^T = \Phi_1 \alpha, \tag{21}$$

where

$$\Phi_1 = \begin{bmatrix} x_2^T \otimes I_n \\ \vdots \\ x_M^T \otimes I_n \end{bmatrix}. \tag{22}$$

By use of (21) an estimate of the nuisance variable $A$ can be found. Thereafter it is straightforward to estimate $x_0$ using (17). The two-step procedure becomes:

Step 1: Let $x_1 = 0$ and $x_k \neq 0$ for $k \ge 2$. Form (21) and (22), and compute

$$\hat{\alpha} = \Phi_1^{\dagger} z. \tag{23}$$

Thereafter, form the estimate $\hat{A}$ from $\hat{\alpha}$.

Step 2: Use the first experiment (17) and $\hat{A}$ to estimate $x_0$:

$$\hat{x}_0 = \hat{A}^{-1} y_1. \tag{24}$$

The procedure to estimate unbalances by using (17) in order to linearize the equations is often referred to as the influence coefficient method in the balancing literature (Darlow, 1989; Goodman, 1964; Larsson, 1976). There exist many variants of how to organize the equations (Foiles et al., 1998). Another alternative that is more computationally efficient is to arrange the equations in the first step as

$$\begin{bmatrix} z_2 & \ldots & z_M \end{bmatrix} = A \begin{bmatrix} x_2 & \ldots & x_M \end{bmatrix} \;\Rightarrow\; \hat{A} = \begin{bmatrix} z_2 & \ldots & z_M \end{bmatrix} \begin{bmatrix} x_2 & \ldots & x_M \end{bmatrix}^{\dagger},$$

where $(\cdot)^{\dagger}$ denotes the pseudo-inverse. The two formulations yield the same result, but the one chosen for this paper is more tractable from a statistical analysis point of view.
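The two-step procedure A1 can be sketched in a few lines. This is an illustration only, assuming NumPy; the function names `estimate_a1` and `estimate_A_compact` and the test data are hypothetical. The sketch also compares the Kronecker arrangement (21)–(23) with the compact matrix arrangement on noise-free data.

```python
import numpy as np

def vec(M):
    """Stack the columns of M (column-major order)."""
    return M.reshape(-1, order="F")

def estimate_a1(Y, X):
    """Approach A1. Y and X are (n, M); column k holds y_k and x_k, with x_1 = 0."""
    n = Y.shape[0]
    if not np.allclose(X[:, 0], 0):
        raise ValueError("approach A1 assumes x_1 = 0")
    Z = Y[:, 1:] - Y[:, [0]]                            # z_k = y_k - y_1, (18)
    Phi1 = np.vstack([np.kron(xk[None, :], np.eye(n))   # rows x_k^T kron I_n, (22)
                      for xk in X[:, 1:].T])
    alpha_hat = np.linalg.pinv(Phi1) @ vec(Z)           # Step 1, (23)
    A_hat = alpha_hat.reshape(n, n, order="F")
    x0_hat = np.linalg.solve(A_hat, Y[:, 0])            # Step 2, (24)
    return x0_hat, A_hat

def estimate_A_compact(Y, X):
    """Compact arrangement: A_hat = [z_2 ... z_M] [x_2 ... x_M]^dagger."""
    Z = Y[:, 1:] - Y[:, [0]]
    return Z @ np.linalg.pinv(X[:, 1:])

# Hypothetical noise-free test data.
rng = np.random.default_rng(2)
n, M = 2, 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
x0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
X = rng.standard_normal((n, M)) + 1j * rng.standard_normal((n, M))
X[:, 0] = 0
Y = A @ (x0[:, None] + X)

x0_hat, A_hat = estimate_a1(Y, X)
A_hat_compact = estimate_A_compact(Y, X)
```

Without disturbances both arrangements reproduce $A$ and $x_0$ up to rounding error; with disturbances the estimate inherits the errors analyzed in Section 3.2.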

3.2. Statistical Properties of A1

The statistical analysis is carried out under the following conditions:

Assumption 2. The norm of the stochastic disturbance $\tilde{A}_k$ is small compared to the norm of $A$. This means that the signal to noise ratio, SNR, is large.

Remark 1. The number of experiments $M$ is not assumed to be large.

These conditions will also be employed for the analysis of the methods A2 and A3 that are introduced below. Remark 1 is important since, for the underlying application, a very large number of experiments would not be feasible. The first and second order statistics of A1 are summarized in the following lemma.

Lemma 1. The expected value of the estimate (24) is

$$\mathrm{E}\,\hat{x}_0 = x_0 + O\!\left(\mathrm{E}\,\|\tilde{A}_k\|^2\right),$$

and its covariance matrix is for large SNR given by

$$\operatorname{cov}(\hat{x}_0) = A^{-1} C_1 R_{\alpha,M} C_1^{*} A^{-*}, \tag{25}$$

where

$$C_1 = \left(x_0^T \otimes I_n\right) \begin{bmatrix} I_{n^2} + \Phi_1^{\dagger} C_{1b} & -\Phi_1^{\dagger} C_{1a} \end{bmatrix}, \tag{26}$$

$$C_{1a} = \begin{bmatrix} (x_0 + x_2)^T \otimes I_n & & 0 \\ & \ddots & \\ 0 & & (x_0 + x_M)^T \otimes I_n \end{bmatrix}, \tag{27}$$

$$C_{1b} = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} \otimes \left(x_0^T \otimes I_n\right), \tag{28}$$

with $C_{1a} \in \mathbb{C}^{n(M-1)\times n^2(M-1)}$ and $C_{1b} \in \mathbb{C}^{n(M-1)\times n^2}$.

Proof. The proof is given in Appendix A. □

Remark 2. An improved form of A1 (called approach A2 in what follows) can be constructed using the following ideas. Details are explained in full in Nauclér (2008), which is available from www.uu.se. In the second step (24) of A1, the unknown variable $x_0$ is estimated using the first experiment only. This can be problematic if $\tilde{A}_1$ happens to be large, and the approach is not the soundest from a statistical point of view. Indeed, as the data from the first experiment are subtracted from all other experiments in approach A1, a large systematic error in the model employed for the first experiment (that is, a considerable $\tilde{A}_1$) will deteriorate all the new data used to determine the unbalances. One way to avoid this problem is to introduce the variable $m = A x_0$. Eq. (5) then becomes

$$y_k = m + A x_k, \quad k = 1, \ldots, M, \tag{29}$$

which is linear in the unknowns $m$ and $A$, and all experiments can be used to identify these unknown parameters. Then, $x_0$ can be computed using their estimates. Still, no nonlinear optimization is needed and the approach is shown to have better statistical properties than A1, see Nauclér (2008).
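One possible way to carry out the idea in Remark 2 is an ordinary, unweighted least squares fit of $[m \;\; A]$ in (29), followed by $\hat{x}_0 = \hat{A}^{-1}\hat{m}$. The sketch below assumes NumPy; the function name `estimate_a2` is hypothetical, and the exact formulation in Nauclér (2008) may differ (for instance in weighting).

```python
import numpy as np

def estimate_a2(Y, X):
    """Least squares fit of (29): y_k = m + A x_k = [m  A] [1; x_k].

    Y and X are (n, M) arrays whose columns are y_k and x_k.
    Unweighted sketch; not necessarily the estimator of the cited report.
    """
    n, M = Y.shape
    regressors = np.vstack([np.ones((1, M)), X])   # column k is [1; x_k]
    theta = Y @ np.linalg.pinv(regressors)         # estimate of [m  A]
    m_hat, A_hat = theta[:, 0], theta[:, 1:]
    x0_hat = np.linalg.solve(A_hat, m_hat)         # because m = A x0
    return x0_hat, A_hat

# Hypothetical noise-free test data; x_1 = 0 is not required here.
rng = np.random.default_rng(4)
n, M = 2, 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
x0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
X = rng.standard_normal((n, M)) + 1j * rng.standard_normal((n, M))
Y = A @ (x0[:, None] + X)

x0_hat, A_hat = estimate_a2(Y, X)
```

The fit needs the $(n+1)\times M$ regressor matrix to have full row rank, which is in line with the requirement $M \ge n + 1$ in (2).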


4. Approach 3: nonlinear regression

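In this section we derive a loss function that handles the stochastic uncertainty $\tilde{A}_k$ in a more sophisticated fashion. This leads to a problem formulation with a loss function that is nonlinear in $x_0$. Thus, there exists no closed form solution and a numerical search procedure is required. In order to use standard optimization routines, the system equation (1) is reformulated as a real valued problem. This is done by representing the complex valued quantities with their real and imaginary parts separated. This operation is denoted here by $\bar{(\cdot)}$ and we let

$$\bar{y}_k = \begin{bmatrix} y_{kR} \\ y_{kI} \end{bmatrix} = \begin{bmatrix} \operatorname{Re}(y_k) \\ \operatorname{Im}(y_k) \end{bmatrix}, \quad \bar{x}_0 = \begin{bmatrix} x_{0R} \\ x_{0I} \end{bmatrix}, \quad \bar{x}_k = \begin{bmatrix} x_{kR} \\ x_{kI} \end{bmatrix} \in \mathbb{R}^{2n\times 1}, \tag{30}$$

where $\mathbb{R}^{n\times m}$ denotes the set of real valued $n \times m$ matrices and where $\operatorname{Re}(y_k)$ and $\operatorname{Im}(y_k)$ are the real and imaginary parts of $y_k$, respectively. The corresponding convention with subscripts $R$ and $I$ will be employed in what follows. Furthermore, the vectorized matrices with separated real and imaginary parts are defined as

$$\bar{\alpha} = \begin{bmatrix} \alpha_R \\ \alpha_I \end{bmatrix} = \begin{bmatrix} \operatorname{Re}(\operatorname{vec}(A)) \\ \operatorname{Im}(\operatorname{vec}(A)) \end{bmatrix} \in \mathbb{R}^{2n^2\times 1}, \tag{31}$$

$$\bar{\tilde{\alpha}}_k = \begin{bmatrix} \tilde{\alpha}_{kR} \\ \tilde{\alpha}_{kI} \end{bmatrix} = \begin{bmatrix} \operatorname{Re}(\operatorname{vec}(\tilde{A}_k)) \\ \operatorname{Im}(\operatorname{vec}(\tilde{A}_k)) \end{bmatrix} \in \mathbb{R}^{2n^2\times 1}, \qquad \bar{\tilde{\alpha}} = \begin{bmatrix} \bar{\tilde{\alpha}}_1 \\ \vdots \\ \bar{\tilde{\alpha}}_M \end{bmatrix} \in \mathbb{R}^{2n^2M\times 1}, \tag{32}$$

and the corresponding covariance matrices are defined as

$$\operatorname{cov}(\bar{\tilde{\alpha}}_k) = R_\alpha, \qquad \operatorname{cov}(\bar{\tilde{\alpha}}) = R_{\alpha,M} = I_M \otimes R_\alpha. \tag{33}$$

A given complex valued equation

$$y = Ax \;\Rightarrow\; y_R + i y_I = (A_R + i A_I)(x_R + i x_I) \tag{34}$$

can be reformulated as a real valued relation

$$\begin{bmatrix} y_R \\ y_I \end{bmatrix} = \begin{bmatrix} A_R & -A_I \\ A_I & A_R \end{bmatrix} \begin{bmatrix} x_R \\ x_I \end{bmatrix} \;\Leftrightarrow\; \begin{bmatrix} y_R \\ y_I \end{bmatrix} = \begin{bmatrix} x_R^T \otimes I_n & -x_I^T \otimes I_n \\ x_I^T \otimes I_n & x_R^T \otimes I_n \end{bmatrix} \begin{bmatrix} \alpha_R \\ \alpha_I \end{bmatrix}. \tag{35}$$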
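Both forms of the real valued relation (35) can be verified numerically. The following sketch assumes NumPy; the helper names `real_vec`, `real_mat` and `kron_form` are hypothetical.

```python
import numpy as np

def real_vec(v):
    """Stack real and imaginary parts: [Re(v); Im(v)]."""
    return np.concatenate([v.real, v.imag])

def real_mat(A):
    """Real valued block matrix [[A_R, -A_I], [A_I, A_R]] from (35)."""
    return np.block([[A.real, -A.imag], [A.imag, A.real]])

def kron_form(x):
    """Second form in (35): the matrix that multiplies [alpha_R; alpha_I]."""
    n = x.size
    xr, xi, I = x.real[None, :], x.imag[None, :], np.eye(n)
    return np.block([[np.kron(xr, I), -np.kron(xi, I)],
                     [np.kron(xi, I),  np.kron(xr, I)]])

rng = np.random.default_rng(5)
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
alpha = A.reshape(-1, order="F")        # vec(A)

y_bar = real_vec(A @ x)
y_bar_block = real_mat(A) @ real_vec(x)
y_bar_kron = kron_form(x) @ real_vec(alpha)
```

All three vectors coincide, which is the content of (35).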

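Using this fact, the system equation (6) can be rewritten as

$$\bar{y}_k = D_k(\bar{x}_0)\,\bar{\alpha} + D_k(\bar{x}_0)\,\bar{\tilde{\alpha}}_k, \quad k = 1, \ldots, M, \tag{36}$$

where

$$D_k(\bar{x}_0) = \begin{bmatrix} (x_{0R} + x_{kR})^T \otimes I_n & -(x_{0I} + x_{kI})^T \otimes I_n \\ (x_{0I} + x_{kI})^T \otimes I_n & (x_{0R} + x_{kR})^T \otimes I_n \end{bmatrix} \in \mathbb{R}^{2n\times 2n^2}. \tag{37}$$

If all experiments are stacked in a tall vector, we obtain

$$\bar{y} = \begin{bmatrix} \bar{y}_1^T & \bar{y}_2^T & \ldots & \bar{y}_M^T \end{bmatrix}^T = B(\bar{x}_0)\,\bar{\alpha} + C(\bar{x}_0)\,\bar{\tilde{\alpha}}, \tag{38}$$

where

$$B(\bar{x}_0) = \begin{bmatrix} D_1(\bar{x}_0) \\ \vdots \\ D_M(\bar{x}_0) \end{bmatrix}, \qquad C(\bar{x}_0) = \begin{bmatrix} D_1(\bar{x}_0) & & 0 \\ & \ddots & \\ 0 & & D_M(\bar{x}_0) \end{bmatrix}. \tag{39}$$

The covariance matrix of the residual term $C(\bar{x}_0)\bar{\tilde{\alpha}}$ is denoted by

$$Q_\alpha(\bar{x}_0) = C(\bar{x}_0)\, R_{\alpha,M}\, C^T(\bar{x}_0) \in \mathbb{R}^{2nM\times 2nM}, \tag{40}$$

which is a function of the unknown variable $\bar{x}_0$. Similarly to the approaches A1 and A2, an estimate of $\bar{x}_0$ is found by minimizing a quadratic criterion. However, in order to make the covariance matrix of the estimation error minimal, the equations should be weighted with the inverse of $Q_\alpha$ (Söderström & Stoica, 1989). The criterion then reads

$$V(\bar{x}, \bar{\alpha}) = \left\| \bar{y} - B(\bar{x})\,\bar{\alpha} \right\|^2_{Q_\alpha^{-1}(\bar{x})}. \tag{41}$$

Minimization of $V$ with respect to $\bar{\alpha}$ is straightforward. For a fixed value of $\bar{x} = \bar{x}^*$, the minimum is (Söderström & Stoica, 1989)

$$\hat{\bar{\alpha}} = \left[ B^T(\bar{x}^*)\, Q_\alpha^{-1}(\bar{x}^*)\, B(\bar{x}^*) \right]^{-1} B^T(\bar{x}^*)\, Q_\alpha^{-1}(\bar{x}^*)\, \bar{y} \tag{42}$$

and insertion of (42) into (41) yields a concentrated loss function

$$\begin{aligned} W(\bar{x}) &= \min_{\bar{\alpha}} V(\bar{x}, \bar{\alpha}) = \left\| \bar{y} - B \left(B^T Q_\alpha^{-1} B\right)^{-1} B^T Q_\alpha^{-1} \bar{y} \right\|^2_{Q_\alpha^{-1}} \\ &= \left[ \bar{y}^T - \bar{y}^T Q_\alpha^{-1} B \left(B^T Q_\alpha^{-1} B\right)^{-1} B^T \right] Q_\alpha^{-1} \left[ \bar{y} - B \left(B^T Q_\alpha^{-1} B\right)^{-1} B^T Q_\alpha^{-1} \bar{y} \right] \\ &= \bar{y}^T Q_\alpha^{-1/2} \left[ I_{2nM} - Q_\alpha^{-1/2} B \left(B^T Q_\alpha^{-1} B\right)^{-1} B^T Q_\alpha^{-1/2} \right] Q_\alpha^{-1/2} \bar{y}, \end{aligned} \tag{43}$$

where the dependence on $\bar{x}$ is dropped for brevity. The concentrated loss function (43) can be formulated as

$$W(\bar{x}) = \bar{y}^T Q_\alpha^{-1/2}(\bar{x})\, \Pi^{\perp}(\bar{x})\, Q_\alpha^{-1/2}(\bar{x})\, \bar{y}, \tag{44}$$

where $\Pi^{\perp}$ is the orthogonal projector onto the null-space of $B^T Q_\alpha^{-1/2}$ and it is given by

$$\Pi^{\perp} = I_{2nM} - Q_\alpha^{-1/2} B \left(B^T Q_\alpha^{-1} B\right)^{-1} B^T Q_\alpha^{-1/2}. \tag{45}$$

The parameter estimation problem becomes a two-step procedure:

$$\hat{\bar{x}}_0 = \arg\min_{\bar{x}} W(\bar{x}), \tag{46}$$

$$\hat{\bar{\alpha}} = \left[ B^T(\hat{\bar{x}}_0)\, Q_\alpha^{-1}(\hat{\bar{x}}_0)\, B(\hat{\bar{x}}_0) \right]^{-1} B^T(\hat{\bar{x}}_0)\, Q_\alpha^{-1}(\hat{\bar{x}}_0)\, \bar{y}. \tag{47}$$

By the separation into two estimation steps the complexity of the optimization problem has been significantly reduced. Minimization of the original loss function (41) would require a nonlinear search over $2(n^2 + n)$ unknown parameters. By use of the concentrated loss function (43), the problem is reduced to a nonlinear minimization over $2n$ variables and a simple weighted linear least squares fit to find the remaining $2n^2$ unknown parameters. The second step is only needed if the nuisance variable $A$ is of any importance.

The optimization problem (46) is often referred to as a variable projection problem (Golub & Pereyra, 1973). Such optimization problems frequently appear in sensor array processing (Viberg & Ottersten, 1991) and in many other applications (Golub & Pereyra, 2003). However, the fact that $Q_\alpha$ in (44) is a function of the unknown variable is quite uncommon. Notice that $Q_\alpha$ depends on the uncertainty covariance matrix $R_\alpha$ through (40). Therefore, $R_\alpha$ needs to be a priori known or estimated.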
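The construction of the concentrated loss function can be illustrated with a direct, dense implementation of (37)–(43). The sketch below assumes NumPy; the function names `D_k` and `concentrated_loss` are hypothetical, as are the trial masses. The values of $A$, $x_0$ and $R_\alpha$ are those of the scalar example (56), and the data are generated without disturbances so that the loss vanishes at the true unbalance.

```python
import numpy as np

def D_k(x_bar, xk):
    """D_k in (37) for the real vector x_bar = [Re(x); Im(x)] and trial mass xk."""
    n = xk.size
    s = x_bar[:n] + 1j * x_bar[n:] + xk          # x + x_k as a complex vector
    sr, si, I = s.real[None, :], s.imag[None, :], np.eye(n)
    return np.block([[np.kron(sr, I), -np.kron(si, I)],
                     [np.kron(si, I),  np.kron(sr, I)]])

def concentrated_loss(x_bar, Y, X, R_alpha):
    """Return W(x_bar) from (43) and the linear estimate (42)."""
    n, M = Y.shape
    Ds = [D_k(x_bar, X[:, k]) for k in range(M)]
    B = np.vstack(Ds)                                          # (39)
    C = np.zeros((2 * n * M, 2 * n * n * M))
    for k, Dk in enumerate(Ds):                                # block diagonal C
        C[2*n*k:2*n*(k+1), 2*n*n*k:2*n*n*(k+1)] = Dk
    Q = C @ np.kron(np.eye(M), R_alpha) @ C.T                  # (40)
    y = np.concatenate([np.r_[Y[:, k].real, Y[:, k].imag] for k in range(M)])
    Qinv_B = np.linalg.solve(Q, B)
    alpha_hat = np.linalg.solve(B.T @ Qinv_B, Qinv_B.T @ y)    # (42), Q symmetric
    r = y - B @ alpha_hat
    return r @ np.linalg.solve(Q, r), alpha_hat

# Scalar case (n = 1) with M = 7 hypothetical trial masses, no disturbance.
A = np.array([[1 + 0.78j]])
x0 = np.array([0.55 + 0j])
R_alpha = 1e-3 * np.eye(2)
X = np.array([0, 0.5, -0.3, 0.5j, -0.5j, 0.3 + 0.3j, -0.2 + 0.4j])[None, :]
Y = A @ (x0[:, None] + X)

x_true = np.array([0.55, 0.0])
W_true, alpha_true = concentrated_loss(x_true, Y, X, R_alpha)
W_off, _ = concentrated_loss(x_true + np.array([0.1, 0.0]), Y, X, R_alpha)
```

In this noise-free setting `W_true` is zero up to rounding and `alpha_true` reproduces the real and imaginary parts of $A$, while `W_off` is clearly positive. In practice the function would be handed to a general purpose nonlinear minimizer over the $2n$ entries of $\bar{x}$.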


4.1. Statistical Properties of A3

First notice that the outcome ˆx0 from the optimization (46) issuch that

W (k)( ˆx0) = 0 (48)

for a successful minimization. Assume that the estimate ˆx0 lies ina neighborhood close to the true value x = x0, i.e. ˆx0 = x0 + xε ,where xε is small. Then (see e.g. Ljung (1999) and Söderström andStoica (1989)),

0 =∂TW∂ x

x=ˆx0

=∂TW∂ x

x=x0+xε

≈∂TW∂ x

x=x0

+∂2W∂ x2

x=x0

xε. (49)

Remember that W (k)= ∂W/∂[x]k, where [x]k is the k-th element

of x, see Section 2. Eq. (49) implies that the estimation errorapproximately is

xε = −

∂2W∂ x2

−1∂TW∂ x

, (50)

where the derivatives should be evaluated at x = x0. The accuracyof the estimate then becomes

cov (xε) =

∂2W∂ x2

−1

cov

∂TW∂ x

∂2W∂ x2

−1

. (51)

We are now ready to give the main result of this section:

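Lemma 2. The estimation procedure A3 yields

$$E\,\hat{x}_0 = x_0 + O\!\left(E\|\tilde{A}_k\|^2\right) \tag{52}$$

and the accuracy is for large SNR given by

$$\operatorname{cov}(\hat{x}_0) = H^{-1} G R_{\alpha,M} G^T H^{-1}, \tag{53}$$

where

$$[H]_{kl} = 2\alpha^T B^{T(k)} Q_\alpha^{-1/2} \Pi^\perp Q_\alpha^{-1/2} B^{(l)} \alpha, \quad k, l = 1, \ldots, 2n, \tag{54}$$

$$[G]_{k,:} = -2\alpha^T B^{T(k)} Q_\alpha^{-1/2} \Pi^\perp Q_\alpha^{-1/2} C, \quad k = 1, \ldots, 2n, \tag{55}$$

where $[G]_{k,:}$ means row $k$ of the matrix $G$.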

Proof. See Appendix B. □
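Once $H$, $G$ and $R_{\alpha,M}$ are available, the accuracy expression (53) is a plain matrix sandwich. The following is a minimal sketch of how it might be evaluated; the function name `a3_covariance` and the small matrices are hypothetical, and $H$ is assumed symmetric and nonsingular (it is symmetric by (54)).

```python
import numpy as np

def a3_covariance(H, G, R):
    """Evaluate H^{-1} G R G^T H^{-1} as in (53) without forming H^{-1} explicitly."""
    X = np.linalg.solve(H, G)   # X = H^{-1} G
    return X @ R @ X.T          # equals H^{-1} G R G^T H^{-1} because H is symmetric

# Hypothetical small example (values chosen only to make the result easy to check)
H = 2.0 * np.eye(2)
G = np.eye(2)
R = np.diag([4.0, 8.0])
cov = a3_covariance(H, G, R)    # here: 0.5 * R * 0.5
```

Solving the linear system instead of inverting $H$ is a common numerical choice; it is not something the lemma itself prescribes.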

4.2. Computational aspects

The loss function (44) is a nonlinear function of the unknown variable $x_0$. Therefore, numerical optimization is needed in order to compute the estimate $\hat{x}_0$. For this purpose, there are some computational issues that need to be addressed.

Any optimization routine needs to be started with an initial guess of the minimizing variable. Instead of just choosing e.g. $\hat{x}_0 = 0$, the optimization is initialized with the outcome from the procedure A2.

It is not easily seen from the expression (44) whether local minima exist. So far, no problems with convergence to inaccurate estimates have been experienced. If $n = 1$, it is possible to visually depict the level curves of the concentrated loss function. Such an example is shown in Fig. 2. Here, the number of experiments is $M = 7$ and

$$A = 1 + 0.78i, \quad x_0 = 0.55, \quad R_\alpha = \operatorname{cov}(\bar{\alpha}_k) = 10^{-3} I_2. \tag{56}$$

Fig. 2. Level curves of the loss function (axes: R, I). The true parameter value is $x_0 = 0.55$.

The figure shows that at least in this case the loss function is well behaved.
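A grid evaluation of this kind can be sketched for the scalar case $n = 1$. The code below is only an illustration under several assumptions that are not taken from the paper: the real-valued stacking of the complex model (multiplication by $x_0 + x_k$ written as a $2 \times 2$ matrix), the trial weights `xk`, the grid, and noise-free data. With $R_\alpha = 10^{-3} I_2$ this stacking gives a block diagonal $Q_\alpha$ with blocks $10^{-3}|x + x_k|^2 I_2$.

```python
import numpy as np

def rot(z):
    """2x2 real matrix representing multiplication by the complex number z."""
    return np.array([[z.real, -z.imag], [z.imag, z.real]])

def loss(x, xk, y, r_alpha=1e-3):
    """Concentrated loss y^T Q^{-1} (I - B P) y for a complex scalar candidate x."""
    s = x + xk                                          # total unbalance per experiment
    B = np.vstack([rot(z) for z in s])                  # (2M, 2) regressor for [Re A, Im A]
    w = np.repeat(1.0 / (r_alpha * np.abs(s) ** 2), 2)  # diagonal of Q^{-1} (assumed form)
    BtW = B.T * w                                       # B^T Q^{-1}
    alpha_hat = np.linalg.solve(BtW @ B, BtW @ y)       # weighted linear least squares step
    r = y - B @ alpha_hat
    return float(r @ (w * r)), alpha_hat

A_true, x0_true, M = 1 + 0.78j, 0.55 + 0.0j, 7
# Hypothetical trial weights: x_1 = 0, the rest spread on the unit circle
xk = np.concatenate([[0.0], np.exp(2j * np.pi * np.arange(1, M) / (M - 1))])
yc = A_true * (x0_true + xk)                            # noise-free responses
y = np.column_stack([yc.real, yc.imag]).ravel()         # [Re y_1, Im y_1, Re y_2, ...]

re = np.linspace(0.25, 0.85, 25)                        # step 0.025, contains 0.55
im = np.linspace(-0.3, 0.3, 25)                         # step 0.025, contains 0.0
W = np.array([[loss(complex(a, b), xk, y)[0] for a in re] for b in im])
i, j = np.unravel_index(np.argmin(W), W.shape)
x_hat = complex(re[j], im[i])
alpha_hat = loss(x_hat, xk, y)[1]                       # second step: estimate of A
```

The array `W` can be passed to a contour plotting routine to obtain level curves over the real and imaginary parts of the candidate $x$.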

In most applications, the covariance matrix $Q_\alpha$ should be positive definite. However, situations where it is ill conditioned or rank deficient may occur. Such situations need to be taken care of. It can be done using regularization,

$$Q_\alpha = C R_{\alpha,M} C^T + \mu I_{2nM},$$

where $\mu$ is a small real number.

In order to use approach A3, the statistics of the uncertainty must be known or estimated beforehand. The good news is that only the structure of $R_\alpha$ and not its absolute value is of importance. A scaling of the covariance matrix will only scale the loss function (44). Thus, the value of $\hat{x}_0$ that minimizes the criterion (44) will remain the same.
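The scaling argument can be checked numerically for a fixed candidate $x$. The sketch below uses random stand-ins for $B$, $C$ and the covariance matrix (hypothetical values, no regularization term $\mu$) and evaluates the loss in the form $y^T Q_\alpha^{-1}(I - BP)y$ used in Appendix B.

```python
import numpy as np

def weighted_loss(y, B, Q):
    """y^T Q^{-1} (I - B P) y with P = (B^T Q^{-1} B)^{-1} B^T Q^{-1}."""
    Qi_B = np.linalg.solve(Q, B)
    Qi_y = np.linalg.solve(Q, y)
    alpha = np.linalg.solve(B.T @ Qi_B, B.T @ Qi_y)   # alpha = P y
    return float(y @ np.linalg.solve(Q, y - B @ alpha))

rng = np.random.default_rng(0)
B = rng.standard_normal((6, 2))                       # stand-in for B(x) at a fixed x
C = rng.standard_normal((6, 6))                       # stand-in for C(x) at the same x
R = np.diag(rng.uniform(0.5, 2.0, size=6))            # stand-in for the covariance
y = rng.standard_normal(6)

W1 = weighted_loss(y, B, C @ R @ C.T)
W2 = weighted_loss(y, B, C @ (10.0 * R) @ C.T)        # covariance scaled by 10
```

Scaling the covariance by a factor $c$ divides the loss by $c$ at every $x$, so the location of the minimum is unaffected. With a nonzero $\mu$ the scaling is no longer exact.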

When the projection matrix $\Pi^\perp$ is computed, the effects of rounding errors may become significant. Therefore, it should be computed in a numerically sound way. First, rewrite (45) as (Mahata, 2003)

$$\Pi^\perp = I_{2nM} - M M^\dagger, \quad M = Q_\alpha^{-1/2} B \tag{57}$$

and perform the QR factorization

$$M = QR = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R_1 \\ 0 \end{bmatrix} = Q_1 R_1, \tag{58}$$

where $Q$ is an orthogonal matrix and $R_1$ is upper triangular. This gives

$$M^\dagger = \left(R_1^T Q_1^T Q_1 R_1\right)^{-1} R_1^T Q_1^T = R_1^{-1} Q_1^T, \tag{59}$$

where the last equality follows from the orthogonality of $Q_1$. Using this result, the projection matrix can be written as

$$\Pi^\perp = I_{2nM} - Q_1 R_1 R_1^{-1} Q_1^T = I_{2nM} - Q_1 Q_1^T = Q_2 Q_2^T. \tag{60}$$

Eq. (60) is less sensitive to rounding errors compared to direct computation of (45). In addition, the use of $Q_2$ forces (60) to be positive semidefinite. Therefore, the QR decomposition approach should be used for the numerical computations.
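As a rough illustration of (57)–(60), the projector can be formed from a complete QR factorization. The sketch assumes a full column rank matrix and uses a random stand-in for $Q_\alpha^{-1/2}B$; the name `proj_complement_qr` is hypothetical.

```python
import numpy as np

def proj_complement_qr(Mmat):
    """Projector onto the orthogonal complement of range(Mmat), as Q2 Q2^T in (60)."""
    m, p = Mmat.shape
    Qfull, _ = np.linalg.qr(Mmat, mode="complete")   # Qfull is m x m, orthogonal
    Q2 = Qfull[:, p:]                                # columns orthogonal to range(Mmat)
    return Q2 @ Q2.T

rng = np.random.default_rng(1)
Mmat = rng.standard_normal((8, 3))                   # stands in for Q_alpha^{-1/2} B
Pi_qr = proj_complement_qr(Mmat)
Pi_direct = np.eye(8) - Mmat @ np.linalg.pinv(Mmat)  # direct form (57), for comparison
```

For a well conditioned matrix such as this one both forms agree to rounding error; the QR form is symmetric positive semidefinite by construction.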

Many optimization routines converge in fewer iterations if in each step the analytical value of the gradient of the loss function is supplied. Such expressions are given in Nauclér (2008), for any $x = x_0$.


5. Numerical illustration

Below we evaluate the approaches A1, A2 and A3 for the separator model presented earlier in Sections 3 and 4. In Nauclér (2008) some further numerical examples are provided that point in the same direction: the approach A2 gives much better results (much smaller estimation errors) than A1, and A3 gives considerably better results than A1. All error variances decrease with an increased number of experiments, which is expected. The benefit of adding additional experiments is, however, much smaller for A1 than for the other two approaches. Monte Carlo simulation studies produce results that for all approaches are very similar to the results predicted by the theory (such as Lemmas 1 and 2).

Consider a model of a separator as described in Section 1 and shown in Fig. 1. It is a 2-dimensional model with 12 degrees of freedom. The beam to which the separator bowl is attached is, however, modeled with the Euler–Bernoulli partial differential equation. The masses of the bowl and the frames are on the order of hundreds of kilograms. The stiffnesses are modeled using the concept of hysteretic damping. This means that they are modeled as complex valued stiffnesses, which is a way to introduce damping in the system. The damping does not change with frequency, in contrast to viscous damping.

The complex valued stiffnesses are subject to change between experiments, which leads to the uncertainty term. Between each experiment, each stiffness varies uniformly ±1% around its nominal value. The modeling is quite extensive and the details are left out on purpose in order to make the presentation compact. The system model becomes

$$y_k = (A + \tilde{A}_k)(x_0 + x_k), \quad k = 1, \ldots, M, \tag{61}$$

where

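$$A = 10^{-4} \begin{bmatrix} -0.0095 - 0.5335i & 0.0036 + 0.1743i \\ -0.0089 - 0.4344i & 0.0017 + 0.1932i \end{bmatrix} \ [\mathrm{m/(s\,g)}], \qquad x_0 = \begin{bmatrix} 21\, e^{37 \frac{\pi}{180} i} \\ 17\, e^{111 \frac{\pi}{180} i} \end{bmatrix} \ [\mathrm{g}]. \tag{62}$$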

The unit of $A$ follows from the fact that the measured quantity is in [m/s] and the applied masses are in grams [g]. The quantities are complex valued since they are associated with a magnitude and an angular position. The structure of the covariance matrix $R_\alpha$ is depicted in Fig. 3.
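For readers who want to experiment with the numbers in (62), they can be entered as follows. This is only a convenience sketch; the variable names are arbitrary, and the nominal response `y1` ignores the perturbation term $\tilde{A}_k$.

```python
import numpy as np

# Influence matrix A from (62), in m/(s g)
A = 1e-4 * np.array([[-0.0095 - 0.5335j, 0.0036 + 0.1743j],
                     [-0.0089 - 0.4344j, 0.0017 + 0.1932j]])

# Unbalance x0 from (62), in grams: magnitude and angular position in degrees
x0 = np.array([21 * np.exp(1j * np.deg2rad(37)),
               17 * np.exp(1j * np.deg2rad(111))])

y1 = A @ x0                               # nominal response in m/s, no trial weight
mass = np.abs(x0)                         # unbalance magnitudes in grams
angle_deg = np.rad2deg(np.angle(x0))      # angular positions in degrees
```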

In order to use A3, the statistics of the uncertainty must be known or estimated somehow. Two scenarios are evaluated here. The first is that the statistics of the uncertainty are fully known. The other scenario is that they are completely unknown and therefore $\hat{R}_\alpha = I_8$ is employed. The latter choice clearly deviates from the true covariance matrix as depicted in Fig. 3. Still, the algorithm A3 can be used, but the weighting is no longer optimal. Therefore, it is not necessarily so that A3 should perform better than the other two approaches in this case.

Fig. 3. The structure of $R_\alpha$ (axes: row # and column #, 1 to 8). Each square shows the magnitude of the corresponding element in $R_\alpha$. The matrix is scaled so that the greatest element has unit magnitude.

Each trial weight $[x_k]_i$ has a certain mass $m_i$ and angular position $\phi_i$, relative to a reference position in the bowl. Typically, $x_1 = 0$, since in the first experiment it is decided if balancing is at all needed. Thus, if balancing is needed, the first experiment is for ‘free’. In this example $M \geq 3$ is required and it is chosen to use

$$\begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} = \begin{bmatrix} 0 & 30 & -30 \\ 0 & 30 & 30 \end{bmatrix} \tag{63}$$

as the trial masses (in grams) for the first three experiments. This is done to ensure that the trial masses do not become ‘too’ linearly dependent. If $M > 3$, the further experiments are drawn from a statistical distribution

$$x_k = \begin{bmatrix} [x_k]_1 \\ [x_k]_2 \end{bmatrix} = \begin{bmatrix} m_1 e^{\phi_1 i} \\ m_2 e^{\phi_2 i} \end{bmatrix}, \quad k \geq 4, \tag{64}$$

where

mi ∈ {30, 40, 50, 60} [g], φi ∈ U(0, 2π) [rad]. (65)

All values of $m_i$ are equally probable and $U(0, 2\pi)$ is a discrete uniform distribution with resolution 1 degree. Not too much effort is put into choosing ‘good’ candidates for trial masses. Instead, the masses are changed according to (65) for each new Monte Carlo realization. The purpose of this procedure is to diminish the effect of specific choices of $x_k$ and instead put the focus on the performance of the estimators.
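The trial weight selection in (63)–(65) could be sketched as below. The function name `draw_trial_weights` is hypothetical, and the angle grid is assumed to be 0, 1, ..., 359 degrees; whether the end point $2\pi$ is included is not stated.

```python
import numpy as np

def draw_trial_weights(M, rng):
    """Return a 2 x M complex array of trial weights in grams, following (63)-(65)."""
    if M < 3:
        raise ValueError("the example requires M >= 3")
    X = np.zeros((2, M), dtype=complex)
    X[:, :3] = np.array([[0, 30, -30],
                         [0, 30, 30]])                           # fixed weights (63)
    if M > 3:
        m = rng.choice([30, 40, 50, 60], size=(2, M - 3))        # equally probable masses
        phi = np.deg2rad(rng.integers(0, 360, size=(2, M - 3)))  # 1-degree resolution
        X[:, 3:] = m * np.exp(1j * phi)                          # weights as in (64)
    return X

rng = np.random.default_rng(0)
X = draw_trial_weights(14, rng)     # one Monte Carlo realization with M = 14
```

Calling the function once per Monte Carlo realization gives a new set of weights for experiments 4 to $M$ each time, while the first three stay fixed.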

Monte Carlo simulations are used to evaluate the performance of the three estimation algorithms. The covariance matrices of the estimates are computed using 300 realizations for each value of $M$. The result is shown in Fig. 4. The figure shows that if the true covariance matrix of the uncertainty is known, the nonlinear estimation method A3 outperforms A1 and A2. Even with the ad hoc choice $\hat{R}_\alpha = I$, A3 gives better performance compared to A1 and A2. Such a choice is probably natural if the statistics of the uncertainty are completely unknown. In reality, user choices of $\hat{R}_\alpha$ would probably lead to a performance of A3 that lies somewhere in between the curves marked with squares. Thus, better knowledge about the system at hand is expected to yield better estimates.

Finally, we show in Fig. 5 a histogram plot of the estimation error for $M = 14$. The error of $[\hat{x}_0]_1 = \operatorname{Re}([x_0]_1)$ is shown. It can be seen that the estimation error is centered around zero and that the distribution is by far the narrowest when A3 with $\hat{R}_\alpha = R_\alpha$ is employed.

Fig. 4. Performance of the different estimators for the separator example (horizontal axis: $M$).

Fig. 5. Histogram plot of the estimation error of the real part of $[x_0]_1$. The number of realizations is 300.

6. Conclusions

An estimation problem which is motivated by the application of unbalance estimation of rotating machinery has been considered. Two different estimation techniques (A1 and A3) are derived and analyzed with respect to their respective statistical properties. In addition, an approach (A2) based on A1 is discussed and compared to the other two approaches using a numerical example. The estimation problem is special in the way that the disturbance enters the system equations. Instead of noisy measurements (ordinary least squares problems) or noisy inputs (errors in variables problems), the main source of uncertainty is here considered to act on the system parameters in a stochastic fashion.

An example of unbalance estimation of a separator is considered for evaluation of the estimators. Here, it is shown that the accuracy can be significantly improved if the nonlinear estimation approach A3 is employed. This is particularly so if the number of experiments is increased. In such circumstances, it matters very much how the estimation is performed. The nonlinear approach A3 may then perform considerably better than the linear estimators A1 and A2. The analytical accuracy expressions could be employed as a basis for experiment design, i.e. the problem of finding a sequence of $x_k$ that minimizes the estimation error.

Acknowledgement

We are grateful to Dr. Lars Hillström at Alfa Laval Machine Dynamics for fruitful discussions and for letting us use the separator model.

Appendix A. Proof of Lemma 1

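The identification procedure is derived while neglecting the effects of $\tilde{A}_k$. In the presence of this disturbance, (19) and (20) modify to

$$z_k = A x_k - \tilde{A}_1 x_0 + \tilde{A}_k (x_0 + x_k) = \left(x_k^T \otimes I_n\right)\alpha - \left(x_0^T \otimes I_n\right)\tilde{\alpha}_1 + \left((x_0 + x_k)^T \otimes I_n\right)\tilde{\alpha}_k, \quad k = 2, \ldots, M. \tag{A.1}$$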

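The $z$ vector (21) then becomes

$$z = \Phi_1 \alpha - \left(\begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} \otimes \left(x_0^T \otimes I_n\right)\right)\tilde{\alpha}_1 + \begin{bmatrix} (x_0 + x_2)^T \otimes I_n & & 0 \\ & \ddots & \\ 0 & & (x_0 + x_M)^T \otimes I_n \end{bmatrix} \begin{bmatrix} \tilde{\alpha}_2 \\ \vdots \\ \tilde{\alpha}_M \end{bmatrix} \tag{A.2}$$

$$= \Phi_1 \alpha + \begin{bmatrix} -C_{1b} & C_{1a} \end{bmatrix} \bar{\alpha}, \tag{A.3}$$

with $C_{1a}$ and $C_{1b}$ as defined in (27) and (28), respectively.

The first step of the estimation procedure is to compute an estimate of $\alpha$, as in (23),

$$\hat{\alpha} = \Phi_1^\dagger z \tag{A.4}$$

$$= \alpha + \Phi_1^\dagger \begin{bmatrix} -C_{1b} & C_{1a} \end{bmatrix} \bar{\alpha} \tag{A.5}$$

$$\triangleq \alpha + \alpha_\varepsilon, \tag{A.6}$$

where

$$\alpha_\varepsilon = \Phi_1^\dagger \begin{bmatrix} -C_{1b} & C_{1a} \end{bmatrix} \bar{\alpha}. \tag{A.7}$$

Thus, the estimate of $A$ can be written as

$$\hat{A} = A + A_\varepsilon, \tag{A.8}$$

where $A_\varepsilon$ is formed from $\alpha_\varepsilon$, i.e. $\operatorname{vec}(A_\varepsilon) = \alpha_\varepsilon$.

Next, let

$$m = A x_0. \tag{A.9}$$

The use of (17) implies that

$$\hat{m} = y_1 = m + \tilde{A}_1 x_0, \tag{A.10}$$

which follows from (1). Eq. (A.10) can be rewritten as

$$\hat{m} = m + m_\varepsilon, \tag{A.11}$$

where

$$m_\varepsilon = \left(x_0^T \otimes I_n\right)\tilde{\alpha}_1. \tag{A.12}$$

We will next use the series expansion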

$$(A + \tilde{A})^{-1} \approx A^{-1} - A^{-1}\tilde{A}A^{-1} + \cdots \tag{A.13}$$

where the second order terms can be omitted if $\tilde{A}$ is small compared to $A$ in the sense $\|\tilde{A}\| \ll \|A\|$.
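The order of the neglected terms in (A.13) can be checked numerically. The sketch below uses a hypothetical well conditioned $3 \times 3$ matrix and a random perturbation direction; shrinking the perturbation by a factor 10 should shrink the error of the first order expansion by roughly a factor 100.

```python
import numpy as np

rng = np.random.default_rng(2)
A = 2.0 * np.eye(3) + 0.1 * rng.standard_normal((3, 3))   # hypothetical nominal matrix
D = rng.standard_normal((3, 3))                           # perturbation direction
A_inv = np.linalg.inv(A)

def first_order_error(eps):
    """Norm of the error of the first order expansion for the perturbation eps * D."""
    A_tilde = eps * D
    exact = np.linalg.inv(A + A_tilde)
    approx = A_inv - A_inv @ A_tilde @ A_inv              # right-hand side of (A.13)
    return np.linalg.norm(exact - approx)

e1 = first_order_error(1e-2)
e2 = first_order_error(1e-3)
ratio = e1 / e2      # close to 100 when the neglected terms are of second order
```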

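Using (24), (A.8), (A.11) and (A.13), the estimate of $x_0$ can be written as

$$\begin{aligned}
\hat{x}_0 &= (A + A_\varepsilon)^{-1}(m + m_\varepsilon) \\
&= \left(A^{-1} - A^{-1}A_\varepsilon A^{-1} + \cdots\right)(m + m_\varepsilon) \\
&= x_0 + A^{-1}(m_\varepsilon - A_\varepsilon x_0) + \cdots \\
&= x_0 + A^{-1}\left(\left(x_0^T \otimes I_n\right)\tilde{\alpha}_1 - \left(x_0^T \otimes I_n\right)\alpha_\varepsilon\right) + \cdots \\
&= x_0 + A^{-1}\left(x_0^T \otimes I_n\right)\left(\tilde{\alpha}_1 - \Phi_1^\dagger \begin{bmatrix} -C_{1b} & C_{1a} \end{bmatrix}\bar{\alpha}\right) + \cdots
\end{aligned} \tag{A.14}$$

$$\approx x_0 + A^{-1}\left(x_0^T \otimes I_n\right)\begin{bmatrix} \left(I_{n^2} + \Phi_1^\dagger C_{1b}\right) & -\Phi_1^\dagger C_{1a} \end{bmatrix}\bar{\alpha}, \tag{A.15}$$

where the approximation in (A.15) follows from the fact that $\|\tilde{A}_k\|$ is assumed to be much smaller than $\|A\|$. Therefore, also $\|A_\varepsilon\|$ is much smaller than $\|A\|$. From (A.14) it is concluded that

$$E\,\hat{x}_0 = x_0 + O\!\left(E\|\tilde{A}_k\|^2\right) \tag{A.16}$$

since the error term in (A.15) is linear in $\bar{\alpha}$, which has zero mean. For large SNR, (A.15) is a valid approximation. Then, the covariance matrix of $\hat{x}_0$ becomes

$$\operatorname{cov}(\hat{x}_0) = A^{-1} C_1 \operatorname{cov}(\bar{\alpha})\, C_1^{*} A^{-*}, \tag{A.17}$$

with $C_1$ given by (26). Furthermore, the covariance matrix of $\bar{\alpha}$ is given by (11), which concludes the proof. □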
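The estimation steps used in this proof (differences $z_k = y_k - y_1$, the least squares step (A.4), $\hat{m} = y_1$ and $\hat{x}_0 = \hat{A}^{-1}\hat{m}$) can be sketched as follows. This is a reading of the appendix rather than a reference implementation: it assumes $x_1 = 0$, a column-major $\operatorname{vec}$ operator, and a hypothetical function name `estimate_a1`. The data are noise-free, using $A$ and $x_0$ from (62) and the trial weights (63).

```python
import numpy as np

def estimate_a1(Y, X):
    """Influence coefficient style estimate; Y and X are n x M with X[:, 0] = 0."""
    n, M = Y.shape
    z = (Y[:, 1:] - Y[:, [0]]).reshape(-1, order="F")       # stacked z_k = y_k - y_1
    Phi1 = np.vstack([np.kron(X[:, [k]].T, np.eye(n))       # rows x_k^T kron I_n
                      for k in range(1, M)])
    alpha_hat = np.linalg.pinv(Phi1) @ z                     # least squares step (A.4)
    A_hat = alpha_hat.reshape(n, n, order="F")               # undo column-major vec
    x0_hat = np.linalg.solve(A_hat, Y[:, 0])                 # A_hat^{-1} m_hat, m_hat = y_1
    return x0_hat, A_hat

A = 1e-4 * np.array([[-0.0095 - 0.5335j, 0.0036 + 0.1743j],
                     [-0.0089 - 0.4344j, 0.0017 + 0.1932j]])
x0 = np.array([21 * np.exp(1j * np.deg2rad(37)),
               17 * np.exp(1j * np.deg2rad(111))])
X = np.array([[0, 30, -30],
              [0, 30, 30]], dtype=complex)                   # trial weights (63)
Y = A @ (x0[:, None] + X)                                    # noise-free responses
x0_hat, A_hat = estimate_a1(Y, X)
```

Without perturbations the procedure recovers $x_0$ and $A$ up to rounding; adding a random term to $A$ in each column of `Y` gives a simple way to observe the error terms analyzed above.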


Appendix B. Proofs of Lemma 2

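We first need a number of preliminary results.

To analyze the approach A3 statistically, we need to evaluate the gradient and the Hessian,

$$\frac{\partial^T W}{\partial x} = \begin{bmatrix} W^{(1)} \\ \vdots \\ W^{(2n)} \end{bmatrix}, \tag{B.1}$$

$$\frac{\partial^2 W}{\partial x^2} = \begin{bmatrix} W^{(11)} & W^{(12)} & \cdots & W^{(1(2n))} \\ W^{(21)} & W^{(22)} & \cdots & W^{(2(2n))} \\ \vdots & \vdots & \ddots & \vdots \\ W^{((2n)1)} & W^{((2n)2)} & \cdots & W^{((2n)(2n))} \end{bmatrix}, \tag{B.2}$$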

of the loss function and evaluate them at $x = x_0$. In order to accomplish this it is useful to rewrite the criterion function (44) as

$$W(x) = y^T Q_\alpha^{-1}(x)\left(I_{2nM} - B(x)P(x)\right) y, \tag{B.3}$$

where

$$P(x) = \left[B^T(x) Q_\alpha^{-1}(x) B(x)\right]^{-1} B^T(x) Q_\alpha^{-1}(x). \tag{B.4}$$

The matrix $P$ has some useful properties that are summarized in what follows, in a series of propositions. For notational convenience the dependence on $x$ is dropped.

Proposition 1.

$$PB = I_{2n^2}. \tag{B.5}$$

Proof. The result directly follows from the definition (B.4) of $P$. □

Proposition 2.

$$P^{(k)} = \left[B^T Q_\alpha^{-1} B\right]^{-1}\left(B^{T(k)} Q_\alpha^{-1} + B^T Q_\alpha^{-(k)}\right)\left(I_{2nM} - BP\right) - P B^{(k)} P. \tag{B.6}$$

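Proof. Application of the chain rule and the rule for differentiation of matrix inverses yields

$$\begin{aligned}
P^{(k)} &= \left\{\left[B^T Q_\alpha^{-1} B\right]^{-1} B^T Q_\alpha^{-1}\right\}^{(k)} \\
&= \left[B^T Q_\alpha^{-1} B\right]^{-(k)} B^T Q_\alpha^{-1} + \left[B^T Q_\alpha^{-1} B\right]^{-1}\left(B^{T(k)} Q_\alpha^{-1} + B^T Q_\alpha^{-(k)}\right) \\
&= -\left[B^T Q_\alpha^{-1} B\right]^{-1}\left(B^{T(k)} Q_\alpha^{-1} B + B^T Q_\alpha^{-(k)} B + B^T Q_\alpha^{-1} B^{(k)}\right)\underbrace{\left[B^T Q_\alpha^{-1} B\right]^{-1} B^T Q_\alpha^{-1}}_{P} \\
&\quad + \left[B^T Q_\alpha^{-1} B\right]^{-1}\left(B^{T(k)} Q_\alpha^{-1} + B^T Q_\alpha^{-(k)}\right) \\
&= \left[B^T Q_\alpha^{-1} B\right]^{-1}\left(B^{T(k)} Q_\alpha^{-1} + B^T Q_\alpha^{-(k)}\right)\left(I_{2nM} - BP\right) - P B^{(k)} P. \qquad \square
\end{aligned}$$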

A very useful consequence of the first proposition is

Proposition 3.

$$P^{(k)} B = -P B^{(k)}. \tag{B.7}$$

Proof. Application of the chain rule on (B.5) yields

$$P^{(k)} B + P B^{(k)} = 0 \;\Leftrightarrow\; P^{(k)} B = -P B^{(k)}, \tag{B.8}$$

which is the desired result. □
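Propositions 1–3 lend themselves to a quick numerical sanity check with finite differences. The sketch below uses a single scalar variable $x$ and hypothetical random matrices, with $B$ linear in $x$ and a positive definite $Q$ that depends on $x$; it is meant as an illustration, not as part of the proof.

```python
import numpy as np

rng = np.random.default_rng(3)
m, p = 6, 2
B0, B1 = rng.standard_normal((m, p)), rng.standard_normal((m, p))
T = rng.standard_normal((m, m))
Q0, Q1 = T @ T.T + m * np.eye(m), np.eye(m)

def B(x):        # linear in x, so the derivative is B1 and the second derivative is 0
    return B0 + x * B1

def Q(x):        # symmetric positive definite for every x
    return Q0 + x**2 * Q1

def P(x):        # definition (B.4)
    Qi = np.linalg.inv(Q(x))
    return np.linalg.solve(B(x).T @ Qi @ B(x), B(x).T @ Qi)

x, h = 0.3, 1e-5
dP_num = (P(x + h) - P(x - h)) / (2 * h)             # central difference for P^{(k)}

Qi = np.linalg.inv(Q(x))
dQi = -Qi @ (2 * x * Q1) @ Qi                        # derivative of Q^{-1}
S_inv = np.linalg.inv(B(x).T @ Qi @ B(x))
dP_formula = (S_inv @ (B1.T @ Qi + B(x).T @ dQi) @ (np.eye(m) - B(x) @ P(x))
              - P(x) @ B1 @ P(x))                    # right-hand side of (B.6)
```

Comparing `dP_num` with `dP_formula` checks Proposition 2, and multiplying `dP_num` by `B(x)` from the right checks Proposition 3.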

The final proposition is related to the second derivatives of P:

Proposition 4.

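$$\begin{aligned}
\left[B^T Q_\alpha^{-1} B\right] P^{(kl)} B &= B^T Q_\alpha^{-1}\left(B^{(l)} P B^{(k)} + B^{(k)} P B^{(l)}\right) \\
&\quad - \left(B^{T(k)} Q_\alpha^{-1} + B^T Q_\alpha^{-(k)}\right)\left(I_{2nM} - BP\right) B^{(l)} \\
&\quad - \left(B^{T(l)} Q_\alpha^{-1} + B^T Q_\alpha^{-(l)}\right)\left(I_{2nM} - BP\right) B^{(k)}.
\end{aligned} \tag{B.9}$$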

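Proof. First notice that $B^{(kl)} = 0$. Next, differentiate (B.6) with respect to $[x]_l$, and make use of the chain rule

$$\begin{aligned}
P^{(kl)} &= \left\{\left[B^T Q_\alpha^{-1} B\right]^{-1}\left(B^{T(k)} Q_\alpha^{-1} + B^T Q_\alpha^{-(k)}\right)\right\}^{(l)}\left(I_{2nM} - BP\right) \\
&\quad + \left[B^T Q_\alpha^{-1} B\right]^{-1}\left(B^{T(k)} Q_\alpha^{-1} + B^T Q_\alpha^{-(k)}\right)\left(-B^{(l)} P - B P^{(l)}\right) \\
&\quad - P^{(l)} B^{(k)} P - P B^{(k)} P^{(l)}.
\end{aligned} \tag{B.10}$$

Using Propositions 1 and 3 we obtain

$$\begin{aligned}
B^T Q_\alpha^{-1} B P^{(kl)} B &= -\left(B^{T(k)} Q_\alpha^{-1} + B^T Q_\alpha^{-(k)}\right)\left(B^{(l)} P + B P^{(l)}\right) B - B^T Q_\alpha^{-1} B P^{(l)} B^{(k)} P B \\
&\quad - \underbrace{B^T Q_\alpha^{-1} B P}_{B^T Q_\alpha^{-1}} B^{(k)} P^{(l)} B \\
&= -\left(B^{T(k)} Q_\alpha^{-1} + B^T Q_\alpha^{-(k)}\right)\left(I_{2nM} - BP\right) B^{(l)} \\
&\quad - B^T Q_\alpha^{-1} B P^{(l)} B^{(k)} + B^T Q_\alpha^{-1} B^{(k)} P B^{(l)}
\end{aligned} \tag{B.11}$$

and using Propositions 2 and 3 and some algebraic manipulations, the term that involves $P^{(l)}$ is expanded

$$B^T Q_\alpha^{-1} B P^{(l)} B^{(k)} = \cdots = \left(B^{T(l)} Q_\alpha^{-1} + B^T Q_\alpha^{-(l)}\right)\left(I_{2nM} - BP\right) B^{(k)} - B^T Q_\alpha^{-1} B^{(l)} P B^{(k)}. \tag{B.12}$$

Finally, combining (B.11) and (B.12) gives the desired result. □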

After these technical results we present one lemma needed in order to compute the gradient and Hessian (B.2) of the concentrated loss function $W(x)$, (44).

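Lemma 3. Under Assumption 2 ($\|\tilde{A}_k\| \ll \|A\|$) it holds that

$$W^{(k)}(x_0) \approx -2\alpha^T B^{T(k)} Q_\alpha^{-1/2} \Pi^\perp Q_\alpha^{-1/2} C \bar{\alpha} \tag{B.13}$$

$$W^{(kl)}(x_0) \approx 2\alpha^T B^{T(k)} Q_\alpha^{-1/2} \Pi^\perp Q_\alpha^{-1/2} B^{(l)} \alpha. \tag{B.14}$$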

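Proof. Differentiation of (B.3) yields

$$W^{(k)} = y^T Q^{(k)} y, \tag{B.15}$$

where

$$Q^{(k)} = Q_\alpha^{-(k)}\left(I_{2nM} - BP\right) - Q_\alpha^{-1}\left(B^{(k)} P + B P^{(k)}\right). \tag{B.16}$$

Let $y = B(x_0)\alpha + C(x_0)\bar{\alpha}$ as in (38) and evaluate (B.15) at $x = x_0$. This gives

$$\left. W^{(k)}\right|_{x=x_0} = \alpha^T B^T Q^{(k)} B \alpha + 2\alpha^T B^T Q^{(k)} C \bar{\alpha} + \bar{\alpha}^T C^T Q^{(k)} C \bar{\alpha}. \tag{B.17}$$

By use of Propositions 1 and 3 it follows that $B^T Q^{(k)} B = 0$, so the first term vanishes. Next, it is argued that if $\|\tilde{A}_k\| \ll \|A\|$, then the term $\bar{\alpha}^T C^T Q^{(k)} C \bar{\alpha}$ is negligible compared to the middle term of (B.17). It remains to compute

$$2\alpha^T B^T Q^{(k)} C \bar{\alpha} = 2\bar{\alpha}^T C^T Q^{(k)} B \alpha \tag{B.18}$$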


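and again Propositions 1 and 3 give

$$W^{(k)}(x_0) \approx -2\bar{\alpha}^T C^T Q_\alpha^{-1}\left(I_{2nM} - BP\right) B^{(k)} \alpha, \tag{B.19}$$

which can be equivalently written as (B.13).

Next, we want to find an expression for $W^{(kl)}$. Differentiation of (B.15) with respect to $[x]_l$ yields

$$W^{(kl)} = y^T Q^{(kl)} y, \tag{B.20}$$

where $Q^{(k)}$ is given by (B.16). If the model (38) for $y$ is inserted, one obtains

$$W^{(kl)} = \alpha^T B^T Q^{(kl)} B \alpha + 2\alpha^T B^T Q^{(kl)} C \bar{\alpha} + \bar{\alpha}^T C^T Q^{(kl)} C \bar{\alpha} \approx \alpha^T B^T Q^{(kl)} B \alpha. \tag{B.21}$$

The approximation follows from the fact that the term that is quadratic in $\alpha$ is nonzero, together with the assumption $\|\tilde{A}_k\| \ll \|A\|$.

Differentiation of (B.16) with respect to $[x]_l$ yields

$$\begin{aligned}
Q^{(kl)} &= Q_\alpha^{-(kl)}\left(I_{2nM} - BP\right) - Q_\alpha^{-(k)}\left(B^{(l)} P + B P^{(l)}\right) - Q_\alpha^{-(l)}\left(B^{(k)} P + B P^{(k)}\right) \\
&\quad - Q_\alpha^{-1}\left(B^{(k)} P^{(l)} + B^{(l)} P^{(k)} + B P^{(kl)}\right).
\end{aligned} \tag{B.22}$$

Computation of (B.21) with application of Propositions 1–4, $B^{(kl)} = 0$, and evaluation at $x = x_0$ yields

$$\begin{aligned}
W^{(kl)} &\approx \alpha^T B^T Q^{(kl)} B \alpha \\
&= \alpha^T \Big[\, 0 - B^T Q_\alpha^{-(k)}\left(B^{(l)} - B P B^{(l)}\right) - B^T Q_\alpha^{-(l)}\left(B^{(k)} - B P B^{(k)}\right) \\
&\qquad\;\; - B^T Q_\alpha^{-1}\left(-B^{(k)} P B^{(l)} - B^{(l)} P B^{(k)}\right) - B^T Q_\alpha^{-1} B P^{(kl)} B \,\Big]\alpha \\
&= \alpha^T \left[ B^{T(k)} Q_\alpha^{-1}\left(I_{2nM} - BP\right) B^{(l)} + B^{T(l)} Q_\alpha^{-1}\left(I_{2nM} - BP\right) B^{(k)} \right]\alpha
\end{aligned} \tag{B.23}$$

$$= 2\alpha^T B^{T(k)} Q_\alpha^{-1}\left(I_{2nM} - BP\right) B^{(l)} \alpha, \tag{B.24}$$

where (B.23) follows from Proposition 4 and some algebra. Eq. (B.24) follows since $Q_\alpha^{-1}(I_{2nM} - BP)$ is a symmetric matrix. The expression (B.24) can be equivalently written as (B.14). □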

Proof of Lemma 2. Using (49) and Lemma 3, the estimation error can be written as

$$x_\varepsilon = -H^{-1} G \bar{\alpha} + O\!\left(\|\bar{\alpha}\|^2\right), \tag{B.25}$$

which is consistent with (52), since $H$ and $G$ are constant matrices. For large SNR, (50) is a valid approximation and the covariance matrix of $x_\varepsilon$ satisfies $\operatorname{cov}(x_\varepsilon) = \operatorname{cov}(\hat{x}_0)$. Then, (53) immediately follows from (51) and Lemma 3. □

Remark 3. The results presented apply for any variable projection problem of the type (44). The only assumption made is that $B$ is linear in $x$, so that $B^{(kl)} = 0$. If $B$ were a nonlinear function of $x$, terms that involve $B^{(kl)}$ would appear in the results above. The details needed in order to carry out the final computations for the specific problem at hand are given in Nauclér (2008).

References

Alauze, C., Der Hagopian, J., & Gaudiller, L. (2001). Active balancing of turbomachinery: application to large shaft lines. Journal of Vibration and Control, 7, 249–278.

Blanco-Ortega, A., Beltrán-Carbajal, F., Favela-Contreras, A., & Silva-Navarro, G. (2008). Active disc for automatic balancing of rotor-bearing systems. In American Control Conference, Seattle, Washington, USA, June 11–13 (pp. 2023–2038).

Darlow, M. S. (1987). Balancing of high speed machinery: theory, methods and experimental results. Mechanical Systems and Signal Processing, 1(1), 105–134.

Darlow, M. S. (1989). Balancing of high speed machinery. New York, NY: Springer-Verlag.

Foiles, W. C., Allaire, P. E., & Gunter, E. J. (1998). Review: rotor balancing. Shock and Vibration, 5, 325–336.

Golub, G. H., & Pereyra, V. (1973). The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate. SIAM Journal on Numerical Analysis, 10(2), 413–432.

Golub, G. H., & Pereyra, V. (2003). Separable nonlinear least squares: the variable projection method and its applications. Inverse Problems, 19(2), R1–R26.

Goodman, T. P. (1964). A least-squares method for computing balance corrections. Journal of Engineering for Industry, 86(3), 273–279.

Hillström, L. (2008). Personal communication.

Kang, Y., Chang, Y. P., Tseng, M. H., Tang, P. H., & Chang, Y. F. (2000). A modified approach based on influence coefficient method for balancing crank-shafts. Journal of Sound and Vibration, 234(2), 277–296.

Kang, Y., Tseng, M.-H., Wang, S.-M., Chiang, C.-P., & Wang, C.-C. (2003). An accuracy improvement for balancing crankshafts. Mechanism and Machine Theory, 38, 1449–1467.

Larsson, L. O. (1976). On the determination of the influence coefficients in rotor balancing, using linear regression analysis. In Vibrations in rotating machinery, Cambridge, UK, September (pp. 93–97).

Li, G., Lin, Z., Untaroiu, C., & Allaire, P. E. (2003). Balancing of high-speed rotating machinery using convex optimization. In IEEE conference on decision and control, Maui, Hawaii, USA, December (pp. 4351–4356).

Ljung, L. (1999). System identification (2nd edition). Upper Saddle River, NJ, USA: Prentice–Hall.

Lund, J. W., & Tonnesen, J. (1972). Analysis and experiments on multi-plane balancing of a flexible rotor. Journal of Engineering for Industry, 94(1), 233–242.

Mahata, K. (2003). Estimation using low rank signal models. Ph.D. thesis, Department of Information Technology, Uppsala University, Uppsala, Sweden.

Nauclér, P. (2008). Estimation and control of resonant systems with stochastic disturbances. Ph.D. thesis, Department of Information Technology, Faculty of Science and Technology, Uppsala University, Uppsala, Sweden.

Schneider, H. (1991). Balancing technology. Technical report, Carl Schenck AG.

Sinha, J. K., Friswell, M. I., & Lees, A. W. (2002). The identification of the unbalance and the foundation model of a flexible rotating machine from a single run-down. Mechanical Systems and Signal Processing, 16(2–3), 255–271.

Sinha, J. K., Lees, A. W., & Friswell, M. I. (2004). Estimating unbalance and misalignment of a flexible rotating machine from a single run-down. Journal of Sound and Vibration, 272, 967–989.

Söderström, T., & Stoica, P. (1989). System identification. Hemel Hempstead, United Kingdom: Prentice Hall International.

Tiwari, R., & Chakravarthy, V. (2006). Simultaneous identification of residual unbalances and bearing dynamic parameters from impulse responses of rotor-bearing systems. Mechanical Systems and Signal Processing, 20, 1590–1614.

Viberg, M., & Ottersten, B. (1991). Sensor array processing based on subspace fitting. IEEE Transactions on Signal Processing, 39(5), 1110–1121.

Zhou, S., & Shi, J. (2001). Active balancing and vibration control of rotating machinery: a survey. The Shock and Vibration Digest, 33(5), 361–371.

Peter Nauclér received the M.Sc. degree in engineering physics and the Ph.D. degree in electrical engineering with specialization in automatic control from Uppsala University, Uppsala, Sweden, in 2003 and 2008, respectively. His doctorate work mainly concerned modeling and control of mechanical systems with stochastic disturbances. Since 2008 he has been with Ericsson AB, Stockholm, Sweden, where he works with radio access technologies for the fourth generation telecommunication systems.

Torsten Söderström received the M.Sc. degree (‘‘civilingenjör’’) in engineering physics in 1969 and the Ph.D. degree in automatic control in 1973, both from Lund Institute of Technology, Lund, Sweden. He is a Fellow of IEEE, and an IFAC Fellow.

During 1967–1974 he held various teaching positions at the Lund Institute of Technology. Since 1974, he has been with the Department of Systems and Control, Uppsala University, Uppsala, Sweden, where he is a professor of automatic control.

Dr Söderström is the author or coauthor of many technical papers. His main research interests are in the fields of system identification, signal processing, and control. He is the (co)author of four books: ‘‘Theory and Practice of Recursive Identification’’, MIT Press, 1983 (with L Ljung), ‘‘The Instrumental Variable Methods for System Identification’’, Springer-Verlag, 1983 (with P Stoica), ‘‘System Identification’’, Prentice–Hall, 1989 (with P Stoica) and ‘‘Discrete-Time Stochastic Systems’’, Prentice–Hall, 1994; second edition, Springer-Verlag, 2002. In 1981 he was, with coauthors, given an Automatica Paper Prize Award.

Within IFAC he has served in several capacities including vice-chairman of the TC on Modelling, Identification and Signal Processing (1993–99), IPC chairman of the IFAC SYSID’94 Symposium, Council member (1996–2002), Executive Board member (1999–2002) and Awards Committee Chair (1999–2002). He was an associate editor (1984–91), guest associate editor and editor for four special issues with Automatica and has been the editor for the area of System Parameter Estimation since 1992.