Subspace identification of circulant systems


Automatica 44 (2008) 2825–2833



Paolo Massioni*, Michel Verhaegen

Delft Center for Systems and Control, Delft University of Technology, Mekelweg 2, 2628 CD Delft, The Netherlands

Article info

Article history: Received 31 July 2007; received in revised form 28 November 2007; accepted 14 April 2008; available online 2 October 2008.

Keywords: System identification; Subspace identification; Circulant matrices; Fourier matrix; Circulant systems

Abstract

This article concerns the identification of a class of large scale systems called “circulant systems”. Circulant systems have a special property that allows them to be decomposed into simpler subsystems through a state transformation. This property has been used in the literature for control design, and here we show how it can be used for system identification. The approach that is proposed here both reduces the complexity of the problem and provides models which have a circulant structure that can be exploited for control design. A novel identification algorithm for circulant systems based on subspace identification is presented. The algorithm is then tested in simulation on an academic example of a circulant system and on a realistic finite element model of a vibrating plate.

© 2008 Elsevier Ltd. All rights reserved.

1. Introduction

Large scale systems have been an object of interest in system and control theory since the late Seventies (Sandell, Varaiya, Athans, & Safonov, 1978). The high dimensionality of these systems has led to the development of techniques that reduce the complexity of the problem. A possible approach is to consider the large scale system as the result of the interconnection of many simpler subsystems (D’Andrea & Dullerud, 2003).

In this paper we focus on a special class of large scale systems, which we call “circulant systems” (Denis & Looze, 1999). Circulant systems are the result of the periodic interconnection (D’Andrea & Dullerud, 2003) of a number of identical subsystems, as shown in Fig. 1. Each subsystem has exactly two neighbors, and each subsystem interacts with its neighbors in exactly the same way. Examples of circulant systems can be found in different fields, e.g. adaptive optics (Denis, 1998), paper machines (Laughlin, Morari, & Braatz, 1993), oscillating systems (Mitchell, 1978) and as a result of the approximation of partial differential equations (Brockett & Willems, 1974).

Fig. 1. Example of circulant system made of 4 identical subsystems.

Circulant systems have a remarkable property that allows exploiting their structure by decomposing them into smaller systems; this property can be used for simplifying the analysis, as in Li, Zhao, Zhao, and Li (2004) and Lunze (1986), and for control design, as in Denis (1998), Denis and Looze (1999) and Hovd and Skogestad (1994). What we show in this paper is that the structural properties of circulant systems can also be exploited in order to simplify the identification of such systems from input–output data. We will then develop an identification algorithm that makes it possible to identify models which have the circulant structure, allowing the exploitation of such structure for controller design.

The paper is organized as follows. In Section 2 the preliminary notions are presented; circulant systems are defined and their properties are explained. The focus is put on how circulant systems can be recognized a priori from physical insight, and then the decomposition property is presented together with its consequences for identification. Section 3 presents a novel general identification algorithm for circulant systems based on subspace identification methods, and Section 4 contains three simulated examples of the use of such algorithm in practice. The conclusions of the paper are in Section 5.

This paper was not presented at any IFAC meeting. This paper was recommended for publication in revised form by Associate Editor Wolfgang Scherrer under the direction of Editor Torsten Söderström.

* Corresponding author. Tel.: +31 (0) 152785189. E-mail addresses: [email protected] (P. Massioni), [email protected] (M. Verhaegen).

0005-1098/$ – see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.automatica.2008.04.014

2. Preliminaries

We start by showing the basic concepts that are needed for introducing the notion of a circulant system. These concepts include the definitions of some particular kinds of matrices, such as circulant and block circulant matrices, and the Fourier matrix, which has some very special properties with respect to circulant matrices.

Let $j$ be the imaginary unit and $I_n$ the identity matrix of order $n$; let the symbol $\otimes$ indicate the Kronecker product. For a generic matrix $A$, $A^T$ indicates its transpose while $A^H$ indicates its Hermitian transpose (the complex conjugate of the transpose); $\bar{b}$ indicates the complex conjugate of a matrix or scalar $b$.

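Definition 1 (Permutation Matrix). The permutation matrix of order $n$ is defined as:

\[
\Pi_n =
\begin{bmatrix}
0 & 1 & 0 & 0 & \cdots & 0 \\
0 & 0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
1 & 0 & 0 & 0 & \cdots & 0
\end{bmatrix}
=
\begin{bmatrix}
0 & I_{n-1} \\
1 & 0
\end{bmatrix}.
\]

Notice that $\Pi_n$ is orthogonal ($\Pi_n^{-1} = \Pi_n^T$). Right-multiplying an $n \times n$ matrix by $\Pi_n$ is equivalent to cyclically shifting all its columns by one position to the right. Left-multiplication instead cyclically shifts the rows up.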
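The shifting behaviour described in Definition 1 can be checked numerically. The following is a minimal sketch in Python with NumPy; the helper name `perm_matrix` is an assumption made for illustration and does not come from the paper.

```python
import numpy as np

def perm_matrix(n):
    """Cyclic permutation matrix Pi_n of Definition 1 (hypothetical helper name):
    ones on the superdiagonal and in the bottom-left corner."""
    return np.roll(np.eye(n, dtype=int), 1, axis=1)

n = 4
Pi = perm_matrix(n)
M = np.arange(n * n).reshape(n, n)

# Pi_n is orthogonal: its inverse is its transpose.
assert np.array_equal(Pi.T @ Pi, np.eye(n, dtype=int))
# Right-multiplication cyclically shifts the columns one position to the right.
assert np.array_equal(M @ Pi, np.roll(M, 1, axis=1))
# Left-multiplication cyclically shifts the rows one position up.
assert np.array_equal(Pi @ M, np.roll(M, -1, axis=0))
```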

Definition 2 (Circulant Matrix). A square matrix $E$ of size $n \times n$ is called “circulant” if and only if it has the following structure:

\[
E =
\begin{bmatrix}
e_1 & e_2 & e_3 & e_4 & \cdots & e_n \\
e_n & e_1 & e_2 & e_3 & \cdots & e_{n-1} \\
e_{n-1} & e_n & e_1 & e_2 & \cdots & e_{n-2} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
e_2 & e_3 & e_4 & e_5 & \cdots & e_1
\end{bmatrix}
\]

where $e_i \in \mathbb{R}$ or $e_i \in \mathbb{C}$. Equivalently, a square matrix is circulant if and only if each row is obtained from the preceding one by a cyclic shift of one position to the right.

This definition is equivalent to saying that a circulant matrix is invariant to a similarity transformation with respect to $\Pi_n$: $E = \Pi_n^{-1} E \Pi_n = \Pi_n^T E \Pi_n$.

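Definition 3 (Block Circulant Matrix). A block circulant matrix $E$ of order $n$ is a (not necessarily square) matrix with the following block structure:

\[
E =
\begin{bmatrix}
E_1 & E_2 & E_3 & E_4 & \cdots & E_n \\
E_n & E_1 & E_2 & E_3 & \cdots & E_{n-1} \\
E_{n-1} & E_n & E_1 & E_2 & \cdots & E_{n-2} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
E_2 & E_3 & E_4 & E_5 & \cdots & E_1
\end{bmatrix}
\]

where $E_i \in \mathbb{R}^{p \times q}$ or $E_i \in \mathbb{C}^{p \times q}$, with $p$, $q$ positive integers.

It can be immediately seen (Davis, 1979) that such a matrix can also be written as:

\[
E = \sum_{i=1}^{n} \left( \Pi_n^{i-1} \otimes E_i \right). \tag{1}
\]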

Let us now introduce some new notation. We will denote the set of block circulant matrices of order $n$, with blocks of size $p \times q$, as $\mathcal{C}_{n,p,q}$; we will use the symbol $\mathcal{C}^{\mathbb{R}}_{n,p,q}$ or $\mathcal{C}^{\mathbb{C}}_{n,p,q}$ if we want to specify that the entries of such matrices are respectively real or complex. Let $\mathcal{D}_{n,p,q}$ instead denote the set of block diagonal matrices with $n$ block rows and block columns, and blocks of size $p \times q$. Again, we will use either $\mathcal{D}^{\mathbb{R}}_{n,p,q}$ or $\mathcal{D}^{\mathbb{C}}_{n,p,q}$ if we want to emphasize the nature of the entries of such matrices. For a matrix $E \in \mathcal{D}_{n,p,q}$, $E_i$ will indicate the $i$th block on the diagonal; for a matrix $E \in \mathcal{C}_{n,p,q}$, $E_i$ indicates the $i$th block in the first row (as shown in Definition 3).

Remark 4. The sums and products of block circulant matrices of the same order are still block circulant. The inverse of a square invertible block circulant matrix is block circulant (Davis, 1979).

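Lemma 5 (Block-Permutation). A block circulant matrix $E \in \mathcal{C}_{n,p,q}$ is invariant to a block-permutation transformation, that is:

\[
(\Pi_n \otimes I_p)^{-1} E \, (\Pi_n \otimes I_q) = (\Pi_n^{-1} \otimes I_p) \, E \, (\Pi_n \otimes I_q) = E.
\]

Proof. From (1), we have:

\[
(\Pi_n \otimes I_p)^{-1} E \, (\Pi_n \otimes I_q) = \sum_{i=1}^{n} (\Pi_n \otimes I_p)^{-1} \left( \Pi_n^{i-1} \otimes E_i \right) (\Pi_n \otimes I_q).
\]

From the properties of the Kronecker product (Brewer, 1978):

\[
(A \otimes B)(C \otimes D) = (AC \otimes BD), \qquad (A \otimes B)^{-1} = A^{-1} \otimes B^{-1} \tag{2}
\]

it then follows:

\[
(\Pi_n \otimes I_p)^{-1} E \, (\Pi_n \otimes I_q) = \sum_{i=1}^{n} \left( \Pi_n^{-1} \Pi_n^{i-1} \Pi_n \otimes I_p E_i I_q \right) = \sum_{i=1}^{n} \left( \Pi_n^{i-1} \otimes E_i \right) = E. \qquad \square
\]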
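As an illustration of Eq. (1) and Lemma 5, the sketch below assembles a block circulant matrix from its first block row and verifies the block-permutation invariance numerically. The function names and the random test data are assumptions introduced here for illustration only.

```python
import numpy as np

def perm_matrix(n):
    """Cyclic permutation matrix Pi_n of Definition 1 (hypothetical helper)."""
    return np.roll(np.eye(n), 1, axis=1)

def block_circulant(blocks):
    """Assemble E = sum_i kron(Pi_n^(i-1), E_i) from the first block row
    [E_1, ..., E_n], following Eq. (1) (hypothetical helper)."""
    n = len(blocks)
    Pi = perm_matrix(n)
    return sum(np.kron(np.linalg.matrix_power(Pi, i), Ei)
               for i, Ei in enumerate(blocks))

rng = np.random.default_rng(0)
n, p, q = 4, 2, 3
blocks = [rng.standard_normal((p, q)) for _ in range(n)]
E = block_circulant(blocks)

# Block structure of Definition 3: first block row is [E_1, E_2, ...],
# second block row starts with E_n.
assert np.allclose(E[:p, q:2 * q], blocks[1])
assert np.allclose(E[p:2 * p, :q], blocks[n - 1])

# Lemma 5: (Pi_n kron I_p)^(-1) E (Pi_n kron I_q) = E, using Pi_n^(-1) = Pi_n^T.
Pi = perm_matrix(n)
lhs = np.kron(Pi, np.eye(p)).T @ E @ np.kron(Pi, np.eye(q))
assert np.allclose(lhs, E)
```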

Definition 6 (Fourier Matrix). We define the Fourier matrix of order $n$ as:

\[
F_n = \frac{1}{\sqrt{n}}
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & w_n & w_n^2 & \cdots & w_n^{n-1} \\
1 & w_n^2 & w_n^4 & \cdots & w_n^{2(n-1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & w_n^{n-1} & w_n^{2(n-1)} & \cdots & w_n^{(n-1)(n-1)}
\end{bmatrix}
\]

with $w_n = e^{-\frac{2\pi j}{n}} = \cos\frac{2\pi}{n} - j \sin\frac{2\pi}{n}$.

The matrix $F_n$ is unitary and symmetric: $F_n^H F_n = F_n F_n^H = I_n$, $F_n^T = F_n$. Left-multiplying a column vector by a Fourier matrix is equivalent to computing its (normalized) Discrete Fourier Transform (DFT); for large values of $n$, it is convenient to use the Fast Fourier Transform (FFT) algorithm instead of computing the matrix product (Davis, 1979).

We call $f_i$ the $i$th row of $F_n$. We will now show that the rows of the Fourier matrix other than the first are complex conjugates of one another: if $n$ is even, then $f_1$ and $f_{n/2+1}$ are real, while the other rows form complex conjugate pairs; if $n$ is odd, then $f_1$ alone is real, with the other rows forming complex conjugate pairs.


Lemma 7. The rows of $F_n$ are either real or in complex conjugate pairs according to the relation: $f_{n+2-i} = \bar{f}_i$ for $i = \{2, \dots, n\}$.

Proof. The first row $f_1$ is trivially always real. As for the other rows, the $k$th element of $f_{n+2-i}$ is $f_{n+2-i,k} = \frac{1}{\sqrt{n}} e^{-\frac{2\pi j}{n}(n+1-i)(k-1)}$, while the $k$th element of $f_i$ is $f_{i,k} = \frac{1}{\sqrt{n}} e^{-\frac{2\pi j}{n}(i-1)(k-1)}$. From the properties of the complex exponential ($\overline{e^{z}} = e^{\bar{z}}$, $e^{z + 2\pi j k} = e^{z}$ for $k \in \mathbb{Z}$) we then have:

\[
f_{n+2-i,k} = \frac{1}{\sqrt{n}} e^{-\frac{2\pi j}{n}(-i+1)(k-1)} = \frac{1}{\sqrt{n}} \, \overline{e^{-\frac{2\pi j}{n}(i-1)(k-1)}} = \bar{f}_{i,k}.
\]

This implies $f_{n+2-i} = \bar{f}_i$. $\square$
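A quick numerical check of Definition 6 and Lemma 7 is sketched below for one odd and one even order. The helper name `fourier_matrix` is an assumption; the loop uses 0-based row indices, so row `r` in the code is row $r+1$ in the text.

```python
import numpy as np

def fourier_matrix(n):
    """Unitary Fourier matrix F_n of Definition 6 (hypothetical helper name)."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

for n in (5, 6):
    F = fourier_matrix(n)
    assert np.allclose(F.conj().T @ F, np.eye(n))   # unitary
    assert np.allclose(F, F.T)                      # symmetric
    # Left-multiplication by F_n is the DFT scaled by 1/sqrt(n).
    v = np.arange(n, dtype=float)
    assert np.allclose(F @ v, np.fft.fft(v) / np.sqrt(n))
    # Lemma 7: f_{n+2-i} = conj(f_i) for i = 2, ..., n (0-based rows 1..n-1).
    for r in range(1, n):
        assert np.allclose(F[n - r], F[r].conj())
    # 1-based indices of the real rows.
    real_rows = [r + 1 for r in range(n) if np.allclose(F[r].imag, 0)]
    print(n, real_rows)   # prints "5 [1]" and then "6 [1, 4]"
```

For $n = 6$ the real rows are $f_1$ and $f_4 = f_{n/2+1}$, which is the row that Lemma 7 maps onto itself.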

Fourier matrices have the remarkable property of diagonalizing any circulant matrix. This property is crucial because it will allow decomposing large scale circulant systems into smaller independent ones, thus reducing the complexity of the identification problem. The property is stated in the Theorem that follows, and then generalized to block circulant matrices.

Theorem 8 (Diagonalization Property). For a matrix $E \in \mathbb{C}^{n \times n}$, it holds that $F_n E F_n^H$ is a diagonal matrix if and only if $E$ is circulant.

Proof. The proof can be found in Davis (1979). $\square$
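Theorem 8 can be illustrated numerically, as in the following sketch. The random first row `e` and the random non-circulant matrix `G` are illustrative assumptions; the check on the diagonal entries uses NumPy's inverse FFT convention.

```python
import numpy as np

n = 6
idx = np.arange(n)
F = np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)   # Fourier matrix F_n

rng = np.random.default_rng(1)
e = rng.standard_normal(n)
# Circulant matrix of Definition 2: each row is the previous one shifted right.
E = np.array([np.roll(e, r) for r in range(n)])

D = F @ E @ F.conj().T
assert np.allclose(D - np.diag(np.diag(D)), 0)        # off-diagonal part vanishes
# The diagonal holds the sums  sum_k e_k * exp(+2*pi*j*a*k/n),  i.e. n * ifft(e).
assert np.allclose(np.diag(D), n * np.fft.ifft(e))

# A generic (non-circulant) matrix is not diagonalized by F_n.
G = rng.standard_normal((n, n))
DG = F @ G @ F.conj().T
assert not np.allclose(DG - np.diag(np.diag(DG)), 0)
```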

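Corollary 9. Consider a matrix $E \in \mathbb{C}^{np \times nq}$. Then we have that $\tilde{E} = (F_n \otimes I_p) E (F_n \otimes I_q)^H \in \mathcal{D}^{\mathbb{C}}_{n,p,q}$ if and only if $E \in \mathcal{C}^{\mathbb{C}}_{n,p,q}$.

Proof. We start with the “if” part. If $E$ is block circulant, then from Eq. (1) we have:

\[
\tilde{E} = (F_n \otimes I_p) \left( \sum_{i=1}^{n} \Pi_n^{i-1} \otimes E_i \right) (F_n \otimes I_q)^H.
\]

From this property of the Kronecker product (Brewer, 1978):

\[
(A \otimes B)^H = (A^H \otimes B^H)
\]

and Eq. (2), we then have:

\[
\tilde{E} = \sum_{i=1}^{n} \left( F_n \Pi_n^{i-1} F_n^H \right) \otimes E_i.
\]

Notice that $\Pi_n^{i-1}$ is circulant, so $F_n \Pi_n^{i-1} F_n^H$ is diagonal (Theorem 8). Then $\left( F_n \Pi_n^{i-1} F_n^H \right) \otimes E_i$ is block diagonal, and $\tilde{E}$ is a sum of block diagonal matrices, so it is block diagonal too. This proves the “if” part.

The “only if” part is equivalent to saying that for a matrix $G \in \mathbb{C}^{np \times nq}$, $(F_n \otimes I_p)^H G (F_n \otimes I_q) \in \mathcal{C}^{\mathbb{C}}_{n,p,q}$ if $G \in \mathcal{D}^{\mathbb{C}}_{n,p,q}$ (just assume that $G = (F_n \otimes I_p) E (F_n \otimes I_q)^H \in \mathcal{D}^{\mathbb{C}}_{n,p,q}$). Since $G$ is block diagonal, it holds that:

\[
(F_n \otimes I_p)^H G \, (F_n \otimes I_q) = \sum_{i=1}^{n} \left( f_i^H \otimes I_p \right) G_i \left( f_i \otimes I_q \right).
\]

Then thanks to Eq. (2) (writing $G_i = 1 \otimes G_i$), we have:

\[
(F_n \otimes I_p)^H G \, (F_n \otimes I_q) = \sum_{i=1}^{n} \left( f_i^H f_i \otimes G_i \right).
\]

Notice that $f_i^H f_i = F_n^H H_i F_n$, where $H_i$ is a matrix with all entries equal to 0 except the $i$th entry on the diagonal, which is 1. So $H_i$ is diagonal and, thanks to Theorem 8, $f_i^H f_i$ is circulant. Then $\left( f_i^H f_i \otimes G_i \right)$ is block circulant, and $(F_n \otimes I_p)^H G \, (F_n \otimes I_q)$, being a sum of block circulant matrices, is block circulant as well. $\square$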

It is possible to show that the complex block diagonal matrices obtained through the transformation via Fourier matrices from real block circulant matrices have some special features, and that all the block diagonal matrices of this kind can be transformed into real block circulant ones with the inverse transformation.

Corollary 10. For a matrix $E \in \mathcal{C}^{\mathbb{R}}_{n,p,q}$, then for $\tilde{E} = (F_n \otimes I_p) E (F_n \otimes I_q)^H \in \mathcal{D}^{\mathbb{C}}_{n,p,q}$ it holds that $\tilde{E}_1 \in \mathbb{R}^{p \times q}$ and $\tilde{E}_{n+2-i} = \overline{\tilde{E}_i}$ for $i = \{2, \dots, n\}$. Conversely, for a matrix $G \in \mathcal{D}^{\mathbb{C}}_{n,p,q}$ for which $G_1 \in \mathbb{R}^{p \times q}$ and $G_{n+2-i} = \bar{G}_i$ for $i = \{2, \dots, n\}$, we have that $(F_n \otimes I_p)^H G \, (F_n \otimes I_q) \in \mathcal{C}^{\mathbb{R}}_{n,p,q}$.

Proof. It is a consequence of Lemma 7. $\square$
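Corollaries 9 and 10 can be verified on a random real block circulant matrix. The sketch below assumes the hypothetical helpers `fourier_matrix` and `block_circulant`, which are not part of the paper; block indices in the code are 0-based.

```python
import numpy as np

def fourier_matrix(n):
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def block_circulant(blocks):
    """Block circulant matrix built from its first block row, as in Eq. (1)."""
    n = len(blocks)
    Pi = np.roll(np.eye(n), 1, axis=1)
    return sum(np.kron(np.linalg.matrix_power(Pi, i), Ei)
               for i, Ei in enumerate(blocks))

rng = np.random.default_rng(2)
n, p, q = 5, 2, 3
E = block_circulant([rng.standard_normal((p, q)) for _ in range(n)])
F = fourier_matrix(n)
Fp, Fq = np.kron(F, np.eye(p)), np.kron(F, np.eye(q))
Et = Fp @ E @ Fq.conj().T                      # transformed matrix of Corollary 9

# View Et as an n x n grid of p x q blocks: grid[a, b] is block (a, b).
grid = Et.reshape(n, p, n, q).transpose(0, 2, 1, 3)
for a in range(n):
    for b in range(n):
        if a != b:
            assert np.allclose(grid[a, b], 0)  # Corollary 9: block diagonal

diag = [grid[i, i] for i in range(n)]
assert np.allclose(diag[0].imag, 0)            # Corollary 10: first block is real
for i in range(1, n):
    assert np.allclose(diag[n - i], diag[i].conj())   # conjugate pairs

# Converse direction: the inverse transformation returns the real matrix E.
E_back = Fp.conj().T @ Et @ Fq
assert np.allclose(E_back.imag, 0) and np.allclose(E_back.real, E)
```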

We are now ready to introduce the notion of circulant system and show its key features. After the definition, we will first state a property that characterizes this kind of system, and then we will show how such systems can be decomposed into smaller independent systems, thus enabling efficient solutions to the identification problem.

Definition 11 (Circulant Systems). Consider a discrete-time MIMO system with $nm$ inputs and $nr$ outputs, which can be described by state-space equations of the kind:

\[
\begin{cases}
x(k+1) = A x(k) + B u(k) \\
y(k) = C x(k) + D u(k)
\end{cases} \tag{3}
\]

with $A \in \mathbb{R}^{nl \times nl}$, $B \in \mathbb{R}^{nl \times nm}$, $C \in \mathbb{R}^{nr \times nl}$, $D \in \mathbb{R}^{nr \times nm}$. The vector $x \in \mathbb{R}^{nl \times 1}$ is the state, $u \in \mathbb{R}^{nm \times 1}$ is the input signal and $y \in \mathbb{R}^{nr \times 1}$ is the output signal. We call the system “circulant” (or block circulant) if and only if it has a representation with $A \in \mathcal{C}^{\mathbb{R}}_{n,l,l}$, $B \in \mathcal{C}^{\mathbb{R}}_{n,l,m}$, $C \in \mathcal{C}^{\mathbb{R}}_{n,r,l}$ and $D \in \mathcal{C}^{\mathbb{R}}_{n,r,m}$. When we refer to the matrices of a circulant system, we will consider only this realization with circulant matrices.

We also consider the input $u$ to be made of $n$ blocks of size $m \times 1$, which we denote as $u_i$, and the output $y$ to be made of $n$ blocks of size $r \times 1$, which we denote as $y_i$ ($i = 1, \dots, n$). We call these blocks “local inputs” and “local outputs”.

An important property of circulant systems is the invariance with respect to shifts in the inputs and outputs. If a certain input signal $u$ generates an output signal $y$, then a permuted version of the same input ($(\Pi_n \otimes I_m) u$) will generate a permuted version of the same output ($(\Pi_n \otimes I_r) y$). This is made precise in the following Lemma.

Lemma 12 (Invariance to Input/Output Shift). Let the signal $y(k) \in \mathbb{R}^{nr}$ be a valid output of a system as in Eq. (3) when excited by the input signal $u(k) \in \mathbb{R}^{nm}$. Then $\hat{u}(k) = (\Pi_n \otimes I_m) u(k)$ and $\hat{y}(k) = (\Pi_n \otimes I_r) y(k)$ are a valid input/output pair for the same system if and only if the system is circulant.

Proof. We start by proving the “if” part, that is, all circulant systems have the shift invariance property. From Lemma 5, we can rewrite Eq. (3) as:

\[
\begin{cases}
x(k+1) = (\Pi_n^{-1} \otimes I_l) A (\Pi_n \otimes I_l) x(k) + (\Pi_n^{-1} \otimes I_l) B (\Pi_n \otimes I_m) u(k) \\
y(k) = (\Pi_n^{-1} \otimes I_r) C (\Pi_n \otimes I_l) x(k) + (\Pi_n^{-1} \otimes I_r) D (\Pi_n \otimes I_m) u(k).
\end{cases}
\]

If we perform the state transformation $\hat{x}(k) = (\Pi_n \otimes I_l) x(k)$, then the system becomes:

\[
\begin{cases}
\hat{x}(k+1) = A \hat{x}(k) + B \underbrace{(\Pi_n \otimes I_m) u(k)}_{\hat{u}(k)} \\
\underbrace{(\Pi_n \otimes I_r) y(k)}_{\hat{y}(k)} = C \hat{x}(k) + D \underbrace{(\Pi_n \otimes I_m) u(k)}_{\hat{u}(k)}.
\end{cases}
\]

We see that the dynamic equations for the input/output pair $\hat{u}(k)$ and $\hat{y}(k)$ are the same as for $u(k)$ and $y(k)$. So if $y(k)$ is a valid output for $u(k)$, then $\hat{y}(k)$ is a valid output for $\hat{u}(k)$. Of course, the initial conditions $\hat{x}(0)$ for the $(\hat{u}, \hat{y})$ pair which make this possible are related to the initial conditions $x(0)$ for $(u, y)$ by the formula $\hat{x}(0) = (\Pi_n \otimes I_l) x(0)$.

To prove the “only if” part, the first step is to understand that the shift invariance property is equivalent to having a circulant transfer function $T(z)$ for the system. If for any valid pair $u$ and $y$ we have that $(\Pi_n \otimes I_m) u$ and $(\Pi_n \otimes I_r) y$ are valid too (with appropriate initial conditions), this means that $(\Pi_n^{-1} \otimes I_r) T(z) (\Pi_n \otimes I_m) = T(z)$. From this it can be proved that there exists a state-space formulation with block circulant matrices; we omit the rest of the proof for brevity; it can be found for example in Denis and Looze (1999). $\square$
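The “if” direction of Lemma 12 can be reproduced in simulation. The following sketch builds a random circulant system, simulates it, and checks that the shifted input (with the shifted initial state) produces the shifted output. The system dimensions, the random data and the helper names are all assumptions chosen for illustration.

```python
import numpy as np

def perm_matrix(n):
    return np.roll(np.eye(n), 1, axis=1)

def block_circulant(blocks):
    """Block circulant matrix built from its first block row, as in Eq. (1)."""
    n = len(blocks)
    Pi = perm_matrix(n)
    return sum(np.kron(np.linalg.matrix_power(Pi, i), Ei)
               for i, Ei in enumerate(blocks))

def simulate(A, B, C, D, u, x0):
    """Simulate Eq. (3); u has one column per time step, x0 is the initial state."""
    x, ys = x0, []
    for k in range(u.shape[1]):
        ys.append(C @ x + D @ u[:, k])
        x = A @ x + B @ u[:, k]
    return np.column_stack(ys)

rng = np.random.default_rng(3)
n, l, m, r, N = 4, 2, 1, 1, 50

def random_circulant(p, q, scale=1.0):
    return block_circulant([scale * rng.standard_normal((p, q)) for _ in range(n)])

A = random_circulant(l, l, 0.2)     # small entries keep the simulation well scaled
B, C, D = random_circulant(l, m), random_circulant(r, l), random_circulant(r, m)

u = rng.standard_normal((n * m, N))
x0 = rng.standard_normal(n * l)
y = simulate(A, B, C, D, u, x0)

# Shift local inputs, local outputs and the initial state by one position.
Pi = perm_matrix(n)
Pm, Pr, Pl = (np.kron(Pi, np.eye(s)) for s in (m, r, l))
y_shifted = simulate(A, B, C, D, Pm @ u, Pl @ x0)
assert np.allclose(y_shifted, Pr @ y)
```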

Lemma 12 shows that a circulant system can be equivalently defined as a system which has this shift invariance property. This is of fundamental importance, because it makes it possible to recognize a system as circulant a priori, from physical insight, without knowing its dynamic equations. If a system possesses symmetries such that a shift in the input signals is known to generate a shift in the output signals, then it is possible to assume a circulant structure for it in the identification process. This circulant structure can be exploited to derive a specific subspace identification algorithm that assumes such structure. The following Theorem is key to the development of such an algorithm.

Theorem 13 (Decomposition Property). A circulant system of order $nl$ as described in Definition 11 is equivalent to $n$ independent systems of order $l$ in the complex domain. Each of these subsystems has only $m$ inputs and $r$ outputs.

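Proof. According to Corollary 9, it holds that:

\[
\begin{aligned}
A &= (F_n \otimes I_l)^H \tilde{A} (F_n \otimes I_l) \\
B &= (F_n \otimes I_l)^H \tilde{B} (F_n \otimes I_m) \\
C &= (F_n \otimes I_r)^H \tilde{C} (F_n \otimes I_l) \\
D &= (F_n \otimes I_r)^H \tilde{D} (F_n \otimes I_m)
\end{aligned} \tag{4}
\]

with $\tilde{A} \in \mathcal{D}^{\mathbb{C}}_{n,l,l}$, $\tilde{B} \in \mathcal{D}^{\mathbb{C}}_{n,l,m}$, $\tilde{C} \in \mathcal{D}^{\mathbb{C}}_{n,r,l}$, $\tilde{D} \in \mathcal{D}^{\mathbb{C}}_{n,r,m}$. So we can rewrite Eq. (3) as:

\[
\begin{cases}
x(k+1) = (F_n \otimes I_l)^H \tilde{A} (F_n \otimes I_l) x(k) + (F_n \otimes I_l)^H \tilde{B} (F_n \otimes I_m) u(k) \\
y(k) = (F_n \otimes I_r)^H \tilde{C} (F_n \otimes I_l) x(k) + (F_n \otimes I_r)^H \tilde{D} (F_n \otimes I_m) u(k)
\end{cases}
\]

\[
\Leftrightarrow
\begin{cases}
(F_n \otimes I_l) x(k+1) = \tilde{A} (F_n \otimes I_l) x(k) + \tilde{B} (F_n \otimes I_m) u(k) \\
(F_n \otimes I_r) y(k) = \tilde{C} (F_n \otimes I_l) x(k) + \tilde{D} (F_n \otimes I_m) u(k).
\end{cases}
\]

If we apply the following invertible transformations for state, input and output:

\[
\begin{aligned}
\tilde{x}(k) &= (F_n \otimes I_l) x(k) \\
\tilde{u}(k) &= (F_n \otimes I_m) u(k) \\
\tilde{y}(k) &= (F_n \otimes I_r) y(k)
\end{aligned} \tag{5}
\]

then the system turns into:

\[
\begin{cases}
\tilde{x}(k+1) = \tilde{A} \tilde{x}(k) + \tilde{B} \tilde{u}(k) \\
\tilde{y}(k) = \tilde{C} \tilde{x}(k) + \tilde{D} \tilde{u}(k).
\end{cases} \tag{6}
\]

All the matrices involved in this system are block diagonal, so this system is equivalent to the following $n$ independent $l$th order subsystems (of complex variables), each of them with $m$ inputs and $r$ outputs:

\[
\begin{cases}
\tilde{x}_i(k+1) = \tilde{A}_i \tilde{x}_i(k) + \tilde{B}_i \tilde{u}_i(k) \\
\tilde{y}_i(k) = \tilde{C}_i \tilde{x}_i(k) + \tilde{D}_i \tilde{u}_i(k)
\end{cases}
\quad \text{for } i = 1, \dots, n \tag{7}
\]

where $\tilde{A}_i$, $\tilde{B}_i$, $\tilde{C}_i$ and $\tilde{D}_i$ are respectively the blocks on the diagonal of $\tilde{A}$, $\tilde{B}$, $\tilde{C}$ and $\tilde{D}$, and $\tilde{x}_i(k)$, $\tilde{u}_i(k)$ and $\tilde{y}_i(k)$ are the blocks of the column vectors $\tilde{x}(k)$, $\tilde{u}(k)$ and $\tilde{y}(k)$; $\tilde{A}_i \in \mathbb{C}^{l \times l}$, $\tilde{B}_i \in \mathbb{C}^{l \times m}$, $\tilde{C}_i \in \mathbb{C}^{r \times l}$, $\tilde{D}_i \in \mathbb{C}^{r \times m}$, $\tilde{x}_i(k) \in \mathbb{C}^{l \times 1}$, $\tilde{u}_i(k) \in \mathbb{C}^{m \times 1}$ and $\tilde{y}_i(k) \in \mathbb{C}^{r \times 1}$. $\square$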

Notice that the systems into which the global system is decomposed have nothing to do with the “physical” subsystems which make up the system, like the ones shown in Fig. 1. The state-space systems of complex variables found here can be seen as a kind of modal decomposition of the global system; in order to stress the difference between these and the “physical” subsystems, we will call the former “modal” subsystems.

Remark 14. The decomposition property (Theorem 13) can be interpreted under the formalism of systems over spatial groups, as in Bamieh, Paganini, and Dahleh (2002). In this perspective, the circulant dynamic system is an operator with spatial coordinates ranging over $\mathbb{Z}_n$ (the finite group of integers modulo $n$) that has the property of “spatial invariance” (Lemma 12). A Fourier transform of the coordinates into its dual group ($\mathbb{Z}_n$ again in this case) is then able to block-diagonalize the system.

It is also important to point out that not all the $n$ modal subsystems of Eq. (7) are independent; actually, as a direct consequence of Corollary 10, the systems of index $n+2-i$ are the complex conjugate version of the systems of index $i$, for $i = \{2, \dots, n\}$. So there are only $n/2 + 1$ independent systems if $n$ is even and $(n+1)/2$ independent systems if $n$ is odd.

Corollary 15 (Properties of Decomposition). With respect to Eq. (7), let $P_i$ indicate any among the following: $\tilde{A}_i$, $\tilde{B}_i$, $\tilde{C}_i$, $\tilde{D}_i$, $\tilde{x}_i(k)$, $\tilde{u}_i(k)$ and $\tilde{y}_i(k)$. It holds that:

\[
P_1 \text{ is real}, \qquad P_{n+2-i} = \bar{P}_i \quad \text{for } i = \{2, \dots, n\}.
\]

Proof. It is a consequence of Lemma 7. $\square$
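The decomposition of Theorem 13 and the conjugate symmetry of Corollary 15 can be illustrated in simulation. In the sketch below a random circulant system is simulated from a zero initial state, and each modal subsystem of Eq. (7), driven by the transformed input of Eq. (5), is checked to reproduce the corresponding block of the transformed output. All names, dimensions and data are illustrative assumptions.

```python
import numpy as np

def fourier_matrix(n):
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def block_circulant(blocks):
    n = len(blocks)
    Pi = np.roll(np.eye(n), 1, axis=1)
    return sum(np.kron(np.linalg.matrix_power(Pi, i), Ei)
               for i, Ei in enumerate(blocks))

def simulate(A, B, C, D, u):
    """Zero-initial-state response; u has one column per time step."""
    x, ys = np.zeros(A.shape[0], dtype=complex), []
    for k in range(u.shape[1]):
        ys.append(C @ x + D @ u[:, k])
        x = A @ x + B @ u[:, k]
    return np.column_stack(ys)

def diag_block(M, i, p, q):
    """i-th (0-based) p x q block on the diagonal of M."""
    return M[i * p:(i + 1) * p, i * q:(i + 1) * q]

rng = np.random.default_rng(4)
n, l, m, r, N = 5, 2, 1, 2, 40

def random_circulant(p, q, scale=1.0):
    return block_circulant([scale * rng.standard_normal((p, q)) for _ in range(n)])

A = random_circulant(l, l, 0.2)
B, C, D = random_circulant(l, m), random_circulant(r, l), random_circulant(r, m)

F = fourier_matrix(n)
Fl, Fm, Fr = (np.kron(F, np.eye(s)) for s in (l, m, r))
# Block diagonal matrices, obtained by inverting Eq. (4).
At, Bt = Fl @ A @ Fl.conj().T, Fl @ B @ Fm.conj().T
Ct, Dt = Fr @ C @ Fl.conj().T, Fr @ D @ Fm.conj().T

u = rng.standard_normal((n * m, N))
y = simulate(A, B, C, D, u)
ut, yt = Fm @ u, Fr @ y                       # transformed signals, Eq. (5)

for i in range(n):
    yi = simulate(diag_block(At, i, l, l), diag_block(Bt, i, l, m),
                  diag_block(Ct, i, r, l), diag_block(Dt, i, r, m),
                  ut[i * m:(i + 1) * m])
    assert np.allclose(yi, yt[i * r:(i + 1) * r])     # modal subsystem i, Eq. (7)

# Corollary 15: first block real, the others in complex conjugate pairs.
assert np.allclose(diag_block(At, 0, l, l).imag, 0)
for i in range(1, n):
    assert np.allclose(diag_block(At, n - i, l, l), diag_block(At, i, l, l).conj())
    assert np.allclose(ut[(n - i) * m:(n - i + 1) * m], ut[i * m:(i + 1) * m].conj())
```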

3. Identification of circulant systems

3.1. Motivation and rationale

As a consequence of Lemma 12, we have seen that there exist categories of systems which can be identified as circulant just from physical insight, as a result of the invariance of their input and output pairs to shifts. As an example, consider again the system shown in Fig. 1: the global system is made of four smaller identical subsystems, each with its input and output, connected in a circular way. The interconnections between neighboring systems are all the same, and it is actually impossible to distinguish one system from the other.

In such a situation, putting Subsystem 1 in the place of Subsystem 2, Subsystem 2 in the place of Subsystem 3, Subsystem 3 in the place of Subsystem 4, and Subsystem 4 in the place of Subsystem 1 would still yield the same global system. Then we know that the invariance to shift of input/output pairs of Lemma 12 must hold, as it is impossible to know if we are looking at the original system or its shifted (or “rotated”) version. An example of systems of this kind can be found in adaptive optics. Fig. 2 shows a scheme of a deformable mirror (Hamelinck, Rosielle, Steinbuch, & Doelman, 2005), which is made of a set of actuators placed on a regular grid which can displace the reflecting surface.


Fig. 2. An adaptive optics mirror. Each circle contains an actuator.

The mirror is invariant to rotations of multiples of 60 degrees, and it can be seen as the interconnection of six identical sectors, which strongly influence each other. Another kind of circulant system could be a system made of two identical interconnected parts, which could simply be a physical object with one plane of symmetry.

So it may be necessary to identify systems of this kind from data. Subspace methods (Van Overschee & De Moor, 1994; Verhaegen, 1994) are the most common choice for MIMO systems, and they could be used in a situation like this to identify a discrete-time state-space model of the global system from the set of all outputs and all inputs. The problem with this approach is that subspace methods return state-space matrices up to an arbitrary similarity transformation, which disrupts any structure the system may have. Moreover, this paper will also demonstrate that it can be useful to impose the circulant structure on the model in order to improve the accuracy of the estimation: the knowledge of the symmetries of the system can be used as a priori information on the MIMO model.

We will shortly show that it is indeed possible to exploit the structure of circulant systems for identification; in fact, we will illustrate an identification algorithm that:

(1) allows using the prior knowledge of the system as circulant;
(2) reduces the computational complexity of the problem;
(3) preserves the circulant structure, that is, the identified model is again a circulant system.

The algorithm is a direct consequence of the decomposition property of circulant systems (Theorem 13) and it can be outlined as follows. As the system can be turned into $n$ independent subsystems, and as for each of these subsystems we can find a priori which are the inputs and outputs, it is possible to identify each of these modal subsystems separately from each other. For this purpose, it is sufficient to transform the inputs and the outputs as in Eq. (5), and use them with any method (subspace identification, prediction error, etc.; see for example Ljung, 1987) to identify the state-space matrices of the modal systems; the only additional care we need to take is that we should extend the method to models with complex values. Actually, not all the $n$ subsystems have to be identified, but only the independent ones, while the others are just the complex conjugates, as explained in Corollary 15. Then, once these systems have been identified, the global model can be retrieved with the use of Eq. (4). Corollary 9 guarantees that the global matrices obtained are block circulant, while Corollary 10 guarantees that such matrices have real values.

We said in the previous paragraph that any method can be used for identifying the modal subsystems; actually, subspace methods seem to be the best choice at this point, as they are inherently suited to state-space models (instead of transfer functions) and they can naturally be extended to the complex domain (as was done in e.g. McKelvey, 2004, for a different purpose). The subspace identification process is a “numerical recipe” that yields four matrices as the result of an input/output pair; all the algebraic operations used in subspace identification (matrix sum, matrix product, singular value decomposition or QR factorization) can be extended to complex numbers. Moreover, subspace methods offer insight into the order $l$ of the subsystems from the singular values of the extended observability matrices (see Verhaegen & Verdult, 2007 for details). This makes it possible to choose a good value of the order: although the different subsystems may yield different results, it is necessary to choose the same order $l$ for all of them (we omit a complete discussion on this topic for the sake of brevity). For these reasons, in the sequel of the paper we will use a subspace algorithm, specifically the MOESP (Multi-variable Output-Error State sPace) algorithm (Verhaegen, 1994). MOESP is suited to systems with white measurement noise only, and in the examples here we will restrict ourselves to them, but of course the idea of the algorithm can be extended to more sophisticated subspace methods, for example PI-MOESP, PO-MOESP (Verhaegen & Verdult, 2007) or N4SID (Van Overschee & De Moor, 1994), that take into account different models of noise.

Now we are almost ready to write the algorithm explicitly in its steps. But first, we discuss the condition of “persistence of excitation” that is necessary for the identification process.

3.2. Persistence of excitation

The persistence of excitation is a requirement that is put on input signals in order to make system identification possible.

Definition 16 (Persistence of Excitation; Verhaegen & Verdult, 2007). The signal $u(k)$, for $k = 0, 1, \dots$, is persistently exciting of order $n$ if and only if there exists an integer $N$ such that the Hankel matrix of the input:

\[
U_{0,n,N} =
\begin{bmatrix}
u(0) & u(1) & \cdots & u(N-1) \\
u(1) & u(2) & \cdots & u(N) \\
\vdots & \vdots & \ddots & \vdots \\
u(n-1) & u(n) & \cdots & u(N+n-2)
\end{bmatrix}
\]

has full row rank.

If we use a subspace method for identifying an $n$th order model, then it is necessary to have an input which is persistently exciting of at least order $n$ (Willems, Rapisarda, Markovsky, & De Moor, 2005). In case we want to identify a circulant system of order $nl$, we would then expect that the input has to be at least persistently exciting of order $nl$. Actually, this is not really necessary, as we identify the modal subsystems and not the full model itself. So for the subsystems we only need that the $\tilde{u}_i$ which is involved is persistently exciting of order $l$, which is less restrictive. The following Lemma and an example will give more insight into the issue of persistence of excitation for circulant systems.
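Definition 16 translates directly into a rank test on a block Hankel matrix. The sketch below is a minimal illustration; the function names, the horizon `N` and the two test signals are assumptions. It uses the known fact that a single sinusoid satisfies a second-order linear recurrence, so it is persistently exciting of order 2 but not of order 3.

```python
import numpy as np

def block_hankel(u, s, N):
    """Block Hankel matrix U_{0,s,N} of Definition 16 (hypothetical helper):
    block row i holds u(i), ..., u(i+N-1); u has shape (channels, samples)."""
    assert u.shape[1] >= N + s - 1, "not enough samples"
    return np.vstack([u[:, i:i + N] for i in range(s)])

def is_persistently_exciting(u, s, N):
    """Numerical rank test for persistence of excitation of order s with horizon N."""
    H = block_hankel(u, s, N)
    return np.linalg.matrix_rank(H) == H.shape[0]

rng = np.random.default_rng(5)

# A white-noise sequence is persistently exciting of high order.
white = rng.standard_normal((1, 200))
assert is_persistently_exciting(white, 10, 100)

# A single sinusoid obeys u(k+2) = 2 cos(w) u(k+1) - u(k): order 2, but not 3.
k = np.arange(200)
sine = np.sin(0.3 * k)[None, :]
assert is_persistently_exciting(sine, 2, 100)
assert not is_persistently_exciting(sine, 3, 100)
```

In practice the rank decision depends on a numerical tolerance; inspecting the singular values of the Hankel matrix is a more informative alternative to a hard rank test.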

Lemma 17. If the full input signal $u(k)$ of a circulant system (Eq. (3)) is persistently exciting of order $s$, then each one of the “modal” input signals $\tilde{u}_i(k)$ obtained through Eq. (5) is persistently exciting of order $s$.

Proof. If the input is persistently exciting of order $s$, then there exists an $N$ for which $U_{0,s,N}$ has full row rank. If we consider the Hankel matrix $\tilde{U}_{0,s,N}$ of the transformed input $\tilde{u}$, it is easy to see that it is related to $U_{0,s,N}$ by the relation:

\[
\tilde{U}_{0,s,N} = (I_s \otimes F_n \otimes I_m) \, U_{0,s,N}.
\]

As $I_s \otimes F_n \otimes I_m$ is full rank, then, thanks to Sylvester’s inequality (Verhaegen & Verdult, 2007), if $U_{0,s,N}$ has full row rank then $\tilde{U}_{0,s,N}$ has full row rank as well. We call $\tilde{U}^i_{0,s,N}$ for $i = \{1, \dots, n\}$ the Hankel matrices obtained from the single “modal” inputs $\tilde{u}_i$. All of these matrices are submatrices of $\tilde{U}_{0,s,N}$, which has full row rank, and so they all have full row rank ($\tilde{U}^i_{0,s,N}$ is made of the rows of $\tilde{U}_{0,s,N}$ containing $\tilde{u}_i$). So each signal $\tilde{u}_i(k)$ is persistently exciting of order $s$. $\square$

The relevant consequence of this Lemma is that we do not actually need a signal with persistence of excitation of order nl in order to identify a circulant system of the same order; order l is enough, as with it we can identify the modal systems, which are of order l. Besides this, there can be signals which are not even persistently exciting of order l for the full system, but are so for the modal systems once transformed.

Consider for example a situation where all the n local inputs are equal to zero but one (let us assume it is $u_1$). Also assume that $u_1$ alone would be persistently exciting of order l, meaning that the Hankel matrix $U^1_{0,s,N}$ built with $u_1$ is full rank. Instead, $U_{0,s,N}$ will never be full rank as it contains some null rows, making the full signal u persistently exciting of order 0. So it will not be possible to identify the full system without making any assumptions on its structure. But the modal systems can be identified, as the matrices $\hat{U}^i_{0,s,N}$ will all be full rank; in fact $\hat{U}^i_{0,s,N} = U^1_{0,s,N}/\sqrt{n}$. This example shows that the knowledge of the structure of the system makes the identification possible for a much wider class of input signals, and that it is not even necessary to apply an input to all the physical subsystems: a single input channel can be enough.
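The single-channel example can be reproduced in a few lines. This is only an illustrative sketch with m = 1 and arbitrary sizes; `block_hankel` is a hypothetical helper and the Fourier matrix uses an assumed unitary convention, whose first column is 1/√n regardless of the sign choice.

```python
import numpy as np

def block_hankel(sig, s, N):
    # Block row a holds the samples sig[:, a], ..., sig[:, a + N - 1].
    return np.vstack([sig[:, a:a + N] for a in range(s)])

n, s, N = 4, 3, 30                          # m = 1 for simplicity
idx = np.arange(n)
F = np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)  # assumed convention

rng = np.random.default_rng(1)
u = np.zeros((n, s + N - 1))
u[0] = rng.standard_normal(s + N - 1)       # only the first local input is active
u_hat = F @ u                               # every modal input equals u_1 / sqrt(n)

U_full = block_hankel(u, s, N)              # s*n rows, most of them null
U_1 = block_hankel(u[:1], s, N)             # Hankel matrix of u_1 alone
U_hat_i = [block_hankel(u_hat[i:i + 1], s, N) for i in range(n)]

rank_full = np.linalg.matrix_rank(U_full)
modal_ranks = [np.linalg.matrix_rank(Ui) for Ui in U_hat_i]
all_scaled_copies = all(np.allclose(Ui, U_1 / np.sqrt(n)) for Ui in U_hat_i)
print(rank_full, s * n, modal_ranks, all_scaled_copies)
```

The full Hankel matrix is rank deficient because of its null rows, while every modal Hankel matrix is a scaled copy of the one built from $u_1$ and keeps full row rank.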

3.3. The novel algorithm

Algorithm 18 (Circulant System Identification). A set of n input signals $u_i(k) \in \mathbb{R}^{m \times 1}$ and n output signals $y_i(k) \in \mathbb{R}^{r \times 1}$ is given, for i = 1, ..., n and k = 1, ..., $k_{max}$. This set of data is associated to a dynamic system; we know, thanks to considerations stemming from Lemma 12, that this system has a circulant structure and that we can use a model of circulant system according to Definition 11 to describe it, where n, m and r are already known and l is unknown.

Problem: identify an lnth order state-space circulant model from input–output data.

The problem is solved in the following steps:

(1) Compute the Fourier matrix $F_n$ of order n.

(2) Transform input and output signals, by computing:

$$\hat{u}(k) = (F_n \otimes I_m)\, u(k), \qquad \hat{y}(k) = (F_n \otimes I_r)\, y(k).$$

(3) Verify that each signal $\hat{u}_i(k)$ is persistently exciting of at least order l.

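(4) Use MOESP to identify independent state-space models of order l from each $\hat{u}_i$/$\hat{y}_i$ pair:

$$\begin{cases} \hat{x}_i(k+1) = \hat{A}_i \hat{x}_i(k) + \hat{B}_i \hat{u}_i(k) \\ \hat{y}_i(k) = \hat{C}_i \hat{x}_i(k) + \hat{D}_i \hat{u}_i(k) \end{cases} \qquad \text{for } i = 1, \ldots, n.$$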

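If n is even, then identify the systems for i = 1, ..., n/2 + 1; if n is odd instead, identify the systems for i = 1, ..., (n + 1)/2. The other values of i correspond to signals which are just the complex conjugates, so they do not contain further information (for even n, the system i = n/2 + 1 is its own partner under the index map i → n + 2 − i, so it has to be identified directly). The method will yield as results the identified (complex) matrices $\hat{A}_i$, $\hat{B}_i$, $\hat{C}_i$ and $\hat{D}_i$. These sets of four matrices are unique up to a similarity transformation, but shortly we will show a Theorem demonstrating that this is not a problem. Then use Corollary 15 to get the matrices of the other (dependent) systems as complex conjugates:

$$\hat{A}_i = \overline{\hat{A}_{n+2-i}}, \quad \hat{B}_i = \overline{\hat{B}_{n+2-i}}, \quad \hat{C}_i = \overline{\hat{C}_{n+2-i}}, \quad \hat{D}_i = \overline{\hat{D}_{n+2-i}} \qquad \text{for } i = \begin{cases} \frac{n}{2} + 2, \ldots, n & \text{if } n \text{ even} \\ \frac{n+1}{2} + 1, \ldots, n & \text{if } n \text{ odd.} \end{cases}$$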

(5) Construct the block diagonal matrices $\hat{A}$, $\hat{B}$, $\hat{C}$ and $\hat{D}$ by putting the identified blocks together.

(6) Retrieve the global system matrices with the following formulas:

$$\begin{aligned} A &= (F_n \otimes I_l)^H \hat{A}\, (F_n \otimes I_l) \\ B &= (F_n \otimes I_l)^H \hat{B}\, (F_n \otimes I_m) \\ C &= (F_n \otimes I_r)^H \hat{C}\, (F_n \otimes I_l) \\ D &= (F_n \otimes I_r)^H \hat{D}\, (F_n \otimes I_m). \end{aligned} \tag{8}$$

A, B, C and D are real and block circulant, thanks to Corollaries 15, 9 and 10. Notice that it is not necessary to compute the multiplications above fully; it is enough to get the first block row of these matrices, and the others are then known thanks to the circulant structure.
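The structural part of the algorithm (steps 1, 5 and 6, together with the conjugate completion of step 4) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the MOESP estimation itself is not performed and is replaced by the exact modal blocks of a known, randomly generated block circulant matrix; all function names are hypothetical; indices are 0-based, so the pairing i ↔ n + 2 − i becomes p ↔ (n − p) mod n; the Fourier matrix convention is assumed.

```python
import numpy as np

def fourier_matrix(n):
    # Unitary DFT matrix (assumed sign and scaling convention).
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def block_circulant(first_row):
    # Block (p, q) equals first_row[(q - p) mod n].
    n = len(first_row)
    return np.block([[first_row[(q - p) % n] for q in range(n)] for p in range(n)])

def block_diag(blocks):
    n, (rows, cols) = len(blocks), blocks[0].shape
    out = np.zeros((n * rows, n * cols), dtype=complex)
    for p, blk in enumerate(blocks):
        out[p * rows:(p + 1) * rows, p * cols:(p + 1) * cols] = blk
    return out

def complete_by_conjugation(independent, n):
    # Fill the dependent modal blocks: block p is the conjugate of block n - p.
    return [independent[p] if p < len(independent) else np.conj(independent[n - p])
            for p in range(n)]

def to_global(modal_blocks, n):
    # Eq. (8): M = (F_n (x) I_rows)^H  M_hat  (F_n (x) I_cols)
    rows, cols = modal_blocks[0].shape
    F = fourier_matrix(n)
    left = np.kron(F, np.eye(rows)).conj().T
    return left @ block_diag(modal_blocks) @ np.kron(F, np.eye(cols))

rng = np.random.default_rng(2)
n, l = 4, 3
A_true = block_circulant([rng.standard_normal((l, l)) for _ in range(n)])

# Stand-in for the MOESP step: exact modal blocks of the known system.
T = np.kron(fourier_matrix(n), np.eye(l))
A_hat_full = T @ A_true @ T.conj().T
diag_blocks = [A_hat_full[p * l:(p + 1) * l, p * l:(p + 1) * l] for p in range(n)]
off_diag = A_hat_full - block_diag(diag_blocks)

independent = diag_blocks[:n // 2 + 1]                 # blocks that must be identified
A_hat_blocks = complete_by_conjugation(independent, n)  # the rest follow by conjugation
A_rec = to_global(A_hat_blocks, n)                      # step 6

print(np.abs(off_diag).max(), np.abs(A_rec.imag).max(), np.abs(A_rec.real - A_true).max())
```

The three printed numbers should all be at rounding-error level: the transformed matrix is block diagonal, and the matrix rebuilt from only the independent modal blocks is real and matches the original block circulant matrix.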

Subspace algorithms deliver the system matrices up to a similarity transformation. We show in the next Theorem that a set of independent similarity transformations of the n independent subsystems is of no concern for the final result. In fact, all the possible similarity transformations of the subsystems are always equivalent to a global similarity transformation for the complete system.

Theorem 19 (Similarity Transformations). Let us assume that each of the subsystems described by Eq. (7) is known up to a similarity transformation by a nonsingular matrix $T_i$:

$$\begin{cases} \hat{x}_i(k+1) = T_i^{-1} \hat{A}_i T_i\, \hat{x}_i(k) + T_i^{-1} \hat{B}_i\, \hat{u}_i(k) \\ \hat{y}_i(k) = \hat{C}_i T_i\, \hat{x}_i(k) + \hat{D}_i\, \hat{u}_i(k) \end{cases} \qquad \text{for } i = 1, \ldots, n.$$

Then this is equivalent to knowing the global circulant system up to a global similarity transformation.

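Proof. If we use Eq. (8) to recover the global matrices from the ones of the subsystems, transformed by $T_i$, then we have (we neglect D, which is not influenced by similarity transformations):

$$\begin{aligned} A_T &= (F_n \otimes I_l)^H \hat{T}^{-1} \hat{A}\, \hat{T}\, (F_n \otimes I_l) \\ B_T &= (F_n \otimes I_l)^H \hat{T}^{-1} \hat{B}\, (F_n \otimes I_m) \\ C_T &= (F_n \otimes I_r)^H \hat{C}\, \hat{T}\, (F_n \otimes I_l) \end{aligned}$$

where $\hat{T} \in \mathcal{D}_{n,l,l}$ is the block diagonal matrix containing all the $T_i$ blocks; $\hat{T}^{-1}$ is block diagonal as well. We can rewrite the equations inserting some identity matrices in key points; for the first one, we have:

$$A_T = (F_n \otimes I_l)^H \hat{T}^{-1} \underbrace{(F_n \otimes I_l)(F_n \otimes I_l)^H}_{I_{nl}} \hat{A}\, \underbrace{(F_n \otimes I_l)(F_n \otimes I_l)^H}_{I_{nl}} \hat{T}\, (F_n \otimes I_l).$$

Since $\hat{T}$ is block diagonal, $\mathcal{T} = (F_n \otimes I_l)^H \hat{T}^{-1} (F_n \otimes I_l)$ is block circulant (Corollary 9). So we have:

$$A_T = \mathcal{T}\, (F_n \otimes I_l)^H \hat{A}\, (F_n \otimes I_l)\, \mathcal{T}^{-1} = \mathcal{T} A\, \mathcal{T}^{-1}.$$

In a similar way, we can show that:

$$B_T = \mathcal{T} B, \qquad C_T = C\, \mathcal{T}^{-1},$$

which is the same as saying that we know the system matrices up to a similarity transformation. □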
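The identities of the proof can be verified numerically. The sketch below is only a sanity check under assumptions: the modal blocks and the matrices $T_i$ are arbitrary random matrices (so the global matrices built here need not be real), the names `T_hat` and `T_glob` stand for the block diagonal transformation and its block circulant counterpart, and the Fourier matrix convention is assumed.

```python
import numpy as np

def fourier_matrix(n):
    # Unitary DFT matrix (assumed sign and scaling convention).
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def block_diag(blocks):
    n, (rows, cols) = len(blocks), blocks[0].shape
    out = np.zeros((n * rows, n * cols), dtype=complex)
    for p, blk in enumerate(blocks):
        out[p * rows:(p + 1) * rows, p * cols:(p + 1) * cols] = blk
    return out

rng = np.random.default_rng(3)
n, l, m, r = 4, 3, 2, 2
Fl, Fm, Fr = [np.kron(fourier_matrix(n), np.eye(k)) for k in (l, m, r)]

A_hat = block_diag([rng.standard_normal((l, l)) for _ in range(n)])
B_hat = block_diag([rng.standard_normal((l, m)) for _ in range(n)])
C_hat = block_diag([rng.standard_normal((r, l)) for _ in range(n)])
# Diagonally dominant blocks, hence guaranteed nonsingular.
T_hat = block_diag([np.eye(l) + 0.1 * rng.uniform(-1, 1, (l, l)) for _ in range(n)])
T_hat_inv = np.linalg.inv(T_hat)

# Eq. (8) without and with the independent modal similarity transformations.
A, B, C = Fl.conj().T @ A_hat @ Fl, Fl.conj().T @ B_hat @ Fm, Fr.conj().T @ C_hat @ Fl
A_T = Fl.conj().T @ T_hat_inv @ A_hat @ T_hat @ Fl
B_T = Fl.conj().T @ T_hat_inv @ B_hat @ Fm
C_T = Fr.conj().T @ C_hat @ T_hat @ Fl

# Global transformation from the proof and its inverse.
T_glob = Fl.conj().T @ T_hat_inv @ Fl
T_glob_inv = np.linalg.inv(T_glob)

same_A = np.allclose(A_T, T_glob @ A @ T_glob_inv)
same_B = np.allclose(B_T, T_glob @ B)
same_C = np.allclose(C_T, C @ T_glob_inv)

def blk(M, p, q):
    return M[p * l:(p + 1) * l, q * l:(q + 1) * l]

is_block_circulant = all(np.allclose(blk(T_glob, p, q), blk(T_glob, 0, (q - p) % n))
                         for p in range(n) for q in range(n))
print(same_A, same_B, same_C, is_block_circulant)
```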

Fig. 3. Poles of the identified model in a set of 50 different experiments.

Remark 20. As a last consideration, let us evaluate the reduction in complexity that is obtained by using the proposed algorithm instead of a global MOESP. The complexity of MOESP is determined by its most costly operation, that is the QR factorization; in this analysis, we limit ourselves to looking at this step. For a matrix in $\mathbb{R}^{j \times k}$, the cost of the QR factorization is $O(jk^2)$ (Golub & Van Loan, 1996). Application of MOESP to the complete system with nm inputs and nr outputs, for N time steps, requires the QR factorization of a matrix with N rows and sn(m + r) columns, where s is a number of choice that is bigger than the order (so s = O(nl)). This means that the cost of the QR for the global MOESP is $O(N s^2 n^2 (m + r)^2) = O(N (m + r)^2 n^4 l^2)$. If we use the circulant MOESP, we need to perform n times the QR decomposition of N × s(m + r) matrices, where now s = O(l). This means that the global cost of the QR's for the circulant MOESP is $O(N (m + r)^2 n l^2)$, a factor $n^3$ less. Of course the circulant MOESP also requires the signal transformations and the construction of the global matrices (steps 2 and 6 of Algorithm 18); these operations can be done with the FFT algorithm, and they have a computational cost of $O(N (r + m) n \log n)$ and $O(l^2 n \log n)$ respectively, which is in any case considerably smaller than the cost of the QR.
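The factor $n^3$ of Remark 20 can be made concrete with a small cost model. This is only an order-of-magnitude sketch: constants are ignored, the function names are hypothetical, and the choice of s as twice the order is an assumption made here for illustration (the Remark only requires s to be bigger than the order).

```python
def qr_cost(rows, cols):
    # Leading-order cost model O(rows * cols^2) for the QR step; constants ignored.
    return rows * cols ** 2

def global_moesp_cost(N, n, l, m, r, factor=2):
    s = factor * n * l                 # assumed choice: s proportional to the global order n*l
    return qr_cost(N, s * n * (m + r))

def circulant_moesp_cost(N, n, l, m, r, factor=2):
    s = factor * l                     # assumed choice: s proportional to the modal order l
    return n * qr_cost(N, s * (m + r))  # n independent modal identifications

N, n, l, m, r = 200, 4, 3, 1, 1        # sizes of the order used in the first simulation example
ratio = global_moesp_cost(N, n, l, m, r) / circulant_moesp_cost(N, n, l, m, r)
print(ratio)  # 64.0, i.e. n**3 for n = 4
```

With the same proportionality factor for s in both cases, the ratio is exactly $n^3$ and does not depend on N, l, m or r.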

4. Some simulation results

4.1. Measurement noise

For demonstrating the use of the algorithm, a stable circulant system of 12th order, with n = 4, l = 3, m = 1 and r = 1, was randomly generated. The four input signals are made of 200 random samples each; white measurement noise has been added to all the four outputs.

In the test, we generated 250 different input/output pairs, and used them to identify the system. The algorithm shown in this paper (from now on, we will call it "circulant MOESP") was used and compared to a standard MOESP that assumes no structure at all for the system. Fig. 3 shows the poles of the true system, together with the poles identified with the two different methods in 50 of the 250 runs; the poles identified with standard MOESP are indicated by a cross, while those which were found with the algorithm which assumes a circulant structure are indicated by a circle. At a glance it is possible to see that the circles are in general closer to the true poles than the crosses (Fig. 4 shows a magnification around one of the poles). Table 1 makes this observation more rigorous, by comparing the root mean square of the error in identifying each of the poles of the system.

So this example suggests that if we have a system with circulant structure, the novel method performs better than standard MOESP.

Fig. 4. Detail of Fig. 3 around one of the poles.

Table 1. Comparison of performances of the two different methods in identifying the poles with measurement noise.

| Pole | Root mean square error, standard MOESP | Root mean square error, circulant MOESP |
|---|---|---|
| −0.02486 | 0.04641 | 0.01958 |
| 0.13497 ± 0.17077j | 0.06768 | 0.01879 |
| 0.27881 ± 0.21487j | 0.08064 | 0.02038 |
| 0.38761 ± 0.26329j | 0.02801 | 0.00877 |
| 0.60841 ± 0.20941j | 0.00783 | 0.00394 |
| 0.65795 ± 0.04966j | 0.02973 | 0.00870 |
| 0.68937 | 0.05027 | 0.01679 |

4.2. Non perfectly circulant systems

Another test has been done adding a "random perturbation" to the A matrix as well. This causes the system to be not perfectly circulant (which is the most likely case in real-life situations), but it has been verified that the method is still applicable; the idea is to show that small perturbations in the circulant structure do not cause completely wrong results. Again, we generated 250 different input/output pairs (with measurement noise), and used them to identify the system. For each pair, the A matrix has been perturbed with a different random perturbation matrix, each element of which was smaller than 1/1000 in modulus. For perturbations as small as these, there is still an advantage in the accuracy of the method with respect to standard MOESP, as shown in Fig. 5 and in Table 2.

Fig. 5. Error in identifying one of the poles (the second in Table 2) in 50 experiments with perturbation on the A matrix.


Table 2. Comparison of performances of the different methods in identifying the poles, with measurement noise and a perturbation on A.

| Pole (if no noise) | Root mean square error, standard MOESP | Root mean square error, circulant MOESP |
|---|---|---|
| −0.02486 | 0.05287 | 0.02033 |
| 0.13497 ± 0.17077j | 0.06198 | 0.02002 |
| 0.27881 ± 0.21487j | 0.07897 | 0.02169 |
| 0.38761 ± 0.26329j | 0.02744 | 0.00926 |
| 0.60841 ± 0.20941j | 0.00771 | 0.00426 |
| 0.65795 ± 0.04966j | 0.02874 | 0.00922 |
| 0.68937 | 0.05199 | 0.01857 |
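One way to get a feeling for why perturbations of the size used in Section 4.2 remain tolerable is to look at how much they spoil the modal decoupling. The sketch below does not reproduce the identification experiment of the paper: it only perturbs a randomly generated block circulant matrix with entries smaller than 1/1000 in modulus and measures the coupling between modal subsystems that appears after the Fourier transformation. All names are hypothetical and the Fourier matrix convention is assumed.

```python
import numpy as np

def fourier_matrix(n):
    # Unitary DFT matrix (assumed sign and scaling convention).
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def block_circulant(first_row):
    # Block (p, q) equals first_row[(q - p) mod n].
    n = len(first_row)
    return np.block([[first_row[(q - p) % n] for q in range(n)] for p in range(n)])

rng = np.random.default_rng(4)
n, l = 4, 3
A = block_circulant([rng.standard_normal((l, l)) for _ in range(n)])
P = rng.uniform(-1e-3, 1e-3, size=A.shape)   # entries smaller than 1/1000 in modulus

T = np.kron(fourier_matrix(n), np.eye(l))
mask = np.kron(np.eye(n), np.ones((l, l))) == 0   # True on the off-diagonal blocks

off_exact = np.abs((T @ A @ T.conj().T)[mask]).max()           # perfectly circulant case
off_perturbed = np.abs((T @ (A + P) @ T.conj().T)[mask]).max()  # perturbed case
bound = 1e-3 * n * l   # Frobenius-norm bound on P, preserved by the unitary transformation
print(off_exact, off_perturbed, bound)
```

In the perfectly circulant case the off-diagonal blocks vanish up to rounding, while in the perturbed case they stay below the Frobenius norm of the perturbation, so the modal subsystems are only weakly coupled.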

Fig. 6. Finite element model of a vibrating plate. The plate is clamped at the corners; the arrows indicate the location of the input forces. The measured output is the displacement of the application points of the forces.

4.3. Vibrating plate

A last test has been executed with the help of a finite element simulation. A test bed for a vibration control experiment was designed using finite element software (ABAQUS); the experiment consists of a metallic plate clamped at the corners, with four co-located (Preumont, 2004) actuator and sensor pairs. The overall set-up satisfies the circulant symmetry and is shown in Fig. 6.

An experiment was executed with frequency sweeps as inputs. The inputs were made of 512 samples each, and a time step of 0.02 s was used. The method described in this paper has been used to identify a 40th order model for the plate, in a noiseless case as well as with white measurement noise (with signal to noise ratio of about 30 dB). The Bode plot of the transfer function from one actuator to its co-located sensor is shown in Fig. 7. In the picture we can see that the peaks of the function match the eigenfrequencies of the system as they can be computed by the software, and that the presence of noise does not change the result significantly. It is interesting to compare this to the results that are obtained with standard MOESP in both cases, which are shown in Fig. 8. We can see that the noiseless case yields almost the same result, while the noise causes a much bigger distortion of the transfer function when regular MOESP is used.

Fig. 7. Bode plot of the transfer from one actuator to its co-located sensor, for the model identified with circulant MOESP. The dash-dotted lines indicate the frequencies of the first five eigenmodes as they were computed by the finite element software.

Fig. 8. Bode plot of the transfer from one actuator to its co-located sensor, for the model identified with regular MOESP.

5. Conclusions

This paper has shown a new method for identifying a certain class of large scale systems possessing the property of circulant symmetry. This new method is based on a special property of circulant systems that allows them to be decomposed into a number of "modal" subsystems of smaller order, allowing the independent identification of each one of them. The method can be used as a complement to any identification algorithm, but subspace methods are more appropriate, so in this paper the MOESP algorithm has been used and tested.

A complete algorithm that makes use of MOESP to identify circulant systems was developed. This algorithm allows maintaining the circulant structure in the final result, while subspace methods in general deliver models known only up to an unpredictable similarity transformation. Moreover, the method allows using the a priori information on the symmetries of the system to get better results, with a smaller computational effort. To our knowledge, this method is the only way of introducing symmetry constraints to a model obtained with subspace identification.

The algorithm has been applied to an academic example of a circulant system as well as to a finite element model of a structural part. The tests have verified the better ability of the algorithm in identifying circulant systems and its robustness to small perturbations with respect to the circulant structure.

Acknowledgments

The authors wish to thank Gianni Campoli (Delft University of Technology, Department of Aerospace Materials) for providing the simulation of the vibrating plate. This research is supported by the MicroNed programme, an initiative of the Dutch Government.

References

Bamieh, B., Paganini, F., & Dahleh, M. A. (2002). Distributed control of spatially invariant systems. IEEE Transactions on Automatic Control, 47.

Brewer, J. W. (1978). Kronecker products and matrix calculus in system theory. IEEE Transactions on Circuits and Systems, 25(9).

Brockett, R. W., & Willems, J. L. (1974). Discretized partial differential equations: Examples of control systems defined on modules. Automatica, 10(4), 507–515.

D'Andrea, R., & Dullerud, G. E. (2003). Distributed control design for spatially interconnected systems. IEEE Transactions on Automatic Control, 48(9).

Davis, P. J. (1979). Circulant matrices. Wiley-Interscience.

Denis, N. (1998). Solution of optimization problems with spatial symmetry and applications to adaptive optics. Ph.D. thesis. University of Massachusetts, August 1998.

Denis, N., & Looze, D. P. (1999). H∞ controller design for systems with circulant symmetry. In Proceedings of the 38th conference on decision & control.

Golub, G. H., & Van Loan, C. F. (1996). Matrix computations (3rd ed.). Johns Hopkins University Press.

Hamelinck, R. F. M. M., Rosielle, N., Steinbuch, M., & Doelman, N. (2005). Large adaptive deformable mirror: Design and first prototypes. In Proceedings of the 5th annual SPIE conference on optics and photonics.

Hovd, M., & Skogestad, S. (1994). Control of symmetrically interconnected plants. Automatica, 30(6), 957–973.

Laughlin, D. L., Morari, M., & Braatz, R. D. (1993). Robust performance of cross-directional basis-weight control in paper machines. Automatica, 29(6), 1395–1410.

Li, J., Zhao, S., Zhao, J., & Li, Y. (2004). Stability analysis for circulant systems and switched circulant systems. In Proceedings of the 43rd IEEE conference on decision and control.

Ljung, L. (1987). System identification: Theory for the user. Prentice-Hall.

Lunze, J. (1986). Dynamics of strongly coupled symmetric composite systems. International Journal of Control, 44(6), 1617–1640.

McKelvey, T. (2004). Subspace methods for frequency domain data. In Proceedings of the 2004 American control conference.

Mitchell, T. P. (1978). The dynamics of circulant and near-circulant systems. Acta Mechanica, 29(1).

Preumont, A. (2004). Vibration control of active structures, an introduction. Kluwer Academic Publishers.

Sandell, N., Varaiya, P., Athans, M., & Safonov, M. (1978). Survey of decentralized control methods for large scale systems. IEEE Transactions on Automatic Control, 23(2), 108–128.

Van Overschee, P., & De Moor, B. (1994). N4SID: Subspace algorithms for the identification of combined deterministic and stochastic systems. Automatica, 30(1).

Verhaegen, M. (1994). Identification of the deterministic part of MIMO state space models given in innovation form from input–output data. Automatica, 30(1), 61–74.

Verhaegen, M., & Verdult, V. (2007). Filtering and system identification: A least squares approach. Cambridge University Press.

Willems, J. C., Rapisarda, P., Markovsky, I., & De Moor, B. (2005). A note on persistency of excitation. Systems & Control Letters, 54(4), 325–329.

Paolo Massioni was born in Milan, Italy, in the year 1980. In 2005, he received his M.Sc. degree cum laude in Aerospace Engineering from Politecnico di Milano, Italy. Since August 2006 he has been a Ph.D. candidate at the Delft Center for Systems and Control, Delft University of Technology, the Netherlands. His main research interests are control and identification of distributed or large scale systems, subspace identification, satellite attitude control and formation flying.

Michel Verhaegen received an engineering degree in aeronautics from the Delft University of Technology, The Netherlands, in August 1982, and the doctoral degree in applied sciences from the Catholic University Leuven, Belgium, in November 1985. During his graduate study, he held a research assistantship sponsored by the Flemish Institute for scientific research (IWT).

From 1985 to 1994 he was a 2-year research fellow of the US National Research Council (NRC), affiliated with the NASA Ames Research Center in California, and a 5-year research fellow of the Dutch Academy of Arts and Sciences, affiliated with the Network Theory Group of the Delft University of Technology. In the period 1994–1999 he was an Associate Professor of the control laboratory of the Delft University of Technology, and in 1999 he was appointed full professor at the faculty of Applied Physics of the University of Twente in the Netherlands. In 2001 Prof. Verhaegen moved back to the University of Delft and is now a member of the Delft Center for Systems and Control.

Prof. Verhaegen has held short sabbatical leaves at the University of Uppsala, McGill, Lund and the German Aerospace Research Center (DLR) in Munich and is participating in several National and European Research Networks. In 2007 he was appointed program leader of the Dutch National Science Foundation "Perspective" program on Smart Optics. His main research interest is in the interdisciplinary domain of numerical linear algebra and system theory. In this field he has published over 100 papers. Current activities focus on the transfer of knowledge about new identification, fault tolerant control and data driven controller design methodologies to research laboratories and industry. Application areas include smart structures, adaptive optics, wind energy and vehicle mechatronics.