Fuzzy Pattern Recognition for Modeling Complex Systems in Real-World Contexts

PhD Program, XXVII Cycle

Information and Communication Engineering

Rome, 13 April 2015

Candidate: Luca Liparulo

Advisors: Prof. Gianni Orlandi, Prof. Massimo Panella


Outline

Research motivation

Fuzzy Membership Functions (MFs)

Convex-hull based Fuzzy Pattern Recognition (PR)

Unconstrained Fuzzy Pattern Recognition (PR)

Clustering for solving regression problems

Conclusions



Research framing

“Pattern Recognition is the scientific discipline whose goal is the classification of objects into a number of categories or classes.” [S. Theodoridis, K. Koutroumbas - Pattern Recognition, Elsevier Science, 2006]

⇓

Pattern recognition techniques aim to define a system able to solve, automatically and efficiently, a variety of problems that arise in most scientific activities, especially in engineering.

⇓

Research topic: alongside Aristotelian computational models, based on the age-old exclusive logic, research in the literature is addressing more flexible and sophisticated techniques based on fuzzy logic.

⇓

Fuzzy approaches allow clusters to overlap, since they do not assign each ‘object’ (pattern) to one and only one cluster.

Fuzzy algorithms yield a more accurate configuration, since they consider the membership value of each pattern with respect to every cluster.

Fuzzy Pattern Recognition → Metric → Membership Function



Standard metrics

$$
d_k(\mathbf{x}, \mathbf{y}) = \left( \sum_{j=1}^{n} |x_j - y_j|^k \right)^{1/k},
$$

where n is the dimension of the data space, x_j and y_j, j = 1 … n, are the components of the patterns along the j-th dimension, and k is the parameter that selects the particular type of Minkowski distance.

⇓

[Figure: (a) k = 1; (b) k = 2; (c) k → ∞]

The figure shows the most commonly used metrics: (a) Manhattan distance; (b) Euclidean distance; (c) Chebyshev distance, d_∞(x, y) = max_{j=1…n} |x_j − y_j|.
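The three special cases of the Minkowski distance can be checked numerically with a short sketch. The function name `minkowski_distance` and the sample points are illustrative choices, not part of the original slides.

```python
import numpy as np

def minkowski_distance(x, y, k):
    """Minkowski distance of order k between two patterns x and y."""
    diff = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    if np.isinf(k):
        # Limit k -> infinity: Chebyshev distance (largest coordinate gap).
        return float(diff.max())
    return float((diff ** k).sum() ** (1.0 / k))

# Illustrative 2-D patterns (assumed values).
x, y = [0.0, 0.0], [3.0, 4.0]
d_manhattan = minkowski_distance(x, y, 1)        # k = 1
d_euclidean = minkowski_distance(x, y, 2)        # k = 2
d_chebyshev = minkowski_distance(x, y, np.inf)   # k -> infinity
```

For these two points the distances are 7 (Manhattan), 5 (Euclidean), and 4 (Chebyshev), which illustrates how the distance shrinks as k grows.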

⇓

Open problem: study of a new approach for evaluating the cluster-pattern distance.



Research motivation

Define a new approach for evaluating the cluster-pattern distance.

Design and implement new clustering/classification algorithms that use membership functions without geometric constraints, in order to mitigate the problem of overlapping classes/clusters.

Show that fuzzy logic, combined with a more flexible cluster geometry, produces robust results through computationally efficient procedures when dealing with complex systems in real-world contexts:

- Classification of the Brunnstrom stage in Biomedical Engineering

- Prediction of time series for Energy Commodities



Activity as Visiting Student at RMIT University

Royal Melbourne Institute of Technology (RMIT)

Melbourne, Australia (VIC)

School of Electrical and Computer Engineering

Biomedical Engineering Laboratory

Research topics:

Self Home Rehabilitation

Post Stroke Patients Analysis

Motion Classification

Brunnstrom Stage Recognition



FUZZY MEMBERSHIP FUNCTION



Fuzzy Membership Function

First problem addressed:

Point-to-centroid distance (centroid-based)

Point-to-polygon distance (boundary-based)

⇓

Proposed methods

a. Method based on a Gaussian kernel

b. Method based on a cone-shaped kernel



Method based on a Gaussian kernel

Definition of the method

The MF, based on the point-to-polygon distance, is computed by superimposing a number of Gaussian kernels:

one Gaussian kernel placed on the centroid c;

N superimposed Gaussian kernels centered on the N vertices of the polygon.

$$
\mu^{(\mathrm{gauss})}(\mathbf{x}) = \exp\left( -\frac{\gamma^2}{2\delta} \sum_{j=1}^{L} (x_j - c_j)^2 \right) + \sum_{i=1}^{N} \exp\left( -\frac{\gamma^2}{2\delta} \sum_{j=1}^{L} (x_j - v_{ij})^2 \right),
$$

where δ = √L and L is the number of features (dimensions) of the dataset.



Method based on a cone-shaped kernel

Definition of the method

This method is based on cone-shaped functions.

Compared with the Gaussian method, this MF tends to zero much more quickly.

As in the Gaussian method, a cone-shaped MF is built by superimposing a number of linear functions:

one linear kernel placed on the centroid c;

N superimposed linear kernels centered on the N vertices of the polygon.

$$
\mu^{(\mathrm{cone})}(\mathbf{x}) = \max\left[ 0,\; 1 - \frac{\gamma}{\delta}\, d_2(\mathbf{x}, \mathbf{c}) \right] + \sum_{i=1}^{N} \max\left[ 0,\; 1 - \frac{\gamma}{\delta}\, d_2(\mathbf{x}, \mathbf{v}_i) \right],
$$

where δ = √L and L is the number of features (dimensions) of the dataset.
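The two membership functions can be sketched directly from their definitions: one kernel on the centroid plus one kernel on each polygon vertex. The function names, the unit-square polygon, and the value γ = 4 are assumptions made only for illustration.

```python
import numpy as np

def gaussian_mf(x, centroid, vertices, gamma):
    """Sum of Gaussian kernels placed on the centroid and on the N vertices."""
    x = np.asarray(x, dtype=float)
    kernels = np.vstack([centroid, vertices])      # (N + 1) kernel centers
    delta = np.sqrt(x.size)                        # delta = sqrt(L)
    sq_dist = ((kernels - x) ** 2).sum(axis=1)
    return float(np.exp(-gamma ** 2 / (2.0 * delta) * sq_dist).sum())

def cone_mf(x, centroid, vertices, gamma):
    """Sum of cone-shaped (linear, clipped at zero) kernels on the same centers."""
    x = np.asarray(x, dtype=float)
    kernels = np.vstack([centroid, vertices])
    delta = np.sqrt(x.size)
    dist = np.sqrt(((kernels - x) ** 2).sum(axis=1))   # Euclidean distance d_2
    return float(np.maximum(0.0, 1.0 - gamma / delta * dist).sum())

# Hypothetical cluster: unit square with its centroid, gamma = 4.
square = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
center = [0.5, 0.5]
g_center = gaussian_mf(center, center, square, gamma=4.0)
c_center = cone_mf(center, center, square, gamma=4.0)
g_far = gaussian_mf([5.0, 5.0], center, square, gamma=4.0)
c_far = cone_mf([5.0, 5.0], center, square, gamma=4.0)
```

At the centroid of this square the cone-shaped kernels on the vertices are already clipped to zero, so only the centroid kernel contributes, while the Gaussian kernels on the vertices still add a small positive amount. This reflects the remark that the cone-shaped MF decays to zero more quickly.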



Graphical representation

[Figure panels: Gaussian method; cone-based method]

MFs for different values of γ: (a) γ = 15; (b) γ = 10; (c) γ = 4; (d) γ = 3; (e) γ = 2; (f) γ = 1.



A reference benchmark: the IRIS dataset

The IRIS dataset is one of the most popular databases in the pattern recognition literature. It consists of 3 classes of 50 elements each, and each class refers to a type of iris (Iris Setosa, Iris Versicolor, Iris Virginica).

The parameters for applying the algorithm are N = 150 patterns and M = 4 features.

Confusion matrix: Simpson’s MF (rows: true class; columns: estimated class)

| True \ Estimated | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 100% | − | − |
| 2 | − | 94% | 6% |
| 3 | − | 30% | 70% |

Confusion matrix: Gaussian MF (rows: true class; columns: estimated class)

| True \ Estimated | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 100% | − | − |
| 2 | − | 100% | − |
| 3 | − | 36% | 64% |

Confusion matrix: Cone MF (rows: true class; columns: estimated class)

| True \ Estimated | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 100% | − | − |
| 2 | − | 100% | − |
| 3 | − | 28% | 72% |



FUZZY PATTERN RECOGNITION CONVEX-HULL BASED



Fuzzy clustering using the convex hull as geometrical model

The convex hull is used for the online computation of the points on which the MFs are placed.

The convex hull is used not only to display the composition of the clusters graphically.

[Figure: plot in the normalized plane; axes: Feature 1 and Feature 2, both from 0 to 1]



General setup of the sequential algorithm

Let M be the number of patterns of a dataset D = {x_1, x_2, …, x_M} and N the number of features → each pattern of the dataset is represented by the N-tuple of real numbers:

$$
\mathbf{x}_m = [\, x_{m1} \;\; x_{m2} \;\; \ldots \;\; x_{mN} \,], \quad m = 1 \ldots M .
$$

Let K be the number of clusters.

Initialization. The first pattern x_1 is identified as the first cluster, so K is set to 1.

Iterations. For each pattern x_m, m = 2 … M, of the dataset:

→ Ω(x_m) is the array of the MF values of the m-th pattern with respect to the K clusters determined so far:

$$
\Omega(\mathbf{x}_m) = [\, \mu_1(\mathbf{x}_m) \;\; \mu_2(\mathbf{x}_m) \;\; \ldots \;\; \mu_K(\mathbf{x}_m) \,],
$$

→ the maximum of Ω(x_m) is attained at the h-th cluster, so that:

$$
\Omega^{(m)}_{\max} = \max\big(\Omega(\mathbf{x}_m)\big) = \mu_h(\mathbf{x}_m), \qquad h = \arg\max_{r=1 \ldots K} \mu_r(\mathbf{x}_m) .
$$




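Depending on the dataset under analysis, two parameters θ_min and θ_max are set in the normalized space [0, 1].

⇓

σ: vector of the standard deviations:

$$
\boldsymbol{\sigma} = [\, \sigma_1 \;\; \sigma_2 \;\; \ldots \;\; \sigma_N \,] .
$$

The values used in the algorithm are:

$$
\theta_{\min} = \min_{j=1 \ldots N} \sigma_j , \qquad \theta_{\max} = 2 \cdot \Big[ \max_{j=1 \ldots N} \sigma_j \Big] .
$$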

Flowchart of the sequential algorithm:

1. First cluster (x_1): K = 1, m = 2.

2. Presentation of pattern x_m.

3. MF evaluation for each cluster.

4. If Ω^(m)_max ≤ θ_min (yes): a new cluster is created coinciding with x_m (NO convex hull); K ← K + 1.

5. Otherwise (no), if Ω^(m)_max ≥ θ_max (yes): x_m is within the h-th cluster scoring the maximum MF (NO convex hull).

6. Otherwise (no): x_m belongs to the h-th cluster scoring the maximum MF; the h-th cluster is updated, and the convex hull is used to define the external vertices of the h-th cluster.

7. m ← m + 1, then back to step 2.
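The threshold computation and the three-way decision of the flowchart can be sketched as follows. This is only an outline: the function names and the returned labels are hypothetical, the population standard deviation is an assumed choice, and the convex-hull update itself is left out.

```python
import numpy as np

def thresholds(data):
    """theta_min and theta_max from the per-feature standard deviations.

    Assumes `data` is already normalized to [0, 1]; np.std with its default
    (population) normalization is an assumption of this sketch.
    """
    sigma = np.std(np.asarray(data, dtype=float), axis=0)
    return float(sigma.min()), float(2.0 * sigma.max())

def decide(omega, theta_min, theta_max):
    """Return (action, h) for the MF values `omega` of one pattern."""
    h = int(np.argmax(omega))
    omega_max = omega[h]
    if omega_max <= theta_min:
        return "new_cluster", None     # pattern becomes a new cluster, K <- K + 1
    if omega_max >= theta_max:
        return "inside", h             # inside cluster h, hull left unchanged
    return "update_hull", h            # assigned to cluster h, hull recomputed

# Illustrative normalized data (assumed values).
theta_min, theta_max = thresholds([[0.0, 0.0], [1.0, 0.5]])
case_low = decide([0.1, 0.2], theta_min, theta_max)
case_mid = decide([0.1, 0.6], theta_min, theta_max)
case_high = decide([1.2, 0.3], theta_min, theta_max)
```

With these two patterns the standard deviations are 0.5 and 0.25, so θ_min = 0.25 and θ_max = 1.0, and the three calls exercise the three branches of the flowchart.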




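Results

(a) Mean error rate (%) on the assigned patterns over 100 runs

| Algorithm | Hepta (3-D) | Iris (4-D) | UKM (5-D) | Seed (7-D) |
|---|---|---|---|---|
| CH-CBK | 0.00 | 16.74 | 51.11 | 22.07 |
| CH-GBK | 0.00 | 16.53 | 52.16 | 21.84 |
| K-means | 23.69 | 18.23 | 52.62 | 10.95 |
| FCM | 0.20 | 10.67 | 57.45 | 10.00 |
| Min-Max | 3.33 | 24.26 | 67.11 | 32.05 |
| Clusterdata | 0.00 | 34.00 | 75.19 | 66.67 |

(b) Best error rate (%) on the assigned patterns over 100 runs

| Algorithm | Hepta (3-D) | Iris (4-D) | UKM (5-D) | Seed (7-D) |
|---|---|---|---|---|
| CH-CBK | 0.00 | 6.67 | 35.48 | 8.10 |
| CH-GBK | 0.00 | 4.67 | 40.20 | 9.05 |
| K-means | 0.00 | 11.33 | 44.17 | 10.95 |
| FCM | 0.00 | 10.67 | 49.63 | 10.00 |
| Min-Max | 0.00 | 6.00 | 45.16 | 12.38 |
| Clusterdata | 0.00 | 34.00 | 75.19 | 66.67 |

(c) Number of runs (out of 100) in which the best performance is obtained

| Algorithm | Hepta (3-D) | Iris (4-D) | UKM (5-D) | Seed (7-D) |
|---|---|---|---|---|
| CH-CBK | 100 | 86 | 85 | 98 |
| CH-GBK | 100 | 86 | 85 | 98 |
| Min-Max | 81 | 14 | 15 | 3 |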



Automatic classification of movements using an inertial sensor (IMU)

Protocol used in the experiments

Extraction of 63 features

9 data sequences:

Displacement (x, y, z) (displacement derived from accelerometer data)

Raw Acceleration (x, y, z) (filtered accelerometer data)

Orientation (x, y, z) (gyroscope integration)

Figures of the experiments

14 patients (10 M, 4 F)

Age: 32-78 years

Number of patterns: 531

Brunnstrom stage: I-V

Block diagram of the processing chain (block labels):

Acquisition: Inertial Measurement Unit; Receiver; Raw data; Data sampling

Preprocessing: Median filter ↓ Axis inversion ↓ Displacement (from Accelerometer) ↓ Angle (from Gyroscope)

Feature processing: Features extraction (63 features); PCA evaluation; Data normalization

Training: Training Set; Convex-Hull for each training class; Fuzzy Membership Function evaluation for each motion; γ parameter setting; Trained fuzzy classifier

Testing: Testing Set; Classification result



Xsens IMU sensor



Software used




RESULTS

Dataset 14

14 patients

Brunnstrom stage from I to V

531 patterns

[Figure: error rate (%) versus γ values, panels (a)-(f); legend of panel (a): Error Rate, minError: 23.16%]

Error rate.

| Algorithm used | 14 Patients |
|---|---|
| Fuzzy Kernel Classifier (FKC) | 0.56 |
| Neuro-fuzzy classifier | 1.70 |
| Classification Tree (CART) | 4.90 |
| Support Vector Machine (SVM) | 1.11 |
| Linear Discriminant Analysis (LDA) | 8.35 |
| Quadratic Discriminant Analysis (QDA) | 1.32 |
| NaiveBayes | 6.03 |
| Probabilistic Neural Network (PNN) | 67.23 |

All values are expressed in percent (%).



FUZZY PATTERN RECOGNITION BASED ON FUZZY MEMBERSHIP FUNCTIONS WITHOUT GEOMETRIC CONSTRAINTS



Fuzzy clustering based on membership functions without geometric constraints

Main weakness of the convex-hull based approach

Computational complexity grows with the number of dimensions.

Proposed changes

Removal of the convex-hull computation

Optimization of the algorithm by reducing the number of parameters

“Grid search” analysis for choosing the algorithm parameters, by means of validity indices and error-rate analysis.




UFOC - Unconstrained Fuzzy Online Clustering Algorithm

θ: the only parameter to be set, depending on the dataset under analysis.

Flowchart of the algorithm:

1. First cluster (x_1): K = 1, m = 2.

2. Presentation of pattern x_m.

3. MF evaluation for each cluster.

4. If Ω^(m)_max ≤ θ (yes): a new cluster is created coinciding with x_m; K ← K + 1.

5. Otherwise (no): x_m belongs to the h-th cluster scoring the maximum MF; the h-th cluster is updated.

6. m ← m + 1, then back to step 2.
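The online loop with a single threshold θ can be sketched in a few lines. The slides do not specify how the unconstrained MF of a cluster is computed, so this sketch makes a hypothetical choice: the MF of a cluster is the largest cone-shaped kernel value among the patterns already stored in it. The names `cone_kernel` and `ufoc` and the sample data are illustrative.

```python
import numpy as np

def cone_kernel(x, p, gamma):
    """Cone-shaped kernel centered on p, with delta = sqrt(L)."""
    delta = np.sqrt(x.size)
    return max(0.0, 1.0 - gamma / delta * float(np.linalg.norm(x - p)))

def ufoc(data, gamma, theta):
    """Online clustering driven by the single threshold theta (sketch)."""
    data = np.asarray(data, dtype=float)
    clusters = [[data[0]]]          # the first pattern is the first cluster (K = 1)
    labels = [0]
    for x in data[1:]:
        # Assumed MF of each cluster: best cone kernel over its stored patterns.
        omega = [max(cone_kernel(x, p, gamma) for p in c) for c in clusters]
        h = int(np.argmax(omega))
        if omega[h] <= theta:
            clusters.append([x])    # new cluster coinciding with x, K <- K + 1
            labels.append(len(clusters) - 1)
        else:
            clusters[h].append(x)   # cluster h is updated with the new pattern
            labels.append(h)
    return labels

# Two well-separated groups presented in mixed order (assumed toy data).
toy = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0], [0.05, 0.05]]
toy_labels = ufoc(toy, gamma=4.0, theta=0.3)
```

On this toy set the third pattern has zero membership in the first cluster, so it opens a second cluster, and the remaining patterns join whichever cluster scores the larger MF. Like any online procedure of this kind, the result depends on the presentation order.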




“Grid search” analysis:

$$
\Gamma = [\gamma_{\min}, \gamma_{\max}] = \{\, \gamma \mid \gamma_{\min} \le \gamma \le \gamma_{\max} \,\},
$$

$$
\Theta = [\theta_{\min}, \theta_{\max}] = \{\, \theta \mid \theta_{\min} \le \theta \le \theta_{\max} \,\}.
$$

The range Γ consists of 71 values of γ, from γ_min = 1 to γ_max = 15 with a step of 0.2;

the range Θ consists of 8 values of θ, from θ_min = 0.1 to θ_max = 0.8 with a step of 0.1.

→ The analysis requires 8 × 71 runs of the algorithm.
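The size of the search grid follows directly from the two ranges. A minimal sketch, with illustrative variable names, enumerates the (γ, θ) pairs:

```python
import numpy as np

# 71 values of gamma from 1 to 15 (step 0.2) and 8 values of theta from 0.1 to 0.8.
gammas = np.linspace(1.0, 15.0, 71)
thetas = np.linspace(0.1, 0.8, 8)

# Every (gamma, theta) pair corresponds to one run of the clustering algorithm.
grid = [(g, t) for t in thetas for g in gammas]
n_runs = len(grid)
```

This gives 8 × 71 = 568 runs. `np.linspace` is used instead of `np.arange` so that both end points are included without floating-point drift in the step.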

Types of analysis:

Validity indices:

- Dunn (D) ↑

- Davies-Bouldin (DB) ↓

- Double Weighted Davies-Bouldin (DW-DB) ↓

Error rate: ε = E / M




Results: error rate.

[Figures, one per dataset: Hepta (3-D), WDBC (30-D), Iris (4-D), Wine (13-D), NewThyroid (5-D). Each figure has two panels plotted against the presentation order (1 to 100): (a) error difference (%), from -100 to 100; (b) error rate (%), from 0 to 100.]

Summary table:



Automatic classification of the Brunnstrom stage using sEMG

Data recording

Extraction of 10 features

Figures of the experiments

9 patients (3 M, 6 F)

Age: 67.2 ± 29.2 years

Number of patterns: 96

Brunnstrom stage: II-IV

Block diagram of the processing chain (block labels):

Acquisition: sEMG System; Receiver; Raw data; Data sampling

Feature processing: Preprocessing; Features extraction (10 features); Data normalization

Training: Training Set; Fuzzy Membership Function evaluation for each motion; Validation Set; γ∗ parameter setting; Trained fuzzy classifier

Testing: Testing Set; Classification result




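10-fold and leave-one-out (LOO) performance

| Algorithm used | Error rate (10-fold) | Error rate (LOO) |
|---|---|---|
| Fuzzy Kernel Classifier | 7.53 | 8.60 |
| FIS Classifier (Sugeno) | 9.68 | 11.83 |
| FIS Classifier (Mamdani) | 9.68 | 11.83 |
| Neuro-Fuzzy Classifier | 11.83 | 17.20 |
| LDA | 30.11 | 31.18 |
| QDA | 17.20 | 19.35 |
| NaiveBayes | 24.73 | 23.66 |
| SVM | 24.73 | 22.58 |
| CART | 16.13 | 15.05 |
| PNN | 45.16 | 46.24 |

Accuracy

Confusion matrix (rows: actual value; columns: estimated outcome)

| Actual \ Estimated | Stage II∗ | Stage III∗ | Stage IV∗ | Healthy∗ | Accuracy |
|---|---|---|---|---|---|
| Stage II | 17 | 0 | 1 | 0 | 94.44 % |
| Stage III | 0 | 30 | 2 | 0 | 93.75 % |
| Stage IV | 3 | 0 | 11 | 0 | 78.57 % |
| Healthy | 0 | 0 | 1 | 28 | 96.55 % |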
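The per-class accuracies in the last column can be recomputed from the counts, which also gives the overall error rate implied by this matrix. The sketch below uses illustrative variable names; the counts are the ones reported above.

```python
import numpy as np

# Rows: actual class; columns: estimated class (Stage II, III, IV, Healthy).
confusion = np.array([
    [17, 0, 1, 0],
    [0, 30, 2, 0],
    [3, 0, 11, 0],
    [0, 0, 1, 28],
])

# Per-class accuracy: correctly classified patterns over patterns of that class.
per_class_accuracy = 100.0 * np.diag(confusion) / confusion.sum(axis=1)

# Overall error rate: misclassified patterns over all patterns in the matrix.
overall_error = 100.0 * (1.0 - np.trace(confusion) / confusion.sum())
```

The matrix contains 93 patterns, 7 of which are misclassified, so the overall error is about 7.53 %. This coincides with the 10-fold error rate listed for the Fuzzy Kernel Classifier, which suggests, without the slides stating it, that the matrix refers to that experiment.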



CLUSTERING TECHNIQUES AND NEURO-FUZZY NETWORKS FOR THE SOLUTION OF REGRESSION PROBLEMS



Clustering procedures for the solution of regression problems

Starting point:

FIS networks are commonly used to solve regression problems.

Clustering techniques make it possible to determine the decision rules directly from the clusters of the available training set, so each rule corresponds to a structured set of points.

⇓

The clustering strategy is applied to the joint space Z = X × Y, with z_i = (x_i, y_i).

The analysis yields C clusters Γ_z^(k), with k = 1, …, C.

A very simple and straightforward procedure.

PROBLEM:

⇓

Patterns belonging to the same cluster might not reflect the actual structure of the data.

⇓

SOLUTION:

Design and implementation of an iterative procedure that uses the clustering results to estimate the actual structure of the output data.



Clustering in the input-output space

Let Γ = {Γ_1, Γ_2, …, Γ_C} be a set of C clusters (each associated with an output rule), and let every pattern of the training set be assigned to one of these clusters. The clustering procedure with C prototypes is based on the following steps:

Step 1. The coefficients ω^(k), k = 1 … C, of each rule are computed by solving a set of nonlinear equations; the generic equation is:

$$
y_t = h\big(\mathbf{x}_t; \boldsymbol{\omega}^{(k)}\big)
$$

Step 2. Update of the pattern assignment. Each pair (x_t, y_t), t = 1 … N_T, of the training set is assigned to the cluster Γ_q, with q such that:

$$
d_t = \big| y_t - h\big(\mathbf{x}_t; \boldsymbol{\omega}^{(q)}\big) \big| = \min_{k=1 \ldots C} \big| y_t - h\big(\mathbf{x}_t; \boldsymbol{\omega}^{(k)}\big) \big| .
$$

Step 3. For each cluster Γ_k, the local approximation error is computed:

$$
D_k = \frac{1}{N_k} \sum_{t} d_t .
$$

Step 4. Convergence:

$$
\Theta = \frac{\big| D - D^{(\mathrm{old})} \big|}{D^{(\mathrm{old})}}, \qquad D = \frac{1}{N_T} \sum_{t=1}^{N_T} d_t .
$$
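The four steps can be sketched for the simplest possible rule. This sketch assumes that h is affine in x and fitted by least squares, which is a simplification of the generic (possibly nonlinear) rule named in Step 1. The function names, the tolerance, the handling of empty clusters, and the toy data are also assumptions.

```python
import numpy as np

def fit_rule(X, y):
    """Step 1 (assumed affine rule): least-squares coefficients [slope..., bias]."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(X, w):
    return np.column_stack([X, np.ones(len(X))]) @ w

def io_clustering(X, y, labels, C, tol=1e-6, max_iter=100):
    labels = np.asarray(labels).copy()
    D_old = None
    for _ in range(max_iter):
        # Step 1: one rule per cluster (zero rule if a cluster is empty).
        W = [fit_rule(X[labels == k], y[labels == k]) if (labels == k).any()
             else np.zeros(X.shape[1] + 1) for k in range(C)]
        # Step 2: reassign each pair to the rule with the smallest output error.
        errors = np.abs(y[:, None] - np.column_stack([predict(X, w) for w in W]))
        labels = errors.argmin(axis=1)
        d = errors.min(axis=1)
        # Step 3: local approximation error of each cluster.
        D_k = [float(d[labels == k].mean()) if (labels == k).any() else 0.0
               for k in range(C)]
        # Step 4: stop when the relative change of the global error is small.
        D = float(d.mean())
        if D_old is not None and (D_old == 0 or abs(D - D_old) / D_old < tol):
            break
        D_old = D
    return labels, W, D_k

# Toy problem: two straight lines, with one point of each line mislabeled at start.
x = np.linspace(0.0, 1.0, 20)
X = np.concatenate([x, x])[:, None]
y = np.concatenate([2.0 * x, -2.0 * x + 5.0])
init = np.array([0] * 20 + [1] * 20)
init[0], init[20] = 1, 0
labels, W, D_k = io_clustering(X, y, init, C=2)
```

On this noise-free toy set the two mislabeled points are reassigned after the first pass, and the fitted rules recover the two generating lines. The initial assignment matters: a poor initialization can lead the procedure to a different fixed point.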



Example of the final result

(Toy problem) Final result after 1 step

(Toy problem) Final result after 13 steps



Proposed methodology for time series prediction

Return series → y_t = ln(S_t) − ln(S_{t−1})

Standard additive model for time series → y_t = µ_t + ε_t
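The return series is the first difference of the log-prices. A minimal sketch follows; the function name and the price values are illustrative.

```python
import numpy as np

def log_returns(prices):
    """y_t = ln(S_t) - ln(S_{t-1}) for a sequence of prices S."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices))

# Illustrative prices (assumed values): a 10% rise followed by a 10% fall.
prices = [100.0, 110.0, 99.0]
returns = log_returns(prices)
```

A series of n prices yields n − 1 returns, and log-returns add up over time: the sum of the two returns here equals ln(99/100).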

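ARMA-GARCH regression model

$$
\mu_t = h_\mu\big(\mathbf{x}^{(\mu)}_t; \boldsymbol{\omega}^{(\mu)}_t\big),
$$

$$
\mathbf{x}^{(\mu)}_t = [\, y_{t-1} \;\; y_{t-2} \;\; \ldots \;\; y_{t-R} \;\; \varepsilon_{t-1} \;\; \varepsilon_{t-2} \;\; \ldots \;\; \varepsilon_{t-M} \,],
$$

$$
\sigma^2_t = h_\sigma\big(\mathbf{x}^{(\sigma)}_t; \boldsymbol{\omega}^{(\sigma)}_t\big),
$$

$$
\mathbf{x}^{(\sigma)}_t = [\, \sigma^2_{t-1} \;\; \sigma^2_{t-2} \;\; \ldots \;\; \sigma^2_{t-P} \;\; \varepsilon^2_{t-1} \;\; \varepsilon^2_{t-2} \;\; \ldots \;\; \varepsilon^2_{t-Q} \,],
$$

The orders R, M, P, and Q are analogous to those of the ARMA and GARCH models;

ω_t^(µ) and ω_t^(σ) are the parameter vectors of the regression functions h_µ and h_σ.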
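The two regressor vectors are just windows of past values, most recent first. The sketch below builds them from arrays of past returns, innovations, and conditional variances; the function names and the sample series are hypothetical, and the regression functions h_µ and h_σ themselves are not modeled.

```python
import numpy as np

def mean_regressor(y, eps, t, R, M):
    """x_t^(mu) = [y_{t-1} ... y_{t-R}, eps_{t-1} ... eps_{t-M}]."""
    return np.concatenate([y[t - R:t][::-1], eps[t - M:t][::-1]])

def variance_regressor(sigma2, eps, t, P, Q):
    """x_t^(sigma) = [sigma2_{t-1} ... sigma2_{t-P}, eps_{t-1}^2 ... eps_{t-Q}^2]."""
    return np.concatenate([sigma2[t - P:t][::-1], (eps[t - Q:t] ** 2)[::-1]])

# Illustrative series indexed by time (assumed values); t must be >= every order.
y = np.arange(10.0)
eps = 0.1 * np.arange(10.0)
sigma2 = np.full(10, 0.5)

x_mu = mean_regressor(y, eps, t=5, R=2, M=3)
x_sigma = variance_regressor(sigma2, eps, t=5, P=1, Q=2)
```

Slicing `[t - R:t]` and reversing it puts the lag-1 value first, which matches the ordering of the vectors in the definitions.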



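Accuracy evaluation:

Mean Squared Error - MSE

$$
\mathrm{MSE} = \frac{1}{N_s} \sum_{t} (y_t - \mu_t)^2
$$

Normalised Mean Square Error - NMSE

$$
\mathrm{NMSE} = \frac{\sum_{t} (y_t - \mu_t)^2}{\sum_{t} (y_t - \bar{y})^2}
$$

Noise-to-signal ratio or NSR (dB)

$$
\mathrm{NSR}_{\mathrm{dB}} = 10 \log_{10} \frac{\sum_{t} (y_t - \mu_t)^2}{\sum_{t} y_t^2}
$$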
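The three accuracy measures translate directly into code. In this sketch N_s is taken to be the number of predicted samples, which is an assumption; the function names and the sample arrays are illustrative.

```python
import numpy as np

def mse(y, mu):
    """Mean squared error over the predicted samples."""
    return float(np.mean((y - mu) ** 2))

def nmse(y, mu):
    """Squared error normalized by the variation of y around its mean."""
    return float(np.sum((y - mu) ** 2) / np.sum((y - y.mean()) ** 2))

def nsr_db(y, mu):
    """Noise-to-signal ratio in dB: error energy over signal energy."""
    return float(10.0 * np.log10(np.sum((y - mu) ** 2) / np.sum(y ** 2)))

# Illustrative zero-mean series and a forecast that recovers half the amplitude.
y_true = np.array([1.0, -1.0, 1.0, -1.0])
y_pred = 0.5 * y_true
err_mse = mse(y_true, y_pred)
err_nmse = nmse(y_true, y_pred)
err_nsr = nsr_db(y_true, y_pred)
```

For a zero-mean series the two denominators coincide, so the NSR in dB equals 10·log10(NMSE). Return series are close to zero-mean, which is why, for instance, an NMSE of 1.08 corresponds to about 0.33 dB, as in the GARCH row of Table I.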



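Results: Energy Commodities

DJUSCL (coal)

TABLE I. NUMERICAL RESULTS FOR DJUSCL RETURN SERIES

| Model | MSE | NMSE | NSR | Mean | Variance | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|
| DJUSCL | | | | 0.31 | 0.74 | 0.41 | 5.60 |
| GARCH | 0.80 | 1.08 | 0.33 | 0.34 | 0.33 | 0.07 | 9.36 |
| RBF | 0.76 | 1.03 | 0.13 | 0.39 | 0.88 | 0.18 | 5.28 |
| MoG | 0.78 | 1.04 | 0.18 | 0.77 | 1.01 | 0.32 | 5.20 |
| ANFIS | 0.77 | 1.04 | 0.17 | 0.27 | 0.91 | 0.41 | 5.34 |
| HONFIS | 0.76 | 1.04 | 0.17 | 0.29 | 0.81 | 0.35 | 5.52 |

MSE, NMSE, and NSR are errors; mean, variance, skewness, and kurtosis are unconditional moments. MSE, mean, and variance are scaled by 10^-3.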

European data set not showing significant differences in the forecasting ability. The studied commodities and the related indexes are: coal (DJUSCL, in $/ton); Henry Hub natural gas (HH, in $/MMBtu); crude oil (WTI, in $/barrel); and electricity (PJM, in $/MWh). For electricity prices we chose the peak load contract referred to h. 12:00. For each commodity price, log-returns are estimated using (1). We chose a well representative time window across the ‘very critical’ year 2008, i.e. from the beginning of 2006 to the end of 2009. So, taking into account that we have approximately 250 prices and 249 returns per year in the case of coal, natural gas and crude oil series, each return series consists of about 1000 samples. In the case of electricity prices we have a series of 365 data, given that electricity is traded every day of the year; in this application, for comparison purposes, we adjust the series to 250 trading days only. Each model is trained on the previous N_T = 500 samples (almost two years) and N_S = 500 samples are predicted, i.e. the last two years starting from t = 501 to the last sample of 2009.

The prediction errors of the conditional mean are evaluated; in addition, the four unconditional moments are estimated for both the predicted sequences and the related original series. Prior investigations can be made in order to find the best combination of the model orders and the size of the training set as well; a fine tuning for the optimal estimate of every model can be addressed in future research works. In order to obtain an accurate comparison of the performances obtained by the proposed neural networks with respect to standard models, we carried out a preliminary optimization of the main model parameters, i.e. R, M, P, and Q, so as to obtain the best performance of the reference ‘GARCH’ model for a given time series. Then, every model will use the same parameters when applied to the same time series.

The optimal parameters of the coal DJUSCL returns are R = 1, M = 1, P = 5, Q = 1; hence, a GARCH reference model ARMA(1, 5)-GARCH(1, 1) is fitted. The numerical results are summarized in Table I: all the neural models score a prediction error better than GARCH; RBF obtains the best NSR but the skewness is not properly matched as in the case of ANFIS and HONFIS that have a comparable performance. HONFIS achieves a good NSR performance of 0.17 dB and it is able to follow the dynamics of both conditional mean and increasing volatility, as proved by the behavior of the estimated conditional variance shown in Fig. 1.

Fig. 1. Prediction of coal returns using the HONFIS neural network. [Panels over samples 500 to 1000: conditional mean, forecast (black) and actual time series (gray); forecast error (innovation); conditional variance (volatility).]

TABLE IINUMERICAL RESULTS FOR HH RETURN SERIES



HH 0.46 1.72 1.46 10.62

GARCH 1.97 1.14 0.57 1.65 0.12 2.49 67.15

RBF 1.87 1.09 0.37 0.44 1.92 0.27 4.54

MoG 1.80 1.05 0.21 0.34 1.90 1.13 9.94

ANFIS 2.05 1.19 0.75 0.47 1.87 0.71 7.35

HONFIS 1.67 1.03 0.13 0.45 1.77 0.99 8.46


The numerical results for HH returns of natural gas arereported in Table II. The optimal parameters are in thiscase R = 2, M = 2, P = 2, Q = 1, so a GARCHreference model ARMA(2, 2)-GARCH(2, 1) is fitted. TheHONFIS neural network has the best NSR performance of0.13 dB and the related moments adequately fit with thoseof the original time series. A sufficient accuracy is alsoobtained by RBF and ANFIS neural networks. The GARCHis not suitable for the prediction of HH returns, since themoments are estimated very poorly, especially the kurtosis.The numerical results of the HONFIS neural network arequalitatively confirmed by the accurate predictions reportedin the plots of Fig. 2, especially in the case of volatility.

The large volatility of crude oil WTI returns at the endof 2008 is the feature that requires an accurate forecastingtechnique. A more complex model is therefore necessary,using R = 4, M = 2, P = 2, Q = 3. The GARCH referencemodel ARMA(4, 2)-GARCH(2, 3) is evidently outperformedby the neural networks, as evidenced by the results sum-marized in Table III. The best NSR is once again obtainedby HONFIS, although the predicted sequence does not fit theskewness of the original one; the ANFIS performance suffersfrom the same drawback. The MoG network is able to fitthe original moments, also maintaining a good predictionaccuracy and following the changes of volatility in theunderlying process.

Finally, the numerical results for the returns of PJM elec-

MSE, mean, and variance are scaled by 10−3

HH (natural gas)

TABLE INUMERICAL RESULTS FOR DJUSCL RETURN SERIES



DJUSCL 0.31 0.74 0.41 5.60

GARCH 0.80 1.08 0.33 0.34 0.33 0.07 9.36

RBF 0.76 1.03 0.13 0.39 0.88 0.18 5.28

MoG 0.78 1.04 0.18 0.77 1.01 0.32 5.20

ANFIS 0.77 1.04 0.17 0.27 0.91 0.41 5.34

HONFIS 0.76 1.04 0.17 0.29 0.81 0.35 5.52


European data set not showing significant differences inthe forecasting ability. The studied commodities and therelated indexes are: coal (DJUSCL, in $/ton); Henry Hubnatural gas (HH, in $/MMBtu); crude oil (WTI, in $/barrel),and electricity (PJM, in $/MWh). For electricity prices wechose the peak load contract referred to h. 12:00. For eachcommodity price log-returns are estimated using (1). Wechose a well representative time window across the ‘verycritical’ year 2008, i.e. from the beginning of 2006 to the endof 2009. So, taking into account that we have approximately250 prices and 249 returns per year in the case of coal,natural gas and crude oil series, each return series consistsof about 1000 samples. In the case of electricity prices wehave a series of 365 data, given that electricity is traded everyday of the year; in this application, for comparison purposes,we adjust the series to 250 trading days only. Each modelis trained on the previous NT = 500 samples (almost twoyears) and NS = 500 samples are predicted, i.e. the last twoyears starting from t = 501 to the last sample of 2009.

The prediction errors of the conditional mean are eval-uated; in addition, the four unconditional moments are es-timated for both the predicted sequences and the relatedoriginal series. Prior investigations can be made in order tofind the best combination of the model orders and the size ofthe training set as well; a fine tuning for the optimal estimateof every model can be addressed in future research works. Inorder to obtain an accurate comparison of the performancesobtained by the proposed neural networks with respect tostandard models, we carried out a preliminary optimizationof the main model parameters, i.e. R, M , P , and Q, so asto obtain the best performance of the reference ‘GARCH’model for a given time series. Then, every model will usethe same parameters when applied to the same time series.

The optimal parameters of the coal DJUSCL returns areR = 1, M = 1, P = 5, Q = 1; hence, a GARCH referencemodel ARMA(1, 5)-GARCH(1, 1) is fitted. The numericalresults are summarized in Table I: all the neural models scorea prediction error better than GARCH; RBF obtains the bestNSR but the skewness is not properly matched as in the caseof ANFIS and HONFIS that have a comparable performance.HONFIS achieves a good NSR performance of 0.17 dB andit is able to follow the dynamics of both conditional meanand increasing volatility, as proved by the behavior of theestimated conditional variance shown in Fig. 1.

500 600 700 800 900 1000−0.2

0


500 600 700 800 900 1000−0.2

0


500 600 700 800 900 10000

2

4x 10

−3 Conditional variance (volatility)

Fig. 1. Prediction of coal returns using the HONFIS neural network.

TABLE IINUMERICAL RESULTS FOR HH RETURN SERIES



HH 0.46 1.72 1.46 10.62

GARCH 1.97 1.14 0.57 1.65 0.12 2.49 67.15

RBF 1.87 1.09 0.37 0.44 1.92 0.27 4.54

MoG 1.80 1.05 0.21 0.34 1.90 1.13 9.94

ANFIS 2.05 1.19 0.75 0.47 1.87 0.71 7.35

HONFIS 1.67 1.03 0.13 0.45 1.77 0.99 8.46


The numerical results for HH returns of natural gas arereported in Table II. The optimal parameters are in thiscase R = 2, M = 2, P = 2, Q = 1, so a GARCHreference model ARMA(2, 2)-GARCH(2, 1) is fitted. TheHONFIS neural network has the best NSR performance of0.13 dB and the related moments adequately fit with thoseof the original time series. A sufficient accuracy is alsoobtained by RBF and ANFIS neural networks. The GARCHis not suitable for the prediction of HH returns, since themoments are estimated very poorly, especially the kurtosis.The numerical results of the HONFIS neural network arequalitatively confirmed by the accurate predictions reportedin the plots of Fig. 2, especially in the case of volatility.

The large volatility of crude oil WTI returns at the endof 2008 is the feature that requires an accurate forecastingtechnique. A more complex model is therefore necessary,using R = 4, M = 2, P = 2, Q = 3. The GARCH referencemodel ARMA(4, 2)-GARCH(2, 3) is evidently outperformedby the neural networks, as evidenced by the results sum-marized in Table III. The best NSR is once again obtainedby HONFIS, although the predicted sequence does not fit theskewness of the original one; the ANFIS performance suffersfrom the same drawback. The MoG network is able to fitthe original moments, also maintaining a good predictionaccuracy and following the changes of volatility in theunderlying process.

Finally, the numerical results for the returns of PJM elec-


WTI (crude oil)

Fig. 2. Prediction of natural gas returns using the HONFIS neural network.

TABLE IIINUMERICAL RESULTS FOR WTI RETURN SERIES



WTI 0.10 0.11 0.34 6.51

GARCH 0.12 1.00 0.01 0.43 0.01 0.48 9.38

RBF 0.11 0.94 0.27 0.01 0.12 0.16 4.06

MoG 0.11 0.92 0.36 0.10 0.10 0.44 5.73

ANFIS 0.09 0.89 0.50 0.05 0.12 0.25 7.22

HONFIS 0.08 0.85 0.71 0.08 0.13 0.01 6.98


TABLE IVNUMERICAL RESULTS FOR PJM RETURN SERIES



PJM 0.22 1.21 0.38 5.52

GARCH 1.19 0.98 0.09 1.53 0.24 0.53 8.10

RBF 0.71 0.58 2.36 0.11 1.64 0.01 2.96

MoG 0.57 0.47 3.28 0.41 1.77 0.28 4.61

ANFIS 0.64 0.53 2.76 0.21 1.24 0.26 5.58

HONFIS 0.60 0.51 2.92 0.17 1.21 0.29 5.46


tricity index are reported in Table IV. The model parametersare R = 3, M = 2, P = 1, Q = 1; the GARCHreference model is ARMA(3, 2)-GARCH(1, 1). The MoGneural network performs better than the other models inthis case. Anyway, the proposed HONFIS model performsbetter than the original ANFIS network. Globally, neuralnetworks improve the NSR performance of more than 2 dBwith respect to GARCH, despite a biasing that shifts theestimate of the mean to negative values.

V. CONCLUSION

A new neural network approach is proposed for model-ing time series associated with energy commodity prices,which is based on fuzzy neural networks using higher-orderSugeno-type consequent rules. The use of a constructive

procedure determining automatically the optimal number offuzzy rules is also illustrated in order to avoid overfitting andmaximize the generalization capability of the neural network.

The proposed approach provides an accurate descriptionof energy prices dynamics, allowing us to estimate dailyprices for energy commodities over a long time horizon.The validation performed on historical data shows that theneural network approach generates prices that are able toreplicate the daily data and to reproduce the same probabilitydistribution for the various series. The proposed HONFISmodel, using quadratic consequent rules, outperforms theoriginal ANFIS network in almost all cases, making usefulsignificantly the increased complexity of the related model.

Currently, we are investigating more advanced techniquesfor the application of the proposed approach to a multivariatetime series analysis and for the automatic and more reliableselection of the samples to be used for prediction, includingthe order of regression models and the resulting complexityof neural models.

REFERENCES

[1] J. Hamilton, “Understanding crude oil prices,” Energy Journal, vol. 20,no. 2, pp. 179–206, 2009.

[2] R. Gibson and E. Schwartz, “Stochastic convenience yield and thepricing of oil contingent claims,” The Journal of Finance, vol. 45, pp.959–976, 1990.

[3] C. Mari, “Random movements of power prices in competitive markets:a hybrid model approach,” The Journal of Energy Markets, vol. 1,no. 2, pp. 87–103, 2008.

[4] T. Mills, Time Series Techniques for Economists. Cambridge, UK:Cambridge University Press, 1990.

[5] I. Haidar, S. Kulkarni, and H. Pan, “Forecasting model for crude oilprices based on artificial neural networks,” in Proc. of InternationalConference on Intelligent Sensors, Sensor Networks and InformationProcessing (ISSNIP 2008), Sydney, NSW, Australia, 2008, pp. 103–108.

[6] M. Panella, F. Barcellona, and R.L. D’Ecclesia, “Subband predictionof energy commodity prices,” in Proc. of the IEEE Int. Workshopon Signal Processing Advances in Wireless Communications (SPAWC2012). Cesme, Turkey: IEEE, 2012.

[7] M. Panella, L. Liparulo, F. Barcellona, and R.L. D’Ecclesia, “A studyon crude oil prices modeled by neurofuzzy networks,” in Proc. ofFUZZ-IEEE 2013, Hyderabad, India, 2013.

[8] T. Bollerslev, “Generalized autoregressive conditional heteroskedastic-ity,” Journal of Econometrics, vol. 31, p. 307327, 1986.

[9] R. Engle, “Autoregressive conditional heteroscedasticity with estimatesof the variance of united kingdom inflation,” Econometrica, vol. 50,no. 4, pp. 987–1007, 1982.

[10] M. Panella, F. Barcellona, and R.L. D’Ecclesia, “Forecasting energycommodity prices using neural networks,” Advances in DecisionSciences, vol. 2012, 2012.

[11] J.-S. Jang, C. Sun, and E. Mizutani, Neuro-Fuzzy and Soft Computing:a Computational Approach to Learning and Machine Intelligence.Upper Saddle River, NJ, USA: Prentice Hall, 1997.

[12] S. Haykin, Neural Networks, a Comprehensive Foundation, 2nd Edi-tion. Englewood Cliffs, NJ, USA: Prentice-Hall, 1999.

[13] M. Panella, “A hierarchical procedure for the synthesis of ANFISnetworks,” Advances in Fuzzy Systems, vol. 2012, 2012.

[14] G. Seber and C. Wild, Nonlinear Regression. NJ: Wiley-Interscience:Hoboken, 2003.

[15] A. Bors and I. Pitas, “Median radial basis function neural network,”IEEE Trans. Neural Netw., vol. 7, no. 6, pp. 1351–1364, 1996.

[16] M. Panella, “Advances in biological time series prediction by neuralnetworks,” Biomedical Signal Processing and Control, vol. 6, no. 2,pp. 112–120, 2011.

[17] S. Chiu, “Fuzzy model identification based on cluster estimation,”Journal of Intelligent & Fuzzy Systems, vol. 2, pp. 267–278, 1994.


PJM (electricity)

500 600 700 800 900 1000−0.5

0


500 600 700 800 900 1000−0.5

0


500 600 700 800 900 10000

0.01

0.02Conditional variance (volatility)

Fig. 2. Prediction of natural gas returns using the HONFIS neural network.

TABLE IIINUMERICAL RESULTS FOR WTI RETURN SERIES



WTI 0.10 0.11 0.34 6.51

GARCH 0.12 1.00 0.01 0.43 0.01 0.48 9.38

RBF 0.11 0.94 0.27 0.01 0.12 0.16 4.06

MoG 0.11 0.92 0.36 0.10 0.10 0.44 5.73

ANFIS 0.09 0.89 0.50 0.05 0.12 0.25 7.22

HONFIS 0.08 0.85 0.71 0.08 0.13 0.01 6.98


TABLE IVNUMERICAL RESULTS FOR PJM RETURN SERIES



PJM 0.22 1.21 0.38 5.52

GARCH 1.19 0.98 0.09 1.53 0.24 0.53 8.10

RBF 0.71 0.58 2.36 0.11 1.64 0.01 2.96

MoG 0.57 0.47 3.28 0.41 1.77 0.28 4.61

ANFIS 0.64 0.53 2.76 0.21 1.24 0.26 5.58

HONFIS 0.60 0.51 2.92 0.17 1.21 0.29 5.46


The results for the PJM electricity index are reported in Table IV. The model parameters are R = 3, M = 2, P = 1, Q = 1; the GARCH reference model is ARMA(3, 2)-GARCH(1, 1). The MoG neural network performs better than the other models in this case. Nevertheless, the proposed HONFIS model performs better than the original ANFIS network. Overall, the neural networks improve the NSR performance by more than 2 dB with respect to GARCH, despite a bias that shifts the estimate of the mean to negative values.

V. CONCLUSION

A new neural network approach is proposed for modeling time series associated with energy commodity prices, based on fuzzy neural networks using higher-order Sugeno-type consequent rules. A constructive procedure that automatically determines the optimal number of fuzzy rules is also illustrated, in order to avoid overfitting and maximize the generalization capability of the neural network.

The proposed approach provides an accurate description of energy price dynamics, allowing us to estimate daily prices for energy commodities over a long time horizon. The validation performed on historical data shows that the neural network approach generates prices that replicate the daily data and reproduce the same probability distribution for the various series. The proposed HONFIS model, using quadratic consequent rules, outperforms the original ANFIS network in almost all cases, which justifies the increased complexity of the model.

We are currently investigating more advanced techniques for applying the proposed approach to multivariate time series analysis and for the automatic and more reliable selection of the samples to be used for prediction, including the order of the regression models and the resulting complexity of the neural models.




Conclusions

Research motivation: Fuzzy Pattern Recognition for modeling complex systems in real-world contexts.

Main results:

A new methodology for computing the pattern-to-cluster distance was implemented and tested.

It was shown that new membership functions without geometric constraints, combined with new clustering and classification methodologies, allow complex systems to be modeled with better performance than the best-known Pattern Recognition algorithms, offering a solution to the problem of overlapping clusters/classes.

It was shown that the developed techniques achieve excellent results, in terms of maximum classification rate, in recognizing movements performed by stroke patients in the biomedical field, by processing data from IMU or sEMG sensors.

A new methodology based on clustering in the joint input-output space and on neuro-fuzzy networks solves regression problems and can be applied to forecast the price/return of time series for energy commodities.



List of Publications

1 Festa, A., Panella, M., Lo Sterzo, R., Liparulo, L., “Radiofrequency identification systems for healthcare: A case study on electromagnetic exposures”, (2013) Journal of Clinical Engineering, 38 (3), pp. 125-133.

2 Maisto, M., Panella, M., Liparulo, L., Proietti, A., “An accurate algorithm for the identification of fingertips using an RGB-D camera”, (2013) IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 3 (2), pp. 272-283.

3 Liparulo, L., Proietti, A., Panella, M., “Fuzzy membership functions based on point-to-polygon distance evaluation”, (2013) IEEE International Conference on Fuzzy Systems, 7-10 July 2013, Hyderabad (India).

4 Panella, M., Liparulo, L., Barcellona, F., D’Ecclesia, R.L., “A study on crude oil prices modeled by neurofuzzy networks”, (2013) IEEE International Conference on Fuzzy Systems, 7-10 July 2013, Hyderabad (India).

5 Liparulo, L., Proietti, A., Panella, M., “Fuzzy Clustering Using the Convex Hull as Geometrical Model”, (2014) Advances in Fuzzy Systems. (In press)

6 Liparulo, L., Proietti, A., Panella, M., “Improved Online Fuzzy Clustering Based on Unconstrained Kernels”, (2015) IEEE International Conference on Fuzzy Systems, 2-5 August 2015, Istanbul (Turkey). (In press)

7 Zhang, Z., Liparulo, L., Panella, M., Gu, X., Fang, Q., “A Fuzzy Kernel Motion Classifier for Autonomous Stroke Rehabilitation”, IEEE Journal of Biomedical and Health Informatics. (Second round of review)

8 Liparulo, L., Zhang, Z., Panella, M., Gu, X., Fang, Q., “A Novel Fuzzy Approach for Automatic Brunnstrom Stage Classification using Surface Electromyography”, Transactions on Neural Systems & Rehabilitation Engineering. (Second round of review)

9 Panella, M., Liparulo, L., Proietti, A., “A higher-order fuzzy neural network for modeling financial time series”, 2014 International Joint Conference on Neural Networks (IJCNN), pp. 3066-3073, 6-11 July 2014, Beijing (China).

10 Panella, M., Liparulo, L., Proietti, A., “A Higher-Order Fuzzy Neural Network Applied to Financial Time Series Analysis”, Quantitative Finance. (Invited paper, extension of [9], submitted)

11 Altilio, R., Liparulo, L., Panella, M., Paoloni, M., Proietti, A., “Multimedia and Gaming Technologies for Telerehabilitation of Motor Disabilities”, IEEE Internet Computing. (Submitted)



Thank you for your attention.



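Clustering (and classification) algorithms

Min-Max

\[ \mu_h(x) = \frac{1}{n} \sum_{j=1}^{n} \left[ 1 - f(x_j - w_{hj}) - f(v_{hj} - x_j) \right], \]

where $f : \mathbb{R} \to [0, 1]$ is defined as:

\[ f(\alpha) = \begin{cases} 1, & \text{if } \gamma\alpha > 1 \\ \gamma\alpha, & \text{if } 0 \le \gamma\alpha \le 1 \\ 0, & \text{if } \gamma\alpha < 0 \end{cases} \]

Fuzzy C-Means

\[ \mu_h(x) = \frac{1}{\displaystyle\sum_{r=1}^{c} \left[ \frac{d_2(x, C_h)}{d_2(x, C_r)} \right]^{\frac{2}{m-1}}} \]

K-Means

\[ J = \sum_{j=1}^{k} \sum_{i=1}^{n} \left\| x_i^{(j)} - c_j \right\|^2 \]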
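As an illustration of the first two formulas, the sketch below evaluates the Min-Max hyperbox membership and the Fuzzy C-Means membership in plain Python. The function names, the reading of v and w as the minimum and maximum corners of the hyperbox, and the handling of a pattern that coincides with a centroid are assumptions made for this example; they are not taken from the slides.

```python
import math

def ramp(alpha, gamma):
    """Threshold function f: gamma * alpha clamped to [0, 1]."""
    return min(1.0, max(0.0, gamma * alpha))

def minmax_membership(x, v, w, gamma):
    """Min-Max membership of pattern x in a hyperbox.

    Assumption: v is the minimum corner and w the maximum corner.
    """
    n = len(x)
    total = sum(1.0 - ramp(xj - wj, gamma) - ramp(vj - xj, gamma)
                for xj, vj, wj in zip(x, v, w))
    return total / n

def fcm_memberships(x, centroids, m=2.0):
    """Fuzzy C-Means memberships of x with respect to every centroid."""
    d = [math.dist(x, c) for c in centroids]  # Euclidean distance d_2
    # Assumption: a pattern lying exactly on one or more centroids is
    # shared equally among them, which avoids a division by zero.
    if any(di == 0.0 for di in d):
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    expo = 2.0 / (m - 1.0)
    return [1.0 / sum((dh / dr) ** expo for dr in d) for dh in d]
```

A pattern inside the hyperbox gets membership 1 from `minmax_membership`, and the values returned by `fcm_memberships` always sum to 1 across clusters.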



Techniques for generalizing the membership functions

Hyperbox rotation

⇓ Generalized membership functions

(a) Generalized Simpson (b) Trapezoid (c) Ellipsoid (d) Paraboloid

⇓ New methodology: use of membership functions without geometric constraints.



Geometric method

Parameters

Geometric properties

N = number of vertices

s = [s_1 s_2 . . . s_N]

t = [t_1 t_2 . . . t_N]

[Figure omitted: a polygon with vertices (s_1, t_1), (s_2, t_2), (s_3, t_3) and a point P shown in two configurations, case 1 and case 2.]

Result: a vector of N elements containing the distances of the point p from the N sides of the polygon.

Overall distance

\[ d^{(geo)}_{s,t}(p) = \min_{i=1 \dots N} \left[ d^{(geo)}_1(p), \dots, d^{(geo)}_N(p) \right] \]

Geometric MF

\[ \mu^{(geo)}_{s,t}(p) = \max\left[ 0,\; 1 - \gamma \cdot d^{(geo)}_{s,t}(p) \right] \]

Triangle = [(0.3,0.3);(0.7,0.4);(0.4,0.7)]
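The geometric membership function can be sketched as follows, using the triangle given on the slide. This is a minimal illustration: the helper names are hypothetical, and the distance to each side is computed as a point-to-segment distance with the projection clamped to the segment, which is one plausible reading of the two cases in the figure. The treatment of points inside the polygon is not stated on the slide and is deliberately left out.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment with endpoints a, b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p on the line through a and b, clamped so
    # that the closest point stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def geometric_distance(p, vertices):
    """Minimum over the N sides of the distance between p and each side."""
    n = len(vertices)
    return min(point_segment_distance(p, vertices[i], vertices[(i + 1) % n])
               for i in range(n))

def geometric_membership(p, vertices, gamma):
    """mu = max(0, 1 - gamma * d), with d the overall geometric distance."""
    return max(0.0, 1.0 - gamma * geometric_distance(p, vertices))

triangle = [(0.3, 0.3), (0.7, 0.4), (0.4, 0.7)]
```

With this sketch, a point on the boundary has membership 1, and the membership decays linearly with slope γ until it reaches 0.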



Efficiency

Number of runs on the same machine for the same polygon: 1000.

Hardware architecture: x64 platform with an Intel® Core i7-2600K CPU @ 3.4 GHz

Polygon type   Geometric   Gaussian   Cone
Triangle       7.7         0.076      0.061
Box            20.4        0.082      0.063
Hexagon        32.1        0.093      0.068

All values are in ms

Results:

The geometric method is the slowest.

The Gaussian and cone-shaped approaches have comparable speeds.

The execution time grows with the number of vertices, but at a different rate for each method; the geometric method grows far more sharply than the other two.

In terms of efficiency, the cone-shaped method is the best.



Accuracy

For all three polygons, a set of M = 1000 points p_r, r = 1 . . . M, was chosen at random within the normalized space [0, 1].

For both methods, accuracy is expressed as the standard deviation with respect to the membership value obtained with the reference (geometric) method:

\[ \zeta^{(gauss)} = \sqrt{ \frac{ \sum_{r=1}^{M} \left[ \mu^{(gauss)}_{s,t}(p_r) - \mu^{(geo)}_{s,t}(p_r) \right]^2 }{M} }, \]

\[ \zeta^{(cone)} = \sqrt{ \frac{ \sum_{r=1}^{M} \left[ \mu^{(cone)}_{s,t}(p_r) - \mu^{(geo)}_{s,t}(p_r) \right]^2 }{M} }. \]

Numerical results:

Polygon shape   ζ(gauss)   ζ(cone)
Triangle        5.5        6.0
Box             2.1        3.3
Hexagon         1.6        2.5

All values are in units of 10^{-3}



Numerical simulations on the membership functions

⇒

MF γ MF Threshold Av. Time (s)

Simpson 2 0.70 0.12

Gaussian method 1 0.65 0.03

Cone method 1 0.65 0.03




1 $\Omega^{(m)}_{\max} \le \theta_{\min}$.
  - The pattern cannot be assigned to any existing cluster.
  - A new cluster is created, coinciding with the current pattern $x_m$.
  - NO convex hull.
  - $K \leftarrow K + 1$.

2 $\theta_{\min} < \Omega^{(m)}_{\max} < \theta_{\max}$.
  - The pattern $x_m$ is assigned to the h-th cluster, the one with the maximum MF value.
  - YES convex hull (h-th cluster).
  - K is not updated.

3 $\Omega^{(m)}_{\max} \ge \theta_{\max}$.
  - The pattern $x_m$ is assigned to the h-th cluster, the one with the maximum MF value.
  - NO convex hull. The high MF value is assumed to mean that the pattern lies very close to the boundary of the h-th cluster. This choice provides:
    - a large reduction in the computational cost of running the algorithm;
    - error prevention, since it stops the cluster from growing so large that it introduces uncertainty into the final clustering.
  - K is not updated.
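The three cases above can be summarized as a small decision function. This is only a sketch: the function name, the returned tuple, and the reading of the convex-hull flag as "the hull of the winning cluster is recomputed" are assumptions made for illustration, as is the handling of the very first pattern, when no cluster exists yet.

```python
def assign_pattern(memberships, theta_min, theta_max):
    """Decide what to do with a pattern given its MF value for each cluster.

    Returns (action, cluster_index, update_hull).
    """
    # Assumption: with no existing cluster, the pattern starts a new one.
    if not memberships:
        return ("new_cluster", None, False)

    # Winning cluster h and its membership value (Omega_max).
    h = max(range(len(memberships)), key=lambda i: memberships[i])
    omega_max = memberships[h]

    if omega_max <= theta_min:
        # Case 1: no existing cluster fits; K grows by one, no hull.
        return ("new_cluster", None, False)
    if omega_max < theta_max:
        # Case 2: assign to cluster h; convex-hull flag set (read as: update).
        return ("assign", h, True)
    # Case 3: assign to cluster h and skip the convex hull.
    return ("assign", h, False)
```

For example, with thresholds 0.3 and 0.9, memberships `[0.5, 0.2]` fall under case 2 and assign the pattern to cluster 0 with the hull flag set.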




Results: validity indices.

[Figures omitted. For each dataset (Hepta, 3-D; WDBC, 30-D; Iris, 4-D; Wine, 13-D; NewThyroid, 5-D), panels (a.1)-(a.3) plot the error difference (%) and panels (b.1)-(b.3) plot the error rate (%) against the presentation order (1 to 100).]



Sensitivity analysis.

[Histograms omitted: number of occurrences versus number of clusters (0 to 25) for each dataset. Panel labels: Hepta (3D) - 7; Iris (4D) - 3; Seeds (7D) - 3; Wine (13D) - 3; NewThyroid (5D) - 3; WDBC (30D) - 2.]




Results.

UFOC vs Min-Max: number of runs, out of 10 trials, in which the best performance is obtained

Algorithm   Hepta   Iris    NewThyroid   Seed    WDBC     Wine
            (3-D)   (4-D)   (5-D)        (7-D)   (30-D)   (13-D)
UFOC        10      8       7            10      10       8
MIN-MAX     10      3       3            1       1        3

UFOC vs Min-Max: mean and best error rate (%) after 10 runs of the algorithm

Algorithm   Hepta   Iris    NewThyroid   Seed    WDBC     Wine
            (3-D)   (4-D)   (5-D)        (7-D)   (30-D)   (13-D)
(Mean)
UFOC        0.00    9.27    12.88        8.71    11.70    10.11
MIN-MAX     0.00    9.33    14.80        28.71   14.71    22.98
(Best)
UFOC        0.00    8.67    2.32         7.61    6.50     3.37
MIN-MAX     0.00    6.00    10.23        12.38   6.50     3.93




Results.

Mean error rate (%) after 10 runs of the algorithm

Algorithm     Hepta   Iris    NewThyroid   Seed    WDBC     Wine
              (3-D)   (4-D)   (5-D)        (7-D)   (30-D)   (13-D)
UFOC          0.00    9.27    12.88        8.71    11.70    10.11
FCM           0.00    10.67   9.30         10.00   7.21     5.05
GK            1.41    10.00   13.02        10.47   17.22    38.26
K-means       0.00    11.33   11.16        10.95   7.20     3.49
Clusterdata   0.00    34.00   27.91        66.67   63.09    66.30

Best error rate (%) after 10 runs of the algorithm

Algorithm     Hepta   Iris    NewThyroid   Seed    WDBC     Wine
              (3-D)   (4-D)   (5-D)        (7-D)   (30-D)   (13-D)
UFOC          0.00    8.67    2.32         7.61    6.50     3.37
FCM           0.00    10.67   9.30         10.00   7.21     5.05
GK            0.00    10.00   13.02        10.47   17.22    34.27
K-means       0.00    11.33   11.16        10.95   7.20     3.37
Clusterdata   0.00    34.00   27.91        66.67   63.09    66.30



Starting point: RMIT



IMU features




RESULTS

Dataset 6

6 patients

Brunnstrom stage from III to V

6 x 10 x 6 = 360 patterns

[Figure omitted: panels (a)-(f), error rate (%) versus γ values (5 to 20).]

Minimum error rate: 0%

Dataset 14

14 patients

Brunnstrom stage from I to V

531 patterns

[Figure omitted: panels (a)-(f), error rate (%) versus γ values (5 to 20).]





Confusion matrix (rows: actual value; columns: estimated output).

       M1*   M2*   M3*   M4*   M5*   M6*
M1     141   0     0     0     0     0
M2     0     80    0     0     0     0
M3     0     0     90    0     0     0
M4     1     0     1     58    0     0
M5     0     0     1     0     79    0
M6     0     0     0     0     0     80

Error rate.

Algorithm used 6 Patients 14 Patients

Fuzzy Kernel Classifier (FKC) 0.00 0.56

Neuro-fuzzy classifier 1.67 1.70

Classification Tree (CART) 5.28 4.90

Support Vector Machine (SVM) 1.32 1.11

Linear Discriminant Analysis (LDA) 1.67 8.35

Quadratic Discriminant Analysis (QDA) 0.56 1.32

NaiveBayes 3.06 6.03

Probabilistic Neural Network (PNN) 11.11 67.23
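As a cross-check, the error rate implied by the confusion matrix above can be recomputed directly. The matrix sums to 531 patterns, the size of the 14-patient dataset, with three off-diagonal entries; the short sketch below (function name chosen for illustration) derives the corresponding percentage, which agrees with the 0.56 reported for the Fuzzy Kernel Classifier on 14 patients.

```python
confusion = [
    [141, 0, 0, 0, 0, 0],
    [0, 80, 0, 0, 0, 0],
    [0, 0, 90, 0, 0, 0],
    [1, 0, 1, 58, 0, 0],
    [0, 0, 1, 0, 79, 0],
    [0, 0, 0, 0, 0, 80],
]

def error_rate_percent(matrix):
    """Percentage of patterns that fall off the main diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * (total - correct) / total

total_patterns = sum(sum(row) for row in confusion)
rate = error_rate_percent(confusion)  # 3 misclassified out of 531
```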




sEMG features (1/3)

The surface EMG signals, sampled at 3000 Hz, are first fed through a 10th-order digital elliptic band-pass filter with a pass-band from 20 to 500 Hz and 30 dB attenuation in the stop-bands for noise reduction. The filtered samples are then rectified for activation detection using the Root-Mean-Square (RMS) method with a sliding Hamming window, as presented below. Let x be the filtered EMG input signal and s the rectified output signal; the rectification process can be written as:

\[ s(n) = \sqrt{ \frac{1}{L} \sum_{k=n}^{n+L-1} \left[ x(k)\, w(k) \right]^2 }, \quad 1 \le n \le N, \]

where N is the number of windowed segments, L is the window length, and w(k) is the Hamming window function defined as:

\[ w(k) = 0.54 - 0.46 \cos\left( 2\pi \frac{k}{L-1} \right), \quad 1 \le k < L. \]

The muscle activations were then automatically localized by setting a threshold in relation to the signal magnitude. It can be seen that rectangular windows are applied over the detected movement onsets. The window length is calculated to be 20% larger than the activation period determined by an amplitude threshold, so as to cover the complete movement. Activation periods that are too close to the beginning or the end of the sample sequence are discarded to avoid including unintended or incomplete movements.
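The RMS rectification with a sliding Hamming window can be sketched in a few lines. This is an illustrative version under stated assumptions: the window is indexed from the start of each segment (offsets 0 to L-1), its denominator uses the window length L, and the window advances one sample at a time; the function names are hypothetical and the band-pass filtering step is not included.

```python
import math

def hamming(L):
    """Hamming window of length L, indexed from 0 to L-1 (requires L >= 2)."""
    if L < 2:
        raise ValueError("window length must be at least 2")
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (L - 1))
            for k in range(L)]

def rms_envelope(x, L):
    """Sliding-window RMS of x, each segment weighted by a Hamming window."""
    w = hamming(L)
    out = []
    # One output value per fully covered segment of L samples.
    for n in range(len(x) - L + 1):
        acc = sum((x[n + k] * w[k]) ** 2 for k in range(L))
        out.append(math.sqrt(acc / L))
    return out
```

The output has `len(x) - L + 1` values; an amplitude threshold applied to it would then mark the activation periods described above.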



sEMG features (2/3)

Ten features in both the time and frequency domains are extracted from the segmented sEMG samples before classification. The details of each feature are presented below:

Maximum Amplitude: the maximum amplitude reached in the rectified signal.

Mean Amplitude: the mean amplitude of the rectified signal.

Activation duration: the length of the data segment that represents the duration of the muscle activation.

Signal Energy: the energy estimated using the Teager-Kaiser Energy Operator (TKEO) during muscle activation. The TKEO in discrete form is given as:

\[ \psi[x(n)] = x(n)^2 - x(n+1)\, x(n-1), \]

where x(n) is the sEMG data sequence. The signal energy can then be calculated as:

\[ ENE = \sum_{n=1}^{N} \psi[x(n)]. \]

Maximum changing rate: the peak value in the first derivative of the rectified sEMG signal.

The 2nd and 3rd Linear Prediction Coefficients (LPC): the 2nd and 3rd LPC are computed by constructing a 2nd-order forward linear predictor of the sEMG sample signal and minimizing the prediction error with the Least-Squares method, using the ‘lpc’ function in MATLAB.
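The signal-energy feature follows directly from the discrete TKEO formula above. In the sketch below the operator is evaluated only at interior samples, where both neighbours exist; that boundary convention and the function names are assumptions for illustration.

```python
def tkeo(x):
    """Discrete Teager-Kaiser operator at every interior sample of x."""
    return [x[n] ** 2 - x[n + 1] * x[n - 1] for n in range(1, len(x) - 1)]

def signal_energy(x):
    """Sum of the TKEO output over the segment."""
    return sum(tkeo(x))
```

For the sequence `[0, 1, 0, -1, 0]` the operator gives 1 at each interior sample, so the energy is 3.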



sEMG features (3/3)

Average Zero Crossing (ZC) rate: the ZC rate is calculated by counting the zero-crossing events of the original sEMG signal within a window, defined as:

\[ C(n) = \frac{1}{L} \sum_{k=n}^{n+L-1} \operatorname{sgn}\big( x(k)\, x(k+1) \big), \]

where L is the window length and x is the original sEMG signal. The average ZC rate is computed as:

\[ AZC = \frac{1}{N} \sum_{n=1}^{N} C(n). \]

Mean Power Frequency (MPF): the MPF is the centroid frequency of the signal power spectrum, defined as:

\[ MPF = \frac{ \sum_{n=1}^{N} P(n)\, f(n) }{ \sum_{n=1}^{N} P(n) }, \]

where P is the power spectrum estimated using Welch’s modified periodogram method and f is the normalized frequency vector.

Median Frequency (MF): the MF is the frequency that divides the sEMG power spectrum into two equal portions with the same accumulated power. It can be defined as:

\[ \sum_{n=1}^{MF} P(n) = \sum_{n=MF}^{N} P(n) = \frac{1}{2} \sum_{n=1}^{N} P(n). \]
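The two spectral features can be illustrated on a power spectrum that has already been estimated. The sketch below takes P and f as plain lists; the estimation step (Welch's method) is not reproduced, and returning the first bin at which the cumulative power reaches half of the total is an assumed discretization of the median-frequency definition.

```python
def mean_power_frequency(P, f):
    """Centroid of the power spectrum: sum(P * f) / sum(P)."""
    return sum(p * fr for p, fr in zip(P, f)) / sum(P)

def median_frequency_index(P):
    """Index of the first bin where cumulative power reaches half the total."""
    half = 0.5 * sum(P)
    acc = 0.0
    for i, p in enumerate(P):
        acc += p
        if acc >= half:
            return i
    raise ValueError("empty spectrum")
```

The median frequency itself is then `f[median_frequency_index(P)]`.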



Statistical analysis of the features

Correlation coefficient between the sEMG signal features and the progression of post-stroke recovery.

Feature InfoGain ReliefF Pearson Significance

Maximum Amplitude 1.017 0.130 0.71 P < 0.001

Mean Amplitude 1.201 0.149 0.73 P < 0.001

Activation Duration 0.286 0.036 - 0.36 P < 0.001

Signal Energy 1.231 0.042 0.57 P < 0.001

Maximum Changing Rate 0.677 0.117 0.73 P < 0.001

2nd LPC 0.548 0.076 0.32 P < 0.001

3rd LPC 0.228 0.086 - 0.03 P = 0.388

Average Zero Crossing rate 0.386 0.047 0.50 P < 0.001

Mean Power Frequency 0.652 0.091 0.60 P < 0.001

Median Frequency 0.641 0.091 0.64 P < 0.001

Comparison between samples recorded from the affected side and those recorded from the healthy side. [Scatter plot omitted: Mean Amplitude versus Median Frequency, both on a 0 to 1 scale; legend: paretic side, non-paretic side.]

Correlation between Median Frequency and Brunnstrom stages. [Plot omitted: Brunnstrom stages (I to VI) versus Median Frequency, on a 0 to 1 scale.]




Training phase

[Plot omitted: Fold 1, error rate (%) versus γ values (2 to 10); legend: error rate, best error rate, γ* chosen for testing.]

[Block diagram omitted: input data x → unconstrained fuzzy membership function evaluation (fuzzification), producing µ(1)(x), µ(2)(x), µ(3)(x), µ(4)(x) → Winner-Takes-All (defuzzification), WTA: arg max over q = 1 . . . K of µ(q)(x) → output data. The fuzzy outputs are labeled 2, 3, 4 and ‘Healthy’.]



Embedding

A chaotic sequence S(n) can be considered as the output of a chaotic system that is observable only through S(n), which should be embedded in order to reconstruct the state-space evolution of this system. The general embedding technique is based on the determination of the following parameters:

embedding dimension D of the reconstructed state-space attractor, obtained by using the False Nearest Neighbors (FNN) method;

time lag T between the embedded past samples of S(n), obtained by using the Average Mutual Information (AMI) method; i.e.:

\[ x_n = \left[ S(n) \;\; S(n-T) \;\; \dots \;\; S(n-(D-1)T) \right], \]

where x_n is the reconstructed state at time n.

The solution of the embedding problem is useful for time series prediction. In a chaotic sequence, the prediction of S(n) can be obtained by using the relationship between the (reconstructed) state and the system output. In fact, the embedding of S(n) is intended to obtain an ‘unfolded’ version of the actual system attractor, so that the difficulty of the prediction task can be reduced. Therefore, the prediction of a chaotic sequence S(n) can be considered as the determination of the function f(·) that approximates the link between the reconstructed state x_n and the output sample at the prediction distance m, i.e. S(n+m), m > 0.
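To make the construction concrete, the sketch below builds the reconstructed states x_n and the corresponding prediction targets S(n+m) from a sequence, for given D, T and m. Choosing D and T (via FNN and AMI) is outside the scope of this example, and the function names are hypothetical.

```python
def embed(S, D, T):
    """Reconstructed states x_n = [S[n], S[n-T], ..., S[n-(D-1)T]].

    States are produced for every n that has D lagged samples available.
    """
    start = (D - 1) * T
    return [[S[n - i * T] for i in range(D)] for n in range(start, len(S))]

def make_dataset(S, D, T, m):
    """Input/target pairs (x_n, S[n+m]) for prediction at distance m > 0."""
    start = (D - 1) * T
    return [([S[n - i * T] for i in range(D)], S[n + m])
            for n in range(start, len(S) - m)]
```

For instance, with `S = list(range(10))`, D = 3 and T = 2, the first reconstructed state is `[4, 2, 0]`, and with m = 1 its target is 5. A regression model trained on such pairs plays the role of the function f(·).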



Training & Test Set

Model definition

M rules determined through clustering algorithms.



Optimization of the HONFIS structure

Main problems → Local convergence of the estimation algorithms and correct determination of the number C of rules.

Solutions → Running the clustering algorithm with different values of C and with different initializations for every value of C.

↓

The optimal network is selected by using the following cost function, which depends on the number of HONFIS rules:

\[ F(C) = (1 - \lambda)\, \frac{E(C) - E_{\min}}{E_{\max} - E_{\min}} + \lambda\, \frac{C}{N_T}, \]

where E_min and E_max are the extreme values of the performance E encountered during the analysis of the different HONFIS networks, and λ is a weight, 0 ≤ λ ≤ 1.

This weight is not critical, since the results are only slightly affected by its variation over a large interval centered at 0.5.
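The selection step can be sketched as follows, given the error E(C) already measured for each candidate number of rules. Two points are assumptions here: N_T is read as the number of training samples, which the slide does not define, and when all errors coincide the normalized error term is set to zero. The function name and the sample values are purely illustrative.

```python
def select_rules(errors, n_train, lam=0.5):
    """Pick the number of rules C minimizing the cost F(C).

    errors: dict mapping each candidate C to its measured error E(C).
    n_train: value used for N_T (assumed: number of training samples).
    """
    e_min, e_max = min(errors.values()), max(errors.values())
    span = e_max - e_min
    costs = {}
    for C, E in errors.items():
        # Normalized error in [0, 1]; zero when every error is identical.
        norm = (E - e_min) / span if span > 0 else 0.0
        costs[C] = (1.0 - lam) * norm + lam * C / n_train
    best = min(costs, key=costs.get)
    return best, costs

# Illustrative values only, not taken from the experiments.
best_C, costs = select_rules({2: 1.0, 4: 0.5, 8: 0.4}, n_train=20, lam=0.5)
```

In this made-up example the lowest error belongs to C = 8, but the complexity term makes C = 4 the minimizer of F, which is the trade-off the cost function is meant to capture.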



Graphical results

Logarithmic time series: WTI (United States) and Brent (Europe);

Training phase: 2001-2002 and 2007-2008;

Testing phase: 2003 and 2009.

[Plots omitted: WTI and Brent.]

