Fuzzy Pattern Recognition for Modelling Complex Systems in Real-World Contexts
PhD Programme, XXVII Cycle
Information and Communication Engineering
Rome, 13 April 2015
Candidate: Luca Liparulo
Supervisors: Prof. Gianni Orlandi, Prof. Massimo Panella
Introduction | Fuzzy MF | Fuzzy PR Convex-Hull Based | Unconstrained Fuzzy PR | Time Series Prediction | Conclusions
Outline
Research motivation
Fuzzy Membership Functions (MFs)
Convex-hull-based Fuzzy Pattern Recognition (PR)
Unconstrained Fuzzy Pattern Recognition (PR)
Clustering for solving regression problems
Conclusions
Luca Liparulo - Fuzzy Pattern Recognition for Modelling Complex Systems in Real-World Contexts, 13 April 2015
Research framework
“Pattern Recognition is the scientific discipline whose goal is the classification of objects into a number of categories or classes.” [S. Theodoridis, K. Koutroumbas - Pattern Recognition, Elsevier Science, 2006]
⇓
Pattern Recognition techniques aim to define a system able to solve, automatically and efficiently, a variety of problems that arise in most scientific activities, especially in engineering.
⇓
Research topic: alongside Aristotelian computational models, based on the age-old exclusive logic, the literature is investigating more flexible and sophisticated techniques based on fuzzy logic.
⇓
Fuzzy approaches allow clusters to overlap, since they do not assign each ‘object’ (pattern) to one and only one cluster.
Fuzzy algorithms produce a more accurate configuration, since they evaluate the membership value of each pattern against all clusters.
Fuzzy Pattern Recognition → Metric → Membership Function
Standard metrics
\[ d_k(\mathbf{x}, \mathbf{y}) = \left( \sum_{j=1}^{n} |x_j - y_j|^k \right)^{1/k}, \]
where n is the dimension of the data space, x_j and y_j, j = 1…n, are the components of the patterns along the j-th dimension, and k is the parameter that selects the particular type of Minkowski distance.
⇓
[Figure: panels (a) k = 1; (b) k = 2; (c) k → ∞]
The figure shows the most commonly used metrics: (a) Manhattan distance; (b) Euclidean distance; (c) Chebyshev distance, \( d_\infty(\mathbf{x}, \mathbf{y}) = \max_{j=1\dots n} |x_j - y_j| \).
⇓
Problem addressed: study of a new approach for evaluating the cluster-pattern distance.
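The Minkowski family above can be illustrated with a minimal Python sketch. The function name `minkowski` and the sample points are assumptions made for illustration only; they do not come from the slides.

```python
import math


def minkowski(x, y, k):
    """Minkowski distance of order k between two equal-length patterns."""
    if len(x) != len(y):
        raise ValueError("patterns must have the same dimension")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(k):
        # Limit k -> infinity: Chebyshev distance (largest coordinate gap).
        return max(diffs)
    return sum(d ** k for d in diffs) ** (1.0 / k)


x, y = (0.0, 0.0), (3.0, 4.0)
manhattan = minkowski(x, y, 1)          # k = 1
euclidean = minkowski(x, y, 2)          # k = 2
chebyshev = minkowski(x, y, math.inf)   # k -> infinity
```

For the two sample points the three metrics give 7, 5 and 4 respectively, which shows how the choice of k changes the notion of closeness.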
Research motivation
Define a new approach for evaluating the cluster-pattern distance.
Design and implement new clustering/classification algorithms that use membership functions without geometric constraints, in order to mitigate the problem of overlapping classes/clusters.
Show that fuzzy logic, combined with a more flexible cluster geometry, yields robust results through computationally efficient procedures when modelling complex systems in real-world contexts:
- Classification of the Brunnstrom stage, in Biomedical Engineering
- Time series prediction, for Energy Commodities
Activity as Visiting Student at RMIT University
Royal Melbourne Institute of Technology (RMIT), Melbourne, Australia (VIC)
School of Electrical and Computer Engineering
Biomedical Engineering Laboratory
Research topics:
Self Home Rehabilitation
Post Stroke Patients Analysis
Motion Classification
Brunnstrom Stage Recognition
FUZZY MEMBERSHIP FUNCTION
Fuzzy Membership Function
First problem addressed:
Point-centroid distance (centroid-based)
Point-polygon distance (boundary-based)
⇓
Proposed methods
a. Method based on a Gaussian kernel
b. Method based on a conic kernel
Gaussian kernel method
Definition of the method
The MF, based on the point-polygon distance, is computed as the superposition of a number of Gaussian kernels:
One Gaussian kernel placed on the centroid c;
N superimposed Gaussian kernels centred on the N vertices of the polygon.
\[ \mu^{(\mathrm{gauss})}(\mathbf{x}) = \exp\left( -\frac{\gamma^2}{2\delta} \sum_{j=1}^{L} (x_j - c_j)^2 \right) + \sum_{i=1}^{N} \exp\left( -\frac{\gamma^2}{2\delta} \sum_{j=1}^{L} (x_j - v_{ij})^2 \right), \]
where δ = √L, with L = number of features (dimensions) of the dataset.
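As a rough illustration of the Gaussian MF, the sketch below evaluates one kernel on the centroid plus one kernel per polygon vertex, with the exponent scaled by γ²/(2δ) and δ = √L. The function name `gaussian_mf`, the unit-square polygon and the value γ = 3 are illustrative assumptions, and the exponent scaling follows a reading of the formula on the slide rather than a confirmed implementation.

```python
import math


def gaussian_mf(x, centroid, vertices, gamma):
    """Sum of Gaussian kernels centred on the centroid and on each vertex."""
    delta = math.sqrt(len(x))              # delta = sqrt(L), L = number of features
    scale = gamma ** 2 / (2.0 * delta)     # assumed exponent scaling

    def kernel(p):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
        return math.exp(-scale * sq_dist)

    return kernel(centroid) + sum(kernel(v) for v in vertices)


# Hypothetical cluster: unit square with its centre as centroid.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
centre = (0.5, 0.5)
inside = gaussian_mf((0.5, 0.5), centre, square, gamma=3.0)
outside = gaussian_mf((3.0, 3.0), centre, square, gamma=3.0)
```

Written this way the value is a plain sum of N + 1 kernels, so it can exceed 1 near the polygon; the slide does not show any normalisation, and none is applied here.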
Cone-shaped kernel method
Definition of the method
This method is based on cone-shaped functions.
Compared with the Gaussian method, this MF goes to zero much more rapidly.
As in the Gaussian method, a cone-shaped MF is built as the superposition of a number of linear functions:
One linear kernel placed on the centroid c;
N superimposed linear kernels centred on the N vertices of the polygon.
\[ \mu^{(\mathrm{cone})}(\mathbf{x}) = \max\left[ 0,\ 1 - \frac{\gamma}{\delta}\, d_2(\mathbf{x}, \mathbf{c}) \right] + \sum_{i=1}^{N} \max\left[ 0,\ 1 - \frac{\gamma}{\delta}\, d_2(\mathbf{x}, \mathbf{v}_i) \right], \]
where d_2 is the Euclidean distance (k = 2) and δ = √L, with L = number of features (dimensions) of the dataset.
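The cone-shaped MF can be sketched in the same way. The function name `cone_mf` and the sample polygon are assumptions for illustration; `math.dist` (Python 3.8+) supplies the Euclidean distance d_2.

```python
import math


def cone_mf(x, centroid, vertices, gamma):
    """Sum of cone-shaped (clipped linear) kernels on the centroid and vertices."""
    delta = math.sqrt(len(x))              # delta = sqrt(L), L = number of features

    def kernel(p):
        # Each cone decreases linearly with distance and is clipped at zero.
        return max(0.0, 1.0 - gamma / delta * math.dist(x, p))

    return kernel(centroid) + sum(kernel(v) for v in vertices)


# Hypothetical cluster: unit square with its centre as centroid.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
centre = (0.5, 0.5)
near = cone_mf((0.5, 0.5), centre, square, gamma=2.0)
far = cone_mf((10.0, 10.0), centre, square, gamma=2.0)
```

Unlike the Gaussian kernel, each cone has finite support, so a sufficiently distant pattern gets a membership of exactly zero. This matches the remark that this MF vanishes much faster.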
Graphical representation
[Figure: MFs obtained with the Gaussian method and with the cone-based method]
MFs with different values of γ: (a) γ = 15; (b) γ = 10; (c) γ = 4; (d) γ = 3; (e) γ = 2; (f) γ = 1.
A reference benchmark - IRIS dataset
The IRIS dataset is one of the most popular databases in the pattern recognition literature. It consists of 3 classes of 50 elements each, and each class refers to a type of Iris (Iris Setosa, Iris Versicolor, Iris Virginica).
The parameters for applying the algorithm are N = 150 patterns and M = 4 features.
In the confusion matrices below, rows are the true classes and columns are the estimated classes.
Confusion matrix: Simpson’s MF
True \ Estimated      1       2       3
1                   100%      -       -
2                     -      94%      6%
3                     -      30%     70%
Confusion matrix: Gaussian MF
True \ Estimated      1       2       3
1                   100%      -       -
2                     -     100%      -
3                     -      36%     64%
Confusion matrix: Cone MF
True \ Estimated      1       2       3
1                   100%      -       -
2                     -     100%      -
3                     -      28%     72%
FUZZY PATTERN RECOGNITION CONVEX-HULL BASED
Fuzzy clustering using the convex hull as geometric model
The convex hull is used for the online computation of the points on which the MFs are placed.
The convex hull is used not only to outline the composition of the clusters graphically.
[Figure: plot with axes Feature 1 and Feature 2, both ranging from 0 to 1]
General setup of the sequential algorithm
Let M be the number of patterns of a dataset D = {x_1, x_2, …, x_M} and N the number of features → each pattern of the dataset is represented by the N-tuple of real numbers:
\[ \mathbf{x}_m = [x_{m1}\ x_{m2}\ \dots\ x_{mN}], \quad m = 1 \dots M. \]
Let K be the number of clusters.
Initialization. The first pattern x_1 is identified as the first cluster, so K is set to 1.
Iterations. For each pattern x_m, m = 2…M, of the dataset:
→ Ω(x_m) is the array of the MF values of the m-th pattern with respect to the K clusters determined up to that moment:
\[ \Omega(\mathbf{x}_m) = [\mu_1(\mathbf{x}_m)\ \mu_2(\mathbf{x}_m)\ \dots\ \mu_K(\mathbf{x}_m)], \]
→ Ω_max^(m) = max(Ω(x_m)), attained at the h-th cluster, so that:
\[ \Omega_{\max}^{(m)} = \mu_h(\mathbf{x}_m), \qquad h = \arg\max_{r=1\dots K} \mu_r(\mathbf{x}_m). \]
Fuzzy clustering using the convex hull as geometric model
Depending on the dataset under analysis, two parameters θ_min and θ_max are set in the normalized space [0, 1]
⇓
σ: vector of the standard deviations:
\[ \boldsymbol{\sigma} = [\sigma_1\ \sigma_2\ \dots\ \sigma_N]. \]
The values used in the algorithm are:
\[ \theta_{\min} = \min_{j=1\dots N} \sigma_j, \qquad \theta_{\max} = 2 \cdot \left[ \max_{j=1\dots N} \sigma_j \right]. \]
Flowchart of the algorithm:
1. First cluster (x_1); K = 1, m = 2.
2. Presentation of pattern x_m.
3. MF evaluation for each cluster.
4. If Ω_max^(m) ≤ θ_min: a new cluster is created coinciding with x_m (no convex hull); K ← K + 1.
5. Otherwise, if Ω_max^(m) ≥ θ_max: x_m is within the h-th cluster scoring the maximum MF (no convex hull).
6. Otherwise: x_m belongs to the h-th cluster scoring the maximum MF; the h-th cluster is updated and the convex hull is used to define its external vertices.
7. m ← m + 1; back to step 2.
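The two thresholds and the three-way decision of the flowchart can be sketched as follows. This is a minimal sketch under stated assumptions: the population standard deviation (`statistics.pstdev`) is assumed because the slides do not say which estimator is used, the branch order is one reading of the flowchart, and the names `thresholds` and `decide` are hypothetical.

```python
import statistics


def thresholds(patterns):
    """Return (theta_min, theta_max) from the per-feature standard deviations."""
    # Assumption: population standard deviation of each (normalized) feature.
    sigmas = [statistics.pstdev(column) for column in zip(*patterns)]
    return min(sigmas), 2.0 * max(sigmas)


def decide(omega_max, theta_min, theta_max):
    """Map the maximum MF value of a pattern to one of the flowchart branches."""
    if omega_max <= theta_min:
        return "new_cluster"      # pattern seeds a new cluster, K <- K + 1
    if omega_max >= theta_max:
        return "inside"           # pattern lies within cluster h, hull unchanged
    return "update_hull"          # cluster h is updated and its hull recomputed


# Hypothetical normalized dataset with two features.
data = [(0.0, 0.0), (1.0, 0.5)]
theta_min, theta_max = thresholds(data)
```

Only the intermediate case triggers a convex-hull computation, which is the expensive step of the algorithm.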
Fuzzy clustering using the convex hull as geometric model
Results
(a) Mean error rate (%) on the assigned patterns over 100 runs
Algorithm     Hepta (3-D)  Iris (4-D)  UKM (5-D)  Seed (7-D)
CH-CBK        0.00         16.74       51.11      22.07
CH-GBK        0.00         16.53       52.16      21.84
K-means       23.69        18.23       52.62      10.95
FCM           0.20         10.67       57.45      10.00
Min-Max       3.33         24.26       67.11      32.05
Clusterdata   0.00         34.00       75.19      66.67
(b) Best error rate (%) on the assigned patterns over 100 runs
Algorithm     Hepta (3-D)  Iris (4-D)  UKM (5-D)  Seed (7-D)
CH-CBK        0.00         6.67        35.48      8.10
CH-GBK        0.00         4.67        40.20      9.05
K-means       0.00         11.33       44.17      10.95
FCM           0.00         10.67       49.63      10.00
Min-Max       0.00         6.00        45.16      12.38
Clusterdata   0.00         34.00       75.19      66.67
(c) Number of runs (out of 100) in which the best performance is obtained
Algorithm     Hepta (3-D)  Iris (4-D)  UKM (5-D)  Seed (7-D)
CH-CBK        100          86          85         98
CH-GBK        100          86          85         98
Min-Max       81           14          15         3
Automatic motion classification with an inertial sensor (IMU)
Protocol used in the experiments
Extraction of 63 features
9 data sequences:
Displacement (x, y, z) (displacement derived from accelerometer data)
Raw Acceleration (x, y, z) (filtered accelerometer data)
Orientation (x, y, z) (gyroscope integration)
Experiment figures
14 patients (10 M, 4 F)
Age 32-78 years
Number of patterns: 531
Brunnstrom stage I-V
[Block diagram with the blocks: Inertial Measurement Unit, Receiver, Raw data, Data sampling, Preprocessing (Median filter → Axis inversion → Displacement (from Accelerometer) → Angle (from Gyroscope)), Features extraction (63 features), PCA evaluation, Data normalization, Training Set, Convex-Hull for each training class, Fuzzy Membership Function evaluation for each motion, γ parameter setting, Trained fuzzy classifier, Testing Set, Classification result]
Xsens IMU sensor
Software used
Automatic motion classification with an inertial sensor (IMU)
RESULTS
Dataset
14 patients
Brunnstrom stage from I to V
531 patterns
[Figure: error rate (%) versus γ values, six panels with their minimum error: (a) 23.16%; (b) 14.31%; (c) 9.23%; (d) 3.58%; (e) 1.69%; (f) 0.56%]
Error rate.
Algorithm used                           14 patients
Fuzzy Kernel Classifier (FKC)            0.56
Neuro-fuzzy classifier                   1.70
Classification Tree (CART)               4.90
Support Vector Machine (SVM)             1.11
Linear Discriminant Analysis (LDA)       8.35
Quadratic Discriminant Analysis (QDA)    1.32
NaiveBayes                               6.03
Probabilistic Neural Network (PNN)       67.23
All values are expressed in %.
FUZZY PATTERN RECOGNITION BASED ON FUZZY MEMBERSHIP FUNCTIONS WITHOUT GEOMETRIC CONSTRAINTS
Fuzzy clustering based on membership functions without geometric constraints
Main drawback of the convex-hull-based approach
Computational complexity as the number of dimensions grows.
Proposed changes
Removal of the convex-hull computation
Optimization of the algorithm by reducing the number of parameters
“Grid search” analysis for choosing the algorithm parameters, using validity indices and error-rate analysis.
Fuzzy clustering based on membership functions without geometric constraints
UFOC - Unconstrained Fuzzy Online Clustering Algorithm
θ: the only parameter to be set, depending on the dataset under analysis.
Flowchart of the algorithm:
1. First cluster (x_1); K = 1, m = 2.
2. Presentation of pattern x_m.
3. MF evaluation for each cluster.
4. If Ω_max^(m) ≤ θ: a new cluster is created coinciding with x_m; K ← K + 1.
5. Otherwise: x_m belongs to the h-th cluster scoring the maximum MF, and the h-th cluster is updated.
6. m ← m + 1; back to step 2.
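The single-threshold loop can be sketched as below. The slides do not specify how UFOC evaluates or updates the cluster MFs, so the membership function used here (one Gaussian kernel on the running centroid of each cluster) is only a placeholder assumption; the names `centroid_mf` and `ufoc` are also hypothetical.

```python
import math


def centroid_mf(x, centroid, gamma):
    """Placeholder MF (assumption): a single Gaussian kernel on the centroid."""
    delta = math.sqrt(len(x))
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
    return math.exp(-(gamma ** 2) / (2.0 * delta) * sq_dist)


def ufoc(patterns, theta, gamma):
    """Online clustering with one threshold theta; returns a label per pattern."""
    clusters = [[patterns[0]]]            # the first pattern seeds cluster 0 (K = 1)
    labels = [0]
    for x in patterns[1:]:
        centroids = [[sum(c) / len(pts) for c in zip(*pts)] for pts in clusters]
        omega = [centroid_mf(x, c, gamma) for c in centroids]
        h = max(range(len(omega)), key=omega.__getitem__)   # best-scoring cluster
        if omega[h] <= theta:
            clusters.append([x])          # new cluster coinciding with x, K <- K + 1
            labels.append(len(clusters) - 1)
        else:
            clusters[h].append(x)         # update cluster h with the new pattern
            labels.append(h)
    return labels


# Hypothetical normalized patterns forming two well-separated groups.
points = [(0.0, 0.0), (0.05, 0.0), (1.0, 1.0), (0.95, 1.0)]
labels = ufoc(points, theta=0.5, gamma=3.0)
```

Because the patterns are processed one at a time, the result depends on the presentation order, which is the quantity varied on the horizontal axis of the result plots.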
Fuzzy clustering based on membership functions without geometric constraints
“Grid search” analysis:
\[ \Gamma = [\gamma_{\min}, \gamma_{\max}] = \{ \gamma \mid \gamma_{\min} \le \gamma \le \gamma_{\max} \}, \]
\[ \Theta = [\theta_{\min}, \theta_{\max}] = \{ \theta \mid \theta_{\min} \le \theta \le \theta_{\max} \}. \]
The range Γ consists of 71 values of γ, from γ_min = 1 to γ_max = 15 with a step of 0.2;
the range Θ consists of 8 values of θ, from θ_min = 0.1 to θ_max = 0.8 with a step of 0.1.
→ The analysis requires 8 × 71 = 568 runs of the algorithm.
Types of analysis:
Validity indices (↑ higher is better, ↓ lower is better):
- Dunn (D) ↑
- Davies-Bouldin (DB) ↓
- Double Weighted Davies-Bouldin (DW-DB) ↓
Error rate: ε = E/M
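The grid described above can be generated as in the following sketch. The names `gammas`, `thetas`, `grid` and `grid_search`, as well as the `run` and `score` callables, are hypothetical; `score` stands for any criterion to be minimized, such as the error rate or the Davies-Bouldin index.

```python
from itertools import product

# Integer stepping avoids accumulating floating-point drift.
gammas = [round(1.0 + 0.2 * i, 1) for i in range(71)]   # 1.0, 1.2, ..., 15.0
thetas = [round(0.1 * i, 1) for i in range(1, 9)]       # 0.1, 0.2, ..., 0.8
grid = list(product(thetas, gammas))                    # 8 x 71 = 568 pairs


def grid_search(run, score):
    """Return the (theta, gamma) pair whose clustering minimizes `score`.

    run(theta, gamma) -> clustering result   (hypothetical callable)
    score(result)     -> value to minimize   (hypothetical callable)
    """
    return min(grid, key=lambda pair: score(run(*pair)))
```

For an index that should be maximized, such as Dunn, the same routine can be used by passing its negative as `score`.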
Fuzzy clustering based on membership functions without geometric constraints
Error rate results.
[Figures, one per dataset: Hepta (3-D), WDBC (30-D), Iris (4-D), Wine (13-D), NewThyroid (5-D). Each has two panels plotted against the presentation order (1 to 100): (a) error difference (%); (b) error rate (%)]
Summary table: [table image not included in the transcript]
Automatic classification of the Brunnstrom stage using sEMG
Data recording
Extraction of 10 features
Experiment figures
9 patients (3 M, 6 F)
Age 67.2 ± 29.2 years
Number of patterns: 96
Brunnstrom stage II-IV
[Block diagram with the blocks: sEMG System, Receiver, Raw data, Data sampling, Preprocessing, Features extraction (10 features), Data normalization, Training Set, Fuzzy Membership Function evaluation for each motion, Validation Set, γ* parameter setting, Trained fuzzy classifier, Testing Set, Classification result]
Automatic classification of the Brunnstrom stage using sEMG
10-fold and LOO performance
Algorithm used              Error rate (10-fold)   Error rate (LOO)
Fuzzy Kernel Classifier     7.53                   8.60
FIS Classifier (Sugeno)     9.68                   11.83
FIS Classifier (Mamdani)    9.68                   11.83
Neuro-Fuzzy Classifier      11.83                  17.20
LDA                         30.11                  31.18
QDA                         17.20                  19.35
NaiveBayes                  24.73                  23.66
SVM                         24.73                  22.58
CART                        16.13                  15.05
PNN                         45.16                  46.24
All values are expressed in %.
Accuracy (rows: actual value; columns: estimated outcome)
Actual \ Estimated   Stage II*   Stage III*   Stage IV*   Healthy*   Accuracy
Stage II             17          0            1           0          94.44 %
Stage III            0           30           2           0          93.75 %
Stage IV             3           0            11          0          78.57 %
Healthy              0           0            1           28         96.55 %
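The per-class accuracies in the last column can be recomputed from the counts, as in the short sketch below. The variable names are illustrative; the counts are copied from the table above.

```python
labels = ["Stage II", "Stage III", "Stage IV", "Healthy"]
confusion = [            # rows: actual value, columns: estimated outcome
    [17, 0, 1, 0],
    [0, 30, 2, 0],
    [3, 0, 11, 0],
    [0, 0, 1, 28],
]

# Per-class accuracy: diagonal count divided by the row total.
per_class = {
    name: 100.0 * row[i] / sum(row)
    for i, (name, row) in enumerate(zip(labels, confusion))
}

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(confusion)))
error_rate = 100.0 * (total - correct) / total
```

The matrix sums to 93 patterns with 7 misclassifications, so the overall error rate is about 7.53%. This appears to coincide with the 10-fold figure reported for the Fuzzy Kernel Classifier.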
CLUSTERING TECHNIQUES AND NEURO-FUZZY NETWORKS FOR SOLVING REGRESSION PROBLEMS
Clustering procedures for solving regression problems
Starting point:
FIS (Fuzzy Inference System) networks are commonly used to solve regression problems.
Clustering techniques make it possible to determine the decision rules directly from the clusters of the available training set, so that each rule corresponds to a structured set of points.
⇓
The clustering strategy is applied to the joint space Z = X × Y, with z_i = (x_i, y_i)
The analysis produces C clusters, Γ_z^(k), with k = 1, …, C
Very simple and straightforward procedure
PROBLEM:
⇓
Patterns belonging to the same cluster might not reflect the real structure of the data
⇓
SOLUTION:
Design and implementation of an iterative procedure that uses the clustering results to estimate the real structure of the output data
Clustering in the input-output space
Let Γ = {Γ_1, Γ_2, …, Γ_C} be a set of C clusters (each associated with an output rule) and let each pattern of the training set be assigned to one of these clusters. The clustering procedure with C prototypes is based on the following steps:
Step 1. The coefficients ω^(k), k = 1…C, of each rule are computed by solving a set of nonlinear equations; the generic equation is:
\[ y_t = h\left(\mathbf{x}_t; \boldsymbol{\omega}^{(k)}\right) \]
Step 2. Update of the pattern assignment. Each pair (x_t, y_t), t = 1…N_T, of the training set is assigned to the cluster Γ_q, with q such that:
\[ d_t = \left| y_t - h\left(\mathbf{x}_t; \boldsymbol{\omega}^{(q)}\right) \right| = \min_{k=1\dots C} \left| y_t - h\left(\mathbf{x}_t; \boldsymbol{\omega}^{(k)}\right) \right|. \]
Step 3. For each cluster Γ_k, the local approximation error is computed:
\[ D_k = \frac{1}{N_k} \sum_{t} d_t, \]
where the sum runs over the N_k patterns assigned to Γ_k.
Step 4. Convergence:
\[ \Theta = \frac{\left| D - D^{(\mathrm{old})} \right|}{D^{(\mathrm{old})}}, \qquad D = \frac{1}{N_T} \sum_{t=1}^{N_T} d_t. \]
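The four steps can be sketched as follows for a scalar input. The slides leave the form of h open, so this sketch assumes a straight line y = a·x + b fitted by least squares as the rule of each cluster; the names `fit_line` and `cluster_regression`, the tolerance and the iteration cap are likewise assumptions.

```python
def fit_line(points):
    """Least-squares fit of y = a*x + b (assumed form of h); returns (a, b)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    den = n * sxx - sx * sx
    if den == 0:                       # all x equal: fall back to a constant model
        return 0.0, sy / n
    a = (n * sxy - sx * sy) / den
    return a, (sy - a * sx) / n


def cluster_regression(data, labels, n_clusters, tol=1e-6, max_iter=100):
    """Alternate rule fitting and pattern reassignment until D stabilizes."""
    labels = list(labels)
    models = [(0.0, 0.0)] * n_clusters
    local = {}
    d_old = None
    for _ in range(max_iter):
        # Step 1: refit the rule of each cluster on its current members.
        for k in range(n_clusters):
            members = [p for p, lab in zip(data, labels) if lab == k]
            if members:                # an empty cluster keeps its previous rule
                models[k] = fit_line(members)
        # Step 2: assign each pair to the rule with the smallest error d_t.
        errors = [[abs(y - (a * x + b)) for a, b in models] for x, y in data]
        labels = [min(range(n_clusters), key=row.__getitem__) for row in errors]
        d = [row[lab] for row, lab in zip(errors, labels)]
        # Step 3: local approximation error D_k of each non-empty cluster.
        local = {k: sum(dt for dt, lab in zip(d, labels) if lab == k) / labels.count(k)
                 for k in set(labels)}
        # Step 4: stop when the relative change of the global error D is small.
        d_new = sum(d) / len(d)
        if d_old is not None and (d_old == 0 or abs(d_new - d_old) / d_old < tol):
            break
        d_old = d_new
    return labels, models, local
```

Starting from an initial assignment (for example, one produced by clustering in the joint space), patterns migrate toward the rule that explains their output best, so clusters end up grouped by output structure instead of mere proximity.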
Example of the final result
(Toy problem) Final result after 1 step
(Toy problem) Final result after 13 steps
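Proposed methodology for time series prediction
Return series → y_t = ln(S_t) − ln(S_{t−1})
Standard additive model for time series → y_t = μ_t + ε_t
ARMA-GARCH regression model
\[ \mu_t = h_\mu\left( \mathbf{x}_t^{(\mu)}; \boldsymbol{\omega}_t^{(\mu)} \right), \]
\[ \mathbf{x}_t^{(\mu)} = [\, y_{t-1}\ y_{t-2}\ \dots\ y_{t-R}\ \ \varepsilon_{t-1}\ \varepsilon_{t-2}\ \dots\ \varepsilon_{t-M} \,], \]
\[ \sigma_t^2 = h_\sigma\left( \mathbf{x}_t^{(\sigma)}; \boldsymbol{\omega}_t^{(\sigma)} \right), \]
\[ \mathbf{x}_t^{(\sigma)} = [\, \sigma_{t-1}^2\ \sigma_{t-2}^2\ \dots\ \sigma_{t-P}^2\ \ \varepsilon_{t-1}^2\ \varepsilon_{t-2}^2\ \dots\ \varepsilon_{t-Q}^2 \,], \]
The orders R, M, P, and Q are analogous to those of ARMA and GARCH models;
ω_t^(μ) and ω_t^(σ) are the parameter vectors of the regression functions h_μ and h_σ.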
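The return series and the two input vectors can be built as in the sketch below. The function names and the 0-based index convention are assumptions; the regression functions h_μ and h_σ themselves are not modelled here.

```python
import math


def log_returns(prices):
    """y_t = ln(S_t) - ln(S_{t-1}) for consecutive prices."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]


def mean_regressor(y, eps, t, R, M):
    """x_t^(mu): the last R returns followed by the last M innovations."""
    if t < max(R, M):
        raise ValueError("not enough history before index t")
    return [y[t - i] for i in range(1, R + 1)] + [eps[t - i] for i in range(1, M + 1)]


def variance_regressor(sigma2, eps, t, P, Q):
    """x_t^(sigma): the last P variances followed by the last Q squared innovations."""
    if t < max(P, Q):
        raise ValueError("not enough history before index t")
    return ([sigma2[t - i] for i in range(1, P + 1)]
            + [eps[t - i] ** 2 for i in range(1, Q + 1)])
```

Any regression model, from a linear ARMA-GARCH fit to a neuro-fuzzy network, can then be trained on these vectors, which is what makes the comparison between models use the same inputs.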
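Accuracy measures:
Mean Squared Error - MSE
\[ \mathrm{MSE} = \frac{1}{N_S} \sum_{t} (y_t - \mu_t)^2 \]
Normalised Mean Square Error - NMSE
\[ \mathrm{NMSE} = \frac{\sum_{t} (y_t - \mu_t)^2}{\sum_{t} (y_t - \bar{y})^2} \]
Noise-to-signal ratio - NSR (dB)
\[ \mathrm{NSR}_{\mathrm{dB}} = 10 \log_{10} \frac{\sum_{t} (y_t - \mu_t)^2}{\sum_{t} y_t^2} \]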
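The three measures can be computed as in the following sketch. The function names and the sample sequences are illustrative assumptions; `y` holds the actual returns and `mu` the predicted conditional means over the N_S test samples.

```python
import math


def mse(y, mu):
    """Mean squared error over the predicted samples."""
    return sum((a - b) ** 2 for a, b in zip(y, mu)) / len(y)


def nmse(y, mu):
    """Squared error normalized by the spread of y around its mean."""
    y_bar = sum(y) / len(y)
    num = sum((a - b) ** 2 for a, b in zip(y, mu))
    return num / sum((a - y_bar) ** 2 for a in y)


def nsr_db(y, mu):
    """Noise-to-signal ratio in decibels."""
    num = sum((a - b) ** 2 for a, b in zip(y, mu))
    return 10.0 * math.log10(num / sum(a ** 2 for a in y))


actual = [1.0, 2.0, 3.0]       # hypothetical values
predicted = [1.0, 2.0, 2.0]
```

An NMSE close to 1 means the predictor does about as well as always predicting the sample mean, which helps when reading the NMSE columns of the result tables.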
Results: Energy Commodities
DJUSCL (coal)
TABLE I. NUMERICAL RESULTS FOR DJUSCL RETURN SERIES
(Errors: MSE, NMSE, NSR. Unconditional moments: Mean, Variance, Skewness, Kurtosis.)
Model     MSE    NMSE   NSR    Mean   Variance   Skewness   Kurtosis
DJUSCL    -      -      -      0.31   0.74       0.41       5.60
GARCH     0.80   1.08   0.33   0.34   0.33       0.07       9.36
RBF       0.76   1.03   0.13   0.39   0.88       0.18       5.28
MoG       0.78   1.04   0.18   0.77   1.01       0.32       5.20
ANFIS     0.77   1.04   0.17   0.27   0.91       0.41       5.34
HONFIS    0.76   1.04   0.17   0.29   0.81       0.35       5.52
MSE, mean, and variance are scaled by 10^3.
European data set not showing significant differences in the forecasting ability. The studied commodities and the related indexes are: coal (DJUSCL, in $/ton); Henry Hub natural gas (HH, in $/MMBtu); crude oil (WTI, in $/barrel), and electricity (PJM, in $/MWh). For electricity prices we chose the peak load contract referred to h. 12:00. For each commodity price log-returns are estimated using (1). We chose a well representative time window across the ‘very critical’ year 2008, i.e. from the beginning of 2006 to the end of 2009. So, taking into account that we have approximately 250 prices and 249 returns per year in the case of coal, natural gas and crude oil series, each return series consists of about 1000 samples. In the case of electricity prices we have a series of 365 data, given that electricity is traded every day of the year; in this application, for comparison purposes, we adjust the series to 250 trading days only. Each model is trained on the previous N_T = 500 samples (almost two years) and N_S = 500 samples are predicted, i.e. the last two years starting from t = 501 to the last sample of 2009.
The prediction errors of the conditional mean are evaluated; in addition, the four unconditional moments are estimated for both the predicted sequences and the related original series. Prior investigations can be made in order to find the best combination of the model orders and the size of the training set as well; a fine tuning for the optimal estimate of every model can be addressed in future research works. In order to obtain an accurate comparison of the performances obtained by the proposed neural networks with respect to standard models, we carried out a preliminary optimization of the main model parameters, i.e. R, M, P, and Q, so as to obtain the best performance of the reference ‘GARCH’ model for a given time series. Then, every model will use the same parameters when applied to the same time series.
The optimal parameters of the coal DJUSCL returns are R = 1, M = 1, P = 5, Q = 1; hence, a GARCH reference model ARMA(1, 5)-GARCH(1, 1) is fitted. The numerical results are summarized in Table I: all the neural models score a prediction error better than GARCH; RBF obtains the best NSR but the skewness is not properly matched as in the case of ANFIS and HONFIS that have a comparable performance. HONFIS achieves a good NSR performance of 0.17 dB and it is able to follow the dynamics of both conditional mean and increasing volatility, as proved by the behavior of the estimated conditional variance shown in Fig. 1.
[Figure, three panels over samples 500 to 1000: conditional mean, forecast (black) versus actual time series (gray); forecast error (innovation); conditional variance (volatility)]
Fig. 1. Prediction of coal returns using the HONFIS neural network.
MSE, mean, and variance are scaled by 10^-3
HH (natural gas)
TABLE II
NUMERICAL RESULTS FOR HH RETURN SERIES

          Errors                 Unconditional Moments
Model     MSE    NMSE   NSR      Mean   Variance   Skewness   Kurtosis
HH                               0.46   1.72       1.46       10.62
GARCH     1.97   1.14   0.57     1.65   0.12       2.49       67.15
RBF       1.87   1.09   0.37     0.44   1.92       0.27       4.54
MoG       1.80   1.05   0.21     0.34   1.90       1.13       9.94
ANFIS     2.05   1.19   0.75     0.47   1.87       0.71       7.35
HONFIS    1.67   1.03   0.13     0.45   1.77       0.99       8.46

MSE, mean, and variance are scaled by 10^3.
The numerical results for HH returns of natural gas are reported in Table II. The optimal parameters are in this case R = 2, M = 2, P = 2, Q = 1, so a GARCH reference model ARMA(2, 2)-GARCH(2, 1) is fitted. The HONFIS neural network has the best NSR performance of 0.13 dB and the related moments adequately fit those of the original time series. A sufficient accuracy is also obtained by the RBF and ANFIS neural networks. The GARCH model is not suitable for the prediction of HH returns, since the moments are estimated very poorly, especially the kurtosis. The numerical results of the HONFIS neural network are qualitatively confirmed by the accurate predictions reported in the plots of Fig. 2, especially in the case of volatility.
The large volatility of crude oil WTI returns at the end of 2008 is the feature that requires an accurate forecasting technique. A more complex model is therefore necessary, using R = 4, M = 2, P = 2, Q = 3. The GARCH reference model ARMA(4, 2)-GARCH(2, 3) is clearly outperformed by the neural networks, as evidenced by the results summarized in Table III. The best NSR is once again obtained by HONFIS, although the predicted sequence does not fit the skewness of the original one; the ANFIS performance suffers from the same drawback. The MoG network is able to fit the original moments, while also maintaining a good prediction accuracy and following the changes of volatility in the underlying process.
Finally, the numerical results for the returns of the PJM electricity index are reported in Table IV. The model parameters are R = 3, M = 2, P = 1, Q = 1; the GARCH reference model is ARMA(3, 2)-GARCH(1, 1). The MoG neural network performs better than the other models in this case. Nevertheless, the proposed HONFIS model performs better than the original ANFIS network. Overall, neural networks improve the NSR performance by more than 2 dB with respect to GARCH, despite a bias that shifts the estimate of the mean to negative values.
Fig. 2. Prediction of natural gas returns using the HONFIS neural network.

WTI (crude oil)
TABLE III
NUMERICAL RESULTS FOR WTI RETURN SERIES

          Errors                 Unconditional Moments
Model     MSE    NMSE   NSR      Mean   Variance   Skewness   Kurtosis
WTI                              0.10   0.11       0.34       6.51
GARCH     0.12   1.00   0.01     0.43   0.01       0.48       9.38
RBF       0.11   0.94   0.27     0.01   0.12       0.16       4.06
MoG       0.11   0.92   0.36     0.10   0.10       0.44       5.73
ANFIS     0.09   0.89   0.50     0.05   0.12       0.25       7.22
HONFIS    0.08   0.85   0.71     0.08   0.13       0.01       6.98

MSE, mean, and variance are scaled by 10^3.

TABLE IV
NUMERICAL RESULTS FOR PJM RETURN SERIES

          Errors                 Unconditional Moments
Model     MSE    NMSE   NSR      Mean   Variance   Skewness   Kurtosis
PJM                              0.22   1.21       0.38       5.52
GARCH     1.19   0.98   0.09     1.53   0.24       0.53       8.10
RBF       0.71   0.58   2.36     0.11   1.64       0.01       2.96
MoG       0.57   0.47   3.28     0.41   1.77       0.28       4.61
ANFIS     0.64   0.53   2.76     0.21   1.24       0.26       5.58
HONFIS    0.60   0.51   2.92     0.17   1.21       0.29       5.46

MSE, mean, and variance are scaled by 10^3.
V. CONCLUSION
A new neural network approach is proposed for modeling time series associated with energy commodity prices, which is based on fuzzy neural networks using higher-order Sugeno-type consequent rules. The use of a constructive procedure that automatically determines the optimal number of fuzzy rules is also illustrated in order to avoid overfitting and maximize the generalization capability of the neural network.
The proposed approach provides an accurate description of energy price dynamics, allowing us to estimate daily prices for energy commodities over a long time horizon. The validation performed on historical data shows that the neural network approach generates prices that are able to replicate the daily data and to reproduce the same probability distribution for the various series. The proposed HONFIS model, using quadratic consequent rules, outperforms the original ANFIS network in almost all cases, which justifies the increased complexity of the related model.
Currently, we are investigating more advanced techniques for the application of the proposed approach to multivariate time series analysis and for the automatic and more reliable selection of the samples to be used for prediction, including the order of regression models and the resulting complexity of neural models.
PJM (electricity)
[Figure: three panels over samples 500-1000: conditional mean, forecast (black) vs. actual time series (gray); forecast error (innovation); conditional variance (volatility).]
Conclusions
Research motivation: Fuzzy Pattern Recognition for modeling complex systems in real-world contexts.
Main results:
A new methodology for computing the pattern-to-cluster distance was implemented and tested.
It was shown that new membership functions free of geometrical constraints, combined with new clustering and classification methodologies, make it possible to model complex systems with better performance than the best-known Pattern Recognition algorithms, offering a solution to the problem of overlapping clusters/classes.
It was shown that the developed techniques achieve excellent results, in terms of maximum classification rate, in the biomedical task of recognizing movements performed by stroke patients, by processing data from IMU or sEMG sensors.
A new methodology based on clustering in the joint input-output space and on neuro-fuzzy networks makes it possible to solve regression problems and can be applied to forecast the prices/returns of energy commodity time series.
List of Publications
1 Festa, A., Panella, M., Lo Sterzo, R., Liparulo, L., “Radiofrequency identification systems for healthcare: A case study on electromagnetic exposures”, (2013) Journal of Clinical Engineering, 38 (3), pp. 125-133.
2 Maisto, M., Panella, M., Liparulo, L., Proietti, A., “An accurate algorithm for the identification of fingertips using an RGB-D camera”, (2013) IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 3 (2), pp. 272-283.
3 Liparulo, L., Proietti, A., Panella, M., “Fuzzy membership functions based on point-to-polygon distance evaluation”, (2013) IEEE International Conference on Fuzzy Systems, 7-10 July 2013, Hyderabad (India).
4 Panella, M., Liparulo, L., Barcellona, F., D’Ecclesia, R.L., “A study on crude oil prices modeled by neurofuzzy networks”, (2013) IEEE International Conference on Fuzzy Systems, 7-10 July 2013, Hyderabad (India).
5 Liparulo, L., Proietti, A., Panella, M., “Fuzzy Clustering Using the Convex Hull as Geometrical Model”, (2014) Advances in Fuzzy Systems (in press).
6 Liparulo, L., Proietti, A., Panella, M., “Improved Online Fuzzy Clustering Based on Unconstrained Kernels”, (2015) IEEE International Conference on Fuzzy Systems, 2-5 August 2015, Istanbul (Turkey) (in press).
7 Zhang, Z., Liparulo, L., Panella, M., Gu, X., Fang, Q., “A Fuzzy Kernel Motion Classifier for Autonomous Stroke Rehabilitation”, IEEE Journal of Biomedical and Health Informatics (second round of review).
8 Liparulo, L., Zhang, Z., Panella, M., Gu, X., Fang, Q., “A Novel Fuzzy Approach for Automatic Brunnstrom Stage Classification using Surface Electromyography”, Transactions on Neural Systems & Rehabilitation Engineering (second round of review).
9 Panella, M., Liparulo, L., Proietti, A., “A higher-order fuzzy neural network for modeling financial time series”, 2014 International Joint Conference on Neural Networks (IJCNN), pp. 3066-3073, 6-11 July 2014, Beijing (China).
10 Panella, M., Liparulo, L., Proietti, A., “A Higher-Order Fuzzy Neural Network Applied to Financial Time Series Analysis”, Quantitative Finance (invited paper, extension of [9], submitted).
11 Altilio, R., Liparulo, L., Panella, M., Paoloni, M., Proietti, A., “Multimedia and Gaming Technologies for Telerehabilitation of Motor Disabilities”, IEEE Internet Computing (submitted).
Thank you for your attention.
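Clustering (and classification) algorithms
Min-Max
$$\mu_h(\mathbf{x}) = \frac{1}{n}\sum_{j=1}^{n}\left[1 - f(x_j - w_{hj}) - f(v_{hj} - x_j)\right],$$
where $f : \mathbb{R} \to [0, 1]$ is defined as:
$$f(\alpha) = \begin{cases} 1, & \text{if } \gamma\alpha > 1 \\ \gamma\alpha, & \text{if } 0 \le \gamma\alpha \le 1 \\ 0, & \text{if } \gamma\alpha < 0 \end{cases}$$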
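The Min-Max membership function above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that v and w are the minimum and maximum corners of the h-th hyperbox (the slide does not define them), and the names `ramp` and `minmax_membership` are hypothetical.

```python
def ramp(alpha, gamma):
    """Two-sided clipped ramp f(alpha): gamma*alpha limited to [0, 1]."""
    g = gamma * alpha
    if g > 1.0:
        return 1.0
    if g < 0.0:
        return 0.0
    return g


def minmax_membership(x, v, w, gamma):
    """Membership of pattern x in a hyperbox with corners v and w.

    Assumption: v holds the lower bounds and w the upper bounds, per dimension.
    """
    n = len(x)
    total = 0.0
    for xj, vj, wj in zip(x, v, w):
        # Penalize the amount by which x exceeds w or falls below v
        total += 1.0 - ramp(xj - wj, gamma) - ramp(vj - xj, gamma)
    return total / n


if __name__ == "__main__":
    print(minmax_membership([0.7, 0.4], [0.2, 0.2], [0.6, 0.6], gamma=4.0))
```

A pattern inside the hyperbox gets membership 1, and the membership decreases linearly, at a rate set by γ, as the pattern moves away from the box.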
Fuzzy C-Means
$$\mu_h(\mathbf{x}) = \frac{1}{\sum_{r=1}^{c}\left[\dfrac{d^2(\mathbf{x}, C_h)}{d^2(\mathbf{x}, C_r)}\right]^{\frac{2}{m-1}}},$$
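As a hedged illustration of the Fuzzy C-Means membership, the sketch below takes the pattern-to-cluster distance terms as input, so it does not commit to how the distance term is defined on the slide. The function name `fcm_memberships` and the handling of a zero distance are assumptions.

```python
def fcm_memberships(distances, m=2.0):
    """Fuzzy C-Means memberships of one pattern from its distances to c clusters.

    distances: the distance term of the pattern for each cluster.
    m: fuzzifier, must be greater than 1.
    """
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    # Assumption: a pattern lying exactly on one or more clusters shares
    # full membership among them, which avoids a division by zero.
    zeros = [i for i, d in enumerate(distances) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(distances))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dh / dr) ** exponent for dr in distances)
            for dh in distances]


if __name__ == "__main__":
    print(fcm_memberships([1.0, 3.0], m=2.0))
```

By construction the memberships of a pattern sum to one, which is what allows clusters to overlap.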
K-Means
$$J = \sum_{j=1}^{k}\sum_{i=1}^{n}\left\| x_i^{(j)} - c_j \right\|^2$$
Techniques for generalizing membership functions
Hyperbox rotation
⇓
Generalized membership functions
(a) Generalized Simpson (b) Trapezoid (c) Ellipsoid (d) Paraboloid
⇓
New methodology: use of membership functions without geometrical constraints.
Geometric method
Parameters
Geometrical properties
N = number of vertices
s = [s1 s2 . . . sN]
t = [t1 t2 . . . tN]
[Figure: polygon with vertices (s1, t1), (s2, t2), (s3, t3) and a point P shown in different positions (case 1, case 2).]
Result: a vector of N elements containing the distances of the point p from the N sides of the polygon.
Overall distance
$$d^{(geo)}_{s,t}(p) = \min_{i=1\ldots N}\left[d^{(geo)}_{1}(p), \ldots, d^{(geo)}_{N}(p)\right]$$
Geometric MF
$$\mu^{(geo)}_{s,t}(p) = \max\left[0,\; 1 - \gamma \cdot d^{(geo)}_{s,t}(p)\right]$$
Triangle = [(0.3,0.3);(0.7,0.4);(0.4,0.7)]
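The geometric method above can be sketched in Python as a point-to-segment distance evaluated on every side of the polygon. This is an illustration under assumptions: the clamped projection is one common way to handle the case where the closest point is a vertex rather than an interior point of a side, the sketch measures the distance to the boundary only (the slide does not say how interior points are treated), and all function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (2-D tuples)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection of p on the line through a and b, clamped to the segment:
    # inside (0, 1) the foot lies on the side, otherwise a vertex is closest.
    u = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    u = max(0.0, min(1.0, u))
    return math.hypot(px - (ax + u * dx), py - (ay + u * dy))


def geometric_distance(p, s, t):
    """Minimum of the N distances from p to the N sides of the polygon."""
    vertices = list(zip(s, t))
    n = len(vertices)
    return min(point_segment_distance(p, vertices[i], vertices[(i + 1) % n])
               for i in range(n))


def geometric_mf(p, s, t, gamma=1.0):
    """Membership value max(0, 1 - gamma * distance)."""
    return max(0.0, 1.0 - gamma * geometric_distance(p, s, t))


if __name__ == "__main__":
    # Triangle taken from the slide
    s = [0.3, 0.7, 0.4]
    t = [0.3, 0.4, 0.7]
    print(geometric_mf((0.9, 0.9), s, t, gamma=1.0))
```

The loop over all N sides is consistent with the execution time of the geometric method growing with the number of vertices.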
Efficiency
Number of runs on the same machine for the same polygon: 1000.
HW architecture: x64 platform with an Intel® Core i7-2600K CPU @ 3.4 GHz

Polygon type   Geometric   Gaussian   Conic
Triangle       7.7         0.076      0.061
Box            20.4        0.082      0.063
Hexagon        32.1        0.093      0.068

All values are in ms.
Results:
The geometric method is the slowest.
The Gaussian and conic approaches have comparable speeds.
Execution time grows with the number of vertices, but at a different rate for each method; the geometric method grows much more sharply than the other two.
In terms of efficiency, the cone-shaped method is the best.
Accuracy
For all three polygons, a set of M = 1000 points $p_r$, r = 1 . . . M, was chosen at random within the normalized space [0, 1].
For both methods, accuracy is expressed as the standard deviation with respect to the membership value obtained with the reference (geometric) method:
$$\zeta^{(gauss)} = \sqrt{\frac{\sum_{r=1}^{M}\left[\mu^{(gauss)}_{s,t}(p_r) - \mu^{(geo)}_{s,t}(p_r)\right]^2}{M}},$$
$$\zeta^{(cone)} = \sqrt{\frac{\sum_{r=1}^{M}\left[\mu^{(cone)}_{s,t}(p_r) - \mu^{(geo)}_{s,t}(p_r)\right]^2}{M}}.$$
Numerical results:

Polygon shape   ζ(gauss)   ζ(cone)
Triangle        5.5        6.0
Box             2.1        3.3
Hexagon         1.6        2.5

All values are multiplied by 10^-3.
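The accuracy measure ζ is a root-mean-square deviation between two lists of membership values evaluated at the same M points. A minimal Python sketch follows; the function name `membership_deviation` is hypothetical and the input lists stand in for the outputs of the approximate and reference methods.

```python
import math


def membership_deviation(mu_approx, mu_ref):
    """Root-mean-square deviation between approximate and reference memberships."""
    if len(mu_approx) != len(mu_ref) or not mu_ref:
        raise ValueError("inputs must be non-empty and of equal length")
    m = len(mu_ref)
    squared = sum((a - g) ** 2 for a, g in zip(mu_approx, mu_ref))
    return math.sqrt(squared / m)


if __name__ == "__main__":
    # Illustrative values only, not taken from the experiments
    print(membership_deviation([0.52, 0.10, 0.98], [0.50, 0.11, 1.00]))
```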
Numerical simulations on the membership functions

MF                γ   MF threshold   Avg. time (s)
Simpson           2   0.70           0.12
Gaussian method   1   0.65           0.03
Cone method       1   0.65           0.03
Fuzzy clustering using the convex hull as geometrical model
1 $\Omega^{(m)}_{\max} \le \theta_{\min}$.
- The pattern cannot be assigned to any existing cluster.
- A new cluster is created, coinciding with the current pattern $x_m$.
- Convex hull: NO.
- K ← K + 1.
2 $\Omega^{(m)}_{\max} < \theta_{\max}$.
- The pattern $x_m$ is assigned to the h-th cluster, the one with the maximum MF value.
- Convex hull: YES (h-th cluster).
- K is not updated.
3 $\Omega^{(m)}_{\max} \ge \theta_{\max}$.
- The pattern $x_m$ is assigned to the h-th cluster, the one with the maximum MF value.
- Convex hull: NO. The high MF degree is assumed to correspond to the pattern lying very close to the boundary of the h-th cluster. This choice allows:
  - a huge improvement in the computational cost of running the algorithm;
  - error prevention, since it keeps the cluster from becoming too large, which could cause uncertainties in the overall computation of the final clustering.
- K is not updated.
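The three cases above amount to a decision rule on the maximum membership value of the incoming pattern. The Python sketch below illustrates only that rule; the function name, the return convention and the treatment of an empty cluster list are assumptions, and the convex-hull computation itself is left out.

```python
def assign_pattern(memberships, theta_min, theta_max):
    """Decide what to do with a pattern given its MF value for each cluster.

    Returns (cluster_index, create_new_cluster, use_convex_hull).
    Assumption: with no existing cluster, a new one is always created.
    """
    if not memberships:
        return None, True, False
    # Index h of the cluster with the maximum membership value
    h = max(range(len(memberships)), key=memberships.__getitem__)
    omega_max = memberships[h]
    if omega_max <= theta_min:
        # Case 1: no existing cluster fits, so K grows by one
        return None, True, False
    if omega_max < theta_max:
        # Case 2: assign to cluster h, convex hull: YES
        return h, False, True
    # Case 3: assign to cluster h, convex hull: NO
    return h, False, False


if __name__ == "__main__":
    print(assign_pattern([0.1, 0.5], theta_min=0.3, theta_max=0.8))
```

Skipping the convex hull in case 3 is what the slide credits for the reduced computational cost.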
Fuzzy clustering based on membership functions without geometrical constraints
Results: validity indices.
[Figure: one group of plots per dataset (Hepta, 3D; WDBC, 30D; Iris, 4D; Wine, 13D; NewThyroid, 5D). In each group, panels (a.1)-(a.3) show the error difference (%) and panels (b.1)-(b.3) show the error rate (%), both against the presentation order (1-100).]
Fuzzy clustering based on membership functions without geometrical constraints
Sensitivity analysis.
[Figure: histograms of the number of occurrences against the number of clusters (0-25), one per dataset, labeled: Hepta (3D) - 7; Iris (4D) - 3; Seeds (7D) - 3; Wine (13D) - 3; NewThyroid (5D) - 3; WDBC (30D) - 2.]
Fuzzy clustering based on membership functions without geometrical constraints
Results.
UFOC vs Min-Max: number of runs, out of 10 trials, in which the best performance is obtained

Algorithm   Hepta   Iris    NewThyroid   Seed    WDBC     Wine
            (3-D)   (4-D)   (5-D)        (7-D)   (30-D)   (13-D)
UFOC        10      8       7            10      10       8
MIN-MAX     10      3       3            1       1        3

UFOC vs Min-Max: mean and best error rate (%) after 10 runs of the algorithm

Algorithm   Hepta   Iris    NewThyroid   Seed    WDBC     Wine
            (3-D)   (4-D)   (5-D)        (7-D)   (30-D)   (13-D)
(Mean)
UFOC        0.00    9.27    12.88        8.71    11.70    10.11
MIN-MAX     0.00    9.33    14.80        28.71   14.71    22.98
(Best)
UFOC        0.00    8.67    2.32         7.61    6.50     3.37
MIN-MAX     0.00    6.00    10.23        12.38   6.50     3.93
Fuzzy clustering based on membership functions without geometrical constraints
Results.
Mean error rate (%) after 10 runs of the algorithm

Algorithm     Hepta   Iris    NewThyroid   Seed    WDBC     Wine
              (3-D)   (4-D)   (5-D)        (7-D)   (30-D)   (13-D)
UFOC          0.00    9.27    12.88        8.71    11.70    10.11
FCM           0.00    10.67   9.30         10.00   7.21     5.05
GK            1.41    10.00   13.02        10.47   17.22    38.26
K-means       0.00    11.33   11.16        10.95   7.20     3.49
Clusterdata   0.00    34.00   27.91        66.67   63.09    66.30

Best error rate (%) after 10 runs of the algorithm

Algorithm     Hepta   Iris    NewThyroid   Seed    WDBC     Wine
              (3-D)   (4-D)   (5-D)        (7-D)   (30-D)   (13-D)
UFOC          0.00    8.67    2.32         7.61    6.50     3.37
FCM           0.00    10.67   9.30         10.00   7.21     5.05
GK            0.00    10.00   13.02        10.47   17.22    34.27
K-means       0.00    11.33   11.16        10.95   7.20     3.37
Clusterdata   0.00    34.00   27.91        66.67   63.09    66.30
Starting point: RMIT
IMU features
Automatic movement classification using an inertial sensor (IMU)
RESULTS
Dataset 6
6 patients
Brunnstrom stage from III to V
6 x 10 x 6 = 360 patterns
[Figure: error rate (%) against γ values (5-20), panels (a)-(f). Minimum error per panel: (a) 24.44%, (b) 9.44%, (c) 6.67%, (d) 2.5%, (e) 1.39%, (f) 0%.]
Dataset 14
14 patients
Brunnstrom stage from I to V
531 patterns
[Figure: error rate (%) against γ values (5-20), panels (a)-(f). Minimum error per panel: (a) 23.16%, (b) 14.31%, (c) 9.23%, (d) 3.58%, (e) 1.69%, (f) 0.56%.]
Automatic movement classification using an inertial sensor (IMU)
Confusion matrix.

                 Estimated output
Actual value     M1*   M2*   M3*   M4*   M5*   M6*
M1               141   0     0     0     0     0
M2               0     80    0     0     0     0
M3               0     0     90    0     0     0
M4               1     0     1     58    0     0
M5               0     0     1     0     79    0
M6               0     0     0     0     0     80

Error rate.

Algorithm used                          6 Patients   14 Patients
Fuzzy Kernel Classifier (FKC)           0.00         0.56
Neuro-fuzzy classifier                  1.67         1.70
Classification Tree (CART)              5.28         4.90
Support Vector Machine (SVM)            1.32         1.11
Linear Discriminant Analysis (LDA)      1.67         8.35
Quadratic Discriminant Analysis (QDA)   0.56         1.32
NaiveBayes                              3.06         6.03
Probabilistic Neural Network (PNN)      11.11        67.23

All values are expressed in %.
sEMG features (1/3)
The surface EMG signals, sampled at 3000 Hz, are first fed through a 10th-order digital elliptic bandpass filter with a pass-band from 20 to 500 Hz and 30 dB attenuation on the stop-bands for noise reduction. The filtered samples are then rectified for activation detection using the Root-Mean-Square (RMS) method with a sliding Hamming window, as presented below. Let x be the filtered EMG input signal and s the rectified output signal; the rectification process can be written as:
$$s(n) = \sqrt{\frac{1}{L}\sum_{k=n}^{n+L-1}\left[x(k)\,w(k)\right]^2}, \quad 1 \le n \le N,$$
where N is the number of windowed segments, L is the window length, and w(k) is the Hamming window function defined as:
$$w(k) = 0.54 - 0.46\cos\left(2\pi\frac{k}{M-1}\right), \quad 1 \le k < L.$$
The muscle activations were then automatically localized by setting a threshold in relation to the signal magnitude. Rectangular windows are applied over the detected movement onsets. The window length is set 20% larger than the activation period determined by an amplitude threshold, so as to cover the complete movement. Activation periods that are too close to the beginning or the end of the sample sequence are discarded to avoid including unintended or incomplete movements.
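The RMS rectification with a sliding Hamming window can be sketched in plain Python as follows. Several choices here are assumptions rather than details stated on the slide: the window is re-applied from its first sample at every position n, it advances by one sample at a time, the symbol M in the window formula is taken to equal the window length L, and indices are zero-based. The function names are hypothetical.

```python
import math


def hamming(length):
    """Hamming window of the given length (0.54 - 0.46 cos(2 pi k / (length - 1)))."""
    if length < 1:
        raise ValueError("window length must be positive")
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (length - 1))
            for k in range(length)]


def rms_envelope(x, length):
    """RMS of the Hamming-weighted signal over every window of the given length."""
    w = hamming(length)
    envelope = []
    for n in range(len(x) - length + 1):
        # Weighted energy of the window starting at sample n
        acc = sum((x[n + k] * w[k]) ** 2 for k in range(length))
        envelope.append(math.sqrt(acc / length))
    return envelope


if __name__ == "__main__":
    signal = [0.0, 0.1, -0.4, 0.9, -0.7, 0.3, -0.1, 0.0]
    print(rms_envelope(signal, length=4))
```

An amplitude threshold applied to the resulting envelope would then mark the activation periods described in the text.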
Features sEMG (2/3)
Ten features in both the time and frequency domains are extracted from the segmented sEMG samples before classification. The details of each feature are presented below:
Maximum Amplitude: The maximum amplitude reached in the rectified signal.
Mean Amplitude: The mean amplitude of the rectified signal.
Activation duration: The length of the data segment that represents the duration of the muscle activation.
Signal Energy: The energy estimated using Teager Kaiser Energy Operator (TKEO) duringmuscle activation. The TKEO in discrete form is given as:
\[ \psi[x(n)] = x(n)^2 - x(n+1)\, x(n-1), \]
where x(n) is the sEMG data sequence. The signal energy can then be calculated as:
\[ \mathrm{ENE} = \sum_{n=1}^{N} \psi[x(n)]. \]
Maximum changing rate: the peak value in the first derivative of the rectified sEMG signal.
The 2nd and 3rd Linear Prediction Coefficients (LPC): The 2nd and 3rd LPC are computed by constructing a 2nd-order forward linear predictor of the sEMG sample signal and minimizing the prediction error with the Least-Squares method, using the ‘lpc’ function in MATLAB.
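The Signal Energy feature listed above can be illustrated with a short sketch of the discrete Teager-Kaiser operator. It assumes that only interior samples are used, since x(n−1) and x(n+1) are undefined at the two ends of the segment; the slide does not say how the boundaries are handled. The function names are hypothetical.

```python
def tkeo(x):
    """Teager-Kaiser operator psi[x(n)] = x(n)^2 - x(n+1)*x(n-1) on interior samples."""
    return [x[n] ** 2 - x[n + 1] * x[n - 1] for n in range(1, len(x) - 1)]


def signal_energy(x):
    """ENE: sum of the TKEO output over the activation segment."""
    return sum(tkeo(x))


# One period of a unit-amplitude sinusoid sampled four times per period.
segment = [0.0, 1.0, 0.0, -1.0, 0.0]
energy = signal_energy(segment)
```

For a pure sinusoid the operator output is constant across samples, which is why it is often used as an instantaneous energy estimate.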
Features sEMG (3/3)
Average Zero Crossing (ZC) rate: the ZC rate is calculated by counting the zero crossing events of the original sEMG signal within a window, which is defined as:
\[ C(n) = \frac{1}{L} \sum_{k=n}^{n+L-1} \operatorname{sgn}\big( x(k)\, x(k+1) \big), \]
where L is the window length and x is the original sEMG signal. The average ZC rate is computed as:
\[ \mathrm{AZC} = \frac{1}{N} \sum_{n=1}^{N} C(n). \]
Mean Power Frequency (MPF): The MPF is the centroid frequency of the signal power spectrum, defined as:
\[ \mathrm{MPF} = \frac{\sum_{n=1}^{N} P(n)\, f(n)}{\sum_{n=1}^{N} P(n)}, \]
where P is the power spectrum estimated using Welch’s modified periodogram method and f is the normalized frequency vector.
Median Frequency (MF): MF is the frequency that divides the sEMG power spectrum into two equal portions with the same accumulated power. It can be defined as:
\[ \sum_{n=1}^{\mathrm{MF}} P(n) = \sum_{n=\mathrm{MF}}^{N} P(n) = \frac{1}{2} \sum_{n=1}^{N} P(n). \]
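The two spectral features above can be sketched as follows. The example assumes that the power spectrum P and the frequency vector f have already been estimated (for instance with Welch's method, which is not reproduced here). On a discrete grid the two halves of the spectrum rarely match exactly, so the median frequency is taken as the first bin at which the cumulative power reaches half of the total; that tie-breaking rule is an assumption, as are the function names.

```python
def mean_power_frequency(P, f):
    """Centroid of the power spectrum: sum(P*f) / sum(P)."""
    total = sum(P)
    if total <= 0:
        raise ValueError("power spectrum must have positive total power")
    return sum(p * fr for p, fr in zip(P, f)) / total


def median_frequency(P, f):
    """First frequency bin at which the cumulative power reaches half the total."""
    total = sum(P)
    if total <= 0:
        raise ValueError("power spectrum must have positive total power")
    half = 0.5 * total
    cumulative = 0.0
    for p, fr in zip(P, f):
        cumulative += p
        if cumulative >= half:
            return fr
    return f[-1]  # Only reachable through floating-point rounding.


# Hypothetical symmetric spectrum on a normalized frequency grid.
P = [1.0, 2.0, 3.0, 2.0, 1.0]
f = [0.0, 0.1, 0.2, 0.3, 0.4]
mpf = mean_power_frequency(P, f)
mf = median_frequency(P, f)
```

For this symmetric toy spectrum both features fall on the central bin, as expected.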
Statistical analysis of the features
Correlation coefficient between the features of the sEMG signals and the progression of post-stroke recovery.
Feature                       InfoGain   ReliefF   Pearson   Significance
Maximum Amplitude                1.017     0.130      0.71   P < 0.001
Mean Amplitude                   1.201     0.149      0.73   P < 0.001
Activation Duration              0.286     0.036     -0.36   P < 0.001
Signal Energy                    1.231     0.042      0.57   P < 0.001
Maximum Changing Rate            0.677     0.117      0.73   P < 0.001
2nd LPC                          0.548     0.076      0.32   P < 0.001
3rd LPC                          0.228     0.086     -0.03   P = 0.388
Average Zero Crossing rate       0.386     0.047      0.50   P < 0.001
Mean Power Frequency             0.652     0.091      0.60   P < 0.001
Median Frequency                 0.641     0.091      0.64   P < 0.001
[Figure: Comparison between samples recorded from the affected (paretic) side and those recorded from the healthy (non-paretic) side. Axes: Median Frequency (x) versus Mean Amplitude (y), both on a 0 to 1 scale.]
[Figure: Correlation between Median Frequency and Brunnstrom Stages. Axes: Median Frequency (x, 0 to 1) versus Brunnstrom Stages I to VI (y).]
Automatic classification of the Brunnstrom stage using sEMG
Training phase
[Figure: Fold 1, error rate (%) versus γ values. Legend: error rate; best error rate. The γ* with the best error rate is chosen for testing.]
[Diagram: input data x → unconstrained fuzzy membership function evaluation (fuzzification), producing µ(1)(x), µ(2)(x), µ(3)(x), µ(4)(x) → Winner-Takes-All (defuzzification), WTA: arg max_{q=1...K} µ(q)(x) → output data. Fuzzy outputs: 2, 3, 4, ‘Healthy’.]
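The Winner-Takes-All step in the scheme above reduces to an arg max over the membership values of the input. The sketch below illustrates only that step. The Gaussian-shaped one-dimensional membership functions, their centers and widths, and the class labels are placeholders invented for the example; they do not reproduce the unconstrained membership functions of the thesis.

```python
import math


def gaussian_mf(center, width):
    """Hypothetical 1-D membership function, a stand-in for the real fuzzy MFs."""
    def mu(x):
        return math.exp(-((x - center) / width) ** 2)
    return mu


def winner_takes_all(x, memberships):
    """Return (label, degree) of the class whose membership at x is highest."""
    degrees = {label: mu(x) for label, mu in memberships.items()}
    winner = max(degrees, key=degrees.get)
    return winner, degrees[winner]


# Illustrative classes only: centers and widths are made up.
memberships = {
    "Stage 2": gaussian_mf(0.2, 0.15),
    "Stage 3": gaussian_mf(0.4, 0.15),
    "Stage 4": gaussian_mf(0.6, 0.15),
    "Healthy": gaussian_mf(0.8, 0.15),
}
label, degree = winner_takes_all(0.75, memberships)
```

Keeping the full dictionary of membership degrees, rather than only the winner, preserves the fuzzy output for each class before defuzzification.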
Embedding
A chaotic sequence S(n) can be considered as the output of a chaotic system that is observable only through S(n), which should be embedded in order to reconstruct the state-space evolution of this system. The general embedding technique is based on the determination of the following parameters:
embedding dimension D of the reconstructed state-space attractor, obtained by using the False Nearest Neighbors (FNN) method;
time lag T between the embedded past samples of S(n), obtained by using the Average Mutual Information (AMI) method; i.e.:
\[ x_n = \left[ S(n) \;\; S(n-T) \;\; \dots \;\; S(n-(D-1)T) \right], \]
where x_n is the reconstructed state at time n.
The solution of the embedding problem is useful for time series prediction. In a chaotic sequence, the prediction of S(n) can be obtained by using the relationship between the (reconstructed) state and the system output. In fact, the embedding of S(n) is intended to obtain an ‘unfolded’ version of the actual system attractor, so that the difficulty of the prediction task can be reduced. Therefore, the prediction of a chaotic sequence S(n) can be considered as the determination of the function f(·), which approximates the link between the reconstructed state x_n and the output sample at the prediction distance m, i.e. S(n+m), m > 0.
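The construction of the reconstructed states and their prediction targets can be sketched as below. The example assumes that D and T are already known (the FNN and AMI estimators are not implemented here) and uses zero-based indexing; the function name `embed` is made up for the illustration.

```python
def embed(S, D, T, m=1):
    """Build (state, target) pairs from a sequence S.

    state  x_n = [S(n), S(n-T), ..., S(n-(D-1)T)]
    target S(n+m), the sample at prediction distance m.
    """
    if D < 1 or T < 1 or m < 1:
        raise ValueError("D, T and m must be positive integers")
    first = (D - 1) * T          # Earliest n with a complete history.
    states, targets = [], []
    for n in range(first, len(S) - m):
        states.append([S[n - j * T] for j in range(D)])
        targets.append(S[n + m])
    return states, targets


# Toy sequence; D and T are picked by hand, not estimated.
states, targets = embed(list(range(10)), D=3, T=2, m=1)
```

The resulting pairs are the training data for any regressor approximating f(·), so that f(x_n) ≈ S(n+m).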
Training & Test Set
Model definition
M rules determined through clustering algorithms.
Optimization of the HONFIS structure
Main problems → Local convergence of the estimation algorithms and correct determination of the number C of rules.
Solutions → Run the clustering algorithm with different values of C and with different initializations for every value of C.
↓
The optimal network is selected by using the following cost function, which depends on the number of HONFIS rules:
\[ F(C) = (1 - \lambda)\, \frac{E(C) - E_{\min}}{E_{\max} - E_{\min}} + \lambda\, \frac{C}{N_T}, \]
where E_min and E_max are the extreme values of the performance E encountered during the analysis of the different HONFIS networks, and λ is a weight with 0 ≤ λ ≤ 1.
This weight is not critical, since the results are only slightly affected by its variation over a large interval centered at 0.5.
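The selection rule above can be sketched as a small function that scores every candidate number of rules and keeps the minimizer of F(C). The slide does not define N_T; the sketch assumes it is the number of training samples, and the error values in the example are invented purely for illustration. The function name `select_num_rules` is hypothetical.

```python
def select_num_rules(errors, n_t, lam=0.5):
    """Pick the number of rules C minimizing F(C).

    errors: dict mapping C to the performance E(C) of the best network with C rules.
    n_t:    normalizing constant N_T (assumed here to be the training-set size).
    lam:    weight lambda in [0, 1] trading accuracy against complexity.
    """
    e_min, e_max = min(errors.values()), max(errors.values())
    span = e_max - e_min
    scores = {}
    for c, e in errors.items():
        # Guard against identical errors, where the normalization is undefined.
        normalized = (e - e_min) / span if span > 0 else 0.0
        scores[c] = (1.0 - lam) * normalized + lam * c / n_t
    best = min(scores, key=scores.get)
    return best, scores


# Made-up errors: beyond C = 8 the error stops improving.
errors = {2: 0.30, 4: 0.12, 8: 0.10, 16: 0.10}
best, scores = select_num_rules(errors, n_t=100, lam=0.5)
```

With these toy values the complexity term penalizes C = 16 although its error equals that of C = 8, so the smaller network is preferred.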
Graphical results
Logarithmic time series: WTI (United States) and BRENT (Europe);
Training phase: 2001-2002 and 2007-2008;
Testing phase: 2003 and 2009.
[Figure panels: WTI; Brent]