Graphical exploratory analysis of vague data in the early diagnosis of dyslexia

8
Graphical exploratory analysis of vague data in the early diagnosis of dyslexia Luciano S´ anchez Univ. Oviedo Dpt. Inform´ atica 33071 Gij´ on, Asturias [email protected] Ana Palacios Univ. Oviedo Dpt. Inform´ atica 33071 Gij´ on, Asturias [email protected] M. R. Su´ arez Univ. Oviedo Dpt. Inform´ atica 33071 Gij´ on, Asturias [email protected] In´ es Couso Univ. Oviedo Dpt. Estad´ ıstica 33071 Oviedo, Asturias [email protected] Abstract We are in the initial stages of the design of a fuzzy rule-based test- ing system that can be used by un- qualified personnel while screening the children for dyslexia. The main novelty of our work is the exploita- tion of low quality data (incomplete items, intervals, lists, subjective val- ues, etc.) A sample of infants has been obtained, and some tests have been applied to them. In addition, a psychologist has examined and di- agnosed each child. We want to re- late the responses to the tests with the expert judgement of the profes- sional, and highlight those factors that are involved in the early devel- opment of the dyslexia during the preschool age. However, this data being imprecise, we lack tools to as- sess the quality of each test and its influence in the diagnosis. We have used different graphical visualization techniques, to detect the most use- ful sets of factors, but have found that none of the approaches that we are aware of is able to show all the relevant information in the sample. Therefore, we propose a new Multi- dimensional Scaling algorithm, that gains a better insight into the spa- tial properties of the data and also into the amount of vagueness in the pieces of information comprising it. 1 Introduction Dyslexia is a learning disability in people with normal intellectual coefficient, and with- out further physical or psychological prob- lems that can explain such disability. It has been estimated that between 4% and 5% of schoolchildren have dislexia, with reading and writing problems [1]. The average num- ber of children in a Spanish classroom is 25, therefore most of them have dyslexic children. Dyslexia may become apparent in early child- hood, with difficulty putting together sen- tences and a family history. Recognition of the problem is very important in order to give the infant an appropriate teaching. Using Soft Computing techniques for diagnos- ing dyslexia seems to us a natural choice, be- cause of the properties of our data (linguistic terms, and vague measurements). As a mat- ter of fact, there are many references where fuzzy techniques were used to learn medical diagnosis models from data (see, for instance, [18][14][2][12]). In particular, in [5] and [10], fuzzy techniques have been used in the diag- nosis of disabilities in language, and in [17] some different dyslexia related tests were as- sessed with soft computing-based methods. However, in all of the preceding works, the data was crisp or categorical. Instead, most of our measurements are not crisp. Some of our responses are linguistic (“low”, “high”), oth- ers are subjective (for example, the “square- ness” of a hand-drawn shape) or interval val- ued (f.e. a dyslexia degree “between 2 and 4”). Lastly, a high percentage of cases have missing values. None of the preceding ap- L. Magdalena, M. Ojeda-Aciego, J.L. Verdegay (eds): Proceedings of IPMU’08, pp. 1417–1424 Torremolinos (M´ alaga), June 22–27, 2008

Transcript of Graphical exploratory analysis of vague data in the early diagnosis of dyslexia

Graphical exploratory analysis of vague data in theearly diagnosis of dyslexia

Luciano SanchezUniv. Oviedo

Dpt. Informatica33071 Gijon, Asturias

[email protected]

Ana PalaciosUniv. Oviedo

Dpt. Informatica33071 Gijon, Asturias

[email protected]

M. R. SuarezUniv. Oviedo

Dpt. Informatica33071 Gijon, [email protected]

Ines CousoUniv. Oviedo

Dpt. Estadıstica33071 Oviedo, Asturias

[email protected]

Abstract

We are in the initial stages of thedesign of a fuzzy rule-based test-ing system that can be used by un-qualified personnel while screeningthe children for dyslexia. The mainnovelty of our work is the exploita-tion of low quality data (incompleteitems, intervals, lists, subjective val-ues, etc.) A sample of infants hasbeen obtained, and some tests havebeen applied to them. In addition,a psychologist has examined and di-agnosed each child. We want to re-late the responses to the tests withthe expert judgement of the profes-sional, and highlight those factorsthat are involved in the early devel-opment of the dyslexia during thepreschool age. However, this databeing imprecise, we lack tools to as-sess the quality of each test and itsinfluence in the diagnosis. We haveused different graphical visualizationtechniques, to detect the most use-ful sets of factors, but have foundthat none of the approaches that weare aware of is able to show all therelevant information in the sample.Therefore, we propose a new Multi-dimensional Scaling algorithm, thatgains a better insight into the spa-tial properties of the data and alsointo the amount of vagueness in thepieces of information comprising it.

1 Introduction

Dyslexia is a learning disability in peoplewith normal intellectual coefficient, and with-out further physical or psychological prob-lems that can explain such disability. Ithas been estimated that between 4% and 5%of schoolchildren have dislexia, with readingand writing problems [1]. The average num-ber of children in a Spanish classroom is 25,therefore most of them have dyslexic children.Dyslexia may become apparent in early child-hood, with difficulty putting together sen-tences and a family history. Recognition ofthe problem is very important in order to givethe infant an appropriate teaching.

Using Soft Computing techniques for diagnos-ing dyslexia seems to us a natural choice, be-cause of the properties of our data (linguisticterms, and vague measurements). As a mat-ter of fact, there are many references wherefuzzy techniques were used to learn medicaldiagnosis models from data (see, for instance,[18][14][2][12]). In particular, in [5] and [10],fuzzy techniques have been used in the diag-nosis of disabilities in language, and in [17]some different dyslexia related tests were as-sessed with soft computing-based methods.However, in all of the preceding works, thedata was crisp or categorical. Instead, most ofour measurements are not crisp. Some of ourresponses are linguistic (“low”, “high”), oth-ers are subjective (for example, the “square-ness” of a hand-drawn shape) or interval val-ued (f.e. a dyslexia degree “between 2 and4”). Lastly, a high percentage of cases havemissing values. None of the preceding ap-

L. Magdalena, M. Ojeda-Aciego, J.L. Verdegay (eds): Proceedings of IPMU’08, pp. 1417–1424

Torremolinos (Malaga), June 22–27, 2008

proaches are directly applicable to the prob-lem at hand.

In previous works [16], we have proposed somelearning algorithms that input a vague train-ing set and produce a fuzzy rule-based re-gression model, as we will explain in Section2. Nevertheless, before we can apply thesenew algorithms, we must carry an exploratoryanalysis for determining a small subset of in-puts (we use 413 different tests) that is infor-mative enough to reproduce the criteria of thepsychologist, or else the system will not havepractical use.

The problem of feature selection in regressionmodels with linear dependences between thevariables is customary solved with PrincipalComponent Analysis (PCA) or Factor Analy-sis. Both techniques have been generalized tocertain types of fuzzy data, see for instance[6][13]. In addition, nonlinear dependencesbetween vague data have also been studied.On the one hand, certain fuzzy feature selec-tion algorithms that were designed for fuzzyclassifiers can also be applied to regressionproblems with fuzzy data [15]. Moreover,modern approaches like Independent Compo-nent Analysis (ICA) and Self Organized Maps(SOM) have fuzzy extensions, but they are in-tended to improve the robustness when work-ing with crisp data [8][3]. Other nonlinearextensions of PCA, like Curvilinear Compo-nent Analysis (CCA) [11] have not yet beenextended to the fuzzy case. However, theseadvanced nonlinear techniques are closely re-lated to a technique widely used in psychol-ogy, Multi-Dimensional Scaling (MDS) [9],that has been recently generalized to the fuzzycase [7]. We will use and extend this last tech-nique.

The structure of this work is as follows: InSection 2 we will review the tests used inthis research, measuring verbal understand-ing, logic reasoning, memory, and sensory-motor ability. In Section 3 we introduce thenew graphical model. In Section 4 the resultsof applying the new model to our data areshown. Section 5 concludes the paper.

2 Symptoms and detection ofdyslexia

The dyslexia is a disability that is diagnosedwith the help of some tests, by a psychologistor specialist dyslexia teacher. In many Span-ish schools these tests are routinely appliedto children. However, attendance to schoolis not mandatory under the age of 6, thusdyslexic children might not be examined earlyenough, unless their parents suspect a prob-lem. We intend to provide the parents withan automated tool that can screen for certainsymptoms, suggesting that a professional iscontacted for further diagnosis, if needed. Wewant to find those children possibly affectedby dyslexia and also those that, without hav-ing dyslexia, have learning problems or areprone to have them in the future.

According to [1], the characteristic signs ofthe dyslexia depend on the age of the child.We are mostly interested in preschool educa-tion (between ages 4 and 6). These childrenare being initiated in reading and writing, butthey can not properly read yet. Therefore,we can detect a tendency to the dyslexia, butthe symptoms of the disability are not self-evident and the tests are designed to detectthem. The most significant symptoms have tosee with a slow acquisition of language skills(failing to remember lists of names, numbers,alphabet, days of the week, shapes, colours,etc.), mismatching words with similar pronun-ciation, limited vocabulary, attention deficit,hyperactivity and natural ability with techni-cal toys, i.e., greater manual than linguisticability, which typically will show in I.Q. tests.

From ages from 6 to 9 the symptoms begin tobe conspicuous. Reading, writing and calcu-lus skills are not acquired at the proper rate.In this case, the tests have to detect the symp-toms, as before, but they are also intended toseparate those children which actually sufferfrom dyslexia from those others for whom theproblem can be related to other causes.

All the tests and evaluation criteria that havebeen used in this research are currently beingused in Spanish schools for detecting dyslexia.In Figure 1 we have included an example of

1418 Proceedings of IPMU’08

Figure 1: Example of some of Bender’s tests for detecting dyslexia. Upper part: The angles ofthe shape in the right are qualified by a list of adjectives that can contain the words “right,”“incoherent,” “acceptable,” “regular” and “extra.” Middle and lower part: The relative positionbetween the figures can be “right and separated,” “right and touching,” “intersecting”, etc.

one of the tasks that the children have to solvein these tests: copying some geometric draw-ings. When a child is being evaluated, theexpert has to decide whether the angles, rel-ative position and other geometrical proper-ties have been accurately copied or not, choos-ing between a given set of adjectives. Othertests produce numbers, or linguistic labels as“low”, or “very high”. Lastly, we allow theuse to express indifference between differentresponses by means of intervals, as in “lowerthan 3” or “between 2 and 4”. There are 13categories of tests, that expand to a total of413 numerical, categorical and interval-valuedvariables.

In this research we have selected a sample of65 infants between 5 and 8 years old, in ur-ban schools of Asturias (Spain), and collectedtheir responses to the tests mentioned before.Afterwards, the same children were examinedby a psychologist, who assigned each one ofthem a subjective score, which is an intervalof values between 0 (normal child) and 4 (highdegree of dyslexia). We remark that we do notintend to design a classifier, but an interval-valued regression model, as we can not assumethat the output variable is categorical neithercrisp, i.e., certain children are not assignednumbers by the psychologist, but ranges of

values. Consequently, the desired output ofour model is an interval of numbers, between 0and 4, describing the degree of dyslexia. Thewidth of the interval codifies the confidencein the prediction, and will be related to thevagueness of the input: the more vague the in-put is, the less specific the output of the modelshould be. The objective of this paper is touse soft computing techniques to relate thesemeasured variables with the expert judgementof the professional, and highlight those factorsthat are involved in the early development ofthe dyslexia during the preschool age.

3 Graphical exploratory statistics

In the problem at hand, the data is vague andthere is also a high proportion of missing val-ues. We need to detect whether the vague-ness of each instance is too high - thus thatinstance should be removed from the train-ing set. Measures of vagueness are affectedby the scaling of the data. For instance, oneof the tests is assigned a value between 0 and40, while others produce binary values. If thedata is not scaled, the imprecision in the firsttest will be much more noticeable than theimprecision in the second. In crisp data, wesolve the scaling problem by applying PCA tothe correlation matrix instead of the distance

Proceedings of IPMU’08 1419

matrix, but there is not an standard proce-dure when the data is imprecise.

In this respect, PCA finds projections of max-imal variability, because the first k principalcomponents span a subspace containing thebest k-dimensional view of the data. There-fore, PCA minimizes the sum of the squareddistances between the points and their pro-jections. Multidimensional scaling (MDS) [9]generalizes this property, as it projects the in-stances in a low dimensional Euclidean spaceso that their proximity reflects the similarityof their variables. Fuzzy MDS, as describedin [4][7], in turn generalizes MDS to the casewhere the distance matrix comprises intervalsor fuzzy numbers. In this sense, the MDSmap coordinates are the best approximationwe have to the PCA when the data is vagueand there are missing values.

3.1 A new fuzzy-MDS algorithm

MDS also depends on the scaling of the data.But, having into account that our ultimateobjective is to obtain a fuzzy rule based-model, this problem can be circumvented.The main innovation in our algorithm is re-lated to this problem: we will map the activa-tion space instead of the input space. That isto say, we do not consider that two examplesare similar when their coordinates are near inthe space. Instead, we will consider that twoexamples are similar when they fire the samerules of the knowledge base.

Therefore, we propose to modify the FuzzyMDS algorithm in [7] and use a new, non-Euclidean distance measure. This distancetakes into account not only the vagueness,but also the granularity of the linguistic dis-cretization used in the fuzzy knowledge base,and in this sense does not depend on the scal-ing of the data.

3.1.1 Representation of an instance inthe activation space

We will assume, as most researchers in thisfield do, that those fuzzy sets defining themeaning of each linguistic variable form aRuspini’s partition. This means that the

xi

xj

R!ij

R+ij

Figure 2: The projected data are polygonsdefined by the distances Rij in the directionsthat pairwise join the examples.

memberships of any crisp value to the ele-ments of such a partition are conditional prob-ability distributions. In other words, everyprecise observation of a numerical variable canbe matched with a precise probability distri-bution over its corresponding universe of lin-guistic labels. For example, if a linguisticvariable has three terms “L”,“M” and “H”,then we will map every numerical value x0 toa triplet of membership values. This tripletcan be named (p(L|x0),p(M|x0), p(H|x0)) andp(L|x)+p(M|x)+p(H|x) = 1 for all x.

We propose that interval-valued observationsare mapped to sets of probabilities. For ex-ample, an interval [x1, x2] will be mapped toa set {(p(L|x), p(M|x), p(H|x)) | x ∈ [x1, x2]},which we will enclose in the imprecise proba-bility distribution given by the triplet of up-per probabilities (p∗(L|x), p∗(M|x), p∗(H|x))=( max[x1,x2]

p(L|x), max[x1,x2]

p(M|x), max[x1,x2]

p(H|x)),

with p∗(L|x)+p∗(M|x)+p∗(H|x) ≥ 1 for allx. Finally, we propose too that a fuzzyset with normal membership function X(x)is represented by the tuple of upper prob-abilities (maxxX(x) · p(L|x),maxxX(x) ·p(M|x),maxxX(x) · p(H|x)). Notice that ourrepresentation produces an interval-valueddistance either with interval valued or fuzzydata, because it depends on the extrema of aset of distances that are induced by a set ofprobability distributions in both cases.

1420 Proceedings of IPMU’08

xi

xj

dij

zi zj

!+ij = max D(xi, xj)

!!ij = minD(xi, xj)

R!ij R!ji R+ji

R+ij

!ij = D(xi, xj)

Figure 3: The distance between the projec-tions of xi and xj is between dij − R−

ij − R−ji

and dij +R+ij −R+

ji

3.1.2 Distance between cases

According to our representation, we definethe distance between two imprecisely mea-sured multivariate values xi = (xi1, . . . , xif )and xj = (xj1, . . . , xjf ), with f featureseach, where the value of the feature xik

is represented by the imprecise probabilities(p∗ik1, . . . , p

∗ikn), as the set of distances

Dij = {√∑f

k=1

∑nl=1(pikl − pjkl)2 :∑

l pikl =∑

l pjkl = 1 for all kand pikl ≤ p∗ikl, pjkl ≤ p∗jkl for all k, l}.

(1)

The distance between two imprecise objects(either interval or fuzzy) is an interval, thatcan be easily obtained by quadratic program-ming or approximated by Montecarlo simula-tion.

3.1.3 Stress function

In [7] the projection of an imprecise case is acircle. We have found that, in our problem,this is a too restrictive hypothesis. Instead,we propose to approximate the shape of theprojections by a polygon (see Figure 2) whoseradii R+

ij and R−ij are not free variables, but

depend on the distances between the cases.

Let {x1, . . . , xN} be a set of multivariateimprecise data, let xi be the crisp cen-

terpoint of the imprecise value xi and let{(z11, . . . , z1r), . . . , (zN1, . . . , zNr)} be a crispprojection, with dimension r, of that set. Wepropose that (see Figure 3 for a graphical ex-planation):

R+ij = dij

(δ+ijδij− 1

)R−

ij = dij

(δij

δ−ij− 1

)(2)

where dij =√∑r

k=1(zik − zjk)2, δij ={D(xi, xj)}, δ+ij = max{D(xi, xj)}, and δ−ij =min{D(xi, xj)}.Consequently, we propose that the value ofthe stress function isN∑

i=1

N∑j=i+1

dH(Dij , [dij−R−ij−R−

ji, dij+R+ij+Rji]+)2

(3)where dH is the Haussdorff distance betweenintervals.

4 Experiments

In [4, 7] the Haussdorff distance was not used,but the average of the differences between theextrema of the intervals, thus the stress func-tion can be optimized by quadratic program-ming (QP). However, the stress function is notconvex (notice, for example, that it is invari-ant under translations and rotations) and QPis not guaranteed to find a good solution. Ourstress function function contains the “max”operator and it is not differentiable. Insteadof using QP, the experiments in this sectionwere obtained with a real-coded genetic al-gorithm, where each chromosome contains allthe coordinates zi of a map. Tournament se-lection, and a mix of uniform and arithmeticcrossover were used. The population size was100 and the number of generations 1000.

In Figure 4, we have applied some standardexploratory statistics to the data. Each in-terval value has been replaced by its center-point, and categorical data was representedby numbers. Principal Component Analysis,Independent Component Analysis and Multi-dimensional Scaling were evaluated. For scal-ing the data, the MDS algorithm was appliedto the correlation matrix (therefore the map isdifferent than that of PCA). The numbers are

Proceedings of IPMU’08 1421

−10 −5 0 5

−8

−6

−4

−2

02

46

2

2

2

12

2

0

0.5 4

4

4

4

22

2

2

0

0

0

32

11

24

00

000 0

0

0

00

0

00

0

4 00

4

0

00

0

11

2

2

2

4

4

4

2

4

2

4

1

2

2

4

2

0.5

−2 −1 0 1 2

−1

01

23

2

2

21

2

2

0 0.5

4

4

4

4

2

2

2

2

0

00

32

11

24

0

0 00 0

0

0 00

0

0

00

0

400

4

0

00

0

11

22 24

4

4

2

4

24

1

2

2

4

20.5

−4 −2 0 2 4

−4

−2

02

4

2

2

2

1

2

20

0.5

4

4

4

4

2 2

2

2

0

0

0

3

2

11

24

0

00

000

00

0

0

0

0

0

0

400

4

0

0

00

1 1

2

2

2

4

4

4

2

4

2

4

1

2

2

42

0.5

Figure 4: From top to bottom: PCA, ICAand MDS analysis of the data. Each inter-val has been replaced by its centerpoint. Thenumbers are the mean degree of dyslexia ofthe children, from 0 (no dyslexia) to 4.

the mean degree of dyslexia of the children,from 0 (no dyslexia) to 4. In the three dia-grams, there is an area where normal childrenare concentrated, but there is not a clear de-pendence between the position of a point andthe dyslexia degree that it represents. It isalso emphasized that these maps do not con-sider the vagueness of the data.

In Figure 5 we have applied both the algo-rithm in [4, 7] and the proposal in this paperto our data. The first algorithm representsthe data as circles. Since the scores of thetests have different ranges, this map does notshow too useful information. By contrast, the

MDS that we propose here is based on dis-tances in the activation space (the member-ships of the data to the linguistic partitions).Observe that it gains much more insight intothe structure of this data.

In Figure 6, three subsets of 8 features havebeen examined. One of them has been ob-tained by means of a feature selection algo-rithm based on our own definition of fuzzymutual information [15]. A second set hasbeen obtained with classical techniques (anal-ysis of variance), replacing the missing dataand the imprecise measurements by crispnumbers. In the last place, an expert has cho-sen 8 variables that she considered relevant.

The second map, which is based in classi-cal techniques, did not take into account themissing values neither the imprecision. As aconsequence of this, highly imprecise variableswere not avoided and most cases overlap inthe map. Unexpectedly, the set of variableschosen by the expert has a good separabil-ity, but it does not produce a clear distinctionbetween children with and without dyslexia.Only the first map, using fuzzy Mutual Infor-mation, has produced reasonable results.

Lastly, in Figure 7, we have studied the influ-ence of the number of labels in the linguisticvariables in the separability of this last case.Granularities 3, 5 and 7 were compared overthat feature set obtained by MI. The set ofgranularity 5 seems to be a sensible choice,as the projection of the set of granularity 7does not significantly improve the separabil-ity of the data, and 3 labels are not enoughfor separating the cases.

5 Concluding remarks

When tackling real-world problems involvingimprecise data, we lack many standard proce-dures in the work cycle “exploratory analysis -preprocessing - learning -validation.” In thiswork, we have adapted an exploratory tech-nique to the design of fuzzy rule-based sys-tems. A new measure of distance, that allowsus to analyze the effect of different granulari-ties, was introduced, along with a more flexi-ble representation based on polygons.

1422 Proceedings of IPMU’08

0 10 20 30 40 50

010

2030

4050

V1

V2

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

1819

20

21

22

23

24

25

26

27 28

29

30

31

32

33

34

35

36

37

38

39

4041

42

43

4445

46

47

4849

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

●●

●● ●

0 5 10

05

1015

V1

V2

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

1718

19

20

21222324

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

4243

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

Figure 5: Left: MDS in input space. Right: MDS in activation space. The use of the activationspace instead of the input space gains more insight into the structure of the data, and does notrequire scaling. The meaning of the colors is: Red: 0. Green: lower than 1. Blue: 1. Cyan: 2.Magenta: 3. Yellow: 4

0 1 2 3

0.0

0.5

1.0

1.5

2.0

2.5

3.0

V1

V2

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

3031323334

353637

3839 40

41

4243

4445

46

47

48

49

50

5152

53

54

55

56

57

58

5960

61

62

6364

65

0.0 0.5 1.0 1.5 2.0 2.5 3.0

−0.

50.

00.

51.

01.

52.

02.

5

V1

V2

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

2021

22

23

24

25

26

27

2829

30

3132

33

34

353637

3839

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

0.0 0.5 1.0 1.5 2.0

0.0

0.5

1.0

1.5

2.0

V1

V2

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

2223

2425

26

27

28

29

30

31

32

33

34

35

3637

38

39

40

41

42

43

44

45

46

4748

49

505152

53

54

55

56

57

58

59

60

61

62

63

64

65

Figure 6: Three subsets of 8 variables have been examined. Left: Fuzzy Mutual Information[15]. Center: Analysis of Variance of linear models, with crisp data. Right: Selection of the mostrelevant tests, according to the human expert. The crisp selection did not take into accountthe vagueness. The set of variables chosen by the expert does not produce a clear distinctionbetween children with and without dyslexia. The use of fuzzy IM produced reasonable results.

−0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0

−1.

0−

0.5

0.0

0.5

1.0

1.5

2.0

2.5

V1

V2

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

1718

19

20

21

22

23

24 25

2627

28

29

30

3132

33

34

35363738

39

40

4142

43

44

45

46

4748

49

50

51

52

53

54

55

56

57

5859

60

61

62

63

64

65

0 1 2 3

0.0

0.5

1.0

1.5

2.0

2.5

3.0

V1

V2

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

3031323334

353637

3839 40

41

4243

4445

46

47

48

49

50

5152

53

54

55

56

57

58

5960

61

62

6364

65

−1 0 1 2 3

−1

01

23

V1

V2

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

2829

30

31323334

3536

37

38

39

40

41

42

434445

46

47

48

49

50

51

52

53

54

55

56

57

58

59

6061

62

63

64

65

Figure 7: Interval MDS in activation space, with granularities 3, 5 and 7, and the feature setobtained by IM. The projection of the set of granularity 7 does not significantly improve theseparability of the data and 3 labels are not enough for separating the cases, thus the bestgranularity is 5.

Proceedings of IPMU’08 1423

The next problem we have to solve is the selec-tion of the most relevant variables. We haveshown that the classical techniques, that dis-card the imprecision, are not adequate for thisproblem. In future works, we intend to ex-tend the fuzzy MI based algorithm that wehave mentioned in the paper, and derive newwrapper-type feature selection techniques forregression models with imprecise data.

Acknowledgements

This work was funded by Spanish M. of Sci-ence and Technology and by FEDER funds,within the grant TIN2005-08036-C05-05, andby Plan de Ciencia, Tecnologia e Innovacion2006-2009 (PCTI) of Asturian regional Gov-ernement.

References

[1] Ajuriaguerra, J. Manual de psiquiatrıa infan-til (in Spanish). Toray-Masson. 1976

[2] Akbarzadeh, M. R., Moshtagh, M. A hierar-chical fuzzy rule-based approach to aphasiadiagnosis. Journal of Biomedical Informatics40, pp 465-475. 2007

[3] Bezdek, J. C., Tsao, E. C. K. and Pal, N. R.Fuzzy Kohonen clustering networks. In Proc.IEEE Int. Conf. on Fuzzy Systems, pp- 1035-1043, 1992

[4] Denoeux, T., Masson, M.-H., Multidimen-sional scaling of interval-valued dissimilaritydata. Pattern Recognition Lett. 21, 83-92.2000

[5] Georgopulos, V. A fuzzy cognitive map todifferential diagnosis of specific language im-pairment. Artificial intelligence in Medicine29. pp 261-278. 2003

[6] Giordani, P., Kiers, H., Principal ComponentAnalysis of symmetric fuzzy data. Computa-tional Statistics and Data Analysis 45(3). pp519-548. 2004.

[7] Hebert, P. A., Masson, M. H., Denoeux,T. Fuzzy multidimensional scaling. Compu-tational Statistics and Data Analysis 51. pp335-359. 2006.

[8] Honda, K., Ichihashi, H. Fuzzy local indepen-dent component analysis with external cri-teria and its application to knowledge dis-covery in databases. International Journal of

Approximate Reasoning 42(3), pp. 159-173.2006.

[9] Kruskal, J.B., Nonmetric multidimensionalscaling: a numerical method. Psychometrika29, 115-129. 1964.

[10] Lakov, D. V. Soft Computing Agent Ap-proach to Remote Learning of Disables. 2nd.IEEE Intl. Conf. Intelligent Systems. pp 250-255.2004

[11] Lee, J. A., Lendasse, A., Verleysen, M.,Curvilinear Distance Analysis versus Isomap.European Symposium on Artificial NeuralNetworks ESANN’2002. pp 185-192. 2002.

[12] Lurie, A., Marsala, C., Hartley, S., Bouchon-Meunier, B., Dusser, D. Patients’ perceptionof asthma severity. Respiratory Medicine, pp2145-2152. 2007

[13] Nakamori, Y., Watada, J. Factor analysis forfuzzy data. Sixth Intl. Conf. Fuzzy Systems,pp. 1115-1120. 1997.

[14] Di Nuovo, A., Catania, V., Di Nuovo, S.,Buono, S., Psychology with soft computing:An integrated approach and its applications.Applied Soft Computing 8, pp 829-837, 2007

[15] Sanchez, L., Suarez, M. R., Villar, J. R.,Couso, I. Some results about mutual infor-mation based feature selection and fuzzy dis-cretization of vague data. Intl. Conf. FuzzySystems FUZZ-IEEE 2007, pp. 1-6. 2007.

[16] Sanchez, L., Otero, J., Villar, J. R. Learn-ing fuzzy linguistic models from low qualitydata by genetic algorithms. Intl. Conf. FuzzySystems FUZZ-IEEE 2007, London. 2007.

[17] da Silva, F., Munzlinger, E., Jannette, L.,Riveros, M. Validacao de um sistema espe-cialista para o pre-diagnostico da dislexia (inPortuguese). XXV Congresso da SociedadeBrasileira de Computacao (CSBC). 1206-1209. 2005

[18] Stylios, C., Georgopoulos, V. C., Fuzzy Cog-nitive Structure for Medical Decision Sup-port Systems. Forging New Frontiers: FuzzyPioneers II. 218, pp 151-174. 2007

1424 Proceedings of IPMU’08