Correlation coefficients of hesitant fuzzy sets and their applications to clustering analysis

Applied Mathematical Modelling 37 (2013) 2197–2211


Na Chen a,b, Zeshui Xu a,⇑, Meimei Xia a

a School of Economics and Management, Southeast University, Nanjing 210096, China
b School of Applied Mathematics, Nanjing University of Finance and Economics, Nanjing 210046, China

Article info

Article history: Received 21 October 2011; received in revised form 4 April 2012; accepted 21 April 2012; available online 22 May 2012.

Keywords: Correlation coefficient; Hesitant fuzzy set (HFS); Interval-valued HFS

⇑ Corresponding author. E-mail addresses: [email protected] (N. Chen), [email protected] (Z. Xu), [email protected] (M. Xia).

0307-904X/$ - see front matter © 2012 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.apm.2012.04.031

Abstract

Hesitant fuzzy sets (HFSs), which allow the membership degree of an element to a set to be represented by several possible values, can be considered as a powerful tool to express uncertain information in the process of group decision making. We derive some correlation coefficient formulas for HFSs and apply them to clustering analysis under hesitant fuzzy environments. Two real world examples, i.e. software evaluation and classification as well as the assessment of business failure risk, are employed to illustrate the actual need of the clustering algorithm based on HFSs, which can incorporate the difference of evaluation information provided by different experts in clustering processes. In order to extend the application domain of the clustering algorithm in the framework of HFSs, we develop the interval-valued HFSs and the corresponding correlation coefficient formulas, and then demonstrate their application in clustering with interval-valued hesitant fuzzy information through a specific numerical example.

© 2012 Elsevier Inc. All rights reserved.

1. Introduction

Correlation is one of the most broadly applied indices in many fields and also an important measure in data analysis and classification, pattern recognition, decision making and so on [1–6]. As many real world data may be fuzzy, the concept of correlation has been extended to fuzzy environments [7–14] and intuitionistic fuzzy environments [15–20,4,21]. For instance, Gerstenkorn and Manko [15] introduced the correlation coefficients of intuitionistic fuzzy sets (IFSs) [22,23]. Hong and Hwang [16] also defined them in probability spaces. Mitchell [19] derived the correlation coefficient of IFSs by interpreting an IFS as an ensemble of ordinary fuzzy sets. Hung and Wu [18] proposed a method to calculate the correlation coefficients of IFSs by means of "centroid". Because of the potential applications of correlation coefficients, they have been further extended by Bustince and Burillo [24] and Hong [25] for interval-valued intuitionistic fuzzy sets (IVIFSs). Several new methods of deriving the correlation coefficients for both IFSs and IVIFSs have also been proposed in Refs. [20,4,21].

Torra and Narukawa [26] and Torra [27] recently put forward the concept of hesitant fuzzy set (HFS) as an extension of fuzzy set [28] and analyzed its similarities and differences with IFSs [22,23], type-2 fuzzy sets [29], type-n fuzzy sets [29], and fuzzy multisets [30,31]. They further indicated that HFSs can better deal with situations in which the membership of an element to a given set may take a few different values, which can arise in a group decision making problem. Consider a decision organization as an illustration of why HFSs are needed. In the organization, the experts discuss the degree to which an alternative A satisfies a criterion x; some experts possibly assign 0.2, some 0.4, while the others 0.6. No consensus is reached among these experts. For such a case, the satisfactory degrees can be represented by a hesitant fuzzy element {0.2,0.4,0.6}, which is obviously different from the fuzzy number 0.2 (or 0.4), the interval-valued fuzzy number [0.2,0.6] and the intuitionistic fuzzy number (0.2,0.4). Thus the HFS can incorporate all possible opinions of the group members and, correspondingly, provides an intuitive description of the differences among the group members.

Very recently, some aggregation operators and distance measures for HFSs have been established in Refs. [32–35]. The concept of a hesitant fuzzy linguistic term set was also introduced based on HFSs in Ref. [36]. The present work gives some formulas of the correlation coefficients for HFSs.

In addition, we apply the derived correlation coefficient formulas to clustering analysis for hesitant fuzzy information. Clustering refers to a process that combines a set of objects (alternatives, people, events, etc.) into clusters with respect to the characteristics of data, such that objects belonging to the same cluster are more similar to each other than to objects in different clusters. As one of the widely adopted key tools in handling data information, clustering analysis has been applied to the fields of pattern recognition [37], data mining [38], information retrieval [39,40], and other real world problems concerning social, medical, biological, climatic, financial, etc. systems [41–45].

In the real world, data used for clustering may be uncertain and fuzzy. To deal with various types of fuzzy data, a number of clustering algorithms corresponding to different fuzzy environments [46] have been proposed, e.g., intuitionistic fuzzy clustering algorithms [47,21,48] involving the correlation coefficient formulas for IFSs [21] and type-2 fuzzy clustering algorithms [49,50]. However, in group decision making situations, the evaluation information provided by different experts may differ markedly. The fuzzy clustering schemes mentioned above are unable to incorporate the differences in the opinions of different experts; that is, they are unsuitable for clustering under hesitant fuzzy environments. HFSs could resolve the issue, because they avoid performing data aggregation and can directly reflect the differences of the opinions of different experts. We will use the derived correlation coefficient formulas to calculate the degrees of correlation among HFSs, with the aim of clustering different objects.

The rest of the article is organized as follows. Section 2 reviews basic concepts related to HFSs, IFSs and IVIFSs. In Section 3 we give some correlation coefficient formulas for HFSs. Section 4 proposes a clustering algorithm based on HFSs, and two real case studies are performed to demonstrate the need of the proposed clustering algorithm under hesitant fuzzy environments. Section 5 introduces the concept of interval-valued HFS and an actual example is employed to illustrate the use of interval-valued HFSs in clustering. Section 6 summarizes this study and presents future challenges.

2. Preliminaries

2.1. Hesitant fuzzy sets

Definition 1 ([26,27]). Let $X$ be a reference set. A hesitant fuzzy set (HFS) $A$ on $X$ is defined in terms of a function $h_A(x)$ that, when applied to $X$, returns a finite subset of $[0,1]$, i.e.,

$$A = \{\langle x, h_A(x)\rangle \mid x \in X\}, \tag{1}$$

where $h_A(x)$ is a set of some different values in $[0,1]$, representing the possible membership degrees of the element $x \in X$ to $A$. For convenience, we call $h_A(x)$ a hesitant fuzzy element (HFE) [32].

Example 1. Let $X = \{x_1, x_2, x_3\}$ be a reference set, and let $h_A(x_1) = \{0.2, 0.4, 0.5\}$, $h_A(x_2) = \{0.3, 0.4\}$ and $h_A(x_3) = \{0.3, 0.2, 0.5, 0.6\}$ be the HFEs of $x_i$ ($i = 1,2,3$) to a set $A$, respectively. Then $A$ can be considered as a HFS, i.e.,

$$A = \{\langle x_1, \{0.2, 0.4, 0.5\}\rangle,\ \langle x_2, \{0.3, 0.4\}\rangle,\ \langle x_3, \{0.3, 0.2, 0.5, 0.6\}\rangle\}.$$

Definition 2 ([26,27]). Given a HFE $h$, its lower and upper bounds are defined as below:

$$\text{Lower bound: } h^-(x) = \min h(x); \qquad \text{Upper bound: } h^+(x) = \max h(x).$$

Definition 3 ([26,27]). Given a HFE $h$, $A_{\mathrm{env}}(h)$ is called the envelope of $h$, which is represented by $(h^-, 1 - h^+)$, with $h^-$ and $h^+$ being its lower and upper bounds, respectively.
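Definitions 1 to 3 can be illustrated with a minimal Python sketch. The dictionary layout (element name mapped to a list of membership degrees) and the function names `lower_bound`, `upper_bound` and `envelope` are assumptions made here for illustration; they do not come from the original formulation.

```python
# Illustrative sketch only: an HFS stored as {element: HFE}, where each HFE
# is a list of membership degrees in [0, 1]. Names are hypothetical.

def lower_bound(hfe):
    """h^-: the smallest membership degree in the HFE (Definition 2)."""
    return min(hfe)

def upper_bound(hfe):
    """h^+: the largest membership degree in the HFE (Definition 2)."""
    return max(hfe)

def envelope(hfe):
    """A_env(h) = (h^-, 1 - h^+), the envelope of the HFE (Definition 3)."""
    return (lower_bound(hfe), 1.0 - upper_bound(hfe))

# The HFS A of Example 1.
A = {"x1": [0.2, 0.4, 0.5], "x2": [0.3, 0.4], "x3": [0.3, 0.2, 0.5, 0.6]}
envelopes = {x: envelope(h) for x, h in A.items()}
```

For the HFE of $x_1$ in Example 1, the sketch returns the pair (0.2, 0.5), i.e. the lower bound and one minus the upper bound.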

2.2. Intuitionistic fuzzy sets and interval-valued intuitionistic fuzzy sets

Definition 4 ([22]). Let $X$ be a universe of discourse. An IFS $A$ in $X$ is defined as:

$$A = \{\langle x, \mu_A(x), \nu_A(x)\rangle \mid x \in X\}, \tag{2}$$

where the functions $\mu_A(x)$ and $\nu_A(x)$ denote the degrees of membership and non-membership of the element $x \in X$ to the set $A$, respectively, with the condition:

$$0 \le \mu_A(x) \le 1, \quad 0 \le \nu_A(x) \le 1, \quad 0 \le \mu_A(x) + \nu_A(x) \le 1, \tag{3}$$

and $\pi_A(x) = 1 - \mu_A(x) - \nu_A(x)$ is usually called the degree of hesitancy of $x$ to $A$. The pair $\alpha = (\mu_\alpha, \nu_\alpha)$ is called an intuitionistic fuzzy value (IFV).

It should be mentioned that, from Definitions 3 and 4, the envelope of a HFE $h$, denoted by $A_{\mathrm{env}}(h) = (h^-, 1 - h^+)$, is just an IFV: since $h^- \le h^+$, we have $h^- + (1 - h^+) \le 1$, so condition (3) holds.

As indicated by Atanassov and Gargov [23], the membership degrees for certain elements of $A$ may not be exactly defined, but a value range can be given. So they introduced the concept of interval-valued intuitionistic fuzzy set (IVIFS). It is characterized by a membership function and a non-membership function, whose values are intervals rather than exact real numbers.

Definition 5 ([23]). Let $X$ be a universe of discourse. An IVIFS $\tilde{A}$ over $X$ is an object having the form:

$$\tilde{A} = \{\langle x, \tilde{\mu}_{\tilde{A}}(x), \tilde{\nu}_{\tilde{A}}(x)\rangle \mid x \in X\}, \tag{4}$$

where $\tilde{\mu}_{\tilde{A}}(x) = [\tilde{\mu}_{\tilde{A}L}(x), \tilde{\mu}_{\tilde{A}U}(x)] \subseteq [0,1]$ and $\tilde{\nu}_{\tilde{A}}(x) = [\tilde{\nu}_{\tilde{A}L}(x), \tilde{\nu}_{\tilde{A}U}(x)] \subseteq [0,1]$ are intervals with the condition $\sup \tilde{\mu}_{\tilde{A}}(x) + \sup \tilde{\nu}_{\tilde{A}}(x) \le 1$ for all $x \in X$.

2.3. Correlation coefficients of IFSs and IVIFSs

Many approaches [15,18,4,21] have been introduced to compute the correlation coefficients of IFSs. Let $X = \{x_1, x_2, \ldots, x_n\}$ be a discrete universe of discourse, and let IFS($X$) denote the set of all the IFSs in $X$ and IVIFS($X$) the set of all the IVIFSs in $X$. For any $A, B \in$ IFS($X$), Gerstenkorn and Manko [15] extended the definition of informational energy given by Dumitrescu [51] to the case of IFSs, that is:

$$E_{\mathrm{IFS}}(A) = \sum_{i=1}^{n} \left( \mu_A^2(x_i) + \nu_A^2(x_i) \right). \tag{5}$$

The correlation of the IFSs $A$ and $B$ is defined as:

$$C_{\mathrm{IFS}_1}(A,B) = \sum_{i=1}^{n} \left( \mu_A(x_i)\mu_B(x_i) + \nu_A(x_i)\nu_B(x_i) \right). \tag{6}$$

They adopted the formula:

$$\rho_{\mathrm{IFS}_1}(A,B) = \frac{C_{\mathrm{IFS}_1}(A,B)}{\left[E_{\mathrm{IFS}}(A) \cdot E_{\mathrm{IFS}}(B)\right]^{1/2}} = \frac{\sum_{i=1}^{n} \left( \mu_A(x_i)\mu_B(x_i) + \nu_A(x_i)\nu_B(x_i) \right)}{\left[\sum_{i=1}^{n} \left( \mu_A^2(x_i) + \nu_A^2(x_i) \right)\right]^{1/2} \cdot \left[\sum_{i=1}^{n} \left( \mu_B^2(x_i) + \nu_B^2(x_i) \right)\right]^{1/2}} \tag{7}$$

to define the correlation coefficient of $A$ and $B$. Here, $\rho_{\mathrm{IFS}_1}(A,B)$ satisfies the following conditions:

(1) $0 \le \rho_{\mathrm{IFS}_1}(A,B) \le 1$;

(2) $A = B \Rightarrow \rho_{\mathrm{IFS}_1}(A,B) = 1$;

(3) $\rho_{\mathrm{IFS}_1}(A,B) = \rho_{\mathrm{IFS}_1}(B,A)$.
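Eqs. (5) to (7) translate almost directly into code. The following is a minimal sketch, assuming an IFS is stored as a list of (membership, non-membership) pairs, one per element of $X$; the function names and the sample data are hypothetical and chosen only for illustration.

```python
import math

def ifs_energy(A):
    """Informational energy of an IFS, Eq. (5)."""
    return sum(mu ** 2 + nu ** 2 for mu, nu in A)

def ifs_correlation(A, B):
    """Correlation of two IFSs over the same universe, Eq. (6)."""
    return sum(ma * mb + na * nb for (ma, na), (mb, nb) in zip(A, B))

def ifs_corr_coef(A, B):
    """Correlation coefficient of two IFSs, Eq. (7)."""
    return ifs_correlation(A, B) / math.sqrt(ifs_energy(A) * ifs_energy(B))

# Hypothetical data: (mu, nu) pairs with mu + nu <= 1 for each element.
A = [(0.5, 0.3), (0.7, 0.2), (0.1, 0.8)]
B = [(0.4, 0.4), (0.6, 0.3), (0.2, 0.6)]
rho_ab = ifs_corr_coef(A, B)
```

On any such data the three listed conditions can be checked numerically: the coefficient lies in [0, 1], equals 1 when both arguments coincide, and does not depend on the order of the arguments.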

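Bustince and Burillo [24] extended Eqs. (5)–(7) to deal with interval-valued intuitionistic fuzzy information. Let $\tilde{A}, \tilde{B} \in$ IVIFS($X$), then

$$E_{\mathrm{IVIFS}}(\tilde{A}) = \sum_{i=1}^{n} \frac{\tilde{\mu}_{\tilde{A}L}^2(x_i) + \tilde{\mu}_{\tilde{A}U}^2(x_i) + \tilde{\nu}_{\tilde{A}L}^2(x_i) + \tilde{\nu}_{\tilde{A}U}^2(x_i)}{2}, \tag{8}$$

$$C_{\mathrm{IVIFS}_1}(\tilde{A}, \tilde{B}) = \frac{1}{2} \sum_{i=1}^{n} \left( \tilde{\mu}_{\tilde{A}L}(x_i)\tilde{\mu}_{\tilde{B}L}(x_i) + \tilde{\mu}_{\tilde{A}U}(x_i)\tilde{\mu}_{\tilde{B}U}(x_i) + \tilde{\nu}_{\tilde{A}L}(x_i)\tilde{\nu}_{\tilde{B}L}(x_i) + \tilde{\nu}_{\tilde{A}U}(x_i)\tilde{\nu}_{\tilde{B}U}(x_i) \right) \tag{9}$$

and

$$\rho_{\mathrm{IVIFS}_1}(\tilde{A}, \tilde{B}) = \frac{C_{\mathrm{IVIFS}_1}(\tilde{A}, \tilde{B})}{\left[E_{\mathrm{IVIFS}}(\tilde{A}) \cdot E_{\mathrm{IVIFS}}(\tilde{B})\right]^{1/2}}. \tag{10}$$

Note that $\rho_{\mathrm{IVIFS}_1}(\tilde{A}, \tilde{B})$ satisfies all the properties that $\rho_{\mathrm{IFS}_1}(A,B)$ has.

Other forms of the correlation coefficient of $A$ and $B$ have also been reported in the literature. For instance, Xu et al. [21] suggested that:

$$\rho_{\mathrm{IFS}_2}(A,B) = \frac{\sum_{i=1}^{n} \left( \mu_A(x_i) \cdot \mu_B(x_i) + \nu_A(x_i) \cdot \nu_B(x_i) + \pi_A(x_i) \cdot \pi_B(x_i) \right)}{\max\left\{ \sum_{i=1}^{n} \left( \mu_A^2(x_i) + \nu_A^2(x_i) + \pi_A^2(x_i) \right),\ \sum_{i=1}^{n} \left( \mu_B^2(x_i) + \nu_B^2(x_i) + \pi_B^2(x_i) \right) \right\}} \tag{11}$$

and extended it to the case of IVIFSs.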

3. Correlation and correlation coefficients of HFSs

Let $X = \{x_1, x_2, \ldots, x_n\}$ be a discrete universe of discourse, and let $A$ and $B$ be two HFSs on $X$, denoted as $A = \{\langle x_i, h_A(x_i)\rangle \mid x_i \in X, i = 1,2,\ldots,n\}$ and $B = \{\langle x_i, h_B(x_i)\rangle \mid x_i \in X, i = 1,2,\ldots,n\}$, respectively.

The values of a HFE are usually given in no particular order, and for convenience, we arrange them in decreasing order. For a HFE $h$, let $\sigma: (1,2,\ldots,n) \to (1,2,\ldots,n)$ be a permutation satisfying $h_{\sigma(i)} \ge h_{\sigma(i+1)}$, $i = 1,2,\ldots,n-1$, and let $h_{\sigma(j)}$ be the $j$th largest value in $h$.

It is noted that the number of values in different HFEs may be different. To compute the correlation coefficients between two HFSs, let $l_i = \max\{l(h_A(x_i)), l(h_B(x_i))\}$ for each $x_i$ in $X$, where $l(h_A(x_i))$ and $l(h_B(x_i))$ represent the number of values in $h_A(x_i)$ and $h_B(x_i)$, respectively. When $l(h_B(x_i)) \ne l(h_A(x_i))$, the two HFEs can be given the same number of elements by adding elements to the shorter one. Under the pessimistic principle, the smallest element is added; in the opposite case, the optimistic principle (adding the largest element) may be adopted. In the present work, we use the pessimistic principle. In particular, if $l(h_A(x_i)) < l(h_B(x_i))$, then $h_A(x_i)$ is extended by repeating its minimum value until it has the same length as $h_B(x_i)$. This idea has been successfully applied to distance and similarity measures for HFSs [33].
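The ordering and extension step described above can be sketched in a few lines of Python. The function name `extend_hfe` and its `pessimistic` flag are hypothetical names introduced for this illustration.

```python
def extend_hfe(hfe, length, pessimistic=True):
    """Sort an HFE in decreasing order and pad it to `length` values.

    Pessimistic principle: repeat the smallest value.
    Optimistic principle: repeat the largest value.
    An HFE that already has `length` values or more is only sorted.
    """
    values = sorted(hfe, reverse=True)
    filler = min(values) if pessimistic else max(values)
    padded = values + [filler] * (length - len(values))
    # Re-sort so the optimistic filler also ends up in decreasing order.
    return sorted(padded, reverse=True)

short = [0.3, 0.4]
pessimistic_result = extend_hfe(short, 3)                    # [0.4, 0.3, 0.3]
optimistic_result = extend_hfe(short, 3, pessimistic=False)  # [0.4, 0.4, 0.3]
```

The pessimistic variant is the one used in the remainder of this section.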

Similar to the existing works [24,15], we define the informational energy for HFSs and the corresponding correlation.

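Definition 6. For a HFS $A = \{\langle x_i, h_A(x_i)\rangle \mid x_i \in X, i = 1,2,\ldots,n\}$, the informational energy of the set $A$ is defined as:

$$E_{\mathrm{HFS}}(A) = \sum_{i=1}^{n} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}^2(x_i) \right). \tag{12}$$

Definition 7. For two HFSs $A$ and $B$, their correlation is defined by

$$C_{\mathrm{HFS}_1}(A,B) = \sum_{i=1}^{n} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}(x_i)\, h_{B\sigma(j)}(x_i) \right). \tag{13}$$

For any two HFSs $A$ and $B$, the correlation (13) satisfies:

(1) $C_{\mathrm{HFS}_1}(A,A) = E_{\mathrm{HFS}}(A)$;

(2) $C_{\mathrm{HFS}_1}(A,B) = C_{\mathrm{HFS}_1}(B,A)$.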

Using Definitions 6 and 7, we derive a correlation coefficient for HFSs:

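Definition 8. The correlation coefficient between two HFSs $A$ and $B$ is given as:

$$\rho_{\mathrm{HFS}_1}(A,B) = \frac{C_{\mathrm{HFS}_1}(A,B)}{\left[C_{\mathrm{HFS}_1}(A,A)\right]^{1/2} \cdot \left[C_{\mathrm{HFS}_1}(B,B)\right]^{1/2}} = \frac{\sum_{i=1}^{n} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}(x_i)\, h_{B\sigma(j)}(x_i) \right)}{\left[ \sum_{i=1}^{n} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}^2(x_i) \right) \right]^{1/2} \cdot \left[ \sum_{i=1}^{n} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{B\sigma(j)}^2(x_i) \right) \right]^{1/2}}. \tag{14}$$

Theorem 1. The correlation coefficient between two HFSs $A$ and $B$, $\rho_{\mathrm{HFS}_1}(A,B)$, satisfies:

(1) $\rho_{\mathrm{HFS}_1}(A,B) = \rho_{\mathrm{HFS}_1}(B,A)$;

(2) $0 \le \rho_{\mathrm{HFS}_1}(A,B) \le 1$;

(3) $\rho_{\mathrm{HFS}_1}(A,B) = 1$, if $A = B$.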

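Proof 1

(1) It is straightforward.

(2) The inequality $\rho_{\mathrm{HFS}_1}(A,B) \ge 0$ is obvious. Below let us prove $\rho_{\mathrm{HFS}_1}(A,B) \le 1$:

$$\begin{aligned}
C_{\mathrm{HFS}_1}(A,B) &= \sum_{i=1}^{n} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}(x_i)\, h_{B\sigma(j)}(x_i) \right) \\
&= \frac{1}{l_1} \sum_{j=1}^{l_1} h_{A\sigma(j)}(x_1)\, h_{B\sigma(j)}(x_1) + \frac{1}{l_2} \sum_{j=1}^{l_2} h_{A\sigma(j)}(x_2)\, h_{B\sigma(j)}(x_2) + \cdots + \frac{1}{l_n} \sum_{j=1}^{l_n} h_{A\sigma(j)}(x_n)\, h_{B\sigma(j)}(x_n) \\
&= \sum_{j=1}^{l_1} \frac{h_{A\sigma(j)}(x_1)}{\sqrt{l_1}} \cdot \frac{h_{B\sigma(j)}(x_1)}{\sqrt{l_1}} + \sum_{j=1}^{l_2} \frac{h_{A\sigma(j)}(x_2)}{\sqrt{l_2}} \cdot \frac{h_{B\sigma(j)}(x_2)}{\sqrt{l_2}} + \cdots + \sum_{j=1}^{l_n} \frac{h_{A\sigma(j)}(x_n)}{\sqrt{l_n}} \cdot \frac{h_{B\sigma(j)}(x_n)}{\sqrt{l_n}}.
\end{aligned}$$

Using the Cauchy–Schwarz inequality:

$$(x_1 y_1 + x_2 y_2 + \cdots + x_n y_n)^2 \le (x_1^2 + x_2^2 + \cdots + x_n^2) \cdot (y_1^2 + y_2^2 + \cdots + y_n^2),$$

where $(x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$, $(y_1, y_2, \ldots, y_n) \in \mathbb{R}^n$, we obtain:

$$\begin{aligned}
\left(C_{\mathrm{HFS}_1}(A,B)\right)^2 &\le \left[ \sum_{j=1}^{l_1} \frac{1}{l_1} h_{A\sigma(j)}^2(x_1) + \sum_{j=1}^{l_2} \frac{1}{l_2} h_{A\sigma(j)}^2(x_2) + \cdots + \sum_{j=1}^{l_n} \frac{1}{l_n} h_{A\sigma(j)}^2(x_n) \right] \\
&\quad \cdot \left[ \sum_{j=1}^{l_1} \frac{1}{l_1} h_{B\sigma(j)}^2(x_1) + \sum_{j=1}^{l_2} \frac{1}{l_2} h_{B\sigma(j)}^2(x_2) + \cdots + \sum_{j=1}^{l_n} \frac{1}{l_n} h_{B\sigma(j)}^2(x_n) \right] \\
&= \left[ \frac{1}{l_1} \sum_{j=1}^{l_1} h_{A\sigma(j)}^2(x_1) + \frac{1}{l_2} \sum_{j=1}^{l_2} h_{A\sigma(j)}^2(x_2) + \cdots + \frac{1}{l_n} \sum_{j=1}^{l_n} h_{A\sigma(j)}^2(x_n) \right] \\
&\quad \cdot \left[ \frac{1}{l_1} \sum_{j=1}^{l_1} h_{B\sigma(j)}^2(x_1) + \frac{1}{l_2} \sum_{j=1}^{l_2} h_{B\sigma(j)}^2(x_2) + \cdots + \frac{1}{l_n} \sum_{j=1}^{l_n} h_{B\sigma(j)}^2(x_n) \right] \\
&= \left[ \sum_{i=1}^{n} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}^2(x_i) \right) \right] \cdot \left[ \sum_{i=1}^{n} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{B\sigma(j)}^2(x_i) \right) \right] = C_{\mathrm{HFS}_1}(A,A) \cdot C_{\mathrm{HFS}_1}(B,B).
\end{aligned}$$

Therefore

$$C_{\mathrm{HFS}_1}(A,B) \le \left(C_{\mathrm{HFS}_1}(A,A)\right)^{1/2} \cdot \left(C_{\mathrm{HFS}_1}(B,B)\right)^{1/2}.$$

So, $0 \le \rho_{\mathrm{HFS}_1}(A,B) \le 1$.

(3) $A = B \Rightarrow h_{A\sigma(j)}(x_i) = h_{B\sigma(j)}(x_i),\ x_i \in X \Rightarrow \rho_{\mathrm{HFS}_1}(A,B) = 1$. $\square$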

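Example 2. Let $A$ and $B$ be two HFSs in $X = \{x_1, x_2, x_3\}$, and

$$A = \{\langle x_1, \{0.7, 0.5\}\rangle,\ \langle x_2, \{0.9, 0.8, 0.6\}\rangle,\ \langle x_3, \{0.5, 0.4, 0.2\}\rangle\},$$

$$B = \{\langle x_1, \{0.4, 0.2\}\rangle,\ \langle x_2, \{0.8, 0.5, 0.4\}\rangle,\ \langle x_3, \{0.7, 0.6, 0.3\}\rangle\}.$$

By using Eq. (12), we calculate:

$$\begin{aligned}
C_{\mathrm{HFS}_1}(A,A) = E_{\mathrm{HFS}}(A) &= \sum_{i=1}^{3} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}^2(x_i) \right) = \frac{1}{2} \sum_{j=1}^{2} h_{A\sigma(j)}^2(x_1) + \frac{1}{3} \sum_{j=1}^{3} h_{A\sigma(j)}^2(x_2) + \frac{1}{3} \sum_{j=1}^{3} h_{A\sigma(j)}^2(x_3) \\
&= \frac{1}{2}(0.7^2 + 0.5^2) + \frac{1}{3}(0.9^2 + 0.8^2 + 0.6^2) + \frac{1}{3}(0.5^2 + 0.4^2 + 0.2^2) = 1.1233,
\end{aligned}$$

and similarly:

$$\begin{aligned}
C_{\mathrm{HFS}_1}(B,B) = E_{\mathrm{HFS}}(B) &= \sum_{i=1}^{3} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{B\sigma(j)}^2(x_i) \right) = \frac{1}{2} \sum_{j=1}^{2} h_{B\sigma(j)}^2(x_1) + \frac{1}{3} \sum_{j=1}^{3} h_{B\sigma(j)}^2(x_2) + \frac{1}{3} \sum_{j=1}^{3} h_{B\sigma(j)}^2(x_3) \\
&= \frac{1}{2}(0.4^2 + 0.2^2) + \frac{1}{3}(0.8^2 + 0.5^2 + 0.4^2) + \frac{1}{3}(0.7^2 + 0.6^2 + 0.3^2) = 0.7633.
\end{aligned}$$

With Eq. (13), we obtain:

$$\begin{aligned}
C_{\mathrm{HFS}_1}(A,B) &= \sum_{i=1}^{3} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}(x_i)\, h_{B\sigma(j)}(x_i) \right) \\
&= \frac{1}{2} \sum_{j=1}^{2} h_{A\sigma(j)}(x_1) h_{B\sigma(j)}(x_1) + \frac{1}{3} \sum_{j=1}^{3} h_{A\sigma(j)}(x_2) h_{B\sigma(j)}(x_2) + \frac{1}{3} \sum_{j=1}^{3} h_{A\sigma(j)}(x_3) h_{B\sigma(j)}(x_3) \\
&= \frac{1}{2}(0.7 \times 0.4 + 0.5 \times 0.2) + \frac{1}{3}(0.9 \times 0.8 + 0.8 \times 0.5 + 0.6 \times 0.4) + \frac{1}{3}(0.5 \times 0.7 + 0.4 \times 0.6 + 0.2 \times 0.3) \\
&= 0.86.
\end{aligned}$$

Finally, we use Eq. (14) to calculate the correlation coefficient:

$$\rho_{\mathrm{HFS}_1}(A,B) = \frac{C_{\mathrm{HFS}_1}(A,B)}{\left(C_{\mathrm{HFS}_1}(A,A)\right)^{1/2} \cdot \left(C_{\mathrm{HFS}_1}(B,B)\right)^{1/2}} = \frac{0.86}{\sqrt{1.1233} \times \sqrt{0.7633}} = 0.9288.$$

Obviously, $0 < \rho_{\mathrm{HFS}_1}(A,B) < 1$.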
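The arithmetic of Example 2 can be re-derived with a short script. This is a sketch under two assumptions: each HFS is stored as a list of HFEs in the order $x_1, x_2, x_3$, and corresponding HFEs already have equal lengths (true for this example, so no extension step is needed). The function names are hypothetical.

```python
import math

def correlation(A, B):
    """C_HFS1(A, B) of Eq. (13), assuming equal-length HFEs per element."""
    total = 0.0
    for ha, hb in zip(A, B):
        ha = sorted(ha, reverse=True)  # arrange values in decreasing order
        hb = sorted(hb, reverse=True)
        total += sum(a * b for a, b in zip(ha, hb)) / len(ha)
    return total

def corr_coef(A, B):
    """rho_HFS1(A, B) of Eq. (14)."""
    return correlation(A, B) / math.sqrt(correlation(A, A) * correlation(B, B))

# Data of Example 2.
A = [[0.7, 0.5], [0.9, 0.8, 0.6], [0.5, 0.4, 0.2]]
B = [[0.4, 0.2], [0.8, 0.5, 0.4], [0.7, 0.6, 0.3]]

energy_a = correlation(A, A)   # about 1.1233
energy_b = correlation(B, B)   # about 0.7633
rho = corr_coef(A, B)          # about 0.929
```

Without rounding the intermediate energies, the coefficient evaluates to roughly 0.9287; the value 0.9288 arises when the energies are first rounded to four decimals, so the two agree to three decimal places.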

In what follows we give a new formula of calculating the correlation coefficient, which is similar to that used in IFSs [21]:

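Definition 9. For two HFSs $A$ and $B$, their correlation coefficient is defined by

$$\rho_{\mathrm{HFS}_2}(A,B) = \frac{C_{\mathrm{HFS}_1}(A,B)}{\max\{C_{\mathrm{HFS}_1}(A,A),\ C_{\mathrm{HFS}_1}(B,B)\}} = \frac{\sum_{i=1}^{n} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}(x_i)\, h_{B\sigma(j)}(x_i) \right)}{\max\left\{ \sum_{i=1}^{n} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}^2(x_i) \right),\ \sum_{i=1}^{n} \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{B\sigma(j)}^2(x_i) \right) \right\}}. \tag{15}$$

Theorem 2. The correlation coefficient of two HFSs $A$ and $B$, $\rho_{\mathrm{HFS}_2}(A,B)$, satisfies the same properties as those listed in Theorem 1.

Proof 2. The proofs of properties (1) and (3) are analogous to those in Theorem 1, so we do not repeat them here.

(2) $\rho_{\mathrm{HFS}_2}(A,B) \ge 0$ is obvious. We now only prove $\rho_{\mathrm{HFS}_2}(A,B) \le 1$.

Based on the proof of Theorem 1, we have:

$$C_{\mathrm{HFS}_1}(A,B) \le \left(C_{\mathrm{HFS}_1}(A,A)\right)^{1/2} \cdot \left(C_{\mathrm{HFS}_1}(B,B)\right)^{1/2},$$

and then, since $\sqrt{ab} \le \max\{a, b\}$ for any $a, b \ge 0$,

$$C_{\mathrm{HFS}_1}(A,B) \le \max\{C_{\mathrm{HFS}_1}(A,A),\ C_{\mathrm{HFS}_1}(B,B)\}.$$

Thus, $\rho_{\mathrm{HFS}_2}(A,B) \le 1$. $\square$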
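As a hedged worked illustration (this calculation is not part of the original text), Eq. (15) can be applied to the two HFSs of Example 2 by reusing the quantities computed there, namely $C_{\mathrm{HFS}_1}(A,B) = 0.86$, $C_{\mathrm{HFS}_1}(A,A) = 1.1233$ and $C_{\mathrm{HFS}_1}(B,B) = 0.7633$:

```latex
\rho_{\mathrm{HFS}_2}(A,B)
  = \frac{0.86}{\max\{1.1233,\ 0.7633\}}
  = \frac{0.86}{1.1233}
  \approx 0.7656
```

The result is smaller than the value 0.9288 obtained with Eq. (14) for the same data, which is consistent with the fact that the maximum of two nonnegative numbers is never smaller than their geometric mean.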

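In practical applications, the elements $x_i$ ($i = 1,2,\ldots,n$) in the universe $X$ have different weights. Let $w = (w_1, w_2, \ldots, w_n)^T$ be the weight vector of $x_i$ ($i = 1,2,\ldots,n$) with $w_i \ge 0$, $i = 1,2,\ldots,n$ and $\sum_{i=1}^{n} w_i = 1$. We further extend the correlation coefficient formulas given in Eqs. (14) and (15) as:

$$\rho_{\mathrm{HFS}_3}(A,B) = \frac{C_{\mathrm{HFS}_2}(A,B)}{\left[C_{\mathrm{HFS}_2}(A,A)\right]^{1/2} \cdot \left[C_{\mathrm{HFS}_2}(B,B)\right]^{1/2}} = \frac{\sum_{i=1}^{n} w_i \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}(x_i)\, h_{B\sigma(j)}(x_i) \right)}{\left[ \sum_{i=1}^{n} w_i \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}^2(x_i) \right) \right]^{1/2} \cdot \left[ \sum_{i=1}^{n} w_i \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{B\sigma(j)}^2(x_i) \right) \right]^{1/2}}, \tag{16}$$

$$\rho_{\mathrm{HFS}_4}(A,B) = \frac{C_{\mathrm{HFS}_2}(A,B)}{\max\{C_{\mathrm{HFS}_2}(A,A),\ C_{\mathrm{HFS}_2}(B,B)\}} = \frac{\sum_{i=1}^{n} w_i \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}(x_i)\, h_{B\sigma(j)}(x_i) \right)}{\max\left\{ \sum_{i=1}^{n} w_i \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}^2(x_i) \right),\ \sum_{i=1}^{n} w_i \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{B\sigma(j)}^2(x_i) \right) \right\}}. \tag{17}$$

It can be seen that if $w = (1/n, 1/n, \ldots, 1/n)^T$, then Eqs. (16) and (17) reduce to Eqs. (14) and (15), respectively. Note that both $\rho_{\mathrm{HFS}_3}(A,B)$ and $\rho_{\mathrm{HFS}_4}(A,B)$ also satisfy the three properties of Theorem 1.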

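Theorem 3. Let $w = (w_1, w_2, \ldots, w_n)^T$ be the weight vector of $x_i$ ($i = 1,2,\ldots,n$) with $w_i \ge 0$, $i = 1,2,\ldots,n$ and $\sum_{i=1}^{n} w_i = 1$. The correlation coefficient $\rho_{\mathrm{HFS}_3}(A,B)$ between two HFSs $A$ and $B$ defined in Eq. (16), which takes into account the weights, satisfies:

(1) $\rho_{\mathrm{HFS}_3}(A,B) = \rho_{\mathrm{HFS}_3}(B,A)$;

(2) $0 \le \rho_{\mathrm{HFS}_3}(A,B) \le 1$;

(3) $\rho_{\mathrm{HFS}_3}(A,B) = 1$, if $A = B$.

Proof 3

(1) It is straightforward.

(2) $\rho_{\mathrm{HFS}_3}(A,B) \ge 0$ is obvious. Below we prove $\rho_{\mathrm{HFS}_3}(A,B) \le 1$. Since

$$\begin{aligned}
C_{\mathrm{HFS}_2}(A,B) &= \sum_{i=1}^{n} w_i \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}(x_i)\, h_{B\sigma(j)}(x_i) \right) \\
&= \frac{w_1}{l_1} \sum_{j=1}^{l_1} h_{A\sigma(j)}(x_1) h_{B\sigma(j)}(x_1) + \frac{w_2}{l_2} \sum_{j=1}^{l_2} h_{A\sigma(j)}(x_2) h_{B\sigma(j)}(x_2) + \cdots + \frac{w_n}{l_n} \sum_{j=1}^{l_n} h_{A\sigma(j)}(x_n) h_{B\sigma(j)}(x_n) \\
&= \sum_{j=1}^{l_1} \frac{\sqrt{w_1}\, h_{A\sigma(j)}(x_1)}{\sqrt{l_1}} \cdot \frac{\sqrt{w_1}\, h_{B\sigma(j)}(x_1)}{\sqrt{l_1}} + \cdots + \sum_{j=1}^{l_n} \frac{\sqrt{w_n}\, h_{A\sigma(j)}(x_n)}{\sqrt{l_n}} \cdot \frac{\sqrt{w_n}\, h_{B\sigma(j)}(x_n)}{\sqrt{l_n}},
\end{aligned}$$

and by using the Cauchy–Schwarz inequality, we obtain:

$$\begin{aligned}
\left(C_{\mathrm{HFS}_2}(A,B)\right)^2 &\le \left[ \sum_{j=1}^{l_1} \frac{w_1}{l_1} h_{A\sigma(j)}^2(x_1) + \sum_{j=1}^{l_2} \frac{w_2}{l_2} h_{A\sigma(j)}^2(x_2) + \cdots + \sum_{j=1}^{l_n} \frac{w_n}{l_n} h_{A\sigma(j)}^2(x_n) \right] \\
&\quad \cdot \left[ \sum_{j=1}^{l_1} \frac{w_1}{l_1} h_{B\sigma(j)}^2(x_1) + \sum_{j=1}^{l_2} \frac{w_2}{l_2} h_{B\sigma(j)}^2(x_2) + \cdots + \sum_{j=1}^{l_n} \frac{w_n}{l_n} h_{B\sigma(j)}^2(x_n) \right] \\
&= \left[ \frac{w_1}{l_1} \sum_{j=1}^{l_1} h_{A\sigma(j)}^2(x_1) + \frac{w_2}{l_2} \sum_{j=1}^{l_2} h_{A\sigma(j)}^2(x_2) + \cdots + \frac{w_n}{l_n} \sum_{j=1}^{l_n} h_{A\sigma(j)}^2(x_n) \right] \\
&\quad \cdot \left[ \frac{w_1}{l_1} \sum_{j=1}^{l_1} h_{B\sigma(j)}^2(x_1) + \frac{w_2}{l_2} \sum_{j=1}^{l_2} h_{B\sigma(j)}^2(x_2) + \cdots + \frac{w_n}{l_n} \sum_{j=1}^{l_n} h_{B\sigma(j)}^2(x_n) \right] \\
&= \left[ \sum_{i=1}^{n} w_i \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{A\sigma(j)}^2(x_i) \right) \right] \cdot \left[ \sum_{i=1}^{n} w_i \left( \frac{1}{l_i} \sum_{j=1}^{l_i} h_{B\sigma(j)}^2(x_i) \right) \right] = C_{\mathrm{HFS}_2}(A,A) \cdot C_{\mathrm{HFS}_2}(B,B).
\end{aligned}$$

Thus:

$$C_{\mathrm{HFS}_2}(A,B) \le \left(C_{\mathrm{HFS}_2}(A,A)\right)^{1/2} \cdot \left(C_{\mathrm{HFS}_2}(B,B)\right)^{1/2}.$$

That is, $\rho_{\mathrm{HFS}_3}(A,B) \le 1$.

(3) $A = B \Rightarrow h_{A\sigma(j)}(x_i) = h_{B\sigma(j)}(x_i),\ x_i \in X \Rightarrow \rho_{\mathrm{HFS}_3}(A,B) = 1$. $\square$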

Theorem 4. The correlation coefficient of two HFSs $A$ and $B$ defined in Eq. (17), which accounts for the weights, $\rho_{\mathrm{HFS}_4}(A,B)$, satisfies the same properties as those in Theorem 3.

Since the proof of these properties is analogous to that of Theorem 2, we do not repeat it here.

Example 3. Let $A_1$, $A_2$, and $A_3$ be three HFSs in $X = \{x_1, x_2, x_3\}$, $w = (0.3, 0.3, 0.4)^T$ be the weight vector of $x_i$ ($i = 1,2,3$), and

$$A_1 = \{\langle x_1, \{0.9, 0.8, 0.5\}\rangle,\ \langle x_2, \{0.2, 0.1\}\rangle,\ \langle x_3, \{0.5, 0.3, 0.2, 0.1\}\rangle\},$$

$$A_2 = \{\langle x_1, \{0.7, 0.5, 0.4\}\rangle,\ \langle x_2, \{0.5, 0.3\}\rangle,\ \langle x_3, \{0.6, 0.4, 0.3, 0.1\}\rangle\},$$

$$A_3 = \{\langle x_1, \{0.3, 0.2, 0.1\}\rangle,\ \langle x_2, \{0.3, 0.2\}\rangle,\ \langle x_3, \{0.8, 0.7, 0.5, 0.4\}\rangle\}.$$

From Eq. (16), we can obtain $\rho_{\mathrm{HFS}_3}(A_1,A_2) = 0.9135$, $\rho_{\mathrm{HFS}_3}(A_1,A_3) = 0.6700$ and $\rho_{\mathrm{HFS}_3}(A_2,A_3) = 0.8278$. Obviously, $\rho_{\mathrm{HFS}_3}(A_1,A_2) > \rho_{\mathrm{HFS}_3}(A_2,A_3) > \rho_{\mathrm{HFS}_3}(A_1,A_3)$.
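The three coefficients of Example 3 can be re-derived with a small script for the weighted formula in Eq. (16). This is a sketch that assumes corresponding HFEs already have equal lengths (true for this example) and stores each HFS as a list of HFEs ordered as $x_1, x_2, x_3$; the function names are hypothetical.

```python
import math

def weighted_correlation(A, B, w):
    """C_HFS2(A, B): weighted correlation used in Eq. (16).

    Assumes A[i] and B[i] have the same number of values.
    """
    total = 0.0
    for wi, ha, hb in zip(w, A, B):
        ha = sorted(ha, reverse=True)
        hb = sorted(hb, reverse=True)
        total += wi * sum(a * b for a, b in zip(ha, hb)) / len(ha)
    return total

def weighted_corr_coef(A, B, w):
    """rho_HFS3(A, B) of Eq. (16)."""
    denom = math.sqrt(weighted_correlation(A, A, w) * weighted_correlation(B, B, w))
    return weighted_correlation(A, B, w) / denom

# Data of Example 3.
w = [0.3, 0.3, 0.4]
A1 = [[0.9, 0.8, 0.5], [0.2, 0.1], [0.5, 0.3, 0.2, 0.1]]
A2 = [[0.7, 0.5, 0.4], [0.5, 0.3], [0.6, 0.4, 0.3, 0.1]]
A3 = [[0.3, 0.2, 0.1], [0.3, 0.2], [0.8, 0.7, 0.5, 0.4]]

r12 = weighted_corr_coef(A1, A2, w)
r13 = weighted_corr_coef(A1, A3, w)
r23 = weighted_corr_coef(A2, A3, w)
```

Rounded to four decimals, the script gives values matching 0.9135, 0.6700 and 0.8278, and it preserves the ordering stated in the example.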

4. Clustering algorithm for HFSs

Based on the intuitionistic fuzzy clustering algorithm [21] and the correlation coefficient formulas developed above for HFSs, we now develop an algorithm for clustering under hesitant fuzzy environments. Some concepts are introduced first:

Definition 10. Let $A_j$ ($j = 1,2,\ldots,m$) be $m$ HFSs, and $C = (\rho_{ij})_{m \times m}$ be a correlation matrix, where $\rho_{ij} = \rho(A_i, A_j)$ denotes the correlation coefficient of two HFSs $A_i$ and $A_j$ and satisfies:

(1) $0 \le \rho_{ij} \le 1$, $i, j = 1,2,\ldots,m$;

(2) $\rho_{ii} = 1$, $i = 1,2,\ldots,m$;

(3) $\rho_{ij} = \rho_{ji}$, $i, j = 1,2,\ldots,m$.

Definition 11 ([21]). Let $C = (\rho_{ij})_{m \times m}$ be a correlation matrix. If $C^2 = C \circ C = (\bar{\rho}_{ij})_{m \times m}$, then $C^2$ is called a composition matrix of $C$, where

$$\bar{\rho}_{ij} = \max_{k} \{\min\{\rho_{ik}, \rho_{kj}\}\}, \quad i, j = 1,2,\ldots,m. \tag{18}$$

Theorem 5 ([21]). Let $C = (\rho_{ij})_{m \times m}$ be a correlation matrix. Then the composition matrix $C^2 = C \circ C = (\bar{\rho}_{ij})_{m \times m}$ is also a correlation matrix.

Theorem 6 ([21]). Let $C$ be a correlation matrix. Then for any nonnegative integers $m_1$ and $m_2$, the composition matrix $C^{m_1 + m_2}$ derived from $C^{m_1 + m_2} = C^{m_1} \circ C^{m_2}$ is still a correlation matrix.

Definition 12 ([21]). Let $C = (\rho_{ij})_{m \times m}$ be a correlation matrix. If $C^2 \subseteq C$, i.e.

$$\max_{k} \{\min\{\rho_{ik}, \rho_{kj}\}\} \le \rho_{ij}, \quad i, j = 1,2,\ldots,m, \tag{19}$$

then $C$ is called an equivalent correlation matrix.

Theorem 7 ([52,21]). Let $C = (\rho_{ij})_{m \times m}$ be a correlation matrix. Then after finitely many compositions $C \to C^2 \to C^4 \to \cdots \to C^{2^k} \to \cdots$, there must exist a positive integer $k$ such that $C^{2^k} = C^{2^{(k+1)}}$, and $C^{2^k}$ is also an equivalent correlation matrix.
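The max-min composition of Eq. (18) and the repeated squaring of Theorem 7 can be sketched as follows. The function names and the 3×3 matrix are hypothetical, and the loop relies on the input being a correlation matrix in the sense of Definition 10 (unit diagonal, symmetric), for which Theorem 7 guarantees that the iteration stops.

```python
def compose(C, D):
    """Max-min composition of two square matrices, Eq. (18)."""
    m = len(C)
    return [[max(min(C[i][k], D[k][j]) for k in range(m)) for j in range(m)]
            for i in range(m)]

def equivalent_matrix(C):
    """Square C repeatedly (C, C^2, C^4, ...) until it no longer changes.

    Assumes C is a correlation matrix, so a fixed point is reached
    after finitely many steps (Theorem 7).
    """
    while True:
        squared = compose(C, C)
        if squared == C:
            return C
        C = squared

# Hypothetical correlation matrix used only to exercise the functions.
C = [[1.0, 0.8, 0.4],
     [0.8, 1.0, 0.6],
     [0.4, 0.6, 1.0]]
C_eq = equivalent_matrix(C)
```

Here the entry linking the first and third objects rises from 0.4 to 0.6, because the path through the second object has a weakest link of 0.6; one squaring is enough for this small matrix.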

Definition 13 ([21]). Let $C = (\rho_{ij})_{m \times m}$ be an equivalent correlation matrix. Then we call $C_\lambda = ({}_\lambda\rho_{ij})_{m \times m}$ the $\lambda$-cutting matrix of $C$, where

$${}_\lambda\rho_{ij} = \begin{cases} 0 & \text{if } \rho_{ij} < \lambda, \\ 1 & \text{if } \rho_{ij} \ge \lambda, \end{cases} \quad i, j = 1,2,\ldots,m, \tag{20}$$

and $\lambda$ is the confidence level with $\lambda \in [0,1]$.

We now propose an algorithm for clustering HFSs as follows (Algorithm-HFSC):

Step 1. Let $\{A_1, A_2, \ldots, A_m\}$ be a set of HFSs in $X = \{x_1, x_2, \ldots, x_n\}$. Using Eq. (16) or Eq. (17), we calculate the correlation coefficients of the HFSs, and then construct a correlation matrix $C = (\rho_{ij})_{m \times m}$, where $\rho_{ij} = \rho(A_i, A_j)$.

Step 2. Check whether $C = (\rho_{ij})_{m \times m}$ is an equivalent correlation matrix, i.e. check whether it satisfies $C^2 \subseteq C$, where

$$C^2 = C \circ C = (\bar{\rho}_{ij})_{m \times m}, \quad \bar{\rho}_{ij} = \max_{k} \{\min\{\rho_{ik}, \rho_{kj}\}\}, \quad i, j = 1,2,\ldots,m.$$

If it does not hold, we construct the equivalent correlation matrix $C^{2^k}$:

$$C \to C^2 \to C^4 \to \cdots \to C^{2^k} \to \cdots, \quad \text{until } C^{2^k} = C^{2^{(k+1)}}.$$

Step 3. For a confidence level $\lambda$, we construct a $\lambda$-cutting matrix $C_\lambda = ({}_\lambda\rho_{ij})_{m \times m}$ through Definition 13 in order to classify the HFSs $A_j$ ($j = 1,2,\ldots,m$). If all elements of the $i$th row (column) in $C_\lambda$ are the same as the corresponding elements of the $j$th row (column) in $C_\lambda$, then the HFSs $A_i$ and $A_j$ are of the same type. By means of this principle, we can classify all these $m$ HFSs $A_j$ ($j = 1,2,\ldots,m$).
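Step 3 can be sketched as a grouping of indices whose rows in the cutting matrix coincide. The sketch below starts from an equivalent correlation matrix that is assumed to be already available (a small hypothetical one is hard-coded); `lambda_cut` and `clusters` are illustrative names, and `lam` stands for the confidence level.

```python
def lambda_cut(C_eq, lam):
    """Cutting matrix of Definition 13: 1 where the entry is >= lam, else 0."""
    return [[1 if value >= lam else 0 for value in row] for row in C_eq]

def clusters(C_eq, lam):
    """Group the indices whose rows in the cutting matrix are identical."""
    groups = {}
    for index, row in enumerate(lambda_cut(C_eq, lam)):
        groups.setdefault(tuple(row), []).append(index)
    return list(groups.values())

# Hypothetical equivalent correlation matrix (it satisfies C o C = C).
C_eq = [[1.0, 0.8, 0.6],
        [0.8, 1.0, 0.6],
        [0.6, 0.6, 1.0]]

coarse = clusters(C_eq, 0.5)   # one cluster containing all three objects
medium = clusters(C_eq, 0.7)   # objects 0 and 1 together, object 2 alone
fine = clusters(C_eq, 0.9)     # every object in its own cluster
```

Raising the confidence level can only split clusters, never merge them, so scanning the distinct entries of the equivalent matrix in increasing order enumerates every possible classification.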

Below two real examples are employed to illustrate the need of the clustering algorithm based on HFSs:

Example 4. Software evaluation and classification is an increasingly important problem in any sector of human activity.Industrial production, service provisioning and business administration heavily depend on software which is more and morecomplex and expensive [53]. A CASE tool to support the production of software in a CIM environment has to be selected fromthe ones offered on the market. CIM software typically has responsibility for production planning, production control andmonitoring [54].

To better evaluate different types of CIM softwares Ai(i = 1,2, . . . ,7) on the market, we perform clustering for themaccording to four attributes: x1: functionality, x2: usability, x3: portability, and x4: maturity. Given the experts who makesuch an evaluation have different backgrounds and levels of knowledge, skills, experience and personality, etc., this couldlead to a difference in the evaluation information. To clearly reflect the differences of the opinions of different experts, thedata of evaluation information are represented by the HFSs and listed in Table 1.

Step 1. Calculate the correlation coefficients of the HFSs Aj(j = 1,2, . . . ,7) by using Eq. (16) with the weighting vector w =(0.35,0.30,0.15,0.2)T, and let li ¼maxflðhAj

ðxiÞÞg; j ¼ 1;2; . . . ;7. Then the derived correlation matrix is:

C ¼

1:0000 0:9531 0:8461 0:8192 0:9182 0:9686 0:82330:9531 1:0000 0:6573 0:8128 0:8861 0:9418 0:82920:8461 0:6573 1:0000 0:6722 0:8041 0:7939 0:67320:8192 0:8128 0:6722 1:0000 0:6243 0:6855 0:99060:9182 0:8861 0:8041 0:6243 1:0000 0:9702 0:66710:9686 0:9418 0:7639 0:6855 0:9702 1:0000 0:70740:8233 0:8292 0:6732 0:9906 0:6671 0:7074 1:0000

0BBBBBBBBBBB@

1CCCCCCCCCCCA:

N. Chen et al. / Applied Mathematical Modelling 37 (2013) 2197–2211 2205

Step 2. Construct the equivalent correlation matrix and calculate:

Table 1Hesitan

A1

A2

A3

A4

A5

A6

A7

C2 ¼ C � C ¼

1:0000 0:9531 0:8461 0:8233 0:9686 0:9686 0:82920:9531 1:0000 0:8461 0:8292 0:9418 0:9531 0:82920:8461 0:8461 1:0000 0:8192 0:8461 0:8461 0:82330:8233 0:8292 0:8192 1:0000 0:8192 0:8192 0:99060:9686 0:9418 0:8461 0:8192 1:0000 0:9702 0:82920:9686 0:9531 0:8461 0:8192 0:9702 1:0000 0:82920:8292 0:8292 0:8233 0:9906 0:8292 0:8292 1:0000

0BBBBBBBBBBB@

1CCCCCCCCCCCA:

It can be seen that $C^2 \subseteq C$ does not hold. That is to say, the correlation matrix $C$ is not an equivalent correlation matrix. So, we further calculate:

\[
C^4 = C^2 \circ C^2 = \begin{pmatrix}
1.0000 & 0.9531 & 0.8461 & 0.8292 & 0.9686 & 0.9686 & 0.8292 \\
0.9531 & 1.0000 & 0.8461 & 0.8292 & 0.9531 & 0.9531 & 0.8292 \\
0.8461 & 0.8461 & 1.0000 & 0.8292 & 0.8461 & 0.8461 & 0.8292 \\
0.8292 & 0.8292 & 0.8292 & 1.0000 & 0.8292 & 0.8292 & 0.9906 \\
0.9686 & 0.9531 & 0.8461 & 0.8292 & 1.0000 & 0.9702 & 0.8292 \\
0.9686 & 0.9531 & 0.8461 & 0.8292 & 0.9702 & 1.0000 & 0.8292 \\
0.8292 & 0.8292 & 0.8292 & 0.9906 & 0.8292 & 0.8292 & 1.0000
\end{pmatrix},
\]

and

\[
C^8 = C^4 \circ C^4 = \begin{pmatrix}
1.0000 & 0.9531 & 0.8461 & 0.8292 & 0.9686 & 0.9686 & 0.8292 \\
0.9531 & 1.0000 & 0.8461 & 0.8292 & 0.9531 & 0.9531 & 0.8292 \\
0.8461 & 0.8461 & 1.0000 & 0.8292 & 0.8461 & 0.8461 & 0.8292 \\
0.8292 & 0.8292 & 0.8292 & 1.0000 & 0.8292 & 0.8292 & 0.9906 \\
0.9686 & 0.9531 & 0.8461 & 0.8292 & 1.0000 & 0.9702 & 0.8292 \\
0.9686 & 0.9531 & 0.8461 & 0.8292 & 0.9702 & 1.0000 & 0.8292 \\
0.8292 & 0.8292 & 0.8292 & 0.9906 & 0.8292 & 0.8292 & 1.0000
\end{pmatrix} = C^4.
\]

Hence, $C^4$ is an equivalent correlation matrix.

Step 3. For a confidence level $\lambda$, to cluster the HFSs we construct a $\lambda$-cutting matrix $C_\lambda = ({}_{\lambda}\rho_{ij})_{m \times m}$ by Definition 13, based on which we get all possible classifications of $A_j$ $(j = 1,2,\ldots,7)$:

(1) If $0 \le \lambda \le 0.8292$, then $A_j$ $(j = 1,2,\ldots,7)$ are of the same type:

$\{A_1, A_2, A_3, A_4, A_5, A_6, A_7\}$.

(2) If $0.8292 < \lambda \le 0.8461$, then $A_j$ $(j = 1,2,\ldots,7)$ are classified into two types:

$\{A_1, A_2, A_3, A_5, A_6\}$, $\{A_4, A_7\}$.

(3) If $0.8461 < \lambda \le 0.9531$, then $A_j$ $(j = 1,2,\ldots,7)$ are classified into three types:

$\{A_1, A_2, A_5, A_6\}$, $\{A_3\}$, $\{A_4, A_7\}$.

(4) If $0.9531 < \lambda \le 0.9686$, then $A_j$ $(j = 1,2,\ldots,7)$ are classified into four types:

$\{A_1, A_5, A_6\}$, $\{A_2\}$, $\{A_3\}$, $\{A_4, A_7\}$.

(5) If $0.9686 < \lambda \le 0.9702$, then $A_j$ $(j = 1,2,\ldots,7)$ are classified into five types:

$\{A_1\}$, $\{A_2\}$, $\{A_3\}$, $\{A_5, A_6\}$, $\{A_4, A_7\}$.

(6) If $0.9702 < \lambda \le 0.9906$, then $A_j$ $(j = 1,2,\ldots,7)$ are classified into six types:

$\{A_1\}$, $\{A_2\}$, $\{A_3\}$, $\{A_5\}$, $\{A_6\}$, $\{A_4, A_7\}$.

(7) If $0.9906 < \lambda \le 1$, then $A_j$ $(j = 1,2,\ldots,7)$ are classified into seven types:

$\{A_1\}$, $\{A_2\}$, $\{A_3\}$, $\{A_4\}$, $\{A_5\}$, $\{A_6\}$, $\{A_7\}$.

Table 1. Hesitant fuzzy information.

      x1                x2                x3           x4
A1    {0.9,0.85,0.8}    {0.8,0.75,0.7}    {0.8,0.65}   {0.35,0.3}
A2    {0.9,0.85}        {0.8,0.7,0.6}     {0.2}        {0.15}
A3    {0.4,0.3,0.2}     {0.5,0.4}         {1.0,0.9}    {0.65,0.5,0.45}
A4    {1.0,0.95,0.8}    {0.2,0.15,0.1}    {0.3,0.2}    {0.8,0.7,0.6}
A5    {0.5,0.4,0.35}    {1.0,0.9,0.7}     {0.4}        {0.35,0.3,0.2}
A6    {0.7,0.6,0.5}     {0.9,0.8}         {0.6,0.4}    {0.2,0.1}
A7    {1.0,0.8}         {0.35,0.2,0.15}   {0.2,0.1}    {0.85,0.7}
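The cutting step of Example 4 can be sketched in a few lines. The helper `lambda_cut_clusters` below is a hypothetical name, and the sketch assumes that its input is already an equivalent correlation matrix, so that "correlation at least $\lambda$" is an equivalence relation and comparing each object with one representative per cluster suffices.

```python
def lambda_cut_clusters(matrix, lam):
    """Group objects i, j together whenever matrix[i][j] >= lam.

    Assumes `matrix` is an equivalent (max-min transitive) correlation
    matrix, so one representative per cluster is enough for the test.
    """
    clusters = []
    for i in range(len(matrix)):
        for cluster in clusters:
            if matrix[cluster[0]][i] >= lam:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return [[f"A{i + 1}" for i in cluster] for cluster in clusters]

# Equivalent correlation matrix C^4 of Example 4.
C4 = [
    [1.0000, 0.9531, 0.8461, 0.8292, 0.9686, 0.9686, 0.8292],
    [0.9531, 1.0000, 0.8461, 0.8292, 0.9531, 0.9531, 0.8292],
    [0.8461, 0.8461, 1.0000, 0.8292, 0.8461, 0.8461, 0.8292],
    [0.8292, 0.8292, 0.8292, 1.0000, 0.8292, 0.8292, 0.9906],
    [0.9686, 0.9531, 0.8461, 0.8292, 1.0000, 0.9702, 0.8292],
    [0.9686, 0.9531, 0.8461, 0.8292, 0.9702, 1.0000, 0.8292],
    [0.8292, 0.8292, 0.8292, 0.9906, 0.8292, 0.8292, 1.0000],
]

# One confidence level from each of the seven intervals of Step 3.
for lam in (0.80, 0.84, 0.90, 0.96, 0.97, 0.98, 1.00):
    print(lam, lambda_cut_clusters(C4, lam))
```

Choosing one value of $\lambda$ inside each interval of Step 3 should reproduce the seven classifications listed above, up to the order in which the clusters are printed.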

Example 5. The assessment of business failure risk, i.e., the assessment of firm performance and the prediction of failure events, has drawn the attention of many researchers in recent years [55,56]. For this purpose, 10 firms $A_i$ $(i = 1,2,\ldots,10)$, evaluated on 5 criteria ($x_1$: managers' work experience, $x_2$: profitability, $x_3$: operating capacity, $x_4$: debt-paying ability, and $x_5$: market competition), will be classified according to their risk of failure. To improve the assessment, several risk evaluation organizations are consulted. The normalized evaluation data, represented by HFSs, are displayed in Table 2.

Step 1. Calculate the correlation coefficients of the HFSs $A_j$ $(j = 1,2,\ldots,10)$ by using Eq. (17) with the weighting vector $w = (0.15, 0.3, 0.2, 0.25, 0.1)^T$, and let $l_i = \max\{l(h_{A_j}(x_i))\}$, $j = 1,2,\ldots,10$. Then the derived correlation matrix is:

\[
C = \begin{pmatrix}
1.0000 & 0.7984 & 0.6583 & 0.6635 & 0.5964 & 0.9104 & 0.7572 & 0.6761 & 0.6147 & 0.5983 \\
0.7984 & 1.0000 & 0.8200 & 0.7139 & 0.6459 & 0.6666 & 0.7411 & 0.7458 & 0.7052 & 0.5855 \\
0.6583 & 0.8200 & 1.0000 & 0.8813 & 0.7593 & 0.6082 & 0.8997 & 0.8872 & 0.8683 & 0.6757 \\
0.6635 & 0.7139 & 0.8813 & 1.0000 & 0.7423 & 0.6542 & 0.9238 & 0.8743 & 0.9306 & 0.6742 \\
0.5964 & 0.6459 & 0.7593 & 0.7423 & 1.0000 & 0.5761 & 0.7737 & 0.8520 & 0.8253 & 0.9515 \\
0.9104 & 0.6666 & 0.6082 & 0.6542 & 0.5761 & 1.0000 & 0.7427 & 0.6647 & 0.5816 & 0.6124 \\
0.7572 & 0.7411 & 0.8997 & 0.9238 & 0.7737 & 0.7427 & 1.0000 & 0.9025 & 0.8723 & 0.7217 \\
0.6761 & 0.7458 & 0.8872 & 0.8743 & 0.8520 & 0.6647 & 0.9025 & 1.0000 & 0.8617 & 0.8067 \\
0.6147 & 0.7052 & 0.8683 & 0.9306 & 0.8253 & 0.5816 & 0.8723 & 0.8617 & 1.0000 & 0.7377 \\
0.5983 & 0.5855 & 0.6757 & 0.6742 & 0.9515 & 0.6124 & 0.7217 & 0.8067 & 0.7377 & 1.0000
\end{pmatrix}.
\]

Step 2. Construct the equivalent correlation matrix and obtain:

\[
C^{16} = C^8 \circ C^8 = \begin{pmatrix}
1.0000 & 0.7984 & 0.7984 & 0.7984 & 0.7984 & 0.9104 & 0.7984 & 0.7984 & 0.7984 & 0.7984 \\
0.7984 & 1.0000 & 0.8200 & 0.8200 & 0.8200 & 0.7984 & 0.8200 & 0.8200 & 0.8200 & 0.8200 \\
0.7984 & 0.8200 & 1.0000 & 0.8997 & 0.8520 & 0.7984 & 0.8997 & 0.8997 & 0.8997 & 0.8520 \\
0.7984 & 0.8200 & 0.8997 & 1.0000 & 0.8520 & 0.7984 & 0.9238 & 0.9025 & 0.9306 & 0.8520 \\
0.7984 & 0.8200 & 0.8520 & 0.8520 & 1.0000 & 0.7984 & 0.8520 & 0.8520 & 0.8520 & 0.9515 \\
0.9104 & 0.7984 & 0.7984 & 0.7984 & 0.7984 & 1.0000 & 0.7984 & 0.7984 & 0.7984 & 0.7984 \\
0.7984 & 0.8200 & 0.8997 & 0.9238 & 0.8520 & 0.7984 & 1.0000 & 0.9025 & 0.9238 & 0.8520 \\
0.7984 & 0.8200 & 0.8997 & 0.9025 & 0.8520 & 0.7984 & 0.9025 & 1.0000 & 0.9025 & 0.8520 \\
0.7984 & 0.8200 & 0.8997 & 0.9306 & 0.8520 & 0.7984 & 0.9238 & 0.9025 & 1.0000 & 0.8520 \\
0.7984 & 0.8200 & 0.8520 & 0.8520 & 0.9515 & 0.7984 & 0.8520 & 0.8520 & 0.8520 & 1.0000
\end{pmatrix} = C^8.
\]

Hence, $C^8$ is an equivalent correlation matrix.

Step 3. For a confidence level $\lambda$, to cluster the HFSs we construct a $\lambda$-cutting matrix $C_\lambda = ({}_{\lambda}\rho_{ij})_{m \times m}$ by Definition 13, based on which we get the possible classifications of the 10 firms $A_j$ $(j = 1,2,\ldots,10)$; see Table 3.

Table 2. The evaluation information for the five criteria of 10 firms.

       x1              x2          x3          x4          x5
A1     {0.3,0.4,0.5}   {0.4,0.5}   {0.8}       {0.5}       {0.2,0.3}
A2     {0.4,0.6}       {0.6,0.8}   {0.2,0.3}   {0.3,0.4}   {0.6,0.7,0.9}
A3     {0.5,0.7}       {0.9}       {0.3,0.4}   {0.3}       {0.8,0.9}
A4     {0.3,0.4,0.5}   {0.8,0.9}   {0.7,0.9}   {0.1,0.2}   {0.9,1.0}
A5     {0.8,1.0}       {0.8,1.0}   {0.4,0.6}   {0.8}       {0.7,0.8}
A6     {0.4,0.5,0.6}   {0.2,0.3}   {0.9,1.0}   {0.5}       {0.3,0.4,0.5}
A7     {0.6}           {0.7,0.9}   {0.8}       {0.3,0.4}   {0.4,0.7}
A8     {0.9,1.0}       {0.7,0.8}   {0.4,0.5}   {0.5,0.6}   {0.7}
A9     {0.4,0.6}       {1.0}       {0.6,0.7}   {0.2,0.3}   {0.9,1.0}
A10    {0.9}           {0.6,0.7}   {0.5,0.8}   {1.0}       {0.7,0.8,0.9}

Table 3. The clustering result of 10 firms.

Class   Confidence level                   Hesitant fuzzy clustering algorithm
10      $0.9515 < \lambda \le 1$           {A1}, {A2}, {A3}, {A4}, {A5}, {A6}, {A7}, {A8}, {A9}, {A10}
9       $0.9306 < \lambda \le 0.9515$      {A1}, {A2}, {A3}, {A4}, {A6}, {A7}, {A8}, {A9}, {A5,A10}
8       $0.9238 < \lambda \le 0.9306$      {A1}, {A2}, {A3}, {A4,A9}, {A6}, {A7}, {A8}, {A5,A10}
7       $0.9104 < \lambda \le 0.9238$      {A1}, {A2}, {A3}, {A4,A7,A9}, {A6}, {A8}, {A5,A10}
6       $0.9025 < \lambda \le 0.9104$      {A1,A6}, {A2}, {A3}, {A4,A7,A9}, {A8}, {A5,A10}
5       $0.8997 < \lambda \le 0.9025$      {A1,A6}, {A2}, {A3}, {A4,A7,A8,A9}, {A5,A10}
4       $0.8520 < \lambda \le 0.8997$      {A1,A6}, {A2}, {A3,A4,A7,A8,A9}, {A5,A10}
3       $0.8200 < \lambda \le 0.8520$      {A1,A6}, {A2}, {A3,A4,A5,A7,A8,A9,A10}
2       $0.7984 < \lambda \le 0.8200$      {A1,A6}, {A2,A3,A4,A5,A7,A8,A9,A10}
1       $0 \le \lambda \le 0.7984$         {A1,A2,A3,A4,A5,A6,A7,A8,A9,A10}


In a group setting, the experts' evaluations of the objects to be classified usually do not agree. Examples 4 and 5 show that the clustering algorithm based on HFSs provides a proper way to resolve this issue. It is worth pointing out that, for these two real case studies, adopting the conventional clustering methods within the framework of IFSs or fuzzy sets would require transforming the HFSs into fuzzy sets (or IFSs). This transformation changes the accuracy of the data and therefore affects the clustering results. We have actually performed such a clustering study by transforming the data in Tables 1 and 2 into IFSs and fuzzy sets, respectively, and we find that the results differ from those obtained by using HFSs, as expected.

5. Correlation and clustering algorithm for interval-valued hesitant fuzzy sets

It is known that in multi-criteria decision making it is often difficult for the experts to assign exact values for the membership degrees of certain elements to a set $A$, whereas a range of values within [0,1] may be assigned. It is therefore necessary to introduce the concept of the interval-valued hesitant fuzzy set (IVHFS). This situation is similar to that encountered in intuitionistic fuzzy environments, where the concept of IFS has been extended to that of IVIFS in order to describe the case in which the membership and non-membership degrees of an element to a set are given as interval values.

Definition 14. Let $X$ be a reference set. An IVHFS on $X$ is defined as:

\[
\tilde{A} = \left\{ \langle x_i, h_{\tilde{A}}(x_i) \rangle \mid x_i \in X,\ i = 1,2,\ldots,n \right\}, \tag{21}
\]

where $h_{\tilde{A}}(x_i)$ is a set of some different interval values in [0,1], representing the possible membership degrees of the element $x_i \in X$ to the set $\tilde{A}$, and is called an interval-valued hesitant fuzzy element (IVHFE).

Example 6. Let $X = \{x_1, x_2\}$ be a reference set, and let $h_{\tilde{A}}(x_1) = \{[0.2,0.4], [0.5,0.6]\}$ and $h_{\tilde{A}}(x_2) = \{[0.3,0.5], [0.4,0.7], [0.6,0.8]\}$ be the IVHFEs of $x_i$ $(i = 1,2)$ to a set $\tilde{A}$, respectively. Then $\tilde{A}$ can be considered as an IVHFS and given as:

\[
\tilde{A} = \left\{ \langle x_1, \{[0.2,0.4], [0.5,0.6]\} \rangle,\ \langle x_2, \{[0.3,0.5], [0.4,0.7], [0.6,0.8]\} \rangle \right\}.
\]

For an IVHFE $h_{\tilde{A}}(x_i)$, we arrange the intervals in $h_{\tilde{A}}(x_i)$ in decreasing order. This can be achieved with a possibility degree formula [57] for the comparison between two interval numbers. Let $\sigma: (1,2,\ldots,n) \to (1,2,\ldots,n)$ be a permutation satisfying $h_{\tilde{A}\sigma(i)} \ge h_{\tilde{A}\sigma(i+1)}$, $i = 1,2,\ldots,n-1$, and let $h_{\tilde{A}\sigma(j)}(x_i)$ be the $j$th largest interval in $h_{\tilde{A}}(x_i)$, where

\[
h_{\tilde{A}\sigma(j)}(x_i) = \left[ h^{L}_{\tilde{A}\sigma(j)}(x_i),\ h^{U}_{\tilde{A}\sigma(j)}(x_i) \right] \subset [0,1], \quad j = 1,2,\ldots,l_i,
\]

are intervals, and

\[
h^{L}_{\tilde{A}\sigma(j)}(x_i) = \inf h_{\tilde{A}\sigma(j)}(x_i), \qquad h^{U}_{\tilde{A}\sigma(j)}(x_i) = \sup h_{\tilde{A}\sigma(j)}(x_i).
\]
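To make the ordering concrete, the sketch below stores the IVHFS of Example 6 as a dictionary and sorts each IVHFE in decreasing order. The possibility degree formula of [57] is not reproduced in this section, so a simple midpoint comparison is used here purely as an illustrative stand-in for it; the names `midpoint` and `sort_ivhfe` are hypothetical.

```python
def midpoint(interval):
    """Midpoint of an interval given as a (lower, upper) pair."""
    low, high = interval
    return (low + high) / 2

def sort_ivhfe(ivhfe):
    """Return the intervals in decreasing order.

    Ranking by midpoint is an assumption of this sketch, standing in for
    the possibility degree comparison cited in the text.
    """
    return sorted(ivhfe, key=midpoint, reverse=True)

# IVHFS of Example 6: element -> list of (lower, upper) intervals.
A = {
    "x1": [(0.2, 0.4), (0.5, 0.6)],
    "x2": [(0.3, 0.5), (0.4, 0.7), (0.6, 0.8)],
}

ordered = {x: sort_ivhfe(h) for x, h in A.items()}
# Lower and upper bounds of the j-th largest interval, j = 1, ..., l_i.
lower = {x: [lo for lo, _ in h] for x, h in ordered.items()}
upper = {x: [up for _, up in h] for x, h in ordered.items()}
print(ordered)
```

After sorting, position $j$ of each list plays the role of $h_{\tilde{A}\sigma(j)}(x_i)$, with `lower` and `upper` holding the corresponding bounds.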

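Similar to the previous definitions of the correlation coefficients of HFSs $A$ and $B$, we can define the correlation coefficients of IVHFSs $\tilde{A}$ and $\tilde{B}$ in $X$ as:

\[
\rho_{\mathrm{IVHFS1}}(\tilde{A},\tilde{B})
= \frac{C_{\mathrm{IVHFS1}}(\tilde{A},\tilde{B})}{\left[C_{\mathrm{IVHFS1}}(\tilde{A},\tilde{A})\right]^{1/2} \cdot \left[C_{\mathrm{IVHFS1}}(\tilde{B},\tilde{B})\right]^{1/2}}
= \frac{\sum_{i=1}^{n} \frac{1}{l_i} \sum_{j=1}^{l_i} \left( h^{L}_{\tilde{A}\sigma(j)}(x_i)\, h^{L}_{\tilde{B}\sigma(j)}(x_i) + h^{U}_{\tilde{A}\sigma(j)}(x_i)\, h^{U}_{\tilde{B}\sigma(j)}(x_i) \right)}
{\left\{ \sum_{i=1}^{n} \frac{1}{l_i} \sum_{j=1}^{l_i} \left[ \left(h^{L}_{\tilde{A}\sigma(j)}(x_i)\right)^{2} + \left(h^{U}_{\tilde{A}\sigma(j)}(x_i)\right)^{2} \right] \right\}^{1/2} \cdot \left\{ \sum_{i=1}^{n} \frac{1}{l_i} \sum_{j=1}^{l_i} \left[ \left(h^{L}_{\tilde{B}\sigma(j)}(x_i)\right)^{2} + \left(h^{U}_{\tilde{B}\sigma(j)}(x_i)\right)^{2} \right] \right\}^{1/2}},
\tag{22}
\]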


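\[
\rho_{\mathrm{IVHFS2}}(\tilde{A},\tilde{B})
= \frac{C_{\mathrm{IVHFS1}}(\tilde{A},\tilde{B})}{\max\left\{ C_{\mathrm{IVHFS1}}(\tilde{A},\tilde{A}),\ C_{\mathrm{IVHFS1}}(\tilde{B},\tilde{B}) \right\}}
= \frac{\sum_{i=1}^{n} \frac{1}{l_i} \sum_{j=1}^{l_i} \left( h^{L}_{\tilde{A}\sigma(j)}(x_i)\, h^{L}_{\tilde{B}\sigma(j)}(x_i) + h^{U}_{\tilde{A}\sigma(j)}(x_i)\, h^{U}_{\tilde{B}\sigma(j)}(x_i) \right)}
{\max\left\{ \sum_{i=1}^{n} \frac{1}{l_i} \sum_{j=1}^{l_i} \left[ \left(h^{L}_{\tilde{A}\sigma(j)}(x_i)\right)^{2} + \left(h^{U}_{\tilde{A}\sigma(j)}(x_i)\right)^{2} \right],\ \sum_{i=1}^{n} \frac{1}{l_i} \sum_{j=1}^{l_i} \left[ \left(h^{L}_{\tilde{B}\sigma(j)}(x_i)\right)^{2} + \left(h^{U}_{\tilde{B}\sigma(j)}(x_i)\right)^{2} \right] \right\}}.
\tag{23}
\]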

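When taking into account the weight $w_i$ of the element $x_i \in X$, we further give

\[
\rho_{\mathrm{IVHFS3}}(\tilde{A},\tilde{B})
= \frac{C_{\mathrm{IVHFS2}}(\tilde{A},\tilde{B})}{\left[C_{\mathrm{IVHFS2}}(\tilde{A},\tilde{A})\right]^{1/2} \cdot \left[C_{\mathrm{IVHFS2}}(\tilde{B},\tilde{B})\right]^{1/2}}
= \frac{\sum_{i=1}^{n} w_i \frac{1}{l_i} \sum_{j=1}^{l_i} \left( h^{L}_{\tilde{A}\sigma(j)}(x_i)\, h^{L}_{\tilde{B}\sigma(j)}(x_i) + h^{U}_{\tilde{A}\sigma(j)}(x_i)\, h^{U}_{\tilde{B}\sigma(j)}(x_i) \right)}
{\left\{ \sum_{i=1}^{n} w_i \frac{1}{l_i} \sum_{j=1}^{l_i} \left[ \left(h^{L}_{\tilde{A}\sigma(j)}(x_i)\right)^{2} + \left(h^{U}_{\tilde{A}\sigma(j)}(x_i)\right)^{2} \right] \right\}^{1/2} \cdot \left\{ \sum_{i=1}^{n} w_i \frac{1}{l_i} \sum_{j=1}^{l_i} \left[ \left(h^{L}_{\tilde{B}\sigma(j)}(x_i)\right)^{2} + \left(h^{U}_{\tilde{B}\sigma(j)}(x_i)\right)^{2} \right] \right\}^{1/2}},
\tag{24}
\]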

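\[
\rho_{\mathrm{IVHFS4}}(\tilde{A},\tilde{B})
= \frac{C_{\mathrm{IVHFS2}}(\tilde{A},\tilde{B})}{\max\left\{ C_{\mathrm{IVHFS2}}(\tilde{A},\tilde{A}),\ C_{\mathrm{IVHFS2}}(\tilde{B},\tilde{B}) \right\}}
= \frac{\sum_{i=1}^{n} w_i \frac{1}{l_i} \sum_{j=1}^{l_i} \left( h^{L}_{\tilde{A}\sigma(j)}(x_i)\, h^{L}_{\tilde{B}\sigma(j)}(x_i) + h^{U}_{\tilde{A}\sigma(j)}(x_i)\, h^{U}_{\tilde{B}\sigma(j)}(x_i) \right)}
{\max\left\{ \sum_{i=1}^{n} w_i \frac{1}{l_i} \sum_{j=1}^{l_i} \left[ \left(h^{L}_{\tilde{A}\sigma(j)}(x_i)\right)^{2} + \left(h^{U}_{\tilde{A}\sigma(j)}(x_i)\right)^{2} \right],\ \sum_{i=1}^{n} w_i \frac{1}{l_i} \sum_{j=1}^{l_i} \left[ \left(h^{L}_{\tilde{B}\sigma(j)}(x_i)\right)^{2} + \left(h^{U}_{\tilde{B}\sigma(j)}(x_i)\right)^{2} \right] \right\}}.
\tag{25}
\]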

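Theorem 8. For two IVHFSs $\tilde{A}$ and $\tilde{B}$, the correlation coefficients defined by Eqs. (22)–(25) satisfy:

(1) $\rho_{\mathrm{IVHFS}}(\tilde{A},\tilde{B}) = \rho_{\mathrm{IVHFS}}(\tilde{B},\tilde{A})$;

(2) $0 \le \rho_{\mathrm{IVHFS}}(\tilde{A},\tilde{B}) \le 1$;

(3) $\rho_{\mathrm{IVHFS}}(\tilde{A},\tilde{B}) = 1$, if $\tilde{A} = \tilde{B}$,

which can be proven with the methods previously adopted for the HFSs.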
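One way to see property (2), offered here as a sketch rather than as the authors' own proof, is the Cauchy–Schwarz inequality. The shorthand $a^{L}_{ij} = h^{L}_{\tilde{A}\sigma(j)}(x_i)$, $a^{U}_{ij} = h^{U}_{\tilde{A}\sigma(j)}(x_i)$ (and $b^{L}_{ij}$, $b^{U}_{ij}$ for $\tilde{B}$) is introduced only for this note:

\[
C_{\mathrm{IVHFS1}}(\tilde{A},\tilde{B})
= \sum_{i=1}^{n} \frac{1}{l_i} \sum_{j=1}^{l_i} \left( a^{L}_{ij} b^{L}_{ij} + a^{U}_{ij} b^{U}_{ij} \right)
\le \left[ \sum_{i=1}^{n} \frac{1}{l_i} \sum_{j=1}^{l_i} \left( (a^{L}_{ij})^{2} + (a^{U}_{ij})^{2} \right) \right]^{1/2}
    \left[ \sum_{i=1}^{n} \frac{1}{l_i} \sum_{j=1}^{l_i} \left( (b^{L}_{ij})^{2} + (b^{U}_{ij})^{2} \right) \right]^{1/2}.
\]

Since all bounds lie in [0,1], the left-hand side is non-negative, which gives $0 \le \rho_{\mathrm{IVHFS1}} \le 1$. Because $\sqrt{xy} \le \max\{x, y\}$ for $x, y \ge 0$, the same bound carries over to the max-normalised coefficient, and inserting the non-negative weights $w_i$ into every sum leaves the argument unchanged.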

Like Algorithm-HFSC in Section 4, in what follows we propose an algorithm for clustering IVHFSs (Algorithm-IVHFSC):

Step 1. Calculate the correlation coefficients of the IVHFSs, and construct a correlation matrix $C = (\rho_{ij})_{m \times m}$, where $\rho_{ij} = \rho(\tilde{A}_i, \tilde{A}_j)$.

Step 2. See Algorithm-HFSC.

Step 3. See Algorithm-HFSC.

When all IVHFSs $\tilde{A}_j$ $(j = 1,2,\ldots,m)$ reduce to HFSs, Algorithm-IVHFSC becomes Algorithm-HFSC, which indicates the consistency of these two algorithms.

Example 7 ([58]). Let us consider an actual example. An auto market wants to classify four different cars $A_j$ $(j = 1,2,3,4)$ into several kinds. Six attributes are taken into account, with the weight vector $w = (0.25, 0.20, 0.15, 0.10, 0.15, 0.15)^T$: $x_1$: fuel economy, $x_2$: friction degree, $x_3$: price, $x_4$: comfort, $x_5$: design, and $x_6$: safety, denoted by $X = \{x_1, x_2, \ldots, x_6\}$. The evaluation information of each car provided by the experts is expressed in the form of IVHFSs. The data in Table 4 represent the degree to which each alternative satisfies each attribute, as given by the experts, and are thus expressed as interval numbers within [0,1].

Table 4. Hesitant fuzzy information.

      x1                                   x2            x3
A1    {[0.7,0.9], [0.7,0.8], [0.6,0.8]}    {[0.1,0.3]}   {[0.2,0.4], [0.2,0.3], [0.1,0.3], [0.1,0.2]}
A2    {[0.5,0.7], [0.5,0.6], [0.4,0.6]}    {[0.5,0.7]}   {[0.8,1.0], [0.7,0.9], [0.7,0.8], [0.6,0.8]}
A3    {[0.3,0.5], [0.2,0.4], [0.2,0.3]}    {[0.8,0.9]}   {[0.3,0.4], [0.2,0.4], [0.1,0.4], [0.1,0.3]}
A4    {[0.6,0.7], [0.5,0.7], [0.5,0.6]}    {[0.4,0.6]}   {[0.3,0.5], [0.2,0.4], [0.2,0.3], [0.1,0.3]}

      x4                        x5                                   x6
A1    {[0.5,0.8], [0.4,0.6]}    {[0.2,0.5], [0.2,0.4], [0.1,0.4]}    {[0.8,0.9], [0.7,0.8]}
A2    {[0.9,1.0], [0.7,0.9]}    {[0.8,0.9], [0.7,0.9], [0.7,0.8]}    {[0.9,1.0], [0.8,1.0]}
A3    {[0.2,0.3], [0.1,0.3]}    {[0.1,0.3], [0,0.2], [0,0.1]}        {[0.6,0.8], [0.6,0.7]}
A4    {[0.3,0.5], [0.2,0.4]}    {[0.1,0.2], [0,0.2], [0,0.1]}        {[0.6,0.7], [0.4,0.5]}

We now perform clustering by using Algorithm-IVHFSC:

Step 1. Use Eq. (24) to compute the correlation coefficients of the IVHFSs $A_j$ $(j = 1,2,3,4)$, and then obtain the correlation matrix $C$:

\[
C = \begin{pmatrix}
1 & 0.8717 & 0.7337 & 0.9244 \\
0.8717 & 1 & 0.8196 & 0.8708 \\
0.7337 & 0.8196 & 1 & 0.8982 \\
0.9244 & 0.8708 & 0.8982 & 1
\end{pmatrix}.
\]
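The Step 1 computation can be sketched as follows for the pair $(A_1, A_2)$. The sketch assumes that the intervals in Table 4 are already listed in decreasing order and that, for each attribute, both cars have the same number of intervals, so no extension is needed; the function name `ivhfs_correlation` is hypothetical.

```python
from math import sqrt

def ivhfs_correlation(a, b, weights):
    """Weighted IVHFS correlation with square-root normalisation (cf. Eq. (24)).

    `a` and `b` hold, per attribute, a list of (lower, upper) intervals that
    are assumed to be sorted in decreasing order and of equal length l_i.
    """
    def c(p, q):
        total = 0.0
        for hp, hq, w in zip(p, q, weights):
            inner = sum(pl * ql + pu * qu for (pl, pu), (ql, qu) in zip(hp, hq))
            total += w * inner / len(hp)
        return total
    return c(a, b) / sqrt(c(a, a) * c(b, b))

w = [0.25, 0.20, 0.15, 0.10, 0.15, 0.15]
# Rows A1 and A2 of Table 4 (attributes x1..x6).
A1 = [
    [(0.7, 0.9), (0.7, 0.8), (0.6, 0.8)],
    [(0.1, 0.3)],
    [(0.2, 0.4), (0.2, 0.3), (0.1, 0.3), (0.1, 0.2)],
    [(0.5, 0.8), (0.4, 0.6)],
    [(0.2, 0.5), (0.2, 0.4), (0.1, 0.4)],
    [(0.8, 0.9), (0.7, 0.8)],
]
A2 = [
    [(0.5, 0.7), (0.5, 0.6), (0.4, 0.6)],
    [(0.5, 0.7)],
    [(0.8, 1.0), (0.7, 0.9), (0.7, 0.8), (0.6, 0.8)],
    [(0.9, 1.0), (0.7, 0.9)],
    [(0.8, 0.9), (0.7, 0.9), (0.7, 0.8)],
    [(0.9, 1.0), (0.8, 1.0)],
]

rho_12 = ivhfs_correlation(A1, A2, w)
print(round(rho_12, 4))
```

Under these assumptions the value should agree, to four decimals, with the entry 0.8717 of $C$; swapping the arguments or passing the same IVHFS twice illustrates properties (1) and (3) of Theorem 8.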

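Step 2. Work out the equivalent correlation matrix:

\[
C^2 = C \circ C = \begin{pmatrix}
1 & 0.8717 & 0.8982 & 0.9244 \\
0.8717 & 1 & 0.8708 & 0.8717 \\
0.8982 & 0.8708 & 1 & 0.8982 \\
0.9244 & 0.8717 & 0.8982 & 1
\end{pmatrix},
\]

\[
C^4 = C^2 \circ C^2 = \begin{pmatrix}
1 & 0.8717 & 0.8982 & 0.9244 \\
0.8717 & 1 & 0.8717 & 0.8717 \\
0.8982 & 0.8717 & 1 & 0.8982 \\
0.9244 & 0.8717 & 0.8982 & 1
\end{pmatrix},
\]

and

\[
C^8 = C^4 \circ C^4 = \begin{pmatrix}
1 & 0.8717 & 0.8982 & 0.9244 \\
0.8717 & 1 & 0.8717 & 0.8717 \\
0.8982 & 0.8717 & 1 & 0.8982 \\
0.9244 & 0.8717 & 0.8982 & 1
\end{pmatrix}.
\]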

Obviously, $C^8 = C^4$, that is, $C^4$ is an equivalent correlation matrix.

Step 3. Utilize Eq. (20) to construct a $\lambda$-cutting matrix $C_\lambda = ({}_{\lambda}\rho_{ij})_{m \times m}$, based on which we get all possible classifications of the cars $A_j$ $(j = 1,2,3,4)$:

(1) If $0 \le \lambda \le 0.8717$, then $A_i$ $(i = 1,2,3,4)$ are of the same type:

$\{A_1, A_2, A_3, A_4\}$.

(2) If $0.8717 < \lambda \le 0.8982$, then $A_i$ $(i = 1,2,3,4)$ are classified into two types:

$\{A_2\}$, $\{A_1, A_3, A_4\}$.

(3) If $0.8982 < \lambda \le 0.9244$, then $A_i$ $(i = 1,2,3,4)$ are classified into three types:

$\{A_1, A_4\}$, $\{A_2\}$, $\{A_3\}$.

(4) If $0.9244 < \lambda \le 1$, then $A_i$ $(i = 1,2,3,4)$ are classified into four types:

$\{A_1\}$, $\{A_2\}$, $\{A_3\}$, $\{A_4\}$.
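Step 2 of this example can be retraced with a short sketch. It assumes that the composition used to form $C^2$, $C^4$, ... is the max-min composition, $(C \circ C)_{ij} = \max_k \min\{\rho_{ik}, \rho_{kj}\}$; the function names `compose` and `equivalent_matrix` are hypothetical.

```python
def compose(a, b):
    """Max-min composition: result[i][j] = max_k min(a[i][k], b[k][j])."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]

def equivalent_matrix(c):
    """Square the matrix repeatedly until squaring no longer changes it.

    Returns the stable matrix together with the power of C it represents.
    """
    power = 1
    while True:
        squared = compose(c, c)
        if squared == c:
            return c, power
        c, power = squared, power * 2

# Correlation matrix C of Example 7.
C = [
    [1.0, 0.8717, 0.7337, 0.9244],
    [0.8717, 1.0, 0.8196, 0.8708],
    [0.7337, 0.8196, 1.0, 0.8982],
    [0.9244, 0.8708, 0.8982, 1.0],
]

closure, k = equivalent_matrix(C)
print(k)
for row in closure:
    print(row)
```

Because the composition only selects entries that already occur in $C$, the comparison `squared == c` is exact and no floating-point tolerance is needed. With the matrix above, the loop should stop at $C^4$, in line with the observation that $C^8 = C^4$.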

6. Conclusions

We have derived correlation coefficient formulas for HFSs, including those that consider the weights of attributes (or criteria). These formulas have been applied to clustering objects under hesitant fuzzy environments. We have carried out the clustering analysis with two typical real world examples, i.e., software classification and the assessment of firm performance with respect to business failure risk. These examples indicate the need for a clustering algorithm based on HFSs, since such an algorithm can automatically account for the differences among the evaluation data given by different experts. In order to generalize HFSs to a wider domain of hesitant fuzzy settings, we have given the concept of the interval-valued HFS and used an actual example to illustrate its potential application.

When applying the proposed clustering algorithm based on HFSs to a general hesitant fuzzy environment, further work is still required. To further extend the application range of the present clustering algorithm, in particular for the case that needs to assign weights to different experts, it will be necessary to generalize the original definition of HFSs [26,27], which assumes that the experts in the group setting unanimously agree on the weight of a criterion, as has recently been attempted by Zhu and Xu [59], who have generalized the concept of HFS so that different weights for experts can be taken into account. In the future, we will study new clustering algorithms based on the generalized HFSs. Furthermore, the use of linguistic information is suitable and straightforward in many real decision situations. Rodríguez et al. [36] recently proposed the concept of hesitant fuzzy linguistic term set, based on the fuzzy linguistic approach, which serves as the basis for increasing the flexibility of the elicitation of linguistic information by means of linguistic expressions. Thus, it is interesting to develop the corresponding correlation coefficient formulas for hesitant fuzzy linguistic term sets and apply them to clustering analysis.

Given that HFSs are a suitable tool for denoting uncertain information that is widely encountered in daily life, and that clustering aids analysis in decision making, information retrieval, pattern recognition, etc., the clustering algorithm in the HFS framework developed here is of considerable practicality in diverse fields; consequently, it constitutes a potentially useful tool for handling clustering issues involving hesitant fuzzy information.

Acknowledgements

The authors thank the anonymous reviewers for their helpful comments and suggestions, which have led to an improved version of this paper. The work was supported by the National Natural Science Foundation of China (No. 71071161).

References

[1] P. Bonizzoni, G.D. Vedova, R. Dondi, T. Jiang, Correlation clustering and consensus clustering, Lect. Notes Comput. Sci. 3827 (2008) 226–235.
[2] H.P. Kriegel, P. Kroger, E. Schubert, A. Zimek, A general framework for increasing the robustness of PCA-based correlation clustering algorithms, Lect. Notes Comput. Sci. 5069 (2008) 418–435.
[3] D.G. Park, Y.C. Kwun, J.H. Park, I.Y. Park, Correlation coefficient of interval-valued intuitionistic fuzzy sets and its application to multiple attribute group decision making problems, Math. Comput. Model. 50 (2009) 1279–1293.
[4] E. Szmidt, J. Kacprzyk, Correlation of intuitionistic fuzzy sets, Lect. Notes Comput. Sci. 6178 (2010) 169–177.
[5] G.W. Wei, H.J. Wang, R. Lin, Application of correlation coefficient to interval-valued intuitionistic fuzzy multiple attribute decision-making with incomplete weight information, Knowl. Inform. Syst. 26 (2011) 337–349.
[6] J. Ye, Multicriteria fuzzy decision-making method using entropy weights-based correlation coefficients of interval-valued intuitionistic fuzzy sets, Appl. Math. Model. 34 (2010) 3864–3870.
[7] D.A. Chiang, N.P. Lin, Correlation of fuzzy sets, Fuzzy Set Syst. 102 (1999) 221–226.
[8] D. Dumitrescu, Fuzzy correlation, Studia Univ. Babes Bolyai Math. 23 (1978) 41–44.
[9] D.H. Hong, Fuzzy measures for a correlation coefficient of fuzzy numbers under Tw (the weakest t-norm)-based fuzzy arithmetic operations, Inform. Sci. 176 (2006) 150–160.
[10] D.H. Hong, S.Y. Hwang, A note on the correlation of fuzzy numbers, Fuzzy Set Syst. 79 (1996) 401–402.
[11] W.L. Hung, J.W. Wu, A note on the correlation on fuzzy numbers by expected interval, Int. J. Uncert. Fuzz. Knowl. Based Syst. 9 (2001) 517–523.
[12] S.T. Liu, C. Kao, Fuzzy measures for correlation coefficient of fuzzy numbers, Fuzzy Set Syst. 128 (2002) 267–275.
[13] G.J. Wang, X.P. Li, Correlation and information energy of interval-valued fuzzy numbers, Fuzzy Set Syst. 103 (1999) 169–175.
[14] C. Yu, Correlation of fuzzy numbers, Fuzzy Set Syst. 55 (1993) 303–307.
[15] T. Gerstenkorn, J. Manko, Correlation of intuitionistic fuzzy sets, Fuzzy Set Syst. 44 (1991) 39–43.
[16] D.H. Hong, S.Y. Hwang, Correlation of intuitionistic fuzzy sets in probability spaces, Fuzzy Set Syst. 75 (1995) 77–81.
[17] W.L. Hung, Using statistical viewpoint in developing correlation of intuitionistic fuzzy sets, Int. J. Uncert. Fuzz. Knowl. Based Syst. 9 (2001) 509–516.
[18] W.L. Hung, J.W. Wu, Correlation of intuitionistic fuzzy sets by centroid method, Inform. Sci. 144 (2002) 219–225.
[19] H.B. Mitchell, A correlation coefficient for intuitionistic fuzzy sets, Int. J. Intell. Syst. 19 (2004) 483–490.
[20] J.H. Park, K.M. Lim, J.S. Park, Y.C. Kwun, Correlation coefficient between intuitionistic fuzzy sets, Fuzzy Inform. Eng. 2 (2009) 601–610.
[21] Z.S. Xu, J. Chen, J.J. Wu, Clustering algorithm for intuitionistic fuzzy sets, Inform. Sci. 178 (2008) 3775–3790.
[22] K. Atanassov, Intuitionistic fuzzy sets, Fuzzy Set Syst. 20 (1986) 87–96.
[23] K. Atanassov, G. Gargov, Interval valued intuitionistic fuzzy sets, Fuzzy Set Syst. 31 (1989) 343–349.
[24] H. Bustince, P. Burillo, Correlation of interval-valued intuitionistic fuzzy sets, Fuzzy Set Syst. 74 (1995) 237–244.
[25] D.H. Hong, A note on correlation of interval-valued intuitionistic fuzzy sets, Fuzzy Set Syst. 95 (1998) 113–117.
[26] V. Torra, Y. Narukawa, On hesitant fuzzy sets and decision, in: The 18th IEEE International Conference on Fuzzy Systems, Jeju Island, Korea, 2009, pp. 1378–1382.
[27] V. Torra, Hesitant fuzzy sets, Int. J. Intell. Syst. 25 (2010) 529–539.
[28] L.A. Zadeh, Fuzzy sets, Inform. Control 8 (1965) 338–353.
[29] D. Dubois, H. Prade, Fuzzy Sets and Systems: Theory and Applications, Academic Press, New York, 1980.
[30] S. Miyamoto, Remarks on basics of fuzzy sets and fuzzy multisets, Fuzzy Set Syst. 156 (2005) 427–431.
[31] R.R. Yager, On the theory of bags, Int. J. Gen. Syst. 13 (1986) 23–37.
[32] M.M. Xia, Z.S. Xu, Hesitant fuzzy information aggregation in decision making, Int. J. Approx. Reason. 52 (2011) 395–407.
[33] Z.S. Xu, M.M. Xia, Distance and similarity measures for hesitant fuzzy sets, Inform. Sci. 181 (2011) 2128–2138.
[34] Z.S. Xu, M.M. Xia, On distance and correlation measures of hesitant fuzzy information, Int. J. Intell. Syst. 26 (2011) 410–425.
[35] B. Zhu, Z.S. Xu, M.M. Xia, Hesitant fuzzy geometric Bonferroni means, Inform. Sci. 205 (2012) 72–85.
[36] R.M. Rodríguez, L. Martínez, F. Herrera, Hesitant fuzzy linguistic term sets for decision making, IEEE Trans. Fuzzy Syst. 20 (2012) 109–119.
[37] J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum, New York, 1998, pp. 43–93.
[38] J. Han, M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufman, San Mateo, CA, 2000.
[39] K. Mizutani, R. Inokuchi, S. Miyamoto, Algorithms of nonlinear document clustering based on fuzzy multiset model, Int. J. Intell. Syst. 23 (2008) 176–198.
[40] S. Miyamoto, Information clustering based on fuzzy multisets, Inform. Process. Manage. 39 (2003) 195–213.
[41] T. Chaira, A novel intuitionistic fuzzy C means clustering algorithm and its application to medical images, Appl. Soft Comput. 11 (2011) 1711–1717.
[42] N. Kumar, M. Nasser, S.C. Sarker, A new singular value decomposition based robust graphical clustering technique and its application in climatic data, J. Geogr. Geol. 3 (2011) 227–238.
[43] J.B. Nikas, W.C. Low, Application of clustering analyses to the diagnosis of Huntington disease in mice and other diseases with well-defined group boundaries, Comput. Methods Prog. Biomed. 104 (2011) 133–147.
[44] P. Zhao, C.Q. Zhang, A new clustering method and its application in social networks, Pattern Recognit. Lett. 32 (2011) 2109–2118.
[45] B. Zhao, R.L. He, S.T. Yau, A new distribution vector and its application in genome clustering, Mol. Phylogenet. Evol. 59 (2011) 438–443.
[46] X.H. Wu, B. Wu, J. Sun, J.W. Zhao, Mixed fuzzy inter-cluster separation clustering algorithm, Appl. Math. Model. 35 (2011) 4790–4795.
[47] Z. Wang, Z.S. Xu, S.S. Liu, J. Tang, A netting clustering analysis method under intuitionistic fuzzy environment, Appl. Soft Comput. 11 (2011) 5558–5564.
[48] Z.S. Xu, Intuitionistic fuzzy hierarchical clustering algorithms, J. Syst. Eng. Electron. 20 (2009) 1–5.
[49] C. Hwang, F.C.H. Rhee, Uncertain fuzzy clustering: Interval type-2 fuzzy approach to C-means, IEEE Trans. Fuzzy Syst. 15 (2007) 107–120.
[50] M.S. Yang, D.C. Lin, On similarity and inclusion measures between type-2 fuzzy sets with an application to clustering, Comput. Math. Appl. 57 (2009) 896–907.
[51] D. Dumitrescu, A definition of an informational energy in fuzzy sets theory, Studia Univ. Babes Bolyai Math. 22 (1977) 57–59.
[52] P.Z. Wang, Fuzzy Set Theory and Applications, Shanghai Scientific and Technical Publishers, Shanghai, 1983.
[53] I. Stamelos, A. Tsoukiàs, Software evaluation problem situations, Eur. J. Oper. Res. 145 (2003) 273–286.
[54] M. Morisio, A. Tsoukiàs, IusWare: a methodology for the evaluation and selection of software products, IEEE Proc. Softw. Eng. 144 (1997) 162–174.
[55] A. Dimitras, C. Iopounidis, C. Hurson, A multicriteria decision aid method for the assessment of business failure risk, Found. Comput. Decision Sci. 20 (1995) 99–112.
[56] C. Iopounidis, A multicriteria decision making methodology for the evaluation of the risk of failure and an application, Found. Control Eng. 12 (1987) 45–67.
[57] Z.S. Xu, Q.L. Da, The uncertain OWA operator, Int. J. Intell. Syst. 17 (2002) 569–575.
[58] F. Herrera, L. Martinez, An approach for combining linguistic and numerical information based on the 2-tuple fuzzy linguistic representation model in decision-making, Int. J. Uncertain. Fuzz. Knowl. Based Syst. 8 (2000) 539–562.
[59] B. Zhu, Z.S. Xu, Generalized Hesitant Fuzzy Sets, Technical Report, 2011.