Uncertainty Measurement Based on In-sim-dominance Relation


Liulin Zhou^a, Guoyin Wang^a,b,*, Taihua Xu^b

^a Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, PR China.

^b School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, PR China.

*Corresponding author.

Email addresses: [email protected] (Liulin Zhou), [email protected] (Guoyin Wang), [email protected] (Taihua Xu).

ABSTRACT:

The in-sim-dominance relation was proposed to deal with hybrid information systems, in which objects are described by a finite set of qualitative and quantitative attributes. Accuracy and roughness are the two main tools for uncertainty measurement in Pawlak rough set theory. However, there are few studies on uncertainty measurement based on the in-sim-dominance relation. In this paper, the traditional accuracy and roughness measurements are extended to hybrid information systems, and approximation accuracy and approximation roughness based on the in-sim-dominance relation are defined. In particular, a concept called hybrid entropy is introduced to measure the uncertainty of a hybrid information system. Entropy-based roughness and entropy-based approximation roughness of hybrid information systems are then proposed. Experiments on standard UCI data sets show that the entropy-based approximation roughness is effective and suitable for measuring the uncertainty of hybrid information systems.

Keywords: rough set, in-sim-dominance relation, uncertainty measurement, hybrid information system

1. INTRODUCTION

As a useful mathematical tool for dealing with uncertain and ambiguous information, rough set theory (RST) [1-2], proposed by Pawlak, has been studied by many scholars and applied successfully in many research areas, such as data mining [3], pattern recognition [4], decision analysis [5], artificial intelligence [6-7], knowledge discovery [8], machine learning [9] and intelligent data analysis [10]. The main idea of RST is to build a knowledge base from all known knowledge of a given data space and then to classify the knowledge base by an indiscernibility relation; classifying the knowledge base can be viewed as classifying the given data space. In this way, uncertain knowledge can be described approximately by the known knowledge in the knowledge base. Compared with other data processing methods, RST is more objective because it does not need prior knowledge.

As is well known, the indiscernibility relation on the universe plays a crucial role in Pawlak RST. For many practical problems, however, the binary relations on the universe are not equivalence relations, which limits the application of Pawlak RST. Therefore, many scholars have extended Pawlak RST by generalizing the indiscernibility relation for different kinds of information systems [11-20]. In practice, there exist hybrid information systems, in which objects are described by several attributes whose values are of various types, such as nominal, integer, numerical and interval values. In order to construct a comprehensive preference model, it is sometimes reasonable to consider both criteria and regular attributes. An and Tong [21] proposed the discernibility-similarity-dominance matrix and its functions to induce decision rules based on the in-sim-dominance relation.

Proceedings of the The Second International Conference on Artificial Intelligence and Pattern Recognition, Shenzhen, China, 2015

ISBN: 978-1-941968-09-3 ©2015 SDIWC 40

Recently, many scholars have proposed different uncertainty measurements for different rough set models. Pawlak [2] proposed four numerical uncertainty measurements to evaluate the uncertainty of a rough set: accuracy and roughness for information tables, and approximation accuracy and approximation roughness for decision tables. Dai et al. proposed an uncertainty measurement based on the similarity degree for interval-valued information systems [22] and approximation accuracy for incomplete information systems [23]. Beaubouef et al. [24] addressed the measurement of uncertainty in rough sets and rough relational databases by introducing a measurement based on information entropy. Yao et al. [25-27] studied attribute importance in rough sets using information entropy measurements. Based on the intuitionistic knowledge content nature of information gain, Liang [28] introduced the concepts of combination entropy and combination granulation in RST. However, there are few studies on uncertainty measurements for hybrid information systems. In this paper, we address uncertainty measurement for hybrid information systems based on the in-sim-dominance relation. We investigate the properties of the in-sim-dominance relation and propose approximation accuracy and approximation roughness based on it. Moreover, entropy-based roughness and entropy-based approximation roughness measurements are presented. Experimental results show that the proposed uncertainty measurements are effective for evaluating the uncertainty of hybrid information systems based on the in-sim-dominance relation.

The rest of this paper is organized as follows. Some preliminary notions in RST are briefly reviewed in Section 2. In Section 3, the in-sim-dominance relation and its rough approximations are introduced, several knowledge uncertainty measurements of hybrid information systems based on the in-sim-dominance relation are defined, and some of their important properties are discussed. Numerical experiments that evaluate the effectiveness of the proposed uncertainty measurements are reported in Section 4. Section 5 concludes the paper.

2. PRELIMINARIES

In this section, we review some basic concepts in RST, including information systems, the indiscernibility relation, rough approximations and uncertainty measures.

2.1 Indiscernibility Relation and Rough Approximations

An information system is a quadruple $IS = \{U, C, V, f\}$, where $U$ is a non-empty finite set of objects called the universe, $C$ is a non-empty finite set of attributes, and $V = \bigcup_{a \in C} V_a$ is the union of the attribute domains, where $V_a$ denotes the value domain of attribute $a$. Each attribute $a \in C$ determines an information function $f_a : U \to V_a$, that is, $f(x, a) \in V_a$, where $f(x, a)$ denotes the value of attribute $a$ for object $x$. A decision system is defined as $DS = \langle U, C \cup \{d\}, V, f \rangle$, where $C$ is the set of condition attributes and $d$ is a decision attribute.

An attribute subset $P \subseteq C$ determines an indiscernibility relation, denoted by $IND(P)$:

$$IND(P) = \{(x, y) \in U \times U \mid \forall a \in P,\ f(x, a) = f(y, a)\}.$$

The relation $IND(P)$ induces a partition of $U$, denoted by $U/IND(P)$ or $U/P$; the notation $[x]_P$ denotes the indiscernibility class of $P$ containing $x$.

For any given information system $IS = \{U, C, V, f\}$, $P \subseteq C$ and $X \subseteq U$, one can define the lower and upper approximations of $X$:

$$P_*(X) = \{x \in U \mid [x]_P \subseteq X\} \tag{2.1}$$

$$P^*(X) = \{x \in U \mid [x]_P \cap X \neq \emptyset\} \tag{2.2}$$



2.2 Uncertainty Measurements in RST

The uncertainty of a rough set is modeled by its approximation regions. Pawlak [2] proposed two numerical measurements for evaluating the uncertainty of an information system or a decision system in rough set theory: accuracy and roughness. Accuracy is defined as the ratio of the cardinalities of the lower and upper approximations of $X$, and roughness is obtained by subtracting the accuracy from one. Let $IS = \{U, C, V, f\}$ be an information system. For a subset $X \subseteq U$ and an attribute subset $P \subseteq C$, the accuracy and roughness of $X$ with respect to $P$ are defined as:

$$\alpha_P(X) = \frac{|P_*(X)|}{|P^*(X)|}, \qquad \beta_P(X) = 1 - \alpha_P(X) \tag{2.3}$$
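As an illustration of (2.1)-(2.3), the following minimal Python sketch computes the indiscernibility partition, the lower and upper approximations, and the resulting accuracy and roughness. The toy table, the attribute names (`color`, `size`) and the helper names (`partition`, `lower_upper`, `accuracy_roughness`) are hypothetical and do not come from the paper.

```python
from collections import defaultdict

def partition(universe, table, attrs):
    """Classes of IND(P): objects that agree on every attribute in attrs."""
    blocks = defaultdict(set)
    for x in universe:
        blocks[tuple(table[x][a] for a in attrs)].add(x)
    return list(blocks.values())

def lower_upper(blocks, X):
    """Lower approximation (2.1) and upper approximation (2.2) of X."""
    lower = set().union(*[b for b in blocks if b <= X])
    upper = set().union(*[b for b in blocks if b & X])
    return lower, upper

def accuracy_roughness(blocks, X):
    """Accuracy and roughness (2.3); assumes X is non-empty."""
    lower, upper = lower_upper(blocks, X)
    alpha = len(lower) / len(upper)
    return alpha, 1 - alpha

# Hypothetical toy information table (not from the paper).
table = {
    "x1": {"color": "red", "size": "S"},
    "x2": {"color": "red", "size": "S"},
    "x3": {"color": "blue", "size": "M"},
    "x4": {"color": "blue", "size": "L"},
}
blocks = partition(list(table), table, ["color", "size"])
X = {"x1", "x3"}
lower, upper = lower_upper(blocks, X)
alpha, beta = accuracy_roughness(blocks, X)
print(sorted(lower), sorted(upper), alpha, beta)
```

Here x1 and x2 are indiscernible, so x1 can only be placed in the upper approximation of X, which is what makes the accuracy drop below one.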

However, accuracy and roughness do not consider the decision attribute, so they are not suitable for decision systems. Therefore, approximation accuracy and approximation roughness were proposed by Pawlak [2] for decision systems.

Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision system, let $U/d = \{D_1, D_2, \ldots, D_k\}$ be the indiscernibility classes constituted by the decision attribute $d$ on $U$, and let $P \subseteq C$ be a condition attribute subset. The approximation accuracy and approximation roughness of $U/d$ by $P$ are defined as:

$$\alpha_P(U/d) = \frac{\sum_{D_i \in U/d} |P_*(D_i)|}{\sum_{D_i \in U/d} |P^*(D_i)|} \tag{2.4}$$

$$\beta_P(U/d) = 1 - \alpha_P(U/d) \tag{2.5}$$
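The approximation accuracy (2.4) and approximation roughness (2.5) can be sketched in Python as follows. The partition and the decision classes below are hypothetical values chosen only to make the arithmetic easy to follow, and `approximation_accuracy` is an illustrative helper name.

```python
def approximation_accuracy(blocks, decision_classes):
    """Equation (2.4): summed lower sizes over summed upper sizes.

    blocks are the classes of U/P (pairwise disjoint), so the size of a
    lower (upper) approximation is the total size of the blocks contained
    in (intersecting) the decision class.
    """
    lower_total = sum(len(b) for D in decision_classes for b in blocks if b <= D)
    upper_total = sum(len(b) for D in decision_classes for b in blocks if b & D)
    return lower_total / upper_total

# Hypothetical partition U/P and decision classes U/d (not from the paper).
blocks = [{"x1", "x2"}, {"x3", "x4"}, {"x5"}]
decision_classes = [{"x1", "x2", "x3"}, {"x4", "x5"}]

alpha = approximation_accuracy(blocks, decision_classes)
beta = 1 - alpha  # equation (2.5)
print(alpha, beta)
```

In this toy case the lower approximations have sizes 2 and 1 and the upper approximations have sizes 4 and 3, so the accuracy is 3/7.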

3. UNCERTAINTY MEASUREMENT BASED ON IN-SIM-DOMINANCE RELATION

In this section, the in-sim-dominance relation and its rough approximations are introduced, and then several uncertainty measurements based on the in-sim-dominance relation are defined.

3.1 In-sim-dominance Relation and Rough Approximations

Because many real-world problems involve both qualitative and quantitative attributes, the information system can be described as follows, according to Greco et al. [20].

Let $IS = \{U, C, V, f\}$ be an information table, where $C = C^{=} \cup C^{\succeq} \cup C^{\sim}$, $C^{=}$ is a subset of nominal attributes, $C^{\succeq}$ is a subset of ordinal attributes and $C^{\sim}$ is a subset of quantitative attributes, with $C^{=} \cap C^{\succeq} = \emptyset$, $C^{=} \cap C^{\sim} = \emptyset$ and $C^{\sim} \cap C^{\succeq} = \emptyset$. Furthermore, for any $P \subseteq C$, the corresponding subsets of $P$ are denoted by $P^{=}$, $P^{\succeq}$ and $P^{\sim}$, respectively:

1) the subset of nominal attributes, i.e., $P^{=} = P \cap C^{=}$,

2) the subset of ordinal attributes, i.e., $P^{\succeq} = P \cap C^{\succeq}$,

3) the subset of quantitative attributes, i.e., $P^{\sim} = P \cap C^{\sim}$.

Furthermore, the key idea of rough set philosophy is the approximation of one kind of knowledge by another. Under the in-sim-dominance relation, the condition attributes include nominal, ordinal and quantitative attributes, and the decision classes are preference-ordered. The approximated knowledge is therefore a collection of upward and downward unions of decision classes, and the "granules of knowledge" are sets of objects defined using indiscernibility, similarity and outranking relations together.

Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision system, and assume that the decision attribute $d$ partitions $U$ into a finite number of decision classes $Cl_1, \ldots, Cl_n$. The sets to be approximated are the upward and downward unions of decision classes, respectively [19]:

$$Cl_t^{\succeq} = \bigcup_{s \ge t} Cl_s, \qquad Cl_t^{\preceq} = \bigcup_{s \le t} Cl_s, \qquad t = 1, 2, \ldots, n.$$

The statement $x \in Cl_t^{\succeq}$ means "$x$ belongs at least to class $Cl_t$", and $x \in Cl_t^{\preceq}$ means "$x$ belongs at most to class $Cl_t$".

We can then establish the indiscernibility relation on nominal attributes, the outranking relation on ordinal attributes, and the similarity or outranking relation on quantitative attributes. Binary relations established on different attributes can be considered jointly (moreover, other binary relations can be established to meet the needs of particular problems).
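The upward and downward unions defined above can be sketched in a few lines of Python. The three decision classes below are hypothetical, and the function names `upward_union` and `downward_union` are illustrative; the only assumption is that the classes are listed from worst to best and indexed from 1.

```python
def upward_union(classes, t):
    """Cl_t^>=: union of Cl_s for s >= t (classes ordered worst to best, t is 1-based)."""
    return set().union(*classes[t - 1:])

def downward_union(classes, t):
    """Cl_t^<=: union of Cl_s for s <= t."""
    return set().union(*classes[:t])

# Hypothetical preference-ordered decision classes Cl_1, Cl_2, Cl_3.
classes = [{"x1", "x4"}, {"x2"}, {"x3", "x5"}]

at_least_2 = upward_union(classes, 2)    # objects belonging at least to Cl_2
at_most_2 = downward_union(classes, 2)   # objects belonging at most to Cl_2
print(sorted(at_least_2), sorted(at_most_2))
```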

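Definition 3.1 [21] Let $IS = \{U, C, V, f\}$ be an information table with $C^{=} \subseteq C$, $C^{\succeq} \subseteq C$, $C^{\sim} \subseteq C$, $P \subseteq C$, $P^{=} = P \cap C^{=}$, $P^{\succeq} = P \cap C^{\succeq}$ and $P^{\sim} = P \cap C^{\sim}$. The in-sim-dominance relations of $P$ on $U$ are defined as follows:

$$R_P^{l\succeq} = \{(x, y) \in U \times U : y I_P x \wedge y D_P^{\succeq} x \wedge y S_P x\} \tag{3.1}$$

$$R_P^{r\succeq} = \{(x, y) \in U \times U : y I_P x \wedge y D_P^{\succeq} x \wedge x S_P y\} \tag{3.2}$$

$$R_P^{l\preceq} = \{(x, y) \in U \times U : y I_P x \wedge y D_P^{\preceq} x \wedge y S_P x\} \tag{3.3}$$

$$R_P^{r\preceq} = \{(x, y) \in U \times U : y I_P x \wedge y D_P^{\preceq} x \wedge x S_P y\} \tag{3.4}$$

where $I_P$ is the indiscernibility relation, $D_P^{\succeq}$ is the outranking relation, $D_P^{\preceq}$ is the outranked relation and $S_P$ is the similarity relation.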

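Definition 3.2 [21] The global classes of an object $x$ with respect to $P$ are defined as:

$$R_P^{l\succeq}(x) = \{y \in U : y R_P^{l\succeq} x\} \tag{3.5}$$

$$R_P^{r\succeq}(x) = \{y \in U : y R_P^{r\succeq} x\} \tag{3.6}$$

$$R_P^{l\preceq}(x) = \{y \in U : y R_P^{l\preceq} x\} \tag{3.7}$$

$$R_P^{r\preceq}(x) = \{y \in U : y R_P^{r\preceq} x\} \tag{3.8}$$

Theorem 3.1 Let $IS = \{U, C, V, f\}$ be an information table with $C^{=} \subseteq C$, $C^{\succeq} \subseteq C$ and $C^{\sim} \subseteq C$. For the in-sim-dominance relations and any $x_i \in U$, if $P \subseteq Q \subseteq C$, then we have:

$$R_P^{l\succeq}(x_i) \supseteq R_Q^{l\succeq}(x_i), \quad R_P^{r\succeq}(x_i) \supseteq R_Q^{r\succeq}(x_i);$$

$$R_P^{l\preceq}(x_i) \supseteq R_Q^{l\preceq}(x_i), \quad R_P^{r\preceq}(x_i) \supseteq R_Q^{r\preceq}(x_i).$$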

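Proof. Let $P \subseteq Q$ and $x_i \in U$. Then $R_P^{l\succeq}(x_i) = I_{P_1}(x_i) \cap D_{P_2}^{\succeq}(x_i) \cap S_{P_3}^{l}(x_i)$ and $R_Q^{l\succeq}(x_i) = I_{Q_1}(x_i) \cap D_{Q_2}^{\succeq}(x_i) \cap S_{Q_3}^{l}(x_i)$, where $P_1, P_2, P_3$ and $Q_1, Q_2, Q_3$ are the nominal, ordinal and quantitative subsets of $P$ and $Q$, respectively, so that $P_k \subseteq Q_k$ for $k = 1, 2, 3$. It is easy to obtain that $I_{P_1}(x_i) \supseteq I_{Q_1}(x_i)$, $D_{P_2}^{\succeq}(x_i) \supseteq D_{Q_2}^{\succeq}(x_i)$ and $S_{P_3}^{l}(x_i) \supseteq S_{Q_3}^{l}(x_i)$. Thus $R_P^{l\succeq}(x_i) \supseteq R_Q^{l\succeq}(x_i)$. The proofs of the other inclusions are similar.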

Example 1. An example of the in-sim-dominance binary relation.

Table 1 [20] illustrates representative decisions of a decision maker (DM) concerning 8 warehouses described by 3 condition attributes: a, geographical region; b, area; c, capacity of the sales staff. The decision attribute d specifies the assignment made by the DM of each warehouse to one of 2 sets: warehouses making a profit or warehouses making a loss.

Table 1. A decision table.

Warehouse   a   b     c        d
x1          A   500   Medium   Loss
x2          A   400   Good     Profit
x3          A   450   Medium   Profit
x4          B   400   Good     Loss
x5          B   475   Good     Profit
x6          B   425   Medium   Profit
x7          B   350   Medium   Profit
x8          B   350   Medium   Loss

Table 1 can be viewed as an example of a hybrid information system: the value of attribute a is nominal, the value of attribute b is quantitative, and the values of attribute c and of the decision attribute d are ordinal.

We divide the decision table into the upward union $Cl_{Profit}^{\succeq} = \{x_2, x_3, x_5, x_6, x_7\}$ and the downward union $Cl_{Loss}^{\preceq} = \{x_1, x_4, x_8\}$. With respect to attribute a we establish the indiscernibility relation; with respect to attribute b we establish the similarity relation defined as [20]:

$$S_b = \{(x_i, x_j) \in U \times U : |f(x_i, b) - f(x_j, b)| \le 0.1\, f(x_i, b)\} \tag{3.9}$$

and with respect to attribute c we establish the outranking relation in which the attribute value "Good" is better than "Medium". Let $DS = \langle U, C \cup \{d\}, V, f \rangle$, where $C = \{a, b, c\}$ is the set of condition attributes, $d$ is the decision attribute and $U = \{x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8\}$. Then the in-sim-dominance binary relation of $C$ on $U$ is shown in Table 2.



Table 2. The in-sim-dominance relation of C on U.

Warehouse   R_C^{l≽}(x_i)   R_C^{r≽}(x_i)   R_C^{l≼}(x_i)   R_C^{r≼}(x_i)
x1          {x1, x3}        {x1}            {x1, x3}        {x1}
x2          {x2}            {x2}            {x2}            {x2}
x3          {x3}            {x1, x3}        {x3}            {x1, x3}
x4          {x4}            {x4}            {x4, x6}        {x4, x6}
x5          {x5}            {x5}            {x5}            {x5}
x6          {x4, x6}        {x4, x6}        {x6}            {x6}
x7          {x7, x8}        {x7, x8}        {x7, x8}        {x7, x8}
x8          {x7, x8}        {x7, x8}        {x7, x8}        {x7, x8}
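The construction of Table 2 from Table 1 can be sketched in Python as follows. This is a minimal sketch under one stated assumption: "y S x" is read as "the b-values differ by at most 10% of the reference value f(x, b)", which is the reading of (3.9) that reproduces Table 2. The names `TABLE`, `RANK`, `similar` and `global_class` are illustrative and not from the paper.

```python
# Table 1: a is nominal, b is quantitative, c is ordinal.
TABLE = {
    "x1": ("A", 500, "Medium"),
    "x2": ("A", 400, "Good"),
    "x3": ("A", 450, "Medium"),
    "x4": ("B", 400, "Good"),
    "x5": ("B", 475, "Good"),
    "x6": ("B", 425, "Medium"),
    "x7": ("B", 350, "Medium"),
    "x8": ("B", 350, "Medium"),
}
RANK = {"Medium": 0, "Good": 1}  # "Good" outranks "Medium"

def similar(y, x):
    """y S x (assumed reading): |f(y,b) - f(x,b)| <= 0.1 * f(x,b), in integer arithmetic."""
    return 10 * abs(TABLE[y][1] - TABLE[x][1]) <= TABLE[x][1]

def global_class(x, side, direction):
    """Global class of x: side is 'l' or 'r', direction is '>=' or '<='."""
    ax, _, cx = TABLE[x]
    result = set()
    for y, (ay, _, cy) in TABLE.items():
        indiscernible = ay == ax
        if direction == ">=":
            dominance = RANK[cy] >= RANK[cx]   # y outranks x on c
        else:
            dominance = RANK[cy] <= RANK[cx]   # y is outranked by x on c
        sim = similar(y, x) if side == "l" else similar(x, y)
        if indiscernible and dominance and sim:
            result.add(y)
    return result

COLUMNS = [("l", ">="), ("r", ">="), ("l", "<="), ("r", "<=")]
table2 = {x: [global_class(x, s, d) for s, d in COLUMNS] for x in TABLE}
for x, row in table2.items():
    print(x, [sorted(c) for c in row])
```

The left and right variants differ only in which object supplies the 10% threshold, which is why x1 and x3 (values 500 and 450) appear together in some classes but not in others.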

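Definition 3.3 [21] Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table. With respect to $P \subseteq C$, the set of all objects belonging to $Cl_t^{\succeq}$ without any left ambiguity constitutes the $P^l$-lower approximation of $Cl_t^{\succeq}$, denoted by $P_*^l(Cl_t^{\succeq})$, and the set of all objects that could belong to $Cl_t^{\succeq}$ constitutes the $P^l$-upper approximation of $Cl_t^{\succeq}$, denoted by $P^{l*}(Cl_t^{\succeq})$, for $t = 1, 2, \ldots, n$:

$$P_*^l(Cl_t^{\succeq}) = \{x \in U : R_P^{l\succeq}(x) \subseteq Cl_t^{\succeq}\} \tag{3.10}$$

$$P^{l*}(Cl_t^{\succeq}) = \{x \in U : R_P^{r\preceq}(x) \cap Cl_t^{\succeq} \neq \emptyset\} \tag{3.11}$$

Definition 3.4 [21] Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table. With respect to $P \subseteq C$, the set of all objects belonging to $Cl_t^{\succeq}$ without any right ambiguity constitutes the $P^r$-lower approximation of $Cl_t^{\succeq}$, denoted by $P_*^r(Cl_t^{\succeq})$, and the set of all objects that could belong to $Cl_t^{\succeq}$ constitutes the $P^r$-upper approximation of $Cl_t^{\succeq}$, denoted by $P^{r*}(Cl_t^{\succeq})$, for $t = 1, 2, \ldots, n$:

$$P_*^r(Cl_t^{\succeq}) = \{x \in U : R_P^{r\succeq}(x) \subseteq Cl_t^{\succeq}\} \tag{3.12}$$

$$P^{r*}(Cl_t^{\succeq}) = \{x \in U : R_P^{l\preceq}(x) \cap Cl_t^{\succeq} \neq \emptyset\} \tag{3.14}$$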

We take the intersection of the $P^l$-lower and $P^r$-lower approximations of $Cl_t^{\succeq}$ as its lower approximation, and the union of the $P^l$-upper and $P^r$-upper approximations of $Cl_t^{\succeq}$ as its upper approximation, as follows [21]:

$$P_*(Cl_t^{\succeq}) = P_*^l(Cl_t^{\succeq}) \cap P_*^r(Cl_t^{\succeq}) \tag{3.15}$$

$$P^*(Cl_t^{\succeq}) = P^{l*}(Cl_t^{\succeq}) \cup P^{r*}(Cl_t^{\succeq}) \tag{3.16}$$

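Similarly, we can define the lower and upper approximations of $Cl_t^{\preceq}$, as follows:

$$P_*^l(Cl_t^{\preceq}) = \{x \in U : R_P^{l\preceq}(x) \subseteq Cl_t^{\preceq}\} \tag{3.17}$$

$$P^{l*}(Cl_t^{\preceq}) = \{x \in U : R_P^{r\succeq}(x) \cap Cl_t^{\preceq} \neq \emptyset\} \tag{3.18}$$

$$P_*^r(Cl_t^{\preceq}) = \{x \in U : R_P^{r\preceq}(x) \subseteq Cl_t^{\preceq}\} \tag{3.19}$$

$$P^{r*}(Cl_t^{\preceq}) = \{x \in U : R_P^{l\succeq}(x) \cap Cl_t^{\preceq} \neq \emptyset\} \tag{3.20}$$

$$P_*(Cl_t^{\preceq}) = P_*^l(Cl_t^{\preceq}) \cap P_*^r(Cl_t^{\preceq}) \tag{3.21}$$

$$P^*(Cl_t^{\preceq}) = P^{l*}(Cl_t^{\preceq}) \cup P^{r*}(Cl_t^{\preceq}) \tag{3.22}$$

$P_*(Cl_t^{\succeq})$ and $P_*(Cl_t^{\preceq})$ consist of those objects which are precise, while $P^*(Cl_t^{\succeq})$ and $P^*(Cl_t^{\preceq})$ consist of those objects which are precise, left ambiguous or right ambiguous.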

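Theorem 3.2 [21] (Monotonicity) For any $t \in T$ and $P \subseteq Q \subseteq C$:

$$P_*^l(Cl_t^{\succeq}) \subseteq Q_*^l(Cl_t^{\succeq}), \quad P_*^r(Cl_t^{\succeq}) \subseteq Q_*^r(Cl_t^{\succeq});$$

$$P_*^l(Cl_t^{\preceq}) \subseteq Q_*^l(Cl_t^{\preceq}), \quad P_*^r(Cl_t^{\preceq}) \subseteq Q_*^r(Cl_t^{\preceq});$$

$$P^{l*}(Cl_t^{\succeq}) \supseteq Q^{l*}(Cl_t^{\succeq}), \quad P^{r*}(Cl_t^{\succeq}) \supseteq Q^{r*}(Cl_t^{\succeq});$$

$$P^{l*}(Cl_t^{\preceq}) \supseteq Q^{l*}(Cl_t^{\preceq}), \quad P^{r*}(Cl_t^{\preceq}) \supseteq Q^{r*}(Cl_t^{\preceq}).$$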

Example 2. Lower and upper approximations under the in-sim-dominance relation for Table 1.

Let the condition attribute subset be $P = C$, the upward union be $Cl_{Profit}^{\succeq}$ and the downward union be $Cl_{Loss}^{\preceq}$. Then the lower and upper approximations of $Cl_{Profit}^{\succeq}$ and $Cl_{Loss}^{\preceq}$ with respect to $P$ are:

$$P_*(Cl_{Profit}^{\succeq}) = P_*^l(Cl_{Profit}^{\succeq}) \cap P_*^r(Cl_{Profit}^{\succeq}) = \{x_2, x_5\}$$

$$P^*(Cl_{Profit}^{\succeq}) = P^{l*}(Cl_{Profit}^{\succeq}) \cup P^{r*}(Cl_{Profit}^{\succeq}) = \{x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8\}$$

$$P_*(Cl_{Loss}^{\preceq}) = P_*^l(Cl_{Loss}^{\preceq}) \cap P_*^r(Cl_{Loss}^{\preceq}) = \emptyset$$

$$P^*(Cl_{Loss}^{\preceq}) = P^{l*}(Cl_{Loss}^{\preceq}) \cup P^{r*}(Cl_{Loss}^{\preceq}) = \{x_1, x_3, x_4, x_6, x_7, x_8\}$$
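The approximations of Example 2 can be checked with a short Python sketch that applies (3.10)-(3.22) to the global classes listed in Table 2. The dictionary layout and the function names (`lower_up`, `upper_up`, `lower_down`, `upper_down`) are illustrative choices, not notation from the paper.

```python
# Global classes copied from Table 2, in column order l>=, r>=, l<=, r<=.
ROWS = {
    "x1": [{"x1", "x3"}, {"x1"}, {"x1", "x3"}, {"x1"}],
    "x2": [{"x2"}, {"x2"}, {"x2"}, {"x2"}],
    "x3": [{"x3"}, {"x1", "x3"}, {"x3"}, {"x1", "x3"}],
    "x4": [{"x4"}, {"x4"}, {"x4", "x6"}, {"x4", "x6"}],
    "x5": [{"x5"}, {"x5"}, {"x5"}, {"x5"}],
    "x6": [{"x4", "x6"}, {"x4", "x6"}, {"x6"}, {"x6"}],
    "x7": [{"x7", "x8"}, {"x7", "x8"}, {"x7", "x8"}, {"x7", "x8"}],
    "x8": [{"x7", "x8"}, {"x7", "x8"}, {"x7", "x8"}, {"x7", "x8"}],
}
KEYS = ["l>=", "r>=", "l<=", "r<="]
R = {x: dict(zip(KEYS, row)) for x, row in ROWS.items()}
U = set(R)

def lower_up(X):
    """(3.15): intersection of the P^l- and P^r-lower approximations of an upward union."""
    return {x for x in U if R[x]["l>="] <= X and R[x]["r>="] <= X}

def upper_up(X):
    """(3.16): union of the P^l- and P^r-upper approximations of an upward union."""
    return {x for x in U if R[x]["r<="] & X or R[x]["l<="] & X}

def lower_down(X):
    """(3.21): lower approximation of a downward union."""
    return {x for x in U if R[x]["l<="] <= X and R[x]["r<="] <= X}

def upper_down(X):
    """(3.22): upper approximation of a downward union."""
    return {x for x in U if R[x]["r>="] & X or R[x]["l>="] & X}

profit_up = {"x2", "x3", "x5", "x6", "x7"}
loss_down = {"x1", "x4", "x8"}
print(sorted(lower_up(profit_up)), sorted(upper_up(profit_up)))
print(sorted(lower_down(loss_down)), sorted(upper_down(loss_down)))
```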

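3.2. Uncertainty Measurements Based on In-sim-dominance Relation

Definition 3.5 Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table, where $C = C^{=} \cup C^{\succeq} \cup C^{\sim}$. For an attribute subset $P \subseteq C$, the accuracy of $Cl_t^{\succeq}$ and $Cl_t^{\preceq}$ with respect to $P$ is defined as:

$$\alpha_P(Cl_t^{\succeq}) = \frac{|P_*(Cl_t^{\succeq})|}{|P^*(Cl_t^{\succeq})|}, \qquad \alpha_P(Cl_t^{\preceq}) = \frac{|P_*(Cl_t^{\preceq})|}{|P^*(Cl_t^{\preceq})|} \tag{3.23}$$

Definition 3.6 Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table, where $C = C^{=} \cup C^{\succeq} \cup C^{\sim}$. For an attribute subset $P \subseteq C$, the roughness of $Cl_t^{\succeq}$ and $Cl_t^{\preceq}$ with respect to $P$ is defined as:

$$\rho_P(Cl_t^{\succeq}) = 1 - \alpha_P(Cl_t^{\succeq}) \tag{3.24}$$

$$\rho_P(Cl_t^{\preceq}) = 1 - \alpha_P(Cl_t^{\preceq}) \tag{3.25}$$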
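As a worked instance of (3.23)-(3.25), substituting the approximations obtained in Example 2 for $P = C$ gives the following values. This is a direct substitution for illustration, not an additional result reported in the paper.

$$\alpha_C(Cl_{Profit}^{\succeq}) = \frac{|\{x_2, x_5\}|}{|U|} = \frac{2}{8} = 0.25, \qquad \rho_C(Cl_{Profit}^{\succeq}) = 1 - 0.25 = 0.75$$

$$\alpha_C(Cl_{Loss}^{\preceq}) = \frac{|\emptyset|}{|\{x_1, x_3, x_4, x_6, x_7, x_8\}|} = \frac{0}{6} = 0, \qquad \rho_C(Cl_{Loss}^{\preceq}) = 1 - 0 = 1$$

The downward union of Loss therefore has the largest possible roughness, because no warehouse can be assigned to it without ambiguity.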

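Theorem 3.3 Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table, where $C = C^{=} \cup C^{\succeq} \cup C^{\sim}$. If $P \subseteq Q \subseteq C$, then $\rho_P(Cl_t^{\succeq}) \ge \rho_Q(Cl_t^{\succeq})$ and $\rho_P(Cl_t^{\preceq}) \ge \rho_Q(Cl_t^{\preceq})$.

Proof. Let $P \subseteq Q \subseteq C$. By definition, $P_*(Cl_t^{\succeq}) = P_*^l(Cl_t^{\succeq}) \cap P_*^r(Cl_t^{\succeq})$ and $P^*(Cl_t^{\succeq}) = P^{l*}(Cl_t^{\succeq}) \cup P^{r*}(Cl_t^{\succeq})$. According to Theorem 3.2, it is easy to obtain that

$$P_*^l(Cl_t^{\succeq}) \cap P_*^r(Cl_t^{\succeq}) \subseteq Q_*^l(Cl_t^{\succeq}) \cap Q_*^r(Cl_t^{\succeq})$$

and

$$P^{l*}(Cl_t^{\succeq}) \cup P^{r*}(Cl_t^{\succeq}) \supseteq Q^{l*}(Cl_t^{\succeq}) \cup Q^{r*}(Cl_t^{\succeq}),$$

so $P_*(Cl_t^{\succeq}) \subseteq Q_*(Cl_t^{\succeq})$ and $P^*(Cl_t^{\succeq}) \supseteq Q^*(Cl_t^{\succeq})$. Then

$$\frac{|P_*(Cl_t^{\succeq})|}{|P^*(Cl_t^{\succeq})|} \le \frac{|Q_*(Cl_t^{\succeq})|}{|Q^*(Cl_t^{\succeq})|},$$

thus $\alpha_P(Cl_t^{\succeq}) \le \alpha_Q(Cl_t^{\succeq})$. Therefore $\rho_P(Cl_t^{\succeq}) \ge \rho_Q(Cl_t^{\succeq})$. The proof of $\rho_P(Cl_t^{\preceq}) \ge \rho_Q(Cl_t^{\preceq})$ is similar.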


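Definition 3.7 Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table, where $C = C^{=} \cup C^{\succeq} \cup C^{\sim}$. Assume that $U/Cl_t^{\succeq} = \{Cl_t, Cl_{t+1}, \ldots, Cl_m\}$ and $U/Cl_t^{\preceq} = \{Cl_1, Cl_2, \ldots, Cl_t\}$ are the indiscernibility classes constituted by the decision attribute $d$ on the upward union $Cl_t^{\succeq}$ and the downward union $Cl_t^{\preceq}$ of decision classes, and let $P \subseteq C$ be a condition attribute subset. The approximation accuracy of $U/Cl_t^{\succeq}$ and $U/Cl_t^{\preceq}$ with respect to $P$ under the in-sim-dominance relation is defined as:

$$\alpha_P(U/Cl_t^{\succeq}) = \frac{\sum_{d_i \in U/Cl_t^{\succeq}} |P_*(d_i)|}{\sum_{d_i \in U/Cl_t^{\succeq}} |P^*(d_i)|} \tag{3.26}$$

$$\alpha_P(U/Cl_t^{\preceq}) = \frac{\sum_{d_i \in U/Cl_t^{\preceq}} |P_*(d_i)|}{\sum_{d_i \in U/Cl_t^{\preceq}} |P^*(d_i)|} \tag{3.27}$$

Then, as in Definition 3.6, the approximation roughness under the in-sim-dominance relation is defined from the approximation accuracy:

$$\rho_P(U/Cl_t^{\succeq}) = 1 - \alpha_P(U/Cl_t^{\succeq}) \tag{3.28}$$

$$\rho_P(U/Cl_t^{\preceq}) = 1 - \alpha_P(U/Cl_t^{\preceq}) \tag{3.29}$$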

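Theorem 3.4 Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table, where $C = C^{=} \cup C^{\succeq} \cup C^{\sim}$. If $Q \subseteq P \subseteq C$, then $\rho_P(U/Cl_t^{\succeq}) \le \rho_Q(U/Cl_t^{\succeq})$ and $\rho_P(U/Cl_t^{\preceq}) \le \rho_Q(U/Cl_t^{\preceq})$.

Proof. Since $Q \subseteq P$, Theorem 3.1 gives, for all $x_i \in U$,

$$R_P^{\succeq}(x_i) = R_P^{l\succeq}(x_i) \cap R_P^{r\succeq}(x_i) \subseteq R_Q^{l\succeq}(x_i) \cap R_Q^{r\succeq}(x_i) = R_Q^{\succeq}(x_i).$$

Consequently, for any $X \in U/Cl_t^{\succeq}$, $R_Q^{\succeq}(x_i) \subseteq X$ implies $R_P^{\succeq}(x_i) \subseteq X$. It follows that $|P_*(X)| \ge |Q_*(X)|$, so for all $d_i \in U/Cl_t^{\succeq}$, $|P_*(d_i)| \ge |Q_*(d_i)|$.

On the other hand, since $R_P^{\succeq}(x_i) \subseteq R_Q^{\succeq}(x_i)$ for all $x_i \in U$, $R_P^{\succeq}(x_i) \cap X \neq \emptyset$ implies $R_Q^{\succeq}(x_i) \cap X \neq \emptyset$. Hence, for any $X \in U/Cl_t^{\succeq}$, it follows that $|P^*(X)| \le |Q^*(X)|$, so for all $d_i \in U/Cl_t^{\succeq}$, $|P^*(d_i)| \le |Q^*(d_i)|$. Consequently, $\alpha_P(U/Cl_t^{\succeq}) \ge \alpha_Q(U/Cl_t^{\succeq})$ and we have $\rho_P(U/Cl_t^{\succeq}) \le \rho_Q(U/Cl_t^{\succeq})$. The proof of $\rho_P(U/Cl_t^{\preceq}) \le \rho_Q(U/Cl_t^{\preceq})$ is similar.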

3.3. Entropy-based Uncertainty Measurements Based on In-sim-dominance Relation

Shannon introduced entropy as a measurement of the information of a data set in information theory [29]. Entropy can also be used as an uncertainty measurement in rough set theory, as some scholars have studied [24-28].

In this part, we define a new entropy, called hybrid entropy, based on the in-sim-dominance relation. Two further measurements, the entropy-based approximation roughness of the upward union and of the downward union, are then proposed based on the hybrid entropy.


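Definition 3.8 Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table, where $C = C^{=} \cup C^{\succeq} \cup C^{\sim}$. For any attribute subset $P \subseteq C$, the hybrid entropy with respect to $P$ is defined as:

$$H(P) = -\frac{1}{2} \sum_{i=1}^{|U|} \frac{|R_P^{l\succeq}(x_i) \cap R_P^{r\succeq}(x_i)|}{|U|^2} \log \frac{1}{|R_P^{l\succeq}(x_i) \cap R_P^{r\succeq}(x_i)|} - \frac{1}{2} \sum_{i=1}^{|U|} \frac{|R_P^{l\preceq}(x_i) \cap R_P^{r\preceq}(x_i)|}{|U|^2} \log \frac{1}{|R_P^{l\preceq}(x_i) \cap R_P^{r\preceq}(x_i)|} \tag{3.30}$$

The hybrid entropy achieves the maximum value

$$\frac{|Cl_t^{\succeq}| \log |Cl_t^{\succeq}| + |Cl_{t-1}^{\preceq}| \log |Cl_{t-1}^{\preceq}|}{2|U|}$$

when, for all $x_i \in U$, $R_P^{l\succeq}(x_i) \cap R_P^{r\succeq}(x_i) = Cl_t^{\succeq}$ and $R_P^{l\preceq}(x_i) \cap R_P^{r\preceq}(x_i) = Cl_{t-1}^{\preceq}$, and it achieves the minimum value 0 when, for all $x_i \in U$, $R_P^{l\succeq}(x_i) \cap R_P^{r\succeq}(x_i) = \{x_i\}$ and $R_P^{l\preceq}(x_i) \cap R_P^{r\preceq}(x_i) = \{x_i\}$. Hence, we have

$$0 \le H(P) \le \frac{|Cl_t^{\succeq}| \log |Cl_t^{\succeq}| + |Cl_{t-1}^{\preceq}| \log |Cl_{t-1}^{\preceq}|}{2|U|}.$$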
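The hybrid entropy (3.30) can be evaluated for the relation of Table 2 with the short Python sketch below. The paper does not state the base of the logarithm, so base 2 is an assumption here (exposed as the `base` parameter); the name `hybrid_entropy` and the data layout are illustrative.

```python
import math

# Global classes copied from Table 2, in column order l>=, r>=, l<=, r<=.
CLASSES = {
    "x1": ({"x1", "x3"}, {"x1"}, {"x1", "x3"}, {"x1"}),
    "x2": ({"x2"}, {"x2"}, {"x2"}, {"x2"}),
    "x3": ({"x3"}, {"x1", "x3"}, {"x3"}, {"x1", "x3"}),
    "x4": ({"x4"}, {"x4"}, {"x4", "x6"}, {"x4", "x6"}),
    "x5": ({"x5"}, {"x5"}, {"x5"}, {"x5"}),
    "x6": ({"x4", "x6"}, {"x4", "x6"}, {"x6"}, {"x6"}),
    "x7": ({"x7", "x8"}, {"x7", "x8"}, {"x7", "x8"}, {"x7", "x8"}),
    "x8": ({"x7", "x8"}, {"x7", "x8"}, {"x7", "x8"}, {"x7", "x8"}),
}

def hybrid_entropy(classes, base=2):
    """Hybrid entropy (3.30); the log base is an assumption, not given in the paper."""
    n = len(classes)
    total = 0.0
    for up_l, up_r, down_l, down_r in classes.values():
        for left, right in ((up_l, up_r), (down_l, down_r)):
            size = len(left & right)  # at least 1, since every class contains x itself
            total += -0.5 * (size / n**2) * math.log(1 / size, base)
    return total

H = hybrid_entropy(CLASSES)
print(H)
```

Only the objects whose left and right classes share a second element contribute to the sum, so a relation that isolates every object gives an entropy of zero, in line with the stated minimum.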

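Theorem 3.5 (Monotonicity) Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table, where $C = C^{=} \cup C^{\succeq} \cup C^{\sim}$. If $Q \subseteq P \subseteq C$, then $H(P) \le H(Q)$.

Proof. Since $Q \subseteq P \subseteq C$, according to Theorem 3.1 we have $R_P^{l\succeq}(x_i) \cap R_P^{r\succeq}(x_i) \subseteq R_Q^{l\succeq}(x_i) \cap R_Q^{r\succeq}(x_i)$ and $R_P^{l\preceq}(x_i) \cap R_P^{r\preceq}(x_i) \subseteq R_Q^{l\preceq}(x_i) \cap R_Q^{r\preceq}(x_i)$. Then

$$\log \frac{1}{|R_P^{l\succeq}(x_i) \cap R_P^{r\succeq}(x_i)|} \ge \log \frac{1}{|R_Q^{l\succeq}(x_i) \cap R_Q^{r\succeq}(x_i)|}$$

and

$$\log \frac{1}{|R_P^{l\preceq}(x_i) \cap R_P^{r\preceq}(x_i)|} \ge \log \frac{1}{|R_Q^{l\preceq}(x_i) \cap R_Q^{r\preceq}(x_i)|},$$

therefore $H(P) \le H(Q)$.

Theorem 3.6 (Equivalence) Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table, where $C = C^{=} \cup C^{\succeq} \cup C^{\sim}$. For $Q, P \subseteq C$, if for all $x_i \in U$, $R_P^{l\succeq}(x_i) \cap R_P^{r\succeq}(x_i) = R_Q^{l\succeq}(x_i) \cap R_Q^{r\succeq}(x_i)$ and $R_P^{l\preceq}(x_i) \cap R_P^{r\preceq}(x_i) = R_Q^{l\preceq}(x_i) \cap R_Q^{r\preceq}(x_i)$, then $H(P) = H(Q)$.

Proof. This follows directly from Definitions 3.2 and 3.8.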


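Definition 3.9 Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table, where $C = C^{=} \cup C^{\succeq} \cup C^{\sim}$. For an attribute subset $P \subseteq C$, the entropy-based roughness of $Cl_t^{\succeq}$ and $Cl_t^{\preceq}$ with respect to $P$ under the in-sim-dominance relation is defined as follows:

$$H\rho_P(Cl_t^{\succeq}) = \rho_P(Cl_t^{\succeq})\, H(P) \tag{3.31}$$

$$H\rho_P(Cl_t^{\preceq}) = \rho_P(Cl_t^{\preceq})\, H(P) \tag{3.32}$$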
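As a worked instance of (3.31)-(3.32) for $P = C$, the following substitution uses the classes of Table 2 and the approximations of Example 2. The base of the logarithm is not stated in the paper, so base 2 is assumed here; a different base only rescales all values by a constant factor.

In Table 2, the intersections $R_C^{l\succeq}(x_i) \cap R_C^{r\succeq}(x_i)$ have size 2 for $x_6, x_7, x_8$ and size 1 otherwise, and the intersections $R_C^{l\preceq}(x_i) \cap R_C^{r\preceq}(x_i)$ have size 2 for $x_4, x_7, x_8$ and size 1 otherwise, so

$$H(C) = \frac{1}{2} \cdot \frac{3 \cdot 2 \log_2 2}{8^2} + \frac{1}{2} \cdot \frac{3 \cdot 2 \log_2 2}{8^2} = \frac{6}{64} = 0.09375.$$

With $\rho_C(Cl_{Profit}^{\succeq}) = 1 - 2/8 = 0.75$ and $\rho_C(Cl_{Loss}^{\preceq}) = 1 - 0/6 = 1$ from Example 2,

$$H\rho_C(Cl_{Profit}^{\succeq}) = 0.75 \times 0.09375 \approx 0.0703, \qquad H\rho_C(Cl_{Loss}^{\preceq}) = 1 \times 0.09375 = 0.09375.$$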




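Theorem 3.7 Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table, where $C = C^{=} \cup C^{\succeq} \cup C^{\sim}$. If $Q \subseteq P \subseteq C$, then $H\rho_P(Cl_t^{\succeq}) \le H\rho_Q(Cl_t^{\succeq})$ and $H\rho_P(Cl_t^{\preceq}) \le H\rho_Q(Cl_t^{\preceq})$.

Proof. Since $Q \subseteq P \subseteq C$, we have $\rho_P(Cl_t^{\succeq}) \le \rho_Q(Cl_t^{\succeq})$ by Theorem 3.3 and $H(P) \le H(Q)$ by Theorem 3.5. Hence, we get $H\rho_P(Cl_t^{\succeq}) \le H\rho_Q(Cl_t^{\succeq})$. The proof of $H\rho_P(Cl_t^{\preceq}) \le H\rho_Q(Cl_t^{\preceq})$ is similar.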


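Definition 3.10 Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table, where $C = C^{=} \cup C^{\succeq} \cup C^{\sim}$. Assume that $U/Cl_t^{\succeq} = \{Cl_t, Cl_{t+1}, \ldots, Cl_m\}$ and $U/Cl_t^{\preceq} = \{Cl_1, Cl_2, \ldots, Cl_t\}$ are the indiscernibility classes constituted by the decision attribute $d$ on the upward union $Cl_t^{\succeq}$ and the downward union $Cl_t^{\preceq}$ of decision classes, and let $P \subseteq C$ be a condition attribute subset. The entropy-based approximation roughness of $U/Cl_t^{\succeq}$ and $U/Cl_t^{\preceq}$ with respect to $P$ under the in-sim-dominance relation is defined as follows:

$$H\rho_P(U/Cl_t^{\succeq}) = \rho_P(U/Cl_t^{\succeq})\, H(P) \tag{3.33}$$

$$H\rho_P(U/Cl_t^{\preceq}) = \rho_P(U/Cl_t^{\preceq})\, H(P) \tag{3.34}$$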


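Theorem 3.8 Let $DS = \langle U, C \cup \{d\}, V, f \rangle$ be a decision table, where $C = C^{=} \cup C^{\succeq} \cup C^{\sim}$. If $Q \subseteq P \subseteq C$, then we have $H\rho_P(U/Cl_t^{\succeq}) \le H\rho_Q(U/Cl_t^{\succeq})$ and $H\rho_P(U/Cl_t^{\preceq}) \le H\rho_Q(U/Cl_t^{\preceq})$.

Proof. Since $Q \subseteq P \subseteq C$, we have $\rho_P(U/Cl_t^{\succeq}) \le \rho_Q(U/Cl_t^{\succeq})$ by Theorem 3.4 and $H(P) \le H(Q)$ by Theorem 3.5. Hence, we get $H\rho_P(U/Cl_t^{\succeq}) \le H\rho_Q(U/Cl_t^{\succeq})$. The proof of $H\rho_P(U/Cl_t^{\preceq}) \le H\rho_Q(U/Cl_t^{\preceq})$ is similar.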

Example 3. Comparison of approximation roughness and entropy-based approximation roughness.

Fig. 1. Approximation roughness vs. entropy-based approximation roughness of the upward union.

Fig. 2. Approximation roughness vs. entropy-based approximation roughness of the downward union.

Using the in-sim-dominance relation established in Example 1, we calculate the approximation roughness and the entropy-based approximation roughness and compare them. The results are shown in Figs. 1-2. Both $\rho_P(U/Cl_{Profit}^{\succeq})$ and $\rho_P(U/Cl_{Loss}^{\preceq})$ do not change as the number of attributes increases from 2 to 3. By contrast, the entropy-based approximation roughness distinguishes the two cases clearly. This means that the entropy-based approximation roughness $H\rho_P(U/Cl_t^{\succeq})$ and $H\rho_P(U/Cl_{t-1}^{\preceq})$ is more powerful for evaluating the uncertainty in some cases.

4. EXPERIMENTS

In order to verify the effectiveness of the uncertainty measures proposed above, we conduct four experiments on two real-life data sets, the Heart disease data set and the Credit approval data set, from the UCI repository of machine learning databases. For each data set, we analyze the properties of its attributes, establish the corresponding binary relations, and then build the in-sim-dominance relation.

The Heart disease data set has 6 real attributes, 1 ordered attribute, 3 binary attributes, 3 nominal attributes and 1 decision attribute. It contains 270 objects, each labeled as having heart disease or not. The results are shown in Figs. 3-4.

The Credit approval data set has 6 categorical attributes, 3 binary attributes, 6 qualitative attributes and 1 decision attribute. It contains 690 objects that belong to class - or class +. Because 2 attributes have the same classification and the data set has missing values, 14 condition attributes and 653 objects remain after preprocessing. The results are shown in Figs. 5-6.

Fig. 3. The result of the Heart disease data set's upward union.

Fig. 4. The result of the Heart disease data set's downward union.

Fig. 5. The result of the Credit approval data set's upward union.

Fig. 6. The result of the Credit approval data set's downward union.



Figs. 3-6 show that the values of both the entropy-based approximation roughness and the approximation roughness decrease as the number of attributes grows. This means that the uncertainty decreases as attributes are added; in other words, supplying more available knowledge reduces the uncertainty. The experiments demonstrate the validity of the two uncertainty measurements based on the in-sim-dominance relation. Figs. 3-6 also show that the value of the approximation roughness does not change when the number of attributes increases from 1 to 2 in the Heart disease and Credit approval data sets. By contrast, the entropy-based approximation roughness can evaluate the uncertainty of a hybrid information system more accurately.

5. CONCLUSIONS

In this paper, we studied uncertainty measurement based on the in-sim-dominance relation. The roughness and approximation roughness measurements were first extended to hybrid information systems, and then the entropy-based roughness and the entropy-based approximation roughness of upward and downward unions were proposed, based on the hybrid entropy, to measure the uncertainty of hybrid information systems. The experimental results demonstrate that the approximation roughness and the entropy-based approximation roughness measurements are useful and effective for evaluating the uncertainty of hybrid information systems. The results also show that the entropy-based approximation roughness evaluates the uncertainty more clearly than the approximation roughness. Moreover, other binary relations, such as the tolerance relation, can be established jointly with the in-sim-dominance relation to meet the needs of special problems.

Acknowledgment

This work is supported by the National Natural Science Foundation of China (No. 61272060) and the Key Natural Science Foundation of Chongqing (No. CSTC2013jjB40003).

REFERENCES

[1] Z. Pawlak. Rough sets. International Journal of Computer and Information Sciences, 11(5):341-356, 1982.

[2] Z. Pawlak. Rough sets: theoretical aspects of reasoning about data. System Theory, Knowledge Engineering and Problem Solving, vol. 9, 1991.

[3] F. Hu, G.Y. Wang. Quick reduction algorithms based on attribute order. Chinese Journal of Computers, 8:029, 2007.

[4] S. Hirano, S. Tsumoto. Segmentation of medical images based on approximations in rough set theory. In Rough Sets and Current Trends in Computing, pages 554-563. Springer, 2002.

[5] Z. Pawlak. Rough set approach to knowledge-based decision support. European Journal of Operational Research, 99(1):48-57, 1997.

[6] Y.Z. Liu, H.Y. Xuan, G.X. Lin. Application research on tax forecasting in China based on rough set theory. Systems Engineering-Theory & Practice, 10:017, 2004.

[7] R.R. Tan. Rule-based life cycle impact assessment using modified rough set induction methodology. Environmental Modeling & Software, 20(5):509-513, 2005.

[8] K. Thangavel, A. Pethalakshmi. Dimensionality reduction based on rough set theory: A review. Applied Soft Computing, 9(1):1-12, 2009.

[9] J.H. Zhang, Y.Y. Wang. A rough margin based support vector machine. Information Sciences, 178(9):2204-2214, 2008.

[10] T. Herawan, M.M. Deris, J.H. Abawajy. A rough set approach for selecting clustering attribute. Knowledge-Based Systems, 23(3):220-231, 2010.

[11] M. Kryszkiewicz. Rough set approach to incomplete information systems. Information Sciences, 112(1):39-49, 1998.

[12] M. Kryszkiewicz. Rules in incomplete information systems. Information Sciences, 113(3):271-292, 1999.

[13] J. Stefanowski, A. Tsoukias. On the extension of rough sets under incomplete information. In New Directions in Rough Sets, Data Mining, and Granular-Soft Computing, pages 73-81. Springer, 1999.

[14] J. Stefanowski, A. Tsoukias. Incomplete information tables and rough classification. Computational Intelligence, 17(3):545-566, 2001.

[15] J. Stefanowski, A. Tsoukias. Valued tolerance and decision rules. In Rough Sets and Current Trends in Computing, pages 212-219. Springer, 2001.

[16] G.Y. Wang. Extension of rough set under incomplete information systems. In Fuzzy Systems, 2002. FUZZ-IEEE'02. Proceedings of the 2002 IEEE International Conference on, volume 2, pages 1098-1103. IEEE, 2002.

[17] J.W. Grzymala-Busse. Characteristic relations for incomplete data: A generalization of the indiscernibility relation. In Rough Sets and Current Trends in Computing, pages 244-253. Springer, 2004.

[18] J.W. Grzymala-Busse. Rough set strategies to data with missing attribute values. In Foundations and Novel Approaches in Data Mining, pages 197-212. Springer, 2006.

[19] S. Greco, B. Matarazzo, R. Slowinski. Rough sets theory for multicriteria decision analysis. European Journal of Operational Research, 129(1):1-47, 2001.

[20] S. Greco, B. Matarazzo, R. Slowinski. Rough sets methodology for sorting problems in presence of multiple attributes and criteria. European Journal of Operational Research, 138(2):247-259, 2002.

[21] L.P. An, L.Y. Tong. Rough approximations based on intersection of indiscernibility, similarity and outranking relations. Knowledge-Based Systems, 23(6):555-562, 2010.

[22] J.H. Dai, W.T. Wang, J.S. Mi. Uncertainty measurement for interval-valued information systems. Information Sciences, 251:63-78, 2013.

[23] J.H. Dai, Q. Xu. Approximations and uncertainty measures in incomplete information systems. Information Sciences, 198:62-80, 2012.

[24] T. Beaubouef, F.E. Petry, G. Arora. Information-theoretic measures of uncertainty for rough relational database. Information Sciences, 109(1-4):185-195, 1998.

[25] Y.Y. Yao, S.K.M. Wong, C.J. Butz. On information-theoretic measures of attribute importance. In Methodologies for Knowledge Discovery and Data Mining, pages 133-137. Springer Berlin Heidelberg, 1999.

[26] Y.Y. Yao, L.Q. Zhao. A measurement theory view on the granularity of partitions. Information Sciences, 213:1-13, 2012.

[27] Y.Y. Yao, X.F. Deng. Quantitative rough sets based on subsethood measures. Information Sciences, 267:306-322, 2014.

[28] Y.H. Qian, J.Y. Liang. Combination entropy and combination granulation in rough set theory. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 16(02):179-193, 2008.

[29] C. Shannon. The mathematical theory of communication. Bell Syst. Tech, 27:379-423, 1948.

