Evaluating the information quality of web sites: A methodology based on fuzzy computing with words

13
Evaluating the Information Quality of Web Sites: A Methodology Based on Fuzzy Computing with Words E. Herrera-Viedma 1 , G. Pasi 2 , A.G. Lopez-Herrera 1 , C. Porcel 1 1 Dept. of Computer Science and A. I., Library Science Studies School University of Granada, 18071- Granada, Spain,[email protected], [email protected] 2 Universit degli Studi di Milano Bicocca, Via Bicocca degli Arcimboldi 8, Milano Italy, email: [email protected] Abstract An evaluation methodology based on fuzzy com- puting with words aimed at measuring the in- formation quality of Web sites containing docu- ments is presented. This methodology is qual- itative and user-oriented because it generates linguistic recommendations on the information quality of the content-driven Web sites based on users’ perceptions. It is composed of two main components, an evaluation scheme to an- alyze the information quality of Web sites, and a measurement method to generate the linguis- tic recommendations. The evaluation scheme is based on both technical criteria related to the Web site structure, and criteria related with the content of information on the Web sites. It is user-driven because the chosen criteria are easily understandable by the users, in such a way that Web visitors can assess them by means of linguis- tic evaluation judgements. The measurement method is user-centered because it generates lin- guistic recommendations of the Web sites based on the visitors’ linguistic evaluation judgements. To combine the linguistic evaluation judgements we introduce two new majority guided linguis- tic aggregation operators, the MLIOWA and weighted MLIOWA operators, that generate the linguistic recommendations according to the ma- jority of the evaluation judgements provided by different visitors. The use of this methodology could improve tasks such as information filtering and evaluation on the World Wide Web. 1 Introduction Nowadays, we can assert that the World Wide Web is the largest available repository of data with the largest num- ber of visitors searching information. The World Wide Web is a distributed, dynamic, and rapidly growing in- formation source [29] that has stimulated new and useful research developments in areas such as digital libraries, in- formation retrieval, education, commerce, entertainment, government, and health care [30, 31]. However, it presents some serious handicaps: its growth is disorganized and uncontrolled, thus contributing to that bad information thrives on the World Wide Web. As a consequence the Internet users have access to bad or poor quality informa- tion [48]. Recognizing the basic differences between publishing on the Web and publishing on paper, may help to under- stand the lack of quality typical of the World Wide Web. In contrast to the printed paper world, on the World Wide Web anyone can publish information, either by simply ac- quiring space on a Web site and creating an electronic doc- ument (using any of the available formats HTML, XML, Pdf or PostScript) or by paying someone to create it. The fact is that there are neither rules nor standards govern- ing the type and quality of information which a writer can put on the Web, nor a central control on where and how documents are published [6]. In the press world, au- thors can publish their own studies based on their own expenses, but self-published materials generally reach a limited audience. The Web, on the other hand, facilitates the distribution of self-published works, while significantly reducing the cost of production [48]. Furthermore, post- ing articles on the Web instead of publishing in printed books/journals, increases their impact on the develop- ment of subsequent new ideas [43]. Some studies show that many articles published in printed books/journals are never cited in any subsequent article; this means that printed articles have a low impact on the development of subsequent new ideas [9, 13, 14, 39]. Hence the fact that the amount of Web documents and the number of content-driven Web sites on the Internet is continuously and rapidly increasing, although in many cases this hap- pens without an efficient information quality control. For several topics, the World Wide Web contains thou- sands of relevant documents/sites of widely varying infor- mation quality. To cope with this situation, over the past few years many techniques (Web search engines, informa- tion filtering systems, Web personalization systems, Web mining) for managing, querying, filtering and integrating information on the World Wide Web have been developed. The major advances in the design of these techniques have been guided by information quality criteria, that is, cri- teria aimed at improving the quality of the information provided to the users from the World Wide Web. How- ever, some characteristics typical of the World Wide Web, as for example its fast and uncontrollable growth, its het-

Transcript of Evaluating the information quality of web sites: A methodology based on fuzzy computing with words

Evaluating the Information Quality of Web Sites: A Methodology Basedon Fuzzy Computing with Words

E. Herrera-Viedma1, G. Pasi2, A.G. Lopez-Herrera1, C. Porcel11Dept. of Computer Science and A. I., Library Science Studies School

University of Granada, 18071- Granada, Spain,[email protected], [email protected] degli Studi di Milano Bicocca, Via Bicocca degli Arcimboldi 8, Milano Italy, email: [email protected]

Abstract

An evaluation methodology based on fuzzy com-puting with words aimed at measuring the in-formation quality of Web sites containing docu-ments is presented. This methodology is qual-itative and user-oriented because it generateslinguistic recommendations on the informationquality of the content-driven Web sites basedon users’ perceptions. It is composed of twomain components, an evaluation scheme to an-alyze the information quality of Web sites, anda measurement method to generate the linguis-tic recommendations. The evaluation scheme isbased on both technical criteria related to theWeb site structure, and criteria related with thecontent of information on the Web sites. It isuser-driven because the chosen criteria are easilyunderstandable by the users, in such a way thatWeb visitors can assess them by means of linguis-tic evaluation judgements. The measurementmethod is user-centered because it generates lin-guistic recommendations of the Web sites basedon the visitors’ linguistic evaluation judgements.To combine the linguistic evaluation judgementswe introduce two new majority guided linguis-tic aggregation operators, the MLIOWA andweighted MLIOWA operators, that generate thelinguistic recommendations according to the ma-jority of the evaluation judgements provided bydifferent visitors. The use of this methodologycould improve tasks such as information filteringand evaluation on the World Wide Web.

1 Introduction

Nowadays, we can assert that the World Wide Web is thelargest available repository of data with the largest num-ber of visitors searching information. The World WideWeb is a distributed, dynamic, and rapidly growing in-formation source [29] that has stimulated new and usefulresearch developments in areas such as digital libraries, in-formation retrieval, education, commerce, entertainment,government, and health care [30, 31]. However, it presentssome serious handicaps: its growth is disorganized and

uncontrolled, thus contributing to that bad informationthrives on the World Wide Web. As a consequence theInternet users have access to bad or poor quality informa-tion [48].

Recognizing the basic differences between publishing onthe Web and publishing on paper, may help to under-stand the lack of quality typical of the World Wide Web.In contrast to the printed paper world, on the World WideWeb anyone can publish information, either by simply ac-quiring space on a Web site and creating an electronic doc-ument (using any of the available formats HTML, XML,Pdf or PostScript) or by paying someone to create it. Thefact is that there are neither rules nor standards govern-ing the type and quality of information which a writercan put on the Web, nor a central control on where andhow documents are published [6]. In the press world, au-thors can publish their own studies based on their ownexpenses, but self-published materials generally reach alimited audience. The Web, on the other hand, facilitatesthe distribution of self-published works, while significantlyreducing the cost of production [48]. Furthermore, post-ing articles on the Web instead of publishing in printedbooks/journals, increases their impact on the develop-ment of subsequent new ideas [43]. Some studies showthat many articles published in printed books/journalsare never cited in any subsequent article; this means thatprinted articles have a low impact on the developmentof subsequent new ideas [9, 13, 14, 39]. Hence the factthat the amount of Web documents and the number ofcontent-driven Web sites on the Internet is continuouslyand rapidly increasing, although in many cases this hap-pens without an efficient information quality control.

For several topics, the World Wide Web contains thou-sands of relevant documents/sites of widely varying infor-mation quality. To cope with this situation, over the pastfew years many techniques (Web search engines, informa-tion filtering systems, Web personalization systems, Webmining) for managing, querying, filtering and integratinginformation on the World Wide Web have been developed.The major advances in the design of these techniques havebeen guided by information quality criteria, that is, cri-teria aimed at improving the quality of the informationprovided to the users from the World Wide Web. How-ever, some characteristics typical of the World Wide Web,as for example its fast and uncontrollable growth, its het-

erogeneity, freshness requirements [6, 29, 30, 31], as wellas others problems such as for example the bubble of webvisibility [11, 12], are still limiting the quality in the infor-mation provided by the different Web search techniques.Consequently, in the Web research community the debateon the quality of the information available on the Inter-net is still open [2, 10, 45, 46]. How to identify useful andhigh-quality information in the unregulated market placeas the World Wide Web is still a crucial problem.

The quality evaluation of content-driven Web sites focus-ing on the user-perceived quality of the stored informationis a difficult task that has seldom been studied [41], andthere does not exist a Web information quality frameworkas a reference point [10]. As some other authors have done[4, 26], we use the information quality framework for in-formation systems [23, 32, 44, 49] as the basic referencepoint of our proposal.

The aim of this paper is to present a quality evaluationmethodology of Web sites based on the user perceptionsof the quality of the information they provide. In thispaper, content-driven Web sites are considered, in whichthe information is organized in multiple kinds of docu-ments, e.g. scientific articles, opinion articles, etc, usingany of the available electronic formats HTML, XML, Pdfor PostScript. This methodology is designed using toolsof fuzzy computing with words [15, 16, 17, 18] to facili-tate the user participation. The goal of this methodologyis to generate linguistic quality evaluations or linguisticrecommendations on such Web sites. To do that, it iscomposed of two main components, an evaluation schemeto analyze the information quality of Web sites, and ameasurement method to generate the linguistic recom-mendations. The evaluation scheme takes into accountboth technical criteria concerning the structural designof Web sites, and criteria related with the content of theWeb sites. It is user-driven rather than designer-driven,i.e., it includes user-perceived Web evaluation indicatorssuch as navigation or believability, rather than quantifi-able Web attributes such as code quality or design; that is,it considers Web characteristics and attributes easily un-derstandable by a non expert Web visitor, in such a waythat Web visitors can assess them by means of linguis-tic evaluation judgements. Using the information qual-ity framework proposed in [23, 32, 44, 49], we define auser-driven evaluation scheme of Web sites. The measure-ment method is user-centered rather than ”site model”-centered, i.e., the recommendations are obtained from lin-guistic evaluation judgements provided by the Web visi-tors rather than from assessments obtained objectivelyby means of the direct observation of the site model char-acteristics. We consider that users after visiting a Website to examine a stored document are required to expresstheir evaluation judgements on the evaluation scheme bymeans of the specification of linguistic labels. Then, anoverall linguistic recommendation concerning the qualityof that Web site is obtained by combining the linguisticevaluation judgements provided by its different visitors.To combine the judgements we introduce two new major-ity guided linguistic aggregation operators, the MLIOWA

and weighted MLIOWA operators, that adequately im-plement the idea of fuzzy majority. To define them, weuse the majority guided Induced Ordered Weighted Av-eraging (IOWA) operators introduced in [37, 38]. In sucha way, our methodology allows us to obtain no-distortedrecommendations on information quality of the Web sitesthat really are representative of the majority of individualrecommendations provided by different visitors of sites.With this Web quality evaluation methodology the infor-mation filtering and evaluation possibilities in the Webare increased. In this way, when a user requires informa-tion on the World Wide Web, then not just content-drivenWeb sites could be provided, but also recommendationson the information quality of them and on Web sites thatstore similar documents that could be of interest to theuser.

The paper is organized as follows. In the next section theproblem of evaluating the information quality of Web sitesis analyzed. The tool of fuzzy computing with words andthe new majority guided aggregation operators are definedin Section 3. The Web quality evaluation methodology isintroduced in Section 4. A discussion on the proposedmethodology is presented in Section 5. Section 6 sketchesour conclusions.

2 The Problem of Evaluating theInformation Quality of Web sites

In our opinion, there is not yet a clear and unambiguousdefinition of the concept of information quality on theWorld Wide Web, and unfortunately, well-founded andtheoretical Web quality frameworks are still missing [10].One can probably find as many definitions for informa-tion quality on the Web as there are papers on informa-tion quality. The quality evaluation on the World WideWeb is neither simple nor straightforward. Web quality isa complex concept and its measurement or evaluation ismulti-dimensional in nature [1]. Assuming such a nature,we can agree about the definition of Web quality given in[34] “we conceive quality as an aggregated value of mul-tiple information quality criteria”. On the other hand, in[33] it was used the definition of quality given by the ISOto allow the employ the evaluation indicators taken intoaccount in the evaluation of commercial Web sites. TheISO defines quality as “the totality of characteristics ofan entity that bear on its ability to satisfy stated and im-plied needs” [25]. From this definition two different kindsof requirements for Web document/site quality evaluationemerge:

1. Technical requirements: These concern the evalua-tion of the main characteristics and structure of Webdocuments/sites. In this category we find evaluationcriteria that are indicators of an objective and quan-titative nature, e.g., clear ordering of information,broken links, orphan pages, code quality, navigation,etc.

2. Content requirements: These concern the evaluationof how well the Web documents/sites meet the spe-

cific user needs. In this category we find evaluationcriteria that are indicators of a subjective and qual-itative nature, e.g., consistency, accuracy, relevance,etc.

As aforementioned, in Web quality evaluation there is nota general theoretical foundation or framework [10]. Dueto this reason, many researchers have tried to use otherwell-founded quality assessment frameworks defined forother fields. One of the more used is the informationquality framework defined in the context of managementinformation systems [23, 32, 44, 49], used, for example, todefine an evaluation methodology of information qualityof personal Web sites [26], and of mobile Internet services[4]. This quality framework establishes that the differentdimensions (e.g. accuracy, accessibility, relevance,...) em-ployed to evaluate the information quality of a system canbe grouped into four major information quality categories:i) intrinsic information quality, ii) contextual informationquality, iii) representational information quality and iv)accessibility information quality. The two first categoriesmainly deal with the ”content” aspects of information sys-tems, while the other ones with some technical design as-pects.

A robust and flexible Web quality evaluation methodol-ogy should properly combine both kinds of requirements.Some authors [24, 34, 36] have proposed Web quality eval-uation methodologies which combines both technical andcontent aspects, but the harsh reality is that the major-ity of suggested Web evaluation methodologies tend to bemore objective than subjective, quantitative (based on nu-merical information) rather than qualitative (based on lin-guistic information), and do not take into account the userperception [3, 5, 35]. However, from the information con-sumer’s perspective the quality of a Web document/sitemay not be assessed independently of the quality of theinformation contents that it provides.

An additional drawback of many Web evaluation method-ologies is that their evaluation indicators are relevant toWeb providers and designers rather than to the Web users[1]. A global Web quality evaluation methodology can-not entirely avoid users’ participation in the evaluationstrategy. User judgments can help to evaluate the infor-mation quality of accessed Web documents/sites becausethe concept of information quality is typically consumer-dependent, and the consumer must be the ultimate judgeof the Web site’s/document’s information quality. Theproblem here is that the users do not frequently makethe effort to give explicit feedback. Web search enginescan collect implicit user feedback using log files. However,this data is still incomplete. To achieve better results ofevaluation on the World Wide Web, the direct partici-pation of the user is necessary, i.e. the development ofuser-centered Web quality evaluation methodologies is anecessity and could provide additional advantages. Forexample, the use of a user-centered approach to evaluateWeb sites would mean that users are more pro-actively ap-proached to determine their needs -both technical and interms of information-, their perceptions of Web site orga-

nization, terminology, ease of navigation, etc, which couldbe used in a redesign of the site [1, 24]. Or for example inthe field of the collaborative recommender systems [42] (akind of information filtering systems that collect ratings ofitems from many individuals and make recommendationsbased on those ratings to a given user) a user-centeredWeb quality evaluation methodology could contribute awell-founded framework to express ratings or evaluationjudgements and to generate the recommendations.

A possible way to facilitate the user participation is toembed in the Web quality evaluation methodology thosetools of Artificial Intelligence that allow a better repre-sentation of subjective and qualitative user judgements,as for example, a soft computing tool called fuzzy linguis-tic modelling [56]. The use of fuzzy linguistic modellingcould increase the user participation in the evaluation ofthe quality of Web documents/sites, because it is a user-friendly tool that helps users to express their judgementsin a more natural way [20]. In [22] we use the tool offuzzy linguistic modelling, called ordinal fuzzy linguisticmodelling [15, 16, 17, 18], to design a user-centered qual-ity evaluation methodology for Web documents. In thispaper, we apply the same linguistic tool to evaluate theinformation quality of Web sites that store documents.

The information quality framework defined in [23, 32, 44,49] was proposed by considering that the quality of in-formation systems cannot be assessed independently ofthe information consumers’ opinions (people who use theinformation). As aforementioned, this information qual-ity framework establishes four major information qualitycategories to classify the different criteria to evaluate thequality of an information system [23, 32, 44, 49]:

1. Intrinsic information quality. This category ad-dresses the very nature of the information. It assumesthat information has its own quality. The main ”di-mension” of the intrinsic information quality is theaccuracy of the information. If a reputation for in-accurate information becomes a common knowledgefor a particular information system, this system isviewed as having little added value and will result ina reduced use. Other dimensions of this category are:believability, reputation and objectivity.

2. Contextual information quality. Also this categoryemphasizes the importance of the informative aspectsof information but from a task perspective. It high-lights the requirement that information quality mustbe considered within the context of the task at hand;it must be relevant, timely, complete, and appropri-ate in terms of amount, so as to add value to thetasks for which the information is provided. There-fore, some dimensions of this category are: value-added, relevance, completeness, timeliness, appropri-ate amount.

3. Representational information quality. This categoryemphasizes the importance of the technical aspects ofthe (computer-based) structure of the information. Itrequires that information systems must present their

information in such a way that it is interpretable,easy to understand, easy to manipulate, and is repre-sented concisely and consistently. Therefore, some di-mensions of this category are: understandability, in-terpretability, concise representation, consistent rep-resentation.

4. Accessibility information quality. This category em-phasizes the importance of the technical aspects ofcomputer systems that provide access to information.It requires that the information system must be ac-cessible but secure. Therefore, some dimensions ofthis category are: accessibility and secure access.

Using the above information quality framework, in [26]a designer-driven information quality framework was pro-posed to evaluate the informative quality of personal Websites, which includes the following evaluation dimensions:

1. Intrinsic quality of Personal Web sites: i) accuracyand errors of the content, and ii) accurate, workableand relevant hyperlinks.

2. Contextual quality of Personal Web sites: provisionof author’s information.

3. Representational quality of Personal Web sites: i)organization, visual settings, typographical features,and consistency, ii) vividness and attractiveness, andiii) confusion of the content.

4. Accessibility quality of Personal Web sites: naviga-tional tools provided.

Similarly, we shall use it as basis to define the evaluationscheme of our quality evaluation methodology.

3 Fuzzy Computing with Words

In this section we present the fuzzy linguistic approachused to develop the processes of fuzzy computing withwords in our Web quality evaluation methodology to-gether with new majority guided linguistic aggregationoperators.

3.1 Ordinal Fuzzy Linguistic Approach

The fuzzy linguistic approach is a technique appropriateto deal with qualitative aspects of problems [56]. Theordinal fuzzy linguistic approach [16] is a linguistic toolused for modelling processes based on the management oflinguistic expressions or computing processes with words;this can be useful in group decision making [15, 21] or ininformation retrieval systems [17, 18].

The ordinal fuzzy linguistic approach is defined by con-sidering a finite and totally ordered label set S = {si}, i ∈{0, . . . , T } in the usual sense, i.e., si ≥ sj if i ≥ j, andwith odd cardinality (typically 7 or 9 labels). The midterm represents an assessment of ”approximately 0.5”,and the rest of the terms are placed symmetrically around

it. The semantics of the linguistic term set is establishedfrom the ordered structure of the term set by consideringthat each linguistic term in the pair (si, sT −i) is equallyinformative. For example, we can use the following setof nine labels to provide the user evaluations: S ={ T= Total, EH = Extremely High, VH = Very High, H =High, M = Medium, L = Low, VL = Very Low, EL =Extremely Low, N = None }.In any linguistic approach it is necessary to define someoperators that can be applied to linguistic information.An advantage of the ordinal fuzzy linguistic approach isthe simplicity of its computational model for comput-ing with words. It is based on a symbolic computation[15, 16]. This technique acts by direct computation onlabels by taking into account the order of such linguisticassessments in the ordered structure of linguistic terms.This symbolic tool seems natural when using the fuzzylinguistic approach, because the linguistic assessments aresimply approximations which are handled when it is im-possible or unnecessary to obtain more accurate values.

Usually, the ordinal fuzzy linguistic model for comput-ing with words is defined by establishing i) a negationoperator, ii) comparison operators based on the orderedstructure of linguistic terms, and iii) adequate aggrega-tion operators of linguistic information. In most ordinalfuzzy linguistic approaches the negation operator is de-fined from the semantics associated to the linguistic termsas:

Neg(si) = sj | j = T − i.

Usually, two comparison operators of linguistic terms aredefined:

i) Maximization operator: MAX(si, sj) = si if si ≥ sj ,

ii) Minimization operator: MIN(si, sj) = si if si ≤ sj .

An interesting class of linguistic aggregation operators arebased on the Ordered Weighted Averaging (OWA) opera-tors [50]. Some examples of useful linguistic OWA opera-tors to combine linguistic information are the LOWA [16]and the LWA operators [15].

An important aspect of an OWA operator is that its be-havior is modelled by means of its weighting vector. Inthis way distinct semantics can be associated with anOWA operator, depending on its weighting vector. Apossible aggregation strategy is based on the concept offuzzy majority, expressed by a linguistic quantifier [57]. In[50] a method was proposed to define a weighting vectorbased on the formal definition of a linguistic quantifier;this method is useful in group decision making, when it isnecessary to obtain a collective assessment from a set ofindividual assessments [15, 16, 21, 22]. What is expectedin these cases is to obtain an aggregated value which syn-thesizes the majority of the values to be aggregated, i.e.,the values which are more similar to each other [37, 38].However, as was pointed out in [37, 38] with the methodproposed in [50] this semantics is not always achieved,and this limits the performance of those systems that usea majority guided OWA operator. For this reason, we

shall not use the LOWA and LWA operators to imple-ment the concept of fuzzy majority in our quality evalua-tion methodology, but we will employ some new operatorsfor performing the linguistic aggregation.

3.2 Majority Guided Linguistic AggregationOperators

In [37, 38] a majority guided Induced OWA (IOWA) op-erator was defined, that tries to overcome the problemsof the majority guided OWA operators. This IOWA op-erator combines numerical values in such a way that thefinal result synthesizes the majority of similar values to beaggregated. Using this operator we propose two majorityguided linguistic IOWA operators that allow to carry outthe majority guided aggregation of linguistic informationin our information Web quality evaluation methodology.

Definition 1. [53, 54] An IOWA operator of dimension nis a function ΦW : (RxR)n → R, to which a weighting vec-tor is associated, W = (w1, . . . , wn), such that wi ∈ [0, 1]and Σiwi = 1, and it is defined to aggregate the set of sec-ond arguments of a list of n pairs {(u1, p1), . . . , (un, pn)}according to the following expression,

ΦW ((u1, p1), . . . , (un, pn)) =n∑

i=1

wi · pσ(i)

being σ : {1, . . . , n} → {1, . . . , n} a permutation such thatuσ(i) ≥ uσ(i+1), ∀i = 1, . . . , n − 1, that is (uσ(i), pσ(i))is the pair with uσ(i) the i-th highest value in the set{u1, . . . , un}.In the above definition the reordering of the set of valuesto be aggregated, {p1, . . . , pn}, is induced by the orderingof the values {u1, . . . , un} associated with them. Due tothis use of the set of values {u1, . . . , un}, Yager and Filevcalled them the values of an order inducing variable and{p1, . . . , pn} the values of the argument variable [53, 54].

Definition 2. A majority guided linguistic IOWA(MLIOWA) operator of dimension n is a function ΦQ :(RxS)n → S, defined according to the following expres-sion,

ΦQ((u1, p1), . . . , (un, pn)) = sk ∈ S,

with k = round(n∑

i=1

wi · ind(pσ(i))),

such that

i) σ : {1, . . . , n} → {1, . . . , n} is a permutation such thatuσ(i+1) ≥ uσ(i), ∀i = 1, . . . , n − 1, that is (uσ(i), pσ(i))is the pair with uσ(i) the i-th lowest value in the set{u1, . . . , un},ii) ui = supi, being supi the overall support of value pi

obtained as

supi =n∑

j=1

supij | supij ={

1 if |ind(pi)− ind(pj)| < α0 otherwise

with α ∈ {0, 1, . . . , T } and supij a binary support func-tion [55] which expresses the support from pj for pi or thesimilarity between both values.

iii) ind : S → {0, 1, . . . , T }, such that ind(si) = i, and

iv) Q is a linguistic quantifier representing the conceptof fuzzy majority in the aggregation, which is used tocompute the weighting vector W = (w1, . . . , wn), suchthat wi ∈ [0, 1], Σiwi = 1, and

wi = Q(supσ(i)

n)/

n∑

j=1

Q(supσ(j)

n),

with Q( supσ(i)

n ) denoting the degree to which pσ(i) repre-sents the majority.

In the MLIOWA operator we assume that all linguisticvalues to be aggregated are equally important. However,in our methodology, we need to carry out aggregationsof weighted information, i.e., when we want to aggregatequality judgements on evaluation criteria with differentimportance degrees. To do this, we introduce a weightedMLIOWA operator.

Definition 3. A weighted MLIOWA operator of dimen-sion n is a function ΦI

Q : (SxS)n → S, defined accordingto the following expression,

ΦIQ((I1, p1), . . . , (In, pn)) =

ΦQ((u1, p1), . . . , (un, pn)),

in which

i) the order inducing values are obtained from the linguis-tic importance degrees associated with the values to beaggregated as

ui =supi + ind(Ii)

2

with

supi =n∑

j=1

supij | supij ={

1 if |ind(Ii)− ind(Ij)| < α0 otherwise

being Ii the linguistic importance degree of the value pi

to be aggregated, and

ii) the weighting vector is obtained as

wi = Q(uσ(i)

n)/

n∑

j=1

Q(uσ(j)

n).

We should point out that the use of the ordinal fuzzy lin-guistic approach in our Web quality evaluation method-ology provides a well-founded mathematical frameworkto represent and deal directly with linguistic information.In such a way, we can generate linguistic quality evalu-ations from linguistic judgements. This is an importantlimitation in other Web quality evaluation methodologies[1, 24, 26] that also use in some cases labels to representassessments, because they lack aggregation operators oflinguistic information to generate the global quality val-ues.

4 An Evaluation Methodology ofInformation Quality of the Web Sites

In this section we present a methodology to evaluate theinformation quality of content-driven Web sites that storeinformation in electronic documents (in any of the knownformats on the Web, e.g., XML, HTML, PostScript, Pdf).As we said at the beginning, this is a user-oriented evalu-ation methodology of a qualitative and subjective naturethat is based on the evaluation judgements provided bythe users. This methodology establishes two instrumentsto evaluate the information quality of the Web sites: auser-driven evaluation scheme and a user-centered mea-surement method.

4.1 The Evaluation Scheme of Web Sites

Using the information quality framework presented in Sec-tion 2 we develop an evaluation scheme for analyzing theinformation quality of Web sites that provide informationstored in electronic documents. This evaluation scheme isbased both on technical criteria of Web site design, andon criteria related with the content of information of Websites. These criteria are assessed subjectively by userswho visit occasionally the Web site because they find somestored documents which satisfy their information needs.

4.1.1 Characteristics

The evaluation scheme proposed presents the followingcharacteristics:

• It is user-driven rather than designer-driven.

We want to generate recommendations on Web sitesfrom the evaluations provided by the different visi-tors of Web sites. Therefore, the evaluation schemeshould be user-driven rather than designer-drivenfrom two perspectives:

– Qualitative perspective: The evaluation schemenecessarily requires the inclusion of dimensionseasily understandable by any information con-sumer (e.g. relevance, understandability) ratherthan dimensions that can be measured objec-tively with independence of the consumers (e.g.accuracy measured by the number of spelling orgrammatical errors) or only perceptible by thedesigners (e.g. code quality or design).

– Quantitative perspec-tive: The evaluation scheme should not includean excessive number of quality dimensions in or-der to help users in understanding it and avoid-ing confusion. As it is known, user’s capabilityto cope with concepts is limited to only five tonine concepts at one time (the magical number 7± 2 [28]). Furthermore, long and complex eval-uation schemes cause the user idleness and limittheir own application possibilities.

• It is weighted, i.e., its quality dimensions are notequally important.

The quality dimensions of the evaluation scheme donot play the same role to measure the informationquality of a Web site, i.e., some dimensions shouldbe more influential than others. For example, useropinions on the information quality of the documentsstored in the Web site (e.g. the relevance) must bean important dimension of the evaluation scheme.

4.1.2 Information Quality Dimensions

We define a user-driven and weighted evaluation schemeof content-driven Web sites that contemplates four qualitycategories with the following information quality dimen-sions:

1. Intrinsic quality of Web sites. Accuracy of informa-tion is the main determinant of the intrinsic infor-mation quality of information systems. We discussaccuracy of Web sites by considering what visitorsthink about the believability of the information con-tents that the Web site provides. As we consider Websites as information sources that are occasionally vis-ited, we are not interested to evaluate the accuracyby means of grammatical and spelling errors or rele-vant hyper-links existing in the Web site.

2. Contextual quality of Web sites. This is the mostimportant category in the evaluation scheme. Inour evaluation scheme neither the dimension of au-thor’s information, as happens in [26], nor appropri-ate amount are meaningful. We propose to evalu-ate this category by considering what visitors thinkabout the relevance, timeliness and completeness ofdocuments that Web site provides them when theysearch information with respect to a specific topic,i.e., if documents are relevant to the search topic, ifdocuments are sufficiently up-to-date for the searchtopic, and if documents are sufficiently original andcomplete for the search topic.

3. Representational quality of Web sites. We analyzethis category for the Web sites that provide infor-mation stored in electronic documents by taking intoaccount two aspects: i) the representational aspectsof Web site design and ii) representational aspectsof documents stored in the Web site. In the firstcase, we consider what visitors think about the un-derstandability of a Web site (named understandabil-ity1), i.e., whether or not the Web site is well orga-nized in such a way that visitors easily understandhow to access the stored documents. In the secondcase, we consider what visitors think about under-standability (named understandability2) and concise-ness of the information contents of electronic docu-ments examined.

4. Accessibility quality of Web sites. As in [26] weconsider that this category should be assessed bywhether or not the Web site provides enough nav-igation mechanisms so that visitors can reach theirdesired documents faster and easier. Lacking effec-tive paths to access to the desired documents would

handicap visitors, therefore navigation tools are nec-essary to support users can locate the informationthey want. We evaluate this dimension by consider-ing what visitors think about navigational tools pro-vided by the Web site.

The evaluation scheme is summarized in Table 1.

INFORMATION INFORMATIONQUALITY QUALITY

CATEGORIES DIMENSIONSIntrinsic quality believability

Contextual quality relevance,timeliness,originality,

completenessRepresentational quality understandability1,

understandability2,conciseness

Accessibility quality navigational tools

Table 1. User-driven evaluation scheme of Web sites

4.2 Specifying the Measurement Method of theInformation Quality of the Web Sites

The measurement method of the information quality ofthe content-driven Web sites that we define is like a multi-person multi-criteria decision making method in which thesearch alternatives are Web sites. In a multi-criteria de-cision making method the goal consists in searching thebest alternatives according to the assessments provided bya group of experts with respect to a set of evaluation crite-ria [7, 47]. To do that, through the aggregation of the ex-perts’ assessments the quality of alternatives is measuredand, later, the exploitation of those quality values leads tothe selection of the best alternatives. In our case, the goalconsists in computing information quality evaluations orrecommendations for Web sites in order to select the Websites that could better meet the user information needs,but like in a multi-criteria decision context, we computethose values according to the assessments provided by agroup of persons (Web visitors).

As it is known, in multi-criteria decision making processesthe chosen aggregation operator is a critical aspect thathas a direct influence on the success of the decision pro-cess. The quantifier guided aggregation operators basedon the OWA operator constitute a successful tool to ag-gregate information because of its flexibility, i.e., it allowsto represent in the aggregations different interpretationsof the concept of majority by means of the fuzzy linguis-tic quantifier [50, 51, 52]. We do the same in our qualitymeasurement method.

4.2.1 Characteristics

We design a measurement method to generate linguisticrecommendations as a measure of the information qualityof the Web sites which presents two main characteristics:

1. It is a user-centered measurement method: The lin-guistic recommendations on the Web sites are ob-tained from individual linguistic judgements providedby the Web visitors rather than from assessments ob-tained objectively by means of the direct observationof the Web site characteristics.

2. It is a majority-guided measurement method: The lin-guistic recommendations are values representative ofthe majority of individual judgements provided bythe Web visitors. The aggregation to compute thelinguistic recommendations is developed by means oftwo new majority guided linguistic aggregation op-erators, the MLIOWA operator and the weightedMLIOWA operator, which, additionally, overcomethe drawback of the classical majority guided aggre-gation operators to correctly implement the conceptof majority in the aggregation processes.

4.2.2 The Quality Measurement Method

Suppose a general user is searching information on theWeb with respect to a specific interest topic and, at onestage, he/she finds a Web site which could store pertinentinformation. He/she decides to visit this Web site andinside, he/she finds an electronic document (in any elec-tronic format) that apparently meets his/her informationneeds. Once the user has examined the document storedin the Web site, he/she is invited to provide his/her indi-vidual evaluation judgements on the quality dimensions ofthe evaluation scheme defined in the previous subsection.When several users have repeated this searching process,the quality measurement method can be applied.

Let us assume a set of content-driven Web sites

{Web1, . . . ,WebL}which store information in documents, i.e., each Web siteWebl contains a set of documents Dl. Web sites can beevaluated according to a set of distinct areas of interestor search topics,

{A1, . . . ,AM}.It is important to outline that the proposed evaluationstrategy of Web sites allows to obtain a quality evaluationwith respect to a given topic; this means that a visitor isasked to qualitatively evaluate a given site by taking intoaccount the topic to which the examined documents refer.In this way, the same Web site can be distinctly evaluatedwith respect to distinct topics.

Let{em,l

1 , . . . , em,lT }

be a set of different visitors of Webl who provided evalua-tion judgements on the nine quality evaluation dimensionsof the evaluation scheme, i.e.,

{q1, . . . , q9}when they searched information about the search topicAm. As we said, the evaluation scheme is weighted, andtherefore each dimension qi is associated with a linguistic

importance degree I(qi) ∈ S. We consider that all qualitydimensions of the evaluation scheme are important butnot equally important. To do that , we can follow thecriterion to assign high importance values to the qualitydimensions related with the content of the Web site it-self (those included in the first and second categories ofthe evaluation scheme), and importance values close tomid term sT /2 to the remaining ones. In particular, therelevance should have a high importance.

Let {qt,m1 , . . . , qt,m

9 } be a set of linguistic evaluation judge-ments (qt,m

i ∈ S) provided by each visitor em,lt when they

searched and found information relevant to the searchtopic Am in the Web site Webl. Then, based on theuser evaluation judgments, a linguistic quality evaluationrml ∈ S is generated on Webl with respect to the search

topic Am by means of the application of MLIOWA oper-ators in two steps:

1. Individual aggregation or aggregation per quality di-mensions.

Calculate for each visitor em,lt his/her individual

linguistic recommendation rm,lt by aggregating the

evaluation judgements provided by the visitor onthe quality dimensions by means of the weightedMLIOWA operator ΦI

Q2as

rm,lt = ΦI

Q2(I(q1), q

t,m1 ), . . . , (I(q9), q

t,m9 )).

Therefore, rm,lt is a linguistic measure that represents

the information quality of the Web site Webl withrespect to topic Am according to the majority (rep-resented by the fuzzy linguistic quantifier Q2) of im-portant linguistic evaluation judgements provided bythe visitor em,l

t on quality dimensions.

2. Collective aggregation or aggregation per visitors.

Calculate for all visitors their collective linguistic rec-ommendation rm,l by aggregating their respective in-dividual recommendations by means of the MLIOWAoperator ΦQ1

rm,l = ΦQ1((u1, rm,l1 ), . . . , (uT , rm,l

T )).

In this case, rm,l is a measure that represents theinformative quality of the Web site Webl with re-spect to topic Am according to the majority (repre-sented by the fuzzy linguistic quantifier Q2) of lin-guistic evaluation judgements provided by the ma-jority (represented by the fuzzy linguistic quantifierQ1) of visitors.

We can consider that each linguistic recommendation rm,l

represents the information quality category of the Webl

with respect to the topic Am. When we characterize theinformation quality of all Web sites {Web1, . . . , WebL}with respect to a topicAm we can establish a classificationof Web sites which can be very useful in the informationsearch processes on the World Wide Web.

4.3 Example

Suppose we want to measure the information quality ofa Web site Webl that has been visited by six visitors{e1, e2, . . . , e6} when they searched information related tothe topic A1 = information quality.

Assuming the set of nine labels given in Section 3 visitorsprovide the linguistic evaluation judgements on the ninequality dimensions shown in Table 2.

e1 e2 e3 e4 e5 e6

q1 L M H EL EH Tq2 M VH EH L H Tq3 VH M VH M VH EHq4 H VH VH M H VHq5 VL EH M L VH VHq6 EH T M L VH Hq7 T T M M EH EHq8 VH EH H M H EHq9 VH VH VH M H EH

Table 2. Linguistic evaluation judgements

Then, applying the measurement method we obtain thelinguistic recommendation on Webl as follows.

1. Individual aggregation or aggregation per quality di-mensions.

Assuming the linguistic importance degrees associatedwith the information quality dimensions given in Table3, the linguistic quantifier Q2 = most of defined by theparameters (0.3, 0.8), and using the weighted MLIOWAoperator ΦI

Q2with α = 1 we obtain the individual lin-

guistic recommendations shown in Table 4.

I(qi)q1 EHq2 Tq3 EHq4 VHq5 VHq6 VHq7 Hq8 Lq9 L

Table 3. Linguistic importance degrees

r1,l1 r1,l

2 r1,l3 r1,l

4 r1,l5 r1,l

6

M VH H L VH EH

Table 4. Individual linguistic recommendations

For example, the individual linguistic recommendationr1,l5 is obtained from the following expression:

r1,l5 = ΦI

Q2((EH,EH), (T, H), (EH, V H), (V H, H),

(V H, V H), (V H, V H), (H,EH), (L,H), (L,H)) = V H.

To develop this expression it is necessary to calculate theorder inducing values ui associated to the linguistic im-portance degrees I(qi). The results are shown in Table 5.

I(qi) supi ui(I(qi))EH 6 6.5T 3 5.5

EH 5 6VH 5 5.5VH 5 5.5VH 5 5.5H 4 4.5L 2 2.5L 2 2.5

Table 5. Order inducing values

Then, the induced ordering among linguistic evaluationjudgements to be aggregated is

(qt,18 , qt,1

9 , qt,17 , qt,1

2 , qt,14 , qt,1

5 , qt,16 , qt,1

3 , qt,11 ).

Consequently, the weighting vector used in the aggrega-tion of ΦI

Q2is

W = (0, 0, 0.02, 0.15, 0.15, 0.15, 0.15, 0.18, 0.2),

where for example

w1 = Q2(2.59

)/∑

Q2(ui(I(qi))

9) =

04.14

= 0

and

w8 = Q2(69)/

∑Q2(

ui(I(qi))9

) =0.724.14

= 0.18.

Using this weighting vector the linguistic individual rec-ommendation rm,1

5 = V H = s6 is obtained from the or-dered linguistic evaluation judgements provided by e5 (seeTable 6)

q5,18 q5,1

9 q5,17 q5,1

2 q5,14 q5,1

5 q5,16 q5,1

3 q5,11

H H EH H H VH VH VH EH

Table 6. Ordered evaluation judgements provided by e5

as

round(5∗0+5∗0+7∗0.02+5∗0.15+5∗0.15+6∗0.15+

6 ∗ 0.15 + 6 ∗ 0.18 + 7 ∗ 0.2) = round(5.92) = 6

2. Collective aggregation or aggregation per visitors.

Calculate collective linguistic recommendation r1,l by ag-gregating the linguistic individual recommendations givenin Table 4 by means of the MLIOWA operator ΦQ1 , as-suming Q1 = Q2 and also α = 1:

r1,l = ΦQ1((u1, M), (u2, V H), (u3,H), (u4, L),

(u5, V H), (u6, EH)) = ΦQ1((3, M), (4, V H), (5,H), (2, L),

(4, V H), (3, EH)).

As the induced ordering on the linguistic individual rec-ommendations by the values ui is

(r1,l4 , r1,l

1 , r1,l6 , r1,l

2 , r1,l5 , r1,l

3 )

then the weighting vector used in the computation of ΦQ1

isW = (0.02, 0.12, 0.12, 0.22, 0.22, 0.3),

where for example

w1 = Q1(26)/

∑Q1(

ui

6) =

23.3

= 0.02

andw6 = Q1(

56)/

∑Q1(

ui

6) =

53.3

= 0.3.

Using this weighting vector the linguistic collective recom-mendation r1,l = V H = s6 is obtained from the orderedlinguistic individual recommendations (see Table 7)

r1,l4 r1,l

1 r1,l6 r1,l

2 r1,l5 r1,l

3

L M EH VH VH H

Table 7. Ordered individual recommendations

as

round(3 ∗ 0.02 + 4 ∗ 0.12 + 7 ∗ 0.12 + 6 ∗ 0.22 + 6 ∗ 0.22+

5 ∗ 0.3) = round(5.52) = 6

5 Discussion

In this section we analyze some possible applications,drawbacks, advantages of the proposed qualitative eval-uation methodology of content-driven Web sites. We alsooutline some possible improvements.

5.1 Applications

This methodology of evaluation of the information qualityof content-driven Web sites is useful in information searchprocesses on the World Wide Web when it is embeddedin the systems used to access and retrieve information:search engines and filtering systems.

1. In search engines.

The use of search engines is helpful but usually yieldstoo many results (links to Web sites), most of whichare loosely related to the actual interest of the user.The choice of appropriate queries to search enginesis crucial but in any case the user is still requiredto browse directly through all the suggested pagesin order to find the desired ones. In this frameworkthe application of the proposed evaluation methodol-ogy can provide users with useful recommendationson the Web sites which could guide their browsingprocesses.

2. In filtering systems.

The idea of these systems is to learn the user’s in-terests and provide her/him with suggestions of in-teresting Web sites or pages without the existenceof a previous user query. An important variant of

filtering systems, applied in e-commerce, are calledcollaborative recommender systems [42]. These col-lect ratings of items from many individuals and makerecommendations based on those ratings to a givenuser. Our evaluation methodology can be embeddedin a collaborative recommender systems to generaterecommendations.

On the other hand, we have proposed this Web qual-ity evaluation methodology as a way to help users tosearch information on the World Wide Web, i.e., it is onlyuser oriented. However, if we redefine the measurementmethod we can add the quality to be designer oriented.To do that, the measurement method would be as follows:

1. Aggregation per individual quality dimension.

Calculate for each quality dimension qi a linguisticcollective evaluation judgement qc,m

i by aggregatingthe evaluation judgements provided by all visitors onthat quality dimension by means of the MLIOWAoperator ΦQ1 as

qc,mi = ΦQ1((u1, q

1,mi ), . . . , (uT , qT,m

i )).

Therefore, qc,mi is a linguistic measure that repre-

sents the information quality of the Web site Webl

with respect to topic Am according to the majority(represented by the fuzzy linguistic quantifier Q1) ofindividual linguistic evaluation judgements providedby all visitors em,l

t by considering only the quality di-mension qi. Therefore, this information quality valueqc,mi can be used by a designer to improve the ele-

ments of Web site related with the considered qualitydimension.

2. Aggregation per all quality dimensions.

Calculate the linguistic collective recommendationrm,l by aggregating the linguistic collective evalu-ation judgements qc,m

i by means of the weightedMLIOWA operator ΦI

Q2

rm,l = ΦIQ2

((I(q1), qc,m1 ), . . . , (I(q9), q

c,m9 )).

5.2 Drawbacks and Advantages

The main drawback of the proposed evaluation method-ology is that it is strongly dependent on the degree towhich users decide to participate by providing their opin-ions. The problem with asking people for the quality di-mensions is that the cost, in terms of time and effort, ofproviding linguistic evaluation judgements generally out-weighs the reward people will eventually receive. To pro-vide evaluation judgements requires selflessness in usersbecause the judgements provided will only help other peo-ple searching information. How should users be compen-sated for offering their opinions on quality dimensions isa question to solve. This is a common problem with otherWeb technologies in which the user participation is nec-essary, as for example recommender systems [40].

On the other hand, the main advantage of our evaluationmethodology is that it is designed to facilitate the userparticipation. Many Web quality evaluation approaches[1, 24, 26, 33, 35, 41] assume that user perceptions arenecessary to measure the quality of Web sites, but theydo not provide enough means to facilitate the user partic-ipation and to represent and adequately exploit the userevaluation judgements. In our evaluation methodologythe user-driven evaluation scheme facilitates the user par-ticipation; the fuzzy linguistic modelling is a good toolto represent user evaluation judgements, and the major-ity guided linguistic aggregation operators allow to ade-quately exploit the user evaluation judgements in order togenerate the linguistic recommendations.

Additionally, as it happens with other quality evaluationmethodologies [1, 33, 35], our evaluation methodology isdomain independent and can be applied to diverse sectors,as education, health, digital libraries, health, etc.

5.3 Improvements

The proposed evaluation methodology may be extendedto include additional tools that allow to improve the qual-ity of the generated linguistic recommendations. Forexample, by incorporating user profiles the evaluationmethodology could easily generate personalized linguis-tic recommendations. As it is known, the tools of Website personalization [8] is being applied satisfactorily toimprove the performance of the search engines and filter-ing systems because they allow to customize the contentand structure of a Web site to the specific and individualneeds of each user. Therefore, in a similar way, we couldinclude them in our evaluation methodology to achievecustomized linguistic recommendations.

On the other hand, in our proposal we assume that Webvisitors know perfectly the meaning of the linguistic scalesused to provide the linguistic evaluation judgements whichare expressed in English language. However, this is nota realistic assumption because the World Wide Web is amulti-language tool. Therefore, to include in our method-ology the possibility of using multi-language linguisticscales could increase the Web user collaboration in ourevaluation methodology. A possible approach could con-sist in the use of the multi-granular fuzzy linguistic mod-elling [19] to represent the linguistic evaluation judge-ments expressed in different languages.

6 Conclusions

Traditional methods aimed at controlling the quality ofpublished information have been overcome by the Webpublishing, which strikes up for freedom of speech andstrikes down for information quality.

The analysis of the quality of content-driven Web sitesfocusing on the quality of information that they providehas rarely been studied. In this paper, we have shown thatthis problem can be faced by using the information qualityframework defined for information systems [23, 32, 44, 49].

A methodology has been presented to evaluate the infor-mation quality of content-driven Web sites by means offuzzy linguistic techniques. We consider Web sites thatprovide information stored in electronic documents. Thismethodology is proposed to generate linguistic recom-mendations on such Web sites that can help other usersin their future search processes. It is composed of twocomponents, a user-driven evaluation scheme and a user-centered measurement method, to measure the informa-tion quality of Web sites. Therefore, this methodologyis user oriented because it considers only user evaluationjudgements to generate the recommendations. Consider-able use is made of the fuzzy set technology to provide theability to describe the information using linguistic labels,in a way that is particularly user friendly.

In the future, we propose to continue this research ap-proach in several directions:

1. To improve the generation method of recommenda-tions by incorporating information on visitors thatsupply the evaluation judgements of Web site, e.g.,their levels of expertise in the search topic (special-ists, knowledgeable, novel people).

2. To implement a recommender system that incorpo-rates the generation procedure of recommendationsfor structured documents presented in [22] and thatfor Web sites presented in this paper.

3. To design other evaluation instruments based onfuzzy linguistic techniques for other kinds of Websites, e.g., commercial Web sites.

4. To redefine the evaluation approach of Web sites tocreate a feedback mechanism that can be used by theWeb master to improve as design aspects as informa-tion content aspects of his/her Web site by consider-ing the visitors’ opinions.

Acknowledgements

This work has been supported by the National SpanishProjects TIC-2003-07977 and EA2004-0081.

References

[1] Aladwani, A.M., & Palvia, P.C. (2002). Develop-ing and validating an instrument for measuring user-perceived web quality. Information & Management,39, 467–476.

[2] Alexander, J., & Tate, M.: Teaching critical eval-uation skills for World WideWeb resources.http://www2.widener.edu/Wolfgram-MemorialLibrary/webevaluation/webeval.htm(1996), rev.2001.

[3] Bovee, M., Srivastava, R.J., Mak, B. (2003). A con-ceptual framework and belief-function approach toassessing overall information quality. InternationalJournal of Intelligent Systems, 18, 51–74.

[4] Chae, M. & Kim, J. (2001). Information qualityfor mobile internet dervices: A theoretical modelwith empirical validation. Proceeding of the Twenty-Second International Conference on Information Sys-tems (pp. 43–53).

[5] Dhyani, D., Keong Ng W. & Bhowmick, S.S. (2002).A survey of Web metrics. ACM Computing Surveys,34(4), 469–503.

[6] Diligenti, M., Gori, M., & Maggine, M. (2004). Aunified probabilistic framework for Web page scoringsystems. IEEE Transactions on Knowledge and DataEngineering, 16(1), 4–16.

[7] Fodor, J., & Roubens, M. (1994). Fuzzy preferencemodelling and multicriteria decision support. Dor-drecht: Kluwer Academic.

[8] Eirinaki, M., & Vazirginannis,M. (2003). Web min-ing for web personalization. ACM Transactions onInternet Technology, 3:1, 1–27.

[9] Garfield, E. (1998). I had a dream ... about Uncited-ness. The Scientist, 12, 10–10.

[10] Gertz, M., Ozsu, M.T., Saake, G.,& Sattler, K.U.(2004). Report on the Dagstuhl Seminar ”Data qual-ity on the Web”. Sigmod Record, 33:1, 127–132.

[11] Gori, M., & Numerico, T. (2003). Social networksand web minorities. Cognitive systems research, 4,355–364.

[12] Gori, M., & Witten, I. (2004). The bubble of webvisibility. Communication of the ACM, to appear.

[13] Hamilton, D.P. (1991a). Publishing by- and for?- thenumbers. Science, 250, 1331-1332.

[14] Hamilton, D.P. (1991b). Research papers: Who’suncited now?. Science, 251, 25–25.

[15] Herrera, F., & Herrera-Viedma, E. (1997). Aggre-gation operators for linguistic weighted information.IEEE Transactions on Systems Man and Cybernetic,Part. A: Systems and Humans, 27 646–656.

[16] Herrera, F., Herrera-Viedma, E., & Verdegay, J.L.(1996). Direct approach processes in group decisionmaking using linguistic OWA operators. Fuzzy Setsand Systems, 79, 175–190.

[17] Herrera-Viedma, E.(2001a). Modeling the retrievalprocess for an information retrieval system usingan ordinal fuzzy linguistic approach. Journal of theAmerica Society for Information Science and Tech-nology, 52:6, 460–475.

[18] Herrera-Viedma, E. (2001b). An information re-trieval system with ordinal linguistic weightedqueries based on two weighting elements. Inter-national Journal of Uncertainty, Fuzziness andKnowledge-Based Systems, 9, 77–88.

[19] Herrera-Viedma, E., Cordon, O., Luque, M., Lopez,A.G., & Munoz A.M. (2003). A model of fuzzy lin-guistic IRS based on multi-granular linguistic infor-mation. International Journal of Approximate Rea-soning, 34:3, 241–264

[20] Herrera-Viedma, E. & G. Pasi (2003). Fuzzy ap-proaches to access information on the Web: recentdevelopments and research trends. In Proceedingsof the International Conference on Fuzzy Logic andTechnology (EUSFLAT03), 25-31.

[21] Herrera-Viedma E., Martınez L., Mata F. & ChiclanaF. (2005). A Consensus support system model forgroup decision-making problems with multi-granularlinguistic preference relations. IEEE Trans. on FuzzySystems, to appear.

[22] Herrera-Viedma, E., & Peis, E. (2003). Evaluatingthe informative quality of documents in SGML for-mat from judgements by means of fuzzy linguistictechniques based on computing with words. Informa-tion Processing & Management, 39:2, 233–249.

[23] Huang, K., Lee, Y.W., & Wang, R.Y. (1999). Qualityinformation and knowledge. Upper Saddle River, NJ:Prentice Hall.

[24] Huizingh, E.K. (2000). The content and design ofWeb sites: An empirical study. Information & Man-agement, 37:3, 123–134.

[25] ISO 8402, (1994). Quality management and qualityassurance–Vocabulary.

[26] Katerattanakul, P., Siau, K.: Measuring informationquality of Web sites: Development of an instrument.Proc. of 20th Int. Conf. on Inf. Sys. Charlotte, NC,(1999) 279-285.

[27] Kobayashi, M., & Takeda, K. (2000). Informationretrieval on the web. ACM Computing Surveys, 32:2,144–173.

[28] Miller, G.A. (1956). The magical number seven orminus two: Some limits on our capacity of processinginformation. Psychological Review, 63, 81–97.

[29] Lawrence, S., & Giles, C.L. (1998). Searching theWorld Wide Web. Science, 280:5360, 98–100.

[30] Lawrence, S., & Giles, C.L. (1999a). Searching theweb: General and scientific information access. IEEECommunication Magazine, 37:1, 116–122.

[31] Lawrence, S., & Giles, C.L. (1999b). Accessibilityand distribution of information on the Web. Nature,400, 107–109.

[32] Lee, Y.W., Strong, D.M., Kahn, B.K., & Wang,R.Y. (2002). AIMQ: A methodology for informationquality assessment. Information & Management, 40:2133–146.

[33] Mich, L., Franch, M., & Gaio L. (2003). Evaluatingand designing Web site quality. IEEE Multimedia,January-March, 34–43.

[34] Naumann, F. (2002). Quality-driven query answeringfor integrated information systems. Lectures Notes inComputer Sciences, 2261.

[35] Olsina, L., & Rossi, G. (2002). Measuring Web ap-plication quality with WebQEM. IEEE Multimedia,October-December, 20–29.

[36] Robbins, S.S, & Stylianou, A.C. (2003). Global cor-porate web sites: An empirical investigation of con-tent and design. Information & Management, 40:2,205–212.

[37] Pasi, G. & Yager, R.R. (2002). Modeling major-ity opinion in multi-agent decision making. Proc.of Ninth Inter. Conf. of Information Processing andManagement of Uncertainty in Knowledge-BasedSystems, Annecy (France), 1251-1257.

[38] Pasi, G. & Yager, R.R. (2005). Modeling majorityopinion in multi-agent decision making. InformationSciences, to appear.

[39] Pendlebury D.A. (1991). Science, citation, and fund-ing. Science, 251, 1410–1411.

[40] Raghavan, P. (2004). Social networks and the Web.Lectures Notes in Artificial Intelligence, 3034, 1–1.

[41] Rieh, S.Y. (2002). Judgment of information qualityand cognitive authority in the Web. Journal of theAmerica Society for Informantion Science and Tech-nology, 53:2, 145–161.

[42] Reisnick, P. & Varian, H. R. (1997). Recommendersystems. Communication of the ACM, 40:3.

[43] Standler, R.B. Evaluating credibility of informationon the Internet. Avail-able: http://www.rbs0.com/credible.pdf (Visited: 1December 2004).

[44] Strong, D.M., Lee, Y.W., & Wang, R.Y. (1997). Dataquality in context. Communication of the ACM, 40:5,103–110.

[45] Sweetland, J.H. (2000) Reviewing the World WideWeb - Theory versus reality. Library Trends, 48:4,748–768.

[46] Tirri, H. (2003). Search in vain: Challenges for In-ternet search. IEEE Computer, 36:1, 115–116.

[47] Triantaphyllou, E. (2000). Muti-criteria decisionmaking methods: A comparative study. Norwell,MA: Kluwer Academic.

[48] Tyburski, G. Get smart about Web site I.Q. Avail-able:http://searchenginewatch.com/searchday/article.php/2159621(Visited: 1 December 2004).

[49] Wang, R.Y., & Strong, D.M. (1996). Beyond accu-racy: What data quality means to data consumers.Journal of Management Information Systems, 12:4,5–34.

[50] Yager, R.R. (1988). On ordered weighted averagingaggregation operators in multicriteria decision mak-ing. IEEE Transactions on Systems, Man, and Cy-bernetic, 18, 183–190.

[51] Yager, R.R., (1996). Quantifier guided aggregationsusing OWA operators. International Journal of Intel-ligent Systems, 11 49–73.

[52] Yager, R.R., & Kacprzyk J. (Eds.), (1997). The or-dered weighted averaging operators. Theory and Ap-plications. Boston: Kluwer Academic.

[53] Yager, R.R., & Filev, D.P. (1998). Operations forGranular Computing: Mixing Words and Numbers,Proceedings of the FUZZ-IEEE World Congress onComputational Intelligence, Anchorage, 123–128.

[54] Yager, R.R., & Filev, D.P. (1999). Induced orderedweighted averaging operators. IEEE Transaction onSystems, Man and Cybernetics, 29, 141–150.

[55] Yager, R.R. (2001). The power average operator.IEEE Transaction on Systems, Man and Cybernetics-Part A: Systems and Humans, 31:6, 724–731.

[56] Zadeh, L.A. (1975). The concept of a linguistic vari-able and its applications to approximate reasoning.Part I. Information Sciences, 8, 199–249; Part II. In-formation Sciences, 8, 301–357; Part III. InformationSciences, 9, 43–80.

[57] Zadeh, L.A. (1983). A computational approach tofuzzy quantifiers in natural languages. Computersand Mathematics with Applications, 9, 149–184.